digitalmars.D - D source code revision system idea
- jdunne4 bradley.edu (24/24) Aug 19 2004 I'm not sure if this is the right place to throw up an idea like this, b...
- pragma (8/32) Aug 19 2004 Not a bad idea. Would this be a stand-alone project, or something added...
- Jaymz (24/63) Aug 19 2004 Let's see...
- Berin Loritsch (12/40) Aug 19 2004 If you start by getting the diff/patch utilities working properly, with
- Jaymz (20/60) Aug 19 2004 Unfortunately, I don't see how I could create a format compatible with t...
- Regan Heath (10/92) Aug 19 2004 KISS == Keep It Simple Stupid.
- Berin Loritsch (21/45) Aug 19 2004 That was its intention (how did you get this message and I didn't?).
- Jaymz (89/89) Aug 19 2004 That was odd, the posts got out of order and I didn't see Regan's post
- J C Calvarese (10/40) Aug 19 2004
- Jaymz (16/56) Aug 20 2004 Thanks for your comment, but I was just trying to convey the idea of the
- Ilya Minkov (13/13) Aug 19 2004 If it is defined over a tree, i imagine it fairly unstable. Not that it
- Jaymz (19/32) Aug 19 2004 Well, it would have to be defined with something a bit more complex than...
- Ilya Minkov (27/44) Aug 19 2004 That might work... Although i'd like it somehow independant from most
- pragma (22/44) Aug 19 2004 I can see the merit in a stand-alone server, but this may be the wrong w...
I'm not sure if this is the right place to throw up an idea like this, but there seem to be an astonishing number of competent developers here to offer insightful feedback, so I'll go ahead and toss it up ;). Feel free to respond and bounce ideas back off me!

What would you think of a source code revision system that does not work on line-by-line code differences, but rather on semantic differences? This would be mainly targeted at D source code, since it lends itself well to this type of revision system. The lack of a pre-processor combined with the concept of modules makes this language an ideal target.

Pros:
1) ***More robust patching ability*** (Not line-based, so no "fuzz" needed)
2) Easy merging of codebases (trunks)
3) Easy conflict detection during merges (check function call parameters, etc.)
4) Could spot possible compile errors
5) Code can be regenerated to conform to a formatting standard
6) Accepts only correct code (possible con...)

Cons:
1) Maintaining comments and their positions in the code becomes difficult, since they are not compilable elements
2) Somewhat difficult implementation

A new patch/diff toolset would need to be created to accommodate this new semantic revision control system as well.

Please, let me know what you think!

James Dunne
Aug 19 2004
Not a bad idea. Would this be a stand-alone project, or something added to an existing product, like Subversion or CVS?

The only thing that comes to mind is: how would you even attempt to define semantic merging and versioning in any language? Are you talking about making sure that merged sources compile okay, or is it something deeper than a unittest?

- Pragma

In article <cg2i31$23fu$1 digitaldaemon.com>, jdunne4 bradley.edu says...
<snip>
Aug 19 2004
Let's see... Upon first design, this could just be a simple stand-alone project implemented for the D language, consisting of a defined patch-format and a patch/diff-like toolset. After all, we've got the front-end source to D already! That could *possibly* make this simpler to implement, as it contains all the data structures necessary to parse, analyze, and possibly re-create the code with.

How I see the "diff" tool working:
1) Lex & parse the source files
2) Create semantic tree representation of the original & new code
3) Compare new code's semantic tree with original code's semantic tree
4) Output a series of simple, defined operations to transform the original code's semantic tree into the new code's semantic tree.

And the "patch" tool would do basically the inverse of the diff tool:
1) Lex & parse the target source file
2) Create semantic tree representation of the target source code
3) Apply defined operations on the semantic tree
4) Rebuild the target code from the modified semantic tree, possibly conforming to a given formatting standard, or using hints provided by the diff tool to recreate the formatting of the original file.

This type of patch/diff toolset could handle the creation of an entire module, simply defined by "create" operations on an "empty" semantic tree.

Let me know what you all think of this. Thanks for your input, Pragma!

In article <cg2mmu$266i$1 digitaldaemon.com>, pragma <EricAnderton at yahoo dot com> says...
<snip>
Aug 19 2004
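The diff and patch steps listed in the post above could be sketched roughly as follows. This is only an illustration under loud assumptions: `Node`, `Op`, `childByKey`, `diff`, `patch` and `render` are hypothetical names, not anything from the DMD front end; sibling keys are assumed unique; added nodes are simply appended, so ordering hints are ignored; and the code targets a present-day D compiler rather than the 2004 one.

```d
import std.stdio;

// Hypothetical, much-simplified tree node (NOT a DMD front-end class).
class Node
{
    string key;       // e.g. "function:add"; assumed unique among siblings
    Node[] children;

    this(string key, Node[] children = null)
    {
        this.key = key;
        this.children = children;
    }
}

// One "simple, defined operation" of the patch format (hypothetical layout).
struct Op
{
    string kind;      // "add" or "remove"
    string[] path;    // keys from the root down to the parent node
    Node node;        // subtree to add, or the child to remove
}

Node childByKey(Node parent, string key)
{
    foreach (c; parent.children)
        if (c.key == key)
            return c;
    return null;
}

// diff steps 3 and 4: compare two trees and emit add/remove operations.
Op[] diff(Node oldN, Node newN, string[] path = null)
{
    Op[] ops;
    string[] here = path ~ oldN.key;
    foreach (c; oldN.children)
    {
        Node match = childByKey(newN, c.key);
        if (match is null)
            ops ~= Op("remove", here, c);
        else
            ops ~= diff(c, match, here);
    }
    foreach (c; newN.children)
        if (childByKey(oldN, c.key) is null)
            ops ~= Op("add", here, c);
    return ops;
}

// patch step 3: apply the operations to the target tree in place.
void patch(Node root, Op[] ops)
{
    foreach (op; ops)
    {
        Node parent = root;               // op.path[0] is the root's own key
        foreach (key; op.path[1 .. $])
            parent = childByKey(parent, key);
        if (op.kind == "add")
            parent.children ~= op.node;   // appended; position is not preserved
        else
        {
            Node[] kept;
            foreach (c; parent.children)
                if (c.key != op.node.key)
                    kept ~= c;
            parent.children = kept;
        }
    }
}

// patch step 4, reduced to a one-line dump instead of real source code.
string render(Node n)
{
    string s = n.key;
    if (n.children.length)
    {
        s ~= "(";
        foreach (i, c; n.children)
            s ~= (i ? "," : "") ~ render(c);
        s ~= ")";
    }
    return s;
}

void main()
{
    auto oldTree = new Node("module:addition", [
        new Node("import:std.c.stdio"),
        new Node("alias:myInt"),
        new Node("function:add", [new Node("param:a"), new Node("param:b")]),
    ]);
    auto newTree = new Node("module:addition", [
        new Node("import:std.c.stdio"),
        new Node("function:add", [new Node("param:a"), new Node("param:b")]),
        new Node("function:sub"),
    ]);

    auto ops = diff(oldTree, newTree);
    foreach (op; ops)
        writeln(op.kind, " ", op.node.key, " under ", op.path);

    patch(oldTree, ops);
    assert(render(oldTree) == render(newTree));
}
```

A real tool would also need a "change" or "move" operation and some way to record where an added node belongs among its siblings; both are left out here to keep the sketch short.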
Jaymz wrote:
<snip>

If you start by getting the diff/patch utilities working properly, with a format compatible with the unix diff/patch utilities, then you could specify it as the diff/patch util for the CVS or SVN repos. That would be the only real level of integration you need.

I will say this: (ir)Rational ClearCase tries to use this technique as much as possible with abysmal results. From what I understand, the PowerBuilder integration works decently, but the XML diff tool is worse than their line diff tool (which still randomizes things).

If you get the diff/patch utility right, I will be very impressed. Just be careful to focus only on diff/patch and not try to have a tool that does a whole bunch of stuff. KISS
Aug 19 2004
In article <cg2rcg$290b$1 digitaldaemon.com>, Berin Loritsch says...
<snip>

Unfortunately, I don't see how I could create a format compatible with the unix diff/patch utilities, which are line-based, using a semantic tree-based modification scheme. The format would have to be entirely different. I could, however, make my toolset support the command-line arguments of the original diff/patch utilities, ignoring the ones that no longer make sense, which would be the best way to go.

This would not necessarily be a bad thing for SVN use ... if you make the decision to use my diff/patch utilities from the start, as the new patch format wouldn't be compatible with the unix diff/patch utilities' patch format. SVN really doesn't care what the diff/patch format that it stores in its database is, AFAIK. It simply relies on correct operation from diff/patch to do its work.

And sorry, I haven't used any of the products to which you made mention: ClearCase or PowerBuilder. Could you post an example of "abysmal results" so we can see what NOT to produce? :-) I do like to develop tools that produce quality results -- this is probably due to my delusion that I have unlimited project development time, and that a project is never quite "done" ;).

BTW, what's w/ the KISS?

Thanks for your comments!

James Dunne
Aug 19 2004
On Thu, 19 Aug 2004 19:01:07 +0000 (UTC), Jaymz <jdunne4 bradley.edu> wrote:
> In article <cg2rcg$290b$1 digitaldaemon.com>, Berin Loritsch says...
> <snip>
> BTW, what's w/ the KISS?

KISS == Keep It Simple Stupid.

And before you take any offense, none was intended (I assume); it's a somewhat common acronym meaning simply that you should attempt not to *over* complicate things.

Regan

p.s. I think your idea is great.

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Aug 19 2004
Regan Heath wrote:
> On Thu, 19 Aug 2004 19:01:07 +0000 (UTC), Jaymz <jdunne4 bradley.edu> wrote:
>> Could you post an example of "abysmal results" so we can see what NOT to produce? :-)
>> <snip>
>> BTW, what's w/ the KISS?
>
> KISS == Keep It Simple Stupid.
> <snip>

That was its intention (how did you get this message and I didn't?).

Anyway, for an example of bad integration, ClearCase's XML merge is a perfect one. If the XML is not properly formatted, the tool will choke beyond reason (this applies to original or new XML documents). By that I mean the tool will attempt to treat the whole block as one element. For example:

OLD:
    <!-- parse error here -->
    <element
      <embedded type="element"/>
    </element>

NEW:
    <!-- parse error fixed -->
    <element>
      <embedded type="element"/>
    </element>

CONFLICT: The element "element" conflicts with "element <embedded type="element"/> "
Aug 19 2004
That was odd, the posts got out of order and I didn't see Regan's post initially... Anyways...

Yeah, I'm definitely a fan of KISS... Heh, punny punny... But seriously, I'd like to keep this revision control system on the ground: simple and reliable, yet very powerful.

It seems as though after a nice evening of playing Doom 3, I have no will to be near a computer until at least tomorrow morning... Jesus Christ, that game ... wow ... Then, this weekend is gonna be crazy, moving back into house at skool. If anyone, in the down-time here, would like to poke thru the D front-end parser/analyzer code and possibly produce some nice D code to achieve the same effect, that'd be sweet. If not, that's cool too, I'll just do it once I'm at skool.

On the train ride home from work today I was jottin' down some ideas on how to do a tree-diff operation. I started writing out D code in an XML-like format just to see how I could process a given module as a syntactic tree and rearrange, add, and remove parts of it. I came up with a quickie example XML-like tree (some declarations are useless, but exist for example's sake).

D source module:

    module addition;

    import std.c.stdio;

    alias int myInt;

    int add(int a, int b)
    out (result) {
        assert(a + b == result);
    }
    body {
        return (a + b);
    }

Corresponding XML tree definition:

    <module name="addition">
      <import name="std.c.stdio"/>
      <alias type="d:int" name="myInt"/>
      <function name="add" return="d:int">
        <param type="d:int" modifier="in" name="a"/>
        <param type="d:int" modifier="in" name="b"/>
        <out>
          <assert>
            <opEquals>
              <left>
                <opAdd>
                  <left><ref-param name="a"/></left>
                  <right><ref-param name="b"/></right>
                </opAdd>
              </left>
              <right>
                <return-value/>
              </right>
            </opEquals>
          </assert>
        </out>
        <body>
          <return>
            <paren>
              <opAdd>
                <left><ref-param name="a"/></left>
                <right><ref-param name="b"/></right>
              </opAdd>
            </paren>
          </return>
        </body>
      </function>
    </module>

As you can see, it's pretty much a syntactic representation of the D module. It looks similar to a CodeDOM structure, if you've ever used that from .NET. There could also be tags like <linecomment>, <blockcomment>, <nestcomment>, <blankline>, etc. to preserve spacing and comments. All of the D operators as tags should be defined by their corresponding op* names. Feel absolutely free to rip on my definition schema here, I just made it up without *much* thought. Admittedly, there was *some* thought.

Now to the real meat... Defining the operations to ADD and REMOVE sections is easy enough: just treat the tree in an in-order-traversal manner and linearly add/remove tags (start and end tags must be matched, of course). Process the two trees just as diff processes two files, trying to match them up tag-by-tag wherever possible.

This is a huge benefit for simple changes that have a major impact on the formatting of a document. For example, if you wrap a block of code in an if-statement and indent it, unix diff includes all the affected lines in the diff. But when using /my/ utility, only the start and end tags of the if-statement are created and the internal code block is left completely alone, making the diff much more compressed. Here, we win against the unix diff utility, whereas the worst case would be a draw with the unix diff utility.

To try to complicate things, defining a MOVE operation without falling back to an ADD and REMOVE operation should be considered. Of course, in the initial implementation it could very well be left undefined, and we could rely on ADD/REMOVE, just as the unix diff/patch utilities do. However, in the future this could be a major source of improvement.

Let me know what you all think!

James Dunne
Aug 19 2004
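The tag-by-tag matching described in the post above could look something like the sketch below. It is only an illustration: `Node`, `linearize` and `diffTags` are hypothetical names, the matching uses a textbook longest-common-subsequence table rather than whatever algorithm the eventual tool might use, and the code targets a present-day D compiler.

```d
import std.algorithm : max;
import std.stdio;

// Hypothetical tree node for the XML-like representation.
class Node
{
    string tag;
    Node[] children;

    this(string tag, Node[] children = null)
    {
        this.tag = tag;
        this.children = children;
    }
}

// Flatten a tree into a linear stream of start/end tags.
// A leaf becomes a single self-closing tag.
string[] linearize(Node n)
{
    if (n.children.length == 0)
        return ["<" ~ n.tag ~ "/>"];
    string[] tags = ["<" ~ n.tag ~ ">"];
    foreach (c; n.children)
        tags ~= linearize(c);
    tags ~= "</" ~ n.tag ~ ">";
    return tags;
}

// Diff two tag streams the way line-based diff compares lines,
// using a longest-common-subsequence table.
string[] diffTags(string[] a, string[] b)
{
    // lcs[r][c] = length of the LCS of a[r .. $] and b[c .. $]
    auto lcs = new size_t[][](a.length + 1, b.length + 1);
    foreach_reverse (r; 0 .. a.length)
        foreach_reverse (c; 0 .. b.length)
            lcs[r][c] = a[r] == b[c]
                ? lcs[r + 1][c + 1] + 1
                : max(lcs[r + 1][c], lcs[r][c + 1]);

    string[] edits;
    size_t ai = 0, bj = 0;
    while (ai < a.length && bj < b.length)
    {
        if (a[ai] == b[bj])
        {
            ai++;
            bj++;
        }
        else if (lcs[ai + 1][bj] >= lcs[ai][bj + 1])
            edits ~= "REMOVE " ~ a[ai++];
        else
            edits ~= "ADD " ~ b[bj++];
    }
    while (ai < a.length)
        edits ~= "REMOVE " ~ a[ai++];
    while (bj < b.length)
        edits ~= "ADD " ~ b[bj++];
    return edits;
}

void main()
{
    // Original body: return (a + b);
    auto oldBody = new Node("body", [
        new Node("return", [new Node("opAdd")]),
    ]);
    // New body: the same statement wrapped in an if-statement.
    auto newBody = new Node("body", [
        new Node("if", [
            new Node("cond"),
            new Node("return", [new Node("opAdd")]),
        ]),
    ]);

    auto edits = diffTags(linearize(oldBody), linearize(newBody));
    foreach (e; edits)
        writeln(e);

    // Only the if-statement's own tags show up; the wrapped block is untouched.
    assert(edits == ["ADD <if>", "ADD <cond/>", "ADD </if>"]);
}
```

The quadratic table is fine for a toy like this; a real tool working on whole modules would probably want something cheaper, and the check that start and end tags stay matched is not enforced here.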
Jaymz wrote:
> On the train ride home from work today I was jottin' down some ideas on how to do a tree-diff operation. I started writing out D code in an XML-like format just to see how I could process a given module as a syntactic tree and rearrange, add, and remove parts of it.
> <snip>

This discussion reminds me of the DML idea that was mentioned a while back (I think it was brought up 2 or 3 years ago): http://jdanielsmith.org/DML/

I don't know how similar this is to what you're thinking, but it is XML-based.

--
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
Aug 19 2004
In article <cg3qps$2qcg$1 digitaldaemon.com>, J C Calvarese says...
<snip>

Thanks for your comment, but I was just trying to convey the idea of the syntactic tree using an XML-like form. I'm not going to *actually use* any form of XML or DML in the syntactic tree's definition. I'll be keeping that all in memory in a DOM structure.

Which reminds me, I did do a little poring over the DMD front-end code last night, and it looks very clean and easy to port over to D for just the syntactic analysis. The data structures used are pretty clear and can easily be suited to this project.

Really, now, the only design problem that I can see is how to define the patch format, possibly in a human-readable way. I'm leaning towards an extensible binary format using chunks (like EBML does), since a text-based patch format would get rather lengthy.

Does anyone know if SVN needs the patch data to be in ASCII text format, or does it not care?

James Dunne
Aug 20 2004
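To make the chunk idea in the post above a little more concrete, here is a rough sketch of a chunked patch stream. Everything in it is an assumption for illustration: the layout (4-byte ASCII id, 4-byte big-endian length, payload) is a simplified fixed-width scheme and is not real EBML, which uses variable-length integers, and the chunk ids `ADDN`, `REMV` and `XTRA` are made up. It targets a present-day D compiler.

```d
import std.stdio;

// Hypothetical chunk: id plus raw payload bytes.
struct Chunk
{
    string id;
    const(ubyte)[] payload;
}

// Encode one chunk as: 4-byte id | 4-byte big-endian length | payload.
ubyte[] writeChunk(string id, const(ubyte)[] payload)
{
    assert(id.length == 4, "chunk ids are exactly four characters in this sketch");
    immutable n = cast(uint) payload.length;
    ubyte[] bytes = cast(ubyte[]) id.dup;
    bytes ~= [cast(ubyte)(n >> 24), cast(ubyte)(n >> 16),
              cast(ubyte)(n >> 8), cast(ubyte) n];
    bytes ~= payload;
    return bytes;
}

// Split a byte stream back into chunks; stops at a truncated chunk.
Chunk[] readChunks(const(ubyte)[] data)
{
    Chunk[] chunks;
    size_t pos = 0;
    while (pos + 8 <= data.length)
    {
        string id = cast(string) data[pos .. pos + 4].idup;
        uint n = (cast(uint) data[pos + 4] << 24) | (cast(uint) data[pos + 5] << 16)
               | (cast(uint) data[pos + 6] << 8) | data[pos + 7];
        pos += 8;
        if (pos + n > data.length)
            break; // truncated payload: ignore the rest
        chunks ~= Chunk(id, data[pos .. pos + n]);
        pos += n;
    }
    return chunks;
}

void main()
{
    ubyte[] patchBytes;
    patchBytes ~= writeChunk("ADDN", cast(const(ubyte)[]) "function:sub");
    patchBytes ~= writeChunk("XTRA", cast(const(ubyte)[]) "ignored by older tools");
    patchBytes ~= writeChunk("REMV", cast(const(ubyte)[]) "alias:myInt");

    // A reader that only knows ADDN and REMV can skip anything else,
    // because every chunk carries its own length.
    string[] understood;
    foreach (chunk; readChunks(patchBytes))
    {
        if (chunk.id == "ADDN" || chunk.id == "REMV")
            understood ~= chunk.id ~ ":" ~ cast(string) chunk.payload.idup;
    }
    foreach (line; understood)
        writeln(line);
    assert(understood == ["ADDN:function:sub", "REMV:alias:myInt"]);
}
```

The length prefix is what buys the extensibility: a newer diff tool can add chunk types without breaking an older patch tool, at the cost of the patch no longer being readable in a text editor.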
If it is defined over a tree, I imagine it fairly unstable. Not that it couldn't be done, but I'm somewhat sceptical, also considering the extensible syntax which might come in 2.x. Are there any good tree diff/merge tools already? Any open-source ones? If there are such tools for XML, one could define some mapping between D and XML.

If you define it over a stream of lexemes, it will be wonderfully robust, but I don't imagine it being too useful. It will at most take care of formatting issues (different contributors prefer different formatting), but projects now use some kind of an auto-formatter with certain settings, which also provides an (admittedly much cruder) solution.

What I would think of being more valuable for now would be a documentation system and code formatter written completely in D.

-eye
Aug 19 2004
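The lexeme-stream variant mentioned in the post above is easy to try out. The sketch below uses a deliberately crude tokenizer, which is an assumption for illustration and nothing like the real D lexer: identifiers and numbers form one lexeme each, every other non-space character is a lexeme of its own, and comments and string literals are not handled. `lex` is a hypothetical name, and the code targets a present-day D compiler.

```d
import std.ascii : isAlphaNum, isWhite;
import std.stdio;

// Crude stand-in for a lexer: splits source text into lexemes and
// throws away all whitespace.
string[] lex(string src)
{
    string[] tokens;
    size_t i = 0;
    while (i < src.length)
    {
        if (isWhite(src[i]))
        {
            i++;
            continue;
        }
        size_t start = i;
        if (isAlphaNum(src[i]) || src[i] == '_')
        {
            // identifier, keyword or number
            while (i < src.length && (isAlphaNum(src[i]) || src[i] == '_'))
                i++;
        }
        else
            i++; // single-character punctuation or operator
        tokens ~= src[start .. i];
    }
    return tokens;
}

void main()
{
    string oneLine   = "int add(int a, int b) { return (a + b); }";
    string reflowed  = "int add( int a, int b )\n{\n    return (a + b);\n}";
    string different = "int add(int a, int b) { return (a - b); }";

    // Pure formatting changes vanish at the lexeme level...
    assert(lex(oneLine) == lex(reflowed));
    // ...while a real change still shows up.
    assert(lex(oneLine) != lex(different));

    writeln(lex(oneLine));
}
```

This bears out both halves of the remark: the comparison is robust against reformatting, but it knows nothing about structure, so it cannot tell a moved function from a deleted and re-added one.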
In article <cg2tk8$2amn$1 digitaldaemon.com>, Ilya Minkov says...
<snip>

Well, it would have to be defined with something a bit more complex than just a tree structure. A tree-based structure, like a DOM, would be ideal. I don't see how that'd be unstable. It should be defined over a stream of lexemes, of course. That's what the DOM will hold.

I'm not too keen on having this be another implementation of a source code re-formatter. It's merely just a different way of patching source code using the assumption that we're reading SOURCE CODE, not just arbitrary lines of text. The code re-formatting comes out of the need to reproduce the code from the DOM.

A documentation system for D written entirely in D? Just a few simple changes to Doxygen it sounds like, minus the initial work of porting to D ;).

This could be a whole different pile of monkeys if class meta-data support was in D *WINK WINK*. I saw a few threads of discussion on meta-data, but it didn't seem to end up anywhere. Gr. I don't see what the big issue is, the symbol table doesn't take up *that* much room. I personally would like a bit more flexibility at the cost of the executable size being bumped up a few KB.

BTW, could you elaborate a bit on your skepticism? I'm a bit confused here. Thanks!

James Dunne
Aug 19 2004
Jaymz wrote:
> Well, it would have to be defined with something a bit more complex than just a tree structure. A tree-based structure, like a DOM, would be ideal. I don't see how that'd be unstable. It should be defined over a stream of lexemes, of course. That's what the DOM will hold.

That might work... Although I'd like it somehow independent from most language constructs, and able to handle new syntax constructs gracefully... more or less like a highlighting editor with "levels" recognition does. Perhaps even some extensibility?

> I'm not too keen on having this be another implementation of a source code re-formatter. It's merely just a different way of patching source code using the assumption that we're reading SOURCE CODE, not just arbitrary lines of text. The code re-formatting comes out of the need to reproduce the code from the DOM.

On the other hand the DIFF will not be very human-readable.

> A documentation system for D written entirely in D? Just a few simple changes to Doxygen it sounds like, minus the initial work of porting to D ;).

Hr hr. :)

> This could be a whole different pile of monkeys if class meta-data support was in D *WINK WINK*. I saw a few threads of discussion on meta-data, but it didn't seem to end up anywhere. Gr. I don't see what the big issue is, the symbol table doesn't take up *that* much room. I personally would like a bit more flexibility at the cost of the executable size being bumped up a few KB.

The metadata was already there in DLI, and was incomplete, and only the DLI version of Phobos ever used it. The topic must be raised again in the post-1.0 era. For now, the consensus was that a parser and some custom code generators would have to do the work for the others, and relieve Walter from something unnecessary to do right now. Besides, the metadata was only intended to be used in a program itself.

I wonder whether I'll find some time to bake a D version of my favorite parser gen (COCO/R) and a corresponding D grammar... It would be a great help in creating tools. I started to port the Java version, but after seeing the C version I have come to dislike the Java one, and will probably first hack up a C version which outputs D code; then someone else could finish porting it. I am sure that the tool can cope perfectly with D syntax, and the generated code is efficient.

> BTW, could you elaborate a bit on your skepticism? I'm a bit confused here. Thanks!

I don't know, I'm totally new to the matter... That means I'm confused and skeptical. Still, are there any tree diffs out there?

One point to consider is that Bill Cox and I have raised the question of an extensible language, where libraries could introduce new syntax, like in OpenC++ and similar. Walter promised to consider this again in the post-1.0 era.

-eye
Aug 19 2004
In article <cg2qa4$28dh$1 digitaldaemon.com>, Jaymz says...
<snip>

I can see the merit in a stand-alone server, but this may be the wrong way to start out. Honestly, I think an add-on module to an existing source control tool might prove much more useful than an outright replacement.

Take dsource.org for example: an entire website dedicated to D programming that is backed on Subversion. IMO an extension to Subversion would be far more useful (and easier to implement) to the D community as a whole.

All the same, please look at Mango over on dsource if you're going to write a stand-alone server. The I/O and socket portions of that library may give you a good head-start.

That aside, I like where you're going with this, especially with 'creating a semantic tree' of the code. I gather this would be some form of pseudocode or XML? I can see this becoming useful for increasing performance if you keep the current semantic tree version on hand at all times. That way one can compare the tree against their own local source to make sure they're not altering other portions of the application too badly (i.e. trying not to violate contracts across a whole project).

Another thing: a lot of the spirit of what you're proposing here is captured in D's in/out/body and unittest contracting system. Have you considered incorporating these statements in particular to deepen the semantic meaning of code when you assess it? :)

- Pragma
Aug 19 2004
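For what it's worth, the four "diff" steps quoted above can be tried out in a few lines. This is only a rough sketch under loud assumptions: it is written in Python rather than D so that the standard `ast` module can stand in for the D front end, it compares whole top-level functions instead of individual statements, and the names `top_level_functions` and `tree_diff` as well as the "remove"/"replace" operation names are made up for illustration (only "create" comes from the post).

```python
import ast

def top_level_functions(source):
    """Steps 1 and 2: parse the source and index its top-level functions by name."""
    tree = ast.parse(source)
    return {node.name: node for node in tree.body
            if isinstance(node, ast.FunctionDef)}

def tree_diff(old_source, new_source):
    """Steps 3 and 4: compare the trees and emit (operation, name) pairs."""
    old = top_level_functions(old_source)
    new = top_level_functions(new_source)
    ops = []
    for name in sorted(old.keys() - new.keys()):
        ops.append(("remove", name))
    for name in sorted(new.keys() - old.keys()):
        ops.append(("create", name))
    for name in sorted(old.keys() & new.keys()):
        # ast.dump leaves out line and column numbers by default,
        # so layout-only changes compare as equal.
        if ast.dump(old[name]) != ast.dump(new[name]):
            ops.append(("replace", name))
    return ops

OLD = """
def area(w, h):
    return w * h

def unused():
    pass
"""

NEW = """
def area(w,   h):
    # reformatted and commented, but the tree is the same
    return w*h

def perimeter(w, h):
    return 2 * (w + h)
"""

ops = tree_diff(OLD, NEW)
print(ops)  # [('remove', 'unused'), ('create', 'perimeter')]
```

Note that `area` produces no operation at all even though its text changed, which is the "no fuzz needed" property from the original proposal. The new comment is also invisible to the diff, which is the comment-handling con from the same list.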
In article <cg2vh6$2c5t$1 digitaldaemon.com>, pragma <EricAnderton at yahoo dot com> says...

<<snip>>

> I can see the merit in a stand-alone server, but this may be the wrong way to start out. Honestly, I think an add-on module to an existing source control tool might prove much more useful than an outright replacement. Take dsource.org for example: an entire website dedicated to D programming that is backed by Subversion. IMO an extension to Subversion would be far more useful to the D community as a whole (and easier to implement). All the same, please look at Mango over on dsource if you're going to write a stand-alone server. The I/O and socket portions of that library may give you a good head-start.
>
> That aside, I like where you're going with this, especially with 'creating a semantic tree' of the code. I gather this would be some form of pseudocode or XML? I can see this becoming useful for increasing performance if you keep the current semantic tree version on hand at all times. That way one can compare the tree against their own local source to make sure they're not altering other portions of the application too badly (i.e. trying not to violate contracts across a whole project).
>
> Another thing: a lot of the spirit of what you're proposing here is captured in D's in/out/body and unittest contracting system. Have you considered incorporating these statements in particular to deepen the semantic meaning of code when you assess it? :)
>
> - Pragma

Well, I wouldn't know anything about extensibility with SVN, as I haven't a copy of the code on hand. I do like the system and am using it personally, just never cared to see its code ;). But if you say it is easier to extend, then I will believe you. An SVN extension is definitely a possible direction in the future for this project, assuming the proof-of-concept diff/patch toolset works. After all, I didn't really have any ambition to create a new stand-alone server in the first place.

You're saying I could gather contract information from the in, out, invariant, etc. constructs that D provides and make sure the coder isn't going to violate them with the code commit? Wow, that takes balls.

Actually I don't think that's possible. How are you to know at compile time if the coder is violating any contracts? Where do you get your values to test against the contracts? And finally, HOW do you represent a contract in an evaluative way, assuming you magically have values provided by the committed code to test against the contracts? I don't think the contract information would be too useful in a revision control system, and it wouldn't be very language-independent either. But it's a cool idea, nonetheless.

Er, anyway... The real intent behind building the semantic tree of the module is to have a uniform way of accessing functions, structures, classes in order to compare them and change them easily. I could foresee this being defined by a relatively large inter-related class hierarchy of things like expressions, statements, etc...

... Wait a tick... that's a SYNTACTIC tree... Aww dammit all. My bad... Well, a semantic tree is really an extension of a syntactic tree, isn't it? Oh God, my head... Someone clarify myself for me.

James Dunne
Aug 19 2004
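The "change them easily" half, i.e. the patch tool, might look something like the sketch below. The same caveats apply: Python's `ast` module (with `ast.unparse`, which needs Python 3.9 or later) is standing in for the D front end, and `apply_ops` plus the `(operation, name, new_source)` triple format are invented here for illustration, not a proposed patch format.

```python
import ast

def apply_ops(target_source, ops):
    """Apply (operation, name, new_source) triples to a module and regenerate its code."""
    tree = ast.parse(target_source)
    for op, name, new_source in ops:
        # Find the existing top-level function with this name, if there is one.
        index = next((i for i, node in enumerate(tree.body)
                      if isinstance(node, ast.FunctionDef) and node.name == name),
                     None)
        if op == "create":
            if index is not None:
                raise ValueError(f"cannot create {name}: it already exists")
            tree.body.extend(ast.parse(new_source).body)
        elif op == "remove":
            if index is None:
                raise ValueError(f"cannot remove {name}: not found")
            del tree.body[index]
        elif op == "replace":
            if index is None:
                raise ValueError(f"cannot replace {name}: not found")
            tree.body[index:index + 1] = ast.parse(new_source).body
        else:
            raise ValueError(f"unknown operation {op!r}")
    # Rebuilding from the tree gives uniform formatting, but drops comments.
    return ast.unparse(tree)

TARGET = "def area(w,h):\n    return w*h   # this comment will not survive\n"

OPS = [
    ("replace", "area", "def area(w, h):\n    return abs(w * h)\n"),
    ("create", "perimeter", "def perimeter(w, h):\n    return 2 * (w + h)\n"),
]

patched = apply_ops(TARGET, OPS)
print(patched)
```

A conflict shows up as an exception (replacing a function that no longer exists) rather than as a hunk that fails to apply. The regenerated code comes out in one house style, and the trailing comment disappears, so a real tool would need the formatting and comment "hints" mentioned earlier in the thread.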
In article <cg31on$2ega$1 digitaldaemon.com>, Jaymz says...

> Well, I wouldn't know anything about extensibility with SVN, as I haven't a copy of the code on hand. I do like the system and am using it personally, just never cared to see its code ;). But if you say it is easier to extend, then I will believe you. An SVN extension is definitely a possible direction in the future for this project, assuming the proof-of-concept diff/patch toolset works.

Well, I haven't done it personally, but word has it that it has an event model of some kind that was written with extensibility in mind. :)

> You're saying I could gather contract information from the in, out, invariant, etc. constructs that D provides and make sure the coder isn't going to violate them with the code commit? Wow, that takes balls.

Um, thank you? I wasn't aware of that statement being all that out there, but in retrospect it's pretty bogus. It's probably all this going back and forth between ColdFusion (work) and D (here in the NG). But I am on the same page now and will restrain from making any future "ballsy" comments. ;)

> Actually I don't think that's possible. How are you to know at compile time if the coder is violating any contracts? Where do you get your values to test against the contracts? And finally, HOW do you represent a contract in an evaluative way, assuming you magically have values provided by the committed code to test against the contracts? I don't think the contract information would be too useful in a revision control system, and it wouldn't be very language-independent either. But it's a cool idea, nonetheless.

Okay, I see where you're coming from now. I was thinking more at the compilation and unittest level, where testing DBC *really* comes into play. You're right: you can't use that kind of information when you're just looking at how the code is put together.

Of course there's no reason why you couldn't get a nightly or on-demand build to do some analysis when an assert or static assert fires in a unittest. After all that processing, wouldn't it be pretty easy to correlate a line number and error message with a particular change ... especially since your semantic pass will know how everything is interrelated?

> Er, anyway... The real intent behind building the semantic tree of the module is to have a uniform way of accessing functions, structures, classes in order to compare them and change them easily. I could foresee this being defined by a relatively large inter-related class hierarchy of things like expressions, statements, etc...

Gotcha. So if the revision system can actually "understand" the code it's processing, then it'll be less prone to screwups and possibly catch developer mistakes as well...

> ... Wait a tick... that's a SYNTACTIC tree... Aww dammit all. My bad... Well, a semantic tree is really an extension of a syntactic tree, isn't it? Oh God, my head... Someone clarify myself for me.

I'll take a stab at that one. I've always understood the semantics of a program to be derived from the syntax used. Yes, it's almost 1-for-1 between meaning and the syntax used, especially in D. The difference lies in how one can do some things in more than one way, like using "?" instead of "if()" and so on: both have the same semantic meaning, but the syntax is totally different.

- Pragma
Aug 19 2004
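The "?" versus "if()" point is easy to see with a real parser. A small sketch, again assuming Python's `ast` module as a stand-in for the D front end (Python's conditional expression plays the role of D's "?"); `syntax_equal` and `behaviour_equal` are illustrative names only, and the behaviour check is a spot check on sample inputs, not a proof of equivalence.

```python
import ast

TERNARY = """
def sign(x):
    return 1 if x >= 0 else -1
"""

IF_ELSE = """
def sign(x):
    if x >= 0:
        return 1
    else:
        return -1
"""

def syntax_equal(a, b):
    """True when both sources parse to the same syntax tree."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))

def behaviour_equal(a, b, name, samples):
    """Run the function called `name` from both sources on the sample inputs."""
    results = []
    for source in (a, b):
        namespace = {}
        exec(source, namespace)
        results.append([namespace[name](s) for s in samples])
    return results[0] == results[1]

print(syntax_equal(TERNARY, IF_ELSE))                         # False
print(behaviour_equal(TERNARY, IF_ELSE, "sign", [-5, 0, 7]))  # True
```

A revision system built on the syntactic tree would record a "replace" here. Only one that reasons about meaning could call the two versions the same, and that is a much harder problem than comparing trees.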
In article <cg358p$2hho$1 digitaldaemon.com>, pragma <EricAnderton at yahoo dot com> says...

> In article <cg31on$2ega$1 digitaldaemon.com>, Jaymz says...
>
>> Well, I wouldn't know anything about extensibility with SVN, as I haven't a copy of the code on hand. I do like the system and am using it personally, just never cared to see its code ;). But if you say it is easier to extend, then I will believe you. An SVN extension is definitely a possible direction in the future for this project, assuming the proof-of-concept diff/patch toolset works.
>
> Well, I haven't done it personally, but word has it that it has an event model of some kind that was written with extensibility in mind. :)
>
>> You're saying I could gather contract information from the in, out, invariant, etc. constructs that D provides and make sure the coder isn't going to violate them with the code commit? Wow, that takes balls.
>
> Um, thank you? I wasn't aware of that statement being all that out there, but in retrospect it's pretty bogus. It's probably all this going back and forth between ColdFusion (work) and D (here in the NG). But I am on the same page now and will restrain from making any future "ballsy" comments. ;)
>
>> Actually I don't think that's possible. How are you to know at compile time if the coder is violating any contracts? Where do you get your values to test against the contracts? And finally, HOW do you represent a contract in an evaluative way, assuming you magically have values provided by the committed code to test against the contracts? I don't think the contract information would be too useful in a revision control system, and it wouldn't be very language-independent either. But it's a cool idea, nonetheless.
>
> Okay, I see where you're coming from now. I was thinking more at the compilation and unittest level, where testing DBC *really* comes into play. You're right: you can't use that kind of information when you're just looking at how the code is put together.
>
> Of course there's no reason why you couldn't get a nightly or on-demand build to do some analysis when an assert or static assert fires in a unittest. After all that processing, wouldn't it be pretty easy to correlate a line number and error message with a particular change ... especially since your semantic pass will know how everything is interrelated?

I thought you were gonna *restrain* from making future "ballsy" comments? ;). What type of algorithm could be developed based on a commit and a static assert firing to lead to a possible collection of offending line numbers? Now THAT would be really interesting, and is technically feasible!

>> Er, anyway... The real intent behind building the semantic tree of the module is to have a uniform way of accessing functions, structures, classes in order to compare them and change them easily. I could foresee this being defined by a relatively large inter-related class hierarchy of things like expressions, statements, etc...
>
> Gotcha. So if the revision system can actually "understand" the code it's processing, then it'll be less prone to screwups and possibly catch developer mistakes as well...

Well, the project's scope certainly has escalated from a simple syntactic tree based revision control system to an intelligent learning machine that'll automagically fix your mistakes and know what you *really* want to do. LOL. Not to pick on you, Pragma. ;) Hey! Why don't we just build a neural network of a few billion nodes and train it on D grammar and semantics 'til it's sick? Oh wait, we already got a couple hundred of 'em walkin around.. Dammit. lol.

>> ... Wait a tick... that's a SYNTACTIC tree... Aww dammit all. My bad... Well, a semantic tree is really an extension of a syntactic tree, isn't it? Oh God, my head... Someone clarify myself for me.
>
> I'll take a stab at that one. I've always understood the semantics of a program to be derived from the syntax used. Yes, it's almost 1-for-1 between meaning and the syntax used, especially in D. The difference lies in how one can do some things in more than one way, like using "?" instead of "if()" and so on: both have the same semantic meaning, but the syntax is totally different.
>
> - Pragma

<not snobby>I do realize the difference between syntax and semantics</not snobby>. And in general, a language's syntax will strongly reflect its semantics (unless you complain of such silly things as READABILITY ... damn VB coders). Regardless, should the revision control system be based on a /syntactic/ or /semantic/ tree representation of the code? To contradict myself, as I always do, I don't see much benefit now in a /semantic/ tree for a simple revision control system. :)

I do hope I've successfully confused everyone now. The master of deception and contradiction will be back tomorrow morning.

James Dunne
Aug 19 2004
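On the "which change made the assert fire" question, one very rough way to narrow things down is to walk the call graph from the function where the assert fired and keep only the functions the commit touched. The sketch below assumes Python's `ast` module in place of the D front end, a commit that is already summarised as a set of changed function names, and only direct calls to plain names; `call_graph` and `suspects` are made-up names, and this yields candidates rather than a verdict.

```python
import ast

def call_graph(source):
    """Map each top-level function name to the set of plain names it calls."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name] = {
                call.func.id for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return graph

def suspects(source, failing, changed):
    """Changed functions reachable from the function where the assert fired."""
    graph = call_graph(source)
    seen, stack = set(), [failing]
    while stack:
        name = stack.pop()
        if name in seen or name not in graph:
            continue  # already visited, or not defined in this module
        seen.add(name)
        stack.extend(graph[name])
    return sorted(seen & set(changed))

SOURCE = """
def scale(x):
    return x * 3

def offset(x):
    return x + 1

def log(msg):
    pass

def compute(x):
    return offset(scale(x))

def test_compute():
    assert compute(2) == 5
"""

# Suppose the commit touched scale() and log(), and test_compute() now fails.
print(suspects(SOURCE, "test_compute", changed={"scale", "log"}))  # ['scale']
```

`log` is ruled out because nothing on the path from the failing test ever calls it. The line numbers of the remaining candidates are available from the tree nodes, so mapping back to "offending lines" would be a lookup rather than a guess.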