digitalmars.D - dmd Lexer and Parser in D

Zach Tollen <reachzachatgooglesmailservice dot.com> writes:

Greetings! I am a rather new programmer and while this is my first post 
I wanted to say that I did some work on the ddmd project at dsource.org, 
which was kind of a big hairy mess. My fork of this project is at:

https://github.com/zachthemystic/ddmd-clean

The point is, I cleaned out the crappiness but I chucked the entire 
semantic analysis and backend, so what's left is a port of the dmd lexer 
and parser in the D language. The README there has more to say.

I might well "announce" this on D.announce but I'm too new to have a 
feel for the significance of it all.

Thanks for reading,

Zach

Feb 03 2012

Trass3r <un known.com> writes:

Maybe there's some IDE that can make use of this.
"Unfortunately" VisualD already has its own ;)

Feb 04 2012

"F i L" <witte2008 gmail.com> writes:

On Saturday, 4 February 2012 at 05:24:45 UTC, Zach Tollen wrote:
 Greetings! I am a rather new programmer and while this is my 
 first post I wanted to say that I did some work on the ddmd 
 project at dsource.org, which was kind of a big hairy mess. My 
 fork of this project is at:

 https://github.com/zachthemystic/ddmd-clean

 The point is, I cleaned out the crappiness but I chucked the 
 entire semantic and backend, so that you have left a port of 
 the dmd lexer and parser in the D language now. The README 
 there has more to say.

 I might well "announce" this on D.announce but I'm too new to 
 have a feel for the significance of it all.

 Thanks for reading,

 Zach

Very cool. I was talking with someone on IRC about the 
possibility/difficulties of making DMD's parser/lexer/AST stay 
open in memory with protocols designed for IDE code-completion 
communication. It would be ideal to have an IDE's intellisense 
automatically update with DMD semantically.

Unfortunately the conclusion was that it would be too difficult an 
undertaking to be realistic, since DMD is designed to be 
run-and-done (also something about "Walter code" :-)). But maybe 
a rewrite/port of DMD, especially one written in D, might be able 
to be reworked with this goal in mind? How complete is DDMD?

Feb 04 2012

Trass3r <un known.com> writes:

 Unfortunately the conclusion was that it would be too difficult an  
 undertaking to be realistic, since DMD is designed to be run-and-done  
 (also something about "Walter code" :-)). But maybe a rewrite/port of  
 DMD, especially one written in D, might be able to be reworked with this  
 goal in mind? How complete is DDMD?

It's stuck at 2.040. I doubt getting it up-to-date would be worth the  
effort.
Also there are still plenty of unimplemented functions.

Feb 04 2012

"F i L" <witte2008 gmail.com> writes:

 It's stuck at 2.040. I doubt getting it up-to-date would be 
 worth the effort.
 Also there are still plenty of unimplemented functions.

Ah, oh well...

Feb 04 2012

Zach Tollen <reachzachatgooglesmailservice dot.com> writes:

On 2/4/12 6:59 AM, F i L wrote:
 Very cool. I was talking with someone on the IRC about the
 possibility/difficulties of making DMD's parser/lexer/AST stay open in
 memory with protocols designed for IDE code-completion communication. It
 would be ideal to have an IDE's intellisense automatically update with
 DMD semantically.

This is my thinking too. One good thing about having cut the program is 
that it's much more lightweight now, and I did it because I thought, 
well, maybe once it's pared down, I can actually steer it toward IDE 
functionality. For example, you could really cut out a lot of the 
members of the data structures which only point to backend functionality 
anyway.

Even if the whole project fails I won't regret doing it because I 
learned a lot about D in the process.

What I'm really wondering is if you wanted a program which helped you 
edit the syntax tree directly and only produced a text file for saving 
and running, what kind of data structure would you like to have 
representing the syntax tree? Without knowing anything else, I guessed 
that it would be nice to have something resembling the official D 
parse-tree.

 Unfortunately the conclusion was that it would be too difficult an
 undertaking to be realistic, since DMD is designed to be run-and-done
 (also something about "Walter code" :-)).

I was wondering if you couldn't take a parse-tree data structure and 
deparse (disparse?) it back to formatted program code so that you could 
see what you were editing? As unrealistic as that sounds, I'm 
sufficiently attracted to the idea that I'm investigating it with an 
open mind.
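Going from a tree back to formatted text tends to be the easier direction. Here is a minimal sketch of the idea, written in C++ because that is what dmd's own source uses. The node types (`Node`, `ExprStatement`, `Block`) and the `indent` helper are made up for illustration; they are not the classes used by dmd or ddmd.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature syntax tree; not dmd's or ddmd's node types.
struct Node {
    virtual ~Node() = default;
    // Render this node back to source text at the given nesting depth.
    virtual std::string toSource(int depth) const = 0;
};

// Four spaces of indentation per nesting level.
static std::string indent(int depth) {
    assert(depth >= 0);
    return std::string(static_cast<std::size_t>(depth) * 4, ' ');
}

// A statement stored as one line of text, e.g. a function call.
struct ExprStatement : Node {
    std::string text;
    explicit ExprStatement(std::string t) : text(std::move(t)) {}
    std::string toSource(int depth) const override {
        return indent(depth) + text + ";\n";
    }
};

// A braced block that owns its child statements.
struct Block : Node {
    std::vector<std::unique_ptr<Node>> children;
    std::string toSource(int depth) const override {
        std::string out = indent(depth) + "{\n";
        for (const auto& child : children)
            out += child->toSource(depth + 1);
        return out + indent(depth) + "}\n";
    }
};
```

Calling `toSource(0)` on a `Block` holding a single `ExprStatement` gives a braced block with the statement indented one level. A real deparser would also have to decide what to do with comments and the author's original layout, which this sketch ignores.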

 But maybe a rewrite/port of DMD, especially one written in D, might
 be able to be reworked with this
 goal in mind? How complete is DDMD?

This is exactly what I'm aiming at. My basic hope that it's possible 
rests on the comforting notion that the bulk of dmd is actually 
the stuff I threw out! The goal would be to construct the front end (of 
a front end) which was at least theoretically capable both of allowing 
code editing, and of translation to a more backend-friendly data 
structure. If that's not possible, then I'm stuck with this thought that 
you edit the tree, then the IDE reverses the parse back into ordinary 
code for saving and compiling.

If anybody can refer me to any examples and demonstrations of this type 
of code-editing, please do. As someone new to programming, I'm really 
wondering: if the program itself is understood by the computer as a 
tree, why do I have to edit a text file instead of a tree?

Zach

Feb 04 2012

Timon Gehr <timon.gehr gmx.ch> writes:

On 02/04/2012 06:39 PM, Zach Tollen wrote:
 On 2/4/12 6:59 AM, F i L wrote:
 Very cool. I was talking with someone on the IRC about the
 possibility/difficulties of making DMD's parser/lexer/AST stay open in
 memory with protocols designed for IDE code-completion communication. It
 would be ideal to have an IDE's intellisense automatically update with
 DMD semantically.

 This is my thinking too. One good thing about having cut the program is
 that it's a much lighter weight now, and I did it because I thought,
 well, maybe once it's pared down, I can actually steer it toward IDE
 functionality. For example, you could really cut out a lot of the
 members of the data structures which only point to backend functionality
 anyway.

 Even if the whole project fails I won't regret doing it because I
 learned a lot about D in the process.

 What I'm really wondering is if you wanted a program which helped you
 edit the syntax tree directly and only produced a text file for saving
 and running, what kind of data structure would you like to have
 representing the syntax tree? Without knowing anything else, I guessed
 that it would be nice to have something resembling the official D
 parse-tree.

 Unfortunately the conclusion was that it would be too difficult an
 undertaking to be realistic, since DMD is designed to be run-and-done
 (also something about "Walter code" :-)).

 I was wondering if you couldn't take a parse-tree data structure and
 deparse (disparse?) it back to formatted program code so that you could
 see what you were editing? As unrealistic as that sounds, I'm
 sufficiently attracted to the idea that I'm investigating it with an
 open mind.

  > But maybe a rewrite/port of DMD, especially one written in D, might
  > be able to be reworked with this
  > goal in mind? How complete is DDMD?

 This is exactly what I'm aiming at. My basic hopes for its being
 possible are the comforting notion that the huge part of dmd is actually
 the stuff I threw out! The goal would be to construct the front end (of
 a front end) which was at least theoretically capable both of allowing
 code editing, and of translation to a more backend-friendly data
 structure. If that's not possible, then I'm stuck with this thought that
 you edit the tree, then the IDE reverses the parse back into ordinary
 code for saving and compiling.

 If anybody can refer me to any examples and demonstrations of this type
 of code-editing, please do. As someone new to programming I'm really
 wondering why, if the program itself is understood by the computer as a
 tree, why do I have to edit a text file instead of a tree?

 Zach

You are: The source file can be seen as the representation of a tree 
structure, and if you read the source you group the characters in a 
tree-like way in order to understand what it is saying. Anyway, this is 
true for any language. Your post could be parsed into a tree structure too.

You might want to have a look at lisp. Its syntax is a straightforward 
description of the parse tree. 
http://en.wikipedia.org/wiki/Lisp_%28programming_language%29
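To make the Lisp point concrete: because the parentheses already spell out the tree, a reader for such syntax is tiny. The following is a rough sketch in C++ (the language of dmd's source); the `SExpr` type and `parse` function are hypothetical names, and the code assumes well-formed input with space-separated atoms and no string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node: either an atom (leaf) or a list of children.
struct SExpr {
    std::string atom;            // filled in only for leaves
    std::vector<SExpr> children; // filled in only for lists
    bool isList = false;
};

// Parses one expression starting at pos and advances pos past it.
// Assumes balanced parentheses; not a full Lisp reader.
SExpr parse(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && s[pos] == ' ') ++pos; // skip leading blanks
    SExpr node;
    if (pos < s.size() && s[pos] == '(') {
        node.isList = true;
        ++pos; // consume '('
        while (true) {
            while (pos < s.size() && s[pos] == ' ') ++pos;
            if (pos >= s.size() || s[pos] == ')') break;
            node.children.push_back(parse(s, pos)); // nested element
        }
        if (pos < s.size()) ++pos; // consume ')'
    } else {
        // An atom runs until the next blank or parenthesis.
        while (pos < s.size() && s[pos] != ' ' && s[pos] != '(' && s[pos] != ')')
            node.atom += s[pos++];
    }
    return node;
}
```

Feeding it `(+ 1 (* 2 3))` yields a list of three children whose last child is itself a list, which is exactly the shape of the expression tree.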

Feb 04 2012

Zach Tollen <reachzachatgooglesmailservice dot.com> writes:

On 2/4/12 1:24 PM, Timon Gehr wrote:
 On 02/04/2012 06:39 PM, Zach Tollen wrote:
 If anybody can refer me to any examples and demonstrations of this type
 of code-editing, please do. As someone new to programming I'm really
 wondering why, if the program itself is understood by the computer as a
 tree, why do I have to edit a text file instead of a tree?

 Zach

 You are: The source file can be seen as the representation of a tree
 structure, and if you read the source you group the characters in a
 tree-like way in order to understand what it is saying. Anyway, this is
 true for any language. Your post could be parsed into a tree structure too.

I know what you mean, but what I mean is that it would be cool if my 
text editor knew that when I started a line with 'writeln(' that I had 
no intention of finishing the line without inserting a corresponding 
');'. Instead, if I forget to add the ending parenthesis, the compiler 
thinks I meant never to end the function call and it gives a parse error 
when it gets to something it can't read according to its expectations. 
It gets a little worse when the structure in question gets larger or 
more complicated.
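A text editor can get part of the way there without a full parser by tracking bracket nesting and pointing at the opener that was never closed. This is a deliberately naive sketch in C++ (matching dmd's source language); `firstUnbalanced` is a hypothetical helper, and it ignores string literals and comments, which a real editor could not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first stray or mismatched closing bracket,
// else the index of the outermost opening bracket left unclosed,
// else -1 when everything is balanced.
long firstUnbalanced(const std::string& src) {
    std::vector<std::size_t> open; // positions of currently open brackets
    for (std::size_t i = 0; i < src.size(); ++i) {
        char c = src[i];
        if (c == '(' || c == '[' || c == '{') {
            open.push_back(i);
        } else if (c == ')' || c == ']' || c == '}') {
            char want = (c == ')') ? '(' : (c == ']') ? '[' : '{';
            if (open.empty() || src[open.back()] != want)
                return static_cast<long>(i); // closer with no matching opener
            open.pop_back();
        }
    }
    return open.empty() ? -1 : static_cast<long>(open.front());
}
```

For the unfinished line `writeln("hi"` it reports the position of the `(`, which is where a human would want the cursor to go, rather than wherever the parser eventually gives up.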

What I wish would happen is that I simply told the editor directly "I 
want to insert a complete statement here", and then it inserts the 
statement into its tree, and I can't get rid of it without giving a 
specific command to do so. So while I understand that the text file 
*represents* a syntax tree, I wish it were more controlled than that. 
That thought made it tempting to consider: how hard would it be to 
have the editor just hold the tree itself in memory, with all the 
editor's commands oriented toward adding, deleting, and changing parts 
of the program itself instead of changing textual characters which 
merely represent the tree?

I see two reasons this might be a bad idea.

First, even in an ideal world, where you had a fully implemented syntax 
tree editor, it might turn out that it's just worse than manually 
editing the files.

But the other reason is one of tradition and infrastructure. If all the 
experience folks have is with text editing then they don't want to 
change, and all the infrastructure is already built to support text 
editing anyway.

It's this second reason I'm scared of. It would seem like a shame if 
that were the only reason nobody wants to build a syntax-tree editor.

So I'm still interested in this idea. I'm going to try to research 
people's experiences with this kind of thing.

http://en.wikipedia.org/wiki/Structure_editor

Zach

Feb 04 2012

Jacob Carlborg <doob me.com> writes:

On 2012-02-04 18:39, Zach Tollen wrote:
 On 2/4/12 6:59 AM, F i L wrote:
 Very cool. I was talking with someone on the IRC about the
 possibility/difficulties of making DMD's parser/lexer/AST stay open in
 memory with protocols designed for IDE code-completion communication. It
 would be ideal to have an IDE's intellisense automatically update with
 DMD semantically.

 This is my thinking too. One good thing about having cut the program is
 that it's a much lighter weight now, and I did it because I thought,
 well, maybe once it's pared down, I can actually steer it toward IDE
 functionality. For example, you could really cut out a lot of the
 members of the data structures which only point to backend functionality
 anyway.

 Even if the whole project fails I won't regret doing it because I
 learned a lot about D in the process.

 What I'm really wondering is if you wanted a program which helped you
 edit the syntax tree directly and only produced a text file for saving
 and running, what kind of data structure would you like to have
 representing the syntax tree? Without knowing anything else, I guessed
 that it would be nice to have something resembling the official D
 parse-tree.

 Unfortunately the conclusion was that it would be too difficult an
 undertaking to be realistic, since DMD is designed to be run-and-done
 (also something about "Walter code" :-)).

 I was wondering if you couldn't take a parse-tree data structure and
 deparse (disparse?) it back to formatted program code so that you could
 see what you were editing? As unrealistic as that sounds, I'm
 sufficiently attracted to the idea that I'm investigating it with an
 open mind.

  > But maybe a rewrite/port of DMD, especially one written in D, might
  > be able to be reworked with this
  > goal in mind? How complete is DDMD?

 This is exactly what I'm aiming at. My basic hopes for its being
 possible are the comforting notion that the huge part of dmd is actually
 the stuff I threw out! The goal would be to construct the front end (of
 a front end) which was at least theoretically capable both of allowing
 code editing, and of translation to a more backend-friendly data
 structure. If that's not possible, then I'm stuck with this thought that
 you edit the tree, then the IDE reverses the parse back into ordinary
 code for saving and compiling.

 If anybody can refer me to any examples and demonstrations of this type
 of code-editing, please do. As someone new to programming I'm really
 wondering why, if the program itself is understood by the computer as a
 tree, why do I have to edit a text file instead of a tree?

 Zach

You could have a look at Clang. It's a frontend for LLVM that's 
developed to be used both as a compiler and as a library to build other 
tools on, such as IDEs.

-- 
/Jacob Carlborg

Feb 04 2012

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Zach Tollen" <reachzachatgooglesmailservice dot.com> wrote in message 
news:jgjqfo$2puc$1 digitalmars.com...
 What I'm really wondering is if you wanted a program which helped you edit 
 the syntax tree directly and only produced a text file for saving and 
 running, what kind of data structure would you like to have representing 
 the syntax tree? Without knowing anything else, I guessed that it would be 
 nice to have something resembling the official D parse-tree.

The code inside dmd that does lowerings does something like this.

Want to add printf to the end of a function?  Sure!

fbody = new CompoundStatement(loc, fbody,
    new ExpStatement(loc,
        new CallExp(loc,
            new VarExp(loc, new IdentifierExp(loc, Lexer::idPool("printf"))),
            new StringExp(loc, "Hello world!\n"))));

It is a lot better to have the parser generate the syntax tree for you!

fbody = new CompoundStatement(loc, fbody,
    new CompileStatement(loc, "printf(\"Hello world!\\n\");"));

(I know this isn't what you meant, but that's what the parse tree looks 
like)

Feb 04 2012

"Daniel Murphy" <yebblies nospamgmail.com> writes:

On a related note, how much interest is there around here in having an 
official version of dmd written in D?

There are two ways I can imagine this actually happening:
1.
- Improve D's ability to link with C++
- Make D bindings out of the header files
- Port code to D incrementally

2.
- Dify the C++ source (no classes on the stack/embedded, no bitfields, etc)
- Fix all #ifdefs that break up expressions so they can be turned into 
versions
- Create a conversion program to turn it into D ('->' -> '.', (type) -> 
cast(type) etc)
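
The simplest of the rewrites in option 2, '->' to '.', can be sketched as a plain text pass. This is only an illustration in C++ of what such a conversion step might look like; `arrowsToDots` is a hypothetical name, and the pass skips double-quoted string literals but knows nothing about comments, character literals, or raw strings. Turning `(type)` into `cast(type)` needs real parsing and is not attempted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rewrites C++ pointer member access (a->b) to D-style a.b.
// Text inside double-quoted string literals is copied through unchanged.
std::string arrowsToDots(const std::string& cpp) {
    std::string out;
    bool inString = false;
    for (std::size_t i = 0; i < cpp.size(); ++i) {
        char c = cpp[i];
        if (inString) {
            out += c;
            if (c == '\\' && i + 1 < cpp.size())
                out += cpp[++i];      // copy the escaped character as-is
            else if (c == '"')
                inString = false;     // closing quote
        } else if (c == '"') {
            inString = true;          // opening quote
            out += c;
        } else if (c == '-' && i + 1 < cpp.size() && cpp[i + 1] == '>') {
            out += '.';
            ++i;                      // skip the '>'
        } else {
            out += c;
        }
    }
    return out;
}
```

For example, `e->type->toChars()` becomes `e.type.toChars()`, while an arrow inside a string literal is left alone.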

Just something to think about for the distant future.

Feb 04 2012

"Nick Sabalausky" <a a.a> writes:

"Daniel Murphy" <yebblies nospamgmail.com> wrote in message 
news:jgj9mu$1q60$1 digitalmars.com...
 On a related note, how much interest is there around here in having an 
 official version of dmd written in D?

I'm interested in a D *API* for taking in D sources and spitting out the 
user's choice of either the parser results, or an AST with all the 
semantics/CTFE/etc already run. I get the impression a lot of people are 
interested in this.

As far as the actual *implementation* behind the D interface, I don't 
particularly care if it's C, C++, or D.

I suspect having it in D might be a pain until a lot more issues get resolved. 
A bootstrapping compiler, I would imagine, would need a much more stable 
base than other types of software would need (though I don't have any 
experience with bootstrapping compilers, so I could be wrong).

Feb 04 2012

Armin Kazmi <armin.kazmi tu-dortmund.de> writes:

Well, I think it might be easier to change the dmd implementation to use 
C only and then write language bindings to that. We all know the 
binding situation to C++ won't change any time soon.

 "Daniel Murphy" <yebblies nospamgmail.com> wrote in message
 news:jgj9mu$1q60$1 digitalmars.com...
 On a related note, how much interest is there around here in having an
 official version of dmd written in D?

 I'm interested in a D *API* for taking in D sources and spitting out the
 user's choice of either the parser results, or an AST with all the
 semantics/CTFE/etc already run. I get the impression a lot of people are
 interested in this.

 As far as the actual *implementation* behind the D interface, I don't
 particularly care if it's C, C++, or D.

 I suspect having it in D might be a pain until a lot more issues get
 resolved. A bootstrapping compiler, I would imagine, would need a much
 more stable base than other types of software would need (though I don't
 have any experience with bootstrapping compilers, so I could be wrong).

Feb 04 2012

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Nick Sabalausky" <a a.a> wrote in message 
news:jgjngv$2je9$1 digitalmars.com...
 I'm interested in a D *API* for taking in D sources and spitting out the 
 user's choice of either the parser results, or an AST with all the 
 semantics/CTFE/etc already run. I get the impression a lot of people are 
 interested in this.

This is not that far off.  I've got a branch of dmd, with a di file for 
every h file, that is able to link to itself.  There are still some issues 
with vtables and static variables but hopefully I will sort them out in the 
near future.

What would the D API look like?  If D can link to C++ well enough to call 
into the dmd source, building an API on top of that wouldn't be that bad.

 I suspect having it in D might be a pain until a lot more issues get 
 resolved. A bootstrapping compiler, I would imagine, would need a much 
 more stable base than other types of software would need (though I don't 
 have any experience with bootstrapping compilers, so I could be wrong).

Yeah.

Feb 04 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, February 04, 2012 23:52:46 Daniel Murphy wrote:
 On a related note, how much interest is there around here in having an
 official version of dmd written in D?
 
 There are two ways I can imagine this actually happening:
 1.
 - Improve D's ability to link with C++
 - Make D bindings out of the header files
 - Port code to D incrementally
 
 2.
 - Dify the C++ source (no classes on the stack/embedded, no bitfields, etc)
 - Fix all #ifdefs that break up expressions so they can be turned into
 versions
 - Create a conversion program to turn it into D ('->' -> '.', (type) ->
 cast(type) etc)
 
 Just something to think about for the distant future.

The intention is to have a lexer and parser for D in Phobos at some point, but 
I don't know how much we gain by having the whole compiler in D. It's not a 
bad idea in the least, and it would be a great project for someone to tackle, 
but of all of the things that a contributor could be doing, I'm not sure that 
that's really all that high on the list as far as value goes.

- Jonathan M Davis

Feb 04 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Feb 04, 2012 at 01:56:50PM -0800, Jonathan M Davis wrote:
[...]
 The intention is to have a lexer and parser for D in Phobos at some
 point, but I don't know how much we gain by having the whole compiler
 in D. It's not a bad idea in the least, and it would be a great
 project for someone to tackle, but of all of the things that a
 contributor could be doing, I'm not sure that that's really all that
 high on the list as far as value goes.

[...]

I'm actually thinking about writing a D pretty printer as a little
exercise in D programming. I haven't decided whether or not to simply
adapt the existing dmd frontend. One of Walter's stated advantages of D
is that it's easily lexed and parsed, even if semantics are disregarded.
I'm considering proving that statement by building a lexer/parser from
the ground up, though of course lacking most of the complexity of the real
compiler since I only need to do just enough to be able to pretty-print
D code.
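
As a rough idea of how little a pretty-printer's lexer needs at first, here is a toy tokenizer sketched in C++ (the language of the existing dmd frontend). The `TokKind`, `Token` and `lex` names are made up for this example, and it handles only identifiers, integer literals and single-character punctuation; comments, strings, floating-point literals and multi-character operators are all left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokKind { Identifier, Number, Punct };

struct Token {
    TokKind kind;
    std::string text;
};

// Splits source text into tokens, discarding whitespace.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> toks;
    std::size_t i = 0;
    // Classify a character safely (cctype functions need unsigned values).
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
    while (i < src.size()) {
        if (std::isspace(at(i))) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(at(i)) || src[i] == '_') {
            // Identifier: letter or underscore, then letters, digits, underscores.
            while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
            toks.push_back({TokKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(at(i))) {
            // Integer literal: a run of decimal digits.
            while (i < src.size() && std::isdigit(at(i))) ++i;
            toks.push_back({TokKind::Number, src.substr(start, i - start)});
        } else {
            // Anything else is treated as one punctuation character.
            ++i;
            toks.push_back({TokKind::Punct, src.substr(start, 1)});
        }
    }
    return toks;
}
```

Lexing `int x=42;` produces five tokens: two identifiers, `=`, the number `42`, and `;`. A pretty-printer would additionally need to keep comments as tokens so they survive reformatting.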

If it's done correctly, it might even be useful in automated conversions
between different preferred indentation styles, etc.


T

-- 
This is a tpyo.

Feb 04 2012
