digitalmars.D - ANTLR grammar for D?

"Wesley Hamilton" <ffhighwind gmail.com> writes:
I've started making a D grammar for ANTLR4, but I didn't want to 
spend days testing and debugging it later if somebody already has 
one.

The best search results turn up posts that are 10 years old. Only 
one post has a link to a grammar file and the page seems to have 
been removed. I also assume it would be obsolete with changes to 
ANTLR and D.
http://www.digitalmars.com/d/archives/digitalmars/D/25302.html
http://www.digitalmars.com/d/archives/digitalmars/D/4953.html
Jun 19 2014
dennis luehring <dl.soluz gmx.net> writes:
On 20.06.2014 08:57, Wesley Hamilton wrote:
 I've started making a D grammar for ANTLR4, but I didn't want to
 spend days testing and debugging it later if somebody already has
 one.

 The best search results turn up posts that are 10 years old. Only
 one post has a link to a grammar file and the page seems to have
 been removed. I also assume it would be obsolete with changes to
 ANTLR and D.
 http://www.digitalmars.com/d/archives/digitalmars/D/25302.html
 http://www.digitalmars.com/d/archives/digitalmars/D/4953.html
The most up-to-date one seems to be https://github.com/Hackerpilot/DGrammar
Jun 20 2014
"Wesley Hamilton" <ffhighwind gmail.com> writes:
On Friday, 20 June 2014 at 07:47:44 UTC, dennis luehring wrote:
 The most up-to-date one seems to be https://github.com/Hackerpilot/DGrammar
Thanks. I just realized that the "add grammar" button for the ANTLR grammar list is broken, so that could be why it's not there. I'll probably still finish the grammar I'm making since I'm 75% done. That's a great reference, though. I think it's missing a few minor details like delimited strings, token strings, and assembly keywords. It should help where the Language Reference pages aren't accurate. For example, I think HexLetter is incorrectly defined.
Jun 20 2014
Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/20/14 11:22, Wesley Hamilton via Digitalmars-d wrote:
 It should help where the Language Reference pages aren't accurate. For
 example, I think HexLetter is incorrectly defined.
What's the problem with HexLetter?

Once upon a time I did play with parsing D. Unfortunately, the compiler situation has indirectly resulted in a year+ long pause, as I was (maybe naively) hoping to be able to finish the project using an at least semi-modern D dialect, so that it would be usable for not just me...

At least the lexer was done by then, and I think I fixed most of the dlang problems during the conversion to PEG. It's still available; maybe it has some value as an additional reference:

http://repo.or.cz/w/girtod.git/blob/refs/heads/lexer:/dlanglexer.d

At least back when I did that, the dlang.org docs had quite a few problems; some have probably been fixed since.

artur
Jun 20 2014
"Wesley Hamilton" <ffhighwind gmail.com> writes:
On Friday, 20 June 2014 at 12:35:26 UTC, Artur Skawina via 
Digitalmars-d wrote:
 What's the problem with HexLetter?
http://dlang.org/lex
Maybe I'm blind, but HexLetter includes an underscore, and HexDigitsUS isn't defined. Based on this, I figure there's a possibility of other errors. I realize that there's an "improve this page" button... maybe I'll get around to testing the compiler against this theory.
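
For anyone who wants to compare candidate rules before testing the compiler, here is a minimal Python sketch of one possible reading: underscores are separators between hex digits, not hex letters themselves. The exact rule (at least one real hex digit, underscores allowed anywhere after the prefix) is my assumption for illustration and is not copied from dlang.org; integer suffixes such as L or u are ignored.

```python
import re

# Assumed shape: "0x" or "0X", optional leading underscores, one hex digit,
# then any mix of hex digits and underscores. This is an approximation for
# illustration, not a transcription of the dlang.org grammar.
HEX_LITERAL = re.compile(r"0[xX]_*[0-9a-fA-F][0-9a-fA-F_]*")


def is_hex_literal(text: str) -> bool:
    """Return True if the whole string matches the assumed hex-literal shape."""
    return HEX_LITERAL.fullmatch(text) is not None


def hex_value(text: str) -> int:
    """Strip the underscore separators and convert the digits to an integer."""
    if not is_hex_literal(text):
        raise ValueError(f"not a hex literal: {text!r}")
    return int(text[2:].replace("_", ""), 16)
```

Under this reading "0xFF_FF" is a literal and "0x_" is not. Whether the compiler agrees on edge cases such as a leading or trailing underscore is exactly what would need to be tested.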
Jun 20 2014
"Brian Schott" <briancschott gmail.com> writes:
On Friday, 20 June 2014 at 09:22:07 UTC, Wesley Hamilton wrote:
 Thanks. Just realized that the "add grammar" button for ANTLR 
 grammar list is broken... so that could be why it's not there. 
 I'll probably still finish the grammar I'm making since I'm 75% 
 done. That's a great reference, though. I think it's missing a 
 few minor details like delimited strings, token strings, and 
 assembly keywords.
Keep in mind that assembly keywords aren't keywords outside of ASM blocks. You need your lexer to identify them as identifiers.
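
A rough Python sketch of that idea, not taken from DGrammar or any real lexer: every word is lexed as an identifier first, and opcode names are only reclassified while inside an asm block. The keyword and opcode sets are tiny hypothetical samples, and the check is deliberately naive (any opcode-named word inside the block counts as an opcode).

```python
import re

# Tiny illustrative samples; real D keyword and opcode lists are much longer.
KEYWORDS = {"asm", "int", "void", "return"}
ASM_OPCODES = {"mov", "add", "ret"}

TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def classify(source: str):
    """Yield (kind, text) pairs; opcodes are recognized only inside asm { ... }."""
    depth = 0            # brace depth inside the current asm block, 0 = outside
    pending_asm = False  # True after the `asm` keyword until its `{` is seen
    for text in TOKEN.findall(source):
        kind = "OTHER"
        if text == "{" and (pending_asm or depth > 0):
            depth += 1
            pending_asm = False
        elif text == "}" and depth > 0:
            depth -= 1
        elif text in KEYWORDS:
            pending_asm = text == "asm"
            kind = "KEYWORD"
        elif IDENTIFIER.fullmatch(text):
            in_asm = depth > 0
            kind = "OPCODE" if in_asm and text in ASM_OPCODES else "IDENT"
        yield kind, text
```

With the input "int mov = 1; asm { mov EAX, 1; }" the first mov comes out as IDENT and the second as OPCODE. In an ANTLR grammar the same decision would more likely live in a semantic predicate or a check after parsing.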
 It should help where the Language Reference pages aren't 
 accurate. For example, I think HexLetter is incorrectly defined.
If you find problems in the grammar please file an issue on GitHub or create a pull request. If you need the AST of some D code, you'll save a lot of time by downloading D-Scanner and running "dscanner --ast sourcecode.d > sourcecode_ast.xml"
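
If the XML is going to be consumed by a script rather than read by hand, a small Python wrapper along these lines might help. It only uses the `dscanner --ast` invocation quoted above and assumes the tool writes well-formed XML to standard output; the `run` parameter is my own addition so the function can be tried without D-Scanner installed, and no particular element names in the output are assumed.

```python
import subprocess
import xml.etree.ElementTree as ET


def load_ast(path, run=subprocess.run):
    """Run `dscanner --ast <path>` and parse its standard output as XML."""
    result = run(["dscanner", "--ast", path],
                 capture_output=True, text=True, check=True)
    return ET.fromstring(result.stdout)


def count_tags(root):
    """Count how often each element name occurs in the parsed tree."""
    counts = {}
    for node in root.iter():
        counts[node.tag] = counts.get(node.tag, 0) + 1
    return counts
```

Calling count_tags(load_ast("sourcecode.d")) gives a quick overview of which node types appear in a file, which can be handy when comparing against the parse tree of a hand-written grammar.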
Jun 20 2014
"Wesley Hamilton" <ffhighwind gmail.com> writes:
On Friday, 20 June 2014 at 18:20:36 UTC, Brian Schott wrote:
 Keep in mind that assembly keywords aren't keywords outside of ASM blocks. You need your lexer to identify them as identifiers.
 If you find problems in the grammar please file an issue on GitHub or create a pull request. If you need the AST of some D code, you'll save a lot of time by downloading D-Scanner and running "dscanner --ast sourcecode.d > sourcecode_ast.xml"
My intent is to develop a language based on D and a compiler to go with it. I've done something similar using ANTLR once before. I might turn it into a BS project.
Jun 20 2014
"Wesley Hamilton" <ffhighwind gmail.com> writes:
On Friday, 20 June 2014 at 18:45:15 UTC, Wesley Hamilton wrote:
 My intent is to develop a language based on D and a compiler to go with it. I've done something similar using ANTLR once before. I might turn it into a BS project.
I realize assembly instruction keywords aren't actually tokens for the lexer. Having a clean ANTLR file that doesn't include predicates and language-dependent code is nice as a starting point, but the parser eventually needs to check the validity of the asm statements. DelimitedStrings that use an identifier as the delimiter need predicates too. Also, TokenStrings can't be a simple parse rule since the dot operator only applies to characters and not tokens.
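
To illustrate why identifier-delimited strings need a predicate, here is a minimal Python sketch of the check a lexer action would have to perform: the closing delimiter is whatever identifier followed the opening q", so it cannot be written as a fixed pattern. The function name, the restriction to "\n" line endings, and the rule that the terminator must sit at the start of a line are my assumptions about the behavior, not something taken from an existing grammar.

```python
import re

# Opening of an identifier-delimited string: q"IDENT followed by a newline.
OPENING = re.compile(r'q"([A-Za-z_]\w*)\n')


def scan_heredoc(source: str, start: int = 0):
    """Return (body, end_index) for an identifier-delimited string at `start`.

    Raises ValueError if there is no opening or no matching terminator.
    """
    opening = OPENING.match(source, start)
    if opening is None:
        raise ValueError("no identifier-delimited string here")
    delimiter = opening.group(1)
    body_start = opening.end()
    # The terminator is the same identifier at the start of a line,
    # immediately followed by the closing double quote.
    terminator = re.compile("^" + re.escape(delimiter) + '"', re.MULTILINE)
    closing = terminator.search(source, body_start)
    if closing is None:
        raise ValueError(f"unterminated string, expected {delimiter!r}")
    return source[body_start:closing.start()], closing.end()
```

The pattern for the terminator is built at scan time from the captured identifier, which is the part a predicate-free ANTLR lexer rule has no way to express.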
Jun 20 2014
"Brian Schott" <briancschott gmail.com> writes:
On Friday, 20 June 2014 at 06:57:31 UTC, Wesley Hamilton wrote:
 I've started making a D grammar for ANTLR4, but I didn't want 
 to spend days testing and debugging it later if somebody 
 already has one.

 The best search results turn up posts that are 10 years old. 
 Only one post has a link to a grammar file and the page seems 
 to have been removed. I also assume it would be obsolete with 
 changes to ANTLR and D.
 http://www.digitalmars.com/d/archives/digitalmars/D/25302.html
 http://www.digitalmars.com/d/archives/digitalmars/D/4953.html
https://github.com/Hackerpilot/DGrammar/blob/master/D.g4

It works around a few problems in ANTLR by combining a bunch of rules that should be separate into the unaryExpression rule, but it does build and produce a parse tree now. (I have no idea if the parse trees are always correct.)
Jun 20 2014