digitalmars.D - Lexical grammar rules need fixing

In a quick study of the _lexical_ grammar rules for D (stealing them for 
my own use :)), I came across a few ambiguities and small mistakes:

I refer you to http://digitalmars.com/d/lex.html:

1. Digit is not defined on the page (is it elsewhere?)
2. Character is not defined on the page (is it universal alpha or 
defined elsewhere?)
3. Comments use Characters, which is never defined and is not easy to 
define unambiguously, because of the parsing rules of the three types 
of comments themselves.
4. WysiwygCharacters is never defined, implied to be:

	WysiwygCharacter
	WysiwygCharacter WysiwygCharacters

5. DoubleQuotedCharacters is never defined, implied to be:

	DoubleQuotedCharacter
	DoubleQuotedCharacter DoubleQuotedCharacters

6. HexStringChars is never defined, implied to be:

	HexStringChar
	HexStringChar HexStringChars

7. HexStringChar includes WhiteSpace as acceptable input, while 
WhiteSpace includes Comment (via the Space declaration).
	So, in a hex-string I can write a comment?
	My suggestion would be to take Comment out of the Space declaration and 
list it separately, next to Space, in the declaration that describes 
regular program input.

8. Comments should be defined as grammar rules.  Is there some reason 
why this cannot be done trivially?

9. There is no starting rule for the grammar, except the statement:
"The source text consists of white space, end of lines, comments, 
special token sequences, tokens, all followed by end of file."  However, 
this is somewhat redundant, because comments are already counted as 
white space and end of lines are included in white space.  Simply 
rewrite this sentence as a grammar rule and leave comments in it, but 
take them out of the Space declaration.

10. HexDigit can be simplified by removing 0-9 and replacing it with 
DecimalDigits... Why is DecimalDigits plural?
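
To make point 7 concrete, here is a minimal Python sketch of the two 
readings of the hex-string rule.  It is not the DMD lexer: the function 
scan_hex_string and its comments_are_whitespace flag are names I made up 
for illustration, and only /* */ comments are handled.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
PLAIN_SPACE = set(" \t\v\f\r\n")  # whitespace and end of line, no comments


def scan_hex_string(src, comments_are_whitespace):
    """Return the hex digits of an x"..." literal, or raise ValueError.

    Hypothetical helper. comments_are_whitespace=True follows the literal
    reading of the page (WhiteSpace -> Space -> Comment); False follows
    the suggested fix, where Comment is not part of Space.
    """
    if not src.startswith('x"'):
        raise ValueError("not a hex string")
    digits = []
    i = 2
    while i < len(src):
        ch = src[i]
        if ch == '"':
            # Closing quote: the literal is complete.
            return "".join(digits)
        if ch in HEX_DIGITS:
            digits.append(ch)
            i += 1
        elif ch in PLAIN_SPACE:
            i += 1
        elif comments_are_whitespace and src.startswith("/*", i):
            # Skip the whole block comment as if it were white space.
            end = src.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated comment")
            i = end + 2
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    raise ValueError("unterminated hex string")
```

With comments_are_whitespace=True, the input x"00 /* hi */ FB" is 
accepted and yields the digits 00FB; with False, the same input is 
rejected at the slash, which is the behaviour I would expect.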

These are easy enough to fix, and I think it'd be worth it to do so. 
Any thoughts?

James Dunne
Apr 02 2006