digitalmars.D - Lexer related questions

"Casper Ellingsen" <no reply.com> writes:

Hey,

I'm using JFlex (http://jflex.de/) to implement a lexical analyser for the
D language. I've already got quite a lot done, but there are some issues here
and there that I need to work on. Also, there are a couple of things I need
feedback on.

For example, I can't understand why several consecutive underscores are
allowed in a decimal integer literal. The grammar says

Decimal:
	0
	NonZeroDigit
	NonZeroDigit Decimal
	NonZeroDigit _ Decimal

which means that 0, 1, 12, 1_2 and 1_2_3 are allowed, but in my opinion,
1__2__3 is not. The DMD compiler, however, accepts that value as
123.
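
The two behaviours described here can be put side by side in a small Python sketch. It is only an illustration: the names STRICT_DECIMAL, LENIENT_DECIMAL and decimal_value are made up, the strict pattern is a rough reading of the posted BNF (single underscores between digits), and the lenient pattern is an assumption about what DMD accepts, not its actual lexer code.

```python
import re

# Rough reading of the posted BNF: at most one '_' between two digits.
STRICT_DECIMAL = re.compile(r"0|[1-9](?:_?[0-9])*")

# Assumed lenient form: '_' may appear anywhere after the first digit.
LENIENT_DECIMAL = re.compile(r"0|[1-9][0-9_]*")


def decimal_value(text):
    """Return the value of a decimal literal with '_' ignored, or None."""
    if LENIENT_DECIMAL.fullmatch(text) is None:
        return None
    return int(text.replace("_", ""))
```

With these definitions, 1__2__3 is rejected by the strict pattern but evaluates to 123 through decimal_value, which matches the DMD behaviour reported above.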

Also, the specification (http://www.digitalmars.com/d/lex.html) seems to  
lack information on some parts of the grammar. For example, it says

Float:
	DecimalFloat
	HexFloat
	Float _

but it doesn't describe the grammar of either DecimalFloat or HexFloat.

I'll post more questions once I find other issues.
-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

Jan 13 2006

Sean Kelly <sean f4.ca> writes:

Casper Ellingsen wrote:
 Hey,
 
 I'm using JFlex (http://jflex.de/) to implement a lexical analyser for
 the D language. I've already got quite a lot done, but there are some
 issues here and there that I need to work on. Also, there are a couple
 of things I need feedback on.
 
 For example, I can't understand why several consecutive underscores are
 allowed in a decimal integer literal. The grammar says
 
 Decimal:
     0
     NonZeroDigit
     NonZeroDigit Decimal
     NonZeroDigit _ Decimal
 
 which means that 0, 1, 12, 1_2 and 1_2_3 are allowed, but in my opinion,
 1__2__3 is not. The DMD compiler, however, accepts that value as
 123.

The D regexp and BNF information is woefully inaccurate in places, 
largely because Walter wrote DMD entirely by hand.  You're best off 
verifying it against the written documentation:

http://digitalmars.com/d/lex.html#integerliteral

"Integers can have embedded '_' characters, which are ignored."

 Also, the specification (http://www.digitalmars.com/d/lex.html) seems to 
 lack information on some parts of the grammar. For example, it says
 
 Float:
     DecimalFloat
     HexFloat
     Float _
 
 but it doesn't describe the grammar of DecimalFloat nor HexFloat.

Same thing here.  Check this link:

http://digitalmars.com/d/lex.html#floatliteral

Though I suspect that aside from the embedded underscores, the syntax is 
identical to what it is in C/C++.  Here's the pertinent bit of the C++ 
standard:

floating-literal:
	fractional-constant exponent-part(opt) floating-suffix(opt)
	digit-sequence exponent-part floating-suffix(opt)
fractional-constant:
	digit-sequence(opt) . digit-sequence
	digit-sequence .
exponent-part:
	e sign(opt) digit-sequence
	E sign(opt) digit-sequence
sign: one of
	+ -
digit-sequence:
	digit
	digit-sequence digit
floating-suffix: one of
	f l F L
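
For a regex-based lexer, the C++ rules quoted above translate fairly directly into one pattern. The following Python sketch assumes nothing beyond those rules; the names FLOATING_LITERAL and is_floating_literal are invented for the example, and D's embedded underscores are deliberately left out.

```python
import re

DIGITS = r"[0-9]+"
# exponent-part: e or E, optional sign, digit-sequence
EXPONENT = rf"[eE][+-]?{DIGITS}"
# fractional-constant: digit-sequence(opt) . digit-sequence | digit-sequence .
FRACTIONAL = rf"(?:[0-9]*\.{DIGITS}|{DIGITS}\.)"
# floating-suffix: one of f l F L
SUFFIX = r"[flFL]"

FLOATING_LITERAL = re.compile(
    rf"(?:{FRACTIONAL}(?:{EXPONENT})?|{DIGITS}{EXPONENT}){SUFFIX}?"
)


def is_floating_literal(text):
    """True if the whole string is a floating-literal under the rules above."""
    return FLOATING_LITERAL.fullmatch(text) is not None
```

Note that a bare digit sequence such as 42 is not a floating literal under these rules; it needs a decimal point or an exponent.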

Jan 13 2006

"Casper Ellingsen" <no reply.com> writes:

On Fri, 13 Jan 2006 23:02:31 +0100, Sean Kelly <sean f4.ca> wrote:

 Though I suspect that aside from the embedded underscores, the syntax is  
 identical to what it is in C/C++.  Here's the pertinent bit of the C++  
 standard:

 floating-literal:
 	fractional-constant exponent-part(opt) floating-suffix(opt)
 	digit-sequence exponent-part floating-suffix(opt)
 fractional-constant:
 	digit-sequence(opt) . digit-sequence
 	digit-sequence .
 exponent-part:
 	e sign(opt) digit-sequence
 	E sign(opt) digit-sequence
 sign: one of
 	+ -
 digit-sequence:
 	digit
 	digit-sequence digit
 floating-suffix: one of
 	f l F L

Thanks. As far as I can tell, this syntax is the same as D's, except for
the floating-suffix, which has no imaginary form in C/C++. That's an easy
fix though. I have already added it to the JFlex file, and it seems to work
perfectly. Now I'll move on to hex floats.

Jan 13 2006

"Casper Ellingsen" <no reply.com> writes:

On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> wrote:

This is more of a parser-related question, but here goes: what
visibility will the following function have, and why is it even legal to
combine more than one visibility keyword like that? I mean, is
it anything but confusing?

public package private foo(int i) {
	writefln(i);
}

Also, how accurate is the BNF in  
http://www.digitalmars.com/d/declaration.html?

Jan 14 2006

Sean Kelly <sean f4.ca> writes:

Casper Ellingsen wrote:
 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> wrote:
 
 This is more a parser related question, but still, here goes: What 
 visibility will the following function have, and why is it even legal to 
 use more than one visibility keyword in combination like that? I mean, 
 is it anything but confusing?
 
 public package private foo(int i) {
     writefln(i);
 }

I'd guess it would be private, and equivalent to the following:

public:
package:
private:
     void foo(int i);

 Also, how accurate is the BNF in 
 http://www.digitalmars.com/d/declaration.html?

It looks pretty close, at a glance.  But perhaps someone who's spent 
more time with the D parser could offer a more informed opinion.


Sean

Jan 14 2006

"Casper Ellingsen" <no reply.com> writes:

On Sun, 15 Jan 2006 06:12:13 +0100, Sean Kelly <sean f4.ca> wrote:

 Casper Ellingsen wrote:
 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com>  
 wrote:
  This is more a parser related question, but still, here goes: What  
 visibility will the following function have, and why is it even legal  
 to use more than one visibility keyword in combination like that? I  
 mean, is it anything but confusing?
  public package private foo(int i) {
     writefln(i);
 }

 I'd guess it would be private, and equivalent to the following:

 public:
 package:
 private:
      void foo(int i);

Yes, that could make sense. I haven't had the time to confirm this yet  
though.

 Also, how accurate is the BNF in  
 http://www.digitalmars.com/d/declaration.html?

 It looks pretty close, at a glance.  But perhaps someone who's spent  
 more time with the D parser could offer a more informed opinion.

Some of it looks correct, but other parts confuse me. Like the '()  
Declarator' part of the Declarator rule. Can someone please provide me  
with an example of usage of this rule? Also, isn't the last declarator  
rule redundant?

Declarator:
         BasicType2 Declarator
         Identifier
         () Declarator
         Identifier DeclaratorSuffixes
         () Declarator  DeclaratorSuffixes

Jan 14 2006

Hasan Aljudy <hasan.aljudy gmail.com> writes:

Casper Ellingsen wrote:
 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> wrote:
 
 Also, how accurate is the BNF in  
 http://www.digitalmars.com/d/declaration.html?

I don't really know.
I'm toying with making a parser. I couldn't use the grammar exactly as
it was given there; it was too confusing.
I tried to come up with my own description of the grammar. It's not
complete, mind you. I introduced some new rules to resolve some
ambiguities (actually, to work around them).

It's very experimental (and incomplete) at the moment. Use with care (if
you use it at all).
Note that I didn't include any keywords (i.e. int, float, etc.) in
Type, because I don't lex them as keywords, but as Identifiers.

I'm not even sure how accurate it is, but here it is anyway:
	
Declaration:
	Type Declarator ;
	Type Declarator , DeclIdentifierList ;
	Type Declarator Parameters ;
	Type Declarator Parameters FunctionBody
		
Type:
	IdentifierSequence
	IdentifierSequence TypeSuffixes
	
TypeSuffixes:
	TypeSuffix
	TypeSuffix TypeSuffixes
	
TypeSuffix:
	Pointer
	Array
	FunctionPointer
	Delegate	

Pointer:
	*
	
Array:
	[]
	[ ExprType ]
	
ExprType:	
	AssignExpression
	AssignExpression TypeSuffixes
	
FunctionPointer:
	function Parameters

Delegate:
	delegate Parameters
	
Declarator:
	Identifier
	Declarator CTypeSuffixes
	Declarator = Initializer
	( Declarator )
	( TypeSuffixes Declarator )
	
CTypeSuffixes:
	Array
	Array CTypeSuffixes	
	
DeclIdentifierList:
	DeclIdentifier
	DeclIdentifier, DeclIdentifierList
	
DeclIdentifier:
	Identifier
	Identifier = Initializer
	
IdentifierSequence:
	IdentifierList
	.IdentifierList
	IdentifierSequence ! TemplateArguments
		
IdentifierList:
	Identifier
	Identifier.IdentifierList
	
TemplateArguments:
	( TemplateArgumentList )

TemplateArgumentList:
	TemplateArgument
	TemplateArgument, TemplateArgumentList
	
TemplateArgument:
	ExprType
	
Initializer:
	void
	AssignExpression
	ArrayInitializer
	StructInitializer
		
ArrayInitializer:
	[ ArrayMemberInitializations ]
	[ ]		

ArrayMemberInitializations:
	ArrayMemberInitialization
	ArrayMemberInitialization ,
	ArrayMemberInitialization , ArrayMemberInitializations

ArrayMemberInitialization:
	AssignExpression
	AssignExpression : AssignExpression
	
StructInitializer:
	{  }
	{ StructMemberInitializers }

StructMemberInitializers:
	StructMemberInitializer
	StructMemberInitializer ,
	StructMemberInitializer , StructMemberInitializers

StructMemberInitializer:
	AssignExpression
	Identifier : AssignExpression	
		
Parameters:
	( )
	( ParameterList )
	
ParameterList:
	Parameter
	Parameter, ParameterList
	
Parameter:
	Type
	Type Declarator
	Type Declarator = Initializer
	InOut Parameter
	
InOut:
	in
	out
	inout
	
FunctionBody:
	StatementBlock
	FunctionContracts body StatementBlock

FunctionContracts:
	InContract
	OutContract
	InContract OutContract
	OutContract InContract

InContract:
	in StatementBlock

OutContract:
	out StatementBlock
	out ( Identifier ) StatementBlock
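
Several rules above share one shape: an item, optionally followed by a comma and more items, with a single trailing comma allowed (ArrayMemberInitializations, StructMemberInitializers). In a hand-written parser that right recursion usually becomes a loop. A minimal Python sketch, assuming the input is already a flat list of token strings with no nested brackets; the function name split_initializer_list is made up for the example.

```python
def split_initializer_list(tokens):
    """Split a flat token list on ',' into elements.

    Mirrors the shape  Item | Item , | Item , Items :
    at least one element, and one trailing comma is tolerated.
    """
    items, current = [], []
    for tok in tokens:
        if tok == ",":
            if not current:
                # Two commas in a row, or a leading comma.
                raise ValueError("empty list element")
            items.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        items.append(current)
    if not items:
        raise ValueError("at least one element is required")
    return items
```

The empty forms [ ] and { } are not handled here because the grammar gives them their own alternatives in ArrayInitializer and StructInitializer.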

Jan 15 2006

"Casper Ellingsen" <no reply.com> writes:

On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> wrote:

A version condition is defined in  
http://www.digitalmars.com/d/version.html as

	VersionCondition:
		version () Integer
		version () Identifier

One valid version condition is

	version(X86)

so why aren't the BNF rules defined as

	VersionCondition:
		version ( Integer )
		version ( Identifier )

instead? It just seems odd to me, and really confused me for a while.

Jan 15 2006

Don Clugston <dac nospam.com.au> writes:

Casper Ellingsen wrote:
 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> wrote:
 
 A version condition is defined in  
 http://www.digitalmars.com/d/version.html as
 
     VersionCondition:
         version () Integer
         version () Identifier
 
 One valid version condition is
 
     version(X86)
 
 so why aren't the BNF rules defined as
 
     VersionCondition:
         version ( Integer )
         version ( Identifier )
 
 instead? It just seems odd to me, and really confused me for a while.

The parentheses are in the wrong place all through the docs. I think 
it's a ddoc problem (the docs weren't updated properly when they were 
converted to Ddoc).
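
Reading the rule with the parentheses in the intended place, a recognizer could look like the Python sketch below. It is an illustration under assumptions: Integer is simplified to a run of decimal digits, Identifier to the usual letter/underscore/digit form, and the names VERSION_CONDITION and parse_version_condition are invented here.

```python
import re

# version ( Integer ) | version ( Identifier ), whitespace allowed between tokens.
VERSION_CONDITION = re.compile(
    r"version\s*\(\s*"
    r"(?:(?P<integer>[0-9]+)|(?P<identifier>[A-Za-z_][A-Za-z0-9_]*))"
    r"\s*\)"
)


def parse_version_condition(text):
    """Return ('integer', n) or ('identifier', name), or None if no match."""
    m = VERSION_CONDITION.fullmatch(text.strip())
    if m is None:
        return None
    if m.group("integer") is not None:
        return ("integer", int(m.group("integer")))
    return ("identifier", m.group("identifier"))
```

Under this reading version(X86) parses as an identifier condition, while the literal form from the docs, version () X86, is rejected.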

Jan 15 2006

Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:

Don Clugston wrote:
 Casper Ellingsen wrote:
 
 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> 
 wrote:

 A version condition is defined in  
 http://www.digitalmars.com/d/version.html as

     VersionCondition:
         version () Integer
         version () Identifier

 One valid version condition is

     version(X86)

 so why aren't the BNF rules defined as

     VersionCondition:
         version ( Integer )
         version ( Identifier )

 instead? It just seems odd to me, and really confused me for a while.

 
 
 The parentheses are in the wrong place all through the docs. 

Indeed. I've wondered if that was wrong, or if it was just a different 
kind of notation for the grammar, that I was unfamiliar with, since I'm 
no expert in this subject.

 I think 
 it's a ddoc problem (the docs weren't updated properly when they were 
 converted to Ddoc).

Hum...
What does the grammar doc have to do with ddoc?


-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to 
be... unnatural."

Jan 18 2006

Don Clugston <dac nospam.com.au> writes:

Bruno Medeiros wrote:
 Don Clugston wrote:
 
 Casper Ellingsen wrote:

 On Fri, 13 Jan 2006 22:38:18 +0100, Casper Ellingsen <no reply.com> 
 wrote:

 A version condition is defined in  
 http://www.digitalmars.com/d/version.html as

     VersionCondition:
         version () Integer
         version () Identifier

 One valid version condition is

     version(X86)

 so why aren't the BNF rules defined as

     VersionCondition:
         version ( Integer )
         version ( Identifier )

 instead? It just seems odd to me, and really confused me for a while.



 The parentheses are in the wrong place all through the docs. 

 
 Indeed. I've wondered if that was wrong, or if it was just a different 
 kind of notation for the grammar, that I was unfamiliar with, since I'm 
 no expert in this subject.
 
 I think it's a ddoc problem (the docs weren't updated properly when 
 they were converted to Ddoc).

 
 
 Hum...
 What does the grammar doc have to do with ddoc?

Nothing, except that they are no longer written in HTML. They are .ddoc
files which are converted into HTML (so that they get proper D code
colouring, etc). Funny things happened to the ampersands (in ddoc you
can write &, in HTML it must be &amp;), and apparently to the parentheses, too.
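
The HTML half of that remark can be illustrated with Python's standard html.escape. The helper name to_html_text is made up for the example, and the sketch says nothing about how the Ddoc converter itself works.

```python
import html


def to_html_text(raw):
    """Escape &, < and > for use in HTML text; quote characters are left alone."""
    return html.escape(raw, quote=False)
```

Note that HTML escaping of this kind leaves parentheses untouched, so the misplaced parentheses must have come from some other step of the conversion.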

Jan 18 2006

"Casper Ellingsen" <no reply.com> writes:

There are two conflicting definitions of postfix expressions in
http://www.digitalmars.com/d/expression.html. In the BNF at the top a
postfix expression is defined as

	PostfixExpression:
		PrimaryExpression
		PostfixExpression . Identifier
		PostfixExpression ++
		PostfixExpression --
		PostfixExpression ( )
		PostfixExpression ( ArgumentList )
		IndexExpression
		SliceExpression

	IndexExpression:
		PostfixExpression [ ArgumentList ]

	SliceExpression:
		PostfixExpression [ ]
		PostfixExpression [ AssignExpression .. AssignExpression ]

On the other hand, in the textual description further down, a postfix  
expression is defined as

	PostfixExpression:
		PostfixExpression . Identifier
		PostfixExpression -> Identifier
		PostfixExpression ++
		PostfixExpression --
		PostfixExpression ( ArgumentList )
		PostfixExpression [ ArgumentList ]
		PostfixExpression [ AssignExpression .. AssignExpression ]

The first one has

		PostfixExpression ( )
		PostfixExpression [ ]

which the second one doesn't have, whereas the second one has

		PostfixExpression -> Identifier

which the first one doesn't have. What's the correct definition? Oh, if  
only the BNF grammar was correct. :/

Jan 16 2006

Hasan Aljudy <hasan.aljudy gmail.com> writes:

Casper Ellingsen wrote:
 There are two conflicting definitions of postfix expressions in
 http://www.digitalmars.com/d/expression.html. In the BNF at the top a
 postfix expression is defined as
 
     PostfixExpression:
         PrimaryExpression
         PostfixExpression . Identifier
         PostfixExpression ++
         PostfixExpression --
         PostfixExpression ( )
         PostfixExpression ( ArgumentList )
         IndexExpression
         SliceExpression
 
     IndexExpression:
         PostfixExpression [ ArgumentList ]
 
     SliceExpression:
         PostfixExpression [ ]
         PostfixExpression [ AssignExpression .. AssignExpression ]
 
 On the other hand, in the textual description further down, a postfix  
 expression is defined as
 
     PostfixExpression:
         PostfixExpression . Identifier
         PostfixExpression -> Identifier
         PostfixExpression ++
         PostfixExpression --
         PostfixExpression ( ArgumentList )
         PostfixExpression [ ArgumentList ]
         PostfixExpression [ AssignExpression .. AssignExpression ]
 
 The first one has
 
         PostfixExpression ( )
         PostfixExpression [ ]
 
 which the second one doesn't have, whereas the second one has
 
         PostfixExpression -> Identifier
 
 which the first one doesn't have. What's the correct definition? Oh, if  
 only the BNF grammar was correct. :/

Obviously the second one is obsolete.
D doesn't have the -> operator, though it seems it had one in the past.
Also, [] on an expression is the slice operator, which (I think) means
the same as [0..$]
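
To make the first definition concrete, here is a rough Python sketch that lists the postfix operations applied to a leading identifier. It is heavily simplified and everything in it is an assumption for illustration: the names tokenize and postfix_ops are invented, the primary expression must be a plain identifier, bracket contents are not validated, and nested brackets are not supported.

```python
import re

# Tokens: ++, --, .., identifiers, integers, and single-character punctuation.
TOKEN = re.compile(r"\s*(\+\+|--|\.\.|[A-Za-z_]\w*|[0-9]+|[.()\[\],$])")
IDENT = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def postfix_ops(text):
    """List the postfix operations that follow a leading identifier."""
    toks = tokenize(text)
    if not toks or not IDENT.fullmatch(toks[0]):
        raise ValueError("expected an identifier as the primary expression")
    ops, i = [], 1
    while i < len(toks):
        tok = toks[i]
        if tok in ("++", "--"):
            ops.append(tok)
            i += 1
        elif tok == "." and i + 1 < len(toks) and IDENT.fullmatch(toks[i + 1]):
            ops.append("member")
            i += 2
        elif tok in ("(", "["):
            close = ")" if tok == "(" else "]"
            if close not in toks[i:]:
                raise ValueError("missing " + close)
            j = toks.index(close, i)
            inner = toks[i + 1:j]
            if tok == "(":
                ops.append("call")       # ( ) and ( ArgumentList )
            elif not inner:
                ops.append("slice-all")  # [ ]
            elif ".." in inner:
                ops.append("slice")      # [ a .. b ]
            else:
                ops.append("index")      # [ ArgumentList ]
            i = j + 1
        else:
            raise ValueError("unexpected token " + tok)
    return ops
```

Because the tokenizer has no -> token, input such as a -> b is rejected, in line with the first definition.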

Jan 16 2006
