D - D grammar (yacc format)
- Sammy (39/39) Jan 28 2004 I began to write a yacc style grammar for D.
- C (5/44) Jan 28 2004 Wow, this would be awesome! Think you'll end up using somethign besides
- Ben Hinkle (7/46) Jan 28 2004 Interestingly GCC's C++ frontend switched from using a yacc generated pa...
I began to write a yacc style grammar for D. I am using the parse.c from the dmd source. Each function is being translated to a piece of the grammar. Each piece is added to the source file using some tag ( /** Grammar ) Eg: /** Grammar * * %token "^" TOKxor * * XorExp : AndExp * | XorExp "^" AndExp * ; * **/ Expression *Parser::parseXorExp() { Expression *e; Expression *e2; Loc loc = this->loc; e = parseAndExp(); while (token.value == TOKxor) { nextToken(); e2 = parseAndExp(); e = new XorExp(loc, e, e2); } return e; } I dont know if the final grammar will be LALR, nor if it will be unabiguous. If everything is ok, i will use bison to process the grammar. if not, there are other parser generators who can handle anbiguous and/or non-lalr grammars ( eg: btyacc, accent, msta, D'Parser ) I would like to see the grammar maintained ( i can do that ) and distributed with DMD ( i cant do that :-) . I also have a lexer (made with flex), who needs to be updated. What do you think ?
Jan 28 2004
Wow, this would be awesome! Think you'll end up using somethign besides bison though , D'Parser looks really cool. C "Sammy" <none here.com> wrote in message news:bv941u$1e1e$1 digitaldaemon.com...I began to write a yacc style grammar for D. I am using the parse.c from the dmd source. Each function is being translated to a piece of the grammar. Each piece is added to the source file using some tag ( /** Grammar ) Eg: /** Grammar * * %token "^" TOKxor * * XorExp : AndExp * | XorExp "^" AndExp * ; * **/ Expression *Parser::parseXorExp() { Expression *e; Expression *e2; Loc loc = this->loc; e = parseAndExp(); while (token.value == TOKxor) { nextToken(); e2 = parseAndExp(); e = new XorExp(loc, e, e2); } return e; } I dont know if the final grammar will be LALR, nor if it will be unabiguous. If everything is ok, i will use bison to process the grammar. if not, there are other parser generators who can handle anbiguous and/or non-lalr grammars ( eg: btyacc, accent, msta, D'Parser ) I would like to see the grammar maintained ( i can do that ) and distributed with DMD ( i cant do that :-) . I also have a lexer (made with flex), who needs to be updated. What do you think ?
Jan 28 2004
Interestingly GCC's C++ frontend switched from using a yacc generated parser to a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the 3.4 hand-written parser is easier to understand. -Ben "Sammy" <none here.com> wrote in message news:bv941u$1e1e$1 digitaldaemon.com...I began to write a yacc style grammar for D. I am using the parse.c from the dmd source. Each function is being translated to a piece of the grammar. Each piece is added to the source file using some tag ( /** Grammar ) Eg: /** Grammar * * %token "^" TOKxor * * XorExp : AndExp * | XorExp "^" AndExp * ; * **/ Expression *Parser::parseXorExp() { Expression *e; Expression *e2; Loc loc = this->loc; e = parseAndExp(); while (token.value == TOKxor) { nextToken(); e2 = parseAndExp(); e = new XorExp(loc, e, e2); } return e; } I dont know if the final grammar will be LALR, nor if it will be unabiguous. If everything is ok, i will use bison to process the grammar. if not, there are other parser generators who can handle anbiguous and/or non-lalr grammars ( eg: btyacc, accent, msta, D'Parser ) I would like to see the grammar maintained ( i can do that ) and distributed with DMD ( i cant do that :-) . I also have a lexer (made with flex), who needs to be updated. What do you think ?
Jan 28 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:bv9etc$21ib$1 digitaldaemon.com...Interestingly GCC's C++ frontend switched from using a yacc generatedparserto a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the3.4hand-written parser is easier to understand. -BenI am not sure, and i dont remember where i read this, but I think its because the parser is faster, simpler and as you said, easier to understand ( and possibly to implement ). Implement the parser, any way you want. But if you want to have a clear picture of the language, i think you need a grammar. If you use EBNF, instead of pure BNF, you will end up with a grammar fitting on 2 or 3 pages instead of the 67 pages of the file parser.c So you are right, but i still find its still and interesting exercice. - S
Jan 28 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:bv9etc$21ib$1 digitaldaemon.com...Interestingly GCC's C++ frontend switched from using a yacc generatedparserto a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the3.4hand-written parser is easier to understand. -BenFound this short explaination on : http://gnu.essentkabel.com/software/gcc/gcc-3.4/changes.html GCC 3.4 Release Series. Changes, New Features, and Fixes C++ A hand-written recursive-descent C++ parser has replaced the YACC-derived C++ parser from previous GCC releases. The new parser contains much improved infrastructure needed for better parsing of C++ source codes, handling of extensions, and clean separation (where possible) between proper semantics analysis and parsing. The new parser fixes many bugs that were found in the old parser. - S
Jan 28 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:bv9etc$21ib$1 digitaldaemon.com...Interestingly GCC's C++ frontend switched from using a yacc generatedparserto a hand-written parser in GCC-3.4. I don't know all the reasons why but you could probably Google around to find out why. Personally I find the3.4hand-written parser is easier to understand. -BenSorry, to post again. All those explanation tells that its a very good choice to use a hand written parser... From: http://compilers.iecc.com/comparch/article/02-01-022 ---------------------------------------------------------------------- Re: C++ parsing : what's new ? From comp.compilers From: Zack Weinberg <zackw panix.com> Newsgroups: comp.compilers Date: 4 Jan 2002 00:09:35 -0500 Organization: PANIX -- Public Access Networks Corp. References: 01-12-179 Keywords: C++, parse Posted-Date: 04 Jan 2002 00:09:35 EST Peter H. Froehlich <pfroehli ics.uci.edu> writes:On Saturday, December 29, 2001, at 10:30 , Travis Nixon wrote:The motivations are correctness, performance, and better error recovery, in that order. The existing gcc grammar for C++ does not interpret all language constructs correctly; this has been judged unfixable while continuing to use yacc. There are already horrible kludges to get certain constructs mostly right (look at cp/spew.c, if you want to know) that become unnecessary with a recursive descent parser. A hand-rolled recursive descent parser has a lot more information accessible to it. This should permit it to make better backtracking decisions. Bad backtracking is the major cause of bad performance in the existing parser. It will certainly permit better error recovery. Note that this approach is known to work - the Edison Design Group (EDG)'s C/C++ front end uses a hand-written recursive descent parser, and it does all the above things better than g++. As someone who's worked on both front ends, I can also say that I found it _easier_ to work with the recursive descent parser. zw"Martin von Loewis" <loewis informatik.hu-berlin.de> wrote in messageAs I recall the motivation is performance. The people doing the rewrite expect the recursive descent parser to be way faster than the table-driven one they have now. However, I sure would *not* want to write a parser like that for a grammar as hairy as C/C++. :-)The gcc grammar is currently being rewritten, from a bison-based one to a hand-written recursive-descent parser.Is there a discussion online anywhere about the reasons for doing this?
Jan 28 2004