digitalmars.D - [SAOC] "D backend for Bison" thread

reply Adela Vais <adela.vais99 gmail.com> writes:
Hello all!

My name is Adela Vais, and I am a 4th-year student at the 
"Politehnica" University of Bucharest, where I study Computer 
Engineering.
I love learning new programming languages and during the Formal 
Languages and Automata course I took at university I became 
interested in LR theory.
Eduard Stăniloiu and Răzvan Nițu introduced me to D at a 
workshop held during the “Ideas and Projects Workshop” 
Summer School 2019, and I found the language interesting because 
of its expressiveness, its speed, and its memory safety features.
This is why I decided to do an internship at my university this 
summer, learning and contributing to Dlang and GNU 
Bison. I consider SAoC a great way to get involved in the D 
community.

The project I will pursue during SAoC aims to complete the D 
backend for Bison, by creating the GLR Parser for the D language.
Currently, Bison supports only the LALR1 Parser for D. While 
simple and fast, this parser has its limitations: it cannot 
handle non-deterministic or ambiguous grammars. A GLR Parser 
would be able to handle such a grammar, without the constraint of 
only one token lookahead.
Combining my love for D, for learning programming languages, and 
for LR theory, this project is a tremendous learning opportunity 
for me.
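
As a small illustration of the ambiguity mentioned above, consider the textbook grammar E -> E '+' E | 'n'. Bison reports a shift/reduce conflict for it in deterministic mode, because an input such as n + n + n has more than one parse tree. The sketch below is not a GLR parser and not part of Bison; it is a plain dynamic-programming count (the function name count_parses is made up for this example) that shows how many trees a generalized parser would have to consider for a given input.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count distinct parse trees of `tokens` under E -> E '+' E | 'n'.

    Illustrative only: this counts trees by trying every split point,
    it does not implement the GLR algorithm.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "n" else 0
        total = 0
        # Try every '+' as the top-level operator of E -> E '+' E.
        for k in range(i + 1, j - 1):
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0
```

For example, count_parses("n + n + n".split()) returns 2, corresponding to (n + n) + n and n + (n + n). A deterministic parser has to commit to one of these when the tables are generated, while a GLR parser can keep both alive and let the grammar author decide how to resolve or merge them.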

What I intend to do during SAoC, for each milestone, is:

Milestone 1 - Understand the already existing code

- Analyze the LALR1 D language parser and the C and C++ GLR 
parsers.
- Understand M4 (the language I will partly write the parser in) 
by working in the GNU Bison repository on smaller tasks 
(adding functionality such as lookahead correction, and starting 
work on a push parser, which will further my understanding of 
the LALR1 D parser).
- Create (at least) 4 small programs that will help me 
understand the differences between the LALR1 and GLR parsers 
in C and C++, using ambiguous and non-ambiguous grammars.

Milestone 2 - Write the GLR support for the D language

- Create the glr.d file. Similar to the C++ code, this parser 
will be a class that wraps around the C GLR parser (for easier 
maintenance).
- Provide the same interface as the LALR1 parser. The 
user should not notice any difference on their end. Add 
support so that the user can provide input from both stdin 
and files, and create the Lexer interface that allows the user to 
create a class implementing a lexer method, an error reporting 
method, and location tracking.
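
The shape of such a Lexer interface can be sketched in a few lines. The example below is only an illustration of the three duties named above (a lexer method, an error reporting method, and location tracking). It is written in Python rather than D, and every name in it (Location, Lexer, StringLexer, the token kinds) is an assumption made for this sketch, apart from yylex and yyerror, which follow the usual Bison naming. The real interface generated by Bison may look different.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

DIGITS = "0123456789"


@dataclass(frozen=True)
class Location:
    line: int
    column: int


class Lexer(ABC):
    """Hypothetical interface a user-written lexer would implement."""

    @abstractmethod
    def yylex(self):
        """Return the next (kind, value, location) triple."""

    @abstractmethod
    def yyerror(self, location, message):
        """Report an error found at `location`."""


class StringLexer(Lexer):
    """Toy lexer over a string: integers, '+', and whitespace."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.line = 1
        self.column = 1
        self.errors = []

    def _advance(self):
        # Consume one character and keep line/column up to date.
        ch = self.text[self.pos]
        self.pos += 1
        if ch == "\n":
            self.line += 1
            self.column = 1
        else:
            self.column += 1
        return ch

    def yylex(self):
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self._advance()
        start = Location(self.line, self.column)
        if self.pos == len(self.text):
            return ("EOF", None, start)
        ch = self._advance()
        if ch in DIGITS:
            digits = ch
            while self.pos < len(self.text) and self.text[self.pos] in DIGITS:
                digits += self._advance()
            return ("NUM", int(digits), start)
        if ch == "+":
            return ("PLUS", None, start)
        # Report the bad character, skip it, and continue lexing.
        self.yyerror(start, f"unexpected character {ch!r}")
        return self.yylex()

    def yyerror(self, location, message):
        self.errors.append(f"{location.line}.{location.column}: {message}")
```

A parser would call yylex() repeatedly until it receives the EOF kind, and would call yyerror() itself when it meets a syntax error. Reading from stdin or from a file then only changes how the text handed to the lexer is obtained.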

Milestone 3 - Write the GLR support for the D language

- Continue working on the interface. Add the declarations 
currently supported by the LALR1 parser: %error-verbose, 
%parse-param, %union, %code, %locations, %initial-action.
- Merge the glr.d and lalr1.d files. The two files will likely 
end up duplicating a lot of code, so a merge will be needed.

Milestone 4 - Test and write documentation

- Pass all unit tests, fix any bug that appears.
- If time allows, I plan to integrate this parser into a 
project made by a third party, to further test its correctness.
- Write the documentation.

I will post weekly (or biweekly) updates with my progress.

I already made some changes to the existing D language support 
and I will continue to do so beyond SAoC, too. I intend to pursue 
this project for my undergraduate thesis topic and continue as an 
ambassador on behalf of the D community after that.


Thanks!
Adela
Sep 09 2020
parent reply Jacob Carlborg <doob me.com> writes:
On 2020-09-09 21:59, Adela Vais wrote:

 The project I will pursue during SAoC aims to complete the D backend for 
 Bison, by creating the GLR Parser for the D language.

I don't know much about Bison, but isn't Bison a parser generator that takes a language grammar as input and outputs a parser implemented in some language? Bison seems to currently support C, C++ and Java for the parser implementation. Is the goal to add D to that list?

 - Pass all unit tests, fix any bug that appears.

Why is this a separate task? In my opinion all unit tests should be written at the same time as the implementation, or before, if you're doing test-driven development. And the implementation is not done until the tests are done and all pass. The tests are part of the implementation.

-- 
/Jacob Carlborg
Sep 10 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Sep 10, 2020 at 05:04:42PM +0200, Jacob Carlborg via Digitalmars-d
wrote:
[...]
 Bison seems to currently support C, C++ and Java for the parser
 implementation. Is the goal to add D to that list?

Yes.

T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and over.
Sep 10 2020
prev sibling parent Adela Vais <adela.vais99 gmail.com> writes:
On Thursday, 10 September 2020 at 15:04:42 UTC, Jacob Carlborg 
wrote:
 - Pass all unit tests, fix any bug that appears.
 Why is this a separate task? In my opinion all unit tests should be written at the same time as the implementation, or before, if you're doing test-driven development. And the implementation is not done until the tests are done and all pass. The tests are part of the implementation.

I had in mind extensive, combine-all-the-features-you-possibly-can tests when I wrote that. Of course, I will also test along the way.
Sep 11 2020