digitalmars.D.announce - re2d lexer generator
- Ulya (23/23) Nov 25
Regular expression compiler [re2c](http://re2c.org) now [supports D](http://re2c.org/releases/release_notes.html#release-4-0). A short intro from the official website:

*re2c* stands for *Regular Expressions to Code*. It is a free and open-source lexer generator that supports C, C++, D, Go, Haskell, Java, JavaScript, OCaml, Python, Rust, V, Zig, and can be extended to other languages by implementing a single [syntax file](http://re2c.org/manual/manual_d.html#syntax-files). The primary focus of re2c is on generating *fast* code: it compiles regular expressions to deterministic finite automata and translates them into direct-coded lexers in the target language (such lexers are generally faster and easier to debug than their table-driven analogues). The secondary focus of re2c is on *flexibility*: it does not assume a fixed program template; instead, it allows the user to embed lexers anywhere in the source code and configure them to avoid unnecessary buffering and bounds checks.

The internal algorithm used by re2c is based on a special kind of deterministic finite automata: [lookahead TDFA](http://re2c.org/2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf). These automata are as fast as ordinary DFA, but they are also capable of performing submatch extraction with minimal overhead.

There is a [detailed user guide](http://re2c.org/manual/manual_d.html) and an [online playground](http://re2c.org/playground/?example=d/01_basic.re) with many examples.
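To make the "direct-coded versus table-driven" distinction concrete, here is a small hand-written sketch in D. It is not re2c output and does not use any re2c API; the function names `matchDirect` and `matchTable` and the example pattern `[1-9][0-9]*` are chosen purely for illustration. Both functions return the length of the match at the start of the input, or 0 if there is none.

```d
// Hand-written illustration only; NOT generated by re2c.
// Pattern: [1-9][0-9]* anchored at the start of the input.

// Direct-coded: each DFA state corresponds to a position in the code.
size_t matchDirect(string s) {
    size_t i = 0;
    // State 0: a nonzero digit is required.
    if (i >= s.length || s[i] < '1' || s[i] > '9') return 0;
    ++i;
    // State 1 (accepting): consume any further digits.
    while (i < s.length && s[i] >= '0' && s[i] <= '9') ++i;
    return i;
}

// Table-driven: states are data, interpreted by a generic loop.
size_t matchTable(string s) {
    enum Dead = -1;
    // Character classes: 0 = '0', 1 = '1'..'9', 2 = anything else.
    static immutable int[3][2] next = [
        [Dead, 1, Dead], // state 0
        [1, 1, Dead],    // state 1 (accepting)
    ];
    int state = 0;
    size_t i = 0;
    for (; i < s.length; ++i) {
        immutable c = s[i];
        immutable cls = c == '0' ? 0 : (c >= '1' && c <= '9') ? 1 : 2;
        immutable n = next[state][cls];
        if (n == Dead) break;
        state = n;
    }
    return state == 1 ? i : 0;
}

void main() {
    // Both implementations agree on every sample input.
    foreach (input; ["1234", "42abc", "0123", ""])
        assert(matchDirect(input) == matchTable(input));
}
```

In the direct-coded version the control flow itself encodes the automaton, so a debugger steps through states as ordinary code; the table-driven version pays for a table lookup on every character.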
Nov 25
On Monday, 25 November 2024 at 16:01:54 UTC, Ulya wrote:
> [...] a special kind of deterministic finite automata: [lookahead TDFA](http://re2c.org/2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf). These automata are as fast as ordinary DFA, but they are also capable of performing submatch extraction with minimal overhead. There is a [detailed user guide](http://re2c.org/manual/manual_d.html) and an [online playground](http://re2c.org/playground/?example=d/01_basic.re) with many examples.

Hi Ulya. I don't have an account on LOR, so I'm glad you wrote here :) Based on some examples from the playground, it seems re2c is inserting `#line` directives. I think they are not supported by D. I've checked, for example, 'reuse.re'.
Nov 25
On Monday, 25 November 2024 at 19:18:40 UTC, Sergey wrote:
> On Monday, 25 November 2024 at 16:01:54 UTC, Ulya wrote:
>> [...] a special kind of deterministic finite automata: [lookahead TDFA](http://re2c.org/2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf). These automata are as fast as ordinary DFA, but they are also capable of performing submatch extraction with minimal overhead. There is a [detailed user guide](http://re2c.org/manual/manual_d.html) and an [online playground](http://re2c.org/playground/?example=d/01_basic.re) with many examples.
>
> Hi Ulya. I don't have an account on LOR, so I'm glad you wrote here :) Based on some examples from the playground, it seems re2c is inserting `#line` directives. I think they are not supported by D. I've checked, for example, 'reuse.re'.

Hi Sergey :) I believe `#line` directives are supported, as described here: https://dlang.org/spec/lex.html#special-token-sequence. All examples are compiled with `dmd -g -wi` and tested to confirm that they produce the expected output: https://github.com/skvadrik/re2c/blob/master/examples/d/__run_all.sh#L26. It is possible to disable line directives for an individual file using `-i`, or to disable them globally with [this setting in the syntax file](https://github.com/skvadrik/re2c/blob/master/include/syntax/d#L31).
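A quick way to check the `#line` behavior locally is a tiny D program like the sketch below. The helper name `lineAfterDirective` and the file name `lexer.re` are hypothetical, picked only for this example; per the D lexical spec linked above, the directive sets the line number of the next source line.

```d
// Minimal sketch: checks that the compiler honors a #line directive.
// "lexer.re" is a made-up file name used only for illustration.
int lineAfterDirective() {
#line 100 "lexer.re"
    return __LINE__; // this source line is now numbered 100
}

void main() {
    assert(lineAfterDirective() == 100);
}
```

A generator can use this mechanism to make compiler diagnostics and debug info point back at the original `.re` source rather than at the generated D file.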
Nov 25
On Monday, 25 November 2024 at 21:33:35 UTC, Ulya wrote:
> Hi Sergey :) I believe `#line` directives are supported, as described here: https://dlang.org/spec/lex.html#special-token-sequence.

Oh cool. I didn't know that, and it is kinda unexpected for me :) Thanks!
Nov 25
On Monday, 25 November 2024 at 16:01:54 UTC, Ulya wrote:
> Regular expression compiler [re2c](http://re2c.org) now [supports D](http://re2c.org/releases/release_notes.html#release-4-0). [...]

BTW this is completely different from https://code.dlang.org/packages/re2d. The latter is a set of bindings to the re2 library, while re2c is an ahead-of-time regexp compiler (a port of a tool that has existed since 1993). The name clash is unfortunate.
Nov 25