digitalmars.D.learn - RegEx for a simple Lexer
- Tim Holzschuh via Digitalmars-d-learn (34/34) May 13 2014 Hi there,
- anonymous (20/24) May 13 2014 That string literal is malformed. WYSIWYG strings (r"...") don't
- Ary Borenszweig (3/10) May 13 2014 I think he's confusing r"..." with a regular expression literal (I also
- Brian Schott (11/20) May 13 2014 You may find the following useful:
- Kagamin (4/6) May 14 2014 See http://forum.dlang.org/post/lbnheh$2ssm$1@digitalmars.com
Hi there,

I read a book that gives a (really basic) introduction to creating programming languages. The sample code is written in Ruby, but I want to rewrite the examples in D. However, the lexer uses Ruby's regex features to scan the code.

I'm not very familiar with D's regex system (nor with any other), so it would be very helpful to receive some tips on how to "translate" the Ruby regexes to D's implementation.

If in Ruby I have a string called src, I can just use this:

    src[/\A([A-Z]\w*)/, 1]

Would

    match( src, r"([A-Z]\w*)" );

essentially do the same? (I know I have to use .captures to retrieve the matched expression.)

If I also want to create a regex to filter string expressions a la " xyz ", how would I do this? At least

    match( src, r"^\" (.*) $\" " );

doesn't seem to work, and I couldn't find in the Library Reference how to change it.

Sorry if these questions seem dumb to you.

Ahh, I forgot one: in the book, a parser generator like Yacc is used to create a suitable parser. Is there an equivalent for D? Or if not: is it really that hard to create a parser that is able to parse something like this:

    // Example
    class Foo:
      def name:
        "name"
      def asdf:
        100

    foo = Foo.new
    print( foo.name )
    print( foo.asdf )

Thank you for helping,
Tim
May 13 2014
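A minimal sketch of one way to translate the Ruby expression from the question, assuming a Phobos version that provides `std.regex.matchFirst`. The helper name `leadingConstant` is made up for illustration. Note that Ruby's `\A` anchors the pattern at the start of the string, while the unanchored `r"([A-Z]\w*)"` would find a capitalized word anywhere in `src`; the sketch uses `^`, which matches only at the start of the input as long as the multiline flag is not set.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: returns the capitalized identifier at the very
// start of `src`, or null if `src` does not begin with one.
string leadingConstant(string src)
{
    // `^` pins the match to the start of the input (no "m" flag given).
    auto m = matchFirst(src, regex(`^([A-Z]\w*)`));
    // m[0] is the whole match, m[1] the first capture group.
    return m.empty ? null : m[1];
}

void main()
{
    writeln(leadingConstant("Foo.new"));               // Foo
    writeln(leadingConstant("foo = Foo.new") is null); // true
}
```

The backtick literal avoids any escaping questions; `r"^([A-Z]\w*)"` would work equally well here because the pattern contains no double quote.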
On Tuesday, 13 May 2014 at 19:53:17 UTC, Tim Holzschuh via Digitalmars-d-learn wrote:
> If I also want to create a RegEx to filter string-expressions a la " xyz ", how would I do this? At least match( src, r"^\" (.*) $\" " ); doesn't seem to work and I couldn't find in the Library Reference how to change it..

That string literal is malformed. WYSIWYG strings (r"...") don't know escape sequences. So, the string ends at the second quote, and the rest is syntactical garbage to the compiler.

    "^\" (.*) $\" "

would be a proper D string literal. You could also use the alternative WYSIWYG syntax:

    `^" (.*) $" `

That dollar sign looks off, though. It matches the end of the input. You probably want to put that at the end of the regex:

    "^\" (.*) \"$"

Meaning: The match has to start at the beginning of the input (^). Matches a quote, then a space, then anything (.*), then a space, then a quote. The match has to end at the end of the input ($).

Then again, when you're writing a tokenizer/parser, you usually don't require an expression to span the whole input, but just match as far as it goes. In that case, drop the dollar sign. And think about what happens when there are quotes in the payload.
May 13 2014
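To make the points above concrete, here is a small sketch, assuming `std.regex.matchFirst` is available. It shows that the escaped and the backtick spellings produce the same pattern text, and what happens to a greedy `.*` when the input contains more than one string. The `[^"]*` variant is one possible way to handle quotes in the payload, not something prescribed in the thread; the padding spaces are left out for brevity.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

void main()
{
    // Two valid spellings of the same pattern text: ^"(.*)"
    string escaped  = "^\"(.*)\"";  // double-quoted: \" is an escape sequence
    string backtick = `^"(.*)"`;    // WYSIWYG: a quote needs no escaping
    assert(escaped == backtick);

    string src = `"abc" + "def"`;

    // No `$`, so the match only has to start at the beginning of the input.
    // The greedy .* still runs on to the last quote it can find.
    auto greedy = matchFirst(src, regex(backtick));
    writeln(greedy[1]);   // abc" + "def

    // Excluding quotes from the payload stops at the first closing quote.
    auto strict = matchFirst(src, regex(`^"([^"]*)"`));
    writeln(strict[1]);   // abc
    writeln(strict.post); // the unconsumed rest:  + "def"
}
```

The `post` property is what a lexer would continue scanning from after taking the token.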
On 5/13/14, 5:43 PM, anonymous wrote:
> On Tuesday, 13 May 2014 at 19:53:17 UTC, Tim Holzschuh via Digitalmars-d-learn wrote:
>> If I also want to create a RegEx to filter string-expressions a la " xyz ", how would I do this? At least match( src, r"^\" (.*) $\" " ); doesn't seem to work and I couldn't find in the Library Reference how to change it..

I think he's confusing r"..." with a regular expression literal (I also confused them).
May 13 2014
On Tuesday, 13 May 2014 at 19:53:17 UTC, Tim Holzschuh via Digitalmars-d-learn wrote:
> Hi there, I read a book about an introduction to creating programming languages (really basic). The sample code is written in Ruby, but I want to rewrite the examples in D. However, the Lexer uses Ruby's regex features to scan the code. I'm not very familiar with D's RegEx system (nor with another..), so it would be very helpful to receive some tips on how to "translate" the ruby RegEx's to D's implementation.

You may find the following useful:
http://hackerpilot.github.io/experimental/std_lexer/phobos/lexer.html

The source of the lexer generator is located here:
https://github.com/Hackerpilot/Dscanner/blob/master/std/lexer.d

D lexer:
https://github.com/Hackerpilot/Dscanner/blob/master/std/d/lexer.d

There's also a parser and AST library for D in that same project. The lexer generator may not be as simple as what you're using right now, but it is very fast.
May 13 2014
On Tuesday, 13 May 2014 at 20:02:59 UTC, Tim Holzschuh via Digitalmars-d-learn wrote:
> Still: Would it be very difficult to write a suitable parser from scratch?

See http://forum.dlang.org/post/lbnheh$2ssm$1@digitalmars.com for a discussion about parsers on reddit.
May 14 2014
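For a sense of scale, the advice in this thread (anchor each pattern at the start, match as far as it goes, continue with the rest) can be sketched as a small regex-driven tokenizer. This is only an illustration under assumptions: the `Token` and `Rule` types, the rule names, and the patterns are invented here, not taken from the book or from the lexer generator linked above, and the sketch ignores the indentation that the example language appears to rely on.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical types for this sketch.
struct Token { string kind; string text; }
struct Rule  { string kind; Regex!char re; }

Token[] tokenize(string src)
{
    // Every pattern starts with ^ so it can only match at the current
    // position. Rules are tried in order; the first one that matches wins.
    auto rules = [
        Rule("space",      regex(`^\s+`)),
        Rule("string",     regex(`^"[^"]*"`)),
        Rule("number",     regex(`^[0-9]+`)),
        Rule("constant",   regex(`^[A-Z]\w*`)),
        Rule("identifier", regex(`^[a-z_]\w*`)),
        Rule("symbol",     regex(`^.`)),
    ];

    Token[] tokens;
    while (src.length > 0)
    {
        bool matched = false;
        foreach (rule; rules)
        {
            auto m = matchFirst(src, rule.re);
            if (m.empty)
                continue;
            if (rule.kind != "space")       // whitespace is skipped
                tokens ~= Token(rule.kind, m.hit);
            src = m.post;                   // continue after the match
            matched = true;
            break;
        }
        if (!matched)
            throw new Exception("unexpected input: " ~ src);
    }
    return tokens;
}

void main()
{
    foreach (t; tokenize(`foo = Foo.new`))
        writeln(t.kind, " ", t.text);
}
```

Re-running every regex against the remaining input is simple but not fast; a generated or hand-written lexer avoids that cost, which is presumably the trade-off mentioned earlier in the thread.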