digitalmars.D.learn - Should a parser type be a struct or class?
- Per Nordlöw (5/5) Jun 17 2020 Should a range-compliant aggregate type realizing a parser be
- Simen Kjærås (11/16) Jun 17 2020 The heuristic I use is 'do I need polymorphism?' If no, it's a
- Stanislav Blinov (59/64) Jun 17 2020 What's a range-compliant aggregate type? Ranges are typically
- H. S. Teoh (18/23) Jun 17 2020 Probably for historical reasons.
- Stefan Koch (6/11) Jun 17 2020 I would say a struct.
- Adam D. Ruppe (2/3) Jun 17 2020 why would a parser ever inherit from a lexer?
- H. S. Teoh (6/10) Jun 17 2020 Because, unlike a regular parser-driven compiler, dmd is a lexer-driven
- welkam (2/5) Jun 18 2020 So you can write nextToken() instead of lexer.nextToken()
- welkam (1/1) Jun 18 2020 Oh and also https://github.com/dlang/dmd/pull/9899
- user1234 (10/15) Jun 17 2020 You have the example of libdparse that shows that using a class
- Meta (11/16) Jun 18 2020 IMO it doesn't need to be. However, it's worth saying that range
Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?
Jun 17 2020
On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?

The heuristic I use is 'do I need polymorphism?' If no, it's a struct. Another thing that may be worth considering is reference semantics. The latter is easy to do with a struct, while polymorphism is generally a class-only thing (but check out Tardy, which Atila Neves recently posted in the Announce group).

I would say I basically never use classes in D - pointers and arrays give me all the reference semantics I need, and polymorphism I almost never need.

-- Simen
Jun 17 2020
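A minimal sketch of the point about getting reference semantics from a struct, assuming a hypothetical `ParserState` type and helper functions `step` and `stepCopy` that do not appear in the thread. It only illustrates that a pointer to a struct behaves like a reference, while passing the struct itself makes a copy.

```d
import std.stdio : writeln;

// Hypothetical parser state, used only for illustration.
struct ParserState
{
    size_t position;

    void advance() { ++position; }
}

// Takes the struct by pointer: the caller's instance is mutated.
void step(ParserState* state)
{
    state.advance();
}

// Takes the struct by value: only the local copy is mutated.
void stepCopy(ParserState state)
{
    state.advance();
}

void main()
{
    auto state = ParserState(0);

    stepCopy(state);
    assert(state.position == 0); // the original is untouched

    step(&state);
    assert(state.position == 1); // mutated through the pointer

    // `new` on a struct yields a pointer, so two variables can
    // refer to the same instance, much like class references.
    auto onHeap = new ParserState(10);
    auto other = onHeap;
    other.advance();
    assert(onHeap.position == 11);

    writeln(state.position, " ", onHeap.position);
}
```

What this does not give you is virtual dispatch, which is the case where the heuristic above says to reach for a class.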
On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?

What's a range-compliant aggregate type? Ranges are typically views of someone else's data; an owner of the data wouldn't store mutable iterators, and won't be a range. For that reason also, ranges are structs, as most of them are thin wrappers over a set of iterators with an interface to mutate them.

If you *really* need runtime polymorphism as provided by the language - use a class. Otherwise - use a struct. It's pretty straightforward. Even then, in some cases one can realize their own runtime polymorphism without classes (look at e.g. Atila Neves' 'tardy' library).

It's very easy to implement a lexer as an input range: it'd just be a pointer into a buffer plus some additional iteration data (like line/column position, for example). I.e. a struct. Making it a struct also allows making it a forward range, instead of an input range, which is useful if you need lookahead:

    struct TokenStream
    {
        this(SourceBuffer source)
        {
            this.cursor = source.text.ptr;
            advance(this); // lex the first token
        }

        bool empty() const
        {
            return token.type == TokenType.eof;
        }

        ref front() return scope const
        {
            return token;
        }

        void popFront()
        {
            switch (token.type)
            {
                default:
                    advance(this);
                    break;
                case TokenType.eof:
                    break;
                case TokenType.error:
                    // An error ends the stream: replace it with eof and an
                    // empty span located at the end of the error token.
                    token.type = TokenType.eof;
                    token.lexSpan = LexicalSpan(token.lexSpan.end, token.lexSpan.end);
                    break;
            }
        }

        // Copying the struct copies the cursor, not the source text.
        TokenStream save() const
        {
            return this;
        }

    private:

        const(char)* cursor;
        Location location;
        Token token;
    }

where `advance` is implemented as a module-private function that actually parses the source into the next token.

DMD's Lexer/Parser aren't ranges. They're ouroboroi.
Jun 17 2020
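The `TokenStream` above depends on types the post does not show (`SourceBuffer`, `Token`, `Location`, `advance`). As a self-contained sketch of the same idea, here is a hypothetical `WordStream` that treats whitespace-separated words as tokens. The type and its fields are assumptions made for illustration; only the range primitives and `isForwardRange` come from Phobos.

```d
import std.algorithm.comparison : equal;
import std.ascii : isWhite;
import std.range.primitives : isForwardRange;

// Hypothetical minimal token stream: splits text on whitespace.
struct WordStream
{
    this(string source)
    {
        rest = source;
        popFront(); // load the first token
    }

    @property bool empty() const { return done; }
    @property string front() const { return token; }

    void popFront()
    {
        // Skip leading whitespace.
        size_t i = 0;
        while (i < rest.length && isWhite(rest[i])) ++i;

        if (i == rest.length)
        {
            // Nothing left to lex.
            done = true;
            token = null;
            rest = null;
            return;
        }

        // Consume one word and keep the remainder as the cursor.
        size_t j = i;
        while (j < rest.length && !isWhite(rest[j])) ++j;
        token = rest[i .. j];
        rest = rest[j .. $];
    }

    // Copying the struct copies the slices, not the underlying text.
    @property WordStream save() const { return this; }

private:
    string rest;  // unlexed remainder of the source
    string token; // current token
    bool done;
}

static assert(isForwardRange!WordStream);

void main()
{
    auto tokens = WordStream("let x = 42");

    // Lookahead: advance a saved copy and leave the original in place.
    auto lookahead = tokens.save;
    lookahead.popFront();
    assert(lookahead.front == "x");
    assert(tokens.front == "let");

    assert(tokens.equal(["let", "x", "=", "42"]));
}
```

Because `save` is just a struct copy, lookahead costs a few words of memory; a class-based lexer would need an explicit clone to offer the same thing.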
On Wed, Jun 17, 2020 at 11:50:27AM +0000, Per Nordlöw via Digitalmars-d-learn wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class?

Preferably a struct IMO, but see below.

> In dmd `Lexer` and `Parser` are both classes.

Probably for historical reasons.

> In general how should I reason about whether an aggregate type should be encoded as a struct or class?

1) Does it need runtime polymorphism? If it does, use a class. If not, probably a struct.

2) Does it make more sense as a by-value type, or a by-reference type? In several of my projects, for example, I've had aggregate types start out as structs (because of (1)), but eventually rewritten as (final) classes because I started finding myself using `ref` or `&` everywhere to get by-reference semantics.

My rule-of-thumb is basically adopted from TDPL: a struct is a "glorified int" with by-value semantics, a class is a more traditional OO object. If my aggregate behaves like a glorified int, then a struct is a good choice. If it behaves more like a traditional OO encapsulated type, then a class is probably the right answer.

T

--
Many open minds should be closed for repairs. -- K5 user
Jun 17 2020
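A small sketch of the by-value versus by-reference distinction described above, assuming two hypothetical types that are not from the thread: `Position` as the "glorified int" and `SymbolTable` as a final class with reference semantics.

```d
// Hypothetical value type: copying it yields an independent value.
struct Position
{
    int line;
    int column;
}

// Hypothetical reference type: `final` rules out overriding, so the
// class is used here for its reference semantics, not polymorphism.
final class SymbolTable
{
    string[] names;

    void define(string name) { names ~= name; }
}

// Receives a copy; the caller's Position is not affected.
void moveRight(Position p)
{
    p.column += 1;
}

// Receives a reference; the caller's table is updated.
void addSymbol(SymbolTable table, string name)
{
    table.define(name);
}

void main()
{
    auto a = Position(1, 1);
    auto b = a;      // independent copy
    b.column = 5;
    moveRight(a);
    assert(a.column == 1);
    assert(b.column == 5);

    auto t = new SymbolTable;
    auto u = t;      // second reference to the same object
    addSymbol(u, "x");
    assert(t.names == ["x"]);
}
```

If `SymbolTable` were a struct, every function that updates it would need `ref` or a pointer parameter, which is the pattern the post describes as the signal to switch to a final class.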
On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?

I would say a struct. Parser in dmd does even inherit from Lexer. It seems to be a quirky design. Especially for multi-threaded parsing you might want to have more control over memory layout than classes usually give you.
Jun 17 2020
On Wednesday, 17 June 2020 at 14:24:01 UTC, Stefan Koch wrote:
> Parser in dmd does even inherit from Lexer.

why would a parser ever inherit from a lexer?
Jun 17 2020
On Wed, Jun 17, 2020 at 02:32:09PM +0000, Adam D. Ruppe via Digitalmars-d-learn wrote:
> On Wednesday, 17 June 2020 at 14:24:01 UTC, Stefan Koch wrote:
> > Parser in dmd does even inherit from Lexer.
>
> why would a parser ever inherit from a lexer?

Because, unlike a regular parser-driven compiler, dmd is a lexer-driven one. :-D

T

--
The volume of a pizza of thickness a and radius z can be described by the following formula: pi zz a. -- Wouter Verhelst
Jun 17 2020
On Wednesday, 17 June 2020 at 14:32:09 UTC, Adam D. Ruppe wrote:
> On Wednesday, 17 June 2020 at 14:24:01 UTC, Stefan Koch wrote:
> > Parser in dmd does even inherit from Lexer.
>
> why would a parser ever inherit from a lexer?

So you can write nextToken() instead of lexer.nextToken()
Jun 18 2020
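The same convenience is available without inheritance. The sketch below is an alternative that nobody in the thread proposed: a struct `Parser` that holds a `Lexer` and forwards to it with `alias this`. Both types and the `countTokens` helper are hypothetical stand-ins, not dmd's actual classes.

```d
// Hypothetical lexer over pre-split tokens, for brevity.
struct Lexer
{
    string[] tokens;
    size_t index;

    // Returns the next token, or null when the input is exhausted.
    string nextToken()
    {
        return index < tokens.length ? tokens[index++] : null;
    }
}

// Hypothetical parser that contains a lexer instead of inheriting from it.
struct Parser
{
    Lexer lexer;
    alias lexer this; // unresolved members are looked up on `lexer`

    // Counts the remaining tokens, calling nextToken() unqualified.
    size_t countTokens()
    {
        size_t n;
        while (nextToken() !is null) ++n;
        return n;
    }
}

void main()
{
    auto parser = Parser(Lexer(["a", "+", "b"]));

    // Callers can also reach the lexer's members through the parser.
    assert(parser.nextToken() == "a");
    assert(parser.countTokens() == 2);
}
```

This keeps both types as structs while still allowing `nextToken()` instead of `lexer.nextToken()`.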
On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?

You have the example of libdparse that shows that using a class can be a good idea [1] [2]. For DCD, the parser overrides a few things because otherwise completion does not work properly or has scope issues. But TBH there aren't many reasons to use a class otherwise.

[1] https://github.com/dlang-community/dsymbol/blob/master/src/dsymbol/conversion/package.d#L102
[2] https://github.com/dlang-community/dsymbol/blob/master/src/dsymbol/conversion/package.d#L138
Jun 17 2020
On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be encoded as a struct or class? In dmd `Lexer` and `Parser` are both classes. In general how should I reason about whether an aggregate type should be encoded as a struct or class?

IMO it doesn't need to be. However, it's worth saying that range semantics aren't a great fit for parsers - at least that's been my experience. Parsers need to be able to "synchronize" to recover from syntax errors, which does not fit into the range API very well. You can probably fit it in somewhere in popFront or front or empty, as your implementation permits, but I find it's just easier to forgo the range interface and implement whatever primitives you need; *then* you can add a range interface on top that models the output of the parser as a range of expressions, or whatever you want.
Jun 18 2020
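A rough sketch of that layering, under invented assumptions: a toy grammar in which every statement is an integer followed by `;`, a hypothetical `StatementParser` with its own primitives (`parseStatement`, `synchronize`), and a separate `Statements` range that exposes only the successfully parsed values. None of these names come from the thread.

```d
import std.algorithm.comparison : equal;
import std.algorithm.searching : all;
import std.ascii : isDigit;
import std.conv : to;

// Hypothetical parser with its own primitives, no range interface.
// The toy integer parsing assumes values that fit in an int.
struct StatementParser
{
    string[] tokens;
    size_t pos;
    string[] errors;

    bool atEnd() const { return pos >= tokens.length; }

    // Error recovery: skip to just past the next ';'.
    void synchronize()
    {
        while (!atEnd() && tokens[pos] != ";") ++pos;
        if (!atEnd()) ++pos;
    }

    // Parses one statement. On a syntax error it records a message,
    // resynchronizes, and returns false. Must not be called at end of input.
    bool parseStatement(out int value)
    {
        string tok = tokens[pos];
        const isNumber = tok.length > 0 && tok.all!isDigit;
        if (isNumber && pos + 1 < tokens.length && tokens[pos + 1] == ";")
        {
            value = tok.to!int;
            pos += 2;
            return true;
        }
        errors ~= "bad statement at token " ~ pos.to!string;
        synchronize();
        return false;
    }
}

// Range layered on top: models the parser's output, skipping errors.
struct Statements
{
    this(StatementParser* p)
    {
        parser = p;
        popFront(); // load the first valid statement
    }

    @property bool empty() const { return done; }
    @property int front() const { return current; }

    void popFront()
    {
        while (!parser.atEnd())
        {
            if (parser.parseStatement(current)) return;
        }
        done = true;
    }

private:
    StatementParser* parser;
    int current;
    bool done;
}

void main()
{
    auto parser = StatementParser(["1", ";", "x", "y", ";", "3", ";"]);

    // The malformed statement "x y ;" is skipped and reported separately.
    assert(Statements(&parser).equal([1, 3]));
    assert(parser.errors.length == 1);
}
```

The range never sees the recovery logic; it just asks the parser for the next good statement, which is the separation the post argues for.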