digitalmars.D - Meaningful identifiers and other multi-token keywords
- Quirin Schroll (30/30) Sep 24 D has 4 places in the grammar where meaningful identifiers are
- Dom DiSc (5/16) Sep 24 I think this is a good idea. They are multi-token keywords just
- Richard (Rikki) Andrew Cattermole (9/11) Sep 24 I'm not sure this one is a good idea.
- Tim (19/25) Sep 25 I don't think, the lexer would be the right place, because the
- Quirin Schroll (59/88) Sep 26 The whitespace is not an issue. The comments maybe are. But even
- ryuukk_ (7/38) Sep 25 I love scope guards, i use them all the time, however, they are
- Walter Bright (2/3) Sep 27 It isn't clear what problem is being solved by this.
- Imperatorn (4/11) Sep 27 Just a question, what would that mean for backwards compatibility
D has 4 places in the grammar where meaningful identifiers are used instead of keywords:
- Pragmas
- Traits
- Linkages
- Scope guards

For pragmas and traits, this is a total non-issue, as they have special and dedicated keywords. For linkages and scope guards, there will be rough edges if we make `(Type)` a well-formed `BasicType`. The reason is that `extern(C)` could mean `extern` plus the basic type `(C)`, where `C` denotes e.g. a dummy class; or `scope (exit) x = 10;` could be written with the intention not to assign `x`, but to declare `x` as a `scope` variable of type `exit`.

You could ask why anyone would write such code, and you’d be right to. The issue is that the arguments to `extern` and `scope` are identifiers. For linkages, it’s implementation defined which ones are supported, and they’re not just identifiers (e.g. `C++` and `Objective-C`). With scope guards, however, there are only `exit`, `success`, and `failure`.

I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees `scope`, `(`, any one of the identifiers `exit`, `success`, or `failure`, and `)`, that is a scope guard and is treated as a single token. The same with `extern(C)` – it will never be seen as anything but a linkage. It’s a multi-token keyword.

Possibly, we can handle other cases alike, e.g. `static assert`, `static foreach`, and `auto ref`. By all accounts, their meaning isn’t derived from composing the semantics of the parts.

What do you think?
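To make the proposal concrete, here is a minimal sketch of the fusion step, written as a pass over an already-split token list rather than as a change to DMD's actual lexer. The names `isGuardName` and `fuseScopeGuards` are hypothetical and exist only for this illustration; whitespace and comments are assumed to have been dropped before this pass runs.

```d
/// Hypothetical helper: true for the three scope guard identifiers.
bool isGuardName(string s)
{
    switch (s)
    {
        case "exit", "success", "failure":
            return true;
        default:
            return false;
    }
}

/// Hypothetical pass: fuses `scope`, `(`, guard name, `)` into one token.
/// Any other token sequence is passed through unchanged.
string[] fuseScopeGuards(string[] tokens)
{
    string[] result;
    size_t i = 0;
    while (i < tokens.length)
    {
        if (i + 3 < tokens.length
            && tokens[i] == "scope"
            && tokens[i + 1] == "("
            && isGuardName(tokens[i + 2])
            && tokens[i + 3] == ")")
        {
            // The four tokens become a single "multi-token keyword".
            result ~= "scope(" ~ tokens[i + 2] ~ ")";
            i += 4;
        }
        else
        {
            result ~= tokens[i];
            ++i;
        }
    }
    return result;
}

unittest
{
    // A scope guard is fused into one token.
    assert(fuseScopeGuards(["scope", "(", "exit", ")", "x", "=", "10", ";"])
        == ["scope(exit)", "x", "=", "10", ";"]);
    // `scope` followed by some other parenthesized identifier is left alone.
    assert(fuseScopeGuards(["scope", "(", "T", ")", "x", ";"])
        == ["scope", "(", "T", ")", "x", ";"]);
}
```

With this pass in place, a later parser would never see `(exit)` after `scope` as a parenthesized type, which is the ambiguity the proposal aims to rule out. Linkages would need more than a fixed list of names, since their arguments are not plain identifiers.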
Sep 24
On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees `scope`, `(`, any one of the identifiers `exit`, `success`, or `failure`, and `)`, that is a scope guard and is treated as a single token. The same with `extern(C)` – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> Possibly, we can handle other cases alike, e.g. `static assert`, `static foreach`, and `auto ref`. By all accounts, their meaning isn’t derived from composing the semantics of the parts.
>
> What do you think?

I think this is a good idea. They are multi-token keywords just to not occupy more words as keywords, but in fact could be treated as single entities.
Sep 24
On 25/09/2024 8:37 AM, Quirin Schroll wrote:
> The same with `extern(C)` – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I'm not sure this one is a good idea.

Not all linkages can be handled this way, e.g. C++ linkage can carry a namespace. So it is moving one behavior that has no special casing into another place that would require special casing and would slow things down.

Overall, I'm convinced that, given how the lexer works, this isn't a path we should be going down. It's done the way it is for a reason.

I would expect any changes down this path to slow down the lexing of all identifiers for very little value.
Sep 24
On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees `scope`, `(`, any one of the identifiers `exit`, `success`, or `failure`, and `)`, that is a scope guard and is treated as a single token. The same with `extern(C)` – it will never be seen as anything but a linkage. It’s a multi-token keyword.

I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in `extern ( C ++ /*comment*/ )`.

Unknown languages in `extern(...)` attributes should also produce errors, so future compilers can add them without breaking code.

Consider this example:
```
extern(X) x = 0;
```
Currently `X` is a normal identifier, but in the future it could be another language supported by the compiler. If `(X)` is interpreted as a type, then adding `extern(X)` to the compiler would be a breaking change.

For forward compatibility it would be best if `extern(...)` and `scope(...)` are always parsed as whole attributes and not attributes with types in parens. Unknown languages or scope guard identifiers would then produce errors, so future compilers could add them without breaking code.
Sep 25
On Wednesday, 25 September 2024 at 15:50:20 UTC, Tim wrote:
> On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
>> I want to suggest moving the parsing of scope guards and linkages to the lexer, i.e., if the lexer sees `scope`, `(`, any one of the identifiers `exit`, `success`, or `failure`, and `)`, that is a scope guard and is treated as a single token. The same with `extern(C)` – it will never be seen as anything but a linkage. It’s a multi-token keyword.
>
> I don't think the lexer would be the right place, because the constructs are still multiple tokens. For example, whitespace and comments are allowed in `extern ( C ++ /*comment*/ )`.

The whitespace is not an issue. The comments may be. But even if they were, one option would be to just ban comments in linkage attributes and scope guards and not deal with the problem. I mean, who would do that, except for a QA tester?

> Unknown languages in `extern(...)` attributes should also produce errors, so future compilers can add them without breaking code.

Officially, it’s implementation defined what’s supported beyond `D` and `C`, see [here](https://dlang.org/spec/attribute.html#LinkageAttribute). The fact that DMD supports `C++`, `Objective-C`, `System`, and `Windows` is already an extension. Considering C++ namespaces, the syntax is quite flexible. Essentially, any token soup with balanced parentheses is allowed. Maybe C++ got it right: there, it’s `extern "C"`.

> Consider this example:
> ```
> extern(X) x = 0;
> ```
> Currently `X` is a normal identifier, but in the future it could be another language supported by the compiler. If `(X)` is interpreted as a type, then adding `extern(X)` to the compiler would be a breaking change.
>
> For forward compatibility it would be best if `extern(...)` and `scope(...)` are always parsed as whole attributes and not attributes with types in parens. Unknown languages or scope guard identifiers would then produce errors, so future compilers could add them without breaking code.

With Primary Type Syntax, `extern (Type)` can happen by accident, yes. Then, `Type` could happen to be a valid linkage, but even in that case, there’s a high likelihood that there’s a parse error down the line. (In fact, it might be guaranteed; I couldn’t find a case where it isn’t.) That is because linkage attributes are not storage classes. Unlike `static`, `ref`, etc., `extern(C)` cannot be used instead of `auto`.

```d
// Current behavior:
alias C = int;
extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`
static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, and `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;
extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) auto x = 0; // Error: Type is not a linkage
static extern (Type) x = 0; // Error: Type is not a linkage
extern (Type) static x = 0; // Error: Type is not a linkage

// My implementation:
alias C = int;
extern (C) x = 0; // Error: basic type expected
extern (C) auto x = 0; // Good, and `C` can’t be the type of `x`
static extern (C) x = 0; // Error: basic type expected
extern (C) static x = 0; // Good, but `C` can’t be the type of `x`, even if it denotes a type

alias Type = int;
extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) auto x = 0; // Error: `Type` is not a linkage
static extern (Type) x = 0; // Error: `Type` is not a linkage
extern (Type) static x = 0; // Error: `Type` is not a linkage
```

From what I’ve understood in [the attribute spec](https://dlang.org/spec/attribute.html), `extern` marks a symbol as a declaration, whereas without it, it would be a definition. It comes with or implies `export`, or at least implies `static`. So, if needed, one can just put that (or any nothingburger) between `extern` and `(Type)` and be good.

```d
// My implementation
extern export (Type) x; // ok
extern static (Type) y; // ok
extern 0 (Type) z; // ok
```
Sep 26
On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
> D has 4 places in the grammar where meaningful identifiers are used instead of keywords:
> - Pragmas
> - Traits
> - Linkages
> - Scope guards
>
> [...]

I love scope guards, I use them all the time. However, they are both painful to type and make code ugly to read.

Perhaps `scope(exit)` and `scope(failure)` should be renamed to `defer` and `errDefer`.

Solves your problem, and mine.
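For readers less familiar with the feature under discussion, the following is a small sketch of how the three existing scope guards behave in current D. The function name `run` and the logged strings are made up for this illustration. Guards run in reverse order of declaration when the enclosing scope is left, and `success` and `failure` fire only on normal exit and on an exception, respectively.

```d
/// Hypothetical demo: records the order in which scope guards fire.
string[] run(bool fail)
{
    string[] log;
    try
    {
        scope (exit) log ~= "exit";       // always runs on scope exit
        scope (success) log ~= "success"; // runs only if no exception escapes
        scope (failure) log ~= "failure"; // runs only if an exception escapes
        log ~= "body";
        if (fail)
            throw new Exception("boom");
    }
    catch (Exception e)
    {
        log ~= "caught";
    }
    return log;
}

unittest
{
    assert(run(false) == ["body", "success", "exit"]);
    assert(run(true) == ["body", "failure", "exit", "caught"]);
}
```

Under the renaming suggested above, `scope (exit)` would presumably correspond to `defer` and `scope (failure)` to `errDefer`; the post does not say what would happen to `scope (success)`.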
Sep 25
On 9/24/2024 1:37 PM, Quirin Schroll wrote:
> What do you think?

It isn't clear what problem is being solved by this.
Sep 27
On Tuesday, 24 September 2024 at 20:37:36 UTC, Quirin Schroll wrote:
> D has 4 places in the grammar where meaningful identifiers are used instead of keywords:
> - Pragmas
> - Traits
> - Linkages
> - Scope guards
>
> [...]

Just a question: what would that mean for backwards compatibility and potential loss of flexibility?
Sep 27