digitalmars.D - DIP 1026---Deprecate Context-Sensitive String Literals---Community
- Mike Parker (16/16) Dec 03 2019 This is the feedback thread for the first round of Community
- Andrea Fontana (7/23) Dec 03 2019 I think there's a problem with this analysis.
- Dennis (4/10) Dec 03 2019 I agree, I definitely want to expand my collection of open source
- John Colvin (6/22) Dec 03 2019 Is there much point being almost context-free? Seems like we
- Dennis (10/15) Dec 03 2019 I _think_ this is the only thing in the lexical grammar that is
- Andrei Alexandrescu (5/9) Dec 03 2019 This DIP is a non-starter. Here documents are easily and effectively
- Dennis (40/42) Dec 03 2019 I consider this low-hanging fruit: just deprecating a token takes
- Adam D. Ruppe (18/20) Dec 03 2019 The identifier ones are trivial, they are a simple regex. Heck,
- Andrei Alexandrescu (24/34) Dec 03 2019 These can never be the primary reasons for removing a feature. One
- Dennis (37/56) Dec 03 2019 The DIP mentions:
- Andrei Alexandrescu (11/20) Dec 03 2019 It was great primarily because it was a built-in feature made
- Dennis (12/14) Dec 03 2019 If you truly wanted to convey that, you did a good job. But I do
- mipri (69/71) Dec 03 2019 Bad motivation and bad construction. The bad construction is
- FeepingCreature (15/23) Dec 04 2019 I think this is a really questionable argument, because it
- Dennis (12/26) Dec 04 2019 That's the nature of deprecation: a short term cost for a long
- Ola Fosheim Grøstad (4/6) Dec 04 2019 Suggesting a workable alternative usually is easier. Like:
- mipri (7/13) Dec 04 2019 Or specify that q"<<< (three chars exactly) can only be matched
- Andrei Alexandrescu (40/43) Dec 04 2019 That got me thinking. Here's what I'd opine.
- Ola Fosheim Grøstad (11/18) Dec 04 2019 That will prevent qualitative incremental improvements. You
- Walter Bright (7/13) Dec 04 2019 This would be a good opening for a separate thread.
- Kagamin (7/11) Dec 05 2019 If those other literals are bad. For python it's the opposite:
- Ola Fosheim Grøstad (8/13) Dec 05 2019 Yes, that usage you link to was for docs-strings though (more
- Kagamin (4/6) Dec 05 2019 D can embed files with import expression
- Ola Fosheim Grøstad (12/18) Dec 05 2019 That is a nice alternative for long text, but when building
- mipri (2/12) Dec 05 2019 Python doesn't have delimited strings.
- Kagamin (4/7) Dec 04 2019 Alternative can be any other type of string or an import
- Exil (16/56) Dec 03 2019 C++ removed features that were almost never used. So much so I
- H. S. Teoh (47/83) Dec 03 2019 Agreed, but that can't be the only criterion for removing a feature. By
- Elronnd (11/27) Dec 03 2019 That's clearly not a fair comparison. Heredocs can be reduced to
- H. S. Teoh (17/28) Dec 03 2019 This is a valid consideration *before* the language is implemented. The
- WebFreak001 (4/18) Dec 03 2019 actually with textmate based grammars this is pretty easy to
- H. S. Teoh (41/49) Dec 03 2019 [...]
- Paul Backus (8/23) Dec 03 2019 By definition, a context-free grammar is defined in terms of a
- H. S. Teoh (20/29) Dec 03 2019 [...]
- Andrei Alexandrescu (6/8) Dec 03 2019 I feared that would happen. When I drafted the initial answer, I had
- H. S. Teoh (8/17) Dec 03 2019 Yes, sigh, I can see it already: this thread is going to be another of
- Walter Bright (3/7) Dec 06 2019 It's a well-known effect that the less technical a proposal is, the more...
- Ola Fosheim Grøstad (9/14) Dec 03 2019 Just change the syntax to q"delimiter .... retimiled" and I
- Ola Fosheim Grøstad (3/11) Dec 03 2019 That was a joke! Don't argue it...
- Dennis (43/53) Dec 03 2019 I don't think you use the same terminology as the DIP so I might
- Ola Fosheim Grøstad (2/4) Dec 03 2019 Can't you use a lexer with a PEG parser?
- H. S. Teoh (100/142) Dec 03 2019 Walter has admitted that having 3 encodings, with the corresponding 3
- mipri (5/7) Dec 03 2019 Python actually doesn't have HERE docs. When it's included in
- Adam D. Ruppe (20/26) Dec 03 2019 VERY useful and helps make D on Windows feel first class, so it
- Dennis (56/87) Dec 04 2019 https://rosettacode.org/wiki/Here_document#Python
- Timon Gehr (10/17) Dec 04 2019 A small fix for this small problem is to just say in the specification
- Walter Bright (10/27) Dec 04 2019 Another case of my lack of academic CS training showing. I would appreci...
- Ola Fosheim Grøstad (17/20) Dec 04 2019 I don't think a spec has to use a lot of CS terms, probably
- Adam D. Ruppe (4/6) Dec 04 2019 In that context, if you replace "covariant with" with "can act as
- Ola Fosheim Grøstad (4/10) Dec 04 2019 That is much easier to understand, for sure. I think the best
- mipri (65/88) Dec 04 2019 The big (and only) advantage of HERE docs is that you so rarely
- Kagamin (35/38) Dec 04 2019 In a compiler.
- Guillaume Piolat (7/11) Dec 03 2019 YES
- Jonathan M Davis (15/32) Dec 03 2019 There are definitely people who use token strings in their code when wri...
- Adam D. Ruppe (4/6) Dec 03 2019 Token strings are q{ }, this is about the delimited strings like
- Dennis (12/16) Dec 03 2019 I don't propose deprecating token strings, only the identifier
- Jonathan M Davis (8/24) Dec 03 2019 Ah. Clearly, I glanced over it all too quickly. I confess that that
- H. S. Teoh (28/40) Dec 03 2019 The problem is that token strings require the contents to be *D tokens*.
- Elronnd (4/8) Dec 03 2019 Bracket-delimited string (q"[text]", allowing <>, [], (), and {}
- H. S. Teoh (6/14) Dec 03 2019 They still need to nest properly, though. Generating BF snippets, for
- Kagamin (9/13) Dec 04 2019 It requires efficient memory management. Wait, it requires memory
- Les De Ridder (4/11) Dec 03 2019 This DIP explicitly doesn't deprecate token strings, only
- aliak (7/23) Dec 03 2019 1) Are there any examples of strings that don't have an in-source
- Dennis (25/29) Dec 03 2019 Considering escape sequences such as "\x0B" and string
- Arun Chandrasekaran (3/7) Dec 03 2019 We use this feature. We can fix the code, but the DIP doesn't
- Walter Bright (2/2) Dec 04 2019 There are a lot of DIPs in the pipeline, and this looks highly unlikely ...
This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals":

https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md

All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on December 17, or when I make a post declaring it complete.

At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment.

Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines:

https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

*Please stay on topic!*

Thanks in advance to all who participate.
Dec 03 2019
On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": [...]

I think there's a problem with this analysis. The package registry contains a lot of libraries and just a few other projects. I wonder if libraries represent the real usage by "final" users. Maybe those stats should be run over GitHub D projects, at least.

Andrea
Dec 03 2019
On Tuesday, 3 December 2019 at 09:27:37 UTC, Andrea Fontana wrote:
> I think there's a problem with this analysis. The package registry contains a lot of libraries and just a few other projects. I wonder if libraries represent the real usage by "final" users. Maybe those stats should be run over github D projects, at least.

I agree, I definitely want to expand my collection of open source D code to be more representative. If I have time I may do this before the next review round.
Dec 03 2019
On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": [...]

Is there much point being almost context-free? Seems like we should know for sure whether there are any other parts of the grammar that are context dependent before we use it as motivation. The DIP is somewhat vague about whether this has been properly established.
Dec 03 2019
On Tuesday, 3 December 2019 at 12:24:09 UTC, John Colvin wrote:
> Is there much point being almost context-free? Seems like we should know for sure whether there are any other parts of the grammar that are context dependent before we use it as motivation. The DIP is somewhat vague about whether this has been properly established

I _think_ this is the only thing in the lexical grammar that is not context-free but I haven't verified this so I am intentionally a bit vague about that. This DIP is obviously necessary for being context-free, but not sufficient per se.

I can spend some time on this if it helps, but didn't want to put too much time into this before the first review in case it got shut down immediately. (And considering Andrei's post, it is at risk of that.)
Dec 03 2019
On 12/3/19 4:03 AM, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals":
>
> https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md

This DIP is a non-starter. Here documents are easily and effectively handled during lexing and have no impact on the language grammar.

Waste of labor is sadly a common theme in our community. We should have a mechanism to direct such investment of work toward a productive outcome.
Dec 03 2019
On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:
> Waste of labor is sadly a common theme in our community.

I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcome for the usual reasons:

- less code in dmd
- less specification text
- less didactic material / stuff to learn for new D programmers
- less bug/enhancement reports
- any tool that re-implements some part of the compiler is easier to make

In this case, such tools would be syntax highlighters. There are lots of syntax highlighting implementations for D, just a few off the top of my head:

- GitHub
- Code-d
- Kate
- Atom
- Sublime
- Chroma
- Vim
- Emacs
- Notepad++
- ...

They all tend to use their own domain specific language, and I'm pretty sure most of them are not powerful enough to express identifier-delimited strings. Here's an example of one if you're curious what they look like:

https://github.com/alecthomas/chroma/blob/master/lexers/d/d.go

Notice the:
> // TODO support delimited strings

If we don't want D support in syntax highlighters to be half-baked everywhere, keeping the lexical grammar simple is a good cause.

I can improve the rationale for this DIP with examples like in this post, though if you're absolutely adamant that this is a waste of effort then that won't help obviously. Maybe you don't care about syntax highlighting, but please judge this DIP by its own merits and not compared to potential other DIPs that you care more about.
Dec 03 2019
On Tuesday, 3 December 2019 at 14:45:31 UTC, Dennis wrote:
> I'm pretty sure most of them are not powerful enough to express identifier-delimited strings.

The identifier ones are trivial, they are a simple regex. Heck, my vim syntax highlight file not only supports them, but uses the opening as a hint as to what language is embedded:

    q"html
    <!-- highlights this as html! -->
    html";

That said though, I don't love them because they must end on a new line, without indentation. But still, it was easy to implement.

    syn region dHTML keepend matchgroup=string start="q\"html$" end="^html\"" contains=@html

And the generic fallback for other identifiers of course is just

    syn region dDelimString start=+q"\z(.\)+ end=+\z1"+ contains=@Spell
    syn region dHereString start=+q"\z(\I\i*\)\n+ end=+^\z1"+ contains=@Spell

vim manages to do it all pretty well....
Dec 03 2019
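The claim that identifier-delimited strings reduce to a regex with a back-reference can also be sketched outside Vim. The snippet below is a minimal Python illustration, not taken from any existing highlighter: the names `HEREDOC` and `find_heredocs` are invented for this example, the identifier pattern is simplified to ASCII, and the scan does not skip comments or other string literals, which a real lexer would have to do.

```python
import re

# q"IDENT<newline> ... <newline>IDENT"
# The closing identifier must start a line and be followed directly by the
# closing quote. \1 is the back-reference to the opening identifier; it is
# the part a highlighter DSL without back-references cannot express.
HEREDOC = re.compile(
    r'q"([A-Za-z_][A-Za-z0-9_]*)\n(.*?)^\1"',
    re.DOTALL | re.MULTILINE,
)


def find_heredocs(source):
    """Return (delimiter, body) pairs for each heredoc-style literal found."""
    return [(m.group(1), m.group(2)) for m in HEREDOC.finditer(source)]


sample = (
    'auto s = q"EOS\nHello "world"\nEOS";\n'
    'auto t = q"html\n<p>hi</p>\nhtml";\n'
)
for delim, body in find_heredocs(sample):
    print(delim, repr(body))
```

The captured delimiter (`html` in the second literal) is the same hint the Vim rule above uses to pick an embedded language.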
On 12/3/19 9:45 AM, Dennis wrote:
> On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:
>> Waste of labor is sadly a common theme in our community.
>
> I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcome

These can never be the primary reasons for removing a feature. One doesn't remove a feature because it's easy to remove. One removes a feature because there are good reasons to remove it, and as perks we get simplification of the language and maybe it's easy to remove.

> In this case, such tools would be syntax highlighters.

The entire narrative of the DIP puts CFG front and center. Reader's first thought is, "wait, the author is confused about what a CFG is."

FIRST sentence in the abstract: "D is intended to have a context-free grammar..."

FIRST paragraph in the rationale: "Regarding language design, Walter Bright has stated: [... CFG stuff ...]"

Even the "Grammar Changes" section should be a give-away: the diff proposed is in the LEXICAL definition (https://dlang.org/spec/lex.html), not in the GRAMMAR definition (https://dlang.org/spec/grammar.html).

If syntax highlighters are the primary reason for the DIP, it should be the primary reason in the DIP. The entire rationale needs to be redone. There should be an enumeration of syntax highlighters along with their success/failure of implementing heredocs. (Didn't test all but far as I can tell I've never heard of difficulties with implementing heredocs for bash, perl and the like.)

> Maybe you don't care about syntax highlighting, but please judge this DIP by its own merits and not compared to potential other DIPs that you care more about.

A DIP ought to be judged by reading the DIP. This DIP is ill informed because it is built around the CFG argument, a non-existing issue. If the DIP requires a forum post explaining how it needs to be judged, that's a problem with the DIP, not the reader.
Dec 03 2019
On Tuesday, 3 December 2019 at 19:42:12 UTC, Andrei Alexandrescu wrote:
> These can never be the primary reasons for removing a feature. One doesn't remove a feature because it's easy to remove. One removes a feature because there are good reasons to remove it, and as perks we get simplification of the language and maybe it's easy to remove.

The DIP mentions:

- D's flagship parser generator Pegged can't express the D grammar (without user defined parser functions)
- Syntax highlighters such as the one on Rosetta Code have trouble with it
- there is precedent of deprecating hexstring literals

I'll admit that the rationale section is not clear in the "primary reasons" to remove it, but I considered reducing language complexity an obvious win. Every feature is a trade off between what it brings to the table and what it costs, and when it turns out the benefit of a feature is low it gets removed, even when it's not inherently problematic. That's what happened with .sort, .reverse, Floating point NCEG operators, octal literals, hexstring literals, escape string literals.

Please answer this: Do you think there were good reasons to deprecate hexstring literals, or do you consider that a mistake / unnecessary?

> FIRST paragraph in the rationale: "Regarding language design, Walter Bright has stated: [... CFG stuff ...]"
>
> Even the "Grammar Changes" section should be a give-away: the diff proposed is in the LEXICAL definition (https://dlang.org/spec/lex.html), not in the GRAMMAR definition (https://dlang.org/spec/grammar.html).

And the very first thing on the grammar page is:
> 3.1 Lexical Syntax

With a link to the lexical grammar page. I consider lexical grammar part of "the grammar of D", even when the lexer and parser are separate stages in the compiler.

You might say Walter was exclusively talking about parsing grammar and not lexing grammar, but considering this part of the quote:
> A context free grammar, besides making things a lot simpler, means that IDEs can do syntax highlighting without integrating in most of a compiler front end

It mentions syntax highlighting which does not require parsing.

> If syntax highlighters are the primary reason for the DIP, it should be the primary reason in the DIP.

I don't want to commit to it as 'the primary reason', but I will put more emphasis on it in the next iteration.

> If the DIP requires a forum post explaining how it needs to be judged, that's a problem with the DIP, not the reader.

Your first reply came across as "this is useless, please work on something else". That felt like a destructive comment. This reply actually has constructive feedback, which helps. Thanks for that. I will be more specific when talking about 'the grammar', give some more focus on syntax highlighters and maybe dive more into the precedent of reducing language complexity by removing features.
Dec 03 2019
On 12/3/19 3:51 PM, Dennis wrote:
> Please answer this: Do you think there were good reasons to deprecate hexstring literals, or do you consider that a mistake / unnecessary?

It was great primarily because it was a built-in feature made unnecessary by improvements to the language. It would be a mistake to presuppose that hex string literals are a good precedent, however. Heredocs have no library alternative. The DIP would not be helped by attempting a parallel.

> Your first reply came across as "this is useless, please work on something else". That felt like a destructive comment. This reply actually has constructive feedback, which helps. Thanks for that. I will be more specific when talking about 'the grammar', give some more focus on syntax highlighters and maybe dive more into the precedent of reducing language complexity by removing features.

The destructive comment was actually more useful than one that prompts improvements to this DIP. Even if executed to perfection the impact would be null.

Let me ask this question: what would be a nice way to convey "this is useless, please work on something else"?
Dec 03 2019
On Tuesday, 3 December 2019 at 21:11:49 UTC, Andrei Alexandrescu wrote:
> Let me ask this question: what would be a nice way to convey "this is useless, please work on something else"?

If you truly wanted to convey that, you did a good job. But I do wonder how you expected me to take that. I would not reply "Got it, be right back, I'll e-mail Mike immediately and cancel this DIP and terminate all my effort so far right here.". Not after three comments in review round 1.

Even if this DIP is a failure, we could at least try to salvage some lessons from it. Why is it a bad DIP? What criteria should a language feature have to be candidate for removal, and why don't context-sensitive string literals fit those criteria? What sources of language complexity can be removed instead?
Dec 03 2019
On Tuesday, 3 December 2019 at 22:11:22 UTC, Dennis wrote:
> Even if this DIP is a failure, we could at least try to salvage some lessons from it. Why is it a bad DIP?

Bad motivation and bad construction. The bad construction is apparently that HERE docs do not actually conflict with context free grammars and that the entire point of the DIP is moot. That wasn't obvious to me; I was mainly thinking "I guess it's assumed that dmd will compile faster with this?"

I think the bad motivation is more interesting, even though a lot of this is how I received your DIP rather than how you may have definitely meant it:

1. "Less is always better."

Not stated in the DIP, but in your defense of it here:

> reduction in language complexity is (as far as I know) always welcome for the usual reasons:
> - less code in dmd
> - less specification text
> - less didactic material / stuff to learn for new D programmers
> - less bug/enhancement reports
> - any tool that re-implements some part of the compiler is easier to make

Less should have a *point*, though. Much code, specification, and most importantly didactic material is already written. I have physical bound books within arm's reach of me that discuss these features. Removing the feature doesn't make these books easier to write, it just makes it more annoying for people to read them, as they're introduced to deprecated features. It makes "other people's code" slightly more annoying to consider, as you may have to update that code to remove since-deprecated features.

Removing HERE docs doesn't create a python2/python3 or a perl5/perl6 situation, but it still forks the language and the old language still does not simply or automatically disappear. I really dislike this about C++: that no matter how modern it gets, there will be these huge carbon-dated layers of code out there that are pre-modern and that can hardly be understood without also learning the stuff that the modern features are supposed to have replaced.

If a feature were to be judged a mistake, it can still be a mistake to remove the feature later on. Less is not always better.

2. D's problem is "too many features" -> let's remove any that looks relatively easy to remove.

How much agreement do you think there is on the first point? Consider the "remove ~= from arrays" DIP. It removed a feature, and removing the feature arguably materially improved D's options to evolve as a language, and it got a really incensed negative response.

A human engineer can improve a machine by shutting it down, tearing it apart, making an improvement, and putting it back together again. This interruptability of the engineered system is one of the characteristics of human engineering, along with "use dry materials" and "use stiff materials", that distinguishes it from what you might call engineering by Mother Nature, who uses wet materials, and flexible materials, and whose works (even if they pull some tricks like molting or entering a cocoon) must continue to stay alive even as they undergo radical changes in form.

A DIP can't kill D, take it apart, make an improvement, and then put it back together again, because then all the users will be gone. Language design is more like natural engineering in this way. If part of D's problems is that it has a lot of features, the best way forward can still not be to remove them.

3. "Walter said a thing about D, but a StackOverflow comment refuted that, so the language should change so that this criticism is no longer true."

https://stackoverflow.com/a/7083615

Geez. Someone who thinks D has "an obnoxious amount of ambiguity" is definitely still going to think that after HERE docs are gone.
Dec 03 2019
On Tuesday, 3 December 2019 at 23:35:21 UTC, mipri wrote:
> 2. D's problem is "too many features" -> let's remove any that looks relatively easy to remove.
>
> How much agreement do you think there is on the first point? Consider the "remove ~= from arrays" DIP. It removed a feature, and removing the feature arguably materially improved D's options to evolve as a language, and it got a really incensed negative response.

I think this is a really questionable argument, because it implicitly presumes that all features are worth the same. The "remove ~= from arrays" DIP got, as far as I could see, basically no feedback along the lines of "whatever, we use it but we could replace it easily" or "I think D doesn't need to reduce its feature set in general." The feedback it got was, as far as I could tell, overwhelmingly "this feature is a core component of the usefulness of the D language and definitely the *wrong place* to start removing things."

Logically speaking, the more people think it is the wrong place to start removing features, the less that debate says about removing features as a whole, because people were more motivated by the specific feature rather than the general state of the language.
Dec 04 2019
Thanks for your detailed breakdown.

On Tuesday, 3 December 2019 at 23:35:21 UTC, mipri wrote:
> It makes "other people's code" slightly more annoying to consider, as you may have to update that code to remove since-deprecated features.

That's the nature of deprecation: a short term cost for a long term improvement.

> If a feature were to be judged a mistake, it can still be a mistake to remove the feature later on. Less is not always better.

That's true.

> 2. D's problem is "too many features" -> let's remove any that looks relatively easy to remove.
>
> How much agreement do you think there is on the first point?

I don't know how much explicit agreement there is to the sentiment that D has too many features, but I do know at least Walter is always interested in reducing language complexity, and many non-actionable complaints of users (such as "D is difficult to learn") are rooted in things like this.

> 3. "Walter said a thing about D, but a StackOverflow comment refuted that, so the language should change so that this criticism is no longer true."

That is only there for the narrative / background, correcting criticism is not a goal of this DIP.
Dec 04 2019
On Wednesday, 4 December 2019 at 09:42:32 UTC, Dennis wrote:
> That is only there for the narrative / background, correcting criticism is not a goal of this DIP.

Suggesting a workable alternative usually is easier. Like:

replace: q"delimiter...

with Python like: """
Dec 04 2019
On Wednesday, 4 December 2019 at 10:10:09 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 4 December 2019 at 09:42:32 UTC, Dennis wrote:
>> That is only there for the narrative / background, correcting criticism is not a goal of this DIP.
>
> Suggesting a workable alternative usually is easier. Like:
>
> replace: q"delimiter...
>
> with Python like: """

Or specify that q"<<< (three chars exactly) can only be matched with >>>", along with the other matching delimiters. This is a breaking change though since the current behavior is:

    $ rdmd --eval 'writeln(q"<<< hello >>>")'
    << hello >>
Dec 04 2019
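The output shown in the `rdmd` example follows from how nesting delimiters are matched: only the first `<` is the delimiter, and the scan counts further openers and closers until the depth returns to zero. A rough Python sketch of that counting is given below. It is an illustration only, not dmd's lexer; `PAIRS` and `read_nesting_string` are names made up for this example.

```python
# The four bracket pairs that D treats as nesting delimiters.
PAIRS = {"<": ">", "[": "]", "(": ")", "{": "}"}


def read_nesting_string(text):
    """Return the contents of a q"<...>"-style literal at the start of text.

    Raises ValueError if text does not start with such a literal or the
    literal is not terminated.
    """
    if not text.startswith('q"') or len(text) < 3 or text[2] not in PAIRS:
        raise ValueError("not a nesting-delimited string")
    opener = text[2]
    closer = PAIRS[opener]
    depth = 1
    i = 3
    while i < len(text):
        ch = text[i]
        if ch == opener:
            depth += 1
        elif ch == closer:
            depth -= 1
            if depth == 0:
                # The closing delimiter must be followed by the closing quote.
                if text[i + 1 : i + 2] != '"':
                    raise ValueError('expected " after closing delimiter')
                return text[3:i]
        i += 1
    raise ValueError("unterminated delimited string")


print(read_nesting_string('q"<<< hello >>>"'))  # prints: << hello >>
```

Under the suggested rule, the opener would be the full `<<<` and the result would be ` hello ` instead, which is why existing code could change meaning.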
On 12/3/19 5:11 PM, Dennis wrote:
> What criteria should a language feature have to be candidate for removal, and why don't context-sensitive string literals fit those criteria? What sources of language complexity can be removed instead?

That got me thinking. Here's what I'd opine.

A good DIP creates a scientific argument. It would have the general attitude of building, through a series of factual statements, a hypothesis that is convincing. A neutral person with the proper background would read the facts and reach the conclusion as much as the author. (In contrast, a DIP that is not scientific would attempt to use qualitative arguments and rhetoric in an attempt to create an opinion trend.)

Consider someone reads a DIP proposing the removal of here docs containing facts such as these:

* "We have analyzed x languages and of these, we found y historical issues related to mistaken or poor performance implementation of heredocs. [... details ...]"

* "Across x editors, we discovered that x1 do not implement here docs for any of their supported languages, x2 do not implement them for D, and x3 implement them with severe performance bottlenecks. [... details ...]"

* "In the D compiler issue, we found x bug reports issued over y years. They took z days on average to fix. x1 issues are still open. [... details ...]"

* "The code dedicated to heredocs in the D reference parser is y lines long, which constitutes z% of the entire lexer. Lexing of heredocs is t% slower than any other equivalent strings, revealing a serious performance bottleneck. [... details ...]"

With such arguments at hand, a proposal would build a powerful argument that anyone can easily verify and take into consideration. No need for argumentation, explanations, etc. Conversely, if one does such an investigation and gets no meaningful results, the conclusion that heredocs are okay as they are would also be immediate.

Now it may be argued that all of this is hard work, and of high risk - even if the DIP is well-argued, it could be rejected. Also, is the result of the work (a small language simplification) worth the effort?

Sadly I know of no solution to this. What I can say is that it's the main dilemma tormenting graduate students doing research. A colleague of mine in the PhD program said he has any number of ideas to research, but the cognitive load of putting work into something that may not pan out is paralyzing him, so he ends up doing nothing for long periods of time. He ended up not finishing his degree. For all I know he was smarter and better than many who did graduate.
Dec 04 2019
On Wednesday, 4 December 2019 at 21:57:00 UTC, Andrei Alexandrescu wrote:
> A good DIP creates a scientific argument. It would have the general attitude of building, through a series of factual statements, a hypothesis that is convincing. A neutral person with the proper background would read the facts and reach the conclusion as much as the author. (In contrast, a DIP that is not scientific would attempt to use qualitative arguments and rhetoric in an attempt to create an opinion trend.)

That will prevent qualitative incremental improvements. You cannot make quantitative arguments without very large amounts of data... there is no such dataset, only GitHub.

If the DIP had provided an argument for an alternative here-document syntax that was easier to parse then it is probable that there would have been few objections to it. It could have been automated.

There is really no use in pretending that language changes are apolitical. They are usually inherently political.
Dec 04 2019
On 12/3/2019 2:11 PM, Dennis wrote:
> Why is it a bad DIP?

I think Andrei covered that fairly well.

> What criteria should a language feature have to be candidate for removal,

This would be a good opening for a separate thread.

> and why don't context-sensitive string literals fit those criteria?

The only real cost identified is poor support for syntax highlighting in some text editors. On the other hand, heredocs are a common language feature, and other methods of doing it are so clumsy people rarely have the stomach to do it.

> What sources of language complexity can be removed instead?

This would be a good opening for a separate thread.
Dec 04 2019
On Wednesday, 4 December 2019 at 22:37:24 UTC, Walter Bright wrote:
> The only real cost identified is poor support for syntax highlighting in some text editors. On the other hand, heredocs are a common language feature, and other methods of doing it are so clumsy people rarely have the stomach to do it.

If those other literals are bad. For python it's the opposite: given triple quoted strings people can't stand delimited strings and use triple quoted strings predominantly instead of delimited strings, see it in action: https://github.com/django/django/blob/master/django/core/signing.py - it's the first random python code I found on github.
Dec 05 2019
On Thursday, 5 December 2019 at 18:07:08 UTC, Kagamin wrote:
> If those other literals are bad. For python it's the opposite: given triple quoted strings people can't stand delimited strings and use triple quoted strings predominantly instead of delimited strings, see it in action: https://github.com/django/django/blob/master/django/core/signing.py - it's the first random python code I found on github.

Yes, the usage you link to was for docstrings though (more like comments), but I use Python triple-quoted strings all the time. I have never really run into a situation where there was a clash with """, actually. It looks like too simple a solution, but it works very well in practice. Another point is that here-documents may be important in WebAssembly for embedding "files".
Dec 05 2019
On Thursday, 5 December 2019 at 18:23:10 UTC, Ola Fosheim Grøstad wrote:
> Another point is that here-documents may be important in WebAssembly for embedding "files".

D can embed files with import expression https://dlang.org/spec/expression.html#import_expressions
Dec 05 2019
On Thursday, 5 December 2019 at 18:34:12 UTC, Kagamin wrote:
> On Thursday, 5 December 2019 at 18:23:10 UTC, Ola Fosheim Grøstad wrote:
>> Another point is that here-documents may be important in WebAssembly for embedding "files".
>
> D can embed files with import expression https://dlang.org/spec/expression.html#import_expressions

That is a nice alternative for long text, but when building websites you often deal with many shorter blocks of text. Anyway. Although I prefer """ as it is visually cleaner, C++ actually has something similar to D:

const char* s1 = R"foo( Hello World )foo";

https://en.cppreference.com/w/cpp/language/string_literal

So, highlighters need to support that if they want to support C++...
Dec 05 2019
On Thursday, 5 December 2019 at 18:07:08 UTC, Kagamin wrote:
> On Wednesday, 4 December 2019 at 22:37:24 UTC, Walter Bright wrote:
>> The only real cost identified is poor support for syntax highlighting in some text editors. On the other hand, heredocs are a common language feature, and other methods of doing it are so clumsy people rarely have the stomach to do it.
>
> If those other literals are bad. For python it's the opposite: given triple quoted strings people can't stand delimited strings and use triple quoted strings predominantly instead of delimited strings, see it in action:

Python doesn't have delimited strings.
Dec 05 2019
On Tuesday, 3 December 2019 at 21:11:49 UTC, Andrei Alexandrescu wrote:
> It would be a mistake to presuppose that hex string literals are a good precedent, however. Heredocs have no library alternative.

Alternative can be any other type of string or an import expression.
Dec 04 2019
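For readers unfamiliar with the "library alternative" mentioned in the quote: at the time of this thread, the deprecated hex string and octal literals had replacements in Phobos. The following is a minimal sketch, assuming a Phobos version that ships `std.conv.hexString` and `std.conv.octal`; the constant names `hexData` and `perms` are made up for illustration.

```d
import std.conv : hexString, octal;
import std.stdio : writeln;

// Library stand-in for a hex string literal: each pair of hex digits
// becomes one character, evaluated at compile time.
enum hexData = hexString!"304A314B";

// Library stand-in for a C-style octal literal such as 0755.
enum perms = octal!755;

void main()
{
    writeln(hexData);
    writeln(perms);
}
```

Nothing comparable exists for heredocs, which is the distinction the quoted post draws; the reply above points to other string forms and import expressions instead.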
On Tuesday, 3 December 2019 at 19:42:12 UTC, Andrei Alexandrescu wrote:On 12/3/19 9:45 AM, Dennis wrote:C++ removed features that were almost never used. So much so I don't even remember what they were called. This is a D feature I never knew existed. It does make it simpler and I'd argue for removing it entirely rather than adding replacements for it.On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:These can never be the primary reasons for removing a feature. One doesn't remove a feature because it's easy to remove. One removes a feature because there are good reasons to remove it, and as perks we get simplification of the language and maybe it's easy to remove.Waste of labor is sadly a common theme in our community.I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcomeThe tools for IDEs, I'd argue auto complete is probably the most useful tool an IDE has. You can't implement it without basically having the entire front end of the compiler because of CTFE. It's so complicated in fact that there are no tools for D that support it. I've seen some incorrect syntax highlighting for D but I think it was specifically caused by q{} which this doesn't remove anyways.In this case, such tools would be syntax highlighters.The entire narrative of the DIP puts CFG front and center. Reader's first thought is, "wait, the author is confused about what a CFG is." FIRST sentence in the abstract: "D is intended to have a context-free grammar..." FIRST paragraph in the rationale: "Regarding language design, Walter Bright has stated: [... CFG stuff ...]" Even the "Grammar Changes" section should be a give-away: the diff proposed is in the LEXICAL definition (https://dlang.org/spec/lex.html), not in the GRAMMAR definition (https://dlang.org/spec/grammar.html).
If syntax highlighters are the primary reason for the DIP, it should be the primary reason in the DIP. The entire rationale needs to be redone. There should be an enumeration of syntax highlighters along with their success/failure of implementing heredocs. (Didn't test all but far as I can tell I've never heard of difficulties with implementing heredocs for bash, perl and the like.)DIP1021. If the D federation leadership holds itself to that kind of standard, I don't see why anyone should expect them to hold someone else to a standard above and beyond their own.Maybe you don't care about syntax highlighting, but please judge this DIP by its own merits and not compared to potential other DIPs that you care more about.A DIP ought to be judged by reading the DIP. This DIP is ill informed because it is built around the CFG argument, a non-existing issue. If the DIP requires a forum post explaining how it needs to be judged, that's a problem with the DIP, not the reader.
Dec 03 2019
On Tue, Dec 03, 2019 at 02:45:31PM +0000, Dennis via Digitalmars-d wrote:On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:That's a bit uncalled for.Waste of labor is sadly a common theme in our community.I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcome for the usual reasons: - less code in dmd - less specification text - less didactic material / stuff to learn for new D programmers - less bug/enhancement reports - any tool that re-implements some part of the compiler is easier to makeAgreed, but that can't be the only criterion for removing a feature. By the same argument, one could make the case for removing templates from D. Bingo, the language instantly becomes so much easier to parse! And it greatly simplifies the compiler -- we can delete large sections of it, in fact! The spec becomes simpler, D newbies don't need to learn this hard template stuff anymore, and we can close all template-relateed bugs, and tools become greatly simplified.In this case, such tools would be syntax highlighters. There are lots of syntax highlighting implementations for D, just a few off the top off my head: - GitHub - Code-d - Kate - Atom - Sublime - Chroma - Vim - Emacs - Notepad++ - ... They all tend to use their own domain specific language, and I'm pretty sure most of them are not powerful enough to express identifier-delimited strings.Are you sure? Adam just gave an example of correct heredoc highlighting in vim. It may not be *trivial*, but it's possible. And users don't have to worry about it, somebody writes the snippet once for all, and everyone else can just reuse it. [...]If we don't want D support in syntax highlighters to be half-baked everywhere, keeping the lexical grammar simple is a good cause.IOW, implementators aren't competent enough to implement something up to spec, therefore we should dumb down the spec for their sake? 
Sounds like a backwards reason for doing something.I can improve the rationale for this DIP with examples like in this post, though if you're absolutely adamant that this is a waste of effort then that won't help obviously. Maybe you don't care about syntax highlighting, but please judge this DIP by its own merits and not compared to potential other DIPs that you care more about.The problem with this DIP is that it removes a marginal feature for no good rationale, breaking a pretty long list of existing D code projects that depend on said feature, while offering very little in return (nothing that can't be fixed another way, e.g., fix broken syntax highlighters so that they work properly(!)). And it does so without considering why this feature might have been added in the first place, what kind of problems it solves, and how said problems can be mitigated if the feature was removed. As I've already said, I work a lot with code generators and other code that embed long-ish text passages in code. Heredoc syntax is ideal for this sort of code, allowing you to temporarily "escape" from D syntax and write code snippets as-is, rather than require onerous escaping which makes said text less readable. E.g., if I want to embed a mini Perl script inside a function, I couldn't write it as a token string (some Perl tokens are not D tokens), and writing it as a quoted string induces Leaning Toothpick Syndrome, making it hard to edit the script. The script itself is short enough it doesn't seem worth creating it as a separate file (and then needing to fight with paths to find it in the right place). Heredoc syntax lets me just write the danged script in situ and move on already, instead of fighting with Leaning Toothpick Syndrome or heaping on yet another layer of pathname resolution code just to find a miserable 5-line script file. Same goes for embedded long-ish text (don't have to type ""~ all over the place), etc.. 
It's marginal, yes, but heredocs are quite useful for the use cases they were intended to be used, and I really don't see why they should be singled out among so many other things that D could stand to improve in. T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
Dec 03 2019
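The "Leaning Toothpick Syndrome" point in the post above can be sketched with a short example. The embedded Perl line is hypothetical, picked only because it is full of quotes and backslashes; the names `escaped` and `heredoc` are likewise made up for illustration.

```d
import std.stdio : write;

// 1. Double-quoted literal: every quote and backslash in the embedded
//    script has to be escaped by hand.
enum escaped = "print \"path: C:\\\\temp\\n\" if $line =~ /\"(\\w+)\"/;\n";

// 2. Identifier-delimited (heredoc) literal: the script is written as-is.
//    The text starts on the line after q"PERL and ends before the line
//    that begins with PERL"; the final newline is part of the string.
enum heredoc = q"PERL
print "path: C:\\temp\n" if $line =~ /"(\w+)"/;
PERL";

// Both spellings denote the same string.
static assert(escaped == heredoc);

void main()
{
    write(heredoc);
}
```

A token string (`q{...}`) would not be an option here, since the Perl text is not made of valid D tokens.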
On Tuesday, 3 December 2019 at 20:53:07 UTC, H. S. Teoh wrote:That's clearly not a fair comparison. Heredocs can be reduced to a set of local transformations, while templates cannot. This means: code using heredocs can be mechanically changed to not use them, and heredocs do not make the language more expressive.*snipped various arguments to do with simplicity*Agreed, but that can't be the only criterion for removing a feature. By the same argument, one could make the case for removing templates from D. Bingo, the language instantly becomes so much easier to parse! And it greatly simplifies the compiler -- we can delete large sections of it, in fact! The spec becomes simpler, D newbies don't need to learn this hard template stuff anymore, and we can close all template-relateed bugs, and tools become greatly simplified.The easier the language is to implement, the more implementors there will be. If there are compelling reasons to include a language feature, and it makes implementation more difficult, it should be included regardless. But that doesn't mean that ease of implementation should be completely ignored when considering language features.If we don't want D support in syntax highlighters to be half-baked everywhere, keeping the lexical grammar simple is a good cause.IOW, implementators aren't competent enough to implement something up to spec, therefore we should dumb down the spec for their sake? Sounds like a backwards reason for doing something.
Dec 03 2019
On Tue, Dec 03, 2019 at 09:38:28PM +0000, Elronnd via Digitalmars-d wrote:On Tuesday, 3 December 2019 at 20:53:07 UTC, H. S. Teoh wrote:[...]This is a valid consideration *before* the language is implemented. The current situation is: 1) Heredocs are *already* implemented, have been for a long time, and working very well, except with the wrinkle of some poor syntax highlighter implementations that fail to parse them correctly. 2) Parsing heredocs is actually not *that* hard, as proven by already (at least) two examples given in this very thread of syntax highlighting code that actually parses them correctly. We aren't talking about solving NP complete problems here, that might be considered reasonable cause for simplifying something. It does not take a day's work to write a parser that understands heredocs, and we're debating about implementation *difficulty*? Whoa. T -- My program has no bugs! Only undocumented features...IOW, implementators aren't competent enough to implement something up to spec, therefore we should dumb down the spec for their sake? Sounds like a backwards reason for doing something.The easier the language is to implement, the more implementors there will be. If there are compelling reasons to include a language feature, and it makes implementation more difficult, it should be included regardless. But that doesn't mean that ease of implementation should be completely ignored when considering language features.
Dec 03 2019
On Tuesday, 3 December 2019 at 14:45:31 UTC, Dennis wrote:On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:actually with textmate based grammars this is pretty easy to implement: https://github.com/Pure-D/code-d/blob/master/syntaxes/d.json#L2190-L2200[...]I consider this low-hanging fruit: just deprecating a token takes little implementation effort, and reduction in language complexity is (as far as I know) always welcome for the usual reasons: - less code in dmd - less specification text - less didactic material / stuff to learn for new D programmers - less bug/enhancement reports - any tool that re-implements some part of the compiler is easier to make [...]
Dec 03 2019
On Tue, Dec 03, 2019 at 07:38:29AM -0500, Andrei Alexandrescu via Digitalmars-d wrote:On 12/3/19 4:03 AM, Mike Parker wrote:[...] When I read the title "context-sensitive string literals" I was wondering what part of D actually has strings whose interpretation changes depending on context. I was shocked to discover that it was referring to heredoc strings. Please don't get rid of heredoc strings. I use them quite a bit, because I work a lot with code generators. They are a refreshing change from C/C++ where trying to quote a piece of code as a string requires Leaning Toothpick Syndrome (i.e., \'s all over the place to escape quoted string metacharacters). I do *not* want to return to that nastiness, thank you very much. As Andrei said, heredoc string are trivial to parse because they are essentially a single big token. This should not pose any problem for the parser at all. The argument in the DIP is flawed because, at the level of a lexer/parser, a heredoc string is no different from a delimited string: it starts with a sequence of one or more characters (the opening delimiter), spans some arbitrary number of characters (the string content) until another sequence of one or more characters (the closing delimiter). Nothing stops someone from writing a 50,000-character double-quoted string, for example, and the lexer/parser will handle it just fine. So why the hate against heredoc strings? Arguably, heredoc strings are exactly what *solves* the problem of 50,000-character strings being essentially unreadable to a human reader because of poor formatting. As for poor syntax highlighting as mentioned in the DIP, how is that even a problem with the language?! It's a strawman argument based on skewed data obtained from badly-written lexers that don't actually lex D code correctly. It should be the syntax highlighter that should be fixed, rather than deprecate an actually useful feature in the language. 
Not to mention, the long list of projects at the end that will need to be updated, which includes dmd itself BTW, looks like strong evidence of good use of such string literals, rather than marginal use that might be construed to be a reason for deprecation. And most importantly of all: string literals are *single tokens* in the language. They are lexical units, and therefore have nothing whatsoever to do with the grammar being context-free or not. We're shooting at the wrong target here. T -- Famous last words: I wonder what will happen if I do *this*...This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.mdThis DIP is a non-starter. Here documents are easily and effectively handled during lexing and have no impact on the language grammar.
Dec 03 2019
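To make the "opening delimiter, content, closing delimiter" description above concrete, here is a rough sketch of how a hand-written lexer might scan an identifier-delimited string. It is an illustration under simplifying assumptions (only `\n` line endings, no validation of the identifier, no string postfix), not how dmd or any particular highlighter does it; `Heredoc` and `scanHeredoc` are hypothetical names.

```d
import std.stdio : writeln;
import std.string : indexOf;

/// Result of scanning one identifier-delimited string.
struct Heredoc
{
    string delimiter; // identifier that follows q"
    string content;   // text between the opening and closing lines
    size_t length;    // source characters consumed; 0 means "not a heredoc"
}

/// Scans `src`, which is expected to start with q"IDENT followed by a newline.
Heredoc scanHeredoc(string src)
{
    if (src.length < 3 || src[0 .. 2] != `q"`)
        return Heredoc.init;

    // The opening delimiter runs from after q" up to the first newline.
    ptrdiff_t eol = src.indexOf('\n');
    if (eol <= 2)
        return Heredoc.init;
    string delim = src[2 .. eol];

    // The closing delimiter is the same identifier at the start of a line,
    // immediately followed by a double quote. This is the one place where
    // what the lexer looks for depends on what it read earlier.
    string closer = "\n" ~ delim ~ `"`;
    ptrdiff_t end = src.indexOf(closer, cast(size_t) eol);
    if (end < 0)
        return Heredoc.init;

    // The content keeps its final newline, matching D's heredoc semantics.
    return Heredoc(delim, src[eol + 1 .. end + 1], end + closer.length);
}

void main()
{
    auto h = scanHeredoc("q\"EOS\nHello\n  World\nEOS\"; // rest of file");
    writeln(h.delimiter, " -> ", h.content.length, " chars, token length ", h.length);
}
```

The scan is a few lines of ordinary code, which supports the "single big token" view; the fact that `closer` is built from `delim` is what the DIP means by context sensitivity, and it is also why a purely regular or table-driven highlighter needs some form of back-reference to handle it.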
On Tuesday, 3 December 2019 at 18:34:22 UTC, H. S. Teoh wrote:
> On Tue, Dec 03, 2019 at 07:38:29AM -0500, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 12/3/19 4:03 AM, Mike Parker wrote:
>>> This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md
>>
>> This DIP is a non-starter. Here documents are easily and effectively handled during lexing and have no impact on the language grammar.
> [...]
> As Andrei said, heredoc string are trivial to parse because they are essentially a single big token. This should not pose any problem for the parser at all.

By definition, a context-free grammar is defined in terms of a finite set of non-terminal symbols (i.e., tokens). [1] The set of all string literals is infinite. Therefore, either string literals are not tokens, or D's grammar is not context-free.

[1] https://en.wikipedia.org/wiki/Context-free_grammar#Formal_definitions
Dec 03 2019
On Tue, Dec 03, 2019 at 08:40:14PM +0000, Paul Backus via Digitalmars-d wrote:
> On Tuesday, 3 December 2019 at 18:34:22 UTC, H. S. Teoh wrote:
> [...]
>> As Andrei said, heredoc string are trivial to parse because they are essentially a single big token. This should not pose any problem for the parser at all.
>
> By definition, a context-free grammar is defined in terms of a finite set of non-terminal symbols (i.e., tokens). [1] The set of all string literals is infinite. Therefore, either string literals are not tokens, or D's grammar is not context-free.
[...]

I think you're imposing a needlessly literal(!) interpretation of context-free grammars. For example, integer literals are also unbounded (there is no largest integer, therefore the set of integer literals is infinite). Does that mean that a calculator program that includes integer literals in its grammar is not context-free? I think that's a preposterous application of the definitions. As far as the grammar is concerned, all integer literals are the same terminal symbol, because the grammar does not (need to) distinguish between them.

Treating string (or any other) literals as non-tokens makes no sense because they are not symmetric with non-string (or other) tokens, e.g., D tokens allow arbitrary whitespace between them, yet you cannot arbitrarily insert whitespace into a string literal without changing its semantics.

T

-- Time flies like an arrow. Fruit flies like a banana.
Dec 03 2019
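The calculator argument in the post above can be illustrated with a toy lexer. This is only a sketch: `TokenKind` and `lexKinds` are hypothetical names, and the "grammar" is reduced to integers, `+` and `*`.

```d
import std.ascii : isDigit, isWhite;
import std.stdio : writeln;

/// The terminal symbols a calculator grammar would be written over.
enum TokenKind { intLiteral, plus, star }

/// Lexes an expression and returns only the token kinds. The spelling of
/// each integer is dropped, because the grammar never looks at it.
TokenKind[] lexKinds(string src)
{
    TokenKind[] kinds;
    size_t i = 0;
    while (i < src.length)
    {
        immutable c = src[i];
        if (isWhite(c))
        {
            ++i; // whitespace separates tokens and is otherwise ignored
        }
        else if (isDigit(c))
        {
            // Any run of digits, however long, is one intLiteral token.
            while (i < src.length && isDigit(src[i]))
                ++i;
            kinds ~= TokenKind.intLiteral;
        }
        else if (c == '+') { kinds ~= TokenKind.plus; ++i; }
        else if (c == '*') { kinds ~= TokenKind.star; ++i; }
        else throw new Exception("unexpected character");
    }
    return kinds;
}

void main()
{
    // Infinitely many spellings, one finite alphabet of terminals.
    writeln(lexKinds("1 + 2"));
    writeln(lexKinds("123456789 + 42"));
}
```

Both inputs produce the same sequence of terminals, so the parser sees a finite alphabet even though the set of possible literals is unbounded.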
On 12/3/19 4:04 PM, H. S. Teoh wrote:
> I think you're imposing a needlessly literal(!) interpretation of context-free grammars.

I feared that would happen. When I drafted the initial answer, I had this text: "Subject to the way the grammar is defined across lexical tokens and higher-level constructs, yes, one could build a theoretical argument that heredocs are a context-dependent construct." Then I removed it to avoid divagating. Now, here we are.
Dec 03 2019
On Tue, Dec 03, 2019 at 04:14:47PM -0500, Andrei Alexandrescu via Digitalmars-d wrote:
> On 12/3/19 4:04 PM, H. S. Teoh wrote:
>> I think you're imposing a needlessly literal(!) interpretation of context-free grammars.
>
> I feared that would happen. When I drafted the initial answer, I had this text: "Subject to the way the grammar is defined across lexical tokens and higher-level constructs, yes, one could build a theoretical argument that heredocs are a context-dependent construct." Then I removed it to avoid divagating. Now, here we are.

Yes, sigh, I can see it already: this thread is going to be another of those interminably-long debates and nitpicking over technicalities, and at the end of it all, this DIP will fall by the wayside and we will have accomplished nothing.

T

-- Ph.D. = Permanent head Damage
Dec 03 2019
On 12/3/2019 1:27 PM, H. S. Teoh wrote:
> Yes, sigh, I can see it already: this thread is going to be another of those interminably-long debates and nitpicking over technicalities, and at the end of it all, this DIP will fall by the wayside and we will have accomplished nothing.

It's a well-known effect that the less technical a proposal is, the more debate will follow.
Dec 06 2019
On Tuesday, 3 December 2019 at 21:04:52 UTC, H. S. Teoh wrote:
> Treating string (or any other) literals as non-tokens makes no sense because they are not symmetric with non-string (or other) tokens, e.g., D tokens allow arbitrary whitespace between them, yet you cannot arbitrarily insert whitespace into a string literal without changing its semantics.

Just change the syntax to q"delimiter .... retimiled" and I believe it will be context free... IIRC. So yeah, I agree. CFG is not the right argument. I never understood why people are so enamoured of them; parsers are far more powerful today than they used to be. The human should be the important factor when designing syntax, not the parser... Also, not sure if it is context free if you include comments... But I could be wrong, and again I don't think it should matter...
Dec 03 2019
On Tuesday, 3 December 2019 at 21:21:30 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 3 December 2019 at 21:04:52 UTC, H. S. Teoh wrote:
>> Treating string (or any other) literals as non-tokens makes no sense because they are not symmetric with non-string (or other) tokens, e.g., D tokens allow arbitrary whitespace between them, yet you cannot arbitrarily insert whitespace into a string literal without changing its semantics.
>
> Just change the syntax to q"delimiter .... retimiled" and I believe it will be context free... IIRC.

That was a joke! Don't argue it...
Dec 03 2019
On Tuesday, 3 December 2019 at 18:34:22 UTC, H. S. Teoh wrote:So why the hate against heredoc strings?I don't think you use the same terminology as the DIP so I might misinterpret this, but I have nothing against here documents. I'm glad D provides plenty of useful string literals for including text in source code, it's just that some of them are rarely used and bump up the complexity class of D's lexical grammar. D has 6 types of string literals ("double quote" `back tick` r"r string" q{tokens} q"<brackets>" q”EOS ident EOS”) with 3 encoding options (char, wchar, dchar). There is a DIP for adding interpolated strings to D. People are mentioning how D keeps adding adding features and is on a road towards C++ complexity. There is precedent for removing barely used features (see e.g. octal, escape or hexstring literals on https://dlang.org/deprecate.html). And of course there are always users that remorse the removal of their favorite feature, but in the long run everyone benefits from a simpler language. As for your use case of code generation, I'm having trouble relating to it. I happened to write some code generation algorithms myself recently, and could do fine with q{} strings for large templates and regular "" or `` string for small token parts like "switch(". - Do you truly have 50,000 character string literals in your code base? - Can't you use bracket delimited strings instead, q"<like this?>" - If accidental early termination in huge string literals is a concern, even an identifier-delimited string isn't always safe. Can't you use an `import()` statement on an external text file? - If those 50,000 characters are code and you value readability of it, isn't it a problem that there is no syntax highlighting in a q"EOS EOS" string? - Can you maybe post an example of some of your q"EOS EOS" strings used for code generation?As for poor syntax highlighting as mentioned in the DIP, how is that even a problem with the language?! 
It's a strawman argument based on skewed data obtained from badly-written lexers that don't actually lex D code correctly. It should be the syntax highlighter that should be fixed, rather than deprecate an actually useful feature in the language.The thing is, these string literals simply can't be expressed in e.g. a PEG grammar. The D's grammar is one complexity class higher than needed just for this one relatively obscure string literal. Sure you can say "not our problem, those tooling authors just need to account for D's complexity", but I don't think that is useful for D's tooling ecosystem.Not to mention, the long list of projects at the end that will need to be updated, which includes dmd itself BTW, looks like strong evidence of good use of such string literalsdmd only uses them in the test-suite, same as libdparse. I can spend some more time in the DIP exploring how other packages use them however.
Dec 03 2019
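For reference, the six literal forms listed in the post above can be put side by side. This is a small sketch with made-up constant names; all forms except the token string spell the same three characters `a\b`.

```d
import std.stdio : writeln;

enum dq = "a\\b";     // double-quoted: escape sequences are processed
enum bt = `a\b`;      // back-tick: what you see is what you get
enum rs = r"a\b";     // r"...": also WYSIWYG
enum bd = q"<a\b>";   // bracket-delimited: ends at the matching >
enum ts = q{a + b};   // token string: contents must lex as D tokens
enum hd = q"EOS
a\b
EOS";                 // identifier-delimited: keeps the final newline

// The three encoding options are selected with a postfix.
enum narrow = "hi"c;  // string  (UTF-8)
enum wide   = "hi"w;  // wstring (UTF-16)
enum full   = "hi"d;  // dstring (UTF-32)

static assert(dq == bt && bt == rs && rs == bd);
static assert(hd == bd ~ "\n");
static assert(is(typeof(wide) == wstring) && is(typeof(full) == dstring));

void main()
{
    writeln(dq, " ", ts, " ", hd.length);
}
```

Only the last form needs the lexer to remember an arbitrary identifier in order to find the end of the literal, which is the property the DIP singles out.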
On Tuesday, 3 December 2019 at 21:34:26 UTC, Dennis wrote:
> The thing is, these string literals simply can't be expressed in e.g. a PEG grammar.

Can't you use a lexer with a PEG parser?
Dec 03 2019
On Tue, Dec 03, 2019 at 09:34:26PM +0000, Dennis via Digitalmars-d wrote:On Tuesday, 3 December 2019 at 18:34:22 UTC, H. S. Teoh wrote:[...]D has 6 types of string literals ("double quote" `back tick` r"r string" q{tokens} q"<brackets>" q”EOS ident EOS”) with 3 encoding options (char, wchar, dchar).Walter has admitted that having 3 encodings, with the corresponding 3 string types, was a "miss" in D's design, and that he should have just stuck with UTF-8. UTF-16 is occasionally useful for interfacing with Windows APIs, but that's pretty narrow and contained, and nobody uses UTF-32 strings in practice. In practice, I've not seen many examples of non-UTF-8 strings in D code. I admit D having 6 types of string literals is excessive, but as somebody has already said, even if something was a mistake in retrospect, doesn't necessarily mean that removing it isn't also a mistake. Because now you have the weight of existing code weighing against removing it. And just for a bit more perspective, Python also has heredoc syntax, so does Perl, PHP, bash, and probably many others. If heredocs were really such a bad idea, why are people putting them into so many languages, over and over again? Perhaps, just perhaps, there are use cases for them that this DIP has overlooked / underrepresented? I don't hear people clamoring for removing heredocs from Python, for example, so I'm really having a hard time understanding why we're having this debacle right now.There is a DIP for adding interpolated strings to D.That DIP seems dead in the water though. The author has vanished and nobody has taken up the reins.People are mentioning how D keeps adding adding features and is on a road towards C++ complexity. There is precedent for removing barely used features (see e.g. octal, escape or hexstring literals on https://dlang.org/deprecate.html).Actually, I was a bit disappointed with the removal of hexstring literals, but the issue is somewhat more complex. 
The problem with hexstring literals was that it was some kind of half-hearted attempt at supporting literal hexadecimal data, because it coerces the result into string rather than ubyte[]. The hexstring *syntax* was ideal for entering hex data, but then having the result coerced into string seemed to me like a backwards misfit. If it had produced a ubyte[] then there would have been much more reason to keep it in the language, since occasionally it's very useful to be able to enter blocks of binary data in hex. As to why the original design produced a string rather than a ubyte[], I can only speculate. Perhaps it was meant as a poor man's way of writing a Unicode string without a Unicode-aware keyboard / input method? Who knows. In any case, *that* use case is rendered completely moot by the \u.... and \U........ escape sequences in your regular double-quoted string. The ubyte[] use case is arguably implementable in a CTFE parser the same way octal literals can, and so hexstrings went the way of the dodo.And of course there are always users that remorse the removal of their favorite feature, but in the long run everyone benefits from a simpler language. As for your use case of code generation, I'm having trouble relating to it. I happened to write some code generation algorithms myself recently, and could do fine with q{} strings for large templates and regular "" or `` string for small token parts like "switch(".q{} works well for emitting *D code*. Not so well for non-D code.- Do you truly have 50,000 character string literals in your code base?No, but I do have a number of large multi-line string literals that simply look best / are most maintainable in heredoc format.- Can't you use bracket delimited strings instead, q"<like this?>"Heredoc syntax is better because the ending delimiter is obvious. 
When the string literal spans multiple lines, single-character terminating delimiters just aren't the best way to do it.- If accidental early termination in huge string literals is a concern, even an identifier-delimited string isn't always safe. Can't you use an `import()` statement on an external text file?Identifier-delimited string is safe because the literal is typed in directly as code, so you already know beforehand what words might appear or not appear in it, and you already know what will *never* appear in the string. It isn't as though I'm copy-n-pasting arbitrary text from arbitrary input files into my code just for fun. String imports require creating an extra file to contain the string, and requires running the compiler with -J + the right path(s), all of which are extra hurdles to jump through. It's the same thing with external unittests vs. unittest blocks that you can just write inline. It's *possible*, but inconvenient and liable to go out-of-sync as you modify the code.- If those 50,000 characters are code and you value readability of it, isn't it a problem that there is no syntax highlighting in a q"EOS EOS" string?As I said, I don't use a syntax highlighter. Also, any attempt to highlight is moot if the string contains code of a different language (see below for my use cases).- Can you maybe post an example of some of your q"EOS EOS" strings used for code generation?I feel a single example will not adequately convey my point. 
Here's a list of use cases I use heredocs for (in no particular order): 1) Generating HTML snippets 2) Generating PovRay scene description snippets 3) Generating D code snippets 4) Generating snippets of a DSL I use for generating geometric models 5) Generating boilerplate for input data to an external convex hull solver (has its own peculiar syntax) 6) Generating GLSL shader code snippets 7) Generating Java code snippets 8) Command line usage descriptions Some of this code is somewhat old but is actively used as infrastructure for my current projects, and having to go back to rewrite heredocs just because of some ivory tower ideal of "cleaning up useless literals in D" is rather distasteful to me, you understand, esp. since I don't even use syntax highlighting in the first place, so this is just pure work for zero benefit. If we were still in the early stages of D development, then sure, go ahead and nuke heredocs if you have very good reasons for it, but I'm not about to go rewriting code for (1) to (8) now, not when there's basically zero benefit in doing so.?! Can't you just use a custom lexer with your PEG grammar?As for poor syntax highlighting as mentioned in the DIP, how is that even a problem with the language?! It's a strawman argument based on skewed data obtained from badly-written lexers that don't actually lex D code correctly. It should be the syntax highlighter that should be fixed, rather than deprecate an actually useful feature in the language.The thing is, these string literals simply can't be expressed in e.g. a PEG grammar.The D's grammar is one complexity class higher than needed just for this one relatively obscure string literal. Sure you can say "not our problem, those tooling authors just need to account for D's complexity", but I don't think that is useful for D's tooling ecosystem.[...] Then isn't the solution simply to write a self-contained heredoc parsing function, put it in a dub package, and let everyone reuse it? 
Then nobody will have to write it for themselves again. Problem solved. (If it's even that complex to begin with. As I said, we already have 2 working examples of syntax highlighter code that work fine with heredocs. It's not as though D invented heredocs; they have been around since the early days of the Unix shell, and people have been writing parsing code for it for a long time. Its supposed "complexity" is really blown out of proportion here.) This whole debacle feels like heredocs are being singled out as a scapegoat in a misguided quest to "simplify the language". Like we're grasping at straws because we're unable to tackle the bigger issues, so here's a convenient simple target we can shoot and kill and feel good about ourselves that we're finally making progress. Talking about straining out the gnat and swallowing the camel. T -- "I'm not childish; I'm just in touch with the child within!" - RL
Dec 03 2019
On Wednesday, 4 December 2019 at 01:26:24 UTC, H. S. Teoh wrote:And just for a bit more perspective, Python also has heredoc syntax, so does Perl, PHP, bash, and probably many others.Python actually doesn't have HERE docs. When it's included in lists of "languages with HERE docs", it's just to show what a Python programmer would use in their stead. Please accept Ruby as a replacement example.
Dec 03 2019
On Wednesday, 4 December 2019 at 01:26:24 UTC, H. S. Teoh wrote:UTF-16 is <snip>VERY useful and helps make D on Windows feel first class, so it is easy to do things right. utf-32 doesn't matter, but "string"w is very, very nice for working with Windows, .net, java, etc. easily, efficiently, and correctly.That DIP seems dead in the water though. The author has vanished and nobody has taken up the reins.The string interpolation thing is cool, I wrote up my proposal, I'm just not likely to bother with the burden of DIP bureaucracy. Even javascript has some stuff that beats us now.As I said, I don't use a syntax highlighter. Also, any attempt to highlight is moot if the string contains code of a different language (see below for my use cases).And I use the heredoc strings BECAUSE of how well they can be highlighted - again my vim happens to treat q"html and q"sql and q"css and others specially knowing they are embedded. I could do that with something like css!" " too - a template instead and the type information could even be improved but still the heredoc is kinda cool for syntax highlighting. BTW if heredoc strings were to be removed.... tbh I can live with it. It bugs me that they must end at the beginning of a line. I wish it would let you indent it. Seriously bugs me and is a reason why I don't use them more. but still since they are there i use them.
Dec 03 2019
On Wednesday, 4 December 2019 at 01:26:24 UTC, H. S. Teoh wrote:
And just for a bit more perspective, Python also has heredoc syntax, so does Perl, PHP, bash, and probably many others. If heredocs were really such a bad idea, why are people putting them into so many languages, over and over again?
To me the opposite seems true. First of all:
Python does not have here-docs. It does however have triple-quoted strings which can be used similarly.
https://rosettacode.org/wiki/Here_document#Python
Then considering which notable languages have context-sensitive string literals:
1987: Perl
1989: Bash
1995: PHP
2001: D
2011: C++11
If you know any other examples, please tell. I don't think context-sensitive string literals were ever put in a notable language created after 2001. (C++ has the most recent addition, but parsing that is already so complex they have nothing to lose)
That DIP seems dead in the water though. The author has vanished and nobody has taken up the reins.
I was referring to Walter Bright's one: https://github.com/dlang/DIPs/pull/165
1) Generating HTML snippets 2) Generating PovRay scene description snippets 3) Generating D code snippets 4) Generating snippets of a DSL I use for generating geometric models 5) Generating boilerplate for input data to an external convex hull solver (has its own peculiar syntax) 6) Generating GLSL shader code snippets 7) Generating Java code snippets 8) Command line usage descriptions
I do believe for most of these you can use ``, q{} and q"<>" with little problems, but I understand that you prefer the q"EOS EOS" ones and would not want to rewrite your old code.
?! Can't you just use a custom lexer with your PEG grammar? Then isn't the solution simply to write a self-contained heredoc parsing function, put it in a dub package, and let everyone reuse it? Then nobody will have to write it for themselves again. Problem solved.
Of course you can make it work.
I'm not saying that context-sensitive string literals make or break all D lexers, it's just a little source of complexity that may not bear its weight. And a good couple of syntax highlighters support multiple different languages while being implemented in one, take for example this one written in Go: https://github.com/alecthomas/chroma/blob/master/lexers/d/d.go I wouldn't expect them to add dub package for D, cargo package for Rust, npm package for JavaScript etc.This whole debacle feels like heredocs are being singled out as a scapegoat in a misguided quest to "simplify the language". Like we're grasping at straws because we're unable to tackle the bigger issues, so here's a convenient simple target we can shoot and kill and feel good about ourselves that we're finally making progress. Talking about straining out the gnat and swallowing the camel.It seems to me D has this history of removing small features with a small problem: - Small feature: escape string literals Small problem: doesn't have much use - Small feature: octal string literals Small problem: can be confused for decimal literal, and can be made a library feature - Small feature: hexstring literals Small problem: can be better represented in a library function Now my proposed next one is: - Small feature: context-sensitive string literals Small problem: accidentally bumps the complexity class of D's lexical grammar. Now I understand that reviewers are debating whether it is a small feature ("I actually use these a lot") and whether the small problem isn't too small ("making D lexers still isn't hard"). That's what I like to see in the review, thanks especially to WebFreak and Adam D. Ruppe for their input on their VSCode and Vim highlighters, and thanks to you for your use cases. What I don't get is why this is called a "non-starter" by Andrei and a "debacle" / "misguided quest" by you. Is it such a ludicrous idea to deprecate this particular part of the language? 
I admit that I misjudged the amount of use, breakage and complexity this feature has before writing this DIP. If this trend continues then this DIP is dead; I'm not going to push this hard or anything. But I am at least still interested in Walter and Atila's opinion.
Dec 04 2019
On 04.12.19 12:10, Dennis wrote:Now my proposed next one is: - Small feature: context-sensitive string literals Small problem: accidentally bumps the complexity class of D's lexical grammar.A small fix for this small problem is to just say in the specification that heredoc identifiers may not exceed 1e100 characters. ;) Another fix could be to just go over the language specification and replace all wrongly applied CS terms by a short explanation of what is actually going on. (In practice, when Walter says D's grammar is context-free, what he means is that parsing does not depend on semantic analysis on a prefix of the code, a property that C++ has which implies context-sensitivity and is usually abbreviated this way, and Walter's aim was to contrast D to this.)
Dec 04 2019
On 12/4/2019 5:35 AM, Timon Gehr wrote:
On 04.12.19 12:10, Dennis wrote:
Now my proposed next one is: - Small feature: context-sensitive string literals Small problem: accidentally bumps the complexity class of D's lexical grammar.
A small fix for this small problem is to just say in the specification that heredoc identifiers may not exceed 1e100 characters. ;) Another fix could be to just go over the language specification and replace all wrongly applied CS terms by a short explanation of what is actually going on.
Another case of my lack of academic CS training showing. I would appreciate it if qualified people would indeed go through the D spec and correct misuse of the terms. I know Timon likes to excoriate my conflation of "assert" and "assume", which have precise CS definitions. I'm sure there's plenty more in the spec.
(In practice, when Walter says D's grammar is context-free, what he means is that parsing does not depend on semantic analysis on a prefix of the code, a property that C++ has which implies context-sensitivity and is usually abbreviated this way, and Walter's aim was to contrast D to this.)
That's right. I often express it in even simpler (but less precise) terms - a symbol table is not required to parse it. Yes, I know the pedant will point out that heredoc has a symbol table with exactly one symbol in it, but please, allow me to concede that in advance and spare us :-)
Dec 04 2019
On Wednesday, 4 December 2019 at 22:57:21 UTC, Walter Bright wrote:
Another case of my lack of academic CS training showing. I would appreciate it if qualified people would indeed go through the D spec and correct misuse of the terms.
I don't think a spec has to use a lot of CS terms; it is probably better to describe it in language that most users can understand. Like, the other day I got confused by the usage of the term "covariant" in https://dlang.org/spec/function.html It says stuff like "a pure function … is covariant with an impure function", "Nothrow functions are covariant with throwing ones.", "Safe functions are covariant with trusted or system functions." and "System functions are not covariant with trusted or safe functions." This doesn't tell me anything even if I happened to remember what the term means. My understanding is that covariant means that if T(A) is related to T'(A') then T<:T' and A<:A', whereas contravariant means that one of the subtyping relations points the other way. I cannot fix it either, since I don't know what was meant...
Dec 04 2019
On Wednesday, 4 December 2019 at 23:35:09 UTC, Ola Fosheim Grøstad wrote:Like, the other day I got confused by the usage of the term "covariant" inIn that context, if you replace "covariant with" with "can act as a substitute for" it would work pretty well.
Dec 04 2019
On Wednesday, 4 December 2019 at 23:52:54 UTC, Adam D. Ruppe wrote:
On Wednesday, 4 December 2019 at 23:35:09 UTC, Ola Fosheim Grøstad wrote:
Like, the other day I got confused by the usage of the term "covariant" in
In that context, if you replace "covariant with" with "can act as a substitute for" it would work pretty well.
That is much easier to understand, for sure. I think the best parts of the documentation are where examples are provided.
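Since examples were asked for, here is a minimal sketch of the "can act as a substitute for" reading. It is not taken from the spec; `pureFn`, `impureFn`, `callThrough`, `Base` and `Derived` are made-up names, and the sketch only covers function pointers and overrides.

```d
// Hypothetical functions, used only to illustrate attribute substitution.
int pureFn() pure nothrow @safe { return 1; }
int impureFn() { return 2; }

// A pointer to the more restricted function can stand in where a less
// restricted function pointer is expected.
static assert( is(typeof(&pureFn) : int function()));
// The reverse direction is rejected: an impure function cannot stand in
// for a pure one.
static assert(!is(typeof(&impureFn) : int function() pure));

// Accepts any function pointer with this signature, restricted or not.
int callThrough(int function() fp) { return fp(); }

class Base
{
    int value() { return 1; }
}

class Derived : Base
{
    // An override may add restrictions that the base method lacks.
    override int value() pure nothrow @safe { return 2; }
}
```

Read this way, "a pure function is covariant with an impure function" says that `&pureFn` is accepted by `callThrough`, and that `Derived.value` is a valid override of `Base.value`.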
Dec 04 2019
On Wednesday, 4 December 2019 at 11:10:45 UTC, Dennis wrote:I do believe for most of these you can use ``, q{} and q"<>" with little problems, but I understand that you prefer the q"EOS EOS" ones and would not want to rewrite your old code.The big (and only) advantage of HERE docs is that you so rarely have to think about them or revise them that this is not a concern. "Check and see if you've broken the string literal" is not a step that you go through every single time you have to touch the content of the string. The most annoying part of HERE docs for code presentation, that the ending delimiter has such strict requirements, is precisely what makes them not annoying at all for holding random snippets of HTML or whatever. You just don't get collisions, or they are very obvious. The reader's ease is repeated in the ease that tools have with them: they don't need a stack; they can just read lines and throw them away until they find a line that (classically) has some exact contents, or (in D) starts with some exact prefix. With only matching nested delimiter strings, accidental collisions will happen. Not often. But neither " or ])>} are infrequent characters to find randomly in a string, and the first time you have to change both ends of a q"( string to make it a q"[ string because you added a URL that ended in parentheses to some embedded HTML, you'll think: man, I should just take all these snippets and stuff them under __EOF__ , then read that statically, and stuff them into a map on module load. And *then* you'll think: wait, people hardly ever use __EOF__ in D, so someone's definitely going to come along and deprecate *that* code, too! The world isn't divided only between good practices and bad practices. 
Across from the Scylla of legacy-code-is-sacred languages that never remove anything, even obviously bad features that nobody likes (' as a module separator in Perl, or octal literals that start with 0), there's a Charybdis of code-is-always-bitrotting languages that jerk you around with pointless deprecations.
It seems to me D has this history of removing small features with a small problem: - Small feature: escape string literals Small problem: doesn't have much use
I was surprised when \e didn't work. So it was removed for such a reason.
- Small feature: octal string literals Small problem: can be confused for decimal literal, and can be made a library feature
This is a significant problem actually. The *only* reason languages have C-style octal literals is because they can't remove them anymore. It's not "octal literals" in general that are the problem; literals like 0o123 are octal literals that don't get confused with a nice decimal number, the way 0123 does.
- Small feature: hexstring literals Small problem: can be better represented in a library function
What these removals all have in common is that the post-removal experience is: you reach for the removed feature, you get an error, you find out what to do instead, and then there are no more problems for you. Yes, you're still moving towards Charybdis with stuff like this, but the point of the myth isn't "all movements in the direction of Charybdis are bad.", as those movements are still movements *away* from Scylla. Removing HERE docs, though, makes the language permanently more annoying to use for the task that would've benefited from them. To the point that, rather than just use the intended replacement, people might rather do something else entirely. Someone might personally not like the look of \033 vs. \e, or octal!123 vs. 0123, but the replacement doesn't make them work any harder.
It's not a huge problem, but it's a difference between this small deprecation and the previous ones.
Now I understand that reviewers are debating whether it is a small feature ("I actually use these a lot") and whether the small problem isn't too small ("making D lexers still isn't hard"). That's what I like to see in the review, thanks especially to WebFreak and Adam D. Ruppe for their input on their VSCode and Vim highlighters, and thanks to you for your use cases. What I don't get is why this is called a "non-starter" by Andrei and a "debacle" / "misguided quest" by you. Is it such a ludicrous idea to deprecate this particular part of the language?
1. It's *because* the proposed change isn't that bad that it's getting the responses it's getting, rather than complaints that the proposed change is very bad and that HERE docs are irreplaceable treasures. It wasn't until my post just now that anyone took the time to say that HERE docs have any unique advantages at all.
2. Andrei's response isn't just "non-starter" but also "HERE documents have no impact on the language grammar."
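The remark that tools "can just read lines and throw them away" until a terminator shows up can be sketched roughly as below. This is a simplification, not the exact rule dmd applies when lexing, and `findHeredocEnd` with its parameters is a hypothetical helper rather than anything from Phobos.

```d
import std.algorithm.searching : startsWith;

/// Returns the index of the first line that closes a heredoc opened with
/// q"<ident>, or -1 if no line does. No nesting stack is needed: each
/// line is compared against one fixed prefix and then forgotten.
ptrdiff_t findHeredocEnd(const(string)[] lines, string ident)
{
    // Assumed terminator shape: the identifier at the very start of a
    // line, immediately followed by the closing double quote.
    const terminator = ident ~ "\"";
    foreach (i, line; lines)
    {
        if (line.startsWith(terminator))
            return cast(ptrdiff_t) i;
    }
    return -1;
}
```

An indented `EOS"` is treated as ordinary content here, which matches the complaint elsewhere in the thread that the closing delimiter has to sit at the beginning of a line.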
Dec 04 2019
On Tuesday, 3 December 2019 at 12:38:29 UTC, Andrei Alexandrescu wrote:
This DIP is a non-starter. Here documents are easily and effectively handled during lexing and have no impact on the language grammar.
In a compiler. Here's an implementation for bash heredoc strings, say something nice about it:
---
class HereDocCls { // Class to manage HERE document elements
public:
    int State;           // 0: '<<' encountered
                         // 1: collect the delimiter
                         // 2: here doc text (lines after the delimiter)
    int Quote;           // the char after '<<'
    bool Quoted;         // true if Quote in ('\'','"','`')
    bool Indent;         // indented delimiter (for <<-)
    int DelimiterLength; // strlen(Delimiter)
    char *Delimiter;     // the Delimiter, 256: sizeof PL_tokenbuf
    HereDocCls() {
        State = 0;
        Quote = 0;
        Quoted = false;
        Indent = 0;
        DelimiterLength = 0;
        Delimiter = new char[HERE_DELIM_MAX];
        Delimiter[0] = '\0';
    }
    void Append(int ch) {
        Delimiter[DelimiterLength++] = static_cast<char>(ch);
        Delimiter[DelimiterLength] = '\0';
    }
    ~HereDocCls() {
        delete []Delimiter;
    }
};
HereDocCls HereDoc;
---
Dec 04 2019
On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.mdYES I'm generally in favor of removing things. I think I've used token strings at times, but not the other variety discussed in the DIP and that I didn't know of. It will break some DUB package, and that's OK since we have SemVer.
Dec 03 2019
On Tuesday, December 3, 2019 2:03:44 AM MST Mike Parker via Digitalmars-d wrote:
This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md
All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on December 17, or when I make a post declaring it complete. At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment. Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines: https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md
*Please stay on topic!* Thanks in advance to all who participate.
There are definitely people who use token strings in their code when writing string mixins, because it makes it so that the code in the strings actually gets syntax highlighting like normal code does instead of being displayed as a string. I expect that a number of people would be quite unhappy to not be able to do that anymore. Personally, I never use token strings, and I'm not sure that I'd even know about them if I hadn't worked on a D lexer several years ago. I also prefer that strings look like strings even if they contain code, but I don't care enough about that to try to get the feature removed, and I'm not sure that I care much whether the DIP is accepted or not. However, there's no question that some people think that they're very valuable when writing string mixins.
- Jonathan M Davis
Dec 03 2019
On Tuesday, 3 December 2019 at 15:05:14 UTC, Jonathan M Davis wrote:There are definitely people who use token strings in their code when writing string mixinsToken strings are q{ }, this is about the delimited strings like q"xxx .... xxx" and q"( lll )";
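To keep the literal kinds apart, here is a small sketch of what each one evaluates to. The variable names are made up, and which forms count as "context-sensitive" follows my reading of the DIP rather than anything stated in this post.

```d
// Token string: the contents must lex as D tokens. Not targeted by the DIP.
enum tokenStr = q{int x = 1;};
static assert(tokenStr == "int x = 1;");

// Nesting-delimiter string: any text, as long as the brackets balance.
enum nestedStr = q"(call(arg))";
static assert(nestedStr == "call(arg)");

// Character-delimited string: ends at the next occurrence of the delimiter.
enum charStr = q"/foo]/";
static assert(charStr == "foo]");

// Identifier-delimited (heredoc) string: ends at a line starting with the
// identifier. The newline before the closing identifier is kept.
enum heredocStr = q"EOS
<p>not D tokens</p>
EOS";
static assert(heredocStr == "<p>not D tokens</p>\n");
```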
Dec 03 2019
On Tuesday, 3 December 2019 at 15:05:14 UTC, Jonathan M Davis wrote:
There are definitely people who use token strings in their code when writing string mixins, because it makes it so that the code in the strings actually gets syntax highlighting like normal code does instead of being displayed as a string.
I don't propose deprecating token strings, only the identifier delimited ones, which get highlighted as strings.
```
string s = q{
this is fine
};

string t = q"EOS
this is not fine
EOS";
```
Dec 03 2019
On Tuesday, December 3, 2019 8:09:19 AM MST Dennis via Digitalmars-d wrote:
On Tuesday, 3 December 2019 at 15:05:14 UTC, Jonathan M Davis wrote:
There are definitely people who use token strings in their code when writing string mixins, because it makes it so that the code in the strings actually gets syntax highlighting like normal code does instead of being displayed as a string.
I don't propose deprecating token strings, only the identifier delimited ones, which get highlighted as strings.
```
string s = q{
this is fine
};

string t = q"EOS
this is not fine
EOS";
```
Ah. Clearly, I glanced over it all too quickly. I confess that that particular type of string literal seems useless to me. I don't think that I've ever seen anyone use them, and I'd be even less interested in using them than token strings. I don't feel particularly strongly about whether we remove them from the language, but if we were talking about adding them, I'd certainly be against it.
- Jonathan M Davis
Dec 03 2019
On Tue, Dec 03, 2019 at 03:09:19PM +0000, Dennis via Digitalmars-d wrote: [...]I don't propose deprecating token strings, only the identifier delimited ones, which get highlighted as strings. ``` string s = q{ this is fine }; string t = q"EOS this is not fine EOS"; ```The problem is that token strings require the contents to be *D tokens*. So if I need to emit snippets of another language, I'm out of luck, and have to resort to quoted strings and Leaning Toothpick Syndrome. I oppose this DIP. 1) It puts undue focus on a marginal, non-intrusive language feature and makes it seem as if it's a primary cause of tooling problems (it does add some complexity, no doubt, but let's not make mountains out of molehills here); 2) It places the blame of the syntax highlighting issue at the wrong place: syntax highlighters should be fixed, not the other way round. 3) It does not adequately strive to understand why heredoc syntax was introduced in the first place, where/when it might be useful, and how to mitigate the problems heredoc syntax solves if we were to remove it; 4) It breaks a pretty long list of existing D projects, yet does not provide strong enough benefits to justify this breakage (doubly so for me, because I don't use syntax highlighters to begin with, so for me this is all loss and no gain); 5) The breakage does not unquestionably improve code, in fact, I can already see many cases for which it makes code *less* readable; 6) The amount of work it will take to rewrite heredoc literals far outweighs any small benefits this DIP might bring (and in my case, it's work for *no* benefit). T -- Claiming that your operating system is the best in the world because more people use it is like saying McDonalds makes the best food in the world. -- Carl B. Constantine
Dec 03 2019
On Tuesday, 3 December 2019 at 21:20:57 UTC, H. S. Teoh wrote:
The problem is that token strings require the contents to be *D tokens*. So if I need to emit snippets of another language, I'm out of luck, and have to resort to quoted strings and Leaning Toothpick Syndrome.
Bracket-delimited strings (q"[text]", allowing <>, [], (), and {} as delimiters) are still allowed and do not need to contain valid tokens.
Dec 03 2019
On Tue, Dec 03, 2019 at 09:35:42PM +0000, Elronnd via Digitalmars-d wrote:
On Tuesday, 3 December 2019 at 21:20:57 UTC, H. S. Teoh wrote:
The problem is that token strings require the contents to be *D tokens*. So if I need to emit snippets of another language, I'm out of luck, and have to resort to quoted strings and Leaning Toothpick Syndrome.
Bracket-delimited strings (q"[text]", allowing <>, [], (), and {} as delimiters) are still allowed and do not need to contain valid tokens.
They still need to nest properly, though. Generating BF snippets, for example, wouldn't work.
T -- English has the lovely word "defenestrate", meaning "to execute by throwing someone out a window", or more recently "to remove Windows from a computer and replace it with something useful". :-) -- John Cowan
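The nesting requirement can be made concrete with a small check. This is only a sketch of the condition a snippet has to satisfy before it fits inside a bracket-delimited literal; `nestsProperly` is a hypothetical helper, not a library function.

```d
/// Returns true if every `open` in `text` is closed by a later `close`
/// and no `close` appears before its `open`. A bracket-delimited literal
/// using that pair needs this to hold for its contents.
bool nestsProperly(string text, char open, char close)
{
    int depth = 0;
    foreach (char c; text)
    {
        if (c == open)
            ++depth;
        else if (c == close)
        {
            if (depth == 0)
                return false; // closer with nothing open
            --depth;
        }
    }
    return depth == 0; // leftover openers also break the literal
}
```

A fragment such as `if (x) {` passes for `(` and `)` but fails for `{` and `}`, so it could go into a `q"(...)"` literal but not a `q"{...}"` one. A lone `>` fails for `<` and `>`.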
Dec 03 2019
On Tuesday, 3 December 2019 at 21:20:57 UTC, H. S. Teoh wrote:2) It places the blame of the syntax highlighting issue at the wrong place: syntax highlighters should be fixed, not the other way round.It requires efficient memory management. Wait, it requires memory management? Also the usual tradeoff between space, complexity and time, maybe hashtable and CSPRNG. Usually delimited strings are simply not implemented as the only reasonable option, but then people here say that such highlighter "doesn't support D". So, it's not really a problem for highlighter, delimited strings simply don't exist there, and can opt in by choosing a different highlighter.
Dec 04 2019
On Tuesday, 3 December 2019 at 15:05:14 UTC, Jonathan M Davis wrote:[...] There are definitely people who use token strings in their code when writing string mixins, because it makes it so that the code in the strings actually gets syntax highlighting like normal code does instead of being displayed as a string. I expect that a number of people would be quite unhappy to not be able to do that anymore.This DIP explicitly doesn't deprecate token strings, only identifier-delimited strings and character-delimited strings.
Dec 03 2019
On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on December 17, or when I make a post declaring it complete. At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment. Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines: https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md *Please stay on topic!* Thanks in advance to all who participate.1) Are there any examples of strings that don't have an in-source code workaround if this dip is accepted? 2) the link in rosetta code shows a lot of the languages with funky parsing. So I'm not sure that proves anything. 3) how much less complex does the parser actually get? Is it trivial?
Dec 03 2019
On Tuesday, 3 December 2019 at 23:13:16 UTC, aliak wrote:1) Are there any examples of strings that don't have an in-source code workaround if this dip is accepted?Considering escape sequences such as "\x0B" and string concatenation with ~, any string literal can still be expressed. The generic, most non-intrusive transformation I can think of would be: Given an identifier delimited string, check which of < ( { [ has the least amount of mismatched brackets. Then convert the string literal to a bracket delimited string with all unmatched brackets concatenated in: ``` q"EOS ((["`[<< { ((["`[<< EOS" // only one mismatching {, so it becomes q"{((["`[<< }" ~ "{" ~ q"{ ((["`[<<}" ``` (This is a worst case example, in practice I expect there to be not so many mismatched brackets and quotes/back ticks in a string literal)3) how much less complex does the parser actually get? Is it trivial?In dmd not so much, it would just make this function a bit smaller: https://github.com/dlang/dmd/blob/073b6861b1d1a9859a90e25c8d7f079b54280aca/src/dmd/lexer.d#L1477 For implementations of a D lexer in lexer/parser generators (e.g. http://dinosaur.compilertools.net/lex/index.html), it means only needing context-free constructs to express everything.
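The first step of that transformation, choosing the bracket pair with the fewest mismatches, might look like the sketch below. The function names are invented for illustration, and the later step of splitting the literal and concatenating the unmatched brackets back in is not shown.

```d
/// Counts brackets of one kind in `text` that have no partner.
size_t unmatchedCount(string text, char open, char close)
{
    size_t depth = 0;     // openers still waiting for a closer
    size_t unmatched = 0; // closers seen while nothing was open
    foreach (char c; text)
    {
        if (c == open)
            ++depth;
        else if (c == close)
        {
            if (depth > 0)
                --depth;
            else
                ++unmatched;
        }
    }
    return unmatched + depth; // leftover openers are unmatched too
}

/// Returns the opening bracket whose pair has the fewest unmatched
/// occurrences in `text`. Ties go to the earlier entry in "<({[".
char leastMismatchedOpener(string text)
{
    static immutable string openers = "<({[";
    static immutable string closers = ">)}]";
    char best = openers[0];
    size_t bestCount = size_t.max;
    foreach (i, open; openers)
    {
        const count = unmatchedCount(text, open, closers[i]);
        if (count < bestCount)
        {
            bestCount = count;
            best = open;
        }
    }
    return best;
}
```

For the worst-case text in the post, every pair except `{}` has four unmatched openers and `{}` has one, so the sketch picks `{`, in line with the "only one mismatching {" remark.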
Dec 03 2019
On Tuesday, 3 December 2019 at 09:03:44 UTC, Mike Parker wrote:This is the feedback thread for the first round of Community Review for DIP 1026, "Deprecate Context-Sensitive String Literals": [...]We use this feature. We can fix the code, but the DIP doesn't state a convincing reason to remove this from the language.
Dec 03 2019
There are a lot of DIPs in the pipeline, and this looks highly unlikely to get traction, based on the comments. I suggest withdrawing it.
Dec 04 2019