digitalmars.D - The state of string interpolation...one year later
- Jonathan Marler (168/168) Mar 16 2019 It's been about a year since I submitted an implementation for
- Nicholas Wilson (6/15) Mar 17 2019 Dconf! I've added this to the list of topics for the foundation
- Jacob Carlborg (6/9) Mar 17 2019 We can. But just as there's no interest in adding language support for
- Nick Treleaven (50/92) Mar 17 2019 Maybe he wrote that on the forum. In a text editor syntax
- Atila Neves (3/11) Mar 19 2019 Yep. At the time I pointed out I wouldn't have made the same
- Jonathan Marler (7/20) Mar 19 2019 For me, even with syntax highlighting it's still hard to spot the
- ag0aep6g (34/61) Mar 17 2019 Either way, you likely got yourself an HTML injection.
- kinke (7/22) Mar 17 2019 As long as the embedded expressions are always separated by a
- ag0aep6g (20/24) Mar 18 2019 I see. Like this:
- Olivier FAURE (11/15) Mar 18 2019 One way to deal with these cases would be to have an alternate
- Rubn (5/176) Mar 17 2019 Seems you've done everything but write a DIP, I don't really see
- Jonathan Marler (46/50) Mar 17 2019 I'll have to disagree with you here. I'm not sure if you've
- Nicholas Wilson (13/58) Mar 17 2019 That does smack of my experience with DIP1016. I've added DIP1011
- Kagamin (7/22) Mar 17 2019 There's IDE support for format string validation for C and C#.
- Jonathan Marler (22/45) Mar 17 2019 Right, which is why I mentioned we must consider the price
- Kagamin (4/14) Mar 18 2019 It's hard to parse even for a library? That sounds bad. Is it
- Steven Schveighoffer (12/30) Mar 18 2019 Yeah, you could write it I think just like the language solution:
- Kagamin (9/13) Mar 19 2019 Hmm... the pr seemingly doesn't support full nested language: the
- Jonathan Marler (11/24) Mar 19 2019 Your absolutely right here. Thanks for taking time to look at the
- Jonathan Marler (22/37) Mar 18 2019 Yeah, like I said you wouldn't write that in real code. It's an
- Olivier FAURE (3/4) Mar 18 2019 By the way, quick question, how does the PR handle this case?
- Steven Schveighoffer (4/10) Mar 18 2019 Hm... it would be good for the feature to handle this (and treat it as
- Meta (7/11) Mar 18 2019 Also, recursive use of interpolated strings. Does this work or
- H. S. Teoh (7/22) Mar 18 2019 [...]
- Patrick Schluter (3/25) Mar 18 2019 \$ would be consistent with normal C and D quoting rules. $$ not
It's been about a year since I submitted an implementation for interpolated strings:

https://github.com/dlang/dmd/pull/7988

In that time, various people have been popping up asking about it. There has been a lot of discussion around this feature on the forums, and the place we left off was with Andrei saying that we should:

* continue to explore alternative library solutions
* focus on improving existing features instead of adding new features

At the request of Andrei, I implemented a small library solution as well (https://github.com/dlang/phobos/pull/6339) but the leadership never followed up on it. And that's ok, they only have so much time and they need to prioritize how they feel is best.

With that, I read through some discussion and thought it could be helpful to summarize my thoughts on the matter, since people continue to ask questions about it. In my mind, there's really only one reason for string interpolation...

Better Syntax

In many ways syntax isn't that important. There's a lot of subjectivity around it, but sometimes a change can make it objectively better. Any time you can make syntax objectively better, you're making code easier to read, write and maintain. Better syntax means it's easier to write "correct code" and harder to write "incorrect code".

I recall Atila arguing that the syntax without string interpolation wasn't that bad. Then he provided this example (https://forum.dlang.org/post/jahvdekidbugougmyhgb@forum.dlang.org):

    text("a is", a, ", b is ", b, " and the sum is: ", a + b)

Ironically, his example had a mistake, but it was hard to notice. Look at the same example with string interpolation:

    text(i"a is$a, b is $b and the sum is: $(a + b)")

You could say that "better syntax" is one of the main reasons D exists. It's the main, if not the only, reason for a lot of features like UFCS and foreach.

Andrei's biggest critique is that we should first try to implement this in a library...and he's completely right to ask that question. The problem is that over the years, no one's been able to achieve a library solution that results in a nice syntax. Having a poor syntax is a bad sign for a feature that only exists to improve syntax. However, even if we could make the syntax better, there are still a handful of reasons why a library solution can't measure up to language support. Including the "poor syntax", the following are the 5 CONS I see with a library solution:

CON 1. The syntax is "not nice".

This defeats the entire point of interpolated strings. Saying that interpolated strings aren't popular because people are not using a library for them is like saying Elvis isn't popular because people don't like Elvis impersonators. People not liking a poor imitation of something doesn't say anything about how they feel about the genuine article.

CON 2. Real error messages.

What's one of the most annoying parts of mixins? Error messages. When you get an error in a mixin, you don't get a line of code to go fix, you get an "imaginary" line that doesn't exist. With library solutions, you can't point syntax errors inside interpolated strings to source locations. That information is not available to library code. When you get a string, you don't know where each character inside that string originated from; only the compiler knows that.

CON 3. Performance.

No matter what we do, no library solution will ever be as fast as a language solution. The reason why performance is especially important here is that bad performance means developers will have to choose between better syntax and faster compilation. We already see this today with templates and mixins. With a language implementation, developers can have both.

CON 4. IDE/Editor Support.

A library solution won't be able to have IDE/Editor support for syntax highlighting, auto-complete, etc. With language support, when the editor sees an interpolated string, it will be able to highlight the code inside it just like normal code.

CON 5. Full solution requires full copy of lexer/parser.

One big problem with a library solution is that it will be hard for a library to delimit interpolated expressions. For full support, it will need to have a full implementation of the compiler's lexer/parser. Without that, it will have limitations on the kind of code that can be inside an interpolated string. Take the following (contrived) example:

    foreach (i; 0 .. 10)
    {
        mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
    }

The library solution needs to parse that interpolated string, but needs to know that the right paren at `")"` is actually just a string literal inside the expression and not a right paren to delimit the end of the expression. This is a contrived example, but if you have anything less than a full lexer/parser then developers are going to have a hard time knowing what can and can't go inside an interpolated expression. By having interpolated strings as a part of the language, the implementation has full access to the lexer/parser, so it doesn't need to force any limitation on the syntax available inside interpolated string expressions.

---

Now I'm not saying that the CONS of the library solution justify the addition of interpolated strings to the language. I focused on that because that is Andrei's main sticking point. Even if everyone agrees that library solutions don't work (and we can't enhance the language to make them work), we still need to show that the feature is going to be popular/useful enough to justify a new type of string literal. The usefulness of the feature needs to outweigh the work to support it. The more features we add to D, the more developers need to learn to understand it.

That being said, I consider the implementation and complexity it adds to be quite minimal (see the PR for more details). As for the usefulness, I can say personally I would use this feature to replace almost all my usages of writefln/format and writeln, which would be a big shift for my projects.

Instead of:

    writefln("My name is %s and my age is %s and my favorite hex is %s", name, age, favnum);

I will be writing:

    writeln(i"My name is $name and my age is $age and my favorite hex $(favnum.formatHex)");

When I generate code, instead of:

    return
        returnType ~ ` ` ~ name ~ `(` ~ type ~ ` left, ` ~ type ~ ` right)
        {
            return cast(` ~ returnType ~ `)(left ` ~ op ~ ` right);
        }
    `;

It will be:

    return text(iq{
        $returnType $name($type left, $type right)
        {
            return cast($returnType)(left $op right);
        }
    });

When I generate HTML documents in my cgi library, instead of:

    writeln(`<html><body>
    <title>`, title, `</title>
    <name>`, name, `</name><age>`, age, `</age>
    <a href="`, link, `">`, linkName, `</a>
    </body></html>
    `);

or even:

    writefln(`<html><body>
    <title>%s</title>
    <name>%s</name><age>%s</age>
    <a href="%s">%s</a>
    </body></html>
    `, title, name, age, link, linkName);

It will be:

    writeln(i`<html><body>
    <title>$title</title>
    <name>$name</name><age>$age</age>
    <a href="$link">$linkName</a>
    </body></html>
    `);

When I first saw interpolated strings I didn't immediately realize the benefit of them. Using them eliminates the problem of keeping format strings in sync with arguments. It also avoids the "noise problem" you get when you alternate between string literals and expressions inside a function call, i.e. `writeln("a is", a, ", b is ", b)`. That pretty much sums up the benefits in my mind.

So what's next?

I'm curious where leadership currently stands. What are their thoughts on the library solutions that have been presented? What do they think of the 5 CONS I've presented that all library solutions will have? What's their opinion on the usefulness of the feature?

For me personally, I am surprised at the amount of interest this feature continues to garner. I think the feature is a net positive for D, but then again I don't think it's a "make or break" feature. Just a "nice addition".

Anyway, those are my thoughts. Sorry for the long post. I hope it's helpful and ultimately makes D better.
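
To make CON 5 concrete, here is a minimal sketch of the kind of shortcut a library might take: finding the end of a `$(...)` expression by counting parentheses only. The function `naiveExpr` is a hypothetical helper written for illustration; it is not taken from either PR. It handles nested calls, but it is fooled by the `")"` string literal from the contrived example above.

```d
/// Hypothetical naive scanner (illustration only): returns the text between
/// the "$(" at index `start` and the ")" that balances it, found purely by
/// counting parentheses. String literals are NOT understood.
string naiveExpr(string s, size_t start)
{
    assert(s[start .. start + 2] == "$(");
    size_t depth = 0;
    foreach (i; start + 1 .. s.length)
    {
        if (s[i] == '(')
            depth++;
        else if (s[i] == ')')
        {
            depth--;
            if (depth == 0)
                return s[start + 2 .. i];
        }
    }
    return null; // no balancing ")" found
}

unittest
{
    // Simple and nested expressions are delimited correctly.
    assert(naiveExpr(`$(a + b) done`, 0) == "a + b");
    assert(naiveExpr(`$(f(x) * 2)`, 0) == "f(x) * 2");

    // The ")" inside the string literal is mistaken for the end of the
    // expression, so the extracted "expression" is truncated.
    assert(naiveExpr(`$( i ~ ")" ) entry $(array[i])`, 0) == ` i ~ "`);
}
```

Fixing this properly means tracking every D string literal form, comments, and so on, which is where the "full copy of the lexer" requirement comes from.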
Mar 16 2019
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> CON 2. Real error messages. What's one of the most annoying parts of mixins? Error messages. When you get an error in a mixin, you don't get a line of code to go fix, you get an "imaginary" line that doesn't exist.

Just a note that you can make it exist with -mixin=filename

> So what's next?

Dconf! I've added this to the list of topics for the foundation meeting where we can get a lot of these decisions made.

> For me personally, I am surprised at the amount of interest this feature continues to garner. I think the feature is a net positive for D, but then again I don't think it's a "make or break" feature. Just a "nice addition".

Indeed, I hadn't realised how much nicer that would make dealing with formatting mixins, thanks for that.
Mar 17 2019
On 2019-03-17 07:01, Jonathan Marler wrote:
> Even if everyone agrees that library solutions don't work (and we can't enhance the language to make them work)

We can. But just as there's no interest in adding language support for string interpolation, there's no interest in adding support for those features that would help to make a library solution satisfactory.

--
/Jacob Carlborg
Mar 17 2019
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> text("a is", a, ", b is ", b, " and the sum is: ", a + b)
>
> Ironically, his example had a mistake, but it was hard to notice.

Maybe he wrote that on the forum. In a text editor, syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.

I think interpolated strings as a language feature are justifiable, but a mixin solution can work with some small, more general changes to the language, and these changes are both simpler to support and more useful and flexible.

> CON 1. The syntax is "not nice". This defeats the entire point of interpolated strings.

What about `$ident!targs(args)` being sugar for `mixin(ident!targs(args))`? Then we can have a function template in object.d invoked as:

    $iStr!"v = $v"

which becomes:

    mixin(iStr!"v = $v")

> With library solutions, you can't point syntax errors inside interpolated strings to source locations. That information is not available to the language. When you get a string, you don't know where each character inside that string originated from, only the compiler knows that.

Yes, but the language could support a way to get the file and line from template alias arguments inside the template. So at least the starting line of the string literal could be known and given in an error message.

> CON 3. Performance. No matter what we do, any library solution will never be as fast as a language solution.

True, but what matters is whether a library solution is fast enough. If dmd isn't fast enough currently, then perhaps it will be after the CTFE rewrite.

> CON 4. IDE/Editor Support. A library solution won't be able to have IDE/Editor support for syntax highlighting, auto-complete, etc. When the editor sees an interpolated string, it will be able to highlight the code inside it just like normal code.

The editor has to be updated for interpolated syntax, so it could just as easily be updated to recognise invocation of $iStr if iStr was in object.d.

> The library solution needs to parse that interpolated string but needs to know that the right paren at `")"` is actually just a string literal inside the expression and not a right paren to delimit the end of the expression. This is a contrived example, but if you have anything less than a full lexer/parser then developers are going to have a hard time being able to know what can and can't go inside an interpolated expression.

Good point, but I think banning all string delimiters (backtick, single quote and double quote) inside interpolated expressions is a reasonable and easy-to-understand restriction. It's good to discourage people from putting complex expressions inside strings, and this also makes iStr editor highlighting easier to implement vs the unrestricted language solution (which could also be restricted).

> That being said, I consider the implementation and complexity it adds to be quite minimal (see the PR for more details). As for the usefulness, I can say personally I would use this feature to replace almost all my usages of writefln/format and writeln which would be a big shift for my projects. Instead of:
>
>     writefln("My name is %s and my age is %s and my favorite hex is %s", name, age, favnum);

Actually this should be `writefln!"My name is..."(name, ...)`. Formatting can be more efficient if the format string is known at compile-time.

> I will be writing:
>
>     writeln(i"My name is $name and my age is $age and my favorite hex $(favnum.formatHex)");

Does formatHex exist?

> return text(iq{ $returnType $name($type left, $type right) { return cast($returnType)(left $op right); } });

Here there are advantages to the $iStr solution vs your implementation. First, I would have two mixin functions: iSeq to do what you want, and iStr, which includes the call to std.conv.text:

1. With iStr the user doesn't need to import std.conv.text, even though converting to a string is a very common case. Having to import `text` explicitly would often make me avoid using the interpolated string feature and use existing string syntax instead. (A different language implementation could expose text as a property of the interpolated string though).

2. The iStr template gets the string literal passed at compile time, so the length is available for buffer pre-allocation without summing the string fragment lengths at runtime. (A different language implementation could expose the original string length as an enum, then text, writef and format can take advantage of it).
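
For readers who want to see what the `mixin(iStr!"v = $v")` form could look like today, here is a minimal sketch. It is an assumption-laden illustration, not the implementation from the Phobos PR: `buildTextCall` is a hypothetical helper, only plain `$identifier` placeholders are recognised (no `$(expr)`), and the literal text must not contain `"` or `\`. Unlike the proposal above, this sketch also relies on `text` being imported at the mixin site.

```d
import std.conv : text;

/// Hypothetical CTFE helper (illustration only): turns "v = $v" into the
/// code string `text("v = ", v, "")`.
string buildTextCall(string s)
{
    static bool isIdentChar(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    }

    string code = "text(";
    string literal;
    size_t i = 0;
    while (i < s.length)
    {
        if (s[i] == '$')
        {
            // Consume the identifier that follows the '$'.
            size_t end = i + 1;
            while (end < s.length && isIdentChar(s[end]))
                end++;
            // Emit the pending literal, then the identifier as an argument.
            code ~= `"` ~ literal ~ `", ` ~ s[i + 1 .. end] ~ ", ";
            literal = "";
            i = end;
        }
        else
        {
            literal ~= s[i];
            i++;
        }
    }
    return code ~ `"` ~ literal ~ `")`;
}

/// Evaluated at compile time, so the result can be passed to `mixin`.
enum iStr(string s) = buildTextCall(s);

unittest
{
    int v = 42;
    string name = "D";
    static assert(iStr!"v = $v" == `text("v = ", v, "")`);
    assert(mixin(iStr!"v = $v") == "v = 42");
    assert(mixin(iStr!"$name has v=$v!") == "D has v=42!");
}
```

Because the whole literal is a template argument, its length is known at compile time, which is the property point 2 above relies on.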
Mar 17 2019
On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>> text("a is", a, ", b is ", b, " and the sum is: ", a + b)
>>
>> Ironically, his example had a mistake, but it was hard to notice.
>
> Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.

Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.
Mar 19 2019
On Tuesday, 19 March 2019 at 10:11:27 UTC, Atila Neves wrote:
> On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
>> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>>> text("a is", a, ", b is ", b, " and the sum is: ", a + b)
>>>
>>> Ironically, his example had a mistake, but it was hard to notice.
>>
>> Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.
>
> Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.

For me, even with syntax highlighting it's still hard to spot the mistake. So it's good to get your perspective, which is that syntax highlighting would have prevented this mistake in your case.

Off topic, I enjoyed your talk on Emacs; I'm a long-time user myself.
Mar 19 2019
On 17.03.19 07:01, Jonathan Marler wrote:
> When I generate HTML documents in my cgi library, instead of:
>
>     writeln(`<html><body>
>     <title>`, title, `</title>
>     <name>`, name, `</name><age>`, age, `</age>
>     <a href="`, link, `">`, linkName, `</a>
>     </body></html>
>     `);
>
> or even:
>
>     writefln(`<html><body>
>     <title>%s</title>
>     <name>%s</name><age>%s</age>
>     <a href="%s">%s</a>
>     </body></html>
>     `, title, name, age, link, linkName);
>
> It will be:
>
>     writeln(i`<html><body>
>     <title>$title</title>
>     <name>$name</name><age>$age</age>
>     <a href="$link">$linkName</a>
>     </body></html>
>     `);

Either way, you likely got yourself an HTML injection.

That might be the crux of string interpolation: It looks nice in simple examples, but is it still nice when you need to encode your variables for the output? I think that should be a goal. We don't want to encourage writing bad code by making it more beautiful than correct code.

Unless I'm missing something (I've only skimmed your PRs), you don't have mechanisms to aid in this. So your example would look like this with encoding:

    writeln(i`<html><body>
    <title>$(title.toHTML)</title>
    <name>$(name.toHTML)</name><age>$(age.toHTML)</age>
    <a href="$(link.toHTML)">$(linkName.toHTML)</a>
    </body></html>
    `);

That might still be prettier than the alternative with a plain `writeln`, but the difference is less pronounced. And with `writefln` we can do something like this:

    void writeflnToHTML(S ...)(string f, S stuff)
    {
        writefln(f, tupleMap!toHTML(stuff).expand);
    }

    writeflnToHTML(`<html><body>
    <title>%s</title>
    <name>%s</name><age>%s</age>
    <a href="%s">%s</a>
    </body></html>
    `, title, name, age, link, linkName);

That's still not pretty at all, but we can't forget a `.toHTML` this way. (Though `tupleMap` isn't in Phobos and might be hard to get exactly right.)

Ideally, something like that would be possible with interpolated strings, too.
Mar 17 2019
I like the idea and find the tuple approach very elegant.

On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
>     void writeflnToHTML(S ...)(string f, S stuff)
>     {
>         writefln(f, tupleMap!toHTML(stuff).expand);
>     }
>
>     writeflnToHTML(`<html><body>
>     <title>%s</title>
>     <name>%s</name><age>%s</age>
>     <a href="%s">%s</a>
>     </body></html>
>     `, title, name, age, link, linkName);
>
> That's still not pretty at all, but we can't forget a `.toHTML` this way. (Though `tupleMap` isn't in phobos and might be hard to get exactly right.)
>
> Ideally, something like that would be possible with interpolated strings, too.

As long as the embedded expressions are always separated by a string literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => `"", a, "", b`), these embedded expressions can be identified (and appropriately transformed etc.) by an odd index.
Mar 17 2019
On 18.03.19 00:21, kinke wrote:
> As long as the embedded expressions are always separated by a string literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => `"", a, "", b`), these embedded expressions can be identified (and appropriately transformed etc.) by an odd index.

I see. Like this:

    void writelnToHTML(S ...)(S stuff)
    {
        static foreach (i, thing; stuff)
        {
            if (i % 2 == 0) write(thing);
            else write(thing.toHTML);
        }
        writeln;
    }

    writelnToHTML(i`<html><body>
    <title>$(title)</title>
    <name>$(name)</name><age>$(age)</age>
    <a href="$(link)">$(linkName)</a>
    </body></html>
    `);

That's pretty good. It's not quite foolproof; one might put arguments before/after the interpolated string, messing up the arrangement. But other than that it seems nice.
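
Since the `i"..."` literal is not available in released compilers, the odd-index idea can be tried out by writing the alternating tuple by hand. The sketch below is an illustration under assumptions: `escapeHTML` and `renderHTML` are hypothetical names (standing in for `toHTML` and `writelnToHTML` above), the escaping covers only four characters, and the result is returned as a string rather than written to stdout so it can be checked.

```d
import std.array : appender, replace;
import std.conv : to;

/// Hypothetical escaper (illustration only): covers just &, <, > and ".
string escapeHTML(string s)
{
    return s.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace(`"`, "&quot;");
}

/// Arguments at even positions are trusted literal fragments; arguments at
/// odd positions are embedded values and are always escaped.
string renderHTML(S...)(S stuff)
{
    auto result = appender!string();
    foreach (i, thing; stuff) // unrolled at compile time, `i` is a constant
    {
        static if (i % 2 == 0)
            result.put(thing);
        else
            result.put(escapeHTML(thing.to!string));
    }
    return result.data;
}

unittest
{
    string title = "<script>alert(1)</script>";
    int age = 42;
    // Hand-written equivalent of i`<title>$(title)</title><age>$(age)</age>`
    assert(renderHTML("<title>", title, "</title><age>", age, "</age>")
        == "<title>&lt;script&gt;alert(1)&lt;/script&gt;</title><age>42</age>");
}
```

The caveat from the post still applies: nothing stops a caller from passing an extra leading argument, which would shift every value into a "literal" slot.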
Mar 18 2019
On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
> Either way, you likely got yourself an HTML injection.
>
> That might be the crux of string interpolation: It looks nice in simple examples, but is it still nice when you need to encode your variables for the output?

One way to deal with these cases would be to have an alternate string interpolation syntax for format strings, e.g.:

    fi"SELECT $field FROM $table"

is lowered to

    "SELECT %s FROM %s", field, table

Other alternatives have been suggested (e.g. interpolation creating delegates that can be passed at compile time), but I think the above solution is the most KISS and elegant. It encourages robust design, where the only argument parsed is the first one and every other argument is sanitized by default.
Mar 18 2019
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> [...]

Seems you've done everything but write a DIP, I don't really see why this feature should be exempt from the process. Even if the process isn't the greatest, that isn't reason enough for it to circumvent it.
Mar 17 2019
On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
> Seems you've done everything but write a DIP, I don't really see why this feature should be exempt from the process. Even if the process isn't the greatest, that isn't reason enough for it to circumvent it.

I'll have to disagree with you here. I'm not sure if you've written a DIP, but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened). The proposal itself is pretty simple, but the ramifications of the change weren't so clear. An early discussion with the leadership about the points they had questions on would have been very helpful for knowing where to put effort into researching the proposal. However, that's not how the DIP process is written to work.

After Andrei's response I attempted to discuss their concerns, but everything was filtered through the DIP manager, Michael Parker, and they never responded to my comments and questions. We left off with them providing an example library implementation and asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself, as they had implemented different semantics than what the DIP was proposing, but then they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".

In my opinion the DIP process is broken. I don't want to introduce a potentially good feature for D into a system where I believe it will actually harm the chances for the feature rather than help them. If the process is fixed, however, I will gladly create a DIP and would look forward to really hashing out the feature and seeing how it could best be implemented. D has a big potential to make a splash with this feature by showcasing its meta-programming capabilities with a zero-overhead implementation of string interpolation that also happens to be the most powerful/flexible one out there. I've heard some pretty cool ideas from people on it that I hadn't thought of, and would love to work with the community on creating a robust, well-researched proposal, but I believe if I use the current DIP system as it exists to introduce it, it will actually make it more likely to fail than if I didn't write a DIP at all at this point.

Please understand, I don't shy away from good, robust work and research. I'm a highly motivated mathematician who loves optimizing and finding elegant solutions. That's what I like spending my time on. Researching language proposals is exactly the type of work I like to do. I spend a lot of time reading and researching other languages and the features they bring to the table. I would very much enjoy contributing to a DIP process that fosters collaboration and feedback that results in more consensus and communal understanding.
Mar 17 2019
On Sunday, 17 March 2019 at 16:50:13 UTC, Jonathan Marler wrote:
> On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
>> Seems you've done everything but write a DIP, I don't really see why this feature should be exempt from the process. Even if the process isn't the greatest, that isn't reason enough for it to circumvent it.
>
> I'll have to disagree with you here. I'm not sure if you've written a DIP but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened). The proposal itself is pretty simple, but the ramifications of the change weren't so clear. A discussion between the leadership and the points they had questions about would have been very helpful early on to know where to put effort into researching the proposal. However, that's not how the DIP process is written to work. After Andrei's response I attempted to discuss their concerns but everything was filtered through the DIP manager Michael Parker and they never responded to my comments and questions. We left off with them providing an example library implementation asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself as they had implemented different semantics than what the DIP was proposing, but then they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".

That does smack of my experience with DIP1016. I've added DIP1011 to the list of DIP related stuff for the dconf AGM[1].

> In my opinion the DIP process is broken. I don't want to introduce a potentially good feature for D into a system where I believe it will actually harm the chances for the feature rather than help them.
>
> If the process is fixed however, I will gladly create a DIP and would look forward to really hashing out the feature and seeing how it could best be implemented.

I hope to get the DIP process fixed at the AGM, but for making progress on the topic of interpolated strings it would really help if you have a draft (or steal/polish that one from the DIP PR queue) so that we have something concrete to discuss (i.e. get (part of) a community round done).

> I believe if I use the current DIP system as it exists to introduce it, it will actually make it more likely to fail than if I didn't write a DIP at all at this point. Please understand, I don't shy away from good, robust work and research. I'm a highly motivated mathematician, who loves optimizing and finding elegant solutions. That's what I like spending my time on. Researching language proposals is exactly the type of work I like to do. I spend a lot of time reading and researching other languages and the features they bring to the table. I would very much enjoy contributing to a DIP process that fosters collaboration and feedback that results in more consensus and communal understanding.

This really does highlight the problems with organisation. We really want to be getting all the best work you can provide, and if that is bottlenecked on the current DIP process then we definitely need to look at why it is and how to fix it.

[1]: http://dconf.org/2019/talks/agm.html
Mar 17 2019
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> CON 3. Performance. No matter what we do, any library solution will never be as fast as a language solution.

Must consider the price. Language bloat is not free.

> CON 4. IDE/Editor Support. A library solution won't be able to have IDE/Editor support for syntax highlighting, auto-complete, etc. When the editor sees an interpolated string, it will be able to highlight the code inside it just like normal code.

There's IDE support for format string validation for C and C#.

> CON 5. Full solution requires full copy of lexer/parser. One big problem with a library solution is that it will be hard for a library to delimit interpolated expressions. For full support, it will need to have a full implementation of the compiler's lexer/parser.

The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.

> foreach (i; 0 .. 10)
> {
>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
> }

That doesn't look nice, does it?
Mar 17 2019
On Sunday, 17 March 2019 at 17:05:46 UTC, Kagamin wrote:
> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>> CON 3. Performance. No matter what we do, any library solution will never be as fast as a language solution.
> Must consider the price. Language bloat is not free.

Right, which is why I mentioned we must consider the price further down. I know it's a long post though, I don't blame you for not reading it all :)

>> CON 4. IDE/Editor Support. A library solution won't be able to have IDE/Editor support for syntax highlighting, auto-complete, etc. When the editor sees an interpolated string, it will be able to highlight the code inside it just like normal code.
> There's IDE support for format string validation for C and C#.

Which works in some cases. It can't work in all cases unless language support is added. Still a CON for the library solution.

>> CON 5. Full solution requires full copy of lexer/parser. One big problem with a library solution is that it will be hard for a library to delimit interpolated expressions. For full support, it will need to have a full implementation of the compiler's lexer/parser.
> The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.

The full solution requires a full copy of the lexer/parser, but the syntax for interpolated strings is just allowing string literals to be prefixed with the letter 'i'. You've confused the syntax for interpolated strings with the complexity of arbitrary D expressions inside them. If you want to be able to write any code inside those expressions, then you need the full lexer/parser to parse it. A language solution already has access to this, but a library implementation will need to either copy the full implementation, or make compromises in what it can support.

>> foreach (i; 0 .. 10)
>> {
>>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
>> }
> That doesn't look nice, does it?

Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :) And it's a contrived example to demonstrate an example that would be hard for a library to parse. In reality you wouldn't write it that way (hence why I called it "contrived"). With full interpolation support you would write it like this:

i"$(i)) entry $(array[i])"
Mar 17 2019
On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
>>> foreach (i; 0 .. 10)
>>> {
>>>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
>>> }
>> That doesn't look nice, does it?
> Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)

I mean $( i ~ ")

> And it's a contrived example to demonstrate an example that would be hard for a library to parse.

It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?
Mar 18 2019
On 3/18/19 8:12 AM, Kagamin wrote:
> On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
>>>> foreach (i; 0 .. 10)
>>>> {
>>>>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
>>>> }
>>> That doesn't look nice, does it?
>> Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
> I mean $( i ~ ")

Yeah, you could write it I think just like the language solution:

mixin(interp(`$(i)) entry $(array[i])`));

You'd still have to deal with the stray parentheses as a string, but I'm sure there are other expressions inside the escapes that are more likely to be in the wild which would require a lexer/parser. It may not be a full one, though, we don't need to make AST out of it.

>> And it's a contrived example to demonstrate an example that would be hard for a library to parse.
> It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?

If you look at the PR that he created, it's super-simple. It uses the already existing parser/lexer in the front end. The point he's making is that we'd have to DUPLICATE that for a library solution.

-Steve
Mar 18 2019
On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer wrote:
> If you look at the PR that he created, it's super-simple. It uses the already existing parser/lexer in the front end.

Hmm... the PR seemingly doesn't support the full nested language: the string is lexed as a normal double-quoted string. This is different from, say, JavaScript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all their content.

> The point he's making is that we'd have to DUPLICATE that for a library solution.

If they supported the full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.
Mar 19 2019
On Tuesday, 19 March 2019 at 08:13:16 UTC, Kagamin wrote:
> On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer wrote:
>> If you look at the PR that he created, it's super-simple. It uses the already existing parser/lexer in the front end.
> Hmm... the pr seemingly doesn't support full nested language: the string is lexed as a normal double quoted string. This is different from, say, javascript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all their content.
>> The point he's making is that we'd have to DUPLICATE that for a library solution.
> If they supported full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.

You're absolutely right here. Thanks for taking time to look at the code by the way. The current implementation isn't finished. I just wanted to start out with something easy to explore the feature. So today the language solution has the limitation that parens inside the code need to be balanced. But that limitation can be fixed with a bit more work. I just need to combine the current two-pass interpolation logic into one pass, but that would require some extra changes in the parse code and at this stage I wanted to minimize turmoil. So it's a temporary compromise. The final solution wouldn't have this limitation.
Mar 19 2019
On Monday, 18 March 2019 at 12:12:12 UTC, Kagamin wrote:
> On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
>>>> foreach (i; 0 .. 10)
>>>> {
>>>>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
>>>> }
>>> That doesn't look nice, does it?
>> Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
> I mean $( i ~ ")

Yeah, like I said you wouldn't write that in real code. It's an example to demonstrate something that would be hard for a library to parse.

>> And it's a contrived example to demonstrate an example that would be hard for a library to parse.
> It's hard to parse even for a library? That sounds bad.

Yeah it is bad. Hence why I listed it as a CON of the library solution. But it's very easy if it's implemented in the compiler since it has access to the parser.

> Is it required to be hard to parse?

The problem is knowing when right-paren ')' characters are supposed to be a part of the code or when they are being used to end the code. For the parser this is easy to determine, but without it, you now have to go through all the tokens/grammar nodes to see where these right-paren characters can appear (i.e. string literals, function calls, templates, etc). You could implement a simple heuristic where you support as many right-paren characters in your code as there are left-paren characters, which will get you most of the way there, but now you've put an arbitrary limitation on the code that can appear inside these interpolated expressions.

The point is, a language solution doesn't have this problem. It doesn't need to put any limitations on the code, and it doesn't have to come up with a mechanism to know when paren characters are or are not a part of the code. This problem is non-existent for an implementation inside the compiler.
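
A minimal sketch of the paren-counting heuristic described above, for illustration only. The name `interp` is taken from the example in this thread, but the body below is an assumption and is not the code from the PR or from any existing library; the helper `quote` is likewise hypothetical. It turns `a $(expr) b` into the source text of a `text(...)` call and finds the end of each `$(...)` purely by counting parentheses, so it works while the parens are balanced and cuts the expression short as soon as a `)` appears inside a string literal.

```d
import std.conv : text;
import std.stdio : writeln;

/// Hypothetical helper: wraps literal text in double quotes, escaping
/// embedded quotes and backslashes so it is valid D source.
string quote(string lit)
{
    string r = "\"";
    foreach (char c; lit)
    {
        if (c == '"' || c == '\\')
            r ~= '\\';
        r ~= c;
    }
    return r ~ "\"";
}

/// Hypothetical library-side interpolation: returns the source of a
/// text(...) call, meant to be used as mixin(interp("...")).
/// The end of each $( ... ) is found by counting parens only; the
/// embedded D code is never lexed.
string interp(string s)
{
    string code = "text(";
    string lit; // literal text collected since the last expression
    size_t i = 0;
    while (i < s.length)
    {
        if (s[i] == '$' && i + 1 < s.length && s[i + 1] == '(')
        {
            size_t depth = 1;
            size_t j = i + 2;
            while (j < s.length && depth > 0)
            {
                if (s[j] == '(')
                    ++depth;
                else if (s[j] == ')')
                    --depth;
                ++j;
            }
            assert(depth == 0, "unterminated $( in interpolated string");
            // s[i + 2 .. j - 1] is whatever sits between "$(" and its ")"
            code ~= quote(lit) ~ ", " ~ s[i + 2 .. j - 1] ~ ", ";
            lit = null;
            i = j;
        }
        else
        {
            lit ~= s[i];
            ++i;
        }
    }
    return code ~ quote(lit) ~ ")";
}

// The heuristic breaks when a ')' sits inside a string literal: the
// expression is cut off at `name ~ "` and the generated code is garbage.
static assert(interp(`$(name ~ ")") x`) == `text("", name ~ ", "\") x")`);

void main()
{
    auto array = ["zero", "one", "two"];
    foreach (i; 0 .. 3)
    {
        // Balanced parens inside the expression are handled fine.
        writeln(mixin(interp("$(i + 1)) entry $(array[i])")));
    }
}
```

Handling the failing case correctly would mean recognising string literals, character literals, comments and so on inside the expression, which is the part of the lexer a library would have to reimplement.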
Mar 18 2019
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> https://github.com/dlang/dmd/pull/7988

By the way, quick question, how does the PR handle this case?

i"$a $b" ~ i$"c $d"
Mar 18 2019
On 3/18/19 5:07 AM, Olivier FAURE wrote:
> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>> https://github.com/dlang/dmd/pull/7988
> By the way, quick question, how does the PR handle this case?
>
> i"$a $b" ~ i$"c $d"

Hm... it would be good for the feature to handle this (and treat it as i"$a $b$c $d"). I'm assuming that i$"c is a typo and you meant i"$c.

-Steve
Mar 18 2019
On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>> https://github.com/dlang/dmd/pull/7988
> By the way, quick question, how does the PR handle this case?
>
> i"$a $b" ~ i$"c $d"

Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c":

int a = 2, b = 5;
enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");");
mixin(code);
Mar 18 2019
On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via Digitalmars-d wrote:
> On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
>> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>>> https://github.com/dlang/dmd/pull/7988
>> By the way, quick question, how does the PR handle this case?
>>
>> i"$a $b" ~ i$"c $d"
> Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c":
>
> int a = 2, b = 5;
> enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");");
> mixin(code);
[...]

I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`?

T

-- 
Tell me and I forget. Teach me and I remember. Involve me and I understand. -- Benjamin Franklin
Mar 18 2019
On Monday, 18 March 2019 at 16:48:59 UTC, H. S. Teoh wrote:
> On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via Digitalmars-d wrote:
>> On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
>>> On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
>>>> https://github.com/dlang/dmd/pull/7988
>>> By the way, quick question, how does the PR handle this case?
>>>
>>> i"$a $b" ~ i$"c $d"
>> Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c":
>>
>> int a = 2, b = 5;
>> enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");");
>> mixin(code);
> [...]
> I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`?

\$ would be consistent with normal C and D quoting rules. $$ not so much.
Mar 18 2019