
digitalmars.D - The state of string interpolation...one year later

reply Jonathan Marler <johnnymarler gmail.com> writes:
It's been about a year since I submitted an implementation for 
interpolated strings:

https://github.com/dlang/dmd/pull/7988

In that time, various people have been popping up asking about 
it. There has been a lot of discussion around this feature on the 
forums, and the place we left off was with Andrei saying that we 
should:

* continue to explore alternative library solutions
* focus on improving existing features instead of adding new 
features

At the request of Andrei, I implemented a small library solution 
as well (https://github.com/dlang/phobos/pull/6339), but the 
leadership never followed up on it.  And that's ok; they only 
have so much time, and they need to prioritize as they see fit.

With that, I read through some discussion and thought it could be 
helpful to summarize my thoughts on the matter since people 
continue to ask questions about it.

In my mind, there's really only one reason for string 
interpolation...

     Better Syntax

In many ways syntax isn't that important.  There's a lot of 
subjectivity around it, but sometimes a change can make it 
objectively better.  Any time you can make syntax objectively 
better, you're making code easier to read, write and maintain.  
Better syntax means it's easier to write "correct code" and 
harder to write "incorrect code".

I recall Atila arguing that the syntax without string 
interpolation wasn't that bad. Then he provided this example 
(https://forum.dlang.org/post/jahvdekidbugougmyhgb forum.dlang.org):

     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

Ironically, his example had a mistake, but it was hard to notice. 
Look at the same example with string interpolation:

     text("a is$a, b is $b and the sum is: $(a + b)")

You could say that "better syntax" is one of the main reasons D 
exists.  It's the main, if not the only, reason for a lot of 
features like UFCS and foreach.

Andrei's biggest critique is that we should first try to implement 
this in a library...and he's completely right to ask that
question.  The problem is that over the years, no one's been able 
to achieve a library solution that results in a nice syntax.  
Having a poor syntax is a bad sign for a feature that only exists 
to improve syntax.  However, even if we could make the syntax 
better, there are still a handful of reasons why a library 
solution can't measure up to language support.  Including the 
"poor syntax", the following are the 5 CONS I see with a library 
solution:

CON 1. The syntax is "not nice".  This defeats the entire point 
of interpolated strings. Saying that interpolated strings aren't 
popular because people are not using a library for them is like 
saying Elvis isn't popular because people don't like Elvis 
impersonators.  People not liking a poor imitation of something 
doesn't say anything about how they feel about the genuine 
article.

CON 2. Real error messages.  What's one of the most annoying 
parts of mixins?  Error messages.  When you get an error in a 
mixin, you don't get a line of code to go fix, you get an 
"imaginary" line that doesn't exist.  With library solutions, you 
can't point syntax errors inside interpolated strings to source 
locations.  That information is not available to the language.  
When you get a string, you don't know where each character inside 
that string originated from, only the compiler knows that.

CON 3. Performance.  No matter what we do, any library solution 
will never be as fast as a language solution. The reason why 
performance is especially important here, is because bad 
performance means developers will have to choose between better 
syntax or faster compilation.  We already see this today with 
templates and mixins.  With a language implementation, developers 
can have both.

CON 4. IDE/Editor Support.  A library solution won't be able to 
have IDE/Editor support for syntax highlighting, auto-complete, 
etc.  With a language solution, when the editor sees an 
interpolated string, it can highlight the code inside it just 
like normal code.

CON 5. Full solution requires full copy of lexer/parser.  One big 
problem with a library solution is that it will be hard for a 
library to delimit interpolated expressions.  For full support, it 
will need to have a full implementation of the compiler's 
lexer/parser.  Without that, it will have limitations on the kind 
of code that can be inside an interpolated string.  Take the 
following (contrived) example:

foreach (i; 0 .. 10)
{
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
}

The library solution needs to parse that interpolated string but 
needs to know that the right paren at `")"` is actually just a 
string literal inside the expression and not a right paren to 
delimit the end of the expression.  This is a contrived example, 
but if you have anything less than a full lexer/parser then 
developers are going to have a hard time being able to know what 
can and can't go inside an interpolated expression.  By having 
interpolated strings as a part of the language, the 
implementation has full access to the lexer/parser, so it doesn't 
need to force any limitation on the syntax available inside 
interpolated string expressions.
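To make CON 5 concrete, here is a deliberately naive sketch of the kind of splitter a library would start with (the `naiveSplit` helper is hypothetical, not from the Phobos PR). It tracks paren depth but has no lexer, so the `")"` string literal in the contrived example above terminates the expression early:

```d
// Hypothetical, deliberately naive CTFE-friendly splitter: it splits an
// interpolated string into alternating literal/expression fragments, but
// it is blind to string literals, so a ")" inside an expression is
// mistaken for the closing paren of that expression.
string[] naiveSplit(string s)
{
    string[] parts;
    size_t start = 0;
    for (size_t i = 0; i + 1 < s.length; i++)
    {
        if (s[i] == '$' && s[i + 1] == '(')
        {
            parts ~= s[start .. i];      // literal fragment before the '$('
            size_t j = i + 2;
            int depth = 1;
            while (j < s.length && depth > 0)
            {
                if (s[j] == '(') depth++;
                else if (s[j] == ')') depth--;  // blind to string literals!
                j++;
            }
            parts ~= s[i + 2 .. j - 1];  // expression fragment
            start = j;
            i = j - 1;
        }
    }
    parts ~= s[start .. $];
    return parts;
}

unittest
{
    // Works for the simple case...
    assert(naiveSplit("sum: $(a + b)") == ["sum: ", "a + b", ""]);
    // ...but truncates the expression at the ')' inside the string literal.
    assert(naiveSplit(`$( i ~ ")" ) entry`)[1] == ` i ~ "`);
}
```

A real library would need to recognize string literals, character literals, comments, and nesting to get this right, which is exactly the lexer/parser copy this CON describes.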

---

Now I'm not saying that the CONS of the library solution justify 
the addition of interpolated strings to the language.  I focused 
on that because that is Andrei's main sticking point.  Even if 
everyone agrees that library solutions don't work (and we can't 
enhance the language to make them work), we still need to show 
that the feature is going to be popular/useful enough to justify 
a new type of string literal.  The usefulness of the feature 
needs to outweigh the work to support it.  The more features we 
add to D, the more developers need to learn to understand it.  
That being said, I consider the implementation and complexity it 
adds to be quite minimal (see the PR for more details). As for 
the usefulness, I can say that personally I would use this feature to 
replace almost all my usages of writefln/format and writeln which 
would be a big shift for my projects.  Instead of:

writefln("My name is %s and my age is %s and my favorite hex is 
%s", name, age, favnum);

I will be writing:

writeln(i"My name is $name and my age is $age and my favorite hex 
$(favnum.formatHex)");

When I generate code, instead of:

     return
         returnType ~ ` ` ~ name ~ `(` ~ type ~ ` left, ` ~ type ~ 
` right)
         {
             return cast(` ~ returnType ~ `)(left ` ~ op ~ ` 
right);
         }
     `;
It will be:

      return text(iq{
         $returnType $name($type left, $type right)
         {
             return cast($returnType)(left $op right);
         }
     });

When I generate HTML documents in my cgi library, instead of:

     writeln(`<html><body>
     <title>`, title, `</title>
     <name>`, name, `</name><age>`, age, `</age>
     <a href="`, link, `">`, linkName, `</a>
     </body></html>
`);

or even:

     writefln(`<html><body>
     <title>%s</title>
     <name>%s</name><age>%s</age>
     <a href="%s">%s</a>
     </body></html>
`, title, name, age, link, linkName);

It will be:

     writeln(i`<html><body>
     <title>$title</title>
     <name>$name</name><age>$age</age>
     <a href="$link">$linkName</a>
     </body></html>
`);

When I first saw interpolated strings I didn't immediately 
realize the benefit of them.  Using them eliminates the problem 
of keeping format strings in sync with arguments.  It also avoids 
the "noise problem" you get when you alternate between code and 
expressions inside a function call, i.e. `writeln("a is", a, ", b 
is ", b)`. That pretty much sums up the benefits in my mind.

So what's next? I'm curious where leadership currently stands.  
What are their thoughts on the library solutions that have been 
presented? What do they think of the 5 CONS I've presented that 
all library solutions will have?  What's their opinion on the 
usefulness of the feature? For me personally, I am surprised at 
the amount of interest this feature continues to garner.  I think 
the feature is a net positive for D, but then again I don't think 
it's a "make or break" feature.  Just a "nice addition".  Anyway 
those are my thoughts. Sorry for the long post.  I hope it's 
helpful and ultimately makes D better.
Mar 16
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 2. Real error messages.  What's one of the most annoying 
 parts of mixins?  Error messages.  When you get an error in a 
 mixin, you don't get a line of code to go fix, you get an 
 "imaginary" line that doesn't exist.
Just a note that you can make it exist with -mixin=filename
 So what's next?
Dconf! I've added this to the list of topics for the foundation meeting where we can get a lot of these decisions made.
 For me personally, I am surprised at the amount of interest 
 this feature continues to garner.  I think the feature is a net 
 positive for D, but then again I don't think it's a "make or 
 break" feature.  Just a "nice addition".
Indeed, I hadn't realised how much nicer that would make dealing with formatting mixins, thanks for that.
Mar 17
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2019-03-17 07:01, Jonathan Marler wrote:

 Even if everyone agrees 
 that library solution's don't work (and we can't enhance the language to 
 make them work)
We can. But just as there's no interest in adding language support for string interpolation, there's no interest in adding support for the features that would help make a library solution satisfactory. -- /Jacob Carlborg
Mar 17
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor, syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer. I think interpolated strings are justifiable as a language feature, but a mixin solution can work with some small, more general changes to the language, and those changes are both simpler to support and more useful and flexible.
 CON 1. The syntax is "not nice".  This defeats the entire point
 of interpolated strings.
What about `$ident!targs(args)` being sugar for `mixin(ident!targs(args))`? Then we can have a function template in object.d invoked as:

     $iStr!"v = $v"

becomes:

     mixin(iStr!"v = $v")
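For illustration, a minimal sketch of what such an iStr could look like (hypothetical code; it only handles bare `$identifier` forms, not `$(expr)`, and does no escaping of quotes in the literal parts):

```d
// Hypothetical CTFE helper: turns the format string "v = $v" into the
// code string `text("v = ", v, "")`, suitable for mixin(). Only bare
// $identifiers are supported in this sketch.
string iStr(string fmt)()
{
    string code = `text(`;
    string lit;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] == '$')
        {
            i++;
            size_t start = i;
            while (i < fmt.length &&
                   (fmt[i] == '_' ||
                    ('a' <= fmt[i] && fmt[i] <= 'z') ||
                    ('A' <= fmt[i] && fmt[i] <= 'Z') ||
                    ('0' <= fmt[i] && fmt[i] <= '9')))
                i++;
            // emit the pending literal, then the identifier expression
            code ~= `"` ~ lit ~ `", ` ~ fmt[start .. i] ~ `, `;
            lit = null;
        }
        else
        {
            lit ~= fmt[i];
            i++;
        }
    }
    return code ~ `"` ~ lit ~ `")`;
}

unittest
{
    import std.conv : text;
    int v = 42;
    static assert(iStr!"v = $v" == `text("v = ", v, "")`);
    assert(mixin(iStr!"v = $v") == "v = 42");
}
```

With the proposed sugar, `$iStr!"v = $v"` would then be shorthand for `mixin(iStr!"v = $v")` and evaluate to `"v = 42"`.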
 With library solutions,
 you can't point syntax errors inside interpolated strings to
 source locations.  That information is not available to the
 language.  When you get a string, you don't know where each
 character inside that string originated from, only the compiler
 knows that.
Yes, but the language could support a way to get the file and line from template alias arguments inside the template. So at least the starting line of the string literal could be known and given in an error message.
 CON 3. Performance.  No matter what we do, any library solution
 will never be as fast as a language solution.
True, but what matters is whether a library solution is fast enough. If dmd isn't currently then perhaps the CTFE rewrite will be.
 CON 4. IDE/Editor Support.  A library solution won't be able to
 have IDE/Editor support for syntax highlighting, auto-complete,
 etc.  When the editor sees an interpolated string, it will be
 able to highlight the code inside it just like normal code.
The editor has to be updated for interpolated syntax, so it could just as easily be updated to recognise invocation of $iStr if iStr was in object.d.
 The library solution needs to parse that interpolated string
 but needs to know that the right paren at `")"` is actually
 just a string literal inside the expression and not a right
 paren to delimit the end of the expression.  This is a
 contrived example, but if you have anything less than a full
 lexer/parser then developers are going to have a hard time
 being able to know what can and can't go inside an interpolated
 expression.
Good point, but I think banning all string delimiters (`, ', ") is a reasonable and easy-to-understand restriction. It's good to discourage people from putting complex expressions inside strings, and this also makes iStr editor highlighting easier to implement vs the unrestricted language solution (which could also be restricted).
 That being said, I consider the implementation
 and complexity it adds to be quite minimal (see the PR for more
 details). As for the usefullness, I can say personally I would
 use this feature to replace almost all my usages of
 writefln/format and writeln which would be a big shift for my
 projects.  Instead of:

 writefln("My name is %s and my age is %s and my favorite hex is
 %s", name, age, favnum);
Actually this should be `writefln!"My name is..."(name, ...)`. Formatting can be more efficient if the format string is known at compile-time.
 I will be writing:

 writeln(i"My name is $name and my age is $age and my favorite
 hex $(favnum.formatHex)");
Does formatHex exist?
      return text(iq{
         $returnType $name($type left, $type right)
         {
             return cast($returnType)(left $op right);
         }
     });
Here there are advantages to the $iStr solution vs your implementation. First, I would have two mixin functions: iSeq to do what you want, and iStr which includes the call to std.conv.text:

1. iStr doesn't need to import std.conv.text, which is a very common case. Having to import `text` explicitly would often make me avoid using the interpolated string feature and use existing string syntax instead. (A different language implementation could expose text as a property of the interpolated string though.)

2. The iStr template gets the string literal passed at compile time, so the length is available for buffer pre-allocation without summing the string fragment lengths at runtime. (A different language implementation could expose the original string length as an enum; then text, writef and format can take advantage of it.)
Mar 17
parent reply Atila Neves <atila.neves gmail.com> writes:
On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.
Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.
Mar 19
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Tuesday, 19 March 2019 at 10:11:27 UTC, Atila Neves wrote:
 On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler 
 wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.
Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.
For me, even with syntax highlighting it's still hard to spot the mistake, so it's good to get your perspective: in your case, syntax highlighting would have prevented it. Off topic, I enjoyed your talk on Emacs; I'm a long-time user myself.
Mar 19
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 17.03.19 07:01, Jonathan Marler wrote:
 When I generate HTML documents in my cgi library, instead of:
 
      writeln(`<html><body>
      <title>`, title, `</title>
      <name>`, name, `</name><age>`, age, `</age>
      <a href="`, link, `">`, linkName, `</a>
      </body></html>
 `);
 
 or even:
 
      writefln(`<html><body>
      <title>%s</title>
      <name>%s</name><age>%s</age>
      <a href="%s">%s</a>
      </body></html>
 `, title, name, age, link, linkName);
 
 It will be:
 
      writeln(i`<html><body>
 
      <title>$title</title>
      <name>$name</name><age>$age</age>
      <a href="$link">$linkName</a>
      </body></html>
 `);
Either way, you likely got yourself an HTML injection.

That might be the crux of string interpolation: It looks nice in simple examples, but is it still nice when you need to encode your variables for the output? I think that should be a goal. We don't want to encourage writing bad code by making it more beautiful than correct code.

Unless I'm missing something (I've only skimmed your PRs), you don't have mechanisms to aid in this. So your example would look like this with encoding:

     writeln(i`<html><body>
     <title>$(title.toHTML)</title>
     <name>$(name.toHTML)</name><age>$(age.toHTML)</age>
     <a href="$(link.toHTML)">$(linkName.toHTML)</a>
     </body></html>
`);

That might still be prettier than the alternative with a plain `writeln`, but the difference is less pronounced. And with `writefln` we can do something like this:

     void writeflnToHTML(S ...)(string f, S stuff)
     {
         writefln(f, tupleMap!toHTML(stuff).expand);
     }

     writeflnToHTML(`<html><body>
     <title>%s</title>
     <name>%s</name><age>%s</age>
     <a href="%s">%s</a>
     </body></html>
`, title, name, age, link, linkName);

That's still not pretty at all, but we can't forget a `.toHTML` this way. (Though `tupleMap` isn't in phobos and might be hard to get exactly right.)

Ideally, something like that would be possible with interpolated strings, too.
Mar 17
next sibling parent reply kinke <noone nowhere.com> writes:
I like the idea and find the tuple approach very elegant.

On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
     void writeflnToHTML(S ...)(string f, S stuff)
     {
         writefln(f, tupleMap!toHTML(stuff).expand);
     }
     writeflnToHTML(`<html><body>
         <title>%s</title>
         <name>%s</name><age>%s</age>
         <a href="%s">%s</a>
         </body></html>
     `, title, name, age, link, linkName);

 That's still not pretty at all, but we can't forget a `.toHTML` 
 this way. (Though `tupleMap` isn't in phobos and might be hard 
 to get exactly right.)

 Ideally, something like that would be possible with 
 interpolated strings, too.
As long as the embedded expressions are always separated by a string literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => `"", a, "", b`), these embedded expressions can be identified (and appropriately transformed etc.) by an odd index.
Mar 17
parent ag0aep6g <anonymous example.com> writes:
On 18.03.19 00:21, kinke wrote:
 As long as the embedded expressions are always separated by a string 
 literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => 
 `"", a, "", b`), these embedded expressions can be identified (and 
 appropriately transformed etc.) by an odd index.
I see. Like this:

     void writelnToHTML(S ...)(S stuff)
     {
         static foreach (i, thing; stuff)
         {
             if (i % 2 == 0)
                 write(thing);
             else
                 write(thing.toHTML);
         }
         writeln;
     }

     writelnToHTML(i`<html><body>
     <title>$(title)</title>
     <name>$(name)</name><age>$(age)</age>
     <a href="$(link)">$(linkName)</a>
     </body></html>
`);

That's pretty good. It's not quite foolproof; one might put arguments before/after the interpolated string, messing up the arrangement. But other than that it seems nice.
Mar 18
prev sibling parent Olivier FAURE <couteaubleu gmail.com> writes:
On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
 Either way, you likely got yourself an HTML injection.

 That might be the crux of string interpolation: It looks nice 
 in simple examples, but is it still nice when you need to 
 encode your variables for the output?
One way to deal with these cases would be to have an alternate string interpolation syntax for format strings, eg:

     fi"SELECT $field FROM $table"

is lowered to:

     "SELECT %s FROM %s", field, table

Other alternatives have been suggested (eg interpolation creating delegates that can be passed at compile time), but I think the above solution is the most KISS and elegant. It encourages robust design, where the only argument parsed is the first one and every other argument is sanitized by default.
Mar 18
prev sibling next sibling parent reply Rubn <where is.this> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 [...]
Seems you've done everything but write a DIP. I don't really see why this feature should be exempt from the process. Even if the process isn't the greatest, that isn't reason enough to circumvent it.
Mar 17
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
 Seems you've done everything but write a DIP, I don't really 
 see why this feature should be exempt from the process. Even if 
 the process isn't the greatest, that isn't reason enough for it 
 to circumvent it.
I'll have to disagree with you here. I'm not sure if you've written a DIP, but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A, and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened).

The proposal itself is pretty simple, but the ramifications of the change weren't so clear. A discussion with the leadership about the points they had questions on would have been very helpful early on, to know where to put effort into researching the proposal. However, that's not how the DIP process is written to work. After Andrei's response I attempted to discuss their concerns, but everything was filtered through the DIP manager Michael Parker, and they never responded to my comments and questions. We left off with them providing an example library implementation and asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself, as they had implemented different semantics than what the DIP was proposing, but they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".

In my opinion the DIP process is broken. I don't want to introduce a potentially good feature for D into a system where I believe it will actually harm the chances for the feature rather than help them. If the process is fixed, however, I will gladly create a DIP and would look forward to really hashing out the feature and seeing how it could best be implemented. D has a big potential to make a splash with this feature by showcasing its meta-programming capabilities with a zero-overhead implementation of string interpolation that also happens to be the most powerful/flexible one out there. 
I've heard some pretty cool ideas from people on it that I hadn't thought of, and would love to work with the community on creating a robust, well-researched proposal. But I believe that if I use the current DIP system as it exists to introduce it, that will actually make it more likely to fail than if I didn't write a DIP at all at this point.

Please understand, I don't shy away from good, robust work and research. I'm a highly motivated mathematician who loves optimizing and finding elegant solutions. That's what I like spending my time on. Researching language proposals is exactly the type of work I like to do. I spend a lot of time reading and researching other languages and the features they bring to the table. I would very much enjoy contributing to a DIP process that fosters collaboration and feedback and results in more consensus and communal understanding.
Mar 17
parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 17 March 2019 at 16:50:13 UTC, Jonathan Marler wrote:
 On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
 Seems you've done everything but write a DIP, I don't really 
 see why this feature should be exempt from the process. Even 
 if the process isn't the greatest, that isn't reason enough 
 for it to circumvent it.
I'll have to disagree with you here. I'm not sure if you've written a DIP, but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened). The proposal itself is pretty simple, but the ramifications of the change weren't so clear. A discussion with the leadership about the points they had questions about would have been very helpful early on, to know where to put effort into researching the proposal. However, that's not how the DIP process is written to work. After Andrei's response I attempted to discuss their concerns, but everything was filtered through the DIP manager Michael Parker and they never responded to my comments and questions. We left off with them providing an example library implementation and asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself, as they had implemented different semantics than what the DIP was proposing, but they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".
That does smack of my experience with DIP 1016. I've added DIP 1011 to the list of DIP-related stuff for the DConf AGM[1].
 In my opinion the DIP process is broken.  I don't want to 
 introduce a potentially good feature for D into a system where 
 I believe it will actually harm the chances for the feature 
 rather than help them.  If the process is fixed however, I will 
 gladly create a DIP and would look forward to really hashing 
 out the feature and seeing how it could best be implemented.
I hope to get the DIP process fixed at the AGM, but for making progress on the topic of interpolated strings it would really help if you had a draft (or stole/polished the one from the DIP PR queue), so that we have something concrete to discuss (i.e. get (part of) a community round done).
 I believe if I use the current DIP system as it exists to 
 introduce it, it will actually make it more likely to fail than 
 if I didn't write a DIP at all at this point.

 Please understand, I don't shy away from good, robust work and 
 research.  I'm a highly motivated mathematician, who loves 
 optimizing and finding elegant solutions.  That's what I like 
 spending my time on.  Researching language proposals is exactly 
 the type of work I like to do. I spend alot of time reading and 
 researching other languages and the features they bring to the 
 table. I would very much enjoy contributing to a DIP process 
 that fosters collaboration and feedback that results in more 
 consensus and communal understanding.
This really does highlight the problems with our organisation. We really want to be getting all the best work you can provide, and if that is bottlenecked on the current DIP process then we definitely need to look at why that is and how to fix it.

[1]: http://dconf.org/2019/talks/agm.html
Mar 17
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 3. Performance.  No matter what we do, any library solution 
 will never be as fast as a language solution.
Must consider the price. Language bloat is not free.
 CON 4. IDE/Editor Support.  A library solution won't be able to 
 have IDE/Editor support for syntax highlighting, auto-complete, 
 etc.  When the editor sees an interpolated string, it will be 
 able to highlight the code inside it just like normal code.
There's IDE support for format string validation for C and C#.
 CON 5. Full solution requires full copy of lexer/parser.  One 
 big problem with a library solution is that it will be hard for 
 a library to delimit interpolated expressions.  For full 
 support, it will need to have a full implementation of the 
 compiler's lexer/parser.
The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Mar 17
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 17 March 2019 at 17:05:46 UTC, Kagamin wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 3. Performance.  No matter what we do, any library 
 solution will never be as fast as a language solution.
Must consider the price. Language bloat is not free.
Right, which is why I mentioned further down that we must consider the price. I know it's a long post, though; I don't blame you for not reading it all :)
 CON 4. IDE/Editor Support.  A library solution won't be able 
 to have IDE/Editor support for syntax highlighting, 
 auto-complete, etc.  When the editor sees an interpolated 
 string, it will be able to highlight the code inside it just 
 like normal code.
There's IDE support for format string validation for C and C#.
Which works in some cases. It can't work in all cases unless language support is added. Still a CON for the library solution.
 CON 5. Full solution requires full copy of lexer/parser.  One 
 big problem with a library solution is that it will be hard 
 for a library to delimit interpolated expressions.  For full 
 support, it will need to have a full implementation of the 
 compiler's lexer/parser.
The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.
The full solution requires a full copy of the lexer/parser, but the syntax for interpolated strings is just allowing string literals to be prefixed with the letter 'i'. You've confused the syntax of interpolated strings with the complexity of the arbitrary D expressions inside them. If you want to be able to write any code inside those expressions, then you need the full lexer/parser to parse it. A language solution already has access to this, but a library implementation will need to either copy the full implementation or make compromises in what it can support.
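To make that compromise concrete, here's a rough sketch of the kind of CTFE helper a library solution might use (`interp` here is hypothetical, not the code from the Phobos PR). It delimits expressions by naive paren counting, which is exactly where it falls short of the compiler's parser:

```d
// Hypothetical CTFE helper, not from the actual PR: turns
// `a is $(x)` into the string `text("a is ", x)` for use with mixin
// (the mixin site needs std.conv.text in scope).
string interp(string s)
{
    string result = "text(";
    bool first = true;
    size_t i = 0;
    while (i < s.length)
    {
        // copy the literal run up to the next '$'
        size_t start = i;
        while (i < s.length && s[i] != '$')
            i++;
        if (i > start)
        {
            result ~= (first ? "" : ", ") ~ `"` ~ s[start .. i] ~ `"`;
            first = false;
        }
        if (i >= s.length)
            break;
        i += 2; // skip "$(" -- assumes '$' is always followed by '('
        start = i;
        int depth = 1;
        // naive balance counting: a ')' inside a string literal in the
        // expression, e.g. $( i ~ ")" ), ends the scan too early --
        // avoiding that requires a real lexer
        while (i < s.length && depth > 0)
        {
            if (s[i] == '(') depth++;
            else if (s[i] == ')') depth--;
            if (depth > 0) i++;
        }
        result ~= (first ? "" : ", ") ~ s[start .. i];
        first = false;
        i++; // skip the terminating ')'
    }
    return result ~ ")";
}

// mixin(interp("a is $(a) and the sum is $(a + b)"))
// expands to: text("a is ", a, " and the sum is ", a + b)
```

Counting parens covers the common cases, but as soon as an expression contains an unbalanced ')' inside a string literal, the split lands in the wrong place, and fixing that properly means re-implementing pieces of the lexer.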
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this, but you're criticizing how the library solution looks, which was one of my main points :) And it's a contrived example meant to demonstrate something that would be hard for a library to parse. In reality you wouldn't write it that way (hence why I called it "contrived"). With full interpolation support you would write it like this: i"$(i)) entry $(array[i])"
Mar 17
parent reply Kagamin <spam here.lot> writes:
On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
 And it's a contrived example to demonstrate an example that 
 would be hard for a library to parse.
It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?
Mar 18
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/18/19 8:12 AM, Kagamin wrote:
 On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
Yeah, I think you could write it just like the language solution: mixin(interp(`$(i)) entry $(array[i])`)); You'd still have to deal with the stray parenthesis as a string, but I'm sure there are other expressions inside the escapes, more likely to be found in the wild, that would require a lexer/parser. It may not need to be a full one, though; we don't need to make an AST out of it.
 
 And it's a contrived example to demonstrate an example that would be 
 hard for a library to parse.
It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?
If you look at the PR that he created, it's super-simple. It uses the already existing parser/lexer in the front end. The point he's making is that we'd have to DUPLICATE that for a library solution. -Steve
Mar 18
parent reply Kagamin <spam here.lot> writes:
On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer 
wrote:
 If you look at the PR that he created, it's super-simple. It 
 uses the already existing parser/lexer in the front end.
Hmm... the PR seemingly doesn't support the full nested language: the string is lexed as a normal double-quoted string. This is different from, say, JavaScript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all of their content.
 The point he's making is that we'd have to DUPLICATE that for a 
 library solution.
That would be a problem if they supported the full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.
Mar 19
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Tuesday, 19 March 2019 at 08:13:16 UTC, Kagamin wrote:
 On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer 
 wrote:
 If you look at the PR that he created, it's super-simple. It 
 uses the already existing parser/lexer in the front end.
Hmm... the pr seemingly doesn't support full nested language: the string is lexed as a normal double quoted string. This is different from, say, javascript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all their content.
 The point he's making is that we'd have to DUPLICATE that for 
 a library solution.
If they supported full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.
You're absolutely right here. Thanks for taking the time to look at the code, by the way. The current implementation isn't finished; I just wanted to start out with something easy to explore the feature. So today the language solution has the limitation that parens inside the code need to be balanced, but that limitation can be fixed with a bit more work. I just need to combine the current two-pass interpolation logic into one pass, but that would require some extra changes in the parse code, and at this stage I wanted to minimize turmoil. So it's a temporary compromise; the final solution wouldn't have this limitation.
Mar 19
prev sibling parent Jonathan Marler <johnnymarler gmail.com> writes:
On Monday, 18 March 2019 at 12:12:12 UTC, Kagamin wrote:
 On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
Yeah, like I said, you wouldn't write that in real code. It's an example to demonstrate something that would be hard for a library to parse.
 And it's a contrived example to demonstrate an example that 
 would be hard for a library to parse.
It's hard to parse even for a library? That sounds bad.
Yeah, it is bad. Hence why I listed it as a CON of the library solution. But it's very easy if it's implemented in the compiler, since it has access to the parser.
 Is it required to be hard to parse?
The problem is knowing when right-paren ')' characters are supposed to be part of the code and when they are being used to end the code. For the parser this is easy to determine, but without it you now have to go through all the tokens/grammar nodes to see where these right-paren characters can appear (i.e. string literals, function calls, templates, etc). You could implement a simple heuristic where you support as many right-paren characters in your code as there are left-paren characters, which will get you most of the way there, but now you've put an arbitrary limitation on the code that can appear inside these interpolated expressions.

The point is, a language solution doesn't have this problem. It doesn't need to put any limitations on the code, and it doesn't have to come up with a mechanism to know when paren characters are or are not a part of the code. This problem is non-existent for an implementation inside the compiler.
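To illustrate that middle ground (a hypothetical sketch, not code from any PR): a scanner that balances parens and also skips quoted literals gets further than pure counting, but every additional case it covers pulls in another piece of D's lexer:

```d
// Hypothetical sketch: find the ')' that ends an interpolated
// expression, given the text that follows "$(". Balances parens and
// skips quoted literals, but still knows nothing about comments,
// token strings (q{...}), or the rest of D's grammar.
bool findExprEnd(string s, out size_t end)
{
    int depth = 1;
    size_t i = 0;
    while (i < s.length)
    {
        char c = s[i];
        if (c == '"' || c == '\'' || c == '`')
        {
            // skip over a quoted literal so any ')' inside it is ignored
            i++;
            while (i < s.length && s[i] != c)
                i += (c != '`' && s[i] == '\\') ? 2 : 1; // honor escapes
        }
        else if (c == '(')
            depth++;
        else if (c == ')' && --depth == 0)
        {
            end = i;
            return true;
        }
        i++;
    }
    return false; // unbalanced: no terminating ')' found
}

// handled:     i ~ ")"       the ')' inside the string is skipped
// handled:     f(a, (b))     nested parens balance out
// still wrong: q{ ) }        token strings need the real lexer
// still wrong: /* ) */ x     so do comments
```

Each "still wrong" case forces the library to copy yet another lexer rule, which is the duplication a language solution avoids entirely.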
Mar 18
prev sibling parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Mar 18
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/18/19 5:07 AM, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case?     i"$a $b" ~ i$"c $d"
Hm... it would be good for the feature to handle this (and treat it as i"$a $b$c $d"). I'm assuming that i$"c is a typo and you meant i"$c. -Steve
Mar 18
prev sibling parent reply Meta <jared771 gmail.com> writes:
On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
Mar 18
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via Digitalmars-d wrote:
 On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
[...] I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`? T -- Tell me and I forget. Teach me and I remember. Involve me and I understand. -- Benjamin Franklin
Mar 18
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Monday, 18 March 2019 at 16:48:59 UTC, H. S. Teoh wrote:
 On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via 
 Digitalmars-d wrote:
 On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler 
 wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
[...] I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`?
\$ would be consistent with normal C and D quoting rules. $$ not so much.
Mar 18