
digitalmars.D - The state of string interpolation...one year later

reply Jonathan Marler <johnnymarler gmail.com> writes:
It's been about a year since I submitted an implementation for 
interpolated strings:

https://github.com/dlang/dmd/pull/7988

In that time, various people have been popping up asking about 
it. There has been a lot of discussion around this feature on the 
forums, and the place we left off was with Andrei saying that we 
should:

* continue to explore alternative library solutions
* focus on improving existing features instead of adding new 
features

At the request of Andrei, I implemented a small library solution 
as well (https://github.com/dlang/phobos/pull/6339), but the 
leadership never followed up on it.  And that's ok; they only 
have so much time, and they need to prioritize as they see fit.

With that, I read through some discussion and thought it could be 
helpful to summarize my thoughts on the matter since people 
continue to ask questions about it.

In my mind, there's really only one reason for string 
interpolation...

     Better Syntax

In many ways syntax isn't that important.  There's a lot of 
subjectivity around it, but sometimes a change can make it 
objectively better.  Any time you can make syntax objectively 
better, you're making code easier to read, write and maintain.  
Better syntax means it's easier to write "correct code" and 
harder to write "incorrect code".

I recall Atila arguing that the syntax without string 
interpolation wasn't that bad. Then he provided this example 
(https://forum.dlang.org/post/jahvdekidbugougmyhgb forum.dlang.org):

     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

Ironically, his example had a mistake, but it was hard to notice. 
Look at the same example with string interpolation:

     text("a is$a, b is $b and the sum is: $(a + b)")

You could say that "better syntax" is one of the main reasons D 
exists.  It's the main, if not the only, reason for a lot of 
features like UFCS and foreach.

Andrei's biggest critique is that we should first try to implement 
this in a library...and he's completely right to ask that
question.  The problem is that over the years, no one's been able 
to achieve a library solution that results in a nice syntax.  
Having a poor syntax is a bad sign for a feature that only exists 
to improve syntax.  However, even if we could make the syntax 
better, there are still a handful of reasons why a library 
solution can't measure up to language support.  Including the 
"poor syntax", the following are the 5 CONS I see with a library 
solution:

CON 1. The syntax is "not nice".  This defeats the entire point 
of interpolated strings. Saying that interpolated strings aren't 
popular because people are not using a library for them is like 
saying Elvis isn't popular because people don't like Elvis 
impersonators.  People not liking a poor imitation of something 
doesn't say anything about how they feel about the genuine 
article.

CON 2. Real error messages.  What's one of the most annoying 
parts of mixins?  Error messages.  When you get an error in a 
mixin, you don't get a line of code to go fix, you get an 
"imaginary" line that doesn't exist.  With library solutions, you 
can't point syntax errors inside interpolated strings to source 
locations.  That information is not available to the language.  
When you get a string, you don't know where each character inside 
that string originated from, only the compiler knows that.

CON 3. Performance.  No matter what we do, any library solution 
will never be as fast as a language solution. The reason why 
performance is especially important here, is because bad 
performance means developers will have to choose between better 
syntax or faster compilation.  We already see this today with 
templates and mixins.  With a language implementation, developers 
can have both.

CON 4. IDE/Editor Support.  A library solution won't be able to 
have IDE/Editor support for syntax highlighting, auto-complete, 
etc.  With a language solution, when the editor sees an 
interpolated string, it can highlight the code inside it just 
like normal code.

CON 5. Full solution requires full copy of lexer/parser.  One big 
problem with a library solution is that it will be hard for a 
library to delimit interpolated expressions.  For full support, it 
will need to have a full implementation of the compiler's 
lexer/parser.  Without that, it will have limitations on the kind 
of code that can be inside an interpolated string.  Take the 
following (contrived) example:

foreach (i; 0 .. 10)
{
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
}

The library solution needs to parse that interpolated string but 
needs to know that the right paren at `")"` is actually just a 
string literal inside the expression and not a right paren to 
delimit the end of the expression.  This is a contrived example, 
but if you have anything less than a full lexer/parser then 
developers are going to have a hard time being able to know what 
can and can't go inside an interpolated expression.  By having 
interpolated strings as a part of the language, the 
implementation has full access to the lexer/parser, so it doesn't 
need to force any limitation on the syntax available inside 
interpolated string expressions.
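To make CON 5 concrete, here is a deliberately naive sketch of the kind of splitter a library would start with (the `naiveSplit` helper is hypothetical, not from the Phobos PR). It tracks paren depth but has no lexer, so the `")"` string literal in the contrived example above terminates the expression early:

```d
// Hypothetical, deliberately naive CTFE-friendly splitter: it splits an
// interpolated string into alternating literal/expression fragments, but
// it is blind to string literals, so a ")" inside an expression is
// mistaken for the closing paren of that expression.
string[] naiveSplit(string s)
{
    string[] parts;
    size_t start = 0;
    for (size_t i = 0; i + 1 < s.length; i++)
    {
        if (s[i] == '$' && s[i + 1] == '(')
        {
            parts ~= s[start .. i];      // literal fragment before the '$('
            size_t j = i + 2;
            int depth = 1;
            while (j < s.length && depth > 0)
            {
                if (s[j] == '(') depth++;
                else if (s[j] == ')') depth--;  // blind to string literals!
                j++;
            }
            parts ~= s[i + 2 .. j - 1];  // expression fragment
            start = j;
            i = j - 1;
        }
    }
    parts ~= s[start .. $];
    return parts;
}

unittest
{
    // Works for the simple case...
    assert(naiveSplit("sum: $(a + b)") == ["sum: ", "a + b", ""]);
    // ...but truncates the expression at the ')' inside the string literal.
    assert(naiveSplit(`$( i ~ ")" ) entry`)[1] == ` i ~ "`);
}
```

A real library would need to recognize string literals, character literals, comments, and nesting to get this right, which is exactly the lexer/parser copy this CON describes.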

---

Now I'm not saying that the CONS of the library solution justify 
the addition of interpolated strings to the language.  I focused 
on that because that is Andrei's main sticking point.  Even if 
everyone agrees that library solutions don't work (and we can't 
enhance the language to make them work), we still need to show 
that the feature is going to be popular/useful enough to justify 
a new type of string literal.  The usefulness of the feature 
needs to outweigh the work to support it.  The more features we 
add to D, the more developers need to learn to understand it.  
That being said, I consider the implementation and complexity it 
adds to be quite minimal (see the PR for more details). As for 
the usefulness, I can say that personally I would use this feature to 
replace almost all my usages of writefln/format and writeln which 
would be a big shift for my projects.  Instead of:

writefln("My name is %s and my age is %s and my favorite hex is 
%s", name, age, favnum);

I will be writing:

writeln(i"My name is $name and my age is $age and my favorite hex 
$(favnum.formatHex)");

When I generate code, instead of:

     return
         returnType ~ ` ` ~ name ~ `(` ~ type ~ ` left, ` ~ type ~ 
` right)
         {
             return cast(` ~ returnType ~ `)(left ` ~ op ~ ` 
right);
         }
     `;
It will be:

      return text(iq{
         $returnType $name($type left, $type right)
         {
             return cast($returnType)(left $op right);
         }
     });

When I generate HTML documents in my cgi library, instead of:

     writeln(`<html><body>
     <title>`, title, `</title>
     <name>`, name, `</name><age>`, age, `</age>
     <a href="`, link, `">`, linkName, `</a>
     </body></html>
`);

or even:

     writefln(`<html><body>
     <title>%s</title>
     <name>%s</name><age>%s</age>
     <a href="%s">%s</a>
     </body></html>
`, title, name, age, link, linkName);

It will be:

     writeln(i`<html><body>
     <title>$title</title>
     <name>$name</name><age>$age</age>
     <a href="$link">$linkName</a>
     </body></html>
`);

When I first saw interpolated strings I didn't immediately 
realize the benefit of them.  Using them eliminates the problem 
of keeping format strings in sync with arguments.  It also avoids 
the "noise problem" you get when you alternate between code and 
expressions inside a function call, i.e. `writeln("a is", a, ", b 
is ", b)`. That pretty much sums up the benefits in my mind.

So what's next? I'm curious where leadership currently stands.  
What are their thoughts on the library solutions that have been 
presented? What do they think of the 5 CONS I've presented that 
all library solutions will have?  What's their opinion on the 
usefulness of the feature? For me personally, I am surprised at 
the amount of interest this feature continues to garner.  I think 
the feature is a net positive for D, but then again I don't think 
it's a "make or break" feature.  Just a "nice addition".  Anyway 
those are my thoughts. Sorry for the long post.  I hope it's 
helpful and ultimately makes D better.
Mar 16
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 2. Real error messages.  What's one of the most annoying 
 parts of mixins?  Error messages.  When you get an error in a 
 mixin, you don't get a line of code to go fix, you get an 
 "imaginary" line that doesn't exist.
Just a note that you can make it exist with -mixin=filename
 So what's next?
Dconf! I've added this to the list of topics for the foundation meeting where we can get a lot of these decisions made.
 For me personally, I am surprised at the amount of interest 
 this feature continues to garner.  I think the feature is a net 
 positive for D, but then again I don't think it's a "make or 
 break" feature.  Just a "nice addition".
Indeed, I hadn't realised how much nicer that would make dealing with formatting mixins, thanks for that.
Mar 17
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2019-03-17 07:01, Jonathan Marler wrote:

 Even if everyone agrees 
 that library solution's don't work (and we can't enhance the language to 
 make them work)
We can. But just as there's no interest in adding language support for string interpolation, there's no interest in adding support for the features that would help make a library solution satisfactory. -- /Jacob Carlborg
Mar 17
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor, syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer. I think interpolated strings are justifiable as a language feature, but a mixin solution can work with some small, more general changes to the language, and those changes are both simpler to support and more useful and flexible.
 CON 1. The syntax is "not nice".  This defeats the entire point
 of interpolated strings.
What about `$ident!targs(args)` being sugar for `mixin(ident!targs(args))`? Then we can have a function template in object.d invoked as:

     $iStr!"v = $v"

becomes:

     mixin(iStr!"v = $v")
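For illustration, a minimal sketch of what such an iStr could look like (hypothetical code; it only handles bare `$identifier` forms, not `$(expr)`, and does no escaping of quotes in the literal parts):

```d
// Hypothetical CTFE helper: turns the format string "v = $v" into the
// code string `text("v = ", v, "")`, suitable for mixin(). Only bare
// $identifiers are supported in this sketch.
string iStr(string fmt)()
{
    string code = `text(`;
    string lit;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] == '$')
        {
            i++;
            size_t start = i;
            while (i < fmt.length &&
                   (fmt[i] == '_' ||
                    ('a' <= fmt[i] && fmt[i] <= 'z') ||
                    ('A' <= fmt[i] && fmt[i] <= 'Z') ||
                    ('0' <= fmt[i] && fmt[i] <= '9')))
                i++;
            // emit the pending literal, then the identifier expression
            code ~= `"` ~ lit ~ `", ` ~ fmt[start .. i] ~ `, `;
            lit = null;
        }
        else
        {
            lit ~= fmt[i];
            i++;
        }
    }
    return code ~ `"` ~ lit ~ `")`;
}

unittest
{
    import std.conv : text;
    int v = 42;
    static assert(iStr!"v = $v" == `text("v = ", v, "")`);
    assert(mixin(iStr!"v = $v") == "v = 42");
}
```

With the proposed sugar, `$iStr!"v = $v"` would then be shorthand for `mixin(iStr!"v = $v")` and evaluate to `"v = 42"`.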
 With library solutions,
 you can't point syntax errors inside interpolated strings to
 source locations.  That information is not available to the
 language.  When you get a string, you don't know where each
 character inside that string originated from, only the compiler
 knows that.
Yes, but the language could support a way to get the file and line from template alias arguments inside the template. So at least the starting line of the string literal could be known and given in an error message.
 CON 3. Performance.  No matter what we do, any library solution
 will never be as fast as a language solution.
True, but what matters is whether a library solution is fast enough. If dmd isn't currently then perhaps the CTFE rewrite will be.
 CON 4. IDE/Editor Support.  A library solution won't be able to
 have IDE/Editor support for syntax highlighting, auto-complete,
 etc.  When the editor sees an interpolated string, it will be
 able to highlight the code inside it just like normal code.
The editor has to be updated for interpolated syntax, so it could just as easily be updated to recognise invocation of $iStr if iStr was in object.d.
 The library solution needs to parse that interpolated string
 but needs to know that the right paren at `")"` is actually
 just a string literal inside the expression and not a right
 paren to delimit the end of the expression.  This is a
 contrived example, but if you have anything less than a full
 lexer/parser then developers are going to have a hard time
 being able to know what can and can't go inside an interpolated
 expression.
Good point, but I think banning all string delimiters (`, ', ") is a reasonable and easy-to-understand restriction. It's good to discourage people from putting complex expressions inside strings, and this also makes iStr editor highlighting easier to implement vs the unrestricted language solution (which could also be restricted).
 That being said, I consider the implementation
 and complexity it adds to be quite minimal (see the PR for more
 details). As for the usefullness, I can say personally I would
 use this feature to replace almost all my usages of
 writefln/format and writeln which would be a big shift for my
 projects.  Instead of:

 writefln("My name is %s and my age is %s and my favorite hex is
 %s", name, age, favnum);
Actually this should be `writefln!"My name is..."(name, ...)`. Formatting can be more efficient if the format string is known at compile-time.
 I will be writing:

 writeln(i"My name is $name and my age is $age and my favorite
 hex $(favnum.formatHex)");
Does formatHex exist?
      return text(iq{
         $returnType $name($type left, $type right)
         {
             return cast($returnType)(left $op right);
         }
     });
Here there are advantages to the $iStr solution vs your implementation. First, I would have two mixin functions: iSeq to do what you want, and iStr which includes the call to std.conv.text:

1. iStr doesn't need to import std.conv.text, which is a very common case. Having to import `text` explicitly would often make me avoid using the interpolated string feature and use existing string syntax instead. (A different language implementation could expose text as a property of the interpolated string though.)

2. The iStr template gets the string literal passed at compile time, so the length is available for buffer pre-allocation without summing the string fragment lengths at runtime. (A different language implementation could expose the original string length as an enum; then text, writef and format can take advantage of it.)
Mar 17
parent reply Atila Neves <atila.neves gmail.com> writes:
On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.
Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.
Mar 19
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Tuesday, 19 March 2019 at 10:11:27 UTC, Atila Neves wrote:
 On Sunday, 17 March 2019 at 10:40:28 UTC, Nick Treleaven wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler 
 wrote:
     text("a is", a, ", b is ", b, " and the sum is: ", a + b)

 Ironically, his example had a mistake, but it was hard to
 notice.
Maybe he wrote that on the forum. In a text editor syntax highlighting would make the mistake clearer, but it can still happen. The interpolated syntax is definitely clearer.
Yep. At the time I pointed out I wouldn't have made the same mistake in Emacs.
For me, even with syntax highlighting it's still hard to spot the mistake, so it's good to get your perspective: in your case, syntax highlighting would have prevented it. Off topic, I enjoyed your talk on Emacs; I'm a long-time user myself.
Mar 19
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 17.03.19 07:01, Jonathan Marler wrote:
 When I generate HTML documents in my cgi library, instead of:
 
      writeln(`<html><body>
      <title>`, title, `</title>
      <name>`, name, `</name><age>`, age, `</age>
      <a href="`, link, `">`, linkName, `</a>
      </body></html>
 `);
 
 or even:
 
      writefln(`<html><body>
      <title>%s</title>
      <name>%s</name><age>%s</age>
      <a href="%s">%s</a>
      </body></html>
 `, title, name, age, link, linkName);
 
 It will be:
 
      writeln(i`<html><body>
 
      <title>$title</title>
      <name>$name</name><age>$age</age>
      <a href="$link">$linkName</a>
      </body></html>
 `);
Either way, you likely got yourself an HTML injection.

That might be the crux of string interpolation: It looks nice in simple examples, but is it still nice when you need to encode your variables for the output? I think that should be a goal. We don't want to encourage writing bad code by making it more beautiful than correct code.

Unless I'm missing something (I've only skimmed your PRs), you don't have mechanisms to aid in this. So your example would look like this with encoding:

     writeln(i`<html><body>
     <title>$(title.toHTML)</title>
     <name>$(name.toHTML)</name><age>$(age.toHTML)</age>
     <a href="$(link.toHTML)">$(linkName.toHTML)</a>
     </body></html>
`);

That might still be prettier than the alternative with a plain `writeln`, but the difference is less pronounced. And with `writefln` we can do something like this:

     void writeflnToHTML(S ...)(string f, S stuff)
     {
         writefln(f, tupleMap!toHTML(stuff).expand);
     }

     writeflnToHTML(`<html><body>
     <title>%s</title>
     <name>%s</name><age>%s</age>
     <a href="%s">%s</a>
     </body></html>
`, title, name, age, link, linkName);

That's still not pretty at all, but we can't forget a `.toHTML` this way. (Though `tupleMap` isn't in phobos and might be hard to get exactly right.)

Ideally, something like that would be possible with interpolated strings, too.
Mar 17
next sibling parent reply kinke <noone nowhere.com> writes:
I like the idea and find the tuple approach very elegant.

On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
     void writeflnToHTML(S ...)(string f, S stuff)
     {
         writefln(f, tupleMap!toHTML(stuff).expand);
     }
     writeflnToHTML(`<html><body>
         <title>%s</title>
         <name>%s</name><age>%s</age>
         <a href="%s">%s</a>
         </body></html>
     `, title, name, age, link, linkName);

 That's still not pretty at all, but we can't forget a `.toHTML` 
 this way. (Though `tupleMap` isn't in phobos and might be hard 
 to get exactly right.)

 Ideally, something like that would be possible with 
 interpolated strings, too.
As long as the embedded expressions are always separated by a string literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => `"", a, "", b`), these embedded expressions can be identified (and appropriately transformed etc.) by an odd index.
Mar 17
parent ag0aep6g <anonymous example.com> writes:
On 18.03.19 00:21, kinke wrote:
 As long as the embedded expressions are always separated by a string 
 literal in the resulting tuple, incl. an empty one (`i"$(a)$(b)"` => 
 `"", a, "", b`), these embedded expressions can be identified (and 
 appropriately transformed etc.) by an odd index.
I see. Like this:

     void writelnToHTML(S ...)(S stuff)
     {
         static foreach (i, thing; stuff)
         {
             if (i % 2 == 0)
                 write(thing);
             else
                 write(thing.toHTML);
         }
         writeln;
     }

     writelnToHTML(i`<html><body>
     <title>$(title)</title>
     <name>$(name)</name><age>$(age)</age>
     <a href="$(link)">$(linkName)</a>
     </body></html>
`);

That's pretty good. It's not quite foolproof; one might put arguments before/after the interpolated string, messing up the arrangement. But other than that it seems nice.
Mar 18
prev sibling parent Olivier FAURE <couteaubleu gmail.com> writes:
On Sunday, 17 March 2019 at 14:01:36 UTC, ag0aep6g wrote:
 Either way, you likely got yourself an HTML injection.

 That might be the crux of string interpolation: It looks nice 
 in simple examples, but is it still nice when you need to 
 encode your variables for the output?
One way to deal with these cases would be to have an alternate string interpolation syntax for format strings, eg:

     fi"SELECT $field FROM $table"

is lowered to:

     "SELECT %s FROM %s", field, table

Other alternatives have been suggested (eg interpolation creating delegates that can be passed at compile time), but I think the above solution is the most KISS and elegant. It encourages robust design, where the only argument parsed is the first one and every other argument is sanitized by default.
Mar 18
prev sibling next sibling parent reply Rubn <where is.this> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 [...]
Seems you've done everything but write a DIP. I don't really see why this feature should be exempt from the process. Even if the process isn't the greatest, that isn't reason enough to circumvent it.
Mar 17
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
 Seems you've done everything but write a DIP, I don't really 
 see why this feature should be exempt from the process. Even if 
 the process isn't the greatest, that isn't reason enough for it 
 to circumvent it.
I'll have to disagree with you here. I'm not sure if you've written a DIP, but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A, and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened).

The proposal itself is pretty simple, but the ramifications of the change weren't so clear. A discussion with the leadership about the points they had questions on would have been very helpful early on, to know where to put effort into researching the proposal. However, that's not how the DIP process is written to work. After Andrei's response I attempted to discuss their concerns, but everything was filtered through the DIP manager Michael Parker, and they never responded to my comments and questions. We left off with them providing an example library implementation and asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself, as they had implemented different semantics than what the DIP was proposing, but they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".

In my opinion the DIP process is broken. I don't want to introduce a potentially good feature for D into a system where I believe it will actually harm the chances for the feature rather than help them. If the process is fixed, however, I will gladly create a DIP and would look forward to really hashing out the feature and seeing how it could best be implemented. D has a big potential to make a splash with this feature by showcasing its meta-programming capabilities with a zero-overhead implementation of string interpolation that also happens to be the most powerful/flexible one out there. 
I've heard some pretty cool ideas from people on it that I hadn't thought of, and would love to work with the community on creating a robust, well-researched proposal. But I believe that if I use the current DIP system as it exists to introduce it, that will actually make it more likely to fail than if I didn't write a DIP at all at this point.

Please understand, I don't shy away from good, robust work and research. I'm a highly motivated mathematician who loves optimizing and finding elegant solutions. That's what I like spending my time on. Researching language proposals is exactly the type of work I like to do. I spend a lot of time reading and researching other languages and the features they bring to the table. I would very much enjoy contributing to a DIP process that fosters collaboration and feedback and results in more consensus and communal understanding.
Mar 17
parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 17 March 2019 at 16:50:13 UTC, Jonathan Marler wrote:
 On Sunday, 17 March 2019 at 14:20:20 UTC, Rubn wrote:
 Seems you've done everything but write a DIP, I don't really 
 see why this feature should be exempt from the process. Even 
 if the process isn't the greatest, that isn't reason enough 
 for it to circumvent it.
I'll have to disagree with you here. I'm not sure if you've written a DIP, but I have. I started DIP 1011 about 2 years ago. After about a year, it was forwarded to W&A and I got a response from Andrei that contained a fair number of errors. To me it seemed that he didn't take enough time to read/understand the proposal (I've heard this isn't the first time this has happened). The proposal itself is pretty simple, but the ramifications of the change weren't so clear. A discussion with the leadership about the points they had questions about would have been very helpful early on, to know where to put effort into researching the proposal. However, that's not how the DIP process is written to work. After Andrei's response I attempted to discuss their concerns, but everything was filtered through the DIP manager Michael Parker and they never responded to my comments and questions. We left off with them providing an example library implementation and asking me to comment on it. I did so, explaining that their example was incorrect and had little bearing on the DIP itself, as they had implemented different semantics than what the DIP was proposing, but they never responded. That was about a year ago, and the DIP is still considered to be in "Formal Review".
That does smack of my experience with DIP 1016. I've added DIP 1011 to the list of DIP-related stuff for the DConf AGM[1].
 In my opinion the DIP process is broken.  I don't want to 
 introduce a potentially good feature for D into a system where 
 I believe it will actually harm the chances for the feature 
 rather than help them.  If the process is fixed however, I will 
 gladly create a DIP and would look forward to really hashing 
 out the feature and seeing how it could best be implemented.
I hope to get the DIP process fixed at the AGM, but for making progress on the topic of interpolated strings it would really help if you had a draft (or stole/polished the one from the DIP PR queue), so that we have something concrete to discuss (i.e. get (part of) a community round done).
 I believe if I use the current DIP system as it exists to 
 introduce it, it will actually make it more likely to fail than 
 if I didn't write a DIP at all at this point.

 Please understand, I don't shy away from good, robust work and 
 research.  I'm a highly motivated mathematician, who loves 
 optimizing and finding elegant solutions.  That's what I like 
 spending my time on.  Researching language proposals is exactly 
 the type of work I like to do. I spend alot of time reading and 
 researching other languages and the features they bring to the 
 table. I would very much enjoy contributing to a DIP process 
 that fosters collaboration and feedback that results in more 
 consensus and communal understanding.
This really does highlight the problems with our organisation. We really want to be getting all the best work you can provide, and if that is bottlenecked on the current DIP process then we definitely need to look at why that is and how to fix it.

[1]: http://dconf.org/2019/talks/agm.html
Mar 17
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 3. Performance.  No matter what we do, any library solution 
 will never be as fast as a language solution.
Must consider the price. Language bloat is not free.
 CON 4. IDE/Editor Support.  A library solution won't be able to 
 have IDE/Editor support for syntax highlighting, auto-complete, 
 etc.  When the editor sees an interpolated string, it will be 
 able to highlight the code inside it just like normal code.
There's IDE support for format string validation for C and C#.
 CON 5. Full solution requires full copy of lexer/parser.  One 
 big problem with a library solution is that it will be hard for 
 a library to delimit interpolated expressions.  For full 
 support, it will need to have a full implementation of the 
 compiler's lexer/parser.
The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Mar 17
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Sunday, 17 March 2019 at 17:05:46 UTC, Kagamin wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 CON 3. Performance.  No matter what we do, any library 
 solution will never be as fast as a language solution.
Must consider the price. Language bloat is not free.
Right, which is why I mentioned further down that we must consider the price. I know it's a long post, though; I don't blame you for not reading it all :)
 CON 4. IDE/Editor Support.  A library solution won't be able 
 to have IDE/Editor support for syntax highlighting, 
 auto-complete, etc.  When the editor sees an interpolated 
 string, it will be able to highlight the code inside it just 
 like normal code.
There's IDE support for format string validation for C and C#.
Which works in some cases. It can't work in all cases unless language support is added. Still a CON for the library solution.
 CON 5. Full solution requires full copy of lexer/parser.  One 
 big problem with a library solution is that it will be hard 
 for a library to delimit interpolated expressions.  For full 
 support, it will need to have a full implementation of the 
 compiler's lexer/parser.
The interpolated string syntax is as complex as the whole D language? That sounds bad. AFAIK IDEs don't even support UFCS yet even though it should be simple.
The full solution requires a full copy of the lexer/parser, but the syntax for interpolated strings is just allowing string literals to be prefixed with the letter 'i'. You've confused the syntax of interpolated strings with the complexity of the arbitrary D expressions inside them. If you want to be able to write any code inside those expressions, then you need the full lexer/parser to parse it. A language solution already has access to this, but a library implementation will need to either copy the full implementation or make compromises in what it can support.
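To make that compromise concrete, here's a rough sketch of the kind of CTFE helper a library solution might use (`interp` here is hypothetical, not the code from the Phobos PR). It delimits expressions by naive paren counting, which is exactly where it falls short of the compiler's parser:

```d
// Hypothetical CTFE helper, not from the actual PR: turns
// `a is $(x)` into the string `text("a is ", x)` for use with mixin
// (the mixin site needs std.conv.text in scope).
string interp(string s)
{
    string result = "text(";
    bool first = true;
    size_t i = 0;
    while (i < s.length)
    {
        // copy the literal run up to the next '$'
        size_t start = i;
        while (i < s.length && s[i] != '$')
            i++;
        if (i > start)
        {
            result ~= (first ? "" : ", ") ~ `"` ~ s[start .. i] ~ `"`;
            first = false;
        }
        if (i >= s.length)
            break;
        i += 2; // skip "$(" -- assumes '$' is always followed by '('
        start = i;
        int depth = 1;
        // naive balance counting: a ')' inside a string literal in the
        // expression, e.g. $( i ~ ")" ), ends the scan too early --
        // avoiding that requires a real lexer
        while (i < s.length && depth > 0)
        {
            if (s[i] == '(') depth++;
            else if (s[i] == ')') depth--;
            if (depth > 0) i++;
        }
        result ~= (first ? "" : ", ") ~ s[start .. i];
        first = false;
        i++; // skip the terminating ')'
    }
    return result ~ ")";
}

// mixin(interp("a is $(a) and the sum is $(a + b)"))
// expands to: text("a is ", a, " and the sum is ", a + b)
```

Counting parens covers the common cases, but as soon as an expression contains an unbalanced ')' inside a string literal, the split lands in the wrong place, and fixing that properly means re-implementing pieces of the lexer.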
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this, but you're criticizing how the library solution looks, which was one of my main points :) And it's a contrived example meant to demonstrate something that would be hard for a library to parse. In reality you wouldn't write it that way (hence why I called it "contrived"). With full interpolation support you would write it like this: i"$(i)) entry $(array[i])"
Mar 17
parent reply Kagamin <spam here.lot> writes:
On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
 And it's a contrived example to demonstrate an example that 
 would be hard for a library to parse.
It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?
Mar 18
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/18/19 8:12 AM, Kagamin wrote:
 On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
Yeah, I think you could write it just like the language solution: mixin(interp(`$(i)) entry $(array[i])`)); You'd still have to deal with the stray parenthesis as a string, but I'm sure there are other expressions inside the escapes, more likely to be found in the wild, that would require a lexer/parser. It may not need to be a full one, though; we don't need to make an AST out of it.
 
 And it's a contrived example to demonstrate an example that would be 
 hard for a library to parse.
It's hard to parse even for a library? That sounds bad. Is it required to be hard to parse?
If you look at the PR that he created, it's super-simple. It uses the already existing parser/lexer in the front end. The point he's making is that we'd have to DUPLICATE that for a library solution. -Steve
Mar 18
parent reply Kagamin <spam here.lot> writes:
On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer 
wrote:
 If you look at the PR that he created, it's super-simple. It 
 uses the already existing parser/lexer in the front end.
Hmm... the PR seemingly doesn't support the full nested language: the string is lexed as a normal double-quoted string. This is different from, say, JavaScript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all of their content.
 The point he's making is that we'd have to DUPLICATE that for a 
 library solution.
That would be a problem if they supported the full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.
Mar 19
parent Jonathan Marler <johnnymarler gmail.com> writes:
On Tuesday, 19 March 2019 at 08:13:16 UTC, Kagamin wrote:
 On Monday, 18 March 2019 at 13:05:37 UTC, Steven Schveighoffer 
 wrote:
 If you look at the PR that he created, it's super-simple. It 
 uses the already existing parser/lexer in the front end.
Hmm... the pr seemingly doesn't support full nested language: the string is lexed as a normal double quoted string. This is different from, say, javascript, where interpolated strings support nesting without escapes and thus can't be lexed without parsing all their content.
 The point he's making is that we'd have to DUPLICATE that for 
 a library solution.
If they supported full nested language, but they don't. So I see no reason to assume that they should be difficult to parse.
You're absolutely right here. Thanks for taking the time to look at the code, by the way. The current implementation isn't finished; I just wanted to start out with something easy to explore the feature. So today the language solution has the limitation that parens inside the code need to be balanced, but that limitation can be fixed with a bit more work. I just need to combine the current two-pass interpolation logic into one pass, but that would require some extra changes in the parse code, and at this stage I wanted to minimize turmoil. So it's a temporary compromise; the final solution wouldn't have this limitation.
Mar 19
prev sibling parent Jonathan Marler <johnnymarler gmail.com> writes:
On Monday, 18 March 2019 at 12:12:12 UTC, Kagamin wrote:
 On Sunday, 17 March 2019 at 18:32:55 UTC, Jonathan Marler wrote:
 foreach (i; 0 .. 10)
 {
     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
 }
That doesn't look nice, does it?
Not sure if you realize this but you're criticizing how the library solution looks, which was one of my main points :)
I mean $( i ~ ")
Yeah, like I said, you wouldn't write that in real code. It's an example to demonstrate something that would be hard for a library to parse.
 And it's a contrived example to demonstrate an example that 
 would be hard for a library to parse.
It's hard to parse even for a library? That sounds bad.
Yeah, it is bad. Hence why I listed it as a CON of the library solution. But it's very easy if it's implemented in the compiler, since it has access to the parser.
 Is it required to be hard to parse?
The problem is knowing when right-paren ')' characters are supposed to be part of the code and when they are being used to end the code. For the parser this is easy to determine, but without it you now have to go through all the tokens/grammar nodes to see where these right-paren characters can appear (i.e. string literals, function calls, templates, etc). You could implement a simple heuristic where you support as many right-paren characters in your code as there are left-paren characters, which will get you most of the way there, but now you've put an arbitrary limitation on the code that can appear inside these interpolated expressions.

The point is, a language solution doesn't have this problem. It doesn't need to put any limitations on the code, and it doesn't have to come up with a mechanism to know when paren characters are or are not a part of the code. This problem is non-existent for an implementation inside the compiler.
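To illustrate that middle ground (a hypothetical sketch, not code from any PR): a scanner that balances parens and also skips quoted literals gets further than pure counting, but every additional case it covers pulls in another piece of D's lexer:

```d
// Hypothetical sketch: find the ')' that ends an interpolated
// expression, given the text that follows "$(". Balances parens and
// skips quoted literals, but still knows nothing about comments,
// token strings (q{...}), or the rest of D's grammar.
bool findExprEnd(string s, out size_t end)
{
    int depth = 1;
    size_t i = 0;
    while (i < s.length)
    {
        char c = s[i];
        if (c == '"' || c == '\'' || c == '`')
        {
            // skip over a quoted literal so any ')' inside it is ignored
            i++;
            while (i < s.length && s[i] != c)
                i += (c != '`' && s[i] == '\\') ? 2 : 1; // honor escapes
        }
        else if (c == '(')
            depth++;
        else if (c == ')' && --depth == 0)
        {
            end = i;
            return true;
        }
        i++;
    }
    return false; // unbalanced: no terminating ')' found
}

// handled:     i ~ ")"       the ')' inside the string is skipped
// handled:     f(a, (b))     nested parens balance out
// still wrong: q{ ) }        token strings need the real lexer
// still wrong: /* ) */ x     so do comments
```

Each "still wrong" case forces the library to copy yet another lexer rule, which is the duplication a language solution avoids entirely.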
Mar 18
prev sibling parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Mar 18
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/18/19 5:07 AM, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case?     i"$a $b" ~ i$"c $d"
Hm... it would be good for the feature to handle this (and treat it as i"$a $b$c $d"). I'm assuming that i$"c is a typo and you meant i"$c. -Steve
Mar 18
prev sibling parent reply Meta <jared771 gmail.com> writes:
On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
Mar 18
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via Digitalmars-d wrote:
 On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
[...] I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`? T -- Tell me and I forget. Teach me and I remember. Involve me and I understand. -- Benjamin Franklin
Mar 18
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Monday, 18 March 2019 at 16:48:59 UTC, H. S. Teoh wrote:
 On Mon, Mar 18, 2019 at 04:30:26PM +0000, Meta via 
 Digitalmars-d wrote:
 On Monday, 18 March 2019 at 09:07:26 UTC, Olivier FAURE wrote:
 On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler 
 wrote:
 https://github.com/dlang/dmd/pull/7988
By the way, quick question, how does the PR handle this case? i"$a $b" ~ i$"c $d"
Also, recursive use of interpolated strings. Does this work or fail with a compiler error like "can't find symbol c": int a = 2, b = 5; enum code = text(i"mixin(\"int c = $(a + b)\"); writeln(i\"a = $a, b = $b, c = $c\");"); mixin(code);
[...] I'd expect you'd have to escape the $ in the nested interpolated string. I forgot what the escape was, perhaps `$$`?
\$ would be consistent with normal C and D quoting rules. $$ not so much.
Mar 18