digitalmars.D - Idea: limited template expansion

Steven Schveighoffer <schveiguy yahoo.com> writes:

I am writing some code for a library, and an interesting situation comes 
up. Let's say you have some code that can deal with wchar, dchar, or 
char arrays. You template the code based on the code unit type, but the 
input comes in as a file stream (which is bytes).

What I have been doing is this:

void foo(C)(C[] buf) {...}


ubyte[] buffer = ...;
switch(detectBOM(buffer))
{
   case UTF8:
      foo(cast(char[])buffer);
      break;
   case UTF16:
      foo(cast(wchar[])buffer);
      break;
   case UTF32:
      foo(cast(dchar[])buffer);
      break;
   default:
      assert(0);
}

Essentially, I'm wrapping a runtime check into a compile-time construct, 
but I always have to deal with this switch mechanism and a lot of 
boilerplate.

It would be cool if the compiler could "expand" a finitely instantiable 
template (one with a finite number of ways it can be instantiated) based 
on a runtime value, and do the switch for me.

Something like:

void foo(BOM b)(ubyte[] buffer) { ... /* cast to correct char type */ }

ubyte[] buffer = ...;
foo!(detectBOM(buffer))(buffer);

would expand into what I wrote manually. It may need a different syntax 
from template instantiation, since it does involve runtime checking.

Does this have any appeal to people? Any other use cases?

-Steve

Jan 20 2016

David Nadlinger <code klickverbot.at> writes:

On Wednesday, 20 January 2016 at 21:27:15 UTC, Steven 
Schveighoffer wrote:
 It would be cool if the compiler could "expand" a finitely 
 instantiable template (one with a finite number of ways it can 
 be instantiated) based on a runtime value, and do the switch 
 for me.

Can't you just write a wrapper function that takes the template 
function as a compile-time argument and generates the switch for 
you by iterating over the std.traits.EnumMembers of said 
compile-time parameter?

Something like `enumTemplateDispatch!foo(detectBOM(buffer), 
<runtime args>)`.

Depending on what template signatures you want to support, you 
might need to make enumTemplateDispatch explicitly take the enum 
type to use or the position of the respective parameter in the 
template argument list of the target.

  — David

Jan 20 2016
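
For readers following along, here is a minimal sketch of what such a wrapper could look like. It is only an illustration of the suggestion above, not code from the thread: the `BOM` enum definition, the `codeUnitSize` target, and the exact signature of `enumTemplateDispatch` are all assumptions. It also uses `static foreach`, which is only available in compilers newer than this thread; a plain `foreach` over `EnumMembers` was the usual spelling at the time.

```d
import std.traits : EnumMembers;

// Hypothetical enum; the thread only shows the member names.
enum BOM { UTF8, UTF16, UTF32 }

// Hypothetical target: a function template parameterized on a BOM value.
size_t codeUnitSize(BOM b)(const(ubyte)[] buffer)
{
    static if (b == BOM.UTF8)
        return char.sizeof;
    else static if (b == BOM.UTF16)
        return wchar.sizeof;
    else
        return dchar.sizeof;
}

// Generates one switch case per enum member at compile time and
// forwards the runtime arguments to the matching instantiation.
auto enumTemplateDispatch(alias fun, E, Args...)(E value, Args args)
    if (is(E == enum))
{
    final switch (value)
    {
        static foreach (member; EnumMembers!E)
        {
            case member:
                return fun!member(args);
        }
    }
}

unittest
{
    ubyte[] buffer = [0x41, 0x00, 0x42, 0x00];
    // The enum value is chosen at run time; the instantiation at compile time.
    assert(enumTemplateDispatch!codeUnitSize(BOM.UTF16, buffer) == 2);
}
```

Note that every instantiation has to return the same type here, since the wrapper itself is an ordinary function with a single return type.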

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 1/20/16 4:44 PM, David Nadlinger wrote:
 On Wednesday, 20 January 2016 at 21:27:15 UTC, Steven Schveighoffer wrote:
 It would be cool if the compiler could "expand" a finitely
 instantiable template (one with a finite number of ways it can be
 instantiated) based on a runtime value, and do the switch for me.

 Can't you just write a wrapper function that takes the template function
 as a compile-time argument and generates the switch for you by iterating
 over the std.traits.EnumMembers of said compile-time parameter?

I'm sure it can be done using some existing techniques, possibly with 
mixins.

However, one thing I didn't mention is the function called foo would 
likely simply be inline code in the higher level function, if it could be.

Imagine if such a call returned something, and you simply used it 
throughout the rest of your function. The compiler could rewrite this as 
a template and compile all three versions and branch to the appropriate 
one, but your code would simply read as a straightforward procedural 
function.

In any case, it was an idea that I had, not sure if it makes a lot of 
sense. It would likely be a huge amount of work on the compiler.

-Steve

Jan 20 2016

David Nadlinger <code klickverbot.at> writes:

On Wednesday, 20 January 2016 at 22:00:45 UTC, Steven 
Schveighoffer wrote:
 Imagine if such a call returned something, and you simply 
 used it throughout the rest of your function. The compiler 
 could rewrite this as a template and compile all three versions 
 and branch to the appropriate one, but your code would simply 
 read as a straightforward procedural function.

But if you abstract away the dispatch part into a (meta)template 
function like I suggested, the caller would still be just as 
linear, right?

How would the client code be simpler with a built-in language 
feature?

  — David

Jan 20 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 1/20/16 6:01 PM, David Nadlinger wrote:
 On Wednesday, 20 January 2016 at 22:00:45 UTC, Steven Schveighoffer wrote:
 Imagine if such a call returned something, and you simply used it
 throughout the rest of your function. The compiler could rewrite this
 as a template and compile all three versions and branch to the
 appropriate one, but your code would simply read as a straightforward
 procedural function.

 But if you abstract away the dispatch part into a (meta)template
 function like I suggested, the caller would still be just as linear, right?

 How would the client code be simpler with a built-in language feature?

The use case I have in mind is a parser, let's say an xml parser.

You have a file; it could be UTF8, UTF16, or UTF32 (not to mention 
endianness, but that just adds to the finite number of possibilities).

Once you have determined the encoding, it doesn't change. So it's not 
advantageous to store the encoding in a runtime variable and check it at 
every turn. It makes more sense to templatize the code that deals with 
the XML parsing based on the code unit type, once that has been determined.

But client code ALSO has to deal with this, not just the library. So the 
implementation details sort of leak out into the caller.

Let's say there is a function parseXML that returns a range (a la 
byLine) that gives you elements from the xml tree.

Instead of a nice thing like:

foreach(element; file.asUTFX.parseXML)
{
    ... client code
}

you have to do something like this:

switch(file.detectBOM)
{
    case UTF8:
        foreach(element; file.asUTF8.parseXML)
        {
            ... client code
        }
        break;
    case UTF16:
        foreach(element; file.asUTF16.parseXML)
        {
            ... client code
        }
        break;
    case UTF32:
        foreach(element; file.asUTF32.parseXML)
        {
            ... client code
        }
        break;
    default:
        assert(0);
}

In other words, the API *requires* you to write this switch thingy (and 
obviously to break up your client code into another function), or use 
some other mechanism. I want to hide the details of what is happening 
there in a simple call so the code is easy to read/write.

-Steve

Jan 21 2016

David Nadlinger <code klickverbot.at> writes:

On Thursday, 21 January 2016 at 13:36:28 UTC, Steven 
Schveighoffer wrote:
 On 1/20/16 6:01 PM, David Nadlinger wrote:
 How would the client code be simpler with a built-in language 
 feature?

 The use case I have in mind is a parser, let's say an xml 
 parser.

 […]

I still don't see how a language feature as described in your 
first post would make this any easier than using a template for 
that exact purpose (switching between methods to call).

If what you are trying to say is that you want the different 
template function instantiations to return incompatible types in 
addition to that, the feature from your initial post won't help 
you there either. Values can't have different types in 
non-template code, so you'd necessarily need to make the client 
code a template too. Streamlining the runtime -> compile time 
value dispatch process (as shown in your initial example) doesn't 
change the fact that the type of a given value cannot be 
influenced by runtime decisions.

  — David

Jan 21 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 1/21/16 10:02 AM, David Nadlinger wrote:
 On Thursday, 21 January 2016 at 13:36:28 UTC, Steven Schveighoffer wrote:
 On 1/20/16 6:01 PM, David Nadlinger wrote:
 How would the client code be simpler with a built-in language feature?

 The use case I have in mind is a parser, let's say an xml parser.

 […]

 I still don't see how a language feature as described in your first post
 would make this any easier than using a template for that exact purpose
 (switching between methods to call).

It may not be an "easier" thing to have the compiler do this instead of 
the user, but conceptually to me it reads better. One can write foreach 
loops as for loops also, it's not really hard. Yet foreach loops read 
and write better conceptually than a for loop. And the boilerplate is 
cut down significantly.

I'm simply trying to get rid of the boilerplate that one would need for 
such a thing. And I'm wondering if this is a unique situation for my use 
case, or if there are other use cases. If it's just for this one case, 
it's likely not worth exploring a modification to the language.

 If what you are trying to say is that you want the different template
 function instantiations to return incompatible types in addition to
 that, the feature from your initial post won't help you there either.
 Values can't have different types in non-template code, so you'd
 necessarily need to make the client code a template too. Streamlining
 the runtime -> compile time value dispatch process (as shown in your
 initial example) doesn't change the fact that the type of a given value
 cannot be influenced by runtime decisions.

The client code would be a template. Essentially, the compiler would 
rewrite the block as a template function and call the appropriate one. 
But the code itself would look like an inline block.

-Steve

Jan 21 2016
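
As a rough illustration of the "client code as a template" idea, the sketch below approximates it in library code by passing the client block as a template lambda, so it is instantiated once per code unit type. Everything named here is an assumption made for the example: the `BOM` enum, the simplified `detectBOM` (little-endian marks only, BOM not stripped, UTF-8 assumed when no mark is found), and the `withCodeUnits` helper.

```d
// Hypothetical enum; the thread only shows the member names.
enum BOM { UTF8, UTF16, UTF32 }

// Simplified, hypothetical detection: little-endian BOMs only.
BOM detectBOM(const(ubyte)[] data)
{
    if (data.length >= 4 && data[0] == 0xFF && data[1] == 0xFE
        && data[2] == 0x00 && data[3] == 0x00)
        return BOM.UTF32;
    if (data.length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return BOM.UTF16;
    return BOM.UTF8;
}

// The switch is written once; clientCode is instantiated separately
// for char[], wchar[] and dchar[].
auto withCodeUnits(alias clientCode)(ubyte[] data)
{
    final switch (detectBOM(data))
    {
        case BOM.UTF8:
            return clientCode(cast(char[]) data);
        case BOM.UTF16:
            return clientCode(cast(wchar[]) data);
        case BOM.UTF32:
            return clientCode(cast(dchar[]) data);
    }
}

unittest
{
    ubyte[] data = [0xFF, 0xFE, 0x41, 0x00];
    // The lambda reads like an inline block but is compiled three times.
    auto units = withCodeUnits!(text => text.length)(data);
    assert(units == 2); // two wchar code units, BOM included
}
```

The client block still has to yield one common return type across all three instantiations, which is the limitation raised earlier in the thread; what this form saves is writing the switch at each call site.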

Sebastiaan Koppe <mail skoppe.eu> writes:

On Wednesday, 20 January 2016 at 21:44:06 UTC, David Nadlinger 
wrote:
 Can't you just write a wrapper function that takes the template 
 function as a compile-time argument and generates the switch 
 for you by iterating over the std.traits.EnumMembers of said 
 compile-time parameter?

 Something like `enumTemplateDispatch!foo(detectBOM(buffer), 
 <runtime args>)`.

I have used EnumMembers a few times and it definitely saved me 
the hassle of writing the switch myself.

However, in this case, isn't there a mapping between enums and 
types that one still needs to specify? That is: ( UTF8=>char[], 
UTF16=>wchar[], UTF32=>dchar[] ).

Jan 21 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 1/21/16 7:18 AM, Sebastiaan Koppe wrote:
 On Wednesday, 20 January 2016 at 21:44:06 UTC, David Nadlinger wrote:
 Can't you just write a wrapper function that takes the template
 function as a compile-time argument and generates the switch for you
 by iterating over the std.traits.EnumMembers of said compile-time
 parameter?

 Something like `enumTemplateDispatch!foo(detectBOM(buffer), <runtime
 args>)`.

 I have used EnumMembers a few times and it definitely saved me the
 hassle of writing the switch myself.

 However, in this case, isn't there a mapping between enums and types
 that one still needs to specify? That is: ( UTF8=>char[],
 UTF16=>wchar[], UTF32=>dchar[] ).

Yeah, you really need an enum to make this make sense (there's not a 
great way to tell the compiler "I only mean char types"). I didn't show 
what it would look like, but one is trivial to write.

-Steve

Jan 21 2016
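
The enum-to-type mapping discussed above was not shown in the thread. One possible way to write it is sketched below; the `BOM` enum, the `CodeUnit` template name, and the body of `foo` are assumptions made for illustration only.

```d
// Hypothetical enum; the thread only shows the member names.
enum BOM { UTF8, UTF16, UTF32 }

// Maps each BOM value to the code unit type it implies.
template CodeUnit(BOM b)
{
    static if (b == BOM.UTF8)
        alias CodeUnit = char;
    else static if (b == BOM.UTF16)
        alias CodeUnit = wchar;
    else static if (b == BOM.UTF32)
        alias CodeUnit = dchar;
    else
        static assert(0, "unhandled BOM value");
}

// foo is parameterized on the enum value and casts the raw bytes
// to the mapped array type, as in the original proposal.
size_t foo(BOM b)(ubyte[] buffer)
{
    alias C = CodeUnit!b;
    auto text = cast(C[]) buffer; // length must be a multiple of C.sizeof
    return text.length;
}

unittest
{
    static assert(is(CodeUnit!(BOM.UTF16) == wchar));

    ubyte[] buffer = [0x41, 0x00, 0x00, 0x00];
    assert(foo!(BOM.UTF8)(buffer) == 4);
    assert(foo!(BOM.UTF32)(buffer) == 1);
}
```

With this in place, the set of valid instantiations is bounded by the enum, which is what makes a generated switch over its members possible.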
