
digitalmars.D - Phobos begat madness

reply Walter Bright <newshound2 digitalmars.com> writes:
In trying to get Phobos to compile with dip1000, I was faced with the following innocuous-looking line:

   assert(equal(achr, arr), text(a.byCodepoint));

   https://github.com/dlang/phobos/blob/master/std/uni.d#L3763

and the compiler complained that `text(a.byCodepoint)` was a @system call. I decided to accept the Mission Impossible of figuring out why it was inferred as @system.

I found myself getting lost in a maze of twisty template expansions, all looking slightly different:

   text
   begat textImpl
   begat to
   begat toImpl
   begat toStr
   begat formatValue
   begat formatValueImpl
   begat put
   begat doPut
   begat Appender.put
   begat Appender.put

These templates are all just wrappers around wrappers. They're laden with overloads, constraints and static ifs, so finding just what code is being compiled with what parameters is absurdly difficult. I can't begin to defend this.

Converting a range of empty/front/popFront to a string should be a simple, 
direct piece of code.

For example, simple conversion to a string should not need to detour through a 
complex formatting function.
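
Roughly speaking, something along these lines ought to suffice for a range of characters (just a sketch to illustrate the point, not proposed Phobos code; `appender` is std.array's):

---
import std.array : appender;
import std.range.primitives : ElementType, isInputRange;

// Sketch only: turn an input range of characters into a string with one
// plain loop, no detour through the format machinery.
string rangeToString(R)(R r)
if (isInputRange!R && is(ElementType!R : dchar))
{
    auto app = appender!string();
    for (; !r.empty; r.popFront())
        app.put(r.front);   // Appender encodes dchar elements as UTF-8
    return app.data;
}
---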

(For those wondering about the genesis of https://github.com/dlang/dmd/pull/8384, trying to figure this out is what motivated it. It saved me lots of time, especially in finding which overload was being called.)

If someone wants to figure out how to build an uncomplicated, straightforward, efficient implementation of text() that we can be proud of, that would be a nice contribution.
Jun 22 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
 I decided to accept the Mission Impossible of figuring out why 
 it was inferred as @system.
This would be trivial if the compiler had decent error messages. https://issues.dlang.org/show_bug.cgi?id=17374

It could just tell us the bottom-most call that was not @safe (or @nogc or nothrow or whatever) and not inferred, and boom, problem solved in seconds.
Jun 22 2018
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/22/18 6:03 PM, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
 I decided to accept the Mission Impossible of figuring out why it was 
 inferred as @system.
This would be trivial if the compiler had decent error messages. https://issues.dlang.org/show_bug.cgi?id=17374 It could just tell us the bottom-most call that was not @safe (or @nogc or nothrow or whatever) and not inferred, and boom, problem solved in seconds.
That would be awesome. I've gone through the "annotate this and continue" exercise, and it is not fun. Especially when the template is used in many places, you have to make a copy of the template to play with.

-Steve
Jun 22 2018
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 22 June 2018 at 22:25:35 UTC, Steven Schveighoffer 
wrote:
 I've gone through the "annotate this and continue" exercise, 
 and it is not fun. Especially when the template is used in many 
 places, you have to make a copy of the template to play with.
Aye. I think that's the most straightforward implementation though: make the compiler on the inside just set the inferred flag, then rerun the semantic pass and collect the errors until it gets to the bottom.

(Then perhaps cut out some of the errors from the middle for brevity of the message, just showing the top and bottom parts of the call chain, since that is most likely what the user actually would want to know to fix it.)
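
Something like this for the trimming part (purely a sketch of the idea, nothing to do with dmd's actual internals; the names are made up):

---
import std.conv : to;

// Sketch only: given the full chain of calls that made attribute inference
// fail, keep the top and bottom few entries and elide the middle.
string[] trimCallChain(string[] calls, size_t keep = 3)
{
    if (calls.length <= 2 * keep)
        return calls;
    immutable elided = calls.length - 2 * keep;
    return calls[0 .. keep]
         ~ ("... " ~ elided.to!string ~ " calls elided ...")
         ~ calls[$ - keep .. $];
}
---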
Jun 22 2018
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/22/2018 3:03 PM, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
 I decided to accept the Mission Impossible of figuring out why it was inferred 
 as @system.
This would be trivial if the compiler had decent error messages. https://issues.dlang.org/show_bug.cgi?id=17374 It could just tell us the bottom-most call that was not @safe (or @nogc or nothrow or whatever) and not inferred, and boom, problem solved in seconds.
That would help, do you want to implement it? (The begats are still a problem regardless.)
Jun 22 2018
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 22 June 2018 at 22:28:17 UTC, Walter Bright wrote:
 That would help, do you want to implement it?
I can't for at least two weeks, I'm about to fly tomorrow... and I'm getting married in August too, so my life is kinda hectic. But I might look at it around July 4 if I get a chance and it isn't done by then.
 (The begats are still a problem regardless.)
Yeah, I don't like the way `to` is implemented either.
Jun 22 2018
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/22/18 6:36 PM, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 22:28:17 UTC, Walter Bright wrote:
 That would help, do you want to implement it?
I can't for at least two weeks, I'm about to fly tomorrow... and I'm getting married in August too, so my life is kinda hectic.
Congrats! -Steve
Jun 22 2018
parent Ali <fakeemail example.com> writes:
On Friday, 22 June 2018 at 22:39:40 UTC, Steven Schveighoffer 
wrote:
 On 6/22/18 6:36 PM, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 22:28:17 UTC, Walter Bright wrote:
 That would help, do you want to implement it?
I can't for at least two weeks, I'm about to fly tomorrow... and I'm getting married in August too, so my life is kinda hectic.
Congrats! -Steve
Congrats :)
Jun 22 2018
prev sibling next sibling parent Bauss <jj_1337 live.dk> writes:
On Friday, 22 June 2018 at 22:36:23 UTC, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 22:28:17 UTC, Walter Bright wrote:
 That would help, do you want to implement it?
I can't for at least two weeks, I'm about to fly tomorrow... and I'm getting married in August too, so my life is kinda hectic. But I might look at it around July 4 if I get a chance and it isn't done by then.
 (The begats are still a problem regardless.)
Yeah, I don't like the way `to` is implemented either.
Hijacking the thread with your wedding. I'm just kidding. Congratulations and the best of wishes for your future with your soon-to-be wife.
Jun 22 2018
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/23/18 1:36 AM, Adam D. Ruppe wrote:
 On Friday, 22 June 2018 at 22:28:17 UTC, Walter Bright wrote:
 That would help, do you want to implement it?
I can't for at least two weeks, I'm about to fly tomorrow... and I'm getting married in August too, so my life is kinda hectic.
Congratulations, Adam! Andrei
Jun 23 2018
prev sibling parent David Bennett <davidbennett bravevision.com> writes:
On Friday, 22 June 2018 at 22:36:23 UTC, Adam D. Ruppe wrote:
 I can't for at least two weeks, I'm about to fly tomorrow... and 
 I'm getting married in August too, so my life is kinda hectic.
Congrats!
Jun 24 2018
prev sibling next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
   text
   begat textImpl
   begat to
   begat toImpl
   begat toStr
   begat formatValue
   begat formatValueImpl
   begat put
   begat doPut
   begat Appender.put
   begat Appender.put
Hmm, this looks suboptimal for more than one reason.

Wouldn't `text` benefit from using an appender in its implementation, rather than letting every invocation of `to` have its own? I.e. there ought to be one Appender per `text` invocation, rather than one per `text` argument. `textImpl` seems to already have a branch for >1 arguments, but it still invokes `to!S(arg)` for them.
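
A minimal sketch of what I mean (hypothetical code, not the current `textImpl`; `appender` and `put` are from std.array and std.range):

---
import std.array : appender;
import std.conv : to;
import std.range.primitives : put;

// Hypothetical: one Appender for the whole call, filled once per argument,
// instead of a fresh Appender inside every to!S(arg).
string textOneAppender(Args...)(Args args)
if (Args.length > 0)
{
    auto app = appender!string();
    foreach (arg; args)
    {
        static if (is(typeof(arg) : const(char)[]))
            put(app, arg);             // strings go straight in, no intermediate allocation
        else
            put(app, to!string(arg));  // still one allocation per argument; formatting
                                       // in place would be the next step
    }
    return app.data;
}
---

Usage would be e.g. `textOneAppender("x = ", 42, ", y = ", 3)`, with the final string assembled in the one appender.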
Jun 23 2018
parent David Bennett <davidbennett bravevision.com> writes:
On Saturday, 23 June 2018 at 20:17:01 UTC, Vladimir Panteleev 
wrote:
 On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
   text
   begat textImpl
   begat to
   begat toImpl
   begat toStr
   begat formatValue
   begat formatValueImpl
   begat put
   begat doPut
   begat Appender.put
   begat Appender.put
Hmm, this looks suboptimal for more than one reason. Wouldn't `text` benefit from using an appender in its implementation, rather than letting every invocation of `to` have its own? I.e. there ought to be one Appender per `text` invocation, rather than one per `text` argument. `textImpl` seems to already have a branch for >1 arguments, but it still invokes `to!S(arg)` for them.
`text` has an appender [0] but it calls `to!` for every non-integer, so it effectively does O(n+1) GC allocations. We should be able to get this down to O(1).

The initially obvious way to do this would be allowing `to!` to take an `OutputRange`... but this would not improve the issue Walter had in the OP with trying to trace template expansions. (In fact it would make it worse and add a lot of template bloat.)

So my idea is to have a lower-level API that is `@safe @nogc pure nothrow`, takes already allocated memory and writes to it, like:

---
// Returning W[] here with a slice of the original buffer so you can use it in tests easily.
@safe @nogc pure nothrow W[] writeCharsTo(T, W)(T input, scope W[] buffer);
---

This could then be used in both `to!` and `text`, or directly if the user wanted a @nogc option (ie a json generator, or use with std.io etc).

I've done some benchmarks with integers and `to!` becomes almost twice as fast with `dmd -O -release -inline` and still ~30% faster with `ldc2 -O3 -release` (as it's zero-copy, unlike the current `toChars` InputRange, which copies when used in `to!`). Early results of a `text` benchmark show even better speed improvements. I will clean up the benchmark code I have and post it for you to see tonight.

In principle, would this type of addition be accepted into Phobos?

Thanks,
David

[0] https://github.com/dlang/phobos/blob/master/std/conv.d#L4243
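
For a feel of what one concrete instance could look like, here is a rough sketch for unsigned integers (hypothetical, just the shape of the idea, not my actual implementation):

---
import std.traits : isUnsigned;

// Rough sketch: write the decimal digits of an unsigned integer into
// caller-provided memory and return the slice that was actually used.
@safe @nogc pure nothrow
char[] writeCharsTo(T)(T input, return scope char[] buffer)
if (isUnsigned!T)
{
    size_t i = buffer.length;
    do
    {
        buffer[--i] = cast(char)('0' + input % 10);
        input /= 10;
    } while (input != 0);
    return buffer[i .. $];
}
---

e.g. `char[20] tmp; auto s = 123u.writeCharsTo(tmp[]);` gives "123" with no GC allocation.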
Jun 24 2018
prev sibling parent reply David Bennett <davidbennett bravevision.com> writes:
On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
 If someone wants to figure out how to build an uncomplicated, 
 straightforward, efficient implementation of text() that we can 
 be proud of, that would be a nice contribution.
Here [0] is part of a benchmark for implementing `to!string` with the lower-level API design I was talking about in my other post [1]. On my machine I get these results:

$ dmd -dip1000
to!(string):        17 secs, 88 ms, 322 μs, and 1 hnsec
newTo!(string):     8 secs, 785 ms, 325 μs, and 1 hnsec
@nogc writeCharsTo: 4 secs, 404 ms, 21 μs, and 7 hnsecs

$ dmd -dip1000 -O -release -inline
to!(string):        9 secs, 396 ms, 688 μs, and 1 hnsec
newTo!(string):     4 secs, 640 ms, and 698 μs
@nogc writeCharsTo: 1 sec, 288 ms, 176 μs, and 9 hnsecs

$ ldc2 -dip1000 -O3 -release
to!(string):        4 secs, 489 ms, 52 μs, and 3 hnsecs
newTo!(string):     3 secs, 781 ms, 990 μs, and 5 hnsecs
@nogc writeCharsTo: 752 ms, 989 μs, and 5 hnsecs

I will upload the benchmark for `text` tonight.

Other points worth making: using this low-level API will work in -betterC code. It might even fit into utiliD [2].

[0] https://run.dlang.io/gist/97639a90931ad8e6e3a1ea564c46d710?args=-O%20-release%20-inline%20-dip1000
[1] https://forum.dlang.org/post/erxtrmnqbcbtgokoutbc forum.dlang.org
[2] https://github.com/JinShil/utiliD
Jun 24 2018
parent David Bennett <davidbennett bravevision.com> writes:
On Monday, 25 June 2018 at 01:03:03 UTC, David Bennett wrote:
 On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
 If someone wants to figure out how to build an uncomplicated, 
 straightforward, efficient implementation of text() that we 
 can be proud of, that would be a nice contribution.
I will upload the benchmark for `text` tonight.
I have made a more complete benchmark [0] so we can see the speed of each basic type.

Here are the benchmark results for dmd:

|-----------------------------------------------------------------|
| <convert> | .to!(string) | .newTo!(string) | .writeCharsTo(buf) |
|-----------------------------------------------------------------|
| ubyte     |      1856 ms |         1037 ms |             201 ms |
| ushort    |      2336 ms |         1350 ms |             382 ms |
| uint      |      3415 ms |         1538 ms |             669 ms |
| ulong     |      6171 ms |         2275 ms |            1398 ms |
| byte      |      1908 ms |         1044 ms |             227 ms |
| short     |      2234 ms |         1364 ms |             406 ms |
| int       |      3560 ms |         1603 ms |             629 ms |
| long      |      6276 ms |         2219 ms |            1369 ms |
| char      |      3013 ms |         1061 ms |             181 ms |
| wchar     |      3285 ms |         1043 ms |             242 ms |
| dchar     |      3211 ms |         1048 ms |             277 ms |
|-----------------------------------------------------------------|

|-----------------------------------------------------------------|
| <concat>  | text(...)    | newText(...)    | .writeCharsTo(buf) |
|-----------------------------------------------------------------|
| ubyte     |      2643 ms |          937 ms |             254 ms |
| ushort    |      3237 ms |         1173 ms |             530 ms |
| uint      |      5144 ms |         1448 ms |             738 ms |
| ulong     |      9854 ms |         2126 ms |            1427 ms |
| byte      |      2805 ms |          951 ms |             310 ms |
| short     |      3272 ms |         1268 ms |             583 ms |
| int       |      5548 ms |         1456 ms |             819 ms |
| long      |     10303 ms |         2182 ms |            1488 ms |
| char      |      3853 ms |          891 ms |             258 ms |
| wchar     |      4151 ms |          940 ms |             263 ms |
| dchar     |      3945 ms |          949 ms |             352 ms |
|-----------------------------------------------------------------|

And ldc2:

|-----------------------------------------------------------------|
| <convert> | .to!(string) | .newTo!(string) | .writeCharsTo(buf) |
|-----------------------------------------------------------------|
| ubyte     |      1003 ms |         1011 ms |              88 ms |
| ushort    |       959 ms |          772 ms |             163 ms |
| uint      |      1519 ms |         1237 ms |             382 ms |
| ulong     |      2030 ms |         1484 ms |             735 ms |
| byte      |       899 ms |          886 ms |             121 ms |
| short     |      1115 ms |         1061 ms |             189 ms |
| int       |      1267 ms |         1041 ms |             390 ms |
| long      |      2030 ms |         1596 ms |             852 ms |
| char      |      1958 ms |          755 ms |              28 ms |
| wchar     |      1945 ms |          836 ms |             129 ms |
| dchar     |      2141 ms |          850 ms |             142 ms |
|-----------------------------------------------------------------|

|-----------------------------------------------------------------|
| <concat>  | text(...)    | newText(...)    | .writeCharsTo(buf) |
|-----------------------------------------------------------------|
| ubyte     |      1248 ms |          448 ms |             164 ms |
| ushort    |      1676 ms |          612 ms |             290 ms |
| uint      |      1491 ms |          732 ms |             445 ms |
| ulong     |      2343 ms |         1074 ms |            1095 ms |
| byte      |      1288 ms |          506 ms |             143 ms |
| short     |      1494 ms |          656 ms |             311 ms |
| int       |      1315 ms |          633 ms |             388 ms |
| long      |      2540 ms |         1363 ms |            1044 ms |
| char      |      2528 ms |          411 ms |             189 ms |
| wchar     |      2600 ms |          506 ms |             149 ms |
| dchar     |      2697 ms |          452 ms |             167 ms |
|-----------------------------------------------------------------|

So I think there are quite a lot of savings to be had... actually, I notice that the char version of text(...) is really slow when it should be quite simple... oh, it's being converted using `to!` to a GC array and then copied in... I'll submit a PR... [1]

That's better (DMD master + my patch):

|------------------------|
| <concat>  |  text(...) |
|------------------------|
| char      |     753 ms |
| wchar     |     928 ms |
| dchar     |    1028 ms |
|------------------------|

Thanks,
David

[0] https://github.com/complistic-gaff/newconv
[1] https://github.com/dlang/phobos/pull/6615
Jun 26 2018