digitalmars.D - Slow code, slow

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
Now that I got your attention:

	https://issues.dlang.org/show_bug.cgi?id=18511

tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with
std.algorithm and std.range templates, compiles *an order of magnitude*
slower than the equivalent hand-written loop.  The way the compiler
compiles templates needs some serious improvement.

(And this is why our current fast-fast-fast slogan annoys me so much.
One can argue that it's misleading advertising, given that what's
considered "idiomatic D", using features like templates and generic code
that's highly-touted as D's strong points, compiles a whole order of
magnitude slower than C-style D.  Makes me cringe every time I hear
"fast code, fast". Our old slogan is a much more accurate description of
the current state of things.)


T

-- 
Don't throw out the baby with the bathwater. Use your hands...
Feb 23 2018
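For readers who have not opened the bug report, the comparison is between two ways of writing the same small function. The pair below is only an illustrative sketch of that kind of contrast, not the code from issue 18511; the names `loopDot` and `rangeDot` are made up here.

```d
import std.algorithm : map, sum;
import std.range : zip;

// "C-style D": a plain loop, no template instantiations needed.
double loopDot(double[] a, double[] b)
{
    assert(a.length == b.length);
    double s = 0.0;
    foreach (i; 0 .. a.length)
        s += a[i] * b[i];
    return s;
}

// "Idiomatic D": the same computation as a range pipeline.
// Each of zip, map and sum is a template the compiler must instantiate.
double rangeDot(double[] a, double[] b)
{
    return zip(a, b).map!(t => t[0] * t[1]).sum;
}

void main()
{
    import std.stdio : writeln;

    auto a = [1.0, 2.0, 3.0];
    auto b = [4.0, 5.0, 6.0];
    writeln(loopDot(a, b));
    writeln(rangeDot(a, b));
}
```

Both functions compute the same value; the thread is about how long the compiler takes to get there, not about the result.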
next sibling parent reply Rubn <where is.this> writes:
On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
 Now that I got your attention:

 	https://issues.dlang.org/show_bug.cgi?id=18511

 tl;dr: A trivial piece of code, written as ostensibly 
 "idiomatic D" with std.algorithm and std.range templates, 
 compiles *an order of magnitude* slower than the equivalent 
 hand-written loop.  The way the compiler compiles templates 
 needs some serious improvement.

 (And this is why our current fast-fast-fast slogan annoys me so 
 much. One can argue that it's misleading advertising, given 
 that what's considered "idiomatic D", using features like 
 templates and generic code that's highly-touted as D's strong 
 points, compiles a whole order of magnitude slower than C-style 
 D.  Makes me cringe every time I hear "fast code, fast". Our 
 old slogan is a much more accurate description of the current 
 state of things.)


 T
It's not that big of a slowdown. Using "fast" you don't import any modules, so they never have to be parsed. That's pretty much all of Phobos you don't have to parse in that example. That's just the initial cost, too. In a big project this won't make a difference. You created a tiny example that is irrelevant at the larger scale and takes 0.3 seconds longer to compile. It's an order of magnitude slower because in your fast example it's literally only parsing 5 lines of code instead of hundreds of lines like it is in your slow example.
Feb 23 2018
next sibling parent reply bauss <jj_1337 live.dk> writes:
On Friday, 23 February 2018 at 20:35:44 UTC, Rubn wrote:
 On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
 Now that I got your attention:

 	https://issues.dlang.org/show_bug.cgi?id=18511

 tl;dr: A trivial piece of code, written as ostensibly 
 "idiomatic D" with std.algorithm and std.range templates, 
 compiles *an order of magnitude* slower than the equivalent 
 hand-written loop.  The way the compiler compiles templates 
 needs some serious improvement.

 (And this is why our current fast-fast-fast slogan annoys me 
 so much. One can argue that it's misleading advertising, given 
 that what's considered "idiomatic D", using features like 
 templates and generic code that's highly-touted as D's strong 
 points, compiles a whole order of magnitude slower than 
 C-style D.  Makes me cringe every time I hear "fast code, 
 fast". Our old slogan is a much more accurate description of 
 the current state of things.)


 T
It's not that big of a slow down. Using "fast" you don't import any modules so they never have to be parsed. That's pretty much all of phobos you don't have to parse in that example. That's just the initial cost too. In a big project this won't make a difference. You create a tiny example that is irrelevant to the larger scale, that takes 0.3 seconds longer to compile. It's a magnitude slower cause in your fast example it's literately only parsing 5 lines of code instead of hundreds of lines like it is in your slow example.
I disagree. It actually matters a lot for big projects with lots of templates, especially nested templates. Gets a whole lot worse when it's templates within mixin templates with templates. It's not just a "0.3" second difference, but can be half a minute or even more.
Feb 23 2018
next sibling parent reply Rubn <where is.this> writes:
On Friday, 23 February 2018 at 20:41:17 UTC, bauss wrote:
 On Friday, 23 February 2018 at 20:35:44 UTC, Rubn wrote:
 On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
 Now that I got your attention:

 	https://issues.dlang.org/show_bug.cgi?id=18511

 tl;dr: A trivial piece of code, written as ostensibly 
 "idiomatic D" with std.algorithm and std.range templates, 
 compiles *an order of magnitude* slower than the equivalent 
 hand-written loop.  The way the compiler compiles templates 
 needs some serious improvement.

 (And this is why our current fast-fast-fast slogan annoys me 
 so much. One can argue that it's misleading advertising, 
 given that what's considered "idiomatic D", using features 
 like templates and generic code that's highly-touted as D's 
 strong points, compiles a whole order of magnitude slower 
 than C-style D.  Makes me cringe every time I hear "fast 
 code, fast". Our old slogan is a much more accurate 
 description of the current state of things.)


 T
It's not that big of a slow down. Using "fast" you don't import any modules so they never have to be parsed. That's pretty much all of phobos you don't have to parse in that example. That's just the initial cost too. In a big project this won't make a difference. You create a tiny example that is irrelevant to the larger scale, that takes 0.3 seconds longer to compile. It's a magnitude slower cause in your fast example it's literately only parsing 5 lines of code instead of hundreds of lines like it is in your slow example.
I disagree. It actually matters a lot for big projects with lots of templates, especially nested templates. Gets a whole lot worse when it's templates within mixin templates with templates. It's not just a "0.3" second difference, but can be half a minute or even more.
Like with anything, since you can now basically run code at compile time, you are going to have to make optimizations to your code. If you make a million template instances, well, a compiler isn't going to magically be able to make that fast. The slowdown in this specific example isn't caused by templates; it's caused by having to parse all the extra lines of code from Phobos. I didn't say there aren't problems with templates, but this example accurately depicts nothing.
Feb 23 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 23, 2018 at 08:51:20PM +0000, Rubn via Digitalmars-d wrote:
[...]
 This slowdown for this specific example isn't cause by templates, it's
 caused by having to parse all the extra lines of code from phobos. I
 didn't say there aren't problems with templates, but this example
 accurately depicts nothing.
I say again, do you have measurements to back up your statement?

Parsing is actually very fast with the DMD front end. I can't believe that it will take half a second to parse a Phobos module -- the compiler's parser is not that stupid. I have a 1600+ line module that compiles in about 0.4 seconds (that's lexing + parsing + semantic + codegen), but that time more than doubles when you just change a loop into a range-based algorithm. Clearly, parsing is not the bottleneck here.

T

-- 
Unix is my IDE. -- Justin Whear
Feb 23 2018
parent Rubn <where is.this> writes:
On Friday, 23 February 2018 at 21:10:25 UTC, H. S. Teoh wrote:
 On Fri, Feb 23, 2018 at 08:51:20PM +0000, Rubn via 
 Digitalmars-d wrote: [...]
 This slowdown for this specific example isn't cause by 
 templates, it's caused by having to parse all the extra lines 
 of code from phobos. I didn't say there aren't problems with 
 templates, but this example accurately depicts nothing.
I say again, do you have measurements to back up your statement? Parsing is actually very fast with the DMD front end. I can't believe that it will take half a second to parse a Phobos module -- the compiler's parser is not that stupid. I have a 1600+ line module that compiles in about 0.4 seconds (that's lexing + parsing + semantic + codegen), but that time more than doubles when you just change a loop into a range-based algorithm. Clearly, parsing is not the bottleneck here. T
I did measure it: adding another instantiation of the templates using a different type adds a fraction of the time, not another 0.3 seconds. I don't know what your so-called 1600+ line module is doing. Just because it's 1600 lines doesn't mean there won't be the same slowdown, if you don't use any part of Phobos in all those lines and then add a few lines that do use it, which will incur this slowdown.
Feb 23 2018
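One way to separate the two explanations on the table (cost of reading the imports versus cost of instantiating templates) is to compile three variants of the same file and time each one. The file below is a sketch under that assumption; the file name and the version identifiers `ImportOnly` and `UseRanges` are invented for illustration, and the timings are left for the reader to collect.

```d
// Hypothetical file: variants.d. Compile each variant separately and time it
// with whatever tool you trust:
//   (no flag)            baseline, no Phobos imports at all
//   -version=ImportOnly  imports Phobos modules but instantiates nothing
//   -version=UseRanges   imports and instantiates the range templates

version (UseRanges)
{
    import std.algorithm : map, sum;
    import std.range : zip;

    double dot(double[] a, double[] b)
    {
        return zip(a, b).map!(t => t[0] * t[1]).sum;
    }
}
else
{
    version (ImportOnly)
    {
        // Same imports as the UseRanges variant, deliberately left unused.
        import std.algorithm : map, sum;
        import std.range : zip;
    }

    double dot(double[] a, double[] b)
    {
        double s = 0.0;
        foreach (i; 0 .. a.length)
            s += a[i] * b[i];
        return s;
    }
}

void main()
{
    assert(dot([1.0, 2.0], [3.0, 4.0]) == 11.0);
}
```

If the `ImportOnly` build is about as slow as the `UseRanges` build, the import cost dominates; if it stays close to the baseline, the time is going into instantiation and whatever that drags in.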
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 23, 2018 at 08:41:17PM +0000, bauss via Digitalmars-d wrote:
[...]
 It actually matters a lot for big projects with lots of templates,
 especially nested templates. Gets a whole lot worse when it's
 templates within mixin templates with templates.
The situation has actually improved somewhat after Rainer's symbol backreferencing PR was merged late last year. Before that, deeply nested templates were spending most of their time generating, scanning, and writing out 20MB-long symbols. :-D

Now that superlong symbols are no longer the bottleneck, though, other issues with the implementation of templates are coming to the surface. Like this one, where it takes *3 seconds* to compile a program containing a *single* (trivial) regex:

	https://issues.dlang.org/show_bug.cgi?id=18378
 It's not just a "0.3" second difference, but can be half a minute or
 even more.
In the old days, when yours truly submitted a naïve implementation of cartesianProduct to Phobos, compiling Phobos unittests would cause the autotester to freeze for a long time and then die with an OOM, because using cartesianProduct with multiple arguments caused an exponential number of templates to get instantiated. :-D

Over the years there have also been a number of PRs that try to mitigate the problem somewhat by, e.g., replacing a linearly-recursive template (usually tail-recursive -- but the compiler currently does not take advantage of that) with a divide-and-conquer scheme instead. A lot of stuff that iterates over AliasSeq suffers from this problem, actually. AIUI, due to the way templates are currently implemented, a linearly-recursive template causes quadratic slowdown in compilation time.

Clearly, the quality of implementation needs improvement here.

T

-- 
Once bitten, twice cry...
Feb 23 2018
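The linear versus divide-and-conquer rewrite mentioned above can be sketched on a small type-mapping template. This is an illustration only; `LinearMap`, `SplitMap` and `Ptr` are names invented here, not Phobos symbols.

```d
import std.meta : AliasSeq;

// Linear recursion: one instantiation per element, and every instantiation
// carries the entire remaining tail as template arguments.
template LinearMap(alias F, T...)
{
    static if (T.length == 0)
        alias LinearMap = AliasSeq!();
    else
        alias LinearMap = AliasSeq!(F!(T[0]), LinearMap!(F, T[1 .. $]));
}

// Divide and conquer: the argument list is halved at each level, so the
// recursion depth is logarithmic and most instantiations have short lists.
template SplitMap(alias F, T...)
{
    static if (T.length == 0)
        alias SplitMap = AliasSeq!();
    else static if (T.length == 1)
        alias SplitMap = AliasSeq!(F!(T[0]));
    else
        alias SplitMap = AliasSeq!(SplitMap!(F, T[0 .. $ / 2]),
                                   SplitMap!(F, T[$ / 2 .. $]));
}

// Hypothetical mapping used only to exercise the two templates.
alias Ptr(T) = T*;

static assert(is(LinearMap!(Ptr, int, char, double) == AliasSeq!(int*, char*, double*)));
static assert(is(SplitMap!(Ptr, int, char, double) == AliasSeq!(int*, char*, double*)));

void main() {}
```

With n arguments the linear version instantiates argument lists of length n, n-1, n-2 and so on, which is one plausible reading of where the quadratic cost comes from; the split version keeps the total argument-list length closer to n log n.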
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 23, 2018 at 08:35:44PM +0000, Rubn via Digitalmars-d wrote:
[...]
 It's not that big of a slow down. Using "fast" you don't import any modules
 so they never have to be parsed. That's pretty much all of phobos you don't
 have to parse in that example. That's just the initial cost too. In a big
 project this won't make a difference.
Wrong. This code was reduced from a bigger module (1600+ lines of code) containing the offending function. If I write that function with a straight loop, the entire module compiles in about 0.4 seconds. If I change that function to use Phobos algorithms, the compilation time slows down to more than 1 second.
 You create a tiny example that is irrelevant to the larger scale, that
 takes 0.3 seconds longer to compile.  It's a magnitude slower cause in
 your fast example it's literately only parsing 5 lines of code instead
 of hundreds of lines like it is in your slow example.
Please measure before you make statements like that. You're assuming I wrote that example out of thin air, but it's actually code reduced from a larger module where changing a single function more than doubles the compilation time of the *entire module*.

Parsing is actually extremely fast, esp. with the DMD front end. The slowdown is caused by the way the compiler handles templates (and possibly the way Phobos uses exponential templates in some places).

And this is only a smaller example of a single module. I do have code across multiple modules that take horrendously long to compile because of heavy template use.

T

-- 
If blunt statements had a point, they wouldn't be blunt...
Feb 23 2018
parent Rubn <where is.this> writes:
On Friday, 23 February 2018 at 20:52:47 UTC, H. S. Teoh wrote:
 On Fri, Feb 23, 2018 at 08:35:44PM +0000, Rubn via 
 Digitalmars-d wrote: [...]
 It's not that big of a slow down. Using "fast" you don't 
 import any modules so they never have to be parsed. That's 
 pretty much all of phobos you don't have to parse in that 
 example. That's just the initial cost too. In a big project 
 this won't make a difference.
Wrong. This code was reduced from a bigger module (1600+ lines of code) containing the offending function. If I write that function with a straight loop, the entire module compiles in about 0.4 seconds. If I change that function to use Phobos algorithms, the compilation time slows down to more than 1 second.
I don't know what else you are doing, but if you aren't using Phobos or any of its functions in there other than in those few lines of code, then yeah, you'll get the same result.
 You create a tiny example that is irrelevant to the larger 
 scale, that takes 0.3 seconds longer to compile.  It's a 
 magnitude slower cause in your fast example it's literately 
 only parsing 5 lines of code instead of hundreds of lines like 
 it is in your slow example.
Please measure before you make statements like that. You're assuming I wrote that example out of thin air, but it's actually code reduced from a larger module where changing a single function more than doubles the compilation time of the *entire module*. Parsing is actually extremely fast, esp. with the DMD front end. The slowdown is caused by the way the compiler handles templates (and possibly the way Phobos uses exponential templates in some places). And this is only a smaller example of a single module. I do have code across multiple modules that take horrendously long to compile because of heavy template use. T
I did measure it: adding another instantiation of the templates using a different type adds a fraction of the time, not another 0.3 seconds.
Feb 23 2018
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/23/18 3:15 PM, H. S. Teoh wrote:
 Now that I got your attention:
 
 	https://issues.dlang.org/show_bug.cgi?id=18511
 
 tl;dr: A trivial piece of code, written as ostensibly "idiomatic D" with
 std.algorithm and std.range templates, compiles *an order of magnitude*
 slower than the equivalent hand-written loop.  The way the compiler
 compiles templates needs some serious improvement.
 
 (And this is why our current fast-fast-fast slogan annoys me so much.
 One can argue that it's misleading advertising, given that what's
 considered "idiomatic D", using features like templates and generic code
 that's highly-touted as D's strong points, compiles a whole order of
 magnitude slower than C-style D.  Makes me cringe every time I hear
 "fast code, fast". Our old slogan is a much more accurate description of
 the current state of things.)
cc Dmitry

Thanks for a solid bug report. The right response here is to live up to our "fast code, fast" principle. It might be the case that the slowdown is actually the negative side of an acceleration :o) - before Dmitry's recent work, the sheer act of importing std.regex would be slow.

Dmitry, do you think you could use some precompiled tables to mitigate this? Is your caching compiler going to help the matter?

Andrei
Feb 23 2018
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Saturday, 24 February 2018 at 00:21:06 UTC, Andrei 
Alexandrescu wrote:
 On 2/23/18 3:15 PM, H. S. Teoh wrote:
 
 tl;dr: A trivial piece of code, written as ostensibly 
 "idiomatic D" with.  Makes me cringe every time I hear
 "fast code, fast". Our old slogan is a much more accurate 
 description of
 the current state of things.)
cc Dmitry Thanks for a solid bug report. The right response here is to live into our "fast code, fast" principle. It might be the case that the slowdown is actually the negative side of an acceleration :o) - before Dmitry's recent work, the sheer act of importing std.regex would be slow. Dmitry, do you think you could use some precompiled tables to mitigate this?
First things first: somebody needs to profile the compiler while compiling this snippet. My guess is that instantiating templates + generating long symbols is the problem. The template system obviously needs some (re)work; I think at the time nobody thought templates would be that abundant in D code. Nowadays it's easily more templates than normal functions.
 Is your caching compiler going to help the matter?
In some distant bright future where it may finally be applied to instantiating templates and caching codegen, but even then I'm not 100% positive. Finally, I repeat: we have not yet identified the problem. What takes time in the compiler needs to be figured out by dissecting the time taken via profiler and experimentation.

Dmitry Olshansky
Feb 23 2018
prev sibling next sibling parent kdevel <kdevel vogtner.de> writes:
On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
 Now that I got your attention:

 	https://issues.dlang.org/show_bug.cgi?id=18511
Your bug report is about slowdown in *compilation* time. I wondered if the longer compilation time is due to the better (faster) generated code. But this is not the case either:

	$ ./dotbench
	initialized arrays of type double
	dot_fast: 279 ms
	value = 0
	dot_slow: 5413 ms
	value = 0
	dotProduct: 217 ms
	value = 0
Feb 23 2018
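The source of `dotbench` is not shown in the thread. A run-time comparison of that general shape could be sketched as below, assuming `std.datetime.stopwatch` for timing and `std.numeric.dotProduct` as the library reference; the array size, fill values and the helper name `loopDot` are arbitrary choices made here.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.numeric : dotProduct;
import std.stdio : writefln;

// Hand-written loop, used as the baseline in this sketch.
double loopDot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (i; 0 .. a.length)
        s += a[i] * b[i];
    return s;
}

void main()
{
    enum n = 1_000_000;
    auto a = new double[](n);
    auto b = new double[](n);
    a[] = 1.5;
    b[] = 2.0;

    // Time the hand-written loop.
    auto sw = StopWatch(AutoStart.yes);
    immutable viaLoop = loopDot(a, b);
    immutable loopMs = sw.peek.total!"msecs";

    // Restart the clock and time the Phobos routine.
    sw.reset();
    immutable viaPhobos = dotProduct(a, b);
    immutable phobosMs = sw.peek.total!"msecs";

    writefln("loopDot:    %s ms value = %s", loopMs, viaLoop);
    writefln("dotProduct: %s ms value = %s", phobosMs, viaPhobos);
}
```

A single pass over a million elements is a very rough measurement; compiler, flags and repetition count all change the numbers considerably.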
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 23 February 2018 at 20:15:12 UTC, H. S. Teoh wrote:
 Now that I got your attention:

 	https://issues.dlang.org/show_bug.cgi?id=18511

 tl;dr: A trivial piece of code, written as ostensibly 
 "idiomatic D" with std.algorithm and std.range templates, 
 compiles *an order of magnitude* slower than the equivalent 
 hand-written loop.  The way the compiler compiles templates 
 needs some serious improvement.

 (And this is why our current fast-fast-fast slogan annoys me so 
 much. One can argue that it's misleading advertising, given 
 that what's considered "idiomatic D", using features like 
 templates and generic code that's highly-touted as D's strong 
 points, compiles a whole order of magnitude slower than C-style 
 D.  Makes me cringe every time I hear "fast code, fast". Our 
 old slogan is a much more accurate description of the current 
 state of things.)


 T
This particular slowdown happens because there are somehow dependencies on std.format.format, which gets instantiated and which has a ton of dependencies itself.
Feb 24 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d wrote:
[...]
 This particular slowdown happens because there are somehow depdencies
 on std.format.format which is instantiated.
 Which has a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-)

Now the next problem is: how to trim the fat off std.format ...

T

-- 
What is Matter, what is Mind? Never Mind, it doesn't Matter.
Feb 26 2018
next sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d 
 wrote:
 [...]
 This particular slowdown happens because there are somehow depdencies
 on std.format.format which is instantiated.
 Which has a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-) Now the next problem is: how to trim the fat off std.format ...
no wai. it is just Too General to be slim. what can be done, though, is `std.format.lite` module (or something), that supports only a very small subset of "big format" features, like simply printing numbers, arrays, and unconditionally calling `.toString` on structs/classes, and a very restricted set of formatting options. that should cover a lot of use cases. 'cause most of the time people only need something like `%3d %s` and such.
Feb 26 2018
prev sibling next sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d 
 wrote:
 [...]
 This particular slowdown happens because there are somehow depdencies
 on std.format.format which is instantiated.
 Which has a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-) Now the next problem is: how to trim the fat off std.format ...
p.s.: and ditch type safety. 'cause `foo(T...) (T args)` is a major slowdown -- it is a template. ;-)
Feb 26 2018
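The point about `foo(T...) (T args)` being a template can be made concrete: every distinct list of argument types produces its own instantiation, with its own function type and symbol. A minimal sketch, with the helper name `countArgs` invented for illustration:

```d
// Hypothetical helper: a type-safe variadic template function.
size_t countArgs(T...)(T args)
{
    return args.length;
}

// Two calls with different argument types yield two separate instantiations.
alias IntString = countArgs!(int, string);
alias IntInt = countArgs!(int, int);

// Their function pointer types differ, so they cannot be the same function.
static assert(!is(typeof(&IntString) == typeof(&IntInt)));

void main()
{
    assert(countArgs(1, "x") == 2);   // instantiates countArgs!(int, string)
    assert(countArgs(1, 2, 3) == 3);  // instantiates countArgs!(int, int, int)
}
```

A formatting function called all over a code base with different argument lists therefore multiplies into many instantiations, which is the cost being joked about here.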
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Feb 26, 2018 at 08:38:39PM +0200, ketmar via Digitalmars-d wrote:
 H. S. Teoh wrote:
 
 On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d
 wrote:
 [...]
 This particular slowdown happens because there are somehow
 depdencies on std.format.format which is instantiated.  Which has
 a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-) Now the next problem is: how to trim the fat off std.format ...
p.s.: and ditch type safety. 'cause `foo(T...) (T args)` is a major slowdown -- it is a template. ;-)
Actually, I think this is the classic example of why the compiler should improve the way it implements templates.

In my mind, even C's printf API is sucky, because it involves runtime parsing of what's usually a static string, over and over again. What we *really* want is for something like:

	writefln("blah %d bluh %s", i, s);

to be translated into something like:

	stdout.putString("blah ");
	stdout.putInt(i);
	stdout.putString(" bluh ");
	stdout.putString(s);

I.e., there should not be Yet Another Template with Yet Another Ridiculously Long Argument List Type Encoded In The Mangled Name, along with needless marshalling of function arguments on the stack, branching to some other part of the code (potentially causing an instruction cache miss), tons of copy-pasta for calling the same old functions for outputting strings and formatting integers, and incurring Yet Another Branch Hazard when the function finally returns. And there should definitely be no silly runtime parsing of format strings and all of that useless dance.

The latest Phobos does support compile-time format strings, but all that does currently is to forward to the silly runtime parsing code (not to mention coming with its own baggage of additional templates to do the compile-time format string checking).

Basically, the whole stupid function call should just be completely inlined and any external template function bodies thrown out the window, because chances are you'll never call format() again with exactly the same parameters somewhere else in the code. I haven't checked if ldc will actually do this level of inlining, but dmd certainly won't with its overly-conservative inliner. And besides, it's a stupid waste of compiler resources to have to generate all of that template code every single time format() is called, only to have the optimizer basically undo half of the work.

There should be some way in the language to express format() in a way that doesn't involve tons of template bloat and wasted function body copy-pasta.

T

-- 
Meat: euphemism for dead animal. -- Flora
Feb 26 2018
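The lowering described above can be approximated in today's D with CTFE and a string mixin. The following is a toy sketch under heavy assumptions: the names `lowerFormat` and `fmt` are invented, only `%d` and `%s` are recognised (both treated the same way), literal text must not contain a backtick, and `std.conv.to` does the per-argument conversion. It is not how Phobos implements formatting.

```d
import std.conv : to;

// Runs at compile time (CTFE): turns a format string into D source code
// that appends each literal piece and each argument to `result`.
string lowerFormat(string fmt)
{
    string code;
    string literal;
    size_t argIndex = 0;
    for (size_t i = 0; i < fmt.length; ++i)
    {
        if (fmt[i] == '%' && i + 1 < fmt.length
            && (fmt[i + 1] == 'd' || fmt[i + 1] == 's'))
        {
            if (literal.length)
            {
                code ~= "result ~= `" ~ literal ~ "`;\n";
                literal = null;
            }
            code ~= "result ~= to!string(args[" ~ argIndex.to!string ~ "]);\n";
            ++argIndex;
            ++i; // skip the conversion character
        }
        else
            literal ~= fmt[i];
    }
    if (literal.length)
        code ~= "result ~= `" ~ literal ~ "`;\n";
    return code;
}

// The format string is a template parameter, so it is parsed once per
// instantiation at compile time and never at run time.
string fmt(string f, Args...)(Args args)
{
    string result;
    mixin(lowerFormat(f));
    return result;
}

void main()
{
    assert(fmt!"blah %d bluh %s"(42, "x") == "blah 42 bluh x");
}
```

Note that this still instantiates one template per distinct format string and argument list, so it removes the run-time parsing but not the compile-time cost the thread is complaining about.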
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 08:38:39PM +0200, ketmar via Digitalmars-d wrote:
 H. S. Teoh wrote:
 
 On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d
 wrote:
 [...]
 This particular slowdown happens because there are somehow
 depdencies on std.format.format which is instantiated.  Which has
 a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-) Now the next problem is: how to trim the fat off std.format ...
p.s.: and ditch type safety. 'cause `foo(T...) (T args)` is a major slowdown -- it is a template. ;-)
Actually, I think this is the classic example of why the compiler should improve the way it implements templates. In my mind, even C's printf API is sucky, because it involves runtime parsing of what's usually a static string, over and over again. What we *really* want is for something like:

	writefln("blah %d bluh %s", i, s);

to be translated into something like:

	stdout.putString("blah ");
	stdout.putInt(i);
	stdout.putString(" bluh ");
	stdout.putString(s);

I.e., there should not be Yet Another Template with Yet Another Ridiculously Long Argument List Type Encoded In The Mangled Name, along with needless marshalling of function arguments on the stack, branching to some other part of the code (potentially causing an instruction cache miss), tons of copy-pasta for calling the same old functions for outputting strings and formatting integers, and incurring Yet Another Branch Hazard when the function finally returns. And there should definitely be no silly runtime parsing of format strings and all of that useless dance.
i once wrote such thing (for fun, using "Functional Programming With Templates"). it was fun to do, and freakin' slow due to template bloat. ;-) but yes, it generates a string mixin, and in runtime there was no format string parsing. still, we can be either smart, or have fast compile times, but not both. T_T p.s.: oops. just found that i cannot pass structs with dtors to `(...)` functions. not fun at all.
Feb 26 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Feb 26, 2018 at 09:03:14PM +0200, ketmar via Digitalmars-d wrote:
 H. S. Teoh wrote:
[...]
 In my mind, even C's printf API is sucky, because it involves runtime
 parsing of what's usually a static string, over and over again. What we
 *really* want is for something like:
 
 	writefln("blah %d bluh %s", i, s);
 
 to be translated into something like:
 
 	stdout.putString("blah ");
 	stdout.putInt(i);
 	stdout.putString(" bluh ");
 	stdout.putString(s);
[...]
 i once wrote such thing (for fun, using "Functional Programming With
 Templates"). it was fun to do, and freakin' slow due to template
 bloat. ;-)
 but yes, it generates a string mixin, and in runtime there was no
 format string parsing.
The problem is not the Phobos implementation. The problem is that the compiler's way of handling templates and CTFE needs to be improved. We seriously need to muster some manpower to help Stefan finish newCTFE, and then we need to take a serious look at improving the current implementation of templates.
 still, we can be either smart, or have fast compile times, but not
 both. T_T
[...]
I'd like to disagree. :-D There's got to be a way to do this that doesn't have to compromise either way. I mean, this is not like we're doing rocket science here, or solving an NP-complete problem. It's a straightforward way of recognizing a particular code pattern and applying 1-to-1 mappings.

The general case of completely arbitrary templates can still fall back to the current implementation. The point is to optimize for specific template usage patterns that are common and yield big speedups, but still leave the door open for weirder, but less common, template code.

T

-- 
Fact is stranger than fiction.
Feb 26 2018
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 The problem is not the Phobos implementation.  The problem is that the
 compiler's way of handling templates and CTFE needs to be improved.  We
 seriously need to muster some manpower to help Stefan finish newCTFE,
 and then we need to take a serious look at improving the current
 implementation of templates.
yeah, i'm not saying that phobos code is wrong. but being "not wrong" and being fast is not always the same. ;-)
 still, we can be either smart, or have fast compile times, but not
 both. T_T
[...] I'll like to disagree. :-D There's got to be a way to do this that doesn't have to compromise either way. I mean, this is not like we're doing rocket science here, or solving an NP complete problem. It's a straightforward way of recognizing a particular code pattern and applying 1-to-1 mappings. The general case of completely arbitrary templates can still fallback to the current implementation. The point is to optimize for specific template usage patterns that are common and yields big speedups, but still leave the door open for weirder, but less common, template code.
but until that brave new world materializes, we have a smart/fast dilemma. alas.
Feb 26 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via Digitalmars-d wrote:
[...]
 but until that brave new world materializes, we have a smart/fast
 dilemma.  alas.
I'd like to contribute to the materialization of that brave new world. Rather than sit around and wait for it to happen. :-P

T

-- 
Your inconsistency is the only consistent thing about you! -- KD
Feb 26 2018
parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via Digitalmars-d wrote:
 [...]
 but until that brave new world materializes, we have a smart/fast
 dilemma.  alas.
I'd like to contribute to the materialization of that brave new world. Rather than sit around and wait for it to happen. :-P
i'd like to do it too, but dmd code to process templates knocks me out of my consciousness. alas.
Feb 26 2018
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 26 February 2018 at 21:38:09 UTC, ketmar wrote:
 H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via 
 Digitalmars-d wrote:
 [...]
 but until that brave new world materializes, we have a 
 smart/fast
 dilemma.  alas.
I'd like to contribute to the materialization of that brave new world. Rather than sit around and wait for it to happen. :-P
i'd like to do it too, but dmd code to process templates knocks me out of my consciousness. alas.
Yes, the dmd template code is seriously mind-blowing. Which is why I am building an alternative system, which is aimed at being fast. At the expense of the programmer, though, because there is no automatic caching. If you want to "instantiate" the same "template-replacement" with the same parameters, you had better know that in advance.
Feb 27 2018
parent reply Martin Tschierschke <mt smartdolphin.de> writes:
On Tuesday, 27 February 2018 at 08:49:15 UTC, Stefan Koch wrote:
 On Monday, 26 February 2018 at 21:38:09 UTC, ketmar wrote:
 H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via 
 Digitalmars-d wrote:
[...] When looking at the problem of compilation times I think: wouldn't it speed up the development process if splitting your code into modules automatically resulted in creating small libs which are, if possible, compiled only once? The idea of using a caching mechanism is another general way to avoid compiling the same code over and over again. Part of the discussion is here: https://github.com/dlang/dmd/pull/7239#issuecomment-340256110
Feb 27 2018
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 27 February 2018 at 09:25:57 UTC, Martin Tschierschke 
wrote:
 On Tuesday, 27 February 2018 at 08:49:15 UTC, Stefan Koch wrote:
 On Monday, 26 February 2018 at 21:38:09 UTC, ketmar wrote:
 H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via 
 Digitalmars-d wrote:
[...] When looking at the problem of compilation times I think: wouldn't it speed up the development process if splitting your code into modules automatically resulted in small libs which are, if possible, compiled only once? The idea of using a caching mechanism is another general way to avoid compiling the same thing over and over again. Part of the discussion is here: https://github.com/dlang/dmd/pull/7239#issuecomment-340256110
It's more complicated than that. The problem with caching is dependency analysis, which is pretty difficult with templates in the mix.
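One way to see the difficulty, as a minimal illustration (the names `twice` and `Meters` are hypothetical, and this is not necessarily the case Stefan has in mind): the code generated for a template instantiation depends on both the module that defines the template and the module that supplies the type argument, so a cached result would have to be invalidated when either one changes.

```d
// Imagine this template lives in a library module (hypothetical lib.d).
T twice(T)(T x)
{
    return x + x; // which "+" this resolves to depends on T
}

// Imagine this type lives in an application module (hypothetical app.d).
struct Meters
{
    int v;

    // twice!Meters silently depends on this operator overload.
    Meters opBinary(string op : "+")(Meters other)
    {
        return Meters(v + other.v);
    }
}

// The instantiation twice!Meters is only valid as long as neither the
// template body nor Meters.opBinary has changed, so a cache entry for it
// would need to track both source locations.
Meters doubled(Meters m)
{
    return twice(m);
}
```

Editing only `Meters.opBinary` changes what `twice!Meters` compiles to, even though the file containing `twice` is untouched.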
Feb 27 2018
prev sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
Martin Tschierschke wrote:

 On Tuesday, 27 February 2018 at 08:49:15 UTC, Stefan Koch wrote:
 On Monday, 26 February 2018 at 21:38:09 UTC, ketmar wrote:
 H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via Digitalmars-d 
 wrote:
[...] When looking at the problem of compilation times I think: wouldn't it speed up the development process if splitting your code into modules automatically resulted in small libs which are, if possible, compiled only once? The idea of using a caching mechanism is another general way to avoid compiling the same thing over and over again. Part of the discussion is here: https://github.com/dlang/dmd/pull/7239#issuecomment-340256110
basically, compilation of a code without templates is FAST. 500+ KB of source almost without templates often compiles in less than a second (on not-so-bleeding-edge i3, not even i7). but throw templates in a mix, and BOOM! coffee and cigarettes.
Feb 27 2018
parent reply Martin Tschierschke <mt smartdolphin.de> writes:
On Tuesday, 27 February 2018 at 13:35:14 UTC, ketmar wrote:
 Martin Tschierschke wrote:

 On Tuesday, 27 February 2018 at 08:49:15 UTC, Stefan Koch 
 wrote:
 On Monday, 26 February 2018 at 21:38:09 UTC, ketmar wrote:
 H. S. Teoh wrote:

 On Mon, Feb 26, 2018 at 10:12:25PM +0200, ketmar via 
 Digitalmars-d wrote:
[...] When looking at the problem of compilation times I think: wouldn't it speed up the development process if splitting your code into modules automatically resulted in small libs which are, if possible, compiled only once? The idea of using a caching mechanism is another general way to avoid compiling the same thing over and over again. Part of the discussion is here: https://github.com/dlang/dmd/pull/7239#issuecomment-340256110
basically, compilation of a code without templates is FAST. 500+ KB of source almost without templates often compiles in less than a second (on not-so-bleeding-edge i3, not even i7). but throw templates in a mix, and BOOM! coffee and cigarettes.
My negative experience was with ctRegex and normal regex. But it was no problem to move the functions using regex into a lib and compile them separately (saving 3 seconds when building the app). The same approach worked with .dt files (diet templates in vibe.d): put the template and the function(s) instantiating it together in their own lib, and define that as a local external dependency. At the moment I am thinking about a way to do this automatically, so that every new build of my vibe.d app only needs to compile the changes. (p.s. I am aware of this: https://github.com/rejectedsoftware/diet-ng#experimental-html-template-caching)
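A minimal sketch of the regex split described here, assuming a hypothetical helper `isIsoDate` and an illustrative pattern (neither is from the actual app). The idea is that the module exposes only a plain, non-template function, so the expensive `ctRegex` instantiation happens when the lib is built rather than on every rebuild of the app:

```d
// Hypothetical module kept in its own lib, e.g. a local dub dependency.
import std.regex : ctRegex, matchFirst;

/// Returns true if `s` looks like YYYY-MM-DD.
/// Not a template, so the function body (and the ctRegex it uses) is
/// compiled together with this module.
bool isIsoDate(string s)
{
    // ctRegex builds the matcher at compile time; this is the slow part.
    auto re = ctRegex!(`^\d{4}-\d{2}-\d{2}$`);
    return !matchFirst(s, re).empty;
}
```

For the app to actually skip that work, it would presumably have to see only the signature of `isIsoDate` (for example through a .di interface file) and link against the prebuilt lib.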
Feb 27 2018
parent ketmar <ketmar ketmar.no-ip.org> writes:
Martin Tschierschke wrote:

 basically, compilation of a code without templates is FAST. 500+ KB of 
 source almost without templates often compiles in less than a second (on 
 not-so-bleeding-edge i3, not even i7).

 but throw templates in a mix, and BOOM! coffee and cigarettes.
My negative experience was with ctRegex and normal regex. But it was no problem to move the functions using regex into a lib and compile them separately (saving 3 seconds when building the app).
you happened to hit another issue, yeah: slow CTFE engine. that should be improved too. ;-)
Feb 27 2018
prev sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
H. S. Teoh wrote:

 On Sat, Feb 24, 2018 at 09:43:35AM +0000, Stefan Koch via Digitalmars-d 
 wrote:
 [...]
 This particular slowdown happens because there are somehow dependencies
 on std.format.format which is instantiated.
 Which has a ton of dependencies itself.
Aha! That explains it. Thanks, Stefan, for the accurate diagnosis. :-) Now the next problem is: how to trim the fat off std.format ...
p.p.s.: or replace it with `void fmtlite (...) {}` thingy. this way we can still have type safety, but get rid of templates.
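A rough sketch of what such a `fmtlite` could look like, using D-style variadic arguments. Only the signature comes from the post; the body, the supported types and the output via C `printf` are assumptions for illustration. Type checking happens at run time through the `TypeInfo` array `_arguments`, not at compile time:

```d
import core.stdc.stdio : printf;
import core.vararg; // va_arg

/// Hypothetical formatter: prints each argument in order.
/// Supports only int, double and string in this sketch.
void fmtlite(...)
{
    foreach (ti; _arguments)
    {
        if (ti == typeid(int))
            printf("%d", va_arg!int(_argptr));
        else if (ti == typeid(double))
            printf("%g", va_arg!double(_argptr));
        else if (ti == typeid(string))
        {
            auto s = va_arg!string(_argptr);
            // D strings are not NUL-terminated in general: pass the length.
            printf("%.*s", cast(int) s.length, s.ptr);
        }
        else
        {
            // Unknown type: we cannot skip it safely, so give up.
            throw new Exception("fmtlite: unsupported argument type "
                ~ ti.toString());
        }
    }
}
```

A call such as `fmtlite("compiled in ", 3, " seconds\n");` goes through one ordinary function no matter how many argument combinations the program uses, which is the point of the suggestion; the cost is that an unsupported type is only detected when the call runs.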
Feb 26 2018
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Feb 26, 2018 at 08:42:22PM +0200, ketmar via Digitalmars-d wrote:
[...]
 p.p.s.: or replace it with `void fmtlite (...) {}` thingy. this way we
 can still have type safety, but get rid of templates.
Given the huge amount of templates in your typical, average D code, I think it's a much more worthwhile effort to improve the way the compiler implements templates instead. :-D


T

-- 
Leather is waterproof. Ever see a cow with an umbrella?
Feb 26 2018