
digitalmars.D.learn - Is there a list of things which are slow to compile?

reply drathier <forum.dlang.org fi.fo> writes:
I'm wondering if there's a place that lists things which are 
slower/faster to compile? DMD is pretty famed for compiling 
quickly, but I'm not seeing particularly high speed at all, and I 
want to fix that.

Currently at ~1ksloc/s of D input without optimizing anything, 
which corresponds to 350ksloc/s if measuring by `-vcg-ast` output 
instead of D source input, using the same time measurement as 
before, so the flag itself doesn't cost time.

Here's what I've learned so far:
- CTFE is obviously unboundedly slow, since it runs arbitrary code
- Template expansion is presumably O(n) in the size of the 
generated code, and the `-vcg-ast` flag helps a bit to see how 
much it's expanding. I'm not convinced it's the reason my code 
compiles slowly, though.
- no idea how expensive `static if`s on traits are
- std.regex compiles really slowly
- CTFE always runs on all top-level value definitions (even if 
they contain things which cannot be executed at compile time, and 
I hate this so so much)
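
To illustrate the last point: a module-level initializer has to be evaluated at compile time, so a value that CTFE cannot build needs to be set up at run time instead. The following is only a minimal sketch of one common workaround, a module constructor. The name `config` is hypothetical, and the commented-out line reflects the failure reported in this thread rather than a guaranteed compiler message.

```d
import std.variant : Variant;

// A module-level initializer is evaluated by CTFE. Per this thread,
// Variant does not work in CTFE, so a line like the following is
// expected to be rejected at compile time:
// Variant config = Variant(42);

// Hypothetical variable: declared without an explicit initializer,
// then filled in at run time by the module constructor below.
Variant config;

static this()
{
    // Runs at program (thread) startup, not at compile time.
    config = Variant(42);
}

void main()
{
    assert(config.get!int == 42);
}
```

The trade-off is that the value is computed on every program start instead of once during compilation.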

What other things are there?
- identifier lengths?
- comments?
- aliases?
- delegates?
- Variants?
Jun 03 2020
next sibling parent drathier <forum.dlang.org fi.fo> writes:
On Wednesday, 3 June 2020 at 09:36:52 UTC, drathier wrote:
 Currently at ~1ksloc/s of D input without optimizing anything, 
 which corresponds to 350ksloc/s if measuring by `-vcg-ast` 
 output instead of D source input, using the same time 
 measurement as before, so the flag itself doesn't cost time.
Sorry, that should read `44ksloc/s`, not `350ksloc/s`.
Jun 03 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via Digitalmars-d-learn
wrote:
 I'm wondering if there's a place that lists things which are
 slower/faster to compile? DMD is pretty famed for compiling quickly,
 but I'm not seeing particularly high speed at all, and I want to fix
 that.
The two usual culprits are:

- Recursive/chained templates
- Excessive CTFE

Note that while the current CTFE engine is slow, it's still reasonably fast for short computations. Just don't write nested loops or loops with a huge number of iterations inside your CTFE code, and you should be fine. And on that note, even running std.format with all of its complexity inside CTFE is reasonably fast, as long as you don't do it too often; so generally you won't see a problem here unless you have a loop with too many iterations or too deeply-nested loops running in CTFE.

Templates are generally reasonably OK, until you use too many recursive templates. Or if you chain too many of them together, like if you have excessively long UFCS chains with Phobos algorithms. Short chains are generally OK, but once they start getting long they will generate large symbols and large numbers of instantiations. Large symbols used to be a big problem, but ever since Rainer's fix they have generally been a lot tamer. But still, it's something to avoid unless you can't help it.

Recursive templates are generally bad because they tend to produce a super-linear number of instantiations, which consume lots of compiler memory and also slow things down. Use too many of them, and things will quickly slow to a crawl.

Worst is if you combine both deeply-nested templates and CTFE, like std.regex does. Similarly, std.format (which includes writefln & co) tends to add 1-2 seconds to compile time.

Another culprit is an excessively long function body: IIRC there are some O(n^2) algorithms in the compiler w.r.t. the length of the function body. But I don't expect normal code to reach the point where this begins to matter; generally you won't run into this unless your code is *really* poorly written (like the entire application inside main()), or you're using excessive code generation (like the mixin of a huge procedurally generated string).

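As a rough illustration of the template point above, here is a small sketch that computes the same value twice: once with a recursive template and once with an ordinary function run through CTFE. The names `SumTo` and `sumTo` are made up for this example. The recursion here is only linear; the super-linear blow-up described above tends to show up with recursive templates over type lists or with nested instantiations, which this toy case does not reproduce.

```d
// Recursive template: every distinct N is a separate instantiation
// that the compiler has to create and keep around.
template SumTo(uint N)
{
    static if (N == 0)
        enum SumTo = 0u;
    else
        enum SumTo = N + SumTo!(N - 1);
}

// Ordinary function evaluated through CTFE: one short loop and
// no additional template instantiations.
uint sumTo(uint n)
{
    uint total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

enum viaTemplate = SumTo!100; // instantiates SumTo!100 down to SumTo!0
enum viaCtfe = sumTo(100);    // a single CTFE call

static assert(viaTemplate == 5050);
static assert(viaCtfe == 5050);

void main() {}
```

Which variant compiles faster in a given project is something to measure rather than assume; the sketch only shows where the extra instantiations come from.
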
Identifier lengths are generally no problem unless you're talking about 100KB-long identifiers, which used to be a problem until Rainer implemented backreferences in the mangling. But I don't expect normal code to generate symbols of this order of magnitude unless you're using excessively-long UFCS chains with nested templates.

Identifier length generally doesn't even register on the radar unless the identifiers are ridiculously long, like tens or hundreds of KB long -- not something a human would type. What humans would consider a long identifier, like Java-style names that span 50 characters, is mere round-off error and probably doesn't even make a measurable difference. The problem really only begins to surface when you have 10,000 characters in your identifier or more.

Comments are not even a blip on the radar: lexing is the fastest part of the compilation process.

Similarly, aliases are extremely cheap; they're not even on the radar.

Delegates have only a runtime cost; they are similarly unnoticeably cheap during compilation. As are Variants, unless you're running Variants inside CTFE (which I don't think even works).

T

-- Why waste time reinventing the wheel, when you could be reinventing the engine? -- Damian Conway
Jun 03 2020
next sibling parent drathier <forum.dlang.org fi.fo> writes:
On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
 On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via 
 Digitalmars-d-learn wrote:
 I'm wondering if there's a place that lists things which are 
 slower/faster to compile? DMD is pretty famed for compiling 
 quickly, but I'm not seeing particularly high speed at all, 
 and I want to fix that.
The two usual culprits are: - Recursive/chained templates - Excessive CTFE ... T
Thanks for the comprehensive answer!

I'm not using CTFE at all, because as you thought, Variants aren't supported in CTFE. I had to go out of my way to avoid CTFE running, since it crashes on Variants. I'm not using UFCS, and the long identifiers I was talking about are like 50 characters long, from mangling package name + module name + variable name together in the source language.

I'm guessing it's mainly templates from my code gen then, and there's not much I can do about that; I'm doing code gen from a functional language where polymorphism is literally everywhere, and so are templates then.

Regarding std.format, std.regex and such, would it be possible to put those into their own package or something, so `dub` doesn't rebuild them every time? It feels like that'd save a lot of time.
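
One way the idea in the last paragraph is sometimes approached is to hide the heavy Phobos usage behind a plain, non-template function in its own module. What follows is only a sketch under assumptions: the module name `textmatch` and the function `looksLikeIdentifier` are hypothetical, and whether `dub` actually avoids rebuilding it depends on the module living in a separate package or subpackage, which this thread does not confirm.

```d
// Hypothetical module; in a real project it would sit in its own
// dub package or subpackage so it can be built separately.
module textmatch;

// Non-template function: the std.regex templates it needs are
// instantiated when this module is compiled. Callers normally only
// depend on the signature.
bool looksLikeIdentifier(string s)
{
    // Local import keeps std.regex out of this module's top-level scope.
    import std.regex : matchFirst, regex;

    // regex() builds the pattern at run time; ctRegex would shift that
    // work to compile time instead.
    auto re = regex(`^[A-Za-z_][A-Za-z0-9_]*$`);
    return !matchFirst(s, re).empty;
}

void main()
{
    assert(looksLikeIdentifier("foo_bar1"));
    assert(!looksLikeIdentifier("1abc"));
}
```

The pattern is compiled on every call here; caching it would be a natural refinement, left out to keep the sketch short.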
Jun 04 2020
prev sibling parent reply aberba <karabutaworld gmail.com> writes:
On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
 On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via 
 Digitalmars-d-learn wrote:
 I'm wondering if there's a place that lists things which are 
 slower/faster to compile? DMD is pretty famed for compiling 
 quickly, but I'm not seeing particularly high speed at all, 
 and I want to fix that.
The two usual culprits are: - Recursive/chained templates - Excessive CTFE
[...]
I'm thinking about a resource hub for D with information like this. Can I use this information? ...of course I'll reference this thread and you can always call for changes.
Jun 05 2020
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jun 05, 2020 at 08:25:13AM +0000, aberba via Digitalmars-d-learn wrote:
 On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
 On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via
 Digitalmars-d-learn wrote:
 I'm wondering if there's a place that lists things which are
 slower/faster to compile? DMD is pretty famed for compiling
 quickly, but I'm not seeing particularly high speed at all, and I
 want to fix that.
The two usual culprits are: - Recursive/chained templates - Excessive CTFE
[...]
 I'm thinking about a resource hub for D with information like this.
 Can I use this information?
[...]
Of course. No need to reference this thread; what I wrote above is pretty much common knowledge for anyone who has worked with D long enough.

T

-- Famous last words: I *think* this will work...
Jun 05 2020