digitalmars.D - Update on the D-to-Jai guy: We have real problems with the language
- FeepingCreature (51/51) Nov 27 2022 I've had an extended Discord call taking a look at the codebase.
- rikki cattermole (15/15) Nov 27 2022 It is significantly worse than just a few missing tools.
- rikki cattermole (11/11) Nov 27 2022 Today I got some timings after (some how?) fixing dmd builds for my code...
- Basile B. (14/27) Nov 27 2022 For better compile times and with LDC people should also always
- rikki cattermole (6/6) Nov 28 2022 Okay that is a pretty impressive speed boost.
- H. S. Teoh (10/21) Nov 28 2022 Hmm. I just tested `--disable-verify` on one of my medium-complexity
- Basile B. (3/17) Nov 28 2022 Maybe your code is too good to make IR verification falling into
- rikki cattermole (4/7) Nov 28 2022 Very interestingly it is indeed not 15s at all, but ~4s.
- rikki cattermole (8/17) Nov 28 2022 LDC is doing a full link each time, while with incremental linking you
- Hipreme (28/82) Nov 27 2022 Totally agreed, specially with the part **basically no tools to
- ryuukk_ (6/60) Nov 27 2022 So he is using ``std.meta`` and ``std.traits``, then no wonder
- ryuukk_ (5/5) Nov 27 2022 Speaking of Jai, it's not fast either when you start to do lot of
- ryuukk_ (9/9) Nov 27 2022 Sorry, Jai takes not 5.5sec, actually up to 7.8sec, as you do
- Steven Schveighoffer (28/42) Nov 27 2022 I avoid hasUDA and getUDA unless it's a simple project. If I'm doing any...
- Walter Bright (8/14) Nov 28 2022 I once tracked just what was being instantiated to format an integer int...
- Per Nordlöw (3/7) Nov 28 2022 Finally, we are starting realize the cost of these issues in
- Walter Bright (2/10) Nov 28 2022 I've complained about the conversion thing for 10 years now :-)
- rikki cattermole (10/10) Nov 28 2022 Early into my initial scoping of value type exceptions, I looked into
- Steven Schveighoffer (32/42) Nov 28 2022 Sure, you can look at this as "templates are bad, we shouldn't use them
- zjh (6/9) Nov 28 2022 If there is no `metaprogramming` for `D`, why not use `C++`?
- Steven Schveighoffer (5/19) Nov 28 2022 1) C++ metaprogramming is... not the same.
- zjh (3/4) Nov 28 2022 Looking forward to your article.
- Paulo Pinto (7/28) Nov 29 2022 Using C++20 modules in Visual C++, I can assert that that reason
- Walter Bright (8/8) Nov 28 2022 We have made several attempts at making templates faster, such as the al...
- TheGag96 (8/9) Nov 28 2022 Hey, good on you for reaching out to the guy!! That's really
- zjh (3/6) Nov 28 2022 In a world where language competition is fierce, any
- zjh (4/8) Nov 28 2022 `'D'` should also absorpt some people into the `core team` and
- rikki cattermole (2/5) Nov 28 2022 You don't need to be in the core team to contribute, or to experiment.
- Walter Bright (5/6) Nov 28 2022 std.traits.isPointer is defined as:
- Walter Bright (12/19) Nov 28 2022 hasUDA and getUDAs are defined:
- FeepingCreature (8/29) Nov 29 2022 Well, in his codebase I ended up just redefining `hasUDA` in
- rikki cattermole (4/5) Nov 29 2022 If we had some way to determine cost of template instantiation then we
- Guillaume Lathoud (21/25) Nov 29 2022 Hello, I'm far from being a D compilation specialist, but in case
I've had an extended Discord call taking a look at the codebase. Now, these are only my own thoughts, but I'll share them anyway:

- This is a fairly pedestrian codebase. No CTFE craziness, restrained "normal" use of templates. It's exactly the sort of code that D is supposed to be fast at.
- To be fair, his computer isn't the fastest. But it's an 8-core AMD, so DMD's lack of internal parallelization hurts it here. This will only get worse in the future.
- And sure, there's a bunch of somewhat quadratic templates that explode a bit. But! But!

1. It's all "pedestrian" use. Containers with lots of members instantiated with lots of types.
2. The compiler doesn't surface what is fast and what is slow and doesn't give you a way to notice it. No, -vtemplates isn't enough; we need a way to tell the *time taken*, not just the number of instantiations.
3. But also, if we're talking about number of instantiations, `hasUDA` and `getUDA` lead the pack. I think the way these work is just bad - I've rewritten all my own `hasUDA`/`getUDA` code to be of the form `udaIndex!(U, __traits(getAttributes, T))`, because instantiating a unique copy for every combination of field and UDA is borderline quadratic - but that didn't help much even though `-vtemplates` hinted that it should. `-vtemplates` needs compile time attributed to templates recursively.
4. LLVM is painful. Unavoidable, but painful. Probably twice the compile time of the ldc2 run was in the LLVM backend.
5. There was no smoking gun. It's not like "ah yeah, this thing, just don't do it." It's a lot of code that instantiates a lot of genuine workhorse templates (99% "function with type" or "struct with type"), and it was okay for a long time and then it wasn't.

I really think the primary issue here is just that D gives you a hundred tools to dig yourself into a hole, and basically no tools to dig yourself out of it, and if you do so you have to go "against the grain" of how the language wants to be used.
And like, as an experienced dev I know the tricks of how to optimize templates, and I've sunk probably a hundred hours into this for my two libs at work alone, but this is *folk knowledge*; it's not part of the stdlib, or the spec, or documented anywhere at all. Like `if (__ctfe) return;`. Like `udaIndex!(__traits)`. Like `is(T : U*, U)` instead of `isPointer`. Like making struct methods templates so they're only compiled when needed. Like moving recursive types out of templates to reduce the compilation time. Like keeping your unique instantiations as low as possible by querying information with traits at the site of instantiation. Like `-v` to see where time is spent. Like... and so on. This goes for every part of the language, not just templates.

DMD is fast. DMD is even fast for what it does. But DMD is not as fast as it implicitly promises when templates are advertised, and DMD does not expose enough good ways to make your code fast again when you've fallen into a hole.
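To make two of those tricks concrete, here is a hedged sketch (my own illustration, not code from this thread or from his codebase; names like `myHasUDA` and `isPtr` are hypothetical) of a `udaIndex`-style helper and the `is()` pointer test:

```d
// Sketch of the udaIndex pattern: one instantiation per attribute tuple,
// instead of one hasUDA/getUDAs instantiation per (symbol, UDA) pair.
template udaIndex(alias U, attribs...)
{
    static if (attribs.length == 0)
        enum udaIndex = -1;                             // not found
    else static if (__traits(isSame, attribs[0], U)     // @U   (symbol form)
                 || is(typeof(attribs[0]) == U))        // @U() (value form)
        enum udaIndex = 0;
    else static if (udaIndex!(U, attribs[1 .. $]) == -1)
        enum udaIndex = -1;
    else
        enum udaIndex = 1 + udaIndex!(U, attribs[1 .. $]);
}

// hasUDA reformulated on top of udaIndex: the attribute lookup happens
// once per attribute tuple queried at the instantiation site.
enum myHasUDA(alias symbol, alias U) =
    udaIndex!(U, __traits(getAttributes, symbol)) != -1;

// The "is() instead of isPointer" trick: a bare is() expression needs
// no Phobos template instantiation at the use site.
enum isPtr(T) = is(T : U*, U);

struct Tag {}
@Tag int x;

static assert(myHasUDA!(x, Tag));
static assert(isPtr!(int*) && !isPtr!int);
```

The point of the shape is that the recursion walks one attribute tuple rather than spawning a fresh `Filter`-style chain for every (field, UDA) combination.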
Nov 27 2022
It is significantly worse than just a few missing tools.

The common solution in the native world to improving compilation time is shared libraries. Incremental compilation does help, but shared libraries are how you isolate and hide away a ton of details regarding template instantiations that don't need to be exposed. Only... we can't do shared libraries cleanly, and where it is possible it is pretty limiting (such as requiring a specific compiler).

Yesterday I tried to get the .di generator to produce a .di file for a project. It has somehow started to produce a ton of garbage at the bottom of the file that certainly isn't valid D code. Even if that wasn't there, how on earth is the compiler going to -I them when they are not in directories? Yikes.

Needless to say, we have a ton of implementation details that are both low-hanging and high-value which have no alternatives.
Nov 27 2022
Today I got some timings after (somehow?) fixing dmd builds for my code.

1) ldc is ~45s
2) ldc --link-internally is ~30s
3) dmd is ~16s

Note it takes about ~3s to ``dub run dub ~master -- build`` due to needing the latest dub.

So what is interesting about this is that MSVC link is taking about ~15s by itself, and LLVM is 15s, which means that the frontend is actually taking only like 1s at most. Pretty rough estimates, but as it turns out, all my attempts to speed up my codebase had very little effect (including removing hasUDA!).
Nov 27 2022
On Sunday, 27 November 2022 at 16:38:40 UTC, rikki cattermole wrote:
> Today I got some timings after (somehow?) fixing dmd builds for my code.
> 1) ldc is ~45s
> 2) ldc --link-internally is ~30s
> 3) dmd is ~16s
> [...]

For better compile times with LDC, people should also always use the undocumented option `--disable-verify` (for a DUB recipe this would go in the dflags-ldc2 array, for example). By default ldc2 verifies the IR produced, but that verification is mostly useful for detecting bugs in the AST-to-IR translation, so it is unlikely to detect any problems for a project like LDC that's well settled, and it has the main drawback of being very slow, especially with functions with a bad cyclomatic complexity. For example, for my old iz library, 12K SLOCs of D (per D-Scanner's criteria), the gain measured with `--disable-verify` goes from 150 to 300ms, depending on the run.
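For reference, the flag placement in a DUB recipe might look roughly like this (a sketch with a hypothetical project name; `--disable-verify` is undocumented and may change, and the exact platform suffix in the JSON form - `dflags-ldc` vs. the `dflags-ldc2` named above - depends on your dub setup):

```sdl
name "myproject"
// Skip LLVM IR verification for LDC builds (undocumented ldc2 flag).
dflags "--disable-verify" platform="ldc"
```

The `platform="ldc"` attribute keeps the flag away from dmd/gdc builds, which would reject it.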
Nov 27 2022
Okay, that is a pretty impressive speed boost.

ldc2 --disable-verify: ~35s

Except:

ldc2 --disable-verify --link-internally: ~30s

A cost that is still worth paying, given that it's within the margin of error for my case anyway.
Nov 28 2022
On Mon, Nov 28, 2022 at 04:27:03AM +0000, Basile B. via Digitalmars-d wrote:
[...]
> For better compile times and with LDC people should also always use the undocumented option `--disable-verify` (for a DUB recipe this would go in the dflags-ldc2 array for example). By default ldc2 verifies the IR produced but that verification is mostly useful to detect bugs in the AST to IR translation [...]

Hmm. I just tested `--disable-verify` on one of my medium-complexity projects (just under 40 .d files, compiled in a single command); didn't measure any significant speed difference. Both with and without `--disable-verify` it took about 20 seconds for a full build (generate Linux & Windows executables + run unittests).

T

-- 
Javascript is what you use to allow third party programs you don't know anything about and doing you know not what to run on your computer. -- Charles Hixson
Nov 28 2022
On Monday, 28 November 2022 at 17:08:04 UTC, H. S. Teoh wrote:
> Hmm. I just tested `--disable-verify` on one of my medium-complexity projects (just under 40 .d files, compiled in a single command); didn't measure any significant speed difference. [...]

Maybe your code is too good to make IR verification fall into pathological cases.
Nov 28 2022
On 28/11/2022 5:38 AM, rikki cattermole wrote:
> So what is interesting about this is MSVC link is taking about ~15s by itself, LLVM is 15s which means that the frontend is actually taking only like 1s at most.

Very interestingly, it is indeed not 15s at all, but ~4s. Thanks to the nifty /TIME switch on MSVC link!

Welp, guess something else is doing it.
Nov 28 2022
On 29/11/2022 9:12 AM, rikki cattermole wrote:
> Very interestingly it is indeed not 15s at all, but ~4s. Thanks to the nifty /TIME switch on MSVC link! Welp, guess something else is doing it.

LDC is doing a full link each time, while with incremental linking you can get this far lower (like dmd is doing), which is not accurate unfortunately. If you did use incremental linking, you unfortunately must remove the extra unused import libraries that LDC is adding (which don't make enough of a difference to matter with a full link).

TLDR: LDC is correct, DMD is giving not entirely correct results. But there should be some wins here if different choices were made.
Nov 28 2022
On Sunday, 27 November 2022 at 09:29:29 UTC, FeepingCreature wrote:
> I really think the primary issue here is just that D gives you a hundred tools to dig yourself in a hole, and has basically no tools to dig yourself out of it [...] DMD does not expose enough good ways to make your code fast again when you've fallen in a hole.

Totally agreed, especially with the part **basically no tools to dig yourself out of it**.

I would like to refer to some PRs which I think could be game changers for D.

- WIP in DMD that both Per and Stefan have done for better build-time profiling: https://github.com/dlang/dmd/pull/14635 *Having talked with Stefan, there isn't much hope of this getting merged, though it's so important*
- CTFECache, caching the CTFE: https://github.com/dlang/dmd/pull/7843 *This one has been a bit more inactive recently; I think it may need help*

Those 2 PRs should have more attention than other things right now, especially since I think there has been an increasing number of people unsatisfied with D compilation times (see reggae).

I have been having problems with compilation times for some time now.

- I have almost wiped stdlib usage from my project due to its immense imports, template usage and some choices that break compilation speed (looking at you, to!string(float)).
- From the ldc build profile, importing core.sys.windows took too much time, so I rewrote only the part that I needed, making the build times slightly faster (I think I got like 0.3 seconds).
- My projects have been completely modularized, and there are like 2 modules that are included by all other modules, yet it didn't make much difference modularizing it or not.
Nov 27 2022
On Sunday, 27 November 2022 at 09:29:29 UTC, FeepingCreature wrote:
> (snip)

So he is using ``std.meta`` and ``std.traits``; then no wonder why. He should nuke these two imports.

These two modules should be removed from the language, plain and simple, and ``__traits`` should be improved to accommodate.
Nov 27 2022
Speaking of Jai, it's not fast either when you start to do a lot of logic at compile time.

Here, as you can see, is a Jai project that takes 5.5 seconds to compile: https://i.imgur.com/weC9ejD.png
Nov 27 2022
Sorry, Jai takes not 5.5s but actually up to 7.8s, as you do more compile-time logic: https://i.imgur.com/SbF2lP1.png

No language is immune to bad code.

However, I agree with the posts above that that tracing PR is important. I made the remark about the lack of tracing/benchmarking in the DMD codebase a few months ago; it is important to have.

Integrating tracy would be useful: https://github.com/wolfpld/tracy
Nov 27 2022
On 11/27/22 4:29 AM, FeepingCreature wrote:
> 3. But also if we're talking about number of instantiations, `hasUDA` and `getUDA` lead the pack. I think the way these work is just bad - I've rewritten all my own `hasUDA`/`getUDA` code to be of the form `udaIndex!(U, __traits(getAttributes, T))` - instantiating a unique copy for every combination of field and UDA is borderline quadratic - but that didn't help much even though `-vtemplates` hinted that it should.

I avoid hasUDA and getUDA unless it's a simple project. If I'm doing any complex attribute mechanisms, I use an introspection blueprint, i.e. loop over all the attributes once and build a struct that has all the information I need. There's not a simple abstraction for this; you just have to build it.

> I really think the primary issue here is just that D gives you a hundred tools to dig yourself in a hole, and has basically no tools to dig yourself out of it, and if you do so you have to go "against the grain" of how the language wants to be used.

But really, this is kind of how you have to deal with D templates. I think we are missing a guide on this, because it's easy to write D code that looks nice, and doesn't on its own compile with horrible performance, but will add up to something that is unworkable.

There are bad ways to implement many algorithms. There are also ways to implement algorithms that assist the optimizer, or that help performance by considering the hardware being used. For sure, there's a lot less attention paid to what is "bad" in a template and CTFE, and what performs well. The wisdom there is not *conventional* and is not the same as regular code wisdom. I think we can do better here.

> DMD is fast. DMD is even fast for what it does. But DMD is not as fast as it implicitly promises when templates are advertised, and DMD does not expose enough good ways to make your code fast again when you've fallen in a hole.

Yes, I think we need more tools to inspect what is taking the time, and we need more guides on how to avoid those costs. Understanding where the cost goes when instantiating a template is kind of key knowledge if you are going to use a lot of them.

Phobos does not make this easy either. Things like std.format are so insanely complex because you can just reach for a bunch of sub-templates. It's easy to write the code, but it increases compile times significantly.

I still have some hope that there are ways to decrease the template cost that will just improve performance across the board. Maybe that needs a new frontend compiler, I don't know.

-Steve
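A hedged sketch of what such an "introspection blueprint" might look like (my own illustration; the UDAs, struct layout, and names here are hypothetical, not Steven's actual code):

```d
// Hypothetical UDAs for illustration.
struct serialize {}                 // marker UDA: @serialize
struct Rename { string name; }      // value UDA:  @Rename("...")

struct FieldInfo
{
    string name;
    bool serialize;
    string rename;
}

// Walk the members of T *once* and collect everything relevant into a
// single CTFE-computed array, instead of querying hasUDA/getUDAs per
// (field, UDA) combination throughout the codebase.
template blueprint(T)
{
    enum FieldInfo[] blueprint = () {
        FieldInfo[] fields;
        static foreach (m; __traits(allMembers, T))
        {{
            FieldInfo f;
            f.name = m;
            static foreach (a; __traits(getAttributes, __traits(getMember, T, m)))
            {
                static if (__traits(isSame, a, serialize))
                    f.serialize = true;
                else static if (is(typeof(a) == Rename))
                    f.rename = a.name;
            }
            fields ~= f;
        }}
        return fields;
    }();
}

struct S
{
    @serialize int id;
    @Rename("full_name") string name;
}

static assert(blueprint!S[0].serialize);
static assert(blueprint!S[1].rename == "full_name");
```

Downstream code then indexes `blueprint!S` as plain CTFE data, so the attribute traversal cost is paid once per type rather than once per query.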
Nov 27 2022
On 11/27/2022 8:12 AM, Steven Schveighoffer wrote:
> Phobos does not make this easy either. Things like std.format are so insanely complex because you can just reach for a bunch of sub-templates. It's easy to write the code, but it increases compile times significantly.

I once tracked just what was being instantiated to format an integer into a string. The layers of templates are literally 10 deep. One template forwards to the next, which forwards to the next, which forwards to the next, 10 layers deep. This is not D's fault. It's poor design of the conversion code.

> I still have some hope that there are ways to decrease the template cost that will just improve performance across the board. Maybe that needs a new frontend compiler, I don't know.

Phobos2 needs to take a hard look at all the template forwarding going on. I've also noticed that many templates can be replaced with 2 or 3 ordinary function overloads.
Nov 28 2022
On Monday, 28 November 2022 at 22:27:34 UTC, Walter Bright wrote:
> Phobos2 needs to take a hard look at all the template forwarding going on. I've also noticed that many templates can be replaced with 2 or 3 ordinary function overloads.

Finally, we are starting to realize the cost of these issues in terms of lost productivity during incremental development.
Nov 28 2022
On 11/28/2022 2:39 PM, Per Nordlöw wrote:
> Finally, we are starting to realize the cost of these issues in terms of lost productivity during incremental development.

I've complained about the conversion thing for 10 years now :-)
Nov 28 2022
Early into my initial scoping of value type exceptions, I looked into std.format.

There is no reason formattedWrite should allocate, right? So why isn't it already working with -betterC? Well, the answer is quite simple: sooooo many exceptions are strewn throughout, ready to be fired off.

I kinda gave up any hope that it could ever be usable in even the harshest of scenarios. But if we are thinking about doing a full rewrite of it, it would certainly be good to ditch the class-based exception mechanism for error handling!
Nov 28 2022
On 11/28/22 5:27 PM, Walter Bright wrote:
> Phobos2 needs to take a hard look at all the template forwarding going on. I've also noticed that many templates can be replaced with 2 or 3 ordinary function overloads.

Sure, you can look at this as "templates are bad, we shouldn't use them as much", but I see it more as a problem of "templates are bad, we should make them less bad".

I am also a firm believer that running ordinary functions instead of templates can be much easier to write, easier to debug, and maybe easier to optimize with a new CTFE engine. Perhaps it *is* just a case of using the wrong tool for the job. But let's also see if there's anything we can do about template performance. And we have to make it more pleasant to use such things (type functions would be nice to have).

I did a test on something I was working on for my talk, and I'm going to write a blog post about it, because I'm kind of stunned at the results. But in essence, the template `ReturnType!fun` adds 60KB permanently to the RAM usage of the compiler, even if the function is just a temporary lambda used to check a constraint, and it adds a non-significant amount of compile time vs. just `is(typeof(fun()) T)`. The compile time difference is hard to measure though; let's say it's 500µs.

I think we need to start picking apart how these things are being processed in the compiler, and realize that while each one doesn't add *that* much, all those little 60KBs and 500µs add up when you are generating a significant tonnage of templates and CTFE.

D's *core strength* is compile-time metaprogramming and code generation. It shouldn't also be the thing that drives you away because of compile times and memory usage. In other words, we shouldn't have to say "oh, you did it wrong because you used too much of D's cool unique features".

Maybe I'm wrong; maybe we just have to tell people not to use these things. But then they really shouldn't be in Phobos...

-Steve

P.S. When I say "we" should make them better, I'm shamefully aware that I am too ignorant to be part of that "we"; it's like the compiler devs are my sports team and I refer to them and me as "we" like I'm on the team! I appreciate all you guys do!
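The comparison in question can be sketched like this (illustrative only; the 60KB/500µs numbers above come from Steven's measurement, not from this snippet):

```d
import std.traits : ReturnType;

int fun() { return 42; }

// Heavier: ReturnType!fun instantiates a Phobos template, whose
// instantiation stays cached in compiler memory.
static assert(is(ReturnType!fun == int));

// Lighter: a bare is() expression binds the return type to T directly,
// with no template instantiation at the use site.
static if (is(typeof(fun()) T))
    static assert(is(T == int));
```

The `is(typeof(fun()) T)` form only makes `T` visible inside the `static if` branch, which is usually exactly where a constraint check needs it.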
Nov 28 2022
On Tuesday, 29 November 2022 at 01:58:48 UTC, Steven Schveighoffer wrote:
> D's *core strength* is compile-time metaprogramming and code generation.

If there were no `metaprogramming` in `D`, why not use `C++`? The author left `D` because the compile-time performance of `D` currently cannot meet the needs of `heavy` metaprogramming.
Nov 28 2022
On 11/28/22 9:12 PM, zjh wrote:
> If there is no `metaprogramming` for `D`, why not use `C++`? [...]

1) C++ metaprogramming is... not the same.
2) I think if you are looking for better compile times, C++ is not the right path.

-Steve
Nov 28 2022
On Tuesday, 29 November 2022 at 02:37:25 UTC, Steven Schveighoffer wrote:
> [...]

Looking forward to your article.
Nov 28 2022
On Tuesday, 29 November 2022 at 02:37:25 UTC, Steven Schveighoffer wrote:
> 1) C++ metaprogramming is... not the same.
> 2) I think if you are looking for better compile times, C++ is not the right path.

Using C++20 modules in Visual C++, I can assert that that reason to flame C++ will eventually be sorted out. All my hobby coding with C++ now makes use of C++ modules.

And yes, this is yet another thing that C++ has taken inspiration from D on.
Nov 29 2022
We have made several attempts at making templates faster, such as the alias reassignment change. Some big improvements happened. Some benchmarking shows that a couple templates were at the top of the list of time lost instantiating them. I hardwired them into the compiler, and now those go pretty fast. A lot can be done by simply going through Phobos and examining the templates that forward to other templates, and perhaps manually inlining templates to avoid expansions.
Nov 28 2022
On Sunday, 27 November 2022 at 09:29:29 UTC, FeepingCreature wrote:
> (snip)

Hey, good on you for reaching out to the guy!! That's really cool.

I had thought there would be some obvious reason why the compile times would be so bad, but I guess there's not. Maybe it's time to revisit Stefan's type functions? Or, even though it won't exactly help the template slowness, work on getting newCTFE finished up?
Nov 28 2022
On Monday, 28 November 2022 at 21:45:37 UTC, TheGag96 wrote:
> Maybe it's time to revisit Stefan's type functions? Or, even though it won't exactly help the template slowness, work on getting newCTFE finished up?

In a world where language competition is fierce, any `improvement` is worth it.
Nov 28 2022
On Tuesday, 29 November 2022 at 02:16:14 UTC, zjh wrote:
> In a world where language competition is fierce, any `improvement` is worth it.

`D` should also absorb some people into the `core team` and allow them to `play freely`. Take advantage of the fact that they already know D very well.
Nov 28 2022
On 29/11/2022 3:22 PM, zjh wrote:
> `'D'` should also absorpt some people into the `core team` and allow them to `play freely`. Take advantage of the fact that they already know D very well.

You don't need to be in the core team to contribute, or to experiment.
Nov 28 2022
On 11/27/2022 1:29 AM, FeepingCreature wrote:
> Like `is(T : U*, U)` instead of `isPointer`.

std.traits.isPointer is defined as:

enum bool isPointer(T) = is(T == U*, U) && __traits(isScalar, T);

though I have no idea why the isScalar is there. When is a pointer ever not a scalar?
Nov 28 2022
On 11/27/2022 1:29 AM, FeepingCreature wrote:
> 3. But also if we're talking about number of instantiations, `hasUDA` and `getUDA` lead the pack. I think the way these work is just bad - I've rewritten all my own `hasUDA`/`getUDA` code to be of the form `udaIndex!(U, __traits(getAttributes, T))` - instantiating a unique copy for every combination of field and UDA is borderline quadratic [...]

hasUDA and getUDAs are defined:

enum hasUDA(alias symbol, alias attribute) = getUDAs!(symbol, attribute).length != 0;

template getUDAs(alias symbol, alias attribute)
{
    import std.meta : Filter;

    alias getUDAs = Filter!(isDesiredUDA!attribute, __traits(getAttributes, symbol));
}

These do look pretty inefficient. Who wants to fix Phobos with FeepingCreature's solution?
Nov 28 2022
On Tuesday, 29 November 2022 at 07:15:25 UTC, Walter Bright wrote:
> These do look pretty inefficient. Who wants to fix Phobos with FeepingCreature's solution?

Well, in his codebase I ended up just redefining `hasUDA` in terms of `udaIndex`, and even though `hasUDA` led the pack in `-vtemplates`, this didn't actually result in any noticeable change in speed. I think even though `hasUDA` gets instantiated a lot, it doesn't result in much actual compile time. Unfortunately, there's no good way to know this without porting everything, which is, I think, the actual problem.
Nov 29 2022
On 30/11/2022 1:43 AM, FeepingCreature wrote:
> Unfortunately there's no good way to know this without porting everything

If we had some way to determine the cost of template instantiation then we would have a good idea, but that tool is currently missing. Very high value this feature would be.
Nov 29 2022
On Sunday, 27 November 2022 at 09:29:29 UTC, FeepingCreature wrote:
> [...] To be fair, his computer isn't the fastest. But it's an 8core AMD, so DMD's lack of internal parallelization hurts it here. This will only get worse in the future.

Hello, I'm far from being a D compilation specialist, but in case this is of any use or inspiration: I've been using parallel compilation for a few years now, recompiling only the new files, one-by-one, distributed over the available cores, then linking. Here it is, just one bash script: https://github.com/glathoud/d_glat/blob/master/dpaco.sh (So far used with LDC only.)

The result is far from perfect; sometimes the resulting binary does not reflect a code change, but 80-90% of the time it does. And I don't have to maintain a build system at all. Overall this approach saves quite a bit of time - and improves motivation, having to wait only a few seconds on a project that has grown to about 180 D files. My use of templating is limited but happens regularly.

If there is, or were, a 100% reliable solution to do parallel compilation without a build system, that'd be wonderful. Not just for me, I guess.

Best regards,
Guillaume
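For comparison, the same per-file idea can be sketched with plain make (this is not dpaco.sh; the compiler choice, directory layout, and flags are assumptions): each .d file gets its own object, `make -j$(nproc)` distributes the compiles over the cores, and only out-of-date files are recompiled before the final link.

```make
DC   := ldc2
SRCS := $(wildcard *.d)
OBJS := $(SRCS:%.d=.obj/%.o)

# Relink only when some object changed.
app: $(OBJS)
	$(DC) $^ -of=$@

# One object per module, compiled independently (parallel with -j).
.obj/%.o: %.d
	@mkdir -p .obj
	$(DC) -c $< -of=$@
```

The same caveat applies as with the script above: compiling D modules one-by-one tracks only source-to-object staleness, not cross-module template dependencies, which is exactly how a binary can end up not reflecting a code change.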
Nov 29 2022