digitalmars.D - Building C++ modules

reply Antonio Corbi <antonio ggmail.cov> writes:
Hi!

Jussi Pakkanen (from Meson build system fame) has written this 
article[1] about compiling C++ code that uses modules. I think 
it's worth a read.

Antonio

[1] 
https://nibblestew.blogspot.com/2019/08/building-c-modules-take-n1.html
Aug 07
parent reply Russel Winder <russel winder.org.uk> writes:
On Wed, 2019-08-07 at 16:04 +0000, Antonio Corbi via Digitalmars-d wrote:
 Hi!
 
 Jussi Pakkanen (from Meson build system fame) has written this 
 article[1] about compiling C++ code that uses modules. I think 
 it's worth a read.
 
 Antonio
 
 [1] 
 https://nibblestew.blogspot.com/2019/08/building-c-modules-take-n1.html
So if C++ has this dependency problem why doesn't D, Go, or Rust?

Chapel developers decided to allow for separate compilation but tend to advocate whole-source-at-once compilation. This is cool as you can do global optimisation in a way not possible with separate compilation.

If one of D's plus points is fast parsing and compilation, perhaps whole-source-at-once compilation should be a good thing?

-- 
Russel.
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk
Aug 08
next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 8 August 2019 at 10:10:30 UTC, Russel Winder wrote:
 On Wed, 2019-08-07 at 16:04 +0000, Antonio Corbi via 
 Digitalmars-d wrote:
 Hi!
 
 Jussi Pakkanen (from Meson build system fame) has written this 
 article[1] about compiling C++ code that uses modules. I think 
 it's worth a read.
 
 Antonio
 
 [1] 
 https://nibblestew.blogspot.com/2019/08/building-c-modules-take-n1.html
So if C++ has this dependency problem why doesn't D, Go, or Rust?
I think that one of the reasons is that D, Go, and Rust (and Java, and...) map module names to file names, so other than -I flags or equivalent there's no need to figure out where the modules are actually located. C++, to preserve support for platforms that don't have a hierarchical filesystem (I have no idea what they are or why anyone cares about them), decided to punt on how to find modules and leaves it up to the implementation. The other difference is that despite having modules, C++20 still has the equivalent of module headers and module implementations. I read a whole blog post explaining C++ modules once. I shook my head throughout and can't believe what they've gone with. I've also forgotten most of it, probably due to my brain protecting itself lest I go mad.
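As a rough illustration of the module-name-to-file-name mapping described above, here is a minimal sketch in D. `moduleToRelPath` and `findModule` are hypothetical helpers written for this example, not compiler APIs, and a real compiler handles more cases (interface files, package modules) that are skipped here.

```d
import std.array : replace;
import std.file : exists;
import std.path : buildPath, dirSeparator;

/// Hypothetical helper: "foo.bar.baz" -> "foo/bar/baz.d"
/// (using the platform's directory separator).
string moduleToRelPath(string moduleName)
{
    return moduleName.replace(".", dirSeparator) ~ ".d";
}

/// Hypothetical helper: try each import root (the equivalent of the -I
/// flags) in order and return the first file that exists, or null.
string findModule(string moduleName, string[] importPaths)
{
    foreach (root; importPaths)
    {
        auto candidate = buildPath(root, moduleToRelPath(moduleName));
        if (candidate.exists)
            return candidate;
    }
    return null;
}

void main()
{
    // The module name alone determines the relative path.
    assert(moduleToRelPath("std.algorithm.iteration")
        == buildPath("std", "algorithm", "iteration.d"));
}
```

The point of the sketch is that nothing has to be scanned or built first: the name in the import statement is enough to know which file to open.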
 If one of D's plus points is fast parsing and compilation, 
 perhaps whole source at once compilation should be a good thing?
I don't think it is. Fast is relative, and it's death by a thousand cuts for me at the moment. And this despite the fact that I use reggae as much as I can, which means I wait less than most other D programmers on average to get a compiled binary!
Aug 09
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Friday, 9 August 2019 at 08:37:25 UTC, Atila Neves wrote:
 The other difference is that despite having modules, C++20 
 still has the equivalent of module headers and module 
 implementations.
Don't we have those too? .hd files.
 Fast is relative, and it's death by a thousand cuts for me at 
 the moment. And this despite the fact that I use reggae as much 
 as I can, which means I wait less than most other D programmers 
 on average to get a compiled binary!
I remember that you did blog that 1.5 seconds compile time or something like that is already frustratingly long for you. What do you think is the effect of compile time on the programmer? I had to use Dub with a --combined build back on Windows, and while it sure was a lot slower than a bare D compiler, it was still faster than the main part of my project written in C#. Even then, I found the compile time (haven't measured, but probably like 20 seconds for debug build if both D and C# files were changed) only mildly annoying. So I'm interested, are virtually-instant compiles something one has to experience to understand their benefit?
Aug 09
next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 9 August 2019 at 09:22:12 UTC, Dukc wrote:
 On Friday, 9 August 2019 at 08:37:25 UTC, Atila Neves wrote:
 The other difference is that despite having modules, C++20 
 still has the equivalent of module headers and module 
 implementations.
Don't we have those too? .hd files.
.di files are usually auto-generated, not needed, and AFAIK not that used.
 I remember that you did blog that 1.5 seconds compile time or 
 something like that is already frustratingly long for you.
Indeed.
 What do you think is the effect of compile time on the 
 programmer?
From experience, it makes me work much slower if I don't get results in less than 100ms. If I'm not mistaken, IBM did a study on this that I read once but never managed to find again about how much faster people worked on short feedback cycles.
 I had to use Dub with a --combined build back on Windows,
Doing pretty much anything on Windows will mean it'll be slower.
 and while it sure was a lot slower than a bare D compiler, it 
 was still faster than the main part of my project written in 
 C#. Even then, I found the compile time (haven't measured, but 
 probably like 20 seconds for debug build if both D and C# files 
 were changed) only mildly annoying.
That's why I said fast is relative. What's mildly annoying for you makes my brain slow to a crawl, never mind the frustration and desire to headbutt the wall repeatedly.
 So I'm interested, are virtually-instant compiles something one 
 has to experience to understand their benefit?
Maybe? There's the IBM research I talked about, but I can't remember the details or the general applicability.
Aug 09
next sibling parent reply Ethan <gooberman gmail.com> writes:
On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 .di files are usually auto-generated, not needed, and AFAIK not 
 that used.
They're needed. At least, the idea of them is needed. Right now, it's just a glorified implementation stripper. Gets rid of everything between {} that aren't aggregate/enum definitions, and leaves the slow and expensive mixins there to be recompiled every. single. time. they're. imported. .di files need to be redefined to represent a module after mixins have resolved.
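To make the complaint in the post above concrete, here is a small sketch of the kind of module being described. `makeGetters` and `Rect` are hypothetical names invented for this example. The assumption, taken from the post rather than verified here, is that a generated interface file keeps the `mixin(...)` line, so each importing compilation has to run the generator and compile its output again.

```d
/// Hypothetical CTFE helper: builds the source text of getter functions.
string makeGetters(string[] names)
{
    string code;
    foreach (name; names)
        code ~= "int " ~ name ~ "() const { return _" ~ name ~ "; }\n";
    return code;
}

struct Rect
{
    private int _width, _height;

    // The string is produced at compile time and then compiled as code.
    // A file that only strips function bodies would still carry this line,
    // whereas a "resolved" interface would list width() and height() directly.
    mixin(makeGetters(["width", "height"]));
}

void main()
{
    auto r = Rect(3, 4);
    assert(r.width == 3 && r.height == 4);
}
```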
Aug 09
parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 9 August 2019 at 13:42:27 UTC, Ethan wrote:
 On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 .di files are usually auto-generated, not needed, and AFAIK 
 not that used.
They're needed. At least, the idea of them is needed. Right now, it's just a glorified implementation stripper. Gets rid of everything between {} that aren't aggregate/enum definitions, and leaves the slow and expensive mixins there to be recompiled every. single. time. they're. imported. .di files need to be redefined to represent a module after mixins have resolved.
Hmm, interesting! Is that you volunteering to work on that? :P
Aug 12
prev sibling next sibling parent Dukc <ajieskola gmail.com> writes:
On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms.
Wow! I guess my work habits are wayyyyy too lax :D.
 Doing pretty much anything on Windows will mean it'll be slower.
The C# compilation feels about as fast, perhaps because I use its compiler via Wine. But DUB on Linux is just a lightning strike (from my perspective). And that's even if run on the NTFS-using Windows partition of my hard drive! And OpenSUSE does not seem to periodically choke on running some dubious background process, unlike Windows. So, I mostly agree.
Aug 09
prev sibling next sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Fri, 2019-08-09 at 13:17 +0000, Atila Neves via Digitalmars-d wrote:
 
[…]
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms. If I'm not mistaken, IBM did a study 
 on this that I read once but never managed to find again about 
 how much faster people worked on short feedback cycles.
 
[…]
Most people in the world couldn't tell the difference between 100ms and 200ms.

But this leads to a whole off-theme discussion about psychology, reaction times, and "won't wait" times.

Also of course: https://www.xkcd.com/303/

-- 
Russel.
Aug 10
parent reply Atila Neves <atila.neves gmail.com> writes:
On Saturday, 10 August 2019 at 08:17:03 UTC, Russel Winder wrote:
 On Fri, 2019-08-09 at 13:17 +0000, Atila Neves via 
 Digitalmars-d wrote:
 
[…]
  From experience, it makes me work much slower if I don't get
 results in less than 100ms. If I'm not mistaken, IBM did a 
 study
 on this that I read once but never managed to find again about
 how much faster people worked on short feedback cycles.
 
[…] Most people in the world couldn't tell the difference between 100ms and 200ms.
Musicians can ;) The threshold for noticeable latency in an audio interface is ~10ms. Imagine if it took 100ms to go from hitting a key to seeing the character on screen. That's almost how slow compile times feel to me.
 But this leads to a whole off-theme discussion about 
 psychology, reaction times, and "won't wait" times.
Indeed.
 Also of course: https://www.xkcd.com/303/
Classic.
Aug 12
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/12/2019 2:40 AM, Atila Neves wrote:
 Indeed.
Building a syntax check into one's code editor would probably double compile speeds because half the errors would be detected before you even wrote the file out :-)
Aug 14
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 14 August 2019 at 22:56:30 UTC, Walter Bright wrote:
 On 8/12/2019 2:40 AM, Atila Neves wrote:
 Indeed.
Building a syntax check into one's code editor would probably double compile speeds because half the errors would be detected before you even wrote the file out :-)
I already have that with flycheck in emacs. My main concern is running tests, specifically only the tests needed to be compiled/interpreted and getting fast feedback on whether I broke anything or not.
Aug 15
prev sibling parent reply matheus <matheus gmail.com> writes:
On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms. If I'm not mistaken, IBM did a 
 study on this that I read once but never managed to find again 
 about how much faster people worked on short feedback cycles.
This is a bit of an exaggeration, right? We're talking about the speed of a human blink. I can't see a great difference between 1 sec vs 100 ms "while working".

Of course someone could say if you did 10 consecutive compilations, then 10 x 100ms = 1 sec while in the other case would be 10 seconds, but this is extreme, you usually take some time to change code and compile. But overall I couldn't be bothered at all.

Now imagine waiting ~40 seconds just to open any solution on Visual Studio (Mostly used for projects where I work), on a NOT so old machine, and like 4 ~ 10 seconds every time you run an app while debugging.

That is the meaning of pain.

Matheus.
Aug 14
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
 I can't see a great difference between 1 sec vs 100 ms "while 
 working".
It is pretty frustrating and easy to lose your train of thought as things make you wait. Though I find 1 second to still generally be acceptable, once we get into like 3 seconds it is getting absurd.
Aug 14
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
 On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms. If I'm not mistaken, IBM did a 
 study on this that I read once but never managed to find again 
 about how much faster people worked on short feedback cycles.
This is a bit of an exaggeration, right?
No, no exaggeration.
 We're talking about the speed of a human blink.
Apparently blinks are slower than that (I just googled). It doesn't matter though, since it has an effect. As I mentioned before, latencies over 10ms on an audio interface are noticeable by musicians when they play live through effects, and that's an order of magnitude removed.
 I can't see a great difference between 1 sec vs 100 ms "while 
 working".
I can. It makes me want to punch the wall.
 Of course someone could say if you did 10 consecutive 
 compilations, then 10 x 100ms = 1 sec while in the other case 
 would be 10 seconds, but this is extreme, you usually take some 
 time to change code and compile.
No, it's not that. It's that it interrupts my train of thought. I can work faster if I get faster feedback.
 But overall I couldn't be bothered at all.

 Now imagine waiting ~40 seconds just to open any solution on 
 Visual Studio (Mostly used for projects where I work), on a NOT 
 so old machine, and like 4 ~ 10 seconds every time you run an 
 app while debugging.

 That is the meaning of pain.
I can take waiting 40s once a day. I can't take waiting >1s every time I build though.
Aug 15
next sibling parent reply Exil <Exil gmall.com> writes:
On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
 On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms. If I'm not mistaken, IBM did a 
 study on this that I read once but never managed to find 
 again about how much faster people worked on short feedback 
 cycles.
This is a bit of an exaggeration, right?
No, no exaggeration.
Don't know any compiler that's that fast, definitely not D and not even Jai.
 We're talking about the speed of a human blink.
Apparently blinks are slower than that (I just googled). It doesn't matter though, since it has an effect. As I mentioned before, latencies over 10ms on an audio interface are noticeable by musicians when they play live through effects, and that's an order of magnitude removed.
 I can't see a great difference between 1 sec vs 100 ms "while 
 working".
I can. It makes me want to punch the wall.
The difference is noticeable but really not to that point. What do you do when you have to wait 30 mins? I guess some people are just less trigger-happy and more patient than others.
 Of course someone could say if you did 10 consecutive 
 compilations, then 10 x 100ms = 1 sec while in the other case 
 would be 10 seconds, but this is extreme, you usually take some 
 time to change code and compile.
No, it's not that. It's that it interrupts my train of thought. I can work faster if I get faster feedback.
Wouldn't compiler errors do the same thing, if not worse? Not only do they interrupt your train of thought, they require you to think about something else entirely. What do you do when you get a compiler error?
 But overall I couldn't be bothered at all.

 Now imagine waiting ~40 seconds just to open any solution on 
 Visual Studio (Mostly used for projects where I work), on a 
 NOT so old machine, and like 4 ~ 10 seconds every time you run 
 an app while debugging.

 That is the meaning of pain.
I can take waiting 40s once a day. I can't take waiting >1s every time I build though.
Feel like you don't have to wait. You can continue to do other things while it is compiling. I suppose some people aren't as good at multi tasking.
Aug 15
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 15, 2019 at 06:53:53PM +0000, Exil via Digitalmars-d wrote:
 On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
[...]
 I can't see a great difference between 1 sec vs 100 ms "while
 working".
I can. It makes me want to punch the wall.
The difference is noticeable but really not to that point. What do you do when you have to wait 30 mins? I guess some people are just less trigger-happy and more patient than others.
Keep in mind, you're talking to a long-time D user who has gotten used to lightning-fast compile speeds.

In the old days, when I was still using C/C++ by choice, compilation breaks were an accepted norm. It's just the thing that you do after X minutes of coding, and it's known that it's a slow process. You just take a coffee break, a washroom break, browse the internet, or whatever, and then resume working when the build is done. It's just the way things were. Working continuously was unimaginable, and you never missed it because you never experienced such a thing before.

Now that I'm used to D compilation speeds, I find anything longer than 3-5 seconds totally intolerable. With a <1-2 second turnaround time, I find myself in a completely different mode of thought -- I can try out experimental bits of code and get almost instant feedback on the effect it has on the program. Within a short 5-minute span I can have already tested out 20-25 different implementation ideas and zeroed in on the best one. Your train of thought can actually proceed uninterrupted, and your coding process gets elevated to a new level of intense focus and productivity. Whereas back in the day of C/C++ slow compiles, such a process would have taken hours, if not days, and development speed would be tortoise-slow.

Once you've tasted this level of coding focus and productivity, going back to the old way is simply unpalatable. You find yourself losing your train of thought, distracted by other things during compilation and need to spend (i.e., waste) some time to refocus afterwards to get back in the groove. Productivity is abysmal, relatively speaking. Too much time wasted redirecting your attention from various distractions back to the problem at hand.

Like having to constantly switch your hand from the keyboard to the mouse instead of a keyboard-driven UI where your fingers are right there and ready to go, having a slow compile just pulls you back an order of magnitude in productivity.

(I used to use a mouse-driven GUI for decades -- now I use a 99.99% keyboard-only interface with no distracting frills, and I can tell you productivity has skyrocketed to a whole 'nother level never before imaginable.)

[...]
 No, it's not that. It's that it interrupts my train of thought. I
 can work faster if I get faster feedback.
Wouldn't compiler errors do the same thing, if not worse? Not only do they interrupt your train of thought, they require you to think about something else entirely. What do you do when you get a compiler error?
Compile errors that appear instantly means you're still focused and can immediately get on the task of identifying the problem code. Compile errors that appear after X seconds means you spend an additional Y seconds refocusing your brain on the programming problem at hand, and *then* get on the task of identifying the problem code, thus slowing you down by (X+Y+Z) seconds rather than just spending the Z seconds finding the problem. [...]
 Now imagine waiting ~40 seconds just to open any solution on
 Visual Studio (Mostly used for projects where I work), on a NOT so
 old machine, and like 4 ~ 10 seconds every time you run an app
 while debugging.
 
 That is the meaning of pain.
And that is why I don't bother with IDEs, or anything, really, that has needless eye-candy and frills I don't use. Give me vim over an SSH remote connection, and I can be 100% productive anywhere. A GUI that requires umpteen GBs of RAM and 40s to start? Not even on my radar.
 I can take waiting 40s once a day. I can't take waiting >1s every
 time I build though.
Feel like you don't have to wait. You can continue to do other things while it is compiling. I suppose some people aren't as good at multi tasking.
It's not an issue of multitasking. It's an issue of wasting time because you have to keep switching mental gears. Context-switching is not free, even in the human brain. :-D (I'd even say *especially* in the human brain -- CPU context switches are basically undetectable as far as human perception is concerned, but switching mental gears definitely takes a much more significant amount of time.) T -- Leather is waterproof. Ever see a cow with an umbrella?
Aug 15
next sibling parent reply Exil <Exil gmall.com> writes:
On Thursday, 15 August 2019 at 19:30:44 UTC, H. S. Teoh wrote:
 On Thu, Aug 15, 2019 at 06:53:53PM +0000, Exil via 
 Digitalmars-d wrote:
 On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
[...]
 I can't see a great difference between 1 sec vs 100 ms 
 "while working".
I can. It makes me want to punch the wall.
The difference is noticeable but really not to that point. What do you do when you have to wait 30 mins? I guess some people are just less trigger-happy and more patient than others.
Keep in mind, you're talking to a long-time D user who has gotten used to lightning-fast compile speeds. In the old days, when I was still using C/C++ by choice, compilation breaks were an accepted norm. It's just the thing that you do after X minutes of coding, and it's known that it's a slow process. You just take a coffee break, a washroom break, browse the internet, or whatever, and then resume working when the build is done. It's just the way things were. Working continuously was unimaginable, and you never missed it because you never experienced such a thing before. Now that I'm used to D compilation speeds, I find anything longer than 3-5 seconds totally intolerable. With a <1-2 second turnaround time, I find myself in a completely different mode of thought -- I can try out experimental bits of code and get almost instant feedback on the effect it has on the program. Within a short 5-minute span I can have already tested out 20-25 different implementation ideas and zeroed in on the best one. Your train of thought can actually proceed uninterrupted, and your coding process gets elevated to a new level of intense focus and productivity. Whereas back in the day of C/C++ slow compiles, such a process would have taken hours, if not days, and development speed would be tortoise-slow. Once you've tasted this level of coding focus and productivity, going back to the old way is simply unpalatable. You find yourself losing your train of thought, distracted by other things during compilation and need to spend (i.e., waste) some time to refocus afterwards to get back in the groove. Productivity is abysmal, relatively speaking. Too much time wasted redirecting your attention from various distractions back to the problem at hand. Like having to constantly switch your hand from the keyboard to the mouse instead of a keyboard-driven UI where your fingers are right there and ready to go, having a slow compile just pulls you back an order of magnitude in productivity. 
(I used to use a mouse-driven GUI for decades -- now I use a 99.99% keyboard-only interface with no distracting frills, and I can tell you productivity has skyrocketed to a whole 'nother level never before imaginable.)
See my experience with D is that it still takes over a minute to compile my relatively small program. A minute isn't really fast. Even worse if you use -O with DMD (I stopped it after 20 mins). D really isn't a fast language, or rather the only frontend is really really slow especially for CTFE.
 [...]
 No, it's not that. It's that it interrupts my train of 
 thought. I can work faster if I get faster feedback.
Wouldn't compiler errors do the same thing, if not worse? Not only do they interrupt your train of thought, they require you to think about something else entirely. What do you do when you get a compiler error?
Compile errors that appear instantly means you're still focused and can immediately get on the task of identifying the problem code. Compile errors that appear after X seconds means you spend an additional Y seconds refocusing your brain on the programming problem at hand, and *then* get on the task of identifying the problem code, thus slowing you down by (X+Y+Z) seconds rather than just spending the Z seconds finding the problem.
Now you have to read and interpret something else entirely. You're not waiting 200 ms to continue your line of thought, you're wasting minutes if not more if you continue to get compiler errors. Especially one of those template errors that are difficult to interpret. It's the same problem but x10 worse.

You're not the person I was replying to, so sure, maybe it's something else to you, but the person I was replying to made it pretty clear what it was for them.

Just curious, how fast do you type? If it's about wasting time for you I imagine you must type pretty quickly then :P.
 [...]
 Now imagine waiting ~40 seconds just to open any solution 
 on Visual Studio (Mostly used for projects where I work), 
 on a NOT so old machine, and like 4 ~ 10 seconds every 
 time you run an app while debugging.
 
 That is the meaning of pain.
And that is why I don't bother with IDEs, or anything, really, that has needless eye-candy and frills I don't use. Give me vim over an SSH remote connection, and I can be 100% productive anywhere. A GUI that requires umpteen GBs of RAM and 40s to start? Not even on my radar.
Haven't used an IDE in a long time either. The only one that's ever really been of note is Visual Studio, and their release cycles are so slow. They don't really add anything to the IDE anymore. It's mostly just the compiler they are updating. Compare that to other editors that release a new build every month. There's only really a handful of features that make IDEs useful for productivity; most editors have a few of those features, even fewer have most of them.
 I can take waiting 40s once a day. I can't take waiting >1s 
 every time I build though.
Feel like you don't have to wait. You can continue to do other things while it is compiling. I suppose some people aren't as good at multi tasking.
It's not an issue of multitasking. It's an issue of wasting time because you have to keep switching mental gears. Context-switching is not free, even in the human brain. :-D (I'd even say *especially* in the human brain -- CPU context switches are basically undetectable as far as human perception is concerned, but switching mental gears definitely takes a much more significant amount of time.) T
You've just described multitasking. If you have trouble keeping multiple things in your head at once, and it takes you a long time to recall where you were in two unrelated tasks that you were doing relatively closely together, then you aren't very good at multitasking.
Aug 15
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 15, 2019 at 08:09:33PM +0000, Exil via Digitalmars-d wrote:
[...]
 See my experience with D is that it still takes over a minute to
 compile my relatively small program.
I've seen that before. The cause is usually one or more of:

(1) Using dub. Dub is, sorry to say, dog-slow. I do not bother with dub because of that (plus a variety of other reasons). I use a saner build system that invokes dmd directly, and it's a lot faster than using dub.

(2) Using too many templates / CTFE. CTFE is known to be dog-slow. (Diet templates are a leading cause of dog-slow compilation in vibe.d projects, because they try to do too much at compile-time. I haven't gotten around to it yet, but in my own vibe.d project the goal is to move away from Diet templates and use a faster, home-brew HTML generation solution instead.) NewCTFE is supposed to improve this, but at the rate things are going, I'm not holding my breath for it. Templates are also known to be slow when you get into recursive templates or just careless, wanton use of compile-time arguments where runtime arguments do just fine. If you reduce your use of templates, and use runtime code where it's not important to stuff everything into compile-time, your compile times will improve a lot.

(3) Using certain Phobos modules that are known to be very slow to compile, such as std.regex. The underlying cause is essentially the same as (2), but I thought I'd point it out because sometimes it's not obvious that that's the problem when all you did was to import Phobos.
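The "compile-time arguments where runtime arguments do just fine" point in (2) can be sketched as follows. `powerCT` and `powerRT` are hypothetical functions made up for this illustration, and no timing claim is attached to such a small example; it only shows where the extra compiler work comes from.

```d
/// Compile-time exponent: every distinct `n` is a separate template
/// instance that the compiler has to analyse and generate code for.
int powerCT(int n)(int base)
{
    int result = 1;
    static foreach (i; 0 .. n)
        result *= base;   // unrolled n times in each instance
    return result;
}

/// Runtime exponent: one function, compiled once, serves every `n`.
int powerRT(int base, int n)
{
    int result = 1;
    foreach (i; 0 .. n)
        result *= base;
    return result;
}

void main()
{
    // Three instantiations: powerCT!2, powerCT!3 and powerCT!4.
    assert(powerCT!2(3) == 9);
    assert(powerCT!3(3) == 27);
    assert(powerCT!4(3) == 81);

    // One function body, three calls.
    assert(powerRT(3, 2) == 9);
    assert(powerRT(3, 3) == 27);
    assert(powerRT(3, 4) == 81);
}
```

When the exponent does not need to be known at compile time, the runtime form gives the same results while asking the compiler to do the work only once.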
 A minute isn't really fast.
If my D project is taking a minute to compile, I'd seriously look into eliminating needless templates/CTFE and/or replacing dog-slow Phobos modules with custom code.
 Even worse if you use -O with DMD (I stopped it after 20 mins).
IMNSHO, using -O with DMD is a waste of time. It increases compilation time, and has a higher probability of running into a compiler (usu. backend) bug, yet the resulting executable is still woefully suboptimal compared to, say, gdc or ldc. Just not worth it. If performance was important to me (and it is, in some of my projects), I'd just use ldc2 outright and forget about DMD.
 D really isn't a fast language, or rather the only frontend is really
 really slow especially for CTFE.
I understand the sentiment, but I think it's an unfair comparison. If I were to implement in C++ the equivalent of some of the CTFE functionality that's making my compilation slow, I'm almost certain the resulting C++ compile times will make dmd look like lightning speed by comparison. CTFE *is* known to be slow, no question about that, but I suspect it's still a lot faster than what it would have taken to accomplish the same thing in C++. [...]
 Compile errors that appear instantly means you're still focused and
 can immediately get on the task of identifying the problem code.
 Compile errors that appear after X seconds means you spend an
 additional Y seconds refocusing your brain on the programming
 problem at hand, and *then* get on the task of identifying the
 problem code, thus slowing you down by (X+Y+Z) seconds rather than
 just spending the Z seconds finding the problem.
Now you have to read and interpret something else entirely. You're not waiting 200 ms to continue your line of thought, you're wasting minutes if not more if you continue to get compiler errors. Especially one of those template errors that are difficult to interpret. It's the same problem but x10 worse.
But you have to spend that time *anyway*, eventually if not right then, when you have to fix the bug in your code. Why make the total time even longer by being forced to wait for long compilation times?
 You're not the person I was replying to, so sure
 maybe it's something else to you, but the person I was replying to made
 it pretty clear what it was for them.
 
 Just curious, how fast do you type? If it's about wasting time for you
 I imagine you must type pretty quickly then :P.
[...] I type relatively fast -- not super-fast, mind you, but the point is that long compilation times are *on top* of my typing times. If I'm already typing not super-fast, then I really don't want compilation times to make the total time even longer. T -- Creativity is not an excuse for sloppiness.
Aug 15
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 15 August 2019 at 22:37:16 UTC, H. S. Teoh wrote:
 On Thu, Aug 15, 2019 at 08:09:33PM +0000, Exil via 
 Digitalmars-d wrote: [...]
 (2) Using too many templates / CTFE.  CTFE is known to be 
 dog-slow. NewCTFE is supposed to improve this, but at the rate 
 things are going, I'm not holding my breath for it.
Now, if you could hold your breath for about a year ....

Regarding Phobos, even worse than regex is std.uni, and std.format, which imports uni. And, because of what they import, also std.conv and std.stdio.

By removing all mentions of std.stdio.writeln and std.conv.to in my own code, I was able to reduce my compile-times from 30s to 2s and reduce the memory usage from 26G to .6G.
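One possible way to print an integer without importing std.stdio or std.conv is sketched below. This is not the code from the post above; `intToChars` is a hypothetical helper written for this example, and how much compile time it saves in a given project is not something this sketch measures. It relies only on `printf` from `core.stdc.stdio`, which is a thin binding to the C library.

```d
import core.stdc.stdio : printf;

/// Hypothetical stand-in for `to!string(int)`: writes the decimal digits
/// into the end of a caller-supplied buffer and returns the used slice.
char[] intToChars(int value, char[] buf)
{
    assert(buf.length >= 11); // room for "-2147483648"

    // Work in unsigned space so that int.min does not overflow on negation.
    uint magnitude = value < 0 ? 0u - cast(uint) value : cast(uint) value;

    size_t pos = buf.length;
    do
    {
        buf[--pos] = cast(char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    if (value < 0)
        buf[--pos] = '-';
    return buf[pos .. $];
}

void main()
{
    char[11] storage;
    auto text = intToChars(-42, storage[]);
    // "%.*s" prints exactly `length` bytes, so no NUL terminator is needed.
    printf("%.*s\n", cast(int) text.length, text.ptr);
}
```

The trade-off is obvious: the code is less convenient than `writeln(x)`, and it only covers the one conversion that was actually needed.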
Aug 15
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 15, 2019 at 10:48:35PM +0000, Stefan Koch via Digitalmars-d wrote:
 On Thursday, 15 August 2019 at 22:37:16 UTC, H. S. Teoh wrote:
 On Thu, Aug 15, 2019 at 08:09:33PM +0000, Exil via Digitalmars-d wrote:
 [...]
 (2) Using too many templates / CTFE.  CTFE is known to be dog-slow.
 NewCTFE is supposed to improve this, but at the rate things are
 going, I'm not holding my breath for it.
 
Now, if you could hold your breath for about a year ....
Whoa, now *that's* good news I've been dying to hear! :-)
 Regarding Phobos, even worse than regex are std.uni and std.format, which
 imports uni.
 And, because of their imports, also std.conv and std.stdio.
Oh yeah, std.format is a bear. Sadly, I can't live without it (too addicted to format-string based string output), but every second spent waiting for std.format to compile (and it does increase compilation by a significant number of seconds) is a second of anguish and teeth-gritting ("Why is std.format so dog-slow to compile?!?!").
 By removing all mentions of std.stdio.writeln and std.conv.to in my
 own code, I was able to reduce my compile times from 30s to 2s and
 reduce the memory usage from 26G to .6G.
[...] I thought std.conv.to wasn't *that* bad? Well, I guess it depends on what you use it for. The implementation *does* involve a ridiculous number of template expansions. I can't help wondering if we could refactor the ridiculous number of toImpl overloads into a single function with static-if blocks to dispatch to the various different cases. (This would also improve the documentation, which last I checked still suffers from sig constraint abuse, AKA exposing implementation details in the sig constraint that are irrelevant to the user, where it really should be a static-if + static-assert inside the implementation.) T -- "The whole problem with the world is that fools and fanatics are always so certain of themselves, but wiser people so full of doubts." -- Bertrand Russell. "How come he didn't put 'I think' at the end of it?" -- Anonymous
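For illustration only, here is a rough sketch of the "single function with static-if blocks" idea. `convertTo` is a hypothetical name, not Phobos's actual `toImpl`, and it handles just two cases; it only shows the shape of the dispatch and of the static-assert fallback.

```d
import std.traits : isIntegral, isSomeString;

/// Hypothetical stand-in for a `to`-like conversion: one template,
/// with the cases dispatched by static if instead of many
/// constraint-guarded overloads.
T convertTo(T, S)(S value)
{
    static if (is(S : T))
    {
        // Implicitly convertible: nothing to do.
        return value;
    }
    else static if (isIntegral!T && isSomeString!S)
    {
        // Minimal parser for non-negative decimals; no overflow checks.
        T result = 0;
        foreach (char c; value)
        {
            assert(c >= '0' && c <= '9', "not a decimal digit");
            result = cast(T)(result * 10 + (c - '0'));
        }
        return result;
    }
    else
    {
        // One readable message instead of a list of overloads whose
        // constraints all failed.
        static assert(0, "convertTo: cannot convert " ~ S.stringof
            ~ " to " ~ T.stringof);
    }
}

unittest
{
    assert(convertTo!long(42) == 42);
    assert(convertTo!int("123") == 123);
    // Unsupported combinations are rejected at compile time.
    static assert(!__traits(compiles, convertTo!int(3.5)));
}
```

With this shape the documented signature is just `T convertTo(T, S)(S value)`, and the supported cases live in the body rather than in the constraints.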
Aug 15
prev sibling parent reply Exil <Exil gmall.com> writes:
On Thursday, 15 August 2019 at 22:37:16 UTC, H. S. Teoh wrote:
 On Thu, Aug 15, 2019 at 08:09:33PM +0000, Exil via 
 Digitalmars-d wrote: [...]
 See my experience with D is that it still takes over a minute 
 to compile my relatively small program.
I've seen that before. The cause is usually one or more of:

(1) Using dub. Dub is, sorry to say, dog-slow. I do not bother with dub because of that (plus a variety of other reasons). I use a saner build system that invokes dmd directly, and it's a lot faster than using dub.

(2) Using too many templates / CTFE. CTFE is known to be dog-slow. (Diet templates are a leading cause of dog-slow compilation in vibe.d projects, because they try to do too much at compile-time. I haven't gotten around to it yet, but in my own vibe.d project the goal is to move away from Diet templates and use a faster, home-brew HTML generation solution instead.) NewCTFE is supposed to improve this, but at the rate things are going, I'm not holding my breath for it. Templates are also known to be slow when you get into recursive templates or just careless, wanton use of compile-time arguments where runtime arguments do just fine. If you reduce your use of templates, and use runtime code where it's not important to stuff everything into compile-time, your compile times will improve a lot.

(3) Using certain Phobos modules that are known to be very slow to compile, such as std.regex. The underlying cause is essentially the same as (2), but I thought I'd point it out because sometimes it's not obvious that that's the problem when all you did was to import Phobos.
 A minute isn't really fast.
If my D project is taking a minute to compile, I'd seriously look into eliminating needless templates/CTFE and/or replacing dog-slow Phobos modules with custom code.
I already don't use Phobos, and I can't eliminate needless CTFE because it isn't needless. D's fast except if you use this, this, this, and that. Well, then it isn't fast. Especially when the whole point of using it is those features. Then I'm wasting time trying to make my code fast for the compiler instead of using that time to make the code actually better.
 Even worse if you use -O with DMD (I stopped it after 20 mins).
IMNSHO, using -O with DMD is a waste of time. It increases compilation time, and has a higher probability of running into a compiler (usu. backend) bug, yet the resulting executable is still woefully suboptimal compared to, say, gdc or ldc. Just not worth it. If performance was important to me (and it is, in some of my projects), I'd just use ldc2 outright and forget about DMD.
Yah I don't really use DMD anymore, too many bugs and I have to build it myself.
 D really isn't a fast language, or rather the only frontend is 
 really really slow especially for CTFE.
I understand the sentiment, but I think it's an unfair comparison. If I were to implement in C++ the equivalent of some CTFE functionality that's making my compilation slow, I'm almost certain the resulting C++ compile times will make dmd look like lightning speed by comparison. CTFE *is* known to be slow, no question about that, but I suspect it's still a lot faster than what it would have taken to accomplish the same thing in C++.
It's a C++ project I converted, and it actually compiled faster in C++. I was using my own program to do what I am doing in CTFE now. As a result I was able to optimize it myself.
 [...]
 Compile errors that appear instantly means you're still 
 focused and can immediately get on the task of identifying 
 the problem code. Compile errors that appear after X seconds 
 means you spend an additional Y seconds refocusing your 
 brain on the programming problem at hand, and *then* get on 
 the task of identifying the problem code, thus slowing you 
 down by (X+Y+Z) seconds rather than just spending the Z 
 seconds finding the problem.
Now you have to read and interpret something else entirely. You're not waiting 200 ms to continue your line of thought, you're wasting minutes if not more if you continue to get compiler errors. Especially one of those template errors that are difficult to interpret. It's the same problem but 10x worse.
But you have to spend that time *anyway*, eventually if not right then, when you have to fix the bug in your code. Why make the total time even longer by being forced to wait for long compilation times?
Point being, if someone is willing to punch a wall over 100 ms, I wonder why they aren't getting angry at other things equally. Especially with how horrible some error messages can be.
 You're not the person I was replying to, so sure
 maybe it's something else to you, but the person I was replying 
 to made
 it pretty clear what it was for them.
 
 Just curious, how fast do you type? If it's about wasting time 
 for you I imagine you must type pretty quickly then :P.
[...] I type relatively fast -- not super-fast, mind you, but the point is that long compilation times are *on top* of my typing times. If I'm already typing not super-fast, then I really don't want compilation times to make the total time even longer. T
That's my point, what are you doing to improve how fast you type? What WPM? One person's fast is another person's too slow, relatively.
Aug 16
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 16, 2019 at 02:05:54PM +0000, Exil via Digitalmars-d wrote:
 On Thursday, 15 August 2019 at 22:37:16 UTC, H. S. Teoh wrote:
 On Thu, Aug 15, 2019 at 08:09:33PM +0000, Exil via Digitalmars-d wrote:
[...]
 D really isn't a fast language, or rather the only frontend is
 really really slow especially for CTFE.
I understand the sentiment, but I think it's an unfair comparison. If I were to implement in C++ the equivalent of some CTFE functionality that's making my compilation slow, I'm almost certain the resulting C++ compile times will make dmd look like lightning speed by comparison. CTFE *is* known to be slow, no question about that, but I suspect it's still a lot faster than what it would have taken to accomplish the same thing in C++.
It's a C++ project I converted, and it actually compiled faster in C++.
That's curious. Is there any code you could share that demonstrates this? I'm wondering if your implementation choices could be suboptimal as far as compile times are concerned. But it's possible that CTFE is just that slow. :-/ (So much for that fast-fast-fast slogan. It's been making me cringe ever since it was introduced, and nobody else seems to think it's a problem. *shrug*)
 I was using my own program to do what I am doing in CTFE now. As a
 result I was able to optimize it myself.
[...] Hold on, if you were using an external program to do what you're doing in CTFE now, that would explain the speed difference. CTFE is currently an interpreter, and not a very good one at that (in terms of speed). In my own D projects, I have no qualms about writing utility programs that generate D code that then gets compiled as part of the target executable. In fact, some of the generated code I have involves pretty heavy-duty data processing from external data files. Running it in CTFE would be several orders of magnitude slower than just writing a code generator utility. If compile times are important to you, I'd recommend taking this approach instead. T -- If lightning were to ever strike an orchestra, it'd always hit the conductor first.
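A minimal sketch of the code-generator approach, under made-up assumptions: the function name `generateTableModule`, the module name `generated_table` and the table of squares are all placeholders for whatever heavy data processing the real utility would do.

```d
import std.array : appender;
import std.conv : to;

/// Returns D source text for a module containing a precomputed table.
/// Module name and table contents are placeholders for illustration.
string generateTableModule(size_t count)
{
    auto src = appender!string();
    src.put("module generated_table;\n\n");
    src.put("immutable int[] squares = [\n");
    foreach (i; 0 .. count)
    {
        src.put("    ");
        src.put(to!string(i * i));
        src.put(",\n");
    }
    src.put("];\n");
    return src.data;
}

unittest
{
    assert(generateTableModule(3) ==
        "module generated_table;\n\n"
        ~ "immutable int[] squares = [\n"
        ~ "    0,\n    1,\n    4,\n"
        ~ "];\n");
}
```

A build step would run the utility, write the result out (for instance with `std.file.write("generated_table.d", generateTableModule(256))`), and then compile the generated file together with the rest of the project, so the computation runs as ordinary native code rather than in the CTFE interpreter.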
Aug 16
prev sibling parent matheus <matheus gmail.com> writes:
On Thursday, 15 August 2019 at 19:30:44 UTC, H. S. Teoh wrote:
 [...]
 Now imagine waiting ~40 seconds just to open any solution 
 on Visual Studio (Mostly used for projects where I work), 
 on a NOT so old machine, and like 4 ~ 10 seconds every 
 time you run an app while debugging.
 
 That is the meaning of pain.
And that is why I don't bother with IDEs, or anything, really, that has needless eye-candy and frills I don't use. Give me vim over an SSH remote connection, and I can be 100% productive anywhere. A GUI that requires umpteen GBs of RAM and 40s to start? Not even on my radar. ...
I don't like it either. I just use it where I work because it's pretty much the norm there. For my own projects I use a plain text editor. Well, I still have a version of Visual C++ 6.0 (98?) installed on an old Windows machine, which I sometimes use when I write C code because of the great debugger, and since it's old software, it's blazing fast even on that old machine. In fact it's literally faster than a blink, even with big projects like game engines. Matheus.
Aug 15
prev sibling next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 15 August 2019 at 18:53:53 UTC, Exil wrote:
 Don't know any compiler that's that fast, definitely not D and 
 not even Jai.
You ever use D1? I still have some of my old games written in it. I went back to it briefly not long ago. I thought the compiler was broken because there was no visible delay at all - it was done faster than my prompt would reappear. D used to be really fast. used to be :(
Aug 15
next sibling parent matheus <matheus gmail.com> writes:
On Thursday, 15 August 2019 at 19:50:51 UTC, Adam D. Ruppe wrote:
 On Thursday, 15 August 2019 at 18:53:53 UTC, Exil wrote:
 Don't know any compiler that's that fast, definitely not D and 
 not even Jai.
You ever use D1? I still have some of my old games written in it. I went back to it briefly not long ago. I thought the compiler was broken because there was no visible delay at all - it was done faster than my prompt would reappear. D used to be really fast. used to be :(
Could you please share what happened since D1 or the main cause? Matheus.
Aug 15
prev sibling parent bachmeier <no spam.net> writes:
On Thursday, 15 August 2019 at 19:50:51 UTC, Adam D. Ruppe wrote:
 On Thursday, 15 August 2019 at 18:53:53 UTC, Exil wrote:
 Don't know any compiler that's that fast, definitely not D and 
 not even Jai.
You ever use D1? I still have some of my old games written in it. I went back to it briefly not long ago. I thought the compiler was broken because there was no visible delay at all - it was done faster than my prompt would reappear. D used to be really fast. used to be :(
It still is, if you're able to avoid templates. I get the impression that templates are very popular with D programmers.
Aug 15
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 15 August 2019 at 18:53:53 UTC, Exil wrote:
 On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 On Thursday, 15 August 2019 at 00:45:06 UTC, matheus wrote:
 On Friday, 9 August 2019 at 13:17:02 UTC, Atila Neves wrote:
 From experience, it makes me work much slower if I don't get 
 results in less than 100ms. If I'm not mistaken, IBM did a 
 study on this that I read once but never managed to find 
 again about how much faster people worked on short feedback 
 cycles.
This is a bit of an exaggeration, right?
No, no exaggeration.
Don't know any compiler that's that fast, definitely not D and not even Jai.
dmd compiles "hello world" in 50ms on my system. Then I tried it on a file with 1000 functions that add two integers. 50ms again. The problem, of course, is that any real D code I write is mostly templates and CTFE. Then there's the unittest/phobos issue with templates.
 The difference is noticeable but really not to that point. What 
 do you do when you have to wait 30 mins?
Be 1000 times less productive.
 I guess some people are just less trigger happy and patient 
 than others.
I guess so.
 Wouldn't compiler errors do the same thing, if not worse? Not 
 only do they interrupt your train of thought they require you 
 to think about something else entirely. What do you do when you 
 get a compiler error?
Not sure. I think it's a combination of not getting them that often and getting them early in the editor before I've actually finished typing that means they don't bother me nearly as much.
 Feel like you don't have to wait. You can continue to do other 
 things while it is compiling. I suppose some people aren't as 
 good at multi tasking.
Nobody is good at multi tasking (there are studies). A lot of people are really good at believing they are, though. CPUs have nothing on the brain when it comes to context switching.
Aug 16
next sibling parent Dominikus Dittes Scherkl <dominikus.scherkl continental-corporation.com> writes:
On Friday, 16 August 2019 at 09:20:58 UTC, Atila Neves wrote:

 Nobody is good at multi tasking (there are studies). A lot of 
 people are really good at believing they are, though.
So true.
 CPUs have nothing on the brain when it comes to context 
 switching.
No, even they do nowadays. This is called cache, which tends to be garbled by frequent context switches (at least if more threads are running than available cores, which indeed is usually the case - but hopefully most of them take only a very small time slice)
Aug 16
prev sibling parent SashaGreat <s g.com> writes:
On Friday, 16 August 2019 at 09:20:58 UTC, Atila Neves wrote:
 Nobody is good at multi tasking (there are studies). A lot of 
 people are really good at believing they are, though. CPUs have 
 nothing on the brain when it comes to context switching.
I think this is tricky! I know people who can sing and play guitar at the same time, while there are others who just can't. You can search "How can I sing and p..." and the rest will be autocompleted with this problem. People who can do this can usually time the song and do these 2 actions nicely. I can't do these 2 things, but on the other hand while at the gym I can think and solve problems better while doing some push ups, I think this is called Synapse or something like that. Sasha.
Aug 16
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2019-08-15 20:53, Exil wrote:

 Don't know any compiler that's that fast, definitely not D and not even 
 Jai.
It highly depends on what kind of code you write. A simple application using DWT [1] takes 7.5 seconds to compile on my machine, using Docker and inside a virtual machine. DWT is around 400k lines of code and comments. But DWT is a port of the Java library SWT, so it contains very few templates and other compile time features. [1] https://github.com/d-widget-toolkit/dwt -- /Jacob Carlborg
Aug 18
prev sibling parent reply Kagamin <spam here.lot> writes:
On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 I can take waiting 40s once a day. I can't take waiting >1s 
 every time I build though.
I just spent two weeks doing a biggish refactoring, I didn't save files until the end. If I know it's not ready and won't work, and know what's left to do, why bother, it's just a waste of entropy.
Aug 16
next sibling parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 16 August 2019 at 13:46:01 UTC, Kagamin wrote:
 On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 I can take waiting 40s once a day. I can't take waiting >1s 
 every time I build though.
I just spent two weeks doing a biggish refactoring, I didn't save files until the end. If I know it's not ready and won't work, and know what's left to do, why bother, it's just a waste of entropy.
There are people who work the way you do, and for them fast iteration cycles aren't important. I'm not one of them.
Aug 16
prev sibling parent matheus <matheus gmail.com> writes:
On Friday, 16 August 2019 at 13:46:01 UTC, Kagamin wrote:
 On Thursday, 15 August 2019 at 15:00:43 UTC, Atila Neves wrote:
 I can take waiting 40s once a day. I can't take waiting >1s 
 every time I build though.
I just spent two weeks doing a biggish refactoring, I didn't save files until the end. If I know it's not ready and won't work, and know what's left to do, why bother, it's just a waste of entropy.
I was watching Jonathan Blow refactoring some code yesterday (https://www.youtube.com/watch?v=ndRrSGttlus), the process he used involves compiling through changes. It takes some time doing it this way, and of course running test cases together. But in the end it's safer and I usually do this too. Matheus.
Aug 16
prev sibling parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Friday, 9 August 2019 at 09:22:12 UTC, Dukc wrote:
 So I'm interested, are virtually-instant compiles something one 
 has to experience to understand their benefit?
Here is recent article on that topic: http://jsomers.net/blog/speed-matters
Aug 10
parent Dukc <ajieskola gmail.com> writes:
On Saturday, 10 August 2019 at 10:31:41 UTC, MrSmith wrote:
 On Friday, 9 August 2019 at 09:22:12 UTC, Dukc wrote:
 So I'm interested, are virtually-instant compiles something 
 one has to experience to understand their benefit?
Here is recent article on that topic: http://jsomers.net/blog/speed-matters
Thanks! That one needs some deep thought.
Aug 10
prev sibling next sibling parent Gregor =?UTF-8?B?TcO8Y2ts?= <gregormueckl gmx.de> writes:
On Friday, 9 August 2019 at 08:37:25 UTC, Atila Neves wrote:
 C++, to preserve support for platforms that don't have a 
 hierarchical filesystem (I have no idea what they are or why 
 anyone cares about them), decided to punt on how to find 
 modules and leaves it up to the implementation.
This was done to support mainframe operating systems, I think. As far as I can tell, z/OS (formerly OS/360) doesn't have a hierarchical file system in the classic sense. Objects there have a name with different segments/fields, but at least a few of them have semantics that the operating system cares about. This is also one of the reasons why the committee didn't standardize #pragma once in favor of a #once [unique id].
 The other difference is that despite having modules, C++20 
 still has the equivalent of module headers and module 
 implementations.

 I read a whole blog post explaining C++ modules once. I shook 
 my head throughout and can't believe what they've gone with. 
 I've also forgotten most of it, probably due to my brain 
 protecting itself lest I go mad.
Reading up on C++ modules leaves the impression that the ball was dropped so hard that it left a massive crater on impact. I have tried to read up on them several times and I still haven't figured out how you are supposed to use them.
Aug 09
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Fri, 2019-08-09 at 08:37 +0000, Atila Neves via Digitalmars-d wrote:
[…]
 I don't think it is. Fast is relative, and it's death by a
 thousand cuts for me at the moment. And this despite the fact
 that I use reggae as much as I can, which means I wait less than
 most other D programmers on average to get a compiled binary!
Is there any chance of getting Reggae into D-Apt or better still the standard Debian Sid repository along with ldc2, GtkD, and GStreamerD, so it can be a standard install for anyone using Debian or Ubuntu and so get some real traction in the D build market? -- Russel. =========================================== Dr Russel Winder t: +44 20 7585 2200 41 Buckmaster Road m: +44 7770 465 077 London SW11 1EN, UK w: www.russel.org.uk
Aug 09
parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 9 August 2019 at 16:45:05 UTC, Russel Winder wrote:
 On Fri, 2019-08-09 at 08:37 +0000, Atila Neves via 
 Digitalmars-d wrote: […]
 I don't think it is. Fast is relative, and it's death by a 
 thousand cuts for me at the moment. And this despite the fact 
 that I use reggae as much as I can, which means I wait less 
 than most other D programmers on average to get a compiled 
 binary!
Is there any chance of getting Reggae into D-Apt or better still the standard Debian Sid repository along with ldc2, GtkD, and GStreamerD, so it can be a standard install for anyone using Debian or Ubuntu and so get some real traction in the D build market?
That's a good question. The thing is I basically want to rewrite reggae from scratch, and, worse than that, am trying to figure out how to best leverage the work done in this paper: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=2ahUKEwjmtM7hhf3jAhXr0qYKHZRUCoYQFjAAegQIABAC&url=https%3A%2F%2Fwww.microsoft.com%2Fen-us%2Fresearch%2Fuploads%2Fprod%2F2018%2F03%2Fbuild-systems-final.pdf&usg=AOvVaw2j71hXjOoEQLNNjvEOp_RQ (Build Systems à la carte by Microsoft Research, in which they show how to compose different types of build systems in Haskell) It's clear to me that the current way of building software is broken in the sense that we almost always do more work than needed. My vision for the future is a build system so smart that it only rebuilds mod1.d if mod0.d was modified in such a way that it actually needs to. For instance, if the signatures of any functions imported are changed. I think that paper is a step in the right direction by abstracting away how changes are computed.
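A toy sketch of the "only rebuild if the interface changed" idea, under loud assumptions: the interface text is taken to be something like a `.di` file generated with `dmd -H`, so that edits to function bodies do not show up in it, and the names `interfaceFingerprint` and `dependentsNeedRebuild` are invented for this example. It is not how reggae or any existing build system works.

```d
/// Fingerprint of a module's interface text. The text is assumed to
/// contain declarations only (e.g. a generated .di file).
size_t interfaceFingerprint(string interfaceText)
{
    // hashOf is the built-in hash from object.d; collisions are
    // possible in principle, so a real tool would want a stronger digest.
    return hashOf(interfaceText);
}

/// Modules importing mod0 need rebuilding only if mod0's interface
/// fingerprint differs from the one recorded at the last build.
bool dependentsNeedRebuild(size_t recorded, string currentInterface)
{
    return interfaceFingerprint(currentInterface) != recorded;
}

unittest
{
    immutable before  = "module mod0;\nint add(int a, int b);\n";
    immutable same    = "module mod0;\nint add(int a, int b);\n";
    immutable changed = "module mod0;\nlong add(long a, long b);\n";

    immutable recorded = interfaceFingerprint(before);
    assert(!dependentsNeedRebuild(recorded, same));
    assert(dependentsNeedRebuild(recorded, changed));
}
```

The interesting (and hard) part is deciding what counts as "interface": templates and anything evaluated at compile time drag their bodies into it, which this sketch ignores entirely.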
Aug 12
next sibling parent Russel Winder <russel winder.org.uk> writes:
On Mon, 2019-08-12 at 09:47 +0000, Atila Neves via Digitalmars-d wrote:
 
[…]
 That's a good question. The thing is I basically want to rewrite
 reggae from scratch, and, worse than that, am trying to figure
 out how to best leverage the work done in this paper:
 
 https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=2ahUKEwjmtM7hhf3jAhXr0qYKHZRUCoYQFjAAegQIABAC&url=https%3A%2F%2Fwww.microsoft.com%2Fen-us%2Fresearch%2Fuploads%2Fprod%2F2018%2F03%2Fbuild-systems-final.pdf&usg=AOvVaw2j71hXjOoEQLNNjvEOp_RQ
 
 (Build Systems à la carte by Microsoft Research, in which they
 show how to compose different types of build systems in Haskell)
 
 It's clear to me that the current way of building software is
 broken in the sense that we almost always do more work than
 needed. My vision for the future is a build system so smart that
 it only rebuilds mod1.d if mod0.d was modified in such a way that
 it actually needs to. For instance, if the signatures of any
 functions imported are changed. I think that paper is a step in
 the right direction by abstracting away how changes are computed.
Anything Simon is involved in is always worth looking at. The focus of the document is though Microsoft and Haskell, so lots of good stuff, but potentially missing lots of other good stuff. Although slightly different in some ways, Gradle has had (and continues to have) a huge amount of work in it to try and minimize dependencies being seen as causing a rebuild. Gradle is principally a JVM-oriented system, but a very big client funded Gradle working with C++. SCons (and Waf) have done quite a lot of work on build minimization (especially Parts which is an addition over SCons), I am not sure how much of this got into Meson – I guess that partly depends on what Ninja does. I have not used Tup, but it should have a role in any review given its claims of minimising work. The core question is though given Dub and Meson, can Reggae gain real traction in the D build arena possibly replacing Dub as the default D project build controller? Is a rewrite of Dub more cost effective than a rewrite of Reggae? -- Russel. =========================================== Dr Russel Winder t: +44 20 7585 2200 41 Buckmaster Road m: +44 7770 465 077 London SW11 1EN, UK w: www.russel.org.uk
Aug 12
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2019-08-12 11:47, Atila Neves wrote:

 That's a good question. The thing is I basically want to rewrite reggae 
 from scratch, and, worse than that, am trying to figure out how to best 
 leverage the work done in this paper:
 
 https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=2ahUKEwjmtM7hhf3jAhXr0qYKHZRUCoYQFjAAegQIABAC&url=https%3A%2F%2Fwww.microsoft.com%2Fen-us%2Fresearch%2Fuploads%2Fprod%2F2018%2F03%2Fbuild-systems-final.pdf&usg=AOvVaw2j71hXjOoEQLNNjvEOp_RQ
 
 
 (Build Systems à la carte by Microsoft Research, in which they show how 
 to compose different types of build systems in Haskell)
 
 It's clear to me that the current way of building software is broken in 
 the sense that we almost always do more work than needed. My vision for 
 the future is a build system so smart that it only rebuilds mod1.d if 
 mod0.d was modified in such a way that it actually needs to. For 
 instance, if the signatures of any functions imported are changed. I 
 think that paper is a step in the right direction by abstracting away 
 how changes are computed.
I suggest you look into incremental compilation, if you haven't done that already. I'm not talking about recompiling a whole file and relinking. I'm talking about incremental lexing, parsing, semantic analysis and code generation. That is, recompile only those characters that have changed in a source file and what depends on them. For example: "void foo()". If "foo" is changed to "bar" then the compiler only needs to lex those three characters: "bar". Then run the rest of the compiler only on AST nodes that are dependent on the "bar" token. The Eclipse Java compiler (JDT) has a pretty interesting concept. It allows you to compile and run invalid code. I'm guessing a bit here, but I assume that if a function has a valid signature and the body is syntactically valid but contains semantic errors, the compiler will replace the body of the function with a runtime error. If that function is never called at runtime there is no problem. Similar to how templates work in D. -- /Jacob Carlborg
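To illustrate the comparison with D templates: the body of a template only gets semantic analysis when it is instantiated, so a semantically broken template is harmless as long as nobody uses it. The function names below are made up for the example. Note the difference from the JDT behaviour described above: in D the error shows up at compile time on instantiation, not at run time on the first call.

```d
/// Never instantiated, so its body is only parsed. The call to a
/// nonexistent function goes undiagnosed.
void broken(T)(T value)
{
    thisFunctionDoesNotExist(value);
}

/// Instantiated in the unittest below, so it gets full semantic analysis.
T twice(T)(T value)
{
    return value + value;
}

unittest
{
    assert(twice(21) == 42);
    // Instantiating `broken` fails to compile.
    static assert(!__traits(compiles, broken(1)));
}
```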
Aug 13
next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 13 August 2019 at 09:13:50 UTC, Jacob Carlborg wrote:
 On 2019-08-12 11:47, Atila Neves wrote:

 [...]
I suggest you look into incremental compilation, if you haven't done that already. I'm not talking about recompiling a whole file and relink. I'm talking incremental lexing, parsing, semantic analyzing and code generation. That is, recompile only those characters that have changed in a source file and what depends on it.
That's basically what I want.
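As a very rough illustration of the first step of that idea, here is a sketch that finds which part of a source text changed between two versions, by trimming the common prefix and suffix. `Span` and `changedSpan` are hypothetical names, the comparison is per code unit (so it assumes ASCII-ish source), and a real incremental lexer would still have to widen the span to token boundaries and track what depends on it.

```d
import std.algorithm.comparison : min;

/// Half-open range [start, end) of the changed region in the new text.
struct Span
{
    size_t start;
    size_t end;
}

/// Finds the region of `newText` that differs from `oldText` by
/// skipping the common prefix and the common suffix.
Span changedSpan(string oldText, string newText)
{
    immutable limit = min(oldText.length, newText.length);

    size_t prefix = 0;
    while (prefix < limit && oldText[prefix] == newText[prefix])
        ++prefix;

    // The suffix must not overlap the prefix already matched.
    size_t suffix = 0;
    while (suffix < limit - prefix
        && oldText[$ - 1 - suffix] == newText[$ - 1 - suffix])
        ++suffix;

    return Span(prefix, newText.length - suffix);
}

unittest
{
    immutable span = changedSpan("void foo()", "void bar()");
    assert(span == Span(5, 8));
    assert("void bar()"[span.start .. span.end] == "bar");
}
```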
Aug 13
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 06:06:48PM +0000, Atila Neves via Digitalmars-d wrote:
 On Tuesday, 13 August 2019 at 09:13:50 UTC, Jacob Carlborg wrote:
[...]
 I suggest you look into incremental compilation, if you haven't done
 that already. I'm not talking about recompiling a whole file and
 relink.  I'm talking incremental lexing, parsing, semantic analyzing
 and code generation. That is, recompile only those characters that
 have changed in a source file and what depends on it.
That's basically what I want.
I suppose in C++, with its overcomplex lexing/parsing, this could potentially be significant savings. But based on what Walter has said about storing binary forms of the source code (which is basically what you'll end up doing if you really want to implement incremental compilation in the above sense), it's not much more efficient than (re)lexing and (re)parsing the entire file. Plus, the way dmd currently works, the AST is mutated by the various semantic passes, so you can't really do something like caching the AST and rewriting subtrees of it based on what changed in the source file, either. And even if you could, I'm not sure it will be much more efficient than just recompiling the entire file. (Assuming reasonable file sizes, that is -- obviously, if you have a 100,000-line source file, then you might want to look into refactoring that into smaller chunks, for many other reasons.) Templates & CTFE, though, are well-known and acknowledged performance hogs. If anything, I'd focus on improving templates and CTFE instead of worrying too much about incremental compilation in the sense Jacob describes. (Speaking of which, when is newCTFE landing in master? :-/) T -- Once upon a time there lived a king, and with him there lived a flea.
Aug 13
parent reply Jacob Carlborg <doob me.com> writes:
On 2019-08-13 20:33, H. S. Teoh wrote:

 I suppose in C++, with its overcomplex lexing/parsing, this could
 potentially be significant savings.  But based on what Walter has said
 about storing binary forms of the source code (which is basically what
 you'll end up doing if you really want to implement incremental
 compilation in the above sense), it's not much more efficient than
 (re)lexing and (re)parsing the entire file.
I'm guessing the compiler would run as a daemon in the background storing everything in memory.
 Plus, the way dmd currently works, the AST is mutated by the various
 semantic passes, so you can't really do something like caching the AST
 and rewriting subtrees of it based on what changed in the source file,
 either.
Yeah, that's quite annoying, for other reasons as well. -- /Jacob Carlborg
Aug 14
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Aug 14, 2019 at 12:32:11PM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 On 2019-08-13 20:33, H. S. Teoh wrote:
 
 I suppose in C++, with its overcomplex lexing/parsing, this could
 potentially be significant savings.  But based on what Walter has
 said about storing binary forms of the source code (which is
 basically what you'll end up doing if you really want to implement
 incremental compilation in the above sense), it's not much more
 efficient than (re)lexing and (re)parsing the entire file.
I'm guessing the compiler would run as a daemon in the background storing everything in memory.
[...] In theory, I suppose it could work. But again, it comes back to this point of whether it's worth the trouble to maintain the necessary data structures that allow you to reliably map sections of a modified source file to the AST. Some parts of the AST aren't 100% context-free. Think, for example, of static ifs or static foreach that generate parts of the AST based on some CTFE computation taking as input some enums defined elsewhere in the file, or in another module. You'll need some pretty complex data structures to keep track of which part(s) of the AST depends on which other part(s), and you'll need some clever algorithms to update everything consistently when you detect that one of the parts of the source file corresponding to these AST subtrees is modified. I suspect the complexity required to maintain consistency and update the AST will be not much better than the cost of a straightforward re-parse of the entire source file and rebuild of the AST from scratch. It probably only starts to be superior when your source file becomes unreasonably large, in which case you should be splitting it up into more manageable modules anyway. But then again, this is all in the hypothetical. Somebody should run a proof-of-concept experiment and profile it to compare the actual performance characteristics of such a scheme. T -- If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...
Aug 14
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2019 2:13 AM, Jacob Carlborg wrote:
 I suggest you look into incremental compilation, if you haven't done that 
 already.
Back in the 90's, when dinosaurs ruled the Earth, people demanded to know why our linker (Optlink) didn't do incremental linking, as Microsoft had just released incremental linking support and did a good job marketing it.

The answer, of course, was that Optlink would do a full link faster than MS-Link would do an incremental link.

The other (perennial) problem with incremental work was amply illustrated by MS-Link - it would produce erratically broken executables because of mistakes in the dependency management. Most people just kinda gave up and did full links just because they could get reliable builds that way.

Not correctly handling every single dependency means you'll get unrepeatable builds, which is a disaster with dev tools.

The solution I've always focused on was doing the full builds faster. Although it is not implemented, the design of D is such that lexing, parsing, semantic analysis, and code generation can be done concurrently. Running the lex/parse in parallel across all the imports can be a nice win.
Aug 14
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/8/2019 3:10 AM, Russel Winder wrote:
 So if C++ has this dependency problem why doesn't D, Go, or Rust?
Great question! The answer is it does, it's just that you don't notice it.

The trick, if you can call it that, is the same as how D can compile:

    int x = y;
    enum y = 3;

In other words, D can handle forward references, and even many cases of circular references. (C++, weirdly, can handle forward references in struct definitions, but nowhere else.)

D accomplishes this with 3 techniques:

1. Parsing is completely separate from semantic analysis. I.e. all code can be lexed/parsed in parallel, or in any order, without concern for dependencies.

2. Semantic analysis is lazy, i.e. it is done symbol-by-symbol on demand. In the above example, when y is encountered, the compiler goes "y is an enum, I'd better suspend the semantic analysis of x and go do the semantic analysis for y now".

3. D is able to partially semantically analyze things. This comes into play when two structs mutually refer to each other. It does this well enough that only rarely do "circular reference" errors come up that possibly could be resolved.

D processes imports by reading the file and doing a parse on them. Only "on demand" does it do semantic analysis on them. My original plan was to parse them and write a binary file, and then the import would simply and fastly load the binary file. But it turned out the code to load the binary file and turn it into an AST wasn't much better than simply parsing the source code, so I abandoned the binary file approach.

Another way D deals with the issue is you can manually prepare a "header" file for a module, a .di file. This makes a lot of sense for modules full of code that's irrelevant to the user, like the gc implementation.

-------------------

Some languages deal with this issue by disallowing circular imports entirely. Then the dependency graph is a simple directed acyclic graph. This method does have its attractions, as it forces the programmer to carefully decompose the project into properly encapsulated units. On the other hand, it makes it very difficult to interface smoothly with C and C++ files, which typically each just #include all the headers in the project.
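
As an aside on the parenthetical about C++ handling forward references inside struct definitions, here is a minimal sketch of what that looks like. The struct and member names are made up for illustration; the point is only that a member function body may use members declared further down, while the same ordering at namespace scope is rejected.

```cpp
#include <string>

// Hypothetical names, for illustration only.
struct Account {
    // describe() uses label() and balance, both declared *below* it.
    // This compiles because member function bodies are treated as if
    // they appeared after the closing brace of the class.
    std::string describe() const {
        return label() + ":" + std::to_string(balance);
    }

    std::string label() const { return "acct"; }
    int balance = 42;
};

// At namespace scope the same ordering does not work. Uncommenting the
// two lines below fails to compile, because `later` is used before it
// has been declared:
// int early = later;
// int later = 3;
```

In D the namespace-scope version is accepted too, which is the difference the post is pointing at.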
Aug 12
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Aug 12, 2019 at 12:33:07PM -0700, Walter Bright via Digitalmars-d wrote:
[...]
 1. Parsing is completely separate from semantic analysis. I.e. all
 code can be lexed/parsed in parallel, or in any order, without concern
 for dependencies.
This is a big part of why C++'s must-be-parsed-before-it-can-be-lexed syntax is a big hindrance to meaningful progress. The only way such a needlessly over-complex syntax can be handled is a needlessly over-complex lexer/parser combo, which necessarily results in needlessly over-complex corner cases and other such gotchas. Part of this nastiness is the poor choice of template syntax (overloading '<' and '>' to be delimiters in addition to their original roles of comparison operators), among several other things.
 2. Semantic analysis is lazy, i.e. it is done symbol-by-symbol on
 demand. In the above example, when y is encountered, the compiler goes
 "y is an enum, I'd better suspend the semantic analysis of x and go do
 the semantic analysis for y now".
This is an extremely powerful approach, and many may not be aware that it's a powerful cornerstone on which D's meta-programming capabilities are built. It's a beautiful example of the principle of "laziness": don't do the work until it's actually necessary. Something too many applications of today fail to observe, to their own detriment.
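
To make the "suspend x, go analyze y" idea concrete, here is a rough sketch in C++. It is not how DMD is implemented; LazyScope, declare and value_of are hypothetical names invented for this example. Each symbol stores an unevaluated initializer, and analysis only happens when somebody asks for the symbol's value.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy model of on-demand semantic analysis (all names are hypothetical).
class LazyScope {
public:
    using Initializer = std::function<int(LazyScope&)>;

    // "Parsing": record the declaration without looking at what it needs.
    void declare(const std::string& name, Initializer init) {
        symbols_[name] = Symbol{std::move(init), 0, false, false};
    }

    // "Semantic analysis": done per symbol, only when first requested.
    int value_of(const std::string& name) {
        auto it = symbols_.find(name);
        if (it == symbols_.end())
            throw std::runtime_error("undefined symbol: " + name);
        Symbol& sym = it->second;
        if (sym.done) return sym.value;            // analyzed earlier
        if (sym.in_progress)                        // we came back to ourselves
            throw std::runtime_error("circular reference: " + name);
        sym.in_progress = true;
        int result = sym.init(*this);               // may analyze other symbols
        sym.in_progress = false;
        sym.value = result;
        sym.done = true;
        return result;
    }

private:
    struct Symbol {
        Initializer init;
        int value;
        bool done;
        bool in_progress;
    };
    std::map<std::string, Symbol> symbols_;
};

// Mirrors `int x = y; enum y = 3;`: x is declared first but needs y.
int analyze_example() {
    LazyScope scope;
    scope.declare("x", [](LazyScope& s) { return s.value_of("y"); });
    scope.declare("y", [](LazyScope&) { return 3; });
    return scope.value_of("x");
}
```

The in_progress flag is the sketch's stand-in for detecting a genuine circular reference; a real compiler can often resolve such cases partially instead of giving up.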
 3. D is able to partially semantically analyze things. This comes into
 play when two structs mutually refer to each other. It does this well
 enough that only rarely do "circular reference" errors come up that
 possibly could be resolved.
I wasn't aware of this before, but it makes sense, in retrospect.
 D processes imports by reading the file and doing a parse on them.
 Only "on demand" does it do semantic analysis on them. My original
 plan was to parse them and write a binary file, and then the import
 would simply and fastly load the binary file. But it turned out the
 code to load the binary file and turn it into an AST wasn't much
 better than simply parsing the source code, so I abandoned the binary
 file approach.
[...]

That's an interesting data point. I've been toying with the same idea over the years, but it seems that's a dead-end approach. In any case, from what I've gathered parsing and lexing are nowhere near the bottleneck as far as D compilation is concerned (though it might be different for a language like C++, but even there I doubt it would play much of a role in the overall compilation performance, there being far more complex problems in semantic analyses and codegen that require algorithms with non-trivial running times).

There are bigger fish to fry elsewhere in the compiler. (Like *cough*memory usage*ahem*, that to this day makes D a laughing stock on low-memory systems. Even with -lowmem the situation today isn't much better than a year or two ago. I find my hands tied w.r.t. D as far as low-memory systems are concerned, and that's a very sad thing, since I'd have liked to replace many things with D. Currently I can't, because either dmd outright won't run and I have to build executables offline and upload them, or else I have to build the dmd toolchain offline and upload it to the low-memory target system. Both choices suck.)

T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL
Aug 12
parent reply Jacob Carlborg <doob me.com> writes:
On 2019-08-12 21:58, H. S. Teoh wrote:

 This is a big part of why C++'s must-be-parsed-before-it-can-be-lexed
 syntax is a big hindrance to meaningful progress.  The only way such a
 needlessly over-complex syntax can be handled is a needlessly
 over-complex lexer/parser combo, which necessarily results in needlessly
 over-complex corner cases and other such gotchas.  Part of this
 nastiness is the poor choice of template syntax (overloading '<' and '>'
 to be delimiters in addition to their original roles of comparison
 operators), among several other things.
I don't know how this is implemented in a C++ compiler but can't the lexer use a more abstract token that includes both the usage for templates and for comparison operators? The parser can then figure out exactly what it is.

DMD is doing something similar, but at a later stage. For example, in the following code snippet: "int a = foo;", "foo" is parsed as an identifier expression. Then the semantic analyzer figures out if "foo" is a function call or a variable.

-- 
/Jacob Carlborg
Aug 13
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 11:19:16AM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 On 2019-08-12 21:58, H. S. Teoh wrote:
 
 This is a big part of why C++'s
 must-be-parsed-before-it-can-be-lexed syntax is a big hindrance to
 meaningful progress.  The only way such a needlessly over-complex
 syntax can be handled is a needlessly over-complex lexer/parser
 combo, which necessarily results in needlessly over-complex corner
 cases and other such gotchas.  Part of this nastiness is the poor
 choice of template syntax (overloading '<' and '>' to be delimiters
 in addition to their original roles of comparison operators), among
 several other things.
I don't know how this is implemented in a C++ compiler but can't the lexer use a more abstract token that includes both the usage for templates and for comparison operators? The parser can then figure out exactly what it is.
It's not so simple. The problem is that in C++, the *structure* of the parse tree changes depending on previous declarations. I.e., the lexical structure is not context-free. For example, given this C++ code:

    int main() {
        A a;
        B b;

        // What do these lines do?
        fun<A, B>(a, b);
        gun<T, U>(a, b);
    }

What do you think the parse tree should be? On the surface, it would appear that main() contains two variable declarations, followed by calling two template functions with (a, b) as the arguments.

Unfortunately, this is not true. The way the last two lines of main() are parsed can be *wildly divergent* depending on what declarations came before. To see how this can be so, here's the full code (which I posted a while back in a discussion on template syntax):

-----------------------------------snip------------------------------------
// Totally evil example of why C++ template syntax and free-for-all operator
// overloading is a Bad, Bad Idea.
#include <iostream>

struct Bad { };

struct B { };

struct A {
    Bad operator,(B b) { return Bad(); }
};

struct D { };

struct Ugly {
    D operator>(Bad b) { return D(); }
} U;

struct Terrible { } T;

struct Evil {
    ~Evil() {
        std::cout << "Hard drive reformatted." << std::endl;
    }
};

struct Nasty {
    Evil operator,(D d) { return Evil(); }
};

struct Idea {
    void operator()(A a, B b) {
        std::cout << "Good idea, data saved." << std::endl;
    }
    Nasty operator<(Terrible t) { return Nasty(); }
} gun;

template<typename T, typename U>
void fun(A a, B b) {
    std::cout << "Have fun!" << std::endl;
}

int main() {
    A a;
    B b;

    // What do these lines do?
    fun<A, B>(a, b);
    gun<T, U>(a, b);
}
-----------------------------------snip------------------------------------

Note that `gun` is not a template, and not even a function. It's a global struct instance with a completely-abusive series of operator overloads.

While I admit that this example is contrived, it does prove my point that you simply cannot parse C++ code in any straightforward way. You have to use nasty hacks in both the lexer and the parser just to get the thing to parse at all, and this is not even touching the more pertinent topic of C++ semantic analysis, which in many places is even worse (SFINAE and Koenig Lookup, anyone? -- thanks to which, the meaning of your code can change simply by adding an #include line at the top of the file without touching anything else. Symbol hijacking galore!).
 DMD is doing something similar, but at a later stage. For example, in
 the following code snippet: "int a = foo;", "foo" is parsed as an
 identifier expression. Then the semantic analyzer figures out if "foo"
 is a function call or a variable.
[...]

There's no such thing as an 'identifier expression'. `foo` is parsed as an expression, period. The parse tree is pretty straightforward. Of course, there *is* a hack later that turns it into an implicit function call, but that's already long past the parsing stage. Unlike the C++ example above, the *structure* of the parse tree doesn't change, just the meaning of a leaf node. You don't end up with a completely unrelated parse tree structure just because of some strange declarations elsewhere in the source code.

T

-- 
Windows 95 was a joke, and Windows 98 was the punchline.
Aug 13
next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 13 August 2019 at 15:21:56 UTC, H. S. Teoh wrote:
 On Tue, Aug 13, 2019 at 11:19:16AM +0200, Jacob Carlborg via 
 Digitalmars-d wrote:
 [...]
It's not so simple. The problem is that in C++, the *structure* of the parse tree changes depending on previous declarations. I.e., the lexical structure is not context-free. For example, given this C++ code: [...]
I love the

    fon< fun< 1 >>::three >::two >::one

expression in C++ from Jens Gustedt's blog [1]

There the expression means something different in C++98 than in C++11

Let’s have a look how this expression is parsed

    fon< fun< 1 >>::three >::two >::one    // in C++98
              -----------
         -----------------------
    -----------------------------------

    fon< fun< 1 >>::three >::two >::one    // in C++11
         --------
    ---------------------
    ----------------------------
    -----------------------------------

[1]: https://gustedt.wordpress.com/2013/12/18/right-angle-brackets-shifting-semantics/#more-2083
Aug 13
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 06:51:10PM +0000, Patrick Schluter via Digitalmars-d
wrote:
 On Tuesday, 13 August 2019 at 15:21:56 UTC, H. S. Teoh wrote:
[...]
 It's not so simple.  The problem is that in C++, the *structure* of
 the parse tree changes depending on previous declarations. I.e., the
 lexical structure is not context-free.
[...]
 I love the
 
     fon< fun< 1 >>::three >::two >::one
 
 expression in C++ from Jens Gustedt's blog [1]
 
 There the expression means something different in C++98 than in C++11
 
 Let’s have a look how this expression is parsed
 
     fon< fun< 1 >>::three >::two >::one    // in C++98
               -----------
          -----------------------
     -----------------------------------
 
     fon< fun< 1 >>::three >::two >::one    // in C++11
          --------
     ---------------------
     ----------------------------
     -----------------------------------
 
 
 [1]: https://gustedt.wordpress.com/2013/12/18/right-angle-brackets-shifting-semantics/#more-2083
Yeah, it's things like this that convinced me that C++ is hopelessly and needlessly over-complex, and that it was time for me to find a better language. Like D. :-D

Using <> as delimiters for template arguments must have been one of the biggest blunders of C++, among many other things.

T

-- 
Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
Aug 13
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2019-08-13 17:21, H. S. Teoh wrote:
 On Tue, Aug 13, 2019 at 11:19:16AM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 DMD is doing something similar, but at a later stage. For example, in
 the following code snippet: "int a = foo;", "foo" is parsed as an
 identifier expression. Then the semantic analyzer figures out if "foo"
 is a function call or a variable.
[...] There's no such thing as an 'identifier expression'. `foo` is parsed as an expression, period.
I'm not sure if you're still talking about DMD here, but there's definitely an "identifier expression", it's right here [1]. It's created in the parser here [2]. Then during the semantic phase it's turned into something else.

[1] https://github.com/dlang/dmd/blob/b2522da8566783491648bc104a29b42dc2dc569e/src/dmd/expression.d#L2056
[2] https://github.com/dlang/dmd/blob/b2522da8566783491648bc104a29b42dc2dc569e/src/dmd/parse.d#L7633

-- 
/Jacob Carlborg
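
The two-stage idea can be sketched roughly as follows. This is only an illustration in C++ with made-up types (Kind, Expr, SymbolKind, parse_primary, resolve); DMD's real node classes, linked above, are richer. The parser produces the same leaf node for `foo` no matter what `foo` is, and a later pass rewrites that leaf once the symbol is known.

```cpp
#include <map>
#include <string>

// Hypothetical node kinds for this sketch only.
enum class Kind { Identifier, Variable, Call };

struct Expr {
    Kind kind;
    std::string name;
};

enum class SymbolKind { Variable, Function };

// Parsing: no symbol table is consulted, `foo` always becomes an
// Identifier leaf, so the tree shape never depends on declarations.
Expr parse_primary(const std::string& token) {
    return Expr{Kind::Identifier, token};
}

// Semantic analysis: the leaf is rewritten once the symbol is looked up.
// Only the meaning of the leaf changes, not the structure around it.
Expr resolve(const Expr& e, const std::map<std::string, SymbolKind>& symbols) {
    if (e.kind != Kind::Identifier) return e;
    auto it = symbols.find(e.name);
    if (it == symbols.end()) return e;  // a real compiler would report an error
    Kind resolved = (it->second == SymbolKind::Function) ? Kind::Call
                                                         : Kind::Variable;
    return Expr{resolved, e.name};
}
```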
Aug 14
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Aug 14, 2019 at 12:40:00PM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 On 2019-08-13 17:21, H. S. Teoh wrote:
 On Tue, Aug 13, 2019 at 11:19:16AM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 DMD is doing something similar, but at a later stage. For example,
 in the following code snippet: "int a = foo;", "foo" is parsed as
 an identifier expression. Then the semantic analyzer figures out
 if "foo" is a function call or a variable.
[...] There's no such thing as an 'identifier expression'. `foo` is parsed as an expression, period.
I'm not sure if you're still talking about DMD here, but there's definitely an "identifier expression", it's right here [1]. It's created in the parser here [2]. Then during the semantic phase it's turned into something else.
[...]

Whoa. OK, I stand corrected. Didn't know that was how DMD did it!

That's pretty ... weird. And explains some of the quirkiness around certain bits of D syntax, like some of the ugly corner cases of `alias`.

T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
Aug 14
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 08:21:56AM -0700, H. S. Teoh via Digitalmars-d wrote:
 On Tue, Aug 13, 2019 at 11:19:16AM +0200, Jacob Carlborg via Digitalmars-d
wrote:
[...]
 I don't know how this is implemented in a C++ compiler but can't the
 lexer use a more abstract token that includes both the usage for
 templates and for comparison operators? The parser can then figure
 out exactly what it is.
It's not so simple. The problem is that in C++, the *structure* of the parse tree changes depending on previous declarations. I.e., the lexical structure is not context-free.
[...]

Not to mention, in the more recent C++ revisions, it's not just the parse tree that changes, even the tokenization changes. I.e.:

    fun<gun<A, B>>(c, d);

can be tokenized as either:

    fun < gun < A , B >> ( c , d ) ;

(i.e., '>>' is the right-shift operator), or:

    fun < gun < A , B > > ( c , d ) ;

(i.e., '>>' is *two* closing template argument list delimiters).

There is simply no way you can write a straightforward, context-free lexer for C++. Such a thing simply doesn't exist, because C++ must be parsed before it can be lexed. The lexer has to somehow know when '>>' should be lexed as two tokens, or when it should be lexed as a single token. The only way it can know this is if the parser informs it what parse tree it's currently expecting. But that means the parser has to be running *before* the lexer has completely lexified the input.

Furthermore, how does the parser know when it's expecting a template argument list? From my previous example, you see that even when an input statement looks like a template function call, it may not actually be one. Which means *semantic analysis* has to have already begun (at least partially), enough to recognize certain identifiers as templates, with a feedback loop to the parser, which in turn has a feedback loop to the lexer so that it knows whether '>>' should be two tokens or one.

You can't get around this inherent complexity without becoming non-compliant with the C++ spec.

So you see, the seemingly insignificant choice of <> as template argument list delimiters has far-reaching consequences. In retrospect, it was a bad design decision. '<' and '>' should have been left alone as comparison operators only, not overloaded with a completely unrelated meaning that leads to all sorts of pathological ambiguities and needless parser complexity.

T

-- 
Why ask rhetorical questions? -- JC
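
A toy illustration of that feedback loop, assuming nothing about how real C++ compilers structure it: the lex function and its in_template_args flag below are invented for this sketch. The flag stands in for whatever the parser (and, behind it, semantic analysis) would have to tell the lexer before it can decide how to split '>>'.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer. `in_template_args` is a hypothetical hint that a real
// parser would have to supply; the lexer cannot work it out by itself.
std::vector<std::string> lex(const std::string& src, bool in_template_args) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalnum(c) || c == '_') {
            // Identifier or number: consume the whole run.
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else if (c == '>' && i + 1 < src.size() && src[i + 1] == '>' &&
                   !in_template_args) {
            // Outside a template argument list, ">>" is one shift token.
            tokens.push_back(">>");
            i += 2;
        } else {
            // Everything else, including each '>' inside template
            // arguments, becomes a single-character token.
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

Calling lex("fun<gun<A, B>>(c, d);", false) yields the first tokenization above and lex("fun<gun<A, B>>(c, d);", true) the second; the same characters give different token streams depending on information the lexer does not have on its own.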
Aug 13
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2019 9:32 AM, H. S. Teoh wrote:
 So you see, the seemingly insignificant choice of <> as template
 argument list delimiters has far-reaching consequences.  In retrospect,
 it was a bad design decision.
There were some who warned about this from the beginning, but they were overruled. There was a prevailing attitude that implementation complexity was not relevant, only the user experience.

Another parsing problem exists in C as well:

    A * B;

Is that a declaration of B or is it a multiply expression? It cannot be determined until the compiler knows what A is, which requires semantic analysis.

You might think D has the same problem, but I added a simple rule to D:

    "If it can be parsed as a declaration, it's a declaration."

Which means A*B; is always a declaration, even if A is later determined to be a variable, in which case the compiler gives an error.

The reason this works is because A*B; is an expression with no effect, and hence people don't write such code. You might then think "what if * is an overloaded operator with side effects?" There's another rule for that, and that is overloading of arithmetic operators should be to implement arithmetic, which shouldn't have side effects.

(If you absolutely must have operator overloading, you can write (A*B); but such is strongly discouraged.)

The fact that in 20 years pretty much nobody has noticed that D operates this way is testament to it being the correct decision and it "just works".
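
For the C/C++ side of this, a small sketch shows the same three tokens parsed both ways. It is written in C++ and the names (Num, multiplications, and the two functions) are invented for the example; it demonstrates the ambiguity being described, not D's rule.

```cpp
#include <type_traits>

// Hypothetical type whose operator* has a visible side effect.
struct Num { int v; };
static int multiplications = 0;
Num operator*(Num a, Num b) {
    ++multiplications;
    return Num{a.v * b.v};
}

bool parsed_as_declaration() {
    struct A {};          // here A names a type...
    A * B;                // ...so this line declares B as a pointer to A
    B = nullptr;
    return B == nullptr && std::is_same<decltype(B), A*>::value;
}

int parsed_as_expression() {
    Num A{6}, B{7};       // here A and B are objects
    int before = multiplications;
    A * B;                // same tokens, now a call to operator*
    return multiplications - before;
}
```

The compiler can only pick between the two readings after looking up what A is, which is exactly the dependency on semantic analysis that D's "declaration wins" rule avoids.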
Aug 14
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Aug 14, 2019 at 11:51:55AM -0700, Walter Bright via Digitalmars-d wrote:
 On 8/13/2019 9:32 AM, H. S. Teoh wrote:
 So you see, the seemingly insignificant choice of <> as template
 argument list delimiters has far-reaching consequences.  In
 retrospect, it was a bad design decision.
There were some who warned about this from the beginning, but they were overruled. There was a prevailing attitude that implementation complexity was not relevant, only the user experience.
Having grown up in that era, I can sympathize with that attitude, but I must say that given today's hindsight, citing user experience as the reason for choosing <> as delimiters is very ironic indeed. :-D
 Another parsing problem exists in C as well:
 
   A * B;
 
 Is that a declaration of B or is it a multiply expression? It cannot
 be determined until the compiler knows what A is, which requires
 semantic analysis.
 
 You might think D has the same problem, but I added a simple rule to
 D:
 
   "If it can be parsed as a declaration, it's a declaration."
 
 Which means A*B; is always a declaration, even if A is later
 determined to be a variable, in which case the compiler gives an
 error.
 
 The reason this works is because A*B; is an expression with no effect,
 and hence people don't write such code. You might then think "what if
 * is an overloaded operator with side effects?" There's another rule
 for that, and that is overloading of arithmetic operators should be to
 implement arithmetic, which shouldn't have side effects.
 
 (If you absolutely must have operator overloading, you can write
 (A*B); but such is strongly discouraged.)
 
 The fact that in 20 years pretty much nobody has noticed that D
 operates this way is testament to it being the correct decision and it
 "just works".
Wow. I never even noticed that, all this time! :-D

It also goes to strengthen the argument that operator overloading should not be abused the way it has been in C++. Choosing << and >> for I/O seemed like a clever thing to do at the time, but it led to all sorts of silliness like unexpected precedence and ambiguity with actual arithmetic shift operations, necessitating the proliferation of parentheses around I/O chains (which, arguably, defeats the aesthetics of "<<" and ">>" in the first place). And don't even mention that Boost monstrosity that uses operator overloading for compile-time regexen. Ick. The very thought makes me cringe.

T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL
Aug 14
parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/14/2019 12:08 PM, H. S. Teoh wrote:
 It also goes to strengthen the argument that operator overloading should
 not be abused the way it has been in C++.  Choosing << and >> for I/O
 seemed like a clever thing to do at the time, but it led to all sorts of
 silliness like unexpected precedence and ambiguity with actual
 arithmetic shift operations, necessitating the proliferation of
 parentheses around I/O chains (which, arguably, defeats the aesthetics
 of "<<" and ">>" in the first place).
Worse, you cannot do anything that requires persistent state, because for `A<<B` if A sets a state, an exception can be thrown before B is executed, and now the iostream state is hosed. It isn't thread-safe, either.
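
A minimal sketch of the kind of failure meant here, using a made-up Fragile type whose stream operator always throws (an assumption of the example, standing in for any operand that can fail mid-chain). The std::hex manipulator has already changed the stream when the exception escapes, and nothing puts it back.

```cpp
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical type whose formatting always fails.
struct Fragile {};

std::ostream& operator<<(std::ostream& os, const Fragile&) {
    throw std::runtime_error("formatting failed");
    return os;  // never reached
}

std::string leaked_state() {
    std::ostringstream out;
    try {
        // std::hex is applied first, then printing Fragile throws.
        out << std::hex << Fragile{};
    } catch (const std::runtime_error&) {
        // The chain was abandoned, but the stream is still in hex mode.
    }
    out << 255;  // the caller expects decimal output here
    return out.str();
}
```

With the stream left in hex mode, the later `out << 255` writes "ff" rather than "255".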
 And don't even mention that Boost
 monstrosity that uses operator overloading for compile-time regexen.
 Ick. The very thought makes me cringe.
That was one of the examples that convinced me to make it hard to do such things with D.
Aug 14
prev sibling parent Kagamin <spam here.lot> writes:
On Thursday, 8 August 2019 at 10:10:30 UTC, Russel Winder wrote:
 So if C++ has this dependency problem why doesn't D, Go, or 
 Rust?
Walter derailed the topic with parsing stuff. That dependency is not a lexical-order dependency, it's a build-sequence dependency: one module depends on the compilation output of another module, and it's not a problem of C++, it's a problem of this particular module design. D has no such dependency problem (except for idgen), you can compile D files in any order.
Aug 16