
digitalmars.D - Ref counting for CTFE?

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
One subject that came up frequently in the talks at dconf was the poor performance of  
CTFE and mixins.

The major issue as I understand it (maybe I'm wrong) is the vast amounts  
of memory the compiler consumes while building mixin strings. In fact, one  
of the talks (can't remember which one) mentioned that someone had to  
build their project in steps so the compiler did not crash from OOM.

In CTFE, we are not constrained by the runtime GC, and in fact, we have no  
GC at this point (it's disabled). What about implementing rudimentary,  
possibly slow but correct, reference counting for CTFE allocated data? It  
doesn't have to be perfect, but something that prevents consumption of GB  
of memory to compile a project may make the difference between actually  
compiling a project and not. It would also be a nice little confined  
environment to try out ref counting + GC for cycles.
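To make the idea concrete, here is a minimal sketch (in C++, since that is what the compiler is written in) of what naive reference counting on CTFE values could look like. `CtfeValue`, `retain`, and `release` are invented names for illustration only; DMD's real `Expression` hierarchy looks nothing like this.

```cpp
#include <cassert>

// Hypothetical sketch only: an intrusive reference count on an
// interpreter value. All names here are invented for illustration.
struct CtfeValue {
    int refs;      // how many live references point at this value
    long payload;  // the interpreted data

    explicit CtfeValue(long v) : refs(1), payload(v) {}
};

void retain(CtfeValue* v) { ++v->refs; }

// Returns true if the value was actually freed.
bool release(CtfeValue* v) {
    if (--v->refs == 0) {
        delete v;  // freed immediately, instead of never
        return true;
    }
    return false;
}
```

Cycles would still need a backup GC, as suggested above, but for the mixin-string workloads being discussed, values are overwhelmingly acyclic.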

As a side note, I find it weird that the compiler has given up freeing  
memory, EVER, in order to achieve speed. Sure, the killing of the process  
will act as the ultimate GC, but when it is killed before the process is  
done compiling, the time it takes to compile approaches infinity. And when  
the computer helpfully starts swapping to avoid killing the process,  
things aren't much better.

I understand compiler speed is really important. But it's not worth much  
if the speed comes at the cost of ACTUALLY COMPILING.

It reminds me of the tango.xml accolades. Not everyone realizes that it  
only works on a fully memory-loaded XML file. Sure, it's fast after that,  
but you can't just discount the time it takes to load (or the memory  
required!)

-Steve
May 29 2014
next sibling parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
It would be nice if Don could elaborate on his comment in bug 6498 
(https://issues.dlang.org/show_bug.cgi?id=6498#c1).

I.e. what has been done, what needs to be done, and what a "proper 
fix" would look like.
I think it was stated somewhere that the goal was to implement 
reference counting.

This would help people pick up where he left off.
May 29 2014
parent reply "Martin Nowak" <code dawg.eu> writes:
On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:
 If would be nice if Don could elaborate on his comment in bug 
https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value.

IMO the underlying problem is that CTFE operates on full AST nodes. To solve this, we either need a separate data representation for CTFE or, even better, work on JITing.
May 30 2014
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 31 May 2014 at 02:40:29 UTC, Martin Nowak wrote:
 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:
 If would be nice if Don could elaborate on his comment in bug 
https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value. IMO the underlying problem is that CTFE operates on full AST nodes. To solve this, we either need a separate data representation for CTFE or even better work on JITing.
JITing can wait. Just an abstract interpreter would be a huge improvement on what we have currently. As long as it is designed to allow JITing in the future.
Jun 01 2014
parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 1 June 2014 at 10:46:52 UTC, Peter Alexander wrote:
 On Saturday, 31 May 2014 at 02:40:29 UTC, Martin Nowak wrote:
 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:
 If would be nice if Don could elaborate on his comment in bug 
https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value. IMO the underlying problem is that CTFE operates on full AST nodes. To solve this, we either need a separate data representation for CTFE or even better work on JITing.
JITing can wait. Just an abstract interpreter would be a huge improvement on what we have currently. As long as it is designed to allow JITing in the future.
JITing doesn't need to wait. In fact, it is already there! Now come help implement the rest of the language in SDC :D
Jun 01 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 30 May 2014 22:40:29 -0400, Martin Nowak <code dawg.eu> wrote:

 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:

 (https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value.
Wouldn't ref-counting actually help this? -Steve
Jun 02 2014
parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 2 June 2014 at 14:16:50 UTC, Steven Schveighoffer
wrote:
 On Fri, 30 May 2014 22:40:29 -0400, Martin Nowak <code dawg.eu> 
 wrote:

 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:
 If would be nice if Don could elaborate on his comment in bug 
https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value.
Wouldn't ref-counting actually help this? -Steve
You don't need to optimize with ref-counting if you don't allocate new instances at all ;)
Jun 02 2014
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 02 Jun 2014 10:47:56 -0400, Dicebot <public dicebot.lv> wrote:

 On Monday, 2 June 2014 at 14:16:50 UTC, Steven Schveighoffer
 wrote:
 On Fri, 30 May 2014 22:40:29 -0400, Martin Nowak <code dawg.eu> wrote:

 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:

 (https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value.
Wouldn't ref-counting actually help this? -Steve
You don't need to optimize with ref-counting if you don't allocate new instances at all ;)
Sure, but if it's a case of re-implementing CTFE from the ground up versus changing the memory allocator, which is easier to make happen first? Speaking seriously from ignorance, I have no idea.

Note: I think ref-counting will still help even when ++x doesn't allocate.

-Steve
Jun 02 2014
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 2 June 2014 at 14:47:57 UTC, Dicebot wrote:
 On Monday, 2 June 2014 at 14:16:50 UTC, Steven Schveighoffer
 wrote:
 On Fri, 30 May 2014 22:40:29 -0400, Martin Nowak 
 <code dawg.eu> wrote:

 On Thursday, 29 May 2014 at 15:28:28 UTC, safety0ff wrote:
 If would be nice if Don could elaborate on his comment in 
https://issues.dlang.org/show_bug.cgi?id=6498#c1)
What is really needed is the ability to update variables in place. Currently every mutation allocates a new value.
Wouldn't ref-counting actually help this? -Steve
You don't need to optimize with ref-counting if you don't allocate new instances at all ;)
Even if you do, you could create a pool for CTFE allocations. At the end, you move the objects you are interested in from that pool to some other memory location and trash the whole pool.
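A minimal sketch of that pool idea, in C++ (the compiler's implementation language). The `Region` type and its interface are invented for illustration: allocation just bumps a pointer, nothing is freed individually, survivors are copied out, and destroying the region releases everything at once.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of region allocation for a CTFE invocation: bump-pointer
// allocation from large blocks, freed all at once when the region dies.
class Region {
    std::vector<char*> blocks_;
    char* cur_ = nullptr;
    size_t used_ = 0, cap_ = 0;

public:
    void* alloc(size_t n) {
        n = (n + 15) & ~static_cast<size_t>(15);  // 16-byte alignment
        if (cur_ == nullptr || used_ + n > cap_) {
            cap_ = n > 4096 ? n : 4096;           // grab a fresh block
            cur_ = new char[cap_];
            used_ = 0;
            blocks_.push_back(cur_);
        }
        void* p = cur_ + used_;
        used_ += n;
        return p;
    }

    // "Move the objects you are interested in" out before the pool dies.
    void* copyOut(const void* src, size_t n) {
        char* dst = new char[n];
        std::memcpy(dst, src, n);
        return dst;
    }

    ~Region() {                                   // trash the whole pool
        for (char* b : blocks_) delete[] b;
    }
};
```

Each top-level CTFE call would get its own region; the return value is copied out just before the destructor runs, so intermediate garbage costs nothing to reclaim.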
Jun 02 2014
parent "Dicebot" <public dicebot.lv> writes:
On Monday, 2 June 2014 at 20:54:28 UTC, deadalnix wrote:
 Even if you do, you could create a pool for allocating CTFE. At
 the end, you move objects you are interested in from that pool 
 to
 some other memory location and trash the whole pool.
I proposed that during DConf as a temporary workaround ;) (region allocation for each CTFE chain)
Jun 02 2014
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 5/29/14, 12:22 PM, Steven Schveighoffer wrote:
 One subject that frequented the talks at dconf was the poor performance
 of CTFE and mixins.

 The major issue as I understand it (maybe I'm wrong) is the vast amounts
 of memory the compiler consumes while building mixin strings. In fact,
 one of the talks (can't remember which one) mentioned that someone had
 to build their project in steps so the compiler did not crash from OOM.
If you add reference counting or a GC to the compiler, it will make those large projects compile, but it will inevitably be slower than now. That's why Walter disabled GC completely in the compiler (turning it on made the compiler really slow).

I think the right steps are:

1. Enable some kind of GC.
2. Profile and see where the bottlenecks are.
3. Optimize those cases.
4. Go to 2.
May 29 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, May 29, 2014 at 01:13:39PM -0300, Ary Borenszweig via Digitalmars-d
wrote:
 On 5/29/14, 12:22 PM, Steven Schveighoffer wrote:
One subject that frequented the talks at dconf was the poor
performance of CTFE and mixins.

The major issue as I understand it (maybe I'm wrong) is the vast
amounts of memory the compiler consumes while building mixin strings.
In fact, one of the talks (can't remember which one) mentioned that
someone had to build their project in steps so the compiler did not
crash from OOM.
If you add reference counting or a GC to the compiler, it will make those large projects compile, but it will inevitably be slower than now. That's why Walter disabled GC completely in the compiler (turning it on made the compiler really slow).

I think the right steps are:

1. Enable some kind of GC.
2. Profile and see where the bottlenecks are.
3. Optimize those cases.
4. Go to 2.

Shouldn't it be as simple as a compiler switch to enable compile-time GC?

T

-- 
The day Microsoft makes something that doesn't suck is probably the day they start making vacuum cleaners... -- Slashdotter
May 29 2014
prev sibling parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Thursday, 29 May 2014 at 16:13:40 UTC, Ary Borenszweig wrote:
 If you add reference counting or a GC to the compiler, it will 
 make those large projects compile, but it will inevitably be 
 slower than now. That's why Walter disabled GC completely in 
 the compiler (turning it on made the compiler really slow).
AFAIK, he was talking about a GC for the whole compiler and not a CTFE-only GC/RC.

AFAIK, before the data structures created by CTFE join the AST, they get "scrubbed", which could help us implement a self-contained memory-management strategy for CTFE.
May 29 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 29 May 2014 13:07:17 -0400, safety0ff <safety0ff.dev gmail.com>  
wrote:

 On Thursday, 29 May 2014 at 16:13:40 UTC, Ary Borenszweig wrote:
 If you add reference counting or a GC to the compiler, it will make  
 those large projects compile, but it will inevitably be slower than  
 now. That's why Walter disabled GC completely in the compiler (turning  
 it on made the compiler really slow).
AFAIK, he was talking about a whole compiler GC and not a CTFE only GC/RC. AFAIK, before the data structures created by CTFE join the AST, they get "scrubbed" which could help us implement a self-contained memory managing strategy for CTFE.
If by "he" you mean me, I was talking only about CTFE, not the whole compiler.

In general, any data the compiler generates while actually compiling is stored for later reference. Using a GC is not going to help there, because the compiler doesn't generally create much garbage. But CTFE is full of code that expects to have a GC running, e.g. string concatenation for mixins, etc.

Not only that, CTFE functions are generally run under the same (or even stricter) rules as strong-pure functions, so it would be entirely conceivable to just throw away all the memory a function allocates except for its return value. But I don't know how that would fare.

The CTFE engine also does REALLY stupid things with memory management within CTFE functions; I think this is where RC would help. The bug you referred to doesn't even use anything that would normally allocate in normal code!

-Steve
May 29 2014
next sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Thursday, 29 May 2014 at 17:33:15 UTC, Steven Schveighoffer 
wrote:
 On Thu, 29 May 2014 13:07:17 -0400, safety0ff 
 <safety0ff.dev gmail.com> wrote:

 On Thursday, 29 May 2014 at 16:13:40 UTC, Ary Borenszweig 
 wrote:
 If you add reference counting or a GC to the compiler, it 
 will make those large projects compile, but it will 
 inevitably be slower than now. That's why Walter disabled GC 
 completely in the compiler (turning it on made the compiler 
 really slow).
AFAIK, he was talking about a whole compiler GC and not a CTFE only GC/RC. AFAIK, before the data structures created by CTFE join the AST, they get "scrubbed" which could help us implement a self-contained memory managing strategy for CTFE.
If by "he" you mean me, I was talking only about CTFE, not the whole compiler.
By "he", I meant Walter.
May 29 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/29/2014 07:33 PM, Steven Schveighoffer wrote:
 But CTFE is full of code that expects to have a GC running, e.g. string
 concatenation for mixins, etc.
Even the following code runs out of memory on my machine:

int foo(){
    foreach(i;0..100000000){}
    return 2;
}

pragma(msg, foo());

I.e. incrementing the loop counter consumes memory.
May 29 2014
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 29 May 2014 13:54:07 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/29/2014 07:33 PM, Steven Schveighoffer wrote:
 But CTFE is full of code that expects to have a GC running, e.g. string
 concatenation for mixins, etc.
Even the following code runs out of memory on my machine:

int foo(){
    foreach(i;0..100000000){}
    return 2;
}

pragma(msg, foo());

I.e. incrementing the loop counter consumes memory.

Yes, this is the bug referenced earlier by safety0ff.

-Steve
May 29 2014
prev sibling parent "Remo" <remo4d gmail.com> writes:
On Thursday, 29 May 2014 at 17:54:08 UTC, Timon Gehr wrote:
 On 05/29/2014 07:33 PM, Steven Schveighoffer wrote:
 But CTFE is full of code that expects to have a GC running, 
 e.g. string
 concatenation for mixins, etc.
Even the following code runs out of memory on my machine:

int foo(){
    foreach(i;0..100000000){}
    return 2;
}

pragma(msg, foo());

I.e. incrementing the loop counter consumes memory.

Using an x64 build of DMD, this code compiles, but it consumes about 9 GB of RAM. So this is really not optimal.
May 29 2014
prev sibling next sibling parent reply "Dylan Knutson" <tcdknutson gmail.com> writes:
I'm not well acquainted with how the compiler works internally, 
or how CTFE is implemented. But it seems like a full-blown D 
interpreter with eval functionality is needed. Lots of scripting 
language interpreters exist out there, and they all get 
relatively decent performance and memory footprints (or at least 
much better than what DMD can get when performing CTFE).

Is there anything so radically different in D than these other 
languages, that prevents the implementation of a run-of-the-mill 
VM to eval D code? It just seems strange to me that it's such a 
problem when this is basically solved by all scripting languages. 
And I'm really not trying to downplay the difficulty in 
implementing CTFE in D, but rather just figure out why it's so 
hard to implement in comparison.
May 29 2014
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 29 May 2014 12:53:54 -0400, Dylan Knutson <tcdknutson gmail.com>  
wrote:

 I'm not well acquainted with how the compiler works internally, or how  
 CTFE is implemented. But it seems like a full-blown D interpreter with  
 eval functionality is needed. Lots of scripting language interpreters  
 exist out there, and they all get relatively decent performance and  
 memory footprints (or at least much better than what DMD can get when  
 performing CTFE).

 Is there anything so radically different in D than these other  
 languages, that prevents the implementation of a run-of-the-mill VM to  
 eval D code? It just seems strange to me that it's such a problem when  
 this is basically solved by all scripting languages. And I'm really not  
 trying to downplay the difficulty in implementing CTFE in D, but rather  
 just figure out why it's so hard to implement in comparison.
The compilation speed of D is touted continually as a very important "feature". Realistically, as long as you have the memory to sustain it, you will NEVER beat the current implementation, because it never deallocates anything. By definition, deallocating some things will add to the compile time.

I think as long as the slowdown doesn't result in an order-of-magnitude difference, we should be fine. Nobody will complain about 5-second compile times vs. 4-second ones (well, mostly nobody). People will definitely notice and complain about 15-second compile times vs. 4-second ones. It remains to be seen how any of this would perform.

It is an idea to think about, and I didn't realize someone had already considered it. The thing I like about it is that the environment is a restrictive one where we can try out ref counting + GC without adversely affecting actual running D code. Once the compiler is finished, there are no traces of the CTFE interpreted heap. See the recent threads about the issues with adding ref counting to the language.

And when I say "we", I mean people other than me who can write compiler code :) So there is that... (ducks)

-Steve
May 29 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/29/2014 06:53 PM, Dylan Knutson wrote:
 ...

 Is there anything so radically different in D than these other
 languages, that prevents the implementation of a run-of-the-mill VM to
 eval D code?
No. (In fact, I've written a naive but mostly complete byte code interpreter in half a week or so last year, as part of an ongoing recreational D front end implementation effort.)
 It just seems strange to me that it's such a problem when
 this is basically solved by all scripting languages. And I'm really not
 trying to downplay the difficulty in implementing CTFE in D, but rather
 just figure out why it's so hard to implement in comparison.
CTFE is somewhat intertwined with semantic analysis, which makes it a little harder to specify/implement than the usual interpreter.

However, the performance problem is mostly a structural issue of the current implementation: DMD's CTFE interpreter gradually grew out of its constant folder in some kind of best-effort fashion, as far as I understand. It is feasible to do everything in the usual fashion and occasionally just pause or restart interpretation at well-defined points where it needs to interface with semantic analysis.
May 29 2014
parent "Don" <x nospam.com> writes:
On Thursday, 29 May 2014 at 18:12:59 UTC, Timon Gehr wrote:
 On 05/29/2014 06:53 PM, Dylan Knutson wrote:
 ...

 Is there anything so radically different in D than these other
 languages, that prevents the implementation of a 
 run-of-the-mill VM to
 eval D code?
No. (In fact, I've written a naive but mostly complete byte code interpreter in half a week or so last year, as part of an ongoing recreational D front end implementation effort.)
 It just seems strange to me that it's such a problem when
 this is basically solved by all scripting languages. And I'm 
 really not
 trying to downplay the difficulty in implementing CTFE in D, 
 but rather
 just figure out why it's so hard to implement in comparison.
CTFE is somewhat intertwined with semantic analysis, which makes it a little harder to specify/implement than usual interpreters. However, the performance problem is mostly a structural issue of the current implementation: DMDs CTFE interpreter gradually grew out of its constant folder in some kind of best effort fashion as far as I understand. It is feasible to do everything in the usual fashion and occasionally just pause or restart interpretation at well-defined points where it needs to interface with semantic analysis.
Exactly. Historically, most of the work I've done on CTFE was in fixing up the relationship between CTFE and the rest of the compiler, ironing out all of the weird semantic interactions. Almost *nothing* has ever been done on the CTFE implementation itself. The implementation is the crappiest thing you could imagine; it leaks memory like BP leaks oil. It's been hard to fix not because doing a JIT is hard, but because of the semantic interaction bugs. The good news is that most of those are fixed now.

But it's worth mentioning that at dconf, CTFE and mixins were blamed for many things they aren't responsible for. For example, Phobos takes forever to compile, but that has nothing to do with CTFE. Phobos is slow to compile because everything imports everything else, and it instantiates nearly a million templates. I.e., an infinitely fast CTFE engine would make very little difference to Phobos compile times.
Jun 03 2014
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 29 May 2014 at 15:22:54 UTC, Steven Schveighoffer 
wrote:
 One subject that frequented the talks at dconf was the poor 
 performance of CTFE and mixins.

 The major issue as I understand it (maybe I'm wrong) is the 
 vast amounts of memory the compiler consumes while building 
 mixin strings. In fact, one of the talks (can't remember which 
 one) mentioned that someone had to build their project in steps 
 so the compiler did not crash from OOM.
It was during my talk. I mentioned the experience of Etienne (etcimon) when building an ASN.1 parser using Pegged, which was impossible to compile unless all the code generation was done in a separate run from the actual project compilation.

I have talked about it with Don, and he says that many CTFE memory allocation issues can be fixed without any GC by simply improving the interpreter implementation. For example, incrementing an integer allocates a new instance right now, AFAIR. But he is very unlikely to work on it personally in any near future.
 As a side note, I find it weird that the compiler has given up 
 freeing memory, EVER, in order to achieve speed. Sure, the 
 killing of the process will act as the ultimate GC, but when it 
 is killed before the process is done compiling, the time it 
 takes to compile approaches infinity. And when the computer 
 helpfully starts swapping to avoid killing the process, things 
 aren't much better.
I'd love to see a command-line flag that enables garbage collection in the compiler (disabled by default). It does not matter how fast the compiler is if it crashes on a big project. And the difference between 10 seconds and 30 seconds is not as important as the difference between 2 seconds and 10 seconds anyway.
May 29 2014
parent reply "Puming" <zhaopuming gmail.com> writes:
On Thursday, 29 May 2014 at 20:44:43 UTC, Dicebot wrote:

 I'd love to see command-line flag that enables garbage 
 collection in compiler (disabled by default). It does not 
 matter how fast compiler is if it crashes on big project. And 
 difference between 10 seconds vs 30 seconds is not as important 
 as difference between 2 seconds vs 10 seconds anyway.
I'd like to provide another use case: I use vibe.d to host my website on a DigitalOcean virtual machine with 512M of RAM, which is the cheapest and most popular VPS solution out there. But I can't build my dub/vibe.d project on it, because 512M of RAM is far from enough for any CTFE-related code to build. It crashes every time.

My solution now is to use a VirtualBox Ubuntu on my Mac/Win8 machine to build the project and rsync it onto DigitalOcean, which is a very slow turnaround.

I think that to make D-based web programming popular, we have to make it possible to run on most of the cloud PaaS platforms (see Python/GoogleAppEngine and Ruby/Heroku and all those PHP machines out there). 512M of RAM is a crucial limit for the compiler's memory usage if we really want that to happen.
May 29 2014
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 29 May 2014 23:11:25 -0400, Puming <zhaopuming gmail.com> wrote:

 On Thursday, 29 May 2014 at 20:44:43 UTC, Dicebot wrote:

 I'd love to see command-line flag that enables garbage collection in  
 compiler (disabled by default). It does not matter how fast compiler is  
 if it crashes on big project. And difference between 10 seconds vs 30  
 seconds is not as important as difference between 2 seconds vs 10  
 seconds anyway.
I'd like to provide another use case: I use vibe.d to host my website in a DigitalOcean virtual machine with 512M RAM, which is the cheapest and most popular VPS sulotion out there. But I can't build my dub/vibe.d project on it because 512M RAM is far from enough for any CTFE related code to build. It crashes every time. My solution now is to use a VirtualBox ubuntu on my Mac/Win8 to build the project and rsync it on to DigitalOcean. Which is very slow turnaround.
I have the same problem. Dreamhost will actually kill the whole VPS system if you use up the memory since it's a vLinux system. -Steve
May 30 2014
prev sibling parent "w0rp" <devw0rp gmail.com> writes:
On Friday, 30 May 2014 at 03:11:27 UTC, Puming wrote:
 On Thursday, 29 May 2014 at 20:44:43 UTC, Dicebot wrote:

 I'd love to see command-line flag that enables garbage 
 collection in compiler (disabled by default). It does not 
 matter how fast compiler is if it crashes on big project. And 
 difference between 10 seconds vs 30 seconds is not as 
 important as difference between 2 seconds vs 10 seconds anyway.
I'd like to provide another use case: I use vibe.d to host my website in a DigitalOcean virtual machine with 512M RAM, which is the cheapest and most popular VPS sulotion out there. But I can't build my dub/vibe.d project on it because 512M RAM is far from enough for any CTFE related code to build. It crashes every time. My solution now is to use a VirtualBox ubuntu on my Mac/Win8 to build the project and rsync it on to DigitalOcean. Which is very slow turnaround. I think to make D based web programming popular, we have to make it possible to run on most of the cloud PAAS platforms (see Python/GoogleAppEngine and Ruby/Heroku and all those PHP machines out there). 512M RAM is a crucial deadline for the compiler's memory usage if we really want that to happen.
I have run into exactly this issue before, which I mentioned in the redesign thread. I'm not sure how this can be fixed, but it does need to be fixed. I think perhaps a compiler switch for lower-memory environments would be acceptable.
Jun 02 2014
prev sibling next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 29 May 2014 11:22:54 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 One subject that frequented the talks at dconf was the poor
 performance of CTFE and mixins.

 The major issue as I understand it (maybe I'm wrong) is the vast
 amounts of memory the compiler consumes while building mixin strings.
 In fact, one of the talks (can't remember which one) mentioned that
 someone had to build their project in steps so the compiler did not
 crash from OOM.

 In CTFE, we are not constrained by the runtime GC, and in fact, we
 have no GC at this point (it's disabled). What about implementing
 rudimentary, possibly slow but correct, reference counting for CTFE
 allocated data? It doesn't have to be perfect, but something that
 prevents consumption of GB of memory to compile a project may make
 the difference between actually compiling a project and not. It would
 also be a nice little confined environment to try out ref counting +
 GC for cycles.
That might help, but the core problem with CTFE (as Don explains it) is that currently each value sits on the heap, and when you mutate it, you get a whole new object allocated on the heap. So, something like

int i = 0;
while(i < 10)
    ++i;

would be allocating a value for i on the heap 10 times, e.g. something like

int* i = new int(0);
while(*i < 10)
    i = new int(*i + 1);

So, you end up with an insane number of allocations for basic stuff.

CTFE was originally pretty much a hack in the compiler, so it was a huge mess. Don went to a lot of time and effort to clean it up so that it actually has a single entry point in the compiler instead of being scattered throughout the compiler in hard-to-find places. All of that had to be done _before_ performance improvements could even be explored. Unfortunately, after Don got to that point last year, he didn't have time to continue working on it, and no one else has picked up the torch (I expect that he'll be back to it, but I don't know when).

Don is convinced that simply making it so that CTFE has true mutation for _integers_ (without even doing it for anything else yet) would result in enormous speed gains (and it would obviously significantly reduce the memory requirements in the process). So, it looks like there are fundamental issues with CTFE that really should be solved before we discuss stuff like reference counting its memory. And just solving those could make reference counting irrelevant.

- Jonathan M Davis
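The cost of that boxing can be demonstrated with a tiny simulation (C++ here, with invented names `Boxed` and `box`; this is not DMD's actual code): if every integer value lives in its own heap object, a loop counting to 10 performs eleven allocations where in-place mutation would need zero.

```cpp
#include <cassert>

// Simulates an interpreter that boxes every integer value on the heap.
// Boxed and box() are invented for illustration.
static int g_allocs = 0;

struct Boxed {
    int value;
};

Boxed* box(int v) {
    ++g_allocs;          // count every heap allocation
    return new Boxed{v};
}

int runLoop() {
    Boxed* i = box(0);               // int i = 0;
    while (i->value < 10)
        i = box(i->value + 1);       // ++i allocates a fresh box
    return i->value;                 // (the old boxes are simply leaked,
}                                    //  as in the current CTFE engine)
```

Scale the loop bound up to 100000000, as in Timon's example elsewhere in this thread, and the memory blow-up follows directly.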
May 29 2014
prev sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 29 May 2014 09:16:26 -0700
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 On Thu, May 29, 2014 at 01:13:39PM -0300, Ary Borenszweig via
 Digitalmars-d wrote:
 On 5/29/14, 12:22 PM, Steven Schveighoffer wrote:
One subject that frequented the talks at dconf was the poor
performance of CTFE and mixins.

The major issue as I understand it (maybe I'm wrong) is the vast
amounts of memory the compiler consumes while building mixin
strings. In fact, one of the talks (can't remember which one)
mentioned that someone had to build their project in steps so the
compiler did not crash from OOM.
If you add reference counting or a GC to the compiler, it will make those large projects compile, but it will inevitably be slower than now. That's why Walter disabled GC completely in the compiler (turning it on made the compiler really slow).

I think the right steps are:

1. Enable some kind of GC.
2. Profile and see where the bottlenecks are.
3. Optimize those cases.
4. Go to 2.
Shouldn't be as simple as a compiler switch to enable compile-time GC?
The compiler has a GC in it already. It's just that it's disabled, because enabling it seriously slowed down compilation.

What we should probably do is simply make it so that the compiler uses a GC when it actually runs out of memory but otherwise makes no attempt at deallocation. That way, it's efficient for normal compilation, and the programs that run out of memory while compiling can still be compiled.

- Jonathan M Davis
May 29 2014