digitalmars.D - Compile-time memory footprint of std.algorithm
- Iain Buclaw (6/6) Apr 22 2014 Testing a 2.065 pre-release snapshot against GDC. I see that
- H. S. Teoh via Digitalmars-d (7/14) Apr 22 2014 Didn't we say (many months ago!) that we wanted to split up
- Peter Alexander (4/8) Apr 22 2014 My (ancient) laptop only has 2GB of RAM :-)
- Iain Buclaw via Digitalmars-d (4/12) Apr 22 2014 I blame Kenji and all the semanticTiargs and other template-related
- Dmitry Olshansky (6/21) Apr 22 2014 At a times I really don't know why can't we just drop in a Boehm GC (the...
- Walter Bright (2/5) Apr 22 2014 I made a build of dmd with a collector in it. It destroyed the speed. To...
- Dmitry Olshansky (5/14) Apr 22 2014 Getting more practical - any chance to use it selectively in CTFE and
- Walter Bright (2/4) Apr 23 2014 Using it there only will require a rewrite of interpret.c.
- Kagamin (4/6) Apr 23 2014 Is it because of garbage collections? Then allow people configure
- Walter Bright (5/10) Apr 23 2014 It's more than that. I invite you to read the article I wrote on DrDobbs...
- Dmitry Olshansky (7/20) Apr 23 2014 This stinks it's not even half-serious. A x2 speed increase was due to
- Walter Bright (6/8) Apr 23 2014 I've tried adding a collector to DMD with poor results. If you'd like to...
- Dmitry Olshansky (6/16) Apr 23 2014 That is understood, thanks for honesty.
- Steve Teale (2/12) Apr 23 2014 Well said Walter!
- "Nordlöw" (10/12) Apr 23 2014 What about packing DMD structure members such as integers and
- Peter Alexander (3/16) Apr 23 2014 Maybe we should investigate where the memory is going first
- "Nordlöw" (1/3) Apr 23 2014 I agree. Tool anyone?
- "Nordlöw" (2/3) Apr 23 2014 https://stackoverflow.com/questions/23255043/finding-unexercised-bits-of...
- Iain Buclaw via Digitalmars-d (3/6) Apr 23 2014 I'm using valgrind - may take a while to process and merge them all.
- Iain Buclaw via Digitalmars-d (8/16) Apr 23 2014 I was amazed to see some small losses in the glue (that I'll be
- Jussi Jumppanen (9/11) Apr 23 2014 FWIW one hint might be found in the DCD project found here:
- Brian Schott (6/8) Apr 23 2014 The code is actually located here:
- Kagamin (2/4) Apr 23 2014 Alternatively we could replace heap on size threshold.
- Steven Schveighoffer (7/15) Apr 23 2014 The time it takes to compile a program where the compiler consumes 2G of...
- Messenger (3/5) Apr 23 2014 (nitpick: not necessarily given good swap behaviour!)
- Jacob Carlborg (5/7) Apr 23 2014 Isn't that bad advertisement for the GC in D? Or has it something to do
- Ary Borenszweig (2/7) Apr 24 2014 dmd is written in C++, the collector must have been boehm
- Iain Buclaw via Digitalmars-d (3/14) Apr 24 2014 It wasn't IIRC. 'Twas in-house GC, no?
- Walter Bright (2/4) Apr 26 2014 It was with the C++ version of the original D collector.
- monarch_dodra (4/10) Apr 24 2014 Well, keep in mind we are comparing using the GC versus "doing
- Daniel Murphy (2/5) Apr 23 2014 Or you know, switch to D and use druntime's GC.
- Dmitry Olshansky (4/9) Apr 23 2014 Good point. Can't wait to see D-only codebase.
- Marco Leise (10/19) Apr 23 2014 ...
- Dmitry Olshansky (11/29) Apr 26 2014 No it doesn't. It used a precursor of D's GC and that turned out to be
- Ary Borenszweig (5/10) Apr 23 2014 But that will be slow.
- Iain Buclaw via Digitalmars-d (6/11) Jun 21 2014 The final nail in the coffin was when my laptop locked up building
- H. S. Teoh via Digitalmars-d (13/29) Jun 21 2014 It's long past due for std.algorithm to be broken up. And this isn't the
Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes. This is time that could be better spent if the unittests were simply broken down/split up.
Apr 22 2014
On Tue, Apr 22, 2014 at 06:09:11PM +0000, Iain Buclaw via Digitalmars-d wrote:
> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes. This is time that could be better spent if the unittests were simply broken down/split up.
Didn't we say (many months ago!) that we wanted to split up std.algorithm into more manageable chunks? I see that that hasn't happened yet. :-(
T
-- "Real programmers can write assembly code in any language. :-)" -- Larry Wall
Apr 22 2014
On Tuesday, 22 April 2014 at 18:09:12 UTC, Iain Buclaw wrote:
> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes.
My (ancient) laptop only has 2GB of RAM :-)
Has anyone looked into why it is using so much? Is it all the temporary allocations created by CTFE that are never cleaned up?
Apr 22 2014
On 22 April 2014 21:43, Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Tuesday, 22 April 2014 at 18:09:12 UTC, Iain Buclaw wrote:
>> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes.
> My (ancient) laptop only has 2GB of RAM :-) Has anyone looked into why it is using so much? Is it all the temporary allocations created by CTFE that are never cleaned up?
I blame Kenji and all the semanticTiargs and other template-related copying and discarding of memory around the place. :o)
Apr 22 2014
23-Apr-2014 01:00, Iain Buclaw via Digitalmars-d wrote:
> On 22 April 2014 21:43, Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On Tuesday, 22 April 2014 at 18:09:12 UTC, Iain Buclaw wrote:
>>> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes.
>> My (ancient) laptop only has 2GB of RAM :-) Has anyone looked into why it is using so much? Is it all the temporary allocations created by CTFE that are never cleaned up?
> I blame Kenji and all the semanticTiargs and other template-related copying and discarding of memory around the place. :o)
At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
-- Dmitry Olshansky
Apr 22 2014
On 4/22/2014 11:33 PM, Dmitry Olshansky wrote:
> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
Apr 22 2014
23-Apr-2014 10:39, Walter Bright wrote:
> On 4/22/2014 11:33 PM, Dmitry Olshansky wrote:
>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
Getting more practical - any chance to use it selectively in CTFE and related stuff that is KNOWN to generate garbage?
-- Dmitry Olshansky
Apr 22 2014
On 4/22/2014 11:56 PM, Dmitry Olshansky wrote:
> Getting more practical - any chance to use it selectively in CTFE and related stuff that is KNOWN to generate garbage?
Using it there only will require a rewrite of interpret.c.
Apr 23 2014
On Wednesday, 23 April 2014 at 06:39:04 UTC, Walter Bright wrote:
> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
Is it because of garbage collections? Then allow people to configure the collection threshold, say, collect garbage only when the heap is bigger than 16GB.
Apr 23 2014
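The threshold idea in the post above can be sketched in C++ as a small accounting wrapper. This is only an illustration under stated assumptions: `ThresholdHeap`, `noteAllocation`, and the `collect` callback are hypothetical names, and nothing here reflects how the collector Walter tried in dmd actually worked.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical accounting wrapper; `collect` stands in for a real collector
// and returns the number of bytes it managed to free.
class ThresholdHeap {
public:
    ThresholdHeap(std::size_t thresholdBytes, std::function<std::size_t()> collect)
        : threshold_(thresholdBytes), collect_(std::move(collect)) {}

    // Record an allocation; trigger a collection only once the tracked
    // heap size exceeds the configured threshold.
    void noteAllocation(std::size_t bytes) {
        live_ += bytes;
        if (live_ > threshold_) {
            std::size_t freed = collect_();
            live_ -= (freed < live_ ? freed : live_);  // clamp at zero
        }
    }

    std::size_t live() const { return live_; }

private:
    std::size_t threshold_;
    std::function<std::size_t()> collect_;
    std::size_t live_ = 0;
};
```

With a high enough threshold the collector never runs for small builds, so they would pay only the bookkeeping cost; whether that cost alone is acceptable is exactly what the replies below dispute.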
On 4/23/2014 12:20 AM, Kagamin wrote:
> On Wednesday, 23 April 2014 at 06:39:04 UTC, Walter Bright wrote:
>> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
> Is it because of garbage collections? Then allow people to configure the collection threshold, say, collect garbage only when the heap is bigger than 16GB.
It's more than that. I invite you to read the article I wrote on DrDobbs a while back about changes to the allocator to improve speed.
tl;dr: allocation is a critical speed issue with dmd. Using the bump-pointer method is very fast, and it matters.
Apr 23 2014
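For readers unfamiliar with the bump-pointer method mentioned above, here is a minimal C++ sketch. It is an assumption-laden illustration, not dmd's allocator: `BumpArena` and its members are hypothetical names, the arena has a fixed capacity, and individual objects are never freed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bump-pointer arena; the names are illustrative, not dmd's.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(std::malloc(capacity))),
          used_(0), capacity_(capacity) {
        if (!base_) throw std::bad_alloc();
    }
    ~BumpArena() { std::free(base_); }  // everything is released at once
    BumpArena(const BumpArena&) = delete;
    BumpArena& operator=(const BumpArena&) = delete;

    // Allocation is an align-up plus a pointer bump; there is no per-object
    // free. `align` must be a power of two no larger than
    // alignof(std::max_align_t).
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t used_;
    std::size_t capacity_;
};
```

The speed comes from the allocation path being a handful of arithmetic instructions with no free lists or headers; the cost is the one this thread complains about, namely that memory use only ever grows until the arena is destroyed.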
23-Apr-2014 12:12, Walter Bright wrote:
> On 4/23/2014 12:20 AM, Kagamin wrote:
>> On Wednesday, 23 April 2014 at 06:39:04 UTC, Walter Bright wrote:
>>> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
>> Is it because of garbage collections? Then allow people to configure the collection threshold, say, collect garbage only when the heap is bigger than 16GB.
> It's more than that. I invite you to read the article I wrote on DrDobbs a while back about changes to the allocator to improve speed. tl;dr: allocation is a critical speed issue with dmd. Using the bump-pointer method is very fast, and it matters.
This stinks; it's not even half-serious. A 2x speed increase was due to scrapping the old allocator on Win32 altogether and using the plain Heap API.
If the prime reason compilation is fast is because we just throw away memory, we must be doing something wrong, very wrong.
-- Dmitry Olshansky
Apr 23 2014
On 4/23/2014 2:00 AM, Dmitry Olshansky wrote:
> If the prime reason compilation is fast is because we just throw away memory, we must be doing something wrong, very wrong.
I've tried adding a collector to DMD with poor results. If you'd like to give it a try as well, please do so.
The thing is, I work all day every day on D. I cannot do more. If people want more things done, like redesigning memory allocation in the compiler, redesigning D to do ARC, etc., they'll need to pitch in.
Apr 23 2014
23-Apr-2014 21:16, Walter Bright wrote:
> On 4/23/2014 2:00 AM, Dmitry Olshansky wrote:
>> If the prime reason compilation is fast is because we just throw away memory, we must be doing something wrong, very wrong.
> I've tried adding a collector to DMD with poor results. If you'd like to give it a try as well, please do so.
I'll give it a spin then.
> The thing is, I work all day every day on D. I cannot do more.
That is understood, thanks for the honesty.
> If people want more things done, like redesigning memory allocation in the compiler, redesigning D to do ARC, etc., they'll need to pitch in.
True.
-- Dmitry Olshansky
Apr 23 2014
On Wednesday, 23 April 2014 at 17:16:40 UTC, Walter Bright wrote:
> On 4/23/2014 2:00 AM, Dmitry Olshansky wrote:
>> If the prime reason compilation is fast is because we just throw away memory, we must be doing something wrong, very wrong.
> I've tried adding a collector to DMD with poor results. If you'd like to give it a try as well, please do so. The thing is, I work all day every day on D. I cannot do more. If people want more things done, like redesigning memory allocation in the compiler, redesigning D to do ARC, etc., they'll need to pitch in.
Well said Walter!
Apr 23 2014
> tl;dr: allocation is a critical speed issue with dmd. Using the bump-pointer method is very fast, and it matters.
What about packing DMD structure members such as integers and enums more efficiently? We could start with making enums __attribute__((packed)). Is there any free static/dynamic tool to check for unexercised bits?
How does Clang save so much space compared to GCC? Do they pack more tightly or use deallocation?
A much higher-hanging fruit is to switch from using pointers to 32-bit handles on 64-bit CPUs to reference tokens, sub-expressions etc. But I guess making that type-safe is a big undertaking, and it may give performance hits.
Apr 23 2014
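The two space-saving ideas in the post above (smaller enums, 32-bit handles instead of 64-bit pointers) can be sketched in C++ as follows. Everything here is a hypothetical illustration: `Kind`, `Handle`, `Node`, `NodePool`, and `eval` are invented names and do not correspond to dmd's AST types. The sketch uses a fixed underlying enum type, which is standard C++11, rather than the GCC-specific `__attribute__((packed))`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical node kinds, stored in one byte instead of a default int.
enum class Kind : std::uint8_t { Number, Add };

// A 32-bit index into a node pool, used in place of a 64-bit pointer.
// Wrapping the index in a struct keeps it from mixing with plain integers.
struct Handle { std::uint32_t index; };

struct Node {
    Handle lhs;          // children are handles, 4 bytes each
    Handle rhs;
    std::int32_t value;  // only meaningful for Kind::Number
    Kind kind;
};

class NodePool {
public:
    // Leaf nodes leave lhs/rhs at index 0; eval never reads them for leaves.
    Handle make(Kind kind, std::int32_t value,
                Handle lhs = {0}, Handle rhs = {0}) {
        nodes_.push_back(Node{lhs, rhs, value, kind});
        return Handle{static_cast<std::uint32_t>(nodes_.size() - 1)};
    }
    const Node& get(Handle h) const { return nodes_[h.index]; }

private:
    std::vector<Node> nodes_;
};

// Every child access goes through the pool: one extra indexing step per hop.
int eval(const NodePool& pool, Handle h) {
    const Node& n = pool.get(h);
    if (n.kind == Kind::Number) return n.value;
    return eval(pool, n.lhs) + eval(pool, n.rhs);
}
```

The extra indirection through the pool on every access is the kind of performance hit the post anticipates, and every function that walks the tree needs access to the pool, which hints at why retrofitting this onto an existing compiler would be a large change.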
On Wednesday, 23 April 2014 at 19:54:29 UTC, Nordlöw wrote:
>> tl;dr: allocation is a critical speed issue with dmd. Using the bump-pointer method is very fast, and it matters.
> What about packing DMD structure members such as integers and enums more efficiently? We could start with making enums __attribute__((packed)). Is there any free static/dynamic tool to check for unexercised bits? How does Clang save so much space compared to GCC? Do they pack more tightly or use deallocation? A much higher-hanging fruit is to switch from using pointers to 32-bit handles on 64-bit CPUs to reference tokens, sub-expressions etc. But I guess making that type-safe is a big undertaking, and it may give performance hits.
Maybe we should investigate where the memory is going first before planning our attack :-)
Apr 23 2014
> Maybe we should investigate where the memory is going first before planning our attack :-)
I agree. Tool anyone?
Apr 23 2014
> I agree. Tool anyone?
https://stackoverflow.com/questions/23255043/finding-unexercised-bits-of-allocated-data
Massif may give some clues.
Apr 23 2014
On 23 April 2014 21:55, "Nordlöw" <digitalmars-d puremagic.com> wrote:
>> Maybe we should investigate where the memory is going first before planning our attack :-)
> I agree. Tool anyone?
I'm using valgrind - may take a while to process and merge them all. I'll post an update in the morning.
Apr 23 2014
On 23 April 2014 22:24, Iain Buclaw <ibuclaw gdcproject.org> wrote:
> On 23 April 2014 21:55, "Nordlöw" <digitalmars-d puremagic.com> wrote:
>>> Maybe we should investigate where the memory is going first before planning our attack :-)
>> I agree. Tool anyone?
> I'm using valgrind - may take a while to process and merge them all. I'll post an update in the morning.
I was amazed to see some small losses in the glue (that I'll be dealing with), but by and large the worst culprits were all the syntaxCopy'ing done in Template semantic analysis.
The resultant assembly file emitted by gdc is 83MB in size, so I think it is impossible to not have a large memory consumption here. The stats file is 100MB (39k reported leaks) and I'm not sure just what to do with it yet.
Apr 23 2014
On Wednesday, 23 April 2014 at 20:04:09 UTC, Peter Alexander wrote:
> Maybe we should investigate where the memory is going first before planning our attack :-)
FWIW one hint might be found in the DCD project found here:
https://github.com/Hackerpilot/DCD/
In that project compiling the lexer.d file causes a massive increase in compiler memory usage. More details found here:
https://github.com/Hackerpilot/DCD/issues/93
NOTE: That was DMD running on a 32 bit Windows XP machine.
Apr 23 2014
On Wednesday, 23 April 2014 at 23:19:20 UTC, Jussi Jumppanen wrote:
> In that project compiling the lexer.d file causes a massive increase in compiler memory usage.
The code is actually located here:
https://github.com/Hackerpilot/Dscanner
If you want to make DMD cry, compile it with "-O -inline -release".
Apr 23 2014
On Wednesday, 23 April 2014 at 08:12:42 UTC, Walter Bright wrote:
> tl;dr: allocation is a critical speed issue with dmd. Using the bump-pointer method is very fast, and it matters.
Alternatively we could replace heap on size threshold.
Apr 23 2014
On Wed, 23 Apr 2014 02:39:05 -0400, Walter Bright <newshound2 digitalmars.com> wrote:
> On 4/22/2014 11:33 PM, Dmitry Olshansky wrote:
>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
The time it takes to compile a program where the compiler consumes 2G of ram on a 2G machine is infinite ;)
There must be some compromise between slow-but-perfect memory management and invoking the OOM killer.
-Steve
Apr 23 2014
On Wednesday, 23 April 2014 at 15:46:00 UTC, Steven Schveighoffer wrote:
> The time it takes to compile a program where the compiler consumes 2G of ram on a 2G machine is infinite ;)
(nitpick: not necessarily given good swap behaviour!)
Apr 23 2014
On 23/04/14 08:39, Walter Bright wrote:
> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
Isn't that bad advertisement for the GC in D? Or has it something to do with DMD not being designed with a GC in mind?
-- /Jacob Carlborg
Apr 23 2014
On 4/24/14, 3:16 AM, Jacob Carlborg wrote:
> On 23/04/14 08:39, Walter Bright wrote:
>> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
> Isn't that bad advertisement for the GC in D? Or has it something to do with DMD not being designed with a GC in mind?
dmd is written in C++, the collector must have been Boehm
Apr 24 2014
On 24 April 2014 12:01, Ary Borenszweig via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 4/24/14, 3:16 AM, Jacob Carlborg wrote:
>> On 23/04/14 08:39, Walter Bright wrote:
>>> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
>> Isn't that bad advertisement for the GC in D? Or has it something to do with DMD not being designed with a GC in mind?
> dmd is written in C++, the collector must have been Boehm
It wasn't IIRC. 'Twas in-house GC, no?
Apr 24 2014
On 4/24/2014 7:16 AM, Iain Buclaw via Digitalmars-d wrote:
> On 24 April 2014 12:01, Ary Borenszweig via Digitalmars-d
> It wasn't IIRC. 'Twas in-house GC, no?
It was with the C++ version of the original D collector.
Apr 26 2014
On Thursday, 24 April 2014 at 06:16:05 UTC, Jacob Carlborg wrote:
> On 23/04/14 08:39, Walter Bright wrote:
>> I made a build of dmd with a collector in it. It destroyed the speed. Took it out.
> Isn't that bad advertisement for the GC in D? Or has it something to do with DMD not being designed with a GC in mind?
Well, keep in mind we are comparing using the GC versus "doing nothing". I'd be interested in knowing the speed with *any* memory management model in DMD.
Apr 24 2014
"Dmitry Olshansky" wrote in message news:lj7mrr$1p5s$1 digitalmars.com...At a times I really don't know why can't we just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.Or you know, switch to D and use druntime's GC.
Apr 23 2014
23-Apr-2014 20:56, Daniel Murphy wrote:
> "Dmitry Olshansky" wrote in message news:lj7mrr$1p5s$1 digitalmars.com...
>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
> Or you know, switch to D and use druntime's GC.
Good point. Can't wait to see D-only codebase.
-- Dmitry Olshansky
Apr 23 2014
On Wed, 23 Apr 2014 21:23:17 +0400, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> 23-Apr-2014 20:56, Daniel Murphy wrote:
>> "Dmitry Olshansky" wrote in message news:lj7mrr$1p5s$1 digitalmars.com...
>>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
>> Or you know, switch to D and use druntime's GC.
> Good point. Can't wait to see D-only codebase.
Hmm. DMD doesn't use a known and tried, imprecise GC because it is a lot slower. How is DMD written in D using the druntime GC going to help that? I wondered about this ever since there was talk about DDMD. I'm totally expecting compile times to multiply by 1.2 or so.
-- Marco
Apr 23 2014
24-Apr-2014 05:12, Marco Leise wrote:
> On Wed, 23 Apr 2014 21:23:17 +0400, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
>> 23-Apr-2014 20:56, Daniel Murphy wrote:
>>> "Dmitry Olshansky" wrote in message news:lj7mrr$1p5s$1 digitalmars.com...
>>>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
>>> Or you know, switch to D and use druntime's GC.
>> Good point. Can't wait to see D-only codebase.
> Hmm. DMD doesn't use a known and tried, imprecise GC because it is a lot slower.
No it doesn't. It used a precursor of D's GC and that turned out to be slow. See Walter's post.
> How is DMD written in D using the druntime GC going to help that?
The GC is that much easier to reach; every enhancement to D's GC becomes instantly available. Wanna make the compiler faster - make D's runtime faster! ;)
> I wondered about this ever since there was talk about DDMD. I'm totally expecting compile times to multiply by 1.2 or so.
Since memory management is going to stay the same with disabled GC (at least for starters), I doubt things will change radically. If they do, then it'll just highlight perf problems in D's runtime that need work.
-- Dmitry Olshansky
Apr 26 2014
On 4/23/14, 1:56 PM, Daniel Murphy wrote:
> "Dmitry Olshansky" wrote in message news:lj7mrr$1p5s$1 digitalmars.com...
>> At times I really don't know why we can't just drop in a Boehm GC (the stock one, not homebrew stuff) and be done with it. Speed? There is no point in speed if it leaks that much.
> Or you know, switch to D and use druntime's GC.
But that will be slow. Walter's point is that if you introduce a GC it will be slower.
Of course, you won't be able to compile big stuff. But developers usually have good machines, so it's not that big a deal.
Apr 23 2014
On 22 April 2014 19:09, Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes. This is time that could be better spent if the unittests were simply broken down/split up.
The final nail in the coffin was when my laptop locked up building phobos development using dmd. Went out and bought an SSD disk and replaced my crippled HDD drive - expecting no further problems in the near future...
Jun 21 2014
On Sat, Jun 21, 2014 at 09:34:35PM +0100, Iain Buclaw via Digitalmars-d wrote:
> On 22 April 2014 19:09, Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> Testing a 2.065 pre-release snapshot against GDC. I see that std.algorithm now surpasses 2.1GBs of memory consumption when compiling unittests. This is bringing my laptop down to its knees for a painful 2/3 minutes. This is time that could be better spent if the unittests were simply broken down/split up.
> The final nail in the coffin was when my laptop locked up building phobos development using dmd. Went out and bought an SSD disk and replaced my crippled HDD drive - expecting no further problems in the near future...
It's long past due for std.algorithm to be broken up. And this isn't the first time problems like this came up, either.
I vaguely recall someone working on an algorithms module, potentially splitting up some of the stuff from the current std.algorithm; whatever happened with that?
In fact, splitting std.algorithm has been mentioned so many times, that I feel like I should just shut up and submit a PR for it instead. Even if it gets rejected, at least it gets things moving instead of everyone talking about it yet nothing ever comes of it.
T
-- Perhaps the most widespread illusion is that if we were in power we would behave very differently from those who now hold it---when, in truth, in order to get power we would have to become very much like them. -- Unknown
Jun 21 2014