digitalmars.D.learn - long compile time question
- Dan (15/15) Oct 23 2012 The following takes nearly three minutes to compile.
- 1100110 (7/22) Oct 23 2012 I have the same issue on linux x64 2.060
- Era Scarecrow (24/31) Oct 24 2012 It appears it's all happening during copying init, why I am not
- Maxim Fomin (5/38) Oct 24 2012 According to assembly dmd just generates repetitive instructions
- thedeemon (21/36) Oct 24 2012 The code DMD generates for initializing the struct does not use
- Don Clugston (2/40) Oct 24 2012 That's incredibly horrible, please add to bugzilla.
- H. S. Teoh (25/57) Oct 24 2012 [...]
- thedeemon (8/13) Oct 24 2012 Must be other factors. "Optimized" code (generated by dmd with
- Jonathan M Davis (6/7) Oct 24 2012 There are at least a couple of potentially related bugs already:
- Era Scarecrow (6/14) Oct 24 2012 That seems silly. I would think after the struct's init/contents
- H. S. Teoh (10/27) Oct 24 2012 Not true. Memset is *usually* used to set memory to 0, but it can set
- Manfred Nowak (13/14) Oct 24 2012 ... and this returns immediately:
- Jonathan M Davis (6/23) Oct 24 2012 That's incredibly dangerous though, because than B.init has garbage in i...
The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this? Thanks, Dan ---------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x; } void main() { B[] barr; barr ~= B(); } -----------------
Oct 23 2012
On Tue, 23 Oct 2012 22:50:46 -0500, Dan <dbdavidson yahoo.com> wrote:The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this? Thanks, Dan ---------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x; } void main() { B[] barr; barr ~= B(); } -----------------I have the same issue on linux x64 2.060 So appending to a dynamic array isn't really that efficient. But this goes WAY over that line. I'm timing your test now. It's still going... -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 23 2012
On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this?I have the same issue on linux x64 2.060 So appending to a dynamic array isn't really that efficient. But this goes WAY over that line. I'm timing your test now. It's still going...It appears it's all happening during copying init, why I am not sure. [code] struct B { enum SIZE = 1024 * 64; int[SIZE] x; } //same timing issue, no array involved void test(B b) {} void main() { test(B()); } [/code] I've changed the *64 to various numbers and gotten curious results. The mb is the noted (estimated) memory footprint it used during compiling. 1: 0m0.725s mb:?? 2: 0m1.414s mb:?? 4: 0m2.620s mb:28 8: 0m8.937s mb:30 16: 0m35.869s mb:34 32: 2m36.922s mb:42 64: 9m27.353s mb:56
Oct 24 2012
On Wednesday, 24 October 2012 at 09:50:38 UTC, Era Scarecrow wrote:On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:According to assembly dmd just generates repetitive instructions to zero memory instead of making it in a loop. A workaround is to initialize array to void and then zero it in a loop.The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this?I have the same issue on linux x64 2.060 So appending to a dynamic array isn't really that efficient. But this goes WAY over that line. I'm timing your test now. It's still going...It appears it's all happening during copying init, why I am not sure. [code] struct B { enum SIZE = 1024 * 64; int[SIZE] x; } //same timing issue, no array involved void test(B b) {} void main() { test(B()); } [/code] I've changed the *64 to various numbers and gotten curious results. The mb is the noted (estimated) memory footprint it used during compiling. 1: 0m0.725s mb:?? 2: 0m1.414s mb:?? 4: 0m2.620s mb:28 8: 0m8.937s mb:30 16: 0m35.869s mb:34 32: 2m36.922s mb:42 64: 9m27.353s mb:56
Oct 24 2012
On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this? Thanks, Dan ---------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x; } void main() { B[] barr; barr ~= B(); } -----------------The code DMD generates for initializing the struct does not use loops, so it's xor ecx, ecx mov [eax], ecx mov [eax+4], ecx mov [eax+8], ecx mov [eax+0Ch], ecx mov [eax+10h], ecx mov [eax+14h], ecx mov [eax+18h], ecx mov [eax+1Ch], ecx mov [eax+20h], ecx mov [eax+24h], ecx mov [eax+28h], ecx mov [eax+2Ch], ecx mov [eax+30h], ecx mov [eax+34h], ecx mov [eax+38h], ecx ... So your code creates a lot of work for the compiler.
Oct 24 2012
On 24/10/12 17:39, thedeemon wrote:On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:That's incredibly horrible, please add to bugzilla.The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this? Thanks, Dan ---------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x; } void main() { B[] barr; barr ~= B(); } -----------------The code DMD generates for initializing the struct does not use loops, so it's xor ecx, ecx mov [eax], ecx mov [eax+4], ecx mov [eax+8], ecx mov [eax+0Ch], ecx mov [eax+10h], ecx mov [eax+14h], ecx mov [eax+18h], ecx mov [eax+1Ch], ecx mov [eax+20h], ecx mov [eax+24h], ecx mov [eax+28h], ecx mov [eax+2Ch], ecx mov [eax+30h], ecx mov [eax+34h], ecx mov [eax+38h], ecx ... So your code creates a lot of work for the compiler.
Oct 24 2012
On Wed, Oct 24, 2012 at 06:04:10PM +0200, Don Clugston wrote:On 24/10/12 17:39, thedeemon wrote:[...] Yikes!! Why aren't we using memset (or equivalent) here?!On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:The following takes nearly three minutes to compile. The culprit is the line bar ~= B(); What is wrong with this? Thanks, Dan ---------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x; } void main() { B[] barr; barr ~= B(); } -----------------The code DMD generates for initializing the struct does not use loops, so it's xor ecx, ecx mov [eax], ecx mov [eax+4], ecx mov [eax+8], ecx mov [eax+0Ch], ecx mov [eax+10h], ecx mov [eax+14h], ecx mov [eax+18h], ecx mov [eax+1Ch], ecxThat's incredibly horrible, please add to bugzilla.[...] Yeah, no kidding! For comparison, GDC does a way better job in this department: $ time dmd test.d real 0m7.564s user 0m7.529s sys 0m0.029s $ time gdc test.d real 0m0.107s user 0m0.069s sys 0m0.036s $ This is with SIZE = 1024*16 (I didn't dare try it with 1024*64). That's a 75:1 ratio between dmd and gdc, which is pretty horrible, since dmd is usually significantly faster than gdc. Surprisingly, though, dmd still produces a smaller executable than gdc for this code! I'm guessing the optimizer cleans up that code afterwards? (Or maybe there are other factors at play here that I'm not aware of.) T -- EMACS = Extremely Massive And Cumbersome System
Oct 24 2012
On Wednesday, 24 October 2012 at 17:43:11 UTC, H. S. Teoh wrote:Surprisingly, though, dmd still produces a smaller executable than gdc for this code! I'm guessing the optimizer cleans up that code afterwards? (Or maybe there are other factors at play here that I'm not aware of.)Must be other factors. "Optimized" code (generated by dmd with -release -O) looks like mov DWORD PTR [edx], 0 mov DWORD PTR [edx+4], 0 mov DWORD PTR [edx+8], 0 ... so it should be even bigger and probably slower.
Oct 24 2012
On Wednesday, October 24, 2012 18:04:10 Don Clugston wrote:That's incredibly horrible, please add to bugzilla.There are at least a couple of potentially related bugs already: http://d.puremagic.com/issues/show_bug.cgi?id=8828 http://d.puremagic.com/issues/show_bug.cgi?id=8449 though this may still merit a new report. - Jonathan M Davis
Oct 24 2012
On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:The code DMD generates for initializing the struct does not use loops, so it's xor ecx, ecx mov [eax], ecx mov [eax+4], ecx mov [eax+8], ecx ... So your code creates a lot of work for the compiler.That seems silly. I would think after the struct's init/contents were known it would make a single block that holds the basic init for it and bulk copy every time it needed it (if it's beyond a certain size, say 32 bytes). Also memset only works if the data can be defaulted to 0. Hmmm...
Oct 24 2012
On Thu, Oct 25, 2012 at 12:05:25AM +0200, Era Scarecrow wrote:On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:Not true. Memset is *usually* used to set memory to 0, but it can set memory to other byte values too. Although this doesn't help if the .init value isn't something consisting of repeated byte values. In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init. T -- Life would be easier if I had the source code. -- YHLThe code DMD generates for initializing the struct does not use loops, so it's xor ecx, ecx mov [eax], ecx mov [eax+4], ecx mov [eax+8], ecx ... So your code creates a lot of work for the compiler.That seems silly. I would think after the struct's init/contents were known it would make a single block that holds the basic init for it and bulk copy every time it needed it (if it's beyond a certain size, say 32 bytes). Also memset only works if the data can be defaulted to 0. Hmmm...
Oct 24 2012
In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init.I think that the reasoning behind DMD's implementation is that for small structs, writing out the instructions is more efficient than a loop or a memcpy(); it's essentially the equivalent of loop unrolling and function inlining. However, that reasoning breaks down as soon as the struct's size goes beyond a certain value. In my opinion, though, this behavior should be kept for small structs. For example, if you have a struct that just wraps a size_t, just generating a move instruction is _way_ faster than a call to memcpy().
Oct 27 2012
On Saturday, 27 October 2012 at 23:07:19 UTC, BLM768 wrote:I just realized that this post is redundant; other posts have also mentioned optimization for small structs. That makes two relatively dumb posts from me in a day; maybe I should just stop for now. :)In any case, though, separately initializing every member of an array is silly. That's what a loop is for. That, or a memcpy from an immutable copy of .init.I think that the reasoning behind DMD's implementation is that for small structs, writing out the instructions is more efficient than a loop or a memcpy(); it's essentially the equivalent of loop unrolling and function inlining. However, that reasoning breaks down as soon as the struct's size goes beyond a certain value. In my opinion, though, this behavior should be kept for small structs. For example, if you have a struct that just wraps a size_t, just generating a move instruction is _way_ faster than a call to memcpy().
Oct 27 2012
Dan wrote:The following takes nearly three minutes to compile.... and this returns immediately: -------------------------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x= void; // !!! } void main() { B[] barr; barr ~= B(); } ------------------------------- - manfred
Oct 24 2012
On Wednesday, October 24, 2012 21:35:22 Manfred Nowak wrote:Dan wrote:That's incredibly dangerous though, because than B.init has garbage in it. Still, it's interesting that it would affect the speed. Maybe the compiler recognizes that it's garbage and so doesn't bother copying it (meaning that you'll get different garbage for every use of B.init). - Jonathan M DavisThe following takes nearly three minutes to compile.... and this returns immediately: -------------------------------- struct B { const size_t SIZE = 1024*64; int[SIZE] x= void; // !!! } void main() { B[] barr; barr ~= B(); } ------------------------------- - manfred
Oct 24 2012