digitalmars.D.bugs - [Issue 6658] New: Slow short array equality
- d-bugmail puremagic.com (49/49) Sep 12 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (49/49) Mar 31 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (13/14) Apr 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (9/16) Apr 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (11/11) Apr 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (12/18) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (11/13) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (8/8) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (13/13) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (15/36) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (10/10) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (9/21) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
- d-bugmail puremagic.com (13/13) Apr 02 2013 http://d.puremagic.com/issues/show_bug.cgi?id=6658
http://d.puremagic.com/issues/show_bug.cgi?id=6658 Summary: Slow short array equality Product: D Version: D2 Platform: x86 OS/Version: Windows Status: NEW Keywords: performance Severity: enhancement Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: bearophile_hugs eml.cc Equality comparisons between very short arrays is quite slow in D/DMD. When the arrays are both fixed-sized and short, I suggest to inline item comparisons instead of calling a runtime function, to speed up the code significantly. A benchmark: bool eq1(T, size_t N)(const ref T[N] a, const ref T[N] b) pure nothrow { return a == b; } bool eq2(T, size_t N)(const ref T[N] a, const ref T[N] b) pure nothrow { foreach (size_t i; 0 .. a.length) if (a[i] != b[i]) return false; return true; } void main() { import std.stdio, std.random; enum size_t N = 3; enum size_t M = 100; rndGen.seed(1); int[N][M] data; foreach (ref d; data) foreach (ref x; d) x = uniform(0, 4); int total; foreach (i; 0 .. 20_000) foreach (j; 0 .. M) foreach (k; 0 .. M) static if (1) total += eq1(data[j], data[k]); // 11.5 seconds else total += eq2(data[j], data[k]); // 2.45 seconds writeln(total); } -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 12 2011
http://d.puremagic.com/issues/show_bug.cgi?id=6658 After closing Issue 9477 the situation is changed a little. On the same PC with DMD 2.063alpha the run-time with eq2() is 2.04 seconds and the run-time with eq1() is 6.37 seconds (3.12X). The asm produced by dmd (-O -release -inline -noboundscheck) for the two versions: eq1: push EAX mov ECX, 0Ch push ESI mov ESI, 0Ch[ESP] push EDI mov EDI, EAX xor EAX, EAX rep cmpsb je L19 sbb EAX, EAX sbb EAX, 0FFFFFFFFh L19: neg EAX sbb EAX, EAX inc EAX pop EDI pop ESI pop ECX ret 4 eq2: push EBX mov EDX, EAX mov ECX, 8[ESP] xor EBX, EBX L9: mov EAX, [EBX*4][ECX] cmp EAX, [EBX*4][EDX] je L17 pop EBX xor EAX, EAX ret 4 L17: inc EBX cmp EBX, 3 jb L9 pop EBX mov EAX, 1 ret 4 Is cmpsb efficient on modern CPUs? Maybe the answer is negative: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052 -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 31 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 Vladimir Panteleev <thecybershadow gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |thecybershadow gmail.com 00:56:15 EEST --- Can we close this issue as a duplicate of issue 5282?Is cmpsb efficient on modern CPUs?As the linked discussion mentions, it's not as efficient for short, fixed-size data blocks. Backends other than DMD might take advantage of a known array length, but DMD does not. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658Can we close this issue as a duplicate of issue 5282?Do you know why DMD can't be smarter here? The compiler knows the types and lengths, so it knows when one of such situations happens. At worst calling a template function defined in the object module is able to solve the situation. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------Is cmpsb efficient on modern CPUs?As the linked discussion mentions, it's not as efficient for short, fixed-size data blocks. Backends other than DMD might take advantage of a known array length, but DMD does not.
Apr 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 02:10:35 EEST --- It can, it's just that no one has implemented such an optimization yet. Keep in mind that the patch would be against DMDBE (assuming the strategy would be to optimize the memcmp intrinsic), so it would only improve the non-free, non-redistributable part of one D compiler. D users who want their programs to be fast should use GDC or LDC. In contrast, the patch to issue 9477 was in the frontend, so it benefits all of GDC, LDC and DMD. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 Iain Buclaw <ibuclaw ubuntu.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |ibuclaw ubuntu.comIt can, it's just that no one has implemented such an optimization yet. Keep in mind that the patch would be against DMDBE (assuming the strategy would be to optimize the memcmp intrinsic), so it would only improve the non-free, non-redistributable part of one D compiler. D users who want their programs to be fast should use GDC or LDC. In contrast, the patch to issue 9477 was in the frontend, so it benefits all of GDC, LDC and DMD.I'd just like to say that statement is false. It wasn't a patch in the frontend, and no one but DMD benefits. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 10:35:39 EEST ---I'd just like to say that statement is false. It wasn't a patch in the frontend, and no one but DMD benefits.Really?? We're talking about this, right? https://github.com/D-Programming-Language/dmd/commit/c984175cf25dfa17b3956e8e33ff83547fa56b0a I thought GDC/LDC simply provided their own implementation of el_* and the elem type. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 10:42:00 EEST --- I don't understand why GDC and LDC did not use the existing code in e2ir. This would imply that every implementation would have to reimplement exactly the same things as DMD (emit calls to Druntime, for example). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 10:55:41 EEST --- DMD: https://github.com/D-Programming-Language/dmd/blob/v2.062/src/e2ir.c#L2431 GDC: https://github.com/D-Programming-GDC/GDC/blob/master/gcc/d/d-elem.cc#L100 LDC: https://github.com/ldc-developers/ldc/blob/master/gen/arrays.cpp#L812 It's all reimplementations of the same thing. I guess there was a reason for why the layer of abstraction was chosen to be this high, but in this situation it's not doing anyone any favors. Well, I guess that now that DMD is faster than both GDC and LDC at something, the burden is on those compilers' maintainers to catch up with it :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658DMD: https://github.com/D-Programming-Language/dmd/blob/v2.062/src/e2ir.c#L2431 GDC: https://github.com/D-Programming-GDC/GDC/blob/master/gcc/d/d-elem.cc#L100 LDC: https://github.com/ldc-developers/ldc/blob/master/gen/arrays.cpp#L812 It's all reimplementations of the same thing. I guess there was a reason for why the layer of abstraction was chosen to be this high, but in this situation it's not doing anyone any favors.When gdc was first released, it provided it's own implementation of dt* and the dt_t type. This is now what I widely regard of as a bad move. The DMD implementation of toobj.c; todt.c; typinf.c; has been getting further from GDC's copy of that part of the frontend, thus has become a pain to maintain upon each frontend update. Frankly I can never see it working trying to hack and wrap dmd backend calls into producing something that another backend see's as correct. So I'll be yanking those three files out of dfrontend/ sometime around this fall too. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------I'd just like to say that statement is false. It wasn't a patch in the frontend, and no one but DMD benefits.Really?? We're talking about this, right? https://github.com/D-Programming-Language/dmd/commit/c984175cf25dfa17b3956e8e33ff83547fa56b0a I thought GDC/LDC simply provided their own implementation of el_* and the elem type.
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 12:50:44 EEST --- I don't know what dt_t is, but judging from all 3 of its mentions in the 5381-line e2ir.c, I can only suppose that it is a leaky abstraction poking out of the DMD backend. Anyway, I don't see how this applies to e2ir.c, since, as I've mentioned, dt_t only occurs 3 times in the file. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 ---I don't know what dt_t is, but judging from all 3 of its mentions in the 5381-line e2ir.c, I can only suppose that it is a leaky abstraction poking out of the DMD backend. Anyway, I don't see how this applies to e2ir.c, since, as I've mentioned, dt_t only occurs 3 times in the file.I don't know what dt_t is, but judging from all 3 of its mentions in the 5381-line e2ir.c, I can only suppose that it is a leaky abstraction poking out of the DMD backend. Anyway, I don't see how this applies to e2ir.c, since, as I've mentioned, dt_t only occurs 3 times in the file.In brief, why define all these dmd backend symbols (OPcall) that are of no use to gcc, and when you can just build gcc trees directly (CALL_EXPR)? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013
http://d.puremagic.com/issues/show_bug.cgi?id=6658 21:57:32 EEST --- I don't understand the argument. Because... choosing a lower (and simpler) layer of abstraction would mean that less code would need to be reimplemented? As I understand it, every time DMD implements a new expression type (for example, dot multiply), GDC and LDC would need to be updated. However, DMD already has a glue layer that lowers all of the D-specific expressions to something closer to abstract machine code, which I'd think is what alternative backends would be more interested in. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 02 2013