digitalmars.D.ldc - A little benchmark in D and C
- bearophile (55/55) Jul 04 2014 Through Reddit I've found a little benchmark. Here there is a D
- Jyxent (5/12) Jul 04 2014 I repeated your test with ldc2, clang, and gcc. The D and C
- David Nadlinger (8/14) Jul 04 2014 A quick test suggests that this seems to be a matter of GCC's
- bearophile (4/8) Jul 04 2014 So it looks like a ER/bug report for LLVM instead of LDC :-)
- bearophile (5/10) Jul 04 2014 Discussed a bit in the llvm irc channel and then I have reported
- safety0ff (32/34) Jul 06 2014 I reversed the gcc assembly output, gcc compiles the run()
- safety0ff (12/18) Jul 06 2014 Oops, this should be:
Through Reddit I've found a little benchmark. Here there is a D version, followed by a C99 version. I compile with: ldmd2 -O -release -inline -noboundscheck test1.d gcc -O3 -std=c99 test2.c -o test2 I use gcc 4.8.0. On an old PC the run-time of C version is about 5.2 seconds, while the D version compiled with the latest ldc2 runs in about 26 seconds (almost 5 times slower). Bye, bearophile --------------------------------- // D version. import core.stdc.stdio; enum int t = 20; bool isEvenlyDivisible(in int i, in int a, in int b) pure nothrow safe { if (i > b) return true; else return (a % i == 0) && isEvenlyDivisible(i + 1, a, b); } void run() nothrow { int i = 10; while (!isEvenlyDivisible(2, i, t)) i += 2; printf("%d\n", i); } void main() { for (int i = 0; i < 5; i++) run; } --------------------------------- // C version. #include <stdio.h> #include <stdbool.h> const int t = 20; bool isEvenlyDivisible(const int i, const int a, const int b) { if (i > b) return true; else return (a % i == 0) && isEvenlyDivisible(i + 1, a, b); } void run() { int i = 10; while (!isEvenlyDivisible(2, i, t)) i += 2; printf("%d\n", i); } int main() { for (int i = 0; i < 5; i++) run(); return 0; } ---------------------------------
Jul 04 2014
On Friday, 4 July 2014 at 15:40:26 UTC, bearophile wrote:I compile with: ldmd2 -O -release -inline -noboundscheck test1.d gcc -O3 -std=c99 test2.c -o test2 I use gcc 4.8.0. On an old PC the run-time of C version is about 5.2 seconds, while the D version compiled with the latest ldc2 runs in about 26 seconds (almost 5 times slower).I repeated your test with ldc2, clang, and gcc. The D and C programs both run in about 16 seconds when compiled with ldc2 and clang on my machine. The gcc compiled C program ran in about 1.5 seconds. So probably something LLVM specific?
Jul 04 2014
On Friday, 4 July 2014 at 15:40:26 UTC, bearophile wrote:Through Reddit I've found a little benchmark. Here there is a D version, followed by a C99 version. I compile with: ldmd2 -O -release -inline -noboundscheck test1.d gcc -O3 -std=c99 test2.c -o test2 […]A quick test suggests that this seems to be a matter of GCC's optimizer picking up on something that LLVM's doesn't, as Clang 3.4.2 produces an executable just as slow/fast than that by LDC built against LLVM 3.4.2. Would be interesting to look at the asm to see what is going on here. David
Jul 04 2014
David Nadlinger:A quick test suggests that this seems to be a matter of GCC's optimizer picking up on something that LLVM's doesn't, as Clang 3.4.2 produces an executable just as slow/fast than that by LDC built against LLVM 3.4.2.So it looks like a ER/bug report for LLVM instead of LDC :-) Bye, bearophile
Jul 04 2014
David Nadlinger:Discussed a bit in the llvm irc channel and then I have reported it: http://llvm.org/bugs/show_bug.cgi?id=20205 Bye, bearophileA quick test suggests that this seems to be a matter of GCC's optimizer picking up on something that LLVM's doesn't, as Clang 3.4.2 produces an executable just as slow/fast than that by LDC built against LLVM 3.4.2.
Jul 04 2014
On Friday, 4 July 2014 at 15:55:50 UTC, David Nadlinger wrote:Would be interesting to look at the asm to see what is going on here.I reversed the gcc assembly output, gcc compiles the run() function as follows: void run() { unsigned i = 10; for (;; i += 2) { // movl $0x55555555, %ebp // magic contant for division via multiplication // mull %ebp // edx = mulhi (magic, i) // shrl %edx // edx >>= 1 (part of magic division) // leal (%rdx,%rdx,2), %eax // eax = 3*rdx // if (i - eax != 0) // i - 3 * rdx = remainder // continue if ((i & 3) != 0) // above comments show how this line is implemented continue; // see ch.10 section 3 of hacker's delight 2nd ed. if (i % 5 != 0) continue; if (i % 6 != 0) continue; if (i % 7 != 0) continue; if (isEvenlyDivisible(8, i, t)) break; } printf("%d\n", i); } Without the division by 3 optimization the code runs slower with that check in place. Basically there's some epic wins in unrolling that function.
Jul 06 2014
On Sunday, 6 July 2014 at 08:30:03 UTC, safety0ff wrote:if ((i & 3) != 0) // above comments show how this line is implemented continue; // see ch.10 section 3 of hacker's delight 2nd ed. if (i % 5 != 0) continue;Oops, this should be: if (i % 3 != 0) // above comments show how this line is implemented continue; // see ch.10 section 3 of hacker's delight 2nd ed. if ((i & 3) != 0) continue; if (i % 5 != 0) continue; Also, I switched the signed integers to unsigned to help simplify the assembly.
Jul 06 2014