www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.ldc - A little benchmark in D and C

reply "bearophile" <bearophileHUGS lycos.com> writes:
Through Reddit I've found a little benchmark. Here there is a D 
version, followed by a C99 version.

I compile with:
ldmd2 -O -release -inline -noboundscheck test1.d
gcc -O3 -std=c99 test2.c -o test2

I use gcc 4.8.0.

On an old PC the run-time of C version is about 5.2 seconds, 
while the D version compiled with the latest ldc2 runs in about 
26 seconds (almost 5 times slower).

Bye,
bearophile

---------------------------------

// D version.
import core.stdc.stdio;

enum int t = 20;

bool isEvenlyDivisible(in int i, in int a, in int b)
pure nothrow  safe {
     if (i > b)
         return true;
     else
         return (a % i == 0) && isEvenlyDivisible(i + 1, a, b);
}

void run() nothrow {
     int i = 10;
     while (!isEvenlyDivisible(2, i, t))
         i += 2;
     printf("%d\n", i);
}

void main() {
     for (int i = 0; i < 5; i++)
       run;
}

---------------------------------

// C version.
#include <stdio.h>
#include <stdbool.h>

const int t = 20;

bool isEvenlyDivisible(const int i, const int a, const int b) {
     if (i > b)
         return true;
     else
         return (a % i == 0) && isEvenlyDivisible(i + 1, a, b);
}

void run() {
     int i = 10;
     while (!isEvenlyDivisible(2, i, t))
         i += 2;
     printf("%d\n", i);
}

int main() {
     for (int i = 0; i < 5; i++)
       run();
     return 0;
}

---------------------------------
Jul 04 2014
next sibling parent "Jyxent" <jyxent example.com> writes:
On Friday, 4 July 2014 at 15:40:26 UTC, bearophile wrote:
 I compile with:
 ldmd2 -O -release -inline -noboundscheck test1.d
 gcc -O3 -std=c99 test2.c -o test2

 I use gcc 4.8.0.

 On an old PC the run-time of C version is about 5.2 seconds, 
 while the D version compiled with the latest ldc2 runs in about 
 26 seconds (almost 5 times slower).
I repeated your test with ldc2, clang, and gcc. The D and C programs both run in about 16 seconds when compiled with ldc2 and clang on my machine. The gcc compiled C program ran in about 1.5 seconds. So probably something LLVM specific?
Jul 04 2014
prev sibling parent reply "David Nadlinger" <code klickverbot.at> writes:
On Friday, 4 July 2014 at 15:40:26 UTC, bearophile wrote:
 Through Reddit I've found a little benchmark. Here there is a D 
 version, followed by a C99 version.

 I compile with:
 ldmd2 -O -release -inline -noboundscheck test1.d
 gcc -O3 -std=c99 test2.c -o test2

 […]
A quick test suggests that this seems to be a matter of GCC's optimizer picking up on something that LLVM's doesn't, as Clang 3.4.2 produces an executable just as slow/fast than that by LDC built against LLVM 3.4.2. Would be interesting to look at the asm to see what is going on here. David
Jul 04 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
David Nadlinger:

 A quick test suggests that this seems to be a matter of GCC's 
 optimizer picking up on something that LLVM's doesn't, as Clang 
 3.4.2 produces an executable just as slow/fast than that by LDC 
 built against LLVM 3.4.2.
So it looks like a ER/bug report for LLVM instead of LDC :-) Bye, bearophile
Jul 04 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
 David Nadlinger:

 A quick test suggests that this seems to be a matter of GCC's 
 optimizer picking up on something that LLVM's doesn't, as 
 Clang 3.4.2 produces an executable just as slow/fast than that 
 by LDC built against LLVM 3.4.2.
Discussed a bit in the llvm irc channel and then I have reported it: http://llvm.org/bugs/show_bug.cgi?id=20205 Bye, bearophile
Jul 04 2014
prev sibling parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Friday, 4 July 2014 at 15:55:50 UTC, David Nadlinger wrote:
 Would be interesting to look at the asm to see what is going on 
 here.
I reversed the gcc assembly output, gcc compiles the run() function as follows: void run() { unsigned i = 10; for (;; i += 2) { // movl $0x55555555, %ebp // magic contant for division via multiplication // mull %ebp // edx = mulhi (magic, i) // shrl %edx // edx >>= 1 (part of magic division) // leal (%rdx,%rdx,2), %eax // eax = 3*rdx // if (i - eax != 0) // i - 3 * rdx = remainder // continue if ((i & 3) != 0) // above comments show how this line is implemented continue; // see ch.10 section 3 of hacker's delight 2nd ed. if (i % 5 != 0) continue; if (i % 6 != 0) continue; if (i % 7 != 0) continue; if (isEvenlyDivisible(8, i, t)) break; } printf("%d\n", i); } Without the division by 3 optimization the code runs slower with that check in place. Basically there's some epic wins in unrolling that function.
Jul 06 2014
parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Sunday, 6 July 2014 at 08:30:03 UTC, safety0ff wrote:
         if ((i & 3) != 0) // above comments show how this line 
 is implemented
             continue;     // see ch.10 section 3 of hacker's 
 delight 2nd ed.
         if (i % 5 != 0)
             continue;
Oops, this should be: if (i % 3 != 0) // above comments show how this line is implemented continue; // see ch.10 section 3 of hacker's delight 2nd ed. if ((i & 3) != 0) continue; if (i % 5 != 0) continue; Also, I switched the signed integers to unsigned to help simplify the assembly.
Jul 06 2014