
digitalmars.D - Re: Basic benchmark

bearophile <bearophileHUGS lycos.com> writes:

Tomas Lindquist Olsen:
 ...
 $ dmd bench.d -O -release -inline
 long arith:  55630 ms
 nested loop:  5090 ms
 
 $ ldc bench.d -O3 -release -inline
 long arith:  13870 ms
 nested loop:   120 ms
 
 $ gcc bench.c -O3 -s -fomit-frame-pointer
 long arith: 13600 ms
 nested loop:  170 ms
...

Very nice results. If you have a little more time, I have another small C and D benchmark to offer you, to be tested with GCC and LDC. It consists of the C version of the "nbody" benchmark from the Shootout, a very close translation of it to D (file name "nbody_d1.d"), and my faster D version (file name "nbody_d2.d"; it is faster when compiled with DMD, of course). I haven't tried LDC yet, so I can't be sure what the timings will show.

Thank you for your work,
bearophile
Dec 14 2008
The Anh Tran <trtheanh gmail.com> writes:
IMHO, spectralnorm is 'a little bit' better than nbody. :)
Dec 14 2008
bearophile <bearophileHUGS lycos.com> writes:
Lindquist and another kind person on IRC have sent me their timings for the
D and C versions of the 'nbody' code that I attached to my previous post
(they tested only the first D version).

Timings, N=20_000_000, on an Athlon 64 X2 3800+ CPU:
  gcc: 10.8  s
  ldc: 14.2  s
  dmd: 15.5  s
  gdc:

------------

Timings, N=10_000_000, on an AMD 2500+ CPU:
  gcc:  8.78 s
  ldc: 12.26 s
  dmd: 13.9  s
  gdc:  9.82 s

Compiler arguments used on the AMD 2500+ CPU:
  GCC: -O3 -s -fomit-frame-pointer
  DMD: -release -O
  GDC: -O3 -s -fomit-frame-pointer
  LDC: -ofmain -O3 -release -inline

This time the results seem good enough to me.
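
To put the timings above on a common scale, each compiler's time can be divided by the GCC time measured on the same machine. A minimal C sketch of that arithmetic follows. The names `timing`, `amd2500`, `slowdown` and `print_slowdowns` are made up for illustration, and the numbers are copied from the AMD 2500+ table above:

```c
#include <assert.h>
#include <stdio.h>

/* One row of a timing table: compiler name and wall-clock seconds. */
struct timing {
    const char *compiler;
    double seconds;
};

/* Timings from the post, N=10_000_000 on the AMD 2500+ CPU.
   The first entry (gcc) is used as the baseline. */
const struct timing amd2500[] = {
    {"gcc", 8.78}, {"ldc", 12.26}, {"dmd", 13.9}, {"gdc", 9.82},
};

/* Slowdown relative to a baseline: 1.0 means equal speed,
   larger values mean slower than the baseline. */
double slowdown(double seconds, double baseline_seconds)
{
    assert(baseline_seconds > 0.0);
    return seconds / baseline_seconds;
}

/* Print every entry of a table relative to its first entry. */
void print_slowdowns(const struct timing *t, int n)
{
    for (int i = 0; i < n; i++)
        printf("%-4s %.2fx\n", t[i].compiler,
               slowdown(t[i].seconds, t[0].seconds));
}
```

Calling `print_slowdowns(amd2500, 4)` prints one line per compiler, with GCC as the 1.00x baseline.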

(This benchmark is about FP computations; the fastest language for this
naive physics simulation is Fortran90, as can be seen in the later pages of the
Shootout.)
(I'd like to test one last benchmark, the 'recursive' one, but that's for later.)

Bye,
bearophile
Dec 14 2008
bearophile <bearophileHUGS lycos.com> writes:

(The other kind person on IRC was wilsonk.)
Here are the timing results for the nbody benchmark (the code is attached to
one of my recent posts) as measured by wilsonk on IRC, N=10_000_000, on an
AMD 2500+ CPU:
  64-bit GCC C code: 3.31 s
  64-bit LDC D code: 5.74 s
  
You can see that the ratio is very similar to the 32-bit one (but the absolute
timings are much lower).

------------------------

Next, the timings for the recursive4 benchmark (the code is attached to this
post).

Timings by wilsonk, recursive4 benchmark, 64-bit, on an AMD 2500+ CPU:
  C code GCC, N=13: 22.93 s
  D code LDC, N=13: 28.88 s


Timings by Elrood, recursive4 benchmark, on 32-bit WinXP with an AMD X2 3600 CPU:
  C code GCC, N=13: ~25 s
  D code LDC, N=13: >60 s

For this benchmark LLVM still seems to need some improvement :-)
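
The recursive4 source is not reproduced in this thread, so the following is only a rough C sketch of the kind of code such a benchmark stresses. It assumes recursive4 resembles the Shootout 'recursive' benchmark, which is built on deeply recursive Ackermann, Fibonacci and Tak functions. The function names are illustrative and are not taken from the attached file:

```c
/* Ackermann function: tiny arguments already produce very deep recursion. */
int ack(int m, int n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ack(m - 1, 1);
    return ack(m - 1, ack(m, n - 1));
}

/* Fibonacci in the Shootout convention, where fib(0) = fib(1) = 1. */
int fib(int n)
{
    return n < 2 ? 1 : fib(n - 2) + fib(n - 1);
}

/* Tak (Takeuchi) function, in the variant that returns z. */
int tak(int x, int y, int z)
{
    if (y < x)
        return tak(tak(x - 1, y, z), tak(y - 1, z, x), tak(z - 1, x, y));
    return z;
}
```

Functions like these do almost no arithmetic per call, so the running time mostly reflects how cheaply the compiler handles calls and returns, and whether it inlines or otherwise transforms the recursion.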

Bye,
bearophile
Dec 14 2008