digitalmars.D.learn - D code optimization
- Sandu (48/48) Sep 22 2016 It is often being claimed that D is at least as fast as C++.
- Lodovico Giaretta (18/23) Sep 22 2016 Benchmarking C++ vs D is less trivial than it looks, for various
- H. S. Teoh via Digitalmars-d-learn (11/17) Sep 22 2016 [...]
- Brad Anderson (10/16) Sep 22 2016 Just a small tip that applies to both D and C++ in that code. You
- thedeemon (6/11) Sep 22 2016 If you care about speed, better uncomment that `delete`. Without
- Jonathan Marler (3/8) Sep 22 2016 Can you include the C++ source code, the C++ compiler command
- Guillaume Piolat (109/109) Sep 22 2016 Hi,
It is often being claimed that D is at least as fast as C++. Now, I am fairly new to D. But, here is an example where I want to see how can this be made possible. So far my C++ code compiles in ~850 ms. While my D code runs in about 2.1 seconds.

The code translated to D looks as follows (can't see any attach button here):

    import std.stdio, std.math;
    import std.datetime;

    int main() {
        StopWatch sw;
        sw.start();

        double C=0.0;

        for (int k=0;k<10000;++k) { // iterate 10000x
            double S0 = 100.0;
            double r = 0.03;
            double alpha = 0.07;
            double sigma = 0.2;
            double T = 1.0;
            double strike = 100.0;
            double S = 0.0;

            const int n = 252;

            double dt = T / n;
            double R = exp(r*dt);
            double u = exp(alpha*dt + sigma*sqrt(dt));
            double d = exp(alpha*dt - sigma*sqrt(dt));

            double qU = (R - d) / (R*(u - d));
            double qD = (1 - R*qU) / R;

            //double* call = new double [n + 1];
            double[] call = new double[n+1];

            for (int i = 0; i <= n; ++i)
                call[i] = fmax(S0*pow(u, n-i)*pow(d, i)-strike, 0.0);

            for (int i = n-1; i >= 0 ; --i) {
                for (int j = 0; j <= i; ++j) {
                    call[j] = qU * call[j] + qD * call[j+1];
                }
            }

            C = call[0];
            //delete call;
            // since D has a garbage collector, explicit deallocation of arrays is not necessary.
            // nevertheless we do this
        }

        long exec_ms = sw.peek().msecs;
        writeln("Option value: ", C, " / execution time: ", exec_ms, " ms\n" );

        return 0;
    }
Sep 22 2016
On Thursday, 22 September 2016 at 16:09:49 UTC, Sandu wrote:
> It is often being claimed that D is at least as fast as C++. Now, I am fairly new to D. But, here is an example where I want to see how can this be made possible. So far my C++ code compiles in ~850 ms.

I assume you meant that it runs in that time.

> While my D code runs in about 2.1 seconds.

Benchmarking C++ vs D is less trivial than it looks, for various reasons:

- compiler optimizations:
  - which compilers (both C++ and D) are you using? Are you aware of the differences in code optimization between DMD, GDC and LDC?
  - which flags are you passing to your C++ and D compilers?
  - your code is actually testing the compiler's ability to do loop unrolling, constant folding and operation hoisting
- code semantics: when C++ and D code look similar, they usually produce the same results, but they often behave very differently internally:
  - in the posted code you allocate a lot of managed memory, putting a big burden on the garbage collector, which you don't do in C++, because there you talk directly to the C runtime

So it's difficult to extract useful data from this kind of benchmark.
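To make the hoisting and constant-folding point concrete: every quantity computed at the top of the loop body in the posted code is independent of `k`, so an optimizer is free to compute it once instead of 10000 times. A minimal sketch of doing that by hand follows; the names `Params` and `makeParams` are hypothetical and introduced only for this illustration.

```d
import std.math : exp, sqrt;
import std.stdio : writeln;

/// Hypothetical container for the loop-invariant quantities of the posted code.
struct Params
{
    double u, d, qU, qD;
}

/// Hypothetical helper: computes the invariants once from the model inputs.
Params makeParams(double r, double alpha, double sigma, double T, int n)
{
    immutable dt = T / n;
    immutable R = exp(r * dt);
    immutable u = exp(alpha * dt + sigma * sqrt(dt));
    immutable d = exp(alpha * dt - sigma * sqrt(dt));
    immutable qU = (R - d) / (R * (u - d));
    immutable qD = (1 - R * qU) / R;
    return Params(u, d, qU, qD);
}

void main()
{
    // Computed a single time, outside any benchmark loop.
    immutable p = makeParams(0.03, 0.07, 0.2, 1.0, 252);
    writeln("qU = ", p.qU, ", qD = ", p.qD);
}
```

If a compiler performs this transformation on one version of the benchmark but not the other, the timings compare optimizers rather than languages.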
Sep 22 2016
On Thu, Sep 22, 2016 at 04:09:49PM +0000, Sandu via Digitalmars-d-learn wrote:
> It is often being claimed that D is at least as fast as C++. Now, I am fairly new to D. But, here is an example where I want to see how can this be made possible. So far my C++ code compiles in ~850 ms. While my D code runs in about 2.1 seconds.
[...]

Which compiler are you using? If you're looking for performance, you should use gdc or ldc, as they have better optimizers. While dmd is the most up-to-date in terms of language implementation, I've found that the code it generates consistently performs about 20-30% slower than code generated by gdc (sometimes even more, depending on what the program does).

T

--
Live for a century, learn for a century, and you'll still die a fool.
Sep 22 2016
On Thursday, 22 September 2016 at 16:09:49 UTC, Sandu wrote:
> It is often being claimed that D is at least as fast as C++. Now, I am fairly new to D. But, here is an example where I want to see how can this be made possible. So far my C++ code compiles in ~850 ms. While my D code runs in about 2.1 seconds. [snip]

Just a small tip that applies to both D and C++ in that code. You can use a static array rather than a dynamically allocated array in the loop (enum n = 252; then double[n+1] call; in D). You can also use "double[n+1] call = void;" to mimic C++'s behavior of uninitialized memory.

Use GDC or LDC when doing performance-related work, as they typically generate faster code. I'd be surprised if the C++ and D asm weren't nearly identical for a big chunk of this code when using GCC/GDC or Clang/LDC.
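The static-array tip could look roughly like the sketch below. The function `rollBack` and its `payoff` parameter are hypothetical simplifications (the real code fills `call` with the option payoffs); only `enum n` and `double[n + 1] call = void;` come from the tip itself.

```d
import std.stdio : writeln;

enum n = 252; // compile-time constant, so it can be used as an array length

/// Hypothetical helper: backward induction over a stack-allocated buffer.
double rollBack(double qU, double qD, double payoff)
{
    // Static array on the stack; `= void` skips the default initialization
    // (double elements would otherwise be set to NaN).
    double[n + 1] call = void;

    // With `= void`, every element must be written before it is read.
    foreach (i; 0 .. n + 1)
        call[i] = payoff;

    // Same double loop as in the original post: i runs from n-1 down to 0.
    foreach_reverse (i; 0 .. n)
        foreach (j; 0 .. i + 1)
            call[j] = qU * call[j] + qD * call[j + 1];

    return call[0];
}

void main()
{
    writeln(rollBack(0.5, 0.5, 1.0));
}
```

Because the buffer lives on the stack, no GC or C runtime allocation happens inside the benchmark loop at all.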
Sep 22 2016
On Thursday, 22 September 2016 at 16:09:49 UTC, Sandu wrote:
> const int n = 252;
> double[] call = new double[n+1];
> ...
> //delete call; // since D has a garbage collector, explicit deallocation of arrays is not necessary.

If you care about speed, better uncomment that `delete`. Without delete, allocating this array 10000 times will trigger the GC multiple times for no good reason. With delete, the same memory will be reused and no GC will be triggered, so the run time should be much better.
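Note that `delete` was later deprecated in D, so the following sketch swaps in a different technique that aims at the same effect (one block of memory reused, no repeated GC allocations): allocate the buffer once outside the loop. The function `runReusing` is a hypothetical stand-in for the benchmark loop, not code from the thread.

```d
import std.stdio : writeln;

/// Hypothetical helper: runs `rounds` iterations while reusing one buffer.
double runReusing(int rounds, size_t len)
{
    // One GC allocation for the whole run instead of one per iteration.
    auto buf = new double[len];
    double last = 0.0;

    foreach (k; 0 .. rounds)
    {
        buf[] = cast(double) k; // refill the same memory each round
        last = buf[$ - 1];
    }
    return last;
}

void main()
{
    writeln(runReusing(10_000, 253));
}
```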
Sep 22 2016
On Thursday, 22 September 2016 at 16:09:49 UTC, Sandu wrote:
> It is often being claimed that D is at least as fast as C++. Now, I am fairly new to D. But, here is an example where I want to see how can this be made possible. So far my C++ code compiles in ~850 ms. While my D code runs in about 2.1 seconds.

Can you include the C++ source code, the C++ compiler command line, and the D compiler command line?
Sep 22 2016
Hi,

Interesting question, so I took your examples and made them do the same thing with regard to allocation (using malloc instead of new in both languages). I removed the stopwatch and used "time" instead. Now the programs should do the very same thing. Will they be as fast too?

D code:

------------------------ bench.d
import std.stdio, std.math;
import core.stdc.stdlib;
import core.stdc.stdio;

int main() {
    double C=0.0;

    for (int k=0;k<10000;++k) { // iterate 10000x
        double S0 = 100.0;
        double r = 0.03;
        double alpha = 0.07;
        double sigma = 0.2;
        double T = 1.0;
        double strike = 100.0;
        double S = 0.0;

        const int n = 252;

        double dt = T / n;
        double R = exp(r*dt);
        double u = exp(alpha*dt + sigma*sqrt(dt));
        double d = exp(alpha*dt - sigma*sqrt(dt));

        double qU = (R - d) / (R*(u - d));
        double qD = (1 - R*qU) / R;

        double* call = cast(double*)malloc(double.sizeof * (n+1));

        for (int i = 0; i <= n; ++i)
            call[i] = fmax(S0*pow(u, n-i)*pow(d, i)-strike, 0.0);

        for (int i = n-1; i >= 0 ; --i) {
            for (int j = 0; j <= i; ++j) {
                call[j] = qU * call[j] + qD * call[j+1];
            }
        }

        C = call[0];
    }
    printf("%f\n", C);

    return 0;
}
------------------------

C++ code:

------------------------ bench.cpp
#include <cmath>
#include <cstdlib>
#include <cstdio>

int main() {
    double C=0.0;

    for (int k=0;k<10000;++k) { // iterate 10000x
        double S0 = 100.0;
        double r = 0.03;
        double alpha = 0.07;
        double sigma = 0.2;
        double T = 1.0;
        double strike = 100.0;
        double S = 0.0;

        const int n = 252;

        double dt = T / n;
        double R = exp(r*dt);
        double u = exp(alpha*dt + sigma*sqrt(dt));
        double d = exp(alpha*dt - sigma*sqrt(dt));

        double qU = (R - d) / (R*(u - d));
        double qD = (1 - R*qU) / R;

        double* call = (double*)malloc(sizeof(double) * (n+1));

        for (int i = 0; i <= n; ++i)
            call[i] = fmax(S0*pow(u, n-i)*pow(d, i)-strike, 0.0);

        for (int i = n-1; i >= 0 ; --i) {
            for (int j = 0; j <= i; ++j) {
                call[j] = qU * call[j] + qD * call[j+1];
            }
        }

        C = call[0];
    }
    printf("%f\n", C);

    return 0;
}
------------------------

Here is the bench script:
------------------------ bench.sh
ldc2 -O2 bench.d
clang++ -O2 bench.cpp -o bench-cpp;
time ./bench
time ./bench-cpp
time ./bench
time ./bench-cpp
time ./bench
time ./bench-cpp
time ./bench
time ./bench-cpp
------------------------

Note that I use clang-703.0.31, which comes with Xcode 7.3 and is based on LLVM 3.8.0 from what I can gather. I am using ldc 1.0.0-b2, which is at LLVM 3.8.0 too! Maybe the backend is out of the equation.

The results at -O2 (minimum of 4 samples):

// C++
real 0m0.484s
user 0m0.466s
sys  0m0.011s

// D
real 0m0.390s
user 0m0.373s
sys  0m0.012s

Why is the D code 1.25x as fast as the C++ code if they do the same thing? Well, I don't know; I've not analyzed further.
Sep 22 2016