c++ - floating point performance
I'm writing a very numerically intensive application that mainly involves integration. Using the trapezoid method (source at the end of the message), I got widely different execution times with different compilers. Times are in seconds, and the OS is Win2k, except for gcc running on Linux, where specified:

    gcc-2.95.2 Debian GNU/Linux    =>  81
    bcc 5.5.1                      => 176
    gcc-2.95.3 (MinGW 1.0)         => 255
    gcc-2.95.3 (Cygwin)            => 119
    sc 8.1d (Win32)                => 316
    sc 8.1d (X32)                  => 383
    lcc-win32                      => 326

I'm using a 1.1GHz Athlon with 256 MB RAM. It seems that DigitalMars is not the best choice for numerical applications, or maybe I just hit a particular case in which DM behaves poorly? The flags used when compiling are "-o+all -6 -ff" (-mn -WA for Win32 and -mx for the DOS extended version, of course).

It's strange to see such big differences between different flavors of gcc. Maybe performance is mainly affected by the run-time libraries, which are more or less optimized? MinGW is using MSVCRT, so... ;)

Another thing: why the difference between the Win32 and X32 versions of the DM generated exe? It's only pure calculations, no i/o calls that might involve switching between protected and real mode... Under real DOS (with EMM386, so VCPI is involved) it's even slower!

Laurentiu

    // integrate.cpp
    #include <stdio.h>
    #include <math.h>
    #include <time.h>

    double fn(double x)
    {
        return 0.5 * exp (-x*x/2.0);
    }

    double integrate(double a, double b, double eps, double(*f)(double))
    {
        time_t before, after;
        time(&before);
        unsigned points = 4;
        register unsigned i;
        double previous, x, dx;
        register double current = 0.0;
        do {
            previous = current;
            current = ((*f)(a) + (*f)(b)) / 2.0;
            x = a;
            dx = (b - a) / (points - 1);
            for (i = points - 3; i--; x += dx) {
                current += (*f)(x);
            }
            points <<= 1;
            current *= dx;
        } while(fabs((current - previous) / current) >= eps);
        time(&after);
        printf("value = %g\tpoints = %u\ttime = %g\n", current, points, difftime(after, before));
        return current;
    }

    int main()
    {
        //fesetprec(FE_DBLPREC); // no speedup
        integrate(0.0, 1.0, 1e-9, fn);
        return 0;
    }

Sep 18 2001
DMC has significantly more accurate floating point than other compilers do. This is particularly apparent in the floating point library, exp() included. It involves correctly handling things like NaNs and Infinities, which requires some extra code to be executed. Many C compilers simply ignore those cases.

-Walter

Laurentiu Pancescu wrote in message <9o86u7$2oki$1 digitaldaemon.com>...
It seems that DigitalMars is not the best choice for numerical applications, or maybe I just got into a particular case, into which DM is behaving poorly?

Sep 18 2001
I completely rewrote all the numerically intensive functions, and I was amazed by the speed of DMC generated code: it's the best compiler on Win32!! Borland's free compiler generates a crashing EXE, while Cygwin and MinGW generated code with about half the speed of DMC's code - unbelievable!! It seems that the "-ff" switch is very effective (it almost doubles execution speed in this case).

Even more, after this code rewrite, the X32 version is exactly as fast as the Win32 version (which is normal; I must have done some stupid things in the first version). Only gcc-2.95.2 on Debian GNU/Linux beats DMC, but the difference is not so much (about 9% faster code)...

Congratulations, Walter!! DMC is really great, and the ability to treat Infinity and NaN without inline assembly is extremely useful for mathematical applications.

Laurentiu

"Walter" <walter digitalmars.com> wrote:
DMC has significantly more accurate floating point than other compilers do. This is particularly apparent in the floating point library, exp() included. It involves correctly handling things like NaN's and Infinities, which requires some extra code to be executed. Many C compilers simply ignore those cases. -Walter

Sep 21 2001
Thanks! - but I have to ask, what is gcc-2.95.2 doing that DMC is not to the code?

-Walter

Laurentiu Pancescu wrote in message <9ofker$r28$1 digitaldaemon.com>...
Only gcc-2.95.2 on Debian GNU/Linux beats DMC, but the difference is not so much (about 9% faster code)...

Sep 21 2001
"Walter" <walter digitalmars.com> wrote:
Thanks! - but I have to ask, what is gcc-2.95.2 doing that DMC is not to the code? -Walter

Sep 21 2001
If you have a billion dollars to spend on engineers, you can task them with coding the entire rtl in optimized assembly language! You're right that you have to check if you're testing the rtl speed or the generated code speed. I was losing a benchmark to gcc once, and couldn't figure out why, because in every case dmc generated better code. It turns out the time was all being sucked up in a strcpy() of a constant, which gcc had inlined and essentially eliminated.

-Walter

Laurentiu Pancescu wrote in message <9og353$12og$1 digitaldaemon.com>...
"Walter" <walter digitalmars.com> wrote:
Thanks! - but I have to ask, what is gcc-2.95.2 doing that DMC is not to

Sep 22 2001
"Walter" <walter digitalmars.com> wrote:
You're right that you have to check if you're testing the rtl speed or the generated code speed.

Sep 29 2001
Laurentiu Pancescu wrote:
I completely rewrote all the numerically intensive functions, and I was amazed by the speed of DMC generated code: it's the best compiler on Win32!!

Sep 21 2001