c++ - The slowest free C compiler?

"Ronald Barrett" <ronaldb sebcorinc.com> writes:
Hello,

I ran a short floating-point benchmark on a simple C source file. I tested
gcc (MinGW version), Lcc-win32 and DMC with default compiler options on a P4
platform. It is surprising to see that the compilers which use a 12-byte
long double generate faster code. The DMC -ff option does not significantly
affect the resulting CPU utilization in this test. Here are the results (in
relative units):

gcc v3.2.3 (mingw special 20030504-1)           1
Lcc-win32 v3.8                                  1.36
DMC v8.38n                                      1.89

gcc v3.2.3 (mingw special 20030504-1)           0.85
  with Dinkum library v4.02 (commercial),
  only for comparison

Ronald
Jan 25 2004
Ilya Minkov <minkov cs.tum.edu> writes:
You cannot seriously use the default options, since with all of these
compilers the default is a non-optimised, fast compile! With optimised
compiles, I would frankly expect a small difference, with the slowest code
being generated by LCC-Win32. I believe LCC is the only one which cannot
effectively use the floating-point registers.

Use -o -ff on DMC and something like -O2 -ffast-math on GCC, plus the
architecture switches. I can't recall the options for LCC-Win32.

-eye

Jan 25 2004
"Ronald Barrett" <ronaldb sebcorinc.com> writes:
Thanks for the fast replies,

I conducted more comprehensive tests with the same source file. Here are the
results:

Compiler and options                            Relative units  Output file
gcc v3.2.3 (mingw special 20030504-1)           1               result_a
gcc v3.2.3 (mingw special 20030504-1) with -O2  0.115           result_a1
gcc v3.2.3 (mingw special 20030504-1) with -O1  0.28            result_a1
DMC v8.38n with -o                              1.19            result_b
Lcc-win32 v3.8 with/without -O                  1.36            result_a1
DMC v8.38n without -o                           1.89            result_b

"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:bv12k6$7ep$1 digitaldaemon.com...
 With optimised compiles, i would frankly expect small difference, with

You are right. DMC -o generates faster code in this test than Lcc-Win32.

result_a, result_a1 and result_b are the results of the computations. Each
of them contains 3000000 long double values. result_a and result_b cannot
be compared bit for bit because of the different long double sizes. I found
that result_a and result_a1 have exactly 1000000 identical differences:

fc result_a result_a1:
0000000A: 00 22
0000002E: 00 22
00000052: 00 22
...
0225509E: 00 22
022550C2: 00 22
022550E6: 00 22

I wrote a simple program to visualize the raw results:

    #include <stdio.h>

    int main(void)
    {
        FILE *input = fopen("result_a1", "rb");
        long double value[1];
        unsigned long counter;

        if (input == NULL)  /* the result file could not be opened */
            return 1;

        /* print the first 10000 raw long double values, one per line */
        for (counter = 0; counter < 10000; counter++) {
            if (fread(value, sizeof(long double), 1, input) != 1)
                break;      /* short read: stop instead of printing garbage */
            printf("%.20Lf\n", *value);
        }
        fclose(input);
        return 0;
    }

The problem is that

Or with printf("%.20Lg...

-1311.0351567603643 from a line of result_a
-1311.0351562537248 from the same line of result_a1

Which file contains the correct computations? If it is result_a, then gcc
with the -O1 or -O2 options and Lcc-win32 (with or without -O) do not
generate accurate long double computations in all cases.

From Walter <walter digitalmars.com>:
 Also, DMC has slower math functions because DMC does extra work in them to
 ensure accuracy and correct handling of NaN's and overflows, work that is
 frequently skipped by other compilers.

The test code does indeed contain some inf computations. If result_a1 is
the correct one (all the differences are identical) then
Jan 25 2004
"Ronald Barrett" <ronaldb sebcorinc.com> writes:
Thanks for the fast replies,

I conducted more comprehensive tests with the same source file. Here are the
results:

Compiler and options                            Relative units  Output file
gcc v3.2.3 (mingw special 20030504-1)           1               result_a
gcc v3.2.3 (mingw special 20030504-1) with -O2  0.115           result_a1
gcc v3.2.3 (mingw special 20030504-1) with -O1  0.28            result_a1
DMC v8.38n with -o                              1.19            result_b
Lcc-win32 v3.8 with/without -O                  1.36            result_a1
DMC v8.38n                                      1.89            result_b

From Ilya Minkov <minkov cs.tum.edu>:
 With optimised compiles, i would frankly expect small difference, with

You are right: DMC with the -o option generates faster code in this test
than Lcc-Win32. The performance of the gcc -O2 build is also interesting to
note.

result_a, result_a1 and result_b are the output files of the test program.
Each of them contains exactly 3000000 long double values. result_a and
result_b cannot be compared bit for bit because of the different long
double sizes. The values in result_a and result_b also differ significantly
(in some zones), probably because of the different long double sizes (the
test code performs a great many computations, and I don't know how large
the accumulated computational error is). I found that result_a and
result_a1 have exactly 1000000 identical differences:

fc result_a result_a1:
0000000A: 00 22
0000002E: 00 22
00000052: 00 22
...
0225509E: 00 22
022550C2: 00 22
022550E6: 00 22

To compare the raw results visually I wrote vdata.c:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        FILE *input;
        long double value[1];
        unsigned long counter;

        /* argv[1] is the name of the result file to dump */
        if (argc < 2 || (input = fopen(argv[1], "rb")) == NULL)
            return 1;

        /* each result file holds 3000000 raw long double values */
        for (counter = 0; counter < 3000000; counter++) {
            if (fread(value, sizeof(long double), 1, input) != 1)
                break;  /* short read: stop instead of printing garbage */
            printf("%.20Lf\n", *value);
        }
        fclose(input);
        return 0;
    }

lcc vdata.c & lcclnk vdata.obj:

vdata result_a1
-0.43290277584473510800
0.75025220636165043600
-0.45220046017929632000
...

vdata result_a > a.txt
vdata result_a1 > a1.txt
fc a.txt a1.txt /b
FC: no differences encountered

I repeated this with vdata.c compiled with MinGW gcc (with the Dinkum
libraries, because printf from the default MinGW gcc libraries has a
problem with long double values) and again fc found no differences. It
seems that there is no difference between result_a and result_a1. If this
is true, then the test code compiled with gcc -O2 executes more than 10
times faster than the DMC equivalent with all optimizations.

From Walter <walter digitalmars.com>:
 Also, DMC has slower math functions because DMC does extra work in them to
 ensure accuracy and correct handling of NaN's and overflows, work that is
 frequently skipped by other compilers.

The test code does indeed contain some overflowing computations, but only
in addition, subtraction and multiplication. If you are interested in this
case I'll send you the source file; it is a short and simple algorithm.

From Scott Michel <scottm cs.ucla.edu>:
 What's the benchmark's code, what does it do, what does it test?

software.

Thank you,
Ronald
Jan 25 2004
"Walter" <walter digitalmars.com> writes:
If you're testing long doubles, you should be aware that few compilers on
Win32 actually support true 80 bit long doubles. Most fake it with 64 bits,
which of course will compute faster, but much less accurately.

DMC++ does real 80 bit long doubles.

Benchmarking floating point isn't easy <g>.
Jan 25 2004
"Ronald Barrett" <ronaldb sebcorinc.com> writes:
I'm aware of this. I ran the MinGW gcc tests again with the
-m96bit-long-double option. Neither the results nor the CPU utilization
differed from the previous tests.

How is it possible for gcc -O2 to produce code that is 10 times faster
(with a 96 bit long double) in some cases than DMC, which uses an 80 bit
long double?
If the above statement is fully correct, this means that my project, which
requires 50 days of computation with DMC (yes, I have such projects), will
run for roughly 5 days with gcc -O2. In addition, it will be computed more
precisely with gcc due to the longer long double.

Do you want to see the source?

Ronald

Jan 26 2004
"Walter" <walter digitalmars.com> writes:
"Ronald Barrett" <ronaldb sebcorinc.com> wrote in message
news:bv2vvl$ah2$1 digitaldaemon.com...
 I'm aware of this. I ran the MinGW gcc tests again with the
 -m96bit-long-double option. Neither the results nor the CPU utilization
 differed from the previous tests.

 How is it possible for gcc -O2 to produce code that is 10 times faster
 (with a 96 bit long double) in some cases than DMC, which uses an 80 bit
 long double?

There's something else going on in the benchmark code, then. Something else, like perhaps file I/O, that is taking so much time it is swamping the result. Or perhaps MinGW happens to determine that your benchmark computation is all dead code and deletes it entirely.

Jan 26 2004
"Ronald Barrett" <ronaldb sebcorinc.com> writes:
 There's something else going on in the benchmark code, then. Something
 else, like perhaps file I/O, that is taking so much time it is swamping
 the result.

Yes, there are exactly 30 fwrite calls, each with a buffer of 100000 long
double values (note that the buffer for gcc, and therefore the resulting
file, is 1.2 times larger than the DMC one).

 Or perhaps MinGW happens to determine that your benchmark computation is
 all dead code and deletes it entirely.

No, I visually compared the data produced by the source code compiled with
DMC and with MinGW gcc. There are some zones which differ significantly,
but there are also zones which do not differ. From what I have observed,
this is due to the different accumulated errors (which in some zones
probably exhaust the long double precision) in the computational space. In
addition, compilation with Lcc-win32 produces exactly the same result file
as compilation with gcc -O2, but runs about 11 times slower.

I'll send you the source code within an hour.

Is there a precision difference between the 80 bit long double and the 96
bit one on the x86 platform? (LDBL_EPSILON, LDBL_MIN, LDBL_MAX and LDBL_DIG
are the same in DMC, Lcc-Win32 and MinGW gcc with the -m96bit-long-double
option.)

Thank you,
Ronald
Jan 26 2004
"Ronald Barrett" <ronaldb sebcorinc.com> writes:

Jan 25 2004
"Walter" <walter digitalmars.com> writes:
Default means unoptimized for DMC. Use -o for optimization. Also, DMC has slower math functions because DMC does extra work in them to ensure accuracy and correct handling of NaN's and overflows, work that is frequently skipped by other compilers.
Jan 25 2004
Scott Michel <scottm cs.ucla.edu> writes:
These results are absolutely meaningless. What's the benchmark's code, what
does it do, what does it test? Are these normalized raw numbers, and if so,
normalized relative to what? What's the distribution's 95th percentile and
std. dev.? How did you come up with the timings?

You've successfully benchmarked something, but what, I can't tell.


-scooter


Jan 25 2004