digitalmars.D.learn - Matrix Multiplication benchmark

Matthias Pleh <benutzer example.com> writes:
I've created a simple 4x4 matrix struct and run some tests, with 
surprising results (although I've heard of similar results in C++).

struct mat4 {
   float[4][4] m;
   mat4 opMul(in mat4 _m);
}
mat4 mul(in mat4 m1, in mat4 m2);


I've tested these two multiplication functions (overloaded operator and free function).

Results (100_000 iterations, Windows 7 64-bit, 32-bit build):

DMD: (dmd -O -inline -release)
opMul:   20 msecs
mul  : 1355 msecs

GDC: (gdc -m32 -O3 -frelease)
opMul:   10 msecs
mul  : 1481 msecs


Why this huge difference in performance?
Aug 16 2012
"RommelVR" <daniel350 bigpond.com> writes:
On Thursday, 16 August 2012 at 13:15:41 UTC, Matthias Pleh wrote:
 Why this huge difference in performance?
See http://stackoverflow.com/questions/5142366/how-fast-is-d-compared-to-c. It's not current, but it has some hints on what to look at.

I suggest changing 'in' to 'const ref' for a real boost, though the semantic difference between the two isn't exactly clear to me from the documentation.
Aug 16 2012
Matthias Pleh <benutzer example.com> writes:
On 16.08.2012 15:18, RommelVR wrote:
 I suggest changing 'in' to 'const ref' for a real boost;
The thread you mention suggests the opposite. It also doesn't answer my question. Note that this is not a comparison of two languages (C vs. D), but a comparison of operator overloading vs. free global functions.
Aug 16 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Matthias Pleh:

opMul is obsolete; I suggest using the newer operator
overloading.

 Why this huge difference in performance?
I don't know. Why don't you show us the minimized but compilable source code of the two versions, plus their clean assembly (so using printf instead of writeln, etc.)? Sometimes it's hard to micro-optimize for performance if you can't see the asm.

RommelVR:
 I suggest changing 'in' to 'const ref' for a real boost; though 
 the semantic difference between the two isn't exactly clear to 
 me in the documentation.
"in" expands to "scope const". So both are const, but scope is about not letting data escape, while ref means the 64 bytes of the struct are passed by reference, that is, by pointer. With 64 bytes it's probably better to use ref. In Ada there is something like a "smart_ref" that passes by reference or by value, choosing whichever is more efficient in the current case (it's not good for interfacing with C code).

Bye,
bearophile
Aug 16 2012
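
A minimal sketch of the two suggestions in the reply above, under stated assumptions: the thread never shows the function bodies, so the `Mat4` struct, the `identity` helper, and the loop-based multiplication here are hypothetical stand-ins. It shows the newer `opBinary` form of operator overloading and a free function whose 64-byte arguments are passed as `const ref` rather than `in`.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the thread's mat4; the original bodies were never posted.
struct Mat4
{
    float[4][4] m;

    // D2-style operator overloading: `a * b` is rewritten to a.opBinary!"*"(b).
    // `const ref` passes the right-hand side by reference instead of copying it.
    Mat4 opBinary(string op)(const ref Mat4 rhs) const
        if (op == "*")
    {
        Mat4 result;
        foreach (i; 0 .. 4)
            foreach (j; 0 .. 4)
            {
                float sum = 0;
                foreach (k; 0 .. 4)
                    sum += m[i][k] * rhs.m[k][j];
                result.m[i][j] = sum;
            }
        return result;
    }
}

// 16 floats of 4 bytes each: the 64 bytes mentioned in the reply.
static assert(Mat4.sizeof == 64);

// Free function taking both operands by const reference (assumed signature).
Mat4 mul(const ref Mat4 a, const ref Mat4 b)
{
    return a * b;
}

// Helper for the example only: builds an identity matrix.
Mat4 identity()
{
    Mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            r.m[i][j] = (i == j) ? 1.0f : 0.0f;
    return r;
}

void main()
{
    Mat4 a = identity();
    Mat4 b = identity();
    b.m[0][3] = 2.0f;

    Mat4 viaOperator = a * b;    // operator form
    Mat4 viaFunction = mul(a, b); // free-function form

    assert(viaOperator.m == viaFunction.m);
    writeln(viaOperator.m[0][3]);
}
```

Both operands in `main` are named variables on purpose: a plain `ref` parameter binds to lvalues, so passing a temporary would need a different signature (for example `auto ref` in a template).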
Timon Gehr <timon.gehr gmx.ch> writes:
On 08/16/2012 03:15 PM, Matthias Pleh wrote:
 Why this huge difference in performance?
Note that you haven't provided the full benchmark code listing.
Aug 16 2012
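
For reference, a complete listing of the kind requested might look like the sketch below. It is not the original poster's code: `Mat4`, `multiply`, `filled`, and the loop structure are assumptions, and it uses `opBinary` and `std.datetime.stopwatch` from current Phobos rather than the `opMul` of the original post. Both call forms share one arithmetic routine, so any timing gap would come from how the operands are passed and whether the call is inlined.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Shared arithmetic so that both call forms do identical work (hypothetical helper).
// dst must not alias a or b.
void multiply(ref const float[4][4] a, ref const float[4][4] b, ref float[4][4] dst)
{
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
        {
            float sum = 0;
            foreach (k; 0 .. 4)
                sum += a[i][k] * b[k][j];
            dst[i][j] = sum;
        }
}

struct Mat4
{
    float[4][4] m;

    // Overloaded operator, taking the operand as `in` like the original opMul.
    Mat4 opBinary(string op)(in Mat4 rhs) const
        if (op == "*")
    {
        Mat4 result;
        multiply(m, rhs.m, result.m);
        return result;
    }
}

// Free function with the signature from the original post.
Mat4 mul(in Mat4 m1, in Mat4 m2)
{
    Mat4 result;
    multiply(m1.m, m2.m, result.m);
    return result;
}

// Example-only helper: `diag` on the diagonal, `off` everywhere else.
Mat4 filled(float diag, float off)
{
    Mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            r.m[i][j] = (i == j) ? diag : off;
    return r;
}

void main()
{
    enum iterations = 100_000;
    Mat4 b = filled(1.0f, 0.0f); // identity keeps the accumulated values bounded

    Mat4 acc1 = filled(2.0f, 0.5f);
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. iterations)
        acc1 = acc1 * b;
    immutable opTime = sw.peek.total!"usecs";

    Mat4 acc2 = filled(2.0f, 0.5f);
    sw.reset();
    foreach (_; 0 .. iterations)
        acc2 = mul(acc2, b);
    immutable fnTime = sw.peek.total!"usecs";

    writefln("operator *: %s usecs", opTime);
    writefln("mul       : %s usecs", fnTime);
    // Printing a value derived from the results keeps the loops from being discarded.
    writefln("checksum  : %s", acc1.m[0][0] + acc2.m[3][3]);
}
```

With a listing like this, the flags from the original post (`dmd -O -inline -release`) can be applied unchanged, and the generated assembly for the two loops can be compared directly.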