digitalmars.D.learn - Matrix Multiplication benchmark
- Matthias Pleh (16/16) Aug 16 2012 I've created a simple 4x4 Matrix struct and have made some tests, with
- RommelVR (7/25) Aug 16 2012 See
- Matthias Pleh (5/6) Aug 16 2012 The mentioned thread suggests it the other way around.
- bearophile (17/21) Aug 16 2012 RommelVR:
- Timon Gehr (2/18) Aug 16 2012 Note that you did not provide the full benchmark code listing.
I've created a simple 4x4 Matrix struct and have made some tests, with surprising results (although I've heard of similar results in C++).

    struct mat4 {
        float[4][4] m;
        mat4 opMul(in mat4 _m);
    }
    mat4 mul(in mat4 m1, in mat4 m2);

I've tested these two multiplication functions (overloaded and free).

Result (100_000 times, Windows 7 64-bit, 32-bit build):

    DMD: (dmd -O -inline -release)
      opMul: 20 msecs
      mul  : 1355 msecs
    GDC: (gdc -m32 -O3 -frelease)
      opMul: 10 msecs
      mul  : 1481 msecs

Why this huge difference in performance?
Aug 16 2012
On Thursday, 16 August 2012 at 13:15:41 UTC, Matthias Pleh wrote:
> I've created a simple 4x4 Matrix struct and have made some tests, with surprising results (although I've heard of similar results in C++).
>
>     struct mat4 {
>         float[4][4] m;
>         mat4 opMul(in mat4 _m);
>     }
>     mat4 mul(in mat4 m1, in mat4 m2);
>
> I've tested these two multiplication functions (overloaded and free).
>
> Result (100_000 times, Windows 7 64-bit, 32-bit build):
>
>     DMD: (dmd -O -inline -release)
>       opMul: 20 msecs
>       mul  : 1355 msecs
>     GDC: (gdc -m32 -O3 -frelease)
>       opMul: 10 msecs
>       mul  : 1481 msecs
>
> Why this huge difference in performance?

See http://stackoverflow.com/questions/5142366/how-fast-is-d-compared-to-c. It's not current, but it has some hints about what to look at.

I suggest changing 'in' to 'const ref' for a real boost, though the semantic difference between the two isn't exactly clear to me from the documentation.
Aug 16 2012
On 16.08.2012 15:18, RommelVR wrote:
> I suggest changing 'in' to 'const ref' for a real boost;

The mentioned thread suggests it the other way around. Also, this does not answer my question. Note that it's not a question of comparing two languages (C vs. D), but rather of comparing operator overloading vs. free global functions.
Aug 16 2012
Matthias Pleh:

opMul is obsolete; I suggest using the newer operator overloading.

> Why this huge difference in performance?

I don't know. Why don't you show us the minimized but compilable source code of the two versions, plus their clean assembly (so using printf instead of writeln, etc.)? Sometimes it's hard to micro-optimize for performance if you don't see the asm.

RommelVR:

> I suggest changing 'in' to 'const ref' for a real boost; though the semantic difference between the two isn't exactly clear to me in the documentation.

"in" expands to "scope const". So both are const, but scope is about not escaping data, while ref means the 64 bytes of the struct are passed by reference, that is, by pointer. With 64 bytes it's probably better to use ref. In Ada there is something like a "smart_ref" that uses ref or not, choosing whichever of the two is most efficient in the current case (it's not good for interfacing with C code).

Bye,
bearophile
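As a rough illustration of the two suggestions above (the newer operator overloading and passing the operand by const ref), here is a minimal sketch. The multiplication body and the `identity` helper are assumptions made for the example, since the original function bodies were not posted.

```d
struct mat4
{
    float[4][4] m;

    // Newer-style operator overloading: opBinary!"*" takes the place of opMul.
    // The right-hand operand is taken by const ref, so the 64-byte struct is
    // passed by pointer instead of being copied. Note that a plain ref
    // parameter only accepts lvalues.
    mat4 opBinary(string op)(ref const mat4 rhs) const
        if (op == "*")
    {
        mat4 r;
        foreach (i; 0 .. 4)
            foreach (j; 0 .. 4)
            {
                float sum = 0;
                foreach (k; 0 .. 4)
                    sum += m[i][k] * rhs.m[k][j];
                r.m[i][j] = sum;
            }
        return r;
    }

    // Hypothetical helper, only used to exercise the operator below.
    static mat4 identity()
    {
        mat4 r;
        foreach (i; 0 .. 4)
            foreach (j; 0 .. 4)
                r.m[i][j] = (i == j) ? 1 : 0;
        return r;
    }
}

unittest
{
    auto a = mat4.identity();
    mat4 b;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            b.m[i][j] = i * 4 + j;

    // Multiplying by the identity leaves the other matrix unchanged.
    auto c = a * b;
    assert(c.m == b.m);
}
```

The unittest block can be run with `dmd -unittest -main`. Whether const ref actually helps here can only be settled by measuring and looking at the generated asm.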
Aug 16 2012
On 08/16/2012 03:15 PM, Matthias Pleh wrote:
> I've created a simple 4x4 Matrix struct and have made some tests, with surprising results (although I've heard of similar results in C++).
>
>     struct mat4 {
>         float[4][4] m;
>         mat4 opMul(in mat4 _m);
>     }
>     mat4 mul(in mat4 m1, in mat4 m2);
>
> I've tested these two multiplication functions (overloaded and free).
>
> Result (100_000 times, Windows 7 64-bit, 32-bit build):
>
>     DMD: (dmd -O -inline -release)
>       opMul: 20 msecs
>       mul  : 1355 msecs
>     GDC: (gdc -m32 -O3 -frelease)
>       opMul: 10 msecs
>       mul  : 1481 msecs
>
> Why this huge difference in performance?

Note that you did not provide the full benchmark code listing.
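For reference, a complete listing for such a comparison might look roughly like the sketch below. This is not the original benchmark: the multiplication bodies, `makeMatrix`, `Timings` and `runBenchmark` are all hypothetical, `opBinary!"*"` stands in for the post's opMul, and the timing uses `std.datetime.stopwatch` from current Phobos.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

struct mat4
{
    float[4][4] m;

    // Member (overloaded) version; stands in for opMul from the post.
    mat4 opBinary(string op)(in mat4 rhs) const
        if (op == "*")
    {
        mat4 r;
        foreach (i; 0 .. 4)
            foreach (j; 0 .. 4)
            {
                float sum = 0;
                foreach (k; 0 .. 4)
                    sum += m[i][k] * rhs.m[k][j];
                r.m[i][j] = sum;
            }
        return r;
    }
}

// Free-function version with the signature given in the post.
mat4 mul(in mat4 m1, in mat4 m2)
{
    mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
        {
            float sum = 0;
            foreach (k; 0 .. 4)
                sum += m1.m[i][k] * m2.m[k][j];
            r.m[i][j] = sum;
        }
    return r;
}

// Hypothetical helper: fills a matrix with seed, seed + 1, ... row by row.
mat4 makeMatrix(float seed)
{
    mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            r.m[i][j] = seed + i * 4 + j;
    return r;
}

struct Timings
{
    Duration memberTime;
    Duration freeTime;
    float checksum; // keeps the results in use so the loops are not dead code
}

Timings runBenchmark(size_t iterations)
{
    auto a = makeMatrix(1), b = makeMatrix(2);
    mat4 c;
    float checksum = 0;

    auto sw = StopWatch(AutoStart.yes);
    foreach (n; 0 .. iterations)
    {
        c = a * b;
        checksum += c.m[0][0];
    }
    auto memberTime = sw.peek();

    sw.reset(); // resets the elapsed time; the stopwatch keeps running
    foreach (n; 0 .. iterations)
    {
        c = mul(a, b);
        checksum += c.m[0][0];
    }
    auto freeTime = sw.peek();

    return Timings(memberTime, freeTime, checksum);
}

void main()
{
    auto t = runBenchmark(100_000);
    writefln("member: %s msecs", t.memberTime.total!"msecs");
    writefln("free  : %s msecs", t.freeTime.total!"msecs");
    writefln("checksum: %s", t.checksum);
}
```

With both bodies visible like this, the two versions can be compiled with the flags from the post and their asm compared directly.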
Aug 16 2012