digitalmars.D.learn - -noboundscheck
- Nvirjskly (5/5) Aug 19 2012 Compiling my code with the -noboundscheck flag sped it up by
- bearophile (8/11) Aug 19 2012 The D front-end is very dumb in this, as far as I know it makes
- Jonathan M Davis (12/17) Aug 19 2012 It would depend entirely on your code. In most cases, I wouldn't expect ...
- Nvirjskly (9/35) Aug 19 2012 I am using dmd.
- Jonathan M Davis (10/14) Aug 19 2012 Both gdc and ldc support D2, though sometimes they're a release behind
- 1100110 (4/4) Aug 19 2012 I have gdc, dmd, and ldc installed on my computer.
- Nvirjskly (23/28) Aug 19 2012 Haha I actually do not have whirlpool implemented yet (it's an
- 1100110 (11/40) Aug 19 2012 Yeah, I figured it out. I did have to rename src though...
- 1100110 (124/124) Aug 19 2012 Here are my results! iirc -release implies -noboundscheck..
- Nvirjskly (19/23) Aug 19 2012 Wow, thanks. It looks like ldc2 does not play nice with
- 1100110 (5/26) Aug 19 2012 Really? that was quick.
- bearophile (6/6) Aug 19 2012 I can't understand what command line arguments you are using for
Compiling my code with the -noboundscheck flag sped it up by almost 5 times (whilst passing all tests and working exactly the same way). Is bounds checking really that expensive, and what other simple optimisations can I perform other than -inline -O -noboundscheck?
Aug 19 2012
Nvirjskly:
> is bounds checking really that expensive,

The D front-end is very dumb about this; as far as I know, it makes no attempt to remove those tests where they can't fail. Walter believes such optimizations don't gain much.

> what other simple optimisations can I perform other than -inline -O -noboundscheck?

Compiler options change across different compilers. What compiler are you using?

Bye,
bearophile
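To make the point about redundant checks concrete, here is a minimal sketch. The function names are made up for illustration and are not taken from cryptod. In a build without -noboundscheck, each `data[i]` in the first loop is compared against `data.length` at run time. Iterating with `foreach` avoids writing an index expression at all, and indexing through `.ptr` skips the check for that one access only, at the cost of losing the safety net.

```d
import std.stdio : writeln;

// Indexed loop: unless bounds checks are disabled, every data[i]
// is checked against data.length at run time.
uint sumIndexed(const(uint)[] data)
{
    uint total = 0;
    for (size_t i = 0; i < data.length; ++i)
        total += data[i];
    return total;
}

// foreach over the elements: there is no user-written index,
// so there is no per-element index expression to check.
uint sumForeach(const(uint)[] data)
{
    uint total = 0;
    foreach (x; data)
        total += x;
    return total;
}

// Pointer indexing: p[i] is never bounds checked, so the loop
// condition is the only thing keeping the access in range.
uint sumUnchecked(const(uint)[] data)
{
    uint total = 0;
    auto p = data.ptr;
    for (size_t i = 0; i < data.length; ++i)
        total += p[i];
    return total;
}

void main()
{
    uint[] data = [1, 2, 3, 4, 5];
    // All three variants compute the same sum.
    writeln(sumIndexed(data), " ", sumForeach(data), " ", sumUnchecked(data));
}
```

Whether any of these is measurably faster depends on the compiler and the surrounding code, so it is worth timing them rather than assuming.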
Aug 19 2012
On Sunday, August 19, 2012 21:29:38 Nvirjskly wrote:
> Compiling my code with the -noboundscheck flag sped it up by almost 5 times (whilst passing all tests and working exactly the same way,) is bounds checking really that expensive, and what other simple optimisations can I perform other than -inline -O -noboundscheck?

It would depend entirely on your code. In most cases, I wouldn't expect to see a speed up anywhere near that large. But if you're constantly accessing arrays and doing little other computation, then maybe you do. I have no idea what your code is doing.

dmd's optimizer isn't the best anyway. It compiles much faster than gdc and ldc do, but it usually generates slower code (the focus on dmd has generally been getting everything working correctly rather than optimizing everything to death, though that should change with time). Whatever the situation with your code is, I'd expect that the situation with its optimizations would change quite a bit with one of the other D compilers.

- Jonathan M Davis
Aug 19 2012
On Sunday, 19 August 2012 at 20:07:32 UTC, Jonathan M Davis wrote:
> On Sunday, August 19, 2012 21:29:38 Nvirjskly wrote:
>> Compiling my code with the -noboundscheck flag sped it up by almost 5 times (whilst passing all tests and working exactly the same way,) is bounds checking really that expensive, and what other simple optimisations can I perform other than -inline -O -noboundscheck?
>
> It would depend entirely on your code. In most cases, I wouldn't expect to see a speed up anywhere near that large. But if you're constantly accessing arrays and doing little other computation, then maybe you do. I have no idea what your code is doing.

I am using dmd. Yes, my code is extremely array heavy with many array-based computations (shameless plug: https://github.com/Nvirjskly/cryptod).

> dmd's optimizer isn't the best anyway. It compiles much faster than gdc and ldc do, but it usually generates slower code (the focus on dmd has generally been getting everything working correctly rather than optimizing everything to death, though that should change with time). Whatever the situation with your code is, I'd expect that the situation with its optimizations would change quite a bit with one of the other D compilers.
>
> - Jonathan M Davis

Ah, that makes a lot of sense. If my goal is a fast running time, would it then make sense to use another compiler? I heard that gdc development is lagging behind and that ldc might not even support D2 all that well?
Aug 19 2012
On Sunday, August 19, 2012 22:13:15 Nvirjskly wrote:
> Ah, that makes a lot of sense. If my goal is a fast running time would it then make sense to use another compiler? I heard that gdc development is lagging behind and that ldc might not even support D2 all that well?

Both gdc and ldc support D2, though sometimes they're a release behind (especially right after a new dmd release). I don't remember the sites for them, but I do recall that one or both of them have had issues where their old site is generally the first one that you find, so it looks like they don't support D2. But that's an issue with hits and Google, not the compilers themselves.

But if you want your code to be as fast as possible, then use either gdc or ldc, though I don't know which is better (it probably depends on your code).

- Jonathan M Davis
Aug 19 2012
I have gdc, dmd, and ldc installed on my computer. I also forked your repo two minutes before reading this. Tell me what you want, and I'll run whatever tests you want. But in return, I'm stealing your whirlpool (with attribution, of course).
Aug 19 2012
On Sunday, 19 August 2012 at 21:11:13 UTC, 1100110 wrote:
> I have gdc, dmd, and ldc installed on my computer. I also forked your repo two minutes before reading this. Tell me what you want, and Ill run whatever tests you want. But in return, I'm stealing your whirlpool.(with attribution of course.)

Haha, I actually do not have whirlpool implemented yet (it's an empty file), but since you seem to want it, it's right at the top of my TODO list (if I'm lucky I'll get it done by the end of today, but best bet is this time tomorrow. I already have the spec open.)

benchmark.d contains a main function that runs some rudimentary benchmarks if you want to compile it with that...

    import std.process, std.stdio, std.file, std.path;

    void main()
    {
        string files = "";
        foreach (string name; dirEntries("src", SpanMode.breadth))
        {
            if(name.isFile()) files ~= name ~ " ";
        }
        string command = "dmd " ~ files ~ "benchmark.d -ofcryptod -noboundscheck -O -release -inline";
        writeln(shell(command));
    }

That should compile it with dmd. I'm not sure about ldc or gdc and their compiler options, but it should be something similar...
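The build script above relies on `shell` from std.process, which later Phobos releases deprecated and eventually removed. A rough equivalent using `executeShell` might look like the sketch below. The helper name `buildCommand` is hypothetical, and the directory layout (a `src` folder next to benchmark.d) is simply assumed from the original script.

```d
import std.array : join;
import std.file : dirEntries, SpanMode;
import std.process : executeShell;
import std.stdio : writeln;

// Hypothetical helper: assembles the same dmd command line as the
// original script from a list of source files.
string buildCommand(string[] files)
{
    return "dmd " ~ files.join(" ")
        ~ " benchmark.d -ofcryptod -noboundscheck -O -release -inline";
}

void main()
{
    // Collect every regular file under src, as the original loop does.
    string[] files;
    foreach (entry; dirEntries("src", SpanMode.breadth))
    {
        if (entry.isFile)
            files ~= entry.name;
    }

    // executeShell returns the exit status and the captured output.
    auto result = executeShell(buildCommand(files));
    writeln(result.output);
    if (result.status != 0)
        writeln("dmd exited with status ", result.status);
}
```

Keeping the command assembly in its own function makes it easy to swap in another compiler driver such as ldmd2, which accepts dmd-style flags. Newer dmd releases also spell the bounds-check switch as -boundscheck=off, so the flag may need adjusting depending on the compiler version.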
Aug 19 2012
Yeah, I figured it out. I did have to rename src though...

I ran a few tests, inconclusive for any serious difference. gdc is now compiling with -O3 -march=native -frelease -fno-bounds-check -finline -ffast-math.

But no, dmd has the shortest compile times, gdmd the longest. I'm timing everything right now.

...My laptop is getting hot... I want to see how bad it crashes. =P

On Sun, 19 Aug 2012 17:17:02 -0500, Nvirjskly <nvirjskly gmail.com> wrote:
> On Sunday, 19 August 2012 at 21:11:13 UTC, 1100110 wrote:
>> I have gdc, dmd, and ldc installed on my computer. I also forked your repo two minutes before reading this. Tell me what you want, and Ill run whatever tests you want. But in return, I'm stealing your whirlpool.(with attribution of course.)
>
> Haha I actually do not have whirlpool implemented yet (it's an empty file,) but since you seem to want it, it's right at the top of my TODO list (if I'm lucky I'll get it done by the end of today, but best bet is this time tomorrow. I already have the spec open.)
>
> benchmark.d contains a main function that runs some rudimentary benchmarks if you want to compile it with that...
>
>     import std.process, std.stdio, std.file, std.path;
>
>     void main()
>     {
>         string files = "";
>         foreach (string name; dirEntries("src", SpanMode.breadth))
>         {
>             if(name.isFile()) files ~= name ~ " ";
>         }
>         string command = "dmd " ~ files ~ "benchmark.d -ofcryptod -noboundscheck -O -release -inline";
>         writeln(shell(command));
>     }
>
> should compile that with dmd, I'm not sure about ldc or gdc and their compiler options, but it should be something similar...

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
Aug 19 2012
Here are my results! iirc -release implies -noboundscheck.. Also I am on x64, and these files only compile to 32bit. So there could be performance missing there.

rdmd --force -I../ -m32 -O -inline -release benchmark.d
26.00s user 0.23s system 99% cpu 26.386 total
---
2048 md2 in 1003 milliseconds: 15.9521 Mib/s
32768 md4 in 682 milliseconds: 375.367 Mib/s
32768 md5 in 426 milliseconds: 600.939 Mib/s
8192 ripemd160 in 779 milliseconds: 82.1566 Mib/s
4096 sha1 in 276 milliseconds: 115.942 Mib/s
16777216 ints generated by mersenne twister in 1146 milliseconds: 446.771 Mib/s
256 ints generated by BlumBlumShub in 812 milliseconds: 0.00962131 Mib/s
1048576 texts blowfish encrypted in 645 milliseconds: 99.2248 Mib/s
65536 texts threefish encrypted in 2774 milliseconds: 5.76784 Mib/s
131072 texts AES128 encrypted in 896 milliseconds: 17.8571 Mib/s

rdmd --force -I../ -m32 benchmark.d
16.79s user 0.19s system 99% cpu 17.048 total
---
2048 md2 in 1546 milliseconds: 10.3493 Mib/s
32768 md4 in 1240 milliseconds: 206.452 Mib/s
32768 md5 in 1558 milliseconds: 164.313 Mib/s
8192 ripemd160 in 1535 milliseconds: 41.6938 Mib/s
4096 sha1 in 616 milliseconds: 51.9481 Mib/s
16777216 ints generated by mersenne twister in 1510 milliseconds: 339.073 Mib/s
256 ints generated by BlumBlumShub in 816 milliseconds: 0.00957414 Mib/s
1048576 texts blowfish encrypted in 1094 milliseconds: 58.5009 Mib/s
65536 texts threefish encrypted in 3316 milliseconds: 4.82509 Mib/s
131072 texts AES128 encrypted in 1945 milliseconds: 8.22622 Mib/s

(ldc && gdc REALLY hate building 32bit code...)

rdmd --compiler=ldmd2 --force -I../ -m32 -O -release -noboundscheck benchmark.d
2048 md2 in 570 milliseconds: 28.0702 Mib/s
32768 md4 in 765 milliseconds: 334.641 Mib/s
32768 md5 in 840 milliseconds: 304.762 Mib/s
8192 ripemd160 in 571 milliseconds: 112.084 Mib/s
4096 sha1 in 263 milliseconds: 121.673 Mib/s
16777216 ints generated by mersenne twister in 747 milliseconds: 685.408 Mib/s
core.exception.AssertError@/build/src/ldc-build/runtime/phobos/std/internal/math/biguintcore.d(2044): Assertion failure
real 0m8.957s
user 0m8.499s
sys 0m0.387s

rdmd --compiler=ldmd2 --force -I../ -m32 benchmark.d
2048 md2 in 2680 milliseconds: 5.97015 Mib/s
32768 md4 in 2088 milliseconds: 122.605 Mib/s
32768 md5 in 2465 milliseconds: 103.854 Mib/s
8192 ripemd160 in 2051 milliseconds: 31.2043 Mib/s
4096 sha1 in 742 milliseconds: 43.1267 Mib/s
16777216 ints generated by mersenne twister in 1580 milliseconds: 324.051 Mib/s
core.exception.AssertError@/build/src/ldc-build/runtime/phobos/std/internal/math/biguintcore.d(2044): Assertion failure
real 0m14.722s
user 0m14.412s
sys 0m0.230s

I think gdc died...

binary    /usr/lib/gcc/x86_64-unknown-linux-gnu/4.8.0/cc1d
version   v2.059
parse     benchmark
importall benchmark
import    (this line repeats many times)
semantic  benchmark
import
import
semantic2 benchmark
semantic3 benchmark
import
import
code      benchmark
/usr/bin/ld: cannot find -lgphobos2
collect2: error: ld returned 1 exit status
real 0m15.950s
user 0m15.629s
sys 0m0.190s

I managed to force dmd and (partial) ldc builds for -m64

rdmd --force -O -m64 -release -noboundscheck -I../ benchmark.d
14.29s user 0.19s system 99% cpu 14.553 total
2048 md2 in 1026 milliseconds: 15.5945 Mib/s
32768 md4 in 737 milliseconds: 347.354 Mib/s
32768 md5 in 1078 milliseconds: 237.477 Mib/s
8192 ripemd160 in 922 milliseconds: 69.4143 Mib/s
4096 sha1 in 309 milliseconds: 103.56 Mib/s
16777216 ints generated by mersenne twister in 1079 milliseconds: 474.513 Mib/s
256 ints generated by BlumBlumShub in 3661 milliseconds: 0.00213398 Mib/s
1048576 texts blowfish encrypted in 593 milliseconds: 107.926 Mib/s
65536 texts threefish encrypted in 2376 milliseconds: 6.73401 Mib/s
131072 texts AES128 encrypted in 874 milliseconds: 18.3066 Mib/s

2048 md2 in 587 milliseconds: 27.2572 Mib/s
32768 md4 in 675 milliseconds: 379.259 Mib/s
32768 md5 in 752 milliseconds: 340.426 Mib/s
8192 ripemd160 in 539 milliseconds: 118.738 Mib/s
4096 sha1 in 236 milliseconds: 135.593 Mib/s
16777216 ints generated by mersenne twister in 684 milliseconds: 748.538 Mib/s
core.exception.AssertError@/build/src/ldc-build/runtime/phobos/std/internal/math/biguintcore.d(2044): Assertion failure

dmd -O -release -m64 -noboundscheck
2048 md2 in 1079 milliseconds: 14.8285 Mib/s
32768 md4 in 804 milliseconds: 318.408 Mib/s
32768 md5 in 1042 milliseconds: 245.681 Mib/s
8192 ripemd160 in 972 milliseconds: 65.8436 Mib/s
4096 sha1 in 324 milliseconds: 98.7654 Mib/s
16777216 ints generated by mersenne twister in 1072 milliseconds: 477.612 Mib/s
256 ints generated by BlumBlumShub in 3611 milliseconds: 0.00216353 Mib/s
1048576 texts blowfish encrypted in 581 milliseconds: 110.155 Mib/s
65536 texts threefish encrypted in 2456 milliseconds: 6.51466 Mib/s
131072 texts AES128 encrypted in 878 milliseconds: 18.2232 Mib/s

Please hold while gdc is being recompiled....
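For anyone reading the figures, the "Mib/s" column appears to be mebibits per second: item count times bits per item, divided by 2^20, divided by elapsed seconds. That reading is inferred from the numbers rather than taken from benchmark.d, and it assumes 32-bit ints for the Mersenne twister line and the 64-bit Blowfish block size. The function name `mibPerSecond` is made up for this sketch.

```d
import std.stdio : writefln;

// Throughput in mebibits per second for `count` items of
// `bitsPerItem` bits each, processed in `millis` milliseconds.
double mibPerSecond(ulong count, ulong bitsPerItem, ulong millis)
{
    immutable double mebibits =
        cast(double)(count * bitsPerItem) / (1024.0 * 1024.0);
    return mebibits / (millis / 1000.0);
}

void main()
{
    // 16777216 ints (assumed 32 bits each) in 1146 ms,
    // from the dmd -m32 -O -inline -release run.
    writefln("%.3f Mib/s", mibPerSecond(16_777_216, 32, 1146));

    // 1048576 Blowfish texts (one 64-bit block each, assumed) in 645 ms.
    writefln("%.3f Mib/s", mibPerSecond(1_048_576, 64, 645));
}
```

Under those assumptions the two results should land close to the reported 446.771 and 99.2248. The same arithmetic applied to the hash lines suggests each message is 8192 bits, but that too is only an inference from the output.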
Aug 19 2012
On Sunday, 19 August 2012 at 23:48:36 UTC, 1100110 wrote:
> Here are my results! iirc -release implies -noboundscheck.. Also I am on x64, and these files only compile to 32bit. So there could be performance missing there.

Wow, thanks. It looks like ldc2 does not play nice with std.bigint, which is all the more reason for me to use my own version. If you want to see it run and not assert out, remove benchmark_bbs(); from main() in benchmark.d.

std.bigint seems to have a lot of problems, as I had to repeatedly mess around with things that SHOULD work. I think I should file a few bug reports :/

I think GDC is dying because I have scope imports scattered everywhere and it might not play nice with those... bah.

So it looks like ldc2 produces somewhat faster code, if not for the fact that it did not play nice with std.bigint and that gdc does not follow the reference compiler in its support of scope imports... :/

So basically my code is dmd only atm and can be easily converted to support ldc2, and maybe gdc if scope imports are the only problem...

On the topic of Whirlpool, I'm almost done with a naive non-optimised version, and just need to make the S-box mixin.
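For readers unfamiliar with the term, a scope (or scoped) import is an import statement placed inside a function or other block rather than at module level. A minimal sketch follows; `hexByte` is a made-up example function, not something from cryptod.

```d
// Module-level import: visible throughout the module.
import std.stdio : writeln;

string hexByte(ubyte b)
{
    // Scoped import: `format` is visible only inside hexByte.
    import std.string : format;
    return format("%02x", b);
}

void main()
{
    writeln(hexByte(0xAB)); // prints "ab"
}
```

If a compiler really does choke on these, hoisting the imports to module level is a mechanical change, which fits the guess above that gdc support would be easy to add if scope imports are the only problem.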
Aug 19 2012
Really? that was quick. I didn't get very far with my attempt. =P

On Sun, 19 Aug 2012 19:26:34 -0500, Nvirjskly <nvirjskly gmail.com> wrote:
> On Sunday, 19 August 2012 at 23:48:36 UTC, 1100110 wrote:
>> Here are my results! iirc -release implies -noboundscheck.. Also I am on x64, and these files only compile to 32bit. So there could be performance missing there.
>
> Wow, thanks. It looks like ldc2 does not play nice with std.bigint, which is all the more reason for me to use my own version. If you want to see it run and not assert out, remove benchmark_bbs(); from main() in benchmark.d
>
> std.bigint seems to have a lot of problems as I had to repeatedly mess around with things that SHOULD work. I think I should file a few bug reports :/
>
> I think GDC is dying because I have scope imports scattered everywhere and it might not play nice with those... bah.
>
> So it looks like ldc2 produces somewhat faster code, if not for the fact that it did not play nice with std.bigint and that gdc does not follow the reference compiler in its support of scope imports... :/
>
> So basically my code is dmd only atm and can be easily converted to support ldc2, and maybe gdc if scope imports are the only problem...
>
> On the topic of Whirlpool, I'm almost done a naive non-optimised version, and just need to make the S-box mixin.
Aug 19 2012
> Really? that was quick. I didn't get very far with my attempt. =P

Ok, committing a broken version. Broken in the sense that it does not work correctly as of yet. *Something* is not working properly. I'll work on finding out what exactly some more today and tomorrow, but it is in a "usable" state, where by usable I mean that it returns hash values, just not ones matching any test vectors... I have to figure out where my silly mistake is.
Aug 19 2012
On Monday, 20 August 2012 at 02:28:25 UTC, Nvirjskly wrote:
>> Really? that was quick. I didn't get very far with my attempt. =P
>
> Ok, committing a broken version. Broken in the sense that it does not work correctly as of yet. *Something* is not working properly. I work on finding out what exactly some more today and tomorrow, but it is in a "usable" state, whereby usable I mean that it returns hash values, just not ones matching any test vectors... I have to figure out where my silly mistake is.

Quickly replying that I already found a few mistakes, and am looking for more.
Aug 19 2012
I can't understand what command line arguments you are using for LDC and GDC, but both of them have many useful optimization arguments (some of them are not easy to use, like link-time optimization in LDC).

Bye,
bearophile
Aug 19 2012