digitalmars.D - size_t for length on x64 will make app slower than on x86?
- FrankLike (12/12) Nov 16 2014 Many old projects need move from x86 to x64,but the 'length' type
- Iain Buclaw via Digitalmars-d (5/17) Nov 16 2014 That's benchmarking C#, not D. :)
- Maxim Fomin (3/15) Nov 16 2014 It means where you have uint x = arr.length you should have had
- FrankLike (26/28) Nov 16 2014 I test it :
- Flamencofantasy (4/32) Nov 16 2014 I am not sure your test is significant; calling to!string and
- Matthias Bentrup (5/33) Nov 17 2014 I ran your test program through a profiler, and it spends >40% of
- FrankLike (42/42) Nov 17 2014 I test it:
- Freddy (3/45) Nov 17 2014 Don't profile without optimization.
- FrankLike (2/5) Nov 17 2014 I mean projects moved from x86 to x64, 'cast(int)length ' is
- Matthias Bentrup (4/9) Nov 18 2014 I think the reason for the existence of size_t, is that the C
- FrankLike (10/15) Nov 18 2014 But now 'int' is enough, not huge and not small.
- Marco Leise (11/34) Nov 18 2014 Somehow I always wrote that as
- bearophile (14/18) Nov 18 2014 Better:
- Marco Leise (6/30) Nov 18 2014 I know, _ doesn't cut it for 2D operations (2 loops) though or
- ketmar via Digitalmars-d (4/11) Nov 18 2014 the same as for `foreach (auto n; ...)` -- "cosmetic changes are not
- Xinok (4/16) Nov 16 2014 We're missing too many details regarding how he ran his
- Flamencofantasy (17/37) Nov 16 2014 That's correct. Moving 64 bit values on a 32 bit machine results
- Walter Bright (2/4) Nov 16 2014 -release does not turn on function inlining. Use -inline for that.
- ponce (13/25) Nov 17 2014 At least on x86, I would recommend casting size_t to "int" almost
- Marco Leise (64/80) Nov 18 2014 No, you will not get 'int' instead of 'size_t' in 2.067
Many old projects need to move from x86 to x64, but the type of 'length' is size_t, which changes size on x64, so a lot of work has to be done. I found some information that may be helpful for D: http://www.dotnetperls.com/array-length. It benchmarks Length against LongLength and finds that LongLength is slower than Length:

0.64 ns Length
2.55 ns LongLength

I love D, so I don't want my app to be slower on x64 than on x86. I hope this changes in 2.067.

Thank you all.
Nov 16 2014
On 16 November 2014 13:39, FrankLike via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Many old projects need to move from x86 to x64, but the 'length' type is size_t [...]
> 0.64 ns Length
> 2.55 ns LongLength

That's benchmarking C#, not D. :)

In D it's a field (there's no overhead).
Nov 16 2014
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
> Many old projects need to move from x86 to x64, but the 'length' type is size_t, it will change on x64, so a lot of work has to be done. [...]

It means that wherever you have uint x = arr.length, you should have had size_t x = arr.length from the very beginning.
Nov 16 2014
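The point about declaring lengths as size_t can be sketched as follows. This is a minimal illustration, not code from the thread: the function name `sumAll` and the sample array are assumptions made up for the example.

```d
import std.stdio : writeln;

/// Hypothetical helper: sums the elements using a size_t index,
/// which is the type of `.length` on every target.
long sumAll(const int[] arr)
{
    long total = 0;
    for (size_t i = 0; i < arr.length; i++)
        total += arr[i];
    return total;
}

void main()
{
    int[] arr = [1, 2, 3];

    auto n = arr.length;                        // `auto` infers size_t
    static assert(is(typeof(n) == size_t));

    // size_t follows the pointer width: 4 bytes with -m32, 8 bytes with -m64.
    static assert(size_t.sizeof == (void*).sizeof);

    // uint x = arr.length;   // accepted with -m32, rejected with -m64

    writeln(n, " ", sumAll(arr));               // prints "3 6"
}
```

Code written with `size_t` or `auto` compiles unchanged for both targets, so only the places that hard-coded `uint` or `int` need attention when migrating.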
> It means where you have uint x = arr.length you should have had size_t x = arr.length from the very beginning.

I tested it:

    module aatest;

    import std.stdio;
    import std.datetime;
    import std.conv;

    size_t[string] aa;

    void ada()
    {
        for (size_t i = 0; i < 1000000; i++)
        {
            aa[to!string(i)] = i;
        }
    }

    void main()
    {
        StopWatch sw;
        sw.start();
        ada();
        sw.stop();
        writeln("\n time is :", sw.peek().msecs / 1000.0, " secs");
    }

Built with dmd -m64 aatest.d and dmd aatest.d -ofaa32.exe.

Result:
m64: 0.553 secs
m32: 0.5 secs

Thank you all.
Nov 16 2014
I am not sure your test is significant; calling to!string and inserting into an AA is likely orders of magnitude slower than the overhead of shuffling a 64-bit value vs. a 32-bit value.

On Sunday, 16 November 2014 at 16:02:20 UTC, FrankLike wrote:
> I tested it: [...]
> Result: m64: 0.553 secs; m32: 0.5 secs
Nov 16 2014
On Sunday, 16 November 2014 at 16:02:20 UTC, FrankLike wrote:
> I tested it: [...]
> Result: m64: 0.553 secs; m32: 0.5 secs

I ran your test program through a profiler, and it spends >40% of the time in garbage collection. So I think the slightly longer run time is due to the 64-bit GC being a bit slower than the 32-bit GC.
Nov 17 2014
I tested it:

    module aasize_t;

    import std.stdio;
    import std.datetime;
    import std.conv;
    import std.string;

    size_t[string] aa;

    void gettime()
    {
        for (size_t i = 0; i < 3000000; i++)
        {
            aa[to!string(i)] = i;
        }
    }

    void main()
    {
        writeln("size_t.max", size_t.max);
        gettime();
        void getlen() { auto alne = aa.length; }
        auto r = benchmark!(getlen)(10000);
        auto f0Result = to!Duration(r[0]); // time f0 took to run 10,000 times
        writeln("\n size_t time is :", f0Result);
        StopWatch sw;
        sw.start();
        gettime();
        sw.stop();
        writeln("\n size_t time is sw:", sw.peek.msecs, " msecs");
    }

The other version is the same but uses uint[string] aa.

    dmd -m64 aauint.d
    dmd -m64 aasize_t.d
    dmd aaint.d -ofaauint32.exe
    dmd aasize_t.d -ofaasize_t32.exe
    del *.obj
    aaint
    aasize_t
    aaint32
    aasize_t32
    pause

Last result: they take almost the same time and use almost the same memory, but uint (or int) is more practical to use for length.
Nov 17 2014
On Monday, 17 November 2014 at 15:28:52 UTC, FrankLike wrote:
> I tested it: [...]
> Last result: they take almost the same time and use almost the same memory, but uint (or int) is more practical to use for length.

Don't profile without optimization. Add "-O -inline -release -boundscheck=off" to your dmd arguments.
Nov 17 2014
> Don't profile without optimization. Add "-O -inline -release -boundscheck=off" to your dmd arguments.

I mean that for projects moved from x86 to x64, 'cast(int)length' is better than 'size_t i=(something).length'.
Nov 17 2014
On Tuesday, 18 November 2014 at 07:04:50 UTC, FrankLike wrote:
> I mean that for projects moved from x86 to x64, 'cast(int)length' is better than 'size_t i=(something).length'.

I think the reason size_t exists is that the C designers thought the second way was better than the first.
Nov 18 2014
>> I mean that for projects moved from x86 to x64, 'cast(int)length' is better than 'size_t i=(something).length'.
> I think the reason for the existence of size_t is that the C designers thought that the second way is better than the first way.

But now 'int' is enough: neither too big nor too small. If you do this:

    string[] a = ["abc", "def", "ghk", ...]; // assuming a's length is 1,000,000
    for (int i = 0; i < a.length; i++)
    {
        somework();
    }

it's enough! 'int' is easy to write and not wasteful. Most importantly, it makes it easy to migrate code from x86 to x64.
Nov 18 2014
On Tue, 18 Nov 2014 12:22:58 +0000, "FrankLike" <1150015857 qq.com> wrote:
> But now 'int' is enough, not huge and not small. If you do this: [...]
> for(int i=0;i<a.length;i++) { somework(); }

Somehow I always wrote that as

    foreach (i; 0 .. a.length)
    {
        somework();
    }

and benefited from the fact that the compiler only needs to evaluate a.length once, as opposed to the for(...) case.

--
Marco
Nov 18 2014
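Marco's remark about the upper bound being evaluated once can be checked with a small sketch. Everything here is hypothetical and made up for the illustration: `bound()` stands in for `a.length` and counts how often it is called. For a plain array an optimizer may hoist `a.length` out of a `for` loop anyway, so this shows the language semantics rather than a guaranteed speed difference.

```d
import std.stdio : writeln;

int calls; // counts evaluations of bound()

/// Hypothetical stand-in for `a.length` that records each evaluation.
size_t bound()
{
    calls++;
    return 3;
}

/// foreach over a range evaluates both bounds exactly once.
int callsWithForeach()
{
    calls = 0;
    foreach (i; 0 .. bound()) {}
    return calls;
}

/// A for loop re-evaluates its condition before every iteration.
int callsWithFor()
{
    calls = 0;
    for (size_t i = 0; i < bound(); i++) {}
    return calls;
}

void main()
{
    writeln(callsWithForeach()); // prints 1
    writeln(callsWithFor());     // prints 4 (three passing checks plus the failing one)
}
```

In the `foreach` form the index type is also inferred from the bounds, so with `a.length` as the upper bound the index is a `size_t` without being spelled out.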
Marco Leise:
> foreach (i; 0 .. a.length)
> {
>     somework();
> }

Better:

    foreach (immutable _; 0 .. a.length)
    {
        somework();
    }

Unfortunately this syntax is not yet supported, for unknown reasons:

    foreach (; 0 .. a.length)
    {
        somework();
    }

Bye,
bearophile
Nov 18 2014
On Tue, 18 Nov 2014 19:33:42 +0000, "bearophile" <bearophileHUGS lycos.com> wrote:
> Better:
>
> foreach (immutable _; 0 .. a.length)
> {
>     somework();
> }
> [...]

I know, but _ doesn't cut it for 2D operations (2 loops), or you end up with _ and __ or _1 and _2.

--
Marco
Nov 18 2014
On Tue, 18 Nov 2014 19:33:42 +0000, bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Unfortunately this syntax is not yet supported, for unknown reasons:
>
> foreach (; 0 .. a.length)
> {
>     somework();
> }

the same as for `foreach (auto n; ...)` -- "cosmetic changes are not necessary".
Nov 18 2014
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
> [...] http://www.dotnetperls.com/array-length. It benchmarks Length against LongLength and finds that LongLength is slower than Length:
> 0.64 ns Length
> 2.55 ns LongLength

We're missing too many details regarding how he ran his benchmark. If he compiled and ran his code as 32-bit, that could explain the discrepancy.
Nov 16 2014
That's correct. Moving 64-bit values on a 32-bit machine results in at least 2 machine instructions. Moreover, Length is read with plain loads, whereas LongLength is a function call even in release mode.

    var length = array.Length;
    000007FE8E453AF2 mov rax,qword ptr [rsp+20h]
    000007FE8E453AF7 mov rax,qword ptr [rax+8]
    000007FE8E453AFB mov dword ptr [rsp+30h],eax

    var longLength = array.LongLength;
    000007FE8E453B56 mov rax,qword ptr [rsp+20h]
    000007FE8E453B5B cmp byte ptr [rax],0
    000007FE8E453B5E mov rcx,qword ptr [rsp+20h]
    000007FE8E453B63 call 000007FEEE082AB4
    000007FE8E453B68 mov qword ptr [rsp+68h],rax
    000007FE8E453B6D mov rax,qword ptr [rsp+68h]
    000007FE8E453B72 mov qword ptr [rsp+40h],rax

On Sunday, 16 November 2014 at 16:03:30 UTC, Xinok wrote:
> We're missing too many details regarding how he ran his benchmark. If he compiled and ran his code as 32-bit, that could explain the discrepancy.
Nov 16 2014
On 11/16/2014 8:20 AM, Flamencofantasy wrote:
> LongLength is a function call even in release mode.

-release does not turn on function inlining. Use -inline for that.
Nov 16 2014
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
> Many old projects need to move from x86 to x64, but the 'length' type is size_t [...]
> I love D, so I don't want my app to be slower on x64 than on x86.

At least on x86, I would recommend casting size_t to "int" almost every time for speed.

- signed overflow is undefined behaviour and optimizers can take advantage of it.
- 64-bit instructions on x86 take more bytes to encode. i-cache and instruction decoding suffer.
- 32-bit instructions on x86 fill the upper range with zeroes, so that false dependencies are eliminated.

For these reasons 32-bit ops on x86 are more often than not faster than "native"-sized int, contrary to what intuition would suggest. For better or worse, int has been made the fastest integer type by chip-makers.
Nov 17 2014
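If lengths are narrowed to int as suggested above, it may be worth knowing that a plain `cast(int)` drops the upper bits silently, while `std.conv.to` checks the range. The sketch below is only an illustration under assumptions: `fitsInInt` is a hypothetical helper, and the value `0x1_0000_0005` is a made-up length larger than 2^32 (held in a `ulong` so the example behaves the same with -m32 and -m64).

```d
import std.conv : ConvOverflowException, to;
import std.stdio : writeln;

/// Hypothetical helper: reports whether `value` survives narrowing to int,
/// relying on the range check performed by std.conv.to.
bool fitsInInt(ulong value)
{
    try
    {
        auto narrowed = value.to!int; // throws ConvOverflowException if out of range
        return true;
    }
    catch (ConvOverflowException)
    {
        return false;
    }
}

void main()
{
    ulong big = 0x1_0000_0005UL;    // made-up length above 2^32

    writeln(cast(int) big);         // prints 5: the cast truncates silently
    writeln(fitsInInt(big));        // prints false
    writeln(fitsInInt(1_000_000));  // prints true
}
```

For arrays that are known to stay far below 2^31 elements the cast is harmless; the checked conversion is mainly useful at boundaries where the size comes from outside the program.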
On Sun, 16 Nov 2014 13:39:22 +0000, "FrankLike" <1150015857 qq.com> wrote:
> Many old projects need to move from x86 to x64, but the 'length' type is size_t [...]
> 0.64 ns Length
> 2.55 ns LongLength
>
> I love D, so I don't want my app to be slower on x64 than on x86.
>
> Hope change in 2.067.

No, you will not get 'int' instead of 'size_t' in 2.067 because a dubious benchmark showed you it is faster. In fact, when you write the code like this and use 1000 times more iterations to get a reading at all, it looks like this:

--------------------------
import std.stdio;
import std.datetime;

alias ℕ = size_t;

void ada()
{
    foreach (ℕ i; 0 .. 1_000_000_000)
    {}
}

void main()
{
    StopWatch sw;
    sw.start();
    ada();
    sw.stop();
    writefln("time is: %s secs", sw.peek().msecs/1000.0);
}
------------------------

It prints 0.461 secs for both

    dmd -m32 -boundscheck=off -release -inline -O

and

    dmd -m64 -boundscheck=off -release -inline -O

on my laptop. When I change 'ada' to:

    ℕ ada()
    {
        ℕ v;
        foreach (ℕ i; 0 .. 1_000_000_000)
        {
            v = i+i;
        }
        return v;
    }

the -m64 version becomes a lot slower (0.731 secs) compared to the -m32 version (which stays at 0.461 secs). That does not have to do with size_t though: if I change the definition of ℕ to uint or int in the 64-bit version, it stays slow. It is just a difference in the generated code for the loop that makes the 64-bit version generally 270 ms slower.

Now, to get some more interesting numbers, let's choose an operation that is inherently O(n) with regard to bit-width: division.

    ℕ ada()
    {
        ℕ v;
        foreach (ℕ i; 1 .. 1_000_000_001)
        {
            v = i/i;
        }
        return v;
    }

Results:

    alias ℕ = ulong: 17.07 secs
    alias ℕ = uint:   5.80 secs
    alias ℕ = int:    5.53 secs

The differences for uint and int are compiler dependent. With LDC, uint is faster than int by a similar amount.

--
Marco
Nov 18 2014
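For readers who want to repeat the division measurement, the three variants can be put into one program by templating on the integer type. This is a sketch under assumptions, not Marco's original code: `divLoop` and `report` are made-up names, and `StopWatch` is imported from `std.datetime.stopwatch`, its location in current Phobos (the thread's code uses the 2014 `std.datetime` version). Timings depend heavily on compiler, flags and CPU, and an optimizing compiler may simplify `i / i`, so no expected numbers are given.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Runs `iterations` divisions in type T and returns the last quotient.
T divLoop(T)(T iterations)
{
    T v;
    foreach (T i; 1 .. iterations + 1)
    {
        v = i / i;
    }
    return v;
}

/// Hypothetical helper: times divLoop for one integer type and prints the result.
void report(T)(T iterations)
{
    auto sw = StopWatch(AutoStart.yes);
    immutable result = divLoop!T(iterations);
    sw.stop();
    writefln("%-5s result=%s time=%s ms", T.stringof, result,
             sw.peek.total!"msecs");
}

void main()
{
    report!ulong(1_000_000_000);
    report!uint(1_000_000_000);
    report!int(1_000_000_000);
}
```

Building once with `dmd -m32 -O -inline -release -boundscheck=off` and once with `-m64`, the flags quoted in the thread, allows the same comparison on other hardware.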