digitalmars.D - size_t for length on x64 will make app slower than on x86?

reply "FrankLike" <1150015857 qq.com> writes:
Many old projects need to move from x86 to x64, but the type of 'length'
is size_t, which changes size on x64, so a lot of work has to be done. But I
found some information that may be helpful for D:
http://www.dotnetperls.com/array-length
It says:
   They tested Length and LongLength, and found that LongLength is
slower than Length.

   0.64 ns   Length
   2.55 ns   LongLength

I love D, so I don't want my app to be slower on x64 than on x86.

I hope this changes in 2.067.

Thank you all.
Nov 16 2014
next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 16 November 2014 13:39, FrankLike via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 Many old projects need move from x86 to x64,but the 'length' type is
 size_t,it will change on x64,so a lot of work must to do.but I find some
 info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
   test length and longlength, and found 'test longlength' is  slower than
 'test length'.

   0.64 ns   Length
   2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
In D it's a field (there's no overhead).
Nov 16 2014
prev sibling next sibling parent reply "Maxim Fomin" <maxim-fomin outlook.com> writes:
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
 Many old projects need move from x86 to x64,but the 'length' 
 type is size_t,it will change on x64,so a lot of work must to 
 do.but I find some info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
   test length and longlength, and found 'test longlength' is  
 slower than 'test length'.

   0.64 ns   Length
   2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
It means where you have uint x = arr.length you should have had size_t x = arr.length from the very beginning.
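A minimal sketch of that point, assuming a hypothetical array `arr` and a hypothetical helper `checkedLength`. `std.conv.to` is used here for the range-checked narrowing; it is one option, not the only one:

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: narrow a length to uint with a range check.
// to!uint throws ConvOverflowException if the value does not fit.
uint checkedLength(T)(const T[] arr)
{
    return arr.length.to!uint;
}

void main()
{
    int[] arr = [1, 2, 3];

    // Portable: size_t is 32 bits with -m32 and 64 bits with -m64.
    size_t n = arr.length;

    // Compiles with -m32 only; with -m64 it is rejected as an
    // implicit narrowing conversion:
    // uint bad = arr.length;

    writeln(n, " ", checkedLength(arr));
}
```

Code that stores lengths in size_t from the start needs no change when the target switches; only places that really need a 32-bit value have to convert explicitly.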
Nov 16 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
 It means where you have uint x = arr.length you should have had 
 size_t x = arr.length from the very beginning.
I tested it:

module aatest;
import std.stdio;
import std.datetime;
import std.conv;

size_t[string] aa;

void ada()
{
	for(size_t i=0;i<1000000;i++)
	{
		aa[to!string(i)] = i;
	}
}
void main()
{
	StopWatch sw;
	sw.start();
	ada();
	sw.stop();
	writeln("\n time is :" , sw.peek().msecs/1000.0," secs");
}

dmd -m64 aatest.d
dmd aatest.d -ofaa32.exe

Result:
m64: 0.553 secs
m32: 0.5 secs

Thank you all.
Nov 16 2014
next sibling parent "Flamencofantasy" <Flamencofantasy gmail.com> writes:
I am not sure your test is significant; calling to!string and 
inserting into an AA is likely orders of magnitude slower than 
the overhead of shuffling a 64 bit value vs a 32 bit value.




On Sunday, 16 November 2014 at 16:02:20 UTC, FrankLike wrote:
 It means where you have uint x = arr.length you should have 
 had size_t x = arr.length from the very beginning.
 I test it:

 module aatest;
 import std.stdio;
 import std.datetime;
 import std.conv;

 size_t[string] aa;

 void ada()
 {
 	for(size_t i=0;i<1000000;i++)
 	{
 		aa[to!string(i)] = i;
 	}
 }
 void main()
 {
 	StopWatch sw;
 	sw.start();
 	ada();
 	sw.stop();
 	writeln("\n time is :" , sw.peek().msecs/1000.0," secs");
 }

 dmd -m64 aatest.d
 dmd aatest.d -ofaa32.exe

 Result:
 m64: 0.553 secs
 m32: 0.5 secs

 Thank you all.
Nov 16 2014
prev sibling parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Sunday, 16 November 2014 at 16:02:20 UTC, FrankLike wrote:
 It means where you have uint x = arr.length you should have 
 had size_t x = arr.length from the very beginning.
 I test it:

 module aatest;
 import std.stdio;
 import std.datetime;
 import std.conv;

 size_t[string] aa;

 void ada()
 {
 	for(size_t i=0;i<1000000;i++)
 	{
 		aa[to!string(i)] = i;
 	}
 }
 void main()
 {
 	StopWatch sw;
 	sw.start();
 	ada();
 	sw.stop();
 	writeln("\n time is :" , sw.peek().msecs/1000.0," secs");
 }

 dmd -m64 aatest.d
 dmd aatest.d -ofaa32.exe

 Result:
 m64: 0.553 secs
 m32: 0.5 secs

 Thank you all.
I ran your test program through a profiler, and it spends >40% of the time in garbage collection. So I think the slightly longer run time is due to the 64 bit GC being a bit slower than the 32 bit GC.
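One way to check how much of the run time is collection rather than insertion is to time the same loop with the collector switched off. A minimal sketch, assuming a current Phobos (`std.datetime.stopwatch` postdates this thread) and a hypothetical helper `fill` that repeats the insertion loop of the test program:

```d
import core.memory : GC;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical helper: same insertion loop as the test program.
size_t[string] fill(size_t count)
{
    size_t[string] aa;
    foreach (i; 0 .. count)
        aa[i.to!string] = i;
    return aa;
}

void main()
{
    enum count = 1_000_000;

    auto sw = StopWatch(AutoStart.yes);
    auto withGc = fill(count);
    immutable msWithGc = sw.peek.total!"msecs";

    // Suppress automatic collections (the runtime may still collect
    // if it runs out of memory).
    GC.disable();
    sw.reset();
    auto withoutGc = fill(count);
    immutable msWithoutGc = sw.peek.total!"msecs";
    GC.enable();

    writefln("GC on: %s ms, GC off: %s ms", msWithGc, msWithoutGc);
}
```

If the 32-bit and 64-bit builds come out close with the collector off, the difference measured earlier is more likely a GC effect than a cost of size_t itself.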
Nov 17 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
I tested it:

module aasize_t;
import std.stdio;
import std.datetime;
import std.conv;
import std.string;

size_t[string] aa;

void gettime()
{
	for(size_t i=0;i<3000000;i++)
	{
		aa[to!string(i)] = i;
	}
}
void main()
{
	writeln("size_t.max",size_t.max);
	gettime();
	void getlen(){auto alne = aa.length;}
	auto r = benchmark!(getlen)(10000);
	auto f0Result = to!Duration(r[0]); // time f0 took to run 10,000 times
	writeln("\n size_t time is :",f0Result);
	StopWatch sw;
	sw.start();
	gettime();
	sw.stop();
	writeln("\n size_t time is sw:",sw.peek.msecs," msecs");
}
---------- and another is the same with uint[string] aa

dmd -m64 aauint.d
dmd -m64 aasize_t.d
dmd aaint.d -ofaauint32.exe
dmd aasize_t.d -ofaasize_t32.exe

 del *.obj

aaint
aasize_t

aaint32
aasize_t32
 pause

Final result:

They take almost the same time and use almost the same memory, but uint (or int)
is more practical to use for length.
Nov 17 2014
parent reply "Freddy" <Hexagonalstar64 gmail.com> writes:
On Monday, 17 November 2014 at 15:28:52 UTC, FrankLike wrote:
 I test it:

 module aasize_t;
 import std.stdio;
 import std.datetime;
 import std.conv;
 import std.string;

 size_t[string] aa;

 void gettime()
 {
 	for(size_t i=0;i<3000000;i++)
 	{
 		aa[to!string(i)] = i;
 	}
 }
 void main()
 {  	writeln("size_t.max",size_t.max);
     gettime();
     void getlen(){auto alne = aa.length;}
 	auto r = benchmark!(getlen)(10000);
 	auto f0Result = to!Duration(r[0]); // time f0 took to run 10,000 times
 	writeln("\n size_t time is :",f0Result);
 	StopWatch sw;
 	sw.start();
 	gettime();
 	sw.stop();
 	writeln("\n size_t time is sw:",sw.peek.msecs," msecs");
 }
 ----------and anoter is uint[string] aa

 dmd -m64 aauint.d
 dmd -m64 aasize_t.d
 dmd aaint.d -ofaauint32.exe
 dmd aasize_t.d -ofaasize_t32.exe

  del *.obj

 aaint
 aasize_t

 aaint32
 aasize_t32
  pause

 Last Result:

 They take the almost same time,and usage memory. but uint(or 
 int) is more practical for length to use.
Don't profile without optimization. Add "-O -inline -release -boundscheck=off" to your dmd arguments.
Nov 17 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
 Don't profile with out optimzation.
 Add "-O -inline -release -boundscheck=off" to your dmd 
 arguments.
I mean that for projects moved from x86 to x64, 'cast(int)length' is better than 'size_t i = (something).length'.
Nov 17 2014
parent reply "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Tuesday, 18 November 2014 at 07:04:50 UTC, FrankLike wrote:
 Don't profile with out optimzation.
 Add "-O -inline -release -boundscheck=off" to your dmd 
 arguments.
 I mean projects moved from x86 to x64, 'cast(int)length ' is better than 'size_t i=(something).length '.
I think the reason for the existence of size_t is that the C designers thought the second way was better than the first.
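A small sketch of what the cast does once a length no longer fits in 32 bits. The length is a stand-in (a `ulong` constant rather than a real array), and `narrow` is a hypothetical helper:

```d
import std.stdio : writeln;

// Hypothetical helper: the 'cast(int)length' pattern.
// The cast keeps the low 32 bits and never reports a problem.
int narrow(ulong length)
{
    return cast(int) length;
}

void main()
{
    enum ulong small = 1_000_000;
    enum ulong large = 3_000_000_000UL; // possible as a length with -m64

    writeln(narrow(small)); // value preserved
    writeln(narrow(large)); // silently wraps to a negative number
}
```

With size_t the same value is simply stored as it is, which is the portability argument for the second spelling.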
Nov 18 2014
parent reply "FrankLike" <1150015857 qq.com> writes:
 I mean projects moved from x86 to x64, 'cast(int)length ' is 
 better than 'size_t i=(something).length '.
 I think the reason for the existence of size_t, is that the C designers thought that the second way is better than the first way.
But now 'int' is enough: not too big and not too small. If you do this:

string[] a = ["abc","def","ghk"... ]; // assuming a's length is 1,000,000
for(int i=0;i<a.length;i++)
{
    somework();
}

it's enough! 'int' is easy to write and wastes nothing. Most importantly, it makes it easy to migrate code from x86 to x64.
Nov 18 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 18 Nov 2014 12:22:58 +0000
schrieb "FrankLike" <1150015857 qq.com>:

 
 I mean projects moved from x86 to x64, 'cast(int)length ' is 
 better than 'size_t i=(something).length '.
 I think the reason for the existence of size_t, is that the C designers thought that the second way is better than the first way.
 But now 'int' is enough, not huge and not small. if you do this:
 string[] a ={"abc","def","ghk"... };//Assuming a's length is 1,000,000
 for(int i=0;i<a.length;i++)
 {
     somework();
 }
 it's enough! 'int' easy to write,not Waste. Most important is easy to migrate code from x86 to x64.
Somehow I always wrote that as

foreach (i; 0 .. a.length)
{
    somework();
}

and benefited from the fact that the compiler only needs to evaluate a.length once as opposed to the for(...) case.

-- 
Marco
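The difference in how often the bound is evaluated can be made visible with a counter. This is a sketch only: `bound` is a hypothetical stand-in for `a.length` that counts its own evaluations.

```d
import std.stdio : writeln;

size_t calls; // how many times the bound was evaluated

// Hypothetical stand-in for a.length that counts its own evaluations.
size_t bound()
{
    ++calls;
    return 5;
}

size_t countWithFor()
{
    calls = 0;
    for (size_t i = 0; i < bound(); i++) {}
    return calls;
}

size_t countWithForeach()
{
    calls = 0;
    foreach (i; 0 .. bound()) {}
    return calls;
}

void main()
{
    writeln("for:     ", countWithFor());     // condition re-evaluated on every iteration
    writeln("foreach: ", countWithForeach()); // upper bound evaluated once
}
```

With a plain array the optimizer may well hoist the length load out of a for loop anyway; the counter only shows what the language guarantees. The foreach index also takes its type from the bounds, so `i` is a size_t here without being spelled out.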
Nov 18 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Marco Leise:

 foreach (i; 0 .. a.length)
 {
     somework();
 }
Better:

foreach (immutable _; 0 .. a.length)
{
    somework();
}

Unfortunately this syntax is not yet supported, for unknown reasons:

foreach (; 0 .. a.length)
{
    somework();
}

Bye,
bearophile
Nov 18 2014
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 18 Nov 2014 19:33:42 +0000
schrieb "bearophile" <bearophileHUGS lycos.com>:

 Marco Leise:
 
 foreach (i; 0 .. a.length)
 {
     somework();
 }
 Better:

 foreach (immutable _; 0 .. a.length)
 {
     somework();
 }

 Unfortunately this syntax is not yet supported, for unknown reasons:

 foreach (; 0 .. a.length)
 {
     somework();
 }

 Bye,
 bearophile
I know, _ doesn't cut it for 2D operations (2 loops) though, or you end up with _ and __ or _1 and _2.

-- 
Marco
Nov 18 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 18 Nov 2014 19:33:42 +0000
bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Unfortunately this syntax is not yet supported, for unknown
 reasons:

 foreach (; 0 .. a.length)
 {
      somework();
 }
the same as for `foreach (auto n; ...)` -- "cosmetic changes are not necessary".
Nov 18 2014
prev sibling next sibling parent reply "Xinok" <xinok live.com> writes:
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
 Many old projects need move from x86 to x64,but the 'length' 
 type is size_t,it will change on x64,so a lot of work must to 
 do.but I find some info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
   test length and longlength, and found 'test longlength' is  
 slower than 'test length'.

   0.64 ns   Length
   2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
We're missing too many details regarding how he ran his benchmark. If he compiled and ran his code as 32-bit, that could explain the discrepancy.
Nov 16 2014
parent reply "Flamencofantasy" <Flamencofantasy gmail.com> writes:
That's correct. Moving 64 bit values on a 32 bit machine results 
in at least 2 machine instructions.


Length compiles to a direct field load, whereas LongLength is a function call even in release mode.


var length = array.Length;
000007FE8E453AF2  mov         rax,qword ptr [rsp+20h]
000007FE8E453AF7  mov         rax,qword ptr [rax+8]
000007FE8E453AFB  mov         dword ptr [rsp+30h],eax


var longLength = array.LongLength;
000007FE8E453B56  mov         rax,qword ptr [rsp+20h]
000007FE8E453B5B  cmp         byte ptr [rax],0
000007FE8E453B5E  mov         rcx,qword ptr [rsp+20h]
000007FE8E453B63  call        000007FEEE082AB4
000007FE8E453B68  mov         qword ptr [rsp+68h],rax
000007FE8E453B6D  mov         rax,qword ptr [rsp+68h]
000007FE8E453B72  mov         qword ptr [rsp+40h],rax



On Sunday, 16 November 2014 at 16:03:30 UTC, Xinok wrote:
 On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
 Many old projects need move from x86 to x64,but the 'length' 
 type is size_t,it will change on x64,so a lot of work must to 
 do.but I find some info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
  test length and longlength, and found 'test longlength' is  
 slower than 'test length'.

  0.64 ns   Length
  2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
We're missing too many details regarding how he ran his benchmark. If he compiled and ran his code as 32-bit, that could explain the discrepancy.
Nov 16 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/16/2014 8:20 AM, Flamencofantasy wrote:

 LongLenth is a function call even in release mode.
-release does not turn on function inlining. Use -inline for that.
Nov 16 2014
prev sibling next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Sunday, 16 November 2014 at 13:39:24 UTC, FrankLike wrote:
 Many old projects need move from x86 to x64,but the 'length' 
 type is size_t,it will change on x64,so a lot of work must to 
 do.but I find some info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
   test length and longlength, and found 'test longlength' is  
 slower than 'test length'.

   0.64 ns   Length
   2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
At least on x86, I would recommend casting size_t to "int" almost every time for speed:
- signed overflow is undefined behaviour and optimizers can take advantage of it.
- 64-bit instructions on x86 take more bytes to encode. i-cache and instruction decoding suffer.
- 32-bit instructions on x86 fill the upper range with zeroes, so that false dependencies are eliminated.

For these reasons 32-bit ops on x86 are more often than not faster than "native"-sized ints, contrary to what intuition would tell. For better or worse, int has been made the fastest integer type by chip-makers.
Nov 17 2014
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sun, 16 Nov 2014 13:39:22 +0000
schrieb "FrankLike" <1150015857 qq.com>:

 Many old projects need move from x86 to x64,but the 'length' type
 is size_t,it will change on x64,so a lot of work must to do.but I
 find some info which is help for d:
 http://www.dotnetperls.com/array-length.
 it means:
    test length and longlength, and found 'test longlength' is
 slower than 'test length'.

    0.64 ns   Length
    2.55 ns   LongLength

 I love D.So I don't want my app on x64 slower than on x86.

 Hope change in 2.067.

 Thank you all.
No, you will not get 'int' instead of 'size_t' in 2.067 because a dubious benchmark showed you it is faster.

In fact when you write the code like this and use 1000 times more iterations to get a reading at all, it looks like this:

--------------------------
import std.stdio;
import std.datetime;

alias ℕ = size_t;

void ada()
{
    foreach (ℕ i; 0 .. 1_000_000_000) {}
}

void main()
{
    StopWatch sw;
    sw.start();
    ada();
    sw.stop();
    writefln("time is: %s secs", sw.peek().msecs/1000.0);
}
------------------------

And prints 0.461 secs for both
  dmd -m32 -boundscheck=off -release -inline -O
and
  dmd -m64 -boundscheck=off -release -inline -O
on my laptop.

When I change 'ada' to:

ℕ ada()
{
    ℕ v;
    foreach (ℕ i; 0 .. 1_000_000_000)
    {
        v = i+i;
    }
    return v;
}

the -m64 version becomes a lot slower (0.731 secs) compared to the -m32 version (which stays at 0.461 secs). That does not have to do with size_t though: If I change the definition of ℕ to uint or int in the 64-bit version it stays slow. It is just a difference in the generated code for the loop that makes the 64-bit version generally 270 ms slower.

Now to get some more interesting numbers let's choose an operation that is inherently O(n) in regards to bit-width: division

ℕ ada()
{
    ℕ v;
    foreach (ℕ i; 1 .. 1_000_000_001)
    {
        v = i/i;
    }
    return v;
}

Results:

alias ℕ = ulong: 17.07 secs
alias ℕ = uint:   5.80 secs
alias ℕ = int:    5.53 secs

The differences for uint and int are compiler dependent. With LDC uint is faster than int by a similar amount.

-- 
Marco
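To rerun the division comparison without editing an alias between builds, the loop can be written once as a template. This is a sketch under assumptions: `divLoop` and `timeDivLoop` are hypothetical names, `std.datetime.stopwatch` is the current Phobos module rather than the one used in the post, and the iteration count is reduced to keep the run short.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical generic version of 'ada': the same division loop for any integer type.
T divLoop(T)(T count)
{
    T v;
    foreach (T i; 1 .. cast(T)(count + 1))
    {
        v = i / i;
    }
    return v;
}

// Times one instantiation and prints the type name with the elapsed time.
void timeDivLoop(T)(T count)
{
    auto sw = StopWatch(AutoStart.yes);
    immutable result = divLoop!T(count);
    writefln("%-5s %s ms (result %s)", T.stringof, sw.peek.total!"msecs", result);
}

void main()
{
    enum n = 100_000_000; // smaller than in the post, to keep the run short
    timeDivLoop!ulong(n);
    timeDivLoop!uint(n);
    timeDivLoop!int(n);
}
```

The numbers depend on compiler, flags and CPU, and an optimizer is free to simplify `i / i`, so the output is only meaningful when compared across builds made with the same settings.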
Nov 18 2014