digitalmars.D.learn - Playing with Entity System, performance and D.

reply SrMordred <patric.dexheimer gmail.com> writes:
I was playing around with my ES (entity system) and different ways of 
implementing it in D.

I ended up with a performance test of alias func vs ranges vs 
opApply.

code here:
https://dpaste.dzfl.pl/a2eff240552f

Results on my machine win 10 x64, compiling with:
dub run --build=release --arch=x86 --compiler=ldc2
(unable to test with gdc because 
https://forum.dlang.org/thread/bfmbvxtnqfhhgquayrro forum.dlang.org)

Alias func: ~40ms
Range Type(front, popFront, empty): ~50ms
OpApply: ~25ms

So first, if I made some really dumb mistake related to profiling or 
to the D code itself, let me know!

1) I thought that ranges would be the fastest, but they turned out to 
be the slowest.
2) opApply was faster than alias func. That's surprising to me.
3) It's not possible to return multiple values, so in the front() 
method I wrapped the pointers in a node. (Maybe there is a performance 
impact here? Is there a better way of doing it?)
4) There is no equivalent of declaring Type& x; (a ref type), right?
5) I have sometimes seen people saying that opApply will be removed. 
But I find it very powerful (multiple values in foreach are a must! 
e.g. foreach(x, y; myrange)), and in this case it is faster than the 
others.

PS: I know that the performance difference is minimal and may not 
be noticeable in most (all) use cases; this is just for 
fun/knowledge with profiling and to understand D's behavior :)
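
For readers without access to the paste, here is a minimal sketch of the three iteration styles being compared. It is not the benchmarked code: Storage, Pair, PairRange and eachPair are hypothetical names, and the layout (two arrays visited through an index list) is an assumption based on the description in this thread.

----
// Hypothetical stand-in for the benchmarked container.
struct Storage
{
    int[] values;
    int[] results;
    size_t[] indexes; // which slots to visit, in order

    // 3) opApply: foreach calls this and passes the loop body as a delegate.
    int opApply(scope int delegate(ref int value, ref int result) dg)
    {
        foreach (i; indexes)
        {
            // A non-zero return means the body did break/return: stop early.
            if (auto r = dg(values[i], results[i]))
                return r;
        }
        return 0;
    }
}

// 1) alias func: the callback is a compile-time parameter.
void eachPair(alias fun)(ref Storage s)
{
    foreach (i; s.indexes)
        fun(s.values[i], s.results[i]);
}

// 2) range: front() can return only one value, so the two
// elements are wrapped as pointers in a small struct.
struct Pair
{
    int* value;
    int* result;
}

struct PairRange
{
    Storage* s;
    size_t pos;

    bool empty() const { return pos >= s.indexes.length; }
    void popFront() { ++pos; }
    Pair front()
    {
        auto i = s.indexes[pos];
        return Pair(&s.values[i], &s.results[i]);
    }
}

void main()
{
    Storage s;
    s.values = [1, 2, 3];
    s.results = [0, 0, 0];
    s.indexes = [0, 1, 2];

    s.eachPair!((ref v, ref r) { r += v; })();           // alias func
    foreach (p; PairRange(&s)) { *p.result += *p.value; } // range
    foreach (ref v, ref r; s) { r += v; }                 // opApply

    // Every slot received its value three times.
    assert(s.results == [3, 6, 9]);
}
----

All three loops do the same work here; only the mechanism that hands the two elements to the loop body differs.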
Jun 19 2017
next sibling parent MrSmith <mrsmith33 yandex.ru> writes:
You may find this interesting 
https://github.com/MrSmith33/datadriven
Jun 19 2017
prev sibling parent reply ag0aep6g <anonymous example.com> writes:
On 06/19/2017 07:42 PM, SrMordred wrote:
 1) I thought that ranges would be the fastest, but they turned out to 
 be the slowest.
I don't think ranges are expected to be faster than the others.
 2) opApply was faster than alias func. That's surprising to me.
For me, alias_fun and op_apply are very close. If anything, alias_fun 
seems to be slightly faster.

Typical output (ldc2 -release -O3):
----
36 Result(51)
44 Result(88)
37 Result(-127)
----
 3) It's not possible to return multiple values, so in the front() method 
 I wrapped the pointers in a node. (Maybe there is a performance impact 
 here? Is there a better way of doing it?)
Avoiding bounds checking makes it faster for me (but is unsafe of course):
----
return Node(&values.ptr[index_[0]], &results.ptr[index_[1]]);
----

Typical timing:
----
37 Result(74)
39 Result(113)
38 Result(-1)
----

Almost there, but still a bit slower than the others.

By the way, if I read it right, indexes is just `0 .. limit`, twice, 
isn't it? So there's no real point to it in the sample code. When I get 
rid of indexes and just count up to `limit`, all three versions perform 
the same. But I guess you're going to have arbitrary values in indexes 
when you actually use it.
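
To make the difference concrete, here is a small sketch of checked versus unchecked element access. The function names are hypothetical and the Node struct is only modelled on the snippet above, not copied from the benchmark.

----
// Modelled on the Node in the snippet above (an assumption).
struct Node
{
    int* value;
    int* result;
}

// Indexing a slice with [] is bounds-checked when checks are enabled.
Node checkedNode(int[] values, int[] results, size_t i, size_t j)
{
    return Node(&values[i], &results[j]);
}

// .ptr yields a raw pointer; indexing a pointer is never bounds-checked,
// so the caller must guarantee that i and j are in range.
Node uncheckedNode(int[] values, int[] results, size_t i, size_t j) @system
{
    return Node(&values.ptr[i], &results.ptr[j]);
}

void main()
{
    auto values = [10, 20, 30];
    auto results = [0, 0, 0];

    auto a = checkedNode(values, results, 1, 2);
    auto b = uncheckedNode(values, results, 1, 2);

    // Both variants point at the same elements for valid indexes.
    assert(a.value is b.value && a.result is b.result);

    *b.result = *b.value;
    assert(results[2] == 20);
}
----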
 4) There is no equivalent of declaring Type& x; (a ref type), right?
Right. ref is only for parameters and returns. For variables, you have to use pointers.
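
A short sketch of what that looks like in practice, with hypothetical function names (first, bump). It reflects the rules as described in this thread; whether newer compiler releases relax them is not covered here.

----
// ref return: the caller gets the element itself, not a copy.
ref int first(int[] a)
{
    return a[0];
}

// ref parameter: modifies the caller's variable.
void bump(ref int x)
{
    ++x;
}

void main()
{
    int[] data = [1, 2, 3];

    // A ref return can be passed straight to a ref parameter.
    bump(first(data));
    assert(data[0] == 2);

    // For a local "reference", take the address and use a pointer,
    // roughly where C++ would write `int& p = first(data);`.
    int* p = &first(data);
    *p += 10;
    assert(data[0] == 12);
}
----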
Jun 19 2017
parent reply SrMordred <patric.dexheimer gmail.com> writes:
On Monday, 19 June 2017 at 19:06:57 UTC, ag0aep6g wrote:
 For me, alias_fun and op_apply are very close. If anything, 
 alias_fun seems to be slightly faster.

 Typical output (ldc2 -release -O3):
 ----
 Avoiding bounds checking makes it faster for me (but is unsafe 
 of course):
I took a deeper look into dub. "--build=release" turns on almost all optimization flags, except noboundscheck. There is a "--build=release-nobounds", and with it the numbers got a lot closer (I checked on another PC, so I won't post the numbers now).
 By the way, if I read it right, indexes is just `0 .. limit`, 
 twice, isn't it? So there's no real point to it in the sample 
 code. When I get rid of indexes and just count up to `limit`, 
 all three versions perform the same. But I guess you're going 
 to have arbitrary values in indexes when you actually use it.
Yep :) I chose to keep the indexes as 0..limit (and not, e.g., random) just so that cache misses wouldn't interfere too much with the measurements. Thanks!
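
As a rough illustration of the two index layouts mentioned here, the sketch below builds a sequential index list and a shuffled one over the same slots. The helper names are made up, and the fixed seed is only there to make runs repeatable; none of this is from the benchmark itself.

----
import std.algorithm.sorting : sort;
import std.array : array;
import std.random : Random, randomShuffle;
import std.range : iota;

// 0 .. limit in order: neighbouring iterations touch neighbouring memory.
size_t[] sequentialIndexes(size_t limit)
{
    return iota(limit).array;
}

// The same indexes in pseudo-random order: each slot is still visited
// exactly once, but accesses jump around in memory.
size_t[] shuffledIndexes(size_t limit, uint seed)
{
    auto idx = sequentialIndexes(limit);
    auto rng = Random(seed);
    randomShuffle(idx, rng);
    return idx;
}

void main()
{
    auto seq = sequentialIndexes(8);
    auto rnd = shuffledIndexes(8, 42);

    assert(seq == [size_t(0), 1, 2, 3, 4, 5, 6, 7]);

    // Shuffling only reorders: sorting gives back the sequential list.
    sort(rnd);
    assert(rnd == seq);
}
----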
Jun 19 2017
parent ag0aep6g <anonymous example.com> writes:
On 06/20/2017 12:42 AM, SrMordred wrote:
 I took a deeper look into dub.
 "--build=release" turns on almost all optimization flags, except 
 noboundscheck.
 There is a "--build=release-nobounds", and with it the numbers got a lot 
 closer (I checked on another PC, so I won't post the numbers now).
Note that -boundscheck=off undermines the @safe attribute. @safe code is no longer guaranteed to be memory safe. Using .ptr to avoid bounds checking doesn't have that effect, because it's not @safe from the start.
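
A minimal sketch of that distinction, with hypothetical function names. It assumes a default build with bounds checks enabled, where an out-of-range index in the @safe function raises a RangeError instead of reading past the array.

----
import core.exception : RangeError;

// @safe: indexing is checked. Building with -boundscheck=off would
// silently remove this check even though the function is still marked @safe.
int checkedGet(const(int)[] a, size_t i) @safe
{
    return a[i];
}

// @system: the missing check is visible in the source, and the function
// never claimed to be memory safe in the first place.
int uncheckedGet(const(int)[] a, size_t i) @system
{
    return a.ptr[i];
}

void main()
{
    int[] data = [1, 2, 3];
    assert(checkedGet(data, 2) == 3);
    assert(uncheckedGet(data, 2) == 3);

    // With checks enabled, the @safe version rejects a bad index.
    bool caught = false;
    try
    {
        checkedGet(data, 3);
    }
    catch (RangeError e)
    {
        caught = true;
    }
    assert(caught);
}
----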
Jun 19 2017