digitalmars.D.learn - Playing with Entity System, performance and D.
I was playing around with my ES and different ways of doing it in D. I ended up with a performance test of alias func vs ranges vs opApply. Code here: https://dpaste.dzfl.pl/a2eff240552f

Results on my machine (Win 10 x64), compiling with: dub run --build=release --arch=x86 --compiler=ldc2 (unable to test with gdc because https://forum.dlang.org/thread/bfmbvxtnqfhhgquayrro)

Alias func: ~40ms
Range type (front, popFront, empty): ~50ms
opApply: ~25ms

So first, if I did something really dumb related to profiling speed or any D code, let me know!

1) I thought that ranges were the fastest, but they produced the least performant code.
2) opApply was faster than alias func. That's surprising to me.
3) It's not possible to return multiple values, so in the front() method I wrapped a node of pointers. (Maybe a performance impact here? Is there a better way of doing it?)
4) There is no equivalent of declaring Type& x; (a ref local), right?
5) I've seen people say that opApply will be removed, but I find it very powerful (multiple values for foreach is a must, e.g. foreach(x, y; myrange)), and in this case it's faster than the others.

PS: I know the performance difference is minimal and may not be noticeable in most (all) use cases; this is just for fun/knowledge with profiling and to understand D's behavior :)
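[For readers unfamiliar with point 5: here is a minimal sketch of how opApply enables multiple loop variables in foreach. The `Pairs` struct and its fields are hypothetical illustrations, not the poster's actual ES code.]

```d
import std.stdio;

// Hypothetical container pairing two arrays, iterated in lockstep.
struct Pairs
{
    int[] xs, ys;

    // opApply receives the foreach body as a delegate taking two
    // ref parameters; a nonzero return from the delegate means the
    // loop body executed break/return, so we propagate it.
    int opApply(int delegate(ref int, ref int) dg)
    {
        foreach (i; 0 .. xs.length)
        {
            if (auto r = dg(xs[i], ys[i]))
                return r;
        }
        return 0;
    }
}

void main()
{
    auto p = Pairs([1, 2, 3], [10, 20, 30]);
    int sum;
    foreach (x, y; p)   // two loop variables, thanks to opApply
        sum += x * y;
    writeln(sum);       // 1*10 + 2*20 + 3*30 = 140
}
```

Ranges cannot express this directly; a range's front() returns a single value, which is why the poster resorted to wrapping a node of pointers.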
Jun 19 2017
You may find this interesting https://github.com/MrSmith33/datadriven
Jun 19 2017
On 06/19/2017 07:42 PM, SrMordred wrote:
> 1) I thought that ranges were the fastest, but they produced the least performant code.

I don't think ranges are expected to be faster than the others.

> 2) opApply was faster than alias func. That's surprising to me.

For me, alias_fun and op_apply are very close. If anything, alias_fun seems to be slightly faster. Typical output (ldc2 -release -O3):

----
36 Result(51)
44 Result(88)
37 Result(-127)
----

> 3) It's not possible to return multiple values, so in the front() method I wrapped a node of pointers. (Maybe a performance impact here? Is there a better way of doing it?)

Avoiding bounds checking makes it faster for me (but is unsafe of course):

----
return Node(&values.ptr[index_[0]], &results.ptr[index_[1]]);
----

Typical timing:

----
37 Result(74)
39 Result(113)
38 Result(-1)
----

Almost there, but still a bit slower than the others.

By the way, if I read it right, indexes is just `0 .. limit`, twice, isn't it? So there's no real point to it in the sample code. When I get rid of indexes and just count up to `limit`, all three versions perform the same. But I guess you're going to have arbitrary values in indexes when you actually use it.

> 4) There is no equivalent of declaring Type& x; (a ref local), right?

Right. ref is only for parameters and returns. For variables, you have to use pointers.
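[A short sketch of that last point: since D has no ref locals, a pointer is the usual substitute. Note that unlike a C++ reference, a pointer can also be reseated.]

```d
void main()
{
    int a = 1, b = 2;

    // No `ref int r = a;` in D; use a pointer instead.
    int* r = &a;
    *r = 10;     // modifies a through the pointer
    r = &b;      // pointers can be reseated, unlike C++ references
    *r = 20;     // now modifies b
    assert(a == 10 && b == 20);
}
```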
Jun 19 2017
On Monday, 19 June 2017 at 19:06:57 UTC, ag0aep6g wrote:
> For me, alias_fun and op_apply are very close. If anything, alias_fun seems to be slightly faster. Typical output (ldc2 -release -O3): ----
> Avoiding bounds checking makes it faster for me (but is unsafe of course):

I took a deeper look into dub. "--build=release" turns almost all optimization flags on, except noboundscheck. There is a "--build=release-nobounds", and with it the numbers got a lot closer (checked on another PC, so I won't post the numbers now).

> By the way, if I read it right, indexes is just `0 .. limit`, twice, isn't it? So there's no real point to it in the sample code. When I get rid of indexes and just count up to `limit`, all three versions perform the same. But I guess you're going to have arbitrary values in indexes when you actually use it.

Yep :) I chose to keep the indexes as 0..limit (and not, e.g., random) just so cache misses wouldn't interfere too much with the measurements. Thanks!
Jun 19 2017
On 06/20/2017 12:42 AM, SrMordred wrote:
> I took a deeper look into dub. "--build=release" turns almost all optimization flags on, except noboundscheck. There is a "--build=release-nobounds", and with it the numbers got a lot closer (checked on another PC, so I won't post the numbers now)

Note that -boundscheck=off undermines the @safe attribute: @safe code is no longer guaranteed to be memory safe. Using .ptr to avoid bounds checking doesn't have that effect, because it's not @safe from the start.
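[A minimal sketch of the distinction: @safe rejects unchecked .ptr indexing at compile time, whereas -boundscheck=off silently disables the runtime checks even inside @safe code.]

```d
// Bounds-checked access; the check is removed by -boundscheck=off,
// breaking @safe's memory-safety guarantee.
@safe int checked(int[] a, size_t i)
{
    return a[i];
}

// Unchecked access via .ptr; this would not compile inside @safe
// code, so the unsafety is visible in the type system.
int unchecked(int[] a, size_t i) @system
{
    return a.ptr[i];
}

void main()
{
    int[] v = [1, 2, 3];
    assert(checked(v, 1) == 2);
    assert(unchecked(v, 2) == 3);
}
```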
Jun 19 2017