digitalmars.D.learn - Passing arguments a template parameters vs. function parameters
- Craig Dillabaugh (47/47) Jun 11 2013 I was looking at D code for constructing kd-trees, the full code
- John Colvin (19/68) Jun 11 2013 It's just a trade-off between template bloat and having more
- Craig Dillabaugh (4/37) Jun 11 2013 Thanks. This makes sense. If I can find the time perhaps I will
- bearophile (21/32) Jun 11 2013 That code was adapted by me from C code written by another person
- John Colvin (6/14) Jun 11 2013 If I was a gambling man I'd put money on seeing a bigger change
I was looking at D code for constructing kd-trees, the full code listing for which can be found here: http://rosettacode.org/wiki/K-d_tree#Faster_Alternative_Version Of particular interest were the following functions: KdNode* makeTree(size_t dim, size_t i)(KdNode[] nodes) pure nothrow { if (!nodes.length) return null; auto n = findMedian!i(nodes); if (n != null) { enum i2 = (i + 1) % dim; immutable size_t nPos = n - nodes.ptr; n.left = makeTree!(dim, i2)(nodes[0 .. nPos]); n.right = makeTree!(dim, i2)(nodes[nPos + 1 .. $]); } return n; } // See QuickSelect method. KdNode* findMedian(size_t idx)(KdNode[] nodes) pure nothrow { auto start = nodes.ptr; auto end = &nodes[$ - 1] + 1; if (end <= start) return null; if (end == start + 1) return start; KdNode* md = start + (end - start) / 2; //clipped for sake of brevity. } A kd-tree is a search structure for storing multi-dimensional points. At each level it splits the input point set along a plane in one of the dimensions. In the code above this is the value pased as 'idx' to findMedian. findMedian finds the median point in the current dimension, and uses this to define the splitting plane for the points. At each level in the tree the splitting dimension is changed (which is what i/i2 is used for in makeTree()). If I had been coding this I would almost certainly have passed the splitting dimensions as a function parameter, since the compiler has to generate separate makeTree/findMedian functions for each dimension that i/idx takes in the program. So my question is, does anyone have any idea why the author may have used templates here. It was on Rosettacode so maybe they are just there to show off D's nice template abilities, but I am curious if in idiomatic D using templates this way is somehow superior to the function parameter method I would have tried. Cheers, Craig
Jun 11 2013
On Tuesday, 11 June 2013 at 20:46:32 UTC, Craig Dillabaugh wrote:I was looking at D code for constructing kd-trees, the full code listing for which can be found here: http://rosettacode.org/wiki/K-d_tree#Faster_Alternative_Version Of particular interest were the following functions: KdNode* makeTree(size_t dim, size_t i)(KdNode[] nodes) pure nothrow { if (!nodes.length) return null; auto n = findMedian!i(nodes); if (n != null) { enum i2 = (i + 1) % dim; immutable size_t nPos = n - nodes.ptr; n.left = makeTree!(dim, i2)(nodes[0 .. nPos]); n.right = makeTree!(dim, i2)(nodes[nPos + 1 .. $]); } return n; } // See QuickSelect method. KdNode* findMedian(size_t idx)(KdNode[] nodes) pure nothrow { auto start = nodes.ptr; auto end = &nodes[$ - 1] + 1; if (end <= start) return null; if (end == start + 1) return start; KdNode* md = start + (end - start) / 2; //clipped for sake of brevity. } A kd-tree is a search structure for storing multi-dimensional points. At each level it splits the input point set along a plane in one of the dimensions. In the code above this is the value pased as 'idx' to findMedian. findMedian finds the median point in the current dimension, and uses this to define the splitting plane for the points. At each level in the tree the splitting dimension is changed (which is what i/i2 is used for in makeTree()). If I had been coding this I would almost certainly have passed the splitting dimensions as a function parameter, since the compiler has to generate separate makeTree/findMedian functions for each dimension that i/idx takes in the program. So my question is, does anyone have any idea why the author may have used templates here. It was on Rosettacode so maybe they are just there to show off D's nice template abilities, but I am curious if in idiomatic D using templates this way is somehow superior to the function parameter method I would have tried. Cheers, CraigIt's just a trade-off between template bloat and having more values hard-coded in for efficiency. In this particular case, due to the use of the static array in KdNode, if idx is known at compile time then the optimizer can go to town on those loops in findMedian. e.g. in foreach (p; start .. end) { if (p.x[idx] < pivot) { if (p != store) swap(p.x, store.x); store++; } } given offset = p.x.offsetof + idx p.x[idx] becomes just *(p+offset) which is fast and completely predictable behaviour. Great for optimizers and cache prefetching etc... All that is meaningless guesswork though, the only thing that will tell you for sure is benchmarking and profiling.
Jun 11 2013
On Tuesday, 11 June 2013 at 21:26:35 UTC, John Colvin wrote:On Tuesday, 11 June 2013 at 20:46:32 UTC, Craig Dillabaugh wrote:clipI was looking at D code for constructing kd-trees, the full code listing for which can be found here:Thanks. This makes sense. If I can find the time perhaps I will run some benchmarks on it.So my question is, does anyone have any idea why the author may have used templates here. It was on Rosettacode so maybe they are just there to show off D's nice template abilities, but I am curious if in idiomatic D using templates this way is somehow superior to the function parameter method I would have tried. Cheers, CraigIt's just a trade-off between template bloat and having more values hard-coded in for efficiency. In this particular case, due to the use of the static array in KdNode, if idx is known at compile time then the optimizer can go to town on those loops in findMedian. e.g. in foreach (p; start .. end) { if (p.x[idx] < pivot) { if (p != store) swap(p.x, store.x); store++; } } given offset = p.x.offsetof + idx p.x[idx] becomes just *(p+offset) which is fast and completely predictable behaviour. Great for optimizers and cache prefetching etc... All that is meaningless guesswork though, the only thing that will tell you for sure is benchmarking and profiling.
Jun 11 2013
Craig Dillabaugh:If I had been coding this I would almost certainly have passed the splitting dimensions as a function parameter, since the compiler has to generate separate makeTree/findMedian functions for each dimension that i/idx takes in the program. So my question is, does anyone have any idea why the author may have used templates here. It was on Rosettacode so maybe they are just there to show off D's nice template abilities, but I am curious if in idiomatic D using templates this way is somehow superior to the function parameter method I would have tried.That code was adapted by me from C code written by another person (I think Ledrug, a very good C programmer). I am maintaining all the many D entries of Rosettacode (plus few in other languages). John Colvin explains the point of passing that argument as template argument instead of run-time argument. The number of dimensions is limited (often 2 or 3), to for this program the amount of generated template bloat is acceptable. Knowing the axis at compile-time should allows the compiler to perform better optimizations. Beside binary size, the other disadvantage of template bloat is a pressure increase on the 16 kb of the half of L1 code cache of the CPU. So there are always trade-offs.If I can find the time perhaps I will run some benchmarks on it.If you want to do benchmarks don't forget to do them with a better compiler, like LDC2, that is able to use that compile-time information to optimize better. When I have written that code I was using only DMD. I have just started to use LDC2 and I have not yet timed that particular program with LDC2. Bye, bearophile
Jun 11 2013
On Tuesday, 11 June 2013 at 22:13:18 UTC, bearophile wrote:If you want to do benchmarks don't forget to do them with a better compiler, like LDC2, that is able to use that compile-time information to optimize better. When I have written that code I was using only DMD. I have just started to use LDC2 and I have not yet timed that particular program with LDC2. Bye, bearophileIf I was a gambling man I'd put money on seeing a bigger change in DMD than in ldc2/gdc. I reckon dmd will make use of the compile time argument to optimise, whereas the more advanced compilers will be able to prove it's constant either way. I'm interested to know now!
Jun 11 2013