www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Passing arguments a template parameters vs. function parameters

reply "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
I was looking at D code for constructing kd-trees, the full code
listing for which can be found here:

http://rosettacode.org/wiki/K-d_tree#Faster_Alternative_Version

Of particular interest were the following functions:

KdNode* makeTree(size_t dim, size_t i)(KdNode[] nodes) pure
nothrow {
      if (!nodes.length)
          return null;

      auto n = findMedian!i(nodes);
      if (n != null) {
          enum i2 = (i + 1) % dim;
          immutable size_t nPos = n - nodes.ptr;
          n.left  = makeTree!(dim, i2)(nodes[0 .. nPos]);
          n.right = makeTree!(dim, i2)(nodes[nPos + 1 .. $]);
      }

      return n;
}

// See QuickSelect method.
KdNode* findMedian(size_t idx)(KdNode[] nodes) pure nothrow {
      auto start = nodes.ptr;
      auto end = &nodes[$ - 1] + 1;

      if (end <= start)
          return null;
      if (end == start + 1)
          return start;

      KdNode* md = start + (end - start) / 2;

      //clipped for sake of brevity.
}

A kd-tree is a search structure for storing multi-dimensional
points.  At each level it splits the input point set along a
plane in one of the dimensions. In the code above this is the
value pased as 'idx' to findMedian.  findMedian finds the median
point in the current dimension, and uses this to define the
splitting plane for the points. At each level in the tree the
splitting dimension is changed (which is what i/i2 is used for in
makeTree()).

If I had been coding this I would almost certainly have passed
the splitting dimensions as a function parameter, since the
compiler has to generate separate makeTree/findMedian functions
for each dimension that i/idx takes in the program.

So my question is, does anyone have any idea why the author may
have used templates here. It was on Rosettacode so maybe they are
just there to show off D's nice template abilities, but I am
curious if in idiomatic D using templates this way is somehow
superior to the function parameter method I would have tried.

Cheers,
Craig
Jun 11 2013
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 11 June 2013 at 20:46:32 UTC, Craig Dillabaugh wrote:
 I was looking at D code for constructing kd-trees, the full code
 listing for which can be found here:

 http://rosettacode.org/wiki/K-d_tree#Faster_Alternative_Version

 Of particular interest were the following functions:

 KdNode* makeTree(size_t dim, size_t i)(KdNode[] nodes) pure
 nothrow {
      if (!nodes.length)
          return null;

      auto n = findMedian!i(nodes);
      if (n != null) {
          enum i2 = (i + 1) % dim;
          immutable size_t nPos = n - nodes.ptr;
          n.left  = makeTree!(dim, i2)(nodes[0 .. nPos]);
          n.right = makeTree!(dim, i2)(nodes[nPos + 1 .. $]);
      }

      return n;
 }

 // See QuickSelect method.
 KdNode* findMedian(size_t idx)(KdNode[] nodes) pure nothrow {
      auto start = nodes.ptr;
      auto end = &nodes[$ - 1] + 1;

      if (end <= start)
          return null;
      if (end == start + 1)
          return start;

      KdNode* md = start + (end - start) / 2;

      //clipped for sake of brevity.
 }

 A kd-tree is a search structure for storing multi-dimensional
 points.  At each level it splits the input point set along a
 plane in one of the dimensions. In the code above this is the
 value pased as 'idx' to findMedian.  findMedian finds the median
 point in the current dimension, and uses this to define the
 splitting plane for the points. At each level in the tree the
 splitting dimension is changed (which is what i/i2 is used for 
 in
 makeTree()).

 If I had been coding this I would almost certainly have passed
 the splitting dimensions as a function parameter, since the
 compiler has to generate separate makeTree/findMedian functions
 for each dimension that i/idx takes in the program.

 So my question is, does anyone have any idea why the author may
 have used templates here. It was on Rosettacode so maybe they 
 are
 just there to show off D's nice template abilities, but I am
 curious if in idiomatic D using templates this way is somehow
 superior to the function parameter method I would have tried.

 Cheers,
 Craig
It's just a trade-off between template bloat and having more values hard-coded in for efficiency. In this particular case, due to the use of the static array in KdNode, if idx is known at compile time then the optimizer can go to town on those loops in findMedian. e.g. in foreach (p; start .. end) { if (p.x[idx] < pivot) { if (p != store) swap(p.x, store.x); store++; } } given offset = p.x.offsetof + idx p.x[idx] becomes just *(p+offset) which is fast and completely predictable behaviour. Great for optimizers and cache prefetching etc... All that is meaningless guesswork though, the only thing that will tell you for sure is benchmarking and profiling.
Jun 11 2013
parent "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Tuesday, 11 June 2013 at 21:26:35 UTC, John Colvin wrote:
 On Tuesday, 11 June 2013 at 20:46:32 UTC, Craig Dillabaugh 
 wrote:
 I was looking at D code for constructing kd-trees, the full 
 code
 listing for which can be found here:
clip
 So my question is, does anyone have any idea why the author may
 have used templates here. It was on Rosettacode so maybe they 
 are
 just there to show off D's nice template abilities, but I am
 curious if in idiomatic D using templates this way is somehow
 superior to the function parameter method I would have tried.

 Cheers,
 Craig
It's just a trade-off between template bloat and having more values hard-coded in for efficiency. In this particular case, due to the use of the static array in KdNode, if idx is known at compile time then the optimizer can go to town on those loops in findMedian. e.g. in foreach (p; start .. end) { if (p.x[idx] < pivot) { if (p != store) swap(p.x, store.x); store++; } } given offset = p.x.offsetof + idx p.x[idx] becomes just *(p+offset) which is fast and completely predictable behaviour. Great for optimizers and cache prefetching etc... All that is meaningless guesswork though, the only thing that will tell you for sure is benchmarking and profiling.
Thanks. This makes sense. If I can find the time perhaps I will run some benchmarks on it.
Jun 11 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Craig Dillabaugh:

 If I had been coding this I would almost certainly have passed
 the splitting dimensions as a function parameter, since the
 compiler has to generate separate makeTree/findMedian functions
 for each dimension that i/idx takes in the program.

 So my question is, does anyone have any idea why the author may
 have used templates here. It was on Rosettacode so maybe they 
 are
 just there to show off D's nice template abilities, but I am
 curious if in idiomatic D using templates this way is somehow
 superior to the function parameter method I would have tried.
That code was adapted by me from C code written by another person (I think Ledrug, a very good C programmer). I am maintaining all the many D entries of Rosettacode (plus few in other languages). John Colvin explains the point of passing that argument as template argument instead of run-time argument. The number of dimensions is limited (often 2 or 3), to for this program the amount of generated template bloat is acceptable. Knowing the axis at compile-time should allows the compiler to perform better optimizations. Beside binary size, the other disadvantage of template bloat is a pressure increase on the 16 kb of the half of L1 code cache of the CPU. So there are always trade-offs.
 If I can find the time perhaps I will run some benchmarks on it.
If you want to do benchmarks don't forget to do them with a better compiler, like LDC2, that is able to use that compile-time information to optimize better. When I have written that code I was using only DMD. I have just started to use LDC2 and I have not yet timed that particular program with LDC2. Bye, bearophile
Jun 11 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 11 June 2013 at 22:13:18 UTC, bearophile wrote:
 If you want to do benchmarks don't forget to do them with a 
 better compiler, like LDC2, that is able to use that 
 compile-time information to optimize better.

 When I have written that code I was using only DMD. I have just 
 started to use LDC2 and I have not yet timed that particular 
 program with LDC2.

 Bye,
 bearophile
If I was a gambling man I'd put money on seeing a bigger change in DMD than in ldc2/gdc. I reckon dmd will make use of the compile time argument to optimise, whereas the more advanced compilers will be able to prove it's constant either way. I'm interested to know now!
Jun 11 2013