
digitalmars.D.learn - Constructor performance in dmd

"FreeSlave" <freeslave93 gmail.com> writes:
Hello, I wrote a performance test for the different constructors 
of a simple struct and got some unexpected results.

Code:

//dmd vector.d -unittest
module vector;

debug(SVector)
{
     import std.stdio;
}

struct SVector(size_t dim, T = float)
{
private:
     T[dim] _arr;
public:
     alias dim dimension;
     alias dim size;

     this(this)
     {
         debug(SVector)
             writeln("copy constructor");
     }

     this(const T[dim] arr)
     {
         debug(SVector)
             writeln("constructor from static array");
         _arr = arr;
     }

     this(ref const T[dim] arr)
     {
         debug(SVector)
             writeln("constructor from ref static array");
         _arr = arr;
     }

     this(const(T)[] arr)
     {
         debug(SVector)
             writeln("constructor from dynamic array");
         assert(arr.length == dim);
         _arr = arr;
     }

     ref SVector opAssign(const SVector other)
     {
         debug(SVector)
             writeln("assign other vector");
         _arr = other._arr;
         return this;
     }

     ref SVector opAssign(ref const SVector other)
     {
         debug(SVector)
             writeln("assign other ref vector");
         _arr = other._arr;
         return this;
     }

     ref SVector opAssign(const T[dim] arr)
     {
         debug(SVector)
             writeln("assign static array");
         _arr = arr;
         return this;
     }

     ref SVector opAssign(ref const T[dim] arr)
     {
         debug(SVector)
             writeln("assign ref static array");
         _arr = arr;
         return this;
     }

     ref SVector opAssign(const(T)[] arr)
     {
         debug(SVector)
             writeln("assign dynamic array");
         assert(arr.length == dim);
         _arr = arr;
         return this;
     }
}


unittest
{
     import std.datetime;
     import std.stdio;
     void endWatch(ref StopWatch sw, string message)
     {
         sw.stop();
         writefln("%s: %s", message, sw.peek().msecs);
         sw.reset();
     }

     alias int scalar;
     alias SVector!(3, scalar) vec;

     writeln("SVector performance test");
     writefln("using %s as scalar type", scalar.stringof);

     StopWatch sw;

     enum n = 1_000_000;

     sw.start();
     foreach(i; 0..n)
     {
         vec temp = void;
     }
     endWatch(sw, "no init (void)");

     sw.start();
     foreach(i; 0..n)
     {
         vec temp;
     }
     endWatch(sw, "just init");

     sw.start();
     foreach(i; 0..n)
     {
         vec temp = vec.init;
     }
     endWatch(sw, "init from .init");

     sw.start();
     foreach(i; 0..n)
     {
         vec temp = vec();
     }
     endWatch(sw, "init from explicit constructor");

     sw.start();
     scalar[3] arr;
     foreach(i; 0..n)
     {
         vec temp = vec(arr);
     }
     endWatch(sw, "init from static array via explicit constructor");

     sw.start();
     vec v;
     foreach(i; 0..n)
     {
         vec temp = v;
     }
     endWatch(sw, "init from other ref vec");

     sw.start();
     foreach(i; 0..n)
     {
         vec temp = [1,2,3];
     }
     endWatch(sw, "init from static array");

     sw.start();
     foreach(i; 0..n)
     {
         vec temp = arr;
     }
     endWatch(sw, "init from ref static array");

     sw.start();
     scalar[] slice = arr[];
     foreach(i; 0..n)
     {
         vec temp = slice;
     }
     endWatch(sw, "init from slice");
}

int main()
{
     return 0;
}

Please note the 'scalar' alias, because I will change it later. Also 
note the 'T[dim] _arr;' definition in the SVector struct. Compile this 
example with dmd.

The tests show really bad performance when the struct is explicitly 
initialized with vec.init or vec(): it is nearly 25 times slower 
than 'just init'.

OK, let's change scalar from int to long.
Now initialization with vec.init is much faster, but still slower 
than 'just init'. It's weird that the 'long' version is faster to 
initialize. The same holds for the float/double pair: the 'double' 
version is initialized faster. Try it yourself! Initialization with 
vec() is still slow. But in theory all three ways should lead to 
the same result.
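To make the "same result" claim concrete, here is a minimal sketch showing that the three spellings produce equal values for an int-based struct. Vec3 is a hypothetical stand-in for SVector!(3, int), not the struct from the post, and the check is about values only; it says nothing about the code dmd generates for each form.

```d
// Hypothetical stand-in for SVector!(3, int); only the storage matters here.
struct Vec3
{
    int[3] arr;
}

// Returns true when all three spellings of default initialization
// produce the same value.
bool defaultInitsAgree()
{
    Vec3 a;             // "just init"
    Vec3 b = Vec3.init; // "init from .init"
    Vec3 c = Vec3();    // "init from explicit constructor"
    return a == b && b == c && a.arr == [0, 0, 0];
}

unittest
{
    assert(defaultInitsAgree());
}
```

It should be possible to run this with something like "dmd -main -unittest -run file.d". Note that the same equality check would fail for float, because the default value is NaN and NaN never compares equal to itself.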

Go back to 'int' and do some magic: add an explicit initializer to 
the static array defined in our struct:

T[dim] _arr = 0;

Now we have the best performance! Incredible. But it should be 
the same, right? After all, a default-constructed array of 
ints is all zeros anyway.
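A small sketch of that equivalence, assuming two hypothetical structs (Implicit and Explicit) that differ only in whether the initializer is written out. It checks the resulting values, not the speed.

```d
// Both names are made up for this illustration.
struct Implicit { int[3] arr; }     // relies on int.init, which is 0
struct Explicit { int[3] arr = 0; } // the "magic" variant with "= 0"

// Returns true when both structs default-initialize to the same contents.
bool sameDefaultValues()
{
    Implicit a;
    Explicit b;
    return int.init == 0 && a.arr == b.arr;
}

unittest
{
    assert(sameDefaultValues());
}
```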

Let's keep this zero and change scalar to float. Again, very 
good performance. Now change the initializer from zero to T.init:

T[dim] _arr = T.init;

It becomes 5 times slower than the zero version.
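One thing worth keeping in mind here: for float, "= 0" and "= T.init" are not the same value, because float.init is NaN. The sketch below illustrates this with two hypothetical templates, ZeroInit and TypeInit, which are not part of the original code.

```d
import std.math : isNaN;

// Hypothetical helper structs for this illustration only.
struct ZeroInit(T) { T[3] arr = 0; }
struct TypeInit(T) { T[3] arr = T.init; }

// For float the two initializers give different contents:
// zeros in one case, NaN in the other.
bool floatDefaultsDiffer()
{
    ZeroInit!float z;
    TypeInit!float t;
    return z.arr[0] == 0 && isNaN(t.arr[0]);
}

// For int both initializers give zeros.
bool intDefaultsMatch()
{
    ZeroInit!int z;
    TypeInit!int t;
    return z.arr == t.arr;
}

unittest
{
    assert(floatDefaultsDiffer());
    assert(intDefaultsMatch());
}
```

So the float comparison measures two different initial values, while the int comparison measures two spellings of the same value.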

I should also note that the -O option does not seem to help here.

ldc has pretty good results in all cases.
Dec 10 2013
Marco Leise <Marco.Leise gmx.de> writes:
On Tue, 10 Dec 2013 12:42:10 +0100,
"FreeSlave" <freeslave93 gmail.com> wrote:

> Let's leave this zero and change scalar to float. Again, very 
> good performance. Now change from zero to T.init:
> 
> T[dim] _arr = T.init;
> 
> It becomes 5 times slower than zero version.
> 
> Also I should notice it looks like -O option does not help here.
> 
> ldc has pretty good results in all cases.

I would think that NaNs are set explicitly for each element, while
"all zeroes" uses a fast zero-fill method (e.g. memset(ptr, 0, size)).
No idea about the other cases. I think DMD simply does not check for
duplicate initializations and lacks good code generation for struct
initialization and vector copies.

-- Marco
Dec 10 2013
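
The zero-fill explanation in the reply above can be illustrated with a rough sketch: a value can be reproduced by a plain zero fill only if every byte of its representation is zero, which holds for int.init and 0.0f but not for float.init (NaN). The helper names isAllZeroBytes and zeroFilled are made up for this example, and nothing here shows what dmd actually emits.

```d
import core.stdc.string : memset;

// True when every byte of the value's representation is zero,
// i.e. when a plain zero fill would reproduce the value.
bool isAllZeroBytes(T)(T value)
{
    auto bytes = cast(const(ubyte)*) &value;
    foreach (i; 0 .. T.sizeof)
    {
        if (bytes[i] != 0)
            return false;
    }
    return true;
}

// Builds a float[3] of zeros with a single memset call
// instead of assigning each element.
float[3] zeroFilled()
{
    float[3] a = void;          // skip default (NaN) initialization
    memset(a.ptr, 0, a.sizeof); // a.sizeof is the size of all three floats
    return a;
}

unittest
{
    assert(isAllZeroBytes(int.init));    // 0 is all zero bytes
    assert(isAllZeroBytes(0.0f));        // positive zero is all zero bytes
    assert(!isAllZeroBytes(float.init)); // NaN has non-zero exponent bits
    assert(zeroFilled() == [0.0f, 0.0f, 0.0f]);
}
```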