
digitalmars.D - first numbers: type function vs template

Stefan Koch <uplink.coder googlemail.com> writes:
Good Evening,

I have been talking about type functions for a while now, and 
have claimed theoretical performance improvements over 
templates, as well as nicer syntax.

Unfortunately, in the example I am about to present, the type 
function syntax is not going to look better than the template.

---
class C0{}
class C1 : C0{}
// 497 class definitions omitted for brevity
alias CX = C498;

version (templ)
{
     template type_h_template(T)
     {
         static if (is(T S == super) && S.length)
         {
             enum type_h_template = (T.stringof ~ " -> " ~ S[0].stringof ~ "\n" ~ .type_h_template!(S));
         }
         else
         {
             enum type_h_template = "";
         }
     }

     static assert(type_h_template!(CX));
}
else
{
     string type_hierarchy(alias T)
     {
         string result;

         alias base_class;
         // for now this is a typefunction-only __trait
         base_class = __traits(getBaseClass, T);
         while(is(base_class))
         {
             result ~= T.stringof ~ " -> " ~ base_class.stringof ~ "\n";
             T = base_class;
             base_class = __traits(getBaseClass, T);
         }

         return result;
     }


    static assert(type_hierarchy(CX));
}
---

Note that the template recursion limit is 500, which is why we 
can only go up to C498; the type function can work with 
virtually unlimited hierarchy depth.
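The 497 omitted definitions form a straight single-inheritance chain, so the test file can be generated mechanically. A throwaway script along these lines (illustrative, not from the original post) would produce it:

```python
# Generate the linear class chain C0 .. C498 used in the benchmark.
def class_chain(n):
    """Return D source for a single-inheritance chain C0 .. Cn."""
    lines = ["class C0{}"]
    for i in range(1, n + 1):
        lines.append("class C%d : C%d{}" % (i, i - 1))
    return "\n".join(lines)

source = class_chain(498)
```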

I now run the following command to benchmark:
hyperfine "generated/linux/release/64/dmd test_getBaseClass.d -sktf -o-" "generated/linux/release/64/dmd test_getBaseClass.d -sktf -version=templ -o-" -w 20 -r 500


And it results in:

-sktf -o-
   Time (mean ± σ):      21.4 ms ±   2.8 ms    [User: 16.8 ms, System: 4.9 ms]
   Range (min … max):    14.0 ms …  29.3 ms    500 runs


-sktf -version=templ -o-
   Time (mean ± σ):      33.0 ms ±   2.6 ms    [User: 26.9 ms, System: 6.3 ms]
   Range (min … max):    24.9 ms …  41.6 ms    500 runs

Summary
   'generated/linux/release/64/dmd test_getBaseClass.d -sktf -o-' ran
     1.54 ± 0.24 times faster than 'generated/linux/release/64/dmd test_getBaseClass.d -sktf -version=templ -o-'

This shows that the type function is about 1.5x faster for the 
chosen task, given naive implementations of both the type 
function and the template.
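As a sanity check, the ratio hyperfine reports can be recomputed from the two mean times (a back-of-the-envelope calculation, not part of the original post):

```python
# Mean wall-clock times reported by hyperfine, in milliseconds.
template_mean_ms = 33.0
type_function_mean_ms = 21.4

# Speedup of the type function relative to the template.
speedup = template_mean_ms / type_function_mean_ms
print("%.2fx" % speedup)
```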

In terms of memory, a quick best-of-three measurement reveals:
19080k for the template and
14328k for the type function.

That is a reduction of roughly 25%.
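The 25% figure follows directly from the two measurements (again, just a sanity check, not part of the original post):

```python
# Peak memory figures from the post, in kilobytes.
template_kb = 19080
type_function_kb = 14328

# Relative reduction achieved by the type function.
reduction = (template_kb - type_function_kb) / template_kb
print("%.1f%%" % (reduction * 100))
```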

I am rather pleased by those numbers, given that type functions 
are still very much proof of concept and rather unoptimized.
Aug 21 2020
Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 21 August 2020 at 19:13:30 UTC, Stefan Koch wrote:
 In terms of memory a quick best out of three reveals:
 19080k for the template  and
 14328k for the type function.

 Which is a reduction by roughly 25%.

 I am rather pleased by those numbers, given that type functions 
 are still very much proof of concept and rather unoptimized.
Nice. BTW, is newCTFE's latency too high to give further speedups for this type function?
Aug 21 2020
Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 21 August 2020 at 22:05:31 UTC, Per Nordlöw wrote:
 On Friday, 21 August 2020 at 19:13:30 UTC, Stefan Koch wrote:
 In terms of memory a quick best out of three reveals:
 19080k for the template  and
 14328k for the type function.

 Which is a reduction by roughly 25%.

 I am rather pleased by those numbers, given that type 
 functions are still very much proof of concept and rather 
 unoptimized.
Nice. BTW, is newCTFE's latency too high to give further speedups for this type function?
I am not sure. It would depend on how much your introspection 
has to do, and how many types you have to serialize.

newCTFE is an independent virtual machine with its own ABI. In 
order for it to work with types, the types would have to be 
serialized and given a binary representation suitable for the VM 
to work with.

Currently I am undecided on how that binary representation 
should look, since that depends on what properties I want type 
functions to be able to use (type size, type members, data 
layout, name, mangle, UDAs, base classes, vtbls, and so on). I 
hope that Andrei's work on type info can inform my decision on 
that. But all that is far in the future.
Aug 21 2020
Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 21 August 2020 at 19:13:30 UTC, Stefan Koch wrote:
 And it results in:

 test_getBaseClass.d -sktf -o-
   Time (mean ± σ):      21.4 ms ±   2.8 ms    [User: 16.8 ms, System: 4.9 ms]
   Range (min … max):    14.0 ms …  29.3 ms    500 runs


 test_getBaseClass.d -sktf -version=templ -o-
   Time (mean ± σ):      33.0 ms ±   2.6 ms    [User: 26.9 ms, System: 6.3 ms]
   Range (min … max):    24.9 ms …  41.6 ms    500 runs

 Summary
   'generated/linux/release/64/dmd test_getBaseClass.d -sktf -o-' ran
     1.54 ± 0.24 times faster than 'generated/linux/release/64/dmd test_getBaseClass.d -sktf -version=templ -o-'
Actually, I was misled by the results. I forgot to subtract the 
overhead.

A file with just the content "class C0{}" takes, on the same 
machine under comparable load, 8.5 milliseconds to compile. 
Let's be generous and round that down to 8 milliseconds.

That means the real speedup is (33.0 - 8) / (21.4 - 8), which 
puts us at almost 1.9x on average.

If we compare the two min values, (24.9 - 8) / (14.0 - 8), we 
are even at 2.8x.
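The overhead-adjusted figures can be checked in a couple of lines (a sanity check, not part of the original post; the 8 ms value is the rounded baseline from above):

```python
overhead_ms = 8.0  # rounded compile time of a file containing only "class C0{}"

# Mean times: template vs. type function, with overhead subtracted.
mean_speedup = (33.0 - overhead_ms) / (21.4 - overhead_ms)
# Min times: template vs. type function, with overhead subtracted.
min_speedup = (24.9 - overhead_ms) / (14.0 - overhead_ms)

print("mean: %.2fx, min: %.2fx" % (mean_speedup, min_speedup))
```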
Aug 21 2020