digitalmars.D - [core.reflect] showcase fqn


Stefan Koch <uplink.coder googlemail.com> writes:

TLDR; a non-optimized fqn using core.reflect is roughly 4 times 
faster than the phobos version.

I have had issues with the `fullyQualifiedName` template in 
Phobos (`std.traits`) for a while.
So I have implemented a version of it for `core.reflect.utils`, 
which uses the core.reflect transitive parent reflection.

For those who are not interested in the code, first comes a 
little performance comparison.

First with an ldc optimized -O3 build of the compiler, which is 
what you should use in a commercial setting:
```
-version=core_reflect -c fqn_reflect.d
  Time (mean ± σ):      19.6 ms ±   2.9 ms    [User: 14.5 ms, System: 5.3 ms]
  Range (min … max):    12.6 ms …  27.3 ms    142 runs

-version=phobos_fqn -c fqn_reflect.d
  Time (mean ± σ):      73.9 ms ±   2.5 ms    [User: 57.1 ms, System: 16.9 ms]
  Range (min … max):    64.9 ms …  80.7 ms    38 runs

Summary
  'generated/linux/release/64/dmd -version=core_reflect -c fqn_reflect.d' ran
    3.78 ± 0.57 times faster than 'generated/linux/release/64/dmd -version=phobos_fqn -c fqn_reflect.d'
```

And now with a dmd debug build of the compiler that I use for 
faster iteration when working on compiler features.

```
-version=core_reflect -c fqn_reflect.d
  Time (mean ± σ):      30.3 ms ±   2.4 ms    [User: 25.5 ms, System: 4.7 ms]
  Range (min … max):    22.7 ms …  40.9 ms    98 runs

-version=phobos_fqn -c fqn_reflect.d
  Time (mean ± σ):     120.2 ms ±   2.4 ms    [User: 104.9 ms, System: 15.1 ms]
  Range (min … max):   116.9 ms … 125.9 ms    24 runs

Summary
  'generated/linux/release/64/dmd -version=core_reflect -c fqn_reflect.d' ran
    3.97 ± 0.32 times faster than 'generated/linux/release/64/dmd -version=phobos_fqn -c fqn_reflect.d'
```

Now comes the source of `fqn_reflect.d` unedited this time to 
avoid typos.

```
module reflect.showcases.nicer.java.like.package_.structure.fqn_reflect;

struct U
{
    struct V
    {
        struct W{
            class C
            { int x;  }
        }
    }
}

version (core_reflect)
{
    import core.reflect.utils;
    static assert(fqn!(U.V.W.C) ==
        "reflect.showcases.nicer.java.like.package_.structure.fqn_reflect.U.V.W.C");
}
version (phobos_fqn)
{
    import std.traits;
    static assert(fullyQualifiedName!(U.V.W.C) ==
        "reflect.showcases.nicer.java.like.package_.structure.fqn_reflect.U.V.W.C");
}
```
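
For readers who do not have a core.reflect build at hand, the parent-walking idea can be approximated with the `__traits` that ship with current compilers. This is only a sketch: `parentChainName` is a name made up for this example, it is not how `core.reflect.utils.fqn` is implemented, and it ignores qualifiers, template instances and anything else beyond plain nested aggregates.

```d
import std.traits : fullyQualifiedName;

struct U
{
    struct V
    {
        struct W
        {
            class C { int x; }
        }
    }
}

// Hypothetical helper (name chosen for this sketch, not part of core.reflect):
// recursively prepends the name of each enclosing symbol until no parent is left.
template parentChainName(alias sym)
{
    static if (__traits(compiles, __traits(parent, sym))
        && !__traits(isSame, sym, __traits(parent, sym)))
    {
        enum parentChainName =
            parentChainName!(__traits(parent, sym)) ~ "." ~ __traits(identifier, sym);
    }
    else
    {
        // Outermost package or module: nothing left to prepend.
        enum parentChainName = __traits(identifier, sym);
    }
}

// For plain nested aggregates the result should match the Phobos template.
static assert(parentChainName!(U.V.W.C) == fullyQualifiedName!(U.V.W.C));
```

Every level of nesting costs one more template instantiation here, which is the kind of work a single reflection function can avoid.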

What about memory usage?

I am glad you asked. Memory usage is around 3 times lower.

Cheers,

Stefan

Oct 07 2021

bauss <jj_1337 live.dk> writes:

On Thursday, 7 October 2021 at 10:36:42 UTC, Stefan Koch wrote:
 TLDR; a non-optimized fqn using core.reflect is roughly 4 times 
 faster than the phobos version.


Why does it have to be abbreviated like fqn tho, instead of also 
just being named fullyQualifiedName? It's one of the things I 
dislike the most about C and I don't want it to infect D.

Otherwise after a while of using a couple of functions that are 
abbreviated you mix them up and forget which is which etc.

D mostly uses non-abbreviated names for functions etc. and I 
think it should stay that way. There are only some minor 
exceptions like "writeln" where line is abbreviated, but it's not 
such a big deal, since it's obvious what ln means.

Someone who has never seen the function "fqn" or even heard the 
word "fullyQualifiedName" will not know what it means and would 
have to look it up. It's not clear, even from context what it 
actually does/returns.

Oct 07 2021

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 7 October 2021 at 10:43:00 UTC, bauss wrote:
 Why does it have to be abbreviated like fqn tho, instead of 
 also just being named fullyQualifiedName? It's one of the 
 things I dislike the most about C and I don't want it to infect 
 D.

It doesn't have to be.
It's just that I tend to mistype longer words and prefer shorter 
abbreviations.
The name doesn't need to be `fqn`;
`FullyQualifiedName` would also work just fine.

Oct 07 2021

bauss <jj_1337 live.dk> writes:

On Thursday, 7 October 2021 at 10:54:13 UTC, Stefan Koch wrote:
 It doesn't have to be.
 It's just that I tend to mistype longer words and prefer 
 shorter abbreviations.
 The name doesn't need to be `fqn`;
 `FullyQualifiedName` would also work just fine.

Thanks! I really like the previews of core.reflection so far tho. 
It'll be so much better than using traits.

Oct 07 2021

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 7 October 2021 at 10:36:42 UTC, Stefan Koch wrote:
 TLDR; a non-optimized fqn using core.reflect is roughly 4 times 
 faster than the phobos version.

I went ahead and did a test on a somewhat bigger (auto-generated) 
testcase.

TLDR: On bigger testcases, where the constant overhead is less of 
a factor, `core.reflect` is roughly `11.5 times` faster.

I wanted to see whether the initial overhead of the Phobos version 
gets amortized, so that it performs better once it is run multiple 
times.

Here are the results:


```
uplink uplink-black:~/d/dmd(core_reflect)$ hyperfine \
    "generated/linux/release/64/dmd nested_structs.di -o- -version=phobos_fqn" \
    "generated/linux/release/64/dmd nested_structs.di -o- -version=core_reflect" \
    "generated/linux/release/64/dmd nested_structs.di -o- -version=no_fqn"

-o- -version=phobos_fqn
  Time (mean ± σ):      6.307 s ±  0.155 s    [User: 5.596 s, System: 0.706 s]
  Range (min … max):    6.138 s …  6.637 s    10 runs

-o- -version=core_reflect
  Time (mean ± σ):     792.2 ms ±  14.7 ms    [User: 716.4 ms, System: 75.3 ms]
  Range (min … max):   777.0 ms … 827.3 ms    10 runs

  Warning: Statistical outliers were detected. Consider re-running this
  benchmark on a quiet PC without any interferences from other programs.
  It might help to use the '--warmup' or '--prepare' options.

-o- -version=no_fqn
  Time (mean ± σ):     316.2 ms ±   5.7 ms    [User: 257.9 ms, System: 58.2 ms]
  Range (min … max):   311.5 ms … 329.0 ms    10 runs

Summary
  'generated/linux/release/64/dmd nested_structs.di -o- -version=no_fqn' ran
    2.51 ± 0.07 times faster than 'generated/linux/release/64/dmd nested_structs.di -o- -version=core_reflect'
   19.95 ± 0.61 times faster than 'generated/linux/release/64/dmd nested_structs.di -o- -version=phobos_fqn'
```

Note that `-version=no_fqn` doesn't do any fqn computation 
and merely parses the nested structs.
This gives you an idea of the constant overhead, 
which does not go away.

If we factor that in, we end up with the following numbers.
`baseline: no_fqn_min = 311 ms` -- that is essentially the time to 
parse and semantically analyze the file.
```
core_reflect_max = 827.3 ms
phobos_fqn_mean = 6307 ms
```
To adjust for the overhead we now subtract 311 from both values 
and get
```
core_reflect_self = 516.3
phobos_fqn_self = 5996
real_speedup = 5996 / 516.3 = ~11.6
```

Which shows that, with the constant overhead removed, 
`core.reflect` is roughly 11.6 times faster.
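
The overhead correction can be re-derived at compile time from the hyperfine figures above. This is a small sketch with constant names of my own choosing; it pairs the slowest `core_reflect` run (827.3 ms) and the mean `phobos_fqn` run (6307 ms) with the fastest `no_fqn` run (rounded to 311 ms). Subtracting the baseline leaves 5996 ms for the Phobos version, so the ratio comes out at about 11.6.

```d
// Timings in milliseconds, copied from the hyperfine output above.
enum double baselineMs    = 311.0;  // fastest no_fqn run (311.5 ms, rounded as in the post)
enum double coreReflectMs = 827.3;  // slowest core_reflect run
enum double phobosMs      = 6307.0; // mean phobos_fqn run

// Time attributable to the fqn computation itself.
enum double coreReflectSelf = coreReflectMs - baselineMs;
enum double phobosSelf      = phobosMs - baselineMs;
enum double realSpeedup     = phobosSelf / coreReflectSelf;

// Checked with a tolerance because these are floating point values.
static assert(realSpeedup > 11.5 && realSpeedup < 11.7);
```

Using the slowest run for one side and the mean for the other biases the comparison against `core.reflect`, so the figure is on the conservative side.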

Cheers,
Stefan

P.S. tests on even larger test-cases suggest that the real 
speedup drops down to `11.5`

In order for you to be able to verify at least the phobos_fqn and 
the no_fqn timings, I have published my testcase in the following 
gist:
https://gist.github.com/UplinkCoder/5acd25168238cac179a5c4ffdf945187

Memory use is `10 times` lower in the absolute measurement, 
and `30 times` lower when corrected for the no_fqn version as 
baseline.

Oct 07 2021

russhy <russhy gmail.com> writes:

Impressive results!

I wasn't sold initially, but that was because I didn't know what 
it was exactly; this totally is a game changer.

`__traits` is nice for simple stuff, but as soon as you try to do 
more complex logic it starts to become unmanageable, and I never 
remember the exact syntax. Hopefully `core.reflect` will solve 
that, and so far it looks like it's already set to replace 
`__traits` fully.

Oct 07 2021

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 7 October 2021 at 22:42:06 UTC, russhy wrote:
 Impressive results!

 I wasn't sold initially, but that was because i didn't know 
 what it was exactly, this totally is game changer

  __traits is nice for simple stuff, but as soon as you try to 
 do more complex logic it starts to become unmanageable, and i 
 never remember the exact syntax, hopefully ``core.reflect`` 
 will solve that, and so far looks like it's already set to 
 replace __traits fully

Thanks. It is good to hear that.

I just did a little work on the fqn utility.
This is the code which makes it print template instances, should a 
template instance happen to be a parent.
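```D
    // If the last parent we visited sits inside a template instance,
    // splice the instance's arguments into the result string.
    TemplateInstance ti = cast(TemplateInstance) lastParentDecl.getParent();
    if (ti)
    {
        string argString;
        foreach (arg; ti.arguments)
        {
            // Only integer and string literal arguments are handled so far.
            if (auto il = cast(IntegerLiteral) arg)
            {
                import std.conv : to;
                argString ~= to!string(il.value);
                argString ~= ", ";
            }
            else if (auto sl = cast(StringLiteral) arg)
            {
                argString ~= `"` ~ sl.value ~ `"`;
                argString ~= ", ";
            }
        }
        // Drop the trailing ", "; the length check keeps the slice in
        // bounds when no argument was printed.
        if (argString.length)
            argString = argString[0 .. $ - 2];
        result = ti.name ~ "!(" ~ argString ~ ")" ~ result[ti.name.length .. $];
        lastParentDecl = ti;
        d = cast(Declaration) ti.parent;
        if (d) goto Ldecl;
    }
```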

So if the template argument is not a string or an integer, this 
fqn function (and it is an actual function) 
would not be able to print it correctly.
However, support for other kinds is easy to add because it's just 
regular user-level code.
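
To illustrate how another argument kind could be added, here is a self-contained sketch of the same cast-based dispatch. The classes are local stand-ins written for this example and are not the core.reflect node types; `BoolLiteral` and `formatArgs` in particular are hypothetical names. It collects the pieces in an array and joins them, which avoids trimming a trailing separator.

```d
import std.array : join;
import std.conv : to;

// Local stand-ins so the example compiles on its own.
// They are NOT the core.reflect node classes; only the shape is borrowed.
class Node {}
class IntegerLiteral : Node { long value; this(long v) { value = v; } }
class StringLiteral : Node { string value; this(string v) { value = v; } }
class BoolLiteral : Node { bool value; this(bool v) { value = v; } } // hypothetical extra kind

// Renders template arguments as they would appear between "!(" and ")".
string formatArgs(Node[] args)
{
    string[] parts;
    foreach (arg; args)
    {
        if (auto il = cast(IntegerLiteral) arg)
            parts ~= to!string(il.value);
        else if (auto sl = cast(StringLiteral) arg)
            parts ~= `"` ~ sl.value ~ `"`;
        else if (auto bl = cast(BoolLiteral) arg)
            parts ~= to!string(bl.value);
        else
            parts ~= "?"; // placeholder for kinds not handled yet
    }
    // join inserts the separator only between elements.
    return parts.join(", ");
}

unittest
{
    Node[] args;
    args ~= new IntegerLiteral(42);
    args ~= new StringLiteral("x");
    args ~= new BoolLiteral(true);
    assert(formatArgs(args) == `42, "x", true`);
    assert(formatArgs(null) == "");
}
```

The unittest block runs with `dmd -unittest -main -run` on the file.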

Oct 07 2021
