digitalmars.D.learn - CTFE and Static If Question

jmh530 (36/36) May 07 2020 I am curious how ctfe and static ifs interact. In particular, if

H. S. Teoh (37/75) May 07 2020 Then you should find this article helpful:

jmh530 (2/3) May 07 2020 You explained things very well, thanks.

ag0aep6g (25/57) May 07 2020 The `static if` is guaranteed to be evaluated during compilation. That

jmh530 (11/36) May 07 2020 Thanks for the reply.
H. S. Teoh (39/40) May 07 2020 [...]

Paul Backus (9/12) May 07 2020 You can look at the generated code using the Compiler Explorer at

jmh530 (56/70) May 07 2020 Thanks for that. I forgot how much nicer godbolt is for looking

jmh530 <john.michael.hall gmail.com> writes:

I am curious how ctfe and static ifs interact. In particular, if 
an enum bool passed as a template parameter or run-time one will 
turn an if statement into something like a static if statement 
(perhaps after the compiler optimizes other code away). In the 
code below, I am a function that takes a bool as a template 
parameter and another that has it as a run-time parameter. In the 
first, I just pass an compile-time bool (y0) into it. In the 
second I have run-time (x0) and compile-time (y0).

Does foo!y0(rt) generate the same code as foo(rt, y0)?

How is the code generated by foo(rt, x0) different from 
foo(rt,y0)?

auto foo(bool rtct)(int rt) {
     static if (rtct)
         return rt + 1;
     else
         return rt;
}

auto foo(int rt, bool rtct) {
     if (rtct == true)
	    return rt + 1;
     else
         return rt;
}

void main() {
     int rt = 3;
     bool x0 = true;
     bool x1 = false;
     assert(foo(rt, x0) == 4);
     assert(foo(rt, x1) == 3);

     enum y0 = true;
     enum y1 = false;
     assert(foo!y0(rt) == 4);
     assert(foo!y1(rt) == 3);
     assert(foo(rt, y0) == 4);
     assert(foo(rt, y1) == 3);
}

May 07 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, May 07, 2020 at 03:00:18PM +0000, jmh530 via Digitalmars-d-learn wrote:
 I am curious how ctfe and static ifs interact.

Then you should find this article helpful:

	https://wiki.dlang.org/User:Quickfur/Compile-time_vs._compile-time


 In particular, if an enum bool passed as a template parameter or
 run-time one will turn an if statement into something like a static if
 statement (perhaps after the compiler optimizes other code away). In
 the code below, I am a function that takes a bool as a template
 parameter and another that has it as a run-time parameter. In the
 first, I just pass an compile-time bool (y0) into it. In the second I
 have run-time (x0) and compile-time (y0).

There are multiple things here that should not be confused:

1) Template instantiation (includes static if): a template takes some
   number of compile-time arguments, and emits multiple copies of the
   template definition, with said arguments substituted for its
   parameters. Templates are processed early on in the compilation
   process, and can change the structure of the AST that the compiler
   sees.

2) CTFE: runs code that has already been compiled into a form that's
   ready to be emitted as runtime code, and feeds the return value back
   to the compiler so that it can be used as a compile-time value
   somewhere else.  Note that CTFE can only run code that has *passed*
   the template instantiation stage (even though the resulting value
   *can* be used for instantiation of other templates, but it cannot
   change the CTFE'd code itself).

3) Code optimization: this takes place independently of (1) and (2), and
   happens as part of code generation. It basically tries to transform
   the machine code so that it runs more efficiently at runtime.  It may
   simplify statements or eliminate redundant / dead code, but this has
   nothing to do with templates, static if, or CTFE; it's a
   transformation applied toward the end of the compilation process and
   it's universally applied to all runtime code (unless you disable
   optimizations).


 Does foo!y0(rt) generate the same code as foo(rt, y0)?

It generates a different AST structure.  But at the end, the optimizer
may reduce them into the same code, depending on the circumstances.


 How is the code generated by foo(rt, x0) different from foo(rt,y0)?

There is no difference, except that in the latter case, the optimizer
may realize that the argument is a compile-time constant, and inline the
function with its return value instead of emitting a runtime call. This
has nothing to do with templates or CTFE, it's purely an optimizer issue
(and different optimizers may do things differently -- it does not
change the meaning of the code).


 auto foo(bool rtct)(int rt) {
     static if (rtct)
         return rt + 1;
     else
         return rt;
 }
 
 auto foo(int rt, bool rtct) {
     if (rtct == true)
 	    return rt + 1;
     else
         return rt;
 }
 
 void main() {
     int rt = 3;
     bool x0 = true;
     bool x1 = false;
     assert(foo(rt, x0) == 4);
     assert(foo(rt, x1) == 3);
 
     enum y0 = true;
     enum y1 = false;
     assert(foo!y0(rt) == 4);
     assert(foo!y1(rt) == 3);
     assert(foo(rt, y0) == 4);
     assert(foo(rt, y1) == 3);
 }

[...]


T

-- 
IBM = I Blame Microsoft

May 07 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 7 May 2020 at 15:29:01 UTC, H. S. Teoh wrote:
 [snip]

You explained things very well, thanks.

May 07 2020

ag0aep6g <anonymous example.com> writes:

On 07.05.20 17:00, jmh530 wrote:
 Does foo!y0(rt) generate the same code as foo(rt, y0)?
 
 How is the code generated by foo(rt, x0) different from foo(rt,y0)?
 
 auto foo(bool rtct)(int rt) {
      static if (rtct)
          return rt + 1;
      else
          return rt;
 }
 
 auto foo(int rt, bool rtct) {
      if (rtct == true)
          return rt + 1;
      else
          return rt;
 }
 
 void main() {
      int rt = 3;
      bool x0 = true;
      bool x1 = false;
      assert(foo(rt, x0) == 4);
      assert(foo(rt, x1) == 3);
 
      enum y0 = true;
      enum y1 = false;
      assert(foo!y0(rt) == 4);
      assert(foo!y1(rt) == 3);
      assert(foo(rt, y0) == 4);
      assert(foo(rt, y1) == 3);
 }

The `static if` is guaranteed to be evaluated during compilation. That 
means, `foo!y0` effectively becomes this:

     auto foo(int rt) { return rt + 1; }

There is no such guarantee for `foo(rt, y0)`. It doesn't matter that y0 
is an enum.

But a half-decent optimizer will have no problem replacing all your 
calls with their results. Compared with LDC and GDC, DMD has a poor 
optimizer, but even DMD turns this:

     int main() {
         int rt = 3;
         bool x0 = true;
         bool x1 = false;
         enum y0 = true;
         enum y1 = false;
         return
             foo(rt, x0) +
             foo(rt, x1) +
             foo!y0(rt) +
             foo!y1(rt) +
             foo(rt, y0) +
             foo(rt, y1);
     }

into this:

     int main() { return 21; }

May 07 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 7 May 2020 at 15:34:21 UTC, ag0aep6g wrote:
 [snip]

 The `static if` is guaranteed to be evaluated during 
 compilation. That means, `foo!y0` effectively becomes this:

     auto foo(int rt) { return rt + 1; }

 There is no such guarantee for `foo(rt, y0)`. It doesn't matter 
 that y0 is an enum.

 But a half-decent optimizer will have no problem replacing all 
 your calls with their results. Compared with LDC and GDC, DMD 
 has a poor optimizer, but even DMD turns this:

     int main() {
         int rt = 3;
         bool x0 = true;
         bool x1 = false;
         enum y0 = true;
         enum y1 = false;
         return
             foo(rt, x0) +
             foo(rt, x1) +
             foo!y0(rt) +
             foo!y1(rt) +
             foo(rt, y0) +
             foo(rt, y1);
     }

 into this:

     int main() { return 21; }

Thanks for the reply.

The particular use case I'm thinking of is more like below where 
some function bar (that returns a T) is called before the return. 
Not sure if that matters or not for inlining the results.

T foo(T[] x, bool y) {
     if (y)
         return bar(x) / x.length;
     else
         return bar(x) / (x.length - 1);
}

May 07 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, May 07, 2020 at 05:34:21PM +0200, ag0aep6g via Digitalmars-d-learn
wrote:
[...]
 Compared with LDC and GDC, DMD has a poor optimizer,

[...]

DMD's optimizer is the whipping boy around here, but to be fair, it is
poor only when it comes to aggrssive inlining and optimizing loops.
Other than those two areas, it's actually a pretty decent optimizer.

And even within those two areas, it actually does do some degree of
optimization, it's not completely useless.  Its main fault is caused by
what I call the domino-effect of optimization: optimal code is often
arrived at by applying a chain of optimization steps, each of which
depends on (some of) the previous step(s). You can think of previous
optimization steps as a domino that knocks down subsequent dominoes
(enables further optimizations).  Conversely, if you miss an
optimization opportunity, it will also close the door to the subsequent
optimization steps. If the previous domino didn't fall, the subsequent
ones won't either.

One of the problems with DMD's optimizer is that the inliner gives up
too easily, and too early in the process; as a result, it often closes
the door to further optimizations.  Its loop optimizer is also rather,
shall we say, timid, and so combined with the equally timid inliner, it
all adds up to a lot of missed optimization opportunities.  And given
that loops are usually where the bottlenecks are, these missed
opportunities add up to a big performance hit at runtime.

Comparatively, LDC and GDC's optimizers are extremely aggressive with
loop optimizations; as a result, they are able to reach past local
minima in computation space and obtain sometimes pretty major
performance gains. LDC also especially seems very aggressive with
inlining, especially with -O2 and -O3; this ensures no rock is left
unturned among the optimization dominoes, thus enabling it to simplify a
lot of code that DMD gives up on too early in the process. Heavily
range-based code seems to especially get vastly different performance
characteristics here -- because range methods tend to be small functions
that, if inlined, lead to further optimizations with the adjacent ranges
in the UFCS pipeline. But if one method failed to be inlined, it blocks
all further optimizations down the chain, causing a chain of missed
opportunities and resulting in slow runtime code.


T

-- 
Дерево держится корнями, а человек -
друзьями.

May 07 2020

Paul Backus <snarwin gmail.com> writes:

On Thursday, 7 May 2020 at 15:00:18 UTC, jmh530 wrote:
 Does foo!y0(rt) generate the same code as foo(rt, y0)?

 How is the code generated by foo(rt, x0) different from 
 foo(rt,y0)?

You can look at the generated code using the Compiler Explorer at 
d.godbolt.org. Here's a link to your example, compiled with ldc, 
with optimizations enabled:

https://d.godbolt.org/z/x5K7P6

As you can see, the non-static-if version has a runtime 
comparison and a conditional jump, and the static-if version does 
not. However, it doesn't make a difference in the end, because 
the calls have been optimized out, leaving an empty main function.

May 07 2020

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 7 May 2020 at 17:59:30 UTC, Paul Backus wrote:
 On Thursday, 7 May 2020 at 15:00:18 UTC, jmh530 wrote:
 Does foo!y0(rt) generate the same code as foo(rt, y0)?

 How is the code generated by foo(rt, x0) different from 
 foo(rt,y0)?

 You can look at the generated code using the Compiler Explorer 
 at d.godbolt.org. Here's a link to your example, compiled with 
 ldc, with optimizations enabled:

 https://d.godbolt.org/z/x5K7P6

 As you can see, the non-static-if version has a runtime 
 comparison and a conditional jump, and the static-if version 
 does not. However, it doesn't make a difference in the end, 
 because the calls have been optimized out, leaving an empty 
 main function.

Thanks for that. I forgot how much nicer godbolt is for looking 
assembly than run.dlang.org. Or maybe it's just that the 
optimized assembly for ldc looks a lot simpler than dmd?

I eventually played around with it for a bit and ended up with 
what's below. When compiled with ldc -O -release, some of the 
functions have code that are generated a little differently than 
the one above (the assembly looks a little prettier when using 
ints than doubles). It looks like there is a little bit of 
template bloat too, in that it generates something like four 
different versions of the run-time function that all basically do 
the same thing (some slight differences I don't really 
understand). Anyway, I think that's the first time I've ever used 
__traits compiles with a static if and I don't think I've ever 
used an template alias bool before, but I thought it was kind of 
cool.


int foo(alias bool rtct)(int[] x) {
     static if (__traits(compiles, {static if (rtct) { enum val = 
rtct;}})) {
         static if (rtct) {
             return bar(x) / cast(int) x.length;
         } else {
             return bar(x) / cast(int) (x.length - 1);
         }
     } else {
         if (rtct)
             return bar(x) / cast(int) x.length;
         else
             return bar(x) / cast(int) (x.length - 1);
     }
}

int foo(int[] x, bool rtct) {
     if (rtct)
	    return foo!true(x);
     else
         return foo!false(x);
}

int bar(int[] x) {
     return x[0] + x[1];
}


void main() {
     import std.stdio: writeln;

     int[] a = [1, 2, 3, 4, 5];
     bool x0 = true;
     bool x1 = false;
     int result0 = foo(a, x0);
     int result1 = foo(a, x1);
     int result2 = foo!x0(a);
     int result3 = foo!x1(a);

     enum y0 = true;
     enum y1 = false;
     int result0_ = foo(a, y0);
     int result1_ = foo(a, y1);
     int result2_ = foo!y0(a);
     int result3_ = foo!y1(a);
}

May 07 2020

D Programming

C/C++ Programming

Other

digitalmars.D.learn - CTFE and Static If Question