digitalmars.D.learn - opApply Magic Function Body Transformation

reply Mike Shah <mshah.475 gmail.com> writes:
I'm preparing a video to teach opApply, and I'm still unsure how 
opApply works with regard to the compiler transformation. It's 
one of those features I understand how to use, but I'd like a 
deeper understanding before I teach it, so that I admittedly 
really know what I am doing. Below is an example I quickly wrote 
using opApply, followed by a few concrete questions (in bold) if 
you want to skip ahead.

```d
import std.stdio;

struct Array(T){
   T[] array;
   int opApply(int delegate(ref T) dg){
     int result;
     for(int i=0; i < array.length; i++){
         result = dg(array[i]);
         if(result){
           break;
         }
     }
     return result;
   }
}

void main(){
   Array!int ints;
   ints.array = [4,6,8,10,12];

   foreach(item; ints){
     writeln(item);
   }

}
```

I am clear on the purpose of opApply: it's a member function 
that 'foreach' loops use for iteration (with opApplyReverse as 
the counterpart for 'foreach_reverse'). It takes priority over 
range member functions if both are defined, and otherwise *maybe* 
has some performance trade-offs (or at the least, it's slightly 
easier to template one member function versus the 3 of an input 
range to avoid virtual calls, but that's an aside that needs 
testing). Now onto the part where I need more understanding: the 
delegate and the transformation.
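
As a side note on the priority claim above, here is a minimal sketch that can be used to check it. The `Both` struct and the `pathUsed` variable are made up for this illustration; the struct defines the input-range primitives and opApply at the same time and records which one a `foreach` picked.

```d
import std.stdio;

// Hypothetical marker: records which iteration mechanism ran last.
string pathUsed;

struct Both {
    int[] data;

    // Input-range primitives.
    bool empty() { return data.length == 0; }
    int front() { pathUsed = "range"; return data[0]; }
    void popFront() { data = data[1 .. $]; }

    // opApply defined on the same type.
    int opApply(scope int delegate(ref int) dg) {
        pathUsed = "opApply";
        foreach (ref x; data) {
            if (auto r = dg(x)) return r;
        }
        return 0;
    }
}

string whichPath() {
    pathUsed = "";
    auto b = Both([1, 2, 3]);
    foreach (x; b) {}   // the compiler chooses between opApply and the range primitives
    return pathUsed;
}

void main() {
    writeln(whichPath()); // prints "opApply"
}
```

Removing the opApply member should make the same loop fall back to `empty`/`front`/`popFront`.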

In my understanding/teaching of opApply, I would break it into 
two main concepts:
1.) The operator overloading of 'opApply', and the requirement 
that its member function signature always returns an int.
2.) The single parameter to opApply must be a delegate. The 
parameters of the delegate match the 'foreach' parameters. This 
part also has more to do with the magic transform I am not 100% 
clear on.
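
To make the second point concrete, here is a small sketch (not the only way to write it) with two opApply overloads. The two-parameter overload with a `size_t` index is an addition for illustration; the compiler picks the overload whose delegate parameters match the number of `foreach` variables.

```d
import std.stdio;

struct Array(T) {
    T[] array;

    // Matches: foreach (item; arr) and foreach (ref item; arr)
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref item; array)
            if (auto r = dg(item)) return r;
        return 0;
    }

    // Matches: foreach (i, item; arr)
    int opApply(scope int delegate(size_t, ref T) dg) {
        foreach (i, ref item; array)
            if (auto r = dg(i, item)) return r;
        return 0;
    }
}

void main() {
    auto ints = Array!int([4, 6, 8]);

    foreach (item; ints)            // one variable: first overload
        writeln(item);

    foreach (i, item; ints)         // two variables: second overload
        writeln(i, ": ", item);

    foreach (ref item; ints)        // 'ref' lets the body modify the elements
        item *= 2;
    writeln(ints.array);            // [8, 12, 16]
}
```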

The second part (the delegate parameter) is what I'm interested 
in being able to visualize. So in the above code, there are sort 
of two steps going on:

First, the transformation of a foreach loop being lowered to a 
regular 'for' loop.
```d
   foreach(item; ints){
     writeln(item);
   }

// is re-written to something like what is below.
// But we don't really know if it's 'ints.array.length' or some other field.
// Thus we rely on our overload (of which we can have multiple) of opApply
// to sort of figure this out..
   {
     int i=0;
     for(; i != ints.array.length; i++){
       writeln( /* item */ ); // 'item' represents whatever the delegate parameter
                              // is i.e. 'ref T' above.
                              // The 'T' is also the item I am iterating on, and
                              // performing some computation on.
     }
   }
```


The next transformation I can *kind of* see if I compile with 
'dmd -vcg-ast main.d'. I can see the delegate and some magic 
'__applyArg0'.

```d
   23 void main()
   24 {
   25   Array!int ints = 0;
   26   ints.array = [4, 6, 8, 10, 12];
   27   cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);
   28   return 0;
   29 }
```

**So really my one concrete question is** -- can I see 
main.main()__foreachbody_L21_C3(ref int) anywhere? This appears 
to be either a function or a label that holds the body of my 
loop. As you'll notice, on line 27 there is no longer a 'foreach' 
loop. But I don't seem to be able to see the transformation, or 
find the symbols anywhere. Perhaps there is another magic 
compiler flag I am missing?

**My second concrete question is** When learning opApply, is it 
useful to think of the body of the 'foreach' loop as being copied 
and pasted in? Or is it better to just look at 
`cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);` 
and understand that your code has been magically transformed? 
Some of my intuition is in the comments below.

```d
// Somewhere in main (line 27 of the -vcg-ast output)
cast(void)ints.opApply(delegate int(ref int __applyArg0) @safe => 0);


// "Sort of" what is going on.
// At least to provide some mental model
// for the Array.opApply implementation.
     int opApply(int delegate(ref int __applyArg0) dg)
     {
       int result;
       for(size_t i=0; i < this.array.length; i++){
          result = dg(this.array[i]);
                        // {
                        // 'dg' represents the original 'foreach' loop
                        //  effectively copied here.
                        // -- but it's not a 'copy and paste of code here',
                        // instead we have a call to a magic delegate
                        // that exists somewhere from compiler.

                        // I can *think* of the delegate like pasting
                        // in the body of original foreach loop, but only one
                        // item at a time -- considering the single 'index' (elem)
                        // from our loop.
                        // writeln(elem); // same work, but this work comes from
                        // body of original 'foreach' found in 'main()' now wrapped
                        // in a delegate function.
                        //};

          if(result){  // Returns '0' from magic delegate at some point?
            break;
          }
       }
       return result;
     }
```

**A third question** Will this call to a delegate introduce more 
hidden allocations I wonder?

---
Some more investigation

The disassembly (I built with gdc-14 and then disassembled with 
'objdump -d main_binary') seems to show something like the code 
below. In 'C' parlance we have a function pointer representing 
where the 'foreach_body' from main is stored. It also appears 
(both from the disassembly and from vcg-ast) that the return 
value of '1' or '0' does not seem important?

```c
void opApply(int* array, int size, void (*func)(int)) {
     int index = 0;
     int result = 0;

     while (index < size) {
         if (index < size) {
             func(array[index]);
             result = 1;
         }
         index++;
     }
}
```
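
On whether the return value matters: it may look unused when the loop body has no `break` or `return`, but a small experiment suggests it is significant as soon as one appears. The sketch below is only an illustration; `Ignoring`, `Honoring` and `visitedBeforeBreak` are hypothetical names. One opApply discards the delegate's result, the other forwards it.

```d
import std.stdio;

// Hypothetical broken variant: throws away what the delegate returns.
struct Ignoring {
    int[] array;
    int opApply(scope int delegate(ref int) dg) {
        foreach (ref x; array)
            dg(x);                  // result discarded, so 'break' cannot stop us
        return 0;
    }
}

// Variant that stops and forwards any non-zero result.
struct Honoring {
    int[] array;
    int opApply(scope int delegate(ref int) dg) {
        foreach (ref x; array)
            if (auto r = dg(x)) return r;
        return 0;
    }
}

// Collects the items seen by a loop body that breaks at 8.
int[] visitedBeforeBreak(A)(A a) {
    int[] seen;
    foreach (item; a) {
        if (item == 8) break;
        seen ~= item;
    }
    return seen;
}

void main() {
    auto data = [4, 6, 8, 10, 12];
    writeln(visitedBeforeBreak(Honoring(data))); // [4, 6]
    writeln(visitedBeforeBreak(Ignoring(data))); // [4, 6, 10, 12]
}
```

With the `Ignoring` variant the `break` only ends the current call of the body, and iteration carries on with the next element.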

---

Sorry if my questions are not clear, any guidance or pointers to 
examples are helpful!
Jul 27
parent reply kinke <noone nowhere.com> writes:
On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:
> **So really my one concrete question is** -- can I see 
> main.main()__foreachbody_L21_C3(ref int) anywhere?


I think that's where the confusion comes from, that misleading `-vcg-ast` output for the loop-body-lambda, apparently printed as `… => 0` regardless of the actual loop body. Based on your example:

```d
import core.stdc.stdio;

struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg){
        foreach (ref i; array){
            const result = dg(i);
            if (result)
                return result;
        }
        return 0;
    }
}

void main() {
    Array!int ints;
    ints.array = [4,6,8,10,12];

    foreach (item; ints) {
        if (item == 1)
            continue;
        if (item == 2)
            break;
        printf("%d\n", item);
    }
}
```

The loop is actually rewritten by the compiler to:

```d
ints.opApply((ref int item) {
    if (item == 1)
        return 0; // continue => abort this iteration
    if (item == 2)
        return 1; // break => abort this and all future iterations
    printf("%d\n", item);
    return 0; // continue with next iteration
});
```

So the main thing here is that the body is promoted to a lambda, and the control-flow statements inside the body (`break` and `continue` in the example above) are transformed to specific return codes for the opApply delegate protocol.

If we add a `return` statement to the body:

```d
int main() {
    Array!int ints;
    ints.array = [4,6,8,10,12];

    foreach (item; ints) {
        if (item == 0)
            return item;
        if (item == 1)
            continue;
        if (item == 2)
            break;
        printf("%d\n", item);
    }

    return 0;
}
```

then the rewrite becomes a bit more complex:

```d
int __result; // magic variable inserted by the compiler, for the main() return value
const __opApplyResult = ints.opApply((ref int item) {
    if (item == 0) {
        __result = item; // set return value for parent function
        return 2; // return => abort the loop and exit from parent function
    }
    if (item == 1)
        return 0; // continue => abort this iteration
    if (item == 2)
        return 1; // break => abort this and all future iterations
    printf("%d\n", item);
    return 0; // continue with next iteration
});
switch (__opApplyResult) {
    default:
        break;
    case 2:
        return __result;
}
return __result = 0;
```
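
One way to convince yourself of this rewrite is to write both forms by hand and compare them. The following is a sketch under that idea, not compiler output; `viaForeach` and `byHand` are hypothetical helper names, and the values 6 and 10 are chosen so that `continue` and `break` are actually hit with this data.

```d
import std.stdio;

struct Array(T) {
    T[] array;
    int opApply(scope int delegate(ref T) dg) {
        foreach (ref i; array) {
            const result = dg(i);
            if (result)
                return result;
        }
        return 0;
    }
}

// The loop as one would normally write it.
int[] viaForeach(Array!int ints) {
    int[] seen;
    foreach (item; ints) {
        if (item == 6)
            continue;
        if (item == 10)
            break;
        seen ~= item;
    }
    return seen;
}

// The same loop with the body promoted to a lambda by hand.
int[] byHand(Array!int ints) {
    int[] seen;
    ints.opApply((ref int item) {
        if (item == 6)
            return 0; // continue
        if (item == 10)
            return 1; // break
        seen ~= item;
        return 0;     // next iteration
    });
    return seen;
}

void main() {
    auto ints = Array!int([4, 6, 8, 10, 12]);
    writeln(viaForeach(ints)); // [4, 8]
    writeln(byHand(ints));     // [4, 8]
}
```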

> **A third question** Will this call to a delegate introduce more 
> hidden allocations I wonder?

As with any regular lambda, captured outer variables (like the `__result` in the 2nd example) will cause a closure, but as long as the `opApply` takes the delegate as `scope`, the closure will be on the stack, so no harm.
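
A possible way to check the no-allocation claim is to let the compiler enforce it with `@nogc`. This is a sketch with assumptions: the `@nogc` attributes on opApply and on its delegate parameter, and the `sum` helper, are additions for illustration. The loop body captures the local `total`, yet the function is still accepted as `@nogc` because the delegate parameter is `scope`.

```d
import std.stdio : writeln;

struct Array(T) {
    T[] array;

    // 'scope' promises the delegate does not escape, so the caller's
    // captured variables can stay on the stack.
    int opApply(scope int delegate(ref T) @nogc dg) @nogc {
        foreach (ref item; array)
            if (auto r = dg(item)) return r;
        return 0;
    }
}

// The loop body captures 'total'; @nogc rejects any GC-allocated closure.
int sum(ref Array!int ints) @nogc {
    int total = 0;
    foreach (item; ints)
        total += item;
    return total;
}

void main() {
    auto ints = Array!int([4, 6, 8, 10, 12]);
    writeln(sum(ints)); // 40
}
```

Dropping `scope` from the parameter should make `sum` fail to compile as `@nogc`, which is a quick way to see where the closure allocation would come from.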
Jul 28
parent Mike Shah <mshah.475 gmail.com> writes:
On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:
> On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:
>> [...]
>
> I think that's where the confusion comes from, that misleading `-vcg-ast` output for the loop-body-lambda, apparently printed as `… => 0` regardless of the actual loop body. [...]

Brilliant -- this I can work with. Is there somewhere already in LDC2 where I can dump out the generated transformation (Otherwise I can probably read the IR well enough)? Thank you very much for helping break this down!
Jul 28