digitalmars.D.learn - opApply Magic Function Body Transformation
I'm preparing a video to teach opApply, and I'm still somewhat unsure how opApply works with regard to the compiler transformation. It's one of those features I understand how to use, but I'd like a deeper understanding before I teach it -- admittedly, to really know what I am doing. Below is something I quickly wrote using opApply as an example, followed by a few concrete questions in bold if you want to skip ahead.

```d
import std.stdio;

struct Array(T){
    T[] array;

    int opApply(int delegate(ref T) dg){
        int result;
        for(int i=0; i < array.length; i++){
            result = dg(array[i]);
            if(result){
                break;
            }
        }
        return result;
    }
}

void main(){
    Array!int ints;
    ints.array = [4,6,8,10,12];

    foreach(item; ints){
        writeln(item);
    }
}
```

The purpose of opApply I am clear on -- it's the member function that 'foreach' loops use for iteration (with opApplyReverse as the counterpart for 'foreach_reverse'). It takes priority over range member functions if both are defined, and otherwise *maybe* has some performance trade-offs (or at the least, it's slightly easier to template one member function versus 3 for an inputRange to avoid virtual calls -- but that's an aside that needs testing).

Okay -- but now onto the part where I need some more understanding -- the delegate and the transformation.

In my understanding/teaching of opApply, I would break opApply into two main concepts:

1.) The operator overloading of 'opApply' -- and the requirement that opApply always returns an int in the member function signature.
2.) The single parameter to opApply must be a delegate. The parameters of the delegate match the 'foreach' parameters. This portion also has more to do with the magic transform I am not 100% clear on.

The second part (the delegate parameter) is what I'm interested in being able to visualize.

So in the above code, there are two steps of sorts going on:

First, the transformation of a foreach loop being lowered to a regular 'for' loop.
```d
foreach(item; ints){
    writeln(item);
}
// is rewritten to something like what is below.
// But we don't really know if it's 'ints.array.length' or some other field.
// Thus we rely on our overload of opApply (of which we can have multiple)
// to sort of figure this out.
{
    int i=0;
    for(; i != ints.array.length; i++){
        writeln( /* item */ ); // 'item' represents whatever the delegate parameter
                               // is, i.e. 'ref T' above.
                               // The 'T' is also the item I am iterating on, and
                               // performing some computation on.
    }
}
```

The next transformation I can *kind of* see if I compile with 'dmd -vcg-ast main.d'. I can see the delegate and some magic '__applyArg0'.

```d
23 void main()
24 {
25     Array!int ints = 0;
26     ints.array = [4, 6, 8, 10, 12];
27     cast(void)ints.opApply(delegate int(ref int __applyArg0) safe => 0);
28     return 0;
29 }
```

**So really my one concrete question is** -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere?

This appears to be either a function or a label that is the body of my loop. As you'll notice, on line 27 there is no longer a 'foreach' loop. But I don't seem to be able to see the transformation, or otherwise find the symbols anywhere. Perhaps there is another magic compiler flag I am missing?

**My second concrete question is** When learning opApply, is it useful to think of the 'work' done in the body of a 'foreach' loop as being copied and pasted in? Or perhaps just to look at `cast(void)ints.opApply(delegate int(ref int __applyArg0) safe => 0);` and understand that your code has been magically transformed?

Some of my intuition, in comments, is below.

```d
// Somewhere in main (line 27 of the -vcg-ast output above)
cast(void)ints.opApply(delegate int(ref int __applyArg0) safe => 0);

// "Sort of" what is going on.
// At least to provide some mental model
// for the Array.opApply implementation.
int opApply(int delegate(ref int __applyArg0) dg)
{
    int result;
    for(int i=0; i < this.array.length; i++){
        result = dg(this.array[i]);
        // {
        //   'dg' represents the original 'foreach' loop
        //   effectively copied here.
        //   -- but it's not a 'copy and paste of code here',
        //      instead we have a call to a magic delegate
        //      that exists somewhere from the compiler.
        //   I can *think* of the delegate like pasting
        //   in the body of the original foreach loop, but only one
        //   item at a time -- considering the single 'index' (elem)
        //   from our loop.
        //   writeln(elem); // same work, but this work comes from the
        //                  // body of the original 'foreach' found in
        //                  // 'main()', now wrapped in a delegate function.
        // };
        if(result){ // Returns '0' from magic delegate at some point?
            break;
        }
    }
    return result;
}
```

**A third question** Will this call to a delegate introduce more hidden allocations, I wonder?

---

Some more investigation

The disassembly (I used gdc-14 to build and then disassembled with 'objdump -d main_binary') seems to generate something like this. In 'C' parlance we have a function pointer representing where the 'foreach_body' from main is stored. It also appears (both from the disassembly and from -vcg-ast) that the return value of '1' or '0' does not seem important?

```c
void opApply(int* array, int size, void (*func)(int))
{
    int index = 0;
    int result = 0;
    while (index < size) {
        if (index < size) {
            func(array[index]);
            result = 1;
        }
        index++;
    }
}
```

---

Sorry if my questions are not clear; any guidance or pointers to examples are helpful!
Jul 27
> On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:
> **So really my one concrete question is** -- can I see main.main()__foreachbody_L21_C3(ref int) anywhere?

I think that's where the confusion comes from, that misleading `-vcg-ast` output for the loop-body-lambda, apparently printed as `… => 0` regardless of the actual loop body.

Based on your example:

```d
import core.stdc.stdio;

struct Array(T) {
    T[] array;

    int opApply(scope int delegate(ref T) dg){
        foreach (ref i; array){
            const result = dg(i);
            if (result)
                return result;
        }
        return 0;
    }
}

void main() {
    Array!int ints;
    ints.array = [4,6,8,10,12];

    foreach (item; ints) {
        if (item == 1)
            continue;
        if (item == 2)
            break;
        printf("%d\n", item);
    }
}
```

The loop is actually rewritten by the compiler to:

```d
ints.opApply((ref int item) {
    if (item == 1)
        return 0; // continue => abort this iteration
    if (item == 2)
        return 1; // break => abort this and all future iterations
    printf("%d\n", item);
    return 0; // continue with next iteration
});
```

So the main thing here is that the body is promoted to a lambda, and the control-flow statements inside the body (`break` and `continue` in the example above) are transformed to specific return codes for the opApply delegate protocol.
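To make the return-code protocol concrete, here is a minimal sketch that calls `opApply` by hand with a hand-written delegate, instead of letting the compiler generate one from a `foreach` body. The `visited` counter and the `item == 8` condition are made up for illustration, and the value `1` simply mirrors the `break` code shown above; the only thing `opApply` itself relies on is zero versus non-zero.

```d
import std.stdio;

struct Array(T)
{
    T[] array;

    int opApply(scope int delegate(ref T) dg)
    {
        foreach (ref elem; array)
        {
            const result = dg(elem);
            if (result)
                return result; // non-zero: stop iterating and pass the code back
        }
        return 0; // every element was visited
    }
}

void main()
{
    Array!int ints;
    ints.array = [4, 6, 8, 10, 12];

    int visited; // hypothetical counter, only here to observe the early exit

    // Hand-written stand-in for:
    //   foreach (item; ints) { if (item == 8) break; ++visited; writeln(item); }
    const code = ints.opApply((ref int item) {
        if (item == 8)
            return 1; // plays the role of 'break'
        ++visited;
        writeln(item);
        return 0;     // carry on with the next element
    });

    assert(visited == 2); // only 4 and 6 were processed
    assert(code == 1);    // opApply handed the delegate's code back
}
```

This also shows why the return value matters: if `opApply` ignored it, the loop would keep running past the element that asked to stop.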
If we add a `return` statement to the body:

```d
int main() {
    Array!int ints;
    ints.array = [4,6,8,10,12];

    foreach (item; ints) {
        if (item == 0)
            return item;
        if (item == 1)
            continue;
        if (item == 2)
            break;
        printf("%d\n", item);
    }

    return 0;
}
```

then the rewrite becomes a bit more complex:

```d
int __result; // magic variable inserted by the compiler, for the main() return value

const __opApplyResult = ints.opApply((ref int item) {
    if (item == 0) {
        __result = item; // set return value for parent function
        return 2; // return => abort the loop and exit from parent function
    }
    if (item == 1)
        return 0; // continue => abort this iteration
    if (item == 2)
        return 1; // break => abort this and all future iterations
    printf("%d\n", item);
    return 0; // continue with next iteration
});

switch (__opApplyResult) {
    default:
        break;
    case 2:
        return __result;
}

return __result = 0;
```

> **A third question** Will this call to a delegate provide more hidden allocations I wonder?

As with any regular lambda, captured outer variables (like the `__result` in the 2nd example) will cause a closure, but as long as the `opApply` takes the delegate as `scope`, the closure will be on the stack, so no harm.
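One way to check the allocation claim is to let the compiler enforce it with `@nogc`. The following is a minimal sketch, not taken from the thread: the `sum` helper is hypothetical, and the `@nogc` annotations on `opApply` and on the delegate type are assumptions added so that the check can compile at all.

```d
struct Array(T)
{
    T[] array;

    // 'scope' promises the delegate is not stored beyond this call,
    // so a capturing loop body does not need a GC-allocated closure.
    int opApply(scope int delegate(ref T) @nogc dg) @nogc
    {
        foreach (ref elem; array)
        {
            const result = dg(elem);
            if (result)
                return result;
        }
        return 0;
    }
}

// Hypothetical helper: the loop body captures the local 'total'.
int sum(ref Array!int ints) @nogc
{
    int total;
    foreach (item; ints)
        total += item;
    return total;
}

void main()
{
    Array!int ints;
    ints.array = [4, 6, 8, 10, 12];
    assert(sum(ints) == 40);
}
```

If `scope` is dropped from the delegate parameter, I would expect the compiler to reject `sum` for needing a GC closure, which makes this a cheap regression check, though that behaviour is worth confirming on the compiler version in use.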
Jul 28
> On Monday, 28 July 2025 at 12:25:39 UTC, kinke wrote:
>> On Monday, 28 July 2025 at 04:39:00 UTC, Mike Shah wrote:
>> [...]
>
> I think that's where the confusion comes from, that misleading `-vcg-ast` output for the loop-body-lambda, apparently printed as `… => 0` regardless of the actual loop body. [...]

Brilliant -- this I can work with. Is there somewhere already in LDC2 where I can dump out the generated transformation? (Otherwise I can probably read the IR well enough.) Thank you very much for helping break this down!
Jul 28