digitalmars.D - foreach, opApply, and inline
- Tor Myklebust (95/95) Oct 07 2007 When I compile the following code with dmd 1.015, separate functions are
- BCS (4/12) Oct 07 2007 I assume you tried DMD's -inline flag?
- Tor Myklebust (11/25) Oct 07 2007 I don't think it should in general; if the "small function" calls the
- BCS (26/33) Oct 08 2007 Well of course any optimization needs to be implemented with some sanity...
When I compile the following code with dmd 1.015, separate functions are generated for main, subset's opApply, and the body of the foreach. Is there a way to get dmd to inline opApply and the body of the foreach when the opApply is known and the body of the foreach is known?

(More generally, is it possible to declare a function accepting a delegate argument so that the delegate will get inlined if it is known at compile time?)

    import std.stdio;

    struct subset {
      int j;
      int opApply(int delegate(inout int i) foo) {
        for (int i = 0; i < j; i = (i+~j+1)&j) foo(i);
        return 0;
      }
    }

    void main() {
      foreach(i; subset(12345))
        printf("%i\n", i);
    }

The relevant assembly code is thus:

    Disassembly of section .gnu.linkonce.t_D3foo6subset7opApplyMFDFKiZiZi:

    0804a628 <_D3foo6subset7opApplyMFDFKiZiZi>:
     804a628: 55                    push   %ebp
     804a629: 8b ec                 mov    %esp,%ebp
     804a62b: 83 ec 10              sub    $0x10,%esp
     804a62e: 53                    push   %ebx
     804a62f: 56                    push   %esi
     804a630: 89 45 f8              mov    %eax,0xfffffff8(%ebp)
     804a633: 8b d8                 mov    %eax,%ebx
     804a635: 31 c9                 xor    %ecx,%ecx
     804a637: 89 4d fc              mov    %ecx,0xfffffffc(%ebp)
     804a63a: 39 0b                 cmp    %ecx,(%ebx)
     804a63c: 7e 33                 jle    804a671 <_D3foo6subset7opApplyMFDFKiZiZi+0x49>
     804a63e: 89 5d f8              mov    %ebx,0xfffffff8(%ebp)
     804a641: 8b 55 0c              mov    0xc(%ebp),%edx
     804a644: 8b 45 08              mov    0x8(%ebp),%eax
     804a647: 89 d6                 mov    %edx,%esi
     804a649: 8b 5d f8              mov    0xfffffff8(%ebp),%ebx
     804a64c: 8d 4d fc              lea    0xfffffffc(%ebp),%ecx
     804a64f: 51                    push   %ecx
     804a650: 8b 45 08              mov    0x8(%ebp),%eax
     804a653: ff d6                 call   *%esi
     804a655: 8b 13                 mov    (%ebx),%edx
     804a657: 89 55 f4              mov    %edx,0xfffffff4(%ebp)
     804a65a: f7 d2                 not    %edx
     804a65c: 8b 4d fc              mov    0xfffffffc(%ebp),%ecx
     804a65f: 8d 54 11 01           lea    0x1(%ecx,%edx,1),%edx
     804a663: 8b 4d f4              mov    0xfffffff4(%ebp),%ecx
     804a666: 23 d1                 and    %ecx,%edx
     804a668: 89 55 fc              mov    %edx,0xfffffffc(%ebp)
     804a66b: 8b 0b                 mov    (%ebx),%ecx
     804a66d: 3b ca                 cmp    %edx,%ecx
     804a66f: 7f db                 jg     804a64c <_D3foo6subset7opApplyMFDFKiZiZi+0x24>
     804a671: 31 c0                 xor    %eax,%eax
     804a673: 5e                    pop    %esi
     804a674: 5b                    pop    %ebx
     804a675: 8b e5                 mov    %ebp,%esp
     804a677: 5d                    pop    %ebp
     804a678: c2 08 00              ret    $0x8
     804a67b: 90                    nop

    Disassembly of section .gnu.linkonce.t_Dmain:

    0804a67c <_Dmain>:
     804a67c: 55                    push   %ebp
     804a67d: 8b ec                 mov    %esp,%ebp
     804a67f: 83 ec 08              sub    $0x8,%esp
     804a682: b8 a0 a6 04 08        mov    $0x804a6a0,%eax
     804a687: 50                    push   %eax
     804a688: 6a 00                 push   $0x0
     804a68a: c7 45 fc 39 30 00 00  movl   $0x3039,0xfffffffc(%ebp)
     804a691: 8d 45 fc              lea    0xfffffffc(%ebp),%eax
     804a694: e8 8f ff ff ff        call   804a628 <_D3foo6subset7opApplyMFDFKiZiZi>
     804a699: 31 c0                 xor    %eax,%eax
     804a69b: 8b e5                 mov    %ebp,%esp
     804a69d: 5d                    pop    %ebp
     804a69e: c3                    ret
     804a69f: 90                    nop

    Disassembly of section .gnu.linkonce.t_D3foo4mainFZv14__foreachbody1MFKiZi:

    0804a6a0 <_D3foo4mainFZv14__foreachbody1MFKiZi>:
     804a6a0: 55                    push   %ebp
     804a6a1: 8b ec                 mov    %esp,%ebp
     804a6a3: 50                    push   %eax
     804a6a4: 8b 4d 08              mov    0x8(%ebp),%ecx
     804a6a7: ff 31                 pushl  (%ecx)
     804a6a9: ba 68 b6 05 08        mov    $0x805b668,%edx
     804a6ae: 52                    push   %edx
     804a6af: e8 3c f2 ff ff        call   80498f0 <printf@plt>
     804a6b4: 83 c4 08              add    $0x8,%esp
     804a6b7: 31 c0                 xor    %eax,%eax
     804a6b9: 8b e5                 mov    %ebp,%esp
     804a6bb: 5d                    pop    %ebp
     804a6bc: c2 04 00              ret    $0x4
     804a6bf: 90                    nop

Tor Myklebust
Oct 07 2007
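For readers following the question above, here is a minimal sketch of why three functions appear in the listing. It assumes a current D2 compiler (where `ref` replaces D1's `inout`) and uses a renamed `Subset` struct with a small mask for illustration. A foreach over a type with opApply is rewritten into a call to opApply, with the loop body wrapped in a delegate that returns 0 to continue. Unlike the original post, this version propagates the delegate's result, which is what lets `break` work.

```d
import std.stdio;

struct Subset
{
    int j;

    // Visits every proper submask of j in increasing order.
    int opApply(scope int delegate(ref int) dg)
    {
        for (int i = 0; i < j; i = (i + ~j + 1) & j)
        {
            // A non-zero result means the loop body executed a break
            // (or similar), so stop and pass the result back.
            if (int result = dg(i))
                return result;
        }
        return 0;
    }
}

void main()
{
    // What the programmer writes.
    foreach (i; Subset(5))
        writeln(i);          // prints 0, 1, 4

    // Roughly what the compiler turns it into: the body becomes a
    // delegate (compare __foreachbody1 in the disassembly).
    Subset(5).opApply((ref int i) { writeln(i); return 0; });
}
```

Both loops print the same three values; the second form just makes the indirect call through the delegate visible.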
Reply to Tor,

> When I compile the following code with dmd 1.015, separate functions are generated for main, subset's opApply, and the body of the foreach. Is there a way to get dmd to inline opApply and the body of the foreach when the opApply is known and the body of the foreach is known?
>
> (More generally, is it possible to declare a function accepting a delegate argument so that the delegate will get inlined if it is known at compile time?)

I assume you tried DMD's -inline flag?

I have often thought that inlining should work in the more general case where a small function gets called with a known delegate.
Oct 07 2007
BCS <ao pathlink.com> wrote:
> Reply to Tor,
>
>> When I compile the following code with dmd 1.015, separate functions are generated for main, subset's opApply, and the body of the foreach. Is there a way to get dmd to inline opApply and the body of the foreach when the opApply is known and the body of the foreach is known?
>>
>> (More generally, is it possible to declare a function accepting a delegate argument so that the delegate will get inlined if it is known at compile time?)
>
> I assume you tried DMD's -inline flag?

Yes. That was compiled with -O -inline -release.

> I have often thought that inlining should work in the more general case where a small function gets called with a known delegate.

I don't think it should in general; if the "small function" calls the delegate multiple times, this can result in extremely bloated code.

For the specific case of opApply() happening because of a foreach loop, our intuition as programmers is that the compiled result won't contain any unnecessary function calls. (Imagine if "for (size_t i=0;i<n;i++) foo += bar[i];" generated a function for doing "foo += bar[i]" and a function for doing the iteration itself. You'd have to write for-loops yourself using if-goto again. That would suck, wouldn't it?)

Tor Myklebust
Oct 07 2007
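On the more general question of getting a known callee inlined, one commonly used D idiom is to pass the body as a template alias parameter instead of a runtime delegate. The sketch below assumes a current D2 compiler, and `eachSubmask` is a hypothetical helper name, not something from the posts above. Each instantiation is specialised on the function it calls, so the optimiser sees a direct call rather than an indirect one. Whether a given compiler then actually inlines it still depends on the compiler and its flags.

```d
import std.stdio;

// Hypothetical helper: calls fn on every proper submask of j.
// fn is bound at compile time, so the call below is a direct call
// inside each instantiation rather than a call through a delegate.
void eachSubmask(alias fn)(int j)
{
    for (int i = 0; i < j; i = (i + ~j + 1) & j)
        fn(i);
}

void main()
{
    int sum = 0;

    // The literal may still use locals of main.
    eachSubmask!((int i) { sum += i; })(5);

    writeln(sum);            // 0 + 1 + 4 = 5
}
```

The trade-off is the one raised in the reply above: every distinct body produces its own copy of the loop, so code size grows with the number of call sites.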
Reply to Tor,

> BCS <ao pathlink.com> wrote:
>
>> I have often thought that inlining should work in the more general case where a small function gets called with a known delegate.
>
> I don't think it should in general; if the "small function" calls the delegate multiple times, this can result in extremely bloated code.

Well of course any optimization needs to be implemented with some sanity checks. What I want to be dealt with is the idiom where a call to a small function takes a delegate literal. Say something like this:

    // Writes an opening tag, runs the delegate, then writes the closing tag.
    void Attr(char[] s)(void delegate() dg)
    {
        writef("<"~s~"> ");
        scope(exit) writef("</"~s~"> ");  // runs when Attr returns
        dg();
    }

    alias Attr!("b") Bold;
    alias Attr!("http") Doc;
    alias Attr!("body") Body;
    alias Attr!("head") Head;

    Doc({
        Head({
            ...
        });
        Body({
            writef("Hello ");
            Bold({
                writef("world");
            });
        });
    });
Oct 08 2007