digitalmars.D - Why is not inlining that bad?
- Janice Caron (13/13) Oct 08 2007 Forgive me if this is a dumb question, but in other threads I've seen
- Bruce Adams (5/21) Oct 08 2007 Actually the overhead of a function call is significant, at least in som...
- Walter Bright (6/22) Oct 08 2007 Inlining a function, besides getting rid of the function call/return
- 0ffh (7/12) Oct 08 2007 Reminds me of:
- Jb (19/27) Oct 08 2007 Because addressing modes cost pretty much the same on modern x86 cpus.
Forgive me if this is a dumb question, but in other threads I've seen it argued that dereferencing an address from a register offset is not something that anyone needs to be worried about, and that array accesses are so fast that no one need worry about them, etc. This being the case, why is anyone worried about the overhead of a function call? It's just a memory write and a few registers changing, surely? It's not a massively expensive operation like a thread switch or anything, so why worry? If the hardware does memory caching, the return may not even need a memory access. What am I missing? Is it just that D initializes all its local variables as part of calling a function? If so, there are plenty of ways around that.
Oct 08 2007
Janice Caron wrote:

> Forgive me if this is a dumb question, but in other threads I've seen it argued that dereferencing an address from a register offset is not something that anyone needs to be worried about, and that array accesses are so fast that no one need worry about them, etc. This being the case, why is anyone worried about the overhead of a function call? It's just a memory write and a few registers changing, surely? It's not a massively expensive operation like a thread switch or anything, so why worry? If the hardware does memory caching, the return may not even need a memory access. What am I missing? Is it just that D initializes all its local variables as part of calling a function? If so, there are plenty of ways around that.

Actually, the overhead of a function call is significant, at least in some cases. You have to push all the arguments onto the stack. If there are a lot of them, and particularly if they are passed by value, this makes a difference. You are also changing the program counter to a different location, which might well not be in the instruction cache. If your loop is executed 10,000 times, this overhead may be significant.

Also, if the code is inlined, the compiler can optimise it as a block. This might include moving some initialisation steps outside the loop. It can also ensure that variables stay in the same registers.

In general you should prefer optimising your code by changing the algorithm so that the body of the loop is executed less often, but sometimes inlining can make the difference.

Regards,
Bruce.
Oct 08 2007
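To make Bruce's point about moving initialisation out of the loop concrete, here is a minimal sketch in C (the thread concerns D, but the reasoning carries over). The names `scale`, `sum_with_calls`, and `sum_inlined` are hypothetical, and the second function is a hand-written picture of what an optimizer might produce once the callee's body is visible. It is not the guaranteed output of any particular compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: part of its work does not depend on x. */
static int scale(int x, int gain)
{
    int factor = gain * gain + 1;   /* same value on every call in the loop */
    return x * factor;
}

/* One call per iteration: factor is recomputed each time unless
   the compiler can see into scale(). */
long sum_with_calls(const int *data, size_t n, int gain)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scale(data[i], gain);
    return total;
}

/* Roughly what inlining plus loop-invariant hoisting could yield:
   no call in the loop, and factor is computed once. */
long sum_inlined(const int *data, size_t n, int gain)
{
    assert(data != NULL || n == 0);
    int factor = gain * gain + 1;   /* hoisted out of the loop */
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i] * factor;
    return total;
}
```

Both functions return the same result for the same input; the difference is only in how much work happens per iteration.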
Janice Caron wrote:

> Forgive me if this is a dumb question, but in other threads I've seen it argued that dereferencing an address from a register offset is not something that anyone needs to be worried about, and that array accesses are so fast that no one need worry about them, etc. This being the case, why is anyone worried about the overhead of a function call? It's just a memory write and a few registers changing, surely? It's not a massively expensive operation like a thread switch or anything, so why worry? If the hardware does memory caching, the return may not even need a memory access. What am I missing? Is it just that D initializes all its local variables as part of calling a function? If so, there are plenty of ways around that.

Inlining a function, besides getting rid of the function call/return code, which can be significant, also enables interprocedural optimizations: register assignment, common subexpressions, constant folding, etc. It can result in dramatically fewer instructions being executed. Besides, it is more code memory cache friendly.
Oct 08 2007
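As a rough illustration of the constant folding Walter mentions, consider the C sketch below. The helper `clamp` and the two wrapper functions are hypothetical names chosen for this example. The assumption is that, once `clamp` is inlined at a call site with constant arguments, an optimizer can evaluate both comparisons at compile time, so the call may collapse to a single constant. Whether a given compiler actually does so depends on its settings.

```c
#include <assert.h>

/* Hypothetical helper: restricts v to the range [lo, hi]. */
static int clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* As written in the source: a call with three constant arguments. */
int level_with_call(void)
{
    return clamp(300, 0, 255);
}

/* What the function may reduce to after inlining and constant
   folding: both branches are decided at compile time. */
int level_folded(void)
{
    return 255;
}
```

Without inlining, the compiler has to emit the call and both comparisons at run time, because the call site and the body are optimized separately.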
Walter Bright wrote:

> Inlining a function, besides getting rid of the function call/return code, which can be significant, also enables interprocedural optimizations: register assignment, common subexpressions, constant folding, etc. It can result in dramatically fewer instructions being executed. Besides, it is more code memory cache friendly.

Reminds me of: news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com

Am I lucky when I, as BCS so elegantly put it, "say my prayers to the deities of optimization", and hope that debugfln(...){} will be reduced to even less than call/retn?

Regards, frank
Oct 08 2007
0ffh wrote:

> Am I lucky when I, as BCS so elegantly put it, "say my prayers to the deities of optimization", and hope that debugfln(...){} will be reduced to even less than call/retn?

Of course it was Bill Baxter who put it that way; my memory plays tricks on me! Sorry for the misattribution, both! :-)

Regards, frank
Oct 08 2007
0ffh wrote:

> Walter Bright wrote:
>
> > Inlining a function, besides getting rid of the function call/return code, which can be significant, also enables interprocedural optimizations: register assignment, common subexpressions, constant folding, etc. It can result in dramatically fewer instructions being executed. Besides, it is more code memory cache friendly.
>
> Reminds me of: news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com
>
> Am I lucky when I, as BCS so elegantly put it, "say my prayers to the deities of optimization", and hope that debugfln(...){} will be reduced to even less than call/retn?
>
> Regards, frank

BTW that was Bill Baxter, replying to me.
Oct 08 2007
"Janice Caron" <caron800 googlemail.com> wrote in message news:mailman.390.1191827338.16939.digitalmars-d puremagic.com...

> Forgive me if this is a dumb question, but in other threads I've seen it argued that dereferencing an address from a register offset is not something that anyone needs to be worried about, and that array accesses are so fast that no one need worry about them, etc.

Because addressing modes cost pretty much the same on modern x86 CPUs.

    MOV EAX,[EBX+32+ECX*8]

costs the same as

    MOV EAX,[$FFEECCDD]

Because it's an incredibly common thing to want to do, it is optimized to the best possible case.

> This being the case, why is anyone worried about the overhead of a function call? It's just a memory write and a few registers changing, surely? It's not a massively expensive operation like a thread switch or anything, so why worry?

A call/ret pair costs 2-4 cycles on average. Pushing and popping parameters to and from the stack uses up 1 cycle per instruction as well, in the best case, and there is usually some shuffling of data into different registers. And if the callee does anything more complicated than a handful of operations, it's likely that it will need to dump stuff off to the stack to free up more registers.

Plus, avoiding all that allows more opportunity for optimization. If the call is inlined, things don't have to be shuffled round to suit the calling convention, whether stack based or register based.

So small functions with lots of parameters are the most likely beneficiaries.
Oct 08 2007
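Jb's closing remark about small functions with many parameters can be pictured with a short C sketch. The names `weighted_sum` and `blend` are hypothetical. The function body is only a handful of multiplies and adds, yet a real call has to place eight arguments in registers or on the stack as the calling convention demands. If the call is inlined, that argument shuffling disappears and the four constant weights become visible to the optimizer. How much this saves in practice depends on the compiler and the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical small function with many parameters: the body is tiny
   compared with the cost of passing eight arguments. */
static int weighted_sum(int a, int b, int c, int d,
                        int wa, int wb, int wc, int wd)
{
    return a * wa + b * wb + c * wc + d * wd;
}

/* Call site with constant weights. Once weighted_sum() is inlined,
   no arguments need to be marshalled and the multiplications by
   1, 2, 3 and 4 can be simplified. */
int blend(const int px[4])
{
    assert(px != NULL);
    return weighted_sum(px[0], px[1], px[2], px[3], 1, 2, 3, 4);
}
```

For example, calling `blend` on the values 10, 20, 30, 40 computes 10*1 + 20*2 + 30*3 + 40*4.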