digitalmars.D - Why is not inlining that bad?


"Janice Caron" <caron800 googlemail.com> writes:

Forgive me if this is a dumb question, but in other threads I've seen
it argued that dereferencing an address from a register offset is not
something that anyone needs to be worried about, and that array
accesses are so fast that no one need worry about them, etc.

This being the case, why is anyone worried about the overhead of a
function call? It's just a memory write and a few registers changing,
surely? It's not a massively expensive operation like a thread switch
or anything, so why worry?

If the hardware does memory caching, the return may not even need a
memory access.

What am I missing? Is it just that D initializes all its local
variables as part of calling a function? If so, there are plenty of
ways around that.

Oct 08 2007

Bruce Adams <tortoise_74 yeah.who.co.uk> writes:

Janice Caron Wrote:

 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.
 
 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?
 
 If the hardware does memory caching, the return may not even need a
 memory access.
 
 What am I missing? Is it just that D initializes all its local
 variables as part of calling a function? If so, there are plenty of
 ways around that.

Actually the overhead of a function call is significant, at least in some
cases. You have to push all the arguments onto the stack. If there are a lot of
them, and particularly if they are passed by value, this makes a difference. You
are also changing the program counter to a different location, which might well
not be in the instruction cache. If your loop is executed 10,000 times, this
overhead may add up. Also, if the code is inlined the compiler can
optimise it as a block. This might include moving some initialisation steps
outside the loop. It can also ensure that variables stay in the same registers.
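
To make the "moving initialisation steps outside the loop" point concrete, here is a rough sketch in D. The names scale, sumWithCalls and sumInlined are made up for illustration, and the second loop is only a hand-written picture of what an optimiser might produce after inlining, not a guarantee of what any particular compiler does.

```d
import std.stdio;

// Hypothetical helper: small enough that a compiler may choose to inline it.
int scale(int x, int factor)
{
    int k = factor * factor; // recomputed on every call, though factor never changes
    return x * k;
}

// One call per iteration: arguments are set up and k is recomputed each time.
long sumWithCalls(const int[] data, int factor)
{
    long total = 0;
    foreach (x; data)
        total += scale(x, factor);
    return total;
}

// Hand-written equivalent of the inlined loop: once the body is visible,
// the invariant computation can be hoisted out of the loop.
long sumInlined(const int[] data, int factor)
{
    immutable k = factor * factor; // computed once
    long total = 0;
    foreach (x; data)
        total += x * k;
    return total;
}

void main()
{
    auto data = [1, 2, 3, 4];
    // Both forms compute the same result; only the work per iteration differs.
    assert(sumWithCalls(data, 3) == sumInlined(data, 3));
    writeln(sumInlined(data, 3));
}
```

Whether the compiler actually performs this transformation depends on the optimisation flags and on the function body being visible at the call site.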

In general you should prefer optimising your code by changing the
algorithm so that the body of the loop is executed less often, but sometimes
inlining can make the difference.

Regards,

Bruce.

Oct 08 2007

Walter Bright <newshound1 digitalmars.com> writes:

Janice Caron wrote:
 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.
 
 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?
 
 If the hardware does memory caching, the return may not even need a
 memory access.
 
 What am I missing? Is it just that D initializes all its local
 variables as part of calling a function? If so, there are plenty of
 ways around that.

Inlining a function, besides getting rid of the function call/return 
code, which can be significant, also enables interprocedural 
optimizations: register assignment, common subexpressions, constant 
folding, etc. It can result in dramatically fewer instructions being 
executed. It is also friendlier to the instruction cache.
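
As a rough illustration of the constant folding case, consider the sketch below. The function area and both callers are hypothetical names, and callerAfterInlining is written by hand to show what the optimizer could reduce the call to once the body and the constant arguments are visible together; actual output depends on the compiler and its flags.

```d
import std.stdio;

// Hypothetical small function with a branch on its second argument.
int area(int side, bool doubled)
{
    int a = side * side;
    return doubled ? a * 2 : a;
}

// As a real call: arguments are passed and the branch is evaluated at run time.
int caller()
{
    return area(4, true);
}

// Hand-written picture of the same call after inlining: both arguments are
// constants, so the multiply and the branch can fold to a single value.
int callerAfterInlining()
{
    return 32;
}

void main()
{
    // The two forms are equivalent; the second executes far fewer instructions.
    assert(caller() == callerAfterInlining());
    writeln(caller());
}
```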

Oct 08 2007

0ffh <spam frankhirsch.net> writes:

Walter Bright wrote:
 Inlining a function, besides getting rid of the function call/return 
 code, which can be significant, also enables interprocedural 
 optimizations: register assignment, common subexpressions, constant 
 folding, etc. It can result in dramatically fewer instructions being 
 executed. Besides, it is more code memory cache friendly.

Reminds me of:

news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com

Am I lucky when I, as BCS so elegantly put it, "say my prayers to the
deities of optimization", and hope that debugfln(...){} will be reduced
to even less than call/retn?

Regards, frank

Oct 08 2007

0ffh <spam frankhirsch.net> writes:

0ffh wrote:
 Am I lucky when I, as BCS so elegantly put it, "say my prayers to the
 deities of optimization", and hope that debugfln(...){} will be reduced
 to even less than call/retn?

Of course it was Bill Baxter who put it that way; my memory plays tricks on me!
Sorry for the misattribution, both! :-)

Regards, frank

Oct 08 2007

BCS <BCS pathlink.com> writes:

0ffh wrote:
 Walter Bright wrote:
 
 Inlining a function, besides getting rid of the function call/return 
 code, which can be significant, also enables interprocedural 
 optimizations: register assignment, common subexpressions, constant 
 folding, etc. It can result in dramatically fewer instructions being 
 executed. Besides, it is more code memory cache friendly.

 
 
 Reminds me of:
 
 news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com
 
 Am I lucky when I, as BCS so elegantly put it, "say my prayers to the
 deities of optimization", and hope that debugfln(...){} will be reduced
 to even less than call/retn?
 
 Regards, frank

BTW that was Bill Baxter, replying to me.

Oct 08 2007

"Jb" <jb nowhere.com> writes:

"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.390.1191827338.16939.digitalmars-d puremagic.com...
 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.

Because addressing modes cost pretty much the same on modern x86 CPUs.

MOV  EAX,[EBX+32+ECX*8]

costs the same as

MOV  EAX,[$FFEECCDD]

Because it's an incredibly common thing to want to do, it is optimized to the 
best possible case.


 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?

A call/ret pair costs 2-4 cycles on average. Pushing and popping parameters on 
the stack uses up 1 cycle per instruction as well, best case scenario, and 
there is usually some shuffling of data into different registers.

And if the callee does anything more complicated than a handful of 
operations, it's likely that it will need to dump stuff off to the stack to 
free up more registers.

Plus, avoiding all that allows more opportunity for optimization. If inlined, 
things don't have to be shuffled around to suit the calling convention, 
whether stack based or register based.

So small functions with lots of parameters are the most likely 
beneficiaries.
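
A minimal sketch of that kind of function, assuming a made-up helper called dot3: the body is three multiplies and two adds, yet a real call has to place six arguments according to the calling convention first. The pragma(inline, true) hint is available in newer D compilers and postdates this thread; it is only a request, and the cycle counts above are not something this example measures.

```d
import std.stdio;

// Hypothetical example: tiny body, many parameters. Without inlining, the
// argument setup can cost about as much as the arithmetic itself.
pragma(inline, true)
int dot3(int ax, int ay, int az, int bx, int by, int bz)
{
    return ax * bx + ay * by + az * bz;
}

void main()
{
    int total = 0;
    foreach (i; 0 .. 4)
        total += dot3(i, i + 1, i + 2, 1, 2, 3); // each call works out to 6*i + 8
    assert(total == 68);
    writeln(total);
}
```

If the call is inlined, the three constant arguments can be folded into the expression and nothing needs to be moved around just to satisfy the calling convention.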

Oct 08 2007
