digitalmars.D - Inlining Ref Functions

dsimcha <dsimcha yahoo.com> writes:

The fact that DMD does not inline functions with ref parameters has come up
several times deep in threads on this NG before, but it's never really
received proper attention.  After changing a few swaps in performance-critical
areas of my code to "manually inlined" swaps and seeing significant speedups,
I'm kind of curious what the rationale is for not inlining functions w/ ref
params.  Is there a good technical reason for this or is it simply a matter of
having higher priorities?  Is inlining functions w/ ref params on the "to do
eventually" list?

May 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

+1 on bumping up the priority on it.
Even if it can't be made to work in every case, if it could at
least be made to work in simple cases like swap() then it would be
a great boon for DMD benchmarks.

--bb

May 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

 +1 on bumping up the priority on it.
 Even if it can't be made to work in every case, if it could at
 least be made to work in simple cases like swap() then it would be
 a great boon for DMD benchmarks.
 --bb

Just to clarify:  I actually don't think this is a terribly high priority item.
Now that D2 is apparently pretty much feature complete, I'd rather see fixes for
functionality bugs than for relatively mild performance bugs.  My question was
purely out of curiosity.

May 15 2009

Bill Baxter <wbaxter gmail.com> writes:

Well it was shown before in a demo ray-tracer that the inability to
inline funcs with refs caused a significant speed hit under DMD.  And
now we're seeing it's causing a significant speed hit for sorting
because of swap routines.

There may be some thorny issues regarding inlining with refs in the
general case but with code like this:

    float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }

it really should be trivial for the compiler to figure out that
direct substitution is possible in a simple case like:

    vec3 w;
    ...
    auto len2 = lengthSqr(w);

Maybe I'm missing something, but that looks pretty darn straightforward.

--bb
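
To make the direct-substitution argument concrete, here is a minimal sketch in D. The definition of `vec3` is an assumption for illustration (the thread never shows it), and the "substituted" expression is simply what one would expect an inliner to produce by replacing `v` with `w`; it says nothing about what any particular compiler actually emits.

```d
import std.stdio : writeln;

// Hypothetical vec3; the thread does not show its definition.
struct vec3 { float x = 0, y = 0, z = 0; }

// The function from the post, taking its argument by const ref.
float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }

void main() {
    vec3 w = vec3(1, 2, 3);

    // Call through the ref parameter.
    auto viaCall = lengthSqr(w);

    // Hand-written direct substitution: every use of v becomes w.
    auto substituted = w.x*w.x + w.y*w.y + w.z*w.z;

    assert(viaCall == substituted);
    writeln(viaCall);
}
```

Because `w` is a plain local and the body only reads its fields, nothing about the `ref` changes the result; the two forms differ only in whether a call is made.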

On Fri, May 15, 2009 at 2:01 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

 +1 on bumping up the priority on it.
 Even if it can't be made to work in every case, if it could at
 least be made to work in simple cases like swap() then it would be
 a great boon for DMD benchmarks.
 --bb

 Just to clarify:  I actually don't think this is a terribly high priority item.
 Now that D2 is apparently pretty much feature complete, I'd rather see fixes for
 functionality bugs than for relatively mild performance bugs.  My question was
 purely out of curiosity.

May 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bill Baxter (wbaxter gmail.com)'s article
 Well it was shown before in a demo ray-tracer that the inability to
 inline funcs with refs caused a significant speed hit under DMD.  And
 now we're seeing it's causing a significant speed hit for sorting
 because of swap routines.
 There may be some thorny issues regarding inlining with refs in the
 general case but with code like this:
     float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }
 it really should be trivial for the compiler to figure out that
 direct substitution is possible in a simple case like:
    vec3 w;
    ...
    auto len2 = lengthSqr(w);
 Maybe I'm missing something, but that looks pretty darn straightforward.
 --bb

On second thought, maybe this should be a high priority issue, for two reasons.

1.  Some uber-hardcore performance freaks will not even consider D if it has the
slightest bit of performance overhead compared to C++.

2.  DMD apparently already inlines certain functions with pointers, so why not
references?  Aren't references just syntactic sugar for pointers?  If DMD's
inliner can already handle pointers, I would think (I could be wrong) that it
would be able to trivially handle references.

Here's some test code:

import std.stdio;

// Shouldn't this generate *exactly* the same code as ptrSwap?
void swap(T)(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}

void ptrSwap(T)(T* a, T* b) {
    T temp = *a;
    *a = *b;
    *b = temp;
}

void main() {
    uint a, b;
    swap(a, b);
    ptrSwap(&a, &b);
    writeln(a); // Keep DMD from optimizing out ptrSwap entirely.
}


Here's the disassembly of the relevant portion:

  COMDEF __Dmain
        push    eax                                     ; 0000 _ 50
        push    eax                                     ; 0001 _ 50
        xor     eax, eax                                ; 0002 _ 31. C0
        push    ebx                                     ; 0004 _ 53
        lea     ecx, [esp+4H]                           ; 0005 _ 8D. 4C 24, 04
        mov     dword ptr [esp+4H], eax                 ; 0009 _ 89. 44 24, 04
        mov     dword ptr [esp+8H], eax                 ; 000D _ 89. 44 24, 08
        push    ecx                                     ; 0011 _ 51
        lea     eax, [esp+0CH]                          ; 0012 _ 8D. 44 24, 0C
        call    _D5test711__T4swapTkZ4swapFKkKkZv       ; 0016 _ E8, 00000000(rel)
        mov     edx, dword ptr [esp+4H]                 ; 001B _ 8B. 54 24, 04
        mov     ebx, dword ptr [esp+8H]                 ; 001F _ 8B. 5C 24, 08
        mov     dword ptr [esp+4H], ebx                 ; 0023 _ 89. 5C 24, 04
        mov     eax, offset FLAT:_main                  ; 0027 _ B8, 00000000(segrel)
        mov     dword ptr [esp+8H], edx                 ; 002C _ 89. 54 24, 08
        push    ebx                                     ; 0030 _ 53
        push    10                                      ; 0031 _ 6A, 0A
        call    _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv; 0033 _ E8, 00000000(rel)
        xor     eax, eax                                ; 0038 _ 31. C0
        pop     ebx                                     ; 003A _ 5B
        add     esp, 8                                  ; 003B _ 83. C4, 08
        ret                                             ; 003E _ C3
__Dmain ENDP

This confirms that DMD inlines ptrSwap, but not swap.  I also did some
benchmarks, and ptrSwap is as fast as manual inlining, but swap is slower by a
factor of 2.
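
The benchmark itself is not shown in the post. A rough sketch of how one might set up such a comparison is below. Everything beyond `swap` and `ptrSwap` is an assumption for illustration: the `Style` enum, the array-reversal workload, the round counts, and the use of `std.datetime.stopwatch` from a current Phobos (it was not available in 2009). Timings will vary by compiler and flags, so no expected numbers are given.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

void swap(T)(ref T a, ref T b) { T temp = a; a = b; b = temp; }
void ptrSwap(T)(T* a, T* b) { T temp = *a; *a = *b; *b = temp; }

// Hypothetical selector for the three variants being compared.
enum Style { byRef, byPtr, manual }

// Reverses data in place using the chosen swap style.
void reverseWith(Style style)(uint[] data) {
    if (data.length < 2) return;
    for (size_t i = 0, j = data.length - 1; i < j; ++i, --j) {
        static if (style == Style.byRef) {
            swap(data[i], data[j]);
        } else static if (style == Style.byPtr) {
            ptrSwap(&data[i], &data[j]);
        } else {
            // "Manually inlined" swap.
            uint temp = data[i];
            data[i] = data[j];
            data[j] = temp;
        }
    }
}

// Runs `rounds` reversals and returns the elapsed time in microseconds.
long timeIt(Style style)(uint[] data, size_t rounds) {
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. rounds)
        reverseWith!style(data);
    return sw.peek.total!"usecs";
}

void main() {
    auto data = new uint[](10_000);
    foreach (i, ref x; data) x = cast(uint) i;

    writefln("ref swap:    %s usecs", timeIt!(Style.byRef)(data, 10_000));
    writefln("ptr swap:    %s usecs", timeIt!(Style.byPtr)(data, 10_000));
    writefln("manual swap: %s usecs", timeIt!(Style.manual)(data, 10_000));
}
```

Build with optimization and inlining enabled (for DMD, `-O -release -inline`) and check the disassembly as well, since a timing difference alone does not prove which calls were inlined.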

May 15 2009

grauzone <none example.net> writes:

 On second thought, maybe this should be a high priority issue, for two reasons.
 
 1.  Some uber-hardcore performance freaks will not even consider D if it has the
 slightest bit of performance overhead compared to C++.

I don't understand why D should pander to C++ freaks? If they think 
their language is great, they'll just continue programming C++. Nobody 
cares about them, and they will die a sad, lonely death.


efficient, it won't interfere with D's world domination plans.

Rather, one should avoid cloning the more annoying C++ features. (Ah 
yes, of course D makes them "better". Huh.) That's why we all use D in 
the first place.

/random pointless rant

Of course, this doesn't have to do with DMD's optimizer. That said, if 
you want heavy optimization, you rather should look at LDC or GDC.

May 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from grauzone (none example.net)'s article
 Of course, this doesn't have to do with DMD's optimizer. That said, if
 you want heavy optimization, you rather should look at LDC or GDC.

In general I agree.  There are higher priorities than improving DMD's
optimization back end.  This is a special case, though, because:

1.  Innocent looking code can turn into a significant performance bottleneck.
2.  I would guess it's a very low hanging piece of fruit, since ref is just
syntactic sugar for a pointer and DMD already can inline functions with pointer
parameters.

May 15 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

dsimcha wrote:
 == Quote from grauzone (none example.net)'s article
 Of course, this doesn't have to do with DMD's optimizer. That said, if
 you want heavy optimization, you rather should look at LDC or GDC.

 
 In general I agree.  There are higher priorities than improving DMD's
 optimization back end.  This is a special case, though, because:
 
 1.  Innocent looking code can turn into a significant performance bottleneck.
 2.  I would guess it's a very low hanging piece of fruit, since ref is just
 syntactic sugar for a pointer and DMD already can inline functions with pointer
 parameters.

Looking at the DMD 2.029 source code, in inline.c around line 1307
(slightly reformatted)...

    /* If any parameters are Tsarray's (which are passed by reference)
     * or out parameters (also passed by reference), don't do inlining.
     */
    if (parameters)
    {
        for (int i = 0; i < parameters->dim; i++)
        {
            VarDeclaration *v = (VarDeclaration *)parameters->data[i];
            if (v->isOut() || v->isRef()
                           || v->type->toBasetype()->ty == Tsarray)
                goto Lno;
        }
    }

  -- Daniel
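
For readers who don't want to dig through the compiler source, the effect of that loop can be modelled with a toy sketch in D. The `Param` struct and `passesParameterCheck` are hypothetical stand-ins invented for this illustration; they are not DMD's data structures, and the real inliner applies other checks as well.

```d
// Hypothetical, simplified stand-in for the compiler's parameter record.
struct Param {
    bool isOut;
    bool isRef;
    bool isStaticArray; // plays the role of the Tsarray test
}

// Mirrors the quoted loop: one out, ref, or static-array parameter
// is enough to veto inlining for the whole function.
bool passesParameterCheck(const(Param)[] parameters) {
    foreach (p; parameters) {
        if (p.isOut || p.isRef || p.isStaticArray)
            return false;
    }
    return true;
}

void main() {
    // ptrSwap(T* a, T* b): plain pointer parameters pass the check.
    auto ptrSwapParams = [Param(false, false, false), Param(false, false, false)];
    // swap(ref T a, ref T b): both parameters are ref, so it is rejected.
    auto swapParams = [Param(false, true, false), Param(false, true, false)];

    assert(passesParameterCheck(ptrSwapParams));
    assert(!passesParameterCheck(swapParams));
}
```

This matches what the disassembly earlier in the thread shows: the check is purely on how the parameters are declared, not on whether the body would be hard to substitute.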

May 16 2009

Bill Baxter <wbaxter gmail.com> writes:

On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it has
 the slightest bit of performance overhead compared to C++.

 I don't understand why D should pander to C++ freaks? If they think their
 language is great, they'll just continue programming C++. Nobody cares about
 them, and they will die a sad, lonely death.


 efficient, it won't interfere with D's world domination plans.

 Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of
 course D makes them "better". Huh.) That's why we all use D in the first
 place.

Performance is one of the major selling points of D.  Why settle for a
language with sub-standard tools and lack of safety if you don't care
about performance?  If you don't care about performance you're
probably better off with one of those languages that runs on the JVM
or CLR.

--bb

May 17 2009

grauzone <none example.net> writes:

Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it
 has the
 slightest bit of performance overhead compared to C++.

 I don't understand why D should pander to C++ freaks? If they think their
 language is great, they'll just continue programming C++. Nobody cares about
 them, and they will die a sad, lonely death.


 efficient, it won't interfere with D's world domination plans.

 Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of
 course D makes them "better". Huh.) That's why we all use D in the first
 place.

 
 Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.

If you'd really care about performance, you'd code in ASM. Blurb.

 --bb

May 17 2009

grauzone <none example.net> writes:

grauzone wrote:
 Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it has
 the slightest bit of performance overhead compared to C++.

 I don't understand why D should pander to C++ freaks? If they think their
 language is great, they'll just continue programming C++. Nobody cares about
 them, and they will die a sad, lonely death.


 efficient, it won't interfere with D's world domination plans.

 Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of
 course D makes them "better". Huh.) That's why we all use D in the first
 place.

 Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.

 
 If you'd really care about performance, you'd code in ASM. Blurb.

Let me be a bit more serious about it: the main selling point of D are 
template meta programming features, and the fact that it's compiled 
(that is, it's light weight enough, and doesn't require a clusterfuck of 
a VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use 
them.

In contrast, there are areas where D sacrifices performance for
simplicity. The GC is an example. Generally, there are lots of features
that are "slow, simple and safe" by default. This seems to be slowly going
away, though.

Someone who is really interested in performance will use some of those
C/C++ compilers and would be scared of dmd's code generator. Seeing how
old it is and that it supports x86 only, he may even break out in
laughter. (Maybe he'll do that while waiting for his C++ compiler to
finish compiling.)

I understand that the current trend of D is to close the performance 
gaps to C++. Actually, D seems to aim to be a "better C++". Simplicity 
is replaced by more and more obscure features, etc... With my irrational 
hate for C++, I'm not sure if I like this. I still hope that the end 
result will be something nice.

 --bb

May 17 2009

Bill Baxter <wbaxter gmail.com> writes:

On Sun, May 17, 2009 at 11:12 AM, grauzone <none example.net> wrote:
 grauzone wrote:
 Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it has
 the slightest bit of performance overhead compared to C++.

 I don't understand why D should pander to C++ freaks? If they think their
 language is great, they'll just continue programming C++. Nobody cares about
 them, and they will die a sad, lonely death.


 efficient, it won't interfere with D's world domination plans.

 Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of
 course D makes them "better". Huh.) That's why we all use D in the first
 place.

 Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.

 If you'd really care about performance, you'd code in ASM. Blurb.

 Let me be a bit more serious about it: the main selling point of D are
 template meta programming features, and the fact that it's compiled (that
 is, it's light weight enough, and doesn't require a clusterfuck of a
 VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use them.

http://www.wordinfo.info/words/index/info/view_unit/1

"Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!"

To you, the things you listed may be the main selling points.  To
others they may not be.  They are nice features, to be sure, but I'm
not sure who gave you authority to declare those the "main selling
point [sic]".

--bb

May 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bill Baxter (wbaxter gmail.com)'s article
 Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.
 --bb

But JVM and CLR languages either are orders of magnitude slower than D (the
dynamic ones) or are statically typed and don't have the insane metaprogramming
capabilities of D.  IMHO the biggest single selling point of D, D2 at least, is
its metaprogramming.  This allows one to have a statically typed language,
meaning good performance (even if current implementations are slightly slower
than C++, the difference between D vs. Python or Perl is much greater than D vs.
C++ or raw, inscrutable hexadecimal numbers) and compile time checking while
still being able to design flexible APIs.

IMHO (honestly not that informed since I've barely used Java, mostly a gut
feeling) Java APIs feel so ridiculously overengineered because, with static
typing and no metaprogramming, there's no other way to build flexibility into
the system besides going overboard on the OO design.  When your only tool is the
virtual function call, everything starts to look like a virtual hammer.  D APIs,
on the other hand, can be very flexible *and* simple.

std.algorithm and std.range are good examples, but since they were written by a
metaprogramming uber-guru and the language was designed specifically to
accommodate them, they might not be that representative.  However, even for
dstats, which is written by a relative amateur and did not have the luxury of
having parts of the core language designed around it, I cringe when I think
about what I'd have to do in Java to make it as flexible as it is, and API
design for a statistics library is probably one of the easier cases.

May 17 2009

bearophile <bearophileHUGS lycos.com> writes:

I have compiled a small variant of your code with a very new LDC compiler
(May 9 revision; it doesn't print the revision number). The code:

import tango.stdc.stdio: printf;
import Integer = tango.text.convert.Integer;

void swap(T)(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}

void ptrSwap(T)(T* a, T* b) {
    T temp = *a;
    *a = *b;
    *b = temp;
}

void main(char[][] args) {
    int a = Integer.parse(args[1]);
    int b = Integer.parse(args[2]);
    printf("%d\n", a);
    swap(a, b);
    printf("%d\n", a);
    ptrSwap(&a, &b);
    printf("%d\n", a);
}

Generated asm with various compiler arguments:

ldc -output-s -release inline_test.d

swap:
    subl    $4, %esp
    movl    8(%esp), %ecx
    movl    (%ecx), %edx
    movl    %edx, (%esp)
    movl    (%eax), %edx
    movl    %edx, (%ecx)
    movl    (%esp), %ecx
    movl    %ecx, (%eax)
    addl    $4, %esp
    ret $4

ptrSwap:
    subl    $12, %esp
    movl    16(%esp), %ecx
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    8(%esp), %eax
    movl    (%eax), %eax
    movl    %eax, (%esp)
    movl    4(%esp), %eax
    movl    (%eax), %eax
    movl    8(%esp), %ecx
    movl    %eax, (%ecx)
    movl    (%esp), %eax
    movl    4(%esp), %ecx
    movl    %eax, (%ecx)
    addl    $12, %esp
    ret $4

main:
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    subl    $32, %esp
    movl    52(%esp), %eax
    movl    %eax, 28(%esp)
    movl    48(%esp), %eax
    movl    %eax, 24(%esp)
    movl    28(%esp), %eax
    movl    12(%eax), %ecx
    movl    8(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    xorl    %esi, %esi
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 20(%esp)
    movl    28(%esp), %eax
    movl    20(%eax), %ecx
    movl    16(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    leal    20(%esp), %edi
    movl    %edi, (%esp)
    leal    16(%esp), %ebx
    movl    %ebx, %eax
    call    swap
    subl    $4, %esp
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str2, (%esp)
    call    printf
    movl    %edi, (%esp)
    movl    %ebx, %eax
    call    ptrSwap
    subl    $4, %esp
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str3, (%esp)
    call    printf
    [...]

-------------------------

ldc -inline -release -output-s inline_test.d

main:
    pushl   %esi
    subl    $48, %esp
    movl    60(%esp), %eax
    movl    %eax, 28(%esp)
    movl    56(%esp), %eax
    movl    %eax, 24(%esp)
    movl    28(%esp), %eax
    movl    12(%eax), %ecx
    movl    8(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    xorl    %esi, %esi
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 20(%esp)
    movl    28(%esp), %eax
    movl    20(%eax), %ecx
    movl    16(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    20(%esp), %eax
    movl    %eax, 32(%esp)
    movl    16(%esp), %eax
    movl    %eax, 20(%esp)
    movl    32(%esp), %eax
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str2, (%esp)
    call    printf
    leal    20(%esp), %eax
    movl    %eax, 44(%esp)
    leal    16(%esp), %eax
    movl    %eax, 40(%esp)
    movl    44(%esp), %eax
    movl    (%eax), %eax
    movl    %eax, 36(%esp)
    movl    40(%esp), %eax
    movl    (%eax), %eax
    movl    44(%esp), %ecx
    movl    %eax, (%ecx)
    movl    36(%esp), %eax
    movl    40(%esp), %ecx
    movl    %eax, (%ecx)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str3, (%esp)
    call    printf
    [...]

-------------------------

ldc -inline -release -O5 -output-s inline_test.d
main:
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    subl    $16, %esp
    movl    36(%esp), %esi
    movl    12(%esi), %eax
    movl    8(%esi), %ecx
    movl    %eax, 8(%esp)
    movl    %ecx, 4(%esp)
    movl    $0, (%esp)
    xorl    %edi, %edi
    xorl    %eax, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, %ebx
    movl    20(%esi), %eax
    movl    16(%esi), %ecx
    movl    %eax, 8(%esp)
    movl    %ecx, 4(%esp)
    movl    $0, (%esp)
    movl    %edi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, %esi
    movl    %ebx, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    %esi, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    %ebx, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    [...]

You can see that -inline is enough to get both inlined.

The performance of D isn't something to ignore. I have translated a small ray
tracing program from C++ and have seen it run up to about 3-3.5 times
slower with DMD, mostly because of missing inlining. Some benchmarks:
http://www.fantascienza.net/leonardo/js/smallpt.zip
http://www.fantascienza.net/leonardo/js/ao_bench.zip

Bye,
bearophile

May 16 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-05-15 16:36:16 -0400, dsimcha <dsimcha yahoo.com> said:

 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

Which makes me think: now that "this" in structs is a "ref"
parameter, does that mean that all non-static member functions of a
struct won't be inlined?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

May 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Michel Fortin (michel.fortin michelf.com)'s article
 Which makes me think: now that "this" in structs is a "ref"
 parameter, does that mean that all non-static member functions of a
 struct won't be inlined?

Oddly enough, based on looking at some disassembly, DMD apparently inlines ref
in this special case.  Go figure.  If there are any explicit parameters that are
ref, though, it doesn't inline.
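
The two cases being contrasted look like this in D. The `Counter` struct and both function names are made up for illustration. The sketch only shows that the two forms do the same thing at the source level; whether either one is inlined has to be checked in the disassembly for the compiler and version in question, as was done above.

```d
// Hypothetical struct used only to contrast the two call forms.
struct Counter {
    int value;

    // Member function: `this` is passed by reference implicitly.
    void bumpMember() { ++value; }
}

// Free function doing the same work through an explicit ref parameter.
void bumpFree(ref Counter c) { ++c.value; }

void main() {
    Counter c;
    c.bumpMember(); // implicit ref `this`
    bumpFree(c);    // explicit ref parameter
    assert(c.value == 2);
}
```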

May 15 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:

 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

How about bugzillizing the feature request?

May 17 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Denis Koroskin (2korden gmail.com)'s article
 On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?

 How about bugzillizing the feature request?

It's already in there as bug 2008, though I updated it recently to show a
disassembly proving that DMD can inline pointer param functions but not ref
param functions.  My guess is that, to someone already generally familiar w/ the
DMD code base (I'm not), this would be really easy to fix, because ref params
are just syntactic sugar for pointers.  Heck, IIRC you can use ref params as
pointers in asm blocks.
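
The "syntactic sugar for pointers" claim can be checked at the source level with a small sketch. The function names here are made up for illustration, and the sketch shows only the observable behaviour (a ref parameter aliases the caller's variable, just as a pointer to it does); it does not demonstrate anything about the asm-block usage mentioned above.

```d
// True if the ref parameter refers to the same storage p points to.
bool sameStorage(ref int x, int* p) { return &x is p; }

// Two ways of writing through to the caller's variable.
void setViaRef(ref int x) { x = 42; }
void setViaPtr(int* x) { *x = 42; }

void main() {
    int a, b;

    // Taking the address of a ref parameter yields the argument's address.
    assert(sameStorage(a, &a));
    assert(!sameStorage(a, &b));

    // Both forms modify the caller's variable.
    setViaRef(a);
    setViaPtr(&b);
    assert(a == 42 && b == 42);
}
```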

May 17 2009
