digitalmars.D - Inlining Ref Functions

reply dsimcha <dsimcha yahoo.com> writes:
The fact that DMD does not inline functions with ref parameters has come up
several times deep in threads on this NG before, but it's never really
received proper attention.  After changing a few swaps in performance-critical
areas of my code to "manually inlined" swaps and seeing significant speedups,
I'm kind of curious what the rationale is for not inlining functions w/ ref
params.  Is there a good technical reason for this or is it simply a matter of
having higher priorities?  Is inlining functions w/ ref params on the "to do
eventually" list?
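
For readers unfamiliar with the term, a "manually inlined" swap just means writing the three assignments out at the call site instead of calling a swap function. The sketch below is illustrative only: the `reverseWithCall` and `reverseManual` helpers are hypothetical names invented for this example, not code from the post, and it shows equivalence of results rather than any timing difference.

```d
// Generic swap taking both arguments by ref, as discussed in this thread.
void swap(T)(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}

// Hypothetical helper: reverses arr in place by calling swap.
void reverseWithCall(int[] arr) {
    if (arr.length < 2) return;
    for (size_t i = 0, j = arr.length - 1; i < j; ++i, --j)
        swap(arr[i], arr[j]);
}

// Hypothetical helper: the same loop with the body of swap written out by hand.
void reverseManual(int[] arr) {
    if (arr.length < 2) return;
    for (size_t i = 0, j = arr.length - 1; i < j; ++i, --j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }
}

void main() {
    int[] a = [1, 2, 3, 4, 5];
    int[] b = a.dup;
    reverseWithCall(a);
    reverseManual(b);
    assert(a == b);                 // both versions produce the same result
    assert(a == [5, 4, 3, 2, 1]);
}
```

Whether the two versions compile to the same machine code depends on the compiler and flags, which is the question the rest of the thread discusses.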
May 15 2009
next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?
+1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap(), it would be a great boon for DMD benchmarks. --bb
May 15 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?
+1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap(), it would be a great boon for DMD benchmarks. --bb
Just to clarify: I actually don't think this is a terribly high priority item. Now that D2 is apparently pretty much feature complete, I'd rather see fixes for functionality bugs than for relatively mild performance bugs. My question was purely out of curiosity.
May 15 2009
parent reply Bill Baxter <wbaxter gmail.com> writes:
Well it was shown before in a demo ray-tracer that the inability to
inline funcs with refs caused a significant speed hit under DMD.  And
now we're seeing it's causing a significant speed hit for sorting
because of swap routines.

There may be some thorny issues regarding inlining with refs in the
general case but with code like this:

    float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }

it really should be trivial for the compiler to figure out that
direct substitution is possible in a simple case like:

   vec3 w;
   ...
   auto len2 = lengthSqr(w);

Maybe I'm missing something, but that looks pretty darn straightforward.
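
As a rough sketch of what "direct substitution" would mean here: the `vec3` struct below is an assumption (the post only names the type), and the second computation is written by hand to show what an inliner would ideally reduce the call to. It illustrates the intended equivalence only and says nothing about what DMD actually emits.

```d
import std.stdio : writeln;

// Hypothetical 3-component vector; the post only names the type vec3.
struct vec3 { float x = 0, y = 0, z = 0; }

// The function from the post: takes its argument by const reference.
float lengthSqr(const ref vec3 v) {
    return v.x*v.x + v.y*v.y + v.z*v.z;
}

void main() {
    auto w = vec3(1, 2, 2);

    // Call through the ref parameter.
    auto viaCall = lengthSqr(w);

    // What direct substitution amounts to: every use of v becomes w.
    auto substituted = w.x*w.x + w.y*w.y + w.z*w.z;

    assert(viaCall == substituted);
    writeln(viaCall);
}
```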

--bb

On Fri, May 15, 2009 at 2:01 PM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?
+1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap(), it would be a great boon for DMD benchmarks. --bb
Just to clarify: I actually don't think this is a terribly high priority item. Now that D2 is apparently pretty much feature complete, I'd rather see fixes for functionality bugs than for relatively mild performance bugs. My question was purely out of curiosity.
May 15 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 Well it was shown before in a demo ray-tracer that the inability to
 inline funcs with refs caused a significant speed hit under DMD.  And
 now we're seeing it's causing a significant speed hit for sorting
 because of swap routines.
 There may be some thorny issues regarding inlining with refs in the
 general case but with code like this:
     float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }
 it really should be trivial for the compiler to figure out that
 direct substitution is possible in a simple case like:
    vec3 w;
    ...
    auto len2 = lengthSqr(w);
 Maybe I'm missing something, but that looks pretty darn straightforward.
 --bb
On second thought, maybe this should be a high priority issue, for two reasons.

1.  Some uber-hardcore performance freaks will not even consider D if it has
the slightest bit of performance overhead compared to C++.

2.  DMD apparently already inlines certain functions with pointers, so why not
references?  Aren't references just syntactic sugar for pointers?  If DMD's
inliner can already handle pointers, I would think (I could be wrong) that it
would be able to trivially handle references.

Here's some test code:

import std.stdio;

// Shouldn't this generate *exactly* the same code as ptrSwap?
void swap(T)(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}

void ptrSwap(T)(T* a, T* b) {
    T temp = *a;
    *a = *b;
    *b = temp;
}

void main() {
    uint a, b;
    swap(a, b);
    ptrSwap(&a, &b);
    writeln(a);  // Keep DMD from optimizing out ptrSwap entirely.
}

Here's the disassembly of the relevant portion:

COMDEF __Dmain
        push    eax                                     ; 0000 _ 50
        push    eax                                     ; 0001 _ 50
        xor     eax, eax                                ; 0002 _ 31. C0
        push    ebx                                     ; 0004 _ 53
        lea     ecx, [esp+4H]                           ; 0005 _ 8D. 4C 24, 04
        mov     dword ptr [esp+4H], eax                 ; 0009 _ 89. 44 24, 04
        mov     dword ptr [esp+8H], eax                 ; 000D _ 89. 44 24, 08
        push    ecx                                     ; 0011 _ 51
        lea     eax, [esp+0CH]                          ; 0012 _ 8D. 44 24, 0C
        call    _D5test711__T4swapTkZ4swapFKkKkZv       ; 0016 _ E8, 00000000(rel)
        mov     edx, dword ptr [esp+4H]                 ; 001B _ 8B. 54 24, 04
        mov     ebx, dword ptr [esp+8H]                 ; 001F _ 8B. 5C 24, 08
        mov     dword ptr [esp+4H], ebx                 ; 0023 _ 89. 5C 24, 04
        mov     eax, offset FLAT:_main                  ; 0027 _ B8, 00000000(segrel)
        mov     dword ptr [esp+8H], edx                 ; 002C _ 89. 54 24, 08
        push    ebx                                     ; 0030 _ 53
        push    10                                      ; 0031 _ 6A, 0A
        call    _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv; 0033 _ E8, 00000000(rel)
        xor     eax, eax                                ; 0038 _ 31. C0
        pop     ebx                                     ; 003A _ 5B
        add     esp, 8                                  ; 003B _ 83. C4, 08
        ret                                             ; 003E _ C3
__Dmain ENDP

This confirms that DMD inlines ptrSwap, but not swap.  I also did some
benchmarks, and ptrSwap is as fast as manual inlining, but swap is slower by a
factor of 2.
May 15 2009
next sibling parent reply grauzone <none example.net> writes:
 On second thought, maybe this should be a high priority issue, for two reasons.

 1.  Some uber-hardcore performance freaks will not even consider D if it has the
 slightest bit of performance overhead compared to C++.
I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.

efficient, it won't interfere with D's world domination plans.

Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.

/random pointless rant

Of course, this doesn't have to do with DMD's optimizer. That said, if you want heavy optimization, you rather should look at LDC or GDC.
May 15 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from grauzone (none example.net)'s article
 Of course, this doesn't have to do with DMD's optimizer. That said, if
 you want heavy optimization, you rather should look at LDC or GDC.
In general I agree. There are higher priorities than improving DMD's optimization back end. This is a special case, though, because:

1.  Innocent looking code can turn into a significant performance bottleneck.

2.  I would guess it's a very low hanging piece of fruit, since ref is just syntactic sugar for a pointer and DMD already can inline functions with pointer parameters.
May 15 2009
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
dsimcha wrote:
 == Quote from grauzone (none example.net)'s article
 Of course, this doesn't have to do with DMD's optimizer. That said, if
 you want heavy optimization, you rather should look at LDC or GDC.
In general I agree. There are higher priorities than improving DMD's optimization back end. This is a special case, though, because: 1. Innocent looking code can turn into a significant performance bottleneck. 2. I would guess it's a very low hanging piece of fruit, since ref is just syntactic sugar for a pointer and DMD already can inline functions with pointer parameters.
Looking at the DMD 2.029 source code, in inline.c around line 1307 (slightly reformatted)...
    /* If any parameters are Tsarray's (which are passed by reference)
     * or out parameters (also passed by reference), don't do inlining.
     */
    if (parameters)
    {
        for (int i = 0; i < parameters->dim; i++)
        {
            VarDeclaration *v = (VarDeclaration *)parameters->data[i];
            if (v->isOut() || v->isRef()
                           || v->type->toBasetype()->ty == Tsarray)
                goto Lno;
        }
    }
-- Daniel
May 16 2009
prev sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it
 has the
 slightest bit of performance overhead compared to C++.
I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.

 efficient, it won't interfere with D's world domination plans.

 Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of
 course D makes them "better". Huh.) That's why we all use D in the first
 place.
Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR. --bb
May 17 2009
next sibling parent reply grauzone <none example.net> writes:
Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if it
 has the
 slightest bit of performance overhead compared to C++.
I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death. efficient, it won't interfere with D's world domination plans. Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
If you'd really care about performance, you'd code in ASM. Blurb.
 --bb
May 17 2009
parent reply grauzone <none example.net> writes:
grauzone wrote:
 Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D 
 if it
 has the
 slightest bit of performance overhead compared to C++.
I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death. efficient, it won't interfere with D's world domination plans. Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
If you'd really care about performance, you'd code in ASM. Blurb.
Let me be a bit more serious about it: the main selling point of D are template meta programming features, and the fact that it's compiled (that is, it's light weight enough, and doesn't require a clusterfuck of a VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use them.

In contrast, there are areas where D sacrifices performance for simplicity. The GC is an example. Generally, there's lots of features, which are "slow, simple and safe" by default. This seems to slowly go away, though.

Someone who is really interested in performance will use some of those C/C++ compilers and would be scared of dmd's code generator. Seeing how old it is and that it supports x86 only, he may even break out in laughter. (Maybe he'll do that while waiting for his C++ compiler to finish compiling.)

I understand that the current trend of D is to close the performance gaps to C++. Actually, D seems to aim to be a "better C++". Simplicity is replaced by more and more obscure features, etc... With my irrational hate for C++, I'm not sure if I like this. I still hope that the end result will be something nice.
 --bb
May 17 2009
parent Bill Baxter <wbaxter gmail.com> writes:
On Sun, May 17, 2009 at 11:12 AM, grauzone <none example.net> wrote:
 grauzone wrote:
 Bill Baxter wrote:
 On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
 1.  Some uber-hardcore performance freaks will not even consider D if
 it
 has the
 slightest bit of performance overhead compared to C++.
I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death. efficient, it won't interfere with D's world domination plans. Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.
If you'd really care about performance, you'd code in ASM. Blurb.
Let me be a bit more serious about it: the main selling point of D are template meta programming features, and the fact that it's compiled (that is, it's light weight enough, and doesn't require a clusterfuck of a VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use them.

http://www.wordinfo.info/words/index/info/view_unit/1

"Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!"

To you, the things you listed may be the main selling points. To others they may not be. They are nice features, to be sure, but I'm not sure who gave you authority to declare those the "main selling point [sic]".

--bb
May 18 2009
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 Performance is one of the major selling points of D.  Why settle for a
 language with sub-standard tools and lack of safety if you don't care
 about performance?  If you don't care about performance you're
 probably better off with one of those languages that runs on the JVM
 or CLR.
 --bb
But JVM and CLR languages either are orders of magnitude slower than D (the dynamic ones) or are statically typed and don't have the insane metaprogramming capabilities of D. IMHO the biggest single selling point of D, D2 at least, is its metaprogramming. This allows one to have a statically typed language, meaning good performance (even if current implementations are slightly slower than C++, the difference between D vs. Python or Perl is much greater than D vs. C++ or raw, inscrutable hexadecimal numbers) and compile time checking while still being able to design flexible APIs.

IMHO (honestly not that informed since I've barely used Java, mostly a gut feeling) Java APIs feel so ridiculously overengineered because, with static typing and no metaprogramming, there's no other way to build flexibility into the system besides going overboard on the OO design. When your only tool is the virtual function call, everything starts to look like a virtual hammer.

D APIs, on the other hand, can be very flexible *and* simple. std.algorithm and std.range are good examples, but since they were written by a metaprogramming uber-guru and the language was designed specifically to accommodate them, they might not be that representative. However, even for dstats, which is written by a relative amateur and did not have the luxury of having parts of the core language designed around it, I cringe when I think about what I'd have to do in Java to make it as flexible as it is, and API design for a statistics library is probably one of the easier cases.
May 17 2009
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
I have compiled a small variant of your code with a very new LDC compiler
(May  9 revision, it doesn't print the revision number), the code:

import tango.stdc.stdio: printf;
import Integer = tango.text.convert.Integer;

void swap(T)(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}

void ptrSwap(T)(T* a, T* b) {
    T temp = *a;
    *a = *b;
    *b = temp;
}

void main(char[][] args) {
    int a = Integer.parse(args[1]);
    int b = Integer.parse(args[2]);
    printf("%d\n", a);
    swap(a, b);
    printf("%d\n", a);
    ptrSwap(&a, &b);
    printf("%d\n", a);
}

Generated asm with various compiler arguments:

ldc -output-s -release inline_test.d

swap:
    subl    $4, %esp
    movl    8(%esp), %ecx
    movl    (%ecx), %edx
    movl    %edx, (%esp)
    movl    (%eax), %edx
    movl    %edx, (%ecx)
    movl    (%esp), %ecx
    movl    %ecx, (%eax)
    addl    $4, %esp
    ret $4

ptrSwap:
    subl    $12, %esp
    movl    16(%esp), %ecx
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    8(%esp), %eax
    movl    (%eax), %eax
    movl    %eax, (%esp)
    movl    4(%esp), %eax
    movl    (%eax), %eax
    movl    8(%esp), %ecx
    movl    %eax, (%ecx)
    movl    (%esp), %eax
    movl    4(%esp), %ecx
    movl    %eax, (%ecx)
    addl    $12, %esp
    ret $4

main:
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    subl    $32, %esp
    movl    52(%esp), %eax
    movl    %eax, 28(%esp)
    movl    48(%esp), %eax
    movl    %eax, 24(%esp)
    movl    28(%esp), %eax
    movl    12(%eax), %ecx
    movl    8(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    xorl    %esi, %esi
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 20(%esp)
    movl    28(%esp), %eax
    movl    20(%eax), %ecx
    movl    16(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    leal    20(%esp), %edi
    movl    %edi, (%esp)
    leal    16(%esp), %ebx
    movl    %ebx, %eax
    call    swap
    subl    $4, %esp
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str2, (%esp)
    call    printf
    movl    %edi, (%esp)
    movl    %ebx, %eax
    call    ptrSwap
    subl    $4, %esp
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str3, (%esp)
    call    printf
    [...]

-------------------------

ldc -inline -release -output-s inline_test.d

main:
    pushl   %esi
    subl    $48, %esp
    movl    60(%esp), %eax
    movl    %eax, 28(%esp)
    movl    56(%esp), %eax
    movl    %eax, 24(%esp)
    movl    28(%esp), %eax
    movl    12(%eax), %ecx
    movl    8(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    xorl    %esi, %esi
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 20(%esp)
    movl    28(%esp), %eax
    movl    20(%eax), %ecx
    movl    16(%eax), %eax
    movl    %ecx, 8(%esp)
    movl    %eax, 4(%esp)
    movl    $0, (%esp)
    movl    %esi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    20(%esp), %eax
    movl    %eax, 32(%esp)
    movl    16(%esp), %eax
    movl    %eax, 20(%esp)
    movl    32(%esp), %eax
    movl    %eax, 16(%esp)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str2, (%esp)
    call    printf
    leal    20(%esp), %eax
    movl    %eax, 44(%esp)
    leal    16(%esp), %eax
    movl    %eax, 40(%esp)
    movl    44(%esp), %eax
    movl    (%eax), %eax
    movl    %eax, 36(%esp)
    movl    40(%esp), %eax
    movl    (%eax), %eax
    movl    44(%esp), %ecx
    movl    %eax, (%ecx)
    movl    36(%esp), %eax
    movl    40(%esp), %ecx
    movl    %eax, (%ecx)
    movl    20(%esp), %eax
    movl    %eax, 4(%esp)
    movl    $.str3, (%esp)
    call    printf
    [...]

-------------------------

ldc -inline -release -O5 -output-s inline_test.d

main:
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    subl    $16, %esp
    movl    36(%esp), %esi
    movl    12(%esi), %eax
    movl    8(%esi), %ecx
    movl    %eax, 8(%esp)
    movl    %ecx, 4(%esp)
    movl    $0, (%esp)
    xorl    %edi, %edi
    xorl    %eax, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, %ebx
    movl    20(%esi), %eax
    movl    16(%esi), %ecx
    movl    %eax, 8(%esp)
    movl    %ecx, 4(%esp)
    movl    $0, (%esp)
    movl    %edi, %eax
    call    Integer.parse
    subl    $12, %esp
    movl    %eax, %esi
    movl    %ebx, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    %esi, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    movl    %ebx, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    [...]

You can see that -inline is enough to get both inlined.

The performance of D isn't something to ignore: I translated a small ray
tracing program from C++ and saw it run up to about 3-3.5 times slower with
DMD, mostly because of missing inlining. Some benchmarks:
http://www.fantascienza.net/leonardo/js/smallpt.zip
http://www.fantascienza.net/leonardo/js/ao_bench.zip

Bye,
bearophile
May 16 2009
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-15 16:36:16 -0400, dsimcha <dsimcha yahoo.com> said:

 The fact that DMD does not inline functions with ref parameters has come up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in performance-critical
 areas of my code to "manually inlined" swaps and seeing significant speedups,
 I'm kind of curious what the rationale is for not inlining functions w/ ref
 params.  Is there a good technical reason for this or is it simply a matter of
 having higher priorities?  Is inlining functions w/ ref params on the "to do
 eventually" list?
Which makes me think: now that "this" in structs is a "ref" parameter, does that mean that all non-static member functions of a struct won't be inlined?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 15 2009
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Michel Fortin (michel.fortin michelf.com)'s article
 Which makes me think: now that "this" in structs is a "ref"
 parameter, does that mean that all non-static member functions of a
 struct won't be inlined?
Oddly enough, based on looking at some disassembly, DMD apparently inlines ref in this special case. Go figure. If there are any explicit parameters that are ref, though, it doesn't inline.
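
To make the two cases concrete, here is a minimal sketch. The `Counter` struct and the `bump` and `bumpRef` names are hypothetical, chosen only for illustration. It shows the same operation written as a member function (implicit ref `this`) and as a free function with an explicit ref parameter; it demonstrates that the two behave identically, not how any particular DMD version inlines them.

```d
// Hypothetical struct used only to contrast the two calling styles.
struct Counter {
    int n;

    // Member function: `this` is passed by reference implicitly.
    void bump() { ++n; }
}

// Free function doing the same work through an explicit ref parameter.
void bumpRef(ref Counter c) { ++c.n; }

void main() {
    Counter c;
    c.bump();     // implicit ref `this`
    bumpRef(c);   // explicit ref parameter
    assert(c.n == 2);
}
```

Comparing the disassembly of the two call sites, as done earlier in the thread for swap and ptrSwap, would be one way to check the behavior on a given compiler.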
May 15 2009
prev sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:

 The fact that DMD does not inline functions with ref parameters has come  
 up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in  
 performance-critical
 areas of my code to "manually inlined" swaps and seeing significant  
 speedups,
 I'm kind of curious what the rationale is for not inlining functions w/  
 ref
 params.  Is there a good technical reason for this or is it simply a  
 matter of
 having higher priorities?  Is inlining functions w/ ref params on the  
 "to do
 eventually" list?
How about bugzillizing the feature request?
May 17 2009
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Denis Koroskin (2korden gmail.com)'s article
 On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:
 The fact that DMD does not inline functions with ref parameters has come
 up
 several times deep in threads on this NG before, but it's never really
 received proper attention.  After changing a few swaps in
 performance-critical
 areas of my code to "manually inlined" swaps and seeing significant
 speedups,
 I'm kind of curious what the rationale is for not inlining functions w/
 ref
 params.  Is there a good technical reason for this or is it simply a
 matter of
 having higher priorities?  Is inlining functions w/ ref params on the
 "to do
 eventually" list?
How about bugzillizing the feature request?
It's already in there as bug 2008, though I updated it recently to show a disassembly proving that DMD can inline pointer param functions but not ref param functions. My guess is that, to someone already generally familiar w/ the DMD code base (I'm not), this would be really easy to fix, because ref params are just syntactic sugar for pointers. Heck, IIRC you can use ref params as pointers in asm blocks.
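
A small sketch of the "syntactic sugar for pointers" point, at the language level only: taking the address of a ref parameter yields the address of the caller's variable. The `sameStorage` helper is a hypothetical name for this example, and the code says nothing about how DMD's inliner treats either form.

```d
// Hypothetical helper: true when the ref parameter and the pointer
// name the same storage.
bool sameStorage(ref int r, int* p) {
    return &r is p;
}

void main() {
    int x = 42;
    int y = 42;
    assert(sameStorage(x, &x));   // the ref to x aliases x itself
    assert(!sameStorage(x, &y));  // equal values, different storage
}
```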
May 17 2009