digitalmars.D - Inlining Ref Functions
- dsimcha (8/8) May 15 2009 The fact that DMD does not inline functions with ref parameters has come...
- Bill Baxter (12/20) May 15 2009 ritical
- dsimcha (5/25) May 15 2009 Just to clarify: I actually don't think this is a terribly high priorit...
- Bill Baxter (25/50) May 15 2009 Well it was shown before in a demo ray-tracer that the inability to
- dsimcha (55/70) May 15 2009 On second thought, maybe this should be a high priority issue, for two r...
- grauzone (11/15) May 15 2009 I don't understand why D should pander to C++ freaks? If they think
- dsimcha (7/9) May 15 2009 In general I agree. There are higher priorities than improving DMD's op...
- Daniel Keep (4/28) May 16 2009 Looking at the DMD 2.029 source code, in inline.c around line 1307
- Bill Baxter (10/21) May 17 2009 out
- grauzone (2/23) May 17 2009
- grauzone (20/48) May 17 2009 Let me be a bit more serious about it: the main selling point of D are
- Bill Baxter (15/51) May 18 2009 st
- dsimcha (22/28) May 17 2009 But JVM and CLR languages either are orders of magnitude slower than D (...
- bearophile (211/211) May 16 2009 I have compiled a small variant of your code with the a very new LDC com...
- Michel Fortin (8/16) May 15 2009 Which makes me think: now that "this" in structs is now a "ref"
- dsimcha (4/7) May 15 2009 Oddly enough, based on looking at some disassembly, DMD apparently inlin...
- Denis Koroskin (2/16) May 17 2009 How about bugzillizing the feature request?
- dsimcha (7/23) May 17 2009 It's already in there as bug 2008, though I updated it recently to show ...
The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?
May 15 2009
On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?

+1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap() then it would be a great boon for DMD benchmarks.
--bb
May 15 2009
== Quote from Bill Baxter (wbaxter gmail.com)'s article
> On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
>> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?
> +1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap() then it would be a great boon for DMD benchmarks.
> --bb

Just to clarify: I actually don't think this is a terribly high priority item. Now that D2 is apparently pretty much feature complete, I'd rather see fixes for functionality bugs than for relatively mild performance bugs. My question was purely out of curiosity.
May 15 2009
Well it was shown before in a demo ray-tracer that the inability to inline funcs with refs caused a significant speed hit under DMD. And now we're seeing it's causing a significant speed hit for sorting because of swap routines.

There may be some thorny issues regarding inlining with refs in the general case but with code like this:

    float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }

it really should be trivial for the compiler to figure out that direct substitution is possible in a simple case like:

    vec3 w;
    ...
    auto len2 = lengthSqr(w);

Maybe I'm missing something, but that looks pretty darn straightforward.
--bb

On Fri, May 15, 2009 at 2:01 PM, dsimcha <dsimcha yahoo.com> wrote:
> == Quote from Bill Baxter (wbaxter gmail.com)'s article
>> On Fri, May 15, 2009 at 1:36 PM, dsimcha <dsimcha yahoo.com> wrote:
>>> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?
>> +1 on bumping up the priority on it. Even if it can't be made to work in every case, if it could at least be made to work in simple cases like swap() then it would be a great boon for DMD benchmarks.
>> --bb
> Just to clarify: I actually don't think this is a terribly high priority item. Now that D2 is apparently pretty much feature complete, I'd rather see fixes for functionality bugs than for relatively mild performance bugs. My question was purely out of curiosity.
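The lengthSqr case above can be made self-contained as a minimal sketch. The vec3 definition (three float fields) is an assumption, since the post does not show it; only lengthSqr and the call site come from the post. The `inlined` line shows what direct substitution of the argument for the ref parameter would amount to at the source level.

```d
import std.stdio : writeln;

// Assumed layout: the post does not define vec3.
struct vec3
{
    float x, y, z;
}

// As in the post: the argument is taken by const reference.
float lengthSqr(const ref vec3 v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

void main()
{
    auto w = vec3(1.0f, 2.0f, 3.0f);

    // The call as written in the post.
    auto len2 = lengthSqr(w);

    // What direct substitution would produce: every use of v
    // in the function body is replaced by w.
    auto inlined = w.x * w.x + w.y * w.y + w.z * w.z;

    assert(len2 == inlined);
    writeln(len2);
}
```

Whether a given compiler actually performs this substitution has to be checked in the generated code; the sketch only shows that the two forms compute the same value.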
May 15 2009
== Quote from Bill Baxter (wbaxter gmail.com)'s article
> Well it was shown before in a demo ray-tracer that the inability to inline funcs with refs caused a significant speed hit under DMD. And now we're seeing it's causing a significant speed hit for sorting because of swap routines.
> There may be some thorny issues regarding inlining with refs in the general case but with code like this:
>     float lengthSqr(const ref vec3 v) { return v.x*v.x + v.y*v.y + v.z*v.z; }
> it really should be trivial for the compiler to figure out that direct substitution is possible in a simple case like:
>     vec3 w;
>     ...
>     auto len2 = lengthSqr(w);
> Maybe I'm missing something, but that looks pretty darn straightforward.
> --bb

On second thought, maybe this should be a high priority issue, for two reasons.

1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.
2. DMD apparently already inlines certain functions with pointers, so why not references? Aren't references just syntactic sugar for pointers? If DMD's inliner can already handle pointers, I would think (I could be wrong) that it would be able to trivially handle references.

Here's some test code:

    import std.stdio;

    // Shouldn't this generate *exactly* the same code as ptrSwap?
    void swap(T)(ref T a, ref T b) {
        T temp = a;
        a = b;
        b = temp;
    }

    void ptrSwap(T)(T* a, T* b) {
        T temp = *a;
        *a = *b;
        *b = temp;
    }

    void main() {
        uint a, b;
        swap(a, b);
        ptrSwap(&a, &b);
        writeln(a); // Keep DMD from optimizing out ptrSwap entirely.
    }

Here's the disassembly of the relevant portion:

    COMDEF __Dmain
            push    eax                                 ; 0000 _ 50
            push    eax                                 ; 0001 _ 50
            xor     eax, eax                            ; 0002 _ 31. C0
            push    ebx                                 ; 0004 _ 53
            lea     ecx, [esp+4H]                       ; 0005 _ 8D. 4C 24, 04
            mov     dword ptr [esp+4H], eax             ; 0009 _ 89. 44 24, 04
            mov     dword ptr [esp+8H], eax             ; 000D _ 89. 44 24, 08
            push    ecx                                 ; 0011 _ 51
            lea     eax, [esp+0CH]                      ; 0012 _ 8D. 44 24, 0C
            call    _D5test711__T4swapTkZ4swapFKkKkZv   ; 0016 _ E8, 00000000(rel)
            mov     edx, dword ptr [esp+4H]             ; 001B _ 8B. 54 24, 04
            mov     ebx, dword ptr [esp+8H]             ; 001F _ 8B. 5C 24, 08
            mov     dword ptr [esp+4H], ebx             ; 0023 _ 89. 5C 24, 04
            mov     eax, offset FLAT:_main              ; 0027 _ B8, 00000000(segrel)
            mov     dword ptr [esp+8H], edx             ; 002C _ 89. 54 24, 08
            push    ebx                                 ; 0030 _ 53
            push    10                                  ; 0031 _ 6A, 0A
            call    _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv; 0033 _ E8, 00000000(rel)
            xor     eax, eax                            ; 0038 _ 31. C0
            pop     ebx                                 ; 003A _ 5B
            add     esp, 8                              ; 003B _ 83. C4, 08
            ret                                         ; 003E _ C3
    __Dmain ENDP

This confirms that DMD inlines ptrSwap, but not swap. I also did some benchmarks, and ptrSwap is as fast as manual inlining, but swap is slower by a factor of 2.
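The benchmark itself is not shown in the post. A rough sketch of how such a comparison could be set up follows. It is not the original benchmark: it assumes std.datetime.stopwatch from present-day Phobos (which postdates this thread), the iteration count is arbitrary, and no timings are claimed here. An optimizing compiler may collapse some or all of these loops, so the numbers only mean something when checked against the disassembly.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Same two functions as in the test code above.
void swap(T)(ref T a, ref T b) { T temp = a; a = b; b = temp; }
void ptrSwap(T)(T* a, T* b) { T temp = *a; *a = *b; *b = temp; }

void main()
{
    enum iterations = 10_000_000; // arbitrary, chosen for illustration
    uint a = 1, b = 2;

    // Variant 1: ref parameters.
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. iterations)
        swap(a, b);
    auto refTime = sw.peek;

    // Variant 2: pointer parameters.
    sw.reset();
    foreach (i; 0 .. iterations)
        ptrSwap(&a, &b);
    auto ptrTime = sw.peek;

    // Variant 3: the swap written out by hand ("manually inlined").
    sw.reset();
    foreach (i; 0 .. iterations)
    {
        uint temp = a;
        a = b;
        b = temp;
    }
    auto manualTime = sw.peek;

    writefln("ref swap:     %s ms", refTime.total!"msecs");
    writefln("pointer swap: %s ms", ptrTime.total!"msecs");
    writefln("manual swap:  %s ms", manualTime.total!"msecs");
    writefln("a = %s, b = %s", a, b); // use the results so they stay live
}
```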
May 15 2009
> On second thought, maybe this should be a high priority issue, for two reasons.
> 1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.

I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.

efficient, it won't interfere with D's world domination plans.

Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.

/random pointless rant

Of course, this doesn't have to do with DMD's optimizer. That said, if you want heavy optimization, you rather should look at LDC or GDC.
May 15 2009
== Quote from grauzone (none example.net)'s article
> Of course, this doesn't have to do with DMD's optimizer. That said, if you want heavy optimization, you rather should look at LDC or GDC.

In general I agree. There are higher priorities than improving DMD's optimization back end. This is a special case, though, because:

1. Innocent looking code can turn into a significant performance bottleneck.
2. I would guess it's a very low hanging piece of fruit, since ref is just syntactic sugar for a pointer and DMD already can inline functions with pointer parameters.
May 15 2009
dsimcha wrote:
> == Quote from grauzone (none example.net)'s article
>> Of course, this doesn't have to do with DMD's optimizer. That said, if you want heavy optimization, you rather should look at LDC or GDC.
> In general I agree. There are higher priorities than improving DMD's optimization back end. This is a special case, though, because:
> 1. Innocent looking code can turn into a significant performance bottleneck.
> 2. I would guess it's a very low hanging piece of fruit, since ref is just syntactic sugar for a pointer and DMD already can inline functions with pointer parameters.

Looking at the DMD 2.029 source code, in inline.c around line 1307 (slightly reformatted)...

    /* If any parameters are Tsarray's (which are passed by reference)
     * or out parameters (also passed by reference), don't do inlining.
     */
    if (parameters)
    {
        for (int i = 0; i < parameters->dim; i++)
        {
            VarDeclaration *v = (VarDeclaration *)parameters->data[i];
            if (v->isOut() || v->isRef() || v->type->toBasetype()->ty == Tsarray)
                goto Lno;
        }
    }

-- Daniel
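To make the quoted check concrete, here is a small D sketch of the parameter kinds it tests for. The function names are hypothetical, and the comments only restate the condition in the inline.c excerpt above; they are not a verified description of what any particular DMD release inlines.

```d
// Matches v->isRef(): the check above jumps to Lno.
void incRef(ref int x) { ++x; }

// Matches v->isOut(): the check above jumps to Lno.
void setOut(out int x) { x = 1; }

// Matches the Tsarray test: the parameter's base type is a static array.
int firstOf(int[4] a) { return a[0]; }

// Matches none of the three tests: a pointer parameter is
// neither ref, nor out, nor a static array.
void incPtr(int* p) { ++*p; }

void main()
{
    int n = 0;
    incRef(n);   // n is now 1
    incPtr(&n);  // n is now 2

    int m;
    setOut(m);   // m is now 1

    int[4] arr = [7, 8, 9, 10];
    assert(n == 2 && m == 1 && firstOf(arr) == 7);
}
```

This lines up with the disassembly posted earlier in the thread: the pointer version passes this particular check, while the ref version is rejected before any further analysis.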
May 16 2009
On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
>> 1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.
> I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.
> efficient, it won't interfere with D's world domination plans.
> Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.

Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
--bb
May 17 2009
Bill Baxter wrote:
> On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
>>> 1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.
>> I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.
>> efficient, it won't interfere with D's world domination plans.
>> Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
> Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
> --bb

If you'd really care about performance, you'd code in ASM. Blurb.
May 17 2009
grauzone wrote:
> Bill Baxter wrote:
>> On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
>>>> 1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.
>>> I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.
>>> efficient, it won't interfere with D's world domination plans.
>>> Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
>> Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
>> --bb
> If you'd really care about performance, you'd code in ASM. Blurb.

Let me be a bit more serious about it: the main selling point of D are template meta programming features, and the fact that it's compiled (that is, it's lightweight enough, and doesn't require a clusterfuck of a VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use them.

In contrast, there are areas where D sacrifices performance for simplicity. The GC is an example. Generally, there are lots of features which are "slow, simple and safe" by default. This seems to slowly go away, though.

Someone who is really interested in performance will use some of those C/C++ compilers and would be scared of dmd's code generator. Seeing how old it is and that it supports x86 only, he may even break out in laughter. (Maybe he'll do that while waiting for his C++ compiler to finish compiling.)

I understand that the current trend of D is to close the performance gaps to C++. Actually, D seems to aim to be a "better C++". Simplicity is replaced by more and more obscure features, etc... With my irrational hate for C++, I'm not sure if I like this. I still hope that the end result will be something nice.
May 17 2009
On Sun, May 17, 2009 at 11:12 AM, grauzone <none example.net> wrote:
> grauzone wrote:
>> Bill Baxter wrote:
>>> On Fri, May 15, 2009 at 9:16 PM, grauzone <none example.net> wrote:
>>>>> 1. Some uber-hardcore performance freaks will not even consider D if it has the slightest bit of performance overhead compared to C++.
>>>> I don't understand why D should pander to C++ freaks? If they think their language is great, they'll just continue programming C++. Nobody cares about them, and they will die a sad, lonely death.
>>>> efficient, it won't interfere with D's world domination plans.
>>>> Rather, one should avoid cloning the more annoying C++ features. (Ah yes, of course D makes them "better". Huh.) That's why we all use D in the first place.
>>> Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
>> If you'd really care about performance, you'd code in ASM. Blurb.
> Let me be a bit more serious about it: the main selling point of D are template meta programming features, and the fact that it's compiled (that is, it's light weight enough, and doesn't require a clusterfuck of a VM/runtime). Neither JVM nor CLR provide such a language. Else I'd use them.

http://www.wordinfo.info/words/index/info/view_unit/1

"Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!"

To you, the things you listed may be the main selling points. To others they may not be. They are nice features, to be sure, but I'm not sure who gave you authority to declare those the "main selling point [sic]".
--bb
May 18 2009
== Quote from Bill Baxter (wbaxter gmail.com)'s article
> Performance is one of the major selling points of D. Why settle for a language with sub-standard tools and lack of safety if you don't care about performance? If you don't care about performance you're probably better off with one of those languages that runs on the JVM or CLR.
> --bb

But JVM and CLR languages either are orders of magnitude slower than D (the dynamic ones) or are statically typed and don't have the insane metaprogramming capabilities of D. IMHO the biggest single selling point of D, D2 at least, is its metaprogramming. This allows one to have a statically typed language, meaning good performance (even if current implementations are slightly slower than C++, the difference between D vs. Python or Perl is much greater than D vs. C++ or raw, inscrutable hexadecimal numbers) and compile time checking while still being able to design flexible APIs.

IMHO (honestly not that informed since I've barely used Java, mostly a gut feeling) Java APIs feel so ridiculously overengineered because, with static typing and no metaprogramming, there's no other way to build flexibility into the system besides going overboard on the OO design. When your only tool is the virtual function call, everything starts to look like a virtual hammer.

D APIs, on the other hand, can be very flexible *and* simple. std.algorithm and std.range are good examples, but since they were written by a metaprogramming uber-guru and the language was designed specifically to accommodate them, they might not be that representative. However, even for dstats, which is written by a relative amateur and did not have the luxury of having parts of the core language designed around it, I cringe when I think about what I'd have to do in Java to make it as flexible as it is, and API design for a statistics library is probably one of the easier cases.
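As a small illustration of the "flexible *and* simple" point, here is a sketch of a statistics-style function that accepts any input range of numbers without an interface hierarchy. The function `mean` below is a hypothetical example written for this purpose, not code taken from dstats, and it uses the std.range.primitives module from present-day Phobos.

```d
import std.range.primitives : ElementType, isInputRange;
import std.stdio : writeln;

// Hypothetical helper: works on arrays, lazy ranges, or any other
// input range whose elements convert implicitly to double.
// The constraint is checked at compile time.
double mean(R)(R range)
    if (isInputRange!R && is(ElementType!R : double))
{
    double sum = 0;
    size_t n = 0;
    foreach (x; range)
    {
        sum += x;
        ++n;
    }
    return n == 0 ? double.nan : sum / n;
}

void main()
{
    import std.algorithm.iteration : map;
    import std.range : iota;

    // A plain array of ints.
    writeln(mean([1, 2, 3, 4]));

    // A lazily computed range of doubles; no array is allocated.
    writeln(mean(iota(1, 5).map!(x => x * 0.5)));
}
```

The same source handles both call sites through static dispatch; passing something that is not a range of numbers is rejected at compile time rather than at run time.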
May 17 2009
I have compiled a small variant of your code with a very new LDC compiler (May 9 revision, it doesn't print the revision number), the code:

    import tango.stdc.stdio: printf;
    import Integer = tango.text.convert.Integer;

    void swap(T)(ref T a, ref T b) {
        T temp = a;
        a = b;
        b = temp;
    }

    void ptrSwap(T)(T* a, T* b) {
        T temp = *a;
        *a = *b;
        *b = temp;
    }

    void main(char[][] args) {
        int a = Integer.parse(args[1]);
        int b = Integer.parse(args[2]);
        printf("%d\n", a);
        swap(a, b);
        printf("%d\n", a);
        ptrSwap(&a, &b);
        printf("%d\n", a);
    }

Generated asm with various compiler arguments:

ldc -output-s -release inline_test.d

    swap:
        subl    $4, %esp
        movl    8(%esp), %ecx
        movl    (%ecx), %edx
        movl    %edx, (%esp)
        movl    (%eax), %edx
        movl    %edx, (%ecx)
        movl    (%esp), %ecx
        movl    %ecx, (%eax)
        addl    $4, %esp
        ret     $4

    ptrSwap:
        subl    $12, %esp
        movl    16(%esp), %ecx
        movl    %ecx, 8(%esp)
        movl    %eax, 4(%esp)
        movl    8(%esp), %eax
        movl    (%eax), %eax
        movl    %eax, (%esp)
        movl    4(%esp), %eax
        movl    (%eax), %eax
        movl    8(%esp), %ecx
        movl    %eax, (%ecx)
        movl    (%esp), %eax
        movl    4(%esp), %ecx
        movl    %eax, (%ecx)
        addl    $12, %esp
        ret     $4

    main:
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        subl    $32, %esp
        movl    52(%esp), %eax
        movl    %eax, 28(%esp)
        movl    48(%esp), %eax
        movl    %eax, 24(%esp)
        movl    28(%esp), %eax
        movl    12(%eax), %ecx
        movl    8(%eax), %eax
        movl    %ecx, 8(%esp)
        movl    %eax, 4(%esp)
        movl    $0, (%esp)
        xorl    %esi, %esi
        movl    %esi, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, 20(%esp)
        movl    28(%esp), %eax
        movl    20(%eax), %ecx
        movl    16(%eax), %eax
        movl    %ecx, 8(%esp)
        movl    %eax, 4(%esp)
        movl    $0, (%esp)
        movl    %esi, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, 16(%esp)
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str1, (%esp)
        call    printf
        leal    20(%esp), %edi
        movl    %edi, (%esp)
        leal    16(%esp), %ebx
        movl    %ebx, %eax
        call    swap
        subl    $4, %esp
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str2, (%esp)
        call    printf
        movl    %edi, (%esp)
        movl    %ebx, %eax
        call    ptrSwap
        subl    $4, %esp
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str3, (%esp)
        call    printf
        [...]
-------------------------

ldc -inline -release -output-s inline_test.d

    main:
        pushl   %esi
        subl    $48, %esp
        movl    60(%esp), %eax
        movl    %eax, 28(%esp)
        movl    56(%esp), %eax
        movl    %eax, 24(%esp)
        movl    28(%esp), %eax
        movl    12(%eax), %ecx
        movl    8(%eax), %eax
        movl    %ecx, 8(%esp)
        movl    %eax, 4(%esp)
        movl    $0, (%esp)
        xorl    %esi, %esi
        movl    %esi, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, 20(%esp)
        movl    28(%esp), %eax
        movl    20(%eax), %ecx
        movl    16(%eax), %eax
        movl    %ecx, 8(%esp)
        movl    %eax, 4(%esp)
        movl    $0, (%esp)
        movl    %esi, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, 16(%esp)
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str1, (%esp)
        call    printf
        movl    20(%esp), %eax
        movl    %eax, 32(%esp)
        movl    16(%esp), %eax
        movl    %eax, 20(%esp)
        movl    32(%esp), %eax
        movl    %eax, 16(%esp)
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str2, (%esp)
        call    printf
        leal    20(%esp), %eax
        movl    %eax, 44(%esp)
        leal    16(%esp), %eax
        movl    %eax, 40(%esp)
        movl    44(%esp), %eax
        movl    (%eax), %eax
        movl    %eax, 36(%esp)
        movl    40(%esp), %eax
        movl    (%eax), %eax
        movl    44(%esp), %ecx
        movl    %eax, (%ecx)
        movl    36(%esp), %eax
        movl    40(%esp), %ecx
        movl    %eax, (%ecx)
        movl    20(%esp), %eax
        movl    %eax, 4(%esp)
        movl    $.str3, (%esp)
        call    printf
        [...]

-------------------------

ldc -inline -release -O5 -output-s inline_test.d

    main:
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        subl    $16, %esp
        movl    36(%esp), %esi
        movl    12(%esi), %eax
        movl    8(%esi), %ecx
        movl    %eax, 8(%esp)
        movl    %ecx, 4(%esp)
        movl    $0, (%esp)
        xorl    %edi, %edi
        xorl    %eax, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, %ebx
        movl    20(%esi), %eax
        movl    16(%esi), %ecx
        movl    %eax, 8(%esp)
        movl    %ecx, 4(%esp)
        movl    $0, (%esp)
        movl    %edi, %eax
        call    Integer.parse
        subl    $12, %esp
        movl    %eax, %esi
        movl    %ebx, 4(%esp)
        movl    $.str1, (%esp)
        call    printf
        movl    %esi, 4(%esp)
        movl    $.str1, (%esp)
        call    printf
        movl    %ebx, 4(%esp)
        movl    $.str1, (%esp)
        call    printf
        [...]

You can see that -inline is enough to get both inlined.

The performance of D isn't something to ignore, I have translated a small ray tracing program from C++ and I have seen performance up to about 3-3.5 times slower with DMD, mostly because of missing inlining.

Some benchmarks:
http://www.fantascienza.net/leonardo/js/smallpt.zip
http://www.fantascienza.net/leonardo/js/ao_bench.zip

Bye,
bearophile
May 16 2009
On 2009-05-15 16:36:16 -0400, dsimcha <dsimcha yahoo.com> said:
> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?

Which makes me think: now that "this" in structs is a "ref" parameter, does that mean that all non-static member functions of a struct won't be inlined?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 15 2009
== Quote from Michel Fortin (michel.fortin michelf.com)'s article
> Which makes me think: now that "this" in structs is now a "ref" parameter, does that mean that all non-static member functions of a struct won't be inlined?

Oddly enough, based on looking at some disassembly, DMD apparently inlines ref in this special case. Go figure. If there are any explicit parameters that are ref, though, it doesn't inline.
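The two cases described above look roughly like this in source form. The struct and its member names are hypothetical, made up for illustration; which of the two calls a given DMD build inlines still has to be confirmed by looking at the disassembly, as was done for the swap example.

```d
struct Counter
{
    int value;

    // Only the implicit `this` is passed by reference here.
    // This is the "special case" reported above as being inlined.
    void bump() { ++value; }

    // Implicit `this` plus an explicit ref parameter.
    // This is the case reported above as not being inlined.
    void drainInto(ref Counter other)
    {
        other.value += value;
        value = 0;
    }
}

void main()
{
    auto a = Counter(3);
    auto b = Counter(10);

    a.bump();        // a.value is now 4
    a.drainInto(b);  // b.value is now 14, a.value is now 0

    assert(a.value == 0 && b.value == 14);
}
```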
May 15 2009
On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:
> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?

How about bugzillizing the feature request?
May 17 2009
== Quote from Denis Koroskin (2korden gmail.com)'s article
> On Sat, 16 May 2009 00:36:16 +0400, dsimcha <dsimcha yahoo.com> wrote:
>> The fact that DMD does not inline functions with ref parameters has come up several times deep in threads on this NG before, but it's never really received proper attention. After changing a few swaps in performance-critical areas of my code to "manually inlined" swaps and seeing significant speedups, I'm kind of curious what the rationale is for not inlining functions w/ ref params. Is there a good technical reason for this or is it simply a matter of having higher priorities? Is inlining functions w/ ref params on the "to do eventually" list?
> How about bugzillizing the feature request?

It's already in there as bug 2008, though I updated it recently to show a disassembly proving that DMD can inline pointer param functions but not ref param functions. My guess is that, to someone already generally familiar w/ the DMD code base (I'm not), this would be really easy to fix, because ref params are just syntactic sugar for pointers. Heck, IIRC you can use ref params as pointers in asm blocks.
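The "syntactic sugar for pointers" view can at least be checked at the source level. In the sketch below (function names are hypothetical), taking the address of a ref parameter yields the address of the caller's variable, so the ref version and the pointer version write to the same kind of location. This says nothing about how a particular compiler lowers ref internally; it only shows the observable equivalence.

```d
// Writes through a ref parameter and checks that the parameter
// aliases the caller's variable.
void setViaRef(ref int x, int* addressOfArgument)
{
    // &x is the address of the variable passed by the caller.
    assert(&x is addressOfArgument);
    x = 42;
}

// The explicit-pointer equivalent.
void setViaPtr(int* p)
{
    *p = 42;
}

void main()
{
    int a, b;
    setViaRef(a, &a);
    setViaPtr(&b);
    assert(a == 42 && b == 42);
}
```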
May 17 2009