digitalmars.D - Dynamic Closure + Lazy Arguments = Performance Killer?
- Jason House (4/4) Oct 24 2008 I ported some monte carlo simulation code from Java to D2, and performan...
- Gregor Richards (4/11) Oct 24 2008 Java has a much better garbage collector than D, as it doesn't need to
- Jason House (3/16) Oct 24 2008 The code is written to explicitly avoid memory allocation, especially in...
- Frank Benoit (6/19) Oct 24 2008 It was written in this NG over and over. The D2 full closure feature is
- Bill Baxter (12/31) Oct 24 2008 Not to mention that, among the top problems plaguing D2 currently, it
- Robert Fraser (2/23) Oct 24 2008 Agreed. Is it in bugzilla?
- bearophile (9/12) Oct 24 2008 Can you show us a working minimal code I/we can test?
- Jason House (6/38) Oct 24 2008 The following spends 90% of its time in _d_alloc_memory
- bearophile (9/14) Oct 24 2008 I see. So I presume it becomes quite difficult for D2 to compute up to t...
- Jason House (2/20) Oct 25 2008 I would assume a fix would be to add scope to input delegates and to req...
- Bill Baxter (16/33) Oct 25 2008 This makes no sense because the writer of bar has no idea whether the
- Jason House (2/41) Oct 25 2008 While I agree that should be the default, I've already seen plenty of D1...
- Lars Ivar Igesund (10/48) Oct 25 2008 I agree that D1 behaviour should be the default, since otherwise it'll b...
- Denis Koroskin (10/56) Oct 25 2008 I believe the default should be the one that is most frequently used, ev...
- Lars Ivar Igesund (7/69) Oct 25 2008 I definately agree with this.
- Jarrett Billingsley (3/8) Oct 25 2008 Fantastic. That also neatly solves the "returning a delegate"
- Frits van Bommel (24/32) Oct 25 2008 How would this work? For example:
- Denis Koroskin (30/54) Oct 25 2008 Good question! First, let's expand the code:
- Jarrett Billingsley (10/20) Oct 25 2008 Clarification - it would be an error to return a scope delegate from
- Steven Schveighoffer (58/117) Oct 26 2008 I've been thinking about this solution, and I think the decision to allo...
- Bill Baxter (48/103) Oct 26 2008 Ok, so that's a good example where only the caller knows that heap
- Andrei Alexandrescu (4/17) Oct 24 2008 std.random does not use dynamic memory allocation. Walter is almost done...
- Bill Baxter (8/25) Oct 24 2008 Well the suggestion is that it may be using dynamic memory allocation
- Andrei Alexandrescu (4/29) Oct 24 2008 I forgot.
- Jason House (2/3) Oct 24 2008 Lazy arguments are delegates, and enforce uses lazy arguments
- Andrei Alexandrescu (3/8) Oct 24 2008 Yikes, I see.
- Jason House (6/24) Oct 24 2008 This is exactly why so many have complained about the dynamic closure
- Jason House (2/21) Oct 28 2008 I was wrong about the 4x thing. I have bad hardware. After fixing the ac...
- Steven Schveighoffer (5/38) Oct 28 2008 When you say 'fixing the accidental allocation' you mean removing the ca...
- Jason House (2/45) Oct 29 2008 Yes. You are right. The allocation of the dynamic closure was the only p...
- Russell Lewis (26/26) Nov 03 2008 Objective 1: Make the heap vs. stack variables explicit
I ported some monte carlo simulation code from Java to D2, and performance is horrible. 34% of the execution time is used by std.random.uniform. To my great surprise, 25% of the execution time is memory allocation (and collection) from that random call. The only candidate source I see is a call to ensure with lazy arguments. The memory allocation occurs at the start of the UniformDistribution call. I assume this is dynamic closure kicking in. Can anyone verify that this is the case? 600000 memory allocations per second really kills performance!
Oct 24 2008
Jason House wrote:
> I ported some monte carlo simulation code from Java to D2, and performance is horrible. 34% of the execution time is used by std.random.uniform. To my great surprise, 25% of the execution time is memory allocation (and collection) from that random call. The only candidate source I see is a call to ensure with lazy arguments. The memory allocation occurs at the start of the UniformDistribution call. I assume this is dynamic closure kicking in. Can anyone verify that this is the case? 600000 memory allocations per second really kills performance!

Java has a much better garbage collector than D, as it doesn't need to be conservative.

- Gregor Richards
Oct 24 2008
Gregor Richards Wrote:
> Jason House wrote:
>> To my great surprise, 25% of the execution time is memory allocation (and collection) from that random call. [...] 600000 memory allocations per second really kills performance!
>
> Java has a much better garbage collector than D, as it doesn't need to be conservative.

The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger.

This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.
Oct 24 2008
Jason House wrote:
> The only candidate source I see is a call to ensure with lazy arguments. The memory allocation occurs at the start of the UniformDistribution call. I assume this is dynamic closure kicking in. Can anyone verify that this is the case? 600000 memory allocations per second really kills performance!

This has been written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. Nested functions passed as callbacks are an important and high-performance technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.
Oct 24 2008
On Sat, Oct 25, 2008 at 7:23 AM, Frank Benoit <keinfarbton googlemail.com> wrote:
> It was written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. The nested functions passed as callback are an important and performance technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.

Not to mention that, among the top problems plaguing D2 currently, it should be one of the easier things to fix. Far easier than figuring out overhauls for operator overloading, or construction syntax, or how ranges should work, or forward reference, or figuring out how 'shared' should work, or merging Tango and Phobos. Compared to those it's pretty easy to solve this one!

Personally I think no alloc should be the default, with different syntax to get a full closure. Using the "new" keyword somehow makes sense to me.

--bb
Oct 24 2008
Frank Benoit wrote:
> It was written in this NG over and over. The D2 full closure feature is a BIG!!!! problem. The nested functions passed as callback are an important and performance technique in D. The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.

Agreed. Is it in bugzilla?
Oct 24 2008
Jason House:
> 34% of the execution time is used by std.random.uniform.

Kiss of Tango is much faster, and there's an even faster (but still good) rnd generator around...

> Can anyone verify that this is the case?

Can you show us a minimal working example I/we can test?

-------------------

Frank Benoit:
> The D2 full closure "feature" effectively removes it and makes D2 less attractive IMHO.

The first simple solution is to allow adding "scope" to closures so they don't use the heap (but I don't know how to do that in every situation, and it makes the already long syntax of lambdas even longer).

But probably the best way for D to become more functional (and normal functional programming is often full of functions that move everywhere, often they are closures, but only virtually) is to grow some more optimizing capabilities, so closures aren't a problem anymore. There are many ways to perform such optimizations (but in a language mostly based on side effects it's less easy).

Bye,
bearophile
Oct 24 2008
bearophile wrote:
> Can you show us a working minimal code I/we can test?

The following spends 90% of its time in _d_alloc_memory

-----
void bar(lazy int i){}

void foo(int i){ bar(i); }

void main(){
    foreach(int i; 1..1000000)
        foo(i);
}
-----

Compiling with -O -release reduces it to 88% :)
Oct 24 2008
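To make the connection between `lazy` and delegates concrete, here is a minimal sketch in D. The names `twiceLazy`, `twiceDelegate` and `countEvaluations` are made up for illustration and do not come from the thread. It only shows that a lazy argument behaves like a delegate that is called each time the parameter is read, which is why the caller has to hand over a context pointer to its own locals. It says nothing about where a given compiler version stores that context.

```d
import std.stdio : writeln;

// A lazy parameter: the argument expression is re-evaluated on each use.
int twiceLazy(lazy int value)
{
    return value + value;
}

// Roughly the same thing spelled out with an explicit delegate.
// `scope` states that the callee does not keep the delegate around.
int twiceDelegate(scope int delegate() value)
{
    return value() + value();
}

// Runs both variants against one counter and reports how often it was bumped.
int countEvaluations()
{
    int calls = 0;
    int next() { return ++calls; }

    int a = twiceLazy(next());     // next() runs twice: 1 + 2
    int b = twiceDelegate(&next);  // next() runs twice more: 3 + 4
    assert(a == 3 && b == 7);
    return calls;
}

void main()
{
    writeln(countEvaluations());
}
```

Both calls need access to `calls`, a local of the calling function. Whether that access goes through the caller's stack frame or through a heap-allocated copy is exactly the cost being measured in the post above.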
Jason House:
> The following spends 90% of its time in _d_alloc_memory
> void bar(lazy int i){}
> void foo(int i){ bar(i); }
> void main(){ foreach(int i; 1..1000000) foo(i); }
> Compiling with -O -release reduces it to 88% :)

I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just a few seconds to run on D1):
http://en.wikipedia.org/wiki/Man_or_boy_test

What syntax can we use to avoid heap allocation? A few ideas:

-----
void bar(lazy int i){}       // like D1
void bar(scope lazy int i){} // like D1
void bar(closure int i){}    // like current D2
-----

Bye,
bearophile
Oct 24 2008
bearophile Wrote:
> What syntax can we use to avoid heap allocation? Few ideas:
> void bar(lazy int i){} // like D1
> void bar(scope lazy int i){} // like D1
> void bar(closure int i){} // like current D2

I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.
Oct 25 2008
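As a sketch of what "scope on input delegates" can look like, the following D code uses a `scope` delegate parameter together with `@nogc`. It assumes a present-day D compiler, not the 2008 one discussed in this thread, and `forEachIndex` and `sumBelow` are hypothetical names chosen for the example. The `@nogc` attribute is used only as a check: the compiler rejects `sumBelow` if building the delegate required a GC-allocated closure.

```d
// Hypothetical helper, not from the thread: calls dg for 0 .. n and
// does not store it, which is what `scope` on the parameter promises.
void forEachIndex(int n, scope void delegate(int) @nogc nothrow dg) @nogc nothrow
{
    foreach (i; 0 .. n)
        dg(i);
}

// Marked @nogc, so any hidden GC allocation here is a compile error.
int sumBelow(int n) @nogc nothrow
{
    int total = 0;
    // The literal reads and writes `total`, a local of sumBelow.
    forEachIndex(n, (int i) { total += i; });
    return total;
}

void main()
{
    assert(sumBelow(5) == 10); // 0 + 1 + 2 + 3 + 4
}
```

Here the decision sits with the callee's signature, which is the arrangement questioned in the following posts: the author of `forEachIndex` knows it never stores `dg`, but cannot know what the caller plans to do with the same delegate elsewhere.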
On Sat, Oct 25, 2008 at 5:24 PM, Jason House <jason.james.house gmail.com> wrote:
> I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.

This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.

I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen.

The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.

But apparently nobody who knows anything about what's actually going to happen is involved in this discussion, so I think I'll just pipe down for now.

--bb
Oct 25 2008
Bill Baxter Wrote:
> I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen.
>
> The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.

While I agree that should be the default, I've already seen plenty of D1 code that incorrectly used stack-based closures. It really depends on your usage patterns. I do a lot of inter-thread communication in D1
Oct 25 2008
Bill Baxter wrote:
> I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen.

I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.

--
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Oct 25 2008
On Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund <larsivar igesund.net> wrote:
> I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.

I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication.

I also think that scope and heap-allocated delegates should have different types so that no implicit casting from a scope delegate to a heap one would be possible. In this case a callee function that receives the delegate might demand that the delegate be heap-allocated (because it stores it, for example).
Oct 25 2008
Denis Koroskin wrote:
> I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication.
>
> I also think that scope and heap-allocated delegates should have different types so that no implicit casting from scope delegate to heap one would be possible. In this case callee function that receives the delegate might demand the delegate to be heap-allocated (because it stores it, for example).

I definitely agree with this.

--
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Oct 25 2008
On Sat, Oct 25, 2008 at 10:44 AM, Denis Koroskin <2korden gmail.com> wrote:
> I also think that scope and heap-allocated delegates should have different types so that no implicit casting from scope delegate to heap one would be possible. In this case callee function that receives the delegate might demand the delegate to be heap-allocated (because it stores it, for example).

Fantastic. That also neatly solves the "returning a delegate" problem; it simply becomes illegal to return a scope delegate.
Oct 25 2008
Jarrett Billingsley wrote:
> On Sat, Oct 25, 2008 at 10:44 AM, Denis Koroskin <2korden gmail.com> wrote:
>> I also think that scope and heap-allocated delegates should have different types so that no implicit casting from scope delegate to heap one would be possible. In this case callee function that receives the delegate might demand the delegate to be heap-allocated (because it stores it, for example).

How would this work? For example:
-----
struct Struct {
    // fields...
    void foo() {
        // body
    }
}

void bar(Struct* p) {
    auto dg = &p.foo; // stack-based or heap-based delegate?
    // do stuff with dg
}
-----
?
(There's no way to know if *p is a heap-based or stack-based struct)

> Fantastic. That also neatly solves the "returning a delegate" problem; it simply becomes illegal to return a scope delegate.

Even if "scope delegate" becomes a different type, sometimes such a "scope delegate" is perfectly safe to return:
-----
alias scope void delegate() Dg;

Dg foo(Dg dg) {
    return dg; // Why would this be illegal?
}
-----
Oct 25 2008
On Sat, 25 Oct 2008 21:17:34 +0400, Frits van Bommel <fvbommel remwovexcapss.nl> wrote:
> How would this work? For example:
> -----
> struct Struct {
>     // fields...
>     void foo() {
>         // body
>     }
> }
> void bar(Struct* p) {
>     auto dg = &p.foo; // stack-based or heap-based delegate?
>     // do stuff with dg
> }
> -----
> ?

Good question! First, let's expand the code:

-----
void bar(Struct* p) {
    void delegate() dg;
    dg.ptr = p;
    dg.funcptr = &Struct.foo;
    // do stuff with dg
}
-----

So, here is the question: is this a "stack-based or heap-based delegate"? I.e., may we return it from the function and pass it to functions that need a heap-based delegate, or not? Yes, obviously we may return it and call it outside of the function, so from this point of view it is indeed a "heap-allocated delegate" even if nothing is actually allocated.

But someone might say that it is unsafe to call this dg because at some point the object may cease to exist. To respond to this, let's rewrite the code to make it truly heap-allocated and see whether it gets any safer:

-----
void bar(Struct* p) {
    void foo() {
        p.foo();
    }
    auto dg = &foo;
}
-----

Now dg is heap-allocated (in the sense that the space for its local variables is allocated on the heap). May we return this delegate from the function? Yes. Is it any safer? No. They are absolutely the same.

> auto dg = &p.foo; // stack-based or heap-based delegate?

Heap-based one, even if no actual allocation took place.
Oct 25 2008
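The point about `dg.ptr = p` can be checked directly. Below is a small D sketch, with `Counter` and `bumper` as invented names rather than anything from the thread, showing that a delegate built from a struct pointer simply carries that pointer as its context, with no record of whether the struct lives on the stack or on the GC heap.

```d
struct Counter
{
    int count;
    void bump() { ++count; }
}

// Builds a delegate from a pointer, as in `auto dg = &p.foo;`.
void delegate() bumper(Counter* p)
{
    return &p.bump;
}

void main()
{
    auto onHeap = new Counter;   // allocated on the GC heap
    Counter onStack;             // a local variable of main

    auto a = bumper(onHeap);
    auto b = bumper(&onStack);

    // The context pointer is just the pointer that was passed in.
    assert(a.ptr is onHeap);
    assert(b.ptr is &onStack);

    a();
    b();
    b();
    assert(onHeap.count == 1);
    assert(onStack.count == 2);
}
```

`bumper` compiles the same way for both callers, so the delegate type alone cannot tell the two lifetimes apart. That is the difficulty raised in the question being answered.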
On Sat, Oct 25, 2008 at 1:17 PM, Frits van Bommel <fvbommel remwovexcapss.nl> wrote:
>> Fantastic. That also neatly solves the "returning a delegate" problem; it simply becomes illegal to return a scope delegate.
>
> Even if "scope delegate" becomes a different type, sometimes such a "scope delegate" is perfectly safe to return:
> -----
> alias scope void delegate() Dg;
> Dg foo(Dg dg) {
>     return dg; // Why would this be illegal?
> }
> -----

Clarification - it would be an error to return a scope delegate from the scope in which it was declared.

Currently the behavior you mention (passing a scope delegate into a function then returning it) doesn't even exist for scope classes - parameters cannot be "scope". I would imagine, though, that if a parameter were scope, a function would be able to return that parameter, and in fact that would be the only way to return a scope reference (delegate or class) from a function.
Oct 25 2008
"Denis Koroskin" wroteOn Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund <larsivar igesund.net> wrote:I've been thinking about this solution, and I think the decision to allocate scope or heap should be left up to the developer, and no types should be assigned. Think about an example like this: class DelegateCaller { private delegate int _foo(); this(int delegate() foo) { _foo = foo; } int callit() { return _foo();} } int f1() { int x() { return 5; } scope dc = new DelegateCaller(&x); // allocate on stack return dc.callit() * dc.callit(); } DelegateCaller f2() { int x() { return 5;} return new DelegateCaller(&x); // allocate on heap } So what type should DelegateCaller._foo be? I think the only real solution to this, aside from compiler analysis (which introduces all kinds of problems), is to declare all delegates are stack or heap allocated by default, and allow the developer to deviate by declaring the delegate as opposite. As I think most function delegates are expected to be stack allocated, it makes sense to me that stack delegates should be the default. As a suggestion for syntax, I'd say heap-allocated delegates should use the new keyword somehow: return new DelegateCaller(new(&x)); One issue to determine is how heap-allocated delegates are done. Should there be only one heap allocation per function call, or one per instantiation? If so, what happens if you change data in the function after instantiation? The difference is significant if you create multiple delegates: int delegate() foo[]; int i = 0; int getI() { return i; } foo ~= new(&getI); i++; foo ~= new(&getI); i++; for(int j = 0; j < foo.length; j++) { writefln(foo[j]); } What should be the correct output? 0 1 or 0 2 or 2 2 -SteveBill Baxter wrote:I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication. 
I also think that scope and heap-allocated delegates should have different types so that no imlicit casting from scope delegate to heap one would be possible. In this case callee function that recieves the delegate might demand the delegate to be heap-allocated (because it stores it, for example).On Sat, Oct 25, 2008 at 5:24 PM, Jason House <jason.james.house gmail.com> wrote:I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.bearophile Wrote:This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.Jason House:The following spends 90% of its time in _d_alloc_memory void bar(lazy int i){} void foo(int i){ bar(i); } void main(){ foreach(int i; 1..1000000) foo(i); } Compiling with -O -release reduces it to 88% :)I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test What syntax can we use to avoid heap allocation? Few ideas: void bar(lazy int i){} // like D1 void bar(scope lazy int i){} // like D1 void bar(closure int i){} // like current D2I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.I think for a language like D, hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. 
The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
Oct 26 2008
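For comparison, here is a sketch of the same experiment written with plain `&getI`, since the `new(&getI)` syntax above is only a proposal. It assumes the closure behaviour of D2 compilers as I understand it: the variables of the enclosing function are moved to one heap-allocated frame per call of that function, and every delegate taken in that call points at the same frame. `makeGetters` is a made-up name for the example.

```d
// Returns two delegates to the same nested function, taken at
// different values of i. Because the delegates escape the function,
// i is kept in a heap-allocated frame instead of on the stack.
int delegate()[] makeGetters()
{
    int delegate()[] getters;
    int i = 0;
    int getI() { return i; }

    getters ~= &getI;
    i++;
    getters ~= &getI;
    i++;
    return getters;
}

void main()
{
    auto getters = makeGetters();
    // Both delegates share one frame, so both see the final value of i.
    assert(getters[0].ptr is getters[1].ptr);
    assert(getters[0]() == 2);
    assert(getters[1]() == 2);
}
```

Under that assumption the answer for ordinary closures is "2 2": one allocation per function call, with later changes to `i` visible through every delegate.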
On Sun, Oct 26, 2008 at 11:38 PM, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> > I also think that scope and heap-allocated delegates should have different types, so that no implicit cast from a scope delegate to a heap one would be possible. In this case a callee function that receives the delegate might demand that the delegate be heap-allocated (because it stores it, for example).
>
> I've been thinking about this solution, and I think the decision to allocate scope or heap should be left up to the developer, and no types should be assigned. Think about an example like this:
>
>     class DelegateCaller
>     {
>         private int delegate() _foo;
>         this(int delegate() foo) { _foo = foo; }
>         int callit() { return _foo(); }
>     }
>
>     int f1()
>     {
>         int x() { return 5; }
>         scope dc = new DelegateCaller(&x); // allocate on stack
>         return dc.callit() * dc.callit();
>     }
>
>     DelegateCaller f2()
>     {
>         int x() { return 5; }
>         return new DelegateCaller(&x); // allocate on heap
>     }
>
> So what type should DelegateCaller._foo be?

Ok, so that's a good example where only the caller knows that heap allocation is necessary, and we already discussed a case where only the callee knows it's necessary.

> I think the only real solution to this, aside from compiler analysis (which introduces all kinds of problems), is to declare that all delegates are stack or heap allocated by default, and allow the developer to deviate by declaring the delegate as the opposite.

It seems to me that from the two cases above, a good solution might be to make stack the default but to allow *either* the callee *or* the caller to request that that default be overridden.

> As I think most function delegates are expected to be stack allocated, it makes sense to me that stack delegates should be the default.
>
> As a suggestion for syntax, I'd say heap-allocated delegates should use the new keyword somehow:
>
>     return new DelegateCaller(new(&x));
>
> One issue to determine is how heap-allocated delegates are done. Should there be only one heap allocation per function call, or one per instantiation? If so, what happens if you change data in the function after instantiation? The difference is significant if you create multiple delegates:
>
>     int delegate() foo[];
>     int i = 0;
>     int getI() { return i; }
>     foo ~= new(&getI);
>     i++;
>     foo ~= new(&getI);
>     i++;
>     for (int j = 0; j < foo.length; j++)
>     {
>         writeln(foo[j]());
>     }
>
> What should be the correct output?

    0
    1

Without thinking about implementation or the current behavior at all, this is the output I would expect from a full closure. It should capture the state at the time of its creation.

With the either/or proposal you'll need another rule, I think. If you have a case like this:

    void longTermDelegateKeeper(new int delegate() dg) { ... } // here "new" means heap required
    ...
    int i = 0;
    int getI() { return i; }
    int delegate() foo[];
    foo ~= &getI;
    i++;
    longTermDelegateKeeper(foo[0]); // <- what happens?

Here there are two options for the "what happens" line, I think:

1) The stack delegate returned by foo[0] triggers an implicit allocation and copying of the current stack variables (so foo[0]() will return "1").
2) Compiler error: "Heap delegate expected". The caller must then explicitly create a heap delegate out of the stack delegate, and by doing that is forced to examine which state he really means to capture in that delegate. Did he want it to capture the i==1 state or did he want it to capture i==0? And in a loop context it will force the developer to notice that he's triggering implicit allocations inside a loop when he may not mean to.

It also would make it possible to recognize allocations just by looking at code locally. Aside from these D2 delegates, I think it's always possible to tell where the allocations are by looking at D code. Setting a .length or doing ~= are not obviously (and not necessarily) allocations, but if you see one then you can guess that allocation is involved. I don't really want to end up with a situation where I have to guess whether the code I'm looking at is doing allocation just by calling a function that itself doesn't do any allocation either.

Finally, do stack and heap delegates really need to be distinct types? Maybe not. Maybe a run-time check would be good enough. If there's some kind of isHeapDelegate(dg) check available, then library writers could use that. The compiler wouldn't catch the error, but it might be sufficient to catch it at runtime in order to avoid the pain of introducing more types.

--bb
Oct 26 2008
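For readers following the "0 1" versus shared-state question, here is a minimal sketch in present-day D2, not the 2008 compiler discussed in the thread. The function names (sharedFrameResults, snapshot, snapshotResults) are hypothetical and exist only for illustration. It assumes the current language rule that nested functions see the enclosing frame by reference, so getting a per-creation snapshot requires copying the value explicitly.

```d
// Sketch only; all names here are hypothetical, not from any library.

/// Two delegates over the same frame: both observe later writes to `i`.
int[] sharedFrameResults()
{
    int delegate()[] foo;
    int i = 0;
    int getI() { return i; }

    foo ~= &getI;
    i++;
    foo ~= &getI;
    i++;

    int[] results;
    foreach (dg; foo)
        results ~= dg();   // every delegate reads the one shared `i`
    return results;
}

/// Returns a delegate that owns its own copy of `value`.
int delegate() snapshot(int value)
{
    return () => value;    // captures this call's parameter, not the caller's `i`
}

/// Same sequence as above, but each delegate captures the state at creation.
int[] snapshotResults()
{
    int delegate()[] foo;
    int i = 0;

    foo ~= snapshot(i);
    i++;
    foo ~= snapshot(i);
    i++;

    int[] results;
    foreach (dg; foo)
        results ~= dg();
    return results;
}
```

Under these assumptions sharedFrameResults() yields [2, 2], while snapshotResults() yields [0, 1], which is the output described above as the expectation for a closure that captures state at the time of its creation.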
Jason House wrote:
> I ported some monte carlo simulation code from Java to D2, and performance is horrible. 34% of the execution time is used by std.random.uniform. To my great surprise, 25% of the execution time is memory allocation (and collection) from that random call. The only candidate source I see is a call to ensure with lazy arguments. The memory allocation occurs at the start of the UniformDistribution call. I assume this is dynamic closure kicking in. Can anyone verify that this is the case? 600000 memory allocations per second really kills performance!

std.random does not use dynamic memory allocation. Walter is almost done implementing static closures.

Andrei
Oct 24 2008
On Sat, Oct 25, 2008 at 9:59 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> std.random does not use dynamic memory allocation.

Well, the suggestion is that it may be using dynamic memory allocation without intending to, because of the dynamic closures. Are you saying that is definitely not the case?

> Walter is almost done implementing static closures.

Excellent! So what strategy is being used? I hope it's static by default, dynamic on request, but your wording suggests otherwise.

--bb
Oct 24 2008
Bill Baxter wrote:
> On Sat, Oct 25, 2008 at 9:59 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> > std.random does not use dynamic memory allocation.
>
> Well the suggestion is that it may be using dynamic memory allocation without intending to because of the dynamic closures. Are you saying that is definitely not the case?

I don't think there's any delegate in use in std.random.

> > Walter is almost done implementing static closures.
>
> Excellent! So what strategy is being used? I hope it's static by default, dynamic on request, but your wording suggests otherwise.

I forgot.

Andrei
Oct 24 2008
Andrei Alexandrescu wrote:
> I don't think there's any delegate in use in std.random.

Lazy arguments are delegates, and enforce uses lazy arguments.
Oct 24 2008
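The point that lazy arguments are delegates can be illustrated with a small sketch. The helper names check and countEvaluations are hypothetical; check is only a stand-in with the same shape as an enforce-style function and is not the library implementation. The sketch shows that the argument expression is wrapped so it runs only when the callee reads it, which is why the call site needs a context pointer to the caller's variables.

```d
// Sketch only; `check` and `countEvaluations` are hypothetical names.

/// Stand-in for an enforce-style helper: `msg` is evaluated only on failure.
void check(bool condition, lazy string msg)
{
    if (!condition)
        throw new Exception(msg);   // reading `msg` runs the hidden delegate
}

/// Reports how many times the lazy argument expression was evaluated.
int countEvaluations(bool condition)
{
    int evaluations = 0;

    // Nested function touching a local of the enclosing frame, so the
    // hidden delegate built for the lazy argument must reference that frame.
    string describe()
    {
        ++evaluations;
        return "value out of range";
    }

    try
    {
        check(condition, describe());
    }
    catch (Exception e)
    {
        // Failure is expected when `condition` is false; ignore it here.
    }
    return evaluations;
}
```

With these definitions countEvaluations(true) is 0 and countEvaluations(false) is 1: when the check passes, the message expression is never run, yet a delegate referring to the caller's frame still has to be formed at the call.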
Jason House wrote:
> Andrei Alexandrescu wrote:
> > I don't think there's any delegate in use in std.random.
>
> Lazy arguments are delegates, and enforce uses lazy arguments

Yikes, I see.

Andrei
Oct 24 2008
Andrei Alexandrescu wrote:
> std.random does not use dynamic memory allocation.

This is exactly why so many have complained about the dynamic closure implementation. You did not intend std.random to use dynamic memory allocation, but it definitely does. A program with nothing but a loop that calls uniform will show it plain as day in the profiler. (I'm using callgrind.)

> Walter is almost done implementing static closures.

Ooh... Can you elaborate on that?
Oct 24 2008
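A reproduction of the kind described, a program with nothing but a loop that calls uniform, might look like the sketch below. The function name meanOfUniform and the choice of returning the mean are assumptions made so the loop has an observable result; whether a given compiler release allocates inside the call is exactly what a profiler run would have to show.

```d
import std.random : uniform;

// Sketch only; `meanOfUniform` is a hypothetical name.

/// Calls uniform in a tight loop and returns the mean of the samples.
double meanOfUniform(size_t samples)
{
    double sum = 0.0;
    foreach (_; 0 .. samples)
        sum += uniform(0.0, 1.0);   // half-open range [0.0, 1.0)
    return sum / samples;
}
```

Calling meanOfUniform with a large sample count from main and running the binary under a profiler such as callgrind would show whether time is being spent in the allocator.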
Jason House Wrote:
> Gregor Richards Wrote:
> > Java has a much better garbage collector than D, as it doesn't need to be conservative.
> >
> > - Gregor Richards
>
> The code is written to explicitly avoid memory allocation, especially in tight loops. Without this dynamic closure, the garbage collector would never run. This case is especially pathetic since the call to ensure will never trigger. This is part of a mini language shootout. The Java version I cloned runs 4x faster. This is only one piece of a much bigger problem.

I was wrong about the 4x thing. I have bad hardware. After fixing the accidental allocation and running both the D and Java versions on the same box, they're only 1% different.
Oct 28 2008
"Jason House" wrote
> I was wrong about the 4x thing. I have bad hardware. After fixing the accidental allocation and running both the D and Java versions on the same box, they're only 1% different.

When you say 'fixing the accidental allocation', do you mean removing the case where a dynamic closure was allocated? I just want to make sure that is clear.

-Steve
Oct 28 2008
Steven Schveighoffer Wrote:
> When you say 'fixing the accidental allocation' you mean removing the case where a dynamic closure was allocated? I just want to make sure that is clear.

Yes. You are right. The allocation of the dynamic closure was the only performance problem, and consumed 25% of my execution time. I called it accidental because Andrei was unaware that he had done it.
Oct 29 2008
Objective 1: Make the heap vs. stack variables explicit
Objective 2: Make it impossible to return or store a static (stack) delegate
Objective 3: Don't require decorators on lambda expressions.

Solution:
- Variables are on stack by default
- Use modifier "heap" to put a variable on the heap
- Delegates can be normal (storable) or "scope" (can't live beyond the scope of our function), and the type is inferred BASED ON WHAT VARIABLES YOU ACCESS.

EXAMPLE CODE
void foo(scope void delegate()) {...}
void bar(void delegate()) {...}

void main()
{
    int a;
    heap int b;

    foo({ a = 1; }); // legal.
    bar({ b = 2; }); // legal. bar could store dg, but b is on heap
    foo({ b = 3; }); // legal. ok to pass non-scope dg to
                     // scope argument
    bar({ a = 4; }); // SYNTAX ERROR
                     // delegate is scope b/c a is on stack, but
                     // argument to bar isn't scope.
}
END CODE

Thoughts?
Nov 03 2008
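The "heap" modifier in the proposal above is hypothetical syntax, but the callee-side half of the idea can be sketched with features that exist in present-day D2: a `scope` delegate parameter, plus the `@nogc` attribute to make the compiler reject hidden allocations. This is a sketch under the assumption of a reasonably recent compiler; callNow, stackOnly, keep, stored and heapBacked are hypothetical names.

```d
// Sketch only; all names below are hypothetical.

/// Callee promises not to keep `dg` beyond the call.
int callNow(scope int delegate() @nogc dg) @nogc
{
    return dg();
}

/// The delegate is passed as `scope`, so no closure has to be allocated
/// and the whole function can be marked @nogc.
int stackOnly() @nogc
{
    int a = 20;
    return callNow(() => a + 1);
}

/// Long-lived slot standing in for "a callee that stores the delegate".
int delegate() stored;

/// Callee without `scope`: it is allowed to keep the delegate.
void keep(int delegate() dg)
{
    stored = dg;
}

/// The delegate may escape through `keep`, so the frame holding `b` is
/// heap-allocated; this function could not be marked @nogc.
int heapBacked()
{
    int b = 41;
    keep(() => b + 1);
    return stored();
}
```

Here stackOnly() returns 21 and heapBacked() returns 42. Unlike the proposal, the decision is driven by the parameter's `scope` annotation rather than by a per-variable modifier, so the allocation in heapBacked stays implicit unless the caller opts into @nogc.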