www.digitalmars.com         C & C++   DMDScript  

D.gnu - (DMD) problem with closures/delegate literals and structs?

reply Johannes Pfau <nospam example.com> writes:
I found that ARM bug I said I found when testing dub on ARM.

Turns out I was pretty stupid not testing this on X86 first, cause it's
not ARM specific.... Debugging on a fast x86 machine would have been
much more fun. Anyway, GDC right now can't compile dub:

gdc -o bin/dub -fversion=DubUseCurl  -I source
-lcurl  build-files.txt
bin/dub fetch vibe-d
Fetching vibe-d 0.7.18...
[1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d

So I debugged this stuff and it's a stack corruption. Have a look at
this example:
http://dpaste.dzfl.pl/433c0a3d

Please note that the this reference in the delegate points to the
stack. Of course copying the struct doesn't magically change the
address, so it still refers the old data.

It looks like we actually generate a closure here which contains the
this pointer instead of directly using the struct as a context pointer.
That is probably an optimization bug in dmd, but it doesn't matter in
this case as the problem would exist for closures and normal delegates.

I'm wondering if this is according to the spec, but I guess it is. If
we have a struct and take the address of a member funtion, the context
pointer is a pointer to the struct (probably on stack) as well. So if
we're in a member function and we have a delegate literal which
accesses this it seems correct that we have a pointer to the struct
(probably on stack). It can however be confusing:

struct S
{
    int a;
    void delegate() getDelegate(int b)
    {
         //b is in a closure and safe to access
         //but a is still accessed via this->a where
         //this may point to the stack.
         return () { return a + b; };
    }
}

So I guess I'm asking: Is this correct D behavior?


BTW: The reason why we see this bug in gdc and not in dmd is simple:
Because of GDC bug 52 NRVO doesn't work in gdc as in dmd. In dmd the
address of the returned variable doesn't change. Dub returns HTTP
structs by value. As long as the address doesn't change there's no
issue, so DMDs NRVO hides this bug.

However, as I now know how to break it it's easy to also break it on
DMD ;-)
http://dpaste.dzfl.pl/c109c2fd

If the currect compiler behavior is correct - and I think it is - then
std.net.curl is a ticking time bomb and needs to be fixed ASAP.
Dec 07 2013
next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Dec 7, 2013 9:10 AM, "Johannes Pfau" <nospam example.com> wrote:
 I found that ARM bug I said I found when testing dub on ARM.

 Turns out I was pretty stupid not testing this on X86 first, cause it's
 not ARM specific.... Debugging on a fast x86 machine would have been
 much more fun. Anyway, GDC right now can't compile dub:

 gdc -o bin/dub -fversion=DubUseCurl  -I source
 -lcurl  build-files.txt
 bin/dub fetch vibe-d
 Fetching vibe-d 0.7.18...
 [1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d

 So I debugged this stuff and it's a stack corruption. Have a look at
 this example:
 http://dpaste.dzfl.pl/433c0a3d

 Please note that the this reference in the delegate points to the
 stack. Of course copying the struct doesn't magically change the
 address, so it still refers the old data.

 It looks like we actually generate a closure here which contains the
 this pointer instead of directly using the struct as a context pointer.
 That is probably an optimization bug in dmd, but it doesn't matter in
 this case as the problem would exist for closures and normal delegates.

 I'm wondering if this is according to the spec, but I guess it is. If
 we have a struct and take the address of a member funtion, the context
 pointer is a pointer to the struct (probably on stack) as well. So if
 we're in a member function and we have a delegate literal which
 accesses this it seems correct that we have a pointer to the struct
 (probably on stack). It can however be confusing:

 struct S
 {
     int a;
     void delegate() getDelegate(int b)
     {
          //b is in a closure and safe to access
          //but a is still accessed via this->a where
          //this may point to the stack.
          return () { return a + b; };
     }
 }

 So I guess I'm asking: Is this correct D behavior?


 BTW: The reason why we see this bug in gdc and not in dmd is simple:
 Because of GDC bug 52 NRVO doesn't work in gdc as in dmd. In dmd the
 address of the returned variable doesn't change. Dub returns HTTP
 structs by value. As long as the address doesn't change there's no
 issue, so DMDs NRVO hides this bug.

 However, as I now know how to break it it's easy to also break it on
 DMD ;-)
 http://dpaste.dzfl.pl/c109c2fd

 If the currect compiler behavior is correct - and I think it is - then
 std.net.curl is a ticking time bomb and needs to be fixed ASAP.
Ouch. I remember a similar conversation with David when he was writing std.parallelism (the bug he ran into was code that assumed left to right evaluation). I should hope that the author of std.net.curl didn't write the module relying on this behavior. Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 07 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Dec 7, 2013 2:45 PM, "Iain Buclaw" <ibuclaw gdcproject.org> wrote:
 Ouch. I remember a similar conversation with David when he was writing
std.parallelism (the bug he ran into was code that assumed left to right evaluation). I should hope that the author of std.net.curl didn't write the module relying on this behavior.

Didn't intentionally write. :)

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 07 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote:
[...]
 gdc -o bin/dub -fversion=DubUseCurl  -I source
 -lcurl  build-files.txt
 bin/dub fetch vibe-d
 Fetching vibe-d 0.7.18...
 [1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d
 
 So I debugged this stuff and it's a stack corruption. Have a look at
 this example:
 http://dpaste.dzfl.pl/433c0a3d
 
 Please note that the this reference in the delegate points to the
 stack. Of course copying the struct doesn't magically change the
 address, so it still refers the old data.
Yeah, this is a pretty nasty gotcha in the way D implements structs. Adam Ruppe & I ran into this problem in ConsoleD.terminal: his Terminal struct registers a bunch of dtors that are invoked when the struct goes out of scope, and initially, they were delegates that close over struct member variables. Bad idea. When the struct was returned to the caller, it was copied to a new location (perfectly valid -- structs are value types). This left the old copy of the struct on the stack, so the delegates still worked, but as soon as you reuse that portion of the stack for anything else, the dtor would segfault because the variables it closed over are now garbage. Moral of the story: don't close over struct members in delegates if you're returning the struct to the caller (or expect the delegate to get invoke while passing the struct to a callee -- since the callee gets a *copy* of the struct, not the original). Here be dragons. :P
 It looks like we actually generate a closure here which contains the
 this pointer instead of directly using the struct as a context
 pointer.  That is probably an optimization bug in dmd, but it doesn't
 matter in this case as the problem would exist for closures and normal
 delegates.
The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
 I'm wondering if this is according to the spec, but I guess it is. If
 we have a struct and take the address of a member funtion, the context
 pointer is a pointer to the struct (probably on stack) as well. So if
 we're in a member function and we have a delegate literal which
 accesses this it seems correct that we have a pointer to the struct
 (probably on stack). It can however be confusing:
 
 struct S
 {
     int a;
     void delegate() getDelegate(int b)
     {
          //b is in a closure and safe to access
          //but a is still accessed via this->a where
          //this may point to the stack.
          return () { return a + b; };
     }
 }
 
 So I guess I'm asking: Is this correct D behavior?
[...] I don't know if it's *correct*, but that's how it works currently. It's certainly a very nasty pitfall for those not in the know, though. The major issue here is that normally, if your delegate closes over a local variable, the compiler will automatically allocate the local variable on the heap rather than the stack. This prevents crashes caused by referencing out-of-scope local variables. But this is not done for struct members -- it can't be, since the only way is to make struct members pointers to the heap instead of embedded in the struct. Since structs are freely copied around, you can't guarantee that the closed-over member variable is the correct copy -- or even still exists -- by the time the delegate is called. I'm almost tempted to say that it should be illegal for a delegate to close over struct members, but that seems unnecessarily restrictive (there are cases where you might want to do this, and can do it in a safe way). I almost feel like the compiler should complain about escaping references to local variables in this case, though. T -- In a world without fences, who needs Windows and Gates? -- Christian Surchi
Dec 07 2013
parent reply Johannes Pfau <nospam example.com> writes:
Am Sat, 7 Dec 2013 08:43:53 -0800
schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote:
 [...]
 It looks like we actually generate a closure here which contains the
 this pointer instead of directly using the struct as a context
 pointer.  That is probably an optimization bug in dmd, but it
 doesn't matter in this case as the problem would exist for closures
 and normal delegates.
The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
True, but that's not what I meant ;-) I always get the terminology related to closures wrong so sorry if that didn't make sense. What I meant is this: In the first example I posted, http://dpaste.dzfl.pl/433c0a3d the delegate does not access _function variables_. It only accesses the this pointer. So there's no need for it to be a real closure and allocate memory, it could instead be a normal delegate with the context pointer simply set to the this pointer at the moment the delegate is created.
Dec 07 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
 Am Sat, 7 Dec 2013 08:43:53 -0800
 schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:
 
 On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote:
 [...]
 It looks like we actually generate a closure here which contains
 the this pointer instead of directly using the struct as a context
 pointer.  That is probably an optimization bug in dmd, but it
 doesn't matter in this case as the problem would exist for
 closures and normal delegates.
The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
True, but that's not what I meant ;-) I always get the terminology related to closures wrong so sorry if that didn't make sense. What I meant is this: In the first example I posted, http://dpaste.dzfl.pl/433c0a3d the delegate does not access _function variables_. It only accesses the this pointer. So there's no need for it to be a real closure and allocate memory, it could instead be a normal delegate with the context pointer simply set to the this pointer at the moment the delegate is created.
The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example: struct S { int x = 123; // canary void delegate() getDelegate() { // Closes over `this` return () => x; } } void delegate() dg; S makeStruct() { S s; dg = s.getDelegate(); return s; } void main() { auto s = makeStruct(); dg(); // <-- NG } T -- Indifference will certainly be the downfall of mankind, but who cares? -- Miquel van Smoorenburg
Dec 07 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 7 December 2013 20:30, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:
 On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
 Am Sat, 7 Dec 2013 08:43:53 -0800
 schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote:
 [...]
 It looks like we actually generate a closure here which contains
 the this pointer instead of directly using the struct as a context
 pointer.  That is probably an optimization bug in dmd, but it
 doesn't matter in this case as the problem would exist for
 closures and normal delegates.
The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
True, but that's not what I meant ;-) I always get the terminology related to closures wrong so sorry if that didn't make sense. What I meant is this: In the first example I posted, http://dpaste.dzfl.pl/433c0a3d the delegate does not access _function variables_. It only accesses the this pointer. So there's no need for it to be a real closure and allocate memory, it could instead be a normal delegate with the context pointer simply set to the this pointer at the moment the delegate is created.
The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example: struct S { int x = 123; // canary void delegate() getDelegate() { // Closes over `this` return () => x; } } void delegate() dg; S makeStruct() { S s; dg = s.getDelegate(); return s; } void main() { auto s = makeStruct(); dg(); // <-- NG }
I feel inclined to make that always make that sort of usage an error. :)
Dec 07 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 12:17:02AM +0000, Iain Buclaw wrote:
 On 7 December 2013 20:30, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:
 On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
[...]
 What I meant is this: In the first example I posted,
 http://dpaste.dzfl.pl/433c0a3d
 the delegate does not access _function variables_. It only accesses
 the this pointer. So there's no need for it to be a real closure
 and allocate memory, it could instead be a normal delegate with the
 context pointer simply set to the this pointer at the moment the
 delegate is created.
The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example: struct S { int x = 123; // canary void delegate() getDelegate() { // Closes over `this` return () => x; } } void delegate() dg; S makeStruct() { S s; dg = s.getDelegate(); return s; } void main() { auto s = makeStruct(); dg(); // <-- NG }
I feel inclined to make that always make that sort of usage an error. :)
I agree. But the problem is that currently the compiler will happily accept this and emit code for it, which will screw up at runtime. So the compiler should be fixed to either reject this outright, or somehow generate code that does the Right Thing. T -- First Rule of History: History doesn't repeat itself -- historians merely repeat each other.
Dec 07 2013