digitalmars.D.bugs - [Issue 9438] New: Strange RefCounted stack overflow
http://d.puremagic.com/issues/show_bug.cgi?id=9438

           Summary: Strange RefCounted stack overflow
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: monarchdodra gmail.com

I've been chasing this on and off for a couple of months now.

Basically, trying to access the RefCounted.refCountedStore.isInitialized of a non-initialized RefCounted in a field of a temporary will create a stack overflow.

I know that's not clear, but here is the reduced use case:

//----
import std.container, std.stdio, std.typecons, std.exception;

struct S
{
    RefCounted!int _data;
    this(int) {_data.refCountedStore.ensureInitialized();}
    int get() @property
    {
        writeln("here");
        enforce(_data.refCountedStore.isInitialized); //OH NOES!!!
        writeln("there");
        return _data.refCountedPayload;
    }
}

void main()
{
    // 1)
    writeln(S(1).get);

    // 2)
    S s;
    writeln(s.get).collectException();

    // 3)
    writeln(S().get);
}
//----

1) This will create a temporary S and initialize the ref counted. The writeln works.
2) This creates a non-temporary S. Trying to access the ref counted will (correctly) throw an exception.
3) This will stack overflow at the "//OH NOES!!!" line. It will first call:
    ref inout(RefCountedStore) refCountedStore() inout
to get the store, and then will recursively call "isInitialized" until the program stack overflows. I have no idea why:

//----
@property nothrow @safe
bool isInitialized() const
{
    return _store !is null;
}
//----

This seems to me like the tip of a more serious bug somewhere. I would be very pleased if someone with more knowledge than me could look into it.

I think it might also create problems with things such as arrays of arrays: every time I've tried to fix http://d.puremagic.com/issues/show_bug.cgi?id=6153 I've had crashes (NOT asserts/enforces), and I think this might be the reason.

Originally found with this code:

//----
void main()
{
    writeln(Array!int()[0]);
}
//----

Yes, the code is wrong, but it should *assert*. Currently, it just dies.

--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 01 2013
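An illustrative sketch (not from the thread) of why cases 1) and 3) in the report differ before any bug comes into play: `S(1)` runs the `this(int)` constructor, while `S()` is just `S.init`, so its RefCounted field is never initialized. The helper `ready` is a hypothetical name added for the example, and no exception is thrown here, so this sketch does not exercise the faulty path itself.

```d
import std.typecons : RefCounted;

struct S
{
    RefCounted!int _data;

    // Same constructor as in the report: forces the store to be allocated.
    this(int) { _data.refCountedStore.ensureInitialized(); }

    // Hypothetical helper: reports whether the store has been allocated.
    bool ready() { return _data.refCountedStore.isInitialized; }
}

void main()
{
    assert(S(1).ready);   // constructor ran, store is allocated

    S s;
    assert(!s.ready);     // default-initialized, no store yet

    assert(!S().ready);   // S() is S.init as well: no constructor is called
}
```

So in case 3) the enforce is expected to fail with an ordinary exception; the stack overflow or crash only shows up while that exception is being unwound past the temporary.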
Maxim Fomin <maxim maxim-fomin.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maxim maxim-fomin.ru

---
The situation is more complicated.

    import std.container, std.stdio, std.typecons, std.exception;

    struct S
    {
        RefCounted!int _data;
        this(int) {_data.refCountedStore.ensureInitialized();}
        int get() @property
        {
            writeln("here");
            enforce(_data.refCountedStore.isInitialized); //OH NOES!!! //13
            writeln("there");
            return _data.refCountedPayload;
        }
    }

    void main()
    {
        version (A)
        {
            writeln(S(1).get);
        }
        version (B)
        {
            S s;
            writeln(s.get).collectException();
        }
        version (C)
        {
            writeln(S().get);
        }
    }

When compiling with only version A or only version B, everything is fine. Version C alone fails the enforcement on line 13. With A and C together, the failed enforcement is thrown. With B and C together, it segfaults in the RefCounted dtor.

Tested on linux64, git head.
Feb 01 2013
---
Oh, it depends on compiler switches. Reduced from the previous example by combining B and C:

    import std.container, std.stdio, std.typecons, std.exception;

    struct S
    {
        RefCounted!int _data;
        int get() @property
        {
            writeln("here");
            enforce(_data.refCountedStore.isInitialized); //OH NOES!!!
            writeln("there");
            return _data.refCountedPayload;
        }
    }

    void main()
    {
        S s;
        writeln(s.get).collectException();
        writeln(S().get);
    }

When compiled with -g, it prints "here", "here" and then the enforcement failure; without -g it segfaults as above. Since valgrind does not detect any memory errors in the -g version, the issue may be a dmd codegen bug.
Feb 01 2013
monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
           Severity|major                       |critical

I wanted to try to get RefCounted out of the way, and I was able to get a massively reduced test case. That said, the result is mind bogglingly wtf...

FYI: win32 on win7.64

//----
import std.stdio, std.exception;

//S is merely a struct with a pointer. And a destructor.
struct S
{
    int* _payload = null;
    ~this()
    {
        writeln("~this: ", _payload);
        if (!_payload) return;
        writeln("Should not be here...");
    }
}

void foo(int* p)
{
    throw new Exception("bla");
}

void main()
{
    int* p = S()._payload;
    writeln("wait for it...");
    foo(S()._payload);
}
//----
~this: null
wait for it...
~this: FFFF05E8
Should not be here...
object.Exception main.d(17): bla
//----

Apparently, we create a temporary, an exception is thrown, the temporary gets corrupted, and then things start going awry in the destructor.

I'm getting this as far back as 2.55 (didn't try anything earlier), so it doesn't seem to be a regression.

In any case, exceptions silently corrupting stack temporaries? That's a critical in my book.

Who was it again that said our destructors weren't very well tested?
Feb 01 2013
> I wanted to try to get RefCounted out of the way, and I was able to get a massively reduced test case. That said, the result is mind bogglingly wtf...

Just to add, I just tried compiling with different flags, including -O -release -debug -g: the scenario occurs in all cases, except when the flag "-g" is set, in which case things work correctly.
Feb 01 2013
---
> I wanted to try to get RefCounted out of the way, and I was able to get a massively reduced test case. That said, the result is mind bogglingly wtf...
>
> FYI: win32 on win7.64
> //----

Cannot reproduce on linux 64 githead.
Feb 01 2013
---
> I wanted to try to get RefCounted out of the way, and I was able to get a massively reduced test case. That said, the result is mind bogglingly wtf...
>
> FYI: win32 on win7.64

Can reproduce. And -g fixes the program.
Feb 01 2013
---
Reduced for linux:

    struct RefCounted
    {
        void *p;
        ~this()
        {
            p = null;
        }
    }

    struct S
    {
        RefCounted _data;
        int get() @property
        {
            throw new Exception("");
        }
    }

    void main()
    {
        S s;
        S().get;
    }

With -g it throws; without -g it segfaults.
Feb 01 2013
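The reduced case above could be turned into a self-checking test along the following lines. This is only a sketch under stated assumptions: `Guard`, `corrupted`, `fail` and `temporarySurvivesThrow` are hypothetical names, and because the try/catch sits in the same function as the temporary, the generated handler table may differ from the one in the report, so it may not hit exactly the same codegen path. On an affected compiler a corrupted `this` could also crash outright instead of being counted.

```d
struct Guard
{
    void* p;                 // never assigned, so it must still be null in the destructor
    static int corrupted;    // counts destructor calls that saw a non-null p

    ~this()
    {
        if (p !is null)
            ++corrupted;     // `this` pointed somewhere it should not
    }
}

struct S
{
    Guard _data;

    void fail()
    {
        throw new Exception("expected");
    }
}

// Returns true when the exception propagated and no destructor saw garbage.
bool temporarySurvivesThrow()
{
    S s;                     // named stack struct next to the temporary, as in the report
    bool caught = false;
    try
    {
        S().fail();          // the temporary's field destructor runs during unwinding
    }
    catch (Exception e)
    {
        caught = true;
    }
    return caught && Guard.corrupted == 0;
}

void main()
{
    assert(temporarySurvivesThrow());
}
```

Building it both with and without -g mirrors the comparison made in the thread.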
---
It really seems to be a codegen bug. The problem is that code like that in the main function (a struct temporary plus a simple stack struct) makes dmd generate a wrong exception handler table.

If you compile main.d once with -release -O -noboundscheck and once with the same switches plus -g, you get absolutely identical asm (obj2asm output), except for a single difference in the data segment.

In the segfaulting version you have

        .data   segment
    _HandlerTable0:
        db      050h,000h,000h,000h,063h,000h,000h,000h ;P...c...
        db      002h,000h,000h,000h,000h,000h,000h,000h ;........
        db      019h,000h,000h,000h,048h,000h,000h,000h ;....H...
        db      0ffffffffh,0ffffffffh,0ffffffffh,0ffffffffh,000h,000h,000h,000h ;........
        db      057h,000h,000h,000h,000h,000h,000h,000h ;W.......
        db      02bh,000h,000h,000h,037h,000h,000h,000h ;+...7...
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      042h,000h,000h,000h,000h,000h,000h,000h ;B....... // 42h

and in the throwing version you have

    _HandlerTable0:
        db      050h,000h,000h,000h,063h,000h,000h,000h ;P...c...
        db      002h,000h,000h,000h,000h,000h,000h,000h ;........
        db      019h,000h,000h,000h,048h,000h,000h,000h ;....H...
        db      0ffffffffh,0ffffffffh,0ffffffffh,0ffffffffh,000h,000h,000h,000h ;........
        db      057h,000h,000h,000h,000h,000h,000h,000h ;W.......
        db      02bh,000h,000h,000h,037h,000h,000h,000h ;+...7...
        db      000h,000h,000h,000h,000h,000h,000h,000h ;........
        db      03eh,000h,000h,000h,000h,000h,000h,000h ;>....... //3eh

If you patch the incorrect binary, the bug goes away.

The corrupted handler table leads to the following problem (asm snippet from main):

    0x0000000000418888 <+60>: jmp <_Dmain+72>
    0x000000000041888a <+62>: lea -0x10(%rbp),%rdi                          //3Eh
    0x000000000041888e <+66>: callq <_D4main1S11__fieldDtorMFZv>            //42h
    0x0000000000418893 <+71>: retq
    0x0000000000418894 <+72>: sub $0x8,%rsp
    0x0000000000418898 <+76>: callq 0x4188a3 <_Dmain+87>
    0x000000000041889d <+81>: add $0x8,%rsp
    0x00000000004188a1 <+85>: jmp 0x4188ad <_Dmain+97>
    0x00000000004188a3 <+87>: lea -0x18(%rbp),%rdi
    0x00000000004188a7 <+91>: callq 0x418810 <_D4main1S11__fieldDtorMFZv>
    0x00000000004188ac <+96>: retq
    0x00000000004188ad <+97>: xor %eax,%eax
    0x00000000004188af <+99>: pop %r15

In the segfaulting version druntime unwinds to _Dmain+66, that is, past the instruction that loads the "this" reference into %rdi, so the dtor receives a corrupted pointer. In the correct version druntime unwinds to _Dmain+62, so the "this" pointer is correct.
Feb 01 2013
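As a quick arithmetic check of the analysis above, the two differing table bytes can be lined up with the disassembly offsets. The sketch assumes, as the //3Eh and //42h annotations suggest, that the differing byte is an offset into _Dmain; the constant names are made up for the example.

```d
// Offsets taken from the last row of the two handler tables quoted above.
enum uint correctOffset = 0x3e;   // table from the build with -g
enum uint brokenOffset  = 0x42;   // table from the build without -g

// Hex to decimal: these match the <+62> and <+66> labels in the listing.
static assert(correctOffset == 62);   // lea -0x10(%rbp),%rdi  (loads `this`)
static assert(brokenOffset  == 66);   // callq ...__fieldDtorMFZv

// The broken entry lands 4 bytes late, which is exactly the distance
// between the lea at +62 and the call at +66.
enum uint skippedBytes = brokenOffset - correctOffset;
static assert(skippedBytes == 4);

void main() {}
```

Resuming at +66 therefore skips the one instruction that sets up %rdi, which fits the observation that the destructor sees a garbage `this`.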
> It really seems to be codegen bug.

Thank you for investigating this. As I said, it really is out of my league. Do you know what the next step is for fixing this? Who should we forward it to?

We should really get this fixed. Stack corruption when an exception is thrown? Nothing good can come out of this.
Feb 03 2013
---
> > It really seems to be codegen bug.
>
> Thank you for investigating this. As I said, it really is out of my league. Do you know what the next step is for fixing this? Who should we forward it to?

This does not work. Until somebody who knows the dmd source runs into the issue, the bug will not be fixed, simply because there are too many problems and too few people.

> We should really get this fixed. Stack corruption when an exception is thrown? Nothing good can come out of this.
Feb 04 2013
Maxim Fomin <maxim maxim-fomin.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull

---
https://github.com/D-Programming-Language/dmd/pull/1645
Feb 09 2013
Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/7d42ebfc235a0621cb88b6d3438218c376f8908f
Fix issue 9438 - Strange RefCounted stack overflow

https://github.com/D-Programming-Language/dmd/commit/b9478394fb9e1599dce0f5727ddfd4acdc858163
Fix issue 9438 - Strange RefCounted stack overflow
Feb 10 2013
Commit pushed to staging at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/204abc2462ca4fa90b51eaea56a5d0604d9d9438
Fix issue 9438 - Strange RefCounted stack overflow
Feb 10 2013
Commit pushed to dmd-1.x at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/258d96826ca250b553ce5a3f54ead42273bcf821
merge fix issue 9438
Feb 10 2013
Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
            Version|D2                          |D1 & D2
         Resolution|                            |FIXED
Feb 10 2013