digitalmars.D - Is this a bug?
- Jerry (24/24) Nov 26 2013 If I read correctly, the following is legal code. If you comment out
- bearophile (4/6) Nov 26 2013 I see no crash on Windows 32 bit.
- growler (3/9) Nov 26 2013 No crash on Fedora 19 x86_64
- H. S. Teoh (6/34) Nov 26 2013 [...]
- Jerry (3/34) Nov 26 2013 If I build dmd from the sources, the program works. It's only the
- Jerry (3/39) Nov 26 2013 Sorry, not true. I had the extra case commented out. Uncommenting it
- H. S. Teoh (33/76) Nov 27 2013 Strange. I tested again on both git HEAD and 2.064.2, and I still don't
- evilrat (2/3) Nov 26 2013 windows 64 bit no crash.
- Philippe Sigaud (2/2) Nov 26 2013 No crash on Linux (Kubuntu) 32bits, DMD 2.064.2.
- Jerry (6/8) Nov 26 2013 This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had
- deadalnix (3/14) Nov 27 2013 Yes, I faced quite a lot of trouble with gold and dmd. They
- Jerry (4/19) Nov 27 2013 Any suggestions for locating this bug? GDB is useless. It can't find
- H. S. Teoh (9/31) Nov 27 2013 [...]
- H. S. Teoh (144/155) Nov 27 2013 [Taking this back to the forum, if you don't mind, since it seems to be
- Kenji Hara (4/13) Nov 28 2013 Might be related?
- H. S. Teoh (6/8) Dec 02 2013 [...]
- Jerry (4/9) Dec 02 2013 I concur. Compiling with head gives me a working binary for this test.
- Orfeo (1/1) Nov 27 2013 No crash on ArchLinux 3.12.0-1-ARCH x86_64 GNU/Linux
- David Eagen (2/2) Nov 27 2013 No crash on Ubuntu 13.10 x86_64 using the dmd 2.064.2 package
If I read correctly, the following is legal code. If you comment out one of the case statements, it does the expected thing. With 4 or more, it crashes.

This is with dmd 2.064.2 on Debian. If it's a bug, I'll file a report. Thanks!

class BB {}
class DD : CC {}
class CC : BB {
    static CC create(string s) {
        // Succeeds with 3 cases, fails with 4
        switch (s) {
        case "en":
        case "it":
        case "ru":
        case "ko":
            return new DD;
        default:
            throw new Exception("blech");
        }
    }
}

void main() {
    CC.create("en");
}

jlquinn wyvern:~/d$ ~/dmd2/linux/bin64/dmd switchbug.d -g
jlquinn wyvern:~/d$ ./switchbug
Segmentation fault (core dumped)
Nov 26 2013
Jerry:
> This is with dmd 2.064.2 on Debian. If it's a bug, I'll file a report. Thanks!

I see no crash on Windows 32 bit.

Bye,
bearophile
Nov 26 2013
On Wednesday, 27 November 2013 at 00:55:21 UTC, bearophile wrote:
> Jerry:
>> This is with dmd 2.064.2 on Debian. If it's a bug, I'll file a report. Thanks!
>
> I see no crash on Windows 32 bit.
>
> Bye,
> bearophile

No crash on Fedora 19 x86_64

Cheers
G.
Nov 26 2013
On Tue, Nov 26, 2013 at 06:33:28PM -0500, Jerry wrote:
> If I read correctly, the following is legal code. If you comment out one of the case statements, it does the expected thing. With 4 or more, it crashes.
>
> This is with dmd 2.064.2 on Debian. If it's a bug, I'll file a report. Thanks!
[...]

No crash on dmd git HEAD, Debian/unstable (x86_64).

T

--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next. -- (Stolen from the net)
Nov 26 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
> On Tue, Nov 26, 2013 at 06:33:28PM -0500, Jerry wrote:
> [...]
> No crash on dmd git HEAD, Debian/unstable (x86_64).

If I build dmd from the sources, the program works. It's only the precompiled dmd executable that generates a broken binary.
Nov 26 2013
Jerry <jlquinn optonline.net> writes:
> "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
>> On Tue, Nov 26, 2013 at 06:33:28PM -0500, Jerry wrote:
>> [...]
>> No crash on dmd git HEAD, Debian/unstable (x86_64).
>
> If I build dmd from the sources, the program works. It's only the precompiled dmd executable that generates a broken binary.

Sorry, not true. I had the extra case commented out. Uncommenting it still gives me a crashing program.
Nov 26 2013
On Wed, Nov 27, 2013 at 02:06:33AM -0500, Jerry wrote:
> Jerry <jlquinn optonline.net> writes:
>> "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
>>> On Tue, Nov 26, 2013 at 06:33:28PM -0500, Jerry wrote:
>>> [...]
>>> No crash on dmd git HEAD, Debian/unstable (x86_64).
>>
>> If I build dmd from the sources, the program works. It's only the precompiled dmd executable that generates a broken binary.
>
> Sorry, not true. I had the extra case commented out. Uncommenting it still gives me a crashing program.

Strange. I tested again on both git HEAD and 2.064.2, and I still don't get any crashes. This is on Debian/unstable (x86_64). I added a few more calls to CC.create, just to make sure the switch branches all work as expected; here is my test code:

------
class BB {}
class DD : CC {}
class CC : BB {
    static CC create(string s) {
        // Succeeds with 3 cases, fails with 4
        switch (s) {
        case "en":
        case "it":
        case "ru":
        case "ko":
            return new DD;
        default:
            throw new Exception("blech");
        }
    }
}

void main() {
    CC.create("en");
    CC.create("it");
    CC.create("ru");
    CC.create("ko");
}
------

Are you sure your copy of dmd isn't picking up the wrong version(s) of the runtime libraries? Maybe try compiling with dmd -v to see if the include paths and library paths are what you expect?

T

--
GEEK = Gatherer of Extremely Enlightening Knowledge
Nov 27 2013
On Tuesday, 26 November 2013 at 23:33:30 UTC, Jerry wrote:
> This is with dmd 2.064.2 on Debian.

Windows 64 bit: no crash.
Nov 26 2013
No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.
Nov 26 2013
Philippe Sigaud <philippe.sigaud gmail.com> writes:
> No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.

This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had was due to the use of gold vs ld. It seems Debian and Ubuntu are the only ones using gold at the moment. I'm wondering if this problem is the same kind of thing.

Jerry
Nov 26 2013
On Wednesday, 27 November 2013 at 07:09:17 UTC, Jerry wrote:
> Philippe Sigaud <philippe.sigaud gmail.com> writes:
>> No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.
>
> This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had was due to the use of gold vs ld. It seems Debian and Ubuntu are the only ones using gold at the moment. I'm wondering if this problem is the same kind of thing.
>
> Jerry

Yes, I faced quite a lot of trouble with gold and dmd. They simply do not mix.
Nov 27 2013
"deadalnix" <deadalnix gmail.com> writes:
> On Wednesday, 27 November 2013 at 07:09:17 UTC, Jerry wrote:
>> Philippe Sigaud <philippe.sigaud gmail.com> writes:
>>> No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.
>>
>> This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had was due to the use of gold vs ld. It seems Debian and Ubuntu are the only ones using gold at the moment. I'm wondering if this problem is the same kind of thing.
>>
>> Jerry
>
> Yes, I faced quite a lot of trouble with gold and dmd. They simply do not mix.

Any suggestions for locating this bug? GDB is useless. It can't find the frame base.

Jerry
Nov 27 2013
On Wed, Nov 27, 2013 at 01:42:04PM -0500, Jerry wrote:
> "deadalnix" <deadalnix gmail.com> writes:
>> On Wednesday, 27 November 2013 at 07:09:17 UTC, Jerry wrote:
>>> Philippe Sigaud <philippe.sigaud gmail.com> writes:
>>>> No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.
>>>
>>> This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had was due to the use of gold vs ld. It seems Debian and Ubuntu are the only ones using gold at the moment. I'm wondering if this problem is the same kind of thing.
>>
>> Yes, I faced quite a lot of trouble with gold and dmd. They simply do not mix.
>
> Any suggestions for locating this bug? GDB is useless. It can't find the frame base.
[...]

I'm interested in the disassembly of the faulty executable. Maybe you could run `objdump -D $program_name` and send me the output? I'm curious to see what got messed up. (With your reduced test case, that is. The disassembly of your actual program would be too unwieldy.)

T

--
If I were two-faced, would I be wearing this one? -- Abraham Lincoln
Nov 27 2013
[Taking this back to the forum, if you don't mind, since it seems to be relevant.]

On Wed, Nov 27, 2013 at 03:36:17PM -0500, Jerry wrote:
> "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
[...]
>> I'm interested in the disassembly of the faulty executable. Maybe you could run `objdump -D $program_name` and send me the output? I'm curious to see what got messed up. (With your reduced test case, that is. The disassembly of your actual program would be too unwieldy.)
>
> Here's the disassembly. I'm sending out of gnus in emacs. If it doesn't come through, let me know and I'll resend from my regular mail prog.

It came through, thanks.

I compared the disassembly with my working version of the same code, and found that the problem appears to be a wrong jump table in the switch statement. Here's the relevant snippet from near the end of CC.create():

[Jerry's version]:
  417590: e8 1f 2d 00 00          callq  41a2b4 <_d_switch_string>
  417595: 48 89 c6                mov    %rax,%rsi
  417598: 83 fe 03                cmp    $0x3,%esi
  41759b: 77 18                   ja     4175b5 <_D9switchbug2CC6createFAyaZC9switchbug2CC+0x59>
  41759d: ff 24 f5 20 01 43 00    jmpq   *0x430120(,%rsi,8)
  4175a4: 48 bf a0 72 43 00 00    movabs $0x4372a0,%rdi
  4175ab: 00 00 00
  4175ae: e8 55 1b 00 00          callq  419108 <_d_newclass>
  4175b3: c9                      leaveq
  4175b4: c3                      retq

In simple terms, what this code does is:

417590: calls _d_switch_string, a function in druntime that does the string comparisons in a switch over string values, and returns a uint index of the matching switch case number (uint.max if not found). In this case, you have 4 switch cases, so they map to indices 0, 1, 2, 3.

41759b: checks the return value of _d_switch_string, and if it's > 3, then branch to <CC.create + 0x59>, which is where the default case is implemented (not quoted above, but if you look in your disassembly you'll see that it creates an Exception then calls the stack unwinding routine). Since we aren't hitting the default case in our test case, the control would pass to the next instruction.
41759d: here's the interesting part. This looks up a jump table at 0x430120 using the index returned by _d_switch_string, and branches to that address. Looking up this address in the disassembly dump, I find this:

0000000000430120 <_D9switchbug2BB6__initZ>:
  430120: 40 01 43 00             rex add %eax,0x0(%rbx)
  ...
  430130: 73 77                   jae    4301a9 <_D9switchbug2DD6__vtblZ+0x19>

Now this looks odd. _D9switchbug2BB6__initZ looks like the static initializer data for class BB; it is certainly NOT a switch statement jump table!! The first 4 bytes, in fact, correspond to the address 430140, which contains this:

0000000000430140 <_D9switchbug2BB6__vtblZ>:
  430140: 00 72 43                add    %dh,0x43(%rdx)
  430143: 00 00                   add    %al,(%rax)
  430145: 00 00                   add    %al,(%rax)
  430147: 00 90 76 41 00 00       add    %dl,0x4176(%rax)
  43014d: 00 00                   add    %al,(%rax)
[... snipped ...]

This is the virtual function table of class BB, so it's *definitely* not a valid jump destination of a switch statement! So this looks like where the problem came from. If the CPU ends up here, it would try to interpret function pointers as instructions, and would basically do random nonsensical things until it hits something that can't be interpreted as an instruction, or tries to access a random address that's outside the process address space, upon which the OS kicks in and sends a SIGSEGV to terminate the program.

...
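To make the mechanism concrete, the index lookup, bounds check, and jump-table dispatch described above can be modelled in D itself. This is only an illustrative sketch under stated assumptions: `switchIndex`, `caseBody`, and `dispatch` are hypothetical names, the linear search merely stands in for whatever `_d_switch_string` does inside druntime, and function pointers stand in for the code addresses stored in the jump table.

```d
import std.stdio : writeln;

// Hypothetical stand-in for druntime's _d_switch_string: returns the index
// of the matching case label, or size_t.max when nothing matches.
// (Linear search for clarity; not druntime's actual algorithm.)
size_t switchIndex(const string[] labels, string s)
{
    foreach (i, label; labels)
        if (label == s)
            return i;
    return size_t.max;
}

// Stand-in for the code that the jump table entries point at.
string caseBody()
{
    return "new DD";
}

string dispatch(string s)
{
    string[4] labels = ["en", "it", "ru", "ko"];

    // All four case labels share one body, so every slot of the table
    // holds the same target.
    string function()[4] jumpTable = [&caseBody, &caseBody, &caseBody, &caseBody];

    immutable idx = switchIndex(labels[], s);
    if (idx > 3)                 // mirrors: cmp $0x3,%esi ; ja <default case>
        return "default";        // the real code throws an Exception here
    return jumpTable[idx]();     // mirrors: jmpq *table(,%rsi,8)
}

void main()
{
    writeln(dispatch("en"));
    writeln(dispatch("xx"));
}
```

In this model the table is always the right one. The crash in the faulty binary corresponds to the last line of `dispatch` reading its targets from unrelated data instead of from `jumpTable`.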
Now, for comparison, here is the corresponding disassembly from my working version of your code:

[Teoh's version, in CC.create]:
  417084: e8 a7 27 00 00          callq  419830 <_d_switch_string>
  417089: 48 89 c6                mov    %rax,%rsi
  41708c: 83 fe 03                cmp    $0x3,%esi
  41708f: 77 18                   ja     4170a9 <_D4test2CC6createFAyaZC4test2CC+0x59>
  417091: ff 24 f5 60 e9 42 00    jmpq   *0x42e960(,%rsi,8)
  417098: 48 bf 40 45 63 00 00    movabs $0x634540,%rdi
  41709f: 00 00 00
  4170a2: e8 dd 15 00 00          callq  418684 <_d_newclass>
  4170a7: c9                      leaveq
  4170a8: c3                      retq

Other than the different addresses, which are to be expected (different compile environments, etc.), this code is basically the same as in your version. The only difference lies in the contents of the jump table, which in my version is at 42e960 (as can be seen from the instruction at 417091 above), which contains:

  42e95f: 00 98 70 41 00 00       add    %bl,0x4170(%rax)
  42e965: 00 00                   add    %al,(%rax)
  42e967: 00 98 70 41 00 00       add    %bl,0x4170(%rax)
  42e96d: 00 00                   add    %al,(%rax)
  42e96f: 00 98 70 41 00 00       add    %bl,0x4170(%rax)
  42e975: 00 00                   add    %al,(%rax)
  42e977: 00 98 70 41 00 00       add    %bl,0x4170(%rax)
  42e97d: 00 00                   add    %al,(%rax)
  ...

(Note that since this part of the code isn't instructions, the disassembler got a bit confused trying to interpret them as instructions, so the addresses are 1 byte off. So the jump table actually starts at the bytes 98 70 41 00 ... .)

Now, *this* looks like a proper jump table. In fact, the aforementioned bytes represent the address 417098, which, if you look at the disassembly snippet above, is the very next instruction after the jump table lookup. This makes sense, since the next thing it does is to call _d_newclass to create an instance of DD, after which it simply returns (leaving the address of the new instance of DD in %rax, which is the register containing the return value as per x86 calling conventions). This corresponds with what the source code says. So here, everything is correct and the program works as expected.

...
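The step from the raw bytes 98 70 41 00 ... to the address 417098 relies on x86-64 storing pointers in little-endian order (least significant byte first). A minimal sketch of that decoding, where `decodeLittleEndian` is a hypothetical helper written for this illustration:

```d
import std.stdio : writefln;

// Decode up to 8 bytes stored least-significant-byte first into a
// 64-bit value, the way an x86-64 CPU reads a pointer from memory.
ulong decodeLittleEndian(const ubyte[] bytes)
{
    assert(bytes.length <= 8);
    ulong value = 0;
    foreach (i, b; bytes)
        value |= cast(ulong) b << (8 * i);
    return value;
}

void main()
{
    // First jump table entry of the working binary, starting at 42e960.
    immutable ubyte[8] working = [0x98, 0x70, 0x41, 0x00, 0x00, 0x00, 0x00, 0x00];
    writefln("%x", decodeLittleEndian(working[]));

    // The byte sequence a4 75 41 from the faulty binary, zero-extended.
    immutable ubyte[8] faulty = [0xa4, 0x75, 0x41, 0x00, 0x00, 0x00, 0x00, 0x00];
    writefln("%x", decodeLittleEndian(faulty[]));
}
```

The two values printed are the addresses of the instruction that follows the `jmpq` in each binary, which is where a correct table entry for these four cases should point.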
Now, all this begs the question of why dmd produced the wrong code in your environment, but produces the *right* code in mine. Since we're both using the same dmd source code (I believe!), it seems that the most likely culprit must be the linker -- especially since you mentioned something about gold vs. ld. My conjecture is that somehow the linker got mixed up, and wrote the wrong address for the switch's jump table into your executable. (These addresses are generally not fixed until link time, because the compiler doesn't know in advance exactly where each symbol will end up in the final executable.)

Further confirmation for this can be found by searching for the byte sequence a4 75 41 in your disassembly dump (this is the address 4175a4, the next instruction after the jump table lookup, which is where the jump destination *should* have been in the first place). This sequence appears here in your version of the executable:

00000000004301f0 <_TMP3>:
[... snipped ...]
  4301ff: 00 a4 75 41 00 00 00    add    %ah,0x41(%rbp,%rsi,2)
  430206: 00 00                   add    %al,(%rax)
  430208: a4                      movsb  %ds:(%rsi),%es:(%rdi)
  430209: 75 41                   jne    43024c <_D9switchbug2CC6__vtblZ+0xc>
  43020b: 00 00                   add    %al,(%rax)
  43020d: 00 00                   add    %al,(%rax)
  43020f: 00 a4 75 41 00 00 00    add    %ah,0x41(%rbp,%rsi,2)
  430216: 00 00                   add    %al,(%rax)
  430218: a4                      movsb  %ds:(%rsi),%es:(%rdi)
  430219: 75 41                   jne    43025c <_D9switchbug2CC6__vtblZ+0x1c>
  43021b: 00 00                   add    %al,(%rax)
  43021d: 00 00                   add    %al,(%rax)
  ...

Ignoring the instructions on the right (they are basically objdump getting confused by data that aren't intended to be instructions), we see that the sequence a4 75 41 00 00 00 00 00 appears exactly 4 times, consecutively. This corresponds with the 4 cases of the switch statement. So, this must be where the real jump table is, at 430200, NOT at 430120 (which is 224 (0xE0) bytes too early).

It seems unlikely that the compiler would screw things up *this* bad (esp. since it didn't screw up in my environment!), so this seems to reinforce my conclusion that the fault lies with the linker.

Well, I hope this helps. :)

[...]
> p.s. How do you prefer to be addressed?

I go by my last name.

T

--
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
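The search described in this post (look for the expected jump target, see how many times it repeats back to back, then compare its location with the address the code actually uses) can be sketched as follows. `countConsecutive` is a hypothetical helper, and the byte array is a hand-built model of the quoted listing rather than Jerry's real binary; in particular, the final 00 of the fourth entry is assumed, since objdump elided it as "...".

```d
import std.stdio : writefln;

// Count how many copies of `pattern` sit back to back in `data`,
// starting at index `start`.
size_t countConsecutive(const ubyte[] data, size_t start, const ubyte[] pattern)
{
    assert(pattern.length > 0);
    size_t count = 0;
    while (start + pattern.length <= data.length
        && data[start .. start + pattern.length] == pattern)
    {
        ++count;
        start += pattern.length;
    }
    return count;
}

void main()
{
    // Little-endian encoding of 0x4175a4, the expected jump target.
    immutable ubyte[8] entry = [0xa4, 0x75, 0x41, 0x00, 0x00, 0x00, 0x00, 0x00];

    // Model of the bytes from 4301ff onwards: one leading byte, then the
    // four table entries (last byte assumed to be 00).
    ubyte[] dump = [0x00];
    foreach (i; 0 .. 4)
        dump ~= entry[];

    // Index 1 of the model corresponds to address 430200.
    writefln("entries found: %s", countConsecutive(dump, 1, entry[]));

    // Distance between the table the code uses and the one found above.
    enum ulong usedTable = 0x430120;
    enum ulong realTable = 0x430200;
    writefln("offset: %s (0x%x)", realTable - usedTable, realTable - usedTable);
}
```

Run against a real dump, the same idea would need the file's section offsets to map addresses to byte positions; that mapping is left out here.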
Nov 27 2013
Might be related?
https://d.puremagic.com/issues/show_bug.cgi?id=11406

Kenji Hara

2013/11/27 Jerry <jlquinn optonline.net>
> Philippe Sigaud <philippe.sigaud gmail.com> writes:
>> No crash on Linux (Kubuntu) 32 bit, DMD 2.064.2. Works with 5 or 6 cases also.
>
> This is actually Ubuntu 12.10 64 bit. The last mysterious crash I had was due to the use of gold vs ld. It seems Debian and Ubuntu are the only ones using gold at the moment. I'm wondering if this problem is the same kind of thing.
>
> Jerry
Nov 28 2013
On Thu, Nov 28, 2013 at 05:11:25PM +0900, Kenji Hara wrote:
> Might be related?
> https://d.puremagic.com/issues/show_bug.cgi?id=11406
[...]

Based on my investigation, I'm pretty sure it's the same bug.

T

--
Winners never quit, quitters never win. But those who never quit AND never win are idiots.
Dec 02 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
> On Thu, Nov 28, 2013 at 05:11:25PM +0900, Kenji Hara wrote:
>> Might be related?
>> https://d.puremagic.com/issues/show_bug.cgi?id=11406
> [...]
> Based on my investigation, I'm pretty sure it's the same bug.

I concur. Compiling with head gives me a working binary for this test.

Thanks
Jerry
Dec 02 2013
No crash on ArchLinux 3.12.0-1-ARCH x86_64 GNU/Linux
Nov 27 2013
No crash on Ubuntu 13.10 x86_64 using the dmd 2.064.2 package downloaded from dlang.org.
Nov 27 2013