digitalmars.D - Can you reproduce this threading bug?

reply FeepingCreature <feepingcreature gmail.com> writes:
Consider the following code:

void main()
{
     import core.thread : Thread;

     with (new Thread({ })) { isDaemon = true; start; }
}

On Linux, this builds and runs. Most of the time. Maybe 99% of 
the time.

But if you run it in a loop:

while true; do ./test || break; done

You may see that it segfaults after a few seconds. At least, it 
does for me on 2.080.0, Linux 4.18.0-20 x86_64.
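
For anyone trying to reproduce this, a bounded variant of the loop above may be handy, since it reports how many runs it took and the exit status of the failing run. This is only a sketch: the function name repro_loop and the run limit are made up for illustration and are not part of the original report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): run a command
# repeatedly until it fails or until max_runs attempts have passed.
# Usage: repro_loop MAX_RUNS COMMAND [ARGS...]
repro_loop() {
    max_runs=$1
    shift
    i=0
    while [ "$i" -lt "$max_runs" ]; do
        i=$((i + 1))
        status=0
        # Capture the exit status without tripping `set -e`.
        "$@" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "run $i failed with exit status $status"
            return "$status"
        fi
    done
    echo "no failure in $max_runs runs"
    return 0
}
```

For example, `repro_loop 10000 ./test`. In bash, a status of 139 normally means the process was killed by SIGSEGV (128 + 11), and 134 means SIGABRT (128 + 6).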

This is obviously quite bad. Any ideas?
Jun 14
next sibling parent ag0aep6g <anonymous example.com> writes:
On 14.06.19 18:36, FeepingCreature wrote:
 Consider the following code:
 
 void main()
 {
      import core.thread : Thread;
 
      with (new Thread({ })) { isDaemon = true; start; }
 }
 
 On Linux, this builds and runs. Most of the time. Maybe 99% of the time.
 
 But if you run it in a loop:
 
 while true; do ./test || break; done
 
 You may see that it segfaults after a few seconds. At least, it does for 
 me on 2.080.0, Linux 4.18.0-20 x86_64.
Can reproduce. DMD 2.086.0, Linux 5.0.0-16-generic x86_64
Jun 14
prev sibling next sibling parent Alex <sascha.orlov gmail.com> writes:
On Friday, 14 June 2019 at 16:36:04 UTC, FeepingCreature wrote:
 Consider the following code:

 void main()
 {
     import core.thread : Thread;

     with (new Thread({ })) { isDaemon = true; start; }
 }

 On Linux, this builds and runs. Most of the time. Maybe 99% of 
 the time.

 But if you run it in a loop:

 while true; do ./test || break; done

 You may see that it segfaults after a few seconds. At least, it 
 does for me on 2.080.0, Linux 4.18.0-20 x86_64.

 This is obviously quite bad. Any ideas?
Can reproduce. DMD64 D Compiler v2.086.0, macOS 10.13.6; Darwin Kernel Version 17.7.0; x86_64
Jun 14
prev sibling next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
Happens for me at home too, with ldc2-1.11 on 4.14.111.

I think with the Mac user reporting in, we can exclude a kernel 
or glibc issue. Damn.
Jun 14
parent Antonio Corbi <antonio ggmail.com> writes:
On Friday, 14 June 2019 at 18:41:41 UTC, FeepingCreature wrote:
 Happens for me at home too, with ldc2-1.11 on 4.14.111.

 I think with the Mac user reporting in, we can exclude a kernel 
 or glibc issue. Damn.
Using Arch Linux:
Linux h 16:19:25 UTC 2019 x86_64 GNU/Linux

And dmd:
dmd --version
DMD64 D Compiler v2.086.0

Compiling with "dmd -g" and running the same loop but inside gdb (while true; do gdb -ex run -ex q ./ttest || break; done), this is the stack trace I get:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff7b1c700 (LWP 16006)]

Thread 2 "ttest" received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x7ffff7b1c700 (LWP 16006)]
0x00007ffff7f6708a in __lll_unlock_wake () from /usr/lib/libpthread.so.0
A debugging session is active.

        Inferior 1 [process 15984] will be killed.

Quit anyway? (y or n) n
Not confirmed.
(gdb) bt
/usr/lib/libpthread.so.0
#1  0x00007ffff7f61a66 in __pthread_mutex_unlock_usercnt () from /usr/lib/libpthread.so.0
_D4core4sync5mutex5Mutex__T14unlock_nothrowTCQBrQBpQBnQBkZQBfMFNbNiNeZv ()
_D4core6thread6Thread3addFNbNiCQBdQBbQxbZv ()
#4  0x00005555555a0088 in thread_entryPoint ()
#5  0x00007ffff7f5da92 in start_thread () from /usr/lib/libpthread.so.0

Hope this helps.
Antonio
Jun 14
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2019-06-14 18:36, FeepingCreature wrote:
 Consider the following code:
 
 void main()
 {
      import core.thread : Thread;
 
      with (new Thread({ })) { isDaemon = true; start; }
 }
 
 On Linux, this builds and runs. Most of the time. Maybe 99% of the time.
 
 But if you run it in a loop:
 
 while true; do ./test || break; done
 
 You may see that it segfaults after a few seconds. At least, it does for 
 me on 2.080.0, Linux 4.18.0-20 x86_64.
 
 This is obviously quite bad. Any ideas?
On macOS I get a mixture of the following:

Aborting from src/core/sync/mutex.d(149) Error: pthread_mutex_destroy failed.Abort trap: 6

Aborting from src/core/sync/mutex.d(149) Error: pthread_mutex_destroy failed.Segmentation fault: 11

Aborting from Segmentation fault: 11

Pretty easy to reproduce. But when I tried in a debugger I failed to reproduce the segfault. Although, here is what the crash reporter logged:

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
abort() called

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib    0x00007fff4fe8cb66 __pthread_kill + 10
1   libsystem_pthread.dylib   0x00007fff50057080 pthread_kill + 333
2   libsystem_c.dylib         0x00007fff4fde81ae abort + 127
3   main                      0x00000001033978e2 _D4core8internal5abortQgFNbNiNfMAyaMQemZv + 262
4   main                      0x0000000103394e73 thread_term + 259
5   main                      0x00000001033a7268 rt_term + 88
6   main                      0x00000001033a796c _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZv + 208
7   main                      0x00000001033a7848 _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ7tryExecMFMDFZvZv + 36
8   main                      0x00000001033a77a8 _d_run_main + 764
9   main                      0x0000000103384e72 main + 34
10  libdyld.dylib             0x00007fff4fd3c015 start + 1

Thread 1:
0   main                      0x0000000103394ab3 _D4core6thread6Thread6removeFNbNiCQBgQBeQBaZv + 63
1   main                      0x0000000103393922 thread_entryPoint + 526
2   libsystem_pthread.dylib   0x00007fff50054661 _pthread_body + 340
3   libsystem_pthread.dylib   0x00007fff5005450d _pthread_start + 377
4   libsystem_pthread.dylib   0x00007fff50053bf9 thread_start + 13

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00007fff88782380  rcx: 0x00007ffeec87b158  rdx: 0x0000000000000000
  rdi: 0x0000000000000307  rsi: 0x0000000000000006  rbp: 0x00007ffeec87b190  rsp: 0x00007ffeec87b158
   r8: 0x000000000000000a   r9: 0x0000000000000011  r10: 0x0000000000000000  r11: 0x0000000000000206
  r12: 0x0000000000000307  r13: 0x00007ffeec87b3c6  r14: 0x0000000000000006  r15: 0x000000000000002d
  rip: 0x00007fff4fe8cb66  rfl: 0x0000000000000206  cr2: 0x000000010343d088

Logical CPU:     6
Error Code:      0x00000004
Trap Number:     14

-- 
/Jacob Carlborg
Jun 14
prev sibling next sibling parent rikki cattermole <rikki cattermole.co.nz> writes:
On 15/06/2019 4:36 AM, FeepingCreature wrote:
 Consider the following code:
 
 void main()
 {
      import core.thread : Thread;
 
      with (new Thread({ })) { isDaemon = true; start; }
 }
 
 On Linux, this builds and runs. Most of the time. Maybe 99% of the time.
 
 But if you run it in a loop:
 
 while true; do ./test || break; done
 
 You may see that it segfaults after a few seconds. At least, it does for 
 me on 2.080.0, Linux 4.18.0-20 x86_64.
 
 This is obviously quite bad. Any ideas?
Cannot reproduce under Windows 10 with dmd 2.082.0 and ldc2 1.12.0-beta2. But this does not say much: Windows has a different set of costs related to threads and processes, and there was a good second between runs.
Jun 14
prev sibling parent FeepingCreature <feepingcreature gmail.com> writes:
Filed as 19978! (Darn, I was hoping for 20000.)

https://issues.dlang.org/show_bug.cgi?id=19978

Thanks everyone for the help in excluding kernel, backend and 
stdlib as source!
Jun 16