digitalmars.D.learn - druntime thread (from foreach parallel?) cleanup bug
- mw (102/102) Nov 01 2022 My program received signal SIGSEGV, Segmentation fault.
- H. S. Teoh (9/25) Nov 01 2022 Can you show a code snippet that includes the parallel foreach? Because
- Ali Çehreli (8/9) Nov 01 2022 Did you mean dustmite, which is accessible as 'dub dustmite
- H. S. Teoh (7/17) Nov 01 2022 [...]
- mw (16/17) Nov 01 2022 (It's just a very straight forward foreach on an array; as I said
- Steven Schveighoffer (4/31) Nov 01 2022 Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless
- Steven Schveighoffer (9/11) Nov 01 2022 As discussed on discord, this isn't true actually. All it does is
- mw (22/48) Nov 01 2022 Maybe the hunt library author doesn't know. (My code does not
- Imperatorn (5/12) Nov 01 2022 Please, if you see anything in the docs that needs to be updated,
My program received signal SIGSEGV, Segmentation fault.

Its simplified structure looks like this:

```
void foo() {
  ...
  writeln("done");  // saw this got printed!
}

int main() {
  foo();
  return 0;
}
```

So the program failed just before it exited. I suspect druntime has a thread cleanup bug somewhere (maybe due to foreach parallel), which is unrelated to my own code. This kind of bug is hard to reproduce, so I'm not sure if I should file an issue.

I'm using: LDC - the LLVM D compiler (1.30.0) on x86_64.

Under gdb, here is the threads info (for the record):

Thread 11 "xxx" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x1555553df700 (LWP 36258)]
__GI___res_iclose (free_addr=true, statp=0x1555553dfdb8) at res-close.c:103
103 res-close.c: No such file or directory.

(gdb) info threads
  Id Target Id Frame
  1 Thread 0x155555515000 (LWP 36244) "lt" 0x0000155550af1d2d in __GI___pthread_timedjoin_ex (threadid=23456246527744, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
* 11 Thread 0x1555553df700 (LWP 36258) "lt" __GI___res_iclose (free_addr=true, statp=0x1555553dfdb8) at res-close.c:103
  17 Thread 0x155544817700 (LWP 36264) "lt" 0x0000155550afac70 in __GI___nanosleep (requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28

(gdb) thread 1
[Switching to thread 1 (Thread 0x155555515000 (LWP 36244))]
(threadid=23456246527744, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
89 pthread_join_common.c: No such file or directory.
(gdb) where
(threadid=23456246527744, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
core.thread.osthread.joinLowLevelThread(ulong) ()
_D4core8internal2gc4impl12conservativeQw3Gcx15stopScanThreadsMFNbZv ()
_D4core8internal2gc4impl12conservativeQw3Gcx4DtorMFZv ()
_D4core8internal2gc4impl12conservativeQw14ConservativeGC6__dtorMFZv ()
_D2rt6dmain212_d_run_main2UAAamPUQgZiZ6runAllMFZv ()
//home/zhou/project/ldc2-1.30.0-linux-x86_64/bin/../import/core/internal/entrypoint.d:42
<main>, argc=2, argv=0x7fffffffe188, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe178) at ../csu/libc-start.c:310

(gdb) thread 11
[Switching to thread 11 (Thread 0x1555553df700 (LWP 36258))]
res-close.c:103
103 res-close.c: No such file or directory.
(gdb) where
res-close.c:103
thread-freeres.c:29
pthread_create.c:476
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 17
[Switching to thread 17 (Thread 0x155544817700 (LWP 36264))]
(requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28 ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) where
(requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
_D4core6thread8osthread6Thread5sleepFNbNiSQBo4time8DurationZv ()
_D4hunt4util8DateTimeQj25_sharedStaticCtor_L406_C5FZ9__lambda4MFZv () at home/zhou/.dub/packages/hunt-1.7.16/hunt/source/hunt/util/DateTime.d:430
pthread_create.c:463
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Nov 01 2022
On Tue, Nov 01, 2022 at 05:19:56PM +0000, mw via Digitalmars-d-learn wrote:
> My program received signal SIGSEGV, Segmentation fault.
>
> Its simplified structure looks like this:
>
> ```
> void foo() {
>   ...
>   writeln("done");  // saw this got printed!
> }
>
> int main() {
>   foo();
>   return 0;
> }
> ```

Can you show a code snippet that includes the parallel foreach? Because the above code snippet is over-simplified to the point it's impossible to tell what the original problem might be, since obviously calling a function that calls writeln would not crash the program.

Maybe try running Digger to reduce the code for you?

T

--
Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
Nov 01 2022
On 11/1/22 10:27, H. S. Teoh wrote:
> Maybe try running Digger to reduce the code for you?

Did you mean dustmite? It is accessible as 'dub dustmite <destination-path>', but I haven't used it.

My guess for the segmentation fault is that the OP is executing destructor code that assumes some members are still alive. If so, that code should be moved out of the destructors and into functions that are called explicitly, like obj.close(). But it's just a guess...

Ali
Nov 01 2022
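To illustrate the pattern Ali is guessing at, here is a minimal sketch. The `Connection` class and its members are hypothetical and are not taken from the OP's code or from hunt. It relies on the D rule that the GC may finalize objects in an unspecified order, so a class destructor run by the GC cannot assume its GC-managed members are still valid, while an explicit `close()` runs at a known point.

```d
import std.stdio : writeln;

// Hypothetical resource wrapper, for illustration only.
final class Connection
{
    private int[] buffer;      // GC-managed member
    private bool open = true;

    this() { buffer = new int[](16); }

    // Explicit cleanup: called at a known point, while every member is
    // still valid. The same access inside ~this() could touch memory the
    // GC has already reclaimed when the destructor runs during collection.
    void close()
    {
        if (!open) return;     // closing twice is harmless
        buffer[] = 0;
        open = false;
    }

    bool isOpen() const { return open; }
}

void main()
{
    auto c = new Connection;
    scope (exit) c.close();    // deterministic, unlike a GC-run destructor
    writeln("working, open = ", c.isOpen);
}
```

Whether this applies to the OP's crash is unknown; it only shows what "move the code from destructors to functions" could look like.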
On Tue, Nov 01, 2022 at 10:37:57AM -0700, Ali Çehreli via Digitalmars-d-learn wrote:
> On 11/1/22 10:27, H. S. Teoh wrote:
> > Maybe try running Digger to reduce the code for you?
>
> Did you mean dustmite, which is accessible as 'dub dustmite <destination-path>' but I haven't used it.

Oh yes, sorry, I meant dustmite, not digger. :-P

> My guess for the segmentation fault is that the OP is executing destructor code that assumes some members are alive. If so, the code should be moved from destructors to functions to be called like obj.close(). But it's just a guess...
[...]

Yes, that's a common gotcha.

T

--
We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Nov 01 2022
> Can you show a code snippet that includes the parallel foreach?

(It's just a very straightforward foreach on an array; as I said, it may not be relevant.)

And I just noticed that one of the thread traces points to here:

https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430

```
class DateTime {
  shared static this() {
    ...
    dateThread.isDaemon = true; // not sure if this is related
  }
}
```

In the comments, it says: "BUG: ... crashed". It looks like someone has already run into this (cleanup) issue but was unable to fix it.

Anyway, I logged an issue there:

https://github.com/huntlabs/hunt/issues/96
Nov 01 2022
On 11/1/22 1:47 PM, mw wrote:
> > Can you show a code snippet that includes the parallel foreach?
>
> (It's just a very straight forward foreach on an array; as I said it may not be relevant.)
>
> And I just noticed, one of the thread trace points to here:
>
> https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430
>
> ```
> class DateTime {
>   shared static this() {
>     ...
>     dateThread.isDaemon = true; // not sure if this is related
>   }
> }
> ```
>
> in the comments, it said: "BUG: ... crashed". Looks like someone run into this (cleanup) issue already, but unable to fix it.
>
> Anyway I logged an issue there:
>
> https://github.com/huntlabs/hunt/issues/96

Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

-Steve
Nov 01 2022
On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
> Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

As discussed on Discord, this isn't actually true. All it does is prevent the thread from being joined before the runtime exits.

What is *likely* happening is this: the runtime shuts down while that thread is still running, so the D runtime is gone underneath it. The thread eventually tries to do something (say, access thread-local storage) that is no longer there. Hence the segfault.

-Steve
Nov 01 2022
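If Steve's explanation is right, one way to avoid the race is to leave `isDaemon` at its default and stop the background thread cooperatively before `main` returns. The sketch below is not hunt's code. The `ticker` function and the `stopRequested` flag are hypothetical names, and only `core.thread`, `core.atomic`, and `core.time` calls are used.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import core.time : msecs;
import std.stdio : writeln;

shared bool stopRequested;   // hypothetical stop flag

// Hypothetical background job: loops until asked to stop.
void ticker()
{
    while (!atomicLoad(stopRequested))
        Thread.sleep(10.msecs);
}

void main()
{
    auto t = new Thread(&ticker);
    // isDaemon is left at its default (false), so the thread is an
    // ordinary one that the runtime expects to finish before shutdown.
    t.start();

    Thread.sleep(50.msecs);            // stand-in for the real work

    atomicStore(stopRequested, true);  // ask the thread to finish...
    t.join();                          // ...and wait until it has
    writeln("ticker stopped cleanly");
}
```

The flag matters here: without it, a non-daemon thread that loops forever would keep the program from exiting, which is presumably why a library would reach for `isDaemon` in the first place.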
On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
> > And I just noticed, one of the thread trace points to here:
> >
> > https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430
> >
> > ```
> > class DateTime {
> >   shared static this() {
> >     ...
> >     dateThread.isDaemon = true; // not sure if this is related
> >   }
> > }
> > ```
> >
> > in the comments, it said: "BUG: ... crashed". Looks like someone run into this (cleanup) issue already, but unable to fix it.
> >
> > Anyway I logged an issue there:
> >
> > https://github.com/huntlabs/hunt/issues/96
>
> Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

Maybe the hunt library author doesn't know. (My code does not directly use this library; it got pulled in by some other dependencies.)

Currently, the `isDaemon` doc does not mention this:

https://dlang.org/library/core/thread/threadbase/thread_base.is_daemon.html

"Sets the daemon status for this thread. While the runtime will wait for all normal threads to complete before tearing down the process, daemon threads are effectively ignored and thus will not prevent the process from terminating. In effect, daemon threads will be terminated automatically by the OS when the process exits."

Maybe we should add it to the doc?

BTW, what exactly is going wrong with their code?

I saw that the tick() method call inside the anonymous `dateThread` accesses these two stack variables of shared static this():

https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L409

Appender!(char[])[2] bufs;
const(char)[][2] targets;

Why does this tick() call work after the static this() has finished in a normal run? Why does the problem only show up when the program finishes?
Nov 01 2022
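On the first question, one likely explanation (not confirmed anywhere in this thread) is D's closure rule: when a delegate that refers to local variables escapes the function that created it, the compiler allocates those variables on the GC heap, not on the stack, so they outlive the function. A minimal sketch, with the hypothetical names `makeWorker` and `counters` standing in for the static constructor and its `bufs`/`targets`:

```d
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical stand-in for the static constructor in DateTime.d.
Thread makeWorker()
{
    int[2] counters;   // looks like a stack variable

    // The delegate literal uses `counters` and escapes through the
    // Thread object, so the compiler places `counters` in a
    // heap-allocated closure that outlives this function.
    return new Thread({
        counters[0] = 41;
        counters[1] = counters[0] + 1;
        writeln("counters = ", counters);
    });
}

void main()
{
    auto t = makeWorker();   // makeWorker() has already returned here
    t.start();               // the delegate still sees valid storage
    t.join();
}
```

Under that assumption the captured variables stay valid for as long as the delegate is reachable, which would explain why tick() works during a normal run; the shutdown crash would then be a separate matter of the runtime being torn down while the daemon thread is still running.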
On Tuesday, 1 November 2022 at 19:49:47 UTC, mw wrote:
> On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
> > [...]
>
> Maybe the hunt library author doesn't know. (My code does not directly use this library, it got pulled in by some other dependencies.)
> [...]

Please, if you see anything in the docs that needs to be updated, make a PR right away <3

Documentation saves lives!

The times I have thought "I'll do it later" have been too many.
Nov 01 2022