digitalmars.D - Threads and GC
- Juan Jose Comellas (64/64) Mar 17 2006 I'm having a problem with the garbage collector when working with thread...
- Sean Kelly (14/84) Mar 17 2006 To sum up, Kris had encountered deadlock problems both with Phobos and
I'm having a problem with the garbage collector when working with threads and DMD 0.149 on Linux. I'm currently writing an application to test some socket-related functionality and it's crashing whenever the garbage collector kicks in.

I have two threads (one acting as server and the other one acting as client). Both threads are running tight loops processing messages from each other. In each of the iterations, a small amount of memory is used. At some point, the garbage collector is activated and the SIGUSR1 signal is sent to suspend all the other threads, and just after that I see a crash in the other thread.

From what I've seen of Phobos, when activating the garbage collector, the threads are suspended using the SIGUSR1 signal and are resumed with the SIGUSR2 signal. In my test I never see the SIGUSR2 signal being sent.

Has anybody else seen something like this before? It seems that Sean and Kris have found some problem with the GC too in Ares, but I haven't read their postings yet (dsource.org is down right now).

In case anybody else finds the backtraces useful, I'm including what I could get using an unpatched gdb:

Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 1442708400 (LWP 8344)]
0x5557a84e in send () from /lib/tls/libpthread.so.0
(gdb) bt
_D5mango2io6Socket6Socket4sendFAvE5mango2io6Socket6Socket5FlagsZi () at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:1413
at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:869
at /home/jcomellas/devel/d/mango_test/mango/io/Conduit.d:198
selector.d:308
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1433270496 (LWP 8341)]
0x080673b1 in _D3gcx3Gcx4markFPvPvZv ()
(gdb) bt
_D5mango10containers7HashMap89__T7HashMapTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ7HashMap8iteratorFZC5mango10containers8Iterator101__T18MutableMapIteratorTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ18MutableMapIterator () at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:303
_D5mango2io8selector14SelectSelector18SelectSelectionSet7opApplyFDFKC5mango2io8selector5model9ISelector12SelectionKeyZiZi () at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:609
_D8selector12testSelectorFC5mango2io8selector5model9ISelector9ISelectorZv () at selector.d:130
Mar 17 2006
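The suspend/resume scheme described in the post above (SIGUSR1 to park a thread, SIGUSR2 to release it) can be sketched in C roughly as follows. This is a minimal illustration assuming Linux with POSIX threads and unnamed semaphores; it is not the Phobos or Ares implementation, and every name in it (`demo_suspend_resume`, `suspend_handler`, `resume_handler`, `ack`, `counter`) is hypothetical. The "collector" here does no marking at all; it only checks that the worker stays frozen between the two signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

static sem_t ack;               /* posted by the worker from its signal handler */
static atomic_long counter;     /* bumped by the worker's tight loop */
static atomic_int keep_running;

/* SIGUSR2 only has to interrupt sigsuspend(); the handler body is empty. */
static void resume_handler(int sig) { (void)sig; }

/* SIGUSR1: acknowledge, park until SIGUSR2 arrives, acknowledge again. */
static void suspend_handler(int sig)
{
    (void)sig;
    int saved_errno = errno;
    sigset_t wait_mask;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SIGUSR2);   /* only SIGUSR2 may wake us */
    sem_post(&ack);                   /* "I am suspended" (async-signal-safe) */
    sigsuspend(&wait_mask);           /* atomically unblock SIGUSR2 and wait */
    sem_post(&ack);                   /* "I am running again" */
    errno = saved_errno;
}

static void *worker(void *arg)
{
    (void)arg;
    while (atomic_load(&keep_running))
        atomic_fetch_add(&counter, 1);
    return NULL;
}

static void wait_ack(void)
{
    while (sem_wait(&ack) == -1 && errno == EINTR)
        ;                             /* retry if interrupted */
}

/* Returns 0 if the worker made no progress while suspended. */
int demo_suspend_resume(void)
{
    struct sigaction sa = {0};
    sigfillset(&sa.sa_mask);          /* keep SIGUSR2 blocked inside the handler */
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = suspend_handler;
    sigaction(SIGUSR1, &sa, NULL);
    sa.sa_handler = resume_handler;
    sigaction(SIGUSR2, &sa, NULL);

    sem_init(&ack, 0, 0);
    atomic_store(&counter, 0);
    atomic_store(&keep_running, 1);

    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);

    pthread_kill(tid, SIGUSR1);       /* "stop the world" */
    wait_ack();
    long before = atomic_load(&counter);
    struct timespec pause = {0, 20 * 1000 * 1000};
    nanosleep(&pause, NULL);          /* a real collector would mark here */
    long after = atomic_load(&counter);

    pthread_kill(tid, SIGUSR2);       /* "restart the world" */
    wait_ack();
    while (atomic_load(&counter) == after)
        sched_yield();                /* wait until the worker moves again */

    atomic_store(&keep_running, 0);
    pthread_join(tid, NULL);
    sem_destroy(&ack);
    return before == after ? 0 : 1;
}
```

In a scheme like this, a collector that faults between the two `pthread_kill` calls never sends SIGUSR2, which would be consistent with the report above of seeing SIGUSR1 but never SIGUSR2. That reading is an inference from the backtrace, not something the thread confirms.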
Juan Jose Comellas wrote:
> I'm having a problem with the garbage collector when working with threads and DMD 0.149 on Linux. I'm currently writing an application to test some socket-related functionality and it's crashing whenever the garbage collector kicks in. I have two threads (one acting as server and the other one acting as client). Both threads are running tight loops processing messages from each other. In each of the iterations, a small amount of memory is used. At some point, the garbage collector is activated and the SIGUSR1 signal is sent to suspend all the other threads, and just after that I see a crash in the other thread. From what I've seen of Phobos, when activating the garbage collector, the threads are suspended using the SIGUSR1 signal and are resumed with the SIGUSR2 signal. In my test I never see the SIGUSR2 signal being sent. Has anybody else seen something like this before? It seems that Sean and Kris have found some problem with the GC too in Ares, but I haven't read their postings yet (dsource.org is down right now).

To sum up, Kris had encountered deadlock problems both with Phobos and with Ares. I've since fixed Ares and have been trying to suss out the Phobos issues. I've been focusing on the Win32 code up to now, and have found a potential resource leak with Phobos threads, but no sign of a potential deadlock yet. But perhaps I should give the Posix code a look as well.

> In case anybody else finds the backtraces useful, I'm including what I could get using an unpatched gdb:
>
> Program received signal SIGUSR1, User defined signal 1.
> [Switching to Thread 1442708400 (LWP 8344)]
> 0x5557a84e in send () from /lib/tls/libpthread.so.0
> (gdb) bt
> _D5mango2io6Socket6Socket4sendFAvE5mango2io6Socket6Socket5FlagsZi () at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:1413
> at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:869
> at /home/jcomellas/devel/d/mango_test/mango/io/Conduit.d:198
> selector.d:308
> (gdb) cont
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1433270496 (LWP 8341)]
> 0x080673b1 in _D3gcx3Gcx4markFPvPvZv ()
> (gdb) bt
> _D5mango10containers7HashMap89__T7HashMapTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ7HashMap8iteratorFZC5mango10containers8Iterator101__T18MutableMapIteratorTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ18MutableMapIterator () at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:303
> _D5mango2io8selector14SelectSelector18SelectSelectionSet7opApplyFDFKC5mango2io8selector5model9ISelector12SelectionKeyZiZi () at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:609
> _D8selector12testSelectorFC5mango2io8selector5model9ISelector9ISelectorZv () at selector.d:130

Hrm, so the GC thread blows up while trying to scan into pthread library code? I don't see any reason for this to happen, so long as the stack range being passed to the GC is valid. I know there are some library functions that are not considered cancelable, but I would think that they simply turn off signal handling for the span where that's true.

Sean
Mar 17 2006
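The reply above speculates that a library function could "turn off signal handling" for a span of code. Whether the pthread library actually does this inside `send()` is not established in the thread. The following C sketch only shows what such a span looks like with `pthread_sigmask`: a SIGUSR1 raised while the signal is blocked stays pending and its handler runs only once the signal is unblocked. The names `demo_deferred_signal`, `on_usr1`, and `handled` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static volatile sig_atomic_t handled = 0;

static void on_usr1(int sig) { (void)sig; handled = 1; }

/* Returns 0 if SIGUSR1 was deferred during the blocked span
   and delivered once the span ended. */
int demo_deferred_signal(void)
{
    struct sigaction sa = {0};
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_usr1;
    sigaction(SIGUSR1, &sa, NULL);
    handled = 0;

    sigset_t block, pending;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    /* Start of the span in which SIGUSR1 must not interrupt this thread. */
    pthread_sigmask(SIG_BLOCK, &block, NULL);

    pthread_kill(pthread_self(), SIGUSR1);   /* stands in for the GC's suspend request */
    int ran_early = handled;                 /* expected to be 0: handler has not run */
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGUSR1);

    /* End of the span: the pending SIGUSR1 is delivered before this call returns. */
    pthread_sigmask(SIG_UNBLOCK, &block, NULL);

    return (!ran_early && was_pending == 1 && handled == 1) ? 0 : 1;
}
```

One consequence for a signal-based collector: if it waits for an acknowledgement from a thread that currently has the suspend signal blocked, it waits until that thread leaves the blocked span.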