digitalmars.D.learn - GC dead-locking ?
- Marco Leise (19/19) Jun 13 2013 Here is an excerpt from a stack trace I got while profiling
- Marco Leise (7/7) Jun 13 2013 One more note: I get this consistently during profiling, but
- Sean Kelly (11/29) Jun 17 2013 size=16401, this=...) gc/gcx.d:503
- Marco Leise (7/29) Jun 18 2013 No, I have not overridden the signal handler. I'm aware of the
- Sean Kelly (13/42) Jun 18 2013 =16401, this=...) gc/gcx.d:503
- Marco Leise (8/41) Jul 01 2013 I could do that (with a little work setting the scenario up
Here is an excerpt from a stack trace I got while profiling with OProfile:

  alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
  this=...) gc/gcx.d:503
  alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
  bitLengths=...) sequencer/algorithm/gzip.d:444

Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait()), called from my own code and not from druntime code. Is there some obvious way I could have dead-locked the GC? Or is there a bug? This was compiled with GDC using DMD FE 2.062.

-- Marco
Jun 13 2013
One more note: I get this consistently during profiling, but not without. I don't rule out kernel involvement either, since OProfile is a kernel-based profiler and there could be a quirk in its interaction with semaphores.

-- Marco
Jun 13 2013
On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise gmx.de> wrote:

> Here is an excerpt from a stack trace I got while profiling with OProfile:
>
>   poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
>   size=16401, this=...) gc/gcx.d:503
>   alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
>   (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
>
> Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait()), called from my own code and not from druntime code. Is there some obvious way I could have dead-locked the GC? Or is there a bug?

I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?
Jun 17 2013
On Mon, 17 Jun 2013 10:46:19 -0700, Sean Kelly <sean invisibleduck.org> wrote:

> I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?

No, I have not overridden the signal handler. I'm aware that signals can make pthread_cond_wait() return early, so I have put the calls in a while loop as one would expect. That is all.

-- Marco
Jun 18 2013
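The "while loop" mentioned above is the usual predicate loop around pthread_cond_wait(). A minimal C sketch follows. The gate_t type and all function names are hypothetical, and this is not the poster's actual code. POSIX allows a waiter that receives a signal either to resume waiting after the handler returns or to return as a spurious wakeup, and the loop is correct in both cases.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot gate: a flag guarded by a mutex and a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* the predicate the waiter actually cares about */
} gate_t;

void gate_init(gate_t *g) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->ready = false;
}

/* Blocks until the gate is open. The loop re-checks the predicate, so an
   early return from pthread_cond_wait() just leads to another wait. */
void gate_wait(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    while (!g->ready)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Opens the gate and wakes every waiter. */
void gate_open(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    g->ready = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

static void *waiter(void *arg) {
    gate_wait((gate_t *)arg);
    return NULL;
}

/* Small driver: one waiter thread, released by the calling thread. */
int gate_demo(void) {
    gate_t g;
    pthread_t t;
    gate_init(&g);
    if (pthread_create(&t, NULL, waiter, &g) != 0) return -1;
    gate_open(&g);
    if (pthread_join(t, NULL) != 0) return -1;
    return g.ready ? 0 : -1;
}
```

Because the flag is set under the mutex before the broadcast, the waiter sees it whether it reaches gate_wait() before or after gate_open() runs.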
On Jun 18, 2013, at 7:01 AM, Marco Leise <Marco.Leise gmx.de> wrote:

> No, I have not overridden the signal handler. I'm aware of the fact that signals make pthread_cond_wait() return early and put them in a while loop as one would expect, that is all.

Hrm... Can you trap this in a debugger and post the stack traces of all threads? That stack above is a thread waiting for others to say they're suspended so it can collect.
Jun 18 2013
On Tue, 18 Jun 2013 19:12:06 -0700, Sean Kelly <sean invisibleduck.org> wrote:

> Hrm... Can you trap this in a debugger and post the stack traces of all threads? That stack above is a thread waiting for others to say they're suspended so it can collect.

I could do that (with a little work setting the scenario up again), but it won't help. As I said, the other two threads were paused in pthread_cond_wait() in my own code. There was nothing special about their stack traces.

-- Marco
Jul 01 2013