
digitalmars.D.bugs - [Issue 15939] New: GC.collect causes deadlock in multi-threaded environment

https://issues.dlang.org/show_bug.cgi?id=15939

          Issue ID: 15939
           Summary: GC.collect causes deadlock in multi-threaded
                    environment
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: blocker
          Priority: P1
         Component: druntime
          Assignee: nobody puremagic.com
          Reporter: apreobrazhensky gmail.com

I have a multi-threaded application in which worker threads do memory-intensive
work and the main thread periodically cleans up garbage by calling GC.collect
manually. Sometimes GC.collect deadlocks. I don't have a simple example, but I
do have stack traces of the threads at the moment of the deadlock.

It happens with both dmd 2.071.0 and dmd 2.070.* (so it is not related to the
recent GC spinlock change).

This seems like a blocker to me: I suspect that if it happens when I call
GC.collect manually, it could also happen during normal collections. I'm not
familiar with the runtime code, but judging from the stack traces below I would
expect some sort of race condition.
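
For orientation, the usage pattern described above has roughly the following
shape. This is a hypothetical sketch only: it is not the actual application and
not a confirmed reproducer. The names `worker`, `runScenario` and
`stopRequested`, the thread count, the allocation sizes and the sleep interval
are all assumptions made for illustration. The workers append to arrays and
allocate buffers, which is assumed to exercise the same GC extend and malloc
paths that appear in the traces below.

```d
import core.atomic : atomicLoad, atomicStore;
import core.memory : GC;
import core.thread : Thread;
import core.time : msecs;

// Hypothetical stop flag shared between the main thread and the workers.
shared bool stopRequested;

// Hypothetical worker: keeps allocating and growing GC-managed memory
// until asked to stop.
void worker()
{
    while (!atomicLoad(stopRequested))
    {
        int[] buf;
        foreach (i; 0 .. 10_000)
            buf ~= i;                       // appending may extend or reallocate via the GC
        auto tmp = new ubyte[](64 * 1024);  // plain GC allocation
        tmp[0] = 1;
    }
}

// Starts `workerCount` workers, then calls GC.collect `collections` times
// from the calling thread. Returns the number of collections that completed.
size_t runScenario(size_t workerCount, size_t collections)
{
    atomicStore(stopRequested, false);

    Thread[] threads;
    foreach (_; 0 .. workerCount)
    {
        auto t = new Thread(&worker);
        t.start();
        threads ~= t;
    }

    size_t done;
    foreach (_; 0 .. collections)
    {
        Thread.sleep(10.msecs);  // assumed interval between manual collections
        GC.collect();            // the call that reportedly hangs sometimes
        ++done;
    }

    atomicStore(stopRequested, true);
    foreach (t; threads)
        t.join();
    return done;
}

void main()
{
    runScenario(4, 20);
}
```

If a sketch like this did hit the problem, `runScenario` would be expected to
hang inside `GC.collect` rather than return; whether it does so on the affected
compiler versions has not been verified.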


Configuration:

dmd 2.071.0 with -O -release -inline -boundscheck=off

x86_64 x86_64 GNU/Linux


This is the main thread's stack trace:

Thread 1 (Thread 0x7ff6653bb6c0 (LWP 6857)):




gc.gc.GC.__T9runLockedS49_D2gc2gc2GC11fullCollectMFNbZ2goFNbPS2gc2gc3GcxZmTPS2gc2gc3GcxZ.runLocked()
()


... application stack


This is what the stack trace looks like for the threads that were suspended
correctly:

Thread XX (Thread 0x7ff5c6ffd700 (LWP 6897)):

../sysdeps/unix/sysv/linux/sigsuspend.c:63

../sysdeps/unix/sysv/linux/sigsuspend.c:78




... application stack


This is what the stack trace looks like for the threads that weren't suspended:

Thread YY (Thread 0x7ff5c67fc700 (LWP 6898)):





gc.gc.GC.__T9runLockedS46_D2gc2gc2GC12extendNoSyncMFNbPvmmxC8TypeInfoZmS21_D2gc2gc10extendTimelS21_D2gc2gc10numExtendslTPvTmTmTxC8TypeInfoZ.runLocked()
()


... application stack

Thread ZZ (Thread 0x7ff566ffd700 (LWP 6918)):





gc.gc.GC.__T9runLockedS47_D2gc2gc2GC12mallocNoSyncMFNbmkKmxC8TypeInfoZPvS21_D2gc2gc10mallocTimelS21_D2gc2gc10numMallocslTmTkTmTxC8TypeInfoZ.runLocked()
()



... application stack

--
Apr 18 2016