digitalmars.D - GC + malloc/free = deadlock


Benjamin Thaut <code benjamin-thaut.de> writes:

I wrote a small "bad example" program for a presentation. It calls 
malloc/free very frequently and I wanted to profile the impact of this 
compared to a pool allocator solution. Unfortunately, I noticed 
that the program runs for only a fraction of a second before 
deadlocking. It seems the following situation occurs:

1) Thread 1 calls malloc(), and locks the internal malloc mutex
2) Thread 2 triggers garbage collection and stops thread 1
3) Thread 2 finishes collecting garbage and calls all destructors. One of 
them calls free(). Because the malloc mutex is still held by Thread 1, and 
Thread 1 will never release it while it is stopped, Thread 2 deadlocks.

Now this is not limited to malloc / free, it can happen with any kind of 
locking that is done within a destructor.
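
For illustration, the pattern in question can be sketched roughly as follows. This is only a minimal sketch, not the original program (which was not posted); the class name MallocBuffer and its members are made up for this example.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical example class: a GC-allocated object that owns a C-heap buffer.
class MallocBuffer
{
    private void* ptr;
    private size_t len;

    this(size_t size)
    {
        ptr = malloc(size); // goes through the C allocator and its internal locking
        len = size;
    }

    ~this()
    {
        // When the GC finalizes this object, the destructor runs on whichever
        // thread triggered the collection, not necessarily the allocating thread.
        free(ptr);
        ptr = null;
    }

    size_t length() const { return len; }
}

void main()
{
    // Allocate many short-lived objects so that collections are likely to occur.
    foreach (i; 0 .. 1_000)
    {
        auto b = new MallocBuffer(64);
        assert(b.length == 64);
    }
}
```

Whether such a destructor can actually deadlock depends on when the GC runs finalizers relative to suspending other threads, which is exactly the open question here.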

Currently I can only think of two ways to fix this:

1) Don't use free inside a destructor (goodbye manual memory management)
2) Add a callback that the GC triggers before it starts stopping any threads 
and again after it has finished destructing all objects, so that the 
necessary synchronization primitives can be locked / unlocked manually to 
prevent this kind of deadlock. The default implementation of this callback 
should lock / unlock the malloc mutex.

-- 
Kind Regards
Benjamin Thaut

Apr 22 2012

David <d dav1d.de> writes:

On 23.04.2012 at 08:41, Benjamin Thaut wrote:
 I wrote a small "bad example" program for a presentation. It calls
 malloc/free very frequently and I wanted to profile the impact of this
 compared to a pool allocator solution. Unfortunately I had to notice
 that the program only runs for a fraction of a second before
 deadlocking. As it seems the following situation occurs:

 1) Thread 1 calls malloc(), and locks the internal malloc mutex
 2) Thread 2 triggers garbage collection and stops thread 1
 3) Thread 2 is done collecting garbage and calls all destructors. One of
 these calls free(), as the malloc mutex is still locked by Thread 1 and
 Thread 1 will never release the mutex, as it is stopped, Thread 2 deadlocks.

 Now this is not limited to malloc / free, it can happen with any kind of
 locking that is done within a destructor.

 Currently I can only think of two ways to fix this:

 1) Don't use free inside a destructor (goodbye manual memory management)
 2) A callback triggered by the GC before it starts stopping any threads,
 and after it is done destructing all objects to manually lock / unlock
 necessary synchronization primitives to prevent this kind of
 deadlocking. The default implementation of this callback should lock /
 unlock the malloc mutex.

It's nearly the same with OpenGL: never, really never, put a 
glDelete* call in a dtor. I had this for a few commits in glamour (an 
OpenGL wrapper) and it kept raining segfaults.
The bigger problem was that this was caused only by the GC. I didn't tell 
it to collect garbage from another thread; it happened when the GC jumped 
in and called dtors.
I fixed it with "scope" and an additional .remove 
method. Maybe you can solve your problem the same way.
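
A rough sketch of what such an explicit-cleanup approach might look like is given below. The actual glamour code is not shown in this thread, so the Resource class, its members, and the use of a scope (exit) guard as one reading of "scope" are assumptions made for illustration.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical resource wrapper; names and layout are illustrative only.
class Resource
{
    private void* handle;

    this(size_t size)
    {
        handle = malloc(size);
    }

    // Explicit, deterministic cleanup. Safe to call more than once.
    void remove()
    {
        if (handle !is null)
        {
            free(handle);
            handle = null;
        }
    }

    bool released() const { return handle is null; }

    // Deliberately no destructor: a GC finalizer never touches the resource.
}

void useResource()
{
    auto res = new Resource(128);
    scope (exit) res.remove(); // runs on the calling thread when the scope ends
    // ... work with res here ...
}

void main()
{
    useResource();
}
```

The trade-off is that forgetting to call remove() leaks the resource, since nothing releases it automatically.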

Apr 22 2012

Benjamin Thaut <code benjamin-thaut.de> writes:

On 23.04.2012 at 08:52, David wrote:
 On 23.04.2012 at 08:41, Benjamin Thaut wrote:
 I wrote a small "bad example" program for a presentation. It calls
 malloc/free very frequently and I wanted to profile the impact of this
 compared to a pool allocator solution. Unfortunately I had to notice
 that the program only runs for a fraction of a second before
 deadlocking. As it seems the following situation occurs:

 1) Thread 1 calls malloc(), and locks the internal malloc mutex
 2) Thread 2 triggers garbage collection and stops thread 1
 3) Thread 2 is done collecting garbage and calls all destructors. One of
 these calls free(), as the malloc mutex is still locked by Thread 1 and
 Thread 1 will never release the mutex, as it is stopped, Thread 2
 deadlocks.

 Now this is not limited to malloc / free, it can happen with any kind of
 locking that is done within a destructor.

 Currently I can only think of two ways to fix this:

 1) Don't use free inside a destructor (goodbye manual memory management)
 2) A callback triggered by the GC before it starts stopping any threads,
 and after it is done destructing all objects to manually lock / unlock
 necessary synchronization primitives to prevent this kind of
 deadlocking. The default implementation of this callback should lock /
 unlock the malloc mutex.

 That's nearly the same with OpenGL, never, really never put a
 glDelete*-Call in a Dtor, I had this for a few commits in glamour (an
 OpenGL wrapper) and it kept raining Segfaults.
 The bigger problem was, that was only caused by the GC, I didn't tell
 it to collect garbage from another Thread, it happened when it jumped
 in, and called dtors.
 The way I have fixed it was with "scope" and an additional .remove
 method. Maybe this is the way you can solve your problem.

Well, the issue with OpenGL is that you can only make OpenGL calls from 
the thread that actually owns the OpenGL context, so this is not quite 
the same: I'm perfectly allowed to call free from a different thread than 
the one where I called malloc. Also, this is not only my problem: 
std.container.Array uses malloc & free, so it can happen there too.
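
To illustrate how std.container.Array could end up in the same situation, here is a minimal sketch. The Holder class is hypothetical, and the comments about where the storage lives and when it is released reflect my understanding of the implementation rather than a documented guarantee.

```d
import std.container.array : Array;

// Hypothetical GC-managed owner of a std.container.Array.
class Holder
{
    Array!int items; // element storage is assumed to live on the C heap

    this(size_t n)
    {
        foreach (i; 0 .. n)
            items.insertBack(cast(int) i);
    }

    // No destructor is written here. When the GC finalizes a Holder, the
    // destructor of the Array field runs as well and may release its storage.
}

void main()
{
    auto h = new Holder(4);
    assert(h.items.length == 4);
    assert(h.items[3] == 3);
}
```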

Apr 23 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 23 Apr 2012 02:41:21 -0400, Benjamin Thaut  
<code benjamin-thaut.de> wrote:

 I wrote a small "bad example" program for a presentation. It calls  
 malloc/free very frequently and I wanted to profile the impact of this  
 compared to a pool allocator solution. Unfortunately I had to notice  
 that the program only runs for a fraction of a second before  
 deadlocking. As it seems the following situation occurs:

 1) Thread 1 calls malloc(), and locks the internal malloc mutex
 2) Thread 2 triggers garbage collection and stops thread 1
 3) Thread 2 is done collecting garbage and calls all destructors. One of  
 these calls free(), as the malloc mutex is still locked by Thread 1 and  
 Thread 1 will never release the mutex, as it is stopped, Thread 2  
 deadlocks.

This shouldn't happen.  The collection routine is specifically designed to  
avoid this.  I remember when Sean put it into Tango after an IRC  
conversation.

The correct sequence is:

1. stop the world
2. Perform mark on all data (identifying which blocks should be freed)
3. resume the world
4. Call dtors on unreferenced data and deallocate.

This was done to fix the exact problem you are having, although in that case 
the malloc/free calls were made indirectly through C libraries.

Please file a bug.

-Steve

Apr 23 2012

Benjamin Thaut <code benjamin-thaut.de> writes:

On 23.04.2012 at 13:57, Steven Schveighoffer wrote:
 On Mon, 23 Apr 2012 02:41:21 -0400, Benjamin Thaut
 <code benjamin-thaut.de> wrote:

 I wrote a small "bad example" program for a presentation. It calls
 malloc/free very frequently and I wanted to profile the impact of this
 compared to a pool allocator solution. Unfortunately I had to notice
 that the program only runs for a fraction of a second before
 deadlocking. As it seems the following situation occurs:

 1) Thread 1 calls malloc(), and locks the internal malloc mutex
 2) Thread 2 triggers garbage collection and stops thread 1
 3) Thread 2 is done collecting garbage and calls all destructors. One
 of these calls free(), as the malloc mutex is still locked by Thread 1
 and Thread 1 will never release the mutex, as it is stopped, Thread 2
 deadlocks.

 This shouldn't happen. The collection routine is specifically designed
 to avoid this. I remember when Sean put it into Tango after an IRC
 conversation.

 The correct sequence is:

 1. stop the world
 2. Perform mark on all data (identifying which blocks should be freed)
 3. resume the world
 4. Call dtors on unreferenced data and deallocate.

 This was to fix the exact problem you are having, albeit the malloc/free
 were indirect by calls to C-libs.

 Please file a bug.

 -Steve

If what you are saying is true, the deadlock must happen somewhere else. 
This was kind of an assumption, because the deadlock first happened after 
I added all the malloc / free calls. Because all the threads are stopped 
when this happens, I can't really debug it: the Visual Studio 
debugger tells me that it cannot debug any thread but the one that is 
currently running the GC.

Kind Regards
Ingrater

Apr 23 2012

"Kapps" <opantm2+spam gmail.com> writes:

On Monday, 23 April 2012 at 12:11:19 UTC, Benjamin Thaut wrote:
 If what you are saying is true, the deadlock must happen 
 somewhere else. This was kind of an assumption because the 
 deadlock happened after I added all the malloc / free calls. 
 Because all the threads are stopped when this happens I can't 
 really debug this, because the Visual Studio debugger tells me 
 that it can not debug any thread but the one that is currently 
 running the GC.

 Kind Regards
 Ingrater

Are these threads created by core.thread.Thread, or are they 
created through other means, such as native OS calls or a C 
API? If the latter, you have to register them with the runtime (I 
think it was something like Thread.attachThis); otherwise issues 
like this will happen.
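
For reference, registering a foreign thread might look roughly like the sketch below. In druntime the registration functions are the free functions thread_attachThis and thread_detachThis in core.thread. The sketch assumes a POSIX system and uses pthread_create to stand in for "a thread created outside the runtime"; the function names foreignWorker and runForeignThread are made up.

```d
import core.thread : thread_attachThis, thread_detachThis;

version (Posix)
{
    import core.sys.posix.pthread : pthread_t, pthread_create, pthread_join;

    // Entry point of a thread that was not created via core.thread.Thread.
    extern (C) void* foreignWorker(void* arg)
    {
        thread_attachThis();              // make this thread known to the runtime
        scope (exit) thread_detachThis(); // unregister before the thread exits

        auto data = new int[](16);        // GC allocation from the attached thread
        *cast(size_t*) arg = data.length; // report a result back to the caller
        return null;
    }

    size_t runForeignThread()
    {
        size_t result;
        pthread_t tid;
        const rc = pthread_create(&tid, null, &foreignWorker, &result);
        assert(rc == 0, "pthread_create failed");
        pthread_join(tid, null);
        return result;
    }
}

void main()
{
    version (Posix)
        assert(runForeignThread() == 16);
}
```

On Windows the same two runtime calls would be used, but the thread would be created through the native API instead of pthreads.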

Apr 23 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 23 Apr 2012 08:36:59 -0400, Kapps <opantm2+spam gmail.com> wrote:

 On Monday, 23 April 2012 at 12:11:19 UTC, Benjamin Thaut wrote:
 If what you are saying is true, the deadlock must happen somewhere  
 else. This was kind of an assumption because the deadlock happened after  
 I added all the malloc / free calls. Because all the threads are  
 stopped when this happens I can't really debug this, because the Visual  
 Studio debugger tells me that it can not debug any thread but the one  
 that is currently running the GC.

 Kind Regards
 Ingrater

 Are these threads created by core.thread.Thread, or are they created  
 through different means such as native OS calls or a C api? If the  
 latter, you have to register them with the runtime (I think it was  
 something like Thread.attachThis), otherwise issues like this will  
 happen.

No.  Dtor calls should all be done while all threads are running.  The  
portion of the GC collection cycle which requires the world to be stopped  
should not be doing any locking/unlocking in arbitrary C libs.  If it is,  
it's a bug.

-Steve

Apr 23 2012

Benjamin Thaut <code benjamin-thaut.de> writes:

On 23.04.2012 at 14:36, Kapps wrote:
 On Monday, 23 April 2012 at 12:11:19 UTC, Benjamin Thaut wrote:
 If what you are saying is true, the deadlock must happen somewhere
 else. This was kind of an assumption because the deadlock happened after
 I added all the malloc / free calls. Because all the threads are
 stopped when this happens I can't really debug this, because the
 Visual Studio debugger tells me that it can not debug any thread but
 the one that is currently running the GC.

 Kind Regards
 Ingrater

 Are these threads created by core.thread.Thread, or are they created
 through different means such as native OS calls or a C api? If the
 latter, you have to register them with the runtime (I think it was
 something like Thread.attachThis), otherwise issues like this will happen.

No, all threads are created via core.thread.Thread. There are, however, 
some threads (OpenGL / OpenAL) that I do not create.

Apr 23 2012

simendsjo <simendsjo gmail.com> writes:

On Mon, 23 Apr 2012 15:44:01 +0200, Benjamin Thaut  
<code benjamin-thaut.de> wrote:

 On 23.04.2012 at 14:36, Kapps wrote:
 On Monday, 23 April 2012 at 12:11:19 UTC, Benjamin Thaut wrote:
 If what you are saying is true, the deadlock must happen somewhere
 else. This was kind of an assumption because the deadlock happened after
 I added all the malloc / free calls. Because all the threads are
 stopped when this happens I can't really debug this, because the
 Visual Studio debugger tells me that it can not debug any thread but
 the one that is currently running the GC.

 Kind Regards
 Ingrater

 Are these threads created by core.thread.Thread, or are they created
 through different means such as native OS calls or a C api? If the
 latter, you have to register them with the runtime (I think it was
 something like Thread.attachThis), otherwise issues like this will  
 happen.

 No, all threads are created via core.thread.Thread. There are, however,  
 some threads (OpenGL / OpenAL) that I do not create.

I've had problems using attachThis() on threads created in C.
See  
http://www.digitalmars.com/d/archives/digitalmars/D/learn/GC_collecting_too_much_.._33934.html
No one has answered, though.

Apr 23 2012
