
digitalmars.D - Memory Corruption Issue??

Bottled Gin <gin bottle.com> writes:
Greetings

I am struggling with strange memory corruption issues with 
dmd-2.069.2 release.

The issue shows up only when I load a shared library created from 
D code from C and call some D functions from the C side. But 
since the program control is completely with the D code, and data 
structures in D get corrupted, I believe C has no role to play in 
the corruption. It is just that the memory layout of the 
executable, when the compiled D code is loaded from C, is helping 
in replicating the issue.

I have spent almost a week reducing this issue to less than 
100 lines of code. Now I need the developers' love and help to 
get this issue fixed. There is a slim chance that I am doing 
something wrong while loading the D library from C code. But 
otherwise it looks like a DMD memory corruption issue.

Since two C files and one D file are involved in recreating the 
issue, I have put all the files in a GitHub repository along with 
a makefile. I have been able to recreate the issue on two Ubuntu 
14.04 64-bit machines.

Generally the issue seems to be with static (thread-local) 
variables that get allocated on the heap. If I create many such 
variables, I get data corruption in some of them and sometimes a 
segmentation fault. In the reduced testcase, the contents of a 
dynamic array get corrupted. Since all the data is accessed from 
only one thread, there is no chance of a multicore race 
condition.

To reproduce the issue, kindly clone my git repo 
(https://github.com/puneet/memerr.git). Change the path of the 
DMD installation (I have tested only with dmd-2.069.2) in the 
makefile and run make.

$ git clone https://github.com/puneet/memerr.git

$ make

I get an output like:

$ make
/home/puneet/local/dmd-2.069.2/linux/bin64/rdmd foo.d
Start frop from D
Successfully completed loop....
./main
Start frop from C
0 ->  �+----------------

The last line is the content of an array which is actually filled 
with only dashes in the code.

Kindly help. I want to make sure that I am not making a mistake 
before I file a bug on dlang bugzilla.

Regards
- Puneet
Jan 20 2016
Johannes Pfau <nospam example.com> writes:
On Wed, 20 Jan 2016 09:12:57 +0000,
Bottled Gin <gin bottle.com> wrote:

 Greetings
 
 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.
 
Could be GC scanning issues. Can you try disabling the garbage collector and see if it still crashes?
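
A minimal sketch of how the suggested experiment could look, assuming the D entry point can be wrapped. The helper name runWithoutCollections is hypothetical and not part of the test case in the repository:

```d
import core.memory : GC;

// Hypothetical helper (not from the original test case): runs `work`
// with automatic garbage collections suppressed.
void runWithoutCollections(scope void delegate() work)
{
    // Suppresses automatic collections; the runtime may still collect
    // if it cannot otherwise satisfy an allocation.
    GC.disable();
    // Re-enable on the way out, even if work() throws.
    scope (exit) GC.enable();
    work();
}
```

If the corruption disappears with collections suppressed, that points at the collector freeing memory that is still in use rather than at a stray write from the C side.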
Jan 20 2016
Daniel Kozak <kozzi11 gmail.com> writes:
On Wednesday, 20 January 2016 at 09:12:57 UTC, Bottled Gin wrote:
 Greetings

 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.

 [...]
As a workaround you can mark the Hash class as shared:

    final shared class Hash
Jan 20 2016
Daniel Kozak <kozzi11 gmail.com> writes:
On Wednesday, 20 January 2016 at 11:06:53 UTC, Daniel Kozak wrote:
 On Wednesday, 20 January 2016 at 09:12:57 UTC, Bottled Gin 
 wrote:
 Greetings

 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.

 [...]
 As a workaround you can mark the Hash class as shared:

     final shared class Hash

But this will change behaviour, because it will not be in TLS anymore.
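
To illustrate the trade-off, here is a small sketch of the difference between thread-local and shared storage in D. The variable and function names are made up for the illustration and do not come from the test case:

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

int tlsValue;            // thread-local by default: one copy per thread
shared int sharedValue;  // a single copy visible to all threads

// Illustrative only: write to both variables from a second thread.
void writeFromOtherThread()
{
    auto t = new Thread({
        tlsValue = 42;                 // changes only the new thread's copy
        atomicStore(sharedValue, 42);  // changes the one shared copy
    });
    t.start();
    t.join();
}

int readTls()    { return tlsValue; }
int readShared() { return atomicLoad(sharedValue); }
```

After writeFromOtherThread() returns, the calling thread still sees its own tlsValue unchanged, while sharedValue reflects the write. Code that relies on each thread having its own copy would therefore behave differently once the data is made shared.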
Jan 20 2016
Daniel Kozak <kozzi11 gmail.com> writes:
On Wednesday, 20 January 2016 at 11:08:46 UTC, Daniel Kozak wrote:
 On Wednesday, 20 January 2016 at 11:06:53 UTC, Daniel Kozak 
 wrote:
 On Wednesday, 20 January 2016 at 09:12:57 UTC, Bottled Gin 
 wrote:
 Greetings

 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.

 [...]
 As a workaround you can mark the Hash class as shared:

     final shared class Hash

 But this will change behaviour, because it will not be in TLS anymore.
GC.disable seems to help.
Jan 20 2016
Daniel Kozak <kozzi11 gmail.com> writes:
On Wednesday, 20 January 2016 at 09:12:57 UTC, Bottled Gin wrote:
 Greetings

 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.

 ...
 ./main
 Start frop from C
 0 ->  �+----------------

 The last line is the content of an array which is actually 
 filled with only dashes in the code.

 Kindly help. I want to make sure that I am not making a mistake 
 before I file a bug on dlang bugzilla.

 Regards
 - Puneet
Another workaround is to use GC.addRoot for dynamic allocated 
data in Dynamic.proc

void proc () {
    import core.memory: GC;
    dash.length = 32;
    GC.addRoot(cast(void*)dash.ptr);
    dash[] = '-';
}
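
A slightly fuller sketch of the same idea, pairing GC.addRoot with GC.removeRoot so the block can be released again. The module-level dash variable and the function names pinDash and unpinDash are assumptions made for the illustration:

```d
import core.memory : GC;

char[] dash;   // thread-local, standing in for Dynamic.dash

// Allocate the buffer and register it as a GC root, so the collector
// treats it as reachable even if it does not scan this TLS variable.
void pinDash()
{
    dash.length = 32;
    GC.addRoot(cast(void*) dash.ptr);
    dash[] = '-';
}

// Unregister the root once the buffer is no longer needed.
void unpinDash()
{
    GC.removeRoot(cast(void*) dash.ptr);
    dash = null;
}
```

Note that the root refers to one specific memory block. If dash is later resized and the runtime moves it to a new block, the new block would need to be registered again.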
Jan 20 2016
Daniel Kozak <kozzi11 gmail.com> writes:
On Wednesday, 20 January 2016 at 11:49:29 UTC, Daniel Kozak wrote:
 On Wednesday, 20 January 2016 at 09:12:57 UTC, Bottled Gin 
 wrote:
 Greetings

 I am struggling with strange memory corruption issues with 
 dmd-2.069.2 release.

 ...
 ./main
 Start frop from C
 0 ->  �+----------------

 The last line is the content of an array which is actually 
 filled with only dashes in the code.

 Kindly help. I want to make sure that I am not making a 
 mistake before I file a bug on dlang bugzilla.

 Regards
 - Puneet
 Another workaround is to use GC.addRoot for dynamic allocated 
 data in Dynamic.proc

 void proc () {
     import core.memory: GC;
     dash.length = 32;
     GC.addRoot(cast(void*)dash.ptr);
     dash[] = '-';
 }
And another one is to hold a pointer to the data:

class Dynamic {
    static char[] space;
    static char[] dash;
    char* dash_ptr;

    void rehash () {
        static Hash hash ;
        hash = new Hash;
        hash.clear();
    }

    void proc () {
        import core.memory: GC;
        dash.length = 32;
        dash_ptr = dash.ptr;
        dash[] = '-';
    }
}
Jan 20 2016
Bottled Gin <gin bottle.com> writes:
 Another workaround is to use GC.addRoot for dynamic allocated 
 data in Dynamic.proc

 void proc () {
     import core.memory: GC;
     dash.length = 32;
     GC.addRoot(cast(void*)dash.ptr);
     dash[] = '-';
 }
 And another one is hold pointer to data:

 class Dynamic {
     static char[] space;
     static char[] dash;
     char* dash_ptr;

     void rehash () {
         static Hash hash ;
         hash = new Hash;
         hash.clear();
     }

     void proc () {
         import core.memory: GC;
         dash.length = 32;
         dash_ptr = dash.ptr;
         dash[] = '-';
     }
 }
Daniel, thanks for confirming the bug and for providing workaround.

The second workaround (saving the pointer) will not work on my 
real project though. I have multiple threads and the TLS variable 
will have a different pointer on each thread.

Also, can you please tell me how to addRoot an assoc array to GC. 
It seems there is no ptr property available for assoc arrays.

Regards
- Puneet
Jan 20 2016
Daniel Kozak via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 20 Jan 2016 13:58:46 +0000,
Bottled Gin via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Another workaround is to use GC.addRoot for dynamic allocated 
 data in Dynamic.proc

 void proc () {
     import core.memory: GC;
     dash.length = 32;
     GC.addRoot(cast(void*)dash.ptr);
     dash[] = '-';
 }  
 [...]
 Daniel, thanks for confirming the bug and for providing workaround.

 The second workaround (saving the pointer) will not work on my 
 real project though. I have multiple threads and the TLS variable 
 will have a different pointer on each thread.

 Also, can you please tell me how to addRoot an assoc array to GC. 
 It seems there is no ptr property available for assoc arrays.

 Regards
 - Puneet
You can use cast(void *)aa; something like this:

class Dynamic {
    static char[] space;
    static char[int] dash;

    void rehash () {
        static Hash hash ;
        hash = new Hash;
        hash.clear();
    }

    void proc () {
        import core.memory: GC;
        //GC.addRoot(cast(void *)dash); // not here
        // The loop header is missing from the archived post; the bound
        // of 32 is assumed from the earlier dash.length = 32 example.
        foreach (i; 0 .. 32) {
            dash[i] = '-';
        }
        GC.addRoot(cast(void *)dash); // must be after allocation
    }
}

Be careful: you must call addRoot after the AA is initialized (that 
is, after you have put something into it), because an empty AA is 
still null and there is nothing to register yet.
Jan 20 2016
Bottled Gin <gin bottle.com> writes:
Thanks Daniel

I have added the testcase to a more obscure testcase that I had 
raised on Bugzilla earlier.
https://issues.dlang.org/show_bug.cgi?id=15513

I want to request developers to show some love.

Regards
- Puneet
Jan 20 2016
Bottled Gin <gin bottle.com> writes:
Greetings

I am using my D code as a dynamically loadable library that gets 
loaded at run time into C/C++ world. As discussed earlier on this 
thread, the GC does not mark TLS objects in this scenario and as 
a result the GC ends up collecting TLS objects even though these 
objects are still in use. More details of the issue can be found 
on the bug tracker https://issues.dlang.org/show_bug.cgi?id=15513
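
For readers unfamiliar with this setup, here is a rough sketch of the kind of entry points a D library typically exports for a C host. The names dlib_init and dlib_term are hypothetical, and the sketch only assumes that the host initializes the D runtime explicitly; it does not reproduce the code in the repository:

```d
import core.runtime : Runtime;

// Hypothetical entry point: the C host calls this once after loading
// the library and before calling any other D function.
extern (C) int dlib_init()
{
    // Starts the D runtime (GC, module constructors); returns 1 on success.
    return Runtime.initialize() ? 1 : 0;
}

// Hypothetical entry point: the C host calls this before unloading.
extern (C) int dlib_term()
{
    // Shuts the D runtime down again; returns 1 on success.
    return Runtime.terminate() ? 1 : 0;
}
```

On the C side these would be declared as plain functions returning int and called around the rest of the D API.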

Daniel provided two workarounds to this issue. One was to 
disable the GC altogether. I do not want to do that since my 
application generates too much data that necessitates regular 
sweeping.

The other suggested workaround was to explicitly invoke 
GC.addRoot for all the TLS objects. This worked for me in some 
situations, but in other scenarios I am still facing crashes. I 
think these crashes may result from TLS objects inside phobos 
and druntime that are not visible to my code. I have confirmed 
that all these crashes go away if I disable the GC altogether, 
and also that they do not happen if I build a D-based executable 
instead of a DLL.

I want to know if someone is working on this issue. If no one is, 
I am ready to spend time and get this behind me. Actually I have 
already worked on this bug and I think now I have a fair idea of 
what is happening.

Is this the right forum to discuss my findings, or should I put my 
comments on bugzilla? I need some guidance in finding the right 
fix.

Regards
- Puneet
Mar 06 2016
Bottled Gin <gin bottle.com> writes:
The fix turned out to be much simpler than what I had thought.

https://github.com/D-Programming-Language/druntime/pull/1506
Mar 09 2016