digitalmars.D - Debugging memory leak.
- David Brown (26/26) Oct 08 2007 I've been developing an application on x86_64 (dgcc) without any problem...
- Sean Kelly (4/14) Oct 08 2007 Somewhat, but void[] arrays are still treated as if they have pointers.
- David Brown (11/19) Oct 08 2007 I called delete on the result coming back from std.zlib.compress and it
- Frits van Bommel (12/28) Oct 08 2007 Yes it should. I created a quick patch for the gphobos version of
- Brad Roberts (5/35) Oct 08 2007 Please attach that diff to a bug report so that it doesn't get lost. I'...
- Frits van Bommel (8/15) Oct 08 2007 I recreated it for DMD (the patch wouldn't apply to the DMD std.zlib but...
- Sean Kelly (9/19) Oct 08 2007 Yup. However, an annoying problem still exists with Buffer. Basically,...
- Frits van Bommel (14/33) Oct 08 2007 Actually, that bug (or its Phobos equivalent) seems to be partly to
- Sean Kelly (23/62) Oct 08 2007 Some of the complication comes from a desire for efficiency. Currently,...
- renoX (7/29) Oct 09 2007 IMHO, that's a different issue: the compress prototype should be
- Daniel Keep (7/11) Oct 08 2007 I use ubyte[]s for anything dealing with binary data; I only ever use
I've been developing an application on x86_64 (dgcc) without any problems. Yesterday, I tried building it on x86 to discover that it has a memory leak on that platform. The leak is rather significant, and the program quickly exhausts memory and is killed. Any suggestions from the list on how to debug/fix this? Some background as to what the program is doing: - It reads large amounts of data into dynamically allocated buffers (currently 256K each). - These buffers are passed to std.zlib.compress, which returns a new buffer. Some suspicions I have: - Because of the larger address space on the x86_64, it is less likely that random data will point into one of these buffers, but on the x86, it happens more, causing a buffer to be kept around. Eventually more and more of these stick around. - It's worse with a gentoo built compiler (USE=d emerge gcc) than with the gdc-0.24 binary distribution. These are both built using gcc 4.1.2. Ideas for possibly fixing this: - Manually 'delete' these buffers. In my instance, this wouldn't really be all that difficult since I know when they go out of use. - Call std.gc.hasNoPointers(void*) on the block. I would think this is the case for a char[], but std.zlib.compress uses a void[], which the compiler can't make this assumption about. - Try Tango? Is the GC different there? David
Oct 08 2007
David Brown wrote:Ideas for possibly fixing this: - Manually 'delete' these buffers. In my instance, this wouldn't really be all that difficult since I know when they go out of use. - Call std.gc.hasNoPointers(void*) on the block. I would think this is the case for a char[], but std.zlib.compress uses a void[], which the compiler can't make this assumption about.std.zlib should likely be changed to use a byte[] array instead.- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers. Sean
Oct 08 2007
On Mon, Oct 08, 2007 at 09:22:54AM -0700, Sean Kelly wrote:David Brown wrote:I called delete on the result coming back from std.zlib.compress and it seems to have gotten rid of the memory leak. My guess is that on x86_64, the address space is large enough that compressed data was unlikely to look like pointers, but more likely to do so on a 32-bit platform. So, I would call this a bug in std.zlib. Even if it still returns the void[], it should allocate as a byte[]. In my instance, calling delete on the block reduces the amount of work the garbage collector needs to do, so probably is a good idea anyway. Thanks, DavidIdeas for possibly fixing this: - Manually 'delete' these buffers. In my instance, this wouldn't really be all that difficult since I know when they go out of use. - Call std.gc.hasNoPointers(void*) on the block. I would think this is the case for a char[], but std.zlib.compress uses a void[], which the compiler can't make this assumption about.std.zlib should likely be changed to use a byte[] array instead.
Oct 08 2007
Sean Kelly wrote:David Brown wrote:Yes it should. I created a quick patch for the gphobos version of std.zlib (since the OP was using GDC). I attached it to this post (not pasted inline due to line wrapping issues). I haven't tested it other than running it through 'gdc -c', but it's such a trivial patch that the fact it compiles should mean it should function identically (except for not allocating void[]s, and thus hopefully prevent some memory leakage).Ideas for possibly fixing this: - Manually 'delete' these buffers. In my instance, this wouldn't really be all that difficult since I know when they go out of use. - Call std.gc.hasNoPointers(void*) on the block. I would think this is the case for a char[], but std.zlib.compress uses a void[], which the compiler can't make this assumption about.std.zlib should likely be changed to use a byte[] array instead.But AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 08 2007
On Mon, 8 Oct 2007, Frits van Bommel wrote:Sean Kelly wrote:Please attach that diff to a bug report so that it doesn't get lost. I'll look at it getting it into the next release. Thanks, BradDavid Brown wrote:Yes it should. I created a quick patch for the gphobos version of std.zlib (since the OP was using GDC). I attached it to this post (not pasted inline due to line wrapping issues). I haven't tested it other than running it through 'gdc -c', but it's such a trivial patch that the fact it compiles should mean it should function identically (except for not allocating void[]s, and thus hopefully prevent some memory leakage).Ideas for possibly fixing this: - Manually 'delete' these buffers. In my instance, this wouldn't really be all that difficult since I know when they go out of use. - Call std.gc.hasNoPointers(void*) on the block. I would think this is the case for a char[], but std.zlib.compress uses a void[], which the compiler can't make this assumption about.std.zlib should likely be changed to use a byte[] array instead.But AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 08 2007
Brad Roberts wrote:On Mon, 8 Oct 2007, Frits van Bommel wrote:I recreated it for DMD (the patch wouldn't apply to the DMD std.zlib but was trivial to recreate). I filed a bug report[1]. Given your comment I wasn't sure whether to assign it to you or Walter, so I left it assigned to Walter (the default) and just added you in the 'CC' box, figuring you'd at least be interested in it :). [1]: http://d.puremagic.com/issues/show_bug.cgi?id=1557Yes it should. I created a quick patch for the gphobos version of std.zlib (since the OP was using GDC). I attached it to this post (not pasted inline due to line wrapping issues).Please attach that diff to a bug report so that it doesn't get lost. I'll look at it getting it into the next release.
Oct 08 2007
Frits van Bommel wrote:Sean Kelly wrote:Yup. However, an annoying problem still exists with Buffer. Basically, this class maintains a void[] reference to a block of memory allocated as a byte[]. However, if the block is resized for any reason, the type doing the resizing is used to determine whether the newly allocated block contains pointers. I've been meaning to change the Tango runtime and GC to preserve array block attributes across reallocations, but it's a somewhat involved process and I haven't gotten to it yet. SeanBut AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 08 2007
Sean Kelly wrote:Frits van Bommel wrote:Actually, that bug (or its Phobos equivalent) seems to be partly to blame as well. If you look at std.zlib, you'll see that right after every "new void[whatever]", std.gc.hasNoPointers is called on the result. However, there are some ~/~= operations on those buffers while typed as void[]s. I think if the "has no pointers" bit carried over from the original arrays when concatenating (if neither contains pointers[1], the result doesn't contain any either) std.zlib might actually be leak-free. [1]: If an array gc-allocated, use the attribute as set by the allocation function or the user. Otherwise, use the default for the static type (through TypeInfo). P.S. I think I'm starting to realize what you meant by that "a somewhat involved process" comment ;).Sean Kelly wrote:Yup. However, an annoying problem still exists with Buffer. Basically, this class maintains a void[] reference to a block of memory allocated as a byte[]. However, if the block is resized for any reason, the type doing the resizing is used to determine whether the newly allocated block contains pointers. I've been meaning to change the Tango runtime and GC to preserve array block attributes across reallocations, but it's a somewhat involved process and I haven't gotten to it yet.But AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 08 2007
Frits van Bommel wrote:Sean Kelly wrote:Some of the complication comes from a desire for efficiency. Currently, the array reallocation routines may call two synchronized GC functions: gc_sizeOf and gc_malloc. Preserving block attributes would currently require calling gc_getAttr as well, which would mean three mutex locks for a single append/resize operation. What I'd like to do is create an aggregate routine called something like gc_describe which returns all the relevant information in one go, and replace the call to gc_getAttr with a call to gc_describe. Then the relevant block info can be preserved and passed into the call to gc_malloc later on. The result being that the current maximum of two mutex locks would be retained. The other complication will be a maintenance issue. If somewhere in the runtime performs an array reallocation somehow differently, there will be "holes" in the functionality preserving block attributes. Fortunately, this portion of the runtime is fairly stable so I don't foresee it being a problem. As an aside, I'd originally considered replacing the whole mess with a call to gc_realloc, but currently gc_realloc is allowed to fail of the supplied pointer is to the interior of a memory block (ie. a slice). I don't want to change this, because the current scheme allows a GC implementation to simply call C malloc/realloc/free, which would not be possible if the GC were required to operate correctly on interior pointers. SeanFrits van Bommel wrote:Actually, that bug (or its Phobos equivalent) seems to be partly to blame as well. If you look at std.zlib, you'll see that right after every "new void[whatever]", std.gc.hasNoPointers is called on the result. However, there are some ~/~= operations on those buffers while typed as void[]s. I think if the "has no pointers" bit carried over from the original arrays when concatenating (if neither contains pointers[1], the result doesn't contain any either) std.zlib might actually be leak-free. [1]: If an array gc-allocated, use the attribute as set by the allocation function or the user. Otherwise, use the default for the static type (through TypeInfo). P.S. I think I'm starting to realize what you meant by that "a somewhat involved process" comment ;).Sean Kelly wrote:Yup. However, an annoying problem still exists with Buffer. Basically, this class maintains a void[] reference to a block of memory allocated as a byte[]. However, if the block is resized for any reason, the type doing the resizing is used to determine whether the newly allocated block contains pointers. I've been meaning to change the Tango runtime and GC to preserve array block attributes across reallocations, but it's a somewhat involved process and I haven't gotten to it yet.But AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 08 2007
Sean Kelly a écrit :Frits van Bommel wrote:IMHO, that's a different issue: the compress prototype should be modified to indicates that it returns byte[] not void[]: after all, no pointer is expected in compressed data, so the function signature should reflect this. Regards, renoXSean Kelly wrote:Yup. However, an annoying problem still exists with Buffer. Basically, this class maintains a void[] reference to a block of memory allocated as a byte[]. However, if the block is resized for any reason, the type doing the resizing is used to determine whether the newly allocated block contains pointers. I've been meaning to change the Tango runtime and GC to preserve array block attributes across reallocations, but it's a somewhat involved process and I haven't gotten to it yet. SeanBut AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)- Try Tango? Is the GC different there?Somewhat, but void[] arrays are still treated as if they have pointers.
Oct 09 2007
Frits van Bommel wrote:But AFAICT tango.io.compress.Zlib doesn't allocate any of those, just ubyte[] arrays, and exception classes. (The unittest does use a MemoryConduit, which internally uses a void[], but nothing else should allocate void[]s for that module)I use ubyte[]s for anything dealing with binary data; I only ever use void[]s where I'm dealing with typed stuff when I don't care what the type is. Incidentally, tango.io.compress.Bzip2 also uses ubyte[]s internally (really, they're basically the same module, except with "z_" replaced with "bz_" :P ). -- Daniel
Oct 08 2007