digitalmars.D - Debugging heap corruption
- Vladimir Panteleev (10/10) Jul 29 2007 Hello,
- Leandro Lucarella (9/10) Jul 29 2007 Valgrind override both libc and system calls, and D's GC have to use the...
- Matthias Walter (4/10) Jul 29 2007 Yes, valgrind works like a charm with D - except that D seems to not fre...
- Witold Baryluk (1/3) Jul 29 2007 Someone have valgrind patch for this if I remember. Thomas Kuehne?
- Vladimir Panteleev (49/54) Jul 29 2007 them,
- Vladimir Panteleev (5/6) Jul 29 2007 As I expected - after two few-hour runs, the program hung once and segfa...
- Regan Heath (14/27) Jul 30 2007 One method for finding heap corruption is to write custom memory
- Vladimir Panteleev (5/18) Jul 30 2007 That's exactly what Phobos's GC SENTINEL option is supposed to do (and i...
- Regan Heath (6/30) Jul 30 2007 I haven't heard of "Phobos's GC SENTINEL option" what is it? Where can
- Vladimir Panteleev (6/11) Jul 30 2007 If what you said applies to D, then you must have used your own allocati...
- Regan Heath (4/35) Jul 30 2007 My own code which used this technique was written in C.
- Jason House (7/20) Jul 29 2007 I don't know if this helps, but it's certainly related.
- Walter Bright (8/15) Jul 29 2007 instead of
Hello,

I have a memory corruption problem. Namely, my application randomly crashes with access violations, or the heap data is corrupted. The application in question is sizeable (over 8000 lines). The crash is hard to reproduce (it requires a fair amount of user interaction), however I've noticed that it always happens after a certain user action. I have thoroughly examined the code handling that action and tried modifying the code (adding asserts and .dups), but to no avail. I might as well suspect a bug in the compiler or Phobos.

I noticed that Phobos's GC has some debug code for "underrun/overrrun protection" [sic] in phobos\internal\gc\gcx.d, however I found that the code is unfinished. For some reason the corresponding code is put in "version (SENTINEL)" blocks instead of "debug (SENTINEL)" ones, which would explain why it never worked (there is also a typo in the code on line 567). Even with those obvious mistakes fixed, I have no idea how much work would be required to get the SENTINEL debug option working, since it interferes with other language features such as array concatenation and fires off false alarms.

Someone has also suggested GDB/Valgrind, however I haven't attempted this combination, reasoning that since D's GC handles all the allocation and manages the memory, Valgrind wouldn't be able to hook the memory allocation routines and take over memory management from D's GC. If I reasoned wrongly, please let me know.

Has anyone met and fought heap corruption issues with D before? I could really use some advice, since I've been trying to solve this for weeks and it's causing me to lose motivation in my project :(

Any help is appreciated!

--
Thanks,
Vladimir mailto:thecybershadow gmail.com
Jul 29 2007
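For readers unfamiliar with the distinction the post draws between "version (SENTINEL)" and "debug (SENTINEL)", here is a minimal sketch of how the two kinds of conditional blocks are switched on in D. The function name enabledChecks is hypothetical and SENTINEL is reused only as an identifier; this is not the code from gcx.d. Neither block is compiled in unless the matching identifier is set, for example with dmd's -version=SENTINEL or -debug=SENTINEL switch.

```d
/// Reports which of the two conditional blocks were compiled in.
/// (Hypothetical helper for illustration; not part of Phobos.)
string enabledChecks()
{
    string result;
    version (SENTINEL)      // enabled with: dmd -version=SENTINEL
        result ~= "version(SENTINEL) ";
    debug (SENTINEL)        // enabled with: dmd -debug=SENTINEL
        result ~= "debug(SENTINEL) ";
    return result.length ? result : "neither";
}

unittest
{
    // Whatever switches were used to build, the function reports something.
    assert(enabledChecks().length > 0);
}
```

Since both forms need an explicit identifier, code guarded either way would presumably stay inactive in a stock library build until the library itself is rebuilt with the corresponding switch.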
Vladimir Panteleev, on July 29 at 10:03, you wrote to me:

> Someone has also suggested GDB/Valgrind, however I haven't attempted this combination, reasoning that since D's GC handles all the allocation and manages the memory, Valgrind wouldn't be able to hook the memory allocation routines and take over memory management from D's GC. If I reasoned wrongly, please let me know.

Valgrind overrides both libc and system calls, and D's GC has to use them, so my wild guess is *yes*, Valgrind should help.

--
Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/
GPG: 5F5A8D05 // F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05
Hope is a friend who lends us illusion.
Jul 29 2007
Leandro Lucarella wrote:

> Valgrind overrides both libc and system calls, and D's GC has to use them, so my wild guess is *yes*, Valgrind should help.

Yes, Valgrind works like a charm with D, except that D seems not to free all GC variables at the end, but this should be no problem. Memory corruption is still shown completely. The only problem I see is if you have errors in destructors, because then you'll see many corruptions triggered by GC routines although the causes are in your code.

Also, the symbol names are the mangled ones, so reading them is a bit odd.

Matthias Walter
Jul 29 2007
> Also, the symbol names are the mangled ones, so reading them is a bit odd.

Someone has a Valgrind patch for this, if I remember correctly. Thomas Kuehne?
Jul 29 2007
On Sun, 29 Jul 2007 17:02:08 +0300, Matthias Walter <walter mail.math.uni-magdeburg.de> wrote:

> Yes, Valgrind works like a charm with D, except that D seems not to free all GC variables at the end, but this should be no problem. Memory corruption is still shown completely. The only problem I see is if you have errors in destructors, because then you'll see many corruptions triggered by GC routines although the causes are in your code.
>
> Also, the symbol names are the mangled ones, so reading them is a bit odd.

Thanks for the replies. I'll clarify what I meant earlier regarding my assumption that Valgrind doesn't work with D as well as it does with C/C++ programs.

Consider this D program:

    void main()
    {
        ubyte[] a;
        a.length = 10; // the allocated memory is in the GC-managed heap
        auto p = a.ptr;
        // Deliberate off-by-one: i == 10 writes one byte past the array.
        for (int i = 0; i <= 10; i++)
            *(p + i) = cast(ubyte) i;
    }

In this case, the off-by-one error will go unnoticed, even when running under Valgrind. The same doesn't happen if you don't use the GC:

    import std.c.stdlib; // core.stdc.stdlib in current D

    void main()
    {
        ubyte* p = cast(ubyte*) malloc(10);
        // Same deliberate off-by-one as above.
        for (int i = 0; i <= 10; i++)
            *(p + i) = cast(ubyte) i;
    }

malloc() wires directly to libc, which means that it will be recognized and instrumented by Valgrind; thus, the off-by-one error will be caught.

Of course, this doesn't mean that heap corruptions related to D's GC are untraceable by Valgrind, as practice shows. It's just that not all bugs, especially minor (off-by-one) ones, will be immediately detected with Valgrind. This sometimes causes the code to corrupt other parts of memory allocated via the GC, or even some of the GC's control structures, causing the program to crash as an indirect effect of the memory corruption (and making it impossible to find the original cause of the corruption). "Fixing" that would probably involve either teaching Valgrind to hook into D's GC, or using the built-in (albeit unfinished) SENTINEL debugging option mentioned in the original post.

I am currently running my program under Valgrind, waiting for some results and hoping for the best. I will post updates when I have some progress :)

--
Best regards,
Vladimir mailto:thecybershadow gmail.com
Jul 29 2007
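One possible reason the GC version of the off-by-one above stays silent is that a garbage collector typically hands out blocks from fixed size classes carved out of larger pools, so a 10-byte request may sit inside a bigger block that Valgrind sees only as part of one large valid region. The sketch below is a way to look at this, under the assumption that current druntime is used: core.memory.GC.sizeOf is today's API, not the 2007 Phobos one, and the helper name gcBlockSize is made up for illustration.

```d
import core.memory : GC;

/// Returns the size of the GC block backing a freshly allocated
/// ubyte array of `requested` bytes, or 0 if the GC cannot tell
/// (for example, when the pointer is not the base of a block).
/// Hypothetical helper for illustration only.
size_t gcBlockSize(size_t requested)
{
    auto a = new ubyte[requested];
    return GC.sizeOf(a.ptr);
}

unittest
{
    auto size = gcBlockSize(10);
    // If the block is larger than 10 bytes, a raw-pointer write to
    // index 10 still lands inside memory owned by the GC.
    assert(size == 0 || size >= 10);
}
```

If the reported size exceeds the requested length, a one-byte overrun lands in slack space rather than in a neighbouring object, which would be consistent with the behaviour described in the post.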
On Mon, 30 Jul 2007 05:23:18 +0300, Vladimir Panteleev <thecybershadow gmail.com> wrote:

> I am currently running my program under Valgrind, waiting for some results and hoping for the best. I will post updates when I have some progress :)

As I expected: after two few-hour runs, the program hung once and segfaulted the other time. In the second case, the segfault happened when the GC stumbled on an invalid pointer, which was there, no doubt, due to heap corruption.

I think I'll give modding Phobos's GC another shot (and perhaps take a look at the warnings Valgrind spits out about the GC referencing uninitialized memory).

--
Best regards,
Vladimir mailto:thecybershadow gmail.com
Jul 29 2007
Vladimir Panteleev wrote:

> As I expected: after two few-hour runs, the program hung once and segfaulted the other time. In the second case, the segfault happened when the GC stumbled on an invalid pointer, which was there, no doubt, due to heap corruption.
>
> I think I'll give modding Phobos's GC another shot (and perhaps take a look at the warnings Valgrind spits out about the GC referencing uninitialized memory).

One method for finding heap corruption is to write custom memory allocation, reallocation and free routines. In the allocator you allocate extra memory before and after the block you actually return, and you initialise these padding blocks to some known pattern. When it comes time to reallocate or free the memory, you verify that the padding is intact and has not been modified. This allows you to figure out which piece of memory has been corrupted and how (overrun etc.).

I also used this to check that I wasn't leaking any memory, but with a GC that's no longer important.

I'm not sure whether D allows you to define global custom allocators, anyone? Or perhaps Tango has that capability?

Regan
Jul 30 2007
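The padded-allocator technique described in the post above can be sketched roughly as follows. This is an illustrative reconstruction, not Regan's code (which was in C) and not the Phobos SENTINEL implementation. The names guardedAlloc, guardedCheck, guardedFree and GuardStatus, the 16-byte guard size and the 0xA5 fill pattern are all assumptions made for the example, which wraps libc's malloc and free via core.stdc.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memset;

enum size_t headerSize = 16;   // holds the requested size (assumed layout)
enum size_t guardSize  = 16;   // padding on each side (arbitrary choice)
enum ubyte  guardByte  = 0xA5; // known fill pattern (arbitrary choice)

enum GuardStatus { intact, underrun, overrun }

/// Layout: [header][front guard][user data][back guard]
void* guardedAlloc(size_t size)
{
    auto raw = cast(ubyte*) malloc(headerSize + 2 * guardSize + size);
    if (raw is null)
        return null;
    *cast(size_t*) raw = size;                 // remember the user size
    auto front = raw + headerSize;
    memset(front, guardByte, guardSize);       // padding before the block
    memset(front + guardSize + size, guardByte, guardSize); // and after
    return front + guardSize;
}

/// Verifies that both padding regions still hold the fill pattern.
GuardStatus guardedCheck(const(void)* p)
{
    auto user  = cast(const(ubyte)*) p;
    auto front = user - guardSize;
    // Check the front guard first: the header is only trusted if the
    // padding between it and the user data is untouched.
    foreach (i; 0 .. guardSize)
        if (front[i] != guardByte)
            return GuardStatus.underrun;
    immutable size = *cast(const(size_t)*)(front - headerSize);
    auto back = user + size;
    foreach (i; 0 .. guardSize)
        if (back[i] != guardByte)
            return GuardStatus.overrun;
    return GuardStatus.intact;
}

/// Checks the padding, releases the block, and reports what was found.
GuardStatus guardedFree(void* p)
{
    auto status = guardedCheck(p);
    free(cast(ubyte*) p - guardSize - headerSize);
    return status;
}

unittest
{
    auto p = cast(ubyte*) guardedAlloc(10);
    foreach (i; 0 .. 11)            // same off-by-one as in the thread
        p[i] = cast(ubyte) i;
    assert(guardedFree(p) == GuardStatus.overrun);
}
```

A check of this kind only reports the damage when the block is next verified, not at the moment of the bad write, so it narrows down which allocation was overrun rather than which statement did it.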
On Mon, 30 Jul 2007 12:03:15 +0300, Regan Heath <regan netmail.co.nz> wrote:

> One method for finding heap corruption is to write custom memory allocation, reallocation and free routines. In the allocator you allocate extra memory before and after the block you actually return, and you initialise these padding blocks to some known pattern. When it comes time to reallocate or free the memory, you verify that the padding is intact and has not been modified. This allows you to figure out which piece of memory has been corrupted and how (overrun etc.).
>
> I also used this to check that I wasn't leaking any memory, but with a GC that's no longer important.
>
> I'm not sure whether D allows you to define global custom allocators, anyone? Or perhaps Tango has that capability?

That's exactly what Phobos's GC SENTINEL option is supposed to do (and is what I'll be looking at next).

I assume that what you said doesn't apply to D?

--
Best regards,
Vladimir mailto:thecybershadow gmail.com
Jul 30 2007
Vladimir Panteleev wrote:

> That's exactly what Phobos's GC SENTINEL option is supposed to do (and is what I'll be looking at next).

I haven't heard of "Phobos's GC SENTINEL option". What is it? Where can I read about it in the D docs?

> I assume that what you said doesn't apply to D?

I'm not sure what you mean. Which part of what I said are you assuming doesn't apply to D?

Regan
Jul 30 2007
On Mon, 30 Jul 2007 13:16:32 +0300, Regan Heath <regan netmail.co.nz> wrote:

> I haven't heard of "Phobos's GC SENTINEL option". What is it? Where can I read about it in the D docs?

I described it in the original post. It is unfinished and thus undocumented. Quoted:

> I noticed that Phobos's GC has some debug code for "underrun/overrrun protection" [sic] in phobos\internal\gc\gcx.d, however I found that the code is unfinished. For some reason the corresponding code is put in "version (SENTINEL)" blocks instead of "debug (SENTINEL)" ones, which would explain why it never worked (there is also a typo in the code on line 567). Even with those obvious mistakes fixed, I have no idea how much work would be required to get the SENTINEL debug option working, since it interferes with other language features such as array concatenation and fires off false alarms.

> I'm not sure what you mean. Which part of what I said are you assuming doesn't apply to D?

If what you said applies to D, then you must have used your own allocation routines only in your own code (i.e. you don't hook D's memory allocation), since you didn't know whether it's possible to substitute the standard "global allocator". In that case, your code will have the same effect as using libc's malloc() and Valgrind (except that your code wouldn't be able to detect an overflow at the moment it happens). See the rest of this thread for details (particularly my reply to Matthias).

--
Best regards,
Vladimir mailto:thecybershadow gmail.com
Jul 30 2007
Vladimir Panteleev wrote:

> I described it in the original post. It is unfinished and thus undocumented.

Ahh, sorry, I didn't recall the phrase SENTINEL in your earlier post.

> If what you said applies to D, then you must have used your own allocation routines only in your own code (i.e. you don't hook D's memory allocation), since you didn't know whether it's possible to substitute the standard "global allocator". In that case, your code will have the same effect as using libc's malloc() and Valgrind (except that your code wouldn't be able to detect an overflow at the moment it happens). See the rest of this thread for details (particularly my reply to Matthias).

My own code which used this technique was written in C.

Regan
Jul 30 2007
I don't know if this helps, but it's certainly related. Using gdc, I had SEVERE issues with the garbage collector. The more I avoided the need for garbage collection, the further my program would run. Disabling the GC or using dmd let it run indefinitely. I've ported to Tango, but have not tested with gdc yet. I did not try using the latest repository revision of gdc either.
Jul 29 2007
Vladimir Panteleev wrote:

> I noticed that Phobos's GC has some debug code for "underrun/overrrun protection" [sic] in phobos\internal\gc\gcx.d, however I found that the code is unfinished. For some reason the corresponding code is put in "version (SENTINEL)" blocks instead of "debug (SENTINEL)" ones, which would explain why it never worked (there is also a typo in the code on line 567). Even with those obvious mistakes fixed, I have no idea how much work would be required to get the SENTINEL debug option working, since it interferes with other language features such as array concatenation and fires off false alarms.

Please post any bugs you've found or patches you've made to bugzilla.
Jul 29 2007