digitalmars.D - Garbage Collection, Untraceable Errors...

cy <dlang verge.info.tm> writes:
Most people complain about garbage collection slowing things 
down, or causing moments of program hang, but lately D's 
performance has been pretty awesome in that regard, IMO. No, the 
problem I have with garbage collection is untraceable errors...

GC.collect() is just this opaque... black box thing, that you 
have to trust to magically find all the garbage and collect it. 
When it does, it's awesome. But when it doesn't... you're just 
screwed. You can't trace through garbage collection, can't debug 
it, can't examine its inner structures, see what objects are 
being destructed, or anything as far as I can tell.

So I get something like the following output:
All unit tests have been run successfully.
finalize
core.exception.InvalidMemoryOperationError@src/core/exception.d(693): Invalid memory operation
----------------

...and the program exits with status 1. No stack trace, no error 
reported. Somewhere I've got a null dereference or something, 
some failure of everything to initialize in the expected order. 
But I have no way of finding it outside of staring at the code 
for another 2 hours trying to see anything funny looking.

Trying to break on "exit" with gdb doesn't work either, since D 
unwinds the stack before exiting in a GC error. Tracing through 
finalization, the deepest I can get is something called rt_term 
before there is no more debugging information (despite 
BUILD=debug being specified). Reading the source in rt/dmain.d 
myself, I can divine that rt_term calls gc_term, and in gc_term 
the error is raised. But what functions gc_term calls, you can't 
even set a breakpoint for. GC.fullCollect() just isn't available. 
(And gdb doesn't let you do tab completion in D, so I can't list 
all possible functions similar to that one.)

So... that's my complaint about the D garbage collection. It's 
frustratingly opaque, impossible to debug, and provides no help 
whatsoever in its error messages to understanding what went 
wrong. Garbage collection causes errors after the program has 
entirely finished, and calling GC.collect() inside a destructor 
will not only cause the error, it'll completely terminate the 
program before it finishes the destructor. Speed is great, but 
debugging with garbage collection is pretty much an endless fount 
of misery and frustration for me.
Jun 13 2016
Mike Parker <aldacron gmail.com> writes:
On Monday, 13 June 2016 at 08:26:34 UTC, cy wrote:

 So... that's my complaint about the D garbage collection. It's 
 frustratingly opaque, impossible to debug, and provides no help 
 whatsoever in its error messages to understanding what went 
 wrong. Garbage collection causes errors after the program has 
 entirely finished, and calling GC.collect() inside a destructor 
 will not only cause the error, it'll completely terminate the 
 program before it finishes the destructor. Speed is great, but 
 debugging with garbage collection is pretty much an endless 
 fount of misery and frustration for me.
While it would be nice to have the location of the error, you can be sure that the problem lies in a destructor somewhere. It is invalid for destructors of GC-managed objects to touch the GC. This is precisely the error you get when it happens. Compiling with -vgc will tell you the source and line number of every place your code touches the GC. If your project is big, that may be more helpful than eyeballing all of your destructors.
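A complementary check, not mentioned in the post above, is to mark destructors `@nogc` so the compiler rejects GC allocations inside them at build time. The sketch below is only illustrative: the class `Widget` and its counter `finalized` are hypothetical names, and it assumes that C's `printf` is an acceptable way to report from a destructor.

```d
import core.stdc.stdio : printf;

// Hypothetical class, used only to illustrate the technique.
class Widget
{
    // Thread-local counter so the unittest below can observe the destructor.
    static int finalized;

    // @nogc: the compiler refuses any GC allocation in this body.
    // nothrow: a destructor run by the GC should not throw either.
    ~this() @nogc nothrow
    {
        ++finalized;

        // C stdio does not go through the D garbage collector.
        printf("Widget finalized\n");

        // Uncommenting the next line makes compilation fail under @nogc,
        // because `new` allocates from the GC:
        // auto scratch = new int[](16);
    }
}

unittest
{
    auto w = new Widget;
    destroy(w); // run the destructor deterministically, outside any GC cycle
    assert(Widget.finalized == 1);
}
```

This only covers code the destructor itself calls. A `-vgc` build is still the broader report, since it lists GC use across the whole program.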
Jun 13 2016
Guillaume Piolat <first.last gmail.com> writes:
On Monday, 13 June 2016 at 08:26:34 UTC, cy wrote:
 So... that's my complaint about the D garbage collection. It's 
 frustratingly opaque, impossible to debug, and provides no help 
 whatsoever in its error messages to understanding what went 
 wrong. Garbage collection causes errors after the program has 
 entirely finished, and calling GC.collect() inside a destructor 
 will not only cause the error, it'll completely terminate the 
 program before it finishes the destructor. Speed is great, but 
 debugging with garbage collection is pretty much an endless 
 fount of misery and frustration for me.
 see what objects are being destructed
Well, you can detect in a destructor whether you are being called by the GC: https://p0nce.github.io/d-idioms/#GC-proof-resource-class This helps with getting rid of such bugs. Having a non-trivial destructor called by the GC, and relying on it, is IMHO an error at best; such cleanup should always be manual.
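As a rough sketch of that idea (not the code from the linked page), recent druntime releases expose `GC.inFinalizer()` in `core.memory`, which reports whether the current thread is running finalizers as part of a collection. It postdates this thread, so treat its availability as an assumption about your compiler version. The class `Resource`, its `close()` method, and the `closeCount` counter are hypothetical.

```d
import core.memory : GC;
import core.stdc.stdio : printf;

// Hypothetical resource wrapper, for illustration only.
class Resource
{
    // Counts real clean-ups so the unittest below can check the behaviour.
    static int closeCount;

    // The actual clean-up; kept @nogc so it is safe to call from ~this().
    void close() @nogc nothrow
    {
        ++closeCount;
    }

    ~this() @nogc nothrow
    {
        if (GC.inFinalizer())
        {
            // Reached only when the GC, not the program, destroys the object.
            // Warn without allocating, and skip the clean-up.
            printf("warning: Resource was left for the GC to finalize\n");
            return;
        }
        close();
    }
}

unittest
{
    auto r = new Resource;
    destroy(r); // explicit destruction, so GC.inFinalizer() is false here
    assert(Resource.closeCount == 1);
}
```

Whether to skip the clean-up, attempt it anyway, or abort in the GC path is a design choice. The linked idiom treats that path as a bug to be reported.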
Jun 13 2016
cy <dlang verge.info.tm> writes:
On Monday, 13 June 2016 at 08:49:55 UTC, Guillaume Piolat wrote:
 This thing helps with getting rid of such bugs.
I suppose. The real cause of the bugs is not destructors I think, but memory corruption. Since frees (and double-frees) are deferred so long, it's just impossible to even guess at where the double destruction happened, or when you dereferenced a null pointer and D didn't catch that... again. For those bugs, it would really help to know which objects are being freed, so you can check whether you accidentally forgot to initialize them or whatnot. Debugging their destructors is not the issue, and if it were I could just add a break inside the destructor itself.
 Having a non-trivial destructor called by the GC, and relying 
 on it, is imho an error at best, should always be manual.
I'd tend to agree. The reason I was using destructors at all is because I was using an SQL database. Those things like to act like whole global environments, where it's cheaper for you to create two tables within the same database than to have two databases open simultaneously, so usually when working with SQL I will make the database a global object, initialized on demand. Since there is *always* a database object, you can write code that can be optimized at compile-time, instead of code that runs conditionally at runtime, on the worthless check of whether it gets passed a database object or not. So, the only problem is if the database is guaranteed to be open and usable anywhere in your code, when do you close it? What I do is explicitly close it, then add a close() in the destructor with a warning. And then I run into GC errors... despite the only destructor in my entire program finishing without error.
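One possible answer to "when do you close it?", offered here only as a sketch and not as what the poster did, is a module destructor. As far as I know, the runtime runs module destructors during shutdown before it terminates the GC, and not from inside a GC finalizer. The `Database` struct below is a hypothetical stand-in for a real SQL handle, and `db()` is an assumed accessor name.

```d
// Hypothetical stand-in for a real SQL connection handle.
struct Database
{
    bool isOpen;
    void open()  { isOpen = true; }
    void close() { isOpen = false; }
}

// Module-level variables are thread-local by default in D.
private Database g_db;

// Opens the database on first use, so callers never have to check
// whether they were handed a database object.
ref Database db()
{
    if (!g_db.isOpen)
        g_db.open();
    return g_db;
}

// Thread-local module destructor: runs when the thread (including the
// main thread) terminates, rather than whenever the GC sweeps.
static ~this()
{
    if (g_db.isOpen)
        g_db.close();
}

unittest
{
    assert(db().isOpen);   // first access opens it
    db().close();          // explicit close is still possible
    assert(!g_db.isOpen);
}
```

With a `__gshared` or `shared` global, the matching hook would be `shared static ~this()`, which runs once per process instead of once per thread.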
Jun 13 2016
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Monday, 13 June 2016 at 08:26:34 UTC, cy wrote:
 core.exception.InvalidMemoryOperationError@src/core/exception.d(693): Invalid memory operation
FWIW: http://wiki.dlang.org/InvalidMemoryOperationError
Jun 13 2016
cy <dlang verge.info.tm> writes:
On Monday, 13 June 2016 at 08:58:40 UTC, Vladimir Panteleev wrote:
 FWIW: http://wiki.dlang.org/InvalidMemoryOperationError
Yeah... I did forget that the GC was not re-entrant. But the "stop collecting inside the destructor, moron" error is exactly the same as "something allocated in the destructor... moron" error. I'd think we could just allocate space for a potential stack trace before starting garbage collection, but I guess not?
Jun 13 2016
Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Monday, 13 June 2016 at 16:43:57 UTC, cy wrote:
 On Monday, 13 June 2016 at 08:58:40 UTC, Vladimir Panteleev 
 wrote:
 FWIW: http://wiki.dlang.org/InvalidMemoryOperationError
Yeah... I did forget that the GC was not re-entrant. But the "stop collecting inside the destructor, moron" error is exactly the same as "something allocated in the destructor... moron" error.
"The GC is not re-entrant" applies to all of its parts, not only the actual part doing the GC and reclaiming memory - i.e. calling any GC function (allocation or an explicit free) while a GC function is running is not supported. In practice, this only applies to allocation/free from a destructor invoked by a GC sweep though.
 I'd think we could just allocate space for a potential stack 
 trace before starting garbage collection, but I guess not?
The problem is certainly solvable, but, well, a GC cycle is initiated precisely when the runtime runs out of memory.
Jun 13 2016