digitalmars.D - Dlls and object collection
- pragma (39/39) Feb 06 2005 This really isn't so much of a request for help, as an example of what c...
- Kris (17/60) Feb 06 2005 Aye;
- pragma (12/17) Feb 06 2005 Gah. I keep forgetting about that. Thank you for setting me straight on...
- Kris (13/17) Feb 06 2005 Ah; gotcha. Walter will probably flip over that notion (perhaps rightly ...
- pragma (13/24) Feb 06 2005 Right, its not the best solution, its just one of several things that mi...
- Kris (8/13) Feb 06 2005 I'm lost on this one, Eric - why would one need to cast to the Interface...
- Matthew (6/27) Feb 06 2005 Which brings me back to my "overly complicated" solution we discussed
- Ben Hinkle (5/17) Feb 06 2005 I guess that's why Java doesn't let you unload classes explicitly and C#...
- Walter (5/7) Feb 10 2005 the
- Walter (26/37) Feb 10 2005 calling
This really isn't so much a request for help as an example of what can go wrong with DLLs and the GC in D (at present); it's something to look out for that I didn't expect at first. Perhaps this will help some struggling noobs out there, but I hope it'll raise some eyebrows with the more seasoned developers among us. BTW, I'm open to suggestions on how best to tackle this problem.

Basically, with the new 'hookable GC' that Walter gave us with 0.112, things have improved. One no longer needs to worry about having a dll return a string or int[], only to wonder where that memory will go once the dll is unloaded. However, there are some 'gotchas' still present in the architecture.

//mydll.d
//(assuming that winmain is configured and the proper GC hooks are in place)
class Foobar{}
static this(){
    new Foobar();
}

//test.d
//(assuming a Library class that loads a library and hooks/unhooks the GC)
void main(){
    Library lib = new Library("mydll.dll"); // load and hook
    lib.unload(); // unhook and unload
}

Note that there is no communication between main and the dll other than calling hook and unhook.

Since the GC is lazy, any collection pass can leave objects outstanding for a variety of reasons (even after a full collect). The code above is *likely* to work, but can fail if the 'Foobar' object created in the mydll.d module is still outstanding after the call to unload(). In that case, the 'Foobar' becomes more of a 'Fubar' object, since the gc dutifully tries to call the object's destructor. Said destructor doesn't exist anymore, since the object's v-table points to where the dll used to be.

So this is a problem that GC-hooking doesn't solve; objects are *very* tightly bound to their 'home' dll. I for one used to think that only those objects that cross the dll/application barrier required discrete tracking. I now understand that this is not the case: every object created within a dll needs the same consideration one way or another.

I think GC-managed libraries are the way to go, but such a technique would require being able to set the entire dll-space in memory as a GC root. Does anyone out there have an idea how to gather that information on win32? How about Posix?

- EricAnderton at yahoo
Feb 06 2005
Aye;

This is partly what I was getting at in an earlier thread; like you, I feel the DLL unloading needs to be managed by the GC (via a DLL wrapper class). However, that can lead to deadlock when the GC halts all threads while it collects.

One way around the deadlock issue is to construct a non-blocking, non-spinning mechanism whereby the DLL-wrapper may be marked for subsequent removal via its destructor (synch won't work, because another thread could be holding the wrapper-mutex while it is asking for the DLL to be loaded - that thread will be paused() by the GC during a collect, which is when the wrapper-destructor could be invoked).

A further issue is where the DLL creates a thread of its own. The GC will not know about such threads, and therefore will not pause() them during a collect. This exposes the potential for heap-corruption, so fair warning :-)

In article <cu62oh$215u$1 digitaldaemon.com>, pragma says...
> but such a technique would require being able to set the entire dll-space in memory as a GC root. Does anyone out there have an idea how to gather that information on win32? How about Posix?

I believe the GC adds the DLL static-data-area as a GC root. But I don't follow as to why the entire DLL would need to be mapped. Can you elaborate, Eric?

- Kris
Feb 06 2005
In article <cu64pm$25h4$1 digitaldaemon.com>, Kris says...
> This is partly what I was getting at in an earlier thread; like you, I feel the DLL unloading needs to be managed by the GC (via a DLL wrapper class). However, that can lead to deadlock when the GC halts all threads while it collects.

Gah. I keep forgetting about that. Thank you for setting me straight once again. :)

> I believe the GC adds the DLL static-data-area as a GC root. But I don't follow as to why the entire DLL would need to be mapped. Can you elaborate, Eric?

That's easy to explain. The key issue here is that while the GC does a great job of tracking pointer-to-data dependencies, it fails on pointer-to-code with respect to dlls. So without the *code* space of the dll being mapped, delegates, function-pointers and object v-tables all slip through the cracks.

It doesn't have to be all-or-nothing for mapping the dll as a root. I figured that it would probably be easier than trying to find the dll's code segment(s) and add them via gc.addRange(). Either way, it's the missing magic needed to make this work transparently.

- EricAnderton at yahoo.com
Feb 06 2005
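For readers unfamiliar with range registration, here is a minimal sketch of what a gc.addRange()-style call does. It is written against the present-day core.memory.GC API, which is an assumption on my part since the thread predates it, and the Payload class and readThroughRegisteredSlot function are made up for illustration. It only shows the pointer-to-data case the GC already handles: memory the collector does not own is registered so that references stored there are scanned.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

/// Hypothetical GC-allocated object, used only for illustration.
class Payload
{
    int value = 42;
}

/// Stores a GC reference in C-heap memory, registers that memory with the
/// collector, forces a collection, and reads the object back.
int readThroughRegisteredSlot()
{
    // Memory the GC knows nothing about, standing in for a DLL-private area.
    auto slot = cast(Payload*) malloc((void*).sizeof);
    assert(slot !is null, "malloc failed");
    scope (exit) free(slot);

    *slot = new Payload;

    // Ask the collector to scan the slot for references into its heap.
    GC.addRange(slot, (void*).sizeof);
    scope (exit) GC.removeRange(slot);

    GC.collect();
    return (*slot).value;
}

void main()
{
    assert(readThroughRegisteredSlot() == 42);
}
```

Note the direction of the dependency: a registered range keeps heap objects alive on behalf of whatever lives in the range. It does nothing by itself to keep a DLL's code mapped on behalf of heap objects, which is the pointer-to-code gap described above.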
In article <cu6ak5$2hoj$1 digitaldaemon.com>, pragma says...
> That's easy to explain. The key issue here is that while the GC does a great job of tracking pointer-to-data dependencies, it fails on pointer-to-code with respect to dlls. So without the *code* space of the dll being mapped, delegates, function-pointers and object v-tables all slip through the cracks.

Ah; gotcha. Walter will probably flip over that notion (perhaps rightly so) since there's now a raft of machine-code to be scanned (in addition to data segments), some of which will probably look like valid pointers into the heap. :-)

One might deal with this via an extended delegate, which also has a reference to the DLL wrapper. Could be done with a class/struct. Perhaps better to use an Interface in the first place, since the implementation could hold a reference to the DLL wrapper, thus steering the GC in the right direction.

p.s. The 'flip' reference is with respect to Walter's arguments against disabling stack-array initialization (from months ago), since the array content would likely contain relics of heap references - even random data can look like a valid heap reference. We did, at least, get a partial resolution to that one.
Feb 06 2005
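The 'extended delegate' idea can be sketched roughly as follows. This is only an illustration under assumptions: Library here is a hypothetical stand-in that does no real loading, and BoundDelegate and callThroughBound are names invented for the example. The point is simply that anything holding the struct also holds a reference to the wrapper, so the GC sees the wrapper as reachable for as long as the delegate is.

```d
/// Hypothetical DLL wrapper; a real one would load and unload the library.
final class Library
{
    string path;

    this(string path)
    {
        this.path = path;
    }
}

/// A delegate bundled with the wrapper of the DLL its code would live in.
struct BoundDelegate
{
    Library owner;          // reachable for as long as this struct is reachable
    void delegate() call;   // in practice this would point into the DLL's code

    void invoke()
    {
        call();
    }
}

/// Builds a bound delegate, calls through it once, and reports the call count.
int callThroughBound()
{
    auto lib = new Library("mydll.dll");
    int calls;
    auto bound = BoundDelegate(lib, () { ++calls; });

    bound.invoke();
    assert(bound.owner is lib);
    return calls;
}

void main()
{
    assert(callThroughBound() == 1);
}
```

The Interface variant works the same way: the implementing class keeps a Library field, so a live reference to the object implies a live reference to the wrapper.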
In article <cu6cnf$2kv4$1 digitaldaemon.com>, Kris says...
> In article <cu6ak5$2hoj$1 digitaldaemon.com>, pragma says...
>> That's easy to explain. The key issue here is that while the GC does a great job of tracking pointer-to-data dependencies, it fails on pointer-to-code with respect to dlls. So without the *code* space of the dll being mapped, delegates, function-pointers and object v-tables all slip through the cracks.
> Ah; gotcha. Walter will probably flip over that notion (perhaps rightly so) since there's now a raft of machine-code to be scanned (in addition to data segments), some of which will probably look like valid pointers into the heap.

Right, it's not the best solution; it's just one of several things that might work. :)

There are other issues that seem to fall between the cracks with regard to using dlls. I still can't confirm that I'm doing this right, but it looks like casting an object to an interface, when the object is passed from a dll, throws out the object's v-table (and yields methods that do *nothing* as a result). This flows into yet another problem: calling 'delete' on said object from outside the dll also creates a fault.

ITester tester = cast(ITester)mydll.newTestObject(); // get a new object
tester.foo(); // does absolutely nothing, not even a fault.
delete tester; // faults

Now if I change the code to use a base class instead of an interface, the odd behavior goes away. Go figure.

- EricAnderton at yahoo
Feb 06 2005
In article <cu6fcl$2q4h$1 digitaldaemon.com>, pragma says...
> There are other issues that seem to be in-between the cracks with regards to using dlls.

Too true :-)

> ITester tester = cast(ITester)mydll.newTestObject(); // get a new object
> tester.foo(); // does absolutely nothing, not even a fault.
> delete tester; // faults

I'm lost on this one, Eric - why would one need to cast to the Interface if mydll.newTestObject() returns one? Oh; is that the error? Should that really be a mydll.newTestInterface() instead, where the concrete DLL class implements ITester? Forgive me if I'm stating the obvious!

Delete does work correctly on an Interface, under normal circumstances (since DMD 0.81 I think).
Feb 06 2005
"pragma" <pragma_member pathlink.com> wrote in message news:cu6ak5$2hoj$1 digitaldaemon.com...
> In article <cu64pm$25h4$1 digitaldaemon.com>, Kris says...
>> This is partly what I was getting at in an earlier thread; like you, I feel the DLL unloading needs to be managed by the GC (via a DLL wrapper class). However, that can lead to deadlock when the GC halts all threads while it collects.
> Gah. I keep forgetting about that. Thank you for setting me straight once again. :)
>> I believe the GC adds the DLL static-data-area as a GC root. But I don't follow as to why the entire DLL would need to be mapped. Can you elaborate, Eric?
> That's easy to explain. The key issue here is that while the GC does a great job of tracking pointer-to-data dependencies, it fails on pointer-to-code with respect to dlls. So without the *code* space of the dll being mapped, delegates, function-pointers and object v-tables all slip through the cracks.

Which brings me back to my "overly complicated" solution we discussed last month. Keeping code loaded is a sine qua non of component based programming. Glad that there's at least a few others interested. :-)
Feb 06 2005
> Since the GC is lazy, any collection pass can leave objects outstanding for a variety of reasons (even after a full collect). The code above is *likely* to work, but can fail if the 'Foobar' object created in the mydll.d module is still outstanding after the call to unload(). In that case, the 'Foobar' becomes more of a 'Fubar' object since the gc dutifully tries to call the object's destructor. Said destructor doesn't exist anymore since the object's v-table points to where the dll used to be.

I guess that's why Java doesn't let you unload classes explicitly and C# only lets you unload "AppDomains". I don't know much about AppDomains but they look like a way of separating an application into distinct parts.

I agree with Kris's observation that current behavior has a problem that the dll's thread list is not merged with the GC's thread list.
Feb 06 2005
"Ben Hinkle" <ben.hinkle gmail.com> wrote in message news:cu66em$2976$1 digitaldaemon.com...
> I agree with Kris's observation that current behavior has a problem that the dll's thread list is not merged with the GC's thread list.

True, that is a bug. The thread management code needs to be single instanced like the gc is. At the moment, the DLL shouldn't create any threads.
Feb 10 2005
"pragma" <pragma_member pathlink.com> wrote in message news:cu62oh$215u$1 digitaldaemon.com...
> Note that there is no communication between main and the dll other than calling hook and unhook.
>
> Since the GC is lazy, any collection pass can leave objects outstanding for a variety of reasons (even after a full collect). The code above is *likely* to work, but can fail if the 'Foobar' object created in the mydll.d module is still outstanding after the call to unload(). In that case, the 'Foobar' becomes more of a 'Fubar' object since the gc dutifully tries to call the object's destructor. Said destructor doesn't exist anymore since the object's v-table points to where the dll used to be.
>
> So this is a problem that GC-hooking doesn't solve; objects are *very* tightly bound to their 'home' dll.

That's why, in the example DLL, there's a call to MyDll_Terminate() which will run the DLL's static destructors before unloading. Furthermore, the static data area of the DLL is removed from the gc's list of roots to scan, so the gc in the EXE is not going to be scanning the DLL's static data after it is unloaded.

The only real problem is if an object is left hanging around that has a destructor that resides in the now unloaded DLL. The destructor's code will have been unloaded. (This same problem would happen with C++.) The solutions are:

1) Don't explicitly unload the DLL, just let the OS unload it for you when the application exits
2) Do not leave around long lived objects that have destructors
3) Keep a list of the objects the DLL creates that have destructors, and delete them explicitly when the DLL is unloaded
4) Make the unloading of the DLL the responsibility of a destructor in an object. In each DLL allocated object with a destructor, have a pointer to that DLL object. Then, the DLL won't get unloaded until after the other objects are no longer referred to themselves.
Feb 10 2005
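Solution 3 above might look something like the sketch below. It is a simulation under stated assumptions rather than working DLL code: Library is a hypothetical wrapper that only flips a flag where a real one would unload, track() and unloadRunsDestructorsFirst() are invented names, and destroy() is used as the present-day replacement for an explicit 'delete'. A counter stands in for whatever work the destructors would do.

```d
/// Number of Foobar instances whose destructor has not run yet.
int liveFoobars;

/// Stand-in for a class defined in the DLL that has a destructor.
class Foobar
{
    this()
    {
        ++liveFoobars;
    }

    ~this()
    {
        --liveFoobars;
    }
}

/// Hypothetical DLL wrapper that remembers objects needing finalization.
final class Library
{
    private Object[] tracked;   // objects whose destructors live in the DLL
    private bool loaded = true;

    /// Registers an object so it can be finalized before unloading.
    T track(T : Object)(T obj)
    {
        tracked ~= obj;
        return obj;
    }

    bool isLoaded() const
    {
        return loaded;
    }

    /// Runs every tracked destructor first, then marks the library unloaded.
    void unload()
    {
        foreach (obj; tracked)
            destroy(obj);       // destructor runs now, while its code is still mapped
        tracked = null;
        loaded = false;         // a real wrapper would unload the DLL here
    }
}

/// Creates two tracked objects, unloads, and checks that both were finalized.
bool unloadRunsDestructorsFirst()
{
    auto lib = new Library;
    lib.track(new Foobar);
    lib.track(new Foobar);
    assert(liveFoobars == 2);

    lib.unload();
    return liveFoobars == 0 && !lib.isLoaded;
}

void main()
{
    assert(unloadRunsDestructorsFirst());
}
```

The cost of this approach is that every destructor-bearing object created by the DLL has to pass through track(), and any reference the application still holds after unload() points at an already finalized object.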