digitalmars.D - large objects and GC
- Fawzi Mohamed (29/29) May 16 2008 There was recently a discussion on large array and GC.
- Vladimir Panteleev (7/11) May 16 2008 This will be effectively the same as having a "wrapper" class for manual...
- BCS (3/10) May 16 2008 I think this is based on the assumption that lots of systems only ever u...
- Vladimir Panteleev (5/6) May 16 2008 Yes, but in lots of cases this is not the case. Changing this would brea...
- BCS (2/4) May 16 2008 for those cases, you don't opt-in. The default would stay the same as no...
- Fawzi Mohamed (2/8) May 17 2008 exactly, maybe I should have been clearer, this approach is opt-in.
- Fawzi Mohamed (31/57) May 17 2008 If I understood correctly the actual gc (I looked at tango's one, but
- Vladimir Panteleev (5/6) May 17 2008 Heh, no, this is not the case. The GC will track references individually...
- Fawzi Mohamed (6/13) May 17 2008 Thanks if that is the case then wrapper objects are ok.
- Vladimir Panteleev (6/8) May 17 2008 It's Phobos 1.x, sorry.
- Sean Kelly (9/21) May 17 2008 In Tango:
- Fawzi Mohamed (16/44) May 17 2008 thank you for the explanation, I had badly interpreted the "gc does not
- BCS (5/7) May 16 2008 So in effect, only a pointer to the start of the block counts. Anything...
- Fawzi Mohamed (11/20) May 17 2008 exactly that's the gist of the idea, and would let one use almost
There was recently a discussion on large arrays and the GC. The main conclusion was that, because the garbage collector is not exact, large arrays tend not to get collected. With Tango this seems less problematic, but it can still be an issue.

I am writing something that needs large arrays, and one obvious solution is to allocate the memory manually. This works, but then one has to use some kind of memory management, for example having just one owner, or using reference counting (with synchronization or atomic operations). The problem is that if the owner, or the object that does the reference counting, is managed by the GC, it might stay uncollected for quite a long time because it is probably small, so one really needs to do everything manually, use scope, ...

Obviously this is efficient, and one should do it with large objects, but it would be nice if things could transition more gracefully to an automatically managed model. A large object allocated with the GC should get a region to itself, so there could be another approach.

One could add a flag to the garbage collector. This flag would tell the GC to ignore inner pointers into a region when deciding whether the region should be collected (but the pointers should be updated when the region is moved). To use internal pointers one should also keep a pointer to the base object. Basically one has automatic reference counting, where the references are pointers to the base object. The advantage is that it is automatic, and that the memory can be relocated.

If others think it is a good idea I am willing to invest some time to explore it.

Fawzi
May 16 2008
On Sat, 17 May 2008 00:32:26 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:

> The problem is that if the owner/object that does the reference counting is managed by the gc, it might stay uncollected for quite a long time because it is probably small, so one needs to really do everything manually, use scope,...

I don't understand your logic here. The GC does not prioritize objects based on their size. Smaller objects are much less likely to leak because of the proportionally smaller chance of a bogus pointer keeping them "referenced".

> One could add a flag to the garbage collector. This flag would tell the gc to ignore inner pointers into a region when deciding if the region should be collected. To have internal pointers one should also keep a pointer to the base object.

This will be effectively the same as having a "wrapper" class for manually allocated memory. The class destructor, which will be called by the GC, should deallocate the external memory. The wrapper class should only have one field, and thus be very small, and its chance of leaking will be almost equivalent to that of the method you describe. I see no necessity for reference counting either; you just pass around the reference to the wrapper object.

> (but the pointers should be updated when the region is moved)

A moving garbage collector must also be an exact garbage collector.

--
Best regards,
 Vladimir mailto:thecybershadow gmail.com
May 16 2008
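The wrapper suggestion in the post above can be made concrete with a minimal sketch. It is written in C++ rather than D, so no tracing collector is involved: here the destructor runs when the wrapper goes out of scope or is deleted, whereas in the scheme described in the post it would run when the GC finalizes the wrapper. The class name `ExternalBlock` and the `live_blocks` counter are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical wrapper around manually allocated memory.
// Its only per-object field is the pointer to the external block,
// so the wrapper itself stays very small.
class ExternalBlock {
public:
    explicit ExternalBlock(std::size_t bytes) : mem_(std::malloc(bytes)) {
        if (mem_ == nullptr) throw std::bad_alloc();
        ++live_blocks;
    }

    // The destructor releases the external memory. In the scheme from the
    // post, a GC finalizer would play this role.
    ~ExternalBlock() {
        std::free(mem_);
        --live_blocks;
    }

    // Copying would lead to a double free, so it is disabled.
    ExternalBlock(const ExternalBlock&) = delete;
    ExternalBlock& operator=(const ExternalBlock&) = delete;

    void* data() const { return mem_; }

    static int live_blocks;  // demo-only bookkeeping, not part of the idea

private:
    void* mem_;
};

int ExternalBlock::live_blocks = 0;
```

Code that needs the memory receives a reference to the wrapper and calls `data()`; the external block is released exactly once, when the wrapper is destroyed.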
Reply to Vladimir,

> This will be effectively the same as having a "wrapper" class for manually allocated memory. The class destructor, which will be called by the GC, should deallocate the external memory. The wrapper class should only have one field, and thus be very small, and its chance of leaking will be almost equivalent to that of the method you describe. I see no necessity for reference counting either; you just pass around the reference to the wrapper object.

I think this is based on the assumption that lots of systems only ever use the base pointer, so why not let the programmer leverage that?
May 16 2008
On Sat, 17 May 2008 00:58:41 +0300, BCS <ao pathlink.com> wrote:

> I think this is based on the assumption that lots of systems only ever use the base pointer, so why not let the programmer leverage that?

Yes, but in many cases that assumption does not hold. Changing this would break many things, while it can be worked around with wrapper objects.

--
Best regards,
 Vladimir mailto:thecybershadow gmail.com
May 16 2008
Reply to Vladimir,

> Changing this would break many things,

For those cases, you don't opt in. The default would stay the same as now.
May 16 2008
On 2008-05-17 03:00:48 +0200, BCS <ao pathlink.com> said:

> Reply to Vladimir,
>
>> Changing this would break many things,
>
> For those cases, you don't opt in. The default would stay the same as now.

Exactly. Maybe I should have been clearer: this approach is opt-in.
May 17 2008
On 2008-05-16 23:54:49 +0200, "Vladimir Panteleev" <thecybershadow gmail.com> said:

> On Sat, 17 May 2008 00:32:26 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:
>
>> The problem is that if the owner/object that does the reference counting is managed by the gc, it might stay uncollected for quite a long time because it is probably small, so one needs to really do everything manually, use scope,...
>
> I don't understand your logic here. The GC does not prioritize objects based on their size.

If I understood the current GC correctly (I looked at Tango's, but it seems to be just a slightly improved version of the Phobos GC), it doesn't know anything about single objects; it just works with pools. This way the number of objects it has to handle stays manageable. If an object is big it gets its own pool, whereas if it is small it goes into a pool with other objects. The pool then stays around as long as any object in it has references. When the pool goes away all the finalizers are called and then the memory is released (not necessarily to the system, but at least to the GC).

This behavior is OK as long as the size of the object is the one the GC sees: if the pools have a reasonable size, the memory loss stays reasonable. But now look at the typical use of an array initialized from other arrays through calculations that need temporary arrays. With small wrappers, both the result and the temporary arrays are likely to be in the same pool, so as long as the result is kept around, all the temporaries used to create it stay around too. If the temporaries actually hold a large amount of manually allocated memory, this is clearly a waste of resources. The result is that you have to manually manage not just the big memory allocations, but also the wrappers.

I know that with big objects manual management is probably a good idea, but I would like a system that can work reasonably well with more relaxed management. I think that my proposal achieves this with a small change.

> Smaller objects are much less likely to leak because of the proportionally smaller chance of a bogus pointer keeping them "referenced".
>
>> One could add a flag to the garbage collector. This flag would tell the gc to ignore inner pointers into a region when deciding if the region should be collected. To have internal pointers one should also keep a pointer to the base object.
>
> This will be effectively the same as having a "wrapper" class for manually allocated memory. The class destructor, which will be called by the GC, should deallocate the external memory. The wrapper class should only have one field, and thus be very small, and its chance of leaking will be almost equivalent to that of the method you describe. I see no necessity for reference counting either; you just pass around the reference to the wrapper object.

See the previous point.

>> (but the pointers should be updated when the region is moved)
>
> A moving garbage collector must also be an exact garbage collector.

I know, it was just in case...

Fawzi
May 17 2008
On Sat, 17 May 2008 10:54:06 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:

> Now the pool will stay around as long as any object in it has references.

Heh, no, this is not the case. The GC will track references individually for every object inside the memory pool. The code for freeing sub-pool-size objects is in gcx.d, lines 2056 to 2075.

--
Best regards,
 Vladimir mailto:thecybershadow gmail.com
May 17 2008
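The per-object tracking described in the post above can be pictured with a toy model. This is only a sketch in C++ and is not taken from gcx.d: the `ToyPool` type, its fixed slot count, and the two flag arrays are assumptions made for illustration. It shows a sweep that frees unreferenced objects one at a time while the pool and its surviving objects stay in place.

```cpp
#include <array>
#include <cstddef>

// Toy pool of equally sized slots. All names are illustrative only.
struct ToyPool {
    static constexpr std::size_t kSlots = 8;

    std::array<bool, kSlots> allocated{};  // slot currently holds an object
    std::array<bool, kSlots> marked{};     // slot was reached during marking

    // Sweep: free every allocated slot that was not marked, slot by slot.
    // Returns the number of slots freed. The pool itself is never released
    // here, so a live object does not keep its neighbours alive.
    std::size_t sweep() {
        std::size_t freed = 0;
        for (std::size_t i = 0; i < kSlots; ++i) {
            if (allocated[i] && !marked[i]) {
                allocated[i] = false;
                ++freed;
            }
            marked[i] = false;  // reset for the next collection cycle
        }
        return freed;
    }
};
```

In the scenario of a result wrapper sharing a pool with the wrappers of its temporaries, only the result's slot would be marked, so the sweep would free the temporaries even though the pool survives.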
On 2008-05-17 10:42:21 +0200, "Vladimir Panteleev" <thecybershadow gmail.com> said:

> On Sat, 17 May 2008 10:54:06 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:
>
>> Now the pool will stay around as long as any object in it has references.
>
> Heh, no, this is not the case. The GC will track references individually for every object inside the memory pool. The code for freeing sub-pool-size objects is in gcx.d, lines 2056 to 2075.

Thanks, if that is the case then wrapper objects are OK.

gcx.d, lines 2056 to 2075 of which codebase? Where can I get that gcx? The Tango version has something else at those lines...

Fawzi
May 17 2008
On Sat, 17 May 2008 12:06:45 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:

> gcx.d, lines 2056 to 2075 of which codebase? Where can I get that gcx? The Tango version has something else at those lines...

It's Phobos 1.x, sorry. The file is in \dmd\src\phobos\internal\gc\gcx.d

--
Best regards,
 Vladimir mailto:thecybershadow gmail.com
May 17 2008
== Quote from Fawzi Mohamed (fmohamed mac.com)'s article

> On 2008-05-17 10:42:21 +0200, "Vladimir Panteleev" <thecybershadow gmail.com> said:
>
>> On Sat, 17 May 2008 10:54:06 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:
>>
>>> Now the pool will stay around as long as any object in it has references.
>>
>> Heh, no, this is not the case. The GC will track references individually for every object inside the memory pool. The code for freeing sub-pool-size objects is in gcx.d, lines 2056 to 2075.
>
> Thanks, if that is the case then wrapper objects are OK.
>
> gcx.d, lines 2056 to 2075 of which codebase? Where can I get that gcx? The Tango version has something else at those lines...

In Tango: /trunk/lib/gc/basic/gcx.d
In Phobos: /internal/gc/gcx.d

For what you describe, the easiest thing would be to add a new bitfield and have an "allow interior pointers" bit per block, then check this bit during scanning before flagging a block as reachable.

Sean
May 17 2008
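The per-block bit suggested in the post above can be sketched as a toy conservative mark phase. This is an illustration in C++ under stated assumptions, not code from either gcx.d: the `Block` record, the `allow_interior` field name, and the flat list of candidate words standing in for the scanned stack and heap are all hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical block record for a toy conservative collector.
struct Block {
    std::uintptr_t base;  // address of the first byte of the block
    std::uintptr_t size;  // size of the block in bytes
    bool allow_interior;  // per-block bit: may an interior pointer keep it alive?
    bool reachable;       // set by mark()
};

// Conservative scan: every candidate word is treated as a possible pointer.
// A block is flagged reachable if the word equals its base address, or if
// the word points anywhere inside it and the block allows interior pointers.
void mark(std::vector<Block>& blocks,
          const std::vector<std::uintptr_t>& candidates) {
    for (std::uintptr_t word : candidates) {
        for (Block& b : blocks) {
            if (word < b.base || word - b.base >= b.size) {
                continue;  // the word does not point into this block
            }
            const bool is_base = (word == b.base);
            if (is_base || b.allow_interior) {
                b.reachable = true;
            }
        }
    }
}
```

With the bit cleared on a large block, a stray word that happens to land somewhere inside it no longer keeps it alive; only a word equal to its base address does. The linear search over blocks is for clarity only, and a real collector would use a faster address lookup.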
On 2008-05-17 21:35:07 +0200, Sean Kelly <sean invisibleduck.org> said:

> == Quote from Fawzi Mohamed (fmohamed mac.com)'s article
>
>> On 2008-05-17 10:42:21 +0200, "Vladimir Panteleev" <thecybershadow gmail.com> said:
>>
>>> On Sat, 17 May 2008 10:54:06 +0300, Fawzi Mohamed <fmohamed mac.com> wrote:
>>>
>>>> Now the pool will stay around as long as any object in it has references.
>>>
>>> Heh, no, this is not the case. The GC will track references individually for every object inside the memory pool. The code for freeing sub-pool-size objects is in gcx.d, lines 2056 to 2075.
>>
>> Thanks, if that is the case then wrapper objects are OK.
>>
>> gcx.d, lines 2056 to 2075 of which codebase? Where can I get that gcx? The Tango version has something else at those lines...
>
> In Tango: /trunk/lib/gc/basic/gcx.d
> In Phobos: /internal/gc/gcx.d
>
> For what you describe, the easiest thing would be to add a new bitfield and have an "allow interior pointers" bit per block, then check this bit during scanning before flagging a block as reachable.
>
> Sean

Thank you for the explanation. I had badly interpreted "the gc does not know anything about the objects", and I didn't actually try to test my understanding with a program.

For what I want to do, a wrapper object that does manual memory allocation fits the bill nicely.

My proposal could still be useful to switch a large existing object to this memory management mode after the fact, and could avoid the need to delete it manually (as requested, for example, in http://d.puremagic.com/issues/show_bug.cgi?id=2105 ).

I will look into the possibility of introducing this in the GC, together with a function that, given a pointer, checks whether its object has a full pool to itself (with no space left for other objects) and, if so, switches on the "no internal pointers" flag. But as I don't need it, don't expect anything :)

Fawzi
May 17 2008
Reply to Fawzi,

> To have internal pointers one should also keep a pointer to the base object.

So in effect, only a pointer to the start of the block counts. Anything else is just ignored.

Just making sure I'm understanding you correctly. (I think this is worth exploring, but I don't know how good it will be.)
May 16 2008
On 2008-05-16 23:55:08 +0200, BCS <ao pathlink.com> said:

> Reply to Fawzi,
>
>> To have internal pointers one should also keep a pointer to the base object.
>
> So in effect, only a pointer to the start of the block counts. Anything else is just ignored.

Exactly, that's the gist of the idea, and it would let one use almost normal GC-allocated memory instead of manually managed memory. This behavior would kick in only if the GC decides that the object should get a pool to itself (obviously, objects of that kind should *always* be treated as if they had a pool to themselves, and would need a pointer to their base).

> Just making sure I'm understanding you correctly. (I think this is worth exploring, but I don't know how good it will be.)

Good. I am just raising it here to see whether I missed something basic or whether it is worth exploring (and maybe someone more knowledgeable about the GC will do it for me ;).

Fawzi
May 17 2008