digitalmars.D - RFC: Pinning interface for the GC
- Alex Rønne Petersen (29/29) Oct 13 2012 Hi,
- David Nadlinger (9/13) Oct 13 2012 If pointers in pinned objects make their targets live, there
- Alex Rønne Petersen (13/24) Oct 13 2012 There is a difference: Adding the object itself as a root does not
- David Nadlinger (11/15) Oct 13 2012 But then the GC _does_ have to scan those objects to be able to
- Alex Rønne Petersen (20/34) Oct 13 2012 Ah, I could have been clearer here.
- David Nadlinger (17/18) Oct 13 2012 I'm not so sure about that.
- David Nadlinger (7/12) Oct 13 2012 Actually, it does: the internal array of added roots is simply
- David Nadlinger (4/8) Oct 13 2012 https://github.com/D-Programming-Language/druntime/pull/322
- Alex Rønne Petersen (8/17) Oct 13 2012 That's good to know.
- dsimcha (5/32) Oct 13 2012 We already have a NO_MOVE attribute that can be set or unset.
- Rainer Schuetze (32/57) Oct 14 2012 I guess people don't think about pinning because the GC is not moving
Hi,

With precise garbage collection coming up, and most likely compacting garbage collection in the future, I think it's time we start thinking about an API to pin garbage collector-managed objects.

A typical approach that people use to 'pin' objects today is to allocate a chunk of memory from the C heap, add it as a root [range], and store a reference in it. That, or just global variables.

This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this".

I propose the following functions in core.memory.GC:

    static bool pin(const(void)* p) nothrow;
    static bool unpin(const(void)* p) nothrow;

The pin function shall pin the object pointed to by p in place such that it is not allowed to be moved nor collected until unpinned. The function shall return true if the object was successfully pinned or false if the object was already pinned or didn't belong to the garbage collector in the first place.

The unpin function shall unpin the object pointed to by p such that it is once again eligible for moving and collection as usual. The function shall return true if the object was successfully unpinned or false if the object was not pinned or didn't belong to the garbage collector in the first place.

Destroy!

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
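For readers who have not seen the workaround described in the post above, here is a minimal sketch of it in D. The helper names holdInCHeap and release are hypothetical and exist only for this illustration; GC.addRange and GC.removeRange are the existing core.memory calls. The sketch only shows the shape of the idiom: on a non-moving collector it keeps the object reachable, and it does nothing to stop a future compacting collector from moving it.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: parks a reference to `obj` in a C-heap slot
/// and registers that slot with the GC as a range to scan.
void** holdInCHeap(Object obj)
{
    auto slot = cast(void**) malloc((void*).sizeof);
    assert(slot !is null, "malloc failed");
    *slot = cast(void*) obj;
    // The GC now scans this one pointer-sized slot on every collection.
    GC.addRange(slot, (void*).sizeof);
    return slot;
}

/// Hypothetical helper: undoes holdInCHeap.
void release(void** slot)
{
    GC.removeRange(slot);
    *slot = null;
    free(slot);
}

void main()
{
    auto obj = new Object;
    auto slot = holdInCHeap(obj);

    // From here on, the C-heap slot is what is meant to keep the
    // object reachable, e.g. while only C code holds the pointer.
    GC.collect();
    assert(cast(Object) *slot is obj);

    release(slot);
}
```

The extra scan of the slot on every collection is the cost the post objects to.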
On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne Petersen wrote:
> This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this".

If pointers in pinned objects make their targets live, there would be no difference to simply adding the object as a root. So in your proposal, pinned objects are implicitly marked live if they aren't reachable from any of the roots, but any other objects reachable only from a pinned object but not from a root would be collected – correct?

David
Oct 13 2012
On 13-10-2012 21:24, David Nadlinger wrote:
> On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne Petersen wrote:
>> This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this".
>
> If pointers in pinned objects make their targets live, there would be no difference to simply adding the object as a root. So in your proposal, pinned objects are implicitly marked live if they aren't reachable from any of the roots, but any other objects reachable only from a pinned object but not from a root would be collected – correct?
>
> David

There is a difference: Adding the object itself as a root does not actually guarantee that the object *itself* will not be collected. At least, this is how I have to assume things work given that this is not guaranteed here: http://dlang.org/phobos/core_memory.html#addRoot

As for your question: Not quite. A pinned object that points to any other unpinned objects will implicitly keep those alive. This is at least how I would expect it to work, following the principle of least surprise.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
> As for your question: Not quite. A pinned object that points to any other unpinned objects will implicitly keep those alive. This is at least how I would expect it to work, following the principle of least surprise.

But then the GC _does_ have to scan those objects to be able to mark the whole graph as live, no? Wasn't this what you were referring to as "kind of terrible" in your first post?

But yes, for a moving GC, a way to pin objects would have to be added, and lots of code using GC.add*() for interfacing with C would have to be changed – or we make those functions actually pin the objects for backwards compatibility and add a new set of functions which really just add something as a root.

David
Oct 13 2012
On 13-10-2012 21:51, David Nadlinger wrote:
> On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
>> As for your question: Not quite. A pinned object that points to any other unpinned objects will implicitly keep those alive. This is at least how I would expect it to work, following the principle of least surprise.
>
> But then the GC _does_ have to scan those objects to be able to mark the whole graph as live, no? Wasn't this what you were referring to as "kind of terrible" in your first post?

Ah, I could have been clearer here.

The problem with using roots is two-fold:

1) It adds unnecessary work for the marking phase.
2) It forces scanning of 'pinned' objects to be imprecise.

(1) is not so much of a problem (it only happens if you have root ranges with null pointers and so on), but (2) can be. Another problem that would pop up if we made scanning of roots precise is that, then, the stored reference could be moved (as you also pointed out).

I guess there is also the issue of adding roots being a relatively expensive operation - it has to go through a mutex-guarded function whereas pinning objects can be made lock-free (at least on some architectures).

I think the problem boils down to using roots for something they're not meant to be used for, semantically.

> But yes, for a moving GC, a way to pin objects would have to be added, and lots of code using GC.add*() for interfacing with C would have to be changed – or we make those functions actually pin the objects for backwards compatibility and add a new set of functions which really just add something as a root.
>
> David

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
On Saturday, 13 October 2012 at 20:12:04 UTC, Alex Rønne Petersen wrote:
> 2) It forces scanning of 'pinned' objects to be imprecise.

I'm not so sure about that. In the comment section you added to Git core.memory, you wrote »Roots are always scanned conservatively. Roots include […] memory locations added through the GC.addRoot and GC.addRange functions.«.

But this statement is problematic, since addRange() adds a »memory location« consisting of root pointers, whereas addRoot() adds a single (rvalue) root pointer. Thus, depending on which case you consider, »memory location« would refer to different levels of indirection.

As far as I can see, adding objects you want to »pin« as roots would only force them to be scanned imprecisely if you'd force the entire GC memory block referred to by addRoot() resp. all the GC blocks referred to by the range added using addRange() to be scanned conservatively. But why would this be necessary?

David
Oct 13 2012
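The difference in indirection between the two calls discussed above can be seen in their call shapes. The following is a small sketch using the existing core.memory functions; the variable names are made up for illustration, and the program makes no claim about how any particular collector scans the memory behind these roots.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

void main()
{
    // A GC-managed block that should stay alive.
    void* block = GC.malloc(64);

    // addRoot: the argument IS the root pointer (passed by value).
    GC.addRoot(block);

    // addRange: the argument is the address of non-GC memory that
    // CONTAINS root pointers, one level of indirection further out.
    enum slotCount = 2;
    auto slots = cast(void**) malloc(slotCount * (void*).sizeof);
    assert(slots !is null, "malloc failed");
    slots[0] = block;
    slots[1] = null;
    GC.addRange(slots, slotCount * (void*).sizeof);

    GC.collect();

    // The block is still usable here.
    (cast(ubyte*) block)[0] = 42;
    assert((cast(ubyte*) block)[0] == 42);

    // Undo both registrations in reverse order.
    GC.removeRange(slots);
    free(slots);
    GC.removeRoot(block);
}
```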
On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
> There is a difference: Adding the object itself as a root does not actually guarantee that the object *itself* will not be collected. At least, this is how I have to assume things work given that this is not guaranteed here: http://dlang.org/phobos/core_memory.html#addRoot

Actually, it does: the internal array of added roots is simply considered an additional range to be scanned by the GC implementation. The docs should probably be clarified in this regard.

David
Oct 13 2012
On Saturday, 13 October 2012 at 20:00:54 UTC, David Nadlinger wrote:
> Actually, it does: the internal array of added roots is simply considered an additional range to be scanned by the GC implementation. The docs should probably be clarified in this regard.

https://github.com/D-Programming-Language/druntime/pull/322

David
Oct 13 2012
On 13-10-2012 22:00, David Nadlinger wrote:
> On Saturday, 13 October 2012 at 19:34:29 UTC, Alex Rønne Petersen wrote:
>> There is a difference: Adding the object itself as a root does not actually guarantee that the object *itself* will not be collected. At least, this is how I have to assume things work given that this is not guaranteed here: http://dlang.org/phobos/core_memory.html#addRoot
>
> Actually, it does: the internal array of added roots is simply considered an additional range to be scanned by the GC implementation. The docs should probably be clarified in this regard.
>
> David

That's good to know.

I'm not convinced that this should be defined behavior though. It encourages semantically incorrect use of the API (see my other reply).

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 13 2012
We already have a NO_MOVE attribute that can be set or unset. What's wrong with that?

http://dlang.org/phobos/core_memory.html#NO_MOVE

On Saturday, 13 October 2012 at 18:58:27 UTC, Alex Rønne Petersen wrote:
> Hi,
>
> With precise garbage collection coming up, and most likely compacting garbage collection in the future, I think it's time we start thinking about an API to pin garbage collector-managed objects.
>
> A typical approach that people use to 'pin' objects today is to allocate a chunk of memory from the C heap, add it as a root [range], and store a reference in it. That, or just global variables.
>
> This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this".
>
> I propose the following functions in core.memory.GC:
>
>     static bool pin(const(void)* p) nothrow;
>     static bool unpin(const(void)* p) nothrow;
>
> The pin function shall pin the object pointed to by p in place such that it is not allowed to be moved nor collected until unpinned. The function shall return true if the object was successfully pinned or false if the object was already pinned or didn't belong to the garbage collector in the first place.
>
> The unpin function shall unpin the object pointed to by p such that it is once again eligible for moving and collection as usual. The function shall return true if the object was successfully unpinned or false if the object was not pinned or didn't belong to the garbage collector in the first place.
>
> Destroy!
Oct 13 2012
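For reference, this is roughly how the NO_MOVE attribute mentioned above is set and cleared through the existing core.memory API. It is a sketch of the call shapes only: a non-moving collector is free to ignore the bit (so the example deliberately does not check it with GC.getAttr), and the attribute by itself says nothing about keeping the block alive.

```d
import core.memory : GC;

void main()
{
    // Request the attribute at allocation time...
    void* a = GC.malloc(32, GC.BlkAttr.NO_MOVE);

    // ...or set it on an existing block and clear it again later.
    void* b = GC.malloc(32);
    GC.setAttr(b, GC.BlkAttr.NO_MOVE);
    // ... hand `b` to code that requires a stable address ...
    GC.clrAttr(b, GC.BlkAttr.NO_MOVE);

    // Both blocks are ordinary GC allocations of at least 32 bytes.
    assert(GC.sizeOf(a) >= 32);
    assert(GC.sizeOf(b) >= 32);
}
```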
On 10/13/2012 8:58 PM, Alex Rønne Petersen wrote:
> Hi,
>
> With precise garbage collection coming up, and most likely compacting garbage collection in the future, I think it's time we start thinking about an API to pin garbage collector-managed objects.
>
> A typical approach that people use to 'pin' objects today is to allocate a chunk of memory from the C heap, add it as a root [range], and store a reference in it. That, or just global variables.

I guess people don't think about pinning because the GC is not moving objects ;-)

As of today, you usually add a root to a garbage collected memory object to keep it in memory (and it can be scanned precisely), but you add a range to a memory chunk not managed by the garbage collector for scanning (you can't pass type info for this, so it is scanned conservatively). So this discussion is about addRoot/removeRoot, not addRange/removeRange (at least in the terminology of the current gc implementation).

> This is kind of terrible because adding the chunk of memory as a root forces the GC to actually scan it, which is unnecessary when what you really want is to pin the object in place and tell the GC "I know what I'm doing, don't touch this".
>
> I propose the following functions in core.memory.GC:
>
>     static bool pin(const(void)* p) nothrow;
>     static bool unpin(const(void)* p) nothrow;
>
> The pin function shall pin the object pointed to by p in place such that it is not allowed to be moved nor collected until unpinned. The function shall return true if the object was successfully pinned or false if the object was already pinned or didn't belong to the garbage collector in the first place.
>
> The unpin function shall unpin the object pointed to by p such that it is once again eligible for moving and collection as usual. The function shall return true if the object was successfully unpinned or false if the object was not pinned or didn't belong to the garbage collector in the first place.
>
> Destroy!

Your proposal splits the addRoot-functionality of holding a reference, scanning and moving a garbage-collected object into two functions addRoot/pin. For a non-moving garbage collector, this is not really an issue.

Adding a root means that there is a reference to the object somewhere outside the reach of the garbage collector, so moving it would make that reference invalid. Not scanning the root object could cause referenced memory chunks to be collected, making the root object invalid. So I think the addRoot functionality should not change; tearing it apart could create invalid references.

The pin/unpin functions do make sense for a moving garbage collector, but your motivation above is misleading. It is not related to adding roots or ranges, though I guess using roots is the safer way. Well, thinking about it again, what's the use case of pinning without an external reference and keeping the object alive, just as addRoot would do?

Another question, which also affects addRoot/addRange: should there be a pin/unpin counter, so that an object that is pinned twice also needs to be unpinned twice to be movable again? This cannot be implemented by the simple NO_MOVE flag mentioned by David. Roots and ranges implement it, but in a slightly inefficient way because the memory is scanned multiple times then.
Oct 14 2012
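The pin/unpin counter raised in the last post could be prototyped in user code on top of the existing API. The sketch below is only that: PinTable and its methods are hypothetical names, not part of druntime, and the return values differ from the proposed bool-returning signatures so that the count is visible. It registers a block as a root once, on the first pin, so the block is not scanned once per pin, and removes the root on the last unpin. It is not thread-safe, and NO_MOVE is only a hint that a non-moving collector may ignore.

```d
import core.memory : GC;

/// Hypothetical counted pin table built on addRoot/removeRoot.
struct PinTable
{
    private size_t[void*] counts; // block base address -> pin count

    /// Returns the pin count after the call, or 0 if `p` does not
    /// point into GC-managed memory.
    size_t pin(void* p)
    {
        void* base = GC.addrOf(p);
        if (base is null)
            return 0;
        if (auto c = base in counts)
            return ++*c;
        // First pin: register once, however often it is pinned later.
        GC.addRoot(base);
        GC.setAttr(base, GC.BlkAttr.NO_MOVE);
        counts[base] = 1;
        return 1;
    }

    /// Returns false if `p` was not pinned through this table.
    bool unpin(void* p)
    {
        void* base = GC.addrOf(p);
        auto c = base in counts;
        if (c is null)
            return false;
        if (--*c == 0)
        {
            // Last unpin: the block may be moved or collected again.
            counts.remove(base);
            GC.clrAttr(base, GC.BlkAttr.NO_MOVE);
            GC.removeRoot(base);
        }
        return true;
    }
}

void main()
{
    PinTable pins;
    void* p = GC.malloc(16);

    assert(pins.pin(p) == 1);
    assert(pins.pin(p) == 2);  // pinned twice...
    assert(pins.unpin(p));
    assert(pins.unpin(p));     // ...so it must be unpinned twice
    assert(!pins.unpin(p));    // a third unpin is rejected

    int local;
    assert(pins.pin(&local) == 0); // stack memory is not GC memory
}
```

Keeping the count outside the collector avoids repeated scanning, at the price of a table lookup on every pin and unpin.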