digitalmars.D - Moving GC
- dsimcha (10/10) Dec 12 2008 I've been playing around with custom memory allocation schemes for some
- Sean Kelly (4/7) Dec 12 2008 Minuscule. The GC isn't provided nearly enough type information to
- Don (9/19) Dec 13 2008 An related case that could probably be done is pure functions. Inside a
- Sergey Gromov (5/16) Dec 12 2008 Is this at all possible for a conservative GC to be moving? It's one
- dsimcha (15/31) Dec 12 2008 There are different levels of conservatism, though. For example, IIRC, ...
- Christopher Wright (8/15) Dec 12 2008 You'd have to do more bookkeeping:
- Sean Kelly (6/8) Dec 12 2008 This would incur more memory overhead as well. Right now, we make
- dsimcha (12/21) Dec 12 2008 Under what use cases would this extra few bytes really matter? What kin...
- Christopher Wright (6/18) Dec 12 2008 It means that you can't use one block for objects of multiple types.
- dsimcha (7/8) Dec 12 2008 Sure you can, without being any worse off than under the current scheme....
- Fawzi Mohamed (8/23) Dec 13 2008 it should be an array of void (or union pointer, non pointer), they
- Christopher Wright (3/12) Dec 13 2008 If you regularly do that, you won't have a moving garbage collector. If
- Steven Schveighoffer (14/22) Dec 13 2008 You need a bitfield indicating which ptr-sized blocks are valid pointers...
- Brad Roberts (12/35) Dec 12 2008 Maybe isn't good enough. You have to definitively be able to identify
- Stewart Gordon (7/12) Dec 25 2008 I think the underlying question is: Has anybody yet foreseen
I've been playing around with custom memory allocation schemes for some programs, which are based on regions. These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme. However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped. What are the odds that some D implementation gets a moving GC in the foreseeable future? Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?
Dec 12 2008
dsimcha wrote:What are the odds that some D implementation gets a moving GC in the foreseeable future?Minuscule. The GC isn't provided nearly enough type information to safely move memory right now, and I don't see that changing any time soon. Sean
Dec 12 2008
Sean Kelly wrote:dsimcha wrote:An related case that could probably be done is pure functions. Inside a pure function, all allocations could be made on a 'pure heap' instead of the normal heap. When returning from a pure function back into a non-pure function, copy the return value into the normal heap (moving it in the process). Then discard the entire pure heap. (This is a *huge* win if the return value from the pure function is a POD, and the pure function does extensive memory allocation). Of course, that's more like a memory pool rather a gc.What are the odds that some D implementation gets a moving GC in the foreseeable future?Minuscule. The GC isn't provided nearly enough type information to safely move memory right now, and I don't see that changing any time soon. Sean
Dec 13 2008
Fri, 12 Dec 2008 15:05:46 +0000 (UTC), dsimcha wrote:I've been playing around with custom memory allocation schemes for some programs, which are based on regions. These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme. However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped. What are the odds that some D implementation gets a moving GC in the foreseeable future? Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?Is this at all possible for a conservative GC to be moving? It's one thing when some unused memory is kept allocated due to false pointers, and another when some data resembling pointers changes unpredictably. And GC is to stay conservative as long as unions are supported.
Dec 12 2008
== Quote from Sergey Gromov (snake.scaly gmail.com)'s articleFri, 12 Dec 2008 15:05:46 +0000 (UTC), dsimcha wrote:There are different levels of conservatism, though. For example, IIRC, the Mono GC scans the stack conservatively and pins all objects that have apparent pointers from the stack. However, it scans the heap precisely and moves objects that only have pointers from the heap. In theory (not saying it will or even should actually happen) D could implement a mostly precise GC that makes the conservative assumption only in the case of unions, and pins objects that may be pointed to by pointers contained in unions. However, after thinking about it, I doubt D will go the route of moving/more precise GC. I know I've advocated it in the past, but I've realized that false pointers aren't generally that big a deal if you delete a few huge (>1MB) objects manually. Furthermore, I think it's necessary in a systems programming language to be able to allocate a big chunk of untyped memory. Technically, this could still be done with a moving GC by using the C heap, but it would make things really, really complicated.I've been playing around with custom memory allocation schemes for some programs, which are based on regions. These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme. However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped. What are the odds that some D implementation gets a moving GC in the foreseeable future? Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?Is this at all possible for a conservative GC to be moving? It's one thing when some unused memory is kept allocated due to false pointers, and another when some data resembling pointers changes unpredictably. And GC is to stay conservative as long as unions are supported.
Dec 12 2008
dsimcha wrote:However, after thinking about it, I doubt D will go the route of moving/more precise GC. I know I've advocated it in the past, but I've realized that false pointers aren't generally that big a deal if you delete a few huge (>1MB) objects manually. Furthermore, I think it's necessary in a systems programming language to be able to allocate a big chunk of untyped memory. Technically, this could still be done with a moving GC by using the C heap, but it would make things really, really complicated.You'd have to do more bookkeeping: - this memory is referenced by something that I know is a pointer - this memory is referenced by something that might be a pointer (or some random bit pattern) - this memory is not referenced This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.
Dec 12 2008
== Quote from Christopher Wright (dhasenan gmail.com)'s articleThis isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.This would incur more memory overhead as well. Right now, we make do with one bit per block that indicates whether the block should be scanned for pointers, while this may theoretically require a pointer to TypeInfo per block in order to perform precise scanning. Sean
Dec 12 2008
== Quote from Sean Kelly (sean invisibleduck.org)'s article== Quote from Christopher Wright (dhasenan gmail.com)'s articleUnder what use cases would this extra few bytes really matter? What kinds of programs allocate so many objects that this overhead would be significant in practice? Note that I'm not advocating a moving GC, because in a systems language like D, where working with pointers and untyped memory blocks is supposed to be feasible, I think it would create more problems/limitations than it solves. For example, now it's ok to store a pointer to GC allocated memory in an area not scanned by the GC, such as the C heap or a memory block marked as NO_SCAN, as long as you also have another pointer to it somewhere else. With a moving GC, this would become problematic. However, a more precise non-moving GC (scans the whole heap precisely, only imprecise on the stack) would be nice if it can be reasonably implemented.This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.This would incur more memory overhead as well. Right now, we make do with one bit per block that indicates whether the block should be scanned for pointers, while this may theoretically require a pointer to TypeInfo per block in order to perform precise scanning. Sean
Dec 12 2008
dsimcha wrote:== Quote from Sean Kelly (sean invisibleduck.org)'s articleIt means that you can't use one block for objects of multiple types. That's a limitation on how you implement a GC, and it could possibly increase your overhead a lot more than converting a bit to a word. (Unless you use some tricks, that bit will take at least one byte, maybe more depending on alignment.)== Quote from Christopher Wright (dhasenan gmail.com)'s articleUnder what use cases would this extra few bytes really matter? What kinds of programs allocate so many objects that this overhead would be significant in practice?This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.This would incur more memory overhead as well. Right now, we make do with one bit per block that indicates whether the block should be scanned for pointers, while this may theoretically require a pointer to TypeInfo per block in order to perform precise scanning. Sean
Dec 12 2008
== Quote from Christopher Wright (dhasenan gmail.com)'s articleIt means that you can't use one block for objects of multiple types.Sure you can, without being any worse off than under the current scheme. Mark the contents of the memory as an array of void*s. This would have the same effect as marking it for GC scanning under the current scheme, basically making the GC scan the block conservatively. Furthermore, if you're storing value types or pointers to reference types that also have a pointer stored in a GC-scanned block, set the typeinfo to byte or something, and you have the equivalent of setting the NO_SCAN bit.
Dec 12 2008
On 2008-12-13 04:24:54 +0100, dsimcha <dsimcha yahoo.com> said:== Quote from Christopher Wright (dhasenan gmail.com)'s articleit should be an array of void (or union pointer, non pointer), they should be pinned. Also all pointers given to C (and potentially pointers reachable from them) should be pinned. This makes system programming difficult. If one has a closed system with just D then it is easier. FawziIt means that you can't use one block for objects of multiple types.Sure you can, without being any worse off than under the current scheme. Mark the contents of the memory as an array of void*s. This would have the same effect as marking it for GC scanning under the current scheme, basically making the GC scan the block conservatively. Furthermore, if you're storing value types or pointers to reference types that also have a pointer stored in a GC-scanned block, set the typeinfo to byte or something, and you have the equivalent of setting the NO_SCAN bit.
Dec 13 2008
dsimcha wrote:== Quote from Christopher Wright (dhasenan gmail.com)'s articleIf you regularly do that, you won't have a moving garbage collector. If you do that only when you're low on memory, that might be acceptable.It means that you can't use one block for objects of multiple types.Sure you can, without being any worse off than under the current scheme. Mark the contents of the memory as an array of void*s. This would have the same effect as marking it for GC scanning under the current scheme, basically making the GC scan the block conservatively. Furthermore, if you're storing value types or pointers to reference types that also have a pointer stored in a GC-scanned block, set the typeinfo to byte or something, and you have the equivalent of setting the NO_SCAN bit.
Dec 13 2008
"Sean Kelly" wrote== Quote from Christopher Wright (dhasenan gmail.com)'s articleYou need a bitfield indicating which ptr-sized blocks are valid pointers. For example, a 16-byte block (on 32-bit system) only requires 4 bits. For 16, 32, 64, 128 byte blocks, you can probably get away with just a bitfield per block. For larger ones, you may have to have a pointer to a bitfield. But the bitfield should also always contain a bit that says whether the block is an array. That way, you only need one bitfield per element (i.e. the bit pattern repeats). Perhaps, you have one bit that says whether the block is an array, one bit that says whether the 'bitfield' is really a pointer to a static bitfield (for larger elements). These bitfields can easily be stored with the TypeInfo by the compiler. -SteveThis isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.This would incur more memory overhead as well. Right now, we make do with one bit per block that indicates whether the block should be scanned for pointers, while this may theoretically require a pointer to TypeInfo per block in order to perform precise scanning.
Dec 13 2008
Christopher Wright wrote:dsimcha wrote:Maybe isn't good enough. You have to definitively be able to identify every pointer to an object to be able to move the object since every pointer must be adjusted. The only alternative to that is double indirection for movable objects. With double indirection you point to a pointer that the GC owns and can update when the objects are moved. There's HUGE benefits to a moving gc. The gc in java 6 is _extremely_ clever (and not all that hard to write either, at least most of it is.. there's a lot of details in the heuristics parts), but 100% relies on moving objects. Later, BradHowever, after thinking about it, I doubt D will go the route of moving/more precise GC. I know I've advocated it in the past, but I've realized that false pointers aren't generally that big a deal if you delete a few huge (>1MB) objects manually. Furthermore, I think it's necessary in a systems programming language to be able to allocate a big chunk of untyped memory. Technically, this could still be done with a moving GC by using the C heap, but it would make things really, really complicated.You'd have to do more bookkeeping: - this memory is referenced by something that I know is a pointer - this memory is referenced by something that might be a pointer (or some random bit pattern) - this memory is not referenced This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.
Dec 12 2008
dsimcha wrote: <snip>What are the odds that some D implementation gets a moving GC in the foreseeable future?I think the underlying question is: Has anybody yet foreseen http://d.puremagic.com/issues/show_bug.cgi?id=679 being fixed?Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?Depends on whether you're writing an application or a library, I guess. Stewart.
Dec 25 2008