digitalmars.D - Thought on limiting scope of GC
- Jerry (15/15) Feb 13 2014 Hi all,
- Andrei Alexandrescu (5/20) Feb 13 2014 Yah, it's a classic (with the names "track" -> "mark" and "cleanup" ->
- Jerry (10/36) Feb 14 2014 I don't follow the global GC comment. Let's say you're using global GC
- Francesco Cattoglio (4/9) Feb 14 2014 Track cannot make sure that no reference escapes, therefore
- Andrei Alexandrescu (12/25) Feb 14 2014 Oh, I think mark/sweep in the "mark/sweep idiom" are different from
- Jerry (4/17) Feb 14 2014 The difference is that I'd like the ability for some objects to live
- Andrei Alexandrescu (3/22) Feb 14 2014 Then I guess you'd need to use two allocators.
- thedeemon (5/16) Feb 13 2014 What if allocateStuff() writes address of some newly allocated
- Jerry (6/21) Feb 14 2014 This is a concern. Rather than passing a single object into the
- Paulo Pinto (3/24) Feb 13 2014 How do you imagine it working in multi-core programs? Does it only
- Jerry (5/21) Feb 14 2014 I think this can be handled by storing the thread that requests
- Namespace (3/24) Feb 14 2014 Looks like DIP 46: http://wiki.dlang.org/DIP46
- tcak (9/30) Feb 14 2014 A programmer's aim is to tell computer what to do. Purpose of GC
- Paulo Pinto (8/46) Feb 14 2014 This only works when you are the only guy on the team and have a
- tcak (12/29) Feb 14 2014 Many people want to disable the GC to improve performance (if there
- Paulo Pinto (15/45) Feb 14 2014 Again, this example only works when you are the only guy working on the
- Jerry (4/13) Feb 14 2014 My proposal was to leave GC enabled for the whole program. The track
Hi all,

I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:

My thought is to have something like the following:

    GC.track();
    auto obj = allocateStuff();
    GC.cleanup(obj);

The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.

This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.

Comments? Slams?

Jerry
Feb 13 2014
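
To make the proposal concrete, here is a minimal toy model in D. Nothing in it is a real druntime API: `ToyHeap`, `Obj`, `alloc`, `track` and `cleanup` are hypothetical names chosen for illustration, and "objects" are just slots in an array that refer to each other by index. It only sketches the intended behaviour, assuming that cleanup marks from the objects passed in and never looks at anything allocated before track().

```d
// Hypothetical toy heap, not a druntime API.
struct Obj { size_t[] refs; bool alive = true; }

struct ToyHeap
{
    Obj[] objs;
    size_t trackStart;   // index of the first tracked object

    // Allocate an object that references the given objects.
    size_t alloc(size_t[] refs...)
    {
        objs ~= Obj(refs.dup);
        return objs.length - 1;
    }

    // Everything allocated from now on belongs to the tracked region.
    void track() { trackStart = objs.length; }

    // Mark from `live`, staying inside the tracked region, then free
    // every tracked object that was not reached.
    void cleanup(size_t[] live...)
    {
        auto marked = new bool[](objs.length);
        size_t[] work = live.dup;
        while (work.length)
        {
            immutable i = work[$ - 1];
            work = work[0 .. $ - 1];
            if (i < trackStart || marked[i]) continue;
            marked[i] = true;
            work ~= objs[i].refs;
        }
        foreach (i; trackStart .. objs.length)
            if (!marked[i]) objs[i].alive = false;
        trackStart = objs.length;
    }
}

void main()
{
    ToyHeap heap;
    auto old = heap.alloc();       // allocated before track(): never examined
    heap.track();
    auto tmp = heap.alloc();       // temporary, unreachable at cleanup time
    auto leaf = heap.alloc();
    auto obj = heap.alloc(leaf);   // the result, which references leaf
    heap.cleanup(obj);

    assert(heap.objs[old].alive);
    assert(!heap.objs[tmp].alive);
    assert(heap.objs[obj].alive && heap.objs[leaf].alive);
}
```

The work done by cleanup is proportional to the tracked region only, which is the point of the proposal; the price is that references from outside the region are invisible to it.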
On 2/13/14, 8:41 PM, Jerry wrote:
> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>
> My thought is to have something like the following:
>
>     GC.track();
>     auto obj = allocateStuff();
>     GC.cleanup(obj);
>
> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>
> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>
> Comments? Slams?
>
> Jerry

Yah, it's a classic (with the names "track" -> "mark" and "cleanup" -> "sweep"). Allocators support that already, and installing a global GC should do as well.

Andrei
Feb 13 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
> On 2/13/14, 8:41 PM, Jerry wrote:
>> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>>
>> My thought is to have something like the following:
>>
>>     GC.track();
>>     auto obj = allocateStuff();
>>     GC.cleanup(obj);
>>
>> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>>
>> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>>
>> Comments? Slams?
>>
>> Jerry
>
> Yah, it's a classic (with the names "track" -> "mark" and "cleanup" -> "sweep"). Allocators support that already, and installing a global GC should do as well.

I don't follow the global GC comment. Let's say you're using global GC in general but want to control more tightly what it's doing at a particular region of the code. Mark looks at all things that have been allocated and possibly live.

Track says keep track of objects allocated after the track call, and cleanup only looks at those objects that were recently allocated, ignoring the rest of the heap. If you're saying that allocators will provide the means of doing this, then that's fine.
Feb 14 2014
On Friday, 14 February 2014 at 11:28:11 UTC, Jerry wrote:
> Track says keep track of objects allocated after the track call, and cleanup only looks at those objects that were recently allocated, ignoring the rest of the heap.

Track cannot make sure that no reference escapes, therefore cleaning up an object could be a huge error. This would however make sense e.g. inside pure functions.
Feb 14 2014
On 2/14/14, 3:28 AM, Jerry wrote:
> Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
>> Yah, it's a classic (with the names "track" -> "mark" and "cleanup" -> "sweep"). Allocators support that already, and installing a global GC should do as well.
>
> I don't follow the global GC comment. Let's say you're using global GC in general but want to control more tightly what it's doing at a particular region of the code. Mark looks at all things that have been allocated and possibly live.

Oh, I think mark/sweep in the "mark/sweep idiom" are different from "mark & sweep garbage collector". I looked for the evidence that the idiom does exist under that name, but apparently I was wrong. Anyhow, I guess track/cleanup is less confusing.

> Track says keep track of objects allocated after the track call, and cleanup only looks at those objects that were recently allocated, ignoring the rest of the heap. If you're saying that allocators will provide the means of doing this, then that's fine.

I'm thinking of something like:

    MyAllocator alloc = ...;
    alloc.installGlobally();
    ...
    alloc.deallocateAll();
    alloc.uninstallGlobally();

Andrei
Feb 14 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
> On 2/14/14, 3:28 AM, Jerry wrote:
>> Track says keep track of objects allocated after the track call, and cleanup only looks at those objects that were recently allocated, ignoring the rest of the heap. If you're saying that allocators will provide the means of doing this, then that's fine.
>
> I'm thinking of something like:
>
>     MyAllocator alloc = ...;
>     alloc.installGlobally();
>     ...
>     alloc.deallocateAll();
>     alloc.uninstallGlobally();

The difference is that I'd like the ability for some objects to live after the region ends. I.e. it's reducing the scope of the GC, not temporarily replacing it with a completely separate heap.
Feb 14 2014
On 2/14/14, 8:26 AM, Jerry wrote:
> Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
>> On 2/14/14, 3:28 AM, Jerry wrote:
>>> Track says keep track of objects allocated after the track call, and cleanup only looks at those objects that were recently allocated, ignoring the rest of the heap. If you're saying that allocators will provide the means of doing this, then that's fine.
>>
>> I'm thinking of something like:
>>
>>     MyAllocator alloc = ...;
>>     alloc.installGlobally();
>>     ...
>>     alloc.deallocateAll();
>>     alloc.uninstallGlobally();
>
> The difference is that I'd like the ability for some objects to live after the region ends. I.e. it's reducing the scope of the GC, not temporarily replacing it with a completely separate heap.

Then I guess you'd need to use two allocators.

Andrei
Feb 14 2014
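
One way to read the "two allocators" suggestion is sketched below, using the std.experimental.allocator package that later shipped with Phobos. This is an illustration only, not the `installGlobally()` design from the thread (no such call is used here): scratch data goes into a malloc-backed `Region`, and the objects that must outlive the region are copied to the ordinary GC heap before the region is released in one step. The function name `squaresUpTo` and the 64 KiB region size are arbitrary choices for the example.

```d
import std.experimental.allocator.building_blocks.region : Region;
import std.experimental.allocator.mallocator : Mallocator;

int[] squaresUpTo(size_t n)
{
    // Allocator 1: a scratch region backed by malloc, invisible to the GC.
    auto scratch = Region!Mallocator(64 * 1024);

    // Temporary working storage taken from the region.
    auto tmp = cast(int[]) scratch.allocate(n * int.sizeof);
    foreach (i, ref x; tmp)
        x = cast(int) (i * i);

    // Allocator 2: the GC heap. Survivors are copied there explicitly.
    int[] result = tmp.dup;

    // Everything left in the region is released at once.
    scratch.deallocateAll();
    return result;
}

void main()
{
    assert(squaresUpTo(4) == [0, 1, 4, 9]);
}
```

Compared with track/cleanup, the survivors have to be chosen and copied by hand, which is exactly the difference raised in the reply above.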
On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
> My thought is to have something like the following:
>
>     GC.track();
>     auto obj = allocateStuff();
>     GC.cleanup(obj);
>
> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.

What if allocateStuff() writes address of some newly allocated object to a field of some old object existing before GC.track()? You can't just scan only objects created after GC.track(), this might create dangling references in the "old generation".
Feb 13 2014
"thedeemon" <dlang thedeemon.com> writes:
> On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
>> My thought is to have something like the following:
>>
>>     GC.track();
>>     auto obj = allocateStuff();
>>     GC.cleanup(obj);
>>
>> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>
> What if allocateStuff() writes address of some newly allocated object to a field of some old object existing before GC.track()? You can't just scan only objects created after GC.track(), this might create dangling references in the "old generation".

This is a concern. Rather than passing a single object into the cleanup, a list of objects to consider live can be passed in. That would cover at least some of these situations, but not all. Would it still be useful given this limitation? Would it give someone looking for tighter control over GC the tools they need?
Feb 14 2014
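
The escape problem and the "list of live objects" workaround can be shown with a small self-contained toy model. As before, this is a sketch with hypothetical names (`ToyHeap`, `cleanup`, `hasDangling`, `scenario`), not anything from druntime: an object allocated before track() is given a reference to a tracked object, and cleanup is then run once with only `obj` as live and once with the escaped object listed as well.

```d
// Hypothetical toy heap: objects are slots that refer to each other by index.
struct Obj { size_t[] refs; bool alive = true; }

struct ToyHeap
{
    Obj[] objs;
    size_t trackStart;

    size_t alloc() { objs ~= Obj(); return objs.length - 1; }
    void track() { trackStart = objs.length; }

    // Frees every tracked object not reachable from `live` via tracked objects.
    void cleanup(size_t[] live...)
    {
        auto marked = new bool[](objs.length);
        size_t[] work = live.dup;
        while (work.length)
        {
            immutable i = work[$ - 1];
            work = work[0 .. $ - 1];
            if (i < trackStart || marked[i]) continue;
            marked[i] = true;
            work ~= objs[i].refs;
        }
        foreach (i; trackStart .. objs.length)
            if (!marked[i]) objs[i].alive = false;
        trackStart = objs.length;
    }
}

// True if some live object still points at a freed one.
bool hasDangling(ref ToyHeap h)
{
    foreach (o; h.objs)
        if (o.alive)
            foreach (r; o.refs)
                if (!h.objs[r].alive) return true;
    return false;
}

// Builds the problematic situation and returns [obj, hidden].
size_t[2] scenario(ref ToyHeap h)
{
    auto old = h.alloc();          // exists before track()
    h.track();
    auto obj = h.alloc();
    auto hidden = h.alloc();
    h.objs[old].refs ~= hidden;    // old object now points into the region
    return [obj, hidden];
}

void main()
{
    ToyHeap a;
    auto r = scenario(a);
    a.cleanup(r[0]);               // only obj is declared live
    assert(hasDangling(a));        // the old object points at freed memory

    ToyHeap b;
    r = scenario(b);
    b.cleanup(r[0], r[1]);         // the caller also lists the escaped object
    assert(!hasDangling(b));
}
```

The second call is only safe because the caller happened to know about the escaped reference, which is the limitation acknowledged in the reply: the list covers some of these situations, but not all.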
On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>
> My thought is to have something like the following:
>
>     GC.track();
>     auto obj = allocateStuff();
>     GC.cleanup(obj);
>
> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>
> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>
> Comments? Slams?
>
> Jerry

How do you imagine it working in multi-core programs? Does it only track thread-local allocations?
Feb 13 2014
"Paulo Pinto" <pjmlp progtools.org> writes:
> On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
>> My thought is to have something like the following:
>>
>>     GC.track();
>>     auto obj = allocateStuff();
>>     GC.cleanup(obj);
>>
>> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>>
>> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>
> How do you imagine it working in multi-core programs? Does it only track thread-local allocations?

I think this can be handled by storing the thread that requests tracking, and then each allocation is tracked if it's done from the same thread that requested tracking. Then cleanup just considers the objects that were tracked.
Feb 14 2014
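
A rough sketch of the per-thread idea follows. Instead of storing a thread id, it relies on the fact that module-level variables in D are thread-local by default, which has the same effect: only allocations made by the thread that called track() are counted. The names `tracking`, `trackedCount`, `track`, `trackedNew` and `demo` are hypothetical, and a real implementation would live inside the allocator rather than in a wrapper like this.

```d
import core.thread : Thread;

// Module-level variables are thread-local by default in D,
// so every thread gets its own copy of these two.
bool tracking;
size_t trackedCount;

void track()
{
    tracking = true;
    trackedCount = 0;
}

// Hypothetical allocation wrapper: records the allocation only if the
// calling thread has requested tracking.
T* trackedNew(T)()
{
    if (tracking) ++trackedCount;
    return new T;
}

// Returns [count seen by the tracking thread, count seen by another thread].
size_t[2] demo()
{
    track();
    auto a = trackedNew!int();          // counted: same thread as track()

    size_t seenByOther = size_t.max;
    auto t = new Thread({
        auto b = trackedNew!int();      // not counted: this thread never called track()
        seenByOther = trackedCount;
    });
    t.start();
    t.join();

    return [trackedCount, seenByOther];
}

void main()
{
    auto r = demo();
    assert(r[0] == 1);
    assert(r[1] == 0);
}
```

This says nothing about objects that are handed to another thread after allocation; those run into the same escaping-reference issue discussed earlier in the thread.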
On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>
> My thought is to have something like the following:
>
>     GC.track();
>     auto obj = allocateStuff();
>     GC.cleanup(obj);
>
> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>
> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>
> Comments? Slams?
>
> Jerry

Looks like DIP 46: http://wiki.dlang.org/DIP46

I like the idea.
Feb 14 2014
On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>
> My thought is to have something like the following:
>
>     GC.track();
>     auto obj = allocateStuff();
>     GC.cleanup(obj);
>
> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>
> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>
> Comments? Slams?
>
> Jerry

A programmer's aim is to tell the computer what to do. The purpose of the GC is to help him prevent problems. By default, AFAIK, the GC considers every part of memory in case there are references in it. Well, if the time-consuming process is scanning all memory, the programmer could tell the GC, if he/she is confident about correctness, not to scan some parts of memory to limit the scanning area. For example, if I create a char array of 10,000 items, why would I want the GC to scan it? I won't put any object references in it for sure.
Feb 14 2014
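
For the 10,000-item buffer, druntime already offers a way to say "do not scan this block": the `GC.BlkAttr.NO_SCAN` attribute in `core.memory`. The sketch below allocates such a block directly with `GC.malloc`; the helper name `rawBuffer` is made up for the example. As far as I know, arrays of pointer-free element types such as `char[]` created with `new` are already allocated this way, so the explicit form mainly matters for raw or untyped memory.

```d
import core.memory : GC;

// Returns a GC-owned buffer that the collector will not scan for pointers.
char[] rawBuffer(size_t n)
{
    auto p = cast(char*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    auto buf = rawBuffer(10_000);
    buf[] = 'x';                       // plain data, no references stored

    // The block is still collected when unreachable; it is just never
    // searched for pointers to other objects.
    assert(buf.length == 10_000);
    assert(GC.getAttr(buf.ptr) & GC.BlkAttr.NO_SCAN);
}
```

Storing the only reference to a GC object inside such a block would be a bug, since the collector would not see it.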
On Friday, 14 February 2014 at 09:01:09 UTC, tcak wrote:
> On Friday, 14 February 2014 at 04:41:43 UTC, Jerry wrote:
>> Hi all, I just had the following thought on limiting the gc in regions. I don't know if this would address some of Manu's concerns, but here goes:
>>
>> My thought is to have something like the following:
>>
>>     GC.track();
>>     auto obj = allocateStuff();
>>     GC.cleanup(obj);
>>
>> The idea here is that track() tells GC to explicitly track all objects created from that point until the cleanup call. The cleanup() call tells gc to limit its collection to those objects allocated since the track() call. The obj parameter tells gc to consider obj live.
>>
>> This way, you can avoid tracking everything that may get created, but you can limit how much work gets done.
>>
>> Comments? Slams?
>>
>> Jerry
>
> A programmer's aim is to tell the computer what to do. The purpose of the GC is to help him prevent problems. By default, AFAIK, the GC considers every part of memory in case there are references in it. Well, if the time-consuming process is scanning all memory, the programmer could tell the GC, if he/she is confident about correctness, not to scan some parts of memory to limit the scanning area. For example, if I create a char array of 10,000 items, why would I want the GC to scan it? I won't put any object references in it for sure.

This only works when you are the only guy on the team and have a small codebase to visualize in your head.

The moment a medium-sized team comes into play, it is chaos.

There is a reason why manual memory managed languages have lost their place in the enterprise.

--
Paulo
Feb 14 2014
>> A programmer's aim is to tell the computer what to do. The purpose of the GC is to help him prevent problems. By default, AFAIK, the GC considers every part of memory in case there are references in it. Well, if the time-consuming process is scanning all memory, the programmer could tell the GC, if he/she is confident about correctness, not to scan some parts of memory to limit the scanning area. For example, if I create a char array of 10,000 items, why would I want the GC to scan it? I won't put any object references in it for sure.
>
> This only works when you are the only guy on the team and have a small codebase to visualize in your head. The moment a medium-sized team comes into play, it is chaos. There is a reason why manual memory managed languages have lost their place in the enterprise. -- Paulo

Many people want to disable the GC to improve performance (if there are other reasons, they are not covered here). If memory problems start after adding new code, just disable the GC-disabled code parts (as in my example with that 10,000-item array). This way, errors will disappear and performance may decrease a little. Then fixing can be done to increase performance again.

I think enabling GC for only some parts of code is wrong. It should be disabling it for some parts of code. This way, if the programmer loses control of memory, he/she can remove the GC-disabling code, and tada, everything works correctly without any other changes.
Feb 14 2014
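
The "disable it for some parts of the code" approach maps fairly directly onto the existing `GC.disable()` and `GC.enable()` calls in `core.memory`. A minimal sketch, with `buildTable` as a made-up example function: automatic collections are suppressed while the table is built, and `scope (exit)` re-enables them on every exit path. Note that `GC.disable()` only suppresses automatic collections (the runtime may still collect when it runs out of memory), and allocations made meanwhile remain ordinary GC memory.

```d
import core.memory : GC;

int[] buildTable(size_t n)
{
    GC.disable();                  // no automatic collections in this region
    scope (exit) GC.enable();      // restored even if an exception is thrown

    int[] table;
    foreach (i; 0 .. n)
        table ~= cast(int) (i * i);   // still allocates from the GC heap
    return table;
}

void main()
{
    assert(buildTable(4) == [0, 1, 4, 9]);
}
```

Deleting the two GC lines leaves a program with the same behaviour, only with collections allowed again, which matches the "remove the GC-disabling code and everything still works" argument above.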
On 14.02.2014 16:46, tcak wrote:
>>> A programmer's aim is to tell the computer what to do. The purpose of the GC is to help him prevent problems. By default, AFAIK, the GC considers every part of memory in case there are references in it. Well, if the time-consuming process is scanning all memory, the programmer could tell the GC, if he/she is confident about correctness, not to scan some parts of memory to limit the scanning area. For example, if I create a char array of 10,000 items, why would I want the GC to scan it? I won't put any object references in it for sure.
>>
>> This only works when you are the only guy on the team and have a small codebase to visualize in your head. The moment a medium-sized team comes into play, it is chaos. There is a reason why manual memory managed languages have lost their place in the enterprise. -- Paulo
>
> Many people want to disable the GC to improve performance (if there are other reasons, they are not covered here). If memory problems start after adding new code, just disable the GC-disabled code parts (as in my example with that 10,000-item array). This way, errors will disappear and performance may decrease a little. Then fixing can be done to increase performance again.
>
> I think enabling GC for only some parts of code is wrong. It should be disabling it for some parts of code. This way, if the programmer loses control of memory, he/she can remove the GC-disabling code, and tada, everything works correctly without any other changes.

Again, this example only works when you are the only guy working on the code.

For example, projects of the size of the Linux kernel are only viable in languages like C because there are guys validating every single line of code that gets added to the kernel.

In most projects that is far from the truth; everyone just checks whatever they feel like.

Then when the thing blows up at the customer and there are high escalation meetings going on, there are a few poor souls, usually senior developers, going over commit history and using tools like Insure++ to track down the issue.

Sometimes it takes a whole week to track down such culprits. I don't miss those days.

--
Paulo
Feb 14 2014
"tcak" <tcak pcak.com> writes:
> Many people want to disable the GC to improve performance (if there are other reasons, they are not covered here). If memory problems start after adding new code, just disable the GC-disabled code parts (as in my example with that 10,000-item array). This way, errors will disappear and performance may decrease a little. Then fixing can be done to increase performance again.
>
> I think enabling GC for only some parts of code is wrong. It should be disabling it for some parts of code. This way, if the programmer loses control of memory, he/she can remove the GC-disabling code, and tada, everything works correctly without any other changes.

My proposal was to leave GC enabled for the whole program. The track and cleanup call pair is intended to narrow the scope of GC in some regions of the code.
Feb 14 2014