digitalmars.D - Componentizing D's garbage collector
- Andrei Alexandrescu (41/41) Jan 10 2014 The GC comes up repeatedly in discussions around here. I've been
- Joseph Rushton Wakeling (4/14) Jan 10 2014 Can you recommend some good background reading for those of us who would...
- Jacob Carlborg (6/9) Jan 10 2014 There's the classic The Garbage Collection Handbook. Here in a later
- Paulo Pinto (5/29) Jan 10 2014 Two good books
- Bienlein (8/11) Jan 10 2014 You mean about how GCs work? I have the book "Garbage Collection:
- Bienlein (3/14) Jan 10 2014 Seems like we were all in parallel with our book recommendations;
- Jacob Carlborg (4/6) Jan 10 2014 That would hopefully indicate they're good recommendations :)
- Regan Heath (12/15) Jan 10 2014 IMO, the Microsoft documentation for the C# GC is a good place to start.
- Joseph Cassman (8/12) Jan 10 2014 I like the thesis "Scheduling Garbage Collection in Embedded
- Joseph Rushton Wakeling (4/8) Jan 10 2014 Interesting -- I'll have to look that up.
- Joseph Rushton Wakeling (3/7) Jan 10 2014 The link, for anyone who's interested:
- Dmitry Olshansky (20/47) Jan 10 2014 O.T.: Please use full URLs, it's not that long.
- Andrei Alexandrescu (8/37) Jan 10 2014 You know I wouldn't write that unless I had an ace up my sleeve. Watch
- H. S. Teoh (17/48) Jan 10 2014 I don't see why having a templated mark() is a problem. Nothing says you
- Joseph Rushton Wakeling (2/3) Jan 10 2014 Ahh, we're getting an allocation-based Dance of the Seven Veils here? :-...
- bioinfornatics (5/5) Jan 10 2014 One big generic problem for GC that is to manage efficiently a
- Sean Kelly (19/42) Jan 10 2014 This only exists because the runtime is statically linked and
- Sean Kelly (3/3) Jan 10 2014 Oh, I assume this will eventually include your ideas about having
- Andrei Alexandrescu (4/7) Jan 10 2014 Yah, such integration would make a lot of sense. Thanks for the
- Benjamin Thaut (2/4) Jan 10 2014 Just one question. Did you read "the garbage collection handbook"?
- Andrei Alexandrescu (4/10) Jan 10 2014 Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
- Benjamin Thaut (13/24) Jan 11 2014 Very good,
- Andrei Alexandrescu (10/34) Jan 11 2014 I'm not considering done as much as separable as a concern. All I'm
- Benjamin Thaut (15/52) Jan 11 2014 Well as far as my understanding of GCs goes, you have two options:
- Andrei Alexandrescu (5/10) Jan 11 2014 The way I see it/hope it turns out, precision is a separate concern from...
- =?ISO-8859-1?Q?S=F6nke_Ludwig?= (5/10) Jan 11 2014 The language can help a lot there (pure... and scope, if "fixed"), so
- Rainer Schuetze (8/65) Jan 12 2014 I think a moving collector is currently not feasible without restricting...
- Benjamin Thaut (8/11) Jan 12 2014 Could you give an example which part of the language would not be doable...
- Jakob Ovrum (7/10) Jan 12 2014 We don't necessarily need that, though. In D we have a plethora
- Benjamin Thaut (3/13) Jan 12 2014 You should really try to write non-GC D code some time. You would be
- Joseph Rushton Wakeling (3/5) Jan 12 2014 Is that down to the language, or parts of druntime and phobos
- Benjamin Thaut (2/7) Jan 12 2014 Both. Although it kept getting better lately.
- Jakob Ovrum (3/5) Jan 12 2014 I was, years ago.
- Andrei Alexandrescu (4/17) Jan 12 2014 Yah, moving would be real nice. I hope to at least clarify the issues
- Brian Rogoff (8/23) Jan 12 2014 I'd rather have more restrictions and a working precise GC, and
- Benjamin Thaut (8/9) Jan 12 2014 I don't see any problem with pointer arithmetic? Either the pointer is
- Sean Kelly (3/10) Jan 12 2014 Or you don't know if something is a pointer or not and so
- Brad Anderson (5/15) Jan 12 2014 The garbage collection page of the D spec actually talks a lot
- Rainer Schuetze (29/45) Jan 13 2014 Maybe I'm too pessimistic ;-) I guess moving in general could be
- Jacob Carlborg (6/19) Jan 13 2014 Could we have a segregated heap for C pointers? Would that help?
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (15/18) Jan 13 2014 Is it possible to declare whether the C-function retains a
- Dmitry Olshansky (6/26) Jan 13 2014 How would it be expensive? I don't see a need to do anything to "pin" a
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (3/7) Jan 13 2014 How do you know that it is a pointer?
- Dmitry Olshansky (6/11) Jan 13 2014 Precisely. Since C space has no type information you have to
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (6/8) Jan 13 2014 I think we are misunderstanding each other? I don't think a Boehm
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (2/2) Jan 13 2014 …and scanning of C data segments is insufficient anyway, since
- Dmitry Olshansky (7/9) Jan 13 2014 A stack is a stack is a stack.
- Dmitry Olshansky (6/13) Jan 13 2014 So what? It can't move it and as such there is nothing special about it,...
- Timon Gehr (5/17) Jan 13 2014 Yes, there is a cost. The requirement to pin arbitrary objects at
- Jacob Carlborg (4/8) Jan 14 2014 Can't these object be pinned somewhere else?
- Benjamin Thaut (4/10) Jan 14 2014 Yes, thats the usual approach. But that means that you have to copy them...
- Jacob Carlborg (4/7) Jan 14 2014 Is there a problem to allocate them in the correct space from the beginn...
- Benjamin Thaut (6/12) Jan 14 2014 Well you have to know it at allocation time. If you have a function that...
- Timon Gehr (3/16) Jan 14 2014 Once it becomes known that an object should be pinned, it is too late
- Benjamin Thaut (4/21) Jan 14 2014 Well no, usually you have to pin objects _before_ passing them to
- Timon Gehr (8/30) Jan 14 2014 Well, there is no such manual mechanism in place, so only strategies
- Benjamin Thaut (3/36) Jan 14 2014 Yes, but pinning will still be necessary for passing data to C/C++
- Timon Gehr (2/4) Jan 14 2014 (I agree, it is the spec that does not.)
- Timon Gehr (10/15) Jan 14 2014 BTW, our misunderstandings are rooted here:
- Dmitry Olshansky (7/29) Jan 14 2014 If one accepts that any heap can have object that is pinned I don't see
- Tofu Ninja (3/3) Jan 14 2014 Why are unions with pointers even allowed in the first place? It
- Benjamin Thaut (5/8) Jan 14 2014 Yes, low level programming. D is supposed to be a systems programmin
- deadalnix (7/10) Jan 14 2014 Tagged unions are very useful. For instance, to take a
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (11/13) Jan 13 2014 Well, the advantage of compaction is that you (in theory) can
- Dmitry Olshansky (13/24) Jan 14 2014 Bleh. If anything you may as well lose it - manual memory management is
- woh (7/7) Jan 14 2014 If Java with all it that time and money can not create a
- Paulo Pinto (2/9) Jan 14 2014 Are you aware of Azul's GC or some real time VMs like Aonix PERC?
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (8/15) Jan 14 2014 Under what assumption? There is not optimal garbage collector,
- Benjamin Thaut (6/20) Jan 13 2014 I also don't see how that would be expenive. Pinning is a common pratice...
- Rainer Schuetze (32/42) Jan 15 2014 The stack reference can be moved outside the memory visible to
- deadalnix (4/12) Jan 13 2014 I had some similar reflection. I'm not sure moving is worthwhile
- Walter Bright (7/19) Jan 13 2014 A moving GC is already supported by D's semantics.
- Brad Roberts (9/29) Jan 13 2014 The key advantage that moving gives to a GC is to allow it to have a blo...
- Walter Bright (2/12) Jan 13 2014 The reason pinning doesn't particularly impede this is because pinning i...
- Timon Gehr (2/11) Jan 13 2014 In Java or in D? Eg. there are no unions in Java.
- Paulo Pinto (3/16) Jan 13 2014 C# has unions.
- Timon Gehr (3/20) Jan 13 2014 "The common language runtime controls the physical layout of the data
- Paulo Pinto (9/37) Jan 14 2014 Did you also read the remaining part of the page, or just looked
- Timon Gehr (13/48) Jan 14 2014 Look, you hadn't done anything besides pasting for backing up your point...
- Paulo Pinto (8/66) Jan 15 2014 Excuse me for being stupid.
- Benjamin Thaut (7/14) Jan 13 2014 I would prefer a option where the user can specifiy a function to scan
- Walter Bright (3/20) Jan 13 2014 I agree, but I was trying to correct the misperception that current D do...
- Benjamin Thaut (9/11) Jan 14 2014 Current D does not allow a moving collector because of the lack of
- Walter Bright (2/9) Jan 14 2014 This is not true, I've implemented one for Java.
- Benjamin Thaut (2/16) Jan 14 2014 So how do you implement a semi-space GC with pinned objects?
- renoX (6/29) Jan 14 2014 You manage the pinned object in a different list that the
- Benjamin Thaut (8/32) Jan 14 2014 I meant a pure semi-space collector. A pure semi-space collector can not...
- Walter Bright (2/5) Jan 14 2014 Be more flexible in what the semi-spaces are.
- Benjamin Thaut (4/10) Jan 14 2014 So you just don't. Effectivley you work around the problem. I would
- Walter Bright (6/16) Jan 14 2014 My experience with book algorithms is they usually require some adjustme...
- Benjamin Thaut (2/3) Jan 14 2014 Then this was a big misunderstanding.
- Frustrated (35/40) Jan 14 2014 So, the issue seems to be that everyone is treating the GC as the
- deadalnix (3/45) Jan 14 2014 Try to do the pull request that goes with that, and we will be
- bearophile (6/14) Jan 13 2014 Time ago I suggested an optional standard method for unions of
- Dmitry Olshansky (8/31) Jan 12 2014 I might be ignorant but why can't we make "mostly-moving" collector?
- Benjamin Thaut (17/47) Jan 12 2014 A semi-space garbage collector is best fitted for small short lived
- sunspyre (27/28) Jan 12 2014 I just want to point out that from an outer perspective, this is
- qznc (8/15) Jan 13 2014 That is not technically possible. A truly concurrent GC has heavy
- Marco Leise (11/68) Jan 10 2014 Nothing to destroy yet. But it does feel good that someone is
- Rainer Schuetze (17/59) Jan 10 2014 You mean the proxy functions? Yes, please axe 'em, that only works in
- Andrei Alexandrescu (13/51) Jan 10 2014 I need to write some code to explain it all. An figure it all :o).
- Rainer Schuetze (8/53) Jan 11 2014 I was thinking about using the proposed module info extension to
- Jacob Carlborg (5/8) Jan 11 2014 You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ?...
- Rainer Schuetze (4/10) Jan 12 2014 Yes. I'm not sure if there is enough compiler support yet to figure out
- evansl (9/17) Jan 13 2014 [snip]
- Rainer Schuetze (8/35) Jan 15 2014 std.emplace can be used on a partial memory block (e.g. as part
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (16/20) Jan 15 2014 This is a good point. It is a mistake to mix manual memory layout
- evansl (23/55) Feb 28 2014 I'm afraid I'm not familiar with much of what you're talking about.
- Walter Bright (3/8) Jan 10 2014 Rainer, I want to apologize to you for not getting your efforts the atte...
- deadalnix (18/21) Jan 10 2014 One of the key to a fast GC is segregating the heap in smaller
- H. S. Teoh (19/43) Jan 10 2014 Which unfortunately is not an option in D if write barriers are
- Rainer Schuetze (8/30) Jan 11 2014 This might work with safe D. I haven't yet figured the consequences of
- Timon Gehr (3/7) Jan 10 2014 I assume you are aware that there is implicit casting to immutable upon
- deadalnix (2/14) Jan 10 2014 Is it a bird ? Is it a plane ? No it is *ISOLATED* !
- Andrei Alexandrescu (4/12) Jan 10 2014 I don't know. Need to think about it. Maybe it's a wrong design decision...
- H. S. Teoh (10/23) Jan 10 2014 [...]
- Nicholas Londey (10/10) Jan 13 2014 In theory you could use a region allocator for a pure function
- =?ISO-8859-1?Q?S=F6nke_Ludwig?= (4/12) Jan 12 2014 I'd say the easiest way to deal with it is to let the compiler emit a
- Chris Williams (29/34) Jan 14 2014 An additional feature that I would like to suggest is the concept
- Rikki Cattermole (4/42) Jan 14 2014 I really love this idea. However I'm not sure how kernels would
- Chris Williams (7/10) Jan 15 2014 Sleep allows a thread to give way to another thread within the
- Rikki Cattermole (4/15) Jan 15 2014 I'm referring to cpu prioritization. For the process. It'll
The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well. I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it. A few highlights: * The design will foster the small, composable components with required/optional primitives that worked for std.allocator. * I plan to review and probably discard all of the pointers-to-functions nonsense in the current GC. * At this point I won't worry about discovering roots; I assume druntime has the appropriate mechanisms in place. Instead I'll focus on primitives such as "given this root, mark all that transitively follows from it" (conservatively or not). * I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in. * I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner. * There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.) * At this point I'm unclear on how generations can be componentized, but am cautiously optimistic. Will see once I get to it. One thing that would be great now would be to make an effort to review and merge the current precise GC work. I'm sure it will be of great help with breaking into components. Destroy. Andrei
Jan 10 2014
On 10/01/14 09:03, Andrei Alexandrescu wrote:The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well. I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it.Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?
Jan 10 2014
On 2014-01-10 10:56, Joseph Rushton Wakeling wrote:Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?There's the classic The Garbage Collection Handbook. Here in a later edition: http://www.amazon.com/The-Garbage-Collection-Handbook-Management/dp/1420082795 -- /Jacob Carlborg
Jan 10 2014
On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton Wakeling wrote:On 10/01/14 09:03, Andrei Alexandrescu wrote:Two good books http://www.amazon.de/The-Garbage-Collection-Handbook-Management/dp/1420082795 http://www.amazon.de/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well. I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it.Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?
Jan 10 2014
On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton Wakeling wrote:Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?You mean about how GCs work? I have the book "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by Richard Jones, Rafael D Lins. I can recommend it. It is not too technical that it gets too hard. But it doesn't tell the reader how to write a GC. It's more an overview of various approaches in GC construction.
Jan 10 2014
On Friday, 10 January 2014 at 10:05:45 UTC, Bienlein wrote:On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton Wakeling wrote:Seems like we were all in parallel with our book recommendations; only some minutes apart ...Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?You mean about how GCs work? I have the book "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by Richard Jones, Rafael D Lins. I can recommend it. It is not too technical that it gets too hard. But it doesn't tell the reader how to write a GC. It's more an overview of various approaches in GC construction.
Jan 10 2014
On 2014-01-10 11:30, Bienlein wrote:Seems like we were all in parallel with our book recommendations; only some minutes apart ...That would hopefully indicate they're good recommendations :) -- /Jacob Carlborg
Jan 10 2014
On Fri, 10 Jan 2014 09:56:45 -0000, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?http://msdn.microsoft.com/en-us/library/ms973837.aspx It's an overview of a specific collector so you get a concrete understand of one way to do it, which you can then use as a foundation to build understanding on etc. It's the only thing I've read on GC and I understood (at least at the surface level) Andrei's post completely. Plus it's not too long, and available online for free :) R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jan 10 2014
On Friday, 10 January 2014 at 09:57:18 UTC, Joseph Rushton Wakeling wrote:On 10/01/14 09:03, Andrei Alexandrescu wrote: Can you recommend some good background reading for those of us who would love to have some input (or at least insight) to this, but don't yet have the theoretical understanding?I like the thesis "Scheduling Garbage Collection in Embedded Systems" by Roger Henriksson. It shows how GC can work with various types of systems, including real-time embedded control systems. It includes a comparison of GC styles. And disects an actual algorithm. Joseph
Jan 10 2014
On 10/01/14 23:25, Joseph Cassman wrote:I like the thesis "Scheduling Garbage Collection in Embedded Systems" by Roger Henriksson. It shows how GC can work with various types of systems, including real-time embedded control systems. It includes a comparison of GC styles. And disects an actual algorithm.Interesting -- I'll have to look that up. Thanks to everyone for your suggestions -- some good new books to add to the reading list! :-)
Jan 10 2014
On 10/01/14 23:25, Joseph Cassman wrote:I like the thesis "Scheduling Garbage Collection in Embedded Systems" by Roger Henriksson. It shows how GC can work with various types of systems, including real-time embedded control systems. It includes a comparison of GC styles. And disects an actual algorithm.The link, for anyone who's interested: http://www.cs.lth.se/home/Roger_Henriksson/thesis.pdf
Jan 10 2014
10-Jan-2014 12:03, Andrei Alexandrescu пишет:The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win.O.T.: Please use full URLs, it's not that long. Also how about actually starting formal process to include it as core.alllocator (package preferably). This activity can go in parallel with with getting a through reveiw of currrent GC and taking steps towards precise GC into the mainstream.I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it. A few highlights: * The design will foster the small, composable components with required/optional primitives that worked for std.allocator. * I plan to review and probably discard all of the pointers-to-functions nonsense in the current GC.Cool so far.* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.This will not work AFAICT. Since nobody can tell what kind of type a user-defined program may have and GC is certainly not templated on all types. GC would have to rely on TypeInfos or rather RTInfo-s that describe the layout of Ts in a uniform way (bitmaps to indicate pointers, etc.). I think Walter had some ideas on how to represent these to reduce space.* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)Would that mean 3 separate heaps, so that addresses given for immutable stuff would never be (sometime later) assigned to (thread-local) mutable? I might have got this point of yours wrong, but I think there is a merit in having separate heap at least for shared stuff. -- Dmitry Olshansky
Jan 10 2014
On 1/10/14 3:29 AM, Dmitry Olshansky wrote:10-Jan-2014 12:03, Andrei Alexandrescu пишет:You know I wouldn't write that unless I had an ace up my sleeve. Watch me :o).* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.This will not work AFAICT. Since nobody can tell what kind of type a user-defined program may have and GC is certainly not templated on all types. GC would have to rely on TypeInfos or rather RTInfo-s that describe the layout of Ts in a uniform way (bitmaps to indicate pointers, etc.). I think Walter had some ideas on how to represent these to reduce space.That design will disappear. The attribute won't be settable once initialized, and will never be conservative ("scan all words in this block").* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.Yah, separation of shared data allocation is crucial. We'll see. Andrei* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)Would that mean 3 separate heaps, so that addresses given for immutable stuff would never be (sometime later) assigned to (thread-local) mutable? I might have got this point of yours wrong, but I think there is a merit in having separate heap at least for shared stuff.
Jan 10 2014
On Fri, Jan 10, 2014 at 08:22:56AM -0800, Andrei Alexandrescu wrote:On 1/10/14 3:29 AM, Dmitry Olshansky wrote:I don't see why having a templated mark() is a problem. Nothing says you can't do this: module core.gc; void mark(T)(ref T obj) { size_t objSize = T.sizeof; // ... and whatever other type-specific info you need // here. // N.B.: non-template function call __mark_impl(cast(void*)&obj, objSize, /* ... whatever else is needed */); }10-Jan-2014 12:03, Andrei Alexandrescu пишет:You know I wouldn't write that unless I had an ace up my sleeve. Watch me :o).* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.This will not work AFAICT. Since nobody can tell what kind of type a user-defined program may have and GC is certainly not templated on all types. GC would have to rely on TypeInfos or rather RTInfo-s that describe the layout of Ts in a uniform way (bitmaps to indicate pointers, etc.). I think Walter had some ideas on how to represent these to reduce space.[...] So this will be the beginning of the precise GC that we've been talking about for who knows how long now? I like that! T -- Любишь кататься - люби и саночки возить.That design will disappear. The attribute won't be settable once initialized, and will never be conservative ("scan all words in this block").* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.Yup. I though GC already did something like that, with BlkAttr.NO_SCAN.
Jan 10 2014
On 10/01/14 17:22, Andrei Alexandrescu wrote:You know I wouldn't write that unless I had an ace up my sleeve. Watch me :o).Ahh, we're getting an allocation-based Dance of the Seven Veils here? :-)
Jan 10 2014
One big generic problem for GC that is to manage efficiently a huge tree. More you program use memory for GC will go slower. Maybe to have a GC profile for these applications could to be a good thing.
Jan 10 2014
On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu wrote:The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. * The design will foster the small, composable components with required/optional primitives that worked for std.allocator. * I plan to review and probably discard all of the pointers-to-functions nonsense in the current GC.This only exists because the runtime is statically linked and supporting D in a DLL was a requirement. Grafting together GCs from multiple runtimes was the easiest way to do this at the time, but the correct solution is really just to put the runtime in a DLL and jettison the whole idea of a GC proxy. In short, I'm all for it.* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.This will take some work. Maybe thread_scanAll can take an alias to such a function instead of a function pointer as it does today. Something along those lines anyway. In short, the template part is what's tricky about this approach.* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.Yay! I don't like how the current GC mixes these and indicates the presence of pointers via a bit flag.* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I think this is reasonable, since it also may affect which pool the data is allocated in. I really would love to aim for a concurrent allocator scheme. Kinda like HOARD but with a GC attached. But this is really only feasible if you make some hard assertions about type conversion.
Jan 10 2014
Oh, I assume this will eventually include your ideas about having the GC manage mapped files? That one has lingered on the "someday" pile for too long.
Jan 10 2014
On 1/10/14 10:57 AM, Sean Kelly wrote:Oh, I assume this will eventually include your ideas about having the GC manage mapped files? That one has lingered on the "someday" pile for too long.Yah, such integration would make a lot of sense. Thanks for the reminder, I'd forgotten about it. Andrei
Jan 10 2014
Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:Destroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 10 2014
On 1/10/14 11:34 AM, Benjamin Thaut wrote:Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:Yah, the new edition. Unfortunately I didn't find a lot of new stuff or things that would help in a practical implementation. AndreiDestroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 10 2014
Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:On 1/10/14 11:34 AM, Benjamin Thaut wrote:Very good, In my opinion it should be required for everyone who wants to participate in D's GC to read that book. So everyone has at least a basic understanding of the problem at Hand. But if you read the book I don't understand, why you simply declare the hardest problem, percise pointer discovery, as done. To be able to actually implement and test a GC that is not a reimplementation of what we already have one needs percise pointer discovery of _all_ pointers, write barriers and GC halting points. So please enlighten me why you simply decalre this done? Kind Regards Benjamin ThautAm 10.01.2014 09:03, schrieb Andrei Alexandrescu:Yah, the new edition. Unfortunately I didn't find a lot of new stuff or things that would help in a practical implementation. AndreiDestroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 11 2014
On 1/11/14 3:52 AM, Benjamin Thaut wrote:Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:I'm not considering done as much as separable as a concern. All I'm saying is I hope to be able to separate the low-level part of discovering roots from the high-level part of marking used memory. BTW the way I see this done is "mostly precise", i.e. there will be at least for a while some words that will be handled conservatively (stack, registers, certain union members). If there's anything in the GC book that suggest that would be impossible, please do let me know! AndreiOn 1/10/14 11:34 AM, Benjamin Thaut wrote:Very good, In my opinion it should be required for everyone who wants to participate in D's GC to read that book. So everyone has at least a basic understanding of the problem at Hand. But if you read the book I don't understand, why you simply declare the hardest problem, percise pointer discovery, as done. To be able to actually implement and test a GC that is not a reimplementation of what we already have one needs percise pointer discovery of _all_ pointers, write barriers and GC halting points. So please enlighten me why you simply decalre this done?Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:Yah, the new edition. Unfortunately I didn't find a lot of new stuff or things that would help in a practical implementation. AndreiDestroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 11 2014
Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:On 1/11/14 3:52 AM, Benjamin Thaut wrote:Well as far as my understanding of GCs goes, you have two options: 1) Impercise pointer discovery => limiting yourself to a mark & sweep 2) Percise pointer disccovery => option to use a semi space GC in combination with a mark & sweep and generations, which allows for superior handling of short lived allocations (which is the biggest problem of D's current GC). Also without percise pointer discovery you will never be able to move objects from one heap into another. This would be especially a problem for the immutable heap, because you might want to allocate all strings on the thread local heap until you discover that you actually need them to be shared between threads. You might also need to move objects into antoher heap whenever a casts happens or global variable is assigned. Kind Regards Benjamin ThautAm 10.01.2014 20:40, schrieb Andrei Alexandrescu:I'm not considering done as much as separable as a concern. All I'm saying is I hope to be able to separate the low-level part of discovering roots from the high-level part of marking used memory. BTW the way I see this done is "mostly precise", i.e. there will be at least for a while some words that will be handled conservatively (stack, registers, certain union members). If there's anything in the GC book that suggest that would be impossible, please do let me know! AndreiOn 1/10/14 11:34 AM, Benjamin Thaut wrote:Very good, In my opinion it should be required for everyone who wants to participate in D's GC to read that book. So everyone has at least a basic understanding of the problem at Hand. But if you read the book I don't understand, why you simply declare the hardest problem, percise pointer discovery, as done. To be able to actually implement and test a GC that is not a reimplementation of what we already have one needs percise pointer discovery of _all_ pointers, write barriers and GC halting points. So please enlighten me why you simply decalre this done?Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:Yah, the new edition. Unfortunately I didn't find a lot of new stuff or things that would help in a practical implementation. AndreiDestroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 11 2014
On 1/11/14 1:20 PM, Benjamin Thaut wrote:1) Impercise pointer discovery => limiting yourself to a mark & sweep 2) Percise pointer disccovery => option to use a semi space GC in combination with a mark & sweep and generations, which allows for superior handling of short lived allocations (which is the biggest problem of D's current GC).The way I see it/hope it turns out, precision is a separate concern from what I'm working on now, which is tracing. Andrei P.S. s/percise/precise/g
Jan 11 2014
Am 11.01.2014 22:20, schrieb Benjamin Thaut:Also without percise pointer discovery you will never be able to move objects from one heap into another. This would be especially a problem for the immutable heap, because you might want to allocate all strings on the thread local heap until you discover that you actually need them to be shared between threads.The language can help a lot there (pure... and scope, if "fixed"), so that the compiler can emit the necessary calls to make this explicit, because it already knows that it's safe without performing a garbage collection cycle.
Jan 11 2014
On 11.01.2014 22:20, Benjamin Thaut wrote:Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general. We should aim at something in between mark & sweep and compacting generational collection, e.g. a non-moving collector that keeps track of the youngest generation (I just made that up, not sure if it is realistic). Adding concurrency would also be nice...On 1/11/14 3:52 AM, Benjamin Thaut wrote:Well as far as my understanding of GCs goes, you have two options: 1) Impercise pointer discovery => limiting yourself to a mark & sweep 2) Percise pointer disccovery => option to use a semi space GC in combination with a mark & sweep and generations, which allows for superior handling of short lived allocations (which is the biggest problem of D's current GC). Also without percise pointer discovery you will never be able to move objects from one heap into another. This would be especially a problem for the immutable heap, because you might want to allocate all strings on the thread local heap until you discover that you actually need them to be shared between threads. You might also need to move objects into antoher heap whenever a casts happens or global variable is assigned. Kind Regards Benjamin ThautAm 10.01.2014 20:40, schrieb Andrei Alexandrescu:I'm not considering done as much as separable as a concern. All I'm saying is I hope to be able to separate the low-level part of discovering roots from the high-level part of marking used memory. BTW the way I see this done is "mostly precise", i.e. there will be at least for a while some words that will be handled conservatively (stack, registers, certain union members). If there's anything in the GC book that suggest that would be impossible, please do let me know! AndreiOn 1/10/14 11:34 AM, Benjamin Thaut wrote:Very good, In my opinion it should be required for everyone who wants to participate in D's GC to read that book. So everyone has at least a basic understanding of the problem at Hand. But if you read the book I don't understand, why you simply declare the hardest problem, percise pointer discovery, as done. To be able to actually implement and test a GC that is not a reimplementation of what we already have one needs percise pointer discovery of _all_ pointers, write barriers and GC halting points. So please enlighten me why you simply decalre this done?Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:Yah, the new edition. Unfortunately I didn't find a lot of new stuff or things that would help in a practical implementation. AndreiDestroy. AndreiJust one question. Did you read "the garbage collection handbook"?
Jan 12 2014
Am 12.01.2014 11:27, schrieb Rainer Schuetze:I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions. Also I don't think that we can create a GC which performs as good as the for a moving gc.
Jan 12 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:Also I don't think that we can create a GC which performs as neccessary changes for a moving gc.We don't necessarily need that, though. In D we have a plethora of non-GC options, so it might be a better idea to tailor our GC to typical D programs rather than trying to reproduce the Java perspective, I don't think the young generation is significant in idiomatic D code, unlike Java code which relies on it heavily.
Jan 12 2014
Am 12.01.2014 11:54, schrieb Jakob Ovrum:On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:You should really try to write non-GC D code some time. You would be astonished how many hidden allocations there are.Also I don't think that we can create a GC which performs as good as changes for a moving gc.We don't necessarily need that, though. In D we have a plethora of non-GC options, so it might be a better idea to tailor our GC to typical For example, from a generational GC perspective, I don't think the young generation is significant in idiomatic D code, unlike Java code which relies on it heavily.
Jan 12 2014
On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:You should really try to write non-GC D code some time. You would be astonished how many hidden allocations there are.Is that down to the language, or parts of druntime and phobos that assume GC when they shouldn't?
Jan 12 2014
Am 12.01.2014 12:12, schrieb Joseph Rushton Wakeling:On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:Both. Although it kept getting better lately.You should really try to write non-GC D code some time. You would be astonished how many hidden allocations there are.Is that down to the language, or parts of druntime and phobos that assume GC when they shouldn't?
Jan 12 2014
On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:You should really try to write non-GC D code some time.I do.You would be astonished how many hidden allocations there are.I was, years ago.
Jan 12 2014
On 1/12/14 2:40 AM, Benjamin Thaut wrote:Am 12.01.2014 11:27, schrieb Rainer Schuetze:Yah, moving would be real nice. I hope to at least clarify the issues related to moving with my work. AndreiI think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions. Also I don't think that we can create a GC which performs as good as the for a moving gc.
Jan 12 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:Am 12.01.2014 11:27, schrieb Rainer Schuetze:I'd rather have more restrictions and a working precise GC, and let those who wish to do without the safety ask for it explicitly.I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.How would the moving GC deal with pointer arithmetic?Also I don't think that we can create a GC which performs as neccessary changes for a moving gc.is to be a 'fully garbage collected language' then it will have to support a state of the art GC.
Jan 12 2014
Am 12.01.2014 18:12, schrieb Brian Rogoff:How would the moving GC deal with pointer arithmetic?I don't see any problem with pointer arithmetic? Either the pointer is pointing to a gc managed memory block and will be patched accordingly, or it is not pointing to a gc managed memory block, and nothing will happen. Obviously if you do unsafe things, you have to know what you are doing and might give additional information to the gc so it can propperly support unsafe stuff.
Jan 12 2014
On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:Am 12.01.2014 18:12, schrieb Brian Rogoff:Or you don't know if something is a pointer or not and so whatever it references is pinned.How would the moving GC deal with pointer arithmetic?I don't see any problem with pointer arithmetic? Either the pointer is pointing to a gc managed memory block and will be patched accordingly, or it is not pointing to a gc managed memory block, and nothing will happen.
Jan 12 2014
On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:Am 12.01.2014 18:12, schrieb Brian Rogoff:The garbage collection page of the D spec actually talks a lot about what is safe and unsafe (even though D doesn't have a moving collector). http://dlang.org/garbage.htmlHow would the moving GC deal with pointer arithmetic?I don't see any problem with pointer arithmetic? Either the pointer is pointing to a gc managed memory block and will be patched accordingly, or it is not pointing to a gc managed memory block, and nothing will happen. Obviously if you do unsafe things, you have to know what you are doing and might give additional information to the gc so it can propperly support unsafe stuff.
Jan 12 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:Am 12.01.2014 11:27, schrieb Rainer Schuetze:Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I was thinking about segregating heaps by type (shared/immutable/mutable) and moving data between them adds restrictions. I'd like to be proven wrong. Some thoughts regarding a moving collector: - interfacing with C/C++ is problematic: even if a pointer is passed on the stack, it is not guaranteed that this stack entry is not modified by the called function. While this might cause problems when collecting memory due to this being the last reference to the data, it is much more likely that there are still references, but moving pointers will definitely fail. Having to explicitely pin every pointer passed to C functions would be very expensive. - if we have many pinned objects, compaction loses some of its advantages (like fast allocation). If we keep allocating from free lists of equal sized bins, fragmantation and memory overhead is limited (though not small) without moving. Also, moving lot's of data can also be expensive. - interior pointers do not allow "threading" for pointer updates, so it might get pretty expensive to update references to moved objectsI think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.Also I don't think that we can create a GC which performs as neccessary changes for a moving gc.To compete with other GCs we'd probably need write barriers to keep track of changed references (for concurrent operation or generations). There just needs to be a way to avoid having to rescan the full heap every time, it does not scale. PS: my SSD with all the D stuff just died yesterday (was it a failure of the disks GC?). I'll probably need some time to recover from that...
Jan 13 2014
On 2014-01-13 10:20, Rainer Schuetze wrote:Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I was thinking about segregating heaps by type (shared/immutable/mutable) and moving data between them adds restrictions. I'd like to be proven wrong. Some thoughts regarding a moving collector: - interfacing with C/C++ is problematic: even if a pointer is passed on the stack, it is not guaranteed that this stack entry is not modified by the called function. While this might cause problems when collecting memory due to this being the last reference to the data, it is much more likely that there are still references, but moving pointers will definitely fail. Having to explicitely pin every pointer passed to C functions would be very expensive.Could we have a segregated heap for C pointers? Would that help? Basically having a special function allocating everything that should interface with C. -- /Jacob Carlborg
Jan 13 2014
On Monday, 13 January 2014 at 10:59:44 UTC, Jacob Carlborg wrote:Could we have a segregated heap for C pointers? Would that help? Basically having a special function allocating everything that should interface with C.Is it possible to declare whether the C-function retains a pointer to the memory area or not, and to what extent? In general you will have to assume that the C code retains pointers not only to the object, but the transitive closure of anything that can be reached from it. That is quite extensive… I also hope that the GC isn't fully modularized, because compiler support for specific GC strategies is likely to give better performance. Especially with whole program analysis. It would be very nice to localize GC to a few threads. This would be useful in games where you only want to GC the AI/game mechanics portion of the simulated world, but use less demanding memory management for graphics, physics etc, which keeps running uninterrupted (one can interpolate/predict for a few frames giving the GC some more room to complete).
Jan 13 2014
13-Jan-2014 13:20, Rainer Schuetze пишет:On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:Am 12.01.2014 11:27, schrieb Rainer Schuetze:Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I was thinking about segregating heaps by type (shared/immutable/mutable) and moving data between them adds restrictions. I'd like to be proven wrong. Some thoughts regarding a moving collector:I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.Having to explicitely pin every pointer passed to C functions would be very expensive.How would it be expensive? I don't see a need to do anything to "pin" a memory block, at the time of scanning there will be a potential pointer to it (in the stack space of C function or registers). -- Dmitry Olshansky
Jan 13 2014
On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky wrote:How would it be expensive? I don't see a need to do anything to "pin" a memory block, at the time of scanning there will be a potential pointer to it (in the stack space of C function or registers).How do you know that it is a pointer?
Jan 13 2014
13-Jan-2014 21:49, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky wrote:Precisely. Since C space has no type information you have to conservatively assume that everything can be a pointer. -- Dmitry OlshanskyHow would it be expensive? I don't see a need to do anything to "pin" a memory block, at the time of scanning there will be a potential pointer to it (in the stack space of C function or registers).How do you know that it is a pointer?
Jan 13 2014
On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky wrote:Precisely. Since C space has no type information you have to conservatively assume that everything can be a pointer.I think we are misunderstanding each other? I don't think a Boehm style GC can do compaction, since you cannot modify the "pointer", hence you have to "pin down" the object it possibly points to (to prevent it from moving)?
Jan 13 2014
…and scanning of C data segments is insufficient anyway, since protected drivers can retain pointers that are out of reach...
Jan 13 2014
13-Jan-2014 22:56, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:…and scanning of C data segments is insufficient anyway, since protected drivers can retain pointers that are out of reach...A stack is a stack is a stack. Not that D haven't had problem with hidden pointers such as these in C heap (and not being marked explicitly). -- Dmitry Olshansky
Jan 13 2014
13-Jan-2014 22:44, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky wrote:So what? It can't move it and as such there is nothing special about it, it doesn't _cost_ anything to pin object. -- Dmitry OlshanskyPrecisely. Since C space has no type information you have to conservatively assume that everything can be a pointer.I think we are misunderstanding each other? I don't think a Boehm style GC can do compaction, since you cannot modify the "pointer", hence you have to "pin down" the object it possibly points to (to prevent it from moving)?
Jan 13 2014
On 01/13/2014 10:08 PM, Dmitry Olshansky wrote:13-Jan-2014 22:44, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.On Monday, 13 January 2014 at 17:56:12 UTC, Dmitry Olshansky wrote:So what? It can't move it and as such there is nothing special about it, it doesn't _cost_ anything to pin object.Precisely. Since C space has no type information you have to conservatively assume that everything can be a pointer.I think we are misunderstanding each other? I don't think a Boehm style GC can do compaction, since you cannot modify the "pointer", hence you have to "pin down" the object it possibly points to (to prevent it from moving)?
Jan 13 2014
On 2014-01-13 22:23, Timon Gehr wrote:Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else? -- /Jacob Carlborg
Jan 14 2014
Am 14.01.2014 09:15, schrieb Jacob Carlborg:On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
On 2014-01-14 10:20, Benjamin Thaut wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Is there a problem to allocate them in the correct space from the beginning? -- /Jacob Carlborg
Jan 14 2014
Am 14.01.2014 10:52, schrieb Jacob Carlborg:On 2014-01-14 10:20, Benjamin Thaut wrote:Well you have to know it at allocation time. If you have a function that takes a blob of data as argument and passes it to C, the caller does not know of the requirement and you (maybe) have to copy inside the callee. Kind Regards Benjamin ThautYes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Is there a problem to allocate them in the correct space from the beginning?
Jan 14 2014
On 01/14/2014 10:20 AM, Benjamin Thaut wrote:Am 14.01.2014 09:15, schrieb Jacob Carlborg:Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
Am 14.01.2014 14:42, schrieb Timon Gehr:On 01/14/2014 10:20 AM, Benjamin Thaut wrote:Well no, usually you have to pin objects _before_ passing them to C-land. Pinning is not a automatic process its done manually (or by the compiler).Am 14.01.2014 09:15, schrieb Jacob Carlborg:Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
On 01/14/2014 06:56 PM, Benjamin Thaut wrote:Am 14.01.2014 14:42, schrieb Timon Gehr:Well, there is no such manual mechanism in place, so only strategies implemented fully within the compiler work. I think it is not very sensible to let the compiler insert call-backs to eg. pin objects pointed to from unions. It might as well track occupied pointer fields explicitly then. In any case, I think we should just require manual marking of unions and call it a day.On 01/14/2014 10:20 AM, Benjamin Thaut wrote:Well no, usually you have to pin objects _before_ passing them to C-land. Pinning is not a automatic process its done manually (or by the compiler).Am 14.01.2014 09:15, schrieb Jacob Carlborg:Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
Am 14.01.2014 19:26, schrieb Timon Gehr:On 01/14/2014 06:56 PM, Benjamin Thaut wrote:Yes, but pinning will still be necessary for passing data to C/C++ functions.Am 14.01.2014 14:42, schrieb Timon Gehr:Well, there is no such manual mechanism in place, so only strategies implemented fully within the compiler work. I think it is not very sensible to let the compiler insert call-backs to eg. pin objects pointed to from unions. It might as well track occupied pointer fields explicitly then. In any case, I think we should just require manual marking of unions and call it a day.On 01/14/2014 10:20 AM, Benjamin Thaut wrote:Well no, usually you have to pin objects _before_ passing them to C-land. Pinning is not a automatic process its done manually (or by the compiler).Am 14.01.2014 09:15, schrieb Jacob Carlborg:Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
On 01/14/2014 07:29 PM, Benjamin Thaut wrote:Yes, but pinning will still be necessary for passing data to C/C++ functions.(I agree, it is the spec that does not.)
Jan 14 2014
On 01/14/2014 06:56 PM, Benjamin Thaut wrote:BTW, our misunderstandings are rooted here: http://dlang.org/garbage.html "Object Pinning and a Moving Garbage Collector Although D does not currently use a moving garbage collector, by following the rules listed above one can be implemented. No special action is required to pin objects. A moving collector will only move objects for which there are no ambiguous references, and for which it can update those references. All other objects will be automatically pinned."Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)Well no, usually you have to pin objects _before_ passing them to C-land. Pinning is not a automatic process its done manually (or by the compiler).
Jan 14 2014
14-Jan-2014 21:56, Benjamin Thaut пишет:Am 14.01.2014 14:42, schrieb Timon Gehr:If one accepts that any heap can have object that is pinned I don't see why you need explicit pinning. It just becomes (a complication to) a part of normal marking stage, the discovery of which objects are pinned. In semi-precise setting it IMHO makes perfect sense. -- Dmitry OlshanskyOn 01/14/2014 10:20 AM, Benjamin Thaut wrote:Well no, usually you have to pin objects _before_ passing them to C-land. Pinning is not a automatic process its done manually (or by the compiler).Am 14.01.2014 09:15, schrieb Jacob Carlborg:Once it becomes known that an object should be pinned, it is too late for moving. (There will then exist a potential pointer to the object.)On 2014-01-13 22:23, Timon Gehr wrote:Yes, thats the usual approach. But that means that you have to copy them to a other space before pinning, or allocate them in a space that allows pinning in the first place.Yes, there is a cost. The requirement to pin arbitrary objects at arbitrary times, without the possibility to move at pinning time, invalidates GC algorithms that depend on being able to move _all_ objects within some pool.Can't these object be pinned somewhere else?
Jan 14 2014
Why are unions with pointers even allowed in the first place? It seems like a really really really bad thing to do. Am I missing some really important use case?
Jan 14 2014
Am 14.01.2014 20:08, schrieb Tofu Ninja:Why are unions with pointers even allowed in the first place? It seems like a really really really bad thing to do. Am I missing some really important use case?Yes, low level programming. D is supposed to be a systems programmin glanguage after all. Tagged unions are a often used concept, and pointers within these tagged unions are very common. Also its needed for compatibility with C.
Jan 14 2014
On Tuesday, 14 January 2014 at 19:09:00 UTC, Tofu Ninja wrote:Why are unions with pointers even allowed in the first place? It seems like a really really really bad thing to do. Am I missing some really important use case?Tagged unions are very useful. For instance, to take a simplified, but real life example, I have a function to resolve identifier in a compiler. It can resolve as a type, as an expression, as a template, as a package or module, or whatnot. This is a case where you need to return an union of types (or rely on complete type erasure).
Jan 14 2014
On Monday, 13 January 2014 at 21:08:50 UTC, Dmitry Olshansky wrote:So what? It can't move it and as such there is nothing special about it, it doesn't _cost_ anything to pin object.Well, the advantage of compaction is that you (in theory) can relocate objects so that you: 1. get temporal and spatial coherency (sitting on the same page, to avoid paging) 2. get rid of fragmentation (less paging, requires smaller address space) 3. can have a faster and simpler allocator If you start using the GC heap for "malloc" with pinning, you loose a lot of that? That's the cost.
Jan 13 2014
14-Jan-2014 01:24, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:On Monday, 13 January 2014 at 21:08:50 UTC, Dmitry Olshansky wrote:Bleh. If anything you may as well lose it - manual memory management is the only option to guarantee you something to the extent of having related object together in the same page. That is if you can make it so by hand. Segregated heaps also get fine locality for similarly-sized objects.So what? It can't move it and as such there is nothing special about it, it doesn't _cost_ anything to pin object.Well, the advantage of compaction is that you (in theory) can relocate objects so that you: 1. get temporal and spatial coherency (sitting on the same page, to avoid paging)2. get rid of fragmentation (less paging, requires smaller address space) 3. can have a faster and simpler allocator If you start using the GC heap for "malloc" with pinning, you loose a lot of that? That's the cost.This is a missed potential not an extra cost. On the contrary write barriers to track precisely every pointer write do *cost* something. This cost is then regained (with interest) by applying simpler and efficient algorithms such as semi-space compaction. -- Dmitry Olshansky
Jan 14 2014
If Java with all it that time and money can not create a performant GC, what hope has D? Full program tracing GC is a dead horse, it does not scale, all of the better Java GC's require 2x or memory of the working set to actually get any speed isolated task or allocator based memory that can use GC as a last resort is what you need, but D lacks in this regard
Jan 14 2014
Am 14.01.2014 18:53, schrieb woh:If Java with all it that time and money can not create a performant GC, what hope has D? Full program tracing GC is a dead horse, it does not scale, all of the better Java GC's require 2x or memory of the working set to actually get any speed isolated task or allocator based memory that can use GC as a last resort is what you need, but D lacks in this regardAre you aware of Azul's GC or some real time VMs like Aonix PERC?
Jan 14 2014
On Tuesday, 14 January 2014 at 17:38:56 UTC, Dmitry Olshansky wrote:Bleh. If anything you may as well lose it - manual memory management is the only option to guarantee you something to the extent of having related object together in the same page. That is if you can make it so by hand.Under what assumption? There is not optimal garbage collector, you have to select or adapt one that fits the application. You can tag objects by a group id or type id and use that during compaction.write do *cost* something. This cost is then regained (with interest) by applying simpler and efficient algorithms such as semi-space compaction.I wouldn't call semi-space compaction efficient, it can lead to heavy paging.
Jan 14 2014
Am 13.01.2014 18:45, schrieb Dmitry Olshansky:13-Jan-2014 13:20, Rainer Schuetze пишет:I also don't see how that would be expenive. Pinning is a common pratice in other garbage collected languages. But your argument is invalid. As the D garbage collector does not have any information about the stack frames of the C functions it simply has to ignore those (assuming its a truly percise GC).Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I was thinking about segregating heaps by type (shared/immutable/mutable) and moving data between them adds restrictions. I'd like to be proven wrong. Some thoughts regarding a moving collector:Having to explicitely pin every pointer passed to C functions would be very expensive.How would it be expensive? I don't see a need to do anything to "pin" a memory block, at the time of scanning there will be a potential pointer to it (in the stack space of C function or registers).
Jan 13 2014
On Monday, 13 January 2014 at 17:45:26 UTC, Dmitry Olshansky wrote:13-Jan-2014 13:20, Rainer Schuetze пишет:The stack reference can be moved outside the memory visible to the GC: // D extern(C) void funC(const char* s); void main() { funC("Hello".dup.ptr); } // C void funC(const char* s) { SomeStruct* struc = malloc(sizeof(SomeStruct)); struc->ptr = s; s = 0; struc->doSomething(); free(struc); } The "s = 0;" will remove the last reference to the passed string visible to the GC. The duped string might get collected during execution of "doSomething". This is a problem with the current GC already, though it is pretty unlikely that both this kind of operation is used in the C function (or similar) and there are no other references in the D program. With a moving GC, references in the D code no longer pin the object, so the problem appears if the C function just works as above. Another bad use case in the C function: the passed stack parameter is used to iterate to the end of an array and then back later. The pointer to the end might not be within the GC allocated memory block anymore, but pointing to the next.On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote: Having to explicitely pin every pointer passed to C functions would be very expensive.How would it be expensive? I don't see a need to do anything to "pin" a memory block, at the time of scanning there will be a potential pointer to it (in the stack space of C function or registers).
Jan 15 2014
On Monday, 13 January 2014 at 09:20:16 UTC, Rainer Schuetze wrote:Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I was thinking about segregating heaps by type (shared/immutable/mutable) and moving data between them adds restrictions. I'd like to be proven wrong.I had some similar reflection. I'm not sure moving is worthwhile for D. Segeregation is.To compete with other GCs we'd probably need write barriers to keep track of changed references (for concurrent operation or generations). There just needs to be a way to avoid having to rescan the full heap every time, it does not scale.Barrier are only necessary for mutable shared reference write.
Jan 13 2014
On 1/12/2014 2:40 AM, Benjamin Thaut wrote:Am 12.01.2014 11:27, schrieb Rainer Schuetze:A moving GC is already supported by D's semantics. Unions are dealt with by 'pinning' those objects, i.e. simply don't move them. I know this can work because I implemented a mostly copying generational collector years ago for Java. (After I invented this, someone else came out with a paper about a "mostly copying" collector, so I didn't get any credit. Oh well! But the idea is sound and it works in the real world.)I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions. Also I don't think that we can create a GC which performs as good as the one of gc.
Jan 13 2014
On 1/13/14 12:44 PM, Walter Bright wrote:On 1/12/2014 2:40 AM, Benjamin Thaut wrote:The key advantage that moving gives to a GC is to allow it to have a block of memory that it can do allocations out of by simply bumping the pointer. When that region is full, a minor collection kicks in, moving anything still alive out to a different block of memory, then reset the region for re-use. Pinned objects == region can't be emptied for reuse. That leads to fragmentation and free list maintenance and you're right back with a more typical allocator. Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.Am 12.01.2014 11:27, schrieb Rainer Schuetze:A moving GC is already supported by D's semantics. Unions are dealt with by 'pinning' those objects, i.e. simply don't move them. I know this can work because I implemented a mostly copying generational collector years ago for Java. (After I invented this, someone else came out with a paper about a "mostly copying" collector, so I didn't get any credit. Oh well! But the idea is sound and it works in the real world.)I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions. Also I don't think that we can create a GC which performs as good as the one of gc.
Jan 13 2014
On 1/13/2014 12:57 PM, Brad Roberts wrote:The key advantage that moving gives to a GC is to allow it to have a block of memory that it can do allocations out of by simply bumping the pointer. When that region is full, a minor collection kicks in, moving anything still alive out to a different block of memory, then reset the region for re-use. Pinned objects == region can't be emptied for reuse. That leads to fragmentation and free list maintenance and you're right back with a more typical allocator. Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.
Jan 13 2014
On 01/13/2014 10:11 PM, Walter Bright wrote:In Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.
Jan 13 2014
Am 13.01.2014 22:31, schrieb Timon Gehr:On 01/13/2014 10:11 PM, Walter Bright wrote:http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspxIn Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.
Jan 13 2014
On 01/13/2014 11:05 PM, Paulo Pinto wrote:Am 13.01.2014 22:31, schrieb Timon Gehr:"The common language runtime controls the physical layout of the data fields of a class or structure in managed memory. However, ..."On 01/13/2014 10:11 PM, Walter Bright wrote:http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspxIn Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.
Jan 13 2014
On Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:On 01/13/2014 11:05 PM, Paulo Pinto wrote:Did you also read the remaining part of the page, or just looked for something to paste? You can control the layout, that is what matters. Not at what level you are expressing it. Ignoring what such runtimes offer, only puts D at disadvantage when comparing feature lists, which many in the industry do. -- PauloAm 13.01.2014 22:31, schrieb Timon Gehr:"The common language runtime controls the physical layout of the data fields of a class or structure in managed memory. However, ..."On 01/13/2014 10:11 PM, Walter Bright wrote:http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspxIn Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.
Jan 14 2014
On 01/14/2014 10:11 AM, Paulo Pinto wrote:On Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:Obviously.On 01/13/2014 11:05 PM, Paulo Pinto wrote:Did you also read the remaining part of the page,Am 13.01.2014 22:31, schrieb Timon Gehr:"The common language runtime controls the physical layout of the data fields of a class or structure in managed memory. However, ..."On 01/13/2014 10:11 PM, Walter Bright wrote:http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspxIn Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.or just looked for something to paste? ...Look, you hadn't done anything besides pasting for backing up your point or making it precise. I'll be proven wrong if you are able to show us union U{ int x; Object y; }You can control the layout, that is what matters. Not at what level you are expressing it. ...Obviously.Ignoring what such runtimes offer, only puts D at disadvantage when comparing feature lists, which many in the industry do. ...functionality because it is detrimental to GC, eg: http://stackoverflow.com/questions/17771902/struct-memory-hack-to-overlap-object-reference-is-it-possible
Jan 14 2014
On Tuesday, 14 January 2014 at 14:22:27 UTC, Timon Gehr wrote:On 01/14/2014 10:11 AM, Paulo Pinto wrote:Excuse me for being stupid. I just read MSDN without trying it out, so I wasn't aware that object references cannot be made to overllap with other data, even in unsafe structs. Me and my big mouth. Better test it properly next time. -- PauloOn Monday, 13 January 2014 at 23:42:50 UTC, Timon Gehr wrote:Obviously.On 01/13/2014 11:05 PM, Paulo Pinto wrote:Did you also read the remaining part of the page,Am 13.01.2014 22:31, schrieb Timon Gehr:"The common language runtime controls the physical layout of the data fields of a class or structure in managed memory. However, ..."On 01/13/2014 10:11 PM, Walter Bright wrote:http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute%28v=vs.110%29.aspxIn Java or in D? Eg. there are no unions in Java.Not to say there aren't other ways of doing things, but with random objects becoming pinnable puts a big damper on things unless you can identify all the objects that might be pinned and isolate them. But I doubt that's really knowable up front.The reason pinning doesn't particularly impede this is because pinning is rare.or just looked for something to paste? ...Look, you hadn't done anything besides pasting for backing up your point or making it precise. I'll be proven wrong if you union U{ int x; Object y; }You can control the layout, that is what matters. Not at what level you are expressing it. ...Obviously.Ignoring what such runtimes offer, only puts D at disadvantage when comparing feature lists, which many in the industry do. ...offer this functionality because it is detrimental to GC, eg: http://stackoverflow.com/questions/17771902/struct-memory-hack-to-overlap-object-reference-is-it-possible
Jan 15 2014
Am 13.01.2014 21:44, schrieb Walter Bright:A moving GC is already supported by D's semantics. Unions are dealt with by 'pinning' those objects, i.e. simply don't move them. I know this can work because I implemented a mostly copying generational collector years ago for Java. (After I invented this, someone else came out with a paper about a "mostly copying" collector, so I didn't get any credit. Oh well! But the idea is sound and it works in the real world.)I would prefer a option where the user can specifiy a function to scan classes / structs that contain unions percisely instead of pinning it. In generall I would want a option to specify a manual scanning function for any block of GC managed memory, D is a systems programming language after all and specifing a custom scanning functionn can be a lot more effective then assuming the worst possible case for that memory block.
Jan 13 2014
On 1/13/2014 1:28 PM, Benjamin Thaut wrote:Am 13.01.2014 21:44, schrieb Walter Bright:I agree, but I was trying to correct the misperception that current D does not allow a moving collector.A moving GC is already supported by D's semantics. Unions are dealt with by 'pinning' those objects, i.e. simply don't move them. I know this can work because I implemented a mostly copying generational collector years ago for Java. (After I invented this, someone else came out with a paper about a "mostly copying" collector, so I didn't get any credit. Oh well! But the idea is sound and it works in the real world.)I would prefer a option where the user can specifiy a function to scan classes / structs that contain unions percisely instead of pinning it. In generall I would want a option to specify a manual scanning function for any block of GC managed memory, D is a systems programming language after all and specifing a custom scanning functionn can be a lot more effective then assuming the worst possible case for that memory block.
Jan 13 2014
Am 13.01.2014 23:55, schrieb Walter Bright:I agree, but I was trying to correct the misperception that current D does not allow a moving collector.Current D does not allow a moving collector because of the lack of compiler support. It is still not possible to identify all pointers percicesly, especially those on the stack. Also when you want to implement a semi-space GC everything _must_ be moveable. Pinning is not an option for a semi-space GC. There for current D does not allow the implementation of semi-space GC without some changes to the spec. (E.g. structs / classes containing unions _must_ provide a custom scanning function).
Jan 14 2014
On 1/14/2014 1:18 AM, Benjamin Thaut wrote:Current D does not allow a moving collector because of the lack of compiler support. It is still not possible to identify all pointers percicesly, especially those on the stack. Also when you want to implement a semi-space GC everything _must_ be moveable. Pinning is not an option for a semi-space GC. There for current D does not allow the implementation of semi-space GC without some changes to the spec. (E.g. structs / classes containing unions _must_ provide a custom scanning function).This is not true, I've implemented one for Java.
Jan 14 2014
Am 14.01.2014 11:05, schrieb Walter Bright:On 1/14/2014 1:18 AM, Benjamin Thaut wrote:So how do you implement a semi-space GC with pinned objects?Current D does not allow a moving collector because of the lack of compiler support. It is still not possible to identify all pointers percicesly, especially those on the stack. Also when you want to implement a semi-space GC everything _must_ be moveable. Pinning is not an option for a semi-space GC. There for current D does not allow the implementation of semi-space GC without some changes to the spec. (E.g. structs / classes containing unions _must_ provide a custom scanning function).This is not true, I've implemented one for Java.
Jan 14 2014
On Tuesday, 14 January 2014 at 11:02:38 UTC, Benjamin Thaut wrote:Am 14.01.2014 11:05, schrieb Walter Bright:You manage the pinned object in a different list that the semi-space list? It's quite often that GCs maintain object inventory with several methods, for example using semi-space for young object but not for old objects..On 1/14/2014 1:18 AM, Benjamin Thaut wrote:So how do you implement a semi-space GC with pinned objects?Current D does not allow a moving collector because of the lack of compiler support. It is still not possible to identify all pointers percicesly, especially those on the stack. Also when you want to implement a semi-space GC everything _must_ be moveable. Pinning is not an option for a semi-space GC. There for current D does not allow the implementation of semi-space GC without some changes to the spec. (E.g. structs / classes containing unions _must_ provide a custom scanning function).This is not true, I've implemented one for Java.
Jan 14 2014
Am 14.01.2014 13:53, schrieb renoX:On Tuesday, 14 January 2014 at 11:02:38 UTC, Benjamin Thaut wrote:I meant a pure semi-space collector. A pure semi-space collector can not pin objects. And I didn't make that up, it even says so in "The Garbage Collection Handbook". Pinning Unions is a bad approach anyway in my eyes. Because you will have to allocate them in seperate space (which supports pinning) you loose the advantge of fast collections for short living objects which a semi-space collector provides.Am 14.01.2014 11:05, schrieb Walter Bright:You manage the pinned object in a different list that the semi-space list? It's quite often that GCs maintain object inventory with several methods, for example using semi-space for young object but not for old objects..On 1/14/2014 1:18 AM, Benjamin Thaut wrote:So how do you implement a semi-space GC with pinned objects?Current D does not allow a moving collector because of the lack of compiler support. It is still not possible to identify all pointers percicesly, especially those on the stack. Also when you want to implement a semi-space GC everything _must_ be moveable. Pinning is not an option for a semi-space GC. There for current D does not allow the implementation of semi-space GC without some changes to the spec. (E.g. structs / classes containing unions _must_ provide a custom scanning function).This is not true, I've implemented one for Java.
Jan 14 2014
On 1/14/2014 3:02 AM, Benjamin Thaut wrote:Am 14.01.2014 11:05, schrieb Walter Bright:Be more flexible in what the semi-spaces are.This is not true, I've implemented one for Java.So how do you implement a semi-space GC with pinned objects?
Jan 14 2014
Am 14.01.2014 21:28, schrieb Walter Bright:On 1/14/2014 3:02 AM, Benjamin Thaut wrote:So you just don't. Effectivley you work around the problem. I would still prefer that specifing a scanning function for unions becomes mandatory.Am 14.01.2014 11:05, schrieb Walter Bright:Be more flexible in what the semi-spaces are.This is not true, I've implemented one for Java.So how do you implement a semi-space GC with pinned objects?
Jan 14 2014
On 1/14/2014 1:10 PM, Benjamin Thaut wrote:Am 14.01.2014 21:28, schrieb Walter Bright:My experience with book algorithms is they usually require some adjustments when working with real problems in the real world. I was well aware of how moving GCs worked when I set the semantics of D objects, since I'd implemented one, and was careful to make them possible with D.On 1/14/2014 3:02 AM, Benjamin Thaut wrote:So you just don't. Effectivley you work around the problem.Am 14.01.2014 11:05, schrieb Walter Bright:Be more flexible in what the semi-spaces are.This is not true, I've implemented one for Java.So how do you implement a semi-space GC with pinned objects?I would still prefer that specifing a scanning function for unions becomes mandatory.I already agreed that there are better ways.
Jan 14 2014
Am 14.01.2014 23:10, schrieb Walter Bright:I already agreed that there are better ways.Then this was a big misunderstanding.
Jan 14 2014
On Tuesday, 14 January 2014 at 22:36:12 UTC, Benjamin Thaut wrote:Am 14.01.2014 23:10, schrieb Walter Bright:So, the issue seems to be that everyone is treating the GC as the Thors hammer and seeing everything as a nails? The GC does and can not solve all problems. Why force people to use it when it doesn't work well? I would think having a multi-layered approach would be best. Allow the GC to do everything it can but also allow it to be configured from 0 to 60. For noobs, non-realtime apps, etc the GC can be set to do it all. One can trade speed for memory(less memory, more scanning etc...). One can disable to in specific cases(such as on unions), have separate heaps for GC memory and non-GC allocated memory, etc. It seems that a more robust GC with many features(with the feature of not using it and either manually allocating or using ARC). The the only real issue becomes how to keep the GC in sync with the non-GC allocated memory... and this only happens if one chooses to use non-GC features(which must be intentional so hopefully the programming knows what he's doing if he got that far). I doubt this will ever be possible to solve efficiently so why bother? Leave it up to the programmer to deal with it. Make some resources available to solve these problems but don't try to dictate what must be done. If the programmer is going to bypass the GC he already must have good reasons so why put up roadblocks? Cause *you* think you know better than him what he needs to do? For example, suppose we have a non-GC based pointer. Allow the programmer to mark it as a pointer with certain contextual parameters so that either the GC ca n "deal with it" better or at least for debugging purposes help find memory leaks. I'm way more interested in having a GC that doesn't stop the world and a compiler that doesn't require the GC for simple stuff as slices(could it not use a buffer and/or ARC instead/alternatively?).I already agreed that there are better ways.Then this was a big misunderstanding.
Jan 14 2014
On Wednesday, 15 January 2014 at 00:24:48 UTC, Frustrated wrote:On Tuesday, 14 January 2014 at 22:36:12 UTC, Benjamin Thaut wrote:Try to do the pull request that goes with that, and we will be able to discuss the real problems.Am 14.01.2014 23:10, schrieb Walter Bright:So, the issue seems to be that everyone is treating the GC as the Thors hammer and seeing everything as a nails? The GC does and can not solve all problems. Why force people to use it when it doesn't work well? I would think having a multi-layered approach would be best. Allow the GC to do everything it can but also allow it to be configured from 0 to 60. For noobs, non-realtime apps, etc the GC can be set to do it all. One can trade speed for memory(less memory, more scanning etc...). One can disable to in specific cases(such as on unions), have separate heaps for GC memory and non-GC allocated memory, etc. It seems that a more robust GC with many features(with the feature of not using it and either manually allocating or using ARC). The the only real issue becomes how to keep the GC in sync with the non-GC allocated memory... and this only happens if one chooses to use non-GC features(which must be intentional so hopefully the programming knows what he's doing if he got that far). I doubt this will ever be possible to solve efficiently so why bother? Leave it up to the programmer to deal with it. Make some resources available to solve these problems but don't try to dictate what must be done. If the programmer is going to bypass the GC he already must have good reasons so why put up roadblocks? Cause *you* think you know better than him what he needs to do? For example, suppose we have a non-GC based pointer. Allow the programmer to mark it as a pointer with certain contextual parameters so that either the GC ca n "deal with it" better or at least for debugging purposes help find memory leaks. I'm way more interested in having a GC that doesn't stop the world and a compiler that doesn't require the GC for simple stuff as slices(could it not use a buffer and/or ARC instead/alternatively?).I already agreed that there are better ways.Then this was a big misunderstanding.
Jan 14 2014
Benjamin Thaut:I would prefer a option where the user can specifiy a function to scan classes / structs that contain unions percisely instead of pinning it. In generall I would want a option to specify a manual scanning function for any block of GC managed memory, D is a systems programming language after all and specifing a custom scanning functionn can be a lot more effective then assuming the worst possible case for that memory block.Time ago I suggested an optional standard method for unions of structs named as "onGC", that if present is used by the GC at run-time to know what to do with the contents of the unions. Bye, bearophile
Jan 13 2014
12-Jan-2014 14:27, Rainer Schuetze пишет:On 11.01.2014 22:20, Benjamin Thaut wrote:I might be ignorant but why can't we make "mostly-moving" collector? For that we discern precise pointers (with typeinfo) and conservative (such as potential pointers in registers/stack). Then any block pointed to by a least one conservative pointer is pinned, others though can be moved freely since all of the pointers are known. -- Dmitry OlshanskyAm 11.01.2014 21:44, schrieb Andrei Alexandrescu: Well as far as my understanding of GCs goes, you have two options: 1) Impercise pointer discovery => limiting yourself to a mark & sweep 2) Percise pointer disccovery => option to use a semi space GC in combination with a mark & sweep and generations, which allows for superior handling of short lived allocations (which is the biggest problem of D's current GC). Also without percise pointer discovery you will never be able to move objects from one heap into another. This would be especially a problem for the immutable heap, because you might want to allocate all strings on the thread local heap until you discover that you actually need them to be shared between threads. You might also need to move objects into antoher heap whenever a casts happens or global variable is assigned. Kind Regards Benjamin ThautI think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.
Jan 12 2014
Am 12.01.2014 12:03, schrieb Dmitry Olshansky:12-Jan-2014 14:27, Rainer Schuetze пишет:A semi-space garbage collector is best fitted for small short lived allocations. Which are mostly those that are referenced by the stack because they happend as functions are called. And for a semi-space garbage collector there is not mostly moving, it has to copy _all_ objects from one heap onto another so that those left on the original heap are known to be all garbage, can be destroyed and then the heap can be reused for the next collection. Mostly-moving doens't work here. Either you know all pointers percisely, or you don't do it. What you mean is pinning, you can pin objects for which you know they might be referenced by a impercise pointer and thus prevent them from beeing moved around. But this type of moving only works with mark & sweep compacting collectors and thats not really better than what we already have. If you are really interrested in the topic I higly recommend reading "The garbage collector handbook". Kind Regards Benjamin ThautOn 11.01.2014 22:20, Benjamin Thaut wrote:I might be ignorant but why can't we make "mostly-moving" collector? For that we discern precise pointers (with typeinfo) and conservative (such as potential pointers in registers/stack). Then any block pointed to by a least one conservative pointer is pinned, others though can be moved freely since all of the pointers are known.Am 11.01.2014 21:44, schrieb Andrei Alexandrescu: Well as far as my understanding of GCs goes, you have two options: 1) Impercise pointer discovery => limiting yourself to a mark & sweep 2) Percise pointer disccovery => option to use a semi space GC in combination with a mark & sweep and generations, which allows for superior handling of short lived allocations (which is the biggest problem of D's current GC). Also without percise pointer discovery you will never be able to move objects from one heap into another. This would be especially a problem for the immutable heap, because you might want to allocate all strings on the thread local heap until you discover that you actually need them to be shared between threads. You might also need to move objects into antoher heap whenever a casts happens or global variable is assigned. Kind Regards Benjamin ThautI think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.
Jan 12 2014
On Sunday, 12 January 2014 at 10:27:42 UTC, Rainer Schuetze wrote:[...] Adding concurrency would also be nice...I just want to point out that from an outer perspective, this is a really really *really* big deal. By far the most common arguments I've heard against D are: 1. library availability (derelict, deimos now) 2. community size and impetus (getting there!) 3. shared druntime/phobos so Hello World isn't 800kb (getting there?) 4. garbage collector which is only possible to opt out of by writing C The very reliance of the garbage collector, regardless of how far between the stop-the-world sweeps are, is a showstopper for many people. They hear GC and think Java pauses. Being able to honestly claim "well it runs concurrently in a separate thread and doesn't[*] incur any performance penalty" would be the single biggest leap to greater adoption D could take at this point. Maybe barring "it prints money", but only maybe. It may be less of a *technical* problem than, say, this or that bug in the type system, or the identity crisis of shared. But fixing those would not make for a *twentieth* of the marketing that a concurrent GC would. Fixing those would make people stay -- introducing that GC would make people join. Not saying that such bugs shouldn't get attention, I'm just saying that Bob would scroll past that link on /r/programming. In comparison, Lucarella's dconf slides were... *compelling*. (P.S. many awkwardly long hugs to those culling allocations from druntime/phobos, you rock)
Jan 12 2014
On Sunday, 12 January 2014 at 19:29:08 UTC, sunspyre wrote:The very reliance of the garbage collector, regardless of how far between the stop-the-world sweeps are, is a showstopper for many people. They hear GC and think Java pauses. Being able to honestly claim "well it runs concurrently in a separate thread and doesn't[*] incur any performance penalty" would be the single biggest leap to greater adoption D could take at this point.That is not technically possible. A truly concurrent GC has heavy penalty. People probably think of "Concurrent GC" has Microsoft calls one of the .NET GCs, which is "mostly concurrent". The first step is get a precise GC. That should give a significant performance boost already. Everything else should probably build on this. http://stackoverflow.com/questions/2583644/difference-between-background-and-concurrent-garbage-collection
Jan 13 2014
Am Fri, 10 Jan 2014 00:03:03 -0800 schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well. I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it. A few highlights: * The design will foster the small, composable components with required/optional primitives that worked for std.allocator. * I plan to review and probably discard all of the pointers-to-functions nonsense in the current GC. * At this point I won't worry about discovering roots; I assume druntime has the appropriate mechanisms in place. Instead I'll focus on primitives such as "given this root, mark all that transitively follows from it" (conservatively or not). * I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in. * I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner. * There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.) * At this point I'm unclear on how generations can be componentized, but am cautiously optimistic. Will see once I get to it. One thing that would be great now would be to make an effort to review and merge the current precise GC work. I'm sure it will be of great help with breaking into components. Destroy. AndreiNothing to destroy yet. But it does feel good that someone is putting the pieces together to come up with the second generation of D garbage collection. With pieces I mean the discussions about making it optional or allowing user defined allocators, how immutable can be used to an advantage, precise scanning and generational GC. Basically there are enough use cases now that one can shape a new GC model around it. -- Marco
Jan 10 2014
On 10.01.2014 09:03, Andrei Alexandrescu wrote:The GC comes up repeatedly in discussions around here. I've been thinking for a while about breaking it down into components, and now it seems the time is ripe. The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well. I've started with the next logical step - higher-level allocation that is aware of the type of the object being allocated, and realized that integrating a notion of tracing is appropriate at that level, and actually quite easy. So I'm thinking of just doing it. A few highlights: * The design will foster the small, composable components with required/optional primitives that worked for std.allocator. * I plan to review and probably discard all of the pointers-to-functions nonsense in the current GC.You mean the proxy functions? Yes, please axe 'em, that only works in very simple single-threaded programs.* At this point I won't worry about discovering roots; I assume druntime has the appropriate mechanisms in place.It currently does not have precise information, but it is dearly needed, too.Instead I'll focus on primitives such as "given this root, mark all that transitively follows from it" (conservatively or not). * I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.I guess the mark function will be stored somewhere in the TypeInfo. How is this going to work with dynamic and associative array operations if these are not also templated?* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.So no more std.emplace?* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I think this might help in having different heaps (especially thread local heaps if you include "shared"), but I'm not sure this works in unsafe D where casting is allowed.* At this point I'm unclear on how generations can be componentized, but am cautiously optimistic. Will see once I get to it. One thing that would be great now would be to make an effort to review and merge the current precise GC work. I'm sure it will be of great help with breaking into components.As written in the other thread ("how to contribute to GC"), I have just made an attempt to make it more reviewable: https://github.com/rainers/druntime/commits/gcx_precise2 The necessary compiler fixes are here: https://github.com/D-Programming-Language/dmd/pull/2480Destroy. Andrei
Jan 10 2014
On 1/10/14 12:37 PM, Rainer Schuetze wrote:Yah, that's where I'm counting on your work :o).* At this point I won't worry about discovering roots; I assume druntime has the appropriate mechanisms in place.It currently does not have precise information, but it is dearly needed, too.I need to write some code to explain it all. An figure it all :o).* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.I guess the mark function will be stored somewhere in the TypeInfo. How is this going to work with dynamic and associative array operations if these are not also templated?std.emplace will continue to work as a way to build an object at a specified address. I suspect that allocating and manipulating objects on the GC heap in particular may have certain restrictions. One possibility to avoid such restrictions is to have a function typify(T)(void* p) which ascribes type T to heap location p.* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.So no more std.emplace?Type-forging casts have always been somewhat implementation-defined. The way I see it they'll continue to only work for people who really know what they're doing. I'm not too worried about that.* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I think this might help in having different heaps (especially thread local heaps if you include "shared"), but I'm not sure this works in unsafe D where casting is allowed.Yah, time for you to get some destruction first :o). Andrei* At this point I'm unclear on how generations can be componentized, but am cautiously optimistic. Will see once I get to it. One thing that would be great now would be to make an effort to review and merge the current precise GC work. I'm sure it will be of great help with breaking into components.As written in the other thread ("how to contribute to GC"), I have just made an attempt to make it more reviewable: https://github.com/rainers/druntime/commits/gcx_precise2 The necessary compiler fixes are here: https://github.com/D-Programming-Language/dmd/pull/2480
Jan 10 2014
On 10.01.2014 22:42, Andrei Alexandrescu wrote:On 1/10/14 12:37 PM, Rainer Schuetze wrote:I was thinking about using the proposed module info extension to generate data similar to RTInfo by interpreting global/tls data of a module as a struct with type info. I don't know if this is feasable.Yah, that's where I'm counting on your work :o).* At this point I won't worry about discovering roots; I assume druntime has the appropriate mechanisms in place.It currently does not have precise information, but it is dearly needed, too.Waiting eagerly for the code :-)I need to write some code to explain it all. An figure it all :o).* I plan to rely on static introspection for the mark function, i.e: void mark(T)(ref T obj); would mark obj and everything transitively referred to by it. The function would probably take a second parameter that's the heap that obj is sitting in.I guess the mark function will be stored somewhere in the TypeInfo. How is this going to work with dynamic and associative array operations if these are not also templated?That sounds similar to my gc_emplace function. The problematic part is how to save that information in the GC.std.emplace will continue to work as a way to build an object at a specified address. I suspect that allocating and manipulating objects on the GC heap in particular may have certain restrictions. One possibility to avoid such restrictions is to have a function typify(T)(void* p) which ascribes type T to heap location p.* I plan to segregate all objects that don't include references and pointers (e.g. int, int[], Tuple!(int, double) etc) into a completely different heap than the "interesting" objects. For the simpler objects there's no need to save detailed type information so they can be stored in a more compact, efficient manner.So no more std.emplace?Ready to get some blows...Yah, time for you to get some destruction first :o).* At this point I'm unclear on how generations can be componentized, but am cautiously optimistic. Will see once I get to it. One thing that would be great now would be to make an effort to review and merge the current precise GC work. I'm sure it will be of great help with breaking into components.As written in the other thread ("how to contribute to GC"), I have just made an attempt to make it more reviewable: https://github.com/rainers/druntime/commits/gcx_precise2 The necessary compiler fixes are here: https://github.com/D-Programming-Language/dmd/pull/2480
Jan 11 2014
On 2014-01-11 10:30, Rainer Schuetze wrote:I was thinking about using the proposed module info extension to generate data similar to RTInfo by interpreting global/tls data of a module as a struct with type info. I don't know if this is feasable.You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ? I really hope this can be merged sometime. -- /Jacob Carlborg
Jan 11 2014
On 11.01.2014 21:30, Jacob Carlborg wrote:On 2014-01-11 10:30, Rainer Schuetze wrote:Yes. I'm not sure if there is enough compiler support yet to figure out the global and TLS data layout of a module, especially for a static library where the module is split into a lot of object files.I was thinking about using the proposed module info extension to generate data similar to RTInfo by interpreting global/tls data of a module as a struct with type info. I don't know if this is feasable.You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ? I really hope this can be merged sometime.
Jan 12 2014
On 01/11/14 03:30, Rainer Schuetze wrote:On 10.01.2014 22:42, Andrei Alexandrescu wrote:[snip][snip] Couldn't you store a pointer to the typeinfo within the allocated memory? IOW, the first part of the allocated memory would be the typeinfo* followed by the actual memory used to store the value of the allocated T object. -regards, Larrystd.emplace will continue to work as a way to build an object at a specified address. I suspect that allocating and manipulating objects on the GC heap in particular may have certain restrictions. One possibility to avoid such restrictions is to have a function typify(T)(void* p) which ascribes type T to heap location p.That sounds similar to my gc_emplace function. The problematic part is how to save that information in the GC.
Jan 13 2014
On Monday, 13 January 2014 at 22:48:37 UTC, evansl wrote:On 01/11/14 03:30, Rainer Schuetze wrote:std.emplace can be used on a partial memory block (e.g. as part of a struct), so you will have to add the emplaced type info in addition to the outer struct's type info. There can be multiple areas with emplaced dta within the same memory allocation, too. So you'll need to store a list of type infos paired with the offsets within the meory block. How do you do this efficiently without extra cost for the usual scanning?On 10.01.2014 22:42, Andrei Alexandrescu wrote:[snip][snip] Couldn't you store a pointer to the typeinfo within the allocated memory? IOW, the first part of the allocated memory would be the typeinfo* followed by the actual memory used to store the value of the allocated T object. -regards, Larrystd.emplace will continue to work as a way to build an object at a specified address. I suspect that allocating and manipulating objects on the GC heap in particular may have certain restrictions. One possibility to avoid such restrictions is to have a function typify(T)(void* p) which ascribes type T to heap location p.That sounds similar to my gc_emplace function. The problematic part is how to save that information in the GC.
Jan 15 2014
On Wednesday, 15 January 2014 at 08:54:51 UTC, Rainer Schuetze wrote:std.emplace can be used on a partial memory block (e.g. as part of a struct), so you will have to add the emplaced type info in addition to the outer struct's type info. There can be multiple areas with emplaced dta within the same memory allocation, too.This is a good point. It is a mistake to mix manual memory layout and byte-level control with an obligatory GC regime. Those optimizations should either be done automatically by a high level optimizer, triggered by compiler hits, or be non-GC. Basically, one has to decide whether to focus on high level or low level for heap allocations. D is by design a high-level effort with low-level access bolted on everywhere. That undermines the premises for efficient high-level implementation. I think you either have to restrict the low level to the non-GC-heap-allocations and inner loops at the leaves of the call-tree, or start with a low level language design and very carefully add optional high level constructs. High level as the default with low level control interspersed everywhere makes high level optimization intractable.
Jan 15 2014
On 01/15/14 02:54, Rainer Schuetze wrote:On Monday, 13 January 2014 at 22:48:37 UTC, evansl wrote:I'm afraid I'm not familiar with much of what you're talking about. However, in another thread, containing this post: http://forum.dlang.org/post/ldhquc$11ec$1 digitalmars.com Benjamin mentioned RTInfo*. Which made me think that instead of pointer to the typeinfo, what about a RTInfo pointer which, according to comments at: https://github.com/D-Programming-Language/druntime/blob/e47a00bff935c3f079bb567a6ec97663ba384487/src/object_.d#L1101 is the data for precise GC. Surely if RTInfo contains the data for precise GC, then that would work. However, I'm guessing that RTInfo* is only available for class types. OTOH, couldn't the compiler figure out the similar information for struct types? Then, if a class type were being scanned, use the existing RTInfo* in the object supertype. OTOH, if a struct type type is being scanned, then use the compiler generated RTInfo for the struct type, a pointer to which could be placed in the heap memory just before the memory for the struct. I guess the scanner would then have to be able to figure out if a void* was pointing to a struct or a class and behave accordingly. Of course I'm probably missing something, coming from a c++ background; hence, I'd appreciate someone pointing out what I'm missing. -regards, LarryOn 01/11/14 03:30, Rainer Schuetze wrote:std.emplace can be used on a partial memory block (e.g. as part of a struct), so you will have to add the emplaced type info in addition to the outer struct's type info. There can be multiple areas with emplaced dta within the same memory allocation, too. So you'll need to store a list of type infos paired with the offsets within the meory block. How do you do this efficiently without extra cost for the usual scanning?On 10.01.2014 22:42, Andrei Alexandrescu wrote:[snip][snip] Couldn't you store a pointer to the typeinfo within the allocated memory? IOW, the first part of the allocated memory would be the typeinfo* followed by the actual memory used to store the value of the allocated T object. -regards, Larrystd.emplace will continue to work as a way to build an object at a specified address. I suspect that allocating and manipulating objects on the GC heap in particular may have certain restrictions. One possibility to avoid such restrictions is to have a function typify(T)(void* p) which ascribes type T to heap location p.That sounds similar to my gc_emplace function. The problematic part is how to save that information in the GC.
Feb 28 2014
On 1/10/2014 12:37 PM, Rainer Schuetze wrote:As written in the other thread ("how to contribute to GC"), I have just made an attempt to make it more reviewable: https://github.com/rainers/druntime/commits/gcx_precise2 The necessary compiler fixes are here: https://github.com/D-Programming-Language/dmd/pull/2480Rainer, I want to apologize to you for not getting your efforts the attention they deserve. I hope we can make progress on this shortly.
Jan 10 2014
On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:I think this might help in having different heaps (especially thread local heaps if you include "shared"), but I'm not sure this works in unsafe D where casting is allowed.One of the key to a fast GC is segregating the heap in smaller parts. This is why generational collectors are so popular in java for instance. Segregating the heap imply so complications. For instance, it is necessary to track reference from one heap to another, typically from young to older objects. In D, the type system provide a natural segregation. It would be a great missed opportunity to not take advantage of it. This segregation is really nice as pointers go from one to another in a single direction, avoiding the need to track references changes, and all element of the heap part share some common characteristics. These characteristics allow for specialized GC algorithms (for instance, a GC can run on the immutable heap without the program even knowing about it). Casting is by essence bypassing the type system. In this case, you must be help responsible for what happen. The language cannot provide guarantees.
Jan 10 2014
On Fri, Jan 10, 2014 at 11:58:21PM +0000, deadalnix wrote:On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:Which unfortunately is not an option in D if write barriers are required.I think this might help in having different heaps (especially thread local heaps if you include "shared"), but I'm not sure this works in unsafe D where casting is allowed.One of the key to a fast GC is segregating the heap in smaller parts. This is why generational collectors are so popular in java for instance. Segregating the heap imply so complications. For instance, it is necessary to track reference from one heap to another, typically from young to older objects.In D, the type system provide a natural segregation. It would be a great missed opportunity to not take advantage of it. This segregation is really nice as pointers go from one to another in a single direction, avoiding the need to track references changes, and all element of the heap part share some common characteristics. These characteristics allow for specialized GC algorithms (for instance, a GC can run on the immutable heap without the program even knowing about it).Yeah, the transitivity of immutable really makes this a big opportunity. You know immutable can never point to mutable, and also that immutable never ever changes, so that makes for a perfect partitioning of GC memory. You never have to scan the immutable heap when collecting the mutable heap, and there's never a problem with mutating references in the immutable heap. So you don't need write barriers, and probably(?) don't need locks when collecting the immutable heap, too. Const is an interesting grey area here, though. It's also transitive, and doesn't allow mutation through it, but something else may mutate the data via another reference. So I'm not sure how exactly const will come into play here, though it does seem like another promising area for GC optimization.Casting is by essence bypassing the type system. In this case, you must be help responsible for what happen. The language cannot provide guarantees.Agreed. T -- Don't drink and derive. Alcohol and algebra don't mix.
Jan 10 2014
On 11.01.2014 00:58, deadalnix wrote:On Friday, 10 January 2014 at 20:37:53 UTC, Rainer Schuetze wrote:I think this might help in having different heaps (especially thread local heaps if you include "shared"), but I'm not sure this works in unsafe D where casting is allowed.One of the key to a fast GC is segregating the heap in smaller parts. This is why generational collectors are so popular in java for instance.Segregating the heap imply so complications. For instance, it is necessary to track reference from one heap to another, typically from young to older objects. In D, the type system provide a natural segregation. It would be a great missed opportunity to not take advantage of it. This segregation is really nice as pointers go from one to another in a single direction, avoiding the need to track references changes, and all element of the heap part share some common characteristics. These characteristics allow for specialized GC algorithms (for instance, a GC can run on the immutable heap without the program even knowing about it). Casting is by essence bypassing the type system. In this case, you must be help responsible for what happen. The language cannot provide guarantees.This might work with safe D. I haven't yet figured the consequences of implicit heap segregation by type. What restrictions will this impose on system programming? As compacting data is also pretty problematic as long as part of the scanning is conservative, we might as well consider storing generation and type information per allocation. This will destroy some optimization possibilities, though.
Jan 11 2014
On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I assume you are aware that there is implicit casting to immutable upon return from a strongly pure function. What about it?
Jan 10 2014
On Saturday, 11 January 2014 at 04:26:51 UTC, Timon Gehr wrote:On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:Is it a bird ? Is it a plane ? No it is *ISOLATED* !* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I assume you are aware that there is implicit casting to immutable upon return from a strongly pure function. What about it?
Jan 10 2014
On 1/10/14 8:26 PM, Timon Gehr wrote:On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:I don't know. Need to think about it. Maybe it's a wrong design decision (or maybe separate heaps are wrong). Andrei* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I assume you are aware that there is implicit casting to immutable upon return from a strongly pure function. What about it?
Jan 10 2014
On Fri, Jan 10, 2014 at 08:53:44PM -0800, Andrei Alexandrescu wrote:On 1/10/14 8:26 PM, Timon Gehr wrote:[...] I dunno, separate heaps, esp. for immutable vs. mutable, seems like a very powerful GC optimization opportunity. Missing it seems like such a pity, esp. given the price we pay for immutability (transitivity, actual immutability guarantees unlike C/C++ const, etc.). T -- If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:I don't know. Need to think about it. Maybe it's a wrong design decision (or maybe separate heaps are wrong).* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I assume you are aware that there is implicit casting to immutable upon return from a strongly pure function. What about it?
Jan 10 2014
In theory you could use a region allocator for a pure function and copy its result out. This wouldn't be worth it in the general case but imagine worker threads in a task pool. Pure functions executing message in/message(s) out that did not significantly contribute to the stop the world GC workload. My D experience is only limited (mainly a C++ programmer) but something along these lines is how I have always imagined GC will sidestep the 'stop the world' problem as we move toward async/await programming models.
Jan 13 2014
Am 11.01.2014 05:26, schrieb Timon Gehr:On 01/10/2014 09:03 AM, Andrei Alexandrescu wrote:I'd say the easiest way to deal with it is to let the compiler emit a "gc_isolatedToImmutable"-like call in this case which moves the object from one heap to another, assuming that the language guarantees safety.* There will be no possibility to change the type of certain objects once allocated. An allocation for an immutable object cannot e.g. make it later a mutable one. (This is an essential issue with the current allocator, I think.)I assume you are aware that there is implicit casting to immutable upon return from a strongly pure function. What about it?
Jan 12 2014
On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu wrote:The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well.An additional feature that I would like to suggest is the concept of time-limited collection. The goal would be to be able to write a main loop in my application where I say: const auto FRAME_DUR_MS = 25; while (keepPlaying) { auto start = Clock.currSystemTick() update(); render(); GC.collectMsecs( FRAME_DUR_MS - (Clock.currSystemTick() - start).msecs() ); } The idea is that the collector, conceptually, has a routine that it runs. I.e. 1. Point to bottom of stack, scan through. 2. Pop registers from each thread, check values. 3. Determine unreachable values amongst everything allocated. 4. Call destructors on objects. 5. Free memory. Since "unreachability" is a one-way street, I can pause anywhere in this process and resume from that location at a later time. I don't have to go through all of the steps every time I am called. So while there's no guarantee that any one call to collectMsecs() actually releases any memory, if the caller continues to call it reliably, they should be able to keep on top of their memory usage (or know that they need to allocate less frequently if they see their memory continue to grow, even under this scheme).
Jan 14 2014
On Tuesday, 14 January 2014 at 23:48:33 UTC, Chris Williams wrote:On Friday, 10 January 2014 at 08:03:03 UTC, Andrei Alexandrescu wrote:I really love this idea. However I'm not sure how kernels would like no sleeping to occur at least by my understanding. It would definitely be a big plus to anything related to gui's.The design at http://goo.gl/ZVCJeB seems to be a win. It works well, comprehends all major allocation tropes, someone implemented a subset of it in C++ and measured good results, and a coworker is considering adopting the design for a C++ project as well.An additional feature that I would like to suggest is the concept of time-limited collection. The goal would be to be able to write a main loop in my application where I say: const auto FRAME_DUR_MS = 25; while (keepPlaying) { auto start = Clock.currSystemTick() update(); render(); GC.collectMsecs( FRAME_DUR_MS - (Clock.currSystemTick() - start).msecs() ); } The idea is that the collector, conceptually, has a routine that it runs. I.e. 1. Point to bottom of stack, scan through. 2. Pop registers from each thread, check values. 3. Determine unreachable values amongst everything allocated. 4. Call destructors on objects. 5. Free memory. Since "unreachability" is a one-way street, I can pause anywhere in this process and resume from that location at a later time. I don't have to go through all of the steps every time I am called. So while there's no guarantee that any one call to collectMsecs() actually releases any memory, if the caller continues to call it reliably, they should be able to keep on top of their memory usage (or know that they need to allocate less frequently if they see their memory continue to grow, even under this scheme).
Jan 14 2014
On Wednesday, 15 January 2014 at 01:55:13 UTC, Rikki Cattermole wrote:I really love this idea. However I'm not sure how kernels would like no sleeping to occur at least by my understanding. It would definitely be a big plus to anything related to gui's.Sleep allows a thread to give way to another thread within the same process. If an app only has one thread, then that should end up ceding time to other processes, but the OS is still able to perform process switching regardless of whether an application ever has all of its threads asleep at once or not.
Jan 15 2014
On Wednesday, 15 January 2014 at 21:54:18 UTC, Chris Williams wrote:On Wednesday, 15 January 2014 at 01:55:13 UTC, Rikki Cattermole wrote:I'm referring to cpu prioritization. For the process. It'll depend heavily on the kernel/config of it however.I really love this idea. However I'm not sure how kernels would like no sleeping to occur at least by my understanding. It would definitely be a big plus to anything related to gui's.Sleep allows a thread to give way to another thread within the same process. If an app only has one thread, then that should end up ceding time to other processes, but the OS is still able to perform process switching regardless of whether an application ever has all of its threads asleep at once or not.
Jan 15 2014