digitalmars.D - Per thread heap, GC, etc.
- Markk (55/55) May 14 2021 Hi,
- Ola Fosheim Grøstad (3/6) May 14 2021 What do you think of per-task GC?
- Markk (32/38) May 14 2021 I think the per thread-model could be one very powerful
- Ola Fosheim Grøstad (15/20) May 14 2021 Yes, the problem is that spinning up a new thread is costly, and
- Markk (6/10) May 14 2021 I think thread pooling (along with the scoped release I
- Imperatorn (3/14) May 14 2021 D rox ☀️
- Imperatorn (6/11) May 14 2021 Interesting thoughts. Just a general question I've been trying to
- Ola Fosheim Grøstad (10/21) May 14 2021 For low level programming in general:
- Imperatorn (3/25) May 14 2021 For low level I wouldn't even think of trying to use GC.
- Markk (17/30) May 14 2021 First, I love the proposition of a GC. Most concerns are probably
- Adam D. Ruppe (4/6) May 14 2021 There's two types of programmer: the ones busy doing actual
- Markk (6/10) May 14 2021 I agree with that statement, but then I also believe that D
- russhy (5/11) May 14 2021 nobody complain about the GC
- H. S. Teoh (13/24) May 14 2021 IMO that impression is misleading, because those who are happy with the
- Markk (18/21) May 14 2021 Yes that could well explain a good proportion of it.
- Ola Fosheim Grøstad (7/12) May 14 2021 Everybody understands that, but you also have to look at where
- russhy (3/28) May 14 2021 because of phobos and its people i should stop write D? i like D
- sighoya (8/12) May 15 2021 No, please not.
- IGotD- (14/17) May 14 2021 I'm very much against binding any dynamic memory to any thread.
- Imperatorn (3/11) May 14 2021 Kinda agree on this. A thread is a virtualization of the cpu and
- Markk (25/40) May 14 2021 I disagree by 180°. If the memory management is associated with
- H. S. Teoh (14/24) May 14 2021 It would be nice, because it would allow per-thread GC, which could
- Markk (6/16) May 14 2021 No! I did not say this explicitly but of course `immutable` and
- sighoya (10/14) May 14 2021 Isn't that what Nim already has, thread local garbage collection?
- Markk (3/5) May 14 2021 Oh, I must look at Nim.
Hi,

D has this nice default per-thread static memory model. If I understand all this correctly, this allows for better, more natural thread safety, while it makes it generally unsafe to use this memory from other threads (without locking). I guess the same is implicitly true for stack memory.

Now could it equally make sense to use per-thread heaps? I.e. all allocations would need to be per thread, and it would be illegal to reference memory of one thread's heap, static memory, or stack from another thread's memory.

Some RAII locking could temporarily pin message-passed (etc.) references so that another thread can use them legally. This would make the sharing of the pointer known to the original thread for a strictly scoped time. Perhaps the `synchronized` keyword could be used for these stack references (just a spur-of-the-moment proposal for the purpose of this discussion). Pinning is coupled with message passing (etc.), i.e. no additional locking is required.

For a permanent change of ownership, i.e. storing a reference in static or heap memory of the other thread, the referenced memory would have to be copied. I.e. there are no `synchronized` references from inside static or heap (i.e. non-stack, non-scoped) memory.

I guess `synchronized` class objects could get their own, non-thread-specific, a.k.a. shared heap (similar in a way to the `shared` static memory). All stack references to `synchronized` class objects would have to be marked with `synchronized` too (or this might be inferred). `synchronized` class references stored in other (non-stack, non-scoped) thread memory would still be illegal.

Given the above, the GC could be run per thread. The world would not have to be stopped! This means that some threads could run entirely without GC while others could still benefit from what I personally think is the only universal and scalable solution to memory safety.

As a middle ground, some threads might only use a controlled amount of allocations, so GC runs would be super-fast, perhaps still acceptable under (near) real-time performance constraints.

The model would force developers towards a more modularized, per-thread (service?) oriented architecture where message passing and lock-free programming would be king... (said from a "schoolbook" understanding of these matters ;-)).

Also, I guess performance would improve: heap allocs/frees become lock free, thread-safe memory accesses become lock free (by language guarantee), and GC runs become per thread, smaller and (by definition) lock free. Being per-thread, i.e. non-preemptive, this could also simplify the GC and allow for more compiler optimizations, I guess. There is no danger of register aliasing and whatnot, which I can only guess is what makes a preemptively interrupting GC hard to get right under heavy compiler optimization.

Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D.

_Mark
May 14 2021
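For readers less familiar with the thread-local default that the opening post builds on, here is a minimal sketch of how it behaves in D today. The names `perThread`, `crossThread`, and `worker` are made up for illustration; only the stack and static data are per-thread here, while the heap is still the single GC heap that the proposal wants to split.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

int perThread;          // thread-local by default: every thread gets its own copy
shared int crossThread; // a single instance visible to all threads

// Hypothetical worker used only for this illustration.
void worker()
{
    perThread = 42;                // changes only the worker thread's copy
    atomicOp!"+="(crossThread, 1); // shared data needs atomics or locking
}

void main()
{
    perThread = 1;
    immutable before = atomicLoad(crossThread);

    auto t = new Thread(&worker);
    t.start();
    t.join();

    assert(perThread == 1);                        // main thread's copy is untouched
    assert(atomicLoad(crossThread) == before + 1); // the shared counter was updated
}
```

The type system already distinguishes the two kinds of static data (`int` versus `shared int`), which is the property the post suggests extending to heap allocations.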
On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
> Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D.

What do you think of per-task GC? https://forum.dlang.org/post/yqdwgbzkmutjzfdhotst
May 14 2021
On Friday, 14 May 2021 at 14:14:32 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
>> Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D.
>
> What do you think of per-task GC? https://forum.dlang.org/post/yqdwgbzkmutjzfdhotst

I think the per-thread model could be one very powerful implementation method for your proposal. So the two don't contradict each other at all, at least as far as I understand.

However, the per-thread association does more than just introduce a new Allocator variant. It introduces language guarantees, mostly through the fact that there is a stack clearly associated with each thread and therefore clear scoping is granted (this extends to fibers with a bit more complexity). The other rules for that guarantee are described above.

The proposal would be very "D", i.e. analogous to the default thread-local static data. In the same spirit as D's thread-local static data, it addresses concurrency issues along with memory issues and therefore makes it simpler to code, simpler to implement (non-preemptive), plus more performant at the same time.

Personally, I think introducing explicit Allocators, i.e. chaining them through everything as (template) parameters or (worse) booby trapping with sneaky overloads, makes code much more complex, IMHO unbearably so. By contrast, making memory management more of a per-thread thing could solve this concern too. You can easily set an Allocator as a per-thread setting and be done with it. The discussed language guarantees would make sure your allocations are not mixed or illegally referenced across threads. The exact same code can be run under different Allocators without template code bloat or parametrizing overhead.

When the thread terminates (or drops out of the defining scope), the whole memory can be freed as a whole. The discussed language guarantees would make sure you have nothing mixed and dangling. If GC was your set Allocator, no final collection is needed, as you equally proposed ;-).

_Mark
May 14 2021
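The idea of "an Allocator as a per-thread setting" has a partial counterpart in Phobos already: `std.experimental.allocator` exposes `theAllocator`, which is documented as the allocator for the current thread. The sketch below is only an illustration under that assumption; `makeFilled` is a hypothetical helper, and nothing here provides the language guarantees discussed in the post (the compiler does not stop such memory from being referenced by another thread).

```d
import std.experimental.allocator : allocatorObject, dispose, makeArray, theAllocator;
import std.experimental.allocator.mallocator : Mallocator;

// Hypothetical helper: allocates from whatever allocator the current
// thread has installed, without taking an allocator parameter.
int[] makeFilled(size_t n, int value)
{
    int[] buffer = theAllocator.makeArray!int(n);
    buffer[] = value;
    return buffer;
}

void main()
{
    // Swap the current thread's allocator for a malloc-based one,
    // and restore the previous one when the scope ends.
    auto previous = theAllocator;
    scope (exit) theAllocator = previous;
    theAllocator = allocatorObject(Mallocator.instance);

    int[] buffer = makeFilled(4, 7);
    scope (exit) theAllocator.dispose(buffer); // runs before the restore above
    assert(buffer == [7, 7, 7, 7]);
}
```

The calling code stays the same regardless of which allocator is installed, which is the "no template parameters" property the post argues for; the part that does not exist today is the enforcement.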
On Friday, 14 May 2021 at 15:00:20 UTC, Markk wrote:
> When the thread terminates (or drops out of the defining scope), the whole memory can be freed as a whole. The discussed language guarantees would make sure you have nothing mixed and dangling. If GC was your set Allocator, no final collection is needed, as you equally proposed ;-).

Yes, the problem is that spinning up a new thread is costly, and in order to get the benefits of "no final collection" the thread should be short lived.

So, that is where tasks come to the rescue. If you can split the work-load on many short-lived tasks then it can execute on many threads at the same time and still not cause any collection cycle. Of course, if you allow suspension of execution then you need to deal with saving the stack somehow or implement stackless coroutines (or something similar).

Anyway, I am happy to see that we are on the same page in general, let's keep the ideas on this flowing :-). Then maybe we can come up with something nice over time.

Cheers, Ola.
May 14 2021
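To make "many short-lived tasks on a pool of threads" concrete, here is a small sketch using `std.parallelism` as it exists today. The function `sumRange` is invented for the example. Note the assumption: these tasks still allocate from the one global GC heap, so this shows only the scheduling side of the idea, not a per-task heap.

```d
import std.parallelism : task, taskPool;

// Hypothetical unit of work: sums the integers in [lo, hi).
int sumRange(int lo, int hi)
{
    int total = 0;
    foreach (i; lo .. hi)
        total += i;
    return total;
}

void main()
{
    // Two short-lived tasks; the default pool decides which worker thread runs each.
    auto left = task!sumRange(0, 50);
    auto right = task!sumRange(50, 100);
    taskPool.put(left);
    taskPool.put(right);

    // yieldForce waits for the result (and may run the task in the caller if it has not started).
    assert(left.yieldForce + right.yieldForce == sumRange(0, 100));
}
```

Because a task can run on any pool thread, memory tied to a task rather than to a thread would need its lifetime bounded by the task itself, which is the distinction drawn in this exchange.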
On Friday, 14 May 2021 at 15:21:30 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 14 May 2021 at 15:00:20 UTC, Markk wrote:
>
> Yes, the problem is that spinning up a new thread is costly, and in order to get the benefits of "no final collection" the thread should be short lived.

I think thread pooling (along with the scoped release I described) and/or D's fibers address all of these concerns in very elegant ways. Again, D is already very, very close.

_Mark
May 14 2021
On Friday, 14 May 2021 at 15:30:53 UTC, Markk wrote:
> On Friday, 14 May 2021 at 15:21:30 UTC, Ola Fosheim Grøstad wrote:
>> On Friday, 14 May 2021 at 15:00:20 UTC, Markk wrote:
>>
>> Yes, the problem is that spinning up a new thread is costly, and in order to get the benefits of "no final collection" the thread should be short lived.
>
> I think thread pooling (along with the scoped release I described) and/or D's fibers address all of these concerns in very elegant ways. Again, D is already very, very close. _Mark

D rox ☀️ Let's improve it so it becomes perfect 😁
May 14 2021
On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
> Hi, Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D. _Mark

Interesting thoughts. Just a general question I've been trying to understand: Why do you think ppl "wrinkle their noses at GC"? 🤔 I don't get it. GC 4 life!! 🎶☀️ (yes I know in what circumstances you can't use it)
May 14 2021
On Friday, 14 May 2021 at 15:10:37 UTC, Imperatorn wrote:
> On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
>> Hi, Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D. _Mark
>
> Interesting thoughts. Just a general question I've been trying to understand: Why do you think ppl "wrinkle their noses at GC"? 🤔

For low level programming in general:

1. real time issues/performance
2. unpredictable cleanup (finalization/RAII)
3. higher memory consumption
4. more challenging interop with other languages
5. cannot be used in some execution contexts

For D specifically:

1. freezing all GC threads
2. no tracking of ownership-type in the type system
May 14 2021
On Friday, 14 May 2021 at 15:16:55 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 14 May 2021 at 15:10:37 UTC, Imperatorn wrote:
>> On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
>>> Hi, Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D. _Mark
>>
>> Interesting thoughts. Just a general question I've been trying to understand: Why do you think ppl "wrinkle their noses at GC"? 🤔
>
> For low level programming in general:
> 1. real time issues/performance
> 2. unpredictable cleanup (finalization/RAII)
> 3. higher memory consumption
> 4. more challenging interop with other languages
> 5. cannot be used in some execution contexts
>
> For D specifically:
> 1. freezing all GC threads
> 2. no tracking of ownership-type in the type system

For low level I wouldn't even think of trying to use GC.

1. Yeah, the implementation could be improved
May 14 2021
On Friday, 14 May 2021 at 15:10:37 UTC, Imperatorn wrote:
> On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
>> Hi, Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D. _Mark
>
> Interesting thoughts. Just a general question I've been trying to understand: Why do you think ppl "wrinkle their noses at GC"? 🤔 I don't get it. GC 4 life!! 🎶☀️ (yes I know in what circumstances you can't use it)

First, I love the proposition of a GC. Most concerns are probably unfounded and ill-informed.

But look at even this forum. It seems to be one of the biggest no-gos for D. Some of that is justified, such as for near-realtime performance or embedded. Some of it is just driven by fashion-victim hypes around Rust (a terrible language) and others.

Why is it so important? Mostly because it is an all-or-nothing proposition. Even one dependency (lib) can lock you in. AFAIK, you can't break out of it, even for parts of your application that - in themselves - don't use it.

The thread proposition would alleviate all these concerns. You could finally, truly have it both ways at the same time. D is the only language that I know that comes this close!

_Mark
May 14 2021
On Friday, 14 May 2021 at 15:23:13 UTC, Markk wrote:
> But look at even this forum. It seems to be one of the biggest no-gos for D.

There's two types of programmer: the ones busy doing actual productive work and the ones with the spare time to complain about GC on the forum.
May 14 2021
On Friday, 14 May 2021 at 15:30:46 UTC, Adam D. Ruppe wrote:
> On Friday, 14 May 2021 at 15:23:13 UTC, Markk wrote:
>
> There's two types of programmer: the ones busy doing actual productive work and the ones with the spare time to complain about GC on the forum.

I agree with that statement, but then I also believe that D should address the GC concern. Given how close D already is, it would be a shame not to. :-D

_Mark
May 14 2021
On Friday, 14 May 2021 at 15:30:46 UTC, Adam D. Ruppe wrote:
> On Friday, 14 May 2021 at 15:23:13 UTC, Markk wrote:
>> But look at even this forum. It seems to be one of the biggest no-gos for D.
>
> There's two types of programmer: the ones busy doing actual productive work and the ones with the spare time to complain about GC on the forum.

nobody complain about the GC

people complain about the fact everything is modeled around the idea of a poors man GC, implemented and served to everyone by force
May 14 2021
On Fri, May 14, 2021 at 04:25:47PM +0000, russhy via Digitalmars-d wrote:
> On Friday, 14 May 2021 at 15:30:46 UTC, Adam D. Ruppe wrote:
>> On Friday, 14 May 2021 at 15:23:13 UTC, Markk wrote:
>>> But look at even this forum. It seems to be one of the biggest no-gos for D.

IMO that impression is misleading, because those who are happy with the GC are silent and you don't hear from them, and the ones complaining about it are the vocal minority.

>> There's two types of programmer: the ones busy doing actual productive work and the ones with the spare time to complain about GC on the forum.

Oh the irony.

> nobody complain about the GC
> people complain about the fact everything is modeled around the idea of a poors man GC, implemented and served to everyone by force

Nobody is forcing you to do anything. If you don't like D because of the GC, there's plenty of alternatives, like Rust, that people here seem to love talking about. Nobody's twisting your arm that you must use D, and nobody's holding a gun to your head that you must use the GC. :-D

T

--
Why do conspiracy theories always come from the same people??
May 14 2021
On Friday, 14 May 2021 at 16:59:58 UTC, H. S. Teoh wrote:
> IMO that impression is misleading, because those who are happy with the GC are silent and you don't hear from them, and the ones complaining about it are the vocal minority.

Yes, that could well explain a good proportion of it. But then I do find the argument valid for many application scenarios. And I would probably think less of it if I hadn't watched many of the DConf and other sessions, where the topic of non-GC memory safety seems very dominant. There is this constant "the grass is greener over there" vibe coming across, with nods to Rust et al.

All the proposals I encountered so far (`@live` etc.) always want to ditch the GC entirely (as a global application choice) and that's something that I think will be very damaging to D's power and ecosystem. It will effectively ban all the existing D code base from these applications and make life so much harder for those that want to support both worlds. I guess there will be two stdlibs and two of everything, or if unified it will be crippled.

The proposal I made here might fix this. If it works, it is the best of both worlds, combined.

_Mark
May 14 2021
On Friday, 14 May 2021 at 16:59:58 UTC, H. S. Teoh wrote:
> Nobody is forcing you to do anything. If you don't like D because of the GC, there's plenty of alternatives, like Rust, that people here seem to love talking about. Nobody's twisting your arm that you must use D, and nobody's holding a gun to your head that you must use the GC.

Everybody understands that, but you also have to look at where computing is heading and where people are moving. The trend now is that many people create new languages (thanks to LLVM) for system-like programming. As a result you get many small ecosystems that cannot sustain themselves well. The big winners... C++/Rust and other languages that have momentum.
May 14 2021
On Friday, 14 May 2021 at 16:59:58 UTC, H. S. Teoh wrote:
> On Fri, May 14, 2021 at 04:25:47PM +0000, russhy via Digitalmars-d wrote:
>> On Friday, 14 May 2021 at 15:30:46 UTC, Adam D. Ruppe wrote:
>>> On Friday, 14 May 2021 at 15:23:13 UTC, Markk wrote:
>>>> But look at even this forum. It seems to be one of the biggest no-gos for D.
>
> IMO that impression is misleading, because those who are happy with the GC are silent and you don't hear from them, and the ones complaining about it are the vocal minority.
>
>>> There's two types of programmer: the ones busy doing actual productive work and the ones with the spare time to complain about GC on the forum.
>
> Oh the irony.
>
>> nobody complain about the GC
>> people complain about the fact everything is modeled around the idea of a poors man GC, implemented and served to everyone by force
>
> Nobody is forcing you to do anything. If you don't like D because of the GC, there's plenty of alternatives, like Rust, that people here seem to love talking about. Nobody's twisting your arm that you must use D, and nobody's holding a gun to your head that you must use the GC. :-D T

because of phobos and its people i should stop write D? i like D with core.stdc, i don't touch anything from std.
May 14 2021
On Friday, 14 May 2021 at 23:13:38 UTC, russhy wrote:
> because of phobos and its people i should stop write D? i like D with core.stdc, i don't touch anything from std.

No, please don't. But changing the stdlib to support other forms of GC may break existing functionality, which would require rewriting existing code that uses the GC.

> people complain about the fact everything is modeled around the idea of a poors man GC

Could you elaborate more on the poor man's GC? I think this thread is somewhat concerned with that. What are your ideas for making it less of a poor man's GC?
May 15 2021
On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
> Hi, [...] _Mark

I'm very much against binding any dynamic memory to any thread. It collides with a lot of programming models. For example, threads created outside D, in C++ or any other language, have no knowledge of D GC memory. This means that FFI is much more complicated.

Also, threads are like prostitutes: they do the work of the client and then another client comes along with some other work. A typical example is a thread pool, where any thread can do any work.

Also, bringing fibers into the equation makes this even more unfitting.

Memory bound to a thread is a bad idea, and as time moves on it becomes clearer that a program should not assume which thread it is running on (it should only operate on self) and also not which CPU it is running on.
May 14 2021
On Friday, 14 May 2021 at 15:37:22 UTC, IGotD- wrote:
> On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
>> [...]
>
> I'm very much against binding any dynamic memory to any thread. It collides with a lot of programming models. For example threads created outside D, in C++ or any other language has no knowledge of D GC memory. This means that FFI is much more complicated. [...]

Kinda agree on this. A thread is a virtualization of the cpu and a process is a virtualization of the memory.
May 14 2021
On Friday, 14 May 2021 at 15:37:22 UTC, IGotD- wrote:
> I'm very much against binding any dynamic memory to any thread. It collides with a lot of programming models. For example threads created outside D, in C++ or any other language has no knowledge of D GC memory. This means that FFI is much more complicated.

I disagree by 180°. If the memory management is associated with the thread, such "foreign" threads would be completely left alone by D or the GC, and that's exactly as it should be. Passing memory from/to that thread to a D thread is already a difficult thing to do right; this proposal would make that safer and more formal.

> Also threads are like prostitutes, they do the work of the client and then another client comes along doing some other work. Typically example are thread pools where any thread can do any work.

Again, I disagree. Please read my earlier post about how the Allocator assignment could be scoped (e.g. by the pool bootstrapper): https://forum.dlang.org/post/mqfuxbuuhpvqeyvxoang

Pool usage could be supported in a very natural way and be very fast, because for short tasks, the GC would never run and all the memory could be jettisoned en bloc when the thread task goes out of scope.

> Also bring fibers into the equation makes this even more unfitting.

Obviously fibers need to share the same thread heap/Allocator. Other than that, scoping/RAII (of the pinned references) is still valid and this is the most important thing. The limitations/language guarantees would sometimes be overly strict between fibers of the same thread, but they are still valid. Fibers are scheduled cooperatively, i.e. non-preemptively, so the premise of making the GC simpler/faster holds. So what is the problem, exactly?

> Memory bounded to a thread is a bad idea and as time moves on it becomes more clear that a program should not assume which thread they are running (should only operate on self) and also not which CPU they are running on.

This contradicts everything I read about locality becoming more and more important with modern multi-core processors. The following is simply the best article I have ever read about the issue. I recommend reading it: https://www.informit.com/articles/article.aspx?p=1609144

_Mark
May 14 2021
On Friday, 14 May 2021 at 17:02:00 UTC, Markk wrote:
> I disagree by 180°. If the memory management is associated with the thread, such "foreign" threads would be completely left alone by D or the GC, and that's exactly as it should be. Passing memory from/to that thread to a D thread is already a difficult thing to do right, this proposal would make that safer and more formal.

If, for example, C++ calls a D function and the D function does something temporary with arrays, then those arrays will not be cleaned up if the array memory is thread local. More likely, since the thread has no metadata in D, some error will happen. The programmer will surely notice, but this is very inconvenient, as the D function will not work if called from outside D.
May 14 2021
On Friday, 14 May 2021 at 17:13:00 UTC, IGotD- wrote:
> On Friday, 14 May 2021 at 17:02:00 UTC, Markk wrote:
>
> If for example C++ calls a D function, the D function does something temporary with arrays then those arrays will not be cleaned up if the array memory is thread local.

First of all, if the D function lives in the C++ thread (i.e. normal callback) then it inherits the memory management of the C++ thread (e.g. non-GC) and would have to behave accordingly.

The situation is much better than today, where the C++ thread punches into the D memory-managed world, and it is solely the developer's responsibility to make sure not to return GC'd memory back to the C++ thread. The language guarantees (which I described in the initial post) would make sure that nothing illegal can leak back into C++ by disallowing memory references from other (GC'd) threads.

If, however, the C++ call wanted to pass memory to/from other D threads, it could do so via message passing. Everything is properly managed and accounted for by the thread separation.

_Mark
May 14 2021
On Fri, May 14, 2021 at 01:48:12PM +0000, Markk via Digitalmars-d wrote:
[...]
> D has this nice default per-thread static memory model, i.e. if I understand all this correctly, this allows for better, more natural thread safety, while it makes it generally unsafe to use this memory from other threads (without locking). I guess the same is implicitly true for stack memory. Now could it equally make sense to use per-thread heaps?

It would be nice, because it would allow per-thread GC, which could address some of the problems people complain about the GC.

However, there's a big caveat: sharing data between threads would be essentially extremely broken. Today, immutable can be safely shared across threads, because well, it's immutable. But once allocations are bound to a thread, this sharing would be impossible without major problems.

> I.e. all allocations would need to be per thread, and it would be illegal to reference memory of one thread's heap, static memory, or stack from another thread's memory.
[...]

Yeah, this would be a major bugbear for implementing it in D.

T

--
If creativity is stifled by rigid discipline, then it is not true creativity.
May 14 2021
On Friday, 14 May 2021 at 16:11:26 UTC, H. S. Teoh wrote:
> However, there's a big caveat: sharing data between threads would be essentially extremely broken. Today, immutable can be safely shared across threads, because well, it's immutable. But once allocations are bound to a thread, this sharing would be impossible without major problems.

No! I did not say this explicitly but of course `immutable` and `shared` remain the same. I was talking about heap memory.

>> I.e. all allocations would need to be per thread, and it would be illegal to reference memory of one thread's heap, static memory, or stack from another thread's memory.
> [...] Yeah, this would be a major bugbear for implementing it in D.

Compared to what for instance `@live` has to analyze, it is super easy, I think.

_Mark
May 14 2021
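The point about `immutable` data in this exchange can be seen with `std.concurrency` as it works today: a slice of immutable elements may be sent to another thread by reference, with no copy and no lock. This is only a sketch of current behaviour; `summer` and `sumViaWorker` are hypothetical names, and how such data would be owned under per-thread heaps is exactly the open question being debated.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;

// Hypothetical worker: receives immutable data and reports its sum to the owner.
void summer()
{
    auto data = receiveOnly!(immutable(int)[])();
    int total = 0;
    foreach (value; data)
        total += value;
    send(ownerTid, total);
}

// Hypothetical helper: hands the slice to a new thread and waits for the answer.
int sumViaWorker(immutable(int)[] numbers)
{
    auto worker = spawn(&summer);
    send(worker, numbers); // accepted because the elements are immutable
    return receiveOnly!int();
}

void main()
{
    immutable(int)[] numbers = [1, 2, 3, 4];
    assert(sumViaWorker(numbers) == 10);
}
```

Sending a plain mutable `int[]` the same way is rejected at compile time, which is the existing type-level rule that both posters are referring to.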
On Friday, 14 May 2021 at 13:48:12 UTC, Markk wrote:
> Just some thoughts after reading a handful of Rust and D books... and after having seen so many wrinkle their noses at GC ... and therefore, unfortunately D. _Mark

Isn't that what Nim already has, thread-local garbage collection?

I thought there was a problem with adopting that in D; I think it relates to traced vs. non-traced pointers, though I'm no expert on this. I would like to know more about why we can't do this in D, because I like the idea in general.

However, I favor a task-based solution as already mentioned by Ola, i.e. a green-thread-based local GC, as you could do concurrency without threads.
May 14 2021
On Friday, 14 May 2021 at 16:14:00 UTC, sighoya wrote:
> Isn't that what Nim already has, thread local garbage collection?

Oh, I must look at Nim.

_Mark
May 14 2021