digitalmars.D - proposal: GC.*partial*collect(Duration maxPauseTime, Duration

reply mw <mingwu gmail.com> writes:
Hi,

I'm new to D(2). I have spent the past week reading a lot of other 
people's articles on D, its pros and cons, etc. One of the things 
people talk about a lot is the slow GC, especially in 
multi-threaded environments, real-time systems, and interactive 
game development.

OK, the GC has its problems, but most programs cannot do without it 
(a @nogc Phobos is not there yet). In practice the programmer knows 
best *when and where* in the program to call GC.collect() manually 
to free resources. But currently GC.collect() always does 
a full collection:

https://dlang.org/library/core/memory/gc.collect.html

and how long that takes is unpredictable. In real-time 
systems this is not acceptable, so the programmer will still 
hesitate to call GC.collect().

So we want a predictable GC. I just had a simple idea to improve 
the usability of the current GC implementation (probably with 
minimal code changes). My idea is to add parameters to GC.collect():

GC.*partial*collect(Duration maxPauseTime, Duration maxCollectionTime)

and ask it to do a partial collection. The parameter names are 
borrowed from the fields documented here :-)

https://dlang.org/library/core/memory/gc.profile_stats.html

Name               Type      Description
maxCollectionTime  Duration  largest time spent doing one GC cycle
maxPauseTime       Duration  largest time threads were paused during one GC cycle


With these two limits, the GC becomes much more controllable in real-time systems.

Thoughts?
May 18 2020
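
For context: in today's druntime the two names in the proposal are fields of `GC.ProfileStats`, where they are measurements reported after the fact, not limits that can be passed in. A minimal sketch of reading them, assuming a druntime recent enough to provide `GC.profileStats()`. The helper `churn` and the allocation sizes are made up for illustration, and whether the timing fields are populated may depend on the runtime version and GC options.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: allocate short-lived arrays so a collection has work to do.
void churn(size_t rounds)
{
    foreach (i; 0 .. rounds)
    {
        auto tmp = new int[](256);
        tmp[0] = cast(int) i;
    }
}

void main()
{
    churn(10_000);
    GC.collect(); // today: always a full collection, with no time bound

    // These fields describe collections that have already happened.
    auto ps = GC.profileStats();
    writeln("numCollections:    ", ps.numCollections);
    writeln("maxPauseTime:      ", ps.maxPauseTime);
    writeln("maxCollectionTime: ", ps.maxCollectionTime);
}
```

The numbers tell you how bad the worst pause was so far, which is the opposite direction from the proposal: there the caller would state the limit up front.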
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 19/05/2020 5:42 PM, mw wrote:
> Hi,
>
> I'm new to D(2). I have spent the past week reading a lot of other people's
> articles on D, its pros and cons, etc. One of the things people talk
> about a lot is the slow GC, especially in multi-threaded environments,
> real-time systems, and interactive game development.
>
> OK, the GC has its problems, but most programs cannot do without it (a
> @nogc Phobos is not there yet). In practice the programmer knows best
> *when and where* in the program to call GC.collect() manually to free
> resources. But currently GC.collect() always does a full collection:
>
> https://dlang.org/library/core/memory/gc.collect.html
>
> and how long that takes is unpredictable. In real-time systems this is
> not acceptable, so the programmer will still hesitate to call GC.collect().

There are two types of real-time systems.

Hard: you will not be using the GC; like Weka, you will roll your own libraries.

Soft: see dplug. You disable the GC and collect when appropriate, while minimizing how much memory is allocated via the GC.

Games fall into the soft territory, although depending on the game, buffers and custom allocators can offset this significantly, to the point that you can leave the GC enabled full time and not have to worry about it.

> So we want a predictable GC. I just had a simple idea to improve the
> usability of the current GC implementation (probably with minimal code
> changes). My idea is to add parameters to GC.collect():

Partial collection can be implemented using a strategy such as tri-color marking, spending at most a fixed amount of time on each scanning iteration. Alternatively, you can use fork'ing, or snapshots on Windows, which makes it an entirely asynchronous process.

We know what can be done; just nobody wants to do the work. Our GC implementation isn't all that friendly, even though it now supports fork'ing and can be precise.

The GC performance isn't all that bad in D as long as you minimize garbage, hence not much effort goes towards it. Most people won't notice that it even runs.
May 18 2020
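
The "soft" approach described above (disable the GC, then collect at moments the program chooses) can be sketched with the existing `core.memory` API. This is only an illustration under assumptions: `simulateFrame`, `collectAtSafePoint`, and the "every 200 frames" safe point are hypothetical names and numbers, not taken from dplug or any real project.

```d
import core.memory : GC;
import core.time : Duration, MonoTime;
import std.stdio : writeln;

// Hypothetical per-frame work that produces a little garbage.
int simulateFrame(int frame)
{
    auto scratch = new int[](1024);
    scratch[] = frame;
    return scratch[$ - 1];
}

// Run a full collection at a point the program chose and report how long it took.
Duration collectAtSafePoint()
{
    immutable start = MonoTime.currTime;
    GC.collect();
    return MonoTime.currTime - start;
}

void main()
{
    // Suppress automatic collections. The runtime may still collect if it
    // would otherwise run out of memory.
    GC.disable();
    scope (exit) GC.enable();

    foreach (frame; 0 .. 600)
    {
        simulateFrame(frame);

        // Assumed safe point, e.g. a level transition every 200 frames.
        if (frame % 200 == 199)
            writeln("collect took ", collectAtSafePoint());
    }
}
```

This controls when the pause happens but not how long it lasts, which is the gap the proposal in this thread is aimed at.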
next sibling parent reply mw <mingwu gmail.com> writes:
On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
> We know what can be done, just nobody wants to do the work.
> Our GC implementation isn't all that friendly even though it
> now supports fork'ing and can be precise.

That's exactly why I'm proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Users would then have a much more predictable (stop-the-world) GC.
May 19 2020
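
The idea of a timeout that "periodically checks for early return" can be illustrated on a toy object graph. This is a sketch under assumptions, not druntime's collector: `ToyMarker`, its fields, and the 1 ms budget are all hypothetical. The work list plays the role of the gray set in tri-color marking, so an early return keeps the progress made so far.

```d
import core.time : Duration, MonoTime, msecs;
import std.stdio : writeln;

// Toy heap: node i references the nodes listed in edges[i].
struct ToyMarker
{
    size_t[][] edges;
    bool[] marked;      // true for nodes already discovered (gray or black)
    size_t[] worklist;  // gray nodes: discovered but not yet scanned

    this(size_t[][] edges, size_t root)
    {
        this.edges = edges;
        marked = new bool[](edges.length);
        marked[root] = true;
        worklist = [root];
    }

    // Scan until the work list is empty or the time budget is spent.
    // Returns true once marking is complete.
    bool step(Duration budget)
    {
        immutable deadline = MonoTime.currTime + budget;
        while (worklist.length)
        {
            // Checked per node for simplicity; a real collector would
            // check far less often to keep the overhead down.
            if (MonoTime.currTime >= deadline)
                return false; // early return, worklist keeps the progress

            immutable node = worklist[$ - 1];
            worklist = worklist[0 .. $ - 1];
            foreach (child; edges[node])
            {
                if (!marked[child])
                {
                    marked[child] = true;
                    worklist ~= child;
                }
            }
        }
        return true;
    }
}

void main()
{
    // Node 4 is not reachable from root 0.
    size_t[][] edges = [[1, 2], [3], [3], [], [0]];
    auto marker = ToyMarker(edges, 0);

    size_t slices = 1;
    while (!marker.step(1.msecs))
        ++slices;

    writeln("marking finished after ", slices, " slice(s)");
    writeln("marked: ", marker.marked);
}
```

The toy graph never changes between slices. In a real program the mutator runs between slices and can rewrite pointers, so a resumable mark phase needs some way to notice those writes (for example write barriers), and nothing can be freed safely until marking has finished. That is one reason the change may be larger than a timeout check alone.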
parent reply Luis <luis.panadero gmail.com> writes:
On Tuesday, 19 May 2020 at 15:25:56 UTC, mw wrote:
> On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
>> We know what can be done, just nobody wants to do the work.
>> Our GC implementation isn't all that friendly even though it
>> now supports fork'ing and can be precise.
>
> That's exactly why I'm proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Users would then have a much more predictable (stop-the-world) GC.

I think that the problem isn't that your idea is bad or nobody likes it. It's a good idea, and would be a nice improvement. The real problem is that there is nobody that would implement it.
May 24 2020
parent reply mw <mingwu gmail.com> writes:
On Sunday, 24 May 2020 at 08:27:39 UTC, Luis wrote:
> On Tuesday, 19 May 2020 at 15:25:56 UTC, mw wrote:
>> On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
>>> We know what can be done, just nobody wants to do the work.
>>> Our GC implementation isn't all that friendly even though it
>>> now supports fork'ing and can be precise.
>>
>> That's exactly why I'm proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Users would then have a much more predictable (stop-the-world) GC.
>
> I think that the problem isn't that your idea is bad or nobody likes it. It's a good idea, and would be a nice improvement. The real problem is that there is nobody that would implement it.

Maybe we can start by drafting a DIP? Does anyone who is familiar with the process want to help (especially if you also like the idea and want the improvement)? Adam D. Ruppe :-)

Also, I noticed there isn't much documentation about the GC's internal design and implementation:

https://forum.dlang.org/thread/yampusjziyptnbndymik@forum.dlang.org

Could someone who knows the internals start documenting the GC on the D wiki, so that others can pick up from there?
May 24 2020
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 25/05/2020 5:41 AM, mw wrote:
> Maybe, we can start from drafting a DIP? anyone who is familiar with the
> process want to help? (esp if you also like the idea and want the
> improvement)  Adam D. Ruppe :-)

You don't need a DIP. Write the code, show it's good, get it merged.
May 24 2020
parent reply mw <mingwu gmail.com> writes:
On Monday, 25 May 2020 at 04:48:44 UTC, rikki cattermole wrote:
> You don't need a DIP.
>
> Write the code, show its good, get it merged.

Ha! I may give it a try, although I'm not an expert on this. Can you give some extra info, e.g. the source file location? Maybe in the following thread? (Is there any doc apart from the code itself?)

"""
Also I noticed there isn't so much documentation about the GC's internal design and implementation:

https://forum.dlang.org/thread/yampusjziyptnbndymik@forum.dlang.org

Can someone who knows the internals start documenting the GC on the D wiki, so that others can pick up from there?
"""
May 24 2020
parent rikki cattermole <rikki cattermole.co.nz> writes:
On 25/05/2020 5:55 PM, mw wrote:
> On Monday, 25 May 2020 at 04:48:44 UTC, rikki cattermole wrote:
>> You don't need a DIP.
>>
>> Write the code, show its good, get it merged.
>
> Ha! I may give it a try although I'm not an expert on this. Can you give some extra info? e.g. source file location? maybe in the following thread?

https://github.com/dlang/druntime/blob/master/src/gc/impl/conservative/gc.d

I tried once to add snapshot support so that collection on Windows could be asynchronous. Let's just say I never found the place where it does the pointer checking, and I didn't get much in the way of help myself. Not that I made much fuss about it. Squeaky wheel gets the oil and all that.
May 24 2020
prev sibling parent mw <mingwu gmail.com> writes:
On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
> The GC performance isn't all that bad in D, as long as you
> minimize garbage, hence not much effort goes towards it. For
> most people you won't notice that it even runs.

I just replied in the other thread about choosing a new language in a corporate environment:

https://forum.dlang.org/post/kjjredahuogmatldwolp@forum.dlang.org

It applies here too: companies always want predictability and a warranty. Personally I feel inclined to trust your informal words that "The GC performance isn't all that bad in D", but for commercial usage they always want a (written) warranty: what's the worst scenario that can happen?

Even a laughable warranty is still a warranty, e.g. "the GC will stop your program for at most 1 second". OK, that's fine: we know where the limit is, so at least we can design the software to work around it, e.g. design some user interaction steps or show animated, entertaining pictures (>= 1 second) while the GC runs in the background.

GC.*partial*collect(Duration maxPauseTime, Duration maxCollectionTime)
                             ^^^^^^^^^^^^           ^^^^^^^^^^^^^^^^^

These two parameters, if added, are the *written* guarantee that a company would want.
May 23 2020