
digitalmars.D - Potential GSoC project - GC improvements

reply Jeremy DeHaan <dehaan.jeremiah gmail.com> writes:
Hey all,

I'm trying to think of good project ideas for this year's GSoC, 
and one in particular that I thought would be great is working on 
and improving the GC. I'm not sure what the scope of this project 
would be like, but at the moment I am thinking writing a 
generational collector would be a good place to start.

I have more ideas, but I don't have a proposal yet. I just wanted 
some initial feedback on the idea, perhaps some advice for scope 
with the time frame in mind, and hopefully someone on the mentor 
list willing to take this on if it were to be accepted.

     Jeremy
Mar 09 2016
next sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Wednesday, 9 March 2016 at 22:49:28 UTC, Jeremy DeHaan wrote:
 Hey all,

 I'm trying to think of good project ideas for this years GSoC, 
 and one in particular I thought would be a great was working on 
 and improving the GC. I'm not sure what the scope of this 
 project would be like, but at the moment I am thinking writing 
 a generational collector would be a good place to start.

 I have more ideas, but I don't have a proposal yet. I just 
 wanted some initial feedback on the idea, perhaps some advice 
 for scope with the time frame in mind, and hopefully someone on 
 the mentor list willing to take this on if it were to be 
 accepted.
I'd think working on the GC would make a great GSoC project, as it's needed and fairly self-contained while having large impact. Perhaps someone could build off of Sociomantic's concurrent GC (https://www.sociomantic.com/blog/2013/06/porting-cdgc-to-d2/), which I assume has been ported to D2, porting it to Windows or whatever else remains to be done and then adding to it. There were some good ideas in this thread from a couple years ago too: http://forum.dlang.org/thread/ucsamkqvrihfuspnxfsk forum.dlang.org
Mar 09 2016
parent reply thedeemon <dlang thedeemon.com> writes:
On Thursday, 10 March 2016 at 05:01:37 UTC, Joakim wrote:
 Perhaps someone could build off of Sociomantic's concurrent GC 
 (https://www.sociomantic.com/blog/2013/06/porting-cdgc-to-d2/), 
 which I assume has been ported to D2, porting it to Windows or 
 whatever else remains to be done and then adding to it.
Do you think a Windows version of this cdgc is feasible at all? It relies on fork, something that can be emulated on Windows, but too slowly, as I understand it.
Mar 10 2016
parent Joakim <dlang joakim.fea.st> writes:
On Thursday, 10 March 2016 at 11:12:47 UTC, thedeemon wrote:
 On Thursday, 10 March 2016 at 05:01:37 UTC, Joakim wrote:
 Perhaps someone could build off of Sociomantic's concurrent GC 
 (https://www.sociomantic.com/blog/2013/06/porting-cdgc-to-d2/), which I assume
has been ported to D2, porting it to Windows or whatever else remains to be
done and then adding to it.
Do you think Windows version of this cdgc is feasible at all? It relies on fork, something that gets emulated on Windows but too slowly, as I understand.
I don't know enough about GCs and Windows to say. I'm assuming it can be done with some effort, but I don't know how much. Even if it's too much work, it'd be worth it just for POSIX systems, which make up almost 70% of all general-purpose computers running apps these days. :) If it's good enough, it could eventually become the default on there, while leaving the legacy GC for Windows.
Mar 10 2016
prev sibling next sibling parent Adam Wilson <flyboynw gmail.com> writes:
Jeremy DeHaan wrote:
 Hey all,

 I'm trying to think of good project ideas for this years GSoC, and one
 in particular I thought would be a great was working on and improving
 the GC. I'm not sure what the scope of this project would be like, but
 at the moment I am thinking writing a generational collector would be a
 good place to start.

 I have more ideas, but I don't have a proposal yet. I just wanted some
 initial feedback on the idea, perhaps some advice for scope with the
 time frame in mind, and hopefully someone on the mentor list willing to
 take this on if it were to be accepted.

      Jeremy
The follow-on projects enabled by a precise GC would be quite useful... I'd be interested in mentoring if nobody else with more experience is interested. -- // Adam Wilson // quiet.dlang.dev
Mar 09 2016
prev sibling parent reply NX <nightmarex1337 hotmail.com> writes:
On Wednesday, 9 March 2016 at 22:49:28 UTC, Jeremy DeHaan wrote:
 Hey all,

 I'm trying to think of good project ideas for this years GSoC, 
 and one in particular I thought would be a great was working on 
 and improving the GC. I'm not sure what the scope of this 
 project would be like, but at the moment I am thinking writing 
 a generational collector would be a good place to start.
http://www.infognition.com/blog/2014/the_real_problem_with_gc_in_d.html

Good luck with a generational GC... I think the best possible improvement for the GC is making it lock-free. Currently, the GC lock causes some serious performance penalties for multithreaded code when frequent allocations take place.
Mar 09 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it lock-free.
 Currently, GC lock cause some serious performance penalties for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
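
As a rough illustration of that suggestion (not an agreed design -- the block size, the Mallocator fallback, and all names below are assumptions), a per-thread front end built on std.experimental.allocator's FreeList could look like the sketch below; the lock-free shared variant would use SharedFreeList instead:

import std.experimental.allocator.building_blocks.free_list : FreeList;
import std.experimental.allocator.mallocator : Mallocator;

// Module-level variables are thread-local in D, so every thread gets its
// own freelist and small allocations never contend on a global GC lock.
FreeList!(Mallocator, 64) tlsSmallBlocks;

void main()
{
    void[] buf = tlsSmallBlocks.allocate(64); // served from this thread's list
    // ... use buf ...
    tlsSmallBlocks.deallocate(buf);           // returned to the same list
}
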
Mar 10 2016
parent reply Jeremy DeHaan <dehaan.jeremiah gmail.com> writes:
On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei Alexandrescu 
wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it 
 lock-free.
 Currently, GC lock cause some serious performance penalties for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the GC use something that is still in std.experimental, or do we think that's OK? I also know that there are some people who think we should avoid using Phobos in druntime.
Mar 10 2016
parent reply ZombineDev <petar.p.kirov gmail.com> writes:
On Thursday, 10 March 2016 at 16:46:59 UTC, Jeremy DeHaan wrote:
 On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei Alexandrescu 
 wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it 
 lock-free.
 Currently, GC lock cause some serious performance penalties 
 for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the gc use something in std.experimental? Or should we think that's ok? I also know that there are some people that think we should avoid using Phobos in druntime.
There's no problem in using std.experimental.* internally, because if the API changes, the one who changes the API will also need to change all internal usages, otherwise his pull request won't compile. About using Phobos in Druntime - you can't directly import Phobos, since its files are not present when Druntime is built. The solution is to copy the stuff you need to https://github.com/D-Programming-Language/druntime/tree/master/src/rt/util or some place like this.
Mar 10 2016
next sibling parent reply Daniel Kozak via Digitalmars-d <digitalmars-d puremagic.com> writes:
Dne 10.3.2016 v 18:15 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 16:46:59 UTC, Jeremy DeHaan wrote:
 On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei Alexandrescu wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it lock-free.
 Currently, GC lock cause some serious performance penalties for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the gc use something in std.experimental? Or should we think that's ok? I also know that there are some people that think we should avoid using Phobos in druntime.
There's no problem in using std.experimental.* internally, because if the API changes, the one who changes the API will also need to change all internal usages, otherwise his pull request won't compile. About using Phobos in Druntime - you can't directly import Phobos, since it's files are not present when Druntime is built.
But we already need an existing D compiler to build the D frontend, so maybe we can use this full D compiler for building druntime in the future?
Mar 10 2016
parent reply ZombineDev <petar.p.kirov gmail.com> writes:
On Thursday, 10 March 2016 at 17:46:15 UTC, Daniel Kozak wrote:
 Dne 10.3.2016 v 18:15 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 16:46:59 UTC, Jeremy DeHaan 
 wrote:
 On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei 
 Alexandrescu wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it 
 lock-free.
 Currently, GC lock cause some serious performance penalties 
 for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the gc use something in std.experimental? Or should we think that's ok? I also know that there are some people that think we should avoid using Phobos in druntime.
There's no problem in using std.experimental.* internally, because if the API changes, the one who changes the API will also need to change all internal usages, otherwise his pull request won't compile. About using Phobos in Druntime - you can't directly import Phobos, since it's files are not present when Druntime is built.
But we allready need existing D compiler to build D frontend, so maybe we can use this full D compiler for building druntime in the future?
The build process currently looks something like this (a bit simplified):

1. Download old dmd.
2. Build new dmd with 1.
3. Build druntime with 2.
4. Build phobos with 2 and link it with 3.
5. Package 2, 3 and 4 into a release zip.

You can't use 1 for building 3 (or 4), because 3 (or 4) may depend on a new feature only present in 2. Also, you can't rely on 5 for building 3 (or 4) because you'll get a cyclic dependency.
Mar 10 2016
parent reply Daniel Kozak via Digitalmars-d <digitalmars-d puremagic.com> writes:
Dne 10.3.2016 v 19:39 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 17:46:15 UTC, Daniel Kozak wrote:
 Dne 10.3.2016 v 18:15 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 16:46:59 UTC, Jeremy DeHaan wrote:
 On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei Alexandrescu wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it lock-free.
 Currently, GC lock cause some serious performance penalties for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the gc use something in std.experimental? Or should we think that's ok? I also know that there are some people that think we should avoid using Phobos in druntime.
There's no problem in using std.experimental.* internally, because if the API changes, the one who changes the API will also need to change all internal usages, otherwise his pull request won't compile. About using Phobos in Druntime - you can't directly import Phobos, since it's files are not present when Druntime is built.
But we allready need existing D compiler to build D frontend, so maybe we can use this full D compiler for building druntime in the future?
The build process looks something like this currently: (a bit simplified) 1. Download old dmd. 2. Build new dmd with 1. 3. Build druntime with 2. 4. Build phobos with 2 and link it with 3. 5. Package 2, 3 and 4 into a release zip You can't use 1 for building 3 (or 4), because 3 (or 4) may depend on a new feature only present in 2. Also you can't rely on 5 for building 3 (or 4) because you'll get a cyclic dependency.
I know how it works now :). But if D is stable enough, 1 should be good enough to do 3 and 4 (yes, you can't rely on features and fixes from 2). I think we should avoid using the latest features when developing dmd, druntime and phobos (maybe we should use only some subset, or make some rules like: a new version must be compilable by the 3 previous versions of dmd).
Mar 10 2016
parent ZombineDev <petar.p.kirov gmail.com> writes:
On Thursday, 10 March 2016 at 18:49:28 UTC, Daniel Kozak wrote:
 Dne 10.3.2016 v 19:39 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 17:46:15 UTC, Daniel Kozak wrote:
 Dne 10.3.2016 v 18:15 ZombineDev via Digitalmars-d napsal(a):
 On Thursday, 10 March 2016 at 16:46:59 UTC, Jeremy DeHaan 
 wrote:
 On Thursday, 10 March 2016 at 15:24:48 UTC, Andrei 
 Alexandrescu wrote:
 On 3/9/16 10:40 PM, NX wrote:
 I think the best possible improvement for GC is making it 
 lock-free.
 Currently, GC lock cause some serious performance 
 penalties for
 multithreaded code when frequent allocations take place.
I agree. A first step would be easy to do with std.allocator's thread-local freelists. -- Andrei
I was looking into this, but I am slightly hesitant. Should the gc use something in std.experimental? Or should we think that's ok? I also know that there are some people that think we should avoid using Phobos in druntime.
There's no problem in using std.experimental.* internally, because if the API changes, the one who changes the API will also need to change all internal usages, otherwise his pull request won't compile. About using Phobos in Druntime - you can't directly import Phobos, since it's files are not present when Druntime is built.
But we allready need existing D compiler to build D frontend, so maybe we can use this full D compiler for building druntime in the future?
The build process looks something like this currently: (a bit simplified) 1. Download old dmd. 2. Build new dmd with 1. 3. Build druntime with 2. 4. Build phobos with 2 and link it with 3. 5. Package 2, 3 and 4 into a release zip You can't use 1 for building 3 (or 4), because 3 (or 4) may depend on a new feature only present in 2. Also you can't rely on 5 for building 3 (or 4) because you'll get a cyclic dependency.
I know how it works now :). But if D is stable enought 1 should be good enought to do 3 and 4 (yes you can't rely on features and fixes from 2). I think we should avoid to use latest features when developing dmd, druntime and phobos (maybe we should use only some subset or make some rules like new version must be compilable by 3 previous version of dmd)
Well for 4. maybe, but after a couple of years... Most of the time Phobos can't be compiled with an older version of DMD not because it relies on new features, but because there are bugs in the older version of DMD. As for 3. - druntime is very tightly coupled with the compiler. Quite often changes in the compiler rely on changes in druntime - e.g. the new DWARF exceptions, C++ and Objective-C support, better code coverage and GC profiling, things in object.d, built-in AAs and arrays, linking with MSVC 2015 libc, shared libraries, TLS, etc. On the other hand, I think that using Phobos in DMDFE is a good idea and is far more likely to happen.
Mar 10 2016
prev sibling parent reply Jeremy DeHaan <dehaan.jeremiah gmail.com> writes:
Thank you all for the feedback.

I think I might still need a little more feedback as to what the 
project should actually entail, but here's what it's looking like 
so far:

Implement lock free allocation using std.experimental.allocator's 
freelists (SharedFreeList? It was the only thing in the 
documentation that specifically mentions lock free allocation)

Implement a generational garbage collector

Implement precise garbage collection (possibly piggybacking off 
of Rainer's work)


Does anyone think that might be too much to take on or does it 
sound pretty good? I have other garbage collector things I'd like 
to explore, but they should probably go in the "if time allows" 
category at this point.

    Jeremy
Mar 11 2016
next sibling parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 12/03/16 4:13 AM, Jeremy DeHaan wrote:
 Thank you all for the feedback.

 I think I might still need a little more feedback as to what the project
 should actually entail, but here's what it's looking like so far:

 Implement lock free allocation using std.experimental.allocator's
 freelists (SharedFreeList? It was the only thing in the documentation
 that specifically mentions lock free allocation)

 Implement a generational garbage collector

 Implement precise garbage collection (possibly piggybacking off of
 Rainer's work)


 Does anyone think that might be too much to take on or does it sound
 pretty good? I have other garbage collector things I'd like to explore,
 but I they should probably go in the "if time allows" category at this
 point.

     Jeremy
Ugh, I would say that is a lot. So here is what I would do: have two lists of goals. One with the things you would be happy to get done, and the other with things you will do if you are able to. That way you prioritize what you want to get done over the rest.
Mar 11 2016
prev sibling next sibling parent reply Adam Wilson <flyboynw gmail.com> writes:
Jeremy DeHaan wrote:
 Thank you all for the feedback.

 I think I might still need a little more feedback as to what the project
 should actually entail, but here's what it's looking like so far:

 Implement lock free allocation using std.experimental.allocator's
 freelists (SharedFreeList? It was the only thing in the documentation
 that specifically mentions lock free allocation)

 Implement a generational garbage collector

 Implement precise garbage collection (possibly piggybacking off of
 Rainer's work)


 Does anyone think that might be too much to take on or does it sound
 pretty good? I have other garbage collector things I'd like to explore,
 but I they should probably go in the "if time allows" category at this
 point.

     Jeremy
If I may make a suggestion. The lock-free work is unlikely to require the entirety of GSoC. And the precise GC is the next most important thing on your list and will have the biggest impact on GC performance.

Once the GC is fully precise we can implement a fully compacting GC, which improves the usefulness of generational collection. Additionally, precision allows a significant amount of work to be done in improving the performance of the GC in multi-threaded scenarios. It should be quite possible to avoid needing fork() or anything like it altogether. I know that the .NET GC doesn't need to use anything like it.

Also, I would strongly recommend getting this book and reading it cover to cover before starting: http://www.amazon.com/gp/product/1420082795/ref=pd_lpo_sbs_dp_ss_1?pf_rd_p=1944687562&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0471941484&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=0QD9X3E5QATSBCBT6BMM

I think Rainer's precise GC only dealt with heap objects. What needs work is register and stack scanning. Expanding on Rainer's existing precise GC work is the right idea, but register and stack scanning is a very big project in its own right. I suspect it will take up the remainder of your GSoC time. :)

-- // Adam Wilson // quiet.dlang.dev
Mar 12 2016
next sibling parent Chris Wright <dhasenan gmail.com> writes:
On Sat, 12 Mar 2016 00:50:06 -0800, Adam Wilson wrote:
 If I may make a suggestion. The lock free work is unlikely to require
 the entirety of GSoC. And the precise GC is the next most important
 thing on your list and will have the biggest impact on GC performance.
 
 Once the GC is fully precise we can implement a fully compacting GC,
Unions mean we can't be entirely precise, but we can get pretty close. Stack maps and type pointer maps are entirely separate efforts, and type pointer maps will be easier to implement, so the first pass will probably only include type pointer maps. Switching to a compacting GC is risky. It's a transparent change if you're only working with D code, but if you interface to other languages, it breaks things horribly.
 which improves the usefulness of generational collection. Additionally,
 precision allows a significant amount of work to be done in improving
 the performance of the GC in multi-threaded scenarios.
Primarily because less memory needs to be scanned on average, so less work needs to be done and GC pauses are shorter. There's nothing specific to multithreading there.
 It should be
 quite possible to avoid needing fork() or anything like it altogether.
fork() is used to avoid pausing the program at all during the mark phase.
 I know that the .NET GC doesn't need to use anything like it.
The .NET GC stops all managed threads during a collection. During a nursery or gen 1 collection, on average it's probably fast enough that forking wouldn't give that much benefit. During a full collection, if your application is latency-sensitive, a forking strategy would be better. But .NET can't do that because Windows doesn't have a fork() system call, and it's expensive to emulate.
Mar 12 2016
prev sibling parent reply Jeremy DeHaan <dehaan.jeremiah gmail.com> writes:
On Saturday, 12 March 2016 at 08:50:06 UTC, Adam Wilson wrote:
 If I may make a suggestion. The lock free work is unlikely to 
 require the entirety of GSoC. And the precise GC is the next 
 most important thing on your list and will have the biggest 
 impact on GC performance.
Rainer has two different precise GCs in pull requests right now and both are slower than the current one unless there are false pointers. I would expect anything I come up with to largely act the same. The reason I keep pushing for a generational garbage collector is that I think it is where we would see the most benefit in terms of general performance.
 Once the GC is fully precise we can implement a fully 
 compacting GC, which improves the usefulness of generational 
 collection. Additionally, precision allows a significant amount 
 of work to be done in improving the performance of the GC in 
 multi-threaded scenarios. It should be quite possible to avoid 
 needing fork() or anything like it altogether. I know that the 
 .NET GC doesn't need to use anything like it.
A compacting GC could be built on top of a generational GC without much difficulty I would think, if we wanted to go that route. The compaction would just happen as part of a collection cycle when things are moved into the next generation. I have concerns about doing any compaction though, mostly because D can have both references and pointers to objects, and I don't know how we would want to go about updating pointers. Especially since pointers to D objects can exist in C and C++ code.

Another reason I want to work on a generational GC is that it can also lead into a concurrent GC without trying to emulate fork() on Windows. The .NET GC has 3 generations, with the last one having its collections run concurrently because it is unlikely to affect anything else going on. They don't bother running the other generations concurrently because their collections are really short. We could do something similar.

Perhaps someone more intimate with GCs than I am can speak up, but I think that a generational GC would be the best use of time in relation to performance gains. Other things can then be implemented on top of it later.
 Also, I would strongly recommend getting this book and reading 
 it cover to cover before starting:
 http://www.amazon.com/gp/product/1420082795/ref=pd_lpo_sbs_dp_ss_1?pf_rd_p=1944687562&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0471941484&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=0QD9X3E5QATSBCBT6BMM
Thank you for the link to the book. I was planning on this one http://www.amazon.com/gp/product/0471941484/ , but maybe I will buy them both.
Mar 12 2016
parent reply Adam Wilson <flyboynw gmail.com> writes:
Jeremy DeHaan wrote:
 On Saturday, 12 March 2016 at 08:50:06 UTC, Adam Wilson wrote:
 If I may make a suggestion. The lock free work is unlikely to require
 the entirety of GSoC. And the precise GC is the next most important
 thing on your list and will have the biggest impact on GC performance.
Rainer has two different precise GC's in pull requests right now and both are slower than the current one unless there are false pointers. I would expect anything I come up with to largely act the same. The reason I keep pushing for a generational garbage collector is because I think it would be where we would see the most benefit in terms of general performance.
I would contend that it's not quite that simple. Of course the precise GC is slower; it's scanning more stuff. The current conservative GC by definition can't figure out large chunks of memory, so it won't even bother scanning them. So what does it do with those things it can't scan? What it can't scan it has to pin, which means the memory can't be moved. You can't build a fully generational GC until you can move everything. They talk about this problem in the books. However, once you have a precise GC, a fully moving GC becomes possible, and only then do the performance benefits of a generational GC become realized. I strongly suspect that a partially moving GC will perform worse than the precise GC. And I know for a fact that it will suffer enormously from fragmentation and out-of-memory issues. Precision is the basis from which all higher order GC functions flow.
 Once the GC is fully precise we can implement a fully compacting GC,
 which improves the usefulness of generational collection.
 Additionally, precision allows a significant amount of work to be done
 in improving the performance of the GC in multi-threaded scenarios. It
 should be quite possible to avoid needing fork() or anything like it
 altogether. I know that the .NET GC doesn't need to use anything like it.
A compacting GC could be built on top of a generational GC without much difficulty I would think, if we wanted to go that route. The compaction would just happen as part of a collection cycle when things are moved into the next generation. I have concerns about doing any compaction though, mostly because D can have both references and pointers to objects, and I don't know how we would want to go about updating pointers. Especially since pointers to D objects can exists in C and C++ code.
Erm, a generational GC is, in practice, a moving GC, as it must move the blobs around from the various heaps. The commonly used method is stop-and-copy. As for the C/C++ code problem, various solutions have been proposed here. For one, D knows the difference between the two, so pinning is possible there. Or a separate heap space can be used. AFAIK each OS allows you to specify multiple heaps, so you could specify an unmanaged heap to go along with the managed heaps. IIRC, C++/CLI does something similar on Windows. This would be the preferred solution IMO. (Look into HeapCreate on Windows, for example.)
 Another reason I want to work on a generational GC is because this can
 also lead into a concurrent GC without trying to emulate fork() on
 windows. The .Net GC has 3 generations with the last one having its
 collections running concurrently because it is unlikely to affect
 anything else going on. They don't bother running the other generations
 concurrently because their collections are really short. We could do
 something similar.
A concurrent GC without precision is mostly useless because the GC roots are primarily derived from the stack on which the thread was spawned, and the stack roots are not scanned in a conservative GC. Additionally, the async/await pattern is not practical without a precise GC.
 Perhaps someone more intimate with GC's than I am can speak up, but I
 think that a generational GC would be the best use of time in relation
 to performance gains. Other things can then be implemented on top of it
 later.
The problem is that performance is the most obvious problem with the GC, but not the most important one, nor the most useful to tackle first. Precision enables a fully moving, fully generational, fully concurrent GC. It also enables other major features that have nothing to do with the GC.
 Also, I would strongly recommend getting this book and reading it
 cover to cover before starting:
 http://www.amazon.com/gp/product/1420082795/ref=pd_lpo_sbs_dp_ss_1?pf_rd_p=1944687562&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0471941484&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=0QD9X3E5QATSBCBT6BMM
Thank you for the link to the book. I was planning on this one http://www.amazon.com/gp/product/0471941484/ , but maybe I will buy them both.
Obviously you can do whatever you want. :) And I'll admit that precision isn't sexy, and it won't improve performance right away, but it opens the door to further work that will provide those things. It's hard and thankless work, but you'll be better set up for GSoC 2017 for adding generational and concurrency support. My opinion is that we need to focus on improving the core of the GC so that it can better perform the higher-order GC functions. Yes, it will make the GC slower, but it's already bad so I doubt people will notice much. We've discussed (and by discussed I mean thermonuclear flame-wars) the GC numerous times before at great length on this forum, so I am glad to see that this thread hasn't turned into that, yet. -- // Adam Wilson // quiet.dlang.dev
Mar 12 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Sat, 12 Mar 2016 13:23:35 -0800, Adam Wilson wrote:

To start off, let's talk terminology. You seem to be using nonstandard 
terminology and possibly misunderstanding standard terminology.

A GC scan is the mark phase of a mark/sweep collector (and specifically 
the part where the GC examines roots and allocated memory, if that 
matters). The GC searches for pointers to GC memory. Simple GCs only need 
to record whether there is a pointer to a given allocation, not where 
those pointers are coming from.

A conservative GC is one that will encounter some things that aren't 
pointers but must be scanned as if they were pointers. In D, for 
instance, a union between a pointer and a size_t forces collectors to be 
conservative.

A false pointer is something that a conservative GC must treat as a 
pointer and appears to point to an object allocated by that GC. For 
instance, unions in D allow us to create false pointers.
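
As a small hypothetical illustration (the union and sizes below are made up 
for this example, not taken from druntime):

union IntOrPtr
{
    size_t n;
    void*  p;
}

void main()
{
    import core.memory : GC;

    auto block = GC.malloc(1024);
    IntOrPtr u;
    u.n = cast(size_t) block; // stored as a plain integer...
    block = null;
    GC.collect();             // ...but a conservative GC must assume u.p
                              // could be a live pointer and keep the
                              // 1024-byte block alive: a false pointer.
}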

A precise GC is a GC that never mistakes a non-pointer for a pointer.

A moving GC is a GC that can relocate objects.

A generational GC is one that can perform a collection on part of the 
heap without dealing with the full heap. Those parts tend to be organized 
by how long an object has survived. This does not require a precise or 
moving GC, though it works better with a precise, moving GC.

A write barrier is a way to record which sections of memory have been 
altered since the last GC collection. This lets the GC scan only the 
changed data -- useful to any GC, especially useful to generational ones.
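
For illustration only, a card-marking write barrier could look roughly like 
this (an assumed sketch -- the names, card size, and omitted initialization 
are placeholders, not how druntime does it):

enum cardSize = 512;          // bytes of heap covered by one card
__gshared ubyte[] cardTable;  // one "dirty" byte per card (sized elsewhere)
__gshared void*   heapBase;   // start of the managed heap (set elsewhere)

void writeBarrier(void** slot, void* newValue)
{
    *slot = newValue;   // the ordinary pointer store
    // Record which region changed, so the next partial collection only
    // re-scans dirty cards instead of the whole heap.
    auto card = (cast(size_t) slot - cast(size_t) heapBase) / cardSize;
    cardTable[card] = 1;
}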

A "partially moving" GC does not exist, as far as I know.

 Jeremy DeHaan wrote:
 On Saturday, 12 March 2016 at 08:50:06 UTC, Adam Wilson wrote:
 If I may make a suggestion. The lock free work is unlikely to require
 the entirety of GSoC. And the precise GC is the next most important
 thing on your list and will have the biggest impact on GC performance.
Rainer has two different precise GC's in pull requests right now and both are slower than the current one unless there are false pointers. I would expect anything I come up with to largely act the same. The reason I keep pushing for a generational garbage collector is because I think it would be where we would see the most benefit in terms of general performance.
I would contend that it's not quite that simple. Of course the precise GC is slower, it's scanning more stuff.
Precise GCs are slower because they must reference some data telling them which items on the stack and in allocations represent pointers and which do not. This data takes up cache space, and having less cache for memory being scanned slows things down. Correlating two pieces of data takes extra time, too. The hope with a precise GC is to avoid false pointers. A false pointer to an object that holds references to many other objects can be costly, both in GC time and in excess memory usage. In D, we have a tiny bit of type information surfaced to the GC to reduce the incidence of false pointers. We could additionally take advantage of the fact that numbers tend to be small and people tend to use 4 to 8 byte numbers: we could arrange our address space to avoid ranges where false pointers are likely. But that's probably unnecessary.
 The current conservative GC by
 definition can't figure out large chunks of memory so it won't even
 bother scanning them.
It must scan anything that might be a pointer. A precise GC reduces the number of things that might be pointers but aren't to zero. Therefore a precise GC scans less. If a conservative GC didn't scan something that might be a pointer, it would sometimes free an object that still had live references to it. This would result in memory corruption or segmentation faults.
 What it does to those things it can't scan. And
 what it can't scan it has to pin, which means the memory can't be moved.
If a moving GC cannot precisely identify all pointers to an object, it is not allowed to move that object. If it can, it is allowed to move that object. A conservative GC can be a moving GC. It must classify things as "definitely pointer", "maybe pointer", and "not pointer". It cannot move an object with a "maybe pointer" pointing to it, but it can move an object with several "definitely pointers" pointing to it. In D, thanks to unions, we can never create a fully precise collector, but we can create a moving collector. Objects pointed to from a union can't be moved, but few objects will be pointed to from a union.
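
A rough sketch of that classification as the mark phase might apply it (all 
names here are illustrative, not druntime's):

enum PtrKind { notPointer, maybePointer, definitelyPointer }

struct BlockInfo
{
    bool reachable;
    bool pinned;   // set when any "maybe pointer" refers to this block
}

void markCandidate(PtrKind kind, BlockInfo* block)
{
    final switch (kind)
    {
        case PtrKind.notPointer:
            return;                  // plain data, nothing to do
        case PtrKind.maybePointer:
            block.pinned = true;     // keep it alive, but never move it
            break;
        case PtrKind.definitelyPointer:
            break;                   // keep it alive, still free to move
    }
    block.reachable = true;
    // ...then scan the block's interior for further pointers.
}
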
 You can't build a fully generational GC until you can move everything.
You can build a more efficient generational GC if you can move objects. However, it's not impossible to build a non-moving generational GC. Instead of having large blocks of address space for each generation, you would have each Pool indicate its generation. Then you have to find the Pool for a pointer and check that flag. That's comparatively slow, but it'd still be faster to do a Gen 0 scan that way than to do a full scan with D's current collector (assuming you implement write barriers).
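
Sketched out, that pool-per-generation bookkeeping might look something like 
this (purely illustrative names and layout):

struct Pool
{
    void*  base;
    size_t size;
    ubyte  generation;   // 0 = nursery, higher numbers = older
}

__gshared Pool[] pools;

// Find the pool containing a candidate pointer, then decide whether a
// generation-N collection needs to scan it at all.
Pool* findPool(void* p)
{
    auto addr = cast(size_t) p;
    foreach (ref pool; pools)
    {
        auto base = cast(size_t) pool.base;
        if (addr >= base && addr < base + pool.size)
            return &pool;
    }
    return null;   // not a pointer into the GC heap
}

bool shouldScan(const(Pool)* pool, ubyte collectingGen)
{
    return pool !is null && pool.generation <= collectingGen;
}
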
 They talk about this problem in the books. However, once you have a
 precise GC, a fully moving GC becomes possible and only then do the
 performance benefits of a generational GC become realized.
The performance benefit of a moving and generational GC over a non-moving generational GC is that you can have a contiguous section of address space for each generation, allocated densely. You get this with less overhead. However, a moving GC can be a pessimization for user code. If you allocate several objects in sequence, with many GCs you'll get memory allocated for them that is contiguous or nearly contiguous. For small objects, this is probably going to be on the same page of memory. But a moving GC is free to move them to different pages. So you need to exercise caution.
 I have concerns about doing any compaction
 though, mostly because D can have both references and pointers to
 objects, and I don't know how we would want to go about updating
 pointers. Especially since pointers to D objects can exists in C and
 C++
 code.
Erm, a generational GC is, in practice, a moving GC as it must move the blobs around from the various heaps.
The Boehm collector is generational, conservative, and not moving. You probably conflate "moving" and "generational" because of the JVM.
 As for the C/C++ code problem various solutions have been proposed here.
 For one, D knows the difference between the two, so pinning is possible
 there.
There's nothing in the type system that says whether you can pass a variable to a C/C++ function. So, no, D *doesn't* know the difference between them. And there's no GC.pin function, so people aren't currently telling your GC when something's accessible to C/C++ code. That means a moving collector in D will produce segmentation faults and memory corruption in existing, currently working code.
 AFIAK each OS allows you to specify multiple heaps
You get one address space. You get to use mmap to ask for memory at a given address. There's no heap_t data structure to provide to malloc or mmap. You might have been confused by literature referring to multiple heaps. That's just a single allocator deciding to divide its address space logically. There's no OS-level enforcement.
 A concurrent GC without precision is mostly useless
The Boehm collector is not precise. It is concurrent. Its concurrency reduces the amount of time a GC collection takes. This is mostly orthogonal to the benefits of a precise collector in reducing the amount of memory a process uses.
 because the GC roots
 are primarily derived from the stack which the thread was spawned and
 the stack roots are not scanned in a conservative GC.
They *are* scanned in a conservative GC. If you don't scan memory that might contain pointers (such as the stack), your GC will inevitably cause segmentation faults and memory corruption. D's GC is conservative and does not cause segmentation faults or memory corruption, therefore it must scan the stack.
Mar 12 2016
parent reply Adam Wilson <flyboynw gmail.com> writes:
Chris Wright wrote:
 On Sat, 12 Mar 2016 13:23:35 -0800, Adam Wilson wrote:

 To start off, let's talk terminology. You seem to be using nonstandard
 terminology and possibly misunderstanding standard terminology.

 A GC scan is the mark phase of a mark/sweep collector (and specifically
 the part where the GC examines roots and allocated memory, if that
 matters). The GC searches for pointers to GC memory. Simple GCs only need
 to record whether there is a pointer to a given allocation, not where
 those pointers are coming from.

 A conservative GC is one that will encounter some things that aren't
 pointers but must be scanned as if they were pointers. In D, for
 instance, a union between a pointer and a size_t forces collectors to be
 conservative.

 A false pointer is something that a conservative GC must treat as a
 pointer and appears to point to an object allocated by that GC. For
 instance, unions in D allow us to create false pointers.

 A precise GC is a GC that never mistakes a non-pointer for a pointer.

 A moving GC is a GC that can relocate objects.

 A generational GC is one that can perform a collection on part of the
 heap without dealing with the full heap. Those parts tend to be organized
 by how long an object has survived. This does not require a precise or
 moving GC, though it works better with a precise, moving GC.
I didn't intend to confuse, but I was not writing to neophytes, and I used linguistic shortcuts.
 A write barrier is a way to record which sections of memory have been
 altered since the last GC collection. This lets the GC scan only the
 changed data -- useful to any GC, especially useful to generational ones.

 A "partially moving" GC does not exist, as far as I know.
Yep, it's a Bad Idea.
 Jeremy DeHaan wrote:
 On Saturday, 12 March 2016 at 08:50:06 UTC, Adam Wilson wrote:
 If I may make a suggestion. The lock free work is unlikely to require
 the entirety of GSoC. And the precise GC is the next most important
 thing on your list and will have the biggest impact on GC performance.
Rainer has two different precise GC's in pull requests right now and both are slower than the current one unless there are false pointers. I would expect anything I come up with to largely act the same. The reason I keep pushing for a generational garbage collector is because I think it would be where we would see the most benefit in terms of general performance.
I would contend that it's not quite that simple. Of course the precise GC is slower, it's scanning more stuff.
Precise GCs are slower because they must reference some data telling them which items on the stack and in allocations represent pointers and which do not. This data takes up cache space, and having less cache for memory being scanned slows things down. Correlating two pieces of data takes extra time, too.
Yep the data is usually held with the object in a header. Hence the "scanning more stuff".
 The hope with a precise GC is to avoid false pointers. A false pointer to
 an object that holds references to many other objects can be costly, both
 in GC time and in excess memory usage.

 In D, we have a tiny bit of type information surfaced to the GC to reduce
 the incidence of false pointers. We could additionally take advantage of
 the fact that numbers tend to be small and people tend to use 4 to 8 byte
 numbers: we could arrange our address space to avoid ranges where false
 pointers are likely. But that's probably unnecessary.
We are not limited by what we currently have. Most of Rainer's work involved surfacing more data to the GC.
 The current conservative GC by
 definition can't figure out large chunks of memory so it won't even
 bother scanning them.
It must scan anything that might be a pointer. A precise GC reduces the number of things that might be pointers but aren't to zero. Therefore a precise GC scans less.
Yep, precise GC's aren't totally perf-negative.
 If a conservative GC didn't scan something that might be a pointer, it
 would sometimes free an object that still had live references to it. This
 would result in memory corruption or segmentation faults.

 What it does to those things it can't scan. And
 what it can't scan it has to pin, which means the memory can't be moved.
If a moving GC cannot precisely identify all pointers to an object, it is not allowed to move that object. If it can, it is allowed to move that object. A conservative GC can be a moving GC. It must classify things as "definitely pointer", "maybe pointer", and "not pointer". It cannot move an object with a "maybe pointer" pointing to it, but it can move an object with several "definitely pointers" pointing to it. In D, thanks to unions, we can never create a fully precise collector, but we can create a moving collector. Objects pointed to from a union can't be moved, but few objects will be pointed to from a union.
Actually, we probably could with modifications to the compiler. A little metadata goes a long way. And I know that Walter would accept that PR. I don't know for sure if it can be done, but I've seen precious little evidence one way or the other. Let's investigate before making absolute statements about what we can do. I know from discussions with Walter that the compiler's AST contains a wealth of information, and that more can be added.
 You can't build a fully generational GC until you can move everything.
You can build a more efficient generational GC if you can move objects. However, it's not impossible to build a non-moving generational GC. Instead of having large blocks of address space for each generation, you would have each Pool indicate its generation. Then you have to find the Pool for a pointer and check that flag. That's comparatively slow, but it'd still be faster to do a Gen 0 scan that way than to do a full scan with D's current collector (assuming you implement write barriers).
My apologies, I didn't intend to state that it was impossible. I was attempting to emphatically question the wisdom of doing so, particularly when the potential performance benefits are dubious at best. My point is that we should do this right; if we keep blindly pursuing performance without taking a considered approach to achieving it, we'll never get there. The GC will just become an ever larger pile of hacks that increase performance in specific cases.
 They talk about this problem in the books. However, once you have a
 precise GC, a fully moving GC becomes possible and only then do the
 performance benefits of a generational GC become realized.
The performance benefit of a moving and generational GC over a non-moving generational GC is that you can have a contiguous section of address space for each generation, allocated densely. You get this with less overhead.
That's actually pretty important. Precision enables moving, and moving has clear and enumerable performance benefits. QED.
 However, a moving GC can be a pessimization for user code. If you
 allocate several objects in sequence, with many GCs you'll get memory
 allocated for them that is contiguous or nearly contiguous. For small
 objects, this is probably going to be on the same page of memory. But a
 moving GC is free to move them to different pages. So you need to
 exercise caution.
Yes ... and this is also dealt with in the books, including basic algorithms to help combat these problems. Additionally, as evidenced by the fact that every major GC (JVM, CLR, et al.) system implements a moving GC, I'd argue that the benefits greatly outweigh any potential risks. And we have pull requests to further optimize the algorithms. We don't have a system at all yet, let's build it first, then we can work on optimizing it.
 I have concerns about doing any compaction
 though, mostly because D can have both references and pointers to
 objects, and I don't know how we would want to go about updating
 pointers. Especially since pointers to D objects can exists in C and
 C++
 code.
Erm, a generational GC is, in practice, a moving GC as it must move the blobs around from the various heaps.
The Boehm collector is generational, conservative, and not moving. You probably conflate "moving" and "generational" because of the JVM.
Ok, yes, but let's be honest, it's what we actually want and, more importantly, expect. Precision makes it happen, efficiently. Even the C++ community doesn't take the Boehm GC very seriously. We have an opportunity to prove that a C-style language can have a performant GC.
 As for the C/C++ code problem various solutions have been proposed here.
 For one, D knows the difference between the two, so pinning is possible
 there.
There's nothing in the type system that says whether you can pass a variable to a C/C++ function. So, no, D *doesn't* know the difference between them. And there's no GC.pin function, so people aren't currently telling your GC when something's accessible to C/C++ code.
D does differentiate between a GC reference and a pointer. Pin the pointers. Yes, you can get a pointer from a reference, and doing so is a pinning operation. We'd have to add a GC.pin function, so long as the compiler emits it for us. :)
 That means a moving collector in D will produce segmentation faults and
 memory corruption in existing, currently working code.

 AFIAK each OS allows you to specify multiple heaps
You get one address space. You get to use mmap to ask for memory at a given address. There's no heap_t data structure to provide to malloc or mmap. You might have been confused by literature referring to multiple heaps. That's just a single allocator deciding to divide its address space logically. There's no OS-level enforcement.
Windows has that concept via HeapCreate shown here: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366599(v=vs.85).aspx I'd be surprised if it's not available on other OS's, and yes, it is OS enforced. It might not work exactly how you like it, but that's a far cry from "not provided".
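
For reference, a minimal sketch of carving out such a separate Win32 heap 
(assuming the usual core.sys.windows bindings; purely illustrative, not a 
proposed druntime design):

version (Windows)
{
    import core.sys.windows.windows;

    void demo()
    {
        // A private, OS-managed heap, separate from the default process heap.
        HANDLE unmanaged = HeapCreate(0, 0 /*initial size*/, 0 /*growable*/);
        void* block = HeapAlloc(unmanaged, 0, 1024);
        // ... hand `block` to C/C++ code; the GC never touches this heap ...
        HeapFree(unmanaged, 0, block);
        HeapDestroy(unmanaged);
    }
}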
 A concurrent GC without precision is mostly useless
The Boehm collector is not precise. It is concurrent. Its concurrency reduces the amount of time a GC collection takes. This is mostly orthogonal to the benefits of a precise collector in reducing the amount of memory a process uses.
 because the GC roots
 are primarily derived from the stack which the thread was spawned and
 the stack roots are not scanned in a conservative GC.
They *are* scanned in a conservative GC. If you don't scan memory that might contain pointers (such as the stack), your GC will inevitably cause segmentation faults and memory corruption. D's GC is conservative and does not cause segmentation faults or memory corruption, therefore it must scan the stack.
I may have been confusing any precise GC with Rainer's precise GC, which only did precise heap scans. Apologies for the confusion. I will admit that I've done better work with my wording than what is in my last message. However, my fundamental point is sound: precision has benefits across all GC workloads and enables scenarios beyond the GC itself. Extending the compiler can work around many of the issues with false pointers.

And let's engage the elephant in the room. One of Rainer's main learnings was that precision permeates the compiler and the GC; you have to deal with it everywhere. So you end up rebuilding the GC from the ground up when you add it. What you are proposing is building a conservative-generational GC that will have to be jettisoned and rewritten *again* once precision is added, for a performance boost that, by your own admission, may not actually happen. How is this a good use of time?

My statement stands. Precision is the basis from which all high-order GC functions flow. Yes, you can work around not having it, but such workarounds tend to be just that. More importantly, you are inevitably going to end up doing a half-implementation of precision anyway, because each of the high-order functions needs parts of that information.

I know precision isn't fun. But it's extremely important work that needs to get done. And once we have it, the world of GC algorithms opens before us. Until then we are intentionally limiting ourselves. One of the biggest complaints about D is the GC; fine, so instead of doing yet another hack-job on something that may not actually improve GC performance, let's do it right and lay the groundwork for a modern GC that allows us to really compete with JVM/CLR.

-- // Adam Wilson // quiet.dlang.dev
Mar 13 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Sun, 13 Mar 2016 12:43:37 -0700, Adam Wilson wrote:
 A "partially moving" GC does not exist, as far as I know.
Yep, it's a Bad Idea.
It's not a standard term. Google's only seeing about four references to the term, none of them authoritative or definitive. Since it's non-standard, it would behoove you to define it. I think you mean a GC in which some objects are pinned (due to limitations in the GC rather than the user explicitly requesting that they be pinned). Regardless, you ought to define the term and explain why it's a bad idea instead of providing a bare assertion. It doesn't make any difference to the operation of the GC *why* something's pinned; what matters is how much is pinned. So, for instance, you could point to a conservative moving GC, select one circumstance in which it can't move objects, tell us how much memory that ends up pinning on average, and then you'll have an argument.
 In D, we have a tiny bit of type information surfaced to the GC to
 reduce the incidence of false pointers. We could additionally take
 advantage of the fact that numbers tend to be small and people tend to
 use 4 to 8 byte numbers: we could arrange our address space to avoid
 ranges where false pointers are likely. But that's probably
 unnecessary.
We are not limited by what we currently have. Most of Rainer's work involved surfacing more data to the GC.
Sure. My point is that we don't *need* a lot of type information in the GC to get reasonable advantages. A little type information goes a long way, and a little caution in placing objects in the heap can help too.
 In D, thanks to unions, we can never create a fully precise collector,
 but we can create a moving collector. Objects pointed to from a union
 can't be moved, but few objects will be pointed to from a union.
Actually, we probably could with modifications to the compiler.
No, we couldn't. I've been assuming that we can make whatever modifications we want to the compiler for free.

The compiler has static type information. This explicitly contains ambiguities. When it's based solely on data available at compile time, the compiler can resolve the ambiguities -- and you could have written the code without using a union. But in the majority of cases, people use a union because they must. That means which values are set depends on runtime data, and the compiler can't tell you which fields will be set.

We could provide a runtime function that is invoked when you set a value inside a union. That won't work very well. Consider:

struct A { long a; long b; }
struct B { double a; void* p; }
union AB { A a; B b; }
extern void foo(ref A a);

AB ab;
ab.b.p = GC.malloc(512);
foo(ab.a);

Was that pointer overwritten? No clue! The compiler can't tell. When compiling the definition of foo(), it didn't have any idea whether the struct was a union member. Now we have to intercept *every* indirect write to *any* value with a runtime call, even if you never use a union in your program -- Oh look, now Python's beating us by an order of magnitude in benchmarks. What do we get for it? Well, the 1% of programs that use this pattern will allow the GC to move an additional 0.1% of their objects.

Or we leave the GC conservative in this one case. We can detect this case pretty easily and treat apparent pointers as pinned. Everything will work, and the GC will be better at releasing memory than it currently is.
Mar 13 2016
parent reply Adam Wilson <flyboynw gmail.com> writes:
Chris Wright wrote:
 On Sun, 13 Mar 2016 12:43:37 -0700, Adam Wilson wrote:
 A "partially moving" GC does not exist, as far as I know.
Yep, it's a Bad Idea.
It's not a standard term. Google's only seeing about four references to the term, none of them authoritative or definitive. Since it's non- standard, it would behoove you to define it. I think you mean a GC in which some objects are pinned (due to limitations in the GC rather than the user explicitly requesting that they be pinned). Regardless, you ought to define the term and explain why it's a bad idea instead of providing a bare assertion. It doesn't make any difference to the operation of the GC *why* something's pinned; what matters is how much is pinned. So, for instance, you could point to a conservative moving GC, select one circumstance in which it can't move objects, tell us how much memory that ends up pinning on average, and then you'll have an argument.
Is there an implementation of a conservative moving (compacting) GC out there? I'm not aware of one, but there are a lot of GC's out there. Boehm isn't.
 In D, we have a tiny bit of type information surfaced to the GC to
 reduce the incidence of false pointers. We could additionally take
 advantage of the fact that numbers tend to be small and people tend to
 use 4 to 8 byte numbers: we could arrange our address space to avoid
 ranges where false pointers are likely. But that's probably
 unnecessary.
We are not limited by what we currently have. Most of Rainer's work involved surfacing more data to the GC.
Sure. My point is that we don't *need* a lot of type information in the GC to get reasonable advantages. A little type information goes a long way, and a little caution in placing objects in the heap can help too.
 In D, thanks to unions, we can never create a fully precise collector,
 but we can create a moving collector. Objects pointed to from a union
 can't be moved, but few objects will be pointed to from a union.
Actually, we probably could with modifications to the compiler.
No, we couldn't. I've been assuming that we can make whatever modifications we want to the compiler for free. The compiler has static type information. This explicitly contains ambiguities. When it's based solely on data available at compile time, the compiler can resolve the ambiguities -- and you could have written the code without using a union. But in the majority of cases, people use a union because they must. That means which values are set depends on runtime data, and the compiler can't tell you which fields will be set. We could provide a runtime function that is invoked when you set a value inside a union. That won't work very well. Consider: struct A { long a; long b; } struct B { double a; void* p; } union AB { A a; B b; } extern void foo(ref A a); AB ab; ab.b.p = GC.malloc(512); foo(ab.a); Was that pointer overwritten? No clue! The compiler can't tell. When compiling the definition of foo(), it didn't have any idea whether the struct was a union member. Now we have to intercept *every* indirect write to *any* value with a runtime call, even if you never use a union in your program -- Oh look, now Python's beating us by an order of magnitude in benchmarks. What do we get for it? Well, the 1% of programs that use this pattern will allow the GC to move an additional 0.1% of their objects. Or we leave the GC conservative in this one case. We can detect this case pretty easily and treat apparent pointers as pinned. Everything will work, and the GC will be better at releasing memory than it currently is.
Ok, so yes, it needs conservative scanning in this case. Because even in [...] absolutely against pinning, and unions aren't going away, and if the GC would have to go conservative there then so be it. And given that unions are a pretty niche use case, I don't think we should drop the idea of precision just because it can't be 100% precise. So we end up with a few conservatively-scanned edge cases, big deal; the benefits of precision to the whole ecosystem would be enormous in the general case.

Is this a debate about precise vs. non-precise GC or are we just bikeshedding about terminology and technical details? ;) I moved recently and lost track of the GC book for a while so I've been refreshing my terminology. :P

-- // Adam Wilson // quiet.dlang.dev
Mar 13 2016
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Sunday, 13 March 2016 at 23:34:44 UTC, Adam Wilson wrote:
 Is there an implementation of a conservative moving 
 (compacting) GC out there? I'm not aware of one, but there are 
 a lot of GC's out there. Boehm isn't.
That is impossible, you need to know what is and isn't a pointer to be able to move things around.
Mar 13 2016
next sibling parent Chris Wright <dhasenan gmail.com> writes:
On Sun, 13 Mar 2016 23:46:51 +0000, deadalnix wrote:

 On Sunday, 13 March 2016 at 23:34:44 UTC, Adam Wilson wrote:
 Is there an implementation of a conservative moving (compacting) GC out
 there? I'm not aware of one, but there are a lot of GC's out there.
 Boehm isn't.
That is impossible, you need to know what is and isn't a pointer to be able to move things around.
Let's say we do the most precise GC possible for D. We can categorize things into:

* maybe a pointer; it's part of a union that contains a non-pointer element
* definitely a pointer (we know because of stack pointer maps and type information, and there aren't any unions mucking it up)
* definitely not a pointer

We can move anything that has only "definitely a pointer"s pointing at it. We can't move anything that has one or more "maybe a pointer"s pointing at it. I keep saying this...
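A minimal sketch of that classification in code might look like this (illustrative only -- the SlotKind/BlockInfo layout is made up for this example; it is not how druntime actually stores its type information):

enum SlotKind { notAPointer, maybePointer, definitelyPointer }

struct BlockInfo
{
    void*[] slots;       // the words making up the scanned object
    SlotKind[] kinds;    // one tag per word, derived from type info and unions
}

// Visit every reference in a block; anything reachable only through a
// "maybe a pointer" word gets pinned instead of moved.
void classifyReferences(BlockInfo block,
                        void delegate(void* target, bool pinned) visit)
{
    foreach (i, slot; block.slots)
    {
        final switch (block.kinds[i])
        {
            case SlotKind.notAPointer:
                break;                          // plain data, skip
            case SlotKind.maybePointer:
                if (slot) visit(slot, true);    // conservative: treat as a root, pin the target
                break;
            case SlotKind.definitelyPointer:
                if (slot) visit(slot, false);   // precise: the target is free to move
                break;
        }
    }
}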
Mar 13 2016
prev sibling parent Adam Wilson <flyboynw gmail.com> writes:
deadalnix wrote:
 On Sunday, 13 March 2016 at 23:34:44 UTC, Adam Wilson wrote:
 Is there an implementation of a conservative moving (compacting) GC
 out there? I'm not aware of one, but there are a lot of GC's out
 there. Boehm isn't.
That is impossible, you need to know what is and isn't a pointer to be able to move things around.
It was a rhetorical question. I was just leaving the door open for someone to surprise me. ;-) -- // Adam Wilson // quiet.dlang.dev
Mar 13 2016
prev sibling parent reply Chris Wright <dhasenan gmail.com> writes:
On Sun, 13 Mar 2016 16:34:44 -0700, Adam Wilson wrote:
 Is this a debate about precise vs. non-precise GC or are we just
 bikeshedding about terminology and technical details?
You made a large number of assertions about garbage collection and they were almost all wrong. It's not about minutiae -- D can't reasonably have a precise collector, and you were saying that nothing else useful can happen before you make the GC precise. A non-precise collector can and often should be concurrent, and you were saying this was useless. These are big things. They affect prioritization of major projects. Getting them wrong could mislead a GSoC student to the point of not accomplishing much when they could have plucked lower-hanging fruit successfully.
Mar 13 2016
parent reply Adam Wilson <flyboynw gmail.com> writes:
Chris Wright wrote:
 On Sun, 13 Mar 2016 16:34:44 -0700, Adam Wilson wrote:
 Is this a debate about precise vs. non-precise GC or are we just
 bikeshedding about terminology and technical details?
You made a large number of assertions about garbage collection and they were almost all wrong. It's not about minutiae -- D can't reasonably have a precise collector, and you were saying that nothing else useful can happen before you make the GC precise. A non-precise collector can and often should be concurrent, and you were saying this was useless. These are big things. They affect prioritization of major projects. Getting them wrong could mislead a GSoC student to the point of not accomplishing much when they could have plucked lower-hanging fruit successfully.
Erm. That seems a bit extreme. You are asserting that D can't have a precise garbage collector, yet you have presented only one case where the compiler cannot provide definitive information about the classification of a pointer: unions. Ok, so unions, a relatively low-usage scenario with a valid work-around (assume it's a pointer). We can teach the compiler and GC how to pin C-style pointers, so that's not it. Lastly, Rainer seemed to think a precise GC could be done, and he then went and did it ... so "can't reasonably have a precise collector" is a factually incorrect assertion.

My GC lingo may be rusty, and I will admit to using superlatives incorrectly myself, but based on the existing evidence my assertions are hardly "almost all wrong". I'll clean up my lingo and retool my language if you'll tone down the superlatives.

Of course GSoC students need guidance; I've been a GSoC mentor for DLang before. If I didn't think that a precise GC was not only extremely important to D but also within the ability of the student, I would not have pushed for it. "Low-hanging fruit" is the reason D's quality is so spotty. We now have a chance with GSoC to get some quality time dedicated to a pain point, and we should utilize that time to work on foundational issues. Then we (the community) can flame each other to death about the best algorithms to achieve the performance we desire. That is assuming we haven't scared him off.

-- // Adam Wilson // quiet.dlang.dev
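For what it's worth, explicit pinning around a C call could look something like this -- a purely hypothetical interface, sketched only to show the shape of the idea; gcPin/gcUnpin and c_process do not exist in druntime or in any real C library:

// Hypothetical pinning hooks a moving GC would have to expose.
void gcPin(void* p)   { /* a real GC would add p's block to its pinned set */ }
void gcUnpin(void* p) { /* ...and remove it again here */ }

// Stand-in for some C function that holds on to a raw pointer.
extern (C) void c_process(ubyte* data, size_t len);

void callIntoC(ubyte[] buffer)
{
    gcPin(buffer.ptr);                 // the C side sees a raw pointer, so the block must not move
    scope (exit) gcUnpin(buffer.ptr);  // movable again once the call returns
    c_process(buffer.ptr, buffer.length);
}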
Mar 13 2016
parent reply thedeemon <dlang thedeemon.com> writes:
On Monday, 14 March 2016 at 01:38:50 UTC, Adam Wilson wrote:
 Lastly, Rainer seemed to think a precise GC could be done, and 
 he then went and did it ... so "can't reasonably have a precise 
 collector" is a factually incorrect assertion.
IIRC, Rainer called it "mostly precise", and for a good reason. If we're ok with calling a partially conservative GC a precise one, then go on. ;) As I understand Rainer's GC works in VisualD now and it did solve the problem of leaks he had with the default GC, so there is certainly benefit in pursuing preciseness, even non-100% one.

I think continuing Rainer's work toward "even more precise" GC, and adding important optimizations like allocations without a global lock and parallel scanning would be a great project, and a realistic one.

As for moving and generational GC, I have big doubts one can do it without breaking 90% of old code written in D (passing pointers to C [...]) with write barriers everywhere. Now D competes in speed with C++, with your proposed changes it will only try to compete with Java [...] be lost. On the other hand I might be overestimating write-barrier costs, please feel free to prove me wrong.

Fun part: if you make a moving GC and try to automate pinning and let the compiler do it instead of requiring programmers to rewrite most of their D code, you'll probably end up with a "partially moving GC" where half of the heap is pinned, some objects are moving here and there, and GC rarely releases memory back to OS because of those pinned objects.
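For readers who haven't met the term: a write barrier is a small piece of code the compiler inserts at every pointer store so the collector can see which memory was mutated. A minimal card-marking sketch of the idea (illustrative only -- gcCardTable, gcHeapBase and gcWriteBarrier are invented names, not anything that exists in druntime):

enum CARD_SHIFT = 9;            // one "card" covers 512 bytes of heap
__gshared void* gcHeapBase;     // start of the GC heap
__gshared ubyte[] gcCardTable;  // one byte per card, set up by the GC

// The compiler would lower every pointer store, e.g. `obj.next = other;`,
// into a call like gcWriteBarrier(cast(void**) &obj.next, other).
void gcWriteBarrier(void** slot, void* newValue)
{
    *slot = newValue;                                              // the actual store
    auto card = (cast(size_t) slot - cast(size_t) gcHeapBase) >> CARD_SHIFT;
    if (card < gcCardTable.length)
        gcCardTable[card] = 1;   // a generational/concurrent GC re-scans dirty cards later
}

The per-store overhead (a subtraction, a shift, and a byte write) is exactly the cost being argued about here.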
 My GC lingo may be rusty, and I will admit to using 
 superlatives incorrectly myself, but based on the existing 
 evidence my assertions are hardly "almost all wrong".
They did look so, sorry. Additional explanations helped.
Mar 13 2016
parent reply Adam Wilson <flyboynw gmail.com> writes:
thedeemon wrote:
 On Monday, 14 March 2016 at 01:38:50 UTC, Adam Wilson wrote:
 Lastly, Rainer seemed to think a precise GC could be done, and he then
 went and did it ... so "can't reasonably have a precise collector" is
 a factually incorrect assertion.
IIRC, Rainer called it "mostly precise", and for a good reason. If we're ok with calling a partially conservative GC a precise one, then go on. ;) As I understand Rainer's GC works in VisualD now and it did solve the problem of leaks he had with the default GC, so there is certainly benefit in pursuing preciseness, even non-100% one.
[...] conservative. ;-)
 I think continuing Rainer's work toward "even more precise" GC, and
 adding important optimizations like allocations without a global lock
 and parallel scanning would be a great project, and a realistic one.
On this we are in complete agreement.
 As for moving and generational GC, I have big doubts one can do it
 without breaking 90% of old code written in D (passing pointers to C
 [...]) with write barriers everywhere. Now D competes in speed with C++,
 with your proposed changes it will only try to compete with Java [...]
 be lost. On the other hand I might be overestimating write-barrier
 costs, please feel free to prove me wrong.
Well, conclusively proving the cost of write barriers would require profiling. But I think we should be able to optimize the use of write barriers to reduce the impact.

Here is the thing: nobody wants to admit it, but we have lost the battle with C++, although to be honest, I don't think we ever had a prayer of winning it in the first place. They want C++ with the "warts" removed, for some personally distinct set of "warts"; they're not really interested in D for itself. Maybe ... why don't instead of trying to compete with everybody else, we do our own thing, and do it very well. As long as we're just another "me too" operation people will treat us like we are. Let's focus on being the best D language out there. And D needs a GC, so let's build the best damn garbage collector the natively-compiled world has ever seen.
 Fun part: if you make a moving GC and try to automate pinning and let
 the compiler do it instead of requiring programmers to rewrite most of
 their D code, you'll probably end up with a "partially moving GC" where
 half of the heap is pinned, some objects are moving here and there, and
 GC rarely releases memory back to OS because of those pinned objects.
Yea, this is true, although I think getting most of the way there is going to be very useful in its own right. And it would open the possibility of experimenting with different algorithms and strategies. Right now we're pretty limited in the types of algorithms we can use; precision offers us more options and more experiments. Let's get the foundation in place, then we can go nuts.
 My GC lingo may be rusty, and I will admit to using superlatives
 incorrectly myself, but based on the existing evidence my assertions
 are hardly "almost all wrong".
They did look so, sorry. Additional explanations helped.
Thank you. :) -- // Adam Wilson // quiet.dlang.dev
Mar 13 2016
parent reply Jeremy DeHaan <dehaan.jeremiah gmail.com> writes:
I haven't had power for a couple of days, but it looks like the 
discussion has gone along pretty ok. After reading everything, I 
think I'm inclined to agree with Adam and the main focus of my 
proposal will be a precise GC (or as precise as we can get). I'll 
definitely need some guidance, but I'll be learning everything I 
can about GC's and precision until the start of the project.

On Monday, 14 March 2016 at 05:28:13 UTC, Adam Wilson wrote:
 Maybe ... why don't instead of trying to compete with everybody 
 else, we do our own thing, and do it very well. As long as 
 we're just another "me too" operation people will treat us like 
 we are. Let's focus on being the best D language out there. And 
 D needs a GC, so let's build the best damn garbage collector 
 the natively-compiled world has ever seen.
That is what I hope to work towards. Hopefully next year I can work on making a generational GC, or something else depending on how much time I could dedicate between when this GSoC is finished and the next begins. Adam, can you reach out to Craig and let him know you're willing to mentor this project if it gets accepted? I talked with him a few days ago via email and he said that someone would need to take it on but he wasn't sure who.
Mar 14 2016
next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Tuesday, 15 March 2016 at 01:34:07 UTC, Jeremy DeHaan wrote:
 I haven't had power for a couple of days, but it looks like the 
 discussion has gone along pretty ok.
I would characterize it as very interesting, although I know very little about how GCs are implemented. I have a question for the more knowledgeable... There was some discussion about the difficulty of handling unions and being able to ensure that pointers are pointers in a precise GC. Is there less work involved in getting the GC to be precise for @safe code? Does the current GC take this into account?
Mar 14 2016
prev sibling next sibling parent Adam Wilson <flyboynw gmail.com> writes:
Jeremy DeHaan wrote:
 I haven't had power for a couple of days, but it looks like the
 discussion has gone along pretty ok. After reading everything, I think
 I'm inclined to agree with Adam and the main focus of my proposal will
 be a precise GC (or as precise as we can get). I'll definitely need some
 guidance, but I'll be learning everything I can about GC's and precision
 until the start of the project.
Awesome! It's a complex topic, but you'll get broad exposure to all corners of the D ecosystem; it's the kind of experience that you can apply to all aspects of software engineering.
 On Monday, 14 March 2016 at 05:28:13 UTC, Adam Wilson wrote:
 Maybe ... why don't instead of trying to compete with everybody else,
 we do our own thing, and do it very well. As long as we're just
 another "me too" operation people will treat us like we are. Let's
 focus on being the best D language out there. And D needs a GC, so
 let's build the best damn garbage collector the natively-compiled
 world has ever seen.
That is what I hope to work towards. Hopefully next year I can work on making a generational GC, or something else depending on how much time I could dedicate between when this GSoC is finished and the next begins. Adam, can you reach out to Craig and let him know you're willing to mentor this project if it get's accepted? I talked with him a few days ago via email and he said that someone would need to take it on but he wasn't sure who.
Will do. And given the nature of the work you want to do, I think (personal opinion follows) your application will sail through the process. The application window opened today, so my first recommendation would be to get started on your proposal as soon as possible, if you haven't already. GSoC has a basic outline of what you'll need to include here: http://write.flossmanuals.net/gsocstudentguide/writing-a-proposal/ I'd like to review the proposal before you submit it. And welcome aboard the crazy-train known as GSoC; it's a lot of work but it'll be buckets of fun! :) -- // Adam Wilson // quiet.dlang.dev
Mar 14 2016
prev sibling next sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 15.03.2016 02:34, Jeremy DeHaan wrote:
 I haven't had power for a couple of days, but it looks like the
 discussion has gone along pretty ok. After reading everything, I think
 I'm inclined to agree with Adam and the main focus of my proposal will
 be a precise GC (or as precise as we can get). I'll definitely need some
 guidance, but I'll be learning everything I can about GC's and precision
 until the start of the project.

 On Monday, 14 March 2016 at 05:28:13 UTC, Adam Wilson wrote:
 Maybe ... why don't instead of trying to compete with everybody else,
 we do our own thing, and do it very well. As long as we're just
 another "me too" operation people will treat us like we are. Let's
 focus on being the best D language out there. And D needs a GC, so
 let's build the best damn garbage collector the natively-compiled
 world has ever seen.
That is what I hope to work towards. Hopefully next year I can work on making a generational GC, or something else depending on how much time I could dedicate between when this GSoC is finished and the next begins. Adam, can you reach out to Craig and let him know you're willing to mentor this project if it get's accepted? I talked with him a few days ago via email and he said that someone would need to take it on but he wasn't sure who.
Being always way behind reading the forum these days, I'm a little late and have not read all the messages in this thread thoroughly. Here are some thoughts:

The precise GC used by Visual D is not only heap-precise, but also has type information to scan the data and TLS segments precisely. This is done with a patched dmd, now based on version 2.066, and is not part of the PR. It uses the solution that I presented at DConf 2013: http://dconf.org/2013/talks/schuetze.pdf

Creating information for stack and registers seems like a lot of work with uncertain return: apart from having to describe your code in very high detail, there is usually no guarantee that you can unwind the stack properly, especially when calling through C/C++ functions that you have no control of.

A moving collector is possible IMO, but it needs write barriers, and the idea of adding these explicitly has met quite some resistance, especially by Walter. Write barriers are also necessary for incremental collections. It changes the way you can interact with C/C++, because it's no longer good enough to just keep a pointer used by C visible to the GC; you have to pin it explicitly.

You can have very coarse barriers by using the page protection mechanism of the hardware; that's what I implemented for Windows, see http://rainers.github.io/visuald/druntime/concurrentgc.html. This is similar to Sociomantic's concurrent GC using fork. It scans in the background, but not with multiple threads in parallel. Unfortunately, this work has been rotting for almost 2 years now, but I hope to get back to it some day...
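To make the "very coarse barriers" idea concrete, here is a rough sketch of how page protection can stand in for per-store write barriers (POSIX-only, no error handling; the names are invented for illustration and this is not the actual VisualD/druntime implementation -- in particular, wiring the fault handler up via sigaction or Windows exception handling is omitted):

import core.sys.posix.sys.mman : mprotect, PROT_READ, PROT_WRITE;

enum PAGE_SIZE = 4096;

__gshared void*   heapBase;
__gshared size_t  heapPages;
__gshared ubyte[] dirty;    // one flag per heap page, 1 = written during marking

void startConcurrentMark(void* base, size_t length)
{
    heapBase  = base;
    heapPages = length / PAGE_SIZE;
    dirty     = new ubyte[heapPages];
    mprotect(base, length, PROT_READ);   // any write into the heap now faults
}

// Called by the platform-specific fault handler with the faulting address.
// Returns true if the fault belonged to the GC heap and has been handled.
bool handleWriteFault(void* faultAddr)
{
    auto addr = cast(size_t) faultAddr;
    auto base = cast(size_t) heapBase;
    if (addr < base || (addr - base) / PAGE_SIZE >= heapPages)
        return false;                    // not our heap; let the fault propagate

    auto page = (addr - base) / PAGE_SIZE;
    dirty[page] = 1;                     // this page must be re-scanned before marking ends
    mprotect(cast(void*) (base + page * PAGE_SIZE), PAGE_SIZE, PROT_READ | PROT_WRITE);
    return true;                         // the interrupted write is simply retried
}

The upside is that unmodified user code needs no compiler-inserted barriers at all; the downside is that the granularity is a whole page, which is why this only approximates a true generational barrier.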
Mar 18 2016
parent reply Jeremy deHaan <dehaan.jeremiah gmail.com> writes:
On Friday, 18 March 2016 at 16:41:21 UTC, Rainer Schuetze wrote:
 On 15.03.2016 02:34, Jeremy DeHaan wrote:
[...]
Being always way behind reading the forum these days, I'm a little late and have not read all the messages in this thread thoroughly. Here are some thoughts: [...]
Thank you for the feedback. I'm still working on my proposal so nothing is set in stone just yet. I'm very interested in working on the GC for this GSoC, so what would you suggest be my main focus? It sounds like you already have a GC that is more or less what I was planning on implementing...
Mar 18 2016
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 18.03.2016 22:04, Jeremy deHaan wrote:
 On Friday, 18 March 2016 at 16:41:21 UTC, Rainer Schuetze wrote:
 On 15.03.2016 02:34, Jeremy DeHaan wrote:
 [...]
Being always way behind reading the forum these days, I'm a little late and have not read all the messages in this thread thoroughly. Here are some thoughts: [...]
Thank you for the feedback. I'm still working on my proposal so nothing is set in stone just yet. I'm very interested in working on the GC for this GSoC, so what would you suggest be my main focus? It sounds like you already have a GC that is more or less what I was planning on implementing...
Well, there are a number of unfinished parts left:

- as far as I understand, the precise GC PR https://github.com/D-Programming-Language/druntime/pull/1022 is currently stalled because it doesn't yield better overall performance than the current GC in most situations. The reasoning is that it should be able to compensate for the additional time during allocation (saving pointer information) by improved scanning, using that information to skip non-pointers. That didn't work out yet, though.

- last time I checked, generating RTInfo (which contains the pointer info) was a bit unreliable, see https://github.com/D-Programming-Language/dmd/pull/2480 and https://github.com/D-Programming-Language/dmd/pull/3958.

- I only implemented the DATA and TLS section TypeInfo emission for the OMF backend; this needs to be ported to other platforms. Martin Nowak recently considered just emitting pointer locations. This would make the scanning function a bit simpler, but might need a bit more memory in the binary.

Regarding the concurrent GC: I consider my implementation a prototype with rough edges and a lot of optimization opportunities. I'm not sure whether page protection is good enough to implement a generational GC, but it might still be possible to take advantage of the information that a page hasn't been written to. Judging from https://blog.golang.org/go15gc, Go 1.5 seems to be using a similar GC (concurrent mark-and-sweep), though with proper write barriers. I guess there is a lot of stuff that can be used from their experience.
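To make the "skip non-pointers" point a bit more concrete, here is a minimal sketch of scanning with a per-type pointer bitmap in the spirit of RTInfo (the names and the bitmap layout are invented for this example and do not match druntime's actual encoding):

struct PointerBitmap
{
    size_t sizeInWords;   // object size in pointer-sized words
    size_t[] bits;        // one bit per word, set = "this word may hold a pointer"
}

enum BITS_PER_WORD = 8 * size_t.sizeof;

bool mayBePointer(const ref PointerBitmap map, size_t word)
{
    return ((map.bits[word / BITS_PER_WORD] >> (word % BITS_PER_WORD)) & 1) != 0;
}

// Scan one object precisely: only words flagged in the bitmap are treated as
// potential references; ints, floats and padding are skipped outright.
void scanPrecisely(void* obj, const ref PointerBitmap map,
                   void delegate(void* candidate) mark)
{
    auto words = cast(void**) obj;
    foreach (i; 0 .. map.sizeInWords)
    {
        if (!mayBePointer(map, i))
            continue;
        if (auto p = words[i])
            mark(p);      // a real GC would also range-check p against its pools
    }
}

The trade-off described above is that this bitmap has to be produced and stored when the object is allocated, which is where the extra allocation cost comes from.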
 so what would you suggest be my main focus?
Given that 32-bit applications are becoming legacy, and false pointers are not a common problem in 64-bit processes (they do happen eventually, though), I suspect that concurrency of the GC would make a much larger impact on the D language than preciseness. A good target should be to reduce the stop-the-world time to something acceptable for interactive programs, i.e. well below 50ms.
Mar 19 2016
parent Adam Wilson <flyboynw gmail.com> writes:
Rainer Schuetze wrote:
 On 18.03.2016 22:04, Jeremy deHaan wrote:
 On Friday, 18 March 2016 at 16:41:21 UTC, Rainer Schuetze wrote:
 On 15.03.2016 02:34, Jeremy DeHaan wrote:
 [...]
Being always way behind reading the forum these days, I'm a little late and have not read all the messages in this thread thoroughly. Here are some thoughts: [...]
Thank you for the feedback. I'm still working on my proposal so nothing is set in stone just yet. I'm very interested in working on the GC for this GSoC, so what would you suggest be my main focus? It sounds like you already have a GC that is more or less what I was planning on implementing...
Well, there are a number of unfinished parts left: - as far as I understand the precise GC PR https://github.com/D-Programming-Language/druntime/pull/1022 is currently stalled because it doesn't yield better overall performance than the current GC in most situations. The reasoning is that it should be able compensate the additional time during allocation (saving pointer information) by improved scanning using that information to skip non-pointers. That didn't work out yet, though. - last time I checked generating RTInfo (which contains the pointer info) was a bit unreliable, see https://github.com/D-Programming-Language/dmd/pull/2480 and https://github.com/D-Programming-Language/dmd/pull/3958. - I only implemented the DATA and TLS section TypeInfo emission for the OMF backend, this needs to be ported to other platforms. Martin Nowak recently considered to just emit pointer locations. This would make the scanning function a bit simpler, but might need a bit more memory in the binary. Regarding the concurrent GC: I consider my implementation a prototype with rough edges and a lot of optimization opportunities. I'm not sure whether page protection is good enough to implement a generational GC, but it might still be possible to take advantage of the information that a page hasn't been written to. Judging from https://blog.golang.org/go15gc Go 1.5 seems to be using a similar GC (concurrent mark-and-sweep), though with proper write barriers. I guess there is a lot of stuff that can be used from their experience. > so what would you suggest be my main focus? Given that 32-bit applications are becoming legacy, and false pointers are not a common problem in 64-bit processes (they do happen eventually, though) I suspect that concurrency of the GC would make a much larger impact on the D language than preciseness. A good target should be to reduce stop-the-world-time to something acceptable for interactive programs, i.e. well below 50ms.
I would argue that precision is still important, as it enables a compacting GC. While it is technically possible to implement a generational GC without compaction, it's generally not considered wise. And even though 64-bit does not suffer from false pointers as much as 32-bit does, it still will, especially when used in the big-data, analytics, and scientific scenarios that D seems to have made real traction in. While false pointers are hardly a problem for a simple command line app, and probably not even for most Vibe.D servers, there is a significant class of work being done in D today that would be directly affected by them.

-- // Adam Wilson // quiet.dlang.dev
Mar 19 2016
prev sibling parent Craig Dillabaugh <craig.dillabaugh gmail.com> writes:
On Tuesday, 15 March 2016 at 01:34:07 UTC, Jeremy DeHaan wrote:
 I haven't had power for a couple of days, but it looks like the 
 discussion has gone along pretty ok. After reading everything, 
 I think I'm inclined to agree with Adam and the main focus of 
 my proposal will be a precise GC (or as precise as we can get). 
 I'll definitely need some guidance, but I'll be learning 
 everything I can about GC's and precision until the start of 
 the project.

 [...]
In case you didn't already know, Adam is now an official mentor so you are all set to go. Good luck!
Mar 18 2016
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 11-Mar-2016 18:13, Jeremy DeHaan wrote:
 Thank you all for the feedback.

 I think I might still need a little more feedback as to what the project
 should actually entail, but here's what it's looking like so far:

 Implement lock free allocation using std.experimental.allocator's
 freelists (SharedFreeList? It was the only thing in the documentation
 that specifically mentions lock free allocation)

 Implement a generational garbage collector

 Implement precise garbage collection (possibly piggybacking off of
 Rainer's work)
Let's not forget another low-hanging fruit, which is parallel marking. Regardless of the kind of collector we use, parallel marking is a must-have in our multicore world, and should be straightforward to implement in D.
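As a toy illustration of what parallel marking can look like with std.parallelism (the Node type below is a stand-in for real heap metadata, not how the druntime GC represents objects):

import core.atomic : cas;
import std.parallelism : parallel;

struct Node
{
    shared bool marked;   // set exactly once, by whichever worker claims the node first
    Node*[] children;     // outgoing references found by the scanner
}

// Trace everything reachable from one root using a local stack.
void markFrom(Node* root)
{
    Node*[] stack = [root];
    while (stack.length)
    {
        auto n = stack[$ - 1];
        stack.length -= 1;
        // claim the node atomically; if another worker already marked it, skip it
        if (n is null || !cas(&n.marked, false, true))
            continue;
        stack ~= n.children;
    }
}

void parallelMark(Node*[] roots)
{
    foreach (root; parallel(roots))   // the root set is split across worker tasks
        markFrom(root);
}

The only synchronization needed is the CAS on the mark flag; a real implementation would also want work stealing, because object graphs are rarely split evenly across the roots.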
 Does anyone think that might be too much to take on or does it sound
 pretty good? I have other garbage collector things I'd like to explore,
 but I they should probably go in the "if time allows" category at this
 point.

     Jeremy
-- Dmitry Olshansky
Mar 13 2016