
digitalmars.D.learn - GC scan for pointers

reply Gerald Jansen <gjansenXXX XXXownmail.net> writes:
I've studied [1] and [2] but don't understand everything there. 
Hence these dumb questions:

Given

   enum n = 100_000_000; // some big number
   auto a = new ulong[](n);
   auto b = new char[8][](n);
   struct S { ulong x; char[8] y; }
   auto c = new S[](n);

will the large memory blocks allocated for a, b and/or c actually 
be scanned for pointers to GC-allocated memory during a garbage 
collection? If so, why?

[1] http://p0nce.github.io/d-idioms/#How-the-D-Garbage-Collector-works
[2] http://dlang.org/garbage.html
Mar 09 2016
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 9 March 2016 at 15:14:02 UTC, Gerald Jansen wrote:
 will the large memory blocks allocated for a, b and/or c 
 actually be scanned for pointers to GC-allocated memory during 
 a garbage collection? If so, why?
No. The GC knows that the element type has no pointers in it, so it will not scan those blocks for them. If it were a struct containing a pointer, the block would be scanned, though. Static arrays of int on the stack will also be scanned, since the GC doesn't actually know much about local variables: it conservatively assumes anything on the stack might be a pointer. But large arrays are rarely on the stack, so I think that is an OK situation. See the GC block attribute flags: http://dpldocs.info/experimental-docs/core.memory.GC.BlkAttr.html
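As a rough illustration of "the type has no pointers in it", the check can be approximated at compile time with `std.traits.hasIndirections`. This is only a sketch: the name `needsScan` and the struct `P` are made up for the example, and the runtime may reach its decision through a different mechanism.

```d
import std.traits : hasIndirections;

struct S { ulong x; char[8] y; }   // plain values only
struct P { ulong x; char[] y; }    // hypothetical variant: char[] is pointer + length

// Hypothetical helper: true if T holds anything the GC would have to follow.
enum bool needsScan(T) = hasIndirections!T;

static assert(!needsScan!ulong);
static assert(!needsScan!(char[8]));
static assert(!needsScan!S);
static assert( needsScan!P);
static assert( needsScan!(int*));

void main()
{
    import std.stdio : writeln;
    writeln("S needs scanning: ", needsScan!S);
    writeln("P needs scanning: ", needsScan!P);
}
```

All three element types from the question come out as pointer-free, which matches the answer above.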
Mar 09 2016
parent Chris Wright <dhasenan gmail.com> writes:
On Wed, 09 Mar 2016 15:50:43 +0000, Adam D. Ruppe wrote:
 Or static
 arrays of int on the stack will also be scanned, since the GC doesn't
 actually know much about local variables
It's especially tricky because compilers can reuse memory on the stack. For instance, if I use one variable in the first half of a function, stop using it, and start using another one, the compiler can save me some stack space by putting them at the same address. Plus it's a bit more straightforward to make a performant check for whether a type might contain a pointer than for whether a stack frame might hold one. With types, it takes one pointer dereference. With stack frames, you have to look through some dictionary stored somewhere.
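To make the per-type check concrete, here is a small sketch that reads the `flags` property of a type's `TypeInfo`, where bit 0 is documented as meaning the GC should scan for pointers. The helper `mayHoldPointers` and the struct `P` are hypothetical names for this example, and whether the allocator consults exactly this bit may vary between druntime versions.

```d
struct S { ulong x; char[8] y; }
struct P { ulong x; char[] y; }    // hypothetical variant with a slice field

// Hypothetical helper: bit 0 of TypeInfo.flags is set when the type may
// contain pointers that the GC has to follow.
bool mayHoldPointers(const TypeInfo ti)
{
    return (ti.flags & 1) != 0;
}

void main()
{
    import std.stdio : writeln;
    writeln("ulong:   ", mayHoldPointers(typeid(ulong)));
    writeln("char[8]: ", mayHoldPointers(typeid(char[8])));
    writeln("S:       ", mayHoldPointers(typeid(S)));
    writeln("P:       ", mayHoldPointers(typeid(P)));
}
```

The answer is stored once per type, so the lookup does not depend on where or how a value of that type is used. Nothing comparable is available for an arbitrary stack slot.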
Mar 09 2016
prev sibling parent reply thedeemon <dlang thedeemon.com> writes:
On Wednesday, 9 March 2016 at 15:14:02 UTC, Gerald Jansen wrote:
 will the large memory blocks allocated for a, b and/or c 
 actually be scanned for pointers to GC-allocated memory during 
 a garbage collection? If so, why?
I've just tested it with my GC tracker (https://bitbucket.org/infognition/dstuff): all 3 allocations are made with the flags APPENDABLE | NO_SCAN, which means these blocks will not be scanned. But if you define S as struct S { ulong x; char[] y; }, so that there is a pointer inside, then it gets allocated with just the APPENDABLE flag, i.e. it will be scanned.
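The same observation can be reproduced without a separate tracker by asking the GC for a block's attributes. The following is a minimal sketch using `core.memory.GC.query`. The helper name `isNoScan`, the struct `P`, and the small `n` are assumptions made for the example, and the exact set of other flags may differ between druntime versions.

```d
import core.memory : GC;

struct S { ulong x; char[8] y; }
struct P { ulong x; char[] y; }    // hypothetical variant with a pointer inside

// Hypothetical helper: true if the GC block holding arr is marked NO_SCAN.
// GC.query is used because arr.ptr is not necessarily the base address
// of the block.
bool isNoScan(T)(T[] arr)
{
    return (GC.query(arr.ptr).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    import std.stdio : writeln;
    enum n = 1_000;  // small on purpose; the flags do not depend on the size

    writeln("ulong[]   NO_SCAN: ", isNoScan(new ulong[](n)));
    writeln("char[8][] NO_SCAN: ", isNoScan(new char[8][](n)));
    writeln("S[]       NO_SCAN: ", isNoScan(new S[](n)));
    writeln("P[]       NO_SCAN: ", isNoScan(new P[](n)));
}
```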
Mar 10 2016
parent Gerald Jansen <gjansenXXX XXXownmail.net> writes:
On Thursday, 10 March 2016 at 10:58:41 UTC, thedeemon wrote:
 I've just tested it with my GC tracker ( 
 https://bitbucket.org/infognition/dstuff ), all 3 allocations go 
 with flags APPENDABLE | NO_SCAN which means these blocks will not 
 be scanned. But if you define S as struct S { ulong x; char[] y; } 
 so there is some pointer inside, then it gets allocated with just 
 APPENDABLE flag, i.e. it will be scanned then.
Thanks for the very clear answer. Adam too. This alleviates much of my fear of GC performance issues for processing largish datasets in memory with traditional loops, even with multiple threads. Of course, it depends on wasting some memory to avoid char[] fields, but that is often a reasonable trade-off for the kind of data I need to process.
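One way to picture the trade-off is a record that stores a short tag in a fixed `char[8]` field, padded with NUL bytes, instead of a `char[]` slice. The names `Rec`, `packTag` and `unpackTag` below are hypothetical, and the sketch assumes tags never exceed 8 bytes.

```d
struct Rec { ulong id; char[8] tag; }   // no pointers, so Rec[] can be NO_SCAN

static assert(Rec.sizeof == 16);

// Hypothetical helper: copy s into a NUL-padded fixed-size field.
char[8] packTag(const(char)[] s)
{
    assert(s.length <= 8, "tag too long for char[8]");
    char[8] buf = '\0';          // every element starts as NUL
    buf[0 .. s.length] = s[];    // copy the payload, padding stays
    return buf;
}

// Hypothetical helper: recover the text up to the first NUL (or all 8 bytes).
string unpackTag(const char[8] buf)
{
    size_t len = 0;
    while (len < buf.length && buf[len] != '\0')
        ++len;
    return buf[0 .. len].idup;
}

void main()
{
    import std.stdio : writeln;
    auto r = Rec(42, packTag("abc"));
    writeln(r.id, " ", unpackTag(r.tag));
}
```

Tags shorter than 8 bytes waste the padding, and longer ones do not fit at all, which is the memory cost mentioned above.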
Mar 11 2016