digitalmars.D.learn - GC doesn't collect where expected

axricard <axelrwiko gmail.com> writes:
I'm doing some experiments with the ldc2 GC, by instrumenting it and 
printing basic information (what is allocated and freed).

My first tests are made on this sample:

```
 cat test2.d
import core.memory;

class Bar { int bar; }

class Foo {
  this()
  {
    this.bar = new Bar;
  }
  Bar bar;
}

void func()
{
  Foo f2 = new Foo;
}

int main()
{
  Foo f = new Foo;
  func();
  GC.collect();
  return 0;
}
```

When running it with the instrumented druntime, I get strange behavior: the first collection (done with GC.collect) doesn't sweep anything (in particular, it doesn't sweep memory allocated in _func()_). All the sweeping is done when the program finishes, at cleanup.

I don't understand why: memory allocated in _func()_ shouldn't be reachable from any root at the first collection, right?

```
╰─> /instrumented-ldc2 -g -O0 test2.d --disable-gc2stack --disable-d-passes --of test2 && ./test2 "--DRT-gcopt=cleanup:collect fork:0 parallel:0 verbose:2"
[test2.d:26] new 'test2.Foo' (24 bytes) => p = 0x7f3a0454d000
[test2.d:10] new 'test2.Bar' (20 bytes) => p = 0x7f3a0454d020
[test2.d:21] new 'test2.Foo' (24 bytes) => p = 0x7f3a0454d040
[test2.d:10] new 'test2.Bar' (20 bytes) => p = 0x7f3a0454d060
============ COLLECTION =============
        ============= MARKING ==============
        marking range: [0x7fff22337a60..0x7fff22339000] (0x15a0)
                range: [0x7f3a0454d000..0x7f3a0454d020] (0x20)
                range: [0x7f3a0454d040..0x7f3a0454d060] (0x20)
        marking range: [0x7f3a0464d720..0x7f3a0464d8b9] (0x199)
        marking range: [0x46c610..0x47b3b8] (0xeda8)
        ============= SWEEPING ==============
=====================================================
============ COLLECTION =============
        ============= MARKING ==============
        marking range: [0x46c610..0x47b3b8] (0xeda8)
        ============= SWEEPING ==============
        Freeing test2.Foo (test2.d:26; 24 bytes) (0x7f3a0454d000). AGE : 1/2
        Freeing test2.Bar (test2.d:10; 20 bytes) (0x7f3a0454d020). AGE : 1/2
        Freeing test2.Foo (test2.d:21; 24 bytes) (0x7f3a0454d040). AGE : 1/2
        Freeing test2.Bar (test2.d:10; 20 bytes) (0x7f3a0454d060). AGE : 1/2
=====================================================
```
Jun 19 2023
Steven Schveighoffer <schveiguy gmail.com> writes:
In general, the language does not guarantee when the GC will collect your item.

In this specific case, most likely it's a stale register or stack reference. One way I usually use to ensure such things is to call a function that destroys the existing stack:

```d
void clobber()
{
   int[2048] x;
}
```

Calling this function will clear out 2048x4 bytes of data to 0 on the stack.

-Steve
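A minimal sketch of how the suggestion above might be applied to the test program, assuming a stock (uninstrumented) druntime. The `finalized` counter and the `~this()` destructor are additions for observing the result and are not part of the original sample. Whether the count changes depends on the compiler, flags, and platform, so no particular output is implied.

```d
import core.memory : GC;
import core.stdc.stdio : printf;

// Illustration only: counts how many Foo finalizers have run.
__gshared int finalized;

class Bar { int bar; }

class Foo
{
    Bar bar;
    this() { bar = new Bar; }
    // Finalizers must not allocate from the GC or touch other GC objects;
    // bumping a global counter is fine.
    ~this() { ++finalized; }
}

void func()
{
    Foo f2 = new Foo; // unreachable once func() returns
}

void clobber()
{
    int[2048] x; // default-initialized to 0: overwrites 2048*4 bytes of stack
}

int main()
{
    Foo f = new Foo;
    func();
    clobber();    // try to wipe the stale copy of f2 left in func()'s old frame
    GC.collect();
    // The count is not guaranteed: the optimizer may drop the zeroing,
    // and a register may still hold a stale pointer.
    printf("Foo finalizers run after GC.collect: %d\n", finalized);
    return 0;
}
```

Comparing the printed count with and without the `clobber()` call is one way to see whether a stale stack slot was what kept the object alive.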
Jun 19 2023
Anonymouse <zorael gmail.com> writes:
On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer 
wrote:
 In this specific case, most likely it's a stale register or 
 stack reference. One way I usually use to ensure such things is 
 to call a function that destroys the existing stack:

 ```d
 void clobber()
 {
    int[2048] x;
 }
 ```

 Calling this function will clear out 2048x4 bytes of data to 0 
 on the stack.

 -Steve
Could you elaborate on how you use this? When do you call it? Just, every so often, or is there thought behind it?
Jun 19 2023
Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/19/23 12:51 PM, Anonymouse wrote:
 On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer wrote:
 In this specific case, most likely it's a stale register or stack 
 reference. One way I usually use to ensure such things is to call a 
 function that destroys the existing stack:

 ```d
 void clobber()
 {
    int[2048] x;
 }
 ```

 Calling this function will clear out 2048x4 bytes of data to 0 on the 
 stack.
 Could you elaborate on how you use this? When do you call it? Just, every so often, or is there thought behind it?

Just before forcing a collect.

The stack is *always* scanned conservatively, and even though really the stack data should be blown away by the next function call (probably GC.collect), it doesn't always work out that way. Indeed, even just declaring `x` might not do it if the compiler decides it doesn't actually have to.

But I've found that seems to help.

-Steve
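Since a plain `int[2048] x;` can be optimized away, one possible variant writes the zeros through `volatileStore`, which the compiler is not supposed to elide. This is only a sketch: `clobberStack` is a hypothetical helper name, it assumes a druntime recent enough to ship `core.volatile`, and it still does nothing about stale pointers held in registers.

```d
import core.memory : GC;
import core.volatile : volatileStore;

// Hypothetical helper, not from the thread: zeroes 2048*4 bytes of stack
// using volatile stores so the writes are not treated as dead stores.
pragma(inline, false)
void clobberStack()
{
    uint[2048] buf = void;           // skip default init; filled explicitly below
    foreach (i; 0 .. buf.length)
        volatileStore(&buf[i], 0u);  // overload: volatileStore(uint* ptr, uint value)
}

void main()
{
    // ... code that leaves garbage behind would go here ...
    clobberStack();  // call it just before forcing the collection
    GC.collect();
}
```

The buffer size mirrors the original `int[2048]`; a deeper call chain before the collect may need a larger one.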
Jun 19 2023
axricard <axelrwiko gmail.com> writes:
On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer 
wrote:
 In general, the language does not guarantee when the GC will 
 collect your item.

 In this specific case, most likely it's a stale register or 
 stack reference. One way I usually use to ensure such things is 
 to call a function that destroys the existing stack:

 ```d
 void clobber()
 {
    int[2048] x;
 }
 ```

 Calling this function will clear out 2048x4 bytes of data to 0 
 on the stack.

 -Steve
All clear, thank you!
Jun 19 2023
axricard <axelrwiko gmail.com> writes:
On Monday, 19 June 2023 at 16:43:30 UTC, Steven Schveighoffer 
wrote:
 In general, the language does not guarantee when the GC will 
 collect your item.

 In this specific case, most likely it's a stale register or 
 stack reference. One way I usually use to ensure such things is 
 to call a function that destroys the existing stack:

 ```d
 void clobber()
 {
    int[2048] x;
 }
 ```

 Calling this function will clear out 2048x4 bytes of data to 0 
 on the stack.

 -Steve
Does it mean that if my function _func()_ is as follows (say I don't use clobber), I could keep a lot of memory alive for a very long time (until the stack is fully erased by other function calls)?

```
void func()
{
    Foo[2048] x;
    foreach(i; 0 .. 2048)
      x[i] = new Foo;
}
```
Jun 19 2023
Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/19/23 2:01 PM, axricard wrote:

 
 Does it mean that if my function _func()_ is as following (say I don't 
 use clobber), I could keep a lot of memory for a very long time (until 
 the stack is fully erased by other function calls) ?
 
 
 ```
 void func()
 {
     Foo[2048] x;
     foreach(i; 0 .. 2048)
       x[i] = new Foo;
 }
 ```
 
When the GC stops all threads, each of them registers their *current* stack as the target to scan, so most likely not.

However, the compiler/optimizer is not trying to zero out stack unnecessarily, and this likely leads in some cases to false pointers. Like I said, even the "clobber" function might not actually zero out any stack, because the compiler may decide that writing zeros to the stack that will never be read is a "dead store" and just omit it.

This question comes up somewhat frequently: "why isn't the GC collecting the garbage I gave it!", and the answer is mostly "don't worry about it". There is no real good way to guarantee an interaction between the compiler, the optimizer, and the runtime to make sure something happens one way or another.

The only thing you really should care about is if you have a reference to an item and it's prematurely collected. Then there is a bug. Other than that, just don't worry about it.

-Steve
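The one property worth checking, that a referenced object is never collected, can be sketched as below. All names here (`keep`, `keptFinalized`, `garbageFinalized`, `makeGarbage`) are made up for the illustration. It assumes the default GC, which scans global (`__gshared`) data as a root range.

```d
import core.memory : GC;
import core.stdc.stdio : printf;

__gshared bool keptFinalized;   // set if the live object is ever finalized
__gshared int garbageFinalized; // finalizers run for unreferenced objects
__gshared Foo keep;             // global reference: scanned as a GC root

class Foo
{
    int id;
    this(int id) { this.id = id; }
    ~this()
    {
        // Reading our own fields is fine inside a finalizer.
        if (id == 0) keptFinalized = true;
        else ++garbageFinalized;
    }
}

void makeGarbage()
{
    foreach (i; 1 .. 1001)
    {
        auto tmp = new Foo(i); // reference dropped at the end of each iteration
    }
}

void main()
{
    keep = new Foo(0);
    makeGarbage();
    GC.collect();

    // The guarantee that matters: a referenced object is not collected.
    assert(!keptFinalized);
    assert(keep.id == 0);

    // How much garbage went away is unspecified; a few objects may survive
    // because of stale stack or register contents.
    printf("garbage finalized: %d of 1000\n", garbageFinalized);
}
```

If the first assertion ever fails, that would be the kind of bug described above; a finalized count below 1000 is not.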
Jun 19 2023