digitalmars.D - What should happen here?
- Steven Schveighoffer (34/34) Sep 20 2021 Without any context, what do you think should happen here?
- Johan (9/40) Sep 20 2021 I believe all options are valid, because there is no guarantee if
- Johan (7/53) Sep 20 2021 FWIW I believe option 1 should only happen with this "to stack"
- IGotD- (10/11) Sep 20 2021 Is this a trick question?
- Steven Schveighoffer (6/22) Sep 20 2021 You are kind of right. The question I bring up, just posted, is how do
- Steven Schveighoffer (51/51) Sep 20 2021 OK, so here is what Actually happens consistently (all this is with dmd)...
- IGotD- (13/14) Sep 20 2021 I think regardless, two worlds shouldn't mix. If you have a C
- IGotD- (12/12) Sep 20 2021 On Monday, 20 September 2021 at 19:00:38 UTC, IGotD- wrote:
- Steven Schveighoffer (6/24) Sep 20 2021 Then why are pointers to structs, arrays, structs containing class
- Steven Schveighoffer (20/26) Sep 20 2021 If I change the struct initialization to a function it has the same
- Elronnd (5/8) Sep 20 2021 Hmm. No opinion on 'should', but you ought to be able to insert
- Johan (15/40) Sep 21 2021 First: the use of "stack" here is wrong and confusing. It should
- Steven Schveighoffer (30/71) Sep 21 2021 Yikes, that's quite aggressive. It says even a method can be in progress...
- bauss (5/82) Sep 21 2021 What about a scope variable that holds an instance of a class? It
- Paulo Pinto (6/26) Sep 21 2021 You just rediscovered Go's runtime.KeepAlive(), and C#
- Johan (21/95) Sep 21 2021 I think this is not unreasonable to implement, it is similar to
- Steven Schveighoffer (15/58) Sep 21 2021 Probably you don't need to push to the stack unless the last usage is
- Steven Schveighoffer (5/20) Sep 22 2021 I made a package for something like this:
- Johan (6/26) Sep 23 2021 For the simple Pin version above, LDC generates the same machine
- Steven Schveighoffer (31/35) Sep 23 2021 So that's not showing the issue (without the -version=PIN)
- Steven Schveighoffer (9/10) Sep 23 2021 Nevermind, the important thing to read there is that it's not pushing it...
- Johan (4/5) Sep 23 2021 For LDC (and I expect GDC too), the asm trick works.
- Steven Schveighoffer (8/13) Sep 23 2021 oooh, really? That's cool. Maybe I'll update the library and re-register...
- max haughton (5/20) Sep 23 2021 RSP is the stack pointer and [RSP] refers to its value, so yes.
- Steven Schveighoffer (6/22) Sep 23 2021 Hm... it cancels all optimizations. No inlining either, or removal of
- H. S. Teoh (7/31) Sep 23 2021 [...]
- Steven Schveighoffer (8/18) Sep 23 2021 You can. But wouldn't you prefer just pushing something on the stack?
- Daniel N (4/12) Sep 23 2021 I would expect this to work on all platforms and compilers...
- Steven Schveighoffer (4/19) Sep 23 2021 This isn't quite the same. This puts c's guts on the stack, which is
- IGotD- (7/15) Sep 23 2021 It doesn't matter where it is, stack or register. What is
- Steven Schveighoffer (8/25) Sep 24 2021 Right, the registers are scanned too. In fact, I'm pretty sure when
- deadalnix (5/7) Sep 24 2021 Not really. If the optimizer can remove dead stack pushes, then
- max haughton (5/13) Sep 24 2021 Doubt it would be 2x on a modern CPU. Point stands though. LLVM
- Steven Schveighoffer (4/11) Sep 24 2021 You think pushing on the stack is going be 2x slower than calling
- H. S. Teoh (15/28) Sep 24 2021 [...]
- Steven Schveighoffer (26/38) Sep 24 2021 The "official" docs [also
- deadalnix (4/18) Sep 26 2021 If the optimizer isn't free to optimize thing away from the
- max haughton (3/23) Sep 26 2021 Zen3 can actually promote a spill onto the stack into its
- Steven Schveighoffer (6/23) Sep 26 2021 You realize what `GC.addRoot` does? It adds a pointer to a treap,
- max haughton (5/22) Sep 26 2021 I think Amaury means that the code would be slower if the
- deadalnix (8/14) Sep 27 2021 Sure, but it does so exclusively for one specific address. The
- Steven Schveighoffer (8/22) Sep 27 2021 Yes, I'm not looking to have all dead stores kept, just ones that are
- Max Samukha (5/10) Sep 27 2021 If performance was important, you would want to allocate the
- Steven Schveighoffer (4/18) Sep 27 2021 That's not as @safe, but it can be a viable option, as long as your
- jfondren (5/17) Sep 27 2021 `scope` makes sense for the examples given but "I want this
- Elronnd (3/6) Sep 27 2021 The cost of GC _collection_ outweighs that of GC.addRoot by a
- Max Samukha (2/8) Sep 28 2021 Yeah, "by a lot" is nonsense.
- deadalnix (9/15) Sep 27 2021 Honestly, I don't see why storing the GC root needs to be
- Elronnd (5/6) Sep 27 2021 Because you need to be able to remove the root later. So it
- deadalnix (4/10) Sep 28 2021 Even then, unless you addRoot like mad, linear search through the
- John Colvin (3/15) Sep 29 2021 If you search from most recently added to least recent then it’ll
- Paulo Pinto (18/33) Sep 23 2021 It should, that is also what the keep alive from C# and Go that I
- Steven Schveighoffer (7/24) Sep 24 2021 Right, what I'd prefer is something that doesn't actually require a call...
- Steven Schveighoffer (4/8) Sep 23 2021 I deregistered it as it doesn't work to solve the problem (it might
- Kagamin (16/30) Sep 24 2021 Another way:
- Walter Bright (3/18) Sep 24 2021 Use addRoot()/removeRoot() to do this in a documented and supported fash...
- Walter Bright (7/14) Sep 24 2021 Data flow analyzers are not scope-based, they are based on first use and...
- Kagamin (4/10) Sep 22 2021 If you use the pointer after the call, that's an easy way to
- Steven Schveighoffer (7/16) Sep 22 2021 The point is, you may only use it via functions/mechanisms that are not
- Steven Schveighoffer (7/16) Sep 22 2021 And by the way I tried naive usage, and the compiler saw right through t...
- Johan (21/39) Sep 22 2021 I think anything that is (close to) zero-overhead is what the
- Ali Çehreli (9/12) Sep 22 2021 Nobody seems to see the violation of "lifetime being the length of the
- IGotD- (11/16) Sep 22 2021 Not according to most ABIs. Parameters are usually not preserved
- Adam D Ruppe (7/9) Sep 22 2021 The compiler thinks the contents don't need to be preserved at
- Steven Schveighoffer (4/12) Sep 22 2021 What if you are calling D functions that call extern(C) functions?
- Adam D Ruppe (7/9) Sep 22 2021 Then it will be an argument to that other D function which makes
- Steven Schveighoffer (12/22) Sep 22 2021 While I don't necessarily disagree with you, this removes all
- Paul Backus (13/21) Sep 22 2021 Either (a) the compiler must assume, pessimistically, that a
- IGotD- (21/32) Sep 22 2021 The problem with a, is that eats up resources such as
- jfondren (8/13) Sep 22 2021 If GC.keepAlive is added, how do you know when to use it without
- jfondren (58/69) Sep 22 2021 Neither of these solutions would've helped with the original code
- Kagamin (3/14) Sep 22 2021 Finally someone tries to solve the real problem. It's beyond me
- Kagamin (8/14) Sep 22 2021 That just means you don't properly use it. It's so in this case,
- Paul Backus (14/45) Sep 20 2021 IMO all of the options are valid.
- Ali Çehreli (10/21) Sep 21 2021 Your explanation at first :) seems invalid because it seems to disregard...
- Dukc (4/35) Sep 21 2021 Most likely option 1. May also be option 3 though, as I remember
- Adam D Ruppe (4/5) Sep 21 2021 Some context in my blog now:
- Steven Schveighoffer (3/3) Sep 22 2021 FYI, I filed an issue so it's not forgotten.
- deadalnix (3/5) Sep 23 2021 This ^
- Ali Çehreli (4/4) Sep 24 2021 Off topic, I heard that some people missed this very important thread
- Walter Bright (26/27) Sep 24 2021 1, 2, or 3 are all valid outcomes.
- Ali Çehreli (7/12) Sep 24 2021 I am sure I have dead references in my D library that exposes a C API.
- Steven Schveighoffer (38/75) Sep 25 2021 Right, but the optimizer is working against that.
- Elronnd (3/8) Sep 25 2021 Indeed; additionally, 'scope c = new Class()' _does_ follow RAII.
- Walter Bright (2/3) Sep 25 2021 That's more of an optimization than a semantic shift.
- Daniel N (9/13) Sep 26 2021 // This works fine...
- Paul Backus (13/19) Sep 26 2021 Destroy on a class reference doesn't call the destructor; it just
- Paul Backus (4/14) Sep 26 2021 Correction: I remembered wrong; it actually does call the
- Walter Bright (16/38) Sep 25 2021 I understand your point. It's value is never used again, so there is no ...
- jfondren (19/21) Sep 25 2021 In the case that spawned this thread, the problem was a segfault
- jfondren (4/5) Sep 25 2021 Although I'd still like an efficient GC.keepAlive. My worry is
- Walter Bright (2/4) Sep 25 2021 Use GC.addRoot() to keep a reference alive. That's what it's for.
- Ali Çehreli (13/37) Sep 25 2021 I must be misunderstanding Steven's example. You say "its value is never...
- Adam D Ruppe (6/8) Sep 25 2021 If it is a final method that doesn't actually use any class
- Steven Schveighoffer (18/50) Sep 25 2021 Yes, that is the case today.
- Walter Bright (8/11) Sep 25 2021 I'd reframe it as the user is trying to impose his own semantics on the
- H. S. Teoh (31/40) Sep 26 2021 IMO, those are just the symptoms.
- Steven Schveighoffer (4/22) Sep 26 2021 OK.
Without any context, what do you think should happen here?

```d
import std.stdio;
import core.memory;

class C
{
    ~this() { writeln("dtor"); }
}

void main()
{
    auto c = new C;
    foreach(i; 0 .. 10000) GC.collect;
    writeln("end of main");
}
```

Option 1:
```
end of main
dtor
```

Option 2:
```
dtor
end of main
```

Option 3:
```
end of main
```

Option 4: Option 1 or 2, depending on entropy.

I'll post a response with what I've observed, and further discussion. I just want people to consider the above only with their initial expectations.

-Steve
Sep 20 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Without any context, what do you think should happen here?
> ```d
> import std.stdio;
> import core.memory;
>
> class C
> {
>     ~this() { writeln("dtor"); }
> }
>
> void main()
> {
>     auto c = new C;
>     foreach(i; 0 .. 10000) GC.collect;
>     writeln("end of main");
> }
> ```
> Option 1:
> ```
> end of main
> dtor
> ```
> Option 2:
> ```
> dtor
> end of main
> ```
> Option 3:
> ```
> end of main
> ```
> Option 4: Option 1 or 2, depending on entropy.

I believe all options are valid, because there is no guarantee if and when class destructors are called.

I'm guessing without optimization, option 3 will happen most. With optimization turned on, option 2. Or maybe the object is put on the stack (`scope` or LDC's heap->stack optimization) and so option 1 happens?

-Johan
Sep 20 2021
On Monday, 20 September 2021 at 18:38:03 UTC, Johan wrote:
> On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
>> Without any context, what do you think should happen here?
>> ```d
>> import std.stdio;
>> import core.memory;
>>
>> class C
>> {
>>     ~this() { writeln("dtor"); }
>> }
>>
>> void main()
>> {
>>     auto c = new C;
>>     foreach(i; 0 .. 10000) GC.collect;
>>     writeln("end of main");
>> }
>> ```
>> Option 1:
>> ```
>> end of main
>> dtor
>> ```
>> Option 2:
>> ```
>> dtor
>> end of main
>> ```
>> Option 3:
>> ```
>> end of main
>> ```
>> Option 4: Option 1 or 2, depending on entropy.
>
> I believe all options are valid, because there is no guarantee if and when class destructors are called. I'm guessing without optimization, option 3 will happen most. With optimization turned on, option 2. Or maybe the object is put on the stack (`scope` or LDC's heap->stack optimization) and so option 1 happens?

FWIW I believe option 1 should only happen with this "to stack" optimization (object lifetime provably confined to the function), because we are not obliged by the language spec to collect garbage after `main` ends (so it's best for performance to not collect).

-Johan
Sep 20 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Without any context, what do you think should happen here?

Is this a trick question?

In a perfect world I think option 1 should happen. However, doesn't it depend on what the compiler decides to do with the variable `c`? It can be in a register or it can be on the stack. If it is on the stack, then the GC believes it is still in use. If it is in a register, then the compiler is likely to optimize `c` away before the loop and the GC destroys it.

In reality, I guess option 4.
Sep 20 2021
On 9/20/21 2:39 PM, IGotD- wrote:
> On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
>> Without any context, what do you think should happen here?
>
> Is this a trick question?

Of course it's a trick question! No normal questions are asked this way ;)

> In a perfect world I think option 1 should happen. However doesn't it depend on how the compiler decide what to do with variable c. It can be in a register or it can be on stack. If it is on stack, then the GC believe it is still in use. If it is in a register, then the compiler is likely to optimize c away before the loop and the GC destroys it. In reality, I guess option 4.

You are kind of right. The question I bring up, just posted, is how do you solve passing a class reference into a C library that should only be used during the function?

-Steve
Sep 20 2021
OK, so here is what actually happens consistently (all this is with dmd):

1. macOS 64-bit: Option 2
2. Windows 64-bit: Option 1
3. Windows 32-bit: Option 2
4. Windows 32-bit with -g: Option 1
5. Linux 64-bit: Option 1
6. Linux 32-bit: Option 2

I'm sure other compilers would have varying degrees of options.

What is happening? The variable `c` is sometimes not stored on the stack, but in a register. After a call to `GC.collect`, that register is overwritten, there is no longer any reference to the object, and it gets collected.

In this simple example, we are doing nothing with `c` afterwards. But there is a real case of trouble happening to someone [here](https://forum.dlang.org/post/xchnfzvpmxgytqprbosz@forum.dlang.org). They are passing the class reference into a C function that registers the class in a place the GC can't see. The GC collects that information, and by the time that C library uses that (all *before* `main` exits), the data is invalid.

I feel like this might not necessarily be an issue, because technically, you aren't using `c` any more, so it can be deallocated immediately. But right in our documentation [here](https://dlang.org/spec/interfaceToC.html#storage_allocation) it lists ways to alleviate this:

```
If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by:

* Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead.
* Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack.
* Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment.
* Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls.
```

This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.

Note that I don't think any other data types allocated on the heap do this. I tried to change `C` to a struct, and it is not collected. Change the line to `auto c = [new C]` and it works fine.

It's just class references that the compiler seems to not care about ensuring stack references stay alive.

Should it be this way?

-Steve
Sep 20 2021
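Editor's sketch of the last bullet in the quoted spec text (registering the pointer with the GC). It is a minimal illustration, not code from the thread. It assumes the current `core.memory` names `GC.addRoot` and `GC.removeRoot` rather than the older `std.gc` names in the quote, and the `destroyed` flag and `rootedObjectSurvives` helper are hypothetical, added only to make the effect observable.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical flag, set by the destructor so the effect can be observed.
__gshared bool destroyed;

class C
{
    ~this() { destroyed = true; }
}

/// Returns true if an object registered as a GC root survived
/// repeated collections.
bool rootedObjectSurvives()
{
    destroyed = false;
    auto c = new C;
    void* p = cast(void*) c;

    GC.addRoot(p);                 // the GC now treats the object as reachable
    scope (exit) GC.removeRoot(p); // always pair addRoot with removeRoot

    foreach (i; 0 .. 100)
        GC.collect();

    return !destroyed;
}

void main()
{
    writeln("survived: ", rootedObjectSurvives());
}
```

Unlike relying on where the compiler happens to keep `c`, the root stays registered until `removeRoot` runs, so the result does not depend on register allocation. The cost is a call into the GC on entry and exit.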
On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
> Should it be this way?

I think regardless, two worlds shouldn't mix. If you have a C function that also stores the allocated data then that data should also be allocated in "C world". Typically you have an additional C function that creates the data/object that you then pass to the C functions. I find it more a design pattern/API problem. I'm sure there are exceptions to what I described where it cannot be used.

Also, the "leaving the variable on the stack" option should be removed from the documentation, because how can we know what the compiler decides to do with it?
Sep 20 2021
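Editor's sketch of the "allocate it in C world" approach, which is also the first bullet of the spec text quoted earlier (copy the data with `malloc` and pass the copy). The `Payload` type and `copyToCHeap` helper are hypothetical names for illustration. The sketch assumes the payload holds no pointers into the GC heap, since the GC does not scan `malloc`'d memory.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;
import std.stdio : writeln;

// Hypothetical payload type; it holds no pointers into the GC heap.
struct Payload
{
    int id;
    double value;
}

/// Copies `src` into malloc'd memory that the GC neither scans nor frees.
/// Returns null if the allocation fails. The caller must call `free`.
Payload* copyToCHeap(ref const Payload src)
{
    auto p = cast(Payload*) malloc(Payload.sizeof);
    if (p !is null)
        memcpy(p, &src, Payload.sizeof);
    return p;
}

void main()
{
    auto original = Payload(42, 1.5);
    Payload* copy = copyToCHeap(original);
    if (copy is null)
        return;
    scope (exit) free(copy);

    // `copy` could now be handed to a C function that stores the pointer.
    writeln(copy.id, " ", copy.value);
}
```

The lifetime of the copy is then entirely manual, which sidesteps the GC question at the price of an explicit `free`.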
On Monday, 20 September 2021 at 19:00:38 UTC, IGotD- wrote:

I also think this doesn't have much to do with the GC of D but is more a general lifetime behaviour. In Rust (feel free to kick me in the butt every time I mention Rust), the free call might happen after the variable is last used and not necessarily at the end of scope. That behaviour would yield the same result: C functions would access invalid memory. The only thing that saves Rust is that when you send a variable to C world, the compiler gives up, considers it gone and does nothing.

The next question would be, would the upcoming borrow checker in D solve any of this?
Sep 20 2021
On 9/20/21 3:00 PM, IGotD- wrote:
> On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
>> Should it be this way?
>
> I think regardless, two worlds shouldn't mix. If you have a C function that also stores the allocated data then that data should also be allocated in "C world". Typically you have an additional C function that creates the data/object that you then pass to the C functions. I find it more a design pattern/API problem. I'm sure there are exceptions to what I described where it cannot be used. Also, leaving the variable on stack option should be removed in the documentation because how can we know what the compiler decides to do with it.

Then why are pointers to structs, arrays, structs containing class references not treated the same? I'm not sure why class references are singled out, but they are for some reason.

-Steve
Sep 20 2021
On 9/20/21 3:20 PM, Steven Schveighoffer wrote:
> Then why are pointers to structs, arrays, structs containing class references not treated the same? I'm not sure why class references are singled out, but they are for some reason.

If I change the struct initialization to a function it has the same behavior. e.g.:

```d
import std.stdio;   // writeln
import core.memory; // GC

struct S
{
    ~this() { writeln("dtor"); }
}

// Allocating through a function call instead of directly in main
auto makes() { return new S; }

void main()
{
    auto s = makes();
    GC.collect();
    GC.collect();
    writeln("end of main");
}
```

Also shows option 2. So it has something to do with how the return value is stored.

-Steve
Sep 20 2021
On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
> It's just class references that the compiler seems to not care about ensuring stack references stay alive. Should it be this way?

Hmm. No opinion on 'should', but you ought to be able to insert a volatileRead late in the function in order to ensure the object stays alive.
Sep 20 2021
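Editor's sketch of what the volatile-read suggestion might look like, assuming the druntime intrinsic `volatileLoad` from `core.volatile` is what is meant by "volatileRead". The `touch` helper is a hypothetical name. Whether this reliably keeps the object alive on every compiler and optimization level is not established in the thread, so treat it as an experiment rather than a guarantee.

```d
import core.memory : GC;
import core.volatile : volatileLoad;
import std.stdio : writeln;

class C
{
    ~this() { writeln("dtor"); }
}

/// Hypothetical helper: performs a volatile read of the variable that holds
/// the reference and returns the raw bits that were read.
size_t touch(T)(ref T reference)
{
    static assert(T.sizeof == size_t.sizeof, "expects a pointer-sized reference");
    return volatileLoad(cast(size_t*) &reference);
}

void main()
{
    auto c = new C;
    foreach (i; 0 .. 100)
        GC.collect();
    writeln("end of main");

    // The volatile read happens after the collections, so the variable `c`
    // has to exist in memory and hold the reference up to this point.
    cast(void) touch(c);
}
```

The idea is that a volatile load cannot be removed or moved ahead of the calls before it, so the compiler has to keep the reference in an addressable slot for that long. Taking the address of `c` also prevents it from living only in a register.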
On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:I feel like this might not necessarily be an issue, because technically, you aren't using `c` any more, so it can be deallocated immediately. But right in our documentation [here](https://dlang.org/spec/interfaceToC.html#storage_allocation) it lists ways to alleviate this: ``` If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by: * Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead. * Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack. * Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment. * Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls. ``` This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. This is similar to how Java behaves: https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope As per language spec, the D compilers are non-compliant on this point. 
So a decision is needed to either change the language spec, or to complain with the D compilers to fix it. -Johan
Sep 21 2021
On 9/21/21 6:58 AM, Johan wrote:On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:Yikes, that's quite aggressive. It says even a method can be in progress on the thing and it's collected early.I feel like this might not necessarily be an issue, because technically, you aren't using `c` any more, so it can be deallocated immediately. But right in our documentation [here](https://dlang.org/spec/interfaceToC.html#storage_allocation) it lists ways to alleviate this: ``` If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by: * Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead. * Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack. * Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment. * Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls. ``` This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. 
This is similar to how Java behaves: https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

As per language spec, the D compilers are non-compliant on this point. So a decision is needed to either change the language spec, or to complain with the D compilers to fix it.

I would say if you can somehow find a way to trigger the optimizer not to avoid that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compilers implementation to know whether this is a reasonable ask.

Regardless of whether it's spec or implementation, something needs to change. This is why I asked the question without any context first, to have everyone think about what they *expect* to happen before finding out what actually happens. I'm surprised so many expected the current behavior, I did not.

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

```d
struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}
...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends
```

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.

-Steve
Sep 21 2021
On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:On 9/21/21 6:58 AM, Johan wrote:What about a scope variable that holds an instance of a class? It could be made to mean the same thing as in (must not be collected until the end of the scope)On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:Yikes, that's quite aggressive. It says even a method can be in progress on the thing and it's collected early.I feel like this might not necessarily be an issue, because technically, you aren't using `c` any more, so it can be deallocated immediately. But right in our documentation [here](https://dlang.org/spec/interfaceToC.html#storage_allocation) it lists ways to alleviate this: ``` If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by: * Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead. * Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack. * Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment. * Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls. ``` This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. 
The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. This is similar to how Java behaves: https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

As per language spec, the D compilers are non-compliant on this point. So a decision is needed to either change the language spec, or to complain with the D compilers to fix it.

I would say if you can somehow find a way to trigger the optimizer not to avoid that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compilers implementation to know whether this is a reasonable ask.

Regardless of whether it's spec or implementation, something needs to change. This is why I asked the question without any context first, to have everyone think about what they *expect* to happen before finding out what actually happens. I'm surprised so many expected the current behavior, I did not.

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

```d
struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}
...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends
```

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.

-Steve
Sep 21 2021
On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:
> ...
> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
> ```d
> struct Pin(T)
> {
>     T t;
>     @nogc nothrow pure @safe ~this() {}
>     alias t this;
> }
> ...
> // usage
> auto c = Pin!C(new C); // now it needs to be held until the scope ends
> ```
> This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound? If this is a valid mechanism to ensure it's saved, maybe it can be added to Phobos and the spec updated to recommend that.
>
> -Steve

You just rediscovered Go's runtime.KeepAlive(), and C# GC.KeepAlive():

https://pkg.go.dev/runtime#KeepAlive

https://docs.microsoft.com/en-us/dotnet/api/system.gc.keepalive
Sep 21 2021
On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:On 9/21/21 6:58 AM, Johan wrote:I think this is not unreasonable to implement, it is similar to keeping track of what destructors to use: just doing a noop/keepalive on the variable at the end of scope. I can think of hypothetical cases where this would impact performance. For example, the function `void foo(S* s)` receives the pointer in a register, and would have to keep it alive in a register or push it to stack to preserve it for duration of the function; in a tight loop one may not expect that (and there would be no way to _not_ do that). We also don't want this for just any kind of parameter (e.g. not for an int), so would need some smartness on which types to apply this to. I think this would cover it: (pointers to, arrays of) struct, class, slice, AA. Test and see?On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:Yikes, that's quite aggressive. It says even a method can be in progress on the thing and it's collected early.I feel like this might not necessarily be an issue, because technically, you aren't using `c` any more, so it can be deallocated immediately. But right in our documentation [here](https://dlang.org/spec/interfaceToC.html#storage_allocation) it lists ways to alleviate this: ``` If pointers to D garbage collector allocated memory are passed to C functions, it's critical to ensure that the memory will not be collected by the garbage collector before the C function is done with it. This is accomplished by: * Making a copy of the data using core.stdc.stdlib.malloc() and passing the copy instead. * Leaving a pointer to it on the stack (as a parameter or automatic variable), as the garbage collector will scan the stack. * Leaving a pointer to it in the static data segment, as the garbage collector will scan the static data segment. * Registering the pointer with the garbage collector with the std.gc.addRoot() or std.gc.addRange() calls. 
```

This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.

First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. This is similar to how Java behaves: https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

As per language spec, the D compilers are non-compliant on this point. So a decision is needed to either change the language spec, or to complain with the D compilers to fix it.

I would say if you can somehow find a way to trigger the optimizer not to avoid that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compilers implementation to know whether this is a reasonable ask.

Regardless of whether it's spec or implementation, something needs to change. This is why I asked the question without any context first, to have everyone think about what they *expect* to happen before finding out what actually happens. I'm surprised so many expected the current behavior, I did not.

I just thought of a possible easy and effective way to ensure the thing isn't collected early:

```d
struct Pin(T)
{
    T t;
    @nogc nothrow pure @safe ~this() {}
    alias t this;
}
...
// usage
auto c = Pin!C(new C); // now it needs to be held until the scope ends
```

This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound?

I don't think it is, and I am surprised it works. You can trivially inline the destructor, see that it does nothing, and then the liveness of the variable is very short indeed...

-Johan
Sep 21 2021
On 9/21/21 12:19 PM, Johan wrote:On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:Probably you don't need to push to the stack unless the last usage is sending the variable to a function, but even that could be more expensive than just keeping in a register. The more I think about it (and finding out that other languages have a keepAlive feature), this really should just be changed to something that's opt-in. A way to signal to the compiler to ensure the thing gets onto the stack. And then we change the spec to say "use this feature to keep pointers alive during a scope". If that's not the thing I posted below, then maybe even a special symbol name can be used to signal to the compiler.I would say if you can somehow find a way to trigger the optimizer not to avoid that stack push, in all compilers, we should do that. IMO, the cost of a stack pointer is minimal compared to the surprising result that we currently see. But I don't know enough about compilers implementation to know whether this is a reasonable ask.I think this is not unreasonable to implement, it is similar to keeping track of what destructors to use: just doing a noop/keepalive on the variable at the end of scope. I can think of hypothetical cases where this would impact performance. For example, the function `void foo(S* s)` receives the pointer in a register, and would have to keep it alive in a register or push it to stack to preserve it for duration of the function; in a tight loop one may not expect that (and there would be no way to _not_ do that). We also don't want this for just any kind of parameter (e.g. not for an int), so would need some smartness on which types to apply this to. I think this would cover it: (pointers to, arrays of) struct, class, slice, AA. 
Test and see?

>> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
>> ```d
>> struct Pin(T)
>> {
>>     T t;
>>     @nogc nothrow pure @safe ~this() {}
>>     alias t this;
>> }
>> ...
>> // usage
>> auto c = Pin!C(new C); // now it needs to be held until the scope ends
>> ```
>> This seems to work on LDC with -O3 to prevent the early collection, so maybe it is sound?
>
> I don't think it is, and I am surprised it works. You can trivially inline the destructor, see that it does nothing, and then the liveness of the variable is very short indeed...

Yeah, if the inliner elides the entire function, it could potentially be collected, maybe there's something about the fact that the struct has a destructor that forces the compiler to store on the stack?

-Steve
Sep 21 2021
On 9/21/21 8:02 AM, Steven Schveighoffer wrote:
> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
> ```d
> struct Pin(T)
> {
>     T t;
>     @nogc nothrow pure @safe ~this() {}
>     alias t this;
> }
> ...
> // usage
> auto c = Pin!C(new C); // now it needs to be held until the scope ends
> ```

I made a package for something like this: https://code.dlang.org/packages/keepalive

Maybe it might find some use.

-Steve
Sep 22 2021
On Wednesday, 22 September 2021 at 21:06:11 UTC, Steven Schveighoffer wrote:
> On 9/21/21 8:02 AM, Steven Schveighoffer wrote:
>> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
>> ```d
>> struct Pin(T)
>> {
>>     T t;
>>     @nogc nothrow pure @safe ~this() {}
>>     alias t this;
>> }
>> ...
>> // usage
>> auto c = Pin!C(new C); // now it needs to be held until the scope ends
>> ```
>
> I made a package for something like this: https://code.dlang.org/packages/keepalive
>
> Maybe it might find some use.

For the simple Pin version above, LDC generates the same machine code with/without Pin (as expected): https://d.godbolt.org/z/MW7d9Mefe

-Johan
Sep 23 2021
On 9/23/21 6:54 AM, Johan wrote:
> For the simple Pin version above, LDC generates the same machine code with/without Pin (as expected): https://d.godbolt.org/z/MW7d9Mefe

So that's not showing the issue (without the -version=PIN).

I couldn't get it to show on certain platforms, including ldc 64-bit. On 32-bit it does fail, and the PIN fixes it.

HOWEVER, I have implemented a suggestion by Adam that perhaps the reason the first call to `GC.collect` doesn't cause a collection to occur is that the leftover stack might contain some reference, and it's not sufficiently clobbered. So adding:

```d
void foo()
{
    int[1000] x;
    writeln(x[1]);
}
```

and calling that function before calling `GC.collect` seems to do the trick for 64-bit.

A new wrinkle here is that now BOTH versions collect early.

https://d.godbolt.org/z/rrnYPeYa3

And even dmd -inline -O is smart enough to see through this. So I added the opaque function call, and even that wasn't enough (it didn't use any variables).

So I added passing the `t` through the opaque function, and this has fooled dmd, but not ldc (which I'm guessing is doing optimization on the mangled name, and so can see right through my opaque trick).

Possibly, we could create a truly opaque library function. I also tried just putting an empty `asm` block inside. But I can't be certain if it's working or not; the results inconsistently show both outputs.

I think we will need a real compiler intrinsic at this point.

-Steve
Sep 23 2021
On 9/23/21 8:47 AM, Steven Schveighoffer wrote:
> So that's not showing the issue (without the -version=PIN)

Nevermind, the important thing to read there is that it's not pushing it onto the stack. Whether the collection happens early or not is dependent on whether some other reference exists somewhere else (either on the stack or somewhere else). So the problem then is intermittent, depending on whether some other stack reference is seen. Frustrating...

-Steve
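The experiment described in the last few posts can be approximated with a small program. This is only a sketch under assumptions: the names `destroyed`, `clobberStack`, and `collectedEarly` are made up for illustration, and the stack-clobbering helper mirrors the `foo` function quoted above. Whether it reports an early collection depends on the compiler, the optimization flags, and whatever stale references happen to be lying around, so no expected output is given.

```d
import core.memory : GC;
import std.stdio : writeln;

// Set by the finalizer. __gshared so it is a plain global, not thread-local.
__gshared bool destroyed;

class C
{
    ~this() { destroyed = true; }
}

// Overwrites a chunk of stack so that stale copies of the reference
// are less likely to keep the object alive by accident.
void clobberStack()
{
    int[1000] x;
    writeln(x[1]);
}

// Returns true if the object was finalized before the scope ended.
bool collectedEarly()
{
    auto c = new C;   // `c` is never used again after this line
    clobberStack();
    GC.collect();
    return destroyed;
}

void main()
{
    writeln(collectedEarly() ? "collected early"
                             : "still alive at end of scope");
}
```

Trying it with and without optimization (for example `-O` on dmd or `-O3` on LDC) is one way to see how intermittent the result is.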
Sep 23 2021
On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:
> I think we will need a real compiler intrinsic at this point.

For LDC (and I expect GDC too), the asm trick works.

-Johan
Sep 23 2021
On 9/23/21 1:45 PM, Johan wrote:On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:oooh, really? That's cool. Maybe I'll update the library and re-register. Looking at the disassembly, I do see the difference. This is it pushing to the stack, right? ```asm mov qword ptr [rsp], rax ``` -SteveI think we will need a real compiler intrinsic at this point.For LDC (and I expect GDC too), the asm trick works.
Sep 23 2021
On Thursday, 23 September 2021 at 18:41:55 UTC, Steven Schveighoffer wrote:On 9/23/21 1:45 PM, Johan wrote:RSP is the stack pointer and [RSP] refers to its value, so yes. I will have a look at the LLVM GC intrinsics when I get round to it.On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:oooh, really? That's cool. Maybe I'll update the library and re-register. Looking at the disassembly, I do see the difference. This is it pushing to the stack, right? ```asm mov qword ptr [rsp], rax ``` -SteveI think we will need a real compiler intrinsic at this point.For LDC (and I expect GDC too), the asm trick works.
Sep 23 2021
On 9/23/21 2:41 PM, Steven Schveighoffer wrote:
> On 9/23/21 1:45 PM, Johan wrote:
>> On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:
>>> I think we will need a real compiler intrinsic at this point.
>>
>> For LDC (and I expect GDC too), the asm trick works.
>
> oooh, really? That's cool. Maybe I'll update the library and re-register.
>
> Looking at the disassembly, I do see the difference. This is it pushing to the stack, right?
> ```asm
> mov qword ptr [rsp], rax
> ```

Hm... it cancels all optimizations. No inlining either, or removal of the empty function. So the penalty is you are going to call the dtor (with an actual call instruction). I guess that's better than nothing.

-Steve
Sep 23 2021
On Thu, Sep 23, 2021 at 03:22:26PM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 9/23/21 2:41 PM, Steven Schveighoffer wrote:[...] All of this long discussion begs the question: why not just use GC.addRoot() and call it a day? T -- He who laughs last thinks slowest.On 9/23/21 1:45 PM, Johan wrote:Hm... it cancels all optimizations. No inlining either, or removal of the empty function. So the penalty is you are going to call the dtor (with an actual call instruction). I guess that's better than nothing.On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:oooh, really? That's cool. Maybe I'll update the library and re-register. Looking at the disassembly, I do see the difference. This is it pushing to the stack, right? ```asm mov qword ptr [rsp], rax ```I think we will need a real compiler intrinsic at this point.For LDC (and I expect GDC too), the asm trick works.
Sep 23 2021
On 9/23/21 3:40 PM, H. S. Teoh wrote:On Thu, Sep 23, 2021 at 03:22:26PM -0400, Steven Schveighoffer via Digitalmars-d wrote:You can. But wouldn't you prefer just pushing something on the stack? I don't know, it sort of bugs me and fascinates me that there isn't a way to do this easily. The stack is pretty much free to use, adding something to some allocated tree inside the GC (and then later removing it) isn't. The use cases are exceedingly small though... -SteveHm... it cancels all optimizations. No inlining either, or removal of the empty function. So the penalty is you are going to call the dtor (with an actual call instruction). I guess that's better than nothing.[...] All of this long discussion begs the question: why not just use GC.addRoot() and call it a day?
Sep 23 2021
On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:
> You can. But wouldn't you prefer just pushing something on the stack? I don't know, it sort of bugs me and fascinates me that there isn't a way to do this easily.
>
> The stack is pretty much free to use, adding something to some allocated tree inside the GC (and then later removing it) isn't. The use cases are exceedingly small though...
>
> -Steve

I would expect this to work on all platforms and compilers: `scope c = new C;`
Sep 23 2021
On 9/23/21 4:43 PM, Daniel N wrote:On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:This isn't quite the same. This puts c's guts on the stack, which is much less safe than just putting a reference on the stack. -SteveYou can. But wouldn't you prefer just pushing something on the stack? I don't know, it sort of bugs me and fascinates me that there isn't a way to do this easily. The stack is pretty much free to use, adding something to some allocated tree inside the GC (and then later removing it) isn't. The use cases are exceedingly small though...I would expect this to work on all platforms and compilers... scope c = new C;
Sep 23 2021
On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You can. But wouldn't you prefer just pushing something on the stack? I don't know, it sort of bugs me and fascinates me that there isn't a way to do this easily. The stack is pretty much free to use, adding something to some allocated tree inside the GC (and then later removing it) isn't. The use cases are exceedingly small though... -SteveIt doesn't matter where it is, stack or register. What is important is that the pointer value is retained somewhere. KeepAlive should trick the compiler to believe that KeepAlive itself is a user of the resource. How that is done in practice is another question and may vary depending on GC type.
Sep 23 2021
On 9/23/21 5:34 PM, IGotD- wrote:On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:Right, the registers are scanned too. In fact, I'm pretty sure when looking at the code LDC generates when using my latest keepalive lib, it can use a non-temporary register to store the pointer, and then when it comes time to call the destructor, it puts the pointer on the stack (because destructors require a pointer). And a register is even more performant than the stack! -SteveYou can. But wouldn't you prefer just pushing something on the stack? I don't know, it sort of bugs me and fascinates me that there isn't a way to do this easily. The stack is pretty much free to use, adding something to some allocated tree inside the GC (and then later removing it) isn't. The use cases are exceedingly small though...It doesn't matter where it is, stack or register. What is important is that the pointer value is retained somewhere. KeepAlive should trick the compiler to believe that KeepAlive itself is a user of the resource. How that is done in practice is another question and may vary depending on GC type.
Sep 24 2021
On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:
> You can. But wouldn't you prefer just pushing something on the stack?

Not really. If the optimizer can remove dead stack pushes, then the program will become 2x slower instantly, in addition to consuming more stack memory.
Sep 24 2021
On Friday, 24 September 2021 at 15:25:55 UTC, deadalnix wrote:
> On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:
>> You can. But wouldn't you prefer just pushing something on the stack?
>
> Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.

Doubt it would be 2x on a modern CPU. Point stands though.

LLVM and GCC both have intrinsics for GC roots, so perhaps they could be used here (calling into the GC is going to be very slow anyway, so keeping its return value on the stack wouldn't matter).
Sep 24 2021
On 9/24/21 11:25 AM, deadalnix wrote:On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You think pushing on the stack is going be 2x slower than calling `GC.addRoot`? -SteveYou can. But wouldn't you prefer just pushing something on the stack?Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.
Sep 24 2021
On Fri, Sep 24, 2021 at 11:31:46AM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 9/24/21 11:25 AM, deadalnix wrote:[...] I still prefer GC.addRoot. For one thing, that's the "official" way to inform the GC that a certain object is still needed and therefore should not be collected. Secondly, it self-documents the intent of the code, instead of some arcane workaround like struct Pin (that may or may not work in the future depending on how smart optimizers become). Third, if the overhead of calling GC.addRoot becomes an actual problem, it can always be turned into an intrinsic that the compiler can, based on certain conditions, replace it with an equivalent internal flag that ensures the value stays on the stack until the end of the scope. T -- Three out of two people have difficulties with fractions. -- Dirk EddelbuettelOn Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You think pushing on the stack is going be 2x slower than calling `GC.addRoot`?You can. But wouldn't you prefer just pushing something on the stack?Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.
Sep 24 2021
On 9/24/21 12:21 PM, H. S. Teoh wrote:I still prefer GC.addRoot. For one thing, that's the "official" way to inform the GC that a certain object is still needed and therefore should not be collected.The "official" docs [also say](https://dlang.org/spec/interfaceToC.html#storage_allocation), put a pointer on the stack if you want it to not be collected. Note that `GC.addRoot` performs a different function. It keeps the memory alive until you use `GC.removeRoot`. Putting a pointer on the stack keeps the thing alive until the end of the stack frame. They are not the same thing.Secondly, it self-documents the intent of the code, instead of some arcane workaround like struct Pin (that may or may not work in the future depending on how smart optimizers become).This is pretty self documenting: ```d obj.keepAlive; ``` Or perhaps: ```d GC.keepAlive(obj); ``` which means, keep this object alive until this line at least. The name `Pin` was something I just quickly thought of. But I think `keepAlive` is much more descriptive (and has precedence).Third, if the overhead of calling GC.addRoot becomes an actual problem, it can always be turned into an intrinsic that the compiler can, based on certain conditions, replace it with an equivalent internal flag that ensures the value stays on the stack until the end of the scope.1. `GC.addRoot` cannot mean "put on the stack". Because it has to be paired with a `GC.removeRoot` at a later point in the same frame to have the same effect. Sure, an intrinsic is possible for this situation, but it's way more complex, and I feel not as easily deciphered. 2. If `keepAlive` is poorly performing, it too can be replaced with an -Steve
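The difference Steven describes (a root persists until it is explicitly removed, while a stack reference only lasts for the frame) can be made concrete. The sketch below pairs `GC.addRoot` with `GC.removeRoot` via `scope (exit)` to get frame-bound lifetime out of the root API. The names `readFromC` and `callWithRoot` are hypothetical, and `readFromC` is merely a stand-in written in D for a C function that uses the object through a pointer the GC cannot see.

```d
import core.memory : GC;

class C
{
    int value = 42;
}

// Hypothetical stand-in for a C library call that reads the object
// through a raw pointer.
extern (C) int readFromC(void* p)
{
    return (cast(C) p).value;
}

int callWithRoot()
{
    auto p = cast(void*) new C;

    GC.addRoot(p);                 // object is now reachable via the root set
    scope (exit) GC.removeRoot(p); // must be paired, or the object never dies

    GC.collect();                  // the root keeps the object alive here
    return readFromC(p);
}

void main()
{
    assert(callWithRoot() == 42);
}
```

Forgetting the `removeRoot` half is the main hazard of this approach: the object then stays alive for the rest of the program.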
Sep 24 2021
On Friday, 24 September 2021 at 15:31:46 UTC, Steven Schveighoffer wrote:On 9/24/21 11:25 AM, deadalnix wrote:If the optimizer isn't free to optimize thing away from the stack, yes, it's pretty much guaranteed.On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You think pushing on the stack is going be 2x slower than calling `GC.addRoot`? -SteveYou can. But wouldn't you prefer just pushing something on the stack?Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.
Sep 26 2021
On Sunday, 26 September 2021 at 19:13:57 UTC, deadalnix wrote:On Friday, 24 September 2021 at 15:31:46 UTC, Steven Schveighoffer wrote:Zen3 can actually promote a spill onto the stack into its physical register file.On 9/24/21 11:25 AM, deadalnix wrote:If the optimizer isn't free to optimize thing away from the stack, yes, it's pretty much guaranteed.On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You think pushing on the stack is going be 2x slower than calling `GC.addRoot`? -SteveYou can. But wouldn't you prefer just pushing something on the stack?Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.
Sep 26 2021
On 9/26/21 3:13 PM, deadalnix wrote:On Friday, 24 September 2021 at 15:31:46 UTC, Steven Schveighoffer wrote:You realize what `GC.addRoot` does? It adds a pointer to a treap, through a virtual function call after taking a global lock. Pushing a stack item is going to be at least 10x faster, probably more compared to that. In the function call alone, there are a few stack pushes. -SteveOn 9/24/21 11:25 AM, deadalnix wrote:If the optimizer isn't free to optimize thing away from the stack, yes, it's pretty much guaranteed.On Thursday, 23 September 2021 at 19:54:56 UTC, Steven Schveighoffer wrote:You think pushing on the stack is going be 2x slower than calling `GC.addRoot`?You can. But wouldn't you prefer just pushing something on the stack?Not really. If the optimizer can remove dead stack pushes, then program will become 2x slower instantly in addition of consuming more stack memory.
Sep 26 2021
On Sunday, 26 September 2021 at 20:39:48 UTC, Steven Schveighoffer wrote:On 9/26/21 3:13 PM, deadalnix wrote:I think Amaury means that the code would be slower if the compiler is unable to register allocate at all vs. just for things returned by the allocator.On Friday, 24 September 2021 at 15:31:46 UTC, Steven Schveighoffer wrote:You realize what `GC.addRoot` does? It adds a pointer to a treap, through a virtual function call after taking a global lock. Pushing a stack item is going to be at least 10x faster, probably more compared to that. In the function call alone, there are a few stack pushes. -SteveOn 9/24/21 11:25 AM, deadalnix wrote:If the optimizer isn't free to optimize thing away from the stack, yes, it's pretty much guaranteed.[...]You think pushing on the stack is going be 2x slower than calling `GC.addRoot`?
Sep 26 2021
On Sunday, 26 September 2021 at 20:39:48 UTC, Steven Schveighoffer wrote:You realize what `GC.addRoot` does? It adds a pointer to a treap, through a virtual function call after taking a global lock. Pushing a stack item is going to be at least 10x faster, probably more compared to that. In the function call alone, there are a few stack pushes. -SteveSure, but it does so exclusively for one specific address. The optimizer would have to do so for every single thing that might be pointing to something that is owned by the GC. However, one thing that could be done is to have an intrinsic to do so, so the optimizer knows exactly which references not to mess with. LLVM already has intrinsics for this at the IR level.
Sep 27 2021
On 9/27/21 5:53 AM, deadalnix wrote:On Sunday, 26 September 2021 at 20:39:48 UTC, Steven Schveighoffer wrote:Yes, I'm not looking to have all dead stores kept, just ones that are important (as designated by the developer). What I'm saying is, *if* you need a specific address saved for later in a function, wouldn't you prefer to store that on the stack rather than store it in the GC's roots? This would absolutely be opt-in, not automatic, as I don't see how we can do it automatically. -SteveYou realize what `GC.addRoot` does? It adds a pointer to a treap, through a virtual function call after taking a global lock. Pushing a stack item is going to be at least 10x faster, probably more compared to that. In the function call alone, there are a few stack pushes.Sure, but it does so exclusively for one specific address. The optimizer would have to do so for every single thing that might be pointing to something that is owned by the GC. However, one thing that could be done is to have an intrinsic to do so, so the optimizer knows exactly which references not to mess with. LLVM already has intrinsics for this at the IR level.
Sep 27 2021
On Monday, 27 September 2021 at 13:59:57 UTC, Steven Schveighoffer wrote:What I'm saying is, *if* you need a specific address saved for later in a function, wouldn't you prefer to store that on the stack rather than store it in the GC's roots? This would absolutely be opt-in, not automatic, as I don't see how we can do it automatically.If performance was important, you would want to allocate the object itself on the stack? The cost of GC allocation outweighs that of GC.addRoot by a lot.
Sep 27 2021
On 9/27/21 10:39 AM, Max Samukha wrote:On Monday, 27 September 2021 at 13:59:57 UTC, Steven Schveighoffer wrote:That's not as safe, but it can be a viable option, as long as your object is constructed within your function and not in a factory function. -SteveWhat I'm saying is, *if* you need a specific address saved for later in a function, wouldn't you prefer to store that on the stack rather than store it in the GC's roots? This would absolutely be opt-in, not automatic, as I don't see how we can do it automatically.If performance was important, you would want to allocate the object itself on the stack? The cost of GC allocation outweighs that of GC.addRoot by a lot.
Sep 27 2021
On Monday, 27 September 2021 at 14:39:36 UTC, Max Samukha wrote:On Monday, 27 September 2021 at 13:59:57 UTC, Steven Schveighoffer wrote:`scope` makes sense for the examples given but "I want this object to live at least as long as this variable's scope" and "I want this object to die at this end of this scope" are different desires.What I'm saying is, *if* you need a specific address saved for later in a function, wouldn't you prefer to store that on the stack rather than store it in the GC's roots? This would absolutely be opt-in, not automatic, as I don't see how we can do it automatically.If performance was important, you would want to allocate the object itself on the stack? The cost of GC allocation outweighs that of GC.addRoot by a lot.
Sep 27 2021
On Monday, 27 September 2021 at 14:39:36 UTC, Max Samukha wrote:If performance was important, you would want to allocate the object itself on the stack? The cost of GC allocation outweighs that of GC.addRoot by a lot.The cost of GC _collection_ outweighs that of GC.addRoot by a lot. GC _allocation_ is pretty much free.
Sep 27 2021
On Monday, 27 September 2021 at 22:10:25 UTC, Elronnd wrote:On Monday, 27 September 2021 at 14:39:36 UTC, Max Samukha wrote:Yeah, "by a lot" is nonsense.If performance was important, you would want to allocate the object itself on the stack? The cost of GC allocation outweighs that of GC.addRoot by a lot.The cost of GC _collection_ outweighs that of GC.addRoot by a lot. GC _allocation_ is pretty much free.
Sep 28 2021
On Monday, 27 September 2021 at 13:59:57 UTC, Steven Schveighoffer wrote:What I'm saying is, *if* you need a specific address saved for later in a function, wouldn't you prefer to store that on the stack rather than store it in the GC's roots? This would absolutely be opt-in, not automatic, as I don't see how we can do it automatically. -SteveHonestly, I don't see why storing the GC root needs to be expensive. If it is that is a problem. You could literally compare and swap a counter to get a slot in the buffer and then store the pointer there, and granted there is no contention, it's effectively free (and if you have contention on addRoot, I can tell you what i'm talking about here is the least of your concerns).
Sep 27 2021
On Monday, 27 September 2021 at 22:15:26 UTC, deadalnix wrote:
> I don't see why storing the GC root needs to be expensive

Because you need to be able to remove the root later. So it essentially reduces to a hash table.

Wrt contention, a concurrent hash table is doable (https://github.com/boundary/high-scale-lib/blob/master/src/main/java/org/cliffc/high_scale_lib/NonBlockingHashMap.java) but would need to be implemented; however I agree with Steven that I don't understand why you would do this when you could just store the pointer on the stack or in a register.
Sep 27 2021
On Tuesday, 28 September 2021 at 06:38:20 UTC, Elronnd wrote:
> On Monday, 27 September 2021 at 22:15:26 UTC, deadalnix wrote:
>> I don't see why storing the GC root needs to be expensive
>
> Because you need to be able to remove the root later. So it essentially reduces to a hash table. Wrt contention, concurrent hash table is doable (https://github.com/boundary/high-scale-lib/blob/master/src/main/java/org/cliffc/high_scale_lib/NonBlockingHashMap.java) but would need to be implemented; however I agree with Steven that I don't understand why you would do this when you could just store the pointer on the stack or in a register.

Even then, unless you addRoot like mad, a linear search through the array is going to be super fast (in fact, LLVM DenseSet/DenseMap do exactly that until the element count gets too high).
Sep 28 2021
On Tuesday, 28 September 2021 at 16:22:40 UTC, deadalnix wrote:On Tuesday, 28 September 2021 at 06:38:20 UTC, Elronnd wrote:If you search from most recently added to least recent then it’ll be super cheap in most cases. Almost stack-like.On Monday, 27 September 2021 at 22:15:26 UTC, deadalnix wrote:Even then, unless you addRoot like mad, linear search through the array is goign to be super fast (in fact, LLVM DenseSet/DenseMap do exactly that untill the element count gets too high).[...]Because you need to be able to remove the root later. So it essentially reduces to a hash table. Wrt contention, concurrent hash table is doable (https://github.com/boundary/high-scale-lib/blob/master/src/main/java/org/cliffc/high_scale_lib/NonB ockingHashMap.java) but would need to be implemented; however I agree with Steven that I don't understand why you would do this when you could just store the pointer on the stack or in a register.
Sep 29 2021
On Wednesday, 29 September 2021 at 07:23:26 UTC, John Colvin wrote:If you search from most recently added to least recent then it’ll be super cheap in most cases. Almost stack-like.Yes, but pathological in the worst case. Kinda like a freelist without compaction, except your worst case isn't bounded. The generational GC hypothesis is relevant. It says that the worst case will come up, though it will be the exception rather than the rule; however, I don't know if the sorts of objects that need to be explicitly protected have usual lifetimes, so the situation might be even worse.
Sep 29 2021
On Wednesday, 29 September 2021 at 22:17:25 UTC, Elronnd wrote:On Wednesday, 29 September 2021 at 07:23:26 UTC, John Colvin wrote:You can always fall back to a set when things get large.If you search from most recently added to least recent then it’ll be super cheap in most cases. Almost stack-like.Yes, but pathological in the worst case. Kinda like a freelist without compaction, except your worst case isn't bounded. The generational GC hypothesis is relevant. It says that the worst case will come up, though it will be the exception rather than the rule; however, I don't know if the sorts of objects that need to be explicitly protected have usual lifetimes, so the situation might be even worse.
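The "search from most recent to least recent" idea from the last few posts can be sketched as a small data structure. This is an illustration only, not how druntime stores roots: `RootBuffer` and its methods are made-up names, it uses a GC-allocated array purely for brevity (a real root set would need non-GC storage and locking), and the fallback to a set for large counts is left out.

```d
// Illustrative root buffer: append on add, search newest-first on remove.
struct RootBuffer
{
    private const(void)*[] roots;

    void add(const(void)* p)
    {
        roots ~= p;
    }

    // Returns true if `p` was found and removed.
    bool remove(const(void)* p)
    {
        // Newest entries sit at the end, so stack-like usage
        // (remove what was added last) hits on the first comparison.
        foreach_reverse (i, r; roots)
        {
            if (r is p)
            {
                // Shift newer entries down to preserve insertion order.
                foreach (j; i .. roots.length - 1)
                    roots[j] = roots[j + 1];
                roots = roots[0 .. $ - 1];
                return true;
            }
        }
        return false;
    }

    size_t length() const
    {
        return roots.length;
    }
}

void main()
{
    int a, b, c;
    RootBuffer buf;
    buf.add(&a);
    buf.add(&b);
    buf.add(&c);

    assert(buf.remove(&c));   // most recent: found immediately
    assert(!buf.remove(&c));  // already gone
    assert(buf.remove(&a));   // oldest: worst case, full scan
    assert(buf.length == 1);
}
```

The worst case Elronnd points out is visible in `remove(&a)`: an old root costs a scan over everything added after it, which is where a fallback to a set would come in.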
Sep 30 2021
On Thursday, 23 September 2021 at 18:41:55 UTC, Steven Schveighoffer wrote:
> On 9/23/21 1:45 PM, Johan wrote:
>> On Thursday, 23 September 2021 at 12:47:25 UTC, Steven Schveighoffer wrote:
>>> I think we will need a real compiler intrinsic at this point.
>>
>> For LDC (and I expect GDC too), the asm trick works.
>
> oooh, really? That's cool. Maybe I'll update the library and re-register.
>
> Looking at the disassembly, I do see the difference. This is it pushing to the stack, right?
> ```asm
> mov qword ptr [rsp], rax
> ```
>
> -Steve

post earlier do.

You are supposed to use keepalive this way:

~~~D
auto c = new C;
// ... lots of ongoing activities ...
keepAlive(c);
// c will survive until keepAlive returns.
~~~

If you look into their source code, they then use a trick to convince the optimizer that keepAlive is relevant and shouldn't be taken away, thus forcing the GC to ensure that the reference isn't collected until the function returns.

https://cs.opensource.google/go/go/+/refs/tags/go1.17.1:src/runtime/mfinal.go;l=473

I think the mistake was trying to force c to exist on the stack, but without being used.

-- Paulo
Sep 23 2021
On 9/23/21 3:29 PM, Paulo Pinto wrote:
> earlier do.

Yes, I took a lot of inspiration from that.

> You are supposed to use keepalive this way:
> ~~~D
> auto c = new C;
> // ... lots of ongoing activities ...
> keepAlive(c);
> // c will survive until keepAlive returns.
> ~~~
> If you look into their source code, they then trick to trick the optimizer that keepAlive is relevant and shouldn't be taken away, thus forcing the GC to ensure that the reference isn't collected until function returns.
>
> https://cs.opensource.google/go/go/+/refs/tags/go1.17.1:src/runtime/mfinal.go;l=473

Right, what I'd prefer is something that doesn't actually require a call or code of any kind, but the empty asm seems to require the minimum.

Note that using my keepalive library that way (just calling c.keepAlive at the end of the function) will also work.

-Steve
Sep 24 2021
On 9/22/21 5:06 PM, Steven Schveighoffer wrote:
> I made a package for something like this: https://code.dlang.org/packages/keepalive
>
> Maybe it might find some use.

I deregistered it as it doesn't work to solve the problem (it might superficially solve it, but not definitively).

-Steve
Sep 23 2021
On Tuesday, 21 September 2021 at 12:02:05 UTC, Steven Schveighoffer wrote:
> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
> ```d
> struct Pin(T)
> {
>     T t;
>     @nogc nothrow pure @safe ~this() {}
>     alias t this;
> }
> ...
> // usage
> auto c = Pin!C(new C); // now it needs to be held until the scope ends
> ```

Another way:

```d
struct Pin(T)
{
    T t;
    ~this()
    {
        import core.volatile;
        volatileLoad(cast(uint*)&t);
    }
}

// usage
auto c = Pin!C(new C);
```
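The volatile-load idea can also be combined with the call-at-last-use style discussed elsewhere in this thread. The following is a sketch, not the API of the keepalive package: the `keepAlive` function here is hypothetical, and it assumes that a volatile read of the variable's storage is enough to make the compiler treat the reference as live up to that point. It has not been checked against every compiler and optimization level.

```d
import core.memory : GC;
import core.volatile : volatileLoad;

__gshared bool destroyed; // set by the finalizer

class C
{
    ~this() { destroyed = true; }
}

// Hypothetical helper: a volatile read of the reference's own storage.
// The optimizer may not drop a volatile load, so the reference has to
// exist at least until this call.
void keepAlive(T)(ref T t)
if (is(T == class) || is(T : const(void)*))
{
    cast(void) volatileLoad(cast(size_t*) &t);
}

bool aliveAtLastUse()
{
    auto c = new C;
    GC.collect();      // `c` is still needed below, so it must survive this
    keepAlive(c);      // last use of `c`
    return !destroyed;
}

void main()
{
    assert(aliveAtLastUse());
}
```

Unlike the `Pin` wrapper, nothing here relies on a destructor surviving inlining; the cost is one load at the point of the call.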
Sep 24 2021
On 9/21/2021 5:02 AM, Steven Schveighoffer wrote:
> I just thought of a possible easy and effective way to ensure the thing isn't collected early:
> ```d
> struct Pin(T)
> {
>     T t;
>     @nogc nothrow pure @safe ~this() {}
>     alias t this;
> }
> ...
> // usage
> auto c = Pin!C(new C); // now it needs to be held until the scope ends
> ```

Use addRoot()/removeRoot() to do this in a documented and supported fashion.

https://dlang.org/phobos/core_memory.html#addRoot
Sep 24 2021
On 9/21/2021 3:58 AM, Johan wrote:
> First: the use of "stack" here is wrong and confusing. It should be "local storage" (notorious error throughout the spec). Indeed, what is done in your example is putting the pointer in local storage. The scope of that local storage is until the end of function scope (in your example). I don't think (in LDC) that we track the lifetime of variables in that way, so what is done is that the optimizer just looks at last point of use. This is similar to how Java behaves:
> https://stackoverflow.com/questions/39285108/can-java-garbage-collect-variables-before-end-of-scope

Data flow analyzers are not scope-based, they are based on first use and last use. The live pointer tracking does this, too.

Destructors happen on going out of scope, but the compiler can move them as long as the code behaves "as if" the destructor happened at the end of scope. But with garbage collection, the class destructors are run at some arbitrary time after last use, not after going out of scope.
Sep 24 2021
On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
> This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.

If you use the pointer after the call, that's an easy way to ensure that the pointer is kept around long enough.
Sep 22 2021
On 9/22/21 8:12 AM, Kagamin wrote:
> On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
>> This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.
>
> If you use the pointer after the call, that's an easy way to ensure that the pointer is kept around long enough.

The point is, you may only use it via functions/mechanisms that are not visible to the GC (like C libraries). In which case you have to pin it somehow.

Having to "fake" usage is annoying, but I think we should at least provide foolproof guidance on how to do this.

-Steve
Sep 22 2021
On 9/22/21 8:12 AM, Kagamin wrote:
> On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
>> This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.
>
> If you use the pointer after the call, that's an easy way to ensure that the pointer is kept around long enough.

And by the way I tried naive usage, and the compiler saw right through that:

```d
auto c = new C;
scope(exit) auto fake = c; // still collected early
```

-Steve
Sep 22 2021
On Wednesday, 22 September 2021 at 12:31:48 UTC, Steven Schveighoffer wrote:
> On 9/22/21 8:12 AM, Kagamin wrote:
>> On Monday, 20 September 2021 at 18:49:12 UTC, Steven Schveighoffer wrote:
>>> This to me seems like "leaving a pointer to it on the stack". I'm not sure how else I would do that specifically? Plus, this option is the only "free" one -- the others all require much more complication. Adding a pointer to the stack is free. It's just, I don't know how to tell the compiler to do that besides declaring it.
>>
>> If you use the pointer after the call, that's an easy way to ensure that the pointer is kept around long enough.
>
> And by the way I tried naive usage, and the compiler saw right through that:
>
> ```d
> auto c = new C;
> scope(exit) auto fake = c; // still collected early
> ```

I think anything that is (close to) zero-overhead is something the optimizer understands, and is therefore not going to give you the behavior that you want, besides an intrinsic to tell the optimizer to keep that pointer value alive in some storage that is scanned by GC (reachable memory or registers).

In the absence of such an intrinsic [*], what you can do is pass the value to something about which we explicitly tell the optimizer that it does not understand it. Cryptic? ;)

https://d.godbolt.org/z/M3zbzK4sq

```
import ldc.llvmasm;
__asm("", "r", c);
```

Probably this is also expressible in the new inline assembly that both GDC and LDC support.

-Johan

[*] https://lists.llvm.org/pipermail/llvm-dev/2016-July/102322.html
Where it popped up in debuggability considerations.
Sep 22 2021
On 9/22/21 11:16 AM, Johan wrote:
> besides an intrinsic to tell the optimizer to keep that pointer value alive in some storage that is scanned by GC (reachable memory or registers).

Nobody seems to see the violation of "lifetime being the length of the scope" being a problem here. (You mentioned it elsewhere in this thread.) I think this is because the same issue is present in other languages.

Shouldn't preserving the contents of registers be the responsibility of the compiler or the GC? And perhaps the compiler should skip that optimization if the contents cannot be preserved?

Ali
Sep 22 2021
On Wednesday, 22 September 2021 at 18:30:41 UTC, Ali Çehreli wrote:
> Shouldn't preserving the contents of registers be the responsibility of the compiler or the GC? And perhaps the compiler should skip that optimization if the contents cannot be preserved?
>
> Ali

Not according to most ABIs. Parameters are usually not preserved on the stack. Should they be forced onto the stack because of FFI? I personally don't think so.

I think you are attacking this problem from the wrong side. It is not a problem of the optimizer or code generation. It's not a problem of the ABI. It's a lifetime issue regardless of the type of GC you use. If a resource is moved (either temporarily or permanently) to any foreign system, you must have a way of describing that.
Sep 22 2021
On Wednesday, 22 September 2021 at 18:30:41 UTC, Ali Çehreli wrote:
> Shouldn't preserving the contents of registers be the responsibility of the compiler or the GC?

The compiler thinks the contents don't need to be preserved at all.

One thing we might do is if passing a pointer to an extern(C) function, the compiler assumes it shouldn't stomp the memory until end of scope.
Sep 22 2021
On 9/22/21 3:10 PM, Adam D Ruppe wrote:
> On Wednesday, 22 September 2021 at 18:30:41 UTC, Ali Çehreli wrote:
>> Shouldn't preserving the contents of registers be the responsibility of the compiler or the GC?
>
> The compiler thinks the contents don't need to be preserved at all.
>
> One thing we might do is if passing a pointer to an extern(C) function, the compiler assumes it shouldn't stomp the memory until end of scope.

What if you are calling D functions that call extern(C) functions? I don't think this is the answer either.

-Steve
Sep 22 2021
On Wednesday, 22 September 2021 at 19:14:43 UTC, Steven Schveighoffer wrote:
> What if you are calling D functions that call extern(C) functions?

Then it will be an argument to that other D function which makes it a local variable there and the same rule can apply.

If the C function stores something beyond the immediate function, you are already supposed to malloc it or addRoot etc, so I don't think the depth of the call stack really makes a difference.
Sep 22 2021
On 9/22/21 3:24 PM, Adam D Ruppe wrote:
> On Wednesday, 22 September 2021 at 19:14:43 UTC, Steven Schveighoffer wrote:
>> What if you are calling D functions that call extern(C) functions?
>
> Then it will be an argument to that other D function which makes it a local variable there and the same rule can apply.
>
> If the C function stores something beyond the immediate function, you are already supposed to malloc it or addRoot etc, so I don't think the depth of the call stack really makes a difference.

While I don't necessarily disagree with you, this removes all possibility of abstraction. Even a local function cannot be used to factor out initialization of a C resource.

I think code which does not properly do cleanup of C resources it uses, or does so with the expectation that the cleanup must be running after `main` exits is very suspect.

But the fact that you can't rely on the recommended remedy in the spec needs fixing, and I don't think this fixes it. This also introduces unnecessary storage of pointers when not necessary (probably the vast majority of extern(C) calls).

-Steve
Sep 22 2021
On Wednesday, 22 September 2021 at 19:14:43 UTC, Steven Schveighoffer wrote:
> On 9/22/21 3:10 PM, Adam D Ruppe wrote:
>> One thing we might do is if passing a pointer to an extern(C) function, the compiler assumes it shouldn't stomp the memory until end of scope.
>
> What if you are calling D functions that call extern(C) functions? I don't think this is the answer either.

Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.

Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to `null` when they're done with it.
Sep 22 2021
On Wednesday, 22 September 2021 at 19:30:37 UTC, Paul Backus wrote:
> Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.
>
> Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to `null` when they're done with it.

The problem with a is that it eats up resources such as registers and stack space, preventing better code generation. The compiler might choose to keep the pointer in a register and that way block a register that could be used for better optimization. Register loads and stores to the stack will also increase. Freeing after the last use is a perfectly sane assumption. Also, if you are using RC, the compiler should decrease the reference count after last use in order to free up resources like registers.

Therefore I think that defining that the resource must be kept alive for the entire scope hurts code generation while the benefit is low.

Also option a, is in 99% of cases good because as long as the resource is being used it must be *somewhere* (assuming you are using D all the way), which means the D GC will find it.

The problems described in this thread are really fringe problems and I think they could be resolved by other means. KeepAlive is one of them, and it is also GC agnostic: it works on any GC.

I would rather say b is the better alternative, covering those very special cases.
Sep 22 2021
On Wednesday, 22 September 2021 at 19:50:37 UTC, IGotD- wrote:
> The problems described in this thread are really fringe problems and I think they could be resolved by other means. KeepAlive is one of them, and it is also GC agnostic: it works on any GC.
>
> I would rather say b is the better alternative, covering those very special cases.

If GC.keepAlive is added, how do you know when to use it without first running into this fringe problem in a specific part of your code?

In the code that prompted this discussion, the object lifetimes immediately around the C API calls were actually fine. GC.keepAlive in the functions with those API calls, even if applied to the exact object whose lifetime was too short, would not have saved the object.
Sep 22 2021
On Wednesday, 22 September 2021 at 19:30:37 UTC, Paul Backus wrote:
> Either (a) the compiler must assume, pessimistically, that a pointer passed to a function may be stored somewhere the GC can't see it, and therefore must be preserved in local storage until the end of its lexical lifetime, or (b) the user must manually signal to the compiler or the GC that the pointer should be kept alive before passing it to the function.
>
> Given how error-prone option (b) is, I think (a) is the more sensible choice. Users who don't want the pointer kept alive can always opt in to that behavior by enclosing it in a block scope to limit its lifetime, or setting it to `null` when they're done with it.

Neither of these solutions would've helped with the original code that used the epoll API. The pointer passed to C wasn't important, and it would've been fine for its lifetime to end at the end of the function (it was even a pointer to a stack-allocated struct in the function: epoll copies the struct out of the pointer given to it).

The pointer that mattered was a misaligned class reference in the structure passed to C. This pointer was to an object that will eventually get destructed before the end of its function scope. *That* object wasn't passed anywhere in its scope, it was initialized and then had a method called on it.

At the site of the short-lived object, there's nothing apparently wrong. At the site of the C API call, what can you do? If you tell the GC that the short-lived object's reference is important, the GC still can't see any references to it afterward, and the GC isn't responsible for the object's short lifetime. If you tell the compiler that the short-lived object's reference is important, this could be a completely separate compilation and the decision to expire it early might've already been made.

I think the intuition that's violated here is "the GC is nondeterministic, but the stack is deterministic." And the correct intuition is "class destruction is nondeterministic, but struct destruction is (usually) deterministic."

This also can happen without calling out to C:

```d
import std.stdio : writeln;
import core.memory : GC;

struct Hidden {
    align(1):
    int spacer;
    C obj;
}

Hidden hidden;

class C {
    int id;
    this(int id) {
        this.id = id;
        hidden.obj = this;
    }
    ~this() {
        id = -id;
        writeln("dtor");
    }
}

void check() {
    writeln(hidden.obj.id);
}

void main() {
    auto c = new C(17);
    check; // 17, c is alive
    GC.collect;
    GC.collect;
    check; // -17, c was destroyed
    writeln("here");
}
```
Sep 22 2021
On Wednesday, 22 September 2021 at 20:17:48 UTC, jfondren wrote:
> Neither of these solutions would've helped with the original code that used the epoll API. The pointer passed to C wasn't important, and it would've been fine for its lifetime to end at the end of the function (it was even a pointer to a stack-allocated struct in the function: epoll copies the struct out of the pointer given to it).
>
> The pointer that mattered was a misaligned class reference in the structure passed to C. This pointer was to an object that will eventually get destructed before the end of its function scope. *That* object wasn't passed anywhere in its scope, it was initialized and then had a method called on it.

Finally someone tries to solve the real problem. It's beyond me why people try to count angels dancing on the tip of a needle.
Sep 22 2021
On Wednesday, 22 September 2021 at 12:31:48 UTC, Steven Schveighoffer wrote:
> And by the way I tried naive usage, and the compiler saw right through that:
>
> ```d
> auto c = new C;
> scope(exit) auto fake = c; // still collected early
> ```

That just means you don't properly use it. It's so in this case, but it isn't always the case. For an example of how D interoperates with the C heap, see the Array container: https://github.com/dlang/phobos/blob/master/std/container/array.d#L604

Another option is to allocate objects in the C heap; then they won't have this problem.
Sep 22 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Without any context, what do you think should happen here?
>
> ```d
> import std.stdio;
> import core.memory;
>
> class C
> {
>     ~this() { writeln("dtor"); }
> }
>
> void main()
> {
>     auto c = new C;
>     foreach(i; 0 .. 10000) GC.collect;
>     writeln("end of main");
> }
> ```
>
> Option 1:
> ```
> end of main
> dtor
> ```
>
> Option 2:
> ```
> dtor
> end of main
> ```
>
> Option 3:
> ```
> end of main
> ```
>
> Option 4: Option 1 or 2, depending on entropy.

IMO all of the options are valid.

The GC gives no guarantees about when or if it will finalize GC-allocated objects, so both option 1 and 3 are valid.

Option 2 at first seems like it should be invalid, but since `c` is never accessed after initialization, the compiler is free to remove the initialization as a dead store, which would allow the GC to collect the `new C` object prior to the end of `c`'s lifetime.

I would expect to see either 1 or 3 with optimizations disabled, and possibly 2 with optimizations enabled. I would be surprised to see 4, since dead-store elimination shouldn't depend on entropy at runtime.
Sep 20 2021
On 9/20/21 11:55 AM, Paul Backus wrote:
>> auto c = new C;
>> foreach(i; 0 .. 10000) GC.collect;
>> writeln("end of main");
>
>> Option 2:
>> ```
>> dtor
>> end of main
>> ```
>
> Option 2 at first seems like it should be invalid, but since `c` is never accessed after initialization, the compiler is free to remove the initialization as a dead store,

Your explanation at first :) seems invalid because it seems to disregard side-effects in the constructor and the destructor. However, because destructors for GC objects are not guaranteed to be executed anyway, only the constructor should be considered here.

I love the overwritten register story but I think this is a bug because "local storage" should be sufficient to keep the object alive.

However, given the presence of KeepAlive from other languages, perhaps this is a concept that needs to be communicated better.

Ali
Sep 21 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Without any context, what do you think should happen here?
>
> ```d
> import std.stdio;
> import core.memory;
>
> class C
> {
>     ~this() { writeln("dtor"); }
> }
>
> void main()
> {
>     auto c = new C;
>     foreach(i; 0 .. 10000) GC.collect;
>     writeln("end of main");
> }
> ```
>
> Option 1:
> ```
> end of main
> dtor
> ```
>
> Option 2:
> ```
> dtor
> end of main
> ```
>
> Option 3:
> ```
> end of main
> ```
>
> Option 4: Option 1 or 2, depending on entropy.

Most likely option 1. May also be option 3 though, as I remember some warnings about trusting the GC destructor to run at all.
Sep 21 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Without any context

Some context in my blog now: http://dpldocs.info/this-week-in-d/Blog.Posted_2021_09_20.html#my-investigation
Sep 21 2021
FYI, I filed an issue so it's not forgotten. https://issues.dlang.org/show_bug.cgi?id=22331 -Steve
Sep 22 2021
On Monday, 20 September 2021 at 18:26:59 UTC, Steven Schveighoffer wrote:
> Option 4: Option 1 or 2, depending on entropy.

This ^
Sep 23 2021
Off topic, I heard that some people missed this very important thread because they didn't guess from the subject line that the content was important. Ali
Sep 24 2021
On 9/20/2021 11:26 AM, Steven Schveighoffer wrote:
> [...]

1, 2, or 3 are all valid outcomes.

The lifetime of a GC allocated object ends when there are no longer live references to it. Live references can end before the scope ends, this is called "non-lexical scoping".

That does not imply that that is when the class destructor is run. The class destructor runs at some arbitrary time *after* there are no longer references to it.

The GC is not obligated to run a collection cycle upon program termination (a laxity intended to improve shutdown performance), and hence it is not obliged to run the class destructors.

The inevitable consequence of this is: Do *not* use the GC to manage non-memory resources.

But if you must do it anyway, use the "destroy" and "free" GC special functions. Of course, if you decide to use these functions, it's up to you to ensure resources are freed exactly once.

https://dlang.org/phobos/object.html#.destroy
https://dlang.org/phobos/core_memory.html#.GC.free

P.S. A live reference is one where there is a future use of it. A dead reference is one where there isn't a future use. The `c` variable is a dead reference immediately after it is initialized, which is why optimizers delete the assignment. The `new C` return value is dead as soon as it is created.

P.P.S. Attempting to deduce the GC's rules from observing its behavior is very likely a path to frustration and errors, because its observed behavior will not make sense (and will appear random) unless one understands the above explanation.

P.P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.
Sep 24 2021
On 9/24/21 5:12 PM, Walter Bright wrote:
> "non-lexical scoping"

> live reference

Those are new concepts to me. :(

> dead reference immediately after it is initialized

I am sure I have dead references in my D library that exposes a C API. If it works, it must be because I don't hit a GC cycle in my thin extern(C) function. (Single-threaded too.)

> Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.

Some people stress that fact by using the term "finalizer" for classes.

Ali
Sep 24 2021
On 9/24/21 8:12 PM, Walter Bright wrote:
> On 9/20/2021 11:26 AM, Steven Schveighoffer wrote:
>> [...]
>
> 1, 2, or 3 are all valid outcomes.
>
> The lifetime of a GC allocated object ends when there are no longer live references to it. Live references can end before the scope ends, this is called "non-lexical scoping".
>
> That does not imply that that is when the class destructor is run. The class destructor runs at some arbitrary time *after* there are no longer references to it.

Right, but the optimizer is working against that.

For example:

```d
auto c = new C;
.... // a whole bunch of other code

c.method; // not necessarily still live here.
```

Why would it not be live there? Because the method might be inlined, and the compiler might determine at that point that none of the data inside the method is needed, and therefore the object is no longer needed.

So technically, it's not live after the original allocation. But this is really hard for a person to determine. Imagine having the GC collect your object when the object is clearly "used" later?

This is why I made this thread. It's easy to explain why it's happening, but it's really hard to follow the instructions "leave a pointer on the stack" as noted in the spec.

> The GC is not obligated to run a collection cycle upon program termination (a laxity intended to improve shutdown performance), and hence it is not obliged to run the class destructors.

Yeah, that's not really the focus here, but it's a good point to make.

> The inevitable consequence of this is: Do *not* use the GC to manage non-memory resources.

When D uses the GC for delegates, classes, etc, and you want to hook to those things via C callbacks, this advice falls flat.

Basically, you are saying, when using your OS primitives, don't use D.

> But if you must do it anyway, use the "destroy" and "free" GC special functions. Of course, if you decide to use these functions, it's up to you to ensure resources are freed exactly once.
>
> https://dlang.org/phobos/object.html#.destroy
> https://dlang.org/phobos/core_memory.html#.GC.free

Don't recommend free. Just use destroy.

> P.S. A live reference is one where there is a future use of it. A dead reference is one where there isn't a future use. The `c` variable is a dead reference immediately after it is initialized, which is why optimizers delete the assignment. The `new C` return value is dead as soon as it is created.

It would be good for the spec to define what a "live" reference is. Currently, the GC talks about stacks, heap and registers. And I think it should be clear about whether using a reference later should be considered making it "live", as it is not currently when optimized.

> P.P.S. Attempting to deduce the GC's rules from observing its behavior is very likely a path to frustration and errors, because its observed behavior will not make sense (and will appear random) unless one understands the above explanation.

Yes, it is difficult to ensure a collection doesn't occur or does occur. But clearly if it does occur and you don't expect it to, it's not doing what you thought it was. If it's not doing what you thought it was (e.g. keeping the object live), but it doesn't get collected due to some other reason, then it's hard to prove things conclusively.

I've resorted to reading the assembly instead of using the measured collection, at least to see if the compiler works around efforts to keep things live, as that is more reliable. However, it's harder to figure out.

> P.P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.

Actually, this is not strictly true. Structs allocated on the heap do not follow RAII rules, and stack allocations of structs are a completely different issue.

-Steve
Sep 25 2021
On Saturday, 25 September 2021 at 17:46:01 UTC, Steven Schveighoffer wrote:
>> P.P.S. Do not conflate class destructors with struct destructors. The latter follow RAII rules, the former does not.
>
> Actually, this is not strictly true. Structs allocated on the heap do not follow RAII rules, and stack allocations of structs are a completely different issue.

Indeed; additionally, 'scope c = new Class()' _does_ follow RAII.
Sep 25 2021
On 9/25/2021 12:28 PM, Elronnd wrote:
> Indeed; additionally, 'scope c = new Class()' _does_ follow RAII.

That's more of an optimization than a semantic shift.
Sep 25 2021
On Sunday, 26 September 2021 at 02:15:53 UTC, Walter Bright wrote:
> On 9/25/2021 12:28 PM, Elronnd wrote:
>> Indeed; additionally, 'scope c = new Class()' _does_ follow RAII.
>
> That's more of an optimization than a semantic shift.

// This works fine...
scope c = new Class();

// I hope this works, otherwise it violates the 'law of least surprise'
auto c = new Class();
scope(exit) destroy(c);

If scope(exit) also works, then I don't think there is any problem...
Sep 26 2021
On Sunday, 26 September 2021 at 07:49:46 UTC, Daniel N wrote:
> // I hope this works, otherwise it violates the 'law of least surprise'
> auto c = new Class();
> scope(exit) destroy(c);
>
> If scope(exit) also works, then I don't think there is any problem...

Destroy on a class reference doesn't call the destructor; it just sets the reference to null. If the optimizer sees that the reference isn't used again after that, it can remove the assignment to null as a dead store. And *then* if it sees that the reference isn't used after its initialization, it can remove the initialization as a dead store.

Basically, any clever shortcut you can come up with that "uses" the reference but doesn't actually do anything will fall apart as soon as the optimizer figures out that it doesn't actually do anything. Better to use addRoot/removeRoot, which have well-defined semantics and will never be optimized away.
Sep 26 2021
On Sunday, 26 September 2021 at 13:29:26 UTC, Paul Backus wrote:
> On Sunday, 26 September 2021 at 07:49:46 UTC, Daniel N wrote:
>> // I hope this works, otherwise it violates the 'law of least surprise'
>> auto c = new Class();
>> scope(exit) destroy(c);
>>
>> If scope(exit) also works, then I don't think there is any problem...
>
> Destroy on a class reference doesn't call the destructor; it just sets the reference to null.

Correction: I remembered wrong; it actually does call the destructor. So this will work for classes with non-trivial destructors.
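The two deterministic patterns discussed in this sub-thread (`scope c = new Class()` and `scope(exit) destroy(c)`) can be checked with a small sketch. `Tracked`, `withScopeClass`, and `withScopeExit` are hypothetical names invented for illustration; the static counter exists only to observe when the destructor runs.

```d
class Tracked
{
    static int finalized; // counts destructor runs (thread-local static)
    ~this() { ++finalized; }
}

void withScopeClass()
{
    // A scope class instance is destructed when it goes out of scope.
    scope t = new Tracked;
}

void withScopeExit()
{
    auto t = new Tracked;
    // destroy() runs the destructor at scope exit; the GC will not
    // finalize the object a second time afterwards.
    scope (exit) destroy(t);
}

void main()
{
    withScopeClass();
    assert(Tracked.finalized == 1);

    withScopeExit();
    assert(Tracked.finalized == 2);
}
```

Both patterns fix the time at which the destructor runs, which is a separate matter from pinning an object that is only referenced from memory the GC cannot scan.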
Sep 26 2021
On 9/25/2021 10:46 AM, Steven Schveighoffer wrote:
> Right, but the optimizer is working against that.
>
> For example:
>
> ```d
> auto c = new C;
> .... // a whole bunch of other code
>
> c.method; // not necessarily still live here.
> ```
>
> Why would it not be live there? Because the method might be inlined, and the compiler might determine at that point that none of the data inside the method is needed, and therefore the object is no longer needed.
>
> So technically, it's not live after the original allocation. But this is really hard for a person to determine. Imagine having the GC collect your object when the object is clearly "used" later?

I understand your point. Its value is never used again, so there is no reason for the GC to hold on to it.

After the point when the value is never used again, when the destructor is run is indeterminate.

Maybe the real problem is the user is expecting the destructor to run at a specific point in the execution.

The point of putting the variable on the stack (or in a register, it works the same) is so the GC can find it. If D code is being called, D does not allow the hiding of pointers. But C does allow this, such as when doing the singly linked list XOR trick. The GC won't find those pointers, and will collect them. But if the pointer is still on the stack, the GC will find them.

If the function is inlined, it is not C code, so hiding the pointer won't be allowed, and it's not a problem.

>> Do *not* use the GC to manage non-memory resources.
>
> When D uses the GC for delegates, classes, etc, and you want to hook to those things via C callbacks, this advice falls flat.

If the closure for a delegate is located on the stack, then it will be found by the GC and works fine. If the closure is located on the GC heap, and the OS will keep it around past the return, then you'll need to use addRoot.

> Basically, you are saying, when using your OS primitives, don't use D.

addRoot()
Sep 25 2021
On Saturday, 25 September 2021 at 22:19:30 UTC, Walter Bright wrote:
> Maybe the real problem is the user is expecting the destructor to run at a specific point in the execution.

In the case that spawned this thread, the problem was a segfault after a hidden reference was restored to the collected object, and the coder was certain the object should still be alive because the scope of its auto variable hadn't ended yet. The destructor by this point just had some debugging output so that we could see it was being collected before the end of its scope.

The hidden reference was hidden by Linux's epoll facility, which keeps 64 bits of arbitrary user data in kernel space, and returns it on an event. A bit like writing a pointer address to a file and then reading it back later.

(There's *also* meaningful work that must happen at a specific time in destructors in this code base, and other problems. It's pretty elaborate code by someone new to D.)

I don't think this is a problem with D, but I shared the intuition that this object should still be alive after the point the GC collected it, and that wrong intuition made troubleshooting this case a lot harder.
Sep 25 2021
On Saturday, 25 September 2021 at 23:14:54 UTC, jfondren wrote:
> I don't think this is a problem with D,

Although I'd still like an efficient GC.keepAlive. My worry is the optimizer removing some measure I'm taking specifically to keep a reference alive.
Sep 25 2021
On 9/25/2021 4:22 PM, jfondren wrote:
> Although I'd still like an efficient GC.keepAlive. My worry is the optimizer removing some measure I'm taking specifically to keep a reference alive.

Use GC.addRoot() to keep a reference alive. That's what it's for.
Sep 25 2021
On 9/25/21 3:19 PM, Walter Bright wrote:
> On 9/25/2021 10:46 AM, Steven Schveighoffer wrote:
>> Right, but the optimizer is working against that.
>>
>> For example:
>>
>> ```d
>> auto c = new C;
>> .... // a whole bunch of other code
>>
>> c.method; // not necessarily still live here.
>> ```
>>
>> Why would it not be live there? Because the method might be inlined, and the compiler might determine at that point that none of the data inside the method is needed, and therefore the object is no longer needed.
>>
>> So technically, it's not live after the original allocation. But this is really hard for a person to determine. Imagine having the GC collect your object when the object is clearly "used" later?
>
> I understand your point. Its value is never used again, so there is no reason for the GC to hold on to it.

I must be misunderstanding Steven's example. You say "its value is never used again" yet his example has

  c.method

which to me is clearly "using" it again. So, I don't understand Steven's "not necessarily still live here" comment either. Is that the case today?

> After the point when the value is never used again, when the destructor is run is indeterminate.

I accept that part. The to-me-new concept of non-lexical scoping scares me if the non-lexical scope is shorter than lexical scope. And especially when the value is clearly used by `c.method`. (If I heard non-lexical scoping, I must have taken it as "destructor may be executed later than the lexical scope".)

Ali
Sep 25 2021
On Sunday, 26 September 2021 at 00:23:19 UTC, Ali Çehreli wrote:
> c.method which to me is clearly "using" it again.

If it is a final method that doesn't actually use any class variables then it doesn't actually use the `this` pointer. The optimizer sees this and can let the object go.

That code can also run without crashing if c is null, btw, for the same reason.
Sep 25 2021
On 9/25/21 8:23 PM, Ali Çehreli wrote:
> On 9/25/21 3:19 PM, Walter Bright wrote:
>> On 9/25/2021 10:46 AM, Steven Schveighoffer wrote:
>>> Right, but the optimizer is working against that.
>>>
>>> For example:
>>>
>>> ```d
>>> auto c = new C;
>>> .... // a whole bunch of other code
>>>
>>> c.method; // not necessarily still live here.
>>> ```
>>>
>>> Why would it not be live there? Because the method might be inlined,
>>> and the compiler might determine at that point that none of the data
>>> inside the method is needed, and therefore the object is no longer
>>> needed.
>>>
>>> So technically, it's not live after the original allocation. But this
>>> is really hard for a person to determine. Imagine having the GC
>>> collect your object when the object is clearly "used" later?
>>
>> I understand your point. Its value is never used again, so there is no
>> reason for the GC to hold on to it.
>
> I must be misunderstanding Steven's example. You say "its value is never used again" yet his example has c.method which to me is clearly "using" it again. So, I don't understand Steven's "not necessarily still live here" comment either. Is that the case today?

Yes, that is the case today.

As an example, c.method might not use any data inside c. Maybe c.method is `void method() {}`

But you might say "it has to pass c into it, so aha, it's live!". Except that it could be inlined, which effectively inlines a completely empty function.

But what if it's virtual? "Aha!" you might say, "now it *has* to have the pointer live, because it has to read the vtable". But not if your compiler is ldc -- it can inline virtual functions if it can figure out that there's no way you could have created a derived object.

As I said, the optimizer is fighting you the entire way.

Someone on Beerconf suggested destroying the object at the end of the struct. That actually is a reasonable solution that I don't think the compiler can elide. If that's what you want.

But I still am wishing for a simple "keep this alive" mechanism that doesn't add too much cruft, and is guaranteed to keep it alive.

-Steve
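One possible reading of the Beerconf suggestion, sketched under the assumption that "destroying the object" means an explicit `destroy()` call that runs when the owning scope exits. `Resource`, `useResource`, and the `finalized` flag are hypothetical names for illustration, and the `GC.collect()` call is only a stand-in for a collection triggered elsewhere.

```d
import core.memory : GC;

// Flag set by the destructor so the sketch can observe that it ran.
__gshared bool finalized;

class Resource
{
    ~this() { finalized = true; }
}

void useResource()
{
    auto r = new Resource;
    // destroy(r) reads r when the scope exits, so the intent is that the
    // reference stays in use (and visible to the GC) for the whole body.
    scope (exit) destroy(r);

    GC.collect(); // stand-in for a collection triggered by other code
}

void main()
{
    useResource();
    assert(finalized); // the destructor has run by the time the scope ended
}
```

The trade-off is that this also fixes when the destructor runs, which may or may not be wanted; it is not a pure "keep alive" marker.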
Sep 25 2021
On 9/25/2021 6:02 PM, Steven Schveighoffer wrote:
> As I said, the optimizer is fighting you the entire way.

I'd reframe it as the user is trying to impose his own semantics on the optimizer :-)

Selecting semantics that enable aggressive optimizations is always a dance between user predictability and high performance. High performance usually wins.

Disabling dead assignment elimination would have a catastrophic deleterious effect on optimizations. A lot of template bloat would remain.

> But I still am wishing for a simple "keep this alive" mechanism that doesn't add too much cruft, and is guaranteed to keep it alive.

That's exactly what addRoot() is for.
Sep 25 2021
On Sat, Sep 25, 2021 at 07:13:50PM -0700, Walter Bright via Digitalmars-d wrote:
> On 9/25/2021 6:02 PM, Steven Schveighoffer wrote:
>> As I said, the optimizer is fighting you the entire way.
>
> I'd reframe it as the user is trying to impose his own semantics on the optimizer :-)

IMO, those are just the symptoms. The core of the problem is that we failed to inform the GC of a reference to some object (because said reference was passed into C land and no longer exists in D land). GC.addRoot is precisely the ticket that solves this core issue: it informs the GC about said reference. **This is what addRoot is for.**

All the other elaborate attempts to "fix" this problem without using GC.addRoot are just skirting around the issue without actually addressing the problem. No wonder it feels like "the optimizer is fighting you the entire way." Actually, the optimizer is NOT trying to fight you; it's merely telling you that **the semantics expressed in your code are not what you think they are**. It's simply reducing your code to its actual semantics as defined by the specs. The fact that this reduction isn't what you expect is a sign that the original code doesn't actually mean what you think it means.

It's like trying to prevent a boolean expression from reducing to a constant value by adding more clauses to it. If there's already a tautology in your expression, it's not gonna change no matter what else you try to add to it. Fix the tautology, and all of the problems go away. There's no need for all the other fluff.

Seriously, guys, just use GC.addRoot. **That's what it's for.** Stop trying to cure cancer with Tylenol already! :-D

[...]
>> But I still am wishing for a simple "keep this alive" mechanism that doesn't add too much cruft, and is guaranteed to keep it alive.
>
> That's exactly what addRoot() is for.

If calling addRoot causes "performance problems" (or whatever other objections one may raise), we can look into improving its performance in various ways. But avoiding calling it in the first place is not a solution.

T

-- 
"Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
Sep 26 2021
On 9/25/21 10:13 PM, Walter Bright wrote:
> On 9/25/2021 6:02 PM, Steven Schveighoffer wrote:
>> As I said, the optimizer is fighting you the entire way.
>
> I'd reframe it as the user is trying to impose his own semantics on the optimizer :-)
>
> Selecting semantics that enable aggressive optimizations is always a dance between user predictability and high performance. High performance usually wins.
>
> Disabling dead assignment elimination would have a catastrophic deleterious effect on optimizations. A lot of template bloat would remain.
>
>> But I still am wishing for a simple "keep this alive" mechanism that doesn't add too much cruft, and is guaranteed to keep it alive.
>
> That's exactly what addRoot() is for.

OK. https://github.com/dlang/dlang.org/pull/3102

-Steve
Sep 26 2021