digitalmars.D - DIP1000 scope inference
- Steven Schveighoffer (35/35) Oct 24 2022 Deprecation messages due to dip1000's imminent arrival are scheduled to
- Paul Backus (24/28) Oct 24 2022 No, it does not. This capability was added only for array
- rikki cattermole (2/13) Oct 24 2022 Does this also apply to @safe?
- rikki cattermole (5/19) Oct 24 2022 Apparently.
- Paul Backus (5/18) Oct 24 2022 No, because you are not allowed to return a `scope` variable in
- Steven Schveighoffer (16/26) Oct 24 2022 OK, what about this?
- Paul Backus (20/34) Oct 24 2022 When I compile the above with `@safe` and `-preview=dip1000`, I
- Steven Schveighoffer (22/53) Oct 25 2022 OK, I misread the error here, it's the same on run.dlang.io. But we did
- Steven Schveighoffer (9/12) Oct 25 2022 It's very curious. I can't get any indication that the struct is
- Quirin Schroll (3/17) Oct 25 2022 Asking curiously, wasn’t the function UB before, but the behavior
- Paul Backus (14/16) Oct 25 2022 Until very recently, the language spec [1] said that a `scope`
- Walter Bright (4/6) Oct 26 2022 A very good question. Clearly, having code work when it is @safe, but ca...
- rikki cattermole (3/12) Oct 26 2022 At the very least, if no solution can be determined this needs to be
- Nick Treleaven (4/11) Oct 26 2022 There's nothing to revert that corrupts memory (without
- German Diago (16/20) Oct 26 2022 Is not trusted code (note my little D experience so sorry if I am
- Salih Dincer (10/19) Oct 26 2022 @safe: it's like a seat belt. You can take your children, who
- tsbockman (24/28) Oct 26 2022 A `@safe` function is guaranteed by the compiler to be memory
- Quirin Schroll (13/43) Oct 27 2022 The “(almost)” should be absent. If you mean something other than
- Dukc (12/20) Oct 26 2022 It's not quite exactly that. The code in question fails with
- Steven Schveighoffer (6/28) Oct 26 2022 Yes, maybe. I don't know if it's UB, because I don't know the
- Steven Schveighoffer (13/23) Oct 26 2022 I should be clear here -- the code does *not* compile in @safe code, but...
- Walter Bright (3/16) Oct 26 2022 We're in full agreement here.
- Walter Bright (17/25) Oct 26 2022 [Some more thinking about the problem]
- Steven Schveighoffer (20/60) Oct 26 2022 Please no! We can allocate on the stack by explicitly requesting it:
- Walter Bright (6/27) Oct 27 2022 How would this be done:
- Steven Schveighoffer (9/42) Oct 27 2022 Already works today, except I don't know what the + a means:
- German Diago (13/40) Oct 27 2022 As a person who has used D but not extensively, I was suprised of
- Quirin Schroll (6/60) Oct 27 2022 If `[1, 2, 3]` is stack allocated, it should not compile (at
- Walter Bright (3/14) Oct 27 2022 Add .dup for those that need the array to survive the function.
- Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (2/3) Oct 27 2022 Why isnt this an immutable constant just like a string literal?
- Dukc (12/18) Oct 27 2022 Please no. Far too much breakage for the value (even without
Deprecation messages due to dip1000's imminent arrival are scheduled to happen on the next release of the compiler. I have some concerns about scope inference, and wanted to find out the answers here.

Let's say I have a scope array like this in a `@trusted` function:

```d
int[] mkarr() @trusted
{
    scope arr = [1, 2, 3];
    return arr;
}
```

Clearly, this is a bad idea. The compiler might put the array data actually on the stack (right?), and therefore return stack data when it shouldn't.

But what if you *don't* mark it scope? Let's try something here:

```d
int[] mkarr() @safe
{
    int[3] arr = [1, 2, 3];
    int[] other = arr[];
    other = [4, 5, 6];
    return other;
}
```

By the time `other` is returned, it should no longer be pointing at stack data. But *because* it was originally assigned to the static array, `other` is inferred as scope (as is proven by the code above failing to compile with dip1000 enabled with an error about returning scope data).

Let's switch that back to `@trusted`, and now it does compile, even with dip1000.

BUT, let me ask this very crucial question: Does the inferred `scope` make it so that the compiler is *allowed* to allocate the `[4, 5, 6]` literal on the stack? Keep in mind that I never put `scope` here, this is something the compiler did on its own.

In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?

-Steve
Oct 24 2022
On Tuesday, 25 October 2022 at 01:35:28 UTC, Steven Schveighoffer wrote:
> Does the inferred `scope` make it so that the compiler is *allowed* to allocate the `[4, 5, 6]` literal on the stack? Keep in mind that I never put `scope` here, this is something the compiler did on its own.

No, it does not. This capability was added only for array literals, and only for variable initialization:

DMD PR: https://github.com/dlang/dmd/pull/14562
Spec PR (pending): https://github.com/dlang/dlang.org/pull/3442

However, this thread raises an important point: changing the way existing language constructs allocate memory in the presence of `scope` may cause `@trusted` code which relied on the original behavior to become unsound.

For example, the `@trusted` function below is memory safe when using the current compiler release, but will become unsafe when compiled with DMD 2.101:

```d
@trusted int[] example()
{
    scope example = [1, 2, 3];
    return example;
}
```

The worst part is that the potential memory corruption is introduced silently. Users who upgrade to DMD 2.101 will have no idea that the ground has shifted beneath their feet until their code invokes UB at runtime.
Oct 24 2022
On 25/10/2022 3:09 PM, Paul Backus wrote:
> For example, the `@trusted` function below is memory safe when using the current compiler release, but will become unsafe when compiled with DMD 2.101:
>
> ```d
> @trusted int[] example()
> {
>     scope example = [1, 2, 3];
>     return example;
> }
> ```

Does this also apply to @safe?
Oct 24 2022
On 25/10/2022 3:13 PM, rikki cattermole wrote:
> On 25/10/2022 3:09 PM, Paul Backus wrote:
>> For example, the `@trusted` function below is memory safe when using the current compiler release, but will become unsafe when compiled with DMD 2.101:
>>
>> ```d
>> @trusted int[] example()
>> {
>>     scope example = [1, 2, 3];
>>     return example;
>> }
>> ```
>
> Does this also apply to @safe?

Apparently. I can't find any checks in the PR.

REVERT REVERT REVERT (or ya know add the check for @safe).

#SuddenlyWorried lol
Oct 24 2022
On Tuesday, 25 October 2022 at 02:13:20 UTC, rikki cattermole wrote:
> On 25/10/2022 3:09 PM, Paul Backus wrote:
>> For example, the `@trusted` function below is memory safe when using the current compiler release, but will become unsafe when compiled with DMD 2.101:
>>
>> ```d
>> @trusted int[] example()
>> {
>>     scope example = [1, 2, 3];
>>     return example;
>> }
>> ```
>
> Does this also apply to @safe?

No, because you are not allowed to return a `scope` variable in `@safe` code, even if you happen to know that it points to a heap allocation.
Oct 24 2022
On 10/24/22 10:09 PM, Paul Backus wrote:
> On Tuesday, 25 October 2022 at 01:35:28 UTC, Steven Schveighoffer wrote:
>> Does the inferred `scope` make it so that the compiler is *allowed* to allocate the `[4, 5, 6]` literal on the stack? Keep in mind that I never put `scope` here, this is something the compiler did on its own.
>
> No, it does not. This capability was added only for array literals, and only for variable initialization:
>
> DMD PR: https://github.com/dlang/dmd/pull/14562
> Spec PR (pending): https://github.com/dlang/dlang.org/pull/3442

OK, what about this?

```d
int[] mkarr() @trusted
{
    int[3] arr = [1, 2, 3];
    int[] other = [4, 5, 6];
    auto foo = other;
    other = arr[];
    return foo;
}
```

`other` is inferred as `scope` (along with `foo`), because it touches `arr[]` later (but after it was pointing at what should have been heap memory). So does that count as possible for stack allocation, or is it still heap allocated?

-Steve
Oct 24 2022
On Tuesday, 25 October 2022 at 02:38:02 UTC, Steven Schveighoffer wrote:
> OK, what about this?
>
> ```d
> int[] mkarr() @trusted
> {
>     int[3] arr = [1, 2, 3];
>     int[] other = [4, 5, 6];
>     auto foo = other;
>     other = arr[];
>     return foo;
> }
> ```
>
> `other` is inferred as `scope` (along with `foo`), because it touches `arr[]` later (but after it was pointing at what should have been heap memory). So does that count as possible for stack allocation, or is it still heap allocated?

When I compile the above with `@safe` and `-preview=dip1000`, I get

    Error: reference to local variable `arr` assigned to non-scope `other`

...using both DMD 2.100.2 and DMD master. So `scope` is not actually being inferred here, and the array is allocated on the heap.

My expectation is that `scope` will probably *never* be inferred for `other`, because doing multi-step inference like this requires dataflow analysis in the general case, which is something Walter wants to avoid (see discussion in [issue 20674][1]). So I don't think you have anything to worry about.

Still, this is a good illustration of how silently changing the rules on people can have unintended consequences. If Walter ever *does* consider adding dataflow analysis, overly-aggressive "optimizations" like these could easily become obstacles in the way of that goal.

[1]: https://issues.dlang.org/show_bug.cgi?id=20674
Oct 24 2022
On 10/24/22 10:59 PM, Paul Backus wrote:
> On Tuesday, 25 October 2022 at 02:38:02 UTC, Steven Schveighoffer wrote:
>> OK, what about this?
>>
>> ```d
>> int[] mkarr() @trusted
>> {
>>     int[3] arr = [1, 2, 3];
>>     int[] other = [4, 5, 6];
>>     auto foo = other;
>>     other = arr[];
>>     return foo;
>> }
>> ```
>>
>> `other` is inferred as `scope` (along with `foo`), because it touches `arr[]` later (but after it was pointing at what should have been heap memory). So does that count as possible for stack allocation, or is it still heap allocated?
>
> When I compile the above with `@safe` and `-preview=dip1000`, I get
>
>     Error: reference to local variable `arr` assigned to non-scope `other`
>
> ...using both DMD 2.100.2 and DMD master. So `scope` is not actually being inferred here, and the array is allocated on the heap.

OK, I misread the error here, it's the same on run.dlang.io. But we did just go through an exercise where a struct not labeled scope is inferred scope not because of its declaration, but because of later things done with it. It doesn't seem to be the case here.

> My expectation is that `scope` will probably *never* be inferred for `other`, because doing multi-step inference like this requires dataflow analysis in the general case, which is something Walter wants to avoid (see discussion in [issue 20674][1]). So I don't think you have anything to worry about.

My biggest concern is that this inference takes priority over what is actually written, and then can cause memory problems to occur in code that seemingly reads like it shouldn't cause memory problems. I'm trying to find a hole because I'm worried about that hole showing up without intention later (especially with the way the compiler can inline and rewrite code for optimization).

The compiler doing things that are not checkable (I know of no way to introspect that something is scope inferred), hard to describe, and impossible to prevent makes things uncomfortable. Especially if the compiler might make disastrous decisions based on that inference.

It would be relieving to have some rule that says "any data inferred scope inside a `@system` or `@trusted` context without explicitly being declared scope shall not result in memory allocations hoisting to the stack".

I can deal, begrudgingly, with compiler errors that are misguided. I can't deal with memory errors caused by the compiler knowing better than me.

-Steve
Oct 25 2022
On 10/25/22 9:44 AM, Steven Schveighoffer wrote:
> But we did just go through an exercise where a struct not labeled scope is inferred scope not because of its declaration, but because of later things done with it.

It's very curious. I can't get any indication that the struct is inferred scope except by throwing an exception contained in it.

If I declare the function `@safe`, it won't let me assign the scope variable to the struct member. If I mark it as `@trusted`, that succeeds, but then it won't let me throw the exception out of the struct because it says the struct is scope.

Again, with no way to tell whether scope is inferred, it's hard to judge.

-Steve
Oct 25 2022
On Tuesday, 25 October 2022 at 02:09:02 UTC, Paul Backus wrote:
> For example, the `@trusted` function below is memory safe when using the current compiler release, but will become unsafe when compiled with DMD 2.101:
>
> ```d
> @trusted int[] example()
> {
>     scope example = [1, 2, 3];
>     return example;
> }
> ```
>
> The worst part is that the potential memory corruption is introduced silently. Users who upgrade to DMD 2.101 will have no idea that the ground has shifted beneath their feet until their code invokes UB at runtime.

Asking curiously, wasn’t the function UB before, but the behavior changed?
Oct 25 2022
On Tuesday, 25 October 2022 at 14:14:00 UTC, Quirin Schroll wrote:
> Asking curiously, wasn’t the function UB before, but the behavior changed?

Until very recently, the language spec [1] said that a `scope` *parameter* "must not escape", but was silent on whether the same rule applied to `scope` local variables (although it would be reasonable to infer that it did). At some point between the release of DMD 2.100.2 and current `master`, the spec was updated to additionally state that returning a `scope` variable from a function is "disallowed" [2].

So, yes, I think the most reasonable interpretation is that this was always intended to be UB. But I am not confident that the average D user would have *known* for certain it was UB at the time DMD 2.100.2 was released.

[1] https://dlang.org/spec/function.html#scope-parameters
[2] https://dlang.org/spec/attribute.html#scope
Oct 25 2022
On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?

A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
Oct 26 2022
On 26/10/2022 9:03 PM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>
> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.

At the very least, if no solution can be determined this needs to be reverted before 2.101.0.
Oct 26 2022
On Wednesday, 26 October 2022 at 08:37:36 UTC, rikki cattermole wrote:
> On 26/10/2022 9:03 PM, Walter Bright wrote:
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> At the very least, if no solution can be determined this needs to be reverted before 2.101.0.

There's nothing to revert that corrupts memory (without incorrectly writing scope), see Paul's reply.
Oct 26 2022
On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright wrote:
> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.

Isn't @trusted code unsafe? (I have little D experience, so sorry if I am asking something relatively stupid.) I mean, @safe is safe, @trusted is ??, and with @system you are on your own. So what are the guarantees of @trusted compared to @system?

Also, as far as I understood from my limited D usage, only `type[N]` values are static arrays on the stack, and the rest are GC-allocated by default, right? So in the presence of scope, that should probably be a dynamically sized array that would be "freed" by the GC and invalid at the end of the function. I would assume a move can be done if the array is not static, independently of scope being there or not, and an error if it is statically allocated, since the return type is `type[]` (without an explicit size in the type).
Oct 26 2022
On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago wrote:
> On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright wrote:
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> Is not trusted code (note my little D experience so sorry if I am asking something relatively stupid) unsafe? I mean, @safe is safe, @trusted is ??, @system is you go your own.

@safe: it's like a seat belt. You can take your children, who come to see their uncle at the weekend, by car with their seat belts.

@trusted: it's like an uncle who didn't crash with his tractor. You can take your children around the field with their uncles by tractor. We trust the uncle, but even if he did not have an accident, this 2nd situation is not safe.

SDB@79
Oct 26 2022
On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago wrote:
> Is not trusted code (note my little D experience so sorry if I am asking something relatively stupid) unsafe? I mean, @safe is safe, @trusted is ??, @system is you go your own. - So what are the guarantees of @trusted compared to @system?

A `@safe` function is guaranteed by the compiler to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.

A `@trusted` function is guaranteed by its author to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.

A `@system` function may require the caller to follow additional rules beyond those enforced by the compiler, even in `@safe` code, to maintain memory safety. Since the compiler does not know what these additional rules are and cannot enforce them automatically, calling `@system` functions directly from `@safe` code is forbidden.

| Attribute  | Must check definition | Must check each caller |
|------------|-----------------------|------------------------|
| `@safe`    | compiler              | compiler               |
| `@trusted` | programmer            | compiler               |
| `@system`  | programmer            | programmer             |

Assume the function is implemented correctly, then try to figure out how to call the function from `@safe` code in a way that violates memory safety. If there is a way to do so, the function should be `@system`. Otherwise, it should be `@safe` if that compiles, or `@trusted` if not.
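The three rows of the table can be illustrated with a minimal sketch. The names `sumUnchecked` and `sumChecked` are hypothetical, invented for this illustration only; they do not come from the thread or from Phobos.

```d
// Hypothetical helper: @system, because the caller must guarantee that
// `ptr` points to at least `n` readable ints. The compiler cannot check that.
int sumUnchecked(const(int)* ptr, size_t n) @system
{
    int total = 0;
    foreach (i; 0 .. n)
        total += ptr[i];
    return total;
}

// Hypothetical wrapper: @trusted, because a slice carries its own length,
// so no argument that @safe code can construct makes this call unsafe.
int sumChecked(const(int)[] values) @trusted
{
    return sumUnchecked(values.ptr, values.length);
}

void main() @safe
{
    int[] data = [1, 2, 3]; // GC-allocated array
    assert(sumChecked(data) == 6);
    // sumUnchecked(data.ptr, 3); // would not compile in @safe code
}
```

Applying the test from the post: there is no way to call `sumChecked` from `@safe` code that violates memory safety, while `sumUnchecked` can be broken by passing a length that is too large, so it has to stay `@system`.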
Oct 26 2022
On Wednesday, 26 October 2022 at 20:24:38 UTC, tsbockman wrote:
> On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago wrote:
>> Is not trusted code (note my little D experience so sorry if I am asking something relatively stupid) unsafe? I mean, @safe is safe, @trusted is ??, @system is you go your own. - So what are the guarantees of @trusted compared to @system?
>
> A `@safe` function is guaranteed by the compiler to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.
>
> A `@trusted` function is guaranteed by its author to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.

The “(almost)” should be absent. If you mean something other than compiler bugs, please tell us.

> A `@system` function may require the caller to follow additional rules beyond those enforced by the compiler, even in `@safe` code, to maintain memory safety. Since the compiler does not know what these additional rules are and cannot enforce them automatically, calling `@system` functions directly from `@safe` code is forbidden.
>
> | Attribute  | Must check definition | Must check each caller |
> |------------|-----------------------|------------------------|
> | `@safe`    | compiler              | compiler               |
> | `@trusted` | programmer            | compiler               |
> | `@system`  | programmer            | programmer             |
>
> Assume the function is implemented correctly, then try to figure out how to call the function from `@safe` code in a way that violates memory safety. If there is a way to do so, the function should be `@system`. Otherwise, it should be `@safe` if that compiles, or `@trusted` if not.

I agree with the characterization of `@safe` and `@system`. For `@trusted` functions, there’s something more to say:

* Widely accessible ones (e.g. `public`, `package`, `protected`, even `private` in a big module) should have a `@safe` interface, i.e. you can use them like `@safe` functions in all regards; they just aren’t `@safe` because of some implementation details.
* Narrowly accessible ones (e.g. `private` (in a small module), local functions, immediately executed lambdas) can have a `@system` interface, but their surroundings can be trusted to use the function correctly.
Oct 27 2022
On 27.10.22 19:39, Quirin Schroll wrote:
> I agree with the characterization of `@safe` and `@system`. For `@trusted` functions, there’s something more to say:
>
> * Widely accessible ones (e.g. `public`, `package`, `protected`, even `private` in a big module) should have a `@safe` interface, i.e. you can use them like `@safe` functions in all regards; they just aren’t `@safe` because of some implementation details.

Every single @trusted function must have a safe interface. That includes local functions and immediately called literals.

> * Narrowly accessible ones (e.g. `private` (in a small module), local functions, immediately executed lambdas) can have a `@system` interface, but their surroundings can be trusted to use the function correctly.

You say it yourself: In that case, the surroundings need to be @trusted. The function that is being called can only be @system when it doesn't have a safe interface.
Oct 27 2022
On Thursday, 27 October 2022 at 17:39:14 UTC, Quirin Schroll wrote:
> On Wednesday, 26 October 2022 at 20:24:38 UTC, tsbockman wrote:
>> A `@safe` function is guaranteed by the compiler to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.
>>
>> A `@trusted` function is guaranteed by its author to be memory safe to call from other `@safe` code with (almost) any possible arguments and under (almost) any circumstances.
>
> The “(almost)” should be absent. If you mean something other than compiler bugs, please tell us.

In practice, `@safe` code depends upon a guard page to catch `null` pointer dereferences. If a struct field or static array element is at a sufficiently large offset from the `null` pointer, this can theoretically result in a silent buffer overrun. As far as I can tell, this is not considered a bug, but rather a reasonable trade-off for improved performance.

Also, doing anything at all is [officially undefined behavior](https://dlang.org/spec/expression.html#assert_expressions) after a failed assertion. This is, again, theoretically problematic because debug builds may call user code to prepare or log the `AssertError`.

There are probably other obscure cases like these, as well, which `@safe` and `@trusted` functions are not responsible for handling correctly.

> I agree with the characterization of `@safe` and `@system`. For `@trusted` functions, there’s something more to say:
> ...
> * Narrowly accessible ones (e.g. `private` (in a small module), local functions, immediately executed lambdas) can have a `@system` interface, but their surroundings can be trusted to use the function correctly.

My characterization [agrees with the language spec](https://dlang.org/spec/function.html#trusted-functions), yours does not. You are essentially redefining `@trusted` to mean "ignore any memory safety issues here", instead of what it is actually intended to mean, "trust me, this is actually memory safe".
Oct 27 2022
On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>
> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.

It's not quite exactly that. The code in question fails with `@safe`. The problem is that Steven's `@trusted` code not only happens to work, but is defined behaviour without dip1000, yet undefined behaviour with `-preview=dip1000`.

My proposal: disable local variable `scope` inference for `@system` and `@trusted` code. This has the downside that it's difficult to test whether the implementation really turns the inference off. But unless we're ready to ditch `scope` inference altogether I can't come up with anything better.
Oct 26 2022
On 10/26/22 8:49 AM, Dukc wrote:
> On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> It's not quite exactly that. The code in question fails with `@safe`. The problem is that Steven's `@trusted` code not only happens to work, but is defined behaviour without dip1000, yet undefined behaviour with `-preview=dip1000`.

Yes, maybe. I don't know if it's UB, because I don't know the rules/philosophy.

> My proposal: disable local variable `scope` inference for `@system` and `@trusted` code. This has the downside that it's difficult to test whether the implementation really turns the inference off. But unless we're ready to ditch `scope` inference altogether I can't come up with anything better.

This is a possibility. I don't know the consequences of this, especially for template code.

-Steve
Oct 26 2022
On 10/26/22 4:03 AM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>
> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.

I should be clear here -- the code does *not* compile in @safe code, but is perfectly reasonable as @trusted code. What I don't want is the compiler taking actions based on scope inference that cause memory corruption.

I get that we can say "if it wouldn't compile in @safe, it's on you to make sure it doesn't corrupt memory as @trusted". But if the reason it's unsafe is not because of things you wrote, but because of compiler inference (as in this case), then the compiler should either not do the inference, or not hoist allocations to the stack based on that inference. A philosophy/statement to that effect should be satisfactory.

The last thing we want dip1000 to do is *cause* memory corruption.

-Steve
Oct 26 2022
On 10/26/2022 7:38 AM, Steven Schveighoffer wrote:
> On 10/26/22 4:03 AM, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> I should be clear here

I understood the issue <g>.

> The last thing we want dip1000 to do is *cause* memory corruption.

We're in full agreement here.
Oct 26 2022
On 10/26/2022 1:03 AM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>
> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.

[Some more thinking about the problem]

The question is when is [1,2,3] allocated on the stack, and when is it allocated on the GC heap?

Some points:

1. in C it is allocated on the stack. D's behavior to allocate it on the heap is kinda surprising in that light, even though D had such literals before C did
2. allocating on the heap means it is unusable in @nogc code
3. when writing expressions, the only way to get it on the stack is to assign it to a scope variable, which is inconvenient and inefficient
4. it runs against the idea that the simpler code should be more efficient than the complex code

Therefore, I suggest the following:

[1,2,3] is always allocated on the stack
[1,2,3].dup is always allocated on the heap

and thus, its behavior is not dependent on inference.

How we transition to this, we'll have to figure out.
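For context on point 2, here is a minimal sketch of the two explicit choices available under the existing rules, as I understand them. The function names `sumFixed` and `makeHeap` are hypothetical and exist only for this illustration.

```d
// Initializing a static array from a literal does not use the GC,
// so it is accepted in @nogc code.
int sumFixed() @safe @nogc
{
    int[3] onStack = [1, 2, 3];
    int total = 0;
    foreach (v; onStack)
        total += v;
    return total;
}

// Assigning the same literal to a dynamic array allocates on the GC heap,
// which is what lets the result outlive the function.
int[] makeHeap() @safe
{
    int[] onHeap = [1, 2, 3];
    return onHeap;
}

void main() @safe
{
    assert(sumFixed() == 6);
    assert(makeHeap() == [1, 2, 3]);
}
```

The sketch only shows the status quo being discussed; it does not model the proposed change, under which `makeHeap` would presumably need an explicit `.dup`.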
Oct 26 2022
On 10/26/22 8:57 PM, Walter Bright wrote:
> On 10/26/2022 1:03 AM, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> [Some more thinking about the problem]
>
> The question is when is [1,2,3] allocated on the stack, and when is it allocated on the GC heap?
>
> Some points:
>
> 1. in C it is allocated on the stack. D's behavior to allocate it on the heap is kinda surprising in that light, even though D had such literals before C did
> 2. allocating on the heap means it is unusable in @nogc code
> 3. when writing expressions, the only way to get it on the stack is to assign it to a scope variable, which is inconvenient and inefficient
> 4. it runs against the idea that the simpler code should be more efficient than the complex code
>
> Therefore, I suggest the following:
>
> [1,2,3] is always allocated on the stack
> [1,2,3].dup is always allocated on the heap
>
> and thus, its behavior is not dependent on inference.
>
> How we transition to this, we'll have to figure out.

Please no!

We can allocate on the stack by explicitly requesting it:

```d
int[3] arr = [1, 2, 3];
```

The issue is the DRYness of it. This has been proposed before, just:

```d
int[$] arr = [1, 2, 3]; // proposed syntax, not valid D today
```

If we are going to fix something, let's fix this! It's backwards compatible too.

If anything, the compiler can just punt and say all array literals that aren't immediately assigned to static arrays are allocated on the heap. Then it's consistent.

Allocating array literals on the heap is *awesome*, please don't change that! D is one of the best learning languages for high-performance code because you don't have to worry at all about memory management out of the box.

I'm actually OK with backends using stack allocations because it can prove they aren't escaping, why can't we just rely on that?

-Steve
Oct 26 2022
On 10/26/2022 6:26 PM, Steven Schveighoffer wrote:
> Please no!
>
> We can allocate on the stack by explicitly requesting it:
>
> ```d
> int[3] = [1, 2, 3];
> ```
>
> The issue is the DRYness of it. This has been proposed before, just:
>
> ```d
> int[$] = [1, 2, 3];
> ```

How would this be done:

    foo([1,2,3] + a)

i.e. using an array literal in places other than an initialization?

> If we are going to fix something, let's fix this! It's backwards compatible too.
>
> If anything, the compiler can just punt and say all array literals that aren't immediately assigned to static arrays are allocated on the heap. Then it's consistent.

And inefficient.

> Allocating array literals on the heap is *awesome*, please don't change that! D is one of the best learning languages for high-performance code because you don't have to worry at all about memory management out of the box.
>
> I'm actually OK with backends using stack allocations because it can prove they aren't escaping, why can't we just rely on that?

I thought your test case showed the problem with that :-/
Oct 27 2022
On 10/27/22 9:44 AM, Walter Bright wrote:
> On 10/26/2022 6:26 PM, Steven Schveighoffer wrote:
>> Please no! We can allocate on the stack by explicitly requesting it:
>>
>> ```d
>> int[3] = [1, 2, 3];
>> ```
>>
>> The issue is the DRYness of it. This has been proposed before, just:
>>
>> ```d
>> int[$] = [1, 2, 3];
>> ```
>
> How would this be done: foo([1,2,3] + a)

Already works today, except I don't know what the + a means:

    foo([1, 2, 3].staticArray);

>> If we are going to fix something, let's fix this! It's backwards compatible too.
>>
>> If anything, the compiler can just punt and say all array literals that aren't immediately assigned to static arrays are allocated on the heap. Then it's consistent.
>
> And inefficient.

Inefficiencies that are taken care of by modern backends, such as LLVM and GCC.

>> Allocating array literals on the heap is *awesome*, please don't change that! D is one of the best learning languages for high-performance code because you don't have to worry at all about memory management out of the box.
>>
>> I'm actually OK with backends using stack allocations because it can prove they aren't escaping, why can't we just rely on that?
>
> I thought your test case showed the problem with that :-/

Backends that put it on the stack are not using language constructs such as scope to make assumptions, they are using actual analysis of the control flow to prove that it doesn't escape.

-Steve
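As a rough illustration of the `staticArray` suggestion, the sketch below passes a literal to a function without repeating its length and without a GC allocation. `std.array.staticArray` is the Phobos function; `sum3` and `demo` are hypothetical names used only for this example.

```d
import std.array : staticArray;

// Hypothetical consumer that takes a fixed-size array by value.
int sum3(int[3] xs) @safe pure nothrow @nogc
{
    int total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

int demo() @safe pure nothrow @nogc
{
    // The length 3 is inferred from the literal, so it is not spelled out twice.
    auto arr = [1, 2, 3].staticArray;
    static assert(is(typeof(arr) == int[3]));
    return sum3(arr);
}
```

Because `demo` is marked `@nogc`, the compiler would reject it if the literal required a heap allocation here.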
Oct 27 2022
On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
> On 10/26/2022 1:03 AM, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>
> [Some more thinking about the problem]
>
> The question is when is [1,2,3] allocated on the stack, and when is it allocated on the GC heap? Some points:
>
> 1. in C it is allocated on the stack. D's behavior to allocate it on the heap is kinda surprising in that light, even though D had such literals before C did
> 2. allocating on the heap means it is unusable in @nogc code
> 3. when writing expressions, the only way to get it on the stack is to assign it to a scope variable, which is inconvenient and inefficient
> 4. it runs against the idea that the simpler code should be more efficient than the complex code
>
> Therefore, I suggest the following:
>
> [1,2,3] is always allocated on the stack
> [1,2,3].dup is always allocated on the heap

As someone who has used D, though not extensively, I was surprised by the type[] vs type[N] behavior all the time. I agree that [1, 2, 3] should be allocated on the stack, but I am not sure how much code that could break. For example, if it was on the heap before, what happens with this now?

    int[] func() {
        // Allocated on the stack; I presume that is not safe. Should I add .dup?
        int[] v = [1, 2, 3];
        return v;
    }

How should it work?
Oct 27 2022
On Thursday, 27 October 2022 at 09:36:25 UTC, German Diago wrote:
> On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
>> On 10/26/2022 1:03 AM, Walter Bright wrote:
>>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>>> In a `@trusted` function today, without dip1000, the above is perfectly reasonable and not invalid. Will dip1000 make it corrupt memory?
>>>
>>> A very good question. Clearly, having code work when it is @safe, but cause memory corruption when it is marked @trusted, is the wrong solution. This should never happen. I'm not sure what the solution should be here.
>>
>> [Some more thinking about the problem]
>>
>> The question is when is `[1,2,3]` allocated on the stack, and when is it allocated on the GC heap? Some points:
>>
>> 1. in C it is allocated on the stack. D's behavior to allocate it on the heap is kinda surprising in that light, even though D had such literals before C did
>> 2. allocating on the heap means it is unusable in `@nogc` code
>> 3. when writing expressions, the only way to get it on the stack is to assign it to a scope variable, which is inconvenient and inefficient
>> 4. it runs against the idea that the simpler code should be more efficient than the complex code
>>
>> Therefore, I suggest the following:
>>
>> ```d
>> [1,2,3] // is always allocated on the stack
>> [1,2,3].dup // is always allocated on the heap
>> ```
>
> As someone who has used D, though not extensively, I was surprised by the `type[]` vs `type[N]` behavior all the time. I agree that `[1, 2, 3]` should be allocated on the stack, but I am not sure how much code that could break. For example, if it was on the heap before, what happens with this now?
>
> ```d
> int[] func() {
>     // Allocated on the stack; I presume that is not safe. Should I add .dup?
>     int[] v = [1, 2, 3];
>     return v;
> }
> ```
>
> How should it work?

If `[1, 2, 3]` is stack allocated, it should not compile (at least not in `@safe` code, probably not in `@system` code either). The problem is not the assignment to `v` (that is of the same kind as a pointer to a local variable), but that its value is returned, which leaks the address of a local.
Oct 27 2022
On 10/27/2022 2:36 AM, German Diago wrote:
> As someone who has used D, though not extensively, I was surprised by the type[] vs type[N] behavior all the time. I agree that [1, 2, 3] should be allocated on the stack, but I am not sure how much code that could break. For example, if it was on the heap before, what happens with this now?

You'll get an error on [1]

> int[] func() {
>     // Allocated on the stack; I presume that is not safe. Should I add .dup?
>     int[] v = [1, 2, 3];
>     return v; // [1]
> }
>
> How should it work?

Add .dup for those that need the array to survive the function.
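A small sketch of the `.dup` advice, written against the language as it works today by using a fixed-size local as the stack-allocated array. The name `makeValues` is hypothetical, and the commented-out line shows the kind of escape that `@safe` code is expected to reject.

```d
int[] makeValues() @safe
{
    // Fixed-size array: lives in this function's stack frame.
    int[3] local = [1, 2, 3];

    // return local[];   // would hand out a slice of stack memory;
    //                   // the compiler is expected to reject this in @safe code

    // .dup copies the elements to the GC heap, so the result
    // stays valid after the function returns.
    return local[].dup;
}
```

Under the proposal in this thread, `int[] v = [1, 2, 3]; return v;` would need the same treatment, with `.dup` added at the point where the data has to outlive the function.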
Oct 27 2022
On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
> [1,2,3] is always allocated on the stack

Why isn't this an immutable constant just like a string literal?
Oct 27 2022
On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
> Therefore, I suggest the following:
>
> [1,2,3] is always allocated on the stack

Please no. Far too much breakage for the value (even without going to the question whether it'd be added value in the first place).

> 2. allocating on the heap means it is unusable in @nogc code

The compiler will error, and the programmer can manually fix it. No silent errors. `@nogc` code is still a bit of a special case, GC-using code is the normal we want to optimise the language for.

> 3. when writing expressions, the only way to get it on the stack is to assign it to a scope variable, which is inconvenient and inefficient

The compiler is still free to optimise those as a stack allocation, if it can prove there's no escaping of the data. `scope` is just used to enforce that being the case in `@safe`, or giving the compiler the permission to assume that being the case in `@trusted` and `@system`.
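To make the role of `scope` a little more concrete, here is a hedged sketch: a `scope` parameter is a promise that the callee does not keep the slice, which is what lets a caller pass stack-backed data. The names `total` and `demoScope` are invented for this example, and the checking described in the comments assumes compilation with `-preview=dip1000`.

```d
// `scope` states that `xs` is not stored anywhere that outlives this call.
// In @safe code built with -preview=dip1000 the compiler checks that promise;
// in @trusted and @system code it is taken on trust.
int total(scope const(int)[] xs) @safe pure nothrow @nogc
{
    int sum = 0;
    foreach (x; xs)
        sum += x;
    return sum;
}

int demoScope() @safe pure nothrow @nogc
{
    int[3] buf = [1, 2, 3];   // stack storage
    return total(buf[]);      // slice of stack memory handed to a scope parameter
}
```

If `total` tried to copy `xs` into a global or return it, the dip1000 checks would be expected to flag that in `@safe` code, which is the enforcement described above.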
Oct 27 2022