digitalmars.D - DIP1000 scope inference

Steven Schveighoffer <schveiguy gmail.com> writes:

Deprecation messages due to dip1000's imminent arrival are scheduled to 
happen on the next release of the compiler. I have some concerns about 
scope inference, and wanted to find out the answers here.

Let's say I have a scope array like this in a `@trusted` function:

```d
int[] mkarr() @trusted {
    scope arr = [1, 2, 3];
    return arr;
}
```

Clearly, this is a bad idea. The compiler might put the array data 
actually on the stack (right?), and therefore return stack data when it 
shouldn't.

But what if you *don't* mark it scope? Let's try something here:

```d
int[] mkarr() @safe {
    int[3] arr = [1, 2, 3];
    int[] other = arr[];

    other = [4, 5, 6];
    return other;
}
```

By the time `other` is returned, it should no longer be pointing at 
stack data. But *because* it was originally assigned to the static 
array, `other` is inferred as scope (as is proven by the code above 
failing to compile with dip1000 enabled with an error about returning 
scope data).

Let's switch that back to `@trusted`, and now it does compile, even with 
dip1000. BUT, let me ask this very crucial question:

Does the inferred `scope` make it so that the compiler is *allowed* to 
allocate the `[4, 5, 6]` literal on the stack? Keep in mind that I never 
put `scope` here, this is something the compiler did on its own.

In a `@trusted` function today, without dip1000, the above is perfectly 
reasonable and not invalid. Will dip1000 make it corrupt memory?

-Steve

Oct 24 2022
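
One way to probe this question at run time, as a minimal sketch: `core.memory.GC.addrOf` returns `null` for a pointer that does not belong to a GC-managed block, so it can tell whether a returned slice points at heap memory. The helper name `isGcAllocated` is hypothetical. The check only observes what a particular compiler build did; it says nothing about what the compiler is permitted to do.

```d
import core.memory : GC;

// Hypothetical helper: true if the slice's memory lies inside a GC block.
bool isGcAllocated(const(int)[] arr) @trusted
{
    // GC.addrOf yields null for memory the GC does not manage (e.g. the stack).
    return GC.addrOf(cast(void*) arr.ptr) !is null;
}

// The function from the post above, marked @trusted so it compiles
// with and without -preview=dip1000.
int[] mkarr() @trusted
{
    int[3] arr = [1, 2, 3];
    int[] other = arr[];

    other = [4, 5, 6]; // plain assignment of an array literal
    return other;
}
```

Calling `isGcAllocated(mkarr())` in a test build is a cheap regression check that the literal was not moved to the stack.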

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 25 October 2022 at 01:35:28 UTC, Steven Schveighoffer 
wrote:
> Does the inferred `scope` make it so that the compiler is 
> *allowed* to allocate the `[4, 5, 6]` literal on the stack? 
> Keep in mind that I never put `scope` here, this is something 
> the compiler did on its own.

No, it does not. This capability was added only for array 
literals, and only for variable initialization:

DMD PR: https://github.com/dlang/dmd/pull/14562
Spec PR (pending): https://github.com/dlang/dlang.org/pull/3442

However, this thread raises an important point: changing the way 
existing language constructs allocate memory in the presence of 
`scope` may cause `@trusted` code which relied on the original 
behavior to become unsound.

For example, the `@trusted` function below is memory safe when 
using the current compiler release, but will become unsafe when 
compiled with DMD 2.101:

```d
@trusted int[] example()
{
    scope example = [1, 2, 3];
    return example;
}
```

The worst part is that the potential memory corruption is 
introduced silently. Users who upgrade to DMD 2.101 will have no 
idea that the ground has shifted beneath their feet until their 
code invokes UB at runtime.

Oct 24 2022
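
A minimal sketch of how such a function could be written so that it does not depend on where the literal is allocated: either drop `scope`, or copy explicitly with `.dup` before returning. The function names are hypothetical, and the second variant is kept `@trusted` here only to mirror the example above.

```d
// Without `scope`, the literal is an ordinary GC allocation,
// so returning it is fine even in @safe code.
int[] exampleHeap() @safe
{
    int[] arr = [1, 2, 3];
    return arr;
}

// With `scope`, the data may live on the stack (DMD 2.101 and later),
// so return an explicit heap copy instead of the variable itself.
int[] exampleCopy() @trusted
{
    scope int[] tmp = [1, 2, 3];
    return tmp.dup; // .dup always allocates a fresh GC array
}
```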

rikki cattermole <rikki cattermole.co.nz> writes:

On 25/10/2022 3:09 PM, Paul Backus wrote:
> For example, the `@trusted` function below is memory safe when using the 
> current compiler release, but will become unsafe when compiled with DMD 
> 2.101:
>
> ```d
> @trusted int[] example()
> {
>     scope example = [1, 2, 3];
>     return example;
> }
> ```

Does this also apply to @safe?

Oct 24 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 25/10/2022 3:13 PM, rikki cattermole wrote:
> On 25/10/2022 3:09 PM, Paul Backus wrote:
>> For example, the `@trusted` function below is memory safe when using 
>> the current compiler release, but will become unsafe when compiled 
>> with DMD 2.101:
>>
>> ```d
>> @trusted int[] example()
>> {
>>     scope example = [1, 2, 3];
>>     return example;
>> }
>> ```
>
> Does this also apply to @safe?

Apparently.

I can't find any checks in the PR.

REVERT REVERT REVERT (or ya know add the check for @safe).

#SuddenlyWorried lol

Oct 24 2022

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 25 October 2022 at 02:13:20 UTC, rikki cattermole 
wrote:
> On 25/10/2022 3:09 PM, Paul Backus wrote:
>> For example, the `@trusted` function below is memory safe when 
>> using the current compiler release, but will become unsafe 
>> when compiled with DMD 2.101:
>>
>> ```d
>> @trusted int[] example()
>> {
>>     scope example = [1, 2, 3];
>>     return example;
>> }
>> ```
>
> Does this also apply to @safe?

No, because you are not allowed to return a `scope` variable in 
`@safe` code, even if you happen to know that it points to a heap 
allocation.

Oct 24 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/24/22 10:09 PM, Paul Backus wrote:
> On Tuesday, 25 October 2022 at 01:35:28 UTC, Steven Schveighoffer wrote:
>> Does the inferred `scope` make it so that the compiler is *allowed* to 
>> allocate the `[4, 5, 6]` literal on the stack? Keep in mind that I 
>> never put `scope` here, this is something the compiler did on its own.
>
> No, it does not. This capability was added only for array literals, and 
> only for variable initialization:
>
> DMD PR: https://github.com/dlang/dmd/pull/14562
> Spec PR (pending): https://github.com/dlang/dlang.org/pull/3442

OK, what about this?

```d
int[] mkarr() @trusted {
     int[3] arr = [1, 2, 3];
     int[] other = [4, 5, 6];

     auto foo = other;
     other = arr[];
     return foo;
}
```

`other` is inferred as `scope` (along with `foo`), because it touches 
`arr[]` later (but after it was pointing at what should have been heap 
memory). So does that count as possible for stack allocation, or is it 
still heap allocated?

-Steve

Oct 24 2022

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 25 October 2022 at 02:38:02 UTC, Steven Schveighoffer 
wrote:
> OK, what about this?
>
> ```d
> int[] mkarr() @trusted {
>     int[3] arr = [1, 2, 3];
>     int[] other = [4, 5, 6];
>
>     auto foo = other;
>     other = arr[];
>     return foo;
> }
> ```
>
> `other` is inferred as `scope` (along with `foo`), because it 
> touches `arr[]` later (but after it was pointing at what should 
> have been heap memory). So does that count as possible for 
> stack allocation, or is it still heap allocated?

When I compile the above with `@safe` and `-preview=dip1000`, I 
get

    Error: reference to local variable `arr` assigned to non-scope `other`

...using both DMD 2.100.2 and DMD master. So `scope` is not 
actually being inferred here, and the array is allocated on the 
heap.

My expectation is that `scope` will probably *never* be inferred 
for `other`, because doing multi-step inference like this 
requires dataflow analysis in the general case, which is 
something Walter wants to avoid (see discussion in [issue 
20674][1]). So I don't think you have anything to worry about.

Still, this is a good illustration of how silently changing the 
rules on people can have unintended consequences. If Walter ever 
*does* consider adding dataflow analysis, overly-aggressive 
"optimizations" like these could easily become obstacles in the 
way of that goal.

[1]: https://issues.dlang.org/show_bug.cgi?id=20674

Oct 24 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/24/22 10:59 PM, Paul Backus wrote:
> On Tuesday, 25 October 2022 at 02:38:02 UTC, Steven Schveighoffer wrote:
>> OK, what about this?
>>
>> ```d
>> int[] mkarr() @trusted {
>>     int[3] arr = [1, 2, 3];
>>     int[] other = [4, 5, 6];
>>
>>     auto foo = other;
>>     other = arr[];
>>     return foo;
>> }
>> ```
>>
>> `other` is inferred as `scope` (along with `foo`), because it touches 
>> `arr[]` later (but after it was pointing at what should have been heap 
>> memory). So does that count as possible for stack allocation, or is it 
>> still heap allocated?
>
> When I compile the above with `@safe` and `-preview=dip1000`, I get
>
>     Error: reference to local variable `arr` assigned to non-scope `other`

OK, I misread the error here, it's the same on run.dlang.io. But we did 
just go through an exercise where a struct not labeled scope is inferred 
scope not because of its declaration, but because of later things done 
with it. It doesn't seem to be the case here.

 
> ...using both DMD 2.100.2 and DMD master. So `scope` is not actually 
> being inferred here, and the array is allocated on the heap.
>
> My expectation is that `scope` will probably *never* be inferred for 
> `other`, because doing multi-step inference like this requires dataflow 
> analysis in the general case, which is something Walter wants to avoid 
> (see discussion in [issue 20674][1]). So I don't think you have anything 
> to worry about.

My biggest concern is that this inference takes priority over what is 
actually written, and then can cause memory problems to occur in code 
that seemingly reads like it shouldn't cause memory problems.

I'm trying to find a hole because I'm worried about that hole showing up 
without intention later (especially with the way the compiler can inline 
and rewrite code for optimization). The compiler doing things that are 
not checkable (I know of no way to introspect that something is scope 
inferred), hard to describe, and impossible to prevent makes things 
uncomfortable. Especially if the compiler might make disastrous 
decisions based on that inference.

It would be relieving to have some rule that says "any data inferred 
scope inside a `@system` or `@trusted` context without explicitly being 
declared scope shall not result in memory allocations hoisting to the 
stack". I can deal, begrudgingly, with compiler errors that are 
misguided. I can't deal with memory errors caused by the compiler 
knowing better than me.

-Steve

Oct 25 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/25/22 9:44 AM, Steven Schveighoffer wrote:
> But we did just go through an exercise where a struct not labeled scope 
> is inferred scope not because of its declaration, but because of later 
> things done with it.

It's very curious. I can't get any indication that the struct is 
inferred scope except by throwing an exception contained in it. If I 
declare the function @safe, it won't let me assign the scope variable to 
the struct member. If I mark it as @trusted, that succeeds, but then it 
won't let me throw the exception out of the struct because it says the 
struct is scope.

Again, with no way to tell whether scope is inferred, it's hard to judge.

-Steve

Oct 25 2022

Quirin Schroll <qs.il.paperinik gmail.com> writes:

On Tuesday, 25 October 2022 at 02:09:02 UTC, Paul Backus wrote:
> For example, the `@trusted` function below is memory safe when 
> using the current compiler release, but will become unsafe when 
> compiled with DMD 2.101:
>
> ```d
> @trusted int[] example()
> {
>     scope example = [1, 2, 3];
>     return example;
> }
> ```
>
> The worst part is that the potential memory corruption is 
> introduced silently. Users who upgrade to DMD 2.101 will have 
> no idea that the ground has shifted beneath their feet until 
> their code invokes UB at runtime.

Asking curiously, wasn’t the function UB before, but the behavior 
changed?

Oct 25 2022

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 25 October 2022 at 14:14:00 UTC, Quirin Schroll wrote:
> Asking curiously, wasn’t the function UB before, but the 
> behavior changed?

Until very recently, the language spec [1] said that a `scope` 
*parameter* "must not escape", but was silent on whether the same 
rule applied to `scope` local variables (although it would be 
reasonable to infer that it did).

At some point between the release of DMD 2.100.2 and current 
`master`, the spec was updated to additionally state that 
returning a `scope` variable from a function is "disallowed" [2].

So, yes, I think the most reasonable interpretation is that this 
was always intended to be UB. But I am not confident that the 
average D user would have *known* for certain it was UB at the 
time DMD 2.100.2 was released.

[1] https://dlang.org/spec/function.html#scope-parameters
[2] https://dlang.org/spec/attribute.html#scope

Oct 25 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
> In a `@trusted` function today, without dip1000, the above is perfectly 
> reasonable and not invalid. Will dip1000 make it corrupt memory?

A very good question. Clearly, having code work when it is @safe, but cause 
memory corruption when it is marked @trusted, is the wrong solution. This 
should never happen. I'm not sure what the solution should be here.

Oct 26 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 26/10/2022 9:03 PM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is 
>> perfectly reasonable and not invalid. Will dip1000 make it corrupt 
>> memory?
>
> A very good question. Clearly, having code work when it is @safe, but 
> cause memory corruption when it is marked @trusted, is the wrong 
> solution. This should never happen. I'm not sure what the solution 
> should be here.

At the very least, if no solution can be determined this needs to be 
reverted before 2.101.0.

Oct 26 2022

Nick Treleaven <nick geany.org> writes:

On Wednesday, 26 October 2022 at 08:37:36 UTC, rikki cattermole 
wrote:
> On 26/10/2022 9:03 PM, Walter Bright wrote:
>> A very good question. Clearly, having code work when it is 
>> @safe, but cause memory corruption when it is marked @trusted, 
>> is the wrong solution. This should never happen. I'm not sure 
>> what the solution should be here.
>
> At the very least, if no solution can be determined this needs 
> to be reverted before 2.101.0.

There's nothing to revert that corrupts memory (without 
incorrectly writing scope), see Paul's reply.

Oct 26 2022

German Diago <germandiago gmail.com> writes:

On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright 
wrote:

> A very good question. Clearly, having code work when it is 
> @safe, but cause memory corruption when it is marked @trusted, 
> is the wrong solution. This should never happen. I'm not sure 
> what the solution should be here.

Is not trusted code (note my little D experience so sorry if I am 
asking something relatively stupid) unsafe? I mean, @safe is 
safe, @trusted is ??, @system is you go your own.

- So what are the guarantees of @trusted compared to @system?


Also, as far as I understood from my limited D usage, only 
type[N] are static arrays on the stack and the rest are 
GC-allocated, by default, right?

So in the presence of scope, probably that should be a 
dynamically sized array that was to be "freed" by the GC and 
invalid at the end of the function.

I would assume a move can be done if the array is not static, 
independently of scope being there or not and an error if it is 
statically allocated, since the return type is type[] (without 
explicit size in the type).

Oct 26 2022

Salih Dincer <salihdb hotmail.com> writes:

On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago wrote:
> On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright 
> wrote:
>
>> A very good question. Clearly, having code work when it is 
>> @safe, but cause memory corruption when it is marked @trusted, 
>> is the wrong solution. This should never happen. I'm not sure 
>> what the solution should be here.
>
> Is not trusted code (note my little D experience so sorry if I 
> am asking something relatively stupid) unsafe? I mean, @safe is 
> safe, @trusted is ??, @system is you go your own.

@safe: it's like a seat belt. You can take your children, who 
come to see their uncle at the weekend, by car with their seat 
belts.

@trusted: it's like an uncle who didn't crash with his tractor. 
You can take your children around the field with their uncles by 
tractor.

We trust the uncle, but even if he did not have an accident, this 
2nd situation is not safe.

SDB 79

Oct 26 2022

tsbockman <thomas.bockman gmail.com> writes:

On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago wrote:
> Is not trusted code (note my little D experience so sorry if I 
> am asking something relatively stupid) unsafe? I mean, @safe is 
> safe, @trusted is ??, @system is you go your own.
>
> - So what are the guarantees of @trusted compared to @system?

A `@safe` function is guaranteed by the compiler to be memory 
safe to call from other `@safe` code with (almost) any possible 
arguments and under (almost) any circumstances.

A `@trusted` function is guaranteed by its author to be memory 
safe to call from other `@safe` code with (almost) any possible 
arguments and under (almost) any circumstances.

A `@system` function may require the caller to follow additional 
rules beyond those enforced by the compiler, even in `@safe` 
code, to maintain memory safety. Since the compiler does not know 
what these additional rules are and cannot enforce them 
automatically, calling `@system` functions directly from `@safe` 
code is forbidden.

| Attribute  | Must check definition | Must check each caller |
|------------|-----------------------|------------------------|
| `@safe`    | compiler              | compiler               |
| `@trusted` | programmer            | compiler               |
| `@system`  | programmer            | programmer             |

Assume the function is implemented correctly, then try to figure 
out how to call the function from `@safe` code in a way that 
violates memory safety. If there is a way to do so, the function 
should be `@system`.

Otherwise, it should be `@safe` if that compiles, or `@trusted` 
if not.

Oct 26 2022
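
The three-way split described above can be sketched as follows. This is a minimal illustration, not taken from any library; the names `sumRaw`, `sum`, and `caller` are hypothetical. The raw-pointer function has a rule the compiler cannot check, so it is `@system`. The wrapper takes a slice, which carries its own length, so its author can vouch for it as `@trusted`.

```d
// @system: the caller must guarantee that `p` points to at least `n` ints.
// The compiler cannot verify that rule, so @safe code may not call this.
int sumRaw(const(int)* p, size_t n) @system
{
    int total = 0;
    foreach (i; 0 .. n)
        total += p[i]; // unchecked pointer indexing
    return total;
}

// @trusted: safe for any slice, because ptr and length always agree.
// The author, not the compiler, is responsible for this claim.
int sum(const(int)[] values) @trusted
{
    return sumRaw(values.ptr, values.length);
}

// @safe: the compiler checks everything here, including the call to `sum`.
int caller() @safe
{
    int[] data = [1, 2, 3];
    return sum(data);
}
```

Applying the test from the post: no `@safe` call to `sum` can break memory safety, whereas `sumRaw(ptr, 100)` with a three-element buffer can, which is why only `sumRaw` has to stay `@system`.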

Quirin Schroll <qs.il.paperinik gmail.com> writes:

On Wednesday, 26 October 2022 at 20:24:38 UTC, tsbockman wrote:
> On Wednesday, 26 October 2022 at 10:43:11 UTC, German Diago 
> wrote:
>> Is not trusted code (note my little D experience so sorry if I 
>> am asking something relatively stupid) unsafe? I mean, @safe 
>> is safe, @trusted is ??, @system is you go your own.
>>
>> - So what are the guarantees of @trusted compared to @system?
>
> A `@safe` function is guaranteed by the compiler to be memory 
> safe to call from other `@safe` code with (almost) any possible 
> arguments and under (almost) any circumstances.
>
> A `@trusted` function is guaranteed by its author to be memory 
> safe to call from other `@safe` code with (almost) any possible 
> arguments and under (almost) any circumstances.

The “(almost)” should be absent. If you mean something other than 
compiler bugs, please tell us.

> A `@system` function may require the caller to follow 
> additional rules beyond those enforced by the compiler, even in 
> `@safe` code, to maintain memory safety. Since the compiler 
> does not know what these additional rules are and cannot 
> enforce them automatically, calling `@system` functions 
> directly from `@safe` code is forbidden.
>
> | Attribute  | Must check definition | Must check each caller |
> |------------|-----------------------|------------------------|
> | `@safe`    | compiler              | compiler               |
> | `@trusted` | programmer            | compiler               |
> | `@system`  | programmer            | programmer             |
>
> Assume the function is implemented correctly, then try to 
> figure out how to call the function from `@safe` code in a way 
> that violates memory safety. If there is a way to do so, the 
> function should be `@system`.
>
> Otherwise, it should be `@safe` if that compiles, or `@trusted` 
> if not.

I agree with the characterization of `@safe` and `@system`. For 
`@trusted` functions, there’s something more to say:
* Widely accessible ones (e.g. `public`, `package`, `protected`, 
even `private` in a big module) should have a `@safe` interface, 
i.e. you can use them like `@safe` functions in all regards; they 
just aren’t `@safe` because of some implementation details.
* Narrowly accessible ones (e.g. `private` (in a small module), 
local functions, immediately executed lambdas) can have a 
`@system` interface, but their surroundings can be trusted to use 
the function correctly.

Oct 27 2022

ag0aep6g <anonymous example.com> writes:

On 27.10.22 19:39, Quirin Schroll wrote:
> I agree with the characterization of `@safe` and `@system`. For 
> `@trusted` functions, there’s something more to say:
> * Widely accessible ones (e.g. `public`, `package`, `protected`, even 
> `private` in a big module) should have a `@safe` interface, i.e. you can 
> use them like `@safe` functions in all regards; they just aren’t `@safe` 
> because of some implementation details.

Every single @trusted function must have a safe interface. That includes 
local functions and immediately called literals.

> * Narrowly accessible ones (e.g. `private` (in a small module), local 
> functions, immediately executed lambdas) can have a `@system` interface, 
> but their surroundings can be trusted to use the function correctly.

You say it yourself: In that case, the surroundings need to be @trusted. 
The function that is being called can only be @system when it doesn't 
have a safe interface.

Oct 27 2022

tsbockman <thomas.bockman gmail.com> writes:

On Thursday, 27 October 2022 at 17:39:14 UTC, Quirin Schroll 
wrote:
> On Wednesday, 26 October 2022 at 20:24:38 UTC, tsbockman wrote:
>> A `@safe` function is guaranteed by the compiler to be memory 
>> safe to call from other `@safe` code with (almost) any 
>> possible arguments and under (almost) any circumstances.
>>
>> A `@trusted` function is guaranteed by its author to be memory 
>> safe to call from other `@safe` code with (almost) any 
>> possible arguments and under (almost) any circumstances.
>
> The “(almost)” should be absent. If you mean something other 
> than compiler bugs, please tell us.

In practice, `@safe` code depends upon a guard page to catch 
`null` pointer dereferences. If a struct field or static array 
element is at a sufficiently large offset from the `null` 
pointer, this can theoretically result in a silent buffer 
overrun. As far as I can tell, this is not considered a bug, but 
rather a reasonable trade-off for improved performance.

Also, doing anything at all is [officially undefined 
behavior](https://dlang.org/spec/expression.html#assert_expressions) 
after a failed assertion. This is, again, theoretically problematic 
because debug builds may call user code to prepare or log the 
`AssertError`.

There are probably other obscure cases like these, as well, which 
`@safe` and `@trusted` functions are not responsible for handling 
correctly.

> I agree with the characterization of `@safe` and `@system`. For 
> `@trusted` functions, there’s something more to say:
> ...
> * Narrowly accessible ones (e.g. `private` (in a small module), 
> local functions, immediately executed lambdas) can have a 
> `@system` interface, but their surroundings can be trusted to 
> use the function correctly.

My characterization [agrees with the language 
spec](https://dlang.org/spec/function.html#trusted-functions), 
yours does not. You are essentially redefining `@trusted` to mean 
"ignore any memory safety issues here", instead of what it 
is actually intended to mean, "trust me, this is actually memory 
safe".

Oct 27 2022
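
The guard-page caveat above can be made concrete with a small sketch. The struct `Big` and the function `readField` are hypothetical, and the 1 MiB figure is an arbitrary assumption chosen to exceed a typical page size. With a `null` argument, `readField` would access address `0 + Big.field.offsetof` rather than address 0; whether that address is unmapped depends on the platform.

```d
struct Big
{
    ubyte[1 << 20] padding; // 1 MiB of data placed before the field
    int field;
}

// The field sits a full mebibyte past the start of the struct.
static assert(Big.field.offsetof == 1 << 20);

// Accepted as @safe: the language relies on the hardware to trap a null
// dereference, which is only dependable for small offsets from null.
int readField(Big* p) @safe
{
    return p.field;
}
```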

Dukc <ajieskola gmail.com> writes:

On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright 
wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is 
>> perfectly reasonable and not invalid. Will dip1000 make it 
>> corrupt memory?
>
> A very good question. Clearly, having code work when it is 
> @safe, but cause memory corruption when it is marked @trusted, 
> is the wrong solution. This should never happen. I'm not sure 
> what the solution should be here.

It's not quite exactly that. The code in question fails with 
`@safe`.

The problem is that Steven's `@trusted` code not only happens to 
work, but is defined behaviour without dip1000, yet undefined 
behaviour with `-preview=dip1000`.

My proposal: disable local variable `scope` inference for 
`@system` and `@trusted` code. This has the downside that it's 
difficult to test whether the implementation really turns the 
inference off. But unless we're ready to ditch `scope` inference 
altogether I can't come up with anything better.

Oct 26 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/26/22 8:49 AM, Dukc wrote:
> On Wednesday, 26 October 2022 at 08:03:37 UTC, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is 
>>> perfectly reasonable and not invalid. Will dip1000 make it corrupt 
>>> memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but 
>> cause memory corruption when it is marked @trusted, is the wrong 
>> solution. This should never happen. I'm not sure what the solution 
>> should be here.
>
> It's not quite exactly that. The code in question fails with `@safe`.
>
> The problem is that Steven's `@trusted` code not only happens to work, 
> but is defined behaviour without dip1000, yet undefined behaviour with 
> `-preview=dip1000`.

Yes, maybe. I don't know if it's UB, because I don't know the 
rules/philosophy.

 
> My proposal: disable local variable `scope` inference for `@system` and 
> `@trusted` code. This has the downside that it's difficult to test 
> whether the implementation really turns the inference off. But unless 
> we're ready to ditch `scope` inference altogether I can't come up with 
> anything better.

This is a possibility. I don't know the consequences of this, especially 
for template code.

-Steve

Oct 26 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/26/22 4:03 AM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is 
>> perfectly reasonable and not invalid. Will dip1000 make it corrupt 
>> memory?
>
> A very good question. Clearly, having code work when it is @safe, but 
> cause memory corruption when it is marked @trusted, is the wrong 
> solution. This should never happen. I'm not sure what the solution 
> should be here.

I should be clear here -- the code does *not* compile in @safe code, but 
is perfectly reasonable as @trusted code.

What I don't want is the compiler taking actions based on scope 
inference that cause memory corruption.

I get that we can say "if it wouldn't compile in @safe, it's on you to 
make sure it doesn't corrupt memory as @trusted". But if the reason it's 
unsafe is not because of things you wrote, but because of compiler 
inference (as in this case), then the compiler should either not do the 
inference, or not hoist allocations to the stack based on that inference.

A philosophy/statement to that effect should be satisfactory. The last 
thing we want dip1000 to do is *cause* memory corruption.

-Steve

Oct 26 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 10/26/2022 7:38 AM, Steven Schveighoffer wrote:
> On 10/26/22 4:03 AM, Walter Bright wrote:
>> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>>> In a `@trusted` function today, without dip1000, the above is perfectly 
>>> reasonable and not invalid. Will dip1000 make it corrupt memory?
>>
>> A very good question. Clearly, having code work when it is @safe, but cause 
>> memory corruption when it is marked @trusted, is the wrong solution. This 
>> should never happen. I'm not sure what the solution should be here.
>
> I should be clear here

I understood the issue <g>.

> The last thing we 
> want dip1000 to do is *cause* memory corruption.

We're in full agreement here.

Oct 26 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 10/26/2022 1:03 AM, Walter Bright wrote:
> On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
>> In a `@trusted` function today, without dip1000, the above is perfectly 
>> reasonable and not invalid. Will dip1000 make it corrupt memory?
>
> A very good question. Clearly, having code work when it is @safe, but cause 
> memory corruption when it is marked @trusted, is the wrong solution. This 
> should never happen. I'm not sure what the solution should be here.

[Some more thinking about the problem]

The question is when is [1,2,3] allocated on the stack, and when is it 
allocated on the GC heap?

Some points:

1. in C it is allocated on the stack. D's behavior to allocate it on the heap 
is kinda surprising in that light, even though D had such literals before C did

2. allocating on the heap means it is unusable in @nogc code

3. when writing expressions, the only way to get it on the stack is to assign 
it to a scope variable, which is inconvenient and inefficient

4. it runs against the idea that the simpler code should be more efficient than 
the complex code

Therefore, I suggest the following:

     [1,2,3] is always allocated on the stack

     [1,2,3].dup is always allocated on the heap

and thus, its behavior is not dependent on inference.

How we transition to this, we'll have to figure out.

Oct 26 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/26/22 8:57 PM, Walter Bright wrote:
 On 10/26/2022 1:03 AM, Walter Bright wrote:
 On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
 In a ` trusted` function today, without dip1000, the above is 
 perfectly reasonable and not invalid. Will dip1000 make it corrupt 
 memory?

 A very good question. Clearly, having code work when it is @safe, but
 cause memory corruption when it is marked @trusted, is the wrong
 solution. This should never happen. I'm not sure what the solution 
 should be here.

 
 [Some more thinking about the problem]
 
 The question is when is [1,2,3] allocated on the stack, and when is it 
 allocated on the GC heap?
 
 Some points:
 
 1. in C it is allocated on the stack. D's behavior to allocate it on the 
 heap is kinda surprising in that light, even though D had such literals 
 before C did
 
 2. allocating on the heap means it is unusable in @nogc code
 
 3. when writing expressions, the only way to get it on the stack is to 
 assign it to a scope variable, which is inconvenient and inefficient
 
 4. it runs against the idea that the simpler code should be more 
 efficient than the complex code
 
 Therefore, I suggest the following:
 
      [1,2,3] is always allocated on the stack
 
      [1,2,3].dup is always allocated on the heap
 
 and thus, its behavior is not dependent on inference.
 
 How we transition to this, we'll have to figure out.

Please no! We can allocate on the stack by explicitly requesting it:

```d
int[3] arr = [1, 2, 3];
```

The issue is the DRYness of it. This has been proposed before, just:

```d
int[$] arr = [1, 2, 3];
```

If we are going to fix something, let's fix this! It's backwards 
compatible too.

If anything, the compiler can just punt and say all array literals that 
aren't immediately assigned to static arrays are allocated on the heap. 
Then it's consistent.
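
To make the distinction concrete, here is a minimal sketch of how the declared type decides where a literal's elements live in current D. The function names are hypothetical and exist only for illustration.

```d
// Static array: the three ints are stored in the variable itself,
// so copying the variable copies the elements.
int firstAfterCopy()
{
    int[3] fixed = [1, 2, 3];
    static assert(is(typeof(fixed) == int[3]));

    auto copy = fixed;   // independent copy of all three elements
    copy[0] = 42;
    return fixed[0];     // the original is unchanged
}

// Dynamic array: `auto` infers int[], a slice (pointer + length) that
// refers to storage allocated for the literal, by the language rules on the GC heap.
int firstAfterAlias()
{
    auto dyn = [1, 2, 3];
    static assert(is(typeof(dyn) == int[]));

    auto other = dyn;    // copies only the slice, not the elements
    other[0] = 42;
    return dyn[0];       // the write is visible through both slices
}
```

The difference in copy semantics is what makes the static-array form safe to keep on the stack: nothing outside the variable refers to its storage unless it is sliced explicitly.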

Allocating array literals on the heap is *awesome*, please don't change 
that! D is one of the best learning languages for high-performance code 
because you don't have to worry at all about memory management out of 
the box. I'm actually OK with backends using stack allocations because 
it can prove they aren't escaping, why can't we just rely on that?

-Steve

Oct 26 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 10/26/2022 6:26 PM, Steven Schveighoffer wrote:
 Please no! We can allocate on the stack by explicitly requesting it:
 
 ```d
 int[3] arr = [1, 2, 3];
 ```
 
 The issue is the DRYness of it. This has been proposed before, just:
 
 ```d
 int[$] arr = [1, 2, 3];
 ```

How would this be done:

     foo([1,2,3] + a)

i.e. using an array literal in places other than an initialization?


 If we are going to fix something, let's fix this! It's backwards compatible too.
 
 If anything, the compiler can just punt and say all array literals that aren't 
 immediately assigned to static arrays are allocated on the heap. Then it's 
 consistent.

And inefficient.


 Allocating array literals on the heap is *awesome*, please don't change that!
 D is one of the best learning languages for high-performance code because you
 don't have to worry at all about memory management out of the box. I'm actually
 OK with backends using stack allocations because it can prove they aren't
 escaping, why can't we just rely on that?

I thought your test case showed the problem with that :-/

Oct 27 2022

Steven Schveighoffer <schveiguy gmail.com> writes:

On 10/27/22 9:44 AM, Walter Bright wrote:
 On 10/26/2022 6:26 PM, Steven Schveighoffer wrote:
 Please no! We can allocate on the stack by explicitly requesting it:

 ```d
 int[3] arr = [1, 2, 3];
 ```

 The issue is the DRYness of it. This has been proposed before, just:

 ```d
 int[$] arr = [1, 2, 3];
 ```

 
 How would this be done:
 
      foo([1,2,3] + a)

Already works today, except I don't know what the + a means:

foo([1, 2, 3].staticArray);
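
A slightly fuller sketch of the `staticArray` approach, assuming Phobos' `std.array.staticArray`. The callee `sum` and the wrapper `sumOfLiteral` are hypothetical names used only for illustration, and the literal is stored in a local first so that the slice passed on clearly refers to live stack storage.

```d
import std.array : staticArray;

// Hypothetical callee; `scope` states that the slice is not retained after the call.
int sum(scope const(int)[] xs) @nogc nothrow pure
{
    int total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

int sumOfLiteral() @nogc nothrow pure
{
    // staticArray infers the length from the literal, so `arr` is an int[3]
    // and no GC allocation is needed (the function is @nogc).
    auto arr = [1, 2, 3].staticArray;
    static assert(is(typeof(arr) == int[3]));

    // Slice the local for the duration of the call only.
    return sum(arr[]);
}
```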

 If we are going to fix something, let's fix this! It's backwards 
 compatible too.

 If anything, the compiler can just punt and say all array literals 
 that aren't immediately assigned to static arrays are allocated on the 
 heap. Then it's consistent.

 
 And inefficient.

Inefficiencies that are taken care of by modern backends, such as llvm 
and gcc.

 Allocating array literals on the heap is *awesome*, please don't 
 change that! D is one of the best learning languages for 
 high-performance code because you don't have to worry at all about 
 memory management out of the box. I'm actually OK with backends using 
 stack allocations because it can prove they aren't escaping, why can't 
 we just rely on that?

 
 I thought your test case showed the problem with that :-/
 

Backends that put it on the stack are not using language constructs such 
as scope to make assumptions, they are using actual analysis of the 
control flow to prove that it doesn't escape.

-Steve

Oct 27 2022

German Diago <germandiago gmail.com> writes:

On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
 On 10/26/2022 1:03 AM, Walter Bright wrote:
 On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
 In a `@trusted` function today, without dip1000, the above is
 perfectly reasonable and not invalid. Will dip1000 make it 
 corrupt memory?

 
 A very good question. Clearly, having code work when it is 
 @safe, but cause memory corruption when it is marked @trusted,
 is the wrong solution. This should never happen. I'm not sure 
 what the solution should be here.
 

 [Some more thinking about the problem]

 The question is when is [1,2,3] allocated on the stack, and 
 when is it allocated on the GC heap?

 Some points:

 1. in C it is allocated on the stack. D's behavior to allocate 
 it on the heap is kinda surprising in that light, even though D 
 had such literals before C did

 2. allocating on the heap means it is unusable in @nogc code

 3. when writing expressions, the only way to get it on the 
 stack is to assign it to a scope variable, which is 
 inconvenient and inefficient

 4. it runs against the idea that the simpler code should be 
 more efficient than the complex code

 Therefore, I suggest the following:

     [1,2,3] is always allocated on the stack

     [1,2,3].dup is always allocated on the heap

As a person who has used D but not extensively, I was surprised by
type[] vs type[N] behavior all the time. I agree that [1, 2, 3]
should allocate on the stack, but I am not sure how much code that
could break. For example, if before it was on the heap, what
happens with this now?

int[] func() {
   // Allocated on the stack, I presume that is not safe; should add .dup?
   int[] v = [1, 2, 3];
   return v;
}

How should it work?

Oct 27 2022

Quirin Schroll <qs.il.paperinik gmail.com> writes:

On Thursday, 27 October 2022 at 09:36:25 UTC, German Diago wrote:
 On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright 
 wrote:
 On 10/26/2022 1:03 AM, Walter Bright wrote:
 On 10/24/2022 6:35 PM, Steven Schveighoffer wrote:
 In a `@trusted` function today, without dip1000, the above
 is perfectly reasonable and not invalid. Will dip1000 make 
 it corrupt memory?

 
 A very good question. Clearly, having code work when it is 
 @safe, but cause memory corruption when it is marked
 @trusted, is the wrong solution. This should never happen.
 I'm not sure what the solution should be here.
 

 [Some more thinking about the problem]

 The question is when is `[1,2,3]` allocated on the stack, and 
 when is it allocated on the GC heap?

 Some points:

 1. in C it is allocated on the stack. D's behavior to allocate 
 it on the heap is kinda surprising in that light, even though 
 D had such literals before C did

 2. allocating on the heap means it is unusable in `@nogc` code

 3. when writing expressions, the only way to get it on the 
 stack is to assign it to a scope variable, which is 
 inconvenient and inefficient

 4. it runs against the idea that the simpler code should be 
 more efficient than the complex code

 Therefore, I suggest the following:
 ```d
 [1,2,3] // is always allocated on the stack

 [1,2,3].dup // is always allocated on the heap
 ```

 As a person who has used D but not extensively, I was surprised
 by `type[]` vs `type[N]` behavior all the time. I agree that
 `[1, 2, 3]` should allocate on the stack, but I am not sure how
 much code that could break. For example, if before it was on
 the heap, what happens with this now?
 ```d
 int[] func() {
   // Allocated on the stack, I presume that is not safe; should add .dup?
   int[] v = [1, 2, 3];
   return v;
 }
 ```
 How should it work?

If `[1, 2, 3]` is stack allocated, it should not compile (at
least not in `@safe` code, probably not in `@system` code
either). The problem is not the assignment to `v` (that is of the
same kind as a pointer to a local variable), but that its value
is returned, thus leaking the address of a local.

Oct 27 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 10/27/2022 2:36 AM, German Diago wrote:
 As a person who has used D but not extensively, I was surprised by type[] vs
 type[N] behavior all the time. I agree that [1, 2, 3] should allocate on the
 stack, but I am not sure how much code that could break. For example, if before
 it was on the heap, what happens with this now?

You'll get an error on [1]

 int [] func() {
    // Allocated in the stack, I presume that not safe, should add .dup?
    int[] v  = [1, 2, 3];
    return v;  // [1]
 }
 
 How should it work?

Add .dup for those that need the array to survive the function.
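
A minimal sketch of that pattern, written so that it is valid D today: under current rules the literal assigned to an `int[]` is already heap-allocated, so an explicit `int[3]` local stands in here for what a stack-allocated literal would be under the proposal. The body is an assumption for illustration, not the proposed semantics themselves.

```d
// Returns data that must outlive the call, so it is copied to the GC heap.
int[] func()
{
    int[3] v = [1, 2, 3]; // stack storage, gone when func returns
    return v[].dup;       // .dup makes a fresh GC-allocated copy that is safe to return
}
```

Returning `v[]` directly instead would be rejected by the compiler, since it escapes a reference to the local `v`.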

Oct 27 2022

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
     [1,2,3] is always allocated on the stack

Why isn't this an immutable constant, just like a string literal?
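
For comparison, a sketch of what is already possible when the immutable storage is requested explicitly. The names `table` and `second` are hypothetical; the point is only that an immutable module-level array needs no GC allocation at run time, much like a string literal.

```d
// Immutable table placed in the program's static data; initialized at compile time.
immutable int[3] table = [1, 2, 3];

int second() @nogc nothrow
{
    // Slicing static data yields an immutable(int)[] that stays valid
    // for the lifetime of the program, so no allocation or copy is needed.
    immutable(int)[] view = table[];
    return view[1];
}
```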

Oct 27 2022

Dukc <ajieskola gmail.com> writes:

On Thursday, 27 October 2022 at 00:57:47 UTC, Walter Bright wrote:
 Therefore, I suggest the following:

     [1,2,3] is always allocated on the stack

Please no. Far too much breakage for the value (even without 
going to the question whether it'd be added value in the first 
place).

 2. allocating on the heap means it is unusable in @nogc code

The compiler will error, and the programmer can manually fix it.
No silent errors. `@nogc` code is still a bit of a special case;
GC-using code is the norm we want to optimise the language for.

 3. when writing expressions, the only way to get it on the 
 stack is to assign it to a scope variable, which is 
 inconvenient and inefficient

The compiler is still free to optimise those as a stack
allocation, if it can prove there's no escaping of the data.
`scope` is just used to enforce that being the case in `@safe`,
or to give the compiler permission to assume that being the
case in `@trusted` and `@system`.

Oct 27 2022
