digitalmars.D.learn - Actual lifetime of static array slices?


Elfstone <elfstone yeah.net> writes:

I failed to find any documentation, except dynamic array slices 
will be taken care of by GC, but I assume it's not the case with 
static arrays.

But the code below doesn't behave as I expected.

     int[] foo()
     {
     	int[1024] static_array;
     	// return static_array[]; // Error: returning `static_array[]` escapes a reference to local variable `static_array`
         return null;
     }

     class A
     {
     	this(int[] inData)
     	{
     		data = inData;
     	}

     	int[] data;
     }

     void main()
     {
     	int[] arr;
     	A a;
     	{
     		int[1024] static_array;
     		arr = aSlice; // OK
     		a = new A(aSlice); // OK
     		arr = foo();
     		//arr = foo();

     	}
     }

By assigning aSlice to arr or a, it seemingly escapes the scope, 
I thought there'd be errors, but the code compiles just fine.

Is it really safe though?

Nov 14 2022

Mike Parker <aldacron gmail.com> writes:

On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 I failed to find any documentation, except dynamic array slices 
 will be taken care of by GC, but I assume it's not the case 
 with static arrays.

A slice is a view on the existing memory owned by the original 
array. No allocations are made for the slice. The GC will track 
all references to the memory allocated for a dynamic array, so as 
long as any slices remain alive, so will the original memory.

Static arrays declared as local variables are allocated on the stack 
and become invalid when they leave a function scope. In turn, so would 
any slices or other pointers that reference that stack memory.
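
To make the "view" idea concrete, here is a minimal sketch (the function name `writeThroughSlice` is made up for illustration). It only shows that a slice and the array it was taken from refer to the same memory, so a write through one is visible through the other:

```d
/// Writes through a slice and returns what the original array sees.
int writeThroughSlice()
{
    int[] original = [1, 2, 3, 4, 5]; // GC-allocated dynamic array
    int[] view = original[1 .. 4];    // no allocation: pointer + length into `original`
    view[0] = 20;                     // also changes original[1]
    return original[1];
}
```

Calling `writeThroughSlice()` returns the value written through `view`, not the original `2`. The same sharing applies to a slice of a stack-allocated static array, which is why such a slice dangles once the stack memory is gone.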

 But the code below doesn't behave as I expected.

     int[] foo()
     {
     	int[1024] static_array;
     	// return static_array[]; // Error: returning `static_array[]` escapes a reference to local variable `static_array`
         return null;
     }

     class A
     {
     	this(int[] inData)
     	{
     		data = inData;
     	}

     	int[] data;
     }

     void main()
     {
     	int[] arr;
     	A a;
     	{
     		int[1024] static_array;
     		arr = aSlice; // OK
     		a = new A(aSlice); // OK
     		arr = foo();
     		//arr = foo();

     	}
     }

 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?

It's not the scope that matters here. It's the stack. Memory 
allocated in the inner scope uses the function stack, so it's all 
valid until the function exits.

Nov 14 2022

Mike Parker <aldacron gmail.com> writes:

On Tuesday, 15 November 2022 at 02:49:55 UTC, Mike Parker wrote:

 It's not the scope that matters here. It's the stack. Memory 
 allocated in the inner scope uses the function stack, so it's 
 all valid until the function exits.

And that was just so, so wrong. Of course destructors get called 
when scopes exit, etc.

Nov 14 2022

Siarhei Siamashka <siarhei.siamashka gmail.com> writes:

On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?

No, it's not safe. You can add a `@safe:` line at the beginning of 
your program and it will fail to compile (after renaming 
static_array to aSlice):

     test.d(27): Error: address of variable `aSlice` assigned to 
`arr` with longer lifetime

By default everything is assumed to be `@system` and the compiler 
silently allows you to shoot yourself in the foot. See 
https://dlang.org/spec/memory-safe-d.html

Nov 14 2022

Elfstone <elfstone yeah.net> writes:

On Tuesday, 15 November 2022 at 02:50:44 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?

 No, it's not safe. You can add a `@safe:` line at the beginning 
 of your program and it will fail to compile (after renaming 
 static_array to aSlice):

     test.d(27): Error: address of variable `aSlice` assigned to 
 `arr` with longer lifetime

 By default everything is assumed to be `@system` and the compiler 
 silently allows you to shoot yourself in the foot. See 
 https://dlang.org/spec/memory-safe-d.html

Thanks, `@safe` works for my first code, but the following code still 
compiles.

     import std.stdio : writeln;

     class A
     {
         @safe
         this(int[] inData)
         {
             data = inData; // stores the slice beyond the constructor call
         }

         int[] data;
     }

     @safe
     int[] foo()
     {
         int[1024] static_array;
         // return static_array[]; // Error: returning `static_array[]` escapes a reference to local variable `static_array`
         return null;
     }

     @safe
     A bar()
     {
         int[1024] static_array;
         return new A(static_array[]); // compiles, but the slice outlives static_array
     }

     @safe
     void main()
     {
         auto a = bar();
         writeln(a.data); // OK, but writes garbage
     }

So the compiler detects escaping in foo() but not in bar(), this 
doesn't look right.

Is there a way to tell whether a slice is from a dynamic array or 
a static array?

Nov 14 2022

Siarhei Siamashka <siarhei.siamashka gmail.com> writes:

On Tuesday, 15 November 2022 at 03:05:30 UTC, Elfstone wrote:
 So the compiler detects escaping in foo() but not in bar(), 
 this doesn't look right.

The compiler can detect it with -dip1000 command line option.

 Is there a way to tell whether a slice is from a dynamic array 
 or a static array?

For debugging purposes? Maybe find the stack boundaries and check 
if the address is in stack?
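
A related check that is easy to sketch, assuming the practical question is "is this memory owned by the GC?" rather than "is it on the stack?": `GC.addrOf` from `core.memory` returns `null` for a pointer that is not inside GC-managed memory. This is a different technique from the stack-boundary idea above, the helper name `isGcOwned` is hypothetical, and it is meant as a debugging aid only:

```d
import core.memory : GC;

/// Returns true when `slice` points into a memory block managed by the GC.
/// (Hypothetical helper for debugging, not a safety guarantee.)
bool isGcOwned(T)(const(T)[] slice)
{
    // GC.addrOf yields the base address of the containing GC block,
    // or null when the pointer is not in GC memory at all.
    return slice.ptr !is null
        && GC.addrOf(cast(const(void)*) slice.ptr) !is null;
}
```

A `false` result only says the memory is not GC-managed: it could be on the stack, in static data, or come from `malloc`. A static array that is a member of a GC-allocated class instance would report `true`.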

Nov 14 2022

Elfstone <elfstone yeah.net> writes:

On Tuesday, 15 November 2022 at 03:18:17 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 03:05:30 UTC, Elfstone wrote:
 So the compiler detects escaping in foo() but not in bar(), 
 this doesn't look right.

 The compiler can detect it with -dip1000 command line option.

 Is there a way to tell whether a slice is from a dynamic array 
 or a static array?

 For debugging purposes? Maybe find the stack boundaries and 
 check if the address is in stack?

Great! This should be a builtin feature!


Any idea what supersedes it?

Nov 14 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 15/11/2022 5:10 PM, Elfstone wrote:

 Any idea what supersedes it?

The implementation.

Nov 14 2022

Elfstone <elfstone yeah.net> writes:

On Tuesday, 15 November 2022 at 04:10:37 UTC, rikki cattermole 
wrote:
 On 15/11/2022 5:10 PM, Elfstone wrote:

 Any idea what supersedes it?

 The implementation.

Cool.

Nov 14 2022

Ali Çehreli <acehreli yahoo.com> writes:

On 11/14/22 19:05, Elfstone wrote:

      @safe
      int[] foo()
      {
          int[1024] static_array;
          // return static_array[]; // Error: returning `static_array[]` escapes a reference to local variable `static_array`

That is trivial for the compiler to catch.

      return null;
      }

      @safe
      A bar()
      {
          int[1024] static_array;
          return new A(static_array[]);

That one requires the compiler to analyze the code at a deeper level. 
Yes, we are passing a slice to the A constructor, but we don't know 
whether the constructor will store the slice or simply use it. Even the 
following can be safe but impossible for the compiler to detect:

class A
{
     @safe
     this(int[] inData)
     {
         data = someCondition() ? new int[42] : inData; // (1)
         // ...
         if (someOtherCondition()) {
             data = null;                               // (2)
         }
     }
     // ...
}

(1) Whether we use inData depends on someCondition(). It may even be that, 
depending on someCondition(), bar() is never called in the program at all.

(2) data may never refer to 'static_array' after the constructor exits 
depending on someOtherCondition()

The code above may not be using good coding practices and humans may see 
the bugs if there are any, but only a slow (infinitely?) compiler can see 
through all the code. (I think `@live` will help with these cases.) 
Further, the support for "separate compilation" makes it impossible to 
see through function boundaries.

Additionally, we don't want the compiler to force us to copy all stack 
variables just in case.

In summary, you are right but the compiler cannot do anything about it 
in all cases and we wouldn't want it to spend infinite amount of time to 
try to determine everything.
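
One way to hand the compiler the missing information, assuming the `-dip1000` switch mentioned earlier in the thread, is to mark a parameter `scope` when the function only uses the slice and does not store it. The sketch below uses made-up names (`total`, `sumOfLocal`) and is not the constructor from the example above:

```d
/// Only reads `data`; `scope` promises the slice is not stored anywhere.
@safe int total(scope const(int)[] data)
{
    int sum = 0;
    foreach (value; data)
        sum += value;
    return sum;
}

/// Passing a slice of a stack array is fine here, because `total`
/// is not allowed to keep a reference to it.
@safe int sumOfLocal()
{
    int[4] local = [1, 2, 3, 4];
    return total(local[]);
}
```

With this annotation the check stays local to each function: the compiler verifies inside `total` that `data` does not escape, and callers only need to look at the signature. A constructor that stores its argument, like the one in `A`, cannot honestly be marked this way.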

Ali

Nov 14 2022

Siarhei Siamashka <siarhei.siamashka gmail.com> writes:

On Tuesday, 15 November 2022 at 06:44:16 UTC, Ali Çehreli wrote:
 In summary, you are right but the compiler cannot do anything 
 about it in all cases and we wouldn't want it to spend infinite 
 amount of time to try to determine everything.

Well, there's another way to look at it: 
https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html ('Unsafe 
Rust exists because, by nature, static analysis is conservative. 
When the compiler tries to determine whether or not code upholds 
the guarantees, it’s better for it to reject some valid programs 
than to accept some invalid programs. Although the code might be 
okay, **if the Rust compiler doesn’t have enough information to 
be confident, it will reject the code**. In these cases, you can 
use unsafe code to tell the compiler, “Trust me, I know what I’m 
doing.”')

Are you saying that the D safety model is different? In the sense 
that if the D compiler doesn’t have enough information to be 
confident, it will accept the code?

Nov 15 2022

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 15 November 2022 at 13:01:39 UTC, Siarhei Siamashka 
wrote:
 Well, there's another way to look at it: 
 https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html 
 ('Unsafe Rust exists because, by nature, static analysis is 
 conservative. When the compiler tries to determine whether or 
 not code upholds the guarantees, it’s better for it to reject 
 some valid programs than to accept some invalid programs. 
 Although the code might be okay, **if the Rust compiler doesn’t 
 have enough information to be confident, it will reject the 
 code**. In these cases, you can use unsafe code to tell the 
 compiler, “Trust me, I know what I’m doing.”')

 Are you saying that the D safety model is different? In the 
 sense that if the D compiler doesn’t have enough information to 
 be confident, it will accept the code?

D's safety model is the same. In `@safe` code, D will reject 
anything that the compiler cannot say for sure is memory safe. 
However, unlike in Rust, `@safe` is not the default in D, so you 
must mark your code as `@safe` manually if you want to benefit 
from these checks.

Nov 15 2022

Siarhei Siamashka <siarhei.siamashka gmail.com> writes:

On Tuesday, 15 November 2022 at 13:16:18 UTC, Paul Backus wrote:
 D's safety model is the same. In `@safe` code, D will reject 
 anything that the compiler cannot say for sure is memory safe. 
 However, unlike in Rust, `@safe` is not the default in D, so 
 you must mark your code as `@safe` manually if you want to 
 benefit from these checks.

I specifically asked for Ali's opinion. Because the context is 
that the compiler couldn't catch a memory safety bug in the code 
that was annotated as `@safe` (but without -dip1000) and Ali 
commented that "the compiler cannot do anything about it in all 
cases and we wouldn't want it to spend infinite amount of time to 
try to determine everything". This sounds like he justifies the 
compiler's failure and accepts this as something normal.

The https://dlang.org/spec/memory-safe-d.html page also provides 
a rather vague statement: "`@safe` functions have a number of 
restrictions on what they may do and are intended to disallow 
operations that may cause memory corruption". Which kinda means 
that it makes some effort to catch some memory safety bugs. This 
weasel language isn't very reassuring, compared to a very clear 
Rust documentation.

Nov 15 2022

Ali Çehreli <acehreli yahoo.com> writes:

On 11/15/22 06:05, Siarhei Siamashka wrote:

 Ali commented that "the
 compiler cannot do anything about it in all cases and we wouldn't want
 it to spend infinite amount of time to try to determine everything".

Yes, that's my understanding.

 This sounds like he justifies the compiler's failure and accepts this as
 something normal.

Despite my lack of computer science education, I think the compiler's 
failure in analyzing source code to determine all bugs is "normal". I 
base my understanding on the "halting problem" and the "separate 
compilation" feature that D supports.

 The https://dlang.org/spec/memory-safe-d.html page also provides a
 rather vague statement: "`@safe` functions have a number of restrictions
 on what they may do and are intended to disallow operations that may
 cause memory corruption". Which kinda means that it makes some effort to
 catch some memory safety bugs.

Exactly. My understanding is that `@safe` attempts to remove memory 
corruptions. `@live` is being worked on to improve the situation by 
tracking liveness of data.

 This weasel language isn't very
 reassuring, compared to a very clear Rust documentation.

That's spot on.

Ali

Nov 15 2022

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 15 November 2022 at 14:05:42 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 13:16:18 UTC, Paul Backus wrote:
 D's safety model is the same. In `@safe` code, D will reject 
 anything that the compiler cannot say for sure is memory safe. 
 However, unlike in Rust, `@safe` is not the default in D, so 
 you must mark your code as `@safe` manually if you want to 
 benefit from these checks.

 I specifically asked for Ali's opinion. Because the context is 
 that the compiler couldn't catch a memory safety bug in the 
 code that was annotated as `@safe` (but without -dip1000) and Ali 
 commented that "the compiler cannot do anything about it in all 
 cases and we wouldn't want it to spend infinite amount of time 
 to try to determine everything". This sounds like he justifies 
 the compiler's failure and accepts this as something normal.

 The https://dlang.org/spec/memory-safe-d.html page also 
 provides a rather vague statement: "`@safe` functions have a 
 number of restrictions on what they may do and are intended to 
 disallow operations that may cause memory corruption". Which 
 kinda means that it makes some effort to catch some memory 
 safety bugs. This weasel language isn't very reassuring, 
 compared to a very clear Rust documentation.

The goal of `@safe` is to ensure that memory corruption cannot 
possibly occur in `@safe` code, period--only in `@system` or 
`@trusted` code. If the documentation isn't clear about this, 
that's a failure of the documentation.

However, there are some known issues with `@safe` that require 
breaking changes to fix, and to make migration easier for 
existing code, those changes have been hidden behind the 
`-dip1000` flag. So in practice, if you are using `@safe` without 
`-dip1000`, you may run into compiler bugs that compromise memory 
safety.

That's what happened in your example. Slicing a stack-allocated 
static array *shouldn't* be allowed in `@safe` code without 
`-dip1000`, but the compiler allows it anyway, due to a bug, and 
the fix for that bug is enabled by the `-dip1000` switch.
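
Independently of the compiler switch, the dangling slice in `bar()` can be avoided by copying the data to the GC heap before it leaves the function. A minimal sketch, reusing the names from the example earlier in the thread (the value `42` is only there to make the copy observable):

```d
class A
{
    int[] data;

    @safe this(int[] inData)
    {
        data = inData; // stores the slice, so it must outlive the call
    }
}

@safe A bar()
{
    int[1024] static_array;
    static_array[0] = 42;
    // .dup allocates a GC-owned copy, so nothing refers to the stack
    // memory of static_array after bar() returns.
    return new A(static_array[].dup);
}
```

The trade-off is an extra allocation and copy per call, which is the cost Ali alluded to when he wrote that we don't want the compiler to copy stack variables "just in case".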

Nov 16 2022
