digitalmars.D - Memory safe in D

reply Alex <akornilov.82 mail.ru> writes:
Hello,

I am interested in D as a memory-safe language (maybe SafeD?) and 
have written some very simple code:

```d
@safe

import std.stdio;

class A {
	this() {
		writeln("[A] Constructor");
	}

	~this() {
		writeln("[A] Destructor");
	}

	void run() {
		writeln("[A] run");
	}
}

int main()
{
	A a;
	a.run();
	writeln("Hello, world!");
	return 0;
}
```

Output is:
```
C:\Software\D\dmd2\windows\bin64\dub.exe run --build-mode separate
     Starting Performing "debug" build using 
C:\Software\D\dmd2\windows\bin64\dmd.exe for x86_64.
     Building dlang-app ~master: building configuration 
[application]
      Linking dlang-app
      Running dlang-app.exe
Error Program exited with code -1073741819

Process finished with exit code 2
```

So I don't see any errors or warnings from the compiler when I use the 
uninitialized variable `a`, and I don't see any exception with a 
backtrace at runtime (the application is built in debug mode).

Is this expected behavior?
It does not look like a very safe approach, and it can lead to very 
unpleasant memory errors...
Mar 11 2024
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 11/03/2024 9:16 PM, Alex wrote:
 So I don't see any errors or warnings from compiler when I use 
 uninitialized variable |a| and don't see any exception with backtrace in 
 runtime (application is build in debug mode).
 
 Is it expected behavior? Looks like it is not very safe approach and can 
 lead to very unpleasant memory errors...
This is expected behavior. The variable a was default initialized to null. D has not got type state analysis as part of it, so it cannot detect this situation and cause an error. It is at the top of my todo list for memory safety research for D, as the IR it requires enables other analysis and provides a framework for it to exist in.
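
A minimal sketch of what default initialization means here, assuming a cut-down version of the class `A` from the first post (`run` returns a string instead of printing, and the helper `canRun` is hypothetical). It only shows that `A a;` yields a null reference that can be tested before use:

```d
import std.stdio;

class A {
    string run() @safe { return "[A] run"; }
}

// Hypothetical helper: true only when a method call on `a` is possible.
bool canRun(A a) @safe {
    return a !is null;
}

void main() @safe {
    A a;                    // default-initialized: `a` is null, no object exists
    assert(a is null);
    if (canRun(a))
        writeln(a.run());   // skipped, because `a` is null

    a = new A();            // now `a` refers to a real instance
    assert(canRun(a));
    writeln(a.run());
}
```

The check is manual; nothing in the language forces it, which is the gap described above.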
Mar 11 2024
next sibling parent reply Alex <akornilov.82 mail.ru> writes:
Oh... looks like null is also used for refs in D. It's sad :)
I thought it was used only for pointers in unsafe mode.
I think null safety is a very important feature in the modern world (maybe a "must have" :) ). It would be very nice to have such a feature in D, like in Kotlin for example.
So, as I understand it, the D team has an item on its TODO list to implement something like "null safety"?
Mar 11 2024
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 11/03/2024 11:20 PM, Alex wrote:
 Oh... looks like null is also used for refs in D. It's sad :)
 I thought it used only for pointers in unsafe mode.
 I think, the null safety feature is very important in modern world 
 (maybe "must have" :) ). Very nice to have such feature in D like in 
 Kotlin for example.
 So, as I understand, D team have the task in TODO list about 
 implementation something like "null safety"?
I'm not sure I'd call myself part of the core D team (although I have another proposal that is currently going through the DIP process that would certainly qualify me for the title!). However in saying that, memory safety is on the foundation's radar as needing solving. I'm just the weirdo that is having a go at trying to solve temporal memory safety (an unsolved problem!).
Mar 11 2024
parent reply Alex <akornilov.82 mail.ru> writes:
Thank you for the information! Maybe you know: are there some people from the D Foundation here?

Also, I figured out that I can't handle the uninitialized access via try/catch:

```d
A a;
try {
    a.run();
} catch (Throwable) {
    writeln("Error");
}
```

So the catch branch does not run here.
Mar 11 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 11/03/2024 11:39 PM, Alex wrote:
 Thank you for the information! Maybe you know: are there some guys from D foundation here?
Yes, they are around including Walter, I'm sure he'll see it within the day.
 Also, I figured out that I can't handle uninitialized access via try/catch:
 
 ```d
 A a;
 try {
      a.run();
 } catch(Throwable) {
      writeln("Error");
 }
 ```
 
 So the catch branch not work here.
The variable `a` was initialized, via default initialization. It is in a known state: null. What you want is a way to have the compiler complain when a nonnull type state is expected but the variable is only known to be initialized. D does not support that currently.
Mar 11 2024
parent reply Alex <akornilov.82 mail.ru> writes:
Yes, I understand about the compiler: static analysis can't detect such a potential issue for now.
The instance of class `A` is initialized by the default initializer, correct?
But what about the variable `a`? Is it initialized to null, or does it contain a reference to an instance initialized by the default initializer?
What happened when I tried to call the method `run()` of `a` at runtime?
I see that the application terminated abnormally, because `writeln("Hello, world!");` was not called. But I don't see any information about it in the console (a backtrace or something else).
Is it an uncaught exception? I tried to catch it, but that did not work.
Mar 11 2024
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 12/03/2024 12:01 AM, Alex wrote:
 Yes, I got it about compiler, static analyzer can't detect such 
 potential issue for now.
 The instance of class `A` is initialized by default initializer - correct?.
 But what about variable `a`?
The instance of class ``A`` only exists as a reference stored in ``a``. ``a`` is null, therefore there is no instance of ``A``.
 Is it initialized by null or contains reference to the instance 
 initialized by default initializer?
The variable is a pointer, and it is null aka 0.

```d
void main() {
    Object o;
}
```

is

```asm
_Dmain:
    push RBP
    mov RBP,RSP
    sub RSP,010h
    mov qword ptr -8[RBP],0
    xor EAX,EAX
    leave
    ret
```

No different if the type was ``int*`` instead of ``Object``.

```d
void main() {
    int* o;
}
```

```asm
_Dmain:
    push RBP
    mov RBP,RSP
    sub RSP,010h
    mov qword ptr -8[RBP],0
    xor EAX,EAX
    leave
    ret
```
 What happend when I tried to call method `run()` of `a` in runtime?
 I see that application was abnormal termination because `writeln("Hello, 
 world!");` was not called.
 But I don't see any information in console about it (backtrace or 
 something else).
 Is it uncatched excpetion? But I have tried to catch it - not work.
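A class reference is a pointer.

It basically works out to be something like this:

```d
struct Obj {}

// `self` stands in for the hidden `this` pointer
// (`this` itself is a keyword and cannot be a parameter name).
void func(Obj* self) {}

void main() {
    Obj* obj;    // default-initialized to null
    func(obj);   // fine here: func never dereferences self
}
```

If the function reads or writes through the this pointer, it's going to segfault.

Same principle with the vtable lookup.

If you make your method on the class final, it should work the exact same way as the struct above, due to it not using the vtable.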
Mar 11 2024
parent Alex <akornilov.82 mail.ru> writes:
On Monday, 11 March 2024 at 11:09:40 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 If the function reads or writes to the this pointer, its going 
 to segfault.

 Same principle with the vtable lookup.

 If you make your method on the class final, it should work the 
 exact same way as the struct above, due to it not using the 
 vtable.
Thank you very much for detailed explanation! Now I understand what is going on.
Mar 11 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/11/2024 4:01 AM, Alex wrote:
 Yes, I got it about compiler, static analyzer can't detect such potential issue 
 for now.
It cannot do it in the general case, that would be the halting problem.
 The instance of class `A` is initialized by default initializer - correct?.
`A a;` will default initialize `a` to `null`. `A a = new A();` will allocate an instance of `A` where each field is default initialized, and assign the result to `a`.
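
As a small illustration of the two cases, assuming a hypothetical class `A` with a few fields (the field names are made up for this sketch): the bare declaration gives a null reference, while `new A()` gives an instance whose fields hold their types' `.init` values.

```d
import std.math : isNaN;

class A {
    int count;      // int.init is 0
    double ratio;   // double.init is NaN
    string name;    // string.init is null
}

void main() @safe {
    A a;                      // only the reference exists, and it is null
    assert(a is null);

    a = new A();              // allocates an instance, fields default-initialized
    assert(a !is null);
    assert(a.count == 0);
    assert(isNaN(a.ratio));
    assert(a.name is null);
}
```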
 But what about variable `a`?
`a` is default initialized to `null`.
 Is it initialized by null or contains reference to the instance initialized by 
 default initializer?
`null`
 What happend when I tried to call method `run()` of `a` in runtime?
`a` is passed as the `this` pointer to the method `run()`. Hence, in this case, `this` will be null. If you attempt to dereference `this`, it will seg fault.
 I see that application was abnormal termination because `writeln("Hello, 
 world!");` was not called.
 But I don't see any information in console about it (backtrace or something else).
To get a backtrace on Windows, run it in the VC debugger.
 Is it uncatched excpetion? But I have tried to catch it - not work.
D's exception catchers do not catch Windows system exceptions for 64 bit code. Microsoft has not seen fit to document their 64 bit EH system.
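
Since the seg fault itself is not catchable this way, one workaround is to check the reference first and throw an ordinary D exception. The sketch below assumes a cut-down class `A` and a hypothetical helper `runChecked`; `enforce` is the Phobos function from `std.exception`:

```d
import std.exception : enforce;
import std.stdio;

class A {
    string run() @safe { return "[A] run"; }
}

// Hypothetical helper: turns a null reference into a catchable Exception
// instead of letting the hardware fault on the dereference.
string runChecked(A a) @safe {
    enforce(a !is null, "a is null: no instance of A was created");
    return a.run();
}

void main() @safe {
    A a;   // null
    try {
        writeln(runChecked(a));
    } catch (Exception e) {
        writeln("Error: ", e.msg);   // reached, unlike the original try/catch
    }
}
```

This moves the failure into D's own exception mechanism, at the cost of an explicit check on every call path that might see null.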
Mar 11 2024
next sibling parent reply An Pham <home home.com> writes:
 D's exception catchers do not catch Windows system exceptions 
 for 64 bit code. Microsoft has not seen fit to document their 
 64 bit EH system.
This may help? https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170
Mar 12 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 7:36 AM, An Pham wrote:
 D's exception catchers do not catch Windows system exceptions for 64 bit code. 
 Microsoft has not seen fit to document their 64 bit EH system.
This may help? https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170
I guess they did finally get around to documenting it. Thanks!
Mar 12 2024
prev sibling parent ShowMeTheWay <ShowMeTheWay gmail.com> writes:
The problem here is the D compiler: it does not report, as a compilation error, that you are using the variable `a` even though it has not yet been assigned. Instead, the D compiler just leaves it to the hardware to generate a seg fault at runtime?? Surely this can be solved at compilation time. C#, for example, rejects a use of `a` after a bare `A a;` with compiler error CS0165 (use of unassigned local variable 'a'): https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0165
Apr 12 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/11/2024 3:20 AM, Alex wrote:
 Oh... looks like null is also used for refs in D. It's sad :)
 I thought it used only for pointers in unsafe mode.
 I think, the null safety feature is very important in modern world (maybe "must 
 have" :) ). Very nice to have such feature in D like in Kotlin for example.
 So, as I understand, D team have the task in TODO list about implementation 
 something like "null safety"?
Null is actually not a memory safety issue. What happens when null is read or written to is a seg fault. The seg fault is the hardware saying "you cannot do that", so there is nothing unsafe about it. Memory safety is being unable to corrupt memory. The seg fault ensures no memory corruption happens.
Mar 11 2024
parent reply Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Tuesday, 12 March 2024 at 03:00:24 UTC, Walter Bright wrote:
Null is actually not a memory safety issue. What happens when null is read or written to is a seg fault. The seg fault is the hardware saying "you cannot do that", so there is nothing unsafe about it.
I guess what people want instead of segmentation faults is not UB, but compile errors. Segmentation faults are better than UB, but a type system that tells you where your code might segfault because of a null dereference is even better: not only does it give peace of mind, it also works on platforms that don’t segfault, and it’s likewise free of any runtime cost.

It’s a whole different discussion how to add those compile-time checks to D’s type system.
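
To give a rough idea of what a compile-time check could look like with today's D, here is a library-level sketch. `NonNull` is a hypothetical wrapper written for this example, not part of Phobos and not the type state proposal discussed in this thread; it relies on `@disable this();` so that a bare declaration no longer compiles.

```d
class A {
    string run() @safe { return "[A] run"; }
}

// Hypothetical wrapper: a class reference that cannot be declared
// without an initializer.
struct NonNull(T) if (is(T == class)) {
    private T _value;

    @disable this();   // `NonNull!A x;` is rejected by the compiler

    this(T value) @safe {
        // The null check still happens at runtime, once, at the boundary.
        assert(value !is null, "NonNull constructed from null");
        _value = value;
    }

    inout(T) get() inout @safe { return _value; }
    alias get this;    // forward method calls to the wrapped reference
}

// The bare declaration does not compile.
static assert(!__traits(compiles, { NonNull!A x; }));

void main() @safe {
    auto a = NonNull!A(new A());
    assert(a.run() == "[A] run");
}
```

This is only a partial answer: `NonNull!A.init` still bypasses the constructor, and the check on construction is a runtime assert rather than a static proof.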
Mar 26 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Yes, and unfortunately it appears nobody except Paul has opinions about its existence.

https://forum.dlang.org/post/ucdmmlxklanpsggqmwas@forum.dlang.org

I put a lot of work into type state analysis that deals with uninitialized/initialized, nullable/non-null type states.

It has been pretty disheartening to see this thread and then look at my proposal for it and there just isn't anyone with counter proposals or issues it may bring. Not even Walter saying he sees nothing wrong with it currently.
Mar 26 2024
next sibling parent reply Meta <jared771 gmail.com> writes:
On Tuesday, 26 March 2024 at 22:31:56 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Yes, and unfortunately it appears nobody except Paul has 
 opinions about its existence.

 https://forum.dlang.org/post/ucdmmlxklanpsggqmwas@forum.dlang.org

 I put a lot of work into type state analysis that deals with 
 uninitialized/initialized, nullable/non-null type states.
With all due respect, why did you put so much work into something you know won't be taken seriously by this community?
 It has been pretty disheartening to see this thread and then 
 look at my proposal for it and there just isn't anyone with 
 counter proposals or issues it may bring. Not even Walter 
 saying he sees nothing wrong with it currently.
There's like, maybe 5-10 people here who have the necessary background to even be able to understand and critique your proposal. The majority of people who use D come from a C/C++ background, and don't care about type state, or understand why it's useful. Rust _had_ support for typestate, and they _removed_ most of it. Rust already gets a lot of criticism here for being overly complex, so if it was too complex a feature for Rust, why would it ever be accepted into D? Even adding a bottom type to D was controversial. I say this all as someone who would love to have a robust typestate system for D, but it just ain't gonna happen. D is just not the language for it. Your best bet is to either propose it for OpenD, or fork it and implement it yourself.
Mar 27 2024
parent reply Lance Bachmeier <no spam.net> writes:
On Wednesday, 27 March 2024 at 17:07:57 UTC, Meta wrote:

 I say this all as someone who would love to have a robust 
 typestate system for D, but it just ain't gonna happen. D is 
 just not the language for it. Your best bet is to either 
 propose it for OpenD, or fork it and implement it yourself.
I'd say the best strategy would be to write up some examples using real code showing large benefits. Then it might have a chance. The current proposal not only assumes the reader is familiar with the concepts, but that they can envision substantial benefits in their own code. I had no more idea of the benefits after reading the proposal than I did before.
Mar 27 2024
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
First example has been added, thanks to Razvan's recent Rust link.

However, because it doesn't enable you to do anything new, and only ever checks against certain logic errors, it has been very difficult for me to create examples. It needs a different head space, which I expected to deal with later, oh well.

```d
T* makeNull(T)() @safe {
    return null;
}

void useNull() @safe {
    int* var = makeNull!int();
    // var is in type state initialized as per makeNull return state

    *var = 42; // segfault due to var being null
}
```

What we want to happen instead:

```d
T* makeNull(T)(/* return'initialized */) @safe {
    return null;
    // type state default is more than the type state initialized
    // so it is accepted
}

void useNull() @safe {
    int* var = makeNull!int();
    // var is in type state initialized as per makeNull return state

    // perform load via var variable
    // this will error due to initialized is less than the nonnull type state
    // Error: Variable var is in type state initialized which could be null, cannot write to it
    *var = 42;
}
```

To fix, simply check for null!

```d
void useNull() @safe {
    int* var = makeNull!int();
    // var is in type state initialized as per makeNull return state

    if (var !is null) {
        // in scope, assume var is in type state nonnull
        *var = 42;
    }
}
```
Mar 28 2024
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/26/2024 3:31 PM, Richard (Rikki) Andrew Cattermole wrote:
 Not even Walter saying he sees nothing wrong with it currently.
It is better if I abstain from jumping in too early.
Mar 29 2024
prev sibling parent reply bachmeier <no spam.net> writes:
On Monday, 11 March 2024 at 08:48:47 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
This is expected behavior. The variable a was default initialized to null. D has not got type state analysis as part of it, so it cannot detect this situation and cause an error. It is at the top of my todo list for memory safety research for D, as the IR it requires enables other analysis and provides a framework for it to exist in.
Rather than doing that, couldn't the compiler say `A a;` is not valid inside `@safe`?
Mar 11 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
One of the improvements for type state analysis I want to make is for methods:

```d
class Foo {
    void func(this'nonnull);
}
```

Instead of:

```d
class Foo {
    void func(this'reachable);
}
```

That'll catch it when you try to call something.

However I'm not sure if disallowing null entering is a great idea, it's going to enter through other methods so you might as well embrace catching that as well.
Mar 11 2024
parent reply bachmeier <no spam.net> writes:
What I've never understood is the valid use case for `A a;` that would justify that line compiling. I don't do much with classes, but the same thing comes about with structs. Why would it make sense to write code like this, particularly if you've marked it safe?

```
struct Foo {
  double n;
  string s = string.init;
}

Foo * f;
// Crashes
writeln(f.n);

Foo * f = void;
// Doesn't crash but gives wrong output
writeln(f.n);

Foo * g;
// Crashes
g.n = 3.3;

Foo * g = void;
g.n = 3.3;
// Correct output
writeln(g.n);
// Crashes
writeln(g.s);
```
Mar 11 2024
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Like that? No. You may be setting the variable based upon some condition and then passing the result into a function. Null may be a valid behavior for that function.
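
A small sketch of that pattern, with made-up names (`Logger`, `process`) chosen only for illustration: the reference is assigned under a condition, and the callee treats null as "nothing to do" rather than as an error.

```d
import std.stdio;

class Logger {
    void log(string msg) @safe { writeln("[log] ", msg); }
}

// null is a valid argument here: it simply means "no logging requested".
int process(int value, Logger logger) @safe {
    if (logger !is null)
        logger.log("processing");
    return value * 2;
}

void main(string[] args) @safe {
    Logger logger;                 // starts out null
    if (args.length > 1)           // only allocate when some condition holds
        logger = new Logger();
    writeln(process(21, logger));
}
```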
Mar 11 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
The default value for a class reference is `null`. Most objects need an "I'm not 
a valid object" value, and null is ideal for it because if it is dereferenced, a 
hardware segmentation fault is generated.

If a variable is initialized with `void`, that is something completely 
different. The variable winds up being set to garbage, as it is not initialized 
at all. This is why `void` initialization for references is only allowed in code 
marked `@system`, and is usually used when top efficiency is required.

Consider what a null class reference is good for:

```D
class ExtraInfo { ... }

struct S
{
     int a,b,c;
     ExtraInfo extra;
}
```
In my program, sometimes I need the `extra` info, but most of the time, not. Why 
have the compiler force an allocation for `extra` if it isn't used all the time? 
That just wastes time and memory.
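
One way this might look in practice, as a sketch: the `note` field and the `ensureExtra` accessor are hypothetical additions for this example, and the struct otherwise follows the `S` above. The `ExtraInfo` instance is allocated only the first time it is needed.

```d
class ExtraInfo {
    string note;   // hypothetical payload, just for the example
}

struct S {
    int a, b, c;
    ExtraInfo extra;   // stays null until someone actually needs it

    // Hypothetical accessor: allocate on first use, reuse afterwards.
    ExtraInfo ensureExtra() @safe {
        if (extra is null)
            extra = new ExtraInfo();
        return extra;
    }
}

void main() @safe {
    S s;
    assert(s.extra is null);                  // no allocation so far

    s.ensureExtra().note = "rarely needed";   // allocated on demand
    assert(s.extra !is null);
    assert(s.extra.note == "rarely needed");
}
```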
Mar 11 2024
parent reply Lance Bachmeier <no spam.net> writes:
You can write `ExtraInfo extra = null;`.

The reason `ExtraInfo extra;` is so confusing, and leads to posts like the one that started this thread, is because you're explicitly telling the compiler you want ExtraInfo. A new user of the language has no reason to expect it to be null. Someone wanting to optimize their code should have to be explicit that they want null and they're willing to deal with all the problems that can cause.

While it's true that your program is always going to crash, that's not a great solution unless you're testing every possible outcome for your program as you write it. It can take a long time for it to crash, possibly when you're busy with other things, and with no indication of why it crashed.
Mar 12 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 9:13 AM, Lance Bachmeier wrote:
 You can write `ExtraInfo extra = null;`.
 
 The reason `ExtraInfo extra;` is so confusing, and leads to posts like the one 
 that started this thread, is because you're explicitly telling the compiler 
 you want ExtraInfo. A new user of the language has no reason to expect it to be 
 null. Someone wanting to optimize their code should have to be explicit that 
 they want null and they're willing to deal with all the problems that can 
 cause.
Should it be initialized to - what? Let's say you're creating a linked list, with null signifying the end. If there aren't null references, you're going to have to have an "end" marker of some sort. So instead of checking for null, you have to check for the marker. If you forget to check for the marker, and the linked list goes off the end, then what? An exception is thrown? An assert failure? How are these better? The program still fails at runtime.
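
As a small illustration of the linked-list case, here is a sketch with a hypothetical `Node` type and `sum` function (neither comes from the thread). The traversal relies on `null` as the end marker; a separate sentinel object would need the same check in the same place.

```d
struct Node
{
    int value;
    Node* next; // null marks the end of the list
}

// Sum all values; the loop stops when it reaches the null terminator.
int sum(const(Node)* head)
{
    int total = 0;
    for (auto p = head; p !is null; p = p.next)
        total += p.value;
    return total;
}
```

An empty list is simply a `null` head, so `sum(null)` needs no special case.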
 While it's true that your program is always going to crash, that's not a great 
 solution unless you're testing every possible outcome for your program as you 
 write it. It can take a long time for it to crash, possibly when you're busy 
 with other things, and with no indication of why it crashed.
In my experience, the beauty of a null pointer exception is it almost always results in a direct indication to where the problem is, and it's one of the easiest bugs to fix.
Mar 12 2024
parent reply Lance Bachmeier <no spam.net> writes:
On Tuesday, 12 March 2024 at 17:53:41 UTC, Walter Bright wrote:
 On 3/12/2024 9:13 AM, Lance Bachmeier wrote:
 You can write `ExtraInfo extra = null;`.
 
 The reason `ExtraInfo extra;` is so confusing, and leads to 
 posts like the one that started this thread, is because you're 
 explicitly telling the compiler you want ExtraInfo. A new user 
 of the language has no reason to expect it to be null. Someone 
 wanting to optimize their code should have to be explicit that 
 they want null and they're willing to deal with all the 
 problems that can cause.
Should it be initialized to - what? Let's say you're creating a linked list, with null signifying the end. If there aren't null references, you're going to have to have an "end" marker or some sort. So instead of checking for null, you have to check for the marker. If you forget to check for the marker, and the linked list goes off the end, then what? An exception is thrown? An assert fail()? How are these better? The program still fails at runtime.
I'm not sure I follow. As I understand it, `ExtraInfo extra;` and `ExtraInfo extra = null;` are exactly the same to the compiler (DMD generates identical assembly). My argument is that `ExtraInfo extra;` is confusing and therefore should not compile. That wouldn't restrict the language other than having to add "=null" to the declaration.
Mar 12 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
When I first learned Java, I did not realize that the classes were reference 
types. I thought they were values (like in C++). I found the coding examples 
baffling, until I finally realized they were reference types.

I don't think there's a way around needing to understand the difference between 
a reference type and a value type.
Mar 12 2024
prev sibling next sibling parent reply Sergey <kornburn yandex.ru> writes:
On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:
 Hello,
 int main()
 {
 	A a;
 	a.run();
 	writeln("Hello, world!");
 	return 0;
 }
 ```
You need ``` A a = new A(); ```
Mar 11 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Monday, 11 March 2024 at 08:49:35 UTC, Sergey wrote:
 On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:
 Hello,
 int main()
 {
 	A a;
 	a.run();
 	writeln("Hello, world!");
 	return 0;
 }
 ```
You need ``` A a = new A(); ```
No. It was a test of how D handles uninitialized variables.
Mar 11 2024
parent reply Sergey <kornburn yandex.ru> writes:
On Monday, 11 March 2024 at 10:21:43 UTC, Alex wrote:
 No. It was test how D handle uninitialized variables.
Oh right. Didn't get from the first read. Based on the info from here https://en.wikipedia.org/wiki/Void_safety feature.
Mar 11 2024
next sibling parent reply Alex <akornilov.82 mail.ru> writes:
On Monday, 11 March 2024 at 10:35:55 UTC, Sergey wrote:
 On Monday, 11 March 2024 at 10:21:43 UTC, Alex wrote:
 No. It was test how D handle uninitialized variables.
Oh right. Didn't get from the first read. Based on the info from here https://en.wikipedia.org/wiki/Void_safety feature.
Yes, you're right. I had hoped that SafeD would also support this feature in some way, but it looks like it doesn't for now.
Mar 11 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/11/2024 3:42 AM, Alex wrote:
 Yes, you right. I had hope what SafeD also support this feature in some way. 
 But looks like not now.
Null references are not unsafe, that's why it is not in SafeD.
Mar 11 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Tuesday, 12 March 2024 at 03:13:54 UTC, Walter Bright wrote:
 On 3/11/2024 3:42 AM, Alex wrote:
 Null references are not unsafe, that's why it is not in SafeD.
Java has had a negative experience with null: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/

In modern C++, preference is given to `std::optional`.

For a developer who wants to make reliable software, it means many routine checks for null. But mistakes are inevitable because of the human factor. On the other hand, the compiler can do it better (with a 100% guarantee).

In my opinion, Kotlin's nullable types with compiler validation at compile time are a powerful feature:

```d
A? a; // without explicit initialization is ok here, because <type>? can hold null

a.run(); // compilation error, because can be null (the type of "a" is "A?")

if (a != null) {
    a.run(); // ok, because can't be null in this branch (now type of "a" is "A")
}
```
Mar 12 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 1:26 AM, Alex wrote:
 On Tuesday, 12 March 2024 at 03:13:54 UTC, Walter Bright wrote:
 On 3/11/2024 3:42 AM, Alex wrote:
 Null references are not unsafe, that's why it is not in SafeD.
The Java has negative experience with null: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/ In modern C++ preference is given to `std::optional`. For developer, who want make reliable software, it menas many rutinic checks for null. But mistakes are inevitable because of human factor. On other hand compiler can do it better (with 100% guarantee).
Yeah, I know about that article. It's very popular. I've written a sort of rebuttal:

https://www.digitalmars.com/articles/C-biggest-mistake.html

With Java, it is also not a memory safety issue when there's a null exception.
 In my opinion, Kotlin nullable types with compiler vaidation in compilation 
 time is a powerfull feature:
 
 ```d
 A? a; // without explicit initialization is ok here, because <type>? can hold null
 
 a.run(); // compilation error, because can be null (the type of "a" is "A?")
 
 if (a != null) {
     a.run(); // ok, because can't be null in this branch (now type of "a" is "A")
 }
 ```
It is always better to catch null mistakes at compile time rather than runtime, but it isn't a memory safety issue.
Mar 12 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Tuesday, 12 March 2024 at 17:47:32 UTC, Walter Bright wrote:
 Yeah, I know about that article. It's very popular. I've 
 written a sort of rebuttal:

 https://www.digitalmars.com/articles/C-biggest-mistake.html
If you ask me about the root of evil in C, I will answer that it is pointers with their arithmetic. If you ask me about the root of evil in Java, I will answer that it is the null "pointer" (a reference, actually) :)
 With Java, it is also not a memory safety issue when there's a 
 null exception.
I think it depends on what we mean by "memory safety". If only "dangling pointers", I agree. But if we want a strict compile-time guarantee that any accessible reference leads to a live object, the answer is no, SafeD can't provide such a guarantee yet.
 It is always better to catch null mistakes at compile time 
 rather than runtime, but it isn't a memory safety issue.
Is the D Foundation considering the possibility of implementing something like this in a future version of D?
Mar 12 2024
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Mar 12, 2024 at 07:33:17PM +0000, Alex via Digitalmars-d wrote:
[...]
 I think it depends on what we means "memory safety".
In D, memory safety means memory cannot be corrupted. That is, writing to a pointer intended for one variable will not overwrite values of another, unrelated variable. This includes things like buffer overflows, stack corruption, overwriting pointers with maliciously crafted values that cause data to be written to places that aren't supposed to be written to, etc.

Dereferencing a null pointer is not a memory corruption according to this definition. (Even though in practice, having an application abort because of a null pointer can also become an issue, e.g., if a malicious outsider is able to trigger that condition consistently, it could be exploited in a DoS attack.)

T

-- 
Life is complex. It consists of real and imaginary parts. -- YHL
Mar 12 2024
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 12:48 PM, H. S. Teoh wrote:
 (Even though in practice, having an application abort
 because of a null pointer can also become an issue, e.g., if a malicious
 outsider is able to trigger that condition consistently, it could be
 exploited in a DoS attack.)
The same goes for assert() failures, buffer overflow exceptions, and other runtime checks D inserts into the code.
Mar 12 2024
prev sibling parent reply Ogi <ogion.art gmail.com> writes:
On Tuesday, 12 March 2024 at 19:48:22 UTC, H. S. Teoh wrote:
 Even though in practice, having an application abort because of 
 a null pointer can also become an issue, e.g., if a malicious 
 outsider is able to trigger that condition consistently, it 
 could be exploited in a DoS attack.
Division by zero also crashes the program but nobody makes a big deal out of it.
Mar 12 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 06:36:14 UTC, Ogi wrote:
 Division by zero also crashes the program but nobody makes a 
 big deal out of it.
I guess division by zero is not as common as null pointer issues. Also, it usually just leads to some kind of arithmetic exception.
Mar 13 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/13/2024 12:50 AM, Alex wrote:
 Also, usually it 
 leads just to some king of arithmetic exception.
It's a hardware exception, same as the null reference exception.
Mar 18 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Monday, 18 March 2024 at 22:27:28 UTC, Walter Bright wrote:
 It's a hardware exception, same as the null reference exception.
Yes, but the language runtime can catch it via an operating system mechanism and re-throw it as a language exception which can be handled inside the code.
Mar 19 2024
next sibling parent Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Tuesday, 19 March 2024 at 08:44:29 UTC, Alex wrote:
 On Monday, 18 March 2024 at 22:27:28 UTC, Walter Bright wrote:
 It's a hardware exception, same as the null reference 
 exception.
Yes, but language runtime can catch it via operating system mechanism and re-throw as language exception which can be handled inside code.
I wonder how these exceptions will play out if they are thrown from @nogc code... I'd assume that you can set only one handler for each hardware exception, and that handler would need to know in what function this occurred, a @nogc or a GC one, so it can halt in @nogc methods and throw in GC methods.
Mar 19 2024
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/19/2024 1:44 AM, Alex wrote:
 On Monday, 18 March 2024 at 22:27:28 UTC, Walter Bright wrote:
 It's a hardware exception, same as the null reference exception.
Yes, but language runtime can catch it via operating system mechanism and re-throw as language exception which can be handled inside code.
Null seg faults can be dealt with that way, too.
Mar 29 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 12:33 PM, Alex wrote:
 I think it depends on what we means "memory safety". If only "dangling 
 pointers", I agree.
Memory safety is not something we've uniquely defined. It's generally accepted that it specifically means no memory corruption. "Safe" programs can offer additional guarantees, like no race conditions.
 But if we want to have strict guarantee at compilation time 
 that any accessible reference leads to alive object answer is no, SafeD can't 
 provide such guarantee yet.
In general, we cannot guarantee that assert()'s won't trip at runtime, nor buffer overflow exceptions. All we can do is say we stop the program when it happens. Actual memory corruption is infinitely worse than promptly stopping the program when it detects an internal bug.
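
A minimal sketch of what "stop the program" looks like for an out-of-range index, assuming bounds checking has not been disabled (for example with `-boundscheck=off`). The function names `readAt` and `outOfRangeIsStopped` are hypothetical. Catching a `RangeError` is done here only to make the behavior observable; a real program would normally let it terminate the process.

```d
import core.exception : RangeError;

// In @safe code, slice indexing is bounds-checked.
@safe int readAt(const int[] data, size_t index)
{
    return data[index];
}

// Returns true if the out-of-range read was stopped by a RangeError
// instead of reading memory past the end of the slice.
bool outOfRangeIsStopped()
{
    try
    {
        int unused = readAt([1, 2, 3], 5);
    }
    catch (RangeError e)
    {
        return true;
    }
    return false;
}
```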
 It is always better to catch null mistakes at compile time rather than 
 runtime, but it isn't a memory safety issue.
Are D foundation considering the possibility to implement something like this in future version of D?
Consider the following:
```
class A { void bar(); }

void foo(int i) {
    A a;
    if (i) a = new A();
    ...
    if (i) a.bar();
}
```
What happens if we apply data flow analysis to determine the state of `a` when it calls `bar()`? It will determine that `a` has the possible values (`null`, `new A()`). Hence, it will give an error that `a` is possibly null at that point.

Yet the code is correct, not buggy.

Yes, the compiler could figure out that `i` is the same, but the conditions can be more complex such that the compiler cannot figure it out (the halting problem).

So that doesn't work.

We could lower `a.bar()` to `NullCheck(a).bar()` which throws an exception if `a` is null. But what have we gained there? Nothing. The program still aborts with an exception, just like if the hardware checked. Except we've got this manual check that costs extra code and CPU time.

BTW, doing data flow analysis is very expensive in terms of compiler run time. The optimizer does it, but running the optimizer is optional for that reason.
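
For illustration only, the lowering mentioned above could be written by hand roughly as follows. `nullCheck` is a hypothetical helper, not an existing compiler feature or library function, and `bar` is given an empty body so the sketch links.

```d
class A
{
    void bar() {}
}

// Hypothetical helper: throws instead of letting the hardware fault.
T nullCheck(T)(T obj) if (is(T == class))
{
    if (obj is null)
        throw new Exception("null reference");
    return obj;
}

void foo(int i)
{
    A a;
    if (i) a = new A();
    if (i) nullCheck(a).bar(); // explicit check before the call
}
```

The check runs on every call, which is the extra code and CPU time referred to above; the failure still happens at runtime.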
Mar 12 2024
next sibling parent reply Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 Memory safety is not something we've uniquely defined. It's 
 generally accepted that it specifically means no memory 
 corruption. "Safe" programs can offer additional guarantees, 
 like no race conditions.
Yeah, race conditions are the second headache after memory safety (including null pointer issues) :) I know only one language which gives a compile-time guarantee about the absence of race conditions (but not deadlocks): Rust. But D has the `shared` keyword, and as I understand it, it can provide such a guarantee at compile time for SafeD, right?
 In general, we cannot guarantee that assert()'s won't trip at 
 runtime, nor buffer overflow exceptions. All we can do is say 
 we stop the program when it happens.
As I understand it, there is only one case where D stops the program: a null pointer, correct? So if D supported null safety, it would be a very reliable solution :)
 Consider the following:
I mean nullable types like in Kotlin (originally in Ceylon) and the null safety model around them: https://kotlinlang.org/docs/null-safety.html

```d
A? a; // <type name>? is nullable. The variable 'a' is implicitly initialized to null. The compiler allows this.
A b; // compilation error, because type 'A' (without question mark) is not nullable. So you should initialize it explicitly.
A c = new A; // ok

a.run(); // compilation error, because variable 'a' can be null (type is nullable 'A?')

if (a != null) {
    a.run(); // ok, because type of variable 'a' is A (not nullable) in this branch.
}

c.run(); // ok, because type of variable 'c' is A (not nullable)
```

In your example (see the comment after the line `void foo(int i) {`):

```d
class A { void bar(); }

void foo(int i) {
    A a; // compilation error, type A is not nullable
    if (i) a = new A();
    ...
    if (i) a.bar();
}
```
 What happens if we apply data flow analysis to determine the 
 state of `a` when it calls `bar()`? It will determine that `a` 
 has the possible values (`null`, new A()`). Hence, it will give 
 an error that `a` is possibly null at that point.

 Yet the code is correct, not buggy.
 Yes, the compiler could figure out that `i` is the same, but 
 the conditions can be more complex such that the compiler 
 cannot figure it out (the halting problem).

 So that doesn't work.

 BTW, doing data flow analysis is very expensive in terms of 
 compiler run time. The optimizer does it, but running the 
 optimizer is optional for that reason.
The general strategy for the compiler to allow or deny a dereference is based on the variable's type. If the variable has a nullable type, dereferencing is denied. If it has a non-nullable type, dereferencing is allowed. The variable's type can be implicitly changed by explicitly comparing its value with null, and it will then be a non-nullable type inside the branch. Certainly, the variable should be effectively immutable within its scope. Such a strategy at compile time is not very expensive and shouldn't depend on compiler optimizations. Anyway, I think a developer can accept some delay in the project build in exchange for reliable software :)
 We could lower `a.bar()` to `NullCheck(a).bar()` which throws 
 an exception if `a` is null. But what have we gained there? 
 Nothing. The program still aborts with an exception, just like 
 if the hardware checked. Except we've got this manual check 
 that costs extra code and CPU time.
I agree that `NullCheck(a).bar()` is not a solution, because it just moves the issue to runtime.
Mar 13 2024
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, March 13, 2024 1:43:23 AM MDT Alex via Digitalmars-d wrote:
 On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 Memory safety is not something we've uniquely defined. It's
 generally accepted that it specifically means no memory
 corruption. "Safe" programs can offer additional guarantees,
 like no race conditions.
Yeah, race condition is the second headache after memory safety (include null pointer issues) :) I know only one language which give guarantee at compilation time about absence of race condition (but not deadlock). It is Rust. But D have `shared` keyword and as I understand it can provide such guarantee at compilation time for SafeD, right?
What shared is supposed to do is give an error if you attempt to do anything with a shared variable that isn't guaranteed to be thread-safe - which basically means that it should give an error when you actually try to do much of anything with a shared variable. However, it's not fully implemented by default right now (it will currently give an error in some cases but not all). The -preview=nosharedaccess switch can be used to make accessing shared variables an error in general (like it's supposed to be), but it hasn't been enabled by default yet.

Ideally, the compiler would know when it was thread-safe to access a shared variable and implicitly remove shared within that code so that you could safely access the variable, but in practice, the compiler has no way of knowing that your code has done what's necessary to protect access to that variable, since that involves doing stuff like locking a mutex whenever that variable is accessed, and the language has no understanding of any of that (and it's not at all easy to give the language such an understanding except for in very simple cases).

So, what happens in practice is that the programmer has to lock the appropriate mutex, then temporarily cast away shared to operate on the variable, then make sure that no thread-local references to the data exist any longer prior to releasing the mutex. So, you get code like

    synchronized(mutex)
    {
        int* local = cast(int*) &sharedVar;
        *local = 42;
    }

The result is that the code is @system, not @safe, and the programmer has to verify its correctness and mark it with @trusted for it to be usable by @safe code. However, higher level objects can be written such that they have shared, @safe/@trusted member functions, and those member functions then take care of all of the locking and casting internally so that you can just use the type without directly dealing with the locking or casting.

But ultimately what shared is doing is segregating the code that deals with concurrency and making it so that you have to cast to actually do much of anything with it so that you can't shoot yourself in the foot with shared elsewhere. You have to outright tell the compiler that you want to take the risk. You then only have to examine certain sections of the code to make sure that the code that's actually interacting with shared data does so correctly (whereas in a language like C++, the type system doesn't help you with any of that).

So the way that you write thread-safe code in D is pretty similar to what you'd do in a language like C++ or Java, but shared makes it so that you know which data is shared and so that you can't accidentally access shared data in a manner which isn't thread-safe, whereas the type system really doesn't help you with any of that in C++ or Java.

- Jonathan M Davis
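
To make the lock-then-cast pattern concrete, here is a hedged sketch under default compiler settings (without `-preview=nosharedaccess`). The names `sharedVar`, `sharedVarLock`, `setSharedVar` and `getSharedVar` are hypothetical; it uses `core.sync.mutex.Mutex` with explicit `lock`/`unlock` calls rather than a `synchronized` statement.

```d
import core.sync.mutex : Mutex;

shared int sharedVar;        // data shared between threads
shared Mutex sharedVarLock;  // hypothetical mutex guarding sharedVar

shared static this()
{
    sharedVarLock = new shared Mutex();
}

// @trusted: the programmer, not the compiler, vouches for the locking.
void setSharedVar(int value) @trusted
{
    sharedVarLock.lock();
    scope (exit) sharedVarLock.unlock();

    // Cast away shared only while the mutex is held; the thread-local
    // pointer must not outlive this function.
    int* local = cast(int*) &sharedVar;
    *local = value;
}

int getSharedVar() @trusted
{
    sharedVarLock.lock();
    scope (exit) sharedVarLock.unlock();
    return *cast(int*) &sharedVar;
}
```

Callers in `@safe` code only see the two `@trusted` functions, so the casts stay confined to a small region that can be reviewed by hand.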
Mar 13 2024
parent Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 21:57:17 UTC, Jonathan M Davis 
wrote:
 On Wednesday, March 13, 2024 1:43:23 AM MDT Alex via 
 Digitalmars-d wrote:
 What shared is supposed to do is give an error if you attempt 
 do anything with a shared variable that isn't guaranteed to be 
 thread-safe - which basically means that it should give an 
 error when you actually try to do much of anything with a 
 shared variable. However, it's not fully implemented by default 
 right now (it will currently give an error in some cases but 
 not all). The -preview=nosharedaccess switch can be used to 
 make accessing shared variables an error in general (like it's 
 supposed to be), but it hasn't been enabled by default yet.
Thank you for detailed explanation!
Mar 14 2024
prev sibling parent reply cc <cc nevernet.com> writes:
On Wednesday, 13 March 2024 at 07:43:23 UTC, Alex wrote:
 I mean nullable types like in Kotlin (originally in Ceylon) and 
 null safety model around it: 
 https://kotlinlang.org/docs/null-safety.html
 ```d
 if (a != null) {
     a.run(); // ok, because type of variable 'a' is A (not 
 nullable) in this branch.
 }
 ```
I like this, can the compiler handle cases like this?
```d
if (a == null) return;
a.run();
```
Mar 13 2024
parent Alex <akornilov.82 mail.ru> writes:
On Thursday, 14 March 2024 at 00:32:47 UTC, cc wrote:
 On Wednesday, 13 March 2024 at 07:43:23 UTC, Alex wrote:
 I mean nullable types like in Kotlin (originally in Ceylon) 
 and null safety model around it: 
 https://kotlinlang.org/docs/null-safety.html
 ```d
 if (a != null) {
     a.run(); // ok, because type of variable 'a' is A (not 
 nullable) in this branch.
 }
 ```
I like this, can the compiler handle cases like this? ```d if (a == null) return; a.run(); ```
Yes, sure, the Kotlin compiler handles it. The analyzer tracks the variable's type in each branch. After the `if`, the type of `a` will be the non-nullable type `A`.
Mar 14 2024
prev sibling next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 13/03/2024 7:05 PM, Walter Bright wrote:
 BTW, doing data flow analysis is very expensive in terms of compiler run 
 time. The optimizer does it, but running the optimizer is optional for 
 that reason.
I want to see D become temporally safe, and that means DFA for safe code. The question is not if, but when at this point, we have to solve it, and in doing so define the literature otherwise we'll be left behind. I'm certainly not ready for my type state analysis DIP to go into development just yet, but ideas shouldn't be too far behind.
Mar 13 2024
prev sibling next sibling parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 [..]

 Consider the following:
 ```
 class A { void bar(); }

 void foo(int i) {
     A a;
     if (i) a = new A();
     ...
     if (i) a.bar();
 }
 ```
 What happens if we apply data flow analysis to determine the 
 state of `a` when it calls `bar()`? It will determine that `a` 
 has the possible values (`null`, new A()`). Hence, it will give 
 an error that `a` is possibly null at that point.

 Yet the code is correct, not buggy.

 Yes, the compiler could figure out that `i` is the same, but 
 the conditions can be more complex such that the compiler 
 cannot figure it out (the halting problem).

 So that doesn't work.

 We could lower `a.bar()` to `NullCheck(a).bar()` which throws 
 an exception if `a` is null. But what have we gained there? 
 Nothing. The program still aborts with an exception, just like 
 if the hardware checked. Except we've got this manual check 
 that costs extra code and CPU time.

 BTW, doing data flow analysis is very expensive in terms of 
 compiler run time. The optimizer does it, but running the 
 optimizer is optional for that reason.
Here's how TypeScript deals with this problem:

```ts
class A { bar() {} }

function foo(i: number) {
    let a: A;
    if (i) a = new A();

    if (i) a.bar(); // Error: Variable 'a' is used before being assigned.
}

function foo2(i: number) {
    let a: A | null = null;
    if (i) a = new A();

    if (i) a.bar(); // Error: 'a' is possibly 'null'
}
```

https://www.typescriptlang.org/docs/handbook/2/narrowing.html
Mar 13 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/13/2024 9:06 AM, Petar Kirov [ZombineDev] wrote:
 Here's how TypeScript deals with this problem:
 
 ```ts
 class A { bar() {} }
 
 function foo(i: number) {
      let a: A;
      if (i) a = new A();
 
      if (i) a.bar(); // Error: Variable 'a' is used before being assigned.
 }
 
 function foo2(i: number) {
      let a: A | null = null;
      if (i) a = new A();
 
      if (i) a.bar(); // Error: 'a' is possibly 'null'
 }
 
 ```
Yes, that is one way to do it, disallowing otherwise valid programs. I ran into this with C, and there was too much C code that would be disallowed, so I didn't proceed with it.

Another issue with this approach is as soon as you pass `a` to another function, `bar(a)`, the body of `bar` doesn't know if `a` is null or not. So this kind of analysis is not generally useful.

The same thing comes up with:
```
int* foo() {
    int a;
    return &a;
}
```
Most every C compiler will detect that. But there are many ways to slip it past the compiler:
```
int* foo() {
    int a;
    int* p = &a;
    return bar(p);
}

int* bar(int* p) { return p; }
```
which is why D came up with `scope` and `return`.
Mar 18 2024
parent Alex <akornilov.82 mail.ru> writes:
On Monday, 18 March 2024 at 22:36:21 UTC, Walter Bright wrote:
 Another issue with this approach is as soon as you pass `a` to 
 another function, `bar(a)`, the body of `bar` doesn't know if 
 `a` is null or not.
Hmm... why? If the type of `a` is the non-nullable `A`, the compiler knows that in the body of `bar(A a)` the argument `a` is never `null`.
Mar 19 2024
prev sibling next sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 [..]
 Consider the following:
 ```
 class A { void bar(); }

 void foo(int i) {
     A a;
     if (i) a = new A();
     ...
     if (i) a.bar();
 }
 ```
 What happens if we apply data flow analysis to determine the 
 state of `a` when it calls `bar()`? It will determine that `a` 
 has the possible values (`null`, new A()`). Hence, it will give 
 an error that `a` is possibly null at that point.
Here's how this situation is handled in TypeScript:

```ts
class A { bar() {} }

function foo(i: number) {
    let a: A;
    if (i) a = new A();

    if (i) a.bar(); // Error: Variable 'a' is used before being assigned.
}

function foo2(i: number) {
    let a: A | null = null;
    if (i) a = new A();

    if (i) a.bar(); // Error: 'a' is possibly 'null'
}

function bar(i: number) {
    let a: A;
    if (i) {
        a = new A();
        a.bar(); // No errors.
    }
}

function bar2(i: number) {
    let a: A | null = null;
    if (i) {
        a = new A(); // The type of `a` is `A | null`
        a.bar(); // The type of `a` is now `A`
    }
}
```
 Yet the code is correct, not buggy.
 
 Yes, the compiler could figure out that `i` is the same, but 
 the conditions can be more complex such that the compiler 
 cannot figure it out (the halting problem).

 So that doesn't work.
I agree, however in my experience (I've been using TypeScript professionally since ~2019) it's not a problem for the developer to rewrite the code in a way that the compiler can understand. In this case - rewriting `foo` to `bar`. While your example was intentionally simple, in practice, restructuring the code so the compiler can understand it, often makes it more clear for the humans behind the screen as well.
 We could lower `a.bar()` to `NullCheck(a).bar()` which throws 
 an exception if `a` is null. But what have we gained there? 
 Nothing. The program still aborts with an exception, just like 
 if the hardware checked. Except we've got this manual check 
 that costs extra code and CPU time.
I agree that simply letting the OS handle the segfault is sufficient for 98% of the use cases. For the other 2% (say, writing code for kernel mode or microcontrollers without an MMU), having a compiler flag to enable rewriting `a.bar()` to `assert(a), a.bar()` would be nice.
 BTW, doing data flow analysis is very expensive in terms of 
 compiler run time. The optimizer does it, but running the 
 optimizer is optional for that reason.
early days (I'm not sure if it was part of the first release, or the faster languages in terms of compiler time. I'd be very interested to hear what you have to say about their [language specification][1] on definite assignment:

That said, TypeScript takes this (colloquially known as [flow typing][2]) much further: https://www.typescriptlang.org/docs/handbook/2/narrowing.html. It plays extremely pleasingly with their [union types][3].

P.S. please disregard my previous message. I clicked "Send" by mistake.

[1]: https://github.com/dotnet/csharpstandard/blob/draft-v9/standard/variables.md#94-definite-assignment
[2]: https://en.wikipedia.org/wiki/Flow-sensitive_typing
[3]: https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#union-types
Mar 13 2024
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 In general, we cannot guarantee that assert()'s won't trip at 
 runtime, nor buffer overflow exceptions. All we can do is say 
 we stop the program when it happens.
An assert is explicit in the source code, it's visible to the programmer. A null dereference can easily happen without anything in the code to suggest to the programmer that the program might abort. This is because quite often a pointer/reference is never null in a particular scope (e.g. function parameters), so programmers stop worrying about null, and sometimes do this when the pointer actually can be null.

What is needed is:

1. A way of distinguishing between never null pointers and sometimes null pointers.
2. Have a syntactical opt-in way of force unwrapping a nullable pointer. This can even be a no-op so you still get the hardware checking efficiency.
 Actual memory corruption is infinitely worse than promptly 
 stopping the program when it detects an internal bug.
Absolutely, though most modern languages seem to have some support for null safety too.
 Consider the following:
 ```
 class A { void bar(); }

 void foo(int i) {
     A a;
     if (i) a = new A();
     ...
     if (i) a.bar();
 }
 ```
 What happens if we apply data flow analysis to determine the 
 state of `a` when it calls `bar()`? It will determine that `a` 
 has the possible values (`null`, `new A()`). Hence, it will give 
 an error that `a` is possibly null at that point.
I think you don't need DFA at least as I understand it:
    if (i) a = new A();
If `a` is non-nullable, this line could be made to error because there is no `else` branch that also initializes `a`. This is what cppfront does. non-nullable types need initialization.
    if (i) a.bar();
If `a` is nullable, this line is an error because you are calling a method that needs an A when (from the DFA-less compiler's point of view) you might only have a null pointer. To handle that, you either:
* `assert(a)` before `a.bar()`, and the compiler assumes `a` is not null and in release mode there is only a hardware null check.
* Call a function to force-unwrap the nullable type to a non-null type, e.g. `a.unwrap.bar()`. This function can be a no-op in terms of hardware, but requiring calling it makes the programmer aware that a possible null dereference (at least from the compiler POV) may occur.
* Rewrite the if statement to `if (a)`. That is actually better code because you would need to check that `i` hadn't changed in between the two if statements, which might be long, to understand the code.

I hope you agree that at least some of this is workable and beneficial.

Now, as regards D, I'm not sure the best way to add non-nullable types to the language. Even editions may not be enough, perhaps a compiler switch to do null checking.

Non-nullable types are also very useful for API documentation. It's common for docs to forget to say "don't pass null", the user has to check the source code if available. The API can't be misused when the type system actually carries the information about whether the pointer actually exists or not (info that should be absolutely fundamental to a good static type system).
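The force-unwrap idea can be approximated in today's D as a library type. The following is only a sketch under stated assumptions: `NonNull` and `unwrap` are hypothetical names, not part of Phobos, and unlike the no-op described above this version checks at runtime, because a library type cannot rely on compiler support.

```d
import std.stdio : writeln;

class A
{
    int calls;
    void bar() { ++calls; }
}

/// Hypothetical wrapper: holds a class reference that was checked once at construction.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // no default construction (NonNull.init can still bypass this)

    this(T value)
    {
        if (value is null)
            throw new Exception("null passed to NonNull");
        payload = value;
    }

    inout(T) get() inout { return payload; }
    alias get this; // lets a NonNull!A be used wherever an A is expected
}

/// Hypothetical force-unwrap: the one explicit place where a null may abort.
NonNull!T unwrap(T)(T maybeNull) if (is(T == class))
{
    return NonNull!T(maybeNull);
}

/// The signature itself now documents "don't pass null".
void useIt(NonNull!A a)
{
    a.bar();
}

void main()
{
    A a = new A;     // an ordinary, nullable class reference
    useIt(a.unwrap); // explicit opt-in at the call site
    writeln(a.calls);
}
```

The point of the design is that a possible abort is visible at the call site (`a.unwrap`) rather than hidden inside `useIt`.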
Mar 14 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Some easy cases can be handled easily(!). But to do it reliably, DFA is 
required. And DFA makes the front end slow.

If one doesn't do DFA, then I will be subjected to endless bug reports where 
people find a case that needs DFA to resolve.
Mar 18 2024
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 19/03/2024 11:46 AM, Walter Bright wrote:
 Some easy cases can be handled easily(!). But to do it reliably, DFA is 
 required. And DFA makes the front end slow.
 
 If one doesn't do DFA, then I will be subjected to endless bug reports 
 where people find a case that needs DFA to resolve.
If anyone wants evidence of this, look no further than @live.

https://issues.dlang.org/show_bug.cgi?id=21923
https://issues.dlang.org/show_bug.cgi?id=21854

A memory analysis technique that requires DFA, but doesn't as it wasn't fully thought out and too specific to it.

We need semantic 4 to solve this properly: see my recent post on type state analysis in DIP ideas forum.
Mar 18 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/18/2024 4:19 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 19/03/2024 11:46 AM, Walter Bright wrote:
 If one doesn't do DFA, then I will be subjected to endless bug reports where 
 people find a case that needs DFA to resolve.
If anyone wants evidence of this, look no further than @live. https://issues.dlang.org/show_bug.cgi?id=21923 https://issues.dlang.org/show_bug.cgi?id=21854 A memory analysis technique that requires DFA, but doesn't as it wasn't fully thought out and too specific to it.
D is a complex language, and @live has bugs in it for some constructs. That doesn't mean DFA is the wrong tool for the job, it is the only tool for it and the problems are routine problems that can be fixed.
Mar 29 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 30/03/2024 4:03 PM, Walter Bright wrote:
 On 3/18/2024 4:19 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 19/03/2024 11:46 AM, Walter Bright wrote:
 If one doesn't do DFA, then I will be subjected to endless bug 
 reports where people find a case that needs DFA to resolve.
If anyone wants evidence of this, look no further than @live. https://issues.dlang.org/show_bug.cgi?id=21923 https://issues.dlang.org/show_bug.cgi?id=21854 A memory analysis technique that requires DFA, but doesn't as it wasn't fully thought out and too specific to it.
D is a complex language, and @live has bugs in it for some constructs. That doesn't mean DFA is the wrong tool for the job, it is the only tool for it and the problems are routine problems that can be fixed.
Yes, all I am getting at is a dedicated DFA that isn't specific to @live would be a better solution. If we really really want @live to stick around (I have other ideas on how to replace it while getting guarantees which @live cannot provide), rewriting it on my proposed semantic 4 would be a better solution long term.
Mar 29 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2024 8:07 PM, Richard (Rikki) Andrew Cattermole wrote:
 Yes, all I am getting at is a dedicated DFA that isn't specific to @live would 
 be a better solution.
 
 If we really really want @live to stick around (I have other ideas on how to 
 replace it while getting guarantees which @live cannot provide), rewriting it on 
 my proposed semantic 4 would be a better solution long term.
We should continue this in the dips.development thread
Mar 31 2024
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Monday, 18 March 2024 at 22:46:27 UTC, Walter Bright wrote:
 Some easy cases can be handled easily(!). But to do it 
 reliably, DFA is required. And DFA makes the front end slow.

 If one doesn't do DFA, then I will be subjected to endless bug 
 reports where people find a case that needs DFA to resolve.
Sorry my reply wasn't rigorous enough.
    if (i) a = new A();
If `a` is non-nullable, this line could be made to error because there is no `else` branch that also initializes `a`. This is what cppfront does.
I think this is workable without DFA, the compiler just tracks when a variable is initialized. There is never a state where a variable may be both initialized and not initialized (e.g. no dependency on whether a jump instruction was taken at runtime). Do you think this would have a reasonably cheap time cost?
    if (i) a.bar();
If `a` is nullable, this line is an error because you are calling a method that needs an A when (from the DFA-less compiler's point of view) you might only have a null pointer. To handle that, you either: * `assert(a)` before `a.bar()`, and the compiler assumes `a` is not null and in release mode there is only a hardware null check.
This alone indeed does not work in general as `a` may become null after the assert and before it is dereferenced.
 * Call a function to force-unwrap the nullable type to a 
 non-null type, e.g. `a.unwrap.bar()`. This function can be a 
 no-op in terms of hardware, but requiring calling it makes the 
 programmer aware that a possible null dereference (at least 
 from the compiler POV) may occur.
This works and requires no analysis. The function internally just casts to non-null, the advantage being that the call is explicit to the programmer, alerting them and reviewers to possible program abort. This alone might be workable in theory but in practice it would probably be annoying in some cases where the compiler could help.
 * Rewrite the if statement to `if (a)`. That is actually better 
 code because you would need to check that `i` hadn't changed in 
 between the two if statements, which might be long, to 
 understand the code.
Like assert, this is not workable in the general case. However, declaring a new variable could work:
```d
if (auto b = a) {
    b.bar;
}
```
There are 2 options:
1. The type of `b` is non-nullable. This is a breaking change. There would need to be some syntax for opting in to make `b` nullable.
2. We have some other syntax for `b` being non-nullable.

Workable?
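For reference, the `if (auto b = a)` form already compiles in current D, although `b` still has the ordinary nullable class type. A minimal sketch (the function name `callIfPresent` is made up for illustration):

```d
import std.stdio : writeln;

class A
{
    string bar() { return "bar called"; }
}

/// Calls bar() only when `a` is non-null; returns a fallback otherwise.
string callIfPresent(A a)
{
    // `b` exists only inside the branch, where the condition has just
    // established that it is non-null.
    if (auto b = a)
        return b.bar();
    return "a was null";
}

void main()
{
    writeln(callIfPresent(new A));
    writeln(callIfPresent(null));
}
```

What the two options above would add is a type-level guarantee: today nothing stops code inside the branch from assigning `null` to `b`.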
Mar 22 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/22/2024 3:51 AM, Nick Treleaven wrote:
 I think this is workable without DFA, the compiler just tracks when a variable 
 is initialized. There is never a state where a variable may be both initialized 
 and not initialized
```
A a = null;
if (i) a = new A(); // a is both initialized and not initialized
```
Now throw in loops and goto's, and DFA is needed. Compiler optimizers use DFA because it works and ad-hoc techniques do not.
Mar 29 2024
parent reply Nick Treleaven <nick geany.org> writes:
On Saturday, 30 March 2024 at 03:00:32 UTC, Walter Bright wrote:
 On 3/22/2024 3:51 AM, Nick Treleaven wrote:
 I think this is workable without DFA, the compiler just tracks 
 when a variable is initialized. There is never a state where a 
 variable may be both initialized and not initialized
```
A a = null;
if (i) a = new A(); // a is both initialized and not initialized
```
There `a` is always initialized. It's a nullable type. If you remove the `= null` and make `a` a non-nullable type, you would get an error for the `if` statement because it initializes `a` in its branch, and there is no `else` branch which is required to also initialize `a`.
 Now throw in loops and goto's, and DFA is needed. Compiler 
 optimizers use DFA because it works and ad-hoc techniques do 
 not.
This does not need DFA, correct?
Mar 30 2024
parent reply Nick Treleaven <nick geany.org> writes:
On Saturday, 30 March 2024 at 09:24:12 UTC, Nick Treleaven wrote:
 On Saturday, 30 March 2024 at 03:00:32 UTC, Walter Bright wrote:
 On 3/22/2024 3:51 AM, Nick Treleaven wrote:
 I think this is workable without DFA, the compiler just 
 tracks when a variable is initialized. There is never a state 
 where a variable may be both initialized and not initialized
```
A a = null;
if (i) a = new A(); // a is both initialized and not initialized
```
There `a` is always initialized. It's a nullable type. If you remove the `= null` and make `a` a non-nullable type, you would get an error for the `if` statement because it initializes `a` in its branch, and there is no `else` branch which is required to also initialize `a`.
This is an example for Herb Sutter's cppfront, which enforces that `p` is non-null:
```c++2
main: () = {
    i := 0;
    p: unique_ptr<int>;
    if (i) {
        p = new<int>; // error: p must be initialized on both branches or neither
    }
}
```
 Now throw in loops and goto's, and DFA is needed. Compiler 
 optimizers use DFA because it works and ad-hoc techniques do 
 not.
cppfront:
```c++2
main: () = {
    i := 0;
    p: unique_ptr<int>;
    while i < 3 next i++ {
        p = new<int>;
        std::cout << p* << "\n"; // ok, p is always initialized
    }
}
```
Changing the while loop:
```c++2
    while i < 3 next i++ {
        std::cout << p* << "\n"; // error, p used before it was initialized
        p = new<int>;
    }
```
Goto skipping initialization is already disallowed in D.
 This does not need DFA, correct?
Mar 30 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2024 2:38 AM, Nick Treleaven wrote:
 cppfront:
Is cppfront using ad-hoc or DFA? You can get a ways with ad-hoc, but there's a reason optimizers use DFA, especially with unstructured code.
 Goto skipping initialization is already disallowed in D.
That restriction is imposed because the front end doesn't do DFA.
Mar 31 2024
parent reply Nick Treleaven <nick geany.org> writes:
On Monday, 1 April 2024 at 01:00:31 UTC, Walter Bright wrote:
 On 3/30/2024 2:38 AM, Nick Treleaven wrote:
 cppfront:
Is cppfront using ad-hoc or DFA? You can get a ways with ad-hoc, but there's a reason optimizers use DFA, especially with unstructured code.
From what I have gathered, cppfront is only doing 2 basic things at the moment:
* require initialization on both branches of an `if` statement or neither
* track at runtime whether a variable has been initialized and abort if it is accessed without initialization

However, proper analysis is planned to detect common cases of invalid pointers and container types. It does not aim to catch all possible errors. See the links here which are for C++1: https://github.com/hsutter/cppfront?tab=readme-ov-file#2015-lifetime-safety

Each point where a pointer variable is modified, the compiler tracks what possible things it could point to, e.g. local data. For the latter, when the local data goes out of scope, if the pointer hasn't been overwritten, then it is known to be pointing to invalid data and any subsequent dereference is flagged at compile-time.

A partial prototype was implemented for Clang which was demo'd in the 2018 YouTube video. There is also a formal written proposal P1179. There is a section in that paper on loops - see 2.4.9:
 A loop is treated as if it were the first two loop iterations 
 unrolled using an if. For example,
`for(/*init*/;/*cond*/;/*incr*/){/*body*/}` is treated as `if(/*init*/;/*cond*/){/*body*/;/*incr*/} if(/*cond*/){/*body*/}`.

There was a section on null dereference detection in the 2018 video: https://youtu.be/80BZxujhY38?t=41m22s

At around the 45m mark Herb actually says "there is no DFA going on". The example there has a loop whose number of iterations is only known at runtime. Just before that null bit there was also an example that detects iterator invalidation.
 Goto skipping initialization is already disallowed in D.
That restriction is imposed because the front end doesn't do DFA.
D already does enforce for aggregate constructors that immutable data is only initialized once. That AIUI is not really DFA, but something simpler, and the idea could be extended to all functions to detect common cases of uninitialized non-null reference types (if we had them) - at some speed cost, hopefully acceptable.
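A small illustration of the existing constructor check, as a sketch only (the `Config` type is made up, and exact diagnostics may differ between compiler versions):

```d
import std.stdio : writeln;

struct Config
{
    immutable int level;

    this(bool verbose)
    {
        // Initializing on both branches is accepted: each path through
        // the constructor assigns the field exactly once.
        if (verbose)
            level = 2;
        else
            level = 1;

        // level = 3; // uncommenting this is rejected by DMD, because the
        //            // immutable field would be initialized a second time
    }
}

void main()
{
    writeln(Config(true).level);
    writeln(Config(false).level);
}
```

DMD likewise rejects initializing an immutable field inside a loop or after a label, which is the kind of structured, per-branch tracking the paragraph above suggests extending to local variables.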
Apr 11 2024
parent reply Nick Treleaven <nick geany.org> writes:
On Thursday, 11 April 2024 at 16:19:52 UTC, Nick Treleaven wrote:
 Each point where a pointer variable is modified, the compiler 
 tracks what possible things it could point to, e.g. local data.
It tracks anything in the scope of the current function that the pointer could point to, at each statement.
 For the latter, when the local data goes out of scope, if the 
 pointer hasn't been overwritten, then it is known to be 
 pointing to invalid data and any subsequent dereference is 
 flagged at compile-time.
What I meant was if there is a dereference of a pointer that *may have been* (according to the limited analysis) assigned the address of a local that has gone out of scope, that dereference gets flagged at compile-time. Even though at runtime it may never actually have that address.
Apr 11 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/11/2024 9:25 AM, Nick Treleaven wrote:
 What I meant was if there is a dereference of a pointer that *may have been* 
 (according to the limited analysis) assigned the address of a local that has 
 gone out of scope, that dereference gets flagged at compile-time. Even though at 
 runtime it may never actually have that address.
Given the following:
```
@safe
void foo()
{
    int* p;
    {
        int x;
        p = &x;
    }
}
```
The compiler gives:

test.d(8): Error: address of variable `x` assigned to `p` with longer lifetime

when the -preview=dip1021 switch is used.

https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md

Perhaps it's time to make dip1021 the default. Or at least turn it on with dip1000?
Apr 16 2024
next sibling parent ag0aep6g <anonymous example.com> writes:
On 16.04.24 20:25, Walter Bright wrote:
 ```
  @safe
 void foo()
 {
      int* p;
      {
      int x;
      p = &x;
      }
 }
 ```
 
 The compiler gives:
 
 test.d(8): Error: address of variable `x` assigned to `p` with longer 
 lifetime
 
 when the -preview=dip1021 switch is used.
 
 https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
 
 Perhaps it's time to make dip1021 the default. Or at least turn it on 
 with dip1000?
DIP 1021 does nothing here. It's `-preview=dip1000` that prints the error. `-preview=dip1021` implies `-preview=dip1000`. And no `-preview` is needed at all for DMD to reject the code (with a different error).
Apr 16 2024
prev sibling parent Nick Treleaven <nick geany.org> writes:
On Tuesday, 16 April 2024 at 18:25:29 UTC, Walter Bright wrote:
 ```
  @safe
 void foo()
 {
     int* p;
     {
 	int x;
 	p = &x;
     }
 }
 ```

 The compiler gives:

 test.d(8): Error: address of variable `x` assigned to `p` with 
 longer lifetime
-dip1000 is good at detecting possible dangling pointers to scope data, but it does it when the pointer is assigned. The difference is that the analysis in the C++ paper only tells you *when you try to dereference* a pointer which may point to data which is now invalid because the dereference happens in a higher scope. There are useful cases where -dip1000 would give a false positive but which the paper would allow (e.g. involving loops, or where the pointer is written to later before the dereference, overwriting the invalid pointer).

Anyway, I was just trying to describe what the C++ paper is supposed to do. My main point was about D detecting uninitialized variable use (which is a prerequisite for non-nullable types).
Apr 18 2024
prev sibling parent Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
 Consider the following:
 ```
 class A { void bar(); }

 void foo(int i) {
     A a;
     if (i) a = new A();
     ...
     if (i) a.bar();
 }
 ```
 What happens if we apply data flow analysis to determine the 
 state of `a` when it calls `bar()`? It will determine that `a` 
 has the possible values (`null`, `new A()`). Hence, it will give 
 an error that `a` is possibly null at that point.
A type system can come to this conclusion, no control-flow analysis needed.
 Yet the code is correct, not buggy.
Depends on what `...` does with `i`.
 Yes, the compiler could figure out that `i` is the same, but 
 the conditions can be more complex such that the compiler 
 cannot figure it out (the halting problem).

 So that doesn't work.
It seems you want the compiler not to diagnose “obvious” cases where a null check would be superfluous. My sense is the more formal/mathy inclined people don’t even ask for that.
 We could lower `a.bar()` to `NullCheck(a).bar()` which throws 
 an exception if `a` is null. But what have we gained there? 
 Nothing. The program still aborts with an exception, just like 
 if the hardware checked. Except we've got this manual check 
 that costs extra code and CPU time.
Please not. Just raise a compile-error.
 BTW, doing data flow analysis is very expensive in terms of 
 compiler run time. The optimizer does it, but running the 
 optimizer is optional for that reason.
You don’t need data flow analysis if the type system can tell which values are potentially `null` and which aren’t. Comprehensive example:
```d
void foo(int i) {
    A? a; // Change 1: Tell the type system that `a` is possibly null
    if (i) a = new A();
    ...
    if (i) (cast(A)a).bar(); // Change 2: A cast to assert `a` is not null.
}
```
The question is, will `i` be (effectively) changed in `...`? If it won’t, the two `if` checks are the same and `a.bar()` would be fine. Only the type system doesn't know. It sees an `A?` object having a method called on it.

Going from the original code to this would happen like this:
1. Trying to compile, you get an error stating that `A a` must be either initialized or be a nullable type (i.e. `A? a`) if `a` is supposed to be potentially `null`. Okay, you think, it’s supposed to be `null`, i.e. you use `A?`.
2. Trying to compile again, you get an error saying `a` cannot have a method called on it because its type says it’s possibly `null`, and you have to make sure somehow that it won’t be. Options are:
   * Use `a?.bar()` which only calls `bar` if `a` isn’t null.
   * Use `a!.bar()` which asserts (throwing an Error) that `a` isn’t null and then calls `bar`.
   * Use an explicit cast, which just silences the error, i.e. inserts no check, meaning it gives you a segfault if `a` is null.
3. So you insert a cast, assuming you don’t meaningfully touch `i` in `...`.

You’re absolutely right that a segfault is infinitely better than UB, but a type system that catches potential/likely segfaults before the program even runs once is infinitely better than segfaults. In this example, if `...` does change `i`, the cast is ill-posed and you go back to segfault land. If you’re unsure, `a!.bar()` is probably better. It’s definitely safer. (And, for the segfault land enthusiasts, we can add syntax sugar for the cast: `cast(!null)` for cases when the type is not known or needlessly long.)
What you do need control-flow analysis for is if you want to avoid those casts in “obvious” cases (e.g. if the `...` doesn’t write to `i`). What doesn’t need control-flow analysis are language constructs that the compiler recognizes as null checks. Just imagine for a moment D had `final` as a type constructor which for classes means head-const. On `if (a !is null)` the compiler can add a new variable `A __a = cast(A)a;` and attempt to use `__a` instead of `a` in the code block. If it succeeds, `a` isn’t possibly reassigned `null` and therefore stays non-null. No additional clutter needed in the block. Recognizing this pattern isn’t that hard I hope. It’s definitely less expensive than control-flow analysis.
Apr 25 2024
prev sibling parent reply cc <cc nevernet.com> writes:
On Monday, 11 March 2024 at 10:35:55 UTC, Sergey wrote:
 On Monday, 11 March 2024 at 10:21:43 UTC, Alex wrote:
 No. It was test how D handle uninitialized variables.
Oh right. Didn't get from the first read. Based on the info from here https://en.wikipedia.org/wiki/Void_safety feature.
And Objective-C, in which calling methods on (er, sending messages to) null references is not only allowed, it's just good manners. Coming to D from that was a bit of a shock with the sudden segfaults, though eventually we adapt. Still, it would be nice to have the option of Errors thrown rather than a straight crash. I assume this is for efficiency reasons, but couldn't a compiler flag be a consideration? Debuggers are nice but it's still an extra, external step.
Mar 12 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 4:42 PM, cc wrote:
 And Objective-C, in which calling methods on (er, sending messages to) null 
 references is not only allowed, it's just good manners.  Coming to D from that 
 was a bit of a shock with the sudden segfaults, though eventually we adapt.  
 Still, it would be nice to have the option of Errors thrown rather than a 
 straight crash.  I assume this is for efficiency reasons, but couldn't a 
 compiler flag be a consideration?  Debuggers are nice but it's still an extra, 
 external step.
On some platforms, seg faults are intercepted by the D runtime and a stack trace is emitted.
Mar 12 2024
parent Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 06:07:32 UTC, Walter Bright wrote:
 On some platforms, seg faults are intercepted by the D runtime 
 and a stack trace is emitted.
It would be nice to have platform-independent behavior that is fixed in the language specification.
Mar 13 2024
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:
 Hello,

 I am interesting D as memory safe language (maybe SafeD?) and 
 have written very simple code:

 ```d
  @safe

 import std.stdio;
The `@safe` attribute there does nothing, it only applies to the import declaration, and is ignored. Perhaps you meant `@safe:` with the trailing colon, so it applies the attribute to every declaration after it in the module.
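A minimal sketch of the colon form. The function `firstElement` is made up for illustration, and `isSafe` from `std.traits` is used only to confirm at compile time that the attribute reached the function:

```d
@safe: // trailing colon: applies to every declaration that follows in this module

import std.stdio : writeln;
import std.traits : isSafe;

int firstElement(int[] arr)
{
    // int* p = void; // would be rejected here: void-initialized pointers
    //                // are not allowed in @safe functions
    return arr.length ? arr[0] : 0; // bounds-checked access is fine in @safe code
}

void main()
{
    // Compile-time confirmation that `@safe:` applied to firstElement.
    static assert(isSafe!firstElement);
    writeln(firstElement([1, 2, 3]));
}
```

Without the colon, the attribute would attach only to the next declaration, which in the original program was the `import`.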
 So I don't see any errors or warnings from compiler when I use 
 uninitialized variable `a`
`a` is not uninitialized - you have to use `= void` for that (https://dlang.org/spec/declaration.html#void_init). Uninitialized pointers/references are not allowed in @safe functions.
 and don't see any exception with backtrace in runtime 
 (application is build in debug mode).
Try using optimization. On Linux, the backend can detect the null dereference at compile-time:
```
$ dmd -O os/nullobj.d
os/nullobj.d(22): Error: null dereference in function _Dmain
```
Line 22:
	a.run();

However, only simple cases are detected at compile-time.
 Is it expected behavior?
 Looks like it is not very safe approach and can lead to very 
 unpleasant memory errors...
@safe only means memory-safety: https://dlang.org/spec/memory-safe-d.html

Null-safety is not part of memory-safety, because in D it should not be possible to violate memory-safety when a pointer/reference is null.

For a long time I've wanted compile-time null-safety using non-nullable pointers/references, but there are no plans to add that AFAIK.
Mar 11 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Monday, 11 March 2024 at 10:48:52 UTC, Nick Treleaven wrote:
 The `@safe` attribute there does nothing, it only applies to 
 the import declaration, and is ignored. Perhaps you meant 
 `@safe:` with the trailing colon, so it applies the attribute 
 to every declaration after it in the module.
Yes, I meant to declare the whole file as @safe. Thank you! I am not so familiar with D yet :)
 `a` is not uninitialized - you have to use `= void` for that 
 (https://dlang.org/spec/declaration.html#void_init). 
 Uninitialized pointers/references are not allowed in @safe 
 functions.
OK, got it: in my example the variable is initialized with the default value (null).
 Try using optimization. On Linux, the backend can detect the 
 null dereference at compile-time:
 ```
 $ dmd -O os/nullobj.d
 os/nullobj.d(22): Error: null dereference in function _Dmain
 ```
Thanks, it works on Windows too :) Is it possible to pass the compilation flag -O via `dub run`?
 Line 22:
 	a.run();

 However, only simple cases are detected at compile-time.
You're right, after this trivial modification the compiler fails to detect the bug :(
```d
void doRun(A a)
{
	a.run();
}

int main()
{
	A a;
	//a.run();
	doRun(a);
	writeln("Hello, world!");
	return 0;
}
```
  @safe only means memory-safety:
 https://dlang.org/spec/memory-safe-d.html

 Null-safety is not part of memory-safety, because in D it 
 should not be possible to violate memory-safety when a 
 pointer/reference is null.
Formally yes, but in my opinion a segfault doesn't look good for a language that says "I have a memory-safe feature". For example, Rust guarantees that successfully compiled code can't lead to a crash/segfault.
 For a long time I've wanted compile-time null-safety using 
 non-nullable pointers/references, but there are no plans to add 
 that AFAIK.
Very good idea! Hope you can implement it sooner or later :)
Mar 11 2024
next sibling parent Ogi <ogion.art gmail.com> writes:
On Monday, 11 March 2024 at 13:16:04 UTC, Alex wrote:
 Thanks, it works on Windows too :)
 Is it possible to pass the compilation flag -O via `dub run`?
Of course: `dub run --build=release` :). The `-O` flag instructs DMD to optimize generated code, and null dereference detection is just a by-product of data flow analysis that DMD performs for some optimizations.
Mar 11 2024
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
All a seg fault is is the hardware detecting the fault.

Consider an array overflow:

```
int test(int[] a)
{
     return a[100];
}
```
There is no way in general to detect an array overflow here, so there is a 
runtime check. If the check fails, a fatal array overflow exception is generated.

I'm pretty sure Rust does the same thing for array overflows.

A seg fault is no different, it's just the hardware doing the check for you, so 
it doesn't cost you any extra code or speed.
Mar 11 2024
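To make the runtime check concrete, here is a sketch that trips it on purpose and observes the resulting `RangeError` from `core.exception`. It assumes bounds checking has not been disabled (for example with `-boundscheck=off`); catching an `Error` is done here for demonstration only and is not a recommended recovery strategy. The helper name `tripsBoundsCheck` is made up.

```d
import core.exception : RangeError;
import std.stdio : writeln;

// Same shape as the function above. Marking it @safe keeps the bounds
// check in place with the default settings of a -release build as well.
int test(int[] a) @safe
{
    return a[100];
}

/// Returns true if indexing element 100 tripped the runtime bounds check.
bool tripsBoundsCheck(int[] a)
{
    try
    {
        test(a);
        return false;
    }
    catch (RangeError e) // demonstration only
    {
        return true;
    }
}

void main()
{
    writeln(tripsBoundsCheck(new int[](3)));   // array too short for index 100
    writeln(tripsBoundsCheck(new int[](101))); // index 100 is valid here
}
```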
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:

 Is it expected behavior?
 Looks like it is not very safe approach and can lead to very 
 unpleasant memory errors...
So I know there are a lot of responses here, with a lot of discussion. But I don't think anyone has told you *why* D works this way.

The explanation is that D is expecting the memory hardware to fault when you dereference null. We know that this is not the case for all situations, but it is the case for all of D's normal usage modes (e.g. as user-code on standard operating systems).

Since the memory hardware *already supports this*, and is essentially free, D has deferred to that mechanism to guard against dereferencing null pointers. Not assuming this behavior means all dereferences of pointers/classes in `@safe` code would have to be instrumented with a check, slowing down the code significantly.

I consider null pointer faults to be annoying, but not nearly as bad as dangling pointer accesses. At least a null pointer *always* crashes when you access it.

-Steve
Mar 11 2024
next sibling parent reply Alex <akornilov.82 mail.ru> writes:
On Monday, 11 March 2024 at 19:43:33 UTC, Steven Schveighoffer 
wrote:
Hello, Steven!

Thank you for clarifications!

 The explanation is that D is expecting the memory hardware to 
 fault when you dereference null.
Ok, got it, but on Windows I observe strange behaviour at runtime: the application terminates without a system error or any logs in the console. In this case I think a developer can't localize and figure out what is going wrong in the code.
 Not assuming this behavior means all dereferences of 
 pointers/classes in @safe code would have to be instrumented 
 with a check, slowing down the code significantly.
I agree that a check on each pointer dereference may be expensive at runtime. But from my point of view the solution is to disallow null references in code altogether. The compiler can enforce that during compilation, and validation at runtime won't be needed.
Mar 11 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/11/2024 1:14 PM, Alex wrote:
 Ok, got it, but on Windows I observe strange behaviour in runtime: application 
 terminated without system error and any logs in console.
That's odd because Windows has no trouble reporting null dereferences and exiting the program. Try running it under the VC debugger. It'll drop you right on the line of source code that dereferences the null.
Mar 11 2024
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/11/2024 12:43 PM, Steven Schveighoffer wrote:
 The explanation is that D is expecting the memory hardware to fault when you 
 dereference null. We know that this is not the case for all situations
In particular, when a constant is added to the null reference that is large enough to skip over the protected pages in the memory space.
Mar 11 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Tuesday, 12 March 2024 at 03:28:54 UTC, Walter Bright wrote:
 On 3/11/2024 12:43 PM, Steven Schveighoffer wrote:
 The explanation is that D is expecting the memory hardware to 
 fault when you dereference null. We know that this is not the 
 case for all situations
In particular, when a constant is added to the null reference that is large enough to skip over the protected pages in the memory space.
I may have mentioned this before, but the way to fix this is in `@safe` code, before each reference with a constant offset that you know to be greater than one page, validate the root pointer is not null.

FWIW, I was actually talking about environments where the null page does not segfault, like in a kernel.

-Steve
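A hand-written version of that check might look like the sketch below. The 4096-byte page size, the `Big` struct and the `readValue` helper are all assumptions made for illustration; the proposal above is for the compiler to insert such a check itself.

```d
import std.stdio : writeln;

enum pageSize = 4096; // assumed guard-page size; real platforms may differ

struct Big
{
    ubyte[2 * pageSize] padding; // pushes `value` past the assumed guard page
    int value;
}

/// Hypothetical manual check: validate the root pointer before touching
/// a field whose offset is at least one page.
int readValue(Big* p)
{
    static if (Big.value.offsetof >= pageSize)
    {
        if (p is null)
            throw new Exception("null pointer with large field offset");
    }
    return p.value;
}

void main()
{
    auto b = new Big;
    b.value = 7;
    writeln(Big.value.offsetof);
    writeln(readValue(b));
}
```

Because the offset is a compile-time constant, the `static if` means fields that lie within the first page pay no extra cost and still rely on the hardware fault.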
Mar 12 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 6:57 AM, Steven Schveighoffer wrote:
 I may have mentioned this before, but the way to fix this is in `@safe` code, 
 before each reference with a constant offset that you know to be greater than 
 one page, validate the root pointer is not null.
I had thought the compiler already checked for that. But testing shows it does not. I wonder if there was a PR to remove it?
 FWIW, I was actually talking about environments where the null page does not 
 segfault, like in a kernel.
I wonder why anyone would design it that way.
Mar 12 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Tuesday, 12 March 2024 at 17:42:14 UTC, Walter Bright wrote:
 On 3/12/2024 6:57 AM, Steven Schveighoffer wrote:
 I may have mentioned this before, but the way to fix this is 
 in `@safe` code, before each reference with a constant offset 
 that you know to be greater than one page, validate the root 
 pointer is not null.
I had thought the compiler already checked for that. But testing shows it does not. I wonder if there was a PR to remove it?
I don't know if that was ever present.
 FWIW, I was actually talking about environments where the null 
 page does not segfault, like in a kernel.
I wonder why anyone would design it that way.
e.g.: https://en.wikipedia.org/wiki/Zero_page#Interrupt_vectors -Steve
Mar 12 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/12/2024 12:00 PM, Steven Schveighoffer wrote:
 FWIW, I was actually talking about environments where the null page does not 
 segfault, like in a kernel.
I wonder why anyone would design it that way.
e.g.: https://en.wikipedia.org/wiki/Zero_page#Interrupt_vectors
As the article says, only the x86 in real mode does that.
Mar 12 2024
prev sibling parent reply ShowMeTheWay <ShowMeTheWay gmail.com> writes:
On Monday, 11 March 2024 at 19:43:33 UTC, Steven Schveighoffer 
wrote:
 On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:

 Is it expected behavior?
 Looks like it is not very safe approach and can lead to very 
 unpleasant memory errors...
So I know there are a lot of responses here, with a lot of discussion. But I don't think anyone has told you *why* D works this way. The explanation is that D is expecting the memory hardware to fault when you dereference null. We know that this is not the case for all situations, but it is the case for all of D's normal usage modes (e.g. as user-code on standard operating systems). Since the memory hardware *already supports this*, and is essentially free, D has deferred to that mechanism to guard against dereferencing null pointers. Not assuming this behavior means all dereferences of pointers/classes in `@safe` code would have to be instrumented with a check, slowing down the code significantly. I consider null pointer faults to be annoying, but not nearly as bad as dangling pointer accesses. At least a null pointer *always* crashes when you access it. -Steve
The problem is less that the code is dereferencing null, and more that "..forgetting to assign a value to a local is probably a bug.", to quote Eric Lippert.

When you're dereferencing null in a situation where you almost certainly should NOT be doing that, then it should be considered a likely bug.

To quote him some more: "If its probably a bug and it is cheap and easy to detect, then there is good incentive to make the behavior either illegal or a warning."

Many of us use compilers (that have been around for decades) that do just that.

(even though it's actually a bug):

 A a;
 a.run();

This should not be legal D code. It should produce an error if compiled.

It's not difficult for a compiler to work this one out.
Apr 16 2024
parent reply bachmeier <no spam.net> writes:
On Tuesday, 16 April 2024 at 07:25:21 UTC, ShowMeTheWay wrote:


 (even though it's actually a bug):

 A a;
 a.run();

 This should not be legal D code. It should produce an error if 
 compiled.

 It's not difficult for a compiler to work this one out.
I'm repeating myself, but there's no good argument in favor of that compiling. All it gives you is bugs, confusion, and a steep learning curve in the name of saving a few keystrokes.

```
import std;

void main() {
    A a;
    writeln(a is null); // true
    B b = null;
    writeln(b is null); // true
    C c = void;
    writeln(c is null); // false, c isn't initialized to null or anything else
}

class A {}
class B {}
class C {}
```

`a.run()` is natural. `b.run()` wouldn't make sense even to a new programmer. Neither would `c.run()`.
Apr 16 2024
parent reply bachmeier <no spam.net> writes:
On Tuesday, 16 April 2024 at 15:48:15 UTC, bachmeier wrote:
 On Tuesday, 16 April 2024 at 07:25:21 UTC, ShowMeTheWay wrote:


 (even though it's actually a bug):

 A a;
 a.run();

 This should not be legal D code. It should produce an error if 
 compiled.

 It's not difficult for a compiler to work this one out.
I'm repeating myself, but there's no good argument in favor of that compiling. All it gives you is bugs, confusion, and a steep learning curve in the name of saving a few keystrokes.
That should say there's no good argument in favor of the first line in your example compiling. The compiler should stop as soon as it sees `A a;`.
Apr 16 2024
parent reply ShowMeTheWay <ShowMeTheWay gmail.com> writes:
On Tuesday, 16 April 2024 at 15:51:11 UTC, bachmeier wrote:
 ...
 That should say there's no good argument in favor of the first 
 line in your example compiling. The compiler should stop as 
 soon as it sees `A a;`.
I have to disagree here.

If A is a class definition, then:

 A a;

is valid code, even though the compiler initialises it to null here, by default. null is a valid state for a.

The programmer might prefer to assign to it separately, for example:

 a = new A();

In this case:

 a.run();

cannot be considered a likely bug, since 'a' has been assigned to before it is used.

So the bug is not that 'a' is null (because that is a valid state for 'a'), but that 'a' is being used before it has been assigned to.

Now if the programmer did:

 A a = null; // definitely assigned
 a.run();

.. then the programmer is an idiot, and the compiler should let the programmer deal with the consequences .... backwards with regards to the compiler's ability to detect simple bugs like this.

Coming from a C++ background is even worse perhaps, because a C++ programmer may well think that 'A a;' actually calls the object's constructor... a mistake I made a lot early on with D, and still do on occasion (and the compiler is no help at all here).

D will always accept 'A a;' as valid code, I expect.

What the D compiler really needs, however, is some simple definite assignment analysis (for local variables) to detect a simple bug like this... perhaps as a compile-time argument initially... but eventually made the default.

I doubt it would be difficult or complex to implement (for someone who knows how to). So really, the only question is what impact it would have on compile time. If it's only for local variables, then that impact should not really be noticeable.

btw. This too is a likely bug:

 int b;
 writeln(b);

The compiler should require you to assign to 'b' before using it.

On the other hand, this below should *not* get the compiler's attention:

 int b = int.init;
 writeln(b);
Apr 16 2024
next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Tuesday, 16 April 2024 at 22:15:42 UTC, ShowMeTheWay wrote:
 btw. This too is a likely bug:

 int b;
 writeln(b);

 The compiler should require you to assign to 'b' before using 
 it.

 On the otherhand, this below should *not* get the compilers 
 attention:

 int b = int.init;
 writeln(b);
Both are semantically equivalent. The first version is about knowing how the language works, the second is about being stupid. D's policy on default initializers is really to create a clear poison value. You still have "void initialization" if you want to introduce UB.

```d
int b = void;
writeln(b);
```

That is more what should get the compiler's attention.
Apr 16 2024
next sibling parent ShowMeTheWay <ShowMeTheWay gmail.com> writes:
On Wednesday, 17 April 2024 at 00:25:07 UTC, Basile B. wrote:
 On Tuesday, 16 April 2024 at 22:15:42 UTC, ShowMeTheWay wrote:
 btw. This too is a likely bug:

 int b;
 writeln(b);

 The compiler should require you to assign to 'b' before using 
 it.

 On the otherhand, this below should *not* get the compilers 
 attention:

 int b = int.init;
 writeln(b);
Both are semantically equivalent. The first version is about knowing how the language works, the second is about being stupid. D policy about default initializers is really to create clear poison value. You still have "void initialization" if you want to introduce UBs.

```d
int b = void;
writeln(b);
```

that is more what should get the compiler attention.
I don't agree. Once it's been definitely assigned, the compiler should leave it to the programmer. If there's a bug, then that is the programmer's problem to deal with. On the other hand, use of a variable that has not yet been assigned to (regardless of its default value)... well, that is almost certainly a bug, and a worthy target for the compiler's attention. I don't expect to see it in the D compiler, given the priorities of D, but it would be nice to have.
Apr 16 2024
prev sibling parent ShowMeTheWay <ShowMeTheWay gmail.com> writes:
On Wednesday, 17 April 2024 at 00:25:07 UTC, Basile B. wrote:
 On Tuesday, 16 April 2024 at 22:15:42 UTC, ShowMeTheWay wrote:
 btw. This too is a likely bug:

 int b;
 writeln(b);

 The compiler should require you to assign to 'b' before using 
 it.

 On the otherhand, this below should *not* get the compilers 
 attention:

 int b = int.init;
 writeln(b);
Both are semantically equivalent. The first version is about knowing how the language works, the second is about being stupid. D policy about default initializers is really to create clear poison value. You still have "void initialization" if you want to introduce UBs.

```d
int b = void;
writeln(b);
```

that is more what should get the compiler attention.
btw. I don't see these as semantically equivalent. The compiler may, but I don't.

 int b;
 writeln(b);

vs..

 int b = void;
 writeln(b);

The first one suggests to me a likely bug.

The second one suggests to me that the programmer has definitely assigned a value to b, and whether the writeln code is a bug or not depends on what the programmer intended.... I cannot assume that a bug was not the intention... perhaps it was.. perhaps it wasn't. The compiler would not know the intention of the programmer either.

Only in the case of the 'use of an unassigned variable' can the compiler reasonably, and quickly, assume it's a likely bug, and that it should alert the programmer.

In the first example, I would want the compiler to alert me. In the second example, I would want the compiler to get out of the way and let me do what I want...
Apr 16 2024
prev sibling next sibling parent Doigt <labog outlook.com> writes:
On Tuesday, 16 April 2024 at 22:15:42 UTC, ShowMeTheWay wrote:
 int b;
 writeln(b);

 The compiler should require you to assign to 'b' before using 
 it.

 On the otherhand, this below should *not* get the compilers 
 attention:

 int b = int.init;
 writeln(b);
Not a bug. Variables that aren't explicitly assigned a value have default values that are implicitly assigned to them. Those are the same default values used for arrays and struct/class fields. So in theory, you should know about them because they are unavoidable elsewhere even if implicit value assignment was turned off.
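
A short sketch of the default values in question, assuming only the `.init` rules from the D language specification. The struct `Point` is a hypothetical type added for illustration.

```d
import std.math : isNaN;
import std.stdio;

// Hypothetical aggregate used to show per-field defaults.
struct Point { int x; double y; }

void main() @safe
{
    int i;          // int.init is 0
    double d;       // double.init is NaN
    char c;         // char.init is 0xFF, an invalid UTF-8 code unit
    int[3] arr;     // every element is int.init
    Object o;       // class references start out as null
    Point p;        // each field takes its own type's .init

    assert(i == 0);
    assert(d.isNaN);
    assert(c == 0xFF);
    assert(arr == [0, 0, 0]);
    assert(o is null);
    assert(p.x == 0 && p.y.isNaN);
    writeln("all default initializers are as expected");
}
```

Note that only some of these defaults (NaN, 0xFF, null) are conspicuously invalid; `int.init` is an ordinary 0.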
Apr 16 2024
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/04/2024 10:15 AM, ShowMeTheWay wrote:

 backwards with regards to the compilers ability to detect simple bugs 
 like this.
What you are talking about is called type state analysis.

https://en.wikipedia.org/wiki/Typestate_analysis

To do this properly requires a verification data flow analysis (DFA). But once you have that DFA it opens a lot of doors for memory and temporal safety.

Currently D is defined against type state analysis, however it doesn't have it as its own thing.

Pointers are defined to be guaranteed to be non-null when accessed by using the CPU's MMU to throw an exception if you tried to access it. If you goto past a variable, that is a compiler error as it is not reachable.

My proposal has the hierarchy of: unreachable < reachable < initialized < default < non-null < user-defined.

https://forum.dlang.org/post/ucdmmlxklanpsggqmwas@forum.dlang.org

So yes D is backwards, because it's missing a whole bunch of analysis that you are expecting to exist.
Apr 17 2024
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
Just responding in general to the discussion here:



This is a very difficult problem to solve. It can be automated to 
a degree, but I believe this is equivalent to the halting 
problem, so it's not solvable in general, let alone in P time.

However, many languages can prove *cases* of this using developer 
help (i.e. "unwrapping" a maybe-null value). This means that you 
have the drawback that your code has to be instrumented with null 
checks (written by you). And even if you do this, if it turns out 
the thing is null, you still have to handle it!



When you have an object that is null, and it shouldn't be, the 
means by which that null happened are no longer important. You 
need to handle it somehow.

- The path D takes is, let the hardware solve it.
- The path Java takes is to instrument dereferences on 
possibly-null variables and throw an exception if it's null.
- The path many other languages take is to force you to validate 
it's not null before using it.

These mechanisms all have their advantages and disadvantages.

D and Java have the advantage that you don't have to jump through 
hoops to prove things to the compiler. If a null pointer fault 
happens, it's because you have a bug in your code, but the 
language is your safety net here. It is still memory-safe in D 
since the hardware fault stops you from continuing execution, and 
any invalid pointers other than null should not be possible. In 
Java, it's obviously still memory safe.

For the other languages, you are forced to validate something is 
null or not null. This has the advantage that certain classes of 
bugs can be avoided at compile time, and in many cases, the code 
can be clearer where null pointers might exist. But the cost is 
that you may be forced to validate something that is obvious to 
the developer (but not the compiler). It adds to the tediousness 
of the language.

Relying on segfaults also has a further drawback that unless you 
happen to be running under a debugger, or have enabled 
core-dumps, you get no indication where in your program the issue 
happened. There is also no distinction between memory corruption 
and null pointer dereferencing. I think we should have a better 
outcome than this, we definitely can be more informative about 
what is happening.
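
One way to get a more informative failure today is an explicit check that records the call site. This is a sketch only: `checked` is a hypothetical helper, not a Phobos function, built on `std.exception.enforce`.

```d
import std.exception : enforce;
import std.stdio;

class A
{
    void run() @safe { writeln("[A] run"); }
}

// Hypothetical helper: throws if `obj` is null, otherwise returns it unchanged.
// The default `file`/`line` arguments are evaluated at the call site.
T checked(T)(T obj, string file = __FILE__, size_t line = __LINE__)
if (is(T == class))
{
    enforce(obj !is null, "null reference", file, line);
    return obj;
}

void main() @safe
{
    A a; // default-initialized to null
    try
    {
        checked(a).run();
    }
    catch (Exception e)
    {
        // Reports the file and line of the `checked(a)` call above.
        writeln(e.file, ":", e.line, ": ", e.msg);
    }
}
```

The cost is the one discussed throughout this thread: every guarded dereference pays for a branch, and the programmer has to remember to write it.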

The point I'm making is that it's not possible to ensure you 
never have to deal with null pointers. They happen. They even 
might happen in memory safe languages such as Rust. The 
difference is how you have to handle them. Handling them one way 
or another has benefits or drawbacks, but those are the tradeoffs 
you must make. It's important to note that in all these 
situations, the code is still *memory safe*.

-Steve
Mar 13 2024
parent reply Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 18:20:12 UTC, Steven Schveighoffer 
wrote:
 - The path D takes is, let the hardware solve it.
 - The path Java takes is to instrument dereferences on 
 possibly-null variables and throw an exception if it's null.
 - The path many other languages take is to force you to 
 validate it's not null before using it.
Rust doesn't allow null references at all (excluding unsafe code). That is one more alternative path.
 For the other languages, you are forced to validate something 
 is null or not null. This has the advantage that certain 
 classes of bugs can be avoided at compile time, and in many 
 cases, the code can be clearer where null pointers might exist. 
 But the cost is that you may be forced to validate something 
 that is obvious to the developer (but not the compiler). It 
 adds to the tediousness of the language.
On the other hand, with this approach the developer can choose between nullable and non-nullable types. If they choose a nullable type, they really do have to write the checks. But with a non-nullable type they can work completely safely without any checks. As far as I know, Kotlin encourages using non-nullable types wherever possible.
Mar 13 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 13 March 2024 at 19:36:01 UTC, Alex wrote:
 On Wednesday, 13 March 2024 at 18:20:12 UTC, Steven 
 Schveighoffer wrote:
 - The path D takes is, let the hardware solve it.
 - The path Java takes is to instrument dereferences on 
 possibly-null variables and throw an exception if it's null.
 - The path many other languages take is to force you to 
 validate it's not null before using it.
Rust doesn't allow null references at all (exclude unsafe code). It is one more alternative path.
Then "optional" types, basically the same thing. Null is a memory safe "invalid thing".

The thing I'm getting at is -- if you have something that doesn't exist, but is supposed to exist, then you have to deal with it. How you deal with it is where the tradeoffs come in.

The basic building block that all memory-safe language tools are built on is -- you should not be able to use invalid memory. null pointers which halt the program are a flavor of that.
 For the other languages, you are forced to validate something 
 is null or not null. This has the advantage that certain 
 classes of bugs can be avoided at compile time, and in many 
 cases, the code can be clearer where null pointers might 
 exist. But the cost is that you may be forced to validate 
 something that is obvious to the developer (but not the 
 compiler). It adds to the tediousness of the language.
On the other side with this approach developer can choose between nullable type and non-nullable. If he choose nullable type he really have to do checks. But for non-nullable type he can work completely safety without any checks. As I know Kotlin encourages using non-nullable types wherever possible.
The checks have to come somewhere. If you have a value that is of *unknown* validity, and you want to ensure it's valid, you need a check.

We have different flavors of:

* automated checks
* how the checks are handled if failure occurs
* checks that you are forced to perform
* how long such checks are enforced by the compiler (e.g., flow analysis, building into the type the validity).

Building non-null into the type indeed means as long as you have that type, you don't have to check. But to get it into that type, if you started with a possibly-invalid value, *somewhere* you had to do a check.

Consider an array/vector of items in Rust. And an index. When you index that vector, the compiler has no idea what that index is. It must validate the index before dereferencing the element. This is a check, and the handling of it is defined by the language.

Having a possibly-null pointer is no different. D defines that in safe code, a pointer will be valid or null. The "check" occurs on use, and is performed by the hardware. This is in *stark* contrast to having a possibly-invalid pointer to non-null memory (e.g. dangling or buffer overflow). Those should never occur.

-Steve
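
The "check once, then carry the proof in the type" idea can be sketched in D as a library type. `NonNull` here is a hypothetical wrapper written for illustration, not something Phobos provides.

```d
import std.stdio;

class A
{
    void run() @safe { writeln("[A] run"); }
}

// Hypothetical wrapper: the only null check happens in the constructor.
struct NonNull(T)
if (is(T == class))
{
    private T payload;

    @disable this(); // no default construction, which would leave payload null

    this(T value) @safe
    {
        if (value is null)
            throw new Exception("null passed to NonNull");
        payload = value;
    }

    inout(T) get() inout @safe pure nothrow @nogc { return payload; }
    alias get this; // forward member access to the wrapped reference
}

void main() @safe
{
    auto a = NonNull!A(new A);
    a.run(); // no check needed here; the type already carries the guarantee

    A missing; // null by default
    try
    {
        auto b = NonNull!A(missing);
    }
    catch (Exception e)
    {
        writeln("rejected: ", e.msg);
    }
}
```

This is weaker than a language-level non-null type: as far as I know `NonNull!A.init` still bypasses the constructor, so the guarantee holds only for values built through it.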
Mar 13 2024
next sibling parent reply Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 19:58:24 UTC, Steven Schveighoffer 
wrote:
 Then "optional" types, basically the same thing. Null is a 
 memory safe "invalid thing".

 The thing I'm getting at is -- if you have something that 
 doesn't exist, but is supposed to exist, then you have to deal 
 with it. How you deal with it is where the tradeoffs come in.

 The basic building block that all memory-safe language tools 
 are built on is -- you should not be able to use invalid 
 memory. null pointers which halt the program are a flavor of 
 that.
I think the fundamental difference between an optional type and a nullable reference is that an optional type is under the compiler's control, as part of the language's type system. A nullable reference will "shoot" only at runtime, and the compiler can't help.
 Building non-null into the type indeed means as long as you 
 have that type, you don't have to check. But to get it into 
 that type, if you started with a possibly-invalid value, 
 *somewhere* you had to do a check.
Or have initialized the variable with a new instance (good practice, in contrast with variables that have no explicit initialization).
 Consider an array/vector of items in Rust. And an index. When 
 you index that vector, the compiler has no idea what that index 
 is. It must validate the index before dereferencing the 
 element. This is a check, and the handling of it is defined by 
 the language.
It looks like array bounds checks are not possible at compile time at all. But null references can be handled by the compiler thanks to the type system.
 Having a possibly-null pointer is no different. D defines that 
 in safe code, a pointer will be valid or null. The "check" 
 occurs on use, and is performed by the hardware.
But the fundamental difference is that null pointer checks are performed at runtime in D, but at compile time in Kotlin (and other languages which support null safety).
Mar 13 2024
next sibling parent Alex <akornilov.82 mail.ru> writes:
On Wednesday, 13 March 2024 at 20:34:42 UTC, Alex wrote:
 But the fundamental difference that null pointer checks is 
 performed in D at runtime but in Kotlin (and other languages 
 which support null safety) at compilation time.
I mean that Kotlin can validate the code and point out all the places where a null reference check must be performed (for a nullable type). But the null reference check itself will be performed at runtime.
Mar 13 2024
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 13 March 2024 at 20:34:42 UTC, Alex wrote:
 On Wednesday, 13 March 2024 at 19:58:24 UTC, Steven 
 Schveighoffer wrote:
 Having a possibly-null pointer is no different. D defines that 
 in safe code, a pointer will be valid or null. The "check" 
 occurs on use, and is performed by the hardware.
But the fundamental difference that null pointer checks is performed in D at runtime but in Kotlin (and other languages which support null safety) at compilation time.
No, Kotlin also performs a check at runtime. The same hardware runs the machine code, and the MMU is the same one that runs D code. However, the check is not considered to cost anything. It's part of how the MMU works. -Steve
Mar 13 2024
parent Alex <akornilov.82 mail.ru> writes:
On Thursday, 14 March 2024 at 02:09:15 UTC, Steven Schveighoffer 
wrote:
 No, Kotlin also performs a check at runtime. The same hardware 
 runs the machine code, and the MMU is the same one that runs D 
 code.

 However, the check is not considered to cost anything. It's 
 part of how the MMU works.
Yes, sure, you're right. The checks will happen at runtime too, and thanks to the MMU hardware they are cheap. To be clear:

- The Kotlin compiler rejects code that lacks the required null reference checks for a nullable type.
- Null reference checks are performed at runtime (like in D).
- The Kotlin compiler gives a 100% guarantee that a null pointer exception never happens at runtime (excluding unsafe operations where the developer takes responsibility).

I give Kotlin as an example because I'm familiar with it, but other languages with null safety do something like this. And this powerful feature is possible thanks to the language's type system. The nullable type is a special case of a cool idea, the union type:

https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html
http://web.mit.edu/ceylon_v1.3.3/ceylon-1.3.3/doc/en/spec/html_single/#unionandintersectiontypes
Mar 14 2024
prev sibling next sibling parent reply Don Allen <donaldcallen gmail.com> writes:
On Wednesday, 13 March 2024 at 19:58:24 UTC, Steven Schveighoffer 
wrote:
 On Wednesday, 13 March 2024 at 19:36:01 UTC, Alex wrote:
 On Wednesday, 13 March 2024 at 18:20:12 UTC, Steven 
 Schveighoffer wrote:
 - The path D takes is, let the hardware solve it.
 - The path Java takes is to instrument dereferences on 
 possibly-null variables and throw an exception if it's null.
 - The path many other languages take is to force you to 
 validate it's not null before using it.
Rust doesn't allow null references at all (exclude unsafe code). It is one more alternative path.
Then "optional" types, basically the same thing. Null is a memory safe "invalid thing".
I haven't read this whole thread, but I've seen responses to the above, and I'd like to add my own comment.

The difference between null and optional types in Rust is that the Rust compiler forces you to handle an optional type in order to obtain its value. You can't get at the value without processing the optional with a 'match' (or related constructs, such as 'if let'), which requires handling both possibilities -- Some and None. The compiler therefore forces you to handle the unusual case, so if it happens, the result will be something under your control.

Null and other unexpected values are simply not equivalent. You are not forced to provide code to handle the possibility of the unexpected value. When your code blindly forges ahead, you may get a seg-fault. Depending on how the code was compiled, you may or may not be headed for a session with gdb.

Related to the above, you may also process an uninitialized value, at which point anything can happen. To me, this is one of the weak spots in D, inherited from C. It can lead to an incorrect result returned without error. D's use of default initializers (as opposed to C's random garbage) does not fully protect against this. Rust does prevent this.
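
For comparison, D can approximate the forced handling of both cases at the library level with `std.sumtype`, whose `match` rejects code that leaves a case unhandled. A minimal sketch; `None`, `MaybeInt`, `findIndex`, and `describe` are hypothetical names introduced for illustration.

```d
import std.conv : text;
import std.stdio;
import std.sumtype : SumType, match;

struct None {}
alias MaybeInt = SumType!(None, int);

// Hypothetical lookup: returns the index of `needle`, or None when absent.
MaybeInt findIndex(const int[] haystack, int needle) @safe
{
    foreach (i, v; haystack)
        if (v == needle)
            return MaybeInt(cast(int) i);
    return MaybeInt(None());
}

string describe(MaybeInt m) @safe
{
    // `match` is checked for exhaustiveness at compile time:
    // removing either handler makes this function fail to compile.
    return m.match!(
        (int i) => text("found at index ", i),
        (None none) => "not found"
    );
}

void main() @safe
{
    const data = [10, 20, 30];
    writeln(describe(findIndex(data, 20))); // found at index 1
    writeln(describe(findIndex(data, 99))); // not found
}
```

Unlike Rust's `Option`, nothing in the language steers APIs toward this style, so class references returned by existing code can still be null.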
Mar 16 2024
parent reply Nick Treleaven <nick geany.org> writes:
On Saturday, 16 March 2024 at 19:10:56 UTC, Don Allen wrote:
 The compiler therefore forces you to handle the unusual case, 
 so if it happens, the result will be something under your 
 control.
You can call `unwrap` on the Option which will panic if it's None. But that's fine, because that call makes it clear to anyone reading the code that the programmer is intentionally assuming the Option contains a value. ...
 Related to the above, you may also process an uninitialized 
 value, at which point anything can happen.
It can't violate memory safety:
 Void initializers for variables with a type that may contain 
 unsafe values (such as types with pointers) are not allowed in 
 @safe code.
https://dlang.org/spec/declaration.html#void_init
Mar 18 2024
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 3/18/24 13:31, Nick Treleaven wrote:
 
 ...
 Related to the above, you may also process an uninitialized value, at 
 which point anything can happen.
It can't violate memory safety:
```d
import std.stdio;
void main(){
    bool b = void;
    if(b) writeln("yes");
    if(!b) writeln("no");
}
```
Mar 18 2024
parent Nick Treleaven <nick geany.org> writes:
On Monday, 18 March 2024 at 21:10:40 UTC, Timon Gehr wrote:
 ```d
 import std.stdio;
 void main(){
     bool b = void;
     if(b) writeln("yes");
     if(!b) writeln("no");
 }
 ```
The spec was updated:
 For a bool, only 0 and 1 are safe values.
https://dlang.org/spec/function.html#safe-values

The implementation still needs to make this an error:
https://github.com/dlang/dmd/pull/15362
Mar 22 2024
prev sibling next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 19/03/2024 1:31 AM, Nick Treleaven wrote:
 On Saturday, 16 March 2024 at 19:10:56 UTC, Don Allen wrote:
 The compiler therefore forces you to handle the unusual case, so if it 
 happens, the result will be something under your control.
You can call `unwrap` on the Option which will panic if it's None. But that's fine, because that call makes it clear to anyone reading the code that the programmer is intentionally assuming the Option contains a value. ...
 Related to the above, you may also process an uninitialized value, at 
 which point anything can happen.
It can't violate memory safety:
 Void initializers for variables with a type that may contain unsafe 
 values (such as types with pointers) are not allowed in @safe code.
https://dlang.org/spec/declaration.html#void_init
With type state analysis you should be allowed to write, but not read, uninitialized variables. So the restriction that void-initialized variables aren't allowed in @safe code exists simply because we don't have the ability to guarantee initialization before a read can occur.
Mar 18 2024
prev sibling parent Don Allen <donaldcallen gmail.com> writes:
On Monday, 18 March 2024 at 12:31:23 UTC, Nick Treleaven wrote:
 On Saturday, 16 March 2024 at 19:10:56 UTC, Don Allen wrote:
 The compiler therefore forces you to handle the unusual case, 
 so if it happens, the result will be something under your 
 control.
You can call `unwrap` on the Option which will panic if it's None. But that's fine, because that call makes it clear to anyone reading the code that the programmer is intentionally assuming the Option contains a value.
Yes, that's one choice. You can also call 'expect' and provide a specific error message about what happened, unlike 'unwrap' which only provides a boiler-plate message. My point is that, in Rust, you can't forget to handle a return of None because the compiler forces you to do so. As opposed to dealing with a seg-fault because you forgot to test for a null pointer.
 ...
 Related to the above, you may also process an uninitialized 
 value, at which point anything can happen.
It can't violate memory safety:
I didn't say it did. Perhaps I could have been more clear. My intent was to discuss the contrast between a language that makes certain that you handle unusual cases in a way of your own choosing vs. getting an unexpected seg-fault and having to get out the debugging apparatus to find out what happened. Or worse, shipping code with this time-bomb and having it blow up in the face of your users.
 Void initializers for variables with a type that may contain 
 unsafe values (such as types with pointers) are not allowed in 
 @safe code.
https://dlang.org/spec/declaration.html#void_init
Mar 21 2024
prev sibling parent Nick Treleaven <nick geany.org> writes:
On Wednesday, 13 March 2024 at 19:58:24 UTC, Steven Schveighoffer 
wrote:
 Building non-null into the type indeed means as long as you 
 have that type, you don't have to check. But to get it into 
 that type, if you started with a possibly-invalid value, 
 *somewhere* you had to do a check.
...
 Having a possibly-null pointer is no different. D defines that 
 in safe code, a pointer will be valid or null. The "check" 
 occurs on use, and is performed by the hardware.
One important difference is that D makes the null check *as late as possible*. Often in code using non-nullable types, the check gets done earlier, nearer to where the problem is. E.g. when a function produces a pointer, but the pointer doesn't actually get dereferenced there but is stored and then some time later it gets dereferenced. Then it's not easy to find where the null pointer was actually produced. Having non-null pointers can save time debugging.
Mar 18 2024