
digitalmars.D - Both safe and wrong?

reply Luís Marques <luis luismarques.eu> writes:
Consider this code:
https://run.dlang.io/is/aw09pD

import std.stdio;

@safe:

const x = 42;
int* y = cast(int*) &x;

void main()
{
     *y = 7;
     writeln(x); // prints 42
}

I'm probably too out of touch with how the safe type system should 
work, but this seems inconsistent. I would expect *either*:

1) Overriding x's const is unsafe and undefined behavior. The 
cast inside a safe block should cause an error. The 42 is fine.

2) As we promised the type system, we didn't modify the 42 
through a const view of that value (x), instead we used a mutable 
view (y). The cast is fine. It should print 7 instead.

But I guess I'm wrong? Please clarify.
Feb 03
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 04.02.19 00:48, Luís Marques wrote:
 import std.stdio;
 
 @safe:
 
 const x = 42;
 int* y = cast(int*) &x;
 
 void main()
 {
      *y = 7;
      writeln(x); // prints 42
 }
 
 I'm probably too out of touch of how the safe type system should work, 
 but this seems inconsistent. I would expect *either*:
 
 1) Overriding x's const is unsafe and undefined behavior. The cast 
 inside a safe block should cause an error. The 42 is fine.
 
 2) As we promised the type system, we didn't modify the 42 through a 
 const view of that value (x), instead we used a mutable view (y). The 
 cast is fine. It should print 7 instead.
The thing here is that the @safe attribute only applies to functions. `int* y = cast(int*) &x;` is not a function declaration, so `@safe:` has no effect on it.

All code that's not in a function is @system, always. There is no way to mark it as @safe.
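
As an illustration of that distinction (a minimal sketch, not part of the original post), the same cast can be placed inside function literals, where the attribute does apply. The `static assert`s only ask the compiler whether each literal type-checks; nothing unsafe is executed.

```d
const x = 42;

// In a function literal explicitly marked @safe, casting away const is rejected.
static assert(!__traits(compiles, () @safe { int* p = cast(int*) &x; }));

// The same statement in a @system function literal is accepted.
static assert(__traits(compiles, () @system { int* p = cast(int*) &x; }));

void main() {}
```

So the cast itself is something @safe knows how to reject; the question in this thread is only whether the check reaches code that sits outside a function body.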
Feb 03
next sibling parent reply Luís Marques <luis luismarques.eu> writes:
On Monday, 4 February 2019 at 00:18:48 UTC, ag0aep6g wrote:
 The thing here is that the @safe attribute only applies to 
 functions. `int* y = cast(int*) &x;` is not a function 
 declaration, so `@safe:` has no effect on it.
Gotcha. It's disappointing that there is no way to mark an entire module as @safe. Also seems a bit inconsistent with other attributes marked with the "attribute:" syntax, which apply more broadly.
Feb 03
parent reply Rubn <where is.this> writes:
On Monday, 4 February 2019 at 00:29:04 UTC, Luís Marques wrote:
 On Monday, 4 February 2019 at 00:18:48 UTC, ag0aep6g wrote:
 The thing here is that the @safe attribute only applies to 
 functions. `int* y = cast(int*) &x;` is not a function 
 declaration, so `@safe:` has no effect on it.
 Gotcha. It's disappointing that there is no way to mark an entire 
 module as @safe. Also seems a bit inconsistent with other attributes 
 marked with the "attribute:" syntax, which apply more broadly.
Walter is extremely against the idea of

@safe module std.stdio;

It's kind of ironic that you can't have @safe code that's not in a function. Kind of seems like a gigantic hole in creating a safe language.
Feb 03
parent Luís Marques <luis luismarques.eu> writes:
On Monday, 4 February 2019 at 00:35:43 UTC, Rubn wrote:
 It's kind of ironic that you can't have @safe code that's not 
 in a function. Kind of seems like a gigantic hole in creating a 
 safe language.
Right. As CTFE illustrates, a variable's initialization expression is a kind of compile-time code. See the fine line between constant folding and CTFE, etc. So, not being inside a function declaration is not a particularly useful distinction, from the user's point of view.

In fact, that's what I was thinking about when I started this thread: some possible changes to the way that expressions are typed, and how that interacts with @safe.
Feb 03
prev sibling parent reply ag0aep6g <anonymous example.com> writes:
On 04.02.19 01:18, ag0aep6g wrote:
 The thing here is that the @safe attribute only applies to functions. 
 `int* y = cast(int*) &x;` is not a function declaration, so `@safe:` has 
 no effect on it.
 
 All code that's not in a function is @system, always. There is no way to 
 mark it as @safe.
Maybe even more peculiar than globals, you also can't mark parameter default values as @safe:

----
import std.stdio;

@safe:

immutable x = 42;

void main()
{
    *f() = 7;
    writeln(x); // 42
    writeln(*&x); // 7
}

int* f(int* y = cast(int*) &x)
{
    return y;
}
----
Feb 03
next sibling parent Luís Marques <luis luismarques.eu> writes:
On Monday, 4 February 2019 at 00:31:36 UTC, ag0aep6g wrote:
 Maybe even more peculiar than globals, you also can't mark 
 parameter default values as @safe:
Yup, clearly @safe's scope could be expanded. I don't know how problematic that would be with D's current design...?

I use @safe so little (because most of the times I've tried to use it I ran into annoying limitations) I'm quite out of touch with its details. In fact, that's how this whole thread started. I was thinking about some possible changes to the type system and it wasn't quite clear to me how that interacted with @safe, because I don't use it enough.
Feb 03
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/3/2019 4:31 PM, ag0aep6g wrote:
 [...]
https://issues.dlang.org/show_bug.cgi?id=19645
Feb 03
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/3/2019 3:48 PM, Luís Marques wrote:
 [...]
https://issues.dlang.org/show_bug.cgi?id=19646
Feb 03
next sibling parent bauss <jj_1337 live.dk> writes:
On Monday, 4 February 2019 at 01:39:20 UTC, Walter Bright wrote:
 On 2/3/2019 3:48 PM, Luís Marques wrote:
 [...]
https://issues.dlang.org/show_bug.cgi?id=19646
Amazing
Feb 03
prev sibling parent reply Luís Marques <luis luismarques.eu> writes:
On Monday, 4 February 2019 at 01:39:20 UTC, Walter Bright wrote:
 https://issues.dlang.org/show_bug.cgi?id=19646
Hi Walter,

Thanks for the feedback. Given how terse the forum post is, I'm not sure how to interpret it. Are you suggesting this was always a bug, or are you changing D's @safe design to encompass expressions outside functions?

If this was always a bug, then perhaps we should also add a bug report to include the fact that @safe is documented just as "Safe Functions", which could be misleading?

In any case, it's great to see this moving forward. I thought I was just asking a silly question because I was too tired!

Cheers,
Luís
Feb 04
parent reply Walter Bright <newshound2 digitalmars.com> writes:
I listed it as a "normal" bug rather than "enhancement". I believe that answers 
your question.
Feb 04
parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Monday, 4 February 2019 at 20:36:23 UTC, Walter Bright wrote:
 I listed it as a "normal" bug rather than "enhancement". I 
 believe that answers your question.
Are you confirming that the documentation in https://dlang.org/spec/memory-safe-d.html will be updated to refer to "@safe code" instead of just @safe functions?

Anyway, thinking about it, I think fixing this elegantly might require coming up with new semantics. How do you make sure that no @system code is called in your project without manually checking every single variable declaration in your dependencies?

Checking that functions are @safe is easy because @system is infectious, so you only have to check for @trusted code, but there's no way to make global variable safety infectious, is there?
Feb 06
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/6/2019 2:02 AM, Olivier FAURE wrote:
 On Monday, 4 February 2019 at 20:36:23 UTC, Walter Bright wrote:
 I listed it as a "normal" bug rather than "enhancement". I believe that 
 answers your question.
 Are you confirming that the documentation in 
 https://dlang.org/spec/memory-safe-d.html will be updated to refer to 
 "@safe code" instead of just @safe functions?
 
 Anyway, thinking about it, I think fixing this elegantly might require 
 coming up with new semantics. How do you make sure that no @system code 
 is called in your project without manually checking every single 
 variable declaration in your dependencies?
 
 Checking that functions are @safe is easy because @system is 
 infectious, so you only have to check for @trusted code, but there's no 
 way to make global variable safety infectious, is there?
I'm not seeing a problem with it.
Feb 06
parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Thursday, 7 February 2019 at 01:04:44 UTC, Walter Bright wrote:
 I'm not seeing a problem with it.
Let me put this another way:

If you want to make sure only @safe code is ever run, you can slap @safe on your main function, and the program will only compile if every single other function is either @safe or @trusted.

However, @system variable declarations aren't infectious so far (and making them infectious would be a breaking change), which means the following code:

    import std.stdio;

    immutable int x = 1;
    int* y = cast(int*)&x;

    void main() @safe {
        *y = 2;
        writeln(x);
        writeln(*(&x));
        writeln(*y);
    }

compiles even though main(), a @safe function, ends up doing something unsafe (mutating a value declared as immutable).

One way to fix this would be to forbid using @system global variables in @safe functions, but this would definitely be a breaking change, unless global variable safety is determined by the compiler by default (which is its own can of worms).
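
For comparison, the function-level "infectiousness" mentioned at the start of this post can be sketched as follows. This is only an illustration; the names `sysFunc`, `trustedFunc` and `safeFunc` are made up for the example, and the `static assert`s merely ask the compiler whether each call type-checks.

```d
void sysFunc() @system {}

// @trusted: the author vouches manually that calling sysFunc here is fine.
void trustedFunc() @trusted { sysFunc(); }

void safeFunc() @safe { trustedFunc(); }

// A @safe function cannot call a @system function directly...
static assert(!__traits(compiles, () @safe { sysFunc(); }));

// ...but it can call @trusted and @safe functions.
static assert(__traits(compiles, () @safe { trustedFunc(); safeFunc(); }));

void main() @safe { safeFunc(); }
```

Nothing comparable exists for the global `y` above: a @safe function can read and dereference it no matter how its initializer was written.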
Feb 07
next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 7 February 2019 at 17:33:01 UTC, Olivier FAURE wrote:
 [snip]

 One way to fix this would be to forbid using @system global 
 variables in @safe functions, but this would definitely be a 
 breaking change, unless global variable safety is determined by 
 the compiler by default (which is its own can of worms).
Inferring safety of global variables does seem tricky. You have to check every place it's used. I don't see what you could do other than forbid @system global variables in @safe functions. What your example shows is that it's broken already. At least adding the ability to mark them @safe is a step in making it possible to resolve the issue.
Feb 07
prev sibling parent reply Luís Marques <luis luismarques.eu> writes:
On Thursday, 7 February 2019 at 17:33:01 UTC, Olivier FAURE wrote:
 One way to fix this would be to forbid using @system global 
 variables in @safe functions, but this would definitely be a 
 breaking change, unless global variable safety is determined by 
 the compiler by default (which is its own can of worms).
You are discussing evaluating the safety of the variable. Why not evaluate the safety of the variable's initialization expression? That is, @safe would always refer to code, just not necessarily functions.
Feb 07
next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, February 7, 2019 3:33:45 PM MST Luís Marques via Digitalmars-d 
wrote:
 On Thursday, 7 February 2019 at 17:33:01 UTC, Olivier FAURE wrote:
 One way to fix this would be to forbid using @system global
 variables in @safe functions, but this would definitely be a
 breaking change, unless global variable safety is determined by
 the compiler by default (which is its own can of worms).
 You are discussing evaluating the safety of the variable. Why not
 evaluate the safety of the variable's initialization expression? That
 is, @safe would always refer to code, just not necessarily functions.
Indeed. You don't check variables for safety. You check code that runs. Initialization expressions are a bit weird in that they're a way for code that is run to exist outside of a function. In all other cases, any code that is run (as opposed to being a declaration) is inside a function of some kind. So, what we have here is simply a case of the small amount of runnable code that doesn't exist in a function being missed in the design of @safe. If you just think of all initialization expressions as being essentially lambdas that are declared and called in place, then all that we need is for those lambdas to be marked as @safe when the section of code that they're in is marked with @safe.

It is possible that fixing this will cause code breakage, but it's likely to be rare, since you aren't normally going to get code like the original example had where an initialization expression outside of a function was taking an address. That code probably wasn't even legal until fairly recently, because it didn't used to be possible to directly initialize pointers when their value had to be known at compile time. As I understand it, being able to do something like

int* i = new int(42);

with a variable outside of a function is a fairly recent improvement.

Regardless, I think that it's pretty clear that we need to fix this, and my inclination is to argue that initialization expressions should just be treated as if they were lowered to lambdas that were immediately called. e.g.

int* y = cast(int*) &x;

becomes something like

int* y = () { return cast(int*) &x; }();

Then when you have something like

@safe pure:

int* y = cast(int*) &x;

it's quite clear what should happen. But I don't know if there are any issues with that particular way of spec-ing it, and it's not like you want to _actually_ add any lambdas - though in most cases lambdas like that should be optimized out anywhere they're used anyway (though IIRC, that doesn't necessarily happen with dmd at present).

However, it does make the attribute portion of things very straightforward.

- Jonathan M Davis
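
A rough sketch of how that lowering would interact with the attributes, assuming the lambda simply carries the attributes of the enclosing section (here they are written on each literal explicitly; `counter` is a made-up variable used only for the purity check). The `static assert`s only test whether each lowered initializer would type-check.

```d
immutable int x = 42;
int counter; // hypothetical mutable global, used only for the purity check

// Lowered initializer under @safe: the cast is rejected.
static assert(!__traits(compiles, () @safe { return cast(int*) &x; }()));

// Lowered initializer under pure: reading mutable global state is rejected.
static assert(!__traits(compiles, () pure { return counter; }()));

// A harmless initializer passes both checks.
static assert(__traits(compiles, () @safe pure { return x + 1; }()));

void main() {}
```

Under this reading, no new rules are needed for initializers; the existing function checks would do all the work.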
Feb 07
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 07.02.19 23:33, Luís Marques wrote:
 You are discussing evaluating the safety of the variable. Why not 
 evaluate the safety of the variable's initialization expression? That 
 is, @safe would always refer to code, just not necessarily functions.
Evaluating the safety of initializers without somehow marking the variables as @safe/@system would fix the issue you brought up, but not the one that Olivier is talking about.

His point is that you would still have to check manually if the globals you're using are safe or not. The compiler wouldn't see a difference between a safely initialized global and an unsafely initialized one. But @safe is supposed to eliminate that kind of manual verification.

----
@safe:
int* x = /* ... this initializer would be checked for safety ... */;
void main()
{
    *x = 7; /* Guaranteed to be safe. */
    *y = 7; /* Might exhibit undefined behavior. */
}

@system:
int* y = /* ... this initializer would not be checked ... */;
----

If we'd apply the attributes to the variables, and forbid using @system variables in @safe code, then `*y = 7;` would be rejected.
Feb 08
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, February 8, 2019 1:02:59 AM MST ag0aep6g via Digitalmars-d wrote:
 On 07.02.19 23:33, Luís Marques wrote:
 You are discussing evaluating the safety of the variable. Why not
 evaluate the safety of the variable's initialization expression? That
 is, @safe would always refer to code, just not necessarily functions.
 Evaluating the safety of initializers without somehow marking the
 variables as @safe/@system would fix the issue you brought up, but not
 the one that Olivier is talking about.
 
 His point is that you would still have to check manually if the globals
 you're using are safe or not. The compiler wouldn't see a difference
 between a safely initialized global and an unsafely initialized one. But
 @safe is supposed to eliminate that kind of manual verification.
 
 ----
 @safe:
 int* x = /* ... this initializer would be checked for safety ... */;
 void main()
 {
     *x = 7; /* Guaranteed to be safe. */
     *y = 7; /* Might exhibit undefined behavior. */
 }
 
 @system:
 int* y = /* ... this initializer would not be checked ... */;
 ----
 
 If we'd apply the attributes to the variables, and forbid using @system
 variables in @safe code, then `*y = 7;` would be rejected.
Except that there's no such thing as an @safe or @system variable. You're talking about the exact same problem as when you have an @safe function which takes a pointer, and you give it a pointer to invalid memory. @safety depends on you giving valid input to @safe functions. So, if you have an @safe function using a module-level variable whose initializer isn't @safe, and it does something which makes that module-level variable unsafe to use, then you're giving the function invalid input, just like if you passed it an invalid pointer via one of its explicit parameters.

- Jonathan M Davis
Feb 08
parent reply ag0aep6g <anonymous example.com> writes:
On 08.02.19 09:31, Jonathan M Davis wrote:
 Except that there's no such thing as an @safe or @system variable.
Of course there isn't now. The idea is to add that concept to the language.
 You're
 talking about the exact same problem as when you have an @safe function
 which takes a pointer, and you give it a pointer to invalid memory.
In that scenario, you're making the invalid call from @system/@trusted code. There you're supposed to watch out for this kind of stuff.

You might argue that globals are located in an @system context, so it's okay when they compromise otherwise @safe functions. Then one just has to take care when declaring globals, since they're inputs to all @safe functions. I guess that's a reasonable position, but I don't think it's the most useful one.

Right now, to ensure that a program is actually safe, one has to:

1) Check `main`. I.e., mark it @safe, or verify it manually.
2) Manually verify all @trusted code.
3) Check all static constructors.
4) Check all statically initialized variables that are not in a function: module-level variables, class/struct fields, anything else?
5) Account for bugs in @safe and stuff that I'm forgetting here.

Ideally, the list would be as short as possible. So if we can eliminate #4, that would be a good thing, in my opinion.
Feb 08
parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Friday, 8 February 2019 at 09:40:09 UTC, ag0aep6g wrote:
 Right now, to ensure that a program is actually safe, one has 
 to:

 1) Check `main`. I.e., mark it @safe, or verify it manually.
 2) Manually verify all @trusted code.
 3) Check all static constructors.
 4) Check all statically initialized variables that are not in a 
 function: module-level variables, class/struct fields, anything 
 else?
 5) Account for bugs in @safe and stuff that I'm forgetting here.

 Ideally, the list would be as short as possible. So if we can 
 eliminate #4, that would be a good thing, in my opinion.
Right. Like I said earlier, one way to eliminate #4 would be to say "variables that aren't marked as @safe can't be called from @safe code". The problem with that is that it would break tons of existing code.

Alternate solutions:
- Make global variables @safe by default.
- Deduce safety by default from a global variable's initialization expression.

Both of these solutions would involve very little code breakage, but would break with existing D semantics, which is yet another wart you have to explain to beginners.

(alternate alternate solution: make an "everything is @safe by default" DIP, and use a -dipXXXX flag for the transition for a few years)
Feb 08
parent reply XavierAP <n3minis-git yahoo.es> writes:
On Friday, 8 February 2019 at 20:06:02 UTC, Olivier FAURE wrote:
 (alternate alternate solution: make an "everything is @safe by 
 default" DIP, and use a -dipXXXX flag for the transition for a 
 few years)
I don't think this would be well received by the betterC/C++ crowd. And as important as @safe is for some applications, the decision to make it optional and not default was taken long ago, and has huge implications. Fixing backwards compatibility for this related but separate issue is hardly a warrant for such a huge change.

It looks to me too that variables need now to be flagged with the attribute, because the alternative is for every one of the @safe functions that use any given global variable, to check again on their own that there isn't anything unsafe in its initialization code. And then again, you are pushing the compile error that belongs on the dependency (global initialization) to the client (@safe function).

A possible solution in my opinion is:

- From then on, globals will be flagged @safe, @system or @trusted just like functions.
- But under the hood, globals in binaries (already compiled, so currently always without any attribute) are accepted by @safe code (as if they were @trusted) – BUT generate a compile warning.
- When compiling new code, globals in this code are subject to new restrictions; so they are flagged @safe, @trusted or @system explicitly; and if their initialization is unsafe and they're flagged as @safe, they generate a compile error; without need for any other compilation unit that uses this variable to check on its own.

This way there is no code break. Other than new warnings, which are good because they expose a newly discovered vulnerability; or re-compiling unsafe global initializations, which should be fixed, or may be flagged @trusted, individually or however.
Feb 12
next sibling parent XavierAP <n3minis-git yahoo.es> writes:
On Tuesday, 12 February 2019 at 12:49:45 UTC, XavierAP wrote:
 - When compiling new code, globals in this code are subject to 
 new restrictions; so they are flagged @safe, @trusted or 
 @system explicitly;
Sorry, instead of "explicitly" I should have said according to the same rules as functions. So if they're not found in a @safe or @trusted region, they will be flagged as @system. The difference is that globals from binaries would be specially considered as "@trusted with warning" until re-compiled (regardless of what region they were compiled from originally, which cannot likely be known).
Feb 12
prev sibling parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Tuesday, 12 February 2019 at 12:49:45 UTC, XavierAP wrote:
 This way there is no code break. Other than new warnings, which 
 are good because they expose a newly discovered vulnerability; 
 or re-compiling unsafe global initializations, which should be 
 fixed, or may be flagged @trusted, individually or however.
There would be some breakage for code with "@safe:" at the beginning of the file.

Also, I don't think D compilers have any semantics for behaving differently depending on whether the code they're compiling is legacy or new code (e.g. already compiled code, as you put it). I'm guessing that those semantics would be non-trivial to implement.
Feb 12
parent reply XavierAP <n3minis-git yahoo.es> writes:
On Tuesday, 12 February 2019 at 23:17:56 UTC, Olivier FAURE wrote:
 There would be some breakage for code with "@safe:" at the 
 beginning of the file.
Only if the file contained unsafe code in initializations. And if we consider this a bug rather than an enhancement, the code would have been already broken. The compiler should break that code; and it should have always detected it and generated an error.

I wouldn't think there are lots of such unsafe code hidden in global initializations within @safe files, in Phobos or elsewhere. And wherever there is, it should be broken. But you could always fix it by only adding @trusted.
 Also, I don't think D compilers have any semantics for behaving 
 differently depending on whether the code they're compiling is 
 legacy or new code (eg already compiled code, as you put it). 
 I'm guessing that those semantics would be non-trivial to 
 implement.
I wouldn't add any new semantics or mechanism. If the code is being compiled, @safe should be enforced. I was talking about the situation where you link to binary libraries (compiled with an old version with this bug).

Whenever (very often) dependencies are compiled from source instead, the new errors should simply pop up and break the compilation. This also means that at the same time this is fixed in the compiler, any such unsafe code in Phobos should be fixed.
Feb 12
parent Seb <seb wilzba.ch> writes:
On Wednesday, 13 February 2019 at 00:25:27 UTC, XavierAP wrote:
 I wouldn't add any new semantics or mechanism. If the code is 
 being compiled, @safe should be enforced. I was talking about 
 the situation where you link to binary libraries (compiled with 
 an old version with this bug).

 Whenever (very often) dependencies are compiled from source 
 instead, the new errors should simply pop up and break the 
 compilation. This also means that at the same time this is 
 fixed in the compiler, any such unsafe code in Phobos should be 
 fixed.
Yup, Phobos always gets compiled with `-de` on _every_ PR to DMD, s.t. every fix/improvement to the compiler needs to fix Phobos (and druntime first). Also, the top ~50 Dub projects are tested as well (https://buildkite.com/dlang), so that's why (hard) breaking changes aren't going to happen these days. Here's a recent example: https://github.com/dlang/dmd/pull/9154

Though of course there have been exceptions where @safe-ty fixes affected only one of those 50 Dub packages and were deemed critical (+ affecting only a small subset + easy to fix), so that one package was simply fixed.
Feb 13
prev sibling parent Olivier FAURE <couteaubleu gmail.com> writes:
On Thursday, 7 February 2019 at 22:33:45 UTC, Luís Marques wrote:
 On Thursday, 7 February 2019 at 17:33:01 UTC, Olivier FAURE 
 wrote:
 One way to fix this would be to forbid using @system global 
 variables in @safe functions, but this would definitely be a 
 breaking change, unless global variable safety is determined 
 by the compiler by default (which is its own can of worms).
 You are discussing evaluating the safety of the variable. Why not 
 evaluate the safety of the variable's initialization expression? 
 That is, @safe would always refer to code, just not necessarily 
 functions.
Yes, that's what I meant by "variable safety". In retrospect, that was a little ambiguous.
Feb 08
prev sibling parent XavierAP <n3minis-git yahoo.es> writes:
On Wednesday, 6 February 2019 at 10:02:18 UTC, Olivier FAURE 
wrote:
 Anyway, thinking about it, I think fixing this elegantly might 
 require coming up with new semantics. How do you make sure that 
 no @system code is called in your project without manually 
 checking every single variable declaration in your dependencies?
Isn't it enough that the same constraints have been in place when compiling the dependencies, if they are flagged as @safe? (Of course excepting @trusted code, that's always about trusting the human author's word that their unsafe code has no unsafe consequences; same as if you link to a C library.)

If @safe can be circumvented (in global initializations) then it's no longer a promise of safety, with all the security consequences, but rather an empty attribute. If safety is indeed to be a "big thing" in computer science,[1] this is just a vulnerability bug that needs to be fixed.

[1] https://www.reddit.com/r/cpp/comments/6b4xrc/walter_bright_believes_memory_safety_will_kill_c/
Feb 07