digitalmars.D - @trusted attribute should be replaced with @trusted blocks
- Ogi (99/99) Jan 15 2020 There was a discussion a few years ago [1], but now it came up
- Joseph Rushton Wakeling (29/39) Jan 15 2020 So here's the problem with this approach (which was mentioned by
- ag0aep6g (33/66) Jan 15 2020 [...]
- Johannes Pfau (17/41) Jan 15 2020 I think it shouldn't be much of a problem, as there is a very nice
- ag0aep6g (10/28) Jan 15 2020 If that's deemed acceptable, I'm on board. But the alternative requires
- Dominikus Dittes Scherkl (14/59) Jan 16 2020 Oh, I hate that discussion.
- Joseph Rushton Wakeling (9/12) Jan 15 2020 Well, apologies to Steven if I've misinterpreted his proposal.
- ag0aep6g (9/17) Jan 15 2020 You're saying that an @safe function `f` is somehow more guaranteed to
- Johannes Pfau (9/30) Jan 15 2020 A trusted function in D is supposed to have a safe interface: For each
- ag0aep6g (11/29) Jan 15 2020 [...]
- Joseph Rushton Wakeling (9/12) Jan 15 2020 No, I'm saying that it is useful to have a clear distinction,
- ag0aep6g (12/20) Jan 15 2020 You're saying that it is useful for me as a user of Phobos when I
- IGotD- (15/18) Jan 15 2020 Speaking of libraries, as you can see many calls will go the C
- Timon Gehr (15/35) Jan 15 2020 That's nonsense, it completely undermines @safe, because many sensible
- Joseph Rushton Wakeling (33/43) Jan 15 2020 I see what you're getting at here -- you mean that if we're
- ag0aep6g (15/36) Jan 15 2020 Exactly.
- Joseph Rushton Wakeling (46/56) Jan 15 2020 The fact that I personally find it useful -- and going by this
- Timon Gehr (3/10) Jan 15 2020 It's an implementation detail. If you care about the distinction, you
- Joseph Rushton Wakeling (7/10) Jan 15 2020 Sure. But on a practical day-to-day basis, @safe vs @trusted
- Timon Gehr (15/27) Jan 15 2020 You have to be careful when writing a @trusted function, not when
- Joseph Rushton Wakeling (54/75) Jan 16 2020 Thanks for being patient enough to clarify the issues here. This
- Steven Schveighoffer (22/37) Jan 16 2020 In fact, because of how the system works, @safe code is LESS likely to
- Joseph Rushton Wakeling (9/30) Jan 16 2020 I'm not quite sure I follow what you mean here, can you
- Steven Schveighoffer (44/58) Jan 16 2020 For example, I want a safe function that uses malloc to allocate, and
- H. S. Teoh (58/80) Jan 16 2020 Good example! So in this case, the trust really is between the tag and
- ag0aep6g (5/19) Jan 16 2020 [...]
- H. S. Teoh (30/38) Jan 16 2020 [...]
- Timon Gehr (36/95) Jan 16 2020 More or less. Two points:
- Johannes Pfau (6/38) Jan 17 2020 I'm curious, what do you think would be the ideal scheme if we could
- Dominikus Dittes Scherkl (8/12) Jan 17 2020 Yes, pretty much the same as in Rust.
- Dominikus Dittes Scherkl (7/19) Jan 17 2020 And by the way: I would not call the blocks @system, they should
- Steven Schveighoffer (5/25) Jan 17 2020 @system blocks would only live inside @trusted functions. So you can
- jmh530 (6/8) Jan 17 2020 Somewhat OT, but HN is blowing up with an article [1] about the
- Steven Schveighoffer (7/17) Jan 17 2020 Well, those people for sure won't be running to D any time soon.
- jmh530 (8/15) Jan 17 2020 I don't disagree with your sentiments. My interest in the piece
- Timon Gehr (6/8) Jan 17 2020 They really do, and the fact that many people do not understand this is
- jmh530 (10/16) Jan 17 2020 Ah, that is correct. I had not even realized it.
- IGotD- (13/22) Jan 17 2020 I think this is pretty funny and also predictable. There is an
- H. S. Teoh (33/43) Jan 17 2020 The other human behaviour is that people form habits and then resist
- Ola Fosheim Grøstad (11/13) Jan 17 2020 This is overblown. Rather than go by secondary sources go
- Timon Gehr (9/15) Jan 17 2020 Different approaches have different trade-offs and I am not sure how to
- David Nadlinger (7/11) Jan 18 2020 For the record, this is also exactly what I argued for in the
- David Nadlinger (21/26) Jan 18 2020 Detail the scenario where this would be useful, please.
- Joseph Rushton Wakeling (31/42) Jan 23 2020 I'm honestly not sure what is useful to add to the existing
- Steven Schveighoffer (27/101) Jan 16 2020 I'll interject here. I was thinking actually exactly along the lines of
- Ola Fosheim Grøstad (14/24) Jan 16 2020 As was pointed out @trusted does not achieve much more than a
- Steven Schveighoffer (11/36) Jan 16 2020 Enforcing the name of the comment helps the greppability. And putting
- Ola Fosheim Grøstad (6/8) Jan 16 2020 Note: for this to work in the general case (with more advanced
- ag0aep6g (32/51) Jan 16 2020 On Thursday, 16 January 2020 at 15:30:45 UTC, Steven
- H. S. Teoh (14/30) Jan 16 2020 [...]
- Joseph Rushton Wakeling (5/14) Jan 16 2020 Hang on, have 3 of us all made the same proposal? (OK, I just
- H. S. Teoh (47/62) Jan 16 2020 Fools or not, the important thing is whether we can convince Walter to
- Ogi (9/14) Jan 16 2020 The are two scenarios of memory corruption in a function with a
- IGotD- (15/19) Jan 15 2020 I also don't understand what's the point with @trusted. Should it
- Joseph Rushton Wakeling (7/19) Jan 15 2020 No. @trusted is about saying "This function should be safe to
- Ola Fosheim Grøstad (8/11) Jan 15 2020 Could make sense if you only could call @trusted from @trusted...
- IGotD- (7/13) Jan 15 2020 This is why I think it should be removed. In my world there is no
- Joseph Rushton Wakeling (20/26) Jan 15 2020 Presumably your programs are therefore self-crafted binary, since
- IGotD- (13/19) Jan 15 2020 @safe is a subset of D that guarantees no memory corruption. The
- ag0aep6g (9/10) Jan 15 2020 Not quite. @safe is a subset of D that guarantees no memory
- Timon Gehr (32/59) Jan 15 2020 @safe does not mean "this unconditionally can not corrupt memory", it
- Patrick Schluter (15/30) Jan 16 2020 No, that's where you're wrong. @trusted gives the same guarantees
- IGotD- (9/21) Jan 16 2020 Then we can remove @safe all together and trust the programmer to
- Paul Backus (5/15) Jan 16 2020 The difference between "a small subset of the program must be
- Ola Fosheim Grøstad (4/11) Jan 16 2020 If memory-safety is the default then you only need to mark
- Paul Backus (4/16) Jan 16 2020 This is a non-sequitur. Did you mean to respond to a different
- Ola Fosheim Grøstad (10/17) Jan 16 2020 It wasn't really clear what "IGotD-" meant. Although I suspect he
- IGotD- (10/25) Jan 16 2020 Yes, kind of.
- H. S. Teoh (11/19) Jan 15 2020 [...]
- Joseph Rushton Wakeling (2/8) Jan 15 2020 To be honest, I just use butterflies :-)
- Paul Backus (4/7) Jan 15 2020 "How to Write @trusted Code in D", on the D blog, is a good
- Paul Backus (14/24) Jan 15 2020 To me, this is the only part of the argument against @trusted
- H. S. Teoh (16/23) Jan 15 2020 [...]
- Ogi (4/12) Jan 16 2020 You’re right. I feel like a moron now.
- Timon Gehr (10/35) Jan 15 2020 That's already how it works. The OP just didn't bother to check whether
There was a discussion a few years ago [1], but now it came up again in the light of the @safe-by-default DIP. I've created a new thread for this subject because it's off-topic to DIP 1028 but still deserves attention, especially now that there's a shift towards memory safety. This proposal is independent of DIP 1028: we can have @trusted blocks without @safe by default and vice versa. But they go along nicely.

The idea is to remove @trusted as a function attribute and instead introduce @trusted blocks:

@safe fun() {
    //safe code here
    @trusted {
        //those few lines that require manual checking
    }
    //we are safe again
}

Generally, only a few lines inside @trusted are actually unsafe. With @trusted blocks, the parts that require manual checking will stand out. Why should we leave the task of identifying problematic places to the reader? Once the dangerous parts are enclosed inside @trusted blocks, memory safety of the rest of the function is guaranteed by the compiler. Why should we refuse the additional checks?

Actually, there's already a workaround for injecting an unsafe section into @safe code: just put it inside a @trusted lambda that's called immediately. It's featured in "How to Write @trusted Code in D" on the D blog. Even Walter admits that he's using this trick in Phobos. So why should we resort to dirty hacks that could also harm performance and compilation time if we can make it part of the language?

Some would say that we need the @trusted attribute to distinguish functions that are checked by the compiler from functions that are checked manually. But is @safe actually safe? You can't tell that an "@safe" function doesn't call @trusted functions somewhere along the road, or doesn't use the aforementioned lambda hack. Why pretend that @safe guarantees safety if it's nothing more than a pinky promise?

If you think of it, it makes no sense for @trusted to be a function attribute. It doesn't describe the function's behavior but its implementation. If you call a function, that should not be your concern; @safe and @trusted are the same thing for you. But these two attributes result in two different signatures. If a function expects an @safe callback, you can't pass a @trusted function to it without an @safe wrapper. If some library makes a function that used to be @safe @trusted, that's a breaking change. Etc., etc. Why should we introduce complexity out of nowhere?

Without @trusted, the safety design will be much simpler to grasp. Currently there are three vague keywords, complex rules on which functions can call which, and best practices on which attribute you should choose. Without @trusted, it's as simple as potato: there are @safe and @system functions; @safe functions require putting unsafe parts inside @trusted blocks and can only call other @safe functions. We should not disregard the learning experience, because with every overcomplicated aspect of the language we lose some potential users.

With @safe as default upon us, it's especially important to get safety right. There is a risk that users would slap @trusted on everything just to make it work. While the new syntax wouldn't stop them all, putting a whole function (or an entire module) inside of a block would scream code smell. We can't expect all users to be world-class programmers. The proposed design is more foolproof than the current safety trichotomy. A user could create a @trusted function to do something unsafe and do it properly, but also do something that he expects to be safe while it's not (like calling a @system function, thinking that it's safe).
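For reference, the immediately-called @trusted lambda workaround mentioned above looks like this in today's D (a minimal sketch; malloc merely stands in for arbitrary @system code):

import core.stdc.stdlib : free, malloc;

@safe void fun()
{
    // compiler-checked @safe code here...

    // the lambda bodies escape the @safe checks:
    void* buf = () @trusted { return malloc(64); }();
    scope (exit) () @trusted { free(buf); }();

    // ...compiler-checked @safe code resumes here
}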
In an @safe function with @trusted blocks it wouldn't be possible to do something unsafe if you don't mean it.

To sum up, @trusted blocks will make it easier to write safe code, read safe code and learn the language. This proposal aligns with the current direction of the language towards memory safety.

How much will this change break? Probably not that much, because @trusted functions are not common. To upgrade your code, the proper thing to do would be to put only the unsafe parts inside @trusted blocks. It would require some manual work, but it's not for naught, since it would re-enable compiler checks outside of the @trusted blocks. You can even find some bugs your eye could miss! Of course, you can also just put the whole function body inside a @trusted block and call it a day. Not great, but just as bad as the current @trusted attribute.

Additional thoughts.

Obviously, a @trusted block must not introduce a new scope. If you need a new scope, just use double curly braces. We could also allow applying @trusted to a single line:

@trusted unsafeFunction();

But I don't think that's a good idea, as it would make it just too easy to call unsafe functions from safe code.

With only @safe and @system attributes left, it would make perfect sense to rename @system to @unsafe, as suggested by Manu. The only ones who will protest are those who use "I write @system code" as a pickup line. It would be possible to write a tool that analyzes code safety and e.g. shows how many @trusted lines are in a dub package. The @ should probably be dropped.

[1] https://forum.dlang.org/thread/blrglebkzhrilxkbprgh@forum.dlang.org
Jan 15 2020
On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:
> The idea is to remove @trusted as a function attribute and instead
> introduce @trusted blocks:
>
> @safe fun() {
>     //safe code here
>     @trusted {
>         //those few lines that require manual checking
>     }
>     //we are safe again
> }
>
> Generally, only a few lines inside @trusted are actually unsafe.

So here's the problem with this approach (which was mentioned by several people in the discussion): the actual safety of a function like this is usually down to the combination of the lines that (in your example) are both inside and outside the @trusted block.

What that means is that it's important that an external user of the function doesn't just see it as @safe, but recognizes that the function -- as a whole -- is one whose promise of safety is conditional on its internals being correct. And that's essentially what @trusted is for.

So, a better approach would be for the function to be marked up like this:

@trusted fun () // alerts the outside user
{
    // lines that on their own are provably safe go here
    @system {
        // these lines are allowed to use @system code
    }
    // only provably safe lines here again
}

... and the compiler's behaviour would be to explicitly verify standard @safe rules for all the lines inside the @trusted function _except_ the ones inside a @system { ... } block. Cf. Steven Schveighoffer's remarks here:
https://forum.dlang.org/post/qv7t8b$2h2t$1@digitalmars.com

This way the function signature gives a clear indicator to the user which functions are provably safe, and which are safe only on the assumption that the developer has done their job properly.
Jan 15 2020
On 15.01.20 17:54, Joseph Rushton Wakeling wrote:
> On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:
>> [...]
>> @safe fun() {
>>     //safe code here
>>     @trusted {
>>         //those few lines that require manual checking
>>     }
>>     //we are safe again
>> }
>> [...]
> So here's the problem with this approach (which was mentioned by several
> people in the discussion): the actual safety of a function like this is
> usually down to the combination of the lines that (in your example) are
> both inside and outside the @trusted block.

Yup. But a proposal could specify that that's the intended meaning for an @safe function that contains @trusted blocks. Whereas it's more of a cheat when we use @trusted nested functions like that.

[...]
> So, a better approach would be for the function to be marked up like this:
>
> @trusted fun () // alerts the outside user
> {
>     // lines that on their own are provably safe go here
>     @system {
>         // these lines are allowed to use @system code
>     }
>     // only provably safe lines here again
> }
>
> ... and the compiler's behaviour would be to explicitly verify standard
> @safe rules for all the lines inside the @trusted function _except_ the
> ones inside a @system { ... } block. Cf. Steven Schveighoffer's remarks
> here: https://forum.dlang.org/post/qv7t8b$2h2t$1@digitalmars.com
>
> This way the function signature gives a clear indicator to the user
> which functions are provably safe, and which are safe only on the
> assumption that the developer has done their job properly.

I don't think that's what Steven had in mind. In that world, @safe would be very, very limited, because it couldn't be allowed to call @trusted functions. That means @safe would only apply to trivial functions, and @trusted would assume the role that @safe has today. But you'd have to wrap every call from one @trusted function to another @trusted function in an @system block. It wouldn't be practical.

The real purpose of @trusted in that example is to allow the @system block in the body, and to signal to reviewers and maintainers that the whole function is unsafe despite the mechanical checks that are done on most of the lines. To a user, @trusted functions would still be the same as @safe ones.

Unfortunately, adding the mechanical checks of @safe to @trusted would mean breaking all @trusted code that exists. So implementing that scheme seems unrealistic. But as Steven says, it can be done when we use @trusted blocks instead of @system blocks and @safe instead of @trusted on the function. I.e.:

@safe fun () {
    // lines that the compiler accepts as safe go here
    @trusted {
        // these lines are allowed to use @system code
    }
    // only safe lines here again
}

It weakens the meaning of @safe somewhat, but it's often treated that way already. There's clearly a need.
Jan 15 2020
Am Wed, 15 Jan 2020 19:06:11 +0100 schrieb ag0aep6g:
> The real purpose of @trusted in that example is to allow the @system
> block in the body, and to signal to reviewers and maintainers that the
> whole function is unsafe despite the mechanical checks that are done on
> most of the lines. To a user, @trusted functions would still be the same
> as @safe ones.
>
> Unfortunately, adding the mechanical checks of @safe to @trusted would
> mean breaking all @trusted code that exists. So implementing that scheme
> seems unrealistic.

I think it shouldn't be much of a problem, as there is a very nice transition path:

* Add @system block support
* Add -transition=systemBlocks, which enforces @system blocks in @trusted functions
* Users gradually add @system blocks to their @trusted functions, until everything compiles with -transition=systemBlocks. If you did not add all blocks yet, your code will still compile fine without -transition=systemBlocks
* -transition=systemBlocks becomes the default

[...]
> But as Steven says, it can be done when we use @trusted blocks instead
> of @system blocks and @safe instead of @trusted on the function. I.e.:
>
> @safe fun () {
>     // lines that the compiler accepts as safe go here
>     @trusted {
>         // these lines are allowed to use @system code
>     }
>     // only safe lines here again
> }
>
> It weakens the meaning of @safe somewhat, but it's often treated that
> way already. There's clearly a need.

I don't really like this. It makes @trusted functions completely useless legacy cruft. And there's no longer any way to annotate a function as 'this is 100% safe code', so then you'll have to check every @safe function as thoroughly as @trusted functions.

--
Johannes
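A sketch of what this migration might look like for one function (illustration only: both the @system-block syntax and the -transition=systemBlocks switch are hypothetical, and the function is my own example):

// Before: the whole body is exempt from checking.
@trusted void copyInto(int[] dst, const(int)[] src)
{
    import core.stdc.string : memcpy;
    assert(dst.length >= src.length);
    memcpy(dst.ptr, src.ptr, src.length * int.sizeof);
}

// After: under -transition=systemBlocks the body is checked like @safe,
// except for the explicitly marked escape hatch.
@trusted void copyInto(int[] dst, const(int)[] src)
{
    import core.stdc.string : memcpy;
    assert(dst.length >= src.length);   // now mechanically checked
    @system {
        memcpy(dst.ptr, src.ptr, src.length * int.sizeof);
    }
}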
Jan 15 2020
On 15.01.20 19:38, Johannes Pfau wrote:
> Am Wed, 15 Jan 2020 19:06:11 +0100 schrieb ag0aep6g:
> [...]
> I think it shouldn't be much of a problem, as there is a very nice
> transition path:
>
> * Add @system block support
> * Add -transition=systemBlocks, which enforces @system blocks in
>   @trusted functions
> * Users gradually add @system blocks to their @trusted functions, until
>   everything compiles with -transition=systemBlocks. If you did not add
>   all blocks yet, your code will still compile fine without
>   -transition=systemBlocks
> * -transition=systemBlocks becomes the default

If that's deemed acceptable, I'm on board. But the alternative requires zero work from users (if we just keep @trusted functions around as legacy cruft).

[...]
>> But as Steven says, it can be done when we use @trusted blocks instead
>> of @system blocks and @safe instead of @trusted on the function. I.e.:
> [...]
> I don't really like this. It makes @trusted functions completely useless
> legacy cruft. And there's no longer any way to annotate a function as
> 'this is 100% safe code', so then you'll have to check every @safe
> function as thoroughly as @trusted functions.

You already have to check for @trusted nested functions if you want to make sure an @safe function is really 100% safe. And those already routinely leak their unsafety into the surrounding @safe function. I don't see what would change in that regard.
Jan 15 2020
On Wednesday, 15 January 2020 at 18:38:25 UTC, Johannes Pfau wrote:
> [...]
> I don't really like this. It makes @trusted functions completely useless
> legacy cruft. And there's no longer any way to annotate a function as
> 'this is 100% safe code', so then you'll have to check every @safe
> function as thoroughly as @trusted functions.

Oh, I hate that discussion. @trusted functions were useless from the start. They are just the same as @safe from a caller perspective. And @trusted blocks are nothing new either, only the syntax is more ugly because anonymous lambdas are used to simulate them:

() @trusted {
    // @system stuff here
}()

And the argument that @trusted functions could better be found and that that would increase the safety is invalid. @trusted blocks can be searched just as well, and any function containing them still needs to be treated with caution.
Jan 16 2020
On Wednesday, 15 January 2020 at 18:06:11 UTC, ag0aep6g wrote:
> I don't think that's what Steven had in mind. In that world, @safe would
> be very, very limited, because it couldn't be allowed to call @trusted
> functions.

Well, apologies to Steven if I've misinterpreted his proposal. But what I had in mind was that @safe would be able to call @trusted just as it does now.

So, put that together with what I wrote above, and you have something that allows better validation of the internals of @trusted functions, and still gives the user clarity about which functions are safe in their own terms, and which are safe based on some programmer-provided guarantees.
Jan 15 2020
On 15.01.20 19:41, Joseph Rushton Wakeling wrote:
> Well, apologies to Steven if I've misinterpreted his proposal. But what
> I had in mind was that @safe would be able to call @trusted just as it
> does now.
>
> So, put that together with what I wrote above, and you have something
> that allows better validation of the internals of @trusted functions,
> and still gives the user clarity about which functions are safe in their
> own terms, and which are safe based on some programmer-provided
> guarantees.

You're saying that an @safe function `f` is somehow more guaranteed to be safe than an @trusted function `g`, even though `f` may be calling `g`. I don't see how that makes sense.

----
R f(P params) @safe
{
    return g(params);
}

R g(P params) @trusted
{
    /* ... whatever ... */
}
----

Any assumptions you have about `f` better also be true about `g`.
Jan 15 2020
Am Wed, 15 Jan 2020 20:14:13 +0100 schrieb ag0aep6g:
> On 15.01.20 19:41, Joseph Rushton Wakeling wrote:
>> [...]
> You're saying that an @safe function `f` is somehow more guaranteed to
> be safe than an @trusted function `g`, even though `f` may be calling
> `g`. I don't see how that makes sense.
> [...]

A @trusted function in D is supposed to have a safe interface: for each possible combination of argument values, a @trusted function must not cause invalid memory accesses. This is the only reason you can safely call @trusted functions from @safe functions. It's the programmer's responsibility to design @trusted function APIs in a way so they can't be misused.

--
Johannes
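To make the "safe interface" requirement concrete (my own illustration, not code from the thread): the first function below may legitimately be marked @trusted, because no argument value can make it perform an invalid access; the second compiles as @trusted just as happily, but its safety depends on the caller, so it has no business carrying the attribute:

// Defensible @trusted: safe for *every* possible argument.
@trusted int first(const(int)[] arr)
{
    return arr.length ? *arr.ptr : 0; // the dereference is guarded
}

// Indefensible @trusted: memory-safe only when i < arr.length,
// yet any @safe caller may pass any i.
@trusted int at(const(int)[] arr, size_t i)
{
    return *(arr.ptr + i); // unguarded pointer arithmetic
}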
Jan 15 2020
On 15.01.20 22:08, Johannes Pfau wrote:
> Am Wed, 15 Jan 2020 20:14:13 +0100 schrieb ag0aep6g:
>> [...]
> A @trusted function in D is supposed to have a safe interface: for each
> possible combination of argument values, a @trusted function must not
> cause invalid memory accesses. This is the only reason you can safely
> call @trusted functions from @safe functions. It's the programmer's
> responsibility to design @trusted function APIs in a way so they can't
> be misused.

I completely agree, but I'm not sure how it's related to my rejection of Joseph's statement that @trusted "gives the user [= caller] clarity about which functions [...] are safe based on some programmer-provided guarantees".

The distinction between @safe and @trusted matters when looking at the implementation of a function. It doesn't matter when calling an @safe/@trusted function, precisely because @trusted functions must have safe interfaces just like @safe functions.
Jan 15 2020
On Wednesday, 15 January 2020 at 19:14:13 UTC, ag0aep6g wrote:
> You're saying that an @safe function `f` is somehow more guaranteed to
> be safe than an @trusted function `g`, even though `f` may be calling
> `g`. I don't see how that makes sense.

No, I'm saying that it is useful to have a clear distinction, visible in the API declaration, between functions whose promise of memory safety can be validated by the compiler, versus functions whose promise of memory safety cannot be validated by the compiler.

It's a practical distinction, because it gives the developer clarity about which parts of a codebase need to be examined in order to rule out the possibility of memory safety violations.
Jan 15 2020
On Wednesday, 15 January 2020 at 23:16:21 UTC, Joseph Rushton Wakeling wrote:
> No, I'm saying that it is useful to have a clear distinction, visible in
> the API declaration, between functions whose promise of memory safety
> can be validated by the compiler, versus functions whose promise of
> memory safety cannot be validated by the compiler.

You're saying that it is useful for me as a user of Phobos when I see that, for example, std.stdio.File.open is @trusted [1], right? I disagree. It doesn't mean anything other than @safe in that context. Knowing that the function is @trusted instead of @safe is of zero use to a user of Phobos. It cannot help them make any decision whatsoever.

> It's a practical distinction, because it gives the developer clarity
> about which parts of a codebase need to be examined in order to rule out
> the possibility of memory safety violations.

I.e., it's a tool for library authors, not for library users. You clearly spoke of an "external" or "outside" user of the function before.

[1] https://dlang.org/phobos/std_stdio.html#.File.open
Jan 15 2020
On Wednesday, 15 January 2020 at 23:46:11 UTC, ag0aep6g wrote:
> You're saying that it is useful for me as a user of Phobos when I see
> that, for example, std.stdio.File.open is @trusted [1], right?

Speaking of libraries, as you can see many calls will go to the C library, which of course is unsafe code. The only way I can see it is that @safe code should be able to call unsafe code by default, for convenience.

Now there is a possibility that the programmer wants the code to not call any unsafe code whatsoever. Then I think that a function attribute should be added that tells that the code is not allowed to call any unsafe code down the line. This is better because:

1. It is a restriction set by the programmer and not some arbitrary promise from a third party.
2. It is absolute. The compiler can check this: either it hits unsafe code or it doesn't. It is either true or false and it cannot be questioned.
Jan 15 2020
On 16.01.20 01:09, IGotD- wrote:
> Speaking of libraries, as you can see many calls will go to the C
> library, which of course is unsafe code. The only way I can see it is
> that @safe code should be able to call unsafe code by default, for
> convenience.

That's nonsense, it completely undermines @safe, because many sensible @system functions do not have a safe interface. @safe code shouldn't be able to directly call `free` from the C standard library on arbitrary pointers.

> Now there is a possibility that the programmer wants the code to not
> call any unsafe code whatsoever.

@trusted functions are not unsafe, it's just not checked by the compiler.

> Then I think that a function attribute should be added that tells that
> the code is not allowed to call any unsafe code down the line. This is
> better because:
>
> 1. It is a restriction set by the programmer and not some arbitrary
>    promise from a third party.

It's an arbitrary promise from the compiler implementation...

> 2. It is absolute. The compiler can check this: either it hits unsafe
>    code or it doesn't. It is either true or false and it cannot be
>    questioned.

This is not better. Why do you trust that compilers and the standard library behave according to the specification but are unwilling to do that for any other code that cannot be verified by the compiler? You are making an arbitrary distinction where there is none.

The point of @safe is that if you write @safe code (and no @trusted code), memory corruption is not supposed to occur; any memory corruption that occurs anyway is not your own fault. (Except maybe for not doing due diligence when evaluating a potential dependency before using it.)
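To see why (my illustration): the compiler's rejection of the call below is the whole point. `free` is @system precisely because no mechanical check can make "free an arbitrary pointer" safe, and if @safe code could call unsafe code by default, this would compile and corrupt the GC heap:

import core.stdc.stdlib : free;

@safe void oops()
{
    int[] a = [1, 2, 3]; // GC-allocated
    free(&a[0]);         // error: @system function `free` is not callable
                         // from the @safe function `oops` -- and rightly so
}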
Jan 15 2020
On Wednesday, 15 January 2020 at 23:46:11 UTC, ag0aep6g wrote:
> You're saying that it is useful for me as a user of Phobos when I see
> that, for example, std.stdio.File.open is @trusted [1], right?

Yes, I am saying exactly that.

> I disagree. It doesn't mean anything other than @safe in that context.
> Knowing that the function is @trusted instead of @safe is of zero use to
> a user of Phobos. It cannot help them make any decision whatsoever.

I see what you're getting at here -- you mean that if we're treating this function as a black box that I have no influence over, then both @safe and @trusted mean the same thing in terms of how that black box ought to behave.

But that doesn't mean the distinction isn't useful. It gives me a clear indication that the code of this function (note, _the code of this function_, not the code of other functions it might call) could contain bugs that would allow memory safety violations.

Contrast this with, say, Rust, where any function can do something like this:

fn foo(x: &[i32]) {
    unsafe {
        // any old horrible stuff can happen here
    }
}

... where any number of memory safety bugs could exist in that `unsafe { ... }` block, and yet there is no possible clue from the API that this could be going on.

The fact that in a similar situation D forces you to annotate the function with `@trusted`, and alert users to the _possibility_ that memory safety bugs could exist within the code of this function, is useful information even if you can't access the source code.

> I.e., it's a tool for library authors, not for library users. You
> clearly spoke of an "external" or "outside" user of the function before.

It's clearly much less useful to anyone who doesn't have access to the source code (which doesn't mean it's not useful at all). But in general, given the ability to read and search the source code (which users as well as authors can do), it's very useful to be able to ask the question: "Of the code in this project that claims to be memory safe, which bits could actually contain memory safety bugs?"
Jan 15 2020
On Thursday, 16 January 2020 at 00:21:21 UTC, Joseph Rushton Wakeling wrote:
> I see what you're getting at here -- you mean that if we're treating
> this function as a black box that I have no influence over, then both
> @safe and @trusted mean the same thing in terms of how that black box
> ought to behave.

Exactly.

> But that doesn't mean the distinction isn't useful. It gives me a clear
> indication that the code of this function (note, _the code of this
> function_, not the code of other functions it might call) could contain
> bugs that would allow memory safety violations.

And that isn't useful to a user in any way.

[...]
> The fact that in a similar situation D forces you to annotate the
> function with `@trusted`, and alert users to the _possibility_ that
> memory safety bugs could exist within the code of this function, is
> useful information even if you can't access the source code.

I don't agree. @trusted doesn't alert you any more of the possibility of a memory safety bug than @safe. You can't assume that an @safe function won't corrupt your memory any more than you can assume the same about an @trusted function.

[...]
> It's clearly much less useful to anyone who doesn't have access to the
> source code (which doesn't mean it's not useful at all).

(It is useless, though.)

> But in general, given the ability to read and search the source code
> (which users as well as authors can do), it's very useful to be able to
> ask the question: "Of the code in this project that claims to be memory
> safe, which bits could actually contain memory safety bugs?"

Yes, but you find those interesting bits by grepping over the source code, not by looking at the attributes of public functions. Many @safe functions have @trusted innards that don't show up in API documentation.
Jan 15 2020
On Thursday, 16 January 2020 at 00:40:12 UTC, ag0aep6g wrote:
> And that isn't useful to a user in any way.

The fact that I personally find it useful -- and going by this discussion thread, so do some others -- means that this claim can't possibly be true.

> I don't agree. @trusted doesn't alert you any more of the possibility of
> a memory safety bug than @safe. You can't assume that an @safe function
> won't corrupt your memory any more than you can assume the same about an
> @trusted function.

@safe on a function tells you no more than it promises: that the compiler will alert you to any memory safety violations it is able to detect in that function's implementation. If it does not detect any, that could be because none are present, or it could be that at some point nested inside the code it calls there is something @trusted that the compiler does not attempt to validate.

@trusted on a function, on the other hand, explicitly tells you that its safety is contingent on something other than compiler validation, and that you should not expect the compiler to validate _any_ part of its internals.

Do you seriously not see any difference between an annotation that tells you: "this function should be memory-safe, and the compiler will attempt to validate this to the best of its ability" versus, "this function should be memory-safe, but the compiler will not make any attempt to validate that"? And do you seriously not see any value in having that distinction clear?

Obviously the allegedly @safe function _could_ just be a thin wrapper around a @trusted lambda containing all sorts of horrible crack. But on balance of probabilities, in any library written by vaguely competent people, that should probably not be your _first_ assumption.

> (It is useless, though.)

Don't mistake "I don't find it useful" for "No one finds it useful".

> Yes, but you find those interesting bits by grepping over the source
> code, not by looking at the attributes of public functions. Many @safe
> functions have @trusted innards that don't show up in API documentation.

FWIW I do think it may be a bit of a code smell or anti-pattern to have nested @trusted functions inside @safe functions. Of course, there's still the possibility that an @safe function can call some external but private @trusted function, but it reduces the attack space if the internal code of an @safe function can be fully validated _in its own terms_ (i.e. if it can be validated that there is nothing _inside the function_ that could cause a memory safety bug).

BTW, that's part of the motivation for my suggestions at the beginning of this thread about how @trusted could be improved:
https://forum.dlang.org/post/heujvxcsppiyagxfvliv@forum.dlang.org

If the compiler would validate all the contents of @trusted functions except for lines contained in `@system { ... }` blocks (see the original example if it's not clear what I mean), then that would probably be a preferred way to achieve what developers currently do using @safe functions with some nested @trusted lambdas. And that would probably mean that APIs would be annotated in a more informative way about the _true_ risk of memory safety bugs.
Jan 15 2020
On 16.01.20 02:43, Joseph Rushton Wakeling wrote:
> Do you seriously not see any difference between an annotation that tells
> you: "this function should be memory-safe, and the compiler will attempt
> to validate this to the best of its ability" versus, "this function
> should be memory-safe, but the compiler will not make any attempt to
> validate that"? And do you seriously not see any value in having that
> distinction clear?

It's an implementation detail. If you care about the distinction, you should check out the function's implementation, not its signature.
Jan 15 2020
On Thursday, 16 January 2020 at 01:53:18 UTC, Timon Gehr wrote:
> It's an implementation detail. If you care about the distinction, you
> should check out the function's implementation, not its signature.

Sure. But on a practical day-to-day basis, @safe vs @trusted signatures help to prioritize one's allocation of care somewhat.

I'm coming to the conclusion that much of the differences of opinion in this thread are between folks who want to see things as absolutes, and folks who recognize that these features are tools for mitigating risk, not eliminating it.
Jan 15 2020
On 16.01.20 03:06, Joseph Rushton Wakeling wrote:
> Sure. But on a practical day-to-day basis, @safe vs @trusted signatures
> help to prioritize one's allocation of care somewhat.

You have to be careful when writing a @trusted function, not when calling it. If you do not trust a given library, there is no reason to be more careful around a @trusted API than around an @safe API, as they do not mean different things. @safe does not fully eliminate risk of memory corruption in practice, but that does not mean there is anything non-absolute about the specifications of the attributes.

As I am sure you understand, if you see an @safe function signature, you don't know that its implementation is not a single @trusted function call, so the difference in signature is meaningless unless you adhere to specific conventions (which the library you will be considering to use as a dependency most likely will not do).

> I'm coming to the conclusion that much of the differences of opinion in
> this thread are between folks who want to see things as absolutes, and
> folks who recognize that these features are tools for mitigating risk,
> not eliminating it.

I was not able to figure out a response to this sentence that is both polite and honest.
Jan 15 2020
On Thursday, 16 January 2020 at 03:34:26 UTC, Timon Gehr wrote:
> You have to be careful when writing a @trusted function, not when
> calling it. If you do not trust a given library, there is no reason to
> be more careful around a @trusted API than around an @safe API, as they
> do not mean different things.

Thanks for being patient enough to clarify the issues here. This helps to make much clearer the thinking behind the point of view that's been articulated elsewhere in this discussion, that the @trusted/@safe distinction is there for library maintainers rather than users. I agree that it's _primarily_ there for maintainers, but I don't agree that it has _no_ value for users (more on that below).

But your observations do make clear how unreasonable my rather judgemental "I'm coming to the conclusion ..." remarks were, so apologies to everyone for that. (It's no excuse, but I was quite tired when I wrote that, and so probably not exercising as much good judgement as I ought.)

> @safe does not fully eliminate risk of memory corruption in practice,
> but that does not mean there is anything non-absolute about the
> specifications of the attributes.

Would we be able to agree that the absolute part of the spec of both amounts to, "The emergence of a memory safety problem inside this function points to a bug either in the function itself or in the initialization of the data that is passed to it" ... ?

(In the latter case I'm thinking that e.g. one can have a perfectly, provably correct @safe function taking a slice as input, and its behaviour can still get messed up because the user initializes a slice in some crazy unsafe way and passes that in.)

> As I am sure you understand, if you see an @safe function signature, you
> don't know that its implementation is not a single @trusted function
> call

Yes, on this we agree. (I even mentioned this case in one of my posts above.)

> so the difference in signature is meaningless unless you adhere to
> specific conventions

Here's where I think we start having a disagreement. I think it is meaningful to be able to distinguish between "The compiler will attempt to validate the memory safety of this function to the extent possible given the @trusted assumptions injected by the developer" (which _might_ be the entirety of the function), versus "The safety of this function will definitely not be validated in any way by the compiler". Obviously that's _more_ helpful to the library authors than users, but it's still informative to the user: it's saying that while the _worst case_ assumptions are the same (100% unvalidated), the best case are not.

> (which the library you will be considering to use as a dependency most
> likely will not do).

Obviously in general one should not assume virtue on the part of library developers. But OTOH in a day-to-day practical working scenario, where one has to prioritize how often one wants to deep-dive into implementation details -- versus just taking a function's signature and docs at face value and only enquiring more deeply if something breaks -- it's always useful to have a small hint about the best vs. worst case scenarios. It's not that @safe provides a stronger guarantee than @trusted, it's that @trusted makes clear that you are definitely in worst-case territory. It's not a magic bullet, it's just another data point that helps inform the question of whether one might want to deep-dive up front or not (a decision which might be influenced by plenty of other factors besides memory safety concerns).
The distinction only becomes meaningless if one is unable to deep-dive and explore the library code.

>> I'm coming to the conclusion that much of the differences of opinion in
>> this thread are between folks who want to see things as absolutes, and
>> folks who recognize that these features are tools for mitigating risk,
>> not eliminating it.
>
> I was not able to figure out a response to this sentence that is both
> polite and honest.

Well FWIW, don't feel the need to be polite to me if you don't think I deserve it :-)

But in any case, apologies again to everyone for those remarks. They were unfair and based on a misinterpretation of the positions being articulated.
Jan 16 2020
On 1/16/20 6:50 AM, Joseph Rushton Wakeling wrote:
> Obviously in general one should not assume virtue on the part of library
> developers. But OTOH in a day-to-day practical working scenario, where
> one has to prioritize how often one wants to deep-dive into
> implementation details -- versus just taking a function's signature and
> docs at face value and only enquiring more deeply if something breaks --
> it's always useful to have a small hint about the best vs. worst case
> scenarios. It's not that @safe provides a stronger guarantee than
> @trusted, it's that @trusted makes clear that you are definitely in
> worst-case territory. It's not a magic bullet, it's just another data
> point that helps inform the question of whether one might want to
> deep-dive up front or not (a decision which might be influenced by
> plenty of other factors besides memory safety concerns).

In fact, because of how the system works, @safe code is LESS likely to mean what you think. If you see an @safe function, it just means "some of this is mechanically checked". It doesn't mean that the function is so much more solid than a @trusted function that you can skip the review. It can have @trusted escapes that force the whole function into the realm of needing review.

I would say that today, @safe and @trusted are indistinguishable from each other as a user of the function. If we moved to a scheme more like I was writing about in the post you quoted, then they actually do start to take on a more solid meaning. It's still not fool-proof -- @safe functions can call @trusted functions, which can call @system functions. BUT if everyone does the job they should be doing, then you shouldn't be able to call @trusted functions and corrupt memory, and you should not have to necessarily review @safe functions.

There are still cases where you still have to review functions that are @safe which do not have inner functions that are @trusted. These are cases where data that is usually accessible to @safe functions can cause memory problems in conjunction with @trusted functions. When you need to break the rules, it's very hard to contain where the rule breaking stops.

-Steve
Jan 16 2020
On Thursday, 16 January 2020 at 15:45:46 UTC, Steven Schveighoffer wrote:
> In fact, because of how the system works, @safe code is LESS likely to
> mean what you think.

I'm not quite sure I follow what you mean here, can you clarify/explain?

> If you see an @safe function, it just means "some of this is
> mechanically checked". It doesn't mean that the function is so much more
> solid than a @trusted function that you can skip the review. It can have
> @trusted escapes that force the whole function into the realm of needing
> review.

Yes, agreed. And hence why your proposal for how to improve @trusted really makes sense.

> If we moved to a scheme more like I was writing about in the post you
> quoted, then they actually do start to take on a more solid meaning.
> It's still not fool-proof -- @safe functions can call @trusted
> functions, which can call @system functions. BUT if everyone does the
> job they should be doing, then you shouldn't be able to call @trusted
> functions and corrupt memory, and you should not have to necessarily
> review @safe functions.

Yes, this was exactly how I interpreted your proposal.

> There are still cases where you still have to review functions that are
> @safe which do not have inner functions that are @trusted. These are
> cases where data that is usually accessible to @safe functions can cause
> memory problems in conjunction with @trusted functions. When you need to
> break the rules, it's very hard to contain where the rule breaking
> stops.

For example, in a class or struct implementation where a private variable can be accessed by both @safe and @trusted methods ... ?
Jan 16 2020
On 1/16/20 1:08 PM, Joseph Rushton Wakeling wrote:
> On Thursday, 16 January 2020 at 15:45:46 UTC, Steven Schveighoffer wrote:
>> In fact, because of how the system works, @safe code is LESS likely to
>> mean what you think.
>
> I'm not quite sure I follow what you mean here, can you clarify/explain?

For example, I want an @safe function that uses malloc to allocate, and free to deallocate. Perhaps that is just scratch space and it's just an implementation detail. I want everything *else* in the function to be @safe. So I have to mark the function @safe, not @trusted. Otherwise I don't get the compiler checks.

In this way, @safe cannot really be used as a marker of "don't need to manually verify" because it's the only way to turn on the mechanical checking. So there is more incentive to mark code @safe than @trusted at the function level. I guess I should have worded it as: you are probably going to see @safe prototypes that more often than not still need checking.

Same goes for template functions. How do you even know whether it can be @safe or not? You can try it, but that doesn't mean there's no @trusted blocks inside.

I just don't see the practicality or validity of worrying about @trusted functions more than @safe ones from a user perspective. That being said, code out there is almost always too trusting when marking functions @trusted. They should be small and easily reviewable. The longer the function, the more chances for assumptions to sneak in.

>> There are still cases where you still have to review functions that are
>> @safe which do not have inner functions that are @trusted. These are
>> cases where data that is usually accessible to @safe functions can
>> cause memory problems in conjunction with @trusted functions. When you
>> need to break the rules, it's very hard to contain where the rule
>> breaking stops.
>
> For example, in a class or struct implementation where a private
> variable can be accessed by both @safe and @trusted methods ... ?

A recent example was a tagged union [1]. The tag is just an integer or boolean indicating which member of the union is valid. As long as the tag matches which actual element of the union is valid, you can use @trusted functions to access the union member. However, @safe code is able to twiddle the tag without the compiler complaining. The @trusted code is expecting the link between the union member that is valid and the tag.

In other words, you can muck with the tag all day long in @safe land, even in a completely @safe function. But it may violate the assumptions that the @trusted functions make, making the other parts unsafe. Therefore, you have to review the whole type, even the @safe calls, to make sure none of them violates the invariant. And this is why some folks (ag0aep6g) disagree that @trusted functions can be valid in this situation -- they have to be valid for ALL inputs in ALL contexts, because the alternative is that you have to manually check @safe code.

I can live with the idea that @safe code needs checking within context, as long as it helps me ensure that *most* of the stuff is right. The other option is to somehow use the compiler to enforce the semantic, like marking the *data* @system. In other words you are telling the compiler "I know that it's normally safe to change this tag, but in this case, you can't, because it will mess things up elsewhere".

-Steve

[1] https://github.com/dlang/phobos/pull/7347
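A minimal sketch of that hazard (type and names are my own, simplified from the kind of code in the linked PR):

struct IntOrPointer
{
    union { int i; int* p; }
    bool isPointer; // the tag

    @trusted int* pointer()
    {
        assert(isPointer); // trusts the tag to be truthful
        return p;
    }
}

@safe void twiddle(ref IntOrPointer u)
{
    u.i = 0x1000;       // fine in @safe: non-pointer union member
    u.isPointer = true; // fine in @safe: it's just a bool -- but it lies,
    *u.pointer() = 42;  // so the @trusted accessor hands back a forged
                        // pointer and this line corrupts memory
}

Every line in twiddle passes the mechanical @safe checks; the corruption only happens because @safe code was free to break the tag invariant that pointer() trusts.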
Jan 16 2020
On Thu, Jan 16, 2020 at 02:18:09PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
> On 1/16/20 1:08 PM, Joseph Rushton Wakeling wrote:
>> For example, in a class or struct implementation where a private
>> variable can be accessed by both @safe and @trusted methods ... ?
>
> A recent example was a tagged union [1]. The tag is just an integer or
> boolean indicating which member of the union is valid. As long as the
> tag matches which actual element of the union is valid, you can use
> @trusted functions to access the union member. However, @safe code is
> able to twiddle the tag without the compiler complaining. The @trusted
> code is expecting the link between the union member that is valid and
> the tag.
>
> In other words, you can muck with the tag all day long in @safe land,
> even in a completely @safe function. But it may violate the assumptions
> that the @trusted functions make, making the other parts unsafe.

Good example! So in this case, the trust really is between the tag and the union, not so much in the @trusted function itself. The @trusted function is really just *assuming* the validity of the correspondence between the tag and the union. Without encoding this context somehow, the compiler cannot guarantee that some outside code (@safe code, no less!) won't break the invariant and thereby invalidate the @trusted function.

> Therefore, you have to review the whole type, even the @safe calls, to
> make sure none of them violates the invariant.

:-( And I guess this extends to any type that has @trusted methods that make assumptions about the data stored in the type. Which logically leads to the idea that the data itself should be tagged somehow, and therefore your idea of tagging the *data*.

[...]
> The other option is to somehow use the compiler to enforce the semantic,
> like marking the *data* @system. In other words you are telling the
> compiler "I know that it's normally safe to change this tag, but in this
> case, you can't, because it will mess things up elsewhere".
[...]

So it's basically a way of tainting any code that touches the data, such that you're not allowed to touch the data unless you are @system or @trusted. This actually makes a lot of sense, the more I think about it.

Take a pointer T*, for example. Why is it illegal to modify the pointer (i.e. do pointer arithmetic with T*) in @safe code? The act of changing the pointer doesn't in itself corrupt memory. What corrupts memory is when the pointer is changed in a way that *breaks assumptions* laid upon it by @safe code, such that when we subsequently dereference it, we may end up in UB land. We may say that pointer dereference is @trusted, in the same sense as the tagged union access you described -- it's assuming that the pointer points to something valid -- and our pointer arithmetic has just broken that assumption.

Similarly, it's illegal to manipulate the .ptr field of an int[] in @safe code: not because that in itself corrupts memory, but because that breaks the assumption that an expression like arr[i] will access valid data (provided it's within the bounds of .length). Again, the manipulation of .ptr is @system, and array dereference with [i] is @trusted in the same sense as tagged union access: arr[i] *assumes* that there's a valid correspondence between .ptr, .length, and whatever .ptr points to.

If therefore we prohibit manipulating .ptr in @safe code but allow arr[i] (which makes assumptions about .ptr), then it makes sense to prohibit manipulation of the tagged union's tag field and allow the @trusted member to look up union fields.
It could even be argued that union field lookup ought to be @safe in the same way arr[i] is @safe: it won't corrupt memory or read out-of-bounds, contingent upon the assumptions laid on an int[]'s .ptr and .length fields not having been broken.

IOW, we're talking about "sensitive data" here, i.e., data that must not be modified in the wrong ways because it will break assumptions that other code has laid upon it. Manipulating pointers is @system because pointers are sensitive data. Manipulating ints is @safe because ints are not sensitive data. In the same vein, the tag field of a tagged union is sensitive data, and therefore manipulating it must be @system, i.e., only a @trusted function ought to be allowed to do that.

By default, @safe comes with its own set of what constitutes sensitive data, and operations on such data are rightfully restricted. Allowing the user to tag data as sensitive seems to be a logical extension of @safe.

T

--
It said to install Windows 2000 or better, so I installed Linux instead.
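A quick sketch of that default set of "sensitive data" rules (my example; the commented-out lines are the ones current compilers reject in @safe code):

@safe void sensitive(int[] arr, int* p)
{
    int x = *p;     // allowed: dereference trusts the pointer's validity
    // ++p;         // rejected: pointer arithmetic could break that trust
    int y = arr[0]; // allowed: bounds-checked against arr.length
    // auto q = arr.ptr;        // rejected: raw .ptr access
    // arr = arr.ptr[0 .. 999]; // rejected: would falsify .length
}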
Jan 16 2020
On Thursday, 16 January 2020 at 20:47:54 UTC, H. S. Teoh wrote:
> On Thu, Jan 16, 2020 at 02:18:09PM -0500, Steven Schveighoffer [...]
>> The other option is to somehow use the compiler to enforce the
>> semantic, like marking the *data* @system. In other words you are
>> telling the compiler "I know that it's normally safe to change this
>> tag, but in this case, you can't, because it will mess things up
>> elsewhere".
>
> [...] So it's basically a way of tainting any code that touches the
> data, such that you're not allowed to touch the data unless you are
> @system or @trusted.
[...]
> By default, @safe comes with its own set of what constitutes sensitive
> data, and operations on such data are rightfully restricted. Allowing
> the user to tag data as sensitive seems to be a logical extension of
> @safe.

For reference, here's the upcoming DIP: https://github.com/dlang/DIPs/pull/179
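Sketching what that proposal is driving at (the linked PR is the "@system variables" DIP; this is a simplified rendering of the idea, not its final syntax or rules):

struct Tagged
{
    union { int i; int* p; }
    @system bool isPointer; // under the proposal, @safe code may neither
                            // read nor write this variable
}

@safe void twiddle(ref Tagged t)
{
    // t.isPointer = true; // would become a compile error, closing the
                           // tag-twiddling loophole discussed upthread
}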
Jan 16 2020
On Thu, Jan 16, 2020 at 10:45:46AM -0500, Steven Schveighoffer via Digitalmars-d wrote:
[...]
> In fact, because of how the system works, @safe code is LESS likely to
> mean what you think. If you see an @safe function, it just means "some
> of this is mechanically checked". It doesn't mean that the function is
> so much more solid than a @trusted function that you can skip the
> review. It can have @trusted escapes that force the whole function into
> the realm of needing review.
[...]

Yeah, that's the part that makes me uncomfortable every time I see a @trusted lambda inside a function that *clearly* does not sport a safe interface, as in, its safety is dependent on the surrounding code.

I think it would be better to completely outlaw @trusted blocks inside an @safe function, and to require calling an external @trusted function. And inside a @trusted function, most of the body would still be subject to @safe checks, except for explicitly marked-out @system scopes.

This way, the meaning of @safe becomes "this function has been thoroughly mechanically checked, and it will not corrupt memory provided all @trusted functions that it calls operate correctly". And @trusted would mean "this function has been mechanically checked except for those blocks explicitly marked @system, which must be reviewed manually together with the rest of the function body".

The latter is useful as a preventative measure: if you allow unrestricted use of @system code inside a @trusted function, then every single code change made to that function requires the manual re-evaluation of the entire function, because you don't know if you've inadvertently introduced a safety hole. Not allowing @system code by default means if you accidentally slip up outside of the isolated @system blocks, the compiler will complain and you will fix it. This way, you minimize the surface area of potential problems to a smaller scope inside the @trusted function, and leverage the compiler's automatic checks to catch your mistakes, as opposed to having zero safeguards as soon as you slap @trusted on your function.

T

--
Try to keep an open mind, but not so open your brain falls out. -- theboz
Jan 16 2020
On 16.01.20 12:50, Joseph Rushton Wakeling wrote:On Thursday, 16 January 2020 at 03:34:26 UTC, Timon Gehr wrote:More or less. Two points: - The _only_ precondition a @trusted/@safe function can assume for guaranteeing no memory corruption is that there is no preexisting memory corruption. - For callers that treat the library as a black box, this definition is essentially sufficient. (This is why there is not really a reason to treat the signatures differently to the point where changing from one to the other is a breaking API change.) White-box callers get the additional language guarantee that if the function corrupts memory, that happens while it is executing some bad @trusted code; this is the motivation behind having both @safe and @trusted. @system exists because in low-level code, sometimes you want to write or use functions that have highly non-trivial preconditions for ensuring that no memory corruption happens....@safe does not fully eliminate risk of memory corruption in practice, but that does not mean there is anything non-absolute about the specifications of the attributes.Would we be able to agree that the absolute part of the spec of both amounts to, "The emergence of a memory safety problem inside this function points to a bug either in the function itself or in the initialization of the data that is passed to it" ... ? ...(In the latter case I'm thinking that e.g. one can have a perfectly, provably correct @safe function taking a slice as input, and its behaviour can still get messed up because the user initializes a slice in some crazy unsafe way and passes that in.) ...That is preexisting memory corruption. If you use @trusted/@system code to destroy an invariant that the @safe part of the language assumes to hold for a given type, you have corrupted memory.It is possible to write a @trusted function that consists of a single call to a @safe function, so you are assuming a convention where people do not call @safe code from @trusted code in certain ways. Anyway, my central point was that it is an implementation detail. That does not mean it is necessarily useless to a user in all circumstances, but that someone who writes a library will likely choose to hide it.As I am sure you understand, if you see a @safe function signature, you don't know that its implementation is not a single @trusted function callYes, on this we agree. (I even mentioned this case in one of my posts above.)so the difference in signature is meaningless unless you adhere to specific conventionsHere's where I think we start having a disagreement. I think it is meaningful to be able to distinguish between "The compiler will attempt to validate the memory safety of this function to the extent possible given the trusted assumptions injected by the developer" (which _might_ be the entirety of the function), versus "The safety of this function will definitely not be validated in any way by the compiler". Obviously that's _more_ helpful to library authors than to users, but it's still informative to the user: it's saying that while the _worst case_ assumptions are the same (100% unvalidated), the best case is not. ...Right now, the library developer has a valid incentive to actively avoid @trusted functions in their API. This is because it is always possible to do so: @trusted is an implementation detail, and changing this detail can in principle break dependent code.
(E.g., a template instantiated with a @safe delegate will give you a different instantiation from the same template instantiated with a @trusted delegate, and if, e.g., you have some static cache in your template function, a change from @safe to @trusted in some API can silently slow down the downstream application by a factor of two, change iteration orders through hash tables, etc.)(which the library you will be considering to use as a dependency most likely will not do).Obviously in general one should not assume virtue on the part of library developers. But OTOH in a day-to-day practical working scenario, where one has to prioritize how often one wants to deep-dive into implementation details -- versus just taking a function's signature and docs at face value and only enquiring more deeply if something breaks -- it's always useful to have a small hint about the best vs. worst case scenarios. ...It's not that @safe provides a stronger guarantee than @trusted, it's that @trusted makes clear that you are definitely in worst-case territory. It's not a magic bullet, it's just another data point that helps inform the question of whether one might want to deep-dive up front or not (a decision which might be influenced by plenty of other factors besides memory safety concerns). The distinction only becomes meaningless if one is unable to deep-dive and explore the library code. ...I just think that if you are willing to do that, you should use e.g. grep, not the function signature, where a competent library author will likely choose to hide @trusted as an implementation detail.
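Timon's parenthetical about distinct instantiations is checkable directly; a small sketch (hypothetical names) of how the attribute leaks into template code today:

    // Each distinct function type produces its own instantiation, so any
    // per-instantiation static state is duplicated as well.
    void each(F)(F f) { f(); }

    void main() @safe
    {
        auto s = () @safe {};
        auto t = () @trusted {};
        static assert(!is(typeof(s) == typeof(t))); // @safe vs @trusted differ
        each(s); // instantiates each!(typeof(s))
        each(t); // instantiates each!(typeof(t)) -- a second, separate copy
    }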
Jan 16 2020
On Thu, 16 Jan 2020 04:34:26 +0100, Timon Gehr wrote:On 16.01.20 03:06, Joseph Rushton Wakeling wrote:I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch? Only @safe/@system as function attributes, and @trusted (or @system) blocks which can be used in @safe functions? -- JohannesOn Thursday, 16 January 2020 at 01:53:18 UTC, Timon Gehr wrote:You have to be careful when writing a @trusted function, not when calling it. If you do not trust a given library, there is no reason to be more careful around a @trusted API than around a @safe API, as they do not mean different things. @safe does not fully eliminate risk of memory corruption in practice, but that does not mean there is anything non-absolute about the specifications of the attributes. As I am sure you understand, if you see a @safe function signature, you don't know that its implementation is not a single @trusted function call, so the difference in signature is meaningless unless you adhere to specific conventions (which the library you will be considering to use as a dependency most likely will not do).It's an implementation detail. If you care about the distinction, you should check out the function's implementation, not its signature.Sure. But on a practical day-to-day basis, @safe vs @trusted signatures help to prioritize one's allocation of care somewhat. ...I'm coming to the conclusion that much of the differences of opinion in this thread are between folks who want to see things as absolutes, and folks who recognize that these features are tools for mitigating risk, not eliminating it.I was not able to figure out a response to this sentence that is both polite and honest.
Jan 17 2020
On Friday, 17 January 2020 at 08:10:48 UTC, Johannes Pfau wrote:I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch? Only @safe/@system as function attributes and @trusted (or @system) blocks which can be used in @safe functions?Yes, pretty much the same as in Rust. And we can get there: we just need to introduce @trusted blocks (which would not change what is already possible but only simplify the syntax) and change all @trusted functions to @safe (with one big @trusted block around the function body). These are pretty small changes, but would yield a much better structure.
Jan 17 2020
On Friday, 17 January 2020 at 09:21:30 UTC, Dominikus Dittes Scherkl wrote:On Friday, 17 January 2020 at 08:10:48 UTC, Johannes Pfau wrote:And by the way: I would not call the blocks @system; they should be distinguishable from @system functions to make them better searchable! I still think the @trusted blocks should be rare and small and only necessary at the lowest level deep within libraries.I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch? Only @safe/@system as function attributes and @trusted (or @system) blocks which can be used in @safe functions?Yes, pretty much the same as in Rust. And we can get there: we just need to introduce @trusted blocks (which would not change what is already possible but only simplify the syntax) and change all @trusted functions to @safe (with one big @trusted block around the function body). These are pretty small changes, but would yield a much better structure.
Jan 17 2020
On 1/17/20 4:39 AM, Dominikus Dittes Scherkl wrote:On Friday, 17 January 2020 at 09:21:30 UTC, Dominikus Dittes Scherkl wrote:@system blocks would only live inside @trusted functions. So you can search for @trusted, and then look for the @system blocks to see which parts need attention. -SteveOn Friday, 17 January 2020 at 08:10:48 UTC, Johannes Pfau wrote:And by the way: I would not call the blocks @system; they should be distinguishable from @system functions to make them better searchable! I still think the @trusted blocks should be rare and small and only necessary at the lowest level deep within libraries.I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch? Only @safe/@system as function attributes and @trusted (or @system) blocks which can be used in @safe functions?Yes, pretty much the same as in Rust. And we can get there: we just need to introduce @trusted blocks (which would not change what is already possible but only simplify the syntax) and change all @trusted functions to @safe (with one big @trusted block around the function body). These are pretty small changes, but would yield a much better structure.
Jan 17 2020
On Friday, 17 January 2020 at 09:21:30 UTC, Dominikus Dittes Scherkl wrote:[snip] Yes, pretty much the same as in Rust.Somewhat OT, but HN is blowing up with an article [1] about the Rust community complaining about `unsafe` being overused, resulting in the maintainer of an HTTP framework quitting. [1] https://words.steveklabnik.com/a-sad-day-for-rust
Jan 17 2020
On 1/17/20 11:33 AM, jmh530 wrote:On Friday, 17 January 2020 at 09:21:30 UTC, Dominikus Dittes Scherkl wrote:Well, those people for sure won't be running to D any time soon. We have had our share of quitters as well. It happens; people get very emotional sometimes. It's good to remember that the Internet is not a nice place, and it's good to have somewhat of a thick skin for that kind of stuff. -Steve[snip] Yes, pretty much the same as in Rust.Somewhat OT, but HN is blowing up with an article [1] about the Rust community complaining about `unsafe` being overused, resulting in the maintainer of an HTTP framework quitting. [1] https://words.steveklabnik.com/a-sad-day-for-rust
Jan 17 2020
On Friday, 17 January 2020 at 16:57:04 UTC, Steven Schveighoffer wrote:[snip] Well, those people for sure won't be running to D any time soon. We have had our share of quitters as well. It happens; people get very emotional sometimes. It's good to remember that the Internet is not a nice place, and it's good to have somewhat of a thick skin for that kind of stuff. -SteveI don't disagree with your sentiments. My interest in the piece was also that Rust doesn't have the @trusted/@system dichotomy. The problem the community had with the code was that there were `unsafe` blocks all over the place, which made it difficult to verify whether it was safe or not. I think that ties in with some of the discussion here.
Jan 17 2020
On 17.01.20 18:16, jmh530 wrote:My interest in the piece was also that Rust doesn't have the @trusted/@system dichotomy.They really do, and the fact that many people do not understand this is a pretty solid argument against Rust's `unsafe` keyword. In Rust, the `unsafe` keyword is overloaded to have dual meanings. Functions with `unsafe` blocks are @trusted functions. Functions annotated `unsafe` are @system functions.
Jan 17 2020
On Friday, 17 January 2020 at 18:12:21 UTC, Timon Gehr wrote:[snip] They really do, and the fact that many people do not understand this is a pretty solid argument against Rust's `unsafe` keyword. In Rust, the `unsafe` keyword is overloaded to have dual meanings. Functions with `unsafe` blocks are @trusted functions. Functions annotated `unsafe` are @system functions.Ah, that is correct. I had not even realized it. Some of the posters in this thread, and in some other recent ones, who are skeptical of @trusted would thus find this useful. Rust has the equivalent of @system functions and @system blocks, but no equivalent keyword to express @trusted. At the same time, they are also having problems with people marking too much code as their equivalent of @system, contributing to problems in their community. This would suggest removing @trusted should have a very high bar.
Jan 17 2020
On Friday, 17 January 2020 at 16:33:14 UTC, jmh530 wrote:On Friday, 17 January 2020 at 09:21:30 UTC, Dominikus Dittes Scherkl wrote:I think this is pretty funny and also predictable. There is an overuse of `unsafe` in Rust because the programmers want/must escape their Gulag in order to do the things they want; it's human behaviour. I have long predicted that Rust is not the way forward for programming languages, even if the language has some good things, and now they are trying to escape from it. BTW, here is another proof of that, the blog "Learning Rust the dangerous way", http://cliffle.com/p/dangerust/, which was appreciated by many users and beginners. This is one of the reasons I'm a bit skeptical of DIP 1028, https://forum.dlang.org/thread/ejaxvwklkyfnksjkldux@forum.dlang.org: that people will not value it that much. I have nothing against a @safe subset, but I'm not sure making it the default is the right way to go. A bit OT, but there will be another round of that DIP at least.[snip] Yes, pretty much the same as in Rust.Somewhat OT, but HN is blowing up with an article [1] about the Rust community complaining about `unsafe` being overused, resulting in the maintainer of an HTTP framework quitting. [1] https://words.steveklabnik.com/a-sad-day-for-rust
Jan 17 2020
On Fri, Jan 17, 2020 at 05:28:09PM +0000, IGotD- via Digitalmars-d wrote:On Friday, 17 January 2020 at 16:33:14 UTC, jmh530 wrote:[...]The other human behaviour is that people form habits and then resist changing said habits. See, the thing is that there's a lot to be said about defaults that incentivize people to do things the Right Way(tm). You provide the option to do things differently, there's an escape hatch for when you need it, but you also nudge them in the right direction, so that if they're undecided or not paying attention, they automatically default to doing it the right way. One thing that D did quite well IMO is that the default way to do things often coincides with the best way. As opposed to, say, C++, where the most obvious way to write a piece of code is almost certainly the wrong way, due to any number of potential problems (built-in arrays are unsafe, avoid raw pointers, avoid new, avoid writing loops, avoid mutation, the list goes on). But you have to start out with the right defaults, because once people form habits around those defaults, they will resist change. Inertia is a powerful force. One of the areas where D didn't incentivize in the right way is being @system by default. DIP 1028 is trying to change that, but you see the consequences of not starting out that way in the first place: people are resisting it because they have become accustomed to @system by default, and dislike changing their habits. [...][1] https://words.steveklabnik.com/a-sad-day-for-rustI think this is pretty funny and also predictable. There is an overuse of `unsafe` in Rust because the programmers want/must escape their Gulag in order to do the things they want; it's human behaviour.This is one of the reasons I'm a bit skeptical of DIP 1028, https://forum.dlang.org/thread/ejaxvwklkyfnksjkldux@forum.dlang.org: that people will not value it that much. I have nothing against a @safe subset, but I'm not sure making it the default is the right way to go.[...] IMO it would have worked had it been the default from the very beginning. This is why language decisions are so hard to make, because you don't really know what's the best design except in retrospect, but wrong decisions are hard to change after the fact because of inertia. By the time you accumulate enough experience to know what would have worked better, you may already be stuck with the previous decision. T -- Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be algorithms.
Jan 17 2020
On Friday, 17 January 2020 at 19:21:14 UTC, H. S. Teoh wrote:See, the thing is that there's a lot to be said about defaults that incentivize people to do things the Right Way(tm).This is overblown. Rather than go by secondary sources, go straight to the author: «Actix always will be “shit full of UB” and “benchmark cheater”. (Btw, with tfb benchmark I just wanted to push rust to the limits, I wanted it to be on the top, I didn’t want to push other rust frameworks down.)» https://github.com/actix/actix-web He was clearly exploring a terrain where it makes sense to go low level, that is, he tried to get ahead in benchmarks. You get what you pay for...
Jan 17 2020
On 17.01.20 09:10, Johannes Pfau wrote:On Thu, 16 Jan 2020 04:34:26 +0100, Timon Gehr wrote: ... I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch?Different approaches have different trade-offs and I am not sure how to weight the benefits and drawbacks. However, whatever the final solution is, the type system should not make a difference between @safe and @trusted function signatures.Only @safe/@system as function attributes and @trusted (or @system) blocks which can be used in @safe functions?I do not particularly like this option. @trusted blocks are better than @system blocks, but an implicit assumption many people are still going to make is that only the code inside the block is trusted, and this will typically be incorrect.
Jan 17 2020
On Friday, 17 January 2020 at 08:10:48 UTC, Johannes Pfau wrote:I'm curious, what do you think would be the ideal scheme if we could redesign it from scratch? Only @safe/@system as function attributes and @trusted (or @system) blocks which can be used in @safe functions?For the record, this is also exactly what I argued for in the thread linked in the original post, from more than 7 years ago. No redesigning from scratch is needed for this. Just add @trusted blocks and discourage (and perhaps slowly deprecate, over years) the use of @trusted as a function attribute. — David
Jan 18 2020
On Thursday, 16 January 2020 at 00:21:21 UTC, Joseph Rushton Wakeling wrote:The fact that in a similar situation D forces you to annotate the function with `@trusted`, and alert users to the _possibility_ that memory safety bugs could exist within the code of this function, is useful information even if you can't access the source code.Detail the scenario where this would be useful, please. If you want to audit a program to make sure there are no uses of potentially memory-unsafe code, you need access to all the source code: even @safe functions can contain arbitrary amounts of potentially unsafe code, as they can call into @trusted functions. You make this point yourself in the quoted post; any information conveyed by @trusted is necessarily incomplete, by virtue of it being intransitive. In other words, this "alert", as you put it, has next to no information content on account of its arbitrarily high false-negative rate (@safe functions calling into unsafe code), and is thus worse than useless. If you don't have access to the source code, you don't know anything about what is used to implement a @safe function. If you do, you don't care where exactly the keyword you are grepping for in an audit is, as long as it is proximate to the potentially-unsafe code. @trusted has no place at the API level. — David
Jan 18 2020
On Saturday, 18 January 2020 at 21:44:17 UTC, David Nadlinger wrote:On Thursday, 16 January 2020 at 00:21:21 UTC, Joseph Rushton Wakeling wrote:I'm honestly not sure what is useful to add to the existing discussion. But I think the problem is that most folks are looking for certainties, whereas I'm prepared to entertain a certain amount of probability in certain contexts. (Please read on to understand what I mean by that, rather than assuming that I'm prepared to allow memory safety to be a roll of the dice:-) Put it like this: for any given @safe function you see, odds are that in practice it uses no @trusted code outside the standard library/runtime. Is that a guarantee? No. But it's a reasonable heuristic to use on a day-to-day "How concerned do I have to be that this function might do something scary?" basis. (Depending on what the function does, I might be able to make an educated guess of the likelihood that there's something @trusted closer to home.) OTOH if I see a function marked as @trusted I have a cast-iron guarantee that the compiler did not do anything to verify the memory safety of _this particular function_. Which gives me a nudge that on the balance of probability, I might want to give a bit more up-front scrutiny to exactly what it's doing -- either by reading the source code if I can, or by playing with it a bit to see if I can trip it up with some unexpected input. Is that an _audit_? No. But it doesn't seem vastly different from the kind of trust-versus-verify judgement calls that we all have to make, on a day-to-day basis, about all sorts of matters of code correctness in the APIs that we use.The fact that in a similar situation D forces you to annotate the function with `@trusted`, and alert users to the _possibility_ that memory safety bugs could exist within the code of this function, is useful information even if you can't access the source code.Detail the scenario where this would be useful, please.If you want to audit a program to make sure there are no uses of potentially memory-unsafe code, you need access to all the source codeYes, agreed. But that's the difference between doing a full safety audit versus the typical day-to-day "Does it seem reasonable to use this function for my use-case?" judgement calls that we all make when writing code.
Jan 23 2020
On 1/15/20 1:06 PM, ag0aep6g wrote:On 15.01.20 17:54, Joseph Rushton Wakeling wrote:On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:[...][...]

@safe fun() {
    // safe code here
    @trusted {
        // those few lines that require manual checking
    }
    // we are safe again
}

So here's the problem with this approach (which was mentioned by several people in the discussion): the actual safety of a function like this is usually down to the combination of the lines that (in your example) are both inside and outside the @trusted block.Yup. But a proposal could specify that that's the intended meaning for an @safe function that contains @trusted blocks. Whereas it's more of a cheat when we use @trusted nested functions like that. [...]So, a better approach would be for the function to be marked up like this:

@trusted fun () // alerts the outside user
{
    // lines that on their own are provably safe go here
    @system {
        // these lines are allowed to use @system code
    }
    // only provably safe lines here again
}

... and the compiler's behaviour would be to explicitly verify standard @safe rules for all the lines inside the @trusted function _except_ the ones inside a @system { ... } block. Cf. Steven Schveighoffer's remarks here: https://forum.dlang.org/post/qv7t8b$2h2t$1@digitalmars.com This way the function signature gives a clear indicator to the user which functions are provably safe, and which are safe only on the assumption that the developer has done their job properly.I don't think that's what Steven had in mind. In that world, @safe would be very, very limited, because it couldn't be allowed to call @trusted functions. That means @safe would only apply to trivial functions, and @trusted would assume the role that @safe has today. But you'd have to wrap every call from an @trusted to another @trusted function in an @system block. It wouldn't be practical.The real purpose of @trusted in that example is to allow the @system block in the body, and to signal to reviewers and maintainers that the whole function is unsafe despite the mechanical checks that are done on most of the lines. To a user, @trusted functions would still be the same as @safe ones.I'll interject here. I was thinking actually exactly along the lines of Joseph's code, and exactly what you are saying as well. And I think Joe's ideas are along the same lines; they were just misinterpreted here. There are two things to look at for safety. One is whether a function is safe or not safe (that is, it has a safe implementation, even if there are calls to @system functions, and is therefore callable from mechanically checked @safe code). This is the part where the compiler uses function attributes to determine what is callable and what is not. The second is how much manual review is needed for the code. This is a signal to the reviewer/reader. In the current regime, the two reasons for marking are muddled -- we don't have a good way to say "this needs manual checking, but I also want the benefits of mechanical checking". This is why I proposed a change to @trusted code STILL being mechanically checked, unless you want an escape. This would allow you to mark all code that needs manual review @trusted, even if it's mechanically checked (it still needs review if the @system-calling parts can muck with the data). There may be a case as well for making data only accessible from @system escapes, because the semantics of the data affect the memory safety of an aggregate.Unfortunately, adding the mechanical checks of @safe to @trusted would mean breaking all @trusted code that exists.
So implementing that scheme seems unrealistic.Yeah, most likely. We would possibly need a fourth attribute for this purpose, or we can continue to rely on @safe/@trusted meaning what they mean today (it's doable, though confusing).But as Steven says, it can be done when we use @trusted blocks instead of @system blocks and @safe instead of @trusted on the function. I.e.:

@safe fun () {
    // lines that the compiler accepts as @safe go here
    @trusted {
        // these lines are allowed to use @system code
    }
    // only @safe lines here again
}

It weakens the meaning of @safe somewhat, but it's often treated that way already. There's clearly a need.@safe code is tremendously hampered without @trusted. You could only do things like simple math. As soon as you start needing things like memory allocation or I/O, you need escapes. That is the reality we have. -Steve
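For reference, the escape idiom that exists in D today, which the block proposals above aim to replace, is the immediately-invoked @trusted lambda; a minimal sketch (hypothetical function, not code from the thread):

    @safe int[] dupWithMalloc(const(int)[] src)
    {
        import core.stdc.stdlib : malloc;

        // The escape: only these calls skip the mechanical checks, but their
        // safety depends on the surrounding @safe lines using `p` correctly.
        auto p = () @trusted { return cast(int*) malloc(src.length * int.sizeof); }();
        if (p is null) assert(0, "allocation failed");

        auto result = () @trusted { return p[0 .. src.length]; }();
        result[] = src[];
        return result;
    }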
Jan 16 2020
On Thursday, 16 January 2020 at 15:30:45 UTC, Steven Schveighoffer wrote:The second is how much manual review is needed for the code. This is a signal to the reviewer/reader. In the current regime, the two reasons for marking are muddled -- we don't have a good way to say "this needs manual checking, but I also want the benefits of mechanical checking". This is why I proposed a change to @trusted code STILL being mechanically checked, unless you want an escape. This would allow you to mark all code that needs manual review @trusted, even if it's mechanically checked (it still needs review if the @system-calling parts can muck with the data).As was pointed out, @trusted does not achieve much more than a comment, so why not just have a statement-/operator-level escape using a distinguishable and greppable marker like @@. Then you can just prepend that to all function calls or operations that are unsafe:

safe_function(…){
    ptr = …
    //@@TRUSTED: this is safe because x, y, z
    @@free(ptr);
}

Then leave on any existing mechanical checks and keep adding @@ until it passes.
Jan 16 2020
On 1/16/20 10:46 AM, Ola Fosheim Grøstad wrote:On Thursday, 16 January 2020 at 15:30:45 UTC, Steven Schveighoffer wrote:Enforcing the name of the comment helps the greppability. And putting @trusted on the entire function helps avoid grep finding blocks whereby you now have to dig back up to find the context. @safe functions with no @trusted escapes have a much different level of review required. If you can assume all the inputs are valid, then the function shouldn't need review -- even if it calls @trusted functions. I was looking for a distinction between @safe functions like this, and @safe ones with escapes. I do like the @@ syntax, though @system free(ptr); isn't much worse. -SteveThe second is how much manual review is needed for the code. This is a signal to the reviewer/reader. In the current regime, the two reasons for marking are muddled -- we don't have a good way to say "this needs manual checking, but I also want the benefits of mechanical checking". This is why I proposed a change to @trusted code STILL being mechanically checked, unless you want an escape. This would allow you to mark all code that needs manual review @trusted, even if it's mechanically checked (it still needs review if the @system-calling parts can muck with the data).As was pointed out, @trusted does not achieve much more than a comment, so why not just have a statement-/operator-level escape using a distinguishable and greppable marker like @@. Then you can just prepend that to all function calls or operations that are unsafe:

safe_function(…){
    ptr = …
    //@@TRUSTED: this is safe because x, y, z
    @@free(ptr);
}

Then leave on any existing mechanical checks and keep adding @@ until it passes.
Jan 16 2020
On Thursday, 16 January 2020 at 15:46:01 UTC, Ola Fosheim Grøstad wrote:Then leave on any existing mechanical checks and keep adding @@ until it passes.Note: for this to work in the general case (with more advanced verification) you most likely need to add an assume construct to tell the compiler what invariants have been upheld by the escaped/unknown code that was called.
Jan 16 2020
On Thursday, 16 January 2020 at 15:30:45 UTC, Steven Schveighoffer wrote: [...]Three ways to design @safe/@trusted:

1) @safe cannot call @trusted. I agree that this would be virtually useless.

2) @safe can call @trusted. @trusted functions must be self-contained safe. This is how @trusted is currently meant to work.

3) @safe code can contain @trusted parts (blocks/lambdas). Those @trusted parts may rely on the surrounding @safe code for safety. The @safe parts are effectively in a third state: mechanical checks are performed, but manual checks are still needed to verify the interaction with the @trusted parts.

(A small sketch contrasting options 2 and 3 follows this post.) [...] are both on the table, as far as I'm concerned. [...] doing that is strictly speaking valid today. But if we do get @trusted blocks, that's an opportunity to make it valid. I.e., "[weakening] the meaning of @safe". In contrast, with @system blocks (in @trusted functions) we could [...]. To reiterate, I'm ok with both @system blocks and @trusted getting any kind of block syntax. But that would be harder to formalize, and there's no point in doing it if we want to transition to block syntax anyway. [...] that one way or the other. And if block syntax gets introduced, then that's the time to strike.But as Steven says, it can be done when we use @trusted blocks instead of @system blocks and @safe instead of @trusted on the function. I.e.:

@safe fun () {
    // lines that the compiler accepts as @safe go here
    @trusted {
        // these lines are allowed to use @system code
    }
    // only @safe lines here again
}

It weakens the meaning of @safe somewhat, but it's often treated that way already. There's clearly a need.@safe code is tremendously hampered without @trusted. You could only do things like simple math. As soon as you start needing things like memory allocation or I/O, you need escapes. That is the reality we have.
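A small sketch (hypothetical functions) contrasting options 2 and 3:

    // Option 2: self-contained @trusted -- memory-safe for any argument.
    @trusted int first(int[] a)
    {
        return a.length ? *a.ptr : 0;   // checks its own precondition
    }

    // Option 3: a @trusted part relying on the surrounding @safe code.
    @safe int at(int[] a, size_t i)
    {
        if (i >= a.length) assert(0);   // the lambda below depends on this check
        return () @trusted { return a.ptr[i]; }();
    }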
Jan 16 2020
On Thu, Jan 16, 2020 at 10:30:45AM -0500, Steven Schveighoffer via Digitalmars-d wrote: [...]There are two things to look at for safety. One is whether a function is safe or not safe (that is, it has a safe implementation, even if there are calls to @system functions, and is therefore callable from mechanically checked @safe code). This is the part where the compiler uses function attributes to determine what is callable and what is not. The second is how much manual review is needed for the code. This is a signal to the reviewer/reader. In the current regime, the two reasons for marking are muddled -- we don't have a good way to say "this needs manual checking, but I also want the benefits of mechanical checking". This is why I proposed a change to @trusted code STILL being mechanically checked, unless you want an escape. This would allow you to mark all code that needs manual review @trusted, even if it's mechanically checked (it still needs review if the @system-calling parts can muck with the data).[...] This is why I proposed that @trusted functions should *still* be subject to @safe checks, only with the exception that they're allowed to have embedded @system blocks where such checks are relaxed (and these @system blocks are only allowed inside @trusted functions). So the @trusted is a visual marker that it needs to be manually verified, but you still have the benefit of the compiler automatically verifying most of its body except for the (hopefully small) @system block where such checks are temporarily suspended. T -- It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
Jan 16 2020
On Thursday, 16 January 2020 at 18:05:25 UTC, H. S. Teoh wrote:This is why I proposed that @trusted functions should *still* be subject to @safe checks, only with the exception that they're allowed to have embedded @system blocks where such checks are relaxed (and these @system blocks are only allowed inside @trusted functions). So the @trusted is a visual marker that it needs to be manually verified, but you still have the benefit of the compiler automatically verifying most of its body except for the (hopefully small) @system block where such checks are temporarily suspended.Hang on, have 3 of us all made the same proposal? (OK, I just reiterated what I understood to be Steven's proposal, but ...:-) I'll leave it to others to decide if we're great minds or fools or anything in between ;-)
Jan 16 2020
On Thu, Jan 16, 2020 at 06:10:16PM +0000, Joseph Rushton Wakeling via Digitalmars-d wrote:On Thursday, 16 January 2020 at 18:05:25 UTC, H. S. Teoh wrote:Fools or not, the important thing is whether we can convince Walter to agree with this... This is far from the first time such an idea came up. I remember back when Mihail Strashuns was actively contributing to Phobos, we had this discussion on Github where we agreed that we'd like to reduce the scope of @trusted as much as possible, meaning the unsafe parts of @trusted should be as small as possible in order to minimize the surface area of potential problems. This was before people came up with the idea of a nested @trusted lambda. We both felt very uncomfortable that there were some functions in Phobos that were marked @trusted, but were so large that it was impractical to review the entire function body for correctness. Furthermore, since Phobos at the time was undergoing a rapid rate of change, we were uncomfortable with the idea that any random PR might touch some seemingly-innocuous part of a @trusted function and break its safety, yet there would be no warning whatsoever from the autotester because the compiler simply turned off all checks inside a @trusted function. IIRC it was that discussion, which later led to other, further discussions, that eventually resulted in the idea of using nested @trusted lambdas inside functions. Of course, in the interim, we also learned from Walter what his stance was: @trusted functions should sport a safe API, i.e., even though by necessity it has to do uncheckable things inside, its outward-facing API should be such that it's impossible to break its safety without also breaking your own safety. I.e., taking `int[]` is OK because, presumably, @safe code will not allow you to construct an `int[]` that has an illegal pointer or wrong length; but taking `int*, size_t` is not OK, because the caller can just pass the wrong length and you're screwed. Eventually, this restriction was relaxed for nested @trusted lambdas, due to the API restriction being too onerous and impractical in some cases. Regardless of what the story was, the basic idea is the same: to shrink the scope of unchecked code as much as possible, and to leverage the compiler's @safe-ty checks as much as possible. Ideally, most of a @trusted function's body should actually be @safe, and only a small part @system -- where the compiler is unable to mechanically verify its correctness. That way, if you make a mistake while editing a @trusted function, most of the time the compiler will catch it. Only inside the @system block (or whatever we decide to call the unchecked block) are the checks suspended, and there you have to be extra careful when making changes. Basically, we want all the help we can get from the compiler to minimize human error, and we want to reduce the scope of human error to as narrow a scope as possible (while acknowledging that we can never fully eliminate it -- which is why we need @trusted in the first place). T -- Once bitten, twice cry...This is why I proposed that @trusted functions should *still* be subject to @safe checks, only with the exception that they're allowed to have embedded @system blocks where such checks are relaxed (and these @system blocks are only allowed inside @trusted functions).
So the @trusted is a visual marker that it needs to be manually verified, but you still have the benefit of the compiler automatically verifying most of its body except for the (hopefully small) @system block where such checks are temporarily suspended.Hang on, have 3 of us all made the same proposal? (OK, I just reiterated what I understood to be Steven's proposal, but ...:-) I'll leave it to others to decide if we're great minds or fools or anything in between ;-)
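Walter's "safe API" criterion from the post above, sketched with hypothetical functions:

    // OK as @trusted: @safe callers cannot forge an int[] with a mismatched
    // .ptr/.length pair, so no argument can break this function's safety.
    @trusted void zeroFill(int[] a)
    {
        foreach (i; 0 .. a.length) a.ptr[i] = 0;
    }

    // Not OK as @trusted: a @safe caller can simply pass the wrong length.
    @trusted void zeroFill(int* p, size_t n)
    {
        foreach (i; 0 .. n) p[i] = 0;
    }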
Jan 16 2020
On Wednesday, 15 January 2020 at 16:54:58 UTC, Joseph Rushton Wakeling wrote:So here's the problem with this approach (which was mentioned by several people in the discussion): the actual safety of a function like this is usually down to the combination of the lines that (in your example) are both inside and outside the @trusted block.There are two scenarios of memory corruption in a function with a @trusted block: either the @trusted section is incorrect, or the @trusted section is given incorrect input. @safe code that has no dealings with the @trusted code cannot compromise safety. So while the culprit can be outside of the @trusted block, the block lowers the number of suspects. It gives you a good idea of the required safety checks.
Jan 16 2020
On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:Without @trusted the safety design will be much simpler to grasp. Currently there are three vague keywords, complex rules on which functions can call which, and best practices on which attribute you should choose.I also don't understand what the point of @trusted is. Should it only be used as a trampoline for @safe code into the unknown, or "some holds allowed", something in between @system and @safe? I find this highly confusing. It's like x86 protection rings (rings 0 - 3) where rings 1 - 2 are seldom used. If you think about it, @trusted is not necessary, as you say that @trusted "must be manually verified", aka unsafe. I think you should be allowed to call unsafe code from safe code, and it is up to the programmer to check the code that they call and make an assessment of its stability. A @trusted call can call unsafe code further down the line, so @trusted isn't that useful. I might have misunderstood the point of @trusted totally and you can try to make me understand what the point of it is. I didn't find the documentation very helpful.
Jan 15 2020
On Wednesday, 15 January 2020 at 19:27:42 UTC, IGotD- wrote:On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:No. @trusted is about saying "This function should be safe to use, but that safety has been designed by the developer rather than being automatically verifiable by the compiler." Contrast with @system, which can be applied to functions that are inherently not guaranteed to be safe -- e.g. where the safety (or not) depends on what input the user provides.Without @trusted the safety design will be much simpler to grasp. Currently there are three vague keywords, complex rules on which functions can call which, and best practices on which attribute you should choose.I also don't understand what the point of @trusted is. Should it only be used as a trampoline for @safe code into the unknown, or "some holds allowed", something in between @system and @safe? I find this highly confusing. It's like x86 protection rings (rings 0 - 3) where rings 1 - 2 are seldom used. If you think about it, @trusted is not necessary, as you say that @trusted "must be manually verified", aka unsafe.
Jan 15 2020
On Wednesday, 15 January 2020 at 19:49:53 UTC, Joseph Rushton Wakeling wrote:No. @trusted is about saying "This function should be safe to use, but that safety has been designed by the developer rather than being automatically verifiable by the compiler."Could make sense if you could only call @trusted from @trusted... If you can call @trusted from @safe then it makes no sense whatsoever. If you view attributes as something for external APIs then it makes no sense to distinguish between @trusted and @safe. It is just noise that conveys nothing of substance.
Jan 15 2020
On Wednesday, 15 January 2020 at 19:49:53 UTC, Joseph Rushton Wakeling wrote:No. @trusted is about saying "This function should be safe to use, but that safety has been designed by the developer rather than being automatically verifiable by the compiler." Contrast with @system, which can be applied to functions that are inherently not guaranteed to be safe -- e.g. where the safety (or not) depends on what input the user provides.This is why I think it should be removed. In my world there is no "trust the human". Also, @trusted is kind of backwards. It should be the caller that designates a call or operation as trusted, not the function that you call. Otherwise it is like asking a car salesman if I can trust him.
Jan 15 2020
On Wednesday, 15 January 2020 at 21:17:38 UTC, IGotD- wrote:This is why I think it should be removed. In my world there is no "trust the human".Presumably your programs are therefore self-crafted binary, since you couldn't possibly trust the humans who wrote the standard library to write valid code, or the compiler writers to translate it correctly into machine instructions? :-)Also, @trusted is kind of backwards. It should be the caller that designates a call or operation as trusted, not the function that you call. Otherwise it is like asking a car salesman if I can trust him.I think you're getting caught up on the choice of terminology. It's just a hierarchy of guarantees:

@safe -- this function should behave in a memory-safe way for all possible inputs you can provide, and this can be validated by the compiler

@trusted -- this function should behave in a memory-safe way for all possible inputs you can provide, but this has been validated by the developer, and cannot be automatically validated by the compiler

@system -- some of the possible inputs to this function will cause it to behave in a memory-unsafe way

You don't have to like the choice of keywords, but you should recognize that they describe valuable distinctions. There are some nice examples of how these distinctions are useful in the article linked to above.
Jan 15 2020
On Wednesday, 15 January 2020 at 23:01:57 UTC, Joseph Rushton Wakeling wrote:Presumably your programs are therefore self-crafted binary, since you couldn't possibly trust the humans who wrote the standard library to write valid code, or the compiler writers to translate it correctly into machine instructions? :-)@safe is a subset of D that guarantees no memory corruption. The only way to ensure this is if the compiler has all the source code (will object code also work?) and can check that all the calls are also @safe. If this condition is not met, it is not safe by definition. @trusted code has reduced memory guarantees and can also call unsafe code further along the line, and is therefore unsafe. @trusted is therefore impossible and the criteria cannot be met. It's just a badge meaning nothing.I think you're getting caught up on the choice of terminology. It's just a hierarchy of guarantees:No, I'm caught up in the semantics. I see a condition that cannot be met, and the attribute is therefore unnecessary. @trusted is an oxymoron.
Jan 15 2020
On Wednesday, 15 January 2020 at 23:24:38 UTC, IGotD- wrote:@safe is a subset of D that guarantees no memory corruption.Not quite. @safe is a subset of D that guarantees no memory corruption as long as people write only correct @trusted code (and there might be more restrictions). This weaker definition is being used because the strong one is considered impractical. There might be text in the documentation that suggests the stronger definition, but that is not going to take precedence. The weaker one is widely accepted as more useful. If anything, a conflict would surely be resolved by changing the documentation.
Jan 15 2020
On 16.01.20 00:24, IGotD- wrote:On Wednesday, 15 January 2020 at 23:01:57 UTC, Joseph Rushton Wakeling wrote:@safe does not mean "this unconditionally can not corrupt memory"; it says "if its @trusted dependencies are written right, @safe code can not introduce memory corruption". If you don't write @trusted code yourself and only use the standard library, @safe gives you the guarantee you mention per the language and standard library specification, but in practice there can be bugs in the language implementation or its dependencies. Assuming a correct compiler implementation of the safety checks, the locations where memory corruption can be caused will however all be within `@trusted` function calls. Similar concerns apply to other safe languages such as Java. (Why do you think users need to update their JVM regularly? It's not just about performance upgrades.) What do you think "security updates" for your OS do? How is @safe supposed to be _useful_ if it cannot assume the OS, system C libraries, and hardware behave as specified?Presumably your programs are therefore self-crafted binary, since you couldn't possibly trust the humans who wrote the standard library to write valid code, or the compiler writers to translate it correctly into machine instructions? :-)@safe is a subset of D that guarantees no memory corruption.The only way to ensure this is if the compiler has all the source code (will object code also work?) and can check that all the calls are also @safe. If this condition is not met, it is not safe by definition. @trusted code has reduced memory guarantees and can also call unsafe code further along the line, and is therefore unsafe. @trusted is therefore impossible and the criteria cannot be met. It's just a badge meaning nothing. ...Nonsense. It's the place where you inject assumptions into your conditional verifier. It reduces the amount of code you have to audit to convince yourself that a given dependency will not corrupt the internal state of your program.You are operating on bad assumptions or bad logic. @trusted is clearly useful, because it makes no sense to require every piece of @trusted code to be part of the compiler implementation.I think you're getting caught up on the choice of terminology. It's just a hierarchy of guarantees:No, I'm caught up in the semantics. I see a condition that cannot be met, and the attribute is therefore unnecessary.@trusted is an oxymoron.

@trusted void foo(int[] a, int b){
    if(0 <= b && b < a.length){
        a.ptr[b] = 2;
    }
}

@system void bar(int[] a, int b){
    a.ptr[b] = 2;
}

How do you not see the difference, that it is useful to annotate that difference, and that it is okay to call foo from @safe code, but not bar?
Jan 15 2020
On Wednesday, 15 January 2020 at 23:24:38 UTC, IGotD- wrote:On Wednesday, 15 January 2020 at 23:01:57 UTC, Joseph Rushton Wakeling wrote:No, that's where you're wrong. @trusted gives the same guarantees as @safe. The only difference is that @safe can be automatically checked and @trusted cannot. ANY memory violation in @trusted code is a BUG and the responsibility of the programmer. That's why @trusted is important and should only be applied to the parts that cannot be checked by the compiler. Why then limit @trusted blocks to functions and not to scopes? Imo the call interface has semantics that do not allow (or at least do not encourage) interacting directly with data of the caller's scope. All interactions have to be done via parameters whose scopes and lifetimes are known. This is not the case with simple scopes. So the difference between the two is the ABI, which adds some guarantees that a simple scope cannot (see Steven Schveighoffer's example).Presumably your programs are therefore self-crafted binary, since you couldn't possibly trust the humans who wrote the standard library to write valid code, or the compiler writers to translate it correctly into machine instructions? :-)@safe is a subset of D that guarantees no memory corruption. The only way to ensure this is if the compiler has all the source code (will object code also work?) and can check that all the calls are also @safe. If this condition is not met, it is not safe by definition. @trusted code has reduced memory guarantees and can also call unsafe code further along the line, and is therefore unsafe.
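A sketch (hypothetical functions) of the point about the call interface: a @trusted function can only interact with the caller through its parameters, so it can be reviewed in isolation against its signature, which a bare @trusted scope could not.

    @trusted void wipe(int[] a)
    {
        import core.stdc.string : memset;
        memset(a.ptr, 0, a.length * int.sizeof); // reviewable against `a` alone
    }

    @safe void caller()
    {
        auto a = new int[](8);
        int sensitive = 42;   // a @trusted *block* here could touch this, too
        wipe(a);              // the @trusted function cannot
    }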
Jan 16 2020
On Thursday, 16 January 2020 at 10:44:56 UTC, Patrick Schluter wrote:No, that's where you're wrong. @trusted gives the same guarantees as @safe. The only difference is that @safe can be automatically checked and @trusted cannot. ANY memory violation in @trusted code is a BUG and the responsibility of the programmer.Then we can remove @safe altogether and trust the programmer to only use the safe subset of D.That's why @trusted is important and should only be applied to the parts that cannot be checked by the compiler.Then @trusted is as good as putting a comment in the code.All interactions have to be done via parameters whose scopes and lifetimes are known. This is not the case with simple scopes. So the difference between the two is the ABI, which adds some guarantees that a simple scope cannot (see Steven Schveighoffer's example).Yes, so @safe code can only call functions with the @safe attribute but can have @system blocks in it. This is the same thing as @trusted. This is very similar to another language, and I think it at least got that part right.
Jan 16 2020
On Thursday, 16 January 2020 at 10:58:33 UTC, IGotD- wrote:On Thursday, 16 January 2020 at 10:44:56 UTC, Patrick Schluter wrote:The difference between "a small subset of the program must be manually verified in order to guarantee memory safety" and "the entire program must be manually verified in order to guarantee memory safety" is not insignificant.No, that's where you're wrong. @trusted gives the same guarantees as @safe. The only difference is that @safe can be automatically checked and @trusted cannot. ANY memory violation in @trusted code is a BUG and the responsibility of the programmer.Then we can remove @safe altogether and trust the programmer to only use the safe subset of D.
Jan 16 2020
On Thursday, 16 January 2020 at 14:46:15 UTC, Paul Backus wrote:On Thursday, 16 January 2020 at 10:58:33 UTC, IGotD- wrote:If memory safety is the default then you only need to mark functions and code blocks as unsafe. No need for @trusted or @safe.Then we can remove @safe altogether and trust the programmer to only use the safe subset of D.The difference between "a small subset of the program must be manually verified in order to guarantee memory safety" and "the entire program must be manually verified in order to guarantee memory safety" is not insignificant.
Jan 16 2020
On Thursday, 16 January 2020 at 15:02:10 UTC, Ola Fosheim Grøstad wrote:On Thursday, 16 January 2020 at 14:46:15 UTC, Paul Backus wrote:This is a non-sequitur. Did you mean to respond to a different message?On Thursday, 16 January 2020 at 10:58:33 UTC, IGotD- wrote:If memory safety is the default then you only need to mark functions and code blocks as unsafe. No need for @trusted or @safe.Then we can remove @safe altogether and trust the programmer to only use the safe subset of D.The difference between "a small subset of the program must be manually verified in order to guarantee memory safety" and "the entire program must be manually verified in order to guarantee memory safety" is not insignificant.
Jan 16 2020
On Thursday, 16 January 2020 at 15:12:07 UTC, Paul Backus wrote:On Thursday, 16 January 2020 at 15:02:10 UTC, Ola Fosheim Grøstad wrote:It wasn't really clear what "IGotD-" meant. I suspect he was being ironic, but if taken literally it would be fair to say that what D has is pretty much the same as this: assume all code is written as memory-safe code and add an escape that allows for writing unsafe constructs adorned with a comment that says the code is trusted... except you also need to mark functions as unsafe. Not really a big shift from Rust, except Rust provides dedicated typing constructs for doing unsafe operations like dealing with uninitialized variables.If memory safety is the default then you only need to mark functions and code blocks as unsafe. No need for @trusted or @safe.This is a non-sequitur. Did you mean to respond to a different message?
Jan 16 2020
On Thursday, 16 January 2020 at 15:29:58 UTC, Ola Fosheim Grøstad wrote:On Thursday, 16 January 2020 at 15:12:07 UTC, Paul Backus wrote: It wasn't really clear what "IGotD-" meant. I suspect he was being ironic, but if taken literally it would be fair to say that what D has is pretty much the same as this: assume all code is written as memory-safe code and add an escape that allows for writing unsafe constructs adorned with a comment that says the code is trusted... except you also need to mark functions as unsafe. Not really a big shift from Rust, except Rust provides dedicated typing constructs for doing unsafe operations like dealing with uninitialized variables.Yes, kind of. It's similar to what ag0aep6g described in alternative 3. On Thursday, 16 January 2020 at 16:56:17 UTC, ag0aep6g wrote:3) @safe code can contain @trusted parts (blocks/lambdas). Those @trusted parts may rely on the surrounding @safe code for safety. The @safe parts are effectively in a third state: mechanical checks are performed, but manual checks are still needed to verify the interaction with the @trusted parts.But the difference is that we don't need @trusted. We can let @system code rely on the safety of the surrounding @safe block. I can't see how this is really different from having that extra @trusted attribute. I might have forgotten to mention that we need this extra @system {} block.
Jan 16 2020
On Wed, Jan 15, 2020 at 11:01:57PM +0000, Joseph Rushton Wakeling via Digitalmars-d wrote:On Wednesday, 15 January 2020 at 21:17:38 UTC, IGotD- wrote:[...] And you also built your own hardware, because you cannot trust the hardware manufacturers that they didn't have a bug in silicon that causes memory corruption under rare circumstances, in spite of your correctly-coded instructions. Or they didn't put a trapdoor in silicon that allows unknown 3rd parties from some remote server to gain ring 0 access to your CPU. T -- My program has no bugs! Only undocumented features...This is why I think it should be removed. In my world there is no "trust the human".Presumably your programs are therefore self-crafted binary, since you couldn't possibly trust the humans who wrote the standard library to write valid code, or the compiler writers to translate it correctly into machine instructions? :-)
Jan 15 2020
On Wednesday, 15 January 2020 at 23:26:01 UTC, H. S. Teoh wrote:And you also built your own hardware, because you cannot trust the hardware manufacturers that they didn't have a bug in silicon that causes memory corruption under rare circumstances, in spite of your correctly-coded instructions. Or they didn't put a trapdoor in silicon that allows unknown 3rd parties from some remote server to gain ring 0 access to your CPU.To be honest, I just use butterflies :-)
Jan 15 2020
On Wednesday, 15 January 2020 at 19:27:42 UTC, IGotD- wrote:I might have misunderstood the point of @trusted totally and you can try to make me understand what the point of it is. I didn't find the documentation very helpful."How to Write @trusted Code in D", on the D blog, is a good introduction: https://dlang.org/blog/2016/09/28/how-to-write-trusted-code-in-d/
Jan 15 2020
On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:If you think of it, it makes no sense for @trusted to be a function attribute. It doesn’t describe its behavior but its implementation. If you call a function, it should not be your concern; @safe and @trusted are the same thing for you. But these two attributes result in two different signatures. If a function expects a @safe callback, you can’t pass a @trusted function to it without @safe wrapping. If some library makes a function that used to be @safe @trusted, that’s a breaking change. Etc, etc. Why should we introduce complexity out of nowhere?To me, this is the only part of the argument against @trusted that's really convincing. Having the @safe/@trusted distinction be part of the type system and the ABI is almost 100% downside, and changing that would eliminate some pain points at essentially no cost to the language. However, both of these issues can be fixed without a huge overhaul of @trusted. The type-system issue can be fixed by introducing implicit conversions between @safe and @trusted functions, and the ABI issue can be fixed by changing how @trusted affects name mangling. Once these issues are fixed, the only remaining benefits of the proposed overhaul are cosmetic, and IMO not worth the disruption they would cause.
Jan 15 2020
On Thu, Jan 16, 2020 at 01:07:21AM +0000, Paul Backus via Digitalmars-d wrote:On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:[...][...] Hogwash. Did you even test this before making statements like that?

void fun(void function() @safe dg) { }
void trustme() @trusted { }
void main() { fun(&trustme); }

Compiles fine. [...]If a function expects a @safe callback, you can’t pass a @trusted function to it without @safe wrapping.However, both of these issues can be fixed without a huge overhaul of @trusted. The type-system issue can be fixed by introducing implicit conversions between @safe and @trusted functions, and the ABI issue can be fixed by changing how @trusted affects name mangling.@trusted *already* implicitly converts to @safe. We're arguing over nothing here. T -- Two wrongs don't make a right; but three rights do make a left...
Jan 15 2020
On Thursday, 16 January 2020 at 01:32:23 UTC, H. S. Teoh wrote:Hogwash. Did you even test this before making statements like that?

void fun(void function() @safe dg) { }
void trustme() @trusted { }
void main() { fun(&trustme); }

Compiles fine.You’re right. I feel like a moron now. If I am not mistaken, this is the only example of covariant function attributes.
Jan 16 2020
On 16.01.20 02:07, Paul Backus wrote:On Wednesday, 15 January 2020 at 14:30:02 UTC, Ogi wrote:That's already how it works. The OP just didn't bother to check whether the claim is true:

void main(){
    void delegate() @safe dg0 = () @trusted {}; // ok
    void delegate() @trusted dg1 = () @safe {}; // ok
}

If you think of it, it makes no sense for @trusted to be a function attribute. It doesn’t describe its behavior but its implementation. If you call a function, it should not be your concern; @safe and @trusted are the same thing for you. But these two attributes result in two different signatures. If a function expects a @safe callback, you can’t pass a @trusted function to it without @safe wrapping. If some library makes a function that used to be @safe @trusted, that’s a breaking change. Etc, etc. Why should we introduce complexity out of nowhere?To me, this is the only part of the argument against @trusted that's really convincing. Having the @safe/@trusted distinction be part of the type system and the ABI is almost 100% downside, and changing that would eliminate some pain points at essentially no cost to the language. However, both of these issues can be fixed without a huge overhaul of @trusted. The type-system issue can be fixed by introducing implicit conversions between @safe and @trusted functions,and the ABI issue can be fixed by changing how @trusted affects name mangling. ...It's not just the ABI, it's also static introspection. The way to fix that is to treat `@trusted` functions as `@safe` functions from the outside. (E.g., `typeof(() @trusted{})` would be `void delegate() @safe`.)Once these issues are fixed, the only remaining benefits of the proposed overhaul are cosmetic, and IMO not worth the disruption they would cause.
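The introspection leak Timon mentions can be observed with today's compiler; a small sketch (hypothetical function names):

    void f() @safe {}
    void g() @trusted {}

    // The attribute is part of the type, so generic code can tell them apart,
    // even though @trusted implicitly converts to @safe at the call site:
    static assert(!is(typeof(&f) == typeof(&g)));
    pragma(msg, typeof(&f)); // void function() @safe
    pragma(msg, typeof(&g)); // void function() @trusted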
Jan 15 2020