digitalmars.D - Discussion Thread: DIP 1035--@system Variables--Final Review
- Mike Parker (21/21) Feb 19 2022 This is the discussion thread for the Final Review of DIP 1035,
- Mike Parker (3/8) Feb 19 2022 The feedback thread is located here:
- Dukc (44/66) Feb 20 2022 While this DIP will solve most of the issues of `@safe`
- Dennis (4/7) Feb 20 2022 Yes, there's an issue for that: [Issue 21981 - Manually calling a
- Dukc (22/25) Mar 04 2022 I was just checking what the language spec says about this, and
- Paul Backus (14/28) Mar 04 2022 First, this was not "overlooked"--it was added to the language
- Nick Treleaven (7/18) Feb 22 2022 Doesn't the same problem occur just with reassignment?:
- Dukc (20/39) Feb 23 2022 Reassignment can be forbidden at least. But still, my point does
- Paul Backus (18/28) Feb 21 2022 In the "Example: `int` as pointer" section, the following
- Paul Backus (4/7) Feb 21 2022 This should have been in the Feedback thread. I've reposted it
- Dennis (6/11) Feb 21 2022 It is memory-safety related, it allows you to create custom
- Paul Backus (14/25) Feb 21 2022 If the goal is being able to define custom pointer types, then
- Dennis (18/29) Feb 21 2022 A double `fclose` on a `FILE*` is basically a double free. I
- Paul Backus (15/23) Feb 21 2022 The compiler needs at least two notions: `hasPointers` and
- Stanislav Blinov (19/29) Feb 22 2022 A more pertinent example around file descriptors and memory
- Paul Backus (8/26) Feb 22 2022 If you attempt to fill in the missing part of your example, I
- Stanislav Blinov (11/39) Feb 22 2022 Yes, the implementation of `File` would need @trusted code. How
- Paul Backus (17/25) Feb 22 2022 If completing the example required *incorrect* use of `@trusted`
- Stanislav Blinov (20/47) Feb 22 2022 More the reason for me to not understand the objection then.
- Paul Backus (29/39) Feb 22 2022 The example, as written, does not link, because `File.write` is
- Stanislav Blinov (23/56) Feb 22 2022 ...that @trusted code is incorrect, at least on some platforms
- Dennis (8/14) Feb 23 2022 Why? I don't see it.
- Stanislav Blinov (4/20) Feb 23 2022 Because not all possible values of `data.length` are valid values
- Paul Backus (18/27) Feb 23 2022 POSIX says:
- Paul Backus (24/39) Feb 23 2022 Having spent some more time scratching my head over this, I now
- Paul Backus (5/9) Feb 23 2022 By the way, this issue has also come up in Rust:
- Stanislav Blinov (29/51) Feb 23 2022 Yes, or you may use e.g. `memfd_create`. And you can inherit such
- Paul Backus (22/25) Mar 04 2022 This is my reply to [this post][1] from the feedback thread:
- Dennis (11/19) Mar 04 2022 That's new to me, and the error makes no sense considering you
- Paul Backus (5/23) Mar 04 2022 Yes, I think this is a case of the compiler (and the spec)
This is the discussion thread for the Final Review of DIP 1035, "@system Variables":

https://github.com/dlang/DIPs/blob/4d73e17901a3a620bf59a2a5bfb8c433069c5f52/DIPs/DIP1035.md

The review period will end at 11:59 PM ET on March 5, or when I make a post declaring it complete. Discussion in this thread may continue beyond that point.

Here in the discussion thread, you are free to discuss anything and everything related to the DIP. Express your support or opposition, debate alternatives, argue the merits, etc.

However, if you have any specific feedback on how to improve the proposal itself, then please post it in the feedback thread. The feedback thread will be the source for the review summary I write at the end of this review round. I will post a link to that thread immediately following this post. Just be sure to read and understand the Reviewer Guidelines before posting there:

https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

And my blog post on the difference between the Discussion and Feedback threads:

https://dlang.org/blog/2020/01/26/dip-reviews-discussion-vs-feedback/

Please stay on topic here. I will delete posts that are completely off-topic.
Feb 19 2022
On Saturday, 19 February 2022 at 12:24:04 UTC, Mike Parker wrote:
> However, if you have any specific feedback on how to improve the proposal itself, then please post it in the feedback thread. The feedback thread will be the source for the review summary I write at the end of this review round. I will post a link to that thread immediately following this post.

The feedback thread is located here: https://forum.dlang.org/post/kwabfusqvczenjjacbmq@forum.dlang.org
Feb 19 2022
On Saturday, 19 February 2022 at 12:24:04 UTC, Mike Parker wrote:
> This is the discussion thread for the Final Review of DIP 1035, "@system Variables": https://github.com/dlang/DIPs/blob/4d73e17901a3a620bf59a2a5bfb8c433069c5f52/DIPs/DIP1035.md
>
> The review period will end at 11:59 PM ET on March 5, or when I make a post declaring it complete. Discussion in this thread may continue beyond that point.
>
> Here in the discussion thread, you are free to discuss anything and everything related to the DIP. Express your support or opposition, debate alternatives, argue the merits, etc.
>
> However, if you have any specific feedback on how to improve the proposal itself, then please post it in the feedback thread. The feedback thread will be the source for the review summary I write at the end of this review round. I will post a link to that thread immediately following this post. Just be sure to read and understand the Reviewer Guidelines before posting there: https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md
>
> And my blog post on the difference between the Discussion and Feedback threads: https://dlang.org/blog/2020/01/26/dip-reviews-discussion-vs-feedback/
>
> Please stay on topic here. I will delete posts that are completely off-topic.

While this DIP will solve most of the issues of `@safe` `__traits(getMember, xxx, yyy)` fetching private members, there is one thing that will remain a problem: destructors.

The issue is that we want a way to specify a destructor so that it could be called by `@safe` code at the end of the lifetime of an instance, but not early:

```D
auto shouldBeSafe()
{
    ObjectWithDestructor x;
}

auto shouldBeSystem()
{
    ObjectWithDestructor x;
    __traits(getMember, x, "__dtor")();
}
```

For that to work, we will either have to make privacy inviolable from `@safe` (the destructor that does not want to be called early would be private, and `object.destroy` would be changed to be `@system` for private destructors), or add some alternative way to define destructors.

Why do we want to do that at all? DIP1000. Most of the potential of DIP1000 is wasted if you cannot prevent destruction before the end of the scope:

```D
@safe void abuse()
{
    auto cont = SomeRaiiContainer([1,2,3]);
    scope ptr = &cont.front;
    destroy(cont);
    int oops = *ptr;
}
```

Yes, you can prevent that by also marking `abuse` `@live`. But I think we ought to do better than that. As I see it, `@live` is mainly meant as a partial memory safety mechanism for low-level code that cannot be `@safe`. It isn't intended that you start to mark your average `@safe` code as `@live`; that would be terribly onerous. So we don't want to settle for our libraries being memory safe in `@live`, if we can make them memory safe in normal `@safe`.

This is not intended as an argument against DIP1035; in fact, I'm still in favour of it. I just wanted to point out that it does not entirely solve the problem of `__traits(getMember, xxx, yyy)` bypassing privacy.
Feb 20 2022
On Sunday, 20 February 2022 at 15:16:30 UTC, Dukc wrote:
> The issue is that we want a way to specify a destructor so that it could be called by `@safe` code at end of the lifetime of an instance, but not early:

Yes, there's an issue for that: [Issue 21981 - Manually calling a __dtor can violate memory safety](https://issues.dlang.org/show_bug.cgi?id=21981)
Feb 20 2022
On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote at the feedback thread:
> On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:
>> Wouldn't putting the handle in union with `void[1]` work?
>
> No, `void[1]` is not a type with unsafe values.

I was just checking what the language spec says about this, and found an alternative we have all been overlooking. A type can be declared unsafe in the present language by giving it an invariant. Yes, I meant that contract programming invariant! The spec says that void-initializing a type with an invariant, or using a union that has a member with an invariant, is `@system`-only. Thus the invariant effectively declares the type unsafe. It also means that `void[1]` is an unsafe type, because it can contain a struct with an invariant.

This DIP still has the advantage that `@safe` functions in the same module with the invariant type do not need any special care. But still, that sounds like a pretty trivial gain to me - in the `IntSlice` example you can make the members read-only with a bit of union trickery if you want to, or define a string mixin that does the same automatically. I'm starting to think it's probably not worth it overall. Still, I'm only slightly against, because the proposed rules blend so nicely with the existing language, and it sure is sometimes convenient to have an alternative.
Mar 04 2022
On Friday, 4 March 2022 at 13:06:35 UTC, Dukc wrote:
> On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote at the feedback thread:
>> On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:
>>> Wouldn't putting the handle in union with `void[1]` work?
>>
>> No, `void[1]` is not a type with unsafe values.
>
> I was just checking what the language spec says about this, and found an alternative we have all been overlooking. A type can be declared unsafe in the present language by giving it an invariant. Yes I meant that contract programming invariant! The spec says that void-initializing a type with an invariant, or using a union that has a member with an invariant is `@system`-only. Thus the invariant effectively declares the type unsafe.

First, this was not "overlooked"--it was added to the language spec well after DIP 1035 was written and submitted. Dennis and I have been aware of this spec change since it was first proposed in [DMD PR 12326][1].

Second, this is not a complete alternative to DIP 1035, because it does not solve [the `__traits(getMember)` issue][2]. As long as `@safe` code is allowed to bypass encapsulation and access the fields of user-defined types directly, it is impossible for `@trusted` code to rely on the integrity of the data in those fields.

[1]: https://github.com/dlang/dmd/pull/12326#issuecomment-812575730
[2]: https://issues.dlang.org/show_bug.cgi?id=20941
Mar 04 2022
On Sunday, 20 February 2022 at 15:16:30 UTC, Dukc wrote:
> Why we want to do that at all? DIP1000. Most of the potential of DIP1000 is wasted if you cannot prevent destruction before end of the scope:
> ```D
> @safe void abuse()
> {
>     auto cont = SomeRaiiContainer([1,2,3]);
>     scope ptr = &cont.front;
>     destroy(cont);
>     int oops = *ptr;
> }
> ```

Doesn't the same problem occur just with reassignment?

```d
scope ptr = &cont.front;
cont = cont.init;
int oops = *ptr;
```
Feb 22 2022
On Tuesday, 22 February 2022 at 15:05:14 UTC, Nick Treleaven wrote:
> On Sunday, 20 February 2022 at 15:16:30 UTC, Dukc wrote:
>> Why we want to do that at all? DIP1000. Most of the potential of DIP1000 is wasted if you cannot prevent destruction before end of the scope:
>> ```D
>> @safe void abuse()
>> {
>>     auto cont = SomeRaiiContainer([1,2,3]);
>>     scope ptr = &cont.front;
>>     destroy(cont);
>>     int oops = *ptr;
>> }
>> ```
>
> Doesn't the same problem occur just with reassignment?:
> ```d
> scope ptr = &cont.front;
> cont = cont.init;
> int oops = *ptr;
> ```

Reassignment can be forbidden at least. But still, my point does not stand scrutiny, because:

1: No reassignment is clumsy.

2: Even accepting that, this would still do the same thing:

```D
auto pCont = new SomeRaiiContainer([1,2,3]);
scope ptr = &pCont.front();
pCont = null;
GC.collect;
int oops = *ptr;
```

So, scratch that. RAII or reference counted containers can only be `@safe` with a callback based usage, as suggested by Paul: https://github.com/dlang/phobos/pull/8368#issuecomment-1024917439 . And that idiom is `@safe` even with present destructors. Hopefully: we have discovered so many problems in our DIP1000-based memory safety schemes lately that there is no guarantee this isn't still just some oversight.
Feb 23 2022
On Saturday, 19 February 2022 at 12:24:04 UTC, Mike Parker wrote:
> This is the discussion thread for the Final Review of DIP 1035, "@system Variables": https://github.com/dlang/DIPs/blob/4d73e17901a3a620bf59a2a5bfb8c433069c5f52/DIPs/DIP1035.md

In the "Example: `int` as pointer" section, the following sentence appears:

> Because an `int` is a safe type, any `int` value can be created from `@safe` code, so any memory corruption that could follow from escaping a `scope int` could also result from creating the same `int` value without accessing the variable.

This sentence correctly recognizes that (absent incorrect `@trusted` code elsewhere) there is no memory-safety risk in allowing a value without indirections to escape from a function. It also completely undermines the example's motivation. If there is no benefit to memory-safety from applying `scope` checking to data without indirections, then there is no justification for enabling such checks in all `@safe` code, even if they may occasionally be "desirable" for other, non-memory-safety-related reasons.

Later, in the "Description" section, we find the following sentence:

> The `scope` keyword is not stripped away [from an aggregate with at least one `@system` field], even when the aggregate has no members that contain pointers.

The only justification for this appears to be the example discussed above. Both this sentence, and the example that attempts to support it, should be removed from the DIP.
Feb 21 2022
On Monday, 21 February 2022 at 19:49:58 UTC, Paul Backus wrote:
> In the "Example: `int` as pointer" section, the following sentence appears: [...]

This should have been in the Feedback thread. I've reposted it there now: https://forum.dlang.org/post/lmmpuyeurzavwqiylwlp@forum.dlang.org
Feb 21 2022
On Monday, 21 February 2022 at 19:49:58 UTC, Paul Backus wrote:
> If there is no benefit to memory-safety from applying `scope` checking to data without indirections, then there is no justification for enabling such checks in all `@safe` code, even if they may occasionally be "desirable" for other, non-memory-safety-related reasons.

It is memory-safety related: it allows you to create custom pointer types. A pointer is just an integer under the hood; the idea of indirections and lifetimes is just a compile-time idea around a `size_t` which indexes into memory. Why can't we do the same with a `ushort` which indexes into an array?
Feb 21 2022
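To make the "custom pointer type" idea concrete, here is a minimal sketch that is not taken from the DIP or from any post in this thread. The names `poolSize`, `Pool`, `Handle`, `make` and `get` are hypothetical. It assumes a compiler that accepts `@system` on a field; as far as I know, enforcement of the DIP's rules is enabled with `-preview=systemVariables`.

```d
// Hypothetical sketch: a 16-bit "pointer" that indexes into a fixed pool.
enum size_t poolSize = 64;

struct Pool
{
    int[poolSize] slots;
}

struct Handle
{
    // Assumption: under DIP 1035 this field cannot be read or written
    // directly from @safe code, so the bound established in `make`
    // cannot be broken behind the back of `get`.
    @system private ushort index;

    // Establishes the invariant `index < poolSize`.
    static Handle make(ushort i) @trusted
    {
        if (i >= poolSize)
            assert(0, "index out of range");
        Handle h;
        h.index = i;
        return h;
    }

    // Unchecked access; it is only sound because of the invariant above.
    ref int get(return ref Pool pool) const @trusted
    {
        return pool.slots.ptr[index];
    }
}
```

The sketch only shows why such an integer carries an invariant that `@trusted` code depends on. Whether it should also receive `scope` checking is exactly the point under debate here.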
On Monday, 21 February 2022 at 20:30:07 UTC, Dennis wrote:
> On Monday, 21 February 2022 at 19:49:58 UTC, Paul Backus wrote:
>> If there is no benefit to memory-safety from applying `scope` checking to data without indirections, then there is no justification for enabling such checks in all `@safe` code, even if they may occasionally be "desirable" for other, non-memory-safety-related reasons.
>
> It is memory-safety related, it allows you to create custom pointer types. A pointer is just an integer under the hood, the idea of indirections and lifetimes is just a compile time idea around a `size_t` which indexes into memory. Why can't we do the same with a `ushort` which indexes into an array?

If the goal is being able to define custom pointer types, then the DIP should use that as an example instead of talking about file descriptors, and it should explain *exactly* which part of the example depends on this feature for memory safety (as the other examples do).

I still don't think it's a compelling use-case, though. [`TailUnqual`][1] does something very similar, using the `union` workaround, and it would not benefit from having access to `scope`-checked integers because (a) it stores a `size_t`, so eliminating the `union` wouldn't save any space; and (b) it needs the `union` for correct GC scanning regardless.

[1]: https://gist.github.com/pbackus/1638523a5b6ea3ce2c0a73358cff4dc6
Feb 21 2022
On Monday, 21 February 2022 at 21:50:31 UTC, Paul Backus wrote:
> If the goal is being able to define custom pointer types, then the DIP should use that as an example instead of talking about file descriptors, and it should explain *exactly* which part of the example depends on this feature for memory safety (as the other examples do).

A double `fclose` on a `FILE*` is basically a double free. I thought the same would apply to raw file descriptors, but I just read that a double `close` simply results in an `EBADF` error, so maybe it's not a good example.

> I still don't think it's a compelling use-case, though. [`TailUnqual`][1] does something very similar, using the `union` workaround, and it would not benefit from having access to `scope`-checked integers because (a) it stores a `size_t`, so eliminating the `union` wouldn't save any space; and (b) it needs the `union` for correct GC scanning regardless.

Yes, TailUnqual doesn't need `scope`-checked integers, but that doesn't mean other code doesn't need it. I added the rule for two reasons:

- The compiler currently has a notion of a type that `hasPointers`. The extra complexity of adding a notion `hasSystemVariables` was daunting, but then I thought we could just make them the same. I think that would not only simplify the implementation, but also the feature in general. It makes it easy to draw a parallel to a pointer and a `@system size_t`.
- Some people asked for the feature (see links in the rationale section)

I can improve the DIP text, but I'm not yet convinced the rule should be scrapped.
Feb 21 2022
On Monday, 21 February 2022 at 22:56:30 UTC, Dennis wrote:
> - The compiler currently has a notion of a type that `hasPointers`. The extra complexity of adding a notion `hasSystemVariables` was daunting, but then I thought we could just make them the same. I think that would not only simplify the implementation, but also the feature in general. It makes it easy to draw a parallel to a pointer and a `@system size_t`.

The compiler needs at least two notions: `hasPointers` and `hasUnsafeValues`. `hasUnsafeValues` is a superset of `hasPointers`, and also includes aggregate types with `@system` fields and [`bool`][1]. Since I assume you do not intend to expand `scope` checking to `bool`, folding everything into a single concept will not be possible.

> - Some people asked for the feature (see links in the rationale section)

I've read those links. In one of them, the problem was solved using existing DIP 1000 features. The other asks for "unique references (and borrow) to plain ints". It is not clear to me that `scope` checking for integers would solve either of these problems. If you believe it would, then it should not be difficult for you to write up an example demonstrating how.

[1]: https://issues.dlang.org/show_bug.cgi?id=20148
Feb 21 2022
On Monday, 21 February 2022 at 22:56:30 UTC, Dennis wrote:
> On Monday, 21 February 2022 at 21:50:31 UTC, Paul Backus wrote:
>> If the goal is being able to define custom pointer types, then the DIP should use that as an example instead of talking about file descriptors, and it should explain *exactly* which part of the example depends on this feature for memory safety (as the other examples do).
>
> A double `fclose` on a `FILE*` is basically a double free. I thought the same would apply to raw file descriptors, but I just read that a double `close` simply results in an `EBADF` error, so maybe it's not a good example.

A more pertinent example around file descriptors and memory safety is void-initialization:

```d
struct File
{
    void write(const(void)[] data) @safe;
    // ...
    private int fd;
}

void main() @safe
{
    File f = void; // this compiles in current language, because `File` doesn't have pointers
    f.write("hello"); // may corrupt memory if (implementation-defined) value of `f.fd` happens to correspond to an existing mapping
}
```
Feb 22 2022
On Tuesday, 22 February 2022 at 08:47:55 UTC, Stanislav Blinov wrote:
> A more pertinent example around file descriptors and memory safety is void-initialization:
> ```d
> struct File
> {
>     void write(const(void)[] data) @safe;
>     // ...
>     private int fd;
> }
>
> void main() @safe
> {
>     File f = void; // this compiles in current language, because `File` doesn't have pointers
>     f.write("hello"); // may corrupt memory if (implementation-defined) value of `f.fd` happens to correspond to an existing mapping
> }
> ```

If you attempt to fill in the missing part of your example, I think you will find that you cannot actually demonstrate memory corruption resulting from `void`-initialization of a file descriptor without the use of `@trusted` code (e.g., to cast the `void*` returned from `mmap` to some other type of pointer whose target type has unsafe values).
Feb 22 2022
On Tuesday, 22 February 2022 at 13:13:43 UTC, Paul Backus wrote:
> On Tuesday, 22 February 2022 at 08:47:55 UTC, Stanislav Blinov wrote:
>> A more pertinent example around file descriptors and memory safety is void-initialization:
>> ```d
>> struct File
>> {
>>     void write(const(void)[] data) @safe;
>>     // ...
>>     private int fd;
>> }
>>
>> void main() @safe
>> {
>>     File f = void; // this compiles in current language, because `File` doesn't have pointers
>>     f.write("hello"); // may corrupt memory if (implementation-defined) value of `f.fd` happens to correspond to an existing mapping
>> }
>> ```
>
> If you attempt to fill in the missing part of your example, I think you will find that you cannot actually demonstrate memory corruption resulting from `void`-initialization of a file descriptor without the use of `@trusted` code (e.g., to cast the `void*` returned from `mmap` to some other type of pointer whose target type has unsafe values).

Yes, the implementation of `File` would need @trusted code. How would that invalidate the example? Your process can inherit fds from its parent. Or you may have pipes, shared memfds, sockets. Plenty of ways of obtaining a valid fd without requiring any casts OR having to deal with pointers. The example shows a way to (unintentionally) alias an existing fd (obtained through whichever means) and write to it, in @safe context.

An fd is an unsafe quantity encoded in a safe type. We need a way to express that in the language.
Feb 22 2022
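For readers who have not seen the DIP's syntax, the following is a minimal sketch of how "an unsafe quantity encoded in a safe type" might be expressed with a `@system` field. It is an illustration under assumptions, not the DIP's own example: the names `adopt` and `isOpen` are hypothetical, and the rules described in the comments are my reading of DIP 1035 (enforced, as far as I know, with `-preview=systemVariables`).

```d
struct File
{
    // Assumption: with DIP 1035 in effect, @safe code can neither access
    // this field directly nor void-initialize a `File`, because the
    // struct now contains a @system variable.
    @system private int fd = -1;

    // Hypothetical constructor-like helper. It is @system because the
    // caller must vouch that `validatedFd` refers to a file that is
    // safe to write to.
    static File adopt(int validatedFd) @system
    {
        File f;
        f.fd = validatedFd;
        return f;
    }

    // @trusted code may read the field, and it exposes only a safe fact.
    bool isOpen() const @trusted
    {
        return fd >= 0;
    }
}
```

Under these assumptions, the `File f = void;` line in the example above would have to move into `@system` or `@trusted` code.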
On Tuesday, 22 February 2022 at 15:55:16 UTC, Stanislav Blinov wrote:
> Yes, the implementation of `File` would need @trusted code. How would that invalidate the example?

If completing the example required *incorrect* use of `@trusted` (i.e., on a function that does not have a [safe interface][1]), it would not be valid. Using `@trusted` in the implementation of `File.write` to call POSIX `write` would not be a problem.

> Your process can inherit fds from its parent. Or you may have pipes, shared memfds, sockets. Plenty of ways of obtaining a valid fd without requiring any casts OR having to deal with pointers. The example shows a way to (unintentionally) alias an existing fd (obtained through whichever means) and write to it, in @safe context.

The example shows a write to an fd, and then hand-waves about how this could maybe, hypothetically, somehow, cause memory corruption. Aliasing an fd does not, by itself, constitute memory corruption.

Remember, if you can cause memory corruption in `@safe` code, that means you can also cause undefined behavior in `@safe` code. So if you cannot write a program that uses this alleged loophole to cause UB, then what you have found is not actually memory corruption. (Although it may still be "data corruption".)

[1]: https://dlang.org/spec/function.html#safe-interfaces
Feb 22 2022
On Tuesday, 22 February 2022 at 16:16:30 UTC, Paul Backus wrote:
> On Tuesday, 22 February 2022 at 15:55:16 UTC, Stanislav Blinov wrote:
>> Yes, the implementation of `File` would need @trusted code. How would that invalidate the example?
>
> If completing the example required *incorrect* use of `@trusted` (i.e., on a function that does not have a [safe interface][1]), it would not be valid.

It doesn't. The program, as presented, is enough.

> Using `@trusted` in the implementation of `File.write` to call POSIX `write` would not be a problem.

All the more reason for me to not understand the objection then.

>> Your process can inherit fds from its parent. Or you may have pipes, shared memfds, sockets. Plenty of ways of obtaining a valid fd without requiring any casts OR having to deal with pointers. The example shows a way to (unintentionally) alias an existing fd (obtained through whichever means) and write to it, in @safe context.
>
> The example shows a write to an fd, and then hand-waves about how this could maybe, hypothetically, somehow, cause memory corruption.

I don't follow. It seems pretty clear to me how the example is expressed. Where's the handwaving? Compare to this:

```d
void main() @safe
{
    char[5]* ptr = void;
    *ptr = "hello";
}
```

The above won't compile, since void-initialization of pointers is not allowed in @safe code, with good reason. Is there anything to handwave here? The fd example, however, *will* compile in current language, despite doing the same thing.

> Aliasing an fd does not, by itself, constitute memory corruption.

I did not say it did. Writing to an fd initialized with "implementation-defined" (read: garbage) value may - that's what I said, and that's what the example shows.

> Remember, if you can cause memory corruption in `@safe` code, that means you can also cause undefined behavior in `@safe` code. So if you cannot write a program that uses this alleged loophole to cause UB, then what you have found is not actually memory corruption. (Although it may still be "data corruption".)

You *can* write such a program with fds. The example is one such program. Do you have any suggestions on how to make it clearer?

[1]: https://dlang.org/spec/function.html#safe-interfaces
Feb 22 2022
On Tuesday, 22 February 2022 at 17:29:46 UTC, Stanislav Blinov wrote:
> On Tuesday, 22 February 2022 at 16:16:30 UTC, Paul Backus wrote:
>> Remember, if you can cause memory corruption in `@safe` code, that means you can also cause undefined behavior in `@safe` code. So if you cannot write a program that uses this alleged loophole to cause UB, then what you have found is not actually memory corruption. (Although it may still be "data corruption".)
>
> You *can* write such a program with fds. The example is one such program. Do you have any suggestions on how to make it clearer?

The example, as written, does not link, because `File.write` is missing a function body. If I fill in the obvious implementation, I get the following program:

```d
struct File
{
    void write(const(void)[] data) @safe
    {
        import core.sys.posix.unistd: write;
        () @trusted { write(fd, data.ptr, data.length); }();
    }
    private int fd;
}

void main() @safe
{
    File f = void;
    f.write("hello");
}
```

The above program does not have undefined behavior. The call to `write` will either fail with `EBADF`, or attempt to write the string `"hello"` to some unspecified open file.

If you believe there is some way to get the above program to produce undefined behavior, or to complete your original example in such a way that it produces undefined behavior without the use of incorrect `@trusted` code, I'm afraid you will have to spell it out for me.
Feb 22 2022
On Tuesday, 22 February 2022 at 18:33:58 UTC, Paul Backus wrote:
> On Tuesday, 22 February 2022 at 17:29:46 UTC, Stanislav Blinov wrote:
>> You *can* write such a program with fds. The example is one such program. Do you have any suggestions on how to make it clearer?
>
> The example, as written, does not link, because `File.write` is missing a function body.

If you're going to go there, then...

> If I fill in the obvious implementation, I get the following program:
> ```d
> struct File
> {
>     void write(const(void)[] data) @safe
>     {
>         import core.sys.posix.unistd: write;
>         () @trusted { write(fd, data.ptr, data.length); }();
>     }
>     private int fd;
> }
> ```

...that @trusted code is incorrect, at least on some platforms (yes, I can nitpick too). Or we can simply agree that `File.write` is implemented correctly in terms of `write` (which is the important part) and leave it at that, as the rest is irrelevant to the example. I am seriously perplexed at this kind of nitpicking, not to mention the implied expectation of having to spell out full-blown libraries in an example code.

> ```d
> void main() @safe
> {
>     File f = void;
>     f.write("hello");
> }
> ```
> The above program does not have undefined behavior. The call to `write` will either fail with `EBADF`, or attempt to write the string `"hello"` to some unspecified open file.

I am reasonably certain that the results may be much more varied than that, including some that aren't specified: https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

> If you believe there is some way to get the above program to produce undefined behavior, or to complete your original example in such a way that it produces undefined behavior without the use of incorrect `@trusted` code, I'm afraid you will have to spell it out for me.

Not exhaustive: It may corrupt a given GC's implementation's heap, which means what occurs after the } is anyone's guess. It may mutate data that's supposed to be immutable (i.e. in a parent process, though you could argue that might not be relevant to the DIP). It may block indefinitely, or crash, or complete with no effect. If you could demonstrate that it cannot possibly exhibit at least the above, I'll happily accept being mistaken.

...but seriously. What is it with all the condescending tone on the forums lately?
Feb 22 2022
On Wednesday, 23 February 2022 at 00:14:13 UTC, Stanislav Blinov wrote:
> If you're going to go there, then... ... ...that @trusted code is incorrect, at least on some platforms (yes, I can nitpick too).

Why? I don't see it.

> ...but seriously. What is it with all the condescending tone on the forums lately?

I think Paul makes a valid point and uses an appropriate tone. The DIP should not be hand-wavy about how `scope` checking would help memory safety when using file descriptors. I didn't go into much detail there because I didn't think it would be a contested addition.
Feb 23 2022
On Wednesday, 23 February 2022 at 16:14:51 UTC, Dennis wrote:
> On Wednesday, 23 February 2022 at 00:14:13 UTC, Stanislav Blinov wrote:
>> If you're going to go there, then... ... ...that @trusted code is incorrect, at least on some platforms (yes, I can nitpick too).
>
> Why? I don't see it.

Because not all possible values of `data.length` are valid values for `write`'s third argument.

>> ...but seriously. What is it with all the condescending tone on the forums lately?
>
> I think Paul makes a valid point and uses an appropriate tone. The DIP should not be hand-wavy about how `scope` checking would help memory safety when using file descriptors. I didn't go into much detail there because I didn't think it would be a contested addition.

I see the problem now, thanks.
Feb 23 2022
On Wednesday, 23 February 2022 at 22:01:55 UTC, Stanislav Blinov wrote:
> Because not all possible values of `data.length` are valid values for `write`'s third argument.

POSIX says:

> Before any action described below is taken, and if nbyte is zero and the file is a regular file, the write() function may detect and return errors as described below. In the absence of errors, or if error detection is not performed, the write() function shall return zero and have no other results. If nbyte is zero and the file is not a regular file, the results are unspecified.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

I was unable to find a definition in the standard itself of exactly what "unspecified" means in this context, but I think we can assume that it does not mean the same thing as "undefined", because the POSIX standard uses the actual word "undefined" elsewhere (e.g., in the description of [`pthread_mutex_destroy`][1]).

If we assume that it means the same thing as ["unspecified behavior" in C][2], then it means that there are multiple possible behaviors, and the standard does not require an implementation to commit to any particular one in any given situation.

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_destroy.html
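As a hedged illustration of the length concern discussed above: POSIX leaves the result unspecified when `nbyte` is zero and the file is not a regular file, and makes it implementation-defined when `nbyte` exceeds `SSIZE_MAX`. The sketch below avoids both cases before calling `write`. The name `guardedWrite` is hypothetical (it is not part of druntime or the DIP), and which lengths Stanislav had in mind is an assumption. The helper says nothing about whether `fd` refers to a file that is safe to write to, so it is deliberately left `@system`.

```d
import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : write;

// Hypothetical helper, not part of druntime or the DIP.
// Left @system: it only guards the length argument, not the descriptor.
ssize_t guardedWrite(int fd, const(void)[] data) @system
{
    // A zero-length write to a non-regular file has unspecified results,
    // so report "0 bytes written" without calling write() at all.
    if (data.length == 0)
        return 0;

    // A length above SSIZE_MAX has implementation-defined results,
    // so request at most ssize_t.max bytes. Callers that need the whole
    // buffer written are expected to loop on short writes anyway.
    enum size_t maxLen = ssize_t.max;
    const size_t n = data.length > maxLen ? maxLen : data.length;

    return write(fd, data.ptr, n);
}
```

A caller would treat the return value exactly like that of `write`: `-1` signals an error reported through `errno`, and any other value is the number of bytes actually written.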
Feb 23 2022
On Wednesday, 23 February 2022 at 00:14:13 UTC, Stanislav Blinov wrote:
> On Tuesday, 22 February 2022 at 18:33:58 UTC, Paul Backus wrote:
>> If you believe there is some way to get the above program to produce undefined behavior, or to complete your original example in such a way that it produces undefined behavior without the use of incorrect `@trusted` code, I'm afraid you will have to spell it out for me.
>
> Not exhaustive:
>
> - It may corrupt a given GC's implementation's heap, which means what occurs after the } is anyone's guess.
> - It may mutate data that's supposed to be immutable (i.e. in a parent process, though you could argue that might not be relevant to the DIP).
> - It may block indefinitely, or crash, or complete with no effect.
>
> If you could demonstrate that it cannot possibly exhibit at least the above, I'll happily accept being mistaken.

Having spent some more time scratching my head over this, I now realize what I was missing: it is indeed possible to open a file descriptor that can corrupt *arbitrary* memory in a process's address space, using something like `/proc/self/mem`. Maybe I'm an idiot for missing this the first time around; I can only ask that you take pity on me. :)

This means that calling `write` on a fd is only memory safe if you have previously verified that the file the fd refers to is "well behaved" (i.e., satisfies a particular invariant). It follows that the fd itself must be stored in a `@system` variable in order to ensure that the invariant is maintained in `@safe` code.

I don't think adding `scope` checking to the fd makes any difference here, though. *Reading* from `/proc/self/mem` in `@safe` code is perfectly fine, even if you are reading from uninitialized or deallocated memory. The reason such reads are UB when done through pointers is that *dereferencing an invalid pointer* is UB, not because reading from the memory is UB.

(I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that `write` is *always* `@system`, or allow a permanent loophole in `@safe`. But that's a separate issue.)
Feb 23 2022
On Wednesday, 23 February 2022 at 18:16:17 UTC, Paul Backus wrote:
> (I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that `write` is *always* `@system`, or allow a permanent loophole in `@safe`. But that's a separate issue.)

By the way, this issue has also come up in Rust:

- https://github.com/rust-lang/rust/issues/32670
- https://blog.yossarian.net/2021/03/16/totally_safe_transmute-line-by-line
Feb 23 2022
On Wednesday, 23 February 2022 at 18:16:17 UTC, Paul Backus wrote:
> Having spent some more time scratching my head over this, I now realize what I was missing: it is indeed possible to open a file descriptor that can corrupt *arbitrary* memory in a process's address space, using something like `/proc/self/mem`.

Yes, or you may use e.g. `memfd_create`. And you can inherit such an fd from a parent process. Or receive a shared memory descriptor from another process.

> Maybe I'm an idiot for missing this the first time around; I can only ask that you take pity on me. :)

Never! How dare you make me question myself!!! :)

> This means that calling `write` on a fd is only memory safe if you have previously verified that the file the fd refers to is "well behaved" (i.e., satisfies a particular invariant). It follows that the fd itself must be stored in a `@system` variable in order to ensure that the invariant is maintained in `@safe` code.

Yup.

> I don't think adding `scope` checking to the fd makes any difference here, though. *Reading* from `/proc/self/mem` in `@safe` code is perfectly fine, even if you are reading from uninitialized or deallocated memory. The reason such reads are UB when done through pointers is that *dereferencing an invalid pointer* is UB, not because reading from the memory is UB.

Well, results of `read`ing from some types of fds are also not specified. So, if I'm not mistaken, performing such a read *and* then using the resulting "data" would be undefined behavior (provided the program even gets there).

As for `scope` checks themselves - as Dennis mentions, double `close` looks dissimilar to double free. Yet it *is* subject to a superset of that - use after free, as are `read` and `write`. You may well safely "dangle" an fd and not invoke UB by calling those functions with it, but only up to the point when the program opens another descriptor. Calling `close` on a dangled fd, which would then succeed, would be a mere bug and not invoke UB, but attempting to `write` or `read`+use may. So I do think that fds could still be a good example material for the DIP.

> (I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that `write` is *always* `@system`, or allow a permanent loophole in `@safe`. But that's a separate issue.)

I don't think that should be necessary in concrete cases, as the onus of ensuring the implicit invariant would lie on the implementation of, in this case, `File` - e.g. making it non-copyable (or reference-counted), ensuring that the constructor opens an appropriate kind of file, etc. etc. That way the only way to make it unsafe would be to corrupt the given instance of `File` itself, which means there's a memory safety issue somewhere else in the program (for example, that same void-initialization).
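The encapsulation described in the last paragraph above might look roughly like the following. This is only a sketch under stated assumptions, not the DIP's or Phobos's `File`: the constructor, `isOpen`, and the renamed import `posixWrite` are hypothetical, and the sketch does not attempt to verify that a file is "well behaved". Instead the constructor is `@system`, so the caller vouches for the path, and the `@trusted` methods rely on that invariant.

```d
import core.sys.posix.fcntl : open, O_CREAT, O_TRUNC, O_WRONLY;
import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : close, posixWrite = write;
import std.string : toStringz;

struct File
{
    // Under DIP 1035 this field would also be marked @system, so that
    // @safe code could neither overwrite nor void-initialize it.
    private int fd = -1;

    // Non-copyable: each descriptor has exactly one owner, so it cannot
    // be closed twice or used after another copy has closed it.
    @disable this(this);

    // @system on purpose: the caller vouches that `path` names a
    // "well behaved" file (for example, not /proc/self/mem).
    this(string path) @system
    {
        fd = open(path.toStringz, O_WRONLY | O_CREAT | O_TRUNC, 0x1A4); // mode 0644
    }

    ~this() @trusted
    {
        if (fd >= 0)
            close(fd);
    }

    bool isOpen() const @safe
    {
        return fd >= 0;
    }

    // @trusted relies on the invariant established by the constructor.
    ssize_t write(const(void)[] data) @trusted
    {
        if (fd < 0)
            return -1; // never opened, or open() failed
        if (data.length == 0)
            return 0;  // avoid the unspecified zero-length case
        enum size_t maxLen = ssize_t.max;
        const n = data.length > maxLen ? maxLen : data.length;
        return posixWrite(fd, data.ptr, n);
    }
}
```

Without the `@system` marking on `fd`, `@safe` code in the same module can still void-initialize or overwrite a `File`, which is the gap the DIP is meant to close.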
Feb 23 2022
This is my reply to [this post][1] from the feedback thread:

[1]: https://forum.dlang.org/post/qbbatlviwhjsnytbypfw@forum.dlang.org

On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote:
> On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:
>> Wouldn't putting the handle in union with `void[1]` work?
>
> No, `void[1]` is not a type with unsafe values.

`void[1]` is considered by the compiler to potentially contain pointer data, in accordance with this section of the language spec:

https://dlang.org/spec/arrays.html#void_arrays

Note in particular the paragraph that begins, "Void arrays can also be static".

As a result, the compiler will not allow you to void-initialize a `void[1]` in `@safe` code:

```d
void main() @safe
{
    void[1] a = void; // error
}
```

So, the workaround suggested by Dukc would indeed work.

(By the way, I know this because the first thing I did after I read his post in the feedback thread was to actually write out a complete example using the `void[1]` workaround and check to see if it worked.)
Mar 04 2022
On Friday, 4 March 2022 at 13:09:42 UTC, Paul Backus wrote:
> As a result, the compiler will not allow you to void-initialize a `void[1]` in `@safe` code:
>
> ```d
> void main() @safe
> {
>     void[1] a = void; // error
> }
> ```

That's new to me, and the error makes no sense considering you can (implicitly) convert any array to a `void[]` even in `@safe` code, so you can still do this:

```D
void main() @safe
{
    ubyte[1] x = void;
    void[1] y = x;
}
```

> So, the workaround suggested by Dukc would indeed work.

It's an interesting alternative if we can nail it down.
Mar 04 2022
On Friday, 4 March 2022 at 14:00:31 UTC, Dennis wrote:
> On Friday, 4 March 2022 at 13:09:42 UTC, Paul Backus wrote:
>> As a result, the compiler will not allow you to void-initialize a `void[1]` in `@safe` code:
>>
>> ```d
>> void main() @safe
>> {
>>     void[1] a = void; // error
>> }
>> ```
>
> That's new to me, and the error makes no sense considering you can (implicitly) convert any array to a `void[]` even in `@safe` code, so you can still do this:
>
> ```D
> void main() @safe
> {
>     ubyte[1] x = void;
>     void[1] y = x;
> }
> ```

Yes, I think this is a case of the compiler (and the spec) applying rules more broadly than is strictly necessary, since `void`-initializing a `void[1]` cannot *actually* lead to UB in `@safe` code on its own.
Mar 04 2022