digitalmars.D - ref is unsafe
- Jonathan M Davis (66/66) Dec 30 2012 After some recent discussions relating to auto ref and const ref, I have...
- Daniel Kozak (4/34) Dec 30 2012 IMHO, try to return ref to local variable should be error, and
- Jonathan M Davis (21/23) Dec 30 2012 You can disallow that in the easy case of
- monarch_dodra (15/43) Dec 30 2012 Wouldn't it be enough to disallow functions that both take and
- Jonathan M Davis (10/26) Dec 30 2012 The question is whether that would be too limiting. Certainly, it risks ...
- Nick Treleaven (16/30) Dec 30 2012 I think the compiler needs to be able to mark foo as a function that
- Rob T (16/53) Dec 30 2012 That seems like a promising approach. If the compiler can track
- jerro (6/12) Dec 30 2012 If functions's source isn't available, the compiler can't know
- Jonathan M Davis (48/67) Dec 30 2012 No. There's no guarantee that the compiler has access to the function's ...
- jerro (13/17) Dec 30 2012 What about this:
- Jonathan M Davis (5/26) Dec 30 2012 Good point. Member variables of parameters also cause problems. So, it v...
- Rob T (14/43) Dec 30 2012 This may be far fetched, but consider this,
- Jonathan M Davis (3/17) Dec 30 2012 Addresses would only be known at runtime, and it's far too late at that ...
- Michel Fortin (27/30) Dec 30 2012 Note that the above definition includes every struct member function
- Nick Treleaven (17/75) Dec 31 2012 I was aware attributes would be needed for .di files. I suppose
- Nick Treleaven (5/27) Dec 31 2012 Actually overloading 'in ref' with a different meaning on return type
- Zach the Mystic (15/44) Jan 04 2013 I realized just now that it's also applicable to member functions:
- Jonathan M Davis (11/22) Jan 04 2013 Yes. That's a function which takes a ref and returns by ref just like I ...
- Zach the Mystic (3/11) Jan 04 2013 Well, I've been working on just that. I'll have it for you
- comco (16/46) Jan 06 2013 Why this won't work?
- comco (29/40) Jan 09 2013 I think this is the most reasonable thing to do and I can argue
- Dmitry Olshansky (6/17) Dec 30 2012 And another one:
- Carl Sturtivant (28/28) Dec 30 2012 /*
- Jonathan M Davis (19/20) Dec 30 2012 Delegates use closures. The stack of the calling function is copied onto...
- Zach the Mystic (7/12) Dec 30 2012 If this is the way to go, maybe "@saferef" could double as both
- Mehrdad (4/4) Dec 31 2012 I don't understand why there is a discussion on trying to
- Rob T (12/17) Dec 31 2012 Yes, but that will render a whole lot of perfectly safe code,
- Mehrdad (2/3) Dec 31 2012 Halting problem?
- Jonathan M Davis (11/16) Jan 01 2013 The problem is ranges. auto ref returns are _very_ common with the front...
- Maxim Fomin (19/35) Jan 02 2013 This is not a surprise, I remember Andrei was talking about it
- Jonathan M Davis (14/21) Jan 02 2013 It's a hole in @safe. It must be fixed. That's not even vaguely up for
- Maxim Fomin (39/72) Jan 02 2013 I argue that @safity can be easily broken (not only by example I
- Jonathan M Davis (10/15) Jan 02 2013 Then we're going to have to disagree, and I believe that Walter and Andr...
- Thiez (7/22) Jan 02 2013 Perhaps it is worth looking at Rust for this problem? They have
- H. S. Teoh (8/23) Jan 02 2013 [...]
- David Nadlinger (6/11) Jan 02 2013 @trusted shouldn't be a part of the function signature anyway,
- H. S. Teoh (7/15) Jan 02 2013 [...]
- Rob T (5/23) Jan 02 2013 Reading through the discussion on the subject, I have to agree
- Andrei Alexandrescu (3/17) Jan 02 2013 That is incorrect.
- Artur Skawina (6/9) Jan 03 2013 extern(C) does not imply extern.
- Jonathan M Davis (3/6) Jan 02 2013 Agreed. And @trusted is seriously questionable.
- H. S. Teoh (8/14) Jan 02 2013 [...]
- Jason House (16/44) Jan 02 2013 The best solution I can think of is for the @safe code to require
- Timon Gehr (2/19) Jan 02 2013 Those two cases are pretty much the same.
- Jason House (4/29) Jan 03 2013 If what I suggest is done, they must be differentiated. If you
- Timon Gehr (3/33) Jan 03 2013 Obviously _both_ examples result in memory corruption. i is not a ref
- Sönke Ludwig (23/34) Jan 03 2013 +1
- Zach the Mystic (122/131) Jan 03 2013 I've thought about how I think the attributes should work if D is
- David Nadlinger (10/37) Jan 03 2013 I must admit that I haven't read the rest of the thread yet, but
- Rob T (18/26) Jan 03 2013 The problem with that idea, is that a ref return with no
- David Nadlinger (10/21) Jan 03 2013 I am not quite sure what you are trying to say. If the compiler
- Rob T (4/12) Jan 03 2013 See my post directly above this one that has an example.
- deadalnix (4/24) Jan 03 2013 You can't return a scope ref in @safe code, so that is not an
- Jonathan M Davis (13/16) Jan 03 2013 The source code is always available when compiling the function itself. ...
- Rob T (36/36) Jan 03 2013 OK, I understand what you mean by "scope" and how that can be
- deadalnix (3/10) Jan 03 2013 This seems to me like the sane thing to do.
- Tommi (62/70) Jan 10 2013 If you disallow passing local variables as non-scope ref
- Zach the Mystic (75/75) Jan 05 2013 I've here formalized how I think the constraints on a non-scope
- Zach the Mystic (3/3) Jan 08 2013 I felt confident enough about my proposal to submit it as
After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref. Take this code for example:

    ref int foo(ref int i) { return i; }

    ref int bar() { int i = 7; return foo(i); }

    ref int baz(int i) { return foo(i); }

    void main()
    {
        auto a = bar();
        auto b = baz(5);
    }

Both bar and baz return a ref to a local variable which no longer exists. They refer to garbage. It's exactly the same problem as in

    int* foo(int* i) { return i; }

    int* bar() { int i = 7; return foo(&i); }

    void main() { auto a = bar(); }

However, that code is considered @system, because it's taking the address of a local variable, whereas the code using ref is considered to be @safe. But it's just as unsafe as taking the address of the local variable is. Really, it's the same thing but with differing syntax.

The question is what to do about this. The most straightforward thing is to just make ref parameters @system, but that would be horrible. With that sort of restriction, a _lot_ of code suddenly won't be able to be @safe, and it affects const ref and auto ref and anything else along those lines, so whatever solution we come up with for having auto ref with non-templated functions will almost certainly have the problem, and once that works, I'd expect it to be used pretty much by default, making most D code @system, which would be a _big_ problem.

Another possibility is to make ref imply scope, but given the transitive nature of that, that could be _really_ annoying. Maybe it's the correct solution though.

Another possibility would be to make it so that functions with a ref parameter are only @system if they also return by ref (the lack of ability to have ref variables outside of parameters and return types saves us from such a ref being squirreled away somewhere). I don't know how good or bad an idea that is. It certainly reduces how much code using ref would have to be @system, but it might not be sufficient given how much stuff like std.algorithm uses auto ref for its return types.

And maybe another solution which I can't think of at the moment would be better. But my point is that we currently have a _major_ hole in SafeD thanks to the combination of ref parameters and ref return types, and we need to find a solution.

- Jonathan M Davis

Related: http://d.puremagic.com/issues/show_bug.cgi?id=8838
Dec 30 2012
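The opening post notes that a ref cannot be stored in a variable, whereas a pointer can be squirreled away. A minimal sketch of that difference follows; the helper name `pass` is made up for illustration (it has the same shape as `foo` in the post), and the code is assumed to be compiled as ordinary, unannotated (@system) D because it takes an address.

```d
// `pass` is a hypothetical stand-in with the same shape as `foo` in the post.
ref int pass(ref int i) { return i; }

void main()
{
    int x = 7;

    // `auto` does not declare a ref variable: `a` is an independent copy.
    auto a = pass(x);
    a = 1;
    assert(x == 7);

    // A pointer, by contrast, can be stored and used later,
    // which is why taking the address is the operation treated as unsafe.
    int* p = &pass(x);
    *p = 2;
    assert(x == 2);
}
```

Here everything stays valid because `x` outlives both `a` and `p`; the hole described in the post only opens when the result outlives the variable it refers to.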
On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref. Take this code for example:
>
>     ref int foo(ref int i) { return i; }
>
>     ref int bar() { int i = 7; return foo(i); }
>
>     ref int baz(int i) { return foo(i); }
>
>     void main()
>     {
>         auto a = bar();
>         auto b = baz(5);
>     }
>
> Both bar and baz return a ref to a local variable which no longer exists. They refer to garbage. It's exactly the same problem as in

IMHO, trying to return a ref to a local variable should be an error, and such code shouldn't compile.
Dec 30 2012
On Sunday, December 30, 2012 10:04:01 Daniel Kozak wrote:
> IMHO, trying to return a ref to a local variable should be an error, and such code shouldn't compile.

You can disallow that in the easy case of

    ref int boo(int i) { return i; }

and in fact, that's already illegal. The problem is the wrapper function. You'd also have to disallow functions from returning ref parameters by ref. Otherwise,

    ref int foo(ref int i) { return i; }
    ref int baz(int i) { return foo(i); }

continues to cause problems. And making it illegal to return ref parameters by ref would be a serious problem for wrapper ranges, because they do that sort of thing all the time with front. So, that's not really going to work.

- Jonathan M Davis
Dec 30 2012
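To make the remark about wrapper ranges concrete, here is a minimal sketch of a range wrapper whose front forwards the wrapped range's front by ref. The `Wrapper` type is hypothetical and written only for illustration; it is not taken from Phobos, although Phobos wrappers follow a similar pattern.

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical minimal wrapper range, for illustration only.
struct Wrapper(R)
{
    R inner;

    @property bool empty() { return inner.empty; }

    // Forwards the wrapped range's front; this is by ref whenever
    // the wrapped front is an lvalue, as it is for arrays.
    @property auto ref front() { return inner.front; }

    void popFront() { inner.popFront(); }
}

void main()
{
    int[] data = [1, 2, 3];
    auto w = Wrapper!(int[])(data);

    w.front = 10;            // writes through to data[0]
    assert(data[0] == 10);

    w.popFront();
    assert(w.front == 2);
}
```

A rule that forbade returning by ref anything reached through a ref parameter (including the implicit `this`) would reject this kind of forwarding, which is the concern raised in the post.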
On Sunday, 30 December 2012 at 09:18:30 UTC, Jonathan M Davis wrote:
> On Sunday, December 30, 2012 10:04:01 Daniel Kozak wrote:
>> IMHO, trying to return a ref to a local variable should be an error, and such code shouldn't compile.
>
> You can disallow that in the easy case of
>
>     ref int boo(int i) { return i; }
>
> and in fact, that's already illegal. The problem is the wrapper function. You'd also have to disallow functions from returning ref parameters by ref. Otherwise,
>
>     ref int foo(ref int i) { return i; }
>     ref int baz(int i) { return foo(i); }
>
> continues to cause problems. And making it illegal to return ref parameters by ref would be a serious problem for wrapper ranges, because they do that sort of thing all the time with front. So, that's not really going to work.
>
> - Jonathan M Davis

Wouldn't it be enough to disallow functions that both take and return by ref? There would still be some limitations, but at least:

    //----
    @property ref T front(T)(T[] a);
    //----

would still be safe. It seems the only code that is unsafe always boils down to taking an argument by ref and returning it by ref... At best, we'd (try to) only make that illegal (when we can), or (seeing things the other (safer) way around) only allow returning by ref if the compiler is able to prove it is not also an input by ref?
Dec 30 2012
On Sunday, December 30, 2012 11:04:35 monarch_dodra wrote:
> Wouldn't it be enough to disallow functions that both take and return by ref? There would still be some limitations, but at least:
>
>     //----
>     @property ref T front(T)(T[] a);
>     //----
>
> would still be safe. It seems the only code that is unsafe always boils down to taking an argument by ref and returning it by ref... At best, we'd (try to) only make that illegal (when we can), or (seeing things the other (safer) way around) only allow returning by ref if the compiler is able to prove it is not also an input by ref?

The question is whether that would be too limiting. Certainly, it risks being a big problem for wrapper functions, since they may _need_ to take an argument by ref and return it by ref (or more probably, auto ref for both, but that amounts to the same thing as far as this issue goes). We could go with making such functions @system rather than @safe, but I don't know how problematic that would be. We may have no choice though, since unless you can prove that the ref being passed in will stay valid as long as the ref being passed out is used, you can't prove that that code is safe.

- Jonathan M Davis
Dec 30 2012
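The "auto ref for both" wrapper mentioned in the post above can be sketched as follows. The function name `passThrough` is hypothetical; the sketch only illustrates that such a template takes and returns by ref when handed an lvalue, and falls back to by-value for an rvalue.

```d
// Hypothetical pass-through wrapper with auto ref on both sides.
auto ref passThrough(T)(auto ref T value)
{
    return value;
}

void main()
{
    int x = 1;

    // Lvalue argument: accepted by ref and returned by ref,
    // so the assignment goes through to x.
    passThrough(x) = 5;
    assert(x == 5);

    // Rvalue argument: accepted by value and returned by value.
    assert(passThrough(3) == 3);
}
```

The lvalue case is exactly the "takes by ref, returns by ref" shape under discussion, so any rule that makes that shape @system would apply to wrappers like this one as well.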
On 30/12/2012 09:17, Jonathan M Davis wrote:
> The problem is the wrapper function. You'd also have to disallow functions from returning ref parameters by ref. Otherwise,
>
>     ref int foo(ref int i) { return i; }
>     ref int baz(int i) { return foo(i); }
>
> continues to cause problems. And making it illegal to return ref parameters by ref would be a serious problem for wrapper ranges, because they do that sort of thing all the time with front. So, that's not really going to work.

I think the compiler needs to be able to mark foo as a function that returns its input reference. Then, any arguments to foo that are locals should cause an error at the call site (e.g. in baz). So legal calls to foo can always be safe. To extend the above code:

    ref int quux(ref int i) { return foo(i); }

Here the compiler already knows that foo returns its input reference. So it checks whether foo is being passed a local - no; but it also has to check if foo is passed any ref parameters of quux, which it is. The compiler now has to mark quux as a function that returns its input reference.

Works?
Dec 30 2012
On Sunday, 30 December 2012 at 17:32:41 UTC, Nick Treleaven wrote:
> On 30/12/2012 09:17, Jonathan M Davis wrote:
>> The problem is the wrapper function. You'd also have to disallow functions from returning ref parameters by ref. Otherwise,
>>
>>     ref int foo(ref int i) { return i; }
>>     ref int baz(int i) { return foo(i); }
>>
>> continues to cause problems. And making it illegal to return ref parameters by ref would be a serious problem for wrapper ranges, because they do that sort of thing all the time with front. So, that's not really going to work.
>
> I think the compiler needs to be able to mark foo as a function that returns its input reference. Then, any arguments to foo that are locals should cause an error at the call site (e.g. in baz). So legal calls to foo can always be safe. To extend the above code:
>
>     ref int quux(ref int i) { return foo(i); }
>
> Here the compiler already knows that foo returns its input reference. So it checks whether foo is being passed a local - no; but it also has to check if foo is passed any ref parameters of quux, which it is. The compiler now has to mark quux as a function that returns its input reference.
>
> Works?

That seems like a promising approach. If the compiler can track where the local is being passed by ref and returned by ref, then it should be able to determine if the ref to the local is leaving the scope it was originally conceived in and issue a compiler error if it is. The idea of "tagging" the local so that it can be tracked may work well.

You may still be able to hide it from the compiler using pointers, but at that point you're not safe anymore, and that should be fine, because all we want to do is allow returns by ref to be proven safe or not.

In general terms, no reference to a local should ever leave its scope, so ultimately the compiler *has* to track the scope of any local, no matter if it is being passed by ref or not, so really this is a solution that has to be implemented one way or the other.

--rt
Dec 30 2012
> Here the compiler already knows that foo returns its input reference. So it checks whether foo is being passed a local - no; but it also has to check if foo is passed any ref parameters of quux, which it is. The compiler now has to mark quux as a function that returns its input reference.
>
> Works?

If a function's source isn't available, the compiler can't know what the function does. This could only work if this property of a function (whether it returns a reference to its ref parameter) were part of its type. The compiler could still infer it for function literals and templates, similar to how pure works now.
Dec 30 2012
On Sunday, December 30, 2012 17:32:40 Nick Treleaven wrote:
> I think the compiler needs to be able to mark foo as a function that returns its input reference. Then, any arguments to foo that are locals should cause an error at the call site (e.g. in baz). So legal calls to foo can always be safe. To extend the above code:
>
>     ref int quux(ref int i) { return foo(i); }
>
> Here the compiler already knows that foo returns its input reference. So it checks whether foo is being passed a local - no; but it also has to check if foo is passed any ref parameters of quux, which it is. The compiler now has to mark quux as a function that returns its input reference.
>
> Works?

No. There's no guarantee that the compiler has access to the function's body, and the function being called could be compiled after the function which calls it. There's a reason that attribute inference only works with templated functions. In every other case, the programmer has to mark it. We're _not_ going to get any kind of inference without templates. D's compilation model doesn't allow it.

The closest that we could get to what you suggest would be to add a new attribute similar to nothrow but which guarantees that the function does not return a ref to a parameter. So, you'd have to mark your functions that way (e.g. with @norefparamreturn). Maybe the compiler could infer it for templated ones, but this attribute would basically have to work like other inferred attributes and be marked manually in all other cases. Certainly, you can't have the compiler figuring it out for you in general, because D's compilation model allows the function being called to be compiled separately from (and potentially after) the function calling it.

And when you think about what this attribute would be needed for, it gets a bit bizarre to have it. The _only_ time that it's applicable is when a function takes an argument by ref and returns the same type by ref. In all other cases, the compiler can guarantee it just based on the type system.

I suppose that we could have an attribute that indicated that a function _did_ return a ref to one of its params and then have the compiler give an error if it were missing, which means that the foo function

    ref int foo(ref int i) { return i; }

would end up with an error for not having the attribute, whereas a function like baz

    ref int baz(int i) { return foo(i); }

would not end up with the error unless foo had the attribute on it. But that's very different from any attribute that we currently have. It would be like having a throw attribute instead of a nothrow attribute. I suppose that it is a possible solution though. I could also see an argument that the attribute should go on the parameter rather than the function, in which case you could have more fine-grained control over it, but it does complicate things further.

Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system. It's just way simpler. It's also more in line with how pointers to locals are handled, though because ref is far more restrictive, it should be possible to come up with a different solution (like the attribute), whereas the fact that you can squirrel away pointers to things makes it rather complicated (if not impossible) to have a solution other than simply making taking the address of a local variable @system. You can't squirrel away ref.

- Jonathan M Davis
Dec 30 2012
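The point in the post above, that attributes are inferred for templated functions but must be written by hand elsewhere, can be checked with a short sketch. The function names `twice` and `twiceInt` are made up for this illustration.

```d
// Template function: its attributes (pure, nothrow, @safe) are inferred from the body.
T twice(T)(T x) { return x + x; }

// Ordinary function with an explicit return type: nothing is inferred,
// so callers must treat it as impure, possibly throwing and @system.
int twiceInt(int x) { return x + x; }

void main()
{
    // Accepted: the instantiation twice!int was inferred to be @safe pure nothrow.
    static assert(__traits(compiles, () @safe pure nothrow { return twice(2); }));

    // Rejected: twiceInt carries no attributes, even though its body is identical.
    static assert(!__traits(compiles, () @safe pure nothrow { return twiceInt(2); }));

    assert(twice(2) == 4 && twiceInt(2) == 4);
}
```

Any attribute describing "returns a ref to its parameter" would presumably follow the same split: inferred for `twice`-style templates, spelled out manually on `twiceInt`-style functions.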
> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.

What about this:

    struct Foo { int a; }

    ref int bar(ref Foo foo) { return foo.a; }

The parameter type and the return type here are different, but bar still returns a reference to its parameter. I guess you should consider all functions that return ref and have at least one ref parameter @system (unless they are marked @trusted).
Dec 30 2012
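A short check of the example in the post above: although the parameter type and the return type differ, the returned ref aliases storage inside the argument. This sketch assumes ordinary, unannotated (@system) code, since it compares addresses.

```d
struct Foo { int a; }

// Same shape as the example in the post: parameter and return types differ.
ref int bar(ref Foo foo) { return foo.a; }

void main()
{
    Foo f;

    bar(f) = 5;                // writes into f through the returned ref
    assert(f.a == 5);

    // The result refers to memory inside the argument itself.
    assert(&bar(f) is &f.a);
}
```

So a rule keyed on "parameter type equals return type" would miss this case.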
On Sunday, December 30, 2012 23:18:43 jerro wrote:
>> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.
>
> What about this:
>
>     struct Foo { int a; }
>
>     ref int bar(ref Foo foo) { return foo.a; }
>
> The parameter type and the return type here are different, but bar still returns a reference to its parameter. I guess you should consider all functions that return ref and have at least one ref parameter @system (unless they are marked @trusted).

Good point. Member variables of parameters also cause problems. So, it very quickly devolves to any function which accepts a user-defined type by ref and returns anything by ref having to be @system, which is far from pleasant.

- Jonathan M Davis
Dec 30 2012
On Sunday, 30 December 2012 at 22:30:24 UTC, Jonathan M Davis wrote:
> On Sunday, December 30, 2012 23:18:43 jerro wrote:
>>> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.
>>
>> What about this:
>>
>>     struct Foo { int a; }
>>
>>     ref int bar(ref Foo foo) { return foo.a; }
>>
>> The parameter type and the return type here are different, but bar still returns a reference to its parameter. I guess you should consider all functions that return ref and have at least one ref parameter @system (unless they are marked @trusted).
>
> Good point. Member variables of parameters also cause problems. So, it very quickly devolves to any function which accepts a user-defined type by ref and returns anything by ref having to be @system, which is far from pleasant.
>
> - Jonathan M Davis

This may be far-fetched, but consider this: if a function returns a ref that is the same as what was passed in by ref, then the passed and returned addresses would match, which means that it still may be possible for the compiler to detect the situation. This is more complicated in the case where a user-defined struct was passed by ref and the ref return is a member of that struct, but it still may be possible to detect it.

If address matching is possible (or is that determined only at link time?), then it may be possible to detect a situation that should be illegal and flag it as a compiler error.

--rt
Dec 30 2012
On Monday, December 31, 2012 02:37:56 Rob T wrote:
> This may be far-fetched, but consider this: if a function returns a ref that is the same as what was passed in by ref, then the passed and returned addresses would match, which means that it still may be possible for the compiler to detect the situation. This is more complicated in the case where a user-defined struct was passed by ref and the ref return is a member of that struct, but it still may be possible to detect it.
>
> If address matching is possible (or is that determined only at link time?), then it may be possible to detect a situation that should be illegal and flag it as a compiler error.

Addresses would only be known at runtime, and it's far too late at that point.

- Jonathan M Davis
Dec 30 2012
On 2012-12-30 22:29:33 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:
> Good point. Member variables of parameters also cause problems. So, it very quickly devolves to any function which accepts a user-defined type by ref and returns anything by ref having to be @system, which is far from pleasant.

Note that the above definition includes every struct member function that returns a ref, because of the implicit this parameter.

Also, it's not just functions returning by ref. It could be a function returning a delegate too, if the delegate happens to make use of the reference:

    void delegate() foo(ref int a)
    {
        return { writeln(a); };
    }

    void delegate() bar()
    {
        int a;
        return foo(a); // leaking reference to a beyond bar's scope
    }

And similar to passing a value by ref: you can pass a slice of a static array, then return the slice:

    int[] foo(int[] a) { return a; }

    int[] bar()
    {
        int[2] a;
        return foo(a[]);
    }

Three variations on the same theme.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Dec 30 2012
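The remark above about the implicit this parameter can be illustrated by writing the same accessor twice, once as a member function and once as a free function with the object as an explicit ref parameter. The names `Counter`, `value` and `valueOf` are hypothetical, and the sketch assumes ordinary, unannotated (@system) code.

```d
struct Counter
{
    int n;

    // Returns a ref into the object reached through the implicit `this`.
    ref int value() { return n; }
}

// Free-function spelling of the same accessor, with the object
// passed as an explicit ref parameter instead of `this`.
ref int valueOf(ref Counter c) { return c.n; }

void main()
{
    Counter c;

    c.value() = 3;
    assert(c.n == 3);

    // Both forms hand back a reference to the same field.
    assert(&c.value() is &valueOf(c));
}
```

Since the two forms return the same reference, a rule that covers `valueOf` would have to cover `value` too.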
On 30/12/2012 22:01, Jonathan M Davis wrote:
> On Sunday, December 30, 2012 17:32:40 Nick Treleaven wrote:
>> I think the compiler needs to be able to mark foo as a function that returns its input reference. Then, any arguments to foo that are locals should cause an error at the call site (e.g. in baz). So legal calls to foo can always be safe. To extend the above code:
>>
>>     ref int quux(ref int i) { return foo(i); }
>>
>> Here the compiler already knows that foo returns its input reference. So it checks whether foo is being passed a local - no; but it also has to check if foo is passed any ref parameters of quux, which it is. The compiler now has to mark quux as a function that returns its input reference.
>>
>> Works?
>
> No. There's no guarantee that the compiler has access to the function's body, and the function being called could be compiled after the function which calls it. There's a reason that attribute inference only works with templated functions. In every other case, the programmer has to mark it. We're _not_ going to get any kind of inference without templates. D's compilation model doesn't allow it.

I was aware attributes would be needed for .di files. I suppose attribute inference for non-template functions is not doable.

> The closest that we could get to what you suggest would be to add a new attribute similar to nothrow but which guarantees that the function does not return a ref to a parameter. So, you'd have to mark your functions that way (e.g. with @norefparamreturn). Maybe the compiler could infer it for templated ones, but this attribute would basically have to work like other inferred attributes and be marked manually in all other cases. Certainly, you can't have the compiler figuring it out for you in general, because D's compilation model allows the function being called to be compiled separately from (and potentially after) the function calling it.

As you suggested below that, I would have the attribute mean the opposite, @refparamreturn. Functions that need it but don't have it can be detected by recompiling them. The syntax could be 'in ref':

    in ref int quux(ref int i);

> And when you think about what this attribute would be needed for, it gets a bit bizarre to have it. The _only_ time that it's applicable is when a function takes an argument by ref and returns the same type by ref. In all other cases, the compiler can guarantee it just based on the type system.

As jerro and Michel Fortin pointed out, 'in ref' would be needed for returning input struct members, and for capturing inputs with delegates and slices. So the feature might pull its weight. I think detecting all these situations would be essentially the same as the checks needed for a scope parameter, so if/when scope parameters get implemented, 'in ref' might not be hard to bolt on.

If a simpler solution doesn't disallow any sensible uses of ref returns, that would be preferable. But I don't think we've found it yet.

> I suppose that we could have an attribute that indicated that a function _did_ return a ref to one of its params and then have the compiler give an error if it were missing, which means that the foo function
>
>     ref int foo(ref int i) { return i; }
>
> would end up with an error for not having the attribute, whereas a function like baz
>
>     ref int baz(int i) { return foo(i); }
>
> would not end up with the error unless foo had the attribute on it. But that's very different from any attribute that we currently have. It would be like having a throw attribute instead of a nothrow attribute. I suppose that it is a possible solution though. I could also see an argument that the attribute should go on the parameter rather than the function, in which case you could have more fine-grained control over it, but it does complicate things further.

I suppose a parameter attribute might be useful to allow passing locals to other ref parameters which aren't returned, and as documentation.
Dec 31 2012
On 31/12/2012 14:44, Nick Treleaven wrote:
> On 30/12/2012 22:01, Jonathan M Davis wrote:
>> The closest that we could get to what you suggest would be to add a new attribute similar to nothrow but which guarantees that the function does not return a ref to a parameter. So, you'd have to mark your functions that way (e.g. with @norefparamreturn). Maybe the compiler could infer it for templated ones, but this attribute would basically have to work like other inferred attributes and be marked manually in all other cases. Certainly, you can't have the compiler figuring it out for you in general, because D's compilation model allows the function being called to be compiled separately from (and potentially after) the function calling it.
>
> As you suggested below that, I would have the attribute mean the opposite, @refparamreturn. Functions that need it but don't have it can be detected by recompiling them. The syntax could be 'in ref':
>
>     in ref int quux(ref int i);

Actually overloading 'in ref' with a different meaning on return type from parameter (const scope) is probably a bad idea. Instead:

    ref int quux(@escape ref int i);

[...]
> I suppose a parameter attribute might be useful to allow passing locals to other ref parameters which aren't returned, and as documentation.
Dec 31 2012
On Sunday, 30 December 2012 at 22:02:16 UTC, Jonathan M Davis wrote:
> The closest that we could get to what you suggest would be to add a new attribute similar to nothrow but which guarantees that the function does not return a ref to a parameter. So, you'd have to mark your functions that way (e.g. with @norefparamreturn). Maybe the compiler could infer it for templated ones, but this attribute would basically have to work like other inferred attributes and be marked manually in all other cases. Certainly, you can't have the compiler figuring it out for you in general, because D's compilation model allows the function being called to be compiled separately from (and potentially after) the function calling it.
>
> And when you think about what this attribute would be needed for, it gets a bit bizarre to have it. The _only_ time that it's applicable is when a function takes an argument by ref and returns the same type by ref. In all other cases, the compiler can guarantee it just based on the type system.

I realized just now that it's also applicable to member functions:

    struct F
    {
        int _i;
        ref int ser() { return _i; } // Needs to be marked as well
    }

A struct's fields are implicit parameters in anything it returns.

> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.

Structs mess that up as well:

    struct S { int i; }

    ref int d(ref S s) { return s.i; }
Jan 04 2013
On Friday, January 04, 2013 17:26:59 Zach the Mystic wrote:
>> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.
>
> Structs mess that up as well:
>
>     struct S { int i; }
>
>     ref int d(ref S s) { return s.i; }

Yes. That's a function which takes a ref and returns by ref just like I said. It's just that in this case, the ref returned isn't the full object that was passed by ref but just a portion of it. What that means is that you can't assume that the ref being returned is safe just because the type of the parameter and the return type aren't the same. But it doesn't change the statement that a function which takes a parameter by ref and returns by ref can't be considered @safe without additional constraints of some kind. It just shows why you don't have an easy way out to make many of them @safe based on the differing types involved.

- Jonathan M Davis
Jan 04 2013
On Friday, 4 January 2013 at 20:20:08 UTC, Jonathan M Davis wrote:
> ... But it doesn't change the statement that a function which takes a parameter by ref and returns by ref can't be considered @safe without additional constraints of some kind. It just shows why you don't have an easy way out to make many of them @safe based on the differing types involved.

Well, I've been working on just that. I'll have it for you tomorrow, I think.
Jan 04 2013
On Friday, 4 January 2013 at 20:20:08 UTC, Jonathan M Davis wrote:
> On Friday, January 04, 2013 17:26:59 Zach the Mystic wrote:
>>> Honestly though, I'm inclined to argue that functions which return by ref and have a ref parameter of that same type just be considered @system.
>>
>> Structs mess that up as well:
>>
>>     struct S { int i; }
>>
>>     ref int d(ref S s) { return s.i; }
>
> Yes. That's a function which takes a ref and returns by ref just like I said. It's just that in this case, the ref returned isn't the full object that was passed by ref but just a portion of it. What that means is that you can't assume that the ref being returned is safe just because the type of the parameter and the return type aren't the same. But it doesn't change the statement that a function which takes a parameter by ref and returns by ref can't be considered @safe without additional constraints of some kind. It just shows why you don't have an easy way out to make many of them @safe based on the differing types involved.
>
> - Jonathan M Davis

Why won't this work?

1. If the function code is available at compile time, we can check for escaping locals.
2. Otherwise, we want to statically say to the compiler that the returned ref is safe exactly in those lines in which the particular function argument, from which the ref has been extracted, has not yet gone out of scope. So the returned ref's safety guarantee tracks the argument's safety guarantee.

Something like this:

    ref int f(@infer_safe_from ref int a);

With such an annotation on an argument, the compiler will be able to infer the safety of a function when used in cases in which a is in scope whenever the returned ref is referenced. Now, if f is used in this manner:

{
Jan 06 2013
On Sunday, 30 December 2012 at 22:02:16 UTC, Jonathan M Davis wrote:
> But that's very different from any attribute that we currently have. It would be like having a throw attribute instead of a nothrow attribute. I suppose that it is a possible solution though. I could also see an argument that the attribute should go on the parameter rather than the function, in which case you could have more fine-grained control over it, but it does complicate things further.

I think this is the most reasonable thing to do, and I would argue that the complications are not a valid argument against it. I came up with roughly the same idea some days ago. Comparing this with nothrow is a nice point, but I don't see it as an argument against it. This is the most logical thing to do, and it solves problems.

So, the general notion that we want to (statically) express is that the ref result of a function __can__ depend on one ref function parameter (or more, depending on a condition for example). Now, if the result is used while all the annotated arguments are still in scope, that usage can be considered safe. So a function declaration will (conceptually) look like this:

ref int min(@result_tracks_scope_of ref int a, @result_tracks_scope_of ref int b)
{
    return a < b ? a : b;
}

Now the interface by itself provides enough information to infer when the usage is safe and when it is not. This will work equally well when we refer to members of the ref parameters:

ref int a(@result_tracks_scope_of A a)
{
    return a.la.bala;
}

The crucial thing is that the compiler can simply infer these attributes when the implementation is available, so we won't have to issue errors when the user has not added them (if the code is available).
Jan 09 2013
12/30/2012 12:37 PM, Jonathan M Davis wrote:
> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref. Take this code for example:

[snip]

> And maybe another solution which I can't think of at the moment would be better. But my point is that we currently have a _major_ hole in SafeD thanks to the combination of ref parameters and ref return types, and we need to find a solution.
>
> - Jonathan M Davis
>
> Related: http://d.puremagic.com/issues/show_bug.cgi?id=8838

And another one: http://d.puremagic.com/issues/show_bug.cgi?id=9195

-- Dmitry Olshansky
Dec 30 2012
/*
The implementation of delegates has solved an analogous problem. e.g.
*/
import std.stdio;

auto getfun(int x)
{
    int y = x*x;
    int ysquared() { return y*y; }
    return &ysquared;
}

void main()
{
    auto f1 = getfun(2);
    auto f2 = getfun(3);
    writeln( f1() );
    writeln( f2() );
}
/*
The variable y no longer exists local to getfun, but its existence is prolonged, making its use safe. Just like _immutable_, some clever compiler inference that is transitive plus a delegate-like solution may do the job with ref without imposing constraints upon the sane. Plus, it may lead to new terrain when the liberation of local variables as above occurs in this context.

Warning: this is all speculation.
*/
Dec 30 2012
On Monday, December 31, 2012 03:34:18 Carl Sturtivant wrote:
> The implementation of delegates has solved an analogous problem.

Delegates use closures. The stack of the calling function is copied onto the heap so that it will continue to be valid for the delegate after the function returns. We don't want to be doing anything like that with ref, and it's generally completely unnecessary. It's just that there are cases where you can escape such references if you're not careful, and the compiler currently considers those to be @safe when they're not actually safe. So, we presumably need to do one of the following:

1. Limit what can be legally done with ref to get rid of the problem.
2. Make ref @system in cases where we can't prove that it's safe, and try to prove it to be safe in as many situations as possible.
3. Create a new attribute which has to be used when a function returns a ref to a parameter and use that to make it illegal to pass a ref to a local variable to such functions.
4. Something else that similarly protects against this at compile time without any extra overhead at runtime.

I really don't think that something like closures would be acceptable for ref parameters.

- Jonathan M Davis
Dec 30 2012
On Monday, 31 December 2012 at 02:47:46 UTC, Jonathan M Davis wrote:
> 3. Create a new attribute which has to be used when a function returns a ref to a parameter and use that to make it illegal to pass a ref to a local variable to such functions.

If this is the way to go, maybe "@saferef" could double as both @safe and @inoutref.

[OT] I've not been here for a while, but I've been reading up on the D boards again. I might want to help with the standard library lexer and parser. Happy New Year...
Dec 30 2012
I don't understand why there is a discussion on trying to special-case ref parameters. There's nothing special about ref parameters... what's special is ref _returns_. Therefore all we need to do is disallow ref returns in safe code.
Dec 31 2012
On Monday, 31 December 2012 at 21:25:53 UTC, Mehrdad wrote:
> I don't understand why there is a discussion on trying to special-case ref parameters. There's nothing special about ref parameters... what's special is ref _returns_. Therefore all we need to do is disallow ref returns in @safe code.

Yes, but that will prevent a whole lot of perfectly safe code, including code that is easily provable to be safe, from being marked as @safe.

The real problem is the ability to return a temp as a ref by hiding the temp from the compiler behind an intermediate wrapper of some kind.

Perhaps what must be disallowed (as being @safe) are ref returns where the return result cannot be proven to be safe from within the calling function, i.e. the wrapper itself may be safe, but its usage cannot be guaranteed to be safe when its result is returned by ref.

--rt
Dec 31 2012
On Monday, 31 December 2012 at 23:39:35 UTC, Rob T wrote:
> ref returns where the return result cannot be proven to be safe

Halting problem?
Dec 31 2012
On Monday, December 31, 2012 22:25:52 Mehrdad wrote:
> I don't understand why there is a discussion on trying to special-case ref parameters. There's nothing special about ref parameters... what's special is ref _returns_. Therefore all we need to do is disallow ref returns in @safe code.

The problem is ranges. auto ref returns are _very_ common with the front and back of ranges (especially wrapper ranges). If returning by ref or auto ref automatically renders code @system, then we just made a _large_ portion of Phobos @system when most of it is actually perfectly safe.

We _might_ be able to say that a function is @system if it both takes an argument by ref and returns by ref, but even that is likely to be a problem unless we can statically prove _in most cases_ that there's no way that the ref being returned could be to any portion of any of the arguments passed by ref.

- Jonathan M Davis
Jan 01 2013
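To illustrate the point about ranges, here is a minimal sketch of a range whose front returns by ref while staying memory-safe, because the reference points into a GC-allocated array rather than into a stack frame. `ArrayRange` is a hypothetical type written only for this example; it is not taken from Phobos.

```d
/// Hypothetical range over a GC-allocated int[] (not a Phobos type).
struct ArrayRange
{
    int[] data;

    @property bool empty() const @safe pure nothrow
    {
        return data.length == 0;
    }

    /// Returns a reference into the heap-allocated array, not into the stack.
    @property ref int front() @safe pure nothrow
    {
        return data[0];
    }

    void popFront() @safe pure nothrow
    {
        data = data[1 .. $];
    }
}

void main() @safe
{
    auto arr = [1, 2, 3];
    auto r = ArrayRange(arr);
    r.front = 10;          // writes through the ref into arr
    assert(arr[0] == 10);
    r.popFront();
    assert(r.front == 2);
}
```

Making every ref-returning function @system would also reject code of this shape, which is the cost being weighed in the post above.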
On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system.

This is not a surprise; I remember Andrei talking about it a year and a half ago.

> And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref. Take this code for example: <skipped>

I have not seen any Bugzilla issue or forum thread where someone actually fell into this double ref trap. The only cases I remember are discussions that such a problem is possible. Requiring some new attribute or new keyword does not really help, because almost all D language constraints can be circumvented by low-level tricks. Inferring this trap is not always possible, as was mentioned here, because the compiler does not always have access to the function definition.

I think it should not be fixed, but the compiler could probably issue a warning in circumstances where it can detect this situation.

By the way, there is another issue with ref, http://dpaste.dzfl.pl/928767a9, which was discussed at least several months ago. Do you think this should also be fixed?

> But my point is that we currently have a _major_ hole in SafeD thanks to the combination of ref parameters and ref return types, and we need to find a solution.
>
> - Jonathan M Davis

I don't take D's safety seriously because it can be easily hacked.
Jan 02 2013
On Wednesday, January 02, 2013 13:45:32 Maxim Fomin wrote:
> I think it should not be fixed, but probably compiler may issue warning at some circumstances when it can realize this situation.

It's a hole in @safe. It must be fixed. That's not even vaguely up for discussion. The question is _how_ to fix it. Ideally, it would be fixed in a way that limits how much more code has to become @system.

> By the way, there is another issue with ref - http://dpaste.dzfl.pl/928767a9 which was discussed several month ago minimum. Do you think this should be also fixed?

It's not a bug. You're dereferencing a null pointer, so you get a segfault. There's nothing surprising there.

> I don't take into D's safity seriously because it can be easily hacked.

It's fine if you don't care about it, but as the maintainers of the language and standard library, we have to take it seriously. Regardless of the likelihood of there being a bug caused by this, it breaks @safe, so it must be fixed, even if that means simply making all functions which both accept by ref and return by ref @system. But that's very undesirable, because it will lead to too much code being considered @system even when it's perfectly safe. Hence why this is being discussed.

- Jonathan M Davis
Jan 02 2013
On Wednesday, 2 January 2013 at 19:37:51 UTC, Jonathan M Davis wrote:
> On Wednesday, January 02, 2013 13:45:32 Maxim Fomin wrote:
>> I think it should not be fixed, but probably compiler may issue warning at some circumstances when it can realize this situation.
>
> It's a hole in @safe. It must be fixed. That's not even vaguely up for discussion. The question is _how_ to fix it. Ideally, it would be fixed in a way that limits how much more code has to become @system.

I argue that safety can be easily broken (not only by the example I provided) and there is no way to fix all holes, because D is a systems language and provides access to low-level features. @safe is good for warning about (not preventing) doing something wrong, but it cannot stop all safety breakages. Nor should it make plenty of code uncompilable just because some trick may cause a segfault. Actually, many things can cause segfaults, but they are not intended to be fixed.

>> By the way, there is another issue with ref - http://dpaste.dzfl.pl/928767a9 which was discussed several month ago minimum. Do you think this should be also fixed?
>
> It's not a bug. You're dereferencing a null pointer, so you get a segfault. There's nothing surprising there.

Consider a broader example where a function takes a pointer, does not check it for null, and later passes it on as a reference. This is similar to the double ref trick. Consider another example:

---main.d-----
extern(C) void foo() @safe pure nothrow;

void notThatSafe() @safe pure nothrow
{
    foo();
}

void main()
{
    notThatSafe();
}
----foo.d---
extern(C) void foo()
{
    throw new Exception("");
}
----------

So pure, nothrow and @safe are effectively stripped off by separate compilation. Another example, which does not require a separate file: http://dpaste.dzfl.pl/f968cab5

>> I don't take into D's safity seriously because it can be easily hacked.
>
> It's fine if you don't care about it, but as the maintainers of the language and standard library, we have to take it seriously. Regardless of the likelihood of there being a bug caused by this, it breaks @safe, so it must be fixed, even if that means simply making all functions which both accept by ref and return by ref @system. But that's very undesirable, because it will lead to too much code being considered @system even when it's perfectly safe. Hence why this is being discussed.
>
> - Jonathan M Davis

Again, I argue that D is a systems language and there are many ways to break safety. Although fixing holes does make sense in general, it does not make sense to fix obvious issues in a way that makes plenty of code uncompilable and makes @safe very annoying to use.
Jan 02 2013
On Wednesday, January 02, 2013 23:21:55 Maxim Fomin wrote:
> Again, I argue that D is a system language and there are many possibilities to break safity. Although fixing holes does make sense in general, it does not make sense fixing obvious issues so that plenty of code becomes uncompilable and safity usage becomes very annoying.

Then we're going to have to disagree, and I believe that Walter and Andrei are completely with me on this one. If all of the constructs that you use are @safe, then it should be _guaranteed_ that your program is memory-safe. That's what @safe is for. Yes, it can be gotten around if the programmer marks @system code as @trusted when it's not really memory-safe, but that's the programmer's problem. @safe is not doing its job and is completely pointless if it has any holes in it beyond programmers mislabeling functions as @trusted.

- Jonathan M Davis
Jan 02 2013
On Wednesday, 2 January 2013 at 22:53:04 UTC, Jonathan M Davis wrote:
> Then we're going to have to disagree, and I believe that Walter and Andrei are completely with me on this one. If all of the constructs that you use are @safe, then it should be _guaranteed_ that your program is memory-safe. That's what @safe is for. Yes, it can be gotten around if the programmer marks @system code as @trusted when it's not really memory-safe, but that's the programmer's problem. @safe is not doing its job and is completely pointless if it has any holes in it beyond programmers mislabeling functions as @trusted.
>
> - Jonathan M Davis

Perhaps it is worth looking at Rust for this problem? They have been looking pretty hard at the lifetimes of data/pointers and perhaps they have a (possibly partial) solution that can be used in the D compiler. It seems to me a ref in D has many things in common with Rust's borrowed pointers.
Jan 02 2013
On Wednesday, 2 January 2013 at 23:33:16 UTC, Thiez wrote:
> On Wednesday, 2 January 2013 at 22:53:04 UTC, Jonathan M Davis wrote:
>> Then we're going to have to disagree, and I believe that Walter and Andrei are completely with me on this one. If all of the constructs that you use are @safe, then it should be _guaranteed_ that your program is memory-safe. That's what @safe is for. Yes, it can be gotten around if the programmer marks @system code as @trusted when it's not really memory-safe, but that's the programmer's problem. @safe is not doing its job and is completely pointless if it has any holes in it beyond programmers mislabeling functions as @trusted.
>>
>> - Jonathan M Davis
>
> Perhaps it is worth looking at Rust for this problem?

You can also look at how Algol solved this over 40 years ago: insert a runtime check that the escaping reference does not point into the current stack frame, which is about to be destroyed. The check should be very cheap at runtime, and it can be deactivated in a release build for efficiency, just as is done for array indexing.

FYI, Nimrod has the same problem, and the plan is to prevent these cases statically with a type-based alias analysis; however, at least the first versions will still keep the dynamic check, as these kinds of static analyses cry out for correctness proofs IMO.
Jan 03 2013
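The core of the runtime check described above is an address-range test. The following is only a sketch under stated assumptions: `pointsInto` is a hypothetical helper, and a fixed buffer stands in for "the stack frame about to be destroyed", since obtaining real frame bounds is platform- and compiler-specific and is not shown here.

```d
import std.stdio : writeln;

/// Hypothetical helper: true if `p` lies in the half-open address range [lo, hi).
bool pointsInto(const(void)* p, const(void)* lo, const(void)* hi)
    @trusted pure nothrow @nogc
{
    return cast(size_t) p >= cast(size_t) lo
        && cast(size_t) p <  cast(size_t) hi;
}

void main()
{
    // Assumption for illustration: this buffer models the dying stack frame.
    ubyte[64] frame;
    int outside;   // a distinct object, so it cannot lie inside `frame`

    const lo = frame.ptr;
    const hi = frame.ptr + frame.length;

    assert(pointsInto(&frame[10], lo, hi));  // such a ref would be rejected
    assert(!pointsInto(&outside, lo, hi));   // such a ref would be allowed
    writeln("ok");
}
```

A compiler-inserted version would run a test of this kind on every ref return in non-release builds, which is why it is compared to array bounds checking.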
On Friday, 4 January 2013 at 00:46:33 UTC, Araq wrote:
> On Wednesday, 2 January 2013 at 23:33:16 UTC, Thiez wrote:
>> On Wednesday, 2 January 2013 at 22:53:04 UTC, Jonathan M Davis wrote:
>>> Then we're going to have to disagree, and I believe that Walter and Andrei are completely with me on this one. If all of the constructs that you use are @safe, then it should be _guaranteed_ that your program is memory-safe. That's what @safe is for. Yes, it can be gotten around if the programmer marks @system code as @trusted when it's not really memory-safe, but that's the programmer's problem. @safe is not doing its job and is completely pointless if it has any holes in it beyond programmers mislabeling functions as @trusted.
>>>
>>> - Jonathan M Davis
>>
>> Perhaps it is worth looking at Rust for this problem?
>
> You can also look at how Algol solved this over 40 years ago: Insert a runtime check that the escaping reference does not point to the current stack frame which is about to be destroyed. The check should be very cheap at runtime but it can be deactivated in a release build for efficiency just like it is done for array indexing. FYI Nimrod has the same problem and it's planned to prevent these cases statically with a type based alias analysis; however at least the first versions will still keep the dynamic check as these kind of static analyses cry for correctness proofs IMO.

I did suggest something like that, and it may be a good idea to implement it as a debugging aid (like runtime range checking). I wonder how difficult it would be to implement?

Unfortunately, it does not help with the compile-time checks that @safe requires.

--rt
Jan 03 2013
On Wed, Jan 02, 2013 at 05:52:54PM -0500, Jonathan M Davis wrote:
> On Wednesday, January 02, 2013 23:21:55 Maxim Fomin wrote:
>> Again, I argue that D is a system language and there are many possibilities to break safity. Although fixing holes does make sense in general, it does not make sense fixing obvious issues so that plenty of code becomes uncompilable and safity usage becomes very annoying.
>
> Then we're going to have to disagree, and I believe that Walter and Andrei are completely with me on this one. If all of the constructs that you use are @safe, then it should be _guaranteed_ that your program is memory-safe. That's what @safe is for. Yes, it can be gotten around if the programmer marks @system code as @trusted when it's not really memory-safe, but that's the programmer's problem. @safe is not doing its job and is completely pointless if it has any holes in it beyond programmers mislabeling functions as @trusted.
[...]

All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.

T

-- 
Two American lawyers went down to the beach for a swim. Seeing a canoe rental nearby, one asked the other, "Roe, or Wade?"
Jan 02 2013
On Wednesday, 2 January 2013 at 23:08:14 UTC, H. S. Teoh wrote:
> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.

@trusted shouldn't be a part of the function signature anyway, see http://forum.dlang.org/thread/blrglebkzhrilxkbprgh@forum.dlang.org. Somebody up for creating a DIP on that?

David
Jan 02 2013
On Thu, Jan 03, 2013 at 12:53:56AM +0100, David Nadlinger wrote:
> On Wednesday, 2 January 2013 at 23:08:14 UTC, H. S. Teoh wrote:
>> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.
>
> @trusted shouldn't be a part of the function signature anyway, see http://forum.dlang.org/thread/blrglebkzhrilxkbprgh@forum.dlang.org. Somebody up for creating a DIP on that?
[...]

Good point, @trusted is an implementation detail that should not pollute public APIs.

T

-- 
Live a century, learn a century, and you will still die a fool.
Jan 02 2013
On Thursday, 3 January 2013 at 00:13:41 UTC, H. S. Teoh wrote:
> On Thu, Jan 03, 2013 at 12:53:56AM +0100, David Nadlinger wrote:
>> On Wednesday, 2 January 2013 at 23:08:14 UTC, H. S. Teoh wrote:
>>> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.
>>
>> @trusted shouldn't be a part of the function signature anyway, see http://forum.dlang.org/thread/blrglebkzhrilxkbprgh@forum.dlang.org. Somebody up for creating a DIP on that?
> [...]
>
> Good point, @trusted is an implementation detail that should not pollute public APIs.
>
> T

Reading through the discussion on the subject, I have to agree that @trusted should be used to mark implementation details and not the API.

--rt
Jan 02 2013
On 1/2/13 5:21 PM, Maxim Fomin wrote:
> On Wednesday, 2 January 2013 at 19:37:51 UTC, Jonathan M Davis wrote:
>> On Wednesday, January 02, 2013 13:45:32 Maxim Fomin wrote:
>>> I think it should not be fixed, but probably compiler may issue warning at some circumstances when it can realize this situation.
>>
>> It's a hole in @safe. It must be fixed. That's not even vaguely up for discussion. The question is _how_ to fix it. Ideally, it would be fixed in a way that limits how much more code has to become @system.
>
> I argue that safity can be easily broken (not only by example I provided) and there is no way to fix all holes because D is a system language and provides access to low-level features. Safe is good to warn about (not prevent from) doing something wrong but it cannot stop from all safety breakages.

That is incorrect.

Andrei
Jan 02 2013
On 01/03/13 00:06, H. S. Teoh wrote:
> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.

extern(C) does not imply extern.

And for extern functions -- @safe and @trusted are equivalent, unless this makes a difference for name mangling, which it a) shouldn't, b) already does not for the extern(C) case.

artur
Jan 03 2013
On Wednesday, January 02, 2013 15:06:24 H. S. Teoh wrote:
> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.

Agreed. And @trusted is seriously questionable.

- Jonathan M Davis
Jan 02 2013
On Wed, Jan 02, 2013 at 06:30:38PM -0500, Jonathan M Davis wrote:
> On Wednesday, January 02, 2013 15:06:24 H. S. Teoh wrote:
>> All extern(C) functions must be @system by default. It makes no sense to allow a @safe extern(C) function, since there is no way for the compiler to verify anything at all. The best you can do is @trusted.
>
> Agreed. And @trusted is seriously questionable.
[...]

We may not have a choice on that, if some Phobos code needs to call C in the backend but needs to expose a @safe interface. But yeah, if possible, we should prohibit @trusted on extern(C) functions as well.

T

-- 
Two American lawyers went down to the beach for a swim. Seeing a canoe rental nearby, one asked the other, "Roe, or Wade?"
Jan 02 2013
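The pattern mentioned above, D code calling C internally while exposing a @safe interface, can be sketched as follows. This is an illustration only: `cLength` is a hypothetical wrapper, while `strlen` (from `core.stdc.string`) and `toStringz` (from `std.string`) are the existing druntime and Phobos declarations. The C function stays @system; only the small D wrapper is marked @trusted, on the manual argument that `toStringz` always hands `strlen` a NUL-terminated buffer.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Hypothetical wrapper: the C call is hidden behind a @trusted D function.
/// Note that strlen stops at the first '\0', so embedded NULs shorten the result.
size_t cLength(string s) @trusted
{
    return strlen(s.toStringz);
}

void main() @safe
{
    // @safe code may call the @trusted wrapper, but not strlen directly.
    assert(cLength("hello") == 5);
    assert(cLength("") == 0);
}
```

Here @trusted sits on the wrapper's implementation, not on the extern(C) declaration itself.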
On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref.

The best solution I can think of is for @safe code to require that a ref return value be treated with the same care as all the function's input arguments. I'll try to annotate the example code you gave to explain.

> Take this code for example:
>
> ref int foo(ref int i) { return i; }

This function is valid. Ref input arguments can be returned.

> ref int bar() { int i = 7; return foo(i); }

If @safe, this code will not compile.
Error: foo may return a local stack variable
Since "i" is a local variable, "foo(i)" might return it.

> ref int baz(int i) { return foo(i); }

This function is fine. "i" is an input argument, so "foo(i)" is considered to be equivalent to an input argument.

> void main() { auto a = bar(); auto b = baz(5); }

Both function calls compile. The variable a could be returned. I'm not sure if b should be returnable by ref. If "5" is a manifest constant, it must be an error in @safe code. If it has a permanent address, it could be returned.
Jan 02 2013
On 01/03/2013 12:48 AM, Jason House wrote:
> ...
>> ref int bar() { int i = 7; return foo(i); }
>
> If @safe, this code will not compile. Error: foo may return a local stack variable Since "i" is a local variable, "foo(i)" might return it.
>
>> ref int baz(int i) { return foo(i); }
>
> This function is fine. "i" is an input argument so "foo(i)" is considered to be equivalent to an input argument.

Those two cases are pretty much the same.
Jan 02 2013
On Thursday, 3 January 2013 at 05:56:27 UTC, Timon Gehr wrote:
> On 01/03/2013 12:48 AM, Jason House wrote:
>> ...
>>> ref int bar() { int i = 7; return foo(i); }
>>
>> If @safe, this code will not compile. Error: foo may return a local stack variable Since "i" is a local variable, "foo(i)" might return it.
>>
>>> ref int baz(int i) { return foo(i); }
>>
>> This function is fine. "i" is an input argument so "foo(i)" is considered to be equivalent to an input argument.
>
> Those two cases are pretty much the same.

If what I suggest is done, they must be differentiated. If you replace "return foo(i)" with "return i", the compiler will already issue an error for the local variable case.
Jan 03 2013
On 01/03/2013 01:52 PM, Jason House wrote:
> On Thursday, 3 January 2013 at 05:56:27 UTC, Timon Gehr wrote:
>> On 01/03/2013 12:48 AM, Jason House wrote:
>>> ...
>>>> ref int bar() { int i = 7; return foo(i); }
>>>
>>> If @safe, this code will not compile. Error: foo may return a local stack variable Since "i" is a local variable, "foo(i)" might return it.
>>>
>>>> ref int baz(int i) { return foo(i); }
>>>
>>> This function is fine. "i" is an input argument so "foo(i)" is considered to be equivalent to an input argument.
>>
>> Those two cases are pretty much the same.
>
> If what I suggest is done, they must be differentiated. If you replace "return foo(i)" with "return i", the compiler will already issue an error for the local variable case.

Obviously _both_ examples result in memory corruption. i is not a ref parameter.
Jan 03 2013
On 03.01.2013 00:48, Jason House wrote:
> On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
>> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref.
>
> The best solution I can think of is for the safe code to require a ref return value is treated with the same care as all the function input arguments. I'll try to annotate the example code you gave to explain.

+1

In other words, references returned by a function call that took any references to locals would be tainted as possibly local (in the function-local data flow) and thus are not allowed to escape the scope. References derived from non-local refs could still be returned, and returning references to fields from a struct method also works.

---
@safe ref int test(ref int v) {
    return v; // fine
}

@safe ref int test2() {
    int local;
    return test(local); // error: (possibly) returning ref to local
}

@safe ref int test3() {
    int local;
    int* ptr = &test(local); // fine, ptr is tainted 'local'
    return *ptr; // error: (possibly) returning ref to local
}

@safe ref int test4(ref int val) {
    return test(val); // fine, can only be a ref to the external 'val' or to a global
}
---
Jan 03 2013
On Friday, 4 January 2013 at 06:30:55 UTC, Sönke Ludwig wrote:
> In other words, references returned by a function call that took any references to locals would be tainted as possibly local (in the function local data flow) and thus are not allowed to escape the scope. References derived from non-local refs could still be returned and returning references to fields from a struct method also works.
>
> ---
> @safe ref int test(ref int v) {
>     return v; // fine
> }

v should be scope here. If not, other functions have no guarantee that the reference will not escape.

> @safe ref int test2() {
>     int local;
>     return test(local); // error: (possibly) returning ref to local
> }
>
> @safe ref int test3() {
>     int local;
>     int* ptr = &test(local); // fine, ptr is tainted 'local'
>     return *ptr; // error: (possibly) returning ref to local
> }
>
> @safe ref int test4(ref int val) {
>     return test(val); // fine, can only be a ref to the external 'val' or to a global
> }
> ---

Given the modification mentioned above, this looks like the way to go.
Jan 04 2013
On Friday, 4 January 2013 at 06:30:55 UTC, Sönke Ludwig wrote:
> In other words, references returned by a function call that took any references to locals would be tainted as possibly local (in the function local data flow) and thus are not allowed to escape the scope. References derived from non-local refs could still be returned and returning references to fields from a struct method also works.
>
> ---
> @safe ref int test(ref int v) {
>     return v; // fine
> }
>
> @safe ref int test2() {
>     int local;
>     return test(local); // error: (possibly) returning ref to local
> }
>
> @safe ref int test3() {
>     int local;
>     int* ptr = &test(local); // fine, ptr is tainted 'local'
>     return *ptr; // error: (possibly) returning ref to local
> }
>
> @safe ref int test4(ref int val) {
>     return test(val); // fine, can only be a ref to the external 'val' or to a global
> }
> ---

Trying to say that formally:

Definitions:

'Tainter function': A function that:
1. takes at least one of its parameters by reference and
2. returns by reference

'Tainting function call': A call to a 'tainter function' where at least one of the arguments passed by reference is a ref to a local variable

Then the rules become:

A function may not return a reference to:
Rule 1. a function-local variable
Rule 2. a value returned by a 'tainting function call'

@safe:

ref int tfun(ref int v) { // tfun tagged 'tainter function'
    ...
}

ref int test1() {
    int local;
    return local; // error by Rule 1
}

ref int test2() {
    int local;
    return tfun(local); // error by Rule 2
}

ref int test3() {
    int local;
    int* ptr = &tfun(local); // ptr tagged 'local'
    return *ptr; // error by Rule 2
}

ref int test4(ref int val) {
    return tfun(val); // fine
}

int global;

ref int test5() {
    int local;
    int* ptr = &tfun(local); // ptr tagged 'local'
    ptr = &global; // ptr's 'local' tag removed
    return *ptr; // fine
}
Jan 04 2013
On Friday, 4 January 2013 at 14:15:01 UTC, Tommi wrote:
> 'Tainting function call': A call to a 'tainter function' where at least one of the arguments passed by reference is ref to a local variable

I forgot to point out that the return value of a 'tainting function call' is considered to be a "reference to a function-local variable" (even if it's not in reality).
Jan 04 2013
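The two rules above can be modelled as a tiny data-flow check. The sketch below is a toy model, not compiler code: `Origin`, `resultOfTainterCall` and `mayReturnByRef` are hypothetical names invented for this illustration, and the model tracks only whether a reference is known to be non-local or may be local.

```d
/// Where a reference may point, as far as the calling function can tell.
enum Origin { nonLocal, local }

/// Rule 2: the result of a call to a 'tainter function' is treated as a
/// reference to a local if any by-ref argument is a ref to a local variable.
Origin resultOfTainterCall(const Origin[] refArgs) @safe pure nothrow @nogc
{
    foreach (arg; refArgs)
        if (arg == Origin.local)
            return Origin.local;
    return Origin.nonLocal;
}

/// Rules 1 and 2 combined: only non-local references may be returned by ref.
bool mayReturnByRef(Origin o) @safe pure nothrow @nogc
{
    return o == Origin.nonLocal;
}

void main() @safe
{
    // test1: return local;        -> rejected by Rule 1
    assert(!mayReturnByRef(Origin.local));
    // test2: return tfun(local);  -> rejected by Rule 2
    assert(!mayReturnByRef(resultOfTainterCall([Origin.local])));
    // test4: return tfun(val);    -> accepted, val is a ref parameter
    assert(mayReturnByRef(resultOfTainterCall([Origin.nonLocal])));
}
```

The model is deliberately conservative in the same way as the rules: one local argument taints the result even if the callee never actually returns it.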
On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
> And maybe another solution which I can't think of at the moment would be better. But my point is that we currently have a _major_ hole in SafeD thanks to the combination of ref parameters and ref return types, and we need to find a solution.
>
> - Jonathan M Davis
>
> Related: http://d.puremagic.com/issues/show_bug.cgi?id=8838

I've thought about how I think the attributes should work if D is forced to use them. This was the first system I came up with, but as you'll see below, the system can be simplified by ignoring @safe-ty altogether:

Two attributes: @saferef and @inoutref

// "@saferef" is semantically equivalent to "@safe @inoutref"
@saferef ref int fupz(ref int a)
{
    somethingUnsafe(); // Error
    return a; // Okay
}

// The same function won't work with just @safe
@safe ref int fuz(ref int a)
{
    return a; // Error: a @safe function which returns a reference to
              // a variable deriving from one of its parameters must be
              // marked @saferef
}

// Basic rule against using it when not necessary:
// a @saferef or @inoutref function must both accept and return a ref
@saferef int validate1(ref int a) { return a; } // Error
@inoutref ref int validate2(int a) { return a; } // Error

// @saferef's are chained by compiler enforcement:
@saferef ref int fonz(ref int a) { return a; }
@safe ref int frooz(ref int a)
{
    return fonz(a); // Error: a function which returns the result of one of
                    // its parameters being passed to a @saferef or @inoutref
                    // function must itself be marked @saferef or @inoutref
}

// The problem of escaping local variables:
@saferef ref int fonz(ref int a) { return a; }
ref int dollop()
{
    int local;
    return fonz(local); // Error: a function may not return the result of a
                        // local variable passed to a @saferef or an
                        // @inoutref function
}

// @inoutref may be used when you have otherwise un-@safe code:
@inoutref ref int froes(ref int a) { /+…some unsafe code…+/ return a; }
ref int f()
{
    int local;
    return froes(local); // Bug caught now even in @system code
}

// An enhancement: mark harmless parameters as @saferef
@saferef ref int twoParams(@saferef ref int a, ref int b)
{
    return a; // Error: a @saferef or @inoutref function may not return a
              // reference derived from a parameter marked @saferef
    return b; // Fine
}

// Only @saferef or @inoutref functions would be able to use @saferef parameters:
ref int zorf(@saferef ref int a, ref int b) {} // Error

So I typed all of that out and realized that a simpler alternative would be to ignore @safe altogether and have the @inoutref functionality be on by default. The only attribute now required would be @outref, which could be simplified to just "out" so long as it appeared *before* the parameter list, since it could be confused for an out contract if it came afterwards.

So: "@saferef" <=> "@safe @outref" is unnecessary because all functions are checked, not just @safe ones.

ref int lugs(ref int a)
{
    return a; // Okay
}

ref int h(ref int a)
{
    return lugs(a); // Okay
    int local;
    return lugs(local); // Error: may not return the result of a local variable
                        // passed to a function which both accepts and returns a
                        // ref unless that function is marked "@outref"
}

int d;
@outref ref int saml(ref int a)
{
    return *(new int); // Fine
    return d; // Fine
    return a; // Error: a function marked "@outref" may not return a reference
              // deriving from one of its parameters
}

ref int lugs(ref int a) { return a; }
@outref ref int druh(ref int a)
{
    return lugs(a); // Error: a function marked @outref may not return the result
                    // of one of its parameters being passed to a function unless
                    // that function is itself marked @outref
}

// Must both accept and return a reference
@outref int boops(ref int a) {} // Error
@outref ref int bop(int a) {} // Error

// Harmless parameters may be marked @trusted:
@outref ref int lit(@trusted ref int a, ref int b)
{
    return a; // Passes based on the honor system
    return b; // Error
}

The second system is much simpler, and it's only a little more computationally expensive than the
first, since the signature of all functions called with local variables must be scanned for ref output and input, not just safe ones.
Jan 03 2013
On Sunday, 30 December 2012 at 08:38:27 UTC, Jonathan M Davis wrote:
> After some recent discussions relating to auto ref and const ref, I have come to the conclusion that as it stands, ref is not @safe. It's @system. And I think that we need to take a serious look at it to see what we can do to make it @safe. The problem is combining code that takes ref parameters with code that returns by ref. Take this code for example:
>
> ref int foo(ref int i)
> {
>     return i;
> }
>
> ref int bar()
> {
>     int i = 7;
>     return foo(i);
> }
>
> ref int baz(int i)
> {
>     return foo(i);
> }
>
> void main()
> {
>     auto a = bar();
>     auto b = baz(5);
> }

I must admit that I haven't read the rest of the thread yet, but I think the obvious and correct solution is to disallow passing locals (including non-ref parameters, which are effectively locals in D) as non-scope ref arguments.

The scope attribute, once properly implemented, would make sure that the reference is not escaped. For now, we could just make it behave overly conservative in @safe code.

David
Jan 03 2013
On Thursday, 3 January 2013 at 21:56:22 UTC, David Nadlinger wrote:
> I must admit that I haven't read the rest of the thread yet, but I think the obvious and correct solution is to disallow passing locals (including non-ref parameters, which are effectively locals in D) as non-scope ref arguments.

The problem with that idea is that a ref return with no arguments may call another ref return that returns something that escapes the scope it was created in. If the source code is not available, then there's no way for the compiler to determine that this is going on.

I would suggest disallowing all ref returns that make use of a ref return function call *unless* the code portion is marked as trusted, and to do that requires following the ideas presented for changing how trusted should be implemented, i.e. allowing selected portions of otherwise unsafe code to be marked as trusted by a programmer who has verified the use of the code to be safe given the context.

> The scope attribute, once properly implemented, would make sure that the reference is not escaped. For now, we could just make it behave overly conservative in @safe code.
>
> David

My understanding was that in some cases that source code is not available to the compiler, which I would think means that preventing scope escaping cannot be 100% guaranteed, correct?

--rt
Jan 03 2013
On Thursday, 3 January 2013 at 22:50:38 UTC, Rob T wrote:
> On Thursday, 3 January 2013 at 21:56:22 UTC, David Nadlinger wrote:
>> I must admit that I haven't read the rest of the thread yet, but I think the obvious and correct solution is to disallow passing locals (including non-ref parameters, which are effectively locals in D) as non-scope ref arguments.
>
> The problem with that idea, is that a ref return with no arguments may call another ref return that returns something that escapes the scope it was created in. If the source code is not available, then there's no way for the compiler to determine that this is going on.

I am not quite sure what you are trying to say. If the compiler never sees the source code for the functions, then codegen is going to be difficult. ;)

Yes, if you just see "void iPromiseNotToEscapeMyParameter(scope ref int a) @safe;", then there is no way to directly check that the function actually does not leak the parameter address. However, you can be sure that the compiler checked that when generating the code for the function.

David
Jan 03 2013
On Thursday, 3 January 2013 at 23:06:03 UTC, David Nadlinger wrote:
>> The problem with that idea, is that a ref return with no arguments may call another ref return that returns something that escapes the scope it was created in. If the source code is not available, then there's no way for the compiler to determine that this is going on.
>
> I am not quite sure what you are trying to say. If the compiler never sees the source code for the functions, then codegen is going to be difficult. ;)

See my post directly above this one that has an example.

--rt
Jan 03 2013
On Thursday, 3 January 2013 at 22:50:38 UTC, Rob T wrote:
> The problem with that idea, is that a ref return with no arguments may call another ref return that returns something that escapes the scope it was created in. If the source code is not available, then there's no way for the compiler to determine that this is going on.

You can't return a scope ref in @safe code, so that is not an issue.

> I would suggest to disallow all ref returns that make use of a ref return function call *unless* the code portion is marked as trusted, and to do that requires following the ideas presented for changing how trusted should be implemented, i.e. allowing selected portions of otherwise unsafe code to be marked as trusted by a programmer who has verified the use of the code to be safe given the context.

This is why the scope qualifier exists.

>> The scope attribute, once properly implemented, would make sure that the reference is not escaped. For now, we could just make it behave overly conservative in @safe code.
>>
>> David
>
> My understanding was that in some cases that source code is not available to the compiler, which I would think means that preventing scope escaping cannot be 100% guaranteed, correct?
Jan 03 2013
On Thursday, January 03, 2013 23:50:37 Rob T wrote:
> My understanding was that in some cases that source code is not available to the compiler, which I would think means that preventing scope escaping cannot be 100% guaranteed, correct?

The source code is always available when compiling the function itself. So (assuming that scope is fully implemented - which it's not right now), the compiler will be able to verify that a scope parameter does not escape the function when it compiles that function.

What doesn't work is inferring function attributes at the call site, because that requires that the full code be available at the call site. And that's not necessarily true unless you're dealing with a templated function (which is part of why attribute inference only works with templated functions). But as long as you're talking about stuff that can be verified when the function itself is compiled, then the fact that the source code isn't necessarily available to the caller isn't an issue.

- Jonathan M Davis
Jan 03 2013
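To make the point about attribute inference in the post above concrete, here is a minimal sketch in D. The function names `twice`, `thrice`, and `useBoth` are invented for illustration and do not come from the thread. It assumes only the behaviour described above: a templated function's body is always visible where it is instantiated, so its attributes can be inferred, while an ordinary function has to state them in its signature.

```d
import std.stdio;

// Templated function: the body is available at every instantiation,
// so attributes such as @safe, pure and nothrow are inferred.
T twice(T)(T x)
{
    return x + x;
}

// Ordinary function: a caller may only see this signature (for example
// through a .di file), so the attributes are written out explicitly.
int thrice(int x) @safe pure nothrow
{
    return 3 * x;
}

// This compiles only because twice!int was inferred as @safe pure nothrow.
int useBoth() @safe pure nothrow
{
    return twice(21) + thrice(14);
}

void main()
{
    writeln(useBoth());
}
```

If the explicit attributes were removed from `thrice`, `useBoth` would presumably stop compiling, because nothing is inferred for a non-templated function with a declared return type.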
OK, I understand what you mean by "scope" and how that can be used to prevent leaking a local ref out.

Don't forget to consider this kind of scenario, which has no ref arguments to consider:

struct X
{
    int _i;
    ref int f()
    {
        return _i;
    }
}

ref int F()
{
    X x;
    return x.f();
}

int main()
{
    // example uses that currently compile
    F = 1000;
    writeln(F());
}

Is this valid? Does local x remain defined up until the function call terminates completely, i.e. until after the reference is no longer valid?

I can also mark everything as @safe and it will compile, and also scope x:

@safe ref int F()
{
    scope X x;
    return x.f();
    // this compiles too
    return x._i;
}

--rt
Jan 03 2013
On Thursday, 3 January 2013 at 21:56:22 UTC, David Nadlinger wrote:
> I must admit that I haven't read the rest of the thread yet, but I think the obvious and correct solution is to disallow passing locals (including non-ref parameters, which are effectively locals in D) as non-scope ref arguments. The scope attribute, once properly implemented, would make sure that the reference is not escaped. For now, we could just make it behave overly conservative in @safe code.

This seems to me like the sane thing to do.
Jan 03 2013
On Thursday, 3 January 2013 at 21:56:22 UTC, David Nadlinger wrote:
> I must admit that I haven't read the rest of the thread yet, but I think the obvious and correct solution is to disallow passing locals (including non-ref parameters, which are effectively locals in D) as non-scope ref arguments. The scope attribute, once properly implemented, would make sure that the reference is not escaped. For now, we could just make it behave overly conservative in @safe code.
>
> David

If you disallow passing local variables as non-scope ref arguments, then you effectively disallow all method calls on local variables. My reasoning is as follows:

struct T
{
    int get(int v) const;
    void set(int v);
}

Those methods of T can be thought of as free functions with these signatures:

int get(ref const T obj, int v);
void set(ref T obj, int v);

And these kinds of method calls:

T obj;
int n = obj.get(v);
obj.set(n);

...can be thought of as being converted to these free function calls:

T obj;
int n = .get(obj, v);
.set(obj, n);

I don't know what the compiler does or doesn't do, but it is *as_if* the compiler did this conversion from method calls to free functions.

Now it's obvious, given those free function signatures, that if you disallow passing function-local variables as non-scope references, you also disallow this code:

void func()
{
    T obj;
    obj.set(123);
}

Because that would effectively be the same as:

void func()
{
    T obj; // obj is a local variable
    .set(obj, 123); // obj is passed as non-scope ref
}

Then, you might ask, why don't those methods of T correspond to these free function signatures:

int get(scope ref const T obj, int v);
void set(scope ref T obj, int v);

And the answer is obviously that it would prevent these kinds of methods:

struct T
{
    int v;
    ref T increment()
    {
        v++;
        return this;
    }
}

...because that would then convert to this free function signature:

ref T increment(scope ref T obj)
{
    obj.v++;
    return obj; // Can't return a reference to a scope argument
}
Jan 10 2013
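The method-to-free-function analogy in the post above can be tried out directly. The following is only a sketch: `setFree` and `demo` are hypothetical names, not anything from the thread or from Phobos. It shows a local variable being used both through a method and through a free function that takes it by ref, including the UFCS spelling.

```d
struct T
{
    int v;

    // Method: `this` is implicitly passed by reference.
    void set(int n) { v = n; }
}

// Hypothetical free-function counterpart: the object becomes an
// explicit ref parameter.
void setFree(ref T obj, int n) { obj.v = n; }

int demo()
{
    T obj;            // a function-local variable
    obj.set(1);       // method call on the local
    setFree(obj, 2);  // the same local passed as a ref argument
    obj.setFree(3);   // UFCS: the free function in method-call form
    return obj.v;
}

void main()
{
    assert(demo() == 3);
}
```

All three calls mutate the same local through a reference, which is why a blanket ban on passing locals as non-scope ref arguments would also rule out the plain method call.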
...Although, I should add that my analogy between methods and free functions seems to break when the object is an rvalue. Like in:

struct T
{
    int v;
    this(int a) { v = a; }
    int get() { return v; }
}

int v = T(4).get();

Given my analogy, the method get() should be able to be thought of as a free function:

int gget(ref T obj)
{
    return obj.v;
}

But then the above method call should be able to be thought of as:

int v = gget(T(4));

...which won't compile because T(4) is an rvalue, and according to D, rvalues can't be passed as ref (nor const ref). I don't know which one is flawed, my analogy, or the logic of how D is designed.
Jan 10 2013
On Thursday, 10 January 2013 at 16:42:09 UTC, Tommi wrote:
> ...which won't compile because T(4) is an rvalue, and according to D, rvalues can't be passed as ref (nor const ref). I don't know which one is flawed, my analogy, or the logic of how D is designed.

My analogy is a bit broken in the sense that methods actually see their designated object as a reference to lvalue even if it is an rvalue. But I don't think that affects the logic of my main argument about scope arguments.

A more strict language logic would be inconvenient. But, this logic does introduce a discrepancy between non-member operators and member operators in C++ (which D actually side-steps by disallowing non-member operators... and then re-introduces by providing UFCS):

// C++:
struct T
{
    int val = 10;
    T& operator--() { --val; return *this; }
};

T& operator++(T& t) { ++t.val; return t; }

int main()
{
    _cprintf("%d\n", (--T()).val); // Prints: 9
    _cprintf("%d\n", (++T()).val); // Error: no known conversion
                                   // from 'T' to 'T&'
    return 0;
}

// D:
import std.stdio;

struct T
{
    int val;
    int get() { ++val; return val; }
}

int het(ref T t) { ++t.val; return t.val; }

void main()
{
    writeln(T().get());
    writeln(T().het()); // Error: T(0) is not an lvalue
}
Jan 10 2013
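The discrepancy described in the D snippet above can be checked at compile time. This is a hedged sketch: it assumes default compiler settings (no preview switch that lets rvalues bind to ref parameters), and `viaLvalue` is a name made up for the example.

```d
struct T
{
    int val;
    int get() { ++val; return val; }
}

int het(ref T t) { ++t.val; return t.val; }

// A method may be called on an rvalue...
static assert(__traits(compiles, T().get()));
// ...but, under the assumed default settings, an rvalue does not bind
// to a ref parameter, so the free function is rejected.
static assert(!__traits(compiles, het(T())));

int viaLvalue()
{
    auto t = T();   // naming the temporary makes it an lvalue
    return t.het(); // the UFCS call is accepted now
}

void main()
{
    assert(T().get() == 1);
    assert(viaLvalue() == 1);
}
```

Giving the temporary a name is the usual workaround, at the cost of the one-liner style that the method form allows.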
I've here formalized how I think the constraints on a non-scope ref taking and ref returning function should work. This represents a whole addition to the type system. The attribute "@outref" from my previous post has been shortened to keyword "out" (must come before parentheses). This is all I have left to say about this topic:

ref int lugs(ref int a) { return a; }
ref int h(ref int a)
{
    return lugs(a); // Okay
    int local;
    return lugs(local); // Error: the result of a function which accepts a
                        // local as non-scope ref and returns ref is treated
                        // as local and cannot be escaped unless that
                        // function is marked "out"
    int* p = &lugs(local); // Same error
}

int d;
out ref int saml(ref int a)
{
    return *(new int); // Fine
    return d; // Fine
    return a; // Error: a function marked "out" may not escape a non-scope
              // ref parameter
}

ref int lugh(ref int a) { return a; }
out ref int druh(ref int a)
{
    return lugh(a); // Error: a function marked "out" may not escape the
                    // result of a function which accepts its non-scope ref
                    // parameter and returns a ref unless that function is
                    // also marked "out"
}

out int boops(ref int a) {} // Error: a function marked "out" must return a
                            // reference
out ref int bop(int a, in ref b, scope ref c) {} // Error: a non-member
                            // function marked "out" must accept at least
                            // one non-scope ref parameter

// "cast(out)" provides all needed flexibility:
out ref int lit(ref int a)
{
    return cast(out) a; // Not @safe
    // But with @trusted blocks, we could do:
    @trusted { return cast(out) a; } // @safe
    // And with @trusted statements, the brackets are gone:
    @trusted return cast(out) a; // @safe
    // Otherwise, this function must be marked "@trusted"
}

// You can use cast(out) anywhere:
ref int hugs(ref int a) { return a; }
ref int g(ref int a)
{
    int local;
    return cast(out) (hugs(local)); // Okay
    return cast(out) local; // Okay??
    return hugs(cast(out) local); // Won't know what hit 'em
}

// Nor did I forget about structs:
struct S
{
    int _i;
    static int _s;
    out ref int club() { return _i; } // Error: a member function marked
                                      // "out" may not escape a non-static
                                      // field
    out ref int trob() { return _s; } // Okay
    out ref int blub() { return cast(out) _i; } // Okay
}

struct B
{
    int _i;
    ref int snub() { return _i; }
}

ref int bub()
{
    B b;
    return b.snub(); // Error: the result of a local instance's non-static
                     // method which returns ref is considered local and may
                     // not be escaped unless that method is marked "out"
    int* i = &b.snub(); // Same error
}
Jan 05 2013
I felt confident enough about my proposal to submit it as enhancement request: http://d.puremagic.com/issues/show_bug.cgi?id=9283
Jan 08 2013
On Wednesday, 9 January 2013 at 04:33:21 UTC, Zach the Mystic wrote:
> I felt confident enough about my proposal to submit it as enhancement request:
> http://d.puremagic.com/issues/show_bug.cgi?id=9283

I like it. One issue though, like you also indicated by putting question marks on it:

ref T get(T)()
{
    T local;
    return cast(out) local; // This shouldn't compile
}

Because, wouldn't returning a local variable as a reference be a dangling reference in all cases? No matter if the programmer claims it's correct by saying cast(out)... it just can't be correct. And T can be a type that has reference semantics or value semantics, it doesn't matter. That function would always return a dangling reference, were it allowed to compile.
Jan 09 2013
On Wednesday, 9 January 2013 at 04:33:21 UTC, Zach the Mystic wrote:
> I felt confident enough about my proposal to submit it as enhancement request:
> http://d.puremagic.com/issues/show_bug.cgi?id=9283

By the way, what do you propose is the correct placement of this "new" out keyword:
Jan 09 2013