digitalmars.D - ref parameters: there is no escape
- Andrei Alexandrescu (29/29) Aug 14 2011 Walter and I have had a long discussion and we thought we'd bring an
- Jakob Ovrum (21/51) Aug 14 2011 I like the idea, but don't we already have (currently non-enforced)
- Timon Gehr (10/40) Aug 14 2011 Well, then it is possible to 'wash clean' a pointer to ref argument
- kennytm (10/49) Aug 14 2011 Well, you could adopt bug 6442 and call the constructor as
- dsimcha (14/44) Aug 14 2011 I think this is an absolutely terrible idea, unless it has an "I know
- Jakob Ovrum (6/11) Aug 14 2011 What if it was allowed if the parameters were explicitly marked scope?
- Andrei Alexandrescu (5/17) Aug 14 2011 Exactly. Using scope has been part of the discussion, and our agreement
- Jacob Carlborg (4/24) Aug 14 2011 Can we do the opposite, somehow indicating that the parameters might esc...
- Andrei Alexandrescu (4/31) Aug 14 2011 We talked about this, too. I even aired ~scope. Such a change would be
- dsimcha (3/7) Aug 14 2011 Let's assume for the sake of argument that scope is part of the game.
- Andrei Alexandrescu (9/23) Aug 14 2011 I'm weary of absolute qualifications, particularly after arguments have
- dsimcha (6/30) Aug 14 2011 Pass-by-
- Andrei Alexandrescu (3/11) Aug 14 2011 "absolutely terrible"
- dsimcha (10/29) Aug 14 2011 Argh, accidentally hit send before I meant to on my last post. Please
- Andrei Alexandrescu (16/47) Aug 14 2011 Yah, dWrapper would become:
- dsimcha (38/48) Aug 14 2011 Ok, IIUC we might have found some common ground here. Is the idea that,...
- Andrei Alexandrescu (4/21) Aug 14 2011 [snip]
- bearophile (6/13) Aug 14 2011 It's interesting to know how much code and how much hard the changes are...
- Marco Leise (3/9) Aug 14 2011 +1 (although with __gshared it would have created some horrible code whe...
- Michel Fortin (19/23) Aug 14 2011 Actually, no, that's not safe by itself. Consider this:
- Jacob Carlborg (24/54) Aug 14 2011 I have code relying on this, probably not could practice but it works.
- dsimcha (16/46) Aug 14 2011 Another example of why this is a bad idea:
- Andrei Alexandrescu (17/32) Aug 14 2011 I understand. Would it be agreeable to require a cast to take the
- dsimcha (10/16) Aug 14 2011 But this breaks encapsulation horribly in the presence of conservative
- Andrei Alexandrescu (4/22) Aug 14 2011 You are exploring an increasingly narrow niche. Is it worth keeping a
- dsimcha (3/6) Aug 14 2011 Yes!!! Such conservative and inflexible rules have no place in a
- Andrei Alexandrescu (5/12) Aug 14 2011 I see. Do you have a response to any of the arguments I brought? Among
- Mehrdad (5/9) Aug 14 2011 ... I hope you're joking.
- Timon Gehr (5/15) Aug 15 2011 In a well designed language, warnings are useless. Either the code is
- Mehrdad (5/11) Aug 15 2011 ?!?!??!!
- Timon Gehr (11/23) Aug 15 2011 Only if the function intends to escape the reference. And if you really
- Mehrdad (30/43) Aug 15 2011 ... introducing D#?
- Jakob Ovrum (8/51) Aug 15 2011 No reference to stack memory can escape its stack frame in your example....
- Mehrdad (19/39) Aug 15 2011 Oh, I assumed it was obvious that you usually have to /call/ a function
- Mehrdad (2/6) Aug 15 2011 Sorry, that should say "false", not "true".
- Jakob Ovrum (14/53) Aug 16 2011 Not all calls to that function would be dangerous. I would even go so
- Dmitry Olshansky (8/15) Aug 16 2011 @system void dangerous(ref int x)
- Timon Gehr (6/21) Aug 16 2011 Andrei proposed to make this invalid even for system functions. The only...
- Mehrdad (5/12) Aug 16 2011 Right. Would you mind giving me around ~3 examples of languages you DO
- Timon Gehr (26/69) Aug 16 2011 Sure. I was never implying otherwise. But the main reason that in C#
- Andrei Alexandrescu (4/14) Aug 15 2011 No.
- Steven Schveighoffer (14/42) Aug 15 2011 It sounds reasonable, especially with the added clarification that you c...
Walter and I have had a long discussion and we thought we'd bring an idea for community review. We believe it would be useful for safety purposes to disallow escaping addresses of ref parameters. Consider:

class C {
    int * p;
    this(ref int x) {
        p = &x; // escapes the address of a ref parameter
    }
}

Such code is accepted today. We believe it is error-prone and dangerous, particularly because the caller has no syntactic cue that the address of the parameter is passed into the function (in this case constructor). Worse, such a function cannot be characterized as safe. So we want to make the above an error. The workaround is obvious - just take int* as a parameter instead of ref int.

What a function can do with a ref parameter in general is:

* use it directly just like a local;
* pass it down to other functions (which may take it by value or reference);
* pass its address down to pure functions because a pure function cannot escape the address anyway (cool insight by Walter);
* take its address as long as the address doesn't outlive the frame of the function.

The third bullet is not easy to implement as it requires flow analysis, but we may start with a conservative version first. Probably there won't be a lot of broken code anyway.

Please chime in with any comments you might have!

Thanks,

Andrei
Aug 14 2011
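A minimal sketch of the workaround mentioned in the opening post, assuming the constructor is simply changed to take a pointer. The class mirrors the `C` from the post; the `unittest` block and the name `heapInt` are illustrative assumptions, not code from the thread.

```d
// Hypothetical sketch of the workaround from the opening post:
// take int* instead of ref int, so the caller has to pass a pointer
// explicitly and the escape is visible at the call site.
class C
{
    int* p;

    this(int* x)
    {
        p = x; // stores a pointer the caller knowingly handed over
    }
}

unittest
{
    auto heapInt = new int; // GC-allocated, so it outlives the current frame
    *heapInt = 42;

    auto c = new C(heapInt);
    assert(c.p is heapInt);
    assert(*c.p == 42);
}
```

With this signature a caller holding a stack variable would have to write `new C(&x)`, which is the syntactic cue the post says is missing with ref.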
On 2011/08/14 23:20, Andrei Alexandrescu wrote:
> [snip]

I like the idea, but don't we already have (currently non-enforced) scope parameters for this? Of course it would be nice to have "ref" also mean scope, like "in" meaning "scope const", but it would be nice to have scope working properly. Currently, this code compiles fine:

---------------------
const(char)[] test;

void foo(in char[] s)
{
    test = s;
}

void main()
{
    foo("bar");
}
----------------------

This is a big problem when writing a library (accepting delegate callbacks and such) and you're not sure whether the user wants to just "read" a variable or hold onto it. Whether or not to make a copy should be the user's choice, not the library's.
Aug 14 2011
On 08/14/2011 04:20 PM, Andrei Alexandrescu wrote:
> [snip]
> What a function can do with a ref parameter in general is:
>
> * use it directly just like a local;
> * pass it down to other functions (which may take it by value or reference);
> * pass its address down to pure functions because a pure function cannot escape the address anyway (cool insight by Walter);

Well, then it is possible to 'wash clean' a pointer to ref argument using a pure function:

int* identity(int* p) pure { return p; }

int* global;

void escapeRef(ref int x) @safe {
    global = identity(&x);
}

> * take its address as long as the address doesn't outlive the frame of the function.
>
> The third bullet is not easy to implement as it requires flow analysis, but we may start with a conservative version first. Probably there won't be a lot of broken code anyway.
>
> Please chime in with any comments you might have!
>
> Thanks,
>
> Andrei

I agree, disallow. But more important is that scope parameters start working. Probably some of the code could be reused.
Aug 14 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> [snip]
> Such code is accepted today. We believe it is error-prone and dangerous, particularly because the caller has no syntactic cue that the address of the parameter is passed into the function (in this case constructor).

Well, you could adopt bug 6442 and call the constructor as

auto c = new C(ref x);

<g>

> Worse, such a function cannot be characterized as safe. So we want to make the above an error. The workaround is obvious - just take int* as a parameter instead of ref int.
>
> What a function can do with a ref parameter in general is:
>
> * use it directly just like a local;
> * pass it down to other functions (which may take it by value or reference);
> * pass its address down to pure functions because a pure function cannot escape the address anyway (cool insight by Walter);

Does this mean strongly pure? Because for now we can write a weakly pure function

pure int* escape(int* q) {
    return q;
}

and change that constructor to

this(ref int x) {
    p = escape(&x);
}

> * take its address as long as the address doesn't outlive the frame of the function.
>
> The third bullet is not easy to implement as it requires flow analysis, but we may start with a conservative version first. Probably there won't be a lot of broken code anyway.
>
> Please chime in with any comments you might have!
>
> Thanks,
>
> Andrei
Aug 14 2011
I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole. Consider the case of designing a D wrapper for C functionality.

// C, we know it doesn't escape its parameters but the compiler doesn't.
void cFun(int* a, int* b);

// D:
void dWrapper(ref int a, ref int b) {
    cFun(&a, &b);
}

If you want the compiler to put extra restrictions on you in the name of safety, that's what SafeD is for. If you're writing an @system function, then the compiler should stay out of your way and let you do what you want, unless it can **prove** that it's wrong.

On 8/14/2011 10:20 AM, Andrei Alexandrescu wrote:
> [snip]
Aug 14 2011
On 2011/08/15 0:28, dsimcha wrote:
> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole. Consider the case of designing a D wrapper for C functionality.
>
> // C, we know it doesn't escape its parameters but the compiler doesn't.
> void cFun(int* a, int* b);

What if it was allowed if the parameters were explicitly marked scope?

void cFun(scope int* a, scope int* b);

I can imagine it being a proper inconvenience most of the time though, with many libraries not escaping a lot at all, you'd have to mark pretty much everything scope manually.
Aug 14 2011
On 8/14/11 10:33 AM, Jakob Ovrum wrote:
> On 2011/08/15 0:28, dsimcha wrote:
>> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole. Consider the case of designing a D wrapper for C functionality.
>>
>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>> void cFun(int* a, int* b);
>
> What if it was allowed if the parameters were explicitly marked scope?
>
> void cFun(scope int* a, scope int* b);
>
> I can imagine it being a proper inconvenience most of the time though, with many libraries not escaping a lot at all, you'd have to mark pretty much everything scope manually.

Exactly. Using scope has been part of the discussion, and our agreement was that it would be a lot of burden to require manual scope annotations for non-escaping parameters.

Andrei
Aug 14 2011
On 2011-08-14 18:45, Andrei Alexandrescu wrote:
> On 8/14/11 10:33 AM, Jakob Ovrum wrote:
>> On 2011/08/15 0:28, dsimcha wrote:
>>> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole. Consider the case of designing a D wrapper for C functionality.
>>>
>>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>>> void cFun(int* a, int* b);
>>
>> What if it was allowed if the parameters were explicitly marked scope?
>>
>> void cFun(scope int* a, scope int* b);
>>
>> I can imagine it being a proper inconvenience most of the time though, with many libraries not escaping a lot at all, you'd have to mark pretty much everything scope manually.
>
> Exactly. Using scope has been part of the discussion, and our agreement was that it would be a lot of burden to require manual scope annotations for non-escaping parameters.
>
> Andrei

Can we do the opposite, somehow indicating that the parameters might escape?

--
/Jacob Carlborg
Aug 14 2011
On 8/14/11 11:50 AM, Jacob Carlborg wrote:
> On 2011-08-14 18:45, Andrei Alexandrescu wrote:
>> On 8/14/11 10:33 AM, Jakob Ovrum wrote:
>>> On 2011/08/15 0:28, dsimcha wrote:
>>>> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole. Consider the case of designing a D wrapper for C functionality.
>>>>
>>>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>>>> void cFun(int* a, int* b);
>>>
>>> What if it was allowed if the parameters were explicitly marked scope?
>>>
>>> void cFun(scope int* a, scope int* b);
>>>
>>> I can imagine it being a proper inconvenience most of the time though, with many libraries not escaping a lot at all, you'd have to mark pretty much everything scope manually.
>>
>> Exactly. Using scope has been part of the discussion, and our agreement was that it would be a lot of burden to require manual scope annotations for non-escaping parameters.
>>
>> Andrei
>
> Can we do the opposite, somehow indicating that the parameters might escape?

We talked about this, too. I even aired ~scope. Such a change would be doable but is liable to break a lot of code.

Andrei
Aug 14 2011
On 8/14/2011 12:45 PM, Andrei Alexandrescu wrote:
> Exactly. Using scope has been part of the discussion, and our agreement was that it would be a lot of burden to require manual scope annotations for non-escaping parameters.
>
> Andrei

Let's assume for the sake of argument that scope is part of the game. (How) would it be checked?
Aug 14 2011
On 8/14/11 10:28 AM, dsimcha wrote:
> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole.

I'm weary of absolute qualifications, particularly after arguments have been made in favor of the idea that are not refuted.

> Consider the case of designing a D wrapper for C functionality.
>
> // C, we know it doesn't escape its parameters but the compiler doesn't.
> void cFun(int* a, int* b);
>
> // D:
> void dWrapper(ref int a, ref int b) {
>     cFun(&a, &b);
> }

I understand. Probably it's fine to require an explicit cast for taking the address. Offhand, I don't see this as a frequent situation, or one that would make pass-by-pointer unpalatable.

> If you want the compiler to put extra restrictions on you in the name of safety, that's what SafeD is for. If you're writing an @system function, then the compiler should stay out of your way and let you do what you want, unless it can **prove** that it's wrong.

The problem is, currently all functions that pass locals by ref cannot be proven safe modularly.

Andrei
Aug 14 2011
On 8/14/2011 12:44 PM, Andrei Alexandrescu wrote:
> On 8/14/11 10:28 AM, dsimcha wrote:
>> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole.
>
> I'm weary of absolute qualifications, particularly after arguments have been made in favor of the idea that are not refuted.

What do you mean "absolute qualifications"?

>> Consider the case of designing a D wrapper for C functionality.
>>
>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>> void cFun(int* a, int* b);
>>
>> // D:
>> void dWrapper(ref int a, ref int b) {
>>     cFun(&a, &b);
>> }
>
> I understand. Probably it's fine to require an explicit cast for taking the address. Offhand, I don't see this as a frequent situation, or one that would make pass-by-pointer unpalatable.

Pass-by-

>> If you want the compiler to put extra restrictions on you in the name of safety, that's what SafeD is for. If you're writing an @system function, then the compiler should stay out of your way and let you do what you want, unless it can **prove** that it's wrong.
>
> The problem is, currently all functions that pass locals by ref cannot be proven safe modularly.

Right, but they can be proven safe if they pass locals by ref **to safe functions**. I don't think there's any disagreement that safe functions shouldn't be allowed to take the address of locals or parameters.
Aug 14 2011
On 8/14/11 11:51 AM, dsimcha wrote:
> On 8/14/2011 12:44 PM, Andrei Alexandrescu wrote:
>> On 8/14/11 10:28 AM, dsimcha wrote:
>>> I think this is an absolutely terrible idea, unless it has an "I know what I'm doing, let me cast away the safety" loophole.
>>
>> I'm weary of absolute qualifications, particularly after arguments have been made in favor of the idea that are not refuted.
>
> What do you mean "absolute qualifications"?

"absolutely terrible"

Andrei
Aug 14 2011
Argh, accidentally hit send before I meant to on my last post. Please ignore.

On 8/14/2011 12:44 PM, Andrei Alexandrescu wrote:
>> Consider the case of designing a D wrapper for C functionality.
>>
>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>> void cFun(int* a, int* b);
>>
>> // D:
>> void dWrapper(ref int a, ref int b) {
>>     cFun(&a, &b);
>> }
>
> I understand. Probably it's fine to require an explicit cast for taking the address. Offhand, I don't see this as a frequent situation, or one that would make pass-by-pointer unpalatable.

Pass-by-pointer is really, really ugly when used in high-level D-style code, and exposes the implementation detail that the D wrapper is using C code. By explicit cast, do you mean one in dWrapper() that's encapsulated and invisible to the caller?

>> If you want the compiler to put extra restrictions on you in the name of safety, that's what SafeD is for. If you're writing an @system function, then the compiler should stay out of your way and let you do what you want, unless it can **prove** that it's wrong.
>
> The problem is, currently all functions that pass locals by ref cannot be proven safe modularly.

Right, but they can be proven safe if they pass locals by ref **to safe functions**. I don't think there's any disagreement that safe functions shouldn't be allowed to take the address of locals or parameters.
Aug 14 2011
On 8/14/11 11:55 AM, dsimcha wrote:
> Argh, accidentally hit send before I meant to on my last post. Please ignore.
>
> On 8/14/2011 12:44 PM, Andrei Alexandrescu wrote:
>>> Consider the case of designing a D wrapper for C functionality.
>>>
>>> // C, we know it doesn't escape its parameters but the compiler doesn't.
>>> void cFun(int* a, int* b);
>>>
>>> // D:
>>> void dWrapper(ref int a, ref int b) {
>>>     cFun(&a, &b);
>>> }
>>
>> I understand. Probably it's fine to require an explicit cast for taking the address. Offhand, I don't see this as a frequent situation, or one that would make pass-by-pointer unpalatable.
>
> Pass-by-pointer is really, really ugly when used in high-level D-style code, and exposes the implementation detail that the D wrapper is using C code. By explicit cast, do you mean one in dWrapper() that's encapsulated and invisible to the caller?

Yah, dWrapper would become:

void dWrapper(ref int a, ref int b) {
    cFun(cast(int*) &a, cast(int*) &b);
}

If the casts are missing, the compiler's error message could clarify under what assumptions they might be inserted.

>>> If you want the compiler to put extra restrictions on you in the name of safety, that's what SafeD is for. If you're writing an @system function, then the compiler should stay out of your way and let you do what you want, unless it can **prove** that it's wrong.
>>
>> The problem is, currently all functions that pass locals by ref cannot be proven safe modularly.
>
> Right, but they can be proven safe if they pass locals by ref **to safe functions**. I don't think there's any disagreement that safe functions shouldn't be allowed to take the address of locals or parameters.

We don't have that rule yet, but we can enact it.

I strongly believe it would help if we enacted the "unescapable ref" rule for all D code. It disallows a patently dangerous pattern that many C++ coding standards (including Facebook's) explicitly disallow. We found pernicious bugs in our code caused by escaping reference parameters, and we're looking into adding a rule in our lint program to statically disallow it. If that's worthwhile (and I have evidence it is), then it's all the better to put the check straight in the language.

Andrei
Aug 14 2011
On 8/14/2011 1:05 PM, Andrei Alexandrescu wrote:
>> Pass-by-pointer is really, really ugly when used in high-level D-style code, and exposes the implementation detail that the D wrapper is using C code. By explicit cast, do you mean one in dWrapper() that's encapsulated and invisible to the caller?
>
> Yah, dWrapper would become:
>
> void dWrapper(ref int a, ref int b) {
>     cFun(cast(int*) &a, cast(int*) &b);
> }
>
> If the casts are missing, the compiler's error message could clarify under what assumptions they might be inserted.

Ok, IIUC we might have found some common ground here. Is the idea that, if you insert the cast, then it's an unsafe cast and you're free to take the address of a ref parameter, period? I think this is a reasonable compromise:

1. There's enormous precedent for the idea that casts are for things you **probably** shouldn't be doing but may occasionally have a good reason to do.
2. It's greppable, unlike the status quo, where there's no easy way to search for possible escaping of addresses of ref parameters.
3. It solves the encapsulation problem mentioned in my previous post.
4. It can be disallowed in SafeD as an unsafe cast.
5. If you allow taking the address of ref parameters without a cast as long as the compiler can prove that they don't escape, then performing this type of cast is a very explicit statement that you **know** the compiler can't prove that those addresses don't escape and that you take full responsibility for ensuring they don't.
6. It may eventually lead to a more comprehensive ownership type system similar to one that's been discussed here before, where there's ScopedPointers and regular pointers. A ScopedPointer is a super type of a regular pointer, isn't allowed to escape from where it was created, etc.

Bottom line: I completely agree that escaping addresses of ref parameters is a terrible design. I'm fine with making constructs that have the potential to do so more verbose and explicit by requiring casts. However, I am against disallowing it completely for the following reasons:

1. The rules against it would have to be conservative, meaning at least some valid designs are tossed out as well. This is completely unacceptable in a systems language.
2. I'm not convinced that escaping addresses of ref parameters is at all easy to do by accident.
3. In a systems language the compiler should **never** go out of its way to completely disallow a design, no matter how bad that design is. A language is a tool that should do what the user tells it to, not a nanny that should prevent the user from being naughty. (Though this does not preclude the compiler making it hard to do bad things **by accident**.) I don't care if said design is wholeheartedly endorsed by The Devil himself.
Aug 14 2011
On 8/14/11 12:39 PM, dsimcha wrote:
> On 8/14/2011 1:05 PM, Andrei Alexandrescu wrote:
>>> Pass-by-pointer is really, really ugly when used in high-level D-style code, and exposes the implementation detail that the D wrapper is using C code. By explicit cast, do you mean one in dWrapper() that's encapsulated and invisible to the caller?
>>
>> Yah, dWrapper would become:
>>
>> void dWrapper(ref int a, ref int b) {
>>     cFun(cast(int*) &a, cast(int*) &b);
>> }
>>
>> If the casts are missing, the compiler's error message could clarify under what assumptions they might be inserted.
>
> Ok, IIUC we might have found some common ground here. Is the idea that, if you insert the cast, then it's an unsafe cast and you're free to take the address of a ref parameter, period?
> [snip]

Everything sounds great, thanks.

Andrei
Aug 14 2011
Andrei Alexandrescu:

> We found pernicious bugs in our code caused by escaping reference parameters, and we're looking into adding a rule in our lint program to statically disallow it. If that's worthwhile (and I have evidence it is), then it's all the better to put the check straight in the language.

Another possibility is to add it to an experimental branch of DMD, study its use and effects for some time, and then decide what to do with this idea.

> We talked about this, too. I even aired ~scope. Such a change would be doable but is liable to break a lot of code.

It would be interesting to know how much code would break and how hard the changes would be to make. As in Go, a small tool similar to the one that converts Python 2 code to Python 3 code would be useful in D too, to reduce the amount of work for people (including Phobos devs) who have to update D code to follow changes in the D language.

Bye,
bearophile
Aug 14 2011
On 14.08.2011 at 21:08, bearophile <bearophileHUGS lycos.com> wrote:
> As in Go, a small tool similar to the one that converts Python 2 code to Python 3 code would be useful in D too, to reduce the amount of work for people (including Phobos devs) who have to update D code to follow changes in the D language.
>
> Bye,
> bearophile

+1 (although with __gshared it would have created some horrible code when the change was made :p)
Aug 14 2011
On 2011-08-14 16:55:08 +0000, dsimcha <dsimcha yahoo.com> said:

> Right, but they can be proven safe if they pass locals by ref **to safe functions**. I don't think there's any disagreement that safe functions shouldn't be allowed to take the address of locals or parameters.

Actually, no, that's not safe by itself. Consider this:

ref int foo(ref int a) @safe {
    return a;
}

ref int bar() @safe {
    int a;
    return foo(a);
}

And now 'bar' returns its local variable 'a' by ref, thanks to the complicity of 'foo'. All this unsafety is perfectly @safe. I think a safe function shouldn't be allowed to return by ref one of its parameters.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 14 2011
On 2011-08-14 16:20, Andrei Alexandrescu wrote:
> [snip]

I have code relying on this, probably not good practice but it works. This is a usage example:

void main ()
{
    int i = 3;

    restore(i) in {
        i = 4;
    };

    assert(i == 3);
}

Restore returns a struct which overloads the "in" operator and stores a pointer to the value passed to "restore". I'm overloading the "in" operator to have a nicer looking delegate syntax. But I guess this could be seen as operator overload abuse. If D just could have a good looking syntax for delegate literals, like this:

restore(i) {
    i = 4;
}

Then this wouldn't be needed.

--
/Jacob Carlborg
Aug 14 2011
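Jacob's post describes `restore` but does not show it. The following is only a guess at how such a helper might look, assuming a struct that keeps a pointer to the variable plus a copy of its old value. The names `Restorer`, `target` and `saved` are hypothetical, and the `return ref` annotation is a later language feature, used here so that newer compilers accept taking the parameter's address.

```d
// Hypothetical reconstruction, not Jacob's actual code.
struct Restorer(T)
{
    private T* target; // address of the caller's variable (the escaping pointer)
    private T saved;   // value to put back once the block has run

    // Lets `restore(x) in { ... }` run the block and then restore x.
    void opBinary(string op : "in")(scope void delegate() block)
    {
        scope (exit) *target = saved; // restore even if the block throws
        block();
    }
}

// `return ref` states that the result may refer to `value`.
Restorer!T restore(T)(return ref T value)
{
    return Restorer!T(&value, value);
}
```

This is exactly the pattern under discussion: the address of a ref parameter ends up inside a returned struct, and it is only sound because the struct is used immediately, while the caller's variable is still alive.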
Another example of why this is a bad idea:

In std.parallelism, I have a function called TaskPool.put, which takes a Task object by reference, takes its address and puts it on the task queue. This is used for scoped tasks. However, it's safe because Task has a destructor that waits for the task to be finished and out of the task queue before destroying the stack frame it's on and returning.

Why can't we just establish a strong convention that, if a function truly escapes the address of a ref parameter (meaning it actually lives longer than the lifetime of the function), you take a pointer instead of a ref? It's not like escaping ref parameters unintentionally is a common source of bugs.

My point is that any rule we come up with will always be conservative. D is a **systems language** and needs to give the benefit of the doubt to assuming the programmer knows what he/she is doing. If you want extra checks, that's what SafeD is for.

On 8/14/2011 10:20 AM, Andrei Alexandrescu wrote:
> [snip]
Aug 14 2011
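The constructor from the quoted proposal and the suggested int* workaround can be put side by side. The sketch below is only an illustration; the class names Holder and ExplicitHolder are made up for the comparison. Both versions compile with current rules; the difference lies in what the caller sees.

```d
import std.stdio;

// Hypothetical class mirroring the example in the proposal: the address of
// a ref parameter is stored in a field and so outlives the constructor call.
class Holder
{
    int* p;
    this(ref int x) { p = &x; }
}

// The workaround named in the proposal: take a pointer instead of ref.
class ExplicitHolder
{
    int* p;
    this(int* x) { p = x; }
}

void main()
{
    int local = 1;
    auto a = new Holder(local);          // nothing here shows that &local is kept
    auto b = new ExplicitHolder(&local); // the & makes the escape visible
    *a.p = 2;   // both objects alias the same stack variable
    *b.p += 1;
    writeln(local);
}
```

If either object outlived the stack frame that owns local, its pointer would dangle, and only the second call site gives the reader a cue that this is possible.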
On 8/14/11 10:41 AM, dsimcha wrote:
> Another example of why this is a bad idea: In std.parallelism, I have a function called TaskPool.put, which takes a Task object by reference, takes its address and puts it on the task queue. This is used for scoped tasks. However, it's safe because Task has a destructor that waits for the task to be finished and out of the task queue before destroying the stack frame it's on and returning.

I understand. Would it be agreeable to require a cast to take the address of the parameter since you're relying on an extralinguistic invariant? Basically you'd be more motivated to do so if you recognized how problematic escaping ref parameters is for most cases.

> Why can't we just establish a strong convention that, if a function truly escapes the address of a ref parameter (meaning it actually lives longer than the lifetime of the function), you take a pointer instead of a ref? It's not like escaping ref parameters unintentionally is a common source of bugs.

Convention has its usefulness, but also major downsides. The problem here is that we can't verify (or infer) safe for a lot of functions. Basically all functions taking ref become very difficult to use from safe code, including the common idiom of passing a stack variable into a function by reference. I don't think we can afford to lose so much when turning safety on.

> My point is that any rule we come up with will always be conservative. D is a **systems language** and needs to give the benefit of the doubt to assuming the programmer knows what he/she is doing. If you want extra checks, that's what SafeD is for.

I have only little sympathy for this argument; it actually leaves me more convinced we're on the right path. We're not talking about making it impossible to do something that you want to do. We're discussing a change that will make a lot of functions efficient _and_ safe, leaving a minority of cases to a slight syntactic change.

Andrei
Aug 14 2011
On 8/14/2011 12:54 PM, Andrei Alexandrescu wrote:
> I have only little sympathy for this argument; it actually leaves me more convinced we're on the right path. We're not talking about making it impossible to do something that you want to do. We're discussing a change that will make a lot of functions efficient _and_ safe, leaving a minority of cases to a slight syntactic change.
>
> Andrei

But this breaks encapsulation horribly in the presence of conservative rules. Let's say you start off with a function:

    SomeType fun(ref T arg) { .... }

Then you change fun()'s implementation such that it takes the address of arg. It does **not** escape this address, so the fact that the address is taken is an implementation detail. However, since the compiler's rules are conservative, this code might be illegal if the compiler can't prove via its static analysis that the addresses don't escape. Bam! Implementation details leaking into function signatures.
Aug 14 2011
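The scenario in the post above can be made concrete with a small sketch. Everything here is hypothetical (the helper doubled and the concrete types are invented for illustration): fun keeps its ref signature, takes the address of its argument internally, and never lets that address outlive the call. Whether a conservative checker would accept this without a cast is the open question being argued.

```d
import std.stdio;

// Hypothetical helper: reads through the pointer but never stores it.
int doubled(const(int)* p)
{
    return *p * 2;
}

// The signature still says ref; taking the address is an internal detail.
int fun(ref int arg)
{
    const(int)* p = &arg; // address taken inside the implementation
    return doubled(p);    // the pointer does not outlive this call
}

void main()
{
    int x = 21;
    writeln(fun(x));
}
```

A checker that cannot see into doubled has to either reject fun, demand a cast, or require some annotation on doubled's parameter, and each option surfaces in source code outside fun's body.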
On 8/14/11 12:02 PM, dsimcha wrote:
> On 8/14/2011 12:54 PM, Andrei Alexandrescu wrote:
>> I have only little sympathy for this argument; it actually leaves me more convinced we're on the right path. We're not talking about making it impossible to do something that you want to do. We're discussing a change that will make a lot of functions efficient _and_ safe, leaving a minority of cases to a slight syntactic change.
>>
>> Andrei
>
> But this breaks encapsulation horribly in the presence of conservative rules. Let's say you start off with a function:
>
>     SomeType fun(ref T arg) { .... }
>
> Then you change fun()'s implementation such that it takes the address of arg. It does **not** escape this address, so the fact that the address is taken is an implementation detail. However, since the compiler's rules are conservative, this code might be illegal if the compiler can't prove via its static analysis that the addresses don't escape. Bam! Implementation details leaking into function signatures.

You are exploring an increasingly narrow niche. Is it worth keeping a hole in the language for the sake of that?

Andrei
Aug 14 2011
On 8/14/2011 1:06 PM, Andrei Alexandrescu wrote:
> You are exploring an increasingly narrow niche. Is it worth keeping a hole in the language for the sake of that?
>
> Andrei

Yes!!! Such conservative and inflexible rules have no place in a systems language, period.
Aug 14 2011
On 8/14/11 12:12 PM, dsimcha wrote:
> On 8/14/2011 1:06 PM, Andrei Alexandrescu wrote:
>> You are exploring an increasingly narrow niche. Is it worth keeping a hole in the language for the sake of that?
>>
>> Andrei
>
> Yes!!! Such conservative and inflexible rules have no place in a systems language, period.

I see. Do you have a response to any of the arguments I brought? Among other things, does the fact that you still can do what you want to do assuage your perceived inconvenience?

Andrei
Aug 14 2011
On 8/14/2011 7:20 AM, Andrei Alexandrescu wrote:
> Walter and I have had a long discussion and we thought we'd bring an idea for community review. We believe it would be useful for safety purposes to disallow escaping addresses of ref parameters. <snip>

... I hope you're joking.

(1) I thought the whole point of D was that you didn't need pointers to program effectively?

(2) Isn't this what compiler **warnings** are for?
Aug 14 2011
On 08/15/2011 08:16 AM, Mehrdad wrote:
> On 8/14/2011 7:20 AM, Andrei Alexandrescu wrote:
>> Walter and I have had a long discussion and we thought we'd bring an idea for community review. We believe it would be useful for safety purposes to disallow escaping addresses of ref parameters. <snip>
>
> ... I hope you're joking.
>
> (1) I thought the whole point of D was that you didn't need pointers to program effectively?

Why would the change contradict this?

> (2) Isn't this what compiler **warnings** are for?

In a well designed language, warnings are useless. Either the code is well formed or it is not. All you will have to do to convince the compiler that what you are doing is safe is to insert a cast.
Aug 15 2011
On 8/15/2011 4:10 AM, Timon Gehr wrote:
> On 08/15/2011 08:16 AM, Mehrdad wrote:
>> (1) I thought the whole point of D was that you didn't need pointers to program effectively?
>
> Why would the change contradict this?

Because now you need pointers to pass things by reference?

>> (2) Isn't this what compiler **warnings** are for?
>
> In a well designed language, warnings are useless.

?!?!??!! designed?
Aug 15 2011
On 08/15/2011 04:04 PM, Mehrdad wrote:
> On 8/15/2011 4:10 AM, Timon Gehr wrote:
>> On 08/15/2011 08:16 AM, Mehrdad wrote:
>>> (1) I thought the whole point of D was that you didn't need pointers to program effectively?
>>
>> Why would the change contradict this?
>
> Because now you need pointers to pass things by reference?

Only if the function intends to escape the reference. And if you really need to, you can still use a cast. Furthermore, escaping a reference is generally unsafe when it is to stack memory (there can be some higher-level invariant that guarantees safety, but that is not within the compiler's reach -- use a cast.) When the reference is to a value type on heap memory, you had pointers in your code all along.

>>> (2) Isn't this what compiler **warnings** are for?
>>
>> In a well designed language, warnings are useless.
>
> ?!?!??!! designed?

Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code. Therefore, they usually reflect suboptimal language design.
Aug 15 2011
On 8/15/2011 2:04 PM, Timon Gehr wrote:
> On 08/15/2011 04:04 PM, Mehrdad wrote:
>> On 8/15/2011 4:10 AM, Timon Gehr wrote:
>> Because now you need pointers to pass things by reference?
>
> Only if the function intends to escape the reference. And if you really need to, you can still use a cast.

    void dangerous(ref int x) { unsafe { bar(&x); } }

> Furthermore, escaping a reference is generally unsafe when it is to stack memory (they can be some higher-level invariant that guarantees safety, but that is not within the compilers reach -- use a cast.)

Right, so let's also ban "auto ref" as a return value, since it could be returning the parameter itself, without the caller's knowledge:

    auto ref trySomethingExpensive(T)(auto ref T input, bool condition)
    {
        return condition ? doSomethingExpensive(input) : input; // Optimize the input
    }

Thoughts? Should we ban ref return types, too?

> When the reference is to a value type on heap memory, you had pointers in your code all along.

And that means...?

successful at rendering them useless *except* for interop purposes. Obviously, it looks as though D is failing to achieve that same goal, requiring pointers for something so simple -- do we really want that?

> Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code. Therefore, they usually reflect suboptimal language design.

Is that why my C compiler issues "Warning: Uninitialized local variable p" when I run this?

    int *p;
    *p = 5;

AFAIK this is clearly *INVALID* (i.e. undefined) code according to the standard...
Aug 15 2011
On 2011/08/16 12:33, Mehrdad wrote:
> On 8/15/2011 2:04 PM, Timon Gehr wrote:
>> On 08/15/2011 04:04 PM, Mehrdad wrote:
>>> On 8/15/2011 4:10 AM, Timon Gehr wrote:
>>> Because now you need pointers to pass things by reference?
>>
>> Only if the function intends to escape the reference. And if you really need to, you can still use a cast.
>
>     void dangerous(ref int x) { unsafe { bar(&x); } }
>
>> Furthermore, escaping a reference is generally unsafe when it is to stack memory (they can be some higher-level invariant that guarantees safety, but that is not within the compilers reach -- use a cast.)
>
> Right, so let's also ban "auto ref" as a return value, since it could be returning the parameter itself, without the caller's knowledge:
>
>     auto ref trySomethingExpensive(T)(auto ref T input, bool condition)
>     {
>         return condition ? doSomethingExpensive(input) : input; // Optimize the input
>     }
>
> Thoughts? Should we ban ref return types, too?

No reference to stack memory can escape its stack frame in your example. I don't see how this has anything to do with the discussion.

>> When the reference is to a value type on heap memory, you had pointers in your code all along.
>
> And that means...?
>
> successful at rendering them useless *except* for interop purposes. Obviously, it looks as though D is failing to achieve that same goal, requiring pointers for something so simple -- do we really want that?

Java doesn't have pointers. (unless you're referring to its reference types, which you obviously aren't, considering your interop comment). Pointers aren't required in D either, but are supported nevertheless, one reason being that D allows for putting value types directly on the

>> Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code. Therefore, they usually reflect suboptimal language design.
>
> Is that why my C compiler issues "Warning: Uninitialized local variable p" when I run this?
>
>     int *p;
>     *p = 5;
>
> AFAIK this is clearly *INVALID* (i.e. undefined) code according to the standard...
Aug 15 2011
On 8/15/2011 9:09 PM, Jakob Ovrum wrote:
> On 2011/08/16 12:33, Mehrdad wrote:
>>     auto ref trySomethingExpensive(T)(auto ref T input, bool condition)
>>     {
>>         return condition ? doSomethingExpensive(input) : input; // Optimize the input
>>     }
>>
>> Thoughts? Should we ban ref return types, too?
>
> No reference to stack memory can escape its stack frame in your example. I don't see how this has anything to do with the discussion.

Oh, I assumed it was obvious that you usually have to /call/ a function for anything to happen. But perhaps it wasn't, my bad.

Here's something (hopefully) more obvious/verbose:

    auto ref call_if(T)(auto ref T input, bool cond)
    {
        return cond ? call_something(input) : input;
    }

    ref int get_foo(int s)
    {
        int a = 9.8 * s;
        return call_if(x, true); // oops?
    }

    void test()
    {
        writeln(GetFoo()); // <--- what do you suppose gets returned?
    }

>>> When the reference is to a value type on heap memory, you had pointers in your code all along.
>
> Java doesn't have pointers. <snip>

If you consider references to be the same as pointers then surely you had pointers in your Java code all along?

> Pointers aren't required in D either.

It seems like they will be, after Andrei gets this working...

>>> Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code.
>>
>>     int *p;
>>     *p = 5;
>>
>> AFAIK this is clearly *INVALID* (i.e. undefined) code according to the standard...

Would be curious to know why you just ignored that comment, btw. Do you really consider that to be "valid" code, in any sense of the term?
Aug 15 2011
On 8/15/2011 9:48 PM, Mehrdad wrote:
>     ref int get_foo(int s)
>     {
>         int a = 9.8 * s;
>         return call_if(x, true); // oops?
>     }

Sorry, that should say "false", not "true".
Aug 15 2011
On 2011/08/16 13:48, Mehrdad wrote:
> On 8/15/2011 9:09 PM, Jakob Ovrum wrote:
>> On 2011/08/16 12:33, Mehrdad wrote:
>>>     auto ref trySomethingExpensive(T)(auto ref T input, bool condition)
>>>     {
>>>         return condition ? doSomethingExpensive(input) : input; // Optimize the input
>>>     }
>>>
>>> Thoughts? Should we ban ref return types, too?
>>
>> No reference to stack memory can escape its stack frame in your example. I don't see how this has anything to do with the discussion.
>
> Oh, I assumed it was obvious that you usually have to /call/ a function for anything to happen. But perhaps it wasn't, my bad. Here's something (hopefully) more obvious/verbose:

Not all calls to that function would be dangerous. I would even go so far as to claim most uses of such a function would not be dangerous. No need to be snarky, anyway.

>     auto ref call_if(T)(auto ref T input, bool cond)
>     {
>         return cond ? call_something(input) : input;
>     }
>
>     ref int get_foo(int s)
>     {
>         int a = 9.8 * s;
>         return call_if(x, true); // oops?
>     }
>
>     void test()
>     {
>         writeln(GetFoo()); // <--- what do you suppose gets returned?
>     }

I agree these cases are problematic too, but disallowing escaping of ref parameters does not affect them. I don't think Andrei's suggestion includes disallowing returning ref parameters by ref, if it does, I agree that it would reduce the usefulness of ref return.

>>>> When the reference is to a value type on heap memory, you had pointers in your code all along.
>>
>> Java doesn't have pointers. <snip>
>
> If you consider references to be the same as pointers then surely you had pointers in your Java code all along?
>
>> Pointers aren't required in D either.
>
> It seems like they will be, after Andrei gets this working...

Things will work the same, except you'll need an explicit cast if you want to take the address of the ref parameter (which, as before, yields a pointer). You shouldn't take the address of ref parameters unless unsafe example).

>>>> Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code.
>>>
>>>     int *p;
>>>     *p = 5;
>>>
>>> AFAIK this is clearly *INVALID* (i.e. undefined) code according to the standard...
>
> Would be curious to know why you just ignored that comment, btw. Do you really consider that to be "valid" code, in any sense of the term?

I'm not the one who said it was...
Aug 16 2011
On 16.08.2011 7:33, Mehrdad wrote:
>     void dangerous(ref int x) { unsafe { bar(&x); } }

    @system void dangerous(ref int x) { bar(&x); }

Fixed?

--
Dmitry Olshansky
Aug 16 2011
On 08/16/2011 12:09 PM, Dmitry Olshansky wrote:
> On 16.08.2011 7:33, Mehrdad wrote:
>>     void dangerous(ref int x) { unsafe { bar(&x); } }
>
>     @system void dangerous(ref int x) { bar(&x); }
>
> Fixed?

Andrei proposed to make this invalid even for @system functions. The only way to escape a ref param's address would be

    @system void dangerous(ref int x) { bar(cast(int*)&x); }
Aug 16 2011
On 8/16/2011 8:08 AM, Timon Gehr wrote:
> Andrei proposed to make this invalid even for @system functions. The only way to escape a ref param's address would be
>
>     @system void dangerous(ref int x) { bar(cast(int*)&x); }

What type would it be casting /from/?

> So if you are supposing I think the C language is well designed and warnings are useless for C, you are on the wrong path.

Right. Would you mind giving me around ~3 examples of languages you DO consider to be well-designed, other than D? Maybe we can (hopefully) find some common ground then...
Aug 16 2011
On 08/16/2011 05:33 AM, Mehrdad wrote:
> On 8/15/2011 2:04 PM, Timon Gehr wrote:
>> On 08/15/2011 04:04 PM, Mehrdad wrote:
>>> On 8/15/2011 4:10 AM, Timon Gehr wrote:
>>> Because now you need pointers to pass things by reference?
>>
>> Only if the function intends to escape the reference. And if you really need to, you can still use a cast.
>
>     void dangerous(ref int x) { unsafe { bar(&x); } }

cast syntax and that in a managed language, casts should always be safe.

>> Furthermore, escaping a reference is generally unsafe when it is to stack memory (they can be some higher-level invariant that guarantees safety, but that is not within the compilers reach -- use a cast.)
>
> Right, so let's also ban "auto ref" as a return value, since it could be returning the parameter itself, without the caller's knowledge:
>
>     auto ref trySomethingExpensive(T)(auto ref T input, bool condition)
>     {
>         return condition ? doSomethingExpensive(input) : input; // Optimize the input
>     }
>
> Thoughts? Should we ban ref return types, too?

Well obviously some measure to ban the improper usage of ref return types from safeD has to be taken eventually (otherwise safeD would not be memory safe, duh). Basically the problem arises when you directly pass on a reference you got from a function that took some of your local variables by ref (maybe indirectly through multiple ref returns). So that could be the construct to ban, which is less restrictive than banning ref return altogether.

>> When the reference is to a value type on heap memory, you had pointers in your code all along.
>
> And that means...?

    auto x=new int;
    static assert(is(typeof(x)==int*));

Alternatively, you can use a wrapper class and then you don't need pointers at all.

> successful at rendering them useless *except* for interop purposes. Obviously, it looks as though D is failing to achieve that same goal, requiring pointers for something so simple -- do we really want that?

allowing escaping ref argument addresses.

>> Warnings are issued for constructs that are regarded as potentially dangerous/error-prone by many people, but are still valid code. Therefore, they usually reflect suboptimal language design.
>
> Is that why my C compiler issues "Warning: Uninitialized local variable p" when I run this?
>
>     int *p;
>     *p = 5;
>
> AFAIK this is clearly *INVALID* (i.e. undefined) code according to the standard...

C is a machine-friendly language for writing lightning fast portable programs for arbitrary von Neumann architecture computers. Furthermore, writing a conformant compiler should be very easy. It does not need to be a well designed language, because that would interfere with other goals. So if you are supposing I think the C language is well designed and warnings are useless for C, you are on the wrong path. It is the other way round. But it is really not a problem at all. I still like C. Usually, well designed languages have somewhat worse run-time characteristics.
Aug 16 2011
On 8/15/11 1:16 AM, Mehrdad wrote:
> On 8/14/2011 7:20 AM, Andrei Alexandrescu wrote:
>> Walter and I have had a long discussion and we thought we'd bring an idea for community review. We believe it would be useful for safety purposes to disallow escaping addresses of ref parameters. <snip>
>
> ... I hope you're joking.
>
> (1) I thought the whole point of D was that you didn't need pointers to program effectively?

The proposed change has little to do with needing pointers or not.

> (2) Isn't this what compiler **warnings** are for?

No.

Andrei
Aug 15 2011
On Sun, 14 Aug 2011 10:20:37 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Walter and I have had a long discussion and we thought we'd bring an idea for community review.
>
> We believe it would be useful for safety purposes to disallow escaping addresses of ref parameters. Consider:
>
>     class C {
>         int * p;
>         this(ref int x) {
>             p = &x; // escapes the address of a ref parameter
>         }
>     }
>
> Such code is accepted today. We believe it is error-prone and dangerous, particularly because the caller has no syntactic cue that the address of the parameter is passed into the function (in this case constructor). Worse, such a function cannot be characterized as safe.
>
> So we want to make the above an error. The workaround is obvious - just take int* as a parameter instead of ref int. What a function can do with a ref parameter in general is:
>
> * use it directly just like a local;
> * pass it down to other functions (which may take it by value or reference);
> * pass its address down to pure functions because a pure function cannot escape the address anyway (cool insight by Walter);
> * take its address as long as the address doesn't outlive the frame of the function.
>
> The third bullet is not easy to implement as it requires flow analysis, but we may start with a conservative version first. Probably there won't be a lot of broken code anyway.
>
> Please chime in with any comments you might have!

It sounds reasonable, especially with the added clarification that you can cast yourself back to the good old unsafe pointer.

The one thing I'm leery of is that structs are passed by reference for member functions, which is *forced* by the compiler. Not that it's going to be horrible, but I think in certain cases, especially for things that allocate structs on the heap, this is going to require a lot of casting.

Here is a real world example for dcollections that is full of &this:

http://www.dsource.org/projects/dcollections/browser/branches/d2/dcollections/Link.d#L37

Is there no way to say "for this section of code, allow taking reference addresses"?

-Steve
Aug 15 2011