digitalmars.D - escaping addresses of ref parameters - not
- Andrei Alexandrescu (45/45) Feb 08 2009 Hey,
- Jarrett Billingsley (6/7) Feb 08 2009 I think yes. ;)
- Nick Sabalausky (13/33) Feb 08 2009 Or something like:
- Jarrett Billingsley (2/5) Feb 08 2009 Not if D gets (non)nullable types ;))))
- Jason House (3/63) Feb 08 2009 What constitutes escaping? If any other functions are called with the p...
- Denis Koroskin (3/7) Feb 08 2009 Whereas I agree about pointers, non-escape should be default so scope is...
- Andrei Alexandrescu (5/10) Feb 08 2009 The callee tucks away the address of the ref parameter, or the address
- Jason House (7/18) Feb 09 2009 Ok, so this is head-escape analysis instead of transitive escape analysi...
- Denis Koroskin (9/54) Feb 08 2009 I agree. It also grants safe way to pass temporaries:
- Andrei Alexandrescu (10/20) Feb 08 2009 Well rvalues still shouldn't bind to temporaries because I want to allow...
- Daniel Keep (13/13) Feb 08 2009 I've used ref arguments in the past to wrap a C api that expects
- Christopher Wright (2/19) Feb 09 2009 Why aren't you passing a Foo*?
- Denis Koroskin (15/35) Feb 09 2009 That's ok if Foo is a struct:
- Daniel Keep (6/46) Feb 09 2009 Because I treat pointers as "dragons be here" territory, and try to
- Andrei Alexandrescu (4/19) Feb 09 2009 The entire scheme relies on ref not being allowed outside function
- Bartosz Milewski (1/1) Feb 08 2009 Of course, enforce(mdgt); is there only for documentation purposes. Just...
- Brad Roberts (7/9) Feb 09 2009 What sort of documentation do you have that's able to stop a program in
- Bartosz Milewski (2/14) Feb 09 2009
- Denis Koroskin (2/5) Feb 09 2009 It will throw a recoverable Exception (an access violation is an Error, ...
- Christopher Wright (2/12) Feb 09 2009 And a segfault is a hard stop, unless you have a signal handler for it.
- Steven Schveighoffer (14/24) Feb 09 2009 As long as there is a way to circumvent this, I'm OK with this rule.
- Andrei Alexandrescu (3/36) Feb 09 2009 Yah, an explicit cast ref T -> T* must be still allowed.
- Bartosz Milewski (2/15) Feb 09 2009
- Christopher Wright (3/4) Feb 11 2009 That is annoying, and there are libraries that fix it. It's already
- Jarrett Billingsley (6/19) Feb 11 2009 Probably because on Windows, segfaults are reported through the same
- Michel Fortin (17/25) Feb 10 2009 Isn't the "this" argument for struct a ref now? So you can't do this
- Andrei Alexandrescu (4/30) Feb 10 2009 It's a ref.
- Brad Roberts (7/11) Feb 10 2009 Where does the language spec state or suggest that? A subset of structs
- Andrei Alexandrescu (6/15) Feb 10 2009 We plan to do that for D2. It would avoid all issues with C++'s copy
Hey,

I've been doing a hecatomb of coding in D lately, and had an insight that I think is pretty cool. Consider:

    struct Widget
    {
        private Midget * m;
        ...
        this(ref Midget mdgt)
        {
            m = &mdgt;
            ...
        }
    }

It's a rather typical pattern in C++ for forwarding objects that need to store a reference/pointer to their parent and also nicely warn their user that a NULL pointer won't do. But I'm thinking this is unduly dangerous because the unwitting user can easily get all sorts of wrong code to compile:

    Widget makeACoolWidget()
    {
        Midget coolMidget;
        return Widget(coolMidget); // works! or...?
    }

The compiler's escape detection mechanism can't help quite a lot here because the escape hatch is rather indirect.

Initially I thought SafeD should prevent such escapes, whereas D allows them. Now I start thinking the pattern above is dangerous enough to be disallowed in all of D. How about this rule?

***************
Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
***************

This rule is powerful and leads to an honest style of programming: if you plan on escaping some thing's address, you make that clear in the public signature. The fix to the idiom above is:

    struct Widget
    {
        private Midget * m;
        ...
        this(Midget * mdgt)
        {
            enforce(mdgt);
            m = mdgt;
            ...
        }
    }

    Widget makeACoolWidget()
    {
        auto coolMidget = new Midget;
        return Widget(coolMidget); // works!
    }

Whaddaya think?

Andrei
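[Editor's sketch] For readers who want to try the pointer-based version, here is a minimal self-contained variant of the fix above. The `height` field and the `height()` accessor are hypothetical stand-ins, since the post elides the real members of `Midget` and `Widget`; `enforce` is the one from `std.exception`.

```d
import std.exception : enforce;

// Hypothetical stand-in: the post does not show Midget's members.
struct Midget
{
    int height = 1;
}

struct Widget
{
    private Midget* m;

    // Taking a pointer makes it visible in the signature that the
    // address is going to be stored beyond the call.
    this(Midget* mdgt)
    {
        enforce(mdgt !is null, "Widget needs a non-null Midget");
        m = mdgt;
    }

    // Hypothetical accessor, only here so the example can be checked.
    int height() const
    {
        return m.height;
    }
}

Widget makeACoolWidget()
{
    auto coolMidget = new Midget; // heap-allocated, so it outlives this call
    return Widget(coolMidget);
}

void main()
{
    auto w = makeACoolWidget();
    assert(w.height == 1);
}
```

Because the `Midget` lives on the heap, the `Widget` returned from `makeACoolWidget` does not point into a dead stack frame, which is the failure mode of the `ref` version.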
Feb 08 2009
On Sun, Feb 8, 2009 at 10:39 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> Whaddaya think?

I think yes. ;) I have only used ref params as either a performance optimization when passing structs or when I want to modify a value in the calling function. Allowing refs to escape seems really dangerous.
Feb 08 2009
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:gmo8hl$1687$1 digitalmars.com...
> ***************
> Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
> ***************
>
> This rule is powerful and leads to an honest style of programming: if you plan on escaping some thing's address, you make that clear in the public signature. The fix to the idiom above is:
>
>     struct Widget
>     {
>         private Midget * m;
>         ...
>         this(Midget * mdgt)
>         {
>             enforce(mdgt);
>             m = mdgt;
>             ...
>         }
>     }

Or something like:

    struct Widget
    {
        private Midget * m;
        ...
        this(NonNullable!(Midget*) mdgt)
        {
            m = mdgt;
            ...
        }
    }

;)

>     Widget makeACoolWidget()
>     {
>         auto coolMidget = new Midget;
>         return Widget(coolMidget); // works!
>     }
>
> Whaddaya think?

Sounds reasonable for the most part. My only concern is that (if I'm understanding it right) the null-check is moved from compile-time to run-time.
Feb 08 2009
On Sun, Feb 8, 2009 at 11:55 PM, Nick Sabalausky <a a.a> wrote:
> Sounds reasonable for the most part. My only concern is that (if I'm understanding it right) the null-check is moved from compile-time to run-time.

Not if D gets (non)nullable types ;))))
Feb 08 2009
Andrei Alexandrescu wrote:
> [snip]

What constitutes escaping? If any other functions are called with the parameter, even if they're const(T), it can still escape. It may be easiest to start with worrying about mutable references for now and then extend to const references later.

Also, please find another way than pointer vs. non-pointer to communicate if something can escape. Once upon a time, D was going to have scope parameters. I always assumed that meant no escape. I'd really love to see something along those lines come back.
Feb 08 2009
On Mon, 09 Feb 2009 08:10:31 +0300, Jason House <jason.james.house gmail.com> wrote:
> [snip]
> Also, please find another way than pointer vs. non-pointer to communicate if something can escape. Once upon a time, D was going to have scope parameters. I always assumed that meant no escape. I'd really love to see something along those lines come back.

Whereas I agree about pointers, non-escape should be default so scope is not useful. It is rather nonscope which is needed here.
Feb 08 2009
Jason House wrote:
> What constitutes escaping?

The callee tucks away the address of the ref parameter, or the address of a direct field of it.

> If any other functions are called with the parameter, even if they're const(T), it can still escape. It may be easiest to start with worrying about mutable references for now and then extend to const references later.

const is orthogonal.

Andrei
Feb 08 2009
Andrei Alexandrescu Wrote:
> Jason House wrote:
>> What constitutes escaping?
>
> The callee tucks away the address of the ref parameter, or the address of a direct field of it.

Ok, so this is head-escape analysis instead of transitive escape analysis. Still, there are a few issues that I can see:

1. This may impose a significant restriction for reference types - any use of them as an "in" parameter will be a compiler error. "in" does not imply no escape. (Right now, I think pure functions can modify their arguments, so it's still possible for pure functions to allow escaping of in parameters into other function input arguments.)

2. It's not just the address of immediate members. &a.b.c.d.e.f could also be an issue if b, c, d, and e are all value types. (This is actually quite minor, it just needs a slight definition change of what escaping means.)

3. How would you handle value type member functions that return ref variables? They could be references to their internal member variables or a ref to something else.

>> If any other functions are called with the parameter, even if they're const(T), it can still escape. It may be easiest to start with worrying about mutable references for now and then extend to const references later.
>
> const is orthogonal.

[If you] do this change and make const orthogonal, you need to change the default behavior of in parameters as well.
Feb 09 2009
On Mon, 09 Feb 2009 06:39:36 +0300, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> [snip]

I agree. It also grants a safe way to pass temporaries:

    int bar();
    int* gi;

    void foo(ref int i)
    {
        gi = &i;
    }

    foo(bar()); // unsafe
Feb 08 2009
Denis Koroskin wrote:
> I agree. It also grants safe way to pass temporaries:
>
>     int bar();
>     int* gi;
>
>     void foo(ref int i)
>     {
>         gi = &i;
>     }
>
>     foo(bar()); // unsafe

Well, rvalues still shouldn't bind to ref parameters, because I want to allow a function to return by ref a ref parameter:

    ref int foo(ref int i)
    {
        if (i == 0) ++i;
        return i;
    }

Binding an rvalue to a ref would essentially pass a zombie out of foo.

Andrei
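[Editor's sketch] The "return by ref a ref parameter" case can be tried out as below. This is written against present-day D rather than the 2009 compiler: the `return ref` annotation is how later D versions spell "this ref parameter may be returned by ref", and the `static assert` assumes default compiler flags, under which an rvalue does not bind to a `ref` parameter. `bar` is given a trivial body only so the example links.

```d
int bar()
{
    return 0;
}

// Returns a reference to its own ref parameter (pass-down and return only).
ref int foo(return ref int i)
{
    if (i == 0) ++i;
    return i;
}

void main()
{
    int x = 0;
    foo(x) += 10; // foo(x) is x itself: 0 -> 1 inside foo, then += 10
    assert(x == 11);

    // With default flags the temporary returned by bar() cannot bind to
    // the ref parameter, so no "zombie" reference can come out of foo.
    static assert(!__traits(compiles, foo(bar())));
}
```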
Feb 08 2009
I've used ref arguments in the past to wrap a C API that expects pointers. I'm fine with this so long as there is a way to break out of it (in regular D, at least) that makes it abundantly clear you need to know what you're doing. Something like:

    void wrapSomeCApi(ref Foo arg)
    {
        Foo* argptr = ref_unsafe_escape(arg);
        some_c_api(argptr);
    }

Incidentally, I don't suppose we can get ref variables while Walter's at it? :P

-- Daniel
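[Editor's sketch] `ref_unsafe_escape` is a hypothetical helper that does not exist in D. A version of the same wrapper that compiles today simply takes the address with `&`, which is permitted in non-`@safe` code. `Foo` and the body of `some_c_api` below are stand-ins so the example is self-contained; a real binding would only declare the C function.

```d
// Stand-in type; the post does not show Foo.
struct Foo
{
    int value;
}

// Stand-in for a C function that uses the pointer only during the call
// and does not store it anywhere.
extern (C) void some_c_api(Foo* p)
{
    p.value = 42;
}

// The wrapper exposes a ref to callers and keeps the pointer as an
// implementation detail.
void wrapSomeCApi(ref Foo arg)
{
    some_c_api(&arg);
}

void main()
{
    Foo f;
    wrapSomeCApi(f);
    assert(f.value == 42);
}
```

The address never outlives the call here, which is the situation the proposed rule would have to keep possible through some explicit opt-out.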
Feb 08 2009
Daniel Keep wrote:
> I've used ref arguments in the past to wrap a C api that expects pointers. I'm fine with this so long as there is a way to break out of it (in regular D, at least) that makes it abundantly clear you need to know what you're doing. Something like:
>
>     void wrapSomeCApi(ref Foo arg)
>     {
>         Foo* argptr = ref_unsafe_escape(arg);
>         some_c_api(argptr);
>     }
>
> Incidentally, I don't suppose we can get ref variables while Walter's at it? :P
>
> -- Daniel

Why aren't you passing a Foo*?
Feb 09 2009
Christopher Wright Wrote:
> Daniel Keep wrote:
>> I've used ref arguments in the past to wrap a C api that expects pointers. I'm fine with this so long as there is a way to break out of it (in regular D, at least) that makes it abundantly clear you need to know what you're doing. Something like:
>>
>>     void wrapSomeCApi(ref Foo arg)
>>     {
>>         Foo* argptr = ref_unsafe_escape(arg);
>>         some_c_api(argptr);
>>     }
>>
>> Incidentally, I don't suppose we can get ref variables while Walter's at it? :P
>>
>> -- Daniel
>
> Why aren't you passing a Foo*?

That's ok if Foo is a struct:

    struct Rect
    {
        int x, y, width, height;
    }

    class Widget
    {
        // returns true on success
        bool getBounds(ref Rect rect)
        {
            Rect* rectPtr = ref_unsafe_escape(rect);
            return gtk_widget_get_bounds(rectPtr) != 0;
        }
    }

Another question is, what is the benefit of it? Why not take an address directly?

    return gtk_widget_get_bounds(&rect) != 0;
Feb 09 2009
Denis Koroskin wrote:
> Christopher Wright Wrote:
>> Daniel Keep wrote:
>>> [snip]
>>
>> Why aren't you passing a Foo*?
>
> That's ok if Foo is a struct:
>
>     struct Rect
>     {
>         int x, y, width, height;
>     }
>
>     class Widget
>     {
>         // returns true on success
>         bool getBounds(ref Rect rect)
>         {
>             Rect* rectPtr = ref_unsafe_escape(rect);
>             return gtk_widget_get_bounds(rectPtr) != 0;
>         }
>     }
>
> Another question is, what is the benefit of it? Why not take an address directly?
>
>     return gtk_widget_get_bounds(&rect) != 0;

Because I treat pointers as "dragons be here" territory, and try to restrict them to as little of my code as humanly possible. I also feel that if a function call is using pointers as an implementation detail, I shouldn't have to specify it myself.

-- Daniel
Feb 09 2009
Daniel Keep wrote:
> I've used ref arguments in the past to wrap a C api that expects pointers. I'm fine with this so long as there is a way to break out of it (in regular D, at least) that makes it abundantly clear you need to know what you're doing. Something like:
>
>     void wrapSomeCApi(ref Foo arg)
>     {
>         Foo* argptr = ref_unsafe_escape(arg);
>         some_c_api(argptr);
>     }
>
> Incidentally, I don't suppose we can get ref variables while Walter's at it? :P

The entire scheme relies on ref not being allowed outside function signatures. If ref vars were allowed, they could be escaped.

Andrei
Feb 09 2009
Of course, enforce(mdgt); is there only for documentation purposes. Just like null dereference, it halts the program, right?
Feb 08 2009
Bartosz Milewski wrote:
> Of course, enforce(mdgt); is there only for documentation purposes. Just like null dereference, it halts the program, right?

What sort of documentation do you have that's able to stop a program in its tracks? :)

Yes, it's not a compile time check, it's a run time check. It's different from assert() in that it still happens even in release builds.

Later,
Brad
Feb 09 2009
My point is that it's a redundant check. Whether it is there or not, the result is the same--the program will halt. Maybe the error message from enforce will look nicer, but that's about it.

Brad Roberts Wrote:
> Bartosz Milewski wrote:
>> Of course, enforce(mdgt); is there only for documentation purposes. Just like null dereference, it halts the program, right?
>
> What sort of documentation do you have that's able to stop a program in its tracks? :)
>
> Yes, it's not a compile time check, it's a run time check. It's different from assert() in that it still happens even in release builds.
>
> Later,
> Brad
Feb 09 2009
On Mon, 09 Feb 2009 11:24:09 +0300, Bartosz Milewski <bartosz relisoft.com> wrote:
> My point is that it's a redundant check. Whether it is there or not, the result is the same--the program will halt. Maybe the error message from enforce will look nicer, but that's about it.

It will throw a recoverable Exception (an access violation is an Error, IIRC).
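[Editor's sketch] The difference being argued can be seen in a few lines: a failed `enforce` throws an ordinary `Exception` that the caller can catch and recover from, instead of the process dying on a null dereference. `Midget`, its `height` field, and `heightOf` are hypothetical names used only for this illustration.

```d
import std.exception : enforce;

// Hypothetical stand-in type.
struct Midget
{
    int height = 1;
}

int heightOf(Midget* m)
{
    // Throws Exception when m is null, before any dereference happens.
    enforce(m !is null, "null Midget");
    return m.height;
}

void main()
{
    bool caught = false;
    try
    {
        heightOf(null);
    }
    catch (Exception e)
    {
        caught = true;
        assert(e.msg == "null Midget");
    }
    assert(caught); // the program keeps running after the failed check
}
```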
Feb 09 2009
Denis Koroskin wrote:
> On Mon, 09 Feb 2009 11:24:09 +0300, Bartosz Milewski <bartosz relisoft.com> wrote:
>> My point is that it's a redundant check. Whether it is there or not, the result is the same--the program will halt. Maybe the error message from enforce will look nicer, but that's about it.
>
> It will throw a recoverable Exception (an access violation is an Error, IIRC).

And a segfault is a hard stop, unless you have a signal handler for it.
Feb 09 2009
"Andrei Alexandrescu" wrote
> The compiler's escape detection mechanism can't help quite a lot here because the escape hatch is rather indirect. Initially I thought SafeD should prevent such escapes, whereas D allows them. Now I start thinking the pattern above is dangerous enough to be disallowed in all of D. How about this rule?
>
> ***************
> Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
> ***************

As long as there is a way to circumvent this, I'm OK with this rule. Something that's the equivalent of a cast. Two reasons:

1. Using a * dereference pointer inside a function for all usages is sometimes tedious. This would be a non-issue if ref local variables were allowed, i.e.:

    void foo(int *x)
    {
        ref int rx = *x;
        // use rx until you need to copy the address of x.
    }

2. You may need to call functions you have no control over that take a pointer but do not save a reference to it, e.g. system calls.

-Steve
Feb 09 2009
Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> [snip]
>> ***************
>> Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
>> ***************
>
> As long as there is a way to circumvent this, I'm OK with this rule. Something that's the equivalent of a cast. Two reasons:
>
> 1. Using a * dereference pointer inside a function for all usages is sometimes tedious. This would be a non issue if ref local variables were allowed, i.e.:
>
>     void foo(int *x)
>     {
>         ref int rx = *x;
>         // use rx until you need to copy the address of x.
>     }
>
> 2. you may need to call functions you have no control over that take a pointer but do not save a reference to it. e.g. system calls.
>
> -Steve

Yah, an explicit cast ref T -> T* must be still allowed.

Andrei
Feb 09 2009
What bothers me is that this is equivalent to saying that a seg fault caused by null dereference can be caught only if the programmer puts explicit runtime checks before it happens. I would say that reeks of C philosophy, except that in most C++ implementations I've been working with you can simply catch a seg fault.

I wouldn't mind not being able to catch a seg fault in a language where it's impossible to have an uninitialized reference. But both in Java and in D it's very easy to get into this situation (in fact, it's easier in D) because of hidden reference semantics of class objects. Which ties nicely with the discussion of nullable types.

Christopher Wright Wrote:
> Denis Koroskin wrote:
>> On Mon, 09 Feb 2009 11:24:09 +0300, Bartosz Milewski <bartosz relisoft.com> wrote:
>>> My point is that it's a redundant check. Whether it is there or not, the result is the same--the program will halt. Maybe the error message from enforce will look nicer, but that's about it.
>>
>> It will throw a recoverable Exception (an access violation is an Error, IIRC).
>
> And a segfault is a hard stop, unless you have a signal handler for it.
Feb 09 2009
Bartosz Milewski wrote:
> What bothers me is that this is equivalent to saying that a seg fault caused by null dereference can be caught only if the programmer puts explicit runtime checks before it happens. I would say that reeks of C philosophy, except that in most C++ implementations I've been working with you can simply catch a seg fault. I wouldn't mind not being able to catch a seg fault in a language where it's impossible to have an uninitialized reference. But both in Java and in D it's very easy to get into this situation (in fact, it's easier in D) because of hidden reference semantics of class objects. Which ties nicely with the discussion of nullable types.

That is annoying, and there are libraries that fix it. It's already handled on Windows by default; why isn't it handled on Linux?
Feb 11 2009
On Wed, Feb 11, 2009 at 8:19 AM, Christopher Wright <dhasenan gmail.com> wrote:
> Bartosz Milewski wrote:
>> [snip]
>
> That is annoying, and there are libraries that fix it. It's already handled on Windows by default; why isn't it handled on Linux?

Probably because on Windows, segfaults are reported through the same exception handling mechanism that the D compiler makes use of, while on Linux, they come in through signal handlers. I don't know how easy it is to start unwinding (or completely unwind) the call stack in the signal handler, but the impression I get is that it's not fun.
Feb 11 2009
On 2009-02-08 22:39:36 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:
> ***************
> Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
> ***************

Isn't the "this" argument for struct a ref now? So you can't do this any longer:

    static A*[] listOfRegisteredA;

    struct A
    {
        void register()
        {
            listOfRegisteredA ~= &this;
        }
    }

> This rule is powerful and leads to an honest style of programming: if you plan on escaping some thing's address, you make that clear in the public signature.

But you can't change the "this" parameter to not be a ref, can you?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Feb 10 2009
Michel Fortin wrote:
> On 2009-02-08 22:39:36 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:
>> ***************
>> Rule: ref parameters are PASS-DOWN and RETURN only. No escaping of addresses of ref parameters is allowed. If you want to escape the address of a ref parameter, use a pointer in the first place.
>> ***************
>
> Isn't the "this" argument for struct a ref now? So you can't do this any longer:
>
>     static A*[] listOfRegisteredA;
>
>     struct A
>     {
>         void register()
>         {
>             listOfRegisteredA ~= &this;
>         }
>     }

You shouldn't do that anyway as structs can be moved freely.

>> This rule is powerful and leads to an honest style of programming: if you plan on escaping some thing's address, you make that clear in the public signature.
>
> But you can't change the "this" parameter to not be a ref, can you?

It's a ref.

Andrei
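[Editor's sketch] One way to see why a stored `&this` is fragile for value types: once the struct is copied (or moved), the recorded address still refers to the old location. The `self` field and `registerSelf` method below are hypothetical, chosen to keep the example deterministic instead of using a global list.

```d
struct A
{
    int id;
    A* self; // address recorded by registerSelf

    void registerSelf()
    {
        self = &this; // escapes the address of the implicit ref "this"
    }
}

void main()
{
    A a = A(1);
    a.registerSelf();

    A b = a; // plain bitwise copy of a value type

    assert(a.self is &a);
    assert(b.self is &a);  // the copy still points at the original
    assert(b.self !is &b); // and not at itself
}
```

If `a` were a local that went out of scope while `b` was returned to the caller, `b.self` would be left dangling, which is the kind of escape the proposed rule is meant to rule out.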
Feb 10 2009
Andrei Alexandrescu wrote:
> You shouldn't do that anyway as structs can be moved freely.
>
> Andrei

Where does the language spec state or suggest that? A subset of structs can be (those that contain no internal pointers nor have had their address taken (including references)), but it's not a generally true fact, as far as I recall.

Later,
Brad
Feb 10 2009
Brad Roberts wrote:
> Andrei Alexandrescu wrote:
>> You shouldn't do that anyway as structs can be moved freely.
>>
>> Andrei
>
> Where does the language spec state or suggest that? A subset of structs can be (those that contain no internal pointers nor have had their address taken (including references)), but it's not a generally true fact, as far as I recall.

We plan to do that for D2. It would avoid all issues with C++'s copy construction and rvalue references. Structs with internal pointers will be allowed in D2 only if manipulated exclusively through pointers.

Andrei
Feb 10 2009