digitalmars.D - Scope and Ref and Borrowing
- Walter Bright (124/124) Nov 13 2014 Thought I'd bring this up as deadalnix is working on a related proposal....
- Manu via Digitalmars-d (8/144) Nov 13 2014 I'm not quite clear; are you suggesting 'scope ref' would be a storage
- Walter Bright (5/12) Nov 13 2014 Only for function types.
- Walter Bright (3/4) Nov 13 2014 Found it:
- Foo (1/1) Nov 14 2014 Reminds me a bit of http://wiki.dlang.org/DIP36
- Kagamin (7/17) Nov 14 2014 ref T f(ref T delegate() dg)
- Freddy (12/156) Nov 17 2014 Why not make the compiler copy the variable whose address is
- Marc Schütz <schuetzm gmx.net> (4/15) Nov 18 2014 It would be safe, but it'd be another hard to detect source of GC
- Marco Leise (18/18) Nov 18 2014 I have to say I mostly settled with Marc Schütz's approach to
Thought I'd bring this up as deadalnix is working on a related proposal. It uses 'scope' in conjunction with 'ref' to resolve some long standing @safe issues.

---------------------------------------

**Background

The goal of @safe code is that it is guaranteed to be memory safe. This is mostly achieved, but there's a gaping hole - returning pointers to stack objects when those objects are out of scope. This is memory corruption.

The simple cases of this are disallowed:

    T* func(T t) {
        T u;
        return &t; // Error: escaping reference to local t
        return &u; // Error: escaping reference to local u
    }

But this is easily circumvented:

    T* func(T t) {
        T* p = &t;
        return p;  // no error detected
    }

@safe deals with this by preventing taking the address of a local:

    T* func(T t) @safe {
        T* p = &t; // Error: cannot take address of parameter t in @safe function func
        return p;
    }

But this is awfully restrictive. So the 'ref' storage class was introduced which defines a special purpose pointer. 'ref' can only appear in certain contexts, in particular function parameters and returns, only applies to declarations, cannot be stored, and cannot be incremented.

    ref T func(T t) @safe {
        return t; // Error: escaping reference to local variable t
    }

Ref can be passed down to functions:

    void func(ref T t) @safe;
    void bar(ref T t) @safe {
        func(t); // ok
    }

But the following idiom is far too useful to be disallowed:

    ref T func(ref T t) @safe {
        return t; // ok
    }

And if it is misused it can result in stack corruption:

    ref T foo() @safe {
        T t;
        return func(t); // no error detected, despite returning pointer to t
    }

The purpose of this proposal is to detect these cases at compile time and disallow them. Memory safety is achieved by allowing pointers to stack objects to be passed down the stack, but those pointers may not be saved into non-stack objects or stack objects higher on the stack, and may not be passed up the stack past where they are allocated.

The:

    return func(t);

case is detected by all of the following conditions being true:

1. foo() returns by reference
2. func() returns by reference
3. func() has one or more parameters that are by reference
4. 1 or more of the arguments to those parameters are stack objects local to foo()
5. Those arguments can be @safe-ly converted from the parameter type to the return type. For example, if the return type is larger than the parameter type, the return type cannot be a reference to the argument. If the return type is a pointer, and the parameter type is a size_t, it cannot be a reference to the argument.

The larger a list of these cases can be made, the more code will pass @safe checks without requiring further annotation.

**Scope Ref

The above solution is correct, but a bit restrictive. After all, func(t, u) could be returning a reference to non-local u, not local t, and so should work. To fix this, introduce the concept of 'scope ref':

    ref T func(scope ref T t, ref T u) @safe {
        return t; // Error: escaping scope ref t
        return u; // ok
    }

Scope means that the ref is guaranteed not to escape.

    T u;
    ref T foo() @safe {
        T t;
        return func(t, u); // ok, u is not local
        return func(u, t); // Error: escaping scope ref t
    }

This scheme minimizes the number of 'scope' annotations required.

**Out Parameters

'out' parameters are treated like 'ref' parameters for the purposes of this document.

**Inference

Many functions can infer pure, @safe, and @nogc. Those same functions can infer which ref parameters are 'scope', without needing user annotation.

**Mangling

Scope will require additional name mangling, as it affects the interface of the function.

**Nested Functions

Nested functions have more objects available than just their arguments:

    ref T foo() @safe {
        T t;
        ref T func() { return t; }
        return func(); // should be disallowed
    }

On the plus side the body of the nested function is available to the compiler for examination.

**Delegates and Closures

This one is a little harder; but the compiler can detect that t is taken by reference, and then can assume that dg() is returning t by reference and disallow it.

    ref T foo() @safe {
        T t;
        ref T func() { return t; }
        auto dg = &func;
        return dg(); // should be disallowed
    }

**Overloading

Scope does not affect overloading, i.e.:

    T func(scope ref T a);
    T func(ref T b);

are considered the same as far as overloading goes.

**Inheritance

Overriding functions inherit any 'scope' annotations from their antecedents.

**Limitations

Arrays of references are not allowed.
Struct and class fields that are references are not allowed.
Non-parameter variables cannot be references.
Nov 13 2014
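The five conditions listed in the original post can be read as a single predicate over a call site. The following is a minimal sketch of that reading, not compiler code: the names `RefArg`, `CallSite`, and `mustReject` are hypothetical and appear neither in the proposal nor in dmd, and condition 5 is collapsed into one boolean because the post leaves the list of convertible cases open.

```d
/// Hypothetical description of one argument bound to a `ref` parameter of func().
struct RefArg
{
    bool localToCaller;       // condition 4: a stack object local to foo()
    bool convertibleToReturn; // condition 5: the return value could refer to it
}

/// Hypothetical description of the statement `return func(args);` inside foo().
struct CallSite
{
    bool callerReturnsByRef;  // condition 1
    bool calleeReturnsByRef;  // condition 2
    RefArg[] refArgs;         // condition 3 holds when this is non-empty
}

/// True when all five conditions hold, i.e. the call has to be rejected.
bool mustReject(CallSite site)
{
    if (!site.callerReturnsByRef || !site.calleeReturnsByRef)
        return false;
    foreach (arg; site.refArgs)
        if (arg.localToCaller && arg.convertibleToReturn)
            return true;
    return false;
}

void main()
{
    // ref T foo() { T t; return func(t); } is rejected
    assert(mustReject(CallSite(true, true, [RefArg(true, true)])));
    // the argument is a global, not a local of foo(): accepted
    assert(!mustReject(CallSite(true, true, [RefArg(false, true)])));
    // foo() returns by value, so nothing escapes by reference: accepted
    assert(!mustReject(CallSite(false, true, [RefArg(true, true)])));
}
```

Under the 'Scope Ref' refinement, an argument bound to a `scope ref` parameter would presumably be left out of `refArgs`, since such a parameter is guaranteed not to be returned.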
On 14 November 2014 11:20, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Thought I'd bring this up as deadalnix is working on a related proposal. It uses 'scope' in conjunction with 'ref' to resolve some long standing @safe issues.
> [...]

I'm not quite clear; are you suggesting 'scope ref' would be a storage class? (looks like it, but it also affects mangling?)

Please, consider Marc Schütz's existing proposal. Please, please, don't make scope a storage class.

I have a lot of other issues with this proposal, but maybe I misunderstand it in principle...?
Nov 13 2014
On 11/13/2014 7:41 PM, Manu via Digitalmars-d wrote:
> I'm not quite clear; are you suggesting 'scope ref' would be a storage class?

Yes.

> (looks like it, but it also affects mangling?)

Only for function types.

> Please, consider Marc Schütz's existing proposal. Please, please, don't make scope a storage class.

Got a handy link to it?

> I have a lot of other issues with this proposal, but maybe I misunderstand it in principle...?

I have no idea what your other issues are and what they may be based on :-)
Nov 13 2014
On 11/13/2014 9:36 PM, Walter Bright wrote:
> Got a handy link to it?

Found it: http://wiki.dlang.org/User:Schuetzm/scope
Nov 13 2014
On Friday, 14 November 2014 at 01:21:07 UTC, Walter Bright wrote:
> ref T foo() @safe {
>     T t;
>     ref T func() { return t; }
>     auto dg = &func;
>     return dg(); // should be disallowed
> }

    ref T f(ref T delegate() dg)
    {
        return dg(); //ok?
    }

On Friday, 14 November 2014 at 05:38:34 UTC, Walter Bright wrote:
> On 11/13/2014 9:36 PM, Walter Bright wrote:
>> Got a handy link to it?
>
> Found it: http://wiki.dlang.org/User:Schuetzm/scope

http://forum.dlang.org/post/etjuormplgfbomwdrurp forum.dlang.org
Nov 14 2014
On Friday, 14 November 2014 at 01:21:07 UTC, Walter Bright wrote:
> Thought I'd bring this up as deadalnix is working on a related proposal. It uses 'scope' in conjunction with 'ref' to resolve some long standing @safe issues.
> [...]

Why not make the compiler copy the variable whose address is taken to the heap like how a closure does it? Would it be hard to optimize that away?

----
int* c;
int main(){
    int a;
    int* b=&a;//a is now a reference to the heap
    c=&a;//safe
}
----
Nov 17 2014
On Monday, 17 November 2014 at 20:07:54 UTC, Freddy wrote:
> Why not make the compiler copy the variable whose address is taken to the heap like how a closure does it? Would it be hard to optimize that away?
> ----
> int* c;
> int main(){
>     int a;
>     int* b=&a;//a is now a reference to the heap
>     c=&a;//safe
> }
> ----

It would be safe, but it'd be another hard-to-detect source of GC allocations. Besides safety, an important purpose of borrowing is to avoid those.
Nov 18 2014
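For context on the allocation concern above: D already moves a captured local to the GC heap when a delegate that uses it outlives the enclosing function, which is the closure behaviour the suggestion would extend to address-taking. A minimal sketch follows; `makeCounter` is a hypothetical name chosen for illustration.

```d
/// Returns a delegate that outlives this call. Because the delegate uses
/// `count`, the compiler places `count` in a GC-allocated closure instead
/// of on the stack. Nothing in the source marks that allocation, and
/// annotating this function with @nogc makes the compiler reject it.
int delegate() makeCounter()
{
    int count = 0;
    return () => ++count;
}

void main()
{
    auto next = makeCounter();
    // `count` is still alive here because it lives on the heap.
    assert(next() == 1);
    assert(next() == 2);

    // Each call to makeCounter gets its own closure, hence its own allocation.
    auto other = makeCounter();
    assert(other() == 1);
}
```

Applying the same promotion to every local whose address is taken would keep such code memory safe, at the cost of allocations that are just as invisible at the point where they happen.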
I have to say I mostly settled with Marc Schütz's approach to lifetime management already, because it allows us to return scope ref arguments from functions by directly declaring which arguments the result can be referring to. So you can for example write sub-string search algorithms that can safely return part of the (scoped) input string. There were also two or three other topics on the NG where it seemed like his proposal would solve a problem with D's expressiveness, contributing to an ever growing list of use-cases that made me think that it is worth the added complexity. It borrows from Rust's life-time checking, but without getting too complex.

That said, more inference of attributes is always positive and up to the point where you introduce a conflicting definition of what scope arguments do, the proposals look compatible. So +1 to that.

--
Marco
Nov 18 2014
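The sub-string search use-case mentioned above can be made concrete. This is a minimal sketch in plain D with no lifetime annotations at all; it uses neither proposal's syntax, and `findSub` is a hypothetical helper. It only shows the aliasing relationship that an annotation would have to express: the result is a view into the first argument, not a copy.

```d
/// Hypothetical helper: returns the first occurrence of `needle` as a slice
/// of `haystack` (no copy is made), or null when there is no match.
const(char)[] findSub(const(char)[] haystack, const(char)[] needle)
{
    if (needle.length > haystack.length)
        return null;
    foreach (i; 0 .. haystack.length - needle.length + 1)
    {
        if (haystack[i .. i + needle.length] == needle)
            return haystack[i .. i + needle.length];
    }
    return null;
}

void main()
{
    char[16] buf = "hello borrowing!"; // stack storage
    auto hit = findSub(buf[], "borrow");
    assert(hit == "borrow");
    assert(hit.ptr == buf.ptr + 6);    // the result points into buf itself
    assert(findSub(buf[], "xyz") is null);
    // `hit` must not outlive `buf`; that is the constraint a lifetime
    // annotation on the signature would let the compiler check.
}
```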