digitalmars.D - scope escaping
- Adam D. Ruppe (141/141) Feb 06 2014 Let's see if we can make this work in two steps: first, making
- Adam D. Ruppe (120/120) Feb 06 2014 Making scope the default
- Adam D. Ruppe (97/97) Feb 06 2014 Sorry, my lines got mangled, let me try pasting it again.
- Matej Nanut (4/7) Feb 06 2014 I just stumbled upon Rust's memory management scheme yesterday and it
- Adam D. Ruppe (19/21) Feb 06 2014 Yeah, I haven't used rust but I have read about it, and the more
- Elie Morisse (7/8) Feb 06 2014 How about letting the compiler decide what's best in the default
- Adam D. Ruppe (10/12) Feb 06 2014 The problem there is the compiler would have to look at the big
- Paulo Pinto (4/14) Feb 06 2014 Java and Go compilers do it, why not D ones?
- Adam D. Ruppe (2/3) Feb 06 2014 Perhaps it could work, I don't really know.
- Paulo Pinto (8/11) Feb 07 2014 With escape analysis, something that DMD as far as I know doesn't
- Benjamin Thaut (6/6) Feb 06 2014 Another idea. I would totaly love that behaviour.
- Namespace (3/9) Feb 06 2014 +1
- Adam D. Ruppe (9/11) Feb 06 2014 Absolutely. In fact, generically, any scope item could be moved
- Benjamin Thaut (6/15) Feb 06 2014 Count me in on supporting this feature. I played with writing a DIP for
- Dicebot (4/10) Feb 06 2014 That was basis of old rejected `scope ref` proposal for rvalue
- Marco Leise (5/19) Feb 06 2014 Why would anyone reject this?
- Meta (11/110) Feb 06 2014 This, along with an actual implementation of scope would be a
- Adam D. Ruppe (4/7) Feb 06 2014 I don't agree with that - I think this could be done pretty much
- Meta (4/12) Feb 06 2014 I know very little about compilers, but wouldn't figuring out if
- Adam D. Ruppe (64/67) Feb 06 2014 I might be using the word wrong too, but I don't think so. All
- Dicebot (34/35) Feb 06 2014 Had only quick look at it but here are some things to remember
- Adam D. Ruppe (45/56) Feb 06 2014 I think it is very important to put on the return value
- Marc Schütz (79/85) Feb 08 2014 (I see now that Adam already talked about this in his second
Let's see if we can make this work in two steps: first, making the existing scope storage class work, and second, considering making it the default.

First, let's define it. A scope reference may never escape its scope. This means:

0) Note that scope is irrelevant on value types. I believe it is also mostly irrelevant on references to immutable data (such as strings) since they are de facto value types. (A side effect of this: immutable stack data is wrong... which is an arguable point, since correct enforcement of slices into it would let you maintain the immutable illusion. Hmm, both sides have good points.)

Nevertheless, while the immutable reference can be debated, scope definitely doesn't matter on value types. While it might be there, I think it should just be a no-op.

1) It or its address must never be assigned to a higher scope. (The compiler currently disallows rebinding scope variables, which I think does achieve this, but is more blunt than it needs to be. If we want to disable rebinding, let's do that on a type-by-type basis, e.g. disabling postblit on a unique ptr.)

    void foo() {
        int[] outerSlice;
        {
            scope int[] innerSlice = ...;
            outerSlice = innerSlice; // error
            innerSlice = innerSlice[1 .. $]; // I think this should be ok
        }
    }

Parameters and return values are considered the same level for this, since the parameter and return value both belong to the caller. So:

    int[] foo() {
        int[15] staticBuffer;
        scope int[] slice = staticBuffer[];
        return slice; // illegal, return value is one level higher than inner function
    }

    // OK, you aren't giving the caller anything they don't already have
    scope char[] strchr(scope char[] s, char[]) { return s; }

It is acceptable to pass it to a lower scope.

    int average(in int[]); // in == const scope

    void foo() {
        int[15] staticBuffer;
        scope int[] slice = staticBuffer[];
        int avg = average(slice); // OK, passing to inner scope is fine
    }

The results of slice.ptr and &slice on a scope slice must themselves be scope.
Yes, scope MUST work on function return values as well as parameters and variables. This is an absolute necessity for any degree of sanity, which I'll talk about more in my next numbered point.

BTW I keep using slices into static buffers here because that's the main real-world concern we should keep in mind. A static buffer is a strictly-scoped owned container built right into the language. We know it is wrong to return a reference to stack data, and we know why. Conversely, we have a pretty good idea about what *can* work with it. Scope, if we do it right, should statically catch misuses of static array slices while allowing proper uses.

So when in doubt about something, ask: does this make sense when referring to a static buffer slice?

2) scope must be carried along with the variable at every step of its life. (In this sense, it starts to look more like a type constructor than a storage class, but I think it is slightly different still.)

    void foo() {
        int[] outerSlice;
        {
            int[16] staticBuffer;
            scope int[] innerSlice = staticBuffer[]; // OK
            int[] cheatingSlice = innerSlice; // uh oh, no good because...
            outerSlice = cheatingSlice; // ...it enables this
        }
    }

A potential workaround is to require every assignment to also be scope.

    scope int[] cheatingSlice = innerSlice; // OK
    outerSlice = cheatingSlice; // this is still disallowed, so cool

It is very important that this also applies through function return values, since otherwise:

    T identity(T)(scope T t) { return t; }

can and will escape references. Consider strchr on a static stack array. We do NOT want that to return a pointer to the stack memory after it ceases to exist.

Thus, that identity function should be illegal, with an error like "cannot return scope from a non-scope function". We'll allow it by marking the return value as scope as well. (Again, this sounds a lot like a type constructor.)
3) structs are considered reference types if ANY of their members are reference types (unless specifically noted otherwise, see my following post about default and encapsulation for details). Thus, the scope rules may apply to them:

    struct Holder {
        int[] foo;
    }

    Holder h;

    void test(scope int[] f) {
        h.foo = f; // must be an error, f is escaping to global scope directly
        h = Holder(f); // this must also be an error, f is escaping indirectly
    }

The constructed Holder inside would have to inherit the scopiness of f. This might be the trickiest part of getting this right (though it is kinda neatly solved if scope is default :) )

a) A struct constructed with a scope variable itself must be scope, and thus all the rules apply to it.

b) Assigning to a struct which is not scope, even if it is a local variable, must not be permitted.

    Holder h2;
    h2.foo = f; // this isn't escaping the scope, but is dropping scope

Just as if we had a local variable of type int[]. We may make the struct scope:

    scope Holder h2;
    h2.foo = f; // OK

c) Calling methods on a struct which may escape the scope is wrong. Ideally, `this` would always be scope... in fact, I think that's the best way to go. An alternative though might be to restrict calling of non-pure functions. Pure functions don't allow mutation of non-scope data in the first place, so they shouldn't be able to escape references.

I think that covers what I want. Note that this is not necessarily @safe:

    struct C_Array(T) {
        /* grows with malloc */
        scope T* borrow() {}
    }

    C_Array!int i;
    int* b = i.borrow;
    i ~= 10; // might realloc...
    // leaving b dangling

So it isn't necessarily @safe. I think it *would* be safe with static arrays. BTW static array slicing should return scope, as should most user-defined containers. But with user-defined types, safety is still in the hands of the programmer. Reallocing with a non-sealed reference should always be considered @trusted.
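To make rule 1 a little more concrete, here is a minimal sketch in present-day D of the "pass it down, never hand it up" pattern with a static buffer. The function name `averageOfLocalBuffer` and the body of `average` are invented for this example, and whether the compiler actually enforces the `scope` annotation depends on the compiler version and flags, so treat the annotation as a statement of intent here.

```d
import std.stdio;

// Only reads its argument and does not keep the slice anywhere.
int average(in int[] values)
{
    if (values.length == 0)
        return 0;
    long sum = 0;
    foreach (v; values)
        sum += v;
    return cast(int)(sum / cast(long) values.length);
}

// Hypothetical helper: owns a stack buffer and lends a view of it downward.
int averageOfLocalBuffer()
{
    int[4] staticBuffer = [2, 4, 6, 8];
    scope int[] slice = staticBuffer[]; // borrowed view of stack memory
    return average(slice);              // passing to an inner scope is fine
    // `return slice;` with an int[] return type is the case the rules above reject
}

void main()
{
    writeln(averageOfLocalBuffer());
}
```

The value that leaves `averageOfLocalBuffer` is a plain `int`, so nothing referring to `staticBuffer` outlives the call.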
Stand by for my next post which will discuss making it default, with a few more points relevant to the whole concept.
Feb 06 2014
Sorry, my lines got mangled, let me try pasting it again.

Making scope the default
=======================

There are five points to discuss:

1) All variables are assumed to be marked with scope implicitly.

2) The exception is structs with a special annotation which marks that they encapsulate a resource. An encapsulated resource explicitly marked scope at the usage site is STILL scope, but it will not implicitly inherit the scopiness of the member reference.

    @encapsulated_resource
    struct RefCounted(T) {
        T t; // the scopiness of this would not be propagated to
             // RefCounted itself
    }

This lets us write structs to manage raw pointers (etc.) as an escape from the rules. Note you may also write

    @encapsulated_resource
    struct Borrowed(T) {}

as an escape from the rules. Using this would of course be at your own risk, analogous to @trusted code.

3) Built-in allocations return GC!T instead of T. GC!T's definition is:

    @encapsulated_resource
    struct GC(T) {
        private T _managed_payload;
        /* force_inline */
        /* implicit scope return value */
        @safe nothrow inout(T) borrow() {
            return _managed_payload;
        }
        alias borrow this;
    }

NOTE: if inout(T) there doesn't work for const correctness, we need to fix const on wrapped types; an orthogonal issue.

If you don't care about ownership, the alias this gives you a naked borrowed reference whenever needed. If you do care about ownership:

    auto foo = new Foo();
    static assert(is(typeof(foo) == GC!Foo));

letting you store it with confidence without additional steps or assumptions.

When passing to a template, if you want to explicitly borrow it, you might write borrow. Otherwise, IFTI will see the whole GC!T type. This is important if we want to write owned identity templates.

If an argument is scope, ownership is irrelevant. We might strip it off but I don't think that's necessary... might help avoid template bloat though.

4) All other types remain the same. Yes, typeof(this) == T, NEVER GC!T.
Again, remember the rule of thumb: would this work with a static stack buffer?

    class Foo {
        Foo getMe() { return this; }
    }

    ubyte[__traits(classInstanceSize, Foo)] buffer;
    Foo f = emplace!Foo(buffer); // ok so far, f is scope
    GC!Foo gc = f.getMe(); // obviously wrong, f is not GC

The object does not control its own allocation, so it does not own its own memory. Thus, `this` is *always* borrowed.

Does this work if building a tree:

    class Tree {
        Tree[] children;
        Tree addChild(Tree t) { children ~= t; }
    }

addChild there would *not* compile, since it escapes the t into the object's scope. Tree would need to know ownership: make children and addChild take GC!Tree instead, for example, then it will work.

What if addChild wants to set t.parent = this;? That wouldn't be possible (without using a trust-me Borrowed!T wrapper)... and while this would break some of my code... I say unto you, such code was already broken, because the parent might be emplaced on a stack buffer!

    GC!Tree child = new Tree();
    {
        ubyte[...] stack;
        Owned!Tree parent = emplace!Tree(stack[]);
        parent.addChild(child);
    }
    child.parent; // bug city

Instead, addChild should request its own ownership.

    Tree addChild(GC!Tree child, GC!Tree _this) {
        children ~= child;
        child.parent = _this;
    }

Then, the buggy scenario above does not compile, while making it possible to do the correct thing, storing a (verified) GC reference in the object graph.

I understand that would be a bit of a pain, but you agree it is more correct, yes? So that might be worthwhile breakage (especially since we're talking about potentially large breakage already).

5) Interaction with @safe is something we can debate. @safe works best with the GC, but if we play our scope cards right, memory corruption via stack stuff can be statically eliminated too, thus making some variants of emplace @safe too. So I don't think even @safe functions can assume this == GC, and even if they could, we shouldn't, since it limits us from legitimate optimizations.
So I think the @safe rules should stay exactly as they are now. Wrapper structs that do things like malloc/realloc might be @system because it would still be possible for a borrowed pointer to be invalidated when they realloc (note this is not the case with GC, which is safe even through growth reallocations).

So @safe and scope are separate issues.
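For readers who want to poke at the GC!T idea with a current compiler, here is a rough sketch of the wrapper as an ordinary library type. Nothing here changes what `new` returns: `gcNew` is a hypothetical stand-in for that, the @encapsulated_resource annotation is left out because it does not exist, and `Foo` and `readValue` are made-up names for the demonstration.

```d
import std.stdio;

// Library-level approximation of the GC!T owner type from the post.
struct GC(T)
{
    private T _managedPayload;

    // Hands out the wrapped reference; under the proposal this result would be scope.
    inout(T) borrow() inout @safe nothrow
    {
        return _managedPayload;
    }

    alias borrow this;
}

// Hypothetical replacement for `new T(args)` that records GC ownership in the type.
GC!T gcNew(T, Args...)(Args args) if (is(T == class))
{
    return GC!T(new T(args));
}

class Foo
{
    int value;
    this(int value) { this.value = value; }
}

// Code that does not care about ownership asks for the naked type.
int readValue(Foo f)
{
    return f.value;
}

void main()
{
    auto foo = gcNew!Foo(42);
    static assert(is(typeof(foo) == GC!Foo)); // ownership is visible in the type
    writeln(readValue(foo));                  // alias this borrows implicitly
}
```

The `alias borrow this` line is what lets ownership-agnostic code keep its existing signatures, while templates that receive `foo` still see the full `GC!Foo` type.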
Feb 06 2014
On 6 Feb 2014 16:56, "Adam D. Ruppe" <destructionator gmail.com> wrote:
> Making scope the default
> =======================
> [...]

I just stumbled upon Rust's memory management scheme yesterday and it seemed similar to this. At first glance, I really like it.
Feb 06 2014
On Thursday, 6 February 2014 at 18:29:48 UTC, Matej Nanut wrote:
> I just stumbled upon Rust's memory management scheme yesterday and it seemed similar to this.

Yeah, I haven't used Rust but I have read about it, and the more I think about it, the more I realize it really isn't that new - it is just formalizing what we already do as programmers.

Escaping a reference to stack data is always wrong. We know this and try not to do it. The language barely helps with this though - we're on our own. We can't even be completely sure a reference actually is GC since it might be on the stack without us realizing it.

So what the Rust system and my proposal (which I'm pretty sure is simpler than the Rust one - it doesn't catch all the problems, but should be easier to implement and use for the majority of cases) do is try to get the language to help us get this right.

It's the same thing with, say, error handling. In C, you know you have to clean up after a failed operation and you have to do it yourself. This is often done by checking return values and jumping to cleanup code with goto. In D, we have struct destructors, scope(failure), and exceptions to help us do the same task with less work and more confidence.
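The error-handling comparison can be shown with a small D sketch of scope guards. The `events` log and the `riskyStep` function are invented purely for illustration; the point is only that the cleanup is declared next to the acquisition and the language runs it.

```d
import std.stdio;

// Illustrative log of what happened, in order.
string[] events;

void riskyStep(bool fail)
{
    events ~= "acquire";
    scope (exit) events ~= "release";      // runs however the function is left
    scope (failure) events ~= "rollback";  // runs only if an exception propagates

    if (fail)
        throw new Exception("operation failed");
    events ~= "commit";
}

void main()
{
    riskyStep(false);
    writeln(events);

    events = null;
    try
        riskyStep(true);
    catch (Exception e)
        writeln("caught: ", e.msg);
    writeln(events);
}
```

Scope guards run in reverse order of declaration, so on the failing path the rollback happens before the release.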
Feb 06 2014
On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
> Making scope the default

How about letting the compiler decide what's best in the default case?

· if a global reference to the variable escapes or a reference is returned by a function ⇒ GC-allocated
· otherwise ⇒ scoped to where the last reference to the variable is seen by static analysis
Feb 06 2014
On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
> How about letting the compiler decide what's best in the default case?

The problem there is the compiler would have to look at the big picture to make an informed decision, and big picture decisions are generally hard to implement.

Determining whether it is GC or not automatically would require analysis of the function body, tracing where each reference ends up, and looking at other functions it gets passed to (which might not be possible if you have only the prototype without a body).

Things like pure can help with it, but generally, I don't think the compiler can make a smart decision.
Feb 06 2014
On 06.02.2014 21:29, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
>> How about letting the compiler decide what's best in the default case?
>
> The problem there is the compiler would have to look at the big picture to make an informed decision, and big picture decisions are generally hard to implement.
>
> Determining whether it is GC or not automatically would require analysis of the function body, tracing where each reference ends up, and looking at other functions it gets passed to (which might not be possible if you have only the prototype without a body).
>
> Things like pure can help with it, but generally, I don't think the compiler can make a smart decision.

Java and Go compilers do it, why not D ones?

--
Paulo
Feb 06 2014
On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
> Java and Go compilers do it, why not D ones?

Perhaps it could work, I don't really know.
Feb 06 2014
On Thursday, 6 February 2014 at 23:20:44 UTC, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
>> Java and Go compilers do it, why not D ones?
>
> Perhaps it could work, I don't really know.

With escape analysis, something that DMD as far as I know doesn't do.

You need to create execution flows for basic blocks; every variable that does not escape the blocks can be turned into a stack allocation.

--
Paulo
Feb 07 2014
Another idea. I would totally love that behaviour.

    void foo(scope int[] arg)
    {
        ...
    }

    foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.

Kind Regards
Benjamin Thaut
Feb 06 2014
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
> Another idea. I would totally love that behaviour.
>
>     void foo(scope int[] arg)
>     {
>         ...
>     }
>
>     foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.
>
> Kind Regards
> Benjamin Thaut

+1
Feb 06 2014
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
> foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.

Absolutely. In fact, generically, any scope item could be moved to the stack.

We were just discussing in the chat room how scope = stack allocation and scope = don't escape the reference actually go hand in hand; they are not two separate features. Stack allocation is an optimization enabled by the restriction... and the restriction is required by the optimization to maintain memory safety.
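What the optimization would do automatically can be written out by hand today. The sketch below is only an approximation of the idea: `sum` and `sumOfLiteral` are hypothetical names, and the @nogc attribute is used simply to show that no GC allocation is needed once the callee promises not to keep the slice.

```d
import std.stdio;

// Takes a borrowed view and promises (via scope) not to store it.
int sum(scope const(int)[] arg) @nogc nothrow
{
    int total = 0;
    foreach (v; arg)
        total += v;
    return total;
}

// Hand-written version of "put the literal on the stack and lend it out".
int sumOfLiteral() @nogc nothrow
{
    int[4] onStack = [1, 2, 3, 4]; // fixed-size array: lives in this stack frame
    return sum(onStack[]);         // only a slice of it is passed down
}

void main()
{
    writeln(sumOfLiteral());
}
```

This is safe only because `sum` cannot keep `arg` past the call, which is exactly the restriction/optimization pairing described above.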
Feb 06 2014
On 06.02.2014 21:26, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
>> foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.
>
> Absolutely. In fact, generically, any scope item could be moved to the stack.
>
> We were just discussing in the chat room how scope = stack allocation and scope = don't escape the reference actually go hand in hand; they are not two separate features. Stack allocation is an optimization enabled by the restriction... and the restriction is required by the optimization to maintain memory safety.

Count me in on supporting this feature. I played with writing a DIP for giving scope a meaning for quite some time. And the ideas are pretty similar to yours.

Kind Regards
Benjamin Thaut
Feb 06 2014
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
> Another idea. I would totally love that behaviour.
>
>     void foo(scope int[] arg)
>     {
>         ...
>     }
>
>     foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.
>
> Kind Regards
> Benjamin Thaut

That was the basis of the old rejected `scope ref` proposal for rvalue references :(
Feb 06 2014
On Thu, 06 Feb 2014 21:50:32 +0000, "Dicebot" <public dicebot.lv> wrote:
> On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
>> Another idea. I would totally love that behaviour.
>>
>>     void foo(scope int[] arg)
>>     {
>>         ...
>>     }
>>
>>     foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.
>>
>> Kind Regards
>> Benjamin Thaut
>
> That was the basis of the old rejected `scope ref` proposal for rvalue references :(

Why would anyone reject this?

--
Marco
Feb 06 2014
On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
> Making scope the default
> =======================
> [...]

This, along with an actual implementation of scope, would be a really neat thing to have, but it has 2 problems.

1. It would significantly increase language complexity (although the return on investment would also be quite high).

2. Walter would be dead-set against it. He's said before that implementing scope would require flow-analysis in the compiler, which would increase the implementation complexity by a lot. On the flipside of that, if someone ever convinces him and flow-analysis *is* added, it opens up the door to a whole new world of other possible enhancements.
Feb 06 2014
On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
> 2. Walter would be dead-set against it. He's said before that implementing scope would require flow-analysis in the compiler, which would increase the implementation complexity by a lot.

I don't agree with that - I think this could be done pretty much within the existing type system, especially since scope has to be set everywhere by the programmer.
Feb 06 2014
On Thursday, 6 February 2014 at 23:24:08 UTC, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
>> 2. Walter would be dead-set against it. He's said before that implementing scope would require flow-analysis in the compiler, which would increase the implementation complexity by a lot.
>
> I don't agree with that - I think this could be done pretty much within the existing type system, especially since scope has to be set everywhere by the programmer.

I know very little about compilers, but wouldn't figuring out if a variable is being escaped to an outer scope require flow analysis?
Feb 06 2014
On Thursday, 6 February 2014 at 23:27:06 UTC, Meta wrote:
> I know very little about compilers, but wouldn't figuring out if a variable is being escaped to an outer scope require flow analysis?

I might be using the word wrong too, but I don't think so. All the information needed is available right at the assignment point:

    int[] a;
    void foo() {
        scope int[] b;
        a = b; // this line is interesting
    }

At that line alone, everything we need to know is available. The compiler currently has this code:

    if (e1->op == TOKvar &&
        (((VarExp *)e1)->var->storage_class & STCscope) &&
        op == TOKassign)
    {
        error("cannot rebind scope variables");
    }

If a was marked scope, it would trigger that. But I don't think this is a particularly useful check, at least not alone. We could try this though:

    if (
        e1->op == TOKvar &&
        !(((VarExp *)e1)->var->storage_class & STCscope) &&
        e2->op == TOKvar &&
        (((VarExp *)e2)->var->storage_class & STCscope) &&
        op == TOKassign)
    {
        error("cannot assign scope variables to non-scope");
    }

Now, that line will fail with this error - scope to non-scope is not allowed. If we add scope to the a variable, the existing compiler code will trigger an error:

    test500.d(41): Error: cannot rebind scope variables

But I don't like that - it is /too/ conservative (although these two checks together more or less achieve the first step I want).

However, what if we take that check out? Can we allow rebinding, but only to the same or lower scopes? Well, the VarExp->var we have on the left-hand side has info about the variable's parent... and we can check that.
I'd like to point out that this is WRONG but I don't know dmd that well either:

    if (
        e1->op == TOKvar &&
        (((VarExp *)e1)->var->storage_class & STCscope) &&
        e2->op == TOKvar &&
        (((VarExp *)e2)->var->storage_class & STCscope) &&
        op == TOKassign)
    {
        Declaration* v1 = ((VarExp*)e1)->var;
        Declaration* v2 = ((VarExp*)e2)->var;
        if (v1->parent != v2->parent)
            error("cannot assign scope variable to higher scope");
    }

But this prohibits it from passing outside the function. A more correct check would be to ensure v1 is equal to or a parent of v2... and using whatever is needed to handle inner scopes inside a function.

tbh I don't know just what to look at to get the rest of the scope, but since the compiler can identify what this name actually refers to somehow, it must know where the declaration comes from too! I just don't know where exactly to look.

Of course, the tricky part will be structs, but the same basic idea of the implementation should work (if we can get it working first for basic assignments).
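As a toy model of the "same or lower scope" test, independent of dmd's real internals, the check can be phrased as a walk up a chain of enclosing scopes. Everything below (`LexicalScope`, `encloses`, `mayAssign`) is hypothetical and exists only to illustrate the rule from the first post that a scope value may be stored in the same or a more nested scope but never in an enclosing one.

```d
import std.stdio;

// Hypothetical model of a lexical scope; not dmd's actual data structure.
final class LexicalScope
{
    LexicalScope parent;
    this(LexicalScope parent) { this.parent = parent; }
}

// True if `outer` is `inner` itself or one of the scopes enclosing it.
bool encloses(LexicalScope outer, LexicalScope inner)
{
    for (auto s = inner; s !is null; s = s.parent)
        if (s is outer)
            return true;
    return false;
}

// A scope value declared in `sourceScope` may be stored in a variable declared
// in `targetScope` only if the target does not outlive the source.
bool mayAssign(LexicalScope targetScope, LexicalScope sourceScope)
{
    return encloses(sourceScope, targetScope);
}

void main()
{
    auto functionScope = new LexicalScope(null);
    auto innerBlock = new LexicalScope(functionScope);

    writeln(mayAssign(innerBlock, innerBlock));    // same scope
    writeln(mayAssign(innerBlock, functionScope)); // outer value kept by an inner variable
    writeln(mayAssign(functionScope, innerBlock)); // inner value escaping outward
}
```

The check needs only the two declarations involved in the assignment, which is the sense in which no whole-function flow analysis is required for the basic case.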
Feb 06 2014
On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
> ...

I only had a quick look at it, but here are some things to remember that I realised when drafting my own scope proposal:

1) passing scope arguments:

    /* what to put here as qualifier? */ int[] foo(scope int[] input)
    {
        return input; // this should work

        int[5] internal; // implicitly scope
        return internal[]; // but this shouldn't
    }

I think it makes sense to prohibit `scope` as an explicitly named return attribute but make it inferrable via `inout`.

2) Transitivity & aggregation:

    struct A
    {
        scope int[] slice1;
        int[] slice2;
        int value;
    }

    void foo(int[] input) { }
    void boo(ref int input) { }

    void main()
    {
        int[5] stack;
        A a; // is it different from "scope A a;"?
        a.slice1 = stack[]; // guess should be ok
        a.slice2 = new int[]; // should this?
        foo(a.slice1); // obviously fail
        foo(a.slice2); // but does this?
        boo(a.value); // I'd expect this to fail
    }

The main problem with a strict scope definition is that most people seem to intuitively get what it is expected to do, but defining the exact set of rules is rather tricky.
Feb 06 2014
On Thursday, 6 February 2014 at 22:04:05 UTC, Dicebot wrote:
> I think it makes sense to prohibit `scope` as explicitly named return attribute but make it inferrable via `inout`.

I think it is very important to put it on the return value explicitly so it can be used to control access to a sealed resource. Perhaps returning a scope var would make it easy enough to infer though.

> 2) Transitivity & aggregation:
> A a; // is it different from "scope A a;"?

Probably not because A is a value type. Even if you explicitly marked it scope, it wouldn't really matter. Any pointer to it should be scope since it is on the stack though.

> a.slice2 = new int[]; // should this?

Yeah, it should. Here's how I'm seeing the struct: let's just decompose it to a list of local variables. So "A a;" is considered by the same rules as

    scope int[] a_slice1;
    int[] a_slice2;
    int a_value;

as if they were written right there as local variables. A struct is conceptually just a group of variables, after all, so let's treat it just like that.

> foo(a.slice2); // but does this?

Thus this is ok too, since it would work if we used the local var a_slice2.

> void boo(ref int input) { }
> boo(a.value); // I'd expect this to fail

I actually think that should work. Let's try to imagine what problems could come up in boo:

    int* global;
    void boo(ref int input)
    {
        global = &input;
    }

That is a problem... but it is almost ALWAYS a problem. Unless it happened to be passed a heap int by ref, this would always fail.

I think taking the address of a ref parameter should be allowed, but should always yield a scope pointer. You can't be sure it isn't on the stack, so you don't want to escape that address.... perhaps ref implies scope? If you want a pointer to escape, ask for a pointer.

Moreover, I think the address of a stack var should also be scope, so

    void storeMe(int* i) {}
    void test()
    {
        int i;
        storeMe(&i); // fails: address of stack var yielded scope var
                     // which cannot be passed to non-scope parameter
    }

Otherwise though, writing to a ref is ok, even if it is on the stack. If boo just wants to read and update the value, that's ok.

> Main problem with strict scope definition is that most seem to intuitively get what it is expected to do but defining exact set of rules is rather tricky.

Yeah, structs definitely complicate things, but I think pretending they are just a bunch of local variables gives us a consistent and useful definition.
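A small sketch of the "struct is just a bunch of locals" view, in ordinary D. The `scope` marker on `slice1` is left out because field-level `scope` is the proposed feature and not something this sketch relies on. Passing the field by `ref` and passing the equivalent local by `ref` behave the same way when `boo` only reads and updates the value:

```d
struct A
{
    int[] slice1; // marked `scope` in the discussion; omitted here (proposed feature)
    int[] slice2;
    int value;
}

// Only reads and updates the referenced int; it does not keep its address.
void boo(ref int input)
{
    input += 1;
}

unittest
{
    // Struct field living on the stack, passed by ref.
    A a;
    a.value = 41;
    boo(a.value);
    assert(a.value == 42);

    // The "decomposed" equivalent: a plain local variable.
    int a_value = 41;
    boo(a_value);
    assert(a_value == 42);
}
```

Under the rule sketched in the post, the only thing `boo` would be forbidden to do is store `&input` somewhere that outlives the call.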
Feb 06 2014
(I see now that Adam already talked about this in his second post, but I'm posting it anyway, as I suggest a different solution.)

On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
> c) Calling methods on a struct which may escape the scope is wrong. Ideally, `this` would always be scope... in fact, I think that's the best way to go. An alternative though might be to restrict calling of non-pure functions. Pure functions don't allow mutation of non-scope data in the first place, so they shouldn't be able to escape references.

As a consequence of this, it would no longer be possible to manage an owned object that has a back-pointer to its owner, e.g.:

    class Window {
        Menu _menu;
        this() {
            _menu = new Menu(this); // cannot pass `this`
        }
        ~this() {
            delete _menu;
        }
    }

    class Menu {
        Window _parent;
        this(Window parent) {
            _parent = parent;
        }
    }

This can probably be solved somehow. Allowing `scope` as a type constructor and doing the following might work, but I'm not sure about the safety implications:

    class Window {
        scope Menu _menu;
        this() {
            _menu = new Menu(this); // `new` returns a scope(Menu)
        }
        ~this() {
            delete _menu; // either explicitly or implicitly
        }
    }

    class Menu {
        scope Window _parent;
        scope this(scope Window parent) {
            _parent = parent;
        }
    }

(Note that this assumes scope isn't the default.)

The trick is to recognize that we can basically treat constructors and destructors as having a scope that starts when the constructor is entered and ends when the destructor returns. `this` can be seen as being declared inside this scope.

For member fields, `scope` also means "owned". Therefore, it needs to be destroyed when the scope is left (which in this case means: when the object is being destroyed). Thinking about this some more, this might even be a good idea for local scope variables too. This would basically mean undeprecating `scope` for classes. Anyway, this behaviour guarantees that the reference no longer exists after the object is destroyed.

`this` must be marked as `scope`, which causes `new` to return a scoped reference. This is necessary to keep it from being assigned to non-scope variables. Consider the following situation:

    class Menu {
        scope Window _parent;
        scope this(scope Window parent) {
            _parent = parent;
        }
    }

    Menu a;
    void foo(scope Window w) {
        a = new Menu(w);            // not good, trying to assign to non-scope
        scope Menu b = new Menu(w); // ok
    }

On the other hand, I already see at least one problem:

    class SomeClass {
        scope SomeClass other;
    }

    void foo(scope SomeClass a) {
        scope SomeClass b = new SomeClass;
        a.other = b; // ouch
    }

In Rust this is solved by the concept of lifetimes, i.e. while both `a` and `b` are scoped, they have different lifetimes. It's disallowed to store a reference to an object with a shorter lifetime into an object with a longer lifetime.
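For reference, a minimal sketch of the back-pointer pattern in ordinary D with no `scope` annotations at all, which is the baseline that the proposed rules would have to keep expressible. The destructor is left out to keep the sketch short; the unittest is only an illustration.

```d
class Menu
{
    Window _parent;

    this(Window parent)
    {
        _parent = parent; // store the back-pointer to the owner
    }
}

class Window
{
    Menu _menu;

    this()
    {
        // `this` escapes into the Menu here; under an always-scope `this`
        // this is the line that would be rejected.
        _menu = new Menu(this);
    }
}

unittest
{
    auto w = new Window;
    assert(w._menu !is null);
    assert(w._menu._parent is w); // owner and owned object point at each other
}
```

The two objects reference each other, so neither reference can be strictly shorter-lived than the other, which is what makes the pattern awkward for a purely scope-based rule.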
Feb 08 2014