digitalmars.D - scope escaping

reply "Adam D. Ruppe" <destructionator gmail.com> writes:
Let's see if we can make this work in two steps: first, making 
the existing scope storage class work, and second, considering 
making it the default.

First, let's define it. A scope reference may never escape its 
scope. This means:

0) Note that scope is irrelevant on value types. I believe it is 
also mostly irrelevant on references to immutable data (such as 
strings) since they are de facto value types. (A side effect of 
this: immutable stack data is wrong.... which is an arguable 
point, since correct enforcement of slices into it would let you 
maintain the immutable illusion. Hmm, both sides have good 
points.)

Nevertheless, while the immutable reference can be debated, scope 
definitely doesn't matter on value types. While it might be 
there, I think it should just be a no-op.

1) It or its address must never be assigned to a higher scope. 
(The compiler currently disallows rebinding scope variables, 
which I think does achieve this, but is more blunt than it needs 
to be. If we want to disable rebinding, let's do that on a 
type-by-type basis e.g. disabling postblit on a unique ptr.)

void foo() {
    int[] outerSlice;
    {
       scope int[] innerSlice = ...;
       outerSlice = innerSlice; // error
       innerSlice = innerSlice[1 .. $]; // I think this should be ok
    }
}

Parameters and return values are considered the same level for 
this, since the parameter and return value both belong to the 
caller. So:

int[] foo() {
    int[15] staticBuffer;
    scope int[] slice = staticBuffer[];
    // illegal, return value is one level higher than inner function
    return slice;
}

// OK, you aren't giving the caller anything they don't already have
scope char[] strchr(scope char[] s, char[]) { return s; }

It is acceptable to pass it to a lower scope.

int average(in int[]); // in == const scope

void foo() {
     int[15] staticBuffer;
     scope int[] slice = staticBuffer[];
     int avg = average(slice); // OK, passing to inner scope is fine
}
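
For reference, here is a minimal runnable sketch of that "pass down to a lower scope" case. The body of average and the values in the buffer are made up for illustration; only the shape of the call comes from the post above. Note that current compilers do not necessarily enforce everything proposed here.

```d
import std.stdio : writeln;

// `in` marks the parameter as read-only for the callee; the callee has
// no reason to keep the slice after returning.
int average(in int[] values)
{
    if (values.length == 0)
        return 0;
    long sum = 0;
    foreach (v; values)
        sum += v;
    return cast(int)(sum / cast(long) values.length);
}

void main()
{
    int[15] staticBuffer;
    foreach (i, ref v; staticBuffer)
        v = cast(int) i; // illustrative data: 0, 1, ..., 14

    scope int[] slice = staticBuffer[];
    int avg = average(slice); // slice only travels down the call stack
    assert(avg == 7);
    writeln(avg);
}
```

The slice never outlives staticBuffer here, which is exactly the kind of use the rule is meant to keep legal.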


The results of slice.ptr and &slice on a scope slice must 
themselves be scope. Yes, scope MUST work on function return 
values as well as parameters and variables. This is an absolute 
necessity for any degree of sanity, which I'll talk about more in 
my next numbered point.


BTW I keep using slices into static buffers here because that's 
the main real-world concern we should keep in mind. A static 
buffer is a strictly-scoped owned container built right into the 
language. We know it is wrong to return a reference to stack 
data, we know why. Conversely, we have a pretty good idea about 
what *can* work with it. Scope, if we do it right, should 
statically catch misuses of static array slices while allowing 
proper uses.

So when in doubt about something, ask: does this make sense when 
referring to a static buffer slice?

2) scope must be carried along with the variable at every step of 
its life. (In this sense, it starts to look more like a type 
constructor than a storage class, but I think it is slightly 
different still.)

void foo() {
    int[] outerSlice;
    {
        int[16] staticBuffer;
        scope int[] innerSlice = staticBuffer[]; // OK
        int[] cheatingSlice = innerSlice; // uh oh, no good because...
        outerSlice = cheatingSlice; // ...it enables this
    }
}


A potential workaround is to require every assignment to also be 
scope.

        scope int[] cheatingSlice = innerSlice; // OK
        outerSlice = cheatingSlice; // this is still disallowed, so cool

It is very important that this also applies through function 
return values, since otherwise:

T identity(T)(scope T t) { return t; }

can and will escape references. Consider strchr on a static stack 
array. We do NOT want that to return a pointer to the stack 
memory after it ceases to exist.

Thus, that identity function should be illegal, with an error 
like "cannot return scope from a non-scope function". We'll allow 
it by marking the return value as scope as well. (Again, this 
sounds a lot like a type constructor.)
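
As a rough sketch of the strchr case: D compilers released well after this thread accept a `return scope` parameter annotation, which says the result may alias that parameter. The helper name findChar and the test data are hypothetical, and this is an approximation of the idea rather than the exact mechanism proposed here. The escape checks are, as far as I know, only enforced in @safe code with -preview=dip1000, so the example is written to run either way.

```d
// Hypothetical strchr-like helper. `return scope` ties the lifetime of
// the returned slice to the argument `s`.
inout(char)[] findChar(return scope inout(char)[] s, char c)
{
    foreach (i, ch; s)
        if (ch == c)
            return s[i .. $]; // a view into the caller's own data
    return null;
}

void main()
{
    char[5] buf = "hello";

    // The result is kept in a scope variable, so the scopiness of the
    // argument is carried along instead of being dropped.
    scope char[] rest = findChar(buf[], 'l');
    assert(rest == "llo");
    assert(rest.ptr is &buf[2]); // still points into the stack buffer
}
```

The caller gets back nothing it did not already have, which is why this direction is harmless.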


3) structs are considered reference types if ANY of their members 
are reference types (unless specifically noted otherwise, see my 
following post about default and encapsulation for details). 
Thus, the scope rules may apply to them:

struct Holder {
    int[] foo;
}

Holder h;
void test(scope int[] f) {
     h.foo = f; // must be an error, f is escaping to global scope directly
     h = Holder(f); // this must also be an error, f is escaping indirectly
}

The constructed Holder inside would have to inherit the scopiness 
of f. This might be the trickiest part of getting this right 
(though it is kinda neatly solved if scope is default :) )

a) A struct constructed with a scope variable itself must be 
scope, and thus all the rules apply to it.

b) Assigning to a struct which is not scope, even if it is a 
local variable, must not be permitted.

Holder h2;
h2.foo = f; // this isn't escaping the scope, but is dropping scope

Just as if we had a local variable of type int[].

We may make the struct scope:

scope Holder h2;
h2.foo = f; // OK

c) Calling methods on a struct which may escape the scope is 
wrong. Ideally, `this` would always be scope... in fact, I think 
that's the best way to go. An alternative though might be to 
restrict calling of non-pure functions. Pure functions don't 
allow mutation of non-scope data in the first place, so they 
shouldn't be able to escape references.




I think that covers what I want. Note that this is not 
necessarily @safe:

struct C_Array(T) { /* grows with malloc */ scope T* borrow() {} }

C_Array!int i;
int* b = i.borrow;
i ~= 10; // might realloc...
// leaving b dangling


So it isn't necessarily @safe. I think it *would* be @safe with 
static arrays. BTW static array slicing should return scope, as 
should most user-defined containers. But with user-defined types, 
@safety is still in the hands of the programmer. Reallocing with 
a non-sealed reference should always be considered @trusted.


Stand by for my next post which will discuss making it default, 
with a few more points relevant to the whole concept.
Feb 06 2014
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
Making scope the default
=======================


There's five points to discuss:

1) All variables are assumed to be marked with scope implicitly

2) The exception is structs with a special annotation which marks 
that they encapsulate a resource. An encapsulated resource 
explicitly marked scope at the usage site is STILL scope, but it 
will not implicitly inherit the scopiness of the member reference.

@encapsulated_resource
struct RefCounted(T) {
     T t; // the scopiness of this would not be propagated to
          // RefCounted itself
}

This lets us write structs to manage raw pointers (etc.) as an 
escape from the rules. Note you may also write 
@encapsulated_resource struct Borrowed(T){} as an escape from the 
rules. Using this would of course be at your own risk, analogous 
to @trusted code.

3) Built-in allocations return GC!T instead of T. GC!T's 
definition is:

@encapsulated_resource
struct GC(T) {
     private T _managed_payload;
     /* @force_inline */
     /* implicit scope return value */
     @safe nothrow inout(T) borrow() inout { return _managed_payload; }
     alias borrow this;
}

NOTE: if inout(T) there doesn't work for const correctness, we 
need to fix const on wrapped types; an orthogonal issue.

If you don't care about ownership, the alias this gives you a 
naked borrowed reference whenever needed. If you do care about 
ownership:

auto foo = new Foo();
static assert(is(typeof(foo) == GC!Foo));

letting you store it with confidence without additional steps or 
assumptions.
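
To make the alias this behaviour concrete, here is a library-level stand-in that runs today. It is only a sketch under assumptions: the @encapsulated_resource annotation does not exist, `new` does not return GC!T, so the wrapping is done by hand, and Foo and readX are made-up names.

```d
// Sketch only: a hand-written stand-in for the proposed built-in GC!T.
struct GC(T)
{
    private T _managed_payload;

    // Hands out the payload as a plain (borrowed) reference.
    @property inout(T) borrow() inout @safe nothrow
    {
        return _managed_payload;
    }

    alias borrow this;
}

class Foo
{
    int x = 5;
}

// Takes a borrowed reference; it does not care who owns the object.
int readX(scope Foo f)
{
    return f.x;
}

void main()
{
    // In the proposal `new Foo()` would already have this type.
    auto owned = GC!Foo(new Foo());
    static assert(is(typeof(owned) == GC!Foo));

    // alias this supplies the borrowed Foo whenever one is needed.
    assert(readX(owned) == 5);
    assert(owned.x == 5);
}
```

Code that only borrows keeps working unchanged, while code that cares about ownership can ask for GC!Foo in its signature.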

When passing to a template, if you want to explicitly borrow it, 
you might write borrow. Otherwise, IFTI will see the whole GC!T 
type.  This is important if we want to write owned identity 
templates.

If an argument is scope, ownership is irrelevant. We might strip 
it off but I don't think that's necessary... might help avoid 
template bloat though.

4) All other types remain the same. Yes, typeof(this) == T, NEVER 
GC!T.  Again, remember the rule of thumb: would this work with a 
static stack buffer?

    class Foo { Foo getMe() { return this; } }
    ubyte[__traits(classInstanceSize, Foo)] buffer;
    Foo f = emplace!Foo(buffer); // ok so far, f is scope
    GC!Foo gc = f.getMe(); // obviously wrong, f is not GC

    The object does not control its own allocation, so it does not 
own its own memory. Thus, `this` is *always* borrowed.

    Does this work if building a tree:

    class Tree { Tree[] children; Tree addChild(Tree t) { children ~= t; } }

    addChild there would *not* compile, since it escapes the t 
into the object's scope. Tree would need to know ownership: make 
children and addChild take GC!Tree instead, for example, then it 
will work.

    What if addChild wants to set t.parent = this; ? That wouldn't 
be possible (without using a trust-me borrowed!T wrapper)... and 
while this would break some of my code... I say unto you, such 
code was already broken, because the parent might be emplaced on 
a stack buffer!

    GC!Tree child = new Tree();
    {
        ubyte[...] stack;
        Owned!Tree parent = emplace!Tree(stack[]);
        parent.addChild(child);
    }
    child.parent; // bug city


    Instead, addChild should request its own ownership.

    Tree addChild(GC!Tree child, GC!Tree _this) {
        children ~= child;
        child.parent = _this;
    }


    Then, the buggy above scenario does not compile, while making 
it possible to do the correct thing, storing a (verified) GC 
reference in the object graph.


    I understand that would be a bit of a pain, but you agree it 
is more correct, yes? So that might be worthwhile breakage 
(especially since we're talking about potentially large breakage 
already.)


5) Interaction with @safe is something we can debate. @safe works 
best with the GC, but if we play our scope cards right, memory 
corruption via stack stuff can be statically eliminated too, thus 
making some variants of emplace @safe too. So I don't think even 
@safe functions can assume this == GC, and even if they could, we 
shouldn't since it limits us from legitimate optimizations.

    So I think the @safe rules should stay exactly as they are 
now. Wrapper structs that do things like malloc/realloc might be 
@system because it would still be possible for a borrowed pointer 
to be invalidated when they realloc (note this is not the case 
with GC, which is @safe even through growth reallocations). So 
@safe and scope are separate issues.
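
A small illustration of the GC remark, using only built-in slices (the variable names are made up). Whether the append relocates the array is up to the runtime, so the example only asserts things that hold in both cases.

```d
void main()
{
    int[] owner = [1, 2, 3];
    int[] borrowed = owner; // second view of the same GC block

    // Appending may or may not move `owner` to a new block...
    owner ~= 4;

    // ...but either way the block `borrowed` points at stays alive,
    // because the GC will not free memory that is still referenced.
    borrowed[0] = 42;
    assert(borrowed == [42, 2, 3]);
    assert(owner.length == 4);
    assert(owner[1 .. 4] == [2, 3, 4]);
}
```

With a malloc/realloc-backed container the old block could be freed by the growth step, which is the case that has to stay @system or @trusted.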
Feb 06 2014
next sibling parent reply Matej Nanut <matejnanut gmail.com> writes:
On 6 Feb 2014 16:56, "Adam D. Ruppe" <destructionator gmail.com> wrote:
 Making scope the default
 =======================
 [...]
I just stumbled upon Rust's memory management scheme yesterday and it seemed similar to this. On first glance, I really like it.
Feb 06 2014
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 18:29:48 UTC, Matej Nanut wrote:
 I just stumbled upon Rust's memory management scheme yesterday 
 and it seemed similar to this.
Yeah, I haven't used Rust but I have read about it, and the more I think about it, the more I realize it really isn't that new - it is just formalizing what we already do as programmers.

Escaping a reference to stack data is always wrong. We know this and try not to do it. The language barely helps with this though - we're on our own. We can't even be completely sure a reference actually is GC since it might be on the stack without us realizing it.

So what the Rust system and my proposal (which I'm pretty sure is simpler than the Rust one - it doesn't catch all the problems, but should be easier to implement and use for the majority of cases) do is try to get the language to help us get this right.

It's the same thing with error handling. In C, you know you have to clean up after a failed operation and you have to do it yourself. This is often done by checking return values and using goto to reach the cleanup code. In D, we have struct destructors, scope(failure), and exceptions to help us do the same task with less work and more confidence.
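
For readers who haven't met scope guards, a minimal sketch of the error-handling half of that analogy. The function name transfer and the event strings are invented for the example.

```d
string[] events; // module-level, just to record the order of events

void transfer(bool fail)
{
    events ~= "acquire";
    scope(exit) events ~= "release";     // runs on every exit path
    scope(failure) events ~= "rollback"; // runs only if an exception escapes

    if (fail)
        throw new Exception("operation failed");
    events ~= "commit";
}

void main()
{
    transfer(false);
    assert(events == ["acquire", "commit", "release"]);

    events = null;
    try
    {
        transfer(true);
    }
    catch (Exception e)
    {
        // swallowed on purpose; we only care about the cleanup order
    }
    // Guards run in reverse order of declaration.
    assert(events == ["acquire", "rollback", "release"]);
}
```

The cleanup is written once, next to the acquisition, and the compiler arranges for it to run on every path.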
Feb 06 2014
prev sibling next sibling parent reply "Elie Morisse" <syniurge gmail.com> writes:
On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
 Making scope the default
How about letting the compiler decide what's best in the default case?

· if a global reference to the variable escapes or a reference is returned by a function ⇒ GC-allocated
· otherwise ⇒ scoped to where the last reference to the variable is seen by static analysis
Feb 06 2014
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
 How about letting the compiler decide what's best in the 
 default case?
The problem there is the compiler would have to look at the big picture to make an informed decision, and big picture decisions are generally hard to implement. Determining whether it is GC or not automatically would require analysis of the function body, tracing where each reference ends up, and looking at other functions it gets passed to (which might not be possible if you have only the prototype without a body). Things like pure can help with it, but generally, I don't think the compiler can make a smart decision.
Feb 06 2014
parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 06.02.2014 21:29, schrieb Adam D. Ruppe:
 On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
 How about letting the compiler decide what's best in the default case?
The problem there is the compiler would have to look at the big picture to make an informed decision, and big picture decisions are generally hard to implement. Determining whether it is GC or not automatically would require analysis of the function body, tracing where each reference ends up, and looking at other functions it gets passed to (which might not be possible if you have only the prototype without a body). Things like pure can help with it, but generally, I don't think the compiler can make a smart decision.
Java and Go compilers do it, why not D ones? -- Paulo
Feb 06 2014
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
 Java and Go compilers do it, why not D ones?
Perhaps it could work, I don't really know.
Feb 06 2014
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 6 February 2014 at 23:20:44 UTC, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
 Java and Go compilers do it, why not D ones?
Perhaps it could work, I don't really know.
With escape analysis, something that DMD as far as I know doesn't do. You need to create execution flows for basic blocks; every variable that does not escape the blocks can be turned into a stack allocation.

-- Paulo
Feb 07 2014
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Another idea. I would totally love that behaviour.

void foo(scope int[] arg) { ... }

foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                   // because it is scoped.

Kind Regards
Benjamin Thaut
Feb 06 2014
next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                    // because it is scoped.

 Kind Regards
 Benjamin Thaut
+1
Feb 06 2014
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                    // because it is scoped.
Absolutely. In fact, generically, any scope item could be moved to the stack. We were just discussing in the chat room how scope = stack allocation and scope = don't escape the reference actually go hand in hand; they are not two separate features, stack allocation is an optimization enabled by the restriction... and the restriction is required by the optimization to maintain memory safety.
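
Until something like that is implicit, the same effect can be spelled by hand with a static array. This is only a sketch: the function name sum is made up, and nothing here claims the compiler moves the literal to the stack on its own.

```d
// The callee promises (via scope) not to keep the slice around.
int sum(scope const(int)[] arg)
{
    int total = 0;
    foreach (v; arg)
        total += v;
    return total;
}

void main()
{
    // A static array lives in the caller's stack frame.
    int[4] onStack = [1, 2, 3, 4];

    // Slicing it is fine precisely because sum cannot escape the reference.
    assert(sum(onStack[]) == 10);
}
```

The restriction on the callee is what makes the stack placement in the caller safe, which is the "hand in hand" point above.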
Feb 06 2014
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 06.02.2014 21:26, schrieb Adam D. Ruppe:
 On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
 foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                    // because it is scoped.
Absolutely. In fact, generically, any scope item could be moved to the stack. We were just discussing in the chat room how scope = stack allocation and scope = don't escape the reference actually go hand in hand; they are not two separate features, stack allocation is an optimization enabled by the restriction... and the restriction is required by the optimization to maintain memory safety.
Count me in on supporting this feature. I played with writing a DIP for giving scope a meaning for quite some time. And the ideas are pretty similar to yours. Kind Regards Benjamin Thaut
Feb 06 2014
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                    // because it is scoped.

 Kind Regards
 Benjamin Thaut
That was the basis of the old rejected `scope ref` proposal for rvalue references :(
Feb 06 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 06 Feb 2014 21:50:32 +0000
schrieb "Dicebot" <public dicebot.lv>:

 On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
 wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack,
                    // because it is scoped.

 Kind Regards
 Benjamin Thaut
That was basis of old rejected `scope ref` proposal for rvalue references :(
Why would anyone reject this? -- Marco
Feb 06 2014
prev sibling parent reply "Meta" <jared771 gmail.com> writes:
On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
 Making scope the default
 =======================
 [...]
This, along with an actual implementation of scope, would be a really neat thing to have, but it has 2 problems.

1. It would significantly increase language complexity (although the return on investment would also be quite high).

2. Walter would be dead-set against it. He's said before that implementing scope would require flow-analysis in the compiler, which would increase the implementation complexity by a lot. On the flipside of that, if someone ever convinces him and flow-analysis *is* added, it opens up the door to a whole new world of other possible enhancements.
Feb 06 2014
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
 2. Walter would be dead-set against it. He's said before that 
 implementing scope would require flow-analysis in the compiler, 
 which would increase the implementation complexity by a lot.
I don't agree with that - I think this could be done pretty much within the existing type system, especially since scope has to be set everywhere by the programmer.
Feb 06 2014
parent reply "Meta" <jared771 gmail.com> writes:
On Thursday, 6 February 2014 at 23:24:08 UTC, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
 2. Walter would be dead-set against it. He's said before that 
 implementing scope would require flow-analysis in the 
 compiler, which would increase the implementation complexity 
 by a lot.
I don't agree with that - I think this could be done pretty much within the existing type system, especially since scope has to be set everywhere by the programmer.
I know very little about compilers, but wouldn't figuring out if a variable is being escaped to an outer scope require flow analysis?
Feb 06 2014
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 23:27:06 UTC, Meta wrote:
 I know very little about compilers, but wouldn't figuring out 
 if a variable is being escaped to an outer scope require flow 
 anyalysis?
I might be using the word wrong too, but I don't think so. All the information needed is available right at the assignment point:

int[] a;
void foo() {
    scope int[] b;
    a = b; // this line is interesting
}

At that line alone, everything we need to know is available. The compiler currently has this code:

    if (e1->op == TOKvar &&
        (((VarExp *)e1)->var->storage_class & STCscope) &&
        op == TOKassign)
    {
        error("cannot rebind scope variables");
    }

If a was marked scope, it would trigger that. But I don't think this is a particularly useful check, at least not alone. We could try this though:

    if (
        e1->op == TOKvar
        && !(((VarExp *)e1)->var->storage_class & STCscope)
        && e2->op == TOKvar
        && (((VarExp *)e2)->var->storage_class & STCscope)
        && op == TOKassign)
    {
        error("cannot assign scope variables to non-scope");
    }

Now, that line will fail with this error - scope to non-scope is not allowed. If we add scope to the a variable, the existing compiler code will trigger an error

test500.d(41): Error: cannot rebind scope variables

But I don't like that - it is /too/ conservative (although these two checks together more or less achieve the first step I want).

However, what if we take that check out, can we allow rebinding but only to the same or lower scopes? Well, the VarExp->var we have on the left-hand side has info about the variable's parent.... and we can check that. I'd like to point out that this is WRONG but I don't know dmd that well either:

    if (
        e1->op == TOKvar
        && (((VarExp *)e1)->var->storage_class & STCscope)
        && e2->op == TOKvar
        && (((VarExp *)e2)->var->storage_class & STCscope)
        && op == TOKassign)
    {
        Declaration* v1 = ((VarExp*)e1)->var;
        Declaration* v2 = ((VarExp*)e2)->var;
        if(v1->parent != v2->parent)
            error("cannot assign scope variable to higher scope");
    }

But this prohibits it from passing outside the function. A more correct check would be to ensure v1 is equal to or a parent of v2... and using whatever is needed to handle inner scopes inside a function.

tbh I don't know just what to look at to get the rest of the scope, but since the compiler can identify what this name actually refers to somehow, it must know where the declaration comes from too! I just don't know where exactly to look.

Of course, the tricky part will be structs, but the same basic idea of the implementation should work (if we can get it working first for basic assignments).
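
To pin down what "same or lower scope" could mean, here is a toy model in D, deliberately independent of dmd's internals. ScopeNode, isSameOrEnclosing and mayAssign are all hypothetical names; the rule encoded is rule 1 from the first post (a scope reference must never be assigned to a higher scope), not the parent comparison in the C++ snippet.

```d
// Hypothetical model of lexical scopes; not dmd's real data structure.
struct ScopeNode
{
    const(ScopeNode)* parent; // null for the outermost (global) scope
}

// True when `outer` is `inner` itself or one of its ancestors.
bool isSameOrEnclosing(const(ScopeNode)* outer, const(ScopeNode)* inner)
{
    for (auto s = inner; s !is null; s = s.parent)
        if (s is outer)
            return true;
    return false;
}

// Rule 1: a scope reference may only be stored in a variable declared in
// the same scope as the source, or in a scope nested inside it.
bool mayAssign(const(ScopeNode)* targetScope, const(ScopeNode)* sourceScope)
{
    return isSameOrEnclosing(sourceScope, targetScope);
}

void main()
{
    auto global = ScopeNode(null);
    auto func = ScopeNode(&global);
    auto block = ScopeNode(&func);

    assert(mayAssign(&block, &func));   // outer data into inner variable: fine
    assert(mayAssign(&func, &func));    // same level: fine
    assert(!mayAssign(&func, &block));  // inner data into outer variable: escape
    assert(!mayAssign(&global, &func)); // escaping to a global
}
```

Every input to the check is known at the assignment itself, which is the sense in which no flow analysis is needed for the basic case.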
Feb 06 2014
reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
 ...
Had only a quick look at it, but here are some things to remember that I realised when drafting my own scope proposal:

1) passing scope arguments:

/* what to put here as qualifier? */ int[] foo(scope int[] input)
{
    return input; // this should work
    int[5] internal; // implicitly scope
    return internal[]; // but this shouldn't
}

I think it makes sense to prohibit `scope` as explicitly named return attribute but make it inferrable via `inout`.

2) Transitivity & aggregation:

struct A
{
    scope int[] slice1;
    int[] slice2;
    int value;
}

void foo(int[] input) { }
void boo(ref int input) { }

void main()
{
    int[5] stack;
    A a; // is it different from "scope A a;"?
    a.slice1 = stack[]; // guess should be ok
    a.slice2 = new int[]; // should this?
    foo(a.slice1); // obviously fail
    foo(a.slice2); // but does this?
    boo(a.value); // I'd expect this to fail
}

Main problem with strict scope definition is that most seem to intuitively get what it is expected to do but defining exact set of rules is rather tricky.
Feb 06 2014
"Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 6 February 2014 at 22:04:05 UTC, Dicebot wrote:
 I think it makes sense to prohibit `scope` as explicitly named 
 return attribute but make it inferrable via `inout`.
I think it is very important to put on the return value explicitly so it can be used to control access to a sealed resource. Perhaps returning a scope var would make it easy enough to infer though.
 2) Transitivity & aggregation:
     A a; // is it different from "scope A a;"?
Probably not because A is a value type. Even if you explicitly marked it scope, it wouldn't really matter. Any pointer to it should be scope since it is on the stack though.
     a.slice2 = new int[]; // should this?
Yeah, it should. Here's how I'm seeing the struct: let's just decompose it to a list of local variables. So "A a;" is considered by the same rules as

scope int[] a_slice1;
int[] a_slice2;
int a_value;

as if they were written right there as local variables. A struct is conceptually just a group of variables, after all, let's treat it just like that.
     foo(a.slice2); // but does this?
Thus this is ok too, since it would work if we used the local var a_slice2.
 void boo(ref int input) { }
     boo(a.value); // I'd expect this to fail
I actually think that should work. Let's try to imagine what problems could come up in boo:

int* global;
void boo(ref int input) {
    global = &input;
}

That is a problem... but it is almost ALWAYS a problem. Unless it happened to be passed a heap int by ref, this would always fail.

I think taking the address of a ref parameter should be allowed, but should always yield a scope pointer. You can't be sure it isn't on the stack, so you don't want to escape that address.... perhaps ref implies scope? If you want a pointer to escape, ask for a pointer.

Moreover, I think address of a stack var should also be scope, so

void storeMe(int* i) {}
void test() {
    int i;
    storeMe(&i); // fails: address of stack var yielded scope var which cannot be passed to non-scope parameter
}

Otherwise though, writing to a ref is ok, even if it is on the stack. If boo just wants to read and update the value, that's ok.
 Main problem with strict scope definition is that most seem to 
 intuitively get what it is expected to do but defining exact set 
 of rules is rather tricky.
Yeah, structs definitely complicate things, but I think pretending they are just a bunch of local variables gives us a consistent and useful definition.
Feb 06 2014
"Marc Schütz" <schuetzm gmx.net> writes:
(I see now that Adam already talked about this in his second 
post, but I'm posting it anyway, as I suggest a different 
solution.)

On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
 c) Calling methods on a struct which may escape the scope is 
 wrong. Ideally, `this` would always be scope... in fact, I 
 think that's the best way to go. An alternative though might be 
 to restrict calling of non-pure functions. Pure functions don't 
 allow mutation of non-scope data in the first place, so they 
 shouldn't be able to escape references.
As a consequence of this, it would no longer be possible to manage an owned object that has a back-pointer to its owner, e.g.:

class Window {
    Menu _menu;
    this() {
        _menu = new Menu(this); // cannot pass `this`
    }
    ~this() {
        delete _menu;
    }
}

class Menu {
    Window _parent;
    this(Window parent) {
        _parent = parent;
    }
}

This can probably be solved somehow. Allowing `scope` as a type constructor and doing the following might work, but I'm not sure about the safety implications:

class Window {
    scope Menu _menu;
    this() {
        _menu = new Menu(this); // `new` returns a scope(Menu)
    }
    ~this() {
        delete _menu; // either explicitly or implicitly
    }
}

class Menu {
    scope Window _parent;
    scope this(scope Window parent) {
        _parent = parent;
    }
}

(Note that this assumes scope isn't the default.)

The trick is to recognize that we can basically treat constructors and destructors as having a scope that starts when the constructor is entered and ends when the destructor returns. `this` can be seen as being declared inside this scope.

For member fields, `scope` also means "owned". Therefore, it needs to be destroyed when the scope is left (which in this case means: when the object is being destroyed). Thinking about this some more, this might even be a good idea for local scope variables too. This would basically mean undeprecating `scope` for classes. Anyway, this behaviour guarantees that the reference no longer exists after the object is destroyed.

`this` must be marked as `scope`, which causes `new` to return a scoped reference. This is necessary to keep it from being assigned to non-scope variables.
Consider the following situation:

class Menu {
    scope Window _parent;
    scope this(scope Window parent) {
        _parent = parent;
    }
}

Menu a;
void foo(scope Window w) {
    a = new Menu(w); // not good, trying to assign to non-scope
    scope Menu b = new Menu(w); // ok
}

On the other hand, I already see at least one problem:

class SomeClass {
    scope SomeClass other;
}

void foo(scope SomeClass a) {
    scope SomeClass b = new SomeClass;
    a.other = b; // ouch
}

In Rust this is solved by the concept of lifetimes, i.e. while both `a` and `b` are scoped, they have different lifetimes. It's disallowed to store a reference to an object with a shorter lifetime into an object with a longer lifetime.
Feb 08 2014