digitalmars.D - scope escaping

Adam D. Ruppe (141/141) Feb 06 2014 Let's see if we can make this work in two steps: first, making

Adam D. Ruppe (120/120) Feb 06 2014 Making scope the default

Adam D. Ruppe (97/97) Feb 06 2014 Sorry, my lines got mangled, let me try pasting it again.

Matej Nanut (4/7) Feb 06 2014 I just stumbled upon Rust's memory management scheme yesterday and it

Adam D. Ruppe (19/21) Feb 06 2014 Yeah, I haven't used rust but I have read about it, and the more

Elie Morisse (7/8) Feb 06 2014 How about letting the compiler decide what's best in the default

Adam D. Ruppe (10/12) Feb 06 2014 The problem there is the compiler would have to look at the big

Paulo Pinto (4/14) Feb 06 2014 Java and Go compilers do it, why not D ones?

Adam D. Ruppe (2/3) Feb 06 2014 Perhaps it could work, I don't really know.

Paulo Pinto (8/11) Feb 07 2014 With escape analysis, something that DMD as far as I know doesn't

Benjamin Thaut (6/6) Feb 06 2014 Another idea. I would totally love that behaviour.

Namespace (3/9) Feb 06 2014 +1
Adam D. Ruppe (9/11) Feb 06 2014 Absolutely. In fact, generically, any scope item could be moved

Benjamin Thaut (6/15) Feb 06 2014 Count me in on supporting this feature. I played with writing a DIP for

Dicebot (4/10) Feb 06 2014 That was basis of old rejected `scope ref` proposal for rvalue

Marco Leise (5/19) Feb 06 2014 Why would anyone reject this?

Meta (11/110) Feb 06 2014 This, along with an actual implementation of scope would be a

Adam D. Ruppe (4/7) Feb 06 2014 I don't agree with that - I think this could be done pretty much

Meta (4/12) Feb 06 2014 I know very little about compilers, but wouldn't figuring out if

Adam D. Ruppe (64/67) Feb 06 2014 I might be using the word wrong too, but I don't think so. All

Dicebot (34/35) Feb 06 2014 Had only quick look at it but here are some things to remember

Adam D. Ruppe (45/56) Feb 06 2014 I think it is very important to put on the return value

"Marc Schütz" <schuetzm gmx.net> (79/85) Feb 08 2014 (I see now that Adam already talked about this in his second

"Adam D. Ruppe" <destructionator gmail.com> writes:

Let's see if we can make this work in two steps: first, making 
the existing scope storage class work, and second, considering 
making it the default.

First, let's define it. A scope reference may never escape its 
scope. This means:

0) Note that scope is irrelevant on value types. I believe it is 
also mostly irrelevant on references to immutable data (such as 
strings) since they are de facto value types. (A side effect of 
this: immutable stack data is wrong.... which is an arguable 
point, since correct enforcement of slices into it would let you 
maintain the immutable illusion. Hmm, both sides have good 
points.)

Nevertheless, while the immutable reference can be debated, scope 
definitely doesn't matter on value types. While it might be 
there, I think it should just be a no-op.

1) It or its address must never be assigned to a higher scope. 
(The compiler currently disallows rebinding scope variables, 
which I think does achieve this, but is more blunt than it needs 
to be. If we want to disable rebinding, let's do that on a 
type-by-type basis e.g. disabling postblit on a unique ptr.)

void foo() {
    int[] outerSlice;
    {
       scope int[] innerSlice = ...;
       outerSlice = innerSlice; // error
       innerSlice = innerSlice[1 .. $]; // I think this should be ok
    }
}

Parameters and return values are considered the same level for 
this, since the parameter and return value both belong to the 
caller. So:

int[] foo() {
    int[15] staticBuffer;
    scope int[] slice = staticBuffer[];
    return slice; // illegal, return value is one level higher than inner function
}

// OK, you aren't giving the caller anything they don't already have
scope char[] strchr(scope char[] s, char[]) { return s; }

It is acceptable to pass it to a lower scope.

int average(in int[]); // in == const scope

void foo() {
     int[15] staticBuffer;
     scope int[] slice = staticBuffer[];
     int avg = average(slice); // OK, passing to inner scope is fine
}
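
A minimal runnable sketch of the "pass it down" case in today's D. The body of average is an assumption (the post only gives the prototype), and the buffer contents are made up for the demo:

```d
import std.stdio : writeln;

// Assumed implementation; the post only declares the prototype.
// `in` tells callers the function only reads the slice during the call.
int average(in int[] values)
{
    if (values.length == 0)
        return 0;
    long sum = 0;
    foreach (v; values)
        sum += v;
    return cast(int)(sum / cast(long) values.length);
}

void main()
{
    int[4] staticBuffer = [2, 4, 6, 8];  // lives on the stack
    scope int[] slice = staticBuffer[];  // borrowed view of the stack data
    writeln(average(slice));             // passing down to an inner scope
    // `slice` is never stored anywhere that outlives staticBuffer.
}
```

Nothing here keeps the slice after average returns, which is the property the scope rules are meant to check.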


The results of `.ptr` and `&` applied to a scope slice must 
themselves be scope. Yes, scope MUST work on function return 
values as well as parameters and variables. This is an absolute 
necessity for any degree of sanity, which I'll talk about more 
in my next numbered point.


BTW I keep using slices into static buffers here because that's 
the main real-world concern we should keep in mind. A static 
buffer is a strictly-scoped owned container built right into the 
language. We know it is wrong to return a reference to stack 
data, we know why. Conversely, we have a pretty good idea about 
what *can* work with it. Scope, if we do it right, should 
statically catch misuses of static array slices while allowing 
proper uses.

So when in doubt about something, ask: does this make sense when 
referring to a static buffer slice?

2) scope must be carried along with the variable at every step of 
its life. (In this sense, it starts to look more like a type 
constructor than a storage class, but I think it is slightly 
different still.)

void foo() {
    int[] outerSlice;
    {
        int[16] staticBuffer;
        scope int[] innerSlice = staticBuffer[]; // OK
        int[] cheatingSlice = innerSlice; // uh oh, no good because...
        outerSlice = cheatingSlice; // ...it enables this
    }
}


A potential workaround is to require every assignment to also be 
scope.

        scope int[] cheatingSlice = innerSlice; // OK
        outerSlice = cheatingSlice; // this is still disallowed, so cool

It is very important that this also applies through function 
return values, since otherwise:

T identity(T)(scope T t) { return t; }

can and will escape references. Consider strchr on a static stack 
array. We do NOT want that to return a pointer to the stack 
memory after it ceases to exist.

Thus, that identity function should be illegal, with an error 
like "cannot return scope from a non-scope function". We'll 
allow it by marking the return value as scope as well. (Again, 
this sounds a lot like a type constructor.)
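
To make the strchr case concrete, here is a small sketch in plain D as it exists today, without the proposed checks. The name findChar and its body are hypothetical; under the proposal both the parameter and the return value would be marked scope:

```d
// Hypothetical strchr-like helper: returns the tail of `s` that starts at
// the first occurrence of `c`, or null if `c` is not present.
// The result is a view into the caller's memory, not a copy.
char[] findChar(char[] s, char c)
{
    foreach (i, ch; s)
        if (ch == c)
            return s[i .. $];
    return null;
}

void main()
{
    char[5] stackBuffer = "hello";
    char[] tail = findChar(stackBuffer[], 'l');

    assert(tail == "llo");
    // The result points into the stack buffer, so it must not outlive it.
    assert(tail.ptr == stackBuffer.ptr + 2);
}
```

The caller gets back nothing it did not already have, but the compiler loses track of that fact unless scope is carried through the return value.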


3) structs are considered reference types if ANY of their members 
are reference types (unless specifically noted otherwise, see my 
following post about default and encapsulation for details). 
Thus, the scope rules may apply to them:

struct Holder {
    int[] foo;
}

Holder h;
void test(scope int[] f) {
     h.foo = f; // must be an error, f is escaping to global scope directly
     h = Holder(f); // this must also be an error, f is escaping indirectly
}

The constructed Holder inside would have to inherit the scopiness 
of f. This might be the trickiest part of getting this right 
(though it is kinda neatly solved if scope is default :) )

a) A struct constructed with a scope variable itself must be 
scope, and thus all the rules apply to it.

b) Assigning to a struct which is not scope, even if it is a 
local variable, must not be permitted.

Holder h2;
h2.foo = f; // this isn't escaping the scope, but is dropping scope

Just as if we had a local variable of type int[].

We may make the struct scope:

scope Holder h2;
h2.foo = f; // OK

c) Calling methods on a struct which may escape the scope is 
wrong. Ideally, `this` would always be scope... in fact, I think 
that's the best way to go. An alternative though might be to 
restrict calling of non-pure functions. Pure functions don't 
allow mutation of non-scope data in the first place, so they 
shouldn't be able to escape references.
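
Why a struct with a reference member has to be treated as a reference can be seen in a short sketch. Holder is the struct from above; overwriteFirst is a hypothetical helper added only for the demo:

```d
struct Holder
{
    int[] foo;
}

// Hypothetical helper: writes through the copy of Holder it received.
void overwriteFirst(Holder copy, int value)
{
    copy.foo[0] = value;
}

void main()
{
    int[3] stackBuffer = [1, 2, 3];
    Holder h = Holder(stackBuffer[]);

    overwriteFirst(h, 42);         // Holder itself is passed by value...
    assert(stackBuffer[0] == 42);  // ...but its slice still aliases the buffer
}
```

Copying the struct copies the slice, not the data, so every copy of a Holder built from a scope slice has to inherit the restriction.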




I think that covers what I want. Note that this is not 
necessarily @safe:

struct C_Array(T) { /* grows with malloc */ scope T* borrow() {} }

C_Array!int i;
int* b = i.borrow;
i ~= 10; // might realloc...
// leaving b dangling


So it isn't necessarily @safe. I think it *would* be @safe with 
static arrays. BTW static array slicing should return scope, as 
should most user-defined containers. But with user-defined types, 
@safety is still in the hands of the programmer. Reallocing with 
a non-sealed reference should always be considered @trusted.


Stand by for my next post which will discuss making it default, 
with a few more points relevant to the whole concept.

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

Making scope the default
=======================


There's five points to discuss:

1) All variables are assumed to be marked with scope implicitly

2) The exception is structs with a special annotation which marks 
that they encapsulate a resource. An encapsulated resource 
explicitly marked scope at the usage site is STILL scope, but it 
will not implicitly inherit the scopiness of the member reference.

@encapsulated_resource
struct RefCounted(T) {
     T t; // the scopiness of this would not be propagated to
          // RefCounted itself
}

This lets us write structs to manage raw pointers (etc.) as an 
escape from the rules. Note you may also write 
@encapsulated_resource struct Borrowed(T){} as an escape from the 
rules. Using this would of course be at your own risk, analogous 
to @trusted code.

3) Built-in allocations return GC!T instead of T. GC!T's 
definition is:

@encapsulated_resource
struct GC(T) {
     private T _managed_payload;
     /* @force_inline */
     /* implicit scope return value */
     @safe nothrow inout(T) borrow() { return _managed_payload; }
     alias borrow this;
}

NOTE: if inout(T) there doesn't work for const correctness, we 
need to fix const on wrapped types; an orthogonal issue.

If you don't care about ownership, the alias this gives you a 
naked borrowed reference whenever needed. If you do care about 
ownership:

auto foo = new Foo();
static assert(is(typeof(foo) == GC!Foo));

letting you store it with confidence without additional steps or 
assumptions.
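
A rough approximation of that wrapper that compiles in today's D. The proposed annotation does not exist, so it is left out, the `new` expression is wrapped by hand because built-in allocation returns a plain Foo, and inout is applied to `this` so the qualifier carries through. Foo and readX are made up for the demo:

```d
struct GC(T)
{
    private T _managed_payload;

    // inout on `this` lets the same method serve mutable and const wrappers.
    inout(T) borrow() inout @safe nothrow { return _managed_payload; }

    alias borrow this;
}

class Foo
{
    int x = 7;
}

// Takes a "naked" borrowed reference and does not care who owns it.
int readX(Foo f) { return f.x; }

void main()
{
    auto owned = GC!Foo(new Foo);  // wrapped by hand in this sketch
    static assert(is(typeof(owned) == GC!Foo));

    assert(readX(owned) == 7);  // alias this borrows implicitly
    assert(owned.x == 7);       // member access also goes through borrow()
}
```

Code that cares about ownership keeps the GC!Foo type around, and code that does not simply receives the borrowed Foo.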

When passing to a template, if you want to explicitly borrow it, 
you might write borrow. Otherwise, IFTI will see the whole GC!T 
type.  This is important if we want to write owned identity 
templates.

If an argument is scope, ownership is irrelevant. We might strip 
it off but I don't think that's necessary... might help avoid 
template bloat though.

4) All other types remain the same. Yes, typeof(this) == T, NEVER 
GC!T.  Again, remember the rule of thumb: would this work with a 
static stack buffer?

    class Foo { Foo getMe() { return this; } }
    ubyte[__traits(classInstanceSize, Foo)] buffer;
    Foo f = emplace!Foo(buffer); // ok so far, f is scope
    GC!Foo gc = f.getMe(); // obviously wrong, f is not GC

    The object does not control its own allocation, so it does not 
own its own memory. Thus, `this` is *always* borrowed.
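
    The stack-buffer case can be tried in current D. This is only a sketch: the size_t-based buffer is my choice to keep the storage aligned, and livesIn is a hypothetical helper for the demo:

```d
import std.conv : emplace;

class Foo
{
    Foo getMe() { return this; }
}

// Hypothetical helper: true if the object's memory lies inside `storage`.
bool livesIn(Foo obj, const(void)[] storage)
{
    auto p = cast(const(ubyte)*) cast(void*) obj;
    auto start = cast(const(ubyte)*) storage.ptr;
    return p >= start && p < start + storage.length;
}

void main()
{
    enum size = __traits(classInstanceSize, Foo);
    // size_t elements keep the buffer suitably aligned for the instance.
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] buffer;
    void[] storage = buffer[];

    Foo f = emplace!Foo(storage);         // object constructed on the stack
    assert(f.getMe() is f);
    assert(livesIn(f.getMe(), storage));  // `this` points into stack memory
}
```

    Nothing in the type of f.getMe() says the object lives on the stack, which is why `this` cannot be assumed to be GC memory.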

    Does this work if building a tree:

    class Tree {
        Tree[] children;
        Tree addChild(Tree t) { children ~= t; }
    }

    addChild there would *not* compile, since it escapes the t 
into the object's scope. Tree would need to know ownership: make 
children and addChild take GC!Tree instead, for example, then it 
will work.

    What if addChild wants to set t.parent = this; ? That wouldn't 
be possible (without using a trust-me borrowed!T wrapper)... and 
while this would break some of my code... I say unto you, such 
code was already broken, because the parent might be emplaced on 
a stack buffer!

    GC!Tree child = new Tree();
    {
        ubyte[...] stack;
        Owned!Tree parent = emplace!Tree(stack[]);
        parent.addChild(child);
    }
    child.parent; // bug city


    Instead, addChild should request its own ownership.

    Tree addChild(GC!Tree child, GC!Tree _this) {
        children ~= child;
        child.parent = _this;
    }


    Then, the buggy above scenario does not compile, while making 
it possible to do the correct thing, storing a (verified) GC 
reference in the object graph.


    I understand that would be a bit of a pain, but you agree it 
is more correct, yes? So that might be worthwhile breakage 
(especially since we're talking about potentially large breakage 
already.)


5) Interaction with @safe is something we can debate. @safe works 
best with the GC, but if we play our scope cards right, memory 
corruption via stack stuff can be statically eliminated too, thus 
making some variants of emplace @safe too. So I don't think even 
@safe functions can assume this == GC, and even if they could, we 
shouldn't since it limits us from legitimate optimizations.

    So I think the @safe rules should stay exactly as they are 
now. Wrapper structs that do things like malloc/realloc might be 
@system because it would still be possible for a borrowed pointer 
to be invalidated when they realloc (note this is not the case 
with GC, which is @safe even through growth reallocations). So 
@safe and scope are separate issues.
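
The GC half of that claim is easy to check in current D. A small sketch, with grow as a hypothetical helper and arbitrary sizes:

```d
// Hypothetical helper: appends `count` elements to its copy of the slice,
// which will usually force at least one reallocation along the way.
int[] grow(int[] data, int count)
{
    foreach (i; 0 .. count)
        data ~= i;
    return data;
}

void main()
{
    int[] owner = [1, 2, 3];
    int[] borrowed = owner;       // second view of the same GC block

    owner = grow(owner, 10_000);  // may move the data to a new block

    assert(owner.length == 10_003);
    // The old view stays valid: the GC keeps the original block alive
    // for as long as `borrowed` refers to it.
    assert(borrowed == [1, 2, 3]);
}
```

With a malloc/realloc-backed container the equivalent of `borrowed` could be left pointing at freed memory, which is the case that stays outside @safe.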

Feb 06 2014

Matej Nanut <matejnanut gmail.com> writes:

On 6 Feb 2014 16:56, "Adam D. Ruppe" <destructionator gmail.com> wrote:
 Making scope the default
 =======================
 [...]

I just stumbled upon Rust's memory management scheme yesterday and it
seemed similar to this.

On first glance, I really like it.

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 18:29:48 UTC, Matej Nanut wrote:
 I just stumbled upon Rust's memory management scheme yesterday 
 and it seemed similar to this.

Yeah, I haven't used Rust but I have read about it, and the more 
I think about it, the more I realize it really isn't that new - 
it is just formalizing what we already do as programmers.

Escaping a reference to stack data is always wrong. We know this 
and try not to do it. The language barely helps with this though 
- we're on our own. We can't even be completely sure a reference 
actually is GC since it might be on the stack without us 
realizing it.

So what the Rust system and my proposal (which I'm pretty sure is 
simpler than the Rust one - it doesn't catch all the problems, 
but should be easier to implement and use for the majority of 
cases) does is try to get the language to help us get this right.

It's the same thing with error handling. In C, you know you 
have to clean up after a failed operation and you have to do it 
yourself. This is often done by checking return values and goto 
cleanup code. In D, we have struct destructors, scope(failure), 
and exceptions to help us do the same task with less work and 
more confidence.
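
As a reminder of what that language help looks like, a minimal sketch of scope guards. The function work and the event names are invented for the demo:

```d
string[] events;  // records the order in which things happen

void work(bool fail)
{
    events ~= "acquire";
    scope(exit) events ~= "release";      // runs however the function exits
    scope(failure) events ~= "rollback";  // runs only if an exception escapes

    if (fail)
        throw new Exception("operation failed");
    events ~= "done";
}

void main()
{
    work(false);
    assert(events == ["acquire", "done", "release"]);

    events = null;
    try
    {
        work(true);
    }
    catch (Exception e)
    {
        // swallowed for the demo
    }
    // Guards run in reverse order of declaration.
    assert(events == ["acquire", "rollback", "release"]);
}
```

The cleanup is written once, next to the acquisition, and the compiler arranges for it to run on every path.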

Feb 06 2014

"Elie Morisse" <syniurge gmail.com> writes:

On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
 Making scope the default

How about letting the compiler decide what's best in the default 
case?

  · if a global reference to the variable escapes or a reference 
is returned by a function ⇒ GC-allocated
  · otherwise ⇒ scoped to where the last reference to the variable 
is seen by static analysis

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
 How about letting the compiler decide what's best in the 
 default case?

The problem there is the compiler would have to look at the big 
picture to make an informed decision, and big picture decisions 
are generally hard to implement.

Determining whether it is GC or not automatically would require 
analysis of the function body, tracing where each reference ends 
up, and looking at other functions it gets passed to (which might 
not be possible if you have only the prototype without a body). 
Things like pure can help with it, but generally, I don't think 
the compiler can make a smart decision.

Feb 06 2014

Paulo Pinto <pjmlp progtools.org> writes:

Am 06.02.2014 21:29, schrieb Adam D. Ruppe:
 On Thursday, 6 February 2014 at 19:24:19 UTC, Elie Morisse wrote:
 How about letting the compiler decide what's best in the default case?

 The problem there is the compiler would have to look at the big picture
 to make an informed decision, and big picture decisions are generally
 hard to implement.

 Determining whether it is GC or not automatically would require analysis
 of the function body, tracing where each reference ends up, and looking
 at other functions it gets passed to (which might not be possible if you
 have only the prototype without a body). Things like pure can help with
 it, but generally, I don't think the compiler can make a smart decision.

Java and Go compilers do it, why not D ones?

--
Paulo

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
 Java and Go compilers do it, why not D ones?

Perhaps it could work, I don't really know.

Feb 06 2014

"Paulo Pinto" <pjmlp progtools.org> writes:

On Thursday, 6 February 2014 at 23:20:44 UTC, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 21:04:45 UTC, Paulo Pinto wrote:
 Java and Go compilers do it, why not D ones?

 Perhaps it could work, I don't really know.

With escape analysis, something that DMD as far as I know doesn't 
do.

You need to create execution flows for basic blocks, every 
variable that does not escape the blocks can be turned into stack 
allocations.

--
Paulo

Feb 07 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Another idea. I would totally love that behaviour.

void foo(scope int[] arg) { ... }

foo([1, 2, 3, 4]); // allocates the array literal on the stack, because it is scoped.

Kind Regards
Benjamin Thaut

Feb 06 2014

"Namespace" <rswhite4 googlemail.com> writes:

On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack, 
 because it is scoped.

 Kind Regards
 Benjamin Thaut

+1

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 foo([1, 2, 3, 4]); // allocates the array literal on the stack, 
 because it is scoped.

Absolutely. In fact, generically, any scope item could be moved 
to the stack. We were just discussing in the chat room how scope 
= stack allocation and scope = don't escape the reference 
actually go hand in hand; they are not two separate features, 
stack allocation is an optimization enabled by the restriction... 
and the restriction is required by the optimization to maintain 
memory safety.
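
Until a compiler does this automatically, the same effect can be spelled out by hand. A sketch, with sum as a made-up stand-in for foo:

```d
// `scope` documents that the function does not keep the slice, and
// @nogc guarantees the function itself performs no GC allocation.
int sum(scope const int[] arg) @nogc nothrow
{
    int total = 0;
    foreach (v; arg)
        total += v;
    return total;
}

void main()
{
    int[4] literal = [1, 2, 3, 4];  // explicit stack storage for the "literal"
    assert(sum(literal[]) == 10);   // callee only borrows the stack data
}
```

The optimization discussed above would let the call site be written as a plain array literal while keeping the same stack storage, relying on the no-escape guarantee.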

Feb 06 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 06.02.2014 21:26, schrieb Adam D. Ruppe:
 On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut wrote:
 foo([1, 2, 3, 4]); // allocates the array literal on the stack, because
 it is scoped.

 Absolutely. In fact, generically, any scope item could be moved to the
 stack. We were just discussing in the chat room how scope = stack
 allocation and scope = don't escape the reference actually go hand in
 hand; they are not two separate features, stack allocation is an
 optimization enabled by the restriction... and the restriction is
 required by the optimization to maintain memory safety.

Count me in on supporting this feature. I played with writing a DIP for 
giving scope a meaning for quite some time. And the ideas are pretty 
similar to yours.

Kind Regards
Benjamin Thaut

Feb 06 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack, 
 because it is scoped.

 Kind Regards
 Benjamin Thaut

That was basis of old rejected `scope ref` proposal for rvalue 
references :(

Feb 06 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Thu, 06 Feb 2014 21:50:32 +0000
schrieb "Dicebot" <public dicebot.lv>:

 On Thursday, 6 February 2014 at 20:17:25 UTC, Benjamin Thaut 
 wrote:
 Another idea. I would totally love that behaviour.

 void foo(scope int[] arg) { ... }

 foo([1, 2, 3, 4]); // allocates the array literal on the stack, 
 because it is scoped.

 Kind Regards
 Benjamin Thaut

 
 That was basis of old rejected `scope ref` proposal for rvalue 
 references :(

Why would anyone reject this?

-- 
Marco

Feb 06 2014

"Meta" <jared771 gmail.com> writes:

On Thursday, 6 February 2014 at 15:53:01 UTC, Adam D. Ruppe wrote:
 Sorry, my lines got mangled, let me try pasting it again.


 Making scope the default
 =======================


 [...]

This, along with an actual implementation of scope would be a 
really neat thing to have, but it has 2 problems.

1. It would significantly increase language complexity (although 
the return on investment would also be quite high).

2. Walter would be dead-set against it. He's said before that 
implementing scope would require flow-analysis in the compiler, 
which would increase the implementation complexity by a lot. On 
the flipside of that, if someone ever convinces him and 
flow-analysis *is* added, it opens up the door to a whole new 
world of other possible enhancements.

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
 2. Walter would be dead-set against it. He's said before that 
 implementing scope would require flow-analysis in the compiler, 
 which would increase the implementation complexity by a lot.

I don't agree with that - I think this could be done pretty much 
within the existing type system, especially since scope has to be 
set everywhere by the programmer.

Feb 06 2014

"Meta" <jared771 gmail.com> writes:

On Thursday, 6 February 2014 at 23:24:08 UTC, Adam D. Ruppe wrote:
 On Thursday, 6 February 2014 at 22:41:34 UTC, Meta wrote:
 2. Walter would be dead-set against it. He's said before that 
 implementing scope would require flow-analysis in the 
 compiler, which would increase the implementation complexity 
 by a lot.

 I don't agree with that - I think this could be done pretty 
 much within the existing type system, especially since scope 
 has to be set everywhere by the programmer.

I know very little about compilers, but wouldn't figuring out if 
a variable is being escaped to an outer scope require flow 
analysis?

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 23:27:06 UTC, Meta wrote:
 I know very little about compilers, but wouldn't figuring out 
 if a variable is being escaped to an outer scope require flow 
 analysis?

I might be using the word wrong too, but I don't think so. All 
the information needed is available right at the assignment point:

int[] a;
void foo() {
    scope int[] b;
    a = b; // this line is interesting
}

At that line alone, everything we need to know is available. The 
compiler currently has this code:

     if (e1->op == TOKvar &&
         (((VarExp *)e1)->var->storage_class & STCscope) &&
         op == TOKassign)
     {
         error("cannot rebind scope variables");
     }


If a was marked scope, it would trigger that. But I don't think 
this is a particularly useful check, at least not alone. We could 
try this though:

     if (
         e1->op == TOKvar &&
         !(((VarExp *)e1)->var->storage_class & STCscope) &&
         e2->op == TOKvar &&
         (((VarExp *)e2)->var->storage_class & STCscope) &&
         op == TOKassign)
     {
         error("cannot assign scope variables to non-scope");
     }


Now, that line will fail with this error - scope to non-scope is 
not allowed.

If we add scope to the a variable, the existing compiler code 
will trigger an error:

test500.d(41): Error: cannot rebind scope variables

But I don't like that - it is /too/ conservative (although these 
two checks together more or less achieve the first step I want). 
However, what if we take that check out, can we allow rebinding 
but only to the same or lower scopes?

Well, the VarExp->var we have on the left-hand side has info 
about the variable's parent.... and we can check that. I'd like 
to point out that this is WRONG but I don't know dmd that well 
either:

     if (
         e1->op == TOKvar &&
         (((VarExp *)e1)->var->storage_class & STCscope) &&
         e2->op == TOKvar &&
         (((VarExp *)e2)->var->storage_class & STCscope) &&
         op == TOKassign)
     {
         Declaration* v1 = ((VarExp*)e1)->var;
         Declaration* v2 = ((VarExp*)e2)->var;
         if (v1->parent != v2->parent)
             error("cannot assign scope variable to higher scope");
     }


But this prohibits it from passing outside the function. A more 
correct check would be to ensure v2's scope is equal to or a parent 
of v1's (so a reference only moves to the same or a lower scope)... 
and using whatever is needed to handle inner scopes inside a function.
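
A rough sketch of that parent walk, in standalone C++ rather than against dmd's real classes. `ScopeNode`, `enclosesOrEquals` and `mayRebind` are made-up names for illustration only. The assumptions are that each declaration can reach its enclosing scope through a parent pointer, and that "same or lower scope" means the destination lives in the source's scope or in one nested inside it:

```cpp
// Hypothetical stand-in for a lexical scope; NOT dmd's real data structure.
struct ScopeNode {
    const ScopeNode* parent; // nullptr for the outermost (module) scope
};

// True if `outer` is `inner` itself or one of its enclosing scopes.
bool enclosesOrEquals(const ScopeNode* outer, const ScopeNode* inner) {
    for (const ScopeNode* s = inner; s != nullptr; s = s->parent) {
        if (s == outer)
            return true;
    }
    return false;
}

// `dst = src` between two scope variables is accepted only if dst cannot
// outlive src, i.e. src's scope is dst's scope or one of its ancestors.
bool mayRebind(const ScopeNode* dstScope, const ScopeNode* srcScope) {
    return enclosesOrEquals(srcScope, dstScope);
}
```

Given a module scope, a function scope inside it, and a block inside the function, this accepts rebinding from the function scope into the block and rejects rebinding from the function scope out to the module scope.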


tbh I don't know just what to look at to get the rest of the 
scope, but since the compiler can identify what this name 
actually refers to somehow, it must know where the declaration 
comes from too! I just don't know where exactly to look.



Of course, the tricky part will be structs, but the same basic 
idea of the implementation should work (if we can get it working 
first for basic assignments).

Feb 06 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
 ...

I only had a quick look at it, but here are some things to remember 
that I realised when drafting my own scope proposal:

1) passing scope arguments:

/* what to put here as qualifier? */ int[] foo(scope int[] input)
{
     return input; // this should work
     int[5] internal; // implicitly scope
     return internal[]; // but this shouldn't
}

I think it makes sense to prohibit `scope` as explicitly named 
return attribute but make it inferrable via `inout`.

2) Transitivity & aggregation:

struct A
{
     scope int[] slice1;
     int[] slice2;
     int value;
}

void foo(int[] input) { }
void boo(ref int input) { }

void main()
{
     int[5] stack;
     A a; // is it different from "scope A a;"?
     a.slice1 = stack[]; // guess should be ok
     a.slice2 = new int[](5); // should this?
     foo(a.slice1); // obviously fail
     foo(a.slice2); // but does this?
     boo(a.value); // I'd expect this to fail
}

Main problem with strict scope definition is that most seem to 
intuitively get what it is expected to do but defining the exact set 
of rules is rather tricky.

Feb 06 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 6 February 2014 at 22:04:05 UTC, Dicebot wrote:
 I think it makes sense to prohibit `scope` as explicitly named 
 return attribute but make it inferrable via `inout`.

I think it is very important to put on the return value 
explicitly so it can be used to control access to a sealed 
resource.

Perhaps returning a scope var would make it easy enough to infer 
though.


 2) Transitivity & aggregation:
     A a; // is it different from "scope A a;"?

Probably not because A is a value type. Even if you explicitly 
marked it scope, it wouldn't really matter. Any pointer to it 
should be scope since it is on the stack though.

     a.slice2 = new int[](5); // should this?

Yeah, it should. Here's how I'm seeing the struct: let's just 
decompose it to a list of local variables. So "A a;" is 
considered by the same rules as
    scope int[] a_slice1;
    int[] a_slice2;
    int a_value;

as if they were written right there as local variables. A struct 
is conceptually just a group of variables, after all, let's treat 
it just like that.

     foo(a.slice2); // but does this?

Thus this is ok too, since it would work if we used the local var 
a_slice2.

 void boo(ref int input) { }
     boo(a.value); // I'd expect this to fail

I actually think that should work. Let's try to imagine what 
problems could come up in boo:

int* global;
void boo(ref int input) {
    global = &input;
}

That is a problem... but it is almost ALWAYS a problem. Unless it 
happened to be passed a heap int by ref, this would always fail.

I think taking the address of a ref parameter should be allowed, 
but should always yield a scope pointer. You can't be sure it 
isn't on the stack, so you don't want to escape that address.... 
perhaps ref implies scope? If you want a pointer to escape, ask 
for a pointer. Moreover, I think address of a stack var should 
also be scope, so

void storeMe(int* i) {}
void test() {
    int i;
    // fails: address of stack var yielded scope var,
    // which cannot be passed to non-scope parameter
    storeMe(&i);
}

Otherwise though, writing to a ref is ok, even if it is on the 
stack. If boo just wants to read and update the value, that's ok.


 Main problem with strict scope definition is that most seem to 
 intuitively get what it is expected to do but defining the exact set 
 of rules is rather tricky.

Yeah, structs definitely complicate things, but I think 
pretending they are just a bunch of local variables gives us a 
consistent and useful definition.

Feb 06 2014

"Marc Schütz" <schuetzm gmx.net> writes:

(I see now that Adam already talked about this in his second 
post, but I'm posting it anyway, as I suggest a different 
solution.)

On Thursday, 6 February 2014 at 15:47:44 UTC, Adam D. Ruppe wrote:
 c) Calling methods on a struct which may escape the scope is 
 wrong. Ideally, `this` would always be scope... in fact, I 
 think that's the best way to go. An alternative though might be 
 to restrict calling of non-pure functions. Pure functions don't 
 allow mutation of non-scope data in the first place, so they 
 shouldn't be able to escape references.

As a consequence of this, it would no longer be possible to 
manage an owned object that has a back-pointer to its owner, e.g:

class Window {
     Menu _menu;
     this() {
         _menu = new Menu(this);    // cannot pass `this`
     }
     ~this() {
         delete _menu;
     }
}

class Menu {
     Window _parent;
     this(Window parent) {
         _parent = parent;
     }
}

This can probably be solved somehow. Allowing `scope` as a type 
constructor and doing the following might work, but I'm not sure 
about the safety implications:

class Window {
     scope Menu _menu;
     this() {
         _menu = new Menu(this);    // `new` returns a scope(Menu)
     }
     ~this() {
         delete _menu;              // either explicitly or implicitly
     }
}

class Menu {
     scope Window _parent;
     scope this(scope Window parent) {
         _parent = parent;
     }
}

(Note that this assumes scope isn't the default.)

The trick is to recognize that we can basically treat 
constructors and destructors as having a scope that starts when 
the constructor is entered and ends when the destructor returns. 
`this` can be seen as being declared inside this scope.

For member fields, `scope` also means "owned". Therefore, it 
needs to be destroyed when the scope is left (which in this case 
means: when the object is being destroyed). Thinking about this 
some more, this might even be a good idea for local scope 
variables too. This would basically mean undeprecating `scope` 
for classes. Anyway, this behaviour guarantees that the reference 
no longer exists after the object is destroyed.

`this` must be marked as `scope`, which causes `new` to return a 
scoped reference. This is necessary to keep it from being assigned 
to non-scope variables. Consider the following situation:

class Menu {
     scope Window _parent;
     scope this(scope Window parent) {
         _parent = parent;
     }
}

Menu a;
void foo(scope Window w) {
     a = new Menu(w);               // not good, trying to assign to non-scope
     scope Menu b = new Menu(w);    // ok
}

On the other hand, I already see at least one problem:

class SomeClass {
     scope SomeClass other;
}

void foo(scope SomeClass a) {
     scope SomeClass b = new SomeClass;
     a.other = b;    // ouch
}

In Rust this is solved by the concept of lifetimes, i.e. while 
both `a` and `b` are scoped, they have different lifetimes. It's 
disallowed to store a reference to an object with a shorter 
lifetime into an object with a longer lifetime.
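
To make that rule concrete, here is a toy model in C++. It is not how Rust implements lifetimes. It assumes, purely for illustration, that lifetimes form a single chain of nested regions that can be ranked by depth; `Lifetime`, `outlivesOrEquals` and `mayStoreInto` are hypothetical names:

```cpp
// Toy model: a lifetime is just a nesting depth in one chain of regions.
// Depth 0 is the longest-lived region; larger depths end sooner.
// (Sibling regions at the same depth are NOT modelled here.)
struct Lifetime {
    int depth;
};

// True if `a` lives at least as long as `b`.
bool outlivesOrEquals(Lifetime a, Lifetime b) {
    return a.depth <= b.depth;
}

// Storing a reference into a field of `holder` is accepted only if the
// referenced object lives at least as long as the holder does.
bool mayStoreInto(Lifetime holder, Lifetime referenced) {
    return outlivesOrEquals(referenced, holder);
}
```

In the `foo` example above, `a` comes from the caller (say depth 1) and `b` is created inside `foo` (depth 2), so `mayStoreInto(a, b)` is false and `a.other = b` would be rejected, while storing `a` into a field of `b` would be fine.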

Feb 08 2014
