digitalmars.D - Deterministic life-time storage type

travert phare.normalesup.org (Christophe) writes:
Hi. I don't have time to follow all the discussions here, but for some 
time I have had an idea that I would like to share here, to find out 
whether it might interest programmers in various fields. The idea is a 
major change in the language (perhaps for D.3): give the compiler and 
the programmer tools to know the lifetime of any type or object in 
memory, to allow better memory management. This could dramatically 
reduce the use of the GC, and even allow extra optimisations. It would 
also solve the cast-to-immutable problem, and perhaps even some 
r-value/l-value issues. However, this feature requires some discipline 
to use, and that's why I would like to know whether people would be 
interested, or whether it is too much to add to a programming language.

Now, this is the idea in a few words: 
In each function signature, you can add information about whether the 
function may keep a reference to its parameters or return value. Then, 
when you declare a variable, you can say how long you want to use that 
variable. With this information, the compiler can check that you use 
your variables correctly, and use this information to destroy the 
variable at the right time.

To do this, I'll alter the meaning of the scope, in, out and inout 
keywords to create new storage types:

 - dynamic variable: this refers to a variable for which references can 
be freely taken. It is allocated on the heap, and garbage collected the 
usual way. This is the default, but an additional keyword, "dynamic", may 
be used to explicitly declare a dynamic variable.

 Example:
 | dynamic(int)[] a = [1, 2, 3]; // same as: int[] a = [1, 2, 3];
 | dynamic(int) b = 5; // same as: ref b = new int; b = 5;

 - scope variable: this refers to a variable for which we can be sure 
that no reference to the variable, or any subpart of it (scope is 
transitive), will survive the current scope. No dynamic reference of a 
scope variable can be made.

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |    scope int[] a = [1, 2, 3]; // the allocated array can be destroyed 
 |                               // at the end of the current scope
 |    scope int b = 5; // same as: int b = 5; (except that no closures are 
 |                     // allowed)
 |
 |    g = a[]; // error: no reference of a may escape main's scope.
 | }

A specific scope, different from the current scope, can be specified by 
adding parentheses to the scope keyword:

 - scope(in): This scope is a bridge between scope and dynamic. 
Variables of any scope (including dynamic variables) can be cast to 
scope(in). External references to a scope(in) variable may exist, but no 
new reference to a scope(in) variable that survives the current scope 
may be made. Several scope(in) variables usually do not share the same 
scope (use scope(label) for that).

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |   int[] a = [1, 2, 3];
 |   scope int[] b = [4, 5, 6];
 |
 |   scope(in) int[] c = a[]; // ok
 |   c = b[]; // ok
 |   g = c[]; // error: no reference of c may escape main's scope.
 | }

 - scope(out): This scope is for variables to be returned. When a 
scope(out) variable is returned, the calling function can be sure that 
no reference to the variable or any of its subparts exists anywhere 
except in the returned value itself. The caller may cast the scope(out) 
variable to any scope, and may even cast it to immutable. The caller 
"decides" what the scope of the scope(out) variable is.

 Example:
 | scope(out) int[] oneTwoThree()
 | {
 |   scope(out) int[] r = [1, 2, 3];
 |   return r;
 | }
 | 
 | void main()
 | {
 |   scope a = oneTwoThree(); // the caller decides the scope of the result
 | }

 - scope(inout): A combination of scope(in) and scope(out): no 
reference to the variable that survives the scope may be taken, except 
through the returned value.

 Example:
 | scope(inout) int[] firstHalf(scope(inout) int[] a)
 | {
 |   return a[0..$/2];
 | }

 - scope(label) variable: variable shares its scope with
the variable or label "label".

 Example:
 | void main()
 | {
 |   scope a = [1, 2, 3];
 |   {
 |     scope(a) b = [3, 4, 5];
 |     a = b; //  ok, b has a's scope
 |   }
 | }
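
For comparison, here is a minimal sketch of how the two return-related cases look in D as it exists today, without any of the proposed syntax. It rests on two assumptions: `firstHalf` is written with no annotation at all, to show the aliasing that scope(inout) would have to track, and `oneTwoThree` uses `assumeUnique` from std.exception, where the programmer (not the compiler) vouches for the uniqueness that scope(out) is meant to guarantee.

```d
import std.exception : assumeUnique;

// Returns a slice of its argument: the result aliases `a`, which is the
// relationship scope(inout) would have to express in the signature.
int[] firstHalf(int[] a)
{
    return a[0 .. $ / 2];
}

// Builds a fresh array and hands it out as immutable. Nothing checks that
// `r` is really unique here; assumeUnique simply trusts the programmer.
immutable(int)[] oneTwoThree()
{
    int[] r = [1, 2, 3];
    return assumeUnique(r);
}

void main()
{
    int[] data = [1, 2, 3, 4];
    int[] half = firstHalf(data);
    half[0] = 42;
    assert(data[0] == 42);   // the write through `half` is visible in `data`
    assert(half.length == 2);

    immutable(int)[] fixed = oneTwoThree();
    assert(fixed == [1, 2, 3]);
}
```

In both functions the property the caller relies on (aliasing in one case, uniqueness in the other) is invisible in the signature, which is the gap the scope(inout) and scope(out) annotations are meant to fill.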

In addition, to make scope usage less verbose, we may make in, out, and 
inout parameters and return values implicitly scope(in), scope(out), 
and scope(inout) respectively, in addition to their current meaning, as 
long as code breakage is tolerable (so probably not before D.3, unless 
this proposal gets more approval than I expect).

This scope system is very similar to the mutable/immutable system. It is 
optional (one may code without it). There is transitivity, a bridge 
type (const or scope(in)), and also the same virality (is this an 
English word??). This means that, to be usable, this system requires 
restricting the usage of the parameters and return values of functions 
with the appropriate keywords (scope(in), scope(out) or scope(inout)); 
otherwise a scoped variable can't be passed to a function and is not 
usable in practice. But in my opinion, the gain is very large. When it 
is used, variable lifetimes become deterministic, and the compiler can 
destroy variables at the right time and use the GC only when necessary, 
or for global variables.
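
The "bridge type" half of this analogy can be seen in today's D with const. The sketch below is only an illustration of the existing mutable/const/immutable system, not of the proposed scopes; the function name `sum` is made up for the example.

```d
// const is the bridge: a single function accepts mutable and immutable data.
int sum(const(int)[] xs)
{
    int total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

void main()
{
    int[] m = [1, 2, 3];
    immutable(int)[] i = [4, 5, 6];
    assert(sum(m) == 6);
    assert(sum(i) == 15);
    // Had the parameter been declared as plain int[], the call sum(i) would
    // be rejected: immutable(int)[] does not convert to int[]. That is the
    // "virality" mentioned above: signatures have to be annotated.
}
```

In the proposal, scope(in) would play the role that const plays here: the one annotation a function needs so that both scope and dynamic variables can be passed to it.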

I only gave here a few definitions, from which a whole scope system can 
be deduced, and implemented. I've given it more thought, but this post 
is long enough for now, so I will let you give me your thoughts, and 
gladly answer your questions about any subtleties that may arise, 
feasibility, etc.

-- 
Christophe Travert
Apr 21 2012
Artur Skawina <art.08.09 gmail.com> writes:
On 04/21/12 16:22, Christophe wrote:
 Now, this is the idea in a few words: 
 In each function signature, you can add information about whether the 
 function may keep a reference to its parameters or return value. Then, 
 when you declare a variable, you can say how long you want to use that 
 variable. With this information, the compiler can check that you use 
 your variables correctly, and use this information to destroy the 
 variable at the right time.
 
 To do this, I'll alter the meaning of the scope, in, out and inout 
 keywords to create new storage types:
[...]
 I only gave here a few definitions, from which a whole scope system can 
 be deduced, and implemented. I've given it more thought, but this post 
 is long enough for now, so I will let you give me your thoughts, and 
 gladly answer your questions about any subtleties that may arise, 
 feasibility, etc.

"scope", in its current meaning, should have been the default for all
function arguments. If this was the case, would introducing your scope-scopes
bring any additional benefits? (Let's ignore enforcement for now, and assume
the compiler won't let the scoped variables escape).

There was a thread some time ago on a similar topic:
http://www.digitalmars.com/d/archives/digitalmars/D/learn/Why_I_could_not_cast_string_to_int_32126.html#N32168

Your "scope(out)" seems to be yet another incarnation of uniq/unique
(something that apparently keeps coming up over and over again).

"scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; }";
reusing the "return" keyword to mean "this argument could be returned
directly or indirectly as result".

artur
Apr 21 2012
travert phare.normalesup.org (Christophe) writes:
Artur Skawina, in message (digitalmars.D:164784), wrote:
 "scope", in its current meaning, should have been the default for all
 function arguments. If this was the case, would introducing your scope-scopes
 bring any additional benefits? (Let's ignore enforcement for now, and assume
 the compiler won't let the scoped variables escape).

Scope in its current meaning is about the same as my scope(in) for
parameters. However, it is not transitive, and that changes a lot of
things:

 | int[] g;
 | void foo(scope int[] a) { g = a; } // passes: a is not protected at all

scope is pointless here. It cannot even protect an array. Just like
const/immutable, transitivity is essential.

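
To make concrete what "escaping" means in the snippet above, here is a small runnable sketch. It deliberately leaves out the `scope` keyword, on the assumption that how strictly `scope` is checked depends on the compiler version and flags in use; the point is only to show the behaviour that an enforced, transitive scope would have to reject.

```d
int[] g;

// No `scope` on the parameter, on purpose: this is the escape itself.
void foo(int[] a)
{
    g = a; // the parameter is stored in a global and outlives the call
}

void main()
{
    int[] local = [1, 2, 3];
    foo(local);
    local[0] = 9;
    assert(g is local);   // g refers to the same memory as local
    assert(g[0] == 9);    // so writes through local are visible through g
}
```

Once `g` holds the reference, nothing local to `main` can decide when the array may be freed, which is why the proposal needs the no-escape guarantee before it can destroy variables deterministically.
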
 Your "scope(out)" seems to be yet another incarnation of uniq/unique 
 (something that apparently keeps coming up over and over again).

 "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; 
 }"; reusing the "return" keyword to mean "this argument could be 
 returned directly or indirectly as result".

I never heard about uniq or the return keyword for parameters. I don't
have time to follow the forum most of the time. So basically I just put
those three ideas together, with a new naming convention, and
transitivity. Why are these ideas not going further?

Well, I could have summarized my long post in one line: Why is the scope
attribute not transitive?

-- 
Christophe
Apr 21 2012
Artur Skawina <art.08.09 gmail.com> writes:
On 04/22/12 02:01, Christophe wrote:
 Artur Skawina, in message (digitalmars.D:164784), wrote:
 "scope", in its current meaning, should have been the default for all
 function arguments. If this was the case, would introducing your scope-scopes
 bring any additional benefits? (Let's ignore enforcement for now, and assume
 the compiler won't let the scoped variables escape).

Scope in its current meaning is about the same as my scope(in) for
parameters. However, it is not transitive, and that changes a lot of
things:

 | int[] g;
 | void foo(scope int[] a) { g = a; } // passes: a is not protected at all

scope is pointless here. It cannot even protect an array. Just like
const/immutable, transitivity is essential.

Yes, there was a reason for my lets-ignore-enforcement-for-now suggestion... The compiler won't catch the escaping refs right now.

 Your "scope(out)" seems to be yet another incarnation of uniq/unique 
 (something that apparently keeps coming up over and over again).

 "scope(inout)" AFAICT could be "T[] f(return T[] a) { return a[0..2]; 
 }"; reusing the "return" keyword to mean "this argument could be 
 returned directly or indirectly as result".

I never heard about uniq or the return keyword for parameters. I don't
have time to follow the forum most of the time. So basically I just put
those three ideas together, with a new naming convention, and
transitivity. Why are these ideas not going further?

Well, I could have summarized my long post in one line: Why is the scope
attribute not transitive?

It is, or at least this is how I read "references in the parameter cannot be escaped". It just isn't currently enforced.

artur
Apr 21 2012
Michel Fortin <michel.fortin michelf.com> writes:
On 2012-04-21 14:22:41 +0000, travert phare.normalesup.org (Christophe) said:

 This scope system is very similar to the mutable/immutable system. It is
 optional (one may code without it). There is transitivity, a bridge
 type (const or scope(in)), and also the same virality (is this an
 English word??). This means that, to be usable, this system requires
 restricting the usage of the parameters and return values of functions
 with the appropriate keywords (scope(in), scope(out) or scope(inout));
 otherwise a scoped variable can't be passed to a function and is not
 usable in practice. But in my opinion, the gain is very large. When it
 is used, variable lifetimes become deterministic, and the compiler can
 destroy variables at the right time and use the GC only when necessary,
 or for global variables.

I fear your solution might not be complicated enough (!) to allow some
common patterns. One simple case that often challenges such proposals is
the swap function.

So with your system, how do you write the swap function?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 21 2012
travert phare.normalesup.org (Christophe Travert) writes:
Michel Fortin, in message (digitalmars.D:164824), wrote:
 So with your system, how do you write the swap function?

I've thought about that. The scope(label) is the key.

 | void swap(T)(scope ref T a, scope(a) ref T b)
 | {
 |   scope(a) tmp = a;
 |   a = b;
 |   b = tmp;
 | }

scope(inout) would also do the trick, since it is implicitly shared
between parameters and return values.

-- 
Christophe
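
As a baseline for this discussion, here is a minimal sketch of the same function in plain D as it exists today, with no scope annotations. The name `swapValues` is hypothetical, chosen only to avoid clashing with the `swap` already provided by std.algorithm.

```d
// Plain generic swap: both parameters are taken by reference so that the
// caller's variables are actually exchanged.
void swapValues(T)(ref T a, ref T b)
{
    T tmp = a;
    a = b;
    b = tmp;
}

void main()
{
    int[] x = [1, 2, 3];
    int[] y = [4, 5, 6];
    swapValues(x, y);
    assert(x == [4, 5, 6]);
    assert(y == [1, 2, 3]);
}
```

After the call, each variable refers to the array the other one used to hold, so neither array can safely be destroyed before both variables are gone. That is the constraint the shared scope(a) label is meant to encode in the signature.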
Apr 21 2012
Michel Fortin <michel.fortin michelf.com> writes:
On 2012-04-22 06:41:46 +0000, travert phare.normalesup.org (Christophe 
Travert) said:

 Michel Fortin, in message (digitalmars.D:164824), wrote:
 So with your system, how do you write the swap function?

I've thought about that. The scope(label) is the key.

 | void swap(T)(scope ref T a, scope(a) ref T b)
 | {
 |   scope(a) tmp = a;
 |   a = b;
 |   b = tmp;
 | }

scope(inout) would also do the trick, since it is implicitly shared
between parameters and return values.

Your proposal is very similar to some things that were discussed in 2008
when escape analysis became the topic of the day on this
newsgroup. There were two problems for adoption: it makes writing
functions difficult (because you have to add all that scoping thing to
your mental model) and implementing new type modifiers is a major
undertaking that didn't fit with the schedule. While the second problem
might disappear given enough time, the first one is a hurdle.

You might find this a good read:
<http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 22 2012
travert phare.normalesup.org (Christophe) writes:
Michel Fortin, in message (digitalmars.D:164837), wrote:
 You might find this a good read:
 <http://www.digitalmars.com/d/archives/digitalmars/D/Escape_analysis_78791.html>
 
Thanks, I'll have a look.
Apr 22 2012
travert phare.normalesup.org (Christophe) writes:
Michel Fortin, in message (digitalmars.D:164837), wrote:
 newsgroup. There were two problems for adoption: it makes writing 
 functions difficult (because you have to add all that scoping thing to 
 your mental model) and implementing new type modifiers is a major 
 undertaking that didn't fit with the schedule. While the second problem 
 might disappear given enough time, the first one is a hurdle.

If we choose the following defaults, the hurdle may not be that high:
 -1: function *parameters* and return values are scope by default
(scope(in) and scope(out) in my terminology, although it is enough to
say scope in that case),
 -2: *variables* are dynamic (= noscope = escape) when necessary, and
scope when the compiler can determine that they do not escape the scope.

This way, programmers don't have to worry about a variable's scope,
since variables are dynamic by default. But the performance cost may not
be too high, because most of the time the variables will be treated as
scope, since the functions that use them will be scope by default.

Lazy programmers only have to say when they let an argument escape the
scope of a function. This is a good thing, because this pattern should
be avoided, and in any case, this information should be documented.

When a scope has to be shared between parameters and the return value,
stating "inout" would adequately solve 90% of the cases or more. For the
less than 10% left, "dynamic" and/or deep copies make it possible to get
rid of the problem, if the programmer is too lazy to use scope(label) or
has no choice.

Finally, programmers requiring efficiency can control the scope of
variables by declaring them explicitly scope. This way, the compiler
will check that they are not dynamic.

This is obviously not backward-compatible. However, most of the errors
will occur in function signatures, where the compiler can be made to
provide adequate error messages so that the signature can be corrected
quickly.

I reckon the backward-compatibility issue and the implementation time
make it very difficult to consider this before D 3.0, which is not
tomorrow. I would really like this sort of system to be implemented in
the medium term, even if I think it's just a dream.

-- 
Christophe Travert, 4 years too late to discuss the issue.
Apr 23 2012