digitalmars.D - pure static

reply "bearophile" <bearophileHUGS lycos.com> writes:
Currently this compiles because arr gets copied (or moved) on 
return (another, less visible write of arr happens at the point of 
its initialization, where it is written with its init value):


int[5] foo() pure {
     int[5] arr;
     return arr;
}
void main() {}


Currently even this function compiles, although it is not actually 
pure: arr contains garbage (garbage that leaks information from 
other functions' stack frames, so it is a security hazard). In 
theory a good compiler should disallow this:

int[5] foo() pure {
     int[5] arr = void;
     return arr;
}
void main() {}



On the other hand, I think a strongly pure function like this 
could be accepted, avoiding the final copy of the result (the 
result contains a pointer to static data; here the static data is 
an array, but returning a pointer to a static struct is equally 
valid):

int[] foo() pure {
     pure static int[5] arr;
     return arr;
}
void main() {}


"pure static" data means that 'arr' gets reset (overwritten by 
its init) on entry to the function foo (just as non-static 
variables are), to keep the function referentially 
transparent.

So this is forbidden:

pure static int[5] arr = void;


A smart compiler can even see arr is fully assigned inside the 
function and optimize away the first clear of the array:

int[] foo() pure {
     pure static int[5] arr; // optimized as =void
     foreach (immutable int i, ref r; arr)
         r = i;
     return arr;
}
void main() {}
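
What such an optimization would amount to can be sketched by hand in today's D with an ordinary local array. This is only an illustration: the name `filled` is made up, and it uses a stack array rather than the proposed "pure static" storage, which D does not accept.

```d
// Hand-written equivalent of the optimization: skip the default
// initialization because every element is assigned before it is read.
int[5] filled() pure {
    int[5] arr = void;          // no implicit zero-fill
    foreach (i, ref r; arr)
        r = cast(int) i;        // i is a size_t index; narrow it explicitly
    return arr;                 // returned by value
}

unittest {
    assert(filled() == [0, 1, 2, 3, 4]);
}
```

Since the array is fully written before it is returned, no garbage can leak out of the function.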


Bye,
bearophile
Jan 06 2014
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/06/2014 09:35 PM, bearophile wrote:
 "pure static" data means that 'arr' get cleaned (overwritten by its
 init) at the entry of the function foo (just like for not-static
 variables), to keep the function referentially transparent.
It doesn't. The reinitialization may be observable through references obtained from earlier calls.
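
A minimal sketch of this objection. It assumes an emulation, because "pure static" is not valid D: an ordinary (thread-local) static array is used, the explicit `arr[] = 0` stands in for the proposed reset on entry, and foo is therefore not marked pure.

```d
// Emulation only: ordinary static storage, reset by hand on every call.
int[] foo() {
    static int[5] arr;
    arr[] = 0;              // stand-in for the proposed reset to int.init
    return arr[];           // slice of the static storage
}

unittest {
    int[] first = foo();
    first[0] = 42;                      // caller writes through the earlier reference
    assert(first[0] == 42);

    int[] second = foo();               // re-entry resets the shared storage
    assert(first.ptr is second.ptr);    // both slices alias the same memory
    assert(first[0] == 0);              // the reset is visible through `first`
}
```

The value seen through `first` changes without `first` being touched, so the second call has an observable effect.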
Jan 06 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 It doesn't. The reinitialization may be observable through 
 references obtained from earlier calls.
Right, so that pure function can return only const data:

const(int[]) foo() pure {

Is this idea still sufficiently useful? Perhaps not.

Bye,
bearophile
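
A rough sketch of the const variant, under stated assumptions: "pure static" does not exist in D, so this uses an ordinary static array that is reset by hand, and the function is not marked pure. It only shows that callers cannot write through the returned view.

```d
// Emulation only: ordinary static storage with a const view returned.
const(int[]) foo() {
    static int[5] arr;
    arr[] = 0;              // stand-in for the proposed reset on entry
    return arr[];
}

unittest {
    auto view = foo();
    // Writes through the const view are rejected at compile time.
    static assert(!__traits(compiles, view[0] = 1));
    assert(view == [0, 0, 0, 0, 0]);
}
```

If the function filled the array with call-dependent data, a view kept from an earlier call would still see the change, so const alone only covers the case where the contents are the same on every call.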
Jan 06 2014
prev sibling parent reply "Meta" <jared771 gmail.com> writes:
On Monday, 6 January 2014 at 20:35:39 UTC, bearophile wrote:
 Currently this compiles because arr gets copied (or moved) 
 (another less visible write of arr happens at the point of its 
 initialization, it gets written by its init):


 int[5] foo() pure {
     int[5] arr;
     return arr;
 }
 void main() {}


 Currently even this function compiles, despite it's not 
 actually pure, because arr contains garbage (and it's garbage 
 that leaks information from other function stack frames, so 
 it's a security hazard), so in theory a good compiler should 
 disallow this:

 int[5] foo() pure {
     int[5] arr = void;
     return arr;
 }
 void main() {}



 On the other hand I think a strongly pure function like this 
 could be accepted, avoiding the final copy of the result (the 
 result contains a pointer to static data. Here the static data 
 is an array, but returning a pointer to a static struct is 
 equally valid):

 int[] foo() pure {
     pure static int[5] arr;
     return arr;
 }
 void main() {}


 "pure static" data means that 'arr' get cleaned (overwritten by 
 its init) at the entry of the function foo (just like for 
 not-static variables), to keep the function referentially 
 transparent.

 So this is forbidden:

 pure static int[5] arr = void;


 A smart compiler can even see arr is fully assigned inside the 
 function and optimize away the first clear of the array:

 int[] foo() pure {
     pure static int[5] arr; // optimized as =void
     foreach (immutable int i, ref r; arr)
         r = i;
     return arr;
 }
 void main() {}


 Bye,
 bearophile
Why not just return arr.dup instead? You're returning a slice of a stack-allocated array, so of course you shouldn't write code like this.
Jan 06 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Meta:

 Why not just return arr.dup instead? You're returning a slice 
 of a stack-allocated array, so of course you shouldn't write 
 code like this.
In certain critical code paths heap allocations are evil (perhaps even if your generational GC has a stack-like nursery, which the D GC currently doesn't have) :-) And you also want to minimize copies and initializations.

Bye,
bearophile
Jan 06 2014
next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Tuesday, 7 January 2014 at 00:54:10 UTC, bearophile wrote:
 Meta:

 Why not just return arr.dup instead? You're returning a slice 
 of a stack-allocated array, so of course you shouldn't write 
 code like this.
 In certain critical code paths heap allocations are evil (perhaps 
 even if your generational GC has a stack-like nursery, that 
 currently the D GC doesn't have) :-) And you also want to minimize 
 copies and initializations.

 Bye,
 bearophile
But is it ever legal to return a local stack-allocated static array from a function in D? Won't that entail a copy either way?
Jan 06 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Meta:

 But is it ever legal to return a local stack-allocated static 
 array from a function in D?
It's legal because in D fixed-size arrays are values. So D copies them (but in theory, in some cases, it can allocate them in the stack frame of the caller and avoid the copy).
 Won't that entail a copy either way?
In the code I've shown I have used "pure static", so the array is allocated statically, not on the stack.

Bye,
bearophile
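
The value semantics of fixed-size arrays mentioned above can be sketched as follows. The function name `make` is made up for illustration; whether the compiler actually elides the copy is an implementation detail this example does not show.

```d
// A fixed-size array is a value: returning it hands the caller its own copy.
int[5] make() pure {
    int[5] arr;             // default-initialized to int.init, i.e. all zeros
    arr[0] = 1;
    return arr;             // returned by value, not as a slice
}

unittest {
    int[5] a = make();
    int[5] b = make();
    a[0] = 99;                      // changing one result...
    assert(b[0] == 1);              // ...leaves the other untouched
    assert(a.ptr !is b.ptr);        // the two results live in separate storage
}
```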
Jan 06 2014
prev sibling parent "TheFlyingFiddle" <kurtyan student.chalmers.se> writes:
On Tuesday, 7 January 2014 at 00:54:10 UTC, bearophile wrote:
 Meta:

 Why not just return arr.dup instead? You're returning a slice 
 of a stack-allocated array, so of course you shouldn't write 
 code like this.
 In certain critical code paths heap allocations are evil (perhaps 
 even if your generational GC has a stack-like nursery, that 
 currently the D GC doesn't have) :-) And you also want to minimize 
 copies and initializations.

 Bye,
 bearophile
Wouldn't it be better to simply allocate on the stack directly in the calling function? Or, if that is not a possibility, use a region allocator or even a ScopeStack allocator? It's basically just a pointer bump, so it should be fast. So it would translate to something like this:

int[] foo(Allocator)(ref Allocator allocator) {
     auto arr = allocator.allocate!(int[])(5);
     foreach (i, ref elem; arr)
         elem = cast(int) i; // i is a size_t index, so narrow it explicitly
     return arr;
}
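
A minimal sketch of what such a region allocator might look like. The `Region` struct and its `allocate` member are hypothetical, written only to match the `allocator.allocate!(int[])(5)` call above; they are not an existing library API.

```d
// Hypothetical bump allocator over caller-supplied memory.
struct Region {
    ubyte[] buffer;     // backing storage owned by the caller
    size_t used;        // bytes handed out so far

    // Returns `count` elements of E carved out of the buffer.
    T allocate(T : E[], E)(size_t count) {
        enum mask = E.alignof - 1;
        immutable base = cast(size_t) buffer.ptr;
        // Round the next free address up to E's alignment.
        immutable start = ((base + used + mask) & ~mask) - base;
        immutable end = start + count * E.sizeof;
        assert(end <= buffer.length, "region exhausted");
        used = end;                         // the "pointer bump"
        auto result = cast(E[]) buffer[start .. end];
        result[] = E.init;
        return result;
    }
}

int[] foo(Allocator)(ref Allocator allocator) {
    auto arr = allocator.allocate!(int[])(5);
    foreach (i, ref elem; arr)
        elem = cast(int) i;
    return arr;
}

unittest {
    ubyte[64] storage;                  // stack memory in the calling function
    auto region = Region(storage[]);
    int[] arr = foo(region);
    assert(arr == [0, 1, 2, 3, 4]);
    assert(region.used >= 5 * int.sizeof);
}
```

The returned slice is only valid while `storage` is alive, so the caller decides the lifetime, which is the point of passing the allocator in.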
Jan 07 2014