digitalmars.D - GC for pure functions -- implementation ideas

Don (79/79) Apr 15 2011 I noticed a lively discussion in Bugzilla about the GC, with speculation...

Walter Bright (2/6) Apr 15 2011 I think it's a good idea.
Sean Kelly (4/8) Apr 15 2011 thread local 'stack pointer' which points to the first free slot.

Don (3/9) Apr 15 2011 Yes. You could think of it as a way the compiler could automatically
dsimcha (15/21) Apr 15 2011 Yeah, I never formalized it at all, but that's roughly what TempAlloc

Don (9/37) Apr 15 2011 Yes. I'm not sure what the best strategy is if you run out of memory
Timon Gehr (19/33) Apr 16 2011 It should be trivial to just deallocate when arr goes out of scope.

bearophile (4/6) Apr 16 2011 If this is true, then you need a sealed temporary heap.
Fawzi Mohamed (27/32) Apr 17 2011 charset=US-ASCII;

Don (10/42) Apr 17 2011 Yes, but that only matters if that function is both extremely

Timon Gehr (11/20) Apr 17 2011 I think this idea has potential, I was only pointing out that maybe the ...

Michel Fortin (10/12) Apr 17 2011 Speaking of rethinking the GC and its primitives, is there a way

Steven Schveighoffer (6/14) Apr 18 2011 It's coarsely grained, but you can intercept all finalization calls:

Steven Schveighoffer (9/25) Apr 18 2011 Also, there's an undocumented, non-published (i.e. not in object.di)
Michel Fortin (10/29) Apr 18 2011 Won't do. What I need is to make the GC track memory blocks allocated

bearophile (9/16) Apr 15 2011 - In D we have not even started to exploit the full potential of purity ...
Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (9/25) Apr 15 2011 I'm far from being a GC expert but I think Java having identified such c...

Don (3/27) Apr 15 2011 That works for the non-leaky function itself, but it doesn't help for
Steven Schveighoffer (12/32) Apr 18 2011 Full escape analysis requires a non-stanard object format and a

Kagamin (2/8) Apr 18 2011 Why support from linker? Even .net uses plain old COFF, it just stores m...

Steven Schveighoffer (7/19) Apr 18 2011 Wouldn't the linker have to verify that the escapes are valid when linki...

Kagamin (2/6) Apr 18 2011 If it doesn't, it's not a big deal, I think. One usually does things in ...
Kagamin (2/3) Apr 18 2011 Metadata in .net is more than annotation, so there's no direct relation ...

bearophile (4/5) Apr 15 2011 Escape analysis will be useful for D compilers too (I think LDC-LLVM is ...
Jason House (3/8) Apr 17 2011 It'd reduce the use of the pure heap to leaky pure functions called from...

Don (17/25) Apr 17 2011 It would definitely help a lot. It just wouldn't catch everything.

Fawzi Mohamed (15/20) Apr 18 2011 yes more info is always better, I didn't want to diminish your work,

Denis Koroskin (6/85) Apr 18 2011 I was doing something similar to get rid of the memory leaks in DDMD: I ...
Bruno Medeiros (4/8) Apr 20 2011 This stuff is why I love static typing! :D
Robert Jacques (10/12) Nov 07 2011 I really like this general concept (It feels a lot like young/old

Jonathan M Davis (9/25) Nov 07 2011 The fact that Appender isn't pure is a huge blow to purity. At least wit...
bearophile (4/7) Nov 08 2011 As first step to implement Don's GC idea I think it will be useful a __t...

Gor Gyolchanyan (11/19) Nov 08 2011 This would be an AWESOME idea, because it would allow to specialize

Don <nospam nospam.com> writes:

I noticed a lively discussion in Bugzilla about the GC, with speculation 
about the impact of a precise GC on speed.
But it seems to me that a dedicated GC for pure functions has enormous 
unexplored potential, and might be relatively easy to implement.

LEAKY FUNCTIONS

Define a 'leaky' pure function as a pure function which can return
heap-allocated memory to the caller, ie, where the return value or a
parameter passed by reference has at least one pointer or reference
type. This can be determined simply by inspecting the signature. (Note
that the function does not need to be immutably pure).

The interesting thing is that heap allocation inside non-leaky pure
functions behaves like stack allocation. When you return from that
function, *all* those variables are unreachable, and can be discarded en 
masse. Here's an idea of how to exploit this.

THE PURE HEAP

Create a pure heap for each thread. This is a heap which can only be
used by pure functions. I present some simplistic code, with the
simplest possible implementation: just a big block of memory with a 
thread local 'stack pointer' which points to the first free slot.

static ubyte *heap; // initialized to big chunk of RAM.
static size_t stackptr = 0;
static size_t savedstackptr = 0;

For *non-leaky* pure functions: if any of the functions it calls are 
leaky, or if it makes any memory allocations, then call a HeapEnter 
function (in the druntime) at the start, and a HeapExit function at the 
end. Leaky pure functions don't get this prologue and epilogue code.
Non-leaky pure functions that don't do memory allocation are simply 
ignored. (Note that the compiler can determine if a function makes any 
memory allocations, simply by inspecting its body -- it isn't any more 
difficult than checking if it is nothrow).

void pureHeapEnter()
{
     cast(ubyte *)(heap + stackptr) = savedstackptr;
     savedstackptr = stackptr;
     stackptr += size_t.sizeof;
}

void pureHeapExit()
{
     stackptr = savedstackptr;  // instant GC!!
     savedstackptr = cast(ubyte *)(heap +stackptr);
}

The pureHeapExit function has the effect of instantly (and precisely!)
collecting all of the memory allocated in the non-leaky pure function 
and in every leaky function that it called.

In any pure function, leaky or non-leaky, when memory is allocated, call
pureMalloc instead of gcMalloc when allocating. (Non-leaky pure 
functions will of course always allocate on the pure heap.).

void *pureMalloc(int nbytes)
{
     if (!stackptr)
         return gcMalloc(nbytes); // we're leaky, do a normal malloc
     // we can use the pure heap
     auto r = heap + stackptr;
     stackptr += nbytes;
     return r;
}

REFINEMENTS
We can make this scheme more generally applicable. If there is a leaky 
return value which is cheap to copy, then we can pretend the function is 
non-leaky: at exit, if we were called with stackptr == 0, then we copy 
(deepdup) the return value to the gc heap, before calling pureHeapExit. 
If stackptr was non-zero, we don't need to copy it.

COMPLICATIONS
Classes with finalizers are an annoying complication. But again, we can 
look at all the functions we call, and all the 'new' operations we 
perform, to see if any finalizers exist. Maybe we could even have a 
separate finalizer heap?

Exceptions are the biggest nuisance, since they can also leak 
heap-allocated memory. A catch handler in a non-pure function would need 
to check to see if the pure heap 'stackpointer' is non-zero, and if so, 
it would need to do a deep dup of the exception, then clear the pure 
heap. Any pure function (leaky or not) which contains a catch handler 
would need to record the value of the savedstackptr at entry to the 
function, and the catch handler would need to unwind the pure heap until 
we get back to it.

In reality, things are going to be a bit more complicated than this. But
it seems to me that conceptually, something like this could still stay 
fairly simple and be very, very fast. With no changes required to the 
language, and not even any changes required to existing code.

Apr 15 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 4/15/2011 1:12 PM, Don wrote:
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay fairly
 simple and be very, very fast. With no changes required to the language, and
not
 even any changes required to existing code.

I think it's a good idea.

Apr 15 2011

Sean Kelly <sean invisibleduck.org> writes:

On Apr 15, 2011, at 1:12 PM, Don wrote:
=20
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a =

thread local 'stack pointer' which points to the first free slot.

It's a good idea.  dsimcha was already going to polish his TempAlloc, =
wasn't he?  Seems like this is nearly the same thing.=

Apr 15 2011

Don <nospam nospam.com> writes:

Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a thread
local 'stack pointer' which points to the first free slot.

 
 It's a good idea.  dsimcha was already going to polish his TempAlloc, wasn't
he?  Seems like this is nearly the same thing.

Yes. You could think of it as a way the compiler could automatically 
convert 'new' into a use of TempAlloc, in many useful cases.

Apr 15 2011

dsimcha <dsimcha yahoo.com> writes:

On 4/15/2011 5:01 PM, Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a thread
local 'stack pointer' which points to the first free slot.

 It's a good idea.  dsimcha was already going to polish his TempAlloc, wasn't
he?  Seems like this is nearly the same thing.

Yeah, I never formalized it at all, but that's roughly what TempAlloc 
accomplishes.  My other concern is, what happens in the case of the 
following code:

uint nonLeaky() pure {
     foreach(i; 0..42) {
          auto arr = new uint[666];
          // do stuff
     }

     return 8675309;
}

In this case the arr instance from every loop iteration is retained 
until nonLeaky() returns, whether it's referenced or not.  Granted, this 
is a silly example, but I imagine there are cases where stuff like this 
happens in practice.

Apr 15 2011

Don <nospam nospam.com> writes:

dsimcha wrote:
 On 4/15/2011 5:01 PM, Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a 
 thread local 'stack pointer' which points to the first free slot.

 It's a good idea.  dsimcha was already going to polish his TempAlloc, 
 wasn't he?  Seems like this is nearly the same thing.

 
 Yeah, I never formalized it at all, but that's roughly what TempAlloc 
 accomplishes.  My other concern is, what happens in the case of the 
 following code:
 
 uint nonLeaky() pure {
     foreach(i; 0..42) {
          auto arr = new uint[666];
          // do stuff
     }
 
     return 8675309;
 }
 
 In this case the arr instance from every loop iteration is retained 
 until nonLeaky() returns, whether it's referenced or not.  Granted, this 
 is a silly example, but I imagine there are cases where stuff like this 
 happens in practice.

Yes. I'm not sure what the best strategy is if you run out of memory 
inside purealloc.
The simple strategy would just be switch to the normal heap once the 
pure heap is full. Then, the gc would have to check for a full pure 
heap. If the pure heap is full, it should scan everything from the last 
heapptr to the end of the pure heap, as well as everything else it 
scans. But you still get the entire size of the pure heap as an 
instant-gc region.

Apr 15 2011

Timon Gehr <timon.gehr gmx.ch> writes:

 Yeah, I never formalized it at all, but that's roughly what TempAlloc
 accomplishes.  My other concern is, what happens in the case of the
 following code:

 uint nonLeaky() pure {
      foreach(i; 0..42) {
           auto arr = new uint[666];
           // do stuff
      }

      return 8675309;
 }

 In this case the arr instance from every loop iteration is retained
 until nonLeaky() returns, whether it's referenced or not.  Granted, this
 is a silly example, but I imagine there are cases where stuff like this
 happens in practice.

It should be trivial to just deallocate when arr goes out of scope.

This would be more complicated to resolve, as the analogy to stack allocation
vanishes:

uint nonLeaky() pure {
     uint[] arr1 = new uint[666];
     uint[] arr2 = new uint[666];
     foreach(i; 0..42) {
          arr1 = new uint[66*i];
          // do stuff
     }

     return 8675309;
}

The problem is, that inside a non-leaky pure function the general case for
dynamic
allocations might be just as complicated as in other parts of the program.
However, the problem does only exist when the pure function deletes/overrides
its
own references. Those are the only ones it is allowed to modify. Therefore, the
compiler could just use the GC heap whenever a reference is assigned twice or
more
times inside a non-leaky pure function? I think it might be a better option than
letting the pure heap fill up with garbage.

Apr 16 2011

bearophile <bearophileHUGS lycos.com> writes:

Timon Gehr:

 The problem is, that inside a non-leaky pure function the general case for
dynamic
 allocations might be just as complicated as in other parts of the program.

If this is true, then you need a sealed temporary heap.

Bye,
bearophile

Apr 16 2011

Fawzi Mohamed <fawzi gmx.ch> writes:

	charset=US-ASCII;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: 7bit


On 16-apr-11, at 22:49, Timon Gehr wrote:

 [...]
 The problem is, that inside a non-leaky pure function the general  
 case for dynamic
 allocations might be just as complicated as in other parts of the  
 program.

indeed, this is exactly what I wanted to write, yes in some cases, one  
can get away with simple stack like, or similar but it breaks down  
very quickly.
In fact GC were introduced by functional languages, because they are  
kind of needed for them, already that should hint to the fact that  
functional, or pure languages are not intrinsically easier to collect.

What can be useful is allowing one to add a set of pools, that then  
can be freed all at once.
Having several pools is also what is needed to remove the global lock  
in malloc, so that is definitely the way to go imho.
Then one can give the control of these extra pools to the programmer,  
so that it is easy use a special pool for a part of the program and  
then release a lot of objects at once. Even then one should put quite  
some thought into it (for example about static/global objects that  
might be allocated for caching purposes).
A strictly pure function returning a value without pointers gives  
guarantees, but as soon as some caching (even behind the scenes) goes  
on, then things will fail. If a separate pool is used consistently for  
cached or shared objects one should be able to allow even caching.
All this comes back again to having several pools, showing how useful  
such a primitive is.

Fawzi

Apr 17 2011

Don <nospam nospam.com> writes:

Fawzi Mohamed wrote:
 
 On 16-apr-11, at 22:49, Timon Gehr wrote:
 
 [...]
 The problem is, that inside a non-leaky pure function the general case 
 for dynamic
 allocations might be just as complicated as in other parts of the program.


Yes, but that only matters if that function is both extremely 
memory-inefficient AND long-lived. In which case you can fall back to 
the normal GC (or even a dedicated pure GC). I don't think you ever lose 
anything.

 indeed, this is exactly what I wanted to write, yes in some cases, one 
 can get away with simple stack like, or similar but it breaks down very 
 quickly.
 In fact GC were introduced by functional languages, because they are 
 kind of needed for them, already that should hint to the fact that 
 functional, or pure languages are not intrinsically easier to collect.

I find that *impossible* to believe. Note also that you are equating 
"functional" = "pure" in the D sense, which is not true.
Firstly, functional languages generate *enormous* amounts of garbage. D 
does not. Secondly, non-leaky pure functions are rare in functional 
programming languages. I think we are in new territory here.
 
 What can be useful is allowing one to add a set of pools, that then can 
 be freed all at once.
 Having several pools is also what is needed to remove the global lock in 
 malloc, so that is definitely the way to go imho.
 Then one can give the control of these extra pools to the programmer, so 
 that it is easy use a special pool for a part of the program and then 
 release a lot of objects at once. Even then one should put quite some 
 thought into it (for example about static/global objects that might be 
 allocated for caching purposes).
 A strictly pure function returning a value without pointers gives 
 guarantees, but as soon as some caching (even behind the scenes) goes 
 on, then things will fail. If a separate pool is used consistently for 
 cached or shared objects one should be able to allow even caching.
 All this comes back again to having several pools, showing how useful 
 such a primitive is.
 
 Fawzi

Apr 17 2011

Timon Gehr <timon.gehr gmx.ch> writes:

Don wrote:

 The problem is, that inside a non-leaky pure function the general case
 for dynamic
 allocations might be just as complicated as in other parts of the program.


 Yes, but that only matters if that function is both extremely
 memory-inefficient AND long-lived. In which case you can fall back to
 the normal GC (or even a dedicated pure GC). I don't think you ever lose
 anything.

I think this idea has potential, I was only pointing out that maybe the compiler
should identify such garbage creating functions and use the normal GC for them
by
default. It is not really a valid option to only restrict the size of the pure
heap (or even just the amount it can grow per call frame) and then fall back to
the normal GC, because a non-leaky pure function that behaves nicely may also
need
a lot of memory.

Fawzi Mohamed wrote:
 Having several pools is also what is needed to remove the global lock in
 malloc, so that is definitely the way to go imho.

I agree. I don't like the fact that at the GC suspends all running threads while
collecting. But Hardware will probably be evolving towards Core-local RAM
(several
_physical_ pools) anyways.

Apr 17 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.

Speaking of rethinking the GC and its primitives, is there a way 
currently to tell the GC to track pointers to a manually allocated 
block and assign a callback for when the GC wants that block to be 
finalized and deallocated? I'm kinda going to need that if I am to 
bring decent garbage-collected Objective-C objects in D.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 17 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.

 Speaking of rethinking the GC and its primitives, is there a way  
 currently to tell the GC to track pointers to a manually allocated block  
 and assign a callback for when the GC wants that block to be finalized  
 and deallocated? I'm kinda going to need that if I am to bring decent  
 garbage-collected Objective-C objects in D.

It's coarsely grained, but you can intercept all finalization calls:

http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler

You'd have to figure out some way to filter only the ones you care about.

-Steve

Apr 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Apr 2011 08:27:25 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:

 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.

 Speaking of rethinking the GC and its primitives, is there a way  
 currently to tell the GC to track pointers to a manually allocated  
 block and assign a callback for when the GC wants that block to be  
 finalized and deallocated? I'm kinda going to need that if I am to  
 bring decent garbage-collected Objective-C objects in D.

 It's coarsely grained, but you can intercept all finalization calls:

 http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler

 You'd have to figure out some way to filter only the ones you care about.


Also, there's an undocumented, non-published (i.e. not in object.di)  
function that allows you to attach an event handler when an object is  
disposed:

https://github.com/D-Programming-Language/druntime/blob/master/src/object_.d#L2394

I think Bill Baxter's WeakRef object used it.  I'm not sure how supported  
it is, or why it isn't public.

-Steve

Apr 18 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-04-18 08:27:25 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:
 
 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:
 
 All this comes back again to having several pools, showing how useful
 such a primitive is.

 
 Speaking of rethinking the GC and its primitives, is there a way  
 currently to tell the GC to track pointers to a manually allocated 
 block  and assign a callback for when the GC wants that block to be 
 finalized  and deallocated? I'm kinda going to need that if I am to 
 bring decent  garbage-collected Objective-C objects in D.

 
 It's coarsely grained, but you can intercept all finalization calls:
 
 http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler
 
 You'd have to figure out some way to filter only the ones you care about.

Won't do. What I need is to make the GC track memory blocks allocated 
externally, with some callback telling me when they are no longer 
referenced.

I'll probably just add this capability to the GC myself when the time comes.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 18 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

 But it seems to me that a dedicated GC for pure functions has enormous 
 unexplored potential, and might be relatively easy to implement.

- In D we have not even started to exploit the full potential of purity and
pure functions.
- In D reducing the work of the normal GC is a very good thing (because of
problems caused by not moving GC, its not precise nature, its not advanced
implementation, and because D is a complex system language).
- CommonLisp programmers and designers know that's it's good to give some hints
to the the GC. Optional information that help the GC do its work more
efficiently. The D1 "scope" attribute for class allocations was a not well
designed&implemented example of this.


 (Note that the compiler can determine if a function makes any 
 memory allocations, simply by inspecting its body -- it isn't any more 
 difficult than checking if it is nothrow).

Time ago I have suggested a  noheap that makes sure a function tree doesn't
allocate memory. 
What you say makes me think that here user code may just desire to know what
the compiler knows: a __traits that given a function name returns a true if the
function performs heap allocation.


 With no changes required to the 
 language, and not even any changes required to existing code.

Phobos and other many small things (D too, like some form to express
conditional purity, etc) need to change to allow D programmers to use purity
more widely in their programs. This in turn will make your pure GC more and
more useful.

Bye,
bearophile

Apr 15 2011

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:

Don napisa=B3:

 LEAKY FUNCTIONS
=20
 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).
=20
 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en=

=20
 masse. Here's an idea of how to exploit this.
=20
 THE PURE HEAP
=20
 [snip]

I'm far from being a GC expert but I think Java having identified such case=
s with escape analysis just puts locally allocated objects on the stack.

Couldn't we too? Your mark & release "pure heap" scheme looks alright but t=
his seems simpler.

The notion of "non-leaky" functions can be useful either way.

--=20
Tomek

Apr 15 2011

Don <nospam nospam.com> writes:

Tomek Sowi�ski wrote:
 Don napisa�:
 
 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en 
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 [snip]

 
 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.

That works for the non-leaky function itself, but it doesn't help for 
the functions it calls.

 Couldn't we too? Your mark & release "pure heap" scheme looks alright but this
seems simpler.
 
 The notion of "non-leaky" functions can be useful either way.

Apr 15 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 15 Apr 2011 18:36:17 -0400, Tomek Sowiński <just ask.me> wrote:

 Don napisał:

 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 [snip]

 I'm far from being a GC expert but I think Java having identified such  
 cases with escape analysis just puts locally allocated objects on the  
 stack.

Full escape analysis requires a non-stanard object format and a  
non-standard linker.  Simply because, escape analysis must be a  
compiler-added attribute, and the linker has to understand that.  We can  
get away with certain things reusing the C linker by doing things like  
name mangling, but escape analysis must add far more annotations  
(essentially tell how parameters and return values are linked).

I think at some point, I'd prefer D to use it's own object format and  
linker, to allow both escape analysis, and to use the object format as the  
import database.  But that isn't likely to come from Walter, I think  
someone would have to build a new compiler.

-Steve

Apr 18 2011

Kagamin <spam here.lot> writes:

Steven Schveighoffer Wrote:

 Full escape analysis requires a non-stanard object format and a  
 non-standard linker.  Simply because, escape analysis must be a  
 compiler-added attribute, and the linker has to understand that.  We can  
 get away with certain things reusing the C linker by doing things like  
 name mangling, but escape analysis must add far more annotations  
 (essentially tell how parameters and return values are linked).

Why support from linker? Even .net uses plain old COFF, it just stores metadata
in a separate section. Escape analysis is a compile-time job for the compiler,
so this metadata can be simply ignored and discarded by the linker. For the
same reason, this metadata can be stored in any place, even in a separate file.

Apr 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Apr 2011 09:34:07 -0400, Kagamin <spam here.lot> wrote:

 Steven Schveighoffer Wrote:

 Full escape analysis requires a non-stanard object format and a
 non-standard linker.  Simply because, escape analysis must be a
 compiler-added attribute, and the linker has to understand that.  We can
 get away with certain things reusing the C linker by doing things like
 name mangling, but escape analysis must add far more annotations
 (essentially tell how parameters and return values are linked).

 Why support from linker? Even .net uses plain old COFF, it just stores  
 metadata in a separate section. Escape analysis is a compile-time job  
 for the compiler, so this metadata can be simply ignored and discarded  
 by the linker. For the same reason, this metadata can be stored in any  
 place, even in a separate file.

Wouldn't the linker have to verify that the escapes are valid when linking  
objects together?  I admit I don't know much about the object format and  
linking process, but I assumed that the linker had to be involved in  
enforcement of escape analysis.

Does the .NET linker ignore the metadata?

-Steve

Apr 18 2011

Kagamin <spam here.lot> writes:

Steven Schveighoffer Wrote:

 Wouldn't the linker have to verify that the escapes are valid when linking  
 objects together?  I admit I don't know much about the object format and  
 linking process, but I assumed that the linker had to be involved in  
 enforcement of escape analysis.

If it doesn't, it's not a big deal, I think. One usually does things in the
right order. This is better solved by a build tool, and integration of compiler
into a build tool may be a good idea. After all, some of them already use the
compiler to infer dependencies.

Apr 18 2011

Kagamin <spam here.lot> writes:

Steven Schveighoffer Wrote:

 Does the .NET linker ignore the metadata?

Metadata in .net is more than annotation, so there's no direct relation to what
we're talking about. It's just an example. It's not like there's nothing to fix
in COFF, but it's not a big deal to make it do anything you want. You simply
put your metadata in your section and even stay compatible with existing tools,
so new linker can add some extra safety, but it won't be required to make
things work.

Apr 18 2011

bearophile <bearophileHUGS lycos.com> writes:

Tomek Sowi�ski:

 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.

Escape analysis will be useful for D compilers too (I think LDC-LLVM is not
doing this much yet), but if the amount of non-escaping memory allocated is
large, you don't want to put it on the stack, a special heap is better.

Bye,
bearophile

Apr 15 2011

Jason House <jason.james.house gmail.com> writes:

Don Wrote:
 Tomek Sowi�ski wrote:
 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.

 
 That works for the non-leaky function itself, but it doesn't help for 
 the functions it calls.

It'd reduce the use of the pure heap to leaky pure functions called from pure
functions. If I understood the original proposal correctly, this would reduce
how frequently pure functions have to manipulate the pure stack. I haven't
thought through the exception handling case, so I may be completely wrong!

I was originally assuming the return type of a pure function was enough to
determine if it wasn't leaky, but now I'm thinking only pure nothrow functions
can be non-leaky. That might make the stack allocation optimization too rare to
be worthwhile?

Apr 17 2011

Don <nospam nospam.com> writes:

Jason House wrote:
 Don Wrote:
 Tomek Sowi�ski wrote:
 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.

 That works for the non-leaky function itself, but it doesn't help for 
 the functions it calls.

 
 It'd reduce the use of the pure heap to leaky pure functions called from pure
functions. If I understood the original proposal correctly, this would reduce
how frequently pure functions have to manipulate the pure stack. I haven't
thought through the exception handling case, so I may be completely wrong!

It would definitely help a lot. It just wouldn't catch everything.
It seems fairly difficult though.

 I was originally assuming the return type of a pure function was enough to
determine if it wasn't leaky, 

You also have parameters passed by reference.

but now I'm thinking only pure nothrow functions can be non-leaky. That 
might make the stack allocation optimization too rare to be worthwhile?

It might. Although as I mentioned, you can deep-dup any exceptions onto 
the normal gc heap, at the moment they are caught. That deep-dup is not 
performance critical, and in most cases, the dup is simple. But the 
worst case is, the entire pure heap needs to be copied. It might turn 
out to be too complicated to be worthwhile, limiting the scheme to pure 
nothrow functions.
Nontheless I think pure nothrow functions will be pretty common.

Basically, my contribution is this: the compiler can easily work out, 
for each function, whenever it has entered and exited a non-leaky pure 
function. It can make a call into the GC whenever this happens. This 
gives the GC many more potential strategies.

Apr 17 2011

Fawzi Mohamed <fawzi gmx.ch> writes:

On 17-apr-11, at 21:44, Don wrote:

 [...]
 Basically, my contribution is this: the compiler can easily work  
 out, for each function, whenever it has entered and exited a non- 
 leaky pure function. It can make a call into the GC whenever this  
 happens. This gives the GC many more potential strategies.

yes more info is always better, I didn't want to diminish your work,  
but to point toward a general improvement in the GC
My fear is that the overhead of this approach will make it worth only  
for big allocations (getting rid of eventual false pointers), and thus  
only under the programmer control.

Classifying the "allocation potential" of a function might help make  
better automatic decisions.
That is difficult in general  (one can flag alloc in loop with unknown  
length or more than 4 elements as large for example, something that  
would miss recursive allocations), but it would be useful, because for  
those that allocate little could do nothing, or use a fixed stack like  
heap, whereas those that allocate a lot could checkpoint the heap  
(assuming a separate pool for each thread) or use a new pool.

Fawzi

Apr 18 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 16 Apr 2011 00:12:34 +0400, Don <nospam nospam.com> wrote:

 I noticed a lively discussion in Bugzilla about the GC, with speculation  
 about the impact of a precise GC on speed.
 But it seems to me that a dedicated GC for pure functions has enormous  
 unexplored potential, and might be relatively easy to implement.

 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en  
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a  
 thread local 'stack pointer' which points to the first free slot.

 static ubyte *heap; // initialized to big chunk of RAM.
 static size_t stackptr = 0;
 static size_t savedstackptr = 0;

 For *non-leaky* pure functions: if any of the functions it calls are  
 leaky, or if it makes any memory allocations, then call a HeapEnter  
 function (in the druntime) at the start, and a HeapExit function at the  
 end. Leaky pure functions don't get this prologue and epilogue code.
 Non-leaky pure functions that don't do memory allocation are simply  
 ignored. (Note that the compiler can determine if a function makes any  
 memory allocations, simply by inspecting its body -- it isn't any more  
 difficult than checking if it is nothrow).

 void pureHeapEnter()
 {
      cast(ubyte *)(heap + stackptr) = savedstackptr;
      savedstackptr = stackptr;
      stackptr += size_t.sizeof;
 }

 void pureHeapExit()
 {
      stackptr = savedstackptr;  // instant GC!!
      savedstackptr = cast(ubyte *)(heap +stackptr);
 }

 The pureHeapExit function has the effect of instantly (and precisely!)
 collecting all of the memory allocated in the non-leaky pure function  
 and in every leaky function that it called.

 In any pure function, leaky or non-leaky, when memory is allocated, call
 pureMalloc instead of gcMalloc when allocating. (Non-leaky pure  
 functions will of course always allocate on the pure heap.).

 void *pureMalloc(int nbytes)
 {
      if (!stackptr)
          return gcMalloc(nbytes); // we're leaky, do a normal malloc
      // we can use the pure heap
      auto r = heap + stackptr;
      stackptr += nbytes;
      return r;
 }

 REFINEMENTS
 We can make this scheme more generally applicable. If there is a leaky  
 return value which is cheap to copy, then we can pretend the function is  
 non-leaky: at exit, if we were called with stackptr == 0, then we copy  
 (deepdup) the return value to the gc heap, before calling pureHeapExit.  
 If stackptr was non-zero, we don't need to copy it.

 COMPLICATIONS
 Classes with finalizers are an annoying complication. But again, we can  
 look at all the functions we call, and all the 'new' operations we  
 perform, to see if any finalizers exist. Maybe we could even have a  
 separate finalizer heap?

 Exceptions are the biggest nuisance, since they can also leak  
 heap-allocated memory. A catch handler in a non-pure function would need  
 to check to see if the pure heap 'stackpointer' is non-zero, and if so,  
 it would need to do a deep dup of the exception, then clear the pure  
 heap. Any pure function (leaky or not) which contains a catch handler  
 would need to record the value of the savedstackptr at entry to the  
 function, and the catch handler would need to unwind the pure heap until  
 we get back to it.

 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay  
 fairly simple and be very, very fast. With no changes required to the  
 language, and not even any changes required to existing code.

I was doing something similar to get rid of the memory leaks in DDMD: I  
allocated with something like pureMalloc, then took one of the root  
objects and made a deep copy of the entire object hierarchy. The newly  
constructed object hierarchy (allocated in managed heap) is then used  
instead of the original one.

Apr 18 2011

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 15/04/2011 21:12, Don wrote:
 I noticed a lively discussion in Bugzilla about the GC, with speculation
 about the impact of a precise GC on speed.
 But it seems to me that a dedicated GC for pure functions has enormous
 unexplored potential, and might be relatively easy to implement.

This stuff is why I love static typing! :D


-- 
Bruno Medeiros - Software Engineer

Apr 20 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Fri, 15 Apr 2011 16:12:34 -0400, Don <nospam nospam.com> wrote:
[snip]
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay fairly
simple and be very, very fast. With no changes required to the language, and
not even any changes required to existing code.

I really like this general concept (It feels a lot like young/old
generational collecting, but without the overhead), both for non-leaky
pure functions and ctfe.

On a side note, core.memory.gc is not currently marked pure and/or
nothrow, so Appender / containers which use the low level GC commands for
performance can't be used in pure/nothrow code. I'm not sure how to fix
this, as it would essentially have to violate the type system in a major
way. It's one more thing to think about with regard to purity and the GC.

Nov 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, November 07, 2011 23:28:28 Robert Jacques wrote:
 On Fri, 15 Apr 2011 16:12:34 -0400, Don <nospam nospam.com> wrote:
 [snip]
 
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay
 fairly simple and be very, very fast. With no changes required to the
 language, and not even any changes required to existing code.

 I really like this general concept (It feels a lot like young/old
 generational collecting, but without the overhead), both for non-leaky
 pure functions and ctfe.
 
 On a side note, core.memory.gc is not currently marked pure and/or
 nothrow, so Appender / containers which use the low level GC commands for
 performance can't be used in pure/nothrow code. I'm not sure how to fix
 this, as it would essentially have to violate the type system in a major
 way. It's one more thing to think about with regard to purity and the GC.

The fact that Appender isn't pure is a huge blow to purity. At least with 
nothrow, you can wrap it in a try-catch block, but there is no such option for 
pure. You could probably cast a function pointer to pure if you had to, but 
that is at least stretching the type system if not outright breaking it - 
though if you're _certain_ that the function is effectively pure, then it 
should be fine. Regardless, a solution really should be found to the issue of 
purity and Appender.

- Jonathan M Davis

Nov 07 2011

bearophile <bearophileHUGS lycos.com> writes:

Robert Jacques:

 I really like this general concept (It feels a lot like young/old
 generational collecting, but without the overhead), both for non-leaky
 pure functions and ctfe.

As first step to implement Don's GC idea I think it will be useful a
__traits(gcallocates, someFunction) that returns true at compile-time if
someFunction performs allocations from the GC heap or if it calls functions
where __traits(gcallocates, otherFunction) returns true.

Bye,
bearophile

Nov 08 2011

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

 __traits(gcallocates, someFunction)

This would be an AWESOME idea, because it would allow to specialize
functions based on the way, let's say, delegates behave and if they
don't use GC, the specialized function would also restrain from using
it making it follow the behavior pattern and making it much more
usable in performance-critical environments, while keeping it usable
in GC-aware environment too.


On Tue, Nov 8, 2011 at 5:53 PM, bearophile <bearophileHUGS lycos.com> wrote=
:
 Robert Jacques:

 I really like this general concept (It feels a lot like young/old
 generational collecting, but without the overhead), both for non-leaky
 pure functions and ctfe.

 As first step to implement Don's GC idea I think it will be useful a __tr=

aits(gcallocates, someFunction) that returns true at compile-time if someFu=
nction performs allocations from the GC heap or if it calls functions where=
 __traits(gcallocates, otherFunction) returns true.
 Bye,
 bearophile

Nov 08 2011

D Programming

C/C++ Programming

Other

digitalmars.D - GC for pure functions -- implementation ideas