www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - GC for pure functions -- implementation ideas

reply Don <nospam nospam.com> writes:
I noticed a lively discussion in Bugzilla about the GC, with speculation 
about the impact of a precise GC on speed.
But it seems to me that a dedicated GC for pure functions has enormous 
unexplored potential, and might be relatively easy to implement.

LEAKY FUNCTIONS

Define a 'leaky' pure function as a pure function which can return
heap-allocated memory to the caller, ie, where the return value or a
parameter passed by reference has at least one pointer or reference
type. This can be determined simply by inspecting the signature. (Note
that the function does not need to be immutably pure).

The interesting thing is that heap allocation inside non-leaky pure
functions behaves like stack allocation. When you return from that
function, *all* those variables are unreachable, and can be discarded en 
masse. Here's an idea of how to exploit this.

THE PURE HEAP

Create a pure heap for each thread. This is a heap which can only be
used by pure functions. I present some simplistic code, with the
simplest possible implementation: just a big block of memory with a 
thread local 'stack pointer' which points to the first free slot.

static ubyte *heap; // initialized to big chunk of RAM.
static size_t stackptr = 0;
static size_t savedstackptr = 0;

For *non-leaky* pure functions: if any of the functions it calls are 
leaky, or if it makes any memory allocations, then call a HeapEnter 
function (in the druntime) at the start, and a HeapExit function at the 
end. Leaky pure functions don't get this prologue and epilogue code.
Non-leaky pure functions that don't do memory allocation are simply 
ignored. (Note that the compiler can determine if a function makes any 
memory allocations, simply by inspecting its body -- it isn't any more 
difficult than checking if it is nothrow).

void pureHeapEnter()
{
     cast(ubyte *)(heap + stackptr) = savedstackptr;
     savedstackptr = stackptr;
     stackptr += size_t.sizeof;
}

void pureHeapExit()
{
     stackptr = savedstackptr;  // instant GC!!
     savedstackptr = cast(ubyte *)(heap +stackptr);
}

The pureHeapExit function has the effect of instantly (and precisely!)
collecting all of the memory allocated in the non-leaky pure function 
and in every leaky function that it called.

In any pure function, leaky or non-leaky, when memory is allocated, call
pureMalloc instead of gcMalloc when allocating. (Non-leaky pure 
functions will of course always allocate on the pure heap.).

void *pureMalloc(int nbytes)
{
     if (!stackptr)
         return gcMalloc(nbytes); // we're leaky, do a normal malloc
     // we can use the pure heap
     auto r = heap + stackptr;
     stackptr += nbytes;
     return r;
}

REFINEMENTS
We can make this scheme more generally applicable. If there is a leaky 
return value which is cheap to copy, then we can pretend the function is 
non-leaky: at exit, if we were called with stackptr == 0, then we copy 
(deepdup) the return value to the gc heap, before calling pureHeapExit. 
If stackptr was non-zero, we don't need to copy it.

COMPLICATIONS
Classes with finalizers are an annoying complication. But again, we can 
look at all the functions we call, and all the 'new' operations we 
perform, to see if any finalizers exist. Maybe we could even have a 
separate finalizer heap?

Exceptions are the biggest nuisance, since they can also leak 
heap-allocated memory. A catch handler in a non-pure function would need 
to check to see if the pure heap 'stackpointer' is non-zero, and if so, 
it would need to do a deep dup of the exception, then clear the pure 
heap. Any pure function (leaky or not) which contains a catch handler 
would need to record the value of the savedstackptr at entry to the 
function, and the catch handler would need to unwind the pure heap until 
we get back to it.

In reality, things are going to be a bit more complicated than this. But
it seems to me that conceptually, something like this could still stay 
fairly simple and be very, very fast. With no changes required to the 
language, and not even any changes required to existing code.
Apr 15 2011
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/15/2011 1:12 PM, Don wrote:
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay fairly
 simple and be very, very fast. With no changes required to the language, and
not
 even any changes required to existing code.
I think it's a good idea.
Apr 15 2011
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Apr 15, 2011, at 1:12 PM, Don wrote:
=20
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a =
thread local 'stack pointer' which points to the first free slot. It's a good idea. dsimcha was already going to polish his TempAlloc, = wasn't he? Seems like this is nearly the same thing.=
Apr 15 2011
next sibling parent Don <nospam nospam.com> writes:
Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a thread
local 'stack pointer' which points to the first free slot.
It's a good idea. dsimcha was already going to polish his TempAlloc, wasn't he? Seems like this is nearly the same thing.
Yes. You could think of it as a way the compiler could automatically convert 'new' into a use of TempAlloc, in many useful cases.
Apr 15 2011
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
On 4/15/2011 5:01 PM, Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a thread
local 'stack pointer' which points to the first free slot.
It's a good idea. dsimcha was already going to polish his TempAlloc, wasn't he? Seems like this is nearly the same thing.
Yeah, I never formalized it at all, but that's roughly what TempAlloc accomplishes. My other concern is, what happens in the case of the following code: uint nonLeaky() pure { foreach(i; 0..42) { auto arr = new uint[666]; // do stuff } return 8675309; } In this case the arr instance from every loop iteration is retained until nonLeaky() returns, whether it's referenced or not. Granted, this is a silly example, but I imagine there are cases where stuff like this happens in practice.
Apr 15 2011
next sibling parent Don <nospam nospam.com> writes:
dsimcha wrote:
 On 4/15/2011 5:01 PM, Sean Kelly wrote:
 On Apr 15, 2011, at 1:12 PM, Don wrote:
 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a 
 thread local 'stack pointer' which points to the first free slot.
It's a good idea. dsimcha was already going to polish his TempAlloc, wasn't he? Seems like this is nearly the same thing.
Yeah, I never formalized it at all, but that's roughly what TempAlloc accomplishes. My other concern is, what happens in the case of the following code: uint nonLeaky() pure { foreach(i; 0..42) { auto arr = new uint[666]; // do stuff } return 8675309; } In this case the arr instance from every loop iteration is retained until nonLeaky() returns, whether it's referenced or not. Granted, this is a silly example, but I imagine there are cases where stuff like this happens in practice.
Yes. I'm not sure what the best strategy is if you run out of memory inside purealloc. The simple strategy would just be switch to the normal heap once the pure heap is full. Then, the gc would have to check for a full pure heap. If the pure heap is full, it should scan everything from the last heapptr to the end of the pure heap, as well as everything else it scans. But you still get the entire size of the pure heap as an instant-gc region.
Apr 15 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 Yeah, I never formalized it at all, but that's roughly what TempAlloc
 accomplishes.  My other concern is, what happens in the case of the
 following code:
 uint nonLeaky() pure {
      foreach(i; 0..42) {
           auto arr = new uint[666];
           // do stuff
      }

      return 8675309;
 }

 In this case the arr instance from every loop iteration is retained
 until nonLeaky() returns, whether it's referenced or not.  Granted, this
 is a silly example, but I imagine there are cases where stuff like this
 happens in practice.
It should be trivial to just deallocate when arr goes out of scope. This would be more complicated to resolve, as the analogy to stack allocation vanishes: uint nonLeaky() pure { uint[] arr1 = new uint[666]; uint[] arr2 = new uint[666]; foreach(i; 0..42) { arr1 = new uint[66*i]; // do stuff } return 8675309; } The problem is, that inside a non-leaky pure function the general case for dynamic allocations might be just as complicated as in other parts of the program. However, the problem does only exist when the pure function deletes/overrides its own references. Those are the only ones it is allowed to modify. Therefore, the compiler could just use the GC heap whenever a reference is assigned twice or more times inside a non-leaky pure function? I think it might be a better option than letting the pure heap fill up with garbage.
Apr 16 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr:

 The problem is, that inside a non-leaky pure function the general case for
dynamic
 allocations might be just as complicated as in other parts of the program.
If this is true, then you need a sealed temporary heap. Bye, bearophile
Apr 16 2011
prev sibling parent reply Fawzi Mohamed <fawzi gmx.ch> writes:
	charset=US-ASCII;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: 7bit


On 16-apr-11, at 22:49, Timon Gehr wrote:

 [...]
 The problem is, that inside a non-leaky pure function the general  
 case for dynamic
 allocations might be just as complicated as in other parts of the  
 program.
indeed, this is exactly what I wanted to write, yes in some cases, one can get away with simple stack like, or similar but it breaks down very quickly. In fact GC were introduced by functional languages, because they are kind of needed for them, already that should hint to the fact that functional, or pure languages are not intrinsically easier to collect. What can be useful is allowing one to add a set of pools, that then can be freed all at once. Having several pools is also what is needed to remove the global lock in malloc, so that is definitely the way to go imho. Then one can give the control of these extra pools to the programmer, so that it is easy use a special pool for a part of the program and then release a lot of objects at once. Even then one should put quite some thought into it (for example about static/global objects that might be allocated for caching purposes). A strictly pure function returning a value without pointers gives guarantees, but as soon as some caching (even behind the scenes) goes on, then things will fail. If a separate pool is used consistently for cached or shared objects one should be able to allow even caching. All this comes back again to having several pools, showing how useful such a primitive is. Fawzi
Apr 17 2011
next sibling parent reply Don <nospam nospam.com> writes:
Fawzi Mohamed wrote:
 
 On 16-apr-11, at 22:49, Timon Gehr wrote:
 
 [...]
 The problem is, that inside a non-leaky pure function the general case 
 for dynamic
 allocations might be just as complicated as in other parts of the program.
Yes, but that only matters if that function is both extremely memory-inefficient AND long-lived. In which case you can fall back to the normal GC (or even a dedicated pure GC). I don't think you ever lose anything.
 indeed, this is exactly what I wanted to write, yes in some cases, one 
 can get away with simple stack like, or similar but it breaks down very 
 quickly.
 In fact GC were introduced by functional languages, because they are 
 kind of needed for them, already that should hint to the fact that 
 functional, or pure languages are not intrinsically easier to collect.
I find that *impossible* to believe. Note also that you are equating "functional" = "pure" in the D sense, which is not true. Firstly, functional languages generate *enormous* amounts of garbage. D does not. Secondly, non-leaky pure functions are rare in functional programming languages. I think we are in new territory here.
 
 What can be useful is allowing one to add a set of pools, that then can 
 be freed all at once.
 Having several pools is also what is needed to remove the global lock in 
 malloc, so that is definitely the way to go imho.
 Then one can give the control of these extra pools to the programmer, so 
 that it is easy use a special pool for a part of the program and then 
 release a lot of objects at once. Even then one should put quite some 
 thought into it (for example about static/global objects that might be 
 allocated for caching purposes).
 A strictly pure function returning a value without pointers gives 
 guarantees, but as soon as some caching (even behind the scenes) goes 
 on, then things will fail. If a separate pool is used consistently for 
 cached or shared objects one should be able to allow even caching.
 All this comes back again to having several pools, showing how useful 
 such a primitive is.
 
 Fawzi
 
Apr 17 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
Don wrote:

 The problem is, that inside a non-leaky pure function the general case
 for dynamic
 allocations might be just as complicated as in other parts of the program.
Yes, but that only matters if that function is both extremely memory-inefficient AND long-lived. In which case you can fall back to the normal GC (or even a dedicated pure GC). I don't think you ever lose anything.
I think this idea has potential, I was only pointing out that maybe the compiler should identify such garbage creating functions and use the normal GC for them by default. It is not really a valid option to only restrict the size of the pure heap (or even just the amount it can grow per call frame) and then fall back to the normal GC, because a non-leaky pure function that behaves nicely may also need a lot of memory. Fawzi Mohamed wrote:
 Having several pools is also what is needed to remove the global lock in
 malloc, so that is definitely the way to go imho.
I agree. I don't like the fact that at the GC suspends all running threads while collecting. But Hardware will probably be evolving towards Core-local RAM (several _physical_ pools) anyways.
Apr 17 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.
Speaking of rethinking the GC and its primitives, is there a way currently to tell the GC to track pointers to a manually allocated block and assign a callback for when the GC wants that block to be finalized and deallocated? I'm kinda going to need that if I am to bring decent garbage-collected Objective-C objects in D. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Apr 17 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.
Speaking of rethinking the GC and its primitives, is there a way currently to tell the GC to track pointers to a manually allocated block and assign a callback for when the GC wants that block to be finalized and deallocated? I'm kinda going to need that if I am to bring decent garbage-collected Objective-C objects in D.
It's coarsely grained, but you can intercept all finalization calls: http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler You'd have to figure out some way to filter only the ones you care about. -Steve
Apr 18 2011
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Apr 2011 08:27:25 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:

 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:

 All this comes back again to having several pools, showing how useful
 such a primitive is.
Speaking of rethinking the GC and its primitives, is there a way currently to tell the GC to track pointers to a manually allocated block and assign a callback for when the GC wants that block to be finalized and deallocated? I'm kinda going to need that if I am to bring decent garbage-collected Objective-C objects in D.
It's coarsely grained, but you can intercept all finalization calls: http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler You'd have to figure out some way to filter only the ones you care about.
Also, there's an undocumented, non-published (i.e. not in object.di) function that allows you to attach an event handler when an object is disposed: https://github.com/D-Programming-Language/druntime/blob/master/src/object_.d#L2394 I think Bill Baxter's WeakRef object used it. I'm not sure how supported it is, or why it isn't public. -Steve
Apr 18 2011
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-04-18 08:27:25 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 On Sun, 17 Apr 2011 14:08:29 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:
 
 On 2011-04-17 05:22:10 -0400, Fawzi Mohamed <fawzi gmx.ch> said:
 
 All this comes back again to having several pools, showing how useful
 such a primitive is.
Speaking of rethinking the GC and its primitives, is there a way currently to tell the GC to track pointers to a manually allocated block and assign a callback for when the GC wants that block to be finalized and deallocated? I'm kinda going to need that if I am to bring decent garbage-collected Objective-C objects in D.
It's coarsely grained, but you can intercept all finalization calls: http://www.digitalmars.com/d/2.0/phobos/core_runtime.html#collectHandler You'd have to figure out some way to filter only the ones you care about.
Won't do. What I need is to make the GC track memory blocks allocated externally, with some callback telling me when they are no longer referenced. I'll probably just add this capability to the GC myself when the time comes. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Apr 18 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Don:

 But it seems to me that a dedicated GC for pure functions has enormous 
 unexplored potential, and might be relatively easy to implement.
- In D we have not even started to exploit the full potential of purity and pure functions. - In D reducing the work of the normal GC is a very good thing (because of problems caused by not moving GC, its not precise nature, its not advanced implementation, and because D is a complex system language). - CommonLisp programmers and designers know that's it's good to give some hints to the the GC. Optional information that help the GC do its work more efficiently. The D1 "scope" attribute for class allocations was a not well designed&implemented example of this.
 (Note that the compiler can determine if a function makes any 
 memory allocations, simply by inspecting its body -- it isn't any more 
 difficult than checking if it is nothrow).
Time ago I have suggested a noheap that makes sure a function tree doesn't allocate memory. What you say makes me think that here user code may just desire to know what the compiler knows: a __traits that given a function name returns a true if the function performs heap allocation.
 With no changes required to the 
 language, and not even any changes required to existing code.
Phobos and other many small things (D too, like some form to express conditional purity, etc) need to change to allow D programmers to use purity more widely in their programs. This in turn will make your pure GC more and more useful. Bye, bearophile
Apr 15 2011
prev sibling next sibling parent reply Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:
Don napisa=B3:

 LEAKY FUNCTIONS
=20
 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).
=20
 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en=
=20
 masse. Here's an idea of how to exploit this.
=20
 THE PURE HEAP
=20
 [snip]
I'm far from being a GC expert but I think Java having identified such case= s with escape analysis just puts locally allocated objects on the stack. Couldn't we too? Your mark & release "pure heap" scheme looks alright but t= his seems simpler. The notion of "non-leaky" functions can be useful either way. --=20 Tomek
Apr 15 2011
next sibling parent Don <nospam nospam.com> writes:
Tomek Sowiński wrote:
 Don napisał:
 
 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en 
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 [snip]
I'm far from being a GC expert but I think Java having identified such cases with escape analysis just puts locally allocated objects on the stack.
That works for the non-leaky function itself, but it doesn't help for the functions it calls.
 Couldn't we too? Your mark & release "pure heap" scheme looks alright but this
seems simpler.
 
 The notion of "non-leaky" functions can be useful either way.
 
Apr 15 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Apr 2011 18:36:17 -0400, Tomek SowiƄski <just ask.me> wrote:

 Don napisaƂ:

 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 [snip]
I'm far from being a GC expert but I think Java having identified such cases with escape analysis just puts locally allocated objects on the stack.
Full escape analysis requires a non-stanard object format and a non-standard linker. Simply because, escape analysis must be a compiler-added attribute, and the linker has to understand that. We can get away with certain things reusing the C linker by doing things like name mangling, but escape analysis must add far more annotations (essentially tell how parameters and return values are linked). I think at some point, I'd prefer D to use it's own object format and linker, to allow both escape analysis, and to use the object format as the import database. But that isn't likely to come from Walter, I think someone would have to build a new compiler. -Steve
Apr 18 2011
parent reply Kagamin <spam here.lot> writes:
Steven Schveighoffer Wrote:

 Full escape analysis requires a non-stanard object format and a  
 non-standard linker.  Simply because, escape analysis must be a  
 compiler-added attribute, and the linker has to understand that.  We can  
 get away with certain things reusing the C linker by doing things like  
 name mangling, but escape analysis must add far more annotations  
 (essentially tell how parameters and return values are linked).
Why support from linker? Even .net uses plain old COFF, it just stores metadata in a separate section. Escape analysis is a compile-time job for the compiler, so this metadata can be simply ignored and discarded by the linker. For the same reason, this metadata can be stored in any place, even in a separate file.
Apr 18 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Apr 2011 09:34:07 -0400, Kagamin <spam here.lot> wrote:

 Steven Schveighoffer Wrote:

 Full escape analysis requires a non-stanard object format and a
 non-standard linker.  Simply because, escape analysis must be a
 compiler-added attribute, and the linker has to understand that.  We can
 get away with certain things reusing the C linker by doing things like
 name mangling, but escape analysis must add far more annotations
 (essentially tell how parameters and return values are linked).
Why support from linker? Even .net uses plain old COFF, it just stores metadata in a separate section. Escape analysis is a compile-time job for the compiler, so this metadata can be simply ignored and discarded by the linker. For the same reason, this metadata can be stored in any place, even in a separate file.
Wouldn't the linker have to verify that the escapes are valid when linking objects together? I admit I don't know much about the object format and linking process, but I assumed that the linker had to be involved in enforcement of escape analysis. Does the .NET linker ignore the metadata? -Steve
Apr 18 2011
next sibling parent Kagamin <spam here.lot> writes:
Steven Schveighoffer Wrote:

 Wouldn't the linker have to verify that the escapes are valid when linking  
 objects together?  I admit I don't know much about the object format and  
 linking process, but I assumed that the linker had to be involved in  
 enforcement of escape analysis.
If it doesn't, it's not a big deal, I think. One usually does things in the right order. This is better solved by a build tool, and integration of compiler into a build tool may be a good idea. After all, some of them already use the compiler to infer dependencies.
Apr 18 2011
prev sibling parent Kagamin <spam here.lot> writes:
Steven Schveighoffer Wrote:

 Does the .NET linker ignore the metadata?
Metadata in .net is more than annotation, so there's no direct relation to what we're talking about. It's just an example. It's not like there's nothing to fix in COFF, but it's not a big deal to make it do anything you want. You simply put your metadata in your section and even stay compatible with existing tools, so new linker can add some extra safety, but it won't be required to make things work.
Apr 18 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Tomek Sowiński:

 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.
Escape analysis will be useful for D compilers too (I think LDC-LLVM is not doing this much yet), but if the amount of non-escaping memory allocated is large, you don't want to put it on the stack, a special heap is better. Bye, bearophile
Apr 15 2011
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Don Wrote:
 Tomek Sowiński wrote:
 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.
That works for the non-leaky function itself, but it doesn't help for the functions it calls.
It'd reduce the use of the pure heap to leaky pure functions called from pure functions. If I understood the original proposal correctly, this would reduce how frequently pure functions have to manipulate the pure stack. I haven't thought through the exception handling case, so I may be completely wrong! I was originally assuming the return type of a pure function was enough to determine if it wasn't leaky, but now I'm thinking only pure nothrow functions can be non-leaky. That might make the stack allocation optimization too rare to be worthwhile?
Apr 17 2011
parent reply Don <nospam nospam.com> writes:
Jason House wrote:
 Don Wrote:
 Tomek Sowiński wrote:
 I'm far from being a GC expert but I think Java having identified such cases
with escape analysis just puts locally allocated objects on the stack.
That works for the non-leaky function itself, but it doesn't help for the functions it calls.
It'd reduce the use of the pure heap to leaky pure functions called from pure functions. If I understood the original proposal correctly, this would reduce how frequently pure functions have to manipulate the pure stack. I haven't thought through the exception handling case, so I may be completely wrong!
It would definitely help a lot. It just wouldn't catch everything. It seems fairly difficult though.
 I was originally assuming the return type of a pure function was enough to
determine if it wasn't leaky, 
You also have parameters passed by reference. but now I'm thinking only pure nothrow functions can be non-leaky. That might make the stack allocation optimization too rare to be worthwhile? It might. Although as I mentioned, you can deep-dup any exceptions onto the normal gc heap, at the moment they are caught. That deep-dup is not performance critical, and in most cases, the dup is simple. But the worst case is, the entire pure heap needs to be copied. It might turn out to be too complicated to be worthwhile, limiting the scheme to pure nothrow functions. Nontheless I think pure nothrow functions will be pretty common. Basically, my contribution is this: the compiler can easily work out, for each function, whenever it has entered and exited a non-leaky pure function. It can make a call into the GC whenever this happens. This gives the GC many more potential strategies.
Apr 17 2011
parent Fawzi Mohamed <fawzi gmx.ch> writes:
On 17-apr-11, at 21:44, Don wrote:

 [...]
 Basically, my contribution is this: the compiler can easily work  
 out, for each function, whenever it has entered and exited a non- 
 leaky pure function. It can make a call into the GC whenever this  
 happens. This gives the GC many more potential strategies.
yes more info is always better, I didn't want to diminish your work, but to point toward a general improvement in the GC My fear is that the overhead of this approach will make it worth only for big allocations (getting rid of eventual false pointers), and thus only under the programmer control. Classifying the "allocation potential" of a function might help make better automatic decisions. That is difficult in general (one can flag alloc in loop with unknown length or more than 4 elements as large for example, something that would miss recursive allocations), but it would be useful, because for those that allocate little could do nothing, or use a fixed stack like heap, whereas those that allocate a lot could checkpoint the heap (assuming a separate pool for each thread) or use a new pool. Fawzi
Apr 18 2011
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 16 Apr 2011 00:12:34 +0400, Don <nospam nospam.com> wrote:

 I noticed a lively discussion in Bugzilla about the GC, with speculation  
 about the impact of a precise GC on speed.
 But it seems to me that a dedicated GC for pure functions has enormous  
 unexplored potential, and might be relatively easy to implement.

 LEAKY FUNCTIONS

 Define a 'leaky' pure function as a pure function which can return
 heap-allocated memory to the caller, ie, where the return value or a
 parameter passed by reference has at least one pointer or reference
 type. This can be determined simply by inspecting the signature. (Note
 that the function does not need to be immutably pure).

 The interesting thing is that heap allocation inside non-leaky pure
 functions behaves like stack allocation. When you return from that
 function, *all* those variables are unreachable, and can be discarded en  
 masse. Here's an idea of how to exploit this.

 THE PURE HEAP

 Create a pure heap for each thread. This is a heap which can only be
 used by pure functions. I present some simplistic code, with the
 simplest possible implementation: just a big block of memory with a  
 thread local 'stack pointer' which points to the first free slot.

 static ubyte *heap; // initialized to big chunk of RAM.
 static size_t stackptr = 0;
 static size_t savedstackptr = 0;

 For *non-leaky* pure functions: if any of the functions it calls are  
 leaky, or if it makes any memory allocations, then call a HeapEnter  
 function (in the druntime) at the start, and a HeapExit function at the  
 end. Leaky pure functions don't get this prologue and epilogue code.
 Non-leaky pure functions that don't do memory allocation are simply  
 ignored. (Note that the compiler can determine if a function makes any  
 memory allocations, simply by inspecting its body -- it isn't any more  
 difficult than checking if it is nothrow).

 void pureHeapEnter()
 {
      cast(ubyte *)(heap + stackptr) = savedstackptr;
      savedstackptr = stackptr;
      stackptr += size_t.sizeof;
 }

 void pureHeapExit()
 {
      stackptr = savedstackptr;  // instant GC!!
      savedstackptr = cast(ubyte *)(heap +stackptr);
 }

 The pureHeapExit function has the effect of instantly (and precisely!)
 collecting all of the memory allocated in the non-leaky pure function  
 and in every leaky function that it called.

 In any pure function, leaky or non-leaky, when memory is allocated, call
 pureMalloc instead of gcMalloc when allocating. (Non-leaky pure  
 functions will of course always allocate on the pure heap.).

 void *pureMalloc(int nbytes)
 {
      if (!stackptr)
          return gcMalloc(nbytes); // we're leaky, do a normal malloc
      // we can use the pure heap
      auto r = heap + stackptr;
      stackptr += nbytes;
      return r;
 }

 REFINEMENTS
 We can make this scheme more generally applicable. If there is a leaky  
 return value which is cheap to copy, then we can pretend the function is  
 non-leaky: at exit, if we were called with stackptr == 0, then we copy  
 (deepdup) the return value to the gc heap, before calling pureHeapExit.  
 If stackptr was non-zero, we don't need to copy it.

 COMPLICATIONS
 Classes with finalizers are an annoying complication. But again, we can  
 look at all the functions we call, and all the 'new' operations we  
 perform, to see if any finalizers exist. Maybe we could even have a  
 separate finalizer heap?

 Exceptions are the biggest nuisance, since they can also leak  
 heap-allocated memory. A catch handler in a non-pure function would need  
 to check to see if the pure heap 'stackpointer' is non-zero, and if so,  
 it would need to do a deep dup of the exception, then clear the pure  
 heap. Any pure function (leaky or not) which contains a catch handler  
 would need to record the value of the savedstackptr at entry to the  
 function, and the catch handler would need to unwind the pure heap until  
 we get back to it.

 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay  
 fairly simple and be very, very fast. With no changes required to the  
 language, and not even any changes required to existing code.
I was doing something similar to get rid of the memory leaks in DDMD: I allocated with something like pureMalloc, then took one of the root objects and made a deep copy of the entire object hierarchy. The newly constructed object hierarchy (allocated in managed heap) is then used instead of the original one.
Apr 18 2011
prev sibling next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 15/04/2011 21:12, Don wrote:
 I noticed a lively discussion in Bugzilla about the GC, with speculation
 about the impact of a precise GC on speed.
 But it seems to me that a dedicated GC for pure functions has enormous
 unexplored potential, and might be relatively easy to implement.
This stuff is why I love static typing! :D -- Bruno Medeiros - Software Engineer
Apr 20 2011
prev sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 15 Apr 2011 16:12:34 -0400, Don <nospam nospam.com> wrote:
[snip]
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay fairly
simple and be very, very fast. With no changes required to the language, and
not even any changes required to existing code.
I really like this general concept (It feels a lot like young/old generational collecting, but without the overhead), both for non-leaky pure functions and ctfe. On a side note, core.memory.gc is not currently marked pure and/or nothrow, so Appender / containers which use the low level GC commands for performance can't be used in pure/nothrow code. I'm not sure how to fix this, as it would essentially have to violate the type system in a major way. It's one more thing to think about with regard to purity and the GC.
Nov 07 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, November 07, 2011 23:28:28 Robert Jacques wrote:
 On Fri, 15 Apr 2011 16:12:34 -0400, Don <nospam nospam.com> wrote:
 [snip]
 
 In reality, things are going to be a bit more complicated than this. But
 it seems to me that conceptually, something like this could still stay
 fairly simple and be very, very fast. With no changes required to the
 language, and not even any changes required to existing code.
I really like this general concept (It feels a lot like young/old generational collecting, but without the overhead), both for non-leaky pure functions and ctfe. On a side note, core.memory.gc is not currently marked pure and/or nothrow, so Appender / containers which use the low level GC commands for performance can't be used in pure/nothrow code. I'm not sure how to fix this, as it would essentially have to violate the type system in a major way. It's one more thing to think about with regard to purity and the GC.
The fact that Appender isn't pure is a huge blow to purity. At least with nothrow, you can wrap it in a try-catch block, but there is no such option for pure. You could probably cast a function pointer to pure if you had to, but that is at least stretching the type system if not outright breaking it - though if you're _certain_ that the function is effectively pure, then it should be fine. Regardless, a solution really should be found to the issue of purity and Appender. - Jonathan M Davis
Nov 07 2011
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Robert Jacques:

 I really like this general concept (It feels a lot like young/old
 generational collecting, but without the overhead), both for non-leaky
 pure functions and ctfe.
As first step to implement Don's GC idea I think it will be useful a __traits(gcallocates, someFunction) that returns true at compile-time if someFunction performs allocations from the GC heap or if it calls functions where __traits(gcallocates, otherFunction) returns true. Bye, bearophile
Nov 08 2011
parent Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
 __traits(gcallocates, someFunction)
This would be an AWESOME idea, because it would allow to specialize functions based on the way, let's say, delegates behave and if they don't use GC, the specialized function would also restrain from using it making it follow the behavior pattern and making it much more usable in performance-critical environments, while keeping it usable in GC-aware environment too. On Tue, Nov 8, 2011 at 5:53 PM, bearophile <bearophileHUGS lycos.com> wrote= :
 Robert Jacques:

 I really like this general concept (It feels a lot like young/old
 generational collecting, but without the overhead), both for non-leaky
 pure functions and ctfe.
As first step to implement Don's GC idea I think it will be useful a __tr=
aits(gcallocates, someFunction) that returns true at compile-time if someFu= nction performs allocations from the GC heap or if it calls functions where= __traits(gcallocates, otherFunction) returns true.
 Bye,
 bearophile
Nov 08 2011