digitalmars.D - Possible quick win in GC?
- Abdulhaq (48/48) Sep 28 2014 Perhaps I've too had much caffeine today but I've had an idea
- Abdulhaq (40/40) Sep 28 2014 Here's a code snippet which mopefully makes things a bit clearer:
- Freddy (3/44) Sep 28 2014 You mean this?
- Abdulhaq (15/17) Sep 28 2014 Of course my proposal uses the techique of escape analysis as
- David Nadlinger (8/13) Sep 28 2014 LDC does the "fairly simple" part of this already in a custom
- Abdulhaq (28/41) Sep 29 2014 That's interesting, yes I guessed that the escape analysis would
- thedeemon (4/4) Sep 29 2014 On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote...
Perhaps I've too had much caffeine today but I've had an idea which might give a fairly quick win on the GC speed situation. It's a simple idea at heart so it's very possible/likely that this is a well known idea that has already been discarded but anyway here goes. I got the idea after thinking that it should be fairly simple for the compiler to detect straightforward cases of when a variable can be declared as going on the stack - i.e. no references to it are retained after its enclosing function returns. At the moment AIUI it is necessary for a class instance to be declared by the programmer as 'scoped' for this to take place. Further, I was considering the type of ownership and boundary considerations that could be used to improve memory management - e.g. using the notion of an owner instance which, upon destruction, destroys all owned objects. Afer some consideration it seems to me that by using only static analysis a tree of references could be constructed of references from a root 'scoped' object to all referred to objects that are allocated after the allocation of the root object. When the root object goes out of scope it is destroyed and all the descendent objects from the root object (as identified by the static analysis) could also be destroyed in one simple shot. The static analysis of course constructs the tree by analysing the capturing of references from one object to another. It could be the case that even a simple static analysis at first (e.g. discard the technique in difficult situations) could cover a lot of use cases (statistically). Of course, if one of the descendent objects is referred to by an object which is not in the object tree, then this technique cannot be used. However, I envisage that there are many situations where upon the destruction of a root object all related post-allocated objects can also be destroyed. In terms of implementation I see this being done by what I am calling 'bands' within the GC. With the allocation of any identified root object, a new band (heap) is created in the GC. Child objects of the root object (i.e. only referred to by the root object and other child objects in its tree) are placed in the same band. When the root object goes out of scope the entire band is freed. This by definition is safe because the static analysis has ensured that there are no 'out-of-tree' references between child objects in the tree and out-of-tree (out-of-band) objects. This property also means that normal GC runs do not need to add the scoped root object as a GC root object - this memory will normally only be freed when the scoped root object at the top of the tree goes out of scope. If memory becomes constrained then the bands can be added as root objects to the GC and memory incrementally freed just as with regularly allocated objects. Sorry if this idea is daft and I've wasted your time!
Sep 28 2014
Here's a code snippet which mopefully makes things a bit clearer: /** * In this example the variable foo can be statically analysed as safe to go on the stack. * The new instance of Bar allocated in funcLevelB is only referred to by foo. foo can * be considered a root 'scoped' variable and the GC can delete both foo and the new Bar() * when foo goes out of scope. There is no need (except when under memory pressure) for * the GC to scan the band created for foo and it's related child allocations. */ import std.stdio; class Bar { public: int x; this(int x) { this.x = x; } } class Foo { public: Bar bar; } void funcLevelA() { Foo foo = new Foo(); // static analysis could detect this as able to go on the stack funcLevelB(foo); writeln(foo.bar.x); } void funcLevelB(Foo foo) { foo.bar = new Bar(12); // this allocated memory is only referred to by foo, which // static analysis has established can go on the stack } void main() { funcLevelA(); }
Sep 28 2014
On Sunday, 28 September 2014 at 17:47:42 UTC, Abdulhaq wrote:Here's a code snippet which mopefully makes things a bit clearer: /** * In this example the variable foo can be statically analysed as safe to go on the stack. * The new instance of Bar allocated in funcLevelB is only referred to by foo. foo can * be considered a root 'scoped' variable and the GC can delete both foo and the new Bar() * when foo goes out of scope. There is no need (except when under memory pressure) for * the GC to scan the band created for foo and it's related child allocations. */ import std.stdio; class Bar { public: int x; this(int x) { this.x = x; } } class Foo { public: Bar bar; } void funcLevelA() { Foo foo = new Foo(); // static analysis could detect this as able to go on the stack funcLevelB(foo); writeln(foo.bar.x); } void funcLevelB(Foo foo) { foo.bar = new Bar(12); // this allocated memory is only referred to by foo, which // static analysis has established can go on the stack } void main() { funcLevelA(); }You mean this? https://en.wikipedia.org/wiki/Escape_analysis
Sep 28 2014
You mean this? https://en.wikipedia.org/wiki/Escape_analysisOf course my proposal uses the techique of escape analysis as part of its methodology, but the essence of the idea is to greatly cut down on the work that the GC has to do on each sweep when dealing with objects that have been found to belong to a particular set. The objects in each set are in an object graph that has no incoming references from objects external to the set and which can therefore be allocated in their own heap that is destroyed when the root object goes out of scope. The saving takes place because the GC does not need to scan the default heap for pointers found in the new heaps (bands). For certain type of programs such as compilers / lexers / parsers where many temporary objects are allocated and shortly after deallocated this can result in a substantial time saving in execution. In terms of memory usage we would see multiple potentially large but short-lived spikes.
Sep 28 2014
On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:I got the idea after thinking that it should be fairly simple for the compiler to detect straightforward cases of when a variable can be declared as going on the stack - i.e. no references to it are retained after its enclosing function returns.LDC does the "fairly simple" part of this already in a custom LLVM optimizer pass. The issue is that escape analysis is fairly hard in general, and currently even more limited because we only do it on the LLVM IR level (i.e. don't leverage any additional attributes like scope, pure, … that might be present in the D source code). David
Sep 28 2014
On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger wrote:On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:That's interesting, yes I guessed that the escape analysis would present the harder part, but I'm hoping that the algorithm can be built up incrementally, identifying the easy wins first and then over time extending it to cover harder cases. One way that I see it working it is to conduct a form of lowering where the new operator has some information added to it to indicate the 'band' that the GC should place the non-root objects into (root objects go on the stack). Using the syntax of C++'s placement new (but totally different semantics) code could be lowered to e.g. External externalObj = new(0) External(); // 0 means use the default heap Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for this set of objects ... Bar bar = new (0x1234) Bar(); When the GC allocates memory it does so in the indicated band/heap, and then when foo (the root object of the object graph) goes out of scope the relevant band/heap is destroyed en bloc. The benefit of the idea is that when scanning for objects that can be deleted the GC does not need to consider those objects in the non default bands/heaps. For some classes of programs such as compilers (it was Higgs that gave me the stimulus), and with good static analysis (aye there's the rub cap'n) this could represent a very substantial time saving on ech GC sweep.I got the idea after thinking that it should be fairly simple for the compiler to detect straightforward cases of when a variable can be declared as going on the stack - i.e. no references to it are retained after its enclosing function returns.LDC does the "fairly simple" part of this already in a custom LLVM optimizer pass. The issue is that escape analysis is fairly hard in general, and currently even more limited because we only do it on the LLVM IR level (i.e. don't leverage any additional attributes like scope, pure, … that might be present in the D source code). David
Sep 29 2014
On Monday, 29 September 2014 at 07:03:29 UTC, Abdulhaq wrote:On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger wrote:Sounds a little like http://wiki.dlang.org/DIP46 MikeOn Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:That's interesting, yes I guessed that the escape analysis would present the harder part, but I'm hoping that the algorithm can be built up incrementally, identifying the easy wins first and then over time extending it to cover harder cases. One way that I see it working it is to conduct a form of lowering where the new operator has some information added to it to indicate the 'band' that the GC should place the non-root objects into (root objects go on the stack). Using the syntax of C++'s placement new (but totally different semantics) code could be lowered to e.g. External externalObj = new(0) External(); // 0 means use the default heap Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for this set of objects ... Bar bar = new (0x1234) Bar(); When the GC allocates memory it does so in the indicated band/heap, and then when foo (the root object of the object graph) goes out of scope the relevant band/heap is destroyed en bloc. The benefit of the idea is that when scanning for objects that can be deleted the GC does not need to consider those objects in the non default bands/heaps. For some classes of programs such as compilers (it was Higgs that gave me the stimulus), and with good static analysis (aye there's the rub cap'n) this could represent a very substantial time saving on ech GC sweep.I got the idea after thinking that it should be fairly simple for the compiler to detect straightforward cases of when a variable can be declared as going on the stack - i.e. no references to it are retained after its enclosing function returns.LDC does the "fairly simple" part of this already in a custom LLVM optimizer pass. The issue is that escape analysis is fairly hard in general, and currently even more limited because we only do it on the LLVM IR level (i.e. don't leverage any additional attributes like scope, pure, … that might be present in the D source code). David
Sep 29 2014
On Monday, 29 September 2014 at 09:45:38 UTC, Mike wrote:On Monday, 29 September 2014 at 07:03:29 UTC, Abdulhaq wrote:Ah thanks for the link yes there are definite similarities, Walter identifies sets of objects to go in his region through having a boundary on the pure function. My set is determined by static analysis and does not require the function to be pure. However, thedeemon has pointed out that this technique is in fact well known - I'm left to wonder if I should have a go at an implementation, but my wife would not be too chuffed about that.On Sunday, 28 September 2014 at 20:20:29 UTC, David Nadlinger wrote:Sounds a little like http://wiki.dlang.org/DIP46 MikeOn Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:That's interesting, yes I guessed that the escape analysis would present the harder part, but I'm hoping that the algorithm can be built up incrementally, identifying the easy wins first and then over time extending it to cover harder cases. One way that I see it working it is to conduct a form of lowering where the new operator has some information added to it to indicate the 'band' that the GC should place the non-root objects into (root objects go on the stack). Using the syntax of C++'s placement new (but totally different semantics) code could be lowered to e.g. External externalObj = new(0) External(); // 0 means use the default heap Foo foo = new(0x1234) Foo(); // 0x1234 is the heap/band id for this set of objects ... Bar bar = new (0x1234) Bar(); When the GC allocates memory it does so in the indicated band/heap, and then when foo (the root object of the object graph) goes out of scope the relevant band/heap is destroyed en bloc. The benefit of the idea is that when scanning for objects that can be deleted the GC does not need to consider those objects in the non default bands/heaps. For some classes of programs such as compilers (it was Higgs that gave me the stimulus), and with good static analysis (aye there's the rub cap'n) this could represent a very substantial time saving on ech GC sweep.I got the idea after thinking that it should be fairly simple for the compiler to detect straightforward cases of when a variable can be declared as going on the stack - i.e. no references to it are retained after its enclosing function returns.LDC does the "fairly simple" part of this already in a custom LLVM optimizer pass. The issue is that escape analysis is fairly hard in general, and currently even more limited because we only do it on the LLVM IR level (i.e. don't leverage any additional attributes like scope, pure, … that might be present in the D source code). David
Sep 29 2014
On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote... Congratulations on inventing http://en.wikipedia.org/wiki/Region-based_memory_management and "Region inference" in particular.
Sep 29 2014
On Monday, 29 September 2014 at 11:02:12 UTC, thedeemon wrote:On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote... Congratulations on inventing http://en.wikipedia.org/wiki/Region-based_memory_management and "Region inference" in particular.Ah yes - it's identical isn't it - I don't know whether to be happy or sad about that :). They explain it much better too... So, (does anyone know) has this technique been discarded for D or is it 'just' a matter of the resources to do it?
Sep 29 2014
On Monday, 29 September 2014 at 11:36:51 UTC, Abdulhaq wrote:So, (does anyone know) has this technique been discarded for D or is it 'just' a matter of the resources to do it?From the literature on this topic I remember attempts for automatic region inference mostly failed: it lead to some small regions here and there that didn't affect anything much and then one huge region where most of the data landed, requiring a full-blown GC inside that big region. Making it work in D, where everything can be mutated by anything and any type system wall is a cast() away from breaking, seems virtually impossible.
Sep 29 2014
On Monday, 29 September 2014 at 12:02:40 UTC, thedeemon wrote:On Monday, 29 September 2014 at 11:36:51 UTC, Abdulhaq wrote:Mmm yes I can imagine that being the case... oh well!So, (does anyone know) has this technique been discarded for D or is it 'just' a matter of the resources to do it?From the literature on this topic I remember attempts for automatic region inference mostly failed: it lead to some small regions here and there that didn't affect anything much and then one huge region where most of the data landed, requiring a full-blown GC inside that big region. Making it work in D, where everything can be mutated by anything and any type system wall is a cast() away from breaking, seems virtually impossible.
Sep 30 2014