digitalmars.D - Idea for allocators
- Diggory (83/83) May 31 2013
So, I've been thinking about a few of the current problems with D:
- No allocators on containers
- Standard library functions doing too much GC allocation
- Escaping pointers to memory not allocated using the GC
- Implicit allocation with "~", "~=" and array literals

And I came up with something that might be able to solve a few of these:

    string Test(Alloc = allocator(return))(string a, string b) {
        return a ~ b;
    }

Escape analysis would be done by the compiler for every allocation, whether that's implicit via "~", explicit with "new" or whatever. This will result in a list of ways that a reference to the allocated memory can escape, which can contain:
- By assignment to a global
- By return value
- By a particular parameter (if parameter is ref or contains a pointer)
- By the "this" parameter

If there are multiple ways it could escape then a partial ordering can help the compiler choose the most general, or if there is no reasonable ordering then it could error.

In each case the allocator can be specified using a template parameter:
- string Test(Alloc = allocator(global))(string a, string b);
- string Test(Alloc = allocator(return))(string a, string b);
- string Test(Alloc = allocator("a"))(string a, string b);
- string Test(Alloc = allocator(this))(string a, string b);

Multiple values could also be specified:
- string Test(Alloc = allocator("b", return))(string a, string b);

This does two things - it tells the caller what the allocator will be used for, and it helps the compiler decide which allocator to use for each allocation.

If an allocation can't be escaped at all then it should be allocated on the stack, or at least using a stack/region allocator for best performance.

Anyway, going back to this case:

    string Test(Alloc = allocator(return))(string a, string b) {
        return a ~ b;
    }

The compiler can see that the allocation caused by "a ~ b" can only be escaped via the return value, so it will automatically use the type "Alloc" as the allocator for that allocation.
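For comparison, here is roughly what that has to look like in today's D, where nothing is inferred and the allocator is threaded through by hand. This is only a sketch: "GCAlloc", "Region" and "concat" are made-up names for illustration, not Phobos types and not part of the proposal, and "a ~ b" is spelled out as an explicit allocate-and-copy because "~" cannot currently be redirected to an allocator.

```d
import std.stdio : writeln;

// Hypothetical allocator that simply defers to the GC (the fallback case).
struct GCAlloc
{
    void[] allocate(size_t n) { return new ubyte[](n); }
}

// Hypothetical bump allocator over a buffer supplied by the caller.
struct Region
{
    ubyte[] buffer; // backing storage owned by the caller
    size_t used;    // bytes handed out so far

    void[] allocate(size_t n)
    {
        if (n > buffer.length - used)
            return null; // out of space
        auto block = buffer[used .. used + n];
        used += n;
        return block;
    }
}

// Stand-in for Test(Alloc = allocator(return)): the only allocation is the
// result, and it escapes only via the return value, so it comes from `alloc`.
string concat(Alloc)(ref Alloc alloc, string a, string b)
{
    auto mem = cast(char[]) alloc.allocate(a.length + b.length);
    assert(mem.length == a.length + b.length, "allocation failed");
    mem[0 .. a.length] = a[];
    mem[a.length .. $] = b[];
    return cast(string) mem; // caller must not outlive the allocator's storage
}

void main()
{
    GCAlloc gc;
    writeln(concat(gc, "Hello ", "world!")); // result lives on the GC heap

    ubyte[64] storage;                       // lives in main's stack frame
    auto region = Region(storage[]);
    writeln(concat(region, "Hello ", "world!")); // result lives in `storage`
}
```

The point of the proposal is that the explicit "alloc" argument and the manual copy would both disappear: the compiler would route the "~" allocation to "Alloc" on its own.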
If "Test" is called like so:

    void Test2() {
        auto result = Test("Hello ", "world!");
        if (result.length > 5)
            writeln("Blah");
    }

The "Alloc" parameter has a default value of "allocator(...)" which means that the caller should try to figure out what to pass in. "allocator(return)" means it will be used to allocate the return value, so the compiler performs escape analysis on the return value and finds out that it never escapes, and so provides a simple stack/region allocator.

The Alloc parameter is a normal template parameter aside from its default value so you can always explicitly specify a different allocator to use (saves having GC and no-GC versions of each Phobos function). It can also be used directly as an allocator from within the function.

It could also be used with non-function templates such as containers, although the only useful defaults would be "allocator(this)" and "allocator(global)". The allocator would still be filled in automatically by the compiler if not specified so that it could potentially allocate an entire container in a stack/region allocator and all transparent to the caller.

In cases where there is no allocator such as in a non-template it will fall back to using the GC and so be completely backward compatible.

This could all be quite difficult to implement but it does provide some nice benefits:
- In most cases the only thing needed to take advantage is to add "Alloc = allocator(return)" to the template parameters
- Should massively reduce GC usage and cost of allocations (think toLower, etc.)
- No new syntax apart from the keyword "allocator"
- Can still use "~" and all the other nice features of D even in performance critical/no-GC code
- Compiler analysis required is confined to a single function at a time
- The biggest problem with allocators in C++ is that nobody actually bothers to use them. Since in this case the best allocator is chosen automatically that's not a problem.
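To make the caller side concrete, here is a hand-written approximation of what the compiler would be doing for "Test2". Again this is only a sketch in current D with made-up names ("GCAlloc", "Region", "test", "test2", "escaping"); the choice between the region and the GC is written out manually here, whereas under the proposal it would fall out of escape analysis at the call site.

```d
import std.stdio : writeln;

// Hypothetical GC-backed allocator, used as the backward-compatible default.
struct GCAlloc
{
    void[] allocate(size_t n) { return new ubyte[](n); }
}

// Hypothetical bump allocator over caller-owned storage.
struct Region
{
    ubyte[] buffer;
    size_t used;

    void[] allocate(size_t n)
    {
        if (n > buffer.length - used)
            return null;
        auto block = buffer[used .. used + n];
        used += n;
        return block;
    }
}

// Alloc is an ordinary template parameter with a default, so callers that say
// nothing get the GC, and callers that know better can pass something else.
// The allocator is taken by value, which is enough for a single allocation.
string test(Alloc = GCAlloc)(string a, string b, Alloc alloc = Alloc.init)
{
    auto mem = cast(char[]) alloc.allocate(a.length + b.length);
    assert(mem.length == a.length + b.length, "allocation failed");
    mem[0 .. a.length] = a[];
    mem[a.length .. $] = b[];
    return cast(string) mem;
}

// The result never leaves this function, so stack storage is sufficient.
bool test2()
{
    ubyte[32] storage; // what the proposal would set up automatically
    auto result = test("Hello ", "world!", Region(storage[]));
    return result.length > 5;
}

// The result escapes via the return value, so it must outlive this frame:
// no allocator is passed and the GC default is used.
string escaping()
{
    return test("Hello ", "world!");
}

void main()
{
    writeln(test2());
    writeln(escaping());
}
```

Note that nothing in current D stops "test2" from accidentally returning "result" while it still points into "storage"; the escape analysis in the proposal is what would make the automatic choice safe.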
May 31 2013