digitalmars.D.learn - GC-less Hash-Tables (AA)
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (9/9) Sep 17 2014 Have anybody cooked any GC-less variants of hash-tables
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (4/7) Sep 17 2014 Important follow-up question: If types of key and value all have
- Justin Whear (4/16) Sep 17 2014 See our hashmap and hashset implementations here:
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (8/9) Sep 17 2014 Superb!
- H. S. Teoh via Digitalmars-d-learn (7/18) Sep 17 2014 How do you implement a completely GC-free AA with no limit on number of
- bearophile (6/8) Sep 17 2014 Ada2012 has a fixed-size hash in the standard library, it can
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (4/9) Sep 17 2014 I mean, GC-free not heap-free. An AA of course needs dynamic
- H. S. Teoh via Digitalmars-d-learn (7/15) Sep 17 2014 So you would use malloc/free?
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (7/8) Sep 17 2014 Yes, with GC-free I mean *not* involving D's automatic garbage
- H. S. Teoh via Digitalmars-d-learn (24/34) Sep 17 2014 Assuming you're talking about the built-in AA's, the current
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (8/12) Sep 17 2014 So I guess there is room for some optimizations here right?
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (6/7) Sep 17 2014 Further if they key and values are all fixed-size we should
- H. S. Teoh via Digitalmars-d-learn (20/28) Sep 17 2014 Keys and values are *always* fixed sized. Either they are a fixed-sized
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (5/16) Sep 18 2014 Very interesting, indeed. Do you have an D issue (list) to begin
- H. S. Teoh via Digitalmars-d-learn (14/30) Sep 17 2014 All of this has been thought of before. The big obstacle is that right
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (4/5) Sep 17 2014 What's the difference between std.containers.Array and
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (5/8) Sep 17 2014 What'a the cost of using hashset in favour of hashmap if I want
- Brian Schott (5/6) Sep 17 2014 With the exception of getting keys and values arrays. Those
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (69/72) Sep 18 2014 I get loads of erros on DMD 2.066:
- =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= (3/5) Sep 18 2014 My mistake. I hade accidentally used another version of
Have anybody cooked any GC-less variants of hash-tables (associative arrays) that take keys and values with value semantics only. Similar to how X[] relates to std.containers.Array!X I need this to index my nodes in graphs with tens of millions of nodes.
Sep 17 2014
On Wednesday, 17 September 2014 at 10:39:07 UTC, Nordlöw wrote:Have anybody cooked any GC-less variants of hash-tables (associative arrays) that take keys and values with value semantics only.Important follow-up question: If types of key and value all have value semantics (no indirections) will the hash-table still use the GC for allocations?
Sep 17 2014
On Wed, 17 Sep 2014 10:39:05 +0000, Nordlöw wrote:Have anybody cooked any GC-less variants of hash-tables (associative arrays) that take keys and values with value semantics only. Similar to how X[] relates to std.containers.Array!X I need this to index my nodes in graphs with tens of millions of nodes.See our hashmap and hashset implementations here: https://github.com/economicmodeling/containers/tree/master/src/containers These containers are all certified GC-free.
Sep 17 2014
On Wednesday, 17 September 2014 at 15:27:40 UTC, Justin Whear wrote:These containers are all certified GC-free.Superb! One question though: AFAIK a builtin hash-table in D shouldn't require nor use any GC-allocations if the keys and values all have reference semantics right (no string class, nor member indirections)? What is the current status on this in dmd/druntime?
Sep 17 2014
On Wed, Sep 17, 2014 at 07:23:01PM +0000, "Nordlöw" via Digitalmars-d-learn wrote:On Wednesday, 17 September 2014 at 15:27:40 UTC, Justin Whear wrote:How do you implement a completely GC-free AA with no limit on number of entries stored? I mean, where would it get the memory to store the hashtable from? T -- Once bitten, twice cry...These containers are all certified GC-free.Superb! One question though: AFAIK a builtin hash-table in D shouldn't require nor use any GC-allocations if the keys and values all have reference semantics right (no string class, nor member indirections)? What is the current status on this in dmd/druntime?
Sep 17 2014
H. S. Teoh:How do you implement a completely GC-free AA with no limit on number of entries stored?Ada2012 has a fixed-size hash in the standard library, it can even be allocated on the stack. But the number of entries is not unlimited. Bye, bearophile
Sep 17 2014
On Wednesday, 17 September 2014 at 19:51:06 UTC, H. S. Teoh via Digitalmars-d-learn wrote:How do you implement a completely GC-free AA with no limit on number of entries stored? I mean, where would it get the memory to store the hashtable from?I mean, GC-free not heap-free. An AA of course needs dynamic memory management a la C++'s std::vector.
Sep 17 2014
On Wed, Sep 17, 2014 at 10:26:16PM +0000, "Nordlöw" via Digitalmars-d-learn wrote:On Wednesday, 17 September 2014 at 19:51:06 UTC, H. S. Teoh via Digitalmars-d-learn wrote:So you would use malloc/free? T -- It is of the new things that men tire --- of fashions and proposals and improvements and change. It is the old things that startle and intoxicate. It is the old things that are young. -- G.K. ChestertonHow do you implement a completely GC-free AA with no limit on number of entries stored? I mean, where would it get the memory to store the hashtable from?I mean, GC-free not heap-free. An AA of course needs dynamic memory management a la C++'s std::vector.
Sep 17 2014
On Wednesday, 17 September 2014 at 22:36:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:So you would use malloc/free?Yes, with GC-free I mean *not* involving D's automatic garbage collector for the individual allocations of the AA *keys* and *values*. How are these normally allocated in a hash-table?...as individual heap allocations?
Sep 17 2014
On Wed, Sep 17, 2014 at 10:44:17PM +0000, "Nordlöw" via Digitalmars-d-learn wrote:On Wednesday, 17 September 2014 at 22:36:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:Assuming you're talking about the built-in AA's, the current implementation is as follows. There is the hashtable itself, which is just an array of pointers to individual buckets. The buckets are implemented as a linked-list of Slots, which are just structs containing a next pointer, the hash value, and a copy of the key and value. Obviously, individual allocations of Slots come from the GC heap, but in theory it could come from any allocator, potentially even malloc/free, since the lifetime of the Slots is well-defined (user code generally does not have direct access to them; so the AA always knows when a Slot is going out of scope). It does *not* perform separate allocations for keys and values. This means that if your keys and values are value-types, then there is no additional allocation besides the Slots themselves. For reference types, only the reference is stored (this is one of the sources of currently-open AA bugs where unexpected behaviour happens when non-immutable reference-type keys are used in AA's: somebody else changes the key after it's hashed, and this breaks the AA's ability to find that key/value again). No additional allocation is done unless the key/value type itself has a postblit that performs additional allocations. T -- I think Debian's doing something wrong, `apt-get install pesticide', doesn't seem to remove the bugs on my system! -- Mike DresserSo you would use malloc/free?Yes, with GC-free I mean *not* involving D's automatic garbage collector for the individual allocations of the AA *keys* and *values*. How are these normally allocated in a hash-table?...as individual heap allocations?
Sep 17 2014
On Wednesday, 17 September 2014 at 22:56:17 UTC, H. S. Teoh via Digitalmars-d-learn wrote:Slots come from the GC heap, but in theory it could come from any allocator, potentially even malloc/free, since the lifetime of the SlotsSo I guess there is room for some optimizations here right? We could use, for instance, http://dlang.org/phobos/std_traits.html#hasIndirections to check if we need to use GC. If not we can use malloc/free. Couldn't such an enhancement give significant speed-ups for small key-value allocations?
Sep 17 2014
On Wednesday, 17 September 2014 at 23:27:10 UTC, Nordlöw wrote:to check if we need to use GC. If not we can use malloc/free.Further if they key and values are all fixed-size we should probably use some allocator instead of malloc/free. That would make more efficient use of this allocation regularity, right? Is the current AA-allocator easily accessible/modifable somewhere in druntime?
Sep 17 2014
On Wed, Sep 17, 2014 at 11:43:27PM +0000, "Nordlöw" via Digitalmars-d-learn wrote:On Wednesday, 17 September 2014 at 23:27:10 UTC, Nordlöw wrote:Keys and values are *always* fixed sized. Either they are a fixed-sized value type, or they are a fixed-sized reference to some object (of reference type -- the references themselves are of course constant sized). We don't care about managing the resources for the reference type because that's supposed to be the user's problem.to check if we need to use GC. If not we can use malloc/free.Further if they key and values are all fixed-size we should probably use some allocator instead of malloc/free. That would make more efficient use of this allocation regularity, right?Is the current AA-allocator easily accessible/modifable somewhere in druntime?Well, (most of) the code is in druntime src/rt/aaA.d and while it's certainly *modifiable*, let's just say it's not a very pleasant experience to do so. :-) There is still a lot of compiler-dependency built into the code, and one of the major obstacles to making significant improvements is the reliance on typeinfo's rather than compile-time type information via templates. Ideally it should be a fully-decoupled library implementation interfacing with the compiler via a set API, but we're still some ways off from that right now. If you'd like to help it reach that point, it would be most welcome, but be aware that right now, doing things like changing the allocation scheme is not going to be an easy change. T -- To provoke is to call someone stupid; to argue is to call each other stupid.
Sep 17 2014
On Wednesday, 17 September 2014 at 23:57:03 UTC, H. S. Teoh via Digitalmars-d-learn wrote:compile-time type information via templates. Ideally it should be a fully-decoupled library implementation interfacing with the compiler via a set API, but we're still some ways off from that right now. If you'd like to help it reach that point, it would be most welcome, but be aware that right now, doing things like changing the allocation scheme is not going to be an easy change.Very interesting, indeed. Do you have an D issue (list) to begin with? Thanks for now.
Sep 18 2014
On Wed, Sep 17, 2014 at 11:27:08PM +0000, "Nordlöw" via Digitalmars-d-learn wrote:On Wednesday, 17 September 2014 at 22:56:17 UTC, H. S. Teoh via Digitalmars-d-learn wrote:All of this has been thought of before. The big obstacle is that right now, the AA implementation is split across dmd hacks, druntime, and object.di, and relies on typeinfo's rather than compile-time type information, so it's not easy to implement these improvements. Recently, there has been a slow but steady trickle of pull requests that have begun to address the implementation issues, and hopefully in the not-so-distant future we will be able to finally decouple the AA implementation from dmd, and stand a fighting chance of moving it into a library implementation. Once it's in the library, the improvements you mention will be trivial to implement. T -- Help a man when he is in trouble and he will remember you when he is in trouble again.Slots come from the GC heap, but in theory it could come from any allocator, potentially even malloc/free, since the lifetime of the SlotsSo I guess there is room for some optimizations here right? We could use, for instance, http://dlang.org/phobos/std_traits.html#hasIndirections to check if we need to use GC. If not we can use malloc/free. Couldn't such an enhancement give significant speed-ups for small key-value allocations?
Sep 17 2014
On Wednesday, 17 September 2014 at 15:27:40 UTC, Justin Whear wrote:These containers are all certified GC-free.What's the difference between std.containers.Array and DynamicArray? The RC?
Sep 17 2014
On Wednesday, 17 September 2014 at 19:39:16 UTC, Nordlöw wrote:On Wednesday, 17 September 2014 at 15:27:40 UTC, Justin Whear wrote:What'a the cost of using hashset in favour of hashmap if I want the to use the Range range() property?These containers are all certified GC-free.
Sep 17 2014
On Wednesday, 17 September 2014 at 15:27:40 UTC, Justin Whear wrote:These containers are all certified GC-free.With the exception of getting keys and values arrays. Those return GC memory to the caller. I'm pretty sure it's documented that it does that though. Everything else uses allocators.
Sep 17 2014
See our hashmap and hashset implementations here: https://github.com/economicmodeling/containers/tree/master/src/containers These containers are all certified GC-free.I get loads of erros on DMD 2.066: memory/allocators.d(81,4): Error: pure function 'memory.allocators.BlockAllocator!1024LU.BlockAllocator.~this' cannot access mutable static data 'it' memory/allocators.d(81,4): Error: pure function 'memory.allocators.BlockAllocator!1024LU.BlockAllocator.~this' cannot access mutable static data 'it' memory/allocators.d(81,28): Error: pure function 'memory.allocators.BlockAllocator!1024LU.BlockAllocator.~this' cannot call impure function 'std.allocator.Mallocator.deallocate' memory/allocators.d(81,28): Error: 'std.allocator.Mallocator.deallocate' is not nothrow memory/allocators.d(72,2): Error: destructor 'memory.allocators.BlockAllocator!1024LU.BlockAllocator.~this' is nothrow yet may throw memory/allocators.d(20,34): Error: template instance memory.allocators.BlockAllocator!1024LU error instantiating /home/per/Work/justd/containers/hashmap.d(339,29): instantiated from here: NodeAllocator!(32LU, 1024LU) /home/per/Work/justd/containers/hashmap.d(13,18): instantiated from here: HashMap!(string, int, hashString) /home/per/Work/justd/containers/hashmap.d(351,12): instantiated from here: HashMap!(string, int) /home/per/Work/justd/containers/hashmap.d(340,17): Error: template instance Freelist!(BlockAllocator!1024LU, 32LU, 32LU) is used as a type /home/per/Work/justd/containers/hashmap.d(341,22): Error: template instance Freelist!(BlockAllocator!1024LU, 32LU, 32LU) is used as a type /home/per/Work/justd/containers/hashmap.d(342,11): Error: template instance SList!(Node, SListNodeAllocator*) is used as a type /home/per/Work/justd/containers/hashmap.d(55,3): Error: undefined identifier deallocate, did you mean template reallocate(Allocator)(ref Allocator a, ref void[] b, size_t s)? /home/per/Work/justd/containers/hashmap.d(340,17): Error: template instance Freelist!(BlockAllocator!1024LU, 8LU, 8LU) is used as a type /home/per/Work/justd/containers/hashmap.d(341,22): Error: template instance Freelist!(BlockAllocator!1024LU, 8LU, 8LU) is used as a type /home/per/Work/justd/containers/hashmap.d(342,11): Error: template instance SList!(Node, SListNodeAllocator*) is used as a type /home/per/Work/justd/containers/hashmap.d(55,3): Error: undefined identifier deallocate, did you mean template reallocate(Allocator)(ref Allocator a, ref void[] b, size_t s)? /home/per/Work/justd/containers/hashmap.d(19,18): Error: template instance containers.hashmap.HashMap!(char, char, builtinHash) error instantiating /home/per/Work/justd/containers/hashmap.d(372,13): instantiated from here: HashMap!(char, char) memory/allocators.d(81,4): Error: pure function 'memory.allocators.BlockAllocator!512LU.BlockAllocator.~this' cannot access mutable static data 'it' memory/allocators.d(81,4): Error: pure function 'memory.allocators.BlockAllocator!512LU.BlockAllocator.~this' cannot access mutable static data 'it' memory/allocators.d(81,28): Error: pure function 'memory.allocators.BlockAllocator!512LU.BlockAllocator.~this' cannot call impure function 'std.allocator.Mallocator.deallocate' memory/allocators.d(81,28): Error: 'std.allocator.Mallocator.deallocate' is not nothrow memory/allocators.d(72,2): Error: destructor 'memory.allocators.BlockAllocator!512LU.BlockAllocator.~this' is nothrow yet may throw memory/allocators.d(20,34): Error: template instance memory.allocators.BlockAllocator!512LU error instantiating Compilation finished at Thu Sep 18 12:20:19
Sep 18 2014
On Thursday, 18 September 2014 at 10:21:29 UTC, Nordlöw wrote:My mistake. I hade accidentally used another version of std.allocator.These containers are all certified GC-free.I get loads of erros on DMD 2.066:
Sep 18 2014