digitalmars.D - Fun project - faster associative array algorithm
- Walter Bright (15/15) Apr 07 2015 The current D associative array algorithm
- w0rp (11/27) Apr 07 2015 I have already been experimenting with this myself actually. My
- w0rp (18/18) Apr 07 2015 I'll also take this chance to once again plug 'setDefault' as a
- bearophile (5/7) Apr 07 2015 One problem with D strings is that they don't cache their hash
- w0rp (5/12) Apr 07 2015 I probably need to put that back in for some types. I tried
- Martin Nowak (17/21) Apr 07 2015 Quite to the contrary, it's very cache friendly.
- w0rp (5/11) Apr 07 2015 One thing I was wondering about, which you might know more about,
- Steven Schveighoffer (4/15) Apr 07 2015 BTW, fun fact about D AA's, the load factor is 4. As in nodes =
- Martin Nowak (4/10) Apr 08 2015 You made a mistake somewhere, the sweet spot should be in the
- Jens Bauer (9/10) Apr 07 2015 I did a quick-scan of the source and didn't see any Bloom filter
- Marco Leise (10/21) Apr 08 2015 More precisely it adds a memory and time overhead, but
- deadalnix (5/5) Apr 07 2015 Food for thought :
- thedeemon (4/9) Apr 08 2015 More food, D implementation and benchmark results (default AAs
- Andrei Alexandrescu (3/11) Apr 07 2015 Arrays would need to move data. Current hashtables rely on values
- bearophile (7/11) Apr 07 2015 The efficiency behavour of modern CPUs+memory pyramid are rather
- Andrei Alexandrescu (2/11) Apr 07 2015 You must have replied to the wrong post. -- Andrei
- Marco Leise (34/48) Apr 08 2015
- deadalnix (6/8) Apr 07 2015 I think it is fair to say current AA are bankrupt and need a
- Andrei Alexandrescu (6/13) Apr 07 2015 That's not enough. People may take the address of elements in the
- Steven Schveighoffer (7/20) Apr 07 2015 The correct way forward is to implement the AA in the library in the
- Walter Bright (3/4) Apr 07 2015 The data structure has changed before (used to be a binary tree) without...
- Steven Schveighoffer (8/12) Apr 07 2015 Sure but there are so many gotchas to this thing. Multiple very smart,
- H. S. Teoh via Digitalmars-d (8/22) Apr 07 2015 [...]
- Walter Bright (2/14) Apr 07 2015 Conform to the existing interface and it'll work.
- Daniel Murphy (4/9) Apr 07 2015 I mostly agree, except I think we only need _an_ AA in the library, not
- Martin Nowak (15/17) Apr 08 2015 I really don't that all or nothing attitude, it condemns an
- Martin Nowak (3/5) Apr 08 2015 There is a clear acceptance list for a good AA here.
- Steven Schveighoffer (29/42) Apr 09 2015 I'm not suggesting nobody ever do anything out of comfort. What I'm
- Daniel Murphy (3/8) Apr 09 2015 Working out what to lower to is easy, working out when to lower it is ha...
- Steven Schveighoffer (4/12) Apr 09 2015 Wouldn't it be whenever AA was previously invoked? I'm surprised there
- Daniel Murphy (5/7) Apr 09 2015 It has to be done before some parts of semantic, but after others. Eg e...
- Steven Schveighoffer (7/14) Apr 09 2015 I think a reasonable path is to lower as early as possible, have
- Walter Bright (4/5) Apr 07 2015 If the array of buckets were enhanced to include the hashes, the key/val...
- Andrei Alexandrescu (3/9) Apr 07 2015 Not sure I understand. What if the collision array needs to grow? That
- Walter Bright (7/17) Apr 07 2015 The collision array is an array of pairs:
- Andrei Alexandrescu (2/21) Apr 07 2015 Ah, I thought the array embeds KeyValue directly. Thx! -- Andrei
- Martin Nowak (6/7) Apr 08 2015 It should if they are relatively small compared to the pointer,
- Dejan Lekic (9/25) Apr 08 2015 Every now and then I think how nice it would be for Phobos to be
- Marco Leise (37/37) Apr 08 2015 What's a faster algorithm? There are some that deal with
The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.
Apr 07 2015
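A minimal sketch of the kind of table Walter describes: one flat array of buckets, collisions resolved by probing nearby slots instead of chasing list nodes. This is written in C++ rather than D purely for illustration; the `IntMap`/`Bucket` names and the resize-at-half-full policy are assumptions of the sketch, not anything from druntime.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Open-addressing hash table with linear (local) probing:
// collisions stay in the same flat array, so a probe walks
// adjacent memory instead of dereferencing linked-list nodes.
class IntMap {
    struct Bucket { uint64_t key; int value; bool used = false; };
    std::vector<Bucket> buckets_;   // size must stay a power of two
    size_t count_ = 0;

public:
    explicit IntMap(size_t n = 16) : buckets_(n) {}

    void insert(uint64_t key, int value) {
        // Keep load factor <= 0.5 so probe sequences stay short.
        if (2 * (count_ + 1) > buckets_.size()) grow();
        size_t i = std::hash<uint64_t>{}(key) & (buckets_.size() - 1);
        while (buckets_[i].used && buckets_[i].key != key)
            i = (i + 1) & (buckets_.size() - 1);   // linear probe
        if (!buckets_[i].used) ++count_;
        buckets_[i] = {key, value, true};
    }

    std::optional<int> find(uint64_t key) const {
        size_t i = std::hash<uint64_t>{}(key) & (buckets_.size() - 1);
        while (buckets_[i].used) {                 // empty slot ends the probe
            if (buckets_[i].key == key) return buckets_[i].value;
            i = (i + 1) & (buckets_.size() - 1);
        }
        return std::nullopt;
    }

private:
    void grow() {
        IntMap bigger(buckets_.size() * 2);
        for (const auto& b : buckets_)
            if (b.used) bigger.insert(b.key, b.value);
        *this = std::move(bigger);
    }
};
```

Note the trade-off Andrei raises later in the thread: growth rehashes entries into a new array, so element addresses are not stable.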
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.I have already been experimenting with this myself actually. My hashmap now uses a bucket array with quadratic probing. I was just following Martin Nowak's advice. My code for it is here. https://github.com/w0rp/dstruct/blob/master/source/dstruct/map.d I have a benchmark program available in my repository for testing it against the druntime version. It doesn't amaze me at the moment: it's slightly faster for integers and slightly slower for strings. I would appreciate any advice on it. I'm sure someone will look at my code and say, "Noo! Don't do that!"
Apr 07 2015
I'll also take this chance to once again plug 'setDefault' as a useful associative array primitive. It's in my library. I don't care what name is used for it, just the semantics. Nowak suggested 'getOrSet', which is a good name.

// A string -> int[] map
HashMap!(string, int[]) map;

// Set an element as usual.
map["a"] = [1, 2, 3];

// Returns [1, 2, 3] by reference, and appends 4 to it.
map.setDefault("a") ~= 4;

// Finds nothing in the map currently, returns (int[]).init by reference,
// now created in the map, and appends to that.
map.setDefault("b") ~= 1;
assert(map["b"] == [1]);

// setDefault with a lazy parameter.
int[] arr = map.setDefault("matey", [1, 2]);
assert(arr == [1, 2] && arr == map["matey"]);
Apr 07 2015
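For comparison, the semantics w0rp is asking for are what C++'s std::map::operator[] already provides: return the mapped value by reference, default-constructing and inserting one first if the key is absent. A sketch (the getOrSet wrapper name is hypothetical):

```cpp
#include <map>
#include <string>
#include <vector>

// setDefault/getOrSet semantics via std::map::operator[]:
// returns the mapped value by reference, inserting a
// default-constructed value first if the key is new.
inline std::vector<int>& getOrSet(std::map<std::string, std::vector<int>>& m,
                                  const std::string& key) {
    return m[key];
}
```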
w0rp:It doesn't amaze me at the moment, as it's slightly faster for integers, and slightly slower for strings at the moment.One problem with D strings is that they don't cache their hash value. Bye, bearophile
Apr 07 2015
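A sketch of the hash caching bearophile is alluding to, in C++ for illustration: a hypothetical key wrapper computes the string's hash once at construction, so repeated lookups compare a stored integer before ever touching the character data.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical key wrapper that caches its hash at construction.
// Lookups then compare a stored size_t first and only compare the
// bytes of the string when the cached hashes already match.
struct CachedKey {
    std::string str;
    size_t hash;

    explicit CachedKey(std::string s)
        : str(std::move(s)), hash(std::hash<std::string>{}(str)) {}

    bool operator==(const CachedKey& o) const {
        // Cheap integer comparison first; bytes only on a hash match.
        return hash == o.hash && str == o.str;
    }
};

// Hash support for use as an unordered container key: return the cache.
namespace std {
template <> struct hash<CachedKey> {
    size_t operator()(const CachedKey& k) const { return k.hash; }
};
}
```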
On Tuesday, 7 April 2015 at 18:28:22 UTC, bearophile wrote:w0rp:I probably need to put that back in for some types. I tried taking the hash values away. I know that for integers where the size is <= size_t.sizeof, then I can keep the hash codes out of the array, as the key is always the same as the hash.It doesn't amaze me at the moment, as it's slightly faster for integers, and slightly slower for strings at the moment.One problem with D strings is that they don't cache their hash value. Bye, bearophile
Apr 07 2015
On 04/07/2015 07:24 PM, Walter Bright wrote:https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.Quite to the contrary, it's very cache friendly. The triangular numbers are 1, 3, 6, 10, 15, 21. With 8 bytes per entry as in stringtable you need 2 collisions on the same "bucket" to land 48 bytes away, which might still be the same cache line. The problem with linear probing is primary clustering, especially when your hash isn't too good, which can lead to long streaks of neighbouring buckets that are filled. Any insert that hashes into such a lump (chances are increasing the bigger it becomes) will make it grow even further. For higher load factors such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385
Apr 07 2015
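The triangular-number probe sequence Martin describes (offsets 1, 3, 6, 10, ...) has a useful property that also answers w0rp's question below: on a power-of-two table it visits every bucket exactly once, so this form of quadratic probing only fails when the table is completely full. A small check in C++ (function name hypothetical):

```cpp
#include <cstddef>
#include <set>

// Walk a power-of-two table with triangular-number probing:
// on the k-th step, advance by k+1, so the offsets from the start
// bucket are the triangular numbers 1, 3, 6, 10, ...
// Returns how many distinct buckets the full sequence touches.
inline size_t distinctBucketsVisited(size_t tableSize, size_t start) {
    std::set<size_t> seen;
    size_t idx = start & (tableSize - 1);     // tableSize must be a power of two
    for (size_t step = 0; step < tableSize; ++step) {
        seen.insert(idx);
        idx = (idx + step + 1) & (tableSize - 1);
    }
    return seen.size();
}
```

If the return value equals tableSize, the probe sequence is a full permutation of the buckets, i.e. an insert can only fail when no bucket is free at all.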
On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:For higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
Apr 07 2015
On 4/7/15 3:07 PM, w0rp wrote:On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:BTW, fun fact about D AA's, the load factor is 4. As in nodes = 4xbuckets will be added before rehashing. -SteveFor higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
Apr 07 2015
On Tuesday, 7 April 2015 at 19:07:01 UTC, w0rp wrote:On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote: One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?You made a mistake somewhere, the sweet spot should be in the range of 0.6-0.8. Quadratic probing with triangular numbers is guaranteed to fail only when your buckets are completely full.
Apr 08 2015
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithmI did a quick-scan of the source and didn't see any Bloom filter there. I wonder if it would be best to have the Bloom filters completely external or if they should be included in the associative array? E.g. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted in where high speed is required.
Apr 07 2015
On Tue, 07 Apr 2015 19:09:15 +0000, "Jens Bauer" <doctor who.no> wrote:On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:More precisely it adds a memory and time overhead, but reduces the cost for most lookups of non-existent keys. When you just want an associative array for known-to-exist keys it would be a bad idea to use a Bloom filter. Opt-in is not possible with the current AA syntax as far as I can see. Unless you mean some wrapper struct as part of Phobos/druntime. -- MarcoThe current D associative array algorithmI did a quick-scan of the source and didn't see any Bloom filter there. I wonder if it would be best to have the Bloom filters completely external or if they should be included in the associative array? E.g. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted in where high speed is required.
Apr 08 2015
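A minimal sketch of the opt-in pre-filter Jens and Marco are discussing, in C++ for illustration: a small bit array is consulted before the real table lookup, so most lookups of non-existent keys are rejected without touching the table. The class name, the fixed 1024-bit size, and the two-hash scheme are all assumptions of the sketch.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <functional>

// Tiny Bloom filter with k = 2 hash functions over an m-bit array.
// maybeContains() == false guarantees the key was never added;
// true only means "possibly present" (false positives happen),
// so the real table must still be consulted on a hit.
class BloomFilter {
    static constexpr size_t kBits = 1024;
    std::bitset<kBits> bits_;

    static size_t h1(uint64_t key) {
        return std::hash<uint64_t>{}(key) % kBits;
    }
    static size_t h2(uint64_t key) {
        // Second, independent-ish hash: mix with a 64-bit odd constant.
        return std::hash<uint64_t>{}(key * 0x9e3779b97f4a7c15ULL) % kBits;
    }

public:
    void add(uint64_t key) {
        bits_.set(h1(key));
        bits_.set(h2(key));
    }

    bool maybeContains(uint64_t key) const {
        return bits_.test(h1(key)) && bits_.test(h2(key));
    }
};
```

This matches Marco's cost analysis: the filter adds memory and per-insert time, and only pays off when misses are common.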
Food for thought : http://codecapsule.com/2013/11/11/robin-hood-hashing/ http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf Also it is probably worthwhile to adopt various strategy depending on element types characteristics.
Apr 07 2015
On Tuesday, 7 April 2015 at 19:09:21 UTC, deadalnix wrote:Food for thought : http://codecapsule.com/2013/11/11/robin-hood-hashing/ http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf Also it is probably worthwhile to adopt various strategy depending on element types characteristics.More food, D implementation and benchmark results (default AAs vs. Vibe.d linear probing vs. Robin Hood) : http://www.infognition.com/blog/2014/on_robin_hood_hashing.html
Apr 08 2015
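The core trick of the Robin Hood scheme from deadalnix's links, sketched in C++ (all names hypothetical, no growth logic, table must stay under-full): during linear-probe insertion, track each entry's distance from its ideal bucket, and whenever the resident entry is closer to home than the one being inserted, the "rich" resident yields its slot and the displaced entry continues probing. This caps the variance of probe lengths.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Robin Hood hashing sketch: steal slots from entries that are
// closer to their home bucket than the entry we are carrying.
class RobinHoodMap {
    struct Slot { uint64_t key; int value; bool used = false; };
    std::vector<Slot> slots_;   // size must be a power of two; no resizing here

    size_t home(uint64_t key) const {
        return std::hash<uint64_t>{}(key) & (slots_.size() - 1);
    }
    // Probe distance of `key` if it were sitting at index idx.
    size_t dist(size_t idx, uint64_t key) const {
        return (idx + slots_.size() - home(key)) & (slots_.size() - 1);
    }

public:
    explicit RobinHoodMap(size_t n = 64) : slots_(n) {}

    void insert(uint64_t key, int value) {
        Slot carry{key, value, true};
        size_t i = home(key);
        for (;;) {
            if (!slots_[i].used) { slots_[i] = carry; return; }
            if (slots_[i].key == carry.key) { slots_[i].value = carry.value; return; }
            // "Rob the rich": a resident nearer its home yields the slot.
            if (dist(i, slots_[i].key) < dist(i, carry.key))
                std::swap(slots_[i], carry);
            i = (i + 1) & (slots_.size() - 1);
        }
    }

    std::optional<int> find(uint64_t key) const {
        size_t i = home(key);
        while (slots_[i].used) {
            if (slots_[i].key == key) return slots_[i].value;
            i = (i + 1) & (slots_.size() - 1);
        }
        return std::nullopt;
    }
};
```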
On 4/7/15 10:24 AM, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
Andrei Alexandrescu:The efficiency behaviour of modern CPUs and the memory pyramid is rather non-linear and unintuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases... Bye, bearophileA possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
On 4/7/15 3:18 PM, bearophile wrote:Andrei Alexandrescu:You must have replied to the wrong post. -- AndreiThe efficiency behavour of modern CPUs+memory pyramid are rather not linear and not intuitive. As Walter has said at the the start of this thread, arrays come out as more efficient in a large number of cases...A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
On Tue, 07 Apr 2015 15:26:45 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 4/7/15 3:18 PM, bearophile wrote:No, that's really what Walter said: "A possible cache-friendly replacement would be an array [...]." Especially when iterating arrays, multiple elements might be in the same cache line and the CPU's prefetching will kick in and load the next array elements while the current ones are processed. Unexpected memory fetches from RAM can easily be >100 times slower. https://software.intel.com/sites/products/collateral/hpc/vtune/performance_analysis_guide.pdf (pg. 22) https://software.intel.com/en-us/forums/topic/287236 "Core i7 Xeon 5500 Series Data Source Latency (approximate):

L1 CACHE hit, ~4 cycles
L2 CACHE hit, ~10 cycles
L3 CACHE hit, line unshared ~40 cycles
L3 CACHE hit, shared line in another core ~65 cycles
L3 CACHE hit, modified in another core ~75 cycles
remote L3 CACHE ~100-300 cycles
Local Dram ~60 ns
Remote Dram ~100 ns
[...]
a) Structure data such that computationally related information resides within the same cache line.
b) Write your algorithms such that the higher frequency of access occurs on (near) adjacent memory. This reduces the TLB pressure.
c) Reduce the number of writes to RAM through use of temporary variables that can be registerized (optimized into a register).
d) Structure data such that you can manipulate it using SSE (when possible).
e) For parallel programming, coordinate activities amongst threads sharing cache levels (HT siblings for L1 and L2, same die or half die for L3, same NUMA node for multiple nodes).
-- Jim Dempsey" -- MarcoAndrei Alexandrescu:You must have replied to the wrong post. -- AndreiThe efficiency behaviour of modern CPUs and the memory pyramid is rather non-linear and unintuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases...A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 08 2015
On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say the current AAs are bankrupt and need a revamping anyway. We can make the in operator return a wrapper that casts to bool (safely) and gets/updates the data (@system-ly).
Apr 07 2015
On 4/7/15 3:22 PM, deadalnix wrote:On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Doesn't strike me as a fair statement.Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say current AA are bankrupt and need a revamping anyway.We can make the in operator return a wrapper that cast to bool (safely) and get/update the data (systemely).That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D. Andrei
Apr 07 2015
On 4/7/15 6:27 PM, Andrei Alexandrescu wrote:On 4/7/15 3:22 PM, deadalnix wrote:The correct way forward is to implement the AA in the library in the safest way possible. Then, make the library implementation customizable for specialized situations. But it needs to be fully in the library, not partly in the compiler as it is now. Until that point, all these discussions on AA updates are useless. -SteveOn Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Doesn't strike me as a fair statement.Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say current AA are bankrupt and need a revamping anyway.We can make the in operator return a wrapper that cast to bool (safely) and get/update the data (systemely).That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D.
Apr 07 2015
On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On 4/7/15 7:01 PM, Walter Bright wrote:On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about. -SteveUntil that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On Tue, Apr 07, 2015 at 07:47:37PM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 4/7/15 7:01 PM, Walter Bright wrote:[...] Didn't somebody check in a whole bunch of PRs already in this direction? IIRC, there are only 1 or 2 more pieces left before AA's can be fully implemented as a library type. T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANGOn 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On 4/7/2015 4:47 PM, Steven Schveighoffer wrote:On 4/7/15 7:01 PM, Walter Bright wrote:Conform to the existing interface and it'll work.On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
"Steven Schveighoffer" wrote in message news:mg1m0g$91e$1 digitalmars.com...The correct way forward is to implement the AA in the library in the safest way possible. Then, make the library implementation customizable for specialized situations. But it needs to be fully in the library, not partly in the compiler as it is now. Until that point, all these discussions on AA updates are useless.I mostly agree, except I think we only need _an_ AA in the library, not necessarily _the_ AA.
Apr 07 2015
On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.I really don't like that all-or-nothing attitude, it condemns an important step just because something better might be possible. It's also very comfortable, because this way nobody will ever have to do anything. Improving the AA implementation has a big immediate effect, and will also help anyone writing a library implementation, because any candidate up to this date was still based on that crappy bucket list implementation. The biggest problems in writing an AA library implementation, sorted by difficulty, are:
- deprecation of all magic AA behaviors (attributes, as[i][j]++)
- getting the lowering right
- efficient construction and value insertion (rvalue moving)
Apr 08 2015
On Wednesday, 8 April 2015 at 09:12:00 UTC, Martin Nowak wrote:The biggest problems in writing an AA library implementation sorted by difficulty are:There is a clear acceptance list for a good AA here. https://github.com/D-Programming-Language/druntime/pull/934#issuecomment-65916801
Apr 08 2015
On 4/8/15 5:11 AM, Martin Nowak wrote:On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.I really don't like that all-or-nothing attitude, it condemns an important step just because something better might be possible. It's also very comfortable, because this way nobody will ever have to do anything.I'm not suggesting nobody ever do anything out of comfort. What I'm suggesting is that it's not worth improving something that will be discarded anyway. Let's fix the problem hampering people from understanding how the AA works so they can work on improving the AA. Namely, the discarding of all compile-time info during implementation.Improving the AA implementation has a big immediate effect, and will also help anyone writing a library implementation, because any candidate up to this date was still based on that crappy bucket list implementation.The "crappy bucket list" implementation is also the simplest. When the focus should be on fixing the AA interface with the compiler and library, I think having a simple, understandable implementation is fine. Let me remind you of your attitude when I brought up the fact that the AA's load factor was 4 (just fixing this one number could improve performance, even for the bucket list implementation):We'll throw the current implementation away and implement an open-addressing AA once we get to it. So don't worry about the load factor right now. In other words, don't bother fixing the constant "4" in the code, it's going to be thrown away anyway.The biggest problems in writing an AA library implementation sorted by difficulty are: - deprecation of all magic AA behaviors (attributes, as[i][j]++) - get the lowering rightI really am surprised that these would be difficult at all. The lowering is pretty easy:

a) T[U] => AssociativeArray!(U, T)
b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);

Basically, bikeshedding the name "literal" is the difficult part. And deprecating the magic AA behaviors, isn't this just removing a whole lot of code from the compiler? Once the lowering is working, I don't see how removing special cases would be difficult. The part where I see we may have issues is CTFE support. But I would defer to those who have already tried this exercise, perhaps they can share what broke their solutions. -Steve
Apr 09 2015
"Steven Schveighoffer" wrote in message news:mg5pf0$1g51$1 digitalmars.com...I really am surprised that these would be difficult at all. The lowering is pretty easy: a) T[U] => AssociativeArray!(U,T) b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);Working out what to lower to is easy, working out when to lower it is hard.
Apr 09 2015
On 4/9/15 8:07 AM, Daniel Murphy wrote:"Steven Schveighoffer" wrote in message news:mg5pf0$1g51$1 digitalmars.com...Wouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here. -SteveI really am surprised that these would be difficult at all. The lowering is pretty easy: a) T[U] => AssociativeArray!(U,T) b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);Working out what to lower to is easy, working out when to lower it is hard.
Apr 09 2015
"Steven Schveighoffer" wrote in message news:mg61gn$1o09$1 digitalmars.com...Wouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here.It has to be done before some parts of semantic analysis, but after others. E.g. error messages and template matching still need to be done on the AA type, but other parts need the actual template type.
Apr 09 2015
On 4/9/15 11:25 AM, Daniel Murphy wrote:"Steven Schveighoffer" wrote in message news:mg61gn$1o09$1 digitalmars.com...I think a reasonable path is to lower as early as possible, have everything show up as the template type in error messages, and then work on updating the messages over time to reflect the builtin syntax. In any case, I didn't think about template matching, those are handled specially for AA. Perhaps we can make that more generic. -SteveWouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here.It has to be done before some parts of semantic, but after others. Eg error messages and template matching still needs to be done on the AA type, but other parts need the actual template type.
Apr 09 2015
On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Arrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On 4/7/15 3:56 PM, Walter Bright wrote:On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:On 4/7/15 3:56 PM, Walter Bright wrote:The collision array is an array of pairs: KeyValue *kv; Hash_t hash; growing the array moves those around, but the locations of the array elements are never exposed to the users. The addresses of kv are exposed, but those don't move.On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
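Walter's layout, sketched in C++ for illustration: the probed bucket array stores small (hash, pointer) pairs, so growth only reshuffles those pairs while the KeyValue allocations, whose addresses users may have taken, never move. The class and member names are hypothetical, and duplicate-key handling is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

struct KeyValue { uint64_t key; int value; };

// The probed array holds only {hash, pointer} pairs. Rehashing moves
// the pairs; the heap-allocated KeyValue nodes stay put, so a pointer
// handed out by insert() remains valid across table growth.
class StablePtrMap {
    struct Bucket { uint64_t hash = 0; KeyValue* kv = nullptr; };
    std::vector<Bucket> buckets_;                    // power-of-two size
    std::vector<std::unique_ptr<KeyValue>> nodes_;   // owns the stable nodes

    static uint64_t hashOf(uint64_t key) {
        return key * 0x9e3779b97f4a7c15ULL;          // cheap mixing hash
    }

public:
    explicit StablePtrMap(size_t n = 8) : buckets_(n) {}

    int* insert(uint64_t key, int value) {
        if (2 * (nodes_.size() + 1) > buckets_.size()) grow();
        nodes_.push_back(std::make_unique<KeyValue>(KeyValue{key, value}));
        KeyValue* kv = nodes_.back().get();
        place(hashOf(key), kv);
        return &kv->value;    // stays valid even after the table grows
    }

private:
    void place(uint64_t h, KeyValue* kv) {
        size_t i = h & (buckets_.size() - 1);
        while (buckets_[i].kv) i = (i + 1) & (buckets_.size() - 1);
        buckets_[i] = {h, kv};
    }
    void grow() {
        std::vector<Bucket> old = std::move(buckets_);
        buckets_.assign(old.size() * 2, {});
        for (const auto& b : old)
            if (b.kv) place(b.hash, b.kv);    // only the pairs move, not the nodes
    }
};
```

Storing the hash next to the pointer also gives the cheap pre-filter Walter mentions: the KeyValue node is only dereferenced when the stored hash matches.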
On 4/7/15 4:18 PM, Walter Bright wrote:On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:Ah, I thought the array embeds KeyValue directly. Thx! -- AndreiOn 4/7/15 3:56 PM, Walter Bright wrote:The collision array is an array of pairs: KeyValue *kv; Hash_t hash; growing the array moves those around, but the locations of the array elements are never exposed to the users. The addresses of kv are exposed, but those don't move.On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On Tuesday, 7 April 2015 at 23:38:21 UTC, Andrei Alexandrescu wrote:Ah, I thought the array embeds KeyValue directly. Thx! -- AndreiIt should if they are relatively small compared to the pointer, or at least store the key inline. As we recently discussed about the freeing bug in the AA, it should be fine to store values inline, as long as the bucket array is never deleted.
Apr 08 2015
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.Every now and then I think how nice it would be for Phobos to be just a bunch of "module interfaces" along with default implementations of them. Unfortunately, or fortunately, D does not have module interfaces like, say, Modula-3. It would make things much easier for someone who wants to work on a different implementation of a certain module (or even whole package) implementation. I wonder what other people think about this.
Apr 08 2015
What's a faster algorithm? There are some that deal with collisions differently but might be slower with good hash distributions. You can improve iteration speed by storing only an index into a dense array of values, but it eats more memory. As far as I can see from my own experiments with HAMT (http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf) the time it takes to compute hashes and do a binary comparison for e.g. strings is so huge that the actual AA algorithm doesn't really matter. If anyone managed to make string AAs faster by swapping out the AA implementation I'd be surprised. Instead I focused on using malloc over the GC and borrowed w0rp's idea of not hashing integral data types such as int, char or bool. They are now their own hash, which is dangerous in terms of hash distribution, but acts as a real turbo boost over the built-in AA (+700% faster for lookups in my tests with a 100% hit rate). May be important: I found that Tuple generates a hash by addition of the hashes of its fields. This means that a coordinate tuple like Tuple!(int, int) will collide with mirror coordinates. E.g.: tuple(2,5) has the same hash as tuple(5,2) and so on. Furthermore the hash functions for each field are virtual function calls through a TypeInfo object that contains further branching to determine whether the type has an overridden toHash or the default implementation (SuperFastHash over the binary representation) is used. I found that:

    enum hasher = &typeid(T).getHash;
    hasher(&t);

already improved hashing speed for gdc a bit. Without looking at the generated assembly, my wild guess is that de-virtualization failed. TypeInfo objects should be known at compile time and getHash and other calls to it inlined. Even the dynamic branch to check for an overridden toHash is known at compile time. -- Marco
Apr 08 2015
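Marco's tuple-hash observation is easy to reproduce: with identity hashes for the int fields (as he and w0rp use), summing the field hashes makes mirrored tuples collide, since 2 + 5 = 5 + 2. Mixing the accumulator between fields breaks the symmetry. A sketch in C++, using an FNV-1a-style combiner as the (assumed) order-sensitive alternative:

```cpp
#include <cstdint>

// With identity hashes for the fields, summing them makes mirrored
// tuples collide: (2,5) and (5,2) both hash to 7.
inline uint64_t sumHash(uint64_t a, uint64_t b) { return a + b; }

// Mixing the accumulator between fields (xor then multiply by a prime,
// FNV-1a style) makes the combined hash depend on field order.
inline uint64_t mixHash(uint64_t a, uint64_t b) {
    uint64_t h = 14695981039346656037ULL;   // FNV-1a offset basis
    h = (h ^ a) * 1099511628211ULL;         // FNV-1a 64-bit prime
    h = (h ^ b) * 1099511628211ULL;
    return h;
}
```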