
digitalmars.D - Fun project - faster associative array algorithm

reply Walter Bright <newshound2 digitalmars.com> writes:
The current D associative array algorithm

     https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d

uses an array of buckets with a linked list attached to the buckets to resolve 
collisions.

Linked lists are perhaps the worst (i.e. slowest) data structure possible on 
modern CPUs with memory caches.

A possible cache-friendly replacement would be an array of buckets with local 
probing to resolve collisions. D programs are often heavily dependent on 
associative arrays, so this could be a big win for us.
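For readers unfamiliar with open addressing, a minimal sketch of such a bucket array with local (linear) probing, written in D, might look like the following. The names and layout are illustrative only, not the proposed druntime design:

```d
// Illustrative open-addressing table with linear probing; not druntime's design.
struct OpenAA(K, V)
{
    static struct Slot { size_t hash; K key; V value; bool used; }
    private Slot[] slots;
    private size_t len;

    void opIndexAssign(V value, K key)
    {
        if (slots.length == 0) slots = new Slot[8];
        if ((len + 1) * 4 > slots.length * 3) grow(); // keep load factor <= 0.75
        insert(slots, Slot(hashOf(key), key, value, true), len);
    }

    // Collisions are resolved by probing the immediately following slots,
    // which usually sit on the same (or the next) cache line.
    V* opBinaryRight(string op : "in")(K key)
    {
        if (slots.length == 0) return null;
        immutable h = hashOf(key);
        immutable mask = slots.length - 1;
        for (size_t i = h & mask; slots[i].used; i = (i + 1) & mask)
            if (slots[i].hash == h && slots[i].key == key)
                return &slots[i].value;
        return null;
    }

    private void grow()
    {
        auto bigger = new Slot[slots.length * 2];
        size_t count;
        foreach (ref s; slots)
            if (s.used) insert(bigger, s, count);
        slots = bigger;
    }

    private static void insert(Slot[] tab, Slot s, ref size_t len)
    {
        immutable mask = tab.length - 1; // tab.length is a power of two
        for (size_t i = s.hash & mask; ; i = (i + 1) & mask)
        {
            if (!tab[i].used) { tab[i] = s; ++len; return; }
            if (tab[i].hash == s.hash && tab[i].key == s.key)
            { tab[i].value = s.value; return; }
        }
    }
}

unittest
{
    OpenAA!(string, int) aa;
    aa["one"] = 1;
    aa["two"] = 2;
    assert(*("one" in aa) == 1 && ("three" in aa) is null);
}
```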

And it's a fun project, too. Any takers?



Interestingly,

 
https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c

uses quadratic probing instead of linked lists, but this is not cache friendly, 
either.
Apr 07 2015
next sibling parent reply "w0rp" <devw0rp gmail.com> writes:
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:
 The current D associative array algorithm

     
 https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d

 uses an array of buckets with a linked list attached to the 
 buckets to resolve collisions.

 Linked lists are perhaps the worst (i.e. slowest) data 
 structure possible on modern CPUs with memory caches.

 A possible cache-friendly replacement would be an array of 
 buckets with local probing to resolve collisions. D programs 
 are often heavily dependent on associative arrays, so this 
 could be a big win for us.

 And it's a fun project, too. Any takers?



 Interestingly,


 https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c

 uses quadratic probing instead of linked lists, but this is not 
 cache friendly, either.
I have already been experimenting with this myself, actually. My hashmap now uses a bucket array with quadratic probing, following Martin Nowak's advice. My code for it is here:

https://github.com/w0rp/dstruct/blob/master/source/dstruct/map.d

I have a benchmark program in my repository for testing it against the druntime version. The results don't amaze me at the moment: it's slightly faster for integers and slightly slower for strings. I would appreciate any advice on it. I'm sure someone will look at my code and say, "Noo! Don't do that!"
Apr 07 2015
next sibling parent "w0rp" <devw0rp gmail.com> writes:
I'll also take this chance to once again plug 'setDefault' as a 
useful associative array primitive. It's in my library. I don't 
care what name is used for it, just the semantics. Nowak 
suggested 'getOrSet', which is a good name.

// A string -> int[] map
HashMap!(string, int[]) map;

// Set an element as usual.
map["a"] = [1, 2, 3];

// Returns [1, 2, 3] by reference, and appends 4 to it.
map.setDefault("a") ~= 4;

// Finds nothing in the map currently, returns (int[]).init by reference,
// now created in the map, and appends to that.
map.setDefault("b") ~= 1;

assert(map["b"] == [1]);

// setDefault with a lazy parameter.
int[] arr = map.setDefault("matey", [1, 2]);

assert(arr == [1, 2] && arr == map["matey"]);
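For reference, a rough sketch of how such a primitive could be written against the built-in AA; this is a guess at the semantics described above, not w0rp's actual implementation:

```d
// Hedged sketch of a setDefault/getOrSet primitive over the built-in AA.
ref V setDefault(K, V)(ref V[K] aa, K key, lazy V def = V.init)
{
    if (auto p = key in aa)
        return *p;    // found: return the existing value by reference
    aa[key] = def;    // miss: evaluate the lazy default and insert it
    return aa[key];
}

unittest
{
    int[][string] map;
    map["a"] = [1, 2, 3];
    map.setDefault("a") ~= 4;
    assert(map["a"] == [1, 2, 3, 4]);
    map.setDefault("b") ~= 1;
    assert(map["b"] == [1]);
    int[] arr = map.setDefault("matey", [1, 2]);
    assert(arr == [1, 2] && arr == map["matey"]);
}
```

The lazy parameter means the default value is only constructed on a miss, which matters when the default is expensive to build.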
Apr 07 2015
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
w0rp:

 It doesn't amaze me at the moment, as it's slightly faster for 
 integers, and slightly slower for strings at the moment.
One problem with D strings is that they don't cache their hash value. Bye, bearophile
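One illustrative workaround (an assumption, not something druntime does) is a key wrapper that computes the hash once and reuses it for every probe and rehash:

```d
// Wraps a string together with its precomputed hash; toHash becomes O(1).
struct HashedKey
{
    string key;
    size_t cachedHash;

    this(string s)
    {
        key = s;
        cachedHash = hashOf(s); // pay the hashing cost exactly once
    }

    size_t toHash() const @safe nothrow
    {
        return cachedHash;
    }

    bool opEquals(const HashedKey o) const @safe nothrow
    {
        return cachedHash == o.cachedHash && key == o.key; // cheap reject on hash
    }
}

unittest
{
    int[HashedKey] counts;
    counts[HashedKey("abc")] = 1;
    assert(counts[HashedKey("abc")] == 1);
}
```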
Apr 07 2015
parent "w0rp" <devw0rp gmail.com> writes:
On Tuesday, 7 April 2015 at 18:28:22 UTC, bearophile wrote:
 w0rp:

 It doesn't amaze me at the moment, as it's slightly faster for 
 integers, and slightly slower for strings at the moment.
One problem with D strings is that they don't cache their hash value. Bye, bearophile
I probably need to put that back in for some types; I tried taking the hash values away. I know that for integers whose size is <= size_t.sizeof, I can keep the hash codes out of the array, as the key is always the same as the hash.
Apr 07 2015
prev sibling next sibling parent reply Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/07/2015 07:24 PM, Walter Bright wrote:
 https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c
 
 uses quadratic probing instead of linked lists, but this is not cache
 friendly, either.
Quite to the contrary, it's very cache friendly. The triangular numbers are 1, 3, 6, 10, 15, 21. With 8 bytes per entry, as in stringtable, you need 2 collisions on the same "bucket" to land 48 bytes away, which might still be the same cache line.

The problem with linear probing is primary clustering, especially when your hash isn't too good: it can lead to long streaks of neighbouring buckets that are filled, and any insert that hashes into such a lump (the chances increase the bigger it becomes) will make it grow even further. For a higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length, with extreme lengths at the upper 95/99 percentiles.

Here is the ticket for open addressing, btw:
https://issues.dlang.org/show_bug.cgi?id=14385
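The probe sequence Martin describes can be sketched as follows: advancing the index by 1, 2, 3, ... yields the triangular-number offsets 1, 3, 6, 10, ... from the home bucket. A hedged illustration, not the actual stringtable.c code:

```d
// Quadratic probing with triangular numbers, as used in stringtable.c (sketch).
size_t probe(size_t hash, size_t tableLen, bool delegate(size_t) stopHere)
{
    immutable mask = tableLen - 1; // tableLen must be a power of two
    size_t i = hash & mask;
    size_t j = 1;
    while (!stopHere(i))
    {
        i = (i + j) & mask; // offsets from home: 1, 3, 6, 10, 15, 21, ...
        ++j;
    }
    return i;
}

unittest
{
    // Observe the first slots visited from bucket 0 in a 64-slot table.
    size_t[] visited;
    probe(0, 64, (size_t i) { visited ~= i; return visited.length == 6; });
    size_t[] expected = [0, 1, 3, 6, 10, 15]; // triangular-number offsets
    assert(visited == expected);
}
```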
Apr 07 2015
parent reply "w0rp" <devw0rp gmail.com> writes:
On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:
 For higher load factor such as 0.75 this can lead to a huge 
 variance of
 the probe sequence length and extreme lengths on the upper 95/99
 percentiles.

 Here is the ticket for open addressing, btw.
 https://issues.dlang.org/show_bug.cgi?id=14385
One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
Apr 07 2015
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/7/15 3:07 PM, w0rp wrote:
 On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:
 For higher load factor such as 0.75 this can lead to a huge variance of
 the probe sequence length and extreme lengths on the upper 95/99
 percentiles.

 Here is the ticket for open addressing, btw.
 https://issues.dlang.org/show_bug.cgi?id=14385
One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
BTW, fun fact about D AAs: the load factor is 4, as in nodes = 4 x buckets will be added before rehashing.

-Steve
Apr 07 2015
prev sibling parent "Martin Nowak" <code dawg.eu> writes:
On Tuesday, 7 April 2015 at 19:07:01 UTC, w0rp wrote:
 On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:
 One thing I was wondering about, which you might know more 
 about, is that I had to set my load factor to be half the size 
 of the array, as quadratic probing seems to fail when more than 
 half the buckets are filled. Is that correct, or did I make a 
 mistake?
You made a mistake somewhere; the sweet spot should be in the range of 0.6-0.8. Quadratic probing with triangular numbers is guaranteed to fail only when your buckets are completely full.
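This claim is easy to check mechanically: in a power-of-two table, the triangular-number sequence visits every slot before repeating, so probing only fails once the table is completely full. A small illustration, not druntime code:

```d
unittest
{
    enum N = 64;               // any power of two works
    bool[N] seen;
    size_t i = 0;
    foreach (j; 1 .. N + 1)
    {
        seen[i] = true;        // mark the slot the probe sequence lands on
        i = (i + j) & (N - 1); // advance by 1, 2, 3, ... (triangular offsets)
    }
    foreach (hit; seen)
        assert(hit);           // N probes marked N slots: none was skipped
}
```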
Apr 08 2015
prev sibling next sibling parent reply "Jens Bauer" <doctor who.no> writes:
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:
 The current D associative array algorithm
I did a quick scan of the source and didn't see any Bloom filter there. I wonder whether it would be best to have the Bloom filter completely external or included in the associative array? E.g. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted into where high speed is required.
Apr 07 2015
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 07 Apr 2015 19:09:15 +0000
schrieb "Jens Bauer" <doctor who.no>:

 On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:
 The current D associative array algorithm
I did a quick-scan of the source and didn't see any Bloom filter there. I wonder if it would be best to have the Bloom filters completely external or if they should be included in the associative array ? -Eg. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted in, where high speed is required.
More precisely, it adds a memory and time overhead but reduces the cost of most lookups of non-existent keys. When you just want an associative array for known-to-exist keys, it would be a bad idea to use a Bloom filter. Opt-in is not possible with the current AA syntax as far as I can see, unless you mean some wrapper struct as part of Phobos/druntime.

-- Marco
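The opt-in pre-filter being discussed could look roughly like this; the filter size and hash mixing below are arbitrary illustration, not a tuned design:

```d
// Minimal Bloom filter: rejects most lookups of absent keys from a small,
// cache-friendly bit array before the AA itself is probed.
struct BloomFilter
{
    enum nbits = 4096;
    ulong[nbits / 64] bits;

    // derive two bit positions from one hash (double hashing)
    private static size_t[2] positions(size_t h) @safe nothrow
    {
        immutable h2 = cast(size_t)((h ^ (h >> 13)) * 0x9E3779B97F4A7C15UL % nbits);
        return [h % nbits, h2];
    }

    void insert(size_t hash) @safe nothrow
    {
        foreach (p; positions(hash))
            bits[p / 64] |= 1UL << (p % 64);
    }

    // false => definitely absent; true => maybe present, probe the AA
    bool maybeContains(size_t hash) const @safe nothrow
    {
        foreach (p; positions(hash))
            if (!(bits[p / 64] & (1UL << (p % 64))))
                return false;
        return true;
    }
}

unittest
{
    BloomFilter f;
    f.insert(hashOf("present"));
    assert(f.maybeContains(hashOf("present"))); // no false negatives, ever
}
```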
Apr 08 2015
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
Food for thought :

http://codecapsule.com/2013/11/11/robin-hood-hashing/
http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf

Also it is probably worthwhile to adopt various strategy 
depending on element types characteristics.
Apr 07 2015
parent "thedeemon" <dlang thedeemon.com> writes:
On Tuesday, 7 April 2015 at 19:09:21 UTC, deadalnix wrote:
 Food for thought :

 http://codecapsule.com/2013/11/11/robin-hood-hashing/
 http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf

 Also it is probably worthwhile to adopt various strategy 
 depending on element types characteristics.
More food, D implementation and benchmark results (default AAs vs. Vibe.d linear probing vs. Robin Hood) : http://www.infognition.com/blog/2014/on_robin_hood_hashing.html
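The core Robin Hood insertion step from those links, sketched in D (the Slot layout is illustrative, and the caller is assumed to pass a slot with used set):

```d
struct Slot { size_t hash; string key; int value; bool used; }

// Robin Hood displacement: whenever the incoming entry has probed further
// than the resident entry, they swap. This keeps probe lengths close to
// the mean instead of letting a few grow extreme.
void robinHoodInsert(Slot[] table, Slot s)
{
    immutable mask = table.length - 1; // table.length is a power of two
    size_t i = s.hash & mask;
    size_t dist = 0;                   // distance of s from its home bucket
    while (true)
    {
        if (!table[i].used) { table[i] = s; return; }
        immutable residentDist = (i - table[i].hash) & mask;
        if (residentDist < dist)
        {
            // resident is "richer" (closer to home): take its slot, carry it on
            auto evicted = table[i];
            table[i] = s;
            s = evicted;
            dist = residentDist;
        }
        i = (i + 1) & mask;
        ++dist;
    }
}
```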
Apr 08 2015
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/15 10:24 AM, Walter Bright wrote:
 The current D associative array algorithm


 https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d

 uses an array of buckets with a linked list attached to the buckets to
 resolve collisions.

 Linked lists are perhaps the worst (i.e. slowest) data structure
 possible on modern CPUs with memory caches.

 A possible cache-friendly replacement would be an array of buckets with
 local probing to resolve collisions.
Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 A possible cache-friendly replacement would be an array of 
 buckets with local probing to resolve collisions.
Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
The efficiency behaviour of the modern CPU + memory pyramid is neither linear nor intuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases... Bye, bearophile
Apr 07 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/15 3:18 PM, bearophile wrote:
 Andrei Alexandrescu:

 A possible cache-friendly replacement would be an array of buckets
 with local probing to resolve collisions.
Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
The efficiency behaviour of the modern CPU + memory pyramid is neither linear nor intuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases...
You must have replied to the wrong post. -- Andrei
Apr 07 2015
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 07 Apr 2015 15:26:45 -0700
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 On 4/7/15 3:18 PM, bearophile wrote:
 Andrei Alexandrescu:

 A possible cache-friendly replacement would be an array of buckets
 with local probing to resolve collisions.
Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
The efficiency behaviour of the modern CPU + memory pyramid is neither linear nor intuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases...
You must have replied to the wrong post. -- Andrei
No, that's really what Walter said: "A possible cache-friendly replacement would be an array [...]."

Especially when iterating arrays, multiple elements might be in the same cache line, and the CPU's prefetching will kick in and load the next array elements while the current ones are processed. Unexpected memory fetches from RAM can easily be >100 times slower.

https://software.intel.com/sites/products/collateral/hpc/vtune/performance_analysis_guide.pdf (pg. 22)
https://software.intel.com/en-us/forums/topic/287236

"Core i7 Xeon 5500 Series Data Source Latency (approximate):

L1 CACHE hit, ~4 cycles
L2 CACHE hit, ~10 cycles
L3 CACHE hit, line unshared ~40 cycles
L3 CACHE hit, shared line in another core ~65 cycles
L3 CACHE hit, modified in another core ~75 cycles
remote L3 CACHE ~100-300 cycles
Local DRAM ~60 ns
Remote DRAM ~100 ns

[...]

a) Structure data such that computationally related information resides within the same cache line.
b) Write your algorithms such that the higher frequency of access occurs on (near) adjacent memory. This reduces the TLB pressure.
c) Reduce the number of writes to RAM through use of temporary variables that can be registerized (optimized into registers).
d) Structure data such that you can manipulate it using SSE (when possible).
e) For parallel programming, coordinate activities amongst threads sharing cache levels (HT siblings for L1 and L2, same die or half die for L3, same NUMA node for multiple nodes).

-- Jim Dempsey"

-- Marco
Apr 08 2015
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu 
wrote:
 Arrays would need to move data. Current hashtables rely on 
 values staying put. -- Andrei
I think it is fair to say the current AAs are bankrupt and need revamping anyway.

We can make the in operator return a wrapper that casts to bool (safely) and gets/updates the data (in @system code).
Apr 07 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/15 3:22 PM, deadalnix wrote:
 On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values
 staying put. -- Andrei
I think it is fair to say current AA are bankrupt and need a revamping anyway.
Doesn't strike me as a fair statement.
 We can make the in operator return a wrapper that cast to bool (safely)
 and get/update the data (systemely).
That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D. Andrei
Apr 07 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/7/15 6:27 PM, Andrei Alexandrescu wrote:
 On 4/7/15 3:22 PM, deadalnix wrote:
 On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values
 staying put. -- Andrei
I think it is fair to say current AA are bankrupt and need a revamping anyway.
Doesn't strike me as a fair statement.
 We can make the in operator return a wrapper that cast to bool (safely)
 and get/update the data (systemely).
That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D.
The correct way forward is to implement the AA in the library in the safest way possible. Then, make the library implementation customizable for specialized situations. But it needs to be fully in the library, not partly in the compiler as it is now. Until that point, all these discussions on AA updates are useless. -Steve
Apr 07 2015
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:
 Until that point, all these discussions on AA updates are useless.
The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/7/15 7:01 PM, Walter Bright wrote:
 On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:
 Until that point, all these discussions on AA updates are useless.
The data structure has changed before (used to be a binary tree) without disruption and can change again.
Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about. -Steve
Apr 07 2015
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Apr 07, 2015 at 07:47:37PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 4/7/15 7:01 PM, Walter Bright wrote:
On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:
Until that point, all these discussions on AA updates are useless.
The data structure has changed before (used to be a binary tree) without disruption and can change again.
Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.
[...] Didn't somebody check in a whole bunch of PRs already in this direction? IIRC, there are only 1 or 2 more pieces left before AA's can be fully implemented as a library type. T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
Apr 07 2015
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/7/2015 4:47 PM, Steven Schveighoffer wrote:
 On 4/7/15 7:01 PM, Walter Bright wrote:
 On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:
 Until that point, all these discussions on AA updates are useless.
The data structure has changed before (used to be a binary tree) without disruption and can change again.
Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.
Conform to the existing interface and it'll work.
Apr 07 2015
prev sibling next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Steven Schveighoffer"  wrote in message 
news:mg1m0g$91e$1 digitalmars.com...

 The correct way forward is to implement the AA in the library in the 
 safest way possible. Then, make the library implementation customizable 
 for specialized situations. But it needs to be fully in the library, not 
 partly in the compiler as it is now. Until that point, all these 
 discussions on AA updates are useless.
I mostly agree, except I think we only need _an_ AA in the library, not necessarily _the_ AA.
Apr 07 2015
prev sibling parent reply "Martin Nowak" <code dawg.eu> writes:
On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer 
wrote:
 Until that point, all these discussions on AA updates are 
 useless.
I really don't like that all-or-nothing attitude; it condemns an important step just because something better might be possible. It's also very comfortable, because this way nobody will ever have to do anything.

Improving the AA implementation has a big immediate effect, and will also help anyone writing a library implementation, because any candidate up to this date was still based on that crappy bucket-list implementation.

The biggest problems in writing an AA library implementation, sorted by difficulty, are:
- deprecation of all magic AA behaviors (attributes, aa[i][j]++)
- getting the lowering right
- efficient construction and value insertion (rvalue moving)
Apr 08 2015
next sibling parent "Martin Nowak" <code dawg.eu> writes:
On Wednesday, 8 April 2015 at 09:12:00 UTC, Martin Nowak wrote:
 The biggest problems in writing an AA library implementation 
 sorted by difficulty are:
There is a clear acceptance list for a good AA here. https://github.com/D-Programming-Language/druntime/pull/934#issuecomment-65916801
Apr 08 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/8/15 5:11 AM, Martin Nowak wrote:
 On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer wrote:
 Until that point, all these discussions on AA updates are useless.
 I really don't that all or nothing attitude, it condemns an important
 step, just because something better might be possible. It's also very
 comfortable, because this way nobody will ever have to do anything.
I'm not suggesting nobody ever do anything out of comfort. What I'm suggesting is that it's not worth improving something that will be discarded anyway. Let's fix the problem hampering people from understanding how the AA works so they can work on improving the AA. Namely, the discarding of all compile-time info during implementation.
 Improving the AA implementation has a big immediate effect, and will
 also help anyone writing a library implementation, because any candidate
 up to this date was still based on that crappy bucket list implementation.
The "crappy bucket list" implementation is also the simplest. When the focus should be on fixing the AA interface with the compiler and library, I think having a simple understandable implementation is fine. Let me remind you of your attitude when I brought up the fact that AA's load factor was 4 (just fixing this one number could improve performance, even for the bucket list implementation):
 We'll throw the current implementation away and implement an open addressing AA once we get to it. So don't worry about the load factor right now.
In other words, don't bother fixing the constant "4" in the code; it's going to be thrown away anyway.
 The biggest problems in writing an AA library implementation sorted by
 difficulty are:
 - deprecation of all magic AA behaviors (attributes, as[i][j]++)
 - get the lowering right
I really am surprised that these would be difficult at all. The lowering is pretty easy:

a) T[U] => AssociativeArray!(U, T)
b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);

Basically, bikeshedding the name "literal" is the difficult part.

And deprecating the magic AA behaviors, isn't this just removing a whole lot of code from the compiler? Once the lowering is working, I don't see how removing special cases would be difficult.

The part where I see we may have issues is CTFE support. But I would defer to those who have already tried this exercise; perhaps they can share what broke their solutions.

-Steve
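To make the lowering concrete, here is a hypothetical shape of the target type; the struct below wraps the builtin AA purely for illustration, and "literal" is the name under discussion, not a settled API:

```d
struct AssociativeArray(Key, Value)
{
    private Value[Key] impl; // backed by the builtin AA only for this sketch

    // [a:b, c:d] would lower to AssociativeArray!(...).literal(a, b, c, d)
    static AssociativeArray literal(Args...)(Args args)
    {
        AssociativeArray aa;
        static foreach (i; 0 .. Args.length / 2)
            aa.impl[args[2 * i]] = args[2 * i + 1];
        return aa;
    }

    ref Value opIndex(Key k) { return impl[k]; }
    void opIndexAssign(Value v, Key k) { impl[k] = v; }
}

unittest
{
    // int[string] x = ["a": 1, "c": 2]; would lower to:
    auto x = AssociativeArray!(string, int).literal("a", 1, "c", 2);
    assert(x["a"] == 1 && x["c"] == 2);
}
```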
Apr 09 2015
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Steven Schveighoffer"  wrote in message 
news:mg5pf0$1g51$1 digitalmars.com...

 I really am surprised that these would be difficult at all. The lowering 
 is pretty easy:
 a) T[U] => AssociativeArray!(U,T)
 b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : 
 d)).literal(a, b, c, d);
Working out what to lower to is easy, working out when to lower it is hard.
Apr 09 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/9/15 8:07 AM, Daniel Murphy wrote:
 "Steven Schveighoffer"  wrote in message
 news:mg5pf0$1g51$1 digitalmars.com...

 I really am surprised that these would be difficult at all. The
 lowering is pretty easy:
 a) T[U] => AssociativeArray!(U,T)
 b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ?
 b : d)).literal(a, b, c, d);
Working out what to lower to is easy, working out when to lower it is hard.
Wouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here. -Steve
Apr 09 2015
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Steven Schveighoffer"  wrote in message 
news:mg61gn$1o09$1 digitalmars.com...

 Wouldn't it be whenever AA was previously invoked? I'm surprised there are 
 any unknowns here.
It has to be done before some parts of semantic, but after others. Eg error messages and template matching still needs to be done on the AA type, but other parts need the actual template type.
Apr 09 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/9/15 11:25 AM, Daniel Murphy wrote:
 "Steven Schveighoffer"  wrote in message
 news:mg61gn$1o09$1 digitalmars.com...

 Wouldn't it be whenever AA was previously invoked? I'm surprised there
 are any unknowns here.
It has to be done before some parts of semantic, but after others. Eg error messages and template matching still needs to be done on the AA type, but other parts need the actual template type.
I think a reasonable path is to lower as early as possible, have everything show up as the template type in error messages, and then work on updating the messages over time to reflect the builtin syntax. In any case, I didn't think about template matching, those are handled specially for AA. Perhaps we can make that more generic. -Steve
Apr 09 2015
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values staying put.
If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/15 3:56 PM, Walter Bright wrote:
 On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values
 staying put.
If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Not sure I understand. What if the collision array needs to grow? That might move it. -- Andrei
Apr 07 2015
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:
 On 4/7/15 3:56 PM, Walter Bright wrote:
 On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values
 staying put.
If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Not sure I understand. What if the collision array needs to grow? That might move it. -- Andrei
The collision array is an array of pairs:

    KeyValue* kv;
    Hash_t hash;

Growing the array moves those pairs around, but the locations of the array elements are never exposed to users. The addresses of the kv objects are exposed, but those don't move.
Apr 07 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/15 4:18 PM, Walter Bright wrote:
 On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:
 On 4/7/15 3:56 PM, Walter Bright wrote:
 On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:
 Arrays would need to move data. Current hashtables rely on values
 staying put.
If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Not sure I understand. What if the collision array needs to grow? That might move it. -- Andrei
The collision array is an array of pairs: KeyValue *kv; Hash_t hash; growing the array moves those around, but the locations of the array elements are never exposed to the users. The addresses of kv are exposed, but those don't move.
Ah, I thought the array embeds KeyValue directly. Thx! -- Andrei
Apr 07 2015
parent "Martin Nowak" <code dawg.eu> writes:
On Tuesday, 7 April 2015 at 23:38:21 UTC, Andrei Alexandrescu 
wrote:
 Ah, I thought the array embeds KeyValue directly. Thx! -- Andrei
It should if they are relatively small compared to the pointer, or at least store the key inline. As we recently discussed about the freeing bug in the AA, it should be fine to store values inline, as long as the bucket array is never deleted.
Apr 08 2015
prev sibling next sibling parent "Dejan Lekic" <dejan.lekic gmail.com> writes:
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:
 The current D associative array algorithm

     
 https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d

 uses an array of buckets with a linked list attached to the 
 buckets to resolve collisions.

 Linked lists are perhaps the worst (i.e. slowest) data 
 structure possible on modern CPUs with memory caches.

 A possible cache-friendly replacement would be an array of 
 buckets with local probing to resolve collisions. D programs 
 are often heavily dependent on associative arrays, so this 
 could be a big win for us.

 And it's a fun project, too. Any takers?



 Interestingly,


 https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c

 uses quadratic probing instead of linked lists, but this is not 
 cache friendly, either.
Every now and then I think how nice it would be for Phobos to be just a bunch of "module interfaces" along with default implementations of them. Unfortunately, or fortunately, D does not have module interfaces like, say, Modula-3. It would make things much easier for someone who wants to work on a different implementation of a certain module (or even a whole package). I wonder what other people think about this.
Apr 08 2015
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
What's a faster algorithm? There are some that deal with
collisions differently but might be slower with good hash
distributions. You can improve iteration speed by storing only
an index into a dense array of values, but it eats more memory.

As far as I can see from my own experiments with HAMT
(http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf)
the time it takes to compute hashes and do a binary comparison
for e.g. strings is so huge that the actual AA algorithm
doesn't really matter. If anyone managed to make string AAs
faster by swapping out the AA implementation I'd be surprised.

Instead I focused on using malloc over GC and borrowed w0rp's
idea of not hashing integral data types such as int, char or
bool. They are now their own hash, which is dangerous in terms
of hash distribution, but acts as a real turbo boost over the
built-in AA (+700% faster for lookups in my tests with 100%
hit rate.)

May be important:

I found that Tuple generates a hash by addition of the hashes
of its fields. This means that a coordinate tuple like
Tuple!(int, int) will collide with mirror coordinates.
E.g.: tuple(2,5) has the same hash as tuple(5,2) and so on.
Furthermore the hash functions for each field are virtual
function calls from a TypeInfo object that contains further
branching to determine whether the type has an overridden
toHash or the default implementation (SuperFastHash over
binary representation) is used.
I found that:

  enum hasher = &typeid(T).getHash;
  hasher(&t);

already improved hashing speed for gdc a bit. Without looking
at the generated assembly my wild guess is that
de-virtualization failed. TypeInfo objects should be known at
compile time and getHash and other calls to it inlined. Even
the dynamic branch to check for overridden toHash is known at
compile-time.
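To make the mirror-coordinate collision concrete, along with one conventional fix, here is a sketch; the multiplicative constant is a common Fibonacci-hashing mixer chosen for illustration, not what druntime or Phobos uses:

```d
// Additive combining is symmetric, so tuple(2,5) and tuple(5,2) collide.
size_t additiveHash(int a, int b) @safe nothrow
{
    return cast(size_t)a + cast(size_t)b;
}

// Multiplying between fields makes the combination order-sensitive.
size_t mixedHash(int a, int b) @safe nothrow
{
    ulong h = cast(uint)a;
    h = h * 0x9E3779B97F4A7C15UL + cast(uint)b;
    return cast(size_t)h; // truncates on 32-bit targets; fine for the sketch
}

unittest
{
    assert(additiveHash(2, 5) == additiveHash(5, 2)); // the collision described above
    assert(mixedHash(2, 5) != mixedHash(5, 2));       // mixing breaks the symmetry
}
```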

-- 
Marco
Apr 08 2015