digitalmars.D - Fun project - faster associative array algorithm
- Walter Bright (15/15) Apr 07 2015 The current D associative array algorithm
- w0rp (11/27) Apr 07 2015 I have already been experimenting with this myself actually. My
- w0rp (18/18) Apr 07 2015 I'll also take this chance to once again plug 'setDefault' as a
- bearophile (5/7) Apr 07 2015 One problem with D strings is that they don't cache their hash
- w0rp (5/12) Apr 07 2015 I probably need to put that back in for some types. I tried
- Martin Nowak (17/21) Apr 07 2015 Quite to the contrary, it's very cache friendly.
- w0rp (5/11) Apr 07 2015 One thing I was wondering about, which you might know more about,
- Steven Schveighoffer (4/15) Apr 07 2015 BTW, fun fact about D AA's, the load factor is 4. As in nodes =
- Martin Nowak (4/10) Apr 08 2015 You made a mistake somewhere, the sweet spot should be in the
- Jens Bauer (9/10) Apr 07 2015 I did a quick-scan of the source and didn't see any Bloom filter
- Marco Leise (10/21) Apr 08 2015 More precisely it adds a memory and time overhead, but
- deadalnix (5/5) Apr 07 2015 Food for thought :
- thedeemon (4/9) Apr 08 2015 More food, D implementation and benchmark results (default AAs
- Andrei Alexandrescu (3/11) Apr 07 2015 Arrays would need to move data. Current hashtables rely on values
- bearophile (7/11) Apr 07 2015 The efficiency behavour of modern CPUs+memory pyramid are rather
- Andrei Alexandrescu (2/11) Apr 07 2015 You must have replied to the wrong post. -- Andrei
- Marco Leise (34/48) Apr 08 2015
- deadalnix (6/8) Apr 07 2015 I think it is fair to say current AA are bankrupt and need a
- Andrei Alexandrescu (6/13) Apr 07 2015 That's not enough. People may take the address of elements in the
- Steven Schveighoffer (7/20) Apr 07 2015 The correct way forward is to implement the AA in the library in the
- Walter Bright (3/4) Apr 07 2015 The data structure has changed before (used to be a binary tree) without...
- Steven Schveighoffer (8/12) Apr 07 2015 Sure but there are so many gotchas to this thing. Multiple very smart,
- H. S. Teoh via Digitalmars-d (8/22) Apr 07 2015 [...]
- Walter Bright (2/14) Apr 07 2015 Conform to the existing interface and it'll work.
- Daniel Murphy (4/9) Apr 07 2015 I mostly agree, except I think we only need _an_ AA in the library, not
- Martin Nowak (15/17) Apr 08 2015 I really don't that all or nothing attitude, it condemns an
- Martin Nowak (3/5) Apr 08 2015 There is a clear acceptance list for a good AA here.
- Steven Schveighoffer (29/42) Apr 09 2015 I'm not suggesting nobody ever do anything out of comfort. What I'm
- Daniel Murphy (3/8) Apr 09 2015 Working out what to lower to is easy, working out when to lower it is ha...
- Steven Schveighoffer (4/12) Apr 09 2015 Wouldn't it be whenever AA was previously invoked? I'm surprised there
- Daniel Murphy (5/7) Apr 09 2015 It has to be done before some parts of semantic, but after others. Eg e...
- Steven Schveighoffer (7/14) Apr 09 2015 I think a reasonable path is to lower as early as possible, have
- Walter Bright (4/5) Apr 07 2015 If the array of buckets were enhanced to include the hashes, the key/val...
- Andrei Alexandrescu (3/9) Apr 07 2015 Not sure I understand. What if the collision array needs to grow? That
- Walter Bright (7/17) Apr 07 2015 The collision array is an array of pairs:
- Andrei Alexandrescu (2/21) Apr 07 2015 Ah, I thought the array embeds KeyValue directly. Thx! -- Andrei
- Martin Nowak (6/7) Apr 08 2015 It should if they are relatively small compared to the pointer,
- Dejan Lekic (9/25) Apr 08 2015 Every now and then I think how nice it would be for Phobos to be
- Marco Leise (37/37) Apr 08 2015 What's a faster algorithm? There are some that deal with
The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.
Apr 07 2015
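A minimal sketch of the kind of table Walter describes: one flat array of buckets, collisions resolved by probing nearby slots instead of chasing list nodes. This is written in C++ rather than D purely for illustration; the `IntMap`/`Bucket` names and the resize-at-half-full policy are assumptions of the sketch, not anything from druntime.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Open-addressing hash table with linear (local) probing:
// collisions stay in the same flat array, so a probe walks
// adjacent memory instead of dereferencing linked-list nodes.
class IntMap {
    struct Bucket { uint64_t key; int value; bool used = false; };
    std::vector<Bucket> buckets_;   // size must stay a power of two
    size_t count_ = 0;

public:
    explicit IntMap(size_t n = 16) : buckets_(n) {}

    void insert(uint64_t key, int value) {
        // Keep load factor <= 0.5 so probe sequences stay short.
        if (2 * (count_ + 1) > buckets_.size()) grow();
        size_t i = std::hash<uint64_t>{}(key) & (buckets_.size() - 1);
        while (buckets_[i].used && buckets_[i].key != key)
            i = (i + 1) & (buckets_.size() - 1);   // linear probe
        if (!buckets_[i].used) ++count_;
        buckets_[i] = {key, value, true};
    }

    std::optional<int> find(uint64_t key) const {
        size_t i = std::hash<uint64_t>{}(key) & (buckets_.size() - 1);
        while (buckets_[i].used) {                 // empty slot ends the probe
            if (buckets_[i].key == key) return buckets_[i].value;
            i = (i + 1) & (buckets_.size() - 1);
        }
        return std::nullopt;
    }

private:
    void grow() {
        IntMap bigger(buckets_.size() * 2);
        for (const auto& b : buckets_)
            if (b.used) bigger.insert(b.key, b.value);
        *this = std::move(bigger);
    }
};
```

Note the trade-off Andrei raises later in the thread: growth rehashes entries into a new array, so element addresses are not stable.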
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.I have already been experimenting with this myself actually. My hashmap now uses a bucket array with quadratic probing. I was just following Martin Nowak's advice. My code for it is here. https://github.com/w0rp/dstruct/blob/master/source/dstruct/map.d I have a benchmark program available in my repository for testing it against the druntime version. It doesn't amaze me at the moment: it's slightly faster for integers and slightly slower for strings. I would appreciate any advice on it. I'm sure someone will look at my code and say, "Noo! Don't do that!"
Apr 07 2015
I'll also take this chance to once again plug 'setDefault' as a useful associative array primitive. It's in my library. I don't care what name is used for it, just the semantics. Nowak suggested 'getOrSet', which is a good name.

// A string -> int[] map
HashMap!(string, int[]) map;

// Set an element as usual.
map["a"] = [1, 2, 3];

// Returns [1, 2, 3] by reference, and appends 4 to it.
map.setDefault("a") ~= 4;

// Finds nothing in the map currently, returns (int[]).init by reference,
// now created in the map, and appends to that.
map.setDefault("b") ~= 1;
assert(map["b"] == [1]);

// setDefault with a lazy parameter.
int[] arr = map.setDefault("matey", [1, 2]);
assert(arr == [1, 2] && arr == map["matey"]);
Apr 07 2015
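For comparison, the semantics w0rp is asking for are what C++'s std::map::operator[] already provides: return the mapped value by reference, default-constructing and inserting one first if the key is absent. A sketch (the getOrSet wrapper name is hypothetical):

```cpp
#include <map>
#include <string>
#include <vector>

// setDefault/getOrSet semantics via std::map::operator[]:
// returns the mapped value by reference, inserting a
// default-constructed value first if the key is new.
inline std::vector<int>& getOrSet(std::map<std::string, std::vector<int>>& m,
                                  const std::string& key) {
    return m[key];
}
```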
w0rp:It doesn't amaze me at the moment, as it's slightly faster for integers, and slightly slower for strings at the moment.One problem with D strings is that they don't cache their hash value. Bye, bearophile
Apr 07 2015
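A sketch of the hash caching bearophile is alluding to, in C++ for illustration: a hypothetical key wrapper computes the string's hash once at construction, so repeated lookups compare a stored integer before ever touching the character data.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical key wrapper that caches its hash at construction.
// Lookups then compare a stored size_t first and only compare the
// bytes of the string when the cached hashes already match.
struct CachedKey {
    std::string str;
    size_t hash;

    explicit CachedKey(std::string s)
        : str(std::move(s)), hash(std::hash<std::string>{}(str)) {}

    bool operator==(const CachedKey& o) const {
        // Cheap integer comparison first; bytes only on a hash match.
        return hash == o.hash && str == o.str;
    }
};

// Hash support for use as an unordered container key: return the cache.
namespace std {
template <> struct hash<CachedKey> {
    size_t operator()(const CachedKey& k) const { return k.hash; }
};
}
```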
On Tuesday, 7 April 2015 at 18:28:22 UTC, bearophile wrote:w0rp:I probably need to put that back in for some types. I tried taking the hash values away. I know that for integers where the size is <= size_t.sizeof, then I can keep the hash codes out of the array, as the key is always the same as the hash.It doesn't amaze me at the moment, as it's slightly faster for integers, and slightly slower for strings at the moment.One problem with D strings is that they don't cache their hash value. Bye, bearophile
Apr 07 2015
On 04/07/2015 07:24 PM, Walter Bright wrote:https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.Quite to the contrary, it's very cache friendly. The triangular numbers are 1, 3, 6, 10, 15, 21. With 8 bytes per entry as in stringtable you need 2 collisions on the same "bucket" to land 48 bytes away, which might still be the same cache line. The problem with linear probing is primary clustering, especially when your hash isn't too good, which can lead to long streaks of neighbouring buckets that are filled. Any insert that hashes into such a lump (chances are increasing the bigger it becomes) will make it grow even further. For higher load factors such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385
Apr 07 2015
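The triangular-number probe sequence Martin describes (offsets 1, 3, 6, 10, ...) has a useful property that also answers w0rp's question below: on a power-of-two table it visits every bucket exactly once, so this form of quadratic probing only fails when the table is completely full. A small check in C++ (function name hypothetical):

```cpp
#include <cstddef>
#include <set>

// Walk a power-of-two table with triangular-number probing:
// on the k-th step, advance by k+1, so the offsets from the start
// bucket are the triangular numbers 1, 3, 6, 10, ...
// Returns how many distinct buckets the full sequence touches.
inline size_t distinctBucketsVisited(size_t tableSize, size_t start) {
    std::set<size_t> seen;
    size_t idx = start & (tableSize - 1);     // tableSize must be a power of two
    for (size_t step = 0; step < tableSize; ++step) {
        seen.insert(idx);
        idx = (idx + step + 1) & (tableSize - 1);
    }
    return seen.size();
}
```

If the return value equals tableSize, the probe sequence is a full permutation of the buckets, i.e. an insert can only fail when no bucket is free at all.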
On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:For higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
Apr 07 2015
On 4/7/15 3:07 PM, w0rp wrote:On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote:BTW, fun fact about D AA's, the load factor is 4. As in nodes = 4xbuckets will be added before rehashing. -SteveFor higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?
Apr 07 2015
On Tuesday, 7 April 2015 at 19:07:01 UTC, w0rp wrote:On Tuesday, 7 April 2015 at 18:35:27 UTC, Martin Nowak wrote: One thing I was wondering about, which you might know more about, is that I had to set my load factor to be half the size of the array, as quadratic probing seems to fail when more than half the buckets are filled. Is that correct, or did I make a mistake?You made a mistake somewhere, the sweet spot should be in the range of 0.6-0.8. Quadratic probing with triangular numbers is guaranteed to fail only when your buckets are completely full.
Apr 08 2015
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithmI did a quick-scan of the source and didn't see any Bloom filter there. I wonder if it would be best to have the Bloom filters completely external or if they should be included in the associative array? E.g. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted in where high speed is required.
Apr 07 2015
On Tue, 07 Apr 2015 19:09:15 +0000, "Jens Bauer" <doctor who.no> wrote:On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:More precisely it adds a memory and time overhead, but reduces the cost for most lookups of non-existent keys. When you just want an associative array for known-to-exist keys it would be a bad idea to use a Bloom filter. Opt-in is not possible with the current AA syntax as far as I can see. Unless you mean some wrapper struct as part of Phobos/druntime. -- MarcoThe current D associative array algorithmI did a quick-scan of the source and didn't see any Bloom filter there. I wonder if it would be best to have the Bloom filters completely external or if they should be included in the associative array? E.g. the Bloom filter does require extra memory, which is not always desirable. On the other hand, a Bloom filter could be opted in where high speed is required.
Apr 08 2015
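A minimal sketch of the opt-in pre-filter Jens and Marco are discussing, in C++ for illustration: a small bit array is consulted before the real table lookup, so most lookups of non-existent keys are rejected without touching the table. The class name, the fixed 1024-bit size, and the two-hash scheme are all assumptions of the sketch.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <functional>

// Tiny Bloom filter with k = 2 hash functions over an m-bit array.
// maybeContains() == false guarantees the key was never added;
// true only means "possibly present" (false positives happen),
// so the real table must still be consulted on a hit.
class BloomFilter {
    static constexpr size_t kBits = 1024;
    std::bitset<kBits> bits_;

    static size_t h1(uint64_t key) {
        return std::hash<uint64_t>{}(key) % kBits;
    }
    static size_t h2(uint64_t key) {
        // Second, independent-ish hash: mix with a 64-bit odd constant.
        return std::hash<uint64_t>{}(key * 0x9e3779b97f4a7c15ULL) % kBits;
    }

public:
    void add(uint64_t key) {
        bits_.set(h1(key));
        bits_.set(h2(key));
    }

    bool maybeContains(uint64_t key) const {
        return bits_.test(h1(key)) && bits_.test(h2(key));
    }
};
```

This matches Marco's cost analysis: the filter adds memory and per-insert time, and only pays off when misses are common.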
Food for thought : http://codecapsule.com/2013/11/11/robin-hood-hashing/ http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf Also it is probably worthwhile to adopt various strategy depending on element types characteristics.
Apr 07 2015
On Tuesday, 7 April 2015 at 19:09:21 UTC, deadalnix wrote:Food for thought : http://codecapsule.com/2013/11/11/robin-hood-hashing/ http://people.csail.mit.edu/shanir/publications/disc2008_submission_98.pdf Also it is probably worthwhile to adopt various strategy depending on element types characteristics.More food, D implementation and benchmark results (default AAs vs. Vibe.d linear probing vs. Robin Hood) : http://www.infognition.com/blog/2014/on_robin_hood_hashing.html
Apr 08 2015
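The core trick of the Robin Hood scheme from deadalnix's links, sketched in C++ (all names hypothetical, no growth logic, table must stay under-full): during linear-probe insertion, track each entry's distance from its ideal bucket, and whenever the resident entry is closer to home than the one being inserted, the "rich" resident yields its slot and the displaced entry continues probing. This caps the variance of probe lengths.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Robin Hood hashing sketch: steal slots from entries that are
// closer to their home bucket than the entry we are carrying.
class RobinHoodMap {
    struct Slot { uint64_t key; int value; bool used = false; };
    std::vector<Slot> slots_;   // size must be a power of two; no resizing here

    size_t home(uint64_t key) const {
        return std::hash<uint64_t>{}(key) & (slots_.size() - 1);
    }
    // Probe distance of `key` if it were sitting at index idx.
    size_t dist(size_t idx, uint64_t key) const {
        return (idx + slots_.size() - home(key)) & (slots_.size() - 1);
    }

public:
    explicit RobinHoodMap(size_t n = 64) : slots_(n) {}

    void insert(uint64_t key, int value) {
        Slot carry{key, value, true};
        size_t i = home(key);
        for (;;) {
            if (!slots_[i].used) { slots_[i] = carry; return; }
            if (slots_[i].key == carry.key) { slots_[i].value = carry.value; return; }
            // "Rob the rich": a resident nearer its home yields the slot.
            if (dist(i, slots_[i].key) < dist(i, carry.key))
                std::swap(slots_[i], carry);
            i = (i + 1) & (slots_.size() - 1);
        }
    }

    std::optional<int> find(uint64_t key) const {
        size_t i = home(key);
        while (slots_[i].used) {
            if (slots_[i].key == key) return slots_[i].value;
            i = (i + 1) & (slots_.size() - 1);
        }
        return std::nullopt;
    }
};
```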
On 4/7/15 10:24 AM, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
Andrei Alexandrescu:The efficiency behaviour of modern CPUs and the memory pyramid is rather non-linear and unintuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases... Bye, bearophileA possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
On 4/7/15 3:18 PM, bearophile wrote:Andrei Alexandrescu:You must have replied to the wrong post. -- AndreiThe efficiency behavour of modern CPUs+memory pyramid are rather not linear and not intuitive. As Walter has said at the the start of this thread, arrays come out as more efficient in a large number of cases...A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 07 2015
On Tue, 07 Apr 2015 15:26:45 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 4/7/15 3:18 PM, bearophile wrote:No, that's really what Walter said: "A possible cache-friendly replacement would be an array [...]." Especially when iterating arrays, multiple elements might be in the same cache line and the CPU's prefetching will kick in and load the next array elements while the current ones are processed. Unexpected memory fetches from RAM can easily be >100 times slower. https://software.intel.com/sites/products/collateral/hpc/vtune/performance_analysis_guide.pdf (pg. 22) https://software.intel.com/en-us/forums/topic/287236 "Core i7 Xeon 5500 Series Data Source Latency (approximate):

L1 CACHE hit, ~4 cycles
L2 CACHE hit, ~10 cycles
L3 CACHE hit, line unshared ~40 cycles
L3 CACHE hit, shared line in another core ~65 cycles
L3 CACHE hit, modified in another core ~75 cycles
remote L3 CACHE ~100-300 cycles
Local Dram ~60 ns
Remote Dram ~100 ns
[...]
a) Structure data such that computationally related information resides within the same cache line.
b) Write your algorithms such that the higher frequency of access occurs on (near) adjacent memory. This reduces the TLB pressure.
c) Reduce the number of writes to RAM through use of temporary variables that can be registerized (optimized into a register).
d) Structure data such that you can manipulate it using SSE (when possible).
e) For parallel programming, coordinate activities amongst threads sharing cache levels (HT siblings for L1 and L2, same die or half die for L3, same NUMA node for multiple nodes).
-- Jim Dempsey" -- MarcoAndrei Alexandrescu:You must have replied to the wrong post. -- AndreiThe efficiency behaviour of modern CPUs and the memory pyramid is rather non-linear and unintuitive. As Walter said at the start of this thread, arrays come out as more efficient in a large number of cases...A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions.Arrays would need to move data. Current hashtables rely on values staying put. -- Andrei
Apr 08 2015
On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say the current AAs are bankrupt and need a revamping anyway. We can make the in operator return a wrapper that casts to bool (safely) and gets/updates the data (@system-ly).
Apr 07 2015
On 4/7/15 3:22 PM, deadalnix wrote:On Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Doesn't strike me as a fair statement.Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say current AA are bankrupt and need a revamping anyway.We can make the in operator return a wrapper that cast to bool (safely) and get/update the data (systemely).That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D. Andrei
Apr 07 2015
On 4/7/15 6:27 PM, Andrei Alexandrescu wrote:On 4/7/15 3:22 PM, deadalnix wrote:The correct way forward is to implement the AA in the library in the safest way possible. Then, make the library implementation customizable for specialized situations. But it needs to be fully in the library, not partly in the compiler as it is now. Until that point, all these discussions on AA updates are useless. -SteveOn Tuesday, 7 April 2015 at 22:14:46 UTC, Andrei Alexandrescu wrote:Doesn't strike me as a fair statement.Arrays would need to move data. Current hashtables rely on values staying put. -- AndreiI think it is fair to say current AA are bankrupt and need a revamping anyway.We can make the in operator return a wrapper that cast to bool (safely) and get/update the data (systemely).That's not enough. People may take the address of elements in the hashtable and assume the data stays put. This is currently safe and legal in D.
Apr 07 2015
On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On 4/7/15 7:01 PM, Walter Bright wrote:On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about. -SteveUntil that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On Tue, Apr 07, 2015 at 07:47:37PM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 4/7/15 7:01 PM, Walter Bright wrote:[...] Didn't somebody check in a whole bunch of PRs already in this direction? IIRC, there are only 1 or 2 more pieces left before AA's can be fully implemented as a library type. T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANGOn 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
On 4/7/2015 4:47 PM, Steven Schveighoffer wrote:On 4/7/15 7:01 PM, Walter Bright wrote:Conform to the existing interface and it'll work.On 4/7/2015 3:33 PM, Steven Schveighoffer wrote:Sure but there are so many gotchas to this thing. Multiple very smart, very competent developers set out to change the AA and failed. I would rather effort be spent focusing on getting it into a full-fledged library type with simple lowering for syntax, and then we can play with everything we want to do. At least at that point, there aren't special considerations to worry about.Until that point, all these discussions on AA updates are useless.The data structure has changed before (used to be a binary tree) without disruption and can change again.
Apr 07 2015
"Steven Schveighoffer" wrote in message news:mg1m0g$91e$1 digitalmars.com...The correct way forward is to implement the AA in the library in the safest way possible. Then, make the library implementation customizable for specialized situations. But it needs to be fully in the library, not partly in the compiler as it is now. Until that point, all these discussions on AA updates are useless.I mostly agree, except I think we only need _an_ AA in the library, not necessarily _the_ AA.
Apr 07 2015
On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.I really don't like that all-or-nothing attitude, it condemns an important step just because something better might be possible. It's also very comfortable, because this way nobody will ever have to do anything. Improving the AA implementation has a big immediate effect, and will also help anyone writing a library implementation, because any candidate up to this date was still based on that crappy bucket list implementation. The biggest problems in writing an AA library implementation, sorted by difficulty, are:
- deprecation of all magic AA behaviors (attributes, as[i][j]++)
- getting the lowering right
- efficient construction and value insertion (rvalue moving)
Apr 08 2015
On Wednesday, 8 April 2015 at 09:12:00 UTC, Martin Nowak wrote:The biggest problems in writing an AA library implementation sorted by difficulty are:There is a clear acceptance list for a good AA here. https://github.com/D-Programming-Language/druntime/pull/934#issuecomment-65916801
Apr 08 2015
On 4/8/15 5:11 AM, Martin Nowak wrote:On Tuesday, 7 April 2015 at 22:33:52 UTC, Steven Schveighoffer wrote:Until that point, all these discussions on AA updates are useless.I really don't like that all-or-nothing attitude, it condemns an important step just because something better might be possible. It's also very comfortable, because this way nobody will ever have to do anything.I'm not suggesting nobody ever do anything out of comfort. What I'm suggesting is that it's not worth improving something that will be discarded anyway. Let's fix the problem hampering people from understanding how the AA works so they can work on improving the AA. Namely, the discarding of all compile-time info during implementation.Improving the AA implementation has a big immediate effect, and will also help anyone writing a library implementation, because any candidate up to this date was still based on that crappy bucket list implementation.The "crappy bucket list" implementation is also the simplest. When the focus should be on fixing the AA interface with the compiler and library, I think having a simple, understandable implementation is fine. Let me remind you of your attitude when I brought up the fact that the AA's load factor was 4 (just fixing this one number could improve performance, even for the bucket list implementation):We'll throw the current implementation away and implement an open-addressing AA once we get to it. So don't worry about the load factor right now. In other words, don't bother fixing the constant "4" in the code, it's going to be thrown away anyway.The biggest problems in writing an AA library implementation sorted by difficulty are: - deprecation of all magic AA behaviors (attributes, as[i][j]++) - get the lowering rightI really am surprised that these would be difficult at all. The lowering is pretty easy:

a) T[U] => AssociativeArray!(U, T)
b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);

Basically, bikeshedding the name "literal" is the difficult part. And deprecating the magic AA behaviors, isn't this just removing a whole lot of code from the compiler? Once the lowering is working, I don't see how removing special cases would be difficult. The part where I see we may have issues is CTFE support. But I would defer to those who have already tried this exercise, perhaps they can share what broke their solutions. -Steve
Apr 09 2015
"Steven Schveighoffer" wrote in message news:mg5pf0$1g51$1 digitalmars.com...I really am surprised that these would be difficult at all. The lowering is pretty easy: a) T[U] => AssociativeArray!(U,T) b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);Working out what to lower to is easy, working out when to lower it is hard.
Apr 09 2015
On 4/9/15 8:07 AM, Daniel Murphy wrote:"Steven Schveighoffer" wrote in message news:mg5pf0$1g51$1 digitalmars.com...Wouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here. -SteveI really am surprised that these would be difficult at all. The lowering is pretty easy: a) T[U] => AssociativeArray!(U,T) b) [a:b, c:d] => AssociativeArray!(typeof(true ? a : c), typeof(true ? b : d)).literal(a, b, c, d);Working out what to lower to is easy, working out when to lower it is hard.
Apr 09 2015
"Steven Schveighoffer" wrote in message news:mg61gn$1o09$1 digitalmars.com...Wouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here.It has to be done before some parts of semantic analysis, but after others. E.g. error messages and template matching still need to be done on the AA type, but other parts need the actual template type.
Apr 09 2015
On 4/9/15 11:25 AM, Daniel Murphy wrote:"Steven Schveighoffer" wrote in message news:mg61gn$1o09$1 digitalmars.com...I think a reasonable path is to lower as early as possible, have everything show up as the template type in error messages, and then work on updating the messages over time to reflect the builtin syntax. In any case, I didn't think about template matching, those are handled specially for AA. Perhaps we can make that more generic. -SteveWouldn't it be whenever AA was previously invoked? I'm surprised there are any unknowns here.It has to be done before some parts of semantic, but after others. Eg error messages and template matching still needs to be done on the AA type, but other parts need the actual template type.
Apr 09 2015
On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Arrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On 4/7/15 3:56 PM, Walter Bright wrote:On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:On 4/7/15 3:56 PM, Walter Bright wrote:The collision array is an array of pairs: KeyValue *kv; Hash_t hash; growing the array moves those around, but the locations of the array elements are never exposed to the users. The addresses of kv are exposed, but those don't move.On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
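Walter's layout, sketched in C++ for illustration: the probed bucket array stores small (hash, pointer) pairs, so growth only reshuffles those pairs while the KeyValue allocations, whose addresses users may have taken, never move. The class and member names are hypothetical, and duplicate-key handling is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

struct KeyValue { uint64_t key; int value; };

// The probed array holds only {hash, pointer} pairs. Rehashing moves
// the pairs; the heap-allocated KeyValue nodes stay put, so a pointer
// handed out by insert() remains valid across table growth.
class StablePtrMap {
    struct Bucket { uint64_t hash = 0; KeyValue* kv = nullptr; };
    std::vector<Bucket> buckets_;                    // power-of-two size
    std::vector<std::unique_ptr<KeyValue>> nodes_;   // owns the stable nodes

    static uint64_t hashOf(uint64_t key) {
        return key * 0x9e3779b97f4a7c15ULL;          // cheap mixing hash
    }

public:
    explicit StablePtrMap(size_t n = 8) : buckets_(n) {}

    int* insert(uint64_t key, int value) {
        if (2 * (nodes_.size() + 1) > buckets_.size()) grow();
        nodes_.push_back(std::make_unique<KeyValue>(KeyValue{key, value}));
        KeyValue* kv = nodes_.back().get();
        place(hashOf(key), kv);
        return &kv->value;    // stays valid even after the table grows
    }

private:
    void place(uint64_t h, KeyValue* kv) {
        size_t i = h & (buckets_.size() - 1);
        while (buckets_[i].kv) i = (i + 1) & (buckets_.size() - 1);
        buckets_[i] = {h, kv};
    }
    void grow() {
        std::vector<Bucket> old = std::move(buckets_);
        buckets_.assign(old.size() * 2, {});
        for (const auto& b : old)
            if (b.kv) place(b.hash, b.kv);    // only the pairs move, not the nodes
    }
};
```

Storing the hash next to the pointer also gives the cheap pre-filter Walter mentions: the KeyValue node is only dereferenced when the stored hash matches.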
On 4/7/15 4:18 PM, Walter Bright wrote:On 4/7/2015 3:59 PM, Andrei Alexandrescu wrote:Ah, I thought the array embeds KeyValue directly. Thx! -- AndreiOn 4/7/15 3:56 PM, Walter Bright wrote:The collision array is an array of pairs: KeyValue *kv; Hash_t hash; growing the array moves those around, but the locations of the array elements are never exposed to the users. The addresses of kv are exposed, but those don't move.On 4/7/2015 3:14 PM, Andrei Alexandrescu wrote:Not sure I understand. What if the collision array needs to grow? That might move it. -- AndreiArrays would need to move data. Current hashtables rely on values staying put.If the array of buckets were enhanced to include the hashes, the key/value objects could stay untouched when the AA is grown. The key/value objects would only be dereferenced if the hashes matched.
Apr 07 2015
On Tuesday, 7 April 2015 at 23:38:21 UTC, Andrei Alexandrescu wrote:Ah, I thought the array embeds KeyValue directly. Thx! -- AndreiIt should if they are relatively small compared to the pointer, or at least store the key inline. As we recently discussed about the freeing bug in the AA, it should be fine to store values inline, as long as the bucket array is never deleted.
Apr 08 2015
On Tuesday, 7 April 2015 at 17:25:00 UTC, Walter Bright wrote:The current D associative array algorithm https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d uses an array of buckets with a linked list attached to the buckets to resolve collisions. Linked lists are perhaps the worst (i.e. slowest) data structure possible on modern CPUs with memory caches. A possible cache-friendly replacement would be an array of buckets with local probing to resolve collisions. D programs are often heavily dependent on associative arrays, so this could be a big win for us. And it's a fun project, too. Any takers? Interestingly, https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c uses quadratic probing instead of linked lists, but this is not cache friendly, either.Every now and then I think how nice it would be for Phobos to be just a bunch of "module interfaces" along with default implementations of them. Unfortunately, or fortunately, D does not have module interfaces like, say, Modula-3. It would make things much easier for someone who wants to work on a different implementation of a certain module (or even whole package) implementation. I wonder what other people think about this.
Apr 08 2015
What's a faster algorithm? There are some that deal with collisions differently but might be slower with good hash distributions. You can improve iteration speed by storing only an index into a dense array of values, but it eats more memory. As far as I can see from my own experiments with HAMT (http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf) the time it takes to compute hashes and do a binary comparison for e.g. strings is so huge that the actual AA algorithm doesn't really matter. If anyone managed to make string AAs faster by swapping out the AA implementation I'd be surprised. Instead I focused on using malloc over the GC and borrowed w0rp's idea of not hashing integral data types such as int, char or bool. They are now their own hash, which is dangerous in terms of hash distribution, but acts as a real turbo boost over the built-in AA (+700% faster for lookups in my tests with a 100% hit rate). May be important: I found that Tuple generates a hash by addition of the hashes of its fields. This means that a coordinate tuple like Tuple!(int, int) will collide with mirror coordinates. E.g.: tuple(2,5) has the same hash as tuple(5,2) and so on. Furthermore the hash functions for each field are virtual function calls through a TypeInfo object that contains further branching to determine whether the type has an overridden toHash or the default implementation (SuperFastHash over the binary representation) is used. I found that:

    enum hasher = &typeid(T).getHash;
    hasher(&t);

already improved hashing speed for gdc a bit. Without looking at the generated assembly, my wild guess is that de-virtualization failed. TypeInfo objects should be known at compile time and getHash and other calls to it inlined. Even the dynamic branch to check for an overridden toHash is known at compile time. -- Marco
Apr 08 2015
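Marco's tuple-hash observation is easy to reproduce: with identity hashes for the int fields (as he and w0rp use), summing the field hashes makes mirrored tuples collide, since 2 + 5 = 5 + 2. Mixing the accumulator between fields breaks the symmetry. A sketch in C++, using an FNV-1a-style combiner as the (assumed) order-sensitive alternative:

```cpp
#include <cstdint>

// With identity hashes for the fields, summing them makes mirrored
// tuples collide: (2,5) and (5,2) both hash to 7.
inline uint64_t sumHash(uint64_t a, uint64_t b) { return a + b; }

// Mixing the accumulator between fields (xor then multiply by a prime,
// FNV-1a style) makes the combined hash depend on field order.
inline uint64_t mixHash(uint64_t a, uint64_t b) {
    uint64_t h = 14695981039346656037ULL;   // FNV-1a offset basis
    h = (h ^ a) * 1099511628211ULL;         // FNV-1a 64-bit prime
    h = (h ^ b) * 1099511628211ULL;
    return h;
}
```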