
digitalmars.D - AA Performance in Benchmarks

reply "Shammah Chancellor" <s s.com> writes:
So, I was tooling around with one of the benchmarks I saw listed 
on twitter:

https://github.com/kostya/benchmarks/tree/master/brainfuck

This D benchmark spends most of its time on AA lookups. 
Discussions about the implementation aside, I am curious as to 
why the D AA is so slow even after calling AA.rehash.

I took a look at the Nim tables implementation, and it looks like 
they use power-of-two table sizes and use the property of primitive 
roots modulo n to get the next slot in the case of a collision:

```
proc nextTry(h, maxHash: THash): THash {.inline.} =
   result = ((5 * h) + 1) and maxHash
```
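For concreteness, here is a minimal D sketch (not the druntime AA, just 
an illustration of the same recurrence) showing that with a power-of-two 
table size, h -> (5*h + 1) & mask is a full-period sequence, so a probe 
starting anywhere visits every slot before repeating:

```
import std.stdio;

// Same probe step as the Nim snippet, for a power-of-two table.
size_t nextTry(size_t h, size_t mask) @safe pure nothrow @nogc
{
    return (5 * h + 1) & mask;
}

void main()
{
    enum size_t tableSize = 16;   // must be a power of two
    enum size_t mask = tableSize - 1;

    bool[tableSize] seen;
    size_t slot = 7;              // arbitrary starting hash
    foreach (i; 0 .. tableSize)
    {
        seen[slot] = true;
        slot = nextTry(slot, mask);
    }

    // Every slot was visited exactly once before the sequence wrapped.
    foreach (v; seen) assert(v);
    writeln("probe sequence covers all ", tableSize, " slots");
}
```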

So, my questions:

1) What are the general purpose performance implications of 
something like this?
2) I saw a thread awhile ago where someone had posted an updated 
implementation of the AA with benchmarks and showed that it was 
substantially faster in all cases.  I can't find the thread again 
-- what happened with this work?

-Shammah
Apr 23 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/23/15 10:40 AM, Shammah Chancellor wrote:

 2) I saw a thread awhile ago where someone had posted an updated
 implementation of the AA with benchmarks and showed that it was
 substantially faster in all cases.  I can't find the thread again --
 what happened with this work?
There is a PR being discussed right now. Please take a look and chime 
in: https://github.com/D-Programming-Language/druntime/pull/1229

It would be interesting to know how the new AA performs in this test.

-Steve
Apr 23 2015
next sibling parent reply "Laeeth Isharc" <nospamlaeeth nospam.laeeth.com> writes:
On Thursday, 23 April 2015 at 15:29:37 UTC, Steven Schveighoffer 
wrote:
 On 4/23/15 10:40 AM, Shammah Chancellor wrote:

 2) I saw a thread awhile ago where someone had posted an 
 updated
 implementation of the AA with benchmarks and showed that it was
 substantially faster in all cases.  I can't find the thread 
 again --
 what happened with this work?
There is a PR being discussed right now. Please take a look and chime 
in: https://github.com/D-Programming-Language/druntime/pull/1229

It would be interesting to know how the new AA performs in this test.

-Steve
At a slight tangent: is there a way to reserve capacity for an 
associative array? The obvious way does not seem to work.

I noticed also in a talk at nativecode (possibly by Herb Sutter) that 
C++ gives you the ability to tune almost any parameter of the 
associative array implementation in the standard library. (I'm not a 
C++ guy.) Is this entirely spurious, or would it be helpful for D?

Laeeth.
Apr 25 2015
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Laeeth Isharc"  wrote in message 
news:hnihyhgtmdzhszkpnftl forum.dlang.org...

 At a slight tangent: is there a way to reserve capacity for an associative 
 array?  The obvious way does not seem to work.
No.
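A sketch of a workaround that is sometimes used: pre-insert dummy keys to 
force the bucket array to grow, then remove them again. This leans on the 
assumption that the built-in AA does not shrink its buckets on removal, 
which is an implementation detail rather than a documented guarantee:

```
import std.conv : to;
import std.stdio : writeln;

void main()
{
    string[string] table;

    // Grow the bucket array up front by inserting placeholder keys,
    // then remove them.  Assumes removal does not shrink the table.
    enum size_t n = 10_000;
    foreach (i; 0 .. n)
        table["__reserve_" ~ i.to!string] = "";
    foreach (i; 0 .. n)
        table.remove("__reserve_" ~ i.to!string);

    writeln("pre-grown table ready; subsequent inserts should not rehash");
}
```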
 I noticed also in a talk at nativecode (possibly by Herb Sutter) that C++ 
 gives you the ability to tune almost any parameter of the associative 
 array implementation in the standard library.  (I'm not a C++ guy).  Is 
 this entirely spurious, or would it be helpful for D?
Yes, if we had an AA in phobos.
Apr 26 2015
Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/26/2015 12:58 PM, Daniel Murphy wrote:
 
 Yes, if we had an AA in phobos.
Yes, I think phobos should get a few more optimized containers:

  SparseSet
  DenseSet
  SparseHash
  DenseHash

They should be configurable w.r.t. load factor, equality comparison, 
and allocation policy (when std.allocator is available).

http://sparsehash.googlecode.com/svn/trunk/doc/index.html

As the built-in druntime AA will likely always remain a general-purpose 
hash table, we can't do all the possible optimizations.
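For illustration, a minimal sketch of what a dense (open-addressing) set 
with a tunable load factor might look like in D. The name DenseHashSet 
and its interface are hypothetical, not an existing Phobos or sparsehash 
API; it reuses the (5*h + 1) & mask probe step from earlier in the thread:

```
/// Hypothetical insert-only open-addressing set with a configurable
/// maximum load factor (maxLoadFactor should stay below 1.0).
struct DenseHashSet(K, double maxLoadFactor = 0.75)
{
    private K[] slots;      // power-of-two sized slot array
    private bool[] used;    // occupancy flags (no tombstones in this sketch)
    private size_t count;

    void insert(K key)
    {
        if (slots.length == 0 || (count + 1) > slots.length * maxLoadFactor)
            grow();
        immutable mask = slots.length - 1;
        size_t i = hashOf(key) & mask;
        while (used[i])
        {
            if (slots[i] == key) return;   // already present
            i = (5 * i + 1) & mask;        // same probe recurrence as above
        }
        slots[i] = key;
        used[i] = true;
        ++count;
    }

    bool contains(K key) const
    {
        if (slots.length == 0) return false;
        immutable mask = slots.length - 1;
        size_t i = hashOf(key) & mask;
        while (used[i])
        {
            if (slots[i] == key) return true;
            i = (5 * i + 1) & mask;
        }
        return false;
    }

    private void grow()
    {
        auto oldSlots = slots;
        auto oldUsed = used;
        immutable newLen = slots.length ? slots.length * 2 : 8;
        slots = new K[newLen];
        used = new bool[newLen];
        count = 0;
        foreach (i, ok; oldUsed)
            if (ok) insert(oldSlots[i]);
    }
}

unittest
{
    DenseHashSet!(int, 0.5) s;   // tighter load factor: more space, fewer collisions
    foreach (i; 0 .. 1000) s.insert(i * 7);
    assert(s.contains(7) && !s.contains(8));
}
```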
Apr 26 2015
Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/23/2015 05:29 PM, Steven Schveighoffer wrote:
 
 https://github.com/D-Programming-Language/druntime/pull/1229
 
 It would be interesting to know how the new AA performs in this test.
Went from 11s down to 9s, ~20% improvement.
Apr 25 2015
prev sibling parent reply "Chris" <wendlec tcd.ie> writes:
On Thursday, 23 April 2015 at 14:40:58 UTC, Shammah Chancellor 
wrote:
 So, I was tooling around with one of the benchmarks I saw 
 listed on twitter:

 https://github.com/kostya/benchmarks/tree/master/brainfuck

 This D benchmark spends most of its time on AA lookups.
 Discussions about the implementation aside, I am curious as to 
 why the D AA is so slow even after calling AA.rehash.

 I took a look at the Nim tables implementation, and it looks 
 that they use power-of-two vectors and use the property of 
 primitive roots modulo n to get the next value in the case of a 
 collision:

 ```
 proc nextTry(h, maxHash: THash): THash {.inline.} =
   result = ((5 * h) + 1) and maxHash
 ```

 So, my questions:

 1) What are the general purpose performance implications of 
 something like this?
 2) I saw a thread awhile ago where someone had posted an 
 updated implementation of the AA with benchmarks and showed 
 that it was substantially faster in all cases.  I can't find 
 the thread again -- what happened with this work?

 -Shammah
Interesting. I noticed while doing some benchmarking that looking up data that is stored in an AA is slower than generating the data on the fly. I was really surprised.
Apr 24 2015
parent reply "Chris" <wendlec tcd.ie> writes:
On Friday, 24 April 2015 at 09:18:48 UTC, Chris wrote:
 On Thursday, 23 April 2015 at 14:40:58 UTC, Shammah Chancellor 
 wrote:
 So, I was tooling around with one of the benchmarks I saw 
 listed on twitter:

 https://github.com/kostya/benchmarks/tree/master/brainfuck

 This D benchmark spends most of its time on AA lookups.
 Discussions about the implementation aside, I am curious as to 
 why the D AA is so slow even after calling AA.rehash.

 I took a look at the Nim tables implementation, and it looks 
 that they use power-of-two vectors and use the property of 
 primitive roots modulo n to get the next value in the case of 
 a collision:

 ```
 proc nextTry(h, maxHash: THash): THash {.inline.} =
  result = ((5 * h) + 1) and maxHash
 ```

 So, my questions:

 1) What are the general purpose performance implications of 
 something like this?
 2) I saw a thread awhile ago where someone had posted an 
 updated implementation of the AA with benchmarks and showed 
 that it was substantially faster in all cases.  I can't find 
 the thread again -- what happened with this work?

 -Shammah
Interesting. I noticed while doing some benchmarking that looking up data that is stored in an AA is slower than generating the data on the fly. I was really surprised.
AA lookup is (more often than not) slightly slower than (or at least 
as fast as) generating the data on demand:

import std.datetime : benchmark, Duration;
import std.string : format;
import std.conv : to;
import std.stdio : writeln;

enum {
  string[string] words = [
    "Data1":"Blah",
    "Data2":"Blub",
  ],
  string[] letters = ["B", "l", "a", "h"]
}

void main() {
  words.rehash();
  auto result = benchmark!(lookup, make)(100_000);
  writeln(to!Duration(result[0]));
  writeln(to!Duration(result[1]));
}

string lookup() {
  auto blah = words["Data1"];
  return blah;
}

string make() {
  string res;
  foreach (c; letters) res ~= c;
  return res;
}

dmd v2.067.0 -release -O -boundscheck=off -inline
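One caveat, as a guess at what may skew this comparison: `words` is 
declared with `enum`, so it is a manifest constant and the AA literal is 
liable to be rebuilt at each use; `words.rehash()` then only touches a 
temporary. A sketch of the same benchmark with a runtime-initialized, 
module-level AA, which should measure the actual lookup path:

```
import std.datetime : benchmark, Duration;
import std.conv : to;
import std.stdio : writeln;

// Module-level AA, built once in a module constructor instead of via
// `enum`, so lookups hit a persistent table rather than a rebuilt literal.
string[string] words;
immutable string[] letters = ["B", "l", "a", "h"];

static this()
{
    words = ["Data1": "Blah", "Data2": "Blub"];
    words.rehash();
}

string lookup() { return words["Data1"]; }

string make()
{
    string res;
    foreach (c; letters) res ~= c;
    return res;
}

void main()
{
    auto result = benchmark!(lookup, make)(100_000);
    writeln("lookup: ", to!Duration(result[0]));
    writeln("make:   ", to!Duration(result[1]));
}
```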
Apr 24 2015
parent "Chris" <wendlec tcd.ie> writes:
On Friday, 24 April 2015 at 11:30:14 UTC, Chris wrote:
 On Friday, 24 April 2015 at 09:18:48 UTC, Chris wrote:
 On Thursday, 23 April 2015 at 14:40:58 UTC, Shammah Chancellor 
 wrote:
 So, I was tooling around with one of the benchmarks I saw 
 listed on twitter:

 https://github.com/kostya/benchmarks/tree/master/brainfuck

 This D benchmark spends most of its time on AA lookups.
 Discussions about the implementation aside, I am curious as 
 to why the D AA is so slow even after calling AA.rehash.

 I took a look at the Nim tables implementation, and it looks 
 that they use power-of-two vectors and use the property of 
 primitive roots modulo n to get the next value in the case of 
 a collision:

 ```
 proc nextTry(h, maxHash: THash): THash {.inline.} =
 result = ((5 * h) + 1) and maxHash
 ```

 So, my questions:

 1) What are the general purpose performance implications of 
 something like this?
 2) I saw a thread awhile ago where someone had posted an 
 updated implementation of the AA with benchmarks and showed 
 that it was substantially faster in all cases.  I can't find 
 the thread again -- what happened with this work?

 -Shammah
Interesting. I noticed while doing some benchmarking that looking up data that is stored in an AA is slower than generating the data on the fly. I was really surprised.
AA lookup is (more often than not) slightly slower than (or at least 
as fast as) generating the data on demand:

import std.datetime : benchmark, Duration;
import std.string : format;
import std.conv : to;
import std.stdio : writeln;

enum {
  string[string] words = [
    "Data1":"Blah",
    "Data2":"Blub",
  ],
  string[] letters = ["B", "l", "a", "h"]
}

void main() {
  words.rehash();
  auto result = benchmark!(lookup, make)(100_000);
  writeln(to!Duration(result[0]));
  writeln(to!Duration(result[1]));
}

string lookup() {
  auto blah = words["Data1"];
  return blah;
}

string make() {
  string res;
  foreach (c; letters) res ~= c;
  return res;
}

dmd v2.067.0 -release -O -boundscheck=off -inline
Correction: I meant to say creating the data is slower here, of course. 
Ah, well, it's Friday!
Apr 24 2015