www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - What hashing algorithm is used for the D implementation of associative

reply "Gary Willoughby" <dev nomad.so> writes:
What hashing algorithm is used for the D implementation of 
associative arrays? Where in the D source does the AA code live?
Aug 09 2014
next sibling parent "Damian Day" <damianroyday gmail.com> writes:
On Saturday, 9 August 2014 at 09:33:12 UTC, Gary Willoughby wrote:
 What hashing algorithm is used for the D implementation of 
 associative arrays? Where in the D source does the AA code live?
https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d I think it uses the objects generic TypeInfo getHash function - see line 170.
Aug 09 2014
prev sibling next sibling parent reply Mike Wey <mike-wey example.com> writes:
On 08/09/2014 11:33 AM, Gary Willoughby wrote:
 What hashing algorithm is used for the D implementation of associative
 arrays? Where in the D source does the AA code live?
Paul Hsieh's SuperFastHash: http://www.azillionmonkeys.com/qed/hash.html The source is here: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d -- Mike Wey
Aug 09 2014
next sibling parent reply "Gary Willoughby" <dev nomad.so> writes:
On Saturday, 9 August 2014 at 10:28:02 UTC, Mike Wey wrote:
 Paul Hsieh's SuperFastHash:
 http://www.azillionmonkeys.com/qed/hash.html
Where is this implemented?
Aug 09 2014
parent Mike Wey <mike-wey example.com> writes:
On 08/09/2014 01:43 PM, Gary Willoughby wrote:
 On Saturday, 9 August 2014 at 10:28:02 UTC, Mike Wey wrote:
 Paul Hsieh's SuperFastHash:
 http://www.azillionmonkeys.com/qed/hash.html
Where is this implemented?
https://github.com/D-Programming-Language/druntime/blob/master/src/rt/util/hash.d -- Mike Wey
Aug 09 2014
prev sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Saturday, 9 August 2014 at 10:28:02 UTC, Mike Wey wrote:
 On 08/09/2014 11:33 AM, Gary Willoughby wrote:
 What hashing algorithm is used for the D implementation of 
 associative
 arrays? Where in the D source does the AA code live?
Paul Hsieh's SuperFastHash: http://www.azillionmonkeys.com/qed/hash.html The source is here: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d
Isn't SuperFastHash vulnerable to collision attacks?
Aug 09 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Marc Schütz:

 Isn't SuperFastHash vulnerable to collision attacks?
D AAs used to be not vulnerable to collision attacks because they resolved collisions building a red-black tree for each bucket. Later buckets became linked lists for speed, leading to the current sensitivity to collision attacks. I think D is not yet in the stage of its development where it starts to care a lot about attacks. Currently D programs are able to "attack themselves" just fine :-) But as usual patches are (slowly) welcome. Bye, bearophile
Aug 14 2014
next sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Thursday, 14 August 2014 at 13:10:58 UTC, bearophile wrote:
 Marc Schütz:

 Isn't SuperFastHash vulnerable to collision attacks?
D AAs used to be not vulnerable to collision attacks because they resolved collisions building a red-black tree for each bucket. Later buckets became linked lists for speed, leading to the current sensitivity to collision attacks. I think D is not yet in the stage of its development where it starts to care a lot about attacks.
IMO this is a _very_ dangerous stance. These kinds of attacks became well known in 2011, when it turned out that almost all of the commonly used languages and web frameworks were vulnerable: http://events.ccc.de/congress/2011/Fahrplan/events/4680.en.html It would be nice if D behaved correctly before any actual attack becomes known. Besides, AAs are probably already exposed to outside attackers in vibe.d (didn't check though).
 Currently D programs are able to "attack themselves" just fine 
 :-) But as usual patches are (slowly) welcome.
Aug 14 2014
parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Thu, Aug 14, 2014 at 04:32:24PM +0000, via Digitalmars-d-learn wrote:
 On Thursday, 14 August 2014 at 13:10:58 UTC, bearophile wrote:
Marc Schütz:

Isn't SuperFastHash vulnerable to collision attacks?
D AAs used to be not vulnerable to collision attacks because they resolved collisions building a red-black tree for each bucket. Later buckets became linked lists for speed, leading to the current sensitivity to collision attacks. I think D is not yet in the stage of its development where it starts to care a lot about attacks.
IMO this is a _very_ dangerous stance. These kinds of attacks became well known in 2011, when it turned out that almost all of the commonly used languages and web frameworks were vulnerable: http://events.ccc.de/congress/2011/Fahrplan/events/4680.en.html It would be nice if D behaved correctly before any actual attack becomes known. Besides, AAs are probably already exposed to outside attackers in vibe.d (didn't check though).
[...] PR's to fix this issue would be greatly appreciated! T -- Nobody is perfect. I am Nobody. -- pepoluan, GKC forum
Aug 14 2014
prev sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Thursday, 14 August 2014 at 13:10:58 UTC, bearophile wrote:
 D AAs used to be not vulnerable to collision attacks because 
 they resolved collisions building a red-black tree for each 
 bucket. Later buckets became linked lists for speed,
Slight corrections: It was a effectively a randomized BST, it used the hash value + comparison function to place the elements in the tree. E.g. The AA's node comparison function might be: if (hash == other.hash) return value.opCmp(other.value); else if (hash < other.hash) return -1; return 1; The hash function has a significant influence on how balanced the BST will be. Insertion and removal order also have performance influence since rebalancing was only done when growing the AA. It had no performance guarantees. I believe it was removed to reduce memory consumption, see the Mar 19 2010 cluster of commits by Walter Bright to aaA.d. Since GC rounds up allocations to powers of two for small objects, the additional pointer doubles the allocation size per node. A template library based AA implementation should be able to handily outperform built-in AAs and provide guarantees. Furthermore, improved memory management could be a significant win. Fun fact: The AA implementation within DMD still uses the "randomized" BST though the hash functions are very rudimentary.
Aug 14 2014
prev sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
Superfast. Though Murmur has gotten good enough that I'm tempted 
to switch.  At the time, Murmur didn't even have a license so it 
wasn't an option.
Aug 14 2014