www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Static introspection of suitable hash function depending on type of

Does anyone have any good reads on the subject of statically 
choosing suitable hash-functions depending on the type (and in 
turn size) of the key?

I wonder because I'm currently experimenting with a hash set 
implementation at

https://github.com/nordlow/phobos-next/blob/40e2973b74d58470a13a5a6ee0ed9c9a42c6dea1/src/hashset.d

and benchmark for different hash-functions for it at

https://github.com/nordlow/phobos-next/blob/b8942dc569921b4dadfddbdcdac3a2bb0834a9e0/src/benchmarkAppend.d

I'm measuring significant differences in speed depending on the 
choice of the hash-function:

Inserted 1000000 integers in 49 ms, 65 μs, and 9 hnsecs, Checked 
1000000 integers in 48 ms, 562 μs, and 2 hnsecs for 
HashSet!(uint, null, typeidHashOf)
Inserted 1000000 integers in 51 ms, 897 μs, and 5 hnsecs, Checked 
1000000 integers in 47 ms, 108 μs, and 9 hnsecs for 
HashSet!(uint, null, hashOf)
Inserted 1000000 integers in 60 ms, 641 μs, and 5 hnsecs, Checked 
1000000 integers in 70 ms, 664 μs, and 2 hnsecs for 
HashSet!(uint, null, MurmurHash3!(128u, 64u))
Inserted 1000000 integers in 34 ms, 450 μs, and 5 hnsecs, Checked 
1000000 integers in 27 ms, 738 μs, and 8 hnsecs for 
HashSet!(uint, null, FNV!(64LU, true))
Inserted 1000000 integers in 97 ms, 400 μs, and 6 hnsecs, Checked 
1000000 integers in 104 ms, 33 μs, and 1 hnsec for HashSet!(uint, 
null, XXHash64)
integers in 39 ms, 304 μs, and 3 hnsecs for bool[uint]

using LDC 1.4.0.

A factor 2 for insert() and factor 4 for contains().

The reason is partly because many high-performance hashes, such 
as XXHash64, have a significant overhead (tens of clock-cycles) 
because of its super-scalar nature but are fast (~1 clock-cycle 
per byte) for large keys.

The test is dumb for now and is only constructed to benchmark the 
hash-function.

According to a comment at

https://stackoverflow.com/questions/46533112/static-introspection-of-suitable-hash-function-depending-on-type-of-hash-key

"the C++ standard library already includes hashes for many basic 
types? (I ask this because if you don't have a special 
distribution I would assume the standard committee has already 
made good choices. My answer would be to use those.)"

Does Phobos have anything similar today or planned?
Oct 02 2017