www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Array toHash()

reply David Held <dmd wyntrmute.com> writes:
I have a class which contains an int[] and some other stuff.  I want to 
use my class as the key for an AA, so I am overriding toHash().  But the 
int[] is the only part which should produce the hash code.  I know that 
int[].toHash() is defined somehow, because I can put int[] directly into 
an AA without writing any hash functions.  But I don't know how to spell 
this explicitly or force the compiler to generate it for me so that I 
can forward to it in my toHash().  For illustration:

class Foo
{
     override
     size_t toHash()  trusted pure const nothrow
     {
         // error: no property 'toHash' for type 'int[]'
         return importantStuff.toHash();
     }

     // override opEquals() too...

     int[] importantStuff;
     bool  notImportant;
     int   ignoreMe;
}

Any way to avoid re-implementing this hash function?

Dave
Nov 26 2014
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 11/26/2014 04:25 PM, David Held wrote:

 class Foo
 {
      override
      size_t toHash()  trusted pure const nothrow
      {
          // error: no property 'toHash' for type 'int[]'
          return importantStuff.toHash();
      }
The getHash() member function of the particular TypeInfo can be used. However, it is not currently pure, so you must comment that out from your toHash: override size_t toHash() trusted /* pure */ const nothrow { return typeid(importantStuff).getHash(&importantStuff); } If a function can safely be casted to pure, you can use the following yet-missing-in-phobos function template: import std.traits; auto assumePure(T)(T t) if (isFunctionPointer!T || isDelegate!T) { enum attrs = functionAttributes!T | FunctionAttribute.pure_; return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) t; } // ... override size_t toHash() trusted pure const nothrow { auto func = assumePure(&(typeid(importantStuff).getHash)); return func(&importantStuff); } Note that now your toHash can be pure. Ali
Nov 26 2014
parent reply David Held <dmd wyntrmute.com> writes:
On 11/26/2014 4:40 PM, Ali Çehreli wrote:
 [...]
      override
      size_t toHash()  trusted pure const nothrow
      {
          auto func = assumePure(&(typeid(importantStuff).getHash));
          return func(&importantStuff);
      }
Very helpful, thanks! Am I right in assuming that there is some runtime cost here, as we are calling through a function pointer? Is the lack of inlining the only cost, or is getting at the typeinfo also costly? Dave
Nov 29 2014
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 11/29/2014 12:45 PM, David Held wrote:

 On 11/26/2014 4:40 PM, Ali Çehreli wrote:
 [...]
      override
      size_t toHash()  trusted pure const nothrow
      {
          auto func = assumePure(&(typeid(importantStuff).getHash));
          return func(&importantStuff);
      }
 Am I right in assuming that there is some runtime
 cost here, as we are calling through a function pointer?
I think so. However, I think the compiler can eliminate dereferencing the function pointer if it can prove that it is not necessary.
 Is the lack of inlining the only cost, or is getting at the
 typeinfo also costly?
typeid() is a runtime function. I think it will be costly every time toHash is called. The function pointer can be initialized once. // I've "deduced" the type from an error message. ;) static const ulong delegate(const(void*)) const pure nothrow trusted func; // This will be executed once per thread: static this() { func = assumePure(&(typeid(Foo.init.importantStuff).getHash)); } override size_t toHash() trusted pure const nothrow { return func(&importantStuff); } Now there is no TypeInfo lookup after thread initialization time. Ali
Nov 29 2014
parent David Held <dmd wyntrmute.com> writes:
On 11/29/2014 3:59 PM, Ali Çehreli wrote:
 [...]
 typeid() is a runtime function. I think it will be costly every time
 toHash is called. The function pointer can be initialized once.

      // I've "deduced" the type from an error message. ;)
      static const ulong delegate(const(void*)) const pure nothrow
  trusted func;

      // This will be executed once per thread:
      static this() {
          func = assumePure(&(typeid(Foo.init.importantStuff).getHash));
      }

      override
      size_t toHash()  trusted pure const nothrow
      {
          return func(&importantStuff);
      }

 Now there is no TypeInfo lookup after thread initialization time.
Nice! This is getting very fancy, but also a bit unfortunate. It would be appropriate if the type of importantStuff were arbitrary, but it's not. It's int[]. In fact, the fact that I can get a compiler-generated hash function for any type (I presume) surprises me. Not entirely, since the compiler can just treat types as binary blobs, and define a hash function that operates as if it were byte[]. Although it is nice that we can get at stuff through reflection, this is not quite the same as the ostensibly static [and inlinable] call to int[].toHash() which logically occurs when int[] is used directly as a key. Also, would it be possible to avoid spelling the delegate type by making func a function static and using an auto return on a named method? Finally, I find it a little surprising that there's no standard module which implements a quality hash function builder (using something like MurmurHash or CityHash). This seems like it would be a fairly common requirement for a lot of users. Are there any non-standard libraries which do this? Dave
Nov 29 2014