www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Caching in druntime TypeInfo classes

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
While investigating:

	 https://issues.dlang.org/show_bug.cgi?id=4244

I found that the druntime function for computing the hash of static
arrays (this also applies to dynamic arrays, btw) is horrendously slow:
about 8-9 times slower than the equivalent operation on a POD struct of
the same size.

The problem is caused by the call to hasCustomToHash() inside
getArrayHash() in object.d, which in turn calls getElement(), which
walks the typeinfo tree to find the TypeInfo for the first non-array /
typedef type info definition, in order to determine if array elements
have a custom toHash method. This walk is done *every single time* the
array is hashed, even though the return value never changes for each
array type.

So I tried to modify getArrayHash() to cache this information in the
TypeInfo, but ran into some roadblocks: since TypeInfo's are supposed to
be const, this operation is illegal. Unless I cast away const, but
that's a rather dirty hack. The other problem is that the compiler
hardcodes the sizes of each TypeInfo instance, so it will refuse to
compile object.d anyway if the TypeInfo is expanded to have an extra
field for caching the result of hasCustomTohash(). But since we have to
modify the compiler now, my reaction was, why not have the compiler
compute this value itself? Since the compiler already has all the
information needed to compute this value. We don't have to wait till
runtime. The only drawback is adding more complexity to the compiler,
making it hard for other efforts like SDC to implement D.

What do you guys think? Should hasCustomToHash() be cached somehow in
object.o? Or is caching a poor solution, and we should do something
else?


T

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth? -- Michael Beibl
Jun 30 2015
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/30/15 5:14 PM, H. S. Teoh via Digitalmars-d wrote:
 While investigating:

 	 https://issues.dlang.org/show_bug.cgi?id=4244

 I found that the druntime function for computing the hash of static
 arrays (this also applies to dynamic arrays, btw) is horrendously slow:
 about 8-9 times slower than the equivalent operation on a POD struct of
 the same size.

 The problem is caused by the call to hasCustomToHash() inside
 getArrayHash() in object.d, which in turn calls getElement(), which
 walks the typeinfo tree to find the TypeInfo for the first non-array /
 typedef type info definition, in order to determine if array elements
 have a custom toHash method. This walk is done *every single time* the
 array is hashed, even though the return value never changes for each
 array type.

 So I tried to modify getArrayHash() to cache this information in the
 TypeInfo, but ran into some roadblocks: since TypeInfo's are supposed to
 be const, this operation is illegal. Unless I cast away const, but
 that's a rather dirty hack. The other problem is that the compiler
 hardcodes the sizes of each TypeInfo instance, so it will refuse to
 compile object.d anyway if the TypeInfo is expanded to have an extra
 field for caching the result of hasCustomTohash(). But since we have to
 modify the compiler now, my reaction was, why not have the compiler
 compute this value itself? Since the compiler already has all the
 information needed to compute this value. We don't have to wait till
 runtime. The only drawback is adding more complexity to the compiler,
 making it hard for other efforts like SDC to implement D.

 What do you guys think? Should hasCustomToHash() be cached somehow in
 object.o? Or is caching a poor solution, and we should do something
 else?
This sounds like a very good job for RTInfo. But another possibility -- could you just store the flag in the AA itself? Still doesn't help if you have lots of small similarly-typed AAs, but the fact that the AA key type doesn't change should give you this optimization for relatively cheap, no? -Steve
Jun 30 2015
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 01-Jul-2015 00:14, H. S. Teoh via Digitalmars-d wrote:
 While investigating:

 	 https://issues.dlang.org/show_bug.cgi?id=4244

 I found that the druntime function for computing the hash of static
 arrays (this also applies to dynamic arrays, btw) is horrendously slow:
 about 8-9 times slower than the equivalent operation on a POD struct of
 the same size.

 The problem is caused by the call to hasCustomToHash() inside
 getArrayHash() in object.d, which in turn calls getElement(), which
 walks the typeinfo tree to find the TypeInfo for the first non-array /
 typedef type info definition, in order to determine if array elements
 have a custom toHash method. This walk is done *every single time* the
 array is hashed, even though the return value never changes for each
 array type.
Can't the compiler just help us and populate a variable in TypeInfo that is hasCustomToHash is inefficiently computing? -- Dmitry Olshansky
Jun 30 2015
prev sibling next sibling parent reply "Mike" <none none.com> writes:
On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
 since TypeInfo's are supposed to be const, this operation is 
 illegal. Unless I cast away const, but that's a rather dirty 
 hack. The other problem is that the compiler hardcodes the 
 sizes of each TypeInfo instance, so it will refuse to compile 
 object.d anyway if the TypeInfo is expanded to have an extra 
 field for caching the result of hasCustomTohash(). But since we 
 have to modify the compiler now, my reaction was, why not have 
 the compiler compute this value itself? Since the compiler 
 already has all the information needed to compute this value. 
 We don't have to wait till runtime. The only drawback is adding 
 more complexity to the compiler, making it hard for other 
 efforts like SDC to implement D.
IMO the root cause of the trouble implementing your idea is the fact that the whole TypeInfo implementation is janky. I think we should implement some flavor of Adam Ruppe's idea to move TypeInfo to the runtime (https://issues.dlang.org/show_bug.cgi?id=12270). Then the implementation would be more straightforward...and it would solve another major problem that I'm having trying to use D for resource-constrained systems (http://bugzilla.gdcproject.org/show_bug.cgi?id=184) without having to resort to so much hackery. It may also help reduce binary sizes on other platforms, which has come up a few times recently. That's probably much more than you bargained for, but I have every intention on implementing it this year, and could use some help. Mike
Jun 30 2015
parent "rsw0x" <anonymous anonymous.com> writes:
On Tuesday, 30 June 2015 at 23:23:46 UTC, Mike wrote:
 On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
[...]
IMO the root cause of the trouble implementing your idea is the fact that the whole TypeInfo implementation is janky. I think we should implement some flavor of Adam Ruppe's idea to move TypeInfo to the runtime (https://issues.dlang.org/show_bug.cgi?id=12270). Then the implementation would be more straightforward...and it would solve another major problem that I'm having trying to use D for resource-constrained systems (http://bugzilla.gdcproject.org/show_bug.cgi?id=184) without having to resort to so much hackery. It may also help reduce binary sizes on other platforms, which has come up a few times recently. [...]
+1 typeinfo situation is ... bad.
Jun 30 2015
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
 While investigating:

 	 https://issues.dlang.org/show_bug.cgi?id=4244

 I found that the druntime function for computing the hash of 
 static arrays (this also applies to dynamic arrays, btw) is 
 horrendously slow: about 8-9 times slower than the equivalent 
 operation on a POD struct of the same size.

 The problem is caused by the call to hasCustomToHash() inside 
 getArrayHash() in object.d, which in turn calls getElement(), 
 which walks the typeinfo tree to find the TypeInfo for the 
 first non-array / typedef type info definition, in order to 
 determine if array elements have a custom toHash method. This 
 walk is done *every single time* the array is hashed, even 
 though the return value never changes for each array type.

 So I tried to modify getArrayHash() to cache this information 
 in the TypeInfo, but ran into some roadblocks: since TypeInfo's 
 are supposed to be const, this operation is illegal. Unless I 
 cast away const, but that's a rather dirty hack. The other 
 problem is that the compiler hardcodes the sizes of each 
 TypeInfo instance, so it will refuse to compile object.d anyway 
 if the TypeInfo is expanded to have an extra field for caching 
 the result of hasCustomTohash(). But since we have to modify 
 the compiler now, my reaction was, why not have the compiler 
 compute this value itself? Since the compiler already has all 
 the information needed to compute this value. We don't have to 
 wait till runtime. The only drawback is adding more complexity 
 to the compiler, making it hard for other efforts like SDC to 
 implement D.

 What do you guys think? Should hasCustomToHash() be cached 
 somehow in object.o? Or is caching a poor solution, and we 
 should do something else?


 T
This should not use typeinfo at all IMO. We have template to do exactly that.
Jun 30 2015
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, 1 July 2015 at 00:48:55 UTC, deadalnix wrote:

 This should not use typeinfo at all IMO.

 We have template to do exactly that.
Yeah. It's known at compile time whether a type has a custom toHash function, so it's pretty ridiculous to be looking that sort of thing up at runtime - _especially_ when it's slow to do so. - Jonathan M Davis
Jun 30 2015