www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Druntime AA interface could be enhanced a bit

reply Michael Rynn <michaelrynn optusnet.com.au> writes:
In playing around trying to understand and make a better AA, I have 
updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time to 
be a single linked list version.  

In the template versions of various implementations, and in Java HashMap, 
I noticed that its nice to be able to specify the initial capacity, and a 
load ratio (nodes to table length), and see how performance varies. 

It would be nice to have an _aaInit function, where these things are 
specified.  

And corresponding .init property.

And a .capacity (read and write) property.
And a .loadRatio (double) property.
How about a .statistics property?

valuetype[keytype] aa.

aa.init(capacity, loadratio).
...fill it up...
uint[]  stats = aa.statistics;

The implementation template classes now have these things. Capacity and 
load ratio are pretty standard for AA. They could be added to the current 
AA as well, which I might get around to. 

The _aaInit function should push the TypeInfo_AssociativeArray, once and 
for all, and not have different TypeInfos and valuesizes, re-pushed to 
different functions, var-arged before keys and values, as if they might 
change at any time. _aaRehash may get a different TypeInfo when called 
directly than when called from inside _aaGet, which I am sure will be 
corrected soon.

There may be a vague general principle here, of having classes that can 
be auto-initialized if they support certain properties.  The AA is really 
a class type hiding in a reference, and it would be nice to support that 
directly.

  Since the implementation is a reference type, the compiler can check 
for nullness, and call an init thunk, before calling standard AA 
functions and properties that require non-null reference.  Its either 
initialized or it isn't. 

The sizeof AA is 4, on my machine, not 8, as is supposed to be the common 
value in the D web site documentation on AA. Its the size of a pointer, 
so the site would be more accurate to just say its an opaque reference 
pointer sized type.

Again the aaA.d has been given a runtime TypeInfo initialisation tweak so 
that it can shovel integer value keys straight into the hash value of the 
nodes, and avoid TypeInfo compare and getHash.  It needs a bit more 
testing.  Its easier to work and experiment with than the more 
complicated tree version.

The current AA has nice balanced trees , but they do not make much of a 
performance difference at lower load ratios ( < 4 ), so that extra 
pointer seems to be a waste. The current AA will rehash before the load 
level becomes too sluggish.  I have uploaded some better refined tests to 
compare the various implementations at dsource/aa.

The linear congruential sequence generator variant is an example of "open 
addressing" hash table, and its performance is limited by the drawbacks 
of this model. The other implementations do better, waste less memory, 
and can handle table load ratios greater than  1.

Performance varies very much with the sort of keys used, much more so 
than the implementation variants, and table loads. Wrapping a string in 
class and caching the hash value gets much better performance than raw 
string keys. The string hash distribution was a bit better with a hash 
function using 31 as the character multiplier rather than 11.

-- May be I will get a job using Java soon..
Apr 09 2010
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Michael Rynn Wrote:

 
 In playing around trying to understand and make a better AA, I have 
 updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time to 
 be a single linked list version.  
 
 In the template versions of various implementations, and in Java HashMap, 
 I noticed that its nice to be able to specify the initial capacity, and a 
 load ratio (nodes to table length), and see how performance varies. 
 
 It would be nice to have an _aaInit function, where these things are 
 specified.  
 
 And corresponding .init property.
 
 And a .capacity (read and write) property.
 And a .loadRatio (double) property.
 How about a .statistics property?
Please submit a ticket labeled [enhancement] with as much detail as you can provide. It's a good idea, but will get lost if it isn't in BugZilla.
Apr 09 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Michael Rynn, I don't agree. In my opinion it's better to keep the built-in AAs
very flexible and very easy to use. I prefer them to have methods/attributes to
make them more handy to use in real situations (time ago I have listed here
time several missing things, like clear and copy methods, while I have said
that the rehash can be removed), while the performance tweak parameters can be
added to a associative array in Phobos, that can be used for more specialized
purposes.

Bye,
bearophile
Apr 09 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Michael Rynn wrote:
 In playing around trying to understand and make a better AA, I have 
 updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time to 
 be a single linked list version.  
I don't understand, the latest release already changed aaA.d to use a singly linked list.
Apr 09 2010
parent Michael Rynn <michaelrynn optusnet.com.au> writes:
On Fri, 09 Apr 2010 12:08:54 -0700, Walter Bright wrote:

 Michael Rynn wrote:
 In playing around trying to understand and make a better AA, I have
 updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time
 to be a single linked list version.
I don't understand, the latest release already changed aaA.d to use a singly linked list.
Well, well, this is the power of synchronicity. After I picked up my jaw and eyeballs from the floor, I checked out the latest. The new aaA.d is good, but I think I can do better. I am still testing out some ideas, and still learning, so I will post an update to bugzilla after my brain reset.
Apr 13 2010
prev sibling parent reply BLS <windevguy hotmail.de> writes:
On 09/04/2010 19:46, Michael Rynn wrote:
 In playing around trying to understand and make a better AA, I have
 updated again the dsource/projects/aa/trunk/druntime/aaA.d, this time to
 be a single linked list version.
Well,Well Since D2/Phobos is still(quite a while now) lacking a container library a build-in AA may look like an good idea. I am not willing to agree. IMHO AAs or let's call 'em dictionaries for a while could be driven by skip lists (QT like) or whatever is appropriate, maybe even on runtime demand) How to implement that with build-in AAs ? I guess what I want to ask is > Are D AAs really so fundamental ? Are D AAs flexible enough ? Bjoern I think it was dschimsa who has initiated a place for AA replacements on dsource. Good idea.
Apr 09 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
BLS wrote:
 I guess what I want to ask is
 Are D AAs really so fundamental ?
Yes, I believe they are. For one thing, they are highly useful in CTFE, which would be difficult to do if they were a user defined type.
 Are D AAs flexible enough ?
Yes, the criticisms of them tend to be about their performance, not suitability for a wide variety of applications.
Apr 09 2010