www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Vote: AA builtin method get()

reply David Medlock <noone nowhere.com> writes:
Walter,

I really think the following is a good solution to the AA issues.

1. Add a get method to all AAs:

bool get( in key, out value );

if the value is there, you get it and no double lookups.  A great side 
effect of this is the out value is initialized to the default! Which 
mimics the old behavior and allows you to determine if it was there.

2. Deprecate the in syntax.  It is really not useful because it cannot 
be overridden if I wish to replace an AA with a class.  Semantically it 
has no advantage to a method.  It also requires pointers which isn't 
really required elsewhere in the rest of D's core.

3. allow remove(..) to return a boolean if the key was removed.  An 
overloaded remove with an 'out key' paramter might be useful as well.

4. Upon implementing those, the don't create behavior is kept for normal 
lookups.

Comments?

Here's hoping Walter is reading...

-DavidM
Jul 13 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"David Medlock" <noone nowhere.com> wrote in message 
news:db3026$629$1 digitaldaemon.com...
 Walter,

 I really think the following is a good solution to the AA issues.

 1. Add a get method to all AAs:

 bool get( in key, out value );

 if the value is there, you get it and no double lookups.  A great side 
 effect of this is the out value is initialized to the default! Which 
 mimics the old behavior and allows you to determine if it was there.
question: would get() insert if not present? I assume not but that last sentance above made me not so sure. It might also be nice to have insert(key) which would equivalent to &aa[key] assuming & inserts key if it isn't present.
 2. Deprecate the in syntax.  It is really not useful because it cannot be 
 overridden if I wish to replace an AA with a class.  Semantically it has 
 no advantage to a method.  It also requires pointers which isn't really 
 required elsewhere in the rest of D's core.
overloading shall come someday. The current AA indexing behavior isn't overloadable either.
 3. allow remove(..) to return a boolean if the key was removed.  An 
 overloaded remove with an 'out key' paramter might be useful as well.
sounds good to me.
 4. Upon implementing those, the don't create behavior is kept for normal 
 lookups.
What do you mean by normal lookups? Do you mean rvalue aa[key]?
 Comments?

 Here's hoping Walter is reading...

 -DavidM 
Jul 13 2005
parent reply David Medlock <noone nowhere.com> writes:
Ben Hinkle wrote:

 "David Medlock" <noone nowhere.com> wrote in message 
 news:db3026$629$1 digitaldaemon.com...
 
Walter,

I really think the following is a good solution to the AA issues.

1. Add a get method to all AAs:

bool get( in key, out value );

if the value is there, you get it and no double lookups.  A great side 
effect of this is the out value is initialized to the default! Which 
mimics the old behavior and allows you to determine if it was there.
question: would get() insert if not present? I assume not but that last sentance above made me not so sure. It might also be nice to have insert(key) which would equivalent to &aa[key] assuming & inserts key if it isn't present.
I would assume it would not insert, which could be handled if it returned false. Since an insert into an AA isnt a lookup, this is still efficient. With get() I don't see a case where insert would be needed? MyClass var; if ( aa.get( key, var )==false ) aa[key] = ( var = new Class() );
2. Deprecate the in syntax.  It is really not useful because it cannot be 
overridden if I wish to replace an AA with a class.  Semantically it has 
no advantage to a method.  It also requires pointers which isn't really 
required elsewhere in the rest of D's core.
overloading shall come someday.
Overloading in? Whats wrong with a method? Seems like a novelty.... The current AA indexing behavior isn't
 overloadable either.
 
Why not? Do you mean the &aa[key] syntax?
 
3. allow remove(..) to return a boolean if the key was removed.  An 
overloaded remove with an 'out key' paramter might be useful as well.
sounds good to me.
4. Upon implementing those, the don't create behavior is kept for normal 
lookups.
What do you mean by normal lookups? Do you mean rvalue aa[key]?
Yes, rvalues. LValues would ostensibly not be affected.
 
Comments?

Here's hoping Walter is reading...

-DavidM 
Jul 13 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"David Medlock" <noone nowhere.com> wrote in message 
news:db342u$97b$1 digitaldaemon.com...
 Ben Hinkle wrote:

 "David Medlock" <noone nowhere.com> wrote in message 
 news:db3026$629$1 digitaldaemon.com...

Walter,

I really think the following is a good solution to the AA issues.

1. Add a get method to all AAs:

bool get( in key, out value );

if the value is there, you get it and no double lookups.  A great side 
effect of this is the out value is initialized to the default! Which 
mimics the old behavior and allows you to determine if it was there.
question: would get() insert if not present? I assume not but that last sentance above made me not so sure. It might also be nice to have insert(key) which would equivalent to &aa[key] assuming & inserts key if it isn't present.
I would assume it would not insert, which could be handled if it returned false. Since an insert into an AA isnt a lookup, this is still efficient. With get() I don't see a case where insert would be needed? MyClass var; if ( aa.get( key, var )==false ) aa[key] = ( var = new Class() );
That's still a double-lookup. I don't know what you mean exactly when you say "an insert into an AA isn't a lookup". In order to insert a key you need to compute the hash value and find an available slot for it. In particular if the key is already in the AA then an insert is 99% the same amount of work as looking up a key, which computes the hash and find the existing slot. I thought you were looking for an API that would mimic the C++ API - the point being that missing keys are inserted and an lvalue is returned.
2. Deprecate the in syntax.  It is really not useful because it cannot be 
overridden if I wish to replace an AA with a class.  Semantically it has 
no advantage to a method.  It also requires pointers which isn't really 
required elsewhere in the rest of D's core.
overloading shall come someday.
Overloading in? Whats wrong with a method? Seems like a novelty.... The current AA indexing behavior isn't
 overloadable either.
Why not? Do you mean the &aa[key] syntax?
yes. The only overloads are opIndex and opIndexAssign
Jul 13 2005
parent reply David Medlock <noone nowhere.com> writes:
Ben Hinkle wrote:

 "David Medlock" <noone nowhere.com> wrote in message 
 news:db342u$97b$1 digitaldaemon.com...
 
Ben Hinkle wrote:


"David Medlock" <noone nowhere.com> wrote in message 
news:db3026$629$1 digitaldaemon.com...


Walter,

I really think the following is a good solution to the AA issues.

1. Add a get method to all AAs:

bool get( in key, out value );

if the value is there, you get it and no double lookups.  A great side 
effect of this is the out value is initialized to the default! Which 
mimics the old behavior and allows you to determine if it was there.
question: would get() insert if not present? I assume not but that last sentance above made me not so sure. It might also be nice to have insert(key) which would equivalent to &aa[key] assuming & inserts key if it isn't present.
I would assume it would not insert, which could be handled if it returned false. Since an insert into an AA isnt a lookup, this is still efficient. With get() I don't see a case where insert would be needed? MyClass var; if ( aa.get( key, var )==false ) aa[key] = ( var = new Class() );
That's still a double-lookup. I don't know what you mean exactly when you say "an insert into an AA isn't a lookup". In order to insert a key you need to compute the hash value and find an available slot for it. In particular if the key is already in the AA then an insert is 99% the same amount of work as looking up a key, which computes the hash and find the existing slot.
I assume Walter uses a bucket hashmap. This means a linked list has an entry inserted at its head for the key's bucket. This is a constant time operation(basically nil). Computing the key is not usually a hefty task(most are ints most likely), but searching through the bucket of entries can be(depending on the load). With cache-sensitive processors I would bet searching through the bucket list takes quite a bit longer. Besides, even in the cases where the key takes a couple of extra milliseconds to compute, it only happens the *first time*. If that is the crux of your program's performance, I would say it either: 1. Needs a better data strategy. 2. Is as fast as it is going to be, barring assembly.
 I thought you were looking for an API that would mimic the C++ API - the
 point being that missing keys are inserted and an lvalue is returned.
I actually like the idea of a Rvalue (var= aa[100]) lookup simply returning the default and *not creating* an entry, but this would break due to arrays being null proof... -DavidM
Jul 13 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"David Medlock" <noone nowhere.com> wrote in message 
news:db36p6$bt9$1 digitaldaemon.com...
 Ben Hinkle wrote:

 "David Medlock" <noone nowhere.com> wrote in message 
 news:db342u$97b$1 digitaldaemon.com...

Ben Hinkle wrote:


"David Medlock" <noone nowhere.com> wrote in message 
news:db3026$629$1 digitaldaemon.com...


Walter,

I really think the following is a good solution to the AA issues.

1. Add a get method to all AAs:

bool get( in key, out value );

if the value is there, you get it and no double lookups.  A great side 
effect of this is the out value is initialized to the default! Which 
mimics the old behavior and allows you to determine if it was there.
question: would get() insert if not present? I assume not but that last sentance above made me not so sure. It might also be nice to have insert(key) which would equivalent to &aa[key] assuming & inserts key if it isn't present.
I would assume it would not insert, which could be handled if it returned false. Since an insert into an AA isnt a lookup, this is still efficient. With get() I don't see a case where insert would be needed? MyClass var; if ( aa.get( key, var )==false ) aa[key] = ( var = new Class() );
That's still a double-lookup. I don't know what you mean exactly when you say "an insert into an AA isn't a lookup". In order to insert a key you need to compute the hash value and find an available slot for it. In particular if the key is already in the AA then an insert is 99% the same amount of work as looking up a key, which computes the hash and find the existing slot.
I assume Walter uses a bucket hashmap. This means a linked list has an entry inserted at its head for the key's bucket. This is a constant time operation(basically nil). Computing the key is not usually a hefty task(most are ints most likely), but searching through the bucket of entries can be(depending on the load). With cache-sensitive processors I would bet searching through the bucket list takes quite a bit longer. Besides, even in the cases where the key takes a couple of extra milliseconds to compute, it only happens the *first time*. If that is the crux of your program's performance, I would say it either: 1. Needs a better data strategy. 2. Is as fast as it is going to be, barring assembly.
Actually the current AA implementation doesn't use a linked list of buckets - it uses an unbalanced tree (or perhaps more accurately the tree will be balanced if the hashing algorithm is good). In any case "inserting" a key that is already in the map needs to locate and reuse the existing bucket otherwise you'll end up with duplicate entries in the map. So insert is essentially the same as a lookup. I agree if you know a key is not in the map and the map is using a linked list of buckets that inserting is faster than lookup but neither of those assumptions would be true for the "insert" function I was talking about. But I agree performance isn't the crucial difference. Convenience and making the code more explicit would probably be the true benefits.
 I thought you were looking for an API that would mimic the C++ API - the
 point being that missing keys are inserted and an lvalue is returned.
I actually like the idea of a Rvalue (var= aa[100]) lookup simply returning the default and *not creating* an entry, but this would break due to arrays being null proof...
ok.
Jul 13 2005
parent David Medlock <noone nowhere.com> writes:
Ben Hinkle wrote:

 "David Medlock" <noone nowhere.com> wrote in message 
 news:db36p6$bt9$1 digitaldaemon.com...
 
I assume Walter uses a bucket hashmap.  This means a linked list has an 
entry inserted at its head for the key's bucket.  This is a constant time 
operation(basically nil). Computing the key is not usually a hefty 
task(most are ints most likely), but searching through the bucket of 
entries can be(depending on the load). With cache-sensitive processors I 
would bet searching through the bucket list takes quite a bit longer.

Besides, even in the cases where the key takes a couple of extra 
milliseconds to compute, it only happens the *first time*.

If that is the crux of your program's performance, I would say it either:
1. Needs a better data strategy.
2. Is as fast as it is going to be, barring assembly.
Actually the current AA implementation doesn't use a linked list of buckets - it uses an unbalanced tree (or perhaps more accurately the tree will be balanced if the hashing algorithm is good).
Interesting, I didn't know this. In any case "inserting"
 a key that is already in the map needs to locate and reuse the existing 
 bucket otherwise you'll end up with duplicate entries in the map. So insert 
 is essentially the same as a lookup. 
True. Mea culpa.
 But I agree performance isn't the crucial difference. Convenience and making 
 the code more explicit would probably be the true benefits.
 
Hehe, right on. My work time is a constant, while the computer gets more 'work time' constantly(Moore's law). Thanks Ben. -DavidM
Jul 13 2005