www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 10733] New: Add a druntime function to find the pointer to a key of a built-in associative array given the pointer to a value

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10733

           Summary: Add a druntime function to find the pointer to a key
                    of a built-in associative array given the pointer to a
                    value
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: druntime
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



Associative arrays are handy, but beside sets and bags there's a common desire
for ordered dictionaries. They are not ordered in the sense an AVL is ordered,
it's not the order of the keys that matters for them, but the insertion order
of the key-value pairs. So they are hash-based, unlike a search tree. The
Python standard library has such data structure:

http://docs.python.org/2/library/collections.html#collections.OrderedDict

An OrderedDict has some of the advantages of an array (it keeps the insertion
order, its printing and its iteration is deterministic, it can be iterated
backwards too), and it can have a popLast or popHead member function.

There are several ways to implement an OrderedDict in library code. One of the
simplest is to wrap a built-in D hash based associative array. 

Let's say the implementation is like:


final class OrderedAA(KeyType, ValType) {
    static struct MyValue {
        ValType item;
        typeof(this)* pred, succ;
    }
    private MyValue[KeyType] data;
    ...
}


Each value is a struct that beside the original value 'item' also contains a
pointer to the precedent and successive MV, in a circular doubly linked list.

The member functions of OrderedAA that add or remove items from the associative
array manage the linked list of values. The iteration is a range that follows
the linked list. But this way you only scan the values.

To also implement byKey or to scan the keys in a foreach, or the key-value
pairs, you find the pointer to the key given the pointer to a value MyValue.

There a way to find the pointer to an AA key given the pointer to its value,
but it's not portable and it's undocumented.

So perhaps it's worth adding to the standard object module of druntime a
portable way to find that pointer, to allow a simple library-based
implementation of a hash-based ordered associative array. Such function must
work in O(1), it's meant only for access the key in read mode, and it's
probably short and simple to write. The disadvantage it that such function
works on the inner representation of the D associative arrays, so it can
potentially constraint future implementations of such associative arrays. But
it's not easy to invent AA implementations that make it hard to implement that
function in O(1).

Alternative or better solutions/ideas are welcome.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 31 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10733


hsteoh quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsteoh quickfur.ath.cx



I think ordered dictionaries should be implemented as a library type.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10733





 I think ordered dictionaries should be implemented as a library type.
This request asks to add to the object module just one very small function, that allows to implement ordered dictionaries using 99% of library code, building on top of the built-in associative arrays, keeping very small the amount of duplicated associative code logic. Is this good enough for you? If this is not good enough and you desire something different, then please explain why. Thank you. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10733




Well, it's not that it's not good enough, it's just that with your approach,
the resulting code would be un- safe, because you have to cast Value* to Key*.
It can't be made  trusted either, because you can't guarantee where the user
code got that Value* from (they could have created a Value variable outside of
the AA, and then the resulting Key* would be an invalid pointer).

Also, based on your other AA bug reports / enhancement requests, it would
appear that the current AA implementation may need to be overhauled anyway, and
adding too many dependencies on the internal implementation would only make
that harder.

For true code reuse to support these different kinds of AA, I think a better
approach is to implement a common Set type. A Set basically is a valueless
hash, it just stores Elements. The current Key/Value AA can then be implemented
by creating an Element whose toHash only hashes the key part of the Element.
This also avoids the current code duplication in byKey and byValue, because the
underlying Set would only have a single range returned by Set.byElement, so
byKey and byValue would just be wrappers around byElement that returns either
the key part or the value part. And byPair would be trivially implemented as
well. (In fact, this is close to what is currently implemented in AA's, byKey
and byValue basically iterates over Slots and returns the key part or the value
part.)

Using this Set, you can implement your ordered AA just by using an Element with
both key and value and next/prev pointers, and you'll be able to iterate over
both keys and values.


One way to achieve this in the current AA implementation might be to use a
zero-size Value, say ubyte[0]. (I'm not sure if this works, but if not, it's
probably a bug that needs to be fixed. In any case, you could work around it by
using a dummy ubyte value that is never used.) Then define a Key struct that
contains both the real key and real value and next/prev pointers to other Keys,
with a toHash that only hashes the "real key" part of the struct. The OrderedAA
wrapper would then translate this Key struct into the user-facing "real" key
and value.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 23 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10733




Hmm, actually, my hack to wrap around the current AA implementation doesn't
work; you can't look up the value part of the key! When constructing the fake
key, you don't know the value part, and the AA doesn't return the real key
either, so there's no way to retrieve the value part of the fake key. :-(

I guess we still can't get away from a Set implementation.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 23 2013