www.digitalmars.com         C & C++   DMDScript  

D - Container of unspecified type?

reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
I've been thinking lately that it would be good if we could declare
containers of unspecified type, and let the compiler choose the
implementation.  There are 3 major types of container that I can think
of:
    * Unsorted, unindexed (bag)
    * Sorted, unindexed (list)
    * Unsorted, indexed (map/hash)
I suppose you could have a sorted,indexed containter, but what would
that be?

My thought here is that each of those containers has a fair analogy in
the array syntax:
    * Sorted, unindexed: associative array
    * Unsorted, indexed: integer-indexed array
    * Unsorted, unindexed: integer-indexed array (no reason we CAN'T
have indexes...)

For most purposes, a container is just a container - the programmer
doesn't really need to know how it is implemented.  So what if we
thought of an array not as an array, but as a generic container
declaration, which could be implemented as an array or as some other
type?

Basically, the idea is that the compiler would look at the things that
you do to the container, and choose an appropriate type.  If all you
ever do is array.addTail and array.removeFront, then it will decide to
implement your container as a linked list (or doubly-linked list, if
that's appropriate).  If all you do is access by integer indexes, then
it uses an array.  And so on.

Of course, performance-critical code would want to specify how the thing
was implemented.  So you add an optional "as" clause to array
declarations:

    int[char[]] myIntArray;    // compiler chooses container type
    int[char[]] as hash myHashTable;    // compiler obliged to use
"hash" as the container type

Then we would add a series of container-like properties to arrays:
    addTail - exactly equivalent to the ~ operator, but included to
match addFront,removeTail,removeFront
    addFront
    removeTail
    removeFront
    push - alias of addTail
    pop - alias of removeTail
    getTail - doesn't remove the object
    getFront - doesn't remove the object
    others?

Explicit container types could include:
    array (only valid if the index type is integers)
    hash
    list
    dlist
    tree
    avltree
    others?

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]
Jul 22 2002
next sibling parent reply "anderson" <anderson firestar.com.au> writes:
I agree. Walter mentioned something about datatype implementation taking a
lot of time. But what do I know, perhaps these are a different type of
datatype or parhaps it's worth it? If it was possible to program your own
properties for arrays, cuppled with tempates, it may be possible to write
your own (or standard libarary it). Parhaps there could be a class tempate
that extends array types, allow you to program your own variants. Although I
don't know how the syntax would look.

How about conversions between formats. ie you could covert a unsorted array
time to sorted for a period of code and then convert it back (ie you don't
care about maintaining the arrays in sorted format, although it's allright
if it is). For example, in some cases there are parts in algorithms where
you need to convert an unsorted array into a sorted hash (perhaps for a
multiple binary search and remove) but because this operation is not used
frequently it better to generally better keep an unsorted array. D has the
.sort, but sometimes a data-struct is more adapt to certain tasks.

"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3D3BFC0F.50C9D499 deming-os.org...
 I've been thinking lately that it would be good if we could declare
 containers of unspecified type, and let the compiler choose the
 implementation.  There are 3 major types of container that I can think
 of:
     * Unsorted, unindexed (bag)
     * Sorted, unindexed (list)
     * Unsorted, indexed (map/hash)
 I suppose you could have a sorted,indexed containter, but what would
 that be?

 My thought here is that each of those containers has a fair analogy in
 the array syntax:
     * Sorted, unindexed: associative array
     * Unsorted, indexed: integer-indexed array
     * Unsorted, unindexed: integer-indexed array (no reason we CAN'T
 have indexes...)

 For most purposes, a container is just a container - the programmer
 doesn't really need to know how it is implemented.  So what if we
 thought of an array not as an array, but as a generic container
 declaration, which could be implemented as an array or as some other
 type?

 Basically, the idea is that the compiler would look at the things that
 you do to the container, and choose an appropriate type.  If all you
 ever do is array.addTail and array.removeFront, then it will decide to
 implement your container as a linked list (or doubly-linked list, if
 that's appropriate).  If all you do is access by integer indexes, then
 it uses an array.  And so on.

 Of course, performance-critical code would want to specify how the thing
 was implemented.  So you add an optional "as" clause to array
 declarations:

     int[char[]] myIntArray;    // compiler chooses container type
     int[char[]] as hash myHashTable;    // compiler obliged to use
 "hash" as the container type

 Then we would add a series of container-like properties to arrays:
     addTail - exactly equivalent to the ~ operator, but included to
 match addFront,removeTail,removeFront
     addFront
     removeTail
     removeFront
     push - alias of addTail
     pop - alias of removeTail
     getTail - doesn't remove the object
     getFront - doesn't remove the object
     others?

 Explicit container types could include:
     array (only valid if the index type is integers)
     hash
     list
     dlist
     tree
     avltree
     others?

 --
 The Villagers are Online! http://villagersonline.com

 .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
 .[ (a version.of(English).(precise.more)) is(possible) ]
 ?[ you want.to(help(develop(it))) ]
Jul 22 2002
parent "Sandor Hojtsy" <hojtsy index.hu> writes:
 Then we would add a series of container-like properties to arrays:
     addTail - exactly equivalent to the ~ operator, but included to
 match addFront,removeTail,removeFront
     addFront
     removeTail
     removeFront
     push - alias of addTail
     pop - alias of removeTail
     getTail - doesn't remove the object
     getFront - doesn't remove the object
     others?
If nothing else comes through from your idea, these properties (and some more) should of course be added to the built-in arrays. That seems easy enough.
Jul 23 2002
prev sibling parent "Sean L. Palmer" <seanpalmer earthlink.net> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3D3BFC0F.50C9D499 deming-os.org...
 I've been thinking lately that it would be good if we could declare
 containers of unspecified type, and let the compiler choose the
 implementation.  There are 3 major types of container that I can think
 of:
     * Unsorted, unindexed (bag)
     * Sorted, unindexed (list)
     * Unsorted, indexed (map/hash)
 I suppose you could have a sorted,indexed containter, but what would
 that be?
Lists aren't necessarily sorted (though they do have a definite order.) Bags are unordered; i.e. they don't guarantee order from iteration to iteration. While iterating over a bag, it's ordered, but only during that iteration, and only if nobody adds or removes objects during that traversal. You can do some interesting optimizations when order doesn't matter. Delete is swap with last, delete last. Insert can always go at the end, or into an empty slot if you have them. Arrays are ordered and indexed (always by integer position in the array) but generally unsorted. Matrices are related to arrays. Sets are indexed (probably sorted, but that's an implementation detail) and the only real property is inclusion/exclusion (and iteration). For integer indices it's essentially a bit array. Sparse arrays (save space at expense of access speed) Maps and hashtables are like associative arrays; indexed on any type. Maps have ordering that makes sense (usually just what < provides); the ordering of hash tables is entirely dependent on choice of hash. Stacks and queues are really basic containers too but can be easily implemented atop arrays or lists. Deques seem to be a list of small arrays; indexed (slow!) and ordered. Trees of various forms, usually binary. Graphs are similar to trees. A sorted, indexed container is like a sorted array, conceptually. A map could fall into this category.
 My thought here is that each of those containers has a fair analogy in
 the array syntax:
     * Sorted, unindexed: associative array
     * Unsorted, indexed: integer-indexed array
     * Unsorted, unindexed: integer-indexed array (no reason we CAN'T
 have indexes...)
I'd like to think of an associative array like I'd think of a map: a (slow to index) sparse array that can be indexed by any type. Yes, indexing is usually something one might be willing to give up in exchange for other optimizations. Sometimes a list is the most natural type of container for a given data set. Using an array where a list is needed can result in large algorithmic penalties (mainly for reallocations).
 For most purposes, a container is just a container - the programmer
 doesn't really need to know how it is implemented.  So what if we
 thought of an array not as an array, but as a generic container
 declaration, which could be implemented as an array or as some other
 type?
Because the programmer also needs control over what type the container is. At least the ability to hint to the compiler what kind of things are important (consistent total order, index speed, iteration speed, insertion/removal speed (at ends, or arbitrary), search speed, foreknowledge of maximum capacity). All these things can give large hints as to which implementation would be most suitable. So, we call these common groups of traits the "common" name for that type: list, array, what have you. Of course you could use some strange syntax to differentiate between them. The main problem syntactically is that you can't use [] on lists; indexing lists is contrary to the nature of the class (although possible, algorithmically it's very slow). Arrays, while conveniently indexable by integer, also don't index well when the index isn't integral. And when it is integral, it can end up wasting lots of space if data is sparse. So sparse arrays would be nice too. Iteration seems to be the one common factor in (almost!) all containers. Indexing is a luxury that can be emulated if necessary.
 Basically, the idea is that the compiler would look at the things that
 you do to the container, and choose an appropriate type.  If all you
 ever do is array.addTail and array.removeFront, then it will decide to
 implement your container as a linked list (or doubly-linked list, if
 that's appropriate).  If all you do is access by integer indexes, then
 it uses an array.  And so on.
Getting the compiler to detect this sounds difficult. I'd rather just tell it what kind of container I want.
 Of course, performance-critical code would want to specify how the thing
 was implemented.  So you add an optional "as" clause to array
 declarations:

     int[char[]] myIntArray;    // compiler chooses container type
     int[char[]] as hash myHashTable;    // compiler obliged to use
 "hash" as the container type
That sounds fairly ok.
 Then we would add a series of container-like properties to arrays:
     addTail - exactly equivalent to the ~ operator, but included to
 match addFront,removeTail,removeFront
     addFront
     removeTail
     removeFront
     push - alias of addTail
     pop - alias of removeTail
     getTail - doesn't remove the object
     getFront - doesn't remove the object
     others?
Didn't I just post a list similar to this a few weeks ago? ;)
 Explicit container types could include:
     array (only valid if the index type is integers)
     hash
     list
     dlist
     tree
     avltree
     others?
There are a few above. Sean
Jul 31 2002