digitalmars.D - std.containers

digitalmars.D - std.containers - WAT

James Miller (50/50) Mar 27 2012 In helping someone in D.learn, I ended up looking through the

Andrei Alexandrescu (26/75) Mar 27 2012 Actually removeAny is one of the better ones. It just removes an element...

James Miller (37/41) Mar 27 2012 But there is a Range struct inside each container, it implements the
Jonathan M Davis (14/61) Mar 27 2012 It's like iterators in C++. Each container has its own, and that's the t...

James Miller <james aatch.net> writes:

In helping someone in D.learn, I ended up looking through the
documentation and code for std.containers.

There is a lot wrong with whats going on there right now.

For a start, the documentation is pretty unclear, The table for the
container primitives is poorly explained, and the complexity columns
wraps on most screen sizes, making understanding them a nightmare. The
explanations for what things are, and what they do is pretty unclear,
and since its not completely standardized over the different
containers, you end up with some truly mindboggling documentation. A
good example is this: "c.removeAny()	- Removes some element from c and
returns it.", what some element at random? Or does it take an argument
that isn't show like what is implied by this: "c.stableRemoveAny(v) -
Same as c.removeAny(v), but is guaranteed to not invalidate any
iterators.", but that doesn't seem right because SList has "T
removeAny() - Picks one value from the front of the container, removes
it from the container, and returns it." which doesn't take a value.
This is endless confusing, and I ended up consulting the unittests in
order to understand what the hell was going on.

Then there are the strange input and output types/value for the
arguments. For example, why does SList.linearRemove return an empty
Range on the Take!Range overload, and an empty Range on the Range
overload? In fact, why are any of the functions accepting and
returning /Ranges/ which are internal types specific to the container?
Why don't they all just take a type that can be converted to a range?
As is stands, you end up with ridiculous requirements like:

    //Remove a single element from an SList, given its "index"
    size_t index = 5;
    SList!int s = [0,1,2,3,4,5,6,7,8,9];
    auto r = s[0..index-1];
    auto r1 = take(r, 1);
    s.linearRemove(r1);
    assert(equal(s, [0,1,2,3,5,6,7,8,9]);

Which could be replaced with an overload for linearRemove that looks like this:

    /** Removes the element at index, return false if the element
could not be removed **/
    bool linearRemove(size_t index){...}

Almost all of the "primitives" seem to be based around adding or
removing from the front or back of the container, but there's no
random-access removal. And since the behaviour is Container specific,
its impossible to figure out what a function does anyway! For example,
SList's linearRemove(Range) just removes the front elements from the
container, not obvious.

Is there any reason why things are the way they are? Because all I see
are people getting exited about ranges and operator overloading and
not really thinking about what a useful set of features any given
container should have.

--
James Miller

P.S. I'm happy to help out improving the documentation, but I have no
idea what these things are supposed to do in the first place...

Mar 27 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/27/12 7:46 PM, James Miller wrote:
 For a start, the documentation is pretty unclear, The table for the
 container primitives is poorly explained, and the complexity columns
 wraps on most screen sizes, making understanding them a nightmare.

I plan to redo that table. A screenshot would help - thanks!

 The
 explanations for what things are, and what they do is pretty unclear,
 and since its not completely standardized over the different
 containers, you end up with some truly mindboggling documentation. A
 good example is this: "c.removeAny()	- Removes some element from c and
 returns it.", what some element at random?

Actually removeAny is one of the better ones. It just removes an element 
from the container that is most convenient to the container.

 Or does it take an argument
 that isn't show like what is implied by this: "c.stableRemoveAny(v) -
 Same as c.removeAny(v), but is guaranteed to not invalidate any
 iterators.", but that doesn't seem right because SList has "T
 removeAny() - Picks one value from the front of the container, removes
 it from the container, and returns it." which doesn't take a value.
 This is endless confusing, and I ended up consulting the unittests in
 order to understand what the hell was going on.

That's a mistake in the documentation, stableRemoveAny does not take a 
value. I just fixed it in 
https://github.com/D-Programming-Language/phobos/commit/9d7168be5fc91da74231ab86d989ac6013280830.

 Then there are the strange input and output types/value for the
 arguments. For example, why does SList.linearRemove return an empty
 Range on the Take!Range overload, and an empty Range on the Range
 overload?

It's simple - generally removeRange returns the portion after the 
deleted part. The structure of SList dictates that removing a range 
means removing through the end. Removing a Take!Range naturally will 
return some potentially nonempty range.

 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?

Ranges are not internal to containers, they are the way containers are 
manipulated.

 Why don't they all just take a type that can be converted to a range?

I don't see the advantage of doing so.

 As is stands, you end up with ridiculous requirements like:

      //Remove a single element from an SList, given its "index"
      size_t index = 5;
      SList!int s = [0,1,2,3,4,5,6,7,8,9];
      auto r = s[0..index-1];
      auto r1 = take(r, 1);
      s.linearRemove(r1);
      assert(equal(s, [0,1,2,3,5,6,7,8,9]);

 Which could be replaced with an overload for linearRemove that looks like this:

      /** Removes the element at index, return false if the element
 could not be removed **/
      bool linearRemove(size_t index){...}

We can add overloads that use index-based operations, but generally you 
already have a range and would like to do a positioned delete. If you 
find yourself doing a lot of indexed operations in a SList, you chose 
the wrong containers.

 Almost all of the "primitives" seem to be based around adding or
 removing from the front or back of the container, but there's no
 random-access removal.

Why the quotes? More containers are likely to perform removal at an end 
efficiently than randomly positioned.

 And since the behaviour is Container specific,
 its impossible to figure out what a function does anyway! For example,
 SList's linearRemove(Range) just removes the front elements from the
 container, not obvious.

I'm not sure I get this. Might be a bug somewhere. The linearRemove 
primitive is supposed to do the same for all containers - remove a range 
in linear time.

 Is there any reason why things are the way they are? Because all I see
 are people getting exited about ranges and operator overloading and
 not really thinking about what a useful set of features any given
 container should have.

Proposals for improving things are always welcome.

 P.S. I'm happy to help out improving the documentation, but I have no
 idea what these things are supposed to do in the first place...

Just ask.


Andrei

Mar 27 2012

James Miller <james aatch.net> writes:

On 28 March 2012 15:28, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?


 Ranges are not internal to containers, they are the way containers are
 manipulated.

But there is a Range struct inside each container, it implements the
Range interface, but as far as I am aware, it is its own type, and
therefore anything that explicitly expects a Range is going to
expecting its own internal type. From what I can tell, the internal
Range struct on each container is just the underlying storage
mechanism for that container, and you can even have more than one
defined internal Range for a container. The issue comes about that
several methods want that type. Now there may be some voodoo going on
that I don't understand, but it seems to me that I can't do some thing
like:

    SList!int s = [1,2,3,4,5,6,7];
    s.linearRemove([4]) // Arrays are ranges

To remove a certain value, because [4] is not a valid type, despite
the fact that isForwardRange!(int[]) returns true. This is because
although int[] is a range, it is not a Range (as in the inner type
inside SList named Range). This is incredibly confusing to people
reading the documentation, since Range and range could mean anything.

I guess what I want is the why's for everything, why do things return
certain values? Why do certain methods only accept data that was
pulled from the container in the first place? Simply saying that some
generally returns some value doesn't help, I need to know the whys
behind it. Is is because the common use case is that, is it because it
was the easiest way to do it, is it because you just randomly decided
to because you felt that void was a cop-out?

I just feel that the documentation could do with some work in this
regard, but I wasn't there when this was being designed, I don't know
the whys behind the decisions. If I point out where I think the
documentation is lacking, simply in a "That is completely useless"
kind of way, could somebody either update the documentation, or
provide me with better explanations.

I have also attached the screenshot you asked for, sorry for the
quality, I don't have the tools on my machine to deal with images
properly right now.

--
James Miller

Mar 27 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, March 28, 2012 16:40:56 James Miller wrote:
 On 28 March 2012 15:28, Andrei Alexandrescu
 
 <SeeWebsiteForEmail erdani.org> wrote:
 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?

 
 Ranges are not internal to containers, they are the way containers are
 manipulated.

 
 But there is a Range struct inside each container, it implements the
 Range interface, but as far as I am aware, it is its own type, and
 therefore anything that explicitly expects a Range is going to
 expecting its own internal type. From what I can tell, the internal
 Range struct on each container is just the underlying storage
 mechanism for that container, and you can even have more than one
 defined internal Range for a container. The issue comes about that
 several methods want that type. Now there may be some voodoo going on
 that I don't understand, but it seems to me that I can't do some thing
 like:
 
     SList!int s = [1,2,3,4,5,6,7];
     s.linearRemove([4]) // Arrays are ranges
 
 To remove a certain value, because [4] is not a valid type, despite
 the fact that isForwardRange!(int[]) returns true. This is because
 although int[] is a range, it is not a Range (as in the inner type
 inside SList named Range). This is incredibly confusing to people
 reading the documentation, since Range and range could mean anything.
 
 I guess what I want is the why's for everything, why do things return
 certain values? Why do certain methods only accept data that was
 pulled from the container in the first place? Simply saying that some
 generally returns some value doesn't help, I need to know the whys
 behind it. Is is because the common use case is that, is it because it
 was the easiest way to do it, is it because you just randomly decided
 to because you felt that void was a cop-out?
 
 I just feel that the documentation could do with some work in this
 regard, but I wasn't there when this was being designed, I don't know
 the whys behind the decisions. If I point out where I think the
 documentation is lacking, simply in a "That is completely useless"
 kind of way, could somebody either update the documentation, or
 provide me with better explanations.
 
 I have also attached the screenshot you asked for, sorry for the
 quality, I don't have the tools on my machine to deal with images
 properly right now.

It's like iterators in C++. Each container has its own, and that's the type 
that you use to iterate over and manipulate the elements of a container. It's 
just that instead of using iterators, std.container is using ranges.

As to why some functions require ranges from the container specifically, it's 
for the same reason that many functions on STL containers require iterators 
from the container specifically: they need to operate on those exact elements, 
not the values of those elements or some other iterator or range which points 
to the same values. They must operate on those exact elements.

std.container still needs some work to be sure (parts of it are going to be 
redesigned to work with the custom allocators which are currently being worked 
on), and the documentation probably does need some work, but much of how it 
works is pretty much the same as how the containers work in C++.

- Jonathan M Davis

Mar 27 2012

D Programming

C/C++ Programming

Other

digitalmars.D - std.containers - WAT