www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - std.containers - WAT

reply James Miller <james aatch.net> writes:
In helping someone in D.learn, I ended up looking through the
documentation and code for std.containers.

There is a lot wrong with whats going on there right now.

For a start, the documentation is pretty unclear, The table for the
container primitives is poorly explained, and the complexity columns
wraps on most screen sizes, making understanding them a nightmare. The
explanations for what things are, and what they do is pretty unclear,
and since its not completely standardized over the different
containers, you end up with some truly mindboggling documentation. A
good example is this: "c.removeAny()	- Removes some element from c and
returns it.", what some element at random? Or does it take an argument
that isn't show like what is implied by this: "c.stableRemoveAny(v) -
Same as c.removeAny(v), but is guaranteed to not invalidate any
iterators.", but that doesn't seem right because SList has "T
removeAny() - Picks one value from the front of the container, removes
it from the container, and returns it." which doesn't take a value.
This is endless confusing, and I ended up consulting the unittests in
order to understand what the hell was going on.

Then there are the strange input and output types/value for the
arguments. For example, why does SList.linearRemove return an empty
Range on the Take!Range overload, and an empty Range on the Range
overload? In fact, why are any of the functions accepting and
returning /Ranges/ which are internal types specific to the container?
Why don't they all just take a type that can be converted to a range?
As is stands, you end up with ridiculous requirements like:

    //Remove a single element from an SList, given its "index"
    size_t index = 5;
    SList!int s = [0,1,2,3,4,5,6,7,8,9];
    auto r = s[0..index-1];
    auto r1 = take(r, 1);
    s.linearRemove(r1);
    assert(equal(s, [0,1,2,3,5,6,7,8,9]);

Which could be replaced with an overload for linearRemove that looks like this:

    /** Removes the element at index, return false if the element
could not be removed **/
    bool linearRemove(size_t index){...}

Almost all of the "primitives" seem to be based around adding or
removing from the front or back of the container, but there's no
random-access removal. And since the behaviour is Container specific,
its impossible to figure out what a function does anyway! For example,
SList's linearRemove(Range) just removes the front elements from the
container, not obvious.

Is there any reason why things are the way they are? Because all I see
are people getting exited about ranges and operator overloading and
not really thinking about what a useful set of features any given
container should have.

--
James Miller

P.S. I'm happy to help out improving the documentation, but I have no
idea what these things are supposed to do in the first place...
Mar 27 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/27/12 7:46 PM, James Miller wrote:
 For a start, the documentation is pretty unclear, The table for the
 container primitives is poorly explained, and the complexity columns
 wraps on most screen sizes, making understanding them a nightmare.
I plan to redo that table. A screenshot would help - thanks!
 The
 explanations for what things are, and what they do is pretty unclear,
 and since its not completely standardized over the different
 containers, you end up with some truly mindboggling documentation. A
 good example is this: "c.removeAny()	- Removes some element from c and
 returns it.", what some element at random?
Actually removeAny is one of the better ones. It just removes an element from the container that is most convenient to the container.
 Or does it take an argument
 that isn't show like what is implied by this: "c.stableRemoveAny(v) -
 Same as c.removeAny(v), but is guaranteed to not invalidate any
 iterators.", but that doesn't seem right because SList has "T
 removeAny() - Picks one value from the front of the container, removes
 it from the container, and returns it." which doesn't take a value.
 This is endless confusing, and I ended up consulting the unittests in
 order to understand what the hell was going on.
That's a mistake in the documentation, stableRemoveAny does not take a value. I just fixed it in https://github.com/D-Programming-Language/phobos/commit/9d7168be5fc91da74231ab86d989ac6013280830.
 Then there are the strange input and output types/value for the
 arguments. For example, why does SList.linearRemove return an empty
 Range on the Take!Range overload, and an empty Range on the Range
 overload?
It's simple - generally removeRange returns the portion after the deleted part. The structure of SList dictates that removing a range means removing through the end. Removing a Take!Range naturally will return some potentially nonempty range.
 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?
Ranges are not internal to containers, they are the way containers are manipulated.
 Why don't they all just take a type that can be converted to a range?
I don't see the advantage of doing so.
 As is stands, you end up with ridiculous requirements like:

      //Remove a single element from an SList, given its "index"
      size_t index = 5;
      SList!int s = [0,1,2,3,4,5,6,7,8,9];
      auto r = s[0..index-1];
      auto r1 = take(r, 1);
      s.linearRemove(r1);
      assert(equal(s, [0,1,2,3,5,6,7,8,9]);

 Which could be replaced with an overload for linearRemove that looks like this:

      /** Removes the element at index, return false if the element
 could not be removed **/
      bool linearRemove(size_t index){...}
We can add overloads that use index-based operations, but generally you already have a range and would like to do a positioned delete. If you find yourself doing a lot of indexed operations in a SList, you chose the wrong containers.
 Almost all of the "primitives" seem to be based around adding or
 removing from the front or back of the container, but there's no
 random-access removal.
Why the quotes? More containers are likely to perform removal at an end efficiently than randomly positioned.
 And since the behaviour is Container specific,
 its impossible to figure out what a function does anyway! For example,
 SList's linearRemove(Range) just removes the front elements from the
 container, not obvious.
I'm not sure I get this. Might be a bug somewhere. The linearRemove primitive is supposed to do the same for all containers - remove a range in linear time.
 Is there any reason why things are the way they are? Because all I see
 are people getting exited about ranges and operator overloading and
 not really thinking about what a useful set of features any given
 container should have.
Proposals for improving things are always welcome.
 P.S. I'm happy to help out improving the documentation, but I have no
 idea what these things are supposed to do in the first place...
Just ask. Andrei
Mar 27 2012
next sibling parent James Miller <james aatch.net> writes:
On 28 March 2012 15:28, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?
Ranges are not internal to containers, they are the way containers are manipulated.
But there is a Range struct inside each container, it implements the Range interface, but as far as I am aware, it is its own type, and therefore anything that explicitly expects a Range is going to expecting its own internal type. From what I can tell, the internal Range struct on each container is just the underlying storage mechanism for that container, and you can even have more than one defined internal Range for a container. The issue comes about that several methods want that type. Now there may be some voodoo going on that I don't understand, but it seems to me that I can't do some thing like: SList!int s = [1,2,3,4,5,6,7]; s.linearRemove([4]) // Arrays are ranges To remove a certain value, because [4] is not a valid type, despite the fact that isForwardRange!(int[]) returns true. This is because although int[] is a range, it is not a Range (as in the inner type inside SList named Range). This is incredibly confusing to people reading the documentation, since Range and range could mean anything. I guess what I want is the why's for everything, why do things return certain values? Why do certain methods only accept data that was pulled from the container in the first place? Simply saying that some generally returns some value doesn't help, I need to know the whys behind it. Is is because the common use case is that, is it because it was the easiest way to do it, is it because you just randomly decided to because you felt that void was a cop-out? I just feel that the documentation could do with some work in this regard, but I wasn't there when this was being designed, I don't know the whys behind the decisions. If I point out where I think the documentation is lacking, simply in a "That is completely useless" kind of way, could somebody either update the documentation, or provide me with better explanations. I have also attached the screenshot you asked for, sorry for the quality, I don't have the tools on my machine to deal with images properly right now. -- James Miller
Mar 27 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, March 28, 2012 16:40:56 James Miller wrote:
 On 28 March 2012 15:28, Andrei Alexandrescu
 
 <SeeWebsiteForEmail erdani.org> wrote:
 In fact, why are any of the functions accepting and
 returning /Ranges/ which are internal types specific to the container?
Ranges are not internal to containers, they are the way containers are manipulated.
But there is a Range struct inside each container, it implements the Range interface, but as far as I am aware, it is its own type, and therefore anything that explicitly expects a Range is going to expecting its own internal type. From what I can tell, the internal Range struct on each container is just the underlying storage mechanism for that container, and you can even have more than one defined internal Range for a container. The issue comes about that several methods want that type. Now there may be some voodoo going on that I don't understand, but it seems to me that I can't do some thing like: SList!int s = [1,2,3,4,5,6,7]; s.linearRemove([4]) // Arrays are ranges To remove a certain value, because [4] is not a valid type, despite the fact that isForwardRange!(int[]) returns true. This is because although int[] is a range, it is not a Range (as in the inner type inside SList named Range). This is incredibly confusing to people reading the documentation, since Range and range could mean anything. I guess what I want is the why's for everything, why do things return certain values? Why do certain methods only accept data that was pulled from the container in the first place? Simply saying that some generally returns some value doesn't help, I need to know the whys behind it. Is is because the common use case is that, is it because it was the easiest way to do it, is it because you just randomly decided to because you felt that void was a cop-out? I just feel that the documentation could do with some work in this regard, but I wasn't there when this was being designed, I don't know the whys behind the decisions. If I point out where I think the documentation is lacking, simply in a "That is completely useless" kind of way, could somebody either update the documentation, or provide me with better explanations. I have also attached the screenshot you asked for, sorry for the quality, I don't have the tools on my machine to deal with images properly right now.
It's like iterators in C++. Each container has its own, and that's the type that you use to iterate over and manipulate the elements of a container. It's just that instead of using iterators, std.container is using ranges. As to why some functions require ranges from the container specifically, it's for the same reason that many functions on STL containers require iterators from the container specifically: they need to operate on those exact elements, not the values of those elements or some other iterator or range which points to the same values. They must operate on those exact elements. std.container still needs some work to be sure (parts of it are going to be redesigned to work with the custom allocators which are currently being worked on), and the documentation probably does need some work, but much of how it works is pretty much the same as how the containers work in C++. - Jonathan M Davis
Mar 27 2012