digitalmars.D - Proposal for "Classic DList" (CDList)

monarch_dodra (157/157) Nov 06 2012 I'd like to make a proposal for a "Classic" DList.

monarch_dodra (3/4) Nov 06 2012 Still under construction of course. Still writting up docs and

Jonathan M Davis (19/24) Nov 06 2012 Given some of the issues that std.container has, I'd honestly argue for ...

"monarch_dodra" <monarchdodra gmail.com> writes:

I'd like to make a proposal for a "Classic" DList.

The MAIN reason I think the current one is inadequate is because 
of its current semantics, as discussed here:
http://forum.dlang.org/thread/gjhclwsuqyhrimdeoaec forum.dlang.org

And documented here:
https://github.com/D-Programming-Language/phobos/pull/910

Basically, a DList is not really a container. It is a view into a 
chain of nodes. All the DLists share the same chain, but each 
DList may have a different view of the chain. This is a really 
strange semantic I've never seen anywhere else, and quite 
franckly, I'm not even sure this was the original intent 
anyways...

So I set up to write CDList (Classic DList). The 6 points that 
distinguish it from DList are:
1. Explicit Reference Semantics.
2. Optional deterministic memory model.
3. Completeness of implementation.
4. Assertive invalidation.
5. Splice (!)

To be perfectly fair, 3 & 4 can be built into our current model. 
splice too, but with more difficulty.

I'll elaborate on each point:

1. Explicit Reference Semantics.
We explicitly state that ranges point to a payload, that can be 
shared, and need to be initialized. I don't believe in lazy 
initialization, hence the "explicit". The bonus here though is 
that we can compare and assign a CDlist to null. For example:

--------
     CDList!int a;
     assert(a == null);
     CDList!int b;
     assert(a == b);
     assert(a is b);

     a = b;
     assert(a == b);
     assert(a is b);

     auto c = CDList!int();
     assert(c != null);
     assert(a != c);
     assert(a !is c);

     a = CDList!int();
     assert(a == c);
     assert(a !is c);

     a = null; //release a
--------

Long story short: Class semantics, but without the polymorphism.

2. Optional deterministic memory model.
This is the money argument. I've inserted an optional template 
parameter that defines if allocation should be done by the GC, or 
manually. They have their pros and cons:
-GC: + Direct access to node content references (front and back 
return by ref).
-GC: + references to nodes are never invalidated (although ranges 
can be).
-Malloc: + Nodes are released as soon as they are removed from 
the range.
-Malloc: - No reference access to the node contents
-Malloc: - "Remove" and "~this()" are not BIGOH(1) anymore (since 
it triggers the destruction of the nodes).

Here is an example of the Malloc model in action:
--------
     struct Per//son
     {
         int* p;
         this(int i){p = [i].ptr; ++pop;}
         this(this){p = [*p].ptr; ++pop;}
         ~this(){if(p){p = null;--pop;}}
     }
     alias CDList!(Per, CDListGCAllocation.no) Peoples;
     {
         Peoples.Range rr;
         {
             auto a = Peoples(Per(1), Per(5), Per(6));
             auto b = Peoples(Per(3), Per(4), Per(9));
             auto c = Peoples();
             assert(pop == 6);

             a.removeBack();
             assert(pop == 5);

             auto r = a[].take(2);
             a.spliceAfter(r, b[].take(2));
             assert(pop == 5);

             b.clear();
             assert(pop == 4);

             rr = c.splice(a[].take(2));
         }
         assert(pop == 2); //still held by rr
     }
     assert(pop == 0);
--------

With a little before taste of "splice" :)
As you can see, each person gets destroyed as soon as their node 
leaves the scope. Interesting note: An extracted range will 
single handedly prevent the destruction of a CDList's content, so 
the behavior is safe too.

3. Completeness of implementation.
This just means that *any* function that takes a "Range" as an 
argument will also work with a "Take!Range".
Also, *any* function that takes "stuff" will accept stuff as:
-a value,
-an indiscriminate list of values
-a range.

So for example: "d.insertAfter(d[].take(3), 1, 2, 3)" is 
perfectly legal (and super convenient).

In all fairness, this could be implemented back into our DList.

4. Assertive invalidation.
You'd be amazed the shit DList gets away with. You know all that 
"stableRemove" it has? Yeah, not stable at all. What's more, when 
you *do* invalidate a range, chances are you'll never even notice 
it. My implementation invalidates the shit out each and every 
nodes individually (in version(assert) or course), and catches 
any invalid code almost instantaneously. That may not sound like 
much, but manipulating lists is tricky business.

5. Splice (!)

That's right! What's a DList without a splice? I think I've 
antroduced a semantic that would be to Andrei's taste: 
"spliceFront", "spliceBack", "spliceBefore" and "spliceAfter". 
Simple as that. The ricky business with splice though is that the 
input range will not really be invalidated, but WILL change 
owner. Because of this, "spliceXXX" will:
"return a range spanning the spliced nodes, that is owned by the 
current range". Here is a simple example:
--------
     auto a = IntList(1, 2, 6);
     auto b = IntList(3, 4, 5);
     auto r = a[]; //r is owned by a
     auto r = b.splice(r); //r is now owned by b;
--------

Arguments can, Of course, be either a Range or a Take!Range. 
Operations are o(1) for Range. In any case, nodes are never 
constructed of course.

here it is in more exciteing action:
--------
     auto a = IntList(1, 2, 5);
     auto b = IntList(3, 4, 9);
     auto r = a.spliceAfter(a[].take(2), b[].take(2));
     assert(a[].equal([1, 2, 3, 4, 5]));
     assert(b[].equal([9]));
     assert(r.equal([3, 4]));
--------

Self splice is, of course, possible. I love the expressiveness of 
this one liner:
--------
     auto a = IntList(3, 1, 2, 4);
     a.spliceFront(a[].drop(1).take(2));
     assert(a[].equal([1, 2, 3, 4]));
--------

-------------------------------------------------
The implementation is in this here:
https://github.com/monarchdodra/phobos/commits/DList2/std/container.d
https://github.com/monarchdodra/phobos/commit/2292071a862ecb0b1f9b77d1413c5d9b9c7b24e1#std/container.d

What do you think? I'm not sure where to take this: I can't just 
change DList into CDList, because, well, breakage. If we decide 
to change it to reference semantics, but with lazy 
initialization, it can be done though...

Also, container duplication is a bad idea. Still as it stands, 
DList is an EXTREMELLY tricky container to use...

So what to do?

Nov 06 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 6 November 2012 at 12:18:47 UTC, monarch_dodra wrote:
 [SNIP]

Still under construction of course. Still writting up docs and 
stuff.

Nov 06 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, November 06, 2012 13:22:10 monarch_dodra wrote:
 On Tuesday, 6 November 2012 at 12:18:47 UTC, monarch_dodra wrote:
 [SNIP]

 
 Still under construction of course. Still writting up docs and
 stuff.

Given some of the issues that std.container has, I'd honestly argue for just 
outright changing the containers in there if we need to, or at least making 
names like CDList temporary, and replace DList with it later rather than 
keeping the worse names around permanently.

One thing that came up recently in D.Learn ( 
http://forum.dlang.org/thread/lozpofrboxsfybmhkhch forum.dlang.org ) is the 
fact that SList and DList make the unfortunate choice of having their insert* 
functions return the number of items inserted rather than a range starting at 
those elements like C++ would have done (std::list's insert* functions return 
iterators). In general, it's a royal pain to get at ranges to anything other 
than the whole list, and I think that while the basic API of std.container is 
good, a number of the functions need some tweaks if they're really going to be 
range-friendly. AFAIK, it's the first set of purely range-based containers, and 
I think that the result is still very rough around the edges due to a lack of 
experience with using ranges with containers instead of iterators.

std.container is a good start, but it still needs work. So, I think that it's 
great if you want to spend time trying to iron out some of the kinks in it.

- Jonathan M Davis

Nov 06 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Proposal for "Classic DList" (CDList)