digitalmars.D - We need to rethink remove in std.container

Jonathan M Davis (54/54) Feb 21 2011 Okay, removing elements from a container sucks right now. You can do stu...

%u (16/17) Feb 21 2011 container that it's on. That makes sense. It's obviously not going

Jonathan M Davis (30/51) Feb 21 2011 Removing a range is not necessarily related to removing an element with ...

%u (9/10) Feb 21 2011 add a primitive to forward ranges that returns the first n elements of
%u (7/7) Feb 21 2011 You know, I'm actually now questioning whether STL did the right thing

Jonathan M Davis (21/28) Feb 21 2011 Some of them do take indices, but you can't generally operate on indices...

%u (14/14) Feb 21 2011 Hm... so the entire issue here seems to be the capability to do

Jonathan M Davis (14/29) Feb 21 2011 _Of course_, it's worth it. Do you never alter anything in a range? Func...

Andrei Alexandrescu (4/10) Feb 22 2011 This is the exact insight that made me recognize take() is an essential

Denis Koroskin (55/142) Feb 22 2011 I believe remove operation is better expressed in terms of a Cursor

Steven Schveighoffer (27/52) Feb 22 2011 This is exactly how dcollections works.

Andrei Alexandrescu (28/31) Feb 22 2011 At this exact point I had what would be either a good idea or a brainfar...

Steven Schveighoffer (20/51) Feb 22 2011 This is almost the same as a dcollections cursor. The only difference i...

Andrei Alexandrescu (100/155) Feb 23 2011 I thought about this overnight and reached the same conclusion. The

Steven Schveighoffer (59/87) Feb 23 2011 I don't think this is possible. When passing a range to a container for...

Andrei Alexandrescu (16/93) Feb 23 2011 That's arguably correct and in any case better than supporting a range

Steven Schveighoffer (43/75) Feb 23 2011 But what about takeOne(retro(myrange))? If support for that was desired...

Philippe Sigaud (18/31) Feb 22 2011 n _element; }

Andrei Alexandrescu (16/42) Feb 22 2011 auto v = findFirst(range, needle);

Steven Schveighoffer (6/11) Feb 22 2011 s/insert/put

Philippe Sigaud (5/8) Feb 22 2011 I mean that, in the same way that a container can expose its elements

Steven Schveighoffer (9/21) Feb 23 2011 Oh, that's easy. Any container range that allows setting front elements...

Steven Schveighoffer (7/12) Feb 22 2011 Actually, I haven't used dcollections (or much d code) in so long, I

Lutger Blijdestijn (15/41) Feb 22 2011 I don't understand why not. Given, as you said, that a lot of functions ...

Jonathan M Davis (12/57) Feb 22 2011 It's horribly inefficient for one. Also, I'm not sure that the range is ...

Lutger Blijdestijn (13/76) Feb 22 2011 It's a matter of range (in)validation right? I admit at this point I'm n...

Jonathan M Davis (24/104) Feb 22 2011 It depends on the implementation. It's possible that iterating an iterat...

Lutger Blijdestijn (11/122) Feb 22 2011 I'm fairly certain this is guaranteed, but specifically tied to std::map...

Andrei Alexandrescu (6/12) Feb 22 2011 I don't think removeAny() works for his case because he wants to remove

Lutger Blijdestijn (9/24) Feb 22 2011 I was looking for remove(ElementType value) but this function is not lis...

Andrei Alexandrescu (18/39) Feb 22 2011 It's good that positioned remove does not work with findSplit. It would

Steven Schveighoffer (9/66) Feb 22 2011 This should be possible with equalRange.
Jonathan M Davis (31/78) Feb 22 2011 Well, if RedBlackTree's remove gets changed to take the return values fr...
Jonathan M Davis (4/52) Feb 23 2011 I've made several improvements to RedBlackTree and created a pull reques...

Steven Schveighoffer (6/9) Feb 23 2011 I'll be sure to look at this soon. I really need to ramp up my git

Jonathan M Davis (4/15) Feb 23 2011 Yeah. It's hiding under couch, but sneaky. So, if you go looking, odds a...

Steven Schveighoffer (24/62) Feb 22 2011 RedBlackTree supports the equalRange function, which gives you a range o...

Lutger Blijdestijn (3/44) Feb 22 2011 oo how I missed that. When you are working on RedBlackTree, would you pl...

Steven Schveighoffer (7/15) Feb 22 2011 I'm sorry you missed it, better docs is one of the planned improvements....

spir (10/12) Feb 22 2011 If this means a generalisation of what D's builtin 'in' provides, then I...

Jonathan M Davis <jmdavisProg gmx.com> writes:

Okay, removing elements from a container sucks right now. You can do stuff like 
removeAny (generally pretty useless IMHO) or removeFront just fine, but
removing 
an arbitrary range from a container just plain sucks.

remove takes a range type which is the range type for the container that it's 
on. That makes sense. It's obviously not going to be able to a take an
arbitrary 
range and remove that from itself. How would it know which elements in that 
range corresponded to which elements in itself - especially when it could be a 
range which skips elements or something similar? So, _that_ part makes sense.

But have you actually tried to get a range of the appropriate type to remove 
from a container? It seems like almost ever function in std.range and 
std.algorithm returns a new range type, making them completely useless for 
processing a range to be removed from a container.

I was looking to remove a single element for a RedBlackTree. The best function 
that I could think to get the proper range was findSplit. The middle portion of 
the return value would be the portion that I was looking for. I'd take that and 
pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its 
implementation and the first two portions of the result of findSplit aren't the 
right range type.

So, what do I do? The _only_ thing that I can think of at the moment is to use 
find to locate the beginning of the range that I want, take the length of the 
range with walkLength and then use popBackN to pop of the elements I don't
want. 
e.g.

======
import std.algorithm, std.container, std.conv,
       std.range, std.stdio;

void main()
{
    auto rbt = RedBlackTree!int([0, 2, 5, 12, 59, 22]);
    assert(to!string(rbt[]) == "[0, 2, 5, 12, 22, 59]");

    auto found = find(rbt[], 5);
    assert(to!string(found) == "[5, 12, 22, 59]");

    popBackN(found, walkLength(found) - 1);
    assert(to!string(found) == "[5]");

    rbt.remove(found);
    assert(to!string(rbt[]) == "[0, 2, 12, 22, 59]");
}
======

That's disgusting. All that just to remove one element? And what if the range 
isn't bidirectional (e.g. SList)? Well, then you have no popBack, and as far as 
I can tell, you're screwed, since you can't use either take or takeExactly, 
because both of them return new range types.

In particular, the fact that you can't take a range and create a new one of the 
same type from its first n elements is highly problematic. Maybe we need to add
a 
function to ForwardRange which returns a new range with the first n elements
(it 
certainly looks like that's the key element that's missing here). I don't know 
if that would be reasonable, but the fact that you can't create a range of the 
same type as the original range when taking just its first n elements seems 
crippling in this situation.

I don't know what the proper solution to this is, but the current situation 
strikes me as untenable. I had to think through this problem for a while before 
I came to a solution that came even close to working, let alone get one that 
actually works. Removing elements from a container should _not_ be this hard. 
The situation with remove _needs_ to be improved.

- Jonathan M Davis

Feb 21 2011

%u <wfunction hotmail.com> writes:

 remove takes a range type which is the range type for the

container that it's on. That makes sense. It's obviously not going
to be able to a take an arbitrary range and remove that from itself.
How would it know which elements in that range corresponded to which
elements in itself - especially when it could be a range which skips
elements or something similar? So, _that_ part makes sense.

I'm wondering, what was the reason for making a "remove range X from
Y" method in the first place? To me, that's not a "remove"
operation; if anything, it's the "removeAll" operation, but I'd
actually call it "subtract" (because it's the set subtraction
operation... although there can be duplicates).

If we change remove to only take in a single element, wouldn't its
implementation then be trivial? (For a range, just return a filter
that removes the element. For a container, actually remove the
element. For ranges that also behave like containers -- like arrays
-- I have no idea.)

Does this sound like a good idea?

Feb 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 21 February 2011 21:04:47 %u wrote:
 remove takes a range type which is the range type for the

 
 container that it's on. That makes sense. It's obviously not going
 to be able to a take an arbitrary range and remove that from itself.
 How would it know which elements in that range corresponded to which
 elements in itself - especially when it could be a range which skips
 elements or something similar? So, _that_ part makes sense.
 
 I'm wondering, what was the reason for making a "remove range X from
 Y" method in the first place? To me, that's not a "remove"
 operation; if anything, it's the "removeAll" operation, but I'd
 actually call it "subtract" (because it's the set subtraction
 operation... although there can be duplicates).
 
 If we change remove to only take in a single element, wouldn't its
 implementation then be trivial? (For a range, just return a filter
 that removes the element. For a container, actually remove the
 element. For ranges that also behave like containers -- like arrays
 -- I have no idea.)
 
 Does this sound like a good idea?

Removing a range is not necessarily related to removing an element with a 
particular value. Sure, a method could be added to the various container types 
which takes a value and removes the first instance of that value from the 
container, but that's not what remove is necessarily trying to do. It's like 
erase in C++'s standard library which takes two iterators. There are plenty of 
situations where removing a range makes sense. The odd bit with remove in
Phobos 
vs erase in the STL is that with iterators, you can give a single iterator to 
indicate a single element whereas you can't do that with a range. So, remove 
takes a range, and if you want to remove a single element, that range only has 
one element in it. It's a bit weird, but that's fine.

The typical way to remove an element in the STL is to use find to find an
element 
and then erase to remove it. remove in std.container is doing the same thing. 
The problem is that you can't give the result of find to remove, because
instead 
of a single iterator, find gives you a whole range, and you probably don't want 
to remove that whole range. You generally either want to remove the first
element 
or some set of elements at the front of the range. So, you need to able to 
remove the other elements from the range so that you can give that range to the 
remove method on the container. That's fine in principle, but since so many of 
the range-based algorithms give you a new type back and you need the exact type 
that that container uses for its ranges, it's very hard to get a range of the 
correct type with the correct elements in it.

So, while find will give you the beginning of the range you want, there's not
an 
easy way to remove the end of that range. Functions like take and takeExactly 
return new types, so they don't work. The more I look at it, the more I'm 
convinced that we really need to add a primitive to forward ranges that returns 
the first n elements of that range. Without that, I don't see how you can get a 
range of the correct type with only those elements in it unless it's also a 
bidirectional range, which some ranges (like SList's range) aren't.

- Jonathan M Davis

Feb 21 2011

%u <wfunction hotmail.com> writes:

 The more I look at it, the more I'm convinced that we really need to

add a primitive to forward ranges that returns the first n elements of
that range. Without that, I don't see how you can get a range of the
correct type with only those elements in it unless it's also a
bidirectional range, which some ranges (like SList's range) aren't.

Huh, yeah I think I agree. It seems like ranges are just delayed
evaluation, and that, at some point, we need to be able to force their
evaluation -- otherwise, it's like combining Scheme's (delay X) and
(force Y) operations with mutation, which just doesn't work sensibly
in non-functional programming.

Feb 21 2011

%u <wfunction hotmail.com> writes:

You know, I'm actually now questioning whether STL did the right thing
with requiring iterators for the erase() method. It actually seems
quite pointless -- there's no reason why integer indices wouldn't
work. I don't think we really need to follow suit with STL here... is
there some situation with erase() that I'm missing that can't be
covered with plain integer (or rather, size_t) indices, and that
requires ranges?

Feb 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 21 February 2011 21:51:51 %u wrote:
 You know, I'm actually now questioning whether STL did the right thing
 with requiring iterators for the erase() method. It actually seems
 quite pointless -- there's no reason why integer indices wouldn't
 work. I don't think we really need to follow suit with STL here... is
 there some situation with erase() that I'm missing that can't be
 covered with plain integer (or rather, size_t) indices, and that
 requires ranges?

Some of them do take indices, but you can't generally operate on indices. To do 
that, you need the original container. But you can operate on an iterate just 
fine. So, it's generally easier to deal with iterators.

However, the big thing would be that it actually doesn't make sense to use 
indices in many cases. Think about RedBlackTree for a moment. If you remove 
something from it or add something to it, that doesn't invalidate any range
that 
you have which points to it (assuming that the end points haven't changed). You 
can still take that range and use it any any algorithm that you want -
including 
remove (assuming that the algorithm takes the appropriate kind of range of 
course), and it will work. But using indices wouldn't work. They would have 
changed. Adding or removing anything to a container at or before an index that 
you have would make it so that that index no longer points to the same element. 
However, for many containers (though not all), any iterators or ranges would 
still be valid.

So, on the whole, using iterators or ranges makes great sense. I have no
problem 
with how the STL does that. The problem is transitioning that to ranges. The
STL 
doesn't go and change your iterator types on you when you feed them to 
algorithms, but std.range and std.algorithm typically do. So, the result is sub-
optimal in this case, to say the least.

- Jonathan M Davis

Feb 21 2011

%u <wfunction hotmail.com> writes:

Hm... so the entire issue here seems to be the capability to do
iteration and modification in a concurrent manner, right?

IMHO that may not be worth the costs we're paying -- I would argue
that you normally shouldn't be modifying a collection that you're
iterating over in the first place; it just doesn't make any sense
when you're delaying everything. Sure, it would work fine if you
couldn't have more than one reference to any container (since every
range would would then have the same view of the container), but
when we're introducing delayed evaluation with ranges, I just don't
see how (or even why) it should be combinable with modification.
It's like trying to read from a database and concurrently modifying
the underlying storage by hand, bypassing the database... it just
doesn't work that way, outside of functional programming. So is it
really worth paying the price?

Feb 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 21 February 2011 22:51:25 %u wrote:
 Hm... so the entire issue here seems to be the capability to do
 iteration and modification in a concurrent manner, right?
 
 IMHO that may not be worth the costs we're paying -- I would argue
 that you normally shouldn't be modifying a collection that you're
 iterating over in the first place; it just doesn't make any sense
 when you're delaying everything. Sure, it would work fine if you
 couldn't have more than one reference to any container (since every
 range would would then have the same view of the container), but
 when we're introducing delayed evaluation with ranges, I just don't
 see how (or even why) it should be combinable with modification.
 It's like trying to read from a database and concurrently modifying
 the underlying storage by hand, bypassing the database... it just
 doesn't work that way, outside of functional programming. So is it
 really worth paying the price?

_Of course_, it's worth it. Do you never alter anything in a range? Functions 
like sort need to be able to alter the elements that the range refers to. Not
to 
mention, there's generally no other way to alter elements in a container other 
than through a range to it unless you remove them and add new ones to replace 
them. The one exception would be if the container is random access, and then
you 
can use the subscript operator to get at its elements. Ranges definitely need
to 
be able to alter the elements that they contain. Sure, there are plenty of
times 
when you don't need to do that, but there are times when you definitely do.

And really, the problem here isn't really the concept of ranges - it's their 
implementation. The fact that you can't get the right type of range with the 
right elements in it is the problem. Doing something like building the take 
functionality into forward ranges would fix that problem.

- Jonathan M Davis

Feb 21 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/21/11 11:27 PM, Jonathan M Davis wrote:
 The typical way to remove an element in the STL is to use find to find an
element
 and then erase to remove it. remove in std.container is doing the same thing.
 The problem is that you can't give the result of find to remove, because
instead
 of a single iterator, find gives you a whole range, and you probably don't want
 to remove that whole range. You generally either want to remove the first
element
 or some set of elements at the front of the range.

This is the exact insight that made me recognize take() is an essential 
range type.

Andrei

Feb 22 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Tue, 22 Feb 2011 05:55:20 +0300, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 Okay, removing elements from a container sucks right now. You can do  
 stuff like
 removeAny (generally pretty useless IMHO) or removeFront just fine, but  
 removing
 an arbitrary range from a container just plain sucks.

 remove takes a range type which is the range type for the container that  
 it's
 on. That makes sense. It's obviously not going to be able to a take an  
 arbitrary
 range and remove that from itself. How would it know which elements in  
 that
 range corresponded to which elements in itself - especially when it  
 could be a
 range which skips elements or something similar? So, _that_ part makes  
 sense.

 But have you actually tried to get a range of the appropriate type to  
 remove
 from a container? It seems like almost ever function in std.range and
 std.algorithm returns a new range type, making them completely useless  
 for
 processing a range to be removed from a container.

 I was looking to remove a single element for a RedBlackTree. The best  
 function
 that I could think to get the proper range was findSplit. The middle  
 portion of
 the return value would be the portion that I was looking for. I'd take  
 that and
 pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit  
 aren't the
 right range type.

 So, what do I do? The _only_ thing that I can think of at the moment is  
 to use
 find to locate the beginning of the range that I want, take the length  
 of the
 range with walkLength and then use popBackN to pop of the elements I  
 don't want.
 e.g.

 ======
 import std.algorithm, std.container, std.conv,
        std.range, std.stdio;

 void main()
 {
     auto rbt = RedBlackTree!int([0, 2, 5, 12, 59, 22]);
     assert(to!string(rbt[]) == "[0, 2, 5, 12, 22, 59]");

     auto found = find(rbt[], 5);
     assert(to!string(found) == "[5, 12, 22, 59]");

     popBackN(found, walkLength(found) - 1);
     assert(to!string(found) == "[5]");

     rbt.remove(found);
     assert(to!string(rbt[]) == "[0, 2, 12, 22, 59]");
 }
 ======

 That's disgusting. All that just to remove one element? And what if the  
 range
 isn't bidirectional (e.g. SList)? Well, then you have no popBack, and as  
 far as
 I can tell, you're screwed, since you can't use either take or  
 takeExactly,
 because both of them return new range types.

 In particular, the fact that you can't take a range and create a new one  
 of the
 same type from its first n elements is highly problematic. Maybe we need  
 to add a
 function to ForwardRange which returns a new range with the first n  
 elements (it
 certainly looks like that's the key element that's missing here). I  
 don't know
 if that would be reasonable, but the fact that you can't create a range  
 of the
 same type as the original range when taking just its first n elements  
 seems
 crippling in this situation.

 I don't know what the proper solution to this is, but the current  
 situation
 strikes me as untenable. I had to think through this problem for a while  
 before
 I came to a solution that came even close to working, let alone get one  
 that
 actually works. Removing elements from a container should _not_ be this  
 hard.
 The situation with remove _needs_ to be improved.

 - Jonathan M Davis

I believe remove operation is better expressed in terms of a Cursor  
(iterator in C++). It also doesn't make much sense (at least of me) for a  
find to return a subrange of the source range. While it does generalize  
well, it's not really what most people expect.

What I'd love to have instead is something like this:

int[] arr = [1, 2, 3, 4, 5];
auto cursor = arr.find(3); // either points to 3 or null

int[] before = arr[0..cursor];
int value = arr[cursor]; // for consistency, C++ uses *cursor
int[] after = arr[arr.advance(cursor)..$];

Note that typeof(cursor) could/should actually be int*, but I don't use  
arr[cursor+1..$] because cursor doesn't know about range bounds, and as  
such it can't advance on it's own.

For dynamic arrays:

V[K] dynamicArray;
V* cursor = key in dynamicArray;
dynamicArray.remove(key); // requires an additional lookup, which is  
sub-optimal
// dynamicArray.remove(cursor); // optimal but doesn't work atm. I think  
it should


I think it's also worth noting that "find" for trees makes even less sense  
(to me). You can't "slice" a tree arbitrarily, yet find tries to do that:

     auto rbt = RedBlackTree!int([0, 2, 5, 12, 59, 22]);
     assert(to!string(rbt[]) == "[0, 2, 5, 12, 22, 59]");

     auto found = find(rbt[], 5);
     assert(to!string(found) == "[5, 12, 22, 59]");

What is the type of "found"? It's clearly not a tree anymore. Don't know  
why, but it's still a range. How is it implemented? Without looking at the  
source I can tell it's implemented as an original tree + a cursor. And I  
believe the two are better be separated.

As a side note, in my own code I have 2 version of remove - remove and  
removeUnstable, which work differently. Simple example:

// swaps current element with last one and pops last
void removeUnstable(T)(ref T[] array, T* cursor)
{
     auto length = array.length;
     assert(length > 0);
     --length;

     T* last = array.ptr + length;
     assert(array.ptr <= cursor && cursor <= last);
     *cursor = *last;

     array.length = length;
}

// Moves elements
T* remove(T)(ref T[] array, T* cursor)
{
     auto length = array.length;
     assert(length > 0);
     --length;
	
     assert(array.ptr <= cursor && cursor <= array.ptr + length);
	
     auto numElementsToMove = length - (cursor - array.ptr);
     memmove(cursor, cursor + 1, numElementsToMove * T.sizeof);

     array.length = length;

     return cursor;
}

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 03:43:25 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 I believe remove operation is better expressed in terms of a Cursor  
 (iterator in C++). It also doesn't make much sense (at least of me) for  
 a find to return a subrange of the source range. While it does  
 generalize well, it's not really what most people expect.

 What I'd love to have instead is something like this:

 int[] arr = [1, 2, 3, 4, 5];
 auto cursor = arr.find(3); // either points to 3 or null

 int[] before = arr[0..cursor];
 int value = arr[cursor]; // for consistency, C++ uses *cursor
 int[] after = arr[arr.advance(cursor)..$];

This is exactly how dcollections works.

 Note that typeof(cursor) could/should actually be int*, but I don't use  
 arr[cursor+1..$] because cursor doesn't know about range bounds, and as  
 such it can't advance on it's own.

Here is how dcollections solves this.

A cursor in dcollections is actually a zero or one element range.  It can  
only reference exactly one element.  Since it has no references to other  
elements, it is immune to operations that use the surrounding elements.   
In contrast, a red black tree range contains references to the begin and  
the end node.  This means operations on these nodes (i.e. removing the end  
node, or inserting an element between the two) can affect the range.

So essentially, your code would look like this in dcollections:

auto arr = new ArrayList!int(1,2,3,4,5);

auto cursor = arr.find(3);

int[] before = arr[0..cursor];
int value = arr[cursor];
cursor.popFront();
int[] after = arr[cursor..arr.end]; // note, $ overrides don't currently  
work, so we can't get the sugar here...

 I think it's also worth noting that "find" for trees makes even less  
 sense (to me). You can't "slice" a tree arbitrarily, yet find tries to  
 do that:

     auto rbt = RedBlackTree!int([0, 2, 5, 12, 59, 22]);
     assert(to!string(rbt[]) == "[0, 2, 5, 12, 22, 59]");

     auto found = find(rbt[], 5);
     assert(to!string(found) == "[5, 12, 22, 59]");

 What is the type of "found"? It's clearly not a tree anymore. Don't know  
 why, but it's still a range. How is it implemented? Without looking at  
 the source I can tell it's implemented as an original tree + a cursor.  
 And I believe the two are better be separated.

Actually, find is probably not the correct thing to use for this.  Use  
upperBound (or lowerBound, I can never remember which one does what) to  
get all the elements >= 5.

The range is not a tree, and this is intentional.  Containers are *not*  
ranges.  Think of a range as a 'view' into a container, but not a  
container itself.  You can't add/remove elements from the tree using a  
range, you can just traverse it.  If it helps, think of the range as a  
pair of iterators.

-Steve

Feb 22 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/22/11 7:58 AM, Steven Schveighoffer wrote:
 A cursor in dcollections is actually a zero or one element range. It can
 only reference exactly one element. Since it has no references to other
 elements, it is immune to operations that use the surrounding elements.

At this exact point I had what would be either a good idea or a brainfart.

I've been thinking for a good while to define a "singleton range", a 
range that has at most one element. I had the intuition that a singleton 
range is a useful concept, but I felt it was a bit tenuous to argue in 
its favor (mostly because one could easily simulate it with repeat(x, 
1)), so I never put it in std.range.

The definition would go like this:

auto singletonRange(T)(T element)
{
     static struct Result
     {
         private T _element;
         private bool _done;
          property bool empty() { return _done; }
          property auto front() { assert(!empty); return _element; }
         void popFront() { assert(!empty); _done = true; }
         auto save() { return this; }
     }

     return Result(element, false);
}

But now I realized that singleton ranges are your cursors, so I'll 
definitely add them. And you also made me realize that a singleton range 
is very different from other ranges in the same way 1 is different from 
other numbers. Great.

This may as well be The Great Unification that was needed to converge 
std.container with dcollections in a harmonious whole.


Andrei

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 09:23:00 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 2/22/11 7:58 AM, Steven Schveighoffer wrote:
 A cursor in dcollections is actually a zero or one element range. It can
 only reference exactly one element. Since it has no references to other
 elements, it is immune to operations that use the surrounding elements.

 At this exact point I had what would be either a good idea or a  
 brainfart.

 I've been thinking for a good while to define a "singleton range", a  
 range that has at most one element. I had the intuition that a singleton  
 range is a useful concept, but I felt it was a bit tenuous to argue in  
 its favor (mostly because one could easily simulate it with repeat(x,  
 1)), so I never put it in std.range.

 The definition would go like this:

 auto singletonRange(T)(T element)
 {
      static struct Result
      {
          private T _element;
          private bool _done;
           property bool empty() { return _done; }
           property auto front() { assert(!empty); return _element; }
          void popFront() { assert(!empty); _done = true; }
          auto save() { return this; }
      }

      return Result(element, false);
 }

This is almost the same as a dcollections cursor.  The only difference is,  
popFront in a cursor not only sets the '_done' member, but also moves to  
the next element.  This distinction is necessary with node-based  
containers that have an end marker node.  So the cursor that is returned  
 from 'end()' (or from find where the element was not found, etc.) is an  
already-empty cursor.

I hadn't thought of just setting the empty variable, but I think it  
wouldn't work properly.  You could possibly say "a cursor that points to  
an element but is empty is actually pointing *after* the element."  But  
then I'm not sure how to construct an end cursor.  It's advantageous to  
point at the end of a container, where there is no true node to point at.   
Is there another way to think about this?

 But now I realized that singleton ranges are your cursors, so I'll  
 definitely add them. And you also made me realize that a singleton range  
 is very different from other ranges in the same way 1 is different from  
 other numbers. Great.

Yes, that distinction is really important, I'm glad you see that!

 This may as well be The Great Unification that was needed to converge  
 std.container with dcollections in a harmonious whole.

I'm not sure what to think of this ;)  Is this a "oh what could have been"  
statement or a "let's reopen negotiations" statement?

Note, I would love for dcollections to be mainstream Phobos, but I have  
accepted the current reality and am fine with the way things are now too.

-Steve

Feb 22 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/22/11 8:41 AM, Steven Schveighoffer wrote:
 On Tue, 22 Feb 2011 09:23:00 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 2/22/11 7:58 AM, Steven Schveighoffer wrote:
 A cursor in dcollections is actually a zero or one element range. It can
 only reference exactly one element. Since it has no references to other
 elements, it is immune to operations that use the surrounding elements.

 At this exact point I had what would be either a good idea or a
 brainfart.

 I've been thinking for a good while to define a "singleton range", a
 range that has at most one element. I had the intuition that a
 singleton range is a useful concept, but I felt it was a bit tenuous
 to argue in its favor (mostly because one could easily simulate it
 with repeat(x, 1)), so I never put it in std.range.

 The definition would go like this:

 auto singletonRange(T)(T element)
 {
 static struct Result
 {
 private T _element;
 private bool _done;
  property bool empty() { return _done; }
  property auto front() { assert(!empty); return _element; }
 void popFront() { assert(!empty); _done = true; }
 auto save() { return this; }
 }

 return Result(element, false);
 }

 This is almost the same as a dcollections cursor. The only difference
 is, popFront in a cursor not only sets the '_done' member, but also
 moves to the next element. This distinction is necessary with node-based
 containers that have an end marker node. So the cursor that is returned
 from 'end()' (or from find where the element was not found, etc.) is an
 already-empty cursor.

I thought about this overnight and reached the same conclusion. The 
thing is, what's needed is not one element, but one element of a 
specific range. Here's what I have right now in range.d:

/**
Returns a range with at most one element; for example, $(D
takeOne([42, 43, 44])) returns a range consisting of the integer $(D
42). Calling $(D popFront()) off that range renders it empty.

Sometimes an empty range with the same signature is needed. For such
ranges use $(D takeNone!R()). For example:

----
auto s = takeOne([42, 43, 44]);
static assert(isRandomAccessRange!(typeof(s)));
assert(s.length == 1);
assert(!s.empty);
assert(s.front == 42);
s.front() = 43;
assert(s.front == 43);
assert(s.back == 43);
assert(s[0] == 43);
s.popFront();
assert(s.length == 0);
assert(s.empty);
s = takeNone!(int[])();
assert(s.length == 0);
assert(s.empty);
----

In effect $(D takeOne(r)) is equivalent to $(take(r, 1)) and $(D
takeNone(r)) is equivalent to $(D take(r, 0)), but in certain
interfaces it is important to know statically that the range may only
have at most one element.

The type returned by $(D takeOne) and $(D takeNone) is a random-access
range.
  */
auto takeOne(R)(R source) if (isInputRange!R)
{
     static struct Result
     {
         private R _source;
         bool _empty;
          property bool empty() const { return _empty; }
          property auto ref front() { assert(!empty); return 
_source.front; }
         void popFront() { assert(!empty); _empty = true; }
         void popBack() { assert(!empty); _empty = true; }
         auto save() { return Result(_source.save, empty); }
          property auto ref back() { assert(!empty); return _source.front; }
          property size_t length() const { return !empty; }
         auto ref opIndex(size_t n) { assert(n < length); return 
_source.front; }
         auto opSlice(size_t m, size_t n)
         {
             assert(m <= n && n < length);
             return n > m ? this : Result(_source, false);
         }
     }

     return Result(source, source.empty);
}

/// Ditto
auto takeNone(R)() if (isInputRange!R)
{
     return typeof(takeOne(R.init))(R.init, true);
}

unittest
{
     auto s = takeOne([42, 43, 44]);
     static assert(isRandomAccessRange!(typeof(s)));
     assert(s.length == 1);
     assert(!s.empty);
     assert(s.front == 42);
     s.front() = 43;
     assert(s.front == 43);
     assert(s.back == 43);
     assert(s[0] == 43);
     s.popFront();
     assert(s.length == 0);
     assert(s.empty);
     s = takeNone!(int[])();
     assert(s.length == 0);
     assert(s.empty);
}

 I hadn't thought of just setting the empty variable, but I think it
 wouldn't work properly. You could possibly say "a cursor that points to
 an element but is empty is actually pointing *after* the element." But
 then I'm not sure how to construct an end cursor. It's advantageous to
 point at the end of a container, where there is no true node to point
 at. Is there another way to think about this?

Yah, I think the right approach is to base the 0/1 range on an existing 
range, not on an existing value.

 But now I realized that singleton ranges are your cursors, so I'll
 definitely add them. And you also made me realize that a singleton
 range is very different from other ranges in the same way 1 is
 different from other numbers. Great.

 Yes, that distinction is really important, I'm glad you see that!

 This may as well be The Great Unification that was needed to converge
 std.container with dcollections in a harmonious whole.

 I'm not sure what to think of this ;) Is this a "oh what could have
 been" statement or a "let's reopen negotiations" statement?

 Note, I would love for dcollections to be mainstream Phobos, but I have
 accepted the current reality and am fine with the way things are now too.

Absolutely, and your collaboration on e.g. RedBlackTree and others is 
very much appreciated.

The way I was thinking is, there would a lot of convergence if 
dcollections wiped the notion of "cursor" and replaced it with takeOne 
throughout. One obvious objection to the range/cursor design is there 
are two abstractions to deal with instead of one. I had the intuition 
that ranges should be enough, but was not quite able to substantiate it 
fully until just about now. With take, takeExactly, and takeOne, it 
seems that indeed there is only need for the range interface, and that 
particular range types can replace cursors.

I realize that there are other major differences in design between 
dcollections and std.container, probably the largest one being the use 
of a hierarchy vs. free-floating types, so probably full unification is 
still not possible or quite remote. Nevertheless, keeping the 
communication channels open would make both libraries better and all of 
us a bit wiser.


Andrei

Feb 23 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 Feb 2011 08:33:03 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 2/22/11 8:41 AM, Steven Schveighoffer wrote:
 On Tue, 22 Feb 2011 09:23:00 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:


 This may as well be The Great Unification that was needed to converge
 std.container with dcollections in a harmonious whole.

 I'm not sure what to think of this ;) Is this a "oh what could have
 been" statement or a "let's reopen negotiations" statement?

 Note, I would love for dcollections to be mainstream Phobos, but I have
 accepted the current reality and am fine with the way things are now  
 too.

 Absolutely, and your collaboration on e.g. RedBlackTree and others is  
 very much appreciated.

 The way I was thinking is, there would a lot of convergence if  
 dcollections wiped the notion of "cursor" and replaced it with takeOne  
 throughout. One obvious objection to the range/cursor design is there  
 are two abstractions to deal with instead of one. I had the intuition  
 that ranges should be enough, but was not quite able to substantiate it  
 fully until just about now. With take, takeExactly, and takeOne, it  
 seems that indeed there is only need for the range interface, and that  
 particular range types can replace cursors.

I don't think this is possible.  When passing a range to a container for  
removal, the container needs to access the underlying implementation (e.g.  
pointer to tree node) in order to immediately know what element is being  
removed.  The takeOne range you propose does not allow this, you can't get  
at the underlying range.  Even if you could, only takeOne!(myrangetype)  
would work, which leaves us right back at supporting a small number of  
specific ranges.

In addition, a cursor points to one element, a range points to multiple  
elements.  *conceptually* takeOne is the same, but it still stores the  
range internally, so it's actually more expensive than a range to pass  
around (additional _empty member).

I was thinking more along the lines of ranges being able to generate  
cursors if it makes sense (i.e. base range is a range of the container).   
Dcollections already supports this with the two accessors begin() and  
end().  We might also need a last() which gets the last valid element in  
the range.  This would be akin to back().

Here is what I was thinking that can help with unifying interface.  I'll  
just show you as a code snippet:

void remove(C)(C c) if (is(C == mycursortype))
{
    // since we know the structure of c, we can use it to remove the  
element from the container
}

void remove(R)(R r) if (is(R == myrangetype))
{
   // possibly can be optimized
}

void remove(R)(R r) if (!is(R == myrangetype) && isInputRange!R &&  
is(typeof(r.begin) == mycursortype))
{
   while(!r.empty)
   {
     auto c = r.begin;
     r.popFront();
     remove(c);
   }
}

If wrapper ranges can "forward" the begin and end functions through their  
API, then you can construct arbitrary ranges that will be accepted as  
parameters to remove.  We could potentially add forwarding functions to  
all wrapper ranges, or we could pick ones that make the most sense (like  
take).

In addition, dcollections (and any collection that can verify its  
endpoints and cursor membership, SList obviously cannot) allows slicing  
based on cursors in a limited fashion.  Therefore, this allows  
constructing ranges from cursors if you have the original container.

As an example of what this could allow (assuming retro and take forward  
the begin(), end() and last() methods):

container.remove(take(retro(container[]), 5)); // remove the last 5  
elements from the container.

 I realize that there are other major differences in design between  
 dcollections and std.container, probably the largest one being the use  
 of a hierarchy vs. free-floating types, so probably full unification is  
 still not possible or quite remote. Nevertheless, keeping the  
 communication channels open would make both libraries better and all of  
 us a bit wiser.

As I've said in the past, the interface hierarchy is not critical at the  
moment because D has very little shared library support.  I'm willing to  
remove the interfaces as long as the types remain classes (which I believe  
you are more agreeable with lately).  This allows the interfaces to be  
added on later if it makes sense.

The cursor concept has to stay though.

-Steve

Feb 23 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/11 8:51 AM, Steven Schveighoffer wrote:
 On Wed, 23 Feb 2011 08:33:03 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 2/22/11 8:41 AM, Steven Schveighoffer wrote:
 On Tue, 22 Feb 2011 09:23:00 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:


 This may as well be The Great Unification that was needed to converge
 std.container with dcollections in a harmonious whole.

 I'm not sure what to think of this ;) Is this a "oh what could have
 been" statement or a "let's reopen negotiations" statement?

 Note, I would love for dcollections to be mainstream Phobos, but I have
 accepted the current reality and am fine with the way things are now
 too.

 Absolutely, and your collaboration on e.g. RedBlackTree and others is
 very much appreciated.

 The way I was thinking is, there would a lot of convergence if
 dcollections wiped the notion of "cursor" and replaced it with takeOne
 throughout. One obvious objection to the range/cursor design is there
 are two abstractions to deal with instead of one. I had the intuition
 that ranges should be enough, but was not quite able to substantiate
 it fully until just about now. With take, takeExactly, and takeOne, it
 seems that indeed there is only need for the range interface, and that
 particular range types can replace cursors.

 I don't think this is possible. When passing a range to a container for
 removal, the container needs to access the underlying implementation
 (e.g. pointer to tree node) in order to immediately know what element is
 being removed. The takeOne range you propose does not allow this, you
 can't get at the underlying range.

I agree that takeOne's source should be made public.

 Even if you could, only
 takeOne!(myrangetype) would work, which leaves us right back at
 supporting a small number of specific ranges.

That's arguably correct and in any case better than supporting a range 
plus a distinct cursor abstraction. You remove the cursor abstraction 
and replace it with a specific realization of the range abstraction.

 In addition, a cursor points to one element, a range points to multiple
 elements. *conceptually* takeOne is the same, but it still stores the
 range internally, so it's actually more expensive than a range to pass
 around (additional _empty member).

I agree. There are a number of ways around this if important, but I need 
first to be clear of the projected uses of takeOne.

 I was thinking more along the lines of ranges being able to generate
 cursors if it makes sense (i.e. base range is a range of the container).
 Dcollections already supports this with the two accessors begin() and
 end(). We might also need a last() which gets the last valid element in
 the range. This would be akin to back().

 Here is what I was thinking that can help with unifying interface. I'll
 just show you as a code snippet:

 void remove(C)(C c) if (is(C == mycursortype))
 {
 // since we know the structure of c, we can use it to remove the element
 from the container
 }

 void remove(R)(R r) if (is(R == myrangetype))
 {
 // possibly can be optimized
 }

 void remove(R)(R r) if (!is(R == myrangetype) && isInputRange!R &&
 is(typeof(r.begin) == mycursortype))
 {
 while(!r.empty)
 {
 auto c = r.begin;
 r.popFront();
 remove(c);
 }
 }

 If wrapper ranges can "forward" the begin and end functions through
 their API, then you can construct arbitrary ranges that will be accepted
 as parameters to remove. We could potentially add forwarding functions
 to all wrapper ranges, or we could pick ones that make the most sense
 (like take).

Soon as we get to begin() and end() as primitives it all starts being 
problematic. We may as well design with STL-style iterators then. One 
good thing about ranges is that they _don't_ need to rely on lower-level 
primitives.

 In addition, dcollections (and any collection that can verify its
 endpoints and cursor membership, SList obviously cannot) allows slicing
 based on cursors in a limited fashion. Therefore, this allows
 constructing ranges from cursors if you have the original container.

 As an example of what this could allow (assuming retro and take forward
 the begin(), end() and last() methods):

 container.remove(take(retro(container[]), 5)); // remove the last 5
 elements from the container.

I understand and I see the advantages. Still I believe that the 
disadvantages of dealing with cursors overcome this potential 
flexibility. I'd be more inclined to look for a design for which ranges 
suffice for iteration.


Andrei

Feb 23 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 Feb 2011 12:16:26 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 2/23/11 8:51 AM, Steven Schveighoffer wrote:
 I don't think this is possible. When passing a range to a container for
 removal, the container needs to access the underlying implementation
 (e.g. pointer to tree node) in order to immediately know what element is
 being removed. The takeOne range you propose does not allow this, you
 can't get at the underlying range.

 I agree that takeOne's source should be made public.

But what about takeOne(retro(myrange))?  If support for that was desired,  
one would need to do it explicitly.  As each new wrapper type is desired,  
one needs to add support for the specific configuration of wrappers.

 Even if you could, only
 takeOne!(myrangetype) would work, which leaves us right back at
 supporting a small number of specific ranges.

 That's arguably correct and in any case better than supporting a range  
 plus a distinct cursor abstraction. You remove the cursor abstraction  
 and replace it with a specific realization of the range abstraction.

At the end of the day, a container needs a cursor (or something like it)  
in order to remove an element.  The cursor is the gateway between the safe  
range abstraction, and the unsafe internal representation (with nasty  
pointers).

In reality, supporting cursors and ranges vs. takeOne and ranges only  
differs in that you need to add a cursor type.  The number of functions  
required remains the same (the cursor handling functions replace the  
takeOne handling functions).

 Soon as we get to begin() and end() as primitives it all starts being  
 problematic. We may as well design with STL-style iterators then. One  
 good thing about ranges is that they _don't_ need to rely on lower-level  
 primitives.

Ranges for containers necessarily need to rely on lower level primitives  
as I stated above.  A cursor gives a safe abstraction of such a primitive.

Example, An SList range:


     struct Range
     {
         private Node * _head; // <------ this is the primitive
         private this(Node * p) { _head = p; }
         /// Forward range primitives.
         bool empty() const { return !_head; }
         /// ditto
          property Range save() { return this; }
         /// ditto
          property T front() { return _head._payload; }
         /// ditto
          property void front(T value)
         {
             enforce(_head);
             _head._payload = value;
         }
         ...

Note that the range *hides* the unsafe primitive.  But so does a cursor.   
Essentially, a cursor is a way to refer to one of those primitives.   
Seeing as how the implementation needs to use one of those primitives,  
being able to pass around a safe version of that primitive is very  
advantageous.  Requiring passing around a range of them is problematic for  
many functions.

 In addition, dcollections (and any collection that can verify its
 endpoints and cursor membership, SList obviously cannot) allows slicing
 based on cursors in a limited fashion. Therefore, this allows
 constructing ranges from cursors if you have the original container.

 As an example of what this could allow (assuming retro and take forward
 the begin(), end() and last() methods):

 container.remove(take(retro(container[]), 5)); // remove the last 5
 elements from the container.

 I understand and I see the advantages. Still I believe that the  
 disadvantages of dealing with cursors overcome this potential  
 flexibility. I'd be more inclined to look for a design for which ranges  
 suffice for iteration.

Then I think we shall keep things the way they are.  Cursors are  
absolutely necessary in my model of collections, and to get rid of them  
would abandon the most fundamental reason I created dcollections.

-Steve

Feb 23 2011

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Feb 22, 2011 at 15:23, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 auto singletonRange(T)(T element)
 {
 =C2=A0 =C2=A0static struct Result
 =C2=A0 =C2=A0{
 =C2=A0 =C2=A0 =C2=A0 =C2=A0private T _element;
 =C2=A0 =C2=A0 =C2=A0 =C2=A0private bool _done;
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 property bool empty() { return _done; }
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 property auto front() { assert(!empty); retur=

n _element; }
 =C2=A0 =C2=A0 =C2=A0 =C2=A0void popFront() { assert(!empty); _done =3D tr=

ue; }
 =C2=A0 =C2=A0 =C2=A0 =C2=A0auto save() { return this; }
 =C2=A0 =C2=A0}

 =C2=A0 =C2=A0return Result(element, false);
 }

That's also what many people would like a findFirst function to
return. Either a 1-element range with the found value or an empty
range if the value doesn't exist.

auto v =3D findFirst(range, needle);
if (!v.empty) { ... }


Maybe you could allow for the creation of an empty output
singletonRange where a lone value could then be put.
auto singleton(T)() { ... } but the user would need to provide the 'T'
by himself.
and then:

void put(T element) {assert(empty); _element =3D element;}

That could be a way to modify a container. Btw, is there a way to have
an empty container and use an output range to fill it? I didn't look
at std.container for quite some time.

Feb 22 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/22/11 3:23 PM, Philippe Sigaud wrote:
 On Tue, Feb 22, 2011 at 15:23, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:

 auto singletonRange(T)(T element)
 {
     static struct Result
     {
         private T _element;
         private bool _done;
          property bool empty() { return _done; }
          property auto front() { assert(!empty); return _element; }
         void popFront() { assert(!empty); _done = true; }
         auto save() { return this; }
     }

     return Result(element, false);
 }


That's also what many people would like a findFirst function to
 return. Either a 1-element range with the found value or an empty
 range if the value doesn't exist.

auto v = findFirst(range, needle);
 if (!v.empty) { ... }

Good point.

 Maybe you could allow for the creation of an empty output
 singletonRange where a lone value could then be put.
 auto singleton(T)() { ... } but the user would need to provide the 'T'
 by himself.

Yah, I thought of that after posting. It would look something like this:

auto singletonRange(T)()
{
     typeof(singletonRange(T.init)) result = void;
     result._done = true;
     return result;
}

No mercy :o). But that's unsafe so probably we'd need to leave out "= void".

 and then:

void put(T element) {assert(empty); _element = element;}
That could be a way to modify a container. Btw, is there a way to have
 an empty container and use an output range to fill it? I didn't look
 at std.container for quite some time.

No, but this may be an interesting idea to pursue.


Andrei

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 16:55:29 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 2/22/11 3:23 PM, Philippe Sigaud wrote:
 Btw, is there a way to have
 an empty container and use an output range to fill it? I didn't look
 at std.container for quite some time.

 No, but this may be an interesting idea to pursue.

s/insert/put

Now all containers that support insertion are output ranges...

Or am I missing something?

-Steve

Feb 22 2011

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Feb 22, 2011 at 23:20, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 s/insert/put

 Now all containers that support insertion are output ranges...

 Or am I missing something?

I mean that, in the same way that a container can expose its elements
as an input range using [], a partially filled container could give
access to its empty 'slots' (created by 'reserve()' ) via an output
range.

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 Feb 2011 01:30:49 -0500, Philippe Sigaud  
<philippe.sigaud gmail.com> wrote:

 On Tue, Feb 22, 2011 at 23:20, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 s/insert/put

 Now all containers that support insertion are output ranges...

 Or am I missing something?

 I mean that, in the same way that a container can expose its elements
 as an input range using [], a partially filled container could give
 access to its empty 'slots' (created by 'reserve()' ) via an output
 range.

Oh, that's easy.  Any container range that allows setting front elements  
is fair game.

For example SList's range can be an output range right now.  (I believe  
isOutputRange!(SList.init[]) should be true)

What's not available is adding new elements to a container via the output  
range concept.

-Steve

Feb 23 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 08:58:04 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 So essentially, your code would look like this in dcollections:

 auto arr = new ArrayList!int(1,2,3,4,5);

 auto cursor = arr.find(3);

 int[] before = arr[0..cursor];
 int value = arr[cursor];


Actually, I haven't used dcollections (or much d code) in so long, I  
forgot the cursor's main function!

int value = cursor.front;

That looks better ;)

-Steve

Feb 22 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Jonathan M Davis wrote:

 Okay, removing elements from a container sucks right now. You can do stuff
 like removeAny (generally pretty useless IMHO) or removeFront just fine,
 but removing an arbitrary range from a container just plain sucks.
 
 remove takes a range type which is the range type for the container that
 it's on. That makes sense. It's obviously not going to be able to a take
 an arbitrary range and remove that from itself. How would it know which
 elements in that range corresponded to which elements in itself -
 especially when it could be a range which skips elements or something
 similar? So, _that_ part makes sense.

I don't understand why not. Given, as you said, that a lot of functions  
return a new type it would make sense. My answer as to how would be 
something like:

while range not empty
    container.remove range.front
    range.popFront

Is there a problem with that? 
 
 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in std.range
 and std.algorithm returns a new range type, making them completely useless
 for processing a range to be removed from a container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was looking
 for. I'd take that and pass it to RedBlackTree's remove. Wrong. It uses
 takeExactly in its implementation and the first two portions of the result
 of findSplit aren't the right range type.
 
 So, what do I do? The _only_ thing that I can think of at the moment is to
 use find to locate the beginning of the range that I want, take the length
 of the range with walkLength and then use popBackN to pop of the elements
 I don't want. e.g.

The table in the docs mention stableRemoveAny(v) which says "Same as 
c.removeAny(v), but is guaranteed to not invalidate any iterators." 

Though c.removeAny(v) itself is not listed in the table nor implemented in 
RedBlackTree, isn't this the right function for the job? (I take it v stands 
for a value of the container, rather than a range). builtin aa's also 
implement this function.

Feb 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront just
 fine, but removing an arbitrary range from a container just plain sucks.
 
 remove takes a range type which is the range type for the container that
 it's on. That makes sense. It's obviously not going to be able to a take
 an arbitrary range and remove that from itself. How would it know which
 elements in that range corresponded to which elements in itself -
 especially when it could be a range which skips elements or something
 similar? So, _that_ part makes sense.

 
 I don't understand why not. Given, as you said, that a lot of functions
 return a new type it would make sense. My answer as to how would be
 something like:
 
 while range not empty
     container.remove range.front
     range.popFront
 
 Is there a problem with that?

It's horribly inefficient for one. Also, I'm not sure that the range is valid
any 
more after the remove, since it's front was removed. popFront may not work 
correctly.

 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in std.range
 and std.algorithm returns a new range type, making them completely
 useless for processing a range to be removed from a container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was
 looking for. I'd take that and pass it to RedBlackTree's remove. Wrong.
 It uses takeExactly in its implementation and the first two portions of
 the result of findSplit aren't the right range type.
 
 So, what do I do? The _only_ thing that I can think of at the moment is
 to use find to locate the beginning of the range that I want, take the
 length of the range with walkLength and then use popBackN to pop of the
 elements I don't want. e.g.

 
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."
 
 Though c.removeAny(v) itself is not listed in the table nor implemented in
 RedBlackTree, isn't this the right function for the job? (I take it v
 stands for a value of the container, rather than a range). builtin aa's
 also implement this function.

removeAny doesn't solve the problem. For starters, the table appears to be
wrong 
in that stableRemoveAny takes a value and removeAny doesn't. So, either 
stableRemoveAny is supposed to not take a value or there's supposed to be 
another version of removeAny which takes a value, and it's missing. So, I don't 
know what exactly is intended with stableRemoveAny.

Regardless of that, however, remove still needs to work. It's perfectly valid
to 
want to remove a range of elements rather than just one.

- Jonathan M Davis

Feb 22 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Jonathan M Davis wrote:

 On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront
 just fine, but removing an arbitrary range from a container just plain
 sucks.
 
 remove takes a range type which is the range type for the container
 that it's on. That makes sense. It's obviously not going to be able to
 a take an arbitrary range and remove that from itself. How would it
 know which elements in that range corresponded to which elements in
 itself - especially when it could be a range which skips elements or
 something similar? So, _that_ part makes sense.

 
 I don't understand why not. Given, as you said, that a lot of functions
 return a new type it would make sense. My answer as to how would be
 something like:
 
 while range not empty
     container.remove range.front
     range.popFront
 
 Is there a problem with that?

 
 It's horribly inefficient for one. Also, I'm not sure that the range is
 valid any more after the remove, since it's front was removed. popFront
 may not work correctly.

It's a matter of range (in)validation right? I admit at this point I'm not 
sure how it is supposed to work. remove + popFront is a bad idea if the 
range is a view into the contained from where stuff is removed, but in c++ 
this works for map: some_map.erase(i++) where i is an iterator. So it 
depends on the type of container what is supported I think. But perhaps this 
falls under Walter's category of useless wankery, since it is easy to write 
yourself.

 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in
 std.range and std.algorithm returns a new range type, making them
 completely useless for processing a range to be removed from a
 container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was
 looking for. I'd take that and pass it to RedBlackTree's remove. Wrong.
 It uses takeExactly in its implementation and the first two portions of
 the result of findSplit aren't the right range type.
 
 So, what do I do? The _only_ thing that I can think of at the moment is
 to use find to locate the beginning of the range that I want, take the
 length of the range with walkLength and then use popBackN to pop of the
 elements I don't want. e.g.

 
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."
 
 Though c.removeAny(v) itself is not listed in the table nor implemented
 in RedBlackTree, isn't this the right function for the job? (I take it v
 stands for a value of the container, rather than a range). builtin aa's
 also implement this function.

 
 removeAny doesn't solve the problem. For starters, the table appears to be
 wrong in that stableRemoveAny takes a value and removeAny doesn't. So,
 either stableRemoveAny is supposed to not take a value or there's supposed
 to be another version of removeAny which takes a value, and it's missing.
 So, I don't know what exactly is intended with stableRemoveAny.
 
 Regardless of that, however, remove still needs to work. It's perfectly
 valid to want to remove a range of elements rather than just one.
 
 - Jonathan M Davis

(stable)removeAny does not solve the problem of removing ranges, but there 
must be *something* that solves the problem of removing one element. You 
found one way to do it for RedBlackTree (i gave up), but if I am not 
mistaken it doesn't have the right complexity and efficient removal is a key 
property of this container.

Feb 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 22 February 2011 03:52:34 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront
 just fine, but removing an arbitrary range from a container just plain
 sucks.
 
 remove takes a range type which is the range type for the container
 that it's on. That makes sense. It's obviously not going to be able to
 a take an arbitrary range and remove that from itself. How would it
 know which elements in that range corresponded to which elements in
 itself - especially when it could be a range which skips elements or
 something similar? So, _that_ part makes sense.

 
 I don't understand why not. Given, as you said, that a lot of functions
 return a new type it would make sense. My answer as to how would be
 something like:
 
 while range not empty
 
     container.remove range.front
     range.popFront
 
 Is there a problem with that?

 
 It's horribly inefficient for one. Also, I'm not sure that the range is
 valid any more after the remove, since it's front was removed. popFront
 may not work correctly.

 
 It's a matter of range (in)validation right? I admit at this point I'm not
 sure how it is supposed to work. remove + popFront is a bad idea if the
 range is a view into the contained from where stuff is removed, but in c++
 this works for map: some_map.erase(i++) where i is an iterator. So it
 depends on the type of container what is supported I think. But perhaps
 this falls under Walter's category of useless wankery, since it is easy to
 write yourself.

It depends on the implementation. It's possible that iterating an iterator or 
popping the front of a range where that iterator no longer points to a valid 
element actually works. It's also possible that, because it no longer points to 
a valid element, it _doesn't_ work. It wouldn't surprise me at all that 
some_map.erase(i++) isn't guaranteed to work in C++ at all but that you've just 
gotten lucky with the STL implementation that you've been using. I'd have to 
research it to be sure. But my guess would be that you've been lucky.

 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in
 std.range and std.algorithm returns a new range type, making them
 completely useless for processing a range to be removed from a
 container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was
 looking for. I'd take that and pass it to RedBlackTree's remove.
 Wrong. It uses takeExactly in its implementation and the first two
 portions of the result of findSplit aren't the right range type.
 
 So, what do I do? The _only_ thing that I can think of at the moment
 is to use find to locate the beginning of the range that I want, take
 the length of the range with walkLength and then use popBackN to pop
 of the elements I don't want. e.g.

 
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."
 
 Though c.removeAny(v) itself is not listed in the table nor implemented
 in RedBlackTree, isn't this the right function for the job? (I take it v
 stands for a value of the container, rather than a range). builtin aa's
 also implement this function.

 
 removeAny doesn't solve the problem. For starters, the table appears to
 be wrong in that stableRemoveAny takes a value and removeAny doesn't.
 So, either stableRemoveAny is supposed to not take a value or there's
 supposed to be another version of removeAny which takes a value, and
 it's missing. So, I don't know what exactly is intended with
 stableRemoveAny.
 
 Regardless of that, however, remove still needs to work. It's perfectly
 valid to want to remove a range of elements rather than just one.
 
 - Jonathan M Davis

 
 (stable)removeAny does not solve the problem of removing ranges, but there
 must be *something* that solves the problem of removing one element. You
 found one way to do it for RedBlackTree (i gave up), but if I am not
 mistaken it doesn't have the right complexity and efficient removal is a
 key property of this container.

If you could just do take or takeExactly on the range you got from find and
pass 
it to remove, then it wouldn't matter whether you wanted to remove 1 element or 
many. The only difference is the number passed to take. The problem is that
that 
doesn't work, because take returns the wrong range type. It has no way to get n 
elements from the front of the range and have them be in a range which is the 
same type as the orignal. So, I don't think that one element vs many is really 
the problem here. It's the inability to get the first n elements of a forward 
range and still retain the same range type.

Regardless, you're right. the solution I found most definitely does _not_ have 
the right complexity. Because you have to use walkLength on the range that you 
get from find, it becomes O(n) instead of O(log n), and O(log n) is what it's 
_supposed_ to be for a red black tree. As I said, I don't believe that the 
current situation with remove is tenable, and the only solution that I see at 
the moment is to change forward range so that it has a function which returns 
the first n elements of that range with the same range type as that range.

- Jonathan M Davis

Feb 22 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Jonathan M Davis wrote:

 On Tuesday 22 February 2011 03:52:34 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
 Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront
 just fine, but removing an arbitrary range from a container just
 plain sucks.
 
 remove takes a range type which is the range type for the container
 that it's on. That makes sense. It's obviously not going to be able
 to a take an arbitrary range and remove that from itself. How would
 it know which elements in that range corresponded to which elements
 in itself - especially when it could be a range which skips elements
 or something similar? So, _that_ part makes sense.

 
 I don't understand why not. Given, as you said, that a lot of
 functions return a new type it would make sense. My answer as to how
 would be something like:
 
 while range not empty
 
     container.remove range.front
     range.popFront
 
 Is there a problem with that?

 
 It's horribly inefficient for one. Also, I'm not sure that the range is
 valid any more after the remove, since it's front was removed. popFront
 may not work correctly.

 
 It's a matter of range (in)validation right? I admit at this point I'm
 not sure how it is supposed to work. remove + popFront is a bad idea if
 the range is a view into the contained from where stuff is removed, but
 in c++ this works for map: some_map.erase(i++) where i is an iterator. So
 it depends on the type of container what is supported I think. But
 perhaps this falls under Walter's category of useless wankery, since it
 is easy to write yourself.

 
 It depends on the implementation. It's possible that iterating an iterator
 or popping the front of a range where that iterator no longer points to a
 valid element actually works. It's also possible that, because it no
 longer points to a valid element, it _doesn't_ work. It wouldn't surprise
 me at all that some_map.erase(i++) isn't guaranteed to work in C++ at all
 but that you've just gotten lucky with the STL implementation that you've
 been using. I'd have to research it to be sure. But my guess would be that
 you've been lucky.

I'm fairly certain this is guaranteed, but specifically tied to std::map. 
The iterator returns the old value and is incremented before erasure, which 
doesn't invalidate iterators except the one that got removed.
 
 But have you actually tried to get a range of the appropriate type
 to remove from a container? It seems like almost ever function in
 std.range and std.algorithm returns a new range type, making them
 completely useless for processing a range to be removed from a
 container.
 
 I was looking to remove a single element for a RedBlackTree. The
 best function that I could think to get the proper range was
 findSplit. The middle portion of the return value would be the
 portion that I was looking for. I'd take that and pass it to
 RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit
 aren't the right range type.
 
 So, what do I do? The _only_ thing that I can think of at the moment
 is to use find to locate the beginning of the range that I want,
 take the length of the range with walkLength and then use popBackN
 to pop of the elements I don't want. e.g.

 
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."
 
 Though c.removeAny(v) itself is not listed in the table nor
 implemented in RedBlackTree, isn't this the right function for the
 job? (I take it v stands for a value of the container, rather than a
 range). builtin aa's also implement this function.

 
 removeAny doesn't solve the problem. For starters, the table appears to
 be wrong in that stableRemoveAny takes a value and removeAny doesn't.
 So, either stableRemoveAny is supposed to not take a value or there's
 supposed to be another version of removeAny which takes a value, and
 it's missing. So, I don't know what exactly is intended with
 stableRemoveAny.
 
 Regardless of that, however, remove still needs to work. It's perfectly
 valid to want to remove a range of elements rather than just one.
 
 - Jonathan M Davis

 
 (stable)removeAny does not solve the problem of removing ranges, but
 there must be *something* that solves the problem of removing one
 element. You found one way to do it for RedBlackTree (i gave up), but if
 I am not mistaken it doesn't have the right complexity and efficient
 removal is a key property of this container.

 
 If you could just do take or takeExactly on the range you got from find
 and pass it to remove, then it wouldn't matter whether you wanted to
 remove 1 element or many. The only difference is the number passed to
 take. The problem is that that doesn't work, because take returns the
 wrong range type. It has no way to get n elements from the front of the
 range and have them be in a range which is the same type as the orignal.
 So, I don't think that one element vs many is really the problem here.
 It's the inability to get the first n elements of a forward range and
 still retain the same range type.

There are two problems I think. What if I want to remove an array of strings 
from a bst of strings? That should be supported, even if only by removing 
them one by one yourself. If all we got was remove(Range) then this scenario 
isn't supported at all. Another problem is the efficiency. With a bst, using 
find + take + remove has to locate the element twice (once for find, once 
for remove), whereas remove(value) would only need to do this once. 

 Regardless, you're right. the solution I found most definitely does _not_
 have the right complexity. Because you have to use walkLength on the range
 that you get from find, it becomes O(n) instead of O(log n), and O(log n)
 is what it's _supposed_ to be for a red black tree. As I said, I don't
 believe that the current situation with remove is tenable, and the only
 solution that I see at the moment is to change forward range so that it
 has a function which returns the first n elements of that range with the
 same range type as that range.
 
 - Jonathan M Davis

Feb 22 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/22/11 3:04 AM, Lutger Blijdestijn wrote:
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."

 Though c.removeAny(v) itself is not listed in the table nor implemented in
 RedBlackTree, isn't this the right function for the job? (I take it v stands
 for a value of the container, rather than a range). builtin aa's also
 implement this function.

I don't think removeAny() works for his case because he wants to remove 
a specific element from a container. removeAny() is useful when you want 
to express "pick one element from that container in the easiest way 
possible".

Andrei

Feb 22 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Andrei Alexandrescu wrote:

 On 2/22/11 3:04 AM, Lutger Blijdestijn wrote:
 The table in the docs mention stableRemoveAny(v) which says "Same as
 c.removeAny(v), but is guaranteed to not invalidate any iterators."

 Though c.removeAny(v) itself is not listed in the table nor implemented
 in RedBlackTree, isn't this the right function for the job? (I take it v
 stands for a value of the container, rather than a range). builtin aa's
 also implement this function.

 
 I don't think removeAny() works for his case because he wants to remove
 a specific element from a container. removeAny() is useful when you want
 to express "pick one element from that container in the easiest way
 possible".
 
 Andrei

I was looking for remove(ElementType value) but this function is not listed 
anywhere in std.container. The entry for stableRemoveAny(v) references 
removeAny(v) - note the parameter - so I thought that indicated the intent 
of a function removeAny(ElementType) but it doesn't exist in the table 
either.

I assumed (mistakenly) that removeAny(v) would remove any value x where x == 
v from the container, where the choice of which would be left up to the 
container.

Feb 22 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/21/11 8:55 PM, Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do stuff like
 removeAny (generally pretty useless IMHO) or removeFront just fine, but
removing
 an arbitrary range from a container just plain sucks.

 remove takes a range type which is the range type for the container that it's
 on. That makes sense. It's obviously not going to be able to a take an
arbitrary
 range and remove that from itself. How would it know which elements in that
 range corresponded to which elements in itself - especially when it could be a
 range which skips elements or something similar? So, _that_ part makes sense.

 But have you actually tried to get a range of the appropriate type to remove
 from a container? It seems like almost ever function in std.range and
 std.algorithm returns a new range type, making them completely useless for
 processing a range to be removed from a container.

 I was looking to remove a single element for a RedBlackTree. The best function
 that I could think to get the proper range was findSplit. The middle portion of
 the return value would be the portion that I was looking for. I'd take that and
 pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit aren't the
 right range type.

It's good that positioned remove does not work with findSplit. It would 
be inefficient to use the wonderfully structured tree for linear search. 
The algorithm should be simple - find the key to delete in O(log n), 
then remove it.

 So, what do I do? The _only_ thing that I can think of at the moment is to use
 find to locate the beginning of the range that I want, take the length of the
 range with walkLength and then use popBackN to pop of the elements I don't
want.

[snip]

What we need to do is add to RedBlackTree a function that accepts the 
result of take() - the same way SList does. It has only recently dawned 
on me that certain generic ranges are pivotal to algorithms and 
containers, and so far I identified take() and takeExactly() to be two 
such important ranges.

One other thing that we can add to RedBlackTree is a simple removeKey() 
convenience method that does the find-and-remove thing. Any Phobos 
developer (including JMD of course :o)), please feel free to implement 
all of the above and submit it for pulling. And thanks Jonathan for 
using std.container - I am sure there are a few other wrinkles that will 
be revealed and then ironed out following consistent usage.


Andrei

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 08:01:24 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 2/21/11 8:55 PM, Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do  
 stuff like
 removeAny (generally pretty useless IMHO) or removeFront just fine, but  
 removing
 an arbitrary range from a container just plain sucks.

 remove takes a range type which is the range type for the container  
 that it's
 on. That makes sense. It's obviously not going to be able to a take an  
 arbitrary
 range and remove that from itself. How would it know which elements in  
 that
 range corresponded to which elements in itself - especially when it  
 could be a
 range which skips elements or something similar? So, _that_ part makes  
 sense.

 But have you actually tried to get a range of the appropriate type to  
 remove
 from a container? It seems like almost ever function in std.range and
 std.algorithm returns a new range type, making them completely useless  
 for
 processing a range to be removed from a container.

 I was looking to remove a single element for a RedBlackTree. The best  
 function
 that I could think to get the proper range was findSplit. The middle  
 portion of
 the return value would be the portion that I was looking for. I'd take  
 that and
 pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit  
 aren't the
 right range type.

 It's good that positioned remove does not work with findSplit. It would  
 be inefficient to use the wonderfully structured tree for linear search.  
 The algorithm should be simple - find the key to delete in O(log n),  
 then remove it.

This should be possible with equalRange.

However, there can be needs to do a linear search on a RBT if you are not  
searching for keys.  I think we definitely need take as a supported range  
in RBT (I didn't think so originally).

 So, what do I do? The _only_ thing that I can think of at the moment is  
 to use
 find to locate the beginning of the range that I want, take the length  
 of the
 range with walkLength and then use popBackN to pop of the elements I  
 don't want.

 [snip]

 What we need to do is add to RedBlackTree a function that accepts the  
 result of take() - the same way SList does. It has only recently dawned  
 on me that certain generic ranges are pivotal to algorithms and  
 containers, and so far I identified take() and takeExactly() to be two  
 such important ranges.

 One other thing that we can add to RedBlackTree is a simple removeKey()  
 convenience method that does the find-and-remove thing. Any Phobos  
 developer (including JMD of course :o)), please feel free to implement  
 all of the above and submit it for pulling. And thanks Jonathan for  
 using std.container - I am sure there are a few other wrinkles that will  
 be revealed and then ironed out following consistent usage.

I should be able to complete these improvements before the next release  
(excluding any quick-fix releases).  Still need to learn git though...

-Steve

Feb 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, February 22, 2011 05:01:24 Andrei Alexandrescu wrote:
 On 2/21/11 8:55 PM, Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront just
 fine, but removing an arbitrary range from a container just plain sucks.
 
 remove takes a range type which is the range type for the container that
 it's on. That makes sense. It's obviously not going to be able to a take
 an arbitrary range and remove that from itself. How would it know which
 elements in that range corresponded to which elements in itself -
 especially when it could be a range which skips elements or something
 similar? So, _that_ part makes sense.
 
 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in std.range
 and std.algorithm returns a new range type, making them completely
 useless for processing a range to be removed from a container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was
 looking for. I'd take that and pass it to RedBlackTree's remove. Wrong.
 It uses takeExactly in its implementation and the first two portions of
 the result of findSplit aren't the right range type.

 
 It's good that positioned remove does not work with findSplit. It would
 be inefficient to use the wonderfully structured tree for linear search.
 The algorithm should be simple - find the key to delete in O(log n),
 then remove it.

Well, if RedBlackTree's remove gets changed to take the return values from take 
and takeExactly as you suggest below, then it _will_ work. Regardless, I don't 
think that findSplit is the best choice for this situation. It's just the best 
that I could think of given that take and takeExactly didn't work. I was then 
frustrated to discover that findSplit used takeExactly (though thinking about
it 
now, given how forward ranges work, it's not like it could really do
otherwise). 
removeKey is really what RedBlackTree needs for the most part, however.
Removing 
a range from a RedBlackTree definitely makes sense in some cases, but usually 
you're looking to remove a particular element or set of elements which aren't 
necessarily in any specific order or adjacent in the tree (which implies that 
perhaps we need a removeKey(s) which takes a range of values - though not 
necessarily a range from RedBlackTree).

 So, what do I do? The _only_ thing that I can think of at the moment is
 to use find to locate the beginning of the range that I want, take the
 length of the range with walkLength and then use popBackN to pop of the
 elements I don't want.

 
 [snip]
 
 What we need to do is add to RedBlackTree a function that accepts the
 result of take() - the same way SList does. It has only recently dawned
 on me that certain generic ranges are pivotal to algorithms and
 containers, and so far I identified take() and takeExactly() to be two
 such important ranges.
 
 One other thing that we can add to RedBlackTree is a simple removeKey()
 convenience method that does the find-and-remove thing. Any Phobos
 developer (including JMD of course :o)), please feel free to implement
 all of the above and submit it for pulling. And thanks Jonathan for
 using std.container - I am sure there are a few other wrinkles that will
 be revealed and then ironed out following consistent usage.

In this case, it probably makes sense to let Steve take care of it, since he's 
already working on it, though I may just add removeKey and submit a pull
request 
which Steven can deal with as he sees appropriate, so that I can use it for
what 
I'm doing.

However, having found a fundamental flaw in how remove currently works with 
RedBlackTree, I thought it best to discuss how to solve the problem before 
making any changes to it anyway. Regardless, RedBlackTree is the container that 
I've been most missing in what I've been doing in D, so I'll definitely be
using 
it in code now that I have it.

By the way, are we looking at changing the containers to be classes by the next 
release (since, as I recall, you decided that we're definitely making that 
change)? That's one change that RedBlackTree could definitely benefit from
(since 
you have to run one of its destructors for it to be set up correctly and it 
can't have a default constructor as a struct), and it's probably the sort of 
change that we should make sooner rather than later, since the longer that the 
types in std.container are structs, the more code it will break when they
become 
classes.

- Jonathan M Davis

Feb 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 22 February 2011 05:01:24 Andrei Alexandrescu wrote:
 On 2/21/11 8:55 PM, Jonathan M Davis wrote:
 Okay, removing elements from a container sucks right now. You can do
 stuff like removeAny (generally pretty useless IMHO) or removeFront just
 fine, but removing an arbitrary range from a container just plain sucks.
 
 remove takes a range type which is the range type for the container that
 it's on. That makes sense. It's obviously not going to be able to a take
 an arbitrary range and remove that from itself. How would it know which
 elements in that range corresponded to which elements in itself -
 especially when it could be a range which skips elements or something
 similar? So, _that_ part makes sense.
 
 But have you actually tried to get a range of the appropriate type to
 remove from a container? It seems like almost ever function in std.range
 and std.algorithm returns a new range type, making them completely
 useless for processing a range to be removed from a container.
 
 I was looking to remove a single element for a RedBlackTree. The best
 function that I could think to get the proper range was findSplit. The
 middle portion of the return value would be the portion that I was
 looking for. I'd take that and pass it to RedBlackTree's remove. Wrong.
 It uses takeExactly in its implementation and the first two portions of
 the result of findSplit aren't the right range type.

 
 It's good that positioned remove does not work with findSplit. It would
 be inefficient to use the wonderfully structured tree for linear search.
 The algorithm should be simple - find the key to delete in O(log n),
 then remove it.
 
 So, what do I do? The _only_ thing that I can think of at the moment is
 to use find to locate the beginning of the range that I want, take the
 length of the range with walkLength and then use popBackN to pop of the
 elements I don't want.

 
 [snip]
 
 What we need to do is add to RedBlackTree a function that accepts the
 result of take() - the same way SList does. It has only recently dawned
 on me that certain generic ranges are pivotal to algorithms and
 containers, and so far I identified take() and takeExactly() to be two
 such important ranges.
 
 One other thing that we can add to RedBlackTree is a simple removeKey()
 convenience method that does the find-and-remove thing. Any Phobos
 developer (including JMD of course :o)), please feel free to implement
 all of the above and submit it for pulling. And thanks Jonathan for
 using std.container - I am sure there are a few other wrinkles that will
 be revealed and then ironed out following consistent usage.

I've made several improvements to RedBlackTree and created a pull request for 
them: https://github.com/D-Programming-Language/phobos/pull/10

- Jonathan M Davis

Feb 23 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 Feb 2011 07:01:12 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 I've made several improvements to RedBlackTree and created a pull  
 request for
 them: https://github.com/D-Programming-Language/phobos/pull/10

I'll be sure to look at this soon.  I really need to ramp up my git  
skills...

Anyone know where I can find a 30 hour day? ;)

-Steve

Feb 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, February 23, 2011 06:56:45 Steven Schveighoffer wrote:
 On Wed, 23 Feb 2011 07:01:12 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 I've made several improvements to RedBlackTree and created a pull
 request for
 them: https://github.com/D-Programming-Language/phobos/pull/10

 
 I'll be sure to look at this soon.  I really need to ramp up my git
 skills...
 
 Anyone know where I can find a 30 hour day? ;)

Yeah. It's hiding under couch, but sneaky. So, if you go looking, odds are you 
won't find it, and then you'll have even _fewer_ hours in the day. :)

- Jonathan M Davis

Feb 23 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 21 Feb 2011 21:55:20 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 Okay, removing elements from a container sucks right now. You can do  
 stuff like
 removeAny (generally pretty useless IMHO) or removeFront just fine, but  
 removing
 an arbitrary range from a container just plain sucks.

 remove takes a range type which is the range type for the container that  
 it's
 on. That makes sense. It's obviously not going to be able to a take an  
 arbitrary
 range and remove that from itself. How would it know which elements in  
 that
 range corresponded to which elements in itself - especially when it  
 could be a
 range which skips elements or something similar? So, _that_ part makes  
 sense.

 But have you actually tried to get a range of the appropriate type to  
 remove
 from a container? It seems like almost ever function in std.range and
 std.algorithm returns a new range type, making them completely useless  
 for
 processing a range to be removed from a container.

 I was looking to remove a single element for a RedBlackTree. The best  
 function
 that I could think to get the proper range was findSplit. The middle  
 portion of
 the return value would be the portion that I was looking for. I'd take  
 that and
 pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit  
 aren't the
 right range type.

RedBlackTree supports the equalRange function, which gives you a range of  
elements equal to the value you give.

If your RedBlackTree does not support duplicate elements, then this should  
always be a 1 or zero element range, which you can then remove.

 That's disgusting. All that just to remove one element? And what if the  
 range
 isn't bidirectional (e.g. SList)? Well, then you have no popBack, and as  
 far as
 I can tell, you're screwed, since you can't use either take or  
 takeExactly,
 because both of them return new range types.

take is supported as a range type for SList.  It is not for RedBlackTree  
because there are usually better ways to get ranges from the container.   
However, I can see reasons that take could be used.  For example, in a  
RedBlackTree implementing a map, you may want to search for the first  
element that has a particular value (O(n) complexity find).  In that case,  
you should be able to remove just one element.  This should be added as a  
bugzilla report, and I'll add the ability to do take ranges to the red  
black tree.  RBT is overdue for some improvements anyways (bearophile  
found quite a few shortfalls with it).

But I see your point in that you should really be able to construct *any*  
type of range and have it work.

I wonder if there is a way we could generalize "give me the  
implementation-specific representation of the first item *reference*"

For example, in dcollections, we have the notion of cursors.  The remove  
functions take either a cursor or a range.  However, I could possibly  
extend the range removal function to allow any range type as long as the  
begin() member gives you the correct cursor type.

-Steve

Feb 22 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Steven Schveighoffer wrote:

 On Mon, 21 Feb 2011 21:55:20 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 wrote:
 
 Okay, removing elements from a container sucks right now. You can do
 stuff like
 removeAny (generally pretty useless IMHO) or removeFront just fine, but
 removing
 an arbitrary range from a container just plain sucks.

 remove takes a range type which is the range type for the container that
 it's
 on. That makes sense. It's obviously not going to be able to a take an
 arbitrary
 range and remove that from itself. How would it know which elements in
 that
 range corresponded to which elements in itself - especially when it
 could be a
 range which skips elements or something similar? So, _that_ part makes
 sense.

 But have you actually tried to get a range of the appropriate type to
 remove
 from a container? It seems like almost ever function in std.range and
 std.algorithm returns a new range type, making them completely useless
 for
 processing a range to be removed from a container.

 I was looking to remove a single element for a RedBlackTree. The best
 function
 that I could think to get the proper range was findSplit. The middle
 portion of
 the return value would be the portion that I was looking for. I'd take
 that and
 pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
 implementation and the first two portions of the result of findSplit
 aren't the
 right range type.

 
 RedBlackTree supports the equalRange function, which gives you a range of
 elements equal to the value you give.
 

oo how I missed that. When you are working on RedBlackTree, would you please 
consider putting an example in the doc that uses it to this effect?

Feb 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 22 Feb 2011 09:38:04 -0500, Lutger Blijdestijn  
<lutger.blijdestijn gmail.com> wrote:

 Steven Schveighoffer wrote:

 RedBlackTree supports the equalRange function, which gives you a range  
 of
 elements equal to the value you give.

 oo how I missed that. When you are working on RedBlackTree, would you  
 please
 consider putting an example in the doc that uses it to this effect?

I'm sorry you missed it, better docs is one of the planned improvements.   
Bearophile pointed out quite a few deficiencies which I need to work on.

I'll be sure to make RedBlackTree a much more usable construct by the next  
release.

-Steve

Feb 22 2011

spir <denis.spir gmail.com> writes:

On 02/22/2011 03:13 PM, Steven Schveighoffer wrote:
 I wonder if there is a way we could generalize "give me the
 implementation-specific representation of the first item *reference*"

If this means a generalisation of what D's builtin 'in' provides, then I think 
this would be great. Then, passing back the 'reference' would provide _direct_ 
access (without [new] lookup) either to read, remove, change... Can't we unify 
this with your notion of cursor? What do you think?

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Feb 22 2011

D Programming

C/C++ Programming

Other

digitalmars.D - We need to rethink remove in std.container