www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - design question

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I'm toying with a rather new approach to a certain problem, and wanted 
to ask for some feedback before coding with it.

The STL has two functions with a "stable" variant: stable_sort and 
stable_partition. They differ from sort and partition in that they 
preserve the order of equivalent objects. They are more costly, hence 
the default is "unstable".

I started with the same approach in std.algorithm, but before long I 
figured two things:

a) There are many more algorithms that can benefit from stable vs. 
unstable versions: remove, topN, schwartzSort, partialSort, completeSort 
(take a sorted range and an unsorted one and make a sorted range out of 
them), and makeIndex.

b) Aside from stable and unstable, there's a semistable notion: when 
partitioning, the left half stays stable and the right half is unstable.

Clearly encoding stability in the function name isn't faring very well, 
so I made a policy of it:

enum SwapStrategy { stable, semistable, unstable }

and parameterized the appropriate functions with SwapStrategy. A use 
case looks like this:

// even to the left, odd to the right
auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
assert(p == arr[5 .. $]);

Ok, until now there's only been background :o|. The new part starts here.

I think the stability request could be better encoded as a wrapper of 
the range making the request. For example:

auto p = partition!("(a & 1) == 0")(keepStable(arr));

So in order to tell partition you want stability, you just wrap the 
range with a call to keepStable. (The runtime cost is negligible). 
Similarly, say you want a stable sort. You'd say:

sort(keepStable(arr));

instead of:

sort!(SwapStrategy.stable)(arr);

I have the feeling this approach will scale better in the future. Also 
it is less verbose and in sync with the current way of doing searches:

auto f = find(assumeSorted(haystack), needle);

In that case, assumeSorted uses the same means of transmitting 
information about a range into an algorithm.

So... I'd appreciate any thoughts you might have. Thanks.


Andrei
Apr 03 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I think the stability request could be better encoded as a wrapper of
 the range making the request. For example:
 auto p = partition!("(a & 1) == 0")(keepStable(arr));
 So in order to tell partition you want stability, you just wrap the
 range with a call to keepStable. (The runtime cost is negligible).
 Similarly, say you want a stable sort. You'd say:
 sort(keepStable(arr));
 instead of:
 sort!(SwapStrategy.stable)(arr);
Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to make modifications to the underlying range stable, or would it just wrap the range in a new type so that you could stick a static if statement in the sort algorithm to decide what to?
Apr 03 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I think the stability request could be better encoded as a wrapper of
 the range making the request. For example:
 auto p = partition!("(a & 1) == 0")(keepStable(arr));
 So in order to tell partition you want stability, you just wrap the
 range with a call to keepStable. (The runtime cost is negligible).
 Similarly, say you want a stable sort. You'd say:
 sort(keepStable(arr));
 instead of:
 sort!(SwapStrategy.stable)(arr);
Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to make modifications to the underlying range stable, or would it just wrap the range in a new type so that you could stick a static if statement in the sort algorithm to decide what to?
keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it. Andrei
Apr 03 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 keepStable is a template function that simply wraps the range in a 
 different type. It's a way to pass information about the range (e.g., 
 "keep this stable", or "this is sorted already"), to the function 
 understanding it.
Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Apr 03 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Andrei Alexandrescu wrote:
 keepStable is a template function that simply wraps the range in a 
 different type. It's a way to pass information about the range (e.g., 
 "keep this stable", or "this is sorted already"), to the function 
 understanding it.
Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Great question. Not sure whether I can implement composition to be entirely order-invariant, but yes, the algorithm should "see" various attributes of the wrapped range. Andrei
Apr 03 2009
parent BCS <ao pathlink.com> writes:
Reply to Andrei,

 Walter Bright wrote:
 
 Andrei Alexandrescu wrote:
 
 keepStable is a template function that simply wraps the range in a
 different type. It's a way to pass information about the range
 (e.g., "keep this stable", or "this is sorted already"), to the
 function understanding it.
 
Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Great question. Not sure whether I can implement composition to be entirely order-invariant, but yes, the algorithm should "see" various attributes of the wrapped range. Andrei
you could do composition by tuple/set unioning and always sort the set (by .stringof)
Apr 03 2009
prev sibling next sibling parent Jason House <jason.james.house gmail.com> writes:
I don't like it. You're introducing types for things that should only exist at
the call site. Now I can declare variables with those types, and even start
composing them together. If a user zips two stable ranges together, they should
be pissed if they don't get a stable range out. Even if std.algorithm gets it
right, 3rd party libs might not. Additionally, std.algorithm should support
others mimicing the same type-based design style.

Overall, i don't think it makes implementing sort easier, but it opens a lot of
undesirable issues.


Andrei Alexandrescu Wrote:

 I'm toying with a rather new approach to a certain problem, and wanted 
 to ask for some feedback before coding with it.
 
 The STL has two functions with a "stable" variant: stable_sort and 
 stable_partition. They differ from sort and partition in that they 
 preserve the order of equivalent objects. They are more costly, hence 
 the default is "unstable".
 
 I started with the same approach in std.algorithm, but before long I 
 figured two things:
 
 a) There are many more algorithms that can benefit from stable vs. 
 unstable versions: remove, topN, schwartzSort, partialSort, completeSort 
 (take a sorted range and an unsorted one and make a sorted range out of 
 them), and makeIndex.
 
 b) Aside from stable and unstable, there's a semistable notion: when 
 partitioning, the left half stays stable and the right half is unstable.
 
 Clearly encoding stability in the function name isn't faring very well, 
 so I made a policy of it:
 
 enum SwapStrategy { stable, semistable, unstable }
 
 and parameterized the appropriate functions with SwapStrategy. A use 
 case looks like this:
 
 // even to the left, odd to the right
 auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
 auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
 assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
 assert(p == arr[5 .. $]);
 
 Ok, until now there's only been background :o|. The new part starts here.
 
 I think the stability request could be better encoded as a wrapper of 
 the range making the request. For example:
 
 auto p = partition!("(a & 1) == 0")(keepStable(arr));
 
 So in order to tell partition you want stability, you just wrap the 
 range with a call to keepStable. (The runtime cost is negligible). 
 Similarly, say you want a stable sort. You'd say:
 
 sort(keepStable(arr));
 
 instead of:
 
 sort!(SwapStrategy.stable)(arr);
 
 I have the feeling this approach will scale better in the future. Also 
 it is less verbose and in sync with the current way of doing searches:
 
 auto f = find(assumeSorted(haystack), needle);
 
 In that case, assumeSorted uses the same means of transmitting 
 information about a range into an algorithm.
 
 So... I'd appreciate any thoughts you might have. Thanks.
 
 
 Andrei
Apr 03 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 03 Apr 2009 20:53:18 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 I'm toying with a rather new approach to a certain problem, and wanted  
 to ask for some feedback before coding with it.

 The STL has two functions with a "stable" variant: stable_sort and  
 stable_partition. They differ from sort and partition in that they  
 preserve the order of equivalent objects. They are more costly, hence  
 the default is "unstable".

 I started with the same approach in std.algorithm, but before long I  
 figured two things:

 a) There are many more algorithms that can benefit from stable vs.  
 unstable versions: remove, topN, schwartzSort, partialSort, completeSort  
 (take a sorted range and an unsorted one and make a sorted range out of  
 them), and makeIndex.

 b) Aside from stable and unstable, there's a semistable notion: when  
 partitioning, the left half stays stable and the right half is unstable.

 Clearly encoding stability in the function name isn't faring very well,  
 so I made a policy of it:

 enum SwapStrategy { stable, semistable, unstable }

 and parameterized the appropriate functions with SwapStrategy. A use  
 case looks like this:

 // even to the left, odd to the right
 auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
 auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
 assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
 assert(p == arr[5 .. $]);

 Ok, until now there's only been background :o|. The new part starts here.

 I think the stability request could be better encoded as a wrapper of  
 the range making the request. For example:

 auto p = partition!("(a & 1) == 0")(keepStable(arr));

 So in order to tell partition you want stability, you just wrap the  
 range with a call to keepStable. (The runtime cost is negligible).  
 Similarly, say you want a stable sort. You'd say:

 sort(keepStable(arr));

 instead of:

 sort!(SwapStrategy.stable)(arr);

 I have the feeling this approach will scale better in the future. Also  
 it is less verbose and in sync with the current way of doing searches:

 auto f = find(assumeSorted(haystack), needle);

 In that case, assumeSorted uses the same means of transmitting  
 information about a range into an algorithm.

 So... I'd appreciate any thoughts you might have. Thanks.


 Andrei
This sounds interesting, but I don't thing it's good idea overall. assumeSorted(range) is fine, because "sorted" is a property of a range. auto r = assumeSorted(range); // makes sense keepStable(range) is not okay, because stable is not a property of a range. It's a property of an algorithm. auto r = keepStable(range); // makes no sense to me
Apr 03 2009