
digitalmars.D - Worst-case performance of quickSort / getPivot

reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
I've been investigating an instance that I ran into while processing some data, in which multiSort had apparently stalled whereas two successive sort calls processed the data quickly. After reducing the code and the 20 million data points, I've narrowed the problem down to this test case:

enum N=1000000;
N.iota.retro.chain(N.only).array.sort();

Apparently, this simple case exhibits worst-case (quadratic) complexity in our sort implementation. The .retro isn't necessary - this also exhibits quadratic complexity (although the total number of predicate calls is halved), because the first quickSort cycle will flip the order of elements during partitioning:

N.iota.chain(0.only).array.sort();

Closer inspection shows that the problem stems from how the pivot is chosen. Currently, std.algorithm's getPivot uses the median-of-three method (the median value of the first, middle and last elements[1]). However, in this case, the median element will always be the first element - this remains true in all quickSort iterations.

Here is an illustrated example with N=9. Note that at the same 
time as choosing the pivot, getPivot also orders the candidates 
(first, middle and last range elements) according to the sort 
predicate. After calling getPivot, quickSortImpl moves the pivot 
to the end of the range by swapping it with the last element.

getPivot(0..10)
8,7,6,5,4,3,2,1,0,9 <- getPivot - before swap
9,7,6,5,4,8,2,1,0,3 <- getPivot - after swap
9,7,6,5,4,3,2,1,0,8 <- quickSortImpl - after swap
9,8,6,5,4,3,2,1,0,7 <- quickSortImpl - after partition
getPivot(2..10)
6,5,4,3,2,1,0,7 <- getPivot - before swap
7,5,4,3,6,1,0,2 <- getPivot - after swap
7,5,4,3,2,1,0,6 <- quickSortImpl - after swap
7,6,4,3,2,1,0,5 <- quickSortImpl - after partition
(...)

The algorithm gets stuck on "slopes" in sorted data, which are not uncommon in real-world data (my data resembled a 1D heightmap). Although the median-of-three way of picking a pivot is recommended by some sources (notably Sedgewick), perhaps a different method would be better suited for Phobos:

* Many sources recommend using a random element as a pivot. According to [2], "Randomized quicksort, for any input, it requires only O(n log n) expected time (averaged over all choices of pivots)". Also, if it is not possible to predict the pivot choice, it is impossible to craft worst-case input, which is a plus from a security point of view[3]. However, I'm not sure if making the behavior of std.algorithm's sort nondeterministic is desirable.
* It looks like many STL implementations use Introsort[4] - a hybrid sorting algorithm which starts with QuickSort, but switches to HeapSort if the recursion depth exceeds a threshold.
* I found a recent article[5] which describes an optimal algorithm for picking a pivot, however it is behind a paywall.

[1]: Apparently the term is also used to refer to choosing three 
points *randomly*, instead of first/middle/last: 
http://stackoverflow.com/a/164183/21501
[2]: 
http://en.wikipedia.org/wiki/Quicksort#Analysis_of_Randomized_quicksort
[3]: 
https://www.google.com/search?q=quicksort+worst+case+denial+of+service
[4]: http://en.wikipedia.org/wiki/Introsort
[5]: https://news.ycombinator.com/item?id=6629117
Nov 15 2013
next sibling parent reply "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev
wrote:
clip
 The algorithm gets stuck on "slopes" in sorted data, which are 
 not uncommon in real-world data (my data resembled a 1D 
 heightmap). Although the median-of-three way of picking a pivot 
 is recommended by some sources (notably Sedgewick), perhaps a 
 better method would be better suitable for Phobos:
If you use median-of-five (or seven) then you would start getting reasonable pivots in such an instance, though I am sure someone could craft an input set that defeats that. Does the Sedgewick source mention why 3 is picked? Easy to implement I assume.
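
For illustration, a minimal median-of-five sketch (the sample positions and the name medianOf5Index are made up; this is not Phobos code and not Sedgewick's):

---
// Illustration only: pick the pivot as the median of five sampled positions.
// The five samples are sorted with the same predicate, so the extra cost per
// partition is a small constant number of comparisons.
import std.algorithm : sort;

size_t medianOf5Index(T)(T[] r)
{
    assert(r.length >= 5);
    size_t[5] idx = [0, r.length / 4, r.length / 2, (3 * r.length) / 4, r.length - 1];
    sort!((a, b) => r[a] < r[b])(idx[]);
    return idx[2]; // index of the median sample
}

unittest
{
    auto data = [9, 1, 8, 2, 7, 3, 6, 4, 5];
    assert(data[medianOf5Index(data)] == 7); // median of the samples 9, 8, 7, 6, 5
}
---
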
 * Many sources recommend using a random element as a pivot. 
 According to [2], "Randomized quicksort, for any input, it 
 requires only O(n log n) expected time (averaged over all 
 choices of pivots)". Also, if it is not possible to predict the 
 pivot choice, it is impossible to craft worst-case input, which 
 is a plus from a security point[3]. However, I'm not sure if 
 making the behavior of std.algorithm's sort nondeterministic is 
 desirable.
What are the arguments against using a randomized algorithm?
 * It looks like many STL implementations use Introsort[4] - a 
 hybrid sorting algorithm which starts with QuickSort, but 
 switches to HeapSort if the recursion limit exceeds a threshold.
 * I found a recent article[5] which describes an optimal 
 algorithm of picking a pivot, however it is behind a paywall.

 [1]: Apparently the term is also used to refer to choosing 
 three points *randomly*, instead of first/middle/last: 
 http://stackoverflow.com/a/164183/21501
 [2]: 
 http://en.wikipedia.org/wiki/Quicksort#Analysis_of_Randomized_quicksort
 [3]: 
 https://www.google.com/search?q=quicksort+worst+case+denial+of+service
 [4]: http://en.wikipedia.org/wiki/Introsort
 [5]: https://news.ycombinator.com/item?id=6629117
Nov 15 2013
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 15 November 2013 at 22:56:58 UTC, Craig Dillabaugh 
wrote:
 What are the arguments against using a randomized algorithm?
- Inconsistent execution time - someone trying to benchmark Phobos sort functions might get really confused;

- QuickSort is an unstable sort. If there is more to the data than the items being compared, the data will end up in a different order on different program runs.

- std.algorithm's sort allows specifying arbitrary code as a predicate. If that code has side effects, the program will behave differently on different runs.
Nov 15 2013
prev sibling parent "Xinok" <xinok live.com> writes:
On Friday, 15 November 2013 at 22:56:58 UTC, Craig Dillabaugh 
wrote:
 What are the arguments against using a randomized algorithm?
(1) Sort is capable of being marked pure, depending on the type being sorted and the predicate. But choosing random pivots means introducing side effects.

(2) Random pivots invoke average performance, whereas choosing the pivot from a median of N has the potential to achieve better performance on partially sorted lists (fewer reads and writes).

(3) I've tested random pivots and found that choosing from a median of three or five is often the better choice.
Nov 16 2013
prev sibling next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
wrote:
 I've been investigating an instance that I ran into while processing some data, in which multiSort had apparently stalled whereas two successive sort calls processed the data quickly. After reducing the code and 20 million points, I've reduced the problem to this test case:

 enum N=1000000;
 N.iota.retro.chain(N.only).array.sort();
clip
Improving quicksort is probably a good idea, but IIRC, the stable sort has been changed to Timsort, which I find to be faster in most cases. Probably what should be done is change the default to stable sort and improve the unstable version to be faster in some way.

Presumably you could use a "randomized version," but with a constant key. That way the algorithm would be deterministic but likely to have good characteristics in normal inputs. Unfortunately that wouldn't solve the issue of an attacker getting to choose an input and causing worst case performance (it would just make it harder to come up with such an input).

Ultimately, using the stable version (Timsort) is safer for security use (hence why I'd suggest it be the default).
Nov 15 2013
prev sibling next sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
wrote:
 I've been investigating an instance that I ran into while 
 processing some data, in which multiSort had apparently stalled 
 whereas two successive sort calls processed the data quickly. 
 After reducing the code and 20 million points, I've reduced the 
 problem to this test case:
I've been hit by this case too, half a year ago. See the relevant discussion on D.learn:
http://forum.dlang.org/thread/drpbkvhlprkdjzhbcqjn forum.dlang.org

The fact that the issue is raised once again perhaps means it is worth a bugreport. What do you guys think?

Ivan Kazmenko.
Nov 16 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 3:29 AM, Ivan Kazmenko wrote:
 On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev wrote:
 I've been investigating an instance that I ran into while processing
 some data, in which multiSort had apparently stalled whereas two
 successive sort calls processed the data quickly. After reducing the
 code and 20 million points, I've reduced the problem to this test case:
I've been hit by this case too, half a year ago. See the relevant discussion on D.learn: http://forum.dlang.org/thread/drpbkvhlprkdjzhbcqjn forum.dlang.org The fact that the issue is raised once again perhaps means it is worth a bugreport. What do you guys think? Ivan Kazmenko.
I think a median of five pivot selection is in order. Andrei
Nov 16 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 3:29 AM, Ivan Kazmenko wrote:
 On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev wrote:
 I've been investigating an instance that I ran into while processing
 some data, in which multiSort had apparently stalled whereas two
 successive sort calls processed the data quickly. After reducing the
 code and 20 million points, I've reduced the problem to this test case:
I've been hit by this case too, half a year ago. See the relevant discussion on D.learn: http://forum.dlang.org/thread/drpbkvhlprkdjzhbcqjn forum.dlang.org The fact that the issue is raised once again perhaps means it is worth a bugreport. What do you guys think? Ivan Kazmenko.
Yah, I thought one is in place already. Andrei
Nov 16 2013
prev sibling next sibling parent reply "Jean Christophe" <cybrarian live.fr> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
wrote:

 getPivot(0..10)
 8,7,6,5,4,3,2,1,0,9 <- getPivot - before swap
 9,7,6,5,4,8,2,1,0,3 <- getPivot - after swap
 9,7,6,5,4,3,2,1,0,8 <- quickSortImpl - after swap
 9,8,6,5,4,3,2,1,0,7 <- quickSortImpl - after partition
 getPivot(2..10)
 6,5,4,3,2,1,0,7 <- getPivot - before swap
 7,5,4,3,6,1,0,2 <- getPivot - after swap
 7,5,4,3,2,1,0,6 <- quickSortImpl - after swap
 7,6,4,3,2,1,0,5 <- quickSortImpl - after partition
 (...)
One possible implementation suggests to swap Left and Right immediately after choosing the Pivot (if Left > Right), then place the Pivot at Right-1. It seems that this option was not taken. Any reason?

As the efficiency of Quicksort is known to be bad in sorting a small number of elements, i.e. < 10, it might be nice to implement an option to automatically switch to a more appropriate algorithm if it's relevant to do so.
 * Many sources recommend using a random element as a pivot. 
 According to [2], "Randomized quicksort, for any input, it 
 requires only O(n log n) expected time (averaged over all 
 choices of pivots)".
IMO it would be costly and not so relevant if the goal is to be fast.
 Also, if it is not possible to predict the pivot choice, it is 
 impossible to craft worst-case input, which is a plus from a 
 security point[3]. However, I'm not sure if making the behavior 
 of std.algorithm's sort nondeterministic is desirable.
I think it's not desirable.

-- Quicksorting a collection of Objects?

BTW I'm very interested in finding a library which could Quicksort an array of pointers, where each pointer points to a class object (or a structure) address. The library would make it possible, for example, to sort the `class objects` using one of their members as the key. Because the swaps are done on the actual pointers (and not the Objects pointed to), the Quicksort should be very efficient. However, the algorithm wouldn't be so trivial to implement because each comparison shall be done using the Object's member's value:

eg. Obj1.foo < Obj2.foo.

Of course, the programmer can choose any relevant member property to sort the collection. And it should be as easy to use as:

class SomeClass { string foo; double bar; }
SomeClass[] a;
// put 100 000 objects into a
a.sort("foo");

Do we already have something like that available somewhere or is it possible to make one eventually?

-- JC
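
Something along these lines is what I have in mind (sortByMember is just a hypothetical name, not an existing Phobos API; only the class references get swapped):

---
// Hedged sketch, not an existing Phobos API: sort a range of class references
// by a named member; only the references in the array are moved.
import std.algorithm : sort;

auto sortByMember(string member, R)(R r)
{
    return r.sort!((a, b) => mixin("a." ~ member ~ " < b." ~ member));
}

class SomeClass
{
    string foo;
    double bar;
    this(string foo, double bar) { this.foo = foo; this.bar = bar; }
}

void main()
{
    auto a = [new SomeClass("carol", 1), new SomeClass("alice", 2), new SomeClass("bob", 3)];
    a.sortByMember!"foo";
    assert(a[0].foo == "alice" && a[2].foo == "carol");
}
---
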
Nov 16 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 16 November 2013 at 14:20:32 UTC, Jean Christophe 
wrote:
 On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
 wrote:

 getPivot(0..10)
 8,7,6,5,4,3,2,1,0,9 <- getPivot - before swap
 9,7,6,5,4,8,2,1,0,3 <- getPivot - after swap
 9,7,6,5,4,3,2,1,0,8 <- quickSortImpl - after swap
 9,8,6,5,4,3,2,1,0,7 <- quickSortImpl - after partition
 getPivot(2..10)
 6,5,4,3,2,1,0,7 <- getPivot - before swap
 7,5,4,3,6,1,0,2 <- getPivot - after swap
 7,5,4,3,2,1,0,6 <- quickSortImpl - after swap
 7,6,4,3,2,1,0,5 <- quickSortImpl - after partition
 (...)
One possible implementation suggests to swap Left and Right immediatly after choosing the Pivot (if Left > Right), then place the Pivot at Right-1. It seems that this option was not taken. Any reason?
getPivot sorts the first, middle and last element in the range right after it figures out their order. Is there any advantage to your suggestion over this?
 As the efficiency of Quicksort is known to be bad in sorting a 
 small number of elements, ie. < 10, it might be nice to 
 implement an option to automatically switch to a more 
 appropriate algorithm if it's relevant to do so.
It does, actually - it switches to optimisticInsertionSort for ranges of 25 elements or shorter. I disabled that behavior for illustration.
 IMO it would be costly and not so relevant if the goal is to be 
 fast.
The impact shouldn't be big if a simple RNG, like LCG or XorShift, was used.
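
For instance (a rough sketch, not a patch to quickSortImpl; randomPivotIndex is a made-up name):

---
// Hedged sketch: a cheap Xorshift-based pivot choice, so the pivot index is
// not a fixed function of the range contents.
import std.random : Xorshift, uniform, unpredictableSeed;

size_t randomPivotIndex(T)(T[] r, ref Xorshift rng)
{
    return uniform(0, r.length, rng);
}

void main()
{
    auto rng = Xorshift(unpredictableSeed);
    auto data = [5, 3, 8, 1, 9, 2, 7];
    auto p = randomPivotIndex(data, rng);
    assert(p < data.length);
    // quickSortImpl would then swap data[p] to the end and partition as usual
}
---
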
 Quicksorting a collection of Objects?
?
 BTW I'm very interested in finding a library which could 
 Quicksort an array of pointers, where each pointer points to a 
 class object (or a structure) address. The library would make 
 possible, for example, to sort the `class objects` using one of 
 their members as the key. Because the swaps are done on the 
 actual pointers (and not the Objects pointed), the Quicksort 
 should be very efficient. However, the algorithm woulnd't be so 
 trivial to implement because each comparison shall be done 
 using the Object's member's value :

 eg. Obj1.foo < Obj2.foo.

 Of course, the programmer can choose any relevant member 
 property to sort the collection. And it should be as easy to 
 use as:

 class SomeClass { string foo; double bar;}
 SomeClass[] a;
 // put 100 000 objects into a
 a.sort("foo");

 Do we already have something like that available somewhere or 
 is it possible to make one eventually?
You mean, sort!`a.foo < b.foo` ?
Nov 16 2013
parent reply "Jean Christophe" <cybrarian live.fr> writes:
 BTW I'm very interested in finding a library which could 
 Quicksort an array of pointers, where each pointer points to a 
 class object (or a structure) address. The library would make 
 possible, for example, to sort the `class objects` using one 
 of their members as the key. Because the swaps are done on the 
 actual pointers (and not the Objects pointed), the Quicksort 
 should be very efficient. However, the algorithm woulnd't be 
 so trivial to implement because each comparison shall be done 
 using the Object's member's value :

 eg. Obj1.foo < Obj2.foo.

 Of course, the programmer can choose any relevant member 
 property to sort the collection. And it should be as easy to 
 use as:

 class SomeClass { string foo; double bar;}
 SomeClass[] a;
 // put 100 000 objects into a
 a.sort("foo");

 Do we already have something like that available somewhere or 
 is it possible to make one eventually?
You mean, sort!`a.foo < b.foo` ?
Yes. An indirect sorting, assuming a and b to be objects of class SomePotentiallyLargeClass.

Because the array to sort contains pointers only, all the data movement is essentially the same as if we were sorting integers.

-- JC
Nov 16 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 9:30 PM, Jean Christophe wrote:
 An indirect sorting, assuming a and b to be ojects of class
 SomePotentialyLargeClass.

 Because the array to sort contains pointers only, all the data movement
 is essentially the same as if we were sorting integer.
Maybe makeIndex may be of help. http://dlang.org/phobos/std_algorithm.html#makeIndex Andrei
Nov 16 2013
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 17 November 2013 at 05:30:24 UTC, Jean Christophe 
wrote:
 You mean, sort!`a.foo < b.foo` ?
Yes. An indirect sorting, assuming a and b to be ojects of class SomePotentialyLargeClass. Because the array to sort contains pointers only, all the data movement is essentially the same as if we were sorting integer.
If the range elements are reference types, that's what will happen (unless they overload opAssign or some such). Otherwise, there's makeIndex (already mentioned by Andrei), or you could also do it by hand:

1. r.length.iota.array.sort!((a, b) => r[a]<r[b]);
2. r.length.iota.map!(a => &r[a]).array.sort!((a, b) => *a<*b);

r.map!`&a` doesn't work though, because we don't get a reference to the range element in the predicate.
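
For completeness, a small makeIndex usage sketch (SomeClass and the data are invented for the example):

---
// Sketch of the makeIndex route: the index array is sorted, the class
// references in `a` are never moved.
import std.algorithm : makeIndex;
import std.stdio : writeln;

class SomeClass
{
    string foo;
    this(string foo) { this.foo = foo; }
}

void main()
{
    auto a = [new SomeClass("carol"), new SomeClass("alice"), new SomeClass("bob")];

    auto index = new size_t[a.length];
    makeIndex!((x, y) => x.foo < y.foo)(a, index);

    foreach (i; index)
        writeln(a[i].foo); // alice, bob, carol
}
---
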
Nov 17 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 6:20 AM, Jean Christophe wrote:
 On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev wrote:

 getPivot(0..10)
 8,7,6,5,4,3,2,1,0,9 <- getPivot - before swap
 9,7,6,5,4,8,2,1,0,3 <- getPivot - after swap
 9,7,6,5,4,3,2,1,0,8 <- quickSortImpl - after swap
 9,8,6,5,4,3,2,1,0,7 <- quickSortImpl - after partition
 getPivot(2..10)
 6,5,4,3,2,1,0,7 <- getPivot - before swap
 7,5,4,3,6,1,0,2 <- getPivot - after swap
 7,5,4,3,2,1,0,6 <- quickSortImpl - after swap
 7,6,4,3,2,1,0,5 <- quickSortImpl - after partition
 (...)
One possible implementation suggests to swap Left and Right immediatly after choosing the Pivot (if Left > Right), then place the Pivot at Right-1. It seems that this option was not taken. Any reason?
That may help this particular situation, but does it do anything interesting in general?
 As the efficiency of Quicksort is known to be bad in sorting a small
 number of elements, ie. < 10, it might be nice to implement an option to
 automatically switch to a more appropriate algorithm if it's relevant to
 do so.
That's done already.
 * Many sources recommend using a random element as a pivot. According
 to [2], "Randomized quicksort, for any input, it requires only O(n log
 n) expected time (averaged over all choices of pivots)".
IMO it would be costly and not so relevant if the goal is to be fast.
 Also, if it is not possible to predict the pivot choice, it is
 impossible to craft worst-case input, which is a plus from a security
 point[3]. However, I'm not sure if making the behavior of
 std.algorithm's sort nondeterministic is desirable.
I think it's not desirable.
Vladimir also enumerated a few disadvantages of random pivot selection. I think those notwithstanding, random pivot is an option we should consider very seriously. It's easy to use a fast random number generator (randomness doesn't need to be very good). The only reason for which I've hesitated to add random pivot selection is that it's less usual and might surprise some. It is _very_ solidly motivated.
 Quicksorting a collection of Objects?

 BTW I'm very interested in finding a library which could Quicksort an
 array of pointers, where each pointer points to a class object (or a
 structure) address. The library would make possible, for example, to
 sort the `class objects` using one of their members as the key. Because
 the swaps are done on the actual pointers (and not the Objects pointed),
 the Quicksort should be very efficient. However, the algorithm woulnd't
 be so trivial to implement because each comparison shall be done using
 the Object's member's value :

 eg. Obj1.foo < Obj2.foo.

 Of course, the programmer can choose any relevant member property to
 sort the collection. And it should be as easy to use as:

 class SomeClass { string foo; double bar;}
 SomeClass[] a;
 // put 100 000 objects into a
 a.sort("foo");

 Do we already have something like that available somewhere or is it
 possible to make one eventually?
I had a diff at a point that added overloads to sort allowing the caller to only specify the key. It turned out to be confusing, and all it saved was a tad of duplication: "expr" instead of "expr(a) < expr(b)".

Andrei
Nov 16 2013
parent "Jean Christophe" <cybrarian live.fr> writes:
On Saturday, 16 November 2013 at 16:10:56 UTC, Andrei 
Alexandrescu wrote:
 On 11/16/13 6:20 AM, Jean Christophe wrote:
 On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir 
 Panteleev wrote:

 getPivot(0..10)
 8,7,6,5,4,3,2,1,0,9 <- getPivot - before swap
 9,7,6,5,4,8,2,1,0,3 <- getPivot - after swap
 9,7,6,5,4,3,2,1,0,8 <- quickSortImpl - after swap
 9,8,6,5,4,3,2,1,0,7 <- quickSortImpl - after partition
 getPivot(2..10)
 6,5,4,3,2,1,0,7 <- getPivot - before swap
 7,5,4,3,6,1,0,2 <- getPivot - after swap
 7,5,4,3,2,1,0,6 <- quickSortImpl - after swap
 7,6,4,3,2,1,0,5 <- quickSortImpl - after partition
 (...)
One possible implementation suggests to swap Left and Right immediatly after choosing the Pivot (if Left > Right), then place the Pivot at Right-1. It seems that this option was not taken. Any reason?
That may help this particular situation, but does it do anything interesting in general?
Yes. This has the extra advantage that the smallest of the three winds up in A[left], which is where the partitioning routine would put it anyway. The largest winds up at A[right], which is also the correct place, since it is larger than the Pivot. Therefore you can place the Pivot in A[right-1] and initialize i and j to (left+1) and (right-2) in the partition phase.

Another benefit is that because A[left] is smaller than the Pivot, it will act as a sentinel for j. We do not need to worry about j running past the end. Same for i. Thus you assert: A[left] <= A[center] <= A[right] even before you hide the Pivot.

-- JC
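
In code, the scheme looks roughly like this (a textbook-style sketch, not Phobos's quickSortImpl; small ranges are simply handed to std.algorithm.sort to keep it short):

---
// Median-of-three with the pivot hidden at A[right-1]; A[left] and the pivot
// act as sentinels for the two scans (illustration only).
import std.algorithm : swap, sort, isSorted;

T medianOf3(T)(T[] a, size_t left, size_t right)
{
    auto center = (left + right) / 2;
    if (a[center] < a[left])   swap(a[left], a[center]);
    if (a[right]  < a[left])   swap(a[left], a[right]);
    if (a[right]  < a[center]) swap(a[center], a[right]);
    // now a[left] <= a[center] <= a[right]
    swap(a[center], a[right - 1]);      // hide the pivot at right-1
    return a[right - 1];
}

void quickSort(T)(T[] a, size_t left, size_t right)
{
    if (left + 10 > right)              // small range: any simple sort will do
    {
        sort(a[left .. right + 1]);
        return;
    }
    auto pivot = medianOf3(a, left, right);
    auto i = left, j = right - 1;
    for (;;)
    {
        while (a[++i] < pivot) {}       // a[right-1] (the pivot) stops i
        while (pivot < a[--j]) {}       // a[left] stops j
        if (i < j) swap(a[i], a[j]); else break;
    }
    swap(a[i], a[right - 1]);           // restore the pivot to its final slot
    quickSort(a, left, i - 1);
    quickSort(a, i + 1, right);
}

void main()
{
    auto a = [5, 3, 9, 1, 7, 2, 8, 0, 6, 4, 13, 11, 12, 10];
    quickSort(a, 0, a.length - 1);
    assert(a.isSorted);
}
---
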
Nov 16 2013
prev sibling parent "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Saturday, 16 November 2013 at 14:20:32 UTC, Jean Christophe
wrote:
clip
 BTW I'm very interested in finding a library which could 
 Quicksort an array of pointers, where each pointer points to a 
 class object (or a structure) address. The library would make 
 possible, for example, to sort the `class objects` using one of 
 their members as the key. Because the swaps are done on the 
 actual pointers (and not the Objects pointed), the Quicksort 
 should be very efficient. However, the algorithm woulnd't be so 
 trivial to implement because each comparison shall be done 
 using the Object's member's value :

 eg. Obj1.foo < Obj2.foo.

 Of course, the programmer can choose any relevant member 
 property to sort the collection. And it should be as easy to 
 use as:

 class SomeClass { string foo; double bar;}
 SomeClass[] a;
 // put 100 000 objects into a
 a.sort("foo");

 Do we already have something like that available somewhere or 
 is it possible to make one eventually?

 -- JC
This is sort of an interesting problem that you propose, because while sorting just the pointers means moving things around less, you would also get absolutely horrible locality of reference/cache hit performance. Have you thought about ways to deal with that?
Nov 16 2013
prev sibling next sibling parent reply "Xinok" <xinok live.com> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
wrote:
 ...
I'm a horrible procrastinator, I should have had this fixed over a year ago, lol. Anyways, I posted a bug report about this problem long ago and a working solution has been gathering dust waiting to be implemented in Phobos.

Bug report (poorly titled):
http://d.puremagic.com/issues/show_bug.cgi?id=7767

Solution:
https://github.com/Xinok/XSort/blob/master/unstablesort.d

* It's an introsort which falls back to a bottom-up binary heap sort after so many recursions to guarantee O(n log n) performance in the worst-case.

* It chooses the pivot from a median-of-five. It adds nearly zero overhead and results in a noticeable decrease in comparisons (predicate calls). I see no reason why not to add this.

* My solution uses a binary insertion sort for sublists up to 32 elements long. This reduces comparisons as well but also adds a fair amount of overhead. It's about 15-20% slower when sorting an array of integers. My idea is to use this by default and specialize the linear insertion sort when the element type and predicate are known.

* Regardless of these improvements, I think Timsort should be the default sorting algorithm. It's the better choice in many (most?) cases and, well, it's stable. Quick sort would still be available for those cases in which it's faster and stable sorting isn't needed.

Well, it's Saturday and I have no plans. Let's see if I can't get a pull request done by the end of the day...
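
For reference, here is a compact, self-contained sketch of the introsort guard itself. This is not the linked unstablesort.d; the names, cutoffs and the deliberately naive last-element pivot are made up so the depth limit actually has something to do:

---
// Sketch of the introsort guard: quicksort recursion is cut off after roughly
// 2*log2(n) levels and the remaining segment is finished with heap sort, so
// the whole thing is O(n log n) even on killer inputs.
import std.algorithm : swap;
import core.bitop : bsr;

void guardedSort(T)(T[] r)
{
    if (r.length < 2) return;
    quickSortGuarded(r, 2 * (bsr(r.length) + 1));
}

void quickSortGuarded(T)(T[] r, size_t depthLimit)
{
    while (r.length > 25)               // small-range cutoff, as in Phobos
    {
        if (depthLimit == 0)
        {
            heapSort(r);                // escape hatch: guaranteed O(n log n)
            return;
        }
        --depthLimit;
        auto p = lomutoPartition(r);    // naive last-element pivot on purpose
        quickSortGuarded(r[0 .. p], depthLimit);
        r = r[p + 1 .. $];              // iterate on the right part
    }
    insertionSort(r);
}

size_t lomutoPartition(T)(T[] r)
{
    auto pivot = r[$ - 1];
    size_t store = 0;
    foreach (i; 0 .. r.length - 1)
        if (r[i] < pivot) swap(r[i], r[store++]);
    swap(r[store], r[$ - 1]);
    return store;
}

void heapSort(T)(T[] r)
{
    void siftDown(size_t root, size_t end)
    {
        while (2 * root + 1 < end)
        {
            auto child = 2 * root + 1;
            if (child + 1 < end && r[child] < r[child + 1]) ++child;
            if (!(r[root] < r[child])) return;
            swap(r[root], r[child]);
            root = child;
        }
    }
    foreach_reverse (i; 0 .. r.length / 2) siftDown(i, r.length);
    foreach_reverse (i; 1 .. r.length) { swap(r[0], r[i]); siftDown(0, i); }
}

void insertionSort(T)(T[] r)
{
    foreach (i; 1 .. r.length)
        for (auto j = i; j > 0 && r[j] < r[j - 1]; --j)
            swap(r[j], r[j - 1]);
}

void main()
{
    import std.range : iota, retro, chain, only;
    import std.array : array;
    import std.algorithm : isSorted;

    auto a = 100_000.iota.retro.chain(100_000.only).array; // the killer input
    guardedSort(a);
    assert(a.isSorted);
}
---
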
Nov 16 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 11:46 AM, Xinok wrote:
 * Regardless of these improvements, I think Timsort should be the
 default sorting algorithm. It's the better choice in many (most?) cases
 and, well, it's stable. Quick sort would still be available for those
 cases in which it's faster and stable sorting isn't needed.
There's something fishy about a more restricted algorithm doing better than one that's less restricted. We must improve unstable sort, not make stable sort the default. Andrei
Nov 16 2013
parent reply "Xinok" <xinok live.com> writes:
On Saturday, 16 November 2013 at 20:35:21 UTC, Andrei 
Alexandrescu wrote:
 On 11/16/13 11:46 AM, Xinok wrote:
 * Regardless of these improvements, I think Timsort should be 
 the
 default sorting algorithm. It's the better choice in many 
 (most?) cases
 and, well, it's stable. Quick sort would still be available 
 for those
 cases in which it's faster and stable sorting isn't needed.
There's something fishy about a more restricted algorithm doing better than one that's less restricted. We must improve unstable sort, not make stable sort the default. Andrei
Timsort is an "adaptive" sort. It's able to achieve better performance in many cases by exploiting patterns in the data.

I decided to make a nice little test using my benchmark code here:
https://github.com/Xinok/XSort/blob/master/benchsort.d

This is the permutation I designed for the test:

static uint[] base;
base.length = 1024 * 1024;
foreach(i, ref v; base) v = i;

foreach(i; 0..1024)
{
    auto rand()
    {
        rndGen.popFront();
        return rndGen.front % base.length;
    }

    switch(rand() % 3)
    {
        // Swap two ranges of elements
        case 0:
        {
            auto arr = [rand(), rand(), rand(), rand()];
            sort(arr);
            swapRanges(base[arr[0]..arr[1]], base[arr[2]..arr[3]]);
        } break;

        // Reverse a range of elements
        case 1:
        {
            auto a = rand(), b = rand();
            if(a > b) swap(a, b);
            reverse(base[a..b]);
        } break;

        // Shuffle a small range of elements
        case 2:
        {
            auto a = rand();
            while(a < 1024) a = rand();
            assert(a >= 1024);
            randomShuffle(base[a - 1024 .. a]);
        } break;

        default: assert(0);
    }
}

And the results (last number is predicate calls):

Current Unstable Sort   50ms   32783474
New Unstable Sort       69ms   21503542
Timsort                 35ms    3905887

That's why I suggest making Timsort the default sorting algorithm.
Nov 16 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 2:11 PM, Xinok wrote:
 On Saturday, 16 November 2013 at 20:35:21 UTC, Andrei Alexandrescu wrote:
 On 11/16/13 11:46 AM, Xinok wrote:
 * Regardless of these improvements, I think Timsort should be the
 default sorting algorithm. It's the better choice in many (most?) cases
 and, well, it's stable. Quick sort would still be available for those
 cases in which it's faster and stable sorting isn't needed.
There's something fishy about a more restricted algorithm doing better than one that's less restricted. We must improve unstable sort, not make stable sort the default. Andrei
Timsort is an "adaptive" sort. It's able to achieve better performance in many cases by exploiting patterns in the data.
This is in response to what? Are you trying to talk me out of the pigeonhole principle?
 I decided to make a nice little test using my benchmark code here:
 https://github.com/Xinok/XSort/blob/master/benchsort.d
I understand and appreciate that Timsort is a nicely optimized algorithm. This has nothing to do with it doing more work than an unstable sort that is just as nicely optimized.

At the end of the day whatever stable sorting algorithms are out there, their set is included in the set of all sorting algorithms. In order to preserve stability, stable sorting algorithms need to do nonnegative extra work.

Andrei
Nov 16 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Saturday, 16 November 2013 at 23:40:39 UTC, Andrei 
Alexandrescu wrote:
 This is in response to what? Are you trying to talk me out of 
 the pigeonhole principle?
...
 I understand and appreciate that Timsort is a nicely optimized 
 algorithm. This has nothing to do with it doing more work than 
 an unstable sort that is just as nicely optimized.

 At the end of the day whatever stable sorting algorithms are 
 out there, their set is included in the set of all sorting 
 algorithms. In order to preserve stability, stable sorting 
 algorithms need to do nonnegative extra work.


 Andrei
I think it's more complicated than that. Let's assume for a moment that you've proven that such an unstable sort must exist that is faster (I'm not convinced that it necessarily must take extra work to maintain stability). You have not shown how much faster it might be (it could be only 1% faster) nor how much work it would take to discover (even an ideal pivot choice for quicksort actually cannot be as fast as Timsort on an already sorted sequence, and quicksort is not an appropriate algorithm for accepting presorted subsequences). If it takes 2 years to come up with an unstable sort faster than Timsort, then it seems like a long time to be using something that isn't the best that we currently have. Especially if D is to be used in benchmarks against other languages.

Timsort is implemented now and has no real cost to use as default. If, at some point, an unstable sort is discovered that is faster on average than Timsort AND (extremely important) it has no potential for an attacker to be able to choose a sequence to sort that causes poor performance, then unstable could be switched back as the default at some point.

Right now, this could, in fact, cause a security hole (DOS-type attack) depending on how it is used. I think it's better to use a safer default in this case, even if we could make it faster than Timsort somehow.
Nov 16 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 4:18 PM, Chris Cain wrote:
 I think it's more complicated than that. Let's assume for a moment that
 you've proven that such an unstable sort must exist that is faster (I'm
 not convinced that it necessarily must take extra work to maintain
 stability).
Very simple. Any sequence with a large number of equivalent elements will cause timsort to work more than an unstable algorithm, because it would need to shift around those elements. (Consider e.g. 1, 0, 0, 0, ..., 0.) The discussion then shifts to how many equivalent elements are in the typical range etc.
 You have not shown how much faster it might be (it could be
 only 1% faster) nor how much work it would take to discover (even an
 ideal pivot choice for quicksort actually cannot be as fast as Timsort
 on an already sorted sequence, and quicksort is not an appropriate
 algorithm for accepting presorted subsequences).
There are well known tweaks to quicksort that make it operate well on sorted or almost sorted sequences.
 If it takes 2 years to
 come up with an unstable sort faster than Timsort, then it seems like a
 long time to be using something that isn't the best that we currently
 have. Especially if D is to be used in benchmarks against other languages.

 Timsort is implemented now and has no real cost to use as default. If,
 at some point, an unstable sort is discovered that is faster on average
 than Timsort AND (extremely important) it has no potential for an
 attacker to be able to choose a sequence to sort that causes poor
 performance, then unstable could be switched back as the default at some
 point.
We could do that, but that would be masking a bug instead of fixing it. We should instead focus on fixing unstable sort. One additional issue with making stable sort the default for a while is that people will grow to assume sort() is stable, and will have broken code when we switch back. Let's fix unstable sort. If it's slower on any other data than Timsort's happy case, we have a bug. Andrei
Nov 16 2013
next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 00:56:46 UTC, Andrei Alexandrescu 
wrote:
 Very simple. Any sequence with a large number of equivalent 
 elements will cause timsort to work more than an unstable 
 algorithm, because it would need to shift around those 
 elements. (Consider e.g. 1, 0, 0, 0, ..., 0.) The discussion 
 then shifts to how many equivalent elements are in the typical 
 range etc.
Ah, very nicely described. So a stable sort can take more time than a theoretical unstable sort when there is more than one equal element.
 There are well known tweaks to quicksort that make it operate 
 well on sorted or almost sorted sequences.
I'd have to see them to comment on them. I'd doubt that they only use quicksort if they support taking advantage of large parts of the array being sorted. I'd expect some sort of merging to occur in that case.
 We could do that, but that would be masking a bug instead of 
 fixing it. We should instead focus on fixing unstable sort. One 
 additional issue with making stable sort the default for a 
 while is that people will grow to assume sort() is stable, and 
 will have broken code when we switch back.

 Let's fix unstable sort. If it's slower on any other data than 
 Timsort's happy case, we have a bug.
Sure, but could we at least agree on a timescale? If we can't come up with something significantly better than Timsort in the next year, it seems pointless to hold out for something that may never come. Hidden bug or not, deliberately exposing a bug that has no effective plan to be fixed isn't a good tactic either. Especially if D is expected to be competitive with other native languages in speed. Seeing D falter in the "sort" column of a shootout could be disconcerting to some, and we can't count on everyone knowing to specify stable sort in those head-to-heads until we work out improving unstable sort.

That said, your point that people might grow comfortable with sort being stable is a good one. Unfortunately, I don't have an answer for that. Documentation detailing that sort is not guaranteed a particular stability is about the best that could be done. Perhaps a variant of Timsort could be prepared that deliberately changes the order of equal elements. Designed instability could suffice as a stopgap against that (at least until the unstable sort is worked out).
Nov 16 2013
prev sibling parent reply "Xinok" <xinok live.com> writes:
On Sunday, 17 November 2013 at 00:56:46 UTC, Andrei Alexandrescu 
wrote:
 On 11/16/13 4:18 PM, Chris Cain wrote:
 You have not shown how much faster it might be (it could be
 only 1% faster) nor how much work it would take to discover 
 (even an
 ideal pivot choice for quicksort actually cannot be as fast as 
 Timsort
 on an already sorted sequence, and quicksort is not an 
 appropriate
 algorithm for accepting presorted subsequences).
There are well known tweaks to quicksort that make it operate well on sorted or almost sorted sequences.
If you tweak for one scenario, you're only losing in other scenarios. Timsort already performs ideally in these cases, as I've demonstrated. Rather than optimizing quicksort for cases in which Timsort will still be the victor, we should optimize for those cases in which quicksort is already faster.
Nov 16 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 6:42 PM, Xinok wrote:
 On Sunday, 17 November 2013 at 00:56:46 UTC, Andrei Alexandrescu wrote:
 On 11/16/13 4:18 PM, Chris Cain wrote:
 You have not shown how much faster it might be (it could be
 only 1% faster) nor how much work it would take to discover (even an
 ideal pivot choice for quicksort actually cannot be as fast as Timsort
 on an already sorted sequence, and quicksort is not an appropriate
 algorithm for accepting presorted subsequences).
There are well known tweaks to quicksort that make it operate well on sorted or almost sorted sequences.
If you tweak for one scenario, you're only losing in other scenarios.
I am not so sure about that.
 Timsort already performs ideally in these cases, as I've demonstrated.
Yes, it is also my understanding that Timsort does well on almost sorted data. The issue is one does not know a priori what the distribution of the data is.
 Rather than optimizing quicksort for cases in which Timsort will still
 be the victor, we should optimize for those cases in which quicksort is
 already faster.
We should have a "good unstable sort" in addition to the "good stable sort" we already have. Seeing this as a competition between quicksort and timsort would be the wrong frame. Probably a good step forward would be to hook a sort benchmarking corpus to our unittests. What are the consecrated corpora? Andrei
Nov 16 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 04:14:22 UTC, Andrei Alexandrescu 
wrote:
 Probably a good step forward would be to hook a sort 
 benchmarking corpus to our unittests. What are the consecrated 
 corpora?
I'll kick off the suggestions with this:

File to provide "real" data:
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779/NC_000913.fna

From http://www.genome.jp/kegg-bin/show_organism?org=eco -> RefSeq -> NC_000913.fna

(despite this being real data, typically you don't actually sort genomes like this ... it's simply "more interesting" than a pseudo-randomized dataset)

Code:
---
import std.range, std.algorithm, std.stdio, std.array, std.conv, std.datetime;

enum strategy = SwapStrategy.stable;

void main()
{
    auto arr = File("NC_000913.fna").byLine
        .drop(1) // Trim the first line because it's just metadata
        /*.take(50)*/
        .join
        .chunks(50)
        .map!text
        .array;

    StopWatch sw = StopWatch(AutoStart.yes);
    arr.sort!("a < b", strategy)();
    sw.stop();

    writeln(text(arr.length, " elements sorted in ", sw.peek.msecs, " msecs"));
}
---

On my system, this shows unstable sort to be ~2x faster than stable sort (which is to be expected of data with no discernible patterns). The "key" thing this is testing is the effectiveness of the sort on data that often takes longer to compare (with such a small alphabet, many strings have a common prefix).

That said, it might also be reproduced "well enough" using a random generator to create similar strings to sort, but the basic idea is there. I just like using real genomes for performance testing things :)
Nov 16 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 9:21 PM, Chris Cain wrote:
 That said, it might also be reproduced "well enough" using a random
 generator to create similar strings to sort, but the basic idea is
 there. I just like using real genomes for performance testing things :)
I am hoping for some more representative corpora, along the lines of http://sortbenchmark.org/. Some data that we can use as good proxies for typical application usage. Andrei
Nov 16 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 07:19:26 UTC, Andrei Alexandrescu 
wrote:
 On 11/16/13 9:21 PM, Chris Cain wrote:
 That said, it might also be reproduced "well enough" using a 
 random
 generator to create similar strings to sort, but the basic 
 idea is
 there. I just like using real genomes for performance testing 
 things :)
I am hoping for some more representative corpora, along the lines of http://sortbenchmark.org/. Some data that we can use as good proxies for typical application usage. Andrei
I think I get what you're saying, but sortbenchmark.org uses completely pseudorandom (but reproducible) entries that I don't think are representative of real data either (using gensort -a minus the verification columns):

---
AsfAGHM5om
~sHd0jDv6X
uI^EYm8s=|
Q)JN)R9z-L
o4FoBkqERn
*}-Wz1;TD-
0fssx}~[oB
...
---

Most places use very fake data as proxies for real data. It's better to have something somewhat structured and choose data that, despite not being real data, stresses the benchmark in a unique way.

I'm not suggesting my benchmark be the only one; if we're going to use pseudorandom data (I'm not certain we could actually get "realistic data" that would serve us that much better) we might as well have different test cases that stress the sort routine in different ways. Obviously, using the real genome would be preferable to generating some (since it's actually truly "real" data, just used in an unorthodox way) but there's a disadvantage to attaching a 4.6MB file to a benchmarking setup. Especially if more might come.

Anyway, it's a reasonable representation of "data that has no discernible order that can occasionally take some time to compare." Think something like sorting a list of customer records by name. If they're ordered by ID, then the names would not likely have a discernible pattern and the comparison between names might be "more expensive" because some names can be common.

We can do "more realistic" for that type of scenario, if you'd like. I could look at a distribution for last names/first names and generate fake names to sort in a reasonable approximation of a distribution of real names. I'm not certain the outcome would change that much.
Nov 17 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/17/13 6:20 AM, Chris Cain wrote:
 I'm not suggesting my benchmark be the only one; if we're going to use
 pseudorandom data (I'm not certain we could actually get "realistic
 data" that would serve us that much better) we might as well have
 different test cases that stress the sort routine in different ways.
 Obviously, using the real genome would be preferable to generating some
 (since it's actually truly "real" data, just used in an unorthodox way)
 but there's a disadvantage to attaching a 4.6MB file to a benchmarking
 setup. Especially if more might come.
OK, since I see you have some interest... You said nobody would care to actually sort genome data. I'm aiming for data that's likely to be a good proxy for tasks people are interested in doing.

For example, people may be interested in sorting floating-point numbers resulting from sales, measurements, frequencies, probabilities, and whatnot. Since most of those have a Gaussian distribution, a corpus with Gaussian-distributed measurements would be nice.

Then, people may want to sort things by date/time. Depending on the scale the distribution is different - diurnal cycle, weekly cycle, seasonal cycle, secular ebbs and flows etc. I'm unclear on what would be a good set of data. For sub-day time ranges uniform distribution may be appropriate.

Then, people may want to sort records by e.g. Lastname, Firstname, or index a text by words. For names we'd need some census data or phonebook. For general text sorting we can use classic texts such as Alice in Wonderland or the King James Bible (see http://corpus.canterbury.ac.nz/descriptions/). Sorting by word length is a possibility (word lengths are probably Gaussian-distributed).

Uniform random data is also a baseline, not terribly representative, but worth keeping an eye on. In fact uniform data is unfairly rigged in quicksort's favor: any pivot is likely to be pretty good, and there are no sorted runs that often occur in real data.

Andrei
Nov 17 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 16:57:19 UTC, Andrei Alexandrescu 
wrote:
 Then, people may want to sort records by e.g. Lastname, 
 Firstname, or index a text by words. For names we'd need some 
 census data or phonebook.
Sure. As a source of data, app_c.csv from http://www.census.gov/genealogy/www/data/2000surnames/ -> File B: Surnames...

Test case:
---
import std.stdio, std.algorithm, std.random, std.range, std.conv, std.datetime;

// Name generator constants
enum NumberOfPossibleNames = 1_000_000;
enum NumberOfRandomNames = 1_000_000;
enum SortStrategy = SwapStrategy.unstable;

void main()
{
    auto data = File("app_c.csv").byLine
        .drop(1) // First line is just column names
        .take(NumberOfPossibleNames)
        .map!(e => e.text.split(","))
        .array;

    auto names = data.map!(e => e[0]).array;
    auto proportions = data.map!(e => e[2].to!size_t).array;

    auto rnd = Random(50);
    auto randomNames = fastDice(rnd, proportions)
        .take(NumberOfRandomNames)
        .map!(i => names[i])
        .array;

    StopWatch sw = AutoStart.yes;
    randomNames.sort!("a < b", SortStrategy)();
    sw.stop();

    writeln(randomNames.length, " names sorted in ", sw.peek.msecs,
            " msecs using ", SortStrategy, " sort");
}

struct FastDice(Rng, SearchPolicy pol = SearchPolicy.gallop)
{
    SortedRange!(size_t[]) _propCumSumSorted;
    size_t _sum;
    size_t _front;
    Rng* _rng;

    this(ref Rng rng, size_t[] proportions)
    {
        size_t[] _propCumSum = proportions.save.array;
        _rng = &rng;

        size_t mass = 0;
        foreach(ref e; _propCumSum)
        {
            mass += e;
            e = mass;
        }

        _sum = _propCumSum.back;
        _propCumSumSorted = assumeSorted(_propCumSum);
        popFront();
    }

    void popFront()
    {
        immutable point = uniform(0, _sum, *_rng);
        assert(point < _sum);
        _front = _propCumSumSorted.lowerBound!pol(point).length;
    }

    enum empty = false;

    @property auto front()
    {
        return _front;
    }
}

auto fastDice(Rng)(ref Rng rng, size_t[] proportions)
{
    return FastDice!(Rng)(rng, proportions);
}

auto fastDice(size_t[] proportions)
{
    return fastDice(rndGen, proportions);
}
---

Results (using `-O -inline -release -noboundschecks` on my computer):

unstable sort: 738 msecs
stable sort: 1001 msecs

Also, to do this (in reasonable time) I had to create a new range which I called "FastDice" ... it does the same as std.random.dice, but is intended for cases where you'll be throwing dice throws often on the same data, so it does a bit of precomputation (creating a cumulative sum array) and allows for binary search/gallop to reduce the time to come up with each throw. I opted for gallop in this since the data is sorted in such a way that most common names come first.

It needs a bit of work to make it actually generic, but it's a good start and I'll just say it's WTFPL code, so use it for whatever. It'll be especially good for generating test cases like I have done above.

FWIW, FastDice takes ~400ms to generate all those dice throws. I did a back-of-the-envelope calculation on using dice, and just the time saved caching the sum saved maybe 30 minutes (No, I didn't wait that long, I stopped after 5 and wrote FastDice) of time each run. And the time saved using a gallop search instead of linear search is pretty significant as well (difference between n and log n time).
Nov 17 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/17/13 1:23 PM, Chris Cain wrote:
 On Sunday, 17 November 2013 at 16:57:19 UTC, Andrei Alexandrescu wrote:
 Then, people may want to sort records by e.g. Lastname, Firstname, or
 index a text by words. For names we'd need some census data or phonebook.
Sure. As a source of data, app_c.csv from http://www.census.gov/genealogy/www/data/2000surnames/ -> File B: Surnames...
[snip]
 It needs a bit of work to make it actually generic, but it's a good
 start and I'll just say it's WTFPL code, so use it for whatever. It'll
 be especially good for generating test cases like I have done above.

 FWIW, FastDice takes ~400ms to generate all those dice throws. I did a
 back-of-the-envelope calculation on using dice, and just the time saved
 caching the sum saved maybe 30 minutes (No, I didn't wait that long, I
 stopped after 5 and wrote FastDice) of time each run. And the time saved
 using a gallop search instead of linear search is pretty significant as
 well (difference between n and log n time).
I think that's a terrific start! (Not sure I understand where the 30 minutes come from...) Andrei
Nov 17 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Monday, 18 November 2013 at 02:36:40 UTC, Andrei Alexandrescu 
wrote:
 I think that's a terrific start!
Thanks :)
 (Not sure I understand where the 30 minutes come from...)
On my computer (`-release -inline -noboundscheck` ... `-O` is omitted because it removes the summation altogether since it isn't used anywhere):

---
auto benchmarkSumming()
{
    auto proportions = iota(150_000);
    // copied from std.random.dice:
    auto sum = reduce!("(assert(b >= 0), a + b)")(0.0, proportions.save);
}

auto bench = benchmark!benchmarkSumming(1_000);
writeln(bench.front.msecs, " ms");
---

Gives me 1618 ms. ~150_000 is the number of entries in the probability table (number of surnames), and generating the sum for 1_000 names would take about that long. I really wanted 1 million names to sort, so scaling that up would mean 1618 seconds, which is nearly 27 minutes. That summation doesn't change, hence the importance of caching it for repeated runs. That's about 27 minutes of completely wasted work.

That all isn't to say that std.random.dice is fundamentally flawed (it's fine and perfectly suited for one-shot dice rolls and is, in fact, a bit better than a DiceRange for that task) but a DiceRange is really needed to efficiently do repeated rolls like this.

Jeez, we're talking about maximizing sorting efficiency, and now I've gone off talking about making dice rolls faster too! :)
Nov 17 2013
parent "David Nadlinger" <code klickverbot.at> writes:
On Monday, 18 November 2013 at 05:26:30 UTC, Chris Cain wrote:
 On my computer (`-release -inline -noboundscheck` ... `-O` is 
 omitted because it removes the summation all together since it 
 isn't used anywhere):
The better way to avoid this problem is usually to compile the operation to be benchmarked separately, in such a way that the result can't be ignored (e.g. by making it read the parameters from function arguments and returning the result as the return value). Comparing non-optimized (i.e. without -O) builds for execution speed is of questionable relevance for most cases.

David
Dec 04 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Chris Cain:

 Also, to do this (in reasonable time) I had to create a new 
 range which I called "FastDice" ... it does the same as 
 std.random.dice, but is intended for cases where you'll be 
 throwing dice throws often on the same data, so it does a bit 
 of precomputation (creating a cumulative sum array)
See also: http://d.puremagic.com/issues/show_bug.cgi?id=5849 Bye, bearophile
Nov 17 2013
parent "Chris Cain" <clcain uncg.edu> writes:
On Monday, 18 November 2013 at 03:10:22 UTC, bearophile wrote:
 See also:
 http://d.puremagic.com/issues/show_bug.cgi?id=5849

 Bye,
 bearophile
Nice one, bearophile. You called it.

I've done a bit of testing with the solution you've attached to that bug report. It's 10-50% faster than my solution depending on how many proportions are provided (I've tested as much as 500_000). Despite mine being O(log(n)) to find the next dice roll and your solution being O(1), there really isn't that much of a difference. Strangely (to me), your solution actually seems to grow faster than mine as the number of proportions grows larger despite the proof claiming it should grow slower (although testing for > 500_000 seems pointless until actual data demands it).

As far as integer proportions go, I think my solution might be better overall because it's effectively equivalent to the current dice implementation (meaning it could be used as a near drop-in replacement and get precisely the same results, as long as the proportions are floating point values) and the code is simpler and avoids hiding any real bugs. (That said, the original implementation has a bug that was easy to spot on my second comb-through)

---
in popFront:

// Return the number of elements that are not strictly greater than point
_front = _propCumSumSorted.length - _propCumSumSorted.upperBound!pol(point).length;
---

On the other hand, a cumulative sum array approach (like mine and the std.random.dice implementation) will likely suffer major precision issues with floating point numbers. Especially if the first proportions are very large and the later proportions are very small. I haven't closely looked through the algorithm you're using to see whether it fares better, but I can't imagine it doing much worse.
Nov 17 2013
prev sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Sunday, 17 November 2013 at 00:18:24 UTC, Chris Cain wrote:
 I think it's more complicated than that. Let's assume for a 
 moment that you've proven that such an unstable sort must exist 
 that is faster (I'm not convinced that it necessarily must take 
 extra work to maintain stability). You have not shown how much 
 faster it might be (it could be only 1% faster) nor how much 
 work it would take to discover (even an ideal pivot choice for 
 quicksort actually cannot be as fast as Timsort on an already 
 sorted sequence, and quicksort is not an appropriate algorithm 
 for accepting presorted subsequences). If it takes 2 years to 
 come up with an unstable sort faster than Timsort, then it 
 seems like a long time to be using something that isn't the 
 best that we currently have. Especially if D is to be used in 
 benchmarks against other languages.
Regarding an ideal pivot choice for quicksort, I'd like to emphasize that it is simply non-existent. Here's why.

Let us measure quicksort performance as the number of comparisons it makes.

Let us make some assumptions: quicksort goes as
--(1) choose a pivot p in O(1) comparisons;
--(2) partition the segment into two;
--(3) run quicksort recursively on the two resulting segments.

Now, (2) is effectively done by comparing every element with p, thus in Theta(n) (on the order of n) comparisons, where n is the length of the current segment.

Under these assumptions, we can construct the killer array for an arbitrary quicksort of such kind, even if we don't know how exactly the pivot is chosen (say, closed source or obfuscation), but we have the following interface to it: instead of accessing the array a[], quicksort calls the function less(i,j) which tells it whether a[i] < a[j], and we control that function.

Now, we start with all values in array a[] undefined. With each a[i], we associate a counter c[i]: how many times it took part in a comparison. As we can see from the above, during a single call to quicksort, in steps (1) and (2) the pivot will eventually get Theta(n) comparisons, while other elements will all get only O(1) comparisons each. So here's what we'll do.

1. When a comparison asks to relate two undefined values, we pick one of them with the larger c[i] and make it the lowest number possible. That is, the first fixed value will be 0, the next 1, and so on. (In the end, a[] will be a permutation of 0, 1, ..., n-1.)

2. When we are asked to relate a defined value to an undefined one, the defined value will always be lower (because, when we eventually define the undefined one, it will be higher than all the values defined before it).

3. When we are asked about two known values, just tell the truth.

This simple algorithm ensures that, on each call to quicksort, we quickly fix the pivot as one of the O(1) lowest values on the segment. So, one of the next two segments will have length n-O(1), and the recursion gives us Theta(n) segments of linearly decreasing lengths, and thus Theta(n^2) total running time.

Now, the assumption of picking a pivot in O(1) comparisons covers a broad variety of pivot choices, including first/last/middle/random element, median of three or five, median of medians, or any combination of these. The random pick fails in the following sense: if we seed the RNG, construct a killer case, and then start with the same seed, we get Theta(n^2) behavior reproduced. Double-pivot quicksort can be forced to go quadratic in a similar fashion.

Now, introsort escapes this fate by switching to a guaranteed-n-log-n algorithm instead of (3) when it goes too deep. This comes at a (small) cost of checking the current depth on every call to quicksort.

The above is just my retelling of a great short article, "A Killer Adversary for Quicksort" by M. D. McIlroy:
http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
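If anyone wants to play with it, here is a rough sketch in D of such an adversary driving std.algorithm.sort (my own toy code, not from the article; the size and names are arbitrary):

---
import std.algorithm : sort;
import std.array : array;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    enum n = 10_000;
    auto idx = iota(0, n).array;  // the sort only ever sees these indices
    auto val = new int[n];
    val[] = -1;                   // -1 means "still undefined"
    auto cnt = new long[n];       // comparison counters c[i]
    int next = 0;                 // next value to hand out (0, 1, 2, ...)
    size_t comparisons = 0;

    bool less(int i, int j)
    {
        ++comparisons;
        ++cnt[i];
        ++cnt[j];
        if (val[i] < 0 && val[j] < 0)
        {
            // Rule 1: freeze the more-compared element as the lowest unused value.
            if (cnt[i] >= cnt[j]) val[i] = next++;
            else                  val[j] = next++;
        }
        // Rules 2 and 3: a still-undefined value compares greater than any
        // defined one; two defined values are compared truthfully.
        long vi = val[i] < 0 ? n + cast(long) i : val[i];
        long vj = val[j] < 0 ? n + cast(long) j : val[j];
        return vi < vj;
    }

    sort!(less)(idx);             // drive the library sort through the adversary

    foreach (i; 0 .. n)           // slots never fixed get the largest values
        if (val[i] < 0) val[i] = next++;

    writefln("%s comparisons for n = %s; val[] is the killer permutation",
             comparisons, n);
    // Feeding val[] back into the same deterministic sort reproduces the blow-up.
}
---

Ivan Kazmenko.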
Nov 16 2013
next sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Sunday, 17 November 2013 at 01:07:20 UTC, Ivan Kazmenko wrote:
 Now, the assumption of picking a pivot in O(1) comparisons 
 covers a broad variety of pivot choices, including 
 first/last/middle/random element, median of three or five, 
 median of medians, or any combination of these.  The random 
 pick fails in the following sense: if we seed the RNG, 
 construct a killer case, and then start with the same seed, we 
 get Theta(n^2) behavior reproduced.
Correction: this does *not* cover the median-of-medians case [1]. Median of medians does not fall under our assumption that selecting a pivot takes O(1) comparisons. It actually guarantees O(n log n) worst-case performance, but of course at the cost of being slower than a typical pivot selection in the average case.

[1] http://en.wikipedia.org/wiki/Median_of_medians
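For illustration, here is a rough sketch of that selection (names are mine; allocation-happy and not tuned at all). Using it to pick the true median as the pivot is what yields the worst-case guarantee:

---
import std.algorithm : sort;

// k-th smallest element (0-based) via the classic median-of-medians selection.
// Worst-case linear time, but with a noticeable constant factor.
int kthSmallest(int[] a, size_t k)
{
    if (a.length <= 5)
    {
        auto tmp = a.dup;
        sort(tmp);
        return tmp[k];
    }

    // Median of every group of (up to) five elements.
    auto medians = new int[(a.length + 4) / 5];
    foreach (g; 0 .. medians.length)
    {
        auto lo = g * 5;
        auto hi = lo + 5 < a.length ? lo + 5 : a.length;
        auto group = a[lo .. hi].dup;
        sort(group);
        medians[g] = group[$ / 2];
    }

    // The pivot is the median of those medians, found recursively.
    auto pivot = kthSmallest(medians, medians.length / 2);

    // Three-way partition (out of place, purely for clarity).
    int[] smaller, equal, larger;
    foreach (x; a)
    {
        if (x < pivot) smaller ~= x;
        else if (x > pivot) larger ~= x;
        else equal ~= x;
    }

    if (k < smaller.length) return kthSmallest(smaller, k);
    if (k < smaller.length + equal.length) return pivot;
    return kthSmallest(larger, k - smaller.length - equal.length);
}

// Median as a pivot: kthSmallest(a, a.length / 2).
---

Ivan Kazmenko.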
Nov 16 2013
prev sibling next sibling parent reply "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 01:07:20 UTC, Ivan Kazmenko wrote:
 Regarding an ideal pivot choice for quicksort, I'd like to 
 emphasize that it is simply non-existent.  Here's why.
Oh, of course. I did that in an algorithms class taught by a professor who is into cryptography. I agree that there is essentially no way to pick pivots that avoids worst-case performance when an attacker is involved. When I mentioned an "ideal pivot", I meant an actually good one on a sorted sequence.

Given [1,2,3,4,5,6,7], picking the best pivot, 4, would result in scanning the entire array to assure that it is partitioned appropriately around the 4 (if a quicksort were designed wise to this sort of trick, but most, including D's quicksort, would actually shuffle everything around). By that point, Timsort is already done and drinking the victory champagne. And the quicksort still has to recur on the subsequences to sort them as well!
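If anyone wants to see it for themselves, here's a quick-and-dirty timing sketch on already-sorted input (timings will obviously vary with machine and flags):

---
import std.algorithm : sort, SwapStrategy;
import std.array : array;
import std.datetime : benchmark;
import std.range : iota;
import std.stdio : writefln;

int[] data;

void runUnstable() { sort!("a < b", SwapStrategy.unstable)(data.dup); }
void runStable()   { sort!("a < b", SwapStrategy.stable)(data.dup); }  // Timsort

void main()
{
    data = iota(1_000_000).array;   // already-sorted input
    auto results = benchmark!(runUnstable, runStable)(10);
    writefln("unstable: %s ms, stable (Timsort): %s ms",
             results[0].msecs, results[1].msecs);
}
---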
Nov 16 2013
next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On Sunday, 17 November 2013 at 01:39:43 UTC, Chris Cain wrote:
 (if a quicksort were designed wise to this sort of trick, but 
 most, including D's quicksort, would actually shuffle 
 everything around)
Actually, not everything. It'd swap the pivot and the last element (4 and 7 for the first iteration) and then swap it back after it figures out the two partitions are good. There was a case where it ended up swapping everything around unnecessarily, but it wasn't this case... But I digress, the point remains that it would be slower than Timsort despite having a good pivot.
Nov 16 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 5:39 PM, Chris Cain wrote:
 Given [1,2,3,4,5,6,7]
 Picking the best pivot, 4 would result in scanning the entire array to
 assure that it is partitioned appropriately around the 4 (if a quicksort
 were designed wise to this sort of trick, but most, including D's
 quicksort, would actually shuffle everything around).
I keep on thinking of a sort with delayed pivot selection that starts scanning the array from left and right and only stops when it figures it's unpartitioned. At that point it proceeds with pivot selection, but if the array is already sorted there's no need to partition the left and right portions that were already explored. Essentially the pivot selection would be preceded by a sampling of the left and right of the array.
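Something like this for the sampling step (just a sketch of that part, not the partitioning logic that would follow):

---
import std.functional : binaryFun;

// How far is the range already non-decreasing from the left / from the right?
size_t sortedPrefix(alias less = "a < b", T)(T[] r)
{
    if (r.length < 2) return r.length;
    size_t i = 1;
    while (i < r.length && !binaryFun!less(r[i], r[i - 1])) ++i;
    return i;                      // r[0 .. i] is already sorted
}

size_t sortedSuffix(alias less = "a < b", T)(T[] r)
{
    if (r.length < 2) return 0;
    size_t j = r.length - 1;
    while (j > 0 && !binaryFun!less(r[j], r[j - 1])) --j;
    return j;                      // r[j .. $] is already sorted
}

unittest
{
    assert(sortedPrefix([1, 2, 3, 2]) == 3);
    assert(sortedSuffix([3, 1, 2, 3]) == 1);
}

// If sortedPrefix(r) == r.length, the whole range is sorted and we're done;
// otherwise pick the pivot from the unexplored middle and skip re-partitioning
// the two sampled ends.
---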
 By that point,
 Timsort is already done and drinking the victory champagne. And the
 quicksort still has to recur on the subsequences to sort them as well!
Yah, already sorted data is ideal for Timsort. Andrei
Nov 16 2013
prev sibling next sibling parent reply "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Sunday, 17 November 2013 at 01:07:20 UTC, Ivan Kazmenko wrote:
 On Sunday, 17 November 2013 at 00:18:24 UTC, Chris Cain wrote:
 I think it's more complicated than that. Let's assume for a 
 moment that you've proven that such an unstable sort must 
 exist that is faster (I'm not convinced that it necessarily 
 must take extra work to maintain stability). You have not 
 shown how much faster it might be (it could be only 1% faster) 
 nor how much work it would take to discover (even an ideal 
 pivot choice for quicksort actually cannot be as fast as 
 Timsort on an already sorted sequence, and quicksort is not an 
 appropriate algorithm for accepting presorted subsequences). 
 If it takes 2 years to come up with an unstable sort faster 
 than Timsort, then it seems like a long time to be using 
 something that isn't the best that we currently have. 
 Especially if D is to be used in benchmarks against other 
 languages.
 Regarding an ideal pivot choice for quicksort, I'd like to 
 emphasize that it is simply non-existent.  Here's why.
 
 (snip)
 
 The above is just my retelling of a great short article "A 
 Killer Adversary for Quicksort" by M. D. McIlroy here:
 http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
 
 Ivan Kazmenko.
Woah, that sounds complicated. I'm not 100% sure I entirely understand, but it seems to rely on changing the elements while you sort. How would that work in real life? For example, how would this defeat the following pivoting method?

1. Find the median value in the array. This can be done deterministically in linear time, so from a theoretical point of view it is just as fast as picking a random element since you use linear time.

2. Use that as the pivot.
Nov 16 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 17 November 2013 at 01:48:14 UTC, Craig Dillabaugh 
wrote:
 1. Find the median value in the array. This can be done
 deterministically in linear time,
My understanding is that for unordered data, there is no algorithm that runs in worst-case O(n):

http://en.wikipedia.org/wiki/Selection_algorithm
http://en.wikipedia.org/wiki/Quickselect
 so from a theoretical point of
 view is just as fast as picking a random element since you use
 linear time.
Picking a random element is constant time, though. You mean that O(1) and O(n) are equivalent here because they're both smaller than O(n log n), QuickSort's complexity?
Nov 16 2013
parent reply "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Sunday, 17 November 2013 at 02:43:05 UTC, Vladimir Panteleev
wrote:
 On Sunday, 17 November 2013 at 01:48:14 UTC, Craig Dillabaugh 
 wrote:
 1. Find the median value in the array. This can be done
 deterministically in linear time,
My understanding that for unordered data, there is no algorithm that runs in worst-case O(n): http://en.wikipedia.org/wiki/Selection_algorithm http://en.wikipedia.org/wiki/Quickselect
Unless I misunderstand something, I am pretty sure there is a deterministic algorithm for finding the median in unsorted data:

http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s07/www/lecture_notes/lect0125.pdf
 so from a theoretical point of
 view is just as fast as picking a random element since you use
 linear time.
Picking a random element is constant time, though. You mean that O(1) and O(n) are equivalent here because they're both smaller than O(n log n), QuickSort's complexity?
Yeah, since you have to do O(n) work to partition the array, whether you do O(1) or O(n) work to find your pivot really doesn't matter. From a theoretical perspective, anyway :o)
Nov 17 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 17 November 2013 at 08:33:11 UTC, Craig Dillabaugh 
wrote:
 http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s07/www/lecture_notes/lect0125.pdf
Nice! Didn't know about this - although still seems to be a lot of work for each 'n'.
Nov 17 2013
prev sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Sunday, 17 November 2013 at 01:48:14 UTC, Craig Dillabaugh 
wrote:
 On Sunday, 17 November 2013 at 01:07:20 UTC, Ivan Kazmenko 
 wrote:
 Regarding an ideal pivot choice for quicksort, I'd like to 
 emphasize that it is simply non-existent.  Here's why.

 Let us measure quicksort performance as the number of 
 comparisons it makes.

 Let us make some assumptions:
 quicksort goes as
 --(1) choose a pivot p in O(1) comparisons;
 --(2) partition the segment into two;
 --(3) run quicksort recursively on the two resulting segments.

 Now, (2) is effectively done by comparing every element with 
 p, thus in Theta(n) (on the order of n) comparisons where n is 
 the length of the current segment.

 Under these assumptions, we can construct the killer array for 
 an arbitrary quicksort of such kind, even if we don't know how 
 exactly the pivot is chosen (say, closed source or 
 obfuscation), but we have the following interface to it:

 Instead of accessing the array a[], quicksort calls the 
 function less(i,j) which tells it whether a[i]<a[j], and we 
 control that function.

 (snip)
 Woah, that sounds complicated. Not 100% sure I entirely 
 understand but it seems to rely on changing the elements while 
 you sort. How would that work in real life? For example, how 
 would this defeat the following pivoting method.
 
 1. Find the median value in the array. This can be done 
 deterministically in linear time, so from a theoretical point 
 of view is just as fast as picking a random element since you 
 use linear time.
 
 2. Use that as the pivot.
Right, this method (median of medians is an example implementation) will not be defeated by the above algorithm. There's nothing wrong with it, since it takes Theta(n) comparisons to choose the pivot and so violates our attack's working conditions. In practice, however, it means that, in the average case, the constant factor is very likely to be higher than for a quicksort vulnerable to our attack.

Ivan Kazmenko.
Nov 17 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/16/13 5:07 PM, Ivan Kazmenko wrote:
 The above is just my retelling of a great short article "A Killer
 Adversary for Quicksort" by M. D. McIlroy here:
 http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
Nice story, but the setup is a tad tenuous (albeit indeed theoretically interesting). For starters, if the attacker has a hook into the comparison function, they could trivially do a lot worse... Andrei
Nov 16 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Sunday, 17 November 2013 at 03:58:58 UTC, Andrei Alexandrescu 
wrote:
 On 11/16/13 5:07 PM, Ivan Kazmenko wrote:
 The above is just my retelling of a great short article "A 
 Killer
 Adversary for Quicksort" by M. D. McIlroy here:
 http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
Nice story, but the setup is a tad tenuous (albeit indeed theoretically interesting). For starters, if the attacker has a hook into the comparison function, they could trivially do a lot worse...
Actually, I was thinking about a two phase attack, and it is not at all unlikely.

0. Say, the Sorter presents a library quicksort solution. It may be closed source, but can take comparison function as argument.

1. On the preparation phase, the Attacker uses the tricky comparison function described previously. What's important is that, besides observing Theta(n^2) behavior once, he now gets a real array a[] such that this behavior can be reproduced.

2. On the attack phase, the Attacker does not need access to the comparison function. He just feeds the array obtained on the previous step as plain data.

Ivan Kazmenko.
Nov 17 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/17/13 2:20 AM, Ivan Kazmenko wrote:
 On Sunday, 17 November 2013 at 03:58:58 UTC, Andrei Alexandrescu wrote:
 On 11/16/13 5:07 PM, Ivan Kazmenko wrote:
 The above is just my retelling of a great short article "A Killer
 Adversary for Quicksort" by M. D. McIlroy here:
 http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
Nice story, but the setup is a tad tenuous (albeit indeed theoretically interesting). For starters, if the attacker has a hook into the comparison function, they could trivially do a lot worse...
 Actually, I was thinking about a two phase attack, and it is 
 not at all unlikely.
 
 0. Say, the Sorter presents a library quicksort solution. It 
 may be closed source, but can take comparison function as 
 argument.
 
 1. On the preparation phase, the Attacker uses the tricky 
 comparison function described previously. What's important is 
 that, besides observing Theta(n^2) behavior once, he now gets a 
 real array a[] such that this behavior can be reproduced.
 
 2. On the attack phase, the Attacker does not need access to 
 the comparison function. He just feeds the array obtained on 
 the previous step as plain data.
 
 Ivan Kazmenko.
That won't do with randomized pivot selection. Andrei
Nov 17 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:
 The random pick fails in the following sense: if we seed the RNG,
 construct a killer case, and then start with the same seed, we get
 Theta(n^2) behavior reproduced.
Hence, in no sense. This does not perform independent uniform random picks.
Nov 17 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Sunday, 17 November 2013 at 13:07:40 UTC, Timon Gehr wrote:
 On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:
 The random pick fails in the following sense: if we seed the 
 RNG,
 construct a killer case, and then start with the same seed, we 
 get
 Theta(n^2) behavior reproduced.
Hence, in no sense. This does not perform independent uniform random picks.
Not at all. There are a number of situations where you want your program to use an RNG, but it is also important for the result to be reproducible. In such cases, you typically store the RNG seed for re-use.

Of course, there are also many cases where you don't need reproducibility guarantees, and there, the attack is useless.
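For example (a trivial sketch using std.random; storing and re-using the seed is the whole point):

---
import std.random : rndGen, uniform, unpredictableSeed;
import std.stdio : writeln;

void main()
{
    auto seed = unpredictableSeed;   // store/log this to reproduce the run later
    writeln("seed: ", seed);

    rndGen.seed(seed);
    auto firstRun = uniform(0, 100);

    rndGen.seed(seed);               // same seed => same sequence of "random" picks
    auto secondRun = uniform(0, 100);

    assert(firstRun == secondRun);
}
---

Ivan Kazmenko.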
Nov 17 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 11/17/2013 02:14 PM, Ivan Kazmenko wrote:
 On Sunday, 17 November 2013 at 13:07:40 UTC, Timon Gehr wrote:
 On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:
 The random pick fails in the following sense: if we seed the RNG,
 construct a killer case, and then start with the same seed, we get
 Theta(n^2) behavior reproduced.
Hence, in no sense. This does not perform independent uniform random picks.
Not at all. There is a number of situations where you want your program to use RNG, but it is also important for the result to be reproducible. In such cases, you typically store the RNG seed for re-use. Of course there are also many cases where you don't need reproducibility guarantees, and there, the attack is useless. Ivan Kazmenko.
One can't say random picking is bad because one can use some other (deterministic) pivot selection algorithm instead which is bad. If you need deterministic reproducibility guarantees, then random picking is useless.
Nov 17 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 16 November 2013 at 22:11:46 UTC, Xinok wrote:
 And the results (last number is predicate calls):

 Current Unstable Sort  50ms  32783474
 New Unstable Sort      69ms  21503542
 Timsort                35ms  3905887
For the record, I tried both SwapStragegy options with my data (the data that got me to start this thread), and although TimSort did fewer comparisons (predicate calls), it ran about 20% slower.
Nov 16 2013
parent reply "Xinok" <xinok live.com> writes:
On Sunday, 17 November 2013 at 02:44:45 UTC, Vladimir Panteleev 
wrote:
 On Saturday, 16 November 2013 at 22:11:46 UTC, Xinok wrote:
 And the results (last number is predicate calls):

 Current Unstable Sort  50ms  32783474
 New Unstable Sort      69ms  21503542
 Timsort                35ms  3905887
For the record, I tried both SwapStragegy options with my data (the data that got me to start this thread), and although TimSort did fewer comparisons (predicate calls), it ran about 20% slower.
Could you try running a benchmark on the same data using this instead?

https://github.com/Xinok/XSort/blob/master/unstablesort.d

My implementation of quick sort has some odd tweaks which make it slower in some cases, but I've never invoked worst-case behavior by accident.
Nov 16 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 17 November 2013 at 03:14:41 UTC, Xinok wrote:
 On Sunday, 17 November 2013 at 02:44:45 UTC, Vladimir Panteleev 
 wrote:
 On Saturday, 16 November 2013 at 22:11:46 UTC, Xinok wrote:
 And the results (last number is predicate calls):

 Current Unstable Sort  50ms  32783474
 New Unstable Sort      69ms  21503542
 Timsort                35ms  3905887
For the record, I tried both SwapStragegy options with my data (the data that got me to start this thread), and although TimSort did fewer comparisons (predicate calls), it ran about 20% slower.
Could you try running a benchmark on the same data using this instead? https://github.com/Xinok/XSort/blob/master/unstablesort.d My implementation of quick sort has some odd tweaks which makes it slower in some cases, but I've never invoked worst-case behavior by accident.
Sorry for the late reply - my access to my home PC was spotty in the past week.

I wasn't entirely correct in the parent post - I didn't test things in the same way as the original problem, since I originally encountered it while using multiSort. multiSort can't use stable sort, because "stable $(D partition3) has not been implemented yet". If I hack multiSort to use TimSort instead of QuickSort, the program finishes quickly.

I reran the test in the same way as in the parent post, incl. with unstableSort. Here are the results:

SwapStrategy.unstable: 560576749 comparisons, 4796076 ticks
SwapStrategy.stable  : 340617464 comparisons, 5757974 ticks
unstableSort         : 573916901 comparisons, 5255061 ticks
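For reference, the shape of the two approaches involved (field names are made up, not my actual data):

---
import std.algorithm : multiSort, sort, SwapStrategy;

struct Record { int primary, secondary; }

// One pass with multiSort, which currently has to sort unstably internally.
void viaMultiSort(Record[] data)
{
    data.multiSort!("a.primary < b.primary", "a.secondary < b.secondary");
}

// The same lexicographic order via two stable (Timsort) passes:
// sort by the secondary key first, then stably by the primary key.
void viaTwoStableSorts(Record[] data)
{
    data.sort!("a.secondary < b.secondary", SwapStrategy.stable);
    data.sort!("a.primary < b.primary", SwapStrategy.stable);
}
---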
Dec 04 2013
prev sibling parent reply "Xinok" <xinok live.com> writes:
On Friday, 15 November 2013 at 21:46:26 UTC, Vladimir Panteleev 
wrote:
 ...
I've made a pull request to fix this issue: https://github.com/D-Programming-Language/phobos/pull/1735
Nov 28 2013
parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Friday, 29 November 2013 at 01:47:49 UTC, Xinok wrote:
 I've made a pull request to fix this issue:

 https://github.com/D-Programming-Language/phobos/pull/1735
I've left some comments on GitHub. The greatest concern IMO is that the new implementation relies on assignments rather than swaps: it would be a regression to disallow sorting of entities with disabled assignment - the kind of type sketched below.
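Just to illustrate the kind of element I mean (a made-up struct, not from the pull request): a swap-based algorithm can still reorder these, while plain element assignment no longer compiles.

---
import std.algorithm : swap;

struct NoAssign
{
    int value;
    @disable void opAssign(NoAssign);          // assignment disabled
}

void main()
{
    auto a = NoAssign(1), b = NoAssign(2);
    swap(a, b);                                // fine: swap does not need opAssign
    static assert(!__traits(compiles, a = b)); // direct assignment does not compile
    assert(a.value == 2 && b.value == 1);
}
---

Ivan Kazmenko.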
Nov 28 2013