## digitalmars.D.learn - Quadratic time to sort in a simple case?

- Ivan Kazmenko (40/40) Apr 19 2013 Hi!
- bearophile (8/17) Apr 19 2013 If you know that, then don't do a sort. Make some free space in
- Ivan Kazmenko (29/38) Apr 19 2013 Thank you, these are good for now.
- bearophile (19/38) Apr 19 2013 With that I think there is no way to write a pure sort. Hopefully
- Chris Cain (28/35) Apr 20 2013 I'm just throwing my 2c in, but I think TimSort ought to be the
- Dmitry Olshansky (5/23) Apr 20 2013 And this all is good but TimSort allocates O(N) memory. The constant in
- Xinok (5/8) Apr 22 2013 Worst case is O(n/2), but it starts small and doubles in size as
- Dmitry Olshansky (5/10) Apr 24 2013 Good to know, still it's something to keep in mind.
- Andrei Alexandrescu (14/46) Apr 22 2013 Yah, that would mean one can't reason what a piece of code does without
- bearophile (9/13) Apr 22 2013 Or switch to a sort that is guaranteed to have a pseudo-linear
- bearophile (4/4) Apr 22 2013 And by the way, I still suggest a dual-pivot quicksort:
- Andrei Alexandrescu (12/22) Apr 23 2013 There's not "the C++ STL sort". Any implementation is fine as long as it...
- bearophile (7/9) Apr 23 2013 This seems to use the Introsort:
- Xinok (18/27) Apr 23 2013 On an average case, two-pivot quick sort will increase the number
- Ivan Kazmenko (13/24) Apr 23 2013 I like the sound of that!
- Ivan Kazmenko (59/71) Apr 23 2013 Sounds like it could decrease speed on average, and still perhaps
- Dmitry Olshansky (7/18) Apr 24 2013 A good unstable sort can do the job faster (at the very least not
- Xinok (8/9) Apr 24 2013 I would actually argue for this, not for safety (introsort is an
- Dmitry Olshansky (4/7) Apr 19 2013 Optimization flags if any?
- Ivan Kazmenko (10/16) Apr 19 2013 Both "-O" and no-flags give quadratic behavior.
- Dmitry Olshansky (8/22) Apr 19 2013 I sought after
- Xinok (5/13) Apr 22 2013 I filed a bug report for this issue a year ago:
- bearophile (6/8) Apr 22 2013 Good. And if you are willing, please also take a look at the
- Xinok (4/12) Apr 22 2013 I could, but I'm not sure what the general consensus is on

Hi! Consider a sorted array. Append an element that is less than all the previous elements. Then sort the array again by the sort function from std.algorithm. An example follows: ----- import std.algorithm; import std.datetime; import std.range; import std.stdio; void main () { int n = 30_000; auto a = array (iota (n)) ~ [0]; auto st = Clock.currTime (); sort (a); auto et = Clock.currTime (); writeln (et - st); } ----- With n = 30_000 as in the example, this takes time of the order of a second on a modern computer, which is clearly O(n^2). I am using DMD 2.062. Well, now I have a point to make and a question. My point is that I think this is bad because such a pattern (sorted array ~ small element, then sort) is fairly likely to occur - well, it did for me, hence the post. Sure, every fast and simple method for choosing the pivot in quicksort will have its O(n^2) corner cases. But there's a difference between a corner case that never occurs in "real" data (but still could be constructed artificially) and a corner case describable by such a simple pattern. The question is rather predictable, really: what is the guaranteed-n-log-n replacement for sort in the standard library? I've found the following way... sort !("a < b", SwapStrategy.stable) (a); ...but it forces to specify the redundant "a < b" and looks a bit clumsy for an everyday sort() call. Tried to keep it short(er) this time, Ivan Kazmenko.

Apr 19 2013

Ivan Kazmenko:Consider a sorted array. Append an element that is less than all the previous elements. Then sort the array again by the sort function from std.algorithm.If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.The question is rather predictable, really: what is the guaranteed-n-log-n replacement for sort in the standard library? I've found the following way... sort !("a < b", SwapStrategy.stable) (a);That uses the TimSort that contains the optimization you look for....but it forces to specify the redundant "a < b" and looks a bit clumsy for an everyday sort() call.Then try to define an alias (untested): alias mySort = sort!("a < b", SwapStrategy.stable); Bye, bearophile

Apr 19 2013

That [SwapStrategystable] uses the TimSort that contains the optimization you look for. alias mySort = sort!("a < b", SwapStrategy.stable);Thank you, these are good for now. A seemingly easy way to provide the choice globally would be to have an assignable SwapStrategy.default in std.algorithm... or would that be a bad design decision to have such a global state?That behavior isn't always easy to predict. Still, I think this is a problem in Phobos which should be fixed. In most implementations of quicksort (even middle-element, the easiest to come up with), one expects it to perform in O(n log n) with probability close to one, except on some artificially constructed cases. On a side note, I'm surprised that median-of-three (getPivot in std.algorithm) fails at such a simple test, so I had something new to learn there, too. If I just comment out the switch in getPivot and "return mid" directly, it becomes fast enough in this case. Another view at this is the following. As far as I know by now, one of the points beyond D (and Phobos) design is to have everything safe by default, but faster if you want it explicitly and you know what you are doing. In this particular case, that would mean having a worst-case O(n log n) sort, and/or a stable sort, by default - but with the opportunity to switch to the (time-wise and stability-wise) unsafe quicksort if you really need the extra speedup and know what you are doing. So, why isn't TimSort the default? The one reason I can imagine is that then D would suffer in language comparisons which just take the most easy-to-use "default sort" and don't care about the extra features it got. Are there any other? Ivan Kazmenko.Consider a sorted array. Append an element that is less than all the previous elements. Then sort the array again by the sort function from std.algorithm.If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.

Apr 19 2013

Ivan Kazmenko:A seemingly easy way to provide the choice globally would be to have an assignable SwapStrategy.default in std.algorithm... or would that be a bad design decision to have such a global state?With that I think there is no way to write a pure sort. Hopefully someday the sort() will be pure. Generally global mutable state is bad to write unittests, and to reason about code. So I don't think Andrei will do that. It's not kosher.Still, I think this is a problem in Phobos which should be fixed. In most implementations of quicksort (even middle-element, the easiest to come up with), one expects it to perform in O(n log n) with probability close to one, except on some artificially constructed cases. On a side note, I'm surprised that median-of-three (getPivot in std.algorithm) fails at such a simple test, so I had something new to learn there, too. If I just comment out the switch in getPivot and "return mid" directly, it becomes fast enough in this case.Your case is a common one, so I think this problem should be studied a little better. I suggest to write a little better benchmark that shows the problem, and put it in Bugzilla. At worst it will be closed with wontfix.Another view at this is the following. As far as I know by now, one of the points beyond D (and Phobos) design is to have everything safe by default, but faster if you want it explicitly and you know what you are doing.Right, but I think that D Zen rule is mostly about things like memory safety, etc.So, why isn't TimSort the default?Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true). Bye, bearophile

Apr 19 2013

On Friday, 19 April 2013 at 22:56:00 UTC, bearophile wrote:I'm just throwing my 2c in, but I think TimSort ought to be the default. It's simply the safer option. Since worst-case performance can be designed (and it can be designed unless the pivots are, at least, pseudo-randomly chosen), there is a very real risk of the default sort being used in a way that can be compromised by an attacker. From this perspective, seems to be like TimSort being default would support the design goal #5 of D, "Make doing things the right way easier than the wrong way." Plus, TimSort seems to be faster in most cases in my attempts. The only cases I could find that it wasn't faster is when I could guarantee that the data I was passing in had no order to it. In cases where you suspect (but can't guarantee) data is sorted when passed in (think fetching from a web API and it gives you sorted data, but the docs don't say it should), calling TimSort is nearly as fast as calling an "isSorted" ... So, my recommendation is to just call TimSort and don't worry about the extra branching code ([checking sortedness, if not sorted, call sort] vs [call sort]). This makes it so the "fast" code is also very convenient to write. TBH, though, I think the reason that TimSort is not the default is because it was added only semi-recently. The old stable sort was not nearly as fast as the unstable sort (and, in fact, IIRC, it didn't work properly when I tried it). I doubt that someone intentionally said that quicksort was faster than TimSort on average ... it's just that no one thought to change the default when TimSort was added. Maybe an enhancement request could be made on this?So, why isn't TimSort the default?Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true).

Apr 20 2013

20-Apr-2013 16:22, Chris Cain пишет:On Friday, 19 April 2013 at 22:56:00 UTC, bearophile wrote:And this all is good but TimSort allocates O(N) memory. The constant in front of N is smallish less then 1.0 but it could cause some grief. -- Dmitry OlshanskyI'm just throwing my 2c in, but I think TimSort ought to be the default. It's simply the safer option. Since worst-case performance can be designed (and it can be designed unless the pivots are, at least, pseudo-randomly chosen), there is a very real risk of the default sort being used in a way that can be compromised by an attacker. From this perspective, seems to be like TimSort being default would support the design goal #5 of D, "Make doing things the right way easier than the wrong way."So, why isn't TimSort the default?Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true).

Apr 20 2013

On Saturday, 20 April 2013 at 16:35:25 UTC, Dmitry Olshansky wrote:And this all is good but TimSort allocates O(N) memory. The constant in front of N is smallish less then 1.0 but it could cause some grief.Worst case is O(n/2), but it starts small and doubles in size as needed. On a partially sorted array, it may only use a small amount of memory.

Apr 22 2013

23-Apr-2013 05:17, Xinok пишет:On Saturday, 20 April 2013 at 16:35:25 UTC, Dmitry Olshansky wrote:Good to know, still it's something to keep in mind. Especially for allocation-free minded folks. -- Dmitry OlshanskyAnd this all is good but TimSort allocates O(N) memory. The constant in front of N is smallish less then 1.0 but it could cause some grief.Worst case is O(n/2), but it starts small and doubles in size as needed. On a partially sorted array, it may only use a small amount of memory.

Apr 24 2013

On 4/19/13 6:37 PM, Ivan Kazmenko wrote:Yah, that would mean one can't reason what a piece of code does without knowing what the context was.That [SwapStrategystable] uses the TimSort that contains the optimization you look for. alias mySort = sort!("a < b", SwapStrategy.stable);Thank you, these are good for now. A seemingly easy way to provide the choice globally would be to have an assignable SwapStrategy.default in std.algorithm... or would that be a bad design decision to have such a global state?I could have sworn I made getPivot choose at random at some point. I agree this should be fixed; there are a number of possibilities: a) choose median of five b) if the array is large, sort a stride of it first (e.g. 1%) then choose the pivot as the median point of that stride c) add introspection to the algorithm, i.e. if an attempt to partition hits the worst case or near worst case, just try another pivot instead of moving forward with the sorting stage.That behavior isn't always easy to predict. Still, I think this is a problem in Phobos which should be fixed. In most implementations of quicksort (even middle-element, the easiest to come up with), one expects it to perform in O(n log n) with probability close to one, except on some artificially constructed cases. On a side note, I'm surprised that median-of-three (getPivot in std.algorithm) fails at such a simple test, so I had something new to learn there, too. If I just comment out the switch in getPivot and "return mid" directly, it becomes fast enough in this case.Consider a sorted array. Append an element that is less than all the previous elements. Then sort the array again by the sort function from std.algorithm.If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.Another view at this is the following. As far as I know by now, one of the points beyond D (and Phobos) design is to have everything safe by default, but faster if you want it explicitly and you know what you are doing.That view of safety is different (memory safety).In this particular case, that would mean having a worst-case O(n log n) sort, and/or a stable sort, by default - but with the opportunity to switch to the (time-wise and stability-wise) unsafe quicksort if you really need the extra speedup and know what you are doing. So, why isn't TimSort the default?TimSort is slower on average. Andrei

Apr 22 2013

Andrei Alexandrescu:c) add introspection to the algorithm, i.e. if an attempt to partition hits the worst case or near worst case, just try another pivot instead of moving forward with the sorting stage.Or switch to a sort that is guaranteed to have a pseudo-linear complexity. I am not sure, but I think the C++ STL sort does this.TimSort is slower on average.Here it's not easy to define what "average" is. Python devs think that a common case is an array with mostly sorted data with unsorted data at the end. Bye, bearophile

Apr 22 2013

And by the way, I still suggest a dual-pivot quicksort: http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628 Bye, bearophile

Apr 22 2013

On 4/22/13 5:52 PM, bearophile wrote:Andrei Alexandrescu:There's not "the C++ STL sort". Any implementation is fine as long as it has O(n log n) expected run time.c) add introspection to the algorithm, i.e. if an attempt to partition hits the worst case or near worst case, just try another pivot instead of moving forward with the sorting stage.Or switch to a sort that is guaranteed to have a pseudo-linear complexity. I am not sure, but I think the C++ STL sort does this.Note that in this case it's sorted data followed by its smallest element. (Changing the value does improve sped.) This is hitting a corner case: median of 3 will pick the smallest element, which will lead to an inefficient partition. I haven't looked at the code, but it seems the smallest element again makes it to the beginning and the end of the array so it's again picked etc. One simple solution to this (which I forgot to mention) is picking a pivot at random. AndreiTimSort is slower on average.Here it's not easy to define what "average" is. Python devs think that a common case is an array with mostly sorted data with unsorted data at the end.

Apr 23 2013

Andrei Alexandrescu:There's not "the C++ STL sort". Any implementation is fine as long as it has O(n log n) expected run time.This seems to use the Introsort: http://www.sgi.com/tech/stl/sort.html I don't know if Xinok has tested a 2-pivot quicksort (called by the usual Introsort setting). Bye, bearophile

Apr 23 2013

On Tuesday, 23 April 2013 at 14:12:01 UTC, bearophile wrote:Andrei Alexandrescu:On an average case, two-pivot quick sort will increase the number of comparisons by about 50%, reason being that about half the elements will be greater than the first pivot and have to he compared to the second pivot. There's also a greater complexity given there are three partitions rather than two, increasing the amount of I/O necessary. Introsort is simply a quick sort which falls back to a heap sort after so many recursions to guarantee O(n log n) performance. I've implemented this using a median of five and it works excellent. This is what I plan to contribute to Phobos. Choosing a random pivot will always invoke "average" performance where as finding the median of N elements usually achieves better performance on sorted lists. You also have to be careful that equal elements are distributed equally among both partitions, else a list with many elements equal to the pivot will still achieve poor performance. (equal elements would all fall onto one side of the pivot)There's not "the C++ STL sort". Any implementation is fine as long as it has O(n log n) expected run time.This seems to use the Introsort: http://www.sgi.com/tech/stl/sort.html I don't know if Xinok has tested a 2-pivot quicksort (called by the usual Introsort setting). Bye, bearophile

Apr 23 2013

On Tuesday, 23 April 2013 at 17:42:08 UTC, Xinok wrote:Choosing a random pivot will always invoke "average" performance where as finding the median of N elements usually achieves better performance on sorted lists. You also have to be careful that equal elements are distributed equally among both partitions, else a list with many elements equal to the pivot will still achieve poor performance. (equal elements would all fall onto one side of the pivot) Introsort is simply a quick sort which falls back to a heap sort after so many recursions to guarantee O(n log n) performance. I've implemented this using a median of five and it works excellent. This is what I plan to contribute to Phobos.I like the sound of that! Regarding my previous post, it's perhaps worth mentioning Go in the comparison, it uses QuickSort with median-of-three for small sizes and median of medians-of-three for larger sizes [1]. And it actually sorts the three medians in median-of-three, which sounds like a good thing to do. They cite "Engineering a Sort Function" (1993) by Bentley and McIlroy as inspiration [2]. Ivan Kazmenko. ----- [1] http://golang.org/src/pkg/sort/sort.go#L104 [2] http://www.enseignement.polytechnique.fr/informatique/profs/Luc.Maranget/421/09/bentley93engineering.pdf

Apr 23 2013

On Monday, 22 April 2013 at 21:34:47 UTC, Andrei Alexandrescu wrote:a) choose median of fiveSounds like it could decrease speed on average, and still perhaps fail on a simple case like [0, 1, 2, 3, ..., 999, 0, 0] or something? Didn't test it though.b) if the array is large, sort a stride of it first (e.g. 1%) then choose the pivot as the median point of that strideA bad case would be, e.g., the array containing almost sorted parts, and the stride getting taken from the part where lowest or highest elements reside.c) add introspection to the algorithm, i.e. if an attempt to partition hits the worst case or near worst case, just try another pivot instead of moving forward with the sorting stage.Now, trying another pivot again and again could give the same O(n^2) in a bad case.One simple solution to this (which I forgot to mention) is picking a pivot at random.But it would again mean that the sort function depends on global state (RNG state). And if it doesn't (a local RNG is initialized somehow each time), an adversary would just have to hardcode it once to get the same O(n^2) worst case anytime he wants. And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:I filed a bug report for this issue a year ago: http://d.puremagic.com/issues/show_bug.cgi?id=7767 I've been meaning to fix this issue myself. Time allowing, I'll do it soon.What I wonder now, what would be the goal of such a fix? One possible goal would be to have an O(n log n) worst case sort. And that would perhaps cost some speed and/or memory on average. Besides, such a sort function is already implemented (TimSort), so it's just a matter of setting the default then. Another goal would be to have O(n^2) only for cases which don't have a reasonable chance to occur in real world, but could still be constructed artificially by an adversary. And that could perhaps be done by just changing the getPivot implementation. Other languages' experience may be useful here: 1. GNU C++ 4.x std::sort implementation is IntroSort [1]. 2. Microsoft .NET used Quicksort up to version 4.0 [2]. Now in .NET 4.5 they use Introsort too [3]. 3. For primitive types, Java 6 used a tuned QuickSort [4]. The current Java 7 uses Dual-Pivot QuickSort [5]. Java uses TimSort for Objects [6]. 4. Python uses TimSort since version 2.3 [7]. In any case, there are techniques to construct a bad case given a QuickSort implementation. One of them is described by M. Douglas McIlroy in "A Killer Adversary for Quicksort" [8]. We run QuickSort on a would-be-array "a" controlling the "a[i] < a[j]" predicate, and, using only certain assumptions (not looking at the code!), we build a killer-case array on the fly. Given some thought, the technique could perhaps be extended to Dual-Pivot QuickSort too. As long as some flavor of QuickSort (and topN) is in Phobos, such a unittest would fit the sort implementation nicely, if only to show just how "bad" it can get. Ivan Kazmenko. ----- References: [1] http://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.4/a01027.html#g152148508b4a39e15ffbfbc987ab653a [2] http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.100%29.aspx [3] http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.110%29.aspx [4] http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#sort%28int[]%29 [5] http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28int[]%29 [6] http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28java.lang.Object[]%29 [7] http://en.wikipedia.org/wiki/Timsort [8] http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

Apr 23 2013

24-Apr-2013 01:09, Ivan Kazmenko пишет:And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:A good unstable sort can do the job faster (at the very least not slower) then a good stable sort. I'm looking forward to a version of Introsort that Xinok has in mind as a "Q-sort fix". -- Dmitry OlshanskyI filed a bug report for this issue a year ago: http://d.puremagic.com/issues/show_bug.cgi?id=7767 I've been meaning to fix this issue myself. Time allowing, I'll do it soon.What I wonder now, what would be the goal of such a fix? One possible goal would be to have an O(n log n) worst case sort. And that would perhaps cost some speed and/or memory on average. Besides, such a sort function is already implemented (TimSort), so it's just a matter of setting the default then.

Apr 24 2013

On Friday, 19 April 2013 at 22:37:45 UTC, Ivan Kazmenko wrote:So, why isn't TimSort the default?I would actually argue for this, not for safety (introsort is an adequate solution) but for different reasons. Timsort is stable and it's faster on data with low entropy, being nearly instantaneous on already sorted lists. I would guess it's the better choice for most cases. Then for those cases where stable sorting isn't necessary and unstable sorting would be faster, the user could choose the second option.

Apr 24 2013

20-Apr-2013 01:03, Ivan Kazmenko пишет:With n = 30_000 as in the example, this takes time of the order of a second on a modern computer, which is clearly O(n^2). I am using DMD 2.062.Optimization flags if any? -- Dmitry Olshansky

Apr 19 2013

Both "-O" and no-flags give quadratic behavior. Well, if that does not convince you the time grows faster than n-log-n, maybe this comparison shall (dmd -O, Core i7-2600K 3400 MHz): n T, seconds 15_000 0.312 30_000 1.076 60_000 3.775 120_000 11.247 With n growing x2, the time grows x3 to x4 at each step.With n = 30_000 as in the example, this takes time of the order of a second on a modern computer, which is clearly O(n^2). I am using DMD 2.062.Optimization flags if any?

Apr 19 2013

20-Apr-2013 02:12, Ivan Kazmenko пишет:I sought after -O -inline -release In non-release mode it tests that it's sorted on exit etc. Anyway it's 8x times as fast with -release -inline for me but the progression is similarly bad.Both "-O" and no-flags give quadratic behavior.With n = 30_000 as in the example, this takes time of the order of a second on a modern computer, which is clearly O(n^2). I am using DMD 2.062.Optimization flags if any?Well, if that does not convince you the time grows faster than n-log-n, maybe this comparison shall (dmd -O, Core i7-2600K 3400 MHz): n T, seconds 15_000 0.312 30_000 1.076 60_000 3.775 120_000 11.247 With n growing x2, the time grows x3 to x4 at each step.-- Dmitry Olshansky

Apr 19 2013

On Friday, 19 April 2013 at 21:03:23 UTC, Ivan Kazmenko wrote:Hi! Consider a sorted array. Append an element that is less than all the previous elements. Then sort the array again by the sort function from std.algorithm. .... With n = 30_000 as in the example, this takes time of the order of a second on a modern computer, which is clearly O(n^2). I am using DMD 2.062.I filed a bug report for this issue a year ago: http://d.puremagic.com/issues/show_bug.cgi?id=7767 I've been meaning to fix this issue myself. Time allowing, I'll do it soon.

Apr 22 2013

Xinok:I've been meaning to fix this issue myself. Time allowing, I'll do it soon.Good. And if you are willing, please also take a look at the parallel sort problem I've raised: http://forum.dlang.org/thread/iyunhhsbmurqyouyrhob forum.dlang.org Bye, bearophile

Apr 22 2013

On Tuesday, 23 April 2013 at 01:28:56 UTC, bearophile wrote:Xinok:I could, but I'm not sure what the general consensus is on std.parallel_algorithm as it's been brought up before in a similar context.I've been meaning to fix this issue myself. Time allowing, I'll do it soon.Good. And if you are willing, please also take a look at the parallel sort problem I've raised: http://forum.dlang.org/thread/iyunhhsbmurqyouyrhob forum.dlang.org Bye, bearophile

Apr 22 2013