## digitalmars.D - Re: Go rant

- Kevin Bealer <kevinbealer gmail.com> Dec 27 2009
- dsimcha <dsimcha yahoo.com> Dec 27 2009
- Kevin Bealer <kevinbealer gmail.com> Dec 27 2009
- Nekuromento <necroment gmail.com> Dec 29 2009
- justme <justme somewhere.net> Dec 29 2009
- Don <nospam nospam.com> Dec 29 2009
- bearophile <bearophileHUGS lycos.com> Dec 29 2009
- bearophile <bearophileHUGS lycos.com> Dec 30 2009
- Don <nospam nospam.com> Dec 30 2009
- bearophile <bearophileHUGS lycos.com> Dec 30 2009
- "Simen kjaeraas" <simen.kjaras gmail.com> Dec 28 2009
- "Denis Koroskin" <2korden gmail.com> Dec 29 2009
- retard <re tard.com.invalid> Dec 30 2009
- Rainer Deyke <rainerd eldwood.com> Dec 27 2009
- Kevin Bealer <kevinbealer gmail.com> Dec 27 2009

Don Wrote:retard wrote:

I'd say it's easier. If you watch someone sorting some cards, they'll use either insertion sort or selection sort. Nobody should have ever heard of bubble sort, I'm pleased to hear some schools aren't mentioning it. Such a foolish algorithm. "the bubble sort seems to have nothing to recommend it, except a catchy name " - Knuth.

(Non-software) people doing routine tasks often come up with better algorithms intuitively than my intuition expects them to. I think a lot of people would do even better than insertion with a deck of poker cards -- they might group cards by either suit or rank (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. I think when you ask people to do a computer version of how they would sort, they do worse than this. It's so hard to communicate anything complicated to a computer that they instinctively "dumb it down" and do insertion or bubble. Insertion is what you would manually use when dealing with adding a few new elements to a short stack, whereas something like a bubble might be what you would use manually when it's hard to jump around among the elements of the sequence (e.g. if you have a bunch of "see also" references embedded in dictionary entries and you need to sort within the see-also 'chain'). That approach (kind of) matches the claims often given by computer programmers that bubble sort is good for linked lists and/or "only fixing a few things that are out of place in a mostly sorted list". When you have a pupil that is as hard to talk to and unmotivated to learn as a computer, the first attempt is to try to teach it anything at all and call that a success if it works. A more eager human pupil tries to meet you half way, so you naturally respond and take pleasure in teaching them more and more advanced stuff. If you're really hard headed and immune to tedium, you keep trying with even a super-dumb pupil like a CPU. You keep going in for the hard case so many times that you and eventually major in CompSci out of habit. Kevin

Dec 27 2009

== Quote from Kevin Bealer (kevinbealer gmail.com)'s article(Non-software) people doing routine tasks often come up with better algorithms

I think a lot of people would do even better than insertion with a deck of poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. You mean a bucket sort? http://en.wikipedia.org/wiki/Bucket_sort

Dec 27 2009

dsimcha Wrote:== Quote from Kevin Bealer (kevinbealer gmail.com)'s article(Non-software) people doing routine tasks often come up with better algorithms

I think a lot of people would do even better than insertion with a deck of poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. You mean a bucket sort? http://en.wikipedia.org/wiki/Bucket_sort

More or less, though I think human beings use algorithms in a more artistic way than sticking to any one algorithm. I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by this? Divide by more than one pivot on each pass? I can give details if you like ...) has been tried out much. It seems like it must have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency. Kevin

Dec 27 2009

Simen kjaeraas Wrote:Kevin Bealer <kevinbealer gmail.com> wrote:I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by this? Divide by more than one pivot on each pass? I can give details if you like ...) has been tried out much. It seems like it must have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Dec 29 2009

Nekuromento Wrote:Simen kjaeraas Wrote:Kevin Bealer <kevinbealer gmail.com> wrote:I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by this? Divide by more than one pivot on each pass? I can give details if you like ...) has been tried out much. It seems like it must have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java? What does myarray.sort in D use now?

Dec 29 2009

Denis Koroskin wrote:On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:Nekuromento Wrote:Simen kjaeraas Wrote:Kevin Bealer <kevinbealer gmail.com> wrote:I'm curious if the multi-pivot quicksort (I think everyone gets

mean by this? Divide by more than one pivot on each pass? I can

details if you like ...) has been tried out much. It seems like

have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java?

Yeah, that's the obvious question -- what's the optimum number of pivots? It's a bit annoying that that paper doesn't mention it.

Dec 29 2009

Don:Yeah, that's the obvious question -- what's the optimum number of pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when there are many equal items). Writing a good sorting routine is not easy, there's lot of software engineering behind it. I have studied this topic some. Bye, bearophile

Dec 29 2009

retard:So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good quicksort, as I've said.Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem with the classic QuickSort. Bye, bearophile

Dec 30 2009

bearophile wrote:retard:So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good quicksort, as I've said.Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem with the classic QuickSort. Bye, bearophile

From a cursory glance, it seems that the number of swaps (0.8) ultimately comes out of the binomial theorem, so it ought to be straightforward to expand the analysis. I also suspect that it'd be better to switch from two-pivot to one-pivot quicksort at some value of n. It'd be interesting to know how the two-pivot quicksort compares to timsort. I suspect there's a median-of-five-killer worst case for this quicksort, just as there's a median-of-three killer for standard quicksort.

Dec 30 2009

Don:It'd be interesting to know how the two-pivot quicksort compares to timsort.

Timsort is probably a bit slower, but they can't be compared, because Timsort is a stable sort, while normal quicksorts aren't.I suspect there's a median-of-five-killer worst case for this quicksort, just as there's a median-of-three killer for standard quicksort.

Yes, there is. The purpose of using two pivots is not to avoid the inevitable worst case (that's made of usually a limited number of permutations, where the really worst is just one specific permutation), but to avoid a whole class of worst cases, where items are distributed in few equivalence classes. This case is much more common, it's not a sharp point in the landscape of the possible input permutations, it's a plateu, so it deserves to be fixed, and there's a simple way to fix it. In practice the presence of the O(n^2) sharp worst case is worrying enough the Java devs, so in Java the quicksort is used only for native data. On classes it uses a safer sorting routine not-quicksort based, so a possible attacker can't put inside the class some logic to dynamically find the worst case and attack a Java system. I think the default system sort of a language like D has to be stable (as in Python), and the unstable algorithm (a templated introsort based on 2 pivot quicksort, with 1 recursive call, trimedian of trimedian, with fallback on heapsort in slow situations) can be used just as optimization when necessary. I don't know if the safety adopted by Java (to not use quicksort on objects) is a good idea in D2 too. Bye, bearophile

Dec 30 2009

Kevin Bealer <kevinbealer gmail.com> wrote:I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by this? Divide by more than one pivot on each pass? I can give details if you like ...) has been tried out much. It seems like it must have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Dec 28 2009

On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:Nekuromento Wrote:Simen kjaeraas Wrote:Kevin Bealer <kevinbealer gmail.com> wrote:I'm curious if the multi-pivot quicksort (I think everyone gets

mean by this? Divide by more than one pivot on each pass? I can

details if you like ...) has been tried out much. It seems like it

have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java? What does myarray.sort in D use now?

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/qsort.d

Dec 29 2009

Wed, 30 Dec 2009 02:58:14 -0500, bearophile wrote:Don:Yeah, that's the obvious question -- what's the optimum number of pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when there are many equal items). Writing a good sorting routine is not easy, there's lot of software engineering behind it. I have studied this topic some.

So can you tell us then what is the optimal number of pivots? Can it be proven? They say the two-pivot version is the best improvement over the practical version of classic quicksort since sliced bread.

Dec 30 2009

Kevin Bealer wrote:I think a lot of people would do even better than insertion with a deck of poker cards -- they might group cards by either suit or rank (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the same as the best-case performance of a quick sort. When sorting arrays, you can find the insertion point in O(log n), but the actual insertion takes O(n). With a deck of cards, you can perform the insertion in O(1). There is absolutely no point in using quick sort for a deck of cards. Only the non-comparison-based sorts (radix sort, bucket sort) can do better than insertion sort. -- Rainer Deyke - rainerd eldwood.com

Dec 27 2009

Rainer Deyke Wrote:Kevin Bealer wrote:I think a lot of people would do even better than insertion with a deck of poker cards -- they might group cards by either suit or rank (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the same as the best-case performance of a quick sort. When sorting arrays, you can find the insertion point in O(log n), but the actual insertion takes O(n). With a deck of cards, you can perform the insertion in O(1). There is absolutely no point in using quick sort for a deck of cards. Only the non-comparison-based sorts (radix sort, bucket sort) can do better than insertion sort. -- Rainer Deyke - rainerd eldwood.com

Right, if you mean strictly sorting a physical deck of cards by hand. Insertion sort isn't efficient in a computer because you need to find the insertion point quickly and insert quickly to keep the efficiency, but no data structure can really do both quickly. For linked lists, insert is quick (O(1)) but find is slow (O(n/2)); for arrays, find is quick (assuming you use binary search, O(ln n)) but insert is slow (O(n)). Either choice sticks you with a slow operation. Kevin

Dec 27 2009