digitalmars.D - Re: Go rant

Kevin Bealer <kevinbealer gmail.com> writes:
```Don Wrote:
retard wrote:

I'd say it's easier. If you watch someone sorting some cards, they'll
use either insertion sort or selection sort. Nobody should have ever
heard of bubble sort, I'm pleased to hear some schools aren't mentioning
it. Such a foolish algorithm.

"the bubble sort seems to have nothing to recommend it, except a catchy
name " - Knuth.

(Non-software) people doing routine tasks often come up with better algorithms
intuitively than my intuition expects them to.

I think a lot of people would do even better than insertion with a deck of
poker cards -- they might group cards by either suit or rank (or rank groups)
(e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the
"buckets", then stick these ordered sets back together.  If you think about it,
this is a lot like a radix sort or a multi-pivot cousin of the quick sort.

I think when you ask people to do a computer version of how they would sort,
they do worse than this.  It's so hard to communicate anything complicated to a
computer that they instinctively "dumb it down" and do insertion or bubble.
Insertion is what you would manually use when dealing with adding a few new
elements to a short stack, whereas something like a bubble might be what you
would use manually when it's hard to jump around among the elements of the
sequence (e.g. if you have a bunch of "see also" references embedded in
dictionary entries and you need to sort within the see-also 'chain').  That
approach (kind of) matches the claims often given by computer programmers that
bubble sort is good for linked lists and/or "only fixing a few things that are
out of place in a mostly sorted list".

When you have a pupil that is as hard to talk to and unmotivated to learn as a
computer, the first attempt is to try to teach it anything at all and call that
a success if it works.  A more eager human pupil tries to meet you half way, so
you naturally respond and take pleasure in teaching them more and more advanced
stuff.  If you're really hard headed and immune to tedium, you keep trying with
even a super-dumb pupil like a CPU.  You keep going in for the hard case so
many times that you and eventually major in CompSci out of habit.

Kevin
```
Dec 27 2009
dsimcha <dsimcha yahoo.com> writes:
```== Quote from Kevin Bealer (kevinbealer gmail.com)'s article
(Non-software) people doing routine tasks often come up with better algorithms

I think a lot of people would do even better than insertion with a deck of
poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets",
then stick these ordered sets back together.  If you think about it, this is a
lot
like a radix sort or a multi-pivot cousin of the quick sort.

You mean a bucket sort?  http://en.wikipedia.org/wiki/Bucket_sort
```
Dec 27 2009
Kevin Bealer <kevinbealer gmail.com> writes:
```dsimcha Wrote:

== Quote from Kevin Bealer (kevinbealer gmail.com)'s article
(Non-software) people doing routine tasks often come up with better algorithms

I think a lot of people would do even better than insertion with a deck of
poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets",
then stick these ordered sets back together.  If you think about it, this is a
lot
like a radix sort or a multi-pivot cousin of the quick sort.

You mean a bucket sort?  http://en.wikipedia.org/wiki/Bucket_sort

More or less, though I think human beings use algorithms in a more artistic way
than sticking to any one algorithm.

I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by
this?  Divide by more than one pivot on each pass?  I can give details if you
like ...) has been tried out much.  It seems like it must have been, but it
also seems like something that would have cache-awareness advantages that would
not show up in the simplified comparison-counting way of thinking about
efficiency.

Kevin
```
Dec 27 2009
Nekuromento <necroment gmail.com> writes:
```Simen kjaeraas Wrote:

Kevin Bealer <kevinbealer gmail.com> wrote:

I'm curious if the multi-pivot quicksort (I think everyone gets what I
mean by this?  Divide by more than one pivot on each pass?  I can give
details if you like ...) has been tried out much.  It seems like it must
have been, but it also seems like something that would have
cache-awareness advantages that would not show up in the simplified
comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where.

--
Simen

Some information about dual pivot quicksort:

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

```
Dec 29 2009
justme <justme somewhere.net> writes:
```Nekuromento Wrote:

Simen kjaeraas Wrote:

Kevin Bealer <kevinbealer gmail.com> wrote:

I'm curious if the multi-pivot quicksort (I think everyone gets what I
mean by this?  Divide by more than one pivot on each pass?  I can give
details if you like ...) has been tried out much.  It seems like it must
have been, but it also seems like something that would have
cache-awareness advantages that would not show up in the simplified
comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where.

--
Simen

Some information about dual pivot quicksort:

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

Looks awesome? But the proof is too academical for my taste. Why can't D
implement a three-pivot quicksort and beat Java? What does myarray.sort in D
use now?
```
Dec 29 2009
Don <nospam nospam.com> writes:
```Denis Koroskin wrote:
On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:

Nekuromento Wrote:

Simen kjaeraas Wrote:

Kevin Bealer <kevinbealer gmail.com> wrote:

I'm curious if the multi-pivot quicksort (I think everyone gets

mean by this?  Divide by more than one pivot on each pass?  I can

details if you like ...) has been tried out much.  It seems like

have been, but it also seems like something that would have
cache-awareness advantages that would not show up in the simplified
comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where.

--
Simen

Some information about dual pivot quicksort:

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

Looks awesome? But the proof is too academical for my taste. Why can't
D implement a three-pivot quicksort and beat Java?

Yeah, that's the obvious question -- what's the optimum number of
pivots? It's a bit annoying that that paper doesn't mention it.
```
Dec 29 2009
bearophile <bearophileHUGS lycos.com> writes:
```Don:
Yeah, that's the obvious question -- what's the optimum number of
pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when there are
many equal items). Writing a good sorting routine is not easy, there's lot of
software engineering behind it. I have studied this topic some.

Bye,
bearophile
```
Dec 29 2009
bearophile <bearophileHUGS lycos.com> writes:
```retard:
So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good
quicksort, as I've said.

Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem
with the classic QuickSort.

Bye,
bearophile
```
Dec 30 2009
Don <nospam nospam.com> writes:
```bearophile wrote:
retard:
So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good
quicksort, as I've said.

Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem
with the classic QuickSort.

Bye,
bearophile

From a cursory glance, it seems that the number of swaps (0.8)
ultimately comes out of the binomial theorem, so it ought to be
straightforward to expand the analysis.

I also suspect that it'd be better to switch from two-pivot to one-pivot
quicksort at some value of n.
It'd be interesting to know how the two-pivot quicksort compares to timsort.

I suspect there's a median-of-five-killer worst case for this quicksort,
just as there's a median-of-three killer for standard quicksort.
```
Dec 30 2009
bearophile <bearophileHUGS lycos.com> writes:
```Don:
It'd be interesting to know how the two-pivot quicksort compares to timsort.

Timsort is probably a bit slower, but they can't be compared, because Timsort
is a stable sort, while normal quicksorts aren't.

I suspect there's a median-of-five-killer worst case for this quicksort,
just as there's a median-of-three killer for standard quicksort.

Yes, there is. The purpose of using two pivots is not to avoid the inevitable
worst case (that's made of usually a limited number of permutations, where the
really worst is just one specific permutation), but to avoid a whole class of
worst cases, where items are distributed in few equivalence classes. This case
is much more common, it's not a sharp point in the landscape of the possible
input permutations, it's a plateu, so it deserves to be fixed, and there's a
simple way to fix it.

In practice the presence of the O(n^2) sharp worst case is worrying enough the
Java devs, so in Java the quicksort is used only for native data. On classes it
uses a safer sorting routine not-quicksort based, so a possible attacker can't
put inside the class some logic to dynamically find the worst case and attack a
Java system.

I think the default system sort of a language like D has to be stable (as in
Python), and the unstable algorithm (a templated introsort based on 2 pivot
quicksort, with 1 recursive call, trimedian of trimedian, with fallback on
heapsort in slow situations) can be used just as optimization when necessary.

I don't know if the safety adopted by Java (to not use quicksort on objects) is
a good idea in D2 too.

Bye,
bearophile
```
Dec 30 2009
"Simen kjaeraas" <simen.kjaras gmail.com> writes:
```Kevin Bealer <kevinbealer gmail.com> wrote:

I'm curious if the multi-pivot quicksort (I think everyone gets what I
mean by this?  Divide by more than one pivot on each pass?  I can give
details if you like ...) has been tried out much.  It seems like it must
have been, but it also seems like something that would have
cache-awareness advantages that would not show up in the simplified
comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where.

--
Simen
```
Dec 28 2009
"Denis Koroskin" <2korden gmail.com> writes:
```On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:

Nekuromento Wrote:

Simen kjaeraas Wrote:

Kevin Bealer <kevinbealer gmail.com> wrote:

I'm curious if the multi-pivot quicksort (I think everyone gets

mean by this?  Divide by more than one pivot on each pass?  I can

details if you like ...) has been tried out much.  It seems like it

have been, but it also seems like something that would have
cache-awareness advantages that would not show up in the simplified
comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where.

--
Simen

Some information about dual pivot quicksort:

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

Looks awesome? But the proof is too academical for my taste. Why can't D
implement a three-pivot quicksort and beat Java? What does myarray.sort
in D use now?

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/qsort.d
```
Dec 29 2009
retard <re tard.com.invalid> writes:
```Wed, 30 Dec 2009 02:58:14 -0500, bearophile wrote:

Don:
Yeah, that's the obvious question -- what's the optimum number of
pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when
there are many equal items). Writing a good sorting routine is not easy,
there's lot of software engineering behind it. I have studied this topic
some.

So can you tell us then what is the optimal number of pivots? Can it be
proven? They say the two-pivot version is the best improvement over the
practical version of classic quicksort since sliced bread.
```
Dec 30 2009
Rainer Deyke <rainerd eldwood.com> writes:
```Kevin Bealer wrote:
I think a lot of people would do even better than insertion with a
deck of poker cards -- they might group cards by either suit or rank
(or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
J-A"), then order the "buckets", then stick these ordered sets back
together.  If you think about it, this is a lot like a radix sort or
a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the
same as the best-case performance of a quick sort.  When sorting arrays,
you can find the insertion point in O(log n), but the actual insertion
takes O(n).  With a deck of cards, you can perform the insertion in
O(1).  There is absolutely no point in using quick sort for a deck of
cards.  Only the non-comparison-based sorts (radix sort, bucket sort)
can do better than insertion sort.

--
Rainer Deyke - rainerd eldwood.com
```
Dec 27 2009
Kevin Bealer <kevinbealer gmail.com> writes:
```Rainer Deyke Wrote:

Kevin Bealer wrote:
I think a lot of people would do even better than insertion with a
deck of poker cards -- they might group cards by either suit or rank
(or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
J-A"), then order the "buckets", then stick these ordered sets back
together.  If you think about it, this is a lot like a radix sort or
a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the
same as the best-case performance of a quick sort.  When sorting arrays,
you can find the insertion point in O(log n), but the actual insertion
takes O(n).  With a deck of cards, you can perform the insertion in
O(1).  There is absolutely no point in using quick sort for a deck of
cards.  Only the non-comparison-based sorts (radix sort, bucket sort)
can do better than insertion sort.

--
Rainer Deyke - rainerd eldwood.com

Right, if you mean strictly sorting a physical deck of cards by hand.
Insertion sort isn't efficient in a computer because you need to find the
insertion point quickly and insert quickly to keep the efficiency, but no data
structure can really do both quickly.  For linked lists, insert is quick (O(1))
but find is slow (O(n/2)); for arrays, find is quick (assuming you use binary
search, O(ln n)) but insert is slow (O(n)).  Either choice sticks you with a
slow operation.

Kevin
```
Dec 27 2009