digitalmars.D - Re: Go rant

Kevin Bealer <kevinbealer gmail.com> Dec 27 2009

dsimcha <dsimcha yahoo.com> Dec 27 2009

Kevin Bealer <kevinbealer gmail.com> Dec 27 2009

Nekuromento <necroment gmail.com> Dec 29 2009

justme <justme somewhere.net> Dec 29 2009

Don <nospam nospam.com> Dec 29 2009

bearophile <bearophileHUGS lycos.com> Dec 29 2009

bearophile <bearophileHUGS lycos.com> Dec 30 2009

Don <nospam nospam.com> Dec 30 2009

bearophile <bearophileHUGS lycos.com> Dec 30 2009

"Simen kjaeraas" <simen.kjaras gmail.com> Dec 28 2009
"Denis Koroskin" <2korden gmail.com> Dec 29 2009
retard <re tard.com.invalid> Dec 30 2009

Rainer Deyke <rainerd eldwood.com> Dec 27 2009

Kevin Bealer <kevinbealer gmail.com> Dec 27 2009

Kevin Bealer <kevinbealer gmail.com> writes:

Don Wrote:
 retard wrote:


 
 I'd say it's easier. If you watch someone sorting some cards, they'll 
 use either insertion sort or selection sort. Nobody should have ever 
 heard of bubble sort, I'm pleased to hear some schools aren't mentioning 
 it. Such a foolish algorithm.
 
 "the bubble sort seems to have nothing to recommend it, except a catchy 
 name " - Knuth.


(Non-software) people doing routine tasks often come up with better algorithms
intuitively than my intuition expects them to.

I think a lot of people would do even better than insertion with a deck of
poker cards -- they might group cards by either suit or rank (or rank groups)
(e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the
"buckets", then stick these ordered sets back together.  If you think about it,
this is a lot like a radix sort or a multi-pivot cousin of the quick sort.

I think when you ask people to do a computer version of how they would sort,
they do worse than this.  It's so hard to communicate anything complicated to a
computer that they instinctively "dumb it down" and do insertion or bubble. 
Insertion is what you would manually use when dealing with adding a few new
elements to a short stack, whereas something like a bubble might be what you
would use manually when it's hard to jump around among the elements of the
sequence (e.g. if you have a bunch of "see also" references embedded in
dictionary entries and you need to sort within the see-also 'chain').  That
approach (kind of) matches the claims often given by computer programmers that
bubble sort is good for linked lists and/or "only fixing a few things that are
out of place in a mostly sorted list".

When you have a pupil that is as hard to talk to and unmotivated to learn as a
computer, the first attempt is to try to teach it anything at all and call that
a success if it works.  A more eager human pupil tries to meet you half way, so
you naturally respond and take pleasure in teaching them more and more advanced
stuff.  If you're really hard headed and immune to tedium, you keep trying with
even a super-dumb pupil like a CPU.  You keep going in for the hard case so
many times that you and eventually major in CompSci out of habit.

Kevin

Dec 27 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 (Non-software) people doing routine tasks often come up with better algorithms


 I think a lot of people would do even better than insertion with a deck of
poker


"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets",
then stick these ordered sets back together.  If you think about it, this is a
lot
like a radix sort or a multi-pivot cousin of the quick sort.

You mean a bucket sort?  http://en.wikipedia.org/wiki/Bucket_sort

Dec 27 2009

Kevin Bealer <kevinbealer gmail.com> writes:

dsimcha Wrote:

 == Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 (Non-software) people doing routine tasks often come up with better algorithms


 I think a lot of people would do even better than insertion with a deck of
poker


 "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets",
 then stick these ordered sets back together.  If you think about it, this is a
lot
 like a radix sort or a multi-pivot cousin of the quick sort.
 
 You mean a bucket sort?  http://en.wikipedia.org/wiki/Bucket_sort


More or less, though I think human beings use algorithms in a more artistic way
than sticking to any one algorithm.

I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by
this?  Divide by more than one pivot on each pass?  I can give details if you
like ...) has been tried out much.  It seems like it must have been, but it
also seems like something that would have cache-awareness advantages that would
not show up in the simplified comparison-counting way of thinking about
efficiency.

Kevin

Dec 27 2009

Nekuromento <necroment gmail.com> writes:

Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:
 
 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.


 I've heard of two-pivot quicksort, but can't remember where.
 
 -- 
 Simen


Some information about dual pivot quicksort:

http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Dec 29 2009

justme <justme somewhere.net> writes:

Nekuromento Wrote:

 Simen kjaeraas Wrote:
 
 Kevin Bealer <kevinbealer gmail.com> wrote:
 
 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.


 I've heard of two-pivot quicksort, but can't remember where.
 
 -- 
 Simen


 Some information about dual pivot quicksort:
 
 http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf
 
 http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628


Looks awesome? But the proof is too academical for my taste. Why can't D
implement a three-pivot quicksort and beat Java? What does myarray.sort in D
use now?

Dec 29 2009

Don <nospam nospam.com> writes:

Denis Koroskin wrote:
 On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:
 
 Nekuromento Wrote:

 Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets 




 mean by this?  Divide by more than one pivot on each pass?  I can 




 details if you like ...) has been tried out much.  It seems like 




 have been, but it also seems like something that would have
 cache-awareness advantages that would not show up in the simplified
 comparison-counting way of thinking about efficiency.


 I've heard of two-pivot quicksort, but can't remember where.

 --
 Simen


 Some information about dual pivot quicksort:

 http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

 http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628


 Looks awesome? But the proof is too academical for my taste. Why can't 
 D implement a three-pivot quicksort and beat Java? 




Yeah, that's the obvious question -- what's the optimum number of 
pivots? It's a bit annoying that that paper doesn't mention it.

Dec 29 2009

bearophile <bearophileHUGS lycos.com> writes:

Don:
 Yeah, that's the obvious question -- what's the optimum number of 
 pivots? It's a bit annoying that that paper doesn't mention it.


Two pivots help avoid a common bad corner case of the QuickSort (when there are
many equal items). Writing a good sorting routine is not easy, there's lot of
software engineering behind it. I have studied this topic some.

Bye,
bearophile

Dec 29 2009

bearophile <bearophileHUGS lycos.com> writes:

retard:
 So can you tell us then what is the optimal number of pivots?<


I can tell you that two pivot, used in the correct way, lead to a good
quicksort, as I've said.


Can it be proven?


I don't know. It can be proven that two pivot are enough to avoid the problem
with the classic QuickSort.

Bye,
bearophile

Dec 30 2009

Don <nospam nospam.com> writes:

bearophile wrote:
 retard:
 So can you tell us then what is the optimal number of pivots?<


 I can tell you that two pivot, used in the correct way, lead to a good
quicksort, as I've said.
 
 
 Can it be proven?


 I don't know. It can be proven that two pivot are enough to avoid the problem
with the classic QuickSort.
 
 Bye,
 bearophile


 From a cursory glance, it seems that the number of swaps (0.8) 
ultimately comes out of the binomial theorem, so it ought to be 
straightforward to expand the analysis.

I also suspect that it'd be better to switch from two-pivot to one-pivot 
quicksort at some value of n.
It'd be interesting to know how the two-pivot quicksort compares to timsort.

I suspect there's a median-of-five-killer worst case for this quicksort, 
just as there's a median-of-three killer for standard quicksort.

Dec 30 2009

bearophile <bearophileHUGS lycos.com> writes:

Don:
 It'd be interesting to know how the two-pivot quicksort compares to timsort.


Timsort is probably a bit slower, but they can't be compared, because Timsort
is a stable sort, while normal quicksorts aren't.


 I suspect there's a median-of-five-killer worst case for this quicksort, 
 just as there's a median-of-three killer for standard quicksort.


Yes, there is. The purpose of using two pivots is not to avoid the inevitable
worst case (that's made of usually a limited number of permutations, where the
really worst is just one specific permutation), but to avoid a whole class of
worst cases, where items are distributed in few equivalence classes. This case
is much more common, it's not a sharp point in the landscape of the possible
input permutations, it's a plateu, so it deserves to be fixed, and there's a
simple way to fix it.

In practice the presence of the O(n^2) sharp worst case is worrying enough the
Java devs, so in Java the quicksort is used only for native data. On classes it
uses a safer sorting routine not-quicksort based, so a possible attacker can't
put inside the class some logic to dynamically find the worst case and attack a
Java system.

I think the default system sort of a language like D has to be stable (as in
Python), and the unstable algorithm (a templated introsort based on 2 pivot
quicksort, with 1 recursive call, trimedian of trimedian, with fallback on
heapsort in slow situations) can be used just as optimization when necessary.

I don't know if the safety adopted by Java (to not use quicksort on objects) is
a good idea in D2 too.

Bye,
bearophile

Dec 30 2009

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.


I've heard of two-pivot quicksort, but can't remember where.

-- 
Simen

Dec 28 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:

 Nekuromento Wrote:

 Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets  




 mean by this?  Divide by more than one pivot on each pass?  I can  




 details if you like ...) has been tried out much.  It seems like it  




 have been, but it also seems like something that would have
 cache-awareness advantages that would not show up in the simplified
 comparison-counting way of thinking about efficiency.


 I've heard of two-pivot quicksort, but can't remember where.

 --
 Simen


 Some information about dual pivot quicksort:

 http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf

 http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628


 Looks awesome? But the proof is too academical for my taste. Why can't D  
 implement a three-pivot quicksort and beat Java? What does myarray.sort  
 in D use now?


http://www.dsource.org/projects/druntime/browser/trunk/src/rt/qsort.d

Dec 29 2009

retard <re tard.com.invalid> writes:

Wed, 30 Dec 2009 02:58:14 -0500, bearophile wrote:

 Don:
 Yeah, that's the obvious question -- what's the optimum number of
 pivots? It's a bit annoying that that paper doesn't mention it.


 Two pivots help avoid a common bad corner case of the QuickSort (when
 there are many equal items). Writing a good sorting routine is not easy,
 there's lot of software engineering behind it. I have studied this topic
 some.


So can you tell us then what is the optimal number of pivots? Can it be 
proven? They say the two-pivot version is the best improvement over the 
practical version of classic quicksort since sliced bread.

Dec 30 2009

Rainer Deyke <rainerd eldwood.com> writes:

Kevin Bealer wrote:
 I think a lot of people would do even better than insertion with a
 deck of poker cards -- they might group cards by either suit or rank
 (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
 J-A"), then order the "buckets", then stick these ordered sets back
 together.  If you think about it, this is a lot like a radix sort or
 a multi-pivot cousin of the quick sort.


When sorting a deck of cards, insertion sort performs in O(n log n), the
same as the best-case performance of a quick sort.  When sorting arrays,
you can find the insertion point in O(log n), but the actual insertion
takes O(n).  With a deck of cards, you can perform the insertion in
O(1).  There is absolutely no point in using quick sort for a deck of
cards.  Only the non-comparison-based sorts (radix sort, bucket sort)
can do better than insertion sort.


-- 
Rainer Deyke - rainerd eldwood.com

Dec 27 2009

Kevin Bealer <kevinbealer gmail.com> writes:

Rainer Deyke Wrote:

 Kevin Bealer wrote:
 I think a lot of people would do even better than insertion with a
 deck of poker cards -- they might group cards by either suit or rank
 (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
 J-A"), then order the "buckets", then stick these ordered sets back
 together.  If you think about it, this is a lot like a radix sort or
 a multi-pivot cousin of the quick sort.


 When sorting a deck of cards, insertion sort performs in O(n log n), the
 same as the best-case performance of a quick sort.  When sorting arrays,
 you can find the insertion point in O(log n), but the actual insertion
 takes O(n).  With a deck of cards, you can perform the insertion in
 O(1).  There is absolutely no point in using quick sort for a deck of
 cards.  Only the non-comparison-based sorts (radix sort, bucket sort)
 can do better than insertion sort.
 
 
 -- 
 Rainer Deyke - rainerd eldwood.com


Right, if you mean strictly sorting a physical deck of cards by hand. 
Insertion sort isn't efficient in a computer because you need to find the
insertion point quickly and insert quickly to keep the efficiency, but no data
structure can really do both quickly.  For linked lists, insert is quick (O(1))
but find is slow (O(n/2)); for arrays, find is quick (assuming you use binary
search, O(ln n)) but insert is slow (O(n)).  Either choice sticks you with a
slow operation.

Kevin

Dec 27 2009

D Programming

C/C++ Programming

Other

digitalmars.D - Re: Go rant