www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Re: Go rant

reply Kevin Bealer <kevinbealer gmail.com> writes:
Don Wrote:
 retard wrote:

 
 I'd say it's easier. If you watch someone sorting some cards, they'll 
 use either insertion sort or selection sort. Nobody should have ever 
 heard of bubble sort, I'm pleased to hear some schools aren't mentioning 
 it. Such a foolish algorithm.
 
 "the bubble sort seems to have nothing to recommend it, except a catchy 
 name " - Knuth.

(Non-software) people doing routine tasks often come up with better algorithms intuitively than my intuition expects them to. I think a lot of people would do even better than insertion with a deck of poker cards -- they might group cards by either suit or rank (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. I think when you ask people to do a computer version of how they would sort, they do worse than this. It's so hard to communicate anything complicated to a computer that they instinctively "dumb it down" and do insertion or bubble. Insertion is what you would manually use when dealing with adding a few new elements to a short stack, whereas something like a bubble might be what you would use manually when it's hard to jump around among the elements of the sequence (e.g. if you have a bunch of "see also" references embedded in dictionary entries and you need to sort within the see-also 'chain'). That approach (kind of) matches the claims often given by computer programmers that bubble sort is good for linked lists and/or "only fixing a few things that are out of place in a mostly sorted list". When you have a pupil that is as hard to talk to and unmotivated to learn as a computer, the first attempt is to try to teach it anything at all and call that a success if it works. A more eager human pupil tries to meet you half way, so you naturally respond and take pleasure in teaching them more and more advanced stuff. If you're really hard headed and immune to tedium, you keep trying with even a super-dumb pupil like a CPU. You keep going in for the hard case so many times that you and eventually major in CompSci out of habit. Kevin
Dec 27 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 (Non-software) people doing routine tasks often come up with better algorithms

 I think a lot of people would do even better than insertion with a deck of
poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. You mean a bucket sort? http://en.wikipedia.org/wiki/Bucket_sort
Dec 27 2009
next sibling parent reply Kevin Bealer <kevinbealer gmail.com> writes:
dsimcha Wrote:

 == Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 (Non-software) people doing routine tasks often come up with better algorithms

 I think a lot of people would do even better than insertion with a deck of
poker

"Hmm, I'll make three piles of 1-5, 6-10, and J-A"), then order the "buckets", then stick these ordered sets back together. If you think about it, this is a lot like a radix sort or a multi-pivot cousin of the quick sort. You mean a bucket sort? http://en.wikipedia.org/wiki/Bucket_sort

More or less, though I think human beings use algorithms in a more artistic way than sticking to any one algorithm. I'm curious if the multi-pivot quicksort (I think everyone gets what I mean by this? Divide by more than one pivot on each pass? I can give details if you like ...) has been tried out much. It seems like it must have been, but it also seems like something that would have cache-awareness advantages that would not show up in the simplified comparison-counting way of thinking about efficiency. Kevin
Dec 27 2009
parent reply Nekuromento <necroment gmail.com> writes:
Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:
 
 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628
Dec 29 2009
parent reply justme <justme somewhere.net> writes:
Nekuromento Wrote:

 Simen kjaeraas Wrote:
 
 Kevin Bealer <kevinbealer gmail.com> wrote:
 
 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java? What does myarray.sort in D use now?
Dec 29 2009
parent reply Don <nospam nospam.com> writes:
Denis Koroskin wrote:
 On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:
 
 Nekuromento Wrote:

 Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets 


 mean by this?  Divide by more than one pivot on each pass?  I can 


 details if you like ...) has been tried out much.  It seems like 


 have been, but it also seems like something that would have
 cache-awareness advantages that would not show up in the simplified
 comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java?


Yeah, that's the obvious question -- what's the optimum number of pivots? It's a bit annoying that that paper doesn't mention it.
Dec 29 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Don:
 Yeah, that's the obvious question -- what's the optimum number of 
 pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when there are many equal items). Writing a good sorting routine is not easy, there's lot of software engineering behind it. I have studied this topic some. Bye, bearophile
Dec 29 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
retard:
 So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good quicksort, as I've said.
Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem with the classic QuickSort. Bye, bearophile
Dec 30 2009
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 retard:
 So can you tell us then what is the optimal number of pivots?<

I can tell you that two pivot, used in the correct way, lead to a good quicksort, as I've said.
 Can it be proven?

I don't know. It can be proven that two pivot are enough to avoid the problem with the classic QuickSort. Bye, bearophile

From a cursory glance, it seems that the number of swaps (0.8) ultimately comes out of the binomial theorem, so it ought to be straightforward to expand the analysis. I also suspect that it'd be better to switch from two-pivot to one-pivot quicksort at some value of n. It'd be interesting to know how the two-pivot quicksort compares to timsort. I suspect there's a median-of-five-killer worst case for this quicksort, just as there's a median-of-three killer for standard quicksort.
Dec 30 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Don:
 It'd be interesting to know how the two-pivot quicksort compares to timsort.

Timsort is probably a bit slower, but they can't be compared, because Timsort is a stable sort, while normal quicksorts aren't.
 I suspect there's a median-of-five-killer worst case for this quicksort, 
 just as there's a median-of-three killer for standard quicksort.

Yes, there is. The purpose of using two pivots is not to avoid the inevitable worst case (that's made of usually a limited number of permutations, where the really worst is just one specific permutation), but to avoid a whole class of worst cases, where items are distributed in few equivalence classes. This case is much more common, it's not a sharp point in the landscape of the possible input permutations, it's a plateu, so it deserves to be fixed, and there's a simple way to fix it. In practice the presence of the O(n^2) sharp worst case is worrying enough the Java devs, so in Java the quicksort is used only for native data. On classes it uses a safer sorting routine not-quicksort based, so a possible attacker can't put inside the class some logic to dynamically find the worst case and attack a Java system. I think the default system sort of a language like D has to be stable (as in Python), and the unstable algorithm (a templated introsort based on 2 pivot quicksort, with 1 recursive call, trimedian of trimedian, with fallback on heapsort in slow situations) can be used just as optimization when necessary. I don't know if the safety adopted by Java (to not use quicksort on objects) is a good idea in D2 too. Bye, bearophile
Dec 30 2009
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets what I  
 mean by this?  Divide by more than one pivot on each pass?  I can give  
 details if you like ...) has been tried out much.  It seems like it must  
 have been, but it also seems like something that would have  
 cache-awareness advantages that would not show up in the simplified  
 comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen
Dec 28 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 29 Dec 2009 21:36:25 +0300, justme <justme somewhere.net> wrote:

 Nekuromento Wrote:

 Simen kjaeraas Wrote:

 Kevin Bealer <kevinbealer gmail.com> wrote:

 I'm curious if the multi-pivot quicksort (I think everyone gets  


 mean by this?  Divide by more than one pivot on each pass?  I can  


 details if you like ...) has been tried out much.  It seems like it  


 have been, but it also seems like something that would have
 cache-awareness advantages that would not show up in the simplified
 comparison-counting way of thinking about efficiency.

I've heard of two-pivot quicksort, but can't remember where. -- Simen

Some information about dual pivot quicksort: http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Looks awesome? But the proof is too academical for my taste. Why can't D implement a three-pivot quicksort and beat Java? What does myarray.sort in D use now?

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/qsort.d
Dec 29 2009
prev sibling parent retard <re tard.com.invalid> writes:
Wed, 30 Dec 2009 02:58:14 -0500, bearophile wrote:

 Don:
 Yeah, that's the obvious question -- what's the optimum number of
 pivots? It's a bit annoying that that paper doesn't mention it.

Two pivots help avoid a common bad corner case of the QuickSort (when there are many equal items). Writing a good sorting routine is not easy, there's lot of software engineering behind it. I have studied this topic some.

So can you tell us then what is the optimal number of pivots? Can it be proven? They say the two-pivot version is the best improvement over the practical version of classic quicksort since sliced bread.
Dec 30 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Kevin Bealer wrote:
 I think a lot of people would do even better than insertion with a
 deck of poker cards -- they might group cards by either suit or rank
 (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
 J-A"), then order the "buckets", then stick these ordered sets back
 together.  If you think about it, this is a lot like a radix sort or
 a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the same as the best-case performance of a quick sort. When sorting arrays, you can find the insertion point in O(log n), but the actual insertion takes O(n). With a deck of cards, you can perform the insertion in O(1). There is absolutely no point in using quick sort for a deck of cards. Only the non-comparison-based sorts (radix sort, bucket sort) can do better than insertion sort. -- Rainer Deyke - rainerd eldwood.com
Dec 27 2009
parent Kevin Bealer <kevinbealer gmail.com> writes:
Rainer Deyke Wrote:

 Kevin Bealer wrote:
 I think a lot of people would do even better than insertion with a
 deck of poker cards -- they might group cards by either suit or rank
 (or rank groups) (e.g. "Hmm, I'll make three piles of 1-5, 6-10, and
 J-A"), then order the "buckets", then stick these ordered sets back
 together.  If you think about it, this is a lot like a radix sort or
 a multi-pivot cousin of the quick sort.

When sorting a deck of cards, insertion sort performs in O(n log n), the same as the best-case performance of a quick sort. When sorting arrays, you can find the insertion point in O(log n), but the actual insertion takes O(n). With a deck of cards, you can perform the insertion in O(1). There is absolutely no point in using quick sort for a deck of cards. Only the non-comparison-based sorts (radix sort, bucket sort) can do better than insertion sort. -- Rainer Deyke - rainerd eldwood.com

Right, if you mean strictly sorting a physical deck of cards by hand. Insertion sort isn't efficient in a computer because you need to find the insertion point quickly and insert quickly to keep the efficiency, but no data structure can really do both quickly. For linked lists, insert is quick (O(1)) but find is slow (O(n/2)); for arrays, find is quick (assuming you use binary search, O(ln n)) but insert is slow (O(n)). Either choice sticks you with a slow operation. Kevin
Dec 27 2009