digitalmars.D - Project: better partition

Andrei Alexandrescu (13/13) May 17 2016 We should take advantage of the improved partition code I discussed at

Xinok (33/37) May 17 2016 Interesting optimization, I hope you don't mind if I use it for

Xinok (8/18) May 18 2016 I realize that I may have wrote this post a bit prematurely. I

Andrei Alexandrescu (3/22) May 18 2016 No worries. Please take anything you need from there for your code, make...

Xinok (37/41) May 22 2016 As it turns out, easier said than done. I've been thinking about

Andrei Alexandrescu (2/6) May 22 2016 So that's slower than what I have in my slides. Why not use that? -- And...

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

We should take advantage of the improved partition code I discussed at 
ACCU. Also there's a person on 
https://www.reddit.com/r/programming/comments/4jlkhv/accu_2016_keynote_by_
ndrei_alexandrescu/ 
discussing a simpler algorithm based on a couple of additional 
assumptions. The plan would go:

* Add a new public overload of partition() 
(https://dlang.org/phobos/std_algorithm_sorting.html#partition) that 
takes an index as a second argument. Implement partition with pivot per 
the slides.

* Use it in sort

* Benchmark, make sure it's faster.

* Yay.


Andrei

May 17 2016

Xinok <xinok live.com> writes:

On Tuesday, 17 May 2016 at 17:31:47 UTC, Andrei Alexandrescu 
wrote:
 We should take advantage of the improved partition code I 
 discussed at ACCU. Also there's a person on 
 https://www.reddit.com/r/programming/comments/4jlkhv/accu_2016_keynote_by_
ndrei_alexandrescu/ discussing a simpler algorithm based on a couple of
additional assumptions.
 ...

Interesting optimization, I hope you don't mind if I use it for 
my implementation of Quicksort. However, I would like to suggest 
another improvement that I devised a while back.

One shortcoming I find in most implementations of partition is 
the unnecessary swapping of elements equal to the pivot resulting 
in much unneeded work. The code in your slides has this same 
shortcoming. Imagine, for some reason, you call a pivot on an 
array full of zeroes. It's going to be moving lots of elements 
around for no good reason.

The obvious solution is to simply skip over equal elements but 
that is not enough. Reconsider the array full of zeroes; if you 
simply skip over all equal elements on the first pass, then the 
pivot will end up at the very front or end of the array. Ideally, 
at least when sorting, you want the pivot to occur as close to 
the center as possible.

My solution is to alternate between incrementing "lo" and "hi" 
only one step at a time, skipping over equal elements in the 
process. A priori, with an array full of zeroes, the pivot ends 
up in the center. Only once you find an element that belongs in 
the other partition do you fall back to the Hoare partition 
scheme and increment the other iterator until you find another 
element to swap with, but do not skip over equal elements in this 
case! Otherwise, you can trigger the same behavior as before with 
quadratic running time.

Anyways, my solution can be found at the link below. It can be 
over twice as fast in an ideal case, but when applied to real 
world data with lots of duplicate elements, maybe 5-10% faster.

https://github.com/Xinok/XSort/blob/master/xsort/introsort.d#L171

I don't claim credit for this technique. Admittedly I haven't 
really tried looking around to see if anybody else has come up 
with the same solution but I'm probably not the first.

May 17 2016

Xinok <xinok live.com> writes:

On Tuesday, 17 May 2016 at 19:27:22 UTC, Xinok wrote:
 On Tuesday, 17 May 2016 at 17:31:47 UTC, Andrei Alexandrescu 
 wrote:
 We should take advantage of the improved partition code I 
 discussed at ACCU. Also there's a person on 
 https://www.reddit.com/r/programming/comments/4jlkhv/accu_2016_keynote_by_
ndrei_alexandrescu/ discussing a simpler algorithm based on a couple of
additional assumptions.
 ...

 Interesting optimization, I hope you don't mind if I use it for 
 my implementation of Quicksort. However, I would like to 
 suggest another improvement that I devised a while back.
 ...

I realize that I may have wrote this post a bit prematurely. I 
only looked at the code in your slides before writing this and 
didn't realize that you had mentioned the same point I made here 
in your live talk (about equal elements). So I may have come off 
a bit condescending and that wasn't my intention. Great talk 
though, interesting topic and always entertaining to watch you 
speak.

May 18 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/18/16 2:48 PM, Xinok wrote:
 On Tuesday, 17 May 2016 at 19:27:22 UTC, Xinok wrote:
 On Tuesday, 17 May 2016 at 17:31:47 UTC, Andrei Alexandrescu wrote:
 We should take advantage of the improved partition code I discussed
 at ACCU. Also there's a person on
 https://www.reddit.com/r/programming/comments/4jlkhv/accu_2016_keynote_by_andrei_alexandrescu/
 discussing a simpler algorithm based on a couple of additional
 assumptions.
 ...

 Interesting optimization, I hope you don't mind if I use it for my
 implementation of Quicksort. However, I would like to suggest another
 improvement that I devised a while back.
 ...

 I realize that I may have wrote this post a bit prematurely. I only
 looked at the code in your slides before writing this and didn't realize
 that you had mentioned the same point I made here in your live talk
 (about equal elements). So I may have come off a bit condescending and
 that wasn't my intention. Great talk though, interesting topic and
 always entertaining to watch you speak.

No worries. Please take anything you need from there for your code, make 
it better, and contribute it back to the stdlib! -- Andrei

May 18 2016

Xinok <xinok live.com> writes:

On Wednesday, 18 May 2016 at 19:54:19 UTC, Andrei Alexandrescu 
wrote:
 ...
 No worries. Please take anything you need from there for your 
 code, make it better, and contribute it back to the stdlib! -- 
 Andrei

As it turns out, easier said than done.  I've been thinking about 
it for a few days now but I don't see a simple way to optimally 
merge the two techniques. The way that I alternate between 
iterating "lo" and "hi" (or lef/rig in my code) doesn't really 
work when you need to keep the iterator stationary until 
something fills the vacancy.

This is the best solution I have so far and it doesn't feel like 
a good solution at that:

     for (;;)
     {
         ++lo;
         for (;;)
         {
             if(r[lo] < p)  ++lo; else break;
             if(r[lo] <= p) ++lo; else break;
         }

         if(lo > hi) lo = hi;
         r[hi] = r[lo];

         --hi;
         for (;;)
         {
             if(p < r[hi])  --hi; else break;
             if(p <= r[hi]) --hi; else break;
         }
         if(lo >= hi) break;
         r[lo] = r[hi];
     }

The idea is simple: alternate the check for equality in hopes of 
skipping some equal elements. Unfortunately, this modification 
requires a little more work and TWO sentinels at either end of 
the range because it may skip over the first.

In most real-world data, there's only marginal gains to be made 
in skipping over equal elements, too small to justify 
compromising the gains achieved by using sentinels and vacancies. 
So unless an optimal solution exists, it's just not worth it.

May 22 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 05/22/2016 05:33 PM, Xinok wrote:
 The idea is simple: alternate the check for equality in hopes of
 skipping some equal elements. Unfortunately, this modification requires
 a little more work and TWO sentinels at either end of the range because
 it may skip over the first.

So that's slower than what I have in my slides. Why not use that? -- Andrei

May 22 2016

D Programming

C/C++ Programming

Other

digitalmars.D - Project: better partition