www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - A Benchmark for Phobos Sort Algorithm

reply "Craig Black" <craigblack2 cox.net> writes:
On my computer, the custom sort algorithm performs almost 40 percent better 
than the Phobos one.  I provided this in case anyone wanted to improve the 
phobos algorithm.  I only benchmarked this with DMD.  I would be curious to 
know how it performs with GDC.

-Craig

import std.stdio;
import std.random;
import std.algorithm;

static bool less(T)(T a, T b) { return a < b; }

void insertionSort(A, alias L)(A a, int low, int high)
{
  for(int i = low; i <= high; i++)
  {
    int min = i;
    for(int j = i + 1; j <= high; j++)
      if(L(a[j], a[min])) min = j;
    swap(a[i], a[min]);
  }
}

void quickSort(A, alias L)(A a, int p, int r)
{
  if (p >= r) return;
  if(p + 7 > r) return insertionSort!(A, L)(a, p, r);
  auto x = a[r];
 int j = p - 1;
 for (int i = p; i < r; i++)
 {
  if (L(x, a[i])) continue;
  swap(a[i], a[++j]);
 }
 a[r] = a[j + 1];
 a[j + 1] = x;
 quickSort!(A, L)(a, p, j);
 quickSort!(A, L)(a, j + 2, r);
}

void customSort(T)(T[] a)
{
  quickSort!(T[], less!T)(a, 0, a.length-1);
}

ulong getCycle() { asm { rdtsc; } }

ulong bench1(double[] vals)
{
  ulong startTime = getCycle();
  double[] v;
  v.length = vals.length;
  for(int i = 0; i < 100; i++)
  {
    for(int j = 0; j < v.length; j++) v[j] = vals[j];
    sort(v);
  }
  return getCycle() - startTime;
}

ulong bench2(double[] vals)
{
  ulong startTime = getCycle();
  double[] v;
  v.length = vals.length;
  for(int i = 0; i < 100; i++)
  {
    for(int j = 0; j < v.length; j++) v[j] = vals[j];
    customSort(v);
  }
  return getCycle() - startTime;
}

void main()
{
  Mt19937 gen;
  double[] vals;
  vals.length = 1000;
  for(int i = 0; i < vals.length; i++) vals[i] = uniform(0.0,1000.0);

  ulong time1, time2;
  for(int i = 0; i < 100; i++)
  {
    time1 += bench1(vals);
    time2 += bench2(vals);
  }
  writeln("Sorting with phobos sort: ", time1/1e5);
  writeln("Sorting with custom quickSort: ", time2/1e5);
  writeln(100.0 * (time1-time2) / time1, " percent faster");
} 
Dec 15 2010
parent reply "Nick Voronin" <elfy.nv gmail.com> writes:
On Thu, 16 Dec 2010 04:52:53 +0300, Craig Black <craigblack2 cox.net>  
wrote:

 On my computer, the custom sort algorithm performs almost 40 percent  
 better than the Phobos one.  I provided this in case anyone wanted to  
 improve the phobos algorithm.  I only benchmarked this with DMD.  I  
 would be curious to know how it performs with GDC.
Benchmarks! Love them. They will show whatever you want them to. For example I see that customSort is of different complexity and much slower than phobos' sort. =)
 void quickSort(A, alias L)(A a, int p, int r)
 {
   if (p >= r) return;
   if(p + 7 > r) return insertionSort!(A, L)(a, p, r);
   auto x = a[r];
And here is why. Quicksort is quite famous for being O(n^2) worst case (for sorted data). Your straightforward, simple (and less generic, I must say) implementation shines for random data, but performs terribly for ordered data. Phobos' sort isn't really plain quicksort so it is slower for random data (yet still of the same complexity as your code best case), but it is pretty fast for ordered data. I wonder though, what exactly makes std.algorithm.sort twice slower for random data... overhead of ranges? more complex logic/more code? -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Dec 15 2010
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Nick Voronin Wrote:

 On Thu, 16 Dec 2010 04:52:53 +0300, Craig Black <craigblack2 cox.net>  
 wrote:
 
 And here is why. Quicksort is quite famous for being O(n^2) worst case  
 (for sorted data). Your straightforward, simple  (and less generic, I must  
 say) implementation shines for random data, but performs terribly for  
 ordered data. Phobos' sort isn't really plain quicksort so it is slower  
 for random data (yet still of the same complexity as your code best case),  
 but it is pretty fast for ordered data.
A tweaked version of the Tango sort routine is slower for random data but roughly 25% faster for ordered data. The straightforward quicksort is about 30 times slower though. All in all, the Phobos sort routine seems to do quite well. I'd like to see a test with other types of contrived worst-case data though.
Dec 15 2010
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 15 December 2010 22:44:39 Sean Kelly wrote:
 Nick Voronin Wrote:
 On Thu, 16 Dec 2010 04:52:53 +0300, Craig Black <craigblack2 cox.net>
 wrote:
 
 And here is why. Quicksort is quite famous for being O(n^2) worst case
 (for sorted data). Your straightforward, simple  (and less generic, I
 must say) implementation shines for random data, but performs terribly
 for ordered data. Phobos' sort isn't really plain quicksort so it is
 slower for random data (yet still of the same complexity as your code
 best case), but it is pretty fast for ordered data.
A tweaked version of the Tango sort routine is slower for random data but roughly 25% faster for ordered data. The straightforward quicksort is about 30 times slower though. All in all, the Phobos sort routine seems to do quite well. I'd like to see a test with other types of contrived worst-case data though.
It would be nice to get a fairly extensive lists of types to sort with a variety of values and number of values of those types and set up an extensive set of benchmarking tests. Heck, such a set of types and collections of those types could be a useful benchmarking tool for a variety of algorithms. Then we'll have a good base to build from and compare to. If we're going to tweak algorithms for efficiency, we're going to want to some thorough tests so that we're sure that any tweaks are actually overall improvements. It's easy to find cases which do poorly for a particular algorithm. It can even be fairly easy to tweak an algorithm to improve that particular case. But it's not so easy to be sure that that tweak is actually an overal improvement. - Jonathan M Davis
Dec 15 2010
prev sibling parent spir <denis.spir gmail.com> writes:
On Wed, 15 Dec 2010 23:07:46 -0800
Jonathan M Davis <jmdavisProg gmx.com> wrote:

 It would be nice to get a fairly extensive lists of types to sort with a =
variety=20
 of values and number of values of those types and set up an extensive set=
of=20
 benchmarking tests. Heck, such a set of types and collections of those ty=
pes=20
 could be a useful benchmarking tool for a variety of algorithms. Then we'=
ll have=20
 a good base to build from and compare to. If we're going to tweak algorit=
hms for=20
 efficiency, we're going to want to some thorough tests so that we're sure=
that any=20
 tweaks are actually overall improvements. It's easy to find cases which d=
o poorly=20
 for a particular algorithm. It can even be fairly easy to tweak an algori=
thm to=20
 improve that particular case. But it's not so easy to be sure that that t=
weak is=20 On one hand, having sut a source data set would be nice nice. On the other,= such general-purpose algorithm simply cannot perform well; so, I would not= bother much. If one does need efficiency, then it is necessary to use or write a custom = sort adapted to the data type (int), the value space ([1,99]), the actual d= istribution (many small values), the degree of pre-ordering of source data = (bigger values have higher chances to come last),... The performance ratio between a specific and general algorithm can be huge,= as you know. Denis -- -- -- -- -- -- -- vit esse estrany =E2=98=A3 spir.wikidot.com
Dec 16 2010
prev sibling parent "Craig Black" <craigblack2 cox.net> writes:
 And here is why. Quicksort is quite famous for being O(n^2) worst case 
 (for sorted data). Your straightforward, simple  (and less generic, I must 
 say) implementation shines for random data, but performs terribly for 
 ordered data. Phobos' sort isn't really plain quicksort so it is slower 
 for random data (yet still of the same complexity as your code best case), 
 but it is pretty fast for ordered data.
Quite right! Phobos sort does a really good job with ordered data. The simple algorithm doesn't... -Craig
Dec 16 2010