digitalmars.D - A Benchmark for Phobos Sort Algorithm
- Craig Black (80/80) Dec 15 2010 On my computer, the custom sort algorithm performs almost 40 percent bet...
- Nick Voronin (15/24) Dec 15 2010 Benchmarks! Love them. They will show whatever you want them to. For
- Sean Kelly (2/11) Dec 15 2010 A tweaked version of the Tango sort routine is slower for random data bu...
- Jonathan M Davis (12/28) Dec 15 2010 It would be nice to get a fairly extensive lists of types to sort with a...
- spir (24/33) Dec 16 2010 variety
- Craig Black (3/9) Dec 16 2010 Quite right! Phobos sort does a really good job with ordered data. The...
On my computer, the custom sort algorithm performs almost 40 percent better
than the Phobos one. I provided this in case anyone wanted to improve the
Phobos algorithm. I only benchmarked this with DMD. I would be curious to
know how it performs with GDC.

-Craig

import std.stdio;
import std.random;
import std.algorithm;

static bool less(T)(T a, T b) { return a < b; }

// despite the name, this is a selection sort (find the minimum, swap it
// into place)
void insertionSort(A, alias L)(A a, int low, int high)
{
    for(int i = low; i <= high; i++)
    {
        int min = i;
        for(int j = i + 1; j <= high; j++)
            if(L(a[j], a[min])) min = j;
        swap(a[i], a[min]);
    }
}

void quickSort(A, alias L)(A a, int p, int r)
{
    if(p >= r) return;
    // fall back to the simple sort for ranges of 8 elements or fewer
    if(p + 7 > r) return insertionSort!(A, L)(a, p, r);
    auto x = a[r];  // pivot: last element
    int j = p - 1;
    for(int i = p; i < r; i++)
    {
        if(L(x, a[i])) continue;
        swap(a[i], a[++j]);
    }
    a[r] = a[j + 1];
    a[j + 1] = x;
    quickSort!(A, L)(a, p, j);
    quickSort!(A, L)(a, j + 2, r);
}

void customSort(T)(T[] a) { quickSort!(T[], less!T)(a, 0, a.length - 1); }

// rdtsc leaves the timestamp in EDX:EAX, which DMD uses as the ulong
// return value
ulong getCycle()
{
    asm { rdtsc; }
}

ulong bench1(double[] vals)
{
    ulong startTime = getCycle();
    double[] v;
    v.length = vals.length;
    for(int i = 0; i < 100; i++)
    {
        for(int j = 0; j < v.length; j++) v[j] = vals[j];
        sort(v);
    }
    return getCycle() - startTime;
}

ulong bench2(double[] vals)
{
    ulong startTime = getCycle();
    double[] v;
    v.length = vals.length;
    for(int i = 0; i < 100; i++)
    {
        for(int j = 0; j < v.length; j++) v[j] = vals[j];
        customSort(v);
    }
    return getCycle() - startTime;
}

void main()
{
    Mt19937 gen;
    double[] vals;
    vals.length = 1000;
    for(int i = 0; i < vals.length; i++) vals[i] = uniform(0.0, 1000.0);
    ulong time1, time2;
    for(int i = 0; i < 100; i++)
    {
        time1 += bench1(vals);
        time2 += bench2(vals);
    }
    writeln("Sorting with phobos sort: ", time1/1e5);
    writeln("Sorting with custom quickSort: ", time2/1e5);
    writeln(100.0 * (time1-time2) / time1, " percent faster");
}
Dec 15 2010
On Thu, 16 Dec 2010 04:52:53 +0300, Craig Black <craigblack2 cox.net> wrote:

> On my computer, the custom sort algorithm performs almost 40 percent
> better than the Phobos one. I provided this in case anyone wanted to
> improve the phobos algorithm. I only benchmarked this with DMD. I would
> be curious to know how it performs with GDC.

Benchmarks! Love them. They will show whatever you want them to. For
example, I see that customSort is of different complexity and much slower
than Phobos' sort. =)

> void quickSort(A, alias L)(A a, int p, int r)
> {
>     if(p >= r) return;
>     if(p + 7 > r) return insertionSort!(A, L)(a, p, r);
>     auto x = a[r];

And here is why. Quicksort is quite famous for being O(n^2) worst case
(for sorted data). Your straightforward, simple (and less generic, I must
say) implementation shines for random data, but performs terribly for
ordered data. Phobos' sort isn't really plain quicksort, so it is slower
for random data (yet still of the same complexity as your code best case),
but it is pretty fast for ordered data.

I wonder, though, what exactly makes std.algorithm.sort twice slower for
random data... overhead of ranges? more complex logic/more code?

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Dec 15 2010
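[Editor's note: Nick's diagnosis is easy to check by counting comparisons. Below is a sketch in Python rather than D, mirroring the posted quickSort's Lomuto partition with a last-element pivot (minus the insertion-sort cutoff), written iteratively so the O(n) recursion depth on sorted input doesn't overflow the interpreter's stack.]

```python
import random

def quicksort_comparisons(a):
    # Lomuto partition, last element as pivot, as in the posted quickSort.
    # Iterative with an explicit stack: on sorted input the "recursion"
    # depth is O(n). Returns the number of comparisons performed.
    a = list(a)
    comparisons = 0
    stack = [(0, len(a) - 1)]
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x = a[r]                     # pivot: last element
        j = p - 1
        for i in range(p, r):
            comparisons += 1
            if a[i] <= x:            # same test as `if (L(x, a[i])) continue;`
                j += 1
                a[i], a[j] = a[j], a[i]
        a[r], a[j + 1] = a[j + 1], x
        stack.append((p, j))
        stack.append((j + 2, r))
    assert a == sorted(a)            # sanity check: result is sorted
    return comparisons

random.seed(1)
n = 1000
print("random input:", quicksort_comparisons(random.sample(range(n), n)))
print("sorted input:", quicksort_comparisons(range(n)))  # n*(n-1)/2 = 499500
```

On sorted input every partition compares the pivot against everything and removes only one element, giving exactly n(n-1)/2 comparisons; on random input the count stays near n log n, which is the gap Nick describes.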
Nick Voronin Wrote:

> On Thu, 16 Dec 2010 04:52:53 +0300, Craig Black <craigblack2 cox.net>
> wrote:
>
> And here is why. Quicksort is quite famous for being O(n^2) worst case
> (for sorted data). Your straightforward, simple (and less generic, I must
> say) implementation shines for random data, but performs terribly for
> ordered data. Phobos' sort isn't really plain quicksort, so it is slower
> for random data (yet still of the same complexity as your code best case),
> but it is pretty fast for ordered data.

A tweaked version of the Tango sort routine is slower for random data but
roughly 25% faster for ordered data. The straightforward quicksort is about
30 times slower though. All in all, the Phobos sort routine seems to do
quite well. I'd like to see a test with other types of contrived worst-case
data though.
Dec 15 2010
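[Editor's note: for anyone who wants to try Sean's suggestion, here is a sketch (Python, not D) of some commonly used adversarial input shapes. The names and the particular shapes are an editorial choice; the thread only asks for "contrived worst-case data".]

```python
import random

def contrived_inputs(n):
    # Classic stress-test shapes for comparison sorts.
    half = list(range(n // 2))
    return {
        "sorted":     list(range(n)),
        "reversed":   list(range(n, 0, -1)),
        "all_equal":  [7] * n,
        "organ_pipe": half + half[::-1],             # rises, then falls
        "few_unique": [random.randrange(4) for _ in range(n)],
        "sawtooth":   [i % 10 for i in range(n)],
    }

for name, data in contrived_inputs(1000).items():
    print(f"{name:>10}: n={len(data)}, distinct={len(set(data))}")
```

Substituting each of these for the uniform random vals array in Craig's benchmark would exercise the cases Sean mentions.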
On Wednesday 15 December 2010 22:44:39 Sean Kelly wrote:

> Nick Voronin Wrote:
>> And here is why. Quicksort is quite famous for being O(n^2) worst case
>> (for sorted data). Your straightforward, simple (and less generic, I
>> must say) implementation shines for random data, but performs terribly
>> for ordered data. Phobos' sort isn't really plain quicksort, so it is
>> slower for random data (yet still of the same complexity as your code
>> best case), but it is pretty fast for ordered data.
>
> A tweaked version of the Tango sort routine is slower for random data but
> roughly 25% faster for ordered data. The straightforward quicksort is
> about 30 times slower though. All in all, the Phobos sort routine seems
> to do quite well. I'd like to see a test with other types of contrived
> worst-case data though.

It would be nice to get a fairly extensive list of types to sort, with a
variety of values and number of values of those types, and set up an
extensive set of benchmarking tests. Heck, such a set of types and
collections of those types could be a useful benchmarking tool for a
variety of algorithms. Then we'll have a good base to build from and
compare to. If we're going to tweak algorithms for efficiency, we're going
to want some thorough tests so that we're sure that any tweaks are actually
overall improvements. It's easy to find cases which do poorly for a
particular algorithm. It can even be fairly easy to tweak an algorithm to
improve that particular case. But it's not so easy to be sure that that
tweak is actually an overall improvement.

- Jonathan M Davis
Dec 15 2010
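[Editor's note: a minimal harness along the lines Jonathan describes might look like the following Python sketch. It is not a proposal for Phobos; `bench`, the dataset names, and the best-of-N timing policy are all editorial choices.]

```python
import random
import time

def bench(sort_fn, datasets, repeats=5):
    # Time one sort routine over several named datasets; keep the best of
    # `repeats` runs to damp scheduler noise. A real suite would also vary
    # element type and collection size.
    results = {}
    for name, data in datasets.items():
        best = float("inf")
        for _ in range(repeats):
            copy = list(data)                    # sort a fresh copy every run
            start = time.perf_counter()
            sort_fn(copy)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results

random.seed(0)
n = 10_000
datasets = {
    "random":   [random.random() for _ in range(n)],
    "sorted":   list(range(n)),
    "reversed": list(range(n, 0, -1)),
}
for name, t in bench(list.sort, datasets).items():
    print(f"{name:>8}: {t * 1e3:.3f} ms")
```

Plugging in two sort functions and diffing the per-dataset numbers gives exactly the "overall improvement" check Jonathan asks for, rather than a single random-data figure.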
On Wed, 15 Dec 2010 23:07:46 -0800
Jonathan M Davis <jmdavisProg gmx.com> wrote:

> It would be nice to get a fairly extensive list of types to sort, with a
> variety of values and number of values of those types, and set up an
> extensive set of benchmarking tests. Heck, such a set of types and
> collections of those types could be a useful benchmarking tool for a
> variety of algorithms. Then we'll have a good base to build from and
> compare to. If we're going to tweak algorithms for efficiency, we're
> going to want some thorough tests so that we're sure that any tweaks are
> actually overall improvements. It's easy to find cases which do poorly
> for a particular algorithm. It can even be fairly easy to tweak an
> algorithm to improve that particular case. But it's not so easy to be
> sure that that tweak is actually an overall improvement.

On one hand, having such a source data set would be nice. On the other,
such a general-purpose algorithm simply cannot perform well; so I would
not bother much.

If one does need efficiency, then it is necessary to use or write a custom
sort adapted to the data type (int), the value space ([1,99]), the actual
distribution (many small values), the degree of pre-ordering of the source
data (bigger values have higher chances to come last), ...

The performance ratio between a specific and a general algorithm can be
huge, as you know.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 16 2010
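[Editor's note: spir's example distribution, ints in [1,99], is exactly the kind of case where a specialized O(n + k) counting sort beats any general comparison sort. A minimal Python sketch, assuming the value bounds are known in advance:]

```python
import random

def counting_sort(values, lo=1, hi=99):
    # O(n + k) sort for integers known to lie in [lo, hi], spir's
    # hypothetical distribution. No element-to-element comparisons at all.
    counts = [0] * (hi - lo + 1)
    for v in values:
        counts[v - lo] += 1
    out = []
    for offset, c in enumerate(counts):
        out.extend([lo + offset] * c)
    return out

data = [random.randint(1, 99) for _ in range(10_000)]
assert counting_sort(data) == sorted(data)   # agrees with a comparison sort
```

Because the running time never degrades to O(n^2) regardless of pre-ordering, this is one concrete instance of the huge specific-vs-general performance ratio spir mentions.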
> And here is why. Quicksort is quite famous for being O(n^2) worst case
> (for sorted data). Your straightforward, simple (and less generic, I must
> say) implementation shines for random data, but performs terribly for
> ordered data. Phobos' sort isn't really plain quicksort, so it is slower
> for random data (yet still of the same complexity as your code best case),
> but it is pretty fast for ordered data.

Quite right! Phobos sort does a really good job with ordered data. The
simple algorithm doesn't...

-Craig
Dec 16 2010