digitalmars.D - Sorting algorithm
- Xinok (11/11) Oct 07 2011 Hi, it's been years since I've posted here. I just wanted to share
- Andrei Alexandrescu (3/14) Oct 07 2011 This is interesting. What do the numbers in the benchmark represent?
- Xinok (19/35) Oct 07 2011 I'll just post the code I used for benchmarking. Simply put, smaller
- Andrei Alexandrescu (10/17) Oct 07 2011 [snip]
- Ellery Newcomer (3/11) Oct 07 2011 stable sort is flat out broken
- Andrei Alexandrescu (3/14) Oct 07 2011 Then the contribution would be even more welcome!
- Xinok (17/36) Oct 07 2011 I'm not familiar with the process. What all would I need to do to
- Andrei Alexandrescu (12/39) Oct 07 2011 D's standard library is on github:
- Xinok (5/46) Oct 07 2011 Thanks, I'll look into it when I have a little more time.
- Andrei Alexandrescu (4/8) Oct 08 2011 Wouldn't swapping the argument to less work?
- Xinok (12/20) Oct 08 2011 The merge step in my algorithm is similar to Timsort, in that it can
- Andrei Alexandrescu (7/28) Oct 08 2011 I don't understand. Say all you have is "<". Then you can do everything:
- Xinok (9/41) Oct 08 2011 I actually wasn't aware you could do that O.o, although you got the
- Andrei Alexandrescu (10/56) Oct 08 2011 That's rig[off - c] < lef[c].
- Xinok (2/9) Oct 08 2011 nm, I got it now. >.< I can't believe I didn't know this.
- Ellery Newcomer (3/8) Oct 08 2011 it'd be nice if there were some convenience templates somewhere for
- Ellery Newcomer (12/12) Oct 07 2011 tinkered with timsort a bit a few months ago. comparing that to your
- Xinok (77/89) Oct 07 2011 The benchmark for descending is no surprise since timsort explicitly
- Xinok (5/16) Oct 08 2011 To follow up on my original post, I wrote a text file which explains the...
- Andrei Alexandrescu (17/35) Oct 08 2011 Nice writeup, but I found it quite difficult to get into. What would
- Xinok (20/35) Oct 08 2011 I didn't mean for this text to be anything official. I just felt it was
- Andrei Alexandrescu (15/53) Oct 08 2011 My attempt was to make a few suggestions that would improve your
- Xinok (16/56) Oct 08 2011 Thanks for that. :) I can't update the original link, so the text is
- Xinok (9/27) Oct 09 2011 And to follow up on this post, I created a permanent page for xinok sort...
Hi, it's been years since I've posted here. I just wanted to share something I worked on, a new sorting algorithm. It's a variant of merge sort, but much more memory efficient with only a small loss in performance. The most similar algorithm I know of is Timsort. I had need for a stable sorting algorithm, but the performance of stable sort in Phobos is terrible. This pretty much left merge sort, which has good performance but requires O(n) space. That persuaded me to design my own sorting algorithm. Here, you can find the code, details of the algorithm, and benchmarks (introSort is the unstable sort in Phobos). http://www.neowin.net/forum/blog/422/entry-3737-sort-algorithm-complete/
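[Editor's note: for contrast with the approach described above, a plain top-down merge sort looks like the sketch below; the O(n) auxiliary buffer `buf` is exactly the cost the post is trying to avoid. This is a minimal illustrative version, not Xinok's algorithm.]

```d
import std.algorithm : isSorted;

/// Minimal stable top-down merge sort; `buf` is the O(n) scratch space.
void plainMergeSort(T)(T[] a, T[] buf)
{
    if (a.length < 2) return;
    immutable mid = a.length / 2;
    plainMergeSort(a[0 .. mid], buf);
    plainMergeSort(a[mid .. $], buf);

    // Stable merge: on ties, take from the left run first.
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < a.length)
        buf[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) buf[k++] = a[i++];
    while (j < a.length) buf[k++] = a[j++];
    a[] = buf[0 .. a.length];
}

unittest
{
    auto a = [5, 3, 8, 1, 9, 2];
    auto buf = new int[a.length];
    plainMergeSort(a, buf);
    assert(isSorted(a));
}
```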
Oct 07 2011
On 10/7/11 11:42 AM, Xinok wrote:
> Hi, it's been years since I've posted here. I just wanted to share something I worked on, a new sorting algorithm. It's a variant of merge sort, but much more memory efficient with only a small loss in performance. [snip]
> http://www.neowin.net/forum/blog/422/entry-3737-sort-algorithm-complete/

This is interesting. What do the numbers in the benchmark represent?

Andrei
Oct 07 2011
On 10/7/2011 1:03 PM, Andrei Alexandrescu wrote:
> This is interesting. What do the numbers in the benchmark represent?

I'll just post the code I used for benchmarking. Simply put, smaller numbers are faster.

    ubyte[16] digest;
    uint[] base;
    base.length = 1024 * 1024 * 16;
    foreach(i, ref v; base) v = i;
    randomShuffle(base);
    writeln("Is sorted: ", isSorted(base));
    writeln("Is length: ", base.length);
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
    long start, finish;

    auto copy = base.dup;
    QueryPerformanceCounter(&start);
    xinokSort(copy);
    QueryPerformanceCounter(&finish);
    sum(digest, copy);
    writeln("xinokSort: ", finish - start, "\t", digestToString(digest));
    assert(isSorted(copy), "Array not sorted");
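[Editor's note: the harness above is Windows-specific (QueryPerformanceCounter, SetThreadPriority). A portable sketch of the same measurement using Phobos' StopWatch might look as follows; `sort` stands in for `xinokSort`, and the digest bookkeeping from the original is omitted.]

```d
import std.algorithm : isSorted, sort;
import std.array : array;
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.random : randomShuffle;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    // Build and shuffle the input, as in the original harness.
    auto base = iota(0u, 1024u * 1024u).array;
    randomShuffle(base);

    auto copy = base.dup;
    auto sw = StopWatch(AutoStart.yes);
    sort(copy);                       // stand-in for xinokSort
    sw.stop();
    writeln("sort: ", sw.peek.total!"usecs", " usecs");
    assert(isSorted(copy), "Array not sorted");
}
```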
Oct 07 2011
On 10/7/11 12:23 PM, Xinok wrote:
> I'll just post the code I used for benchmarking. Simply put, smaller numbers are faster.
> [snip]

Thanks. It would be great if you wanted to contribute your stable sort to Phobos via a pull request.

Also, which version of D are you using? I'm seeing that std.algorithm.sort (introSort) performs quite badly; for example, it's twice as slow on shuffled data against quickSort, and it also deals badly with already sorted data. Generally it does much worse than the quickSort baseline. Wonder why.

Andrei
Oct 07 2011
On 10/07/2011 01:20 PM, Andrei Alexandrescu wrote:
> Also, which version of D are you using? I'm seeing that std.algorithm.sort (introSort) performs quite badly; for example, it's twice as slow on shuffled data against quickSort, and it also deals badly with already sorted data. Generally it does much worse than the quickSort baseline. Wonder why.

stable sort is flat out broken

http://d.puremagic.com/issues/show_bug.cgi?id=4584
Oct 07 2011
On 10/07/11 13:36, Ellery Newcomer wrote:
> stable sort is flat out broken
> http://d.puremagic.com/issues/show_bug.cgi?id=4584

Then the contribution would be even more welcome!

Andrei
Oct 07 2011
On 10/7/2011 2:20 PM, Andrei Alexandrescu wrote:
> Thanks. It would be great if you wanted to contribute your stable sort to Phobos via a pull request.
> [snip]

I'm not familiar with the process. What all would I need to do to contribute my sort to Phobos?

On another note, my sort is most efficient on random access ranges. A simple merge sort would be more practical for other data structures like linked lists.

I used DMD 2.051 for those benchmarks, so I did new benchmarks using DMD 2.055. Unstable sort saw a big improvement, but stable sort still did poorly. I find it unusual since I thought stable sort was supposed to use merge sort.

xinokSort:       7564654
shellSort:       8843484
quickSort:       6005074
mergeSort:       6625306
radixSort:       2837697
phobos Unstable: 5559726
phobos Stable:   113924852
Oct 07 2011
On 10/07/11 13:50, Xinok wrote:
> I'm not familiar with the process. What all would I need to do to contribute my sort to Phobos?

D's standard library is on github: https://github.com/D-Programming-Language/phobos. Anyone can fork it, modify it as they please, and then submit the changes for review via a so-called "pull request". Here's an example of a pull request with comments and all: https://github.com/D-Programming-Language/phobos/pull/272.

There is documentation available on github.com about how to use the site. It's some work but it's time well invested - the kin of git and github are here to stay.

Would anyone in the community be willing to shepherd Xinok through the first pull request?

Andrei
Oct 07 2011
On 10/7/2011 5:08 PM, Andrei Alexandrescu wrote:
> D's standard library is on github: https://github.com/D-Programming-Language/phobos. Anyone can fork it, modify it as they please, and then submit the changes for review via a so-called "pull request". [snip]

Thanks, I'll look into it when I have a little more time.

If my algorithm were to be implemented in Phobos, the template arguments for std.algorithm.sort would need to be modified. For my algorithm to be stable, it requires defining two conditions, both a 'less' and 'greater'.
Oct 07 2011
On 10/07/11 23:00, Xinok wrote:
> Thanks, I'll look into it when I have a little more time.

Looking forward to it.

> If my algorithm were to be implemented in Phobos, the template arguments for std.algorithm.sort would need to be modified. For my algorithm to be stable, it requires defining two conditions, both a 'less' and 'greater'.

Wouldn't swapping the argument to less work?

Andrei
Oct 08 2011
On 10/8/2011 9:15 AM, Andrei Alexandrescu wrote:
> Wouldn't swapping the argument to less work?

The merge step in my algorithm is similar to Timsort, in that it can merge left to right as well as right to left. This needs two conditions to do a stable sort, 'less or equal' and 'greater or equal'. By swapping the arguments, you can only get 'greater or equal'. You're still missing 'less or equal', which is required for my algorithm to be stable.

It wouldn't be difficult to overload the template. When using an unstable sort, the greater argument can be left empty. sort!("a < b") would simply translate to:

sort!("a < b", "", SwapStrategy.unstable)
Oct 08 2011
On 10/08/11 10:01, Xinok wrote:
> The merge step in my algorithm is similar to Timsort, in that it can merge left to right as well as right to left. This needs two conditions to do a stable sort, 'less or equal' and 'greater or equal'. By swapping the arguments, you can only get 'greater or equal'. You're still missing 'less or equal', which is required for my algorithm to be stable.

I don't understand. Say all you have is "<". Then you can do everything:

a <= b is !(b < a)
a > b is b < a

This is an absolute classic. Probably it's negation that is missing from your assumptions? You can always negate a predicate.

Andrei
Oct 08 2011
On 10/8/2011 11:04 AM, Andrei Alexandrescu wrote:
> I don't understand. Say all you have is "<". Then you can do everything:
> a <= b is !(b < a)
> a > b is b < a

I actually wasn't aware you could do that O.o, although you got the operators wrong in your example. It does add overhead though. It requires two comparisons, and possibly twice the number of lookups, such as in this statement:

lef[c] > rig[off - c]

How about, if the greater argument is empty, then it fills it automatically with this. But also give the programmer the option to define both conditions for more efficient code.
Oct 08 2011
On 10/8/11 10:36 AM, Xinok wrote:
> I actually wasn't aware you could do that O.o, although you got the operators wrong in your example.

Where?

> It does add overhead though. It requires two comparisons, and possibly twice the number of lookups, such as in this statement:
> lef[c] > rig[off - c]

That's rig[off - c] < lef[c].

> How about, if the greater argument is empty, then it fills it automatically with this. But also give the programmer the option to define both conditions for more efficient code.

I think I'm missing something. Again: if you have "<" then "<=", ">", and ">=" are zero cost. What is more expensive is deciding a == b. You need to do it by saying !(a < b) && !(b < a). That's a cost worth paying because the second expression is more general - it allows you to define equivalence classes by using "<" even though the objects are not equal.

Andrei
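[Editor's note: the identities discussed above, spelled out as a small illustrative sketch. Every comparison, and the equivalence test, falls out of a single "less" predicate; the function names here are made up for the example.]

```d
// A single "less" predicate is all the algorithm needs to require.
bool less(T)(T a, T b) { return a < b; }

// The rest are derived at zero extra cost per comparison:
bool lessEq(T)(T a, T b)    { return !less(b, a); }  // a <= b  is  !(b < a)
bool greater(T)(T a, T b)   { return  less(b, a); }  // a >  b  is  b < a
bool greaterEq(T)(T a, T b) { return !less(a, b); }  // a >= b  is  !(a < b)

// Equivalence is the expensive case: it takes two comparisons.
bool equiv(T)(T a, T b) { return !less(a, b) && !less(b, a); }

unittest
{
    assert(lessEq(1, 1) && lessEq(1, 2));
    assert(greater(2, 1) && !greater(1, 2));
    assert(greaterEq(2, 2));
    assert(equiv(3, 3) && !equiv(3, 4));
}
```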
Oct 08 2011
On 10/8/2011 11:56 AM, Andrei Alexandrescu wrote:
> I think I'm missing something. Again: if you have "<" then "<=", ">", and ">=" are zero cost. What is more expensive is deciding a == b. You need to do it by saying !(a < b) && !(b < a). [snip]

nm, I got it now. >.< I can't believe I didn't know this.
Oct 08 2011
On 10/08/2011 10:04 AM, Andrei Alexandrescu wrote:
> I don't understand. Say all you have is "<". Then you can do everything:
> a <= b is !(b < a)
> a > b is b < a

it'd be nice if there were some convenience templates somewhere for converting a lt function to le, gt, ge, etc
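[Editor's note: the convenience templates asked for here could be sketched roughly as below, building on std.functional.binaryFun. This is an assumed shape, not an existing Phobos facility.]

```d
import std.functional : binaryFun;

/// Given any "less" predicate (a string lambda or a callable),
/// derive the related comparisons from it.
template ComparisonsFor(alias less)
{
    alias lt = binaryFun!less;
    bool le(A, B)(A a, B b) { return !lt(b, a); }
    bool gt(A, B)(A a, B b) { return  lt(b, a); }
    bool ge(A, B)(A a, B b) { return !lt(a, b); }
}

unittest
{
    alias cmp = ComparisonsFor!"a < b";
    assert(cmp.le(1, 1));
    assert(cmp.gt(2, 1));
    assert(cmp.ge(2, 2));
}
```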
Oct 08 2011
tinkered with timsort a bit a few months ago. comparing that to your sort, I get numbers like

xinokSort
  random: 77
  ascending: 0
  descending: 21

timsort
  random: 354
  ascending: 1
  descending: 4

where each are sorting a 500k element array of int, times are msecs, compilation flags were -O -inline -release, sources are

http://personal.utulsa.edu/~ellery-newcomer/timsort.d
http://personal.utulsa.edu/~ellery-newcomer/xinokSort.d

Nice job, Xinok.

anyone want to try to optimize my timsort? :)
Oct 07 2011
On 10/7/2011 2:27 PM, Ellery Newcomer wrote:
> tinkered with timsort a bit a few months ago. comparing that to your sort, I get numbers like [snip]

The benchmark for descending is no surprise since timsort explicitly searches for runs in descending order. However, the random benchmark isn't what I expected. My sort is a bit slower than a standard merge sort, so I would expect Timsort to be faster. While part of the reason may be your code is unoptimized, it may also be the fault of a bug. I ran some more benchmarks and I found a few cases where your code failed, specifically when the data is mostly random. Just a guess, but your code may have a problem when the "runs" are too small.

I also found a case when std.algorithm.sort performs poorly, under Small Shuffle + Descending.
Numbers represent time taken; smaller numbers are faster.
An MD5 hash is generated for each sort, to verify it was sorted correctly.
phoboSort is the unstable std.algorithm.sort.

Ascending
Is length: 16777216
xinokSort:     50722  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:    726046  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:    943475  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   1778944  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   3082901  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:    955285  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:      89201  C8B8D3A2B4D9882ABCFC31721EF27145

Descending
Is length: 16777216
xinokSort:   1916452  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   2238446  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   1581095  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   3390033  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   3067271  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:   1035827  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:     240896  C8B8D3A2B4D9882ABCFC31721EF27145

Complete Shuffle
Is length: 16777216
xinokSort:   7607511  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   8814887  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   5623837  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   6704733  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   2825567  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:   5532275  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:   29142913  E03C4778321F69D8AD10F624E6093599

Small Shuffle (1024 pairs were picked at random and swapped)
Is length: 16777216
xinokSort:    589882  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   3651222  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   2655391  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   2162840  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   2988630  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:    963103  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:    6523251  C8B8D3A2B4D9882ABCFC31721EF27145

Large Shuffle (1,048,576 pairs were picked at random and swapped)
Is length: 16777216
xinokSort:   1956126  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   7617207  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   3089674  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   2606200  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   2932306  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:   1619484  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:   14883251  DE54E94D1A058459FEA3A80096FCBB18

Small Shuffle + Descending
Is length: 16777216
xinokSort:   2288869  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   4599417  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   2604222  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   3457241  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   3055080  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort: 261768564  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:   10149160  C8B8D3A2B4D9882ABCFC31721EF27145

Large Shuffle + Descending
xinokSort:   2923813  C8B8D3A2B4D9882ABCFC31721EF27145
shellSort:   7678966  C8B8D3A2B4D9882ABCFC31721EF27145
quickSort:   3833138  C8B8D3A2B4D9882ABCFC31721EF27145
mergeSort:   3971124  C8B8D3A2B4D9882ABCFC31721EF27145
radixSort:   2941206  C8B8D3A2B4D9882ABCFC31721EF27145
phoboSort:   3749512  C8B8D3A2B4D9882ABCFC31721EF27145
tim Sort:   22709918  191BD30B3D0E52C922A7E6A16E5E63A5
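[Editor's note: the run detection referred to above can be sketched like this. It is a simplified, hypothetical rendering of Timsort's first phase, not Ellery's implementation; note that descending runs must be *strictly* descending so that reversing them in place preserves stability.]

```d
/// Length of the maximal run starting at `lo`; a descending run is
/// reversed in place so that every run comes out ascending.
size_t countRunAndMakeAscending(T)(T[] a, size_t lo)
{
    size_t hi = lo + 1;
    if (hi == a.length) return 1;
    if (a[hi] < a[lo])  // strictly descending run
    {
        while (hi + 1 < a.length && a[hi + 1] < a[hi]) ++hi;
        // Reverse a[lo .. hi+1] in place.
        size_t i = lo, j = hi;
        while (i < j)
        {
            auto t = a[i]; a[i] = a[j]; a[j] = t;
            ++i; --j;
        }
    }
    else                // ascending (non-decreasing) run
    {
        while (hi + 1 < a.length && !(a[hi + 1] < a[hi])) ++hi;
    }
    return hi - lo + 1;
}

unittest
{
    auto a = [5, 4, 3, 1, 2];
    assert(countRunAndMakeAscending(a, 0) == 4); // [5,4,3,1] reversed
    assert(a == [1, 3, 4, 5, 2]);
}
```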
Oct 07 2011
On 10/7/2011 12:42 PM, Xinok wrote:
> Hi, it's been years since I've posted here. I just wanted to share something I worked on, a new sorting algorithm. [snip]

To follow up on my original post, I wrote a text file which explains the algorithm in detail. The most important thing to understand is the "range swap", which I tried to explain as simply as possible.

http://cl.ly/0H193k2s0G2T1A002v3I/xinokSort.txt
Oct 08 2011
On 10/8/11 10:11 AM, Xinok wrote:
> To follow up on my original post, I wrote a text file which explains the algorithm in detail. The most important thing to understand is the "range swap", which I tried to explain as simply as possible.
> http://cl.ly/0H193k2s0G2T1A002v3I/xinokSort.txt

Nice writeup, but I found it quite difficult to get into. What would help is anchoring it with already known stuff (if it's not, the reader must assume it's unrelated, which makes things difficult). So it would be great if you compared and contrasted range swap with the in-place merge algorithm (e.g. http://www.sgi.com/tech/stl/inplace_merge.html), STL's stable sort (http://www.sgi.com/tech/stl/stable_sort.html) which is O(N log(N) log(N)), and possibly with std.algorithm.bringToFront. Simply presenting a stylized implementation of the range swap would be helpful.

Also there are a few oddities in the text:

* "- Constant additional memory (one memory allocation per thread)" -> the parenthesis does not sustain the point. There could be one memory allocation but it might allocate a non-constant amount.

* All discussion about tail call optimization is unneeded. Tail calls can be converted trivially to loops, so don't mention anything. Feel free to convert to loops if needed.

Andrei
Oct 08 2011
On 10/8/2011 12:47 PM, Andrei Alexandrescu wrote:
> Nice writeup, but I found it quite difficult to get into. What would help is anchoring it with already known stuff. [snip]

I didn't mean for this text to be anything official. I just felt it was important to provide an explanation of my algorithm so others could better understand it and its implications. That's all.

There's also the issue of, "what if I'm not the first?" I couldn't find anything similar to the "range swap", but that doesn't mean it didn't already exist. Writing papers isn't my forte; I'm a self-taught programmer. So if my algorithm ever gains popularity, I'll let the experts deal with it.

> * "- Constant additional memory (one memory allocation per thread)" -> the parenthesis does not sustain the point. There could be one memory allocation but it might allocate a non-constant amount.

I thought it was important to clarify that. My algorithm is easy to parallelize, but it does require allocating a unique block of memory for each thread. It is relatively constant as well, as it would make sense that the number of running threads matches the number of cores in the hardware. The only reason to allocate a non-constant amount is if you include the optimization I mentioned, to allocate O(n/1024) space.

> * All discussion about tail call optimization is unneeded. Tail calls can be converted trivially to loops, so don't mention anything. Feel free to convert to loops if needed.

I think it's an issue worth addressing though. Some programmers might assume that, because it's a variant of merge sort, stack overflows won't be an issue. When I originally implemented my algorithm, I didn't use tail calls and I had problems with stack overflows on partially sorted data. So it is an issue.
Oct 08 2011
On 10/8/11 12:30 PM, Xinok wrote:
> I didn't mean for this text to be anything official. I just felt it was important to provide an explanation of my algorithm so others could better understand it and its implications. [snip]

My attempt was to make a few suggestions that would improve your writeup, which in turn not only helps popularity of the algorithm, but also helps you hone a useful skill.

> I thought it was important to clarify that. My algorithm is easy to parallelize, but it does require allocating a unique block of memory for each thread. [snip]

Then you may want to say:

* Constant additional memory and one call to an allocation routine per thread.

> I think it's an issue worth addressing though. Some programmers might assume that, because it's a variant of merge sort, stack overflows won't be an issue. When I originally implemented my algorithm, I didn't use tail calls and I had problems with stack overflows on partially sorted data. So it is an issue.

No, it's not. Please think through: tail recursion IS A loop, period. (BTW I see you use simple tail recursion as opposed to the more general tail calls.) It's cut and dried. So no discussion is needed. Just mention there is no risk of stack overflow, without ever mentioning tail calls. In fact there's not even a need to mention that because otherwise the algorithm would have non-constant space complexity, which you clarify it doesn't.

Andrei
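[Editor's note: the transformation referred to here, sketched in code. The classic form is quicksort's: recurse only on the smaller partition and loop on the larger, which replaces the tail call and bounds stack depth at O(log n). `partitionStep` is a hypothetical pivot routine, assumed to return an index in 1 .. a.length-1; this is not Xinok's or Phobos' code.]

```d
/// Tail recursion rewritten as a loop: the second recursive call
/// becomes a reassignment of the slice being sorted.
void sortLoop(alias partitionStep, T)(T[] a)
{
    while (a.length > 1)
    {
        immutable p = partitionStep(a);
        if (p < a.length - p)
        {
            sortLoop!partitionStep(a[0 .. p]);  // recurse on the smaller side
            a = a[p .. $];                      // loop replaces the tail call
        }
        else
        {
            sortLoop!partitionStep(a[p .. $]);
            a = a[0 .. p];
        }
    }
}
```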
Oct 08 2011
On 10/8/2011 1:56 PM, Andrei Alexandrescu wrote:
> My attempt was to make a few suggestions that would improve your writeup, which in turn not only helps popularity of the algorithm, but also you to hone a useful skill.

Thanks for that. :) I can't update the original link, so the text is what it is. I'll look into making a better write-up of my algorithm and posting it somewhere more permanent.

> Then you may want to say:
> * Constant additional memory and one call to an allocation routine per thread.

I'll do that.

> No, it's not. Please think through: tail recursion IS A loop, period. [snip]

Take a look at the Wikipedia page for quick sort. It mentions issues with the algorithm, such as using the first element as the pivot causing worst-case behavior on sorted data, and even something I didn't know about, integer overflow when choosing a pivot. It also mentions tail calls / tail recursion five times.

https://secure.wikimedia.org/wikipedia/en/wiki/Quick_sort#Implementation_issues

I'll reduce talk about tail calls; I'll probably just list it under optimizations, but I'm not removing it completely. Otherwise, I'll make a note that, depending on the implementation, the range swap can result in a stack overflow, but I won't mention tail calls / tail recursion as a solution.
Oct 08 2011
On 10/8/2011 11:11 AM, Xinok wrote:
> To follow up on my original post, I wrote a text file which explains the algorithm in detail. The most important thing to understand is the "range swap", which I tried to explain as simply as possible.
> http://cl.ly/0H193k2s0G2T1A002v3I/xinokSort.txt

And to follow up on this post, I created a permanent page for xinok sort on Sourceforge. There's nothing new to see on this page yet, but I'll work on that in due time.

https://sourceforge.net/projects/xinoksort/

My first goal is to write an implementation which I can contribute as the new stable sorting algorithm for Phobos. This includes string mixins, ranges, asserts, unittests, and anything else that is required. If anybody wants to assist me with this, I welcome the help.
Oct 09 2011