
digitalmars.D - Parallel Merge Sort

reply "Josh" <Littlestone3 yahoo.com> writes:
How can I make my parallel code more efficient? Currently, I am
getting destroyed by the serial merge sort.

http://pastebin.com/M0GKfTTX

Thank you.
Mar 03 2015
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Josh wrote:

 How can I make my parallel code more efficient. Currently, I am 
 getting destroyed by the serial merge sort.

 http://pastebin.com/M0GKfTTX
That serial merge sort I've written is little more than a toy. I suggest you compare your parallel sort with a serial sort that allocates better. Perhaps later I'll add it.

Here I have done a quick translation of some C code from Wikipedia, this is wasteful for memory (not tested much!):

import std.stdio, std.algorithm;

/// A has the items to sort, array B is a work array.
void mergeSort(T)(T[] A) pure nothrow @safe
out {
    assert(A.isSorted);
} body {
    static void bottomUpMerge(T)(in T[] A, in size_t iLeft, in size_t iRight,
                                 in size_t iEnd, T[] B)
    pure nothrow @safe @nogc {
        size_t i0 = iLeft;
        size_t i1 = iRight;

        // While there are elements in the left or right runs.
        for (size_t j = iLeft; j < iEnd; j++) {
            // If left run head exists and is <= existing right run head.
            if (i0 < iRight && (i1 >= iEnd || A[i0] <= A[i1])) {
                B[j] = A[i0];
                i0++;
            } else {
                B[j] = A[i1];
                i1++;
            }
        }
    }

    immutable n = A.length;
    auto B = new T[n];

    // Each 1-element run in A is already "sorted".
    // Make successively longer sorted runs of length 2, 4, 8, 16...
    // until the whole array is sorted.
    for (size_t width = 1; width < n; width = 2 * width) {
        // Array A is full of runs of length width.
        for (size_t i = 0; i < n; i = i + 2 * width) {
            // Merge two runs: A[i:i+width-1] and A[i+width:i+2*width-1] to B[]
            // or copy A[i:n-1] to B[] (if (i+width >= n)).
            bottomUpMerge(A, i, min(i + width, n), min(i + 2 * width, n), B);
        }
        // Now work array B is full of runs of length 2*width.
        swap(A, B);
    }
}

void main() {
    auto a = [3,1,2,5,4,8,6,7,2,9,1,4,3];
    a.mergeSort;
    a.writeln;
}

Bye,
bearophile
Mar 03 2015
next sibling parent "Josh" <Littlestone3 yahoo.com> writes:
On Wednesday, 4 March 2015 at 00:00:52 UTC, bearophile wrote:
 Josh wrote:

 How can I make my parallel code more efficient. Currently, I 
 am getting destroyed by the serial merge sort.

 http://pastebin.com/M0GKfTTX
 That serial merge sort I've written is little more than a toy. I 
 suggest you compare your parallel sort with a serial sort that 
 allocates better. [...]
Thanks for the reply bearophile, I appreciate it. I'm looking over this algorithm and just trying to wrap my head around it.
Mar 03 2015
prev sibling next sibling parent reply "Josh" <Littlestone3 yahoo.com> writes:
 Bye,
 bearophile
My problem is understanding the syntax. I'm coming from Java/C++/Rust, so it's not a huge stretch for me. Would you mind explaining the major syntax in this piece:

out {
    assert(A.isSorted);
} body {
    static void bottomUpMerge(T)(in T[] A, in size_t iLeft,
                                 in size_t iRight, in size_t iEnd, T[] B)
    pure nothrow @safe @nogc {

mainly the 'out' and 'body', as well as the 'in's.

Thank you.
Mar 03 2015
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Mar 04, 2015 at 12:58:28AM +0000, Josh via Digitalmars-d wrote:
Bye,
bearophile
My problem is understanding the syntax. I'm coming from Java/C++/Rust so it's not a huge stretch for me. Would you mind explaining the major syntax in this piece:

out {
    assert(A.isSorted);
} body {
    static void bottomUpMerge(T)(in T[] A, in size_t iLeft,
                                 in size_t iRight, in size_t iEnd, T[] B)
    pure nothrow @safe @nogc {

mainly the 'out' and 'body', as well as the 'in's
[...]

The 'out' block is an out-contract, that is, it's code for checking the sanity of the function's result after the main function body has finished. It's not important to the actual operation of the function; it's just a safety net to force a runtime error in the event that the function didn't work as designed and returned a nonsensical result.

Of course, out-contracts are also useful for documentation purposes: they describe what a function does by asserting something that must be true after the function exits. In this case, it asserts that A will be sorted after the function exits, thereby documenting that the function performs a sort on A.

The 'body' block is the main body of the function, i.e., the function itself if no contracts were written.

The 'in' modifier is the same as 'const' when applied to function parameters, but writing 'in' documents that the parameters are input parameters that won't be modified by the function. (Conversely, writing 'out' documents that the parameter is an output parameter: the compiler initializes it to its default state, and the function (presumably) stores its output value(s) in that parameter before returning. Semantically, 'out' is the same as mutable, which is unmarked in D. But writing 'out' helps code maintainability by documenting the intent of function parameters, so that readers of your code don't have to guess what the purpose of the parameter might be and how it should be used.)

T

-- 
It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
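A minimal self-contained sketch of an out-contract, for readers following along (the function name and body are mine, not from the thread):

```d
import std.algorithm : isSorted, sort;

// Hypothetical example: the out-contract runs after the function
// body finishes and asserts a property of the finished work.
void sortInPlace(int[] a)
out {
    assert(a.isSorted);  // checked every time the function returns
} body {
    a.sort();
}

void main() {
    auto xs = [3, 1, 2];
    xs.sortInPlace();
    assert(xs == [1, 2, 3]);
}
```

If the body were buggy and left the array unsorted, the assert in the out-contract would fail at runtime rather than letting the bad result propagate.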
Mar 03 2015
parent reply "anonymous" <anonymous example.com> writes:
On Wednesday, 4 March 2015 at 01:17:00 UTC, H. S. Teoh wrote:
 The 'in' modifier is the same as 'const' when applied to 
 function
 parameters, but writing 'in' documents that the parameters are 
 input
 parameters that won't be modified by the function.
You forgot to mention "scope". "in" is short for "scope const". 'scope' means that the argument is used only in this run of the function; it's not stored anywhere (a global, a struct/class member, ...).
 (Conversely, writing
 'out' documents that the parameter is an output parameter: the 
 compiler
 initializes it to its default state, and the function 
 (presumably)
 stores its output value(s) in that parameter before returning.
 Semantically, 'out' is the same as mutable, which is unmarked 
 in D. But
 writing 'out' helps code maintainability by documenting the 
 intent of
 function parameters, so that readers of your code don't have to 
 guess
 what the purpose of the parameter might be and how it should be 
 used.)
You forgot to mention that "out" parameters work like "ref" parameters (plus resetting to default value). That is, arguments are passed by reference; changes that the function makes affect the given lvalue at the call site.
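A tiny sketch of those 'out'-parameter semantics (the function and names are mine, purely illustrative):

```d
// Hypothetical demo: an 'out' parameter is passed by reference and
// is reset to the type's default value (int.init == 0) on entry.
void firstEven(in int[] xs, out int result) {
    // result is 0 here, regardless of what the caller passed in
    foreach (x; xs) {
        if (x % 2 == 0) {
            result = x;
            return;
        }
    }
}

void main() {
    int r = 999;                 // 'out' resets this on entry
    firstEven([1, 3, 4, 5], r);
    assert(r == 4);              // the change is visible at the call site
}
```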
Mar 03 2015
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Mar 04, 2015 at 02:05:45AM +0000, anonymous via Digitalmars-d wrote:
 On Wednesday, 4 March 2015 at 01:17:00 UTC, H. S. Teoh wrote:
The 'in' modifier is the same as 'const' when applied to function
parameters, but writing 'in' documents that the parameters are input
parameters that won't be modified by the function.
You forgot to mention "scope". "in" is short for "scope const". 'scope' means that the argument is used only in this run of the function; it's not stored anywhere (a global, a struct/class member, ...).
Ah, you're right, I forgot about that.
(Conversely, writing 'out' documents that the parameter is an output
parameter: the compiler initializes it to its default state, and the
function (presumably) stores its output value(s) in that parameter
before returning.  Semantically, 'out' is the same as mutable, which
is unmarked in D. But writing 'out' helps code maintainability by
documenting the intent of function parameters, so that readers of
your code don't have to guess what the purpose of the parameter might
be and how it should be used.)
You forgot to mention that "out" parameters work like "ref" parameters (plus resetting to default value). That is, arguments are passed by reference; changes that the function makes affect the given lvalue at the call site.
Right again. I really need to check what I write before posting... :-(

Thanks for the corrections!

T

-- 
Lottery: tax on the stupid. -- Slashdotter
Mar 03 2015
prev sibling parent reply "Josh" <Littlestone3 yahoo.com> writes:
 Bye,
 bearophile
I traced through the algorithm and got caught up in the last step:

    swap(A, B)

What exactly is this doing?

Thank you.
Mar 04 2015
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Josh:

 swap(A, B)
 what exactly is this doing?
You should learn to answer your own questions in many cases. Take a look at the source code of Phobos; you will see the swap() implementation inside std.algorithm.

It swaps the contents of two values. Here the values are the structs that contain the pointer+length of the dynamic arrays.

Bye,
bearophile
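In other words, only the two slice headers are exchanged, not the elements. A small sketch of this (mine, not from the thread):

```d
import std.algorithm : swap;

void main() {
    auto a = [1, 2, 3];
    auto b = [9, 8];
    swap(a, b);  // exchanges the ptr+length pairs, not the elements
    assert(a == [9, 8]);
    assert(b == [1, 2, 3]);
}
```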
Mar 04 2015
parent reply "Josh" <Littlestone3 yahoo.com> writes:
 Bye,
 bearophile
Believe me, I don't just post on here without first trying to answer the question on my own. I was already on the std.algorithm page and found this:

 Swaps lhs and rhs. The instances lhs and rhs are moved in memory, without ever calling opAssign, nor any other function. T need not be assignable at all to be swapped. If lhs and rhs reference the same instance, then nothing is done. lhs and rhs must be mutable. If T is a struct or union, then its fields must also all be (recursivelly) mutable.

Now say A[] was = [90, 50, 33, 72, 35], and by the end of the function B[] = [33, 50, 72, 90, 35].

Now we call swap(A, B), and A would now = [33, 35, 50, 72, 90], and somehow the 35 moves in front of the 50.
Mar 04 2015
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/04/2015 02:18 PM, Josh wrote:

 now say A[] was = [90, 50, 33, 72, 35]

 and by the end of the function B[] = [33, 50, 72, 90, 35]


 now we call swap(A,B) and A would now = [33, 35, 50, 72, 90]

 and somehow the 35 moves in front of the 50.
Not for me. I think there is a bug in the algorithm:

    auto a = [90, 50, 33, 72, 35];
    a.mergeSort;
    assert(a == [33, 50, 72, 90, 35]);    // Incorrect ordering :(

Ali
Mar 04 2015
next sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/04/2015 03:23 PM, Ali Çehreli wrote:
> On 03/04/2015 02:18 PM, Josh wrote:
  > now say A[] was = [90, 50, 33, 72, 35]
  >
  > and by the end of the function B[] = [33, 50, 72, 90, 35]
  >
  >
  > now we call swap(A,B) and A would now = [33, 35, 50, 72, 90]
  >
  > and somehow the 35 moves in front of the 50.

 Not for me. I think there is bug in the algorithm:

      auto a = [90, 50, 33, 72, 35];
      a.mergeSort;
      assert(a == [33, 50, 72, 90, 35]);    // Incorrect ordering :(
The bug is, what is sorted remains in mergeSort() as its parameter A. Due to a number of swaps, A may be what used to be B, which is quite a different slice from the caller's. :(

Here is a very quick fix that requires a different usage:

// (1) return the sorted slice
T[] mergeSort(T)(T[] A) pure nothrow @safe
out (result) {
    assert(result.isSorted);    // (2) what is returned is checked
} body {
    // ...
    return A;                   // (3) return the sorted range
}

a = a.mergeSort;    // (4) the caller must assign back
                    //     to the original slice

Now it is sorted: [33, 35, 50, 72, 90]

Ali
Mar 04 2015
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Ali Çehreli:

 I think there is bug in the algorithm:

     auto a = [90, 50, 33, 72, 35];
     a.mergeSort;
     assert(a == [33, 50, 72, 90, 35]);    // Incorrect ordering 
 :(
I translated that D code from the C code on Wikipedia in a few minutes; the probability of mistakes is high. Better to use some more reliable code then...

Bye,
bearophile
Mar 04 2015
prev sibling parent "Josh" <Littlestone3 yahoo.com> writes:
http://pastebin.com/4TyP7Gjj

I feel like there's some sort of optimization I can do instead of 
the nWayUnion at the end; I think that's a lot of processing on a 
single core for the final step.

I was also trying to use

copy(array[lbound..ubound].mergeSort, array[lbound..ubound])

instead of:

result ~= array[lbound..ubound].mergeSort

but I'm stuck as to how I can then sort the array now that it 
contains runs.

Example: after the copy, the array would contain (thread amount) 
sorted runs. If threads = 3 and N = 9:

array: [33, 67, 92, 12, 16, 67, 43, 46, 49]

I'm having trouble formulating how I would now sort the runs, and 
I wonder whether the copy is a lot of overhead for a large amount 
of numbers.
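(One possible approach, as a hedged sketch of mine rather than code from the thread: merge adjacent runs pairwise, doubling the run length each pass, so no single k-way merge is left for one core. Each pass's pairs are independent and could themselves be parallelized.)

```d
import std.algorithm : min;

// Merge two adjacent sorted runs a[lo..mid] and a[mid..hi]
// through a scratch buffer, then copy the result back.
void mergePair(int[] a, int[] buf, size_t lo, size_t mid, size_t hi) {
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) buf[k++] = a[i++];
    while (j < hi)  buf[k++] = a[j++];
    a[lo .. hi] = buf[lo .. hi];
}

// Repeatedly merge runs of length runLen, 2*runLen, ... until one
// run (the whole sorted array) remains.
void mergeRuns(int[] a, size_t runLen) {
    auto buf = new int[a.length];
    for (size_t w = runLen; w < a.length; w *= 2)
        for (size_t i = 0; i + w < a.length; i += 2 * w)
            mergePair(a, buf, i, i + w, min(i + 2 * w, a.length));
}

void main() {
    // the three sorted runs from the example above
    auto a = [33, 67, 92, 12, 16, 67, 43, 46, 49];
    mergeRuns(a, 3);
    assert(a == [12, 16, 33, 43, 46, 49, 67, 67, 92]);
}
```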

Thank you,
Josh
Mar 08 2015
prev sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 3 March 2015 at 22:55:05 UTC, Josh wrote:
 How can I make my parallel code more efficient. Currently, I am 
 getting destroyed by the serial merge sort.

 http://pastebin.com/M0GKfTTX

 Thank you.
I've implemented a concurrent merge sort if you want to take a look at my code. It's much faster than "serial" merge sort. https://github.com/Xinok/XSort/blob/master/mergesort.d#L88
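(For comparison, a minimal hedged sketch of the usual divide-and-conquer parallelization with std.parallelism: my code, not Xinok's, with a made-up cutoff constant. Sort the two halves concurrently, then merge.)

```d
import std.algorithm : sort;
import std.parallelism : task, taskPool;

// Hypothetical sketch: split, sort the halves in parallel, then
// merge them through the caller-supplied scratch buffer.
void parallelMergeSort(int[] a, int[] buf) {
    if (a.length < 2048) {   // below this, task overhead dominates
        sort(a);
        return;
    }
    immutable mid = a.length / 2;
    auto left = task!parallelMergeSort(a[0 .. mid], buf[0 .. mid]);
    taskPool.put(left);                             // left half on the pool
    parallelMergeSort(a[mid .. $], buf[mid .. $]);  // right half here
    left.yieldForce();                              // wait for the left half

    // standard two-way merge of the sorted halves
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < a.length)
        buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid)      buf[k++] = a[i++];
    while (j < a.length) buf[k++] = a[j++];
    a[] = buf[];
}

void main() {
    import std.algorithm : isSorted;
    import std.random : uniform;
    auto a = new int[10_000];
    foreach (ref x; a) x = uniform(0, 1000);
    parallelMergeSort(a, new int[a.length]);
    assert(a.isSorted);
}
```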
Mar 03 2015