www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Idempotent partition around median of 5?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
This appears a simple problem: given numbers a, b, c, d, e, swap them 
around so as to place the median in c and partition the others around 
it. I.e. the postcondition is: a <= c && b <= c && c <= d && c <= e.

Searching the Net for this isn't easy. Fortunately "our own" Xinok has 
the best of breed result, see 
http://stackoverflow.com/questions/11065963/possible-to-partition-five-elements-by-median-with-six-comparisons.

His algorithm is a little suboptimal in that when given a sorted array, 
it sometimes leaves it unsorted. So I changed it to 
http://dpaste.dzfl.pl/5fb2238d9891 to fix that. If I count right, it 
also saves one swap: does 0-9 swaps instead of 0-10.

Ideally however, such an algorithm would do 0 swaps if the median is 
already in c. My algorithm may still swap d and e without necessity.

So there's got to be a better solution. Your challenge - should you 
choose to accept it :o) - is an algorithm that does the partitioning in 
6 comparisons and <= 9 swaps, which is idempotent: when applied twice, 
it always does 0 swaps.


Andrei
Feb 03 2016
next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei Alexandrescu 
wrote:
 This appears a simple problem: given numbers a, b, c, d, e, 
 swap them around so as to place the median in c and partition 
 the others around it. I.e. the postcondition is: a <= c && b <= 
 c && c <= d && c <= e.

 <snip>

 So there's got to be a better solution. Your challenge - should 
 you choose to accept it :o) - is an algorithm that does the 
 partitioning in 6 comparisons and <= 9 swaps, which is 
 idempotent: when applied twice, it always does 0 swaps.
Well if we look at it, the max compares for optimal sorting algorithm is the log2 of the max combinations in how the elements could be arranged; !5 = 120 or 7 compares. I'm not sure of the max swaps, seems like it could be 4 if done right, but that might be impractical.
Feb 03 2016
parent reply Xinok <xinok live.com> writes:
On Thursday, 4 February 2016 at 01:33:54 UTC, Era Scarecrow wrote:
 On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei 
 Alexandrescu wrote:
 This appears a simple problem: given numbers a, b, c, d, e, 
 swap them around so as to place the median in c and partition 
 the others around it. I.e. the postcondition is: a <= c && b 
 <= c && c <= d && c <= e.

 <snip>

 So there's got to be a better solution. Your challenge - 
 should you choose to accept it :o) - is an algorithm that does 
 the partitioning in 6 comparisons and <= 9 swaps, which is 
 idempotent: when applied twice, it always does 0 swaps.
Well if we look at it, the max compares for optimal sorting algorithm is the log2 of the max combinations in how the elements could be arranged; !5 = 120 or 7 compares.
The order of a,b and d,e don't matter because we're partitioning, not sorting. So this problem can be solved in six comparisons.
  I'm not sure of the max swaps, seems like it could be 4 if 
 done right, but that might be impractical.
I believe that's correct. Each swap can move at least one element into it's final position. After three swaps, there may be two elements which are still out of place, and swapping them will move them both into their final position. Thus, four swaps max. :-) A solution to this problem does exist: Build a tree of if-else branches for all the different possibilities (2^6=64) and hard-code the optimal sequence of swaps. However, this function would be huge so this isn't very practical. Ideally, we want to minimize the size of the function as much as possible, adding in a few extra swaps if necessary. I'll take a crack at it this weekend. It sounds like an interesting problem.
Feb 03 2016
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/03/2016 09:46 PM, Xinok wrote:
 I'll take a crack at it this weekend. It sounds like an interesting
 problem.
Thanks very much! -- Andrei
Feb 03 2016
prev sibling next sibling parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 4 February 2016 at 02:46:42 UTC, Xinok wrote:
 A solution to this problem does exist: Build a tree of if-else 
 branches for all the different possibilities (2^6=64) and 
 hard-code the optimal sequence of swaps. However, this function 
 would be huge so this isn't very practical.
Yeah that's kinda what i was coming to a conclusion to.
 Ideally, we want to minimize the size of the function as much 
 as possible, adding in a few extra swaps if necessary.
Hmmm if i wrote it (this way) it would probably brute force the proper way to build the function rather than doing it by hand myself. So a helper function to make the optimal if/else statements to deal with it. BUT this only makes sense if it's always going to be 5 elements, if it's fewer or more, then it's sorta pointless and a more generic algorithm would be preferred. Although writing a brute force program creation helper function could still be of some use.
Feb 03 2016
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/04/2016 03:46 AM, Xinok wrote:
  I'm not sure of the max swaps, seems like it could be 4 if done
 right, but that might be impractical.
I believe that's correct. Each swap can move at least one element into it's final position. After three swaps, there may be two elements which are still out of place, and swapping them will move them both into their final position. Thus, four swaps max. :-)
Three swaps are always enough. Using the first swap, put the median where it belongs, and each subsequent swap can be chosen such that it moves two elements to their final position.
Feb 04 2016
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/04/2016 02:24 AM, Andrei Alexandrescu wrote:
 This appears a simple problem: given numbers a, b, c, d, e, swap them
 around so as to place the median in c and partition the others around
 it. I.e. the postcondition is: a <= c && b <= c && c <= d && c <= e.

 Searching the Net for this isn't easy. Fortunately "our own" Xinok has
 the best of breed result, see
 http://stackoverflow.com/questions/11065963/possible-to-partition-five-elements-by-median-with-six-comparisons.


 His algorithm is a little suboptimal in that when given a sorted array,
 it sometimes leaves it unsorted. So I changed it to
 http://dpaste.dzfl.pl/5fb2238d9891 to fix that. If I count right, it
 also saves one swap: does 0-9 swaps instead of 0-10.

 Ideally however, such an algorithm would do 0 swaps if the median is
 already in c. My algorithm may still swap d and e without necessity.

 So there's got to be a better solution. Your challenge - should you
 choose to accept it :o) - is an algorithm that does the partitioning in
 6 comparisons and <= 9 swaps, which is idempotent: when applied twice,
 it always does 0 swaps.


 Andrei
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps): void partition5(ref int[5] a){ if(a[0]<a[1]){ if(a[2]<a[3]){ if(a[0]<a[2]){ if(a[1]<a[4]){ if(a[1]<a[2]){ if(!(a[2]<a[4])){ swap(a[4],a[2]); } }else if(a[1]<a[3]){ swap(a[1],a[2]); }else{ swap(a[3],a[2]); swap(a[1],a[3]); } }else if(a[2]<a[4]){ if(a[3]<a[4]){ swap(a[3],a[2]); swap(a[1],a[3]); }else{ swap(a[4],a[2]); swap(a[1],a[4]); } }else if(a[1]<a[2]){ swap(a[1],a[2]); swap(a[1],a[4]); }else{ swap(a[1],a[4]); } }else if(a[3]<a[4]){ if(a[0]<a[3]){ if(a[1]<a[3]){ swap(a[1],a[2]); }else{ swap(a[3],a[2]); swap(a[1],a[3]); } }else if(a[0]<a[4]){ swap(a[0],a[2]); swap(a[1],a[3]); }else{ swap(a[4],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[0]<a[4]){ if(a[1]<a[4]){ swap(a[1],a[2]); }else{ swap(a[4],a[2]); swap(a[1],a[4]); } }else if(a[0]<a[3]){ swap(a[0],a[2]); swap(a[1],a[4]); }else{ swap(a[3],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[0]<a[3]){ if(a[1]<a[4]){ if(a[1]<a[3]){ if(a[3]<a[4]){ swap(a[3],a[2]); }else{ swap(a[4],a[2]); } }else if(a[1]<a[2]){ swap(a[1],a[2]); swap(a[1],a[3]); }else{ swap(a[1],a[3]); } }else if(a[3]<a[4]){ if(a[2]<a[4]){ swap(a[1],a[3]); }else{ swap(a[4],a[2]); swap(a[1],a[3]); } }else if(a[1]<a[3]){ swap(a[1],a[2]); swap(a[1],a[4]); }else{ swap(a[3],a[2]); swap(a[1],a[4]); } }else if(a[2]<a[4]){ if(a[0]<a[2]){ if(a[1]<a[2]){ swap(a[1],a[2]); swap(a[1],a[3]); }else{ swap(a[1],a[3]); } }else if(a[0]<a[4]){ swap(a[0],a[2]); swap(a[1],a[3]); }else{ swap(a[4],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[0]<a[4]){ if(a[1]<a[4]){ swap(a[1],a[2]); swap(a[1],a[3]); }else{ swap(a[4],a[2]); swap(a[1],a[3]); } }else if(a[0]<a[2]){ swap(a[0],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); }else{ swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[2]<a[3]){ if(a[0]<a[3]){ if(a[2]<a[4]){ if(a[0]<a[4]){ if(!(a[0]<a[2])){ swap(a[0],a[2]); } }else if(a[1]<a[4]){ swap(a[4],a[2]); swap(a[0],a[4]); }else{ swap(a[1],a[2]); swap(a[0],a[4]); } }else if(a[0]<a[2]){ if(a[0]<a[4]){ swap(a[4],a[2]); }else{ swap(a[0],a[2]); swap(a[0],a[4]); } }else if(a[1]<a[2]){ swap(a[0],a[4]); }else{ swap(a[1],a[2]); swap(a[0],a[4]); } }else if(a[1]<a[4]){ if(a[3]<a[4]){ if(a[1]<a[3]){ swap(a[3],a[2]); swap(a[0],a[3]); }else{ swap(a[1],a[2]); swap(a[0],a[3]); } }else if(a[2]<a[4]){ swap(a[4],a[2]); swap(a[0],a[4]); }else{ swap(a[0],a[4]); } }else if(a[1]<a[3]){ if(a[1]<a[2]){ swap(a[0],a[4]); }else{ swap(a[1],a[2]); swap(a[0],a[4]); } }else if(a[3]<a[4]){ swap(a[4],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); }else{swap(a[3],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[0]<a[2]){ if(a[3]<a[4]){ if(a[0]<a[4]){ if(a[0]<a[3]){ swap(a[3],a[2]); }else{ swap(a[0],a[2]); swap(a[0],a[3]); } }else if(a[1]<a[4]){ swap(a[4],a[2]); swap(a[0],a[3]); }else{ swap(a[1],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[0]<a[3]){ if(a[0]<a[4]){ swap(a[4],a[2]); }else{ swap(a[0],a[2]); swap(a[0],a[4]); } }else if(a[1]<a[3]){ swap(a[3],a[2]); swap(a[0],a[4]); }else{ swap(a[1],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[1]<a[4]){ if(a[2]<a[4]){ if(a[1]<a[2]){ swap(a[0],a[3]); }else{ swap(a[1],a[2]); swap(a[0],a[3]); } }else if(a[3]<a[4]){ swap(a[4],a[2]); swap(a[0],a[3]); }else{ swap(a[3],a[2]); swap(a[0],a[4]); } }else if(a[1]<a[2]){ if(a[1]<a[3]){ swap(a[3],a[2]); swap(a[0],a[4]); }else{ swap(a[1],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); } }else if(a[2]<a[4]){ swap(a[4],a[2]); swap(a[0],a[3]); swap(a[1],a[4]); }else{ swap(a[0],a[3]); swap(a[1],a[4]); } }
Feb 04 2016
next sibling parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of 
 swaps):

 void partition5(ref int[5] a){
   if(a[0]<a[1]){
     if(a[2]<a[3]){
       if(a[0]<a[2]){
         if(a[1]<a[4]){
           if(a[1]<a[2]){
             if(!(a[2]<a[4])){
               swap(a[4],a[2]);
             }
           }else if(a[1]<a[3]){
             swap(a[1],a[2]);
           }else{
             swap(a[3],a[2]);
             swap(a[1],a[3]);
 <snip>
That's about what i expected for the actual function (like this) to look like. Course the only way to test this is to brute force the combinations and confirm it's all in the order they should be.
Feb 04 2016
prev sibling next sibling parent Xinok <xinok live.com> writes:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of 
 swaps):

 void partition5(ref int[5] a){
   if(a[0]<a[1]){
 ...
Great, we can all go home! I was curious so I did a crude measurement: when compiled for 64-bit with all optimizations turned on (-release -O -inline -boundscheck=off), the machine code for this function is about 3KB in size.
Feb 04 2016
prev sibling next sibling parent reply tn <no email.com> writes:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of 
 swaps):

 ...
Inspired by this, I made four other versions of the function that are shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively). Each makes 6 comparisons and should be idempotent. http://dpaste.dzfl.pl/1c53d8f432f7 Here are the distributions of the number of swaps when tested with all 120 possible permutations as the input: testing 'partition5a' (by Timon Gehr) 0 swaps: 4 orderings 1 swaps: 32 orderings 2 swaps: 68 orderings 3 swaps: 16 orderings testing 'partition5b' 0 swaps: 4 orderings 1 swaps: 20 orderings 2 swaps: 42 orderings 3 swaps: 40 orderings 4 swaps: 14 orderings testing 'partition5c' 0 swaps: 4 orderings 1 swaps: 14 orderings 2 swaps: 25 orderings 3 swaps: 30 orderings 4 swaps: 26 orderings 5 swaps: 16 orderings 6 swaps: 5 orderings testing 'partition5d' 0 swaps: 4 orderings 1 swaps: 14 orderings 2 swaps: 24 orderings 3 swaps: 29 orderings 4 swaps: 26 orderings 5 swaps: 16 orderings 6 swaps: 6 orderings 7 swaps: 1 orderings testing 'partition5e' 0 swaps: 4 orderings 1 swaps: 12 orderings 2 swaps: 19 orderings 3 swaps: 25 orderings 4 swaps: 25 orderings 5 swaps: 18 orderings 6 swaps: 11 orderings 7 swaps: 5 orderings 8 swaps: 1 orderings
Feb 05 2016
next sibling parent reply Xinok <xinok live.com> writes:
On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
 On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number 
 of swaps):

 ...
Inspired by this, I made four other versions of the function that are shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively). Each makes 6 comparisons and should be idempotent. http://dpaste.dzfl.pl/1c53d8f432f7 ...
Very nice! I'm curious, to both you and Timon, how did you go about writing these and coming up with the solutions? I'm not sure if I could come up with these results myself and so quickly at that.
Feb 05 2016
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/05/2016 10:42 PM, Xinok wrote:
 On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
 On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

 ...
Inspired by this, I made four other versions of the function that are shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively). Each makes 6 comparisons and should be idempotent. http://dpaste.dzfl.pl/1c53d8f432f7 ...
Very nice! I'm curious, to both you and Timon, how did you go about writing these and coming up with the solutions? I'm not sure if I could come up with these results myself and so quickly at that.
I quickly hacked together a brute-force search. Code: http://dpaste.dzfl.pl/43503913ac1d The search only makes sure that it works for distinct input values. Other input orders come for free; the verification code I have included checks this. The code recursively splits the set of all permutations on 5 elements by all possible comparisons between two elements up to a maximal depth of 6 comparisons, using some basic pruning. The recursion terminates if all permutations in the set can be partitioned using the same swaps. (I.e., the index of the median is fixed and there are exactly two possible indices for the first two elements, and exactly two possible indices for the last two elements.) The partition5 function I have posted is the first result that the search found. The above code also supports optimizing for source code size (-version=SHORT). The result is not that much shorter though.
Feb 05 2016
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 04:42 PM, Xinok wrote:
 Very nice! I'm curious, to both you and Timon, how did you go about
 writing these and coming up with the solutions? I'm not sure if I could
 come up with these results myself and so quickly at that.
I was thinking the same exact things. -- Andrei
Feb 05 2016
prev sibling parent tn <no email.com> writes:
On Friday, 5 February 2016 at 21:42:41 UTC, Xinok wrote:
 On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
 On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number 
 of swaps):

 ...
Inspired by this, I made four other versions of the function that are shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively). Each makes 6 comparisons and should be idempotent. http://dpaste.dzfl.pl/1c53d8f432f7 ...
Very nice! I'm curious, to both you and Timon, how did you go about writing these and coming up with the solutions? I'm not sure if I could come up with these results myself and so quickly at that.
I basically just started from Timons solution and made is smaller by replacing branches by swaps. The outermost structure of Timons code looks like this: if (a[0] < a[1]) { do something 1 } else { do something 2 } I replaced the above by if (a[0] < a[1]) { swap(a[0], a[1]); } do something 1 So adding one swap just about halved the size of the code. Now "do something 1" again has two branches: if (a[2] < a[3]) { do something 1.1 } else { do something 1.2 } So we can do the same trick again and replace it by if (a[2] < a[3]) { swap(a[2], a[3]); } do something 1.1 And so on. (When doing subsequent swaps we also need to make sure to not break the conditions achieved by the previous swaps.) However, the swap of a[0] and a[1] in the first replacement would break idempotence, so I also had to change which elements are compared (and swapped). So in my code we first compare a[0] and a[2], then a[1] and a[3], and so on.
Feb 06 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 10:13 AM, tn wrote:
 On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

 ...
Inspired by this, I made four other versions of the function that are shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively). Each makes 6 comparisons and should be idempotent. http://dpaste.dzfl.pl/1c53d8f432f7
Whoa, I missed this last night. Awesome work! Thanks! I'm kinda running out of time budget here for this particular subproblem, but I've got to do these justice and try them out as well.
 Here are the distributions of the number of swaps when tested with all
 120 possible permutations as the input:

 testing 'partition5a' (by Timon Gehr)
    0 swaps: 4 orderings
    1 swaps: 32 orderings
    2 swaps: 68 orderings
    3 swaps: 16 orderings
 testing 'partition5b'
    0 swaps: 4 orderings
    1 swaps: 20 orderings
    2 swaps: 42 orderings
    3 swaps: 40 orderings
    4 swaps: 14 orderings
[...] Could you please add two simple calculations? Assuming uniform random distribution of data, compute the average number of swaps as a weighted average of orderings. Also, show the number of lines (one test or one swap per line) as a proxy for generated code size. That should provide good insights. Thanks! Andrei
Feb 06 2016
parent reply tn <no email.com> writes:
On Saturday, 6 February 2016 at 13:06:37 UTC, Andrei Alexandrescu 
wrote:
 Could you please add two simple calculations? Assuming uniform 
 random distribution of data, compute the average number of 
 swaps as a weighted average of orderings. Also, show the number 
 of lines (one test or one swap per line) as a proxy for 
 generated code size. That should provide good insights.
This now prints the numbers of comparison and swap lines and computes the average numbers of swaps. In addition, it also runs the same tests for permutations of 5 elements some of which might be equal (an example of such permutation would be [1, 2, 0, 2, 3], on the other hand [1, 2, 0, 2, 4] should not be included since it is identical). However, in this case the average number of swaps is perhaps not so meaningful. http://dpaste.dzfl.pl/2012caf872ec Output: testing 'partition5a' Code: comparisons = 63, swaps = 107, total = 170 With all orderings of distinct elements: 0 swaps: 4 instances 1 swaps: 32 instances 2 swaps: 68 instances 3 swaps: 16 instances Average number of swaps: 1.8 With all orderings of potentially nondistinct elements: Error, not idempotent: [0, 0, 0, 0, 0] => [0, 0, 0, 0, 0] testing 'partition5b' Code: comparisons = 17, swaps = 21, total = 38 With all orderings of distinct elements: 0 swaps: 4 instances 1 swaps: 20 instances 2 swaps: 42 instances 3 swaps: 40 instances 4 swaps: 14 instances Average number of swaps: 2.33333 With all orderings of potentially nondistinct elements: 0 swaps: 36 instances 1 swaps: 122 instances 2 swaps: 200 instances 3 swaps: 146 instances 4 swaps: 37 instances Average number of swaps: 2.04806 testing 'partition5c' Code: comparisons = 10, swaps = 12, total = 22 With all orderings of distinct elements: 0 swaps: 4 instances 1 swaps: 14 instances 2 swaps: 25 instances 3 swaps: 30 instances 4 swaps: 26 instances 5 swaps: 16 instances 6 swaps: 5 instances Average number of swaps: 3.06667 With all orderings of potentially nondistinct elements: 0 swaps: 36 instances 1 swaps: 92 instances 2 swaps: 130 instances 3 swaps: 138 instances 4 swaps: 92 instances 5 swaps: 42 instances 6 swaps: 11 instances Average number of swaps: 2.60628 testing 'partition5d' Code: comparisons = 7, swaps = 8, total = 15 With all orderings of distinct elements: 0 swaps: 4 instances 1 swaps: 14 instances 2 swaps: 24 instances 3 swaps: 29 instances 4 swaps: 26 instances 5 swaps: 16 instances 6 swaps: 6 instances 7 swaps: 1 instances Average number of swaps: 3.13333 With all orderings of potentially nondistinct elements: 0 swaps: 36 instances 1 swaps: 100 instances 2 swaps: 128 instances 3 swaps: 133 instances 4 swaps: 94 instances 5 swaps: 39 instances 6 swaps: 10 instances 7 swaps: 1 instances Average number of swaps: 2.57486 testing 'partition5e' Code: comparisons = 6, swaps = 8, total = 14 With all orderings of distinct elements: 0 swaps: 4 instances 1 swaps: 12 instances 2 swaps: 19 instances 3 swaps: 25 instances 4 swaps: 25 instances 5 swaps: 18 instances 6 swaps: 11 instances 7 swaps: 5 instances 8 swaps: 1 instances Average number of swaps: 3.53333 With all orderings of potentially nondistinct elements: 0 swaps: 36 instances 1 swaps: 88 instances 2 swaps: 100 instances 3 swaps: 123 instances 4 swaps: 106 instances 5 swaps: 53 instances 6 swaps: 25 instances 7 swaps: 9 instances 8 swaps: 1 instances Average number of swaps: 2.89649
Feb 06 2016
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/06/2016 01:42 PM, tn wrote:
 On Saturday, 6 February 2016 at 13:06:37 UTC, Andrei Alexandrescu wrote:
 Could you please add two simple calculations? Assuming uniform random
 distribution of data, compute the average number of swaps as a
 weighted average of orderings. Also, show the number of lines (one
 test or one swap per line) as a proxy for generated code size. That
 should provide good insights.
This now prints the numbers of comparison and swap lines and computes the average numbers of swaps. In addition, it also runs the same tests for permutations of 5 elements some of which might be equal (an example of such permutation would be [1, 2, 0, 2, 3], on the other hand [1, 2, 0, 2, 4] should not be included since it is identical). However, in this case the average number of swaps is perhaps not so meaningful. http://dpaste.dzfl.pl/2012caf872ec
Awesome, thanks. Will need to try at least a few of these out. At a quick glance, partition5d seems to be the sweet spot - it's small and does only 3.13/2.57 swaps on average. -- Andrei
Feb 06 2016
prev sibling next sibling parent reply Fool <fool dlang.org> writes:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of 
 swaps)[...]
One swap usually decomposes into three moves. A cycle of length n can be sorted with n + 1 moves. Corresponding to the seven integer partitions of 5 we obtain: 5: 6 4 + 1: 5 3 + 2: 4 + 3 3 + 1 + 1: 4 2 + 2 + 1: 3 + 3 2 + 1 + 1 + 1: 3 1 + 1 + 1 + 1 + 1: 0 So we can do with seven moves instead of three swaps (nine moves). ;-)
Feb 05 2016
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
 So we can do with seven moves instead of three swaps (nine 
 moves). ;-)
Yes, well, this all breaks down when you realize that this is all completely dominated by cache misses and that you can do median more effectively using SIMD instructions. :^P
Feb 05 2016
prev sibling next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
 One swap usually decomposes into three moves.
Recently from reading some interesting hacks and super code, a good swap can also be done via 3 xor operations (and avoiding an intermediate storage unit). http://aggregate.ee.engr.uky.edu/MAGIC/#Swap%20Values%20Without%20a%20Temporary x ^= y; y ^= x; x ^= y; Since these can be used directly as registers it might be faster (assuming all 5 are mapped to registers until the final write out, which might not work).
Feb 05 2016
parent reply Xinok <xinok live.com> writes:
On Friday, 5 February 2016 at 17:33:55 UTC, Era Scarecrow wrote:
 On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
 One swap usually decomposes into three moves.
Recently from reading some interesting hacks and super code, a good swap can also be done via 3 xor operations (and avoiding an intermediate storage unit). http://aggregate.ee.engr.uky.edu/MAGIC/#Swap%20Values%20Without%20a%20Temporary x ^= y; y ^= x; x ^= y; Since these can be used directly as registers it might be faster (assuming all 5 are mapped to registers until the final write out, which might not work).
It's a cool trick but I would be surprised if this were actually faster. Modern x64 processors have 16 "general purpose" registers so saving a single register for a simple set of instructions is likely to not have any benefit. My other thought is that the compiler may not recognize this pattern, making it more difficult to optimize. On the other hand, compilers are very good at rearranging simple reads and writes to avoid stalling the pipeline in the CPU.
Feb 05 2016
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 04:35 PM, Xinok wrote:
 It's a cool trick but I would be surprised if this were actually faster.
That changes every few years. Last time I measured it was slower. -- Andrei
Feb 05 2016
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/05/2016 04:41 PM, Fool wrote:
 On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of
 swaps)[...]
One swap usually decomposes into three moves. A cycle of length n can be sorted with n + 1 moves. Corresponding to the seven integer partitions of 5 we obtain: 5: 6 4 + 1: 5 3 + 2: 4 + 3 3 + 1 + 1: 4 2 + 2 + 1: 3 + 3 2 + 1 + 1 + 1: 3 1 + 1 + 1 + 1 + 1: 0 So we can do with seven moves instead of three swaps (nine moves). ;-)
The goal is to partition though, not to sort. Six moves are sufficient. (And necessary. E.g. for 42310.)
Feb 05 2016
parent Fool <fool dlang.org> writes:
On Friday, 5 February 2016 at 21:48:58 UTC, Timon Gehr wrote:
 The goal is to partition though, not to sort.
 Six moves are sufficient. (And necessary. E.g. for 42310.)
Indeed. I was wondering about that. Great job!
Feb 05 2016
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
Timon, Ivan, this is amazing work. I don't know how your minds work folks - I sat for like two hours on it yesterday and couldn't crack it. Surprisingly in a test on millions of random numbers, Ivan's version wins by a hair. Might be the tighter code which has icache effects. Thanks very much! I should say that at these point no better definitions can be found on the Net. Andrei
Feb 05 2016
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
What is the minimum number of comparisons? Thx! -- Andrei P.S. The sythesized searcher is genius.
Feb 05 2016
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
 On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
What is the minimum number of comparisons? Thx! -- Andrei
There is no partition algorithm that uses <= 5 comparisons in all cases. (If that is what you're asking.)
 P.S. The sythesized searcher is genius.
Thanks!
Feb 05 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 08:58 PM, Timon Gehr wrote:
 On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
 On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
What is the minimum number of comparisons? Thx! -- Andrei
There is no partition algorithm that uses <= 5 comparisons in all cases. (If that is what you're asking.)
Was asking about this particular one. For example, the maximum comparisons is 6, but it would be good to know the average number of comparisons. I know I could read through the code, but it's hairy. Thanks! -- Andrei
Feb 06 2016
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/06/2016 02:00 PM, Andrei Alexandrescu wrote:
 On 02/05/2016 08:58 PM, Timon Gehr wrote:
 On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
 On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
What is the minimum number of comparisons? Thx! -- Andrei
There is no partition algorithm that uses <= 5 comparisons in all cases. (If that is what you're asking.)
Was asking about this particular one. For example, the maximum comparisons is 6, but it would be good to know the average number of comparisons. I know I could read through the code, but it's hairy. Thanks! -- Andrei
All versions posted in this thread do 6 comparisons on all code paths. (It is not possible to do better.) BTW, the code I had posted earlier suffers from the following flaws: - Branches were cut off too early, so -version=SHORT didn't actually find the shortest code. [1] - The code (by necessity) only performs the optimal number of swaps if all input elements are different. While it leaves already partitioned integer arrays in the same state as they were encountered, if multiple input elements are equal, the generated algorithm will sometimes swap them, violating idempotence if input elements can compare equal but be different. However, tn's versions fix this: they never perform any swaps if the input array is already partitioned. [1] This seems to be the shortest code that satisfies the specification I have given (<=6 comparisons, optimal number of swaps) for permutations and that performs all the comparing before all the swapping: void partition5(ref int[5] a){ if(a[0]<a[2]){if(a[1]<a[3]){if(a[0]<a[1]){if(a[2]<a[4]){if(a[1]<a[2]){if(a[3]<a[2]){swap(a[3],a[2]);}}else{if(a[1]<a[4]){swap(a[1],a[2]);}else{swap(a[4],a[2]);swap(a[1],a[4]);}}}else{if(a[1]<a[4]){if(a[3]<a[4]){swap(a[3],a[2]);}else{swap(a[4],a[2]);}}else{if(a[1]<a[2]){swap(a[1],a[2]);swap(a[1],a[4]);}else{swap(a[1],a[4]);}}}}else{if(a[3]<a[4]){if(a[0]<a[3]){if(a[3]<a[2]){swap(a[3],a[2]);}}else{if(a[0]<a[4]){swap(a[0],a[2]);swap(a[0],a[3]);}else{swap(a[4],a[2]);swap(a[0],a[3]);}}}else{if(a[0]<a[4]){if(a[4]<a[2]){swap(a[4],a[2]);}}else{if(a[0]<a[3]){swap(a[0],a[2]);swap(a[0],a[4]);}else{swap(a[3],a[2]);swap(a[0],a[4]);}}}}}else{if(a[0]<a[3]){if(a[2]<a[4]){if(a[2]<a[3]){if(a[3]<a[4]){swap(a[3],a[2]);swap(a[1],a[3]);}else{swap(a[4],a[2]);swap(a[1],a[4]);}}else{if(a[1]<a[2]){swap(a[1],a[2]);swap(a[1],a[3]);}else{swap(a[1],a[3]);}}}else{if(a[3]<a[4]){if(a[1]<a[4]){swap(a[1],a[2]);swap(a[1],a[3]);}else{swap(a[4],a[2]);swap(a[1],a[3]);}}else{if(a[2]<a[3]){swap(a[1],a[4]);}else{swap(a[3],a[2] );swap(a[1],a[4]);}}}}else{if(a[1]<a[4]){if(a[0]<a[1]){if(a[1]<a[2]){swap(a[1],a[2]);swap(a[1],a[3]);}else{swap(a[1],a[3]);}}else{if(a[0]<a[4]){swap(a[0],a[2]);swap(a[0],a[3]);}else{swap(a[4],a[2]);swap(a[0],a[3]);}}}else{if(a[0]<a[4]){if(a[2]<a[4]){swap(a[1],a[3]);}else{swap(a[4],a[2]);swap(a[1],a[3]);}}else{if(a[0]<a[1]){swap(a[0],a[2]);swap(a[0],a[3]);swap(a[1],a[4]);}else{swap(a[1],a[2]);swap(a[0],a[3]);swap(a[1],a[4]);}}}}}}else{if(a[3]<a[4]){if(a[0]<a[4]){if(a[1]<a[3]){if(a[0]<a[3]){if(a[0]<a[1]){swap(a[1],a[2]);}else{swap(a[0],a[2]);}}else{if(a[2]<a[3]){swap(a[3],a[2]);swap(a[0],a[3]);}else{swap(a[0],a[3]);}}}else{if(a[0]<a[1]){if(a[0]<a[3]){swap(a[3],a[2]);swap(a[1],a[3]);}else{swap(a[0],a[2]);swap(a[1],a[3]);}}else{if(a[1]<a[2]){swap(a[0],a[3]);}else{swap(a[1],a[2]);swap(a[0],a[3]);}}}}else{if(a[1]<a[2]){if(a[2]<a[4]){if(a[2]<a[3]){swap(a[3],a[2]);swap(a[0],a[3]);}else{swap(a[0],a[3]);}}else{if(a[1]<a[4]){swap(a[4],a[2]);swap(a[0],a[3]);}else{swap(a[1],a[2]);swap(a[0],a[3]);s wap(a[1],a[4]);}}}else{if(a[1]<a[4]){if(a[1]<a[3]){swap(a[3],a[2]);swap(a[0],a[3]);}else{swap(a[1],a[2]);swap(a[0],a[3]);}}else{if(a[2]<a[4]){swap(a[4],a[2]);swap(a[0],a[3]);swap(a[1],a[4]);}else{swap(a[0],a[3]);swap(a[1],a[4]);}}}}}else{if(a[0]<a[3]){if(a[1]<a[4]){if(a[0]<a[4]){if(a[0]<a[1]){swap(a[1],a[2]);}else{swap(a[0],a[2]);}}else{if(a[2]<a[4]){swap(a[4],a[2]);swap(a[0],a[4]);}else{swap(a[0],a[4]);}}}else{if(a[0]<a[1]){if(a[0]<a[4]){swap(a[4],a[2]);swap(a[1],a[4]);}else{swap(a[0],a[2]);swap(a[1],a[4]);}}else{if(a[1]<a[2]){swap(a[0],a[4]);}else{swap(a[1],a[2]);swap(a[0],a[4]);}}}}else{if(a[1]<a[2]){if(a[2]<a[3]){if(a[2]<a[4]){swap(a[4],a[2]);swap(a[0],a[4]);}else{swap(a[0],a[4]);}}else{if(a[1]<a[3]){swap(a[3],a[2]);swap(a[0],a[4]);}else{swap(a[1],a[2]);swap(a[0],a[3]);swap(a[1],a[4]);}}}else{if(a[1]<a[3]){if(a[1]<a[4]){swap(a[4],a[2]);swap(a[0],a[4]);}else{swap(a[1],a[2]);swap(a[0],a[4]);}}else{if(a[2]<a[3]){swap(a[3],a[2]);swap(a[0],a[3]);swap(a[1],a[4]);}else{swap(a[0],a[3]);sw ap(a[1],a[4]);}}}}}} }
Feb 06 2016
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/6/16 9:26 AM, Timon Gehr wrote:
 [1] This seems to be the shortest code that satisfies the specification
 I have given (<=6 comparisons, optimal number of swaps) for permutations
 and that performs all the comparing before all the swapping:
Thanks. Tried this just now, it's better than the pre-discussion baselines but Ivan's still beats it (by a little). Then I just tried tn's partition5c which beats Ivan's. I should also add that returns are starting to diminish. Idempotence did make a large difference. Then reducing max swaps made a smaller difference (at least for the uints I'm working with). BTW my testbed is a careful implementation of BFPRT on random arrays of uint of various sizes. Andrei
Feb 06 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
Oh, also: could you let that bad boy run and let it find anything that does idempotent partition in 6 compares and 0-7 swaps? Then slowly tighten the number of swaps until you find the best balance between number of swaps and code size. -- Andrei
Feb 05 2016
parent tn <no email.com> writes:
On Saturday, 6 February 2016 at 01:46:58 UTC, Andrei Alexandrescu 
wrote:
 On 02/04/2016 03:30 PM, Timon Gehr wrote:
 At most 6 comparisons, <=3 swaps, idempotent (optimal number 
 of swaps):
Oh, also: could you let that bad boy run and let it find anything that does idempotent partition in 6 compares and 0-7 swaps? Then slowly tighten the number of swaps until you find the best balance between number of swaps and code size. -- Andrei
That is kind of what I tried to do by hand. My function partition5e seems to be practically identical to Ivans solution and partition5a is just a copy of Timons solution. Function partition5b, partition5c and partition5d are in between of those with various tradeoffs between the number of swaps and code size.
Feb 06 2016
prev sibling parent reply Ivan Kazmenko <gassa mail.ru> writes:
On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei Alexandrescu 
wrote:
 So there's got to be a better solution. Your challenge - should 
 you choose to accept it :o) - is an algorithm that does the 
 partitioning in 6 comparisons and <= 9 swaps, which is 
 idempotent: when applied twice, it always does 0 swaps.
Here's my take at it. It's idempotent and does exactly 6 comparisons and 0-8 swaps. The advantages to counter the not-so-good stats are that it has a flat structure, and is not hard to reason about. The idea is to process the following steps: 1. Find the minimum of {b, c, d, e} in three comparisons, and put it into b. The structure is: b < d, c < e, b < c. Note that d and e were not compared if no swaps were made. 2. Find the minimum of {a, c, d, e} in two more comparisons, and put it into a. The structure is: a < d, c < e, a < c. Note that a and b were not compared if no swaps were made. 3. Find the minimum of {c, d, e} in one more comparison, and put it into c. The structure is: c < d, c < e. Note that d and e were not compared if no swaps were made. In the end, we have a < c, b < c, c < d, c < e. Additionally, if these inequalities held initially, everything is left in place regardless of the results of comparisons of (a, b) and (d, e). In code: ----- import std.algorithm; import std.exception; import std.range; import std.stdio; void p5 (ref int a, ref int b, ref int c, ref int d, ref int e) { if (d < b) {swap (b, d);} if (e < c) {swap (c, e);} if (c < b) {swap (b, c); swap (d, e);} if (d < a) {swap (a, d);} if (c < a) {swap (a, c); swap (d, e);} if (d < c) {swap (c, d);} } void main () { auto a = 5.iota.array; do { auto b = a.dup; p5 (b[0], b[1], b[2], b[3], b[4]); auto c = b.dup; p5 (c[0], c[1], c[2], c[3], c[4]); writeln (a, " -> ", b, " -> ", c); enforce (b[0] < b[2] && b[1] < b[2] && b[2] < b[3] && b[2] < b[4]); enforce (equal (b, c)); } while (nextPermutation (a)); } ----- Another interesting task would be to make the function stable, but I don't see how it is possible with such flat structure. Ivan Kazmenko.
Feb 05 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
 Another interesting task would be to make the function stable, but I
 don't see how it is possible with such flat structure.
Under what circumstances isn't your function unstable? -- Andrei
Feb 05 2016
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/05/2016 07:59 PM, Andrei Alexandrescu wrote:
 On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
 Another interesting task would be to make the function stable, but I
 don't see how it is possible with such flat structure.
Under what circumstances isn't your function unstable? -- Andrei
I meant "is your function unstable". Double negation, but if you speak Russian those count as one :o). -- Andrei
Feb 05 2016
prev sibling parent reply Ivan Kazmenko <gassa mail.ru> writes:
On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei Alexandrescu 
wrote:
 On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
 Another interesting task would be to make the function stable, 
 but I don't see how it is possible with such flat structure.
Under what circumstances isn't your function unstable? -- Andrei
For example, if elements are "value | id", compared by value, then: 0|0, 0|1, 1|2, 0|3, 0|4 is transformed into 0|0, 0|1, 0|4, 0|3, 1|2 and 0|4 is now placed earlier than 0|3, which violates the stability requirement. Things can be reordered a bit, but it seems that no possible order eliminates the need to remember a part of the history. On the other hand, when we track our whole path in the decision tree (e.g. at the leaves of Timon Gehr's tree of ifs), we have all information to make the partition stable. In any case, finding the shortest-code stable partition-of-5 algorithm looks like a problem solvable by an automated searcher. Regarding the speed, there are different use cases with different requirements, for example: 1. Primitive types (cheap swap, cheap comparison). 2. Heavy structs A (expensive swap, cheap comparison - e.g., compare one field of primitive type). 3. Heavy structs B (expensive swap, expensive comparison - e.g., call a heavy external function). 4. Heavy classes (cheap swap - pointers only, expensive comparison). So there's perhaps no single best solution.
Feb 05 2016
next sibling parent Ivan Kazmenko <gassa mail.ru> writes:
On Saturday, 6 February 2016 at 07:06:27 UTC, Ivan Kazmenko wrote:
 On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei 
 Alexandrescu wrote:
 On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
 Another interesting task would be to make the function 
 stable, but I don't see how it is possible with such flat 
 structure.
Under what circumstances isn't your function unstable? -- Andrei
The code I used to check stability: http://dpaste.dzfl.pl/2b950cb3e5d8 The "y" and "n" at end of lines are "yes"/"no", as in "stable"/"unstable".
Feb 05 2016
prev sibling next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Saturday, 6 February 2016 at 07:06:27 UTC, Ivan Kazmenko wrote:
 1. Primitive types (cheap swap, cheap comparison).

 2. Heavy structs A (expensive swap, cheap comparison - e.g., 
 compare one field of primitive type).

 3. Heavy structs B (expensive swap, expensive comparison - 
 e.g., call a heavy external function).

 4. Heavy classes (cheap swap - pointers only, expensive 
 comparison).

 So there's perhaps no single best solution.
That's right, but other factors are more important: preventing pipeline stalls. If you are collecting from 5 different cachelines in an array you are likely to get several 40-120 cycles delays unless you do prefetching, and if you do, you need to have other instructions to fill in the latency gaps. But also instructions have latency and concurrency issues. Which is why your version did reasonably well as it made the compares/swaps independent so that they could be concurrently scheduled. Yet, Haswell has SIMD instructions that can do 8-16x 32-bit max/min operations per cycle, with a latency of only 1 cycle, and 4-8x 64bit compares with a latency of 1 cycle. So if you use as small fixed N, like 5, it makes very little sense to count compares/swaps. What makes sense is to focus on how you can avoid branching and build an algorithm with no pipeline stalls. If sorting large arrays you also might want to look at multi-threaded parallel sort functions.
Feb 06 2016
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/06/2016 02:06 AM, Ivan Kazmenko wrote:
 On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei Alexandrescu wrote:
 On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
 Another interesting task would be to make the function stable, but I
 don't see how it is possible with such flat structure.
Under what circumstances isn't your function unstable? -- Andrei
For example, if elements are "value | id", compared by value, then: 0|0, 0|1, 1|2, 0|3, 0|4 is transformed into 0|0, 0|1, 0|4, 0|3, 1|2 and 0|4 is now placed earlier than 0|3, which violates the stability requirement. Things can be reordered a bit, but it seems that no possible order eliminates the need to remember a part of the history. On the other hand, when we track our whole path in the decision tree (e.g. at the leaves of Timon Gehr's tree of ifs), we have all information to make the partition stable. In any case, finding the shortest-code stable partition-of-5 algorithm looks like a problem solvable by an automated searcher.
Yah. I think stability should be solved if there's a need/application for it, e.g. is there a larger algorithm that would be stable if partition5 were stable?
 Regarding the speed, there are different use cases with different
 requirements, for example:

 1. Primitive types (cheap swap, cheap comparison).

 2. Heavy structs A (expensive swap, cheap comparison - e.g., compare one
 field of primitive type).

 3. Heavy structs B (expensive swap, expensive comparison - e.g., call a
 heavy external function).

 4. Heavy classes (cheap swap - pointers only, expensive comparison).

 So there's perhaps no single best solution.
Yah, good point. I should also add that more comparisons generally mean more branching and more code. -- Andrei
Feb 06 2016