www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - quickSort

reply hdsh <asmasm hotmail.com> writes:
this my try

int[] quickSort(int[] arr) {
    int[] result = quickSort(filter!(arr < arr[0])(arr)) ~ arr[0] ~
quickSort(filter!(arr > arr[0])(arr));
}

but it fail to compile
Sep 13 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 14, 2011 01:34:34 hdsh wrote:
 this my try
 
 int[] quickSort(int[] arr) {
     int[] result = quickSort(filter!(arr < arr[0])(arr)) ~ arr[0] ~
 quickSort(filter!(arr > arr[0])(arr));
 }
 
 but it fail to compile
filter does not return an array. It returns a new range which wraps the original array. As you iterate over the range, it skips elements which don't match the predicate. If you want to make an array out of it, you use std.array.array, but that allocates a new array. But as filter doesn't return int[], you can't pass its result to your quickSort function, nor can you concatenate it (if you want to concatenate ranges, you use std.range.chain, which chains them together without allocating anything - it just goes to each successive range when iterating over it). Normally, range-based functions are templated so that they can deal with a variety of range types. If all you're looking for is a sort function, then you can use std.algorithm.sort, but if you're just trying to implement quick sort in D, because you want to implement it in D, then you're going to have to rework how your algorithm works. Either, you're going to have to templatize it (and use chain), use array (which would be less than efficient), or you're going to have to use something other than filter. Personally, I'd advise reworking it so that it doesn't use filter if you want it to be efficient. Using std.array.array would work but would be inefficient, whereas templatizing quickSort might run into trouble. I'm not sure whether such a recursive templated function would work or not. It would depend on whether each call to filter required a new template instantiation or not. In any case, your core problem here is the fact that filter does not return int[]. It returns a new range type. - Jonathan M Davis
Sep 13 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/14/2011 03:34 AM, hdsh wrote:
 this my try

 int[] quickSort(int[] arr) {
      int[] result = quickSort(filter!(arr<  arr[0])(arr)) ~ arr[0] ~
 quickSort(filter!(arr>  arr[0])(arr));
 }

 but it fail to compile
Note that this approach is an inefficient way of implementing a sorting routine. This works: int[] quickSort(int[] arr) { if(!arr.length) return []; return quickSort(array(filter!((a){return a < arr[0];})(arr))) ~ arr[0] ~ quickSort(array(filter!((a){return a > arr[0];})(arr))); } I'll go quickly through how I fixed it: int[] quickSort(int[] arr) { if(!arr.length) return []; // base case for recursion return // return statement for giving back a result quickSort(array( // turn the lazy filter range into an array filter!( (a){return a < arr[0];} // delegate literal passed to filter )(arr))) ~ arr[0] ~ quickSort(array(filter!( (a){return a >= arr[0];} //allow multiple elements that are equal to the pivot element )(arr[1..$]))); // only include the pivot once } If you have particular questions about this solution, feel free to ask.
Sep 13 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/14/2011 04:12 AM, Timon Gehr wrote:
 On 09/14/2011 03:34 AM, hdsh wrote:
 this my try

 int[] quickSort(int[] arr) {
 int[] result = quickSort(filter!(arr< arr[0])(arr)) ~ arr[0] ~
 quickSort(filter!(arr> arr[0])(arr));
 }

 but it fail to compile
Note that this approach is an inefficient way of implementing a sorting routine. This works: int[] quickSort(int[] arr) { if(!arr.length) return []; return quickSort(array(filter!((a){return a < arr[0];})(arr))) ~ arr[0] ~ quickSort(array(filter!((a){return a > arr[0];})(arr))); }
Sry, accidentally sent a first draft. This is how it should have looked. int[] quickSort(int[] arr) { if(!arr.length) return []; return quickSort(array(filter!((a){return a < arr[0];})(arr))) ~ arr[0] ~ quickSort(array(filter!((a){return a >= arr[0];})(arr[1..$]))); }
 I'll go quickly through how I fixed it:


 int[] quickSort(int[] arr) {
 if(!arr.length) return []; // base case for recursion
 return // return statement for giving back a result
 quickSort(array( // turn the lazy filter range into an array
 filter!(
 (a){return a < arr[0];} // delegate literal passed to filter
 )(arr))) ~ arr[0] ~
 quickSort(array(filter!(
 (a){return a >= arr[0];} //allow multiple elements that are equal to the
 pivot element
 )(arr[1..$]))); // only include the pivot once
 }


 If you have particular questions about this solution, feel free to ask.
Sep 13 2011
parent reply %u <asmasm hotmail.com> writes:
i have qustion why filter can't return int[]
and if lambda return  the last Expression without return keyword it would much
cleaner
Sep 13 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 14, 2011 05:43:37 %u wrote:
 i have qustion why filter can't return int[]
 and if lambda return  the last Expression without return keyword it would
 much cleaner
filter can't return int[]. filter does not alter the original array. It returns a new range with only the elements which match the predicate. For it to return int[], it would have to allocate a new array, and that would be inefficient. Instead, it returns a new range which wraps the original array. If you want to actually allocate a new array, then just pass the result of filter to std.array.array. As for the lambda syntax, there has been some discussion of simplifying it, but there are potential issues with it, and so for the moment, it's staying as-is. - Jonathan M Davis
Sep 13 2011
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
%u:

 i have qustion why filter can't return int[]
Because a lazy filter is handy, you often don't need a real array result, a lazy sequence is enough. A lazy sequence avoids the memory allocation of the output array. In D programs often the slowest parts are the memory allocations. On the other hand I have asked to add amap/afilter functions that return arrays. Andrei has not answered about this yet. Bye, bearophile
Sep 14 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 14, 2011 06:09:52 bearophile wrote:
 %u:
 i have qustion why filter can't return int[]
Because a lazy filter is handy, you often don't need a real array result, a lazy sequence is enough. A lazy sequence avoids the memory allocation of the output array. In D programs often the slowest parts are the memory allocations. On the other hand I have asked to add amap/afilter functions that return arrays. Andrei has not answered about this yet.
What would that gain you over passing the result of map or filter to std.array.array? - Jonathan M Davis
Sep 14 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 What would that gain you over passing the result of map or filter to 
 std.array.array?
1) The code gets shorter 2) The code gets a bit less noisy, because () add noise. 3) You use a single function instead of two, so you reduce the number of chunks your brain has to manage. A human brain is able to manage a very limited number of chunks at the same time (about 7). An expert programmer is able to merge array(map()) into a single chunk, but this requires a bit of training and time. 4) Maybe you are able to reduce the abstraction penalty for the compiler, reducing the amount of code compiled. (But this probably will not happen because amap will probably be just implemented with an array(map()) to reduce library code and time to write to write it). 5) The lazy result of map is quite useful. But in a large amount of situations in D you can't use a lazy result, you need an array (this is not true in Haskell). So in practice about half the time I need a map, I have to apply array() on it. So amap is a very common pattern in D. 6) In std.parallelism there is already an map/amap pair: http://www.d-programming-language.org/phobos/std_parallelism.html#amap So for the D programmer it's not a significant burden to have the same functions in std.algorithm, with the same names and same purposes. Orthogonality is overrated in Phobos. If you take a look at functional languages where the use of higher order functions is common, like Haskell, you see standard functions that are the composition of common functions. Haskell designers have a large experience on the usage of higher order functions. This is the Haskell Prelude (similar to the D object module): http://www.haskell.org/onlinereport/standard-prelude.html As example it contains both map and concatMap (concatMap is absent in Phobos still, but it will be useful to have). The definition of concatMap is just: concatMap :: (a -> [b]) -> [a] -> [b] concatMap f = concat . map f In practice in Haskell you are allowed to write concatMap just like this: concatMap = concat.map This means concatMap is just using three functions, concat, map and dot (that is the composition function). Do you know why such simple function is included in the Prelude, despite it essentially saves just one character (the dot)? Because using a concat on the result of map is a very common pattern in Haskell code. So packing them in a single name allows the mind of the programmer to think of it as a single entity, and allows a bit higher thinking while you program. This is why I think amap/afilter are worth adding to Phobos. I did have them in my dlibs1 and I have used them many times. Bye, bearophile
Sep 14 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 14, 2011 07:46:37 bearophile wrote:
 Jonathan M Davis:
 What would that gain you over passing the result of map or filter to
 std.array.array?
1) The code gets shorter 2) The code gets a bit less noisy, because () add noise. 3) You use a single function instead of two, so you reduce the number of chunks your brain has to manage. A human brain is able to manage a very limited number of chunks at the same time (about 7). An expert programmer is able to merge array(map()) into a single chunk, but this requires a bit of training and time. 4) Maybe you are able to reduce the abstraction penalty for the compiler, reducing the amount of code compiled. (But this probably will not happen because amap will probably be just implemented with an array(map()) to reduce library code and time to write to write it). 5) The lazy result of map is quite useful. But in a large amount of situations in D you can't use a lazy result, you need an array (this is not true in Haskell). So in practice about half the time I need a map, I have to apply array() on it. So amap is a very common pattern in D. 6) In std.parallelism there is already an map/amap pair: http://www.d-programming-language.org/phobos/std_parallelism.html#amap So for the D programmer it's not a significant burden to have the same functions in std.algorithm, with the same names and same purposes. Orthogonality is overrated in Phobos. If you take a look at functional languages where the use of higher order functions is common, like Haskell, you see standard functions that are the composition of common functions. Haskell designers have a large experience on the usage of higher order functions. This is the Haskell Prelude (similar to the D object module): http://www.haskell.org/onlinereport/standard-prelude.html As example it contains both map and concatMap (concatMap is absent in Phobos still, but it will be useful to have). The definition of concatMap is just: concatMap :: (a -> [b]) -> [a] -> [b] concatMap f = concat . map f In practice in Haskell you are allowed to write concatMap just like this: concatMap = concat.map This means concatMap is just using three functions, concat, map and dot (that is the composition function). Do you know why such simple function is included in the Prelude, despite it essentially saves just one character (the dot)? Because using a concat on the result of map is a very common pattern in Haskell code. So packing them in a single name allows the mind of the programmer to think of it as a single entity, and allows a bit higher thinking while you program. This is why I think amap/afilter are worth adding to Phobos. I did have them in my dlibs1 and I have used them many times.
So, basically, you just want to shorten your code by wrapping array(func) in a afunc function, and you think that this happens enough with map and filter enough to merit putting these functions into Phobos. Yeah, I don't see that happening. It's not impossible, but there are a number of functions in Phobos which return lazy ranges. It's not at all reasonable to have duplicate versions of them all just to shorten a bit of code. And if anything, Andrei has been looking at heightening the bar for how much a function has to add to make it into std.algorithm or std.range. He feels that there's arguably too much duplication already. And when all a function does is making so that you're doing afunc(r) instead of array(func(r)), there's no way that that's getting in. - Jonathan M Davis
Sep 14 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 So, basically, you just want to shorten your code by wrapping array(func) in a 
 afunc function, and you think that this happens enough with map and filter 
 enough to merit putting these functions into Phobos.
There is also the point 3) that you have not seen, plus the notes about Haskell Prelude design. Bye, bearophile
Sep 14 2011
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, September 14, 2011 14:36 bearophile wrote:
 Jonathan M Davis:
 So, basically, you just want to shorten your code by wrapping array(func)
 in a afunc function, and you think that this happens enough with map and
 filter enough to merit putting these functions into Phobos.
There is also the point 3) that you have not seen, plus the notes about Haskell Prelude design.
I did see it. I read your entire post, and I don't see anything that IMHO justifies adding amap or afilter. Yes, I can see why you would want to have and use amap and afilter. They aren't worthless. But they just don't add enough value to be worth adding to Phobos. And I'd be very surprised if Andrei disagreed with me based on what he's said about similar stuff. It was an uphill battle just to get him to allow drop to be added to std.range, and that actually allows code to be written in a different style, whereas amap and afilter pretty much just reduce the number of characters that a particular statement requires. Yes, the resulting code may look cleaner, but that's it. There's no real functional difference in the resulting code. It just doesn't add enough value to make it into Phobos. - Jonathan M Davis
Sep 14 2011