digitalmars.D.learn - How to sort a range
- rcorre (15/15) Mar 08 2016 I was in a situation where I wanted to remove duplicates from an
- Edwin van Leeuwen (8/23) Mar 09 2016 I'm not sure why your fix didn't work, but generally I work
- rcorre (5/12) Mar 09 2016 I'd like to avoid allocating here.
- Edwin van Leeuwen (3/8) Mar 09 2016 Well that one does allocate, because it keeps track of which
- rcorre (3/12) Mar 09 2016 Yup, just noticed that >.<
- Edwin van Leeuwen (7/20) Mar 09 2016 Of course it only allocates when the actual result is used, so
- cym13 (7/16) Mar 09 2016 Note that an input range isn't even remotely a container, it's a
- rcorre (7/22) Mar 09 2016 In the general case, yes. However only is a range wrapper around
- Edwin van Leeuwen (8/12) Mar 09 2016 Since you are adapting phobos anyway you could try commenting out
- Xinok (13/15) Mar 09 2016 That's my suspicion as well. It seems that OnlyResult is
- rcorre (10/25) Mar 09 2016 Got it. sort calls to quicksort (for unstable, at least) which
- =?UTF-8?Q?Ali_=c3=87ehreli?= (14/21) Mar 09 2016 Remembering that a range is not the collection, swapAt takes the range
- rcorre (5/9) Mar 09 2016 Oh, I think it just clicked. I was thinking 'sort takes a range,
- Chris Wright (3/4) Mar 09 2016 Which is why sort() has template constraints beyond isInputRange. The
I was in a situation where I wanted to remove duplicates from an OnlyResult. To do this with uniq, I needed to sort it. OnlyResult doesn't satisfy the template constraints of sort, but this seems easy enough to fix. I made front, back, and opIndex return by ref. With this, the following compiles: assert(only(3,1,2).sort.equal(only(1,2,3))); However, it fails with: core.exception.AssertError std/algorithm/sorting.d(1052): Failed to sort range of type OnlyResult!(int, 3LU) So, if you have a range for which sort compiles, what does it take to make sorting actually work? For reference, my two attempts were: https://github.com/rcorre/phobos/commit/d89b3cfab7a0938e178a506b4ceb8faae6ecbfe2 https://github.com/rcorre/phobos/commit/512d9b8db6f311db6a9b6ccb077a691cec66ce70
Mar 08 2016
On Wednesday, 9 March 2016 at 03:05:52 UTC, rcorre wrote:I was in a situation where I wanted to remove duplicates from an OnlyResult. To do this with uniq, I needed to sort it. OnlyResult doesn't satisfy the template constraints of sort, but this seems easy enough to fix. I made front, back, and opIndex return by ref. With this, the following compiles: assert(only(3,1,2).sort.equal(only(1,2,3))); However, it fails with: core.exception.AssertError std/algorithm/sorting.d(1052): Failed to sort range of type OnlyResult!(int, 3LU) So, if you have a range for which sort compiles, what does it take to make sorting actually work? For reference, my two attempts were: https://github.com/rcorre/phobos/commit/d89b3cfab7a0938e178a506b4ceb8faae6ecbfe2 https://github.com/rcorre/phobos/commit/512d9b8db6f311db6a9b6ccb077a691cec66ce70I'm not sure why your fix didn't work, but generally I work around this by converting the OnlyResult into an array: import std.array : array; assert(only(3,1,2).array.sort.equal(only(1,2,3))); If you are looking for a lazy uniq that works on non sorted ranges, I implemented one not to long ago: http://github.com/BlackEdder/ggplotd/blob/master/source/ggplotd/range.d
Mar 09 2016
On Wednesday, 9 March 2016 at 09:15:01 UTC, Edwin van Leeuwen wrote:I'm not sure why your fix didn't work, but generally I work around this by converting the OnlyResult into an array: import std.array : array; assert(only(3,1,2).array.sort.equal(only(1,2,3)));I'd like to avoid allocating here.If you are looking for a lazy uniq that works on non sorted ranges, I implemented one not to long ago: http://github.com/BlackEdder/ggplotd/blob/master/source/ggplotd/range.dThat sounds like the kind of thing I was looking for. I'll take a look, thanks!
Mar 09 2016
On Wednesday, 9 March 2016 at 12:21:55 UTC, rcorre wrote:Well that one does allocate, because it keeps track of which values have already been seen.If you are looking for a lazy uniq that works on non sorted ranges, I implemented one not to long ago: http://github.com/BlackEdder/ggplotd/blob/master/source/ggplotd/range.dThat sounds like the kind of thing I was looking for. I'll take a look, thanks!
Mar 09 2016
On Wednesday, 9 March 2016 at 12:31:18 UTC, Edwin van Leeuwen wrote:On Wednesday, 9 March 2016 at 12:21:55 UTC, rcorre wrote:Yup, just noticed that >.<Well that one does allocate, because it keeps track of which values have already been seen.If you are looking for a lazy uniq that works on non sorted ranges, I implemented one not to long ago: http://github.com/BlackEdder/ggplotd/blob/master/source/ggplotd/range.dThat sounds like the kind of thing I was looking for. I'll take a look, thanks!
Mar 09 2016
On Wednesday, 9 March 2016 at 13:04:31 UTC, rcorre wrote:On Wednesday, 9 March 2016 at 12:31:18 UTC, Edwin van Leeuwen wrote:Of course it only allocates when the actual result is used, so this will probably be more efficient if you only need a small number of unique results or need to keep the unsorted range around/intact. Sorting without allocating and then using uniq should indeed be more efficient in other cases. Did you try different SwapStrategy values in your original?On Wednesday, 9 March 2016 at 12:21:55 UTC, rcorre wrote:Yup, just noticed that >.<Well that one does allocate, because it keeps track of which values have already been seen.If you are looking for a lazy uniq that works on non sorted ranges, I implemented one not to long ago: http://github.com/BlackEdder/ggplotd/blob/master/source/ggplotd/range.dThat sounds like the kind of thing I was looking for. I'll take a look, thanks!
Mar 09 2016
On Wednesday, 9 March 2016 at 12:21:55 UTC, rcorre wrote:On Wednesday, 9 March 2016 at 09:15:01 UTC, Edwin van Leeuwen wrote:Note that an input range isn't even remotely a container, it's a way to iterate on a container. As you don't have all elements at hand you can't sort them, that's why you have to use array here. If you don't want to allocate using the GC just allocate your own memory and store your range elements in it before sorting but sort has to have access to all elements one way or another.I'm not sure why your fix didn't work, but generally I work around this by converting the OnlyResult into an array: import std.array : array; assert(only(3,1,2).array.sort.equal(only(1,2,3)));I'd like to avoid allocating here.
Mar 09 2016
On Wednesday, 9 March 2016 at 14:28:11 UTC, cym13 wrote:On Wednesday, 9 March 2016 at 12:21:55 UTC, rcorre wrote:In the general case, yes. However only is a range wrapper around a static array, and does have all elements at hand. Maybe I should just be using a static array... Still curious as to why it fails; maybe the range is getting copied at some point? I guess I need to step through it. I did try different SwapStrategies with no luck.On Wednesday, 9 March 2016 at 09:15:01 UTC, Edwin van Leeuwen wrote:Note that an input range isn't even remotely a container, it's a way to iterate on a container. As you don't have all elements at hand you can't sort them, that's why you have to use array here.I'm not sure why your fix didn't work, but generally I work around this by converting the OnlyResult into an array: import std.array : array; assert(only(3,1,2).array.sort.equal(only(1,2,3)));I'd like to avoid allocating here.
Mar 09 2016
On Wednesday, 9 March 2016 at 15:39:55 UTC, rcorre wrote:On Wednesday, 9 March 2016 at 14:28:11 UTC, cym13 wrote: Still curious as to why it fails; maybe the range is getting copied at some point? I guess I need to step through it. I did try different SwapStrategies with no luck.Since you are adapting phobos anyway you could try commenting out the assert and see what the result of the sort is. That might give you some clue: //assert(isSorted!lessFun(r), "Failed to sort range of type " ~ Range.stringof); Also I notice the line numbering is different in my sorted.d file. Did you test the latest version of dmd/phobos?
Mar 09 2016
On Wednesday, 9 March 2016 at 15:39:55 UTC, rcorre wrote:Still curious as to why it fails; maybe the range is getting copied at some point? I guess I need to step through it.That's my suspicion as well. It seems that OnlyResult is pass-by-value so every time it gets passed to another function, it creates a copy of all the elements. A simple solution is to provide a wrapper type which refers to the elements in the original container. However, I'm going to argue that the sort function is fine but the modifications you made to OnlyResult are incorrect. I tried running your example of only(...).sort but got a compilation error. Similarly, trying to sort a static array also gives a compilation error. However, if I slice the static array before passing it to sort (thus passing by reference), then it works just fine.
Mar 09 2016
On Wednesday, 9 March 2016 at 16:53:08 UTC, Xinok wrote:On Wednesday, 9 March 2016 at 15:39:55 UTC, rcorre wrote:Got it. sort calls to quicksort (for unstable, at least) which uses swapAt. swapAt takes the range by value, so it just swaps the values in its local copy. The original OnlyResult is untouched. I guess a static array slice maintains a pointer to the underlying array (which is why returning one from a function errors with 'escaping reference to local variable'). Meanwhile, I've realized my code probably doensn't need to remove duplicates anyways, so its a moot point, but still an interesting discovery :)Still curious as to why it fails; maybe the range is getting copied at some point? I guess I need to step through it.That's my suspicion as well. It seems that OnlyResult is pass-by-value so every time it gets passed to another function, it creates a copy of all the elements. A simple solution is to provide a wrapper type which refers to the elements in the original container. However, I'm going to argue that the sort function is fine but the modifications you made to OnlyResult are incorrect. I tried running your example of only(...).sort but got a compilation error. Similarly, trying to sort a static array also gives a compilation error. However, if I slice the static array before passing it to sort (thus passing by reference), then it works just fine.
Mar 09 2016
On 03/09/2016 06:50 PM, rcorre wrote:sort calls to quicksort (for unstable, at least) which uses swapAt. swapAt takes the range by value, so it just swaps the values in its local copy.Remembering that a range is not the collection, swapAt takes the range by value but it does not copy the elements. So, sort() does sort the original array: import std.algorithm; void main() { auto a = [ 9, -1, 2, 0 ]; a.sort(); assert(a == [ -1, 0, 2, 9 ]); }The original OnlyResult is untouched. I guess a static array slice maintains a pointer to the underlying array (which is why returning one from a function errors with 'escaping reference to local variable').Yes: A static array is just a collection of elements, which implicitly converts to a slice and a slice is nothing but a pair of "pointer to the first element" and "number of elements". Ali
Mar 09 2016
On Wednesday, 9 March 2016 at 14:28:11 UTC, cym13 wrote:Note that an input range isn't even remotely a container, it's a way to iterate on a container. As you don't have all elements at hand you can't sort them, that's why you have to use array here.Oh, I think it just clicked. I was thinking 'sort takes a range, so it must be used for sorting ranges', but I should have thought 'sort takes a range so it can sort a container via a range over that container'.
Mar 09 2016
On Wed, 09 Mar 2016 14:28:11 +0000, cym13 wrote:Note that an input range isn't even remotely a containerWhich is why sort() has template constraints beyond isInputRange. The constraints ensure that it is possible to swap values in the range.
Mar 09 2016