www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Is there a function for this?

reply bauss <jj_1337 live.dk> writes:
Let's say you have a range with struct, but some of the struct 
are duplicates of each other.

Is there a standard function in Phobos to remove duplicates?

My first thought was "uniq", but it can't really do it like that, 
but it doesn't work.

See: https://run.dlang.io/is/IcFEtw

Is there another function in Phobos that perhaps would work?

I can of course write my own function, but if there is a standard 
function that can do it, then I'd rather use that.
Oct 06 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.
uniq needs it to be sorted first, it only compares side-by-side (to avoid allocating space to remember what it has already seen)
 Is there another function in Phobos that perhaps would work?
If the set is small enough and it is at least a forward range, you could do a O(n^2) search with like filter with !canFind (making sure the one it finds is not the one you are currently looking at). But I think you are better off doing your own thing. I like to just stick them all in an associative array, then loop over that. The duplicate keys will naturally be filtered that way. Of course, it allocates memory for that but that's often preferable to running the expensive search anyway.
Oct 06 2018
parent bauss <jj_1337 live.dk> writes:
On Saturday, 6 October 2018 at 13:34:44 UTC, Adam D. Ruppe wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.
uniq needs it to be sorted first, it only compares side-by-side (to avoid allocating space to remember what it has already seen)
 Is there another function in Phobos that perhaps would work?
If the set is small enough and it is at least a forward range, you could do a O(n^2) search with like filter with !canFind (making sure the one it finds is not the one you are currently looking at). But I think you are better off doing your own thing. I like to just stick them all in an associative array, then loop over that. The duplicate keys will naturally be filtered that way. Of course, it allocates memory for that but that's often preferable to running the expensive search anyway.
It "can" be a big set, but it's probably the most straightforward way for now thought. On Saturday, 6 October 2018 at 13:38:34 UTC, Nicholas Wilson wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the struct 
 are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
range .array.sort // make equal elements be adjacent .chunks!"a==b" // get a range over those equal elements .map(e => e.front); // get the first one should work, possibly not the most efficient way to do it though. I don't know off the top of mu head if there are any to preserve the order or the original range.
That doesn't work tbh. won't even compile.
Oct 06 2018
prev sibling next sibling parent reply Basile B <b2.temp gmx.com> writes:
On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the struct 
 are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
see https://www.programming-idioms.org/idiom/119/deduplicate-list.
Oct 06 2018
parent reply bauss <jj_1337 live.dk> writes:
On Saturday, 6 October 2018 at 13:35:38 UTC, Basile B wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the struct 
 are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
see https://www.programming-idioms.org/idiom/119/deduplicate-list.
Did you even read my post? I stated I could already write the function myself, but if a standard function existed I'd rather use that.
Oct 06 2018
next sibling parent Seb <seb wilzba.ch> writes:
On Saturday, 6 October 2018 at 13:56:32 UTC, bauss wrote:
 On Saturday, 6 October 2018 at 13:35:38 UTC, Basile B wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 [...]
see https://www.programming-idioms.org/idiom/119/deduplicate-list.
Did you even read my post? I stated I could already write the function myself, but if a standard function existed I'd rather use that.
He did and x.redBlackTree[] isn't too bad!
Oct 06 2018
prev sibling parent reply Basile B <b2.temp gmx.com> writes:
On Saturday, 6 October 2018 at 13:56:32 UTC, bauss wrote:
 On Saturday, 6 October 2018 at 13:35:38 UTC, Basile B wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the 
 struct are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
see https://www.programming-idioms.org/idiom/119/deduplicate-list.
Did you even read my post? I stated I could already write the function myself, but if a standard function existed I'd rather use that.
There are two solutions there. One shows how to use uniq, which is why i posted this since you said that it doesn't work as you expected. I didn't suggest you to write a full double loop dedup. routine, i.e àla Golang. Unfortunately the site doesn't allow to link the solutions of a particular language among those who implement the idiom.
Oct 06 2018
parent reply bauss <jj_1337 live.dk> writes:
On Saturday, 6 October 2018 at 15:35:39 UTC, Basile B wrote:
 On Saturday, 6 October 2018 at 13:56:32 UTC, bauss wrote:
 On Saturday, 6 October 2018 at 13:35:38 UTC, Basile B wrote:
 On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the 
 struct are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
see https://www.programming-idioms.org/idiom/119/deduplicate-list.
Did you even read my post? I stated I could already write the function myself, but if a standard function existed I'd rather use that.
There are two solutions there. One shows how to use uniq, which is why i posted this since you said that it doesn't work as you expected. I didn't suggest you to write a full double loop dedup. routine, i.e àla Golang. Unfortunately the site doesn't allow to link the solutions of a particular language among those who implement the idiom.
uniq will not work with, say a class and the class will require you to implement opCmp, which you can't always do for classes you don't have access to. That's the problem here. It's easy enough without, but you cannot do it by ex. a property of a class. It will not work properly. You can't even use .group and then .map because you need to use .sort and it requires opCmp again. The whole problem is actually that they do not work with ranges that aren't sorted. Things like .group and .uniq should work without sorted ranges. You can't always expect a range to be sorted to perform such algorithms.
Oct 06 2018
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Oct 06, 2018 at 08:07:42PM +0000, bauss via Digitalmars-d-learn wrote:
[...]
 The whole problem is actually that they do not work with ranges that
 aren't sorted. Things like .group and .uniq should work without sorted
 ranges. You can't always expect a range to be sorted to perform such
 algorithms.
The reason they don't work with ranges that aren't sorted is because the algorithms that don't require sorted ranges are more expensive, or need to allocate (non- nogc), and are sometimes application-specific, meaning that you're better off writing your own algorithm for it that takes advantage of the specific knowledge you have about your data, rather than some poorly-performing generic algorithm that can't possibly optimize for every possible use case. Not to mention, the expensiveness of such algorithms comes from having to do work that basically amounts to sorting the data, so you might as well sort them yourself in the first place. T -- Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
Oct 06 2018
prev sibling parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 10/06/2018 01:07 PM, bauss wrote:

 uniq will not work with, say a class and the class will require you to
 implement opCmp, which you can't always do for classes you don't have
 access to.
Remembering that uniq works with a custom predicate, which should be sufficient in most of those case.
 to use ..sort
 and it requires opCmp again.
Same for sort: works with a custom predicate.
 Things like .group and .uniq should work without sorted
 ranges. You can't always expect a range to be sorted to perform such
 algorithms.
Yes, it can be inconvenient but I'm under the impression that making algorithmic complexity a property of a function is agreed by most people. (For D, algorithmic complexity is in the documentations of functions; alas, uniq lacks it.) So, it's a good thing that we know that uniq is always O(N). Ali
Oct 06 2018
prev sibling next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the struct 
 are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
range .array.sort // make equal elements be adjacent .chunks!"a==b" // get a range over those equal elements .map(e => e.front); // get the first one should work, possibly not the most efficient way to do it though. I don't know off the top of mu head if there are any to preserve the order or the original range.
Oct 06 2018
prev sibling parent Piotr Mitana <the.mail.of.mi2 gmail.com> writes:
On Saturday, 6 October 2018 at 13:17:22 UTC, bauss wrote:
 Let's say you have a range with struct, but some of the struct 
 are duplicates of each other.

 Is there a standard function in Phobos to remove duplicates?

 My first thought was "uniq", but it can't really do it like 
 that, but it doesn't work.

 See: https://run.dlang.io/is/IcFEtw

 Is there another function in Phobos that perhaps would work?

 I can of course write my own function, but if there is a 
 standard function that can do it, then I'd rather use that.
Unless you want to implement opCmp() and do the .sort.uniq then, you may find this one-liner helpful: randomAccessRange.fold!((arr, elem) => arr.canFind(elem) ? arr : (arr ~= elem))((ElementType!(typeof(randomAccessRange))[]).init) Applied in your example: https://run.dlang.io/is/3KjRvY But yes, it is de facto a custom function. I don't think that there is a function in Phobos that will perfectly deduplicate your range without having it sorted before.
Oct 08 2018