www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Remove duplicates

reply "bearophile" <bearophileHUGS lycos.com> writes:
Sometimes I have need a simple function like this, related to 
std.string.squeeze:


// Must keep the original order of the items.
// Slow implementation that shows the semantics.
T[] noDupes(T)(in T[] s) {
     import std.algorithm: canFind;
     T[] result;
     foreach (T c; s)
         if (!result.canFind(c))
             result ~= c;
     return result;
}

void main() {
     import std.string: squeeze;
     assert("AAAA".noDupes == "A");
     assert("AAAA".squeeze == "A");
     assert("ABAC".noDupes == "ABC");
     assert("ABAC".squeeze == "ABAC");
}


Do you know if this function (or a simple way to implement it) 
already in Phobos?

Bye,
bearophile
May 21 2013
next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Tuesday, 21 May 2013 at 22:00:09 UTC, bearophile wrote:
 Sometimes I have need a simple function like this, related to 
 std.string.squeeze:


 // Must keep the original order of the items.
 // Slow implementation that shows the semantics.
 T[] noDupes(T)(in T[] s) {
     import std.algorithm: canFind;
     T[] result;
     foreach (T c; s)
         if (!result.canFind(c))
             result ~= c;
     return result;
 }

 void main() {
     import std.string: squeeze;
     assert("AAAA".noDupes == "A");
     assert("AAAA".squeeze == "A");
     assert("ABAC".noDupes == "ABC");
     assert("ABAC".squeeze == "ABAC");
 }


 Do you know if this function (or a simple way to implement it) 
 already in Phobos?

 Bye,
 bearophile
I would prefer a map solution like this: ---- property T[] noDupes(T)(const T[] arr) { bool[T] map; foreach (T item; arr) { if (item !in map) map[item] = true; } return map.keys; } ---- I do not know if this solution would be much faster. But if I should make a statement, I would tend to yes.
May 21 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 // Slow implementation that shows the semantics.
... I do not know if this solution would be much faster. But if I should make a statement, I would tend to yes.
(The point of my code was to show the semantics as much clearly as possible, it was not to show fast code.) If no similar function is in Phobos, and there is no trivial way to implement it efficiently, then maybe it's worth writing a Phobos enhancement request. Bye, bearophile
May 21 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Tuesday, 21 May 2013 at 22:51:26 UTC, bearophile wrote:
 Namespace:

 // Slow implementation that shows the semantics.
... I do not know if this solution would be much faster. But if I should make a statement, I would tend to yes.
(The point of my code was to show the semantics as much clearly as possible, it was not to show fast code.) If no similar function is in Phobos, and there is no trivial way to implement it efficiently, then maybe it's worth writing a Phobos enhancement request. Bye, bearophile
http://dlang.org/phobos/std_algorithm.html#uniq
May 21 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 http://dlang.org/phobos/std_algorithm.html#uniq
Just like squeeze, uniq requires the items to be sorted, to remove the duplicates. The function I am discussing here must keep the order of the not removed items. Bye, bearophile
May 21 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 21 May 2013 18:00:07 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Sometimes I have need a simple function like this, related to  
 std.string.squeeze:


 // Must keep the original order of the items.
 // Slow implementation that shows the semantics.
 T[] noDupes(T)(in T[] s) {
      import std.algorithm: canFind;
      T[] result;
      foreach (T c; s)
          if (!result.canFind(c))
              result ~= c;
      return result;
 }

 void main() {
      import std.string: squeeze;
      assert("AAAA".noDupes == "A");
      assert("AAAA".squeeze == "A");
      assert("ABAC".noDupes == "ABC");
      assert("ABAC".squeeze == "ABAC");
 }


 Do you know if this function (or a simple way to implement it) already  
 in Phobos?
This seems to work for ASCII strings, but the conversion to array is required (I think because writeln may re-evaluate the range's front more than once): Should be O(n), but doesn't really get around the memory allocation. You could probably write a custom range that does this properly (doesn't set present until popFront is called): import std.algorithm; import std.array; import std.stdio; void main() { bool present[256]; // writes "ABC" writeln(array("ABAC".filter!((a){scope(exit) present[a] = true; return !present[a];}))); } Kind of a violation of how a range should work, front should be the same across multiple calls! But it does the trick :) -Steve
May 21 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 This seems to work for ASCII strings, but the conversion to 
 array is required (I think because writeln may re-evaluate the 
 range's front more than once):
Thank you, I have opened an ER: http://d.puremagic.com/issues/show_bug.cgi?id=10131 Bye, bearophile
May 21 2013