www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Removing multiple elements from array

reply "aldanor" <i.s.smirnov gmail.com> writes:
Is there an efficient method to remove elements with multiple 
(compile-time-unknown) indices from an array? I currently do 
something like

		if (!index.empty)
			foreach (i; index.sort.reverse)
				a = a.remove(i);

... which looks a bit awkward.
Dec 20 2013
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 12/20/2013 04:47 PM, aldanor wrote:

 Is there an efficient method to remove elements with multiple
 (compile-time-unknown) indices from an array? I currently do 
something like
          if (!index.empty)
              foreach (i; index.sort.reverse)
                  a = a.remove(i);
That's probably a bug, right? The indexes would probably become off as the array gets shorter.
 ... which looks a bit awkward.
I am sure others can come up with solutions but I can't think of a way of doing this Phobos right now. :) Here is a range that does what you want (not very well tested ;) ): import std.stdio; import std.array; import std.range; import std.algorithm; struct SkipIndexes(R, I) { alias SortedIndexRange = typeof((I.init).sort.uniq); R range; SortedIndexRange indexes; size_t currentIndex; this(R range, I indexes, size_t currentIndex = 0) { this.range = range; this.indexes = indexes.sort.uniq; this.currentIndex = currentIndex; prime(); } bool empty() const property { return range.empty; } ElementType!R front() const property { return range.front; } void prime() { while (!empty && !indexes.empty && (indexes.front == currentIndex)) { range.popFront(); indexes.popFront(); ++currentIndex; } } void popFront() { range.popFront(); ++currentIndex; prime(); } auto save() property { return this; } void report() const { writeln(indexes); writeln(currentIndex); } } auto skipIndexes(R, I)(R range, I skipIndexes, size_t currentIndex = 0) { return SkipIndexes!(R, I)(range, skipIndexes, currentIndex); } void main() { auto arr = [ 0, 1, 2, 3, 4, 5 ]; size_t[] badIndexes = [ 0, 1, 5 ]; auto skipped = arr.skipIndexes(badIndexes); // skipped is a lazy range: assert(skipped.equal([ 2, 3, 4 ])); // arr is untouched at this point: assert(arr == [ 0, 1, 2, 3, 4, 5 ]); // To affect it, assign from .array of the lazy range: arr = skipped.array; assert(arr == [ 2, 3, 4 ]); } Ali
Dec 20 2013
parent reply "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:

 That's probably a bug, right? The indexes would probably become 
 off as the array gets shorter.
Note the .reverse there? Which makes sure indices don't get off. Still, that's probably an inefficient way of doing this...
Dec 20 2013
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 12/20/2013 06:17 PM, aldanor wrote:
 On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:

 That's probably a bug, right? The indexes would probably become off as
 the array gets shorter.
Note the .reverse there?
Yes, I missed that part. :)
 Which makes sure indices don't get off. Still,
 that's probably an inefficient way of doing this...
Yeah, every removal would move the right-hand elements one position to the left. But you would be removing only M times (the number of indexes). So the complexity is I think O(M logM) + O(NM) for sorting the indexing plus removing M elements by shifting at the order of N elements. The range solution that I've come up with is I think O(M log M) + O(M) + O(N), for sorting the indexes, removing the duplicates with uniq and iterating the elements. Since O(M log M) dominates O(M), it should be O(M log M) + O(N). Ali P.S. Hm. I think I've reading an algorithms book lately. :p
Dec 20 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 20, 2013 at 07:01:17PM -0800, Ali Çehreli wrote:
 On 12/20/2013 06:17 PM, aldanor wrote:
On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:

That's probably a bug, right? The indexes would probably become off as
the array gets shorter.
Note the .reverse there?
Yes, I missed that part. :)
Which makes sure indices don't get off. Still,
that's probably an inefficient way of doing this...
Yeah, every removal would move the right-hand elements one position to the left. But you would be removing only M times (the number of indexes). So the complexity is I think O(M logM) + O(NM) for sorting the indexing plus removing M elements by shifting at the order of N elements. The range solution that I've come up with is I think O(M log M) + O(M) + O(N), for sorting the indexes, removing the duplicates with uniq and iterating the elements. Since O(M log M) dominates O(M), it should be O(M log M) + O(N).
[...] If you know the which elements you're going to remove, you can remove them all at once in O(N) time: int[] arr = [ ... ]; for (i=0, j=0; i < arr.length; i++) { if (deleteThisElement(arr[i])) continue; if (j < i) arr[j] = arr[i]; j++; } arr.length = j; Of course, if you know which *indices* you're going to remove, then you can do even better (by starting your loop at the first index to be deleted). T -- BREAKFAST.COM halted...Cereal Port Not Responding. -- YHL
Dec 20 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 20, 2013 at 07:13:59PM -0800, H. S. Teoh wrote:
 On Fri, Dec 20, 2013 at 07:01:17PM -0800, Ali Çehreli wrote:
 On 12/20/2013 06:17 PM, aldanor wrote:
On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:

That's probably a bug, right? The indexes would probably become off as
the array gets shorter.
Note the .reverse there?
Yes, I missed that part. :)
Which makes sure indices don't get off. Still,
that's probably an inefficient way of doing this...
Yeah, every removal would move the right-hand elements one position to the left. But you would be removing only M times (the number of indexes). So the complexity is I think O(M logM) + O(NM) for sorting the indexing plus removing M elements by shifting at the order of N elements. The range solution that I've come up with is I think O(M log M) + O(M) + O(N), for sorting the indexes, removing the duplicates with uniq and iterating the elements. Since O(M log M) dominates O(M), it should be O(M log M) + O(N).
[...] If you know the which elements you're going to remove, you can remove them all at once in O(N) time: int[] arr = [ ... ]; for (i=0, j=0; i < arr.length; i++) { if (deleteThisElement(arr[i])) continue; if (j < i) arr[j] = arr[i]; j++; } arr.length = j;
There's also a Phobos solution: the above code is equivalent to: import std.algorithm : remove; int[] arr = [ ... ]; arr = arr.remove!deleteThisElement();
 Of course, if you know which *indices* you're going to remove, then
 you can do even better (by starting your loop at the first index to be
 deleted).
[...] Unfortunately, while std.algorithm.remove does provide a way to do this if the number of indices to remove are known beforehand, it doesn't seem to be able to remove a dynamic list of indices. Probably an enhancement request should be filed: http://d.puremagic.com/issues If the number of indices to remove are fixed, though, you can do this: import std.algorithm : remove; int[] arr = [ ... ]; int i1, i2, i3; // indices to remove arr = arr.remove(i1, i2, i3); T -- It's bad luck to be superstitious. -- YHL
Dec 20 2013
parent "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 03:22:22 UTC, H. S. Teoh wrote:
 Unfortunately, while std.algorithm.remove does provide a way to 
 do this
 if the number of indices to remove are known beforehand, it 
 doesn't seem
 to be able to remove a dynamic list of indices. Probably an 
 enhancement
 request should be filed:

 	http://d.puremagic.com/issues

 If the number of indices to remove are fixed, though, you can 
 do this:

 	import std.algorithm : remove;
  	int[] arr = [ ... ];
 	int i1, i2, i3;  // indices to remove
 	arr = arr.remove(i1, i2, i3);
Yes, this is exactly why I opened this thread initially, I noticed this functionality in .remove() and was native enough to assume there's a shortcut for batch removal with indices unknown in advance :)
Dec 21 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
aldanor:

 Is there an efficient method to remove elements with multiple 
 (compile-time-unknown) indices from an array? I currently do 
 something like

 		if (!index.empty)
 			foreach (i; index.sort.reverse)
 				a = a.remove(i);

 ... which looks a bit awkward.
Do not forget to add the () after the sort, otherwise you call the deprecated, buggy and slow built-in sort. reverse is another deprecated built-in, so use "retro". The first "if" is not much useful, trying to sort an empty array will not wast much time. foreach (i; index.sort().retro) a = a.remove(i); In alternative you can also invert the sorting cmp (untested): foreach (i; index.sort!q{a > b}) a = a.remove(i); If you have to remove just few items in-place, that's the way to do it. In future I hope we'll have a better remove function in Phobos (but it's not hard to write). Another solution is to filter the array a. If you have the indexes, you can zip a with the indexes, and then filter, map to extract just the items and use copy() to work in place, but it's not nice-looking code. Bye, bearophile
Dec 20 2013
parent "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 02:52:00 UTC, bearophile wrote:
 Do not forget to add the () after the sort, otherwise you call 
 the deprecated, buggy and slow built-in sort.

 reverse is another deprecated built-in, so use "retro".

 The first "if" is not much useful, trying to sort an empty 
 array will not wast much time.

 foreach (i; index.sort().retro)
     a = a.remove(i);

 In alternative you can also invert the sorting cmp (untested):

 foreach (i; index.sort!q{a > b})
     a = a.remove(i);
Thanks, that looks like the cleanest solution so far! So (I guess) it will not cause index.length reallocations by just moving the elements in-place and then returning a slice? P.S. I wonder where would one learn about the deprecated sort and reverse if not from this forum? I followed the official docs and there's nothing about these properties being deprecated... yet another spot where a compiler warning would be appropriate?
Dec 21 2013
prev sibling parent reply "MrSmith" <mrsmith33 yandex.ru> writes:
On Saturday, 21 December 2013 at 00:47:04 UTC, aldanor wrote:
 Is there an efficient method to remove elements with multiple 
 (compile-time-unknown) indices from an array? I currently do 
 something like

 		if (!index.empty)
 			foreach (i; index.sort.reverse)
 				a = a.remove(i);

 ... which looks a bit awkward.
just use foreach_reverse
Dec 21 2013
parent reply "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 21:49:01 UTC, MrSmith wrote:
 On Saturday, 21 December 2013 at 00:47:04 UTC, aldanor wrote:
 Is there an efficient method to remove elements with multiple 
 (compile-time-unknown) indices from an array? I currently do 
 something like

 		if (!index.empty)
 			foreach (i; index.sort.reverse)
 				a = a.remove(i);

 ... which looks a bit awkward.
just use foreach_reverse
Wow, I never knew it is a keyword, thank you!
Dec 21 2013
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 12/21/2013 05:39 PM, aldanor wrote:

 On Saturday, 21 December 2013 at 21:49:01 UTC, MrSmith wrote:
 just use foreach_reverse
Wow, I never knew it is a keyword, thank you!
I feel guilty for omitting in knowingly but we've been hearing that it would be deprecated. I don't think it will ever happen. :-/ Besides, I think retro(), its alternative, is not suitable for recursive algorithms because retro() works only with ranges. I had once tried to do recursive iteration with ranges (probably on a tree) but it had turned out that iteration with opApplyReverse was more natural for recursion. (I may be wrong on my conclusion but that is what I remember now.) Ali
Dec 21 2013