digitalmars.D - Resizing an array: Dangerous? Possibly buggy?
- %u (24/24) Mar 09 2011 Increasing the sizes of an array has always given me the shivers, as
- Steven Schveighoffer (24/46) Mar 09 2011 Since dmd around 2.042, array resizing and memory management has been
- %u (7/7) Mar 09 2011 Huh, interesting, okay.
- Nick Treleaven (9/14) Mar 11 2011 If you're referring to reducing the length of an array, I think people
- %u (6/9) Mar 11 2011 the GC holding onto too much memory would be useful.
- Jonathan M Davis (8/35) Mar 11 2011 No. That means that you have to worry about reallocation at fairly
Increasing the sizes of an array has always given me the shivers, as beautiful as it is. Could someone explain why this code behaves the way it does? string s = "1234"; s.length = 7; writefln("\"%s\"", s); prints: "1234 " Given that it makes no sense to extend a const-size array, shouldn't there be a run-time check on the new size of the array, so that it throws whenever it's bigger than the current size? Also, another issue: Let's say you have an array that you dynamically resize (meaning you grow and shrink it a lot of times). In fact, let's say you have: int[] arr = new int[1024 * 1024]; and then you decide, hey, that's too big for how much space I actually needed: arr.length = 5; Can the rest of the array be garbage collected? a. If the answer is "yes", then how does the GC know? b. If the answer is "no", then doesn't that: (1) Mean that we can have memory leaks? (2) Mean that we still need an ArrayList!(T) type? (Note that using Array!(T) does _NOT_ help here, because it can't hold object references due to garbage collection issues.)
Mar 09 2011
On Wed, 09 Mar 2011 06:41:54 -0500, %u <wfunction hotmail.com> wrote:Increasing the sizes of an array has always given me the shivers, as beautiful as it is.Since dmd around 2.042, array resizing and memory management has been extremely safe. It should be very difficult for you to get into trouble.Could someone explain why this code behaves the way it does? string s = "1234"; s.length = 7; writefln("\"%s\"", s); prints: "1234���" Given that it makes no sense to extend a const-size array, shouldn't there be a run-time check on the new size of the array, so that it throws whenever it's bigger than the current size?A string is an immutable(char)[], it is not a fixed size. Try this, and you will get an error: int[5] x; x.length = 7; I will also point out that setting the length of a string is not a useful thing -- once the elements are added, they are set in stone (immutable), so essentially, you just added 4 unchangeable chars of 0xff. It's better to append to a string, which will do it in place if possible.Also, another issue: Let's say you have an array that you dynamically resize (meaning you grow and shrink it a lot of times). In fact, let's say you have: int[] arr = new int[1024 * 1024]; and then you decide, hey, that's too big for how much space I actually needed: arr.length = 5; Can the rest of the array be garbage collected?No. An array is one contiguous memory-allocated block. It cannot be partially collected.a. If the answer is "yes", then how does the GC know? b. If the answer is "no", then doesn't that: (1) Mean that we can have memory leaks?Somewhat, as long as you keep a reference to that small array. But once that array is unreferenced, the memory should be collected. Note that if you do know that you are holding lots of memory with a small array, you can do something like this: arr = arr[0..5].dup; and this makes a copy just large enough to hold the 5 elements. The original 1MB array should be collected.(2) Mean that we still need an ArrayList!(T) type?It depends on what you want to do, and how ArrayList is implemented. In dcollections, ArrayList stores its elements in one contiguous memory block, just like an array. -Steve
Mar 09 2011
Huh, interesting, okay. I think pitfalls like this one (with the garbage collector, for example) should definitely be documented somewhere. I would imagine that quite a few people who try to set the length of an array won't realize that they can run out of memory this way, especially because it's nondeterministic in many cases. Anyway, thanks for the response!
Mar 09 2011
On Wed, 09 Mar 2011 18:15:42 +0000, %u wrote:I think pitfalls like this one (with the garbage collector, for example) should definitely be documented somewhere. I would imagine that quite a few people who try to set the length of an array won't realize that they can run out of memory this way, especially because it's nondeterministic in many cases.If you're referring to reducing the length of an array, I think people with a C background would expect the memory not to be reallocated, because this avoids copying memory contents, and anyway the array may grow again. I think this is documented somewhere, maybe TDPL when talking about slices. But making people more aware of it is probably a good thing. Perhaps an article on things to watch out for to prevent the GC holding onto too much memory would be useful.
Mar 11 2011
won't realize that they can run out of memory this way, especially because it's nondeterministic in many cases.I think pitfalls like this one (with the garbage collector, for example) should definitely be documented somewhere. I would imagine that quite a few people who try to set the length of an arrayIf you're referring to reducing the length of an array, I think people with a C background would expect the memory not to be reallocated, because this avoids copying memory contents, and anywaythe array may grow again.I think this is documented somewhere, maybe TDPL when talking about slices. But making people more aware of it is probably a good thing. Perhaps an article on things to watch out for to preventthe GC holding onto too much memory would be useful. I'm having an idea: Why not automatically reallocate/shrink an array when it's resized to below 25% of its capacity, and automatically double the capacity when it overflows? That way, we're never on a boundary case (as would happen if we simply shrunk the array when it was below 50% capacity rather than 25%), we could free memory, and the operations would be really O(1) (since the copies are amortized over the items)... does that sound like a good idea?
Mar 11 2011
On Friday 11 March 2011 21:40:36 %u wrote:No. That means that you have to worry about reallocation at fairly unpredicatable points. If you really want that behavior, it's easy enough to create a wrapper which does that. However, you can't get the current behavior by wrapping an array that behaves as you suggest. Also, what you suggest adds additional overhead to every operation which could potentially resize an array. On the whole, the way arrays work right now works quite well. - Jonathan M Daviswon't realize that they can run out of memory this way, especially because it's nondeterministic in many cases.I think pitfalls like this one (with the garbage collector, for example) should definitely be documented somewhere. I would imagine that quite a few people who try to set the length of an arrayIf you're referring to reducing the length of an array, I think people with a C background would expect the memory not to be reallocated, because this avoids copying memory contents, and anywaythe array may grow again.I think this is documented somewhere, maybe TDPL when talking about slices. But making people more aware of it is probably a good thing. Perhaps an article on things to watch out for to preventthe GC holding onto too much memory would be useful. I'm having an idea: Why not automatically reallocate/shrink an array when it's resized to below 25% of its capacity, and automatically double the capacity when it overflows? That way, we're never on a boundary case (as would happen if we simply shrunk the array when it was below 50% capacity rather than 25%), we could free memory, and the operations would be really O(1) (since the copies are amortized over the items)... does that sound like a good idea?
Mar 11 2011