digitalmars.D - Spikes in array capacity
- Ali Çehreli (46/46) Jul 01 2010 There is something strange in the array capacity algorithm.
- Steven Schveighoffer (3/49) Jul 01 2010 No, that is not intended. Please file a bugzilla against druntime.
- Ali Çehreli (5/14) Jul 01 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4412
- BCS (11/17) Jul 01 2010 I'm going to take a guess that the gc is (after some point) allocating i...
- Ali Çehreli (27/39) Jul 01 2010 Yes, the spikes move.
- Steven Schveighoffer (30/70) Jul 02 2010 If your original code is all that was being run, I think there is a bug.
- Ali Çehreli (51/133) Jul 02 2010 The original code was it.
- Steven Schveighoffer (14/58) Jul 02 2010 4x1024 = 4096 == page size.
- Andrei Alexandrescu (6/27) Jul 02 2010 You may want to also print the pointer to the first element of the
- Ali Çehreli (18/44) Jul 02 2010 Good point. :) Yes, spikes are only when the array is moved.
- Steven Schveighoffer (3/18) Jul 02 2010 Thanks. I just figured this one out :) Very interesting bug indeed.
- Steven Schveighoffer (4/28) Jul 02 2010 FYI: http://www.dsource.org/projects/druntime/changeset/317
- Steven Schveighoffer (5/34) Jul 02 2010 Missed something, here is a better view of the change:
There is something strange in the array capacity algorithm.

import std.stdio;
import std.array;

void main()
{
    int[] a;
    size_t oldCapacity;

    foreach (i; 0 .. 100_000_000) {
        a ~= i;

        if (a.capacity != oldCapacity) {
            writeln("length: ", a.length,
                    " capacity: ", capacity(a),
                    " ratio: ", cast(double)capacity(a) / oldCapacity);
            oldCapacity = capacity(a);
        }
    }
}

Produces:

length: 1 capacity: 3 ratio: inf          <-- please ignore this one ;)
length: 4 capacity: 7 ratio: 2.33333
length: 8 capacity: 15 ratio: 2.14286
length: 16 capacity: 31 ratio: 2.06667
length: 32 capacity: 63 ratio: 2.03226
length: 64 capacity: 127 ratio: 2.01587
length: 128 capacity: 255 ratio: 2.00787
length: 256 capacity: 509 ratio: 1.99608
length: 512 capacity: 1021 ratio: 2.00589
length: 1022 capacity: 2045 ratio: 2.00294
length: 2046 capacity: 3069 ratio: 1.50073
length: 3070 capacity: 4093 ratio: 1.33366
length: 4094 capacity: 5117 ratio: 1.25018
...
length: 1153022 capacity: 1154045 ratio: 1.00089
length: 1154046 capacity: 1155069 ratio: 1.00089
length: 1155070 capacity: 3153917 ratio: 2.7305   <-- spike
length: 3153918 capacity: 3154941 ratio: 1.00032
length: 3154942 capacity: 3155965 ratio: 1.00032
...
length: 4741118 capacity: 4742141 ratio: 1.00022
length: 4742142 capacity: 4743165 ratio: 1.00022
length: 4743166 capacity: 12333053 ratio: 2.60017 <-- spike
length: 12333054 capacity: 12334077 ratio: 1.00008
length: 12334078 capacity: 12335101 ratio: 1.00008
...

(there are more spikes later on)

Is that intended?

Ali
Jul 01 2010
On Thu, 01 Jul 2010 12:46:27 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

> There is something strange in the array capacity algorithm.
> [...]
> (there are more spikes later on)
>
> Is that intended?

No, that is not intended. Please file a bugzilla against druntime.

-Steve
Jul 01 2010
Steven Schveighoffer wrote:
> On Thu, 01 Jul 2010 12:46:27 -0400, Ali Çehreli <acehreli yahoo.com> wrote:
>> There is something strange in the array capacity algorithm.
>> [...]
>> length: 1155070 capacity: 3153917 ratio: 2.7305   <-- spike
>> [...]
>
> No, that is not intended. Please file a bugzilla against druntime.

http://d.puremagic.com/issues/show_bug.cgi?id=4412

Thanks,
Ali
Jul 01 2010
Hello Ali,

> There is something strange in the array capacity algorithm.
> [...]
> Is that intended?

I'm going to take a guess that the gc is (after some point) allocating into the smallest hole with "enough" capacity. If you re-run the test but with something else going on to add some noise, do the spikes move?

void makeNoise()
{
    static byte[][256] data;
    data[rand() % $] = new byte[rand() % 512];
}

-- 
... <IXOYE><
Jul 01 2010
BCS wrote:
>> There is something strange in the array capacity algorithm.
>> [...]
>> Is that intended?
>
> I'm going to take a guess that the gc is (after some point) allocating into the smallest hole with "enough" capacity.

You may be right. :)

> If you re-run the test but with something else going on to add some noise, do the spikes move?

Yes, the spikes move.

> void makeNoise()
> {
>     static byte[][256] data;
>     data[rand() % $] = new byte[rand() % 512];
> }

I am not familiar with the dmd internals; but following the code took me to the function newCapacity() in src/druntime/src/rt/lifetime.d. The capacity growth is more complicated than what I've been assuming so far. And yes, the GC is in the picture. Quoting the comments from that file:

/*
 * Better version by Dave Fladebo:
 * This uses an inverse logorithmic algorithm to pre-allocate a bit more
 * space for larger arrays.
 * - Arrays smaller than PAGESIZE bytes are left as-is, so for the most
 *   common cases, memory allocation is 1 to 1. The small overhead added
 *   doesn't affect small array perf. (it's virtually the same as
 *   current).
 * - Larger arrays have some space pre-allocated.
 * - As the arrays grow, the relative pre-allocated space shrinks.
 * - The logorithmic algorithm allocates relatively more space for
 *   mid-size arrays, making it very fast for medium arrays (for
 *   mid-to-large arrays, this turns out to be quite a bit faster than the
 *   equivalent realloc() code in C, on Linux at least. Small arrays are
 *   just as fast as GCC).
 * - Perhaps most importantly, overall memory usage and stress on the GC
 *   is decreased significantly for demanding environments.
 */

So there may not be a bug after all...

Ali
Jul 01 2010
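As a rough illustration of the inverse-logarithmic idea described in the quoted comment, here is a sketch in Python. This is not the actual druntime newCapacity() code; the formula is hypothetical, chosen only to show the qualitative behavior (relative reserve shrinking as the array grows, sub-page arrays left as-is).

```python
import math

PAGESIZE = 4096  # assumed page size in bytes

def new_capacity_sketch(new_length, elem_size):
    # Hypothetical inverse-logarithmic pre-allocation: the *relative*
    # amount of extra space shrinks as the array grows. Illustration
    # only; not the real druntime newCapacity() formula.
    new_bytes = new_length * elem_size
    if new_bytes < PAGESIZE:
        return new_length  # small arrays: allocate exactly what was asked
    extra = 1.0 / math.log2(new_bytes)  # extra fraction ~ 1/log2(size)
    return int(new_length * (1.0 + extra))

# The growth ratio approaches 1.0 as the array gets larger:
small_ratio = new_capacity_sketch(2_000, 4) / 2_000
large_ratio = new_capacity_sketch(2_000_000, 4) / 2_000_000
assert small_ratio > large_ratio > 1.0
```

This matches the comment's claims in shape: mid-size arrays get proportionally more reserve than very large ones, and sub-page arrays get none.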
On Fri, 02 Jul 2010 01:34:26 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

> I am not familiar with the dmd internals; but following the code took me to the function newCapacity() in src/druntime/src/rt/lifetime.d. The capacity growth is more complicated than what I've been assuming so far.
> [...]
> So there may not be a bug after all...

If your original code is all that was being run, I think there is a bug. I'll tell you why -- the GC provides a way to append pages to an allocated block. Essentially, you can glue consecutive unused pages together to make a larger block, thereby growing your block. But anything under a page is simply reallocated, because anything less than a page lives inside a pool of blocks of the same size, and those cannot be glued together.

So when your array is growing and it gets larger than a page, it should grow a single page at a time. But I think (and I stress *think*, I'm not sure) that when a block of pages is deallocated, each page becomes independently free; they do not stay glued together. So an allocation requesting 1 or 2 more pages actually growing by 2000 pages is suspect. I still have to examine the code, but I think there is a problem.

Back to my original assertion -- to disprove the theory that some other "larger hole" is why it gets so big: in one instance your number of elements grows from 1,155,070 (4MB) to 3,153,917 (12MB), almost a 200% increase. In order for that 12MB hole to exist, something must have allocated it before and then deallocated it. It can't be your loop, because your loop has not allocated that much space yet. I would find it very strange for that to happen somewhere in the runtime without you specifically allocating it. If, however, your original post is not the entire code, then there may be some possibility of that.

BTW, you can take the newCapacity function out of the loop by simply setting the length and then writing the last element, i.e.:

a.length += 1;
a[$-1] = i;

newCapacity isn't called when setting the length explicitly, because it's not considered to be an append.

-Steve
Jul 02 2010
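A toy model may help build intuition for the allocator behavior Steve describes. This is a hypothetical sketch in Python, not the actual GC code: it assumes sub-page blocks live in pools of same-sized blocks and must be moved to grow, while blocks of a page or more are extended in place by gluing on whole pages.

```python
PAGESIZE = 4096  # assumed page size in bytes

class BlockModel:
    # Hypothetical model: sub-page blocks sit in fixed-size pools and
    # must be reallocated (moved) to grow; blocks of one page or more
    # grow in place by appending whole pages.
    def __init__(self):
        self.size = 0
        self.moves = 0

    def grow_to(self, nbytes):
        if self.size < PAGESIZE and nbytes > self.size:
            self.moves += 1  # pool block: growing means moving
        if nbytes >= PAGESIZE:
            # page block: round up to whole pages, extended in place
            nbytes = (nbytes + PAGESIZE - 1) // PAGESIZE * PAGESIZE
        self.size = max(self.size, nbytes)

b = BlockModel()
for n in (64, 256, 1024, 8192, 12288):
    b.grow_to(n)
assert b.moves == 4    # moves while below a page (incl. the jump to pages)
assert b.size == 12288 # past one page, growth was in place (3 pages)
```

Under this model, once the array spans whole pages its address should stop changing -- which is why a jump of thousands of pages at once looks suspicious.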
Steven Schveighoffer wrote:
> If your original code is all that was being run, I think there is a bug.

The original code was it.

> BTW, you can take the newCapacity function out of the loop by simply setting the length and then writing the last element [...]
> newCapacity isn't called when setting the length explicitly, because it's not considered to be an append.

You are right. Looks like I was successful in locating the culprit. :)

I took the liberty to write ++a.length, and also outputted the number of elements that are in reserve at each allocation:

import std.stdio;
import std.array;

void main()
{
    int[] a;
    size_t oldCapacity;

    foreach (i; 0 .. 100_000_000) {
        ++a.length;
        a[$-1] = i;

        if (capacity(a) != oldCapacity) {
            writeln("length: ", a.length,
                    " capacity: ", capacity(a),
                    " ratio: ", cast(double)capacity(a) / oldCapacity,
                    " reserve: ", capacity(a) - a.length);
            oldCapacity = capacity(a);
        }
    }
}

Now the spikes are gone, and the allocations asymptote at once-per-1024 elements for an int array:

length: 1 capacity: 3 ratio: inf reserve: 2
length: 4 capacity: 7 ratio: 2.33333 reserve: 3
length: 8 capacity: 15 ratio: 2.14286 reserve: 7
length: 16 capacity: 31 ratio: 2.06667 reserve: 15
length: 32 capacity: 63 ratio: 2.03226 reserve: 31
length: 64 capacity: 127 ratio: 2.01587 reserve: 63
length: 128 capacity: 255 ratio: 2.00787 reserve: 127
length: 256 capacity: 509 ratio: 1.99608 reserve: 253
length: 512 capacity: 1021 ratio: 2.00589 reserve: 509
length: 1022 capacity: 2045 ratio: 2.00294 reserve: 1023
length: 2046 capacity: 3069 ratio: 1.50073 reserve: 1023
length: 3070 capacity: 4093 ratio: 1.33366 reserve: 1023
...

In the long run, there is one allocation per 1024 elements. That still feels like amortized constant, but considering that relocation times would be O(N), I think the reserve should still grow with N, but maybe not too much. (I am not sure about this last point at all.)

Thank you very much,
Ali
Jul 02 2010
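The concern about relocation cost can be sanity-checked with a quick simulation. This Python sketch pessimistically assumes every capacity increase relocates and copies the whole array (exactly the cost that in-place page extension avoids): with a fixed 1024-element reserve, the total number of elements copied grows quadratically in N, while geometric (doubling) growth keeps it linear, i.e. amortized constant per append.

```python
def total_copies(n, grow):
    # Append n elements one at a time; whenever length hits capacity,
    # charge a full relocation (copy every element held so far) and
    # grow capacity with the supplied policy.
    cap = length = copied = 0
    for _ in range(n):
        if length == cap:
            copied += length
            cap = grow(cap)
        length += 1
    return copied

n = 1_000_000
fixed = total_copies(n, lambda c: c + 1024)          # constant reserve
doubling = total_copies(n, lambda c: max(1, 2 * c))  # geometric growth

assert doubling < 3 * n        # O(n) total copies -> amortized O(1)
assert fixed > 100 * doubling  # ~O(n^2/1024) total copies
```

So the instinct is right under a copy-on-every-growth model; the runtime only gets away with a fixed reserve when pages can be glued on without moving the block.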
On Fri, 02 Jul 2010 10:01:11 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

> In the long run, there is one allocation per 1024 elements. That still feels like amortized constant, but considering that relocation times would be O(N), I think the reserve should still grow with N, but maybe not too much. (I am not sure about this last point at all.)

4 x 1024 = 4096 == page size.

The newCapacity function is supposed to reduce the effect of this phenomenon. I.e., when appending, it's assumed you are going to continue appending, so to reduce the overhead, the amount added to the memory block is supposed to be related to the total number of bytes already allocated. It doesn't seem like the function is working very well... It could have been something I did, or it could have always been broken :) I honestly did not examine the code at all; I only ever read the comments, assuming it works.

Please add these comments to the bug report; it will help when I look at it. I don't have a lot of time to look at it right now, so I'm afraid I'll forget what we were doing later :)

-Steve
Jul 02 2010
Ali Çehreli wrote:
[snip]
> In the long run, there is one allocation per 1024 elements. That still feels like amortized constant, but considering that relocation times would be O(N), I think the reserve should still grow with N, but maybe not too much. (I am not sure about this last point at all.)

You may want to also print the pointer to the first element of the array. That provides you information about how often the array is actually moved.

Andrei
Jul 02 2010
Andrei Alexandrescu wrote:
> You may want to also print the pointer to the first element of the array. That provides you information about how often the array is actually moved.

Good point. :) Yes, the spikes happen only when the array is moved.

I've also discovered that a basic array invariant is violated at size 509 as well:

import std.string;
import std.array;

void main()
{
    int[] a;
    a.length = 509;
    a ~= 0;

    assert(a.capacity >= a.length,
           format("capacity: %s length: %s", a.capacity, a.length));
}

Outputs:

core.exception.AssertError deneme.d(17444): capacity: 509 length: 510

I will update the bug with these.

Ali
Jul 02 2010
On Fri, 02 Jul 2010 13:01:41 -0400, Ali Çehreli <acehreli yahoo.com> wrote:

> I've also discovered that a basic array invariant is violated at size 509 as well:
> [...]
> I will update the bug with these.

Thanks. I just figured this one out :) Very interesting bug indeed.

-Steve
Jul 02 2010
On Fri, 02 Jul 2010 14:50:53 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:

> Thanks. I just figured this one out :) Very interesting bug indeed.

FYI: http://www.dsource.org/projects/druntime/changeset/317

-Steve
Jul 02 2010
On Fri, 02 Jul 2010 14:58:41 -0400, Steven Schveighoffer <schveiguy yahoo.com> wrote:

> FYI: http://www.dsource.org/projects/druntime/changeset/317

Missed something, here is a better view of the change:

http://www.dsource.org/projects/druntime/changeset?new=trunk%2Fsrc%2Frt%2Flifetime.d%40318&old=trunk%2Fsrc%2Frt%2Flifetime.d%40299

-Steve
Jul 02 2010