digitalmars.D - Array Capacity Field
- dsimcha (13/13) May 07 2008 I know it's been talked about before, but what is the status of the arra...
- Sean Kelly (3/16) May 07 2008 Use std.gc.capacity() in Phobos and tango.core.Memory.GC.sizeOf() in Tan...
- dsimcha (13/13) May 07 2008 According to the Phobos docs, std.gc.capacity "Returns capacity (size of...
- Sean Kelly (12/27) May 07 2008 Right. And since resizing an interior slice "in place" is typically
- dsimcha (25/25) May 07 2008 Because of some difficulties encountered using D for large arrays (See p...
- dsimcha (11/36) May 07 2008 After further testing, I've found an exact threshold for this bug. When...
- Sean Kelly (14/52) May 07 2008 I'm not sure what to say. This sample works fine with D 1.0 using
- Sean Kelly (24/100) May 08 2008 Well whatever's going on, it's not tracked by gc_stats(). I modified
- dsimcha (3/103) May 08 2008 Another interesting point: This bug also occurs on GDC 0.24, but here t...
- dsimcha (11/114) May 09 2008 After looking into this further, it appears it's a fundamental issue wit...
- bearophile (30/31) May 09 2008 Time ago I have asked that question, and someone has shown me a way, you...
- Fawzi Mohamed (7/18) May 09 2008 Nice that there is a workaround, but shouldn't an AA:
- Sean Kelly (6/23) May 09 2008 The GC just deals in memory blocks, so the nodes of a chained AA
- dsimcha (25/56) May 09 2008 This appears to be analogous to simly setting the length field to zero i...
- Sivo Schilling (28/66) May 08 2008 This bug also occurs using DMD 1.029. It seems one have to help phobos
- dsimcha (10/75) May 08 2008 Acutally, if I stick a delete in my original test case at the end of tes...
- Robert Jacques (9/9) May 09 2008 dsimcha and I debugged this last night. It appears it’s a case of the ...
- Nick B (4/19) May 08 2008 Can someone tell me what "interior pointers" are ?
- torhu (4/4) May 08 2008 Nick B wrote:
- Pontus Pihlgren (12/14) May 09 2008 This is my absolute favourite pet peeve with Perl, it has the same
- Sean Kelly (18/35) May 09 2008 The Tango GC actually does release memory back to the OS in some
I know it's been talked about before, but what is the status of the array capacity field that's independent of length?  It would be useful to have such a thing, for example, when working on a program that uses multi-dimensional dynamic arrays, to avoid excessive wasted space.  For example:

void main () {
    uint[][] Table;
    // Read in a huge jagged table of data from a file.
    // Do stuff that involves read-only access to Table[][] and
    // requires a lot of memory.
}

The above might result in running out of memory due to the amount of wasted space in Table[][].  If there were a way to compact Table[][] once one knows it will not be appended to, this would solve a lot of problems.
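For what it's worth, one way to approximate that today is to copy each row into a tightly sized block and delete the original.  This is only a sketch: compactRows is a hypothetical helper, and whether freed pages actually go back to the OS depends on the GC in use.

import std.gc;

void compactRows(T)(T[][] table) {
    for (size_t i = 0; i < table.length; i++) {
        T[] row = table[i];
        table[i] = row.dup;   // .dup allocates a block sized to the actual data
        delete row;           // explicitly free the possibly oversized original
    }
    std.gc.fullCollect();
    std.gc.minimize();        // reportedly a no-op in the Phobos GC of this era
}

It trades a temporary copy of each row for dropping the excess capacity afterwards.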
May 07 2008
== Quote from dsimcha (dsimcha yahoo.com)'s article
> I know it's been talked about before, but what is the status of the array capacity
> field that's independent of length? [...] If there were a way to compact Table[][]
> once one knows it will not be appended to, this would solve a lot of problems.

Use std.gc.capacity() in Phobos and tango.core.Memory.GC.sizeOf() in Tango.


Sean
May 07 2008
According to the Phobos docs, std.gc.capacity "Returns capacity (size of the memory block) that p points to the beginning of.  If p does not point into the gc memory pool, or does not point to the beginning of an allocated memory block, 0 is returned."  I had looked through this before.

It also doesn't appear that writing my own function to do this works:

void capacity(T)(ref T[] array, uint newCapacity) {
    if(newCapacity < array.length) {
        array.length = newCapacity;
    }
    // Note: the pointer returned by realloc is discarded here, so the array
    // still refers to its original block.
    std.gc.realloc(array.ptr, T.sizeof * newCapacity);
}

When I use this after allocating a huge array, and then run a fullCollect() and a minimize(), the memory usage of my programs appears not to go down.
May 07 2008
dsimcha wrote:
> According to the Phobos docs, std.gc.capacity "Returns capacity (size of the memory
> block) that p points to the beginning of.  If p does not point into the gc memory pool,
> or does not point to the beginning of an allocated memory block, 0 is returned."
> I had looked through this before.

Right.  And since resizing an interior slice "in place" is typically undesirable, this does exactly what you want.  If you really want the capacity regardless of this, however, you can use tango.core.Memory.GC.query(), which returns the size of the block even for interior pointers.

> It also doesn't appear that writing my own function to do this works: [...]
> When I use this after allocating a huge array, and then run a fullCollect() and a
> minimize(), the memory usage of my programs appears not to go down.

This is somewhat of a different issue.  The Phobos GC never returns any memory to the OS.  It's also possible to make a Phobos app run out of memory simply by looping on an append op (if I remember the example correctly).  The Tango GC behaves a bit differently, and will actually return memory to the OS in certain circumstances.


Sean
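For illustration, a sketch of the two Tango calls side by side.  This assumes the block-info struct returned by GC.query() exposes a size field, as in later versions of this API; treat it as a sketch rather than a verified signature.

import tango.core.Memory;

void queryExample() {
    uint[] a = new uint[1000];
    uint[] inner = a[10 .. 20];           // inner.ptr is an interior pointer
    size_t whole = GC.sizeOf(a.ptr);      // size of the block a occupies
    size_t zero  = GC.sizeOf(inner.ptr);  // 0: not the start of an allocated block
    auto info = GC.query(inner.ptr);      // block info even for an interior pointer
    size_t blockSize = info.size;         // size of the owning block, in bytes
}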
May 07 2008
Because of some difficulties encountered using D for large arrays (see previous posts about array capacity fields), I produced the following test case that seems to be a reproducible bug in D 2.0.13.  The following program keeps allocating a huge array in a function and then letting all references to this array go out of scope.  This should result in the array being freed as soon as more memory is needed.

import std.stdio, std.gc;

void main() {
    uint count = 0;
    while(true) {
        test();
        fullCollect();
        writefln(++count);
    }
}

void test() {
    uint[] stuff = new uint[15_000_000];
}

This produced an out of memory error after 21 iterations on my machine w/ 2 GB of RAM.  Using an array size of 10_000_000 instead of 15_000_000, its memory usage stabilized at 350 megs, which seems rather large since a uint[10_000_000] should only use 40 megs, plus maybe another 40 for overhead.  Furthermore, it takes several iterations for the memory usage to reach this level.  Using a larger array size, such as 100_000_000, made this test run out of memory after even fewer iterations.  Furthermore, changing from a uint[] to real[] with a size of 15_000_000 made it run out after 8 iterations instead of 21.
May 07 2008
After further testing, I've found an exact threshold for this bug.  When an array of uints gets to 48_693_248 bytes (12_173_312 elements), this problem occurs after 26 iterations at the threshold, or fewer for larger arrays.  Anything below that, even one element smaller, and memory usage is stable over at least hundreds of iterations.  It appears that the number of bytes is the key, since a ulong[] will allow the same number of bytes (1/2 the elements) before causing problems, and a ushort[] will allow twice as many elements (same number of bytes) without crashing.  Furthermore, using equivalent sized floats instead of ints (float instead of uint, double instead of ulong), or using signed ints, has no effect.

== Quote from dsimcha (dsimcha yahoo.com)'s article
> Because of some difficulties encountered using D for large arrays (see previous posts
> about array capacity fields), I produced the following test case that seems to be a
> reproducible bug in D 2.0.13. [...] Furthermore, changing from a uint[] to real[] with
> a size of 15_000_000 made it run out after 8 iterations instead of 21.
May 07 2008
I'm not sure what to say.  This sample works fine with D 1.0 using Tango, though the memory usage is strangely high at around 370 megs.  Here's the weird thing: I tried running these two versions of the test function for comparison:

void test() {
    void* stuff = (new byte[15_000_000 * 4]).ptr;
}

void test() {
    void* stuff = GC.malloc(15_000_000 * 4, GC.BlkAttr.NO_SCAN);
}

The first one uses a stable 370 megs of memory and the second a stable 18 megs.  Obviously there's something weird going on with how arrays are handled.

dsimcha wrote:
> After further testing, I've found an exact threshold for this bug.  When an array of
> uints gets to 48_693_248 bytes (12_173_312 elements), this problem occurs after 26
> iterations at the threshold, or fewer for larger arrays. [...]
May 07 2008
Well, whatever's going on, it's not tracked by gc_stats().  I modified the previous functions slightly to ensure that both allocated exactly the same amount of memory, then verified this via a printf inside the GC code (the malloc calls were for 60000000 bytes):

void test() {
    void* stuff = (new byte[15_000_000 * 4 - 1]).ptr;
}

void test() {
    void* stuff = GC.malloc(15_000_000 * 4, GC.BlkAttr.NO_SCAN);
}

Output using gc_stats() on each iteration after the collect was identical for both function calls after things had stabilized:

*** stats ***
poolsize: 360251392
usedsize: 640
freeblocks: 56
freelistsize: 7552
pageblocks: 6

What I don't understand is why Task Manager is reporting the GC.malloc version using only 18 megs while the other uses 370, since everything I've looked at internally so far suggests they should be the same.  Perhaps gc_stats() isn't reporting something properly; I really haven't given that routine a very close look.

Sean Kelly wrote:
> I'm not sure what to say.  This sample works fine with D 1.0 using Tango, though the
> memory usage is strangely high at around 370 megs. [...] The first one uses a stable
> 370 megs of memory and the second a stable 18 megs.  Obviously there's something weird
> going on with how arrays are handled. [...]
May 08 2008
Another interesting point: this bug also occurs on GDC 0.24, but here the threshold is lower, 33_423_360 bytes.  Must be something pretty deep.

== Quote from Sean Kelly (sean invisibleduck.org)'s article
> Well, whatever's going on, it's not tracked by gc_stats().  I modified the previous
> functions slightly to ensure that both allocated exactly the same amount of memory,
> then verified this via a printf inside the GC code. [...] What I don't understand is
> why Task Manager is reporting the GC.malloc version using only 18 megs while the other
> uses 370, since everything I've looked at internally so far suggests they should be
> the same. [...]
May 08 2008
After looking into this further, it appears it's a fundamental issue with conservative GC.  I scanned the stack manually for bit patterns, and it seems that with very large arrays, one is virtually guaranteed to have some bit pattern on the stack that looks like a pointer into the array.  Manually deleting just the very large arrays is a solution I can live with, since in the type of code I typically write, all the huge arrays are only around for a short time and only accessed from one or two places.

However, this still leaves the problem of large associative arrays.  There appears to be no way of manually deleting these.  Using delete on one (ex. string[string] foo; delete foo;) produces a compile time error.  Does anyone know of another way to manually delete associative arrays?

== Quote from dsimcha (dsimcha yahoo.com)'s article
> Another interesting point: this bug also occurs on GDC 0.24, but here the threshold is
> lower, 33_423_360 bytes.  Must be something pretty deep. [...]
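Regarding the "manually deleting just the very large arrays" approach above, one pattern is to tie the explicit delete to scope exit so the block is freed deterministically on every return path.  A sketch only; crunch() is a hypothetical placeholder for whatever function holds the large temporary.

void crunch() {
    uint[] stuff = new uint[15_000_000];
    scope(exit) delete stuff;   // freed deterministically on every exit path,
                                // without relying on the conservative scan
    // ... read-only work on stuff ...
}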
May 09 2008
dsimcha Wrote:
> Does anyone know of another way to manually delete associative arrays?

A while ago I asked that question, and someone showed me a way; you can find a modified version of it as the clear() function in my libs:
http://www.fantascienza.net/leonardo/so/libs_d.zip

T clear(T)(ref T items) {
    static if (IsAA!(T)) {
        // This only works with the current AA implementation
        struct BB {
            void*[] b;
            size_t nodes;
        }

        union ToPtr_clear {
            T x;
            void* ptr;
        }

        ToPtr_clear toptr;
        toptr.x = items;
        BB* b = cast(BB*) toptr.ptr;
        if (b) {
            b.b = null;
            b.nodes = 0;
        }
        return items;
    } else static if (IsDynamicArray!(T)) {
        items.length = 0;
        return items;
    } else
        static assert(0, "clear() accepts only AAs and dynamic arrays.");
}

You can find unittests, ddocs and the IsAA!() in those libs, in the "func" module (the IsAA isn't necessary if you just want to use it for AAs).

Bye,
bearophile
May 09 2008
On 2008-05-09 18:20:19 +0200, bearophile <bearophileHUGS lycos.com> said:
> dsimcha Wrote:
>> Does anyone know of another way to manually delete associative arrays?
> A while ago I asked that question, and someone showed me a way; you can find a modified
> version of it as the clear() function in my libs:
> http://www.fantascienza.net/leonardo/so/libs_d.zip
> [...]
> Bye,
> bearophile

Nice that there is a workaround, but shouldn't an AA:
1) possibly contain pointers
2) have no external pointers pointing to its inside

So is it possible to tell the gc to ignore internal pointers to it, but still (if kept) scan all the pointers contained in it?

Fawzi
May 09 2008
== Quote from Fawzi Mohamed (fmohamed mac.com)'s article
> Nice that there is a workaround, but shouldn't an AA:
> 1) possibly contain pointers
> 2) have no external pointers pointing to its inside
> So is it possible to tell the gc to ignore internal pointers to it, but still (if kept)
> scan all the pointers contained in it?

The GC just deals in memory blocks, so the nodes of a chained AA are just blocks containing pointers.  The GC doesn't know any more than that, and it doesn't treat them any differently than any other block containing pointers.


Sean
May 09 2008
This appears to be analogous to simply setting the length field to zero in a dynamic array.  It just nulls the pointers and still requires the GC to know to free the memory properly.  On the other hand, delete for dynamic arrays is truly manual memory management and actually frees the memory in question.  The following test case illustrates my point, and fails after a few iterations.

import std.stdio, std.gc;

void main () {
    uint count;
    while(true) {
        test();
        fullCollect();
        minimize();
        writefln(++count);
    }
}

void test() {
    string[uint] foo;
    for(uint i = 1; i < 10_000_000; i++) {
        foo[i] = "Some ridiculously random text foo foo foo foo foo foo foo foo foo";
    }
    clear(foo);
}

Note: clear() is the clear function copied and pasted from bearophile.

== Quote from bearophile (bearophileHUGS lycos.com)'s article
> dsimcha Wrote:
>> Does anyone know of another way to manually delete associative arrays?
> A while ago I asked that question, and someone showed me a way; you can find a modified
> version of it as the clear() function in my libs:
> http://www.fantascienza.net/leonardo/so/libs_d.zip [...]
May 09 2008
This bug also occurs using DMD 1.029.  It seems one has to help the phobos gc in such cases (intended as a workaround).  If you explicitly delete your large array then there's no error.

---
import std.stdio, std.gc;

void main() {
    char[] buf;
    uint[] stuff;
    uint count = 0;
    while(true && count < 100) {
        testa(stuff);
        // fullCollect();
        delete stuff;
        writefln(++count);
    }
    readln(stdin, buf);
}

void testa(ref uint[] stuff) {
    stuff = new uint[15_000_000];
    // improve execution speed
    std.gc.hasNoPointers(cast(void*)stuff);
}
---

This works with DMD 1.029 and DMD 2.013 and phobos (using Win XP SP3, 4 GB RAM).

dsimcha Wrote:
> After further testing, I've found an exact threshold for this bug.  When an array of
> uints gets to 48_693_248 bytes (12_173_312 elements), this problem occurs after 26
> iterations at the threshold, or fewer for larger arrays. [...]
May 08 2008
Actually, if I stick a delete in my original test case at the end of test(), the problem seems to go away.  Furthermore, the ridiculously high memory usage even for cases of smaller arrays goes down to around 41 megs instead of 350 for a uint[10_000_000].  Adding a hasNoPointers seems unnecessary.  Why is the Phobos garbage collector so bad at freeing unused dynamic arrays?

== Quote from Sivo Schilling (sivo.schilling web.de)'s article
> This bug also occurs using DMD 1.029.  It seems one has to help the phobos gc in such
> cases (intended as a workaround).  If you explicitly delete your large array then
> there's no error. [...]
> This works with DMD 1.029 and DMD 2.013 and phobos (using Win XP SP3, 4 GB RAM). [...]
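For reference, the modified test function from the original example would simply be:

void test() {
    uint[] stuff = new uint[15_000_000];
    delete stuff;   // free the block explicitly rather than waiting on the collector
}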
May 08 2008
dsimcha and I debugged this last night.  It appears it's a case of the "partly conservative" collector problem ( http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Precise_vs._conservative_and_internal_pointers ).  Essentially, there are values on the stack (including the array's own length parameter) which, when interpreted as a pointer, point to the array.  And the bigger the array, the bigger the target and the more likely it is that at least one such pointer is found.  The issue where it works but uses too much memory seems to be a case of some allocations 'sticking' before the GC finds some clear ranges, which it uses thereafter.
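To make the length-parameter point concrete, a hedged illustration (the actual address layout depends on the OS and allocator, so this is only plausible, not guaranteed):

void illustrate() {
    uint len = 12_173_312;                // the threshold length, as stored on the stack
    void* falsePointer = cast(void*)len;  // scanned conservatively, this word is the
                                          // address 0x00B9C000, which on a 32-bit
                                          // process can easily fall inside the pool
                                          // backing a ~48 MB array, pinning the block
}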
May 09 2008
Sean Kelly wrote:
> Right.  And since resizing an interior slice "in place" is typically undesirable, this
> does exactly what you want.  If you really want the capacity regardless of this,
> however, you can use tango.core.Memory.GC.query(), which returns the size of the block
> even for interior pointers.

Can someone tell me what "interior pointers" are ?

thanks
Nick
May 08 2008
Nick B wrote:
> Can someone tell me what "interior pointers" are ?

Sure.
http://www.memorymanagement.org/glossary/i.html#interior.pointer
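A minimal D illustration of the term (the function name is just a placeholder):

void interiorExample() {
    int[] a = new int[100];
    int* p = &a[50];        // an interior pointer: it points into the middle of the
                            // allocated block rather than at its beginning
    int[] s = a[25 .. 75];  // a slice's .ptr is likewise an interior pointer
}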
May 08 2008
> This is somewhat of a different issue.  The Phobos GC never returns any memory to
> the OS.

This is my absolute favourite pet peeve with Perl; it has the same behaviour.  I'm working on a fairly large program which is used on a cluster system with several instances going at the same time.  It initially uses quite a bit of memory (a few hundred megabytes) but then releases a large part of it and lingers in this state for quite some time (waiting for other processes), but from the system's point of view it never decreases in size.

I realize that the GC is trying to be nice to me and speed up new allocs, but I would like to force it to release memory to the system.  Is this possible in Phobos or Tango?  It really should be.

Pontus.
May 09 2008
Pontus Pihlgren wrote:
> I realize that the GC is trying to be nice to me and speed up new allocs, but I would
> like to force it to release memory to the system.  Is this possible in Phobos or
> Tango?  It really should be.

The Tango GC actually does release memory back to the OS in some situations, which is why Tango doesn't explode on the example in this thread.  It also actually implements GC.minimize(), which I believe is a no-op in Phobos.

However, it's not always easy to return memory to the OS given the design of the GC in Phobos (which Tango inherited and simply modified).  Basically, memory is obtained from the OS in large pools, and then chunks of these pools are designated as free space for different sized blocks as needed.  So to return a pool to the OS (by far the easiest approach), the entire pool must obviously be free.  The problem is that the chance that a pool may have at least one chunk used for some existing memory block is pretty high in most situations.  However, sufficiently large single allocations (like the 15_000_000 uint array) are given their own pool by the GC, so when this array is collected it's possible to return its pool to the OS immediately.  This is basically what Tango does, and is likely why it doesn't crash on this sample app.

Sean
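From user code, a sketch of asking the Tango collector to do this explicitly (assuming the tango.core.Memory API of that era; releaseUnused is a hypothetical helper name):

import tango.core.Memory;

void releaseUnused() {
    GC.collect();    // reclaim unreachable blocks first
    GC.minimize();   // hand fully free pools back to the OS where possible
}

Whether minimize() actually shrinks the process depends on whole pools being free, as described above.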
May 09 2008