digitalmars.D - Memory footprint of multi-dimensional arrays
- Ben Mitchell (6/6) Apr 18 2008 So, I was watching my D program eat up ~1GB of ram doing something my ba...
- BCS (9/20) Apr 18 2008 one (kinda ugly) solution would be to allocate the whole thing as a
- Simen Kjaeraas (38/64) Apr 18 2008 :
- BCS (5/10) Apr 18 2008 without going to special types, it's a small as you will get, no extra
- Ben Mitchell (3/34) Apr 18 2008 Yeah, that was more or less what I'd come up with, too; it just seemed s...
- Bill Baxter (8/43) Apr 18 2008 May be too heavy-weight for your needs, but my Murray lib supports
- Sean Kelly (18/23) Apr 18 2008 calculation said should only be taking a few hundred MB max, and I start...
- Fawzi Mohamed (4/26) Apr 19 2008 you probably want to look at the multiarray library:
- Bruno Medeiros (5/8) Apr 25 2008 ubyte[3][] x = new ubyte[3][](10485760);
So, I was watching my D program eat up ~1GB of ram doing something my back of the envelope calculation said should only be taking a few hundred MB max, and I started playing around. Turns out that if I do:

ubyte[][] b = new ubyte[][](3, 10485760);

It takes about 30MB of main memory, as would be expected. If, on the other hand, I reverse the order of the indexing and do:

ubyte[][] b = new ubyte[][](10485760, 3);

It takes about 250MB of main memory. This seems totally absurd to me; I can only assume this is because it's an array of (potentially arbitrary size) dynamic arrays, and there's ~220MB of bookkeeping information being stored. Is there any way around this? Can I somehow tell the compiler it's a symmetric array? Can I make a dynamic array of static arrays of size 3? I can reverse the indexing if I have to, but it would make the rest of my code very counter-intuitive, and it just doesn't seem like it should make an order of magnitude difference in memory usage.

 - Ben
Apr 18 2008
Ben Mitchell wrote:
> So, I was watching my D program eat up ~1GB of ram doing something my
> back of the envelope calculation said should only be taking a few
> hundred MB max [...] Can I make a dynamic array of static arrays of
> size 3?

one (kinda ugly) solution would be to allocate the whole thing as a single huge ubyte[] array and slice out the 1M 3 byte arrays:

auto tmp = new ubyte[](3*10485760 + 5);
ubyte[][] b;
b.length = 10485760;
foreach(uint i, ref bp; b)
    bp = tmp[i*3 .. (i+1)*3];

A little template magic might be able to do this recursively for nested dynamic arrays
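The "template magic" could look roughly like this (a hypothetical sketch, untested; the helper name `flatAlloc` is illustrative only):

```d
// Allocate rows*cols elements in one contiguous block, then hand out
// slices into it -- only two GC allocations instead of rows+1.
T[][] flatAlloc(T)(size_t rows, size_t cols)
{
    auto data = new T[](rows * cols);  // one big flat buffer
    auto result = new T[][](rows);     // one table of slices
    foreach (i, ref row; result)
        row = data[i * cols .. (i + 1) * cols];
    return result;
}

// usage:
// auto b = flatAlloc!(ubyte)(10485760, 3);
```

Note this still pays for the slice table itself (10485760 pointer/length pairs), but avoids the per-row allocation and block-rounding overhead.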
Apr 18 2008
On Fri, 18 Apr 2008 18:15:45 +0200, BCS <BCS pathlink.com> wrote:
> one (kinda ugly) solution would be to allocate the whole thing as a
> single huge ubyte[] array and slice out the 1M 3 byte arrays:
>
> auto tmp = new ubyte[](3*10485760 + 5);
> ubyte[][] b;
> b.length = 10485760;
> foreach(uint i, ref bp; b)
>     bp = tmp[i*3 .. (i+1)*3];
>
> A little template magic might be able to do this recursively for
> nested dynamic arrays

Wouldn't that end up taking just as much space? b would be an array of 10485760 slices, each having a pointer/length pair taking up 8 bytes each, no? Might save some due to GC wanting bigger slices than 3 bytes, but still.

Now, using a struct with overloaded opIndex (and opSlice, if you want) should be lighter. Something like this (untested)?
struct Foo
{
    ubyte[] data;
    size_t sz1, sz2;

    static Foo opCall(size_t size1, size_t size2)
    {
        Foo tmp;
        tmp.data.length = size1 * size2;
        tmp.sz1 = size1;
        tmp.sz2 = size2;
        return tmp;
    }

    // row `index` is sz2 consecutive elements in the flat buffer
    ubyte[] opIndex(size_t index)
    {
        return data[index * sz2 .. (index + 1) * sz2];
    }

    void opIndexAssign(ubyte[] value, size_t index)
    {
        data[index * sz2 .. (index + 1) * sz2] = value;
    }
}

-- Simen
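Used from calling code, a wrapper struct like this would look roughly as follows (a sketch, assuming the index arithmetic uses the stored sz2 as the row length):

```d
// Foo stores all rows in one flat ubyte[] buffer; opIndex returns a
// slice into that buffer, so reads and writes go straight through.
auto b = Foo(10485760, 3);   // 10485760 rows of 3 bytes, one allocation

ubyte[] row = b[0];          // slice into the flat buffer, no copy
row[1] = 42;                 // writes through to b's storage
assert(b[0][1] == 42);
```

The total footprint is one ~30MB buffer plus a couple of size_t fields, at the cost of giving up the built-in ubyte[][] syntax.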
Apr 18 2008
Simen Kjaeraas wrote:
> Wouldn't that end up taking just as much space? b would be an array of
> 10485760 slices, each having a pointer/length pair taking up 8 bytes
> each, no?

yup.

> Might save some due to GC wanting bigger slices than 3 bytes, but still.

without going to special types, it's as small as you will get; no extra GC overhead in alignment or bookkeeping. (I guess you could allocate the whole thing in one go, but then you mix pointers and non-pointers.)
Apr 18 2008
Simen Kjaeraas Wrote:
> Now, using a struct with overloaded opIndex (and opSlice, if you want)
> should be lighter. Something like this (untested)?
> [struct Foo snipped]

Yeah, that was more or less what I'd come up with, too; it just seemed sort of inelegant, and I was hoping that there was some clean, built-in way to do this that I just didn't know about. If there's not, I think that some built-in syntax for non-ragged multi-dimensional arrays would be nice, given that it's a pretty common thing to need in scientific computation, and it doesn't seem like it should be that hard an addition to make.

 - Ben
Apr 18 2008
Ben Mitchell wrote:
> Yeah, that was more or less what I'd come up with, too; it just seemed
> sort of inelegant, and I was hoping that there was some clean, built-in
> way to do this that I just didn't know about.

May be too heavy-weight for your needs, but my Murray lib supports arbitrary N-dimensional arrays using a single-allocation strategy like that. And my DFLAT lib supports matrices (just 2-dimensional arrays) in a similar way (but also supports different storage formats like banded, triangular, and various sparse matrix formats).

both are at http://www.dsource.org/projects/multiarray

--bb
Apr 18 2008
== Quote from Ben Mitchell (Arisian gmail.com)'s article
> ubyte[][] b = new ubyte[][](3, 10485760);
> It takes about 30MB of main memory, as would be expected. If, on the
> other hand, I reverse the order of the indexing and do:
> ubyte[][] b = new ubyte[][](10485760, 3);
> It takes about 250MB of main memory.

In the first case, you have an array containing three slices (pointer plus length, 8 bytes each on 32-bit) referencing arrays of 10485760 bytes each, for a minimum total of (3 * 8) + (3 * 10485760) bytes -- around 30 MB. In the second case, you have an array containing 10485760 slices referencing arrays of 3 bytes each, for a minimum total of (10485760 * 8) + (10485760 * 3) -- about 110 MB. Now given the actual GC implementation, the minimum block size allocated is actually 16 bytes rather than 3, so the data term is really (10485760 * 16), bringing the memory used to roughly 240 MB by my estimation -- in line with what you're seeing. The basic issue here is that this is a ragged array, so the memory is not allocated in one large contiguous block but rather a bunch of linked dynamic blocks.

Sean
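The overhead can be sanity-checked with quick arithmetic (a sketch; assumes 32-bit slices of 8 bytes and a 16-byte minimum GC block, as discussed above):

```d
// Back-of-the-envelope footprint of ubyte[][](10485760, 3).
const size_t n = 10485760;
size_t sliceTable = n * 8;   // 10485760 pointer/length pairs: ~80 MB
size_t dataBlocks = n * 16;  // each 3-byte row rounds up to 16 bytes: ~160 MB
// total ~240 MB, versus ~30 MB for the transposed (3, 10485760) layout
```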
Apr 18 2008
On 2008-04-18 18:04:17 +0200, Ben Mitchell <Arisian gmail.com> said:
> Is there any way around this? Can I somehow tell the compiler it's a
> symmetric array? Can I make a dynamic array of static arrays of size 3?

you probably want to look at the multiarray library:
http://www.dsource.org/projects/multiarray/

Fawzi
Apr 19 2008
Ben Mitchell wrote:
> Can I make a dynamic array of static arrays of size 3?

ubyte[3][] x = new ubyte[3][](10485760);

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
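Since ubyte[3] is a static array, each element is stored inline, so the whole thing is one contiguous allocation with no per-row slice headers. Roughly (a sketch, untested here):

```d
// One allocation of ~30 MB total: 10485760 rows of 3 bytes, inline.
ubyte[3][] x = new ubyte[3][](10485760);

x[0][2] = 7;                   // plain 2-D indexing still works
assert(x.length == 10485760);
assert(x[0].sizeof == 3);      // each row is exactly 3 bytes, no header
```

The trade-off is that the inner dimension must be known at compile time, which fits this case exactly.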
Apr 25 2008