digitalmars.D.learn - Array length & allocation question
- Robert Atkinson (6/6) Jun 08 2006 Quick question concerning Array lengths and memory allocations.
- BCS (13/25) Jun 08 2006 Even if the buffer is there I would think that it would be faster to do
- Sean Kelly (5/12) Jun 08 2006 In most cases it's not worth it to try and maintain the buffer yourself....
- Lars Ivar Igesund (7/26) Jun 08 2006 I think the "double-the-size-when-more-is-needed" strategy is used, and
- Bruno Medeiros (6/29) Jun 11 2006 Hum, and happens when one shortens the length of the array? The Memory
- Derek Parnell (7/9) Jun 11 2006 Yes. However there is a bug (oops - an issue) in which if the length is ...
- Bruno Medeiros (5/16) Jun 12 2006 That makes perfect sense, why would it be a bug?
- Oskar Linde (14/29) Jun 12 2006 I don't know if this is what Derek refers to, but it used to be
- Derek Parnell (22/33) Jun 12 2006 Agreed, it is not a bug in the sense that it is contrary to specificatio...
- Sean Kelly (3/43) Jun 12 2006 Perhaps D arrays simply need a reserve property?
- Oskar Linde (14/15) Jun 12 2006 Something like this ought to work:
- Oskar Linde (11/29) Jun 12 2006 t arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.
- Derek Parnell (45/68) Jun 12 2006 Unfortunately this only appears to reserve the RAM, because the next cha...
- Oskar Linde (51/121) Jun 13 2006 You are right, changing length forces a reallocation. Interestingly, the...
- Derek Parnell (57/60) Jun 13 2006 Hmmm... I just rewrote that function as below and it seems to test out
- Sean Kelly (6/96) Jun 13 2006 Hrm, there were some changes to gc.d a while back, but it was more than
- Bruno Medeiros (12/23) Jun 13 2006 This is not safe to do. Currently in D null arrays and zero-length
- Oskar Linde (56/77) Jun 13 2006 Yeah, I knew about that. I did mot mean to imply that D is flawless in
- Bruno Medeiros (10/109) Jun 14 2006 Well, those new thing you mentioned are actually very related with
- Dave (16/26) Jun 08 2006 Setting the array length does just that and nothing more or less. But
- Chris Nicholson-Sauls (67/86) Jun 08 2006 So I did. :) My test program:
- Derek Parnell (9/13) Jun 08 2006 Not if you set it back to zero. If you do that, D also deallocates the
Quick question concerning array lengths and memory allocations. When an array.length = array.length + 1 (or length - 1) happens, does the system only increase (decrease) the memory allocation by 1 [unit], or does it internally maintain a buffer and try to minimise the resizing of the array? I think I can remember seeing posts saying to maintain the buffer yourself and other posts saying it was done automatically behind the scenes.
Jun 08 2006
Robert Atkinson wrote:
> When an array.length = array.length + 1 (or length - 1) happens, does the system only increase (decrease) the memory allocation by 1 [unit], or does it internally maintain a buffer and try to minimise the resizing of the array?

Even if the buffer is there, I would think that it would be faster to do it yourself, because you have more information to decide how to do it:

    char[] first = "foo bar";
    func(first[0..3]);

    char[] func(char[] inp)
    {
        // The first time around, the slice can't be extended in place,
        // and the logic to check for this would be costly.
        while (go())    // go() stands in for whatever loop condition applies
            inp.length = inp.length + 1;
        return inp;
    }
Jun 08 2006
Robert Atkinson wrote:
> When an array.length = array.length + 1 (or length - 1) happens, does the system only increase (decrease) the memory allocation by 1 [unit], or does it internally maintain a buffer and try to minimise the resizing of the array?

The latter.

> I think I can remember seeing posts saying to maintain the buffer yourself and other posts saying it was done automatically behind the scenes.

In most cases it's not worth it to try and maintain the buffer yourself. At the very least, you should test both methods and see which is faster.

Sean
Jun 08 2006
Sean Kelly wrote:
> Robert Atkinson wrote:
>> ... does it internally maintain a buffer and try to minimise the resizing of the array?
> The latter.

I think the "double-the-size-when-more-is-needed" strategy is used, and afaik, it is the one that performs best in the general case.

--
Lars Ivar Igesund
blog at http://larsivi.net
DSource & #D: larsivi
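One way to check the growth strategy on a given setup, rather than rely on memory, is to count how often the data pointer changes while appending. The following is only a sketch: it assumes a current D compiler (not the 2006 one discussed in the thread), and `countMoves` is a name made up for this illustration. In-place extension of a block does not show up in this count, so it measures relocations only.

```d
import std.stdio : writefln;

/// Appends n ints one at a time and counts how often the data pointer
/// changes, i.e. how often the runtime moved the array to a new block.
size_t countMoves(size_t n)
{
    int[] arr;
    size_t moves = 0;
    const(int)* last = arr.ptr; // null for a fresh array
    foreach (i; 0 .. n)
    {
        arr ~= cast(int) i;
        if (arr.ptr !is last)
        {
            ++moves;
            last = arr.ptr;
        }
    }
    return moves;
}

void main()
{
    foreach (n; [1_000, 100_000, 1_000_000])
        writefln("%s appends -> %s moves", n, countMoves(n));
}
```

If the number of moves grows much more slowly than the number of appends, the runtime is over-allocating behind the scenes; the exact counts depend on the compiler and runtime version.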
Jun 08 2006
Lars Ivar Igesund wrote:
> I think the "double-the-size-when-more-is-needed" strategy is used, and afaik, it is the one that performs best in the general case.

Hm, and what happens when one shortens the length of the array? Does the memory manager's "back" buffer size remain the same?

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 11 2006
On Mon, 12 Jun 2006 09:11:04 +1000, Bruno Medeiros <brunodomedeirosATgmail SPAM.com> wrote:
> Hm, and what happens when one shortens the length of the array? Does the memory manager's "back" buffer size remain the same?

Yes. However, there is a bug (oops - an issue) in which, if the length is set to zero, the RAM is released back to the system.

--
Derek Parnell
Melbourne, Australia
Jun 11 2006
Derek Parnell wrote:
> Yes. However, there is a bug (oops - an issue) in which, if the length is set to zero, the RAM is released back to the system.

That makes perfect sense, why would it be a bug?

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 12 2006
Bruno Medeiros wrote:
> Derek Parnell wrote:
>> Yes. However, there is a bug (oops - an issue) in which, if the length is set to zero, the RAM is released back to the system.
> That makes perfect sense, why would it be a bug?

I don't know if this is what Derek refers to, but it used to be recommended practice to reserve space for an array by doing:

    arr.length = 1024;
    arr.length = 0;
    (start filling arr with data)

I'm quite sure this used to be mentioned in the documentation, but I can no longer find any reference to it (except this old post: http://www.digitalmars.com/drn-bin/wwwnews?D/17691)

Today, I guess you should do the following instead:

    arr.length = 1024;
    arr = arr[0..0];
    (start filling arr with data)

/Oskar
Jun 12 2006
On Tue, 13 Jun 2006 05:27:44 +1000, Bruno Medeiros <brunodomedeirosATgmail SPAM.com> wrote:
> That makes perfect sense, why would it be a bug?

Agreed, it is not a bug in the sense of being contrary to the specification, because this behaviour isn't specified. However, it does prevent a coder from distinguishing an empty array from a null array. An empty array is one that (no longer) has any elements, and a null array is one that doesn't have any RAM to reference.

I suggest that Walter either document this functionality or fix it:

"When an array length is reduced, the RAM it owns is not released and can be reused when the array is subsequently expanded (unless the length is set to zero, in which case the RAM is released)."

Setting the length to zero is a convenient way to reserve RAM for an array.

Also consider this ...

    foo("");

Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?

    char[] x;
    foo(x);

--
Derek Parnell
Melbourne, Australia
Jun 12 2006
Derek Parnell wrote:
> I suggest that Walter either document this functionality or fix it.
> [...]
> Setting the length to zero is a convenient way to reserve RAM for an array.

Perhaps D arrays simply need a reserve property?

Sean
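For readers on a current compiler: present-day D (D2, which postdates this thread) does provide `reserve` and `capacity` for dynamic arrays. The sketch below assumes such a compiler; `appendsStayInPlace` is a hypothetical helper name used only for this illustration, and the result reflects how the runtime is documented to behave rather than anything stated in the thread.

```d
import std.stdio : writefln;

/// Reserves room for n ints up front, appends n elements, and reports
/// whether the array stayed in the block that reserve() handed out.
bool appendsStayInPlace(size_t n)
{
    int[] arr;
    arr.reserve(n);          // ask the runtime for capacity; length stays 0
    auto start = arr.ptr;
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr.ptr is start && arr.length == n;
}

void main()
{
    int[] arr;
    arr.reserve(1000);
    writefln("length %s, capacity %s", arr.length, arr.capacity);
    writefln("appends stayed in place: %s", appendsStayInPlace(1000));
}
```

The design point is the one raised in the post above: reserving is expressed directly, without having to grow and then shrink the array.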
Jun 12 2006
Sean Kelly wrote:
> Perhaps D arrays simply need a reserve property?

Something like this ought to work:

    template reserve(ArrTy,IntTy) {
        void reserve(inout ArrTy a, IntTy size) {
            if (size > a.length) {
                size_t old_length = a.length;
                a.length = size;
                a = a[0..old_length];
            }
        }
    }

usage:

    arr.reserve(1000);

/Oskar
Jun 12 2006
Derek Parnell wrote:
> I suggest that Walter either document this functionality or fix it.

I agree that it should be better documented.

> "When an array length is reduced, the RAM it owns is not released and can be reused when the array is subsequently expanded (unless the length is set to zero, in which case the RAM is released)."
>
> Setting the length to zero is a convenient way to reserve RAM for an array.

    arr.length = 100_000_000;
    arr = arr[0..0];

is almost as convenient.

> Also consider this ... foo(""); Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array? char[] x; foo(x);

Like this:

    void foo(char[] arr) {
        if (!arr)
            writefln("Uninitialized array passed");
        else if (arr.length == 0)
            writefln("Zero length array received");
    }

/Oskar
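The same check can be written with an explicit `is null` test. This is a minimal sketch assuming a current D compiler; `describe` is a name invented for the illustration. It only distinguishes the two cases asked about (a never-initialized array and an empty string literal). Whether an empty array produced by other operations keeps a non-null pointer depends on those operations, so the test should not be treated as a general guarantee.

```d
import std.stdio : writeln;

/// Classifies an array as null (no memory referenced), empty
/// (zero length but a non-null pointer), or non-empty.
string describe(const(char)[] arr)
{
    if (arr is null)
        return "null";
    if (arr.length == 0)
        return "empty";
    return "non-empty";
}

void main()
{
    char[] x;               // never initialized
    writeln(describe(x));
    writeln(describe(""));  // literal: zero length, pointer is not null
    writeln(describe("abc"));
}
```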
Jun 12 2006
On Tue, 13 Jun 2006 01:05:04 +0200, Oskar Linde wrote:
> arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.

Unfortunately this only appears to reserve the RAM, because the next change in length will cause a new allocation to be made. See the example program below ...

> Like this:
> void foo(char[] arr) { if (!arr) writefln("Uninitialized array passed"); else if (arr.length == 0) writefln("Zero length array received"); }

Yes, I can see that D can now distinguish between the two. This didn't use to be the case, IIRC. However there is still a 'bug' with this, as the program here demonstrates...

    import std.stdio;

    void main()
    {
        char[] arr;
        foo(arr);
        foo("");
        foo("".dup);
        writefln("%s %s", arr.length, arr.ptr);
        arr.length = 100;
        writefln("%s %s", arr.length, arr.ptr);
        arr = arr[0..0];
        writefln("%s %s", arr.length, arr.ptr);
        arr.length = 50;
        writefln("%s %s", arr.length, arr.ptr);
        arr.length = 500;
        writefln("%s %s", arr.length, arr.ptr);
    }

    void foo(char[] t)
    {
        writefln("foo: %s %s", t.length, t.ptr);
    }

The results are ...

    foo: 0 0000
    foo: 0 413080
    foo: 0 0000     *** A 'dup'ed empty string is now a null string.
    0 0000
    100 8A2F00
    0 8A2F00        *** RAM appears to be reserved.
    50 8A1F80       *** But it is not, as a new allocation just occurred.
    500 8A3E00      *** This allocation is expected.

--
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
13/06/2006 11:08:24 AM
Jun 12 2006
Derek Parnell wrote:
> Unfortunately this only appears to reserve the RAM, because the next change in length will cause a new allocation to be made. See the example program below ...

You are right, changing length forces a reallocation. Interestingly, the following works:

    arr.length = 100;
    arr = arr[0..0];
    writefln("%s %s",arr.length,arr.ptr);
    for (int i = 0; i < 50; i++)
        arr ~= i;
    writefln("%s %s",arr.length,arr.ptr);

prints (for me):

    0 b7ee9e00
    50 b7ee9e00

What is even more interesting is that the above "buggy" behavior seems intentional. The following patch removes the forced reallocation when changing length of a 0-length array:

    --- gc.d.orig 2006-06-04 11:50:08.979945284 +0200
    +++ gc.d 2006-06-13 09:19:02.135348959 +0200
    @@ -382,8 +382,6 @@
     }
     //printf("newsize = %x, newlength = %x\n", newsize, newlength);
    -if (p.length)
    -{
     newdata = p.data;
     if (newlength > p.length)
     {
    @@ -397,11 +395,6 @@
     }
     newdata[size .. newsize] = 0;
     }
    -}
    -else
    -{
    -newdata = cast(byte *)_gc.calloc(newsize + 1, 1);
    -}
     }
     else
     {

With this change, your above code prints:

    $build -run ./arrtest ~/dmd/src/phobos/internal/gc/gc.d
    Path and Version : build v2.9(1197)
    built on Thu Aug 11 16:07:55 2005
    foo: 0 0
    foo: 0 805765c
    foo: 0 0
    0 0
    100 b7ee8e80
    0 b7ee8e80      *** RAM is reserved
    50 b7ee8e80     *** and is used
    500 b7ee9e00    *** This causes reallocation as expected

I wonder why the code looks like it does...

/Oskar
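For comparison, the shrink-then-regrow experiment can be repeated on a current compiler, where the rules differ from the 2006 runtime examined above. The sketch below is an illustration under that assumption, and `shrinkThenAppendInPlace` is a hypothetical name. In present-day D, a slice shortened to zero length reports a `capacity` of 0 until `assumeSafeAppend` tells the runtime that the old contents may be overwritten.

```d
import std.stdio : writefln;

/// Grows an array, slices it back to zero length, and checks whether
/// later appends reuse the original block once assumeSafeAppend is called.
bool shrinkThenAppendInPlace()
{
    int[] arr;
    arr.length = 100;
    auto block = arr.ptr;

    arr = arr[0 .. 0];
    // The block still holds 100 used elements, so appending here would
    // not be allowed to overwrite them: capacity is reported as 0.
    bool lockedOut = arr.capacity == 0;

    arr.assumeSafeAppend();  // promise: nobody else relies on the old data
    foreach (i; 0 .. 50)
        arr ~= i;

    return lockedOut && arr.ptr is block && arr.length == 50;
}

void main()
{
    writefln("reused original block: %s", shrinkThenAppendInPlace());
}
```

`assumeSafeAppend` is only appropriate when no other slice still refers to the old elements, since those elements will be overwritten by the appends.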
Jun 13 2006
On Tue, 13 Jun 2006 09:24:34 +0200, Oskar Linde wrote:
> What is even more interesting is that the above "buggy" behavior seems intentional. The following patch removes the forced reallocation when changing length of a 0-length array:

Hmmm... I just rewrote that function as below and it seems to test out quite well too. I incorporated your change plus I removed the check for a zero new length. Seems to work without any problems.

-----------------
    extern (C)
    byte[] _d_arraysetlength(size_t newlength, size_t sizeelem, Array *p)
    in
    {
        assert(sizeelem);
        assert(!p.length || p.data);
    }
    body
    {
        byte* newdata;

        newdata = p.data;
        if (newlength > p.length)
        {
            version (D_InlineAsm_X86)
            {
                size_t newsize = void;

                asm
                {
                    mov EAX,newlength ;
                    mul EAX,sizeelem ;
                    mov newsize,EAX ;
                    jc Loverflow ;
                }
            }
            else
            {
                size_t newsize = sizeelem * newlength;

                if (newsize / newlength != sizeelem)
                    goto Loverflow;
            }

            size_t size = p.length * sizeelem;
            size_t cap = _gc.capacity(p.data);

            if (cap <= newsize)
            {
                newdata = cast(byte *)_gc.malloc(newsize + 1);
                newdata[0 .. size] = p.data[0 .. size];
            }
            newdata[size .. newsize] = 0;
        }

        p.data = newdata;
        p.length = newlength;
        return newdata[0 .. newlength];

    Loverflow:
        _d_OutOfMemory();
    }
---------------

--
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
13/06/2006 5:54:57 PM
Jun 13 2006
Oskar Linde wrote:
> You are right, changing length forces a reallocation.
> [...]
> What is even more interesting is that the above "buggy" behavior seems intentional.

Hrm, there were some changes to gc.d a while back, but it was more than 10 versions ago, and that's as far back as I have installed at the moment. Perhaps Walter could comment on the change? I suspect it was probably a bug fix.

Sean
Jun 13 2006
Oskar Linde wrote:
> Like this:
> void foo(char[] arr) { if (!arr) writefln("Uninitialized array passed"); else if (arr.length == 0) writefln("Zero length array received"); }

This is not safe to do. Currently in D, null arrays and zero-length arrays are conceptually the same. It just so happens that sometimes arr.ptr is null and sometimes not, depending on the previous operations. The "A 'dup'ed empty string is now a null string." result is an example of why that is not safe. I thought you knew this already? This is nothing new.

BTW, I do find it (at first sight at least) unnatural that a null array is the same as a zero-length array. It doesn't seem conceptually right/consistent.

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 13 2006
Bruno Medeiros wrote:
> This is not safe to do. Currently in D, null arrays and zero-length arrays are conceptually the same. It just so happens that sometimes arr.ptr is null and sometimes not, depending on the previous operations. The "A 'dup'ed empty string is now a null string." result is an example of why that is not safe. I thought you knew this already? This is nothing new.

Yeah, I knew about that. I did not mean to imply that D is flawless in this regard. The cases given were:

    foo("");

and

    char[] s;
    foo(s);

And for those, the above function works. My only point, if I had one, was that there are differences between zero length arrays and null arrays in some cases in D.

> BTW, I do find it (at first sight at least) unnatural that a null array is the same as a zero-length array. It doesn't seem conceptually right/consistent.

In my view, D's dynamic arrays are quite different from a conceptually ideal array. Conceptually, I see an array as an ordered collection of elements. The elements belong to (or are part of) the array. One could imagine such arrays as both value and reference types. For a reference type ideal array, there has to be a clear difference between null and zero length. A value type ideal array, on the other hand, would not need such a distinction.

Another conceptual entity apart from an array is an array view. An array view refers to a selection of indices of another array, for example a range of indices (aka a slice). An array view may or may not remain valid when the referred array changes.

D's dynamic array is quite far from my ideal array, both its reference and its value version. A closer match is actually a by-value array slice. Does it make sense for a by-value array slice type to discriminate between null and zero-length? I would say that it has its uses. For example, a regexp could match a zero length portion of a string. It is still important to know where in the string the match was made.

D's arrays have both the role of a non-reference array and of an array slice. In the role of a non-reference array, it makes sense that null is equivalent to zero-length. In the role of an array slice, on the other hand, it does make sense to discriminate between zero length and null.

There are other differences. Appending elements only makes sense to the array role, not the slice role. dup creates an array from a slice or an array. It therefore makes sense that dup returns null on zero length arrays.

The semantics of some operations depend on the role the array has. D has no way of knowing, so it guesses. Take that with a grain of salt, but operations on arrays depend on a runtime judgment by the gc.

Take the append operation. Appending elements to a D array that is in the array role makes sense and works like a charm. Appending elements to an array slice doesn't make any sense, but D will create a new array with copies of the elements the slice refers to and append the element to that array. The slice has been transformed into an array.

But how does D know when an array is in the slice role or the array role? It doesn't. Here is where the (educated) guess comes in. Any array that starts at the beginning of a gc chunk is assumed to be an array. Otherwise, it is assumed to be a slice. The implications are:

    char[] mystr = "abcd".dup;
    char[] slice1 = mystr[0..1];
    char[] slice2 = mystr[1..2];
    slice1 ~= "x"; // alters the original mystr
    slice2 ~= "y"; // doesn't alter the original

I've written too much nonsense now. Some condensed conclusions:

- D's arrays have a schizophrenic nature (slice vs array)
- The compiler is unable to tell the difference and can't protect you against mistakes
- D arrays are not self documenting:

      char[] foo(); // <- returns an array or a slice of someone else's array?

/Oskar
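The slice-append experiment is easy to rerun. The sketch below assumes a current D compiler, whose runtime tracks how much of each block is in use; under that assumption, appending to a slice that does not end at the block's used end copies the data first, so the outcome differs from the 2006 behavior described in the post. `sliceAppendLeavesOriginalIntact` is a name invented for this illustration.

```d
import std.stdio : writefln;

/// Appends to two slices of the same array and reports whether the
/// original array was left untouched by both appends.
bool sliceAppendLeavesOriginalIntact()
{
    char[] mystr = "abcd".dup;
    char[] slice1 = mystr[0 .. 1];
    char[] slice2 = mystr[1 .. 2];

    slice1 ~= "x";  // slice1 ends before the used end of the block
    slice2 ~= "y";  // same for slice2

    return mystr == "abcd" && slice1 == "ax" && slice2 == "by";
}

void main()
{
    writefln("original intact: %s", sliceAppendLeavesOriginalIntact());
}
```

On the compiler version discussed in the thread, the first append is reported to overwrite `mystr[1]`, which is exactly the ambiguity between the array role and the slice role that the post describes.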
Jun 13 2006
Oskar Linde wrote:
> In my view, D's dynamic arrays are quite different from a conceptually ideal array.
> [...]
> But how does D know when an array is in the slice role or the array role? It doesn't. Here is where the (educated) guess comes in.

Well, those new things you mentioned are actually more related to ownership management and reference/object immutability than to arrays themselves.

> Some condensed conclusions:
> - D's arrays have a schizophrenic nature (slice vs array)
> - The compiler is unable to tell the difference and can't protect you against mistakes
> - D arrays are not self documenting: char[] foo(); // <- returns an array or a slice of someone else's array?

We have often mentioned the problems of arrays (both static and dynamic) before. It should eventually be brought up for discussion with the "general" D public (although, for me, preferably not soon; I have other things to take care of).

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 14 2006
Robert Atkinson wrote:
> When an array.length = array.length + 1 (or length - 1) happens, does the system only increase (decrease) the memory allocation by 1 [unit], or does it internally maintain a buffer and try to minimise the resizing of the array?

Setting the array length does just that and nothing more or less. But using the array concatenation operator (~) will preallocate some space.

Time this:

    int[] arr;
    for(int i = 0; i < 1000000; i++)
    {
        arr.length = arr.length + 1;
        arr[i] = i;
    }

vs. this:

    int[] arr;
    for(int i = 0; i < 1000000; i++)
    {
        arr ~= i;
    }
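A small harness for timing the two loops might look like the following. It is a sketch assuming a current D compiler and its `std.datetime.stopwatch` module; the function names `fillByLength` and `fillByAppend` are invented for the illustration, and the timings will differ between compiler versions and machines.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Grows the array by one element at a time through .length.
int[] fillByLength(int n)
{
    int[] arr;
    foreach (i; 0 .. n)
    {
        arr.length = arr.length + 1;
        arr[i] = i;
    }
    return arr;
}

/// Grows the array by one element at a time through the ~= operator.
int[] fillByAppend(int n)
{
    int[] arr;
    foreach (i; 0 .. n)
        arr ~= i;
    return arr;
}

void main()
{
    enum n = 1_000_000;

    auto sw = StopWatch(AutoStart.yes);
    auto a = fillByLength(n);
    auto byLength = sw.peek();

    sw.reset();                 // restart the measurement for the second loop
    auto b = fillByAppend(n);
    auto byAppend = sw.peek();

    assert(a == b);             // both loops must build the same array
    writefln("length + 1: %s", byLength);
    writefln("~=        : %s", byAppend);
}
```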
Jun 08 2006
Dave wrote:
> Setting the array length does just that and nothing more or less. But using the array concatenation operator (~) will preallocate some space. Time this: [...] vs. this: [...]

So I did. :) My test program:

And my results, compiling with "-release -O -inline", were:

    <Benchmark Index Assign> Baseline 79.090000
    <Benchmark Index Assign> 43.830000 & 1.804472 versus baseline
    <Benchmark Index Assign> 42.570000 & 1.857881 versus baseline
    <Benchmark Index Assign> 42.560000 & 1.858318 versus baseline
    <Benchmark Index Assign> 42.410000 & 1.864890 versus baseline
    <Benchmark Index Assign> 41.680000 & 1.897553 versus baseline
    <Benchmark Index Assign> 41.640000 & 1.899376 versus baseline
    <Benchmark Index Assign> 41.580000 & 1.902116 versus baseline
    <Benchmark Index Assign> 41.580000 & 1.902116 versus baseline
    <Benchmark Index Assign> 41.680000 & 1.897553 versus baseline

    <Benchmark Cat Assign> Baseline 0.720000
    <Benchmark Cat Assign> 0.600000 & 1.200000 versus baseline
    <Benchmark Cat Assign> 0.550000 & 1.309091 versus baseline
    <Benchmark Cat Assign> 0.610000 & 1.180328 versus baseline
    <Benchmark Cat Assign> 0.600000 & 1.200000 versus baseline
    <Benchmark Cat Assign> 0.550000 & 1.309091 versus baseline
    <Benchmark Cat Assign> 0.600000 & 1.200000 versus baseline
    <Benchmark Cat Assign> 0.610000 & 1.180328 versus baseline
    <Benchmark Cat Assign> 0.600000 & 1.200000 versus baseline
    <Benchmark Cat Assign> 0.550000 & 1.309091 versus baseline

DMD 0.160, Win32. That's a rather disturbing disparity, if you ask me.

Now, what I didn't test, but probably should have, was the effect of "pre-allocating" the array by setting the .length to a large value and then back to zero, expanding the behind-the-scenes capacity of the array. I'm betting that in that case the IndexAssign would be the faster.

-- Chris Nicholson-Sauls
Jun 08 2006
On Fri, 09 Jun 2006 05:33:33 +1000, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
> Now, what I didn't test, but probably should have, was the effect of "pre-allocating" the array by setting the .length to a large value and then back to zero, expanding the behind-the-scenes capacity of the array. I'm betting that in that case the IndexAssign would be the faster.

Not if you set it back to zero. If you do that, D also deallocates the RAM. Setting its length back to 1, however, is okay, in that the allocated RAM stays allocated to the array. This means that the first element is just a dummy to get around the 'bug'.

--
Derek Parnell
Melbourne, Australia
Jun 08 2006