www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - dynamic array .length vs .reserve - what's the difference?

reply wjoe <invalid example.com> writes:
I just stumbled upon code like this:

struct Foo(T)
{
     T[] b;

     this(int n)
     {
         b.reserve(n);
         b.length = n;
     }
}

.reserve looks redundant.

The docs are explaining .length nicely, however lack any 
specifics about reserve.

Changing the length of an array may relocate and copy. New items 
are initialized with T.init - is that true for both length and 
reserve ?

Also there's .capacity - is that equivalent to reserve ?

Another curiosity I noticed is that the docs say that the runtime 
tries to resize in place, however:

b = b[0..$-1]; if length==1 seems to collect the memory at once 
because if it's immediately followed by b.length = 1; or b ~= 
someT; b.ptr points to a new address.
Why ?

I know that b=b[0..0]; is equivalent to b = null; but I still 
don't understand the need for a new allocation when b could be 
resized in place.
How can I hold on to the memory originally reserved by b but at 
the same time allow the array to temporarily be empty?
Jul 30 2020
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 7/30/20 8:58 AM, wjoe wrote:

  =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 b.reserve(n);
  =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 b.length =3D n;
There may be something that I don't know but I think assigning to the=20 =2Elength property alone should be the same as reserving and then assigni= ng. reserve is supposed to make sure no memory will be allocated as elements = are added. capacity tells how many elements can be added without memory is=20 allocated. However, the D runtime does its best to elide actual memory=20 allocation if there is room beyond the array's last element and it's=20 safe to do so. This article is considered a must-read for understanding what is going=20 on behind the scenes: https://dlang.org/articles/d-array-article.html I tried to introduce the concept of slices "sharing elements" as well as = how .capacity is used to determine whether sharing will be terminated, he= re: http://ddili.org/ders/d.en/slices.html#ix_slices..capacity Ali
Jul 30 2020
parent reply wjoe <invalid example.com> writes:
On Thursday, 30 July 2020 at 16:33:22 UTC, Ali Çehreli wrote:
 On 7/30/20 8:58 AM, wjoe wrote:

          b.reserve(n);
          b.length = n;
There may be something that I don't know but I think assigning to the .length property alone should be the same as reserving and then assigning. reserve is supposed to make sure no memory will be allocated as elements are added.
So whichever instruction is redundant depends on whether I want an array of length n, or an empty array that can grow to at least n elements without reallocation.
 This article is considered a must-read for understanding what 
 is going on behind the scenes:

   https://dlang.org/articles/d-array-article.html

 I tried to introduce the concept of slices "sharing elements" 
 as well as how .capacity is used to determine whether sharing 
 will be terminated, here:

   http://ddili.org/ders/d.en/slices.html#ix_slices..capacity

 Ali
These resources are great. Thanks. On Thursday, 30 July 2020 at 16:55:35 UTC, Steven Schveighoffer wrote:
 On 7/30/20 11:58 AM, wjoe wrote:
 Also there's .capacity - is that equivalent to reserve ?
Capacity tells you how many total elements the current reserved space can hold. If the array is not appendable, then this returns 0.
So .capacity can't be assigned a value like length to reserve the RAM ?
 Another curiosity I noticed is that the docs say that the 
 runtime tries to resize in place, however:
 
 b = b[0..$-1]; if length==1 seems to collect the memory at 
 once because if it's immediately followed by b.length = 1; or 
 b ~= someT; b.ptr points to a new address.
 Why ?
Because you would be overwriting the data already in the array. For example: auto a = b; b = b[0 .. $-1]; b ~= someT; If that last line is done in-place, then it overwrites a[$-1].
So this is a case of sharing being terminated ?
 If you know that it's OK to do this, then you should call 
 assumeSafeAppend on the array (which will adjust the "used" 
 space down to the current array).
This array is a private array of pointers to released structs - technically a stack. Every time a new element is requested, the most recently released element would be taken off the stack and be set to the new data until the stack was exhausted. Expired structs are put back into (appended to) the array for reuse. When the length of the array == 0, upon releasing a struct, this array is reallocated which isn't supposed to happen. It should just grow like it did with length > 1. assumeSafeAppend should accomplish that :)
 I know that b=b[0..0]; is equivalent to b = null;
No, if slicing b to b[0 .. 0], b.ptr does not change. b = null sets the pointer to null. In the former case, if you called assumeSafeAppend on it, then it could append in-place at that point. With the null pointer, it would reallocate. -Steve
I could swear just a few weeks ago there was someone asking how to tell if an array was null or of length 0 and an answer was that it's the same and can't be distinguished so I assumed that assigning a slice of 0 length is the same as setting the array to null because the result is the same as a 0 length array. Thanks for the explanations and corrections. bachmeier, a picture tells more than a thousand words. Your illustration is very comprehensive. Thanks :) Thank you everyone, your input is very much appreciated :)
Jul 30 2020
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 7/30/20 4:42 PM, wjoe wrote:

 So .capacity can't be assigned a value like length to reserve the RAM ?
Yes, a read-only property...
 auto a = b;
 b = b[0 .. $-1];
 b ~= someT;

 If that last line is done in-place, then it overwrites a[$-1].
So this is a case of sharing being terminated ?
Yes but the "sharing being terminated" phrase was my attempt at explaining things, which did not catch on. :)
 Expired structs are put back into (appended to) the array for reuse.
 When the length of the array == 0, upon releasing a struct, this array
 is reallocated which isn't supposed to happen. It should just grow like
 it did with length > 1.
 assumeSafeAppend should accomplish that :)
Yes, assumeSafeAppend is exactly for cases like that and it helps. Another option, which is curiously said to be more performant in memory allocation than native arrays, is std.array.Appender. I've used function-local static Appenders to cut down on memory allocation. Here is an uncompiled pseudo code: void foo() { static Appender!int a; a.clear(); // <- Clear state from last execution of this function. // 'a' still holds on to its memory. while (someCondition()) { a ~= 42; } // Use 'a' here } So, 'a' will have the longest length ever used up to this point, which may be exactly what is desired. The cool thing is, because data is thread-local by-default in D, every thread gets their own copy of 'a', so there is not danger of data race. :) (Warning: Don't call foo() recursively though. ;) ) Ali
Jul 30 2020
next sibling parent reply wjoe <invalid example.com> writes:
On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote:
 Yes but the "sharing being terminated" phrase was my attempt at 
 explaining things, which did not catch on. :)
Real shame. I quite like it - especially if you say it out loud with an Austrian accent :)
 Another option, which is curiously said to be more performant 
 in memory allocation than native arrays, is std.array.Appender. 
 I've used function-local static Appenders to cut down on memory 
 allocation. Here is an uncompiled pseudo code:

 [...]
This looks like an even better way to do it. Thanks, Ali :)
Jul 31 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/31/20 12:32 PM, wjoe wrote:
 On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote: 
 Another option, which is curiously said to be more performant in 
 memory allocation than native arrays, is std.array.Appender. I've used 
 function-local static Appenders to cut down on memory allocation. Here 
 is an uncompiled pseudo code:

 [...]
This looks like an even better way to do it. Thanks, Ali :)
Just FYI, the reason this is faster is because there is no need to go through the opaque calls into druntime to figure out if appending-in-place is possible. The reserved length is stored directly in the struct. -Steve
Aug 01 2020
parent wjoe <invalid example.com> writes:
On Saturday, 1 August 2020 at 16:04:01 UTC, Steven Schveighoffer 
wrote:
 On 7/31/20 12:32 PM, wjoe wrote:
 On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote:
 Another option, which is curiously said to be more performant 
 in memory allocation than native arrays, is 
 std.array.Appender. I've used function-local static Appenders 
 to cut down on memory allocation. Here is an uncompiled 
 pseudo code:

 [...]
This looks like an even better way to do it. Thanks, Ali :)
Just FYI, the reason this is faster is because there is no need to go through the opaque calls into druntime to figure out if appending-in-place is possible. The reserved length is stored directly in the struct. -Steve
That's good to know, thanks :) By the looks of it, Appender is half duplicating the runtime :)
Aug 03 2020
prev sibling parent =?UTF-8?Q?Christian_K=c3=b6stlin?= <christian.koestlin gmail.com> writes:
On 31.07.20 06:28, Ali Çehreli wrote:
 On 7/30/20 4:42 PM, wjoe wrote:
 
  > So .capacity can't be assigned a value like length to reserve the RAM ?
 
 Yes, a read-only property...
 
  >> auto a = b;
  >> b = b[0 .. $-1];
  >> b ~= someT;
  >>
  >> If that last line is done in-place, then it overwrites a[$-1].
  >
  > So this is a case of sharing being terminated ?
 
 Yes but the "sharing being terminated" phrase was my attempt at 
 explaining things, which did not catch on. :)
 
  > Expired structs are put back into (appended to) the array for reuse.
  > When the length of the array == 0, upon releasing a struct, this array
  > is reallocated which isn't supposed to happen. It should just grow like
  > it did with length > 1.
  > assumeSafeAppend should accomplish that :)
 
 Yes, assumeSafeAppend is exactly for cases like that and it helps.
 
 Another option, which is curiously said to be more performant in memory 
 allocation than native arrays, is std.array.Appender. I've used 
 function-local static Appenders to cut down on memory allocation. Here 
 is an uncompiled pseudo code:
 
 void foo() {
    static Appender!int a;
    a.clear();  // <- Clear state from last execution of this function.
                //    'a' still holds on to its memory.
 
    while (someCondition()) {
      a ~= 42;
    }
 
    // Use 'a' here
 }
 
 So, 'a' will have the longest length ever used up to this point, which 
 may be exactly what is desired.
 
 The cool thing is, because data is thread-local by-default in D, every 
 thread gets their own copy of 'a', so there is not danger of data race. 
 :) (Warning: Don't call foo() recursively though. ;) )
 
 Ali
 
That's a trick I need to remember! Thank Ali!
Aug 02 2020
prev sibling parent ag0aep6g <anonymous example.com> writes:
On 31.07.20 01:42, wjoe wrote:
 I could swear just a few weeks ago there was someone asking how to tell 
 if an array was null or of length 0 and an answer was that it's the same 
 and can't be distinguished so I assumed that assigning a slice of 0 
 length is the same as setting the array to null because the result is 
 the same as a 0 length array.
This one? https://forum.dlang.org/thread/xgnbzpziqmjyjfsqlzfi forum.dlang.org `[]` is the same as `null`. While `b[0 .. 0]` is empty like `[]`, it is not the same (unless `b` is already `[]`/`null`). The `.ptr`s are different.
Jul 31 2020
prev sibling next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/30/20 11:58 AM, wjoe wrote:
 I just stumbled upon code like this:
 
 struct Foo(T)
 {
      T[] b;
 
      this(int n)
      {
          b.reserve(n);
          b.length = n;
      }
 }
 
 ..reserve looks redundant.
It is, in this case. Reserve will extend the allocated length to n, but not adjust the slice, and setting the length will adjust the slice and consume that data from the block. Just setting length has the same effect.
 
 The docs are explaining .length nicely, however lack any specifics about 
 reserve.
 
 Changing the length of an array may relocate and copy. New items are 
 initialized with T.init - is that true for both length and reserve ?
reserve preallocates data for use in appending, but does not alter the length of the specified array. It's like saying, "I'm going to append enough elements to this array so the total length is N." It's very much tied to appending. Note that reserve may do nothing, if there is already at least N elements available.
 Also there's .capacity - is that equivalent to reserve ?
Capacity tells you how many total elements the current reserved space can hold. If the array is not appendable, then this returns 0.
 Another curiosity I noticed is that the docs say that the runtime tries 
 to resize in place, however:
 
 b = b[0..$-1]; if length==1 seems to collect the memory at once because 
 if it's immediately followed by b.length = 1; or b ~= someT; b.ptr 
 points to a new address.
 Why ?
Because you would be overwriting the data already in the array. For example: auto a = b; b = b[0 .. $-1]; b ~= someT; If that last line is done in-place, then it overwrites a[$-1]. If you know that it's OK to do this, then you should call assumeSafeAppend on the array (which will adjust the "used" space down to the current array).
 I know that b=b[0..0]; is equivalent to b = null;
No, if slicing b to b[0 .. 0], b.ptr does not change. b = null sets the pointer to null. In the former case, if you called assumeSafeAppend on it, then it could append in-place at that point. With the null pointer, it would reallocate. -Steve
Jul 30 2020
prev sibling parent bachmeier <no spam.net> writes:
On Thursday, 30 July 2020 at 15:58:28 UTC, wjoe wrote:
 I just stumbled upon code like this:

 struct Foo(T)
 {
     T[] b;

     this(int n)
     {
         b.reserve(n);
         b.length = n;
     }
 }

 .reserve looks redundant.

 The docs are explaining .length nicely, however lack any 
 specifics about reserve.
Here's a simple example to see that reserve and length serve different purposes. In particular, reserve doesn't fill any elements but allocates enough memory, while setting length puts values into the elements: import std.stdio; void main() { double[] x; writeln(x.length); writeln(x.capacity); x.reserve(50); writeln(x.length); writeln(x.capacity); x ~= 1.5; writeln(x); double[] y; y.length = 50; writeln(y.length); writeln(y.capacity); y ~= 1.5; writeln(y); } https://run.dlang.io/is/1EWf5R
Jul 30 2020