digitalmars.D.learn - dynamic array .length vs .reserve

digitalmars.D.learn - dynamic array .length vs .reserve - what's the difference?

wjoe (28/28) Jul 30 2020 I just stumbled upon code like this:

=?UTF-8?Q?Ali_=c3=87ehreli?= (18/20) Jul 30 2020 There may be something that I don't know but I think assigning to the=20

wjoe (30/73) Jul 30 2020 So whichever instruction is redundant depends on whether I want

=?UTF-8?Q?Ali_=c3=87ehreli?= (24/36) Jul 30 2020 Yes but the "sharing being terminated" phrase was my attempt at

wjoe (5/12) Jul 31 2020 Real shame. I quite like it - especially if you say it out loud

Steven Schveighoffer (6/17) Aug 01 2020 Just FYI, the reason this is faster is because there is no need to go

wjoe (4/22) Aug 03 2020 That's good to know, thanks :)

=?UTF-8?Q?Christian_K=c3=b6stlin?= (3/54) Aug 02 2020 That's a trick I need to remember!

ag0aep6g (6/11) Jul 31 2020 This one?

Steven Schveighoffer (26/55) Jul 30 2020 It is, in this case. Reserve will extend the allocated length to n, but
bachmeier (23/36) Jul 30 2020 Here's a simple example to see that reserve and length serve

wjoe <invalid example.com> writes:

I just stumbled upon code like this:

struct Foo(T)
{
     T[] b;

     this(int n)
     {
         b.reserve(n);
         b.length = n;
     }
}

.reserve looks redundant.

The docs are explaining .length nicely, however lack any 
specifics about reserve.

Changing the length of an array may relocate and copy. New items 
are initialized with T.init - is that true for both length and 
reserve ?

Also there's .capacity - is that equivalent to reserve ?

Another curiosity I noticed is that the docs say that the runtime 
tries to resize in place, however:

b = b[0..$-1]; if length==1 seems to collect the memory at once 
because if it's immediately followed by b.length = 1; or b ~= 
someT; b.ptr points to a new address.
Why ?

I know that b=b[0..0]; is equivalent to b = null; but I still 
don't understand the need for a new allocation when b could be 
resized in place.
How can I hold on to the memory originally reserved by b but at 
the same time allow the array to temporarily be empty?

Jul 30 2020

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 7/30/20 8:58 AM, wjoe wrote:

  =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 b.reserve(n);
  =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 b.length =3D n;

There may be something that I don't know but I think assigning to the=20
=2Elength property alone should be the same as reserving and then assigni=
ng.

reserve is supposed to make sure no memory will be allocated as elements =

are added.

capacity tells how many elements can be added without memory is=20
allocated. However, the D runtime does its best to elide actual memory=20
allocation if there is room beyond the array's last element and it's=20
safe to do so.

This article is considered a must-read for understanding what is going=20
on behind the scenes:

   https://dlang.org/articles/d-array-article.html

I tried to introduce the concept of slices "sharing elements" as well as =

how .capacity is used to determine whether sharing will be terminated, he=
re:

   http://ddili.org/ders/d.en/slices.html#ix_slices..capacity

Ali

Jul 30 2020

wjoe <invalid example.com> writes:

On Thursday, 30 July 2020 at 16:33:22 UTC, Ali Çehreli wrote:
 On 7/30/20 8:58 AM, wjoe wrote:

          b.reserve(n);
          b.length = n;

 There may be something that I don't know but I think assigning 
 to the .length property alone should be the same as reserving 
 and then assigning.

 reserve is supposed to make sure no memory will be allocated as 
 elements are added.

So whichever instruction is redundant depends on whether I want 
an array of length n, or an empty array that can grow to at least 
n elements without reallocation.

 This article is considered a must-read for understanding what 
 is going on behind the scenes:

   https://dlang.org/articles/d-array-article.html

 I tried to introduce the concept of slices "sharing elements" 
 as well as how .capacity is used to determine whether sharing 
 will be terminated, here:

   http://ddili.org/ders/d.en/slices.html#ix_slices..capacity

 Ali

These resources are great. Thanks.



On Thursday, 30 July 2020 at 16:55:35 UTC, Steven Schveighoffer 
wrote:
 On 7/30/20 11:58 AM, wjoe wrote:
 Also there's .capacity - is that equivalent to reserve ?

 Capacity tells you how many total elements the current reserved 
 space can hold. If the array is not appendable, then this 
 returns 0.

So .capacity can't be assigned a value like length to reserve the 
RAM ?

 Another curiosity I noticed is that the docs say that the 
 runtime tries to resize in place, however:
 
 b = b[0..$-1]; if length==1 seems to collect the memory at 
 once because if it's immediately followed by b.length = 1; or 
 b ~= someT; b.ptr points to a new address.
 Why ?

 Because you would be overwriting the data already in the array.

 For example:

 auto a = b;
 b = b[0 .. $-1];
 b ~= someT;

 If that last line is done in-place, then it overwrites a[$-1].

So this is a case of sharing being terminated ?

 If you know that it's OK to do this, then you should call 
 assumeSafeAppend on the array (which will adjust the "used" 
 space down to the current array).

This array is a private array of pointers to released structs - 
technically a stack.
Every time a new element is requested, the most recently released 
element would be taken off the stack and be set to the new data 
until the stack was exhausted.
Expired structs are put back into (appended to) the array for 
reuse.
When the length of the array == 0, upon releasing a struct, this 
array is reallocated which isn't supposed to happen. It should 
just grow like it did with length > 1.
assumeSafeAppend should accomplish that :)

 I know that b=b[0..0]; is equivalent to b = null;

 No, if slicing b to b[0 .. 0], b.ptr does not change. b = null 
 sets the pointer to null. In the former case, if you called 
 assumeSafeAppend on it, then it could append in-place at that 
 point. With the null pointer, it would reallocate.

 -Steve

I could swear just a few weeks ago there was someone asking how 
to tell if an array was null or of length 0 and an answer was 
that it's the same and can't be distinguished so I assumed that 
assigning a slice of 0 length is the same as setting the array to 
null because the result is the same as a 0 length array.
Thanks for the explanations and corrections.



bachmeier, a picture tells more than a thousand words. Your 
illustration is very comprehensive. Thanks :)


Thank you everyone, your input is very much appreciated :)

Jul 30 2020

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 7/30/20 4:42 PM, wjoe wrote:

 So .capacity can't be assigned a value like length to reserve the RAM ?

Yes, a read-only property...

 auto a = b;
 b = b[0 .. $-1];
 b ~= someT;

 If that last line is done in-place, then it overwrites a[$-1].

 So this is a case of sharing being terminated ?

Yes but the "sharing being terminated" phrase was my attempt at 
explaining things, which did not catch on. :)

 Expired structs are put back into (appended to) the array for reuse.
 When the length of the array == 0, upon releasing a struct, this array
 is reallocated which isn't supposed to happen. It should just grow like
 it did with length > 1.
 assumeSafeAppend should accomplish that :)

Yes, assumeSafeAppend is exactly for cases like that and it helps.

Another option, which is curiously said to be more performant in memory 
allocation than native arrays, is std.array.Appender. I've used 
function-local static Appenders to cut down on memory allocation. Here 
is an uncompiled pseudo code:

void foo() {
   static Appender!int a;
   a.clear();  // <- Clear state from last execution of this function.
               //    'a' still holds on to its memory.

   while (someCondition()) {
     a ~= 42;
   }

   // Use 'a' here
}

So, 'a' will have the longest length ever used up to this point, which 
may be exactly what is desired.

The cool thing is, because data is thread-local by-default in D, every 
thread gets their own copy of 'a', so there is not danger of data race. 
:) (Warning: Don't call foo() recursively though. ;) )

Ali

Jul 30 2020

wjoe <invalid example.com> writes:

On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote:
 Yes but the "sharing being terminated" phrase was my attempt at 
 explaining things, which did not catch on. :)

Real shame. I quite like it - especially if you say it out loud 
with an Austrian accent :)


 Another option, which is curiously said to be more performant 
 in memory allocation than native arrays, is std.array.Appender. 
 I've used function-local static Appenders to cut down on memory 
 allocation. Here is an uncompiled pseudo code:

 [...]

This looks like an even better way to do it.

Thanks, Ali :)

Jul 31 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/31/20 12:32 PM, wjoe wrote:
 On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote: 
 Another option, which is curiously said to be more performant in 
 memory allocation than native arrays, is std.array.Appender. I've used 
 function-local static Appenders to cut down on memory allocation. Here 
 is an uncompiled pseudo code:

 [...]

 
 This looks like an even better way to do it.
 
 Thanks, Ali :)

Just FYI, the reason this is faster is because there is no need to go 
through the opaque calls into druntime to figure out if 
appending-in-place is possible. The reserved length is stored directly 
in the struct.

-Steve

Aug 01 2020

wjoe <invalid example.com> writes:

On Saturday, 1 August 2020 at 16:04:01 UTC, Steven Schveighoffer 
wrote:
 On 7/31/20 12:32 PM, wjoe wrote:
 On Friday, 31 July 2020 at 04:28:57 UTC, Ali Çehreli wrote:
 Another option, which is curiously said to be more performant 
 in memory allocation than native arrays, is 
 std.array.Appender. I've used function-local static Appenders 
 to cut down on memory allocation. Here is an uncompiled 
 pseudo code:

 [...]

 
 This looks like an even better way to do it.
 
 Thanks, Ali :)

 Just FYI, the reason this is faster is because there is no need 
 to go through the opaque calls into druntime to figure out if 
 appending-in-place is possible. The reserved length is stored 
 directly in the struct.

 -Steve

That's good to know, thanks :)
By the looks of it, Appender is half duplicating the runtime :)

Aug 03 2020

=?UTF-8?Q?Christian_K=c3=b6stlin?= <christian.koestlin gmail.com> writes:

On 31.07.20 06:28, Ali Çehreli wrote:
 On 7/30/20 4:42 PM, wjoe wrote:

  > So .capacity can't be assigned a value like length to reserve the RAM ?

 Yes, a read-only property...

  >> auto a = b;
  >> b = b[0 .. $-1];
  >> b ~= someT;
  >>
  >> If that last line is done in-place, then it overwrites a[$-1].
  >
  > So this is a case of sharing being terminated ?

 Yes but the "sharing being terminated" phrase was my attempt at 
 explaining things, which did not catch on. :)

  > Expired structs are put back into (appended to) the array for reuse.
  > When the length of the array == 0, upon releasing a struct, this array
  > is reallocated which isn't supposed to happen. It should just grow like
  > it did with length > 1.
  > assumeSafeAppend should accomplish that :)

 Yes, assumeSafeAppend is exactly for cases like that and it helps.

 Another option, which is curiously said to be more performant in memory 
 allocation than native arrays, is std.array.Appender. I've used 
 function-local static Appenders to cut down on memory allocation. Here 
 is an uncompiled pseudo code:

 void foo() {
    static Appender!int a;
    a.clear();  // <- Clear state from last execution of this function.
                //    'a' still holds on to its memory.

    while (someCondition()) {
      a ~= 42;
    }

    // Use 'a' here
 }

 So, 'a' will have the longest length ever used up to this point, which 
 may be exactly what is desired.

 The cool thing is, because data is thread-local by-default in D, every 
 thread gets their own copy of 'a', so there is not danger of data race. 
 :) (Warning: Don't call foo() recursively though. ;) )

 Ali

That's a trick I need to remember!

Thank Ali!

Aug 02 2020

ag0aep6g <anonymous example.com> writes:

On 31.07.20 01:42, wjoe wrote:
 I could swear just a few weeks ago there was someone asking how to tell 
 if an array was null or of length 0 and an answer was that it's the same 
 and can't be distinguished so I assumed that assigning a slice of 0 
 length is the same as setting the array to null because the result is 
 the same as a 0 length array.

This one?
https://forum.dlang.org/thread/xgnbzpziqmjyjfsqlzfi forum.dlang.org

`[]` is the same as `null`.

While `b[0 .. 0]` is empty like `[]`, it is not the same (unless `b` is 
already `[]`/`null`). The `.ptr`s are different.

Jul 31 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/30/20 11:58 AM, wjoe wrote:
 I just stumbled upon code like this:
 
 struct Foo(T)
 {
      T[] b;
 
      this(int n)
      {
          b.reserve(n);
          b.length = n;
      }
 }
 
 ..reserve looks redundant.

It is, in this case. Reserve will extend the allocated length to n, but 
not adjust the slice, and setting the length will adjust the slice and 
consume that data from the block. Just setting length has the same effect.

 
 The docs are explaining .length nicely, however lack any specifics about 
 reserve.
 
 Changing the length of an array may relocate and copy. New items are 
 initialized with T.init - is that true for both length and reserve ?

reserve preallocates data for use in appending, but does not alter the 
length of the specified array.

It's like saying, "I'm going to append enough elements to this array so 
the total length is N." It's very much tied to appending.

Note that reserve may do nothing, if there is already at least N 
elements available.

 Also there's .capacity - is that equivalent to reserve ?

Capacity tells you how many total elements the current reserved space 
can hold. If the array is not appendable, then this returns 0.

 Another curiosity I noticed is that the docs say that the runtime tries 
 to resize in place, however:
 
 b = b[0..$-1]; if length==1 seems to collect the memory at once because 
 if it's immediately followed by b.length = 1; or b ~= someT; b.ptr 
 points to a new address.
 Why ?

Because you would be overwriting the data already in the array.

For example:

auto a = b;
b = b[0 .. $-1];
b ~= someT;

If that last line is done in-place, then it overwrites a[$-1].

If you know that it's OK to do this, then you should call 
assumeSafeAppend on the array (which will adjust the "used" space down 
to the current array).

 I know that b=b[0..0]; is equivalent to b = null;

No, if slicing b to b[0 .. 0], b.ptr does not change. b = null sets the 
pointer to null. In the former case, if you called assumeSafeAppend on 
it, then it could append in-place at that point. With the null pointer, 
it would reallocate.

-Steve

Jul 30 2020

bachmeier <no spam.net> writes:

On Thursday, 30 July 2020 at 15:58:28 UTC, wjoe wrote:
 I just stumbled upon code like this:

 struct Foo(T)
 {
     T[] b;

     this(int n)
     {
         b.reserve(n);
         b.length = n;
     }
 }

 .reserve looks redundant.

 The docs are explaining .length nicely, however lack any 
 specifics about reserve.

Here's a simple example to see that reserve and length serve 
different purposes. In particular, reserve doesn't fill any 
elements but allocates enough memory, while setting length puts 
values into the elements:

import std.stdio;

void main() {
	double[] x;
     writeln(x.length);
     writeln(x.capacity);
     x.reserve(50);
     writeln(x.length);
     writeln(x.capacity);
     x ~= 1.5;
     writeln(x);

     double[] y;
     y.length = 50;
     writeln(y.length);
     writeln(y.capacity);
     y ~= 1.5;
     writeln(y);
}

https://run.dlang.io/is/1EWf5R

Jul 30 2020

D Programming

C/C++ Programming

Other

digitalmars.D.learn - dynamic array .length vs .reserve - what's the difference?