digitalmars.D.learn - dynamic array capacity
- spir (14/14) Dec 29 2010 Hello,
- Steven Schveighoffer (8/12) Dec 29 2010 int[] x;
- %u (2/15) Dec 29 2010 Oh, wow! D2 really kicks D1's ass here..
- spir (9/28) Dec 29 2010
- spir (70/89) Dec 29 2010
- Steven Schveighoffer (14/28) Dec 29 2010 What is the baseline for this? I.e. what is it 20% slower than? FWIW,
- spir (36/68) Dec 29 2010
- Steven Schveighoffer (35/84) Dec 29 2010 Then I would be suspect of your test. In simple tests I just ran,
Hello,

Is there a common idiom to pre-allocate a dynamic array? I mean allocating to avoid numerous re-allocations in a loop, not setting length & filling content. The differences are:
(0) no content --> no init
(1) one can simply append, instead of setting elements at defined indices
(2) one can append sub-sequences
(3) when the initially requested capacity is full, the array automagically resizes as needed (using D's builtin dynarray realloc scheme).

Denis
-- -- -- -- -- -- --
vit esse estrany
☣ spir.wikidot.com
Dec 29 2010
On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:
> Hello, Is there a common idiom to pre-allocate a dynamic array? I mean
> allocating to avoid numerous re-allocations in a loop, not setting length
> & filling content.

int[] x;
x.reserve(10000); // pre-allocate at least 10000 elements for appending

Another way to do it:

Appender!(int[]) app;
app.reserve(10000);

this will perform much better than builtin array appending.

-Steve
Dec 29 2010
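A minimal, self-contained sketch of the two approaches above, assuming the Appender struct, the appender helper and the .data property from std.array are available in the Phobos version in use; the 10_000 size is only illustrative:

import std.array : Appender, appender;
import std.stdio : writeln;

void main()
{
    // Builtin dynamic array with reserve: capacity is requested up front,
    // so appends up to that capacity need not reallocate.
    int[] x;
    x.reserve(10_000);
    foreach (i; 0 .. 10_000)
        x ~= i;

    // std.array.Appender: keeps its own capacity bookkeeping and skips the
    // per-append runtime lookup that builtin ~= performs.
    auto app = appender!(int[])();
    app.reserve(10_000);
    foreach (i; 0 .. 10_000)
        app.put(i);

    writeln(x.length, " ", app.data.length); // 10000 10000
}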
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
> On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:
>> Hello, Is there a common idiom to pre-allocate a dynamic array? I mean
>> allocating to avoid numerous re-allocations in a loop, not setting length
>> & filling content.
>
> int[] x;
> x.reserve(10000); // pre-allocate at least 10000 elements for appending
>
> Another way to do it:
>
> Appender!(int[]) app;
> app.reserve(10000);
>
> this will perform much better than builtin array appending.
>
> -Steve

Oh, wow! D2 really kicks D1's ass here..
Dec 29 2010
On Wed, 29 Dec 2010 11:24:01 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

> On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:
>> Hello, Is there a common idiom to pre-allocate a dynamic array? I mean
>> allocating to avoid numerous re-allocations in a loop, not setting length
>> & filling content.
>
> int[] x;
> x.reserve(10000); // pre-allocate at least 10000 elements for appending
>
> Another way to do it:
>
> Appender!(int[]) app;
> app.reserve(10000);
>
> this will perform much better than builtin array appending.
>
> -Steve

Thank you, Steven. I could not find anything like 'reserve' in the property table because it's not a property ;-)

Denis
-- -- -- -- -- -- --
vit esse estrany
☣ spir.wikidot.com
Dec 29 2010
On Wed, 29 Dec 2010 11:24:01 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

> On Wed, 29 Dec 2010 07:29:29 -0500, spir <denis.spir gmail.com> wrote:
>> Hello, Is there a common idiom to pre-allocate a dynamic array? I mean
>> allocating to avoid numerous re-allocations in a loop, not setting length
>> & filling content.
>
> int[] x;
> x.reserve(10000); // pre-allocate at least 10000 elements for appending
>
> Another way to do it:
>
> Appender!(int[]) app;
> app.reserve(10000);
>
> this will perform much better than builtin array appending.
>
> -Steve

I've done some timings using reserve and Appender. Seems not to work on my use case (decomposition of a string [actually a sequence of code points] according to NFD). (See sample code below.)
* use reserve (to source string's length) with builtin append (operator '~=') --> 20% slower
* use Appender w/o reserve --> 3 times slower
* use Appender + its own reserve --> 1.5 times slower (i.e. divide above time per 2)

I'm surprised that reserve does not speed up builtin appending, since it can only avoid numerous reallocations. How to interpret that?
I'm even more surprised of Appender's results on this use case, after having read about its performance several times on the list. Strange. Can it be due to the fact that I only append sub-sequences? (The decomposition '*p' below is also a mini-array.)

Denis

// NFD decomposition
Code[] decomposition (Code[] codes) {
    /** Decompose code sequence according to form "NFD". */
    //~ Code[] decompCodes;             // new, decomposed; pile
    Appender!(Code[]) decompCodes;
    Decomp* p;                          // pointer to single-code decomposition
    Ordinal i0 = 0;                     // reference index past last decomposed code

    // Decompose each code.
    //~ decompCodes.reserve(codes.length);  // worse time! ???
    //~ foreach (i, code ; codes) {
    //~     // case composed code
    //~     p = (code in DECOMPOSITIONS);
    //~     // Do not even check whether code is composite
    //~     // for simple ASCII & latin characters.
    //~     if (code > 0xBF)
    //~         if (p) {
    //~             // Append past simple codes, if any.
    //~             if (i > i0) decompCodes ~= codes[i0..i];
    //~             // Append current code's decomposition.
    //~             decompCodes ~= *p;
    //~             // Update reference index.
    //~             i0 = i + 1;
    //~         }
    //~ }
    //~ // Append last simple codes (if any).
    //~ decompCodes ~= codes[i0..$];

    decompCodes.reserve(codes.length);      // worse time! ???
    foreach (i, code ; codes) {
        // case composed code
        p = (code in DECOMPOSITIONS);
        // Do not even check whether code is composite
        // for simple ASCII & latin characters.
        if (code > 0xBF)
            if (p) {
                // Append past simple codes, if any.
                if (i > i0) decompCodes.put(codes[i0..i]);
                // Append current code's decomposition.
                decompCodes.put(*p);
                // Update reference index.
                i0 = i + 1;
            }
    }
    // Append last simple codes (if any).
    decompCodes.put(codes[i0..$]);

    //~ return decompCodes;
    return decompCodes.data;
}

-- -- -- -- -- -- --
vit esse estrany
☣ spir.wikidot.com
Dec 29 2010
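For reference, one way to set up the comparison above as a standalone benchmark. This is a sketch only, with an arbitrary size and an int payload instead of the NFD decomposition data; each variant is selected with a -version switch and timed externally (e.g. with the shell's time command), so no timing module is assumed:

// Build e.g.:  dmd -O -release -version=Reserve bench.d   and run under `time`.
import std.array : Appender;

enum N = 1_000_000;

void main()
{
    version (Plain)
    {
        int[] a;
        foreach (i; 0 .. N) a ~= i;        // baseline: builtin ~= without reserve
    }
    else version (Reserve)
    {
        int[] a;
        a.reserve(N);                      // pre-allocate, then builtin ~=
        foreach (i; 0 .. N) a ~= i;
    }
    else
    {
        Appender!(int[]) app;              // Appender with reserve
        app.reserve(N);
        foreach (i; 0 .. N) app.put(i);
    }
}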
On Wed, 29 Dec 2010 13:14:29 -0500, spir <denis.spir gmail.com> wrote:
> I've done some timings using reserve and Appender. Seems not to work on
> my use case (decomposition of a string [actually a sequence of code
> points] according to NFD). (See sample code below.)
> * use reserve (to source string's length) with builtin append (operator
> '~=') --> 20% slower
> * use Appender w/o reserve --> 3 times slower
> * use Appender + its own reserve --> 1.5 times slower (i.e. divide above
> time per 2)

What is the baseline for this? I.e. what is it 20% slower than? FWIW, Appender should be much faster than builtin append, even without reserve.

However, Appender has a recently fixed bug (not fixed in 2.051) where appending *arrays* of elements goes very slow. I see you are doing that in a couple spots.

> I'm surprised that reserve does not speed up builtin appending, since it
> can only avoid numerous reallocations. How to interpret that?
> I'm even more surprised of Appender's results on this use case, after
> having read about its performance several times on the list. Strange.
> Can it be due to the fact that I only append sub-sequences? (The
> decomposition '*p' below is also a mini-array.)

It should speed up appending. If it doesn't, then it's either a bug or pilot error. As I said before, Appender in 2.051 and earlier has a bug where appending an array is very slow.

But builtin appending should be faster if you reserve. Simple tests I run prove that this is true. Recent developments in how the appending array grows mitigate this quite a bit, but it certainly will result in less memory being consumed, and it always runs faster.

-Steve
Dec 29 2010
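One way to rule out pilot error in such a test is to check the builtin capacity property (which, like reserve, lives in the runtime's object module and needs no import) before and after the appends. A small sketch, with an illustrative size:

import std.stdio : writeln;

void main()
{
    int[] x;
    x.reserve(10_000);
    auto cap = x.capacity;                            // total elements storable without reallocation
    writeln("capacity after reserve: ", cap);         // expected to be >= 10000
    foreach (i; 0 .. 10_000)
        x ~= i;
    writeln("capacity after filling: ", x.capacity);  // same value => no reallocation occurred
}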
On Wed, 29 Dec 2010 13:48:48 -0500
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

> What is the baseline for this? I.e. what is it 20% slower than? FWIW,
> Appender should be much faster than builtin append, even without reserve.

The baseline is builtin append (~=) without reserving. (Sorry, thought it was clear.)

> However, Appender has a recently fixed bug (not fixed in 2.051) where
> appending *arrays* of elements goes very slow. I see you are doing that
> in a couple spots.

As explained below, I'm doing _only_ that. So (as guessed), seems it may be the reason why Appender performs so badly in my case.

> > I'm surprised that reserve does not speed up builtin appending, since it
> > can only avoid numerous reallocations. How to interpret that?

* This question remains. ??? *

> > I'm even more surprised of Appender's results on this use case, after
> > having read about its performance several times on the list. Strange.
> > Can it be due to the fact that I only append sub-sequences? (The
> > decomposition '*p' below is also a mini-array.)
>
> It should speed up appending. If it doesn't, then it's either a bug or
> pilot error. As I said before, Appender in 2.051 and earlier has a bug
> where appending an array is very slow.
>
> But builtin appending should be faster if you reserve.

Yop!

> Simple tests I run prove that this is true. Recent developments in how
> the appending array grows mitigate this quite a bit, but it certainly will
> result in less memory being consumed, and it always runs faster.

I can upload the modified version using reserve & Appender (with the timing module) if you like. (The original one is at https://bitbucket.org/denispir/denispir-d/src, files pile.d and chrono.d.)

Anyway, why reserve does not help builtin append is a mystery. In the worst case, it should not trouble.

By the way, how come I do not need to import anything to be able to use reserve (as if it were a builtin prop)?

About Appender, I have had a look at the code in std.array and could not find any sign of why -- except that appending a slice loops over its elements --calling put() for each one--, but this should not be _that_ slow, I guess. (Or should it?) Maybe a specialisation of put to take an array (or slice) would help? (Instead of using an array interface, as is done now.) The details of realloc are beyond my competence (e.g. the computation of the resizing ratio), so I can hardly search further.

Denis
-- -- -- -- -- -- --
vit esse estrany
☣ spir.wikidot.com
Dec 29 2010
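To illustrate the point about put() looping over a slice's elements, here is a toy comparison (not Appender's actual code) of appending a slice element by element versus copying it in one block; the block version does a single capacity check and one slice copy instead of one check per element:

import std.stdio : writeln;

void appendPerElement(T)(ref T[] buf, ref size_t len, const(T)[] src)
{
    foreach (e; src)                       // one capacity check + copy per element
    {
        if (len == buf.length)
            buf.length = buf.length ? buf.length * 2 : 4;
        buf[len++] = e;
    }
}

void appendSlice(T)(ref T[] buf, ref size_t len, const(T)[] src)
{
    if (len + src.length > buf.length)     // one capacity check for the whole slice
    {
        size_t cap = buf.length ? buf.length : 4;
        while (cap < len + src.length)
            cap *= 2;
        buf.length = cap;
    }
    buf[len .. len + src.length] = src[];  // one block copy
    len += src.length;
}

void main()
{
    int[] buf;
    size_t len;
    appendSlice(buf, len, [1, 2, 3]);
    appendPerElement(buf, len, [4, 5]);
    writeln(buf[0 .. len]);                // [1, 2, 3, 4, 5]
}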
On Wed, 29 Dec 2010 14:49:35 -0500, spir <denis.spir gmail.com> wrote:
> The baseline is builtin append (~=) without reserving. (Sorry, thought it
> was clear.)

Then I would be suspect of your test. In simple tests I just ran, reserving outperforms normal appending (though not by a huge amount). If it's *slower*, then there's something else wrong, as the worst case is it degenerates to non-reserved appending, since the same code paths are followed.

> As explained below, I'm doing _only_ that. So (as guessed), seems it may
> be the reason why Appender performs so badly in my case.

Yes, most likely. I mean it was like 10x slower in tests I ran. Try this version of std.array (shouldn't need to recompile phobos, it's a template):

http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/array.d?rev=2238&format=raw

> > I'm surprised that reserve does not speed up builtin appending, since it
> > can only avoid numerous reallocations. How to interpret that?
>
> * This question remains. ??? *

I would lean towards pilot error (sorry), since simple tests show reserve does speed things up. I don't have a lot of time to look into your code in depth. If you can show a simple test that proves otherwise, I'd be happy to look at it.

> I can upload the modified version using reserve & Appender (with the
> timing module) if you like. (The original one is at
> https://bitbucket.org/denispir/denispir-d/src, files pile.d and chrono.d.)

I use the 'time' command to run benchmarks usually because I've been burned in the past with using custom-built timing mechanisms that have flaws in them.

> Anyway, why reserve does not help builtin append is a mystery. In the
> worst case, it should not trouble.

I'll augment that and say "in your application", since it seems to perform expectedly in my tests.

> By the way, how come I do not need to import anything to be able to use
> reserve (as if it were a builtin prop)?

It's in object.di, always imported. I'm wondering if those should be added to the 'properties' page of the spec.

> About Appender, I have had a look at the code in std.array and could not
> find any sign of why -- except that appending a slice loops over its
> elements --calling put() for each one--, but this should not be _that_
> slow, I guess. (Or should it?) Maybe a specialisation of put to take an
> array (or slice) would help? (Instead of using an array interface, as is
> done now.) The details of realloc are beyond my competence (e.g. the
> computation of the resizing ratio), so I can hardly search further.

The issue is that appending was not amortized when appending arrays vs. appending individual elements (where appending *is* amortized). In order to achieve the appearance of O(1) appending, you must grow the array proportionally instead of linearly. In simple terms, you want to essentially double the capacity whenever you realloc. The actual algorithm uses a logarithmic growth curve that settles around 1.4x.

The flawed Appender code was just adding enough space to accommodate the data being appended, but only if an array was appended (appending single elements was using the growth curve). Not only that, but it wasn't taking advantage of the ability to extend into consecutive pages without having to move data around.

-Steve
Dec 29 2010
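To make the amortization argument concrete, a toy model contrasting the two growth policies; the numbers are illustrative only, and the runtime's real curve and page-extension tricks are not modelled:

import std.stdio : writeln;

// "Just enough" growth: every append of new data forces another reallocation.
size_t growJustEnough(size_t curCap, size_t needed)
{
    return needed;
}

// Proportional growth: capacity increases by roughly 1.5x per step, so the
// number of reallocations is logarithmic in the final length (amortized O(1)).
size_t growProportionally(size_t curCap, size_t needed)
{
    size_t cap = curCap ? curCap : 8;
    while (cap < needed)
        cap += cap / 2 + 1;
    return cap;
}

void main()
{
    size_t capA, capB, reallocsA, reallocsB;
    foreach (size_t n; 1 .. 100_001)
    {
        if (n > capA) { capA = growJustEnough(capA, n);     ++reallocsA; }
        if (n > capB) { capB = growProportionally(capB, n); ++reallocsB; }
    }
    writeln("just-enough reallocations:  ", reallocsA);     // ~100000
    writeln("proportional reallocations: ", reallocsB);     // a couple dozen
}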