digitalmars.D.learn - why use string for this example of appender?

reply WhatMeForget <kheaser gmail.com> writes:
I thought I had a handle on D's static and dynamic arrays until I 
came to std.array and saw all the shiny new tools. I can 
understand all the replace... functions, but the appender function 
gave me pause. The documentation says appender "Returns a new 
Appender or RefAppender initialized with a given array."

My first thought was: don't D's built-in arrays already allow 
appending (at least dynamic arrays)? And the example shows 
appender used with string. Isn't string an immutable array 
of characters? Wouldn't this be the last data type you would 
want to append to? Another thing that had me wondering is 
the use of put() below; doesn't the append syntax (~=) give 
you exactly the same functionality, so why bother?

void main()
{
     import std.array;
     import std.stdio: write, writeln, writef, writefln;
     auto w = appender!string;
     // pre-allocate space (this avoids costly reallocations)
     w.reserve(10);
     assert(w.capacity >= 10);

     w.put('a'); // single elements
     w.put("bc"); // multiple elements

     // use the append syntax
     w ~= 'd';
     w ~= "ef";

     writeln(w.data); // "abcdef"
}
Apr 15 2018
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 04/15/2018 11:46 PM, WhatMeForget wrote:
 I think I got a handle on D's static and dynamic arrays, till I come to
 std.array and see all the shiny new tools. I can understand all the
 replace.. functions, but the appender function gave me pause. The
 documentation says appender "Returns a new Appender or RefAppender
 initialized with a given array."
A new Appender allocates new memory. If an array is already available for the Appender to use, initializing it with that array is more efficient.
 My first thought that doesn't D's built in arrays already allow
 appending? (at least for dynamic arrays)
Yes, but Appender is reported to be faster. Presumably it uses a different allocation scheme and has more freedom than dynamic arrays, which must play well with the GC and its pages.
 And the example shows the use
 of appender with string.  Isn't string an immutable array or
 characters?
A mutable array of immutable characters: immutable(char)[]. What is important is that existing elements should not mutate, so that existing slices don't get confused. Appending is fine because it never changes existing elements: if the slice cannot grow in place, a new array is allocated and the elements are copied for the appending slice. (Aha! This is likely why Appender is faster than a dynamic array: it does not have the "element stomping prevention" feature. (Something is wrong with my English at the moment. :p))
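A minimal sketch of the point about immutable(char)[], using only the built-in string type (the helper name is made up for illustration): the slice itself can be rebound and appended to, while the characters that other slices already see never change.

```d
// string is an alias for immutable(char)[]: a mutable slice of immutable characters.
static assert(is(string == immutable(char)[]));

// Hypothetical helper for illustration: appends to one slice and returns
// another slice that was taken before the append.
string sliceTakenBeforeAppend()
{
    string a = "abc";
    string b = a;        // b refers to the same three characters as a
    a ~= "def";          // a now refers to "abcdef"; what b sees is untouched
    assert(a == "abcdef");
    // a[0] = 'x';       // would not compile: the elements are immutable
    return b;
}

void main()
{
    assert(sliceTakenBeforeAppend() == "abc");
}
```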
 Wouldn't this be the last data type you would want to be
 appending to?
It's not unusual.
 Another thing that had me wondering is the use of put()
 down below; doesn't the append syntax (~=) give you the same exact
 functionality; so why bother?
put() is old, ~= is new. Both are supported.

Ali
Apr 16 2018
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/16/18 4:49 AM, Ali Çehreli wrote:
 On 04/15/2018 11:46 PM, WhatMeForget wrote:
  >
  > I think I got a handle on D's static and dynamic arrays, till I come to
  > std.array and see all the shiny new tools. I can understand all the
  > replace.. functions, but the appender function gave me pause. The
  > documentation says appender "Returns a new Appender or RefAppender
  > initialized with a given array."
 
 New Appender allocates new memory. If an array is already available for 
 an Appender to use, then it's more efficient.
 
  > My first thought that doesn't D's built in arrays already allow
  > appending? (at least for dynamic arrays)
 
 Yes but Appender is reported to be faster presumably it uses a different 
 allocation scheme and may be more free compared to dynamic arrays that 
 must play well with GC and its pages.
It's because it stores the relevant information in a local type rather than having to look it up in the GC. Also, its calls are not all opaque (they can be inlined).
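One way to check this on your own machine is a small timing sketch like the one below. The function names, the element count, and the repetition count are arbitrary choices for illustration, and the numbers depend heavily on compiler, flags, and GC state, so treat it as a starting point rather than a definitive benchmark.

```d
import std.array : appender;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

enum n = 1000; // arbitrary element count chosen for illustration

int[] builtinAppend()
{
    int[] a;
    foreach (i; 0 .. n)
        a ~= i;          // each append is a call into the runtime
    return a;
}

int[] appenderAppend()
{
    auto w = appender!(int[]);
    foreach (i; 0 .. n)
        w.put(i);        // capacity is tracked by the Appender itself
    return w.data;
}

void main()
{
    // Run each function 10_000 times and print the total time for each.
    auto results = benchmark!(builtinAppend, appenderAppend)(10_000);
    writeln("built-in ~=: ", results[0]);
    writeln("Appender:    ", results[1]);
}
```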
  > Another thing that had me wondering is the use of put()
  > down below; doesn't the append syntax (~=) give you the same exact
  > functionality; so why bother?
 
 put() is old, ~= is new. Both are supported.
 
put is required for Appender to be an output range.

-Steve
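A short sketch of what being an output range buys you, assuming a made-up helper named describe: because Appender has put, generic Phobos code such as formattedWrite can write straight into it.

```d
import std.array : Appender, appender;
import std.format : formattedWrite;
import std.range.primitives : isOutputRange;

// Appender!string accepts characters through put(), so it is an output range.
static assert(isOutputRange!(Appender!string, char));

// Hypothetical helper: formats directly into an Appender instead of
// building intermediate strings.
string describe(string name, int value)
{
    auto w = appender!string;
    w.formattedWrite("%s=%d", name, value); // formattedWrite feeds w via put
    w ~= ';';                               // same effect as w.put(';')
    return w.data;
}

void main()
{
    assert(describe("x", 42) == "x=42;");
}
```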
Apr 16 2018
parent reply WhatMeWorry <kheaser gmail.com> writes:
Thanks all.  I sometimes feel like Michael Corleone: "Just when I 
thought I was out, they pull me back in!" :)

I realize it is not the place for it, but sometimes I wish the 
Library Reference explained things in terms of "why".
Apr 16 2018
parent reply ludo <fakeaddress gmail.com> writes:
Hi guys,

I am working on old software written in D1, which at some 
point defines an array.d module. See my github file: 
https://tinyurl.com/5ffbmfvz

If you go to line 347, you see an ArrayBuilder struct, which is 
supposed to behave exactly like an array but with faster 
concatenation. The struct's comment says:
/**
 * Behaves the same as built-in arrays, except about 6x faster with
 * concatenation at the expense of the base pointer being 4 system words
 * instead of two (16 instead of 8 bytes on a 32-bit system).
 */

I created a unittest, line 586, to test this statement. You can 
git clone + dub test, to see the result. On my machine I get:
*** ArrayBuilder benchmark ***
1) Concatenation with std:          745 μs and 7 hnsecs
2) Concatenation with Arraybuilder: 236 μs and 8 hnsecs
3) Concatenation with Reserve:      583 μs and 5 hnsecs
4) Concatenation with Length:       611 μs and 8 hnsecs
5) Concatenation with Appender:     418 μs

In (1) I use standard array concatenation, in (2) I use the 
ArrayBuilder struct, in (3) I test standard concat with a call to 
reserve() before-hand, in (4) standard + length= beforehand, and 
finally I dug out the Appender class in (5).

None of the D library tools beat the ArrayBuilder class! Note 
though, that Appender is only two times slower, not 6x slower.

Can someone explain to me why ArrayBuilder is so fast (or the 
others so slow)? What is this base pointer being "4 system words 
instead of 2"? I read the code of ArrayBuilder, but I can not 
spot any magic trick... Has this anything to do with standard 
array concat being "safer" (thread-wise, bounds-wise, etc)?

And also, is there a std module which performs as well, some Fast 
Appender class? Could not find any.

Thanks
Mar 31 2021
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/31/21 5:32 PM, ludo wrote:
 Hi guys,
 
 I am working on an old software in D1, which defines at some point an 
 array.d module. See my github file: https://tinyurl.com/5ffbmfvz
 
 If you go line 347, you see an ArrayBuilder struct, which is supposed to 
 behave exactly like an array but with faster concatenation. The class 
 comment says:
 /**
 * Behaves the same as built-in arrays, except about 6x faster with 
 concatenation at the expense of the base pointer
 * being 4 system words instead of two (16 instead of 8 bytes on a 32-bit 
 system).
 */
 
 I created a unittest, line 586, to test this statement. You can git 
 clone + dub test, to see the result. On my machine I get:
 *** ArrayBuilder benchmark ***
 1) Concatenation with std:          745 μs and 7 hnsecs
 2) Concatenation with Arraybuilder: 236 μs and 8 hnsecs
 3) Concatenation with Reserve:      583 μs and 5 hnsecs
 4) Concatenation with Length:       611 μs and 8 hnsecs
 5) Concatenation with Appender:     418 μs
 
 In (1) I use standard array concatenation, in (2) I use the ArrayBuilder 
 struct, in (3) I test standard concat with a call to reserve() 
 before-hand, in (4) standard + length= beforehand, and finally I dug out 
 the Appender class in (5).
 
 None of the D library tools beat the ArrayBuilder class! Note though, 
 that Appender is only two times slower, not 6x slower.
 
 Can someone explain to me why ArrayBuilder is so fast (or the others so 
 slow)? What is this base pointer being "4 system words instead of 2"? I 
 read the code of ArrayBuilder, but I can not spot any magic trick... Has 
 this anything to do with standard array concat being "safer" 
 (thread-wise, bounds-wise, etc)?
ArrayBuilder should be similar in performance to Appender. I think part of the issue with Appender could be the ref-counted design. With only 1000 elements, the setup/teardown time of allocating the implementation struct is going to show heavily. But that is a guess. You may want to up the repetition count to 100 or 1000.

Note your code for appending with length is not doing what you think:

void concat_withLength()
{
    int[] array;
    array.length = 1000;
    for (int j=0; j<1000; j++)
        array ~= j;
}

This allocates 1000 elements, and then appends 1000 *more* elements. I think you meant in the loop:

    array[j] = j;

This should be the fastest one.

-Steve
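The difference between the two loops can be sketched as follows. The function names are invented for illustration and the element count is passed as a parameter; this is not the code from the linked repository.

```d
// Setting the length and then appending: the pattern from the benchmark.
int[] lengthThenAppend(int n)
{
    int[] array;
    array.length = n;        // array already holds n zero-initialized elements
    foreach (j; 0 .. n)
        array ~= j;          // adds n more elements after them
    return array;
}

// Setting the length and then assigning by index: the suggested version.
int[] lengthThenIndex(int n)
{
    int[] array;
    array.length = n;        // one allocation up front
    foreach (j; 0 .. n)
        array[j] = j;        // overwrite in place, no appending
    return array;
}

void main()
{
    assert(lengthThenAppend(1000).length == 2000);
    assert(lengthThenIndex(1000).length == 1000);
    assert(lengthThenIndex(1000)[999] == 999);
}
```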
Mar 31 2021
parent ludo <fakeaddress gmail.com> writes:
Thanks Steve,
I opened a new thread with some corrections, a better title, etc.

On Wednesday, 31 March 2021 at 22:05:12 UTC, Steven Schveighoffer 
wrote:
 On 3/31/21 5:32 PM, ludo wrote:
 [...]
Apr 01 2021
prev sibling parent Boris-Barboris <ismailsiege gmail.com> writes:
On Monday, 16 April 2018 at 06:46:36 UTC, WhatMeForget wrote:

 Another thing that had me wondering is the use of put() down 
 below; doesn't the append syntax (~=) give you the same exact 
 functionality; so why bother?
Appender also performs Unicode-related conversions, so you can append a dstring to a string and vice versa, which may come in handy.
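A small sketch of those conversions, with made-up helper names and the source file assumed to be UTF-8: the Appender transcodes between UTF-8 and UTF-32 as elements are put.

```d
import std.array : appender;

// Hypothetical helper: puts UTF-32 data into a UTF-8 appender.
string dstringIntoString()
{
    auto w = appender!string;
    w.put("grüß "d);     // dstring input is encoded to UTF-8 on the way in
    dchar c = 'ß';
    w ~= c;              // a single dchar is encoded as well
    return w.data;
}

// Hypothetical helper: puts UTF-8 data into a UTF-32 appender.
dstring stringIntoDstring()
{
    auto w = appender!dstring;
    w.put("grüß");       // string input is decoded to UTF-32 code points
    return w.data;
}

void main()
{
    assert(dstringIntoString() == "grüß ß");
    assert(stringIntoDstring() == "grüß"d);
}
```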
Apr 16 2018