digitalmars.D.learn - Dynamic Arrays Capacity
- Salih Dincer, Jun 01 2022
- Mike Parker, Jun 02 2022
- Mike Parker, Jun 02 2022
- Mike Parker, Jun 02 2022
- Salih Dincer, Jun 02 2022
- Steven Schveighoffer, Jun 02 2022
- bauss, Jun 03 2022
- Adam D Ruppe, Jun 03 2022
- bauss, Jun 03 2022
Hi,

Do I misunderstand? A dynamic array is allocated memory according to the `nextpow2()` algorithm (-1 lapse); strings, on the other hand, don't behave like this...

```d
string str = "0123456789ABCDEF";
char[] chr = str.dup;

assert(str.length == 16);
assert(str.capacity == 0);

import std.math: thus = nextPow2; //.algebraic

assert(chr.capacity == thus(str.length) - 1);
assert(chr.capacity == 31);
```

Also, `.ptr` keeps the address of the most recent first element, right?

```d
write("str[0] ", &str[0]);
writeln(" == ", str.ptr);

write("chr[0] ", &chr[0]);
writeln(" == ", chr.ptr);
```

**Print Out:** (No Errors)

str[0] 5607593901E0 == 5607593901E0
chr[0] 7F9430982000 == 7F9430982000

SDB 79
Jun 01 2022
On Thursday, 2 June 2022 at 05:04:03 UTC, Salih Dincer wrote:
> Hi,
>
> Do I misunderstand? A dynamic array is allocated memory according to the `nextpow2()` algorithm (-1 lapse); strings, on the other hand, don't behave like this...
>
> ```d
> string str = "0123456789ABCDEF";
> char[] chr = str.dup;
>
> assert(str.length == 16);
> assert(str.capacity == 0);
>
> import std.math: thus = nextPow2; //.algebraic
>
> assert(chr.capacity == thus(str.length) - 1);
> assert(chr.capacity == 31);
> ```

You've initialized `str` with a string literal. No memory is allocated for these from the GC. They're stored in the binary, meaning they're loaded into memory from disk by the OS. So `str.ptr` points to a static memory location that's a fixed size, hence no extra capacity.

`chr` is allocated from the GC using whatever algorithm is implemented in the runtime. That it happens to be any given algorithm is an implementation detail that could change in any release.

> Also, `.ptr` keeps the address of the most recent first element, right?

More specifically, it points to the starting address of the allocated block of memory.
Jun 02 2022
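The difference between a literal-backed slice and a GC-allocated one can be checked directly. The following is a minimal sketch, not part of the original posts; the helper name `mustReallocateOnAppend` is made up for illustration, and the exact capacity of the `.dup` copy is left unasserted because it is an implementation detail of the runtime.

```d
import std.stdio : writeln;

/// Hypothetical helper: true if `arr` has no room to grow in place,
/// so the very next append has to allocate a new block.
bool mustReallocateOnAppend(T)(T[] arr)
{
    return arr.capacity == 0;
}

void main()
{
    string lit = "0123456789ABCDEF"; // points into the binary's static data
    char[] copy = lit.dup;           // fresh GC allocation

    assert(mustReallocateOnAppend(lit));   // literal: capacity 0
    assert(!mustReallocateOnAppend(copy)); // GC block: some spare room
    assert(copy.capacity >= copy.length);  // exact value is runtime-specific

    // Appending to the literal-backed slice copies it into GC memory.
    auto before = lit.ptr;
    lit ~= 'G';
    assert(lit.ptr !is before);
    assert(lit.capacity >= lit.length);

    writeln("copy.capacity = ", copy.capacity, ", lit.capacity = ", lit.capacity);
}
```

After the first append, `lit` no longer refers to static memory, which is why its capacity becomes non-zero.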
On Thursday, 2 June 2022 at 08:14:40 UTC, Mike Parker wrote:
> More specifically, it points to the starting address of the allocated block of memory.

I posted too soon.

Given an instance `ts` of type `T[]`, array accesses essentially are this:

```d
ts[0] == *(ts.ptr + 0);
ts[1] == *(ts.ptr + 1);
ts[2] == *(ts.ptr + 2);
```

Since the size of `T` is known, each addition to the pointer adds `N * T.sizeof` bytes. If you converted it to a `ubyte` array, you'd need to handle that yourself.

And so, `&ts[0]` is the same as `&(*ts.ptr + 0)`, or simply `ts.ptr`.
Jun 02 2022
On Thursday, 2 June 2022 at 08:24:51 UTC, Mike Parker wrote:
> And so, `&ts[0]` is the same as `&(*ts.ptr + 0)`, or simply `ts.ptr`.

That should be the same as `&(*(ts.ptr + 0))`!
Jun 02 2022
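The corrected identity can be verified with a few assertions. This is only an illustrative sketch; `elementAddress` is a hypothetical helper, and `int` stands in for an arbitrary element type `T`.

```d
/// Hypothetical helper: address of element `i`, computed the way
/// indexing does it. The pointer addition is scaled by T.sizeof.
T* elementAddress(T)(T[] ts, size_t i)
{
    return ts.ptr + i;
}

void main()
{
    int[] ts = [10, 20, 30];

    // ts[i] is *(ts.ptr + i) for every valid index.
    foreach (i; 0 .. ts.length)
        assert(ts[i] == *elementAddress(ts, i));

    // The address of the first element is the slice's pointer.
    assert(&ts[0] is ts.ptr);
    assert(&(*(ts.ptr + 0)) is ts.ptr);

    // Viewed as raw bytes, the scaling has to be done by hand.
    auto bytes = cast(ubyte[]) ts;
    assert(bytes.length == ts.length * int.sizeof);
    assert(cast(void*) &bytes[2 * int.sizeof] is cast(void*) &ts[2]);
}
```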
On Thursday, 2 June 2022 at 08:14:40 UTC, Mike Parker wrote:
> You've initialized `str` with a string literal. No memory is allocated for these from the GC. They're stored in the binary, meaning they're loaded into memory from disk by the OS. So `str.ptr` points to a static memory location that's a fixed size, hence no extra capacity.

I didn't know that, so maybe this example proves it; the following test code that Ali has started and I have developed:

```d
import std.range;
import std.stdio;

/* toggle array:
alias chr = char*;
auto data = [' '];/*/
alias chr = immutable(char*);
auto data = " ";//*/

void main()
{
  chr[] ptrs;
  data.fill(3, ptrs);

  writeln;
  foreach(ref ptr; ptrs)
  {
    " 0x".write(ptr);
  }
} /* Print Out:
 0: 0 1: 15 2: 31 3: 47
 0x55B07E227020 0x7F2391F9F000 0x7F2391FA0000 0x7F2391FA1000
//*/

void fill(R)(ref R mostRecent, int limit, ref chr[] ptrs)
{
  auto ptr = mostRecent.ptr;
  size_t capacity, depth;
  while (depth <= limit)
  {
    mostRecent ~= ElementType!R.init;
    if(ptr != mostRecent.ptr)
    {
      ptrs ~= ptr;
      depth.writef!"%2s: %11s"(capacity);
      depth++;
    }
    if (mostRecent.capacity != capacity)
    {
      ptr = mostRecent.ptr;
      capacity = mostRecent.capacity;
    }
  }
}
```

As for the result I got from this code: The array configured in the heap is copied to another memory region as soon as its capacity changes (0x5...20 >> 0x7...00). We get the same result in array. Just add the / character to the beginning of the 4th line to try it.

Thank you all very much for the replies; all of these open my mind.

SDB 79
Jun 02 2022
On 6/2/22 1:04 AM, Salih Dincer wrote:
> Hi,
>
> Do I misunderstand? A dynamic array is allocated memory according to the `nextpow2()` algorithm (-1 lapse); strings, on the other hand, don't behave like this...
>
> ```d
> string str = "0123456789ABCDEF";
> char[] chr = str.dup;
>
> assert(str.length == 16);
> assert(str.capacity == 0);
>
> import std.math: thus = nextPow2; //.algebraic
>
> assert(chr.capacity == thus(str.length) - 1);
> assert(chr.capacity == 31);
> ```

The capacity is how many elements of the array can be stored without reallocating *when appending*.

Why 0 for the string literal? Because it's not from the GC, and so has no capacity for appending (note that a capacity of 0 is returned even though the string currently has 16 characters in it).

Why 31 for the GC-allocated array? Because implementation details. But I can give you the details:

1. The GC allocates in powers of 2 (mostly). The smallest block is 16 bytes, and the next size up is 32 bytes.
2. In order to remember which parts of the block are used, it needs to allocate some space to record that value. For a 16-byte block, that requires 1 byte.

So it can't fit your 16-byte string + 1 byte for the capacity tracker into a 16 byte block, it has to go into a 32 byte block. And of course, 1 byte of that 32 byte block is for the capacity tracker, hence capacity 31.

> Also, `.ptr` keeps the address of the most recent first element, right?

This statement suggests to me that you have an incorrect perception of a string. A string is a pointer paired with a length of how many characters after that pointer are valid. That's it. `str.ptr` is the pointer to the first element of the string. There isn't a notion of "most recent first element".

-Steve
Jun 02 2022
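The definition of capacity given above (room to append without reallocating) can be observed without relying on the exact block sizes. The sketch below is an illustration only; `movedAfterAppending` is a hypothetical helper, and the printed capacity (31 in the thread, from 16 + 1 > 16, so a 32-byte block minus 1 tracking byte) depends on the runtime in use.

```d
import std.stdio : writeln;

/// Hypothetical helper: appends `n` default-initialized elements and
/// reports whether the slice's data was moved to a different address.
bool movedAfterAppending(T)(ref T[] arr, size_t n)
{
    auto before = arr.ptr;
    foreach (_; 0 .. n)
        arr ~= T.init;
    return arr.ptr !is before;
}

void main()
{
    char[] chr = "0123456789ABCDEF".dup;
    writeln("length ", chr.length, ", capacity ", chr.capacity);

    // Filling the spare capacity appends in place: the pointer stays put.
    immutable spare = chr.capacity - chr.length;
    assert(!movedAfterAppending(chr, spare));
    assert(chr.length == 16 + spare);

    // One more element no longer fits in the reported capacity. Whether the
    // data moves is up to the runtime, so this is printed, not asserted.
    writeln("moved after one more append: ", movedAfterAppending(chr, 1));
}
```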
On Thursday, 2 June 2022 at 20:12:30 UTC, Steven Schveighoffer wrote:
> This statement suggests to me that you have an incorrect perception of a string. A string is a pointer paired with a length of how many characters after that pointer are valid. That's it. `str.ptr` is the pointer to the first element of the string. There isn't a notion of "most recent first element".
>
> -Steve

This isn't correct either, at least with unicode, since 1 byte isn't equal to 1 character and a character can be several bytes.

I believe it's only true in unicode for utf-32 since all characters do fit in the 4 byte space they have, but for utf-8 and utf-16 the characters will not be the same size of bytes.
Jun 03 2022
On Friday, 3 June 2022 at 12:49:07 UTC, bauss wrote:
> I believe it's only true in unicode for utf-32 since all characters do fit in the 4 byte space they have

Depends how you define "character".
Jun 03 2022
On Friday, 3 June 2022 at 12:52:30 UTC, Adam D Ruppe wrote:
> On Friday, 3 June 2022 at 12:49:07 UTC, bauss wrote:
>> I believe it's only true in unicode for utf-32 since all characters do fit in the 4 byte space they have
>
> Depends how you define "character".

I guess that's true as well, unicode really made it impossible to just say "this string has so many characters because it has this many bytes."
Jun 03 2022
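The three meanings of "character" touched on in the last few posts can be told apart with Phobos. This is a small sketch, assuming `std.range.walkLength` and `std.uni.byGrapheme`; the helper names `codeUnits`, `codePoints` and `graphemes` are made up for illustration.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

/// Hypothetical helpers for the three ways of counting.
size_t codeUnits(string s)  { return s.length; }                // UTF-8 bytes
size_t codePoints(string s) { return s.walkLength; }            // decoded dchars
size_t graphemes(string s)  { return s.byGrapheme.walkLength; } // perceived characters

void main()
{
    // 'e' followed by U+0301 (combining acute accent): displayed as one "é".
    string s = "e\u0301";

    assert(codeUnits(s) == 3);  // 1 byte for 'e' + 2 bytes for U+0301
    assert(codePoints(s) == 2);
    assert(graphemes(s) == 1);

    // Even in UTF-32 the single visible character takes two elements.
    dstring d = "e\u0301"d;
    assert(d.length == 2);
}
```

So `.length` on a D string always counts code units of the array's element type; anything closer to "characters" needs decoding or grapheme segmentation.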