digitalmars.D.learn - Dynamic array ot not
- forkit (5/5) Jan 15 2022 so at this link: https://dlang.org/spec/arrays.html
- =?UTF-8?Q?Ali_=c3=87ehreli?= (26/31) Jan 15 2022 I have a problem with calling type[] a dynamic array because it is a
- forkit (10/13) Jan 16 2022 Yes. A more useful way of describing [] would be to say:
- =?UTF-8?Q?Ali_=c3=87ehreli?= (63/79) Jan 16 2022 I still don't agree with that. :) To me, [] is always a slice. (Again,
- Salih Dincer (6/10) Jan 16 2022 If count is not equal to 8 I get weird results! The reason of
- =?UTF-8?Q?Ali_=c3=87ehreli?= (24/35) Jan 16 2022 In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is
- H. S. Teoh (26/44) Jan 16 2022 Correction: malloc() is not guaranteed to clear the allocated memory.
- =?UTF-8?Q?Ali_=c3=87ehreli?= (14/27) Jan 16 2022 Yes but they do not (and cannot) put it inside what they returned. They
- Era Scarecrow (23/26) Jan 16 2022 I wonder if you're seeing something you're not suppose to as part
- forkit (17/20) Jan 16 2022 Well, it's fair to say, that 'range-based programming' is kinda
- =?UTF-8?Q?Ali_=c3=87ehreli?= (13/21) Jan 16 2022 to me.
- forkit (11/20) Jan 16 2022 Honestly, that is what I thought, and expected.
- =?UTF-8?Q?Ali_=c3=87ehreli?= (4/8) Jan 16 2022 Definitely a -profile=gc bug. Here are the existing ones:
- forkit (5/8) Jan 16 2022 yeah, a bug makes more sense ... otherwise I really would have
- Steven Schveighoffer (8/22) Jan 16 2022 I think what is happening is that phobos is using a direct call to a
- forkit (39/40) Jan 16 2022 module test;
- Steven Schveighoffer (14/23) Jan 16 2022 Technically, this is using 2 different mechanisms.
- forkit (13/21) Jan 16 2022 yes, that seems to be the case:
- Salih Dincer (37/37) Jan 16 2022 ```d
- Salih Dincer (13/19) Jan 16 2022 In fact, although range and slice seem to be equal to each other,
so at this link: https://dlang.org/spec/arrays.html it indicates an array of type[] is of type 'Dynamic array'. with that in mind, I ask, is this below a 'Dynamic array'. If not, why not? int[][] mArr2 = array(iota(1, 9).chunks(2).map!array.array);
Jan 15 2022
On 1/15/22 20:09, forkit wrote:so at this link: https://dlang.org/spec/arrays.html it indicates an array of type[] is of type 'Dynamic array'.I have a problem with calling type[] a dynamic array because it is a slice, which may be providing access to the elements of a dynamic array. One complexity in D's slices is that they will start providing access to a dynamic array upon appending or concatenation. Here is the proof: void main() { int[2] data; int[] slice = data[]; } 'slice' has *nothing* to do with any dynamic array. Period! It will provide access to one with the following operation: slice ~= 42; Again, 'slice' is not a dynamic array; it provides access to the elements of one. (That dynamic array is owned by the D runtime (more precisely, the GC).) Despite that reality, people want dynamic arrays in the language and call slices dynamic arrays. Ok, I feel better now. :)with that in mind, I ask, is this below a 'Dynamic array'.Nothing different from what I wrote above.If not, why not? int[][] mArr2 = array(iota(1, 9).chunks(2).map!array.array);mArr2 is a slice of slices. We can see from the initialization that there are many dynamic arrays behind the scenes. The outermost array() call is unnecessary because array() of a slice will produce another slice with the same type (perhaps e.g. a 'const' might be dropped) and elements. This is the same: int[][] mArr2 = iota(1, 9).chunks(2).map!array.array ; Ali
Jan 15 2022
On Sunday, 16 January 2022 at 04:58:21 UTC, Ali Çehreli wrote:I have a problem with calling type[] a dynamic array because it is a slice, which may be providing access to the elements of a dynamic array.Yes. A more useful way of describing [] would be to say: "[] represents a dynamic array except when it represents a slice - in which case it is merely shorthand for a slice that is referencing data stored somewhere else." Something that still confuese me in my example then: mArr[] is actually a slice, and not a dynamic array. That is, my slice is referencing data located somewhere else. But where exactly is that other data located? It seems to me, I have a slice of data that doesn't actually exist once I have that slice.
Jan 16 2022
On 1/16/22 01:43, forkit wrote:On Sunday, 16 January 2022 at 04:58:21 UTC, Ali Çehreli wrote:I still don't agree with that. :) To me, [] is always a slice. (Again, some other members of the community like to name it a "dynamic array" instead "slice".)I have a problem with calling type[] a dynamic array because it is a slice, which may be providing access to the elements of a dynamic array.Yes. A more useful way of describing [] would be to say: "[] represents a dynamic array except when it represents a slice- in which case it is merely shorthand for a slice that is referencing data stored somewhere else."A slice always references data stored somewhere else.Something that still confuese me in my example then: mArr[] is actually a slice, and not a dynamic array. That is, my slice is referencing data located somewhere else. But where exactly is that other data located?Can be anywhere: a) On the stack (as in my earlier example of int[2] static array) b) In "dynamic memory" (i.e. "GC memory") c) In malloc'ed memory (e.g. allocated by a C function) d) etc.It seems to me, I have a slice of data that doesn't actually exist once I have that slice.It would be a bug if the slice references non-existing data. Slice consist of two things: 1) A pointer to the beginning of data 2) The number of elements that are being referenced at that location Going back to the three example of data I listed above: a) The data is on the stack and will die when the scope is exited. It would be a bug to refer to that data longer that its lifetime: int[] foo() { int[2] dataOnTheStack; return dataOnTheStack[]; } Luckily, dmd catches that bug in that case: Error: returning `dataOnTheStack[]` escapes a reference to local variable `dataOnTheStack` b) This is the most common case: The data is in dynamic memory. This is owned and managed by druntime. druntime includes the GC, which occasionally performs a cleanup to free unused memory. int[] bar(int[] input) { int[] output = input ~ 42; return output; } void main() { bar([ 1, 2 ]); } The first line of bar() appends 42 to an existing slice. That operation causes druntime to allocate new memory, copy the existing elements of 'input' there and also write 42 at the end of those elements. After bar's first line, this is a picture of 'input' and 'output' during bar(): input.ptr --> [ (some data) ] output.ptr --> [ (copy of 'input's data in dynamic memory) 42 ] Note that while 'input's elements may be e.g. on the stack, 'output's elements are definitely in dynamic memory. The data that is in dynamic memory will be alive as long as there is at least one reference to it. c) The data may be allocated by malloc: import std.stdio; import core.stdc.stdlib; void main() { enum count = 7; // Allocate some memory void* rawData = malloc(int.sizeof * count); // Access that data as int[] int[] slice = (cast(int*)rawData)[0..count]; // Yet another slice of half of those int[] half = slice[0..$/2]; // Free the memory free(rawData); // Ouch! Accessing dead data writeln(slice); writeln(half); } So, in all three examples it is the same D feature, a slice, that references data but the data is managed in different ways. Ali
Jan 16 2022
On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:void main() { enum count = 7; // Allocate some memory void* rawData = malloc(int.sizeof * count);If count is not equal to 8 I get weird results! The reason of course, is the free(): // [93947717336544, 1, 2, 3, 4, 5, 6] I remarked it for exclusion purposes but I couldn't try it as char. [Here](https://run.dlang.io/is/yFmXwO) is my test code.
Jan 16 2022
On 1/16/22 07:32, Salih Dincer wrote:On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is calloc() which returns memory filled with zeros. It takes two parameters: void* rawData = calloc(count, int.sizeof); (Of course, one also needs to check that the returned value is not null before using it.)void main() { enum count = 7; // Allocate some memory void* rawData = malloc(int.sizeof * count);If count is not equal to 8 I get weird results! The reason of course, is the free(): // [93947717336544, 1, 2, 3, 4, 5, 6]I didn't know free wrote into the freed buffer but since it's undefined behavior, we shouldn't even be able to know whether it did or not. :/[Here](https://run.dlang.io/is/yFmXwO) is my test code.You have the following loop there: foreach(i, ref s; slice) s = cast(int)i; Note that the assignment operator into a malloc'ed buffer works only for some types like the fundamental ones. Otherwise, the opAssign() of a user-defined type may be very unhappy with random or zero data of 'this' object. (Note that there really wouldn't be any user-defined type in the buffer; we would have just cast'ed it to fool ourselves.) The reason is, opAssign() may have to destroy existing members of 'this' during the assignment and destroying random or zero data may not work for a particular type. So, one would have to emplace() an object in that memory. Here is my explanation: http://ddili.org/ders/d.en/memory.html#ix_memory.construction,%20emplace Ok, this is too much theory and those are parts of the reasons why we don't generally use calloc or malloc. :) Ali
Jan 16 2022
On Sun, Jan 16, 2022 at 08:21:29AM -0800, Ali Çehreli via Digitalmars-d-learn wrote:On 1/16/22 07:32, Salih Dincer wrote:Correction: malloc() is not guaranteed to clear the allocated memory. Possibly for "performance reasons". Usually, though, I'd recommend using calloc() instead to initialize the allocated memory and prevent surprising results. (E.g., you allocate a buffer expecting it would be zeroed, but it isn't so the code that assumes it does produces garbled results. Or worse, if you're allocating memory for, e.g., a packet buffer to be transmitted across the network, failing to initialize it may leak the past contents of that memory to the internet. Cf. HeartBleed.) [...]On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is calloc() which returns memory filled with zeros.void main() { enum count = 7; // Allocate some memory void* rawData = malloc(int.sizeof * count);[...] This is likely caused by the implementation of malloc/free. Some implementations store pointers and other tracking information inside the free blocks themselves instead of using a separate data structure to track free memory (e.g., the start of the block may contain a pointer to the next available block). After calling free(), that memory is now a free block, so the implementation may start storing such information within the block. Some memory allocator implementations may also store a canary value at the beginning of freed blocks to detect double free's (i.e., a value unlikely to occur in user data that, if detected by free(), may indicate a bug in the code that's trying to free the same block twice). T -- Debian GNU/Linux: Cray on your desktop.If count is not equal to 8 I get weird results! The reason of course, is the free(): // [93947717336544, 1, 2, 3, 4, 5, 6]I didn't know free wrote into the freed buffer but since it's undefined behavior, we shouldn't even be able to know whether it did or not. :/
Jan 16 2022
On 1/16/22 17:51, H. S. Teoh wrote:Agreed. What I meant is, the programmer ordinarily clears it with memset().In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is calloc() which returns memory filled with zeros.Correction: malloc() is not guaranteed to clear the allocated memory.Yes but they do not (and cannot) put it inside what they returned. They store such information in the area right before the returned pointer.I didn't know free wrote into the freed buffer but since it's undefined behavior, we shouldn't even be able to know whether it did or not. :/[...] This is likely caused by the implementation of malloc/free. Some implementations store pointers and other tracking information inside the free blocks themselvesSome memory allocator implementations may also store a canary value at the beginning of freed blocks to detect double free'sI thought of that as well but I doubt they would go against the mentioned performance ideals. My current theory is that our writeln() calls after the free needed memory for the output strings and it got the last freed memory. (Assuming a common implementation of a free list where the next allocated memory is the one most recently freed.) Well... speculation... :)unlikely to occur in user data that, if detected by free()Yes, those are human-identifiable as well but I don't think it's related in this case because they look random values. (Speculation: A string's .ptr value. :) ) Ali
Jan 16 2022
On Sunday, 16 January 2022 at 15:32:41 UTC, Salih Dincer wrote:If count is not equal to 8 I get weird results! The reason of course, is the free(): // [93947717336544, 1, 2, 3, 4, 5, 6]I wonder if you're seeing something you're not suppose to as part of the internals; Typically malloc for 16bit allocation to my understanding worked a little different, where you had 2 bytes BEFORE the memory block which specified a size, and if it was negative then it was a size of free memory (*This allowed a 64k block to have multiple allocations and de-allocations*). So you might have: **64, [64 bytes of data], -1024, [1024 unallocated memory],32,[32 bytes of allocated data, etc]**. in the above if you deallocated the 64 byte block, it would just become -1090 (*merging the next free block, 1024+64+2*). This would also explain why double freeing would break since it would be referring to free size and throw a fit, or it was no longer a length malloc/free could use. Also malloc if it returns anything guarantees at least that size, you might ask for 7 bytes but get 16 and the others is just ignored. Is this how the 32/64bit malloc/free work? Not sure, it wouldn't be unreal for Java and others to pre-allocate say a megabyte and then quickly give/manage a smaller block and extending or getting another block later. Though I'm sure others here have given better/more exact on internals or how allocation works in D.
Jan 16 2022
On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:So, in all three examples it is the same D feature, a slice, that references data but the data is managed in different ways. AliWell, it's fair to say, that 'range-based programming' is kinda new to me. With this statement: int[][] mArr = iota(1, 9).chunks(2).map!array.array; - one would intuitively expect, that at the end, you end up with a dynamically allocated array, intialised with it's argument. That is, you would expect some GC allocation to be going on, similar to what would happen with this code: int[][] mArr2 = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation But it turns out: int[][] mArr = iota(1, 9).chunks(2).map!array.array; // no GC allocation going on at all, not anywhere. How can this be I asked myself? Then I watched this, and learn about 'memory disallocation'. http://dconf.org/2015/talks/bright.html Now I understand.. I think ;-)
Jan 16 2022
On 1/16/22 14:43, forkit wrote:Well, it's fair to say, that 'range-based programming' is kinda newto me. I hope it will be useful to you. :)int[][] mArr2 = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation But it turns out: int[][] mArr = iota(1, 9).chunks(2).map!array.array; // no GC allocation going on at all, not anywhere.That's not correct. There are many range algorithms that are lazy to defer memory allocation but array() is not one of those. array() does eagerly allocate memory, which is it's whole purpose: https://dlang.org/phobos/std_array.html#array "Allocates an array and initializes it with copies of the elements of range r." So, that range expression has the same allocations as your "GC allocation" line above.Then I watched this, and learn about 'memory disallocation'. http://dconf.org/2015/talks/bright.html Now I understand.. I think ;-)That applies to most other range algorithms. array() is an expensive one. :) Ali
Jan 16 2022
On Sunday, 16 January 2022 at 23:03:49 UTC, Ali Çehreli wrote:That's not correct. There are many range algorithms that are lazy to defer memory allocation but array() is not one of those. array() does eagerly allocate memory, which is it's whole purpose: https://dlang.org/phobos/std_array.html#array "Allocates an array and initializes it with copies of the elements of range r." So, that range expression has the same allocations as your "GC allocation" line above.Honestly, that is what I thought, and expected. However, when I compiled with: -profile=gc .. it indicated no GC allocation going on. That's really why I got confused - because, as you say, the array statement (as I understand it)will result in a dynamically allocated array (i.e. allocated on GC heap). But -profile=gc .. says no. It's not. int[][] mArr = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation occurs. int[][] mArr = iota(1, 9).chunks(2).map!array.array; // No GC allocation occurs.
Jan 16 2022
On 1/16/22 15:14, forkit wrote:But -profile=gc .. says no. It's not. int[][] mArr = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation occurs. int[][] mArr = iota(1, 9).chunks(2).map!array.array; // No GC allocation occurs.Definitely a -profile=gc bug. Here are the existing ones: https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc Ali
Jan 16 2022
On Sunday, 16 January 2022 at 23:34:41 UTC, Ali Çehreli wrote:Definitely a -profile=gc bug. Here are the existing ones: https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc Aliyeah, a bug makes more sense ... otherwise I really would have had a slice to data that doesn't exist anywhere. some of those reported bugs are pretty old. .. oh well... just another thing in D, that I can't rely on :-(
Jan 16 2022
On 1/16/22 7:54 PM, forkit wrote:On Sunday, 16 January 2022 at 23:34:41 UTC, Ali Çehreli wrote:I think what is happening is that phobos is using a direct call to a compiler hook, which bypasses the mechanism that recognizes it as a GC allocation. It appears that profile=gc doesn't recognize any non-language GC calls. Even `GC.malloc` doesn't identify any allocations. https://issues.dlang.org/show_bug.cgi?id=14892 -SteveDefinitely a -profile=gc bug. Here are the existing ones:  https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc Aliyeah, a bug makes more sense ... otherwise I really would have had a slice to data that doesn't exist anywhere. some of those reported bugs are pretty old. .. oh well... just another thing in D, that I can't rely on :-(
Jan 16 2022
On Monday, 17 January 2022 at 00:54:19 UTC, forkit wrote:..module test; import std; safe void main() { // mArr1 is a dynamic array allocated on the gc heap. int[][] mArr1 = [[1, 2], [3, 4], [5, 6], [7, 8]]; static assert(is(typeof(mArr1.array()))); alias R1 = typeof(mArr1); static assert(isRandomAccessRange!R1 && isForwardRange!R1 && isInputRange!R1 && isBidirectionalRange!R1 && hasAssignableElements!R1 && !isInfinite!R1 && hasSlicing!R1); // mArr2 is a dynamic array allocated on the gc heap // although compiling with '-profile=gc' incorrectly suggests otherwise. int[][] mArr2 = iota(1, 9).chunks(2).map!array.array; static assert(is(typeof(mArr2.array()))); alias R2 = typeof(mArr2); static assert(isRandomAccessRange!R2 && isForwardRange!R2 && isInputRange!R2 && isBidirectionalRange!R2 && hasAssignableElements!R2 && !isInfinite!R2 && hasSlicing!R2); static assert(is(typeof(mArr1) == typeof(mArr2))); } /+ if add nogc to above code: (8): Error: array literal in ` nogc` function `D main` may cause a GC allocation (22): Error: ` nogc` function `D main` cannot call non- nogc function `std.array.array!(MapResult!(array, Chunks!(Result))).array` +/
Jan 16 2022
On 1/16/22 9:42 PM, forkit wrote:On Monday, 17 January 2022 at 00:54:19 UTC, forkit wrote: // mArr2 is a dynamic array allocated on the gc heap // although compiling with '-profile=gc' incorrectly suggests otherwise.if add nogc to above code: (8): Error: array literal in ` nogc` function `D main` may cause a GC allocation (22): Error: ` nogc` function `D main` cannot call non- nogc function `std.array.array!(MapResult!(array, Chunks!(Result))).array`Technically, this is using 2 different mechanisms. nogc is a function attribute that means it can't call other functions not marked as nogc. This is actually a *compile-time* mechanism. A call to `array` may not actually allocate, but *might* allocate, and therefore it's statically not allowed to be called from a nogc function. The profile=gc appears to only show GC allocations that the *compiler* initiates (i.e. via `new`, array operations (like appending) or closure allocations). It does not detect that the functions that actually allocate memory themselves (such as `core.memory.GC.malloc`) are GC allocations. This actually does not care whether a function might or might not allocate, but records when it actually does allocate. -Steve
Jan 16 2022
On Monday, 17 January 2022 at 03:11:50 UTC, Steven Schveighoffer wrote:The profile=gc appears to only show GC allocations that the *compiler* initiates (i.e. via `new`, array operations (like appending) or closure allocations). It does not detect that the functions that actually allocate memory themselves (such as `core.memory.GC.malloc`) are GC allocations. This actually does not care whether a function might or might not allocate, but records when it actually does allocate. -Steveyes, that seems to be the case: // ----- module test; import std; safe void main() { int[][] mArr2; mArr2 ~= iota(1, 9).chunks(2).map!array.array; // profilegc.log created ok. } // ----
Jan 16 2022
```d import std; // If we summarize in code... void main() { // It's a dynamic array and its copy below: size_t[] arr = [1, 2, 3, 4, 5, 6]; auto arrCopy = arr.dup; // This is its lazy range: auto range = arr.chunks(2); typeid(range).writeln(": ", range); // But this is its copy (you'll see that later!) auto array = arr.chunks(2).map!array.array; // This is a series it's created from slices: auto slices =[arr[0..2], arr[2..4], arr[4..6]]; typeid(slices).writeln(": ", slices); // Equal for fleeting moment: assert(array == slices); // Now, the datasource is sorted but (!) // All copies will not be affected! arrCopy.writeln(" => ", arr.sort!"a>b"); // and now they are not equal! Because, // slices were affected by the sort process, // the range too... assert(array != slices); // Taaata, magic... // Your eyes don't surprise you! typeid(range).writeln(": ", range); typeid(slices).writeln(": ", slices); } /* CONSOLE OUT: std.range.Chunks!(ulong[]).Chunks: [[1, 2], [3, 4], [5, 6]] ulong[][]: [[1, 2], [3, 4], [5, 6]] [1, 2, 3, 4, 5, 6] => [6, 5, 4, 3, 2, 1] std.range.Chunks!(ulong[]).Chunks: [[6, 5], [4, 3], [2, 1]] ulong[][]: [[6, 5], [4, 3], [2, 1]] */ ```
Jan 16 2022
On Monday, 17 January 2022 at 01:06:05 UTC, Salih Dincer wrote:```d // Taaata, magic... // Your eyes don't surprise you! typeid(range).writeln(": ", range); typeid(slices).writeln(": ", slices); ```In fact, although range and slice seem to be equal to each other, they are not! Slice points directly to the array source, but range is not... Because range never has slices showing the array source. But it has criteria that make it effective. and when called, it processes those criteria (uses CPU power) and does the same thing each time. Range does the same thing every time, lazily. Maybe it's not lazy, because range is operated when necessary and always obeys. Ranges obey even if they are lazy. 😀 Good weeks
Jan 16 2022