
digitalmars.D.learn - Slicing betterC

Oleksii <al.skidan gmail.com> writes:
Hi folks,

Could you please share your wisdom with me? I wonder why the 
following code:
```
import core.stdc.stdlib;

Foo[] pool;
Foo[] foos;

auto buff = cast(Foo*) malloc(Foo.sizeof * 10);
pool = buff[0 .. 10];
foos = pool[0 .. 0 ];

// Now let's allocate a Foo:
Foo* allocatedFoo;
if (foos.length < foos.capacity) {    // <= Error: TypeInfo cannot be used with -betterC
   allocatedFoo = foos[0 .. $ + 1];    // <= Error: TypeInfo cannot be used with -betterC
}
```
fails to compile because of `foos.capacity` and `foos[0 .. $ + 1]`.
Why do these two innocent-looking expressions require TypeInfo?
Aren't slices basically fat pointers with internal structure that
looks like this:
```
struct Slice(T) {
   size_t capacity;
   size_t size;
   T*     memory;
}
```
?

It's weird that `TypeInfo` (being a run-time and reflection 
specific thing) is required in this particular case. Shouldn't 
static type checking be enough for all that?

Thanks in advance,
--
Oleksii
Sep 06 2018
Oleksii <al.skidan gmail.com> writes:
On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:

   allocatedFoo = foos[0 .. $ + 1];    // <= Error: TypeInfo
This line was meant to be `allocatedFoo = foos[$]`. Sorry about that.
Sep 06 2018
Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:
 struct Slice(T) {
   size_t capacity;
   size_t size;
   T*     memory;
 }
There's no capacity in the slice; that is stored as part of the GC block, which it looks up with the help of RTTI, thus the TypeInfo reference.

Slices *just* know their size and their memory pointer. They don't know how they were allocated and don't know what's beyond their bounds or how to grow their bounds. This needs to be managed elsewhere.

If you malloc a slice in regular D, the capacity will be returned as 0 - the GC doesn't know anything about it. Any attempt to append to it will allocate a whole new block.

In -betterC, there is no GC to look up at all, and thus it has nowhere to look. You'll have to make your own struct that stores capacity if you need it.

I like to do something like

    struct MyArray(T) {
        T* rawPointer;
        int capacity;
        int currentLength;

        // most user interaction will occur through this
        T[] opSlice() { return rawPointer[0 .. currentLength]; }

        // fill in other operators as needed
    }
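
The capacity-tracking struct described above could be fleshed out roughly as follows. This is only a minimal sketch and not something from druntime or Phobos: the `put` and `release` members are hypothetical names chosen for illustration, the doubling growth policy is an assumption, and the code assumes `T` is a plain data type with no destructor or postblit. It should compile with `-betterC`, since it only relies on the C runtime.

```d
import core.stdc.stdio : printf;
import core.stdc.stdlib : free, realloc;

// Hypothetical helper type for illustration; assumes T is plain data.
struct MyArray(T)
{
    T* rawPointer;
    size_t capacity;
    size_t currentLength;

    // Most user interaction goes through this slice view.
    T[] opSlice() { return rawPointer[0 .. currentLength]; }

    // Appends one element, growing the buffer with realloc when it is full.
    // Returns false if the allocation fails; the array is left unchanged then.
    bool put(T value)
    {
        if (currentLength == capacity)
        {
            // Assumed growth policy: start at 4 elements, then double.
            immutable size_t newCapacity = capacity ? capacity * 2 : 4;
            auto p = cast(T*) realloc(rawPointer, newCapacity * T.sizeof);
            if (p is null)
                return false;
            rawPointer = p;
            capacity = newCapacity;
        }
        rawPointer[currentLength++] = value;
        return true;
    }

    // Frees the buffer and resets the array to its empty state.
    void release()
    {
        free(rawPointer); // free(null) is a no-op
        rawPointer = null;
        capacity = 0;
        currentLength = 0;
    }
}

extern (C) int main()
{
    MyArray!int arr;
    foreach (i; 0 .. 10)
    {
        if (!arr.put(i))
        {
            arr.release();
            return 1; // allocation failed
        }
    }

    int sum = 0;
    foreach (x; arr[])
        sum += x;

    printf("length=%d capacity=%d sum=%d\n",
           cast(int) arr.currentLength, cast(int) arr.capacity, sum);
    arr.release();
    return 0;
}
```

The point of the design is that the capacity lives in the struct itself, so nothing has to be looked up at run time, and `arr[]` still hands out an ordinary `T[]` that the rest of the code can slice and index as usual.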
Sep 06 2018
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, September 6, 2018 11:34:18 AM MDT Adam D. Ruppe via 
Digitalmars-d-learn wrote:
 On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:
 struct Slice(T) {

   size_t capacity;
   size_t size;
   T*     memory;

 }
To try to make this even clearer, a dynamic array looks basically like this underneath the hood:

    struct DynamicArray(T)
    {
        size_t length;
        T* ptr;
    }

IIRC, it actually uses void* unfortunately, but that struct is basically what you get. Notice that _all_ of the information that's there is the pointer and the length. That's it. If you understand the semantics of what happens when passing that struct around, you'll understand the semantics of passing around dynamic arrays. And all of the operations that would have anything to do with memory management involve the GC - capacity, ~, ~=, etc. all require the GC.

If you're not using -betterC, the fact that the dynamic array was allocated with malloc is pretty irrelevant, since all of those operations will function exactly the same as if the dynamic array were allocated by the GC. It's just that because the dynamic array is not GC-allocated, it's guaranteed that the capacity is 0, and therefore any operations that would increase the array's length then require reallocating the dynamic array with the GC, whereas if it were already GC-allocated, then its capacity might have been greater than its length, in which case, reallocation would not be required.

If you haven't read it already, I would suggest reading this article:

https://dlang.org/articles/d-array-article.html

It does not use the official terminology, but in spite of that, it should really help clarify things for you. The article refers to T[] as being a slice (which is accurate, since it is a slice of memory), but it incorrectly refers to the memory buffer itself as being the dynamic array, whereas the language spec considers the T[] (the struct shown above) to be the dynamic array. The language does not have a specific name for that memory buffer, and it considers a T[] to be a dynamic array regardless of what memory it refers to.

So, you should keep that in mind when reading the article, but the concepts that it teaches are very much correct and should help a great deal in understanding how dynamic arrays work in D.

- Jonathan M Davis
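
The behaviour described above can be checked with a small program along these lines. It is only a sketch and has to be built as regular D (without `-betterC`), because `capacity` and `~=` need the GC; `int` stands in for the `Foo` type from the original question.

```d
import core.stdc.stdlib : free, malloc;

void main()
{
    // A dynamic array is just a length plus a pointer; no capacity field.
    static assert((int[]).sizeof == size_t.sizeof + (int*).sizeof);

    auto buff = cast(int*) malloc(int.sizeof * 10);
    assert(buff !is null);
    scope (exit) free(buff);

    int[] pool = buff[0 .. 10];
    int[] foos = pool[0 .. 0];

    assert(foos.ptr is buff);
    assert(foos.length == 0);

    // The GC does not own the malloc'd block, so it reports no capacity.
    assert(foos.capacity == 0);

    // Appending cannot grow in place: the GC allocates a new block,
    // and foos no longer points into the malloc'd buffer.
    foos ~= 42;
    assert(foos.ptr !is buff);
    assert(foos.length == 1);
    assert(foos[0] == 42);
    assert(foos.capacity >= 1);
}
```

After the append, `pool` still refers to the malloc'd memory while `foos` refers to GC memory, which is why mixing `~=` with manually managed buffers tends to be a source of surprises.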
Sep 06 2018