digitalmars.D.learn - C interface provides a pointer and a length... wrap without copying?

reply cy <dlang verge.info.tm> writes:
So a lovely C library does its own opaque allocation, and 
provides access to the malloc'd memory, and that memory's length. 
Instead of copying the results into garbage collected memory 
(which would probably be smart) I was thinking about creating a 
structure like:

struct WrappedString {
   byte* ptr;
   size_t length;
}

And then implementing opIndex for it, and opEquals for all the 
different string types, and conversions to those types, and then 
it occurred to me that this sounds like a lot of work. Has 
anybody done this already? Made a pointer/length pair, that acts 
like a string?
Mar 11 2017
next sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
cy wrote:

 So a lovely C library does its own opaque allocation, and provides access 
 to the malloc'd memory, and that memory's length. Instead of copying the 
 results into garbage collected memory (which would probably be smart) I 
 was thinking about creating a structure like:

 struct WrappedString {
    byte* ptr;
    size_t length;
 }

 And then implementing opIndex for it, and opEquals for all the different 
 string types, and conversions to those types, and then it occurred to me 
 that this sounds like a lot of work. Has anybody done this already? Made 
 a pointer/length pair, that acts like a string?
yep, it was done before.

 int* a = cast(int*)malloc(1024 * int.sizeof); // room for 1024 ints, not just 1024 bytes
 auto b = a[0..1024]; // yay, b is just an ordinary slice now!
Mar 11 2017
prev sibling parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 11 March 2017 at 22:39:02 UTC, cy wrote:
 So a lovely C library does its own opaque allocation, and 
 provides access to the malloc'd memory, and that memory's 
 length. Instead of copying the results into garbage collected 
 memory (which would probably be smart) I was thinking about 
 creating a structure like:

 struct WrappedString {
   byte* ptr;
   size_t length;
 }

 And then implementing opIndex for it, and opEquals for all the 
 different string types, and conversions to those types, and 
 then it occurred to me that this sounds like a lot of work. Has 
 anybody done this already? Made a pointer/length pair, that 
 acts like a string?
A string *is* a pointer length pair, an immutable(char)[]. Your `WrappedString` is effectively a byte[]. All you need to do is:

 ubyte[] arr; // or byte/char whatever is the pointed to type returned by giveMeTheMemory
 arr = giveMeTheMemory()[0 .. getMeTheLength()];

No need to reimplement anything.
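
A minimal sketch of that suggestion follows. The two functions reuse the names from the post, but their bodies are hypothetical stand-ins for the C library (here they simply malloc a buffer and report a fixed length), and the cast to text assumes the library really returns UTF-8.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Assumption: stand-in data for whatever the C library would produce.
private immutable string payload = "hello from C";

// Hypothetical stand-in for the C call that hands out malloc'd memory.
ubyte* giveMeTheMemory() @system nothrow @nogc
{
    auto p = cast(ubyte*) malloc(payload.length);
    assert(p !is null, "malloc failed");
    memcpy(p, payload.ptr, payload.length);
    return p;
}

// Hypothetical stand-in for the C call that reports the buffer's length.
size_t getMeTheLength() @system nothrow @nogc
{
    return payload.length;
}

unittest
{
    ubyte* raw = giveMeTheMemory();
    scope (exit) free(raw);              // the malloc'd memory is still ours to release

    // Slicing the pointer yields an ordinary ubyte[]; nothing is copied.
    ubyte[] arr = raw[0 .. getMeTheLength()];
    assert(arr.ptr is raw);

    // Reinterpret the bytes as text (only valid if they are UTF-8).
    auto text = cast(const(char)[]) arr;
    assert(text == "hello from C");      // comparison with a string just works
    assert(text[0] == 'h');              // so does indexing
}
```

Compiling with `dmd -unittest -main` runs the check. The slice is only a view, so it must not be used after the `free`.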
Mar 11 2017
parent reply cy <dlang verge.info.tm> writes:
On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].
Yes, but surely there's some silly requirement, like that the pointer must only ever point to garbage collected memory, or something?
 ubyte[] arr; // or byte/char whatever is the pointed to type 
 returned by giveMeTheMemory
 arr = giveMeTheMemory()[0 .. getMeTheLength()];
...guess not! :D
Mar 11 2017
next sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
cy wrote:

 On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].
Yes, but surely there's some silly requirement, like that the pointer must only ever point to garbage collected memory, or something?
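why should it? a slice can point anywhere, and the GC is smart enough to know which memory it owns and which it doesn't. if you try to append something to a slice of non-GC-owned memory, the runtime will copy the data into a GC-allocated block first, and the slice will point at that copy from then on. so the only requirement is: "don't use `~=` on it if you don't want to have a memory leak" (the slice no longer refers to the malloc'd block, so it is easy to lose the pointer you need to free).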
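
The copy-on-append behaviour can be observed directly. This is a small sketch, not library code: `appendMoves` is a hypothetical helper introduced only for illustration.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: appends `value` and reports whether the slice's
// data was relocated by the append.
bool appendMoves(ref int[] slice, int value)
{
    auto before = slice.ptr;
    slice ~= value;
    return slice.ptr !is before;
}

unittest
{
    enum n = 4;
    auto raw = cast(int*) malloc(n * int.sizeof);
    assert(raw !is null);
    scope (exit) free(raw);        // keep the raw pointer so the block can be freed

    int[] s = raw[0 .. n];
    s[] = 7;

    assert(appendMoves(s, 8));     // the elements were copied into a GC block
    assert(s.length == n + 1);
    assert(s[0] == 7 && s[n] == 8);

    s[0] = 1;                      // writes go to the GC copy now...
    assert(raw[0] == 7);           // ...so the malloc'd buffer is unchanged
}
```

Keeping `raw` around separately is what avoids the leak: after the append, `s` no longer gives any way back to the malloc'd block.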
Mar 11 2017
prev sibling parent Jonathan M Davis via Digitalmars-d-learn writes:
On Sunday, March 12, 2017 02:47:19 cy via Digitalmars-d-learn wrote:
 On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].
Yes, but surely there's some silly requirement, like that the pointer must only ever point to garbage collected memory, or something?
No. A dynamic array is just a struct with a pointer and a length. Aside from avoiding accessing memory that is no longer valid or avoiding allocations, it doesn't matter one whit what memory it points to.

 char[5] a;
 char[] b = a;

is perfectly valid, as is slicing memory that comes from malloc or some other crazy place.

Most dynamic array operations don't even need to care about what memory they refer to. If you access an element, it does the math and dereferences it like you'd get with naked pointer arithmetic. The difference is that it also does bounds checking for you, because it knows the length. If you slice the array, then you just get an array with an adjusted pointer and/or length.

The only times that the GC gets involved are when doing anything involving appending. If you call capacity, reserve, or ~=, then the GC gets involved. In those cases, the GC looks at the pointer to determine whether it can append. If it finds that it's GC-allocated memory, then it will look to see what room there is in the GC-allocated block after the array. If you're calling capacity, it will just return how large the array can grow to without being reallocated. If you're calling reserve or ~=, it checks to see whether the capacity is great enough to grow into that space. If not, it will allocate a new block of memory, copy the array's elements into that, and set the array to point to it. In the case of ~=, it will also grow the array into that memory and put the new elements in it, whereas in the case of reserve, it just does the reallocation. If there was enough space, then ~= will just expand the array into that space without reallocating, and reserve will do nothing.

If you have a dynamic array that was not allocated by the GC (be it a slice of a static array or malloc-ed memory or whatever), then its capacity will be 0. So, capacity will tell you 0, and ~= and reserve will always result in the array being reallocated.
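
The capacity rules described above can be checked with a short sketch. `hasAppendCapacity` is a hypothetical helper added for illustration; `capacity` and `reserve` are the standard druntime array properties.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: true if the runtime reports room to append in place.
bool hasAppendCapacity(T)(T[] arr)
{
    return arr.capacity != 0;
}

unittest
{
    // A slice of a static array: same memory, capacity 0.
    char[5] a = "hello";
    char[] b = a;
    assert(b.ptr is a.ptr);
    assert(!hasAppendCapacity(b));

    // A slice of malloc'd memory: capacity 0 as well.
    auto raw = cast(int*) malloc(8 * int.sizeof);
    assert(raw !is null);
    scope (exit) free(raw);
    int[] m = raw[0 .. 8];
    m[] = 0;
    assert(!hasAppendCapacity(m));

    // Slicing only adjusts the pointer and the length.
    int[] tail = m[2 .. 5];
    assert(tail.ptr is (raw + 2) && tail.length == 3);

    // A GC-allocated array reports at least its own length as capacity.
    int[] g = new int[](8);
    assert(hasAppendCapacity(g) && g.capacity >= g.length);

    // reserve on the malloc'd slice moves it into a GC-allocated block.
    m.reserve(16);
    assert(m.ptr !is raw && m.length == 8);
    assert(m.capacity >= 16);
}
```

Note that a capacity of 0 does not prove the memory is non-GC: a slice that stops short of the end of the used part of a GC block also reports 0.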
I would suggest that you read this excellent article:

http://dlang.org/d-array-article.html

though I would point out that it uses the wrong terminology in that it refers to the GC-allocated buffer as the dynamic array rather than T[], and it refers to T[] as a slice. The official terminology is that T[] is the dynamic array and that the buffer doesn't have an official name. And while T[] _is_ a slice of memory (assuming that it's not null), slice refers to a lot of other stuff in D (e.g. slicing a container gives you a range over that container, so it's a slice of the container, but it's not T[]).

Your use case here is a perfect example of why T[] is the dynamic array and not the GC-allocated buffer. Dynamic arrays simply don't care about the memory that they refer to, because they don't manage their own memory. ~=, reserve, and capacity do care about what memory a dynamic array refers to, but they work exactly the same way regardless of what the dynamic array refers to. It's just a question of when reallocations do or do not occur.

The big concern with slicing static arrays or malloc-ed memory to get a dynamic array is that it's then up to you to ensure that that dynamic array does not outlive the memory that it refers to. So, there is some danger there, but that's no different from operating on raw pointers, and operating on dynamic arrays gives better safety thanks to bounds-checking. It also works with appending, though that will cause the dynamic array to then refer to GC-allocated memory instead of the original memory.

- Jonathan M Davis
Mar 11 2017
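
When the data has to outlive the C library's buffer, the copy that the original question wanted to avoid becomes necessary after all. The following is a minimal sketch of that trade-off; `toOwnedString` is a hypothetical helper, and the malloc'd buffer merely stands in for memory owned by the C side.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical helper: copies a borrowed C buffer into GC-owned,
// immutable memory so the result can outlive the original.
string toOwnedString(const(char)* ptr, size_t length)
{
    return ptr[0 .. length].idup;
}

unittest
{
    immutable string msg = "temporary";

    // Stand-in for a buffer that the C library allocated.
    auto raw = cast(char*) malloc(msg.length);
    assert(raw !is null);
    memcpy(raw, msg.ptr, msg.length);

    const(char)[] borrowed = raw[0 .. msg.length]; // a view, no copy
    assert(borrowed == msg);

    string owned = toOwnedString(raw, msg.length); // an independent copy

    free(raw);
    // `borrowed` dangles from here on and must not be read again.
    assert(owned == "temporary");                  // the copy is still valid
}
```

Borrowing is free but ties the slice's lifetime to the C buffer; copying with `.idup` costs an allocation but removes that constraint.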