digitalmars.D.learn - interfacing with C: strings and byte vectors
- yawniek (25/25) Jun 11 2016 my C library works a lot with strings defined in C as:
- Mike Parker (43/57) Jun 11 2016 No, you've got the right idea with the constructor. You just need
my C library works a lot with strings defined in C as:

    struct vec_t {
        char *base;
        size_t len;
    }

is there an easy way to feed regular D strings to functions that accept vec_t* without creating a vec_t every time, or do i write wrappers for these functions? and if so, what is the most elegant way to build them?

so far i defined vec_t as:

    struct vec_t {
        char *base;
        size_t len;

        this(string s) { base = s.ptr; len = s.length; }

        nothrow @nogc inout(char)[] toString() inout @property { return base[0 .. len]; }

        nothrow @nogc @property const(char)[] toSlice() { return cast(string) base[0 .. len]; }

        alias toString this;
    }

but i guess there is a more elegant way?!
Jun 11 2016
On Saturday, 11 June 2016 at 09:32:54 UTC, yawniek wrote:
> so far i defined vec_t as:
>
>     struct vec_t {
>         char *base;
>         size_t len;
>
>         this(string s) { base = s.ptr; len = s.length; }
>
>         nothrow @nogc inout(char)[] toString() inout @property { return base[0 .. len]; }
>
>         nothrow @nogc @property const(char)[] toSlice() { return cast(string) base[0 .. len]; }
>
>         alias toString this;
>     }
>
> but i guess there is a more elegant way?!

No, you've got the right idea with the constructor. You just need to change your implementation.

You have two big problems with your current constructor. First, only string literals in D (e.g. "foo") are guaranteed to be null terminated. If you construct a vec_t with a string that is not a literal, then you can have a problem on the C side where it is expected to be null terminated. Second, if the string you pass to the constructor was allocated on the GC heap and at some point is collected, then the base member of the vec_t will point to an invalid memory location. A simple approach would be:

    this(string s) {
        import std.string : toStringz;
        // toStringz returns immutable(char)*, so storing it in a char* needs a cast.
        // The C side must not write through this pointer.
        base = cast(char*) s.toStringz();
        len = s.length;
    }

Beyond the constructor, your toString and toSlice implementations are potentially problematic. Aside from the fact that toString is returning a slice and toSlice is returning a string, the slice you create there will always point to the same location as base. If base becomes invalid, then so will the slice. For the toString implementation, you really should be doing something like this:

    string toString() nothrow inout {
        import std.conv : to;
        return to!string(base);
    }

I don't know what benefit you are expecting from the toSlice method. If you absolutely need it, you should implement it like so:

    char[] toSlice() nothrow inout {
        return base[0 .. len].dup;
    }

This will give you a character array that is modifiable without worrying about what happens to base. If you really want, you can declare the return type as const(char)[].
Rather than toSlice, which isn't something the compiler is aware of, it would be much more appropriate to implement opSlice along with opDollar. Then it's possible to do this: auto slice = myVec[0 .. $]. Of course, if you don't want to allow arbitrary slices, then perhaps toSlice is better.
Jun 11 2016
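The opSlice/opDollar suggestion in the post above could be sketched roughly as follows. This is only an illustration, not code from the thread: the struct name Vec is hypothetical, and the slices returned here are views into the same memory as base (no copy), so they stay valid only as long as base does.

```d
// Hypothetical stand-in for the C struct; the name Vec is not from the thread.
struct Vec
{
    char* base;
    size_t len;

    // Lets `$` inside v[...] evaluate to len.
    size_t opDollar() const pure nothrow @nogc { return len; }

    // v[lo .. hi]: a checked view into the same memory (no copy is made).
    inout(char)[] opSlice(size_t lo, size_t hi) inout pure nothrow @nogc
    {
        assert(lo <= hi && hi <= len, "slice out of range");
        return base[lo .. hi];
    }

    // v[]: the whole buffer.
    inout(char)[] opSlice() inout pure nothrow @nogc { return base[0 .. len]; }
}

unittest
{
    char[] buf = "hello world".dup; // mutable, GC-owned copy
    auto v = Vec(buf.ptr, buf.length);
    assert(v[0 .. 5] == "hello");
    assert(v[6 .. $] == "world");
    assert(v[].length == 11);
}
```

If independent copies are wanted instead of views, appending .dup to the returned slice at the call site would give the behaviour described for toSlice above.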
On Saturday, 11 June 2016 at 10:26:17 UTC, Mike Parker wrote:
> On Saturday, 11 June 2016 at 09:32:54 UTC, yawniek wrote:

thanks mike for the in depth answer. i forgot to add a few important points:

- the strings in vec_t are not c strings
- vec_t might contain other data than strings

the original ctor i pasted actually doesn't even work. temporarily i solved it like this:

    this(string s) {
        char[] si = cast(char[]) s; // i'm scared
        base = si.ptr;
        len = si.length;
    }

is there a better solution than to fearlessly cast away immutability? i guess i could just make a second vec_t that has immutable base and length that can be used in D to stay clean, would that be worth anything?

now what i still don't have a proper idea for is how i can create wrappers for the methods accepting vec_t in a clean way. for the vec_t that are allocated in D-land we can state that the C libs will not modify the data. there are a lot of functions that accept vec_t. is there no way to have strings auto cast to vec_t?

another way i see is a UDA that generates the wrapper function(s). other ideas?
Jun 11 2016
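The "second vec_t" idea from the post above might look something like the sketch below. The name const_vec_t is hypothetical, and the sketch uses const(char)* rather than immutable so that both string and char[] are accepted without a cast. It relies on the assumption stated in the thread that the C library does not modify data allocated on the D side.

```d
// Same layout as the C struct: a pointer followed by a size_t.
struct vec_t
{
    char* base;
    size_t len;
}

// Hypothetical read-only twin; the name const_vec_t is not from the thread.
struct const_vec_t
{
    const(char)* base;
    size_t len;

    // Accepts string, const(char)[] and char[] alike; no cast is involved.
    this(const(char)[] s) pure nothrow @nogc
    {
        base = s.ptr;
        len = s.length;
    }

    // Read-only view of the same memory.
    const(char)[] toSlice() const pure nothrow @nogc { return base[0 .. len]; }
}

// Both structs must agree in layout for the C side to read them identically.
static assert(const_vec_t.sizeof == vec_t.sizeof);
static assert(const_vec_t.base.offsetof == vec_t.base.offsetof);
static assert(const_vec_t.len.offsetof == vec_t.len.offsetof);

unittest
{
    string s = "abc";
    auto v = const_vec_t(s);
    assert(v.len == 3);
    assert(v.toSlice() == "abc");

    char[] m = "xyz".dup;
    auto w = const_vec_t(m);
    assert(w.base is m.ptr); // points at the caller's data, nothing is copied
}
```

An extern(C) declaration for a function that only reads its argument could then take a const(const_vec_t)* parameter. That is safe only for functions that really never write through base and never keep the pointer after returning.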
On 06/11/2016 01:59 PM, yawniek wrote:
> i forgot to add a few important points:
>
> - the strings in vec_t are not c strings
> - vec_t might contain other data than strings
>
> the original ctor i pasted actually doesn't even work. temporarily i solved it like this:
>
>     this(string s) {
>         char[] si = cast(char[]) s; // i'm scared

Rightfully so. Casting away immutable shouldn't be done lightly.

>         base = si.ptr;
>         len = si.length;
>     }
>
> is there a better solution than to fearlessly cast away immutability? i guess i could just make a second vec_t that has immutable base and length that can be used in D to stay clean, would that be worth anything?

Sure. You wouldn't have to do the risky cast then.

> now what i still don't have a proper idea for is how i can create wrappers for the methods accepting vec_t in a clean way. for the vec_t that are allocated in D-land we can state that the C libs will not modify the data. there are a lot of functions that accept vec_t. is there no way to have strings auto cast to vec_t?

I don't think so, no. You can make a new type and use alias this to have it convert implicitly to vec_t, but you can't enable that for string. There is no feature to implicitly convert *from* another type, only *to* another type.

> another way i see is a UDA that generates the wrapper function(s). other ideas?

Generating the wrappers seems ok to me. Regarding a UDA: I haven't really used them, but you'd still have to iterate over all functions that have the attribute, right? At that point, the UDA might be pointless. You can also iterate over everything and check if there's a vec_t parameter.
Jun 11 2016
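One way to picture the "check if there's a vec_t parameter" approach from the reply above is a generic forwarding template. This is only a sketch under several assumptions: the names viaVec and count_byte are hypothetical, count_byte is a D stand-in for what would be an extern(C) declaration of a real library function, and vec_t is declared here with const(char)* on the thread's premise that the C side does not modify D-allocated data.

```d
import std.traits : Parameters;

// Declared with const(char)* on the assumption that C only reads the data.
struct vec_t
{
    const(char)* base;
    size_t len;
}

// Hypothetical stand-in for a C library function taking a vec_t*.
extern(C) size_t count_byte(vec_t* v, char needle) nothrow @nogc
{
    size_t n = 0;
    foreach (c; v.base[0 .. v.len])
        if (c == needle) ++n;
    return n;
}

// Calls fn, replacing each slice argument with a pointer to a temporary
// vec_t wherever fn expects a vec_t*. Other arguments are passed through.
auto viaVec(alias fn, Args...)(Args args)
{
    alias P = Parameters!fn;
    static assert(Args.length == P.length, "wrong number of arguments");

    vec_t[P.length] tmp; // one slot per parameter; only vec_t* slots are used
    P conv;              // the converted argument list

    foreach (i, T; P)    // unrolled at compile time over the parameter types
    {
        static if (is(T == vec_t*))
        {
            static assert(is(Args[i] : const(char)[]), "expected a char slice");
            tmp[i] = vec_t(args[i].ptr, args[i].length);
            conv[i] = &tmp[i];
        }
        else
            conv[i] = args[i];
    }
    return fn(conv);
}

unittest
{
    assert(viaVec!count_byte("a b c", ' ') == 2);

    char[] dyn = "xx-x".dup;
    assert(viaVec!count_byte(dyn, 'x') == 3);
}
```

The temporary vec_t lives on the wrapper's stack, so this is only appropriate for C functions that do not keep the pointer after returning. Applying the same check across every function in a bindings module, for instance by looping over __traits(allMembers, ...), would be a further step that is not shown here.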