digitalmars.D.learn - Is there a way to slice a non-array type in @safe?

reply Stefanos Baziotis <sdi1600105 di.uoa.gr> writes:
I searched the forum but did not find anything.

I want to do this:

int foo(T)(ref T s1, ref T s2)
{
     const byte[] s1b = (cast(const(byte)*)&s1)[0 .. T.sizeof];
     const byte[] s2b = (cast(const(byte)*)&s2)[0 .. T.sizeof];
}

The goal is to create a byte array from the bytes of the given value,
no matter the type. The above works, but it's not @safe.

Thanks,
Stefanos
Jul 11 2019
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 11 July 2019 at 16:31:58 UTC, Stefanos Baziotis 
wrote:
> I searched the forum but did not find anything.
>
> I want to do this:
>
> int foo(T)(ref T s1, ref T s2)
> {
>      const byte[] s1b = (cast(const(byte)*)&s1)[0 .. T.sizeof];
>      const byte[] s2b = (cast(const(byte)*)&s2)[0 .. T.sizeof];
> }
>
> The goal is to create a byte array from the bytes of the given value,
> no matter the type. The above works, but it's not @safe.
>
> Thanks,
> Stefanos

Casting from one type of pointer to another and slicing a pointer are both @system, by design. What's the actual problem you're trying to solve? There may be a different way to do it that's @safe.
Jul 11 2019
next sibling parent reply Stefanos Baziotis <sdi1600105 di.uoa.gr> writes:
On Thursday, 11 July 2019 at 18:46:57 UTC, Paul Backus wrote:
> Casting from one type of pointer to another and slicing a
> pointer are both @system, by design.

Yes, I'm aware, there are no pointers in the code. The pointer was used here because it was the only way to solve the problem (but not in @safe).

> What's the actual problem you're trying to solve? There may be
> a different way to do it that's @safe.

I want to make an array of bytes that has the bytes of the value passed. For example, if T = int, then I want an array of 4 bytes that has the 4 individual bytes of `s1` let's say. For long, an array of 8 bytes etc. Ideally, that would work with `ref` (i.e. the bytes of where the ref points to).
Jul 11 2019
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 11 July 2019 at 19:35:50 UTC, Stefanos Baziotis 
wrote:
> On Thursday, 11 July 2019 at 18:46:57 UTC, Paul Backus wrote:
>> Casting from one type of pointer to another and slicing a
>> pointer are both @system, by design.
> Yes, I'm aware, there are no pointers in the code. The pointer was used here because it was the only way to solve the problem (but not in @safe).
>> What's the actual problem you're trying to solve? There may be
>> a different way to do it that's @safe.
> I want to make an array of bytes that has the bytes of the value passed. For example, if T = int, then I want an array of 4 bytes that has the 4 individual bytes of `s1` let's say. For long, an array of 8 bytes etc. Ideally, that would work with `ref` (i.e. the bytes of where the ref points to).

IMHO this cannot be @safe, on first principles. You gain access to the machine representation of a variable, which means you bypass the "control" the compiler has over its data. The endianness issue alone is enough to give your program different behaviour on different implementations. While in practice big endian is nearly an extinct species, it is still enough to show why that operation is inherently @system and should not be considered @safe. Of course, a @trusted function can be written to take care of that, but that is in fact exactly how it should be.
Jul 12 2019
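
A minimal sketch of the endianness point above, not taken from the thread. For basic scalar types such as integers, Phobos' std.bitmanip can produce the bytes of a value in a chosen byte order from @safe code, because it returns a static array by value rather than a slice into the variable. It does not cover arbitrary T, so it is at best a partial answer to the original question.

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;

void main() @safe
{
    uint value = 0x01020304;

    // Both calls return ubyte[uint.sizeof] by value, so no pointer
    // into `value` is created or escapes.
    ubyte[4] le = nativeToLittleEndian(value);
    ubyte[4] be = nativeToBigEndian(value);

    ubyte[4] expectedLE = [0x04, 0x03, 0x02, 0x01];
    ubyte[4] expectedBE = [0x01, 0x02, 0x03, 0x04];

    // The byte order is fixed by the function chosen, not by the host CPU.
    assert(le == expectedLE);
    assert(be == expectedBE);
}
```

Picking an explicit byte order keeps the result identical on little-endian and big-endian machines, which a raw pointer cast cannot promise.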
prev sibling parent Stefanos Baziotis <sdi1600105 di.uoa.gr> writes:
On Thursday, 11 July 2019 at 18:46:57 UTC, Paul Backus wrote:
> What's the actual problem you're trying to solve? There may be
> a different way to do it that's @safe.

I hope I answered the "what". In case the "why" helps too, it is because I'm implementing memcmp().
Jul 11 2019
prev sibling parent reply Nathan S. <no.public.email example.com> writes:
On Thursday, 11 July 2019 at 16:31:58 UTC, Stefanos Baziotis 
wrote:
> I searched the forum but did not find anything.
>
> I want to do this:
>
> int foo(T)(ref T s1, ref T s2)
> {
>      const byte[] s1b = (cast(const(byte)*)&s1)[0 .. T.sizeof];
>      const byte[] s2b = (cast(const(byte)*)&s2)[0 .. T.sizeof];
> }
>
> The goal is to create a byte array from the bytes of the given value,
> no matter the type. The above works, but it's not @safe.
>
> Thanks,
> Stefanos

If you know that what you're doing cannot result in memory corruption but the compiler cannot automatically infer @safe, it is appropriate to use @trusted. (For this case make sure you're not returning the byte slices, since if the arguments were allocated on the stack you could end up with a pointer to an invalid stack frame. If it's the caller's responsibility to ensure the slice doesn't outlive the struct then it is the caller that should be @trusted or not.)
Jul 11 2019
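
The @trusted advice above can be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the name `compareBytes` is hypothetical, and the function is marked @trusted on the programmer's promise that the slices stay local and are never returned.

```d
// Hypothetical helper: compares the raw bytes of two values, memcmp-style.
// @trusted is the programmer's promise; the compiler does not verify it.
int compareBytes(T)(ref const T s1, ref const T s2) @trusted
{
    // The pointer cast and the pointer slicing are the @system operations.
    // Both slices are locals, so they cannot outlive s1 and s2.
    const(ubyte)[] a = (cast(const(ubyte)*) &s1)[0 .. T.sizeof];
    const(ubyte)[] b = (cast(const(ubyte)*) &s2)[0 .. T.sizeof];

    foreach (i; 0 .. a.length)
    {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

void main() @safe
{
    int x = 1, y = 1, z = 2;
    assert(compareBytes(x, y) == 0); // identical bytes
    assert(compareBytes(x, z) < 0);  // 1 vs 2 differ in a single byte
}
```

Because the function is @trusted, @safe callers such as `main` can use it directly; the burden of checking the body by hand stays with whoever wrote it.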
parent reply Stefanos Baziotis <sdi1600105 di.uoa.gr> writes:
On Thursday, 11 July 2019 at 19:37:38 UTC, Nathan S. wrote:
> If you know that what you're doing cannot result in memory
> corruption but the compiler cannot automatically infer @safe,
> it is appropriate to use @trusted. (For this case make sure
> you're not returning the byte slices, since if the arguments
> were allocated on the stack you could end up with a pointer to
> an invalid stack frame. If it's the caller's responsibility to
> ensure the slice doesn't outlive the struct then it is the
> caller that should be @trusted or not.)

Yes, @trusted is an option. I mean it's a good solution, but from the standpoint of the language user, it seems unfortunate that for the case of static types @trusted has to be used, while the array one can be @safe:

int memcmp(T)(const T[] s1, const T[] s2) @safe
{
    const byte[] s1b = (cast(const(byte[]))s1)[0 .. s1.length * T.sizeof];
    const byte[] s2b = (cast(const(byte[]))s2)[0 .. s2.length * T.sizeof];
}
Jul 11 2019
next sibling parent Paul Backus <snarwin gmail.com> writes:
On Thursday, 11 July 2019 at 19:44:51 UTC, Stefanos Baziotis 
wrote:
> On Thursday, 11 July 2019 at 19:37:38 UTC, Nathan S. wrote:
>> If you know that what you're doing cannot result in memory
>> corruption but the compiler cannot automatically infer @safe,
>> it is appropriate to use @trusted. (For this case make sure
>> you're not returning the byte slices, since if the arguments
>> were allocated on the stack you could end up with a pointer to
>> an invalid stack frame. If it's the caller's responsibility to
>> ensure the slice doesn't outlive the struct then it is the
>> caller that should be @trusted or not.)
> Yes, @trusted is an option. I mean it's a good solution, but from the standpoint of the language user, it seems unfortunate that for the case of static types @trusted has to be used, while the array one can be @safe:
>
> int memcmp(T)(const T[] s1, const T[] s2) @safe
> {
>     const byte[] s1b = (cast(const(byte[]))s1)[0 .. s1.length * T.sizeof];
>     const byte[] s2b = (cast(const(byte[]))s2)[0 .. s2.length * T.sizeof];
> }

You can use a union:

int foo(T)(ref T s1, ref T s2)
{
    import std.stdio;

    union U
    {
        T val;
        byte[T.sizeof] bytes;
    }

    const U s1u = { val: s1 };
    const U s2u = { val: s2 };

    writeln("s1 bytes: ", s1u.bytes);
    writeln("s2 bytes: ", s2u.bytes);

    return 0;
}

@safe void main()
{
    double a = 12.345, b = 67.890;
    foo(a, b);
}

However, accessing the `bytes` member will still be considered @system if T is or contains a pointer. To fix this, you can use a @trusted nested function to do the union access; e.g.,

    // Takes the union by const ref so it accepts the const s1u/s2u above,
    // and returns the same element type the union declares.
    @trusted ref const(byte[T.sizeof]) getBytes(ref const(U) u)
    {
        return u.bytes;
    }

    // ...

    writeln("s1 bytes: ", getBytes(s1u));
Jul 11 2019
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, July 11, 2019 1:44:51 PM MDT Stefanos Baziotis via Digitalmars-d-learn wrote:
> On Thursday, 11 July 2019 at 19:37:38 UTC, Nathan S. wrote:
>> If you know that what you're doing cannot result in memory
>> corruption but the compiler cannot automatically infer @safe,
>> it is appropriate to use @trusted. (For this case make sure
>> you're not returning the byte slices, since if the arguments
>> were allocated on the stack you could end up with a pointer to
>> an invalid stack frame. If it's the caller's responsibility to
>> ensure the slice doesn't outlive the struct then it is the
>> caller that should be @trusted or not.)
> Yes, @trusted is an option. I mean it's a good solution, but from the standpoint of the language user, it seems unfortunate that for the case of static types @trusted has to be used, while the array one can be @safe:
>
> int memcmp(T)(const T[] s1, const T[] s2) @safe
> {
>     const byte[] s1b = (cast(const(byte[]))s1)[0 .. s1.length * T.sizeof];
>     const byte[] s2b = (cast(const(byte[]))s2)[0 .. s2.length * T.sizeof];
> }

The compiler would have to have a deeper understanding of what you're doing in order to know that it's @safe, and it's just not that smart. That's why @trusted is needed in general. Just because something is obvious to the programmer doesn't mean that it's at all obvious to the compiler. In some cases, the compiler could be improved, but in many, it really just comes down to the programmer verifying that the code is doing the correct thing and using @trusted.

BTW, if you're implementing memcmp, why are you using byte instead of ubyte? byte is signed. Unless you're explicitly trying to do arithmetic on integral values from -128 to 127, odds are, you shouldn't be using byte. If you're doing something like breaking an integer into its 8-bit parts, then ubyte is what's appropriate, not byte.

- Jonathan M Davis
Jul 11 2019
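
A small sketch of why the byte/ubyte choice matters for a memcmp-style comparison, added as an illustration rather than taken from the thread. C's memcmp compares bytes as unsigned values, so viewing the same bit pattern through a signed `byte` can reverse the ordering.

```d
void main() @safe
{
    // The same bit pattern, 0x80, viewed as unsigned and as signed.
    ubyte ua = 0x80, ub = 0x01;
    byte sa = cast(byte) ua, sb = cast(byte) ub;

    assert(byte.min == -128 && byte.max == 127);
    assert(ua > ub); // 128 > 1: the unsigned ordering memcmp uses
    assert(sa < sb); // -128 < 1: the signed view reverses the result
}
```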
parent Stefanos Baziotis <sdi1600105 di.uoa.gr> writes:
Thank you all for your responses. I understand that the compiler
can't ensure @safe and that @trusted is needed. I'm not familiar
with all aspects of D, though, and thought I might have missed
something.

On Friday, 12 July 2019 at 01:24:06 UTC, Jonathan M Davis wrote:
> BTW, if you're implementing memcmp, why are you using byte
> instead of ubyte? byte is signed. Unless you're explicitly
> trying to do arithmetic on integral values from -128 to 127,
> odds are, you shouldn't be using byte. If you're doing
> something like breaking an integer into its 8-bit parts, then
> ubyte is what's appropriate, not byte.

I usually start with the signed version and, if in the end there's no need for the signed one, I make it unsigned. In this case, at the moment there's no actual need for `byte`. Actually, it only makes it difficult to use some GDC intrinsics.
Jul 12 2019