digitalmars.D - Safe cast of arrays

reply Iakh <iaktakh gmail.com> writes:
https://dlang.org/spec/function.html#function-safety
The current definition of safety doesn't mention casts of arrays.
E.g. this code is allowed by DMD:

int[] f(void[] a) @safe pure
{
     return cast(int[])a;
}

But the same void* to int* cast is forbidden.
So we need some rules for casting dynamic arrays,
e.g. allow only cast(void[]), as was done for pointers.

And the definition of safety should be changed.
Feb 09 2016
parent reply w0rp <devw0rp gmail.com> writes:
On Tuesday, 9 February 2016 at 21:20:53 UTC, Iakh wrote:
 https://dlang.org/spec/function.html#function-safety
 Current definition of safety doesn't mention cast of arrays.
 E.g this code allowed by DMD

 int[] f(void[] a) @safe pure
 {
     return cast(int[])a;
 }

 But same void* to int* cast is forbidden.
 So we need some rules for dynamic arrays casting.
 e.g allow only cast(void[]) as for pointers was done.

 And definition of safety should be changed.
I think this should be addressed: if you can't cast between pointer types, you shouldn't be allowed to cast between slice types either, because slices are just a pointer plus a length. Another way to demonstrate the problem is like this.

@safe int* badCast(long[] slice) {
    return (cast(int[]) slice).ptr;
}

@system void main(string[] argv) {
    auto larger = new long[5];
    auto smaller = badCast(larger);
}

This is a complete program which will compile and run with the latest official release of DMD. The pointer slicing is unsafe, but once the slice is available, it can go off into @safe land, and could lead to memory corruption, due to the bad cast. I used .ptr here to show you can safely take the pointer of a badly cast slice, which seems to somewhat contradict the rule that no pointer casting is allowed.

Maybe some exception is needed for casting slices of class types. That's about the only thing I can think of.
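
To make the "pointer plus a length" point concrete, here is a minimal sketch (not from the original post) showing that a cast slice is only a second view of the same memory. It runs in @system code on purpose, since whether @safe should accept the cast is the open question; the variable names are illustrative.

```d
void main()
{
    auto larger = new long[2];         // 16 bytes, zero-initialized
    int[] view = cast(int[]) larger;   // the same 16 bytes, now seen as 4 ints

    assert(view.length == 4);
    assert(cast(void*) view.ptr is cast(void*) larger.ptr); // no copy is made

    // Writing through the int view changes what the long slice observes.
    view[0] = -1;
    view[1] = -1;
    assert(larger[0] == -1);           // all 64 bits of larger[0] are now set
}
```

Both slices stay valid and in bounds here; what the cast gives up is any guarantee about how the bytes are interpreted.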
Feb 10 2016
next sibling parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Wednesday, 10 February 2016 at 08:49:21 UTC, w0rp wrote:
 I think this should be addressed, as if you can't cast between 
 pointer types, you shouldn't be allowed to cast between slice 
 types either. Because slices are just a pointer plus a length. 
 Another way to demonstrate the problem is like this.
void is a typeless type as I see it. It stands for a 'block of something'. Naturally void has to be converted to something before you can use it at all, even if it's a byte.

The only reason I can see that converting between arrays is allowed is because the size is known, so we won't accidentally throw segfaults after converting the type, while a pointer without the size attached to it literally only knows a location, potentially 0 bytes big; whereas if you cast 100 bytes of void to a struct 10 bytes big...

Should you be allowed to cast from one type to another? Depends. If they are nearly the same (signed vs unsigned) maybe it should be allowed. Completely different types (or of different lengths)? I'd say not with @safe declared. But there are still many, many low-level tricks and bypasses we have to do to get code to work sometimes.

Maybe pointers shouldn't be allowed to be passed around in @safe code either. Both of these would heavily reduce the problems presented.

A while back I was thinking that perhaps the array type should have a third member (of flags and tests) for debugging and catching these types of problems, which isn't compiled into release mode. It would include information like whether it's a static, stack or heap allocated array, whether it's been converted from its original type, was immutable/const, exists only for the calling function, was assigned/postblit'd, whether the ptr was passed bypassing the security features, etc. Then during debugging or at runtime the flags could be checked for validity. The runtime could throw red flags when these are identified and you could find these types of bugs more easily, and even identify problems/bugs in library code you previously thought was just fine.
Feb 10 2016
prev sibling next sibling parent reply Iakh <iaktakh gmail.com> writes:
On Wednesday, 10 February 2016 at 08:49:21 UTC, w0rp wrote:
 On Tuesday, 9 February 2016 at 21:20:53 UTC, Iakh wrote:
 https://dlang.org/spec/function.html#function-safety
 Current definition of safety doesn't mention cast of arrays.
I think this should be addressed, as if you can't cast between pointer types, you shouldn't be allowed to cast between slice types either. Because slices are just a pointer plus a length. Another way to demonstrate the problem is like this.
If we address it in the same fashion as is done for other things:

  4. Cannot access unions that have pointers or references
     overlapping with other types.

A "reinterpret cast" is allowed using unions, but only for some types. So array casting could be allowed for the same set of types, e.g. cast(T)arrB is allowed if the code below compiles:

() @safe {
    union {
        T a;
        typeof(arrB[0]) b;
    }
}

Is the condition sufficient? Does this change need a DIP?
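
One way the condition could be expressed as a compile-time check is sketched below. The trait name canOverlaySafely is hypothetical, and the sketch assumes the compiler actually enforces rule 4 when a union member is accessed in @safe code, which may differ between compiler versions.

```d
// Hypothetical trait: true when T and U can share storage in a union
// that @safe code is allowed to read and write.
enum bool canOverlaySafely(T, U) = __traits(compiles, () @safe {
    union Overlay { T a; U b; }
    Overlay o;
    o.a = T.init;   // touch both members so the overlap rule is exercised
    U x = o.b;
});

// Plain value types may overlap freely.
static assert(canOverlaySafely!(int, uint));
static assert(canOverlaySafely!(float, int));

// A pairing such as (int*, size_t) overlaps a pointer with a non-pointer,
// so a compiler enforcing rule 4 is expected to reject it.

void main() {}
```

Under this idea, a cast between array types would be accepted in @safe code only when the trait holds for the two element types.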
Feb 10 2016
parent w0rp <devw0rp gmail.com> writes:
Yeah, I think it should only allow the equivalent of a 
dynamic_cast for types in @safe code, and not allow the 
equivalent of a reinterpret_cast, for T, T*, or T[].
Feb 10 2016
prev sibling parent reply Chris Wright <dhasenan gmail.com> writes:
On Wed, 10 Feb 2016 08:49:21 +0000, w0rp wrote:

 I think this should be addressed, as if you can't cast between pointer
 types, you shouldn't be allowed to cast between slice types either.
 Because slices are just a pointer plus a length. Another way to
 demonstrate the problem is like this.
 
 @safe int* badCast(long[] slice) {
      return (cast(int[]) slice).ptr;
 }
 
 
 @system void main(string[] argv) {
      auto larger = new long[5];
      auto smaller = badCast(larger);
 }
@safe protects you from segmentation faults and reading and writing outside an allocated segment of memory. With array casts, @safety is assured:

int[] a = [1, 2, 3, 4];
long[] b = cast(long[])a;
writeln(b.length);  // prints "2"

Versus:

int[] a = [1, 2, 3, 4, 5];
long[] b = cast(long[])a;

This throws an "object.Error: array cast misalignment" specifically because there's no correct, safe way to execute the cast. You'd be left with a dangling half of a long.

But if you're using pointers, the runtime can't do this kind of validation, so it's not safe to cast int* to long*. It is, however, safe to cast long* to int*, because writing four bytes to a long* won't overflow the memory allocated to it.

So this isn't a problem with @safe. Show a way to read or write outside allocated memory with this, or to cause a segmentation fault, and that will require a change in @safe. You're looking for something else, data safety rather than memory safety. You want to disallow unions and anything that lets you emulate them.
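
The length bookkeeping described above can be sketched as follows. castLength is a hypothetical helper written for illustration, not a druntime function; it only mirrors the arithmetic that the runtime check is described as performing.

```d
import std.stdio;

// Hypothetical helper: the new length is the byte count divided by the
// target element size, and the division must be exact.
size_t castLength(size_t srcLen, size_t srcSize, size_t dstSize)
{
    immutable bytes = srcLen * srcSize;
    assert(bytes % dstSize == 0, "array cast misalignment");
    return bytes / dstSize;
}

void main()
{
    int[] a = [1, 2, 3, 4];
    long[] b = cast(long[]) a;           // 16 bytes reinterpreted as 2 longs

    assert(b.length == castLength(a.length, int.sizeof, long.sizeof));
    assert(b.ptr is cast(long*) a.ptr);  // same memory, no copy
    writeln(b.length);
}
```

With five ints the byte count is 20, which is not a multiple of long.sizeof, and that is the case the runtime rejects.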
Feb 10 2016
next sibling parent reply Iakh <iaktakh gmail.com> writes:
On Wednesday, 10 February 2016 at 20:14:29 UTC, Chris Wright 
wrote:
 @safe protects you from segmentation faults and reading and 
 writing outside an allocated segment of memory. With array 
 casts, @safety is assured
Yes, @safe protects from direct casts to/from ref types, but there is still a trick with a T[] -> void[] -> T2[] cast:

import std.stdio;

int[] f(void[] a) @safe pure
{
    return cast(int[])a;
}

struct S
{
    int* a;
}

void main() @safe
{
    S[] a = new S[4];
    immutable b = a.f();
    writeln(b);
    a[0].a = new int;
    writeln(b);
    //b[0] = 0;
    //writeln(*a[0].a);
}

So no safety in this world.
Feb 10 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Wed, 10 Feb 2016 21:40:21 +0000, Iakh wrote:

 On Wednesday, 10 February 2016 at 20:14:29 UTC, Chris Wright wrote:
 @safe protects you from segmentation faults and reading and writing
 outside an allocated segment of memory. With array casts, @safety is
 assured
 Yes, @safe protects from direct cast to/from ref types but there still is a trick with T[] -> void[] -> T2[] cast: So no safety in this world.
Okay, that's a problem. It should always be safe to cast from void[] to immutable(T)[] where T doesn't contain pointers. I didn't see a bug for this, so I'm filing it.
Feb 10 2016
next sibling parent Chris Wright <dhasenan gmail.com> writes:
On Wed, 10 Feb 2016 22:49:33 +0000, Chris Wright wrote:

 It should always be safe to cast from void[] to immutable(T)[] where T
 doesn't contain pointers.
 
 I didn't see a bug for this, so I'm filing it.
Filed https://issues.dlang.org/show_bug.cgi?id=15672
Feb 10 2016
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/10/16 5:49 PM, Chris Wright wrote:
 On Wed, 10 Feb 2016 21:40:21 +0000, Iakh wrote:

 On Wednesday, 10 February 2016 at 20:14:29 UTC, Chris Wright wrote:
 @safe protects you from segmentation faults and reading and writing
 outside an allocated segment of memory. With array casts, @safety is
 assured
 Yes, @safe protects from direct cast to/from ref types but there still is a trick with T[] -> void[] -> T2[] cast: So no safety in this world.
 Okay, that's a problem. It should always be safe to cast from void[] to immutable(T)[] where T doesn't contain pointers. I didn't see a bug for this, so I'm filing it.
I think casting a mutable array to any array type is a recipe for memory issues, no matter what is in the elements. Remember that you are casting a reference that still has a mutable pointer to it.

@safe should start from a very cautious and overtightened state, and then we loosen it as we find issues.

As it was done, it has holes, and so when we fix things, code breaks.

-Steve
Feb 10 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Wed, 10 Feb 2016 22:39:20 -0500, Steven Schveighoffer wrote:

 I think casting a mutable array to any array type is a recipe for memory
 issues, no matter what is in the elements. Remember that you are casting
 a reference that still has a mutable pointer to it.
 
 @safe should start from a very cautious and overtightened state, and
 then we loosen it as we find issues.
 
 As it was done, it has holes, and so when we fix things, code breaks.
 
 -Steve
I agree with the principle, but it's always safe to read a pointer as if it were not a pointer, and that's what a cast to an immutable array would do.
Feb 10 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/10/16 11:01 PM, Chris Wright wrote:
 On Wed, 10 Feb 2016 22:39:20 -0500, Steven Schveighoffer wrote:

 I think casting a mutable array to any array type is a recipe for memory
 issues, no matter what is in the elements. Remember that you are casting
 a reference that still has a mutable pointer to it.

 @safe should start from a very cautious and overtightened state, and
 then we loosen it as we find issues.

 As it was done, it has holes, and so when we fix things, code breaks.
I agree with the principle, but it's always safe to read a pointer as if it were not a pointer, and that's what a cast to an immutable array would do.
A cast to immutable is a guarantee to the compiler that there are no other mutable references that you will use again. This is not the case here.

A cast to const may be viable.

However, I think casting in safe code is probably not something to allow. If you need to make something work outside the compiler's comfort zone, there's always @trusted.

-Steve
Feb 12 2016
parent reply Chris Wright <dhasenan gmail.com> writes:
On Fri, 12 Feb 2016 08:45:54 -0500, Steven Schveighoffer wrote:

 A cast to const may be viable.
Touché.
 However, I think casting in safe code is
 probably not something to allow.
*All* casting?

Casting between primitive value types (e.g. long -> int) is @safe. You can't get memory errors that way, and the conversions are well-defined.

Casting between object references is @safe (assuming the object references are valid; @safe doesn't protect you from dereferencing an invalid pointer you got from @system code). You can dereference null that way, but that's allowed by design.

If you wanted to restrict casts between array types, that would be more reasonable, but some work has already gone into making those casts safe (e.g. long[] -> int[]). It would also prevent @safe memory-mapped IO, even if we provided a wrapper that yielded a ubyte[].

If you're just talking about casting from void[] in @safe code, that's reasonable, but a little more restrictive than necessary. Casting *to* void[] in this scenario is safe, just not generally useful -- you wouldn't be able to cast back in @safe code.
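
As a small sketch of the value-type and object-reference cases (the class names Base and Derived are made up for the example), both kinds of cast are accepted in @safe code and have defined results:

```d
class Base {}
class Derived : Base {}

void main() @safe
{
    // Value conversion: truncation keeps the low 32 bits.
    long big = 0x1_0000_0001;
    int low = cast(int) big;
    assert(low == 1);

    // Class casts are checked at run time; a mismatch yields null
    // instead of a reinterpreted reference.
    Base plain = new Base;
    assert(cast(Derived) plain is null);

    Base actual = new Derived;
    assert(cast(Derived) actual !is null);
}
```

Neither cast reinterprets memory: the first computes a new value and the second consults the object's run-time type.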
Feb 12 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/12/16 12:15 PM, Chris Wright wrote:

 Casting between primitive value types (eg long -> int) is @safe. You
 can't get memory errors that way, and the conversions are well-defined.

 Casting between object references is @safe (assuming the object
 references are valid; @safe doesn't protect you from dereferencing an
 invalid pointer you got from @system code). You can dereference null that
 way, but that's allowed by design.
All good points. What I'm trying to say @safe shouldn't allow is reinterpret casting, i.e.:

*cast(T*)(&x)

So casting IMO shouldn't be allowed unless it invokes some kind of handler that ensures the conversion is safe.

I'd include in this list:

a) casting between object types
b) casting builtin types that are not, or do not contain, references (that are defined by the compiler)
c) casting an aggregate that has a matching opCast
 If you wanted to restrict casts between array types, that would be more
 reasonable, but some work has already gone into making those casts safe
 (eg long[] -> int[]). It would also prevent @safe memory-mapped IO, even
 if we provided a wrapper that yielded a ubyte[].
Casting an array involves casting a pointer with a reinterpret-style cast. IMO, the language is better off requiring a @trusted escape for such things.
 If you're just talking about casting from void[] in @safe code, that's
 reasonable, but a little more restrictive than necessary. Casting *to*
 void[] in this scenario is safe, just not generally useful -- you
 wouldn't be able to cast back in @safe code.
Casting to void[] doesn't require a cast, so I think it should be fine in @safe code.

-Steve
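
A minimal sketch of what such a @trusted escape might look like is given below. The helper name asInts is hypothetical, and returning const elements is an assumption made here (following the "cast to const may be viable" remark earlier in the thread) so that callers cannot write through the reinterpreted view.

```d
// Hypothetical helper: a @trusted boundary that views raw bytes as ints.
const(int)[] asInts(const(void)[] raw) @trusted
{
    // Checks the author of the @trusted function takes responsibility for.
    assert(raw.length % int.sizeof == 0, "length is not a multiple of int.sizeof");
    assert(cast(size_t) raw.ptr % int.alignof == 0, "misaligned data");
    return cast(const(int)[]) raw;
}

void main() @safe
{
    int[] data = [1, 2, 3];
    const(void)[] raw = data;          // implicit conversion, no cast required
    const(int)[] ints = asInts(raw);   // the reinterpret step is behind @trusted
    assert(ints == [1, 2, 3]);
}
```

The @safe caller never writes a cast itself; the one reinterpreting cast sits in a function that can be audited by hand.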
Feb 12 2016
parent Chris Wright <dhasenan gmail.com> writes:
On Fri, 12 Feb 2016 14:32:32 -0500, Steven Schveighoffer wrote:

 what I'm trying to say safe shouldn't allow is reinterpret casting.
 i.e.: *cast(T*)(&x)
 
 So casting IMO shouldn't be allowed unless it invokes some kind of
 handler that ensures the conversion is safe.
 
 I'd include in this list:
 
 a) casting between object types
 b) casting builtin types that are not,
 or do not contain, references (that are defined by the compiler)
 c) casting an aggregate that has a matching opCast
Casting an array is basically a backdoor way to make a union, ignoring opCast. This is one of the cases that should be explicitly disallowed here (and of course it isn't). Observe:

import std.stdio;

struct A {
    void* m;
    size_t i;
}

struct B {
    size_t i;
    A opCast() { return A(null, i); }
}

void main() @safe {
    A[] aa = [A(new int, 5)];
    auto bb = cast(B[])aa;
    writeln(bb[0].i);  // prints -22192128
}

If this honored opCast, it would print 5. Instead it prints a pointer address. (Also, the length of array bb is 2.) This corresponds to what the spec says, but that's probably not the desired behavior.
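
For contrast, an element-wise conversion does go through opCast. The sketch below is an illustration rather than a proposal from the thread: it uses the template form opCast(T : A) so that cast(A) on a single B resolves to it, converts in the B -> A direction that B's opCast supports, and relies on std.algorithm.map and std.array.array to build a new array.

```d
import std.algorithm : map;
import std.array : array;

struct A { void* m; size_t i; }

struct B
{
    size_t i;
    // Template form, so that cast(A) applied to one B calls this function.
    A opCast(T : A)() const { return A(null, i); }
}

void main() @safe
{
    B[] bb = [B(5), B(7)];

    // Each element is converted individually; the result is a fresh array,
    // not a reinterpreted view of bb's memory.
    A[] aa = bb.map!(b => cast(A) b).array;

    assert(aa.length == 2);
    assert(aa[0].i == 5 && aa[1].i == 7);
    assert(aa[0].m is null);
}
```

The cost is an allocation and a copy, which is the trade-off against the in-place reinterpretation that the array cast performs.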
Feb 12 2016
prev sibling parent Anon <anon anon.anon> writes:
On Wednesday, 10 February 2016 at 20:14:29 UTC, Chris Wright 
wrote:
 Show a way to read or write outside allocated memory with this, 
 or to cause a segmentation fault, and that will require a 
 change in @safe. You're looking for something else, data safety 
 rather than memory safety. You want to disallow unions and 
 anything that lets you emulate them.
Does this count? http://dpaste.dzfl.pl/96db07a5104e
Feb 10 2016