www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Yet another leak in the sinking ship of safe

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
While  safe is a good idea in principle, the current implementation is
rather lackluster. Consider, for example:

	void readData(void[] buffer)  safe
	{
		ubyte[] buf = cast(ubyte[]) buffer;
		buf[0] = 0xFF;
	}
	void main()  safe
	{
		auto buffer = new Object[1];
		readData(buffer);
	}

There are (at least) two major problems here:

1) Casting an array of elements with indirections (in this case
Object[]) to void[] is questionable at best, outright unsafe at worst,
as shown here. Even if we were to rewrite readData() and mark it
 trusted, it still raises the question of what a  trusted function can
legally do with void[], which is essentially a type-erased array, that
justifies being tagged as  trusted.  How can a function do anything that
doesn't break  safety if all type information about the array has been
erased, and it essentially sees it only as a ubyte[]?  I'm inclined to
say that  trusted functions should only be allowed to receive
const(void)[] as parameter, not void[].

	https://issues.dlang.org/show_bug.cgi?id=15702

2) To add salt to the wound, though, blatantly casting void[] to T[] is
allowed in  safe code, thus, readData() can even get away with claiming
to be  safe, when in fact it is anything but.

	https://issues.dlang.org/show_bug.cgi?id=15672

These two issues basically make the  safe annotation an empty promise,
because *anything* could happen entirely within the realm of  safe code,
and memory safety isn't guaranteed at all.  It's bad enough that some
Phobos modules (*ahem*std.socket*cough*) liberally sprinkle  trusted on
every function without regard to whether it's truly justified (e.g., see
https://issues.dlang.org/show_bug.cgi?id=15672), now even if we exclude
 trusted functions from the mix, we can still break  safe without the
compiler even raising an eyebrow.

Sadly, the above issues are merely the tip of the iceberg. There are
many more holes in  safe, such as unions containing pointers, unions of
 safe and  system delegates, unaligned pointers, void initialization of
members with indirection, just to name a few.  If we want the  safe ship
to float, we have a lot of work cut out for us.


--T
Feb 18 2016
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current implementation is
 rather lackluster. Consider, for example:

 	void readData(void[] buffer)  safe
 	{
 		ubyte[] buf = cast(ubyte[]) buffer;
 		buf[0] = 0xFF;
 	}
 	void main()  safe
 	{
 		auto buffer = new Object[1];
 		readData(buffer);
 	}

 There are (at least) two major problems here:

 1) Casting an array of elements with indirections (in this case
 Object[]) to void[] is questionable at best, outright unsafe at worst,
 as shown here. Even if we were to rewrite readData() and mark it
  trusted, it still raises the question of what a  trusted function can
 legally do with void[], which is essentially a type-erased array, that
 justifies being tagged as  trusted.  How can a function do anything that
 doesn't break  safety if all type information about the array has been
 erased, and it essentially sees it only as a ubyte[]?  I'm inclined to
 say that  trusted functions should only be allowed to receive
 const(void)[] as parameter, not void[].

 	https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
 2) To add salt to the wound, though, blatantly casting void[] to T[] is
 allowed in  safe code, thus, readData() can even get away with claiming
 to be  safe, when in fact it is anything but.

 	https://issues.dlang.org/show_bug.cgi?id=15672
This is the culprit.
Feb 18 2016
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/18/16 1:30 PM, Timon Gehr wrote:
 On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current implementation is
 rather lackluster. Consider, for example:

     void readData(void[] buffer)  safe
     {
         ubyte[] buf = cast(ubyte[]) buffer;
         buf[0] = 0xFF;
     }
     void main()  safe
     {
         auto buffer = new Object[1];
         readData(buffer);
     }

 There are (at least) two major problems here:

 1) Casting an array of elements with indirections (in this case
 Object[]) to void[] is questionable at best, outright unsafe at worst,
 as shown here. Even if we were to rewrite readData() and mark it
  trusted, it still raises the question of what a  trusted function can
 legally do with void[], which is essentially a type-erased array, that
 justifies being tagged as  trusted.  How can a function do anything that
 doesn't break  safety if all type information about the array has been
 erased, and it essentially sees it only as a ubyte[]?  I'm inclined to
 say that  trusted functions should only be allowed to receive
 const(void)[] as parameter, not void[].

     https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; } -Steve
Feb 18 2016
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 18.02.2016 19:41, Steven Schveighoffer wrote:
 On 2/18/16 1:30 PM, Timon Gehr wrote:
 On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current implementation is
 rather lackluster. Consider, for example:

     void readData(void[] buffer)  safe
     {
         ubyte[] buf = cast(ubyte[]) buffer;
         buf[0] = 0xFF;
     }
     void main()  safe
     {
         auto buffer = new Object[1];
         readData(buffer);
     }

 There are (at least) two major problems here:

 1) Casting an array of elements with indirections (in this case
 Object[]) to void[] is questionable at best, outright unsafe at worst,
 as shown here. Even if we were to rewrite readData() and mark it
  trusted, it still raises the question of what a  trusted function can
 legally do with void[], which is essentially a type-erased array, that
 justifies being tagged as  trusted.  How can a function do anything that
 doesn't break  safety if all type information about the array has been
 erased, and it essentially sees it only as a ubyte[]?  I'm inclined to
 say that  trusted functions should only be allowed to receive
 const(void)[] as parameter, not void[].

     https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; } -Steve
Good point. :)
Feb 18 2016
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 18 February 2016 at 18:41:25 UTC, Steven 
Schveighoffer wrote:
 On 2/18/16 1:30 PM, Timon Gehr wrote:
 On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current 
 implementation is
 rather lackluster. Consider, for example:

     void readData(void[] buffer)  safe
     {
         ubyte[] buf = cast(ubyte[]) buffer;
         buf[0] = 0xFF;
     }
     void main()  safe
     {
         auto buffer = new Object[1];
         readData(buffer);
     }

 There are (at least) two major problems here:

 1) Casting an array of elements with indirections (in this 
 case
 Object[]) to void[] is questionable at best, outright unsafe 
 at worst,
 as shown here. Even if we were to rewrite readData() and mark 
 it
  trusted, it still raises the question of what a  trusted 
 function can
 legally do with void[], which is essentially a type-erased 
 array, that
 justifies being tagged as  trusted.  How can a function do 
 anything that
 doesn't break  safety if all type information about the array 
 has been
 erased, and it essentially sees it only as a ubyte[]?  I'm 
 inclined to
 say that  trusted functions should only be allowed to receive
 const(void)[] as parameter, not void[].

     https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; }
Well, I'm not sure that that's actually not safe. It's trying to interpret the void[] that's the problem. Certainly, you can convert T[] to void[] and pass it around all day without risking any memory corruption, so that should definitely be safe, and I don't see how reducing the length of a void[] could actually cause memory corruption on its own. It's when you cast the void[] to something else that you risk things going south, and that's what needs to be system. So, I'm not sure that there's actually any reason for your example code to not be safe. - Jonathan M Davis
Feb 18 2016
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/18/16 1:50 PM, Jonathan M Davis wrote:
 On Thursday, 18 February 2016 at 18:41:25 UTC, Steven Schveighoffer wrote:
 foo(void[] arr)
 {
    void[] arr2 = [1234, 5678, 91011];
    arr[] = arr2[0 .. arr.length];
 }
Well, I'm not sure that that's actually not safe. It's trying to interpret the void[] that's the problem. Certainly, you can convert T[] to void[] and pass it around all day without risking any memory corruption, so that should definitely be safe, and I don't see how reducing the length of a void[] could actually cause memory corruption on its own. It's when you cast the void[] to something else that you risk things going south, and that's what needs to be system. So, I'm not sure that there's actually any reason for your example code to not be safe.
You don't need to cast to use the original. void bar() safe { int *[] ptrs = new int*[1]; foo(ptrs); *ptrs[0] = 5; } -Steve
Feb 18 2016
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 06:50:34PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, 18 February 2016 at 18:41:25 UTC, Steven Schveighoffer wrote:
[...]
foo(void[] arr)
{
   void[] arr2 = [1234, 5678, 91011];
   arr[] = arr2[0 .. arr.length];
}
Well, I'm not sure that that's actually not safe.
How can it possibly be safe??? Consider: void main() safe { Object[] objs = [ new Object() ]; foo(objs); } Now the pointer to the Object instance has been corrupted.
 It's trying to interpret the void[] that's the problem. Certainly, you
 can convert T[] to void[] and pass it around all day without risking
 any memory corruption, so that should definitely be  safe, and I don't
 see how reducing the length of a void[] could actually cause memory
 corruption on its own. It's when you cast the void[] to something else
 that you risk things going south, and that's what needs to be  system.
 So, I'm not sure that there's actually any reason for your example
 code to not be  safe.
[...] I think you missed the point of his example. :-) The point is that it's perfectly legal to (1) cast an array of int to void[], and (2) it's also perfectly legal to cast an array of anything to void[], and (3) under current rules, it's perfectly legal to copy one void[] to another void[]. Arguably, (3) should not be allowed in safe code. Which again brings us back to the point, that if any function takes void[] as an argument, is there *anything* it can do with the void[] other than reading it, that *won't* break safe? If there's *nothing* it can legally do with a void[] (other than reading it) without violating safe, then shouldn't it be a requirement that all functions marked safe must take const(void)[] rather than void[]? T -- Chance favours the prepared mind. -- Louis Pasteur
Feb 18 2016
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 18 February 2016 at 18:58:56 UTC, H. S. Teoh wrote:
 On Thu, Feb 18, 2016 at 06:50:34PM +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 On Thursday, 18 February 2016 at 18:41:25 UTC, Steven 
 Schveighoffer wrote:
[...]
foo(void[] arr)
{
   void[] arr2 = [1234, 5678, 91011];
   arr[] = arr2[0 .. arr.length];
}
Well, I'm not sure that that's actually not safe.
How can it possibly be safe??? Consider: void main() safe { Object[] objs = [ new Object() ]; foo(objs); } Now the pointer to the Object instance has been corrupted.
Does it matter what state the void[] is in until you actually attempt to cast it to something else? If you have Object[] oArr = [ new Object ]; void[] vArr = oArr; vArr = vArr[0 .. $ - 1]; it's not like that's going to cause any problems - not by itself. As soon as you convert it back to Object[] or to int[] or whatever, _then_ you're screwed. But converting from void[] to anything is inherently system regardless of what you did to the array or where it came from. Now, given that reducing the length like that is setting you up for a bad conversion, I can certainly see an argument for reducing the length of a void[] being system, but in and of itself, it isn't going to corrupt memory. It just sets the stage for it. As long as it stays void[], everything is fine. Now given that you can't possibly do anything useful with reducing the length of void[] and that it does make it so that future, system operations would corrupt memory, I can certainly see making it system, but I dispute that it's really doing anything unsafe on its own.
 It's trying to interpret the void[] that's the problem. 
 Certainly, you can convert T[] to void[] and pass it around 
 all day without risking any memory corruption, so that should 
 definitely be  safe, and I don't see how reducing the length 
 of a void[] could actually cause memory corruption on its own. 
 It's when you cast the void[] to something else that you risk 
 things going south, and that's what needs to be  system. So, 
 I'm not sure that there's actually any reason for your example 
 code to not be  safe.
[...] I think you missed the point of his example. :-) The point is that it's perfectly legal to (1) cast an array of int to void[], and (2) it's also perfectly legal to cast an array of anything to void[], and (3) under current rules, it's perfectly legal to copy one void[] to another void[]. Arguably, (3) should not be allowed in safe code. Which again brings us back to the point, that if any function takes void[] as an argument, is there *anything* it can do with the void[] other than reading it, that *won't* break safe? If there's *nothing* it can legally do with a void[] (other than reading it) without violating safe, then shouldn't it be a requirement that all functions marked safe must take const(void)[] rather than void[]?
It could pass it to something else without actually doing anything to it, in which case, I don't see why it couldn't be safe. It's interpreting a void[] that's the problem. _That_ needs to be system. Merely converting to void[] or passing it around is harmless. And as long as converting void[] to anything else is considered system like it should be, I don't see why it would be necessary to care about whether a function accepts void[] or not when considering safety. Useless as it may be void[] identity(void[] arr) safe { return arr; } is perfectly safe. If the compiler does its job, and the function does cast the void[], then the function will be considered system unless it's marked trusted, and all is well without caring one whit about how the function was passed void[]. The problem with std.socket is not that it accepts void[] and considers that safe; it's that it passes that void[] to something else, treating that as safe, when it's not. And since it's a C function that's being called, and it accesses the ptr member of the void[], it's already system, forcing trusted to be used to make the compiler happy. So, the problem is that trusted was used when it shouldn't have been, not that the type system allowed void[] to be passed in without marking it as system. And since the function was marked trusted, it's not like having the compiler treat a function that accepted void[] as system would have helped anyway. - Jonathan M Davis
Feb 18 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 07:25:16PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, 18 February 2016 at 18:58:56 UTC, H. S. Teoh wrote:
On Thu, Feb 18, 2016 at 06:50:34PM +0000, Jonathan M Davis via
Digitalmars-d wrote:
On Thursday, 18 February 2016 at 18:41:25 UTC, Steven Schveighoffer
wrote:
[...]
foo(void[] arr)
{
   void[] arr2 = [1234, 5678, 91011];
   arr[] = arr2[0 .. arr.length];
}
Well, I'm not sure that that's actually not safe.
How can it possibly be safe??? Consider: void main() safe { Object[] objs = [ new Object() ]; foo(objs); } Now the pointer to the Object instance has been corrupted.
Does it matter what state the void[] is in until you actually attempt to cast it to something else? If you have Object[] oArr = [ new Object ]; void[] vArr = oArr; vArr = vArr[0 .. $ - 1]; it's not like that's going to cause any problems - not by itself.
I think you misread Steven's code. This has nothing to do with reducing the length of an array. Read the code again. It has to do with *overwriting* the contents of one void[] with another void[], which is a legal and safe operation under the current rules. Unfortunately, the first void[] is an array of Object references, and the second void[] is an array of ints, so this is overwriting pointers with arbitrary int values. Furthermore, as the caller of foo(), main() has no idea what has happened to its Object[], because it continues to see the same area of memory as Object[], even though it has been overwritten by an array of ints. So when it next tries to dereference the Object[], which is perfectly legal and does not involve any casting, it ends up dereferencing an arbitrary int as a pointer instead. Ergo, safe has been compromised. [...]
I think you missed the point of his example. :-) The point is that
it's perfectly legal to (1) cast an array of int to void[], and (2)
it's also perfectly legal to cast an array of anything to void[], and
(3) under current rules, it's perfectly legal to copy one void[] to
another void[].

Arguably, (3) should not be allowed in  safe code. Which again brings
us back to the point, that if any function takes void[] as an
argument, is there *anything* it can do with the void[] other than
reading it, that *won't* break  safe?

If there's *nothing* it can legally do with a void[] (other than
reading it) without violating  safe, then shouldn't it be a
requirement that all functions marked  safe must take const(void)[]
rather than void[]?
It could pass it to something else without actually doing anything to it, in which case, I don't see why it couldn't be safe. It's interpreting a void[] that's the problem. _That_ needs to be system. Merely converting to void[] or passing it around is harmless.
No. Please look at Steven's code again. What it is doing, is (1) implicitly casting Object[] to void[], which is perfectly legal according to what you said. Then (2) it casts an int[] literal to void[], again, another perfectly legal operation. Finally, (3) it copies the contents of the second void[] to the first void[]. Since both arrays are typed void[] at this point, the copy is perfectly legal (or so the compiler thinks), and here is where the Object[] got overwritten by an int[] and the compiler did not realize it. Then back in main(), the code does not see void[] at all; all it sees is the Object[] it has created all along. There is no interpretation of void[] going on here at all. All that happened was (1) we passed Object[] to void[], which is legal according to what you said, and (2) we dereferenced the Object[], which is not suspect at all because, well, it's an Object[], what could possibly go wrong? At this point, however, we are dereferencing an illegal pointer, because foo() has corrupted our Object[]. Note that *nowhere* in main() is there any reinterpretation of void[] as Object[]. The only thing that led to this mess, as far as main() is concerned, is that it passed an Object[] to a function that takes void[]. In other words, from the caller's POV, as soon as you pass an array to a function that takes void[], all bets are off, anything could happen to your precious data, and safety is no longer guaranteed. However, foo() is marked safe and the compiler does not complain. Also, inside foo() we did nothing unseemly, according to current rules. All we did was to convert two arrays to void[], which is perfectly legal and harmless, as you claim, and copy data from one void[] to another void[], which, under the current rules, is perfectly legal since we are just copying data between two arrays *of the same type*. What could possibly go wrong with copying data between two arrays of the same type? The only problem is, they *aren't* of the same type after all. Just because is(void[] == void[]) is true does NOT mean it is safe to copy one array to another, because the void here is a wildcard placeholder representing arbitrary types. So the bottom line is that the array copy cannot be safe. The larger question, is what, if anything, *can* you do with void[] besides read-only operations, that doesn't break safe? Once something is (implicitly or otherwise) cast to void[], all type information is forgotten, and there is no way, that I can tell, to write to a void[] without causing safe breakage. If so, wouldn't it make sense to require that the type should be const(void)[] rather than void[]? [...]
 The problem with std.socket is not that it accepts void[] and
 considers that  safe; it's that it passes that void[] to something
 else, treating that as  safe, when it's not. And since it's a C
 function that's being called, and it accesses the ptr member of the
 void[], it's already  system, forcing  trusted to be used to make the
 compiler happy. So, the problem is that  trusted was used when it
 shouldn't have been, not that the type system allowed void[] to be
 passed in without marking it as  system. And since the function was
 marked  trusted, it's not like having the compiler treat a function
 that accepted void[] as  system would have helped anyway.
[...] Certainly, the trusted was wrongly applied to that particular function. That's easy to fix: remove trusted and mark it as system, as it should be. Or, to be a bit nicer to our current users, qualify it with a sig constraint so that it only accepts arrays that have no further indirections. But I'm thinking about the larger question here. If modifying the data in a void[] is *always* system, because of the type erasure that basically eliminates all possibility of the lower-level function recovering the type and adapting its operation to be safe, then why allow passing void[] in safe code in the first place? If *any* modification to the data in the void[] carries the risk of breaking safe, then shouldn't we require that we only allow const(void)[] in safe code? It's not as if we don't have trusted as a fallback in those cases where we *know*, outside the type system, that a particular function call is actually safe even though it's marked system. But the point is, by disabling this dangerous operation *by default* (albeit allowing it via the trusted escape hatch), we eliminate the possibility of screwups like Steven's code demonstrated. T -- Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
Feb 18 2016
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 18 February 2016 at 20:24:18 UTC, H. S. Teoh wrote:
 On Thu, Feb 18, 2016 at 07:25:16PM +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 On Thursday, 18 February 2016 at 18:58:56 UTC, H. S. Teoh 
 wrote:
On Thu, Feb 18, 2016 at 06:50:34PM +0000, Jonathan M Davis 
via Digitalmars-d wrote:
On Thursday, 18 February 2016 at 18:41:25 UTC, Steven 
Schveighoffer wrote:
[...]
foo(void[] arr)
{
   void[] arr2 = [1234, 5678, 91011];
   arr[] = arr2[0 .. arr.length];
}
Well, I'm not sure that that's actually not safe.
How can it possibly be safe??? Consider: void main() safe { Object[] objs = [ new Object() ]; foo(objs); } Now the pointer to the Object instance has been corrupted.
Does it matter what state the void[] is in until you actually attempt to cast it to something else? If you have Object[] oArr = [ new Object ]; void[] vArr = oArr; vArr = vArr[0 .. $ - 1]; it's not like that's going to cause any problems - not by itself.
I think you misread Steven's code.
Yes. It looks like I did.
 So the bottom line is that the array copy cannot be  safe.
Exactly. Passing it around is fine, but mutating it definitely is not.
 The larger question, is what, if anything, *can* you do with 
 void[] besides read-only operations, that doesn't break  safe?  
 Once something is (implicitly or otherwise) cast to void[], all 
 type information is forgotten, and there is no way, that I can 
 tell, to write to a void[] without causing  safe breakage.  If 
 so, wouldn't it make sense to require that the type should be 
 const(void)[] rather than void[]?
Why? The problem isn't that void[] was passed it. It's that what was done to it after it was passed in was not safe. We need to fix it so that the compiler doesn't consider mutating void[] or casting away from it or doing anything with it that could corrupt memory safe, but passing it around is perfectly safe, even if it's not very useful by itself. So, I see no reason to make any requirements about const. As long as dmd correctly catches the operations that aren't safe, the function which is passed the void[] and does more than pass it around is going to be forced to be system anyway. So making any requirements about const(void[]) buys us nothing. - Jonathan M Davis
Feb 18 2016
prev sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 18 February 2016 at 18:41:25 UTC, Steven 
Schveighoffer wrote:
 On 2/18/16 1:30 PM, Timon Gehr wrote:
 No problem here. There is no way to assign to a void[] without 
 doing 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; }
Since void throws away type information (and all the safety related to it), would it be easier to simply require safe code can't cast implicitly to void? It seems like explicit casting would take care of most of this, or disallowing to/from void converting period while in safe code. To be honest, I think there's only 1 time I actually used a void[] in my code, and that was while writing a section in the BitArray replacement code years back in the case you wanted to use/read another block of data as the source for the BitArray. Beyond that I never touched it.
Feb 18 2016
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 18.02.2016 20:17, Era Scarecrow wrote:
 On Thursday, 18 February 2016 at 18:41:25 UTC, Steven Schveighoffer wrote:
 On 2/18/16 1:30 PM, Timon Gehr wrote:
 No problem here. There is no way to assign to a void[] without doing 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; }
Since void throws away type information (and all the safety related to it), would it be easier to simply require safe code can't cast implicitly to void? It seems like explicit casting would take care of most of this, or disallowing to/from void converting period while in safe code.
The conversion is fine. It just throws away type information. There's no way to soundly type-check the block assignment after that, so that's the operation that should be disallowed. This also prevents safe code from mutating untyped memory that was passed through from system code without preventing it to pass the block back to trusted code.
Feb 18 2016
prev sibling parent Chris Wright <dhasenan gmail.com> writes:
On Thu, 18 Feb 2016 19:17:27 +0000, Era Scarecrow wrote:

 On Thursday, 18 February 2016 at 18:41:25 UTC, Steven Schveighoffer
 wrote:
 On 2/18/16 1:30 PM, Timon Gehr wrote:
 No problem here. There is no way to assign to a void[] without doing
 2.
foo(void[] arr) { void[] arr2 = [1234, 5678, 91011]; arr[] = arr2[0 .. arr.length]; }
Since void throws away type information (and all the safety related to it), would it be easier to simply require safe code can't cast implicitly to void? It seems like explicit casting would take care of most of this, or disallowing to/from void converting period while in safe code.
Casting *from* void[] is also a big issue. Disallow all implicit and explicit casts between void[] and anything else to start, and we can look at the rest case-by-caste. We can probably cast to const(void)[] safely, and we can probably cast arrays that contain no pointers to void[] safely. Casting from void[] to const(T)[] where T contains no pointers (or arrays or functions or delegates) should also be safe.
   To be honest, I think there's only 1 time I actually used a
 void[] in my code, and that was while writing a section in the BitArray
 replacement code years back in the case you wanted to use/read another
 block of data as the source for the BitArray. Beyond that I never
 touched it.
A lot of the IO stuff in Phobos uses void[]. std.socket is lousy with it. I think the intention is that you can send arbitrary data over the wire without having to explicitly marshal it into a ubyte[].
Feb 18 2016
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 07:30:34PM +0100, Timon Gehr via Digitalmars-d wrote:
 On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
[...]
1) Casting an array of elements with indirections (in this case
Object[]) to void[] is questionable at best, outright unsafe at
worst, as shown here. Even if we were to rewrite readData() and mark
it  trusted, it still raises the question of what a  trusted function
can legally do with void[], which is essentially a type-erased array,
that justifies being tagged as  trusted.  How can a function do
anything that doesn't break  safety if all type information about the
array has been erased, and it essentially sees it only as a ubyte[]?
I'm inclined to say that  trusted functions should only be allowed to
receive const(void)[] as parameter, not void[].

	https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
Sure there is: void breakSafety(void[] data) safe { union U { void[] d; int[] arr; } U u; u.d = data; u.arr[0] = 0xFF; // kaboom }
2) To add salt to the wound, though, blatantly casting void[] to T[]
is allowed in  safe code, thus, readData() can even get away with
claiming to be  safe, when in fact it is anything but.

	https://issues.dlang.org/show_bug.cgi?id=15672
This is the culprit.
It's only one of many culprits. As long as there is any way around safe, the entire system of guarantees breaks down. T -- Question authority. Don't ask why, just do it.
Feb 18 2016
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 18 February 2016 at 18:41:58 UTC, H. S. Teoh wrote:
 On Thu, Feb 18, 2016 at 07:30:34PM +0100, Timon Gehr via 
 Digitalmars-d wrote:
 No problem here. There is no way to assign to a void[] without 
 doing 2.
Sure there is: void breakSafety(void[] data) safe { union U { void[] d; int[] arr; } U u; u.d = data; u.arr[0] = 0xFF; // kaboom }
Well, unions with an array in them can't be safe. That's clearly a bug, regardless of whether void[] is involved or not. Regardless, as far as I can tell, there is zero safety problem with converting to void[]. You'll never get corrupted memory with that conversion. It's converting back that risks screwing everything up. And that's what can't be safe.
 It's only one of many culprits. As long as there is any way 
 around  safe, the entire system of guarantees breaks down.
Of course, and we went about things the wrong way with safe. It should have been done via whitelisting, whereas we've done it via blacklisting. Given that fact, we're pretty much permanently at risk of safe being broken. - Jonathan M Davis
Feb 18 2016
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 06:55:18PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, 18 February 2016 at 18:41:58 UTC, H. S. Teoh wrote:
[...]
It's only one of many culprits. As long as there is any way around
 safe, the entire system of guarantees breaks down.
Of course, and we went about things the wrong way with safe. It should have been done via whitelisting, whereas we've done it via blacklisting. Given that fact, we're pretty much permanently at risk of safe being broken.
[...] https://issues.dlang.org/show_bug.cgi?id=12941 T -- Why are you blatanly misspelling "blatant"? -- Branden Robinson
Feb 18 2016
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 18.02.2016 19:41, H. S. Teoh via Digitalmars-d wrote:
 On Thu, Feb 18, 2016 at 07:30:34PM +0100, Timon Gehr via Digitalmars-d wrote:
 On 18.02.2016 17:37, H. S. Teoh via Digitalmars-d wrote:
[...]
 1) Casting an array of elements with indirections (in this case
 Object[]) to void[] is questionable at best, outright unsafe at
 worst, as shown here. Even if we were to rewrite readData() and mark
 it  trusted, it still raises the question of what a  trusted function
 can legally do with void[], which is essentially a type-erased array,
 that justifies being tagged as  trusted.  How can a function do
 anything that doesn't break  safety if all type information about the
 array has been erased, and it essentially sees it only as a ubyte[]?
 I'm inclined to say that  trusted functions should only be allowed to
 receive const(void)[] as parameter, not void[].

 	https://issues.dlang.org/show_bug.cgi?id=15702
No problem here. There is no way to assign to a void[] without doing 2.
Sure there is: void breakSafety(void[] data) safe { union U { void[] d; int[] arr; } U u; u.d = data; u.arr[0] = 0xFF; // kaboom } ...
That's in essence just a different way to cast void[] to int[]. Steven's example disproves my previous claim though. Still, I don't think the conversion is the problem. It's the mutation of memory without type information.
 2) To add salt to the wound, though, blatantly casting void[] to T[]
 is allowed in  safe code, thus, readData() can even get away with
 claiming to be  safe, when in fact it is anything but.

 	https://issues.dlang.org/show_bug.cgi?id=15672
This is the culprit.
It's only one of many culprits. As long as there is any way around safe, the entire system of guarantees breaks down.
If you want to verify guarantees, safe should be specified by inclusion and ideally, type safety should be proved formally (after applying a few fixes). This first requires a formal language semantics. That's already a lot of effort, and after that you still don't have a verified implementation.
Feb 18 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 08:23:24PM +0100, Timon Gehr via Digitalmars-d wrote:
 On 18.02.2016 19:41, H. S. Teoh via Digitalmars-d wrote:
[...]
	void breakSafety(void[] data)  safe
	{
		union U {
			void[] d;
			int[] arr;
		}
		U u;
		u.d = data;
		u.arr[0] = 0xFF; // kaboom
	}
...
That's in essence just a different way to cast void[] to int[]. Steven's example disproves my previous claim though. Still, I don't think the conversion is the problem. It's the mutation of memory without type information.
Fair enough. But mutating memory *with* type information isn't necessarily safer either: void fun(int*[] intPtrs) { union U { int*[] ptrs; long[] longs; } U u; u.ptrs = intPtrs; u.longs[0] = 12345; // oops } Basically, when it comes to safe, one of the intrinsic requirements is that arbitrary values must not be reinterpreted as pointers. In a sense, it doesn't matter how you get there -- whether by implicit cast to void[], or by unions, or whatever -- as soon as there is a way to reinterpret an arbitrary value as a pointer, safe is broken. The arbitrary can be another pointer, and may even be a pointer to a type with a compatible binary representation, but the point remains. For example: void safeFunc() safe { ... } void unsafeFunc() system { ... } union FuncPtrs { void function() safe fun1; void function() system fun2; } FuncPtrs ptrs; ptrs.fun1 = &unsafeFunc(); ptrs.fun2(); // oops
From the POV of binary representation, both fun1 and fun2 are
essentially the same -- they are just function pointers: 32-bit or 64-bit addresses of some executable code. However, the breakage here is caused by reinterpreting a pointer to a system function as a pointer to a safe function. At no time do we lose type information, but as soon as we have reintepretation of something as a pointer, safe is out the window. [...]
2) To add salt to the wound, though, blatantly casting void[] to
T[] is allowed in  safe code, thus, readData() can even get away
with claiming to be  safe, when in fact it is anything but.

	https://issues.dlang.org/show_bug.cgi?id=15672
This is the culprit.
It's only one of many culprits. As long as there is any way around safe, the entire system of guarantees breaks down.
If you want to verify guarantees, safe should be specified by inclusion and ideally, type safety should be proved formally (after applying a few fixes). This first requires a formal language semantics. That's already a lot of effort, and after that you still don't have a verified implementation.
I'm not sure how feasible it is to do a formal proof with the current dmd code, since you'd have to freeze it, and as soon as the next PR is merged, the proof is potentially invalidated. However, a good start is to make *everything* in the language system by default, and then have a whitelist of things that are safe. Currently, our implementation is to assume safety by default and then add things to a system blacklist as we discover them. This is the wrong approach, since we don't know what loopholes currently exist that can be exploited, and the number of combinations of language features is far too large for us to be ever sure that we've plugged all the holes. It's better to start conservatively by assuming everything is system, and as people find more valid use cases that ought to be safe, those can be added to the whitelist after careful vetting. This is essentially the idea behind: https://issues.dlang.org/show_bug.cgi?id=15672 T -- You have to expect the unexpected. -- RL
Feb 18 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 18.02.2016 21:57, H. S. Teoh via Digitalmars-d wrote:
If you want to verify guarantees,  safe should be specified by
inclusion and ideally, type safety should be proved formally (after
applying a few fixes).
This first requires a formal language semantics.
That's already a lot of effort, and after that you still don't have a
verified implementation.
I'm not sure how feasible it is to do a formal proof with the current dmd code, since you'd have to freeze it, and as soon as the next PR is merged, the proof is potentially invalidated.
The most obvious reason why it is infeasible to formally prove correctness of the current dmd code is that it is not correct.
Feb 18 2016
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/18/16 11:37 AM, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current implementation is
 rather lackluster.
IMO, I think safe code should disallow casting that doesn't involve a runtime function that verifies safety (such as casting an object to another type, or invoking a safe opCast), including casting array types. And implicit casts to void[] should be disallowed. This would be a painful restriction, which is why I think it won't happen :( -Steve
Feb 18 2016
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 18 February 2016 at 18:58:24 UTC, Steven 
Schveighoffer wrote:
 And implicit casts to void[] should be disallowed.
We could strike a sane medium: in a safe function or when calling a safe function, implicit conversions of any array WITH REFERENCES is not allowed, while other arrays may be reinterpreted. So you can still pass ubyte[], char[], long[], etc. to a safe void[] thing, but not Object[] or ubyte*[] or struct Foo {int *a }[]. This should cover the vast majority of the sane use-cases for Socket.receive, etc., anyway without being a problem.
Feb 18 2016
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 01:58:24PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/18/16 11:37 AM, H. S. Teoh via Digitalmars-d wrote:
While  safe is a good idea in principle, the current implementation
is rather lackluster.
IMO, I think safe code should disallow casting that doesn't involve a runtime function that verifies safety (such as casting an object to another type, or invoking a safe opCast), including casting array types.
But isn't casting from, say, long[] to int[] safe? Last time I checked, druntime does actually convert .length correctly as well, so that converting, say, int[] to long[] won't let you overrun the array by accessing the last elements of the long[]. But obviously casting Object[] to long[] would be a very bad idea, and should not be allowed in safe. Scarily enough, this code currently compiles without any complaint from the compiler: void main() safe { Object[] objs = [ new Object() ]; long[] longs = cast(long[]) objs; longs[0] = 12345; } Yikes. Apparently you don't even need void[] to break safe here. :-(
 And implicit casts to void[] should be disallowed.
 
 This would be a painful restriction, which is why I think it won't
 happen :(
[...] As far as I can tell, implicit casting to const(void)[] shouldn't be a cause for concern -- if all you can do with it is to read the data (e.g., a hexdump debugging function for printing out the byte representation of something), there shouldn't be any problems. It's when you write to the void[] that bad things begin to happen. T -- Государство делает вид, что платит нам зарплату, а мы делаем вид, что работаем.
Feb 18 2016
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/18/16 2:25 PM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, Feb 18, 2016 at 01:58:24PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/18/16 11:37 AM, H. S. Teoh via Digitalmars-d wrote:
 While  safe is a good idea in principle, the current implementation
 is rather lackluster.
IMO, I think safe code should disallow casting that doesn't involve a runtime function that verifies safety (such as casting an object to another type, or invoking a safe opCast), including casting array types.
But isn't casting from, say, long[] to int[] safe? Last time I checked, druntime does actually convert .length correctly as well, so that converting, say, int[] to long[] won't let you overrun the array by accessing the last elements of the long[].
Probably. But there is always trusted escapes. We are guaranteed safety if we just disallow casting whatsoever. It doesn't sit well with me that casting is a standard mechanism to use in safe code. Casting is supposed to be a red flag. Even converting between object types, I think casting is just a terrible requirement. Maybe it's too restrictive, I don't know. It's unlikely to happen in any case.
 But obviously casting Object[] to long[] would be a very bad idea, and
 should not be allowed in  safe. Scarily enough, this code currently
 compiles without any complaint from the compiler:

 	void main()  safe
 	{
 		Object[] objs = [ new Object() ];
 		long[] longs = cast(long[]) objs;
 		longs[0] = 12345;
 	}

 Yikes. Apparently you don't even need void[] to break  safe here. :-(
The difference here is that you have a cast. Casting to void[] doesn't require one.
 And implicit casts to void[] should be disallowed.

 This would be a painful restriction, which is why I think it won't
 happen :(
[...] As far as I can tell, implicit casting to const(void)[] shouldn't be a cause for concern -- if all you can do with it is to read the data (e.g., a hexdump debugging function for printing out the byte representation of something), there shouldn't be any problems. It's when you write to the void[] that bad things begin to happen.
Possibly casting to const(T)[] should be OK for safe code. But again, a conservative approach that disallows casts is what I would do. Note that you can't print out bytes from a void[], you would need a ubyte[]. -Steve
Feb 18 2016
prev sibling parent reply Kagamin <spam here.lot> writes:
On Thursday, 18 February 2016 at 16:37:10 UTC, H. S. Teoh wrote:
 (*ahem*std.socket*cough*) liberally sprinkle  trusted on every 
 function without regard to whether it's truly justified (e.g., 
 see https://issues.dlang.org/show_bug.cgi?id=15672)
How is bug 15672 related to std.socket? From quick glance at first thousand lines of std.socket I don't see functions incorrectly marked as trusted.
Feb 18 2016
next sibling parent Kagamin <spam here.lot> writes:
Well, ok, getprotobyname indeed looks thread unsafe for some 
reason, but this can happen for any function.
Feb 18 2016
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 18 February 2016 at 19:28:16 UTC, Kagamin wrote:
 On Thursday, 18 February 2016 at 16:37:10 UTC, H. S. Teoh wrote:
 (*ahem*std.socket*cough*) liberally sprinkle  trusted on every 
 function without regard to whether it's truly justified (e.g., 
 see https://issues.dlang.org/show_bug.cgi?id=15672)
How is bug 15672 related to std.socket? From quick glance at first thousand lines of std.socket I don't see functions incorrectly marked as trusted.
The problem with std.socket is that calls to stuff like recv involve processing the data from void[], which is inherently unsafe, but the D functions calling them (e.g. receive) are marked as trusted. It's conceptually similar to 15672 in that recv is taking the void* that it's given (from the void[]) and essentially converting it to ubyte* (or char*, since it's C), which is more or less equivalent to casting the void[] to ubyte[], which is what 15672 is about. However, even if 15672 is fixed, it wouldn't affect std.socket any, because it's the fact that a pointer to void[] is passed to a C function that converts it to ubyte*/char* that's the problem, not that void[] is converted to ubyte[] (since that never actually happens). - Jonathan M Davis
Feb 18 2016
prev sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Feb 18, 2016 at 07:28:16PM +0000, Kagamin via Digitalmars-d wrote:
 On Thursday, 18 February 2016 at 16:37:10 UTC, H. S. Teoh wrote:
(*ahem*std.socket*cough*) liberally sprinkle  trusted on every function
without regard to whether it's truly justified (e.g., see
https://issues.dlang.org/show_bug.cgi?id=15672)
How is bug 15672 related to std.socket? From quick glance at first thousand lines of std.socket I don't see functions incorrectly marked as trusted.
Sorry, I pasted the wrong link. It should be: https://issues.dlang.org/show_bug.cgi?id=14137 T -- Some days you win; most days you lose.
Feb 18 2016