digitalmars.D - Shouldn't casting an object to void* be considered safe?

reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I recently got to thinking about a code snippet. The following 
doesn't compile, because casting to a void* is considered unsafe:

-----
import std.stdio;

class C
{
     void foo() @safe
     {
         writeln("%s: C.foo()", cast(void*)this);
     }
}

void main()
{
}
-----

 test.d(7): Error: cast from `test.C` to `void*` not allowed in 
 safe code
However, I don't see this cast as being unsafe. Casting a class object to a `void*` doesn't break the type system by itself. You cannot assign a `void*` to any other pointer type without an additional cast, and that additional cast would be the unsafe one. Additionally, you cannot dereference a `void*`, so as far as I can see it's fairly safe to use in @safe code.

Wouldn't it make sense to allow casting reference types to `void*` in @safe code? Are there edge-cases I haven't considered?
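
For readers who want the object's address in @safe code while the cast is rejected, one possible workaround is to isolate the cast in a small @trusted helper. This is only a sketch: `identityOf` and `C.id` are hypothetical names that do not appear in this thread, and the helper is @trusted on the assumption that its result is never cast back and dereferenced.

```d
import std.stdio : writefln;

// Hypothetical helper (not from the thread): the single, manually
// verified place where a class reference becomes an opaque address.
void* identityOf(Object o) @trusted pure nothrow @nogc
{
    return cast(void*) o;
}

class C
{
    void* id() @safe
    {
        // @safe code may call the @trusted helper, but it still
        // cannot dereference the void* it gets back.
        return identityOf(this);
    }
}

void main()
{
    auto c = new C;
    // Prints the object's address followed by the label.
    writefln("%s: C.id()", c.id());
}
```

The design choice here is to keep the manually verified surface as small as possible: one cast, in one function, with everything else left for the compiler to check.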
Dec 13 2019
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Friday, 13 December 2019 at 08:05:53 UTC, Andrej Mitrovic 
wrote:
 I recently got to thinking about a code snippet. The following 
 doesn't compile, because casting to a void* is considered 
 unsafe:

 -----
 import std.stdio;

 class C
 {
     void foo() @safe
     {
         writeln("%s: C.foo()", cast(void*)this);
     }
 }

 void main()
 {
 }
 -----

 test.d(7): Error: cast from `test.C` to `void*` not allowed in 
 safe code
 However, I don't see this cast as being unsafe. Casting a class object to a `void*` doesn't break the type system by itself. You cannot assign a `void*` to any other pointer type without an additional cast, and that additional cast would be the unsafe one. Additionally, you cannot dereference a `void*`, so as far as I can see it's fairly safe to use in @safe code. Wouldn't it make sense to allow casting reference types to `void*` in @safe code? Are there edge-cases I haven't considered?
Your example is only safe if writeln happens to be @safe. If it is not implemented in D, or is available only as a binary, anything goes. Naturally, you could argue that those implementations could just as well change the bit representation after getting the safe reference to C; however, this is another defensive barrier to prevent doing the wrong thing by default.
Dec 13 2019
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On Friday, 13 December 2019 at 08:24:03 UTC, Paulo Pinto wrote:
 Your example is only safe if writeln happens to be @safe.
Well yes, you could only call @safe functions with that void*. That's how transitivity works.
Dec 13 2019
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 13 December 2019 at 08:05:53 UTC, Andrej Mitrovic 
wrote:
 However, I don't see this cast as being unsafe. Casting a class 
 object to a `void*` doesn't break the type system by itself. 
 You cannot assign a `void*` to any other pointer type without 
 an additional cast, and that additional cast would be the 
 unsafe one. Additionally, you cannot dereference a `void*`, so as 
 far as I can see it's fairly safe to use in @safe code.

 Wouldn't it make sense to allow casting reference types to 
 `void*` in @safe code? Are there edge-cases I haven't 
 considered?
Surely where code has a cast to `void*` and then later back to some other pointer type, its overall safety is dependent on what happens at both ends. The safety of the cast from `void*` cannot be validated by the developer without knowing how the cast _to_ `void*` was done. Making a cast to `void*` unsafe is therefore an important push to the developer to say, "You need to look at this and validate what you're doing against how you use it later."
Dec 14 2019
next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Saturday, 14 December 2019 at 17:25:25 UTC, Joseph Rushton 
Wakeling wrote:
 Making a cast to `void*` unsafe is therefore an important push 
 to the developer to say, "You need to look at this and validate 
 what you're doing against how you use it later."
Casting to void should be safe, that is the same as providing a pure object identity.

This is useful for a set of objects.

That is the most restrictive reference-type the type system provides.
Dec 14 2019
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 14 December 2019 at 17:30:53 UTC, Ola Fosheim 
Grøstad wrote:
 Casting to void should be safe, that is the same as providing a 
 pure object identity.

 This is useful for a set of objects.
What's the point in generating a pure object identity, or a set of objects, unless those objects are going to be used? And how do you validate that that usage is safe without knowing something about the circumstances in which those pure identities were generated?
Dec 14 2019
next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Saturday, 14 December 2019 at 17:54:20 UTC, Joseph Rushton 
Wakeling wrote:
 What's the point in generating a pure object identity, or a set 
 of objects, unless those objects are going to be used?  And how 
 do you validate that that usage is safe without knowing 
 something about the circumstances in which those pure 
 identities were generated?
Sometimes you just want a set of all object identities with a specific property so that you can query for that property.

It is the same as the "tag" reference type in Pony lang. It is safe for multi-threading too.

Say, if a thread is waiting for a set of futures to complete then it can remove object identities from the set based on events.
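
The identity-set idea might look roughly like the sketch below. All names (`identityOf`, `Future`, `PendingSet`) are hypothetical stand-ins rather than anything from this thread or from Phobos; the addresses are stored as `void*` keys and are only ever compared, never dereferenced. The set operations are left unannotated so the sketch makes no claim about which of them the compiler accepts as @safe.

```d
// Hypothetical helper: the only place that performs the cast.
void* identityOf(Object o) @trusted pure nothrow @nogc
{
    return cast(void*) o;
}

class Future {} // stand-in for whatever is being awaited

struct PendingSet
{
    private bool[void*] ids; // keys are opaque identities

    void insert(Object o) { ids[identityOf(o)] = true; }
    bool drop(Object o) { return ids.remove(identityOf(o)); }
    bool has(Object o) const { return (identityOf(o) in ids) !is null; }
    size_t length() const { return ids.length; }
}

void main()
{
    auto a = new Future;
    auto b = new Future;

    PendingSet pending;
    pending.insert(a);
    pending.insert(b);

    pending.drop(a); // pretend an event reported that `a` completed

    assert(!pending.has(a));
    assert(pending.has(b));
    assert(pending.length == 1);
}
```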
Dec 14 2019
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 14 December 2019 at 18:10:20 UTC, Ola Fosheim 
Grøstad wrote:
 Sometimes you just want a set of all object identities with a 
 specific property so that you can query for that property.

 It is the same as the "tag" reference type in Pony lang. It is 
 safe for multi-threading too.

 Say, if a thread is waiting for a set of futures to complete 
 then it can remove object identities from the set based on 
 events.
Fair enough, but note that the safety of the `void*` in those use-cases is still down to the specifics of how they are used. The obligation to put a `@trusted` on the cast to `void*` for use-cases like this seems minor compared to the value of explicitly nudging the developer to verify that the use-case is safe. IOW, having casts to `void*` be unsafe seems a tactic, rather than a principle.
Dec 14 2019
prev sibling parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 14 December 2019 at 17:54:20 UTC, Joseph Rushton 
Wakeling wrote:
 What's the point in generating a pure object identity, or a set 
 of objects, unless those objects are going to be used?
Whether something is useful or a good practice is not relevant for @safe; it is only concerned with whether it makes memory corruption possible.
 And how do you validate that that usage is safe without knowing
 something about the circumstances in which those pure 
 identities were generated?
Any function that takes a void* and casts it to a different pointer that gets dereferenced is not @safe, whether class casting to void* is allowed or not. You simply can't assume anything about a void* even in @safe code.

Note that casting pointer types to void is already allowed in @safe:

```
class C {}

void main() @safe
{
    void* v0 = cast(void*) new int; // allowed, implicit conversion
    void* v1 = cast(void*) 0xDEAFBEEF; // not allowed, not a pointer type
    void* v2 = cast(void*) new C(); // not allowed, not a pointer type
}
```

The spec says:
 - No casting from a pointer type to any type other than void*.
 - No casting from any non-pointer type to a pointer type.
https://dlang.org/spec/function.html#safe-functions

The second line could be changed to "No casting from any non-pointer type to a pointer type other than void*" however.
Dec 14 2019
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 14 December 2019 at 18:46:38 UTC, Dennis wrote:
 Whether something is useful or a good practice is not relevant 
 for @safe, it is only concerned with whether it makes memory 
 corruption possible.
Indeed. Has anything I said suggested otherwise?
 And how do you validate that that usage is safe without knowing
 something about the circumstances in which those pure 
 identities were generated?
 Any function that takes a void* and casts it to a different pointer that gets dereferenced is not @safe, whether class casting to void* is allowed or not. You simply can't assume anything about a void* even in @safe code.
Note that I said safe, not @safe. The difference matters: code can be memory-safe, but that memory-safety may not be verifiable by the compiler. That's relevant to this discussion, because when asking "Should the cast to `void*` be @safe?" we are really asking, "Is it possible for the compiler to decide this without needing the developer to verify?"
 Note that casting pointer types to void is already allowed in 
 @safe:
You're right, I had a memory lapse on that one. Let's return the focus to casting objects to `void*`. I would suggest that the fact that there's a pointer under the hood of an object is an implementation detail. An unavoidable one, sure, but the point is that by casting it to a pointer type of any kind (`void*` or otherwise) we are creating opportunities for unsafe memory access that didn't exist previously. The question of whether that cast is safe -- not @safe! -- is down to the use-case. So it's reasonable to ask the developer to validate both ends (i.e. how the cast to a pointer is done, and how that `void*` is used afterwards).
 The second line could be changed to "No casting from any 
 non-pointer type to a pointer type other than void*" however.
No, that won't do. What if you cast from a `ulong` to a `void*`? The spec is correct to ban casting from non-pointer types to pointer types. Andrej has raised a reasonable point about whether we might reconsider where objects are concerned, but I think the spec is correct as is (for reasons already given, and probably others too).
Dec 14 2019
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 14 December 2019 at 19:16:08 UTC, Joseph Rushton 
Wakeling wrote:
 Indeed.  Has anything I said suggested otherwise?
My bad, I thought you were implying that it shouldn't be @safe because it isn't useful.
 That's relevant to this discussion, because when asking "Should 
 the cast to `void*` be @safe?" we are really asking, "Is it 
 possible for the compiler to decide this without needing the 
 developer to verify?"
The compiler should only be concerned with verifying what it knows. E.g. if a `bool*` reaches `@safe` code, it must be null or point to a mutable 0 or 1 value. I don't think adding any additional rules is helpful. You might also say "casting ubyte[] to char[] should be unsafe because the compiler doesn't know whether the developer's string functions exhibit undefined behavior on invalid unicode", but if the developer relies on correct unicode like that, his functions are incorrectly marked @trusted. If you want that guarantee, you should not use char[] but make your own struct type that ensures that.

A void* is an opaque pointer type. If I start writing a function like:

```
void setZero(void* ptr) @trusted
{
}
```

I simply can't assume anything about ptr other than that it is a valid `void*`. It does raise the question of what can be assumed about a `void*`. You could say that a void* must point to at least 1 byte of mutable memory, such that this function is correctly @trusted:

```
void setZero(void* ptr) @trusted
{
    if (ptr)
    {
        // cast to ubyte* (a pointer), then write through it
        *(cast(ubyte*) ptr) = 0;
    }
}
```

However, I'd argue a void* is even more opaque than that, since it's not uncommon to use them as generic 'handles' in C libraries, such as HMODULE in the Windows API:
https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-getprocaddress
 we are creating opportunities for unsafe memory access that 
 didn't exist previously.
I doubt that. If you have a code example, I'm very interested. I bet that any @trusted code that can be broken by allowing cast(void*) in @safe code can also be broken without that.
 No, that won't do.  What if you cast from a `ulong` to a 
 `void*`?
That is `@safe`, unless there is a way to corrupt memory in `@safe` code by doing that.
Dec 14 2019
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 14 December 2019 at 20:53:49 UTC, Dennis wrote:
 No, that won't do.  What if you cast from a `ulong` to a 
 `void*`?
 That is `@safe`, unless there is a way to corrupt memory in `@safe` code by doing that.
No, it is not safe, and for good reason. When you cast an integral value to a `void*` that value gets reinterpreted as a memory address. But you have absolutely no right to assume that it is a valid memory address.

    ulong u = 8;
    auto v = cast(void*) u;

... is totally unsafe, and the compiler rightly rejects it if you try to do that in a code block marked @safe. But you shouldn't need the compiler to tell you to know that this is a really messed up thing to do. How do you know that memory address 8 is in any way valid?

Things like this are WHY the spec has the rule that one cannot cast from a non-pointer type to `void*` in code marked @safe.
Dec 14 2019
next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Saturday, 14 December 2019 at 23:48:38 UTC, Joseph Rushton 
Wakeling wrote:
     ulong u = 8;
     auto v = cast(void*) u;

 ... is totally unsafe, and the compiler rightly rejects it if 
 you try to do that in a code block marked @safe.
It is not type safe, but it is memory safe. void has zero extent, so it will never be dereferenced without an unsafe action. It is no different from having a pointer to a zero-length array, or a zero-length slice. Allocators will often allocate 1 byte if asked to allocate 0 though, to avoid such typing issues. I.e. to allow you to obtain unique identities with zero extent. So it is problematic, but not because of void*. It is only problematic because other code is unsafe.
Dec 14 2019
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Sunday, 15 December 2019 at 07:40:41 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 14 December 2019 at 23:48:38 UTC, Joseph Rushton 
 Wakeling wrote:
     ulong u = 8;
     auto v = cast(void*) u;

 ... is totally unsafe, and the compiler rightly rejects it if 
 you try to do that in a code block marked @safe.
It is not type safe, but it is memory safe. void has zero extent, so it will never be dereferenced without an unsafe action.
Which brings us back to the start of the discussion: @safe checks are not just about the strictest definition of memory safety, but also about actions that open up a path to unsafe behaviour unless they are carefully validated. You may not accept the principle but that is the reality of the spec.
 It is no different from having a pointer to a zero-length 
 array, or a zero-length slice. Allocators will often allocate 1 
 byte if asked to allocate 0 though, to avoid such typing 
 issues. I.e. to allow you to obtain unique identities with zero 
 extent.
I'm not sure that's true. The pointer of a zero-length array should be guaranteed to be `null` or to point to a valid section of GC memory. An arbitrary integral value reinterpreted as a memory address has no such guarantee.

BTW, note that the array/slice `.ptr` property is not @safe, precisely because it opens the door to looking into a buffer of memory that one is not supposed to have access to.
 So it is problematic, but not because of void*. It is only 
 problematic because other code is unsafe.
These are not the criteria that @safe operates by. But FWIW in these cases you're right that there's nothing magical about `void*`. It's the casting of a non-pointer type to ANY pointer type that is considered a violation of @safe.
Dec 15 2019
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
BTW, note that the spec defines a @safe function as one that has 
been statically checked to exhibit no undefined behaviour.

Casting from a non-pointer type to a pointer seems a pretty good 
example of something that makes it impossible to statically 
confirm that no undefined behaviour is taking place.
Dec 15 2019
next sibling parent ag0aep6g <anonymous example.com> writes:
On 15.12.19 10:32, Joseph Rushton Wakeling wrote:
 BTW, note that the spec defines a @safe function as one that has been 
 statically checked to exhibit no undefined behaviour.
 
 Casting from a non-pointer type to a pointer seems a pretty good 
 example of something that makes it impossible to statically confirm that 
 no undefined behaviour is taking place.
The other side is saying it's possible for void*. It goes like this:

1) By itself, an invalid pointer doesn't exhibit UB.
2) Dereferencing an invalid pointer does exhibit UB.
3) There is no other way to trigger UB with an invalid pointer.
4) Dereferencing void* isn't allowed in @safe code.

Conclusion: An invalid void* cannot lead to UB in @safe code. So casting anything to void* can be allowed there.

I'm pretty sure that sentences 1, 2, and 4 are correct. Number 3 seems to be the interesting one. A counter-example (using an invalid pointer to trigger UB without dereferencing the pointer) would shut the argument down.
Dec 15 2019
prev sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 15 December 2019 at 09:32:12 UTC, Joseph Rushton 
Wakeling wrote:
 BTW, note that the spec defines a @safe function as one that 
 has been statically checked to exhibit no undefined behaviour.

 Casting from a non-pointer type to a pointer seems a pretty 
 good example of something that makes it impossible to 
 statically confirm that no undefined behaviour is taking place.
Not impossible? It only becomes a problem if @trusted code does unsafe things, like taking `void*` pointers that it intends to dereference. You just need to show that `void*` pointers that @safe code can write to are not dereferenced.
Dec 15 2019
prev sibling parent Dennis <dkorpel gmail.com> writes:
On Sunday, 15 December 2019 at 09:15:39 UTC, Joseph Rushton 
Wakeling wrote:
 Which brings us back to the start of the discussion: @safe 
 checks are not just about the strictest definition of memory 
 safety, but also about actions that open up a path to unsafe 
 behaviour unless they are carefully validated.
You can demonstrate that by giving a piece of @trusted code that is correct when casting to `void*` is disallowed in @safe code and incorrect when it is allowed. My guess is that you haven't done that, because cast(void*) doesn't actually open up any paths to unsafe behavior that weren't previously there. I'd love to be proven wrong about this though.
 You may not accept the principle but that is the reality of the 
 spec.
What part of the spec are you referring to? I'm mostly concerned with:
 Memory safety does not imply that code is portable, uses only 
 sound programming practices, is free of byte order 
 dependencies, or other bugs. It is focussed only on eliminating 
 memory corruption possibilities.
https://dlang.org/spec/memory-safe-d.html#limitations
Dec 15 2019
prev sibling next sibling parent Dennis <dkorpel gmail.com> writes:
On Saturday, 14 December 2019 at 23:48:38 UTC, Joseph Rushton 
Wakeling wrote:
     ulong u = 8;
     auto v = cast(void*) u;
 ... is totally unsafe, and the compiler rightly rejects it if 
 you try to do that in a code block marked @safe.
It is not sufficient for causing memory corruption in @safe code.
 But you shouldn't need the compiler to tell you to know that 
 this is a really messed up thing to do.
If "messed up" means that it can cause memory corruption in @safe code, then you haven't demonstrated that yet. If "messed up" means that it is a bad practice, then we are back to "@safe is about eliminating memory corruption possibilities, not bug free / good practice / useful".
 How do you know that memory address 8 is in any way valid?
You don't. That's the point of void*, you cannot safely cast it to any pointer. You can safely cast it to an integer and use that, or pass it to @safe functions that use it safely.

```
import std.stdio: writeln;

void main()
{
    import core.sys.windows.winbase: CreateSemaphoreW, CloseHandle;

    void* fromMe = cast(void*) 516; // currently not @safe, but should be
    void* fromC = CreateSemaphoreW(null, 8, 8, null);
    writeln("fromMe: ", cast(size_t) fromMe); // 516
    writeln("fromC: ", cast(size_t) fromC); // could be 516 as well
    CloseHandle(fromMe); // likely won't succeed, but won't corrupt memory
    CloseHandle(fromC);
    badTrusted(fromMe); // NO!
    badTrusted(fromC); // NO! But still possible when disallowing @safe cast(void*)
}

void badTrusted(void* ptr) @trusted
{
    writeln(*(cast(ubyte*) ptr)); // this is simply bad
}
```
Dec 15 2019
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/14/19 6:48 PM, Joseph Rushton Wakeling wrote:
 On Saturday, 14 December 2019 at 20:53:49 UTC, Dennis wrote:
 No, that won't do.  What if you cast from a `ulong` to a `void*`?
 That is `@safe`, unless there is a way to corrupt memory in `@safe` code by doing that.
No, it is not safe, and for good reason.  When you cast an integral value to a `void*` that value gets reinterpreted as a memory address. But you have absolutely no right to assume that it is a valid memory address.
It's not technically unsafe, but is prone to safety problems. It's true that a void * is completely unusable in @safe code. But loads of code has both safe and unsafe parts. Not to mention that the GC makes different decisions based on whether a type contains a pointer or not.

I think we are reasonably fine leaving a rule in place to prevent casting from a non reference type to a void *. However, a class reference is very similar to a pointer, so we should allow that cast (to void * only).
 
 Things like this are WHY the spec has the rule that one cannot cast from 
 a non-pointer type to `void*` in code marked @safe.
The spec rule is likely not focused on void * or classes, but really something like:

    auto p = cast(int*) 8;

which then can be used in @safe code to do damage:

    *p = 5;

-Steve
Dec 16 2019
prev sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Saturday, 14 December 2019 at 18:46:38 UTC, Dennis wrote:
     void* v2 = cast(void*) new C(); // not allowed, not a
Why not? Holding the door open for memory compaction on the GC heap?
Dec 14 2019
parent Dennis <dkorpel gmail.com> writes:
On Saturday, 14 December 2019 at 19:18:58 UTC, Ola Fosheim 
Grøstad wrote:
 Why not? Holding the door open for memory compaction on the GC 
 heap?
I don't know, I suggest allowing it (unless I'm missing something).
Dec 14 2019
prev sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On Saturday, 14 December 2019 at 17:25:25 UTC, Joseph Rushton 
Wakeling wrote:
 Surely where code has a cast to `void*` and then later back to 
 some other pointer type
The question was only about the first part, casting to void*, not casting back.
Dec 15 2019
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Monday, 16 December 2019 at 01:01:26 UTC, Andrej Mitrovic 
wrote:
 On Saturday, 14 December 2019 at 17:25:25 UTC, Joseph Rushton 
 Wakeling wrote:
 Surely where code has a cast to `void*` and then later back to 
 some other pointer type
The question was only about the first part, casting to void*, not casting back.
Yea, I get that. My point is that I'm not sure that it makes sense to decouple the first part and the second part, when thinking about safety, and the programmer's responsibility to manually verify it.
Dec 16 2019
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/13/19 3:05 AM, Andrej Mitrovic wrote:
 I recently got to thinking about a code snippet. The following doesn't 
 compile, because casting to a void* is considered unsafe:
 
 -----
 import std.stdio;
 
 class C
 {
      void foo() @safe
      {
          writeln("%s: C.foo()", cast(void*)this);
      }
 }
 
 void main()
 {
 }
 -----
 
 test.d(7): Error: cast from `test.C` to `void*` not allowed in safe code
 However, I don't see this cast as being unsafe. Casting a class object to a `void*` doesn't break the type system by itself. You cannot assign a `void*` to any other pointer type without an additional cast, and that additional cast would be the unsafe one. Additionally, you cannot dereference a `void*`, so as far as I can see it's fairly safe to use in @safe code. Wouldn't it make sense to allow casting reference types to `void*` in @safe code? Are there edge-cases I haven't considered?
Yes. Implicit casting to void * from another pointer type is allowed. i.e. this code compiles in @safe mode:

    int *p = new int;
    void *v = p;

Class references are no different than pointers in @safe code (rebindable, can be null, no pointer arithmetic), and they are even more likely to live on the heap, which makes them even safer than the above.

Please file an issue if there isn't one already.

-Steve
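
The implicit pointer-to-`void*` conversion described above can be tried out with a short sketch. The function name `erase` is made up for illustration; the only claim is the one already made in the thread, that this conversion is accepted in @safe code.

```d
// A typed pointer converts to void* implicitly, even in @safe code.
void* erase(int* p) @safe pure nothrow @nogc
{
    return p; // no cast needed
}

void main() @safe
{
    int* p = new int;
    void* v = erase(p);
    void* q = p;

    // Both void* values refer to the same address.
    assert(v is q);

    // *v = 1; // would not compile: a void* cannot be dereferenced
}
```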
Dec 16 2019