digitalmars.D - Should modifying private members be system?

Dennis <dkorpel gmail.com> writes:
Imagine I am making my own slice type:

```
struct SmallSlice(T) {
     private T* ptr = null;
     private ushort _length = 0;
     ubyte[6] extraSpace;

     this(T[] slice) @safe {
         ptr = &slice[0];
         assert(slice.length <= ushort.max);
         _length = cast(ushort) slice.length;
     }

     T opIndex(size_t i) @trusted {
         return ptr[0.._length][i];
     }
}
```

The use of @trusted seems appropriate here, right? I'm doing 
pointer slicing, but I know that the _length member is correct 
for ptr since it came from a slice. The struct couldn't have been 
void-initialized or overlapped in a union since that's not 
allowed in @safe code when there are pointer members. Using { 
initializers } is not allowed since I have a constructor, and the 
members are private so they can't be modified outside of my 
(thoroughly audited) module.

Except that last thing isn't true:

```
import std;
void main() @safe {
     int[4] arr = [10, 20, 30, 40];
     auto slice = SmallSlice!int(arr[]);
     __traits(getMember, slice, "_length") = 100; // change private member
     writeln(slice[9]); // out of bounds memory access
}
```

__traits(getMember) enables one to bypass private, so the 
@trusted use was incorrect. This bypassing makes it really hard 
to write @trusted code relying on the integrity of the struct. 
How are you going to do @safe reference counting when the 
reference count can be tampered with from @safe code outside the 
control of the library writer?

I'm floating the idea that __traits(getMember) should be @system 
when it's bypassing visibility constraints of the member. What do 
you think?
Oct 04 2019
ag0aep6g <anonymous example.com> writes:
On Friday, 4 October 2019 at 11:55:11 UTC, Dennis wrote:
 ```
 struct SmallSlice(T) {
     private T* ptr = null;
     private ushort _length = 0;
     ubyte[6] extraSpace;

     this(T[] slice) @safe {
         ptr = &slice[0];
         assert(slice.length <= ushort.max);
         _length = cast(ushort) slice.length;
     }

     T opIndex(size_t i) @trusted {
         return ptr[0.._length][i];
     }
 }
 ```

 The use of @trusted seems appropriate here, right?
No.
 I'm doing pointer slicing, but I know that the _length member 
 is correct for ptr since it came from a slice.
You don't know that in opIndex. In opIndex you just know that there's a `T* ptr` and a `ushort _length`. opIndex doesn't know where they came from or if they're compatible.
 The struct couldn't have been void-initialized or overlapped in 
 a union since that's not allowed in @safe code when there are 
 pointer members. Using { initializers } is not allowed since I 
 have a constructor, and the members are private so they can't 
 be modified outside of my (thoroughly audited) module.
You're not marking the module as @trusted. You're marking opIndex. When you justify @trusted with something outside the marked function, you're misusing it. Unfortunately, breaking the rules like this is rather common. But it's still breaking the rules. It's still undermining @safe. [...]
 I'm floating the idea that __traits(getMember) should be 
 @system when it's bypassing visibility constraints of the 
 member. What do you think?
My best idea so far (might not be originally mine) is to let the @system attribute apply to variables. You'd mark `_length` as @system and the effect would be that @safe functions would not be allowed to access it.
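
A rough sketch of how that could look, assuming the proposed rule that @safe code may not touch an @system variable. That rule is the idea from this post, not something the thread confirms any compiler enforces; as far as I know, a compiler without it accepts the attribute on a field and simply ignores it. The constructor becomes @trusted here because it has to write the protected fields.

```d
// Sketch only: `@system` on a field follows the semantics *proposed* above.
struct SmallSlice(T) {
    @system private T* ptr = null;       // hypothetical: off-limits to @safe code
    @system private ushort _length = 0;  // hypothetical: off-limits to @safe code

    // Must be @trusted, since it writes the @system fields.
    this(T[] slice) @trusted {
        assert(slice.length <= ushort.max);
        ptr = slice.ptr;
        _length = cast(ushort) slice.length;
    }

    T opIndex(size_t i) @trusted {
        return ptr[0 .. _length][i];     // bounds-checked against _length
    }
}

void main() @safe {
    int[] arr = [10, 20, 30, 40];
    auto s = SmallSlice!int(arr);
    assert(s[2] == 30);
    // Under the proposed rule, this line would be rejected in @safe code:
    // __traits(getMember, s, "_length") = 100;
}
```

The audit surface then shrinks to the @trusted and @system functions that can reach `ptr` and `_length`.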
Oct 04 2019
Dennis <dkorpel gmail.com> writes:
On Friday, 4 October 2019 at 12:13:29 UTC, ag0aep6g wrote:
 My best idea so far (might not be originally mine) is to let 
 the @system attribute apply to variables. You'd mark `_length` 
 as @system and the effect would be that @safe functions would 
 not be allowed to access it.
I like that idea a lot more! It wouldn't be arbitrary or code-breaking.
 You're not marking the module as @trusted. You're marking 
 opIndex. When you justify @trusted with something outside the 
 marked function, you're misusing it.
In this case it's a member function. A common pattern is to put unsafe parts of a function in a block like this:

() @trusted {/*do something unsafe*/} ()

Such a lambda cannot be trusted unconditionally; it's trusted because it is known it can only be called in its context and not outside the outer function's scope.

My broader question is whether the language can be designed so member functions can rely on certain assumptions. Currently writing proper @trusted code is extremely hard because it is unknown what you can work with.

- does const give any guarantees? (afaik, yes)
- does pure give any guarantees? (no, you can return `cast(size_t) (new int)`)
- does private give any guarantees? (currently not, but it could)
 Unfortunately, breaking the rules like this is rather common. 
 But it's still breaking the rules. It's still undermining @safe.
So with @system members, can we give any guarantees of integrity to member functions?
Oct 04 2019
ag0aep6g <anonymous example.com> writes:
On Friday, 4 October 2019 at 12:34:46 UTC, Dennis wrote:
 In this case it's a member function. A common pattern is to put 
 unsafe parts of a function in a block like this:
 () @trusted {/*do something unsafe*/} ()

 Such a lambda cannot be @trusted unconditionally, it's trusted 
 because it is known it can only be called in its context and 
 not outside the outer function's scope.
Being pedantic, if the safety of the nested function relies on its surroundings, then it can't be @trusted. It may be a common pattern, but it's undermining @safe anyway.
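
To make the distinction concrete, here is a small sketch. The function names are made up for illustration. In the first function the lambda is only memory-safe because of a check that lives outside it; in the second the @trusted code establishes its own precondition.

```d
// Pattern under discussion: the lambda's safety depends on the bounds
// check performed *outside* it, in the enclosing @safe function.
int contextDependent(int[] arr, size_t i) @safe {
    if (i >= arr.length) return -1;
    return () @trusted { return arr.ptr[i]; }();
}

// Stricter variant: the check sits inside the @trusted code, so the
// unchecked access stays guarded even if the outer function is edited.
int selfContained(int[] arr, size_t i) @safe {
    return () @trusted {
        if (i >= arr.length) return -1;
        return arr.ptr[i];
    }();
}

void main() @safe {
    int[] a = [10, 20, 30];
    assert(contextDependent(a, 1) == 20);
    assert(selfContained(a, 1) == 20);
    assert(selfContained(a, 5) == -1);
}
```

Both compile and behave the same today. The difference is in what a reviewer has to read: deleting the outer `if` in `contextDependent` silently breaks memory safety without touching any @trusted code.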
 My broader question is whether the language can be designed so 
 member functions can rely on certain assumptions. Currently 
 writing proper @trusted code is extremely hard because it is 
 unknown what you can work with.
Agreed that writing @trusted code is hard.
 - does const give any guarantees? (afaik, yes)
Yes. It guarantees that the value won't change through the const reference. As far as I understand, you can rely on that in @trusted code. You can also rely on `immutable` data not ever changing (after creation). The flipside is that you can't break those guarantees in @trusted/@system code. The compiler will allow it, but you must take care not to mutate const/immutable values. Same with other guarantees.
 - does pure give any guarantees? (no, you can return 
 `cast(size_t) (new int)`)
Yeah, you can't rely on two identical calls giving the same result. `pure` pretty much just means "doesn't touch globals, at least not visibly". So here:

----
int x;
void f(const int* a) pure;
void main() { f(&x); }
----

It's guaranteed that the call to f doesn't change x. Without `pure`, it might.
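
A compilable version of both points, as a sketch. `freshAddress` is a made-up name wrapping the `cast(size_t) (new int)` expression from the quoted question, and `f` gets an empty body here so the program links.

```d
// A pure function can still return a value that may differ between calls:
// the address of a freshly allocated int.
size_t freshAddress() pure {
    return cast(size_t) (new int);
}

int x = 42;

// f is pure and only sees x through a const pointer, so it has no way
// to write to x: not through `a`, and not through the global directly.
void f(const int* a) pure {
    // *a = 1;  // rejected: `a` points to const data
    // x = 1;   // rejected: a pure function cannot access mutable globals
}

void main() {
    f(&x);
    assert(x == 42);             // unchanged by the call
    assert(freshAddress() != 0); // a successful allocation is never null
}
```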
 - does private give any guarantees? (currently not, but it 
 could)
I don't think `private` guarantees anything with regards to safety. Even if `__traits(getMember, ...)` and `.tupleof` and other low-level mechanisms respected `private`, you still couldn't rely on it for safety. You can't trust the current module/struct/class any more than other code. [...]
 So with @system members, can we give any guarantees of 
 integrity to member functions?
You'd have the guarantee that no @safe code can mess with the variable. You'd still have to make sure that all the @system and @trusted code plays nice with each other, but that's just how it is with non-@safe code. That is, some other @system/@trusted function could set an invalid _length in your struct. And then your opIndex would violate safety. But that would just be a bug in that other function then.
Oct 04 2019
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, October 4, 2019 5:55:11 AM MDT Dennis via Digitalmars-d wrote:
 Imagine I am making my own slice type:

 ```
 struct SmallSlice(T) {
      private T* ptr = null;
      private ushort _length = 0;
      ubyte[6] extraSpace;

      this(T[] slice) @safe {
          ptr = &slice[0];
          assert(slice.length <= ushort.max);
          _length = cast(ushort) slice.length;
      }

      T opIndex(size_t i) @trusted {
          return ptr[0.._length][i];
      }
 }
 ```

 The use of @trusted seems appropriate here, right? I'm doing
 pointer slicing, but I know that the _length member is correct
 for ptr since it came from a slice. The struct couldn't have been
 void-initialized or overlapped in a union since that's not
 allowed in @safe code when there are pointer members. Using {
 initializers } is not allowed since I have a constructor, and the
 members are private so they can't be modified outside of my
 (thoroughly audited) module.

 Except that last thing isn't true:

 ```
 import std;
 void main() @safe {
      int[4] arr = [10, 20, 30, 40];
      auto slice = SmallSlice!int(arr[]);
      __traits(getMember, slice, "_length") = 100; // change private member
      writeln(slice[9]); // out of bounds memory access
 }
 ```

 __traits(getMember) enables one to bypass private, so the
 @trusted use was incorrect. This bypassing makes it really hard
 to write @trusted code relying on the integrity of the struct.
 How are you going to do @safe reference counting when the
 reference count can be tampered with from @safe code outside the
 control of the library writer?

 I'm floating the idea that __traits(getMember) should be @system
 when it's bypassing visibility constraints of the member. What do
 you think?
There isn't necessarily anything unsafe about accessing public member variables. An @safe function isn't going to be doing anything @system regardless of what's done to the member variables by other functions, because if whether an operation is memory safe or not depends on the state of a variable, then that operation is always considered @system, and @trusted would have to be used to make it @safe.

However, if a member function is marked @trusted, then it's up to the programmer to ensure that it's guaranteed to be @safe in spite of whatever @system operations it's doing, and if that function relies on the member variables being in a particular state, and that state cannot actually be guaranteed, because the member variables are public and could have been messed with, then it's inappropriate to mark the function as @trusted. It's only appropriate to mark it as @trusted if the programmer can guarantee that what it's doing is @safe even if the member variables were messed with.

So, in general, @safe isn't a problem with public member variables, but @trusted is. Though honestly, having public member variables is problematic in general. It's okay for POD types, since they're just data, and in code that's not publicly distributed, having public member variables isn't necessarily a problem, because they can be made private with property functions if necessary. However, because turning public member variables into property functions breaks code, it's not the sort of thing that you can do when you don't control all of the code that uses it. As such, I think that it's rarely a good idea to use public member variables in public libraries, though in private applications, it's not as big a deal, because then the programmer or company controls all of the code and can fix any issues that pop up when APIs are changed in a way that breaks code.

- Jonathan M Davis
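
As a sketch of the last case described above (a @trusted member that stays memory-safe even if its fields were messed with), consider a hypothetical `FixedStack` type. It is not from the thread; the point is only that the @trusted function checks its own precondition instead of assuming `len` is valid.

```d
// Hypothetical type for illustration only.
struct FixedStack {
    int[8] buf;
    size_t len;  // public on purpose: any @safe code may set it to anything

    void push(int v) @safe {
        buf[len] = v;  // ordinary indexing, bounds-checked in @safe code
        ++len;
    }

    // Defensible as @trusted: the unchecked access is guarded by a check
    // made inside this function, not by trust in the current value of len.
    int top() @trusted {
        if (len == 0 || len > buf.length)
            assert(0, "FixedStack: invalid length");
        return buf.ptr[len - 1];
    }
}

void main() @safe {
    FixedStack s;
    s.push(1);
    s.push(2);
    assert(s.top() == 2);
}
```

This works here because the storage is part of the struct, so `len` can be validated against `buf.length`. The SmallSlice case is harder: nothing inside opIndex can check that a raw `ptr` really has `_length` elements behind it.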
Oct 04 2019
Dennis <dkorpel gmail.com> writes:
On Friday, 4 October 2019 at 12:37:26 UTC, Jonathan M Davis wrote:
 However, if a member function is marked @trusted, then it's up 
 to the programmer to ensure that it's guaranteed to be @safe in 
 spite of whatever @system operations it's doing, and if that 
 function relies on the member variables being in a particular 
 state, and that state cannot actually be guaranteed, because 
 the member variables are public and could have been messed 
 with, then it's inappropriate to mark the function as @trusted.
I agree totally. I'm looking for ways to give @trusted code writers guarantees that certain variables haven't been messed with, because writing @trusted code that cannot rely on any state severely limits what is possible. With the recent push for @safe manual memory management [1] I'm expecting this to become an issue. (As mentioned before, how are you going to do @safe reference counting when you can't rely on the reference count to be correct?)

[1] https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in-d/
Oct 04 2019
Paul Backus <snarwin gmail.com> writes:
On Friday, 4 October 2019 at 11:55:11 UTC, Dennis wrote:
 I'm floating the idea that __traits(getMember) should be 
 @system when it's bypassing visibility constraints of the 
 member. What do you think?
A new post on the ACM SIGPLAN blog makes a convincing argument that correct enforcement of data visibility constraints is essential for allowing programmers to create safe interfaces using unsafe language features: https://blog.sigplan.org/2019/10/17/what-type-soundness-theorem-do-you-really-want-to-prove/
Oct 17 2019
Dennis <dkorpel gmail.com> writes:
On Thursday, 17 October 2019 at 18:53:29 UTC, Paul Backus wrote:
 https://blog.sigplan.org/2019/10/17/what-type-soundness-theorem-do-you-really-want-to-prove/
I will give that a read, thanks.
Oct 17 2019