digitalmars.D - Should modifying private members be system?

Dennis <dkorpel gmail.com> writes:

Imagine I am making my own slice type:

```
struct SmallSlice(T) {
     private T* ptr = null;
     private ushort _length = 0;
     ubyte[6] extraSpace;

     this(T[] slice) @safe {
         ptr = &slice[0];
         assert(slice.length <= ushort.max);
         _length = cast(ushort) slice.length;
     }

     T opIndex(size_t i) @trusted {
         return ptr[0.._length][i];
     }
}
```

The use of @trusted seems appropriate here, right? I'm doing
pointer slicing, but I know that the _length member is correct
for ptr since it came from a slice. The struct couldn't have been
void-initialized or overlapped in a union since that's not
allowed in @safe code when there are pointer members. Using {
initializers } is not allowed since I have a constructor, and the
members are private so they can't be modified outside of my
(thoroughly audited) module.

Except that last thing isn't true:

```
import std;
void main() @safe {
     int[4] arr = [10, 20, 30, 40];
     auto slice = SmallSlice!int(arr[]);
     __traits(getMember, slice, "_length") = 100; // change private member
     writeln(slice[9]); // out of bounds memory access
}
```

__traits(getMember) enables one to bypass private, so the
@trusted use was incorrect. This bypassing makes it really hard
to write @trusted code relying on the integrity of the struct.
How are you going to do @safe reference counting when the
reference count can be tampered with from @safe code outside the
control of the library writer?

I'm tossing the idea that __traits(getMember) should be @system
when it's bypassing visibility constraints of the member. What do
you think?

Oct 04 2019

ag0aep6g <anonymous example.com> writes:

On Friday, 4 October 2019 at 11:55:11 UTC, Dennis wrote:
 ```
 struct SmallSlice(T) {
     private T* ptr = null;
     private ushort _length = 0;
     ubyte[6] extraSpace;

     this(T[] slice) @safe {
         ptr = &slice[0];
         assert(slice.length <= ushort.max);
         _length = cast(ushort) slice.length;
     }

     T opIndex(size_t i) @trusted {
         return ptr[0.._length][i];
     }
 }
 ```

 The use of @trusted seems appropriate here, right?

No.

 I'm doing pointer slicing, but I know that the _length member 
 is correct for ptr since it came from a slice.

You don't know that in opIndex. In opIndex you just know that
there's a `T* ptr` and a `ushort _length`. opIndex doesn't know
where they came from or if they're compatible.

 The struct couldn't have been void-initialized or overlapped in
 a union since that's not allowed in @safe code when there are
 pointer members. Using { initializers } is not allowed since I
 have a constructor, and the members are private so they can't
 be modified outside of my (thoroughly audited) module.

You're not marking the module as @trusted. You're marking
opIndex. When you justify @trusted with something outside the
marked function, you're misusing it.

Unfortunately, breaking the rules like this is rather common. But
it's still breaking the rules. It's still undermining @safe.

[...]
 I'm tossing the idea that __traits(getMember) should be
 @system when it's bypassing visibility constraints of the
 member. What do you think?

My best idea so far (might not be originally mine) is to let the
@system attribute apply to variables. You'd mark `_length` as
@system and the effect would be that @safe functions would not be
allowed to access it.
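
A rough sketch of how that might look for the SmallSlice example. The enforcement is the assumed part: the code supposes a rule under which @safe code may not read or write a field marked @system. The `length` accessor is a made-up addition so that @safe callers still have a way to query the size.

```d
struct SmallSlice(T)
{
    // Assumption: @system on a field forbids direct access from @safe code,
    // including access through __traits(getMember) or .tupleof.
    private @system T* ptr = null;
    private @system ushort _length = 0;

    // The constructor writes the protected fields, so it can no longer be @safe.
    this(T[] slice) @trusted
    {
        assert(slice.length <= ushort.max);
        ptr = slice.ptr;
        _length = cast(ushort) slice.length;
    }

    // Slicing the pointer is only sound if no @safe code could have changed _length.
    T opIndex(size_t i) @trusted
    {
        return ptr[0 .. _length][i];
    }

    // Hypothetical read-only accessor for @safe callers.
    size_t length() const @trusted
    {
        return _length;
    }
}

void main() @safe
{
    int[4] arr = [10, 20, 30, 40];
    auto s = SmallSlice!int(arr[]);
    assert(s[2] == 30);
    assert(s.length == 4);
    // Under the assumed rule, this line would be rejected in @safe code:
    // __traits(getMember, s, "_length") = 100;
}
```

All remaining access to `ptr` and `_length` would then sit in @trusted or @system code, which is the code that has to be audited anyway.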

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 12:13:29 UTC, ag0aep6g wrote:
 My best idea so far (might not be originally mine) is to let
 the @system attribute apply to variables. You'd mark `_length`
 as @system and the effect would be that @safe functions would
 not be allowed to access it.

I like that idea a lot more! It wouldn't be arbitrary and code 
breaking.

 You're not marking the module as @trusted. You're marking
 opIndex. When you justify @trusted with something outside the
 marked function, you're misusing it.

In this case it's a member function. A common pattern is to put 
unsafe parts of a function in a block like this:
() @trusted {/*do something unsafe*/} ()

Such a lambda cannot be @trusted unconditionally, it's trusted
because it is known it can only be called in its context and not
outside the outer function's scope. My broader question is
whether the language can be designed so member functions can rely
on certain assumptions. Currently writing proper @trusted code is
extremely hard because it is unknown what you can work with.

- does const give any guarantees? (afaik, yes)
- does pure give any guarantees? (no, you can return 
`cast(size_t) (new int)`)
- does private give any guarantees? (currently not, but it could)

 Unfortunately, breaking the rules like this is rather common.
 But it's still breaking the rules. It's still undermining @safe.

So with @system members, can we give any guarantees of integrity
to member functions?

Oct 04 2019

ag0aep6g <anonymous example.com> writes:

On Friday, 4 October 2019 at 12:34:46 UTC, Dennis wrote:
 In this case it's a member function. A common pattern is to put 
 unsafe parts of a function in a block like this:
 () @trusted {/*do something unsafe*/} ()

 Such a lambda cannot be @trusted unconditionally, it's trusted
 because it is known it can only be called in its context and
 not outside the outer function's scope.

Being pedantic, if the safety of the nested function relies on
its surroundings, then it can't be @trusted. It may be a common
pattern, but it's undermining @safe anyway.
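
To make the distinction concrete, here is a minimal sketch. Both function names are made up for illustration. The first function uses the common lambda pattern, where the lambda is only correct because of a check in the enclosing function. The second keeps the check and the unsafe operation inside the same @trusted function, so its safety does not depend on anything outside it.

```d
// Common pattern: the @trusted lambda depends on its surroundings.
int firstOrZero(int[] arr) @safe
{
    if (arr.length == 0)
        return 0;
    // arr.ptr[0] is unchecked; it is only valid because of the test above,
    // which lives outside the lambda that carries the @trusted mark.
    return (() @trusted => arr.ptr[0])();
}

// Self-contained alternative: the @trusted function validates its own input.
int firstOrZeroChecked(int[] arr) @trusted
{
    return arr.length ? arr.ptr[0] : 0;
}

void main() @safe
{
    assert(firstOrZero([7, 8]) == 7);
    assert(firstOrZero(null) == 0);
    assert(firstOrZeroChecked([7, 8]) == 7);
    assert(firstOrZeroChecked(null) == 0);
}
```

Both behave the same here. The difference is only in what a reviewer has to read to convince themselves that the @trusted part is correct.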

 My broader question is whether the language can be designed so
 member functions can rely on certain assumptions. Currently
 writing proper @trusted code is extremely hard because it is
 unknown what you can work with.

Agreed that writing @trusted code is hard.

 - does const give any guarantees? (afaik, yes)

Yes. It guarantees that the value won't change through the const
reference. As far as I understand, you can rely on that in
@trusted code. You can also rely on `immutable` data not ever
changing (after creation).

The flipside is that you can't break those guarantees in
@trusted/@system code. The compiler will allow it, but you must
take care not to mutate const/immutable values. Same with other
guarantees.

 - does pure give any guarantees? (no, you can return 
 `cast(size_t) (new int)`)

Yeah, you can't rely on two identical calls giving the same 
result.

`pure` pretty much just means "doesn't touch globals, at least 
not visibly". So here:

----
int x;
void f(const int* a) pure;
void main() { f(&x); }
----

It's guaranteed that the call to f doesn't change x. Without 
`pure`, it might.

 - does private give any guarantees? (currently not, but it 
 could)

I don't think `private` guarantees anything with regards to 
safety. Even when `__traits(getMember, ...)` and `.tupleof` and 
other low-level mechanisms would respect `private`, you still 
couldn't rely on it for safety. You can't trust the current 
module/struct/class any more than other code.

[...]
 So with @system members, can we give any guarantees of
 integrity to member functions?

You'd have the guarantee that no @safe code can mess with the
variable.

You'd still have to make sure that all the @system and @trusted
code plays nice with each other, but that's just how it is with
non-@safe code.

That is, some other @system/@trusted function could set an
invalid _length in your struct. And then your opIndex would
violate safety. But that would just be a bug in that other
function then.

Oct 04 2019

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, October 4, 2019 5:55:11 AM MDT Dennis via Digitalmars-d wrote:
 Imagine I am making my own slice type:

 ```
 struct SmallSlice(T) {
      private T* ptr = null;
      private ushort _length = 0;
      ubyte[6] extraSpace;

      this(T[] slice) @safe {
          ptr = &slice[0];
          assert(slice.length <= ushort.max);
          _length = cast(ushort) slice.length;
      }

      T opIndex(size_t i) @trusted {
          return ptr[0.._length][i];
      }
 }
 ```

 The use of @trusted seems appropriate here, right? I'm doing
 pointer slicing, but I know that the _length member is correct
 for ptr since it came from a slice. The struct couldn't have been
 void-initialized or overlapped in a union since that's not
 allowed in @safe code when there are pointer members. Using {
 initializers } is not allowed since I have a constructor, and the
 members are private so they can't be modified outside of my
 (thoroughly audited) module.

 Except that last thing isn't true:

 ```
 import std;
 void main() @safe {
      int[4] arr = [10, 20, 30, 40];
      auto slice = SmallSlice!int(arr[]);
      __traits(getMember, slice, "_length") = 100; // change private member
      writeln(slice[9]); // out of bounds memory access
 }
 ```

 __traits(getMember) enables one to bypass private, so the
 @trusted use was incorrect. This bypassing makes it really hard
 to write @trusted code relying on the integrity of the struct.
 How are you going to do @safe reference counting when the
 reference count can be tampered with from @safe code outside the
 control of the library writer?

 I'm tossing the idea that __traits(getMember) should be @system
 when it's bypassing visibility constraints of the member. What do
 you think?

There isn't necessarily anything unsafe about accessing public member
variables. An @safe function isn't going to be doing anything @system
regardless of what's done to the member variables by other functions,
because if whether an operation is memory safe or not depends on the state
of a variable, then that operation is always considered @system, and
@trusted would have to be used to make it @safe.

However, if a member function is marked @trusted, then it's up to the
programmer to ensure that it's guaranteed to be @safe in spite of whatever
@system operations it's doing, and if that function relies on the member
variables being in a particular state, and that state cannot actually be
guaranteed, because the member variables are public and could have been
messed with, then it's inappropriate to mark the function as @trusted. It's
only appropriate to mark it as @trusted if the programmer can guarantee that
what it's doing is @safe even if the member variables were messed with.

So, in general, @safe isn't a problem with public member variables, but
@trusted is.

Though honestly, having public member variables is problematic in general.
It's okay for POD types, since they're just data, and in code that's not
publicly distributed, having public member variables isn't necessarily a
problem, because they can be made private with property functions if
necessary. However, because turning public member variables into property
functions breaks code, it's not the sort of thing that you can do when you
don't control all of the code that uses it. As such, I think that it's
rarely a good idea to use public member variables in public libraries,
though in private applications, it's not as big a deal, because then the
programmer or company control all of the code and can fix any issues that
pop up when APIs are changed in a way that breaks code.

- Jonathan M Davis

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 12:37:26 UTC, Jonathan M Davis wrote:
 However, if a member function is marked @trusted, then it's up
 to the programmer to ensure that it's guaranteed to be @safe in
 spite of whatever @system operations it's doing, and if that
 function relies on the member variables being in a particular
 state, and that state cannot actually be guaranteed, because
 the member variables are public and could have been messed
 with, then it's inappropriate to mark the function as @trusted.

I agree totally. I'm looking for ways to give @trusted code
writers guarantees that certain variables haven't been messed
with, because writing @trusted code without any access to state
is crippling the possibilities. With the recent push for @safe
manual memory management [1] I'm expecting this to become an
issue. (As mentioned before, how are you going to do safe
reference counting when you can't rely on the reference count to
be correct?)

[1] 
https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in-d/

Oct 04 2019

Paul Backus <snarwin gmail.com> writes:

On Friday, 4 October 2019 at 11:55:11 UTC, Dennis wrote:
 I'm tossing the idea that __traits(getMember) should be
 @system when it's bypassing visibility constraints of the
 member. What do you think?

A new post on the ACM SIGPLAN blog makes a convincing argument 
that correct enforcement of data visibility constraints is 
essential for allowing programmers to create safe interfaces 
using unsafe language features:

https://blog.sigplan.org/2019/10/17/what-type-soundness-theorem-do-you-really-want-to-prove/

Oct 17 2019

Dennis <dkorpel gmail.com> writes:

On Thursday, 17 October 2019 at 18:53:29 UTC, Paul Backus wrote:
 https://blog.sigplan.org/2019/10/17/what-type-soundness-theorem-do-you-really-want-to-prove/

I will give that a read, thanks.

Oct 17 2019
