www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - T.init, struct destructors and invariants - should they be called?

reply FeepingCreature <feepingcreature gmail.com> writes:
I believe there's a good case that struct invariants should not 
be called on struct destruction. Significantly, I believe that 
Phobos, in particular `moveEmplace`, is already written as if 
this is the case, even though it is not specified anywhere.

It is very common for structs' .init to violate invariants. 
Consider the humble struct S
{
     Object obj;
     invariant
     {
         assert(this.obj !is null);
     }
      disable this();
     this(Object obj)
     in(obj !is null)
     {
         this.obj = obj;
     }
}

S is obviously intended to be constructed with a nonzero 
this.obj. However, even in  safe code we may take S.init, which 
violates the invariant.

Consequently, this code currently asserts out:

     S s = S.init;

Why is this a problem? ("Just don't use S.init!")

Well, for one it makes Nullable!S impossible. Nullable, if it is 
to be  nogc, *necessarily* has to construct an S.init struct 
member. This leads us into trouble:

     Nullable!s ns = Nullable!s();

^ Asserts out again - there is no way for Nullable to avoid 
invoking the destructor of its internal S field.

Now, currently this is not a problem because Nullable refuses to 
compile in this case. However, such a basic language feature 
should *really* work for every type, especially given that such 
basic types as SysTime already set up Nullable's reasonable fears 
of invalid behavior.

There is a simple tweak to make this work:

Simply require that struct constructors be defined for T.init, 
even if T.init should violate invariants. This can be implemented 
by not checking the invariant if `this is T.init`, or simply 
disabling the invariant in the struct destructor entirely.

I picked the second option, but more because I didn't know how to 
do the first.

https://github.com/dlang/dlang.org/pull/2410

https://github.com/dlang/dmd/pull/8462

About `moveEmplace`: If `moveEmplace` detects that the struct 
type it is operating on has a user-defined destructor, it 
manually overwrites its source with `T.init`. This suggests that 
`moveEmplace` already operates on this logic. Hence: it's 
necessary, it's unavoidable, and it has precedent.

Opine?
Jul 06 2018
next sibling parent FeepingCreature <feepingcreature gmail.com> writes:
On Friday, 6 July 2018 at 10:44:09 UTC, FeepingCreature wrote:
 Consider the humble struct S
 {
     Object obj;
     invariant
     {
         assert(this.obj !is null);
     }
      disable this();
     this(Object obj)
     in(obj !is null)
     {
         this.obj = obj;
     }
 }
Oops - there should of course be a ~this() { } in there.
Jul 06 2018
prev sibling next sibling parent reply Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On Friday, 6 July 2018 at 10:44:09 UTC, FeepingCreature wrote:
 Why is this a problem? ("Just don't use S.init!")

 Well, for one it makes Nullable!S impossible. Nullable, if it 
 is to be  nogc, *necessarily* has to construct an S.init struct 
 member.
The rest looks sensible to me, but I have to say this is bollocks. This Nullable never has to construct an S.init: struct Nullable(T) { ubyte[T.sizeof] _payload; bool _hasValue; property trusted ref T value() { assert(_hasValue); return *cast(T*)_payload.ptr; } property trusted void value(T val) { cleanup(); _hasValue = true; _payload = *cast(typeof(_payload)*)&val; } property void value(typeof(null)) { cleanup(); _hasValue = false; } trusted private void cleanup() { if (!_hasValue) return; destroy(*cast(T*)_payload.ptr); _hasValue = false; } this(T val) { value = val; } ~this() { cleanup(); } this(typeof(null) val) { value = val; } void opAssign(T val) { value = val; } void opAssign(typeof(null) val) { value = val; } } struct S { Object obj; invariant { assert(this.obj !is null); } disable this(); safe this(Object obj) { this.obj = obj; } safe ~this() {} } safe unittest { Nullable!S a; // Look ma, no assert! } Not only do I think it's unnecessary for Nullable to declare a T field, I think it's a bug if it does. The destructor problem you point out is one of the key reasons for this. -- Simen
Jul 06 2018
parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Friday, 6 July 2018 at 12:10:58 UTC, Simen Kjærås wrote:
 The rest looks sensible to me, but I have to say this is 
 bollocks. This Nullable never has to construct an S.init:

 struct Nullable(T) {
     ubyte[T.sizeof] _payload;
     bool _hasValue;
Come on, at least make it a union with a void[]. This way will fail with any struct that requires a certain alignment. Not to mention that you can stick a class in it and it'll be garbage collected, because ubyte[] must not store pointers to GC memory. Which makes the point, really - this is an utterly blatant, unreliable, fragile hack.
Jul 06 2018
parent reply Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On Friday, 6 July 2018 at 12:31:50 UTC, FeepingCreature wrote:
 On Friday, 6 July 2018 at 12:10:58 UTC, Simen Kjærås wrote:
 The rest looks sensible to me, but I have to say this is 
 bollocks. This Nullable never has to construct an S.init:

 struct Nullable(T) {
     ubyte[T.sizeof] _payload;
     bool _hasValue;
Come on, at least make it a union with a void[]. This way will fail with any struct that requires a certain alignment. Not to mention that you can stick a class in it and it'll be garbage collected, because ubyte[] must not store pointers to GC memory. Which makes the point, really - this is an utterly blatant, unreliable, fragile hack.
The union brings back the exact problems we're trying to avoid. It also makes it impossible to handle types where hasElaborateAssign is true and this() is disabled at the same time. There's also the issue that at best it will do the wrong thing when you disable this() and init, but you probably shouldn't do that anyway. As for alignment, GC, and possibly other things, the code was not intended as a complete implementation of Nullable, only to show that an actual member of type T is not necessary. These issues are fixable, if perhaps nontrivial in some cases. -- Simen
Jul 06 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
On Friday, 6 July 2018 at 23:37:30 UTC, Simen Kjærås wrote:
 As for alignment, GC, and possibly other things, the code was 
 not intended as a complete implementation of Nullable, only to 
 show that an actual member of type T is not necessary. These 
 issues are fixable, if perhaps nontrivial in some cases.

 --
   Simen
Yeah, sorry - that was a snap answer; union is indeed not a solution, particularly since it doesn't even work for types with elaborate destructors. That said, I agree that this can be implemented, however imo the fact that we have to retrace the work of the compiler in laying out the data is a warning sign - imo, it happens because what we actually want is for the compiler to *work a different way*, which is why we find ourselves painstakingly recreating its interna in userland, except with different destructor semantics. This points at a flaw in the language, imo - if we so urgently need a way to express different semantics, we shouldn't have to painstakingly hide the type behind the compiler's back; the compiler should have our back in this instead of fighting us.
Jul 07 2018
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/7/18 11:06 PM, FeepingCreature wrote:
 On Friday, 6 July 2018 at 23:37:30 UTC, Simen Kjærås wrote:
 As for alignment, GC, and possibly other things, the code was not 
 intended as a complete implementation of Nullable, only to show that 
 an actual member of type T is not necessary. These issues are fixable, 
 if perhaps nontrivial in some cases.

 -- 
Yeah, sorry - that was a snap answer; union is indeed not a solution, particularly since it doesn't even work for types with elaborate destructors. That said, I agree that this can be implemented, however imo the fact that we have to retrace the work of the compiler in laying out the data is a warning sign - imo, it happens because what we actually want is for the compiler to *work a different way*, which is why we find ourselves painstakingly recreating its interna in userland, except with different destructor semantics. This points at a flaw in the language, imo - if we so urgently need a way to express different semantics, we shouldn't have to painstakingly hide the type behind the compiler's back; the compiler should have our back in this instead of fighting us.
Hm... this reminds me of the recent discussion around __symbols, and how they are treated specially. Could we reserve some subset of those symbols to say "treat these differently"? Or have an attribute that marks a member in such a way? I'm reminded of of the future storage class __mutable as well. -Steve
Jul 09 2018
parent FeepingCreature <feepingcreature gmail.com> writes:
On Tuesday, 10 July 2018 at 00:01:28 UTC, Steven Schveighoffer 
wrote:
 On 7/7/18 11:06 PM, FeepingCreature wrote:
 On Friday, 6 July 2018 at 23:37:30 UTC, Simen Kjærås wrote:
 As for alignment, GC, and possibly other things, the code was 
 not intended as a complete implementation of Nullable, only 
 to show that an actual member of type T is not necessary. 
 These issues are fixable, if perhaps nontrivial in some cases.

 --
Yeah, sorry - that was a snap answer; union is indeed not a solution, particularly since it doesn't even work for types with elaborate destructors. That said, I agree that this can be implemented, however imo the fact that we have to retrace the work of the compiler in laying out the data is a warning sign - imo, it happens because what we actually want is for the compiler to *work a different way*, which is why we find ourselves painstakingly recreating its interna in userland, except with different destructor semantics. This points at a flaw in the language, imo - if we so urgently need a way to express different semantics, we shouldn't have to painstakingly hide the type behind the compiler's back; the compiler should have our back in this instead of fighting us.
Hm... this reminds me of the recent discussion around __symbols, and how they are treated specially. Could we reserve some subset of those symbols to say "treat these differently"? Or have an attribute that marks a member in such a way? I'm reminded of of the future storage class __mutable as well. -Steve
That would also solve the problem. Note that I've recently realized that the format!"" weird-exception bug ( issue 19003 https://issues.dlang.org/show_bug.cgi?id=19003 ) is based on the same root assumption - Phobos blithely presumes that T.init is a valid instance of T that it can, say, call toString() on. Do you think "Structs with nontrivial Domain Data Considered Harmful" would make a good article name?
Jul 09 2018
prev sibling parent reply FeepingCreature <feepingcreature gmail.de> writes:
Ping: Resurrecting this thread because  thewilsonator resurrected 
my spec pr.

https://github.com/dlang/dlang.org/pull/2410
Nov 18 2018
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 18 November 2018 at 13:05:28 UTC, FeepingCreature 
wrote:
 Ping: Resurrecting this thread because  thewilsonator 
 resurrected my spec pr.

 https://github.com/dlang/dlang.org/pull/2410
Rebuking Simen's response then :) On Friday, 6 July 2018 at 23:37:30 UTC, Simen Kjærås wrote:
 The union brings back the exact problems we're trying to avoid. 
 It also makes it impossible to handle types where 
 hasElaborateAssign is true and this() is  disabled at the same 
 time. There's also the issue that at best it will do the wrong 
 thing when you  disable this() and init, but you probably 
 shouldn't do that anyway.
https://issues.dlang.org/show_bug.cgi?id=19321 https://github.com/dlang/dmd/pull/5830 Unions have been "fixed" for some time now. Use them, no need for hacks with ubyte. struct Nullable(T) { private union U { T value = T.init; } private U _u; private property ref _payload() inout { return _u.value; } private bool _hasValue; property trusted ref T value() { assert(_hasValue); return _payload; } property trusted void value(T val) { cleanup(); _hasValue = true; _payload = val; } property void value(typeof(null)) { cleanup(); _hasValue = false; } trusted private void cleanup() { if (!_hasValue) return; destroy(_payload); _hasValue = false; } this(T val) { value = val; } ~this() { cleanup(); } this(typeof(null) val) { value = val; } void opAssign(T val) { value = val; } void opAssign(typeof(null) val) { value = val; } } struct S { Object obj; invariant { assert(this.obj !is null); } disable this(); safe this(Object obj) { this.obj = obj; } safe void opAssign(S) { /* ... */ } safe ~this() {} } safe unittest { Nullable!S a; // Look ma, no assert! }
Nov 18 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav Blinov 
wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
Nov 18 2018
next sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 18 November 2018 at 14:47:51 UTC, FeepingCreature 
wrote:
 On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav Blinov 
 wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
That has to do with poor implementation of that example Nullable, not the union. opAssign should check for _hasValue.
Nov 18 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
On Sunday, 18 November 2018 at 14:54:05 UTC, Stanislav Blinov 
wrote:
 On Sunday, 18 November 2018 at 14:47:51 UTC, FeepingCreature 
 wrote:
 On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav Blinov 
 wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
That has to do with poor implementation of that example Nullable, not the union. opAssign should check for _hasValue.
Right, which means you end up with moveEmplace in opAssign, which is the current Nullable implementation. Which only works because union{} essentially functions as a semi-official backdoor in the typesystem, even in safe. https://run.dlang.io/is/YAnVBV Is that *really* good language design, though?
Nov 18 2018
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 18 November 2018 at 15:15:11 UTC, FeepingCreature 
wrote:
 On Sunday, 18 November 2018 at 14:54:05 UTC, Stanislav Blinov 
 wrote:
 On Sunday, 18 November 2018 at 14:47:51 UTC, FeepingCreature 
 wrote:
 On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav Blinov 
 wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
That has to do with poor implementation of that example Nullable, not the union. opAssign should check for _hasValue.
Right, which means you end up with moveEmplace in opAssign,
No you don't :) (I know, I know, I'm such a negative personality): // still rudimentary, no checks for hasElaborateDestructor struct Nullable(T) { this(T val) { value = val; } this(typeof(null) val) {} ~this() { cleanup(); } property trusted ref T value() { assert(_hasValue); return _u.value; } property trusted void value(T val) { // when it's rebinding, there's no need to move or moveEmplace, // just overwrite the union if (_hasValue) { // this, or it could actually assign to _u.value, depending on the desired semantics of Nullable destroy(_u.value); _u = U(val); } else { _u = U(val); _hasValue = true; } } property void value(typeof(null)) { cleanup(); } void opAssign(T val) { // arguably this should just duplicate what `value` does, // to avoid unnecessary copies passed around. value = val; } void opAssign(typeof(null) val) { value = val; } private: union U { T value = T.init; } U _u; property ref _payload() inout { return _u.value; } bool _hasValue; void cleanup() { if (!_hasValue) return; destroy(_u.value); _hasValue = false; } }
 which is the current Nullable implementation.
Looking at that implementation, ouch... Maybe I'm missing something?..
 Which only works because union{} essentially functions as a 
 semi-official backdoor in the typesystem, even in  safe. Is 
 that *really* good language design, though?
That? Yes. Unions are actually useful now, unlike what they were before. Anyway, my point is that unions *are* the tool for that particular job, just like you said initially. Which has little to do with the actual topic :) For example, I'd argue that the *actual* implementation *must* `move` it's passed-by-value argument (regardless of what it uses for storage), because the caller already made a required copy. But that means wiping out the argument back to T.init, and then we're back to square one.
Nov 18 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
On Sunday, 18 November 2018 at 15:45:51 UTC, Stanislav Blinov 
wrote:
 On Sunday, 18 November 2018 at 15:15:11 UTC, FeepingCreature 
 wrote:
 On Sunday, 18 November 2018 at 14:54:05 UTC, Stanislav Blinov 
 wrote:
 On Sunday, 18 November 2018 at 14:47:51 UTC, FeepingCreature 
 wrote:
 On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav 
 Blinov wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
That has to do with poor implementation of that example Nullable, not the union. opAssign should check for _hasValue.
Right, which means you end up with moveEmplace in opAssign,
No you don't :) (I know, I know, I'm such a negative personality): // still rudimentary, no checks for hasElaborateDestructor struct Nullable(T) { this(T val) { value = val; } this(typeof(null) val) {} ~this() { cleanup(); } property trusted ref T value() { assert(_hasValue); return _u.value; } property trusted void value(T val) { // when it's rebinding, there's no need to move or moveEmplace, // just overwrite the union if (_hasValue) { // this, or it could actually assign to _u.value, depending on the desired semantics of Nullable destroy(_u.value); _u = U(val); } else { _u = U(val); _hasValue = true; } } property void value(typeof(null)) { cleanup(); } void opAssign(T val) { // arguably this should just duplicate what `value` does, // to avoid unnecessary copies passed around. value = val; }
This will not work if your type has an immutable field, by the by.
     void opAssign(typeof(null) val) { value = val; }

 private:
     union U { T value = T.init; }
     U _u;
      property ref _payload() inout { return _u.value; }
     bool _hasValue;

     void cleanup() {
         if (!_hasValue) return;
         destroy(_u.value);
         _hasValue = false;
     }
 }


 which is the current Nullable implementation.
Looking at that implementation, ouch... Maybe I'm missing something?..
 Anyway, my point is that unions *are* the tool for that 
 particular job, just like you said initially. Which has little 
 to do with the actual topic :)
 For example, I'd argue that the *actual* implementation *must* 
 `move` it's passed-by-value argument (regardless of what it 
 uses for storage), because the caller already made a required 
 copy. But that means wiping out the argument back to T.init, 
 and then we're back to square one.
Right, which is why we make a *second* copy, store it in a Union, and moveEmplace that.
 Which only works because union{} essentially functions as a 
 semi-official backdoor in the typesystem, even in  safe. Is 
 that *really* good language design, though?
That? Yes. Unions are actually useful now, unlike what they were before.
Yeah they're useful in that they're helping us to get the language to defeat itself. A destructor is something that is run when an expression goes out of scope. Except if that expression is stored in a union, because a union is a Destructor Blocker™. No it's not! A union is a way to make multiple expressions occupy the same area of memory. Its definition has *nothing* to do with destructor blocking. The only reason that unions are destructor blockers are that they *can't* logically support destructors, and for some ~magical reason~ dlang has decided that it's safe™ to let us store values with destructors in union fields anyway, basically for no reason other than that we decided that safe was *too* safe and we wanted it to be less safe because if it was as safe as it claimed it was, it'd be inconvenient. But to me, that indicates that safe is broken, and I really don't believe in breaking an unrelated feature in order that I can coincidentally unbreak the original brokenness manually. And no, just because I put a fancy term on the brokenness doesn't make it any less broken. Unions let me take a safe expression whose copy constructor has ran and avoid calling its destructor. This is useful because safe would otherwise require me to run a destructor on some expressions whose constructor has never run. But two wrong designs don't make a correct design. Just because the building is on fire doesn't validate the decision to leave a giant jagged hole in the front wall.
Nov 18 2018
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 18 November 2018 at 19:33:44 UTC, FeepingCreature 
wrote:
 On Sunday, 18 November 2018 at 15:45:51 UTC, Stanislav Blinov 
 wrote:
 Anyway, my point is that unions *are* the tool for that 
 particular job, just like you said initially. Which has little 
 to do with the actual topic :)
 For example, I'd argue that the *actual* implementation *must* 
 `move` it's passed-by-value argument (regardless of what it 
 uses for storage), because the caller already made a required 
 copy. But that means wiping out the argument back to T.init, 
 and then we're back to square one.
Right, which is why we make a *second* copy, store it in a Union, and moveEmplace that.
You misunderstood what I said, and as you saw in my last example you don't need a moveEmplace for the current implementation. What I'm saying is, the actual implementation *should* be doing this: void opAssign(T rhs) { // ... move(rhs, myOwnStorage); // ... } T, for all I know, could be non-copyable, i.e. a Unique. That way the onus of making an instance falls solely on the caller, while Nullalbe would never call or deal with any copy constructors. Recall that D "frowns upon" self-referencing types. Although the language can't statically disallow them, it's free to assume they don't exist. Therefore, moving values around without calling their copy ctors should be acceptable. That's not a `union` problem. If you do it like this where T is your S, you'll get an assert due to invariant. I.e. that is the subject problem :)
 ...But to me, that indicates that  safe is broken, and I really 
 don't believe in breaking an unrelated feature in order that I 
 can coincidentally unbreak the original brokenness manually. 
 And no, just because I put a fancy term on the brokenness 
 doesn't make it any less broken. Unions let me take a  safe 
 expression whose copy constructor has ran and avoid calling its 
 destructor. This is useful because  safe would otherwise 
 require me to run a destructor on some expressions whose 
 constructor has never run. But two wrong designs don't make a 
 correct design. Just because the building is on fire doesn't 
 validate the decision to leave a giant jagged hole in the front 
 wall.
Again, that is not a `union` problem, that's a destructor+invariant problem. Types in .init state should be destructible, period: S[] a; // can't do this: a = new S[10]; // but still can do this: a.length = 10;
Nov 18 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
On Monday, 19 November 2018 at 01:46:34 UTC, Stanislav Blinov 
wrote:
 Again, that is not a `union` problem, that's a 
 destructor+invariant problem. Types in .init state should be 
 destructible, period:

 S[] a;
 // can't do this:
 a = new S[10];
 // but still can do this:
 a.length = 10;
Fair enough, I agree with that, it's just been a bit of an uphill struggle to get people to agree that requiring T.init to pass the invariants makes struct invariants mostly useless. If you can get them to agree to a solution that doesn't nerf struct invariants into the ground, then be my guest - there's a DMD PR that could be resurrected, https://github.com/dlang/dmd/pull/8462 , or a better solution found. I just really don't want to have to take our codebase back to classes for domain values, or comment out all the invariants we painstakingly added.
Nov 18 2018
parent reply FeepingCreature <feepingcreature gmail.de> writes:
Ping.
Nov 25 2018
parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 25 November 2018 at 16:05:11 UTC, FeepingCreature 
wrote:
 Ping.
Pong on GH.
Nov 25 2018
prev sibling parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Sunday, 18 November 2018 at 14:47:51 UTC, FeepingCreature 
wrote:
 On Sunday, 18 November 2018 at 14:38:09 UTC, Stanislav Blinov 
 wrote:
  safe unittest {
     Nullable!S a; // Look ma, no assert!
 }
a = S(new Object); // Look pa, assert!
...i.e. it should be: // ... property trusted void value(T val) { if (!_hasValue) { import std.conv : emplace; auto addr = () trusted { return &_payload(); } (); emplace(addr, val); _hasValue = true; } else _payload = val; } // ... since otherwise that calls opAssign. Unless you'd propose to not call invariants for that one too :*)
Nov 18 2018