digitalmars.D - Enhancements can enable memory-safe reference counting
- tsbockman (197/197) May 13 2021 I have written an experimental D reference counting system with a
- jmh530 (4/6) May 23 2021 It's a bit of a shame that there aren't any other comments here
- Gavin Ray (7/14) May 25 2021 Is there anyone smarter than myself that might be willing to
- tsbockman (11/16) May 25 2021 What would you like explained? Or alternatively, which parts of
- Imperatorn (2/6) May 23 2021 This is interesting. I totally missed this post earlier ☀️
- vitoroak (29/37) May 26 2021 Every time I tried to do something similar in D I stumbled across
- tsbockman (14/39) May 26 2021 Yes, that is a problem.
- Paul Backus (20/26) May 26 2021 In theory, these examples are fine, since they result in a null
- tsbockman (8/18) May 26 2021 No. That's what I thought at first, too, but if you walk through
- Paul Backus (5/17) May 26 2021 Thanks, I see the problem now.
- vitoroak (6/25) May 26 2021 Yeah, my point is that if there's any way to make this @safe. The
- tsbockman (16/19) May 26 2021 Although it's not shown in my earlier code example, I did
- vitoroak (23/42) May 27 2021 I saw you mentioning breaking things in @safe code. This example
- tsbockman (6/12) Nov 20 2021 I reduced this once and filed an issue for it, since it is the
- Imperatorn (3/15) Nov 20 2021 Nice, this one should not be that hard to fix.
I have written an experimental D reference counting system with a memory ` safe` API. It supports slices, classes, dynamic casts, and `-betterC`. I found this task just barely possible today, with DIP25 / DIP1000 enabled (ignoring some bugs like [20150 "-dip1000 defeated by pure"](https://issues.dlang.org/show_bug.cgi?id=20150)). But, some of the techniques I used to do it are rather nasty hacks, and I'm sure no one would want my solution in the standard library. However, there are some fairly simple language enhancements that would make it possible to remove most of the hacks: The most important changes relate to `scope` and `return` (as in DIP25's `return ref`): (0) `return` should apply consistently to all indirections, especially `ref`, `*`, `[]`, and `class`, since these all use pointers under the hood. (1) When calling a `return` annotated function, assigning the returned indirection should be considered valid or invalid based on whether the receiving indirection provably has a lifetime that is fully contained inside the lifetime of the `return` annotated input indirection(s) for the function. (2) `foreach` and `foreach_reverse` should support `scope`. Basically, annotating a function with `return` becomes a way to force callers to treat the return value as head `scope`, except when this can be proven unnecessary based on the lifetime of the relevant input(s). A small example program (my real system is too large to embed in this message): ```D module app; import core.stdc.stdlib : malloc, free; import core.lifetime : emplace; import std.traits; struct Unique(_Address) if(is(Unqual!_Address : Target*, Target) || is(_Address == class)) { alias Address = _Address; private Address _address; Address address() return pure safe { return _address; } alias address this; static if(is(Unqual!Address : Target*, Target)) { alias Access = Target; ref Access access() return pure safe { return *_address; } alias Slice = Access[]; Access[] slice() return pure trusted { return _address[0 .. (_address !is null)]; } } else { static assert(is(Address == class)); alias Access = Address; Access access() return pure safe { return _address; } alias Slice = const(Access)[]; Slice slice() return const pure trusted { return (*cast(Access[1]*) &_address) [0 .. (_address !is null)]; } } alias opUnary(string op : `*`) = access; alias opIndex() = slice; this(const(bool) value) trusted { if(value) { static if(is(Access == Address)) { _address = cast(Address) malloc(__traits(classInstanceSize, Access)); } else { _address = cast(Address) malloc(Access.sizeof); } emplace(_address); } else _address = null; } disable ref typeof(this) opAssign(ref typeof(this)); disable this(this); disable this(ref typeof(this)); ~this() trusted { if(_address !is null) { destroy!false(access); free(cast(void*) _address); _address = null; } } } void test(Address)() safe { Unique!Address up = true; with(up) { static a = Address.init, c = Address.init, e = Slice.init; scope b = Address.init, d = Address.init, f = Slice.init; static if( __traits(compiles, a = up.address)) { pragma(msg, "a: ACCEPTS INVALID: static " ~ Address.stringof ~ " = return " ~ Address.stringof); } static if(!__traits(compiles, b = up.address)) { pragma(msg, "d: REJECTS VALID: scope " ~ Address.stringof ~ " = return " ~ Address.stringof); } static if(!is(Access == Address)) { static if( __traits(compiles, c = &(up.access))) { pragma(msg, "b: ACCEPTS INVALID: static " ~ Address.stringof ~ " = &(return ref " ~ Access.stringof ~ ")"); } static if(!__traits(compiles, d = &(up.access))) { pragma(msg, "e: REJECTS VALID: scope " ~ Address.stringof ~ " = &(return ref " ~ Access.stringof ~ ")"); } } static if( __traits(compiles, e = up.slice)) { pragma(msg, "c: ACCEPTS INVALID: static " ~ Slice.stringof ~ " = return " ~ Slice.stringof); } static if(!__traits(compiles, f = up.slice)) { pragma(msg, "f: REJECTS VALID: scope " ~ Slice.stringof ~ " = return " ~ Slice.stringof); } static if(!is(Access == Address)) { static Access g, i, k; scope Access h, j, l; static if(!__traits(compiles, g = up.access)) { pragma(msg, "g: REJECTS VALID: static " ~ Access.stringof ~ " = return ref " ~ Access.stringof); } static if(!__traits(compiles, h = up.access)) { pragma(msg, "j: REJECTS VALID: scope " ~ Access.stringof ~ " = return ref " ~ Access.stringof); } static if(!__traits(compiles, i = *(up.address))) { pragma(msg, "h: REJECTS VALID: static " ~ Access.stringof ~ " = *(return " ~ Address.stringof ~ ")"); } static if(!__traits(compiles, j = *(up.address))) { pragma(msg, "k: REJECTS VALID: scope " ~ Access.stringof ~ " = *(return " ~ Address.stringof ~ ")"); } static if(!__traits(compiles, k = up.slice[0])) { pragma(msg, "i: REJECTS VALID: static " ~ Access.stringof ~ " = (return " ~ Slice.stringof ~ ")[0]"); } static if(!__traits(compiles, l = up.slice[0])) { pragma(msg, "l: REJECTS VALID: scope " ~ Access.stringof ~ " = (return " ~ Slice.stringof ~ ")[0]"); } } } } class D { } void main() safe { test!(int**)(); test!D(); } ``` Output with `-dip1000`: ``` a: ACCEPTS INVALID: static int** = return int** e: REJECTS VALID: scope int** = &(return ref int*) c: ACCEPTS INVALID: static int*[] = return int*[] a: ACCEPTS INVALID: static D = return D ``` (`-dip1000` only prevented this one error, although I believe it does do some other good things that are not tested above): ``` c: ACCEPTS INVALID: static const(D)[] = return const(D)[] ``` Currently, in order to have a truly ` safe` API I must work around the above issues by marking various things ` system` that shouldn't need to be ` system`, and then offering awkward but safe borrowing with something like this: ```D mixin template borrow(alias owner, string name) { mixin(`scope `, name, ` = () trusted { pragma(inline, true); return owner.address; }();`); } ``` With my earlier proposed changes to `return` and `scope`, though, the `borrow` mixin would be unnecessary. I think D is very close to being able to sanely express ` safe` reference counting APIs. I don't think ` live` is necessary; rather, we just need to complete `scope` and `return` and fix some RAII related bugs. For performance reasons, move operators and some minor changes to the GC would also be good, but are not actually required. Destroy?
May 13 2021
On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:[snip] Destroy?It's a bit of a shame that there aren't any other comments here because it seemed interesting, but I didn't have the time to really dig into this to understand everything going on.
May 23 2021
On Sunday, 23 May 2021 at 13:03:58 UTC, jmh530 wrote:On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:Is there anyone smarter than myself that might be willing to explain/comment the code and what's going on here? Would be much appreciated =D I think I understand that it's some sort of ownership/refcounting system that's meant to provide scope-based safe resource usage but I am not sure I can appreciate the brilliance to it's fullest.[snip] Destroy?It's a bit of a shame that there aren't any other comments here because it seemed interesting, but I didn't have the time to really dig into this to understand everything going on.
May 25 2021
On Tuesday, 25 May 2021 at 18:40:52 UTC, Gavin Ray wrote:explain/comment the code and what's going on here? Would be much appreciated =DWhat would you like explained? Or alternatively, which parts of the code *do* make sense to you? A full explanation without making any assumptions about your background and experience, other than a minimal working knowledge of D, would be multiple pages long.I think I understand that it's some sort of ownership/refcounting system that's meant to provide scope-based safe resource usageThat is the goal of the full system, yes. The code in my original post isn't intended for practical use, though. It's just a demonstration of some bugs or missing features in the current implementation of `return`, together with a small demonstration of the practical value of fixing them.
May 25 2021
On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:I have written an experimental D reference counting system with a memory ` safe` API. It supports slices, classes, dynamic casts, and `-betterC`. [...]This is interesting. I totally missed this post earlier ☀️
May 23 2021
On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:[snip] I think D is very close to being able to sanely express ` safe` reference counting APIs. I don't think ` live` is necessary; rather, we just need to complete `scope` and `return` and fix some RAII related bugs. For performance reasons, move operators and some minor changes to the GC would also be good, but are not actually required. Destroy?Every time I tried to do something similar in D I stumbled across the same problems and as far as I know it's not possible to implement it completely safe today. I think one of the problems is that you can manually destroy/move any struct while there are still references/pointers to it or its internals like in the example below (I used your borrow mixin template). ```d void receiveByValue(Unique!(int*) u) safe { } void main() safe { import std.stdio: writeln; auto u1 = Unique!(int*)(true); mixin borrow!(u1, "x1"); writeln(*x1); // ok destroy(u1); writeln(*x1); // should not be possible import core.lifetime: move; auto u2 = Unique!(int*)(true); mixin borrow!(u2, "x2"); writeln(*x2); // ok receiveByValue(move(u2)); writeln(*x2); // should not be possible } ``` I don't know how this could be solved but for me it's a blocker to do a safe Unique or RC type. Maybe if I always return an RCRef or something like this but I think the overhead would be too big.
May 26 2021
On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote:Every time I tried to do something similar in D I stumbled across the same problems and as far as I know it's not possible to implement it completely safe today. I think one of the problems is that you can manually destroy/move any struct while there are still references/pointers to it or its internals like in the example below (I used your borrow mixin template). ```d void receiveByValue(Unique!(int*) u) safe { } void main() safe { import std.stdio: writeln; auto u1 = Unique!(int*)(true); mixin borrow!(u1, "x1"); writeln(*x1); // ok destroy(u1); writeln(*x1); // should not be possibleYes, that is a problem. Manually calling `destroy` or `__dtor` really should be an ` system` operation, regardless of the attributes of `__dtor` itself. The whole point of destructors is to ensure that cleanup work is performed at the correct point, and potentially subverting that should not be considered ` safe`.```d import core.lifetime: move; auto u2 = Unique!(int*)(true); mixin borrow!(u2, "x2"); writeln(*x2); // ok receiveByValue(move(u2)); writeln(*x2); // should not be possible } ```That second test, with `move`, actually doesn't compile (although I'm not sure why): ``` onlineapp.d(150): Error: safe function D main cannot call system function core.lifetime.move!(Unique!(int*)).move /dlang/dmd-nightly/linux/bin64/../../src/druntime/import/core/lifetime.d(1587): core.lifetime.move!(Unique!(int*)).move is declared here ```
May 26 2021
On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote:Every time I tried to do something similar in D I stumbled across the same problems and as far as I know it's not possible to implement it completely safe today. I think one of the problems is that you can manually destroy/move any struct while there are still references/pointers to it or its internals like in the example below (I used your borrow mixin template).In theory, these examples are fine, since they result in a null dereference, which is guaranteed by [the language spec][1] to be memory-safe (i.e., to immediately crash the program). In practice, this is *usually* what will happen, but neither DMD, LDC, nor GDC actually *guarantees* an immediate crash upon null dereference in all cases. In particular, a null dereference with a large enough offset (e.g., a struct or class member access through a null pointer) can in principle cause memory corruption at runtime by accessing an address beyond the protected pages at the start of the address space. You can work around this by adding an explicit null check: pure safe ref Access access() return { // assert(0) is not compiled out in release mode if (_address !is null) assert(0); return *_address; } [1]: https://dlang.org/spec/function.html#safe-values
May 26 2021
On Wednesday, 26 May 2021 at 21:48:40 UTC, Paul Backus wrote:On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote:No. That's what I thought at first, too, but if you walk through the code more carefully you will see that `x1` never gets set to `null`, and still points to the old target of `u1`. So, he is correct. I've opened [issue https://issues.dlang.org/show_bug.cgi?id=21981) requesting a fix.Every time I tried to do something similar in D I stumbled across the same problems and as far as I know it's not possible to implement it completely safe today. I think one of the problems is that you can manually destroy/move any struct while there are still references/pointers to it or its internals like in the example below (I used your borrow mixin template).In theory, these examples are fine, since they result in a null dereference,
May 26 2021
On Wednesday, 26 May 2021 at 22:06:27 UTC, tsbockman wrote:On Wednesday, 26 May 2021 at 21:48:40 UTC, Paul Backus wrote:Thanks, I see the problem now. I guess the conclusion we're forced to come to is that, given current language rules, it is incorrect to mark the destructor as ` trusted`.On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote: In theory, these examples are fine, since they result in a null dereference,No. That's what I thought at first, too, but if you walk through the code more carefully you will see that `x1` never gets set to `null`, and still points to the old target of `u1`. So, he is correct. I've opened [issue https://issues.dlang.org/show_bug.cgi?id=21981) requesting a fix.
May 26 2021
On Thursday, 27 May 2021 at 00:05:24 UTC, Paul Backus wrote:On Wednesday, 26 May 2021 at 22:06:27 UTC, tsbockman wrote:Yeah, my point is that if there's any way to make this safe. The same happens for a Vector implementation where you can call push (that can reallocate) while having a reference to an element. I also don't know a simple way to solve this problem but I think it's important if we want to sell D to have nogc data structures.On Wednesday, 26 May 2021 at 21:48:40 UTC, Paul Backus wrote:Thanks, I see the problem now. I guess the conclusion we're forced to come to is that, given current language rules, it is incorrect to mark the destructor as ` trusted`.On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote: In theory, these examples are fine, since they result in a null dereference,No. That's what I thought at first, too, but if you walk through the code more carefully you will see that `x1` never gets set to `null`, and still points to the old target of `u1`. So, he is correct. I've opened [issue https://issues.dlang.org/show_bug.cgi?id=21981) requesting a fix.
May 26 2021
On Thursday, 27 May 2021 at 00:20:44 UTC, vitoroak wrote:The same happens for a Vector implementation where you can call push (that can reallocate) while having a reference to an element.Although it's not shown in my earlier code example, I did recognize the problem of reassignment/reallocation potentially invalidating extant borrowed references in my full system. (But, I missed the problem of manual destructor calls.) My solution for reassignment/reallocation was simply to make `opAssign`, unique move operations, etc. ` system`. Of course this is a significant limitation, but I think there is still a lot of value in a reference counting system if the rest of its operations can be ` trusted`. But, if the destructor has to be ` system` too, then it is no longer possible to use the reference counting system for any practical purpose in ` safe` code. For myself, I will just cheat and leave the destructor ` trusted` anyway, because why would I manually call it in ` safe` code to begin with? But, we'll need a better answer than that for the standard library.
May 26 2021
On Wednesday, 26 May 2021 at 22:06:27 UTC, tsbockman wrote:On Wednesday, 26 May 2021 at 21:48:40 UTC, Paul Backus wrote:I saw you mentioning breaking things in safe code. This example let you access an invalid pointer without no trusted code and heap allocation, only safe code. ```d struct IntRef { int* ptr = void; this(return scope int* p) safe { ptr = p; } int* borrow() return scope safe { return ptr; } } void main() safe { import std.stdio: writeln; auto x = 1; auto r = IntRef(&x); writeln(*r.borrow); destroy!true(r); writeln(*r.borrow); } ```On Wednesday, 26 May 2021 at 18:53:21 UTC, vitoroak wrote:No. That's what I thought at first, too, but if you walk through the code more carefully you will see that `x1` never gets set to `null`, and still points to the old target of `u1`. So, he is correct. I've opened [issue https://issues.dlang.org/show_bug.cgi?id=21981) requesting a fix.Every time I tried to do something similar in D I stumbled across the same problems and as far as I know it's not possible to implement it completely safe today. I think one of the problems is that you can manually destroy/move any struct while there are still references/pointers to it or its internals like in the example below (I used your borrow mixin template).In theory, these examples are fine, since they result in a null dereference,
May 27 2021
On Thursday, 27 May 2021 at 20:47:44 UTC, vitoroak wrote:I saw you mentioning breaking things in safe code. This example let you access an invalid pointer without no trusted code and heap allocation, only safe code. ```d struct IntRef { int* ptr = void; ````void` initializing `IntRef.ptr` does not create a gap in `IntRef.init`. `IntRef.init.ptr` is effectively still `null`.```d this(return scope int* p) safe { ptr = p; } int* borrow() return scope safe { return ptr; } } void main() safe { import std.stdio: writeln; auto x = 1; auto r = IntRef(&x); writeln(*r.borrow); destroy!true(r); ````destroy!true` sets `r` to `IntRef.init`, which sets `r.ptr` to `null`.```d writeln(*r.borrow); } ```As Paul Backus said earlier, dereferencing a `null` pointer is formally considered to be memory-safe in D. This is because it will (with some rare exceptions) crash the program immediately, rather than corrupting memory and continuing execution with undefined behavior.
May 27 2021
On Thursday, 27 May 2021 at 22:13:30 UTC, tsbockman wrote:As Paul Backus said earlier, dereferencing a `null` pointer is formally considered to be memory-safe in D. This is because it will (with some rare exceptions) crash the program immediately, rather than corrupting memory and continuing execution with undefined behavior.That's "memory-safe" in any language in that case because that's a function of the operating system rather than the language. However, there are exception like if you are dereferencing a null pointer + offset and the offset is large, then you can corrupt memory. This is more rare though. A program crash is the best bug you can have. A core dump can be saved and be investigated.
May 27 2021
On Thursday, 27 May 2021 at 22:34:53 UTC, IGotD- wrote:On Thursday, 27 May 2021 at 22:13:30 UTC, tsbockman wrote:True. I'm neither defending nor criticizing D's definition of "memory safe" here. My goal is to achieve a similar level of safety and convenience with RC and borrowing to what D currently considers ` safe` with GC. ` safe` doesn't try to prevent `null` dereferences in GC code, so it shouldn't be a requirement for RC code, either.As Paul Backus said earlier, dereferencing a `null` pointer is formally considered to be memory-safe in D. This is because it will (with some rare exceptions) crash the program immediately, rather than corrupting memory and continuing execution with undefined behavior.That's "memory-safe" in any language in that case because that's a function of the operating system rather than the language. However, there are exception like if you are dereferencing a null pointer + offset and the offset is large, then you can corrupt memory. This is more rare though.
May 27 2021
On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:```D class D { } ``` ``` a: ACCEPTS INVALID: static D = return D ```I reduced this once and filed an issue for it, since it is the worst of the bunch: "Template breaks return annotation for class reference returned by struct method" https://issues.dlang.org/show_bug.cgi?id=22528
Nov 20 2021
On Saturday, 20 November 2021 at 08:08:44 UTC, tsbockman wrote:On Friday, 14 May 2021 at 00:45:09 UTC, tsbockman wrote:Nice, this one should not be that hard to fix. Does someone have time to look at it?```D class D { } ``` ``` a: ACCEPTS INVALID: static D = return D ```I reduced this once and filed an issue for it, since it is the worst of the bunch: "Template breaks return annotation for class reference returned by struct method" https://issues.dlang.org/show_bug.cgi?id=22528
Nov 20 2021
On Saturday, 20 November 2021 at 08:58:15 UTC, Imperatorn wrote:On Saturday, 20 November 2021 at 08:08:44 UTC, tsbockman wrote:Dennis [partially explained the cause of the problem](https://issues.dlang.org/show_bug.cgi?id=22528#c1), and suggested a workaround. After some experimentation, I found a more general form of the workaround that eliminates all "ACCEPTS INVALID" errors in my earlier test code. Including this line at the beginning of each offending `return` member function will fix them: ```D workaround. ``` With this change to `Unique`, all of the tests from my first post in this thread now pass except for `e: REJECTS VALID: scope int** = &(return ref int*)`. As a result, the `borrow` `mixin template` is no longer needed at all. **Progress!** The hackish, possibly fragile nature of the workaround, and the [destructor issue](https://issues.dlang.org/show_bug.cgi?id=21981) still make my solution unsuitable for the standard library for now, though."Template breaks return annotation for class reference returned by struct method" https://issues.dlang.org/show_bug.cgi?id=22528Nice, this one should not be that hard to fix. Does someone have time to look at it?
Nov 25 2021
On Friday, 26 November 2021 at 04:37:16 UTC, tsbockman wrote:The hackish, possibly fragile nature of the workaround, and the [destructor issue](https://issues.dlang.org/show_bug.cgi?id=21981) still make my solution unsuitable for the standard library for now, though.I have a runtime solution for borrow check, share it here. ```d struct Obj { size_t version; size_t counter; } struct Ptr { Obj* obj; size_t ver; alias this self; property ref self() return scope { // check ver first assert(ver is obj.ver); return *obj; } } ``` To make this work the obj memory can not released before all Pointer released, but it can be reused for new instance, so a memory pool is used to manage every struct|class. You need update version when object destroyed by counter. You can make sure there is only one mut borrow for a object like RUST. You can make WeakRef that don`t change counter, but become Option<Null> when the object released. You can make Unique ownership with this like RUST, but with WeakRef immutable pointer. Almost every things I want for RUST can be done from runtime, the price is managed memory pool for every object you want this kind behavior. I use it for 3 years with my private betterC framework, It also can work cross thread with Atomic counter and version. I give up on class and exception, use c library if there available. I use mecca swichInto Fiber with betterC, but can be graceful exit(mecca fiber never exit). With libuv I build cross platform app for Andoird/IOS/Windows/Mac/Linux. The binary is very small compare to C++. The price is manage memory pool for your self for each diff kind object. This object pool can be reused without fully destroyed. (a methed call reset before return to pool)
Nov 25 2021