digitalmars.D - DIP1000: 'return scope' ambiguity and why you can't make opIndex work

reply Dennis <dkorpel gmail.com> writes:
You may have seen my previous dip1000 posts:
- [dip1000 + pure is a DEADLY 
COMBO](https://forum.dlang.org/thread/jnkdcngzytgtobihzggj forum.dlang.org)
- [DIP1000: The return of 'Extend Return Scope 
Semantics'](https://forum.dlang.org/thread/zzovywgswjmwneqwbdnm forum.dlang.org)

Consider this part 3 in the "fixing dip1000 series", but it's 
about a different bug.



dip25 and dip1000 are supposed to provide *simple* lifetime 
tracking that's still good enough to be useful. In the previous 
thread [Atila Neves 
mentioned](https://forum.dlang.org/post/azdzxsxyuipovrlmbhbb forum.dlang.org)
that [Lifetime Annotations like in
Rust](https://carols10cents.github.io/book/ch10-03-lifetime-syntax.html#lifetime-annotations-in-function-signatures) are to be avoided. Is it simple though?

[On Wednesday, 26 May 2021 at 15:29:32 UTC, Paul Backus 
wrote:](https://forum.dlang.org/post/tsjffxwlwitaawiniztv forum.dlang.org)
 Of course, D's vision here is severely hampered in practice by
 the poor quality of its documentation (raise your hand if you
 can explain what ["return ref parameter semantics with
 additional scope parameter semantics"][1] actually means). But
 that's the idea.

 [1]:
 https://dlang.org/spec/function.html#ref-return-scope-parameters
Working on dip1000 made me finally able to "raise my hand", so here's how it works:

Function parameters of a type with pointers have three possible lifetimes: infinite, scope, or return scope. You might have heard that `scope` is "not transitive" and think that there's only one layer to it. However, the key insight is that there are actually *two layers* when `ref` comes into play: then the parameter's *address* itself also has a lifetime in addition to the *value*. It can be demonstrated with a linked list:

```D
@safe:

struct Node {
    int x;
    Node* next;
}

// First layer: returning the address of the node
int* get0(return ref Node node) {
    return &node.x;
}

// Second layer: returning a value of the node
int* get1(ref return scope Node node) {
    return &node.next.x;
}

// Third layer and beyond: this is where scope checking ends
int* get2(ref scope Node node) {
    return &node.next.next.x;
}
```

The lifetimes are determined as follows:

| Lifetime      | `ref` address         | value of pointer type |
|---------------|-----------------------|-----------------------|
| infinite      | never                 | default               |
| current scope | default               | with `scope` keyword  |
| return scope  | with `return` keyword | with `return scope`   |

A few code examples:

```D
@safe:
int* v0(             int* x) {return x;} // allowed, no lifetime restrictions
int* v1(return       int* x) {return x;} // allowed, returned value is `scope`
int* v2(       scope int* x) {return x;} // not allowed, x is `scope`
int* v3(return scope int* x) {return x;} // allowed, equivalent to v1

int* r0(       ref int x) {return &x;} // not allowed, `ref` is always scope
int* r1(scope  ref int x) {return &x;} // not allowed, `scope` does nothing here
int* r2(return ref int x) {return &x;} // allowed, return applies to `ref`
```

As you can see, `scope` always applies to the pointer value and not to the `ref`, since `ref` is inherently `scope`. No ambiguity there. But what if we have a `ref int*`: does `return` apply to the address of the `ref` or the `int*` value?
That's where those confusing lines from the specification come in, which distinguishes "return ref semantics" and "return scope semantics". It turns out there are three important factors: whether the function's return type is `ref`, whether the parameter is `ref`, and whether the parameter is annotated `scope`. Here's a table:

**Does the `return` attribute apply to the parameter's `ref` or the pointer value?**

|                                  | `scope`   | no `scope` |
|----------------------------------|-----------|------------|
| `ref` return type / `ref` param  | **`ref`** | **`ref`**  |
| value return type / `ref` param  | **value** | **`ref`**  |
| `ref` return type / value param  | **value** | **value**  |
| value return type / value param  | **value** | **value**  |

If you're still confused, I don't blame you: I'm still confusing myself regularly when reading signatures with `return` and `ref`.

Anyway, is this difficulty problematic?

[On Wednesday, 15 May 2019 at 08:32:09 UTC, Walter Bright 
wrote:](https://forum.dlang.org/post/qbgf95$2071$1 digitalmars.com)
 On 5/15/2019 12:21 AM, Dukc wrote:
  Could be worth a try even without docs, but in the long run we
  definitely need some explaining.

 True, but I've tried fairly hard with the error messages. Please
 post your experiences with them. Also, there shouldn't be any
 caveats with using it. If it passes the compiler, it should be
 good to go. (Much like const and pure.)
All you need to do is see if the compiler complains, try adding `return` and/or `scope`, and see if the errors go away. Well...

```D
@safe:

struct S {
    int x;
}

int* f(ref return scope S s) {
    return &s.x;
    // Error: returning `&s.x` escapes a reference to parameter `s`
    // perhaps annotate the parameter with `return`
}
```

That's a confusing supplemental error, the parameter *is* annotated `return`. The actual problem is that `return` applies to the value, not the `ref` parameter, since there is no `ref` return.

```D
struct T {
    int x;
    int* y; // <- pointer member added
}

int* g(ref return scope T t) {
    return &t.x; // No error
}
```

And now the compiler accepts invalid code. Indeed, even the compiler doesn't always know what the `return` storage class actually applies to. See [bugzilla issue 21868](https://issues.dlang.org/show_bug.cgi?id=21868).

While fixing [issue 21868](https://issues.dlang.org/show_bug.cgi?id=21868), the CI uncovered that [dub package 'automem' relies on the current accepts-invalid behavior](https://github.com/dlang/dmd/pull/12665#issuecomment-858836483). Here's the reduced code:

```D
struct Vector {
    float[] _elements;
    ref float opIndex(size_t i) scope return {
        return this._elements[i];
    }
}
```

With the patch I made, the error becomes:

```
source/automem/vector.d(212,25): Error: scope parameter `this` may not be returned
source/automem/vector.d(212,25):        note that `return` applies to `ref`, not the value
```

My new supplemental error message is working, yay! But how to fix it? One way is to pass the `Vector` by value instead of by reference, but `opIndex` must be a member function to work as an operator overload and member functions pass `this` by reference. Another way is to return by value instead of by reference, but that means accessing array elements introduces a copy, and `&vector[0]` won't work anymore. dip1000 simply can't express a 'return scope' `opIndex` returning by `ref`.
So it turns out the double duty of the `return` storage class is neither simple, nor expressive enough. Do you have any ideas how to move forward, and express the `Vector.opIndex` method without making the attribute soup worse? Keep in mind that dip25 (with `return ref`) is already in the language, but dip1000 (with `return scope`) is still behind a preview switch.
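
For readers who want to see the by-value workaround mentioned above, here is a minimal sketch. It is not taken from automem; the `Vector` layout mirrors the reduced code, and everything in the `unittest` block is an illustrative assumption. It shows why the workaround is unsatisfying: the element is copied out, and taking its address no longer works.

```d
struct Vector
{
    float[] _elements;

    // By-value workaround: nothing derived from `this` is returned by
    // reference, so a plain `scope` on `this` is enough.
    @safe float opIndex(size_t i) scope
    {
        return _elements[i];
    }
}

unittest
{
    float[3] buf = [1.0f, 2.0f, 3.0f];
    scope v = Vector(buf[]);   // wraps a slice of stack memory
    assert(v[1] == 2.0f);      // the element is copied out
    // auto p = &v[1];         // does not compile: `v[1]` is an rvalue
}
```

The block can be tried with `dmd -preview=dip1000 -unittest -main -run`, assuming a reasonably recent dmd.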
Jun 18 2021
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/18/21 11:44 AM, Dennis wrote:
 If you're still confused, I don't blame you: I'm still confusing myself 
 regularly when reading signatures with `return` and `ref`.
I have a headache reading this post, and it makes me want to never use DIP1000. We are creeping towards having as much confusion and pain as Rust, without the benefit.

I strongly believe we should implement DIP1000 in an expressive manner, instead of relying on confusing conventions -- just make a type constructor to signify lifetime management and be done.

-Steve
Jun 18 2021
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 18 June 2021 at 17:00:14 UTC, Steven Schveighoffer 
wrote:
 I strongly believe we should implement DIP1000 in an expressive 
 manner, instead of relying on confusing conventions -- just 
 make a type constructor to signify lifetime management and be 
 done.
Yes, I'm thinking about that too. I wonder if such expressive management would carry a performance penalty that we are currently not paying.
Jul 06 2021
prev sibling next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 18 June 2021 at 15:44:02 UTC, Dennis wrote:
 If you're still confused, I don't blame you: I'm still 
 confusing myself regularly when reading signatures with 
 `return` and `ref`. Anyway, is this difficulty problematic?
I am getting the same feeling from this as I am getting from certain aspects in C++ (e.g. intricate details of constructors). Thank you for explaining it, but I also think I will not remember it. I think stuff like this is what programmers will throw into a bucket labeled "I will figure this out later" and just apply keywords until it compiles...

I've suggested that one might want to make the function signatures more readable and keep "auxiliary stuff" on a separate line:

https://forum.dlang.org/thread/nzwobsazsawxvxbxhoue forum.dlang.org

I personally think explicit lifetimes are easier to read, because I don't actually have to remember what keywords signify.

It also makes it possible to expand the capabilities of the compiler over time.
Jun 18 2021
next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Friday, 18 June 2021 at 17:02:41 UTC, Ola Fosheim Grøstad 
wrote:
 [snip]

 I've suggested that one might want to make the function 
 signatures more readable and keep "auxiliary stuff" on a 
 separate line:

 https://forum.dlang.org/thread/nzwobsazsawxvxbxhoue forum.dlang.org

 I personally think explicit lifetimes are easier to read, 
 because I don't actually have to remember what keywords signify.

 It also makes it possible to expand the capabilities of the 
 compiler over time.
I am sympathetic to this. scope is relatively simple, but once you start getting into more combinations it requires a bit of mental energy.
Jun 18 2021
prev sibling parent reply Bradley Chatha <sealabjaster gmail.com> writes:
On Friday, 18 June 2021 at 17:02:41 UTC, Ola Fosheim Grøstad 
wrote:
 I've suggested that one might want to make the function 
 signatures more readable and keep "auxiliary stuff" on a 
 separate line:

 https://forum.dlang.org/thread/nzwobsazsawxvxbxhoue forum.dlang.org

 I personally think explicit lifetimes are easier to read, 
 because I don't actually have to remember what keywords signify.

 It also makes it possible to expand the capabilities of the 
 compiler over time.
Being able to perform explicit, sort of 'algebra-esque' expressions of lifetime seems like a much more reasonable idea than the current magical keyword combinations.

What are the chances that the path/syntax can be changed at this point though, mostly in regards to convincing people? Not just for this suggestion, but any suggestion/criticism towards DIP 1000 in general?

My main worry is that we'll end up with an inflexible, hard to understand system that doesn't even do the job right. Yet another tacked on feature for the language, etc.

I've not been terribly optimistic for quite a while now about the general direction things like this end up going, so I'm not getting my hopes up in any way.
Jun 19 2021
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 19 June 2021 at 10:40:45 UTC, Bradley Chatha wrote:
 What are the chances though that the path/syntax can be changed 
 at this point though, mostly in regards to convincing people? 
 Not just for this suggestion, but any suggestion/criticism 
 towards DIP 1000 in general?
I think it is mostly up to the other compiler devs to convince Walter? I don't think there is anything I can do anyway.
 My main worry is that we'll end up with an inflexible, hard to 
 understand system that doesn't even do the job right. Yet 
 another tacked on feature for the language, etc.
This is a good reason to have an experimental branch and let new features sit there until people get enough experience with them.
Jun 19 2021
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 19 June 2021 at 13:47:06 UTC, Ola Fosheim Grøstad 
wrote:
 This is a good reason to have an experimental branch and let 
 new features sit there until people get enough experience with 
 them.
DIP1000 is still behind a preview switch, which you could consider an experimental branch, and it has been for five years now. I don't think incubation time is the problem here, but rather the lack of documentation and thorough testing.

Now we have some code bases (most notably Phobos) that compile with -dip1000 but only because of hacks and bugs such as [inout implies return](https://issues.dlang.org/show_bug.cgi?id=22027), [pure implies scope](https://issues.dlang.org/show_bug.cgi?id=20150), [address of ref isn't scope](https://issues.dlang.org/show_bug.cgi?id=20245), [conflation of return-ref and return-scope](https://issues.dlang.org/show_bug.cgi?id=21868), [getting a slice to a scope static array is allowed](https://issues.dlang.org/show_bug.cgi?id=20505), [separate compilation isn't checked](https://issues.dlang.org/show_bug.cgi?id=20023), ...

Those were great in the short term because they minimize the amount of breaking changes when turning on -dip1000. In the long term, all this code relying on accepts-invalid bugs is annoying to fix though.
Jun 19 2021
next sibling parent Dukc <ajieskola gmail.com> writes:
On Saturday, 19 June 2021 at 14:37:24 UTC, Dennis wrote:
 Those were great in the short term because they minimize the 
 amount of breaking changes when turning on -dip1000. In the 
 long term, all this code relying on accepts-invalid bugs is 
 annoying to fix though.
And the worst of it, the bugs cannot be easily fixed because they would break Phobos, as you outlined in the first post of your series.

I think it was a mistake to declare Phobos `-dip1000` compliant with all those issues still around. I'd much rather have a non-Phobos `-dip1000` that behaves as it's supposed to.
Jun 19 2021
prev sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 19 June 2021 at 14:37:24 UTC, Dennis wrote:
 DIP1000 is still behind a preview switch, which you could 
 consider an experimental branch, and it has been for five years 
 now. I don't think incubation time is the problem here, but 
 rather the lack of documentation and thorough testing.
I don't know if people use switches all that much? With an experimental branch most hobbyists would use it to get the latest features, so I think it would lead to more use of upcoming features... but I don't know for sure.
Jun 19 2021
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On Friday, 18 June 2021 at 15:44:02 UTC, Dennis wrote:
 **Does the `return` attribute apply to the parameter's `ref` or 
 the pointer value?**
 |                                  | `scope`   | no `scope` |
 |----------------------------------|-----------|------------|
 | `ref` return type / `ref` param  | **`ref`** | **`ref`**  |
 | value return type / `ref` param  | **value** | **`ref`**  |
 | `ref` return type / value param  | **value** | **value**  |
 | value return type / value param  | **value** | **value**  |
[...]
 Here's the reduced code:
 ```D
 struct Vector {
     float[] _elements;
     ref float opIndex(size_t i) scope return {
         return this._elements[i];
     }
 }
 ```

 With the patch I made, the error becomes:
 ```
 source/automem/vector.d(212,25): Error: scope parameter `this` 
 may not be returned
 source/automem/vector.d(212,25):        note that `return` 
 applies to `ref`, not the value
 ```
Geez, this isn't easy. I had to go step by step to make sense of that error, so maybe this can help others understand:

`opIndex` has a half-hidden parameter: `return ref scope this`. Depending on the `opIndex`'s return type, the `return` part of the `this` parameter can either bind to its `ref` part or to its `scope` part. In pseudo code, it can be either `(return ref) (not-return scope) this` or `(not-return ref) (return scope) this`.

`opIndex` has a `ref` return type. According to the table above, that means `return` binds to the `ref` part of `ref scope this`. I.e., it's `(return ref) (not-return scope) this`.

`(return ref) this` means `opIndex` may return a `ref` to `this` or `this._elements` (same address).

`(not-return scope) this` means it cannot return a `ref` to the elements of `this._elements`, because that would be returning a `scope` pointer which hasn't been annotated with `return`.

As far as I understand, `opIndex` could return `&this._elements[i]` by value. Then the `return` would bind to the `scope` part of `ref scope this`, making `&this._elements[i]` a `return scope` pointer. But `float*` would be an awkward return type for `opIndex`.

Geez, this isn't easy.
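
The pointer-returning variant described in the last step can be sketched as below. This is only an illustration under assumptions: `ptrAt` is a made-up name, and the claim that `return` then binds to the `scope` part follows the table in this thread, not a guarantee about any particular compiler version.

```d
struct Vector
{
    float[] _elements;

    // Value return type plus `return scope` on `this`: per the table, the
    // `return` is expected to bind to the pointer value, so the result is
    // tied to whatever `_elements` points to.
    @safe float* ptrAt(size_t i) return scope
    {
        return &_elements[i];
    }
}

unittest
{
    auto v = Vector([1.0f, 2.0f]);
    *v.ptrAt(1) = 5.0f;              // writes through the returned pointer
    assert(v._elements[1] == 5.0f);
}
```

It gets element access without a copy, but callers have to dereference explicitly, which is why it is an awkward substitute for `opIndex`.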
Jun 18 2021
parent reply Dennis <dkorpel gmail.com> writes:
On Friday, 18 June 2021 at 17:04:02 UTC, ag0aep6g wrote:
 Geez, this isn't easy.
I know right? When I started to get the hang of it I was like "I should write a tutorial about this" followed closely by "how am I going to explain this in one go to someone who hasn't spelunked dmd/escape.d and looked at the relevant spec a dozen times?"

For this post I hoped to get across the idea that dmd has concepts of 'escaping by reference' for `ref int` and 'escaping by value' for `int*`, and that it currently sometimes goes wrong when you mix them. But there is so much more to cover:

- constructors act like they return `this` by `ref`, but still have `return scope` semantics
- `out` acts like `ref`
- `in` acts like... I don't know. With -preview=in it's implementation defined whether it's `ref scope` or just `scope`, so is it also implementation defined what `return` applies to then?
- `auto ref`... Don't know how that works internally.
- `ref` in foreach is actually *not* inherently scope like in parameters, and [it has its own hole](https://issues.dlang.org/show_bug.cgi?id=22040).
- when `scope` is inferred, could it change the meaning of `return` to apply to the value instead of the `ref`?
- ... who knows what I missed

Learning a complex system could be rewarding if afterwards you can write expressive code with lifetime tracking, but in the case of dip1000, after all your learning efforts you still can't write a routine that splits a `scope string` into a `scope(string)[]` because dip1000 simply can't express that.
Jun 18 2021
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 18 June 2021 at 18:31:40 UTC, Dennis wrote:
 Learning a complex system could be rewarding if afterwards you 
 can write expressive code with lifetime tracking, but in the 
 case of dip1000, after all your learning efforts you still 
 can't write a routine that splits a `scope string` into a 
 `scope(string)[]` because dip1000 simply can't express that.
I think this is the most significant issue. There is no way to extend it later without making signatures even more complicated.
Jun 19 2021
prev sibling next sibling parent Dukc <ajieskola gmail.com> writes:
On Friday, 18 June 2021 at 15:44:02 UTC, Dennis wrote:
 [snip]
Wow, if nothing else you're doing a great job documenting DIP1000 with your posts. Thanks!

With regular pointers and `ref` parameters, I think we should change the semantics of `scope ref` to be simply the same as `ref`, i.e. no binding `scope` to the underlying pointer. Other than that, the semantics you explained are understandable IMO.

I'd prefer to call the `return scope` storage class just a `return` storage class. Your post shows they are the same except for the corner cases with `ref scope` I just recommended ditching. Do you agree?

Of course, we also need to be able to annotate the `this` pointer as return. Simplest answer IMO: have `return` storage class for a function declaration to always bind to the `this` argument, compiler error if there is none. `return` storage class for the returned value makes no sense anyway.
Jun 18 2021
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 18.06.21 17:44, Dennis wrote:
 So it turns out the double duty of the `return` storage class is neither 
 simple, nor expressive enough. Do you have any ideas how to move 
 forward, and express the `Vector.opIndex` method without making the 
 attribute soup worse? Keep in mind that dip25 (with `return ref`) is 
 already in the language, but dip1000 (with `return scope`) is still 
 behind a preview switch.
A quick and easy fix could be introducing `return(ref)` and `return(scope)`, allowing the programmer to pick what `return` binds to. Then `opIndex` can be written this way:

----
ref float opIndex(size_t i) return(scope) {
    return this._elements[i];
}
----

But:

* That's still hard to figure out, especially with methods because `ref this` is invisible.
* It doesn't address the underlying issues: one level of `scope` is not enough, and treating `ref` different from other indirections is confusing.

I'm afraid DIPs 25 and 1000 are falling short.
Jun 19 2021
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 19 June 2021 at 09:43:18 UTC, ag0aep6g wrote:
 A quick and easy fix could be introducing `return(ref)` and 
 `return(scope)`, allowing the programmer to pick what `return` 
 binds to. Then `opIndex` can be written this way:

 ----
 ref float opIndex(size_t i) return(scope) {
     return this._elements[i];
 }
 ----
No thanks. The compiler still has ambiguity between

```d
ref float opIndex(return(scope) ref __this, size_t i);
return(scope) ref float opIndex(ref __this, size_t i);
```

It's simpler and better if `return` attribute outside the parameter list always binds to `this`. Unless there is some reason you might want to annotate the return type as `return`? I can't think of any.
 I'm afraid DIPs 25 and 1000 are falling short.
In the sense that you can't have deep `scope`, true. But I think DIP1000 was deliberately designed to not address that, for simplicity. I suggest that we leave fixing that for a potential future DIP. Just by fixing that `opIndex` example we still have the ability to define custom types that can be used for deep lifetime checking.
Jun 19 2021
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On Saturday, 19 June 2021 at 14:39:01 UTC, Dukc wrote:
 It's simpler and better if `return` attribute outside the 
 parameter list always binds to `this`. Unless there is some 
 reason you might want to annotate the return type as `return`? 
 I can't think of any.
I think you misunderstand. `return` does bind to `this`. But it can bind to two different aspects of `this`: (1) `ref` or (2) `scope`.

Currently, you cannot freely choose which aspect of `this` you want to be `return`. The `opIndex` example fails because `return` is applied to the wrong aspect, and the programmer can't override it.
Jun 19 2021
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 19 June 2021 at 14:48:45 UTC, ag0aep6g wrote:
 I think you misunderstand. `return` does bind to `this`. But it 
 can bind to two different aspects of `this`: (1) `ref` or (2) 
 `scope`.

 Currently, you cannot freely choose which aspect of `this` you 
 want to be `return`. The `opIndex` example fails because 
 `return` is applied to the wrong aspect, and the programmer 
 can't override it.
Ah thanks, let's try again. So the member function rewritten would look like

```d
ref float opIndex(return ref float[] vector, size_t i){
   return vector[i];
}
```

So the problem is that `return` allows returning the vector array itself by reference, but not any of its elements.

I think this function should compile, `return` or no. DIP1000 is not supposed to be transitive, so it should only check returning `vector` itself by reference, but not anything that it refers to.

However, the user still may want to annotate the `vector` parameter as `return`, because that lets the client code ensure that the reference to `vector[i]` won't live longer than `vector`.
Jun 19 2021
parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 19 June 2021 at 15:18:20 UTC, Dukc wrote:
 ```d
 ref float opIndex(return ref float[] vector, size_t i){
    return vector[i];
 }
 ```

 So the problem is that `return` allows returning the vector 
 array itself by reference, but not any of it's elements.

 I think this function should compile, `return` or no. DIP1000 
 is not supposed to be transitive, so it should only check 
 returning `vector` itself by reference, but not anything that 
 it refers to.
It currently does compile. `float[] vector` is assumed to have infinite lifetime here, so you can return its elements by ref just fine. You cannot call this `opIndex` on a slice of stack memory, then your signature has to add the `scope` keyword:

```D
ref float opIndex(return ref scope float[] vector, size_t i){
   return vector[i];
}
```

And then the compiler needs to know that the returned float's lifetime is bound to the stack memory you put in, so then you want `return scope` semantics (which you can't get here because there's a `ref` return and a `ref` param).
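
A small sketch of the caller's side may make the distinction concrete. The free function is the non-`scope` signature quoted above, renamed `at` purely for illustration; `firstOfHeap` and the commented-out `firstOfStack` are hypothetical helpers, and the rejection noted in the comment is what this post describes for `-preview=dip1000`, not something verified against a specific compiler release.

```d
// Non-`scope` signature from the discussion, renamed `at` for this sketch.
@safe ref float at(return ref float[] vector, size_t i)
{
    return vector[i];
}

@safe float firstOfHeap()
{
    float[] heap = [1.0f, 2.0f]; // GC allocation, unrestricted lifetime
    return at(heap, 0);          // fine: the element is copied into the result
}

// @safe float firstOfStack()
// {
//     float[2] buf = [1.0f, 2.0f];
//     float[] slice = buf[];    // `slice` refers to stack memory
//     return at(slice, 0);      // per the post: needs `scope` on `vector`
// }

unittest
{
    assert(firstOfHeap() == 1.0f);
}
```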
Jun 19 2021
parent Dukc <ajieskola gmail.com> writes:
On Saturday, 19 June 2021 at 15:27:24 UTC, Dennis wrote:
 It currently does compile. `float[] vector` is assumed to have 
 infinite lifetime here, so you can return its elements by ref 
 just fine. You cannot call this `opIndex` on a slice of stack 
 memory, then your signature has to add the `scope` keyword:


 ```D
 ref float opIndex(return ref scope float[] vector, size_t i){
    return vector[i];
 }
 ```
 And then the compiler needs to know that the returned float's 
 lifetime is bound to the stack memory you put in, so then you 
 want `return scope` semantics (which you can't get here because 
 there's a `ref` return and a `ref` param.
I'm not sure that `scope ref` should have any different semantics from `ref` anyway. If `vector` were passed by value it'd work with stack memory. But for that to work with operator overloading at the same time, we'd need a non-`ref` `this` argument, or overloading with `static` or UFCS functions.

Okay, now I'm not sure what would be the best course of action.
Jun 19 2021
prev sibling parent Dennis <dkorpel gmail.com> writes:
On Saturday, 19 June 2021 at 14:39:01 UTC, Dukc wrote:
 No thanks. The compiler still has ambiguity between

 ```d
 ref float opIndex(return(scope) ref __this, size_t i);
 return(scope) ref float opIndex(ref __this, size_t i);
 ```
 It's simpler and better if `return` attribute outside the 
 parameter list always binds to `this`.
There is no ambiguity, `return` and `scope` outside the parameter list *always* bind to `this`, whether you put them on the left or right.

```D
scope int* f(); // Error: function `onlineapp.f` functions cannot be `scope`
int* g() scope; // Error: function `onlineapp.g` functions cannot be `scope`
```

(Nice error message btw)
Jun 19 2021
prev sibling parent reply Dennis <dkorpel gmail.com> writes:
On Saturday, 19 June 2021 at 09:43:18 UTC, ag0aep6g wrote:
 But:

 * That's still hard to figure out, especially with methods 
 because `ref this` is invisible.
On top of that, `scope` could be invisible behind `in`.

Here's an idea: could `return` apply to both the `ref` and value if there is `return scope`?

|                                 | `scope`   | no `scope` |
|---------------------------------|-----------|------------|
| `ref` return type / `ref` param | **both**  | **`ref`**  |
| value return type / `ref` param | **both**  | **`ref`**  |
| `ref` return type / value param | **value** | **value**  |
| value return type / value param | **value** | **value**  |

The table would simply become:

|             | `scope`   | no `scope` |
|-------------|-----------|------------|
| `ref` param | **both**  | **`ref`**  |
| value param | **value** | **value**  |

You could even make this work then:

```D
struct Vector {
    float[4] small; // return ref
    float[] large;  // return scope
    bool isSmall;
    ref float opIndex(size_t i) return scope {
        return isSmall ? small[i] : large[i]; // dynamically choose
    }
}
```

I *think* this is sound because there is no way a `ref` outlives its `scope` members, so the lifetime of the returned value is simply the one of the `ref` (which is the smaller one). However, I feel like there's a caveat I overlooked. Can the solution really be this simple?
Jun 19 2021
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 19 June 2021 at 19:40:36 UTC, Dennis wrote:
 I *think* this is sound because there is no way a `ref` 
 outlives its `scope` members, so the lifetime of the returned 
 value is simply the one of the `ref` (which is the smaller 
 one). However, I feel like there's a caveat I overlooked. Can 
 the solution really be this simple?
Well, I had to think this one really hard.

Your solution has one disadvantage:

```d
struct CustomPtr
{
    private ubyte* ptr;
    // works, but suboptimal
    ubyte* opUnary(string op)() if(op=="*") return scope {return ptr;}
}
```

With your solution, DIP1000 will prevent the return value of `*customPtr` outliving `customPtr`. We would ideally want only to prevent `*customPtr` outliving whatever `customPtr.ptr` points to, which is what happens now if I understood the table right.

But I can see no flaw in changing the field "`ref` arg / `ref` return / `scope`" to "both".

Whether the complexity of the current semantics is worth it because of the issue I mentioned, I'm not sure at all. It's highly likely there are more corner cases lurking about. I have a feeling that we should go with your proposal, or continue brainstorming. Deriving the `scope` semantics from the return value makes too many assumptions about the intention.
Jun 21 2021
parent reply Dennis <dkorpel gmail.com> writes:
On Monday, 21 June 2021 at 16:10:53 UTC, Dukc wrote:
 With your solution, DIP1000 will prevent the return value of 
 `*customPtr` outliving `customPtr`. We would ideally want only 
 to prevent `*customPtr` outliving whatever `customPtr.ptr` 
 points to, which is what happens now if I understood the table 
 right.
That was my first reservation when considering it, but so far I failed to find a scenario where the `scope` outlives its `ref`. Take for example this:

```D
@safe:

struct CustomPtr
{
    private ubyte* ptr;
    ubyte* get() return scope {return ptr;}
}

void f(scope ubyte* ptr0)
{
    scope ubyte* ptr1;
    ptr1 = ptr0; // fine
    {
        CustomPtr c;
        c.ptr = ptr0;
        ptr1 = c.get(); // scope variable `c` assigned to `ptr1` with longer lifetime
    }
}
```

While ptr0 can be assigned to ptr1, when going through a `CustomPtr c`, dmd still sees it as assigning "scope variable `c`".
Jun 21 2021
parent Dukc <ajieskola gmail.com> writes:
On Monday, 21 June 2021 at 16:56:30 UTC, Dennis wrote:
 While ptr0 can be assigned to ptr1, when going through a 
 `CustomPtr c`, dmd still sees it as assigning "scope variable 
 `c`".
I was about to say it might be that you need to write

```d
void f(scope ubyte* ptr0)
{
    scope ubyte* ptr1;
    ptr1 = ptr0; // fine
    {
        auto c = CustomPtr(ptr0);
        ptr1 = c.get(); // with longer lifetime
    }
}
```

...instead, but that fails too, at least on the 2.096 release candidate. As does this one:

```d
ubyte* get(ref return scope ubyte* ptr){return ptr;}

void f(scope ubyte* ptr0)
{
    scope ubyte* ptr1;
    ptr1 = ptr0; // fine
    {
        auto c = ptr0;
        ptr1 = c.get();
    }
}
```

I believe we can conclude that the field "`ref` argument / value return / `scope`" already reads either "`ref`" or "both", at least when you're returning a pointer. That being the case, your solution makes perfect sense.
Jun 21 2021
prev sibling next sibling parent nkm1 <t4nk074 openmailbox.org> writes:
On Friday, 18 June 2021 at 15:44:02 UTC, Dennis wrote:

 If you're still confused, I don't blame you
I didn't understand anything. This is so bad.
Jun 21 2021
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/18/2021 8:44 AM, Dennis wrote:
 Here's the reduced code:
 ```D
 struct Vector {
      float[] _elements;
      ref float opIndex(size_t i) scope return {
          return this._elements[i];
      }
 }
 ```
It *always* helps to reduce these examples to the bare minimum, i.e. strip all the higher level constructs out. The compiler absolutely must be consistent in this. The above reduces to:

    ref int test(ref scope return int* p)
    {
       return *p;
    }

Take a moment to satisfy yourself that it is logically the same. [] was replaced by *, float by int, i no longer needed, don't confuse things with operator overloading, replace `this` with a ref parameter making it a non-member function, and then strip away the struct wrapper.

This currently compiles with master and -dip1000 without error.

Examining Dennis' table, it has a [ref return type] and a [scope value], and so the `return` applies to the pointer type. And so it is working as intended and is not a bug.

I totally understand that this is headache-inducing. The aspirin is realizing that 100% of these examples can be rewritten using *only* the following terms:

    int i;
    int* p;
    return i;
    return *p;
    return &i;
    return &p;

Write the example using these, sprinkling in ref, return, and scope. If anything else appears, like class, struct, this, [], delegate, etc., more work is needed to reduce it.

Hope this helps!
Jul 05 2021
parent reply ag0aep6g <anonymous example.com> writes:
On Monday, 5 July 2021 at 09:28:47 UTC, Walter Bright wrote:
   ref int test(ref scope return int* p)
   {
      return *p;
   }
[...]
 Examining Dennis' table, it has a [ref return type] and a 
 [scope value], and so the `return` applies to the pointer type.
I think you're misreading something there. For reference, the table:
 **Does the `return` attribute apply to the parameter's `ref` or 
 the pointer value?**
 |                                  | `scope`   | no `scope` |
 |----------------------------------|-----------|------------|
 | `ref` return type / `ref` param  | **`ref`** | **`ref`**  |
 | value return type / `ref` param  | **value** | **`ref`**  |
 | `ref` return type / value param  | **value** | **value**  |
 | value return type / value param  | **value** | **value**  |
Your test function has a `ref` return type and a `ref` parameter, so we're looking at the first row. That row says `return` applies to the `ref` part of the parameter (with and without `scope`).
Jul 05 2021
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2021 3:01 AM, ag0aep6g wrote:
 On Monday, 5 July 2021 at 09:28:47 UTC, Walter Bright wrote:
   ref int test(ref scope return int* p)
   {
      return *p;
   }
[...]
 Examining Dennis' table, it has a [ref return type] and a [scope value], and 
 so the `return` applies to the pointer type.
I think you're misreading something there. For reference, the table:
 **Does the `return` attribute apply to the parameter's `ref` or the pointer 
 value?**
 |                                  | `scope`   | no `scope` |
 |----------------------------------|-----------|------------|
 | `ref` return type / `ref` param  | **`ref`** | **`ref`**  |
 | value return type / `ref` param  | **value** | **`ref`**  |
 | `ref` return type / value param  | **value** | **value**  |
 | value return type / value param  | **value** | **value**  |
Your test function has a `ref` return type and a `ref` parameter, so we're looking at the first row. That row says `return` applies to the `ref` part of the parameter (with and without `scope`).
I think you might be right.
Jul 05 2021
next sibling parent reply claptrap <clap trap.com> writes:
On Monday, 5 July 2021 at 10:44:54 UTC, Walter Bright wrote:
 On 7/5/2021 3:01 AM, ag0aep6g wrote:
 On Monday, 5 July 2021 at 09:28:47 UTC, Walter Bright wrote:
 Your test function has a `ref` return type and a `ref` 
 parameter, so we're looking at the first row. That row says 
 `return` applies to the `ref` part of the parameter (with and 
 without `scope`).
I think you might be right.
How the heck are us mere mortals supposed to use this stuff if you struggle to understand what it means?
Jul 05 2021
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2021 4:36 AM, claptrap wrote:
 How the heck are us mere mortals supposed to use this stuff if you struggle to 
 understand what it means?
I make mistakes all the time. That's why D has unit tests built in, and an acceptance test suite. One reason I like writing software is once I get it right, it stays right every time.

Actually, I find Dennis' table a tad confusing. An alternate formulation is:

---------
Consider the ambiguity of a `return ref scope` parameter. Which is it:

1. `return ref` and `scope`
2. `ref` and `return scope`

?

The ambiguity is resolved by looking at the function return. If it returns by `ref`, then it's (1), otherwise (2). This is evident in the table above.

This rule is arbitrary, precluding things like creating a `return ref` and `return scope` for the same parameter.
----------
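To make the two readings concrete, here is a minimal sketch of one function per case. The names `byRef` and `byValue` are made up for illustration, and the keyword order in each signature is chosen only for readability; under the rule as stated above it is the return type, not the order, that decides. Treat it as an illustration of the intent rather than a statement of what any particular compiler version accepts.

```d
@safe:

// Returns by `ref`, so the parameter is read as `return ref` + `scope`:
// the result may refer to the parameter `p` itself.
// (`byRef` is a hypothetical name, not from the thread.)
ref int* byRef(return ref scope int* p)
{
    return p;
}

// Returns by value, so the parameter is read as `ref` + `return scope`:
// the result may carry the pointer value stored in `p`.
// (`byValue` is a hypothetical name, not from the thread.)
int* byValue(ref return scope int* p)
{
    return p;
}
```

With a heap-allocated `int* q`, `byValue(q)` hands back the same pointer that `q` holds, while `byRef(q)` yields a reference to `q` itself, so assigning through it changes `q`.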
Jul 05 2021
next sibling parent reply claptrap <clap trap.com> writes:
On Tuesday, 6 July 2021 at 05:15:24 UTC, Walter Bright wrote:
 On 7/5/2021 4:36 AM, claptrap wrote:
 How the heck are us mere mortals supposed to use this stuff if 
 you struggle to understand what it means?
I make mistakes all the time. That's why D has unit tests built in, and an acceptance test suite.
It wasn't a criticism of you, it was like if the most experienced and talented developers are struggling to make sense of it, how can the average programmer (me) ever hope to use it correctly. It just seems that whenever a DIP1000/safe thread comes up, nobody really understands it. Like it's the exact antithesis of "easy to use correctly".
Jul 06 2021
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 6 July 2021 at 09:06:47 UTC, claptrap wrote:
 It wasn't a criticism of you, it was like if the most 
 experienced and talented developers are struggling to make 
 sense of it how can the average programmer (me) ever hope to 
 use it correctly.
It is also a question of whether one wants to support people who use the language occasionally or only wants to support full-time users. Languages like Ada and C++ are in the full-time camp, but that only works if you have high commercial adoption, so it is not a good strategy in general.
Jul 06 2021
prev sibling parent reply Dennis <dkorpel gmail.com> writes:
On Tuesday, 6 July 2021 at 05:15:24 UTC, Walter Bright wrote:
 This rule is arbitrary, precluding things like creating a 
 `return ref` and
 `return scope` for the same parameter.
My question is: why? Why can't we allow a ref parameter to return either its address or its value? Given that lifetime(address) <= lifetime(value) and the returned value has lifetime(address), I'm struggling to see where it goes wrong.
Jul 06 2021
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2021 2:23 AM, Dennis wrote:
 On Tuesday, 6 July 2021 at 05:15:24 UTC, Walter Bright wrote:
 This rule is arbitrary, precluding things like creating a `return ref` and
 `return scope` for the same parameter.
My question is: why? Why can't we allow a ref parameter to return either its address or its value? Given that lifetime(address) <= lifetime(value) and the returned value has lifetime(address), I'm struggling to see where it goes wrong.
It doesn't make sense to return either p or the address of p with the same function. What kind of a function does that? We could trick the code into doing that, but that would introduce all kinds of bugs into the lifetime analysis code. The real problem is not being able to return the value of p as a ref, if p is also passed by ref. I'm going to look into addressing that.
Jul 06 2021
parent Dennis <dkorpel gmail.com> writes:
On Tuesday, 6 July 2021 at 09:37:30 UTC, Walter Bright wrote:
 It doesn't make sense to return either p or the address of p 
 with the same function. What kind of a function does that?
Here's an example: https://forum.dlang.org/post/uxtqfrbomcsfzzbefkyw forum.dlang.org
But such a function should not be the norm. The point is that users can simply add `return` and think "now I'm allowed to return `this` or `&this.x` or `this.arr[0]` or whatever" without delving into the whole "am I returning an address or value and am I allowed to" conundrum.
 We could trick the code into doing that, but that would 
 introduce all kinds of bugs into the lifetime analysis code.
Do you have an example?
 The real problem is not being able to return the value of p as 
 a ref, if p is also passed by ref.

 I'm going to look into addressing that.
Awesome!
Jul 06 2021
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Added PRs:

https://github.com/dlang/dmd/pull/12817
https://github.com/dlang/dmd/pull/12819

This, of course, breaks:

  struct Vector {
     float[] _elements;
     ref float opIndex(size_t i) scope return {
         return this._elements[i];
     }
  }

as Dennis points out. It is still the correct fix, however; we just have to
find a way to get opIndex to work.
Jul 06 2021