digitalmars.dip.development - __rvalue and Move Semantics first draft

Walter Bright <newshound2 digitalmars.com> writes:

https://github.com/WalterBright/documents/blob/5dbf6728d7d0ae46a411c720ec41e3603310172b/rvalue.md

I gave up on the previous move DIP. This one is better.

Nov 09 2024

kinke <noone nowhere.com> writes:

Thanks, this is definitely a step in the right direction, getting 
us perfect forwarding. I very much like its simplicity. First 
thoughts wrt. the `__rvalue` builtin:

 This means that an __rvalue(lvalue expression) argument 
 destroys the expression upon function return. Attempts to 
 continue to use the lvalue expression are invalid. The compiler 
 won't always be able to detect a use after being passed to the 
 function, which means that the destructor for the object must 
 reset the object's contents to its initial value, or at least a 
 benign value.

What IMO needs to be stressed here is that there's always one 
implicit use of the original lvalue after the __rvalue usage - 
its destruction when going out of scope! So the dtor at the very 
least needs to make sure that it can handle a double-destruction, 
adjusting the payload to make the 2nd destruction a 'noop', not 
freeing effective resources twice etc.

And that's my only real problem with the proposal in its current 
shape - who's going to revise all existing code to check for 
problematic struct dtors that don't handle double-destruction, 
just in case someone applies __rvalue on one of these types, or a 
custom struct with those types as fields?

The proposed `__rvalue` is very similar to what I proposed in 
https://forum.dlang.org/thread/xnwhexrctbfgntfklzaf@forum.dlang.org, namely the 
revised `forward` semantics in the non-ref-storage-class case. The 
main difference is that I went the suppress-2nd-destruction way, limiting its 
applicability to local variables (incl. params) only, where the destruction 
could be controlled via a magic destructor-guard variable for each local that 
might be __rvalue'd.

When going with the double destruction to keep things simpler and 
allow __rvalue for *all* lvalues (I guess PODs too, which aren't 
guaranteed to be passed by ref under the hood, and so might still 
be blitted or passed in registers, depending on the platform 
ABI), then I'd propose automatically performing a reset-blit to 
`T.init` after the function call (incl. the case where the callee 
threw - the rvalue has still been destructed in that case, so we 
still need to reset the payload for the 2nd destruction). This 
has a number of advantages:
* No need to check and fix up all existing dtors.
* Well-defined state of the lvalue after its usage as __rvalue - 
`T.init` -, not some nebulous  'initial value, or at least a 
benign value' (as proposed, the state the first destruction left 
the object in, or if the type has no dtor (not all non-PODs have 
a dtor), the state the callee left the object in).
* Not paying the price for resets for every destruction, only 
after __rvalue usages. I guess the overall number of destructions 
is usually orders of magnitude greater than __rvalue usages.

Eliding the `T.init` reset and the 2nd destruction - in suited 
cases - could be implemented as an optimization later.
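
The reset-to-`T.init` idea can be tried out in today's D with `core.lifetime.move`, which already resets a source that has a destructor to its `.init` value. The sketch below only approximates the behaviour proposed above, it is not how `__rvalue` is implemented; `Handle`, `frees` and `consume` are hypothetical names made up for illustration.

```d
import core.lifetime : move;

// Hypothetical resource-owning type, for illustration only.
struct Handle
{
    int* payload;
    static int frees; // counts how often a payload was released

    ~this()
    {
        // The destructor does not reset anything itself; it only has to
        // treat the .init state (null payload) as "nothing to release".
        if (payload !is null)
            ++frees; // stands in for releasing the resource
    }
}

// By-value parameter: h is destroyed when consume returns.
void consume(Handle h) {}

void main()
{
    {
        auto lval = Handle(new int(42));
        consume(move(lval));          // source is reset to Handle.init
        assert(lval.payload is null); // well-defined state after the move
        assert(Handle.frees == 1);    // released once, by the callee
    } // second destruction of lval happens here and is a no-op
    assert(Handle.frees == 1);
}
```

The point of the sketch is that the reset is paid only at the call site doing the move, not in every destructor call.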

---

Wrt. safety, I think we should at least also mention the aliasing 
problem/danger:
```D
void callee(ref S x, S y) {
     assert(&x != &y);
}

void caller() {
     S lval;
     callee(lval, __rvalue(lval));
}
```

Nov 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

Some great insights.

I suggest the most pragmatic implementation of your ideas is to append to the 
destructor calls to rvalue parameters a blit of the .init value. It is only 
necessary if the rvalue has a destructor. The callee cannot know if an rvalue 
was passed using __rvalue, so it has to defensively do this anyway.

I also suggest that we maybe omit the blit for @system code, like we enable 
omitting array bounds checking in @system code. For efficiency, naturally!

Nov 09 2024

kinke <noone nowhere.com> writes:

On Saturday, 9 November 2024 at 22:39:33 UTC, Walter Bright wrote:
 I suggest the most pragmatic implementation of your ideas is to 
 append to the destructor calls to rvalue parameters a blit of 
 the .init value. It is only necessary if the rvalue has a 
 destructor. The callee cannot know if an rvalue was passed 
 using __rvalue, so it has to defensively do this anyway.

I'm not too fond of that, as that means doing the blit for every 
value parameter with a dtor, not just in the (presumably *way* 
less) call sites using `__rvalue`. Adding a cleanup-scope 
(`finally`) for the call shouldn't be too hard, reset-blitting 
all arguments that were __rvalue'd. Incl. PODs and non-PODs 
without dtor, to get the `T.init`-state guarantee in all cases, 
required to make this feature half-way safe in cases where the 
compiler cannot prove that the original lvalue isn't accessed 
later.

Nov 11 2024

Walter Bright <newshound2 digitalmars.com> writes:

I'm not sure it's a problem or a danger.

Timon mentioned the related problem with:

```
callee(__rvalue(s), __rvalue(s));
```

where s would be destroyed twice. This isn't always detectable:
```
S* ps = ...;
callee(__rvalue(*ps), __rvalue(*ps));
```
But can be rendered benign with the blit of S.init after the destructor call.

On 11/9/2024 6:32 AM, kinke wrote:
 Wrt. safety, I think we should at least also mention the aliasing 
 problem/danger:
 ```D
 void callee(ref S x, S y) {
      assert(&x != &y);
 }
 
 void caller() {
      S lval;
      callee(lval, __rvalue(lval));
 }
 ```

Nov 09 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 10/11/2024 11:44 AM, Walter Bright wrote:
 Timon mentioned the related problem with:
 
 `callee(__rvalue(s), __rvalue(s));`
 
 where s would be destroyed twice. This isn't always detectable:

Break it down into an IR:

```
a = __rvalue(s)
b = __rvalue(s)
callee(a, b)
```

This is what type state analysis sees at an IR level.

```
// s must be >=initialized
a = s
// s is reachable, which is < initialized

// s must be >= initialized, ERROR
b = s
```

We don't need to solve type state analysis here ;)

But it does tell us, that as a language feature it is dependent upon it, 
to be working correctly, so can't be turned on until then.

Nov 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 11/9/2024 6:38 PM, Richard (Rikki) Andrew Cattermole wrote:
 But it does tell us, that as a language feature it is dependent upon it, to be 
 working correctly, so can't be turned on until then.

By setting the source to S.init after the move, it will work safely.

Nov 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/9/24 23:44, Walter Bright wrote:
 I'm not sure it's a problem or a danger.
 
 Timon mentioned the related problem with:
 
 ```
 callee(__rvalue(s), __rvalue(s));
 ```
 
 where s would be destroyed twice. This isn't always detectable:
 ```
 S* ps = ...;
 callee(__rvalue(*ps), __rvalue(*ps));
 ```
 But can be rendered benign with the blit of S.init after the destructor 
 call.

I think the main potential trouble is that there is usually an 
assumption that there is no aliasing between rvalue arguments.

For example, if a compiler backend assumes no aliasing, undefined 
behavior might be introduced if one of the arguments is modified and 
then the other is read.

Of course, we can instead specify that the aliasing is legal (but it may 
still be surprising).

Nov 10 2024

kinke <noone nowhere.com> writes:

On Sunday, 10 November 2024 at 17:36:25 UTC, Timon Gehr wrote:
 On 11/9/24 23:44, Walter Bright wrote:
 I'm not sure it's a problem or a danger.
 
 Timon mentioned the related problem with:
 
 ```
 callee(__rvalue(s), __rvalue(s));
 ```
 
 where s would be destroyed twice. This isn't always detectable:
 ```
 S* ps = ...;
 callee(__rvalue(*ps), __rvalue(*ps));
 ```
 But can be rendered benign with the blit of S.init after the 
 destructor call.


But that's at least already invalid/undefined in your proposal. 
I've used `callee(lval, __rvalue(lval))` to show that the 
aliasing problem can occur in valid code too - `lval` isn't 
accessed lexically after __rvalue'ing it. __rvalue'ing a global 
variable and checking that the global isn't accessed in the 
callee is even harder.

 I think the main potential trouble is that there is usually an 
 assumption that there is no aliasing between rvalue arguments.

It's not just an assumption, it's an implicit guarantee - a 
by-value parameter is analogous to a local in high-level terms, 
so of course with its own distinct private memory. Well, until 
now. :)

 For example, if a compiler backend assumes no aliasing, 
 undefined behavior might be introduced if one of the arguments 
 is modified and then the other is read.

This isn't just a problem with compiler optimizations, but in 
general:

```D
void callee(const ref S x, S y) {
     y.bla = x.bla - 1;
     assert(y.bla != x.bla, "have changed ref via value alias!");
}
```

Nov 11 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/11/24 12:31, kinke wrote:
 On Sunday, 10 November 2024 at 17:36:25 UTC, Timon Gehr wrote:
 On 11/9/24 23:44, Walter Bright wrote:
 I'm not sure it's a problem or a danger.

 Timon mentioned the related problem with:

 ```
 callee(__rvalue(s), __rvalue(s));
 ```

 where s would be destroyed twice. This isn't always detectable:
 ```
 S* ps = ...;
 callee(__rvalue(*ps), __rvalue(*ps));
 ```
 But can be rendered benign with the blit of S.init after the 
 destructor call.


 
 But that's at least already invalid/undefined in your proposal. I've 
 used `callee(lval, __rvalue(lval))` to show that the aliasing problem 
 can occur in valid code too - `lval` isn't accessed lexically after 
 __rvalue'ing it. __rvalue'ing a global variable and checking that the 
 global isn't accessed in the callee is even harder.
 ...

Well, I would rather not consider this valid, as the last use of the 
original `lval` may be within the callee after the move. My favorite 
design would be making `__rvalue` a low-level `@system` operation by 
default and having the high-level `move` operation actually ensure these 
things cannot happen.

 I think the main potential trouble is that there is usually an 
 assumption that there is no aliasing between rvalue arguments.

 
 It's not just an assumption, it's an implicit guarantee - a by-value 
 parameter is analogous to a local in high-level terms, so of course with 
 its own distinct private memory. Well, until now. :)
 ...

Yup.

 For example, if a compiler backend assumes no aliasing, undefined 
 behavior might be introduced if one of the arguments is modified and 
 then the other is read.

 
 This isn't just a problem with compiler optimizations, but in general:
 
 ```D
 void callee(const ref S x, S y) {
      y.bla = x.bla - 1;
      assert(y.bla != x.bla, "have changed ref via value alias!");
 }
 ```

Yup.

Nov 14 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 11/10/2024 9:36 AM, Timon Gehr wrote:
 I think the main potential trouble is that there is usually an assumption that 
 there is no aliasing between rvalue arguments.
 
 For example, if a compiler backend assumes no aliasing, undefined behavior 
 might be introduced if one of the arguments is modified and then the other is 
 read.
 
 Of course, we can instead specify that the aliasing is legal (but it may still 
 be surprising).

The problem exists anyway.

https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md

This has been incorporated, but is only turned on with a switch.

Nov 18 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/19/24 07:50, Walter Bright wrote:
 On 11/10/2024 9:36 AM, Timon Gehr wrote:
 I think the main potential trouble is that there is usually an 
 assumption that there is no aliasing between rvalue arguments.

 For example, if a compiler backend assumes no aliasing, undefined 
 behavior might be introduced if one of the arguments is modified and 
 then the other is read.

 Of course, we can instead specify that the aliasing is legal (but it 
 may still be surprising).

 
 The problem exists anyway.
 
 https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
 
 This has been incorporated, but is only turned on with a switch.

Well, aliasing between `ref` parameters is an expected thing that can 
occur. Backends and users are aware of this possibility. Check out the 
implementation of std.algorithm.swap.
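
As a small illustration of why swap-like code has to think about `ref` aliasing, here is a sketch of a naive in-place swap that silently breaks when both parameters name the same object, next to a guarded variant. `xorSwap` and `guardedXorSwap` are hypothetical helpers written for this example; they are not the Phobos implementation.

```d
// Hypothetical in-place swap without a temporary; for illustration only.
void xorSwap(ref int a, ref int b)
{
    a ^= b;
    b ^= a;
    a ^= b;
}

// Same routine, with an explicit check for both parameters naming one object.
void guardedXorSwap(ref int a, ref int b)
{
    if (&a is &b)
        return; // swapping an object with itself is a no-op
    xorSwap(a, b);
}

void main()
{
    int x = 3, y = 5;
    xorSwap(x, y);
    assert(x == 5 && y == 3); // distinct objects: works as intended

    int z = 7;
    xorSwap(z, z);            // aliased: the first step zeroes the shared object
    assert(z == 0);

    int w = 7;
    guardedXorSwap(w, w);
    assert(w == 7);           // the guard preserves the value
}
```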

Aliasing between non-`ref` parameters (or across `ref`-ness) is a 
different thing. This can be rather surprising and I think it would 
sometimes lead to undefined behavior with current backends.

So the question is how `__rvalue` will interact with `@safe`, and if it 
is sometimes unsafe, whether there will be a safe variant that 
conservatively moves rvalues in memory to avoid aliasing situations if 
they cannot be precluded.

Nov 20 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 09/11/2024 10:33 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/5dbf6728d7d0ae46a411c720ec41e3603310172b/rvalue.md
 
 I gave up on the previous move DIP. This one is better.

This is a restatement of what I said yesterday at the monthly meeting.

I am significantly happier with this design however:

1. We'll need to introduce a swap builtin, since we have no way to 
describe moves between parameters. This can come later, as it is an 
addition.
2. I have the concern that existing code that is not designed to accept 
a move will have a move into it. Whitelisting via an attribute 
`@move` to say that this constructor/opAssign is designed to handle a 
move in would be valuable.
3. Optimizing of eliding of destructors should be done with type state 
analysis, it does not need its own dedicated DFA.

Nov 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 11/9/2024 8:15 AM, Richard (Rikki) Andrew Cattermole wrote:
 1. We'll need to introduce a swap builtin, since we have no way to describe 
 moves between parameters. This can come later, as it is an addition.

Doesn't a swap function get arguments passed by `ref`?

 2. I have the concern that existing code that is not designed to accept a 
 move will have a move into it. Whitelisting via an attribute `@move` to say that 
 this constructor/opAssign is designed to handle a move in would be valuable.

This can work, but if the users have to proactively add this attribute, I'm 
afraid we've failed.

 3. Optimizing of eliding of destructors should be done with type state 
 analysis, it does not need its own dedicated DFA.

The two are the same, aren't they?

Nov 09 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 10/11/2024 11:59 AM, Walter Bright wrote:
 On 11/9/2024 8:15 AM, Richard (Rikki) Andrew Cattermole wrote:
 1. We'll need to introduce a swap builtin, since we have no way to 
 describe moves between parameters. This can come later, as it is an 
 addition.

 
 Doesn't a swap function get arguments passed by `ref`?

Yes, but for lifetime tracking, we need to be able to say the original 
value isn't here anymore.

```d
int* a, b;

int* c = a, d = b;

swap(a, b);

// c has same variable state as a
// d has same variable state as b
```

In general moving is easy:

```d
int* move(?initialized,reachable ref int* input) {
	return input;
}
```

But swap isn't.

```d
void swap(
      ?initialized,initialized @escape(b) ref int* a,
      ?initialized,initialized @escape(a) ref int* b);
```

 2. I have the concern that existing code that is not designed to 
 accept a move will have a move into it. Whitelisting via an 
 attribute `@move` to say that this constructor/opAssign is designed 
 to handle a move in would be valuable.

 
 This can work, but if the users have to proactively add this attribute, 
 I'm afraid we've failed.

The alternative is to disallow passing `__rvalue` to a constructor/opAssign 
that is in a D2 module and does not take its parameter by ref.

Tie it to a new edition.

Any function being called that is by-ref will work the same.

```d
module thing 2025;

struct Foo {
	this(Foo input);
}

void main() {
	Foo f;
	Foo t = __rvalue(f); // move constructor call
}
```

```d
module thing 2;

struct Foo {
	this(Foo input);
	this(ref Foo input);
}

void main() {
	Foo f;
	Foo t = __rvalue(f); // copy constructor call
}
```

 3. Optimizing of eliding of destructors should be done with type state 
 analysis, it does not need its own dedicated DFA.

 
 The two are the same, aren't they?

Yes exactly.

When you converge (or at other known points), you'd look to see what the 
last destructor is, and if appropriate, ``var.lastDestroy.disabled = true;``.

Type state analysis has the absolutely beautiful property that the 
builtin states are 100% correct even in `@system` code.

It is _always_ an error to dereference a null pointer.

It is _always_ a logic error to read from uninitialized memory.

So it'll be run on all code, which means we can rely on it to do eliding 
for stuff like this.

Same situation with RC.

```d
rc.opAddRef();
rc.opSubRef();
```

Same object, pair can be elided.

It is why the add needs to happen in the called function, because then 
it can be elided without cross-function analysis.

Nov 09 2024

Salih Dincer <salihdb hotmail.com> writes:

On Saturday, 9 November 2024 at 22:59:34 UTC, Walter Bright wrote:
 On 11/9/2024 8:15 AM, Richard (Rikki) Andrew Cattermole wrote:
 1. We'll need to introduce a swap builtin, since we have no 
 way to describe moves between parameters. This can come 
 later, as it is an addition.

 Doesn't a swap function get arguments passed by `ref`?

 2. I have the concern that existing code that is not designed 
 to accept a move will have a move into it. Whitelisting via 
 an attribute `@move` to say that this constructor/opAssign 
 is designed to handle a move in would be valuable.

 This can work, but if the users have to proactively add this 
 attribute, I'm afraid we've failed.

 3. Optimizing of eliding of destructors should be done with 
 type state analysis, it does not need its own dedicated DFA.

 The two are the same, aren't they?

When will I be able to measure performance for different data 
types and usage scenarios?

That way, we could concretely show the performance advantages of the 
`__rvalue` keyword. In my opinion, the performance impact of 
move semantics in real-world applications should be analyzed 
in detail, and its effects on large data structures and 
frequently used objects should be examined.

We have no time to lose...

SDB 79

Nov 14 2024

kinke <noone nowhere.com> writes:

Oh, there's at least one problem with the `this(T)` move-ctor 
signature - C++ interop. C++ doesn't destroy the parameter, 
because it's an rvalue-ref. The proposed by-value signature in D 
however includes the destruction of the value-parameter as part 
of the move-construction. The same applies to move-assignment via 
`opAssign(T)`. So after calling a C++ move ctor/assignOp with an 
`__rvalue(x)` argument, the rvalue wasn't destructed, and its 
state is as the C++ callee left it. Automatically reset-blitting 
to `T.init` would be invalid in that case, as the moved-from 
lvalue might still have stuff to destruct.

Nov 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 11/9/2024 9:37 AM, kinke wrote:
 Oh, there's at least one problem with the `this(T)` move-ctor signature - C++ 
 interop. C++ doesn't destroy the parameter, because it's an rvalue-ref. The 
 proposed by-value signature in D however includes the destruction of the 
 value-parameter as part of the move-construction. The same applies to 
 move-assignment via `opAssign(T)`. So after calling a C++ move ctor/assignOp 
 with an `__rvalue(x)` argument, the rvalue wasn't destructed, and its state is 
 as the C++ callee left it. Automatically reset-blitting to `T.init` would be 
 invalid in that case, as the moved-from lvalue might still have stuff to 
 destruct.

We could disallow __rvalue arguments for call to C++ functions?

Nov 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/10/24 00:01, Walter Bright wrote:
 On 11/9/2024 9:37 AM, kinke wrote:
 Oh, there's at least one problem with the `this(T)` move-ctor 
 signature - C++ interop. C++ doesn't destroy the parameter, because 
 it's an rvalue-ref. The proposed by-value signature in D however 
 includes the destruction of the value-parameter as part of the 
 move-construction. The same applies to move-assignment via `opAssign(T)`. 
 So after calling a C++ move ctor/assignOp with an `__rvalue(x)` 
 argument, the rvalue wasn't destructed, and its state is as the C++ 
 callee left it. Automatically reset-blitting to `T.init` would be 
 invalid in that case, as the moved-from lvalue might still have stuff 
 to destruct.

 
 We could disallow __rvalue arguments for call to C++ functions?

How do you even call a C++ function that accepts an rvalue reference from D?

If `extern(C++) this(T)` magically matches the C++ move constructor, it 
seems that additional magic has to be added to all calls in any case to 
deal with the mismatch.

Nov 10 2024

kinke <noone nowhere.com> writes:

On Sunday, 10 November 2024 at 17:42:27 UTC, Timon Gehr wrote:
 On 11/10/24 00:01, Walter Bright wrote:
 On 11/9/2024 9:37 AM, kinke wrote:
 Oh, there's at least one problem with the `this(T)` move-ctor 
 signature - C++ interop. C++ doesn't destroy the parameter, 
 because it's an rvalue-ref. The proposed by-value signature 
 in D however includes the destruction of the value-parameter 
 as part of the move-construction. The same applies to 
 move-assignment via `opAssign(T)`. So after calling a C++ 
 move ctor/assignOp with an `__rvalue(x)` argument, the rvalue 
 wasn't destructed, and its state is as the C++ callee left 
 it. Automatically reset-blitting to `T.init` would be invalid 
 in that case, as the moved-from lvalue might still have stuff 
 to destruct.

 
 We could disallow __rvalue arguments for call to C++ functions?

 How do you even call a C++ function that accepts an rvalue 
 reference from D?

We can't without rvalue-ref complications in D, but we could 
definitely special-case move ctors and assignment operators, just 
need to match the C++ mangle. And match the same semantics 
obviously, which is the crux. - We can already interop with the 
main C++ lifetime member functions - regular constructors, copy 
constructors, destructors. It'd IMO be a shame not being able to 
use the original C++ move ctor and assignOp too, having to 
re-implement them in D for a complete binding.

Nov 11 2024

Quirin Schroll <qs.il.paperinik gmail.com> writes:

On Saturday, 9 November 2024 at 09:33:24 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/5dbf6728d7d0ae46a411c720ec41e3603310172b/rvalue.md

 From the DIP:
 An rvalue argument is considered to be owned by the function 
 called. Hence, if an lvalue is matched to the rvalue argument, 
 a copy is made of the lvalue to be passed to the function. The 
 function will then call the destructor (if any) on the 
 parameter at the conclusion of the function. An rvalue argument 
 is not copied, as it is assumed to already be unique, and is 
 also destroyed at the conclusion of the function. The 
 destruction is automatically appended to the function body by 
 the compiler.

 The function cannot know if its parameter originated as an 
 rvalue or is a copy of an lvalue.

 This means that an `__rvalue(lvalue expression)` argument 
 destroys the expression upon function return. Attempts to 
 continue to use the lvalue expression are invalid. The compiler 
 won't always be able to detect a use after being passed to the 
 function, which means that the destructor for the object must 
 reset the object's contents to its initial value, or at least a 
 benign value.

I think that section needs revising. As I understand it, a 
function binds an argument by reference or by value:
```d
void f(ref T reference); // binds by reference
void g(T value); // binds by value
```

In my mind, function parameters are essentially local variables 
of the function that are assigned by the caller (by providing 
arguments). If argument passing does not work exactly like 
initializing (local) variables, I’d consider that a flaw of the 
language.

This means:

If a parameter is bound by value, it will be destroyed as `g` 
returns (whether that is done by the caller or the callee is an 
implementation detail and not part of the language). Whether the 
caller passes `x` or `__rvalue(x)` is completely irrelevant to 
the callee. It only ever sees its parameter initialized and is 
responsible for its destruction. It cannot care where it came 
from.
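
A minimal sketch of the by-value case in current D, without `__rvalue`: an lvalue argument is copied and the copy is destroyed when `g` returns, while an rvalue argument is passed without a copy and destroyed exactly once. `Tracked`, `make` and the counters are hypothetical names used only for this illustration.

```d
// Hypothetical type that counts its copies and destructions.
struct Tracked
{
    int id;
    static int copies;
    static int destroyed;

    this(this) { ++copies; }  // runs when an lvalue argument is copied
    ~this() { ++destroyed; }
}

// Binds by value: the parameter is destroyed when g returns.
void g(Tracked value) {}

// Produces an rvalue.
Tracked make() { return Tracked(2); }

void main()
{
    {
        auto x = Tracked(1);
        g(x);                           // lvalue: one copy, destroyed by the call
        assert(Tracked.copies == 1);
        assert(Tracked.destroyed == 1);
    }                                   // x itself is destroyed here
    assert(Tracked.destroyed == 2);

    Tracked.copies = 0;
    Tracked.destroyed = 0;
    g(make());                          // rvalue: no copy, one destruction
    assert(Tracked.copies == 0);
    assert(Tracked.destroyed == 1);
}
```

In both calls `g` itself is identical; it cannot tell where its parameter came from.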

If an argument is bound by reference, passing `__rvalue(x)` is 
either invalid or, if the `rvaluerefparam` preview is active, 
binds a temporary initialized in the stack frame of the caller by 
`__rvalue(x)`. It does not bind `x`, that would be extremely 
confusing. In that case, the caller is responsible for the 
destruction of the temporary. (The callee knows nothing about the 
creation of the temporary.)

We could introduce a parameter storage class `__rvalue ref` that:
* Corresponds to C++ rvalue references
* Allows binding rvalues only, and for `__rvalue(x)` arguments, 
no temporary is created.

That would allow a function to freely move from an argument:
```d
void tryAdd(__rvalue ref T x)
{
     if (…) this.x = __rvalue(x);
}
```

Contrary to the above, `void tryAdd(T x)` requires a move to pass 
an rvalue argument and another move to assign `this.x`. However, 
if moving a `T` is reasonably cheap, pass-by-value can make sense 
if binding lvalue arguments should be supported.

By itself, `__rvalue(x)` should do nothing. Only if an operation 
on it distinguishes rvalues and lvalues does it matter, which is 
its use case; then that ***usually*** leaves `x` in a moved-from 
state, but as shown above, there’s a use case for not moving from 
the variable. Thus, after `tryAdd(__rvalue(x))` the variable `x` 
contains a valid `T` object or a moved-from `T` object.

A moved-from `T` object need not support all operations `T` 
allows, but in C++, it must allow for two operations:
- being assigned
- being destroyed

Most types can support an empty state, and moving from an object 
would put it in that state.

---

It seems your DIP Draft conflates moving and relocation (C++ 
lingo). A relocation is a move followed by destruction of the 
source. The notion of relocation is meaningful because there are 
types for which relocation is trivial but moving is not.

For example, a `std::unique_ptr` has a non-trivial move: It must 
set the source `std::unique_ptr` in a null state (such that it 
can be assigned again or destroyed without releasing the managed 
resource, which has a new owner). A `std::unique_ptr` has a 
trivial relocation, though. If we simply copy the internal 
pointer and do not run the destructor on the source, the managed 
resource has a new owner and we don’t waste time setting the 
source null and then checking if the source is null (to skip the 
freeing of a possible managed resource.)
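
The difference can be sketched in D with a hypothetical `Unique` type loosely modelled on `std::unique_ptr`. The move uses `core.lifetime.move`, which nulls the source and leaves it to be destroyed later; the relocation is emulated by hand with `memcpy` from raw `malloc` storage, which is only an assumption about what a built-in relocation would boil down to for such a type.

```d
import core.lifetime : move;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical owning pointer, for illustration only.
struct Unique
{
    int* ptr;
    static int released; // counts destructor runs that released something

    @disable this(this); // no copies, only moves

    ~this()
    {
        if (ptr !is null) // the null check a moved-from source has to pay for
        {
            ++released;
            ptr = null;
        }
    }
}

void main()
{
    // Move: the source stays alive, is nulled, and is still destroyed later.
    {
        auto src = Unique(new int(1));
        auto dst = move(src);
        assert(src.ptr is null);
        assert(*dst.ptr == 1);
    }
    assert(Unique.released == 1); // only dst released; src's destructor was a no-op

    // Relocation: copy the bits and never run a destructor on the source.
    {
        auto raw = cast(Unique*) malloc(Unique.sizeof);
        assert(raw !is null);
        raw.ptr = new int(2);             // build the source in raw storage
        Unique dst = void;
        memcpy(&dst, raw, Unique.sizeof); // the whole relocation
        free(raw);                        // source storage goes away undestroyed
        assert(*dst.ptr == 2);
    }
    assert(Unique.released == 2);
}
```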

An example of a type that is not trivially relocatable is a type 
with an internal pointer (such as `std::string` usually). It has 
to readjust that pointer on relocation.

Using a moved-from object is reasonable; C++ requires assignment 
to be valid, usually more/all operations are allowed for most 
types. D can require a moved-from object to be fully usable.

Using a relocated-from object(!) is fundamentally invalid. It is 
already destroyed (that is, conceptually destroyed, an actual 
destructor need not have run). Only taking the variable's address 
or reusing its storage (e.g. for placement new) remains valid.

For reference, the [Circle C++ language 
extension](https://github.com/seanbaxter/circle/blob/master/new-circle/README.md#relocate) 
implements relocation as a built-in operation.

Relocation and placement new make lifetimes non-lexical. Moving, 
on the other hand, does not disturb lexical lifetime.

The last paragraph of the quote again:
 This means that an `__rvalue(lvalue expression)` argument 
 destroys the expression upon function return. Attempts to 
 continue to use the lvalue expression are invalid. The compiler 
 won't always be able to detect a use after being passed to the 
 function, which means that the destructor for the object must 
 reset the object's contents to its initial value, or at least a 
 benign value.

That is probably not a good idea. It would render `__rvalue` a 
`@system` feature. Either the compiler can guarantee it’s safe to 
use or it can’t. Reliably recognizing use after destruction is 
probably impossible (definitely in `@system` code, and in purely 
`@safe` code, it at least requires difficult data-flow analysis). 
In C++, one is content saying it’s UB and moves on. D, with its 
focus on `@safe`, can’t do that (or rather shouldn’t, as it would 
make `__rvalue` immediately `@system`).

My suggestion: Require all D objects to be valid after being 
moved from (whatever the reason for a move was).

If you really want to explore relocation in the DIP, add 
`__relocate(x)` for that:
- Requires that the result is used (assigned to something, initializes 
something, or passed by value(!) as a function argument).
- Removes the destructor call of `x` if it is a local and no 
placement `new` has been used on it afterwards.
- `__relocate` could maybe be `@safe` in very constrained 
circumstances: The argument must be a local and there must not 
exist any references or aliases.

Jan 15
