digitalmars.D - move semantics are a mess

Manu <turkeyman gmail.com> writes:
I've been trying to do some initial work with copy ctors, and that
has led me to closely scrutinise the various construction flows, and
it's revealed a whole bunch of issues with move semantics.

The 2 most immediate issues are here:
https://issues.dlang.org/show_bug.cgi?id=19904
Those are surprising: the functions specifically designed to perform
move semantics stop move semantics in their tracks.

------------------------------

But it doesn't end there.
I tried to correct those issues by adding the appropriate
`forward!args` in the right places, but that causes chaos.

One serious issue I've noticed looks like this:
  void fun(Args...)(auto ref Args args)
  {
    auto b = T(forward!args);
  }

  fun(myT.move); // <- call with rvalue; move semantics desired

I've encountered various forms of this general pattern. So the trouble
here is, it tries to call a T constructor with `args`, and there are
three cases:
  1. args are actual constructor args -> call appropriate constructor
  2. args is a T lvalue -> call the copy constructor
  3. args is a T rvalue -> `b` should be move initialised, but you get
     a compile error because it tries to pass an rvalue to the copy
     constructor, which strictly receives a ref arg, and that call is
     invalid.
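
A minimal sketch of how the lvalue and rvalue cases are told apart inside an `auto ref` template, assuming a stand-in type `Payload` and a helper `bindingOf` (both names are hypothetical, not from druntime or Phobos):

```d
import std.algorithm.mutation : move;

// Hypothetical stand-in for the T discussed above.
struct Payload
{
    int value;
}

// Reports how a single `auto ref` argument was bound:
// by reference for an lvalue, by value for an rvalue.
string bindingOf(Arg)(auto ref Arg arg)
{
    return __traits(isRef, arg) ? "lvalue" : "rvalue";
}

unittest
{
    auto myT = Payload(1);
    assert(bindingOf(42) == "rvalue");        // case 1: an ordinary constructor argument
    assert(bindingOf(myT) == "lvalue");       // case 2: copy construction expected
    assert(bindingOf(myT.move) == "rvalue");  // case 3: move construction desired
}
```

The template knows which case it is in; the difficulty described above is that `T(forward!args)` has nowhere sensible to send case 3 when only a `ref` copy constructor exists.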

It leads to this:

struct S
{
  this(ref inout(S) copyFrom) inout {} // <- copy ctor
  this(S moveFrom) { this = moveFrom.move; } // <- !!! some kind of move constructor?
}

The rvalue ctor becomes necessary whenever a copy constructor exists,
otherwise lots of meta breaks.

------------------------------

Now, to top it off... if you inspect the move/emplace/forward
machinery in druntime, you'll notice that the implementations of those
functions are unbelievably hideous. They have to carefully determine
heaps of edge-case-ey junk, and then manually call copy ctors, or use
memcpy() and memset() to manually perform binary move operations.

I have determined that in move/emplace/moveEmplace/forward
constructions, where it does correctly perform moves (run through the
memcpy() path) several times in a row, the compiler does NOT optimise
these memcpy/memset calls away, and the memory just gets shuffled around
a whole bunch between initial construction and the resting location.
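
A rough sketch of what one such binary move looks like from the caller's side, using Phobos' `moveEmplace`. The type `Resource` and the helper `moveAndInspect` are hypothetical names for illustration; the reset of the source relies on the documented behaviour of `moveEmplace` for structs that define a destructor or postblit.

```d
import std.algorithm.mutation : moveEmplace;

// Hypothetical type; the destructor is what makes moveEmplace
// reset the source to Resource.init after the move.
struct Resource
{
    int id;
    ~this() {}
}

// Moves a Resource into uninitialised storage and returns
// [id now held by the target, id left behind in the source].
int[2] moveAndInspect(int id)
{
    auto source = Resource(id);
    Resource target = void;       // uninitialised destination
    moveEmplace(source, target);  // bitwise move; no copy ctor or postblit runs
    return [target.id, source.id];
}

unittest
{
    assert(moveAndInspect(42) == [42, 0]);
}
```

Each such call is one blit into the target plus one reset of the source; chaining several of them is where the repeated memory traffic described above comes from.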

Move semantics effectively don't work. They're a gross hack at best.

I suggest, the *language* desperately needs an `emplace` semantic,
something that can be recognised by the compiler and cascade through
chains of such work, where all that edge-case-ey crap can be properly
internalised.

It's really sad to say that C++'s rvalue (T&&) solution is
incomparably simpler than the mess we have to write to express `emplace`
or `move` in D.

It's embarrassing that the language can't express
emplace/move/forward/etc natively, and I think this should be
extremely high priority for a DIP, perhaps supported by the dlang
foundation.

__traits(emplace, T, ptr, args...) ?
Similar traits might exist for `forward` and friends too, and they
could be thinly wrapped in the functions in druntime.
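
For comparison, a small sketch of the existing library-level form that such a trait would sit underneath. It uses `std.conv.emplace` with a typed pointer; `Point` and `emplaceAndSum` are hypothetical names, and `__traits(emplace, ...)` itself is only the proposal above, not an existing feature.

```d
import std.conv : emplace;

// Hypothetical type with a user-defined constructor.
struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Constructs a Point in caller-provided storage and returns x + y.
int emplaceAndSum(int x, int y)
{
    Point storage = void;                // uninitialised memory
    Point* p = emplace(&storage, x, y);  // runs Point's constructor in place
    return p.x + p.y;
}

unittest
{
    assert(emplaceAndSum(3, 4) == 7);
}
```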
May 26 2019
Exil <Exil gmall.com> writes:
On Sunday, 26 May 2019 at 18:24:17 UTC, Manu wrote:
 <snip>
I think this has been brought up before: moving in D is 'built-in' and is done as just a copy, which is why there is that opMove DIP. In C++ it is essentially just a type. If you have a function that needs to forward its parameters to another function, in C++ you are just passing a reference, whereas in D that means making N copies, where N is the number of function calls you have to go through. Price of performance for "simplicity"?
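
A small sketch of the layering effect, assuming a hypothetical `Tracked` struct whose postblit counts copies. It can only observe postblit calls: the explicitly moved chain runs no postblit, but every `move` is still a bitwise blit of the object, which is the cost being discussed and is not visible to this counter.

```d
import std.algorithm.mutation : move;

// Hypothetical type that counts how often it is copied.
struct Tracked
{
    static int copies;  // number of postblit calls so far
    int payload;
    this(this) { ++copies; }
}

void sink(Tracked t) {}

// Each layer passes its parameter on as an lvalue, so each call copies.
void middleCopy(Tracked t) { sink(t); }
void outerCopy(Tracked t)  { middleCopy(t); }

// Each layer explicitly moves its parameter on instead.
void middleMove(Tracked t) { sink(move(t)); }
void outerMove(Tracked t)  { middleMove(move(t)); }

int copiesThroughCopyChain()
{
    Tracked.copies = 0;
    auto t = Tracked(1);
    outerCopy(t);
    return Tracked.copies;
}

int copiesThroughMoveChain()
{
    Tracked.copies = 0;
    auto t = Tracked(1);
    outerMove(move(t));
    return Tracked.copies;
}

unittest
{
    assert(copiesThroughCopyChain() == 3);  // one postblit per call layer
    assert(copiesThroughMoveChain() == 0);  // no postblits, only blits
}
```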
May 26 2019
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Sunday, 26 May 2019 at 19:19:12 UTC, Exil wrote:
 Think this has been brought up before, moving in D is 
 'built-in' and is done as just a copy. Which is why there is 
 that opMove DIP. In C++ it is just a type essentially. If you 
 have a function that needs to forward its parameters to 
 another function, in C++ where you are just passing a reference 
 in D that means making N copies, where N is the number of 
 function calls you have to go through. Price of performance for 
 "simplicity"?
Yes, but what I want to know is this: what happens if an exception is thrown right after the move? Does the ownership just disappear or is it moved back? There is also this thread: https://forum.dlang.org/thread/n8nm3u$1q1g$1 digitalmars.com?page=1
May 26 2019
Atila Neves <atila.neves gmail.com> writes:
On Sunday, 26 May 2019 at 18:24:17 UTC, Manu wrote:

<snip>
 I tried to correct those issues by adding the appropriate
 `forward!args` in the right places, but that causes chaos.

 One serious issue I've noticed looks like this:
   void fun(Args...)(auto ref Args args)
   {
     auto b = T(forward!args);
   }

   fun(myT.move); // <- call with rvalue; move semantics desired

 I've encountered various forms of this general pattern. So the trouble
 here is, it tries to call a T constructor with `args`, and there are
 three cases:
   1. args are actual constructor args -> call appropriate constructor
   2. args is a T lvalue -> call the copy constructor
   3. args is a T rvalue -> `b` should be move initialised, but you get
      a compile error because it tries to pass an rvalue to the copy
      constructor, which strictly receives a ref arg, and that call is
      invalid.
This is indeed problematic.
 It leads to this:

 struct S
 {
   this(ref inout(S) copyFrom) inout {} // <- copy ctor
   this(S moveFrom) { this = moveFrom.move; } // <- !!! some kind of move constructor?
 }
Unfortunately that doesn't even work - the way you wrote it above leads to a crash due to infinite recursion (see code below). Swapping the order of the definitions of the "move" ctor and the copy ctor instead results in a compiler error message telling the user that you can't define both (the ordering thing is a bug since fixed).

Code:
-------------
import std.stdio;

struct Foo {
    this(int i) @safe { writeln("Foo int ctor ", i); }
    this(ref const(Foo) other) @safe { writeln("Foo copy ctor"); }
    this(Foo other) @safe { writeln("Foo move ctor"); }
}

void fun(Args...)(auto ref Args args) {
    import std.functional;
    auto b = Foo(forward!args);
}

void main() {
    fun(77);
    auto f = Foo(33);
    fun(f); // infinite recursion here
    // fun(Foo(42)); // doesn't compile with only the copy ctor defined
}
-------------
 Move semantics effectively don't work. They're a gross hack at 
 best.
I think the language definition on this needs to be precisely defined.
May 27 2019
Manu <turkeyman gmail.com> writes:
On Mon, May 27, 2019 at 6:56 AM Atila Neves via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Sunday, 26 May 2019 at 18:24:17 UTC, Manu wrote:

 <snip>
 I tried to correct those issues by adding the appropriate
 `forward!args` in the right places, but that causes chaos.

 One serious issue I've noticed looks like this:
   void fun(Args...)(auto ref Args args)
   {
     auto b = T(forward!args);
   }

   fun(myT.move); // <- call with rvalue; move semantics desired

 I've encountered various forms of this general pattern. So the trouble
 here is, it tries to call a T constructor with `args`, and there are
 three cases:
   1. args are actual constructor args -> call appropriate constructor
   2. args is a T lvalue -> call the copy constructor
   3. args is a T rvalue -> `b` should be move initialised, but you get
      a compile error because it tries to pass an rvalue to the copy
      constructor, which strictly receives a ref arg, and that call is
      invalid.
This is indeed problematic.
 It leads to this:

 struct S
 {
   this(ref inout(S) copyFrom) inout {} // <- copy ctor
   this(S moveFrom) { this = moveFrom.move; } // <- !!! some kind of move constructor?
 }
Unfortunately that doesn't even work - the way you wrote it above leads to a crash due to infinite recursion
Haha, true facts. I just typed that in the email to make the point. The code I actually wrote uses `moveFrom.moveEmplace(this)`, which didn't trigger recursion (because memcpy).

It's all rubbish though! Even if this stuff worked, the sequence of memcpys and memsets from construction to emplacement is extremely lame! Calling a function like `emplace` introduces at least 2 unnecessary memcpys, `forward` introduces another at each call site, and if you are authoring a utility object or a container, there's at least 1 additional forwarding cycle for that layer... we'll easily hit common cases where move construction introduces 4-8 memcpys/memsets!
 Move semantics effectively don't work. They're a gross hack at
 best.
I think the language definition on this needs to be precisely defined.
I think this should be D's next big top-priority ticket. It's a gaping and embarrassing hole in D. It's another one of those things that I couldn't show to my colleagues with a straight face and expect them to take me (or D) seriously :/ I wonder if Andrei has room to consider this? He's probably the single best person for the job...
May 27 2019
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/19 2:24 PM, Manu wrote:
 The 2 most immediate issues are here:
 https://issues.dlang.org/show_bug.cgi?id=19904
I commented on that and filed a new issue.
May 27 2019
Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 26 May 2019 at 18:24:17 UTC, Manu wrote:
 I've been trying to do some initial work with copy ctors, and 
 that has led me to closely scrutinise the various construction 
 flows, and it's revealed a whole bunch of issues with move 
 semantics.
A cumbersome but working way of experimenting with the potential (performance) benefits of eliding copy construction of the elements in `args` would be to replace

  void emplaceRef(UT, Args...)(ref UT chunk, auto ref Args args)
  if (is(UT == core.internal.traits.Unqual!UT))
  {
      emplaceRef!(UT, UT)(chunk, args);
  }

with

  void emplaceRef(UT, Args...)(ref UT chunk, auto ref Args args)
  if (is(UT == core.internal.traits.Unqual!UT))
  {
      static if (args.length == 1)
      {
          emplaceRef!(UT, UT)(chunk, move(args[0]));
      }
      else static if (args.length == 2)
      {
          emplaceRef!(UT, UT)(chunk, move(args[0]), move(args[1]));
      }
      else ...up to args.length
  }

and similarly for the calls to `p.__ctor(args)` in `void emplaceRef(T, UT, Args...)(ref UT chunk, auto ref Args args)`.
May 29 2019