digitalmars.D - Constructing a class in-place

Shachar Shemesh <shachar weka.io> writes:

Forget the "why" for the moment.

T construct(T, ARGS...)(ARGS args) if( is(T==class) ) {
    auto buffer = new ubyte[__traits(classInstanceSize, T)];
    T cls = cast(T)buffer.ptr;

    // Is this really the best way to do this?
    buffer[] = cast(ubyte[])typeid(T).initializer()[];
    cls.__ctor(args);

    return cls;
}

My question is this: Is this the correct way to do it? There are steps 
here that seem kinda arbitrary, to say the least.

I am looking for something akin to C++'s placement new.

Thank you,
Shachar

Jul 25 2018

rikki cattermole <rikki cattermole.co.nz> writes:

On 25/07/2018 8:05 PM, Shachar Shemesh wrote:
 Forget the "why" for the moment.
 
 T construct(T, ARGS...)(ARGS args) if( is(T==class) ) {
     auto buffer = new ubyte[__traits(classInstanceSize, T)];
     T cls = cast(T)buffer.ptr;

Allocates storage for the whole instance: the hidden vtable pointer and 
monitor plus all fields (both public and private).

     // Is this really the best way to do this?
     buffer[] = cast(ubyte[])typeid(T).initializer()[];

Copies the default-initialized state. This is basically what .init does 
for a struct; the TypeInfo method was renamed to initializer() not too 
long ago because the old name conflicted with the .init property.

     cls.__ctor(args);

Calls a constructor that matches the given arguments.

     return cls;
 }
 
 My question is this: Is this the correct way to do it? There are steps 
 here that seem kinda arbitrary, to say the least.

Yes, and the steps are not arbitrary; read above :)

 I am looking for something akin to C++'s placement new.
 
 Thank you,
 Shachar

Standard solution[0].

[0] https://dlang.org/phobos/std_conv.html#.emplace.4
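
For reference, a minimal sketch of what the original `construct` could look like when written on top of emplace. `Point` is a hypothetical class used only for illustration, and the switch to a `void[]` buffer is an assumption made here (a `ubyte[]` block is not scanned by the GC for pointers, which matters once the class holds references):

```d
import std.conv : emplace;

// Hypothetical example class, not from the thread.
class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

T construct(T, Args...)(auto ref Args args) if (is(T == class))
{
    // Raw GC memory large enough for one instance (vtable pointer,
    // monitor and fields). void[] keeps the block scanned for pointers.
    auto buffer = new void[__traits(classInstanceSize, T)];

    // emplace copies typeid(T).initializer into the buffer and then
    // calls the constructor that matches args.
    return emplace!T(buffer, args);
}

unittest
{
    auto p = construct!Point(3, 4);
    assert(p.x == 3 && p.y == 4);
}
```

The unittest block runs with `dmd -unittest -main`. The GC does not know that this buffer holds a class instance, so it will not run the destructor; call `destroy` on the object when a destructor must run.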

Jul 25 2018

Johan Engelen <j j.nl> writes:

On Wednesday, 25 July 2018 at 08:11:59 UTC, rikki cattermole 
wrote:
 Standard solution[0].

 [0] https://dlang.org/phobos/std_conv.html#.emplace.4

Thanks for pointing to D's placement new. This is bad news for my 
devirtualization work; before, I thought D is in a better 
situation than C++, but now it seems we may be worse off.

Before I continue the work, I'll have to look closer at this 
(perhaps write an article about the situation in D, so more ppl 
can help and see what is going on). In short:
C++'s placement new can change the dynamic type of an object, 
which is problematic for devirtualization. However, in C++ the 
pointer passed to placement new may not be used afterwards (it'd 
be UB). This means that the code `A* a = new A(); a->foo(); 
a->foo();` is guaranteed to call the same function `A::foo` 
twice, because if the first call to `foo` would do a placement 
new on `a` (e.g. through `this`), the second call would be UB.
In D, we don't have placement new, great! And now, I learn that 
the _standard library_ _does_ have something that looks like 
placement new, but without extra guarantees of the spec that C++ 
has.
For some more info:
https://stackoverflow.com/a/49569305
https://stackoverflow.com/a/48164192

- Johan

Jul 26 2018

rikki cattermole <rikki cattermole.co.nz> writes:

On 27/07/2018 12:45 AM, Johan Engelen wrote:
 On Wednesday, 25 July 2018 at 08:11:59 UTC, rikki cattermole wrote:
 Standard solution[0].

 [0] https://dlang.org/phobos/std_conv.html#.emplace.4

 
 Thanks for pointing to D's placement new. This is bad news for my 
 devirtualization work; before, I thought D is in a better situation than 
 C++, but now it seems we may be worse off.
 
 Before I continue the work, I'll have to look closer at this (perhaps 
 write an article about the situation in D, so more ppl can help and see 
 what is going on). In short:
 C++'s placement new can change the dynamic type of an object, which is 
 problematic for devirtualization. However, in C++ the pointer passed to 
 placement new may not be used afterwards (it'd be UB). This means that 
 the code `A* a = new A(); a->foo(); a->foo();` is guaranteed to call the 
 same function `A::foo` twice, because if the first call to `foo` would 
 do a placement new on `a` (e.g. through `this`), the second call would 
 be UB.
 In D, we don't have placement new, great! And now, I learn that the 
 _standard library_ _does_ have something that looks like placement new, 
 but without extra guarantees of the spec that C++ has.
 For some more info:
 https://stackoverflow.com/a/49569305
 https://stackoverflow.com/a/48164192
 
 - Johan

Both of those links are related to structs, not classes (and the original 
post is about classes).
Given the content, I don't think it's related to our situation in D (but 
I could be wrong).

Classes in D are very "heavy" with their explicit vtable. Given that 
classes in C++ can act as a value and a reference type, you have to be 
pretty careful when comparing them.

Jul 26 2018

Johan Engelen <j j.nl> writes:

On Thursday, 26 July 2018 at 12:53:44 UTC, rikki cattermole wrote:
 On 27/07/2018 12:45 AM, Johan Engelen wrote:
 
 In D, we don't have placement new, great! And now, I learn 
 that the _standard library_ _does_ have something that looks 
 like placement new, but without extra guarantees of the spec 
 that C++ has.
 For some more info:
 https://stackoverflow.com/a/49569305
 https://stackoverflow.com/a/48164192
 
 - Johan

 Both of those links are related to structs, not classes (and 
 the original post is about classes).
 Given the content, I don't think it's related to our situation 
 in D (but I could be wrong).

Uhm, this has everything to do with our situation in D and with 
classes in D too. The links are of course about classes with and 
without vtable.

 Classes in D are very "heavy" with their explicit vtable. Given 
 that classes in C++ can act as a value and a reference type, 
 you have to be pretty careful when comparing them.

I'd appreciate it if you reread and think more about it. D's 
classes and C++'s structs/classes are the same in what is 
discussed here, and vtable is just one of the issues.

-Johan

Jul 28 2018

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Thursday, 26 July 2018 at 12:45:52 UTC, Johan Engelen wrote:
 On Wednesday, 25 July 2018 at 08:11:59 UTC, rikki cattermole 
 wrote:
 Standard solution[0].

 [0] https://dlang.org/phobos/std_conv.html#.emplace.4

 Thanks for pointing to D's placement new. This is bad news for 
 my devirtualization work; before, I thought D is in a better 
 situation than C++, but now it seems we may be worse off.

 Before I continue the work, I'll have to look closer at this 
 (perhaps write an article about the situation in D, so more ppl 
 can help and see what is going on). In short:
 C++'s placement new can change the dynamic type of an object, 
 which is problematic for devirtualization. However, in C++ the 
 pointer passed to placement new may not be used afterwards 
 (it'd be UB). This means that the code `A* a = new A(); 
 a->foo(); a->foo();` is guaranteed to call the same function 
 `A::foo` twice, because if the first call to `foo` would do a 
 placement new on `a` (e.g. through `this`), the second call 
 would be UB.
 In D, we don't have placement new, great! And now, I learn that 
 the _standard library_ _does_ have something that looks like 
 placement new, but without extra guarantees of the spec that 
 C++ has.
 For some more info:
 https://stackoverflow.com/a/49569305
 https://stackoverflow.com/a/48164192

 - Johan

Please excuse if my question is too naive, but how does this 
change anything?

The general pattern of using classes is:
1. Allocate memory. This can be either:
   1.a) implicit dynamic heap allocation done by the call to 
`GC.malloc` invoked via the implementation of the `new` operator 
for classes.

   1.b) explicit dynamic heap allocation via any allocator 
(`GC.malloc`, libc, std.experimental.allocator, etc.)
(1.b is also what `new` itself does in one special case: COM 
classes are allocated via malloc, see 
https://github.com/dlang/druntime/blob/cb5efa9854775c5a72acd6870083b16e5ebba369/src/rt/lifetime.d#L79)

   1.c) implicit stack allocation via `scope c = new Class();`
   1.d) implicit stack allocation via struct wrapper like `auto c 
= scoped!Class();`
   1.e) explicit stack allocation via 
`void[__traits(classInstanceSize, A)] buf = void;`
   1.f) explicit stack allocation via `void[] buf = 
alloca(__traits(classInstanceSize, A))[0 .. 
__traits(classInstanceSize, A)];`
   1.g) static allocation as a thread-local or global variable, or 
as part of one via an in-place buffer. To be honest, I'm not sure 
how compilers implement this today.

   1.h) Or any of the many variations of the above.

2. Explicit or implicit initialization of its vtable pointer, 
monitor (if the class is, or derives from, Object) and fields: 
`buf[] = typeid(Class).initializer[];`

3. The class constructor is invoked, which in turn may require 
calls to the constructors of one or more base classes.

...


4. The class is destroyed
   4.a) Implicitly via the GC
   4.b) Explicitly via `core.memory.__delete()`
   4.c) Explicitly via `destroy()`
   4.d) Explicitly via `std.experimental.allocator.dispose`, or 
any similar allocator wrapper.

5. The class instance memory may be freed.

At the end of the day, the destructor is called and potentially 
the memory is freed (e.g. if it's dynamically allocated). Nothing 
stops the same bytes from being reused for another object of a 
different type.
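
The five steps above can be sketched as follows, assuming libc `malloc` for step 1 and `destroy` for step 4. `Counter` and its `alive` bookkeeping are hypothetical and exist only to make the lifetime visible:

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// Hypothetical class; `alive` counts currently constructed instances.
class Counter
{
    static int alive;
    int value;
    this(int start) { value = start; ++alive; }
    ~this() { --alive; }
}

int lifecycle()
{
    enum size = __traits(classInstanceSize, Counter);

    // 1. Allocate memory explicitly (libc malloc).
    void* raw = malloc(size);
    assert(raw !is null, "out of memory");
    void[] buf = raw[0 .. size];

    // 2. and 3. emplace copies typeid(Counter).initializer into buf
    // (vtable pointer, monitor, field defaults), then runs the constructor.
    Counter c = emplace!Counter(buf, 41);
    c.value++;
    immutable result = c.value;

    // 4. End the object's lifetime explicitly; this runs ~this().
    destroy(c);

    // 5. Release the memory; the same bytes may later back another object.
    free(raw);
    return result;
}

unittest
{
    assert(lifecycle() == 42);
    assert(Counter.alive == 0);
}
```

`Counter` holds no references to GC memory, so the malloc'd block does not need to be registered with the GC in this sketch.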

<slightly-off-topic>
C++ has two liberties that D does not have, and neither should 
nor needs to have:
A. The C++ standard is very hand-wavy about the abstract machine 
on which C++ programs semantically execute, which gives its 
standard library special powers to implement features that can't 
be expressed in standard C++.

B. Its primary target audience of expert-only programmers can 
tolerate the extremely dense minefield of undefined behavior that 
the standard committee doesn't shy away from putting behind each 
corner in the name of easier development of 'sufficiently smart 
compilers'. I'm talking about things like 
https://en.cppreference.com/w/cpp/utility/launder which most C 
programmers (curiously, 'C != C++') would consider truly bjorked.

</slightly-off-topic>

D on the other hand is (or at least I'm hopeful that it is) 
moving away from giving magical powers to its runtime or standard 
library and is embracing the spirit of bare-bones systems 
programming, where the programmer is allowed or even encouraged to 
implement everything from scratch (cf. -betterC) when that is the 
most sensible option.
While C and C++ approach portability by abstracting the machine, 
D approaches portability by laying all the cards on the table 
and defining things rather than leaving them unspecified, or at 
least by documenting the implementation-defined behavior.

What I'm trying to say is that 'new' is not as special in D as it 
is in C++ (ironically, as the 'new'-ed objects are GC-ed in D, 
and what could be more magical in a language spec than a GC), and 
given the ongoing long-term @nogc campaign its use is even 
becoming discouraged.
Given this trend, the abundance of templates, the increasing 
availability of LTO, and library-defined allocation and 
object/resource management schemes, I think it's more and more 
likely that compilers will see the full picture of a class 
instance's lifetime. They should either treat 1, 2, 3, 4 and 5 
with C semantics (don't make any assumptions) or try to detect 
instances of 4 and 5 and mark the end of the object's lifetime in 
the compiler, to allow aliasing of its storage as a potentially 
different type.

Jul 26 2018

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Thursday, 26 July 2018 at 21:22:45 UTC, Petar Kirov 
[ZombineDev] wrote:
 [..]

 D on the other hand is (or at least I'm hopeful that it is) 
 moving away from giving magical powers to its runtime or 
 standard library and is embracing the spirit of bare-bones 
 systems programming, where the programmer is allowed or even 
 encouraged to implement everything from scratch (cf. -betterC) 
 when that is the most sensible option.
 While C and C++ approach portability by abstracting the 
 machine, D approaches portability by laying all the cards on 
 the table and defining things rather than leaving them 
 unspecified, or at least by documenting the 
 implementation-defined behavior.

 [..]

That is not to say that we shouldn't try to improve D's spec to 
allow more room for compiler optimizations (like the problem that 
you can't type instances of TypeInfo as fully read-only, because 
of the questionable feature of using them as an abundant pool of 
mutexes).
My point is that, at least in the near-term future, D compilers 
shouldn't assume they have a monopoly on object lifetime (as if 
there were only one right way), given that everybody in the 
community, so to speak, is busy making their own memory management 
scheme. Removing UB in this area at the cost of limiting 
compiler optimizations will at least make the @nogc transition 
period smoother for everyone. Though I'm sure there are plenty of 
other opportunities for tightening the spec.

Jul 26 2018

Johan Engelen <j j.nl> writes:

On Thursday, 26 July 2018 at 21:22:45 UTC, Petar Kirov 
[ZombineDev] wrote:
 Please excuse if my question is too naive, but how does this 
 change anything?

The main insight is to reason about things in terms of language 
semantics, not in terms of actual memory addresses and 
instructions as processed by the CPU. Then reread my post. I am 
not talking about disallowing storing different objects in the 
same physical hardware memory location: the language spec says 
nothing about that, and it shouldn't.

 Nothing stops the same bytes from being reused for another 
 object of a different type.

Here you are talking about physical memory bits, which is none of 
the language's business. So in practice, of course memory will be 
reused. But (most of) that should be transparent to D's language 
semantics.

 D on the other hand is (or at least I'm hopeful that it is) 
 moving away from giving magical powers to its runtime or 
 standard library and is embracing the spirit of bare-bones 
 systems programming, where the programmer is allowed or even 
 encouraged to implement everything from scratch (cf. -betterC) 
 when that is the most sensible option.

This is a matter of opinion I guess. But why wouldn't you just 
program in assembly? For example, things like 
`__traits(isReturnOnStack)` don't make sense in a high level 
language like D. Some machines don't have a stack. In other 
cases, the decision whether to return something on the stack can 
be delayed until optimization for better performance. I see you 
mention LTO; forget about _any_ optimization and high-level 
language features, if you care about controlling what the machine 
is doing.
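
As a small illustration of how target-dependent that trait is, here is a sketch that only prints what the compiler reports. `Big`, `makeBig` and `makeInt` are hypothetical names, and the reported values depend on the target ABI:

```d
// Hypothetical types and functions, for illustration only.
struct Big { long[8] data; }

Big makeBig() { return Big.init; }
int makeInt() { return 1; }

// true if the result is returned through a hidden pointer parameter
enum bool bigOnStack = __traits(isReturnOnStack, makeBig);
enum bool intOnStack = __traits(isReturnOnStack, makeInt);

// Printed at compile time; the answer is a property of the target ABI,
// not of the D source code.
pragma(msg, "Big returned on stack: ", bigOnStack);
pragma(msg, "int returned on stack: ", intOnStack);
```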

 While C and C++ approach portability by abstracting the 
 machine, D approaches portability by laying all the cards on 
 the table and defining things rather than leaving them 
 unspecified, or at least by documenting the 
 implementation-defined behavior.

The kind of low-level control that you want is not what D should 
give (and doesn't). With "laying cards on the table" you mean 
specifying language semantics in hardware behavior? Because the 
strength of most languages is in _not_ doing that. (some of the 
strengths that'd be lost: cross platform, cross architecture, 
performance)

Note that this is not only about optimization. It's about being 
able to reason sensibly about code. You are advocating this?
```
class A { void foo(); } // no virtual keyword in D: class methods are virtual by default
class B : A { ... }
class C : A { ... }

void bar(A a) {
    a.foo(); // type of a is B, but turns it into C
    a.foo(); // type is now C, call different foo
}
```
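
A compilable sketch of that scenario, assuming `B.foo` re-constructs its own storage as a `C` through std.conv.emplace and that `B` and `C` have the same instance size. The return values are added here only to make the dispatch observable; whether such code should be valid D at all is exactly what is being debated:

```d
import std.conv : emplace;

class A { string foo() { return "A"; } }

class B : A
{
    override string foo()
    {
        // Re-construct this very storage as a C, changing the dynamic type.
        enum size = __traits(classInstanceSize, C);
        static assert(size == __traits(classInstanceSize, B));
        emplace!C((cast(void*) this)[0 .. size]);
        return "B";
    }
}

class C : A { override string foo() { return "C"; } }

string[] bar(A a)
{
    // Two calls through the same reference, evaluated left to right.
    return [a.foo(), a.foo()];
}

unittest
{
    // Holds when the compiler performs a real virtual dispatch for both
    // calls; a devirtualizing compiler could legitimately want ["B", "B"].
    assert(bar(new B) == ["B", "C"]);
}
```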

- Johan

Jul 28 2018

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/26/18 8:45 AM, Johan Engelen wrote:
 On Wednesday, 25 July 2018 at 08:11:59 UTC, rikki cattermole wrote:
 Standard solution[0].

 [0] https://dlang.org/phobos/std_conv.html#.emplace.4

 
 Thanks for pointing to D's placement new. This is bad news for my 
 devirtualization work; before, I thought D is in a better situation than 
 C++, but now it seems we may be worse off.
 
 Before I continue the work, I'll have to look closer at this (perhaps 
 write an article about the situation in D, so more ppl can help and see 
 what is going on). In short:
 C++'s placement new can change the dynamic type of an object, which is 
 problematic for devirtualization. However, in C++ the pointer passed to 
 placement new may not be used afterwards (it'd be UB). This means that 
 the code `A* a = new A(); a->foo(); a->foo();` is guaranteed to call the 
 same function `A::foo` twice, because if the first call to `foo` would 
 do a placement new on `a` (e.g. through `this`), the second call would 
 be UB.
 In D, we don't have placement new, great! And now, I learn that the 
 _standard library_ _does_ have something that looks like placement new, 
 but without extra guarantees of the spec that C++ has.
 For some more info:
 https://stackoverflow.com/a/49569305
 https://stackoverflow.com/a/48164192

Reading those items, though, doesn't emplace effectively do what 
std::launder does in C++, since it's crossing a function boundary? Is 
std::launder a special part of the spec, or does it just return its 
parameter to remove the potential for UB?

-Steve

Aug 01 2018

Kagamin <spam here.lot> writes:

On Thursday, 26 July 2018 at 12:45:52 UTC, Johan Engelen wrote:
 Thanks for pointing to D's placement new. This is bad news for 
 my devirtualization work; before, I thought D is in a better 
 situation than C++, but now it seems we may be worse off.

Just say that devirtualization is incompatible with 
method-changing reemplace. I don't think this imposes notable 
portability restrictions.

Aug 02 2018
