digitalmars.D - Help with DMD internals

Manu <turkeyman gmail.com> writes:
I've started digging at some surfac-ey extern(C++) issues.

First up, I desperately want a document that describes D's precise
construction/destruction rules; there are a bunch of generated
functions: which ones get called, in what order, and under what conditions?

Construction:
  What is the order of operations?
  Where and when is init applied before calling constructors?
  What about super-constructors? Aggregate member constructors? When
and where-from are they called?
  Is this all wrapped in an outer '__xctor' function that does the
whole job? Is it all rolled into __ctor? Or is it just a bunch of
loose operations that appear at the call-site?
  I want a function that wraps initialisation, super construction,
aggregate construction, and local construction; similar to what __xdtor
does for destruction. That function would match C++, and that would be the
logical point of interaction with C++ (mangling).

Destruction:
  What's the precise story with __dtor, __xdtor, __fieldDtor?
  Is __xdtor **always** present?
  extern(C++) seems to have bugs(?) with __xdtor...
  Is re-initialisation to 'init' part of destruction, or is it a
separate post-process? (I feel it's a post-process)


Regarding extern(C++), I've started with trying to mangle correctly,
but then next will come trying to match C++ semantics.

Issue 1: extern(C++) classes have broken __xdtor. I observe
extern(C++) classes with no destructor will generate an __xdtor that
correctly calls aggregate destruction. If you then add a destructor
(__dtor), __xdtor will call that function directly (or maybe it just
becomes an alias?), and aggregate destruction will no longer occur.

Issue 2: assuming the above is fixed, __xdtor matches C++ expectation
for destruction. I intend to change the mangling such that __xdtor
mangles as the C++ symbol, and not __dtor.

Issue 3: If the user specifies an extern(C++) destructor *prototype*
with no implementation (ie, extern link to C++ destructor), it needs a
hack to re-interpret as a prototype for an extern __xdtor, rather than
__dtor (and __dtor can happily not exist, or just alias). C++
destructors perform a full-destruction, which is equivalent to __xdtor
from D's perspective.

That should lead to destruction semantics matching C++.


Matching construction semantics is a little bit fiddly too. It might
need special-casing.
D doesn't seem to wrap up a full construction into a single nice
function like C++ does... or does it? I'm struggling to understand D
construction from end-to-end.


Let's not talk about passing by-val... yet.
May 20 2018
Walter Bright <newshound2 digitalmars.com> writes:
On 5/20/2018 12:28 PM, Manu wrote:
 I've started digging at some surfac-ey extern(C++) issues.
I've improved the definition of how construction works, such as when the .init happens, in the spec. https://dlang.org/spec/class.html#constructors
 Is __xdtor **always** present?
No. If it's POD, it is not. When it is added, it is added as an AliasDeclaration, not a FuncDeclaration. See buildDtor() in clone.d, which is where it is created. You can also see in that function how _ArrayDtor and __fieldDtor are built on demand, and the order in which they are called.
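
The POD rule and the call order described above can be checked with a small sketch. This is an illustration only, not compiler code: the type names `Pod`, `Member`, and `Outer`, the `dtorLog` array, and the `destructionOrder` helper are all hypothetical, and the sketch covers plain D structs rather than extern(C++) classes. It assumes current DMD behaviour, where the user destructor body runs first and field destructors follow in reverse declaration order.

```d
import std.stdio : writeln;

string[] dtorLog; // hypothetical module-level record of destructor calls

struct Pod { int x; } // no destructor anywhere, so no __xdtor is generated

struct Member
{
    string name;
    ~this() { dtorLog ~= "~Member " ~ name; }
}

struct Outer
{
    Member a;
    Member b;
    ~this() { dtorLog ~= "~Outer"; } // this body is the __dtor part
}

// Destroys one Outer at scope exit and returns the recorded call order.
string[] destructionOrder()
{
    dtorLog = null;
    {
        Outer o;
        o.a.name = "a";
        o.b.name = "b";
    } // the full destructor (__xdtor) runs here
    return dtorLog;
}

void main()
{
    static assert(!__traits(hasMember, Pod, "__xdtor"));
    static assert(__traits(hasMember, Outer, "__dtor"));
    static assert(__traits(hasMember, Outer, "__xdtor"));
    writeln(destructionOrder());
}
```

The `static assert` lines only probe whether the members exist; they say nothing about how the compiler represents them internally.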
May 20 2018
Manu <turkeyman gmail.com> writes:
On 20 May 2018 at 13:25, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
 I've started digging at some surfac-ey extern(C++) issues.
I've improved the definition of how construction works, such as when the .init happens, in the spec. https://dlang.org/spec/class.html#constructors
 Is __xdtor **always** present?
No. If it's POD, it is not. When it is added, it is added as an AliasDeclaration, not a FuncDeclaration. See buildDtor() in clone.d, which is where it is created. You can also see in that function how _ArrayDtor and __fieldDtor are built on demand, and the order in which they are called.
Yup, I'm onto it. Check out my PR!
May 23 2018
Walter Bright <newshound2 digitalmars.com> writes:
On 5/20/2018 12:28 PM, Manu wrote:
    Is re-initialisation to 'init' part of destruction,
No.
 or is it a
 separate post-process? (I feel it's a post-process)
Yes, and only for delete.
May 20 2018
Manu <turkeyman gmail.com> writes:
On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
    Is re-initialisation to 'init' part of destruction,
No.
 or is it a
 separate post-process? (I feel it's a post-process)
Yes, and only for delete.
destroy() also seems to do it.
May 20 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
     Is re-initialisation to 'init' part of destruction,
No.
 or is it a
 separate post-process? (I feel it's a post-process)
Yes, and only for delete.
Why? This doesn't make a lot of sense, since delete is freeing the memory, it shouldn't matter what state the memory is left in. I would argue that should only be done in debug mode, and actually, I wonder if some other kind of sentinel memory pattern should be written instead.
 
 destroy() also seems to do it.
 
Yes, because destroy leaves the memory available for reuse. This is intentional. -Steve
May 21 2018
Manu <turkeyman gmail.com> writes:
On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
     Is re-initialisation to 'init' part of destruction,
No.
 or is it a
 separate post-process? (I feel it's a post-process)
Yes, and only for delete.
Why? This doesn't make a lot of sense, since delete is freeing the memory, it shouldn't matter what state the memory is left in. I would argue that should only be done in debug mode, and actually, I wonder if some other kind of sentinel memory pattern should be written instead.
 destroy() also seems to do it.
Yes, because destroy leaves the memory available for reuse. This is intentional.
It's uninitialised memory though. Wouldn't the same argument you made above also apply?
May 21 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 Yes, and only for delete.
Why? This doesn't make a lot of sense, since delete is freeing the memory, it shouldn't matter what state the memory is left in. I would argue that should only be done in debug mode, and actually, I wonder if some other kind of sentinel memory pattern should be written instead.
 destroy() also seems to do it.
Yes, because destroy leaves the memory available for reuse. This is intentional.
It's uninitialised memory though, Wouldn't the same argument you made above also apply?
Uninitialized, but allocated and usable. The difference between this and delete is that delete is going to unallocate that memory. The next time it's allocated, it will be overwritten with an init pattern for the new type.

Basically, in D when you have access to memory, it should be in a valid state.

This is a valid usage in D:

struct S
{
    this(...) {...}
}

auto s = S(a, b, c);
destroy(s);

// s is now a fully initialized S ready to be constructed
s.__ctor(a, b, c);

-Steve
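
A compilable variant of the snippet above, as a minimal sketch: the field `x`, its default value 42, and the helper names `valueAfterDestroy` and `valueAfterReconstruct` are assumptions added for illustration. It relies on `destroy` running the destructor and then resetting a struct to its `.init` value, which is the behaviour discussed in this thread.

```d
struct S
{
    int x = 42;          // non-zero default so the reset to .init is visible
    this(int v) { x = v; }
    ~this() { }          // presence of a destructor is all that matters here
}

// Returns the field value observed right after destroy().
int valueAfterDestroy()
{
    auto s = S(7);
    destroy(s);          // runs ~this(), then blits S.init over s
    return s.x;
}

// Re-runs the constructor on the destroyed (now .init) object.
int valueAfterReconstruct()
{
    auto s = S(7);
    destroy(s);
    s.__ctor(9);         // constructor starts from the init state
    return s.x;
}

void main()
{
    assert(valueAfterDestroy() == 42);
    assert(valueAfterReconstruct() == 9);
}
```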
May 21 2018
Manu <turkeyman gmail.com> writes:
On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 Yes, and only for delete.
Why? This doesn't make a lot of sense, since delete is freeing the memory, it shouldn't matter what state the memory is left in. I would argue that should only be done in debug mode, and actually, I wonder if some other kind of sentinel memory pattern should be written instead.
 destroy() also seems to do it.
Yes, because destroy leaves the memory available for reuse. This is intentional.
It's uninitialised memory though, Wouldn't the same argument you made above also apply?
Uninitialized, but allocated and usable. The difference between this and delete is that delete is going to unallocate that memory. The next time it's allocated, it will be overwritten with an init pattern for the new type. Basically, in D when you have access to memory, it should be in a valid state.
Why is it reasonable to expect that the buffer after `destroy()` should be 'valid'? Who's to say that the init state is 'valid'? Surely it's not reasonable to just start using a class after calling destroy(), as if it's a newly constructed class? Surely we should expect that dead buffer is emplaced with a new construction?
 This is a valid usage in D:

 struct S
 {
    this(...) {...}
 }

 auto s = S(a, b, c);
 destroy(s);

 // s is now a fully initialized S ready to be constructed
 s.__ctor(a, b, c);
And you've demonstrated exactly what I'm saying... there's no need for destroy() to return initialised memory here. It could return 0xfeeefeee memory or something (for debugging purposes).
May 21 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 Yes, and only for delete.
Why? This doesn't make a lot of sense, since delete is freeing the memory, it shouldn't matter what state the memory is left in. I would argue that should only be done in debug mode, and actually, I wonder if some other kind of sentinel memory pattern should be written instead.
 destroy() also seems to do it.
Yes, because destroy leaves the memory available for reuse. This is intentional.
It's uninitialised memory though, Wouldn't the same argument you made above also apply?
Uninitialized, but allocated and usable. The difference between this and delete is that delete is going to unallocate that memory. The next time it's allocated, it will be overwritten with an init pattern for the new type. Basically, in D when you have access to memory, it should be in a valid state.
Why is it reasonable to expect that the buffer after `destroy()` should be 'valid'? Who's to say that the init state is 'valid'?
It's valid in that it's not garbage, and that it's not full of dangling pointers.
 
 Surely it's not reasonable to just start using a class after calling
 destroy(), as if it's a newly constructed class?
No, of course not. For classes, actually the vtable pointer is zeroed out, so any usage gets a segfault. You'd have to fix that before using again. But potentially, yes, you can use it if you call a ctor on it. Fun fact: originally destroy (clear at the time) was going to call the default ctor on a class instance so it could be used again.
 Surely we should expect that dead buffer is emplaced with a new construction?
 
 This is a valid usage in D:

 struct S
 {
     this(...) {...}
 }

 auto s = S(a, b, c);
 destroy(s);

 // s is now a fully initialized S ready to be constructed
 s.__ctor(a, b, c);
And you've demonstrated exactly what I'm saying... there's no need for destroy() to return initialised memory here. It could return 0xfeeefeee memory or something (for debugging purposes).
No, the ctor assumes all values are in init state. If that's not the case, then the code will crash or corrupt memory. -Steve
May 21 2018
Manu <turkeyman gmail.com> writes:
On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.
Why is it reasonable to expect that the buffer after `destroy()` should be 'valid'? Who's to say that the init state is 'valid'?
It's valid in that it's not garbage, and that it's not full of dangling pointers.
Ah! Dangling pointers... that might conservatively retain garbage. That's a good reason.
 And you've demonstrated exactly what I'm saying... there's no need for
 destroy() to return initialised memory here. It could return
 0xfeeefeee memory or something (for debugging purposes).
No, the ctor assumes all values are in init state. If that's not the case, then the code will crash or corrupt memory.
...so, assign init immediately before construction? Rather than assigning init on destruction, which makes destroy more expensive, and it may or may not be re-used (most likely not). Why isn't assigning init built into the constructor?
May 21 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/22/18 1:01 AM, Manu wrote:
 On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.
Why is it reasonable to expect that the buffer after `destroy()` should be 'valid'? Who's to say that the init state is 'valid'?
It's valid in that it's not garbage, and that it's not full of dangling pointers.
Ah! Dangling pointers... that might conservatively retain garbage. That's a good reason.
Well, I was thinking more along the lines of accessing memory that is no longer valid :) A destructor can, for instance, free C malloced memory.
 And you've demonstrated exactly what I'm saying... there's no need for
 destroy() to return initialised memory here. It could return
 0xfeeefeee memory or something (for debugging purposes).
No, the ctor assumes all values are in init state. If that's not the case, then the code will crash or corrupt memory.
....so, assign init immediately before construction? Rather than assigning init on destruction, which makes destroy more expensive, and it may or may not be re-used (most likely not). Why isn't assigning init built into the constructor?
It provides for slight efficiencies. For example, if you allocate an array of 100 structs, whose init value is all 0, you can do one memset on the entire array. When you split the init blitting from the constructor, you have more options. Note: you can "destroy" without blitting init, all the tools are there (the same ones destroy uses, it's not a magic function). Might be a good addition -- unsafeDestroy. -Steve
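
The split between init blitting and construction can be sketched roughly as follows. This is a hedged illustration, not how druntime implements `destroy`: the struct `S`, its field, and the `lifecycle` helper are hypothetical, and calling `__ctor` and `__xdtor` directly is used only because those are the generated members mentioned in this thread. `initializeAll` is the Phobos function from `std.algorithm.mutation` that writes `.init` over a range of uninitialised elements.

```d
import core.stdc.stdlib : malloc, free;
import std.algorithm.mutation : initializeAll;

struct S
{
    int value = 7;               // non-zero .init so each stage is observable
    this(int v) { value = v; }
    ~this() { value = -1; }      // leaves a marker instead of the init state
}

// Returns arr[0].value after bulk init, after construction, after raw destruction.
int[3] lifecycle()
{
    enum n = 4;
    S* raw = cast(S*) malloc(S.sizeof * n);
    assert(raw !is null);
    scope (exit) free(raw);
    S[] arr = raw[0 .. n];

    initializeAll(arr);          // one init pass over the whole buffer
    int afterInit = arr[0].value;

    foreach (i, ref s; arr)
        s.__ctor(cast(int) i + 10);   // constructors run on init-state memory
    int afterCtor = arr[0].value;

    foreach (ref s; arr)
        s.__xdtor();             // destruct without blitting .init back
    int afterDtor = arr[0].value;

    return [afterInit, afterCtor, afterDtor];
}

void main()
{
    assert(lifecycle() == [7, 10, -1]);
}
```

Using `malloc` here keeps the GC and scope-exit destruction out of the picture, so each element is destructed exactly once.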
May 22 2018
Manu <turkeyman gmail.com> writes:
On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this
 and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new
 type.

 Basically, in D when you have access to memory, it should be in a valid
 state.
Why is it reasonable to expect that the buffer after `destroy()` should be 'valid'? Who's to say that the init state is 'valid'?
It's valid in that it's not garbage, and that it's not full of dangling pointers.
Ah! Dangling pointers... that might conservatively retain garbage. That's a good reason.
Well, I was thinking more along the lines of accessing memory that is no longer valid :) A destructor can, for instance, free C malloced memory.
The memory post-destruction is invalid... nobody will access it. Accessing it should be undefined behaviour until such a time as the memory is re-constructed with a new instance...
May 22 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/22/18 9:48 PM, Manu wrote:
 On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.
Well, I was thinking more along the lines of accessing memory that is no longer valid :) A destructor can, for instance, free C malloced memory.
The memory post-destruction is invalid... nobody will access it. Accessing it should be undefined behaviour until such a time as the memory is re-constructed with a new instance...
This particular sub-thread was about using destroy on something, not freeing the memory. The memory is not invalid.

To clarify, I'm talking about something like this:

struct S
{
    int *foo;
    this(int x) { foo = cast(int*)malloc(int.sizeof); }
    ~this() { free(foo); }
}

auto s = S(1);
destroy(s);
*s.foo = 5; // oops

If s.foo is reset to null, then you get a nice predictable segfault. If not, then you get memory corruption. Sure you can just say "undefined behavior", but I'd rather the defined behavior.

-Steve
May 23 2018
Manu <turkeyman gmail.com> writes:
On 23 May 2018 at 06:36, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/22/18 9:48 PM, Manu wrote:
 On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.
Well, I was thinking more along the lines of accessing memory that is no longer valid :) A destructor can, for instance, free C malloced memory.
The memory post-destruction is invalid... nobody will access it. Accessing it should be undefined behaviour until such a time as the memory is re-constructed with a new instance...
This particular sub-thread was about using destroy on something, not freeing the memory. The memory is not invalid.

To clarify, I'm talking about something like this:

struct S
{
    int *foo;
    this(int x) { foo = cast(int*)malloc(int.sizeof); }
    ~this() { free(foo); }
}

auto s = S(1);
destroy(s);
*s.foo = 5; // oops

If s.foo is reset to null, then you get a nice predictable segfault. If not, then you get memory corruption. Sure you can just say "undefined behavior", but I'd rather the defined behavior.
Yeah no... I'm happy with "accessing uninitialised memory is undefined behaviour". If you want defined behaviour, initialise the memory!
May 23 2018