digitalmars.D - Help with DMD internals

Manu <turkeyman gmail.com> writes:

I've started digging at some surfac-ey extern(C++) issues.

First up, I desperately want a document that describes D's precise
construction/destruction rules; there are a bunch of generated
functions: when do they get called, in what order, and under what
conditions?

Construction:
  What is the order of operations?
  Where and when is init applied before calling constructors?
  What about super-constructors? Aggregate member constructors? When
and where from are they called?
  Is this all wrapped in an outer '__xctor' function that does the
whole job? Is it all rolled into __ctor? Or is it just a bunch of
loose operations that appear at the call-site?
  I want a function that wraps initialisation, super construction,
aggregate construction, and local construction; similar to what __xdtor
is for destruction. That function would match C++, and that would be the
logical point of interaction with C++ (mangling).

Destruction:
  What's the precise story with __dtor, __xdtor, __fieldDtor?
  Is __xdtor **always** present?
  extern(C++) seems to have bugs(?) with __xdtor...
  Is re-initialisation to 'init' part of destruction, or is it a
separate post-process? (I feel it's a post-process)


Regarding extern(C++), I've started with trying to mangle correctly,
but then next will come trying to match C++ semantics.

Issue 1: extern(C++) classes have broken __xdtor. I observe
extern(C++) classes with no destructor will generate an __xdtor that
correctly calls aggregate destruction. If you then add a destructor
(__dtor), __xdtor will call that function directly (or maybe it just
becomes an alias?), and aggregate destruction will no longer occur.

Issue 2: assuming the above is fixed, __xdtor matches C++ expectation
for destruction. I intend to change the mangling such that __xdtor
mangles as the C++ symbol, and not __dtor.

Issue 3: If the user specifies an extern(C++) destructor *prototype*
with no implementation (ie, extern link to C++ destructor), it needs a
hack to re-interpret as a prototype for an extern __xdtor, rather than
__dtor (and __dtor can happily not exist, or just alias). C++
destructors perform a full-destruction, which is equivalent to __xdtor
from D's perspective.

That should lead to destruction semantics matching C++.


Matching construction semantics is a little bit fiddly too. It might
need special-casing.
D doesn't seem to wrap up a full construction into a single nice
function like C++ does... or does it? I'm struggling to understand D
construction from end-to-end.


Let's not talk about passing by-val... yet.

May 20 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 5/20/2018 12:28 PM, Manu wrote:
 I've started digging at some surfac-ey extern(C++) issues.

I've improved the definition of how construction works, such as when the .init 
happens, in the spec.

https://dlang.org/spec/class.html#constructors

 Is __xdtor **always** present?

No. If it's POD, it is not. When it is added, it is added as an 
AliasDeclaration, not a FuncDeclaration. See buildDtor() in clone.d, which is 
where it is created.

You can also see in that function how _ArrayDtor and __fieldDtor are built on 
demand, and the order in which they are called.
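
A quick way to observe this from user code is to query the generated members with __traits(hasMember, ...) and to log the order in which destructors run. This is only a minimal sketch: the type names Pod, Member and Outer and the global dtorLog are made up for illustration, and the exact set of generated symbols may differ between compiler versions.

```d
string[] dtorLog; // hypothetical log, records destructor calls in order

struct Pod { int x; }   // no destructor anywhere

struct Member
{
    string name;
    ~this() { dtorLog ~= "Member " ~ name; }
}

struct Outer
{
    Member a;
    Member b;
    ~this() { dtorLog ~= "Outer"; }
}

// POD: no __xdtor is generated; types that need destruction get one.
static assert(!__traits(hasMember, Pod, "__xdtor"));
static assert( __traits(hasMember, Member, "__xdtor"));
static assert( __traits(hasMember, Outer, "__xdtor"));

string[] destructionOrder()
{
    dtorLog = null;
    {
        Outer o;
        o.a.name = "a";
        o.b.name = "b";
    } // o goes out of scope here, so its __xdtor runs
    return dtorLog;
}

void main()
{
    // User destructor body first, then fields in reverse declaration order.
    assert(destructionOrder() == ["Outer", "Member b", "Member a"]);
}
```

The order matches what C++ does for a destructor: the body of the outer destructor runs first, then the members are destroyed in reverse order of declaration.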

May 20 2018

Manu <turkeyman gmail.com> writes:

On 20 May 2018 at 13:25, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
 I've started digging at some surfac-ey extern(C++) issues.


 I've improved the definition of how construction works, such as when the
 .init happens, in the spec.

 https://dlang.org/spec/class.html#constructors

 Is __xdtor **always** present?

 No. If it's POD, it is not. When it is added, it is added as an
 AliasDeclaration, not a FuncDeclaration. See buildDtor() in clone.d, which
 is where it is created.

 You can also see in that function how _ArrayDtor and __fieldDtor are built
 on demand, and the order in which they are called.

Yup, I'm onto it. Check out my PR!

May 23 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 5/20/2018 12:28 PM, Manu wrote:
    Is re-initialisation to 'init' part of destruction,

No.

 or is it a
 separate post-process? (I feel it's a post-process)

Yes, and only for delete.

May 20 2018

Manu <turkeyman gmail.com> writes:

On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
    Is re-initialisation to 'init' part of destruction,


 No.

 or is it a
 separate post-process? (I feel it's a post-process)


 Yes, and only for delete.

destroy() also seems to do it.

May 20 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
     Is re-initialisation to 'init' part of destruction,


 No.

 or is it a
 separate post-process? (I feel it's a post-process)


 Yes, and only for delete.


Why? This doesn't make a lot of sense, since delete is freeing the 
memory, it shouldn't matter what state the memory is left in. I would 
argue that should only be done in debug mode, and actually, I wonder if 
some other kind of sentinel memory pattern should be written instead.

 
 destroy() also seems to do it.
 

Yes, because destroy leaves the memory available for reuse. This is 
intentional.

-Steve

May 21 2018

Manu <turkeyman gmail.com> writes:

On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/2018 12:28 PM, Manu wrote:
     Is re-initialisation to 'init' part of destruction,



 No.

 or is it a
 separate post-process? (I feel it's a post-process)



 Yes, and only for delete.



 Why? This doesn't make a lot of sense, since delete is freeing the memory,
 it shouldn't matter what state the memory is left in. I would argue that
 should only be done in debug mode, and actually, I wonder if some other kind
 of sentinel memory pattern should be written instead.

 destroy() also seems to do it.

 Yes, because destroy leaves the memory available for reuse. This is
 intentional.

It's uninitialised memory though. Wouldn't the same argument you made
above also apply?

May 21 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d



 Yes, and only for delete.



 Why? This doesn't make a lot of sense, since delete is freeing the memory,
 it shouldn't matter what state the memory is left in. I would argue that
 should only be done in debug mode, and actually, I wonder if some other kind
 of sentinel memory pattern should be written instead.

 destroy() also seems to do it.

 Yes, because destroy leaves the memory available for reuse. This is
 intentional.

 
 It's uninitialised memory though. Wouldn't the same argument you made
 above also apply?

Uninitialized, but allocated and usable. The difference between this and 
delete is that delete is going to unallocate that memory. The next time 
it's allocated, it will be overwritten with an init pattern for the new 
type.

Basically, in D when you have access to memory, it should be in a valid 
state.

This is a valid usage in D:

struct S
{
    this(...) {...}
}

auto s = S(a, b, c);
destroy(s);

// s is now a fully initialized S ready to be constructed
s.__ctor(a, b, c);
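
A self-contained variant of the snippet above, as a sketch only: the fields a and b, the non-zero default and the concrete argument values are invented for illustration. The asserts reflect the default behaviour of destroy() on a struct, which is to run the destructor and then blit S.init over the instance.

```d
struct S
{
    int a = 7;   // non-zero default, so S.init differs from zeroed memory
    int b;
    this(int a, int b) { this.a = a; this.b = b; }
    ~this() { a = -1; b = -1; }   // scribble over the fields
}

int[] observe()
{
    int[] seen;
    auto s = S(1, 2);
    destroy(s);             // destructor runs, then s is reset to S.init
    seen ~= [s.a, s.b];
    s.__ctor(3, 4);         // construct again in place
    seen ~= [s.a, s.b];
    return seen;
}

void main()
{
    assert(observe() == [7, 0, 3, 4]);
}
```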

-Steve

May 21 2018

Manu <turkeyman gmail.com> writes:

On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d




 Yes, and only for delete.




 Why? This doesn't make a lot of sense, since delete is freeing the
 memory,
 it shouldn't matter what state the memory is left in. I would argue that
 should only be done in debug mode, and actually, I wonder if some other
 kind
 of sentinel memory pattern should be written instead.

 destroy() also seems to do it.

 Yes, because destroy leaves the memory available for reuse. This is
 intentional.


 It's uninitialised memory though. Wouldn't the same argument you made
 above also apply?


 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.

Why is it reasonable to expect that the buffer after `destroy()`
should be 'valid'?
Who's to say that the init state is 'valid'?

Surely it's not reasonable to just start using a class after calling
destroy(), as if it's a newly constructed class?
Surely we should expect that the dead buffer is emplaced with a new construction?

 This is a valid usage in D:

 struct S
 {
    this(...) {...}
 }

 auto s = S(a, b, c);
 destroy(s);

 // s is now a fully initialized S ready to be constructed
 s.__ctor(a, b, c);

And you've demonstrated exactly what I'm saying... there's no need for
destroy() to return initialised memory here. It could return
0xfeeefeee memory or something (for debugging purposes).

May 21 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:18 PM, Manu wrote:
 On 21 May 2018 at 06:10, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/20/18 9:49 PM, Manu wrote:
 On 20 May 2018 at 17:14, Walter Bright via Digitalmars-d




 Yes, and only for delete.




 Why? This doesn't make a lot of sense, since delete is freeing the
 memory,
 it shouldn't matter what state the memory is left in. I would argue that
 should only be done in debug mode, and actually, I wonder if some other
 kind
 of sentinel memory pattern should be written instead.

 destroy() also seems to do it.

 Yes, because destroy leaves the memory available for reuse. This is
 intentional.


 It's uninitialised memory though. Wouldn't the same argument you made
 above also apply?


 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.

 
 Why is it reasonable to expect that the buffer after `destroy()`
 should be 'valid'?
 Who's to say that the init state is 'valid'?

It's valid in that it's not garbage, and that it's not full of dangling 
pointers.

 
 Surely it's not reasonable to just start using a class after calling
 destroy(), as if it's a newly constructed class?

No, of course not. For classes, actually the vtable is zeroed out, so 
any usage gets a segfault. You'd have to fix that before using again. 
But potentially, yes, you can use it if you call a ctor on it.
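
The zeroed vtable pointer can be observed directly. The following is a rough sketch that assumes the usual D class layout, where the first word of an instance is the vtable pointer; the class C and the helper vptrIsNull are hypothetical names used only for this illustration.

```d
class C
{
    int x = 42;
    int get() { return x; }
}

// Hypothetical helper: reads the first word of the instance,
// which is assumed to hold the vtable pointer.
bool vptrIsNull(Object o)
{
    return *cast(void**) cast(void*) o is null;
}

bool[] beforeAndAfter()
{
    auto c = new C;
    bool before = vptrIsNull(c);
    destroy(c);
    bool after = vptrIsNull(c);
    // Calling c.get() at this point would go through the null vtable.
    return [before, after];
}

void main()
{
    assert(beforeAndAfter() == [false, true]);
}
```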

Fun fact: originally destroy (clear at the time) was going to call the 
default ctor on a class instance so it could be used again.

 Surely we should expect that the dead buffer is emplaced with a new construction?
 
 This is a valid usage in D:

 struct S
 {
     this(...) {...}
 }

 auto s = S(a, b, c);
 destroy(s);

 // s is now a fully initialized S ready to be constructed
 s.__ctor(a, b, c);

 
 And you've demonstrated exactly what I'm saying... there's no need for
 destroy() to return initialised memory here. It could return
 0xfeeefeee memory or something (for debugging purposes).

No, the ctor assumes all values are in init state. If that's not the 
case, then the code will crash or corrupt memory.

-Steve

May 21 2018

Manu <turkeyman gmail.com> writes:

On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.


 Why is it reasonable to expect that the buffer after `destroy()`
 should be 'valid'?
 Who's to say that the init state is 'valid'?


 It's valid in that it's not garbage, and that it's not full of dangling
 pointers.

Ah! Dangling pointers... that might conservatively retain garbage.
That's a good reason.


 And you've demonstrated exactly what I'm saying... there's no need for
 destroy() to return initialised memory here. It could return
 0xfeeefeee memory or something (for debugging purposes).


 No, the ctor assumes all values are in init state. If that's not the case,
 then the code will crash or corrupt memory.

...so, assign init immediately before construction? Rather than
assigning init on destruction, which makes destroy more expensive, and
it may or may not be re-used (most likely not).
Why isn't assigning init built into the constructor?

May 21 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/22/18 1:01 AM, Manu wrote:
 On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new type.

 Basically, in D when you have access to memory, it should be in a valid
 state.


 Why is it reasonable to expect that the buffer after `destroy()`
 should be 'valid'?
 Who's to say that the init state is 'valid'?


 It's valid in that it's not garbage, and that it's not full of dangling
 pointers.

 
 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.

Well, I was thinking more along the lines of accessing memory that is no 
longer valid :) A destructor can, for instance, free C malloced memory.

 And you've demonstrated exactly what I'm saying... there's no need for
 destroy() to return initialised memory here. It could return
 0xfeeefeee memory or something (for debugging purposes).


 No, the ctor assumes all values are in init state. If that's not the case,
 then the code will crash or corrupt memory.

 
 ...so, assign init immediately before construction? Rather than
 assigning init on destruction, which makes destroy more expensive, and
 it may or may not be re-used (most likely not).
 Why isn't assigning init built into the constructor?

It provides for slight efficiencies. For example, if you allocate an 
array of 100 structs, whose init value is all 0, you can do one memset 
on the entire array.

When you split the init blitting from the constructor, you have more 
options.

Note: you can "destroy" without blitting init, all the tools are there 
(the same ones destroy uses, it's not a magic function). Might be a good 
addition -- unsafeDestroy.
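
As a sketch of what such a helper might look like, assuming it only needs to handle structs: destroyWithoutInit below is a hypothetical name, not a druntime function, and it simply calls the generated __xdtor when one exists.

```d
struct S
{
    int value = 1;
    ~this() { value = -1; }
}

// Hypothetical helper: run the destructor only and leave whatever it
// wrote in memory, instead of resetting the instance to T.init.
void destroyWithoutInit(T)(ref T obj) if (is(T == struct))
{
    static if (__traits(hasMember, T, "__xdtor"))
        obj.__xdtor();
}

int[] compare()
{
    S a, b;
    destroy(a);             // destructor, then reset to S.init
    destroyWithoutInit(b);  // destructor only
    return [a.value, b.value];
}

void main()
{
    assert(compare() == [1, -1]);
}
```

Note that both instances are still destructed again at scope exit, so a type used this way needs a destructor that tolerates running twice.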

-Steve

May 22 2018

Manu <turkeyman gmail.com> writes:

On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 On 21 May 2018 at 15:51, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/21/18 6:37 PM, Manu wrote:
 On 21 May 2018 at 15:29, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Uninitialized, but allocated and usable. The difference between this
 and
 delete is that delete is going to unallocate that memory. The next time
 it's
 allocated, it will be overwritten with an init pattern for the new
 type.

 Basically, in D when you have access to memory, it should be in a valid
 state.



 Why is it reasonable to expect that the buffer after `destroy()`
 should be 'valid'?
 Who's to say that the init state is 'valid'?



 It's valid in that it's not garbage, and that it's not full of dangling
 pointers.


 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.


 Well, I was thinking more along the lines of accessing memory that is no
 longer valid :) A destructor can, for instance, free C malloced memory.

The memory post-destruction is invalid... nobody will access it.
Accessing it should be undefined behaviour until such a time as the
memory is re-constructed with a new instance...

May 22 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/22/18 9:48 PM, Manu wrote:
 On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.


 Well, I was thinking more along the lines of accessing memory that is no
 longer valid :) A destructor can, for instance, free C malloced memory.

 
 The memory post-destruction is invalid... nobody will access it.
 Accessing it should be undefined behaviour until such a time as the
 memory is re-constructed with a new instance...
 

This particular sub-thread was about using destroy on something, not 
freeing the memory. The memory is not invalid.

To clarify, I'm talking about something like this:

struct S
{
   int *foo;
   this(int x) { foo = cast(int*)malloc(int.sizeof); }
   ~this() { free(foo); }
}

auto s = S(1);
destroy(s);
*s.foo = 5; // oops

If s.foo is reset to null, then you get a nice predictable segfault. If 
not, then you get memory corruption. Sure you can just say "undefined 
behavior", but I'd rather the defined behavior.
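
A compilable variant of the example, as a sketch: the import, the stored value, the disabled postblit and the helper fooIsNullAfterDestroy are additions made for illustration. It checks that destroy() leaves s.foo null rather than dangling.

```d
import core.stdc.stdlib : malloc, free;

struct S
{
    int* foo;

    this(int x)
    {
        foo = cast(int*) malloc(int.sizeof);
        if (foo) *foo = x;
    }

    // free(null) is a no-op, so running this on S.init is harmless.
    ~this() { free(foo); }

    @disable this(this); // no copies, so no double free
}

bool fooIsNullAfterDestroy()
{
    auto s = S(1);
    destroy(s);            // frees foo, then resets s to S.init
    return s.foo is null;  // a write through s.foo would now fault
}

void main()
{
    assert(fooIsNullAfterDestroy());
}
```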

-Steve

May 23 2018

Manu <turkeyman gmail.com> writes:

On 23 May 2018 at 06:36, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 5/22/18 9:48 PM, Manu wrote:
 On 22 May 2018 at 06:44, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 5/22/18 1:01 AM, Manu wrote:
 Ah! Dangling pointers... that might conservatively retain garbage.
 That's a good reason.



 Well, I was thinking more along the lines of accessing memory that is no
 longer valid :) A destructor can, for instance, free C malloced memory.


 The memory post-destruction is invalid... nobody will access it.
 Accessing it should be undefined behaviour until such a time as the
 memory is re-constructed with a new instance...

 This particular sub-thread was about using destroy on something, not freeing
 the memory. The memory is not invalid.

 To clarify, I'm talking about something like this:

 struct S
 {
   int *foo;
   this(int x) { foo = cast(int*)malloc(int.sizeof); }
   ~this() { free(foo); }
 }

 auto s = S(1);
 destroy(s);
 *s.foo = 5; // oops

 If s.foo is reset to null, then you get a nice predictable segfault. If not,
 then you get memory corruption. Sure you can just say "undefined behavior",
 but I'd rather the defined behavior.

Yeah no... I'm happy with "accessing uninitialised memory is undefined
behaviour".
If you want defined behaviour, initialise the memory!

May 23 2018
