digitalmars.D.bugs - [Issue 17454] New: ABI non-conformity: produces unaligned placement

https://issues.dlang.org/show_bug.cgi?id=17454

          Issue ID: 17454
           Summary: ABI non-conformity: produces unaligned placement of
                    struct on stack
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Windows
            Status: NEW
          Severity: blocker
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: turkeyman gmail.com

I have this problem that I don't understand, and I can't reproduce in a reduced
test-case.
When I simplify the program structure (or change it at all really), the
addresses of variables all change and the problem may disappear.

I can only produce a high-level outline of what my code does. My code has a lot
more 'stuff'.

Compiling for Win64, given something like this:

D:
--------

class C {}
struct S { C c; } // struct with a single pointer member

extern (C++) S d_fun()
{
  return S(new C);  // <- extern(C++) function returns the struct S by value
}


C++:
--------

class C {};
struct S
{
  C *c;

  S(S &&rval)
    : c(rval.c)
  {
    rval.c = nullptr;
  }
};

S d_fun(); // implemented on the D side as extern(C++)

void test()
{
  S result(d_fun()); // move-construct result from d-function
}

--------


This is the context, but there is lots of additional noise in d_fun().

So, what goes wrong is this: d_fun returns a struct by value. Looking at the
disassembly, d_fun appears to allocate room for the result value on its stack
and then return a pointer to the result in RAX. When the function returns, the
C++ calling code expects a pointer to the result value in RAX, and it goes from
there.
This appears to be happening correctly; when d_fun returns, RAX is a valid
pointer to the result struct (looking at the debug info, it appears to be a
value called '__HID1'), but I see a crash in C++ shortly after inside the
destructor for the result rvalue.

I eventually realised the reason for the crash; even though the rvalue pointer
is a pointer to valid data, I noticed that the rvalue pointer (ie, __HID1) is
always out by 1 byte, in this case: 0x000000884398e917.
Debug info confirms that __HID1's address is in that odd location.
The struct S has a single pointer member, which is 8-byte aligned, which means
the whole S struct must be 8-byte aligned. I can confirm that is(typeof(__HID1)
== S), but __HID1 is not aligned to 8 bytes.
When C++ receives this pointer from d_fun in RAX, it accesses the pointer
directly a couple of times and gets proper data, but then MSVC seems to make
some assumption a little later that the pointer is 8-byte-aligned and allows
code generation that truncates the bottom 3 bits, and acts on data out of
alignment causing a crash.
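
To make the arithmetic above concrete, here is a minimal C++ sketch of the
alignment check. The helper names `is_aligned` and `misalignment` are made up
for illustration and are not part of dmd or MSVC; the only value taken from the
report is the address 0x000000884398e917.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class C {};
struct S { C *c; };  // same shape as the S in the report, minus the constructor

// A struct holding one pointer needs pointer alignment (8 bytes on Win64).
static_assert(alignof(S) == alignof(void *), "S should be pointer-aligned");

// Hypothetical helper: how many bytes addr sits past the previous multiple
// of alignment. alignment must be a power of two.
std::uint64_t misalignment(std::uint64_t addr, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & (alignment - 1);
}

// Hypothetical helper: true when addr is a multiple of alignment.
bool is_aligned(std::uint64_t addr, std::size_t alignment)
{
    return misalignment(addr, alignment) == 0;
}
```

With the address from the debugger, `is_aligned(0x000000884398e917, 8)` is
false and `misalignment(0x000000884398e917, 8)` is 7, so clearing the bottom 3
bits lands on 0x000000884398e910 rather than on the struct.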

In my case, it is during the move constructor I illustrated; &&rval is the
unaligned pointer to __HID1. First it initialises this->c from rval.c, which
follows the unaligned pointer and correctly initialises this->c. It then writes
nullptr to rval.c, but it writes `0` to the *aligned* rval address instead of
the unaligned address it received as the function argument. The destructor later
operates on a bad pointer because `c` is not nullptr as expected.
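
The effect described above can be imitated on a plain byte buffer. This is only
a simulation, not a reproduction of the MSVC code generation: it assumes the
misdirected store goes to the address with the bottom 3 bits cleared, as
described earlier, and it uses memcpy/memset so that the demo itself performs no
misaligned access. The function name `simulate_misdirected_store` and the
24-byte buffer are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the value left in the "rval.c" slot after the nullptr store has
// gone to the aligned-down address instead of the real (misaligned) one.
std::uint64_t simulate_misdirected_store(std::uint64_t original)
{
    alignas(8) unsigned char stack[24] = {};  // stand-in for d_fun's frame
    unsigned char *rval = stack + 7;          // misaligned slot, like ...e917

    // The D side stores the result struct at the misaligned address.
    std::memcpy(rval, &original, sizeof original);

    // this->c = rval.c : follows the unaligned pointer and reads correctly.
    std::uint64_t copied;
    std::memcpy(&copied, rval, sizeof copied);
    assert(copied == original);

    // rval.c = nullptr : assumed to go through the pointer with the low
    // 3 bits cleared, so it zeroes the wrong 8 bytes.
    auto addr = reinterpret_cast<std::uintptr_t>(rval);
    auto *aligned = reinterpret_cast<unsigned char *>(addr & ~std::uintptr_t{7});
    std::memset(aligned, 0, sizeof(std::uint64_t));

    // What the destructor of the temporary would later see in rval.c.
    std::uint64_t leftover;
    std::memcpy(&leftover, rval, sizeof leftover);
    return leftover;
}
```

The zeroing store overlaps the real slot by only one byte, so for a value such
as 0x1122334455667788 the slot is left holding a corrupted, non-null pointer,
which matches the destructor seeing a `c` that is not nullptr.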


So TL;DR: allocation of the result value (known by the debug info as __HID1)
when returning a struct by value from an extern(C++) function appears to be
capable of not aligning the result struct correctly. If I change my code around
randomly, the value often becomes properly aligned, and then the crash
disappears as expected.

I can't find any pattern that reliably affects the placement of __HID1. I even
tried adding `align(8) struct S { C c; }`, but that had no effect and the
rvalue was still unaligned.

Can anybody imagine any reason that the placement of the rvalue struct may
become unaligned in this case, when returning a struct from an extern(C++)
function by value?

Like I say, it doesn't happen in simple cases; when I try to fabricate a
repro, I can't reproduce the problem. Something about the complexity of my
d_fun affects the placement of __HID1.

Perhaps put an assert in the compiler that the rvalue struct is aligned
correctly in these conditions, and ICE if it's not? We'll have a better chance
of catching it then, as it might show up unnoticed in more common cases, but
rarely find itself in conjunction with MSVC++ code that (reasonably) assumes
proper alignment.

--
May 29 2017