www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.ldc - Odd behavior in fiber unittest - printf stops optimization crash

reply "Kevin Brogan" <kevin brogan.ca> writes:
msvc 64 bit build.

Trying to figure out why the unit test for core.thread is failing 
for release but not for debug.

First I wanted to figure out which tests were causing the crash. 
Turns out that only the fiber related tests are crashing. All 
thread related unit tests are fine.

Thoughts are that it would be an optimization problem, since 
release is crashing but not debug.

I switch out the -O3 flag in build scripts for core.thread with 
-02. No more crash.

So that's confirmed...  but what line of code is causing the 
crash? Switch back to -O3 and disable all of the fibre unit tests 
except for what looked to be the simplest one.

unittest
{
     int x = 0;

     (new Fiber({
         x++;
     })).call();
     assert( x == 1 );
}

Using printf statements I determine that the crash is happening 
inside of a naked assembly function called fiber_switchContext.

There's almost zero comments so I don't understand exactly why 
it's doing what it's doing. Seems to be saving the win64 abi 
non-volatile registers as well as something to do with thread 
local storage using the GS register which is completely 
undocumented on msdn from what I can tell.

I have questions about the function. Why is RBX pushed after the 
XMM registers. (xmm alignment i guess?) Why are RDI and RSI not 
saved, as they are non-volatile as well according to 
https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx.

So while I'm writing this I realize I haven't tested what happens 
if I modify the context switch to save RDI and RSI as well...  
unit test is working now!!! next!

Earlier I had figured out that placing a printf on either side of 
the call to fiber_switchContext would eliminate the crash. I 
guess it changed the optimized code...

On to the next failing fiber related unit test!
Feb 28 2015
next sibling parent "Kevin Brogan" <kevin brogan.ca> writes:
Seems that all of them are working now except for two. 
testNonvolatileRegister is failing as is "Multiple threads 
running shared fibers".

I'm not sure that the RSI RDI additions are what was required 
since they are clearly not saved in the non volatile register 
test.
Feb 28 2015
prev sibling parent reply "Kai Nacke" <kai redstar.de> writes:
On Sunday, 1 March 2015 at 00:30:03 UTC, Kevin Brogan wrote:
 msvc 64 bit build.

 Trying to figure out why the unit test for core.thread is 
 failing for release but not for debug.

 First I wanted to figure out which tests were causing the 
 crash. Turns out that only the fiber related tests are 
 crashing. All thread related unit tests are fine.

 Thoughts are that it would be an optimization problem, since 
 release is crashing but not debug.

 I switch out the -O3 flag in build scripts for core.thread with 
 -02. No more crash.

 So that's confirmed...  but what line of code is causing the 
 crash? Switch back to -O3 and disable all of the fibre unit 
 tests except for what looked to be the simplest one.

 unittest
 {
     int x = 0;

     (new Fiber({
         x++;
     })).call();
     assert( x == 1 );
 }

 Using printf statements I determine that the crash is happening 
 inside of a naked assembly function called fiber_switchContext.

 There's almost zero comments so I don't understand exactly why 
 it's doing what it's doing. Seems to be saving the win64 abi 
 non-volatile registers as well as something to do with thread 
 local storage using the GS register which is completely 
 undocumented on msdn from what I can tell.

 I have questions about the function. Why is RBX pushed after 
 the XMM registers. (xmm alignment i guess?) Why are RDI and RSI 
 not saved, as they are non-volatile as well according to 
 https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx.

 So while I'm writing this I realize I haven't tested what 
 happens if I modify the context switch to save RDI and RSI as 
 well...  unit test is working now!!! next!

 Earlier I had figured out that placing a printf on either side 
 of the call to fiber_switchContext would eliminate the crash. I 
 guess it changed the optimized code...

 On to the next failing fiber related unit test!
Hi Kevin! fiber_switchContext has to parameters: the stack pointer of the old fiber (oldp) and the stack pointer of the new fiber (newp). In essence, the state of the old fiber is pushed on the stack pointed to by oldp. Then the state of the new fiber is loaded from the stack pointed to by newp. The context switch is finished with a JMP to the new fiber code. The GS: prefix points to some Windows internal structure. The first element is the Thread Information Block: Definition from <winnt.h>: typedef struct _NT_TIB { struct _EXCEPTION_REGISTRATION_RECORD *ExceptionList; PVOID StackBase; PVOID StackLimit; PVOID SubSystemTib; .... } The saved elements are therefore the exception chain, the stack base and the stack limit. The order of the XMM register is simply dictated by alignment as you guessed. Please note the comment from start of function: if you change the stack layout then you also have to change function Fiber.initStack because here the initial layout is created. A context switch with fiber_switchContext looks like a normal function call to the compiler. The nonvolatile registers must be preserved. That RDI/RSI are not saved is a bug. Compare with the function from druntime 2.067: the bug is fixed there. Regards, Kai
Mar 01 2015
parent reply "Kai Nacke" <kai redstar.de> writes:
On Sunday, 1 March 2015 at 13:35:54 UTC, Kai Nacke wrote:
 A context switch with fiber_switchContext looks like a normal 
 function call to the compiler. The nonvolatile registers must 
 be preserved. That RDI/RSI are not saved is a bug. Compare with 
 the function from druntime 2.067: the bug is fixed there.
I pushed this to master. Maybe you wanna try it. Regards, Kai
Mar 01 2015
parent "Kevin Brogan" <kevin brogan.ca> writes:
On Sunday, 1 March 2015 at 14:56:18 UTC, Kai Nacke wrote:
 On Sunday, 1 March 2015 at 13:35:54 UTC, Kai Nacke wrote:
 A context switch with fiber_switchContext looks like a normal 
 function call to the compiler. The nonvolatile registers must 
 be preserved. That RDI/RSI are not saved is a bug. Compare 
 with the function from druntime 2.067: the bug is fixed there.
I pushed this to master. Maybe you wanna try it. Regards, Kai
thanks for the info
Mar 01 2015