digitalmars.D.ldc - Android/ARM: fixing exception-handling

"Joakim" <dlang joakim.fea.st> writes:

I've gotten pretty far along with ldc for Android/ARM, with the 
big remaining issue appearing to be the unfinished support for 
exception-handling.  Many exceptions seem to work just fine, 
while others cause segfaults.  I've just started looking at 
ldc.eh with one such failing exception from the unit tests for 
core.thread and it seems to error out when trying to find the 
landing pad and action offset, I think in the get_uleb128 helper.
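
For context, the landing pad and action offsets in DWARF exception tables are typically stored as ULEB128 values. The following is a minimal sketch of such a decoder with explicit bounds checking, not the actual get_uleb128 from ldc.eh; the names decode_uleb128 and Uleb128Result are hypothetical. Whether the real helper checks bounds is not stated in this thread. A decoder without bounds checks can read past the end of its input when handed a bad pointer or garbage data.

```cpp
#include <cstddef>
#include <cstdint>

// Result of decoding a single ULEB128 value (hypothetical type for this sketch).
struct Uleb128Result {
    std::uint64_t value;  // decoded value, meaningful only when ok is true
    std::size_t length;   // number of input bytes consumed
    bool ok;              // false if the input was truncated or overlong
};

// Decode one ULEB128 value from data[0 .. size). Each byte carries 7 payload
// bits, least significant group first; the high bit marks continuation.
constexpr Uleb128Result decode_uleb128(const std::uint8_t* data, std::size_t size) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (shift >= 64) return {0, 0, false};  // too many bytes for 64 bits
        const std::uint8_t byte = data[i];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return {value, i + 1, true};
        shift += 7;
    }
    return {0, 0, false};  // ran out of input before the final byte
}

// 0xE5 0x8E 0x26 is the usual DWARF sample encoding of 624485.
constexpr std::uint8_t sample[] = {0xE5, 0x8E, 0x26};
static_assert(decode_uleb128(sample, 3).value == 624485, "sample decodes correctly");
```

Returning a status instead of reading until a terminating byte turns a corrupt table into a reportable error rather than a segfault.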

David, what remains to be done for ARM support, if you know 
anything more specific than simply finding and fixing the 
remaining stuff that doesn't work?

Jun 16 2015

"David Nadlinger" <code klickverbot.at> writes:

On Tuesday, 16 June 2015 at 23:07:45 UTC, Joakim wrote:
 David, what remains to be done for ARM support, if you know 
 anything more specific than simply finding and fixing the 
 remaining stuff that doesn't work?

Unfortunately, I don't know of anything more specific than the 
couple of EH-related test case failures on Linux/EABI.

It has been quite some while since I last worked on LDC/ARM to be 
honest; most of my ARM work is getting embedded stuff done with 
C++14 these days. Maybe Dan knows of some other 
codegen/math-related issues still to be solved?

  - David

Jun 16 2015

Dan Olson <gorox comcast.net> writes:

"David Nadlinger" <code klickverbot.at> writes:

 On Tuesday, 16 June 2015 at 23:07:45 UTC, Joakim wrote:
 David, what remains to be done for ARM support, if you know anything
 more specific than simply finding and fixing the remaining stuff
 that doesn't work?

 Unfortunately, I don't know of anything more specific than the couple
 of EH-related test case failures on Linux/EABI.

 It has been quite some while since I last worked on LDC/ARM to be
 honest; most of my ARM work is getting embedded stuff done with C++14
 these days. Maybe Dan knows of some other codegen/math-related issues
 still to be solved?

There might be some clues in the iOS branch for ldc/eh.d even though it
is dealing with SjLj style exceptions and landing pads are interpreted
differently.  It has a few version variations but uses much of the same
code.  It seems to work ok during the unittests.  I haven't encountered
any weirdness since I spent some late nights with a debugger a year ago.

https://github.com/smolt/druntime/blob/ios/src/ldc/eh.d

Diff with tag v0.15.1 to see where I changed stuff.  All my published
ios branches are currently based on 0.15.1 and using LLVM 3.5.1.

Joakim, what branch of LDC are you basing your Android stuff on?

I can publish to github ios merges with 0.15.2-beta and 0.16.0 (branch
merge-2.067), but I don't think there is any additional help there with
regard to EH, even though ldc/eh.d did change for druntime ldc branch.

As far as codegen problems - there is nothing related to EH that I can
think of.  The optimizer occasionally gets some alignment wrong with
neon instructions in LLVM 3.5.1, but that does not show up as an EH
problem.  Currently neon is disabled when building optimized libs.

If you haven't created a gen/abi-arm.{h,cpp}, you will need to as the
default has a few problems on ARM, but still not related to EH.  If you
are on LLVM 3.5.1, try the one on the ios branch named abi-ios.{h,cpp}.
There are additional abi-ios changes for 0.15.2 because the handling of
D variadic functions changed.
-- 
Dan

Jun 16 2015

"Joakim" <dlang joakim.fea.st> writes:

On Wednesday, 17 June 2015 at 06:50:52 UTC, Dan Olson wrote:
 There might be some clues in the iOS branch for ldc/eh.d even 
 though it is dealing with SjLj style exceptions and landing 
 pads are interpreted differently.  It has a few version 
 variations but uses much of the same code.  It seems to work ok 
 during the unittests.  I haven't encountered any weirdness 
 since I spent some late nights with a debugger a year ago.

 https://github.com/smolt/druntime/blob/ios/src/ldc/eh.d

 Diff with tag v0.15.1 to see where I changed stuff.  All my 
 published ios branches are currently based on 0.15.1 and using 
 LLVM 3.5.1.

 Joakim, what branch of LDC are you basing your Android stuff on?

I'm currently using the merge-2.067 branch linked against a 
lightly patched llvm 3.6, the one that's used in the Android NDK, 
and compiled by clang 3.6.1.

 I can publish to github ios merges with 0.15.2-beta and 0.16.0 
 (branch merge-2.067), but I don't think there is any additional 
 help there with regard to EH, even though ldc/eh.d did change 
 for druntime ldc branch.

I hadn't bothered looking at how your iOS branch dealt with 
exceptions, since you had said a while back that it uses 
setjmp/longjmp exceptions, but I'll take a look now and see if 
there's anything helpful.

 As far as codegen problems - there is nothing related to EH 
 that I can think of.  The optimizer occasionally gets some 
 alignment wrong with neon instructions in LLVM 3.5.1, but that 
 does not show up as an EH problem.  Currently neon is disabled 
 when building optimized libs.

 If you haven't created a gen/abi-arm.{h,cpp}, you will need to 
 as the default has a few problems on ARM, but still not related 
 to EH.  If you are on LLVM 3.5.1, try the one on the ios branch 
 named abi-ios.{h,cpp}. There are additional abi-ios changes for 
 0.15.2 because the handling of D variadic functions changed.

I'll take a look.  Right now, the only change I made to 
gen/abi.cpp is to use the C calling convention everywhere.

Jun 17 2015

"Joakim" <dlang joakim.fea.st> writes:

On Wednesday, 17 June 2015 at 07:32:35 UTC, Joakim wrote:
 I hadn't bothered looking at how your iOS branch dealt with 
 exceptions, since you had said a while back that it uses 
 setjmp/longjmp exceptions, but I'll take a look now and see if 
 there's anything helpful.

Took a look, don't think it's relevant to DWARF exceptions.

 I'll take a look.  Right now, the only change I made to 
 gen/abi.cpp is to use the C calling convention everywhere.

It appears that the only change you made is to turn off passing 
structs by value?

https://github.com/smolt/ldc/blob/ios/gen/abi-ios.cpp#L53

The fast C calling convention works for you?  It always caused 
problems for me on ARM, including causing a segfault in llvm when 
compiling, the last time I tried it.

I spent some time looking into the ARM EH issues and it appears 
that disabling inlining fixes a lot of it:

--- a/gen/optimizer.cpp
+++ b/gen/optimizer.cpp
@@ -163,8 +163,8 @@ static unsigned sizeLevel() {

  // Determines whether or not to run the normal, full inlining pass.
  bool willInline() {
-    return enableInlining == cl::BOU_TRUE ||
-        (enableInlining == cl::BOU_UNSET && optLevel() > 1);
+    return enableInlining == cl::BOU_TRUE;// ||
+        //(enableInlining == cl::BOU_UNSET && optLevel() > 1);
  }

  bool isOptimizationEnabled() {
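
To make the effect of the patch concrete, here is a standalone sketch of the before and after logic. It assumes a local stand-in enum (BoolOrDefault) in place of LLVM's cl::boolOrDefault and passes the option and optimization level as parameters, so it compiles without LDC or LLVM; the function names are hypothetical.

```cpp
// Stand-in for the three-state command-line option used in the diff:
// explicitly enabled, explicitly disabled, or not given at all.
enum class BoolOrDefault { Unset, True, False };

// Original behaviour: an explicit request always inlines; with no explicit
// setting, inlining is enabled for optimization levels above 1.
constexpr bool willInlineOriginal(BoolOrDefault enableInlining, unsigned optLevel) {
    return enableInlining == BoolOrDefault::True ||
           (enableInlining == BoolOrDefault::Unset && optLevel > 1);
}

// Patched behaviour: only an explicit request enables inlining; the
// optimization level no longer matters.
constexpr bool willInlinePatched(BoolOrDefault enableInlining, unsigned /*optLevel*/) {
    return enableInlining == BoolOrDefault::True;
}

// The only case that changes is "option unset, optimization level above 1".
static_assert(willInlineOriginal(BoolOrDefault::Unset, 2), "original inlines at level 2");
static_assert(!willInlinePatched(BoolOrDefault::Unset, 2), "patched does not");
```

In other words, the patch only changes builds that rely on the default, which covers the -O2/-O3 test runs discussed here, assuming no explicit inlining option is passed.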

I also get proper backtraces in gdb much more often after turning 
off inlining, not to mention actual error output on the 
command-line as opposed to segfaults.  I'm guessing something is 
screwed up in the generation or handling of DWARF exception data 
by function inlining.  Almost all of druntime now passes tests on 
Android/ARM, with the exception of some codegen issues in 
core.time.

For a comparison, running the phobos tests with logging turned on 
in the ldc/eh.d code showed that only about 67 exceptions were 
thrown with -O2/-O3 -release and inlining turned on.  With 
inlining turned off, it jumps up to 658 exceptions, an order of 
magnitude more, because many more tests are run once EH starts 
working.  A couple exceptions might still be uncaught and need to 
be fixed, but it appears that EH is not the bottleneck anymore, 
it's codegen and other ARM issues.

David, Kai, or whoever else runs tests on linux/Android/ARM, can 
you turn inlining off and verify the same results on your ARM 
hardware?

Jul 08 2015

Dan Olson <gorox comcast.net> writes:

"Joakim" <dlang joakim.fea.st> writes:

 On Wednesday, 17 June 2015 at 07:32:35 UTC, Joakim wrote:
 It appears that the only change you made is to turn off passing
 structs by value?

 https://github.com/smolt/ldc/blob/ios/gen/abi-ios.cpp#L53

Hi Joakim,

Yes, that little change had a big impact.

http://forum.dlang.org/post/m2r3u5ac0c.fsf@comcast.net

Structs are still passed by value, just in a different way.  The LLVM
"byval" attribute non-intuitively passes a pointer to a struct instead of
passing its contents in registers and stack.

http://llvm.org/docs/LangRef.html#parameter-attributes.  
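
As a rough C++ analogy of the two strategies described above (an illustration only, not LLVM IR and not what LDC emits): with a byval-style parameter the callee receives the address of a hidden copy made for the call, whereas the alternative hands over the struct's contents directly as scalar values. The Pair type and all function names are hypothetical.

```cpp
struct Pair {
    int a;
    int b;
};

// byval-like: the callee sees a pointer, but it points at a private copy,
// so writes through it never reach the caller's original object.
constexpr int sum_via_hidden_copy(Pair* hidden_copy) {
    hidden_copy->a += 1;
    return hidden_copy->a + hidden_copy->b;
}

constexpr int call_byval_like(const Pair& original) {
    Pair copy = original;  // the hidden copy made between caller and callee
    return sum_via_hidden_copy(&copy);
}

// Contents passed directly: each field travels as its own scalar argument,
// which the target ABI can place in registers or stack slots.
constexpr int sum_flattened(int a, int b) {
    a += 1;
    return a + b;
}

constexpr int call_flattened(const Pair& original) {
    return sum_flattened(original.a, original.b);
}

// Both forms preserve by-value semantics and give the same result.
static_assert(call_byval_like(Pair{1, 2}) == call_flattened(Pair{1, 2}),
              "same observable behaviour");
```

Both forms are by-value from the source language's point of view; they differ only in how the argument physically reaches the callee, which is the part an ABI has to get right.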

 The fast C calling convention works for you?  It always caused
 problems for me on ARM, including causing a segfault in llvm when
 compiling, the last time I tried it.

fastcc has worked quite well and an attempt to change to C calling
convention (ccc) led to funny codegen for some aggregate function return
values (e.g. complex reals) when optimization was enabled.  But that
problem seemed to go away with LLVM 3.6.

In the end I have abandoned fastcc for ccc with my 0.15.2 and 2.067
merge branches because LDC adopted a different variadic approach and
fastcc doesn't support it.
-- 
Dan

Jul 09 2015

"Joakim" <dlang joakim.fea.st> writes:

On Wednesday, 8 July 2015 at 16:14:43 UTC, Joakim wrote:
 I spent some time looking into the ARM EH issues and it appears 
 that disabling inlining fixes a lot of it:

 --- a/gen/optimizer.cpp
 +++ b/gen/optimizer.cpp
 @@ -163,8 +163,8 @@ static unsigned sizeLevel() {

  // Determines whether or not to run the normal, full inlining pass.
  bool willInline() {
 -    return enableInlining == cl::BOU_TRUE ||
 -        (enableInlining == cl::BOU_UNSET && optLevel() > 1);
 +    return enableInlining == cl::BOU_TRUE;// ||
 +        //(enableInlining == cl::BOU_UNSET && optLevel() > 1);
  }

  bool isOptimizationEnabled() {

 I also get proper backtraces in gdb much more often after 
 turning off inlining, not to mention actual error output on the 
 command-line as opposed to segfaults.  I'm guessing something 
 is screwed up in the generation or handling of DWARF exception 
 data by function inlining.  Almost all of druntime now passes 
 tests on Android/ARM, with the exception of some codegen issues 
 in core.time.

 For a comparison, running the phobos tests with logging turned 
 on in the ldc/eh.d code showed that only about 67 exceptions 
 were thrown with -O2/-O3 -release and inlining turned on.  With 
 inlining turned off, it jumps up to 658 exceptions, an order of 
 magnitude more, because many more tests are run once EH starts 
 working.  A couple exceptions might still be uncaught and need 
 to be fixed, but it appears that EH is not the bottleneck 
 anymore, it's codegen and other ARM issues.

 David, Kai, or whoever else runs tests on linux/Android/ARM, 
 can you turn inlining off and verify the same results on your 
 ARM hardware?

I spent some more time looking into this and it appears that an 
ARM optimization pass in llvm is the real issue, not inlining.  
It turns out that enabling the EH_personality debug output in 
ldc.eh and turning off inlining happened to generate ARM code 
that worked earlier, but I can get it to work without those two 
hacks by turning off one call to an ARM optimization pass in llvm 
instead.  Specifically, if I disable this second call to 
createARMLoadStoreOptimizationPass() and then compile only 
ldc/eh.d with the resulting ldc2, ARM EH will work, because the 
second "while" loop in eh_personality_common doesn't segfault 
anymore:

https://github.com/llvm-mirror/llvm/blob/release_36/lib/Target/ARM/ARMTargetMachine.cpp#L312

Otherwise, it will often, though not always, fail at an ldmib 
instruction, similar to the other codegen issue I brought up in 
another thread, which Dan provided a workaround for.  With this 
second pass turned off, that ldmib instruction isn't there and EH 
starts working.  I haven't looked further into exactly what that 
ARM optimization pass is screwing up, but this is probably an 
llvm codegen issue.
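
For readers unfamiliar with the instruction: ldmib is ARM's "load multiple, increment before", which advances the address by one word before each load. The sketch below models that addressing in plain C++ to show which loads it stands in for. It assumes, and this is not confirmed anywhere in this thread, that the optimization pass in question is what merges adjacent single loads into such an instruction. TwoRegs and the function names are hypothetical.

```cpp
#include <cstdint>

// Two destination registers, modelled as plain values.
struct TwoRegs {
    std::uint32_t r0;
    std::uint32_t r1;
};

// Two independent word loads at byte offsets +4 and +8 from base, comparable
// to `ldr r0, [base, #4]` followed by `ldr r1, [base, #8]`.
constexpr TwoRegs two_separate_loads(const std::uint32_t* base) {
    return {base[1], base[2]};
}

// LDMIB addressing: step the address forward one word, then load, for each
// register in turn. The word at base itself is never read.
constexpr TwoRegs one_ldmib(const std::uint32_t* base) {
    TwoRegs out{0, 0};
    const std::uint32_t* addr = base;
    ++addr;
    out.r0 = *addr;
    ++addr;
    out.r1 = *addr;
    return out;
}

// On well-formed input both forms load exactly the same words.
constexpr std::uint32_t memory[] = {10, 20, 30};
static_assert(one_ldmib(memory).r0 == two_separate_loads(memory).r0, "r0 matches");
static_assert(one_ldmib(memory).r1 == two_separate_loads(memory).r1, "r1 matches");
```

Since the two forms are equivalent when the base address and alignment are what the instruction expects, a crash at the merged instruction would point at an assumption about the base pointer or its alignment not holding, though that remains a guess here.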

Jul 25 2015

Dan Olson <gorox comcast.net> writes:

"Joakim" <dlang joakim.fea.st> writes:
 I spent some more time looking into this and it appears that an ARM
 optimization pass in llvm is the real issue, not inlining.  It turns
 out that enabling the EH_personality debug output in ldc.eh and
 turning off inlining happened to generate ARM code that worked
 earlier, but I can get it to work without those two hacks by turning
 off one call to an ARM optimization pass in llvm instead.

Good puzzle solving.

There might be clues in the clang source code on how to set everything
up to make that optimization pass work.  Clang does a lot of interesting
stuff, like coercing args and changing alignment that I don't think is
done in LDC.

Jul 28 2015

"Joakim" <dlang joakim.fea.st> writes:

On Wednesday, 17 June 2015 at 01:03:19 UTC, David Nadlinger wrote:
 On Tuesday, 16 June 2015 at 23:07:45 UTC, Joakim wrote:
 David, what remains to be done for ARM support, if you know 
 anything more specific than simply finding and fixing the 
 remaining stuff that doesn't work?

 Unfortunately, I don't know of anything more specific than the 
 couple of EH-related test case failures on Linux/EABI.

OK, I'll look into those.  It does seem that there are a lot more 
unit tests that throw exceptions in 2.067 though, so a lot more 
than a couple fail.

I've also found one or two tests unrelated to exceptions that may 
have ARM codegen issues.  I'll look into those further and file 
the appropriate issues, if necessary.

Jun 17 2015
