digitalmars.D.ldc - TLS for Android
- Joakim (25/25) Mar 07 2014 So I've been looking into implementing TLS for Android/x86,
- Jacob Carlborg (6/18) Mar 08 2014 Yes. DMD started to implemented support for TLS on OS X before 10.7
- David Nadlinger (14/20) Mar 08 2014 LLVM does support putting variables into custom sections, and you can
- Joakim (15/30) Mar 08 2014 You're talking about findDataSection and friends?
- David Nadlinger (17/35) Mar 08 2014 Not quite. I was referring to
- Joakim (13/70) Mar 08 2014 Okay, I started looking around the master branch and didn't find
- Joakim (11/28) Mar 08 2014 You mention "replacing the part that Glibc does (but Bionic
- David Nadlinger (40/49) Mar 09 2014 There are several possible ABIs for thread-local storage. For the sake
- Joakim (25/79) Mar 09 2014 Yeah, I've had that pdf loaded in my browser for the last couple
- Joakim (15/37) Mar 17 2014 Alright, I looked into the ARM and X86 assembly lowering source
- Joakim (11/14) Mar 20 2014 Since packed TLS looks like the way this needs to be done, any
- Dan Olson (9/9) Mar 27 2014 Any TLS progress out there in LDC-land?
- Joakim (12/21) Mar 27 2014 I've been familiarizing myself with the relevant dmd backend
- Dan Olson (78/87) Mar 30 2014 The approach I started with was to make LLVM do the work. I read
- Joakim (21/132) Mar 30 2014 Nice find, I guess it helps that they have a desktop OS that does
- Dan Olson (20/38) Mar 30 2014 I did try it in an iOS app. The function _tlv_bootstrap is unresolved
- Joakim (24/85) Mar 30 2014 Hmm, you and Jacob are probably right, it may be better to just
- Dan Olson (8/14) Mar 30 2014 Thinking about this some more. It probably makes sense to have an
- Jacob Carlborg (9/90) Mar 30 2014 I would follow the native TLS implementation in OS X, i.e. using
- Dan Olson (8/14) Mar 30 2014 Do think we can just drop the dyld code into druntime? It should work
- Jacob Carlborg (14/19) Mar 30 2014 Yes, with minor modifications. The TLS related code in dyld is pretty
- David Nadlinger (14/17) Mar 31 2014 More specifically, for the DMD TLS emulation implementation, this is
- Dan Olson (9/26) Mar 31 2014 I had disabled initTLSRanges for iOS since dyld_enumerate_tlv_storage is
- Dan Olson (11/38) Apr 01 2014 I did reenable and it works. I can tell because the std.datetime
- Jacob Carlborg (5/12) Mar 31 2014 "dyld_enumerate_tlv_storage" should probably be replaced with a function...
- David Nadlinger (7/9) Mar 31 2014 If you find such a function, please let me know (or, better, submit a
- Jacob Carlborg (6/11) Mar 31 2014 Hmm, it might be a bit more complicated than I first thought. I might
- Dan Olson (26/28) Mar 31 2014 I have tried and success.
- David Nadlinger (3/9) Mar 31 2014 Nice!
- Jacob Carlborg (4/11) Mar 31 2014 Awesome :)
- David Nadlinger (3/5) Mar 27 2014 Would be great – I don't think anybody else is working on this right n...
- Dan Olson (9/31) Mar 08 2014 While on the subject of TLS, that is probably the most needed language
- David Nadlinger (3/6) Mar 08 2014 If their code is Boost-licensed (general druntime/Phobos license), yes.
- Dan Olson (2/8) Mar 08 2014 Yes, that file is Boost - good!
- Joakim (9/14) Mar 08 2014 I wondered earlier why you weren't just using Walter's packed TLS
- Jacob Carlborg (6/11) Mar 09 2014 I think it would be possible to implement the missing TLV functions our
- Joakim (10/36) Mar 09 2014 OK, I assumed OS support was necessary, maybe not.
- Jacob Carlborg (7/8) Mar 09 2014 Well, yes. In this case the OS support comes in the form of the dynamic
So I've been looking into implementing TLS for Android/x86, rummaging through old TLS git commits for dmd and ldc to see what to do. It appears that Walter implemented TLS on OS X more than four years ago by packing thread-local variables into special segments and then unpacking them in druntime, which uses pthread_(get|set)specific on OS X nowadays: http://www.drdobbs.com/architecture-and-design/implementing-thread-local-storage-on-os/228701185 https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L106 Since Android also provides these pthread functions for TLS, seems like a similar approach is called for. I notice that ldc never used this approach, depending on llvm's built-in TLS support instead: https://github.com/ldc-developers/ldc/commit/4d7a6eda234bc8d12703cc577c09c2ca50ac6bda#diff-19 It seems that this also meant that TLS wasn't garbage-collected on OSX, until David added it a little more than a year ago: https://github.com/ldc-developers/druntime/blob/ldc/src/ldc/osx_tls.c I can copy what dmd is doing on OS X Mach-O with ELF, but it's not going to be easily transferable to ldc, which will be necessary for Android/ARM. Do you have any advice on how to pull this off with ldc? Should I be going the dmd route and packing the TLS myself? Does llvm provide good support for this? Or is there some other llvm TLS shortcut I can use? I tried to see if llvm just has some thread-local implementation that automatically uses pthread_setspecific, but didn't find anything.
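For readers new to the packed approach, here is a minimal sketch in C of the idea described above: one per-thread copy of an initialization image, found through pthread_getspecific and created lazily. It is not the druntime code. The names `tls_image`, `tls_init`, `emu_tls_get_addr` and `emu_tls_demo` are hypothetical, and a plain struct stands in for the bracketed section the compiler would emit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the packed TLS initialization image (assumption: the real
   scheme uses a compiler-emitted, bracketed section instead of a struct). */
struct tls_image { int counter; int flag; };
static const struct tls_image tls_init = { 10, 1 };

static pthread_key_t tls_key;
static pthread_once_t tls_once = PTHREAD_ONCE_INIT;

static void tls_make_key(void)
{
    /* free() releases a thread's copy when that thread exits. */
    int rc = pthread_key_create(&tls_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical analog of ___tls_get_addr: takes a pointer into the
   initialization image and returns the matching address inside the calling
   thread's private copy, creating that copy on first use. */
void *emu_tls_get_addr(const void *p)
{
    pthread_once(&tls_once, tls_make_key);
    char *block = pthread_getspecific(tls_key);
    if (block == NULL) {
        block = malloc(sizeof tls_init);
        if (block == NULL)
            abort();
        memcpy(block, &tls_init, sizeof tls_init);
        pthread_setspecific(tls_key, block);
    }
    size_t offset = (size_t)((const char *)p - (const char *)&tls_init);
    return block + offset;
}

static void *worker(void *arg)
{
    int *c = emu_tls_get_addr(&tls_init.counter);
    *c += 5;                      /* changes only this thread's copy */
    *(int *)arg = *c;
    return NULL;
}

/* Returns 1 if the main thread and a second thread saw independent copies. */
int emu_tls_demo(void)
{
    int *mine = emu_tls_get_addr(&tls_init.counter);
    *mine = 100;
    int seen = 0;
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &seen) != 0)
        return 0;
    pthread_join(t, NULL);
    return seen == 15 && *mine == 100;
}
```

The compiler-side work discussed in the rest of the thread amounts to turning every thread-local access into a call like `emu_tls_get_addr(&tls_init.counter)`.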
Mar 07 2014
On 2014-03-08 01:55, Joakim wrote:
> So I've been looking into implementing TLS for Android/x86, rummaging through old TLS git commits for dmd and ldc to see what to do. It appears that Walter implemented TLS on OS X more than four years ago by packing thread-local variables into special segments and then unpacking them in druntime, which uses pthread_(get|set)specific on OS X nowadays:
> http://www.drdobbs.com/architecture-and-design/implementing-thread-local-storage-on-os/228701185
> https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L106
> Since Android also provides these pthread functions for TLS, seems like a similar approach is called for. I notice that ldc never used this approach, depending on llvm's built-in TLS support instead:
> https://github.com/ldc-developers/ldc/commit/4d7a6eda234bc8d12703cc577c09c2ca50ac6bda#diff-19

Yes. DMD started to implement support for TLS on OS X before 10.7, which is the first version of OS X to natively support TLS. LDC doesn't support versions of OS X older than 10.7, since it uses native TLS.

--
/Jacob Carlborg
Mar 08 2014
On 03/08/2014 01:55 AM, Joakim wrote:
> Do you have any advice on how to pull this off with ldc? Should I be going the dmd route and packing the TLS myself? Does llvm provide good support for this? Or is there some other llvm TLS shortcut I can use? I tried to see if llvm just has some thread-local implementation that automatically uses pthread_setspecific, but didn't find anything.

LLVM does support putting variables into custom sections, and you can more or less get away with the DMD bracketing approach (see e.g. the new ModuleInfo discovery functionality I implemented for Linux, which is the same one DMD's druntime uses). However, there is a catch: due to what I can only imagine is a bug, LLVM does not support emitting a symbol both into a custom section and with weak linkage. Thus, you might be in for a round of LLVM hacking either way, even though it will likely involve much less when going the DMD route.

However, there is a third option which might be worth investigating, namely re-implementing at least parts of the necessary runtime linker features in druntime and continuing to use the same scheme as on GNU Linux/x86. This depends on %gs not being used in another way, etc. though.

David
Mar 08 2014
On Saturday, 8 March 2014 at 14:25:43 UTC, David Nadlinger wrote:
> LLVM does support putting variables into custom sections, and you can more or less get away with the DMD bracketing approach (see e.g. the new ModuleInfo discovery functionality I implemented for Linux, which is the same as DMD's druntime uses).

You're talking about findDataSection and friends?
https://github.com/ldc-developers/druntime/blob/ldc/src/rt/sections_ldc.d#L115

> However, there is a catch: Due to what I can only imagine is a bug, LLVM does not support emitting a symbol both into a custom section and with weak linkage. Thus, you might be in for a round of LLVM hacking either way, even though it will likely involve much less when going the DMD route.

Hmm, I guess this is why you don't use the bracketing approach anywhere? What will be much less when going the DMD route?

> However, there is a third options which might be worth investigating, namely re-implementing at least parts of the necessary runtime linker features in druntime and continuing to use the same scheme as on GNU Linux/x86. This depends on %gs not being used in another way, etc. though.

I tried to reuse the existing dl_iterate_phdr approach on Android, but then I noticed that the dl_phdr_info struct defined in bionic doesn't include the dlpi_tls_modid and dlpi_tls_data members. However, now that you mention it, maybe those aren't strictly necessary, as long as I'm not worried about shared libraries. I'll look into it further.

As for reimplementing the runtime linker, in a sense that's what's being done with dmd/druntime for OS X, where it implements its own ___tls_get_addr using pthread_setspecific. I'll have to do the same for Android, as bionic doesn't have a __tls_get_addr.
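As a rough illustration of the dl_iterate_phdr point, the sketch below (C, written against glibc on Linux, not tested on bionic) finds TLS segments using only the program headers, without touching dlpi_tls_modid or dlpi_tls_data. The names `tls_probe`, `tls_scan`, `visit` and `count_tls_segments` are made up for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* glibc only declares dl_iterate_phdr with this */
#endif
#include <link.h>
#include <stddef.h>

/* A thread-local variable so the main program itself carries a PT_TLS
   segment (assumption: the toolchain supports the __thread extension). */
__thread int tls_probe = 42;

struct tls_scan {
    int segments;             /* number of PT_TLS program headers seen */
    size_t total_memsz;       /* sum of their in-memory sizes */
};

/* Callback invoked once per loaded object (executable and shared libraries). */
static int visit(struct dl_phdr_info *info, size_t size, void *data)
{
    struct tls_scan *scan = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type == PT_TLS) {
            /* info->dlpi_addr + ph->p_vaddr would be the address of this
               object's TLS initialization image. */
            scan->segments++;
            scan->total_memsz += ph->p_memsz;
        }
    }
    return 0;                 /* 0 means: keep iterating */
}

/* Walks all loaded objects and returns how many TLS segments were found. */
int count_tls_segments(void)
{
    struct tls_scan scan = { 0, 0 };
    dl_iterate_phdr(visit, &scan);
    return scan.segments;
}
```

This locates the initialization image only. Copying it per thread and resolving accesses would still have to be done elsewhere.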
Mar 08 2014
On Sat, Mar 8, 2014 at 7:16 PM, Joakim <joakim airpost.net> wrote:
> On Saturday, 8 March 2014 at 14:25:43 UTC, David Nadlinger wrote:
>> LLVM does support putting variables into custom sections, and you can more or less get away with the DMD bracketing approach (see e.g. the new ModuleInfo discovery functionality I implemented for Linux, which is the same as DMD's druntime uses).
> You're talking about findDataSection and friends?
> https://github.com/ldc-developers/druntime/blob/ldc/src/rt/sections_ldc.d#L115

Not quite. I was referring to https://github.com/ldc-developers/druntime/blob/ldc-merge-2.064/src/rt/sections_linux.d (_d_dso_registry, ...) and the associated compiler-side implementation, https://github.com/ldc-developers/ldc/blob/5b14a5e5c4f292024afd8e5f520e837035942003/gen/module.cpp#L396.

>> However, there is a catch: Due to what I can only imagine is a bug, LLVM does not support emitting a symbol both into a custom section and with weak linkage. Thus, you might be in for a round of LLVM hacking either way, even though it will likely involve much less when going the DMD route.
> Hmm, I guess this is why you don't use the bracketing approach anywhere? What will be much less when going the DMD route?

Actually, we didn't use the special section approach at all until very recently (i.e. Martin's shared library changes in 2.064). And I meant that you would probably get away with less LLVM hacking when just changing the way LDC emits TLS globals/accesses than when implementing "emulated" TLS on the LLVM backend side.

> As for reimplementing the runtime linker, in a sense that's what's being done with dmd/druntime for OS X, where it implements it's own ___tls_get_addr using pthread_setspecific. I'll have to do the same for Android, as bionic doesn't have a __tls_get_addr.

Well, yes and no. I was specifically referring to keeping the normal TLS infrastructure (i.e. %gs-based addressing on Linux/x86) in place and just replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime. __tls_get_addr isn't necessarily used on x86.

David
Mar 08 2014
On Saturday, 8 March 2014 at 22:44:16 UTC, David Nadlinger wrote:
> On Sat, Mar 8, 2014 at 7:16 PM, Joakim <joakim airpost.net> wrote:
>> On Saturday, 8 March 2014 at 14:25:43 UTC, David Nadlinger wrote:
>>> LLVM does support putting variables into custom sections, and you can more or less get away with the DMD bracketing approach (see e.g. the new ModuleInfo discovery functionality I implemented for Linux, which is the same as DMD's druntime uses).
>> You're talking about findDataSection and friends?
>> https://github.com/ldc-developers/druntime/blob/ldc/src/rt/sections_ldc.d#L115
> Not quite. I was referring to https://github.com/ldc-developers/druntime/blob/ldc-merge-2.064/src/rt/sections_linux.d (_d_dso_registry, ...) and the associated compiler-side implementation, https://github.com/ldc-developers/ldc/blob/5b14a5e5c4f292024afd8e5f520e837035942003/gen/module.cpp#L396.

Okay, I started looking around the master branch and didn't find what you were talking about. No wonder, it's in the merge-2.064 branch. I'll look at what you did there.

>>> However, there is a catch: Due to what I can only imagine is a bug, LLVM does not support emitting a symbol both into a custom section and with weak linkage. Thus, you might be in for a round of LLVM hacking either way, even though it will likely involve much less when going the DMD route.
>> Hmm, I guess this is why you don't use the bracketing approach anywhere? What will be much less when going the DMD route?
> Actually, we didn't use the special section approach at all until very recently (i.e. Martin's shared library changes in 2.064). And I meant that you would probably get away with less LLVM hacking when just changing the way LDC emits TLS globals/accesses than when implementing "emulated" TLS on the LLVM backend side.

Well, the special section approach still isn't in the master branch, hence my confusion. Okay, it wasn't clear to me that you were comparing the dmd route to having llvm generate the right pthread calls for Android.

>> As for reimplementing the runtime linker, in a sense that's what's being done with dmd/druntime for OS X, where it implements it's own ___tls_get_addr using pthread_setspecific. I'll have to do the same for Android, as bionic doesn't have a __tls_get_addr.
> Well, yes and no. I was specifically referring to keeping the normal TLS infrastructure (i.e. %gs-based addressing on Linux/x86) in place and just replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime. __tls_get_addr isn't necessarily used on x86.

While Android/X86 TLS does use the %gs register (https://github.com/android/platform_bionic/blob/master/libc/private/__get_tls.h#L45), that's not portable and I'd like to try Android/ARM after this, so I'll stick with the pthread_(get|set)specific calls to wrap it:
https://github.com/android/platform_bionic/blob/master/libc/bionic/pthread_key.cpp
Mar 08 2014
On Sunday, 9 March 2014 at 05:38:07 UTC, Joakim wrote:
> On Saturday, 8 March 2014 at 22:44:16 UTC, David Nadlinger wrote:
>> Well, yes and no. I was specifically referring to keeping the normal TLS infrastructure (i.e. %gs-based addressing on Linux/x86) in place and just replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime. __tls_get_addr isn't necessarily used on x86.
> While Android/X86 TLS does use the %gs register (https://github.com/android/platform_bionic/blob/master/libc/private/__get_tls.h#L45), that's not portable and I'd like to try Android/ARM after this, so I'll stick with the pthread_(get|set)specific calls to wrap it:
> https://github.com/android/platform_bionic/blob/master/libc/bionic/pthread_key.cpp

You mention "replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime." Just to be clear, you're referring to accessing TLS variables using an offset into the initialization image, which is what ___tls_get_addr from druntime does in Walter's packed TLS approach, right? If not, I'm not sure exactly what you're referring to.

With all this TLS stuff split up between the compiler, linker, and runtime linker, often undocumented or poorly documented in the latter two cases, it's been confusing to follow the TLS code path to see what's happening.
Mar 08 2014
On 9 Mar 2014, at 8:36, Joakim wrote:
> You mention "replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime." Just to be clear, you're referring to accessing TLS variables using an offset into the initialization image, which is what ___tls_get_addr from druntime does in Walter's packed TLS approach, right? If not, I'm not sure exactly what you're referring to. With all this TLS stuff split up between the compiler, linker, and runtime linker, often undocumented or poorly documenented in the latter two cases, it's been confusing to follow the TLS code path to see what's happening.

There are several possible ABIs for thread-local storage. For the sake of this argument, let's assume that our particular system works like the Linux/x86 implementation or Walter's OS X approach in that the TLS storage area is simply a flat block of memory where the individual variables reside at some offset.

Then, there is still the question of how the application knows a) the base address of the block and b) the offset of the variable of interest. In Walter's OS X implementation, both are taken care of by __tls_get_addr, which expects a pointer into the section where the TLS initialization data is stored. On e.g. Linux/x86_64, however, the base address is stored in %fs, and the offset is provided by special linker relocations (which essentially evaluate to the offset of a given symbol from the beginning of the initialization image). No extra function calls are inserted by the compiler here to access TLS data, and the (C) runtime is not directly involved for the accesses.

For an overview of the different models, see http://www.akkadia.org/drepper/tls.pdf (which is the most comprehensive document I could find, in spite of what you might think about the author).

But regardless of what model is chosen, there is still the issue of actually setting up a copy of the data for each thread during initialization. This is what I was referring to when I mentioned "replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime".

So, if %gs works as expected on Android and the linker supports the necessary relocations, then it might be an option to use the existing TLS implementation in LLVM and simply provide the missing bits in druntime.

On the other hand, if you choose to go with an entirely different TLS scheme (such as the DMD OS X implementation), you need to figure out how to change the codegen to emit the extra function calls to your __tls_get_addr analog, etc. Looking at llvm/lib/Target/ARM/ARMISelLowering.cpp, there might actually be a working implementation for this in LLVM already (which I didn't realize before), so this route would not necessarily be more complex than going with a different scheme. You'd probably just need to provide the __tls_get_addr implementation in druntime and figure out how LLVM emits the TLS image and how to get its base address.

Hope this helps,
David
Mar 09 2014
On Sunday, 9 March 2014 at 16:12:19 UTC, David Nadlinger wrote:
> On 9 Mar 2014, at 8:36, Joakim wrote:
>> You mention "replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime." Just to be clear, you're referring to accessing TLS variables using an offset into the initialization image, which is what ___tls_get_addr from druntime does in Walter's packed TLS approach, right? If not, I'm not sure exactly what you're referring to. With all this TLS stuff split up between the compiler, linker, and runtime linker, often undocumented or poorly documenented in the latter two cases, it's been confusing to follow the TLS code path to see what's happening.
> There are several possible ABIs for thread-local storage. For the sake of this argument, let's assume that our particular system works like the Linux/x86 implementation or Walter's OS X approach in that the TLS storage area is simply a flat block of memory where the individual variables reside at some offset. Then, there is still the question of how the application knows a) the base address of the block and b) the offset of the variable of interest. In Walter's OS X implementation, both is taken care of by __tls_get_addr, which expects a pointer into the section where the TLS initialization data is stored. On e.g. Linux/x86_64, however, the base address is stored in %fs, and the offset is provided by special linker relocations (which essentially evaluate to the offset of a given symbol from the beginning of the initialization image). No extra function calls are inserted by the compiler here to access TLS data, and the (C) runtime is not directly involved for the accesses. For an overview of the different models, see http://www.akkadia.org/drepper/tls.pdf (which is the most comprehensive document I could find, in spite of what you might think about the author).

Yeah, I've had that pdf loaded in my browser for the last couple months, skimmed some of it initially and I've been slowly going through it in more detail. I tried simply loading a binary built using bracketed sections and the linker's current TLS relocations, i.e. no extra function calls, in Android/x86 and I got some other random data in the resulting TLS initialization image. I think this is because bionic stores the pthread_setspecific-created void* pointers in the normal TLS area, so you can't just use the TLS relocations that dmd and the gold linker generate for linux/x86 on Android/x86, i.e. using the %gs register directly.

I have no opinion on the author, should I? ;)

> But regardless of what model is chosen, there is still the issue of actually setting up a copy of the data for each thread during initialization. This is what I was referring to when I mentioned "replacing the part that Glibc does (but Bionic doesn't) with a piece of code in druntime".

I was finally able to access a proper initialization image created by dmd in druntime on Android/x86 a couple hours back, by using dl_phdr_info similarly to what is done on linux now.

> So, if %gs works as expected on Android and the linker supports the necessary relocations, then it might be an option to simply use the existing TLS implementation in LLVM and simply provide the missing bits in druntime. On the other hand, if you choose to go with an entirely different TLS scheme (such as the DMD OS X implementation), you need to figure out how to change the codegen to emit the extra function calls to your __tls_get_addr analog, etc. Looking at llvm/lib/Target/ARM/ARMISelLowering.cpp, there might actually be a working implementation for this in LLVM already (which I didn't realize before), so this route would not necessarily be more complex than going with a different scheme. You'd probably just need to provide the __tls_get_addr implementation in druntime and figure out how LLVM emits the TLS image resp. how to get its base address.

I think this is the best route, with the advantage that if my ___tls_get_addr uses pthread_(get|set)specific, it will likely just work on ARM too. I thought I'd have to get ldc to generate slightly different IR to do this, but it'd be great if llvm already does this. I had briefly looked at X86ISelLowering.cpp but not the ARM one, I'll see what it does.

> Hope this helps, David

Yeah, I think we're on the same page, thanks for the explanation. I've just been learning about TLS recently, so I wasn't sure before.
Mar 09 2014
On Sunday, 9 March 2014 at 18:23:00 UTC, Joakim wrote:
> On Sunday, 9 March 2014 at 16:12:19 UTC, David Nadlinger wrote:
>> So, if %gs works as expected on Android and the linker supports the necessary relocations, then it might be an option to simply use the existing TLS implementation in LLVM and simply provide the missing bits in druntime. On the other hand, if you choose to go with an entirely different TLS scheme (such as the DMD OS X implementation), you need to figure out how to change the codegen to emit the extra function calls to your __tls_get_addr analog, etc. Looking at llvm/lib/Target/ARM/ARMISelLowering.cpp, there might actually be a working implementation for this in LLVM already (which I didn't realize before), so this route would not necessarily be more complex than going with a different scheme. You'd probably just need to provide the __tls_get_addr implementation in druntime and figure out how LLVM emits the TLS image resp. how to get its base address.
> I think this is the best route, with the advantage that if my ___tls_get_addr uses pthread_(get|set)specific, it will likely just work on ARM too. I thought I'd have to get ldc to generate slightly different IR to do this, but it'd be great if llvm already does this. I had briefly looked at X86ISelLowering.cpp but not the ARM one, I'll see what it does.

Alright, I looked into the ARM and X86 assembly lowering source and it appears that those __tls_get_addr calls are simply the ones put in for the dynamic thread models.

I tried hijacking those ___tls_get_addr calls by compiling all code as PIC, which forces a dynamic thread model in llvm that puts in the __tls_get_addr function calls, and then building as a shared library, which causes the gold linker to disable any linker optimizations that remove those calls. However, the resulting shared library would not run, because there are still a few TLS relocations from the GOT for the dynamic linker to execute and the Android dynamic linker doesn't do those TLS relocations.

So that was a dead end; looks like it's back to the packed TLS approach and having ldc generate IR that calls my __tls_get_addr manually.
Mar 17 2014
On Monday, 17 March 2014 at 10:25:22 UTC, Joakim wrote:
> So that was a deadend, looks like it's back to the packed TLS approach and having ldc generate IR that calls my __tls_get_addr manually.

Since packed TLS looks like the way this needs to be done, any chance one of the ldc developers might be able to toss this off? This is the first time I've ever tinkered with a compiler, so it will very likely take me longer than it would take one of you.

Right now, I'm looking at hacking dmd to do this, as that seems like the fastest route to get something working, but obviously ldc will need it too for Android/ARM and the dmd patch is not going to be reusable for ldc. If not, not a big deal, I'm sure I'll get something working eventually.
Mar 20 2014
Any TLS progress out there in LDC-land?

To pass thread/fiber unittests on iOS, I put in a temporary workaround using pthread_get/setspecific directly for the two thread-locals (Thread.sm_this and Fiber.sm_this). Now I can pass 74 of 85 druntime/phobos unittests on iOS.

If nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.

--
Dan
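A workaround of this kind, replacing one thread-local pointer with a pthread key, might look roughly like the following C sketch. This is not Dan's actual patch to druntime. `set_current`, `get_current` and `current_demo` are hypothetical names standing in for accessors to something like Thread.sm_this.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t current_key;
static pthread_once_t current_once = PTHREAD_ONCE_INIT;

static void current_make_key(void)
{
    /* No destructor: the slot only borrows a pointer owned elsewhere. */
    pthread_key_create(&current_key, NULL);
}

/* Store the calling thread's "current object" pointer. */
void set_current(void *obj)
{
    pthread_once(&current_once, current_make_key);
    pthread_setspecific(current_key, obj);
}

/* Fetch the calling thread's pointer; NULL if this thread never set one. */
void *get_current(void)
{
    pthread_once(&current_once, current_make_key);
    return pthread_getspecific(current_key);
}

static void *probe(void *out)
{
    *(void **)out = get_current();   /* a new thread starts out with NULL */
    return NULL;
}

/* Returns 1 if a second thread does not see the main thread's value. */
int current_demo(void)
{
    int main_obj = 0;
    set_current(&main_obj);
    void *seen = &main_obj;          /* overwritten by the probe thread */
    pthread_t t;
    if (pthread_create(&t, NULL, probe, &seen) != 0)
        return 0;
    pthread_join(t, NULL);
    int ok = (seen == NULL) && (get_current() == &main_obj);
    set_current(NULL);               /* do not leave a dangling pointer */
    return ok;
}
```

This only scales to a handful of variables, which is why the rest of the thread looks for a general emulated-TLS scheme.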
Mar 27 2014
On Thursday, 27 March 2014 at 16:01:31 UTC, Dan Olson wrote:
> Any TLS progress out there in LDC-land?

I've been familiarizing myself with the relevant dmd backend source, but haven't tried anything yet.

> To pass thread/fiber unittests on iOS, I put in temporary workaround using pthread_get/setspecific directly for the two threadlocals (Thread.sm_this and Fiber.sm_this). Now I can pass 74 of 85 druntime/phobos unittests on iOS.

I thought about doing the same, but didn't bother since I was able to get all of druntime's unit tests to pass by using Android's limited and flaky TLS support, left over from the linux kernel.

> If nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.

Whatever I do to implement packed TLS in the dmd backend is not going to work for ldc anyway, so nothing stopping you from making your own effort. You will have to patch llvm also, if the weak symbols bug David pointed out is still around in llvm 3.5. Let us know what approach you take.
Mar 27 2014
"Joakim" <joakim airpost.net> writes:On Thursday, 27 March 2014 at 16:01:31 UTC, Dan Olson wrote:The approach I started with was to make LLVM do the work. I read through all of the comments in this thread and decided this might be the most fun. ARMISelLowering.cpp has TLS disabled for all but ELF targets. I commented out an assertion blocking other targets to see what would happen for iOS (Mach-O). To my suprise, found that Mach-O tls sections are generated (__thread_vars, __thread_data, .tbss) and populated with the D thread local vars. The load/store instructions were treating TLS vars like global data though. So I looked at the Mach-O X86 version and saw what it is trying to do. LLVM coding is still a mystery to me, but managed after many hours today to hack together something that would turn this D code module tlsd; int a; void test() { a += 4; // access a } into this: movw r0, :lower16:(__D4tlsd1ai-(LPC4_0+4)) movt r0, :upper16:(__D4tlsd1ai-(LPC4_0+4)) LPC4_0: add r0, pc blx ___tls_get_addr ldr r1, [r0] str r1, [r0] ... .tbss __D4tlsd1ai$tlv$init, 4, 2 .section __DATA,__thread_vars,thread_local_variables .globl __D4tlsd1ai __D4tlsd1ai: .long __tlv_bootstrap .long 0 .long __D4tlsd1ai$tlv$init The following link helped explain what is going on with the __thread_vars data layout. http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalVariables.c Mach-O dyln replaces tlv_bootstrap (thunk) with tlv_get_addr in the TLVDescriptor (__thread_vars). My LLVM hack for now is just doing a direct call to __tls_get_addr instead of indirect to tlv_get_addr. 
For proof of concept (one thread only), I have __tls_get_addr hard wired as follows: extern (C) { struct TLVDescriptor { void* function(TLVDescriptor*) thunk; uint key; uint offset; } //void* tlv_get_addr(TLVDescriptor* d) //void* __tls_get_addr(void* ptr) void* __tls_get_addr(TLVDescriptor* tlvd) { __gshared static ubyte data[512]; printf("__tls_get_addr %p \n", tlvd); printf("thunk %p, key %u, offset %u\n", tlvd.thunk, tlvd.key, tlvd.offset); return data.ptr + tlvd.offset; } void _tlv_bootstrap() { assert(false, "Should not get here"); } } It looks promising. Next step is to add in some realistic runtime support. Not sure if I will base it on dmd's sections-osx or the Apple dyld. Probably a hybrid. Eventually will need some help getting the LLVM changes clean instead of my hack job. Now that I've gone down this path a bit, I am beginning to wonder if changing LLVM to support iOS thread locals will have issues. Would LLVM want changes that affect Darwin/Mach-O (Apple's turf)? I suppose they could be optional. -- DanIf nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.Whatever I do to implement packed TLS in the dmd backend is not going to work for ldc anyway, so nothing stopping you from making your own effort. You will have to patch llvm also, if the weak symbols bug David pointed out is still around in llvm 3.5. Let us know what approach you take.
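To make the descriptor mechanism concrete, here is a hedged C sketch of the multi-thread version of this idea: the loader patches each descriptor's thunk and key once, and each access then goes through the thunk to a per-thread copy of the image. It is modeled loosely on the layout in Apple's threadLocalVariables.c but is not that code. `tlv_image`, `tlv_prepare`, `tlv_get_addr_sketch` and `tlv_demo` are hypothetical names, and a two-int array stands in for the real initialization image.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Same three-word shape as the __thread_vars entries shown above. */
struct TLVDescriptor {
    void *(*thunk)(struct TLVDescriptor *);
    unsigned long key;
    unsigned long offset;
};

static const int tlv_image[2] = { 7, 9 };   /* stand-in initialization image */

/* Placeholder thunk; must have been replaced before any access happens. */
static void *tlv_bootstrap_sketch(struct TLVDescriptor *d)
{
    (void)d;
    abort();
}

/* Real thunk: find (or lazily create) this thread's copy, add the offset. */
static void *tlv_get_addr_sketch(struct TLVDescriptor *d)
{
    pthread_key_t key = (pthread_key_t)d->key;
    char *base = pthread_getspecific(key);
    if (base == NULL) {
        base = malloc(sizeof tlv_image);
        if (base == NULL)
            abort();
        memcpy(base, tlv_image, sizeof tlv_image);
        pthread_setspecific(key, base);
    }
    return base + d->offset;
}

struct TLVDescriptor var_a = { tlv_bootstrap_sketch, 0, 0 };
struct TLVDescriptor var_b = { tlv_bootstrap_sketch, 0, sizeof(int) };

/* What the loader (or druntime) would do once per image: allocate a key
   and patch every descriptor to point at the real thunk. */
void tlv_prepare(struct TLVDescriptor **descs, size_t n)
{
    pthread_key_t key;
    int rc = pthread_key_create(&key, free);
    assert(rc == 0);
    (void)rc;
    for (size_t i = 0; i < n; i++) {
        descs[i]->thunk = tlv_get_addr_sketch;
        descs[i]->key = (unsigned long)key;
    }
}

static void *tlv_worker(void *out)
{
    int *a = var_a.thunk(&var_a);
    int *b = var_b.thunk(&var_b);
    *(int *)out = *a * 10 + *b;      /* fresh copy: 7 and 9 */
    return NULL;
}

/* Returns 1 if a new thread sees initial values despite a write in main. */
int tlv_demo(void)
{
    struct TLVDescriptor *all[] = { &var_a, &var_b };
    tlv_prepare(all, 2);
    int *a = var_a.thunk(&var_a);
    *a = 1;
    int fresh = 0;
    pthread_t t;
    if (pthread_create(&t, NULL, tlv_worker, &fresh) != 0)
        return 0;
    pthread_join(t, NULL);
    return fresh == 79 && *a == 1;
}
```

The indirect call through `thunk` is what the generated code would use if it followed the native scheme. The direct call to `__tls_get_addr` in the proof of concept skips that indirection.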
Mar 30 2014
On Sunday, 30 March 2014 at 08:22:15 UTC, Dan Olson wrote:
> "Joakim" <joakim airpost.net> writes:
>> On Thursday, 27 March 2014 at 16:01:31 UTC, Dan Olson wrote:
>>> If nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.
>> Whatever I do to implement packed TLS in the dmd backend is not going to work for ldc anyway, so nothing stopping you from making your own effort. You will have to patch llvm also, if the weak symbols bug David pointed out is still around in llvm 3.5. Let us know what approach you take.
> The approach I started with was to make LLVM do the work. I read through all of the comments in this thread and decided this might be the most fun. ARMISelLowering.cpp has TLS disabled for all but ELF targets. I commented out an assertion blocking other targets to see what would happen for iOS (Mach-O). To my suprise, found that Mach-O tls sections are generated (__thread_vars, __thread_data, .tbss) and populated with the D thread local vars.

Nice find, I guess it helps that they have a desktop OS that does it differently.

> The load/store instructions were treating TLS vars like global data though. So I looked at the Mach-O X86 version and saw what it is trying to do. LLVM coding is still a mystery to me, but managed after many hours today to hack together something that would turn this D code
>
>     module tlsd;
>     int a;
>
>     void test()
>     {
>         a += 4; // access a
>     }
>
> into this:
>
>         movw    r0, :lower16:(__D4tlsd1ai-(LPC4_0+4))
>         movt    r0, :upper16:(__D4tlsd1ai-(LPC4_0+4))
>     LPC4_0:
>         add     r0, pc
>         blx     ___tls_get_addr
>         ldr     r1, [r0]
>         str     r1, [r0]
>         ...
>         .tbss __D4tlsd1ai$tlv$init, 4, 2
>
>         .section __DATA,__thread_vars,thread_local_variables
>         .globl __D4tlsd1ai
>     __D4tlsd1ai:
>         .long __tlv_bootstrap
>         .long 0
>         .long __D4tlsd1ai$tlv$init
>
> The following link helped explain what is going on with the __thread_vars data layout.
> http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalVariables.c
> Mach-O dyln replaces tlv_bootstrap (thunk) with tlv_get_addr in the TLVDescriptor (__thread_vars). My LLVM hack for now is just doing a direct call to __tls_get_addr instead of indirect to tlv_get_addr. For proof of concept (one thread only), I have __tls_get_addr hard wired as follows:
>
>     extern (C)
>     {
>         struct TLVDescriptor
>         {
>             void* function(TLVDescriptor*) thunk;
>             uint key;
>             uint offset;
>         }
>
>         //void* tlv_get_addr(TLVDescriptor* d)
>         //void* __tls_get_addr(void* ptr)
>         void* __tls_get_addr(TLVDescriptor* tlvd)
>         {
>             __gshared static ubyte data[512];
>             printf("__tls_get_addr %p \n", tlvd);
>             printf("thunk %p, key %u, offset %u\n",
>                    tlvd.thunk, tlvd.key, tlvd.offset);
>             return data.ptr + tlvd.offset;
>         }
>
>         void _tlv_bootstrap()
>         {
>             assert(false, "Should not get here");
>         }
>     }
>
> It looks promising. Next step is to add in some realistic runtime support. Not sure if I will base it on dmd's sections-osx or the Apple dyld. Probably a hybrid.

Have you experimented with seeing which of that TLV stuff from OS X iOS actually supports? The iOS dyld could be pretty different. We don't know, since they don't release the source for the iOS core like they do for OS X, i.e. is tlv_get_addr even available in the iOS dyld and does it execute other possible TLS relocations? Only way to find out is to try it, or somehow inspect their iOS binaries. ;)

Their source does show an ARM assembly implementation of tlv_get_address but it's commented out:

http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalHelpers.s

I wonder if it'd be easier to pack your own Mach-O sections rather than figuring out how to access all their sections and reimplementing their TLV functions, assuming they're not available. You might even be able to do it as an llvm patch, since the relevant lib/MC/ files where llvm packs the TLS data into Mach-O sections seem pretty straightforward.

> Eventually will need some help getting the LLVM changes clean instead of my hack job. Now that I've gone down this path a bit, I am beginning to wonder if changing LLVM to support iOS thread locals will have issues. Would LLVM want changes that affect Darwin/Mach-O (Apple's turf)? I suppose they could be optional.

I've never submitted anything to llvm, so this isn't based on anything other than speculation, but I doubt they would accept such a patch. That doesn't mean we can't use it, though. ;)
Mar 30 2014
"Joakim" <joakim airpost.net> writes:
> Have you experimented with seeing which of that TLV stuff from OS X that iOS actually supports? The iOS dyld could be pretty different. We don't know since they don't release the source for the iOS core like they do for OS X, ie is tlv_get_addr even available in the iOS dyld and does it execute other possible TLS relocations? Only way to find out is to try it, or somehow inspect their iOS binaries. ;) Their source does show an ARM assembly implementation of tlv_get_address but it's commented out:
>
> http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalHelpers.s

I did try it in an iOS app. The function _tlv_bootstrap is unresolved when I link in Xcode using the current iPhoneSDK. That is why I had to provide a stub. Pretty sure the tlv functions are not available.

> I wonder if it'd be easier to pack your own Mach-O sections rather than figuring out how to access all their sections and reimplementing their TLV functions, assuming they're not available. You might even be able to do it as an llvm patch since the relevant lib/MC/ files where llvm packs the TLS data into Mach-O sections seem pretty straightforward.

I think we can use their sections and it did not take long to figure out. Here is what an example link map has for one of my test apps:

    0x0004E22C 0x00000084 __DATA __thread_vars
    0x0004E2B0 0x0000000C __DATA __thread_data
    0x0004E2BC 0x00000024 __DATA __thread_bss

The __thread_vars section has a TLVDescriptor for each thread local. The descriptor caches the pthread key (the one used with pthread_getspecific/pthread_setspecific) and holds the variable's offset into the thread local chunk of memory. That chunk can be initialized by copying __thread_data and __thread_bss (or by just zero-filling the bss part).

> I've never submitted anything to llvm, so not really based on anything than speculation, but I doubt they would accept such a patch, doesn't mean we can't use it though. ;)

Another thing, Apple might consider the tlv functions and thread local sections a reserved API. A long way off from submitting anything to App Store. With the way things change, tlv may show up in a near future sdk, then this just becomes a bridge.

-- Dan
Mar 30 2014
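The descriptor scheme in the post above (one pthread key, a per-thread chunk seeded from the initialized data and zero-filled for the bss part, and a per-variable offset) can be sketched in D roughly as follows. This is only an illustration under stated assumptions: `EmuDescriptor`, `initImage`, `bssSize`, `emuKey`, `createEmuKey` and `emuGetAddr` are hypothetical names, the two-byte init image stands in for the real `__thread_data` contents, and none of this is taken from dyld or druntime.

```d
import core.stdc.stdlib : calloc, free;
import core.stdc.string : memcpy;
import core.sys.posix.pthread;

// Hypothetical descriptor, loosely modelled on the TLVDescriptor quoted in the thread.
struct EmuDescriptor
{
    void* thunk;        // not used by this sketch
    pthread_key_t key;  // pthread key that owns the per-thread block
    size_t offset;      // byte offset of the variable inside that block
}

// Stand-ins for the __thread_data contents and the __thread_bss size (assumed values).
immutable ubyte[2] initImage = [42, 7];
enum size_t bssSize = 6;

// Module-level variables are thread-local by default in D,
// so the key itself has to be __gshared.
__gshared pthread_key_t emuKey;

// Destructor run by pthreads when a thread that owns a block exits.
extern (C) void releaseBlock(void* block) nothrow @nogc
{
    free(block);
}

// Create the shared key once, before any thread-local access.
bool createEmuKey()
{
    return pthread_key_create(&emuKey, &releaseBlock) == 0;
}

// Return the address of the variable described by d for the calling thread,
// allocating and seeding the thread's block on first use.
void* emuGetAddr(EmuDescriptor* d)
{
    void* block = pthread_getspecific(d.key);
    if (block is null)
    {
        // calloc zero-fills, which covers the bss part of the block.
        block = calloc(1, initImage.length + bssSize);
        if (block is null)
            return null;
        memcpy(block, initImage.ptr, initImage.length);
        if (pthread_setspecific(d.key, block) != 0)
        {
            free(block);
            return null;
        }
    }
    return cast(ubyte*) block + d.offset;
}
```

A block allocated this way lives outside the GC heap, so pointers stored in it are invisible to the collector unless the block is registered separately.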
On Sunday, 30 March 2014 at 15:24:53 UTC, Dan Olson wrote:
> "Joakim" <joakim airpost.net> writes:
>
> I think we can use their sections and it did not take long to figure out. Here is what an example link map has for one of my test apps:
>
>     0x0004E22C 0x00000084 __DATA __thread_vars
>     0x0004E2B0 0x0000000C __DATA __thread_data
>     0x0004E2BC 0x00000024 __DATA __thread_bss
>
> The __thread_vars section has a TLVDescriptor for each thread local. It is used for caching the pthread_get/set key and has the variable offset into the thread local chunk of memory that can be initialized by copying __thread_data and __thread_bss (or just zerofill it).

---snip---

> A long way off from submitting anything to App Store. With the way things change, tlv may show up in a near future sdk, then this just becomes a bridge.

Hmm, you and Jacob are probably right, it may be better to just follow what they do.

On Sunday, 30 March 2014 at 15:34:08 UTC, Dan Olson wrote:
> Jacob Carlborg <doob me.com> writes:
>> I would follow the native TLS implementation in OS X, i.e. using "tlv_get_addr", as close as possible. In theory it should be possible to move the code from threadLocalVariables.c and threadLocalHelpers.s directly in to druntime. Hopefully that would mean the same code for generating TLS access could be used both on OS X and iOS.
>
> Do think we can just drop the dyld code into druntime? It should work with perhaps some modifications, but I am not familiar with the Apple opensource license. I should read it. It is BSD-like right?

I think the APSL is more similar to the CDDL, which was Sun's license for OpenSolaris and much of their open-source contributions, and requires that source is provided for APSL-licensed files. I think you could always add an APSL-licensed file to druntime and the licenses would not clash, but that would make druntime not completely Boost-licensed anymore, as the APSL has more requirements than the minimal Boost license. It's probably best to just reimplement the necessary functions yourself.

> Would still need to hook in the garbage collector so it scans the thread local memory. I'll have to try it tonight.

David did this for the TLV code on OS X a year back, so it should be pretty straightforward to do something similar to what he did.

On Sunday, 30 March 2014 at 15:44:52 UTC, Dan Olson wrote:
> "Joakim" <joakim airpost.net> writes:
>> I wonder if it'd be easier to pack your own Mach-O sections rather than figuring out how to access all their sections and reimplementing their TLV functions, assuming they're not available. You might even be able to do it as an llvm patch since the relevant lib/MC/ files where llvm packs the TLS data into Mach-O sections seem pretty straightforward.
>
> Thinking about this some more. It probably makes sense to have an optional approach that can be used on any target that does not have native TLS. This current approach for iOS will only work for Mach-O. I wonder if the LLVM folks are working toward a generic TLS without OS support.

Doesn't look like it, plus it'll need to be specialized for each object format, like Mach-O, ELF, or COFF, anyway. After looking at the relevant llvm source for packing sections to see how it was working for you with Mach-O, I wonder if I won't be able to patch some of the existing llvm files for packing TLS data into ELF and get the TLS variables packed easily that way. I'll try that approach at some point.
Mar 30 2014
"Joakim" <joakim airpost.net> writes:
> I wonder if it'd be easier to pack your own Mach-O sections rather than figuring out how to access all their sections and reimplementing their TLV functions, assuming they're not available. You might even be able to do it as an llvm patch since the relevant lib/MC/ files where llvm packs the TLS data into Mach-O sections seem pretty straightforward.

Thinking about this some more. It probably makes sense to have an optional approach that can be used on any target that does not have native TLS. This current approach for iOS will only work for Mach-O. I wonder if the LLVM folks are working toward a generic TLS without OS support.

-- Dan
Mar 30 2014
On 2014-03-30 10:22, Dan Olson wrote:
> "Joakim" <joakim airpost.net> writes:
>> On Thursday, 27 March 2014 at 16:01:31 UTC, Dan Olson wrote:
>>> If nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.
>>
>> Whatever I do to implement packed TLS in the dmd backend is not going to work for ldc anyway, so nothing stopping you from making your own effort. You will have to patch llvm also, if the weak symbols bug David pointed out is still around in llvm 3.5. Let us know what approach you take.
>
> The approach I started with was to make LLVM do the work. I read through all of the comments in this thread and decided this might be the most fun. ARMISelLowering.cpp has TLS disabled for all but ELF targets. I commented out an assertion blocking other targets to see what would happen for iOS (Mach-O). To my surprise, found that Mach-O tls sections are generated (__thread_vars, __thread_data, .tbss) and populated with the D thread local vars.
>
> The load/store instructions were treating TLS vars like global data though. So I looked at the Mach-O X86 version and saw what it is trying to do. LLVM coding is still a mystery to me, but managed after many hours today to hack together something that would turn this D code
>
>     module tlsd;
>     int a;
>     void test()
>     {
>         a += 4; // access a
>     }
>
> into this:
>
>     movw r0, :lower16:(__D4tlsd1ai-(LPC4_0+4))
>     movt r0, :upper16:(__D4tlsd1ai-(LPC4_0+4))
>     LPC4_0:
>     add r0, pc
>     blx ___tls_get_addr
>     ldr r1, [r0]
>     str r1, [r0]
>     ...
>     .tbss __D4tlsd1ai$tlv$init, 4, 2
>     .section __DATA,__thread_vars,thread_local_variables
>     .globl __D4tlsd1ai
>     __D4tlsd1ai:
>     .long __tlv_bootstrap
>     .long 0
>     .long __D4tlsd1ai$tlv$init
>
> The following link helped explain what is going on with the __thread_vars data layout.
>
> http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalVariables.c
>
> Mach-O dyld replaces tlv_bootstrap (thunk) with tlv_get_addr in the TLVDescriptor (__thread_vars). My LLVM hack for now is just doing a direct call to __tls_get_addr instead of indirect to tlv_get_addr. For proof of concept (one thread only), I have __tls_get_addr hard wired as follows:
>
>     extern (C)
>     {
>         struct TLVDescriptor
>         {
>             void* function(TLVDescriptor*) thunk;
>             uint key;
>             uint offset;
>         }
>
>         //void* tlv_get_addr(TLVDescriptor* d)
>         //void* __tls_get_addr(void* ptr)
>         void* __tls_get_addr(TLVDescriptor* tlvd)
>         {
>             __gshared static ubyte[512] data;
>             printf("__tls_get_addr %p \n", tlvd);
>             printf("thunk %p, key %u, offset %u\n", tlvd.thunk, tlvd.key, tlvd.offset);
>             return data.ptr + tlvd.offset;
>         }
>
>         void _tlv_bootstrap()
>         {
>             assert(false, "Should not get here");
>         }
>     }
>
> It looks promising. Next step is to add in some realistic runtime support. Not sure if I will base it on dmd's sections-osx or the Apple dyld. Probably a hybrid.

I would follow the native TLS implementation in OS X, i.e. using "tlv_get_addr", as closely as possible. In theory it should be possible to move the code from threadLocalVariables.c and threadLocalHelpers.s directly into druntime. Hopefully that would mean the same code for generating TLS access could be used both on OS X and iOS.

-- /Jacob Carlborg
Mar 30 2014
Jacob Carlborg <doob me.com> writes:
> I would follow the native TLS implementation in OS X, i.e. using "tlv_get_addr", as close as possible. In theory it should be possible to move the code from threadLocalVariables.c and threadLocalHelpers.s directly in to druntime. Hopefully that would mean the same code for generating TLS access could be used both on OS X and iOS.

Do you think we can just drop the dyld code into druntime? It should work with perhaps some modifications, but I am not familiar with the Apple open source license. I should read it. It is BSD-like, right?

Would still need to hook in the garbage collector so it scans the thread local memory. I'll have to try it tonight.

-- Dan
Mar 30 2014
On 30/03/14 17:34, Dan Olson wrote:
> Do think we can just drop the dyld code into druntime?

Yes, with minor modifications. The TLS related code in dyld is pretty much self contained. I don't see dyld using any functionality that isn't available to a regular application.

> It should work with perhaps some modifications, but I am not familiar with the Apple opensource license. I should read it. It is BSD-like right?

The license is a completely different issue. The safest would be to re-implement the code. One person can document the existing code and someone else can do the implementation. Regardless of the license, you can still give it a try to see if the technical parts work.

> Would still need to hook in the garbage collector so it scans the thread local memory. I'll have to try it tonight.

You'll just need to add a call to druntime in one of the functions in the dyld TLS code. Have a look at:

https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d

-- /Jacob Carlborg
Mar 30 2014
On 31 Mar 2014, at 8:25, Jacob Carlborg wrote:
> You'll just need to add a call to druntime in one of the functions in the dyld TLS code. Have a look at:
>
> https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d

More specifically, for the DMD TLS emulation implementation, this is done in the initTLSRanges() function, which forwards to getTLSBlock(). IIRC, initTLSRanges() is only called for new threads. For the main thread, the TLS range is included in the GC ranges detected in initSections().

For LDC on OS X, which makes use of the 10.7+ system-level TLS implementation, the place where this is handled is https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/rt/sections_ldc.d#L296. _d_dyld_getTLSRange uses an undocumented dyld API function (dyld_enumerate_tlv_storage) to get the actual TLS memory range on the current thread: https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/ldc/osx_tls.c.

David
Mar 31 2014
"David Nadlinger" <code klickverbot.at> writes:
> On 31 Mar 2014, at 8:25, Jacob Carlborg wrote:
>> You'll just need to add a call to druntime in one of the functions in the dyld TLS code. Have a look at:
>>
>> https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d
>
> More specifically, for the DMD TLS emulation implementation, this is done in the initTLSRanges() function, which forwards to getTLSBlock(). IIRC, initTLSRanges() is only called for new threads. For the main thread, the TLS ranges is included in the GC ranges detected in initSections().
>
> For LDC on OS X, which makes use of the 10.7+ system-level TLS implementation, the place where this is handled is https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/rt/sections_ldc.d#L296. _d_dyld_getTLSRange uses an undocumented dyld API function (dyld_enumerate_tlv_storage) to get the actual TLS memory range on the current thread: https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/ldc/osx_tls.c.
>
> David

I had disabled initTLSRanges for iOS since dyld_enumerate_tlv_storage is a stub for x86 (see http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalVariables.c).

Now that I have tweaked threadLocalVariables.c, dyld_enumerate_tlv_storage should work on iOS. I will have to reenable initTLSRanges and see what happens.

-- Dan
Mar 31 2014
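The reason the GC hook matters is that a TLS block allocated outside the GC heap is invisible to the collector, so objects referenced only from thread locals can be freed while still in use. As the posts above describe, druntime reports the TLS range of each thread to the collector through its own section-handling code (initTLSRanges and friends). The sketch below gets a similar effect through the public `core.memory.GC.addRange` API instead, which is a swapped-in technique and not what druntime does internally. `allocScannedBlock` and `freeScannedBlock` are hypothetical helper names.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

// Allocate a zero-filled block outside the GC heap and ask the GC to scan it,
// so that pointers stored in the block keep their targets alive.
void* allocScannedBlock(size_t size)
{
    void* block = calloc(1, size);
    if (block !is null)
        GC.addRange(block, size);
    return block;
}

// Unregister the block before releasing it, so the GC never scans freed memory.
void freeScannedBlock(void* block)
{
    if (block is null)
        return;
    GC.removeRange(block);
    free(block);
}
```

In an emulated-TLS runtime the allocation would happen when a thread first touches a thread local, and the release would happen in the thread-exit path (for example a pthread key destructor).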
Dan Olson <zans.is.for.cans yahoo.com> writes:
> "David Nadlinger" <code klickverbot.at> writes:
>> On 31 Mar 2014, at 8:25, Jacob Carlborg wrote:
>>> You'll just need to add a call to druntime in one of the functions in the dyld TLS code. Have a look at:
>>>
>>> https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d
>>
>> More specifically, for the DMD TLS emulation implementation, this is done in the initTLSRanges() function, which forwards to getTLSBlock(). IIRC, initTLSRanges() is only called for new threads. For the main thread, the TLS ranges is included in the GC ranges detected in initSections().
>>
>> For LDC on OS X, which makes use of the 10.7+ system-level TLS implementation, the place where this is handled is https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/rt/sections_ldc.d#L296. _d_dyld_getTLSRange uses an undocumented dyld API function (dyld_enumerate_tlv_storage) to get the actual TLS memory range on the current thread: https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/ldc/osx_tls.c.
>>
>> David
>
> I had disabled initTLSRanges for iOS since dyld_enumerate_tlv_storage is a stub for x86 (see http://www.opensource.apple.com/source/dyld/dyld-210.2.3/src/threadLocalVariables.c).

I meant it is a stub for ARM.

> Now that I have tweaked threadLocalVariables.c, dyld_enumerate_tlv_storage should now work on iOS. I will have to reenable initTLSRanges and see what happens.

I did reenable it and it works. I can tell because the std.datetime unittest uses enough memory that it causes a GC. When I first rebuilt everything with TLS enabled and plugged in support from threadLocalVariables.c (but without initTLSRanges enabled), the std.datetime unittest started crashing. The datetime unittests have a fair number of thread locals. Then I reenabled David's initTLSRanges() for iOS, and the std.datetime unittest went back to passing.

-- Dan
Apr 01 2014
On 2014-03-31 16:05, David Nadlinger wrote:
> For LDC on OS X, which makes use of the 10.7+ system-level TLS implementation, the place where this is handled is https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/rt/sections_ldc.d#L296. _d_dyld_getTLSRange uses an undocumented dyld API function (dyld_enumerate_tlv_storage) to get the actual TLS memory range on the current thread: https://github.com/ldc-developers/druntime/blob/a08f158618eb5d06c42bd4746b782312e937f6b3/src/ldc/osx_tls.c.

"dyld_enumerate_tlv_storage" should probably be replaced with a function that is publicly available, at some point.

-- /Jacob Carlborg
Mar 31 2014
On 31 Mar 2014, at 19:39, Jacob Carlborg wrote:
> "dyld_enumerate_tlv_storage" should probably be replaced with a function that is publicly available, at some point.

If you find such a function, please let me know (or, better, submit a pull request). Maybe it is possible to reimplement dyld_enumerate_tlv_storage using public APIs, but back then I didn't spend too much time on investigating that.

David
Mar 31 2014
On 2014-03-31 19:42, David Nadlinger wrote:
> If you find such a function, please let me know (or, better, submit a pull request).

Hmm, it might be a bit more complicated than I first thought. I might have a look at it some time.

> Maybe it is possible to reimplement dyld_enumerate_tlv_storage using public APIs, but back then I didn't spend too much time on investigating that.

Fair enough.

-- /Jacob Carlborg
Mar 31 2014
Jacob Carlborg <doob me.com> writes:
> Regardless of the license, you can still give a try to see if the technical parts work.

I have tried it, with success. I added threadLocalHelpers.s and threadLocalVariables.c, modified to enable them for arm, then had to sprinkle in some missing types from dl_priv.h. Then I put in a call to tlv_initializer(). The proof is that thread locals get proper initial values when accessed through tlv_get_addr(). For example, a thread local double is being initialized to nan.

LLVM is still emitting a call to __tls_get_addr, so to try this out, I changed my test __tls_get_addr() implementation to forward to tlv_get_addr() in threadLocalHelpers.s.

    import core.stdc.stdio : printf, puts;

    extern (C)
    void* __tls_get_addr(TLVDescriptor* tlvd)
    {
        __gshared static ubyte[512] data;
        printf("__tls_get_addr %p \n", tlvd);
        printf("thunk %p, key %u, offset %u\n", tlvd.thunk, tlvd.key, tlvd.offset);

        // tlv_initializer() will change thunk to tlv_get_addr
        if (tlvd.thunk is &tlv_get_addr)
        {
            puts("calling real tlv_get_addr instead");
            return tlv_get_addr(tlvd);
        }

        // tlv not initialized yet, return my fake thread local data.
        return data.ptr + tlvd.offset;
    }
Mar 31 2014
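The thunk mechanism referred to above (each descriptor starts out pointing at a bootstrap stub, and an initializer later swaps in the real accessor, so compiler-emitted code only ever makes an indirect call through the descriptor) can be illustrated with a small single-threaded D sketch. All names here (`Descriptor`, `bootstrap`, `getAddr`, `initializeDescriptors`, `addressOf`, `storage`) are hypothetical, and the fixed `storage` array stands in for a real per-thread block.

```d
// Hypothetical descriptor with the same three-field shape as the one in the thread.
struct Descriptor
{
    void* function(Descriptor*) thunk; // called indirectly on every access
    size_t key;                        // unused in this sketch
    size_t offset;                     // byte offset of the variable
}

// Single-threaded stand-in for the per-thread block (int-aligned storage).
__gshared int[16] storage;

// Initial thunk: reaching it means the descriptors were never initialized.
void* bootstrap(Descriptor* d)
{
    assert(false, "descriptor used before initialization");
}

// Real accessor: resolves a descriptor to an address inside the block.
void* getAddr(Descriptor* d)
{
    return cast(ubyte*) storage.ptr + d.offset;
}

// Roughly what an initializer does: replace the bootstrap thunk in every descriptor.
void initializeDescriptors(Descriptor[] descs)
{
    foreach (ref d; descs)
        if (d.thunk is &bootstrap)
            d.thunk = &getAddr;
}

// Roughly what an access to a thread-local int lowers to: an indirect call via the thunk.
int* addressOf(Descriptor* d)
{
    return cast(int*) d.thunk(d);
}
```

With this shape the code generator never needs to know which accessor is installed, which is why the same emitted access sequence could in principle serve both a native and a druntime-provided implementation.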
On 31 Mar 2014, at 17:23, Dan Olson wrote:
> I added threadLocalHelpers.s and threadLocalVariables.c, modified to enable for arm, then had to sprinkle in some missing types from dl_priv.h. Then put in a call to tlv_initializer(). The proof is that thread locals get proper initial values when accessed through tlv_get_addr(). For example, a thread local double is being initialized to nan.

Nice!

David
Mar 31 2014
On 2014-03-31 17:23, Dan Olson wrote:
> I have tried and success. I added threadLocalHelpers.s and threadLocalVariables.c, modified to enable for arm, then had to sprinkle in some missing types from dl_priv.h. Then put in a call to tlv_initializer(). The proof is that thread locals get proper initial values when accessed through tlv_get_addr(). For example, a thread local double is being initialized to nan.

Awesome :)

-- /Jacob Carlborg
Mar 31 2014
On 27 Mar 2014, at 17:01, Dan Olson wrote:
> If nobody is working on the emulated TLS for LDC, I will give it a try. Nothing to lose.

Would be great, I don't think anybody else is working on this right now.

David
Mar 27 2014
David Nadlinger <code klickverbot.at> writes:
> On 03/08/2014 01:55 AM, Joakim wrote:
>> Do you have any advice on how to pull this off with ldc? Should I be going the dmd route and packing the TLS myself? Does llvm provide good support for this? Or is there some other llvm TLS shortcut I can use? I tried to see if llvm just has some thread-local implementation that automatically uses pthread_setspecific, but didn't find anything.
>
> LLVM does support putting variables into custom sections, and you can more or less get away with the DMD bracketing approach (see e.g. the new ModuleInfo discovery functionality I implemented for Linux, which is the same as DMD's druntime uses). However, there is a catch: Due to what I can only imagine is a bug, LLVM does not support emitting a symbol both into a custom section and with weak linkage. Thus, you might be in for a round of LLVM hacking either way, even though it will likely involve much less when going the DMD route.
>
> However, there is a third option which might be worth investigating, namely re-implementing at least parts of the necessary runtime linker features in druntime and continuing to use the same scheme as on GNU Linux/x86. This depends on %gs not being used in another way, etc. though.
>
> David

While on the subject of TLS, that is probably the most needed language feature to allow threading to work reliably on iOS. So hoping the solution will work on iOS too!

Another topic - I was looking at adding fiber_switchContext support for arm in threadasm.S, and noticed GDC's version has an arm implementation. Is it ok to use portions of GDC source in LDC?

-- Dan
Mar 08 2014
On Sat, Mar 8, 2014 at 8:11 PM, Dan Olson <zans.is.for.cans yahoo.com> wrote:
> Another topic - I was looking at adding fiber_switchContext support for arm in threadasm.S, and noticed GDC's version has an arm implementation. Is it ok to use portions of GDC source in LDC?

If their code is Boost-licensed (general druntime/Phobos license), yes.

David
Mar 08 2014
David Nadlinger <code klickverbot.at> writes:
> On Sat, Mar 8, 2014 at 8:11 PM, Dan Olson <zans.is.for.cans yahoo.com> wrote:
>> Another topic - I was looking at adding fiber_switchContext support for arm in threadasm.S, and noticed GDC's version has an arm implementation. Is it ok to use portions of GDC source in LDC?
>
> If their code is Boost-licensed (general druntime/Phobos license), yes.
>
> David

Yes, that file is Boost - good!
Mar 08 2014
On Saturday, 8 March 2014 at 19:11:52 UTC, Dan Olson wrote:
> While on the subject of TLS, that is probably the most needed language feature to allow threading to work reliably on iOS. So hoping the solution will work on iOS too!

I wondered earlier why you weren't just using Walter's packed TLS approach and now I see why, ldc doesn't use it. Looks like Apple hasn't ported the TLV functions which ldc uses to iOS yet either, so you're out of luck there too. I guess you'll have to port Walter's approach to ldc to get TLS working on iOS:

https://github.com/D-Programming-Language/dmd/blob/master/src/backend/machobj.c#L1673

Either that or get llvm to emit the right pthread calls, like I mentioned earlier.
Mar 08 2014
On 2014-03-09 07:04, Joakim wrote:
> I wondered earlier why you weren't just using Walter's packed TLS approach and now I see why, ldc doesn't use it. Looks like Apple hasn't ported the TLV functions which ldc uses to iOS yet either, so you're out of luck there too. I guess you'll have to port Walter's approach to ldc to get TLS working on iOS:

I think it would be possible to implement the missing TLV functions ourselves in druntime. Hopefully this would allow us to use the same TLS approach both on OS X and on iOS.

-- /Jacob Carlborg
Mar 09 2014
On Sunday, 9 March 2014 at 09:55:33 UTC, Jacob Carlborg wrote:
> On 2014-03-09 07:04, Joakim wrote:
>> I wondered earlier why you weren't just using Walter's packed TLS approach and now I see why, ldc doesn't use it. Looks like Apple hasn't ported the TLV functions which ldc uses to iOS yet either, so you're out of luck there too. I guess you'll have to port Walter's approach to ldc to get TLS working on iOS:
>
> I think it would be possible to implement the missing TLV functions our self in druntime. Hopefully this would allow to use the same TLS approach both on OS X and on iOS.

OK, I assumed OS support was necessary, maybe not.

On Saturday, 8 March 2014 at 18:16:58 UTC, Joakim wrote:
> On Saturday, 8 March 2014 at 14:25:43 UTC, David Nadlinger wrote:
>> However, there is a third option which might be worth investigating, namely re-implementing at least parts of the necessary runtime linker features in druntime and continuing to use the same scheme as on GNU Linux/x86. This depends on %gs not being used in another way, etc. though.
>
> I tried to reuse the existing dl_iterate_phdr approach on Android, but then I noticed that the dl_phdr_info struct defined in bionic doesn't include the dlpi_tls_modid and dlpi_tls_data members. However, now that you mention it, maybe those aren't strictly necessary, as long as I'm not worried about shared libraries. I'll look into it further.

Speaking of OS support, I just tried this and I was able to access the TLS initialization image using dl_phdr_info on Android/x86. Those dlpi_tls_* members are not necessary, though I'm guessing dlpi_tls_modid would be for shared library support. Now I just have to figure out some way to have the TLS relocations access the initialization image, presumably the way Walter does it for dmd/OSX.
Mar 09 2014
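Locating the TLS initialization image through dl_phdr_info, without touching the dlpi_tls_* members, amounts to walking the program headers and picking out the PT_TLS segment. The D sketch below shows the idea against the glibc-style bindings in druntime's `core.sys.linux.link` and `core.sys.linux.elf`. Whether those modules are usable unchanged on Android/bionic is an assumption, and `TlsImage`, `findTlsSegment` and `firstTlsImage` are hypothetical names. It is Linux-only as written.

```d
import core.sys.linux.elf : PT_TLS;
import core.sys.linux.link : dl_iterate_phdr, dl_phdr_info;

// Hypothetical result type describing the TLS initialization image of one object.
struct TlsImage
{
    const(void)* start; // address of the initialized part in memory
    size_t fileSize;    // bytes of initialized data (the .tdata part)
    size_t memSize;     // total size including the zero-filled .tbss part
    bool found;
}

// Callback for dl_iterate_phdr: record the first PT_TLS segment and stop iterating.
extern (C) int findTlsSegment(dl_phdr_info* info, size_t, void* data) nothrow @nogc
{
    auto result = cast(TlsImage*) data;
    foreach (ref phdr; info.dlpi_phdr[0 .. info.dlpi_phnum])
    {
        if (phdr.p_type == PT_TLS)
        {
            // Segment addresses are relative to the object's load address.
            result.start = cast(const(void)*) (info.dlpi_addr + phdr.p_vaddr);
            result.fileSize = cast(size_t) phdr.p_filesz;
            result.memSize = cast(size_t) phdr.p_memsz;
            result.found = true;
            return 1; // a non-zero return ends the iteration
        }
    }
    return 0; // no TLS segment in this object, continue with the next one
}

// Returns the first TLS image found; the main executable is visited first.
TlsImage firstTlsImage()
{
    TlsImage image;
    dl_iterate_phdr(&findTlsSegment, &image);
    return image;
}
```

An emulated-TLS runtime would copy `fileSize` bytes from `start` into each new thread's block and zero the remaining `memSize - fileSize` bytes. Handling more than one object (shared libraries) is where the module id from dlpi_tls_modid would normally come in.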
On 2014-03-09 11:11, Joakim wrote:
> OK, I assumed OS support was necessary, maybe not.

Well, yes. In this case the OS support comes in the form of the dynamic linker. We can do the same as the dynamic linker does in druntime. I don't know if it helps but the dynamic linker on OS X has code for tlv_get_addr for ARM, but it's disabled.

-- /Jacob Carlborg
Mar 09 2014