digitalmars.D.ldc - --emulated-tls explanation?
- Denis Feklushkin (4/4) Oct 13 2020 Hi!
- Denis Feklushkin (6/10) Oct 13 2020 Problem was:
- Denis Feklushkin (3/14) Oct 13 2020 I.e., compiler presumes that.
- IGotD- (17/21) Oct 13 2020 I think it is a compatibility layer that GCC provides. Instead of
- Denis Feklushkin (6/23) Oct 13 2020 So, compiler knows what this platform is not supports
- IGotD- (17/23) Oct 13 2020 You can see the implementation yourself.
- Denis Feklushkin (8/34) Oct 13 2020 Ok, I see in my binary what if I use "--emulated-tls" 3-rd party
- IGotD- (9/11) Oct 13 2020 If you don't have emulated TLS, your build shouldn't even succeed
- Denis Feklushkin (4/14) Oct 13 2020 It is provided by "picolibc" library.
- IGotD- (15/30) Oct 13 2020 The function prototype of __tls_get_address is wrong. It should be
- IGotD- (11/25) Oct 13 2020 Just to add to the confusion. If you compile everything
- Denis Feklushkin (5/23) Oct 13 2020 It is implemented inside of LLVM?
- Denis Feklushkin (3/12) Oct 13 2020 Don't worry, found it. Thanks again!
- IGotD- (14/17) Oct 13 2020 Sorry, I can't because it is a mess. I think that __tls_get_addr
- Denis Feklushkin (4/7) Oct 14 2020 Looks like this problem is solvable: it is possible to push
- IGotD- (8/11) Oct 15 2020 That's the same as implementing your own version so you don't
- Denis Feklushkin (6/14) Oct 16 2020 I am not sure, but looks like different arguments for
- Denis Feklushkin (3/13) Oct 13 2020 (I don't spawn threads)
Hi! Can anyone explain what "--emulated-tls" actually does? It solves my problem with correct placement of static variables on ARM Cortex M3, but I don't know why.
Oct 13 2020
On Tuesday, 13 October 2020 at 09:32:07 UTC, Denis Feklushkin wrote:
> Hi! Can anyone explain what "--emulated-tls" actually does? It solves my problem with correct placement of static variables on ARM Cortex M3, but I don't know why.
The problem was: without --emulated-tls, static member variables were sometimes(?) placed at the same address. This does not affect usual (TLS) variables, or shared/__gshared ones.
Oct 13 2020
On Tuesday, 13 October 2020 at 09:59:40 UTC, Denis Feklushkin wrote:
> On Tuesday, 13 October 2020 at 09:32:07 UTC, Denis Feklushkin wrote:
>> Hi! Can anyone explain what "--emulated-tls" actually does? It solves my problem with correct placement of static variables on ARM Cortex M3, but I don't know why.
> The problem was: without --emulated-tls, static member variables were sometimes(?) placed at the same address.
I.e., compiler presumes that.
Oct 13 2020
On Tuesday, 13 October 2020 at 09:32:07 UTC, Denis Feklushkin wrote:
> Hi! Can anyone explain what "--emulated-tls" actually does? It solves my problem with correct placement of static variables on ARM Cortex M3, but I don't know why.
I think it is a compatibility layer that GCC provides. Instead of implementing everything yourself from scratch, GCC provides a framework and a set of hooks you should implement.

http://www.chiark.greenend.org.uk/doc/gcc-4.9-doc/gccint.html#Emulated-TLS

It seems like GCC provides default hooks, so for example if threading is not enabled this TLS emulation layer is probably pretty stupid and does not know what a thread is. The variables are dynamically allocated using the C library memory allocation functions.

In practice you should read about the Runtime ABI for the ARM architecture, how TLS is implemented for ARM. For a custom system you have all the degrees of freedom and can do what you want, which is usually better. ARM and GCC also offer several options for how to implement it: static version, dynamic version, a mix, use the thread pointer register, use a function for retrieval etc.
Oct 13 2020
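To make the "dynamically allocated" part of the reply above concrete, here is a minimal single-threaded sketch in C of what an emulated-TLS lookup does. It is loosely modelled on the non-threaded path of libgcc's emutls.c; the struct and function names (emutls_control, emutls_get_address_sketch, counter_ctl, flag_ctl) are made up for this illustration and are not the real symbols. The point is that every variable gets its own control object, so every variable ends up with its own storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical control object, one per thread-local variable. */
struct emutls_control {
    size_t size;        /* size of the variable in bytes */
    size_t align;       /* required alignment; malloc's alignment is assumed to be enough here */
    void *ptr;          /* storage for the only "thread"; NULL until first access */
    const void *templ;  /* initial value, or NULL for zero-initialisation */
};

/* Single-threaded stand-in for the emulation entry point: allocate on first use. */
static void *emutls_get_address_sketch(struct emutls_control *c)
{
    if (c->ptr == NULL) {
        c->ptr = malloc(c->size);
        assert(c->ptr != NULL);
        if (c->templ != NULL)
            memcpy(c->ptr, c->templ, c->size);
        else
            memset(c->ptr, 0, c->size);
    }
    return c->ptr;
}

/* Roughly what a compiler could emit for two thread-local ints (assumed names). */
static const int counter_init = 42;
static struct emutls_control counter_ctl = { sizeof(int), _Alignof(int), NULL, &counter_init };
static struct emutls_control flag_ctl    = { sizeof(int), _Alignof(int), NULL, NULL };
```

An access to the first variable then becomes something like `*(int *)emutls_get_address_sketch(&counter_ctl)`, and the second variable goes through `flag_ctl`, so the two can never share an address.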
On Tuesday, 13 October 2020 at 10:25:35 UTC, IGotD- wrote:
> On Tuesday, 13 October 2020 at 09:32:07 UTC, Denis Feklushkin wrote:
>> Hi! Can anyone explain what "--emulated-tls" actually does? It solves my problem with correct placement of static variables on ARM Cortex M3, but I don't know why.
> I think it is a compatibility layer that GCC provides. Instead of implementing everything yourself from scratch, GCC provides a framework and a set of hooks you should implement. http://www.chiark.greenend.org.uk/doc/gcc-4.9-doc/gccint.html#Emulated-TLS It seems like GCC provides default hooks, so for example if threading is not enabled this TLS emulation layer is probably pretty stupid and does not know what a thread is.
So, the compiler knows that this platform does not support multithreading and does some things wrong with thread static variables if "--emulated-tls" is omitted?
> The variables are dynamically allocated using the C library memory allocation functions.
As I understand, the variables are allocated by the compiler, but it uses an internal implicit call to __tls_get_addr to provide access to them.
Oct 13 2020
On Tuesday, 13 October 2020 at 10:35:57 UTC, Denis Feklushkin wrote:
> So, the compiler knows that this platform does not support multithreading and does some things wrong with thread static variables if "--emulated-tls" is omitted?
You can see the implementation yourself.

https://github.com/gcc-mirror/gcc/blob/master/libgcc/emutls.c

I have used TLS emulation myself and it just works even though the library has no definition of threads or mutexes, so I guess these are just stubs or use stubs in the C library in that case. Basically single threaded TLS.
> As I understand, the variables are allocated by the compiler, but it uses an internal implicit call to __tls_get_addr to provide access to them.
If SW call is chosen then __tls_get_addr is the function that is used in order to obtain the address of a TLS variable. If emulation is used this function is just forwarded to the emulation function. It is almost easier to implement __tls_get_addr yourself and skip the emulation.

If you look at the emulation layer it is filled with mutexes and mallocs and on simple systems this can be totally avoided if you use your own solution. Especially in real-time systems, the emulation layer should not be used for obvious reasons.
Oct 13 2020
On Tuesday, 13 October 2020 at 11:02:56 UTC, IGotD- wrote:
> On Tuesday, 13 October 2020 at 10:35:57 UTC, Denis Feklushkin wrote:
>> So, the compiler knows that this platform does not support multithreading and does some things wrong with thread static variables if "--emulated-tls" is omitted?
> You can see the implementation yourself. https://github.com/gcc-mirror/gcc/blob/master/libgcc/emutls.c I have used TLS emulation myself and it just works even though the library has no definition of threads or mutexes, so I guess these are just stubs or use stubs in the C library in that case. Basically single threaded TLS.
>> As I understand, the variables are allocated by the compiler, but it uses an internal implicit call to __tls_get_addr to provide access to them.
> If SW call is chosen then __tls_get_addr is the function that is used in order to obtain the address of a TLS variable. If emulation is used this function is just forwarded to the emulation function. It is almost easier to implement __tls_get_addr yourself and skip the emulation.
Ok, I see in my binary that if I use "--emulated-tls", the 3rd party function __tls_get_address (provided by picolibc) is replaced by __emutls_get_address. But it is still not clear why static variables are now not "superimposed" on one another at the same addresses.
> If you look at the emulation layer it is filled with mutexes and mallocs and on simple systems this can be totally avoided if you use your own solution. Especially in real-time systems, the emulation layer should not be used for obvious reasons.
Yes, we already have fibers for this. However, at least one TLS must be created that belongs to the main thread.
Oct 13 2020
On Tuesday, 13 October 2020 at 11:13:16 UTC, Denis Feklushkin wrote:
> But it is still not clear why static variables are now not "superimposed" on one another at the same addresses.
If you don't have emulated TLS, your build shouldn't even succeed if you haven't implemented __tls_get_address. So if you have a build that you can test, where does this __tls_get_address come from? Another possibility is that you can use the thread pointer register directly and not use __aeabi_read_tp.

__aeabi_read_tp is basically the function that is used for obtaining the thread pointer by SW instead, for the initial exec model.
Oct 13 2020
On Tuesday, 13 October 2020 at 11:23:23 UTC, IGotD- wrote:
> On Tuesday, 13 October 2020 at 11:13:16 UTC, Denis Feklushkin wrote:
>> But it is still not clear why static variables are now not "superimposed" on one another at the same addresses.
> If you don't have emulated TLS, your build shouldn't even succeed if you haven't implemented __tls_get_address. So if you have a build that you can test, where does this __tls_get_address come from?
It is provided by the "picolibc" library. Actually it provides __aeabi_read_tp but I wrap it:

https://github.com/denizzzka/d_c_arm_test/blob/master/d/freertos_druntime_backend/external/rt/sections.d#L45
Oct 13 2020
On Tuesday, 13 October 2020 at 11:27:45 UTC, Denis Feklushkin wrote:
> On Tuesday, 13 October 2020 at 11:23:23 UTC, IGotD- wrote:
>> On Tuesday, 13 October 2020 at 11:13:16 UTC, Denis Feklushkin wrote:
>>> But it is still not clear why static variables are now not "superimposed" on one another at the same addresses.
>> If you don't have emulated TLS, your build shouldn't even succeed if you haven't implemented __tls_get_address. So if you have a build that you can test, where does this __tls_get_address come from?
> It is provided by the "picolibc" library. Actually it provides __aeabi_read_tp but I wrap it: https://github.com/denizzzka/d_c_arm_test/blob/master/d/freertos_druntime_backend/external/rt/sections.d#L45
The function prototype of __tls_get_address is wrong. It should be

    struct tls_index
    {
        size_t ti_module;
        size_t ti_offset;
    };

    void* __tls_get_addr(tls_index* ti)

You perhaps don't use modules but you certainly need an offset. It should rather be something like

    void* __tls_get_addr(tls_index* ti)
    {
        return getThreadTlsArea(ti->ti_module) + ti->ti_offset;
    }
Oct 13 2020
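The effect described in the reply above can be reproduced on a host machine with a small C sketch. It assumes a single statically reserved TLS block (main_tls_block, a made-up name) and compares a lookup that ignores its argument with one that adds ti_offset. Whether the wrapper in the linked sections.d really ignores the offset is an assumption taken from the reply, not something checked here.

```c
#include <assert.h>
#include <stddef.h>

struct tls_index {
    size_t ti_module;
    size_t ti_offset;
};

/* Hypothetical: one statically reserved TLS block for the only thread. */
static _Alignas(8) unsigned char main_tls_block[64];

/* Lookup without an offset: every variable aliases the start of the block. */
static void *tls_get_addr_no_offset(const struct tls_index *ti)
{
    (void)ti; /* argument ignored, which is the suspected bug */
    return main_tls_block;
}

/* Lookup with an offset: each variable gets its own slot inside the block. */
static void *tls_get_addr_sketch(const struct tls_index *ti)
{
    assert(ti->ti_offset < sizeof main_tls_block);
    return main_tls_block + ti->ti_offset;
}
```

With two index records `{0, 0}` and `{0, 4}`, the first function returns the same pointer for both, which matches the "superimposed" static variables from the original question, while the second returns addresses four bytes apart.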
On Tuesday, 13 October 2020 at 11:38:33 UTC, IGotD- wrote:
> The function prototype of __tls_get_address is wrong. It should be struct tls_index { size_t ti_module; size_t ti_offset; }; void* __tls_get_addr(tls_index* ti) You perhaps don't use modules but you certainly need an offset. It should rather be something like void* __tls_get_addr(tls_index* ti) { return getThreadTlsArea(ti->ti_module) + ti->ti_offset; }
Just to add to the confusion. If you compile everything statically into one binary, __tls_get_addr should never really be called, at least with C/C++. Then the compiler should optimize and call __aeabi_read_tp directly. The compiler inserts TP + offset itself instead, as it assumes all statically and dynamically linked modules that are loaded during program start have already allocated the TLS area and TP is valid.

However, I've seen that D seems to insert calls to __tls_get_addr anyway, as if the initial exec model optimization doesn't exist. It's an open question whether that model is implemented in D.
Oct 13 2020
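For comparison, the "TP + offset" access mentioned in the reply above can be sketched in plain C. This is only an illustration under assumptions: read_tp_sketch stands in for __aeabi_read_tp, the offsets are invented rather than assigned by a linker, and any fixed bias a real ABI may add between the thread pointer and the first variable is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TLS area for the only thread; int32_t elements keep accesses well-typed. */
static int32_t tls_area[8];

/* Stand-in for __aeabi_read_tp(): returns the thread pointer. */
static void *read_tp_sketch(void)
{
    return tls_area;
}

/* Byte offsets a linker would normally assign; made up for this sketch. */
enum { OFFSET_COUNTER = 0, OFFSET_FLAG = 4 };

/* Address of a 32-bit thread-local variable: thread pointer plus constant offset. */
static int32_t *tls_var(size_t offset)
{
    assert(offset + sizeof(int32_t) <= sizeof tls_area);
    return (int32_t *)((unsigned char *)read_tp_sketch() + offset);
}

static int32_t *counter_ptr(void) { return tls_var(OFFSET_COUNTER); }
static int32_t *flag_ptr(void)    { return tls_var(OFFSET_FLAG); }
```

No lookup function with a tls_index argument is involved here. The offset is a constant known at link time, which is why this model only works when the whole TLS layout is fixed at program start.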
On Tuesday, 13 October 2020 at 11:38:33 UTC, IGotD- wrote:
>> It is provided by the "picolibc" library. Actually it provides __aeabi_read_tp but I wrap it: https://github.com/denizzzka/d_c_arm_test/blob/master/d/freertos_druntime_backend/external/rt/sections.d#L45
> The function prototype of __tls_get_address is wrong.
It is implemented inside of LLVM? Can you provide a link to the right declaration? Google is full of the "__tls_get_addr()" form
> It should be struct tls_index { size_t ti_module; size_t ti_offset; }; void* __tls_get_addr(tls_index* ti) You perhaps don't use modules but you certainly need an offset. It should rather be something like void* __tls_get_addr(tls_index* ti) { return getThreadTlsArea(ti->ti_module) + ti->ti_offset; }
Yep, sounds like the correct explanation of my issue! Thanks!
Oct 13 2020
On Tuesday, 13 October 2020 at 12:00:34 UTC, Denis Feklushkin wrote:
> On Tuesday, 13 October 2020 at 11:38:33 UTC, IGotD- wrote:
>>> It is provided by the "picolibc" library. Actually it provides __aeabi_read_tp but I wrap it: https://github.com/denizzzka/d_c_arm_test/blob/master/d/freertos_druntime_backend/external/rt/sections.d#L45
>> The function prototype of __tls_get_address is wrong.
> It is implemented inside of LLVM? Can you provide a link to the right declaration?
Don't worry, found it. Thanks again!
Oct 13 2020
On Tuesday, 13 October 2020 at 12:00:34 UTC, Denis Feklushkin wrote:
> It is implemented inside of LLVM? Can you provide a link to the right declaration? Google is full of the "__tls_get_addr()" form
Sorry, I can't because it is a mess. I think that __tls_get_addr is connected to an ELF standard for thread-local storage.

https://akkadia.org/drepper/tls.pdf

Note that the document doesn't cover ARM, only Itanium and other unusual CPUs. However, the declaration of __tls_get_addr is stated there for CPUs other than ARM. I've seen other versions of __tls_get_addr out there as well and it can be architecture dependent. Also keep in mind that I use ARMv7-ar as the reference point here, I cannot say 100% that it is the same for Cortex M3.

The RunTime ABI for ARM

http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C3E4EA7E008BA776DF7E0F54C8F7CCF1?doi=10.1.1.352.5218&rep=rep1&type=pdf

isn't really that helpful either.
Oct 13 2020
On Tuesday, 13 October 2020 at 12:33:57 UTC, IGotD- wrote:
> Sorry, I can't because it is a mess. I think that __tls_get_addr is connected to an ELF standard for thread-local storage.
Looks like this problem is solvable: it is possible to push through your own implementation of (emu)TLS by replacing it in the linker (--wrap=). The emutls implementation is not tied to anything.
Oct 14 2020
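A rough idea of what such a replacement could look like, as a C sketch. GNU ld's --wrap=symbol resolves undefined references to symbol to __wrap_symbol, so linking with --wrap=__emutls_get_address would route the compiler-generated calls to the function below. The control-object layout is an assumption loosely based on libgcc's emutls.c, and the fixed arena (emutls_arena, a made-up name) stands in for malloc so that no allocator or mutex is needed on a single-threaded target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed control-object layout, loosely following libgcc's emutls.c. */
struct emutls_object {
    size_t size;   /* bytes needed by the variable */
    size_t align;  /* alignment, a power of two */
    void *ptr;     /* NULL until storage has been assigned */
    void *templ;   /* initial image, or NULL for zero-fill */
};

/* Hypothetical fixed arena: no malloc, no mutex, one thread only. */
static _Alignas(16) unsigned char emutls_arena[256];
static size_t emutls_used;

/* With --wrap=__emutls_get_address, calls to __emutls_get_address land here. */
void *__wrap___emutls_get_address(struct emutls_object *obj)
{
    if (obj->ptr == NULL) {
        /* Round the next free offset up to the variable's alignment. */
        size_t start = (emutls_used + obj->align - 1) & ~(obj->align - 1);
        assert(obj->align <= 16 && start + obj->size <= sizeof emutls_arena);
        obj->ptr = emutls_arena + start;
        if (obj->templ != NULL)
            memcpy(obj->ptr, obj->templ, obj->size);
        else
            memset(obj->ptr, 0, obj->size);
        emutls_used = start + obj->size;
    }
    return obj->ptr;
}
```

The arena size of 256 bytes is arbitrary. A real build would size it from the total TLS footprint of the program, and a multi-threaded system would need one arena per thread.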
On Thursday, 15 October 2020 at 01:23:53 UTC, Denis Feklushkin wrote:
> Looks like this problem is solvable: it is possible to push through your own implementation of (emu)TLS by replacing it in the linker (--wrap=). The emutls implementation is not tied to anything.
That's the same as implementing your own version so you don't need the TLS emulation at all. TLS in a static system isn't that difficult.

https://wiki.osdev.org/Thread_Local_Storage

The only thing you need to implement is __tls_get_addr and possibly __aeabi_read_tp.
Oct 15 2020
On Thursday, 15 October 2020 at 15:09:45 UTC, IGotD- wrote:
> On Thursday, 15 October 2020 at 01:23:53 UTC, Denis Feklushkin wrote:
>> Looks like this problem is solvable: it is possible to push through your own implementation of (emu)TLS by replacing it in the linker (--wrap=). The emutls implementation is not tied to anything.
> That's the same as implementing your own version so you don't need the TLS emulation at all.
I am not sure, but it looks like different arguments for __tls_get_addr will be used depending on whether emulation is enabled or disabled? If emulation is used, the arguments are the same as for gcc, and these args allow you to avoid ELF-related things.
Oct 16 2020
On Tuesday, 13 October 2020 at 11:13:16 UTC, Denis Feklushkin wrote:
>> If SW call is chosen then __tls_get_addr is the function that is used in order to obtain the address of a TLS variable. If emulation is used this function is just forwarded to the emulation function. It is almost easier to implement __tls_get_addr yourself and skip the emulation.
> Ok, I see in my binary that if I use "--emulated-tls", the 3rd party function __tls_get_address (provided by picolibc) is replaced by __emutls_get_address. But it is still not clear why static variables are now not "superimposed" on one another at the same addresses.
(I don't spawn threads)
Oct 13 2020