digitalmars.D - OS X libphobos2.so
- Kingsley (5/5) Nov 04 2015 Hi
- Jacob Carlborg (5/8) Nov 04 2015 Nothing will happen unless someone fixes those issues.
- bitwise (12/20) Nov 05 2015 I actually made some progress on this. I managed to get dmd to
- Jacob Carlborg (4/6) Nov 05 2015 Then the TLS is left as well.
- Kingsley (12/17) Nov 05 2015 Hi - I would like to help. I don't have the knowledge or skills
- Jacob Carlborg (7/17) Nov 06 2015 It's pretty easy in theory. OS X 10.7 got support for native TLS. DMD
- bitwise (6/25) Nov 06 2015 The issue is very complex, and I wouldn't know where to start
- bitwise (6/11) Nov 06 2015 The existing emulated TLS solution can be modified to work with
- Jacob Carlborg (6/9) Nov 06 2015 I don't see how it can be modified "pretty easily". You don't need
- bitwise (6/15) Nov 06 2015 Currently, the compiler just calls ___tls_get_addr(void *p) to
- Jacob Carlborg (17/20) Nov 07 2015 Hehe, you make it sound so easy. Perhaps I missed something and you know...
- bitwise (37/60) Nov 08 2015 Well, I'm speaking in relative terms when I say easy... ;)
- Kingsley (10/15) Nov 08 2015 Hi Bit,
- Jacob Carlborg (7/17) Nov 09 2015 Not sure if this would be too much work for the first version. But would...
- bitwise (14/39) Nov 09 2015 The AA is not needed. The offset of the TLS var is known at
- Jacob Carlborg (46/58) Nov 09 2015 I was thinking instead of iterating over all loaded images. Something
- bitwise (21/81) Nov 10 2015 Our current approach is already very similar - the one for
- Jacob Carlborg (4/5) Nov 10 2015 Better compatibility, better performance. Why not?
- bitwise (5/9) Nov 10 2015 How so?
- Jacob Carlborg (4/5) Nov 11 2015 But it is, that's why we have this conversation ;)
- David Nadlinger (5/8) Nov 10 2015 It's been quite some time long time since I have looked at the
- bitwise (8/8) Nov 05 2015 On Thursday, 5 November 2015 at 07:28:24 UTC, Jacob Carlborg
Hi

Does anyone know when a version of libphobos2.so will be available on OS X? I understand there are issues preventing us from having one.

-k
Nov 04 2015
On 2015-11-05 01:18, Kingsley wrote:
> Hi Anyone know when a version of libphobos2.so will be available on OS X?

No.

> I understand there are issues preventing us having one.

Nothing will happen unless someone fixes those issues.

-- /Jacob Carlborg
Nov 04 2015
On Thursday, 5 November 2015 at 07:28:24 UTC, Jacob Carlborg wrote:
> On 2015-11-05 01:18, Kingsley wrote:
>> Hi Anyone know when a version of libphobos2.so will be available on OS X?
>
> No.
>
>> I understand there are issues preventing us having one.
>
> Nothing will happen unless someone fixes those issues.

I actually made some progress on this. I managed to get dmd to generate/insert init/term funcs into each module with minimal alterations in the front end. Currently, the init one runs, but the term function causes a segfault. I checked my binary with a Mach-O viewer, and it shows that the pointer I've put into the __mod_term_funcs section somehow points at writeline instead... heh.

Once I get this sorted out, the rest shouldn't be that bad. It will still probably be a few months minimum though.

Bit
Nov 05 2015
On 2015-11-05 16:51, bitwise wrote:
> Once I get this sorted out, the rest shouldn't be that bad. It will still probably be a few months minimum though.

Then the TLS is left as well.

-- /Jacob Carlborg
Nov 05 2015
On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg wrote:
> On 2015-11-05 16:51, bitwise wrote:
>> Once I get this sorted out, the rest shouldn't be that bad. It will still probably be a few months minimum though.
>
> Then the TLS is left as well.

Hi - I would like to help. I don't have the knowledge or skills (yet) to be of much use. However, I'm certainly interested in starting a public project somewhere and encouraging as many people as possible who do have the skills and knowledge to help out. I have absolutely no idea where to start.

If you are remotely interested, could you reply here? If people with skills and knowledge were open to jumping on a regular Skype call to discuss how to get this moving forward, I could possibly provide some kind of compensation as motivation.

Please let me know :)
Nov 05 2015
On 2015-11-06 08:02, Kingsley wrote:
> Hi - I would like to help. I don't have the knowledge or skills (yet) to be of much use. [...] I have absolutely no idea where to start. [...]

It's pretty easy in theory. OS X 10.7 got support for native TLS. DMD works on 10.6 as well, and because of that it uses its own custom implementation of TLS. Modify DMD to generate the same code as Clang would when accessing TLS.

-- /Jacob Carlborg
Nov 06 2015
On Friday, 6 November 2015 at 07:02:35 UTC, Kingsley wrote:
> On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg wrote:
>> [...]
>> Then the TLS is left as well.
>
> Hi - I would like to help. [...] I have absolutely no idea where to start. [...]

The issue is very complex, and I wouldn't know where to start explaining, but these two DConf talks touch on the issue:

https://www.youtube.com/watch?v=i63VeudjZM4
https://www.youtube.com/watch?v=WzXe2kT9sEo

Bit
Nov 06 2015
On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg wrote:
> On 2015-11-05 16:51, bitwise wrote:
>> Once I get this sorted out, the rest shouldn't be that bad. It will still probably be a few months minimum though.
>
> Then the TLS is left as well.

The existing emulated TLS solution can be modified to work with shared libraries pretty easily. At present, I have no intention of trying to implement native TLS.

Bit
Nov 06 2015
On 2015-11-06 18:15, bitwise wrote:
> The existing emulated TLS solution can be modified to work with shared libraries pretty easily. At present, I have no intention of trying to implement native TLS.

I don't see how it can be modified "pretty easily". You don't need native TLS, but as far as I can see you basically need to do, in the runtime, what the dynamic linker is already doing.

-- /Jacob Carlborg
Nov 06 2015
On Friday, 6 November 2015 at 17:56:06 UTC, Jacob Carlborg wrote:
> On 2015-11-06 18:15, bitwise wrote:
>> The existing emulated TLS solution can be modified to work with shared libraries pretty easily. At present, I have no intention of trying to implement native TLS.
>
> I don't see how it can be modified "pretty easily". You don't need native TLS, but as far as I can see you basically need to do, in the runtime, what the dynamic linker is already doing.

Currently, the compiler just calls ___tls_get_addr(void *p) to get the thread local copy of a global. If that function signature is altered to take a pointer to the image as well, the problem is solved.

Bit
Nov 06 2015
On 2015-11-06 19:46, bitwise wrote:
> Currently, the compiler just calls ___tls_get_addr(void *p) to get the thread local copy of a global. If that function signature is altered to take a pointer to the image as well, the problem is solved.

Hehe, you make it sound so easy. Perhaps I missed something and you know more than I do. But as far as I know you have two options:

1. Implement native TLS. This will require modifications to the compiler and minor tweaks in the runtime.
2. Continue to use the custom TLS implementation but add support for dynamic libraries. This will require modifications to the compiler (as you said above) and major changes to the runtime.

The native TLS implementation works as you described above (roughly). I can hardly believe that the code Apple added to the dynamic linker to implement TLS is not necessary. I don't see how you can get around implementing the same code as the dynamic linker does.

I also think that this is a good opportunity to change to native TLS. I don't like the situation we have now: "Yeah, D is compatible with C, except TLS on OS X."

-- /Jacob Carlborg
Nov 07 2015
On Saturday, 7 November 2015 at 08:37:40 UTC, Jacob Carlborg wrote:
> On 2015-11-06 19:46, bitwise wrote:
>> Currently, the compiler just calls ___tls_get_addr(void *p) to get the thread local copy of a global. If that function signature is altered to take a pointer to the image as well, the problem is solved.
>
> Hehe, you make it sound so easy. Perhaps I missed something and you know more than I do. But as far as I know you have two options: [...]
>
> I also think that this is a good opportunity to change to native TLS. I don't like this situation we have now: "Yeah, D is compatible with C, except TLS on OS X.".

Well, I'm speaking in relative terms when I say easy... ;)

Right now, TLS has a fairly simple implementation. DMD puts any global TLS vars into their own section in the binary. Then, at the point where those vars are accessed in code, DMD inserts a call to ___tls_get_addr(void*) to map the address of the var to some thread specific block of memory.

When ___tls_get_addr() is called, it lazily instantiates a block of memory for the calling thread, memcpy's the TLS vars from the TLS section in the binary, and stores that thread local copy using pthread_setspecific(). Any subsequent calls to ___tls_get_addr() will simply use pthread_getspecific() to retrieve that block of memory, and map the received address to one pointing into that block.

So, since binaries will not be mapped to overlapping address spaces, I can loop over all the binary images and find the range to which the argument of ___tls_get_addr() belongs, and map the pointer to the appropriate block of memory. I am concerned that looping over all binary images for each TLS access will have performance implications, but for now, this solution is good enough. Later, ___tls_get_addr() can be amended to pass a pointer to the image from which the TLS originated, allowing constant time lookup. I believe Martin has already done this for linux/fbsd, but I haven't had time to look at this specific issue.

So... I've got a basic implementation working at this point. The global ctors are now used instead of that infernal dyld callback to initialize sections. I've tried loading a shared library dynamically, and everything seems to work. Next on the list is to work on how all this interacts with threads. Martin seems to have already solved this too, so it should be fairly straightforward. Currently, linking a dylib statically throws "thread.d(2916): Unable to suspend thread", but otherwise it seems to work as expected.

Anyways, I am open to any help on the TLS stuff if you've got time.

Bit
Nov 08 2015
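For readers following along, the emulated scheme bitwise describes above (lazy per-thread copy, plus a scan over all images to find which TLS section an address belongs to) could be sketched roughly as follows. This is only an illustration under stated assumptions, not druntime's actual code: `ImageTLS`, `images`, `registerImage` and `emulatedTlsGetAddr` are hypothetical names, and the only real APIs used are `calloc`, `memcpy` and the POSIX `pthread_key_create` / `pthread_getspecific` / `pthread_setspecific` calls.

```d
import core.stdc.stdlib : calloc;
import core.stdc.string : memcpy;
import core.sys.posix.pthread;

/// Hypothetical record for one loaded image (not a druntime type).
struct ImageTLS
{
    const(ubyte)[] section; // the image's TLS section as mapped from the binary
    pthread_key_t key;      // one pthread key per image
}

/// Hypothetical registry, filled as images are loaded.
__gshared ImageTLS[] images;

/// Records an image's TLS section; returns false if no key could be created.
bool registerImage(const(ubyte)[] section)
{
    ImageTLS img;
    img.section = section;
    if (pthread_key_create(&img.key, null) != 0)
        return false;
    images ~= img;
    return true;
}

/// Maps an address inside some image's TLS section to the calling
/// thread's private copy, creating that copy on first use.
void* emulatedTlsGetAddr(const(void)* p)
{
    immutable addr = cast(size_t) p;
    foreach (ref img; images) // linear scan over all images on every access
    {
        immutable start = cast(size_t) img.section.ptr;
        if (addr < start || addr >= start + img.section.length)
            continue;

        auto block = cast(ubyte*) pthread_getspecific(img.key);
        if (block is null) // first access from this thread
        {
            block = cast(ubyte*) calloc(1, img.section.length);
            if (block is null)
                return null;
            memcpy(block, img.section.ptr, img.section.length);
            pthread_setspecific(img.key, block);
        }
        return block + (addr - start);
    }
    return null; // address is not in any registered TLS section
}

unittest
{
    static immutable ubyte[4] section = [1, 2, 3, 4];
    assert(registerImage(section[]));
    auto p = cast(ubyte*) emulatedTlsGetAddr(&section[2]);
    assert(*p == 3);         // initial value copied from the section
    *p = 9;                  // writes go to the thread's copy only
    assert(section[2] == 3); // the section itself is untouched
}
```

The `foreach` is the per-access cost bitwise mentions; passing the image explicitly would remove it. Note that this sketch never frees the block, which is deliberate here and kept short for clarity.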
On Sunday, 8 November 2015 at 18:12:04 UTC, bitwise wrote:
> On Saturday, 7 November 2015 at 08:37:40 UTC, Jacob Carlborg wrote:
>> [...]
>
> Well, I'm speaking in relative terms when I say easy... ;)
> [...]

Hi Bit,

I'm very excited by your posts with your insights and progress into this issue. I'm afraid I am not able to help much (lacking in skills, not enthusiasm). But please keep going :) and keep us updated. If there is anything I can do to help, please don't hesitate to ask :)

Thanks for the links you posted. I have started watching Martin's presentation with interest.

--K
Nov 08 2015
On 2015-11-08 19:12, bitwise wrote:
> So, since binaries will not be mapped to overlapping address spaces, I can loop over all the binary images and find the range to which the argument of ___tls_get_addr() belongs, and map the pointer to the appropriate block of memory. I am concerned that looping over all binary images for each TLS access will have performance implications, but for now, this solution is good enough. Later, ___tls_get_addr() can be amended to pass a pointer to the image from which the TLS originated, allowing constant time lookup. [...]

Not sure if this would be too much work for the first version, but would it be possible to register the memory range of each loaded image in an associative array, where the key is the range and the value is the image? Hmm, when I think about it, it might not help at all.

-- /Jacob Carlborg
Nov 09 2015
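A note on the range-table idea above: a plain associative array keyed by range cannot answer "which range contains this address" directly, so the sketch below swaps in a sorted array with binary search. That is my substitution, not something proposed in the thread, and `ImageRange`, `insertRange` and `findImage` are hypothetical names. It reduces the scan from linear to logarithmic but still does a lookup on every access, which may be why the idea "might not help" much.

```d
/// Hypothetical entry: the address range of one image's TLS section.
struct ImageRange
{
    size_t start; // first address in the range
    size_t end;   // one past the last address
    size_t image; // index identifying the image
}

/// Inserts `r` so that `table` stays sorted by start address.
/// Ranges are assumed not to overlap, as bitwise notes for loaded binaries.
void insertRange(ref ImageRange[] table, ImageRange r)
{
    size_t i = 0;
    while (i < table.length && table[i].start < r.start)
        ++i;
    table = table[0 .. i] ~ r ~ table[i .. $];
}

/// Binary search for the range containing `addr`.
/// Returns the image index, or -1 if no range contains the address.
ptrdiff_t findImage(const ImageRange[] table, size_t addr)
{
    size_t lo = 0, hi = table.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (addr < table[mid].start)
            hi = mid;
        else if (addr >= table[mid].end)
            lo = mid + 1;
        else
            return cast(ptrdiff_t) table[mid].image;
    }
    return -1;
}

unittest
{
    ImageRange[] table;
    insertRange(table, ImageRange(0x2000, 0x2010, 1));
    insertRange(table, ImageRange(0x1000, 0x1040, 0));
    assert(findImage(table, 0x1008) == 0);
    assert(findImage(table, 0x2010) == -1); // end is exclusive
}
```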
On Monday, 9 November 2015 at 15:29:25 UTC, Jacob Carlborg wrote:
> On 2015-11-08 19:12, bitwise wrote:
>> So, since binaries will not be mapped to overlapping address spaces, I can loop over all the binary images and find the range to which the argument of ___tls_get_addr() belongs [...]
>
> Not sure if this would be too much work for the first version. But would it be possible to, for each loaded image, register its memory range in an associative array. Where the key is the range the value is the image? Hmm, when I think about, it might not help at all.

The AA is not needed. The offset of the TLS var is known at compile time.

If you look at sections_elf_shared.d you can see the signature of __tls_get_addr, and that it takes a pointer to the struct tls_index or something. *If* I understand correctly, one of the two vars in that struct is the index of the image, and the other is the offset into the image's TLS section. I'm not sure where/how that struct is emitted, though. So you would have to figure out how to get the backend to do the same thing for OSX. I think the image index may have to be assigned at load time, but I'm not sure.

The amount of code to actually do it should be trivial; it's reading/interpreting the backend that will be the problem ;)

Bit
Nov 09 2015
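To make the index-plus-offset idea above concrete, here is a minimal sketch of a constant-time lookup. The two-field `tls_index` layout follows bitwise's description (field names borrowed from the ELF convention); everything else (`tlsImages`, `threadBlocks`, `registerTlsImage`, `tlsGetAddr`) is hypothetical and is not how druntime or DMD's backend actually does it. For brevity the per-thread table is an ordinary D thread-local variable, which a real emulation could not rely on; it would use something like `pthread_getspecific` instead.

```d
import core.stdc.stdlib : calloc;
import core.stdc.string : memcpy;

/// Two-field descriptor as described in the post above.
struct tls_index
{
    size_t ti_module; // index of the image, assumed assigned at load time
    size_t ti_offset; // offset of the variable in that image's TLS section
}

/// Hypothetical: TLS initializer data of each loaded image, by index.
__gshared const(ubyte)[][] tlsImages;

/// Per-thread table of blocks, one slot per image.
/// (Module-level variables in D are thread-local by default; used here
/// only to keep the sketch short.)
ubyte*[] threadBlocks;

/// Registers an image and returns the index assigned to it.
size_t registerTlsImage(const(ubyte)[] initData)
{
    tlsImages ~= initData;
    return tlsImages.length - 1;
}

/// Constant-time lookup: no scan over images is needed because the
/// caller already says which image the variable lives in.
void* tlsGetAddr(const tls_index* ti)
{
    if (ti.ti_module >= tlsImages.length)
        return null;
    if (threadBlocks.length < tlsImages.length)
        threadBlocks.length = tlsImages.length; // new slots start as null

    auto block = threadBlocks[ti.ti_module];
    if (block is null) // first access to this image from this thread
    {
        auto data = tlsImages[ti.ti_module];
        block = cast(ubyte*) calloc(1, data.length);
        if (block is null)
            return null;
        memcpy(block, data.ptr, data.length);
        threadBlocks[ti.ti_module] = block;
    }
    return block + ti.ti_offset;
}

unittest
{
    static immutable ubyte[2] data = [7, 8];
    auto ti = tls_index(registerTlsImage(data[]), 1);
    assert(*cast(ubyte*) tlsGetAddr(&ti) == 8);
}
```

The hard part, as the post says, is not this runtime side but getting the backend to emit the descriptor and pass it at each access.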
On 2015-11-09 18:30, bitwise wrote:
> The AA is not needed. The offset of the TLS var is known at compile time.

I was thinking of it as an alternative to iterating over all loaded images. Something that could be done without modifying the compiler.

> If you look at sections_elf_shared.d you can see the signature of __tls_get_addr, and that it takes a pointer to the struct tls_index or something. *if* I understand correctly, one of the two vars in that struct is the index of the image, and the other is the offset into the imag's tls section. Not sure where/hoe that struct is outputted though. So you would have to figure out how to get the backend to do the same thing for OSX. I think the image index may have to be assigned at load time, but I'm not sure.

If we're going to modify the backend it's better to match the native implementation.

I looked a bit at the implementation. For each TLS variable it outputs two symbols (at least if the variable is initialized): one with the same name as the variable, and one with the variable name plus a suffix, "$tlv$init". The symbol with the suffix contains the actual value with which the variable is initialized in the source code. The other symbol is a struct looking something like this:

    struct TLVDescriptor
    {
        void* function (TLVDescriptor*) thunk;
        size_t key;
        size_t offset;
    }

The dynamic loader will, when an image is loaded, set "thunk" to a function implemented in the dynamic loader. "key" is set to a key created by "pthread_key_create". It then maps the key to the currently loading image.

I think the compiler accesses the variable as if it were a global variable of type "TLVDescriptor", then calls the thunk, passing in the address of the variable itself. So the following code:

    int a = 3;

    void foo()
    {
        auto b = a;
    }

Would be lowered to something like:

    TLVDescriptor _a;
    int _a$tlv$init = 3;

    void foo()
    {
        // The thunk returns the address of the calling thread's copy of "a".
        int b = *cast(int*) _a.thunk(&_a);
    }

When the compiler stores the symbol in the image it would only need to set the offset, since the dynamic loader sets the other two fields. I'm not sure how the "_a$tlv$init" symbol is used, though: whether the dynamic loader handles that completely or the compiler needs to do something with it.

The enhancement request for implementing native TLS contains some information [1].

> The amount of code to actually do it should be trivial, it's reading/interpreting the backend that will be the problem ;)

Yeah, I agree :)

[1] https://issues.dlang.org/show_bug.cgi?id=9476#c2

-- /Jacob Carlborg
Nov 09 2015
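The descriptor-and-thunk access pattern Jacob outlines can be tried out in user code. The sketch below is a simulation only: `demoThunk` and `bindDescriptor` are hypothetical stand-ins for what the dynamic loader would do, `aInit` stands in for the "$tlv$init" symbol, and nothing here reflects dyld's real implementation beyond the three-field layout given in the post.

```d
import core.stdc.stdlib : calloc;
import core.stdc.string : memcpy;
import core.sys.posix.pthread;

/// Layout as given in the post above.
struct TLVDescriptor
{
    void* function(TLVDescriptor*) thunk;
    size_t key;
    size_t offset;
}

/// Stand-in for the "$tlv$init" symbol: the initial value of `a`.
__gshared int aInit = 3;

/// Stand-in for the loader's thunk: returns the address of the calling
/// thread's copy of the variable, creating the block on first use.
void* demoThunk(TLVDescriptor* d)
{
    auto key = cast(pthread_key_t) d.key;
    auto block = cast(ubyte*) pthread_getspecific(key);
    if (block is null)
    {
        block = cast(ubyte*) calloc(1, int.sizeof);
        if (block is null)
            return null;
        memcpy(block, &aInit, int.sizeof); // copy the initializer
        pthread_setspecific(key, block);
    }
    return block + d.offset;
}

/// Stand-in for load time: fill in the two fields the loader owns.
/// The compiler is assumed to have set `offset` already.
bool bindDescriptor(ref TLVDescriptor d)
{
    pthread_key_t key;
    if (pthread_key_create(&key, null) != 0)
        return false;
    d.thunk = &demoThunk;
    d.key = cast(size_t) key;
    return true;
}

/// What `auto b = a;` might lower to: call the thunk with the address of
/// the descriptor, then load through the returned pointer.
int readA(ref TLVDescriptor a)
{
    return *cast(int*) a.thunk(&a);
}

unittest
{
    TLVDescriptor a; // offset 0: `a` is the only variable in the block
    assert(bindDescriptor(a));
    assert(readA(a) == 3);
}
```

One thing the simulation makes visible: the thunk must return a pointer to the thread's storage, so the call site dereferences the result rather than casting it to the value type.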
On Monday, 9 November 2015 at 21:02:35 UTC, Jacob Carlborg wrote:
> On 2015-11-09 18:30, bitwise wrote:
>> [...]
>
> If we're going to modify the backend it's better to match the native implementation.

Why?

> I looked a bit at the implementation. For each TLS variable it outputs two symbols (at least if the variable is initialized). [...] The dynamic loader will, when an image is loaded, set "thunk" to a function implemented in the dynamic loader. "key" is set to a key created by "pthread_key_create". It then maps the key to the currently loading image. [...]

Our current approach is already very similar - the one for linux/bsd even more so than OSX. The data layout and exact specifics differ slightly, but the approach you're describing sounds basically the same as what we're already doing. We allocate the TLS block and pthread key for an entire image in one shot, instead of one var at a time, which is a difference, if I understand correctly... but aside from that, I think the effect is the same.

On a slightly different note, I'm looking at our implementation right now... and a couple of things seem wrong with it. First of all, it allocates the TLS block for each thread that accesses a TLS var:

https://github.com/D-Programming-Language/druntime/blob/fb127f747edb211b06b35a5a5e548f03e9b750e3/src/rt/sections_osx.d#L156

But where does it ever free it!? Does this mean it causes leaks when you create threads and access TLS vars from them? It seems so.

Also, the memory is allocated using calloc, and the block is never added to the GC... doesn't this mean that the GC won't scan there, and could potentially free objects that are stored there?

Bit
Nov 10 2015
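The two concerns raised above (the per-thread block is never freed, and a calloc'd block is invisible to the GC) have well-known generic remedies, sketched below. This is not a patch for sections_osx.d and does not claim to describe how druntime handles scanning; it only shows the general technique, using the real `pthread_key_create` destructor parameter and `GC.addRange` / `GC.removeRange` from `core.memory`. The names `tlsKey`, `blockSize`, `initTlsKey`, `freeTlsBlock` and `getThreadBlock` are hypothetical.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;
import core.sys.posix.pthread;

enum blockSize = 64; // hypothetical size of one thread's TLS block

__gshared pthread_key_t tlsKey;

/// Destructor that pthreads runs at thread exit for a non-null value:
/// stop the GC from scanning the block, then release it.
extern (C) void freeTlsBlock(void* block) nothrow
{
    GC.removeRange(block);
    free(block);
}

/// Creates the key once, registering the destructor with it.
bool initTlsKey()
{
    return pthread_key_create(&tlsKey, &freeTlsBlock) == 0;
}

/// Returns the calling thread's block, allocating it on first use and
/// telling the GC to scan it for pointers.
void* getThreadBlock()
{
    auto block = pthread_getspecific(tlsKey);
    if (block is null)
    {
        block = calloc(1, blockSize);
        if (block is null)
            return null;
        GC.addRange(block, blockSize); // references stored here keep objects alive
        pthread_setspecific(tlsKey, block);
    }
    return block;
}

unittest
{
    assert(initTlsKey());
    auto p = getThreadBlock();
    assert(p !is null);
    assert(getThreadBlock() is p); // same block on later calls
}
```

Whether calling into the GC from a key destructor is safe at every point of thread teardown depends on the runtime, so treat that part as an assumption to verify.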
On 2015-11-10 18:55, bitwise wrote:
> Why?

Better compatibility, better performance. Why not?

-- /Jacob Carlborg
Nov 10 2015
On Tuesday, 10 November 2015 at 18:57:52 UTC, Jacob Carlborg wrote:
> On 2015-11-10 18:55, bitwise wrote:
>> Why?
>
> Better compatibility, better performance.

How so?

> Why not?

If it ain't broke... don't fix it :)

Bit
Nov 10 2015
On 2015-11-10 20:22, bitwise wrote:
> If it ain't broke...don't fix it :)

But it is, that's why we have this conversation ;)

-- /Jacob Carlborg
Nov 11 2015
On Tuesday, 10 November 2015 at 17:55:58 UTC, bitwise wrote:
> Also, the memory is allocated using calloc, and the block is never added to the GC..doesn't this mean that the GC won't scan there, and could potentially free objects that are stored there?

It's been quite some time since I looked at the details of DMD's TLS emulation (LDC does not need it), but for scanning the TLS area, you want to have a look at initTLSRanges().

-- David
Nov 10 2015
On Thursday, 5 November 2015 at 07:28:24 UTC, Jacob Carlborg wrote:
> [...]

Also, someone seems to have hard coded dmd to output _all_ functions as COMDAT on OSX... which may explain the weird symbol merging problems I was talking about before. Not sure why this was done.

Bit
Nov 05 2015