digitalmars.D - OS X libphobos2.so

reply Kingsley <kingsley.hendrickse gmail.com> writes:
Hi

Anyone know when a version of libphobos2.so will be available on 
OS X?

I understand there are issues preventing us having one.

-k
Nov 04 2015
parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-05 01:18, Kingsley wrote:
 Hi

 Anyone know when a version of libphobos2.so will be available on OS X?
No.
 I understand there are issues preventing us having one.
Nothing will happen unless someone fixes those issues. -- /Jacob Carlborg
Nov 04 2015
next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 5 November 2015 at 07:28:24 UTC, Jacob Carlborg 
wrote:
 On 2015-11-05 01:18, Kingsley wrote:
 Hi

 Anyone know when a version of libphobos2.so will be available 
 on OS X?
No.
 I understand there are issues preventing us having one.
Nothing will happen unless someone fixes those issues.
I actually made some progress on this. I managed to get dmd to generate/insert init/term funcs into each module with minimal alterations in the front end.

Currently, the init one runs, but the term function causes a segfault. I checked my binary with a Mach-O viewer, and it shows that the pointer I've put into the __mod_term_funcs section somehow points at writeline instead... heh.

Once I get this sorted out, the rest shouldn't be that bad. It will still probably be a few months minimum though.

     Bit
Nov 05 2015
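For readers unfamiliar with per-module init/term functions, here is a minimal sketch in C of the mechanism the post above relies on. It uses the GCC/Clang `constructor` and `destructor` attributes, which ask the toolchain to register a function in the image's initializer or terminator list. The names `module_init`, `module_term`, and `module_is_initialized` are hypothetical, and this is not the code dmd generates.

```c
/* Hypothetical per-module state, standing in for a module's bookkeeping. */
static int module_initialized = 0;

/* Registered in the image's initializer list by the toolchain;
   runs when the image is loaded, before main(). */
__attribute__((constructor))
static void module_init(void)
{
    module_initialized = 1;
}

/* Counterpart that runs when the image is unloaded or the process exits. */
__attribute__((destructor))
static void module_term(void)
{
    module_initialized = 0;
}

/* Lets callers observe whether the initializer has already run. */
int module_is_initialized(void)
{
    return module_initialized;
}
```

By the time any ordinary code in the program runs, `module_is_initialized()` should already return 1, because the initializer ran at load time.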
parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-05 16:51, bitwise wrote:

 Once I get this sorted out, the rest shouldn't be that bad. It will
 still probably be a few months minimum though.
Then the TLS is left as well. -- /Jacob Carlborg
Nov 05 2015
next sibling parent reply Kingsley <kingsley.hendrickse gmail.com> writes:
On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg 
wrote:
 On 2015-11-05 16:51, bitwise wrote:

 Once I get this sorted out, the rest shouldn't be that bad. It 
 will
 still probably be a few months minimum though.
Then the TLS is left as well.
Hi - I would like to help. I don't have the knowledge or skills (yet) to be of much use. However I'm certainly interested in starting a public project somewhere and encouraging as many people who do have the skills and knowledge to help out.

I have absolutely no idea where to start. However if you are remotely interested could you reply here.

If people with skills and knowledge were open to jumping on a regular skype call to discuss how to get this moving forward I could possibly provide some kind of compensation as motivation.

Please let me know :)
Nov 05 2015
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2015-11-06 08:02, Kingsley wrote:

 Hi - I would like to help. I don't have the knowledge or skills (yet) to
 be of much use. However I'm certainly interested in starting a public
 project somewhere and encouraging as many people who do have the skills
 and knowledge to help out.

 I have absolutely no idea where to start. However if you are remotely
 interested could you reply here.

 If people with skills and knowledge were open to jumping on a regular
 skype call to discuss how to get this moving forward I could possibly
 provide some kind of compensation as motivation.

 Please let me know :)
It's pretty easy in theory. OS X 10.7 got support for native TLS. DMD also supports 10.6, which is why it uses its own custom implementation of TLS. The fix is to modify DMD to generate the same code as Clang does when accessing TLS.

-- /Jacob Carlborg
Nov 06 2015
prev sibling parent bitwise <bitwise.pvt gmail.com> writes:
On Friday, 6 November 2015 at 07:02:35 UTC, Kingsley wrote:
 On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg 
 wrote:
 On 2015-11-05 16:51, bitwise wrote:

 Once I get this sorted out, the rest shouldn't be that bad. 
 It will
 still probably be a few months minimum though.
Then the TLS is left as well.
 Hi - I would like to help. I don't have the knowledge or skills (yet) to be of much use. However I'm certainly interested in starting a public project somewhere and encouraging as many people who do have the skills and knowledge to help out.

 I have absolutely no idea where to start. However if you are remotely interested could you reply here.

 If people with skills and knowledge were open to jumping on a regular skype call to discuss how to get this moving forward I could possibly provide some kind of compensation as motivation.

 Please let me know :)
The issue is very complex, and I wouldn't know where to start explaining, but these two dconf talks touch on the issue:

https://www.youtube.com/watch?v=i63VeudjZM4
https://www.youtube.com/watch?v=WzXe2kT9sEo

     Bit
Nov 06 2015
prev sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 5 November 2015 at 21:09:41 UTC, Jacob Carlborg 
wrote:
 On 2015-11-05 16:51, bitwise wrote:

 Once I get this sorted out, the rest shouldn't be that bad. It 
 will
 still probably be a few months minimum though.
Then the TLS is left as well.
The existing emulated TLS solution can be modified to work with shared libraries pretty easily. At present, I have no intention of trying to implement native TLS. Bit
Nov 06 2015
parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-06 18:15, bitwise wrote:

 The existing emulated TLS solution can be modified to work with shared
 libraries pretty easily. At present, I have no intention of trying to
 implement native TLS.
I don't see how it can be modified "pretty easily". You don't need native TLS, but as far as I can see you basically need to do, in the runtime, what the dynamic linker is already doing. -- /Jacob Carlborg
Nov 06 2015
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Friday, 6 November 2015 at 17:56:06 UTC, Jacob Carlborg wrote:
 On 2015-11-06 18:15, bitwise wrote:

 The existing emulated TLS solution can be modified to work 
 with shared
 libraries pretty easily. At present, I have no intention of 
 trying to
 implement native TLS.
I don't see how it can be modified "pretty easily". You don't need native TLS, but as far as I can see you basically need to do, in the runtime, what the dynamic linker is already doing.
Currently, the compiler just calls ___tls_get_addr(void *p) to get the thread local copy of a global. If that function signature is altered to take a pointer to the image as well, the problem is solved. Bit
Nov 06 2015
parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-06 19:46, bitwise wrote:

 Currently, the compiler just calls ___tls_get_addr(void *p) to get the
 thread local copy of a global. If that function signature is altered to
 take a pointer to the image as well, the problem is solved.
Hehe, you make it sound so easy. Perhaps I missed something and you know more than I do. But as far as I know you have two options:

1. Implement native TLS. This will require modifications to the compiler and minor tweaks in the runtime

2. Continue to use the custom TLS implementation but add support for dynamic libraries. This will require modifications to the compiler (as you said above) and major changes to the runtime

The native TLS implementation works as you described above (roughly). I can hardly believe that the code Apple added to the dynamic linker to implement TLS is not necessary. I don't see how you can avoid implementing the same code as the dynamic linker does.

I also think that this is a good opportunity to change to native TLS. I don't like this situation we have now: "Yeah, D is compatible with C, except TLS on OS X.".

-- /Jacob Carlborg
Nov 07 2015
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Saturday, 7 November 2015 at 08:37:40 UTC, Jacob Carlborg 
wrote:
 On 2015-11-06 19:46, bitwise wrote:

 Currently, the compiler just calls ___tls_get_addr(void *p) to 
 get the
 thread local copy of a global. If that function signature is 
 altered to
 take a pointer to the image as well, the problem is solved.
 Hehe, you make it sound so easy. Perhaps I missed something and you know more than I do. But as far as I know you have two options:

 1. Implement native TLS. This will require modifications to the compiler and minor tweaks in the runtime

 2. Continue to use the custom TLS implementation but add support for dynamic libraries. This will require modifications to the compiler (as you said above) and major changes to the runtime

 The native TLS implementation works as you described above (roughly). I can hardly believe that the code Apple added to the dynamic linker to implement TLS is not necessary. I don't see how you can avoid implementing the same code as the dynamic linker does.

 I also think that this is a good opportunity to change to native TLS. I don't like this situation we have now: "Yeah, D is compatible with C, except TLS on OS X.".
Well, I'm speaking in relative terms when I say easy... ;)

Right now, TLS has a fairly simple implementation. DMD puts any global TLS vars into their own section in the binary. Then, at the point where those vars are accessed in code, DMD inserts a call to ___tls_get_addr(void*) to map the address of the var to some thread specific block of memory.

When ___tls_get_addr() is called, it lazily instantiates a block of memory for the calling thread, memcpy's the TLS vars from the TLS section in the binary, and stores that thread local copy using pthread_setspecific(). Any subsequent calls to ___tls_get_addr() will simply use pthread_getspecific() to retrieve that block of memory, and map the received address to one pointing in that block.

So, since binaries will not be mapped to overlapping address spaces, I can loop over all the binary images and find the range to which the argument of ___tls_get_addr() belongs, and map the pointer to the appropriate block of memory.

I am concerned that looping over all binary images for each TLS access will have performance implications, but for now, this solution is good enough. Later, ___tls_get_addr() can be amended to pass a pointer to the image from which the TLS originated, allowing constant time lookup. I believe Martin has already done this for linux/fbsd, but I haven't had time to look at this specific issue.

So.. I've got a basic implementation working at this point. The global ctors are now used instead of that infernal dyld callback to initialize sections. I've tried loading a shared library dynamically, and everything seems to work.

Next on the list is to work on how all this interacts with threads. Martin seems to have already solved this too, so it should be fairly straightforward. Currently, linking a dylib statically throws "thread.d(2916): Unable to suspend thread", but otherwise it seems to work as expected.

Anyways, I am open to any help on the TLS stuff if you've got time.

     Bit
Nov 08 2015
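The emulated TLS scheme described in the post above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not druntime's code: `tls_image` stands in for the binary's TLS section, `emulated_tls_get_addr` for ___tls_get_addr, and registering `free` as the key destructor is an addition of this sketch. Only a single image is handled.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the image's TLS section: initial values of all
   thread-local variables, laid out back to back (hypothetical). */
static int tls_image[2] = { 3, 7 };

static pthread_key_t tls_key;
static pthread_once_t tls_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() as destructor releases a thread's block when it exits. */
    pthread_key_create(&tls_key, free);
}

/* Map the address of a variable inside tls_image to the calling
   thread's private copy of that variable. */
void *emulated_tls_get_addr(void *p)
{
    pthread_once(&tls_once, make_key);

    char *block = pthread_getspecific(tls_key);
    if (block == NULL) {
        /* First access on this thread: allocate the block and copy
           the initial values from the image. */
        block = malloc(sizeof tls_image);
        if (block == NULL)
            abort();
        memcpy(block, tls_image, sizeof tls_image);
        pthread_setspecific(tls_key, block);
    }
    /* Same offset as in the image, different base address. */
    return block + ((char *)p - (char *)tls_image);
}

static void *worker(void *unused)
{
    (void)unused;
    int *x = emulated_tls_get_addr(&tls_image[0]);
    *x = 42;                 /* changes only the worker's copy */
    return NULL;
}

/* Writes from another thread, then reads from the calling thread.
   Returns the caller's value, or -1 if the thread could not start. */
int read_after_other_thread_write(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    int *x = emulated_tls_get_addr(&tls_image[0]);
    return *x;
}
```

The calling thread still sees the initial value after the worker has written 42, because each thread gets its own lazily created copy of the image.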
next sibling parent Kingsley <kingsley.hendrickse gmail.com> writes:
On Sunday, 8 November 2015 at 18:12:04 UTC, bitwise wrote:
 On Saturday, 7 November 2015 at 08:37:40 UTC, Jacob Carlborg 
 wrote:
 [...]
Well, I'm speaking in relative terms when I say easy... ;) [...]
Hi Bit,

I'm very excited by your posts with your insights and progress into this issue. I'm afraid I am not able to help much (lacking in skills, not enthusiasm). But please keep going :) and keep us updated - if there is anything I can do to help, please don't hesitate to ask :)

Thanks for the links you posted - I have started watching Martin's presentation with interest.

--K
Nov 08 2015
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-08 19:12, bitwise wrote:

 So, since binaries will not be mapped to overlapping address spaces, I
 can loop over all the binary images and find the range to which the
 argument of ___tls_get_addr() belongs, and map the pointer to the
 appropriate block of memory.

 I am concerned that looping over all binary images for each TLS access
 will have performance implications, but for now, this solution is good
 enough. Later, ___tls_get_addr() can be amended to pass a pointer to the
 image from which the TLS originated, allowing constant time lookup. I
 believe Martin has already done this for linux/fbsd, but I haven't had time to
 look at this specific issue.
Not sure if this would be too much work for the first version, but would it be possible to register, for each loaded image, its memory range in an associative array, where the key is the range and the value is the image?

Hmm, when I think about it, it might not help at all.

-- /Jacob Carlborg
Nov 09 2015
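One possible shape for the range registry floated above, sketched in C with a sorted table and binary search instead of an associative array. Everything here is hypothetical (`TlsRange`, `register_range`, `find_image`, the fixed capacity), and the sketch ignores locking, which a real runtime would need when images load concurrently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one loaded image's TLS section. */
typedef struct {
    uintptr_t start;   /* first byte of the section */
    uintptr_t end;     /* one past the last byte */
    int image_id;      /* stand-in for a pointer to per-image data */
} TlsRange;

#define MAX_IMAGES 64
static TlsRange ranges[MAX_IMAGES];
static size_t range_count = 0;

/* Insert a range, keeping the table sorted by start address.
   Returns 0 on success, -1 if the table is full. */
int register_range(uintptr_t start, uintptr_t end, int image_id)
{
    if (range_count == MAX_IMAGES)
        return -1;
    size_t i = range_count;
    while (i > 0 && ranges[i - 1].start > start) {
        ranges[i] = ranges[i - 1];   /* shift larger entries up */
        i--;
    }
    ranges[i] = (TlsRange){ start, end, image_id };
    range_count++;
    return 0;
}

/* Binary search for the range containing addr.
   Returns the image id, or -1 if no registered range contains it. */
int find_image(uintptr_t addr)
{
    size_t lo = 0, hi = range_count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < ranges[mid].start)
            hi = mid;
        else if (addr >= ranges[mid].end)
            lo = mid + 1;
        else
            return ranges[mid].image_id;
    }
    return -1;
}
```

This relies on the same observation made earlier in the thread, that images are not mapped to overlapping address ranges. It reduces the per-access cost from linear to logarithmic in the number of images, but it is still a lookup on every TLS access.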
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Monday, 9 November 2015 at 15:29:25 UTC, Jacob Carlborg wrote:
 On 2015-11-08 19:12, bitwise wrote:

 So, since binaries will not be mapped to overlapping address 
 spaces, I
 can loop over all the binary images and find the range to 
 which the
 argument of ___tls_get_addr() belongs, and map the pointer to 
 the
 appropriate block of memory.

 I am concerned that looping over all binary images for each 
 TLS access
 will have performance implications, but for now, this solution 
 is good
 enough. Later, ___tls_get_addr() can be amended to pass a 
 pointer to the
 image from which the TLS originated, allowing constant time 
 lookup. I
 believe Martin has already done this for linux/fbsd, but I haven't had 
 time to
 look at this specific issue.
 Not sure if this would be too much work for the first version, but would it be possible to register, for each loaded image, its memory range in an associative array, where the key is the range and the value is the image?

 Hmm, when I think about it, it might not help at all.
The AA is not needed. The offset of the TLS var is known at compile time.

If you look at sections_elf_shared.d you can see the signature of __tls_get_addr, and that it takes a pointer to the struct tls_index or something. *if* I understand correctly, one of the two vars in that struct is the index of the image, and the other is the offset into the image's TLS section. Not sure where/how that struct is output though.

So you would have to figure out how to get the backend to do the same thing for OSX. I think the image index may have to be assigned at load time, but I'm not sure. The amount of code to actually do it should be trivial, it's reading/interpreting the backend that will be the problem ;)

     Bit
Nov 09 2015
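A rough C sketch of the constant-time lookup described above, assuming a descriptor that carries an image index and an offset. The type `TlsIndex` and the function `tls_lookup` are hypothetical names for illustration; they are not the definitions found in sections_elf_shared.d.

```c
#include <stddef.h>

/* Hypothetical descriptor emitted once per TLS variable. */
typedef struct {
    size_t module;   /* index of the image, assigned at load time */
    size_t offset;   /* offset of the variable in that image's TLS block */
} TlsIndex;

/* blocks is the calling thread's table of TLS block base addresses,
   one slot per loaded image. Returns NULL for an unknown image or a
   block that has not been allocated yet. */
void *tls_lookup(char *const *blocks, size_t nblocks, const TlsIndex *ti)
{
    if (ti->module >= nblocks || blocks[ti->module] == NULL)
        return NULL;
    return blocks[ti->module] + ti->offset;
}

/* Small self-check: two images, variable at offset 4 of image 1.
   Returns 1 if the lookup lands on the expected byte. */
int tls_lookup_demo(void)
{
    static char image0[16], image1[16];
    char *blocks[2] = { image0, image1 };
    TlsIndex ti = { 1, 4 };
    return tls_lookup(blocks, 2, &ti) == (void *)(image1 + 4);
}
```

With the image index available in the descriptor, no search over loaded images is needed: the access is one table index plus one addition.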
parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-09 18:30, bitwise wrote:

 The AA is not needed. The offset of the TLS var is known at compile
 time.
I was thinking instead of iterating over all loaded images. Something that could be done without modifying the compiler.
 If you look at sections_elf_shared.d you can see the signature of
 __tls_get_addr, and that it takes a pointer to the struct tls_index or
 something. *if* I understand correctly, one of the two vars in that
 struct is the index of the image, and the other is the offset into the
 image's TLS section. Not sure where/how that struct is output though.
 So you would have to figure out how to get the backend to do the same
 thing for OSX. I think the image index may have to be assigned at load
 time, but I'm not sure.
If we're going to modify the backend it's better to match the native implementation.

I looked a bit at the implementation. For each TLS variable it outputs two symbols (at least if the variable is initialized). One with the same name as the variable, and one with the variable name plus a suffix, "$tlv$init". The symbol with the suffix contains the actual value with which the variable is initialized in the source code.

The other symbol is a struct looking something like this:

struct TLVDescriptor
{
    void* function (TLVDescriptor*) thunk;
    size_t key;
    size_t offset;
}

The dynamic loader will, when an image is loaded, set "thunk" to a function implemented in the dynamic loader. "key" is set to a key created by "pthread_key_create". It then maps the key to the currently loading image.

I think the compiler accesses the variable as if it were a global variable of type "TLVDescriptor". Then it calls the thunk, passing in the variable itself, and the thunk returns the address of the calling thread's copy.

So the following code:

int a = 3;

void foo() { auto b = a; }

Would be lowered to:

TLVDescriptor _a;
int _a$tlv$init = 3;

void foo()
{
    TLVDescriptor tmp = _a;
    int b = *cast(int*) tmp.thunk(&tmp);
}

When the compiler stores the symbol in the image it would only need to set the offset since the dynamic loader sets the other two fields.

Although I'm not sure how the "_a$tlv$init" symbol is used, whether the dynamic loader completely handles that or the compiler needs to do something with it.

The enhancement request for implementing native TLS contains some information [1].
 The amount of code to actually do it should be
 trivial, it's reading/interpreting the backend that will be the problem ;)
Yeah, I agree :)

[1] https://issues.dlang.org/show_bug.cgi?id=9476#c2

-- /Jacob Carlborg
Nov 09 2015
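To make the descriptor-and-thunk idea above concrete, here is a self-contained C simulation. It is only a sketch: `load_image` and `get_addr_thunk` play the role of the dynamic loader, `a_desc` plays the role of the symbol "_a", and `a_tlv_init` plays the role of "_a$tlv$init". The `key` field uses `pthread_key_t` here for simplicity, whereas the struct shown in the post uses `size_t`. None of this is Apple's implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct TLVDescriptor TLVDescriptor;
struct TLVDescriptor {
    void *(*thunk)(TLVDescriptor *);  /* set by the (simulated) loader */
    pthread_key_t key;                /* set by the (simulated) loader */
    size_t offset;                    /* set by the compiler */
};

/* Initial value of the variable: "int a = 3;" */
static const int a_tlv_init = 3;

/* Stand-in for the loader's thunk: find or create the calling
   thread's block, then return the variable's address inside it. */
static void *get_addr_thunk(TLVDescriptor *d)
{
    char *block = pthread_getspecific(d->key);
    if (block == NULL) {
        block = malloc(sizeof a_tlv_init);
        if (block == NULL)
            abort();
        memcpy(block, &a_tlv_init, sizeof a_tlv_init);
        pthread_setspecific(d->key, block);
    }
    return block + d->offset;
}

static TLVDescriptor a_desc;
static pthread_once_t load_once = PTHREAD_ONCE_INIT;

/* What the loader would do when the image is loaded. */
static void load_image(void)
{
    pthread_key_create(&a_desc.key, free);
    a_desc.thunk = get_addr_thunk;
    a_desc.offset = 0;   /* "a" is the only variable in the block */
}

/* Lowered form of: void foo() { auto b = a; } */
int read_a(void)
{
    pthread_once(&load_once, load_image);
    int b = *(int *)a_desc.thunk(&a_desc);
    return b;
}
```

The thunk returns a pointer to the thread's copy, so the lowered access dereferences the result rather than casting it to the value type.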
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Monday, 9 November 2015 at 21:02:35 UTC, Jacob Carlborg wrote:
 On 2015-11-09 18:30, bitwise wrote:

 The AA is not needed. The offset of the TLS var is known at 
 compile
 time.
I was thinking instead of iterating over all loaded images. Something that could be done without modifying the compiler.
 If you look at sections_elf_shared.d you can see the signature 
 of
 __tls_get_addr, and that it takes a pointer to the struct 
 tls_index or
 something. *if* I understand correctly, one of the two vars in 
 that
 struct is the index of the image, and the other is the offset 
 into the
 image's TLS section. Not sure where/how that struct is 
 output though.
 So you would have to figure out how to get the backend to do 
 the same
 thing for OSX. I think the image index may have to be assigned 
 at load
 time, but I'm not sure.
If we're going to modify the backend it's better to match the native implementation.
Why?
 I looked a bit at the implementation. For each TLS variable it 
 outputs two symbols (at least if the variable is initialized). 
 One with the same name as the variable, and one with the 
 variable name plus a suffix, "$tlv$init". The symbol with the 
 suffix contains the actual value which the variable is 
 initialized in the source code with.

 The other symbol is a struct looking something like this:

 struct TLVDescriptor
 {
     void* function (TLVDescriptor*) thunk;
     size_t key;
     size_t offset;
 }

 The dynamic loader will, when an image is loaded, set "thunk" 
 to a function implemented in the dynamic loader. "key" is set 
 to a key created by "pthread_key_create". It then maps the key 
 to the currently loading image.

 I think the compiler accesses the variable as if it were a global 
 variable of type "TLVDescriptor". Then calls the thunk passing 
 in the variable itself.

 So the following code:

 int a = 3;

 void foo() { auto b = a; }

 Would be lowered to:

 TLVDescriptor _a;
 int _a$tlv$init = 3;

 void foo()
 {
     TLVDescriptor tmp = _a;
     int b = *cast(int*) tmp.thunk(&tmp);
 }

 When the compiler stores the symbol in the image it would only 
 need to set the offset since the dynamic loader sets the other 
 two fields.

 Although I'm not sure how the "_a$tlv$init" symbol is used. If 
 the dynamic loader completely handles that or if the compiler 
 needs to do something with that.
Our current approach is already very similar - the one for linux/bsd, even more so than OSX. The data layout and exact specifics differ slightly, but the approach you're describing sounds basically the same as what we're already doing. We allocate the TLS block and pthread key for an entire image in one shot, instead of one var at a time, which is a difference, if I understand correctly... but aside from that, I think the effect is the same.

On a slightly different note, I'm looking at our implementation right now... and a couple of things seem wrong with it.

First of all, it allocates the TLS block for each thread that accesses a TLS var:

https://github.com/D-Programming-Language/druntime/blob/fb127f747edb211b06b35a5a5e548f03e9b750e3/src/rt/sections_osx.d#L156

But where does it ever free it!? Does this mean it causes leaks when you create threads and access TLS vars from them? It seems so.

Also, the memory is allocated using calloc, and the block is never added to the GC. Doesn't this mean that the GC won't scan there, and could potentially free objects that are stored there?

     Bit
Nov 10 2015
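On the question of where a per-thread block could be freed: POSIX lets a destructor be attached to a key in pthread_key_create, and it is called at thread exit for any thread whose value for that key is non-NULL. The C sketch below shows that mechanism only; it is one possible approach, not a description of what druntime does, and it does not address GC scanning. The names `free_block` and `run_worker_and_count_frees` are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static pthread_key_t block_key;
static pthread_once_t block_once = PTHREAD_ONCE_INIT;
static atomic_int blocks_freed;    /* counts destructor invocations */

/* Called by the pthread library when a thread exits with a
   non-NULL value stored under block_key. */
static void free_block(void *block)
{
    free(block);
    atomic_fetch_add(&blocks_freed, 1);
}

static void make_block_key(void)
{
    pthread_key_create(&block_key, free_block);
}

static void *worker(void *unused)
{
    (void)unused;
    pthread_once(&block_once, make_block_key);
    void *block = calloc(1, 64);   /* the thread's TLS block */
    if (block != NULL)
        pthread_setspecific(block_key, block);
    return NULL;                   /* destructor runs as the thread exits */
}

/* Runs one worker thread to completion and returns how many blocks
   have been freed so far, or -1 if the thread could not be started. */
int run_worker_and_count_frees(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return atomic_load(&blocks_freed);
}
```

Note that key destructors are tied to thread exit; a block owned by the main thread is not released this way when the process ends through a normal return from main.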
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2015-11-10 18:55, bitwise wrote:

 Why?
Better compatibility, better performance. Why not? -- /Jacob Carlborg
Nov 10 2015
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Tuesday, 10 November 2015 at 18:57:52 UTC, Jacob Carlborg 
wrote:
 On 2015-11-10 18:55, bitwise wrote:

 Why?
Better compatibility, better performance.
How so?
 Why not?
If it ain't broke...don't fix it :) Bit
Nov 10 2015
parent Jacob Carlborg <doob me.com> writes:
On 2015-11-10 20:22, bitwise wrote:

 If it ain't broke...don't fix it :)
But it is, that's why we have this conversation ;) -- /Jacob Carlborg
Nov 11 2015
prev sibling parent David Nadlinger <code klickverbot.at> writes:
On Tuesday, 10 November 2015 at 17:55:58 UTC, bitwise wrote:
 Also, the memory is allocated using calloc, and the block is 
 never added to the GC..doesn't this mean that the GC won't scan 
 there, and could potentially free objects that are stored there?
It's been quite some time since I have looked at the details of DMD's TLS emulation (LDC does not need it), but for scanning the TLS area, you want to have a look at initTLSRanges().

-- David
Nov 10 2015
prev sibling parent bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 5 November 2015 at 07:28:24 UTC, Jacob Carlborg 
wrote:
[...]

Also, someone seems to have hard coded dmd to output _all_ 
functions as COMDAT on OSX... which may explain the weird symbol 
merging problems I was talking about before. Not sure why this 
was done.

     Bit
Nov 05 2015