digitalmars.D - Shared libraries, symbol visibilities, Posix vs. Windows

kinke <noone nowhere.com> writes:
This post is intended to shed some light on shared-library 
details, common pitfalls, and compare Posix and Windows. It's 
LDC-centric. I'm trying not to go into too many details, but 
that's hard. :)



I'm focusing on ELF here (Linux, BSDs, …), but Apple's Mach-O 
seems to work analogously (except for no shared druntime/Phobos 
support for macOS with DMD yet).

For symbols to be accessible from other binaries, they need to be 
'dynamic symbols', which can e.g. be inspected via `objdump -T 
<binary>` (or `readelf --dyn-syms <binary>`). A symbol becomes a 
dynamic symbol if both of these requirements are met:

* object file: default (ELF) symbol visibility (`STV_DEFAULT`), 
not hidden (`STV_HIDDEN`)
* binary: exporting these symbols at link-time via 
`--export-dynamic` (all default-visibility symbols) or 
selectively via `--export-dynamic-symbol[-list]` (linker support 
varies)
   * `--export-dynamic` is the default setting when linking a 
shared library
   * but not for executables (DMD adds it implicitly to the linker 
command, LDC doesn't)

With LDC, the object-file symbol visibility is controlled by 
`-fvisibility={public,hidden}` (analogous to gcc/clang - 
controlling the default visibility for all symbol definitions), 
as well as explicit `export` D visibility and the 
`@ldc.attributes.hidden` UDA.
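
As a rough illustration of these three knobs in source form, here is a minimal sketch. The module and function names are made up for this example, and the build/inspection commands in the comments only combine flags mentioned in this post (the output file name is an assumption).

```d
// mylib.d - hypothetical module, all names are for illustration only.
// Possible build + inspection on Linux (output name assumed):
//   ldc2 -shared -fvisibility=hidden mylib.d
//   objdump -T libmylib.so
module mylib;

version (LDC) import ldc.attributes : hidden;

// `export`: explicitly public, independent of -fvisibility, so this
// is expected to show up as a dynamic symbol of the shared library.
export int publicApi(int x)
{
    return helper(x) + 1;
}

// No annotation: visibility follows -fvisibility={public,hidden}.
int helper(int x)
{
    return x * 2;
}

version (LDC)
{
    // `@hidden` (LDC-specific UDA): DSO-local even with
    // -fvisibility=public.
    @hidden int internalOnly(int x)
    {
        return x - 1;
    }
}
```

With `-fvisibility=hidden`, only `publicApi` should be listed by `objdump -T`; with `-fvisibility=public`, `helper` should appear as well, while `internalOnly` stays out of the dynamic symbol table in both cases.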

The compiler doesn't need to know whether an external symbol is 
going to be provided by some object file/static library or a 
shared library at link-time (no 'import' complications at all; 
`-dllimport` is ignored on Posix). On Posix, static and shared 
libs are mostly interchangeable, everything 'just works'.



One important aspect is that the dynamic loader 'unifies' a dynamic 
symbol if multiple binaries define it (probably using the first 
encountered definition). So dynamic-symbol addresses are identical in 
these binaries, and the binaries operate on the same shared state 
(data symbols).

Say we have a D executable statically linked against the 
`concurrency` dub library, and a shared D library that contains 
its own `concurrency` library (linked statically into the shared 
library). If both binaries export their `concurrency` symbols as 
dynamic symbols, there's effectively a single shared 
`concurrency` state for the whole process. So you don't *need* to 
link executable and shared library against a shared `concurrency` 
library to e.g. have a single `globalStopSource` instance for the 
whole process.

If there are multiple versions of the same library in the whole 
process (duplicate static libs), a potentially surprising pitfall 
is that module constructors, CRT constructors etc. are still 
invoked once per containing binary, so multiple times (and 
operating on the same data). This can be even more surprising if 
the static libs are compiled differently, e.g., with extra 
`version`s for the static `concurrency` lib linked into the 
shared library: loading the shared library may then invoke the 
module constructors from the executable instead (if the 
`ModuleInfo` data symbol, or the module constructor function 
itself, is a dynamic symbol in both binaries). [We've had such a 
case at Symmetry, so I'm not pulling this out of thin air.]
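
To make the pitfall a bit more concrete, here is a minimal sketch of the kind of module that is affected. The module and its names are hypothetical stand-ins, not taken from the actual `concurrency` library.

```d
// sharedstate.d - hypothetical stand-in for a library module with
// process-global state; names are for illustration only.
module sharedstate;

// Process-global (not thread-local) data symbol.
__gshared int initCount;

// Shared module constructor: run once for each binary that
// contains a copy of this module.
shared static this()
{
    ++initCount;
}

int timesInitialized()
{
    return initCount;
}
```

In a single binary, `timesInitialized()` yields 1. If the module is linked statically into both an executable and a shared library, and `initCount` ends up as a dynamic symbol in both, then going by the description above both constructor invocations would increment the same unified counter.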



AFAIK, one usually doesn't bother with selective exports via 
`-fvisibility=hidden`, just compiling with default 
`-fvisibility=public` and thus exporting ~everything. `@hidden` 
is handy for symbols that need to be DSO-local (to be resolved 
inside the same binary only, not 'imported' or 
unified/preempted), but that's an exceptional use case (LDC's 
druntime has a few of these).

For D in particular, the stack traces in druntime depend on 
dynamic symbols - the function names are only resolved if the 
function is a dynamic symbol [while file+line infos are derived 
from the DWARF debuginfos]. So using `-L--export-dynamic` for 
linking executables isn't uncommon (default for DMD) to resolve 
function names from the executable too. The downside is that it 
prevents the linker from stripping unused symbols - dynamic 
symbols aren't stripped, and accordingly neither are any 
non-dynamic symbols that they reference.

Another D-specific aspect is that if a process consists of 
multiple D binaries, they must share a single shared druntime 
[compiled with `-version=Shared` for some important diffs between 
static and shared druntime variants]. So if e.g. a D executable 
comes with plugins support (loading shared D libraries at 
runtime), the executable needs to be linked with 
`-link-defaultlib-shared` explicitly (`-link-defaultlib-shared` 
is the default when linking a shared library via `-shared`), to 
link against the shared druntime and Phobos libraries [separate 
for LDC, not a single merged `libphobos2.so` as for DMD].
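
A minimal sketch of such a plugin host, assuming a plugin that exports an `extern(C)` function named `plugin_entry` (the name and signature are made up for this example). The build line in the comment just applies the flag discussed above.

```d
// host.d - hypothetical plugin host; `plugin_entry` is an assumed
// symbol name and signature.
// Possible build with LDC: ldc2 -link-defaultlib-shared host.d
import core.runtime : Runtime;
import std.stdio : writeln;
import std.string : toStringz;

version (Posix) import core.sys.posix.dlfcn : dlsym;
version (Windows) import core.sys.windows.windows : GetProcAddress, HMODULE;

alias PluginEntry = extern (C) int function(int);

// Looks up an exported symbol in an already loaded library.
void* lookup(void* handle, string name)
{
    version (Posix)
        return dlsym(handle, name.toStringz);
    else version (Windows)
        return cast(void*) GetProcAddress(cast(HMODULE) handle, name.toStringz);
    else
        static assert(0, "unsupported platform");
}

int main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: host <path-to-plugin>");
        return 1;
    }

    // Load via druntime rather than calling dlopen/LoadLibrary directly.
    void* handle = Runtime.loadLibrary(args[1]);
    if (handle is null)
    {
        writeln("failed to load ", args[1]);
        return 1;
    }
    scope (exit) Runtime.unloadLibrary(handle);

    auto entry = cast(PluginEntry) lookup(handle, "plugin_entry");
    if (entry is null)
    {
        writeln("plugin_entry not found");
        return 1;
    }

    writeln("plugin returned ", entry(41));
    return 0;
}
```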



On Windows, we are back in the stone age. Some 
limitations/differences:

* Binaries cannot export more than 64K symbols.
* When linking a DLL implicitly (i.e., not loading it manually at 
runtime and looking up the symbol address via 
`GetProcAddress()`), you don't link against the DLL directly, as 
you would against the .so/dylib on Posix, but have to use a 
separate 'import library' generated by the linker (`mylib.dll` 
with import library `mylib.lib`).
* You can't link a DLL and have some symbols resolved at 
load-time (to be provided by the loading process). All symbols 
need to be resolved at link-time.
* The loader doesn't take care of resolving references to symbols 
exported from other binaries; the compiler needs to do it 
manually at runtime. Accordingly, no automatic 'unifying' of 
duplicate exported symbols.

With that ridiculous 64K-symbols limit, it's clear that we cannot 
default to `-fvisibility=public` on Windows, otherwise you 
wouldn't be able to link any binary with more than 64K symbol 
definitions. [At Symmetry, we have a fat shared library, which on 
Linux has more than 600K dynamic symbols; on Windows, we 
explicitly export a handful of symbols only.] So one needs to 
either resort to selective `export`s (e.g., for plugins with a 
small number of exported functions only), or use a higher number 
of smaller shared libraries explicitly compiled with 
`-fvisibility=public` (such as the druntime and Phobos DLLs).



There's no concept of object-file visibilities in COFF. Instead, 
what happens is that the compiler embeds linker directives in the 
object file if a symbol defined in that object file is to be 
exported (`/EXPORT:foo`). AFAIK, you can't override or tweak this 
at link-time later (as possible on Posix via 
`--export-dynamic…`), so this is all controlled at compile-time 
already. If there are exported symbols/linker directives, the 
linker automatically generates an import library for the linked 
executable/DLL.



While on Posix there's no explicit importing, on Windows things 
are totally different - if you want to directly access a symbol 
defined in another binary, you need to use the import-symbol 
indirection (symbol `foo` needs to be resolved as `*__imp_foo` - 
at runtime, as `__imp_foo` is set by the system at startup).

The `export` visibility on Windows serves two purposes:

- For the object file defining an `export`ed symbol, it causes 
the symbol to be dllexported from every binary that object file 
is linked into.
- In other object files referencing that symbol, the symbol is 
dllimported, unless the object file has been compiled together 
(in the same compiler invocation) with the object file that 
exports it. The assumption here is that all of the object files 
produced in a single compiler invocation are linked together, not 
ending up in different binaries. E.g., if you compile a static 
library in a single compiler invocation, and export a symbol 
explicitly, then all produced object files that don't define the 
symbol reference it directly without dllimport (so to be resolved 
inside the same binary at link-time). So you don't *have* to use 
a .di header to replace an `export` definition with a declaration 
- if the module defining the symbol isn't part of the current 
compilation (not a root module, only D-imported), it's 
dllimported automatically.



For functions, the import libraries fortunately contain 
trampolines (with the original function names). When calling some 
`foo` function exported by another binary, you can link that 
binary's associated import library, which provides a `foo` 
trampoline, which (presumably) loads `__imp_foo` and jumps to 
that address. So calling/accessing some function in another 
binary doesn't *require* any extra handling from the compiler.

Note that the function addresses will diverge across binaries (as 
`&foo` might be a trampoline specific to the current binary), 
unlike on Posix. [For LDC, I've had to adapt a single druntime 
unittest, where the function identity/address mattered.] And 
well, you're going through a trampoline instead of calling the 
function directly, so this might come with a tiny performance 
penalty.



Data symbols on the other hand are a problem - trampolines aren't 
an option because the indirection needs to be loaded at runtime 
(so we need to *run* code for that, can't just access some 
`__imp_foo` directly). In essence, the compiler needs to know in 
advance if a data symbol will be imported from some other binary, 
and then replace `foo` by `*__imp_foo`. That's pretty simple in 
function bodies.
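
The rewrite can be modelled in plain, portable D. This is only a conceptual sketch: the variables below are ordinary globals standing in for the DLL's data symbol and for the `__imp_foo` slot, and `simulateLoader` is a made-up stand-in for what the system does at startup.

```d
// Portable model of the `foo` -> `*__imp_foo` rewrite; this is not
// real DLL code, and all names are made up for illustration.

// Stand-in for a data symbol defined in (and exported from) another DLL.
__gshared int fooInOtherDll = 42;

// Stand-in for the `__imp_foo` slot in the importing binary. Its value
// isn't known at link-time; the system fills it in at startup.
__gshared int* impFoo;

// Plays the role of the system setting up the import slot.
void simulateLoader()
{
    impFoo = &fooInOtherDll;
}

// What a function body reading a dllimported `foo` boils down to:
// load the pointer from the import slot, then dereference it.
int readFoo()
{
    return *impFoo;
}
```

After `simulateLoader()` has run, `readFoo()` returns the current value of `fooInOtherDll`. The extra load is trivial to emit inside a function body, but it requires code to run, which is why it has no direct equivalent in a static data initializer.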

[References to such dllimported data symbols in static data 
initializers on the other hand are a pain. E.g., an object file 
may define a TypeInfo for some struct defined in another DLL, and 
that `TypeInfo.initializer.ptr` needs to be set to the 
dllimported init symbol. LDC keeps track of such references per 
object file and emits a CRT constructor which performs the 
required 'relocations' manually, at runtime.]

Note that there's no support for exporting/importing **TLS** 
symbols at all (in C++ neither). Again, something that just works 
on Posix. [IIRC, I've only had to adapt a single TLS variable in 
druntime for now though, using a function returning a ref 
instead.]
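
A minimal sketch of that workaround, with a hypothetical module and variable (not the actual druntime code):

```d
// tlsstate.d - hypothetical module; names are for illustration only.
module tlsstate;

// Module-level variables are thread-local by default in D. As a TLS
// symbol, this one cannot be exported/imported across DLLs on Windows.
private int depth_;

// Exported accessor instead: other binaries call the function and get
// a reference to the calling thread's instance.
export ref int depth() nothrow @nogc
{
    return depth_;
}
```

Callers then write `depth() += 1;` instead of accessing the variable directly, which works the same way on Posix and Windows.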

Compared to C++, the situation is trickier for D, as we have a 
bunch of implicit data symbols, like ModuleInfos, init symbols 
and way more commonly used (and complicated!) TypeInfos.


`-dllimport={none,all,defaultLibsOnly}`

The main problem on Windows is that the compiler needs to know in 
advance if a data symbol will be imported from some other binary. 
While you could provide the compiler with a fine-grained list of 
modules/packages that are to be treated as external (ending up in 
another binary), I've decided to go with a simpler scheme for 
LDC, focusing on 2 use cases:

1. Building every library as its own shared library. For a dub 
project, this would be building every direct and indirect 
dependency as its own separate shared library (not really 
feasible with dub today). Similar to a Linux distro package 
manager with a central set of shared libraries.
    - This is what LDC defaults to with `-shared`, for symmetry 
with Posix.
    - Similar to how it just works on Posix: export everything 
(`-fvisibility=public`), and import all (`extern(D)`) data 
symbols that aren't defined in a compiled root module 
(`-dllimport=all`). No need for a carefully manually crafted 
`export` library interface. This works best if compiling each 
library with a single compiler invocation (all modules contained 
in the shared library), but isn't a requirement [then potentially 
dllimporting data symbols exported in separately compiled object 
files, with a linker warning 'importing locally defined symbol' - 
probably a slight performance penalty].
    - And also similar to Posix, there's a single state per 
library, because each library is present only once in the whole 
process (no duplicate static libraries with their own separate 
states).
    - With many smaller DLLs, the 64K symbols-limit should be 
manageable.
2. A process consisting of few larger shared libraries, each with 
few selective/explicit `export`s only (`-fvisibility=hidden`), 
but automatically importing all data symbols from druntime and 
Phobos (`-dllimport=defaultLibsOnly` - basically treating a 
module as binary-external if starting with `std.`, `core.` or 
`ldc.`).
    - When linking a static library into such a binary, it must 
have been compiled with matching visibility options 
(`-fvisibility=hidden -dllimport=defaultLibsOnly`). Somewhat 
similar to how you have to compile C(++) code ending up in a 
shared Posix library with `-fPIC`.

This makes it possible to use shared libraries on Windows quite 
painlessly, all controlled by the `-fvisibility` and `-dllimport` 
compile options, and optionally the D `export` visibility + 
`@hidden` UDA.

What isn't supported is, for example, a dub project where some 
deps are built as shared library (without selective/explicit 
`export`s), and others as static libraries. Say, only using the 
`concurrency` dub dependency as a shared library exporting 
everything (to have a single process-global state for that 
library on Windows too), and linking everything else statically. 
That would require more fine-grained control over binary-external 
modules, with an according combinatorial explosion (something 
like `-dllimport=std.*,core.*,ldc.*,concurrency.*`).



Similar to gcc/clang's `-fvisibility-inlines-hidden`, you can use 
LDC's `-linkonce-templates` to NOT export any instantiated 
symbols, so that each binary comes with its own instantiated 
state and functions.

On Windows, without `-linkonce-templates`, there's again the 
problem of importing instantiated data symbols. Such a symbol can 
be instantiated and defined (possibly exported) in multiple 
binaries, plus there's the template-codegen-culling mechanism in 
the frontend. For somewhat predictable behavior, I've chosen to do a 
sort of 'lightweight' `-linkonce-templates` for instantiated data 
symbols, if the template *declaration* is in a binary-external 
module. This means that there's one such instantiated data symbol 
for each Windows binary that references it. A simplified example: 
if Phobos declares a template with some counter global, and 
multiple binaries compiled with `-dllimport=defaultLibsOnly` 
instantiate it identically, they'll all have their own counter 
globals. Again, on Posix, the loader unifies the instantiated 
data symbol, everything just works. [More infos: 
https://github.com/ldc-developers/ldc/issues/3931]
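
The counter example could look roughly like this; `Counter` is a made-up template for illustration, not something that exists in Phobos.

```d
// Hypothetical library template with a per-instantiation global.
template Counter(T)
{
    // One `count` data symbol per distinct instantiation
    // (Counter!int, Counter!double, ...).
    __gshared int count;

    int bump()
    {
        return ++count;
    }
}
```

Within a single binary, two calls to `Counter!int.bump()` return 1 and then 2, while `Counter!double.bump()` starts at 1 again. The question discussed above is what happens when several binaries instantiate `Counter!int`: one unified `count` on Posix, versus one `count` per referencing binary on Windows if the template is declared in a binary-external module.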



For a project at Symmetry, we currently have the following 
architecture, working on both Linux (DMD and LDC) and Windows 
(LDC only):

* a bunch of thin frontends (executables and shared libraries),
* the core as a single fat shared library, with a handful of 
explicitly `export`ed functions (and something akin to a `.di` 
header as shared-lib interface), which all frontends link 
against implicitly, and
* a bunch of plugins (shared libraries) which can be loaded 
dynamically at runtime, each with a dozen (or so) explicitly 
`export`ed functions (resolved via `GetProcAddress`/`dlsym`)

On Windows, *everything* (except for prebuilt druntime and Phobos 
DLLs) is compiled with `-fvisibility=hidden 
-dllimport=defaultLibsOnly`. All binaries share some base dub 
dependencies that are all linked statically.

This is an evolution from a prior approach, where we had a 
smaller core with about 25 plugins, and linked that core 
statically into every frontend. The static libraries duplication 
(base dub dependencies) was much worse then, causing a much 
higher overall bundle size. So we extracted the core as separate 
shared library and now link most former plugins statically into 
that core.

Handling non-unified separate states on Windows can be a pain: 
https://github.com/symmetryinvestments/concurrency/pull/88

The full bundle consists of about 200 dub libraries/executables, 
so the alternative of building every dub dependency as its own 
shared library with (on Windows) `-fvisibility=public 
-dllimport=all` doesn't seem too attractive and hasn't been 
tested yet; it would surely be a huge challenge. :)



As is hopefully clear by now, it's archaic Windows which 
complicates matters enormously wrt. shared libraries. My strong 
opinion on this is that the D language itself shouldn't cater to 
its limitations - we try to do our best (with reasonable effort) 
to make things work on Windows too (Rainer Schütze has been 
working on adopting the LDC scheme to DMD, some things landed 
already), but the OS is just too primitive to handle all cases 
without too much Windows-only effort (like adding our own 
D-specific extra indirection for all symbols to implement a 
unified state, or wrapping TLS variables with functions - all 
stuff the compiler could do, but just for a crappy operating 
system?).
May 16
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Overall, a very good write-up of compiler- and platform-specific 
under-the-hood details!

On 17/05/2024 1:00 AM, kinke wrote:
 There's no concept of object-file visibilities in COFF. Instead, what 
 happens is that the compiler embeds linker directives in the object file 
 if a symbol defined in that object file is to be exported 
 (`/EXPORT:foo`). AFAIK, you can't override or tweak this at link-time 
 later (as possible on Posix via `--export-dynamic…`), so this is all 
 controlled at compile-time already. If there are exported symbols/linker 
 directives, the linker automatically generates an import library for the 
 linked executable/DLL.
You can add exports later with the help of a linker script. I don't think you can go the other way however.
 With many smaller DLLs, the 64K symbols-limit should be manageable.
Can confirm, as long as templates are not exported we are talking around 200k LOC. More than enough for a single (sub)package.
May 16
Lance Bachmeier <no spam.net> writes:
On Thursday, 16 May 2024 at 13:00:44 UTC, kinke wrote:

 As is hopefully clear by now, it's archaic Windows which 
 complicates matters enormously wrt. shared libraries. My strong 
 opinion on this is that the D language itself shouldn't cater 
 to its limitations - we try to do our best (with reasonable 
 effort) to make things work on Windows too (Rainer Schütze has 
 been working on adopting the LDC scheme to DMD, some things 
 landed already), but the OS is just too primitive to handle all 
 cases without too much Windows-only effort (like adding our own 
 D-specific extra indirection for all symbols to implement a 
 unified state, or wrapping TLS variables with functions - all 
 stuff the compiler could do, but just for a crappy operating 
 system?).
WSL works very well these days. Not long ago I tried to get someone using D on Windows (they'd have been creating shared libraries). I eventually gave up on the "native" effort and told them to try WSL. They installed everything themselves and got on with their work. I don't know how feasible WSL is as a general solution, but it's a native Windows solution that's part of the OS, and additional Windows-only effort is only a benefit in situations where you need to take a different approach.
May 16
Adam Wilson <flyboynw gmail.com> writes:
On Thursday, 16 May 2024 at 17:53:24 UTC, Lance Bachmeier wrote:
 On Thursday, 16 May 2024 at 13:00:44 UTC, kinke wrote:

 WSL works very well these days. Not long ago I tried to get 
 someone using D on Windows (they'd have been creating shared 
 libraries). I eventually gave up on the "native" effort and 
 told them to try WSL. They installed everything themselves and 
 got on with their work.

 I don't know how feasible WSL is as a general solution, but 
 it's a native Windows solution that's part of the OS, and 
 additional Windows-only effort is only a benefit in situations 
 where you need to take a different approach.
It's not. As an add-on to Windows, you cannot rely on its existence. Most Windows shops likely don't even know it exists. Many shops have rules about what can and cannot be installed on their Windows boxes for compliance. The rules are byzantine, make no sense to programmers, and are inviolable. This is kind of an IYKYK situation, and if you know, I feel your pain.
May 17
Walter Bright <newshound2 digitalmars.com> writes:
On 5/16/2024 6:00 AM, kinke wrote:
 Apple's Mach-O seems to work analogously (except for no shared 
 druntime/Phobos support for macOS with DMD yet).
Bugzilla entry, please!
May 17