digitalmars.D - Big picture on shared libraries when they go wrong, how?

reply Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
This post is meant to be a highly enlightening and entertaining 
explanation (or should I say, it shouldn't cure anyone's insomnia) 
of just how many things can go wrong with shared libraries if 
they are not handled correctly, regardless of platform.

Now I know this is an utter wall of text, but if you want to work 
with shared libraries you probably should read all of this. It'll 
get you up to speed on the theory of using them by preventing a 
repeat of my experiences, no war stories for you!

If you have inside knowledge of how shared libraries work, please 
expand upon this in the comments, perhaps we can get an article 
out of it for the site.

Some of the advice in this article may go against your previous 
experiences working with shared libraries. The recommendations 
here exist because the alternatives have proven to be problematic 
in a large portion of support requests over a two year period. 
If you understand what you are doing, you can of course disregard 
a particular piece, and you may want to expand upon or refine the 
information given here so that we can create a great overview of 
the subject for future programmers to learn from!

Latest copy can be found 
[here](https://gist.github.com/rikkimax/c2b501e64e3cca6d59343a286e5466df).



Before we begin to get into actual content we should probably 
cover some basic terms.

- Binary (within a process, can be known as an image or module): 
An executable or shared library.
- Static library: An archive containing one or more object files.
- Shared library: A reusable and multi-loadable binary that 
typically does not contain an entry point function.
- Out of binary: A symbol that does not exist in the current 
(compiling/linking) binary.
- Visibility override switch: A compiler switch that changes the 
default symbol mode of symbols, unless stated otherwise.
- DllImport override switch: A compiler switch that changes the 
default symbol mode for symbols that are external, unless stated 
otherwise.
- Silo'd: a library that is unaware of other instances of itself 
(my own definition for the usage of this article).
- Isolated: a library that is sandboxed so that no resources can 
cross into other code (my own definition for the usage of this 
article).



- Common Mistakes
	> Not asking for help in understanding the theory behind shared 
libraries, linking and loading in general is going to lead to 
failure for your project. No matter how good you are with this 
stuff, help will be needed at some point.

- Things That are Not Covered
	> Not everything has been described here that can impact shared 
libraries usage in D. It is not a tutorial, but a reference for 
before you start using them.

- Is a Dynamic Link Library a Shared Library?
	> Yes, but they make it easy to think otherwise!

- Import Libraries are Special Yes?
	> There is nothing special about import libraries, don't export 
global variables, oh and you should probably just link against a 
DLL dynamically!
	
- Symbol Modes Make Ya Go Mad!
	> When dealing with shared libraries there are three modes a 
symbol can be in: ``Internal``, ``DllImport`` and ``DllExport``. 
Setting these up correctly is the core problem; getting them 
wrong results in both linkage failures and runtime errors.

	- Not Everything Should Be Exported
		> Just because something can be exported, doesn't mean it 
should be, i.e. TLS.

- Symbol, What Symbol?
	> Current language is not very helpful with any generated 
symbols and this can lead to program corruption.

- Knowing When to DllImport
	> Current solutions are too broad, inconsistent and will 
outright result in linker errors without any compiler assistance. 
They outright prevent intermediary usage of static libraries and 
object files without issues arising.

- Why Not Intermediary Static Libraries?
	>  A static library does not fully get included, eliding FTW! 
Use object files for intermediaries rather than static libraries 
for anything that gets exported.

- It is Loaded, Works Yes?
	> Just because it linked, doesn't mean it'll load even with the 
right dependencies and the behavior of loaders are not consistent 
between platforms.

- Unloading
	> To keep your sanity, don't unload a shared library unless your 
process is dying.

- Initializing Your Shared Library
	> A shared library that allows you to borrow resources it owns, 
and borrows from another is full of failure modes that may not be 
avoidable.

	- TLS Hooking
		> Only Windows offers hooking of threads, which supports zero or 
more ``DllMain``'s and for druntime should be automatically 
injected.

	- Scenario: Your Own Memory Allocator
		> The order of deinitialization can matter between sibling 
shared libraries. If you can avoid letting a sibling shared 
library borrow resources from you, you should avoid it.

	- Scenario: Your Own Threads
		> If you're going to do your own threads, don't forget to 
register them with druntime and handle cyclic registration to and 
from.

- Where Is Thy Runtime?
	> Did you follow my advice in ``Unloading``, no? Well good luck 
with that. If you have a runtime loaded don't have duplicates of 
it, stick to a single shared library build of it.

- Who Needs a Scope Anyway?
	> Go ahead be smart! Don't use shared libraries or static 
libraries, go import only! See how quickly you kill off that 
scope that depends on having state.



TLDR: Not asking for help in understanding the theory behind 
shared libraries, linking and loading in general is going to lead 
to failure for your project. No matter how good you are with this 
stuff, help will be needed at some point.

I would write a lot more here, but currently the language and the 
tooling simply do not assist you in getting what you need sent 
to the linker.

- You cannot tell the compiler that a module is not in your 
binary. See: ``Knowing When to DllImport``. My DIP fixes this.
- You cannot tell the compiler that something is private, 
actually needs to be exported and have it work correctly 
(``export`` is currently a visibility modifier). See Atila's 
DConf 2023 talk [``You're Writing D Wrong--Átila 
Neves``](https://www.youtube.com/watch?v=Rm_8Hpex68s) as to why 
this is very worrying that we cannot do it currently. This is 
something my DIP resolves.
- If you are able to tell the compiler that a type needs to be 
exported, it will not export the things it generates for that 
type, so it will not work anyway. See: ``Symbol, What Symbol?``. 
Another thing my DIP fixes.
- If it does work, it's going to cause silent program corruption. 
See: ``Symbol, What Symbol?``.

In general if you're going to work with shared libraries, you 
will likely run into situations where you need help. Buying, 
reading and learning from [Linkers & 
Loaders](https://www.amazon.com.au/Linkers-Loaders-John-Levine/dp/1558604960)
is not going to be enough to get you to a successful outcome.



TLDR: Not everything has been described here that can impact 
shared libraries usage in D. It is not a tutorial, but a 
reference for before you start using them.

- No D code with build file examples
- Exceptions
- Template instantiations that cross the shared library boundary



TLDR: Yes, but they make it easy to think otherwise!

So let's start with something simple: the claim that a Dynamic 
Link Library (DLL) is not a shared library. This is not an 
accurate statement, as a DLL fills the same role on Windows that 
a shared library does on non-Windows systems. The claim comes up 
in a few places such as [Windows System Programming 3rd edition pg. 
150](https://www.amazon.com/Windows-System-Programming-Johnson-
art/dp/0321256190), [documentation for
GetFullPathNameA](https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileap
-getfullpathnamea), [an answer on stack
overflow](https://stackoverflow.com/a/62517860).

The shared library model is notable because of the reusable 
nature of a binary that the OS loader can merge into your 
process. Either during initial load started by the kernel or 
during execution of your program at your request.

Of note is that the binaries that make up a process (executable 
and shared libraries) are _not_ isolated. They are **_merged_**. 
Once merged, the only thing preventing exposure of one to another 
is the symbol table that the kernel keeps for each binary, which 
is used for patching.

In another section ``Where Is Thy Runtime`` I describe a library 
that is silo'd, this just means it does not know about other 
things in the process. Isolation on the other hand would refer to 
sandboxing which as far as I am aware no OS does.

Okay so how is that entertaining? Great question, due to the 
indirection introduced by DLL's it can appear that they are in 
fact isolated which can lead to quite some interesting moments!



TLDR: There is nothing special about import libraries, don't 
export global variables, oh and you should probably just link 
against a DLL dynamically!

Whenever you link a binary you may have noted a corresponding 
file has been created along with it. This is an import library, 
it was generated by the linker when it saw that you exported 
something. These are quite informational, they tell you what 
symbols were exported, but more importantly they tell a future 
linker invocation about them too!

Not all platforms use these files; others such as Linux rely 
solely on what is in the binary to provide this information. 
Windows uses both import libraries and information in the shared 
library to map symbols, which works great for their commercially 
concerned OS!

So what are import libraries? Some custom format or other 
horrendous thing to never learn about?

No! In fact they are just regular static libraries! If you can 
emit a static library you can probably create your own without 
much work.

The two main things of interest that they contain are the extern 
symbols that have ``_imp`` prefixed to their name, and wrappers 
around these symbols that do a simple jump (or similar) to what 
is pointed at: ``jmp [_imp_symbol];``. These wrappers are 
generated with the original symbol name (without the ``_imp``).

Those generated wrappers are why the druntime bindings to WinAPI 
currently work, without ``DllImport`` support being cleanly 
defined and in active use by the language!

This has another interesting tidbit, you should **_only_** have 
the ability to export functions, not global variables.  You can 
see this in [Microsoft's 
libc](https://learn.microsoft.com/en-us/cpp/c-runtime-library/errno-doserrno-sys-errlist-and-sys
nerr?view=msvc-170) how they have it to be a function call in a macro.

What is great about this is in practice there is no difference 
between linking against a shared library statically (using 
linker) or loading dynamically (using loader yourself). Either 
way you're dealing with an indirection of using a global pointer!

So if you're ever asking yourself if you should statically or 
dynamically link against a shared library on Windows, you should 
probably link dynamically unless you're distributing the end 
binary as it makes no difference when using a symbol.
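Loading dynamically "using the loader yourself" can be sketched as follows. This is a minimal illustration rather than a recommended design: it looks up the C function `cos` at runtime and calls it through a pointer, which is the same kind of indirection an import library gives you. The library names (`msvcrt.dll`, `libm.so.6`, `libSystem.B.dylib`) and the helper `loadCos` are assumptions chosen for the example and may need adjusting for your system.

```d
import core.stdc.stdio : printf;

// Signature of the C function we look up at runtime.
alias CosFn = extern (C) double function(double) nothrow @nogc;

// Loads a system library and returns a pointer to `cos`, or null on failure.
// The library is intentionally never unloaded.
CosFn loadCos()
{
    version (Windows)
    {
        import core.sys.windows.winbase : GetProcAddress, LoadLibraryA;

        auto lib = LoadLibraryA("msvcrt.dll"); // assumed library name
        return lib is null ? null : cast(CosFn) GetProcAddress(lib, "cos");
    }
    else version (Posix)
    {
        import core.sys.posix.dlfcn : dlopen, dlsym, RTLD_LAZY;

        version (OSX)
            enum libName = "libSystem.B.dylib"; // assumed library name
        else
            enum libName = "libm.so.6";         // assumed library name (glibc)

        auto lib = dlopen(libName, RTLD_LAZY);
        return lib is null ? null : cast(CosFn) dlsym(lib, "cos");
    }
    else
        static assert(0, "unsupported platform");
}

void main()
{
    auto cosFn = loadCos();
    if (cosFn is null)
    {
        printf("could not load cos\n");
        return;
    }
    // Every call goes through the pointer, just like a call via `_imp`.
    printf("cos(0) = %f\n", cosFn(0.0));
}
```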



TLDR: When dealing with shared libraries there are three modes a 
symbol can be in: ``Internal``, ``DllImport`` and ``DllExport``. 
Setting these up correctly is the core problem; getting them 
wrong results in both linkage failures and runtime errors.

In the traditionally applied (POSIX) shared library model, the 
only symbol modes relevant to discussion are internal versus 
external. An external symbol is one not defined in a given 
binary, and internal is found within. However just because a 
symbol is internal does not mean it has its symbol name known or 
accessible to other binaries to link against.

Along came Windows DLLs, and we no longer use the internal versus 
external terminology with shared libraries. It is still relevant 
to object files, and it is how linkers and loaders still operate 
at the lowest level, even if we are no longer operating solely 
within it. Now we use ``Internal``, ``DllImport`` and 
``DllExport`` regardless of the platform.

- An Internal symbol is a symbol that is found in a binary that 
is not directly accessible by name externally to that binary.
- A ``DllImport`` symbol is a symbol that is not found in the 
current binary and is external to it. For Windows specifically 
this refers to the symbol having indirection via a global pointer 
to the internal symbol. See ``_imp`` prefixed symbols in import 
libraries heading above.
- A ``DllExport`` symbol is an internal symbol that has an 
exportation linker flag applied to it. Traditionally this will 
expose the symbol name for the symbol. For Windows it will hide 
the internal symbol and instead expose a new global variable 
which is a pointer, using the name with the prefix ``_imp``, that 
points to the internal symbol.
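As a rough sketch of how the three modes look in today's D source (the function names are hypothetical, and how the compiler maps each declaration to a mode depends on the platform and switches in use):

```d
import core.stdc.stdio : printf;

// Internal mode: no `export`, so other binaries are not meant to link
// against this symbol by name.
private int scaleFactor() nothrow @nogc
{
    return 3;
}

// DllExport mode: `export` on a definition asks for the symbol to be exposed.
export extern (C) int scaled(int value) nothrow @nogc
{
    return value * scaleFactor();
}

// DllImport mode would be a body-less declaration of a symbol that lives in
// another binary. It stays commented out so this sketch links on its own:
// export extern (C) int providedElsewhere(int value) nothrow @nogc;

void main()
{
    printf("scaled(7) = %d\n", scaled(7));
}
```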

Each platform has its own tunings to the shared library model. 
OSX and Linux may both be POSIX, but they each have their 
own behaviors that are not necessarily POSIX compliant.

LLVM has some explanations for these modes, there are many others 
they support although they are not relevant to this document. For 
[internal](https://llvm.org/docs/LangRef.html#linkage-types), and 
for 
[DllImport/DllExport](https://llvm.org/docs/LangRef.html#dll-storage-classes).

Symbol modes are the heart and soul of the majority of issues 
relating to shared library support in the language. Most 
specifically: what should be exported automatically, and when do 
we apply ``DllImport`` instead of Internal?



TLDR: Just because something can be exported, doesn't mean it 
should be, i.e. TLS.

The vast majority of errors with user-written (not compiler 
generated) symbols are due to the symbol modes ``DllImport`` and 
``Internal`` being mixed up. But sometimes ``DllExport`` can 
cause issues for both generated and user-written symbols.

According to [Ulrich 
Drepper](https://www.akkadia.org/drepper/dsohowto.pdf) and at 
least one other [Stack overflow 
user](https://stackoverflow.com/a/32701238) C 
constructors/destructors on Linux do not need to be exported.
Since exporting them is not required, doing so unnecessarily can 
only invite problems. See the [bug 
ticket](https://issues.dlang.org/show_bug.cgi?id=24536) to track 
disallowing exportation of functions marked as such.

Alternatively another set of issues can be seen with generated 
symbols such as ``ModuleInfo`` or ``TypeInfo``. By not exporting 
``ModuleInfo`` and assuming it is available the compiler 
introduces a hidden dependency on a generated symbol that may not 
exist.

This is a bit of a problem with shared libraries. Especially when a 
D file could actually be a binding to a C library (like Deimos). 
See these two tracking issues for ``ModuleInfo`` exportation 
problems [Export 
ModuleInfo](https://issues.dlang.org/show_bug.cgi?id=231770) and 
[Remove 
dependency](https://issues.dlang.org/show_bug.cgi?id=23974).

Unfortunately the removal of the dependency can only work 
correctly if you know that the module is out of binary or you end 
up with fun situations where a dependency module does not 
initialize before you try to access it.

See ``Why Not Intermediary Static Libraries?`` for an explanation 
on why a static library should not contain exports.

Thread local variables (TLS) and fiber local variables (FLS) are 
examples of specialty global variables that should never be 
exported. The scheme used for each depends on the platform and 
can change over time (Android has recently changed its TLS scheme 
for instance).

The global itself could be a key into some sort of map that the 
operating system provides, or one emulated into existence by the 
toolchain. The creation of the key into the map may be done by 
user code, as done with 
[pthread](https://pubs.opengroup.org/onlinepubs/009695399/functions/pthr
ad_key_create.html) and
[Win32](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-process
hreadsapi-tlsalloc) which has explicit mention that the handle may not cross
the DLL boundary.

Instead of exporting a TLS variable you can wrap the access to 
the storage pointer by a function that returns it. This should be 
done automatically by the compiler or disallowed.
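A minimal sketch of the accessor approach, assuming a hypothetical thread-local `lastError` variable: instead of exporting the variable, only a function that returns a pointer to the calling thread's copy is exported.

```d
import core.thread : Thread;

// Module-level variables in D are thread-local by default.
private int lastError;

// Exported accessor: each caller gets a pointer to its own thread's copy.
export int* lastErrorPtr() nothrow @nogc
{
    return &lastError;
}

void main()
{
    *lastErrorPtr() = 5;

    int seenByOtherThread = -1;
    auto worker = new Thread({ seenByOtherThread = *lastErrorPtr(); });
    worker.start();
    worker.join();

    assert(*lastErrorPtr() == 5);   // this thread's value is untouched
    assert(seenByOtherThread == 0); // the worker saw its own fresh copy
}
```

The returned pointer is only meaningful on the thread that asked for it, so callers should not stash it and hand it to another thread.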



TLDR: Current language is not very helpful with any generated 
symbols and this can lead to program corruption.

So you've got yourself a fancy pants type and you've done 
everything right. Exported all the symbols that don't get 
exported automatically (that the compiler is supposed to 
export for you since you can't in language), annotated the type 
and its methods with export, but... you get a segfault 
when you use it. What would you do?

I have had to deal with this very situation before multiple times 
when the D code looks like this:

```d
MyType var;
var = MyType(...);
```

It looks like it should be working fine! The segfault isn't even 
in this function!!! How is this code buggy? Well you won't 
believe this... but that variable initialization, didn't 
initialize.

See dmd is rather "_helpful_" even though it didn't know that the 
``.init`` symbol is in ``DllImport`` mode rather than 
``Internal``, and because of the way the codegen works it still 
linked and didn't cause any memory corruption!

So when the copy from the ``.init`` symbol to the stack occurs it 
sees a zero length, and it thinks I'm done! Wahoo, I did the 
thing. Except it didn't do the thing. In fact it did zero of the 
things it was meant to do.

What you end up with is a variable with junk left over stack data 
which can be pretty much anything. This shows up very easily when 
you are dealing with library based reference counting, due to the 
atomic alignment check. Not a fun time to be had.

This shows us how important it is to export symbols generated 
from a type automatically when other symbols have been explicitly 
exported. D has a lot of house keeping symbols that get 
generated, including ``opCmp``! All of these must be handled for 
you, or it hasn't got a chance to work and there will be a lot of 
distractions requiring a significant amount of debugging to 
resolve.
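To make the role of the ``.init`` data concrete, here is a small sketch using a hypothetical `Counted` struct. It inspects the initializer through druntime's `TypeInfo`, which is only a runtime view of the same data; the compiler's own codegen for default initialization may access the symbol differently. When the length seen for this data is zero instead of the type's size, nothing gets copied, which is the failure described above.

```d
import core.stdc.stdio : printf;

// Hypothetical reference-counted handle with a non-zero default value.
struct Counted
{
    int refCount = 1;
    void* payload;
}

void main()
{
    // druntime's view of the initializer data for the type.
    const(void)[] initData = typeid(Counted).initializer();

    Counted value; // default initialization copies the initializer

    printf("initializer bytes: %zu, refCount: %d\n",
        initData.length, value.refCount);

    assert(initData.length == Counted.sizeof);
    assert(value.refCount == 1);
}
```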



TLDR: Current solutions are too broad, inconsistent and will 
outright result in linker errors without any compiler assistance. 
They outright prevent intermediary usage of static libraries and 
object files without issues arising.

So we've so far covered how the compiler needs to assist with 
exportation automatically and that you must have a way to put a 
symbol into ``DllExport`` mode, but we still have to cover 
``DllImport``, and what the compiler can do to assist you.

Nothing. It cannot help you. It will get it wrong, things will 
not link.

So it is fully on you to put symbols into ``DllImport`` mode, and 
that right there is the giant problem, how do you do this?

Well you can start with the ``dllimport`` override switch that 
ldc has introduced. But you are limited to either system 
libraries like druntime and phobos, or every shared library. 
There is no finer grained solution as part of CLI switches 
currently.

If you do it in code, now suddenly you have to maintain both an 
interface file and the source file. Oh did I mention that the 
compiler can't help here either? Yeah... the D interface 
generator has no knowledge of if you want the resulting file to 
be used for a static library or shared library. Even if it was 
going to work, it isn't going to work for you today.

So you have got to annotate per symbol that it is in 
``DllImport`` mode. In my DIP for exportation I changed this to 
have the consistent syntax of ``export`` with ``extern`` and this 
applies to all symbols.

Still this isn't a good enough situation, doesn't help build 
managers and certainly is a major pain, obviously nobody is going 
to do this manually if they have a choice.

While it is great to have a fine grained solution (including 
conditionally) for setting ``DllImport`` mode, this shouldn't be 
your primary way of setting up the symbol modes.

There is an alternative that works great as a story for both 
build managers and for people who don't know anything about _why_ 
it exists!

The external import path switch ``-extI`` is a switch I have 
proposed, similar to ``-I``. If you understand the import path 
switch, you can understand that the external import switch is 
just for modules found in a shared library. Easy swap!

From a compiler perspective, it knows that any module found via 
an external import switch lives in another binary, and that any 
module found via the import switch is part of the currently 
compiling binary!

This enables it to switch any found ``DllExport`` symbols to 
``DllImport`` without any action on each symbol by the 
programmer. How wonderful!

But what if we didn't annotate with ``export`` and instead used 
the visibility override switch to set exportation? Well, use the 
``dllimport`` override switch to apply to all symbols found in 
an external module. Great, more compiler assistance with minimal 
changes!

But why not just use the override switches, isn't that good 
enough? No, no it is not. It's too broad.

Without the ability to pick which modules are out of binary, 
versus being linked into the current binary you get [linker 
warnings](https://learn.microsoft.com/en-us/cpp/error-messages/tool-errors/linker-tools-warning-ln
4286?view=msvc-170) and they exist because you are outright doing the wrong
thing by adding extra indirection (which may not have been enabled on the
other side, if the visibility override switch was missing or set differently).
This has the unfortunate casualty that static library or object 
file intermediaries cannot be used without causing problems.



TLDR: A static library does not fully get included, eliding FTW! 
Use object files for intermediaries rather than static libraries 
for anything that gets exported.

So you've been a good programmer, split up your code base so that 
there are intermediary compilation steps to enable faster rebuild 
times and proper scoping of project work. Nothing could go wrong 
with that when it comes to shared libraries right? Right???

Oh how naive you are! There is so much wrong with this that 
you're going to rethink everything you have ever done.

So linkers don't just include a static library whole. By default 
they only include an object file from it if something references 
it. Great for when you are building executables, not 
so great when you are constructing a shared library from static 
libraries containing exports that do not get pulled in by 
anything.

Unfortunately while there is a way to [force 
it](https://learn.microsoft.com/en-us/cpp/build/reference/wholearchive-include-all-library-object-f
les?view=msvc-170), you need to know the static library's name and it can be a
bit buggy depending on the linker in question. The only reasonable solution to
this is to use object files, which do not get elided.

According to Adam Wilson, the recommendation from Microsoft 
internally is to not export from static libraries and this makes 
sense given the above issues. So while you can use a static 
library to contribute towards your shared library, it should not 
be providing any exported symbols.

This is problematic with dub, as it does not support object files 
currently. See this 
[ticket](https://github.com/dlang/dub/issues/2633) for a 
potential redesign of how dub works with target types.

You should also be aware that with both of the override switches 
(``visibility`` and ``dllimport``) you will not have fine-grained 
control over exports in a static library versus object 
files in dub today based upon the (sub)package. There are 
multiple things that will need to be done to let people avoid 
running afoul of these recommendations whilst still 
enabling full control.

To further complicate matters, if you want to fully isolate a 
static library neither dub nor the compiler can assist you (by 
using the .di generator). This will require further research to 
enable this advice of not exporting from static libraries to be 
automatically applied with minimal intervention by the programmer.



TLDR: Just because it linked, doesn't mean it'll load even with 
the right dependencies and the behavior of loaders are not 
consistent between platforms.

So you have successfully compiled and linked. Symbols that were 
supposed to be exported were, and those that weren't weren't. So 
it will work now yes? YES?

NOPE. We are not done yet.

Now we gotta talk about loading of shared libraries and ensuring 
their state is valid.

But where does a loader look for a shared library to load? First 
place is system directories which of course depends upon your 
system configuration.

For POSIX systems it uses some environment variables to determine 
auxiliary locations. It also looks in a special string within a 
binary (executable and shared libraries) called ``RPATH``, 
however keep in mind this travels with the binary no matter 
where it's called from or by whom.

On 
[Windows](https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-l
brary-search-order) and
[OSX](https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/DynamicLibraries/100-Articles/UsingDy
amicLibraries.html) it'll look in the current working directory by default too,
not just system directories or the ``PATH`` variable.

Windows does support some 
[customization](https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setdlldirectorya?sou
ce=recommendations) for the usage of launchers, which allows additional search
paths to be set up at runtime.

So much variety in behavior of the system loader, how do we 
ensure we have a consistent behavior that "just works" with our 
build managers? Outside of the build manager we really can't do a 
whole lot.

But what we can do is unify upon placing them into the same 
directory as the executable and then letting the build manager 
use the appropriate environment variables to set up the lookup 
paths to point to it. If all you want to do is run your 
program, that is great.
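If you load a library yourself rather than relying on the build manager's environment setup, the same "next to the executable" convention can be sketched like this. The helper names and the `demo` library are hypothetical, and the file-name conventions shown are the usual platform defaults rather than anything dub guarantees.

```d
import std.file : thisExePath;
import std.path : buildPath, dirName;
import std.stdio : writeln;

// Usual platform naming convention for a shared library called `baseName`.
string sharedLibraryFileName(string baseName) pure
{
    version (Windows)
        return baseName ~ ".dll";
    else version (OSX)
        return "lib" ~ baseName ~ ".dylib";
    else
        return "lib" ~ baseName ~ ".so";
}

// Full path of a shared library expected to sit beside the executable.
string besideExecutable(string baseName)
{
    return buildPath(dirName(thisExePath()), sharedLibraryFileName(baseName));
}

void main()
{
    // The resulting path can be handed to the loader, for example via
    // core.runtime's Runtime.loadLibrary.
    writeln(besideExecutable("demo"));
}
```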

I have a [PR](https://github.com/dlang/dub/pull/2718) to add this 
capability to dub, which has been a tad contentious for those who 
are not me or Martin.

Of course all of this assumes you have all the dependencies set up 
with no conflicts in place (such as versioning). If you don't, 
you're going to need a tool like 
[Dependencies](https://github.com/lucasg/Dependencies) to figure 
this one out the hard way.



TLDR: To keep your sanity, don't unload a shared library unless 
your process is dying.

Remember when I said shared libraries are not _isolated_ 
(sandboxed)? Yeah that. That is a bit of a problem...

If you unload a shared library, you put your process into an 
indeterminate state where you cannot know whether it has been 
corrupted. For this reason I would not recommend unloading a 
shared library except in one rather particular case.

If you can guarantee that a given shared library has not, during 
its existence, been sharing its resources and you have not been 
taking any pointers into it, you may unload it.

To work around this limitation of no sharing of resources, you 
can hand out handles, as long as they are not the integral 
representation of a pointer. Internally, use a data structure to 
map each handle to its pointer. A much slower approach, 
but safer if you need to do unloading.
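A minimal sketch of such a handle table, with hypothetical names throughout and no thread safety: callers only ever see an integer, and the library resolves it to a pointer internally.

```d
// Hypothetical resource owned by the shared library.
struct Resource
{
    string name;
}

// Opaque handle given to callers; it is a key, never a pointer.
alias Handle = uint;

private __gshared Resource*[Handle] liveResources;
private __gshared Handle nextHandle = 1;

// Creates a resource and returns the handle that identifies it.
Handle createResource(string name)
{
    immutable handle = nextHandle++;
    liveResources[handle] = new Resource(name);
    return handle;
}

// Maps a handle back to its resource; null if it is unknown or released.
Resource* resolve(Handle handle)
{
    auto found = handle in liveResources;
    return found is null ? null : *found;
}

// Forgets the handle, so stale handles resolve to null instead of dangling.
void releaseResource(Handle handle)
{
    liveResources.remove(handle);
}

void main()
{
    auto h = createResource("texture");
    assert(resolve(h) !is null && resolve(h).name == "texture");

    releaseResource(h);
    assert(resolve(h) is null);
}
```

The extra lookup on every access is the cost mentioned above; in exchange, a stale handle fails a lookup instead of dereferencing memory that may no longer be mapped.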

The simplest solution to all of this, and the one I would 
recommend, is to keep a shared library loaded but detach 
it internally. So if you mess up you are not risking a program 
crash. Just don't subvert your API that controls attachment and 
it should work safely.

This approach takes care of both read only memory (functions, 
globals, constant literals) as well as heap allocated memory.



TLDR: A shared library that allows you to borrow resources it 
owns, and borrows from another is full of failure modes that may 
not be avoidable.

All platforms worth mentioning here support some method to run 
initializers and deinitializers in your shared library after load 
and before unload with priorities. In D this can be hooked using 
the ``pragma(crt_constructor)``  and ``pragma(crt_destructor)``. 
However we do not support priorities.
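A small sketch of the two pragmas, with hypothetical function names. These hooks run outside of druntime's own initialization, so the bodies here avoid the GC and anything else that needs the runtime.

```d
import core.stdc.stdio : printf;

private __gshared bool libraryReady;

// Runs when the binary is loaded, before druntime is initialized.
pragma(crt_constructor)
extern (C) void demoLibraryInit()
{
    libraryReady = true;
}

// Runs when the binary is unloaded or the process exits.
pragma(crt_destructor)
extern (C) void demoLibraryFini()
{
    libraryReady = false;
}

void main()
{
    printf("ready before main: %s\n", libraryReady ? "yes".ptr : "no".ptr);
}
```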

Windows has some additional support of initialization callbacks 
via the ``DllMain`` function, however this will be covered in the 
sub heading ``TLS Hooking``.

When a shared library is designed to work in isolation and not 
take ownership of any resource it did not create for its own 
internal use, there should be minimal concerns surrounding its 
initialization and deinitialization, as long as they were never 
exposed to other code, nor other code exposed to it.
See my prior point in ``Unloading`` regarding handles.

On the other hand when you have a shared library similar to 
druntime that:

- Does not define its own initialization/deinitialization 
functions that are automatically run (you must explicitly run 
them).
- Owns threads that you can request, borrow and sets up its own 
internal state.
- Can be informed of threads you own, but does not allow you to 
add its internal state onto it (not necessarily required but 
there is no function that you are supposed to call to make it 
happen).
- Owns memory (GC) that you can borrow at your request.
- Borrows memory that it scans for GC memory.
- Runs other people's code (module (de)constructors, unittests, 
destructors) at potentially indeterminate times.

Every single one of these things could be the cause of your 
program's corruption. Best case scenario is a segfault, but silent 
program corruption is just as possible.



TLDR: Only Windows offers hooking of threads, which supports zero 
or more ``DllMain``'s and for druntime should be automatically 
injected.

Knowing when a thread is created or destroyed is quite useful if 
your goal is to register threads with a shared library, or to 
construct or destroy your state.

Windows has this capacity in the form of a function called 
[``DllMain``](https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain)
which maps into a section inside of the PE-COFF binary for [TLS callback
functions](https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#tls
callback-functions) and enables a compiler to provide as many hook functions as
desired to load/unload of binaries as well as on creation and destruction of
threads.

This leads to a concern about the existence of a mixin template 
in druntime called ``SimpleDllMain``. When druntime is built as a 
shared library on Windows, it'll automatically be included. 
However if you build a shared library that has druntime as a 
static library, this will not be handled for you, even though it 
could be handled without using up the ``DllMain`` function.

If we offered a 
[pragma](https://issues.dlang.org/show_bug.cgi?id=24532) to set a 
function as a TLS callback function, we could let druntime have 
its own and remove the need for ``SimpleDllMain`` entirely.

Although in the above I say only Windows supports it, in recent 
years C++ has introduced thread local variables and with that 
destructor support. This might be 
[hookable](https://issues.dlang.org/show_bug.cgi?id=23756), 
although this would not solve the on thread creation hook and for 
that reason it should be considered Windows only for the time 
being.



TLDR: The order of deinitialization can matter between sibling 
shared libraries. If you can avoid letting a sibling shared 
library borrow resources from you, you should avoid it.

Scenario: you have a shared library sitting side by side as a 
sibling to druntime, that has been told that druntime exists via 
registration (see dub's ``injectSourceFiles`` as a way to do this 
automatically) and you have your own memory allocator.

You want to tell the GC about any memory you allocate, because of 
course somebody might want to put GC memory into it and you don't 
want to let it get free'd.

So you tell the GC all about it by adding it as a range, no 
problem right? You're being a good person! And you would be 
rather mistaken when it comes time to do unloading...

See it is totally possible that your shared library gets 
deinitialized after druntime does. And of course when you 
deinitialize, you gotta tell druntime to remove those ranges! 
This is one way to get a crash deep inside of the druntime's GC 
without a way of knowing why.

Please do not ask me how I know about this, it wasn't a fun time 
to debug this one.

A workaround to this is to add an additional initialization and 
deinitialization call to druntime. This will increase the counter 
internally, and druntime will only die properly once _your_ 
deinitialization call has been made. That way all your state 
about it is gone, and all its state about you is also gone.

Note: this works with the C constructor/destructor, so this is 
running outside of the user start function.
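
The workaround can be sketched roughly as follows. This is a minimal illustration, assuming the counted ``Runtime.initialize``/``Runtime.terminate`` pair from ``core.runtime`` and ``GC.addRange``/``GC.removeRange`` from ``core.memory``. The names ``myLibInit``, ``myLibTerm``, ``pool`` and ``poolSize`` are hypothetical; in a real library the two functions would be driven from a C constructor and destructor (D offers ``pragma(crt_constructor)`` and ``pragma(crt_destructor)`` for that).

```d
import core.memory : GC;
import core.runtime : Runtime;
import core.stdc.stdlib : free, malloc;

enum size_t poolSize = 64 * 1024; // hypothetical pool size
__gshared void* pool;             // memory owned by our own allocator

/// Hypothetical start-up hook of our shared library.
extern (C) bool myLibInit()
{
    // Take our own reference on druntime. The call is counted, so the
    // runtime stays alive until the matching terminate below.
    if (!Runtime.initialize())
        return false;

    pool = malloc(poolSize);
    if (pool is null)
    {
        Runtime.terminate();
        return false;
    }

    // Let the GC scan our pool for pointers into GC memory.
    GC.addRange(pool, poolSize);
    return true;
}

/// Hypothetical shut-down hook of our shared library.
extern (C) void myLibTerm()
{
    if (pool is null)
        return;

    // druntime is still alive here thanks to our extra reference,
    // so removing the range cannot hit a dead GC.
    GC.removeRange(pool);
    free(pool);
    pool = null;

    // Drop our reference; whoever terminates last tears druntime down.
    Runtime.terminate();
}
```

The point of the design is ordering: the range is always removed before the reference is dropped, whichever binary happens to be deinitialized first.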



TLDR: If you're going to do your own threads, don't forget to 
register them with druntime and handle cyclic registration to and 
from.

So you have decided to create your own thread abstraction, you 
wrote it and it worked first time, well done! And now you have 
gotten a user to try it; the program crashed once run. The horror!

Out of pure curiosity did you register the thread and then run 
the thread initialization code for module constructors and TLS? 
Yes? Why of course you didn't, you didn't even know that druntime 
was loaded in process. See the ``Scenario: Your Own Memory 
Allocator`` section for more information on registering druntime.

Okay now that you have done it and it runs, great job!
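
For reference, registering a thread that druntime did not create could look something like the sketch below. ``thread_attachThis`` and ``thread_detachThis`` come from ``core.thread``, and druntime's documentation notes that they do not run thread-local module constructors or destructors, hence the extra ``rt_moduleTlsCtor``/``rt_moduleTlsDtor`` calls. The wrapper names ``myThreadBegin`` and ``myThreadEnd`` are hypothetical.

```d
import core.thread : thread_attachThis, thread_detachThis;

// Provided by druntime; declared here because they are not part of
// the public import surface of core.thread.
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

/// Call first thing on a thread that druntime did not create.
extern (C) void myThreadBegin()
{
    thread_attachThis(); // make the thread known to druntime and the GC
    rt_moduleTlsCtor();  // run thread-local module constructors
}

/// Call last thing on that thread, before it exits.
extern (C) void myThreadEnd()
{
    rt_moduleTlsDtor();  // run thread-local module destructors
    thread_detachThis(); // forget the thread again
}
```

Do not call these on a thread druntime already knows about, such as the main thread of a D program, or the thread-local constructors would run a second time.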

So tell me, has druntime registered its threads with you also? 
No? Curious, that you wanted to build a thread abstraction 
library but you only cared enough to write the code regarding the 
threads that _you_ wanted. Still at least no other threads are 
interacting with your code. What? That isn't the case? Oh no...

Okay so the needful has been done, you have a module constructor 
and destructor that informs you of thread creation and 
destruction by druntime. Super. But why are you getting stack 
overflows now?

See you did the most intelligent thing possible, you registered 
your thread with druntime, and druntime registered its thread 
with your abstraction. Isn't that how it's meant to be? Why yes, 
yes it is meant to be like that. Except you created a bit of a 
loop there...
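
One way to break such a loop is a thread-local "already registering" flag. The sketch below only models the mutual calls: ``druntimeSideRegister`` and ``mySideRegister`` are hypothetical stand-ins for the two registration paths, and the counters exist purely to show that each side runs once instead of recursing until the stack overflows.

```d
// Module-level variables are thread-local by default in D,
// so every thread gets its own copy of these.
private bool registering;
int druntimeSideCalls;
int mySideCalls;

/// Stand-in for druntime learning about a thread (hypothetical).
void druntimeSideRegister()
{
    ++druntimeSideCalls;
    // druntime's hook tells our abstraction about the thread...
    mySideRegister();
}

/// Entry point of our own thread abstraction (hypothetical).
void mySideRegister()
{
    // Already in the middle of registering this thread: stop here.
    if (registering)
        return;
    registering = true;
    scope (exit) registering = false;

    ++mySideCalls;
    // ...and we in turn tell druntime about it.
    druntimeSideRegister();
}
```

Without the guard, the two functions would call each other forever.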

After all that work, now it starts to work without failures, 
assuming of course you didn't mess up an implementation detail 
someplace like I did. It's always fun to have to debug code 
where an object gets deallocated and the same pointer gets 
allocated for the same thing and you wonder why the state keeps 
changing on you!



TLDR: Did you follow my advice in ``Unloading``, no? Well good 
luck with that. If you have a runtime loaded don't have 
duplicates of it, stick to a single shared library build of it.

I tried... I really did, I spent an entire day trying to write 
this section. Fact is what this section was meant to talk about 
is when multiple copies of a runtime are loaded into a process 
with no knowledge of each other.

If the rule that the owned resources of a shared library never 
cross the boundary into other people's code were followed, as I 
recommended in ``Unloading``, then this section wouldn't matter. 
But of course nobody does that, see [SDL](https://www.libsdl.org/), 
[SQLite](https://www.sqlite.org/) or should I say pretty much 
EVERY C LIBRARY IN ACTIVE USE. Oh and for anyone in doubt, how 
about that COM eh? Ya know the C++ based remote process 
communication, that uses heap allocated classes and underpins a 
pretty significant portion of the Windows shell and Microsoft 
products' extension capabilities.

Okay rant over, hopefully everyone who has made it this far can 
see that there is a risk here that I am trying to educate about.

So you have a library, a runtime of sorts. Let's call it druntime. 
This runtime owns and loans out memory, and has callbacks 
registered into it (destructors, module destructors etc.) as well 
as memory registered into it (``ModuleInfo``, ``TypeInfo``). Not 
only that but it also has system resources such as locks and 
threads that it owns and loans out to other code. Sometimes it 
even knows about system resources that other code has created 
such as threads!

So this "druntime", you build it as a shared library and you have 
multiple binaries depending upon it loaded into your process. You 
load and unload, register and unregister all correctly. No 
segfaults happen on start up and shutdown. Good job, I'm sure 
that you have followed all of my advice that I have detailed in 
the other sections of the article.

Alternatively you could have built this "druntime" into an 
executable or shared library and you end up having a mix leading 
you to have multiple copies loaded into your process. Only they 
know nothing of each other. This is unfortunately a very real 
possibility, after all where will you register your _runtime_ 
into?

Which one do you think is going to cause problems at 
indeterminate points in time?

The second of course! Okay I lie it could be either but the 
second one is almost guaranteed to result in problems that are 
impossible to debug for the novice.

Problem is each "druntime" is silo'd, it has no knowledge of the 
other, nor the ability to communicate with it. But let's say 
you did have the ability to communicate, which is a rather big if: 
have you really got all the state ready to be communicated 
between them? What happens when it is time to unload? Different 
versions bring size mismatches, behavior changes, changed fields 
etc. This of course doesn't answer questions like whose memory 
allocator do you use from that point on, who ends up owning 
threads, and how do you detect ROM that will no longer exist 
(i.e. ``TypeInfo``). You are just asking for trouble trying to 
merge them.

In ``Is a Dynamic Link Library a Shared Library?`` I explain the 
difference between a library that has been silo'd versus 
isolated. Where the latter is sandboxed and the former is merely 
ignorant of what else is in the process.

So you should accept that they are silo'd, because anything else 
is a development nightmare even if you have been successful in 
aggregating state so that it can be passed back and forth. Now 
the question becomes: have you crossed resources (even if it 
was done accidentally) that are owned by one "druntime" instance 
over to another? Of course you did, because who 
wouldn't? It's not like there is any protection from doing it. Go 
ahead, propose exploding the number of pointer types... See where 
that gets ya.

You put one bit of memory into another bit of memory with each 
being owned by a different GC, which of course doesn't know about 
the other. Naturally the memory that went into the other has no 
other references and its GC has gone ahead and collected it. Not 
long after that you accessed it, oh hey segfault! What did you 
expect? This is too easy to do by accident.

If you are going to have a runtime that has resources it owns 
exposed to other code (RAM, handles such as a thread or lock) 
don't duplicate that runtime. You are asking for trouble. Use a 
shared library for this, not a mix of static libraries with 
shared library builds of it.



TLDR: Go ahead be smart! Don't use shared libraries or static 
libraries, go import only! See how quickly you kill off that 
scope that depends on having state.

So you wanna be smart, you think that your project having _any_ 
binary is just a big ball of problems, so you're going import 
only! Well aren't you clever!

Just to clarify some things first:

- Does it have any state? Threads, locks, globals, inter-thread 
communication?
- Does it need any giant lookup tables, that should be in read 
only memory and shared throughout a process?
- Will there be any symbols that cannot be templated? Or should I 
have said will be a right pain to use if it were templated?
- Are you linking against a non-D library?

If you answered no to all of these questions, well 
congratulations you can go import only!
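
For a sense of what passes that checklist, here is a tiny sketch of import-only friendly code: a stateless templated function. The name ``clampTo`` is made up for illustration. Because it is a template with no globals, it is instantiated inside whichever binary uses it and needs nothing to be linked or exported.

```d
/// Clamp `value` into the inclusive range [lo, hi].
/// No state, no lookup tables, no non-D dependencies.
T clampTo(T)(T value, T lo, T hi)
{
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```

Every binary that calls it gets its own copy, which is harmless precisely because there is no state to disagree about.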

What? You didn't answer no to all of these questions? What are 
you trying to build, a whole new standard library or something?

Limiting yourself to import only requires you to limit your 
scope. Good bye event loops, windowing, anything asynchronous. 
While you can do these things, you will be limiting yourself 
severely enough that your code will not look familiar to others. 
So up to you, listen to my advice, use a shared library and have 
a state that can be shared or don't and put a copy into every 
binary, which might be fine if all you have is a single 
executable.

Either way, good luck with that PhobosV3 event loop whilst still 
being import only!
May 05
next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 This post is meant to be a highly enlightening and entertaining 
 explanation (or should I say it shouldn't cure anyones 
 insomnia) of just how many things can go wrong with shared 
 libraries if they are not worked with right regardless of 
 platform.

 [...]
Thanks for the write-up. It's going to take a while and probably several re-reads for me to get through this and write down some notes, but I think it's a valuable use of time.
May 07
next sibling parent reply Mike Shah <mshah.475 gmail.com> writes:
On Tuesday, 7 May 2024 at 23:50:17 UTC, Atila Neves wrote:
 [...]
Will echo Atila's comments -- thanks for taking the time to write this up! It may be nice to have a version of this on the blog, glad it's archived here at the least!
May 07
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/05/2024 2:04 PM, Mike Shah wrote:
 [...]
Thanks! I wasn't sure it would fully fit into a single N.G. post. 6500 words, 30k characters, took four days to write. I could almost write a masters thesis with this as a base!
May 07
prev sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/05/2024 11:50 AM, Atila Neves wrote:
 [...]
Thank you for saying that. I do appreciate that you are going to take the time to read it, it should be quite an interesting jumping off point for you with all the references! There is mention of your last DConf talk wrt. private, and why someone (such as myself) would not appreciate export being a visibility modifier. See ``Common Mistakes`` heading.
May 07
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Thanks for writing this.

Are you writing solely about DLLs on Windows? They don't have much in common 
with shared libraries on OSX and Posix.
May 07
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/05/2024 3:08 PM, Walter Bright wrote:
 Thanks for writing this.
I'm happy to do it. I do sincerely hope it raises enough awareness of the situations that you can get into if your builds are even slightly "interesting" that we can have this be fully solved with some form of finality.
 Are you writing solely about DLLs on Windows?
No. Although that is where I found the vast majority of problems, it isn't the source of them. There is mention of ``RPATH`` and a difference in behavior of the loader between POSIX, OSX and Linux. Porting my code base was fairly straightforward, as the only things specific to Linux I had to deal with were featured in the ``TLS Hooking`` and ``It is Loaded, Works Yes?`` headings. Everything else was just system library differences basically and applying existing solutions to known problems found on Windows.
 They don't have much in common with shared libraries on OSX and Posix.
They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``. The base level of how the linkers and loader on Windows work is still the traditional model that you are an expert in. External symbols to be found elsewhere and internal symbols found in a given binary. If this was not the case, Optlink would not have the ability to produce DLL's that still work on Windows today. Microsoft of course wasn't happy with that model and placed a bunch of extra behavior on top of it that I call tunings as part of their linker. No other platform has such extreme tunings, but others do have tunings, which is why we no longer use the traditional model at the compiler level for any platform. See LLVM's IR documentation (its referenced), ``DllImport``, ``DllExport``, ``Internal`` (there are variations of it, but we'll just simplify it down to internal).
May 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/7/2024 8:45 PM, Richard (Rikki) Andrew Cattermole wrote:
 They don't have much in common with shared libraries on OSX and Posix.
They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``.
Isn't it true that DLLs on Windows share their global data segment with all users of the DLL? While Linux shared libraries have a separate data segment for each process? This is a very major difference.
May 08
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/05/2024 7:15 PM, Walter Bright wrote:
 On 5/7/2024 8:45 PM, Richard (Rikki) Andrew Cattermole wrote:
 They don't have much in common with shared libraries on OSX and Posix.
They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``.
Isn't it true that DLLs on Windows share their global data segment with all users of the DLL? While Linux shared libraries have a separate data segment for each process? This is a very major difference.
So I was going to write out that I have no idea how you came to this conclusion and I'd love to hear the story that makes you think that this is true for an OS that is used by financial, governmental and military organizations. And then I found MSVC link's ``/SECTION`` flag. https://learn.microsoft.com/en-us/cpp/build/reference/section-specify-section-attributes?view=msvc-170 See the flag ``S``. https://www.codeproject.com/Articles/240/How-to-share-a-data-segment-in-a-DLL I haven't been able to find an equivalent Linux ld flag. Either way, this isn't the default and in no way would I ever consider that it should be used! Sounds like a big ball of insanity.
May 08
prev sibling parent Adam Wilson <flyboynw gmail.com> writes:
On Wednesday, 8 May 2024 at 07:15:20 UTC, Walter Bright wrote:
 On 5/7/2024 8:45 PM, Richard (Rikki) Andrew Cattermole wrote:
 They don't have much in common with shared libraries on OSX 
 and Posix.
They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``.
Isn't it true that DLLs on Windows share their global data segment with all users of the DLL? While Linux shared libraries have a separate data segment for each process?
I've been writing DLL's since Windows 2000, and this has not been true for the entire time I've been doing this. As far as I know, *if* this was ever true, then it hasn't been true since Windows 3.1. The global data segment is shared at the per-process level, no further. And I for one do *not* want to consider the absolute chaos that would ensue if it were any other way... <runs away in terror>.
May 08
prev sibling parent reply Gregor Mückl <gregormueckl gmx.de> writes:
On Wednesday, 8 May 2024 at 03:08:15 UTC, Walter Bright wrote:
 Thanks for writing this.

 Are you writing solely about DLLs on Windows? They don't have 
 much in common with shared libraries on OSX and Posix.
That is confusing me as well. DLLs share concepts with shared libraries on other platforms, but they have subtle differences. The ones that come to my mind:

- Shared libraries export everything by default. DLLs export nothing by default. This relates to the non-standard __declspec(dllexport) declaration supported by MSVC to mark exported symbols.
- Unix system linkers take shared libraries as input files directly. Windows linkers require import libraries. These import libraries contain thunks that jump to the real code in the DLL. Those thunks can be avoided if the compiler knows a symbol comes from a DLL. This is why __declspec(dllimport) exists in MSVC (as a performance optimization).
- DllMain() is a Windows only construct. If it is present, it is invoked for a lot of different events (PROCESS_ATTACH, THREAD_ATTACH...). Some Unix/Posix OSes support callbacks for loading/unloading libraries at most. The mechanisms are not equivalent.
- And then there are all the funny ways in which static initialization in C++ can break in combination with Unix shared libraries. There are some fun, really opaque pitfalls like static constructors getting executed multiple times (and at times when you probably wouldn't expect). I don't think the same is true on Windows.

These differences result in a number of things that are different in one model and not the other. On Unix, it's legal to have name collisions between symbols exported from different libraries. Typically, the first encountered symbol wins. This allows mechanisms like LD_PRELOAD to work and use a program with a replacement malloc() implementation, for example. There is no Windows equivalent for this. You'd have to provide a shim DLL in the search path that provides all symbols.
May 07
next sibling parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/05/2024 5:13 PM, Gregor Mückl wrote:
 On Wednesday, 8 May 2024 at 03:08:15 UTC, Walter Bright wrote:
 Thanks for writing this.

 Are you writing solely about DLLs on Windows? They don't have much in 
 common with shared libraries on OSX and Posix.
That is confusing me as well. DLLs share concepts with shared libraries on other platforms, but they have subtle differences. The ones that come to my mind: - Shared libraries export everything by default. DLLs export nothing by default. This relates to the non-standard declspec(dllexport) declaration supported by MSVC to mark exported symbols.
It is a convention on POSIX systems to export everything by default (negative annotation). On Windows you have the 64k exported symbol limit so from a practical stand point you have to go positive instead. About a year ago deadalnix told me that he thought that this was changing for some linux distros (unconfirmed) and it makes sense why the desire might be there. Anytime you export a symbol you are pinning it into existence. You are preventing both compiler and linker from performing optimizations. It also makes your binaries larger and increases your load times. Positive annotation might be a bit annoying and require you to understand how symbols are represented but using it regardless of platform is a much better default, this is something both me and Walter agree with although I am unsure what information he used to come to that conclusion so I cannot speak for him on that.
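
A minimal sketch of positive annotation in D, for illustration: only the function marked ``export`` is meant to be part of the binary's interface. The names ``mylib_add`` and ``addImpl`` are hypothetical, and it assumes the library is built with hidden default visibility (for example LDC's ``-fvisibility=hidden`` or DMD's ``-visibility=hidden``) so that unannotated symbols stay internal.

```d
/// Part of the public interface: explicitly exported.
export extern (C) int mylib_add(int a, int b)
{
    return addImpl(a, b);
}

/// Implementation detail: not exported, so the compiler and linker
/// remain free to inline or discard it.
private int addImpl(int a, int b)
{
    return a + b;
}
```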
 - Unix system linkers take shared libraries as input files directly. 
 Windows linkers require import libraries. These import libraries contain 
 thunks that jump to the real code in the DLL. Those thunks can be 
 avoided if the compiler knows a symbol comes from a DLL. This is why 
 declspec(dllimport) exists in MSVC (as a performance optimization).
That is mostly correct, but your conclusion is wrong. It's only a performance optimization for functions. For anything else you're stuck with going into ``DllImport`` mode explicitly. Such as an array that's in ROM like our ``.init`` symbol or ``TypeInfo`` instances; so being explicit about symbol modes is quite important to D, without the explicitness D simply won't load. As for externs into a DLL, on Windows it's pretty common for the exports to be missing from the DLL itself, hence you need the extra file for static linking. The tradeoffs that Microsoft picked here must have an interesting origin. I don't think it will be purely because Windows 95a was distributed on 30 floppy disks (or thereabouts, I'd have to count).
 - DllMain() is a Windows only construct. If it is present, it is invoked 
 for a lot of different events (PROCESS_ATTACH, THREAD_ATTACH...). Some 
 Unix/Posix OSes support callbacks for loading/unloading libraries at 
 most. The mechanisms are not equivalent.
I covered this in the ``TLS Hooking`` heading. But basically inside of PE-COFF there is a TLS section that allows providing as many of these functions as you like. The name might be special (as to indicate the purpose is meant for user-code not library code such as druntime), but the purpose is not special. I have no idea why POSIX hasn't added this as a feature to pthread. As far as I'm aware there is no legitimate reason why it shouldn't exist. It seems like an "ewww Microsoft did it so we won't copy their good idea" kind of thing.
 - And then there are all the funny ways in which static initialization 
 in C++ can break in combination with Unix shared libraries. There are 
 some fun, really opaque pitfalls like static constructors getting 
 executed multiple times (and at times when you probably woudldn't 
 expect). I don't think the same is true on Windows.
See ``TLS Hooking``, but one thing I did find is as part of glibc it'll hook the thread death and run all the thread destructors. Did I mention that those destructors can be run multiple times? Yeah it's a mess.
 These differences result in a number of things that are different in one 
 model and not the other. On Unix, it's legal to have name collisions 
 between symbols exported from different libraries. Typically, the first 
 encountered symbol wins. This allows mechanisms like LD_PRELOAD to work 
 and and use a program with a replacement malloc() implementation, for 
 example. There is no Windows equivalent for this. You'd have to provide 
 a shim DLL in the search path that provides all symbols.
I've done a quick look, it seems it's allowed to have duplicate symbols on Windows as well, which makes sense otherwise things like plugins wouldn't exactly work right (and could lead to failures for stuff like REPL's). https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nc-dbghelp-psymbol_registered_callback64 That function callback/struct is used as part of the Windows image introspection library for both loaded and not yet loaded binaries so it must be possible to get into that situation. As for stuff like ``LD_PRELOAD`` I don't think there is anything to prevent it from existing, it's just that Microsoft decided not to support it. In some ways its existence is a security concern so I can understand that they didn't want to implement it. Unfortunately the Windows loader is pretty badly documented, the only place I know of that documents it is the Windows Internals books and I'm a couple of versions behind (I don't remember 5 mentioning duplicate symbols).

After more reading there is something akin to ``LD_PRELOAD`` which, shock and horror, is not recommended and is disabled with secure boot enabled. https://devblogs.microsoft.com/oldnewthing/20071213-00/?p=24183
May 07
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Wednesday, 8 May 2024 at 05:13:55 UTC, Gregor Mückl wrote:
 [...]
This is also not fully correct. The Windows DLL model is also present on Aix, including having export files, and similar kind of linker features. While Aix adopted ELF later on, COFF is still quite prevalent, having evolved into XCOFF. Although no longer relevant, Symbian also used the same DLL model. Then we have Amiga Libraries, BeOS, IBM i, IBM z and Unisys ClearPath MCP, all of which are kind of their own thing. While they all might be irrelevant for D, there is a bit more to shared libraries than only POSIX vs Windows.
May 08
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/05/2024 6:39 PM, Paulo Pinto wrote:
 The Windows DLL model is also present on Aix, including having export 
 files, and similar kind of linker features. While Aix adopted ELF later 
 on, COFF is still quite prevalent, having evolved into XCOFF.
 
 Although no longer relevant, Symbian also used the same DLL model.
 
 Then we have Amiga Libraries, BeOS, IBM i, IBM z and Unisys ClearPath MCP, 
 all of which are kind of their own thing.
 
 While they all might be irrelevant for D, there is a bit more to shared 
 libraries than only POSIX vs Windows.
Indeed, pretty much all platforms will do tuning for shared libraries to make it fit their needs. Windows is an extreme example of it, and is in active use by the D community so is worthy of a lot of attention. I tried to explain that there are tunings that are used but apparently I didn't do so well at it as I don't have experience with any of them. Any additional write up you would be willing to do, I'd be happy to add it attributed.
May 09
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 This post is meant to be a highly enlightening and entertaining 
 explanation (or should I say it shouldn't cure anyones 
 insomnia) of just how many things can go wrong with shared 
 libraries if they are not worked with right regardless of 
 platform.

 [...]
There were a few things here I didn't understand; sometimes whole sentences. I don't know what "siblings shared libraries" are, and "intermediary static libraries" only made sense to me further down when I understood that it was about linking several static libraries into a dynamic one. Some questions: On everything druntime-related, how is it done in C++? C (unless freestanding) and C++ both have runtimes despite some people pretending otherwise. "you cannot tell the compiler that a module is not in your binary." - isn't this exactly what happens with `export` on a declaration (as opposed to a definition)? That is, my understanding is that `export` with no body means `dllimport` and `export` with a body means `dllexport`. Where does the need for "private but export" come from again? Is there an equivalent in C++ (`static dllexport`?), or does this only happen due to something specific to D like `T.init`? "By not exporting ModuleInfo and assuming it is available the compiler introduces a hidden dependency on a generated symbol that may not exist." - do we have an issue for that? I searched for ModuleInfo in the issues but none of them looked like a match to me.
May 09
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 10/05/2024 10:04 AM, Atila Neves wrote:
 On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew Cattermole 
 wrote:
 This post is meant to be a highly enlightening and entertaining 
 explanation (or should I say it shouldn't cure anyones insomnia) of 
 just how many things can go wrong with shared libraries if they are 
 not worked with right regardless of platform.

 [...]
There were a few things here I didn't understand; sometimes whole sentences. I don't know what "siblings shared libraries" are, and "intermediary static libraries" only made sense to me further down when I understood that it was about linking several static libraries into a dynamic one. Some questions: On everything druntime-related, how is it done in C++? C (unless freestanding) and C++ both have runtimes despite some people pretending otherwise.
You use the shared library build of it by default. You can opt into using a switch to use a static library instead, but you are on your own. This reflects my recommendations. Oh hey, similar recommendations to the ones I gave can be found here: https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=msvc-170#what-problems-exist-if-an-application-uses-more-than-one-crt-version

"When you build a release version of your project, one of the basic C runtime libraries (libcmt.lib, msvcmrt.lib, msvcrt.lib) is linked by default"

- libcmt.lib: Statically links the native CRT startup into your code.
- msvcmrt.lib: Static library for the mixed native and managed CRT startup for use with DLL UCRT and vcruntime.
- msvcrt.lib: Static library for the native CRT startup for use with DLL UCRT and vcruntime.
 "you cannot tell the compiler that a module is not in your binary." - 
 isn't this exactly what happens with `export` on a declaration (as 
 opposed to a definition)? That is, my understanding is that `export` 
 with no body means `dllimport` and `export` with a body means `dllexport`.
Keep in mind that nobody that I know of is using it to mean this today.

Also, a D module may be in binary, but it may have declarations that point to something that is not. See the bindings in druntime where it has D symbols.

https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L1191

This is why you don't pretend that one symbol being in ``DllImport`` mode means the entire module is out of binary: the module may be a binding to something else.

Note: there are plenty of examples of it which are not templated, like structs that have to be initialized using the D init array.
 Where does the need for "private but export" come from again? Is there 
 an equivalent in C++ (`static dllexport`?), or does this only happen due 
 to something specific to D like `T.init`?
It happens because D confuses exportation, which is a linker concept, with a language visibility concept. C/C++ uses a completely separate attribute to denote it, one that does not affect Member Access Control such as ``private``.

But you can see this with things like templates: you want to access an internal symbol but don't want somebody else to? Yeah no, can't do that today.

https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170

You have to be very explicit in C/C++ over this; we do not have that level of control (apart from saying do not export this symbol via ``@hidden``).
 "By not exporting ModuleInfo and assuming it is available the compiler 
 introduces a hidden dependency on a generated symbol that may not 
 exist." - do we have an issue for that? I searched for ModuleInfo in the 
 issues but none of them looked like a match to me.
Yes two. They are referenced in the article. Note: they are not duplicates. Okay I lie there is a bunch more.

- https://issues.dlang.org/show_bug.cgi?id=23850
- https://issues.dlang.org/show_bug.cgi?id=23177
- https://issues.dlang.org/show_bug.cgi?id=23974
- https://issues.dlang.org/show_bug.cgi?id=6019
- https://issues.dlang.org/show_bug.cgi?id=9816

Here is my workaround code to get around this problem: https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/moduleinfostubs.d

Not something I am proud of needing.
May 09
parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 10 May 2024 at 02:05:18 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 10/05/2024 10:04 AM, Atila Neves wrote:
 On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 This post is meant to be a highly enlightening and 
 entertaining explanation (or should I say it shouldn't cure 
 anyones insomnia) of just how many things can go wrong with 
 shared libraries if they are not worked with right regardless 
 of platform.

 [...]
 "you cannot tell the compiler that a module is not in your 
 binary." - isn't this exactly what happens with `export` on a 
 declaration (as opposed to a definition)? That is, my 
 understanding is that `export` with no body means `dllimport` 
 and `export` with a body means `dllexport`.
Keep in mind that nobody that I know of is using it to mean this today.
Maybe the recommendation should then be that they should? Doesn't the point still stand that "you cannot tell the compiler that a module is not in your binary" isn't actually true? I saw in one issue where there was a problem with variable declaration though, where dllimport/dllexport was determined by the presence or not of an initialiser, which is... yuck.
 Also a D module may be in binary, but it may have declarations 
 that point to something that is not.

 See the bindings in druntime where it has D symbols.

 https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L1191
The relevant code is:

    private extern shared FILE[3] __sF;
    @property auto stdin()() { return &__sF[0]; }

`__sF` is declared extern, i.e., not in binary. I don't understand what the issue would be?
 This is why you don't pretend that one symbol in ``DllImport`` 
 mode means the entire module is because it may be a binding to 
 something else.
Why would the entire module be dllimport?
 Note: there are plenty of examples of it which are not 
 templated, like structs that have to be initialized using the D 
 init array.
Yes, but I don't know how this is related to the above.
 Where does the need for "private but export" come from again? 
 Is there an equivalent in C++ (`static dllexport`?), or does 
 this only happen due to something specific to D like `T.init`?
It happens because D confuses exportation which is a linker concept, with a language visibility concept.
My question is: when would I want to export a private symbol?
 In C/C++ it uses a completely separate attribute to denote that 
 it does not affect Member Access Control such as ``private``.
On Windows, C/C++ compilers use a non-standard extension to do so, but yes. AFAIK (and I could well be wrong), one can't dllexport something that's static? You'd have to put it in a header, and it'd be compiled into the current translation unit anyway, so I also don't understand why you'd want to.
 But you can see this with things like templates, you want to 
 access an internal symbol but don't want somebody else to? Yeah 
 no, can't do that today.

 https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170
Assuming the link is supposed to elucidate the template comment, I don't understand the relevance. Otherwise, what does "you want to access an internal symbol but don't want somebody else to" mean?
 You have to be very explicit in C/C++ over this, we do not have 
 that level of control (apart from saying do not export this 
 symbol via ``@hidden``).
This could mean several things. We have the control over individual symbols that are actually in the source code with `export`. Is the comment above about things like T.init?
 "By not exporting ModuleInfo and assuming it is available the 
 compiler introduces a hidden dependency on a generated symbol 
 that may not exist." - do we have an issue for that? I 
 searched for ModuleInfo in the issues but none of them looked 
 like a match to me.
Yes two. They are referenced in the article. Note: they are not duplicates. Okay I lie there is a bunch more.
Thanks! On a somewhat related note, we use dlls at work and seem to have fixed "everything" by using ldc and `-fvisibility=hidden -dllimport=defaultLibsOnly`, as well as `-link-defaultlib-shared`.
May 10
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 11/05/2024 1:27 AM, Atila Neves wrote:
 "you cannot tell the compiler that a module is not in your binary." - 
 isn't this exactly what happens with `export` on a declaration (as 
 opposed to a definition)? That is, my understanding is that `export` 
 with no body means `dllimport` and `export` with a body means 
 `dllexport`.
Keep in mind that nobody that I know of is using it to mean this today.
Maybe the recommendation should then be that they should? Doesn't the point still stand that "you cannot tell the compiler that a module is not in your binary" isn't actually true? I saw in one issue where there was a problem with variable declaration though, where dllimport/dllexport was determined by the presence or not of an initialiser, which is... yuck.
"you cannot tell the compiler that a module is not in your binary" is true: there is no syntax or CLI flag to do this today. Note: this is about the entire module including its metadata (``ModuleInfo``), not just the user-written symbols.

There probably should be syntax on the module to specify that it is out of binary. However, that would not be suitable for the majority of programmers, and it is most definitely not suitable for build managers, as they would have to modify or create the files. That could come later once we have more experience with reliable support.

But yes, having an initializer or not should not determine the symbol mode. I did my best to clean it up in a way that would keep things simple and not break the world.

This is why I've simplified things down to:

- Use ``export`` + ``extern`` to go into ``DllImport`` mode.
- Or rely on the external import path switch to set the ``extern`` for the majority of users. Which is ideal for things like the di generator or build managers ;)
- Or use the dllimport override switch to set all symbols found from a module that is known to be out of binary as ``DllImport`` (helps with mixing some imports being in binary and some out).
 Also a D module may be in binary, but it may have declarations that 
 point to something that is not.

 See the bindings in druntime where it has D symbols.

 https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L1191
The relevant code is:

    private extern shared FILE[3] __sF;
    @property auto stdin()() { return &__sF[0]; }

`__sF` is declared extern, i.e., not in binary. I don't understand what the issue would be?
Yeah that's not the best example, but keep in mind that symbol is not in ``DllImport`` mode, it's ``Internal``. And that right there is a problem.

If all those symbols were declared using positive annotation (like they should be), then anything that has D symbols, like the structs such as:

https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L872

would also be affected if the following was applied.
 This is why you don't pretend that one symbol in ``DllImport`` mode 
 means the entire module is because it may be a binding to something else.
Why would the entire module be dllimport?
Walter came up with this idea a while back, so I'm a tad defensive towards it.

https://github.com/dlang/dmd/pull/15298

Martin and I had to really fight Walter on it, as it would mean breaking people's builds in incredibly frustrating ways.

The problem is that once you start making assumptions about whether a module is in or out of your binary without direct instruction by the user, you can mess up the dependencies between modules at the minimum. So if you depend on another module, it may not be initialized before you.

It needs to be explicit, or it's going to ruin someone's day pretty quickly (hence the external import path switch).

Here is a binding module that is compiled into your executable.

```d
module binding;

export extern void otherFunction();

shared static this() {
    import std.stdio;
    writeln("binding");
}
```

```d
module app;
import binding;

void main() {
    otherFunction();
}

shared static this() {
    import std.stdio;
    writeln("main");
}
```

``$ dmd main.d binding.d someImport.lib``

With Walter's idea the ``app`` module constructor could be called before ``binding``'s, which absolutely should not happen.
 Note: there are plenty of examples of it which are not templated, like 
 structs that have to be initialized using the D init array.
Yes, but I don't know how this is related to the above.
If you were to use Walter's approach of deciding that a module is out of binary simply because it contains a symbol in ``DllImport`` mode, and then applied that to all declarations (as if the dllimport override switch was set to ``DllImport`` everything), it would cause link errors.

Basically a module can have some symbols out of binary whilst the rest are in. So you can't make this sort of assumption.

This is an unfortunate side effect of following Walter's idea to its natural conclusion. I get a bit unhappy at any suggestion of inferring out of binary status; it's pain in waiting.
 Where does the need for "private but export" come from again? Is 
 there an equivalent in C++ (`static dllexport`?), or does this only 
 happen due to something specific to D like `T.init`?
It happens because D confuses exportation which is a linker concept, with a language visibility concept.
My question is: when would I want to export a private symbol?
Q1: Should a given symbol be private, yes? Q2: Does any template use the previous symbol? A: That is why you would export a private symbol.

Everything in here needs to be exported and should have package visibility so that it is not available for anyone else to access: https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/console/internal/rawwrite.d

See some usage here: https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/console/internal/writer.d#L70

I seem to be severely lacking in my ability to explain that limiting templates to accessing only public things is not a good idea. This has been a bit of an annoyance for me; it flies in the face of all my interests in program security by introducing uncertainty that has no reason to exist.

In practice it means people can go and mess around with your internal state without any language protection. You are better off using negative annotation and never touching positive. It's safer. A lot safer.

Needless to say they are in a few places in my code base:

- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/rawread.d#L126
- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/prettyprint.d#L67
- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/path/file.d#L1012
- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/logger.d#L447

Loggers and console read/write are where I have hit it. But I'm sure you could come up with other examples of where you don't want others directly touching your _internal_ symbols, but still need your code, which may be compiled into another binary, to touch them.
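To make the Q1/Q2 point concrete, here is a minimal single-file sketch. All names in it (``console_sketch``, ``rawWrite``, ``writeValue``, ``totalBytes``) are made up for illustration and are not taken from the Sidero code linked above. It only shows the shape of the problem: a public template whose body reaches into private state.

```d
module console_sketch;

// Internal state and helper: callers should never touch these directly.
private size_t bytesWritten;

private void rawWrite(size_t byteCount) @safe nothrow @nogc
{
    bytesWritten += byteCount;
}

// Public template. Its body is compiled into whichever binary
// instantiates it, so that binary must be able to link against
// rawWrite even though rawWrite is private at the language level.
void writeValue(T)(T value) @safe nothrow @nogc
{
    rawWrite(T.sizeof);
}

// Public accessor so the effect can be observed.
size_t totalBytes() @safe nothrow @nogc
{
    return bytesWritten;
}

void main()
{
    writeValue(42);  // instantiates writeValue!int
    writeValue(1.5); // instantiates writeValue!double
    assert(totalBytes() == int.sizeof + double.sizeof);
}
```

Built as one executable this works without any annotation. The trouble described above only starts once ``console_sketch`` lives in a shared library and ``writeValue`` is instantiated from a different binary: at that point ``rawWrite`` has to be exported for linking, while still staying private for visibility.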
 In C/C++ it uses a completely separate attribute to denote that it 
 does not affect Member Access Control such as ``private``.
On Windows, C/C++ compilers use a non-standard extension to do so, but yes. AFAIK (and I could well be wrong), one can't dllexport something that's static? You'd have to put it in a header, and it'd be compiled into the current translation unit anyway, so I also don't understand why you'd want to.
Yes, the extension is https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170

For D, apart from maybe metadata, only templates can go into another binary.
 But you can see this with things like templates, you want to access an 
 internal symbol but don't want somebody else to? Yeah no, can't do 
 that today.

 https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170
Assuming the link is supposed to elucidate the template comment, I don't understand the relevance. Otherwise, what does "you want to access an internal symbol but don't want somebody else to" mean?
```d
module database_access;

@safe:

void doAThing(T)() {
    iCanKillYourDatabase(false, T.sizeof);
}

/*private:*/
export:

void iCanKillYourDatabase(bool doWrongThing = true, size_t sizeOfThing) {
    if (doWrongThing)
        database.corruptSilently();
}
```

A bit dramatic (due to unrealistic nature), but that should get the point across that ``iCanKillYourDatabase`` should be private but also exported.

The kernel might not stop you from doing a bad thing, but D should be making it a lot harder by making ``iCanKillYourDatabase`` private.

Note: you can of course still gain access to it via ``dlopen``, but that is not ``@safe`` code and you would need to gain access to the symbol name before you could do that.
 You have to be very explicit in C/C++ over this, we do not have that 
 level of control (apart from saying do not export this symbol via 
 ``@hidden``).
This could mean several things. We have the control over individual symbols that are actually in the source code with `export`. Is the comment above about things like T.init?
For generated symbols like ``T.init``, ``opCmp``, ``ModuleInfo`` etc., we have zero control over these currently. Either you use a linker script or you use negative annotation. Everyone I know of uses negative annotation (although I support both).

For other symbols we can choose not to export on a per-symbol basis, and if we want to export a symbol and have it be public then we can use the ``export`` keyword.

In C/C++ land, you control whether it's ``DllExport`` or ``DllImport`` with an attribute directly. With my DIP you would use ``export`` and ``export`` with ``extern`` to denote each.
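As a rough illustration of positive annotation, here is a small sketch. The module and function names (``mathlib_sketch``, ``add``, ``helper``) are hypothetical, and the consuming-side declaration is shown only as a comment because it follows the scheme described above rather than something you can link without the actual shared library.

```d
module mathlib_sketch;

// Positive annotation: only what is explicitly marked `export` is meant
// to leave the binary (DllExport when built as a shared library).
export int add(int a, int b) @safe pure nothrow @nogc
{
    return a + b;
}

// Not marked: intended to stay internal when hidden-by-default
// visibility is in effect.
int helper(int a) @safe pure nothrow @nogc
{
    return a * 2;
}

// On the consuming side, the scheme described above would declare the
// same function without a body, for example in a .di file:
//
//     export extern int add(int a, int b); // DllImport mode
//
// It stays a comment here because it needs the shared library at link time.

void main()
{
    assert(add(2, 3) == 5);
    assert(helper(4) == 8);
}
```

Note what the sketch cannot express: the compiler-generated symbols for this module (``ModuleInfo`` and friends) have no place to hang an ``export`` on, which is the lack of control mentioned above.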
 "By not exporting ModuleInfo and assuming it is available the 
 compiler introduces a hidden dependency on a generated symbol that 
 may not exist." - do we have an issue for that? I searched for 
 ModuleInfo in the issues but none of them looked like a match to me.
Yes two. They are referenced in the article. Note: they are not duplicates. Okay I lie there is a bunch more.
Thanks! On a somewhat related note, we use dlls at work and seem to have fixed "everything" by using ldc and `-fvisibility=hidden -dllimport=defaultLibsOnly`, as well as `-link-defaultlib-shared`.
``-link-defaultlib-shared`` sets the druntime to be a shared library (which let's face it should be the default).

``-dllimport=defaultLibsOnly`` makes all symbols for druntime and phobos default to ``DllImport``, but none others.

As for ``-fvisibility=hidden``, that would imply that you are using positive annotation in every code base. Which is curious considering Martin's prior work has been the exact opposite of this.

I don't know enough details of Symmetry's projects or how they are laid out to comment about them beyond the tidbits I get. Switching from negative to positive annotation would be a massive undertaking, so I am having trouble reconciling the notion that you have switched to positive annotation with what I know.

Perhaps the one you are referring to is a plugin with a known fixed public API? That API being positively annotated, with everything else that is compiled in being hidden, would make sense.
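For contrast with ``-fvisibility=hidden`` plus ``export``, here is a small sketch of negative annotation. It assumes LDC's ``@hidden`` attribute from ``ldc.attributes``; for other compilers the sketch falls back to a do-nothing placeholder so that it still compiles. The function names are made up.

```d
module plugin_sketch;

version (LDC)
{
    import ldc.attributes : hidden; // LDC's counterpart to `export`
}
else
{
    enum hidden; // placeholder UDA so the sketch still compiles elsewhere
}

// Negative annotation: the build exports everything by default, and only
// the internals are opted out one by one.
int publicApi(int x) @safe pure nothrow @nogc
{
    return internalDetail(x) + 1;
}

// Opted out of exporting; still callable from inside this binary.
@hidden int internalDetail(int x) @safe pure nothrow @nogc
{
    return x * 2;
}

void main()
{
    assert(publicApi(3) == 7);
}
```

The plugin case mentioned above is the mirror image: a handful of ``export`` symbols form the fixed API and the visibility switch hides the rest.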
May 10
parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 10 May 2024 at 15:21:15 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 11/05/2024 1:27 AM, Atila Neves wrote:
 "you cannot tell the compiler that a module is not in your 
 binary." - isn't this exactly what happens with `export` on 
 a declaration (as opposed to a definition)? That is, my 
 understanding is that `export` with no body means 
 `dllimport` and `export` with a body means `dllexport`.
Keep in mind that nobody that I know of is using it to mean this today.
Maybe the recommendation should then be that they should? Doesn't the point still stand that "you cannot tell the compiler that a module is not in your binary" isn't actually true? I saw in one issue where there was a problem with variable declaration though, where dllimport/dllexport was determined by the presence or not of an initialiser, which is... yuck.
"you cannot tell the compiler that a module is not in your binary" is true, there is no syntax or cli flag to do this today.
Correct, sorry, I thought we were talking about symbols and somehow missed "module". The question would then be: why would someone want to tell the compiler that an entire module is out of binary?
 But yes having an initializer or not, should not determine the 
 symbol mode. I did my best to clean it up in a way that would 
 keep things simple and not break the world.

 This is why I've simplified things down to:

 Use ``export`` + ``extern`` to go into ``DllImport`` mode.
Makes sense.
 Or rely on the external import path switch to set the 
 ``extern`` for the majority of users. Which is ideal for things 
 like the di generator or build managers ;)
Do you mean "build systems"?
 Or use the dllimport override switch to set all symbols found 
 from a module that is known to be out of binary as 
 ``DllImport`` (helps with mixing some imports being in binary 
 and some out).

 Yeah that's not the best example, but keep in mind that symbol 
 is not in ``DllImport`` mode, its ``Internal``.

 And that right there is a problem.
And wouldn't the solution be to add `export`?
 Why would the entire module be dllimport?
Walter came up with this idea a while back, so I'm a tad defensive towards it. https://github.com/dlang/dmd/pull/15298
I went through all of that and am still confused as to whether you want or don't want whole modules to be declared out of binary, or why one would want to do that.
 With Walter's idea the ``app`` module constructor could be 
 called before ``binding``'s does, which absolutely should not 
 happen.
I don't know why not, nor did I understand the relevance of the example.
 Basically a module can have some symbols out of binary whilst 
 the rest are in. So you can't make this sort of assumption.
Ok.
 My question is: when would I want to export a private symbol?
Q1: Should a given symbol be private, yes? Q2: Does any template use the previous symbol? A: That is why you would export a private symbol.
Can't we do what C++ does and stick the private symbol where the template is being instantiated?
 ```d
 module database_access;

 @safe:

 void doAThing(T)() {
 	iCanKillYourDatabase(false, T.sizeof);
 }


 /*private:*/
 export:

 void iCanKillYourDatabase(bool doWrongThing = true, size_t 
 sizeOfThing) {
 	if (doWrongThing)
 		database.corruptSilently();
 }
 ```

 A bit dramatic (due to unrealistic nature), but that should get 
 the point across that ``iCanKillYourDatabase`` should be 
 private but also exported.
Inline it instead? The code is right there.
 For generated symbols like ``T.init``, ``opCmp``, 
 ``ModuleInfo`` etc., we have zero control over these currently. 
 Either you use a linker script or you use negative annotation. 
 Everyone I know of uses negative annotation (although I support 
 both).
What would the solution be?
 "By not exporting ModuleInfo and assuming it is available 
 the compiler introduces a hidden dependency on a generated 
 symbol that may not exist." - do we have an issue for that? 
 I searched for ModuleInfo in the issues but none of them 
 looked like a match to me.
Yes two. They are referenced in the article. Note: they are not duplicates. Okay I lie there is a bunch more.
Thanks! On a somewhat related note, we use dlls at work and seem to have fixed "everything" by using ldc and `-fvisibility=hidden -dllimport=defaultLibsOnly`, as well as `-link-defaultlib-shared`.
``-link-defaultlib-shared`` sets the druntime to be a shared library (which lets face it should be the default).
Given how much people go on about how great Go is because it links statically (even though C/C++ have been able to do this for basically forever if you opt-in), I'm not sure of that.
 ``-dllimport=defaultLibsOnly`` all symbols for druntime and 
 phobos are defaulting to ``DllImport`` but none others.

 As for ``-fvisibility=hidden``, that would imply that you are 
 using positive annotation in every code base. Which is curious 
 considering Martin's prior work has been the exact opposite to 
 this.
I grepped for `export` and it appears in quite a few places as expected.
 Perhaps the one you are referring to is a plugin with a known 
 fixed public API? That being positive annotation and everything 
 else being compiled in being hidden would make sense.
I think that's what we're doing yes, and that the fact that we are is a good thing.
May 14
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 15/05/2024 6:19 AM, Atila Neves wrote:
 On Friday, 10 May 2024 at 15:21:15 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 11/05/2024 1:27 AM, Atila Neves wrote:
 "you cannot tell the compiler that a module is not in your binary." 
 - isn't this exactly what happens with `export` on a declaration 
 (as opposed to a definition)? That is, my understanding is that 
 `export` with no body means `dllimport` and `export` with a body 
 means `dllexport`.
Keep in mind that nobody that I know of is using it to mean this today.
Maybe the recommendation should then be that they should? Doesn't the point still stand that "you cannot tell the compiler that a module is not in your binary" isn't actually true? I saw in one issue where there was a problem with variable declaration though, where dllimport/dllexport was determined by the presence or not of an initialiser, which is... yuck.
"you cannot tell the compiler that a module is not in your binary" is true, there is no syntax or cli flag to do this today.
Correct, sorry, I thought we were talking about symbols and somehow missed "module". The question would then be: why would someone want to tell the compiler that an entire module is out of binary?
1. Metadata (ModuleInfo, TypeInfo, RTInfo)
2. It is almost certainly going to be correct and ``-I`` is almost certainly at play, therefore we can leverage this information to turn ``export`` for positive annotation into ``DllImport``.

Can you spot the problem with this: ``[void*]`` vs ``[void*, void**, void*]``

Metadata quite literally is incapable of crossing the DLL boundary without knowing if it's in or out of binary. It prevents linking. If it does link, that pointer could very well be pointing at a jump instruction. Not exactly a fun day if you want to debug it.
 But yes having an initializer or not, should not determine the symbol 
 mode. I did my best to clean it up in a way that would keep things 
 simple and not break the world.

 This is why I've simplified things down to:

 Use ``export`` + ``extern`` to go into ``DllImport`` mode.
Makes sense.
This also plays into the external import path switch: since we know that is almost certainly correct, adding the ``extern`` on makes for a very clean, simple solution for positive annotation!
 Or rely on the external import path switch to set the ``extern`` for 
 the majority of users. Which is ideal for things like the di generator 
 or build managers ;)
Do you mean "build systems"?
They are interchangeable at this level in my mind, but yes.
 Or use the dllimport override switch to set all symbols found from a 
 module that is known to be out of binary as ``DllImport`` (helps with 
 mixing some imports being in binary and some out).

 Yeah that's not the best example, but keep in mind that symbol is not 
 in ``DllImport`` mode, its ``Internal``.

 And that right there is a problem.
And wouldn't the solution be to add `export`?
Both druntime and PhobosV2 will remain on negative annotation for the foreseeable future. The amount of work to convert them to positive is quite significant because it's not just slapping export on things, you also have to test it. Now do the rest of the ecosystem but with people who don't know what they are doing.
 Why would the entire module be dllimport?
Walter came up with this idea a while back, so I'm a tad defensive towards it. https://github.com/dlang/dmd/pull/15298
I went through all of that and am still confused as to whether you want or don't want whole modules to be declared out of binary, or why one would want to do that.
Marking a module as out of binary is what that PR does, by deriving it from any symbol being in DllImport mode. We need to know whether a module is out of binary (see the top of this comment especially), and that PR will give false positives if you use positive annotation, due to the derivation.
 With Walter's idea the ``app`` module constructor could be called 
 before ``binding``'s does, which absolutely should not happen.
I don't know why not, nor did I understand the relevance of the example.
It shows that shared libraries weren't involved, yet because a module was derived as out of binary things didn't work correctly.
 My question is: when would I want to export a private symbol?
Q1: Should a given symbol be private, yes? Q2: Does any template use the previous symbol? A: That is why you would export a private symbol.
Can't we do what C++ does and stick the private symbol where the template is being instantiated?
So you want even more global state? And why do I get the feeling of shifting sands here?
 ```d
 module database_access;

 @safe:

 void doAThing(T)() {
     iCanKillYourDatabase(false, T.sizeof);
 }


 /*private:*/
 export:

 void iCanKillYourDatabase(bool doWrongThing = true, size_t sizeOfThing) {
     if (doWrongThing)
         database.corruptSilently();
 }
 ```

 A bit dramatic (due to unrealistic nature), but that should get the 
 point across that ``iCanKillYourDatabase`` should be private but also 
 exported.
Inline it instead? The code is right there.
What if it's accessing global state (and perhaps giving you access to it via callback)? What if it is global state instead of a function? What makes you think that it can be inlined?
 For generated symbols like ``T.init``, ``opCmp``, ``ModuleInfo`` etc., 
 we have zero control over these currently. Either you use a linker 
 script or you use negative annotation. Everyone I know of uses 
 negative annotation (although I support both).
What would the solution be?
Some cannot be, like ``ModuleInfo``; others have to be exported based upon whether other things in the encapsulation are exported. It's either that or we export literally everything inside of the encapsulation unit (I don't like that at all). Or we invent new syntax... Again not a fan.
 "By not exporting ModuleInfo and assuming it is available the 
 compiler introduces a hidden dependency on a generated symbol that 
 may not exist." - do we have an issue for that? I searched for 
 ModuleInfo in the issues but none of them looked like a match to me.
Yes two. They are referenced in the article. Note: they are not duplicates. Okay I lie there is a bunch more.
Thanks! On a somewhat related note, we use dlls at work and seem to have fixed "everything" by using ldc and `-fvisibility=hidden -dllimport=defaultLibsOnly`, as well as `-link-defaultlib-shared`.
``-link-defaultlib-shared`` sets the druntime to be a shared library (which lets face it should be the default).
Given how much people go on about how great Go is because it links statically (even though C/C++ have been able to do this for basically forever if you opt-in), I'm not sure of that.
The way I view it is thus:

- If you use D shared libraries with a static runtime, you're going to have your program have indeterminate behavior.
- On the other hand, if you don't use shared libraries with a shared runtime it works.

In the latter you will need to copy the druntime/phobos shared library, but hey, the system loader will tell you if you forgot! In the former there is no warning: it will happily do the wrong thing, and it might not even crash, it could just corrupt data instead.

So from my perspective it's better for static runtime/phobos builds to be opt-in, used when you know you don't need the shared build, than the opposite.
May 14
parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 14 May 2024 at 22:13:19 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 15/05/2024 6:19 AM, Atila Neves wrote:
 [...]
1. Metadata (ModuleInfo, TypeInfo, RTInfo)
These are whats, not whys.
 2. It is almost certainly going to be correct
What is most certainly correct?
 and ``-I`` is almost certainly at play, therefore we can 
 leverage this information to turn ``export`` for positive 
 annotation into ``DllImport``.
We already have a mechanism to do this.
 Can you spot the problem with this: ``[void*]`` vs ``[void*, 
 void**, void*]``
No, but I don't understand the question.
 Metadata quite literally is incapable of crossing the DLL 
 boundary without knowing if its in or out of binary. It 
 prevents linking.
What prevents linking?
 [...]
This also plays into the external import path switch, since we know that is almost certainly correct, adding the extern on make a very clean simple solution for positive annotation!
I don't see how `extern` is related to a potential external import path switch.
 [...]
Both druntime and PhobosV2 will remain in using negative annotation for the foreseeable future. The amount of work to convert that to positive is quite significant because its not just slapping export on things, you also have to test it.
`export:`? What needs to be tested?
 Now do the rest of the ecosystem but with people who don't know 
 what they are doing.
I think it's unlikely that most projects would need to work in dlls.
 [...]
So you want even more global state?
?
 [...]
What if its accessing global state (and perhaps giving you access to it via callback)? What if it is global state instead of a function? What makes you think that it can be inlined?
C++.
 [...]
Some cannot be like ``ModuleInfo``, others have be exported based upon if other things in the encapsulation are exported.
What's an "encapsulation"?
 Its either that or we export literally everything inside of the 
 encapsulation unit (I don't like that at all).
?
 [...]
The way I view it is as thus: - If you use D shared libraries with a static runtime, you're going to have your program have indeterminate behavior.
Which doesn't seem to be what happens most often.
 - On the other hand, if you don't use shared libraries with a 
 shared runtime it works.
It doesn't seem this is what most people want.
 So from my perspective its better to be opt-in for static 
 runtime/phobos builds if you know you don't need it, than the 
 opposite.
Again, the raves from people new to Go would suggest otherwise. So would the lack of requests for this in D.
May 15
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/05/2024 5:30 AM, Atila Neves wrote:
 On Tuesday, 14 May 2024 at 22:13:19 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 15/05/2024 6:19 AM, Atila Neves wrote:
 [...]
1. Metadata (ModuleInfo, TypeInfo, RTInfo)
These are whats, not whys.
As always, this entire thread is all about why things do not link, load or run. All three of those are examples of things that because they exist do not link, load or run at runtime correctly. These files should not exist:
- https://github.com/Project-Sidero/basic_memory/blob/main/msvc_exports.def
- https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/dllmain.d
- https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/moduleinfostubs.d
- https://github.com/Project-Sidero/eventloop/blob/master/source/sidero/eventloop/rtinfoimplstub.d
- https://github.com/Project-Sidero/image/blob/master/source/sidero/image/rtinfoimplstub.d
- https://github.com/Project-Sidero/image/blob/master/source/sidero/image/dllmain.d
 2. It is almost certainly going to be correct
What is most certainly correct?
If a module is found via a path provided to the external import path switch to be out of binary.
 and ``-I`` is almost certainly at play, therefore we can leverage this 
 information to turn ``export`` for positive annotation into 
 ``DllImport``.
We already have a mechanism to do this.
Ugh no? Not for positive annotation we do not.
 Can you spot the problem with this: ``[void*]`` vs ``[void*, void**, 
 void*]``
No, but I don't understand the question.
ModuleInfo goes into an array during loading. For ones in binary that is an array of them. Because we can only bind to the jmp wrapper function, or to the DllImport global pointer /if/ we know that it is in another binary, we end up with an array that has both ModuleInfo's and pointers to ModuleInfo's. There is no way to reliably distinguish between the two.
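A minimal sketch of that mixed array, under loud assumptions: `FakeModuleInfo`, `Entry`, `importSlot` and `resolve` are hypothetical names made up for illustration, not druntime's real `ModuleInfo` layout or registration code. It only shows that the entry which is really a `void**` cannot be read correctly unless something outside the raw pointer (here a flag) says so.

```d
// Hypothetical stand-in for a metadata record; NOT druntime's ModuleInfo layout.
struct FakeModuleInfo
{
    string name;
}

// Two records that live "in binary" and one we pretend lives in another binary.
__gshared FakeModuleInfo inBinaryA = FakeModuleInfo("app.a");
__gshared FakeModuleInfo inBinaryB = FakeModuleInfo("app.b");
__gshared FakeModuleInfo outOfBinary = FakeModuleInfo("lib.c");

// Simulated import slot: out of binary data is only reachable through a pointer.
__gshared FakeModuleInfo* importSlot = &outOfBinary;

// One table entry plus the out-of-band fact the raw pointer cannot carry.
struct Entry
{
    void* ptr;
    bool indirect; // true: ptr is really a FakeModuleInfo**
}

Entry[] buildTable()
{
    // The raw view is [void*, void**, void*]; only the flag tells them apart.
    return [
        Entry(&inBinaryA, false),
        Entry(&importSlot, true),
        Entry(&inBinaryB, false),
    ];
}

// Follows the extra level of indirection only when told to.
FakeModuleInfo* resolve(Entry e)
{
    return e.indirect
        ? *cast(FakeModuleInfo**) e.ptr
        : cast(FakeModuleInfo*) e.ptr;
}

string[] names(Entry[] table)
{
    string[] result;
    foreach (e; table)
        result ~= resolve(e).name;
    return result;
}
```

With the flag present, `names(buildTable())` yields `["app.a", "lib.c", "app.b"]`. Drop the flag and the middle entry has to be guessed at, which is the situation described above.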
 Metadata quite literally is incapable of crossing the DLL boundary 
 without knowing if its in or out of binary. It prevents linking.
What prevents linking?
Only functions can work across the DLL boundary; read-only (ROM) data like arrays, or metadata like ModuleInfo, cannot cross directly. If you know it is out of binary then you can perform a load against ``ModuleInfo*``, but you cannot patch against ``ModuleInfo``. It's not like how it is when it's all in binary. This is not something any user of D should be needing to attempt a workaround for, and **I have working ones** puke.
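To make the function versus data difference concrete, here is a rough single-process simulation. Everything in it is an assumption for illustration: the `imp*` variables stand in for import table slots and `simulateLoad` stands in for the system loader; no real DLL is involved.

```d
// Simulated "other binary": one exported function and one exported global.
int libAnswer() { return 7; }
__gshared int libCounter = 42;

// Simulated import table slots, which a real loader would fill at load time.
__gshared int function() impLibAnswer;
__gshared int* impLibCounter;

// Stand-in for the system loader resolving imports.
void simulateLoad()
{
    impLibAnswer = &libAnswer;
    impLibCounter = &libCounter;
}

// Functions: a thunk with the original signature hides the indirection,
// so a caller that does not know the symbol is out of binary still works.
int libAnswerThunk() { return impLibAnswer(); }

// Data: there is nothing to jump through. The reader itself must be compiled
// to load through the slot, which requires knowing the data is out of binary.
int readCounter() { return *impLibCounter; }
```

The thunk keeps the caller unaware of the boundary, while `readCounter` only works because it was written knowing that the value sits behind a pointer.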
 [...]
This also plays into the external import path switch, since we know that is almost certainly correct, adding the extern on make a very clean simple solution for positive annotation!
I don't see how `extern` is related to a potential external import path switch.
Simple, it adds it. By adding it you convert an ``export`` to ``export`` with ``extern`` which the DIP defines as ``DllImport``. An ``export`` by itself would remain in ``internal`` mode. This is desirable when you are doing multiple step builds, such as incremental compilation or ya know using static libraries like dub requires today.
 [...]
Both druntime and PhobosV2 will remain in using negative annotation for the foreseeable future. The amount of work to convert that to positive is quite significant because its not just slapping export on things, you also have to test it.
`export:`? What needs to be tested?
Anytime you convert a module from negative annotation to positive, everything has to be tested to verify that it still links, loads and runs correctly.
 Now do the rest of the ecosystem but with people who don't know what 
 they are doing.
I think it's unlikely that most projects would need to work in dlls.
It doesn't matter, it only takes one project that depends upon another for the dependency to have to care about it.
 [...]
So you want even more global state?
?
In the above you were promoting duplicating global state as a solution instead of letting the user mark something that is private as exported. It's an insane idea. Imagine there being multiple different mallocs inside each shared library! Lots more RAM usage, and plenty of frees not freeing because the state wasn't shared. And that assumes you even succeed at making that link at all.
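A toy model of the "free not freeing" case, as a sketch only: `ToyAllocator` is a hypothetical type, and each instance stands in for one binary's private copy of state that should have existed once.

```d
// Hypothetical toy allocator; each instance models one binary's private copy
// of what should have been a single piece of global state.
struct ToyAllocator
{
    bool[size_t] live; // handles handed out by this copy
    size_t next = 1;

    size_t allocate()
    {
        immutable handle = next++;
        live[handle] = true;
        return handle;
    }

    // Returns false when this copy never handed out the handle.
    bool release(size_t handle)
    {
        return live.remove(handle);
    }

    size_t outstanding() const { return live.length; }
}

// Duplicated state: allocate in the library's copy, release in the executable's.
bool releaseAcrossCopies()
{
    ToyAllocator exeCopy, dllCopy;
    immutable handle = dllCopy.allocate();
    return exeCopy.release(handle); // the wrong copy was asked
}

// Unified state: the same copy allocates and releases.
bool releaseShared()
{
    ToyAllocator sharedCopy;
    immutable handle = sharedCopy.allocate();
    return sharedCopy.release(handle);
}
```

`releaseAcrossCopies()` returns `false` and leaves the handle outstanding in the library's copy, while `releaseShared()` returns `true`.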
 [...]
What if its accessing global state (and perhaps giving you access to it via callback)? What if it is global state instead of a function? What makes you think that it can be inlined?
C++.
Well C++ isn't D. They have things we don't have, different expectations. Keep in mind that D's official build manager is based upon npm, not CMake. We use D very differently than C++ users use C++. You have to understand how D is being used here; there is no substitute for experience supporting people with shared libraries and seeing what they are trying to do.
 [...]
Some cannot be like ``ModuleInfo``, others have be exported based upon if other things in the encapsulation are exported.
What's an "encapsulation"?
struct/class/union. Basically anything that contains other symbols.
 Its either that or we export literally everything inside of the 
 encapsulation unit (I don't like that at all).
?
That is negative annotation. It is the default in dub and is the only reliable method of using shared libraries in D, and even then it's nowhere near as reliable as it should be, as you cannot use (sub)packages.
 [...]
The way I view it is as thus: - If you use D shared libraries with a static runtime, you're going to have your program have indeterminate behavior.
Which doesn't seem to be what happens most often.
Right, we don't default to that with dub when using shared libraries, because it can only ever introduce undesirable behavior.
 - On the other hand, if you don't use shared libraries with a shared 
 runtime it works.
It doesn't seem this is what most people want.
Defaults should never be foot-gun heavy. That is what you are promoting here by suggesting the status quo with static library builds of druntime/phobos are a good thing. It's why C always defaults to shared libraries for its libc and with that the C++ runtime as well.
 So from my perspective its better to be opt-in for static 
 runtime/phobos builds if you know you don't need it, than the opposite.
Again, the raves from people new to Go would suggest otherwise. So would the lack of requests for this in D.
Go isn't D; it's used in a very different way than D is. You are very much mistaken about lack of requests; they are not needed because dub automates the change of defaults. When I first introduced support for shared libraries into dub, you had to explicitly set that you wanted a shared library build of druntime/phobos as well as setting dllimport to all and visibility to public. The reason nobody has talked about it is that it's done for them automatically when they use dub. If they use something else, then they probably know how to figure out what they need to do on their own (I've had no support requests for anything that isn't dub).
May 15
parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 16 May 2024 at 02:15:33 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 16/05/2024 5:30 AM, Atila Neves wrote:
 On Tuesday, 14 May 2024 at 22:13:19 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 On 15/05/2024 6:19 AM, Atila Neves wrote:
 [...]
1. Metadata (ModuleInfo, TypeInfo, RTInfo)
All three of those are examples of things that because they exist do not link, load or run at runtime correctly.
I still don't understand what are the situations under which things don't work. Or why they require marking an entire module as out of binary.
 These files should not exist:
Oof. No, they should not. I understand you have workarounds here for problems with shared libraries in D, but I still don't understand why these workarounds are necessary.
 2. It is almost certainly going to be correct
What is most certainly correct?
If a module is found via a path provided to the external import path switch to be out of binary.
This "begs the assumption" that such a switch exists.
 and ``-I`` is almost certainly at play, therefore we can 
 leverage this information to turn ``export`` for positive 
 annotation into ``DllImport``.
We already have a mechanism to do this.
Ugh no? Not for positive annotation we do not.
We do?
    export void foo();    // dllimport
    export void foo() { } // dllexport
 Can you spot the problem with this: ``[void*]`` vs ``[void*, 
 void**, void*]``
No, but I don't understand the question.
ModuleInfo goes into an array during loading. For ones in binary that is an array of them. [...]
I get an inkling of what the problem is here with the description, but:
* I don't actually know
* I have a gut feeling there's a solution that isn't too complicated
 Its not like how it is when its all in binary. This is not 
 something any user of D should be needing to attempt a work 
 around for and **I have working ones** puke.
I agree.
 I don't see how `extern` is related to a potential external 
 import path switch.
Simple, it adds it.
Right, but why does it need to?
    extern export int foo; // dllimport
    export int foo;        // dllexport
 This is desirable when you are doing multiple step builds, such 
 as incremental compilation or ya know using static libraries 
 like dub requires today.
We do this at work everyday; I don't get why the switch would be needed.
 What needs to be tested?
Anytime you convert a module from negative annotation to positive, everything has to be tested to verify that it still links, loads and runs correctly.
What's "everything"?
 In the above you were promoting duplicating global state as a 
 solution instead of letting the user mark something that is 
 private as exported.

 It's an insane idea.
It's what C++ does. There's no way to do it without putting it into a header and marking it `static`.
 Imagine there being multiple different mallocs inside each 
 shared library!
Because of a global variable?
 What makes you think that it can be inlined?
C++.
Well C++ isn't D. They have things we don't have, different expectations.
And we have things they don't have; but given that dll{import,export} is a Microsoft C/C++ thing, I think that it's a good idea to use them as inspiration for what we should do.
 Keep in mind that D's official build manager is based upon npm, 
 not CMake. We use D very differently than C++ users use C++.
I don't see how this is relevant, especially since there are multiple (meta) build systems one can use for both.
 You have to understand how D is being used here
Ok, how is it being used in a way that's so different from how C++ is used?
 That is negative annotation. It is the default in dub
How is it the default in dub?
 Defaults should never be foot-gun heavy.
I agree.
 That is what you are promoting here by suggesting the status 
 quo with static library builds of druntime/phobos are a good 
 thing.
I don't see how; 0 bullets in my foot due to this so far in 10 years.
 It's why C always defaults to shared libraries for its libc and 
 with that the C++ runtime as well.
I'm 99.9% sure that it was to save space and for every dependent app to be updated automatically to the new version of the libraries.
 So from my perspective its better to be opt-in for static 
 runtime/phobos builds if you know you don't need it, than the 
 opposite.
Again, the raves from people new to Go would suggest otherwise. So would the lack of requests for this in D.
Go isn't D, its used in a very different way to D is.
Yes/no. I'd rather statically link, personally.
 You are very much mistaken about lack of requests, they are not 
 needed because dub automates the change of defaults.
I'm confused; are there requests or not?
 When I first introduced support for shared libraries into dub, 
 you had to explicitly set that you wanted a shared library 
 build of druntime/phobos as well as setting dllimport to all 
 and visibility to public.
Ok. And now it automatically says "dynamically link"? If that's the case, I didn't know that, and now I'm wondering why we have an explicit flag for that in a dub.sdl.
May 16
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/05/2024 1:46 AM, Atila Neves wrote:
 On Thursday, 16 May 2024 at 02:15:33 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 16/05/2024 5:30 AM, Atila Neves wrote:
 On Tuesday, 14 May 2024 at 22:13:19 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 On 15/05/2024 6:19 AM, Atila Neves wrote:
 [...]
1. Metadata (ModuleInfo, TypeInfo, RTInfo)
All three of those are examples of things that because they exist do not link, load or run at runtime correctly.
I still don't understand what are the situations under which things don't work. Or why they require marking an entire module as out of binary.
They are data, not functions. See Martin's post: https://forum.dlang.org/post/dobouzmhwabquswguunk@forum.dlang.org
They literally cannot cross the DLL boundary if you do not know that they are out of binary.
 These files should not exist:
Oof. No, they should not. I understand you have workarounds here for problems with shared libraries in D, but I still don't understand why these workarounds are necessary.
Yeah I can tell, you're not quite getting the difference between data and functions here and unfortunately they matter if you are using positive annotation. If I dropped positive annotation and went to negative I could get rid of them even for dmd (tested after Rainer added support).
 and ``-I`` is almost certainly at play, therefore we can leverage 
 this information to turn ``export`` for positive annotation into 
 ``DllImport``.
We already have a mechanism to do this.
Ugh no? Not for positive annotation we do not.
We do?
    export void foo();    // dllimport
    export void foo() { } // dllexport
This is no different than initializers on a global variable determining its symbol mode. Think about the .di generator removing bodies, and suddenly you can no longer ship both static and shared libraries of your library. One set of interface files to a library is better than two, one for each build.
 Can you spot the problem with this: ``[void*]`` vs ``[void*, void**, 
 void*]``
No, but I don't understand the question.
ModuleInfo goes into an array during loading. For ones in binary that is an array of them. [...]
I get an inkling of what the problem is here with the description, but:
* I don't actually know
* I have a gut feeling there's a solution that isn't too complicated
The only "easy" solution I know of is to nullify out if it can't load the symbol. That assumes the linker/loader supports it, and unfortunately as a solution it only solves ModuleInfo. You still have to deal with other data like TypeInfo or .init. Each time I had to deal with .init not crossing over it took me days to figure it out, and here is my workaround for that:
- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/specifier.d#L262
- https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/prettyprint.d#L28
 I don't see how `extern` is related to a potential external import 
 path switch.
Simple, it adds it.
Right, but why does it need to?
    extern export int foo; // dllimport
    export int foo;        // dllexport
What you have written is behavior my DIP promotes. However the question is what happens when you have ``export int foo;`` in a file passed to ``-I``. In my DIP that is in internal symbol mode, not ``DllImport``. This allows for both static libraries and shared libraries to be used with one set of D files and in turn keeps the .di generator simpler.
 This is desirable when you are doing multiple step builds, such as 
 incremental compilation or ya know using static libraries like dub 
 requires today.
We do this at work everyday; I don't get why the switch would be needed.
That's because you're building it as a plugin with a set exported interface. This is a much more limited scenario that does not reflect projects like game engines or a standard library and should not be compared.
 What needs to be tested?
Anytime you convert a module from negative annotation to positive, everything has to be tested to verify that it still links, loads and runs correctly.
What's "everything"?
All symbols that you change the symbol mode on.
 In the above you were promoting duplicating global state as a solution 
 instead of letting the user mark something that is private as exported.

 It's an insane idea.
It's what C++ does. There's no way to do it without putting it into a header and marking it `static`.
 Imagine there being multiple different mallocs inside each shared 
 library!
Because of a global variable?
Bad example, but yes. If you are requiring duplicated globals that means you can end up with a library with N state, rather than 1. With template instantiations that is a very real problem. However I am now wondering if we might be miscommunicating on this.
 What makes you think that it can be inlined?
C++.
Well C++ isn't D. They have things we don't have, different expectations.
And we have things they don't have; but given that dll{import,export} is a Microsoft C/C++ thing, I think that it's a good idea to use them as inspiration for what we should do.
They have no QoL stuff at this level for us to take inspiration from. You use the macro preprocessor to set when something is DllImport versus internal. As for terminology yes, we absolutely must migrate over. Everyone has. Both LLVM and GCC use it, it's the standard model at the compiler level now.
 Keep in mind that D's official build manager is based upon npm, not 
 CMake. We use D very differently than C++ users use C++.
I don't see how this is relevant, especially since there are multiple (meta) build systems one can use for both.
 You have to understand how D is being used here
Ok, how is it being used in a way that's so different from how C++ is used?
Basically people expect to be able to import and for things to work. Nobody should be seeing linker warnings when using dub just because they have a dependency and are building a shared library. It is abnormal outside of plugins for people to be using positive annotation. They are exclusively negative annotation based upon the support I have given over a 2+ year period. We have people in the community who do not wish to know that linkers exist and still manage (with help) to use shared libraries. C++ on the other hand has for most of its life been exclusively positive annotation. Very different knowledge and willingness to deal with these details.
 That is negative annotation. It is the default in dub
How is it the default in dub?
https://github.com/dlang/dub/blob/master/source/dub/compilers/ldc.d#L172
It's hard-wired at the compiler personality level to change defaults. Note: default on other platforms is negative annotation (which makes it highly inconsistent, which my DIP fixes).
 It's why C always defaults to shared libraries for its libc and with 
 that the C++ runtime as well.
I'm 99.9% sure that it was to save space and for every dependent app to be updated automatically to the new version of the libraries.
That is a nice side benefit. It also saves RAM for the read-only segments (which could be large due to tables). But it has global state for things like malloc that you really want to be unified within the entire application. Otherwise you get memory leaks.
 You are very much mistaken about lack of requests, they are not needed 
 because dub automates the change of defaults.
I'm confused; are there requests or not?
I got it wrong, it wasn't dub that was changing defaults. It's ldc.
```
--link-defaultlib-shared - Link with shared versions of default libraries. Defaults to true when generating a shared library (-shared)
```
So no requests, because it is already switched (for shared libraries but not executables)!
 When I first introduced support for shared libraries into dub, you had 
 to explicitly set that you wanted a shared library build of 
 druntime/phobos as well as setting dllimport to all and visibility to 
 public.
Ok. And now it automatically says "dynamically link"? If that's the case, I didn't know that, and now I'm wondering why we have an explicit flag for that in a dub.sdl.
I got it wrong that it's ldc doing it, although it did do it once back in 2016. I don't know why it's there. Is that on an executable?
May 16