
digitalmars.D - runtime crashes with "rt/sections_elf_shared.d(688) _handleToDSO not

reply Jaap Geurts <jaap.geurts gmail.com> writes:
I apologize if this has been asked before but I couldn't find 
help here or anywhere else.

I'm trying to load my code as a dynamic library in a C++ program. 
It is working, but I have a problem upon program exit with thread 
local storage.

What I'm trying to do:

The C++ program loads my dynamic library from a thread (not the 
main thread). When the application exits from the main thread 
(without properly unloading the library), the finalizers run in 
the main thread and do not know about the loaded library, because 
it is tracked in the TLS variable `_loadedDSOs` of the thread that 
loaded it. The handle to the library, however, is recorded in the 
`__gshared` `_handleToDSOs`. When the main thread exits, the 
loaded libraries are compared to the open handles. Since the main 
thread thinks nothing is loaded (`_loadedDSOs` is empty) while 
`_handleToDSOs` is not empty, the D runtime aborts. I've reduced 
the problem to this example code:

```
#include <dlfcn.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void* loadlib(void*);

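int main(void)
{
     int counter = 0;
     pthread_t thread1;

     /* Load the D library from a secondary thread, as the host app does. */
     if (pthread_create(&thread1, NULL, &loadlib, NULL) != 0) {
         fprintf(stderr, "Can't create thread\n");
         return EXIT_FAILURE;
     }

     while (counter < 5) {
         printf("Main %d\n", counter++);
         usleep(300000);
     }

     /* Return without joining the thread or calling dlclose(): the exit
        handlers then run in the main thread, which triggers the abort. */
     return 0;
}

void* loadlib(void* args)
{
     int counter = 100;
     (void)args; /* unused */

     void* dlib = dlopen("libdlibrary.so", RTLD_LAZY | RTLD_GLOBAL);
     if (dlib == NULL) {
         fprintf(stderr, "Can't open dlib: %s\n", dlerror());
         exit(EXIT_FAILURE);
     }

     void (*dfunc)(void) = (void (*)(void))dlsym(dlib, "dfunc");
     if (dfunc == NULL) {
         fprintf(stderr, "Can't find dfunc: %s\n", dlerror());
         exit(EXIT_FAILURE);
     }

     dfunc();

     /* Keep the loading thread alive so it still runs when main() returns. */
     while (true) {
         printf("Thread %d\n", counter++);
         usleep(700000);
     }
}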
```

Compile with `cc -o app app.c -ldl -pthread` (the extra flags are 
only needed with glibc older than 2.34).

And the D code as a library:

```
import std.stdio;
import core.runtime;

extern (C) void dfunc() {
     writeln("Hello from D");
// rt_init();
}
```

Compile with: `ldc2 -shared dlibrary.d`

And run the app with: `LD_LIBRARY_PATH=. ./app`

This causes the crash.

I realize that this is not the correct way to do things, but I 
can't change the way the C++ program works: it loads my library 
from a thread, and when it exits, it just quits the main thread. 
Things I can't do in the C++ program:

1. Load the lib from the main thread. (My code is a plugin which 
only becomes active when it is loaded by the thread.)
2. Gracefully stop the thread that loaded the library and unload 
the library.
3. The C++ program never unloads the lib, so 
`pragma(crt_destructor)` doesn't work either.
4. `rt_init()` doesn't help either.

Is there anything I can do to fix this issue on the D side?

Help would be greatly appreciated!
Nov 15 2023
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Not a fix but you could try putting your shared library into LD_PRELOAD 
before running the main executable.

If it works, it'll return a duplicate handle with the dlopen, but it 
depends upon what flags it is passing in.
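
For reference, a minimal sketch of the "duplicate handle" behaviour this relies on: `dlopen` on an object that is already loaded returns the same handle again and only bumps a reference count. The function name `same_handle_on_reopen` is made up for illustration, and the sketch says nothing about how the D runtime reacts to a preloaded library.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: opens `path` twice and reports whether both calls
   returned the same handle. Passing NULL opens the main program itself.
   Returns 1 if the handles match, 0 if they differ, -1 if dlopen failed. */
int same_handle_on_reopen(const char *path)
{
    void *first = dlopen(path, RTLD_LAZY);
    if (first == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* The object is already loaded, so this only increments its refcount. */
    void *second = dlopen(path, RTLD_LAZY);
    if (second == NULL) {
        fprintf(stderr, "second dlopen failed: %s\n", dlerror());
        dlclose(first);
        return -1;
    }

    int same = (first == second);

    /* One dlclose per successful dlopen keeps the refcount balanced. */
    dlclose(second);
    dlclose(first);
    return same;
}
```

Calling `same_handle_on_reopen(NULL)` (or passing the path of any library that is already loaded) should report matching handles. Older glibc versions need `-ldl` when linking.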
Nov 15 2023
parent Jaap Geurts <jaap.geurts gmail.com> writes:
On Thursday, 16 November 2023 at 05:25:16 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Not a fix but you could try putting your shared library into 
 LD_PRELOAD before running the main executable.
Not really feasible, because the program in question is a Qt/QML/KDE plugin for the desktop. Unfortunately it doesn't work and still gives me this problem. I'll keep looking.
Nov 16 2023
prev sibling parent reply Arjan <arjan ask.me.to> writes:
On Wednesday, 15 November 2023 at 22:41:19 UTC, Jaap Geurts wrote:
 I apologize if this has been asked before but I couldn't find 
 help here or anywhere else.

 [...]
I do not understand what you want to achieve, but maybe (ab)use [`atexit`](https://en.cppreference.com/w/cpp/utility/program/atexit) from the D shared object to register a function *in* the D shared object that runs at app exit? Be aware that if [`dlclose`](https://linux.die.net/man/3/dlclose) *is* used on the D shared object, the app will crash, because the function pointer registered with `atexit` will then point into an already unmapped library. I have never tried this, however, so it might not work.
Nov 16 2023
parent reply Jaap Geurts <jaap.geurts gmail.com> writes:
On Thursday, 16 November 2023 at 17:59:56 UTC, Arjan wrote:
 On Wednesday, 15 November 2023 at 22:41:19 UTC, Jaap Geurts 
 wrote:
 I apologize if this has been asked before but I couldn't find 
 help here or anywhere else.

 [...]
I do not understand what you want to achieve, but maybe
I want to load a plugin into QML. This works with the help of dqml. It crashes upon exit of the QML app, but I must have a clean exit.

What I think happens is the following: Qt loads the DSO (my plugin written in D). In an init routine I start the D runtime with `rt_init()`. This happens in a thread created by Qt, which means that D will set up TLS, or reuse existing TLS, for that thread. Furthermore, the runtime keeps track of the libraries that have been loaded. So when my plugin is loaded, the D runtime scans all segments looking for loaded libraries that were dynamically linked into my plugin. It then stores pointers to those libraries in a `__gshared` collection, so that whenever another thread tries to open an already stored library, the runtime returns that pointer. The runtime also records which libraries are available to each thread, and it stores that per thread in TLS.

What happens is this:

* At startup, the `__gshared` library collection gets populated with pointers to the already loaded libraries. The TLS collection of open libraries also gets set up for the current thread.
* At exit, when Qt quits the main thread (without properly unloading libraries or quitting other threads nicely), the D exit handler runs. It sees no library pointers in the collection in the current TLS (because now it's looking in the TLS of `main()`), but the `__gshared` collection still has references. Then it bugs out with a safeAbort saying that loading and unloading isn't synchronized.

This is technically correct but not helpful. The best solution for me would be for the D runtime to still report an error, but not call `abort()`.
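
To make the mismatch concrete, here is a rough C sketch of the same pattern. It is not druntime code: `global_handles` and `thread_loaded` are hypothetical stand-ins for the `__gshared` collection and the per-thread TLS collection, and the loader thread is joined only so that the result is deterministic (in my real setup it keeps running).

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-ins for the runtime's bookkeeping. */
static int global_handles = 0;              /* one copy, shared by all threads */
static _Thread_local int thread_loaded = 0; /* one copy per thread (TLS)       */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* The "loading" thread records the library in both places. */
static void *loader_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    global_handles++;
    pthread_mutex_unlock(&lock);
    thread_loaded++; /* only this thread's copy changes */
    return NULL;
}

/* Runs the loader in another thread, then compares the two views from the
   calling thread. Returns 1 on a mismatch, 0 otherwise, -1 on error. */
int bookkeeping_mismatch(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, loader_thread, NULL) != 0)
        return -1;
    pthread_join(t, NULL);

    printf("global: %d, this thread: %d\n", global_handles, thread_loaded);

    /* The caller never loaded anything, so its TLS counter is still 0,
       while the shared counter says a library is open. */
    return thread_loaded == 0 && global_handles != 0;
}
```

Calling `bookkeeping_mismatch()` from `main()` reports a mismatch, which is the situation the exit handler finds itself in when it runs in the main thread.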
 (ab)using the 
 [`atexit`](https://en.cppreference.com/w/cpp/utility/program/atexit) from the D 
 shared object to register a function *in* the D shared object at app exit?
 Be-aware when [`dlclose`](https://linux.die.net/man/3/dlclose) 
 *is* used on the D shared object, the app will crash due to the 
 function-pointer pointing to already unmapped lib function is 
 called by `atexit`.
I tried this, but it doesn't work either, because when I register my function I'm late to the party and my function gets called after D's runtime exit handler has already run. Even if I could intercept the call it still wouldn't work, because `atexit()` handlers run in the main thread, where the TLS library collection is empty, and I can't unload the library myself because the `__gshared` collection is internal to D's runtime.

My code example above is a good explanation of what happens in my setup. It crashes with the exact same error message.

Thanks for taking the time to answer me.
Nov 20 2023
parent Arjan <arjan ask.me.to> writes:
On Monday, 20 November 2023 at 23:03:43 UTC, Jaap Geurts wrote:
 On Thursday, 16 November 2023 at 17:59:56 UTC, Arjan wrote:
 On Wednesday, 15 November 2023 at 22:41:19 UTC, Jaap Geurts

 I do not understand what you want to achieve, but maybe
I want to load a plugin into QML. This works with the help of dqml. It crashes upon exit of the QML app but I must have a clean exit. What I think happens is the following: The best solution for me would be that the D runtime would still report an error, but not call `abort()`.
 (ab)using the 
 [`atexit`](https://en.cppreference.com/w/cpp/utility/program/atexit) from the D 
 shared
[..]
 I tried this, but doesn't work either, because when I register 
 my function I'm late to the party and my function gets called 
 after D's runtime exit handler has been called.
Functions registered with `atexit` should be called in LIFO order, IIRC. That would imply that D's runtime exit handler is registered later? Or that it does not use `atexit` at all? Still, there would be the TLS thread issue to solve. Isn't the [module destructor](https://dlang.org/spec/module.html#staticorder) (`static ~this` or `shared static ~this`) called in the correct thread and in the correct order?
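
A small sketch to check the LIFO claim in isolation. It forks a child that registers two handlers and exits, then reads back the order in which they ran through a pipe. All names are made up for illustration, and it only covers plain `atexit` handlers; it does not show how or where the D runtime hooks into process exit.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1; /* write end of the pipe, used by the child */

static void registered_first(void)
{
    ssize_t r = write(report_fd, "1", 1);
    (void)r;
}

static void registered_second(void)
{
    ssize_t r = write(report_fd, "2", 1);
    (void)r;
}

/* Stores the order in which the handlers ran in `out` as a string.
   Returns 0 on success, -1 on failure. */
int atexit_order(char *out, size_t out_size)
{
    int fds[2];
    if (out_size < 3 || pipe(fds) != 0)
        return -1;

    fflush(NULL); /* do not let the child re-flush buffered parent output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: register two handlers, then exit normally. */
        close(fds[0]);
        report_fd = fds[1];
        atexit(registered_first);
        atexit(registered_second);
        exit(0); /* runs the handlers in reverse order of registration */
    }

    /* Parent: collect whatever the handlers wrote until the pipe closes. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < out_size - 1 &&
           (n = read(fds[0], out + used, out_size - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

With a buffer of a few bytes, `atexit_order` yields the string `"21"`: the handler registered last runs first. So a handler registered after the runtime's own exit hook would run before it only if that hook is itself an `atexit` handler.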
 My code example above is a good explanation of what happens in 
 my setup. It crashes with the exact same error message.
I see. Will play around with it if time permits..
Nov 21 2023