digitalmars.D.learn - toStringz lifetime

reply Ali Çehreli <acehreli yahoo.com> writes:
toStringz documentation is clear on why, when, and how to extend the 
lifetime of a D string:

   https://dlang.org/phobos/std_string.html#.toStringz

Assume foo is a D library function that passes a "string" result to e.g. C:

import std.format : format;
import std.string : toStringz;

extern(C)
void foo(ref const(char) * name) {
   name = format!"file%s.txt"(42).toStringz;  // Allocates from GC memory
}

This may be fine for "immediate use" on the C side because at first 
glance no garbage collection can take place between our returning the 
result and their using it:

// C caller:
   const char * name = NULL;
   foo(&name);                 // Calls us
   printf("%s", name);         // Uses 'name' immediately

Is it really safe? Imagine a multi-threaded environment where another D 
function is executed that triggers a GC collection right before the printf.

Does the GC see that local variable 'name' that is on the C side? What I 
don't know is whether the GC is aware only of the stack frames of D 
functions or the entire thread, which would include the C caller's 'name'.

Ali
Oct 25 2020
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 25/10/2020 11:03 PM, Ali Çehreli wrote:
 Does the GC see that local variable 'name' that is on the C side? What I 
 don't know is whether the GC is aware only of the stack frames of D 
 functions or the entire thread, which would include the C caller's 'name'.
The thread stack frame that is registered with the D GC will know about the D side and may know about the C side.

It depends on what the C side is doing.

If the C side went ahead and made a new stack frame via a fiber... it won't know about it. But even if it did, the D stack frame is still alive and pinning that bit of memory.

Ultimately, if the C side puts that pointer some place like a global or sends it to another thread, there are no guarantees that things will play out well.
Oct 25 2020
next sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 10/25/20 3:19 AM, rikki cattermole wrote:
 On 25/10/2020 11:03 PM, Ali Çehreli wrote:
 Does the GC see that local variable 'name' that is on the C side? What
 I don't know is whether the GC is aware only of the stack frames of D
 functions or the entire thread, which would include the C caller's
 'name'.
 The thread stack frame that is registered with the D GC will know about
 the D side and may know about the C side.

 It depends on what the C side is doing.

 If the C side went ahead and made a new stack frame via a fiber... it
 won't know about it. But even if it did, the D stack frame is still
 alive and pinning that bit of memory.

 Ultimately, if the C side puts that pointer some place like a global or
 sends it to another thread, there are no guarantees that things will play
 out well.
Thanks. That's reassuring. :)

So, as long as the D function documents that the C side should make a copy if they want to extend the string's lifetime, it's their responsibility. And from your description I understand that they have time to make that copy.

Ali
Oct 25 2020
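
A minimal sketch of the "make a copy" rule discussed above, seen from the C side. Everything here is an assumption for illustration: foo_stub is a hypothetical stand-in for the D library's foo (it hands out a static buffer instead of GC memory so the example is self-contained), and keep_copy is a hypothetical helper name, not part of any library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the D library function. The real foo would
   return GC-owned memory from toStringz; a static buffer plays that
   role here so the sketch builds without a D toolchain. */
static char library_buffer[32];

void foo_stub(const char **name) {
    strcpy(library_buffer, "file42.txt");
    *name = library_buffer;   /* Borrowed: valid for immediate use only */
}

/* Copies a borrowed string into malloc'ed memory that the C side owns.
   Returns NULL if the input is NULL or the allocation fails. The caller
   releases the copy with free(). */
char *keep_copy(const char *borrowed) {
    if (borrowed == NULL) {
        return NULL;
    }
    size_t size = strlen(borrowed) + 1;   /* Include the terminator */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, borrowed, size);
    }
    return copy;
}
```

The C caller would call keep_copy(name) right after foo returns, while the pointer is still on its own stack, and from then on store only the copy in globals or hand it to other threads.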
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 10/25/20 3:19 AM, rikki cattermole wrote:
 On 25/10/2020 11:03 PM, Ali Çehreli wrote:
 Does the GC see that local variable 'name' that is on the C side? What
 I don't know is whether the GC is aware only of the stack frames of D
 functions or the entire thread, which would include the C caller's
 'name'.
 The thread stack frame that is registered with the D GC will know about
 the D side and may know about the C side.

 It depends on what the C side is doing.

 If the C side went ahead and made a new stack frame via a fiber... it
 won't know about it. But even if it did, the D stack frame is still
 alive and pinning that bit of memory.

 Ultimately, if the C side puts that pointer some place like a global or
 sends it to another thread, there are no guarantees that things will play
 out well.
Sorry to bring this up again but I want to understand this fully before I say something wrong during my DConf presentation. :)

The D code is a library. The actual program is e.g. written in C. When the D library is loaded into the program, the following function is executed and the D GC is initialized:

import core.runtime : rt_init;

pragma (crt_constructor)
extern(C)
int initialize() {
   return rt_init();
}

Does the D GC know the complete function call stack of the C program all the way up from 'main'? Is there the concept of "bottom of the stack", or can the D GC only know the value of the stack pointer at the time rt_init() was called? If the latter, then I think a toStringz string may not be alive in a C function.

Imagine the C program dlopens our library from inside a function called from main. Then the program calls one of our library functions from another function in main:

// C program
int main() {
   initializeDlibrary();    // This does dlopen()
   useDlibrary();           // This receives a string returned from
                            // toStringz and uses that string.
}

So, the question is, does the D GC only know the stack from initializeDlibrary's frame up because it was initialized there?

I know threads complicate matters and they need to be attached to the GC with core.thread.osthread.thread_attachThis but I am not there yet. :) I want to understand the basic single thread stack pointer issue first.

Thank you,
Ali
Nov 08 2020
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 09/11/2020 2:58 PM, Ali Çehreli wrote:
 Does the D GC know the complete function call stack of the C program all 
 the way up from 'main'? Is there the concept of "bottom of the stack" or 
 does the D GC can only know the value of the stack pointer at the time 
 rt_init() was called. If the latter, then I think a toStringz string may 
 not be alive in a C function.
https://github.com/dlang/druntime/blob/master/src/core/thread/context.d#L16
https://github.com/dlang/druntime/blob/master/src/core/thread/threadbase.d#L469
https://github.com/dlang/druntime/blob/master/src/core/thread/osthread.d#L1455
https://github.com/dlang/druntime/blob/master/src/core/thread/osthread.d#L1208

I'm tired, so here is the code related to your questions.

Note: the GC will use this abstraction for dealing with stack frames (otherwise it would be duplicated).
Nov 08 2020
parent Ali Çehreli <acehreli yahoo.com> writes:
On 11/8/20 6:58 PM, rikki cattermole wrote:

 On 09/11/2020 2:58 PM, Ali Çehreli wrote:
 Does the D GC know the complete function call stack of the C program
 all the way up from 'main'? Is there the concept of "bottom of the
 stack"

 https://github.com/dlang/druntime/blob/master/src/core/thread/osthread.d#L1455
 I'm tired, so here is the code related to your questions.
Hey, I'm tired too! :p

Thank you. By the presence of getStackTop() and getStackBottom() above, I'm convinced that the entire stack is available. So, the pointer returned by toStringz will be kept alive by the C caller during their immediate use. (They obviously cannot store it for later use.)

Ali
Nov 08 2020
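
To make the "stack bottom versus stack pointer at rt_init() time" question concrete, here is a toy model in C. It is only a sketch under loud assumptions: it is not druntime's implementation, the stack is modelled as a plain array of words (index 0 is the bottom, near main), CPU registers are ignored, and the names range_holds_pointer, model_scan, STACK_WORDS and INIT_DEPTH are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Conservative scan: returns 1 if any word in [begin, end) has the same
   value as the target address, 0 otherwise. */
int range_holds_pointer(const uintptr_t *begin, const uintptr_t *end,
                        uintptr_t target) {
    for (const uintptr_t *p = begin; p < end; ++p) {
        if (*p == target) {
            return 1;
        }
    }
    return 0;
}

enum { STACK_WORDS = 8, INIT_DEPTH = 4 };

/* Toy model of one thread's stack. Slot 1 stands for the C caller's
   local 'name', which sits close to main. INIT_DEPTH stands for how
   deep the stack was when the runtime was initialized.
   scan_whole_stack != 0: scan from the real bottom (slot 0).
   scan_whole_stack == 0: scan only from INIT_DEPTH onwards. */
int model_scan(int scan_whole_stack) {
    static char gc_block[] = "file42.txt";   /* Stands for the GC allocation */
    uintptr_t target = (uintptr_t)gc_block;

    uintptr_t stack[STACK_WORDS] = {0};
    stack[1] = target;                       /* The C caller's 'name' */

    const uintptr_t *begin = scan_whole_stack ? stack : stack + INIT_DEPTH;
    return range_holds_pointer(begin, stack + STACK_WORDS, target);
}
```

In this model, model_scan(1) finds the pointer and model_scan(0) misses it, which is the failure the question worries about. Per the conclusion above, a runtime that records both ends of the thread's stack corresponds to the first case.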
prev sibling parent Johan Engelen <j j.nl> writes:
On Sunday, 25 October 2020 at 10:03:44 UTC, Ali Çehreli wrote:
 Is it really safe? Imagine a multi-threaded environment where 
 another D function is executed that triggers a GC collection 
 right before the printf.

 Does the GC see that local variable 'name' that is on the C 
 side? What I don't know is whether the GC is aware only of the 
 stack frames of D functions or the entire thread, which would 
 include the C caller's 'name'.
Small note: besides the stack, it is crucial that the GC is aware of the CPU register values.

-Johan
Oct 25 2020