
digitalmars.D - Interfacing to C - Storage Allocation - eek?

Mike Capp <mike.capp gmail.com> writes:
From the current page:

"If pointers to D garbage collector allocated memory are passed to C functions,
it's critical to ensure that that memory will not be collected by the garbage
collector before the C function is done with it."

Various suggestions on how to prevent this are offered. Well and good, for now.
What happens if and when D gets a copying GC implementation?

The first suggestion - copying to non-GCed memory - is not appealing for large
data blocks. The others would presumably throw up hard-to-diagnose wild pointer
errors in rare and hard-to-reproduce circumstances.
Aug 02 2005
llothar <llothar_member pathlink.com> writes:
In article <dcorea$2pov$1 digitaldaemon.com>, Mike Capp says...
> Various suggestions on how to prevent this are offered. Well and good, for now.
> What happens if and when D gets a copying GC implementation?
If the GC changes in such a way, all non-D code must be completely rewritten. The usual approach for a copying garbage collector is to offer indirect pointers to the referenced value, while non-copying collectors simply pass a direct pointer.
Aug 02 2005
Mike Capp <mike.capp gmail.com> writes:
In article <dcosu3$2r13$1 digitaldaemon.com>, llothar says...
> If the GC changes in such a way, all non D code must be completely rewritten.
Hence the "eek". Completely rewriting Win32, for example, is not a viable option. And yet the docs and many previous postings have indicated that a copying GC is possible at some point.

cheers
Mike
Aug 02 2005
Sean Kelly <sean f4.ca> writes:
In article <dcorea$2pov$1 digitaldaemon.com>, Mike Capp says...
> From the current page:
>
> "If pointers to D garbage collector allocated memory are passed to C functions,
> it's critical to ensure that that memory will not be collected by the garbage
> collector before the C function is done with it."
>
> Various suggestions on how to prevent this are offered. Well and good, for now.
> What happens if and when D gets a copying GC implementation?
The GC would offer a method to "pin" memory that should not be moved.

Sean
Aug 02 2005
Mike Capp <mike.capp gmail.com> writes:
In article <dcou45$2s21$1 digitaldaemon.com>, Sean Kelly says...
>> What happens if and when D gets a copying GC implementation?
> The GC would offer a method to "pin" memory that should not be moved.
Exactly. Existing D code rebuilt with such an implementation won't ever call that method and will _silently_ become horribly unsafe. That's going to be popular.

Plus, of course, many if not most calls to C APIs will have to be wrapped with pin/unpin calls. Verbose, ugly and easy to forget. Even worse with asynchronous operations or userdata pointers for callbacks.

cheers
Mike
Aug 02 2005
Sean Kelly <sean f4.ca> writes:
In article <dcovni$2svo$1 digitaldaemon.com>, Mike Capp says...
> In article <dcou45$2s21$1 digitaldaemon.com>, Sean Kelly says...
>>> What happens if and when D gets a copying GC implementation?
>> The GC would offer a method to "pin" memory that should not be moved.
> Exactly. Existing D code rebuilt with such an implementation won't ever call that method and will _silently_ become horribly unsafe. That's going to be popular.
Once I get threading finished for Ares, I'm planning to look at the GC interface. I'll add 'pin' functions in the process, even if they do nothing with the current GC. I'll try to remember to ask that the same be done for Phobos.
> Plus, of course, many if not most calls to C APIs will have to be wrapped with
> pin/unpin calls. Verbose, ugly and easy to forget. Even worse with asynchronous
> operations or userdata pointers for callbacks.
So what would you suggest? Auto classes might help to streamline the process, but I can't think of any way to completely automate it.

Sean
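
A minimal sketch of the two ideas in this post: 'pin' functions that do nothing under a non-moving collector, and a scope-bound guard that calls them automatically. Everything named here (`pin`, `unpin`, `Pinned`, `fillBuffer`) is a hypothetical stand-in, not an existing Ares or Phobos interface, and the guard is written as a struct with a destructor rather than the `auto` class syntax discussed in the thread.

```d
import std.stdio;

// Hypothetical GC hooks (assumed names, not an existing API).
// With a non-moving collector they can simply do nothing.
void pin(void* p) { }
void unpin(void* p) { }

// Scope guard: pins on construction, unpins when the scope ends.
struct Pinned
{
    private void* ptr;

    this(void* p)
    {
        ptr = p;
        pin(ptr);
    }

    ~this()
    {
        if (ptr !is null)
            unpin(ptr);
    }

    @disable this(this); // no copies, so each pin gets exactly one unpin
}

// Stand-in for a C function that uses the buffer only during the call.
extern (C) size_t fillBuffer(char* buf, size_t len)
{
    foreach (i; 0 .. len)
        buf[i] = 'x';
    return len;
}

void main()
{
    auto buffer = new char[16];
    {
        auto guard = Pinned(buffer.ptr);
        fillBuffer(buffer.ptr, buffer.length);
    } // guard's destructor runs here and unpins
    writeln(buffer); // prints sixteen 'x' characters
}
```

The guard only covers calls that finish within the enclosing scope; a pointer the C side keeps after the call returns would still need a manual pin.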
Aug 02 2005
Mike Capp <mike.capp gmail.com> writes:
In article <dcp03v$2t8k$1 digitaldaemon.com>, Sean Kelly says...
> So what would you suggest?
Shrug. I'd suggest that GC is a lousy official memory management strategy for a language like D, but I've already done that and seem to be in a minority of one. (Cue 'Hearts and Flowers'...)
> but I can't think of any way to completely automate it.
Not at the library level, no. It might be partially possible at the compiler level - "whenever you see a pointer argument being passed to a function declared as extern (C), insert pin/unpin calls for that pointer around the function invocation" - but that doesn't help for the async and userdata-callback cases. Still, those are maybe rare enough that requiring a manual pin is acceptable.

cheers
Mike
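
To make the quoted rule concrete, here is a rough sketch of the transformation such a compiler would perform, written by hand as a wrapper template. The names `pin`, `unpin`, `callPinned` and `countBytes` are all hypothetical, and the hooks only count calls so the effect is visible. As noted above, this covers only pointers used for the duration of the call.

```d
import std.stdio;
import std.traits : isPointer;

// Hypothetical GC hooks (assumed names); here they just count calls.
int pins, unpins;
void pin(void* p) { ++pins; }
void unpin(void* p) { ++unpins; }

// Pins every pointer argument, calls fn, and unpins again on the way out.
auto callPinned(alias fn, Args...)(Args args)
{
    foreach (a; args)
        static if (isPointer!(typeof(a)))
            pin(cast(void*) a);
    scope (exit)
    {
        foreach (a; args)
            static if (isPointer!(typeof(a)))
                unpin(cast(void*) a);
    }
    return fn(args);
}

// Stand-in for a C function taking a pointer and a plain value.
extern (C) size_t countBytes(const(char)* s, size_t limit)
{
    size_t n = 0;
    while (n < limit && s[n] != 0)
        ++n;
    return n;
}

void main()
{
    auto text = "hello\0".dup;
    auto n = callPinned!countBytes(text.ptr, text.length);
    writeln(n, " ", pins, " ", unpins); // prints "5 1 1"
}
```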
Aug 02 2005
Sean Kelly <sean f4.ca> writes:
In article <dcp2vg$2v6p$1 digitaldaemon.com>, Mike Capp says...
> In article <dcp03v$2t8k$1 digitaldaemon.com>, Sean Kelly says...
>> but I can't think of any way to completely automate it.
> Not at the library level, no. It might be partially possible at the compiler level - "whenever you see a pointer argument being passed to a function declared as extern (C), insert pin/unpin calls for that pointer around the function invocation" - but that doesn't help for the async and userdata-callback cases.
There are two cases where a pointer may be passed to a C function:

- it is used for the function call, and no references persist after return
- the function stores this pointer in a static location and continues using it after return (callbacks and such)

For the latter case, auto-unpinning is incorrect behavior. Also, pin/unpin is currently only necessary in the first case if the application is multithreaded, as the current D GC will only clean up during a call to 'new'.
> Still, those are maybe rare enough that requiring a manual pin is acceptable.
Perhaps an auto class would facilitate this a tad. Not ideal, but it's something.

Sean
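
The two cases above can be sketched side by side. This is not the interface being discussed in the thread: it uses what present-day D's `core.memory.GC` offers (`addRoot`/`removeRoot` and the `NO_MOVE` block attribute), and the "C" functions `useOnce`, `registerData` and `fireCallback` are stand-ins written in D so the example is self-contained.

```d
import core.memory : GC;
import std.stdio;

// Models a static location inside a C library.
__gshared int* stored;

// Case 1: the pointer is used only during the call.
extern (C) int useOnce(int* p) { return *p + 1; }

// Case 2: the pointer is stored and used after the call returns.
extern (C) void registerData(int* p) { stored = p; }
extern (C) int fireCallback() { return *stored; }

void main()
{
    // Case 1: the reference on the D stack keeps the block alive for the call.
    auto a = new int;
    *a = 1;
    writeln(useOnce(a)); // prints 2

    // Case 2: tell the GC about the outside reference before handing it over.
    auto b = new int;
    *b = 10;
    GC.addRoot(b);                      // keep the block alive
    GC.setAttr(b, GC.BlkAttr.NO_MOVE);  // ask that it not be moved
    registerData(b);
    writeln(fireCallback()); // prints 10

    // Undo both once the C side no longer holds the pointer.
    GC.removeRoot(b);
    GC.clrAttr(b, GC.BlkAttr.NO_MOVE);
}
```

The undo step in the second case has to happen when the C side lets go of the pointer, which is why tying it to the end of the call would be wrong.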
Aug 02 2005
Mike Capp <mike.capp gmail.com> writes:
In article <dcp6pf$b5$1 digitaldaemon.com>, Sean Kelly says...
> There are two cases where a pointer may be passed to a C function:
>
> - it is used for the function call, and no references persist after return
> - the function stores this pointer in a static location and continues using it
> after return (callbacks and such)
>
> For the latter case, auto-unpinning is incorrect behavior.
Is it? I don't know how pinning is implemented in garbage collectors; I was imagining that you could stick multiple pins in a chunk of memory. Auto-unpinning would then be pointless but harmless as long as the user had also done a manual pin.
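
A small sketch of the "multiple pins" idea, assuming pins are counted per address. The table and the names `pinCounts`, `pin`, `unpin` and `isPinned` are hypothetical; a real collector would keep this bookkeeping internally and may do it quite differently.

```d
import std.stdio;

// Hypothetical pin table: address -> number of outstanding pins.
int[void*] pinCounts;

void pin(void* p)
{
    pinCounts[p] = pinCounts.get(p, 0) + 1;
}

void unpin(void* p)
{
    auto count = p in pinCounts;
    assert(count !is null, "unpin without a matching pin");
    if (--(*count) == 0)
        pinCounts.remove(p);
}

bool isPinned(void* p)
{
    return (p in pinCounts) !is null;
}

void main()
{
    auto data = new ubyte[64];
    pin(data.ptr);   // manual pin by the user (callback case)
    pin(data.ptr);   // automatic pin around the extern (C) call
    unpin(data.ptr); // automatic unpin when the call returns
    writeln(isPinned(data.ptr)); // prints true: the manual pin is still held
    unpin(data.ptr);
    writeln(isPinned(data.ptr)); // prints false
}
```

With counting, the automatic unpin cannot release memory that the user pinned by hand, which is the "pointless but harmless" behavior described above.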
> Also, pin/unpin is
> currently only necessary in the first case if the application is multithreaded,
> as the current D GC will only clean up during a call to 'new'.
Not _strictly_ true. If you call a C function (fc) which takes both a data pointer (pd) and a function pointer (pf), and you supply a function in your D module (fd) as the value of pf, and fc calls pf before accessing pd, and fd happens to call 'new'...
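
The scenario can be written out using the same names (fc, pd, pf, fd). Here fc is a stand-in defined in D with C linkage so the sketch is self-contained; in practice it would live in a C library. Nothing goes wrong with a non-moving collector, but the comments mark where a moving one could invalidate pd.

```d
import std.stdio;

extern (C) alias Callback = void function();

// Stand-in for the C function fc: it calls pf first, then reads through pd.
extern (C) int fc(int* pd, Callback pf)
{
    pf();       // D code runs here and may allocate
    return *pd; // pd must still point at the same, live memory here
}

// The D function fd supplied as pf: it calls 'new', which may trigger a collection.
extern (C) void fd()
{
    auto scratch = new int[1024];
    scratch[0] = 1;
}

void main()
{
    auto value = new int;
    *value = 42;
    writeln(fc(value, &fd)); // prints 42
}
```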
> Perhaps an auto class would facilitate this [async case] a tad.
> Not ideal, but it's something.
The more I look at auto the less I'm convinced it facilitates anything. As currently implemented it appears to be a variant syntax for try-finally, only with added limitations and extra cost. I don't see how it could help here, since autos are limited to local scope and an asynchronous function guaranteed to complete by the end of a local scope doesn't sound very asynchronous to me.

cheers
Mike
Aug 02 2005
Brad Beveridge <brad somewhere.net> writes:
I think that it depends on the pin/unpin syntax.  Obviously you should 
be able to pin/unpin a chunk of memory at any time, but how about also 
being able to pin the memory at variable declaration time?

pinned char myBuffer[];

which tells the GC to treat myBuffer in exactly the same way that the 
current GC treats it.  I.e., it can move if you reallocate it, but the 
copying collector will not move it.

Brad
Aug 03 2005
Shammah Chancellor <Shammah_member pathlink.com> writes:
In article <dcp2vg$2v6p$1 digitaldaemon.com>, Mike Capp says...
> In article <dcp03v$2t8k$1 digitaldaemon.com>, Sean Kelly says...
>> So what would you suggest?
> Shrug. I'd suggest that GC is a lousy official memory management strategy for a language like D, but I've already done that and seem to be in a minority of one. (Cue 'Hearts and Flowers'...)
You earlier stated that you found pin / unpin as being ugly:

> Plus, of course, many if not most calls to C APIs will have to be wrapped with pin/unpin calls. Verbose, ugly and easy to forget. Even worse with asynchronous operations or userdata pointers for callbacks.

Yet you, in the msg I quoted, say that you'd rather there be no GC. So you'd rather be doing ugly verbose memory management everywhere than a few ugly verbose pin/unpin calls when you occasionally need to call into a C library?

-Sha
Aug 03 2005
Mike Capp <mike.capp gmail.com> writes:
In article <dcr6l7$1l46$1 digitaldaemon.com>, Shammah Chancellor says...
> You earlier stated that you found pin / unpin as being ugly:
If manually wrapped around almost every C API call, yes. Particularly since D's easy integration with C means that it's not necessarily blindingly obvious which calls are to C APIs.
> Yet you, in the msg I quoted, say that you'd rather there be no GC.
No, I said I thought it was a lousy official memory management strategy. My problem isn't with D having GC; it's with D not having much else. GC is a reasonable approach to managing memory, but it's a completely worthless approach to managing most other resources. When non-memory resources are tightly coupled to objects, e.g. handles in an SWT-like GUI library that uses native peers, that inadequacy spills over.
> So you'd
> rather be doing ugly verbose memory management everywhere than a few ugly
> verbose pin/unpin calls when you occasionally need to call into a C library?
Resource management (and not just memory) in a language with deterministic dtors is far from ugly or verbose. (My current C++ project contains a grand total of one 'delete' and one 'delete[]'.) And it's consistent; the same mechanism works for everything. You don't have to worry about whether class Foo needs to be Dispose()d or Close()d or whatever, and when, and what happens to the 'zombie' object afterwards. You just let it go. The class knows how to clean itself up, and does so predictably as soon as possible and no sooner. Yes, there are costs. Yes, it takes a bit of discipline. IMHO it's worth it.

cheers
Mike
Aug 03 2005
Shammah Chancellor <Shammah_member pathlink.com> writes:
In article <dcrcql$1qpi$1 digitaldaemon.com>, Mike Capp says...
> In article <dcr6l7$1l46$1 digitaldaemon.com>, Shammah Chancellor says...
>> You earlier stated that you found pin / unpin as being ugly:
> If manually wrapped around almost every C API call, yes. Particularly since D's easy integration with C means that it's not necessarily blindingly obvious which calls are to C APIs.
Eh? D requires imports, just like Perl or any other language. With the proper imports, any calls to C from Perl, PHP, QBasic, or whatever else look the same as a normal call. If you're talking about the imports already in Phobos, I should hope you know which functions are in the std.c.* modules that you're importing. If not, you can easily check by removing that import and looking at the slew of errors.
>> Yet you, in the msg I quoted, say that you'd rather there be no GC.
> No, I said I thought it was a lousy official memory management strategy. My problem isn't with D having GC; it's with D not having much else. GC is a reasonable approach to managing memory, but it's a completely worthless approach to managing most other resources. When non-memory resources are tightly coupled to objects, e.g. handles in an SWT-like GUI library that uses native peers, that inadequacy spills over.
>> So you'd
>> rather be doing ugly verbose memory management everywhere than a few ugly
>> verbose pin/unpin calls when you occasionally need to call into a C library?
> Resource management (and not just memory) in a language with deterministic dtors is far from ugly or verbose. (My current C++ project contains a grand total of one 'delete' and one 'delete[]'.) And it's consistent; the same mechanism works for everything. You don't have to worry about whether class Foo needs to be Dispose()d or Close()d or whatever, and when, and what happens to the 'zombie' object afterwards. You just let it go. The class knows how to clean itself up, and does so predictably as soon as possible and no sooner. Yes, there are costs. Yes, it takes a bit of discipline. IMHO it's worth it.
Congratulations, you spent a hell of a lot of time getting your stuff set up perfectly and have no memory leaks. (Which is a big problem with C++ programs, and exactly the reason there is a huge market for memory debuggers. Have you used one of these on your program? You might be surprised.) Being required to pin a variable which is getting passed to a C function is not that big of a deal IMO. If it becomes such a big problem, it might not be hard to auto-pin references passed to anything declared extern(C); I don't know.
Aug 03 2005