digitalmars.D - pinning revisited

reply Jason O'Brien <xergen hotmail.com> writes:
It's not that any of us need to pin right now, but that the standard
puts anyone using C interop in a hard place: it's not possible to write
forward-compatible, GC-safe code.  We want to, but we can't, and so we
don't.  No one is right now, and I'm afraid the more broken code we all
write, the less likely it is that a more efficient copying collector will
ever be worthwhile, and if one does get done, the more code we will all
have to rewrite.

I have a couple of ideas, but whatever we decide on we should do it soon.

----------------------------------------------------------------------

1) void gcpin(void*) and void gcunpin(void*)
These functions keep a block of memory from being moved by a copying
collector.  For now, and to preserve speed in non-copying GCs, I
propose the dummy implementation:
version(STATIC_GC) { void gcpin(void* ptr){} void gcunpin(void* ptr){} }

This has the extra benefit that if a compiler supports both a copying and
a non-copying GC (for whatever reason), all that has to change is

version(COPYING_GC) { /* real implementations here */ }
else { /* empty versions */ }

There is a problem, though: pinning alone does not stop a pinned object
from being collected.
How to deal with this would be implementation-specific, but shouldn't be
too hard.  Either scan the pin list as roots, or have roots and pinned
objects be one and the same (gcpin just resolving to addRoot).
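
A rough sketch of option 1 in present-day D, for illustration only.
gcpin and gcunpin are the names proposed in this post and are not part
of any library. Mapping them onto druntime's GC.addRoot and the
GC.BlkAttr.NO_MOVE attribute is an assumption made here, as are the
COPYING_GC version identifier and the requirement that ptr is the base
address of a GC block.

```d
import core.memory : GC;

version (COPYING_GC)
{
    // Assumed mapping: mark the block as non-movable and register it as
    // a root, so that it is neither moved nor collected while pinned.
    void gcpin(void* ptr)
    {
        if (ptr is null) return;
        GC.setAttr(ptr, GC.BlkAttr.NO_MOVE);
        GC.addRoot(ptr);
    }

    void gcunpin(void* ptr)
    {
        if (ptr is null) return;
        GC.removeRoot(ptr);
        GC.clrAttr(ptr, GC.BlkAttr.NO_MOVE);
    }
}
else
{
    // Non-copying collector: pinning is a no-op, so calls cost nothing.
    void gcpin(void* ptr) {}
    void gcunpin(void* ptr) {}
}
```

Code written against these two functions compiles the same way under
either collector; only the version block changes.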

Of course, this leaves a lot of room for error.  It could lead to some
major memory leaks if you forget to unpin things after interop.  So I
propose a keyword (which is the reason gcpin is named the way it is):
pin(void* ptr)
{
	// interop functions requiring the pin go here
}
This would just resolve to
gcpin(ptr);
// interop functions
gcunpin(ptr);
but gets rid of the chance to shoot yourself in the foot in most cases.
You only need the manual pins when an object pointer is stored
on the C side, in which case the compiler couldn't know when to unpin it,
so you have to do it yourself anyway.
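
The proposed pin block can be approximated without a new keyword. The
sketch below is an illustration, not an existing API: withPin is a
hypothetical helper, and gcpin/gcunpin are stand-ins that only count
calls so the pairing can be observed. It uses scope(exit), which goes a
little further than the plain expansion above by also unpinning when
the interop code throws.

```d
// Stand-ins for the proposed functions. They only count active pins;
// a real collector would do the actual pinning work here.
int activePins;
void gcpin(void* ptr)   { ++activePins; }
void gcunpin(void* ptr) { --activePins; }

// Hypothetical helper emulating the proposed `pin(ptr) { ... }` block.
void withPin(void* ptr, scope void delegate() interop)
{
    gcpin(ptr);
    scope (exit) gcunpin(ptr); // runs on normal exit and on exceptions
    interop();
}
```

A call then looks like withPin(p, { /* interop calls */ });, with the
pin held only for the duration of the delegate.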

----------------------------------------------------------------------

2)  A pin auto class.
Combined with a using-style scope statement for auto classes, you
could have nearly the same syntax:
using(Pin pin(void* ptr))
{
	// interop functions
}
The alternative is to just have a local pin, but that doesn't give you a
scope clearly showing where the object is and is not pinned, and it also
keeps the object pinned until the enclosing scope ends (which is
potentially bad for memory consumption).

And if the C side is storing the pointer, you could keep a pin as a class
member.  This is a bit more foolproof than the option above, but I'm not
sure it's even possible with auto classes as they are at the moment, and it
could get clumsy if you are pinning more than one object.

One thing to worry about is keeping track of nesting (pin() pin()
unpin() should leave the object pinned until another call to unpin()),
though either option could implement this easily.
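
A minimal sketch of nested pinning combined with a scoped Pin wrapper,
written as a struct in present-day D rather than as an auto class. The
pin table, isPinned, and the Pin type are all hypothetical names
introduced for this illustration; a real collector would keep this
bookkeeping internally.

```d
// Hypothetical pin table: address -> number of outstanding pins.
int[void*] pinCounts;

void gcpin(void* ptr)
{
    if (ptr is null) return;
    if (auto count = ptr in pinCounts)
        ++*count;               // already pinned: one more nesting level
    else
        pinCounts[ptr] = 1;     // first pin for this address
}

void gcunpin(void* ptr)
{
    auto count = ptr in pinCounts;
    if (count is null) return;  // not pinned: nothing to do
    if (--*count == 0)
        pinCounts.remove(ptr);  // last pin released
}

bool isPinned(void* ptr)
{
    return (ptr in pinCounts) !is null;
}

// Scoped pin: pins on construction, unpins when it goes out of scope.
struct Pin
{
    private void* ptr;
    this(void* p) { ptr = p; gcpin(ptr); }
    ~this() { if (ptr !is null) gcunpin(ptr); }
    @disable this(this);        // copying would unbalance the count
}
```

With this, pin() pin() unpin() leaves the object pinned, and a Pin kept
as a class member covers the case where the C side stores the pointer.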

----------------------------------------------------------------------

Personally, I prefer option 1, the functions being a little easier to
work with when you need asynchronous pinning, without having to worry
about keeping members for active pins.  I don't see why both couldn't be
implemented so people can choose whichever they like, as making an
auto class to wrap the gcpin and gcunpin functions would be simple.

I'm not sure when it is valid for the GC to run, but in the interest of
callback and thread safety, I think it might be necessary for copying GC
implementations to automatically wrap any function call into C with the
appropriate pin.  This way, pointers won't change during the execution of
a C function.  This could easily be done transparently to the user, say:

extern(C) void cFunc(void* ptrNotStored);

then
cFunc(&myClass);
becomes
pin (&myClass) { cFunc(&myClass); }
for heap allocated objects.

For any more complex demands, such as the C side storing a pointer, the
user is responsible for pinning the appropriate data.
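
What the compiler would do automatically can be imitated by hand with a
wrapper template. This is only a sketch of the idea: pinnedCall is a
hypothetical name, gcpin/gcunpin are counting stand-ins, and cFunc is
defined in D here purely so the example is self-contained (in real use
it would be a declaration of a function implemented in C).

```d
// Stand-ins for the proposed functions; they only count active pins.
int activePins;
void gcpin(void* ptr)   { ++activePins; }
void gcunpin(void* ptr) { --activePins; }

// Stand-in for a C function that does not store its argument.
// It reports how many pins are active while it runs.
extern (C) int cFunc(void* ptrNotStored) { return activePins; }

// Hypothetical wrapper: pins every pointer argument for the duration
// of the call, then unpins, mirroring `pin (p) { cFunc(p); }`.
auto pinnedCall(alias func, Args...)(Args args)
{
    foreach (arg; args)
    {
        static if (is(typeof(arg) : const(void)*))
            gcpin(cast(void*) arg);
    }
    scope (exit)
    {
        foreach (arg; args)
        {
            static if (is(typeof(arg) : const(void)*))
                gcunpin(cast(void*) arg);
        }
    }
    return func(args);
}
```

A call such as pinnedCall!cFunc(&obj) then holds the pin only while the
C function executes. Pointers that the C side keeps afterwards still
need a manual gcpin, as described above.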

Any thoughts?  I really think this is urgent; I don't want to see the D
community close itself off from a more efficient GC.
Jul 20 2006
next sibling parent reply Dave <Dave_member pathlink.com> writes:
Jason O'Brien wrote:
 It's not that any of us need to pin right now, but that the standard
 puts anyone using C interop in a hard place: It's not possible to write
 forward-compatible gc safe code.  We want to, but we can't, and so we
 don't.  No one is right now, and I'm afraid the more broken code we all
 write the less likely it is a more efficient copying collector will ever
 be worthwhile, and if it does get done all the more code we have to 
 rewrite.
 
 I have a couple of ideas, but whatever we decide on we should do it soon.
 
The concern is valid and your ideas are good ones IMO. But for now we can
always get around using the GC by using malloc/free, etc.

Another idea: why not just make it easier to alloc/dealloc using the CRT
and place those into "auto blocks" (for now)?

import std.stdarg, std.c.stdlib : malloc, free;

extern(C)
{
    char* strncpy(char*,char*,size_t);
    size_t strlen(char*);
}

void main()
{
    size_t len = 100;
    with(new SafeCStr(len,len))
    {
        const char[] str = "abcdefg...";
        char* s = ptrs[0], d = ptrs[1];
        strncpy(s,str,str.length);
        s[str.length] = 0; // strncpy adds no terminator when n == str.length
        strncpy(d,s,str.length);
        d[str.length] = 0;
        printf("%s: %d\n",d,strlen(d));
    }
}

// Auto class that owns one malloc'ed buffer per constructor argument
// and frees them all in its destructor.
auto class CAlloc(T)
{
    T*[] ptrs;

    this(...)
    {
        for(size_t idx = 0; idx < _arguments.length; idx++)
        {
            if(_arguments[idx] != typeid(int) && _arguments[idx] != typeid(size_t))
                throw new Exception("wrong type of arg");
            long len = va_arg!(int)(_argptr);
            if(len < 0)
                throw new Exception("invalid allocation length");
            ptrs ~= cast(T*)malloc(len * T.sizeof);
            if(ptrs[idx] is null)
                throw new Exception("allocation failed");
        }
    }

    ~this()
    {
        foreach(inout ptr; ptrs)
        {
            if(ptr)
            {
                free(ptr);
                ptr = null;
            }
        }
    }
}

alias CAlloc!(char) SafeCStr;
Jul 20 2006
parent reply S. <S._member pathlink.com> writes:
Why does that have to be a template?  Can't you deduce the size from the
typeinfo's of the variadic function?

-SC

Jul 21 2006
parent Dave <Dave_member pathlink.com> writes:
S. wrote:
 Why does that have to be a template?  Can't you deduce the size from the
 typeinfo's of the variadic function?
 
Gimme a break - it was a 10 minute demo. <g>

Two usability reasons though (IMHO): a) you'd need to pass the TypeInfo
into the ctor and b) you'd need to cast the returned pointers.

//with(new SafeCStr(len,len))
with(new CAlloc(typeid(char),len,len))
{
    char[] str = "abcdefg...";
    //char* s = ptrs[0], d = ptrs[1];
    char* s = cast(char*)ptrs[0], d = cast(char*)ptrs[1];
    // ...
}

Seems easier for the end-user to just do the one-time aliases.
Jul 21 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
The Ares GC interface is a tad different from Phobos, and contains 
pin/unpin methods.  Here's a webpage describing it:

http://svn.dsource.org/projects/ares/trunk/doc/ares/std/memory.html
Jul 23 2006
parent Jason O'Brien <xergen hotmail.com> writes:
Sean Kelly wrote:
 The Ares GC interface is a tad different from Phobos, and contains 
 pin/unpin methods.  Here's a webpage describing it:
 
 http://svn.dsource.org/projects/ares/trunk/doc/ares/std/memory.html
I'm glad to see Ares has them :) However, from what I could tell it will be
a while before Ares replaces Phobos :( Adding the dummy functions would be
very helpful.

(I'd also like to see the pin keyword in the language, for reasons almost
eerily similar to the GC vs. manual memory debate. It would avoid memory
leaks for the common case, which is per-function pinning. It also obviously
doesn't have any overhead over the manual alternative :))

I'd also think a copying collector probably shouldn't collect pinned
objects, so for compatibility this one shouldn't either (by scanning the
pin list for roots). That would make language interop and ownership
transfer more intuitive (if you temporarily don't want an object to move,
you probably don't want it to be collected in the meantime either). You
could do that yourself, of course, with addRoot, which should work in
either case; I just think it's less intuitive.

Obviously the overhead of this could be avoided if the compiler were
pin-aware (you don't need to root C function params, for example). I'm not
sure whether it would hurt the performance of a non-copying GC, but a
copying one could simply search the pin list for roots.
Jul 25 2006
prev sibling parent Kirk McDonald <kirklin.mcdonald gmail.com> writes:
Jason O'Brien wrote:
 It's not that any of us need to pin right now, but that the standard
 puts anyone using C interop in a hard place: It's not possible to write
 forward-compatible gc safe code.  We want to, but we can't, and so we
 don't.  No one is right now, and I'm afraid the more broken code we all
 write the less likely it is a more efficient copying collector will ever
 be worthwhile, and if it does get done all the more code we have to 
 rewrite.
[snip]
 any thoughts?  I really think this is urgent, I don't want to see the D
 community close itself off from a more efficient gc
Yes, if nothing else we need gcpin and gcunpin dummy functions in Phobos.
Pyd will need this functionality whenever a GC-allocated object is wrapped
and sent into Python. (It needs to pin the object for the duration of its
stay in the Python interpreter. As soon as the Python wrapper object is
collected, it can unpin the object.)

Proper nested pin support is important: a user might wrap the same D object
with multiple Python objects, and these might be collected by Python at any
time and in any order. However, updating Pyd to do this is reasonably
simple, as I believe all of this only needs to be done in one place. (The
same place where I keep a reference to all of the D objects currently
living in Python; that and pinning are two parts of the same operation.)

--
Kirk McDonald
Pyd: Wrapping Python with D
http://dsource.org/projects/pyd/wiki
Jul 24 2006