digitalmars.D - draft proposal for ref counting in D
This is an email conversation we had last summer. It's of general interest, so reposted here with permission. We didn't reach any conclusions, but there's a lot of good stuff in here, and it's particularly relevant to other recent threads.
Oct 09 2013
This is based on n.g. discussions and ideas from you guys. I'll redo it as a DIP if it passes the smoke test from y'all.

----------------------------------------------------------------------
Adding Reference Counting to D

D currently supports manual memory management and generalized GC. Unfortunately, the pausing and memory consumption inherent in GC is not acceptable for many programs, and manual memory management is error-prone, tedious, and unsafe. A third style, reference counting (RC), addresses this. Common implementations are COM's AddRef/Release, Objective-C's ARC, and C++'s shared_ptr<>. None of these three schemes are guaranteed memory-safe; they all require the programmer to conform to a protocol (even O-C's ARC). Stepping outside of the protocol results in memory corruption.

A D implementation must make it possible to use ref counted objects in code marked as safe, although it will be necessary for the implementation of those objects to be unsafe.

Some aspects of a D implementation are inevitable:

1. Avoid any requirement for more pointer types. This would cause drastic increases in complexity for both the compiler and the user. It may make generic code much more difficult to write.
2. Decay of a ref-counted pointer to a non-ref-counted pointer is unsafe, and can only be allowed (in safe code) in circumstances where it can be statically proven to be safe.
3. A ref counted object is inherently a reference type, not a value type.
4. The compiler needs to know about ref counted types.

==Proposal==

If a class contains the following methods, in either itself or a base class, it is an RC class:

    T AddRef();
    T Release();

An RC class is like a regular D class with these additional semantics:

1. In safe code, casting (implicit or explicit) to a base class that does not have both AddRef() and Release() is an error.
2. Initialization of a class reference causes a call to AddRef().
3. Assignment to a class reference causes a call to Release() on its original value and AddRef() on the new value.
4. Null checks are done before calling any AddRef() or Release().
5. Upon scope exit of all RC variables or temporaries, a call to Release() is performed, analogously to the destruction of struct variables and temporaries.
6. If a class or struct contains RC fields, calls to Release() for those fields will be added to the destructor, and a destructor will be created if one doesn't exist already.
7. If a closure is created that contains RC fields, either a compile time error will be generated or a destructor will be created for it.
8. Explicit calls to AddRef/Release will not be allowed in safe code.
9. A call to AddRef() will be added to any argument passed as a parameter.
10. Function returns have an AddRef() already done to the return value.
11. The compiler can elide any AddRef()/Release() calls it can prove are redundant.
12. AddRef() is not called when passed as the implicit 'this' reference.
13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.
14. RC objects will still be allocated on the GC heap - this means that a normal GC run will reap RC objects that are in a cycle, and RC objects will get automatically scanned for heap references with no additional action required by the user.

==Existing Code==

D COM objects already have AddRef() and Release(). This proposal should not break that code, it'll just mean that there will be extra AddRef()/Release calls made. Calling AddRef()/Release() should never have been allowed in safe code anyway. Any other existing uses of AddRef()/Release() will break.

==Arrays==

Built-in arrays have no place to put a reference count. Ref counted arrays would hence become a library type, based on a ref counted class with overloaded operators for the array operations.

==Results==

This is a very flexible approach, allowing for support of general RC objects, as well as specific support for COM objects and Objective-C ARC. AddRef()/Release()'s implementation is entirely up to the user or library writer. safe code can be guaranteed to be memory safe, as long as AddRef()/Release() are correctly implemented.
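To make the ==Existing Code== point a bit more concrete, here is a minimal D sketch of a COM-style class that the rule above would pick up as an RC class. The interface shown is a simplified stand-in with invented names (the real IUnknown declared in druntime also has QueryInterface(), Windows linkage, and ULONG return types); nothing below is part of the proposal itself.

    // Simplified stand-in for the COM protocol, for illustration only.
    interface IUnknownLike
    {
        uint AddRef();
        uint Release();
    }

    // Any class with AddRef()/Release() in itself or a base class - such as
    // a COM object implementing this interface - is an RC class under the
    // proposal, and the compiler would start inserting calls for it.
    class MyComObject : IUnknownLike
    {
        private uint refs = 1;
        uint AddRef() { return ++refs; }
        uint Release()
        {
            auto r = --refs;
            if (r == 0)
            {
                // a real COM object would finalize/free itself here
            }
            return r;
        }
    }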
Oct 09 2013
Steven Schveighoffer wrote:

Looks like a good start. A few things:

1. Proposal point 3, you need to AddRef first, THEN Release the original. Reason being, you could be assigning a reference the same value as before. In this case, if you decrement *first*, you might reduce the reference count to 0 and free the object before you increment. In most cases, the AddRef and Release are elided, so it's not bad if you do this.

I wonder if it's not a good idea to have a RefAssign function that takes two RC objects and does a check to see if they are the same before doing AddRef and Release, to help with this issue. Calls where the compiler can prove they are the same value can be elided.

2. AddRef and Release should be treated not as function calls, but as callables. That is, if for some reason AddRef and Release should be aliases, this does not detract from the solution. The only requirement should be that they cannot be UFCS, as that doesn't make any sense (one cannot add reference counting after the fact to an object).

I'm thinking of the Objective-C objects, whose functions are "release" and "retain". It would be good to use the names Objective-C coders are used to.

3. Objective-C ARC uses a mechanism called auto-release pools that helps cut down on the release/retain calls. It works like this:

    @autoreleasepool { // create a new pool
        NSString *str = [NSString stringWithFormat: @"an int: %d", 1];
        @autoreleasepool { // create a new pool on the "pool stack"
            NSDate *date = [NSDate date];
            {
                NSDate *date2 = [NSDate date];
            }
        } // auto-release date and date2
    } // auto-release str

In this way, it's not the pointer going out of scope that releases the object, it's the release pool going out of scope that releases the object. These release pools themselves are simply RC objects that call release on all their objects (in fact, prior to the @autoreleasepool directive, you had to manually create and destroy these pools).

The benefit of this model is that, inside a pool, you can move around auto-released objects at will, pass them into functions, return them from functions, etc., without having to retain or release them for each assignment. It's kind of like a mini-GC.

It works by convention: any function that returns an RC object and is called 'new...', 'init...', 'alloc...', or 'copy...' is assumed to return the object with its retain count incremented on behalf of the calling scope. This means, if you assign it to a member variable, for instance, you do not have to retain the object again, and if it goes out of scope, you must call release on it.

All other functions return 'auto-released' objects, i.e. objects which have been queued in the latest auto release pool. The compiler knows this and can elide more of the releases and retains.

Would this be a mechanism that's worth putting in? I think it goes really well with something like TempAlloc. I don't think we should use convention, though...

-Steve
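To make the first two suggestions above concrete, here is a hedged sketch of what such a RefAssign helper might look like in D. The name RefAssign and the constraint are invented for this example; nothing here is part of the proposal, and a compiler implementing it could elide the whole call where it can prove both sides refer to the same object.

    // Hypothetical assignment helper: retain the new value before releasing
    // the old one, and skip both when the references are identical.
    void RefAssign(T)(ref T dest, T src)
        if (is(T == class)
            && __traits(hasMember, T, "AddRef")
            && __traits(hasMember, T, "Release"))
    {
        if (dest is src)
            return;            // same object: skip the AddRef/Release pair entirely
        if (src !is null)
            src.AddRef();      // retain the new value first...
        if (dest !is null)
            dest.Release();    // ...then release the old value
        dest = src;
    }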
Oct 09 2013
On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
> Looks like a good start. A few things:
>
> 1. Proposal point 3, you need to AddRef first, THEN Release the original. Reason being, you could be assigning a reference the same value as before. In this case, if you decrement *first*, you will decrement the reference, which might reduce it to 0, and free the object before you increment. In most cases, the AddRef and Release are elided, so it's not bad if you do this.

Yeah, I got that backwards, and I should know better.

> I wonder if it's not a good idea to have a RefAssign function that takes two RC objects and does a check to see if they are the same before doing AddRef and Release to help with this issue. Calls where the compiler can prove they are the same value can be elided.

I'd like to see how far we get with just AddRef/Release first, and getting the semantics of them right first.

> 2. AddRef and Release should be treated not as function calls, but as callables. That is, if for some reason AddRef and Release should be aliases, this does not detract from the solution.

Yes, just like the names for Ranges are used.

> Only requirement should be that they cannot be UFCS, as that doesn't make any sense (one cannot add reference counting after the fact to an object).

That would be covered by disallowing explicit calls to AddRef/Release in safe code.

> I'm thinking of the Objective-C objects, whose functions are "release" and "retain". It would be good to use the names Objective-C coders are used to.

AddRef/Release are the COM names. It's trivial to have one wrap the other. I picked AddRef/Release because I'm familiar with their semantics, and am not with O-C.

> 3. Objective-C ARC uses a mechanism called auto-release pools that help cut down on the release/retain calls. [...] Would this be a mechanism that's worth putting in? I think it goes really well with something like TempAlloc. I don't think we should use convention, though...

I agree with not relying on convention. But also reserving the new*, init*, alloc* and copy* namespaces seems excessive for D.

As for autoreleasepool, it is relying on a convention that its fields are not leaking. I don't see how we can enforce this.
Oct 09 2013
I have overlooked addressing what happens when you pass an RC ref to a pure function. Is the pure function allowed to call AddRef()/Release()? Not sure.
Oct 09 2013
Steven Schveighoffer wrote:

That's a resounding yes. Consider that this is allowed:

    class X {}
    struct S
    {
        X foo;
        void setFoo(X newfoo) pure { foo = newfoo; }
    }

If X is ref-counted, you HAVE to increment the ref count. The only issue here is, ref counting may have to access global data. But we already have exceptions for memory management, even for strong-pure functions.

-Steve

On Jun 25, 2013, at 4:48 PM, Walter Bright wrote:
> I have overlooked addressing what happens when you pass an RC ref to a pure function. Is the pure function allowed to call AddRef()/Release()? Not sure.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 25, 2013, at 4:44 PM, Walter Bright wrote:
> On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
>> Would this be a mechanism that's worth putting in? I think it goes really well with something like TempAlloc. I don't think we should use convention, though...
>
> I agree with not relying on convention. But also reserving the new*, init*, alloc* and copy* namespaces seems excessive for D.
>
> As for autoreleasepool, it is relying on a convention that its fields are not leaking. I don't see how we can enforce this.

I don't think the autoreleasepool is relying on convention, it's simply giving the compiler a way to elide careful tracking of temporaries' reference counts. It was definitely of more use when manual reference counting was done, because one only had to worry about retaining non-temporary data in that case. But the compiler can make the same optimizations (and does in the ARC version of Objective-C). Consider the following code:

    NSString *str = [NSString stringWithFormat: @"%d", 5];
    // translating to D, that would be something like:
    NSString str = NSString.stringWithFormat("%d", 5);

stringWithFormat is a class method that gives you back a temporary string. You are not asserting ownership, you are just assigning to a variable. Now, if you wanted to do some fancy stuff with str, we could do:

    {
        NSString str2;
        {
            NSString str = NSString.stringWithFormat("%d", 5);
            if(condition) str2 = str;
            if(otherCondition)
            {
                NSString str3 = str;
                str = NSString.stringWithFormat("%d", 6);
            }
        }
        str2 = str;
    }

Now, in all this mess, how is the compiler to sort out the AddRefs and Releases? Most likely, it will end up adding more than it needs to. But with an autorelease pool, it's like you did this:

    AutoReleasePool arp;
    {
        NSString str2;
        {
            NSString str = NSString.stringWithFormat("%d", 5);
            arp.add(str);
            if(condition) str2 = str;
            if(otherCondition)
            {
                NSString str3 = str;
                str = NSString.stringWithFormat("%d", 6);
                arp.add(str);
            }
        }
        str2 = str;
    }
    arp.drain(); // releases both strings used, regardless of the now-out-of-scope variables

Essentially, they are only retained when created, and because they go out of scope, they are no longer needed. The compiler can surmise that because the fields aren't leaving the scope, it doesn't have to retain them. If it does see that, it adds a retain. Then, it can release them all at once.

In fact, this could be done automatically, but you have to allocate a place to put these 'scheduled for release' things. In Cocoa, the main event loop has an auto release pool, and you can add them manually wherever you wish for more fine-grained memory management (that is, if you wanted to free objects before you left the event loop).

Note that in Objective-C, they use those naming conventions to determine whether an object is auto-released or not. But we could make sure it's *always* auto-released, as we don't have the historical requirements that Objective-C has.

The question is, does it make sense to use this technique to "lump together" deallocations instead of conservatively calling retain/release wherever you assign variables (like C++ shared_ptr)? And a further question is whether the compiler should pick those points, or whether they should be picked manually.

-Steve
Oct 09 2013
If autoreleasepool is just a handy way to lump together Release() calls, then that is quite unnecessary if the compiler inserts calls to Release() automatically. If it is, instead, a promise that members of autoreleasepool do not leak references outside of that object, then this is very problematic for D to guarantee - and guarantee it, it must. I.e. it's "escape analysis" in another disguise.

I think the compiler should pick where to put the Release() calls, that is the whole point of ARC. If the compiler can do sufficient escape analysis to determine that the calls can be elided, so much the better.
Oct 09 2013
On Jun 25, 2013, at 5:28 PM, Walter Bright wrote:
> If autoreleasepool is just a handy way to lump together Release() calls, then that is quite unnecessary if the compiler inserts calls to Release() automatically. If it is, instead, a promise that members of autoreleasepool do not leak references outside of that object, then this is very problematic for D to guarantee - and guarantee it, it must. I.e. it's "escape analysis" in another disguise.
>
> I think the compiler should pick where to put the Release() calls, that is the whole point of ARC. If the compiler can do sufficient escape analysis to determine that the calls can be elided, so much the better.

I'm not sure exactly what is required for ARC to guarantee proper memory management (whether it requires flow-analysis or not), but it seems to work quite well for Objective-C. I think it helps minimize the expensive release/retain calls when you can just say "oh, someone else will clean that up later", just like you can with a GC.

It might be good for someone who knows the ARC eliding techniques that clang uses to explain how they work. We certainly shouldn't ignore those techniques.

-Steve
Oct 09 2013
On 6/25/2013 2:47 PM, Steven Schveighoffer wrote:
> I'm not sure exactly what is required for ARC to guarantee proper memory management (whether it requires flow-analysis or not), but it seems to work quite well for Objective-C. I think it helps minimize the expensive release/retain calls when you can just say "oh, someone else will clean that up later", just like you can with a GC.
>
> It might be good for someone who knows the ARC eliding techniques that clang uses to explain how they work. We certainly shouldn't ignore those techniques.

Also remember that O-C doesn't guarantee memory safety, so they are freed from some of the constraints we operate under. They can say "don't do that", we can't. C++ shared_ptr<> is memory safe as long as you don't escape a pointer - and no C++ compiler checks for that. COM is also memory safe as long as you carefully follow the conventions - and again, no C++ compiler checks it.
Oct 09 2013
> On 6/25/2013 2:47 PM, Steven Schveighoffer wrote:
>> I'm not sure exactly what is required for ARC to guarantee proper memory management (whether it requires flow-analysis or not), but it seems to work quite well for Objective-C. I think it helps minimize the expensive release/retain calls when you can just say "oh, someone else will clean that up later", just like you can with a GC.
>>
>> It might be good for someone who knows the ARC eliding techniques that clang uses to explain how they work. We certainly shouldn't ignore those techniques.
>
> Also remember that O-C doesn't guarantee memory safety, so they are freed from some of the constraints we operate under. They can say "don't do that", we can't.

I'm not sure that it doesn't. At least when we are talking about object references. The only thing clang complains about is when you try to call any memory management manually, or if you disobey the naming conventions.

-Steve
Oct 09 2013
Michel Fortin wrote:

On June 25, 2013 at 17:20, Steven Schveighoffer wrote:
> On Jun 25, 2013, at 4:44 PM, Walter Bright wrote:
>> On 6/25/2013 1:19 PM, Steven Schveighoffer wrote:
>>> Would this be a mechanism that's worth putting in? I think it goes really well with something like TempAlloc. I don't think we should use convention, though...
>>
>> I agree with not relying on convention. But also reserving the new*, init*, alloc* and copy* namespaces seems excessive for D.
>>
>> As for autoreleasepool, it is relying on a convention that its fields are not leaking. I don't see how we can enforce this.
>
> I don't think the autoreleasepool is relying on convention, it's simply giving the compiler a way to elide careful tracking of temporaries' reference counts.

Not at all. Autorelease pools were useful at a time before ARC so you wouldn't have to think of manually releasing every object that called functions returned to you. Instead, most functions would return an autoreleased object and you'd only have to retain those objects you were storing elsewhere.

Nowadays, with ARC, we still have them, but that's mostly for interoperability with already existing code. Most functions still return autoreleased objects because that's the convention, and breaking that convention would cause objects to be retained or released too many times. So we still need autorelease pools. But ARC is hard at work[^1] behind the scenes to reduce the number of those autoreleased objects.

So no, we shouldn't introduce autorelease pools to D... well, except maybe for the part where we want interoperability with Objective-C (because we have no choice).

And finally, there's nothing unsafe with autorelease pools as long as you don't keep an unretained reference to an autoreleased object when the pool drains. Making sure you have no unretained reference is ARC's job, so with ARC it should be no problem.

[^1]: One clever trick ARC does: inside the implicit call to objc_returnAutoreleased that it adds at the end of an autoreleasing function, the runtime checks to see if the return address points to an instruction that'll call objc_retain on that same pointer. If that's the case, it skips the autorelease and also skips objc_retain and goes to the next instruction directly. Of course if the convention was always to return objects retained, none of this would be needed. I saw that explained in a WWDC video a couple of years back.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 25, 2013, at 9:31 PM, Michel Fortin wrote:
> On June 25, 2013 at 17:20, Steven Schveighoffer wrote:
>> I don't think the autoreleasepool is relying on convention, it's simply giving the compiler a way to elide careful tracking of temporaries' reference counts.
>
> Not at all. Autorelease pools were useful at a time before ARC so you wouldn't have to think of manually releasing every object that called functions returned to you. Instead, most functions would return an autoreleased object and you'd only have to retain those objects you were storing elsewhere.

Having used MRC, I appreciate what autoreleasepool did, but I thought of it also as a kind of blanket way to allow the compiler to remove extra retains/releases in ARC. Is it not advantageous to release a whole pool of objects vs. releasing them individually during execution? All releases and retains are atomic, so I figured one could do some optimization when it's all lumped together.

I find the autorelease pools very GC-like -- you don't have to worry who uses or forgets the reference, it's kept in memory until you don't need it.

Anyway, everything I know about Obj-C ARC I learned from my iOS 5 book :) So don't take me as an expert.

-Steve
Oct 09 2013
Michel Fortin wrote:

On June 25, 2013 at 21:40, Steven Schveighoffer wrote:
> On Jun 25, 2013, at 9:31 PM, Michel Fortin wrote:
>> Not at all. Autorelease pools were useful at a time before ARC so you wouldn't have to think of manually releasing every object that called functions returned to you. Instead, most functions would return an autoreleased object and you'd only have to retain those objects you were storing elsewhere.
>
> Having used MRC, I appreciate what autoreleasepool did, but I thought of it also as a kind of blanket way to allow the compiler to remove extra retains/releases in ARC.
>
> Is it not advantageous to release a whole pool of objects vs. releasing them individually during execution? All releases and retains are atomic, so I figured one could do some optimization when it's all lumped together.

I haven't done any benchmarking, but I'd have to assume it is more advantageous to just return objects retained, since Apple went to great lengths to make sure this can happen even when the convention is to return autoreleased. There's no question it also simplifies the compiler. It's much easier to reason about pairs of retain/release than retain/autorelease.

> I find the autorelease pools very GC-like -- you don't have to worry who uses or forgets the reference, it's kept in memory until you don't need it.

The concept was truly great, no doubt about that.
Oct 09 2013
On 6/25/2013 6:31 PM, Michel Fortin wrote:
> And finally, there's nothing unsafe with autorelease pools as long as you don't keep an unretained reference to an autoreleased object when the pool drains.

Well, that's exactly the issue - an escaping reference.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 25, 2013, at 11:04 PM, Walter Bright wrote:
> On 6/25/2013 6:31 PM, Michel Fortin wrote:
>> And finally, there's nothing unsafe with autorelease pools as long as you don't keep an unretained reference to an autoreleased object when the pool drains.
>
> Well, that's exactly the issue - an escaping reference.

Read the next sentence after your quote though: "Making sure you have no unretained reference is ARC's job, so with ARC it should be no problem."

So with ARC, it's not unsafe. I think that was the ultimate point.

-Steve
Oct 09 2013
On 6/25/2013 8:08 PM, Steven Schveighoffer wrote:
> On Jun 25, 2013, at 11:04 PM, Walter Bright wrote:
>> On 6/25/2013 6:31 PM, Michel Fortin wrote:
>>> And finally, there's nothing unsafe with autorelease pools as long as you don't keep an unretained reference to an autoreleased object when the pool drains.
>>
>> Well, that's exactly the issue - an escaping reference.
>
> Read the next sentence after your quote though: "Making sure you have no unretained reference is ARC's job, so with ARC it should be no problem."
>
> So with ARC, it's not unsafe. I think that was the ultimate point.

Yes, I read the next sentence. What it means to me is that autorelease pools add nothing when ARC is implemented - no optimizations, either. Either that or I don't understand how ARC figures out that there are no escaping references without doing things like runtime checks.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 26, 2013, at 12:10 AM, Walter Bright wrote:
> [...]
> Yes, I read the next sentence. What it means to me is that autorelease pools add nothing when ARC is implemented - no optimizations, either. Either that or I don't understand how ARC figures out that there are no escaping references without doing things like runtime checks.

OK, it sounded like you were saying still that Objective-C with ARC didn't have memory safety. I'm no longer arguing about autorelease, I defer to Michel on that. I clearly didn't understand that autorelease doesn't provide any benefits for ARC, it's just there for compatibility.

-Steve
Oct 09 2013
On 6/25/2013 9:13 PM, Steven Schveighoffer wrote:
> OK, it sounded like you were saying still that Objective-C with ARC didn't have memory safety.

My first pass read of:

http://clang.llvm.org/docs/AutomaticReferenceCounting.html

indicates that ARC does not guarantee memory safety. There are a number of "undefined" behaviors when the O-C programmer does not follow the rules.
Oct 09 2013
Walter Bright:

Where are the benchmarks that show that this is a good idea in some real situations? This is the essential first step.

> If a class contains the following methods, in either itself or a base class, it is an RC class:
>
>     T AddRef();
>     T Release();

What if a programmer adds only one of those two? Currently if you add only part of the hashing protocol (or you bork a function signature) the compiler often gives no errors.

What are the plans for coalescing and optimizing away some reference count updates?

Bye,
bearophile
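As an aside on the second question, the proposal's "has both AddRef() and Release()" test amounts to a compile-time check along these lines. The name isRefCounted is made up for illustration; the proposal does not say whether supplying only one of the two methods should be diagnosed, which is the gap being pointed out here.

    // Rough compile-time formulation of the RC-class test from the proposal:
    // the class (or a base class) must have callable AddRef() and Release().
    enum isRefCounted(T) = is(T == class)
        && is(typeof(T.init.AddRef()))
        && is(typeof(T.init.Release()));

    class NotRC  {}
    class HalfRC { int AddRef() { return 0; } }   // only one of the two methods
    class FullRC { int AddRef() { return 0; } int Release() { return 0; } }

    static assert(!isRefCounted!NotRC);
    static assert(!isRefCounted!HalfRC);   // adding only one method is not enough
    static assert( isRefCounted!FullRC);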
Oct 09 2013
> What are the plans for coalescing and optimizing away some reference count updates?

There is now a discussion and paper on optimizing a reference counter:

http://lambda-the-ultimate.org/node/4825

Bye,
bearophile
Oct 11 2013
On 10/11/13 10:28 AM, bearophile wrote:
>> What are the plans for coalescing and optimizing away some reference count updates?
>
> There is now a discussion and paper on optimizing a reference counter:
>
> http://lambda-the-ultimate.org/node/4825

Yes, I think that's great work.

Andrei
Oct 11 2013
On Wednesday, 9 October 2013 at 22:31:34 UTC, Walter Bright wrote:
> ==Arrays==
>
> Built-in arrays have no place to put a reference count. Ref counted arrays would hence become a library type, based on a ref counted class with overloaded operators for the array operations.

You'll soon see the roadblock here with templates and type qualifiers. const(A!T) and A!const(T) have nothing to do with each other as far as the compiler is concerned, and no, we have no way around this ATM.
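A minimal illustration of the qualifier roadblock described above, using a made-up container A; it only shows that the two spellings produce unrelated types as far as the type system is concerned.

    struct A(T) { T[] data; }

    // const(A!int) and A!(const int) are two distinct template instantiations;
    // neither converts to the other implicitly.
    static assert(!is(const(A!int) : A!(const int)));
    static assert(!is(A!(const int) : const(A!int)));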
Oct 09 2013
Updated incorporating Steven's suggestion, and some comments about shared/const/mutable/purity.

-------------------------------------------------------------
Adding Reference Counting to D

D currently supports manual memory management and generalized GC. Unfortunately, the pausing and memory consumption inherent in GC is not acceptable for many programs, and manual memory management is error-prone, tedious, and unsafe. A third style, reference counting (RC), addresses this. Common implementations are COM's AddRef/Release, Objective-C's ARC, and C++'s shared_ptr<>. None of these three schemes are guaranteed memory-safe; they all require the programmer to conform to a protocol (even O-C's ARC). Stepping outside of the protocol results in memory corruption.

A D implementation must make it possible to use ref counted objects in code marked as safe, although it will be necessary for the implementation of those objects to be unsafe.

Some aspects of a D implementation are inevitable:

1. Avoid any requirement for more pointer types. This would cause drastic increases in complexity for both the compiler and the user. It may make generic code much more difficult to write.
2. Decay of a ref-counted pointer to a non-ref-counted pointer is unsafe, and can only be allowed (in safe code) in circumstances where it can be statically proven to be safe.
3. A ref counted object is inherently a reference type, not a value type.
4. The compiler needs to know about ref counted types.

==Proposal==

If a class contains the following methods, in either itself or a base class, it is an RC class:

    T AddRef();
    T Release();

An RC class is like a regular D class with these additional semantics:

1. In safe code, casting (implicit or explicit) to a base class that does not have both AddRef() and Release() is an error.
2. Initialization of a class reference causes a call to AddRef().
3. Assignment to a class reference causes a call to AddRef() on the new value followed by a call to Release() on its original value.
4. Null checks are done before calling any AddRef() or Release().
5. Upon scope exit of all RC variables or temporaries, a call to Release() is performed, analogously to the destruction of struct variables and temporaries.
6. If a class or struct contains RC fields, calls to Release() for those fields will be added to the destructor, and a destructor will be created if one doesn't exist already.
7. If a closure is created that contains RC fields, either a compile time error will be generated or a destructor will be created for it.
8. Explicit calls to AddRef/Release will not be allowed in safe code.
9. A call to AddRef() will be added to any argument passed as a parameter.
10. Function returns have an AddRef() already done to the return value.
11. The compiler can elide any AddRef()/Release() calls it can prove are redundant.
12. AddRef() is not called when passed as the implicit 'this' reference.
13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.
14. RC objects will still be allocated on the GC heap - this means that a normal GC run will reap RC objects that are in a cycle, and RC objects will get automatically scanned for heap references with no additional action required by the user.
15. The class implementor will be responsible for deciding whether or not to support sharing. Casting to shared is already disallowed in safe code, so this is only viable in system code.
16. RC objects cannot be const or immutable.
17. Can RC objects be arguments to pure functions?

==Existing Code==

D COM objects already have AddRef() and Release(). This proposal should not break that code, it'll just mean that there will be extra AddRef()/Release calls made. Calling AddRef()/Release() should never have been allowed in safe code anyway. Any other existing uses of AddRef()/Release() will break.

==Arrays==

Built-in arrays have no place to put a reference count. Ref counted arrays would hence become a library type, based on a ref counted class with overloaded operators for the array operations.

==Results==

This is a very flexible approach, allowing for support of general RC objects, as well as specific support for COM objects and Objective-C ARC. AddRef()/Release()'s implementation is entirely up to the user or library writer. safe code can be guaranteed to be memory safe, as long as AddRef()/Release() are correctly implemented.
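As a rough illustration of rules 2, 3 and 5 above, here is a sketch of the calls the compiler would insert, written out as ordinary D. The explicit calls shown are exactly what rule 8 forbids user code from writing; Widget and user are invented names for the example, and a real implementation would also apply rule 11 to elide redundant pairs.

    class Widget   // some RC class: has AddRef()/Release() per the proposal
    {
        private int count = 1;
        Widget AddRef()  { ++count; return this; }
        Widget Release() { if (--count == 0) { /* finalize */ } return this; }
    }

    void user(Widget a, Widget b)
    {
        // rule 2: initialization of a class reference calls AddRef()
        Widget w = a;
        if (w !is null) w.AddRef();

        // rule 3: assignment - AddRef() the new value, then Release() the old one
        auto old = w;
        if (b !is null) b.AddRef();
        if (old !is null) old.Release();
        w = b;

        // rule 5: scope exit releases every live RC variable
        if (w !is null) w.Release();
    }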
Oct 09 2013
Michel Fortin wrote:

While it's a start, this is hardly enough for Objective-C. Mostly for legacy reasons, most Objective-C methods return autoreleased objects (deferred release using an autorelease pool) based on a naming convention. Also, Objective-C objects can't be allocated from the D heap, so to avoid cycles we need weak pointers. More on Objective-C later.

While it's good that a direct call to AddRef/Release is forbidden in safe code, I think it should be forbidden in system code too. The reason is that if the compiler is inserting calls to these automatically and you're also adding your own explicitly in the same function, it becomes practically impossible to reason about the reference counts, short of looking at the assembly. Instead, I think you should create a noarc attribute for functions: it'll prevent the compiler from inserting any of those calls so it becomes the responsibility of the author to make those calls (which are then allowed). noarc would be incompatible with safe, obviously.

Finally, that's a nitpick but I wish you'd use function names that fit D better, such as opRetain and opRelease. Then you can add a "final void opRetain() { AddRef(); }" function to the IUnknown COM interface and we could do the same for Objective-C.

Objective-C is a special case. In Objective-C we need to know whether the returned object of a function is already retained or if it is deferred released (autoreleased). This is easily deduced from the naming convention. Occasionally, we might need to create autorelease pools too, but that can probably stay system.

(Note: all this idea of autoreleased objects might sound silly, but it was a great help before ARC, and Objective-C ARC has to be compatible with legacy code so it conforms to those conventions.)

You can easily implement ARC for COM using an implementation of ARC for Objective-C; the reverse is not true, because COM does not have this (old but still needed) concept of autorelease pools and deferred release, where you need to know at each function boundary whether returned values (including those returned by pointer arguments) are expected to be retained or not.

If I were you Walter, I would just not care about Objective-C idioms while implementing this feature at first. It'll have to be special cased anyway. Here's how I expect that'll be done:

What will need to be done later when adding Objective-C support is to add an internal "autoreleasedReturn" flag to a function that'll make codegen call "autorelease" in the callee when returning an object and "retain" in the caller where it receives an object from a function with that flag. Also, the same flag and behaviour is needed for out parameters (to mimic those cases where an object is returned by pointer). That flag will then be set automatically internally depending on the function name (only for Objective-C member functions), and it should be possible to override it explicitly with an attribute or a pragma of some sort. This is what Clang is doing, and we must match that to allow things to work.

Checking for null is redundant in the Objective-C case: that check is done by the runtime. That's of minor importance, but it might impact performance and should probably be special-cased.

With Apple's implementation of reference counting (using global hash tables protected by spin locks), it is more efficient to update many counters in one operation. The codegen for Objective-C ARC upon assignment to a variable calls "objc_storeStrong(id *object, id value)", incrementing and decrementing the two counters presumably in one operation (as well as replacing the content of the variable pointed to by the first argument with the new value).

Ideally, the codegen for Objective-C ARC in D would call the same functions so we have the same performance. This means that codegen should make a call to "objc_retain" when first initializing a variable, "objc_storeStrong" when doing an assignment, and "objc_release" when destructing a variable. As for returning autoreleased objects, there are two functions to choose from depending on whether the object needs to be retained at the same time or not. (In general, the object needs to be retained prior to autoreleasing if it comes from a variable not part of the function's stack frame.)

Here's Clang's documentation for how it implements ARC:
http://clang.llvm.org/docs/AutomaticReferenceCounting.html

Weak pointers are essential in order to break retain cycles in Objective-C where there is no GC. They are implemented with the same kind of function calls as strong pointers. Unfortunately, Apple's Objective-C implementation won't sit well with D the way it works now. Weak pointers are implemented in Objective-C by registering the address of the pointer with the runtime. This means that when a pointer is moved from one location to another, the runtime needs to be notified of that through a call to objc_moveWeak. This breaks one assumption of D, that you can move memory at will without calling anything.

While we could still implement a working weak pointer with a template struct, that struct would have to allocate a pointer on the heap (where it is guaranteed not to move) so it can store the true weak pointer recognized by the runtime. I'm not sure that would be acceptable, but at least it would work.

I feel like I should share some of my thoughts here about a broader use of reference counting in D.

First, we don't have to assume the reference counter has to be part of the object. Apple implements reference counting using global hash tables where the key is the address. It works very well. If we added a hash table like this for all memory allocated from the GC, we'd just have to find the base address of any memory block to get to its reference counter. I know you were designing with only classes in mind, but I want to point out that it is possible to reference-count everything the GC allocates if we want to.

The downside is that every assignment to a pointer anywhere has to call a function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situations (games I guess). Another downside is that if you have an object retained only by being present on the stack frame of a C function, it'd have to be explicitly retained from elsewhere.

As for pointers not pointing to GC memory, the generic addRef/release functions can ignore those pointers just like the GC ignores them today when it does its scan.

Finally, cycles can still be reclaimed by having the GC scan for them. Those scans should be less frequent, however, since most of the memory can be reclaimed through reference counting.
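To make the weak-pointer workaround described above a bit more concrete, here is a hedged sketch. The extern(C) prototypes follow the entry points documented in the Clang ARC runtime-support section linked above; the D-side names (ObjcWeak, objc_id), the malloc'd slot, and the assumption that this links and interoperates cleanly with Apple's runtime are all illustrative, not a tested binding.

    import core.stdc.stdlib : malloc, free;

    alias objc_id = void*;   // stand-in for Objective-C's id

    // Entry points named in the Clang ARC runtime-support documentation:
    extern (C) objc_id objc_initWeak(objc_id* location, objc_id value);
    extern (C) objc_id objc_loadWeakRetained(objc_id* location);
    extern (C) void    objc_destroyWeak(objc_id* location);
    extern (C) void    objc_release(objc_id value);

    // Sketch of the idea above: keep the registered weak slot at a fixed
    // malloc'd address, so D moving or copying the struct never moves the
    // slot the Objective-C runtime knows about.
    struct ObjcWeak
    {
        private objc_id* slot;

        @disable this(this);   // copying would double-register/double-free the slot

        this(objc_id obj)
        {
            slot = cast(objc_id*) malloc(objc_id.sizeof);
            *slot = null;
            objc_initWeak(slot, obj);
        }

        // Returns the object retained (or null if it was already deallocated);
        // the caller owns the +1 and must pass it to objc_release when done.
        objc_id loadRetained()
        {
            return slot ? objc_loadWeakRetained(slot) : null;
        }

        ~this()
        {
            if (slot)
            {
                objc_destroyWeak(slot);
                free(slot);
            }
        }
    }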
Oct 09 2013
On 6/25/2013 6:09 PM, Michel Fortin wrote:While its a start, this is hardly enough for Objective-C. Mostly for legacyreasons, most Objective-C methods return autoreleased objects (deferred release using an autorelease pool) based on a naming convention. Also, Objective-C objects can't be allocated from the D heap, so to avoid cycles we need weak pointers. More on Objective-C later.While it's good that a direct call to AddRef/Release is forbidden in safecode, I think it should be forbidden in system code too. The reason is that if the compiler is inserting calls to these automatically and you're also adding your own explicitly in the same function, it becomes practically impossible to reason about the reference counts, short of looking at the assembly. Instead, I think you should create a noarc attribute for functions: it'll prevent the compiler for inserting any of those calls so it becomes the responsibility of the author to make those calls (which are then allowed). noarc would be incompatible with safe, obviously. It's a good point, but adding such an attribute to the function may be too coarse, for one, and may cause composition problems, for another. Maybe just disallowing it altogether is the best solution.Finally, that's a nitpick but I wish you'd use function names that fit Dbetter, such as opRetain and opRelease. Then you can add a "final void opRetain() { AddRef(); }" function to the IUnknown COM interface and we could do the same for Objective-C. Makes sense.Objective-C is a special case. In Objective-C we need to know whether thereturned object of a function is already retained or if it is deferred released (autoreleased). This is easily deducted from the naming convention. Occasionally, we might need to create autorelease pools too, but that can probably stay system.(Note: all this idea of autoreleased objects might sound silly, but it was agreat help before ARC, and Objective-C ARC has to be compatible with legacy code so it conforms to those conventions.)You can easily implement ARC for COM using an implementation of ARC forObjective-C, the reverse is not true because COM does not have this (old but still needed) concept of autorelease pools and deferred release where you need to know at each function boundary whether returned values (including those returned by pointer arguments) whether the object is expected to be retained or not.If I were you Walter, I would just not care about Objective-C idioms whileimplementing this feature at first. It'll have to be special cased anyway. Here's how I expect that'll be done: From reading over that clang document, O-C arc is far more complex than I'd anticipated. I think it is way beyond what we'd want in regular D. It also comes with all kinds of pointer and function annotations - something I strongly want to avoid.What will need to be done later when adding Objective-C support is to add aninternal "autoreleasedReturn" flag to a function that'll make codegen call "autorelease" in the callee when returning an object and "retain" in the caller where it receives an object from a function with that flag. Also, the same flag and behaviour is needed for out parameters (to mimick those cases where an object is returned by pointer). That flag will then be set automatically internally depending on the function name (only for Objective-C member functions), and it should be possible to override it explicitly with an attribute or a pragma of some sort. This is what Clang is doing, and we must match that to allow things to work. 
I agree that this complexity should only be in O-C code.

Checking for null is redundant in the Objective-C case: that check is done by the runtime. That's of minor importance, but it might impact performance and should probably be special-cased in this case.

With Apple's implementation of reference counting (using global hash tables protected by spin locks), it is more efficient to update many counters in one operation. The codegen for Objective-C ARC upon assignment to a variable calls "objc_storeStrong(id *object, id value)", incrementing and decrementing the two counters presumably in one operation (as well as replacing the content of the variable pointed to by the first argument with the new value). Ideally, the codegen for Objective-C ARC in D would call the same functions so we have the same performance. This means that codegen should make a call to "objc_retain" when first initializing a variable, "objc_storeStrong" when doing an assignment, and "objc_release" when destructing a variable. As for returning autoreleased objects, there are two functions to choose from depending on whether the object needs to be retained at the same time or not. (In general, the object needs to be retained prior to autoreleasing if it comes from a variable not part of the function's stack frame.) Here's Clang's documentation for how it implements ARC: http://clang.llvm.org/docs/AutomaticReferenceCounting.html

Weak pointers are essential in order to break retain cycles in Objective-C where there is no GC. They are implemented with the same kind of function calls as strong pointers. Unfortunately, Apple's Objective-C implementation won't sit well with D the way it works now. Weak pointers are implemented in Objective-C by registering the address of the pointer with the runtime. This means that when a pointer is moved from one location to another, the runtime needs to be notified of that through a call to objc_moveWeak. This breaks one assumption of D: that you can move memory at will without calling anything. While we could still implement a working weak pointer with a template struct, that struct would have to allocate a pointer on the heap (where it is guaranteed not to move) so it can store the true weak pointer recognized by the runtime. I'm not sure that would be acceptable, but at least it would work.

I feel like I should share some of my thoughts here about a broader use of reference counting in D. First, we don't have to assume the reference counter has to be part of the object. Apple implements reference counting using global hash tables where the key is the address. It works very well. If we added a hash table like this for all memory allocated from the GC, we'd just have to find the base address of any memory block to get to its reference counter. I know you were designing with only classes in mind, but I want to point out that it is possible to reference-count everything the GC allocates if we want to.

D would need manual, RC and GC to coexist peacefully.

The downside is that every assignment to a pointer anywhere has to call a function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situations (games I guess). Another downside is that if an object is retained only by being present on the stack frame of a C function, it has to be explicitly retained from elsewhere.

Doesn't this make it impractical to mix vanilla C with D code? An important feature of D is this capability, without worrying about a "JNI" style interface.
As for D switching to a full refcounted GC for everything, I'm very hesitant about such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.

As for pointers not pointing to GC memory, the generic addRef/release functions can ignore those pointers just like the GC ignores them today when it does its scan. Finally, cycles can still be reclaimed by having the GC scan for them. Those scans should be less frequent however since most of the memory can be reclaimed through reference counting.
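The "ignore pointers that aren't GC memory" behaviour sketched above maps fairly directly onto the existing core.memory API, which can already report the base address of a GC allocation. Here is a minimal sketch under that assumption; the side table holding the counters (refCountOf) is an invented placeholder, not an existing druntime facility.

    import core.memory : GC;

    void genericAddRef(void* p)
    {
        void* base = GC.addrOf(p);   // null if p does not point into the GC heap
        if (base is null)
            return;                  // not GC memory: ignore it, just like the GC scan does
        // ++refCountOf(base);       // bump a counter kept in a side table keyed by the base address
    }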
Oct 09 2013
Rainer Schuetze wrote: On 25.06.2013 23:00, Walter Bright wrote:

Updated incorporating Steven's suggestion, and some comments about shared/const/mutable/purity.
-------------------------------------------------------------
Adding Reference Counting to D

Cool. I didn't expect this to be tackled so soon.

[...]

3. A ref counted object is inherently a reference type, not a value type.

As Michel also said, the reference count does not have to be inside the object itself, so we might want to allow reference counting on other types as well.

4. The compiler needs to know about ref counted types.

I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.

==Proposal==
If a class contains the following methods, in either itself or a base class, it is an RC class:
T AddRef();
T Release();

Is T typeof(this) here? I don't think we should force linking this functionality with COM, the programmer can do this with a simple wrapper.

An RC class is like a regular D class with these additional semantics:
1. In safe code, casting (implicit or explicit) to a base class that does not have both AddRef() and Release() is an error.
2. Initialization of a class reference causes a call to AddRef().
3. Assignment to a class reference causes a call to AddRef() on the new value followed by a call to Release() on its original value.

It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).

4. Null checks are done before calling any AddRef() or Release().
5. Upon scope exit of all RC variables or temporaries, a call to Release() is performed, analogously to the destruction of struct variables and temporaries.
6. If a class or struct contains RC fields, calls to Release() for those fields will be added to the destructor, and a destructor will be created if one doesn't exist already.
7. If a closure is created that contains RC fields, either a compile time error will be generated or a destructor will be created for it.
8. Explicit calls to AddRef/Release will not be allowed in safe code.
9. A call to AddRef() will be added to any argument passed as a parameter.
10. Function returns have an AddRef() already done to the return value.
11. The compiler can elide any AddRef()/Release() calls it can prove are redundant.
12. AddRef() is not called when passed as the implicit 'this' reference.

Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?

13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.

Please note that this includes slices to fixed size arrays.

14. RC objects will still be allocated on the GC heap - this means that a normal GC run will reap RC objects that are in a cycle, and RC objects will get automatically scanned for heap references with no additional action required by the user.
15. The class implementor will be responsible for deciding whether or not to support sharing.
Casting to shared is already disallowed in safe code, so this is only viable in system code.

16. RC objects cannot be const or immutable.

This is a bit of a downer. If the reference count is not within the object, this can be implemented.

17. Can RC objects be arguments to pure functions?

==Existing Code==
D COM objects already have AddRef() and Release(). This proposal should not break that code, it'll just mean that there will be extra AddRef()/Release() calls made. Calling AddRef()/Release() should never have been allowed in safe code anyway. Any other existing uses of AddRef()/Release() will break.

==Arrays==
Built-in arrays have no place to put a reference count. Ref counted arrays would hence become a library type, based on a ref counted class with overloaded operators for the array operations.

==Results==
This is a very flexible approach, allowing for support of general RC objects, as well as specific support for COM objects and Objective-C ARC. AddRef()/Release()'s implementation is entirely up to the user or library writer. safe code can be guaranteed to be memory safe, as long as AddRef()/Release() are correctly implemented.

I feel I'm hijacking this proposal, but the step to library defined read/write barriers seems pretty small. Make AddRef, Release and assignment free template functions, e.g.

void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
void ptrRelease(T,bool stackOrHeap)(T*adr);

and we are able to experiment with all kinds of sophisticated GC algorithms including RC. Eliding redundant addref/release pairs would need some extra support though, I read that LLVM does something like this, but I don't know how.
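To make rules 2, 3 and 5 of the quoted proposal more concrete, here is a hand-expanded sketch: the comments mark the calls the compiler would insert automatically, written out by hand. The class and its counting logic are invented placeholders, not part of the proposal text.

    class Resource
    {
        private int refs = 1;

        Resource AddRef()  { ++refs; return this; }
        Resource Release() { if (--refs == 0) { /* destroy and free here */ } return this; }
    }

    void use(Resource a, Resource b)
    {
        Resource r = a;     // rule 2: compiler inserts a.AddRef()
        r = b;              // rule 3: compiler inserts b.AddRef(), then Release() on the old value
        // ... use r ...
    }                       // rule 5: compiler inserts r.Release() at scope exit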
Oct 09 2013
On 6/26/2013 12:19 AM, Rainer Schuetze wrote:

As Michel also said, the reference count does not have to be inside the object itself, so we might want to allow reference counting on other types as well.

That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless. My proposal does not specify where the count actually is - the two functions can be arbitrarily implemented.

4. The compiler needs to know about ref counted types.

I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.

I don't see how this can be done without specific compiler knowledge in a memory safe way.

T AddRef(); T Release();

Is T typeof(this) here?

T is not relevant to the proposal - it's up to the specific implementation of those functions.

I don't think we should force linking this functionality with COM, the programmer can do this with a simple wrapper.

Yeah, that's Michel's suggestion, and it's a good one.

An RC class is like a regular D class with these additional semantics:
1. In safe code, casting (implicit or explicit) to a base class that does not have both AddRef() and Release() is an error.
2. Initialization of a class reference causes a call to AddRef().
3. Assignment to a class reference causes a call to AddRef() on the new value followed by a call to Release() on its original value.

It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).

Since the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.

12. AddRef() is not called when passed as the implicit 'this' reference.

Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?

No. The caller of the function still retains a reference in that thread.

13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.

Please note that this includes slices to fixed size arrays.

As I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.

16. RC objects cannot be const or immutable.

This is a bit of a downer. If the reference count is not within the object, this can be implemented.

Also, an exception could be made for the AddRef()/Release() functions.

I feel I'm hijacking this proposal, but the step to library defined read/write barriers seems pretty small. Make AddRef, Release and assignment free template functions, e.g.

void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
void ptrRelease(T,bool stackOrHeap)(T*adr);

and we are able to experiment with all kinds of sophisticated GC algorithms including RC.
Eliding redundant addref/release pairs would need some extra support though, I read that LLVM does something like this, but I don't know how.

It's pretty invasive into the code generation and performance, and could completely disrupt the C compatibility of D.
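Since the proposal leaves the AddRef()/Release() implementation to the user, one obvious user-side choice is an atomic counter. Here is a minimal sketch using core.atomic; whether a bare atomic counter is actually sufficient for shared references is exactly what is questioned further down in the thread, so treat this as one possible implementation, not a recommendation.

    import core.atomic : atomicOp;

    class SharedResource
    {
        private shared int refs = 1;

        SharedResource AddRef()
        {
            atomicOp!"+="(refs, 1);            // atomic increment
            return this;
        }

        SharedResource Release()
        {
            if (atomicOp!"-="(refs, 1) == 0)   // atomic decrement, returns the new value
            {
                // last reference gone: destroy and free here
            }
            return this;
        }
    }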
Oct 09 2013
Jacob Carlborg wrote: On Jun 25, 2013, at 11:00 PM, Walter Bright wrote:

Updated incorporating Steven's suggestion, and some comments about shared/const/mutable/purity.

It's great that you're bringing this up. I think it's a good idea to add reference counting to D. Although I don't think I can add much to the discussion, and Michel has a lot better knowledge about Objective-C than I have. All I can say is that I really want it to be compatible with Objective-C.
Oct 09 2013
On Thursday, 10 October 2013 at 00:02:09 UTC, Walter Bright wrote:

Updated incorporating Steven's suggestion, and some comments about shared/const/mutable/purity.
-------------------------------------------------------------
Adding Reference Counting to D

Automatic Reference Counting (ARC) as an alternative to D's Garbage Collector? What is its state? I think ARC is a very good direction. Thanks.

Ref link: https://wiki.dlang.org/Language_design_discussions#Automatic_Reference_Counting_.28ARC.29_as_an_alternative_to_D.27s_Garbage_Collector
Sep 18 2019
Michel Fortin wrote: Le 26-juin-2013 à 5:38, Walter Bright a écrit :

On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
As Michel also said, the reference count does not have to be inside the object itself, so we might want to allow reference counting on other types as well.

That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.

RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the latter case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.

I don't think we should force linking this functionality with COM, the programmer can do this with a simple wrapper.

Yeah, that's Michel's suggestion, and it's a good one.

It could also be done with user attributes:

void x_init(ref X x);
void x_destroy(ref X x);
void x_assign(ref X a, X b); // retains a, releases b

arc!(x_init, x_destroy, x_assign) class X {}

This way you don't have to check for null in the generated code: the standalone function is in charge of that. Also, this "assign" function here provides good opportunity for optimization in the RC implementation because you can update the two reference counts in a single operation (in the case where they're stored at the same place and are protected by the same lock). It's an improvement over making two separate function calls.

No. The caller of the function still retains a reference in that thread.

This should also apply to all function arguments. The caller is better placed than the callee to elide redundant AddRef/Release pairs, so it should be the one in charge of retaining the object for the callee.

16. RC objects cannot be const or immutable.

This is a bit of a downer. If the reference count is not within the object, this can be implemented. Also, an exception could be made for the AddRef()/Release() functions.

I'm not too fond of that idea.
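For illustration, here is one possible shape for the three hook functions behind Michel's hypothetical arc!(...) attribute. The names x_init, x_destroy and x_assign are his; everything else, including the exact calling convention assumed for x_assign, is a guess at what such hooks could do, not a defined semantics.

    class X { int refs; }

    void x_init(ref X x)
    {
        if (x !is null) ++x.refs;                       // a reference came into existence
    }

    void x_destroy(ref X x)
    {
        if (x !is null && --x.refs == 0) destroy(x);    // a reference went away
    }

    void x_assign(ref X dst, X newValue)                // assumed convention: dst = newValue
    {
        if (newValue !is null) ++newValue.refs;         // retain the new value first
        if (dst !is null && --dst.refs == 0) destroy(dst);
        dst = newValue;
    }

A real implementation could fold the two count updates of x_assign into one operation, which is the optimization opportunity Michel points out above.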
Oct 09 2013
Michel Fortin wrote: Le 26-juin-2013 à 1:36, Walter Bright a écrit :

On 6/25/2013 6:09 PM, Michel Fortin wrote:
Instead, I think you should create a noarc attribute for functions: it'll prevent the compiler from inserting any of those calls so it becomes the responsibility of the author to make those calls (which are then allowed). noarc would be incompatible with safe, obviously.

It's a good point, but adding such an attribute to the function may be too coarse, for one, and may cause composition problems, for another. Maybe just disallowing it altogether is the best solution.

Well, it's mostly required to write runtime support functions. The attribute could be more obscure so people are less tempted to use it, but if you're going to implement the ref-counting code you'll need that.

I feel like I should share some of my thoughts here about a broader use of reference counting in D. First, we don't have to assume the reference counter has to be part of the object. Apple implements reference counting using global hash tables where the key is the address. It works very well. If we added a hash table like this for all memory allocated from the GC, we'd just have to find the base address of any memory block to get to its reference counter. I know you were designing with only classes in mind, but I want to point out that it is possible to reference-count everything the GC allocates if we want to.

D would need manual, RC and GC to coexist peacefully.

The problem is how to make the three of those use the same codegen?

- Druntime could have a flag to disable/enable refcounting. It'd make the retain/release functions no-ops, but it'd not prevent the GC from reclaiming memory as it does today.
- Druntime could have a flag to disable/enable garbage collection (it already has). That'd prevent cycles from being collected, but you could use weak pointers to work around that or request a collection manually at the appropriate time.
- A noarc (or similar) attribute at the function level could be used to prevent the compiler from generating function calls on pointer assignments. You could make a whole module noarc if you want by adding " noarc:" at the top.

Here's the annoying thing: noarc is totally safe if reference counting is disabled and we rely entirely on the GC. noarc is unsafe when reference counting is enabled.

The downside is that every assignment to a pointer anywhere has to call a function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situations (games I guess). Another downside is that if an object is retained only by being present on the stack frame of a C function, it has to be explicitly retained from elsewhere.

Doesn't this make it impractical to mix vanilla C with D code? An important feature of D is this capability, without worrying about a "JNI" style interface.

It's not very different than with the GC today. If you call a C function by giving it a ref-counted pointer argument, that memory block is guaranteed to live at least for that call's lifetime (because it is retained by the caller). So simple calls to C functions are not a problem. If the C function puts that pointer elsewhere you'll need to retain it some other way, but you have to do this with the GC too. If you're implementing a callback called from C you need to care about what you return because the caller's C code won't retain it, while with the GC you could manage if C code did not store that pointer outside of the stack.
I think that's all you have to worry about.

As for D switching to a full refcounted GC for everything, I'm very hesitant about such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.

Don't let Clang intimidate you. The Clang spec is about four to five times more complicated than needed because of autoreleased objects and because it supports weak pointers. Weak pointers can be implemented as a struct template (as long as we have noarc). And all those annotations are for special cases, when you need to break the rules. You don't use them when doing normal programming, well except for __weak.
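The "weak pointer as a struct template" workaround mentioned above (and earlier in the thread) can be sketched as follows: the struct owns a small non-moving heap slot whose address is what gets registered with the Objective-C runtime. The objc_initWeak/objc_destroyWeak/objc_loadWeakRetained declarations mirror real runtime entry points; the struct itself and its layout are invented for illustration.

    extern (C) void* objc_initWeak(void** location, void* value);
    extern (C) void  objc_destroyWeak(void** location);
    extern (C) void* objc_loadWeakRetained(void** location);

    struct ObjcWeak
    {
        private void** slot;   // heap-allocated, so the address registered with the runtime never moves

        @disable this(this);   // copying would need objc_copyWeak; disabled to keep the sketch short

        this(void* obj)
        {
            import core.stdc.stdlib : malloc;
            slot = cast(void**) malloc((void*).sizeof);
            objc_initWeak(slot, obj);
        }

        ~this()
        {
            import core.stdc.stdlib : free;
            if (slot !is null) { objc_destroyWeak(slot); free(slot); }
        }

        void* load() { return objc_loadWeakRetained(slot); }   // null once the target is gone
    }

Since the ObjcWeak struct itself can be moved freely by D, only the indirectly held slot stays fixed, which is exactly the compromise Michel describes.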
Oct 09 2013
Rainer Schuetze wrote: On 26.06.2013 11:38, Walter Bright wrote:

On 6/26/2013 12:19 AM, Rainer Schuetze wrote:
I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.

I don't see how this can be done without specific compiler knowledge in a memory safe way.

I currently don't see how it can be memory safe with this proposal.

3. Assignment to a class reference causes a call to AddRef() on the new value followed by a call to Release() on its original value.

It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very differently (e.g. deferred reference counting).

Since the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.

You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.

12. AddRef() is not called when passed as the implicit 'this' reference.

Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?

No. The caller of the function still retains a reference in that thread.

Hmmm, I guess I misunderstand the proposal. Assume for example a refcounted class R and this code:

    class R : RefCounted
    {
        int _x;
        int readx() { return _x; }
    }

    int main()
    {
        R r = new R;
        return r.readx();
    }

According to 12. there is no refcounting going on when calling or executing readx. Ok, now what happens here:

    class R : RefCounted
    {
        int _x;
        int readx(C c)
        {
            c.r = null;    // "standard" rc deletes r here
            return _x;     // reads garbage
        }
    }

    class C
    {
        R r;
    }

    int main()
    {
        C c = new C;
        c.r = new R;
        return c.r.readx(c);
    }

This reads garbage or crashes if there is no reference counting going on when calling readx.

13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.

Please note that this includes slices to fixed size arrays.

As I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.

Just to clarify, I meant taking a slice of a static array that is a field of a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden?

I feel I'm hijacking this proposal, but the step to library defined read/write barriers seems pretty small. Make AddRef, Release and assignment free template functions, e.g.

void ptrConstruct(T,bool stackOrHeap)(T*adr, T p);
void ptrAssign(T,bool stackOrHeap)(T*adr, T p);
void ptrRelease(T,bool stackOrHeap)(T*adr);

and we are able to experiment with all kinds of sophisticated GC algorithms including RC. Eliding redundant addref/release pairs would need some extra support though, I read that LLVM does something like this, but I don't know how.

It's pretty invasive into the code generation and performance, and could completely disrupt the C compatibility of D.

I don't see a big difference between a free function and a member function call, though the template character of it might hurt compilation performance.

Two more notes:

- I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. pre- and post-blit actions have to be taken if the struct contains pointers to refcounted objects.

10. Function returns have an AddRef() already done to the return value.

- A refcounted reference returned from a function (including new) would have to be Released if the return value is ignored or if only used as part of an expression.
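Rainer's struct-copy note can be made concrete with a short sketch: a struct holding a reference to an RC class would need compiler-generated postblit and destructor behaviour roughly like the hand-written version below. The Obj class and its trivial AddRef/Release bodies are stand-ins for illustration only.

    class Obj
    {
        Obj AddRef()  { /* bump the count */ return this; }
        Obj Release() { /* drop the count, free at zero */ return this; }
    }

    struct Holder
    {
        Obj o;

        this(this) { if (o !is null) o.AddRef(); }    // copy of the struct: retain the referenced object
        ~this()    { if (o !is null) o.Release(); }   // destruction of the struct: release it
    }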
Oct 09 2013
On 10/10/2013 03:45 AM, Walter Bright wrote:
Rainer Schuetze wrote:
You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.

I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
Oct 10 2013
On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek wrote:On 10/10/2013 03:45 AM, Walter Bright wrote:On shared objects, yes. Local objects need no atomics at all.Rainer Schuetze wrote: You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
Oct 11 2013
On 12.10.2013 04:16, inout wrote:
On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek wrote:
On 10/10/2013 03:45 AM, Walter Bright wrote:
Rainer Schuetze wrote: You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.
I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.
On shared objects, yes. Local objects need no atomics at all.

Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion:

Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.

    shared(C) R = O;    ; refcnt of O is 1

In pseudo-assembly, missing null-checks:

    Thread1 (R = P)                      Thread2 (S = R)
                                         mov ecx,[R]
                                         ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt    ; refcnt of O now 0
    jnz done
    call delete_ebx
                                         ; thread resumed
                                         inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
Oct 11 2013
On 2013-10-12 06:16:17 +0000, Rainer Schuetze <r.sagitario gmx.de> said:On 12.10.2013 04:16, inout wrote:I think you are mixing shared references with shared objects. Atomic increment/decrement will work fine for shared objects with unshared pointers to it. But if the pointer to the object is itself shared and available from two threads (as in your example) then you need to properly serialize reads and writes to that pointer (with a mutex or through other means). Of course, then you fall into the problem that in D you are not able to set different attributes to an object and to its reference. (Hint: const(Object)ref) -- Michel Fortin michel.fortin michelf.ca http://michelf.caOn Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek wrote:Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion: Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S. shared(C) R = O; ; refcnt of O is 1 in pseudo-assembly missing null-checks: Thread1 (R = P) Thread2 (S = R) mov ecx,[R] ; thread suspended mov eax,[P] inc [eax].refcnt mov ebx,[R] mov [R],eax dec [ebx].refcnt ; refcnt of O now 0 jnz done call delete_ebx ; thread resumed inc [ecx].refcnt done: The increment on [ecx].refcnt modifies garbage.I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.On shared objects, yes. Local objects need no atomics at all.
Oct 12 2013
On 12.10.2013 14:16, Michel Fortin wrote:On 2013-10-12 06:16:17 +0000, Rainer Schuetze <r.sagitario gmx.de> said:I agree, that's why I used the term "shared reference", too ;-) If you are using only shared objects, but not shared references, you'll have to use message passing coming with its own set of synchronization operations that are not easily made lock-free.On 12.10.2013 04:16, inout wrote:I think you are mixing shared references with shared objects. Atomic increment/decrement will work fine for shared objects with unshared pointers to it. But if the pointer to the object is itself shared and available from two threads (as in your example) then you need to properly serialize reads and writes to that pointer (with a mutex or through other means).On Thursday, 10 October 2013 at 08:55:00 UTC, Robert Schadek wrote:Atomic increments/decrements are not good enough for shared references. See this example from later in the discussion: Consider a global shared reference R that holds the last reference to an object O. One thread exchanges the reference with another reference P while another thread reads the reference into S. shared(C) R = O; ; refcnt of O is 1 in pseudo-assembly missing null-checks: Thread1 (R = P) Thread2 (S = R) mov ecx,[R] ; thread suspended mov eax,[P] inc [eax].refcnt mov ebx,[R] mov [R],eax dec [ebx].refcnt ; refcnt of O now 0 jnz done call delete_ebx ; thread resumed inc [ecx].refcnt done: The increment on [ecx].refcnt modifies garbage.I would imagine the counter to be manipulated with atomic_add_and_fetch operations, so no locks are required.On shared objects, yes. Local objects need no atomics at all.Of course, then you fall into the problem that in D you are not able to set different attributes to an object and to its reference. (Hint: const(Object)ref)I think being able to distinguish these types is missing from the type system. It's even worse for the class describing TypeInfo object as it is used for both the reference and the object.
Oct 12 2013
On 2013-10-13 06:54:04 +0000, Rainer Schuetze <r.sagitario gmx.de> said:
I agree, that's why I used the term "shared reference", too ;-) If you are using only shared objects, but not shared references, you'll have to use message passing coming with its own set of synchronization operations that are not easily made lock-free.

For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value:

1. read pointer value
2. if read value is -1 go to line 1 (spin)
3. compare and swap (previously read value <-> -1)
4. if failure go to line 1 (spin)
// now pointer is "locked", its value is -1 but we have a copy of the original
5. copy pointer locally or assign to it (and update counter)
6. write back pointer value atomically to replace the -1

No mutex, but there's a spin lock so it's not good if there's contention. That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing.

-- Michel Fortin michel.fortin michelf.ca http://michelf.ca
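The six steps above translate almost mechanically into core.atomic calls. Here is a minimal sketch under the same assumption (an all-ones bit pattern is never a valid pointer); the pointer is stored as size_t bits, and the function names are invented.

    import core.atomic : cas, atomicLoad, atomicStore;

    shared size_t slot;                 // holds the pointer bits; size_t.max plays the role of -1

    // Steps 1-4: spin until we swap the current value for -1; returns the value we "locked".
    size_t acquireSlot()
    {
        for (;;)
        {
            size_t old = atomicLoad(slot);
            if (old == size_t.max)
                continue;                         // someone else holds it: spin
            if (cas(&slot, old, size_t.max))
                return old;                       // slot now reads -1, we own the old value
        }
    }

    // Step 6: write the (possibly updated) pointer bits back, releasing the "lock".
    void releaseSlot(size_t newValue)
    {
        atomicStore(slot, newValue);
    }

Step 5 (copying the pointer or assigning to it, plus the reference count update) happens between the two calls, while any competing thread spins.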
Oct 13 2013
On 13.10.2013 13:48, Michel Fortin wrote:On 2013-10-13 06:54:04 +0000, Rainer Schuetze <r.sagitario gmx.de> said:Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario. At work, I use shared pointers all the time to pass information to a real time audio thread. The scheme uses triple-buffering of pointers for a lock free safe transport from/to the real time thread. Not having to worry about these low-level locking stuff is one of the good aspects about garbage collecting.I agree, that's why I used the term "shared reference", too ;-) If you are using only shared objects, but not shared references, you'll have to use message passing coming with its own set of synchronization operations that are not easily made lock-free.For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value: 1. read pointer value 2. if read value is -1 go to line 1 (spin) 3. compare and swap (previously read value <-> -1) 4. if failure go to line 1 (spin) // now pointer is "locked", its value is -1 but we have a copy of the original 5. copy pointer locally or assign to it (and update counter) 6. write back pointer value atomically to replace the -1 No mutex, but there's a spin lock so it's not good if there's contention. That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing.
Oct 14 2013
On 2013-10-14 17:57:18 +0000, Rainer Schuetze <r.sagitario gmx.de> said:On 13.10.2013 13:48, Michel Fortin wrote:Spinning is good only when you very rarely expect contention, which is the case for me. The above code is used once per object the first time someone requests a weak pointer for it. Having contention for that just doesn't make sense in most usage patterns. But still, being curious I added a log message anytime it does actually has to spin, and I have yet to see that message once in my logs (which probably means aggressive enough unit tests are missing). If you have a lot of read accesses and rarely write to the pointer you could instead try a read-write mutex with concurrent read access. In any case, there's no solution that will be ideal in all cases. Different situations asks for different trade-offs.For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value: 1. read pointer value 2. if read value is -1 go to line 1 (spin) 3. compare and swap (previously read value <-> -1) 4. if failure go to line 1 (spin) // now pointer is "locked", its value is -1 but we have a copy of the original 5. copy pointer locally or assign to it (and update counter) 6. write back pointer value atomically to replace the -1 No mutex, but there's a spin lock so it's not good if there's contention. That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing.Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario.At work, I use shared pointers all the time to pass information to a real time audio thread. The scheme uses triple-buffering of pointers for a lock free safe transport from/to the real time thread. Not having to worry about these low-level locking stuff is one of the good aspects about garbage collecting.Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 14 2013
On Monday, 14 October 2013 at 19:42:36 UTC, Michel Fortin wrote:Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread?If you don't want any pause, concurrent GC is the way to go. This type of GC come at a cost of increased memory usage (everything is a tradeoff) but exists.
Oct 14 2013
On 2013-10-14 20:45:35 +0000, "deadalnix" <deadalnix gmail.com> said:On Monday, 14 October 2013 at 19:42:36 UTC, Michel Fortin wrote:I'm not an expert in GCs, but as far as I know a concurrent GC also requires some bookkeeping to be done when assigning to pointers, similar to ARC, and also when moving pointers, unlike ARC. So it requires hooks in the codegen that will perform atomic operations, just like ARC. Unless of course you use "fork" and scan inside the child process… but that has its own set of problems on some platforms. The only consensus we'll reach is that different projects have different needs. In theory being able to swap the GC for something else could bring everyone together. But to be able to replace the GC for another with a strategy different enough to matter (concurrent GC or ARC) you need the codegen to be different. So we can either: 1. make the codegen configurable -- which brings its own set of compatibility problems for compiled code but is good for experimentation, or 2. settle on something middle-ground performance-wise that can accommodate a couple of strategies, or 3. choose one GC strategy that ought to satisfy everybody and adapt the codegen to fit, or 4. keep everything as is. We're stuck with 4 until something else is decided. -- Michel Fortin michel.fortin michelf.ca http://michelf.caIndeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread?If you don't want any pause, concurrent GC is the way to go. This type of GC come at a cost of increased memory usage (everything is a tradeoff) but exists.
Oct 14 2013
On Monday, 14 October 2013 at 21:25:29 UTC, Michel Fortin wrote:
I'm not an expert in GCs, but as far as I know a concurrent GC also requires some bookkeeping to be done when assigning to pointers, similar to ARC, and also when moving pointers, unlike ARC. So it requires hooks in the codegen that will perform atomic operations, just like ARC.

Usual strategies include:
- When you JIT, change the function itself, to write pointers through a function that marks the old value as live.
- When AOT, always go through that function, which makes a test and marks the old value alive if this is done during a collection. This basically adds a read of a global and a test for each pointer write.
- Use the page protection mechanism and do regular writes. This can be done via fork, but also via remapping the GCed memory as COW. The tax is then more expensive, but you only pay it once per page you actually write and only when actually collecting.

The good news is that this tax is only required for objects that contain shared mutable pointers. In D, most data are thread local or immutable. D's type system is really friendly to concurrent GC, and we definitely should go in that direction.

The only consensus we'll reach is that different projects have different needs. In theory being able to swap the GC for something else could bring everyone together. But to be able to replace the GC for another with a strategy different enough to matter (concurrent GC or ARC) you need the codegen to be different. So we can either:

An ARC-like system needs a different codegen, but you can do this with regular codegen if you use page protection to detect writes.

1. make the codegen configurable -- which brings its own set of compatibility problems for compiled code but is good for experimentation, or

Bad, we will end up having different incompatible binaries.
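The second (AOT) strategy above amounts to routing every pointer store through a small barrier function. A minimal sketch of that idea follows; gcCollecting and markAlive are invented placeholders, not existing druntime symbols.

    __gshared bool gcCollecting;          // the global the barrier reads and tests

    // hypothetical hook: record the overwritten value so a running collection still sees it
    void markAlive(void* p)
    {
        // e.g. push p onto the collector's mark queue
    }

    // every pointer store in generated code would go through something like this
    void writePointer(void** slot, void* newValue)
    {
        if (gcCollecting && *slot !is null)
            markAlive(*slot);             // keep the old value alive for the current collection
        *slot = newValue;
    }

This matches the cost described above: one global read and a test per pointer write, with the real work happening only while a collection is in progress.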
Oct 14 2013
On 2013-10-14 23:45:42 +0000, "deadalnix" <deadalnix gmail.com> said:On Monday, 14 October 2013 at 21:25:29 UTC, Michel Fortin wrote:Very insightful. Thank you.I'm not an expert in GCs, but as far as I know a concurrent GC also requires some bookkeeping to be done when assigning to pointers, similar to ARC, and also when moving pointers, unlike ARC. So it requires hooks in the codegen that will perform atomic operations, just like ARC.Usual strategy include : - When you JIT, change the function itself, to write pointers through a function that mark the old value as live. - When AOT, always go throw that function, which make a test and mark alive the old value if this is done during a collection. This basically add a read to a global and a test for each pointer write. - Use the page protection mechanism and do regular write. This can be done via fork, but also via remapping the GCed memory as COW. The tax is then more expensive, but you only pay it once per page you actually write and only when actually collecting. The good news, is that this tax is only required for object that contains shared mutable pointers. In D, most data are thread local or immutable. D's type system is really friendly to concurrent GC, and we definitively should go in that direction.The only consensus we'll reach is that different projects have different needs. In theory being able to swap the GC for something else could bring everyone together. But to be able to replace the GC for another with a strategy different enough to matter (concurrent GC or ARC) you need the codegen to be different. So we can either:ARC like system need a different codegen, but you can do this with regular codegen if you use page protection to detect writes.So as I understand it, your plan would then be: - use concurrent GC using the page protection mechanism and COW to catch writes during collection - add thread-local awareness to the GC so only shared and mutable memory needs COW Makes sense. But adding thread-local awareness requires a couple of language changes. It won't work as long as people keep casting things around so you need to fix a lot of cases where casts are needed. But otherwise, seems like a good plan. I'm a little wary of the cost of doing COW during collection. Obviously the GC isn't pausing the program per-see, but it'll be slowing it down. By how much is probably dependent on what you're doing. At the very least the GC should allocate types with no pointers on separate pages from those with pointers. Also, what are the calls required to implement page protection and COW on posix? I'd like to check whether those are allowed within the OS X and iOS sandbox. For instance fork() isn't allowed for sandboxed apps. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca1. make the codegen configurable -- which brings its own set of compatibility problems for compiled code but is good for experimentation, orBad, we will end up having different incompatible binaries.
Oct 14 2013
On Tuesday, 15 October 2013 at 00:38:20 UTC, Michel Fortin wrote:
So as I understand it, your plan would then be:
- use concurrent GC using the page protection mechanism and COW to catch writes during collection
- add thread-local awareness to the GC so only shared and mutable memory needs COW

Yes, I think this is what makes the most sense for D. If the GC allows explicit free, then reference counting and other manual memory management techniques can be built on top of it.

Makes sense. But adding thread-local awareness requires a couple of language changes. It won't work as long as people keep casting things around so you need to fix a lot of cases where casts are needed.

Type qualifiers provide the necessary information. However, some practices (that are already mentioned as being undefined behavior) will become really unsafe. It also requires enriching the GC API, but users aren't supposed to use it directly :D

I'm a little wary of the cost of doing COW during collection. Obviously the GC isn't pausing the program per se, but it'll be slowing it down. By how much is probably dependent on what you're doing. At the very least the GC should allocate types with no pointers on separate pages from those with pointers.

Nothing is free, and indeed, trapping the write via memory protection is quite expensive. Hopefully, we can segregate objects that contain mutable pointers from the others and only memory-protect those. It will indeed cause trouble for code that mutates a large amount of shared pointers. I'd say that such code is probably asking for trouble in the first place, but as always, no silver bullet. I still think this solution is the one that fits D best.

Also, what are the calls required to implement page protection and COW on posix? I'd like to check whether those are allowed within the OS X and iOS sandbox. For instance fork() isn't allowed for sandboxed apps.

You need mmap, mprotect and all the signal handling machinery.
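For reference, the POSIX side of that machinery looks roughly like the sketch below: reserve the pool with mmap, drop write permission with mprotect when a collection starts, and let a SIGSEGV handler (omitted here, along with all error handling) record the dirty page and restore PROT_WRITE. This is only an outline of the calls involved, not a working collector.

    import core.sys.posix.sys.mman : mmap, mprotect,
        PROT_READ, PROT_WRITE, MAP_PRIVATE, MAP_ANON;

    void* reservePool(size_t len)
    {
        // anonymous private mapping used as the GC-managed pool
        return mmap(null, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0);
    }

    void startTrackingWrites(void* pool, size_t len)
    {
        // make the pool read-only; the first write to each page now raises SIGSEGV,
        // where a handler would mark the page dirty and re-enable PROT_WRITE for it
        mprotect(pool, len, PROT_READ);
    }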
Oct 14 2013
On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> said:mprotect is the one I'm worried about, as it lets you set the executable bit (among other things) which could be exploited to run arbitrary code. So I tested it and it seems to work fine on OS X inside the sandbox (including for setting the executable bit). I guess an executable with a reference to mprotect would probably also pass Apple's Mac App Store validation, but I haven't tested. mprotect isn't available at all with the iOS SDK. So making this collector work on iOS (and the iOS Simulator) would require a different codegen. -- Michel Fortin michel.fortin michelf.ca http://michelf.caAlso, what are the calls required to implement page protection and COW on posix? I'd like to check whether those are allowed within the OS X and iOS sandbox. For instance fork() isn't allowed for sandboxed apps.You need mmap, mprotect and all the signal handling machinery.
Oct 14 2013
On 2013-10-15 05:11, Michel Fortin wrote:mprotect isn't available at all with the iOS SDK. So making this collector work on iOS (and the iOS Simulator) would require a different codegen.I haven't tried compiling anything and I don't know if I'm looking in the correct file but this file: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS7.0.sdk/usr/include/sys/mman.h Does contain "mprotect". -- /Jacob Carlborg
Oct 15 2013
On 2013-10-15 07:28:16 +0000, Jacob Carlborg <doob me.com> said:On 2013-10-15 05:11, Michel Fortin wrote:Doesmprotect isn't available at all with the iOS SDK. So making this collector work on iOS (and the iOS Simulator) would require a different codegen.I haven't tried compiling anything and I don't know if I'm looking in the correct file but this file: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS7.0.sdk/usr/include/sys/mman.hcontain "mprotect".You're right. Yes it does exist. I was confused. Not only it does exist, but it lets you set the executable bit. I find that depressing, since I'm pretty sure App Store apps are prevented from setting the executable bit, and I'd tend to think now that they're blocking it by checking for references to mprotect it in the executable when submitting to the App Store. And by doing it this way they probably wouldn't be able to distinguish between setting the executable bit or making a page read-only. Also, someone would need to check that Windows Phone apps and Windows 8-style (Metro) apps can access mprotect (or equivalent) too. They're sandboxed just as heavily and statically checked upon submission the same way. Could some game consoles out there block it too? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> said:It will indeed cause trouble for code that mutate a large amount of shared pointers. I'd say that such code is probably asking for trouble in the first place, but as always, no silver bullet. I still think solution is the one that fit D the best.I think there's a small mistake in your phrasing, but it makes a difference. When the collector is running, it needs to know about any mutation for pointers to its shared memory pool, including pointers that are themselves thread-local but point to shared memory. So COW will be trouble for code that mutate a large amount of **pages containing pointers to shared memory**. And this which includes **pointers to immutable data** because immutable is implicitly shared. And this includes **pointers to const data** since those pointers might point to immutable (thus shared) memory. So any memory page susceptible of containing pointers to shared memory would need to use COW during collection. Which means all the thread's stacks, and also all objects with a pointer to shared, immutable, and const data. At this point I think it is fair to approximate this to almost all memory that could contain pointers. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
On Tuesday, 15 October 2013 at 11:03:01 UTC, Michel Fortin wrote:On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix gmail.com> said:No, that is the beauty of it :D Consider you have pointer from Tl -> shared -> immutable and TL -> immutable. I'm not covering TL collection here (It seem to be obvious that it doesn't require to stop the world). So the starting point is that we have the roots in all TL heaps/stacks, and we want to collect shared/immutable without blocking the worlds. TL heap may get new pointers to the shared heap, but they can only come from the shared heap itself or new allocations. At this point, you consider every new allocations as live. Reading a pointer from the shared heap and copy it to the TL heap isn't problematic in itself, but then we have a problem if this pointer is now updated in the shared heap, as the GC may never scan this pointer. This is why you need to track pointer writes to the shared heap. The write value itself isn't important : it come from either new alloc that are live, or from somewhere else in the shared heap (so it will be scanned as we track writes).It will indeed cause trouble for code that mutate a large amount of shared pointers. I'd say that such code is probably asking for trouble in the first place, but as always, no silver bullet. I still think solution is the one that fit D the best.I think there's a small mistake in your phrasing, but it makes a difference. When the collector is running, it needs to know about any mutation for pointers to its shared memory pool, including pointers that are themselves thread-local but point to shared memory. So COW will be trouble for code that mutate a large amount of **pages containing pointers to shared memory**. And this which includes **pointers to immutable data** because immutable is implicitly shared. And this includes **pointers to const data** since those pointers might point to immutable (thus shared) memory.So any memory page susceptible of containing pointers to shared memory would need to use COW during collection. Which means all the thread's stacks, and also all objects with a pointer to shared, immutable, and const data. At this point I think it is fair to approximate this to almost all memory that could contain pointers.No, only the shared one, that is the beauty of the technique. Not that I'm not making that up myself, it is how GC used to work in the Caml family for a while, and it has proven itself really efficient (in Caml family, most data are either immutable or thread local, and the shared heap typically small).
Oct 15 2013
On 2013-10-15 18:32:01 +0000, "deadalnix" <deadalnix gmail.com> said:
No, that is the beauty of it :D Consider you have pointer from TL -> shared -> immutable and TL -> immutable. I'm not covering TL collection here (It seem to be obvious that it doesn't require to stop the world). So the starting point is that we have the roots in all TL heaps/stacks, and we want to collect shared/immutable without blocking the worlds. TL heap may get new pointers to the shared heap, but they can only come from the shared heap itself or new allocations. At this point, you consider every new allocations as live. Reading a pointer from the shared heap and copy it to the TL heap isn't problematic in itself, but then we have a problem if this pointer is now updated in the shared heap, as the GC may never scan this pointer. This is why you need to track pointer writes to the shared heap. The write value itself isn't important : it come from either new alloc that are live, or from somewhere else in the shared heap (so it will be scanned as we track writes).

But you still have to scan the thread-local heaps and stacks to find pointers to shared memory. If you don't stop those threads, how do you know one of these threads isn't moving a pointer value during the scan from one place the GC still has to scan to another that the GC has just scanned, making it so the GC never sees the pointer? For instance:

    class A
    {
        // string is immutable and points to shared heap
        immutable(char)[] s;
    }

    A global;

    void func()
    {
        // GC scans the stack of this thread in the background
        // no reference to our string on the stack

        // moving string pointer to the stack
        // while the GC is running
        auto tmp = global.s;
        global.s = null;

        // GC scans the heap of this thread in the background
        // no reference to our string on the heap
        // (missed reference to our string now on the stack)
    }

The thread can move the pointer around while the GC is looking away and you'll end up with a pointer to a freed string. So you have to use COW for the thread's stack, or get notified of a pointer assignment somehow (or of a pointer move), or stop the thread while you're scanning its heap.

-- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 15 2013
Yes and no. You obviously need to scan TL heaps at some point. When doing so you'll have a set of root that allow you to scan shared/immutable heap. What is going on in the TL heap become irrelevant once you have the root. And getting the roots from the TL heap is another problem altogether (any kind of GC can be used for that, stop the world, where the world is thread local, or another concurrent GC, the important point being that it is an independent problem).
Oct 16 2013
Am 14.10.2013 21:42, schrieb Michel Fortin:
On 2013-10-14 17:57:18 +0000, Rainer Schuetze <r.sagitario gmx.de> said:
On 13.10.2013 13:48, Michel Fortin wrote:

Spinning is good only when you very rarely expect contention, which is the case for me. The above code is used once per object the first time someone requests a weak pointer for it. Having contention for that just doesn't make sense in most usage patterns. But still, being curious I added a log message anytime it actually has to spin, and I have yet to see that message once in my logs (which probably means aggressive enough unit tests are missing). If you have a lot of read accesses and rarely write to the pointer you could instead try a read-write mutex with concurrent read access. In any case, there's no solution that will be ideal in all cases. Different situations ask for different trade-offs.

For one of my projects I implemented a shared pointer like this. It uses the pointer value itself as a spin lock with the assumption that -1 is an invalid pointer value: 1. read pointer value 2. if read value is -1 go to line 1 (spin) 3. compare and swap (previously read value <-> -1) 4. if failure go to line 1 (spin) // now pointer is "locked", its value is -1 but we have a copy of the original 5. copy pointer locally or assign to it (and update counter) 6. write back pointer value atomically to replace the -1 No mutex, but there's a spin lock so it's not good if there's contention. That said, I find it extremely rare to want a shared pointer that isn't already protected by a mutex alongside other variables, or that isn't propagated using some form of message passing.

Locking is very bad if you have threads at different priorities as it might introduce priority inversion. Spinning is probably even worse in that scenario.

At work, I use shared pointers all the time to pass information to a real time audio thread. The scheme uses triple-buffering of pointers for a lock free safe transport from/to the real time thread. Not having to worry about this low-level locking stuff is one of the good aspects about garbage collecting.

Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration, how do you cope with that in a real-time thread? I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread?

Well, if real-time concurrent GC for Java systems is good enough for systems that control military missile systems, maybe it is good enough for real-time audio as well.

-- Paulo
Oct 14 2013
Am 14.10.2013 22:59, schrieb Paulo Pinto:Well, if real time concurrent GC for Java systems is good enough for systems that control military missile systems, maybe it is good enough for real-time audio as well. -- PauloThe problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook", which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.
Oct 16 2013
On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook", which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.
Oct 16 2013
On 2013-10-16 21:05, Sean Kelly wrote:I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.One needs to be very careful with memory when interfacing with C code today. What about having a function that notifies the GC that a pointer has been updated? -- /Jacob Carlborg
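A sketch of what such a notification hook might look like; gc_ptrUpdated is an invented name here, not an existing druntime function:

    // hypothetical runtime hook, called whenever a pointer field is overwritten
    extern (C) void gc_ptrUpdated(void** slot, void* newValue);

    // the compiler (or careful hand-written code at the C boundary) would
    // lower `obj.field = p;` into roughly this:
    void storePointer(T)(ref T* slot, T* newValue)
    {
        gc_ptrUpdated(cast(void**) &slot, cast(void*) newValue);
        slot = newValue;
    }

This is exactly the write-barrier support an incremental or generational collector relies on, and it is also what plain C code bypasses.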
Oct 16 2013
Am 16.10.2013 21:05, schrieb Sean Kelly:On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:I think an even bigger problem is structs: if you need write barriers for pointers on the heap, you are going to have a problem with structs, because you will never know whether they are located on the heap or the stack. Additionally, making the stack precisely scannable and adding GC points will require a lot of compiler support. And even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook", which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.
Oct 16 2013
On Oct 16, 2013, at 1:05 PM, Benjamin Thaut <code benjamin-thaut.de> wrote:Am 16.10.2013 21:05, schrieb Sean Kelly:On Oct 16, 2013, at 11:54 AM, Benjamin Thaut <code benjamin-thaut.de> wrote:The problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook", which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.I think the short version is that D being able to directly call C code is a huge problem here. Incremental GCs all rely on the GC being notified when pointers are changed. We might be able to manage it for SafeD, but then SafeD would basically be its own language.I think an even bigger problem is structs: if you need write barriers for pointers on the heap, you are going to have a problem with structs, because you will never know whether they are located on the heap or the stack. Additionally, making the stack precisely scannable and adding GC points will require a lot of compiler support. And even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
Oct 17 2013
On Thursday, 17 October 2013 at 17:11:06 UTC, Sean Kelly wrote:LLVM actually comes with quite an extensive GC support infrastructure: http://llvm.org/docs/GarbageCollection.html. As far as I'm aware, it is not widely used by the "top-tier" LLVM projects, so there might be quite a bit of work involved in getting that to run. DavidAnd even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
Oct 17 2013
Am 17.10.2013 19:16, schrieb David Nadlinger:On Thursday, 17 October 2013 at 17:11:06 UTC, Sean Kelly wrote:Uhhh, this sounds really good. They in fact have everything needed to implement a generational garbage collector. This would improve the D GC situation a lot. But reading the part about the shadow stack really lowers my expectations. That's really something you don't want. The performance impact is going to be so big that it doesn't make sense to use the better GC in the first place. Kind Regards Benjamin ThautLLVM actually comes with quite an extensive GC support infrastructure: http://llvm.org/docs/GarbageCollection.html. As far as I'm aware, it is not widely used by the "top-tier" LLVM projects, so there might be quite a bit of work involved in getting that to run. DavidAnd even if this is doable with respect to DMD, it's going to be a big problem for GDC or LDC to change the codegen.Yes, any pointer anywhere. I recall someone posting a doc about a compromise solution a few years back, but I'd have to do some digging to figure out what the approach was.
Oct 17 2013
On Thursday, 17 October 2013 at 17:28:17 UTC, Benjamin Thaut wrote:Am 17.10.2013 19:16, schrieb David Nadlinger: But reading the part about the shadow stack really lowers my expectations. Thats really something you don't want. The performance impact is so going to be so big, that it doesn't make sense to use the better GC in the first place.There's always a tradeoff. If an app is very delay-sensitive, then making the app run slower in general might be worthwhile if the delay could be eliminated.
Oct 17 2013
Am 17.10.2013 19:47, schrieb Sean Kelly:On Thursday, 17 October 2013 at 17:28:17 UTC, Benjamin Thaut wrote:Well, I just read it again, and it appears to me that the shadow stack is something they already have implemented that can be used for "gc prototyping", but if you want you can write your own code generator plugin and generate your own stack maps. It actually sounds more feasible to implement a generational GC with LLVM than with what we have in dmd. Kind Regards Benjamin ThautAm 17.10.2013 19:16, schrieb David Nadlinger: But reading the part about the shadow stack really lowers my expectations. That's really something you don't want. The performance impact is going to be so big that it doesn't make sense to use the better GC in the first place.There's always a tradeoff. If an app is very delay-sensitive, then making the app run slower in general might be worthwhile if the delay could be eliminated.
Oct 17 2013
Am 16.10.2013 20:54, schrieb Benjamin Thaut:Am 14.10.2013 22:59, schrieb Paulo Pinto:I read it when it came out in the late 90's. My main focus areas at university were compiler design, distributed systems and graphics programming. Although, like many, I tend to do plain boring enterprise applications nowadays. -- PauloWell, if real time concurrent GC for Java systems is good enough for systems that control military missile systems, maybe it is good enough for real-time audio as well. -- PauloThe problem is not that there are no GCs around in other languages which satisfy certain requirements. The problem is actually implementing them in D. I suggest that you read "The Garbage Collection Handbook", which explains this in great detail. I'm currently reading it, and I might write an article about the entire D GC issue once I'm done with it.
Oct 16 2013
On 14.10.2013 21:42, Michel Fortin wrote:Indeed. The current garbage collector makes it easy to have shared pointers to shared objects. But the GC can also interrupt real-time threads for an unpredictable duration; how do you cope with that in a real-time thread?The work I was talking about uses C++, not D, so there is no GC involved. The options I see for real-time threads in D are either a concurrent GC (which means read/write barriers for pointer accesses) or just excluding the real-time thread from suspension by the GC. This forces the programmer to ensure that references in the real-time thread are also found elsewhere. I'm not sure if this eliminates the benefits regarding locking, though.I know ARC isn't the ideal solution for all use cases. But neither is the GC, especially for real-time applications. So, which one would you recommend for a project having a real-time audio thread?ARC doesn't work for real-time threads anyway, because you are not allowed to deallocate if it can cause locks. It can only work if you defer reference counting to another thread through some buffering. Realistically, I would currently recommend the approach above: exclude the thread from suspension, and keep references to used objects elsewhere. This is probably about as difficult as avoiding allocations/deallocations in C++, but harder to debug.
Oct 15 2013
On Saturday, 12 October 2013 at 06:16:24 UTC, Rainer Schuetze wrote:in pseudo-assembly missing null-checks: Thread1 (R = P) Thread2 (S = R) mov ecx,[R] ; thread suspendedYou need a sequentially consistent write. You also need to increment the refcount BEFORE! This codegen is incorrect.mov eax,[P] inc [eax].refcntSame here.mov ebx,[R] mov [R],eax dec [ebx].refcnt ; refcnt of O now 0 jnz done call delete_ebx ; thread resumed inc [ecx].refcnt done: The increment on [ecx].refcnt modifies garbage.This can be done atomically (even with an eventually consistent increment; you don't need full sequential consistency here, but you still need a fully sequentially consistent decrement).
Oct 12 2013
On 12.10.2013 20:31, deadalnix wrote:On Saturday, 12 October 2013 at 06:16:24 UTC, Rainer Schuetze wrote:How do you increment the counter without reading its address?in pseudo-assembly missing null-checks: Thread1 (R = P) Thread2 (S = R) mov ecx,[R] ; thread suspendedYou need a sequentially consistent write. You also need to increment the refcount BEFORE! This codegen is incorrect.Same here ;-)mov eax,[P] inc [eax].refcntSame here.According to "The Garbage Collection Handbook" by Richard Jones, eager lock-free reference counting can only be done with a cas2 operation modifying two separate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately, the CAS2 operation does not exist on most processors.mov ebx,[R] mov [R],eax dec [ebx].refcnt ; refcnt of O now 0 jnz done call delete_ebx ; thread resumed inc [ecx].refcnt done: The increment on [ecx].refcnt modifies garbage.This can be done atomically (even with an eventually consistent increment; you don't need full sequential consistency here, but you still need a fully sequentially consistent decrement).
Oct 13 2013
On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:How do you increment the counter without reading its address?I assumed that the reference count was in a struct with the data, and the refcounted pointer points to it. In this case, if you remove the pointer via a sequentially consistent write (while keeping a local copy internally) and THEN decrement the counter, the other thread will access another object (or skip on a null check). Granted, the read is sequentially consistent.
Oct 13 2013
On 10/13/13 11:19, deadalnix wrote:On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:No, if you have two (or more) threads concurrently accessing the object, it is possible that one thread reads the pointer, then sleeps before incrementing the count. Then another thread comes and /destroys/ the object, the memory is reused for something else, or even unmapped. Then the first thread wakes up and incs the counter, which is no longer there, causing a crash or data corruption. But this is only a problem for shared objects, which are accessed without any locking -- it's not a common case at all, and can be dealt with by simply taking a lock *before* reading the reference. (There are many much more complex solutions, such as CAS2- or RCU-based ones.) arturHow do you increment the counter without reading its address?I assumed that the reference count was in a struct with the data, and the refcounted pointer points to it. In this case, if you remove the pointer via a sequentially consistent write (while keeping a local copy internally) and THEN decrement the counter, the other thread will access another object (or skip on a null check). Granted, the read is sequentially consistent.
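A small D sketch of that lock-before-read approach; the names (Payload, refs, slot) are illustrative and not from the posts:

    import core.atomic;
    import core.sync.mutex : Mutex;

    class Payload { shared size_t refs = 1; }

    __gshared Mutex slotLock;   // the lock protecting the shared pointer below
    __gshared Payload slot;

    shared static this() { slotLock = new Mutex; }

    // take the lock *before* reading the reference, so no other thread can
    // release the object between the read and the increment
    Payload retainFromSlot()
    {
        slotLock.lock();
        scope (exit) slotLock.unlock();
        auto p = slot;
        if (p !is null)
            atomicOp!"+="(p.refs, 1);
        return p;
    }

    // replace the pointer under the same lock; the old value is released
    // afterwards with an atomic decrement
    void assignToSlot(Payload newVal)
    {
        if (newVal !is null)
            atomicOp!"+="(newVal.refs, 1);
        slotLock.lock();
        auto old = slot;
        slot = newVal;
        slotLock.unlock();
        if (old !is null && atomicOp!"-="(old.refs, 1) == 0)
            destroy(old);   // last reference is gone
    }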
Oct 13 2013
On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:According to the "Handbook of Garbage Collection" by Richard Jones eager lock-free reference counting can only be done with a cas2 operation modifying two seperate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately the CAS2 operation does not exist in most processors.I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 13 2013
Am 13.10.2013 17:15, schrieb Sean Kelly:On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:I didn't read the paper, but I'd suspect that it refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/C++ shared_ptr only has a thread-safe reference count after all.According to "The Garbage Collection Handbook" by Richard Jones, eager lock-free reference counting can only be done with a cas2 operation modifying two separate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately, the CAS2 operation does not exist on most processors.I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads, and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 13 2013
On 13.10.2013 19:05, Sönke Ludwig wrote:Am 13.10.2013 17:15, schrieb Sean Kelly:I haven't read it either, but AFAICT the cas2 operation is used to modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr; it uses CAS operations on the reference counts. It has the same problem as described in my example; see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->", as it doesn't increment the reference count.On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:I didn't read the paper, but I'd suspect that it refers to the case where both the reference count _and_ the reference are thread-safe, since the boost/C++ shared_ptr only has a thread-safe reference count after all.According to "The Garbage Collection Handbook" by Richard Jones, eager lock-free reference counting can only be done with a cas2 operation modifying two separate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately, the CAS2 operation does not exist on most processors.I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads, and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 14 2013
On Monday, 14 October 2013 at 17:43:33 UTC, Rainer Schuetze wrote:On 13.10.2013 19:05, Sönke Ludwig wrote:I'm totally out of my depth here but can't you store the reference count adjacent to the pointer and use CMPXCHG16BAm 13.10.2013 17:15, schrieb Sean Kelly:I haven't read it either, but AFAICT the cas2 operation is used two modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr, it uses cas operations on the reference counts. It has the same problem as described in my example, see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->" as it doesn't increment the reference count.On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:I didn't read the paper, but I'd suspect that the paper refers to the case where both, the reference count _and_ the reference is thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.According to the "Handbook of Garbage Collection" by Richard Jones eager lock-free reference counting can only be done with a cas2 operation modifying two seperate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately the CAS2 operation does not exist in most processors.I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 14 2013
On Monday, 14 October 2013 at 17:50:01 UTC, John Colvin wrote:I'm totally out of my depth here but can't you store the reference count adjacent to the pointer and use CMPXCHG16BI think that can work if the refcount is stored with the pointer (ie. if a fat pointer is used) and not in the object. But that would defeat the whole point of the refcount :( It seems that Rainer Schuetze is right and that the implementation that boost::shared_ptr uses is not safe.
Oct 14 2013
On 14.10.2013 19:50, John Colvin wrote:On Monday, 14 October 2013 at 17:43:33 UTC, Rainer Schuetze wrote:This might work for a single pointer, but not if you have multiple pointers to the same object.On 13.10.2013 19:05, Sönke Ludwig wrote:I'm totally out of my depth here but can't you store the reference count adjacent to the pointer and use CMPXCHG16BAm 13.10.2013 17:15, schrieb Sean Kelly:I haven't read it either, but AFAICT the cas2 operation is used two modify the pointer and the reference count at the same time atomically. I just checked boost::shared_ptr, it uses cas operations on the reference counts. It has the same problem as described in my example, see the read/write example 3 here: http://www.boost.org/doc/libs/1_54_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety boost::shared_ptr is also unsafe with respect to calling member functions through "->" as it doesn't increment the reference count.On Sunday, 13 October 2013 at 07:03:39 UTC, Rainer Schuetze wrote:I didn't read the paper, but I'd suspect that the paper refers to the case where both, the reference count _and_ the reference is thread-safe, since the boost/c++ shared_ptr only has a thread-safe reference count after all.According to the "Handbook of Garbage Collection" by Richard Jones eager lock-free reference counting can only be done with a cas2 operation modifying two seperate locations atomically (algorithm 18.2 "Eager reference counting with CompareAndSwap is broken"). This might be the quoted paper: http://scholr.ly/paper/2199608/lock-free-reference-counting Unfortunately the CAS2 operation does not exist in most processors.I suppose it's worth noting that Boost (and now standard C++) has a shared_ptr that works across threads and the implementation I've seen doesn't use a mutex. In fact, I think the Boost one doesn't even use CAS on x86, though it's been quite a few years so my memory could be wrong on that last detail.
Oct 14 2013
Rainer Schuetze wrote: On 26.06.2013 13:12, Michel Fortin wrote:Le 26-juin-2013 à 5:38, Walter Bright a écrit :That means you have to maintain two reference counts if you have both ARC and COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 so that the object is not collected while external references exist.On 6/26/2013 12:19 AM, Rainer Schuetze wrote:RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the latter case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.As Michel also said, the reference count does not have to be inside the object itself, so we might want to allow reference counting on other types as well.That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.
Oct 09 2013
Michel Fortin wrote: Le 2013-06-26 à 17:42, Rainer Schuetze <r.sagitario gmx.de> a écrit :On 26.06.2013 13:12, Michel Fortin wrote:COM used in the same class. ARC can only release the object if both counters are 0, while the COM implementation has to add/remove roots to the GC when counting from/to 0 to avoid that the object is collected while external references exist. Yes. The "external" reference count prevents the gc from releasing memory and the "internal" reference count keeps track of the number of references from gc-scanned memory. When both fall to zero, the memory is released immediately; when the external count falls to zero, the gc can collect memory if it isn't connected to any of its roots. Note that you could implement the external count as a separate hash table that'll include a gc-scanned pointer to the object. That pointer would keep the internal count above zero as long as it is present in the table. The external count doesn't need to be formally part of the gc data structure.Le 26-juin-2013 à 5:38, Walter Bright a écrit :That means you have to maintain two reference counts if you have both ARC andOn 6/26/2013 12:19 AM, Rainer Schuetze wrote:RC is just another garbage collection scheme. You might favor it for its performance characteristics, its determinism, or the lower memory footprint. Or you might need it to interact with foreign code that relies on it (COM, Objective-C, etc.), in which case it needs to be customizable (use the foreign implementation) or be manually managed. That's two different use cases. And in the later case you can't use the GC to release cycles because foreign code is using memory invisible to the GC. It is important to note that when foreign code calls AddRef you don't want the GC to collect that object, at least not until Release is called.As Michel also said, the reference count does not have to be in inside the object itself, so we might want to allow reference counting on other types aswell.That opens the question of what is the point of other RC types? For example, C++ can throw any type - but it turns out that throwing anything but class types is largely pointless.
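A rough sketch of the "external count as a separate, GC-scanned table" idea; every name below is illustrative:

    import core.sync.mutex : Mutex;

    __gshared size_t[Object] externalCount;   // keys are GC-scanned, so entries are pinned
    __gshared Mutex tableLock;

    shared static this() { tableLock = new Mutex; }

    // called when foreign code (e.g. a COM AddRef) takes a reference the GC cannot see
    void externalRetain(Object o)
    {
        tableLock.lock();
        scope (exit) tableLock.unlock();
        ++externalCount.require(o, 0);
    }

    // called on the matching Release; once the external count reaches zero the
    // entry disappears and the GC may collect the object as usual
    void externalRelease(Object o)
    {
        tableLock.lock();
        scope (exit) tableLock.unlock();
        if (auto p = o in externalCount)
        {
            if (--(*p) == 0)
                externalCount.remove(o);
        }
    }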
Oct 09 2013
On 6/26/2013 4:15 AM, Michel Fortin wrote:Well, it's mostly required to write runtime support functions. The attributecould be more obscure so people are less tempted to use it, but if you're going to implement the ref-counting code you'll need that. I know the temptation is strong to create more attributes as an easy solution, but I'd really like to try hard to find other ways.D would need manual, RC and GC to coexist peacefully.The problem is how to make the three of those use the same codegen? - Druntime could have a flag to disable/enable refcounting. It'd make theretain/release functions no-ops, but it'd not prevent the GC from reclaiming memory as it does today.- Druntime could have a flag to disable/enable garbage collection (it alreadyhas). That'd prevent cycles from being collected, but you could use weak pointers to work around that or request a collection manually at the appropriate time.- A noarc (or similar) attribute at the function level could be used toprevent the compiler from generating function calls on pointer assignments. You could make a whole module noarc if you want by adding " noarc:" at the top.Here's the annoying thing: noarc is totally safe if reference counting isdisabled and we rely entirely on the GC. noarc is unsafe when reference counting is enabled. I don't really understand your point. My proposal enables manual, RC and GC to coexist peacefully. The GC wouldn't even know about RC.function. While this is some overhead, it is more predictable than overhead from a GC scan and would be preferred in some situation (games I guess). Another downside is you have an object retained by being present on the stack frame of a C function, it'd have to be explicitly retained from elsewhere.The downside is that every assignment to a pointer anywhere has to call afeature of D is this capability, without worrying about a "JNI" style interface.Doesn't this make it impractical to mix vanilla C with D code? An importantIt's not very different than with the GC today. If you call a C function by giving it a ref-counted pointer argument, thatmemory block is guarantied to live at least for that call's lifetime (because it is retained by the caller). So simple calls to C functions are not a problem.If the C function puts that pointer elsewhere you'll need to retain it someother way, but you have to do this with the GC too. If you're implementing a callback called from C you need to care about what you return because the caller's C code won't retain it, while with the GC you could manage if C code did not store that pointer outside of the stack.I think that's all you have to worry about.D (like C and C++) loves to manipulate pointers. Having to call a function every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.for such a step. For one thing, reading the clang spec on all the various pointer and function annotations necessary is very off-putting.As for D switching to a full refcounted GC for everything, I'm very hesitantDon't let Clang intimidate you. The Clang spec is about four to five timemore complicated than needed because of autoreleased objects and because it supports weak pointers. Weak pointers can be implemented as a struct templates (as long as we have noarc). And all those annotations are for special cases, when you need to break the rules. You don't use them when doing normal programming, well except for __weak.Ok.
Oct 09 2013
On 6/26/2013 2:33 PM, Rainer Schuetze wrote:On 26.06.2013 11:38, Walter Bright wrote:I'm a little confused about what you are referring to here.On 6/26/2013 12:19 AM, Rainer Schuetze wrote:I currently don't see how it can be memory safe with this proposal.I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.I don't see how this can be done without specific compiler knowledge in a memory safe way.compiler already splits this into two function calls, this cannot be done in the implementation. Why is it necessary to put a lock around the pair?You have to put the lock around the pair of AddRef and Release, but if theSince the implementation of AddRef()/Release() is up to the user, whether it uses locks or not and whether it supports shared or not is up to the user.3. Assignment to a class reference causes a call to AddRef() on the new value followed by a call to Release() on its original value.It might be common knowledge, but I want to point out that the usual COM implementation (atomic increment/decrement and free when refcount goes down to 0) is not thread-safe for shared pointers. That means you either have to guard all reads and writes with a lock to make the full assignment atomic or have to implement reference counting very different (e.g. deferred reference counting).class R and this codeHmmm, I guess I misunderstand the proposal. Assume for example a refcountedNo. The caller of the function still retains a reference in that thread.12. AddRef() is not called when passed as the implicit 'this' reference.Isn't this unsafe if a member function is called through the last existing reference and this reference is then cleared during execution of this member function or from another thread?class R : RefCounted { int _x; int readx() { return _x; } } int main() { R r = new R; return r.readx(); } According to 12. there is no refcounting going on when calling or executingreadx. Ok, now what happens here:class R : RefCounted { int _x; int readx(C c) { c.r = null; // "standard" rc deletes r here return _x; // reads garbage } } class C { R r; } int main() { C c = new C; c.r = new R; return c.r.readx(c); } This reads garbage or crashes if there is no reference counting going on whencalling readx. I think you're right. Grrrr!a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden? It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).Just to clarify, I meant taking a slice of a static array that is a field ofAs I suggested, arrays would not be supported with this proposal - but the user can create ref counted array-like objects.13. Taking the address of, or passing by reference, any fields of an RC object is not allowed in safe code. Passing by reference an RC field is allowed.Please note that this includes slices to fixed size arrays.call, though the template character of it might hurt compilation performance.I don't see a big difference between a free function and a member functionI feel I'm hijacking this proposal, but the step to library defined read/write barriers seems pretty small. Make AddRef, Release and assignment free template functions, e.g. 
void ptrConstruct(T, bool stackOrHeap)(T* adr, T p); void ptrAssign(T, bool stackOrHeap)(T* adr, T p); void ptrRelease(T, bool stackOrHeap)(T* adr); and we are able to experiment with all kinds of sophisticated GC algorithms, including RC. Eliding redundant AddRef/Release pairs would need some extra support, though; I read that LLVM does something like this, but I don't know how.It's pretty invasive into the code generation and performance, and could completely disrupt the C compatibility of D.Two more notes: - I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. Pre- and post-blit actions have to be taken if the struct contains pointers to refcounted objects. Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.10. Function returns have an AddRef() already done to the return value.- A refcounted reference returned from a function (including new) would have to be Released if the return value is ignored or only used as part of an expression.That's right. Just as if a function returned a struct with a destructor. The model I am basing this on is C++'s shared_ptr<>, which makes use of all the various rules around construction, assignment, and destruction. The wrinkles we have are: 1. memory safety 2. working with the GC
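For the readx() hazard conceded earlier in this post, here is a call-site sketch of how the proposal's other rules could keep the object alive; it reuses the C and R classes from the example above and is only an illustration, not part of the proposal:

    int main()
    {
        C c = new C;
        c.r = new R;
        R tmp = c.r;                 // copying into a local does an AddRef under the proposal
        int result = tmp.readx(c);   // readx may null out c.r, but tmp still holds a reference
        return result;               // tmp leaves scope: Release, and the object is freed safely
    }

In effect, the compiler would have to retain the receiver of a member-function call for the duration of the call, which is exactly what rule 12 currently skips.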
Oct 09 2013
Michel Fortin wrote: Le 26-juin-2013 à 19:48, Walter Bright a écrit :D (like C and C++) loves to manipulate pointers. Having to call a functionevery time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel. But at the same time, having a GC that stops the world at irregular interval is worse for certain kind of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class. And about pointer manipulation: for cases where you know for certain that the pointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call. The downside of optional support for language-wide reference counting is that it requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call. I'm not claiming this is the right solution. It's just an idea I wanted to mention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.
Oct 09 2013
On 6/26/2013 6:33 PM, Michel Fortin wrote:Le 26-juin-2013 à 19:48, Walter Bright a écrit :every time this is done would be a disaster. It means that people would be motivated to drop down to C to do the fast code, and we might as well throw in the towel.D (like C and C++) loves to manipulate pointers. Having to call a functionBut at the same time, having a GC that stops the world at irregular intervalis worse for certain kind of things, games notably. It's been stated often on the forum that game developers prefer increasing the overhead if it prevents those hiccups. Making everything in D reference-counted would undoubtedly increase the general overhead, but it'd allow game developers to leverage the whole language and its libraries instead of restricting themselves to a custom subset of the class hierarchy derived from a reference counted class.And about pointer manipulation: for cases where you know for certain that thepointer still points to the same memory block before and after the assignment (when you call popFront on an array for instance), you have no reference count to update and can elide the call. I've never seen a scheme for "knows for certain" that did not involve extensive and intrusive pointer annotations, something we very much want to avoid. Pointer annotations work great in theory, but in practice no successful language I know of uses them (we'll see if Rust will become an exception).The downside of optional support for language-wide reference counting is thatit requires two incompatible codegen (or rather one incompatible with RC). We could have only one if it's the one that calls the retain/release functions on pointer assignment, with those functions replaced with empty stubs in druntime when reference counting is disabled, but some overhead would remain for the function call.I'm not claiming this is the right solution. It's just an idea I wanted tomention as an aside because it has some common points. It is however a mostly separate matter from your initial goal of supporting custom reference counting schemes for some object hierarchies. I decided to write about it mostly because you talked about reimplementing arrays using classes and that got me thinking. But perhaps I shouldn't have mentioned it because it seems to be side-tracking the discussion.This proposal is modeled after C++'s shared_ptr<T> in that it should have equivalent performance and capabilities. Since it has been well accepted into the C++ "best practices", I think we're on solid ground with it.
Oct 09 2013
Rainer Schuetze wrote: On 27.06.2013 02:11, Walter Bright wrote:On 6/26/2013 2:33 PM, Rainer Schuetze wrote:When preparing for dconf I read the "Garbage Collection Handbook" by Richard Jones, and it very much supported my suspicion that the usual reference counting cannot be both memory safe and high-performance.On 26.06.2013 11:38, Walter Bright wrote:I'm a little confused about what you are referring to here.On 6/26/2013 12:19 AM, Rainer Schuetze wrote:I currently don't see how it can be memory safe with this proposal.I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.I don't see how this can be done without specific compiler knowledge in a memory safe way.To be more accurate, it is the assignment and the Release that have to be atomic, in addition to a concurrent read with AddRef. Imagine the reading thread is suspended while just having read the pointer, but not incremented the reference count yet. If an assignment with release and deletion is performed before the thread resumes, AddRef is called on garbage. IIRC you also have the GC handbook book on your shelf. Check the chapters on RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap is broken".You have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.Why is it necessary to put a lock around the pair?Ok.Just to clarify, I meant taking a slice of a static array that is a field of a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden?It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).Ok, I tend to forget about the swapping to a temporary when assigning structs.Two more notes: - I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. pre- and post-blit actions have to be taken if the struct contain pointers to refcounted objects.Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.The model I am basing this on is C++'s shared_ptr<>, which makes use of all the various rules around construction, assignment, and destruction. The wrinkles we have are: 1. memory safetyThis is the hard part.2. working with the GCI don't think that the GC is getting in the way as long as it is mark-and-sweep. A fully reference counting GC is a different story and can be made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.
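A toy sketch of that deferred, buffered counting idea; the names are made up, and a real collector would batch and reconcile these logs much more carefully:

    struct RcOp { Object target; long delta; }   // +1 for a retain, -1 for a release

    // each mutator thread appends to its own log instead of touching the
    // shared count directly; module-level variables are thread-local in D
    RcOp[] rcLog;

    void logRetain(Object o)  { rcLog ~= RcOp(o, 1); }
    void logRelease(Object o) { rcLog ~= RcOp(o, -1); }

    // a collector thread periodically drains each thread's log and applies
    // the deltas to the real counts, freeing anything that reaches zero
    void drain(ref RcOp[] log, ref long[Object] counts)
    {
        foreach (op; log)
        {
            auto c = (counts.require(op.target, 0) += op.delta);
            if (c <= 0)
                counts.remove(op.target);   // candidate for reclamation
        }
        log.length = 0;
    }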
Oct 09 2013
Rainer Schuetze wrote:Why is it necessary to put a lock around the pair?To be more accurate, it is the assignment and the Release that have to be atomic, in addition to a concurrent read with AddRef. Imagine the reading thread is suspended while just having read the pointer, but not incremented the reference count yet. If an assignment with release and deletion is performed before the thread resumes, AddRef is called on garbage.On my way to work today, I figured that a safe but slow implementation can work if the interface is not AddRef/Release, but class C { C readThis(); void writeThis(ref C c); } where the function can include the necessary locks, e.g. class C { int refcnt; C readThis() { synchronized(this) { refcnt++; return this; } } void writeThis(ref C c) { synchronized(c) { C x = c; c = this; if (--c.refcnt == 0) delete c; } } } Reading/Writing null (e.g. when constructing or destructing a reference) would have to be special-cased; that would not be necessary with free functions.
Oct 09 2013
Michel Fortin wrote: Le 27-juin-2013 à 8:03, "Rainer Schuetze" <r.sagitario gmx.de> a écrit :On my way to work today, I figured that a safe but slow implementation canwork, if the interface is not AddRef/Release, butclass C { C readThis(); void writeThis(ref C c); } where the function can include the necessary locks, e.g. class C { int refcnt; C readThis() { synchronized(this) { refcnt++; return this; } } void writeThis(ref C c) { synchronized(c) { C x = c; c = this; if (--c.refcnt == 0) delete c; } } }There's an error in this code. You must synchronize on the lock protecting the pointer, not on the lock at the other end of the pointer's value. Also, you only need to do this if the pointer pointing to the object is shared. If the pointer is thread-local, assignment does not need to be atomic. And if the object itself is thread-local, not even the reference counter need to be atomic.
Oct 09 2013
On 6/26/2013 11:28 PM, Rainer Schuetze wrote:On 27.06.2013 02:11, Walter Bright wrote:I think with the rules in the proposal we have we can support it.On 6/26/2013 2:33 PM, Rainer Schuetze wrote:When preparing for dconf I read the "Garbage Collection Handbook" by Richard Jones, and it very much supported my suspicion that the usual reference counting cannot be both memory safe and high-performance.On 26.06.2013 11:38, Walter Bright wrote:I'm a little confused about what you are referring to here.On 6/26/2013 12:19 AM, Rainer Schuetze wrote:I currently don't see how it can be memory safe with this proposal.I imagine a few (constrained) templated functions for the different operations defined in the library could also do the job, though it might drown compilation speed. Also getting help from the optimizer to remove redundant calls will need some back doors.I don't see how this can be done without specific compiler knowledge in a memory safe way.deletion is performed before the thread resumes, AddRef is called on garbage. I see. I'll have to think about that some more.To be more accurate, it is the assignment and the Release that have to be atomic, in addition to a concurrent read with AddRef. Imagine the reading thread is suspended while just having read the pointer, but not incremented the reference count yet. If an assignment with release andYou have to put the lock around the pair of AddRef and Release, but if the compiler already splits this into two function calls, this cannot be done in the implementation.Why is it necessary to put a lock around the pair?IIRC you also have the GC handbook book on your shelf. Check the chapters on RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap is broken".I have the book, but it is the first edition and there's no chapter 18 in it :-(made concurrent, but it is usually not eager and needs to defer actual reference counting to avoid locks. Instead it logs accesses to some thread local buffer which needs to be processed eventually.Ok.Just to clarify, I meant taking a slice of a static array that is a field of a refcounted class. Is it forbidden to have a field like this in a refcounted class or is taking the address through slicing forbidden?It would be forbidden to obtain a slice of a ref counted object in this way - or even to simply refer to a static array embedded in a ref counted object (in safe code).Ok, I tend to forget about the swapping to a temporary when assigning structs.Two more notes: - I'm not sure it is mentioned, but I think you have to describe what happens when copying a struct. pre- and post-blit actions have to be taken if the struct contain pointers to refcounted objects.Yes. It's analogous to copying a struct that has fields which contain constructors and destructors.The model I am basing this on is C++'s shared_ptr<>, which makes use of all the various rules around construction, assignment, and destruction. The wrinkles we have are: 1. memory safetyThis is the hard part.2. working with the GCI don't think that the GC is getting in the way as long as it is mark-and-sweep. A fully reference counting GC is a different story and can beI don't think we should do a fully ref counted GC anyway.
Oct 09 2013
Rainer Schuetze wrote: On 27.06.2013 15:50, Michel Fortin wrote:Le 27-juin-2013 à 8:03, "Rainer Schuetze" <r.sagitario gmx.de> a écrit :You're right (I was about to run to a meeting when writing this). Then, readThis will also need a reference to the pointer. Another more obvious bug is that it should read if (--x.refcnt == 0) delete x;class C { C readThis(); void writeThis(ref C c); } where the function can include the necessary locks, e.g. class C { int refcnt; C readThis() { synchronized(this) { refcnt++; return this; } } void writeThis(ref C c) { synchronized(c) { C x = c; c = this; if (--c.refcnt == 0) delete c; } } }There's an error in this code. You must synchronize on the lock protecting the pointer, not on the lock at the other end of the pointer's value.Also, you only need to do this if the pointer pointing to the object is shared. If the pointer is thread-local, assignment does not need to be atomic. And if the object itself is thread-local, not even the reference counter needs to be atomic.True, these issues only happen with shared pointers. But remember that fields in shared objects are also shared. I also have a hard time imagining how the code above works with reading pointers that live in registers or writing to "references to registers". These are never shared, so they could have simpler implementations.
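Putting the two corrections together, one way the repaired helpers might look (a sketch only: an atomic count stands in for the plain int, destroy stands in for the deprecated delete, and the lock guarding the pointer is passed in explicitly):

    import core.atomic;

    class C
    {
        shared int refcnt = 1;

        // read `c` under the lock that guards the *pointer*, retaining the result
        static C readRef(Object ptrLock, ref C c)
        {
            synchronized (ptrLock)
            {
                auto p = c;
                if (p !is null)
                    atomicOp!"+="(p.refcnt, 1);
                return p;
            }
        }

        // overwrite `c` with `newVal` under the same lock; release the *old* value
        static void writeRef(Object ptrLock, ref C c, C newVal)
        {
            if (newVal !is null)
                atomicOp!"+="(newVal.refcnt, 1);
            C old;
            synchronized (ptrLock)
            {
                old = c;
                c = newVal;
            }
            if (old !is null && atomicOp!"-="(old.refcnt, 1) == 0)
                destroy(old);
        }
    }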
Oct 09 2013
Michel Fortin wrote: Le 27-juin-2013 à 13:04, Walter Bright a écrit :I don't think we should do a fully ref counted GC anyway.Speaking of the GC, you should probably rethink this point:14. RC objects will still be allocated on the GC heap - this means that a normal GC run will reap RC objects that are in a cycle, and RC objects will getautomaticallyscanned for heap references with no additional action required by the user.If you allocate the object from the GC heap, the GC will collect it regardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work. If instead you allocate the object outside of the GC heap and your object contains pointers to the GC heap, you'll need to add roots to the GC for any pointer variable in the object. (This is what DMD/Objective-C currently does.) There's no way to detect cycles with that scheme, but it is simple. We could use a hybrid scheme with two reference counts: one for internal references that the GC can see and one for external references that the GC cannot see. The GC cannot collect an object if the external reference count is non-zero. If the external count is zero, it can collect the object if the internal reference count reaches zero or if it becomes unreachable from any root. This allows detection of cycles, as long as this cycle is only made of internal references. Care must be taken about incrementing/decrementing the right reference count depending on the context, which sounds tricky. Or we could use a somewhat less hybrid scheme where we have one reference count and the only thing it does is prevent objects from being deallocated. This can be implemented as one global hash table and you put all objects that have a non-zero reference count in that table. This hash table being scanned by the GC anything in it will never be collected. This will also detect internal cycles like the previous two-counter scheme, but it doesn't allow immediate deallocation as it waits for the GC to deallocate. (This is similar to how it worked in my defunct D/Objective-C bridge that did not rely on tweaking the compiler.)
Oct 09 2013
On 6/27/2013 11:38 AM, Michel Fortin wrote:Le 27-juin-2013 à 13:04, Walter Bright a écrit :automaticallyI don't think we should do a fully ref counted GC anyway.Speaking of the GC, you should probably rethink this point:14. RC objects will still be allocated on the GC heap - this means that a normal GC run will reap RC objects that are in a cycle, and RC objects will getregardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work. We already require that if you're going to pass a pointer to any GC allocated data to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap.scanned for heap references with no additional action required by the user.If you allocate the object from the GC heap, the GC will collect itIf instead you allocate the object outside of the GC heap and your objectcontains pointers to the GC heap, you'll need to add roots to the GC for any pointer variable in the object. (This is what DMD/Objective-C currently does.) There's no way to detect cycles with that scheme, but it is simple. Yes, but that's a lot harder (and more error-prone) than simply requiring the programmer to retain a pointer as I outlined above.We could use a hybrid scheme with two reference counts: one for internalreferences that the GC can see and one for external references that the GC cannot see. The GC cannot collect an object if the external reference count is non-zero. If the external count is zero, it can collect the object if the internal reference count reaches zero or if it becomes unreachable from any root. This allows detection of cycles, as long as this cycle is only made of internal references. Care must be taken about incrementing/decrementing the right reference count depending on the context, which sounds tricky. That also seems far more complex than what I proposed.Or we could use a somewhat less hybrid scheme where we have one referencecount and the only thing it does is prevent objects from being deallocated. This can be implemented as one global hash table and you put all objects that have a non-zero reference count in that table. This hash table being scanned by the GC anything in it will never be collected. This will also detect internal cycles like the previous two-counter scheme, but it doesn't allow immediate deallocation as it waits for the GC to deallocate. (This is similar to how it worked in my defunct D/Objective-C bridge that did not rely on tweaking the compiler.)I'd really like to stick to the shared_ptr<T> model. (A global hash table also is not so simple when factoring in loading and unloading DLLs.) Of course, for the O-C bridge, you can implement it as required to be compatible with O-C.
Oct 09 2013
Michel Fortin wrote: Le 27-juin-2013 à 15:32, Walter Bright a écrit :On 6/27/2013 11:38 AM, Michel Fortin wrote:normal14. RC objects will still be allocated on the GC heap - this means that aautomaticallyGC run will reap RC objects that are in a cycle, and RC objects will getregardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work.scanned for heap references with no additional action required by the user.If you allocate the object from the GC heap, the GC will collect itWe already require that if you're going to pass a pointer to any GC allocateddata to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap. Perhaps it's just me, but I'd say if you need to anticipate the duration for which you need to keep the object alive when you pass it to some external code it completely defeats the purpose of said external code calling AddRef and Release. With the scheme you propose, reference counting would be useful inside D code as a way to deallocate some classes of objects early without waiting a GC scan. The GC can collect cycles for those objects. People passing COM objects to external code however should allocate those objects outside of the GC if they intend to pass the object to external code. They should also add member pointers as GC roots. Also, no cycle detection for those objects. If done right it could be made memory safe, but cycles will leak. Maybe that could work.
Oct 09 2013
On 6/27/2013 1:15 PM, Michel Fortin wrote:Le 27-juin-2013 à 15:32, Walter Bright a écrit :normalOn 6/27/2013 11:38 AM, Michel Fortin wrote:14. RC objects will still be allocated on the GC heap - this means that aautomaticallyGC run will reap RC objects that are in a cycle, and RC objects will getregardless of its reference count. That's fine as long as all the retaining pointers are visible to the GC. But if you're defining a COM object, likely that's because you'll pass a pointer to an external API, and this API might store the pointer somewhere not scanned by the GC. This API will call AddRef to make sure the object is retained, but if the GC doesn't see that pointer on its heap it'll deallocate and next time external code uses the object everything goes boom! So that doesn't work.scanned for heap references with no additional action required by the user.If you allocate the object from the GC heap, the GC will collect itallocated data to external code, that you retain a pointer. I see no additional issue with requiring this for COM objects created on the GC heap.We already require that if you're going to pass a pointer to any GCPerhaps it's just me, but I'd say if you need to anticipate the duration forwhich you need to keep the object alive when you pass it to some external code it completely defeats the purpose of said external code calling AddRef and Release.With the scheme you propose, reference counting would be useful inside D codeas a way to deallocate some classes of objects early without waiting a GC scan. The GC can collect cycles for those objects.People passing COM objects to external code however should allocate thoseobjects outside of the GC if they intend to pass the object to external code. They should also add member pointers as GC roots. Also, no cycle detection for those objects. If done right it could be made memory safe, but cycles will leak.Maybe that could work.Nothing about the proposal acts to prevent one from constructing COM objects any way they wish, including using malloc/free and managing it all themselves. All COM objects require is an implementation of the COM interface, which says nothing at all beyond having a pointer to an AddRef() and Release(). If you are building a COM object that is to be fired and forgotten into the void of unknown external code, I don't think there's any automated replacement for thinking carefully about it and constructing it accordingly. D's memory safety guarantees cannot, of course, cover unknown and unknowable external code. What I'm trying to accomplish with this proposal is: 1. A way to do ref-counted memory allocation for specific objects 2. Do it in a guaranteed memory safe manner (at least for the user of those objects) 3. Do it in a way that does not interfere with people who want to use the GC or do manual memory management 4. Not impose penalties on non-refcounted code 5. Do it in a way that offers a similar performance and overhead profile to C++'s shared_ptr<T> 6. Do it in a way that makes it usable to construct COM objects, and work with NSObject's 7. Not require pointer annotations 8. Address the most common "why I can't use D" complaint What I'm not trying to accomplish is: 1. Replacing all memory allocation in D with ref counting
Oct 09 2013
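For readers following along, here is a minimal sketch of the class shape the
draft keys on and of the calls the compiler would insert for an assignment,
per the stated rules (an RC class is one with AddRef()/Release(), and null
checks precede any AddRef() or Release() call). The counter, the names, and
the finalization strategy are illustrative assumptions, not part of the
proposal.

    // Hypothetical RC class: the presence of AddRef()/Release() marks it.
    class Widget
    {
        private int refs = 1;     // illustrative counter, not specified by the draft

        Widget AddRef()  { ++refs; return this; }
        Widget Release()
        {
            if (--refs == 0) { /* finalize/free; mechanics are debated later */ }
            return null;
        }
    }

    // What the compiler would insert, written out by hand, for `dst = src;`:
    void assignRC(ref Widget dst, Widget src)
    {
        if (src !is null) src.AddRef();   // null checks per point 4 of the draft
        if (dst !is null) dst.Release();
        dst = src;
    }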
Michel Fortin wrote:

Le 27-juin-2013 à 16:56, Walter Bright a écrit :
> What I'm trying to accomplish with this proposal is:
>
> 1. A way to do ref-counted memory allocation for specific objects
> 2. Do it in a guaranteed memory safe manner (at least for the user of those
>    objects)
> 3. Do it in a way that does not interfere with people who want to use the GC
>    or do manual memory management
> 4. Not impose penalties on non-refcounted code
> 5. Do it in a way that offers a similar performance and overhead profile to
>    C++'s shared_ptr<T>
> 6. Do it in a way that makes it usable to construct COM objects, and work
>    with NSObject's
> 7. Not require pointer annotations
> 8. Address the most common "why I can't use D" complaint
>
> What I'm not trying to accomplish is:
>
> 1. Replacing all memory allocation in D with ref counting

That list is great for limiting the scope of your DIP. Make sure you include it
in the DIP.

So if we return to the core of it, here's the problems that still need solving:

1. Depending on the reference counting scheme implemented, it might be more
   efficient to have a single operation for an assignment (retain a/release b).
   I think that should be allowed.

2. If the pointer variable is shared, assignment must be atomic (done under a
   lock, and it must always be the same lock for a given pointer, obviously).

3. If the pointer variable is shared, reading its value must be done atomically
   with a retain too.

Here's a suggestion for problem number 1 above:

    class MyObject
    {
        // user-implemented
        static void opRetain(MyObject var);   // must accept null
        static void opRelease(MyObject var);  // must accept null

        // optional (showing default implementation below)
        // this can be made faster for some implementations of ref-counting
        // only call it for an assignment, not for constructing/destructing
        // the pointer (notably for Objective-C)
        static void opPtrAssign(ref MyObject var, MyObject newVal)
        {
            opRetain(newVal);
            opRelease(var);
            var = newVal;
        }
    }

This maps 1 on 1 to the underlying functions for Objective-C ARC.

I don't have a solution for the shared case. We do in fact have a tail-shared
problem here. If I have a shared(MyObject), the pointer is as much shared along
with the object. When the pointer itself is shared, we need a lock to access it
reliably and that can only be provided by the outer context. If we had a way to
express tail-shared, then we could repeat the above three functions for
tail-shared object pointers and it'd work reliably for that.
Oct 09 2013
Michel Fortin Wrote:

Le 27-juin-2013 à 18:35, Michel Fortin a écrit :
> class MyObject
> {
>     // user-implemented
>     static void opRetain(MyObject var);   // must accept null
>     static void opRelease(MyObject var);  // must accept null
>
>     // optional (showing default implementation below)
>     // this can be made faster for some implementations of ref-counting
>     // only call it for an assignment, not for constructing/destructing
>     // the pointer (notably for Objective-C)
>     static void opPtrAssign(ref MyObject var, MyObject newVal)
>     {
>         opRetain(newVal);
>         opRelease(var);
>         var = newVal;
>     }
> }
>
> This maps 1 on 1 to the underlying functions for Objective-C ARC.

Actually, I made a small error in opRetain. To match Objective-C ARC it should
return the retained object:

    static MyObject opRetain(MyObject var);  // must accept null

and the default implementation for opPtrAssign would then become:

    static void opPtrAssign(ref MyObject var, MyObject newVal)
    {
        newVal = opRetain(newVal);
        opRelease(var);
        var = newVal;
    }

One reason is that Objective-C blocks (equivalent of delegate literals) are
stack allocated. If you call retain on a block, it'll make a copy on the heap
and return that copy.

Another reason for opRetain to return the object is to enable tail-call
optimization for cases like this one:

    NSObject x;

    NSObject getX()
    {
        return x;  // D ARC should insert an implicit opRetain here
    }

Of course it doesn't necessarily need to work that way, but it'd certainly make
it easier to integrate with Objective-C if it worked that way.
Oct 09 2013
Rainer Schuetze wrote: On 25.06.2013 23:00, Walter Bright wrote:4. Null checks are done before calling any AddRef() or Release().Here is another nitpick that needs to be addressed: As mentioned in the implementation of ComObject invariants (and out contracts) must not be called when returning from Release, if it is ok to actually delete the object.
Oct 09 2013
Rainer Schuetze wrote:

On 28.06.2013 00:35, Michel Fortin wrote:
> So if we return to the core of it, here's the problems that still need
> solving:
>
> 1. Depending on the reference counting scheme implemented, it might be more
>    efficient to have a single operation for an assignment (retain a/release
>    b). I think that should be allowed.
>
> 2. If the pointer variable is shared, assignment must be atomic (done under a
>    lock, and it must always be the same lock for a given pointer, obviously).
>
> 3. If the pointer variable is shared, reading its value must be done
>    atomically with a retain too.

I just had an idea, maybe it is obvious and just distracts, but I thought it
might be worth sharing:

Instead of defining methods on the class type, we could also redefine the
reference type. The compiler detects a type declaration "reference_type" in the
class declaration and replaces all references to that class with that type.

    class C
    {
        alias shared_ptr!C reference_type;
    }

    C c = new C;

is lowered to

    shared_ptr!C c = new C;

"new C" returns a shared_ptr!C as well.

It is then up to the implementation of shared_ptr to define what member
functions to call for reference counting and to deal with proper shared
semantics in assignments. It can also define whether opCall should increment
the reference count or not. For most of the needed functionality, struct
semantics work out-of-the-box.

2 immediate gotchas:

- In a class hierarchy, you would want to define the reference_type in the
  base class only, so maybe it has to be a template. I'm not sure implicit
  casting to the base class reference type and interfaces can be implemented.

- The implementation of the shared_ptr template will have to be able to deal
  with the "raw" reference, so that might need some type modifier/annotation.
  I think this might also be true for the addRef/release version, if the
  implementation is not just working on the refcount, but is also calling
  other functions.

- To elide redundant reference counting, the compiler will need annotations
  here, too. Move semantics of structs might reduce the number of reference
  count operations already, though.
Oct 09 2013
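To make the idea concrete, here is a minimal sketch of such a wrapper,
assuming a thread-local, non-atomic count kept in an intrusive `refs` field.
All names are hypothetical, and opDispatch stands in for the member forwarding
(opDot) discussed later in the thread.

    class C { int refs; void method() {} }

    struct shared_ptr(T)
    {
        private T payload;                  // the "raw" class reference

        this(T obj) { payload = obj; retain(); }
        this(this)  { retain(); }           // postblit: copying bumps the count
        ~this()     { release(); }

        private void retain()  { if (payload !is null) ++payload.refs; }
        private void release()
        {
            if (payload !is null && --payload.refs == 0)
                destroy(payload);           // finalize; memory stays with the GC
        }

        // member access has to be forwarded somehow (opDot in the post above)
        auto opDispatch(string name, Args...)(Args args)
        {
            return mixin("payload." ~ name ~ "(args)");
        }
    }

    void example()
    {
        auto c = shared_ptr!C(new C);       // count: 0 -> 1
        auto d = c;                         // postblit: count -> 2
        d.method();                         // forwarded to the wrapped C
    }   // both wrappers destruct: count drops to 0, the object is finalized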
Rainer Schuetze wrote: On 28.06.2013 09:07, Walter Bright wrote:Will add to proposal. On 6/27/2013 11:27 PM, Rainer Schuetze wrote:Sorry to produce these drop by drop, but while writing the last mail, I noticed another issue to think about: What happens if the class also implements interfaces? A reference of the interface type must do reference counting as well. So the interface must also define AddRef and Release. This is currently true for COM-interfaces derived from IUnknown, but not for other interfaces.On 25.06.2013 23:00, Walter Bright wrote:4. Null checks are done before calling any AddRef() or Release().Here is another nitpick that needs to be addressed: As mentioned in the implementation of ComObject invariants (and out contracts) must not be called when returning from Release, if it is ok to actually delete the object.
Oct 09 2013
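A small illustration of the point, with hypothetical names: for a reference
typed as an interface to participate in the counting, the hooks have to be
visible on the interface itself; a reference typed as a plain interface would
bypass them.

    interface ICounted
    {
        ICounted AddRef();
        ICounted Release();
    }

    interface IWidget : ICounted    // counted: IWidget references can count
    {
        void draw();
    }

    interface IPlain                // not counted: an IPlain reference could
    {                               // escape the counting scheme entirely
        void poke();
    }

    class Gadget : IWidget, IPlain
    {
        private int refs = 1;
        ICounted AddRef()  { ++refs; return this; }
        ICounted Release() { if (--refs == 0) { /* finalize */ } return null; }
        void draw() {}
        void poke() {}
    }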
Michel Fortin wrote: Le 28-juin-2013 à 4:55, Rainer Schuetze <r.sagitario gmx.de> a écrit :What happens if the class also implements interfaces? A reference of theinterface type must do reference counting as well. So the interface must also define AddRef and Release. This is currently true for COM-interfaces derived from IUnknown, but not for other interfaces. I would assume if an object of a reference-counted class cannot be cast to its base non-reference-counted class that it'd be the same for casting to non-reference-counted interfaces.
Oct 09 2013
Rainer Schuetze wrote: On 28.06.2013 13:37, Michel Fortin wrote:Le 28-juin-2013 à 4:55, Rainer Schuetze <r.sagitario gmx.de> a écrit :Yes, that is probably the way to go. But that makes using protocols like COM difficult in safe mode. We have already talked Walter into not linking reference counting to AddRef and Release, but if it is implemented with other methods, these cannot be added to the already existing COM interfaces. Passing or getting interface pointers to/from external code being unsafe sounds ok, but passing around these interface references in D code would be unsafe as well. Adding aliases or non-virtual wrappers to the interface declaration to forward reference counting to AddRef/Release might help but could also introduce ambiguities in a class that derives both from a reference counted base class and an interface like this.What happens if the class also implements interfaces? A reference of the interface type must do reference counting as well. So the interface must also define AddRef and Release. This is currently true for COM-interfaces derived from IUnknown, but not for other interfaces.I would assume if an object of a reference-counted class cannot be cast to its base non-reference-counted class that it'd be the same for casting to non-reference-counted interfaces.
Oct 09 2013
Michel Fortin wrote:

Le 25-juin-2013 à 17:00, Walter Bright a écrit :
> If a class contains the following methods, in either itself or a base class,
> it is an RC class:
>
>     T AddRef();
>     T Release();
>
> An RC class is like a regular D class with these additional semantics:
>
> 1. In safe code, casting (implicit or explicit) to a base class that does not
>    have both AddRef() and Release() is an error.

I'm just realizing that this means safe code cannot call any member function of
the non-reference-counted base class. This would require an implicit conversion
of "this" to the base class.

system code could, but it'd be extremely uneasy doing such calls unless I am
the one in charge of that code and can make sure the base function will never
store the (unretained) pointer somewhere it shouldn't now and in the future. A
misstep here and you get memory corruption. Seriously, I don't think system
code should allow implicit conversions to the base class, it should be
explicit.

I am starting to doubt there is any value in inheriting the base
ref-counted-class from another class.
Oct 09 2013
On 6/28/2013 1:47 AM, Rainer Schuetze wrote:On 28.06.2013 00:35, Michel Fortin wrote:might be worth sharing:So if we return to the core of it, here's the problems that still need solving: 1. Depending on the reference counting scheme implemented, it might be more efficient to have a single operation for an assignment (retain a/release b) operation. I think that should be allowed.2. If the pointer variable is shared assignment must be atomic (done under a lock, and it must always be the same lock for a given pointer, obviously).3. If the pointer variable is shared, reading its value must be done atomically with a retain too.I just had an idea, maybe it is obvious and just distracts, but I thought itInstead of defining methods on the class type, we could also redefine thereference type. The compiler detects a type declaration "reference_type" in the class declaration and replaces all references to that class with that type.class C { alias shared_ptr!C reference_type; } C c = new C; is lowered to shared_ptr!C c = new C; "new C" returns a shared_ptr!C aswell. It is then up to the implementation of shared_ptr to define what memberfunctions to call for reference counting and to deal with proper shared semantics in assignments. It can also define whether opCall should increment the reference count or not. For most of the needed functionality, struct semantics work out-of-the-box.2 immediate gotchas - In a class hierarchy, you would want to define the reference_type in thebase class only, so maybe it has to be a template. I'm not sure implicite casting to base class reference type and interfaces can be implemented.- the implementation of the shared_ptr template will have to be able to dealwith the "raw" reference, so that might need some type modifier/annotation. I think this might also be true for the addRef/release version, if the implementation is not just working on the refcount, but is also calling other functions.- To elide redundant reference counting, the compiler will need annotationshere, too. Move semantics of structs might reduce the number of reference count operations already, though.The main problem with this is the decay of a shared_ptr!C to a C. Once that happens, all the memory safety goes out the window.
Oct 09 2013
On 6/28/2013 7:14 AM, Michel Fortin wrote:Le 25-juin-2013 à 17:00, Walter Bright a écrit :it isIf a class contains the following methods, in either itself or a base class,of the non-reference-counted base class. This would require an implicit conversion of "this" to the base class. That's right.an RC class: T AddRef(); T Release(); An RC class is like a regular D class with these additional semantics: 1. In safe code, casting (implicit or explicit) to a base class that does not have both AddRef() and Release() is an error.I'm just realizing that this means safe code cannot call any member functionsystem code could, but it'd be extremely uneasy doing such calls unless I amthe one in charge of that code and can make sure the base function will never store the (unretained) pointer somewhere it shouldn't now and in the future. An misstep here and you get memory corruption. Seriously, I don't think system code should allow implicit conversions to the base class, it should be explicit. It's a worthy point.I am starting to doubt there is any value in inheriting the baseref-counted-class from another class.
Oct 09 2013
On Thursday, 10 October 2013 at 02:19:02 UTC, Walter Bright wrote:It means OOP is completely broken with that design.system code could, but it'd be extremely uneasy doing suchcalls unless I am the one in charge of that code and can make sure the base function will never store the (unretained) pointer somewhere it shouldn't now and in the future. An misstep here and you get memory corruption. Seriously, I don't think system code should allow implicit conversions to the base class, it should be explicit. It's a worthy point.
Oct 09 2013
On 10/9/2013 7:21 PM, deadalnix wrote:It means OOP is completely broken with that design.I know. The thread kind of petered out as we began to realize the obstacles with this approach. But it's important to have the conversation in the record so we don't have to go through it again.
Oct 09 2013
Rainer Schuetze wrote: On 28.06.2013 21:50, Walter Bright wrote:The main problem with this is the decay of a shared_ptr!C to a C. Once that happens, all the memory safety goes out the window.By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.
Oct 09 2013
On 6/28/2013 1:11 PM, Rainer Schuetze wrote:On 28.06.2013 21:50, Walter Bright wrote:shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.The main problem with this is the decay of a shared_ptr!C to a C. Once that happens, all the memory safety goes out the window.By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to"Decay" means it is converted to type C in order to call functions that take C as the 'this' pointer or C as a parameter. The problem is both type C and type shared_ptr!C will exist.
Oct 09 2013
Rainer Schuetze wrote: On 28.06.2013 22:29, Walter Bright wrote:On 6/28/2013 1:11 PM, Rainer Schuetze wrote:Any parameter of type C is also lowered to shared_ptr!C. Calling a member function would go through opDot, which could also do reference counting for safety. Treating every explicite or implicite usage of "this" as a temporary shared_ptr!C might be overkill, so it could be restricted to assigning "this" to another reference (this includes passing it as an argument to another function or returning it from a function). My current adhoc rule: if "this" is not followed by a '.', it has to be lowered to construct shared_ptr!C(this). Assuming the reference count is updated by shared_ptr!C.opDot, there will always be a thread local reference while inside a member function (it must have been called through an external reference at least once). Other member functions of the same object can always be called without ref-counting assuming that the object never gets destroyed through changing other references.On 28.06.2013 21:50, Walter Bright wrote:"Decay" means it is converted to type C in order to call functions that take C as the 'this' pointer or C as a parameter. The problem is both type C and type shared_ptr!C will exist.The main problem with this is the decay of a shared_ptr!C to a C. Once that happens, all the memory safety goes out the window.By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.
Oct 09 2013
On 6/28/2013 2:03 PM, Rainer Schuetze wrote:On 28.06.2013 22:29, Walter Bright wrote:I don't see how lowering C to shared_ptr!C and lowering share_ptr!C to C can work?On 6/28/2013 1:11 PM, Rainer Schuetze wrote:Any parameter of type C is also lowered to shared_ptr!C.On 28.06.2013 21:50, Walter Bright wrote:"Decay" means it is converted to type C in order to call functions that take C as the 'this' pointer or C as a parameter. The problem is both type C and type shared_ptr!C will exist.The main problem with this is the decay of a shared_ptr!C to a C. Once that happens, all the memory safety goes out the window.By "decay", do mean the lowering or something else? There is no stray C reference in user code, it always gets lowered to shared_ptr!C. Only trusted code in shared_ptr will have to deal with "raw" references. It is shared_ptr's responsibilty to maintain memory safety, just the same as for AddRef and Release.Calling a member function would go through opDot, which could also doreference counting for safety. Treating every explicite or implicite usage of "this" as a temporary shared_ptr!C might be overkill, so it could be restricted to assigning "this" to another reference (this includes passing it as an argument to another function or returning it from a function). My current adhoc rule: if "this" is not followed by a '.', it has to be lowered to construct shared_ptr!C(this).Assuming the reference count is updated by shared_ptr!C.opDot, there willalways be a thread local reference while inside a member function (it must have been called through an external reference at least once). Other member functions of the same object can always be called without ref-counting assuming that the object never gets destroyed through changing other references.
Oct 09 2013
Michel Fortin wrote: Le 28-juin-2013 à 17:03, Rainer Schuetze <r.sagitario gmx.de> a écrit :Any parameter of type C is also lowered to shared_ptr!C.class C {} People still constantly forget that C used as a type represents a *reference* to an object of class C, not the object itself. If you replace type C with shared_ptr!C, you must then replace it with shared_ptr!(shared_ptr!C) and so on; there's no end to it. Also, I strongly doubt the compiler will be able to elide redundant calls to retain/release made within shared_ptr!C while still respecting normal struct semantics.
Oct 09 2013
On 6/28/2013 6:42 PM, Michel Fortin wrote:Le 28-juin-2013 à 17:03, Rainer Schuetze <r.sagitario gmx.de> a écrit :to an object of class C, not the object itself. If you replace type C with shared_ptr!C, you must then replace it with shared_ptr!(shared_ptr!C) and so on; there's no end to it.Any parameter of type C is also lowered to shared_ptr!C.class C {} People still constantly forget that C used as a type represents a *reference*Also, I strongly doubt the compiler will be able to elide redundant calls toretain/release made within shared_ptr!C while still respecting normal struct semantics.Using some sort of shared_ptr!T was the original idea, but I could not figure a reasonable way to make it memory safe without the compiler knowing about it. The easiest way to have the compiler know about it is to make it some sort of class type, not a struct type.
Oct 09 2013
On 6/27/2013 11:33 AM, Rainer Schuetze wrote:On 27.06.2013 19:04, Walter Bright wrote:send them to you.I can remove the dust from my scanner to copy the 3 mostly relevant pages andIIRC you also have the GC handbook book on your shelf. Check the chapters on RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap is broken".I have the book, but it is the first edition and there's no chapter 18 in it :-(I understand the issue (I think), but I can't think of a case where the ref count would be 1 when this happens.
Oct 09 2013
On 6/28/2013 1:55 AM, Rainer Schuetze wrote:What happens if the class also implements interfaces? A reference of theinterface type must do reference counting as well. So the interface must also define AddRef and Release. This is currently true for COM-interfaces derived from IUnknown, but not for other interfaces.Even implementing IUnknown is a problem, if we do the suggestion that opAddref and opRelease be used as wrappers around AddRef and Release. I think the simplest thing is to not allow ref counted classes to implement interfaces other than ones derived from IUnknown.
Oct 09 2013
Rainer Schuetze wrote: On 29.06.2013 00:22, Walter Bright wrote:I don't see why you would want to lower from shared_ptr!C to C. It's only inside shared_ptr where access to the non-lowered C is needed, e.g. by disabling the lowering inside the shared_ptr. I was referring to "raw" references before, so the lowering would be better shared_ptr!(__raw(C)). But I agree, having the lowering include the original seems bad. I realized a worse flaw with my proposal: it doesn't solve the assignment problem it was meant to. shared_ptr implemented as a struct does not have full control of the assignment, but is only called for the postblit and the destruction of the previous value. It has no way to put a lock around the full assignment. Still thinking too much C++... Sorry for the noise.Any parameter of type C is also lowered to shared_ptr!C.I don't see how lowering C to shared_ptr!C and lowering share_ptr!C to C can work?
Oct 09 2013
On 6/29/2013 12:16 AM, Rainer Schuetze wrote:Sorry for the noise.No problem. It's a complicated subject, and none of us can think of all the ramifications. That's why this thread exists.
Oct 09 2013
Rainer Schuetze wrote:

On 29.06.2013 06:38, Walter Bright wrote:
> On 6/27/2013 11:33 AM, Rainer Schuetze wrote:
>> On 27.06.2013 19:04, Walter Bright wrote:
>>> I can remove the dust from my scanner to copy the 3 mostly relevant pages
>>> and send them to you.
>>
>> IIRC you also have the GC handbook book on your shelf. Check the chapters on
>> RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap
>> is broken".
>
> I have the book, but it is the first edition and there's no chapter 18 in it
> :-(

I tried to scan it yesterday, but got a large black bar at the fold (don't know
if this is the correct term) that erased the first inch of text. I would have
to rip the book apart to get better results.

> I understand the issue (I think), but I can't think of a case where the ref
> count would be 1 when this happens.

Consider a global shared reference R that holds the last reference to an object
O. One thread exchanges the reference with another reference P while another
thread reads the reference into S.

    shared(C) R = O;    // refcnt of O is 1

In pseudo-assembly, missing null-checks:

    Thread1 (R = P)                 Thread2 (S = R)
                                    mov ecx,[R]
                                    ; thread suspended
    mov eax,[P]
    inc [eax].refcnt
    mov ebx,[R]
    mov [R],eax
    dec [ebx].refcnt                ; refcnt of O now 0
    jnz done
    call delete_ebx
                                    ; thread resumed
                                    inc [ecx].refcnt
    done:

The increment on [ecx].refcnt modifies garbage.
Oct 09 2013
Rainer Schuetze wrote:

On 29.06.2013 09:36, Rainer Schuetze wrote:
>                                     inc [ecx].refcnt
>     done:

The listing is missing Thread2's final store after the increment:

                                      mov [S], ecx

Just wanted to add that the book states that lock-free reference counting can
be implemented with a cas2 operation modifying two separate locations
atomically. Unfortunately this operation does not exist in most processors.

This might be the quoted paper:
http://scholr.ly/paper/2199608/lock-free-reference-counting
Oct 09 2013
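The same pitfall can be restated at the D level, under hypothetical names:
even if each step is individually atomic, read-and-retain touches two separate
locations (the pointer slot and the count), which is what the cas2 remark
above is getting at. This is only a sketch of the broken pattern, not a fix.

    import core.atomic;

    struct Obj { int refcnt; }

    shared Obj* R;   // global shared reference; holds the last reference to O

    // Hypothetical naive lowering of `S = R` using atomics; still broken,
    // because another thread can run `R = P` between the two steps, drop
    // O's count to zero and free O.
    shared(Obj)* readAndRetain()
    {
        auto tmp = atomicLoad(R);            // step 1: snapshot the pointer
        if (tmp !is null)
            atomicOp!"+="(tmp.refcnt, 1);    // step 2: may hit freed memory
        return tmp;
    }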
On 6/29/2013 12:36 AM, Rainer Schuetze wrote:On 29.06.2013 06:38, Walter Bright wrote:if this the correct term) that eraased the first inch of text. I would have to rip the book apart to get better results. Ah, don't rip up your book! (I cut the back off of mine and scanned it, but I no longer care to store the thousands of pounds of books I have anymore, and I like that my whole library fits on my laptop now!).On 6/27/2013 11:33 AM, Rainer Schuetze wrote:I tried to scan it yesterday, but got large black bar at the fold (don't knowOn 27.06.2013 19:04, Walter Bright wrote:I can remove the dust from my scanner to copy the 3 mostly relevant pages and send them to you.IIRC you also have the GC handbook book on your shelf. Check the chapters on RC, especially algorithm 18.2 "Eager reference counting with CompareAndSwap is broken".I have the book, but it is the first edition and there's no chapter 18 in it :-(object O. One thread exchanges the reference with another reference P while another thread reads the reference into S.I understand the issue (I think), but I can't think of a case where the ref count would be 1 when this happens.Consider a global shared reference R that holds the last reference to anshared(C) R = O; ; refcnt of O is 1 in pseudo-assembly missing null-checks: Thread1 (R = P) Thread2 (S = R) mov ecx,[R] ; thread suspended mov eax,[P] inc [eax].refcnt mov ebx,[R] mov [R],eax dec [ebx].refcnt ; refcnt of O now 0 jnz done call delete_ebx ; thread resumed inc [ecx].refcnt done: The increment on [ecx].refcnt modifies garbage.Ok, I see. Let me think about it some more.
Oct 09 2013
Jacob Carlborg: On 29 jun 2013, at 06:42, Walter Bright wrote:I think the simplest thing is to not allow ref counted classes to implementinterfaces other than ones derived from IUnknown. What about Objective-C interfaces?
Oct 09 2013
Michel Fortin wrote:

Le 29-juin-2013 à 6:08, Jacob Carlborg a écrit :
> On 29 jun 2013, at 06:42, Walter Bright wrote:
>> I think the simplest thing is to not allow ref counted classes to implement
>> interfaces other than ones derived from IUnknown.
>
> What about Objective-C interfaces?

Implementing ARC for Objective-C is going to require some more compiler support
anyway (for autoreleased objects notably). Tweaking the compiler so it accepts
Objective-C interfaces should just be a matter of tweaking the boolean
expression that makes this check. "if (classdecl->objc && interfacedecl->objc)
return true;" or something like that.

As for which function the compiler should call, it'll probably need to be
special-cased for Objective-C in the compiler too. Here's the list of changes
that'd be needed for Objective-C ARC:

== Retain ==
COM:         if (obj) obj->AddRef();
ObjC:        obj = objc_retain(obj);
ObjC blocks: obj = objc_retainBlock(obj); // objc_retainBlock might do a copy

== Release ==
COM:         if (obj) obj->Release();
ObjC:        objc_release(obj);

== Assignment ==
COM:         if (obj) obj->AddRef(); if (var) var->Release(); var = obj;
ObjC:        objc_storeStrong(&var, obj);
ObjC blocks: obj = objc_retainBlock(obj); objc_release(var); var = obj;

As long as Walter implements D ARC in a way we can make the above substitutions
it shouldn't be too hard.

Then, support for autorelease is a matter of calling objc_autorelease on
returned objects from autoreleasing functions, followed by objc_retain after
the function call in the caller. We'll also have to check what ObjC ARC does
for pointer write-backs to autoreleased variables and mimic that.

There's an optimized path for autoreleased return values that we should use,
but I'd defer that to later. It involves a special no-op instruction to insert
at the right place as a flag. Also, autoreleased returns should be eliminated
when inlining.
Oct 09 2013
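For reference, a hedged sketch of how such compiler-emitted calls could bind
to the libobjc entry points listed above from D. `id` is modeled as an opaque
pointer here, and the wrapper function name is made up.

    alias id = void*;   // opaque Objective-C object pointer for this sketch

    extern (C)
    {
        id   objc_retain(id value);
        void objc_release(id value);
        void objc_storeStrong(id* location, id value); // retain new, release old, store
        id   objc_retainBlock(id value);               // may copy a stack block to the heap
        id   objc_autorelease(id value);
    }

    // What the compiler would emit for `var = obj;` on an Objective-C object:
    void storeObjC(ref id var, id obj)
    {
        objc_storeStrong(&var, obj);
    }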
Looks like this should go into the O-C DIP. On 6/29/2013 5:38 AM, Michel Fortin wrote:Implementing ARC for Objective-C is going to require some more compilersupport anyway (for autoreleased objects notably). Tweaking the compiler it so accepts Objective-C interfaces should just be a matter of tweaking the boolean expression that makes this check. "if (classdelc->objc && interfacedecl->objc) return true;" or something like that.As for which function the compiler should call, it'll probably need to bespecial-cased for Objective-C in the compiler too. Here's the list of changes that'd be needed for Objective-C ARC:== Retain == COM: if (obj) obj->AddRef(); ObjC: obj = objc_retain(obj); ObjC blocks: obj = objc_retainBlock(obj); // objc_retainBlock might do a copy == Release == COM: if (obj) obj->Release(); ObjC: objc_release(obj); == Assignment == COM: if (obj) obj->AddRef(); if (var) var->Release(); var = obj; ObjC: objc_storeStrong(&var, obj); ObjC blocks: obj = objc_retainBlock(obj); objc_release(var); var = obj; As long as Walter implements D ARC in a way we can make the abovesubstitutions it shouldn't be too hard.Then, support for autorelease is a matter of calling objc_autorelease onreturned objects from autoreleasing functions, followed by objc_retain after the function call in the caller. We'll also have to check what ObjC ARC does for pointer write-backs to autoreleased variables and mimick that.There's an optimized path for autoreleased return values that we should use,but I'd defer that to later. It involves a special no-op instruction to insert at the right place as a flag. Also, autoreleased returns should be eliminated when inlining.
Oct 09 2013
Jacob Carlborg: On 29 jun 2013, at 22:24, Walter Bright wrote:Looks like this should go into the O-C DIP.Shouldn't all this reference counting be in its own DIP?
Oct 09 2013
On 6/30/2013 1:50 AM, Jacob Carlborg wrote:On 29 jun 2013, at 22:24, Walter Bright wrote:Yes. It's not ready yet, though.Looks like this should go into the O-C DIP.Shouldn't all this reference counting be in its own DIP?
Oct 09 2013
Michel Fortin wrote:

Le 25-juin-2013 à 17:00, Walter Bright a écrit :
> 6. If a class or struct contains RC fields, calls to Release() for those
>    fields will be added to the destructor, and a destructor will be created
>    if one doesn't exist already.

Another thing to note: the above is dangerous if the destructor is called from
the GC and RC objects are allocated from GC memory. Referenced objects might
already have been destroyed and you'll be calling Release() on them. This will
happen when the GC releases a cycle.
Oct 09 2013
On 6/30/2013 12:35 PM, Michel Fortin wrote:
> Le 25-juin-2013 à 17:00, Walter Bright a écrit :
>> 6. If a class or struct contains RC fields, calls to Release() for those
>>    fields will be added to the destructor, and a destructor will be created
>>    if one doesn't exist already.
>
> Another thing to note: the above is dangerous if the destructor is called
> from the GC and RC objects are allocated from GC memory. Referenced objects
> might already have been destroyed and you'll be calling Release() on them.
> This will happen when the GC releases a cycle.

Amended as:

6. If a class or struct contains RC fields, calls to Release() for those fields
   will be added to the destructor, and a destructor will be created if one
   doesn't exist already. Release() implementations should take care to not
   destroy objects that are already destroyed, which can happen if the objects
   are allocated on the GC heap and the GC removes a cycle of refcounted
   objects.
Oct 09 2013
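A hedged sketch of what the amended rule could look like in practice, leaning
on the observation Walter makes a little further down that the GC frees
nothing while a collection cycle is in progress. The counter check, the field
names, and the finalization placeholder are illustrative only.

    class Node
    {
        private int refs = 1;
        Node next;                      // an RC field

        Node AddRef()  { ++refs; return this; }
        Node Release()
        {
            if (refs == 0)              // already counted down: the GC is
                return null;            //   tearing down a cycle containing us
            if (--refs == 0) { /* finalize; exact mechanics debated below */ }
            return null;
        }

        // Under rule 6 the compiler generates this destructor (or appends to
        // an existing one) to release the RC fields:
        ~this()
        {
            if (next !is null) next.Release();
        }
    }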
Michel Fortin wrote: Le 2013-06-30 à 16:32, Walter Bright a écrit :Amended as: 6. If a class or struct contains RC fields, calls to Release() for thosefields willbe added to the destructor, and a destructor will be created if one doesn'texist already.Release() implementations should take care to not destroy objects that arealready destroyed,which can happen if the objects are allocated on the GC heap and the GCremoves a cycle ofrefcounted objects.Good advice. But... how do you implement that? For one thing, I doubt there's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.
Oct 09 2013
On 6/30/2013 3:05 PM, Michel Fortin wrote:Le 2013-06-30 à 16:32, Walter Bright a écrit :fields willAmended as: 6. If a class or struct contains RC fields, calls to Release() for thoseexist already.be added to the destructor, and a destructor will be created if one doesn'talready destroyed,Release() implementations should take care to not destroy objects that areremoves a cycle ofwhich can happen if the objects are allocated on the GC heap and the GCan API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.refcounted objects.Good advice. But... how do you implement that? For one thing, I doubt there'sThe GC doesn't actually delete anything while it is doing a collection cycle. So the refcount could simply be checked.
Oct 09 2013
Steven Schveighoffer wrote: On Jun 30, 2013, at 6:11 PM, Walter Bright wrote:On 6/30/2013 3:05 PM, Michel Fortin wrote:fields willLe 2013-06-30 à 16:32, Walter Bright a écrit :Amended as: 6. If a class or struct contains RC fields, calls to Release() for thoseexist already.be added to the destructor, and a destructor will be created if one doesn'talready destroyed,Release() implementations should take care to not destroy objects that areremoves a cycle ofwhich can happen if the objects are allocated on the GC heap and the GCthere's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.refcounted objects.Good advice. But... how do you implement that? For one thing, I doubtSo the refcount could simply be checked. AFAIK, this isn't a requirement of the GC. May want to add it. I have bad experiences with trying to second guess the GC and when it actually kills the object. Note, if this is the case, then inc/dec refcount cannot depend on vtable, since that is zeroed. I'm wondering if the GC shouldn't set the RC to size_t.max when destructing, or even just +1 it, to ensure the ref count destructor doesn't accidentally free it before the reaper does. -SteveThe GC doesn't actually delete anything while it is doing a collection cycle.
Oct 09 2013
Michel Fortin wrote: Le 30-juin-2013 à 18:11, Walter Bright a écrit :On 6/30/2013 3:05 PM, Michel Fortin wrote:fields willLe 2013-06-30 à 16:32, Walter Bright a écrit :Amended as: 6. If a class or struct contains RC fields, calls to Release() for thoseexist already.be added to the destructor, and a destructor will be created if one doesn'talready destroyed,Release() implementations should take care to not destroy objects that areremoves a cycle ofwhich can happen if the objects are allocated on the GC heap and the GCthere's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.refcounted objects.Good advice. But... how do you implement that? For one thing, I doubtThe GC doesn't actually delete anything while it is doing a collection cycle.So the refcount could simply be checked. ... checked and decremented, and if it reaches zero in the thread the GC is currently running then it doesn't have to delete the object as, in theory, it should be destructed as part of the same run. Ok, I get it now. You should add a requirement that the reference counter be atomic because the GC can run in any thread and you still need to decrement counters of referenced objects in destructor. Honestly, I think it'd be much easier if the runtime provided its own base object you could use for reference counting with the GC to collect cycles. The provided implementation could rely on internal details of the GC since both would be part of druntime. There isn't much room for alternate implementations when the GC is involved anyway.
Oct 09 2013
On 6/30/2013 4:35 PM, Michel Fortin wrote:Le 30-juin-2013 à 18:11, Walter Bright a écrit :fields willOn 6/30/2013 3:05 PM, Michel Fortin wrote:Le 2013-06-30 à 16:32, Walter Bright a écrit :Amended as: 6. If a class or struct contains RC fields, calls to Release() for thosedoesn't exist already.be added to the destructor, and a destructor will be created if onealready destroyed,Release() implementations should take care to not destroy objects that areremoves a cycle ofwhich can happen if the objects are allocated on the GC heap and the GCthere's an API in the GC you can query for deleted objects, and if there was it'd be inefficient to call it for every call to Release. And also, will a virtual call to a function of a destroyed object work in the first place? It all seems quite fragile to me.refcounted objects.Good advice. But... how do you implement that? For one thing, I doubtcycle. So the refcount could simply be checked.The GC doesn't actually delete anything while it is doing a collection... checked and decremented, and if it reaches zero in the thread the GC iscurrently running then it doesn't have to delete the object as, in theory, it should be destructed as part of the same run. Ok, I get it now.You should add a requirement that the reference counter be atomic because theGC can run in any thread and you still need to decrement counters of referenced objects in destructor. I very much want to avoid requiring atomic counts - it's a major performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.Honestly, I think it'd be much easier if the runtime provided its own baseobject you could use for reference counting with the GC to collect cycles. The provided implementation could rely on internal details of the GC since both would be part of druntime. There isn't much room for alternate implementations when the GC is involved anyway.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:
> I very much want to avoid requiring atomic counts - it's a major performance
> penalty. Note that if the GC is reaping a cycle, nobody else is referencing
> the object, so this should not be an issue.

I think you didn't understand what Michel was saying. Take for example:

    A->B->C->A

This is a cycle. Imagine that nobody else is pointing at A, B or C. Fine. The
GC starts to collect this cycle.

But let's say that D is not being collected *AND* B has a reference to D. B
could be getting destroyed in one thread, and decrementing D's reference count,
while someone else in another thread is incrementing/decrementing D's reference
count.

I agree that RC optimally is thread-local. But if you involve the GC, then ref
incs and decs have to be atomic. I don't think this is that bad. iOS on ARM,
which has terrible atomic primitives, uses atomic reference counts.

If you do NOT involve the GC and are careful about cycles, then you could
potentially have a RC solution that does not require atomics. But that would
have to be a special case, with the danger of having cycles.

-Steve
Oct 09 2013
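In code, the difference Steven is pointing at is roughly the following. A
hedged sketch with hypothetical names, using core.atomic for the case where
the count can be touched by a GC finalizer running on another thread; a plain
non-atomic int count is only sound if every increment and decrement, including
those from finalizers, happens on one thread.

    import core.atomic;

    class Counted
    {
        private shared int refs = 1;

        // B's destructor may run on whichever thread triggered the
        // collection, so D's count can be hit from two threads at once;
        // hence the atomic operations.
        void addRef()  { atomicOp!"+="(refs, 1); }
        void release()
        {
            if (atomicOp!"-="(refs, 1) == 0) { /* finalize */ }
        }
    }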
Michel Fortin wrote: Le 30-juin-2013 à 20:25, Steven Schveighoffer a écrit :A->B->C->A this is a cycle. Imagine that nobody else is pointing at A, B or C. Fine.The GC starts to collect this cycle.But let's say that D is not being collected *AND* B has a reference to D. B could be getting destroyed in one thread, and decrementing D's referencecount, while someone else in another thread is incrementing/decrementing D's reference count.I agree that RC optimally is thread-local. But if you involve the GC, thenref incs and decs have to be atomic. Exactly what I was trying to explain. Thanks.I don't think this is that bad. iOS on ARM which has terrible atomicprimitives uses atomic reference counts. Moreover iOS uses a single spinlock to protect a global hash table containing all reference counts.If you do NOT involve the GC and are careful about cycles, then you couldpotentially have a RC solution that does not require atomics. But that would have to be a special case, with the danger of having cycles. Not involving the GC is quite difficult: you need to be absolutely sure you have no pointer pointing to that thread-local ref-counted object anywhere in the GC-heap. Unfortunately, there's no way to guaranty statically what is part of the GC heap and what is not, so any non-atomic reference counter is not safe.
Oct 09 2013
Steven Schveighoffer wrote:

On Jun 30, 2013, at 10:08 PM, Michel Fortin wrote:
> Le 30-juin-2013 à 20:25, Steven Schveighoffer a écrit :
>> I don't think this is that bad. iOS on ARM which has terrible atomic
>> primitives uses atomic reference counts.
>
> Moreover iOS uses a single spinlock to protect a global hash table containing
> all reference counts.

Hearing this, I actually find it amazing how well it works :)

>> If you do NOT involve the GC and are careful about cycles, then you could
>> potentially have a RC solution that does not require atomics. But that would
>> have to be a special case, with the danger of having cycles.
>
> Not involving the GC is quite difficult: you need to be absolutely sure you
> have no pointer pointing to that thread-local ref-counted object anywhere in
> the GC-heap. Unfortunately, there's no way to guaranty statically what is
> part of the GC heap and what is not, so any non-atomic reference counter is
> not safe.

This is true, I was thinking of a garbage collected RC object referring to an
RC object, I wasn't thinking of a fully GC object referring to an RC object.

In terms of pure functions and possibly a nogc attribute, this might be a
possibility. Maybe at some point we have a nogcref attribute we attach to
specific *types* so the compiler prevents you from storing any references to
that type in the GC.

I think it is important to reserve the possibility for having cases where RC
inc/dec is not atomic. Especially where we have D's type system identifying
what is shared and what is not. Especially when there is the possibility for
thread-local GCs.

-Steve
Oct 09 2013
On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.I very much want to avoid requiring atomic counts - it's a major performanceI think you didn't understand what Michel was saying. Take for example: A->B->C->A this is a cycle. Imagine that nobody else is pointing at A, B or C. Fine.The GC starts to collect this cycle.But let's say that D is not being collected *AND* B has a reference to D. B could be getting destroyed in one thread, and decrementing D's referencecount, while someone else in another thread is incrementing/decrementing D's reference count.I agree that RC optimally is thread-local. But if you involve the GC, thenref incs and decs have to be atomic. This is actually a problem right now with the GC, as destructors may be run in another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in.I don't think this is that bad. iOS on ARM which has terrible atomicprimitives uses atomic reference counts. It's bad. ARM is not the only processor out there.
Oct 09 2013
Steven Schveighoffer wrote: On Jun 30, 2013, at 10:26 PM, Walter Bright wrote:On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:I very much want to avoid requiring atomic counts - it's a majorThe GC starts to collect this cycle.I think you didn't understand what Michel was saying. Take for example: A->B->C->A this is a cycle. Imagine that nobody else is pointing at A, B or C. Fine.count, while someone else in another thread is incrementing/decrementing D's reference count.But let's say that D is not being collected *AND* B has a reference to D. B could be getting destroyed in one thread, and decrementing D's referenceref incs and decs have to be atomic.I agree that RC optimally is thread-local. But if you involve the GC, thenThis is actually a problem right now with the GC, as destructors may be runin another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in. I think that's a tall order presently. For instance, on linux, the threads are all stopped using a signal. It's a very bad idea to run destructors in a signal handler. What it seems like you are saying is that a prerequisite for ref counting is to have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further.primitives uses atomic reference counts.I don't think this is that bad. iOS on ARM which has terrible atomicIt's bad. ARM is not the only processor out there.Pragmatically, I think if D targets x86 variants and ARM, it is well-situated in the mainstream of existing devices. Yes, it would be nice if it could target other obscure platforms, but if we are talking ref counting works poorly on those, I don't think we are any worse off than today. Note that we can keep the options open, and implement atomic RC now without many headaches. -Steve
Oct 09 2013
On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:On Jun 30, 2013, at 10:26 PM, Walter Bright wrote:performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue.On 6/30/2013 5:25 PM, Steven Schveighoffer wrote:On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:I very much want to avoid requiring atomic counts - it's a majorThe GC starts to collect this cycle.I think you didn't understand what Michel was saying. Take for example: A->B->C->A this is a cycle. Imagine that nobody else is pointing at A, B or C. Fine.count, while someone else in another thread is incrementing/decrementing D's reference count.But let's say that D is not being collected *AND* B has a reference to D. B could be getting destroyed in one thread, and decrementing D's referenceref incs and decs have to be atomic.I agree that RC optimally is thread-local. But if you involve the GC, thenin another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in.This is actually a problem right now with the GC, as destructors may be runI think that's a tall order presently. For instance, on linux, the threadsare all stopped using a signal. It's a very bad idea to run destructors in a signal handler.What it seems like you are saying is that a prerequisite for ref counting isto have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further. Not really. This doesn't make anything worse. Also, the proposed solution to this issue is to post the "destruct" list to the appropriate thread, and that thread runs it next time it calls the GC.primitives uses atomic reference counts.I don't think this is that bad. iOS on ARM which has terrible atomicin the mainstream of existing devices. Yes, it would be nice if it could target other obscure platforms, but if we are talking ref counting works poorly on those, I don't think we are any worse off than today. Note that we can keep the options open, and implement atomic RC now without many headaches.It's bad. ARM is not the only processor out there.Pragmatically, I think if D targets x86 variants and ARM, it is well-situatedWe don't need to require atomic RC for these.
Oct 09 2013
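A hedged sketch of the "post the destruct list to the owning thread" idea
mentioned above, with made-up names and no synchronization shown; a real
version would need locking around the queues, a way to track each object's
owning thread, and a hook the GC calls on each thread's next entry.

    import core.thread : Thread, ThreadID;

    struct PendingFinalizers
    {
        // objects whose destructors must run on their owning thread
        private Object[][ThreadID] queues;

        // called by the collector instead of finalizing on the collecting thread
        void post(ThreadID owner, Object obj)
        {
            queues[owner] ~= obj;
        }

        // called by each thread on its next GC entry (or allocation)
        void drain()
        {
            auto self = Thread.getThis.id;
            foreach (obj; queues.get(self, null))
                destroy(obj);
            queues.remove(self);
        }
    }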
Steven Schveighoffer wrote: On Jul 1, 2013, at 3:11 AM, Walter Bright wrote:On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:are all stopped using a signal. It's a very bad idea to run destructors in a signal handler.I think that's a tall order presently. For instance, on linux, the threadsto have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further.What it seems like you are saying is that a prerequisite for ref counting isNot really. This doesn't make anything worse. Also, the proposed solution tothis issue is to post the "destruct" list to the appropriate thread, and that thread runs it next time it calls the GC. I really urge you to make this a separate project. It's not trivial. Logically, it's sound, but the implementation will be very difficult. I also think Sean (and probably others) should be involved for that discussion.well-situated in the mainstream of existing devices. Yes, it would be nice if it could target other obscure platforms, but if we are talking ref counting works poorly on those, I don't think we are any worse off than today. Note that we can keep the options open, and implement atomic RC now without many headaches.Pragmatically, I think if D targets x86 variants and ARM, it isI didn't say that. I said we could implement atomic RC without any changes to the GC, and worry about optimizing with non-atomic RC later. As long as we make it *possible*. -SteveWe don't need to require atomic RC for these.
Oct 09 2013
On 7/1/2013 6:08 AM, Steven Schveighoffer wrote:On Jul 1, 2013, at 3:11 AM, Walter Bright wrote:are all stopped using a signal. It's a very bad idea to run destructors in a signal handler.On 6/30/2013 7:36 PM, Steven Schveighoffer wrote:I think that's a tall order presently. For instance, on linux, the threadsis to have thread-local GC working. If that is the case, we need to start a thread-local GC "thread" before this goes any further.What it seems like you are saying is that a prerequisite for ref countingthis issue is to post the "destruct" list to the appropriate thread, and that thread runs it next time it calls the GC.Not really. This doesn't make anything worse. Also, the proposed solution toI really urge you to make this a separate project. It's not trivial.Logically, it's sound, but the implementation will be very difficult. I also think Sean (and probably others) should be involved for that discussion. Make what a separate project? The destruction of objects by the GC in local threads? It already is not part of the ref counting proposal.
Oct 09 2013
Steven Schveighoffer wrote: On Jul 1, 2013, at 12:17 PM, Walter Bright wrote:Logically, it's sound, but the implementation will be very difficult. I also think Sean (and probably others) should be involved for that discussion.I really urge you to make this a separate project. It's not trivial.Make what a separate project? The destruction of objects by the GC in localthreads? It already is not part of the ref counting proposal.As far as I can tell, the ref counting proposal is not viable without it, as long as you insist on non-atomic RC increments and decrements. How can it possibly not be a prerequisite to this, and therefore part of the proposal? Unless you are saying now that atomic ref counting is OK? I'm going by your previous statement:I very much want to avoid requiring atomic counts - it's a major performancepenalty. -Steve
Oct 09 2013
On 2013-10-10 04:35, Walter Bright wrote:
> Steven Schveighoffer wrote:
>
> On Jul 1, 2013, at 12:17 PM, Walter Bright wrote:
>>> I really urge you to make this a separate project. It's not trivial.
>>> Logically, it's sound, but the implementation will be very difficult. I
>>> also think Sean (and probably others) should be involved for that
>>> discussion.
>>
>> Make what a separate project? The destruction of objects by the GC in local
>> threads? It already is not part of the ref counting proposal.
>
> As far as I can tell, the ref counting proposal is not viable without it, as
> long as you insist on non-atomic RC increments and decrements. How can it
> possibly not be a prerequisite to this, and therefore part of the proposal?
>
> Unless you are saying now that atomic ref counting is OK? I'm going by your
> previous statement:
>
>> I very much want to avoid requiring atomic counts - it's a major performance
>> penalty.
>
> -Steve

Is this the last email in the conversation? In that case I think you should
clearly mark that with a post.

-- 
/Jacob Carlborg
Oct 09 2013
On 10/9/2013 11:52 PM, Jacob Carlborg wrote:Is this the last email in the conversation?Yes, I posted them in chronological order.
Oct 10 2013
Michel Fortin wrote: Le 30-juin-2013 à 22:26, Walter Bright a écrit :This is actually a problem right now with the GC, as destructors may be runin another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in. Indeed. Maybe that could work. How ironic that we can't implement RC efficiently because of the GC. That said, it strongly favors having a base RC object implementation in druntime, where it can be kept in sync with the GC.
Oct 09 2013
On 6/30/2013 7:47 PM, Michel Fortin wrote:Le 30-juin-2013 à 22:26, Walter Bright a écrit :in another thread than they belong in. The situation you describe is not worse or better than that, it's the same thing. The solution is to run the destructors in the same thread the objects belong in.This is actually a problem right now with the GC, as destructors may be runIndeed. Maybe that could work. How ironic that we can't implement RCefficiently because of the GC.That said, it strongly favors having a base RC object implementation indruntime, where it can be kept in sync with the GC.The GC doesn't need to know about it.
Oct 09 2013
And that's the last email in the original thread!
Oct 10 2013
On 10/10/2013 12:31 AM, Walter Bright wrote:And that's the last email in the original thread!Durn, no it isn't.
Oct 10 2013
On Thursday, 10 October 2013 at 02:28:13 UTC, Walter Bright wrote:Steven Schveighoffer wrote: On Jun 30, 2013, at 8:18 PM, Walter Bright wrote:I think this ties into the requirement that after the GC collects thread-local objects, they must be finalized by the thread that owns them (assuming it's still alive). What's missing is some way to track what thread owns an object. This isn't super difficult to add in the simple case, but if we allow thread-local objects to be transferred between threads, then the transferral of ownership has to be communicated to the GC. Assuming for the moment that's not a problem though, I think RC updates could be non-atomic for thread-local data.I very much want to avoid requiring atomic counts - it's amajor performance penalty. Note that if the GC is reaping a cycle, nobody else is referencing the object, so this should not be an issue. I think you didn't understand what Michel was saying. Take for example: A->B->C->A this is a cycle. Imagine that nobody else is pointing at A, B or C. Fine. The GC starts to collect this cycle. But let's say that D is not being collected *AND* B has a reference to D. B could be getting destroyed in one thread, and decrementing D's reference count, while someone else in another thread is incrementing/decrementing D's reference count. I agree that RC optimally is thread-local. But if you involve the GC, then ref incs and decs have to be atomic.
Oct 11 2013
I've made a short roundup of the different features/requirements that have been
mentioned (may have forgotten some and added some) as this thread has a
complexity that presumably makes it quite hard to follow for most readers. I
have also attached an estimated priority for each requirement based on the
discussion and my own experiences.

- Memory safety [very important but also very difficult/limiting]
  Disallow escaping uncounted references to reference counted memory.
  Keywords: pure, scope, isolated/owned

- COM compatible [important]
  Needs to support internal reference counting using AddRef/Release, while
  obeying the COM calling convention

- Objective-C compatible [important]
  Weak references, manual memory management and autorelease pools are some
  keywords here

- Handle reference cycles [nice to have]
  Requires GC memory for storing the instances

- Weak pointers [critical]
  Only briefly mentioned, but critical for many data structures (e.g. graphs
  or caches); requires external reference counting

- Not require two separate counts for COM [mildly important]
  Using GC memory would require a second reference count for the D side of
  things, which is not desirable for multiple reasons

- Support type qualifiers [critical]
  All usual type qualifiers and conversions should work as expected. This is
  not possible in a pure template based solution.

- Support non-class types [nice to have]
  Although slightly limiting, classes provide a convenient abstraction and
  will arguably capture >90% of use cases just fine

- Support referencing fields/slices [nice to have]
  Letting references to members escape in a safe way would greatly increase
  the flexibility, but ties it tightly to the GC

- Allow manual memory management [critical]
  For COM/Obj-C and any situation where external code is to take
  responsibility for the ref counting, this needs to be an option

- Not require new annotations [important]
  Getting this to work without introducing new keywords/syntax is strongly
  preferred by Walter

- Safe shared(RefCountedType) variables [important]
  There needs to be a way to make shared references thread-safe

- Support OOP reasonably well [important]
  The usual up and down casts should work and calling functions of base
  classes needs to be safe

Please mention any points that I have forgotten so that we have some kind of
unit test against which proposed designs can be checked.
Oct 12 2013
To support most of the requirements we need to offer some control over the reference type. Forcing the reference to be a pointer to the class instance precludes proper weak references and makes thread-safety difficult. Rainer Schütze's proposal [1] looked promising, but didn't quite work out. However, by going a bit further, I think this approach can be fixed and will provide all the flexibility needed to implement solutions that satisfy any of those requirements.

The basic idea is the same: any reference to a class that is recognized as being reference counted is replaced by a struct that performs the reference counting using RAII (e.g. std.typecons.RefCounted). This allows any reference counting scheme to be implemented (internal/external, with or without weak refs, global counter table, GC memory or malloc, etc.).

TL;DR: let some code speak first instead of a full-blown spec:

    struct RefCounted(T) { /* ... */ }

    // referenceType!RefCounted
    class C {
        // the presence of a C.ReferenceType template makes the class a
        // reference counted class
        // Note: a referenceType UDA as above, defined in object.d,
        // could be a cleaner/more explicit alternative
        alias ReferenceType = RefCounted;

        // C is now internally renamed to __ref_C to avoid ambiguities,
        // and "C" itself is an alias for ReferenceType!__ref_C
        pragma(msg, C.stringof); // "RefCounted!__ref_C"

        void method()
        {
            // The "this" pointer, too, is of the reference counted type.
            // Caveats: how to handle private fields? COM calling convention?
            pragma(msg, typeof(this)); // "RefCounted!__ref_C"

            // Alternative: allow only pure functions to avoid
            // escaping references

            // Another alternative is to make 'scope' powerful
            // enough and use that:
            pragma(msg, typeof(this)); // "scope __ref_C"
        }
    }

    // The reference type itself is never const/immutable,
    // to enable reference counting for qualified types
    C c;        // -> RefCounted!__ref_C
    const(C) d; // -> RefCounted!(const(__ref_C))

    // To support the usual implicit conversions, some substitution is
    // needed, since we have no implicit cast support for UDTs in the
    // language
    d = c; // d = typeof(d)(c)
    // or
    // d = typeof(d).implicitCastFrom(c)
    // or
    // d = typeof(c).implicitCastTo!(typeof(d))

    // shared, however, is applied to the reference count itself (and
    // transitively to the object) to force thread-safety - or rather, to
    // avoid accidental use of unsafe implementations for shared references
    shared(const(C)) e; // -> shared(RefCounted!(const(__ref_C)))

---

Caveat: changing the "this" pointer from '__ref_C' to 'RefCounted!__ref_C' has implications for the calling convention, which need to be taken into account when COM objects are involved. A simple COMPtr-like struct that only contains the target pointer may be enough here, though.

Also, to guarantee memory safety, some additional measures need to be taken to avoid escaping plain references to refcounted memory. One solution is the use of isolated/owned types; another is to make 'scope' not only check for shallow reference escaping, but also check for escaping of references to fields (a similar thing, but it behaves differently). Both combined would of course be ideal. I think this is an issue that is mostly orthogonal to the refcount topic. See also the corresponding thread [2].

[1]: http://forum.dlang.org/thread/l34lei$255v$1 digitalmars.com?page=5#post-l352nk:242g3b:246:40digitalmars.com
[2]: http://forum.dlang.org/thread/kluaojijixhwigoujeip forum.dlang.org
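To make the RAII idea concrete, here is a purely illustrative filling-in of the elided RefCounted!T body above; the proposal deliberately leaves the scheme open (internal/external count, GC or malloc backing, weak-reference support, ...), so treat this as one possible sketch, with non-atomic counts and a malloc'd payload for simplicity:

    struct RefCounted(T)
    {
        private static struct Payload { size_t count; T value; }
        private Payload* p;

        static RefCounted make(T value)
        {
            import core.stdc.stdlib : malloc;
            import std.conv : emplace;

            RefCounted r;
            r.p = cast(Payload*) malloc(Payload.sizeof);
            emplace(r.p, Payload(1, value));   // construct count=1 plus the value
            return r;
        }

        this(this)   // copying the reference bumps the count
        {
            if (p) ++p.count;
        }

        ~this()      // the last reference destroys the value and frees the payload
        {
            import core.stdc.stdlib : free;
            if (p && --p.count == 0)
            {
                destroy(p.value);
                free(p);
            }
        }

        ref inout(T) get() inout { return p.value; }
    }

In the proposal, the compiler would presumably substitute this kind of wrapper for every 'C' reference, so the postblit/destructor pair is where the AddRef/Release-style calls would end up.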
Oct 12 2013
On Saturday, 12 October 2013 at 14:03:26 UTC, Sönke Ludwig wrote:

> I've made a short roundup of the different features/requirements that have been mentioned (may have forgotten some and added some), as this thread has a complexity that presumably makes it quite hard to follow for most readers. I have also attached an estimated priority for each requirement based on the discussion and my own experiences.
>
> - Memory safety [very important but also very difficult/limiting]
>   Disallow escaping uncounted references to reference counted memory.
>   Keywords: pure, scope, isolated/owned

We have a first missing block here :D I do think this is mandatory.

> - Objective-C compatible [important]
>   Weak references, manual memory management and autorelease pools are some keywords here

Can someone explain to me what an autorelease pool is?

> - Handle reference cycles [nice to have]
>   Requires GC memory for storing the instances

The good news is that we can (and IMO should) layer ref counting on top of the GC. This is the only way to make both work nicely together and to have a safety net against leakage.

> - Weak pointers [critical]
>   Only briefly mentioned, but critical for many data structures (e.g. graphs or caches); requires external reference counting

Easy for RefCounted, but it becomes tricky for GC stuff.

> - Not require two separate counts for COM [mildly important]
>   Using GC memory would require a second reference count for the D side of things, which is not desirable for multiple reasons

Can you explain that one a bit more? Especially how it requires two counts.

> - Support type qualifiers [critical]
>   All the usual type qualifiers and conversions should work as expected. This is not possible in a purely template based solution.

Here we have the second missing piece. We need a way to tail-qualify templates.

> - Not require new annotations [important]
>   Getting this to work without introducing new keywords/syntax is strongly preferred by Walter

We have two missing blocks to make that work (from a language perspective; a lot of runtime/compiler magic is also required).

> - Support OOP reasonably well [important]
>   The usual up and down casts should work, and calling functions of base classes needs to be safe

I see no way to make that work without increasing the scope of scope (huhuhu :P).

> Please mention any points that I have forgotten, so that we have some kind of unit test against which proposed designs can be checked.

You did an excellent job here. I guess we have two missing pieces (and an incomplete one in the form of scope) to get sorted out, and then we can get something really nice here!
Oct 12 2013
On 2013-10-13 01:15:49 +0000, "deadalnix" <deadalnix gmail.com> said:

> Can someone explain to me what an autorelease pool is?

A basic concept in Objective-C to make manual reference counting bearable. Now that things are moving to *automatic* reference counting, autorelease pools are becoming less important, but they remain there for backward compatibility, and autoreleased objects must be handled correctly by ARC according to the existing conventions.

The concept is to have functions return autoreleased objects, i.e. objects with a pending release. Each time you autorelease an object, instead of the counter being decremented immediately, the object gets added to the autorelease pool, and the pool decrements the counter later when it gets drained. So when the caller gets an autoreleased object, it doesn't have to decrement the counter once it has stopped using the object as a temporary; the object will be cleaned up automatically later, generally at the next iteration of the event loop. You only need to retain the object if you're storing it somewhere other than a local variable. This is what made manual reference counting bearable in Objective-C.

Autorelease pool support is only useful and needed for correctly implementing ARC for Objective-C object types.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
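For readers more at home in D than Objective-C, here is a rough D-flavored illustration of the mechanism Michel describes. None of these names come from the Objective-C runtime; the only point is that the release is deferred to a pool drain instead of happening immediately.

    // Hypothetical illustration of deferred release in D terms.
    interface Refcounted
    {
        void retain();   // increment the reference count
        void release();  // decrement; frees the object at zero
    }

    struct AutoreleasePool
    {
        private Refcounted[] pending;

        // Instead of calling release() now, park the object in the pool.
        void autorelease(Refcounted obj) { pending ~= obj; }

        // Perform all the deferred decrements, typically once per
        // iteration of the event loop.
        void drain()
        {
            foreach (obj; pending)
                obj.release();
            pending = null;
        }
    }

    AutoreleasePool pool;

    class Widget : Refcounted
    {
        private size_t count = 1;
        void retain()  { ++count; }
        void release() { if (--count == 0) { /* finalize/free here */ } }
    }

    // A callee hands back a temporary: the caller can use it freely until
    // the next drain, and only needs to retain() it if it stores it longer.
    Refcounted makeWidget()
    {
        auto w = new Widget;    // starts with count 1, owned by the creator
        pool.autorelease(w);    // hand that ownership to the pool
        return w;
    }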
Oct 12 2013