digitalmars.D - Proposal: Thread-sealed containers
- dsimcha (21/21) Mar 26 2011 I'm thinking about ways to make std.concurrency's model more flexible
- Jason House (5/27) Mar 26 2011 It might be possible to simply store the owning thread id. Since the thr...
- dsimcha (18/36) Mar 27 2011 I was using Andrei's definition that had been floated around here
- Andrei Alexandrescu (5/26) Mar 27 2011 Sounds like a good idea. Given the restriction, instead of one reference...
I'm thinking about ways to make std.concurrency's model more flexible without compromising safety. It's sometimes useful to cheaply _move_ (as opposed to share or copy) data between threads. I wonder if we could co-opt sealed containers here:

1. Define basic data structures like arrays, hash tables and trees that are otherwise identical to sealed containers, but maintain a _thread-specific_ reference count. This can be naively implemented as a hash table with Thread keys and size_t values, but there's probably a better way. Provisionally call these "thread-sealed containers".

   A thread-sealed container has the invariant that no two thread-specific reference counts may be simultaneously nonzero. If this becomes untrue, an Error is thrown at runtime. This is not as good as a compile time error, but it's much better than a low-level data race.

2. Restrict the contents of thread-sealed containers to types that can be safely shared/moved between threads: Primitives, shared and immutable data, and other thread-sealed containers.

3. Define a std.concurrency.moveSend() function. This function takes a thread-sealed container by reference, decrements the sending thread's reference count, enforces that it's zero, then sends the container as a message to the receiving thread. The receiving thread increments its reference count of the container.
Mar 26 2011
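A minimal sketch of the naive scheme from the opening post, written in D. It assumes a hypothetical `ThreadSealedArray` class (not part of Phobos) that keeps a `size_t[Thread]` table of thread-specific counts and throws an `Error` as soon as a second thread tries to hold a reference. The names `retain`, `release` and `secondThreadIsRejected` are illustrative only, and element access is reduced to `put`/`length` so that no reference to the data escapes.

```d
import core.thread : Thread;

/// Hypothetical thread-sealed array of ints (illustration only).
final class ThreadSealedArray
{
    private int[] data;
    private size_t[Thread] counts; // naive table: Thread -> reference count

    /// Register one more reference held by the calling thread.
    void retain()
    {
        synchronized (this)
        {
            auto me = Thread.getThis();
            // Invariant: no two threads may have a nonzero count at once.
            foreach (t, n; counts)
                if (t !is me && n != 0)
                    throw new Error("container is referenced by another thread");
            if (auto p = me in counts)
                ++*p;
            else
                counts[me] = 1;
        }
    }

    /// Drop one reference held by the calling thread.
    void release()
    {
        synchronized (this)
        {
            auto p = Thread.getThis() in counts;
            if (p is null || *p == 0)
                throw new Error("release by a thread holding no reference");
            --*p;
        }
    }

    void put(int x) { data ~= x; }
    size_t length() const { return data.length; }
}

/// Returns true if a second thread's retain() is rejected while the
/// calling thread still holds a reference.
bool secondThreadIsRejected()
{
    auto c = new ThreadSealedArray;
    c.retain();
    c.put(42);

    bool threw = false;
    auto t = new Thread({
        try { c.retain(); }
        catch (Error e) { threw = true; }
    });
    t.start();
    t.join();
    return threw;
}

unittest
{
    assert(secondThreadIsRejected());
}
```

The table makes every `retain` scan all known threads, which is presumably the cost the post has in mind when it says there is probably a better way.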
dsimcha Wrote:

> I'm thinking about ways to make std.concurrency's model more flexible without compromising safety. It's sometimes useful to cheaply _move_ (as opposed to share or copy) data between threads. I wonder if we could co-opt sealed containers here:

What definition of sealed are you using? A quick web search mostly brought up Andrei might have used the term to mean no aliases to the data held by the container?

> 1. Define basic data structures like arrays, hash tables and trees that are otherwise identical to sealed containers, but maintain a _thread-specific_ reference count. This can be naively implemented as a hash table with Thread keys and size_t values, but there's probably a better way. Provisionally call these "thread-sealed containers".

It might be possible to simply store the owning thread id. Since the thread that sent the container will still have a reference to it, you may be forced to do some extra runtime checks on every method call anyway...

> A thread-sealed container has the invariant that no two thread-specific reference counts may be simultaneously nonzero. If this becomes untrue, an Error is thrown at runtime. This is not as good as a compile time error, but it's much better than a low-level data race.
>
> 2. Restrict the contents of thread-sealed containers to types that can be safely shared/moved between threads: Primitives, shared and immutable data, and other thread-sealed containers.

Some kind of equivalent to assumeUnique may be needed here?

> 3. Define a std.concurrency.moveSend() function. This function takes a thread-sealed container by reference, decrements the sending thread's reference count, enforces that it's zero, then sends the container as a message to the receiving thread. The receiving thread increments its reference count of the container.

I like that you're thinking about this kind of stuff. I am over tired right now and haven't really thought this through, but it feels like this could lead to a major safety upgrade for std.concurrency.
Mar 26 2011
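The owner-id suggestion in the reply above could look roughly like this in D. This is a sketch under assumptions: `OwnedArray`, `checkOwner` and `transferTo` are hypothetical names, the owner is stored as a `core.thread.Thread` reference rather than a numeric id, and synchronization of the `owner` field is left out for brevity. It shows the runtime check that every method call would have to perform, and how the sender's stale reference is caught by it.

```d
import core.thread : Thread;

/// Hypothetical container that remembers a single owning thread.
final class OwnedArray
{
    private int[] data;
    private Thread owner;

    this() { owner = Thread.getThis(); }

    /// The check every public method performs before touching the data.
    private void checkOwner() const
    {
        if (Thread.getThis() !is owner)
            throw new Error("container accessed by a thread that does not own it");
    }

    void put(int x) { checkOwner(); data ~= x; }
    size_t length() const { checkOwner(); return data.length; }

    /// Hand the container over; the old owner's reference goes stale.
    void transferTo(Thread newOwner) { checkOwner(); owner = newOwner; }
}

/// Returns true if the new owner can use the container and the old
/// owner's stale reference is rejected afterwards.
bool staleReferenceIsRejected()
{
    auto c = new OwnedArray;
    c.put(1);

    bool workerOk = false;
    auto worker = new Thread({
        c.put(2);
        workerOk = (c.length() == 2);
    });
    c.transferTo(worker); // ownership moves before the worker starts
    worker.start();
    worker.join();

    bool threw = false;
    try { c.put(3); }     // the sender still holds a reference, but may not use it
    catch (Error e) { threw = true; }
    return workerOk && threw;
}

unittest
{
    assert(staleReferenceIsRejected());
}
```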
On 3/27/2011 2:47 AM, Jason House wrote:

> dsimcha Wrote:
>
>> I'm thinking about ways to make std.concurrency's model more flexible without compromising safety. It's sometimes useful to cheaply _move_ (as opposed to share or copy) data between threads. I wonder if we could co-opt sealed containers here:
>
> What definition of sealed are you using? A quick web search mostly brought up Andrei might have used the term to mean no aliases to the data held by the container?

I was using Andrei's definition that had been floated around here previously. For those not already familiar, a sealed container is one where references/pointers to the elements cannot be obtained. This allows things like reference-counted memory management to be done safely.

>> 1. Define basic data structures like arrays, hash tables and trees that are otherwise identical to sealed containers, but maintain a _thread-specific_ reference count. This can be naively implemented as a hash table with Thread keys and size_t values, but there's probably a better way. Provisionally call these "thread-sealed containers".
>
> It might be possible to simply store the owning thread id. Since the thread that sent the container will still have a reference to it, you may be forced to do some extra runtime checks on every method call anyway...

Now that I think about it some more, we don't need explicit thread-specific reference counts. All we need to prove is that we're dealing with a unique reference, i.e. that the reference count is 1. moveSend() then clears the sending thread's view of the container so that the sending thread no longer has access to it and sends it as a message to the receiving thread. The receiving thread is the new owner.

The bottom line concept is that uniqueness is useful but difficult to prove statically. (To refresh people's memory uniqueness in this context means that a piece of data only has one pointer/reference pointing to it.) Static solutions will always be conservative and risk making the type system absurdly complex. Therefore, enforce uniqueness at runtime when it can be determined exactly via reference counting. The same principle may solve the problem of immutable data creation.

>> 3. Define a std.concurrency.moveSend() function. This function takes a thread-sealed container by reference, decrements the sending thread's reference count, enforces that it's zero, then sends the container as a message to the receiving thread. The receiving thread increments its reference count of the container.
>
> I like that you're thinking about this kind of stuff. I am over tired right now and haven't really thought this through, but it feels like this could lead to a major safety upgrade for std.concurrency.
Mar 27 2011
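The revised idea above (prove the reference count is 1, clear the sender's view, send) can be sketched in D as follows. Nothing here exists in std.concurrency: `moveSend`, `moveReceive`, `SealedArray` and `Payload` are hypothetical, and the `cast(shared)` is an assumption about how such a function might get an unshared payload past `send`'s static aliasing check once uniqueness has been verified at runtime.

```d
import std.concurrency : Tid, receive, receiveOnly, send, spawn, thisTid;

/// Storage behind every handle (hypothetical layout).
final class Payload
{
    int[] data;
    size_t refs = 1;
}

/// Hypothetical reference-counted handle; elements are only copied in,
/// so no pointer into `data` ever escapes.
struct SealedArray
{
    private Payload p;

    static SealedArray create()
    {
        SealedArray s;
        s.p = new Payload;
        return s;
    }

    this(this) { if (p !is null) ++p.refs; } // copying a handle adds a reference
    ~this() { if (p !is null) --p.refs; }    // the GC reclaims the payload itself

    void put(int x) { p.data ~= x; }
    size_t length() const { return p is null ? 0 : p.data.length; }
    bool isCleared() const { return p is null; }
}

/// Hypothetical moveSend: legal only when `s` is the sole reference.
void moveSend(Tid to, ref SealedArray s)
{
    if (s.p is null || s.p.refs != 1)
        throw new Error("moveSend: container is not uniquely referenced");
    auto raw = cast(shared) s.p; // assumption: cast past the static sharing check
    s.p = null;                  // the sender loses its view of the container
    send(to, raw);
}

/// Receiving side: the calling thread becomes the new owner.
SealedArray moveReceive()
{
    Payload got;
    receive((shared Payload raw) { got = cast(Payload) raw; });
    SealedArray s;
    s.p = got;
    return s;
}

void worker(Tid parent)
{
    auto s = moveReceive();
    s.put(3);
    send(parent, s.length());
}

/// Moves a two-element container to a worker; returns the length it reports.
size_t moveRoundTrip()
{
    auto a = SealedArray.create();
    a.put(1);
    a.put(2);
    auto tid = spawn(&worker, thisTid);
    moveSend(tid, a);
    assert(a.isCleared());
    return receiveOnly!size_t();
}

/// A second handle makes the reference count 2, so the move is refused.
bool copyBlocksMove()
{
    auto a = SealedArray.create();
    auto b = a;
    try { moveSend(thisTid, a); }
    catch (Error e) { return true; }
    return false;
}
```

The count is not thread-specific at all in this version: while a container is uniquely referenced there is exactly one thread that can reach it, so a plain counter is enough.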
On 03/26/2011 05:26 PM, dsimcha wrote:

> I'm thinking about ways to make std.concurrency's model more flexible without compromising safety. It's sometimes useful to cheaply _move_ (as opposed to share or copy) data between threads. I wonder if we could co-opt sealed containers here:
>
> 1. Define basic data structures like arrays, hash tables and trees that are otherwise identical to sealed containers, but maintain a _thread-specific_ reference count. This can be naively implemented as a hash table with Thread keys and size_t values, but there's probably a better way. Provisionally call these "thread-sealed containers".
>
>    A thread-sealed container has the invariant that no two thread-specific reference counts may be simultaneously nonzero. If this becomes untrue, an Error is thrown at runtime. This is not as good as a compile time error, but it's much better than a low-level data race.
>
> 2. Restrict the contents of thread-sealed containers to types that can be safely shared/moved between threads: Primitives, shared and immutable data, and other thread-sealed containers.
>
> 3. Define a std.concurrency.moveSend() function. This function takes a thread-sealed container by reference, decrements the sending thread's reference count, enforces that it's zero, then sends the container as a message to the receiving thread. The receiving thread increments its reference count of the container.

Sounds like a good idea. Given the restriction, instead of one reference count per thread you only need a pair threadid + refcount. Then the refcounts of all other threads are implicitly zero.

Andrei
Mar 27 2011
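A rough D sketch of the threadid + refcount pair suggested above. The class name `PairSealedArray` and its methods are hypothetical, and the owning thread is represented by a `core.thread.Thread` reference instead of a numeric id. Because at most one thread may have a nonzero count, a single `(owner, refs)` pair replaces the per-thread table, and ownership can only change while the count is zero.

```d
import core.thread : Thread;

/// Hypothetical thread-sealed array that stores one (owner, count) pair.
final class PairSealedArray
{
    private int[] data;
    private Thread owner; // the only thread whose count may be nonzero
    private size_t refs;  // that thread's count; all others are implicitly zero

    void retain()
    {
        synchronized (this)
        {
            auto me = Thread.getThis();
            if (refs == 0)
                owner = me; // unowned: the caller becomes the owner
            else if (owner !is me)
                throw new Error("container is referenced by another thread");
            ++refs;
        }
    }

    void release()
    {
        synchronized (this)
        {
            if (refs == 0 || owner !is Thread.getThis())
                throw new Error("release by a thread holding no reference");
            --refs;
        }
    }

    void put(int x) { data ~= x; }
    size_t length() const { return data.length; }
}

/// Runs retain() on a fresh thread and reports whether it was accepted.
bool tryRetainOnNewThread(PairSealedArray c)
{
    bool ok = true;
    auto t = new Thread({
        try { c.retain(); }
        catch (Error e) { ok = false; }
    });
    t.start();
    t.join();
    return ok;
}

unittest
{
    auto c = new PairSealedArray;
    c.retain();
    assert(!tryRetainOnNewThread(c)); // the first thread still holds a reference
    c.release();
    assert(tryRetainOnNewThread(c));  // count fell to zero, so ownership may change
}
```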