digitalmars.D - Proposal: Thread-sealed containers

dsimcha (21/21) Mar 26 2011 I'm thinking about ways to make std.concurrency's model more flexible

Jason House (5/27) Mar 26 2011 It might be possible to simply store the owning thread id. Since the thr...

dsimcha (18/36) Mar 27 2011 I was using Andrei's definition that had been floated around here

Andrei Alexandrescu (5/26) Mar 27 2011 Sounds like a good idea. Given the restriction, instead of one reference...

dsimcha <dsimcha yahoo.com> writes:

I'm thinking about ways to make std.concurrency's model more flexible 
without compromising safety.  It's sometimes useful to cheaply _move_ 
(as opposed to share or copy) data between threads.  I wonder if we 
could co-opt sealed containers here:

1.  Define basic data structures like arrays, hash tables and trees that 
are otherwise identical to sealed containers, but maintain a 
_thread-specific_ reference count.  This can be naively implemented as a 
hash table with Thread keys and size_t values, but there's probably a 
better way.  Provisionally call these "thread-sealed containers".

A thread-sealed container has the invariant that no two thread-specific 
reference counts may be simultaneously nonzero.  If this becomes untrue, 
an Error is thrown at runtime.  This is not as good as a compile time 
error, but it's much better than a low-level data race.

2.  Restrict the contents of thread-sealed containers to types that can 
be safely shared/moved between threads:  Primitives, shared and 
immutable data, and other thread-sealed containers.

3.  Define a std.concurrency.moveSend() function.  This function takes a 
thread-sealed container by reference, decrements the sending thread's 
reference count, enforces that it's zero, then sends the container as a 
message to the receiving thread.  The receiving thread increments its 
reference count of the container.

Mar 26 2011

Jason House <jason.james.house gmail.com> writes:

dsimcha Wrote:

 I'm thinking about ways to make std.concurrency's model more flexible 
 without compromising safety.  It's sometimes useful to cheaply _move_ 
 (as opposed to share or copy) data between threads.  I wonder if we 
 could co-opt sealed containers here:

What definition of sealed are you using? A quick web search mostly brought up

Andrei might have used the term to mean no aliases to the data held by the
container?

 1.  Define basic data structures like arrays, hash tables and trees that 
 are otherwise identical to sealed containers, but maintain a 
 _thread-specific_ reference count.  This can be naively implemented as a 
 hash table with Thread keys and size_t values, but there's probably a 
 better way.  Provisionally call these "thread-sealed containers".

It might be possible to simply store the owning thread id. Since the thread
that sent the container will still have a reference to it, you may be forced to
do some extra runtime checks on every method call anyway...

 A thread-sealed container has the invariant that no two thread-specific 
 reference counts may be simultaneously nonzero.  If this becomes untrue, 
 an Error is thrown at runtime.  This is not as good as a compile time 
 error, but it's much better than a low-level data race.
 
 2.  Restrict the contents of thread-sealed containers to types that can 
 be safely shared/moved between threads:  Primitives, shared and 
 immutable data, and other thread-sealed containers.

Some kind of equivalent to assumeUnique may be needed here?

 3.  Define a std.concurrency.moveSend() function.  This function takes a 
 thread-sealed container by reference, decrements the sending thread's 
 reference count, enforces that it's zero, then sends the container as a 
 message to the receiving thread.  The receiving thread increments its 
 reference count of the container.

I like that you're thinking about this kind of stuff. I am over tired right now
and haven't really thought this through, but it feels like this could lead to a
major safety upgrade for std.concurrency.

Mar 26 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/27/2011 2:47 AM, Jason House wrote:
 dsimcha Wrote:

 I'm thinking about ways to make std.concurrency's model more flexible
 without compromising safety.  It's sometimes useful to cheaply _move_
 (as opposed to share or copy) data between threads.  I wonder if we
 could co-opt sealed containers here:

 What definition of sealed are you using? A quick web search mostly brought up

Andrei might have used the term to mean no aliases to the data held by the
container?

I was using Andrei's definition that had been floated around here 
previously.  For those not already familiar, a sealed container is one 
where references/pointers to the elements cannot be obtained.  This 
allows things like reference-counted memory management to be done safely.

 1.  Define basic data structures like arrays, hash tables and trees that
 are otherwise identical to sealed containers, but maintain a
 _thread-specific_ reference count.  This can be naively implemented as a
 hash table with Thread keys and size_t values, but there's probably a
 better way.  Provisionally call these "thread-sealed containers".

 It might be possible to simply store the owning thread id. Since the thread
that sent the container will still have a reference to it, you may be forced to
do some extra runtime checks on every method call anyway...

 3.  Define a std.concurrency.moveSend() function.  This function takes a
 thread-sealed container by reference, decrements the sending thread's
 reference count, enforces that it's zero, then sends the container as a
 message to the receiving thread.  The receiving thread increments its
 reference count of the container.

 I like that you're thinking about this kind of stuff. I am over tired right
now and haven't really thought this through, but it feels like this could lead
to a major safety upgrade for std.concurrency.

Now that I think about it some more, we don't need explicit 
thread-specific reference counts.  All we need to prove is that we're 
dealing with a unique reference, i.e. that the reference count is 1. 
moveSend() then clears the sending thread's view of the container so 
that the sending thread no longer has access to it and sends it as a 
message to the receiving thread.  The receiving thread is the new owner.

The bottom line concept is that uniqueness is useful but difficult to 
prove statically.  (To refresh people's memory uniqueness in this 
context means that a piece of data only has one pointer/reference 
pointing to it.)  Static solutions will always be conservative and risk 
making the type system absurdly complex.  Therefore, enforce uniqueness 
at runtime when it can be determined exactly via reference counting. 
The same principle may solve the problem of immutable data creation.

Mar 27 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/26/2011 05:26 PM, dsimcha wrote:
 I'm thinking about ways to make std.concurrency's model more flexible
 without compromising safety. It's sometimes useful to cheaply _move_ (as
 opposed to share or copy) data between threads. I wonder if we could
 co-opt sealed containers here:

 1. Define basic data structures like arrays, hash tables and trees that
 are otherwise identical to sealed containers, but maintain a
 _thread-specific_ reference count. This can be naively implemented as a
 hash table with Thread keys and size_t values, but there's probably a
 better way. Provisionally call these "thread-sealed containers".

 A thread-sealed container has the invariant that no two thread-specific
 reference counts may be simultaneously nonzero. If this becomes untrue,
 an Error is thrown at runtime. This is not as good as a compile time
 error, but it's much better than a low-level data race.

 2. Restrict the contents of thread-sealed containers to types that can
 be safely shared/moved between threads: Primitives, shared and immutable
 data, and other thread-sealed containers.

 3. Define a std.concurrency.moveSend() function. This function takes a
 thread-sealed container by reference, decrements the sending thread's
 reference count, enforces that it's zero, then sends the container as a
 message to the receiving thread. The receiving thread increments its
 reference count of the container.

Sounds like a good idea. Given the restriction, instead of one reference 
count per thread you only need a pair threadid + refcount. Then the 
refcounts of all other threads are implicitly zero.

Andrei

Mar 27 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Proposal: Thread-sealed containers