
digitalmars.D - What To Do About Shared?

reply dsimcha <dsimcha yahoo.com> writes:
Some discussions about std.parallelism have prompted an examination of 
how far D's guarantees against low level data races should extend and 
how safety and practicality should be balanced.  On the one hand, 
coarse-grained multithreading with hard guarantees against low-level 
races is a great thing if it's flexible enough to do what you need it to.

On the other hand, not everything is implementable (at least not 
efficiently or easily) in such a paradigm.  D is a systems language and 
should not force people who want unchecked shared state multithreading 
to either do without it or fight the type system every inch of the way 
(by casting all over the place) to get it.

I've come up with the following proposal, which is implicitly used in 
the design of std.parallelism, but which I think should be made explicit.

1.  All @safe code must be statically checkable and provably free from 
low level data races provided that all @trusted code it calls is 
correctly implemented.  It may not cast away shared, etc.

2.  All @trusted code must guarantee to its clients that calling such 
code from @safe code will not result in low level data races.

3.  All modules that deal with multithreading must document one of the following:

     a.  They will use the type system to guarantee that low-level data 
races can't happen.

     b.  They will share state freely.

     c.  They will mostly share state freely, but will make guarantees 
about some specific subset.

std.concurrency would be in category a.  core.thread would be in 
category b.  std.parallelism would be in category c.

All code that only uses modules from category a, does not cast away 
shared and does not use __gshared variables can be guaranteed free from 
low level data races even if it is not @safe.

If you want hard guarantees about low level data races, these can be 
achieved with a very small amount of discipline:  Only use modules from 
category a or only use @safe code.  This is easily checkable.  Using 
modules from category b or modules from category c in non-@safe code 
should be considered equivalent to casting away shared:  You may do so, 
but you're on your own when it comes to thread safety and you may not do 
it in @safe code.
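
For concreteness, here is a minimal sketch (the message types are chosen 
purely for illustration) of what the category a discipline looks like 
with std.concurrency:  all cross-thread communication goes through typed 
messages, so no unshared reference ever leaks to another thread.

import std.concurrency, std.stdio;

// Category a style:  the only way data crosses threads is by message.
void worker() {
    auto n = receiveOnly!int();     // typed, copied message
    send(ownerTid, n * 2);
}

void main() {
    auto tid = spawn(&worker);
    send(tid, 21);
    writeln(receiveOnly!int());     // prints 42
}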
Mar 22 2011
next sibling parent reply Graham St Jack <Graham.StJack internode.on.net> writes:
On 23/03/11 11:08, dsimcha wrote:
 Some discussions about std.parallelism have prompted an examination of 
 how far D's guarantees against low level data races should extend and 
 how safety and practicality should be balanced.  On the one hand, 
 coarse-grained multithreading with hard guarantees against low-level 
 races is a great thing if it's flexible enough to do what you need it to.

 On the other hand, not everything is implementable (at least not 
 efficiently or easily) in such a paradigm.  D is a systems language 
 and should not force people who want unchecked shared state 
 multithreading to either do without it or fight the type system every 
 inch of the way (by casting all over the place) to get it.

 I've come up with the following proposal, which is implicitly used in 
 the design of std.parallelism, but which I think should be made explicit.

 1.  All @safe code must be statically checkable and provably free from 
 low level data races provided that all @trusted code it calls is 
 correctly implemented.  It may not cast away shared, etc.

 2.  All @trusted code must guarantee to its clients that calling such 
 code from @safe code will not result in low level data races.

 3.  All modules that deal with multithreading must document one of the following:

     a.  They will use the type system to guarantee that low-level data 
 races can't happen.

     b.  They will share state freely.

     c.  They will mostly share state freely, but will make guarantees 
 about some specific subset.

 std.concurrency would be in category a.  core.thread would be in 
 category b.  std.parallelism would be in category c.

 All code that only uses modules from category a, does not cast away 
 shared and does not use __gshared variables can be guaranteed free 
 from low level data races even if it is not @safe.

 If you want hard guarantees about low level data races, these can be 
 achieved with a very small amount of discipline:  Only use modules 
 from category a or only use @safe code.  This is easily checkable.  
 Using modules from category b or modules from category c in non-@safe 
 code should be considered equivalent to casting away shared:  You may 
 do so, but you're on your own when it comes to thread safety and you 
 may not do it in @safe code.
Sounds good in principle.

I assume that category a code could be @trusted, and that category b and 
c must not be @trusted.

I agree that trying to use the language to discriminate between category 
b and c would be too tricky, especially when you consider the subtleties 
of what is shared and what is not.  Documentation is the only viable 
option there.

Are these static checks feasible, and if so, what are the chances of 
getting them into the language anytime soon?

-- 
Graham St Jack
Mar 22 2011
parent dsimcha <dsimcha yahoo.com> writes:
On 3/23/2011 12:36 AM, Graham St Jack wrote:
 Sounds good in principle.

 I assume that category a code could be @trusted, and that category b and
 c must not be @trusted.
Right, except for the subset of category c code that does make guarantees.
 I agree that trying to use the language to discriminate between category
 b and c would be too tricky, especially when you consider the subtleties
 of what is shared and what is not. Documentation is the only viable
 option there.

 Are these static checks feasible, and if so, what are the chances of
 getting them into the language anytime soon?
The static checks I'm referring to are already in the language, but they 
depend on the threading library in use enforcing them.  std.concurrency 
does, making it category a.  core.thread does not, making it category b.  
std.parallelism does only for a few @trusted and @safe functions, making 
it category c.
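
As a rough sketch of the kind of check involved (the names here are made 
up for illustration):  @safe code cannot strip the shared qualifier, so 
introducing a low-level race requires either leaving @safe or going 
through somebody's @trusted code.

shared int hits;

void observe() @safe {
    // Rejected by the compiler:  casting away shared is not allowed in
    // @safe code.
    // auto raw = cast(int*) &hits;
}

void observeUnchecked() @system {
    auto raw = cast(int*) &hits;    // allowed, but race freedom is now on you
    *raw += 1;
}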
Mar 22 2011
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
dsimcha Wrote:

 Some discussions about std.parallelism have prompted an examination of 
 how far D's guarantees against low level data races should extend and 
 how safety and practicality should be balanced.
I didn't follow the review of std.parallelism; can you give some specific 
examples?

Users of languages look to standard libraries as a model for how to write 
their own apps.  I don't like your proposal and think std.parallelism 
should use shared properly.  I'd like to understand better what your 
issues with shared were.  I've done a decent amount of shared-correct 
code, so I'm pretty sure it's usable.  In fact, the only really nasty bug 
I had could have been caught if std.thread had been shared-correct...
Mar 23 2011
parent reply dsimcha <dsimcha yahoo.com> writes:
On 3/23/2011 9:09 AM, Jason House wrote:
 dsimcha Wrote:

 Some discussions about std.parallelism have prompted an examination of
 how far D's guarantees against low level data races should extend and
 how safety and practicality should be balanced.
I didn't follow the review of std.parallelism, can you give some specific examples? Users of languages look to standard libraries as a model for how to write their own apps. I don't like your proposal and think std.parallelism should use shared properly. I'd like to understand better what your issues with shared were. I've done a decent amount of shared-correct code, so I'm pretty sure it's usable. In fact, the only really nasty bug I had could have been caught if std.thread had been shared-correct...
I have already decided that, unless shared is drastically improved in 
ways I don't foresee (I'm not even sure exactly how, this would need to 
be discussed), I will not be making std.parallelism shared correct.  I 
put some serious thought into this issue before making the proposal and 
concluded that, for even moderately fine-grained parallelism, shared 
would get in the way more than it helps.  shared has its place if you're 
primarily using message passing and using shared state in only a very 
limited number of places, but IMHO that's the only way it helps more than 
it gets in the way.

If it comes down to a choice between the two (I hope I don't have to make 
this choice), I'd rather have std.parallelism be a useful 3rd party lib 
than an unusable bondage-and-discipline Phobos module.  If someone else 
wants to fork it and try to make it shared correct, that's their 
prerogative.

Remember, D is a **SYSTEMS LANGUAGE**.  There is no excuse for it going 
out of its way to make certain paradigms as difficult as possible, or not 
supporting them, just because they're dangerous.  If that's the direction 
we're going in, why don't we rip pointer arithmetic, inline ASM, unsafe 
casts, manual memory management, etc. out of the language and call 
ourselves Java++?  IMHO making shared-correctness mandatory unless you 
fight the type system every inch of the way would be going in that 
direction with regard to concurrency.

core.thread is a low-level druntime module.  If you wanted shared-correct 
multithreading, you should have been using std.concurrency.  If 
std.concurrency wasn't enough to get the job done, then that's proof that 
shared is only useful if you're mostly using message passing and 
occasionally shared state.

Some examples:

// This is my example for parallel foreach.
auto logs = new double[1_000_000];
foreach(i, ref elem; parallel(logs)) {
    elem = log(i + 1);
}

Here you have multiple threads writing to the same array in parallel.  
They're guaranteed never to write to the same element, though, making it 
safe except on some obscure/ancient hardware that we don't care about 
(e.g. old DEC Alphas) that can't write to memory at byte granularity.

Yes, I'm aware of the false sharing issue with writing to adjacent 
addresses from different threads.  This is not a problem for this example 
because these falsely shared writes will be such a small portion of all 
writes that the performance impact is negligible.  Making all updates 
atomic/fenced/whatever shared does would be a huge performance hit for no 
benefit, and would make the code more verbose and type heavy.

// This is a parallel quick sort.  Again, it writes to a data
// structure from multiple threads, but in a way that guarantees no
// element is "owned" by two threads at the same time.
void parallelSort(T)(T[] data) {
    // Sort small subarrays serially.
    if(data.length < 100) {
        std.algorithm.sort(data);
        return;
    }

    // Partition the array.
    swap(data[$ / 2], data[$ - 1]);
    auto pivot = data[$ - 1];

    bool lessThanPivot(T elem) { return elem < pivot; }

    auto greaterEqual = partition!lessThanPivot(data[0..$ - 1]);
    swap(data[$ - greaterEqual.length - 1], data[$ - 1]);

    auto less = data[0..$ - greaterEqual.length - 1];
    greaterEqual = data[$ - greaterEqual.length..$];

    // Execute both recursion branches in parallel.
    auto recurseTask = task!(parallelSort)(greaterEqual);
    taskPool.put(recurseTask);
    parallelSort(less);
    recurseTask.yieldForce();
}

// Read in a file in a background thread and return the results in a
// mutable, non-shared array that the caller can then process further.
import std.file, std.parallelism;

void main() {
    // Create and submit a Task object for reading foo.txt.
    auto file1Task = task(&read, "foo.txt");
    file1Task.executeInNewThread();

    // Read bar.txt in parallel.
    auto file2Data = read("bar.txt");

    // Get the results of reading foo.txt.
    auto file1Data = file1Task.yieldForce();
}
Mar 23 2011
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from dsimcha (dsimcha yahoo.com)'s article
 On 3/23/2011 9:09 AM, Jason House wrote:
 dsimcha Wrote:

 Some discussions about std.parallelism have prompted an examination of
 how far D's guarantees against low level data races should extend and
 how safety and practicality should be balanced.
I didn't follow the review of std.parallelism, can you give some specific
examples?
 Users of languages look to standard libraries as a model for how to write
their own apps. I don't like your proposal and think std.parallelism should use shared properly. I'd like to understand better what your issues with shared were. I've done a decent amount of shared-correct code, so I'm pretty sure it's usable. In fact, the only really nasty bug I had could have been caught if std.thread had been shared-correct...
 I have already decided that, unless shared is drastically improved in
 ways I don't foresee (I'm not even sure exactly how, this would need to
 be discussed), I will not be making std.parallelism shared correct.
One small clarification/relaxation of this position:  I will seriously 
consider making individual artifacts of std.parallelism shared correct if 
this can demonstrably be done without affecting efficiency, flexibility 
or ease of use.  For example, certain uses of future/promise parallelism 
via task() are shared-correct, marked as @safe/@trusted and documented as 
such.

However, I will not make shared correctness a higher priority than 
efficiency, ease of use or flexibility, and I will not cripple or remove 
artifacts that cannot reasonably be made shared correct.
Mar 23 2011
parent reply Don <nospam nospam.com> writes:
dsimcha wrote:
 == Quote from dsimcha (dsimcha yahoo.com)'s article
 On 3/23/2011 9:09 AM, Jason House wrote:
 dsimcha Wrote:

 Some discussions about std.parallelism have prompted an examination of
 how far D's guarantees against low level data races should extend and
 how safety and practicality should be balanced.
I didn't follow the review of std.parallelism, can you give some specific
examples?
 Users of languages look to standard libraries as a model for how to write
their own apps. I don't like your proposal and think std.parallelism should use shared properly. I'd like to understand better what your issues with shared were. I've done a decent amount of shared-correct code, so I'm pretty sure it's usable. In fact, the only really nasty bug I had could have been caught if std.thread had been shared-correct...
 I have already decided that, unless shared is drastically improved in
 ways I don't foresee (I'm not even sure exactly how, this would need to
 be discussed), I will not be making std.parallelism shared correct.
One small clarification/relaxation of this position: I will seriously consider making individual artifacts of std.parallelism shared correct if this can demonstrably be done without affecting efficiency, flexibility or ease of use. For example, certain uses of future/promise parallelism via task() are shared-correct, marked as @safe/@trusted and documented as such. However, I will not make shared correctness a higher priority than efficiency, ease of use or flexibility and I will not cripple or remove artifacts that cannot reasonably be made shared correct.
Seems to me that you're making use of some primitive that I'll call a 
'DivisibleArray' -- an array that can be sliced up (into other 
DivisibleArrays), and different DivisibleArrays can be sent to different 
threads.  You can extract a normal array slice from a DivisibleArray, but 
cannot send that slice to other threads: only DivisibleArrays can do 
that.

In debug mode, a DivisibleArray could keep track of how it has been 
sliced, and disallow overlapping slices.  The DivisibleArray could even 
ensure that all slices lie on word/paragraph boundaries, thus dealing 
with word tearing.
Mar 23 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Don (nospam nospam.com)'s article
 Seems to me that you're making use of some primitive that I'll call a
 'DivisibleArray' -- an array that can be sliced up (into other
 DivisibleArrays), and different DivisibleArrays can be sent to different
 threads. You can extract a normal array slice from a DivisibleArray, but
 cannot send that slice to other threads: only DivisibleArrays can do that.
 In debug mode, a DivisibleArray could keep track of how it has been
 sliced, and disallow overlapping slices. The DivisibleArray could even
 ensure that all slices lie on word/paragraph boundaries, thus dealing
 with word tearing.
I'm not sure. I assume that the implementation would rely on a shared array, plus casting away shared under the hood when you extract a normal array. In this case, you could still have a DivisibleArray aliased to an unshared slice into this array. This is not safe IIUC. If I'm misunderstanding something, please correct me.
Mar 23 2011
parent Don <nospam nospam.com> writes:
dsimcha wrote:
 == Quote from Don (nospam nospam.com)'s article
 Seems to me that you're making use of some primitive that I'll call a
 'DivisibleArray' -- an array that can be sliced up (into other
 DivisibleArrays), and different DivisibleArrays can be sent to different
 threads. You can extract a normal array slice from a DivisibleArray, but
 cannot send that slice to other threads: only DivisibleArrays can do that.
 In debug mode, a DivisibleArray could keep track of how it has been
 sliced, and disallow overlapping slices. The DivisibleArray could even
 ensure that all slices lie on word/paragraph boundaries, thus dealing
 with word tearing.
I'm not sure. I assume that the implementation would rely on a shared array, plus casting away shared under the hood when you extract a normal array. In this case, you could still have a DivisibleArray aliased to an unshared slice into this array. This is not safe IIUC.
No.  I haven't said anything about the implementation of DivisibleArray 
(or even that it is implementable in the language at the present time).  
Haven't thought about it that much.  It's more a thought experiment, 
trying to imagine a magical type which might solve the big issues.  
Stream of consciousness below...

It wouldn't be a shared array.  In fact, it would explicitly NOT be 
shared (if it's shared, some other thread somewhere else could be 
clobbering it while the operation's happening!).  What it would be, would 
be a (non-shared) array which guarantees that no overlapping slices of 
that type exist.  It would hand out slices which are guaranteed to be 
unique.

So in practice, it'd need to be a reference type, so that it remained 
valid when copied (preventing you from copying it, then slicing in two 
different places).  I guess it would need to construct the array itself, 
since that's the only way it could guarantee uniqueness.  That's an 
inefficiency.  Data could be copied into it and out of it, though.

One problem is that once it's given out a non-const slice for the data, 
it can't safely be split.  It could allow slice and index assignment 
itself, though.  I think those are the main implementation challenges.
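
Very roughly, the heart of the idea might look something like the sketch 
below (all names are hypothetical, and none of the hard questions above 
are addressed -- it only shows the "hand out non-overlapping slices" 
part):

struct DivisibleArray(T) {
    private T[] data;
    private size_t given;   // everything before this index is handed out

    this(size_t length) { data = new T[length]; }

    // Hand out the next n elements as an ordinary slice.  Because the
    // cursor only moves forward, two calls can never return overlapping
    // memory, so each slice can safely be given to a different thread.
    T[] take(size_t n) {
        assert(given + n <= data.length, "not enough elements left");
        auto slice = data[given .. given + n];
        given += n;
        return slice;
    }

    // How many elements remain to be handed out.
    size_t remaining() const { return data.length - given; }
}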
Mar 23 2011
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/23/11 10:22 AM, Don wrote:
 dsimcha wrote:
 == Quote from dsimcha (dsimcha yahoo.com)'s article
 On 3/23/2011 9:09 AM, Jason House wrote:
 dsimcha Wrote:

 Some discussions about std.parallelism have prompted an examination of
 how far D's guarantees against low level data races should extend and
 how safety and practicality should be balanced.
I didn't follow the review of std.parallelism, can you give some specific
examples?
 Users of languages look to standard libraries as a model for how to
 write
their own apps. I don't like your proposal and think std.parallelism should use shared properly. I'd like to understand better what your issues with shared were. I've done a decent amount of shared-correct code, so I'm pretty sure it's usable. In fact, the only really nasty bug I had could have been caught if std.thread had been shared-correct...
 I have already decided that, unless shared is drastically improved in
 ways I don't foresee (I'm not even sure exactly how, this would need to
 be discussed), I will not be making std.parallelism shared correct.
One small clarification/relaxation of this position: I will seriously consider making individual artifacts of std.parallelism shared correct if this can demonstrably be done without affecting efficiency, flexibility or ease of use. For example, certain uses of future/promise parallelism via task() are shared-correct, marked as @safe/@trusted and documented as such. However, I will not make shared correctness a higher priority than efficiency, ease of use or flexibility and I will not cripple or remove artifacts that cannot reasonably be made shared correct.
Seems to me that you're making use of some primitive that I'll call a 'DivisibleArray' -- an array that can be sliced up (into other DivisibleArrays), and different DivisibleArrays can be sent to different threads. You can extract a normal array slice from a DivisibleArray, but cannot send that slice to other threads: only DivisibleArrays can do that. In debug mode, a DivisibleArray could keep track of how it has been sliced, and disallow overlapping slices. The DivisibleArray could even ensure that all slices lie on word/paragraph boundaries, thus dealing with word tearing.
That's a great idea!

I agree with the sentiment that std.parallelism is not necessarily 
supposed to work with the shared qualifier, which is intended for general 
inter-thread communication.  std.parallelism is using implicit sharing: 
as Don described, most often data in std.parallelism is factually not 
shared (different threads work on different portions of the data).

Andrei
Mar 23 2011