digitalmars.D - std.parallelism: Final review

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

David Simcha has made a proposal for an std.parallelism module to be 
included in Phobos.  We now begin the formal review process.

The code repository and documentation can be found here:

  https://github.com/dsimcha/std.parallelism/wiki
  http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html

Please review the code and the API, and post comments in this thread 
within the next three weeks.

On 25 March I will start a new thread for voting over the inclusion of 
the module.  Voting will last one week, until 1 April.  Votes cast before 
or after this will not be counted.

David, do you have any comments?

-Lars

Mar 04 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 04 Mar 2011 21:05:39 +0000, Lars T. Kyllingstad wrote:

 David Simcha has made a proposal for an std.parallelism module to be
 included in Phobos.  We now begin the formal review process.
 
 The code repository and documentation can be found here:
 
   https://github.com/dsimcha/std.parallelism/wiki
   http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html


I would like to remind everyone that there is now only one week left of 
the std.parallelism review period.  If you have any comments, please 
speak now, so that David has time to make the changes.

I realise that the module has been through several review cycles already, 
and that it is already in active use (by me, among others), so there 
probably won't be any big issues.  However, if it gets voted into Phobos, 
that's it -- it will be an official part of the D standard library.  So 
start nitpicking, folks!

The voting will start next Friday, 25 March, and last for a week, until 1 
April.

-Lars

Mar 18 2011

dsimcha <dsimcha yahoo.com> writes:

It's kinda interesting--I don't know at all where this lib stands.  The
deafening silence for the past week makes me think one of two things is true:

1.  std.parallelism solves a problem that's too niche for 90% of D users, or

2.  It's already been through so many rounds of discussion in various places
(informally with friends, then on the Phobos list, then on this NG) that there
really is nothing left to nitpick.

I have no idea which of these is true.

Mar 18 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/18/11 3:55 PM, dsimcha wrote:
 It's kinda interesting--I don't know at all where this lib stands.  The
 deafening silence for the past week makes me think one of two things is true:

 1.  std.parallelism solves a problem that's too niche for 90% of D users, or

 2.  It's already been through so many rounds of discussion in various places
 (informally with friends, then on the Phobos list, then on this NG) that there
 really is nothing left to nitpick.

 I have no idea which of these is true.

Probably a weighted average of the two. If I were to venture a guess I'd 
ascribe more weight to 1. This is partly because I'm also receiving 
relatively little feedback on the concurrency chapter in TDPL. Also the 
general pattern on many such discussion groups is that the amount of 
traffic on a given topic is inversely correlated with its complexity.

FWIW a review is on my todo list.

Anyway, I'm glad we have gotten the terminology (concurrency and 
parallelism) sorted out so nicely. See 
http://www.reddit.com/r/programming/comments/g6k0p/parallelism_is_not_concurrency/


Andrei

Mar 18 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-18 17:12:07 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 3/18/11 3:55 PM, dsimcha wrote:
 It's kinda interesting--I don't know at all where this lib stands.  The
 deafening silence for the past week makes me think one of two things is true:
 
 1.  std.parallelism solves a problem that's too niche for 90% of D users, or
 
 2.  It's already been through so many rounds of discussion in various places
 (informally with friends, then on the Phobos list, then on this NG) that there
 really is nothing left to nitpick.
 
 I have no idea which of these is true.

 
 Probably a weighted average of the two. If I were to venture a guess 
 I'd ascribe more weight to 1. This is partly because I'm also receiving 
 relatively little feedback on the concurrency chapter in TDPL. Also the 
 general pattern on many such discussion groups is that the amount of 
 traffic on a given topic is inversely correlated with its complexity.

One reason might also be that not many people are invested in D for 
such things right now. It's hard to review such code and make useful 
comments without actually testing it on a problem that would benefit 
from its use.

If I were writing my current application in D, I'd 
certainly give it a try. But the thing I have that would benefit from 
something like this is in Objective-C (it's a Cocoa program I'm 
writing). I'll eventually get D to interact well with Apple's 
Objective-C APIs, but in the meantime all I'm writing in D is some 
simple web stuff which doesn't require multithreading at all.

In my application, what I'm doing is starting hundreds of tasks from 
the main thread, and once those tasks are done they generally send back 
a message to the main thread through Cocoa's event dispatching 
mechanism. From a quick glance at the documentation, std.parallelism 
offers what I'd need if I were to implement a similar application in D. 
The only thing I don't see is a way to prioritize tasks: some of my tasks 
need a more immediate execution than others in order to keep the 
application responsive.

One interesting bit: what I'm doing in those tasks is mostly I/O on the 
hard drive combined with some parsing. I find a task queue useful for 
managing all the work. In my case it's not really about maximizing the 
utilization of a multicore processor but about keeping the work off 
the main thread so the application stays responsive. Maximizing 
speed is still a secondary objective, but given that most of the work is 
I/O-bound, having multiple cores available doesn't help much.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 18 2011

dsimcha <dsimcha yahoo.com> writes:

I think your use case is both beyond the scope of std.parallelism and 
better handled by std.concurrency.  std.parallelism is mostly meant to 
handle the pure multicore parallelism use case.  It's not that it 
**can't** handle other use cases, but that's not what it's tuned for.

As far as prioritization, it wouldn't be hard to implement 
prioritization of when a task starts (i.e. have a high- and low-priority 
queue).  However, the whole point of TaskPool is to avoid starting a new 
thread for each task.  Threads are recycled for efficiency.  This 
prevents changing the priority of things in the OS scheduler.  I also 
don't see how to generalize prioritization to map, reduce, parallel 
foreach, etc. w/o making the API much more complex.

In addition, std.parallelism guarantees that tasks will be started in 
the order that they're submitted, except that if the results are needed 
immediately and the task hasn't been started yet, it will be pulled out 
of the middle of the queue and executed immediately.  One way to get the 
prioritization you need is to just submit the tasks in order of 
priority, assuming you're submitting them all from the same place.

One last thing:  As far as I/O goes, asyncBuf may be useful.  This 
allows you to pipeline reading of a file and higher level processing. 
Example:

// Read the lines of a file into memory in parallel with processing
// them.
import std.algorithm, std.array, std.conv, std.parallelism, std.stdio;

void main() {
     // byLine reuses its buffer, so copy each line before buffering it.
     auto lines = map!"a.idup"(File("foo.txt").byLine());
     auto pipelined = taskPool.asyncBuf(lines);

     foreach(line; pipelined) {
         auto ls = line.split("\t");        // std.array.split
         auto nums = to!(double[])(ls);     // std.conv.to
     }
}

Mar 18 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-18 22:27:14 -0400, dsimcha <dsimcha yahoo.com> said:

 I think your use case is both beyond the scope of std.parallelism and 
 better handled by std.concurrency.  std.parallelism is mostly meant to 
 handle the pure multicore parallelism use case.  It's not that it 
 **can't** handle other use cases, but that's not what it's tuned for.

I know. But if this makes its way into the standard library, perhaps it 
should aim at reaching a slightly wider audience? Especially since it 
lacks so little to become more general purpose...


 As far as prioritization, it wouldn't be hard to implement 
 prioritization of when a task starts (i.e. have a high- and 
 low-priority queue).  However, the whole point of TaskPool is to avoid 
 starting a new thread for each task.  Threads are recycled for 
 efficiency.  This prevents changing the priority of things in the OS 
 scheduler.  I also don't see how to generalize prioritization to map, 
 reduce, parallel foreach, etc. w/o making the API much more complex.

I was not talking about thread priority, but ordering priority (which 
task gets chosen first). I don't really care about thread priority in 
my application, and I understand that per-task thread priority doesn't 
make much sense. If I needed per-task thread priority I'd simply make 
pools for the various thread priorities and put tasks in the right 
pools.

That said, perhaps I could do exactly that: create two or three pools 
with different thread priorities, put tasks into the right pool and let 
the OS sort out the scheduling. But then the question becomes: how do I 
choose the thread priority of a task pool? It doesn't seem possible from 
the documentation. Perhaps TaskPool's constructor should have a 
parameter for that.


 In addition, std.parallelism guarantees that tasks will be started in 
 the order that they're submitted, except that if the results are needed 
 immediately and the task hasn't been started yet, it will be pulled out 
 of the middle of the queue and executed immediately.  One way to get 
 the prioritization you need is to just submit the tasks in order of 
 priority, assuming you're submitting them all from the same place.

Most of my tasks are background tasks that just need to be done 
eventually while others are user-requested tasks which can be requested 
at any time in the main thread. Issuing them serially is not really an 
option.


 One last thing:  As far as I/O goes, AsyncBuf may be useful.  This 
 allows you to pipeline reading of a file and higher level processing. 
 Example:
 
 // Read the lines of a file into memory in parallel with processing
 // them.
 import std.stdio, std.parallelism, std.algorithm;
 
 void main() {
      auto lines = map!"a.idup"(File("foo.txt").byLine());
      auto pipelined = taskPool.asyncBuf(lines);
 
      foreach(line; pipelined) {
          auto ls = line.split("\t");
          auto nums = to!(double[])(ls);
      }
 }

Looks nice, but doesn't really work for what I'm doing. Currently I 
have one task per file, each task reading a relatively small file and 
then parsing its content.

 - - -

Another remark: in the documentation for the TaskPool constructor, it says:

"Default constructor that initializes a TaskPool with one worker 
thread for each CPU reported available by the OS, minus 1 because the 
thread that initialized the pool will also do work."

This "minus 1" thing doesn't really work for me. It certainly makes 
sense for a parallel foreach use case -- whenever the current thread 
would block until the work is done you can use that thread to work too 
-- but in my use case I delegate all the work to other threads because 
my main thread isn't a dedicated working thread and it must not block. 
It'd be nice to have a boolean parameter for the constructor to choose 
whether the main thread will work or not (and whether it should do minus 1 
or not).

For the global taskPool, I guess I would just have to write 
"defaultPoolThreads = defaultPoolThreads+1" at the start of the program 
if the main thread isn't going to be working.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 19 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/19/2011 9:37 AM, Michel Fortin wrote:
 On 2011-03-18 22:27:14 -0400, dsimcha <dsimcha yahoo.com> said:

 I think your use case is both beyond the scope of std.parallelism and
 better handled by std.concurrency. std.parallelism is mostly meant to
 handle the pure multicore parallelism use case. It's not that it
 **can't** handle other use cases, but that's not what it's tuned for.

 I know. But if this makes its way into the standard library, perhaps it
 should aim at reaching a slightly wider audience? Especially since it
 lacks so little to become more general purpose...

Fair enough.  You've convinced me, since I've just recently started 
pushing std.parallelism in this direction in both my research work and 
in some of the examples I've been using, and you've given very good 
specific suggestions about **how** to expand things a little.

 As far as prioritization, it wouldn't be hard to implement
 prioritization of when a task starts (i.e. have a high- and
 low-priority queue). However, the whole point of TaskPool is to avoid
 starting a new thread for each task. Threads are recycled for
 efficiency. This prevents changing the priority of things in the OS
 scheduler. I also don't see how to generalize prioritization to map,
 reduce, parallel foreach, etc. w/o making the API much more complex.

 I was not talking about thread priority, but ordering priority (which
 task gets chosen first). I don't really care about thread priority in my
 application, and I understand that per-task thread priority doesn't make
 much sense. If I needed per-task thread priority I'd simply make pools
 for the various thread priorities and put tasks in the right pools.

 That said, perhaps I could do exactly that: create two or three pools
 with different thread priorities, put tasks into the right pool and let
 the OS sort out the scheduling. But then the question becomes: how do I
 choose the thread priority of a task pool? It doesn't seem possible from
 the documentation. Perhaps TaskPool's constructor should have a
 parameter for that.

This sounds like a good solution.  The general trend I've seen is that 
the ability to create more than one pool elegantly solves a lot of problems 
that would be a PITA from both an interface and an implementation perspective 
to solve more directly.  I've added a priority property to TaskPool that 
allows setting the OS priority of the threads in the pool.  This just 
forwards to core.thread.Thread.priority, so usage is identical.
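
A minimal sketch of the two-pool approach, assuming the priority property behaves as it does in current Phobos (a TaskPool property taking the core.thread.Thread priority constants). parseFile and the pool names are hypothetical stand-ins, and how the OS scheduler actually treats the lowered priority is platform-dependent.

```d
import core.thread : Thread;
import std.parallelism : TaskPool, task;

// Hypothetical stand-in for one unit of work (e.g. parsing one file).
int parseFile(int id)
{
    return id * 2;
}

void main()
{
    // One worker per pool keeps the sketch small.
    auto background = new TaskPool(1);
    auto urgent = new TaskPool(1);

    // Lower the OS priority of the background workers only; the urgent
    // pool keeps the default thread priority.
    background.priority = Thread.PRIORITY_MIN;

    auto slow = task!parseFile(1);
    auto fast = task!parseFile(2);
    background.put(slow);
    urgent.put(fast);

    assert(fast.yieldForce == 4);
    assert(slow.yieldForce == 2);

    // Let the workers exit once their queues are empty; otherwise the
    // non-daemon worker threads would keep the program alive.
    background.finish();
    urgent.finish();
}
```

Routing user-requested work to one pool and background work to another also gives each class of task its own queue, so an urgent task never waits behind a backlog of background tasks.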

 - - -

 Another remark: in the documentation for the TaskPool constructor, it
 says:

 "Default constructor that initializes a TaskPool with one worker thread
 for each CPU reported available by the OS, minus 1 because the thread
 that initialized the pool will also do work."

 This "minus 1" thing doesn't really work for me. It certainly makes sense
 for a parallel foreach use case -- whenever the current thread would
 block until the work is done you can use that thread to work too -- but
 in my use case I delegate all the work to other threads because my main
 thread isn't a dedicated working thread and it must not block. It'd be
 nice to have a boolean parameter for the constructor to choose whether the
 main thread will work or not (and whether it should do minus 1 or not).

 For the global taskPool, I guess I would just have to write
 "defaultPoolThreads = defaultPoolThreads+1" at the start of the program
 if the main thread isn't going to be working.

I've solved this, though in a slightly different way.  Based on 
discussions on this newsgroup I had recently added an osReportedNcpu 
variable to std.parallelism instead of using core.cpuid.  This is an 
immutable global variable that is set in a static this() statement.

Since we don't know what the API for querying stuff like this should be, 
I had made it private.  I changed it to public.  I realized that, even 
if a more full-fledged API is added at some point for this stuff, there 
should be an obvious, convenient way to get it directly from 
std.parallelism anyhow, and it would be trivial to call whatever API 
eventually evolves to set this value.  Now, if you don't like the -1 
thing, you can just do:

auto pool = new TaskPool(osReportedNcpu);

or

defaultPoolThreads = osReportedNcpu;
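
The std.parallelism that ships with current Phobos exposes a CPU count under the name totalCPUs. Assuming that is the shipped counterpart of osReportedNcpu, the two options above might be written as in this sketch:

```d
import std.parallelism : TaskPool, defaultPoolThreads, totalCPUs;

void main()
{
    // Only takes effect if it runs before the first use of the global
    // taskPool, which is created lazily.
    defaultPoolThreads = totalCPUs;

    // A private pool with one worker per reported CPU (no "minus 1").
    auto pool = new TaskPool(totalCPUs);
    scope (exit) pool.finish();

    assert(pool.size == totalCPUs);
}
```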

Mar 19 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-19 10:45:12 -0400, dsimcha <dsimcha yahoo.com> said:

 I've added a priority property to TaskPool that allows setting the OS 
 priority of the threads in the pool.  This just forwards to 
 core.thread.Thread.priority, so usage is identical.

Great.

Next to "priority" I notice the "makeDaemon" and "makeAngel" 
functions... wouldn't it make more sense to mirror the core.thread API 
for this too and make an "isDaemon" property out of these?


 Since we don't know what the API for querying stuff like this should 
 be, I had made it private.  I changed it to public.  I realized that, 
 even if a more full-fledged API is added at some point for this stuff, 
 there should be an obvious, convenient way to get it directly from 
 std.parallelism anyhow, and it would be trivial to call whatever API 
 eventually evolves to set this value.  Now, if you don't like the -1 
 thing, you can just do:
 
 auto pool = new TaskPool(osReportedNcpu);
 
 or
 
 defaultPoolThreads = osReportedNcpu;

Also good.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 19 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/19/2011 2:36 PM, Michel Fortin wrote:
 Next to "priority" I notice the "makeDaemon" and "makeAngel"
 functions... wouldn't it make more sense to mirror the core.thread API
 for this too and make an "isDaemon" property out of these?

Good point.  Done.

Mar 20 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

I can't say that I've read the code thoroughly but maybe someone can 
tell me if it supports work stealing?

/Jonas

Mar 18 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Jonas Drewsen (jdrewsen nospam.com)'s article
 I can't say that I've read the code thoroughly but maybe someone can
 tell me if it supports work stealing?
 /Jonas

Not in Cilk style.  Everything just goes to a shared queue.  In theory this
could be a bottleneck in the micro parallelism case.  However, some
experimentation I did early in the design convinced me that in practice
there's so much other overhead involved in moving work from one processor to
another (cache misses, needing to wake up a thread, etc.) that, in cases where
a shared queue might be a bottleneck, the parallelism is probably too
fine-grained anyhow.

std.parallelism does, however, support semantics somewhat similar to work
stealing in that, when a thread needs the results of a job that has not yet
been started, said job will be pulled out of the middle of the queue and
executed immediately in the thread that needs the result.

Using a shared queue simplifies the design massively and arguably makes more
sense in that tasks are guaranteed to be started in the order received, except
in the case described above.
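
A small sketch of that behaviour, using the task, put and yieldForce names from current Phobos. sumSquares is a hypothetical workload, and which thread ends up running each task depends on timing.

```d
import std.parallelism : task, taskPool;

// Hypothetical workload: sum of i*i for i in [0, n).
ulong sumSquares(ulong n)
{
    ulong total = 0;
    foreach (i; 0 .. n)
        total += i * i;
    return total;
}

void main()
{
    // Three tasks go onto the shared queue in submission order.
    auto a = task!sumSquares(1_000UL);
    auto b = task!sumSquares(2_000UL);
    auto c = task!sumSquares(3_000UL);
    taskPool.put(a);
    taskPool.put(b);
    taskPool.put(c);

    // Asking for c first: if no worker has started it yet, the calling
    // thread takes it out of the queue and runs it itself; if it is
    // already running, the caller waits for it to finish.
    immutable rc = c.yieldForce;
    immutable ra = a.yieldForce;
    immutable rb = b.yieldForce;

    assert(ra == sumSquares(1_000));
    assert(rb == sumSquares(2_000));
    assert(rc == sumSquares(3_000));
}
```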

Mar 18 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 18/03/11 22.43, dsimcha wrote:
 Not in Cilk style.  Everything just goes to a shared queue.  In theory this
 could be a bottleneck in the micro parallelism case.  However, some
 experimentation I did early in the design convinced me that in practice
 there's so much other overhead involved in moving work from one processor to
 another (cache misses, needing to wake up a thread, etc.) that, in cases where
 a shared queue might be a bottleneck, the parallelism is probably too
 fine-grained anyhow.

I guess that work stealing could be implemented without changing the 
current interface if evidence shows up that would favor work stealing?

Maybe later an extension to the task scheduler for per-task CPU affinity 
would be nice in order to reduce cache misses for certain kinds of tasks.

 std.parallelism does, however, support semantics somewhat similar to work
 stealing in that, when a thread needs the results of a job that has not yet
 been started, said job will be pulled out of the middle of the queue and
 executed immediately in the thread that needs the result.

This is indeed a nice feature.

 Using a shared queue simplifies the design massively and arguably makes more
 sense in that tasks are guaranteed to be started in the order received, except
 in the case described above.

Yes, it works very well in general, I believe.

Nice work!
/Jonas

Mar 18 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/18/2011 7:33 PM, Jonas Drewsen wrote:
 Not in Cilk style. Everything just goes to a shared queue. In theory
 this could be a bottleneck in the micro parallelism case. However, some
 experimentation I did early in the design convinced me that in practice
 there's so much other overhead involved in moving work from one processor
 to another (cache misses, needing to wake up a thread, etc.) that, in
 cases where a shared queue might be a bottleneck, the parallelism is
 probably too fine-grained anyhow.

 I guess that work stealing could be implemented without changing the
 current interface if evidence shows up that would favor work stealing?

Yes, this would be possible.  However, in my experience super 
fine-grained parallelism is almost never needed to take full advantage 
of whatever hardware you're running on.  Therefore, I'm hesitant to add 
complexity to std.parallelism to support super fine-grained parallelism, 
at least without strong justification in terms of real-world use cases. 
The one thing work stealing (and improvements to the queue in general) 
has going for it is that it would only make the implementation more 
complex, not the interface.
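
With a shared queue, the usual way to avoid overly fine-grained work is to hand out larger work units. A sketch, assuming the work-unit-size argument of parallel foreach as found in current Phobos; tagAll and the sizes are hypothetical:

```d
import std.parallelism : taskPool;

// Hypothetical helper: tags each element with its index modulo 7.
int[] tagAll(size_t n, size_t workUnitSize)
{
    auto data = new int[n];

    // Each trip to the shared queue hands a worker workUnitSize
    // consecutive elements, so a larger value means fewer, coarser
    // work units and less traffic on the queue.
    foreach (i, ref x; taskPool.parallel(data, workUnitSize))
        x = cast(int) (i % 7);

    return data;
}

void main()
{
    auto data = tagAll(100_000, 10_000);
    assert(data[8] == 1);
}
```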

Mar 18 2011

Caligo <iteronvexor gmail.com> writes:

Is std.parallelism better suited for data parallelism, task parallelism, or
both?  And how does it compare to something like OpenMP?

Mar 19 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/19/2011 3:08 PM, Caligo wrote:

 Is std.parallelism better suited for data parallelism, task parallelism,
 or both?

Both to some degree, but with more emphasis on data parallelism.

 And how does it compare to something like OpenMP?

It was not **explicitly** designed to be an OMP killer, but it supports 
parallel foreach (which can be made into parallel for using 
std.range.iota), and parallel reduce.  The synchronization primitives 
that OMP supports are already in druntime.

A major advantage over OpenMP is that std.parallelism is implemented 
within the language.  This means it's mostly portable across compilers 
and platforms and can easily be modified if you don't like something in 
it.  It also means that the syntax is more consistent with standard D 
syntax rather than being a bunch of weird looking pragmas.
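
For comparison with an OpenMP "parallel for" plus reduction, here is a sketch using the parallel and reduce names from current Phobos. sumOfSquares is a hypothetical helper, not part of the library.

```d
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical helper: fills a table of squares in parallel, then sums it.
ulong sumOfSquares(size_t n)
{
    auto squares = new ulong[n];

    // Parallel "for": iota supplies the index range, and each iteration
    // writes to its own slot, so no synchronization is needed.
    foreach (i; taskPool.parallel(iota(n)))
        squares[i] = cast(ulong) i * i;

    // Parallel reduce with an explicit seed; integer "+" is associative,
    // so the result does not depend on how the work is split.
    return taskPool.reduce!"a + b"(0UL, squares);
}

void main()
{
    assert(sumOfSquares(1_000) == 332_833_500);
}
```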

Mar 19 2011
