
digitalmars.D - Thread Pooling

reply "Craig Black" <cblack ara.com> writes:
I am not an expert at using threads, but I have an idea that I would like to 
present that would perhaps make threading easier.

Thread pooling is a great way to take advantage of concurrency on multi-core 
systems.  The number of threads in the pool could be equal to the number of 
cores in the multi-core system.

D has a big advantage over C++ because it has nested methods, anonymous 
methods, and delegates.  Delegates could be used with thread pools so that 
all the programmer would have to do to start a thread would be to call a 
method like, say, scheduleTask() and pass it a delegate.  (This could be a 
nested method, or an anonymous method.)  The thread pool would assign the 
task to the first thread that is idle.  scheduleTask() could return a handle 
to a Task object.  The Task handle could be used to test to see if the task 
is running or finished.
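
Roughly, a minimal sketch of what I have in mind -- just a sketch, with scheduleTask, Task and ThreadPool as illustrative names, assuming D's core.thread and core.sync modules for the plumbing:

import core.thread;
import core.sync.mutex;
import core.sync.condition;

// A task wraps the delegate plus a "finished" flag that callers can poll.
class Task
{
    private void delegate() work;
    private bool done;

    this(void delegate() work) { this.work = work; }

    // true once the delegate has run to completion
    bool finished()
    {
        synchronized (this) { return done; }
    }

    private void run()
    {
        work();
        synchronized (this) { done = true; }
    }
}

// Fixed-size pool; one worker per core would be the obvious default.
class ThreadPool
{
    private Task[] queue;
    private Mutex mtx;
    private Condition cond;

    this(size_t workers)
    {
        mtx  = new Mutex;
        cond = new Condition(mtx);
        foreach (i; 0 .. workers)
        {
            auto t = new Thread(&workerLoop);
            t.isDaemon = true;   // workers shouldn't keep the program alive
            t.start();
        }
    }

    // Hand a delegate to the pool; the first idle worker picks it up.
    Task scheduleTask(void delegate() dg)
    {
        auto task = new Task(dg);
        synchronized (mtx)
        {
            queue ~= task;
            cond.notify();
        }
        return task;
    }

    private void workerLoop()
    {
        for (;;)
        {
            Task task;
            synchronized (mtx)
            {
                while (queue.length == 0)
                    cond.wait();
                task = queue[0];
                queue = queue[1 .. $];
            }
            task.run();
        }
    }
}

void main()
{
    import std.stdio;

    auto pool = new ThreadPool(4);
    auto t = pool.scheduleTask(delegate void() { writeln("running in the pool"); });
    while (!t.finished())
        Thread.yield();
}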

Comments?

-Craig 
Apr 23 2006
Sean Kelly <sean f4.ca> writes:
Craig Black wrote:
 I am not an expert at using threads, but I have an idea that I would like to 
 present that would perhaps make threading easier.
 
 Thread pooling is a great way to take advantage of concurrency on multi-core 
 systems.  The number of threads in the pool could be equal to the number of 
 cores in the multi-core system.
 
 D has a big advantage over C++ because it has nested methods, anonymous 
 methods, and delegates.  Delegates could be used with thread pools so that 
 all the programmer would have to do to start a thread would be to call a 
 method like, say, scheduleTask() and pass it a delegate.  (This could be a 
 nested method, or an anonymous method.)  The thread pool would assign the 
 task to the first thread that is idle.  scheduleTask() could return a handle 
 to a Task object.  The Task handle could be used to test to see if the task 
 is running or finished.
 
 Comments?
I've considered this as well. In fact, I've got library code to do this for C++. But I'd prefer something a bit more forward-looking for D (see Herb Sutter's Concur project for one possibility). I'm planning to address this need at some point in Ares, probably in a manner similar to the Concur design, but I don't have time at the moment.

Sean
Apr 23 2006
next sibling parent reply "Craig Black" <cblack ara.com> writes:
 I've considered this as well.  In fact, I've got library code to do this 
 for C++.  But I'd prefer something a bit more forward-looking for D (see 
 Herb Sutter's Concur project for one possibility).  I'm planning to 
 address this need at some point in Ares, probably in a manner similar to 
 the Concur design, but I don't have time at the moment.
I don't see how syntax this high-level could be implemented in a library. It seems like this would require features to be added to the language core. Or do you disagree?

-Craig
Apr 23 2006
kris <foo bar.com> writes:
Craig Black wrote:
I've considered this as well.  In fact, I've got library code to do this 
for C++.  But I'd prefer something a bit more forward-looking for D (see 
Herb Sutter's Concur project for one possibility).  I'm planning to 
address this need at some point in Ares, probably in a manner similar to 
the Concur design, but I don't have time at the moment.
I don't see syntax this high-level to be implementable in a library. It seems like this would require features to be added to the language core. Or do you disagree? -Craig
Are you thinking of Occam-style 'par' and 'alt' constructs, or something different?
Apr 23 2006
parent reply "Craig Black" <cblack ara.com> writes:
"kris" <foo bar.com> wrote in message 
news:e2gtv3$1oqh$1 digitaldaemon.com...
 Craig Black wrote:
I've considered this as well.  In fact, I've got library code to do this 
for C++.  But I'd prefer something a bit more forward-looking for D (see 
Herb Sutter's Concur project for one possibility).  I'm planning to 
address this need at some point in Ares, probably in a manner similar to 
the Concur design, but I don't have time at the moment.
I don't see syntax this high-level to be implementable in a library. It seems like this would require features to be added to the language core. Or do you disagree? -Craig
Are you thinking of Occam-style 'par' and 'alt' constructs, or something different?
I don't know anything about "Occam" (Objective Caml?) so I can't comment on that.

Concur is an experimental research project in search of high-level language constructs for C++ that provide safe and efficient parallelism. I can give you a brief tutorial, but it's probably best to look it up on the net.

With Concur language features, there is no more dealing with threads and locks. The first example presented is the notion of an "active" class, which is a class where each object instance conceptually runs on its own thread.

active class C {
  void foo() { ... }
}

void main()
{
   active C c;
   c.foo(); // call is nonblocking
   ...  // this code can execute in parallel with c.foo()
}

There is an issue with return values when calling a method on a separate thread: an ordinary return value would force the caller to block until the work is finished. For such return values, Concur introduces the notion of "future" values, or values that may or may not be evaluated yet.

active class C {
  int foo() { ... }
}

void bar(int x) { ... }
void bar(future int x) { ... }

void main()
{
   active C c;
   future int result = c.foo();
   bar(result.wait()); // calls bar(int)
   bar(result);        // calls bar(future int)
}

I think "out" parameters in active methods should also be declared "future". (Please, anyone, correct me if I am misinterpreting any of Herb Sutter's stuff.) There are some other ideas presented as well ... but like I said, you can look it up on the web.

-Craig
Apr 23 2006
kris <foo bar.com> writes:
Craig Black wrote:
 "kris" <foo bar.com> wrote in message 
 news:e2gtv3$1oqh$1 digitaldaemon.com...
 
Craig Black wrote:

I've considered this as well.  In fact, I've got library code to do this 
for C++.  But I'd prefer something a bit more forward-looking for D (see 
Herb Sutter's Concur project for one possibility).  I'm planning to 
address this need at some point in Ares, probably in a manner similar to 
the Concur design, but I don't have time at the moment.
I don't see syntax this high-level to be implementable in a library. It seems like this would require features to be added to the language core. Or do you disagree? -Craig
Are you thinking of Occam-style 'par' and 'alt' constructs, or something different?
I don't know anything about "Occam" (Objective caml?) so I can't comment on that. Concur is an experimental research project in search high-level language constructs for C++ that provide safe and efficient parallelism. I can give you a brief tutorial but it's probably best look it up on the net.
Ah. I'd interpreted your response to Sean as being something other than Concur; but it's now clear you were talking about the same thing.

Occam is a language designed in the 80's specifically for wildly concurrent machine-architectures. The hardware and language were designed to reflect each other (concurrently <g>) and, at the time, Occam and the Transputer were truly awesome advances. Made it almost trivial to build hypercubes, backed with uber-fast crossbar routing-switches. Just never took off in the mainstream -- probably ahead of their time :) -- and INMOS died a horrible death.

http://en.wikipedia.org/wiki/Occam_programming_language
http://en.wikipedia.org/wiki/Transputer

(the venerable "Connection Machine" was not Transputer based, but the concepts were very similar)
 
 With Concur language features, there is no more dealing with threads and 
 locks.  The first example presented is the notion of an "active" class, 
 which is a class where each object instance conceptually runs on its own 
 thread.
 
 active class C {
   void foo() { ... }
 }
 
 void main()
 {
    active C c;
    c.foo(); // call is nonblocking
    ...  // this code can execute in parallel with c.f()
 }
 
[snip]

Reading the PARC slides, Concur looks quite like a rehash of Occam constructs? Sutter talks about the FIFO approach to method invocation (via message passing), whereas Occam/transputer used "channels" for such things -- including "future" values and so on -- all backed in hardware (with the physical channels looking more than a bit like the multi-hypertransport busses of today). As you can see from the above links, PAR and ALT provided for efficient and granular control over concurrency.

Occam is long since dead, but the ideas and implementation were ground-breaking. Certainly worthy of a look in this regard?

And, I agree completely -- it would be awesome if D were to excel in this arena; perhaps the biggest sea-change in programming strategy we're likely to see. Any language that can ride that wave will garner a whole lot of attention :)
Apr 23 2006
kris <foo bar.com> writes:
 http://en.wikipedia.org/wiki/Occam_programming_language
 http://en.wikipedia.org/wiki/Transputer
 
 (the venerable "Connection Machine" was not Transputer based, but the 
 concepts were very similar)
Just a follow-up for all the hardware geeks: here's an old 42-transputer board, arranged in a 2D array via their on-board links. Cool, or what?

http://www.cs.bris.ac.uk/~dave/photographs/B0042.jpg
Apr 23 2006
Sean Kelly <sean f4.ca> writes:
kris wrote:
 
 Reading the PARC slides, Concur looks quite like a rehash of Occam 
 constructs? Sutter talks about the FIFO approach to method invocation 
 (via message passing), whereas Occam/transputer used "channels" for such 
 things -- including "future" values and so on -- all backed in hardware 
 (with the physical channels looking more than a bit like the 
 multi-hypertransport busses of today). As you can see from the above 
 links, PAR and ALT provided for efficient and granular control over 
 concurrency.
 
 Occam is long since dead, but the ideas and implementation were 
 ground-breaking. Certainly worthy of a look in this regard?
Definitely. As I mentioned in my other reply, I think Concur is a step in the right direction, but merely a step. Still, it beats the traditional approach :-)
 And, I agree completely -- it would be awesome if D were to excel in 
 this arena; perhaps the biggest sea-change in programming strategy we're 
 likely to see. Any language that can ride that wave will garner a whole 
 lot of attention :)
Definitely.

Sean
Apr 23 2006
kris <foo bar.com> writes:
Sean Kelly wrote:
 kris wrote:
 
 Reading the PARC slides, Concur looks quite like a rehash of Occam 
 constructs? Sutter talks about the FIFO approach to method invocation 
 (via message passing), whereas Occam/transputer used "channels" for 
 such things -- including "future" values and so on -- all backed in 
 hardware (with the physical channels looking more than a bit like the 
 multi-hypertransport busses of today). As you can see from the above 
 links, PAR and ALT provided for efficient and granular control over 
 concurrency.

 Occam is long since dead, but the ideas and implementation were 
 ground-breaking. Certainly worthy of a look in this regard?
[I should have said "dead with respect to mainstream adoption"]
 
 
 Definitely.  As I mentioned in my other reply, I think Concur is a step
 in the right direction, but merely a step.  Still, it beats the 
 traditional approach :-)
 
 And, I agree completely -- it would be awesome if D were to excel in 
 this arena; perhaps the biggest sea-change in programming strategy 
 we're likely to see. Any language that can ride that wave will garner 
 a whole lot of attention :)
Definitely. Sean
Since Occam is probably unknown to most, here's a mini tutorial linky:

http://frmb.org/occtutor.html

The syntax will make some folks barf and chunder, but try to get beyond that :) Most importantly, the language is an implementation of CSP. This means the code is mechanically provable as "correct" in terms of its concurrency: race conditions, alias avoidance, and various other nasties. This is in stark contrast to C style languages. Notice that occam had array-slices (long ago <g>), and take a look at how simple the classic concurrent producer/consumer example is (search for the text "going parallel" in the above link).

A fully concurrent OS project is described over here: http://www.cs.kent.ac.uk/projects/ofa/kroc/rmox03.pdf, which makes some interesting comments with respect to CSP versus traditional (difficult) concurrency.

I hope Walter takes the time to read both of these. The ideas have been in use since the late 70's, so they're fairly refined at this point. Besides, provably-correct concurrency is surely a perfect partner for DbC :)
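
For comparison, here's roughly what that producer/consumer shape could look like with message passing in D -- only a sketch, using std.concurrency's spawn/send/receive, with a thread's mailbox standing in (imperfectly) for an occam channel:

import std.concurrency;
import std.stdio;

// Consumer: read ints from its mailbox until the producer says "done".
void consumer()
{
    bool running = true;
    while (running)
    {
        receive(
            (int x)    { writeln("consumed ", x); },
            (string s) { running = false; }      // "done" ends the loop
        );
    }
}

void main()
{
    auto tid = spawn(&consumer);   // start the consumer "process"

    // Producer: send values down the "channel" (the consumer's mailbox).
    foreach (i; 0 .. 5)
        tid.send(i);
    tid.send("done");
}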
Apr 23 2006
Sean Kelly <sean f4.ca> writes:
Craig Black wrote:
 I've considered this as well.  In fact, I've got library code to do this 
 for C++.  But I'd prefer something a bit more forward-looking for D (see 
 Herb Sutter's Concur project for one possibility).  I'm planning to 
 address this need at some point in Ares, probably in a manner similar to 
 the Concur design, but I don't have time at the moment.
I don't see syntax this high-level to be implementable in a library. It seems like this would require features to be added to the language core. Or do you disagree?
I don't think the exact syntax matters so much as behavior and ease of use, though inner functions and delegates should allow us to get fairly close without any language changes. The most obvious approach would be something like this (note that I haven't given this much thought so it may require some changes):

    auto x = active( void delegate() { foo(10); } );
    auto y = active( void delegate() { a.b(c); } );
    auto p = active( Bar delegate() { return new Bar; } );
    (p & (x | y)).wait(); // waits on a temporary future group
    return p.val;

Here, the 'active' keyword has been replaced by a function accepting a delegate, and future group logic uses bitwise operators instead of their logical operators (operating a bit like simple expression templates). Also, the 'val' member is used in place of implicit type conversion to get at the returned value, if one exists.

This sort of thing seems preferable to explicit use of threads and locks, but I feel it's merely the first step towards something better. Hopefully, better ideas will present themselves once we have a chance to play with the idea a bit, as I'm not sure I'd actually propose language changes to support this sort of thing quite yet.

Sean
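
To make that a bit more concrete, here's a rough library-only sketch of 'active' and a future, built on plain core.thread -- one thread per call rather than a pool, purely for brevity, and every name below is just a placeholder:

import core.thread;

// A future: created by active(), filled in by the worker thread.
class Future(T)
{
    static if (!is(T == void))
        private T result;
    private Thread worker;

    // true once the delegate has finished
    bool isDone() { return !worker.isRunning; }

    void wait() { worker.join(); }

    static if (!is(T == void))
    {
        // block until the value is available, then return it
        T val() { wait(); return result; }
    }
}

// active(dg): run dg asynchronously and return a future for its result.
Future!T active(T)(T delegate() dg)
{
    auto f = new Future!T;
    f.worker = new Thread({
        static if (is(T == void))
            dg();
        else
            f.result = dg();
    });
    f.worker.start();
    return f;
}

void main()
{
    import std.stdio;

    auto x = active(delegate void() { writeln("side effect"); });
    auto p = active(delegate int()  { return 6 * 7; });

    x.wait();
    writeln(p.val());   // 42
}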
Apr 23 2006
parent reply "Craig Black" <cblack ara.com> writes:
 I don't think the exact syntax matters so much as behavior and ease of 
 use, though inner functions and delegates should allow us to get fairly 
 close without any language changes.  The most obvious approach would be 
 something like this (note that I haven't given this much thought so it may 
 require some changes):

     auto x = active( void delegate() { foo(10); } );
     auto y = active( void delegate() { a.b(c); } );
     auto p = active( Bar  delegate() { return new Bar; } );
     (p & (x | y)).wait(); // waits on a temporary future group
     return p.val;

 Here, the 'active' keyword has been replaced by a function accepting a 
 delegate, and future group logic uses bitwise operators instead of their 
 logical operators (operating a bit like simple expression templates). 
 Also, the 'val' member is used in place of implicit type conversion to get 
 at the returned value, if one exists.

 This sort of thing seems preferable to explicit use of threads and locks, 
 but I feel it's merely the first step towards something better. Hopefully, 
 better ideas will present themselves once we have a chance to play with 
 the idea a bit, as I'm not sure I'd actually propose language changes to 
 support this sort of thing quite yet.
I realize that you haven't had time to think this through completely, so perhaps I can help to stimulate your thinking. And as always, I could be missing something here so correct me if I'm way off base.

What benefit does this approach have over a thread pool? It seems to be basically the idea that I had originally presented, except that I had not presented a solution for methods with return values.

The approach that you outline does not allow for active classes and so misses a large part of what the Concur pseudocode syntax provides. For example,

active class C {
  void f() { ... }
  void g() { ... }
}
...
active C c;
c.f();
c.g();

Both of the above method calls are assigned to the same thread. Thus, they are executed in order and not in parallel. This is important, since they have access to common data members declared in class C. How does your approach provide for this case? Would we not have to revert to using locks?

In order to assign tasks to the same thread, we could make the active() method take one or more delegates. Each delegate in the argument list would be assigned to the same thread and so would be executed in order rather than in parallel. Even still, this approach lacks the OO elegance of the active class pseudocode. But it's perhaps better than using locks.

I will, however, refrain, as much as it pains me to do so, from making any more feature requests for D until 1.0 is released. (Not a promise but I will try :)

-Craig
Apr 23 2006
Sean Kelly <sean f4.ca> writes:
Craig Black wrote:
 
 I realize that you haven't had time to think this through completely, so 
 perhaps I can help to stimulate your thinking.  And as always,  I could be 
 missing something here so correct me if I'm way off base.
 
 What benefit does this approach have over a thread pool?  It seems to be 
 basically the idea that I had originally presented, except that I had not 
 presented a solution for methods with return values.
Compared to working directly with a thread pool, it allows for the scheduling policy to be separated from the task queue. So actives could be run sequentially in the main thread, spread across multiple threads, etc. However, a more general task processor could really do the same thing. For the rest, I think it mostly provides a decent framework to use for asynchronous task completion. For example, I think it's potentially useful to be able to wait on combinations of futures using or/and logic.
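
As a sketch of what that or/and logic could look like in library code -- Waitable, AllOf, AnyOf and Delay are all invented names here; anything with wait()/isDone(), such as a future or a task handle, would derive from Waitable:

import core.thread;
import core.time;

// Anything that can be waited on: a future, a task handle, or a group.
abstract class Waitable
{
    abstract void wait();
    abstract bool isDone();

    // a & b : wait for both;  a | b : wait for either
    Waitable opBinary(string op)(Waitable rhs)
    {
        static if (op == "&")      return new AllOf(this, rhs);
        else static if (op == "|") return new AnyOf(this, rhs);
        else static assert(0, "unsupported operator: " ~ op);
    }
}

class AllOf : Waitable
{
    private Waitable a, b;
    this(Waitable a, Waitable b) { this.a = a; this.b = b; }
    override void wait()   { a.wait(); b.wait(); }
    override bool isDone() { return a.isDone() && b.isDone(); }
}

class AnyOf : Waitable
{
    private Waitable a, b;
    this(Waitable a, Waitable b) { this.a = a; this.b = b; }
    override void wait()
    {
        // naive poll-and-yield; a real version would use a condition variable
        while (!a.isDone() && !b.isDone())
            Thread.yield();
    }
    override bool isDone() { return a.isDone() || b.isDone(); }
}

// A trivial Waitable just to make the example self-contained.
class Delay : Waitable
{
    private MonoTime deadline;
    this(Duration d) { deadline = MonoTime.currTime + d; }
    override void wait()   { while (!isDone()) Thread.yield(); }
    override bool isDone() { return MonoTime.currTime >= deadline; }
}

void main()
{
    auto x = new Delay(100.msecs);
    auto y = new Delay(300.msecs);
    auto p = new Delay(200.msecs);
    (p & (x | y)).wait();   // same shape as the earlier example
}

With futures and task handles deriving from Waitable, the (p & (x | y)).wait() line from the earlier example falls out without any language support.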
 The approach that you outline does not allow for active classes and so 
 misses a large part of what the Concur pseudocode syntax provides.  For 
 example,
 
 active class C {
   void f() { ... }
   void g() { ... }
 }
 ...
 active C c;
 c.f();
 c.g();
I'll admit I skipped that part for the sake of brevity, but it should be possible to create active objects via subclassing or perhaps with a combination of mixins and interfaces. However, I think the ad hoc approach may prove to be much more useful in practice.
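
For example, here's a rough sketch of the subclassing route -- ActiveObject is an invented name; each instance owns a queue and a single worker thread, so everything posted to one instance runs in order, with no locks in the derived class:

import core.thread;
import core.time;
import core.sync.mutex;
import core.sync.condition;

// Base class: derived classes wrap their method bodies in post(),
// and all work for one instance runs, in order, on that instance's thread.
class ActiveObject
{
    private void delegate()[] queue;
    private Mutex mtx;
    private Condition cond;

    this()
    {
        mtx  = new Mutex;
        cond = new Condition(mtx);
        auto t = new Thread(&run);
        t.isDaemon = true;
        t.start();
    }

    protected void post(void delegate() dg)
    {
        synchronized (mtx)
        {
            queue ~= dg;
            cond.notify();
        }
    }

    private void run()
    {
        for (;;)
        {
            void delegate() dg;
            synchronized (mtx)
            {
                while (queue.length == 0)
                    cond.wait();
                dg = queue[0];
                queue = queue[1 .. $];
            }
            dg();
        }
    }
}

// The C from the example: f() and g() return immediately, but their
// bodies run one after the other on c's own worker thread.
class C : ActiveObject
{
    private int state;

    void f() { post(delegate { state += 1; }); }
    void g() { post(delegate { state *= 2; }); }
}

void main()
{
    auto c = new C;
    c.f();   // non-blocking
    c.g();   // runs after f()'s body, on the same thread
    Thread.sleep(100.msecs);   // crude: just gives the work time to finish
}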
 Even still, this approach lacks the OO elegance of the active class 
 pseudocode.  But it's perhaps better than using locks.
Depends on the problem. I think the non-OO active syntax better illustrates just how easy it is to break sequential code into parallel chunks. I'll probably have more to say once I've read the links Kris posted.

Sean
Apr 24 2006
kris <foo bar.com> writes:
Sean Kelly wrote:
 Craig Black wrote:
 
 I realize that you haven't had time to think this through completely, 
 so perhaps I can help to stimulate your thinking.  And as always,  I 
 could be missing something here so correct me if I'm way off base.

 What benefit does this approach have over a thread pool?  It seems to 
 be basically the idea that I had originally presented, except that I 
 had not presented a solution for methods with return values.
Compared to working directly with a thread pool, it allows for the scheduling policy to be separated from the task queue. So actives could be run sequentially in the main thread, spread across multiple threads, etc. However, a more general task processor could really do the same thing. For the rest, I think it mostly provides a decent framework to use for asynchronous task completion. For example, I think it's potentially useful to be able to wait on combinations of futures using or/and logic.
 The approach that you outline does not allow for active classes and so 
 misses a large part of what the Concur pseudocode syntax provides.  
 For example,

 active class C {
   void f() { ... }
   void g() { ... }
 }
 ...
 active C c;
 c.f();
 c.g();
I'll admit I skipped that part for the sake of brevity, but it should be possible to create active objects via subclassing or perhaps with a combination of mixins and interfaces. However, I think the ad hoc approach may prove to be much more useful in practice.
 Even still, this approach lacks the OO elegance of the active class 
 pseudocode.  But it's perhaps better than using locks.
Depends on the problem, I think. I think the non-OO active syntax better illustrates just how easy it is to break sequential code into parallel chunks. I'll probably have more to say once I've read the links Kris posted. Sean
Sean ~ you might find these interesting:

A Java library of CSP primitives. This was built with the cooperation of the CSP and occam communities, and was used by Doug Lea for the CSP examples in his book "Concurrent Programming in Java" 2nd Ed.

Intro: http://www.cs.kent.ac.uk/projects/ofa/jcsp/explain.html
JDocs: http://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp1-0-rc4/jcsp-docs/

Presentation & commentary on CSP-style concurrency versus thread/monitor style:
http://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp.pdf

Presentation on distributed JCSP:
http://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp-net-slides.pdf
Apr 24 2006
Ian <ianc qubesoft.com> writes:
You might want to take a look at the chord-based approach to concurrency.

I spent a while playing with it and found the code I wrote using it to be far more readable than any other system I have used. I think the syntax for chord definitions sucks, but I'm probably just being picky :)

http://research.microsoft.com/~nick/polyphony/
	Ian Cottrell
Apr 24 2006
prev sibling parent "Craig Black" <cblack ara.com> writes:
 I've considered this as well.  In fact, I've got library code to do this 
 for C++.  But I'd prefer something a bit more forward-looking for D (see 
 Herb Sutter's Concur project for one possibility).  I'm planning to 
 address this need at some point in Ares, probably in a manner similar to 
 the Concur design, but I don't have time at the moment.
BTW Sean, I didn't mention it before, but this would be VERY cool to have in D. I'm glad you are thinking along these lines. Kudos for being so darn smart. -Craig
Apr 23 2006
Norbert Nemec <Norbert Nemec-online.de> writes:
Weren't thread pools mostly a crutch for old systems where creating
threads was expensive? Nowadays, threads are usually lightweight enough
that the best strategy is to write a program that creates plenty of
threads and leaves the scheduling completely to the operating system.

So, of course, you could have a function like "scheduleTask", but that
function should simply create a thread, and it is the operating system's
job to schedule this thread for later execution.
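
In that view, the whole thing would be little more than this (a sketch using core.thread; scheduleTask is just the name from the original post, and the returned Thread plays the role of the Task handle):

import core.thread;

// "scheduleTask" without a pool: hand the delegate its own thread
// and let the OS scheduler sort it out.
Thread scheduleTask(void delegate() dg)
{
    auto t = new Thread(dg);
    t.start();
    return t;   // t.isRunning / t.join() stand in for the Task handle
}

void main()
{
    import std.stdio;

    auto t = scheduleTask(delegate { writeln("hello from a fresh thread"); });
    t.join();
}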




Craig Black wrote:
 I am not an expert at using threads, but I have an idea that I would like to 
 present that would perhaps make threading easier.
 
 Thread pooling is a great way to take advantage of concurrency on multi-core 
 systems.  The number of threads in the pool could be equal to the number of 
 cores in the multi-core system.
 
 D has a big advantage over C++ because it has nested methods, anonymous 
 methods, and delegates.  Delegates could be used with thread pools so that 
 all the programmer would have to do to start a thread would be to call a 
 method like, say, scheduleTask() and pass it a delegate.  (This could be a 
 nested method, or an anonymous method.)  The thread pool would assign the 
 task to the first thread that is idle.  scheduleTask() could return a handle 
 to a Task object.  The Task handle could be used to test to see if the task 
 is running or finished.
 
 Comments?
 
 -Craig 
 
 
Apr 24 2006
wilsonk nowhere.com writes:
In article <e2i973$1iap$1 digitaldaemon.com>, Norbert Nemec says...
Weren't thread pools mostly a crutch for old systems where creating
threads was expensive? Nowadays, threads are usually lightweight enough
that it is the best strategy to write a program creating plenty of
threads and leave the scheduling completely to the operating system.

So, of course, you could have a function like "scheduleTask", but that
function should simply create a thread and it is to operating systems's
job to schedule this thread for later execution.
Hi Norbert,

I think you are correct with your statement above about thread creation being expensive on old systems compared to nowadays. But the creation of "plenty of threads and leave the scheduling completely to the operating system" may need some clarification.

I assume that you are thinking of a program that only uses any given thread once during execution, then the program terminates. If this is not the model you are thinking of, then thread pools are a useful optimization. I wrote a server generation tool that used thread pooling and it was much faster under Linux than the process-based (ie. "lightweight threads", as you say, under Linux ;) or thread-spawning models. The reason for this is the very short-term use (and no re-use) of many threads in the server <--- usually HTTP servers.

Example: "For each connection spawn a new thread, service the request, then kill the thread" is very expensive compared to "For each new connection grab a thread from the pool, service the request, place the thread back in the pool". Thread pools are used in the ACE framework, SEDA framework, etc. for high speed/high load communications (Apache 2 also uses a thread pool, by the way).

So anyways, if you are not worried about communications software (and the like) then simply letting the OS schedule everything may be marginally acceptable (I am sure others know of programs where this strategy 'won't' be acceptable, though).

Thanks,
K.Wilson
Apr 24 2006
Norbert Nemec <Norbert Nemec-online.de> writes:
Thanks, K.Wilson, for your insightful clarification. As so often,
I forgot that I, like every programmer, have only a limited perspective
and should not be too quick to judge according to my own experiences.

Indeed, "plenty of threads" should be clarified to "at least as many
threads as there are processors but not orders of magnitude more".
Creating a thread is fairly cheap nowadays, but, of course it has some
cost, and there certainly are applications (like the server that you
mention) where this cost has considerable weight.



wilsonk nowhere.com wrote:
 In article <e2i973$1iap$1 digitaldaemon.com>, Norbert Nemec says...
 Weren't thread pools mostly a crutch for old systems where creating
 threads was expensive? Nowadays, threads are usually lightweight enough
 that it is the best strategy to write a program creating plenty of
 threads and leave the scheduling completely to the operating system.

 So, of course, you could have a function like "scheduleTask", but that
 function should simply create a thread and it is to operating systems's
 job to schedule this thread for later execution.
Hi Norbert, I think you are correct with your statement above about thread creation being expensive on old systems compared to nowadays. But the creation of "plenty of threads and leave the scheduling completely to the operating system" may need some clarification. I assume that you are thinking of a program that only uses any given thread once during exectution, then the program terminates. If this is not the model you are thinking of then, thread pools are a useful optimization. I wrote a server generation tool that used thread pooling and it was much faster under Linux than the process-based (ie. "lightweight threads", as you say, under Linux ;) or thread-spawning models. The reason for this is the very short term use (and no re-use) of many threads in the server <--- usually HTTP servers. Example: "For each connection spawn a new thread, service the request, then kill thread" is very expensive compared to "For each new connection grab thread from pool, service request, place thread back in pool". Thread pools are used in the ACE framework, SEDA framework, etc. for high speed/high load communications (Apache 2 also uses a thread pool, by the way). So anyways, if you are not worried about communications software (and the like) then simply letting the OS schedule everything may be marginally acceptable (I am sure others know of programs where this strategy 'won't' be acceptable, though). Thanks, K.Wilson
Apr 24 2006