digitalmars.D - Parallel programming
- bearophile (10/10) Jul 15 2008 How much time do we have to wait to see some parallel processing feature...
- Sean Kelly (24/36) Jul 15 2008 I asked for parallelization support for foreach... well, ages ago. At
- downs (4/10) Jul 15 2008 For what it's worth, coroutines and futures are also in tools. ( http://...
- downs (24/39) Jul 15 2008 Patched GDC supports autovectorization with -ftree-vectorize, although t...
- Markus Koskimies (36/40) Jul 15 2008 A very short answer; for true parallel processing, 2-4 processors is
- Sean Kelly (24/31) Jul 15 2008 I agree completely. MP is easy to comprehend (it's how people naturally
- Markus Koskimies (17/42) Jul 15 2008 I couldn't agree more. MP is very natural way for us humans to organize
- Markus Koskimies (2/4) Jul 15 2008 Many *even* highly etc. etc. :oops:
- JAnderson (6/22) Jul 15 2008 I'm hoping that the new "Pure" stuff Walter is working on, will enable
- Jascha Wetzel (3/19) Jul 16 2008 agreed, we absolutely need an OpenMP (http://www.openmp.org)
How much time do we have to wait to see some parallel processing features in D? People are getting more and more rabid because they have few ways to use their 2-4 core CPUs. Classic multithreading is useful, but sometimes it's not easy to use correctly.

There are other ways to write parallel code that D may adopt (more than one way is probably better; no silver bullet exists in this field). Their point is to allow the use of the 2-4+ core CPUs of today (and maybe the 80-1000+ cores of the future) in non-speed-critical parts of the code, where the programmer wants to use the other cores anyway without too much programming effort.

I think Walter wants the D language to be multi-paradigm; one of the best ways to allow multiprocessing in a simple and safe way is Stream Processing (http://en.wikipedia.org/wiki/Stream_Processing ), and D syntax may grow a few constructs to use that kind of programming in a simple way (C++ has some such libs, I think).

Another easy way to perform multiprocessing is to vectorize: the compiler automatically uses all the cores to evaluate expressions like array1+array2+array3.

Another way to perform multiprocessing is to add to the D syntax the parallel_for (and a few related constructs to merge results back, etc.) that was present in the "Parallel Pascal" language. Such constructs are much simpler to use correctly than threads. The new "Fortress" language by Sun shows similar things, but they are more refined than the Parallel Pascal ones (and they look more complex to understand and use, so they may be overkill for D, I don't know; some of those parallel features of Fortress look quite difficult to implement, to me). Some time ago I saw a form of parallel_for and the like in a small, easy language from MIT that I think is simple enough.

Other ways to use parallel code are now being pushed by Intel, OpenMP, and the hairy but usable CUDA by Nvidia (I am not sure I want to learn CUDA; it's a C variant, but it seems to require a large human memory and a large human brain to be used, while I think D may have simpler built-in things. More "serious" D programmers may use external libs that give them any fine control they want). To me they look too much in flux right now to be copied too closely by D.

Bye,
bearophile
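P.S. For illustration, a library-level parallel_for can be sketched on top of std.thread even without language support (parallelFor and Worker are made-up names for this sketch, not an existing API):

import std.thread;

// Each worker carries its own slice bounds as object state, so no
// stack-frame capture is needed (D1 has no heap closures).
class Worker : Thread
{
    int lo, hi;
    void delegate(int) loopBody;

    this(int lo, int hi, void delegate(int) loopBody)
    {
        this.lo = lo;
        this.hi = hi;
        this.loopBody = loopBody;
    }

    override int run()
    {
        for (int i = lo; i < hi; i++)
            loopBody(i);
        return 0;
    }
}

// Run loopBody over [0 .. n), split across `workers` threads.
void parallelFor(int n, int workers, void delegate(int) loopBody)
{
    Worker[] pool;
    for (int w = 0; w < workers; w++)
        pool ~= new Worker(n * w / workers, n * (w + 1) / workers, loopBody);
    foreach (t; pool) t.start();
    foreach (t; pool) t.wait();   // join all workers before returning
}

Of course the loop body must be safe to run concurrently, e.g. each iteration writing only its own array element.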
Jul 15 2008
bearophile wrote:
> How much time do we have to wait to see some parallel processing features in D? [...] Another way to perform multiprocessing is to add to the D syntax the parallel_for (and a few related constructs to merge results back, etc.) that was present in the "Parallel Pascal" language. Such constructs are much simpler to use correctly than threads. [...]

I asked for parallelization support for foreach... well, ages ago. At the time Walter said no because DMD was years away from being able to do anything like that, but perhaps with the new focus on multiprogramming one can argue more strongly that it's important to get something like this in the spec even if DMD itself doesn't support it. My request was pretty minimal and partially a reaction to foreach_reverse. It was:

foreach( ... )       // defaults to "fwd"
foreach(fwd)( ... )
foreach(rev)( ... )
foreach(any)( ... )

Thus foreach(any) is eligible for parallelization, while fwd and rev are what we have now. This would be easy enough with templates and another keyword:

apply!(fwd)( ... )

etc. But passing a delegate literal as an argument isn't nearly as nice as the built-in foreach. And Tom's (IIRC) proposal to clean up the syntax for this doesn't look like it will ever be accepted.

> Other ways to use parallel code are now being pushed by Intel, OpenMP, and the hairy but usable CUDA by Nvidia [...]

D already has coroutines, DCSP, and futures available from various programmers (Mikola Lysenko for the first two), so I think the state of multiprogramming in D is actually pretty good even without additional language support.

Sean
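P.S. A rough sketch of what that apply template might look like (Dir and apply are illustrative names only; no parallel backend is shown, so `any` just falls back to forward order):

enum Dir { fwd, rev, any }

// `any` promises the loop body is order-independent, so a smarter
// backend would be free to run it in parallel; this fallback runs
// it forward like fwd.
void apply(Dir dir, T)(T[] items, void delegate(ref T) dg)
{
    if (dir == Dir.rev)
        for (size_t i = items.length; i > 0; i--)
            dg(items[i - 1]);
    else
        for (size_t i = 0; i < items.length; i++)
            dg(items[i]);
}

// Usage:
//   int[] a = [1, 2, 3];
//   apply!(Dir.any, int)(a, delegate(ref int x) { x *= 2; });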
Jul 15 2008
Sean Kelly wrote:
> D already has coroutines, DCSP, and futures available from various programmers (Mikola Lysenko for the first two), so I think the state of multiprogramming in D is actually pretty good even without additional language support.

For what it's worth, coroutines and futures are also in tools. ( http://dsource.org/projects/scrapple/browser/trunk/tools/tools )

Also, I agree with your sentiment.

--downs
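P.S. A future needs surprisingly little code on top of std.thread; a minimal sketch (this Future class is illustrative, not the actual tools API):

import std.thread;

// Minimal future: start the computation on construction, block on
// first access to the value.
class Future(T)
{
    private T result;
    private T delegate() fn;
    private Thread worker;

    this(T delegate() fn)
    {
        this.fn = fn;
        worker = new Thread(&run);   // run() is bound to this object
        worker.start();
    }

    private int run()
    {
        result = fn();
        return 0;
    }

    T value()
    {
        worker.wait();   // join the worker if it is still running
        return result;
    }
}

// Usage:
//   auto f = new Future!(int)(delegate int() { return slowComputation(); });
//   ... do other work ...
//   int r = f.value();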
Jul 15 2008
bearophile wrote:
> How much time do we have to wait to see some parallel processing features in D? People are getting more and more rabid because they have few ways to use their 2-4 core CPUs. Classic multithreading is useful, but sometimes it's not easy to use correctly.

Grow a pair and use threads. It's not _that_ hard.

> Another easy way to perform multiprocessing is to vectorize: the compiler automatically uses all the cores to evaluate expressions like array1+array2+array3.

Patched GDC supports autovectorization with -ftree-vectorize, although that's single-core.

One of the good things IMHO about D is that its operations are mostly easy to understand, i.e. there's little magic going on. PLEASE don't change that.

> Another way to perform multiprocessing is to add to the D syntax the parallel_for (and a few related constructs to merge results back, etc.) [...]

auto tp = new Threadpool(4);
tp.mt_foreach(Range[4], (int e) { });

> Other ways to use parallel code are now being pushed by Intel, OpenMP, and the hairy but usable CUDA by Nvidia [...]

Please, no hardware specific features. D is x86 dependent enough as it is; it would be a bad idea to add dependencies on _graphics cards_.

IMHO what's really needed is good tools to discover interaction between threads. I'd like a standardized way to grab debug info, like the current back trace of a std.thread.Thread. This could be used to implement fairly sophisticated logging.

Also, what I have requested before: single-instruction function bodies should be able to omit their {}s, to bring them in line with normal loop statements. This sounds like a hack, but which is better?

void test() { synchronized(this) { ... } }

or

void test() synchronized(this) { ... }

--downs
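P.S. For the synchronized case specifically, D already allows synchronized as a member-function attribute, which locks on `this` for the whole body:

class Counter
{
    private int n;

    // Equivalent to wrapping the body in synchronized(this) { ... }
    synchronized void inc()
    {
        n++;
    }
}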
Jul 15 2008
On Tue, 15 Jul 2008 19:34:24 -0400, bearophile wrote:
> How much time do we have to wait to see some parallel processing features in D? People are getting more and more rabid because they have few ways to use their 2-4 core CPUs. Classic multithreading is useful, but sometimes it's not easy to use correctly.

A very short answer; for true parallel processing, 2-4 processors is nothing.

The success of CFLs (Control-Flow Languages) like C, C++, D, Pascal, Perl, Python, BASICs, Cobol, Comal, PL/I, whitespace, malbolge, etc. is that they follow the underlying paradigm of the computer. There have been many efforts to design languages that are implicitly parallel. The most used approach is the DFL (Data-Flow Language) paradigm, and the most well-known of these is definitely VHDL. Others are e.g. NESL and ID. Then there are several languages that are in-between, like functional programming languages (Haskell, Erlang) or reductive languages (like make and Prolog).

Short references:

http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=714561
http://portal.acm.org/citation.cfm?id=359579&dl=GUIDE
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=630241

Especially Hartenstein's articles are good to read if you are trying to understand why we are still using CFL & RASP, and why parallel architectures have failed.

No, the future will not show us more parallelism at the source level. Instead, (a) compilers will start to understand source better, parallelizing the inner kernels of loops automatically, and (b) there will be even more layers between the source we write and the instructions/configurations processors execute. Thus the main purpose of a source language is no longer to follow the underlying paradigm, but productivity - how easy it is for humans to express things; and CFL languages are far from their counterparts in this area. Comparing CFL/DFL at the compiler level, see e.g.

http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/fccm/1995/7086/00/7086toc.xml&DOI=10.1109/FPGA.1995.477423

If I were asked what the way of writing future programs is, I would say it is MPS (Message Passing Systems); refer to e.g. Hewitt's Actor Model (1973). Furthermore, I would predict that processors will start to do low-level reconfiguration, e.g. the RSIP (Reconfigurable Instruction Set Processor) paradigm. Google for GARP and the awesome performance increases it can offer for certain tasks.
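P.S. In D terms the message-passing building block is tiny; a toy sketch (Mailbox is an illustrative name for this post, not a library type):

// A minimal polling mailbox: producers put messages, the owning
// "actor" polls for them; all sharing is hidden behind the lock.
class Mailbox(T)
{
    private T[] queue;

    void put(T msg)
    {
        synchronized (this)
        {
            queue ~= msg;
        }
    }

    bool tryGet(out T msg)
    {
        synchronized (this)
        {
            if (queue.length == 0)
                return false;
            msg = queue[0];
            queue = queue[1 .. $];
            return true;
        }
    }
}

// Each actor owns one Mailbox and communicates only through
// put()/tryGet(), never through shared mutable state.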
Jul 15 2008
== Quote from Markus Koskimies (markus reaaliaika.net)'s article
> If I were asked what the way of writing future programs is, I would say it is MPS (Message Passing Systems); refer to e.g. Hewitt's Actor Model (1973).

I agree completely. MP is easy to comprehend (it's how people naturally operate) and the tech behind it is extremely well established.

I remain skeptical that we'll see a tremendous amount of automatic parallelization of ostensibly procedural code by the interpreter (ie. compiler or VM). For one thing, it complicates debugging tremendously, not to mention the error conditions that such translation can introduce. As a potentially relevant anecdote, after Herb Sutter's presentation on Concur a year or two ago at SDWest I asked him what should happen if two threads of an automatically parallelized loop both throw an exception, given that the C++ spec dictates that having more than one in-flight exception per thread should call terminate(). He dodged my question and turned to talk to someone else, who, interestingly enough, did make an attempt to ensure that Herb understood what I was asking, but to no avail. Implications about Herb aside, I do think this suggests that there are known problems with implicit parallelization that everyone is hoping will just magically disappear. How can one verify the correctness of code that may fail if implicitly parallelized but work if not?

> Furthermore, I would predict that processors will start to do low-level reconfiguration, e.g. the RSIP (Reconfigurable Instruction Set Processor) paradigm. Google for GARP and the awesome performance increases it can offer for certain tasks.

Interestingly, parallel programming is the topic covered by ACM Communications magazine this month, and I believe there is a bit about this sort of hardware parallelism in addition to transactional memory, etc. The articles I've read so far have all been well-reasoned and pretty honest about the benefits and problems with each idea.

Sean
Jul 15 2008
On Wed, 16 Jul 2008 00:53:28 +0000, Sean Kelly wrote:
>> If I were asked what the way of writing future programs is, I would say it is MPS (Message Passing Systems); refer to e.g. Hewitt's Actor Model (1973).
>
> I agree completely. MP is easy to comprehend (it's how people naturally operate) and the tech behind it is extremely well established.

I couldn't agree more. MP is a very natural way for us humans to organize parallel things. But there is even more behind it; the very fundamental reason that keeps computers from becoming PRAM machines is the world around us. It restricts all physical machines, including computers, to a maximum of three spatial dimensions and to inherently neighborhood-connected models; and those are very, very far from ideal PRAM things...

> I remain skeptical that we'll see a tremendous amount of automatic parallelization of ostensibly procedural code by the interpreter (ie. compiler or VM). For one thing, it complicates debugging tremendously, not to mention the error conditions that such translation can introduce.

Another thing I completely agree with. It is not about what would ideally be best; it is the reality that matters. Debugging a highly parallel thing, e.g. FPGA hardware, is a very, very time-consuming task.

> As a potentially relevant anecdote, after Herb Sutter's presentation on Concur [...]

Many highly skillful people are very bound to the great ideas they have in their minds. I'm not an exception :)

>> Furthermore, I would predict that processors will start to do low-level reconfiguration [...]
>
> Interestingly, parallel programming is the topic covered by ACM Communications magazine this month, and I believe there is a bit about this sort of hardware parallelism in addition to transactional memory, etc. The articles I've read so far have all been well-reasoned and pretty honest about the benefits and problems with each idea.

If reconfigurable computers - and more or less distributed computing - do not become the next major processor architectures, I will go to some distant place and be ashamed. They are not ideal or optimal computers, far from it - programming one is very laborious and it is very hard for compilers. But they just work.
Jul 15 2008
On Wed, 16 Jul 2008 01:15:23 +0000, Markus Koskimies wrote:
> Many highly skillful people are very bound to the great ideas they have in their minds. I'm not an exception :)

Many *even* highly etc. etc. :oops:
Jul 15 2008
bearophile wrote:
> How much time do we have to wait to see some parallel processing features in D? [...]

I'm hoping that the new "Pure" stuff Walter is working on will enable the compiler to automatically parallelize things like foreach. It won't be as fast as something that's hand-tuned, but it will be a hell of a lot easier to write.

-Joel
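P.S. The hoped-for pattern would be loops like this, where every iteration calls only a pure function and writes a distinct element (sketched with the D2 pure syntax being worked on at the time):

// A pure function depends only on its arguments and has no side
// effects, so the iterations below are independent of each other
// and could, in principle, be run in any order or in parallel.
pure int square(int x)
{
    return x * x;
}

void demo(int[] a, int[] b)
{
    assert(a.length == b.length);
    foreach (i, x; a)
        b[i] = square(x);   // each iteration touches only b[i]
}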
Jul 15 2008
Agreed, we absolutely need an OpenMP (http://www.openmp.org) implementation for D.

bearophile wrote:
> How much time do we have to wait to see some parallel processing features in D? [...]
Jul 16 2008