digitalmars.D.learn - Looking for writing parallel foreach kind of statement for nested
- Sparsh Mittal (15/15) Feb 09 2013 I saw foreach and parallel commands on
- FG (4/16) Feb 09 2013 Huh?
- Sparsh Mittal (6/8) Feb 09 2013 Thanks a lot. This one divides the x-cross-y region by rows.
- FG (12/15) Feb 09 2013 Think again if you need that. Things start getting pretty ugly. :)
- Philippe Sigaud (3/11) Feb 10 2013 Indeed... Sparsh, any reason you need the calculation to be done on 2d
- Sparsh Mittal (10/15) Feb 10 2013 For my problem, original answer was fine, since parallel
I saw foreach and parallel commands on http://dlang.org/phobos/std_parallelism.html I have nested for-loops as follows: for(int i=1; i< N; i++) { for(int j=1; j < M; j++) { func(i,j); } } Here func(i,j) is such that there is no dependency or communication b/w different (i,j) tuple values. Thus, problem is very simple. My question is how can I write a parallel foreach loop (if at all) for such problems in D. I would be grateful for your help.
Feb 09 2013
On 2013-02-09 23:32, Sparsh Mittal wrote:I saw foreach and parallel commands on http://dlang.org/phobos/std_parallelism.html I have nested for-loops as follows: for(int i=1; i< N; i++) { for(int j=1; j < M; j++) { func(i,j); } } Here func(i,j) is such that there is no dependency or communication b/w different (i,j) tuple values. Thus, problem is very simple.Huh? for(int i=1; i< N; i++) <==> foreach(i; iota(1, N)) so you can use: foreach(i; parallel(iota(1, N))) { ... }
Feb 09 2013
for(int i=1; i< N; i++) <==> foreach(i; iota(1, N)) so you can use: foreach(i; parallel(iota(1, N))) { ... }Thanks a lot. This one divides the x-cross-y region by rows. Suppose dimension is 8*12 and 4 parallel threads are there, so current method is dividing by 2*12 to each of 4 threads. The current reply answers my question, but I was just curious. Can we have a method which divides the 2d region as follows: 8*12 divided into 4*6 to each of 4 threads.
Feb 09 2013
On 2013-02-10 02:22, Sparsh Mittal wrote:The current reply answers my question, but I was just curious. Can we have a method which divides the 2d region as follows: 8*12 divided into 4*6 to each of 4 threads.Think again if you need that. Things start getting pretty ugly. :) const uint parts = 2; // in each dimension foreach (block; parallel(iota(parts*parts))) { uint p1 = block / parts; foreach (i; (p1*N/parts)..(p1==parts-1 ? N : (p1+1)*N/parts)) { uint p2 = block % parts; foreach (j; (p2*M/parts)..(p2==parts-1 ? M : (p2+1)*M/parts)) { func(i, j, p1, p2, block); } } }
Feb 09 2013
On Sun, Feb 10, 2013 at 8:45 AM, FG <home fgda.pl> wrote:On 2013-02-10 02:22, Sparsh Mittal wrote:Indeed... Sparsh, any reason you need the calculation to be done on 2d blocks instead of independent slots?The current reply answers my question, but I was just curious. Can we have a method which divides the 2d region as follows: 8*12 divided into 4*6 to each of 4 threads.Think again if you need that. Things start getting pretty ugly. :)
Feb 10 2013
Yes, it is not at all intuitive.Think again if you need that. Things start getting pretty ugly. :)Indeed... Sparsh, any reason you need the calculation to be done on 2d blocks instead of independent slots?For my problem, original answer was fine, since parallel calculations are not at all dependent on each others. Sometime there are calculations to be done on 2d grid, where calculations are not uniform across the grid(see paper "An overview of parallel visualisation methods for Mandelbrot and Julia sets" where you can see Julia sets parallelization), and hence, dividing in a particular way leads to better load-balancing than others. Thanks a lot for your answer and time.
Feb 10 2013