www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Looking for writing parallel foreach kind of statement for nested

reply "Sparsh Mittal" <sparsh0mittal gmail.com> writes:
I saw foreach and parallel commands on
http://dlang.org/phobos/std_parallelism.html

I have nested for-loops as follows:

for(int i=1; i< N; i++)
  {
   for(int j=1; j < M; j++)
    {
      func(i,j);
    }
  }

Here func(i,j) is such that there is no dependency or 
communication b/w different (i,j) tuple values. Thus, problem is 
very simple.

My question is how can I write a parallel foreach loop (if at 
all) for such problems in D. I would be grateful for your help.
Feb 09 2013
parent reply FG <home fgda.pl> writes:
On 2013-02-09 23:32, Sparsh Mittal wrote:
 I saw foreach and parallel commands on
 http://dlang.org/phobos/std_parallelism.html

 I have nested for-loops as follows:

 for(int i=1; i< N; i++)
   {
    for(int j=1; j < M; j++)
     {
       func(i,j);
     }
   }

 Here func(i,j) is such that there is no dependency or communication b/w
 different (i,j) tuple values. Thus, problem is very simple.
Huh? for(int i=1; i< N; i++) <==> foreach(i; iota(1, N)) so you can use: foreach(i; parallel(iota(1, N))) { ... }
Feb 09 2013
parent reply "Sparsh Mittal" <sparsh0mittal gmail.com> writes:
 for(int i=1; i< N; i++)    <==>    foreach(i; iota(1, N))
 so you can use:  foreach(i; parallel(iota(1, N))) { ... }
Thanks a lot. This one divides the x-cross-y region by rows. Suppose dimension is 8*12 and 4 parallel threads are there, so current method is dividing by 2*12 to each of 4 threads. The current reply answers my question, but I was just curious. Can we have a method which divides the 2d region as follows: 8*12 divided into 4*6 to each of 4 threads.
Feb 09 2013
parent reply FG <home fgda.pl> writes:
On 2013-02-10 02:22, Sparsh Mittal wrote:
 The current reply answers my question, but I was just curious. Can we have a
 method which divides the 2d region as follows: 8*12 divided into 4*6 to each of
 4 threads.
Think again if you need that. Things start getting pretty ugly. :) const uint parts = 2; // in each dimension foreach (block; parallel(iota(parts*parts))) { uint p1 = block / parts; foreach (i; (p1*N/parts)..(p1==parts-1 ? N : (p1+1)*N/parts)) { uint p2 = block % parts; foreach (j; (p2*M/parts)..(p2==parts-1 ? M : (p2+1)*M/parts)) { func(i, j, p1, p2, block); } } }
Feb 09 2013
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Feb 10, 2013 at 8:45 AM, FG <home fgda.pl> wrote:
 On 2013-02-10 02:22, Sparsh Mittal wrote:
 The current reply answers my question, but I was just curious. Can we have
 a
 method which divides the 2d region as follows: 8*12 divided into 4*6 to
 each of
 4 threads.
Think again if you need that. Things start getting pretty ugly. :)
Indeed... Sparsh, any reason you need the calculation to be done on 2d blocks instead of independent slots?
Feb 10 2013
parent "Sparsh Mittal" <sparsh0mittal gmail.com> writes:
 Think again if you need that. Things start getting pretty 
 ugly. :)
Yes, it is not at all intuitive.
 Indeed... Sparsh, any reason you need the calculation to be 
 done on 2d
 blocks instead of independent slots?
For my problem, original answer was fine, since parallel calculations are not at all dependent on each others. Sometime there are calculations to be done on 2d grid, where calculations are not uniform across the grid(see paper "An overview of parallel visualisation methods for Mandelbrot and Julia sets" where you can see Julia sets parallelization), and hence, dividing in a particular way leads to better load-balancing than others. Thanks a lot for your answer and time.
Feb 10 2013