www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Parallelization of a large array

reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
Hi.
How to parallelize a large array to check for the presence of an 
element matching the value with the data?

std.stdio;
std.algorithm;
std.parallelism;

void main() {

	int[] a = new int[1000000];

	foreach (i, ref elem; a)
		elem = i;

	/*if (find(parallel(a), 895639).length != 0)
		writeln("Yes");
	else
		writeln("No");*/
}
Mar 10 2015
next sibling parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
 Hi.
 How to parallelize a large array to check for the presence of 
 an element matching the value with the data?
Here's a simple method (warning: has pitfalls): import std.stdio; import std.parallelism; void main() { int[] a = new int[1000000]; foreach (i, ref elem; a) elem = cast(int)i; bool found; foreach (elem; a.parallel) if (elem == 895639) found = true; if (found) writeln("Yes"); else writeln("No"); }
Mar 10 2015
parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 10 March 2015 at 21:15:17 UTC, safety0ff wrote:
 On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
 Hi.
 How to parallelize a large array to check for the presence of 
 an element matching the value with the data?
Here's a simple method (warning: has pitfalls): import std.stdio; import std.parallelism; void main() { int[] a = new int[1000000]; foreach (i, ref elem; a) elem = cast(int)i; bool found; foreach (elem; a.parallel) if (elem == 895639) found = true; if (found) writeln("Yes"); else writeln("No"); }
Thanks.
Mar 10 2015
parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 10 March 2015 at 21:27:42 UTC, Dennis Ritchie wrote:
 Thanks.
No, it does not suit me, because of the parallel array in a foreach loop there is no break. import std.stdio; import std.algorithm; import std.parallelism; void main() { int b = 2; auto a = [1, 2, 2, 3]; if (find(a, b).length != 0) writeln("Yes_"); foreach (elem; a.parallel) if (elem == b) writeln("Yes"); // prints "Yes" twice }
Mar 10 2015
next sibling parent "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 10 March 2015 at 22:11:57 UTC, Dennis Ritchie wrote:
 No, it does not suit me, because of the parallel array in a 
 foreach loop there is no break.
I already understood everything: found = true;
Mar 10 2015
prev sibling parent reply "Meta" <jared771 gmail.com> writes:
On Tuesday, 10 March 2015 at 22:11:57 UTC, Dennis Ritchie wrote:
 On Tuesday, 10 March 2015 at 21:27:42 UTC, Dennis Ritchie wrote:
 Thanks.
No, it does not suit me, because of the parallel array in a foreach loop there is no break. import std.stdio; import std.algorithm; import std.parallelism; void main() { int b = 2; auto a = [1, 2, 2, 3]; if (find(a, b).length != 0) writeln("Yes_"); foreach (elem; a.parallel) if (elem == b) writeln("Yes"); // prints "Yes" twice }
Just add a condition variable. import std.stdio; import std.algorithm; import std.parallelism; void main() { int b = 2; auto a = [1, 2, 2, 3]; if (find(a, b).length != 0) writeln("Yes_"); auto found = false; foreach (elem; a.parallel) if (!found && elem == b) { writeln("Yes"); found = true; } } This program pints: Yes_ Yes
Mar 10 2015
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/10/2015 03:16 PM, Meta wrote:

 Just add a condition variable.

 import std.stdio;
 import std.algorithm;
 import std.parallelism;

 void main() {

      int b = 2;

      auto a = [1, 2, 2, 3];

      if (find(a, b).length != 0)
          writeln("Yes_");

      auto found = false;

      foreach (elem; a.parallel)
          if (!found && elem == b)
          {
              writeln("Yes");
              found = true;
I thought about the same solution but then realized that it's a race condition, which needs to be taken care of. It's true that the value always changes from false to true but still...
          }
 }
Ali
Mar 10 2015
parent "Meta" <jared771 gmail.com> writes:
On Tuesday, 10 March 2015 at 22:37:29 UTC, Ali Çehreli wrote:
 On 03/10/2015 03:16 PM, Meta wrote:

 Just add a condition variable.

 import std.stdio;
 import std.algorithm;
 import std.parallelism;

 void main() {

      int b = 2;

      auto a = [1, 2, 2, 3];

      if (find(a, b).length != 0)
          writeln("Yes_");

      auto found = false;

      foreach (elem; a.parallel)
          if (!found && elem == b)
          {
              writeln("Yes");
              found = true;
I thought about the same solution but then realized that it's a race condition, which needs to be taken care of. It's true that the value always changes from false to true but still...
          }
 }
Ali
Yes, it's a non-deterministic solution, but I don't think it matters all that much as we don't care which element fulfills the criterion, just that such an element exists. That's my reasoning, anyway. I know next to nothing about parallel processing.
Mar 10 2015
prev sibling next sibling parent "anonymous" <anonymous example.com> writes:
On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
 Hi.
 How to parallelize a large array to check for the presence of 
 an element matching the value with the data?

 std.stdio;
 std.algorithm;
 std.parallelism;
You forgot a couple "import"s here.
 void main() {

 	int[] a = new int[1000000];

 	foreach (i, ref elem; a)
 		elem = i;
Type mismatch here. i is a size_t, but elem is an int.
 	/*if (find(parallel(a), 895639).length != 0)
 		writeln("Yes");
 	else
 		writeln("No");*/
 }
I guess you'd have to write your own "find". Since std.algorithm.find is just linear search, that shouldn't be hard. Something like this: bool found = false; foreach(x; parallel(a)) { if(x == 895639) { found = true; /* Maybe figure out how to break all parallel loops. */ } } std.algorithm.find would work on mere input ranges, and it would return the tail of the range and not just a bool. Both of those don't make sense with a parallel search, though. Also, with a trivial predicate like integer equality, parallelization might not buy you anything.
Mar 10 2015
prev sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/10/2015 01:41 PM, Dennis Ritchie wrote:
 Hi.
 How to parallelize a large array to check for the presence of an element
 matching the value with the data?

 std.stdio;
 std.algorithm;
 std.parallelism;

 void main() {

      int[] a = new int[1000000];

      foreach (i, ref elem; a)
          elem = i;

      /*if (find(parallel(a), 895639).length != 0)
          writeln("Yes");
      else
          writeln("No");*/
 }
It is possible by accessing the actual range by chunks: import std.stdio; import std.algorithm; import std.parallelism; import std.range; import std.conv; void main() { const size_t elementCount = 895640; int[] a = iota(elementCount) .map!(i => i.to!int) .array; const chunkSize = a.length / taskPool.size; auto chunks = a.chunks(chunkSize); bool[] results = chunks.taskPool.map!(chunk => chunk.canFind(895639)); writeln(results.canFind(true) ? "Yes" : "No"); } We can hope to make it simpler by taking advantage of parallel map but it requires a static local function or a global function (a lambda cannot be used): static bool canFindIt(Range)(Range range) { return range.canFind(895639); } auto results = taskPool.map!canFindIt(chunks); Ali
Mar 10 2015
next sibling parent "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 10 March 2015 at 22:34:34 UTC, Ali Çehreli wrote:
 It is possible by accessing the actual range by chunks:

 import std.stdio;
 import std.algorithm;
 import std.parallelism;
 import std.range;
 import std.conv;

 void main()
 {
     const size_t elementCount = 895640;
     int[] a = iota(elementCount)
               .map!(i => i.to!int)
               .array;

     const chunkSize = a.length / taskPool.size;
     auto chunks = a.chunks(chunkSize);
     bool[] results = chunks.taskPool.map!(chunk => 
 chunk.canFind(895639));

     writeln(results.canFind(true) ? "Yes" : "No");
 }

 We can hope to make it simpler by taking advantage of parallel 
 map but it requires a static local function or a global 
 function (a lambda cannot be used):

     static bool canFindIt(Range)(Range range)
     {
         return range.canFind(895639);
     }

     auto results = taskPool.map!canFindIt(chunks);
Thanks.
Mar 10 2015
prev sibling next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/10/2015 03:34 PM, Ali Çehreli wrote:

 It is possible by accessing the actual range by chunks:
Sorry, I posted a jumbled program that does not compile. The following is the program that does NOT use taskPool.map. I am also changing the name of a variable because the local 'chunks' looked like std.range.chunks. import std.stdio; import std.algorithm; import std.parallelism; import std.range; import std.conv; void main() { const size_t elementCount = 895640; int[] a = iota(elementCount) .map!(i => i.to!int) .array; const chunkSize = a.length / taskPool.size; auto ch = a.chunks(chunkSize); bool[] results = new bool[ch.length]; foreach (i, chunk; ch.parallel) { results[i] = chunk.canFind(895639); } writeln(results.canFind(true) ? "Yes" : "No"); } Ali
Mar 10 2015
parent "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 10 March 2015 at 22:43:08 UTC, Ali Çehreli wrote:
 The following is the program that does NOT use taskPool.map. I 
 am also changing the name of a variable because the local 
 'chunks' looked like std.range.chunks.

 import std.stdio;
 import std.algorithm;
 import std.parallelism;
 import std.range;
 import std.conv;

 void main()
 {
     const size_t elementCount = 895640;
     int[] a = iota(elementCount)
               .map!(i => i.to!int)
               .array;

     const chunkSize = a.length / taskPool.size;
     auto ch = a.chunks(chunkSize);
     bool[] results = new bool[ch.length];

     foreach (i, chunk; ch.parallel) {
         results[i] = chunk.canFind(895639);
     }

     writeln(results.canFind(true) ? "Yes" : "No");
 }
Thanks. This code will help me.
Mar 10 2015
prev sibling parent reply Russel Winder via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Tue, 2015-03-10 at 15:34 -0700, Ali =C3=87ehreli via Digitalmars-d-learn
wrote:
[=E2=80=A6]
 We can hope to make it simpler by taking advantage of parallel map but=
=20
 it requires a static local function or a global function (a lambda=20
 cannot be used):
[=E2=80=A6] Is there a reason for excluding lambdas, it a priori sounds like a bug.
=20
--=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Mar 11 2015
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/11/2015 01:07 AM, Russel Winder via Digitalmars-d-learn wrote:
 On Tue, 2015-03-10 at 15:34 -0700, Ali Çehreli via Digitalmars-d-learn
 wrote:
 […]
 We can hope to make it simpler by taking advantage of parallel map but
 it requires a static local function or a global function (a lambda
 cannot be used):
[…] Is there a reason for excluding lambdas, it a priori sounds like a bug.
This is a current language implementation issue because there is just one context pointer. Parallel map is a member function that takes its function as an 'alias' template parameter: https://issues.dlang.org/show_bug.cgi?id=5710 Ali
Mar 11 2015
parent Russel Winder via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Wed, 2015-03-11 at 07:10 -0700, Ali =C3=87ehreli via Digitalmars-d-learn
wrote:
 On 03/11/2015 01:07 AM, Russel Winder via Digitalmars-d-learn wrote:
 On Tue, 2015-03-10 at 15:34 -0700, Ali =C3=87ehreli via Digitalmars-d-l=
earn
 wrote:
 [=E2=80=A6]
 We can hope to make it simpler by taking advantage of parallel map but
 it requires a static local function or a global function (a lambda
 cannot be used):
[=E2=80=A6] Is there a reason for excluding lambdas, it a priori sounds like a bug.
=20 This is a current language implementation issue because there is just=20 one context pointer. Parallel map is a member function that takes its=20 function as an 'alias' template parameter: =20 https://issues.dlang.org/show_bug.cgi?id=3D5710
Sadly the conversation seems to summarized as: There is a major problem. This problem must be fixed. None of the solution proposed so far are perfect in all cases. Let's leave the mess as is not even trying to fix at least some of the worst problems. No more debate. I haven't thought about this much, but my initial reaction to this is that a language that has functions and closures/delegates where a closure/delegate is not substitutable for a function, has real problems. --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Mar 11 2015