
digitalmars.D.learn - How to store data when using parallel processing

Andrew Chapman <nycran gmail.com> writes:
Hi all, just wanting some advice on parallel processing and 
specifically how to deal with access violations.

I am reading a list of words from a file like this:

auto fileHandle = File("wordlist.txt", "r");

string word;
string[] words;
string[ulong] hashMap;

while ((word = fileHandle.readln()) !is null) {
	words ~= word;
}

Then I'm doing some processing on the words.  I want to make this 
run as quickly as possible so I am doing the processing across 
the cores of my CPU like this:

foreach (thisWord; parallel(words)) {
	string wordLower = thisWord.strip().toLower();
	ulong key = keyMaker.createKeyForWord(wordLower);

	// hashMap[key] = wordLower;
}

The question is: in the above loop, how can I make the commented-out 
line work without causing an access violation?  Do I need to 
use a different data structure?  Or rethink what I'm doing?

Thanks in advance.
Andrew.
Aug 26 2017
Jonathan M Davis via Digitalmars-d-learn writes:
On Sunday, August 27, 2017 00:26:33 Andrew Chapman via Digitalmars-d-learn 
wrote:
 [...]
std.parallelism is designed to work on stuff that's truly parallel, whereas adding a value to an AA is not. For multiple threads to be able to access it, it needs to be protected by a mutex or a synchronized block, in which case the assignment is no longer parallel. Whether that matters much depends on how expensive the rest of the loop body is. If it's cheap enough, then the threads will all just end up blocking on the mutex, effectively making the code serial and making the parallelism pointless, whereas if it's expensive enough that the threads spend most of their time doing the rest of the loop, then you can get an increase in performance over just doing it in one thread.

It wouldn't surprise me if simply doing all of the work in the first loop and not creating the words array in the first place would be faster than trying to parallelize the second loop, but I don't know. You'd have to test it and see.

But std.parallelism cheats with regard to shared (which is part of why it's @system), hiding the fact that you're dealing with multiple threads, but all of the issues when using shared data remain (e.g. needing to use mutexes to protect against data races). std.parallelism just normally manages to avoid that problem by operating on separate pieces of data simultaneously rather than operating on the same data on multiple threads at the same time, whereas what you're doing with regard to the result involves operating on the same data from multiple threads at the same time, which doesn't work without mutexes.

An alternative would be to have a separate AA per thread and then combine them after the loop, but that requires checking which thread you're on so that you grab the correct one in a particular iteration of the loop. I assume that std.parallelism provides a reasonably easy way to do that, but I haven't done much with it, so I don't know. Worst case, you can probably go off of the thread ID using core.thread.
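A minimal sketch of the separate-AA-per-thread idea, assuming `taskPool.workerLocalStorage` from std.parallelism is used to hand each worker its own AA. `createKeyForWord` is a hypothetical FNV-1a stand-in for the `keyMaker.createKeyForWord` call in the question, and `buildMapPerWorker` is an illustrative name only.

```d
import std.parallelism : parallel, taskPool;
import std.string : strip;
import std.uni : toLower;

// Hypothetical stand-in for keyMaker.createKeyForWord (FNV-1a hash).
ulong createKeyForWord(string word)
{
    ulong h = 14695981039346656037UL;
    foreach (char c; word)
    {
        h ^= c;
        h *= 1099511628211UL;
    }
    return h;
}

string[ulong] buildMapPerWorker(string[] words)
{
    // One AA slot per worker thread, plus one for the calling thread.
    auto localMaps = taskPool.workerLocalStorage!(string[ulong])();

    foreach (thisWord; parallel(words))
    {
        string wordLower = thisWord.strip().toLower();
        ulong key = createKeyForWord(wordLower);
        // get() returns the current thread's own AA by reference,
        // so no lock is needed for this write.
        localMaps.get()[key] = wordLower;
    }

    // The parallel loop has finished; merge the results on one thread.
    string[ulong] hashMap;
    foreach (localMap; localMaps.toRange)
    {
        foreach (key, value; localMap)
            hashMap[key] = value;
    }
    return hashMap;
}

void main()
{
    auto map = buildMapPerWorker(["Apple\n", "banana\n", "  Cherry \n"]);
    assert(map.length == 3);
}
```

The merge step is serial, so this only pays off when the per-word work is expensive relative to copying the entries once more. If two workers produce the same key, whichever AA is merged last wins.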
std.parallelism may offer other ways to accomplish this, but I'd have to study it to be sure. Either way, fundamentally, you're either going to have to protect the hash table with a mutex, in which case the writes will be serialized even if the data creation is parallelized, or you have to store the data from each thread separately while the threads are running and then combine the data when the threads are done.

- Jonathan M Davis
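A minimal sketch of the mutex-protected variant, assuming a `synchronized` statement on a dedicated lock object. `createKeyForWord` is again a hypothetical FNV-1a stand-in for `keyMaker.createKeyForWord`, and `buildMapLocked` is an illustrative name only.

```d
import std.parallelism : parallel;
import std.string : strip;
import std.uni : toLower;

// Hypothetical stand-in for keyMaker.createKeyForWord (FNV-1a hash).
ulong createKeyForWord(string word)
{
    ulong h = 14695981039346656037UL;
    foreach (char c; word)
    {
        h ^= c;
        h *= 1099511628211UL;
    }
    return h;
}

string[ulong] buildMapLocked(string[] words)
{
    string[ulong] hashMap;
    // Every write to the shared AA goes through this one lock object.
    auto lock = new Object;

    foreach (thisWord; parallel(words))
    {
        // These two lines run in parallel on the worker threads.
        string wordLower = thisWord.strip().toLower();
        ulong key = createKeyForWord(wordLower);

        // Only one thread at a time may execute this block.
        synchronized (lock)
        {
            hashMap[key] = wordLower;
        }
    }
    return hashMap;
}

void main()
{
    auto map = buildMapLocked(["Apple\n", "banana\n", "  Cherry \n"]);
    assert(map.length == 3);
}
```

With work this cheap per word, the threads are likely to spend most of their time waiting on the lock, so it is worth timing this against a plain single-threaded loop before keeping it.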
Aug 26 2017
Andrew Chapman <nycran gmail.com> writes:
On Sunday, 27 August 2017 at 01:58:04 UTC, Jonathan M Davis wrote:
 [...]
Thanks Jonathan, that makes sense. As it turns out, the Mutex approach actually makes things slower. In this case I believe trying to use multiple cores isn't worth it. Cheers.
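For comparison, a minimal sketch of the single-threaded version suggested earlier in the thread, which does all the work in the reading loop and never builds the words array. It assumes `File.byLineCopy` for reading; `createKeyForWord` is a hypothetical FNV-1a stand-in for `keyMaker.createKeyForWord`, and `buildMapSerial` and the temporary file name are illustrative only.

```d
import std.stdio : File;
import std.string : strip;
import std.uni : toLower;

// Hypothetical stand-in for keyMaker.createKeyForWord (FNV-1a hash).
ulong createKeyForWord(string word)
{
    ulong h = 14695981039346656037UL;
    foreach (char c; word)
    {
        h ^= c;
        h *= 1099511628211UL;
    }
    return h;
}

string[ulong] buildMapSerial(string path)
{
    string[ulong] hashMap;
    // byLineCopy yields a fresh string per line, so it is safe to store.
    foreach (line; File(path, "r").byLineCopy)
    {
        string wordLower = line.strip().toLower();
        hashMap[createKeyForWord(wordLower)] = wordLower;
    }
    return hashMap;
}

void main()
{
    import std.file : remove, tempDir, write;
    import std.path : buildPath;

    // Write a small throwaway word list so the sketch runs on its own.
    auto path = buildPath(tempDir, "wordlist_sketch.txt");
    write(path, "Apple\nbanana\n  Cherry \n");
    scope (exit) remove(path);

    auto map = buildMapSerial(path);
    assert(map.length == 3);
}
```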
Aug 27 2017