
digitalmars.D.learn - Passing large or complex data structures to threads

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hello all,

Are there any recommended strategies for passing large or complex data
structures (particularly reference types) to threads?

For the purpose of this discussion we can assume that it's read-only data, so if
we're talking about just an array (albeit perhaps a large one) I guess just
passing an .idup copy would be best.  However, the practical situation I have is
a data structure of the form,

	Tuple!(size_t, size_t)[][]

... which I _could_ .idup, but it's a little bit of a hassle to do so, so I'm
wondering if there are alternative ways or suggestions.
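(By .idup-ing I mean something like the following untested sketch, where deepIdup is just a helper name I'm inventing:)

import std.typecons;

alias Elem = Tuple!(size_t, size_t);

// Tuples of size_t contain no indirections, so each inner row can be
// .idup'd directly; the hassle is just doing it row by row.
immutable(Elem)[][] deepIdup(const Elem[][] data)
{
    immutable(Elem)[][] result;
    foreach (row; data)
        result ~= row.idup;
    return result;
}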

The actual way I found to "solve" this problem (for now) was that, since the
data in question is loaded from a file, I just got each thread to load the file
separately.  However, this runs into a different problem -- the function(s)
required to load and interpret the file may vary, and may take different
input, which means manually rewriting the thread function whenever it needs
to be changed (a tolerable but annoying solution).  I guess I could solve
this by passing the thread a delegate, or maybe employ mixins ... ?
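(The delegate idea would be something like this untested sketch -- 'worker' and 'loadData' are made-up names, and note that std.concurrency.spawn wants a function pointer or shared/immutable arguments rather than an arbitrary delegate:)

import std.concurrency;
import std.typecons;

alias Data = Tuple!(size_t, size_t)[][];

// Hypothetical loader passed in as a function pointer, so each thread
// can load and interpret the file itself without hard-coding the logic
// in the thread function.
void worker(string filename, immutable(Data) function(string) loader)
{
    immutable(Data) data = loader(filename);
    // ... work on data ...
}

// usage: spawn(&worker, "input.dat", &loadData);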

Anyway, I thought I'd throw the question open in case others can suggest better
ideas!

Thanks & best wishes,

     -- Joe
May 24 2013
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 05/24/2013 06:26 AM, Joseph Rushton Wakeling wrote:

 Are there any recommended strategies for passing large or complex data
 structures (particularly reference types) to threads?
std.concurrency works with shared data.
 For the purpose of this discussion we can assume that it's read-only 
 data
The following simple example uses mutable data but it should work with 
'const' too.

import std.stdio;
import std.concurrency;
import std.typecons;
import core.thread;

alias DataElement = Tuple!(size_t, size_t);
alias DataRow = DataElement[];
alias Data = DataRow[];

enum size_t totalRows = 4;

void func(Tid owner, shared(Data) data, size_t rowId)
{
    foreach (ref element; data[rowId]) {
        element[0] *= 10;
        element[1] *= 10;
    }
}

shared(Data) makeData()
{
    shared(Data) data;

    foreach (size_t row; 0 .. totalRows) {
        shared(DataRow) dataRow;

        foreach (size_t col; 0 .. 10) {
            dataRow ~= tuple(row, col);
        }

        data ~= dataRow;
    }

    return data;
}

void main()
{
    shared(Data) data = makeData();
    writeln("before: ", data);

    foreach (rowId, row; data) {
        // Instead of 'data' and 'rowId', the child could take
        // its own row (not tested)
        spawn(&func, thisTid, data, rowId);
    }

    thread_joinAll();

    writeln("after : ", data);
}

Ali
May 24 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/24/2013 05:59 PM, Ali Çehreli wrote:
 The following simple example uses mutable data but it should work 
 with 'const' too.
Limiting ourselves to read-only, won't there still be a slowdown caused by multiple threads trying to access the same data? The particular case I have will involve continuous reading from the data concerned.
May 26 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 26 May 2013 at 12:08:41 UTC, Joseph Rushton Wakeling 
wrote:
 On 05/24/2013 05:59 PM, Ali Çehreli wrote:
 The following simple example uses mutable data but it should 
 work with 'const' too.
 Limiting ourselves to read-only, won't there still be a slowdown 
 caused by multiple threads trying to access the same data? The 
 particular case I have will involve continuous reading from the 
 data concerned.
Not necessarily. It really depends on the memory access patterns of the 
algorithms, the number of threads, the size of the data, the 
number/size/hierarchy of CPU caches, and the number of CPUs.

Hard and fast rule: if your threads are reading data that is distant 
(i.e. one thread reading from around the beginning of a long array, the 
second reading from the end) then the fact that they happen to be the 
same data "object" is irrelevant.

Also, remember that in the short term the CPUs are all keeping their 
own independent copies of the relevant parts of the data in their 
caches anyway.
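For example (untested sketch), giving each child only its own immutable 
row keeps the threads' working sets well apart in memory:

import std.concurrency;
import std.typecons;
import core.thread;

alias Elem = Tuple!(size_t, size_t);

// Each thread receives only the row it works on; immutable slices
// are freely sendable via std.concurrency.
void worker(immutable(Elem)[] row)
{
    size_t sum;
    foreach (e; row)
        sum += e[0] + e[1];
    // this thread only ever reads its own row
}

void main()
{
    immutable(Elem)[][] data;
    foreach (size_t r; 0 .. 4) {
        Elem[] row;
        foreach (size_t c; 0 .. 10)
            row ~= tuple(r, c);
        data ~= row.idup;
    }

    foreach (row; data)
        spawn(&worker, row);

    thread_joinAll();
}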
May 26 2013