digitalmars.D.learn - Threading methodology
- Frustrated (43/43) Dec 07 2013 I have to process n arrays in some partial order. Instead of all
- Marco Leise (17/67) Dec 08 2013 Wait, what partial order and how is it relevant? Who is "all"
I have to process n arrays in some partial order. Instead of all working only on the n arrays and reusing them, if I "duplicate" them (effectively write once, read many), does that make things simpler and allow threading to be used more efficiently? Basically, instead of having to worry about threads waiting on other threads while they are writing the data, I can just have each thread write to its own duplicate array (although hopefully I won't have to duplicate the buffer, as that should not be necessary?). This should then make things much easier to deal with, because there is essentially no write contention. If I'm not mistaken, this is basically the idea of TLS? (The read-many, write-once idea, so that one doesn't have to lock memory, which stalls other threads in some cases and/or requires expensive locks?)

The case I gave above is a little more simplified than the actual issue, but the idea is simply that having each thread write to its own buffer reduces inter-thread issues and increases performance. One issue I have is how to combine the data back at the end.

In fact, the real case is basically n arrays. Each array is processed to create m arrays, which are processed to create l arrays, and so on. Not all processing of each array takes the same time, and I would like to multithread the overall processing by allowing threads to pick which (sub)arrays need processing and process them, but in their own space, and at the end send them back to the main thread. (One thread may need access to another's buffer, but only for reading, so it shouldn't be a problem.) This uses a lot more memory, but even if there are N threads and n arrays with N > n, no thread will be wasted. (In my case I would have n*m*l arrays, which will almost always be larger than N, so threads will always be doing something except towards the end.)

Another way to see the problem is to think of a digraph where each node processes all the nodes connected to it. All independent nodes must be processed first, and the work then makes its way up to the most dependent nodes. Every node that is processed writes to a new buffer instead of writing into the same buffer, which would lock that buffer and stall all other threads that may be trying to read from it (which, in my mind, is the key reason this method is better, except that it requires more memory). I will of course mark nodes as processed or not, so threads don't work on unprocessed buffers, and this allows dependent nodes to start up once all their dependencies have been processed.

Does any of this make sense?
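Roughly, the scheme I have in mind looks like this (an untested sketch; the Node type, processWave, the buffer size, and the per-node "work" are placeholders for the real processing, not something I already have):

import std.parallelism : parallel;

// Illustrative node type: each node owns its own output buffer, so a
// worker never writes to memory another thread might be writing to
// (write once, read many).
struct Node
{
    size_t[] deps;   // indices of the nodes this one reads from
    double[] output; // written only by the thread processing this node
    bool     done;   // flipped by the main thread once the node is finished
}

// Process one "wave" of nodes whose dependencies are all done.
// Each iteration reads other nodes' finished buffers and writes only to
// its own node's buffer, so there is no write contention to lock against.
void processWave(Node[] nodes, size_t[] ready)
{
    foreach (i; parallel(ready))
    {
        auto n = &nodes[i];
        n.output = new double[](1024);       // this node's private buffer
        foreach (d; n.deps)
            foreach (j, v; nodes[d].output)  // read-only access to dependencies
                n.output[j % n.output.length] += v;  // placeholder "processing"
    }
    foreach (i; ready)         // only the main thread marks nodes done,
        nodes[i].done = true;  // after the whole wave has finished
}

The main thread would repeatedly collect the nodes whose dependencies are all marked done and call processWave on them, until every node is done.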
Dec 07 2013
On Sat, 07 Dec 2013 17:53:06 +0100, "Frustrated" <c1514843 drdrb.com> wrote:

> I have to process n arrays in some partial order. Instead of all
> working only on the n arrays and reusing them, [...]

Wait, what partial order, and how is it relevant? Who is "all" in "all working"? Why "only" the n arrays? I thought those are all of them.

> if I "duplicate" them (effectively write once, read many), does that
> make things simpler and allow threading to be used more efficiently?
> [...] If I'm not mistaken, this is basically the idea of TLS?

The idea of TLS in D, as I understand it, is that your global variables are not shared between threads by default, as they are in C. So, as you point out, you need not lock them, since each thread has its own copy.

> The case I gave above is a little more simplified than the actual
> issue, but the idea is simply that having each thread write to its
> own buffer reduces inter-thread issues and increases performance.
> [...] Does any of this make sense?

It is a little clearer now, but I didn't fully understand it yet. The sub-arrays need to be processed before the parent arrays can be completed? When does a thread need access to another thread's buffer? What if you have only 1 CPU core and there is only one thread and one buffer? Does your algorithm break? :p

-- 
Marco
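A minimal illustration of the thread-local default Marco describes (an untested sketch; the variable names and the explicit core.thread usage are just for illustration, not from the posts above):

import std.stdio : writeln;
import core.thread : Thread;

int tlsCounter;              // module-level variables are thread-local by default in D
__gshared int sharedCounter; // __gshared gives the single, C-style shared instance

void worker()
{
    tlsCounter    = 42;      // writes this thread's own copy only
    sharedCounter = 42;      // writes the one instance every thread sees
}

void main()
{
    auto t = new Thread(&worker);
    t.start();
    t.join();
    writeln(tlsCounter, " ", sharedCounter); // prints "0 42"
}

Because tlsCounter is thread-local, the worker's write never touches main's copy, so no lock is needed for it; sharedCounter is the case where synchronization would actually matter.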
Dec 08 2013