digitalmars.D - Unsynchronized int access from threads
- H. S. Teoh (16/16) Jun 18 2020 Suppose I have an int[] which may contain some zeroes, and 2 functions,
- Paul Backus (5/21) Jun 18 2020 As long as reads and writes of `int` are atomic on the platform
- =?UTF-8?Q?Ali_=c3=87ehreli?= (4/8) Jun 18 2020 I've worked in high performance projects where exactly that was done. I
- Steven Schveighoffer (10/23) Jun 18 2020 I think it's possible there's a problem if the reads are not atomic:
- H. S. Teoh (11/23) Jun 18 2020 [...]
- Andrei Alexandrescu (4/24) Jun 18 2020 Once again the asinine response "all inter-thread communication must be
- Timon Gehr (7/9) Jun 18 2020 I think Steven just says that your reads and writes have to be atomic
- H. S. Teoh (18/27) Jun 18 2020 What I meant is that I'm thinking of optimizing at a higher level, to
- Steven Schveighoffer (4/9) Jun 19 2020 Atomic reads/writes are not as heavy as locks. Though certainly if there...
- claptrap (8/23) Jun 18 2020 If you're on x86 all reads and writes are atomic if they are
- claptrap (3/9) Jun 18 2020 Found the relevant doc
- aliak (10/26) Jun 18 2020 If there's no alignment hanky panky (e.g. struct packing), and
Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them? I.e., will the first function always see zeroes where there are zeroes, in spite of the 2nd function writing to the array simultaneously? Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero? Or a situation where the 1st function may read a zero if the 2nd function is simultaneously overwriting a non-zero value with another non-zero value? Or am I playing with fire here? T -- Chance favours the prepared mind. -- Louis Pasteur
Jun 18 2020
On Thursday, 18 June 2020 at 16:42:15 UTC, H. S. Teoh wrote:Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them? I.e., will the first function always see zeroes where there are zeroes, in spite of the 2nd function writing to the array simultaneously? Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero? Or a situation where the 1st function may read a zero if the 2nd function is simultaneously overwriting a non-zero value with another non-zero value? Or am I playing with fire here? TAs long as reads and writes of `int` are atomic on the platform in question (which they probably are) you should be ok. Personally, I would use core.atomic.atomic{Load,Store} just to be safe; they should optimize themselves away if they're not needed.
Jun 18 2020
On 6/18/20 9:42 AM, H. S. Teoh wrote:Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero?I've worked in high performance projects where exactly that was done. I am not aware of any hardware feature that would make that unsafe. Ali
Jun 18 2020
On 6/18/20 12:42 PM, H. S. Teoh wrote:Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them? I.e., will the first function always see zeroes where there are zeroes, in spite of the 2nd function writing to the array simultaneously? Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero? Or a situation where the 1st function may read a zero if the 2nd function is simultaneously overwriting a non-zero value with another non-zero value?I think it's possible there's a problem if the reads are not atomic: Let's say a value is 0xffff0000 Thread that is looking for 0s reads the lower half of the number -> 0x????0000 Context switches, thread 2 writes 0x0000ffff into that memory slot. Back at the zero-scanning thread, it reads the second half -> 0x0000???? Combines with the first part, and now it sees a 0 where there isn't one. I would agree with Paul -- make the reads/writes atomic. -Steve
Jun 18 2020
On Thu, Jun 18, 2020 at 03:07:54PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]Let's say a value is 0xffff0000 Thread that is looking for 0s reads the lower half of the number -> 0x????0000 Context switches, thread 2 writes 0x0000ffff into that memory slot. Back at the zero-scanning thread, it reads the second half -> 0x0000???? Combines with the first part, and now it sees a 0 where there isn't one.[...] Hmmm very good point, hadn't thought of that. :-/ Oh well, I'll just think of another way of parallelizing my code. I think I can break the work up into large enough chunks that the overhead of using traditional synchronization primitives will be insignificant. Thanks all for the replies! T -- A mathematician learns more and more about less and less, until he knows everything about nothing; whereas a philospher learns less and less about more and more, until he knows nothing about everything.
Jun 18 2020
On 6/18/20 6:15 PM, H. S. Teoh wrote:On Thu, Jun 18, 2020 at 03:07:54PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]Once again the asinine response "all inter-thread communication must be done by a handshake in which both threads participate knowingly" turns out to be correct.Let's say a value is 0xffff0000 Thread that is looking for 0s reads the lower half of the number -> 0x????0000 Context switches, thread 2 writes 0x0000ffff into that memory slot. Back at the zero-scanning thread, it reads the second half -> 0x0000???? Combines with the first part, and now it sees a 0 where there isn't one.[...] Hmmm very good point, hadn't thought of that. :-/ Oh well, I'll just think of another way of parallelizing my code. I think I can break the work up into large enough chunks that the overhead of using traditional synchronization primitives will be insignificant.
Jun 18 2020
On 19.06.20 00:15, H. S. Teoh wrote:Oh well, I'll just think of another way of parallelizing my code.I think Steven just says that your reads and writes have to be atomic for the behavior to be defined at language level, so using core.atomic with memory order `raw` may in fact do what you want. (Or not, I am not sure what your full use case is.) I don't think in terms of generated assembly this will differ from what you had in mind as aligned word stores and loads are atomic on x86.
Jun 18 2020
On Fri, Jun 19, 2020 at 06:00:32AM +0200, Timon Gehr via Digitalmars-d wrote:On 19.06.20 00:15, H. S. Teoh wrote:What I meant is that I'm thinking of optimizing at a higher level, to arrange it in such a way that I don't need to lock. E.g., if scanning for zeroes can't easily multithread with updating the array, perhaps the better way to do it is to let them run sequentially, but multithread across multiple instances of the computation instead. I did some experiments along this line, and I've been finding that I'm able to boost performance by 2x by multithreading at a higher level in the code, but this comes at a significant overhead for small problem sizes. I probably need to apply some kind of heuristic to switch between plain ole single-threaded code (for small problems where the overhead of parallel computations outweigh the benefits -- I'm seeing 2x slowdown for smaller problem sets), and the parallel version for larger problems where the overhead of multithreading is dwarfed by the overall performance gain. T -- "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next. -- (Stolen from the net)Oh well, I'll just think of another way of parallelizing my code.I think Steven just says that your reads and writes have to be atomic for the behavior to be defined at language level, so using core.atomic with memory order `raw` may in fact do what you want. (Or not, I am not sure what your full use case is.) I don't think in terms of generated assembly this will differ from what you had in mind as aligned word stores and loads are atomic on x86.
Jun 18 2020
On 6/19/20 1:54 AM, H. S. Teoh wrote:What I meant is that I'm thinking of optimizing at a higher level, to arrange it in such a way that I don't need to lock. E.g., if scanning for zeroes can't easily multithread with updating the array, perhaps the better way to do it is to let them run sequentially, but multithread across multiple instances of the computation instead.Atomic reads/writes are not as heavy as locks. Though certainly if there are better ways to organize the code, that might be more fruitful. -Steve
Jun 19 2020
On Thursday, 18 June 2020 at 16:42:15 UTC, H. S. Teoh wrote:Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them? I.e., will the first function always see zeroes where there are zeroes, in spite of the 2nd function writing to the array simultaneously? Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero? Or a situation where the 1st function may read a zero if the 2nd function is simultaneously overwriting a non-zero value with another non-zero value? Or am I playing with fire here?If you're on x86 all reads and writes are atomic if they are naturally aligned. IE 16 bits on 2 byte boundry, 32 bits on 4 byte boundry, 64 bits on 64 bit boundry. (Dont think it applies to 128 bit regs) Short version is you're fine as long as your ints are on 4 byte boundrys. If you want unaligned atomicity you need a lock prefix.
Jun 18 2020
On Thursday, 18 June 2020 at 16:42:15 UTC, H. S. Teoh wrote:Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them?Found the relevant doc http://www.cs.cmu.edu/~410-f10/doc/Intel_Reordering_318147.pdf
Jun 18 2020
On Thursday, 18 June 2020 at 16:42:15 UTC, H. S. Teoh wrote:Suppose I have an int[] which may contain some zeroes, and 2 functions, one scans the array for zeroes but does not modify it, and the other modifies array elements but never introduces new zeroes (though it may write a 0 to an existing 0). Is it thread-safe to run the two functions in parallel without any synchronization between them? I.e., will the first function always see zeroes where there are zeroes, in spite of the 2nd function writing to the array simultaneously? Are there any hardware situations where the 1st function may read a non-zero value if the 2nd function is simultaneously overwriting an existing zero with another zero? Or a situation where the 1st function may read a zero if the 2nd function is simultaneously overwriting a non-zero value with another non-zero value? Or am I playing with fire here? TIf there's no alignment hanky panky (e.g. struct packing), and you're on x86-64, you have your guarantee - as reading aligned words are atomic. If you want portability then there's no guarantee. Steven's case could certainly happen if a value crosses a word boundary on a platform that doesn't guarantee atomic reads/writes. I was confirming things and found this which was a fun little read: https://preshing.com/20130618/atomic-vs-non-atomic-operations/
Jun 18 2020