digitalmars.D - synchronized class bugs?
- Gregor Mückl (36/36) Apr 07 2020 I've been playing around with synchronized class. My example is
- IGotD- (4/41) Apr 07 2020 Correct me if I'm wrong, aren't all synchronized classes
- Gregor Mückl (9/12) Apr 07 2020 That's what I'm trying to say. They should be protected and the
- IGotD- (7/19) Apr 07 2020 I think this is correct, you should be allowed to cast to atomics
- Gregor =?UTF-8?B?TcO8Y2ts?= (6/27) Apr 07 2020 I don't think you understand: the compiler actively prohibits the
- IGotD- (3/7) Apr 07 2020 Sorry, I misunderstood. I agree, that doesn't make any sense at
- Steven Schveighoffer (24/67) Apr 08 2020 The requirement for atomicOp is separate from the synchronization. The
- Gregor Mückl (73/101) Apr 08 2020 OK, I understand the rationale, even though I don't like the
- Steven Schveighoffer (10/21) Apr 08 2020 No, you misunderstand. The correct path is to cast away shared for those...
- Steven Schveighoffer (18/46) Apr 08 2020 The issue is not synchronization of the class data, but your promiscuous...
- Gregor Mückl (10/13) Apr 08 2020 Whooops... that was a major brain fart :(. Sorry for stealing
- Steven Schveighoffer (7/18) Apr 08 2020 Yeah, there is a lot of things we could possibly do to prevent these
I've been playing around with synchronized class. My example is the
following dummy class:

synchronized class Shared {
public:
    void increment(int value) {
        //(cast(Shared) this).x += value;
        x += value;
    }

    void decrement(int value) {
        //(cast(Shared) this).x -= value;
        x -= value;
    }

    shared int get() {
        return x;
    }

private:
    int x;
}

Then I'm calling increment and decrement many times in parallel threads
so that they *should* cancel out in the end and x should be back to 0 at
the end. With the implicit synchronization using D monitors, this should
work.

Bug number 1: dmd doesn't compile this without casting away shared:

sharedtest.d(14): Error: read-modify-write operations are not allowed for shared variables. Use core.atomic.atomicOp!"+="(this.x, value) instead.

The += and -= operators are safe as they are inside locked monitors. The
emitted code contains calls to _d_monitorenter and _d_monitorexit. The
compiler should understand this.

Bug number 2: even when casting away shared, the end result is wrong when
using multiple threads. Thus, the monitor locking code in druntime cannot
be correct. It does have an effect, though: omitting synchronized on
Shared results in values that are wide of the mark; including it results
in small differences. This suggests that a memory fence might be missing
somewhere.

The observed reality contradicts the documentation in many ways and is
inconsistent with itself. What is going on here?
Apr 07 2020
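The behavior the post expects can be sketched in Python (used here purely
for illustration, not as a statement about D's semantics; the names
`SharedCounter` and `run_demo` are hypothetical): one mutex guards every
method, so plain `+=`/`-=` need no atomics, and balanced increments and
decrements cancel to zero.

```python
import threading

class SharedCounter:
    """Analog of a synchronized class: one mutex guards all methods."""
    def __init__(self):
        self._lock = threading.Lock()
        self._x = 0

    def increment(self, value):
        with self._lock:
            self._x += value   # plain read-modify-write, safe under the lock

    def decrement(self, value):
        with self._lock:
            self._x -= value

    def get(self):
        with self._lock:
            return self._x

def run_demo(n_threads=4, iterations=50_000):
    s = SharedCounter()
    def worker(val):
        for _ in range(iterations):
            s.increment(val)
        for _ in range(iterations):
            s.decrement(val)
    # note: args=(i,) snapshots i per thread, unlike a closure over i
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return s.get()

print(run_demo())  # increments and decrements cancel: 0
```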
On Tuesday, 7 April 2020 at 15:18:41 UTC, Gregor Mückl wrote:
[...]
> Bug number 1: dmd doesn't compile this without casting away shared:
[...]
> The += and -= operators are safe as they are inside locked monitors.

Correct me if I'm wrong, but aren't all synchronized classes protected
with a mutex? In that case atomic operations are pointless, as all
methods are protected by the mutex anyway.
Apr 07 2020
On Tuesday, 7 April 2020 at 16:11:12 UTC, IGotD- wrote:
> Correct me if I'm wrong, but aren't all synchronized classes protected
> with a mutex? In that case atomic operations are pointless, as all
> methods are protected by the mutex anyway.

That's what I'm trying to say. They should be protected, and the code
for locking and unlocking is generated (tested on dmd, gdc and ldc). But

- the compiler tries to enforce atomics in this context anyway, even
  though they are pointless due to the mutex
- the mutex doesn't lead to proper locking behavior at runtime; when
  calling increment and decrement in many threads in parallel, the
  result is wrong.
Apr 07 2020
On Tuesday, 7 April 2020 at 16:18:53 UTC, Gregor Mückl wrote:
[...]
> - the mutex doesn't lead to proper locking behavior at runtime; when
>   calling increment and decrement in many threads in parallel, the
>   result is wrong.

I think this is correct: you should be allowed to cast, or do whatever
operation you want, within the mutex, atomic or not. The wrong result
sounds more like a bug in the locking itself. One way to check is to try
this on another operating system and see if you get a similar result.
Apr 07 2020
On Tuesday, 7 April 2020 at 16:33:13 UTC, IGotD- wrote:
> I think this is correct: you should be allowed to cast, or do whatever
> operation you want, within the mutex, atomic or not.

I don't think you understand: the compiler actively prohibits the use of
non-atomic operations in synchronized functions. That makes no sense.

> The wrong result sounds more like a bug in the locking itself. One way
> to check is to try this on another operating system and see if you get
> a similar result.

The results are wrong with dmd 2.088 on Windows and dmd 2.091, gdc 9.3.0
and ldc 1.20.1 on Linux.
Apr 07 2020
On Tuesday, 7 April 2020 at 23:34:13 UTC, Gregor Mückl wrote:
> I don't think you understand: the compiler actively prohibits the use
> of non-atomic operations in synchronized functions. That makes no
> sense.

Sorry, I misunderstood. I agree, that doesn't make any sense at all.
Apr 07 2020
On 4/7/20 11:18 AM, Gregor Mückl wrote:
[...]
> Bug number 1: dmd doesn't compile this without casting away shared:
[...]
> The += and -= operators are safe as they are inside locked monitors.
> The emitted code contains calls to _d_monitorenter and _d_monitorexit.
> The compiler should understand this.

The requirement for atomicOp is separate from the synchronization. The
synchronization is only sound because you made it sound. In fact, if you
did:

auto getPtr() { return &x; }

then the synchronization all of a sudden does not apply to that variable.
This is why we are moving to REQUIRING casting away shared to read/write
data -- you have to do it anyway.

> Bug number 2: even when casting away shared, the end result is wrong
> when using multiple threads. Thus, the monitor locking code in druntime
> cannot be correct.

synchronized classes call _d_monitorenter and _d_monitorexit, which lock
OS primitives (and initialize the monitor if necessary). These have not
been touched in years, and depend on the OS mutex, which should be
rock-solid. Looking at the history, it looks like the last significant
change was
https://github.com/dlang/druntime/commit/70b3536ddd06e8e1afb269b88ed3298ec03b8798#diff-d609308f6ce106db16e214b30db6f74
in 2015.

I've definitely depended on synchronized functions in the past and
haven't had issues, but it was in the far past. The only place I can
reasonably assume that there might be an issue is the initialization. To
test this, try initializing the lock FIRST, before spawning multiple
threads (just call one of the functions), and see if it helps. If so, it
might be a bug in the initialization routines.

Also, if you find an issue, it may be specific to your OS.

-Steve
Apr 08 2020
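Steve's `getPtr` point translates to any lock-based design: once a
reference to the guarded field escapes, callers can mutate it with no
lock held, so the class author's lock no longer proves anything about
plain reads and writes. A minimal Python sketch (the `Guarded` class and
its methods are hypothetical names, not an existing API):

```python
import threading

class Guarded:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = [0]        # guarded... unless a reference escapes

    def add(self, v):
        with self._lock:
            self._data[0] += v  # every in-class access takes the lock

    def get_ref(self):
        return self._data       # escapes the lock, like D's `return &x`

g = Guarded()
leaked = g.get_ref()
leaked[0] += 5                  # mutation with no lock held at all
g.add(1)
print(g.get_ref()[0])           # 6: the lock never saw the first write
```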
On Wednesday, 8 April 2020 at 13:43:08 UTC, Steven Schveighoffer wrote:
> This is why we are moving to REQUIRING casting away shared to
> read/write data -- you have to do it anyway.

OK, I understand the rationale, even though I don't like the solution
very much. You shouldn't really use atomic ops when locks on the affected
variables are held. That's bad practice.

> synchronized classes call _d_monitorenter and _d_monitorexit, which
> lock OS primitives (and initialize the monitor if necessary).

The calls to _d_monitorenter and _d_monitorexit are in the generated code
and I've looked up their implementation before posting to make sure I
have a correct understanding. I would also assume mutexes to be rock
solid. This is why I am in utter disbelief about the actual behavior of
the toy program (see code at the end).

> To test this, try initializing the lock FIRST, before spawning multiple
> threads (just call one of the functions), and see if it helps. If so,
> it might be a bug in the initialization routines.

I added such function calls and that didn't change anything.

> Also, if you find an issue, it may be specific to your OS.

Well, as I've mentioned before, I've reproduced this on Linux and Windows
with dmd, ldc and gdc. So it's not OS specific, or at least not entirely.

The full program code is below the line. You can run it on run.dlang.org
to reproduce the problem.

----

import core.thread;
import std.stdio;

version = Sync;

version(Sync) {
    pragma(msg, "synchronized");

    synchronized class Shared {
    public:
        void increment(int value) {
            (cast(Shared) this).x += value;
        }

        void decrement(int value) {
            (cast(Shared) this).x -= value;
        }

        shared int get() {
            return x;
        }

    private:
        int x;
    }
} else {
    pragma(msg, "non-synchronized");

    class Shared {
    public:
        shared void increment(int value) {
            (cast(Shared) this).x += value;
        }

        shared void decrement(int value) {
            (cast(Shared) this).x -= value;
        }

        shared int get() {
            return x;
        }

    private:
        int x;
    }
}

void main() {
    shared Shared s = new shared Shared();
    s.increment(1);
    s.decrement(1);
    writeln(s.get());

    Thread[] threads = new Thread[10];
    for(int i = 0; i < threads.length; i++) {
        threads[i] = new Thread({
            for(int j = 0; j < 1_000_000; j++) {
                s.increment(i);
            }
            for(int j = 0; j < 1_000_000; j++) {
                s.decrement(i);
            }
        });
        threads[i].start();
    }

    for(int i = 0; i < threads.length; i++) {
        threads[i].join();
    }

    writeln(s.get());
}
Apr 08 2020
On 4/8/20 6:50 PM, Gregor Mückl wrote:
> OK, I understand the rationale, even though I don't like the solution
> very much. You shouldn't really use atomic ops when locks on the
> affected variables are held. That's bad practice.

No, you misunderstand. The correct path is to cast away shared for those
variables that you manually ensure cannot be touched outside your class,
not to use atomic ops. But the compiler has no way to prove you did it
right, which is why the casting is required. The expectation is that
frameworks that use this requirement will emerge that make this much more
convenient.

> The full program code is below the line. You can run it on
> run.dlang.org to reproduce the problem.

OK, will look to see what I can find.

-Steve
Apr 08 2020
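One shape such a convenience framework could take — purely a sketch, not
an existing library in any language — is a wrapper that only hands out
its payload while its lock is held, so unguarded access never happens by
accident:

```python
import threading
from contextlib import contextmanager

class Locked:
    """Hypothetical wrapper: the payload is only reachable under the lock."""
    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value

    @contextmanager
    def borrow(self):
        with self._lock:
            yield self._value   # lock is released when the with-block ends

counter = Locked([0])           # a one-element list as a mutable cell

def bump(n):
    with counter.borrow() as c:
        c[0] += n               # plain += is fine: we hold the lock here

threads = [threading.Thread(target=bump, args=(1,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with counter.borrow() as c:
    print(c[0])  # 8
```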
On 4/8/20 6:50 PM, Gregor Mückl wrote:
>     Thread[] threads = new Thread[10];
>     for(int i = 0; i < threads.length; i++) {
>         threads[i] = new Thread({
>             for(int j = 0; j < 1_000_000; j++) {
>                 s.increment(i);
>             }
>             for(int j = 0; j < 1_000_000; j++) {
>                 s.decrement(i);
>             }
>         });
>         threads[i].start();
>     }

The issue is not synchronization of the class data, but your promiscuous
usage of the stack frame ;) Try this instead for your loop:

for(int i = 0; i < threads.length; i++) {
    threads[i] = (int val) {
        return new Thread({
            for(int j = 0; j < 1_000_000; j++) {
                s.increment(val);
            }
            for(int j = 0; j < 1_000_000; j++) {
                s.decrement(val);
            }
        });
    }(i);
    threads[i].start();
}

Why doesn't yours work? Because you are starting your threads and then
changing the value of i as you iterate the loop. This version works
because it captures the i value as a new variable in a closure, so `val`
remains constant while `i` increments.

-Steve
Apr 08 2020
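The same capture pitfall exists outside D. In Python, for instance, a
closure created in a loop sees the loop variable itself rather than a
snapshot of it, and the idiomatic fix likewise copies the value once per
iteration:

```python
def make_buggy():
    fns = []
    for i in range(3):
        fns.append(lambda: i)          # all three closures share one `i`
    return [f() for f in fns]          # i is 2 by the time any f runs

def make_fixed():
    fns = []
    for i in range(3):
        fns.append(lambda val=i: val)  # default arg copies i per iteration
    return [f() for f in fns]

print(make_buggy())  # [2, 2, 2]
print(make_fixed())  # [0, 1, 2]
```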
On Wednesday, 8 April 2020 at 23:25:09 UTC, Steven Schveighoffer wrote:
> This version works because it captures the i value as a new variable in
> a closure, so `val` remains constant while `i` increments.

Whooops... that was a major brain fart :(. Sorry for stealing your time.

Still, this makes me think about the rules for accessing variables in the
outer frame in lambdas. I guess I'm too used to the C++ rules that
require explicit capture of everything and default to copying. At least
safe code should require explicit capture. There is no guarantee that a
lambda is executed in the same thread.
Apr 08 2020
On 4/8/20 8:20 PM, Gregor Mückl wrote:
> Whooops... that was a major brain fart :(. Sorry for stealing your
> time.

No worries, I didn't spend too long on it.

> Still, this makes me think about the rules for accessing variables in
> the outer frame in lambdas. [...] At least safe code should require
> explicit capture. There is no guarantee that a lambda is executed in
> the same thread.

Yeah, there are a lot of things we could possibly do to prevent these
issues. I don't know if they are doable without major breakage elsewhere.
I had thought of a capture utility function; maybe that can help in cases
like this.

-Steve
Apr 08 2020
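For comparison (this is an illustration in Python, not a claim about what
Steve's proposed D utility would look like), Python's standard library
already ships one kind of capture utility: `functools.partial` snapshots
its arguments at creation time, so no closure over the loop variable is
formed at all.

```python
from functools import partial

def work(val):
    return val * 10

# partial(work, i) binds the *current* value of i, per iteration
jobs = [partial(work, i) for i in range(3)]
print([j() for j in jobs])  # [0, 10, 20]
```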