digitalmars.D - A breach of immutability due to memory implicit conversions to
- John Colvin (36/36) Nov 11 2018 Take a look at this (I think?) supposed-to-be-thread-safe code
- Steven Schveighoffer (9/54) Nov 11 2018 Hm... I wouldn't expect the compiler would allow reordering across a
- Steven Schveighoffer (4/40) Nov 11 2018 Wait, wouldn't the atomicStore create a barrier, such that all the code
- Neia Neutuladh (10/12) Nov 11 2018 It creates a smaller barrier that's supposed to be good enough (`LOCK
- John Colvin (6/56) Nov 12 2018 The compiler can definitely re-order over return boundaries in
- John Colvin (10/68) Nov 12 2018 Hmm, ok, so it seems that perhaps this only occurs with relaxed
- John Colvin (6/10) Nov 12 2018 The correct statement would be more like "reading immutable data
- Timon Gehr (4/13) Nov 12 2018 Yes. More generally, I think implicit sharing of immutable data was a
- Kagamin (5/10) Nov 13 2018 Well, without acquire-release you can't meaningfully share data -
- John Colvin (6/16) Nov 13 2018 What - precisely - do you mean by "without acquire-release you
- Nicholas Wilson (2/17) Nov 13 2018 I *think* it isn't guaranteed to leave cache otherwise.
- Kagamin (8/13) Nov 14 2018 You mean relaxed order? Looks like llvm has better description
- John Colvin (38/52) Nov 14 2018 What relaxed does is roughly: "all relaxed operations on this
- John Colvin (20/20) Nov 15 2018 Correcting the obvious typos:
- Rubn (25/62) Nov 11 2018 Maybe I'm missing something but:
- John Colvin (9/80) Nov 12 2018 The MODIFY line changes the value where the pointer looks. I'm
- Kagamin (4/9) Nov 12 2018 For that you will need READ to happen before atomicLoad (in which
- John Colvin (4/13) Nov 12 2018 The first one is definitely not allowed, I'm not worried about
- Kagamin (5/8) Nov 12 2018 Well, the topic is a little abstract,
Take a look at this (I think?) supposed-to-be-thread-safe code according to the rules of D:

import std.stdio;
import core.thread;
import core.atomic;

int* foo() pure {
    auto ret = new int;
    *ret = 3; // MODIFY
    return ret;
}

shared immutable(int)* g;

void bar() {
    immutable(int)* d = null;
    while (d is null)
        d = g.atomicLoad;
    assert(*d == 3); // READ
    assert(*d == 3); // READ AGAIN
}

void main() {
    auto t = new Thread(&bar).start();
    immutable(int)* a = foo();
    g.atomicStore(a);
    t.join;
}

What stops the CPU executing this such that MODIFY happens between READ and READ AGAIN?

To aid in the thought experiment, imagine we replace `*ret = 3` with `*ret = longComputation()`. It might help your reasoning about re-ordering if you consider `foo` inlined.

Is implicit conversion to the shared part of immutable actually safe, or does it secretly hide data races? Another way of putting this: is `ret` secretly shared due to implicit conversion to immutable?
Nov 11 2018
On 11/11/18 2:21 PM, John Colvin wrote:
> Take a look at this (I think?) supposed-to-be-thread-safe code
> according to the rules of D:
> [code snipped]
> What stops the CPU executing this such that MODIFY happens between
> READ and READ AGAIN?

Hm... I wouldn't expect the compiler to allow reordering across a return boundary, even if the inlining occurs. Implicit conversion of pure return values depends on that.

I don't know all the rules of reordering, but this would definitely cause races if it were allowed to reorder the storage of data into *d and the storage of d into g.

> Is implicit conversion to the shared part of immutable actually safe,
> or does it secretly hide data races? Another way of putting this: Is
> `ret` secretly shared due to implicit conversion to immutable?

If it is, we need to make it so it's not.

-Steve
Nov 11 2018
On 11/11/18 3:47 PM, Steven Schveighoffer wrote:
> On 11/11/18 2:21 PM, John Colvin wrote:
> [quoted code and discussion snipped]

Wait, wouldn't the atomicStore create a barrier, such that all the code before it must execute?

-Steve
Nov 11 2018
On Sun, 11 Nov 2018 15:48:31 -0500, Steven Schveighoffer wrote:
> Wait, wouldn't the atomicStore create a barrier, such that all the code
> before it must execute?

It creates a smaller barrier that's supposed to be good enough (`LOCK XCHG` instead of `MFENCE`). And it *is* good enough when dealing with value types. I'm not good with multithreading, so I'm not entirely certain whether this is an actual problem in druntime.

`LOCK XCHG` will serialize access for the global variable and any cache line containing it. It will also limit reordering. I haven't found anything remotely definitive on whether it provides a strong enough barrier for reordering and memory operations to suffice here, but I'd put a small bet in favor of it.
Nov 11 2018
On Sunday, 11 November 2018 at 20:47:16 UTC, Steven Schveighoffer wrote:
> Hm... I wouldn't expect the compiler would allow reordering across a
> return boundary. Even if the inlining occurs. Implicit conversion of
> pure return values depends on that.

The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?

> Wait, wouldn't the atomicStore create a barrier, such that all the
> code before it must execute?

Not a full memory barrier on most (all relevant?) platforms.
Nov 12 2018
On Monday, 12 November 2018 at 09:11:04 UTC, John Colvin wrote:
> The compiler can definitely re-order over return boundaries in
> general, but perhaps there is some special logic to prevent mistakes
> in these cases?
> [...]
> Not a full memory barrier on most (all relevant?) platforms.

Hmm, ok, so it seems that perhaps this only occurs with relaxed memory ordering, because with acquire-release (or better), if bar sees the result of main's write to g (i.e. it gets out of the while loop) then it must also see all side-effects of everything main did before that write.

However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.
Nov 12 2018
On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
> However, with relaxed memory ordering I'm now more convinced that
> this would break immutability, which in turn means that the oft-made
> statement "immutable data doesn't require synchronisation" isn't true.

The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence-inspiring, haha!
Nov 12 2018
On 12.11.18 16:00, John Colvin wrote:
> The correct statement would be more like "reading immutable data
> doesn't require synchronisation if the method of obtaining the
> reference to the immutable data has at least acquire-release
> semantics".

Yes. More generally, I think implicit sharing of immutable data was a mistake. (All it does is to forbid thread-local immutable data -- a valid use case.)
Nov 12 2018
On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
> The correct statement would be more like "reading immutable data
> doesn't require synchronisation if the method of obtaining the
> reference to the immutable data has at least acquire-release
> semantics".

Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver the data only once; after that you can read it without synchronization.
Nov 13 2018
On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
> Well, without acquire-release you can't meaningfully share data - it
> simply doesn't reach another thread.

What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"? How would the incrementing of the reference count in shared_ptr work then?
Nov 13 2018
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
> What - precisely - do you mean by "without acquire-release you can't
> meaningfully share data - it simply doesn't reach another thread"?

I *think* it isn't guaranteed to leave cache otherwise.
Nov 13 2018
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
> What - precisely - do you mean by "without acquire-release you can't
> meaningfully share data - it simply doesn't reach another thread"?

The data simply remains in the processor cache.

> How would the incrementing of the reference count in shared_ptr work
> then?

You mean relaxed order? LLVM has a better description for that: http://llvm.org/docs/Atomics.html#monotonic - it's a synchronization, just a weak one: it synchronizes only the one variable, not the entire cache. And that's why you need full synchronization on the decrement: you may need to run the destructor, which needs the latest state of the object and everything it references.
Nov 14 2018
On Wednesday, 14 November 2018 at 08:38:55 UTC, Kagamin wrote:
> it's a synchronization, just weak - synchronizes only one variable,
> not the entire cache

What relaxed does is roughly: "all relaxed operations on this value happen atomically, in some order".

int* foo() pure {
    auto ret = new int; // A
    *ret = 3;           // B
    return ret;
}

shared immutable(int)* g;

void bar() {
    immutable(int)* d = null;
    while (d is null)
        d = g.atomicLoad;
    assert(*d == 3); // D
    assert(*d == 3); // E
}

void main() {
    auto t = new Thread(&bar).start();
    immutable(int)* a = foo();
    g.atomicStore!(MemoryOrder.relaxed)(a); // C
    t.join;
}

The relevant ordering restrictions are as follows:

A must happen before B, because `ret` must exist to be used.
A must happen before C, because `a` must exist to be used, which cannot exist until `ret` is used.
C must happen before D & E, because it's the only way out of the while loop.

So we have A -> C -> D -> E, but B can sit anywhere after A, including between D and E!

To address the "it won't leave the cache" idea - which I'm guessing is actually meant to be "relaxed atomics make no promises about when changes become visible" - you should consider whether acq-rel actually gives you any more guarantee about that in this case (how many iterations of the while loop would you expect?), but also that a lack of guarantee of timeliness is not the same as a guarantee of never.
Nov 14 2018
Correcting the obvious typos:

int* foo() pure {
    auto ret = new int; // A
    *ret = 3;           // B
    return ret;
}

shared immutable(int)* g;

void bar() {
    immutable(int)* d = null;
    while (d is null)
        d = g.atomicLoad!(MemoryOrder.raw);
    assert(*d == 3); // D
    assert(*d == 3); // E
}

void main() {
    auto t = new Thread(&bar).start();
    immutable(int)* a = foo();
    g.atomicStore!(MemoryOrder.raw)(a); // C
    t.join;
}
Nov 15 2018
On Sunday, 11 November 2018 at 19:21:10 UTC, John Colvin wrote:
> [code snipped]
> What stops the CPU executing this such that MODIFY happens between
> READ and READ AGAIN?

Maybe I'm missing something, but:

void bar() {
    immutable(int)* d = null;
    while (d is null)
        d = g.atomicLoad;
    assert(*d == 3); // READ
    assert(*d == 3); // READ - these two reads will always be the same
}

They will always be the same, as what you are changing is the pointer, and here you atomicLoad() the pointer. So you will always have the same pointer. The value will only change if you do another atomicLoad or change the value that the pointer points to (which you don't do anywhere in your example).

int* foo() /* pure */ {
    static auto ret = new int; // Note: static here
    *ret = 3; // MODIFY
    return ret;
}

You might have a problem if "ret" here is static. But then I don't think the function would be pure anymore. Or rather, at the very least it shouldn't be pure.
Nov 11 2018
On Sunday, 11 November 2018 at 23:29:13 UTC, Rubn wrote:
> They will always be the same as what you are changing is the pointer.
> But here you atomicLoad() the pointer. So you will always have the
> same pointer.

The MODIFY line changes the value where the pointer looks. I'm asking whether it's possible that either READ happens before MODIFY (which would be v. bad). Using two reads instead of one was just to turn the (potential) ordering bug into a (potential) breach of immutability.

> You might have a problem if "ret" here is static. But then I don't
> think the function would be pure anymore.

Yes, that isn't pure, so the implicit conversion to immutable wouldn't work. It also doesn't compile at all, due to using a non-constant CTFE pointer as an initialiser.
Nov 12 2018
On Monday, 12 November 2018 at 09:19:26 UTC, John Colvin wrote:
> The MODIFY line changes the value where the pointer looks. I'm asking
> whether it's possible that either READ happens before MODIFY (which
> would be v. bad).

For that you would need READ to happen before the atomicLoad (in which case it would just dereference a null pointer?) or MODIFY to happen after the atomicStore?
Nov 12 2018
On Monday, 12 November 2018 at 11:54:36 UTC, Kagamin wrote:
> For that you will need READ to happen before atomicLoad (in which
> case it will just dereference null pointer?) or MODIFY to happen
> after atomicStore?

The first one is definitely not allowed; I'm not worried about that. But is it actually forbidden for the MODIFY to happen after atomicStore?
Nov 12 2018
On Monday, 12 November 2018 at 13:17:09 UTC, John Colvin wrote:
> But is it actually forbidden for the MODIFY to happen after
> atomicStore?

Well, the topic is a little abstract. https://en.cppreference.com/w/c/atomic/memory_order - but this description looks pretty decent, way better than whatever I saw before.
Nov 12 2018