
digitalmars.D - A breach of immutability due to memory allocation and implicit conversions to immutable?

reply John Colvin <john.loughran.colvin gmail.com> writes:
Take a look at this (I think?) supposed-to-be-thread-safe code 
according to the rules of D:

import std.stdio;
import core.thread;
import core.atomic;

int* foo() pure
{
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
}

shared immutable(int)* g;

void bar()
{
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
}

void main()
{
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
}

What stops the CPU executing this such that MODIFY happens 
between READ and READ AGAIN ?

To aid in the thought experiment, imagine if we replace `*ret = 
3` with `*ret = longComputation()`? It might help your reasoning 
about re-ordering if you consider `foo` inlined.

Is implicit conversion to the shared part of immutable actually 
safe, or does it secretly hide data races?

Another way of putting this:
Is `ret` secretly shared due to implicit conversion to immutable?
Nov 11 2018
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code according 
 to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens between READ 
 and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret = 3` with 
 `*ret = longComputation()`? It might help your reasoning about 
 re-ordering if you consider `foo` inlined.
Hm... I wouldn't expect the compiler to allow reordering across a return boundary, even if inlining occurs. Implicit conversion of pure return values depends on that.

I don't know all the rules of reordering, but this would definitely cause races if it were allowed to reorder the store of the data into *ret and the store of the pointer into g.
 
 Is implicit conversion to the shared part of immutable actually safe, or 
 does it secretly hide data races?
 
 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to immutable?
If it is, we need to make it so it's not.

-Steve
Nov 11 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/11/18 3:47 PM, Steven Schveighoffer wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }

 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }

 What stops the CPU executing this such that MODIFY happens between 
 READ and READ AGAIN ?
Wait, wouldn't the atomicStore create a barrier, such that all the code before it must execute?

-Steve
Nov 11 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Sun, 11 Nov 2018 15:48:31 -0500, Steven Schveighoffer wrote:
 Wait, wouldn't the atomicStore create a barrier, such that all the code
 before it must execute?
It creates a smaller barrier that's supposed to be good enough (`LOCK XCHG` instead of `MFENCE`), and it *is* good enough when dealing with value types.

I'm not good with multithreading, so I'm not entirely certain whether this is an actual problem in druntime. `LOCK XCHG` will serialize access for the global variable and any cache line containing it, and it will also limit reordering. I haven't found anything remotely definitive on whether it provides a strong enough barrier for reordering and memory operations to suffice here, but I'd put a small bet in favor of it.
Nov 11 2018
prev sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Sunday, 11 November 2018 at 20:47:16 UTC, Steven Schveighoffer 
wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.
 Hm... I wouldn't expect the compiler to allow reordering across a 
 return boundary, even if inlining occurs. Implicit conversion of pure 
 return values depends on that. I don't know all the rules of 
 reordering, but this would definitely cause races if it were allowed 
 to reorder the store of the data into *ret and the store of the 
 pointer into g.
The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?
 Wait, wouldn't the atomicStore create a barrier, such that all 
 the code before it must execute?
 
 -Steve
Not a full memory barrier on most (all relevant?) platforms.
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 09:11:04 UTC, John Colvin wrote:
 On Sunday, 11 November 2018 at 20:47:16 UTC, Steven 
 Schveighoffer wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe 
 code according to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.
 Hm... I wouldn't expect the compiler to allow reordering across a 
 return boundary, even if inlining occurs. Implicit conversion of pure 
 return values depends on that. I don't know all the rules of 
 reordering, but this would definitely cause races if it were allowed 
 to reorder the store of the data into *ret and the store of the 
 pointer into g.
The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?
 Wait, wouldn't the atomicStore create a barrier, such that all 
 the code before it must execute?
 
 -Steve
Not a full memory barrier on most (all relevant?) platforms.
Hmm, ok, so it seems that perhaps this only occurs with relaxed memory ordering: with acquire-release (or better), if bar sees the result of main's write to g (i.e. it gets out of the while loop), then it must also see all side-effects of everything main did before that write.

However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
 However, with relaxed memory ordering I'm now more convinced 
 that this would break immutability, which in turn means that 
 the oft-made statement "immutable data doesn't require 
 synchronisation" isn't true.
The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!
Nov 12 2018
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12.11.18 16:00, John Colvin wrote:
 On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
 However, with relaxed memory ordering I'm now more convinced that this 
 would break immutability, which in turn means that the oft-made 
 statement "immutable data doesn't require synchronisation" isn't true.
 The correct statement would be more like "reading immutable data 
 doesn't require synchronisation if the method of obtaining the 
 reference to the immutable data has at least acquire-release 
 semantics". Which is considerably less snappy and confidence 
 inspiring haha!
Yes. More generally, I think implicit sharing of immutable data was a mistake. (All it does is forbid thread-local immutable data -- a valid use case.)
Nov 12 2018
prev sibling parent reply Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of obtaining 
 the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver the data only once; after that you can read it without synchronization.
Nov 13 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of 
 obtaining the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
 Well, without acquire-release you can't meaningfully share data - it 
 simply doesn't reach another thread. Does it even count as sharing? 
 With immutable data you need to deliver the data only once; after 
 that you can read it without synchronization.
What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"? How would the incrementing of the reference count in shared_ptr work then?
Nov 13 2018
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of 
 obtaining the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
 Well, without acquire-release you can't meaningfully share data - it 
 simply doesn't reach another thread. Does it even count as sharing? 
 With immutable data you need to deliver the data only once; after 
 that you can read it without synchronization.
What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?
I *think* it isn't guaranteed to leave cache otherwise.
Nov 13 2018
prev sibling parent reply Kagamin <spam here.lot> writes:
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 What - precisely - do you mean by "without acquire-release you 
 can't meaningfully share data - it simply doesn't reach another 
 thread"?
The data simply remains in processor cache.
 How would the incrementing of the reference count in shared_ptr 
 work then?
You mean relaxed order? LLVM has a better description of it: http://llvm.org/docs/Atomics.html#monotonic - it's still synchronization, just weak: it synchronizes only that one variable, not the entire cache. And the reason you need full synchronization on the decrement is that you may need to run the destructor, which needs the latest state of the object and everything it references.
Nov 14 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Wednesday, 14 November 2018 at 08:38:55 UTC, Kagamin wrote:
 On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 What - precisely - do you mean by "without acquire-release you 
 can't meaningfully share data - it simply doesn't reach 
 another thread"?
The data simply remains in processor cache.
 How would the incrementing of the reference count in 
 shared_ptr work then?
 You mean relaxed order? Looks like llvm has a better description for 
 that: http://llvm.org/docs/Atomics.html#monotonic - it's still 
 synchronization, just weak: it synchronizes only that one variable, 
 not the entire cache. And you need full synchronization on the 
 decrement because you may need to run the destructor, which needs 
 the latest state of the object and everything it references.
What relaxed does is roughly: "all relaxed operations on this value happen atomically in some order".

int* foo() pure
{
     auto ret = new int; // A
     *ret = 3; // B
     return ret;
}

shared immutable(int)* g;

void bar()
{
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad;
     assert(*d == 3); // D
     assert(*d == 3); // E
}

void main()
{
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore!(MemoryOrder.relaxed)(a); // C
     t.join;
}

The relevant ordering restrictions are as follows:

A must happen before B, because `ret` must exist to be used.
A must happen before C, because `a` must exist to be used, and it cannot exist until `ret` does.
C must happen before D & E, because it's the only way out of the while loop.

So we have A -> C -> D -> E, but B can sit anywhere after A, including between D and E!

To address the "it won't leave the cache" idea - which I'm guessing is actually meant to be "relaxed atomics make no promises about when changes become visible" - you should consider whether acq-rel actually gives you any more guarantee about that in this case (how many iterations of the while loop would you expect?), but also that a lack of a guarantee of timeliness is not the same as a guarantee of never.
Nov 14 2018
parent John Colvin <john.loughran.colvin gmail.com> writes:
Correcting the obvious typos:

int* foo() pure {
     auto ret = new int; // A
     *ret = 3; // B
     return ret;
}

shared immutable(int)* g;

void bar() {
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad!(MemoryOrder.raw);
     assert(*d == 3); // D
     assert(*d == 3); // E
}

void main() {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore!(MemoryOrder.raw)(a); // C
     t.join;
}
Nov 15 2018
prev sibling parent reply Rubn <where is.this> writes:
On Sunday, 11 November 2018 at 19:21:10 UTC, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
 }

 void main()
 {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
 }

 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?

 To aid in the thought experiment, imagine if we replace `*ret = 
 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.

 Is implicit conversion to the shared part of immutable actually 
 safe, or does it secretly hide data races?

 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to 
 immutable?
Maybe I'm missing something but:

void bar()
{
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ - these two reads will always be the same
}

They will always be the same, as what you are changing is the pointer, but here you atomicLoad() the pointer, so you will always have the same pointer. The value will only change if you do another atomicLoad or change the value that the pointer points to (which you don't do anywhere in your example).

int* foo() /* pure */
{
     static auto ret = new int; // Note: static here
     *ret = 3;  // MODIFY
     return ret;
}

You might have a problem if "ret" here is static. But then I don't think the function would be pure anymore - or at the very least it shouldn't be pure.
Nov 11 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Sunday, 11 November 2018 at 23:29:13 UTC, Rubn wrote:
 On Sunday, 11 November 2018 at 19:21:10 UTC, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
 }

 void main()
 {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
 }

 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?

 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.

 Is implicit conversion to the shared part of immutable 
 actually safe, or does it secretly hide data races?

 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to 
 immutable?
 Maybe I'm missing something but:

 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ - these two reads will always be the same
 }

 They will always be the same, as what you are changing is the 
 pointer, but here you atomicLoad() the pointer, so you will always 
 have the same pointer. The value will only change if you do another 
 atomicLoad or change the value that the pointer points to (which you 
 don't do anywhere in your example).
The MODIFY line changes the value the pointer points at. I'm asking whether it's possible that either READ happens before MODIFY (which would be v. bad). Using two reads instead of one was just to turn the (potential) ordering bug into a (potential) breach of immutability.
 int* foo() /* pure */
 {
     static auto ret = new int; // Note: Static here
     *ret = 3;  // MODIFY
     return ret;
 }

 You might have a problem if "ret" here is static. But then I 
 don't think the function would be pure anymore. Or rather at 
 the very least it shouldn't be pure.
Yes, that isn't pure, so the implicit conversion to immutable wouldn't work. It also doesn't compile at all, due to using a non-constant CTFE pointer as an initialiser.
Nov 12 2018
parent reply Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 09:19:26 UTC, John Colvin wrote:
 The MODIFY line changes the value where the pointer looks. I'm 
 asking whether it's possible that either READ happens before 
 MODIFY (which would be v. bad). Using two reads instead of one 
 was just to turn the (potential) ordering bug in to a 
 (potential) breach of immutability.
For that you would need READ to happen before the atomicLoad (in which case it would just dereference a null pointer?) or MODIFY to happen after the atomicStore?
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 11:54:36 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 09:19:26 UTC, John Colvin wrote:
 The MODIFY line changes the value where the pointer looks. I'm 
 asking whether it's possible that either READ happens before 
 MODIFY (which would be v. bad). Using two reads instead of one 
 was just to turn the (potential) ordering bug in to a 
 (potential) breach of immutability.
For that you will need READ to happen before atomicLoad (in which case it will just dereference null pointer?) or MODIFY to happen after atomicStore?
The first one is definitely not allowed, I'm not worried about that. But is it actually forbidden for the MODIFY to happen after atomicStore?
Nov 12 2018
parent Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 13:17:09 UTC, John Colvin wrote:
 The first one is definitely not allowed, I'm not worried about 
 that. But is it actually forbidden for the MODIFY to happen 
 after atomicStore?
Well, the topic is a little abstract, but this description looks pretty decent - way better than whatever I saw before: https://en.cppreference.com/w/c/atomic/memory_order
Nov 12 2018