
digitalmars.D - A breach of immutability due to memory allocation and implicit conversions to immutable?

reply John Colvin <john.loughran.colvin gmail.com> writes:
Take a look at this (I think?) supposed-to-be-thread-safe code 
according to the rules of D:

import std.stdio;
import core.thread;
import core.atomic;

int* foo() pure
{
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
}

shared immutable(int)* g;

void bar()
{
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
}

void main()
{
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
}

What stops the CPU executing this such that MODIFY happens 
between READ and READ AGAIN ?

To aid in the thought experiment, imagine if we replace `*ret = 
3` with `*ret = longComputation()`? It might help your reasoning 
about re-ordering if you consider `foo` inlined.

Is implicit conversion to the shared part of immutable actually 
safe, or does it secretly hide data races?

Another way of putting this:
Is `ret` secretly shared due to implicit conversion to immutable?
Nov 11 2018
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code according 
 to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens between READ 
 and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret = 3` with 
 `*ret = longComputation()`? It might help your reasoning about 
 re-ordering if you consider `foo` inlined.
Hm... I wouldn't expect the compiler to allow reordering across a return boundary, even if inlining occurs. Implicit conversion of pure return values depends on that.

I don't know all the rules of reordering, but this would definitely cause races if it were allowed to reorder the store of the data into *ret and the store of the pointer into g.
 
 Is implicit conversion to the shared part of immutable actually safe, or 
 does it secretly hide data races?
 
 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to immutable?
If it is, we need to make it so it's not.

-Steve
Nov 11 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/11/18 3:47 PM, Steven Schveighoffer wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }

 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }

 What stops the CPU executing this such that MODIFY happens between 
 READ and READ AGAIN ?
Wait, wouldn't the atomicStore create a barrier, such that all the code before it must execute?

-Steve
Nov 11 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Sun, 11 Nov 2018 15:48:31 -0500, Steven Schveighoffer wrote:
 Wait, wouldn't the atomicStore create a barrier, such that all the code
 before it must execute?
It creates a smaller barrier that's supposed to be good enough (`LOCK XCHG` instead of `MFENCE`), and it *is* good enough when dealing with value types.

I'm not good with multithreading, so I'm not entirely certain whether this is an actual problem in druntime. `LOCK XCHG` will serialize access for the global variable and any cache line containing it, and it will also limit reordering. I haven't found anything remotely definitive on whether it provides a strong enough barrier for reordering and memory operations to suffice here, but I'd put a small bet in favor of it.
Nov 11 2018
prev sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Sunday, 11 November 2018 at 20:47:16 UTC, Steven Schveighoffer 
wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.
 Hm... I wouldn't expect the compiler to allow reordering across a 
 return boundary, even if inlining occurs. Implicit conversion of pure 
 return values depends on that. I don't know all the rules of 
 reordering, but this would definitely cause races if it were allowed 
 to reorder the store of the data into *ret and the store of the 
 pointer into g.
The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?
 Wait, wouldn't the atomicStore create a barrier, such that all 
 the code before it must execute?
 
 -Steve
Not a full memory barrier on most (all relevant?) platforms.
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 09:11:04 UTC, John Colvin wrote:
 On Sunday, 11 November 2018 at 20:47:16 UTC, Steven 
 Schveighoffer wrote:
 On 11/11/18 2:21 PM, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe 
 code according to the rules of D:
 
 import std.stdio;
 import core.thread;
 import core.atomic;
 
 int* foo() pure
 {
      auto ret = new int;
      *ret = 3;  // MODIFY
      return ret;
 }
 
 shared immutable(int)* g;
 
 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ AGAIN
 }
 
 void main()
 {
      auto t = new Thread(&bar).start();
      immutable(int)* a = foo();
      g.atomicStore(a);
      t.join;
 }
 
 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?
 
 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.
 Hm... I wouldn't expect the compiler to allow reordering across a 
 return boundary, even if inlining occurs. Implicit conversion of pure 
 return values depends on that. I don't know all the rules of 
 reordering, but this would definitely cause races if it were allowed 
 to reorder the store of the data into *ret and the store of the 
 pointer into g.
The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?
 Wait, wouldn't the atomicStore create a barrier, such that all 
 the code before it must execute?
 
 -Steve
Not a full memory barrier on most (all relevant?) platforms.
Hmm, ok, so it seems that perhaps this only occurs with relaxed memory ordering: with acquire-release (or better), if bar sees the result of main's write to g (i.e. it gets out of the while loop), then it must also see all side-effects of everything main did before that write.

However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
 However, with relaxed memory ordering I'm now more convinced 
 that this would break immutability, which in turn means that 
 the oft-made statement "immutable data doesn't require 
 synchronisation" isn't true.
The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!
Nov 12 2018
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12.11.18 16:00, John Colvin wrote:
 On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
 However, with relaxed memory ordering I'm now more convinced that this 
 would break immutability, which in turn means that the oft-made 
 statement "immutable data doesn't require synchronisation" isn't true.
 The correct statement would be more like "reading immutable data 
 doesn't require synchronisation if the method of obtaining the 
 reference to the immutable data has at least acquire-release 
 semantics". Which is considerably less snappy and confidence 
 inspiring haha!
Yes. More generally, I think implicit sharing of immutable data was a mistake. (All it does is forbid thread-local immutable data -- a valid use case.)
Nov 12 2018
prev sibling parent reply Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of obtaining 
 the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver the data only once; after that you can read it without synchronization.
Nov 13 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of 
 obtaining the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
 Well, without acquire-release you can't meaningfully share data - it 
 simply doesn't reach another thread. Does it even count as sharing? 
 With immutable data you need to deliver the data only once; after 
 that you can read it without synchronization.
What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"? How would the incrementing of the reference count in shared_ptr work then?
Nov 13 2018
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
 The correct statement would be more like "reading immutable 
 data doesn't require synchronisation if the method of 
 obtaining the reference to the immutable data has at least 
 acquire-release semantics". Which is considerably less snappy 
 and confidence inspiring haha!
 Well, without acquire-release you can't meaningfully share data - it 
 simply doesn't reach another thread. Does it even count as sharing? 
 With immutable data you need to deliver the data only once; after 
 that you can read it without synchronization.
What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?
I *think* it isn't guaranteed to leave cache otherwise.
Nov 13 2018
prev sibling parent reply Kagamin <spam here.lot> writes:
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 What - precisely - do you mean by "without acquire-release you 
 can't meaningfully share data - it simply doesn't reach another 
 thread"?
The data simply remains in processor cache.
 How would the incrementing of the reference count in shared_ptr 
 work then?
You mean relaxed order? LLVM has a better description of it: http://llvm.org/docs/Atomics.html#monotonic - it's still synchronization, just weak: it synchronizes only that one variable, not the entire cache. And the reason you need full synchronization on the decrement is that you may need to run the destructor, which needs the latest state of the object and everything it references.
Nov 14 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Wednesday, 14 November 2018 at 08:38:55 UTC, Kagamin wrote:
 On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
 What - precisely - do you mean by "without acquire-release you 
 can't meaningfully share data - it simply doesn't reach 
 another thread"?
The data simply remains in processor cache.
 How would the incrementing of the reference count in 
 shared_ptr work then?
 You mean relaxed order? Looks like llvm has a better description for 
 that: http://llvm.org/docs/Atomics.html#monotonic - it's still 
 synchronization, just weak: it synchronizes only that one variable, 
 not the entire cache. And you need full synchronization on the 
 decrement because you may need to run the destructor, which needs 
 the latest state of the object and everything it references.
What relaxed does is roughly: "all relaxed operations on this value happen atomically in some order".

int* foo() pure
{
     auto ret = new int; // A
     *ret = 3; // B
     return ret;
}

shared immutable(int)* g;

void bar()
{
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad;
     assert(*d == 3); // D
     assert(*d == 3); // E
}

void main()
{
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore!(MemoryOrder.relaxed)(a); // C
     t.join;
}

The relevant ordering restrictions are as follows:

A must happen before B, because `ret` must exist to be used.
A must happen before C, because `a` must exist to be used, and it cannot exist until `ret` does.
C must happen before D & E, because it's the only way out of the while loop.

So we have A -> C -> D -> E, but B can sit anywhere after A, including between D and E!

To address the "it won't leave the cache" idea - which I'm guessing is actually meant to be "relaxed atomics make no promises about when changes become visible" - you should consider whether acq-rel actually gives you any more guarantee about that in this case (how many iterations of the while loop would you expect?), but also that a lack of a guarantee of timeliness is not the same as a guarantee of never.
Nov 14 2018
parent John Colvin <john.loughran.colvin gmail.com> writes:
Correcting the obvious typos:

int* foo() pure {
     auto ret = new int; // A
     *ret = 3; // B
     return ret;
}

shared immutable(int)* g;

void bar() {
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad!(MemoryOrder.raw);
     assert(*d == 3); // D
     assert(*d == 3); // E
}

void main() {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore!(MemoryOrder.raw)(a); // C
     t.join;
}
Nov 15 2018
prev sibling parent reply Rubn <where is.this> writes:
On Sunday, 11 November 2018 at 19:21:10 UTC, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
 }

 void main()
 {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
 }

 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?

 To aid in the thought experiment, imagine if we replace `*ret = 
 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.

 Is implicit conversion to the shared part of immutable actually 
 safe, or does it secretly hide data races?

 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to 
 immutable?
Maybe I'm missing something but:

void bar()
{
     immutable(int)* d = null;
     while (d is null)
         d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ - these two reads will always be the same
}

They will always be the same, as what you are changing is the pointer, but here you atomicLoad() the pointer, so you will always have the same pointer. The value will only change if you do another atomicLoad or change the value that the pointer points to (which you don't do anywhere in your example).

int* foo() /* pure */
{
     static auto ret = new int; // Note: static here
     *ret = 3;  // MODIFY
     return ret;
}

You might have a problem if "ret" here is static. But then I don't think the function would be pure anymore - or at the very least it shouldn't be pure.
Nov 11 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Sunday, 11 November 2018 at 23:29:13 UTC, Rubn wrote:
 On Sunday, 11 November 2018 at 19:21:10 UTC, John Colvin wrote:
 Take a look at this (I think?) supposed-to-be-thread-safe code 
 according to the rules of D:

 import std.stdio;
 import core.thread;
 import core.atomic;

 int* foo() pure
 {
     auto ret = new int;
     *ret = 3;  // MODIFY
     return ret;
 }

 shared immutable(int)* g;

 void bar()
 {
     immutable(int)* d = null;
     while (d is null)
     	d = g.atomicLoad;
     assert(*d == 3); // READ
     assert(*d == 3); // READ AGAIN
 }

 void main()
 {
     auto t = new Thread(&bar).start();
     immutable(int)* a = foo();
     g.atomicStore(a);
     t.join;
 }

 What stops the CPU executing this such that MODIFY happens 
 between READ and READ AGAIN ?

 To aid in the thought experiment, imagine if we replace `*ret 
 = 3` with `*ret = longComputation()`? It might help your 
 reasoning about re-ordering if you consider `foo` inlined.

 Is implicit conversion to the shared part of immutable 
 actually safe, or does it secretly hide data races?

 Another way of putting this:
 Is `ret` secretly shared due to implicit conversion to 
 immutable?
 Maybe I'm missing something but:

 void bar()
 {
      immutable(int)* d = null;
      while (d is null)
          d = g.atomicLoad;
      assert(*d == 3); // READ
      assert(*d == 3); // READ - these two reads will always be the same
 }

 They will always be the same, as what you are changing is the 
 pointer, but here you atomicLoad() the pointer, so you will always 
 have the same pointer. The value will only change if you do another 
 atomicLoad or change the value that the pointer points to (which you 
 don't do anywhere in your example).
The MODIFY line changes the value the pointer points at. I'm asking whether it's possible that either READ happens before MODIFY (which would be v. bad). Using two reads instead of one was just to turn the (potential) ordering bug into a (potential) breach of immutability.
 int* foo() /* pure */
 {
     static auto ret = new int; // Note: Static here
     *ret = 3;  // MODIFY
     return ret;
 }

 You might have a problem if "ret" here is static. But then I 
 don't think the function would be pure anymore. Or rather at 
 the very least it shouldn't be pure.
Yes, that isn't pure, so the implicit conversion to immutable wouldn't work. It also doesn't compile at all, due to using a non-constant CTFE pointer as an initialiser.
Nov 12 2018
parent reply Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 09:19:26 UTC, John Colvin wrote:
 The MODIFY line changes the value where the pointer looks. I'm 
 asking whether it's possible that either READ happens before 
 MODIFY (which would be v. bad). Using two reads instead of one 
 was just to turn the (potential) ordering bug in to a 
 (potential) breach of immutability.
For that you would need READ to happen before the atomicLoad (in which case it would just dereference a null pointer?) or MODIFY to happen after the atomicStore?
Nov 12 2018
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 12 November 2018 at 11:54:36 UTC, Kagamin wrote:
 On Monday, 12 November 2018 at 09:19:26 UTC, John Colvin wrote:
 The MODIFY line changes the value where the pointer looks. I'm 
 asking whether it's possible that either READ happens before 
 MODIFY (which would be v. bad). Using two reads instead of one 
 was just to turn the (potential) ordering bug in to a 
 (potential) breach of immutability.
For that you will need READ to happen before atomicLoad (in which case it will just dereference null pointer?) or MODIFY to happen after atomicStore?
The first one is definitely not allowed, I'm not worried about that. But is it actually forbidden for the MODIFY to happen after atomicStore?
Nov 12 2018
parent Kagamin <spam here.lot> writes:
On Monday, 12 November 2018 at 13:17:09 UTC, John Colvin wrote:
 The first one is definitely not allowed, I'm not worried about 
 that. But is it actually forbidden for the MODIFY to happen 
 after atomicStore?
Well, the topic is a little abstract, but this description looks pretty decent - way better than whatever I saw before: https://en.cppreference.com/w/c/atomic/memory_order
Nov 12 2018