digitalmars.D.ldc - LDC optimized builds causing unexpected behaviour when sharing
- Keivan Shah (29/29) Jun 01 2022 Hello,
- Johan (10/40) Jun 01 2022 This is not a bug.
- Keivan Shah (14/25) Jun 01 2022 Hey Johan,
- Kagamin (4/7) Jun 01 2022 No, `shared` is a low level primitive. If you want one size fits
- Keivan Shah (10/17) Jun 01 2022 Hey Kagamin,
- kinke (17/19) Jun 01 2022 `core.atomic.atomic{Load,Store}` handle this:
- Keivan Shah (12/31) Jun 01 2022 Hey kinke,
- Johan (8/40) Jun 02 2022 `atomicFence` only works for ordering memory operations on a
- Keivan Shah (11/18) Jun 02 2022 Hey Johan,
- rm (6/27) Jun 06 2022 You can check this:
- Kagamin (3/5) Jun 15 2022 You should disable postblit or it will be copyable without atomic
- David Nadlinger (14/16) Jun 04 2022 `atomicFence` should work across threads as well; it inserts actual
- Johan (18/24) Jun 05 2022 What I meant to say is that the fence only works for ordering
Hello,

I am encountering a strange bug when making an optimized build with LDC. Can someone please help me debug it?

```
void main() {
    import std.stdio : writeln;
    import core.thread;

    // Running this code on LDC with `-O` leads to infinite loop
    writeln("Starting...");
    bool done = false; // Making this `shared` also doesn't work
    new Thread({ done = true; }).start();
    while (!done) { } // Wait for done
    writeln("Done!");
}
```

The program ends up in an infinite loop if `-O` is passed to LDC and works normally without it. It also works fine in both cases with dmd. I thought this was due to the variable `done` being shared between threads, so I tried marking the variable as `shared`, but even that does not solve the problem. Making the variable global and using `__gshared` doesn't work either. Finally I was able to solve the problem by using [core.volatile](https://dlang.org/phobos/core_volatile.html), but that doesn't look like a very good solution. So I wanted to understand the main problem here and also verify that this is not an LDC bug.
Jun 01 2022
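For reference, the `core.volatile` workaround mentioned in the post can be sketched roughly like this (illustrative only; `core.volatile` operates on integer pointers, hence the `ubyte` flag, and — as noted later in the thread — its documentation advises against using it for inter-thread communication):

```d
// Sketch of the core.volatile workaround described above; shown only to
// illustrate why it "works" under -O, not as a recommended pattern.
import core.thread;
import core.volatile : volatileLoad, volatileStore;
import std.stdio : writeln;

void main()
{
    ubyte done = 0; // core.volatile works on integer pointers, so use ubyte
    writeln("Starting...");
    new Thread({ volatileStore(&done, 1); }).start();
    // volatileLoad forces a fresh read on every iteration, so the optimizer
    // cannot hoist the load out of the loop.
    while (!volatileLoad(&done)) { }
    writeln("Done!");
}
```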
On Wednesday, 1 June 2022 at 07:02:08 UTC, Keivan Shah wrote:
> [...] The program ends up in an infinite loop if `-O` is passed to LDC and works normally without it. [...]

This is not a bug.

The solution with `volatileLoad` works, but I think it technically isn't quite correct; the documentation says: "They should not be used for communication between threads."

There are many articles online about this kind of conditional loop on a variable that is modified by another thread. For example:
https://stackoverflow.com/questions/10091112/using-a-global-variable-to-control-a-while-loop-in-a-separate-thread
https://devblogs.microsoft.com/oldnewthing/20180924-00/?p=99805

-Johan
Jun 01 2022
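To illustrate the point above: with a plain (non-atomic, non-volatile) flag, the optimizer may assume nothing in the loop body can change `done` and hoist the load out of the loop. A sketch of that transformation (the rewritten form is illustrative, not literal compiler output):

```d
// What -O is allowed to do to the original loop: the loop body contains no
// code that could (per the language's rules for ordinary data) modify
// `done`, so the load is moved out of the loop.
void waitAsWritten(ref bool done)
{
    while (!done) { } // as written: reads `done` each iteration, or so it seems
}

void waitAsOptimized(ref bool done)
{
    bool cached = done; // single load, hoisted out of the loop
    while (!cached) { } // spins forever if `done` was false at entry
}
```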
On Wednesday, 1 June 2022 at 08:23:18 UTC, Johan wrote:
> This is not a bug. The solution with `volatileLoad` works, but I think it technically isn't quite correct [...]

Hey Johan,

Thanks for the quick response. As you mentioned, even I don't feel the solution with `volatileLoad` is correct, and so I was looking for something else. The [documentation](https://dlang.org/articles/migrate-to-shared.html#shared) stated that

> Global data that is meant to be shared among multiple threads should be marked with the shared keyword

So to me it seemed like marking the variable as `shared` should have made it accessible as well as taken care of the memory ordering. That, plus the fact that the snippet works in `dmd` but not `ldc`, made me think it maybe was a bug. Anyway, I have gotten a better understanding of the issue now; I will probably use [core.atomic.atomicFence](https://dlang.org/phobos/core_atomi

Thanks and Regards,
Keivan Shah.
Jun 01 2022
On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:
> So to me it seemed like marking the variable as `shared` should have made it accessible as well as it takes care of the memory ordering.

No, `shared` is a low-level primitive. If you want one-size-fits-all concurrency, use a mutex, a synchronized statement, or std.concurrency.
Jun 01 2022
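A message-passing version of the original example along the lines suggested above, sketched with std.concurrency (the worker signals completion with a message instead of writing a shared flag):

```d
// Sketch: replace the shared flag with std.concurrency message passing.
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

void worker(Tid owner)
{
    // ... do the actual work here ...
    owner.send(true); // signal completion to the owner thread
}

void main()
{
    writeln("Starting...");
    spawn(&worker, thisTid);
    receiveOnly!bool(); // blocks until the worker sends; no spinning needed
    writeln("Done!");
}
```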
On Wednesday, 1 June 2022 at 15:31:40 UTC, Kagamin wrote:
> No, `shared` is a low level primitive. If you want one size fits all concurrency, use mutex, synchronized statement or std.concurrency.

Hey Kagamin,

Thanks, I am starting to understand that a little now. It seems that I was relying on some incorrect assumptions and will need to make some overall logic changes. I will look into using synchronized, but my use-case might be fine with `core.atomic` for now, since I need a fast, lightweight way of sharing a few variables between threads.

Regards and Thanks,
Keivan Shah.
Jun 01 2022
On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:
> probably will use [core.atomic.atomicFence](https://dlang.org/phobos/core_atomi

`core.atomic.atomic{Load,Store}` handle this:

```
void main() {
    import std.stdio : writeln;
    import core.thread;
    import core.atomic;

    writeln("Starting...");
    shared bool done = false;
    new Thread({ atomicStore(done, true); }).start();
    while (!atomicLoad(done)) { } // Wait for done
    writeln("Done!");
}
```
Jun 01 2022
On Wednesday, 1 June 2022 at 16:46:51 UTC, kinke wrote:
> `core.atomic.atomic{Load,Store}` handle this: [...]

Hey kinke,

Thanks for the snippet. I have some existing code that follows a pattern similar to the example I gave. There are a lot of variables shared between the 2 threads, so instead of changing the reads and writes of every variable to `atomic{Store,Load}`, I was wondering whether `atomicFence` might achieve the same effect with far fewer code changes. Would that be recommended, or should I use `atomic{Store,Load}` instead?

Regards and Thanks,
Keivan Shah
Jun 01 2022
On Thursday, 2 June 2022 at 04:33:39 UTC, Keivan Shah wrote:
> [...] I was wondering whether `atomicFence` might achieve the same effect with far fewer code changes. [...]

`atomicFence` only works for ordering memory operations on a single thread, not ordering between threads. You can either use a mutex (to coordinate between threads), or need all stores/loads to be atomic with `atomic{Store,Load}`.

I am surprised that there is no `core.atomic!int` to simplify your life. Perhaps you should make a feature request :)

-Johan
Jun 02 2022
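As a sketch of what such an atomic-variable type could look like if built on top of `core.atomic` (the name `Atomic` and its interface here are hypothetical, not an existing druntime API):

```d
// Hypothetical minimal Atomic!T wrapper over core.atomic; illustrative only.
import core.atomic : atomicLoad, atomicOp, atomicStore;

shared struct Atomic(T)
{
    private T value;

    T load() { return atomicLoad(value); }
    void store(T v) { atomicStore(value, v); }
    // Forward compound assignment (+=, -=, ...) to atomicOp.
    T opOpAssign(string op)(T v) { return atomicOp!(op ~ "=")(value, v); }
}

void main()
{
    shared Atomic!int counter;
    counter.store(1);
    counter += 41;
    assert(counter.load() == 42);
}
```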
On Thursday, 2 June 2022 at 11:56:07 UTC, Johan wrote:
> [...] You can either use a mutex (to coordinate between threads), or need all store/loads to be atomic with `atomic{Store,Load}`. [...]

Hey Johan,

Thanks for the reply. I finally ended up using `atomic{Store,Load}` with some memory ordering after reading about them a bit. Atomic variables (`core.atomic!int`) would have been a great feature. I have no further use-cases for them, and they are probably a bit too complex for my understanding to make an informed request right now, but I will definitely make one when my understanding improves in the future.

Thanks and regards,
Keivan Shah.
Jun 02 2022
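The "`atomic{Store,Load}` with some memory ordering" approach can be sketched with a release/acquire pair from `core.atomic` (the payload value is illustrative):

```d
// Release/acquire sketch: the release store publishes the plain write to
// `payload`; the acquire load guarantees the reader then sees it.
import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.thread;

shared bool ready;
__gshared int payload;

void main()
{
    new Thread({
        payload = 42;                               // plain (non-atomic) write
        atomicStore!(MemoryOrder.rel)(ready, true); // release: publish payload
    }).start();

    while (!atomicLoad!(MemoryOrder.acq)(ready)) { } // acquire: pairs with rel
    assert(payload == 42); // visible once the acquire observes the release
}
```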
On 02/06/2022 20:41, Keivan Shah wrote:
> [...] Atomic variables (`core.atomic!int`) would have been a great feature. [...]

You can check this:
https://github.com/rymrg/drm/blob/main/atomic.d

It works well enough for my needs at the moment, but it requires more extensive testing to see if it plays nicely with @safe. For __gshared it should work just fine.
Jun 06 2022
On Monday, 6 June 2022 at 08:08:45 UTC, rm wrote:
> You can check this:
> https://github.com/rymrg/drm/blob/main/atomic.d

You should disable the postblit, or it will be copyable without an atomic operation.
Jun 15 2022
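The postblit point, sketched (struct and member names are illustrative, not taken from the linked code): a default postblit would let the wrapper be copied via a plain, non-atomic read of `value`, bypassing the atomic API.

```d
import core.atomic : atomicLoad, atomicStore;

shared struct Atomic(T)
{
    private T value;

    @disable this(this); // no postblit: a copy would read `value` non-atomically

    T load() { return atomicLoad(value); }
    void store(T v) { atomicStore(value, v); }
}
```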
On 2 Jun 2022, at 7:56, Johan via digitalmars-d-ldc wrote:
> `atomicFence` only works for ordering memory operations on a single thread, not ordering between threads.

`atomicFence` should work across threads as well; it inserts actual processor fences as necessary, in addition to the requisite effect on compiler optimisations.

However, in practice, code using standalone fences is often harder to make sense of than code using atomic loads/stores, unless the relevant accesses are explicitly documented in enough detail. Just sprinkling `atomicFence` over an existing code base to make it thread-safe is pretty much guaranteed to fail. Thus, preferring atomic loads/stores is very sound advice in practice.

(There are also fences that are effectively for compiler optimisations only and usually go by volatile fences or signal fences; perhaps that's what Johan was thinking of.)

— David
Jun 04 2022
On Saturday, 4 June 2022 at 21:33:30 UTC, David Nadlinger wrote:
> `atomicFence` should work across threads as well; it inserts actual processor fences as necessary, in addition to the requisite effect on compiler optimisations.

What I meant to say is that the fence only works for ordering memory operations of the thread that contains the fence, as observable by other threads. Indeed, you use it to coordinate between threads. If thread A has a fence somewhere, then that fence orders the memory operations of thread A that thread B will observe. So you can use variables from thread A in thread B, for example: `if (value_is_ok_flag) foo(value);`. The fence in thread A (between writing `value` and `ok_flag`) prevents `ok_flag==true` from being observed in thread B _before_ the new `value` is observed.

Because the OP's issue is not a thread synchronization issue but a re-read optimization (the compiler only needs to read the variable once, instead of upon every loop iteration), the atomic fence is likely not to work. I think atomicLoad works because it is _both_ `volatile` and `atomic`.

cheers,
Johan
Jun 05 2022
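The `value`/`ok_flag` example from the message above, sketched with `atomicFence` (names follow the message; the flag accesses are kept atomic so the compiler cannot hoist or elide them, which is the separate re-read problem the thread started with):

```d
// Sketch of the paired-fence pattern from the message above.
import core.atomic : atomicFence, atomicLoad, atomicStore, MemoryOrder;

shared bool value_is_ok_flag;
__gshared int value;

void threadA() // producer
{
    value = 123;       // plain write
    atomicFence();     // orders the write to `value` before the flag store
    atomicStore!(MemoryOrder.raw)(value_is_ok_flag, true);
}

void threadB() // consumer
{
    if (atomicLoad!(MemoryOrder.raw)(value_is_ok_flag))
    {
        atomicFence(); // orders the flag load before the read of `value`
        assert(value == 123); // guaranteed by the paired fences
    }
}
```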