digitalmars.D - Fixing core.atomic
- rm (4/4) May 30 2021 I plan on making core.atomic more consistent and easier to use. Please
- Johan Engelen (6/10) May 30 2021 Pretty nice initiative.
- Johan Engelen (4/8) May 30 2021 I would also make the template accept more than just integral
- rm (5/15) May 30 2021 The plan is to support pointers as this will be really useful for lock
- IGotD- (9/13) May 30 2021 Definitely, the D atomic library is cumbersome to use. C++
- Zardoz (2/17) May 30 2021 Yes, please! This should be merged ASAP.
- sarn (7/16) May 31 2021 The trouble is that only works in a handful of simple cases
- Guillaume Piolat (5/9) May 31 2021 I have once implemented an atomic struct like this and the first
- IGotD- (9/13) May 31 2021 Yes, so the programmers must be aware of this. In C++ only the
- Paul Backus (2/4) May 31 2021 `@disable opBinary` should do it.
- Guillaume Piolat (4/6) May 31 2021 I don't think so, it's a bit implicit meaning to be atomic, so it
- Guillaume Piolat (2/4) May 31 2021 big*
- rm (4/12) May 31 2021 I don't consider this a problem. In this case you have a load and a
- Guillaume Piolat (3/6) May 31 2021 I prefer atomicLoad and atomicStore then, because it's explicit
- IGotD- (3/6) May 31 2021 Yes, you can use it if you want to. We will not remove the
- Max Haughton (4/11) May 31 2021 That and the C++ `std::atomic` will provide the same semantics on
- Ola Fosheim Grøstad (8/20) May 31 2021 Are you sure?
- Max Haughton (3/17) May 31 2021 "Atomic types are also allowed to be sometimes lock-free"
- Ola Fosheim Grøstad (4/9) May 31 2021 Yes, that is what the trait is for?
- Max Haughton (7/17) May 31 2021 This is orthogonal to the example I posted, what if the hardware
- Ola Fosheim Grøstad (12/19) May 31 2021 I am not sure I understand what you mean now. Locking operations
- Ola Fosheim Grøstad (4/13) May 31 2021 Yes, how often do people use this anyway? I try to avoid
- rm (7/21) Jun 02 2021 It's useful if you want to implement known concurrency algorithms with
- Ola Fosheim Grøstad (5/8) Jun 02 2021 Have you ever used Lamport's Bakery, though?
- rm (13/21) Jun 02 2021 Not Lamport's Bakery. But I did implement some primitives. betterC does
- Ola Fosheim Grøstad (4/5) Jun 02 2021 IIRC some architectures provide more efficient inc/dec atomics
- Ola Fosheim Grøstad (7/12) Jun 02 2021 No, I think that was wrong, I think they usually return the
- Max Haughton (4/17) Jun 02 2021 Are they always fixed latency? No dependence on the load store
- rm (3/20) Jun 02 2021 At least on x86-TSO, an atomic operation forces the cache to be flushed
I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md
May 30 2021
On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md

Pretty nice initiative. `fadd` --> `fetchadd` or `increment`. `fadd` and `fsub` look like floating-point add/sub to me...

cheers, Johan
May 30 2021
On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md

I would also make the template accept more than just integral types, and only add the increment/binop functions for integral types.
May 30 2021
On 30/05/2021 23:49, Johan Engelen wrote:
> On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
>> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md
>
> I would also make the template accept more than just integral types, and only add the increment/binop functions for integral types.

The plan is to support pointers as this will be really useful for lock free data structures. Currently `isPointer` is defined in `core.internal.traits`. So I haven't written this usage.
May 30 2021
On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md

Definitely, the D atomic library is cumbersome to use. C++ std::atomic supports operator overloading for example. `atomicVar += 1;` will create an atomic add as atomicVar is of the atomic type. D doesn't have this and I think D should add atomic types like std::atomic<T>.

I like this because then I can easily switch between atomic operations and normal operations by just changing the type and very few changes.
May 30 2021
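To make the `atomicVar += 1;` idea concrete, here is a minimal sketch of a wrapper type built on the existing `core.atomic` primitives. The name `AtomicCounter` and its `load` method are hypothetical and are not taken from the linked `atomic.d`; the sketch only assumes `atomicOp` and `atomicLoad` as documented for `core.atomic`.

```d
import core.atomic;

// Hypothetical wrapper for illustration only; NOT the API of the linked atomic.d.
struct AtomicCounter
{
    private shared int value;

    // `c += n` and `c -= n` each become a single atomic read-modify-write
    // (sequentially consistent) and return the updated value.
    int opOpAssign(string op)(int rhs) shared
        if (op == "+" || op == "-")
    {
        return atomicOp!(op ~ "=")(value, rhs);
    }

    // Reads stay explicit: there is no implicit conversion to int.
    int load() shared const
    {
        return atomicLoad(value);
    }
}
```

With this, `shared AtomicCounter c; c += 1;` compiles to one atomic add, while any read has to go through `c.load()`.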
On Sunday, 30 May 2021 at 20:58:56 UTC, IGotD- wrote:
> On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
>> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md
>
> Definitely, the D atomic library is cumbersome to use. C++ std::atomic supports operator overloading for example. `atomicVar += 1;` will create an atomic add as atomicVar is of the atomic type. D doesn't have this and I think D should add atomic types like std::atomic<T>. I like this because then I can easily switch between atomic operations and normal operations by just changing the type and very few changes.

Yes, please! This should be merged ASAP.
May 30 2021
On Sunday, 30 May 2021 at 20:58:56 UTC, IGotD- wrote:
> Definitely, the D atomic library is cumbersome to use. C++ std::atomic supports operator overloading for example. `atomicVar += 1;` will create an atomic add as atomicVar is of the atomic type. D doesn't have this and I think D should add atomic types like std::atomic<T>.

That was a design choice. It's because of this:

> I like this because then I can easily switch between atomic operations and normal operations by just changing the type and very few changes.

The trouble is that only works in a handful of simple cases (e.g., you just want a simple event counter that doesn't affect flow of control). For anything else, you need to think carefully about exactly where the atomic operations are, so there's no point making them implicit.
May 31 2021
On 01/06/2021 5:50, sarn wrote:
> On Sunday, 30 May 2021 at 20:58:56 UTC, IGotD- wrote:
>> Definitely, the D atomic library is cumbersome to use. C++ std::atomic supports operator overloading for example. `atomicVar += 1;` will create an atomic add as atomicVar is of the atomic type. D doesn't have this and I think D should add atomic types like std::atomic<T>.
>
> That was a design choice. It's because of this:
>
>> I like this because then I can easily switch between atomic operations and normal operations by just changing the type and very few changes.
>
> The trouble is that only works in a handful of simple cases (e.g., you just want a simple event counter that doesn't affect flow of control). For anything else, you need to think carefully about exactly where the atomic operations are, so there's no point making them implicit.

I agree about that. One shouldn't simply access the same memory location atomically and non-atomically interchangeably. That is a source of many bugs, especially considering the kind of synchronization you'll have or not have as a result.

Still, there are cases where you *know* that your thread is the *only one* that can access this variable. In a case like this, only after you have made sure to synchronize can you also allow non-atomic access to the variable (though I'd still avoid this). The other case is going from non-atomic to atomic: after initializing the location with an allocator in a non-atomic manner, you move to using it atomically to synchronize between threads.

But regarding the design choice, if your intention is to prevent casting the atomic to non-atomic, you can simply wrap it in a struct and not allow access to the raw value. That should be sufficient.

Anyway, I disagree about the simple cases. Specifically, in the case of a simple event counter that isn't required for synchronization, you should be using relaxed. There is no need for sequential consistency in this case.
Jun 02 2021
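As an illustration of the relaxed event counter mentioned above, here is a minimal sketch using the explicit `core.atomic` functions. `MemoryOrder.raw` is D's name for relaxed ordering; the names `eventCount`, `recordEvent` and `eventsSoFar` are hypothetical, and the sketch assumes a druntime recent enough to provide `atomicFetchAdd`.

```d
import core.atomic;

// Hypothetical statistics counter: it is never used to publish other data,
// so no ordering with surrounding memory accesses is requested.
shared int eventCount;

void recordEvent()
{
    // Relaxed atomic increment: atomic, but without acquire/release semantics.
    atomicFetchAdd!(MemoryOrder.raw)(eventCount, 1);
}

int eventsSoFar()
{
    // Relaxed load: a snapshot that may already be stale when it is used.
    return atomicLoad!(MemoryOrder.raw)(eventCount);
}
```

The memory order is visible at each call site here, which is the explicitness argued for elsewhere in this thread; a wrapper type would have to expose the order as a template parameter or through separate methods.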
On Wednesday, 2 June 2021 at 14:50:44 UTC, rm wrote:
> *snip*

Sorry, but I don't feel like anything you wrote relates to anything I actually said. For example:

> Anyway, I disagree about the simple cases. Specifically, in the case of a simple event counter that isn't required for synchronization, you should be using relaxed. There is no need for sequential consistency in this case.

The "simple cases" comment was about how the "thread-safe value" abstraction only works at all in a few simple cases (such as an event counter that doesn't affect flow of control). No one has implied anything about what memory order you need for a counter. But if you do consider memory order, that's more reason to treat atomic operations as explicit atomic *operations*, and not wrap them in a "thread-safe value" abstraction.
Jun 02 2021
On 03/06/2021 1:04, sarn wrote:
> On Wednesday, 2 June 2021 at 14:50:44 UTC, rm wrote:
>> *snip*
>
> Sorry, but I don't feel like anything you wrote relates to anything I actually said. For example:
>
>> Anyway, I disagree about the simple cases. Specifically, in the case of a simple event counter that isn't required for synchronization, you should be using relaxed. There is no need for sequential consistency in this case.
>
> The "simple cases" comment was about how the "thread-safe value" abstraction only works at all in a few simple cases (such as an event counter that doesn't affect flow of control). No one has implied anything about what memory order you need for a counter. But if you do consider memory order, that's more reason to treat atomic operations as explicit atomic *operations*, and not wrap them in a "thread-safe value" abstraction.

I think I was conflating two replies into one and misread your response. Either way, it does allow for easier porting from C/C++ if such code is used.
Jun 06 2021
On Sunday, 30 May 2021 at 20:41:29 UTC, rm wrote:
> I plan on making core.atomic more consistent and easier to use. Please provide me with your feedback. https://github.com/rymrg/drm/blob/main/atomic.d https://github.com/rymrg/drm/blob/main/atomic_rationale.md

I have once implemented an atomic struct like this and the first thing that happened is that you would write: `s = s + 1;` Breaking atomicity.
May 31 2021
On Monday, 31 May 2021 at 08:18:35 UTC, Guillaume Piolat wrote:
> I have once implemented an atomic struct like this and the first thing that happened is that you would write: `s = s + 1;` Breaking atomicity.

Yes, so the programmers must be aware of this. In C++ only the compound assignments such as -= and +=, plus -- and ++, are supported. Is it possible to overload binary operators so that they cause a compiler error? In order to achieve the same as `s = s + 1;` you need to write `s.store(s.load() + 1)`.

However, the assignment operator, writing `s = 1`, is nice instead of `s.store(1)`.
May 31 2021
On Monday, 31 May 2021 at 08:52:33 UTC, IGotD- wrote:
> Is it possible to overload binary operators so that they cause a compiler error?

`@disable opBinary` should do it.
May 31 2021
On Monday, 31 May 2021 at 08:52:33 UTC, IGotD- wrote:
> However, the assignment operator writing s = 1 is nice instead of s.store(1).

I don't think so, it's a bit implicit meaning to be atomic, so it falls under "nice short syntax for something pretty much important": a terrible idea.
May 31 2021
On Monday, 31 May 2021 at 16:33:05 UTC, Guillaume Piolat wrote:
> On Monday, 31 May 2021 at 08:52:33 UTC, IGotD- wrote:
>
> bit

big*
May 31 2021
On 31/05/2021 11:18, Guillaume Piolat wrote:
> I have once implemented an atomic struct like this and the first thing that happened is that you would write: `s = s + 1;` Breaking atomicity.

I don't consider this a problem. In this case you have a load and a store. This is a non-atomic RMW. On the other hand, you do get sequential consistency synchronization from this process.
May 31 2021
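The difference between a load followed by a store and a single read-modify-write can be sketched with the existing `core.atomic` functions. The helper names `racyIncrement` and `atomicIncrement` are hypothetical and only serve to contrast the two forms.

```d
import core.atomic;

// Equivalent of `s = s + 1` on an atomic wrapper: two separate atomic
// operations. Each one is sequentially consistent, but another thread can
// update `s` between the load and the store, and that update is then lost.
void racyIncrement(ref shared int s)
{
    atomicStore(s, atomicLoad(s) + 1);
}

// One indivisible read-modify-write: no update can slip in between.
void atomicIncrement(ref shared int s)
{
    atomicOp!"+="(s, 1);
}
```

In a single thread both functions give the same result; they only differ when several threads increment the same variable concurrently.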
On Monday, 31 May 2021 at 09:26:36 UTC, rm wrote:
> I don't consider this a problem. In this case you have a load and a store. This is a non-atomic RMW. On the other hand, you do get sequential consistency synchronization from this process.

I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.
May 31 2021
On Monday, 31 May 2021 at 16:34:35 UTC, Guillaume Piolat wrote:
> I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.

Yes, you can use it if you want to. We will not remove the regular D atomic functions.
May 31 2021
On Monday, 31 May 2021 at 17:51:26 UTC, IGotD- wrote:
> On Monday, 31 May 2021 at 16:34:35 UTC, Guillaume Piolat wrote:
>> I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.
>
> Yes, you can use it if you want to. We will not remove the regular D atomic functions.

That and the C++ `std::atomic` will provide the same semantics on types of wide size (LDC and GDC seem to differ in behaviour here when you use the atomic primitive functions.)
May 31 2021
On Monday, 31 May 2021 at 20:01:57 UTC, Max Haughton wrote:
> On Monday, 31 May 2021 at 17:51:26 UTC, IGotD- wrote:
>> On Monday, 31 May 2021 at 16:34:35 UTC, Guillaume Piolat wrote:
>>> I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.
>>
>> Yes, you can use it if you want to. We will not remove the regular D atomic functions.
>
> That and the C++ `std::atomic` will provide the same semantics on types of wide size (LDC and GDC seem to differ in behaviour here when you use the atomic primitive functions.)

Are you sure?

«All atomic types except for std::atomic_flag may be implemented using mutexes or other locking operations, rather than using the lock-free atomic CPU instructions.»

https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free

C/C++ is trying to be hardware-independent to a much larger extent than D.
May 31 2021
On Monday, 31 May 2021 at 20:43:37 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 31 May 2021 at 20:01:57 UTC, Max Haughton wrote:
>> On Monday, 31 May 2021 at 17:51:26 UTC, IGotD- wrote:
>>> [...]
>>
>> That and the C++ `std::atomic` will provide the same semantics on types of wide size (LDC and GDC seem to differ in behaviour here when you use the atomic primitive functions.)
>
> Are you sure? «All atomic types except for std::atomic_flag may be implemented using mutexes or other locking operations, rather than using the lock-free atomic CPU instructions.» https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free C/C++ is trying to be hardware-independent to a much larger extent than D.

"Atomic types are also allowed to be sometimes lock-free"

https://gcc.godbolt.org/z/Ph981GvY8

Note the use of library calls.
May 31 2021
On Monday, 31 May 2021 at 21:01:35 UTC, Max Haughton wrote:
>> https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free C/C++ is trying to be hardware-independent to a much larger extent than D.
>
> "Atomic types are also allowed to be sometimes lock-free"

Yes, that is what the trait is for? But with the limited hardware scope D has it surely can provide more convenient guarantees than C++ can?
May 31 2021
On Monday, 31 May 2021 at 21:08:37 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 31 May 2021 at 21:01:35 UTC, Max Haughton wrote:
>>> https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free C/C++ is trying to be hardware-independent to a much larger extent than D.
>>
>> "Atomic types are also allowed to be sometimes lock-free"
>
> Yes, that is what the trait is for? But with the limited hardware scope D has it surely can provide more convenient guarantees than C++ can?

This is orthogonal to the example I posted, what if the hardware can't perform the operation using simple atomic instructions, you might as well provide the fallback case anyway - both for easier correctness and to kill two birds with one API.

Guaranteeing that the type uses the instructions anyway is up to the implementation, but the guarantee can be made nonetheless.
May 31 2021
On Monday, 31 May 2021 at 21:23:17 UTC, Max Haughton wrote:
> This is orthogonal to the example I posted, what if the hardware can't perform the operation using simple atomic instructions, you might as well provide the fallback case anyway - both for easier correctness and to kill two birds with one API. Guaranteeing that the type uses the instructions anyway is up to the implementation, but the guarantee can be made nonetheless.

I am not sure I understand what you mean now. Locking operations may imply completely different algorithms. In C++ you can either do a static compile time check using `is_always_lock_free` or a dynamic runtime check (then take an alternative path if it isn't). The dynamic check is to allow higher performance when it can be used, but that might require a completely different algorithm?

Or with C++20 you have optional `atomic_signed_lock_free` and `atomic_unsigned_lock_free`, which I probably will use when I get them.
May 31 2021
On Monday, 31 May 2021 at 16:34:35 UTC, Guillaume Piolat wrote:
> On Monday, 31 May 2021 at 09:26:36 UTC, rm wrote:
>> I don't consider this a problem. In this case you have a load and a store. This is a non-atomic RMW. On the other hand, you do get sequential consistency synchronization from this process.
>
> I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.

Yes, how often do people use this anyway? I try to avoid concurrency issues and have found that I tend to end up using compare-exchange when I have to.
May 31 2021
On 31/05/2021 23:33, Ola Fosheim Grøstad wrote:
> On Monday, 31 May 2021 at 16:34:35 UTC, Guillaume Piolat wrote:
>> On Monday, 31 May 2021 at 09:26:36 UTC, rm wrote:
>>> I don't consider this a problem. In this case you have a load and a store. This is a non-atomic RMW. On the other hand, you do get sequential consistency synchronization from this process.
>>
>> I prefer atomicLoad and atomicStore then, because it's explicit and it's useless to hide the fact it's atomic behind nice syntax.
>
> Yes, how often do people use this anyway? I try to avoid concurrency issues and have found that I tend to end up using compare-exchange when I have to.

It's useful if you want to implement known concurrency algorithms with SC semantics, such as Lamport's lock (which requires SC).

http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2
http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2

It's there to nudge people away from using the weaker semantics and allow easy synchronization.
Jun 02 2021
On Wednesday, 2 June 2021 at 14:08:32 UTC, rm wrote:
> It's useful if you want to implement known concurrency algorithms with SC semantics, such as Lamport's lock (which requires SC).

Have you ever used Lamport's Bakery, though?

Atomic inc/dec are obviously useful, but usually you want to know what the value was before/after the operation, so fetch_add/compare_exchange are easier to deal with IMO.
Jun 02 2021
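For readers following along in D, the two operations named above map onto `atomicFetchAdd` and `cas` in `core.atomic`. The sketch below is only an illustration; `takeTicket` and `updateMax` are hypothetical helper names, and it assumes a druntime recent enough to provide `atomicFetchAdd`.

```d
import core.atomic;

// Fetch-add hands back the value held *before* the increment, so every
// caller obtains a distinct ticket number.
int takeTicket(ref shared int next)
{
    return atomicFetchAdd(next, 1);
}

// Compare-exchange loop: atomically raise `maxSeen` to at least `candidate`.
// If another thread changed the value in the meantime, `cas` fails and the
// loop re-reads the current value before trying again.
void updateMax(ref shared int maxSeen, int candidate)
{
    int current = atomicLoad(maxSeen);
    while (current < candidate && !cas(&maxSeen, current, candidate))
        current = atomicLoad(maxSeen);
}
```

Both helpers use the default sequentially consistent ordering; weaker orders can be requested through the `MemoryOrder` template argument.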
On 02/06/2021 17:59, Ola Fosheim Grøstad wrote:
> On Wednesday, 2 June 2021 at 14:08:32 UTC, rm wrote:
>> It's useful if you want to implement known concurrency algorithms with SC semantics, such as Lamport's lock (which requires SC).
>
> Have you ever used Lamport's Bakery, though?

Not Lamport's Bakery. But I did implement some primitives. betterC does limit the options to work with phobos. For the other cases, I do start with explicit syntax, as I start with strong accesses and try to relax them as I progress. But that's mostly because I want to try to use the weaker memory semantics.

> Atomic inc/dec are obviously useful, but usually you want to know what the value was before/after the operation, so fetch_add/compare_exchange are easier to deal with IMO.

What's wrong with this?

```D
Atomic!int x = 5;
int a = x++; // a = 5
```

https://github.com/rymrg/drm/blob/9db88fb468e2b8babdf9bde488d28d733aea638f/atomic.d#L95

inc/dec are implemented in terms of fetch_add.
Jun 02 2021
On Wednesday, 2 June 2021 at 15:09:54 UTC, rm wrote:
> inc/dec are implemented in terms of fetch_add.

IIRC some architectures provide more efficient inc/dec atomics without fetch? I haven't looked at that in years, so I have no idea what the contemporary situation is.
Jun 02 2021
On Wednesday, 2 June 2021 at 15:19:59 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 2 June 2021 at 15:09:54 UTC, rm wrote:
>> inc/dec are implemented in terms of fetch_add.
>
> IIRC some architectures provide more efficient inc/dec atomics without fetch? I haven't looked at that in years, so I have no idea what the contemporary situation is.

No, I think that was wrong, I think they usually return the original value (or set a flag or whatever). But it doesn't matter. We should just look at what the common contemporary processors provide and look at instructions per clock cycles throughput. I guess last generation ARM/Intel/AMD is sufficient?
Jun 02 2021
On Wednesday, 2 June 2021 at 15:30:46 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 2 June 2021 at 15:19:59 UTC, Ola Fosheim Grøstad wrote:
>> On Wednesday, 2 June 2021 at 15:09:54 UTC, rm wrote:
>>> inc/dec are implemented in terms of fetch_add.
>>
>> IIRC some architectures provide more efficient inc/dec atomics without fetch? I haven't looked at that in years, so I have no idea what the contemporary situation is.
>
> No, I think that was wrong, I think they usually return the original value (or set a flag or whatever). But it doesn't matter. We should just look at what the common contemporary processors provide and look at instructions per clock cycles throughput. I guess last generation ARM/Intel/AMD is sufficient?

Are they always fixed latency? No dependence on the load store queue state (etc.) for example?
Jun 02 2021
On 02/06/2021 20:33, Max Haughton wrote:
> On Wednesday, 2 June 2021 at 15:30:46 UTC, Ola Fosheim Grøstad wrote:
>> On Wednesday, 2 June 2021 at 15:19:59 UTC, Ola Fosheim Grøstad wrote:
>>> On Wednesday, 2 June 2021 at 15:09:54 UTC, rm wrote:
>>>> inc/dec are implemented in terms of fetch_add.
>>>
>>> IIRC some architectures provide more efficient inc/dec atomics without fetch? I haven't looked at that in years, so I have no idea what the contemporary situation is.
>>
>> No, I think that was wrong, I think they usually return the original value (or set a flag or whatever). But it doesn't matter. We should just look at what the common contemporary processors provide and look at instructions per clock cycles throughput. I guess last generation ARM/Intel/AMD is sufficient?
>
> Are they always fixed latency? No dependence on the load store queue state (etc.) for example?

At least on x86-TSO, an atomic read-modify-write operation forces the store buffer to be flushed to memory.
Jun 02 2021