digitalmars.D - atomic operations compared to c++
- gzp (26/26) Jun 12 2017 I'm trying to port some simple lock-free algorithm to D and as
- Kagamin (1/1) Jun 13 2017 LDC uses seq_cst seq_cst
- gzp (25/25) Jun 14 2017 After digging into it the source for me it seems as D is lacking
- rikki cattermole (4/34) Jun 14 2017 Please create an issue here: issues.dlang.org for druntime atomic suppor...
- Russel Winder via Digitalmars-d (44/88) Jun 14 2017 Step back a moment. C++ and Java are trying to stop programmers using
- rikki cattermole (6/19) Jun 14 2017 Yes. A N.G. post will be forgotten about quickly, but an issue in the
- Russel Winder via Digitalmars-d (15/22) Jun 14 2017 […]
- rikki cattermole (3/17) Jun 14 2017 core.atomic
- gzp (3/3) Jun 14 2017 Actually I've just found an issue from 2015 (still in NEW state):
- Patrick Schluter (5/14) Jun 14 2017 Especially since D has officially support for inline assembly.
- Russel Winder via Digitalmars-d (21/47) Jun 14 2017 Appears to be in core.atomic.
- gzp (18/28) Jun 14 2017 Atomic is not meant to replace the higher level abstraction
- David Nadlinger (9/24) Jun 14 2017 There is a misunderstanding here. cas() is in core.atomic, but it
- Manu (4/32) Aug 19 2019 FWIW, I fixed this.
- Adrian Matoga (3/5) Jun 14 2017 What do you currently use for in C++?
- Guillaume Piolat (9/13) Jun 14 2017 I have some difficulty already to comprehend MemoryOrder.rel and
- David Nadlinger (18/24) Jun 14 2017 That's true. In fact, this applies not only to atomic intrinsics,
I'm trying to port a simple lock-free algorithm to D, and as the docs are quite minimal I'm a little bit stuck.

The memory orders seem to be OK:
- MemoryOrder.acq -> C++ acquire
- MemoryOrder.rel -> C++ release
- MemoryOrder.raw -> C++ relaxed
- MemoryOrder.seq -> C++ seq_cst or acq_rel (the strongest)

There is no consume in D.

But what about compare_exchange (CAS)? In C++ one has to provide memory orderings for success and failure, but not in D. Does that mean it always uses the strongest, sequentially consistent ordering, or does some explicit fence have to be provided? Or is the difference that CAS in D does not update the expected value, and in C++ the ordering is used for this update? Thus, in the usual spin loop, do I have to add an explicit fence on success?

    import core.atomic;

    // flags_ is a shared ubyte; update() computes the new flag value
    ubyte flagsNow, newFlags;
    do {
        flagsNow = atomicLoad!(MemoryOrder.acq)(flags_);
        newFlags = update(flagsNow);
    } while (!cas(&flags_, flagsNow, newFlags));
    // do I need fence here ???

Another issue is the fence. In D there is no memory ordering for the fence; only the strongest one exists. Is it intentional? (Not that I have ever used explicit barriers apart from the ones included in the atomic operations themselves :) )

Thanks: Gzp
Jun 12 2017
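For comparison, here is a minimal C++ sketch of the update loop asked about above. It is only an illustration under stated assumptions: `update` and `update_flags` are hypothetical names, and the chosen orderings (acq_rel on success, relaxed on failure) are one reasonable choice, not the only one. It shows the two points in question: the success and failure orderings are passed to the CAS itself, so no separate fence follows the loop, and a failed CAS writes the current value back into `expected`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the poster's update(): sets the lowest bit.
inline std::uint8_t update(std::uint8_t flags) {
    return static_cast<std::uint8_t>(flags | 0x01u);
}

// Atomically applies update() to `flags` and returns the value that was stored.
inline std::uint8_t update_flags(std::atomic<std::uint8_t>& flags) {
    // A relaxed initial load is enough: the CAS below re-validates the value.
    std::uint8_t expected = flags.load(std::memory_order_relaxed);
    std::uint8_t desired;
    do {
        desired = update(expected);
        // On failure, compare_exchange_weak stores the current value of
        // `flags` into `expected`, so the loop needs no separate reload.
    } while (!flags.compare_exchange_weak(expected, desired,
                                          std::memory_order_acq_rel,    // success
                                          std::memory_order_relaxed));  // failure
    return desired;
}

// Single-threaded sanity check of the helper above.
inline void update_flags_example() {
    std::atomic<std::uint8_t> flags{0x10};
    assert(update_flags(flags) == 0x11);
    assert(flags.load() == 0x11);
}
```

The failure ordering only governs the load performed by a failed attempt, which is why it can be weaker than the success ordering.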
After digging into the source, it seems to me that D is lacking a "standardized" atomic library. It has some basic concepts, but it is far behind the C++ standard. I don't know if there are any RFCs on this topic, but it requires a lot of work. Just to mention some points from my first experience:

- cas: in all the APIs I've seen, on a failed swap the current value is retrieved (in C/C++ there are intrinsics for this)
- exchange: there is no API for it, and it is not implementable without spinning (in C/C++ there are intrinsics for this)
- atomicFence: no memory ordering is considered in the API. Even though it falls back to the strongest/slowest one in the current implementation, it should be part of the API.

If D wants to be a real system programming language (e.g. a replacement for C++), please address these issues. I'm not an expert on the subject, but D seems to be at a pre-C++11 stage, where compiler/memory barriers and atomics had to be implemented differently for each platform and the programmer could only hope that the compiler won't f*ck up everything during optimization.

I don't know if the D compiler is aware of the fences and won't move instructions out of/into guarded areas.

Thanks: gzp
Jun 14 2017
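To illustrate the "exchange" point above, the following C++ sketch contrasts an unconditional swap emulated with a CAS loop against the dedicated primitive. The function names are hypothetical and the orderings are an assumption chosen for the example; it is not a description of how core.atomic is implemented.

```cpp
#include <atomic>
#include <cassert>

// Unconditional swap emulated with a CAS loop: the "spinning" fallback that
// is needed when no exchange primitive is exposed.
inline int exchange_via_cas(std::atomic<int>& target, int desired) {
    int observed = target.load(std::memory_order_relaxed);
    // Each failed attempt refreshes `observed` with the current value.
    while (!target.compare_exchange_weak(observed, desired,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
    }
    return observed;  // value held just before the swap
}

// The same operation with the dedicated primitive; no retry loop is needed
// at the source level.
inline int exchange_native(std::atomic<int>& target, int desired) {
    return target.exchange(desired, std::memory_order_acq_rel);
}

// Single-threaded sanity check: both variants return the previous value.
inline void exchange_example() {
    std::atomic<int> slot{1};
    assert(exchange_via_cas(slot, 2) == 1);
    assert(exchange_native(slot, 3) == 2);
    assert(slot.load() == 3);
}
```

Both functions have the same observable result; the difference is that the emulated version may retry under contention, while the native one maps to a single read-modify-write operation.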
On 14/06/2017 11:40 AM, gzp wrote:
> [...]

Please create an issue here: issues.dlang.org for druntime atomic support.

Clearly the requirements that we have been working under are not up to your expectations (or needs).
Jun 14 2017
On Wed, 2017-06-14 at 12:28 +0100, rikki cattermole via Digitalmars-d wrote:
> On 14/06/2017 11:40 AM, gzp wrote:
>> [...]

Step back a moment. C++ and Java are trying to stop programmers using these features, in favour of using higher level abstractions. In C++ and Java such features as above are often required to implement the higher level abstractions, but precisely so that other programmers do not have to use them.

That D can do high level parallel and concurrent programming using a more actor-style model (i.e. processes and channels) or data parallelism (tasks on a threadpool), which C++11 didn't have but C++17 has (I believe), potentially means there is no reason to slavishly follow other languages in providing features that are not needed.

So what is it that requires D to have CAS, exchange, and atomicFence? This proposal to introduce them needs to be driven by showing what C++ can do at the application level that D cannot, rather than being tick-box driven via a list of types.

C++ and Java have formal memory models because people use a lot of shared memory multithreading. If you use actor/dataflow/data parallelism at the application level, then it is entirely feasible to get away without a formal memory model, as long as the actor/dataflow/data parallelism frameworks can be constructed without one. It is, of course, easier to do this if there is a memory model. So the first port of call has to be "does D have a formal memory model" rather than "does it have CAS, exchange, and fences".

Oh, and if you can avoid fences you must. Remember, locks, semaphores, mutexes, barriers, and fences are all designed to stop parallelism; they are designed to slow things down. They are needed for implementing operating systems, but unless your application is an operating system in some sort of disguise, you really don't want them in your code.

Investigation may discover that D is missing some of these features, but there needs to be a reason to have them other than "other languages have them".

>> I don't know if D compiler is aware of the fences and won't move out/in instructions from guarded areas.
>>
>> Thanks: gzp
>
> Please create an issue here: issues.dlang.org for druntime atomic support.
>
> Clearly the requirements that we have been working under are not up to your expectations (or needs).

Is an issue the right vehicle for investigating the need for these?

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Jun 14 2017
On 14/06/2017 1:15 PM, Russel Winder via Digitalmars-d wrote:
> snip
>
> Is an issue the right vehicle for investigating the need for these?

Yes. A N.G. post will be forgotten about quickly, but an issue in the bug tracker can send you updates as things progress.

At the end of the day, that module grew organically, it just needs a bit of planning put into it for the future that's all.
Jun 14 2017
On Wed, 2017-06-14 at 13:27 +0100, rikki cattermole via Digitalmars-d wrote:
> […]
> Yes. A N.G. post will be forgotten about quickly, but an issue in the bug tracker can send you updates as things progress.

OK. Feel free to sign me up for the issue.

> At the end of the day, that module grew organically, it just needs a bit of planning put into it for the future that's all.

Which module?

--
Russel.
Jun 14 2017
On 14/06/2017 1:48 PM, Russel Winder via Digitalmars-d wrote:
> On Wed, 2017-06-14 at 13:27 +0100, rikki cattermole via Digitalmars-d wrote:
>> […]
>> Yes. A N.G. post will be forgotten about quickly, but an issue in the bug tracker can send you updates as things progress.
>
> OK. Feel free to sign me up for the issue.

If an issue is created, you can add yourself pretty easily (cc field).

>> At the end of the day, that module grew organically, it just needs a bit of planning put into it for the future that's all.
>
> Which module?

core.atomic
Jun 14 2017
Actually I've just found an issue from 2015 (still in NEW state): https://issues.dlang.org/show_bug.cgi?id=15007

I've updated it and linked this forum thread.
Jun 14 2017
On Wednesday, 14 June 2017 at 12:15:49 UTC, Russel Winder wrote:
> On Wed, 2017-06-14 at 12:28 +0100, rikki cattermole via Digitalmars-d wrote:
>> [...]
>
> Step back a moment. C++ and Java are trying to stop programmers using these features, in favour of using higher level abstractions. In C++ and Java such features as above are often required to implement the higher level abstraction but so as to allow other programmers not to have to use them. [...]

Especially since D officially supports inline assembly. All these low-level constructs are better handled directly at the machine code level, as their semantics vary significantly between architectures.
Jun 14 2017
On Wed, 2017-06-14 at 10:40 +0000, gzp via Digitalmars-d wrote:
> After digging into it the source for me it seems as D is lacking a "standardized" atomic library. [...] Just to mention some by my first experience:
>
> cas
> in all api I've seen on a failed swap, the current value is retrieved (in c/c++ there are intrinsic for them)

This appears to be in core.atomic.

> exchange
> no api for it and not implementable without spinning (in c/c++ there are intrinsic for them)
>
> atomicFence
> No memory ordering is considered in the API
> Even though it falls back to the strongest/slowest one for the current implementation it should be part of the API.

Appears to be in core.atomic.

> If D wants to be a real system programming language (ex a replacement for c++) please address these issues. [...]
>
> I don't know if D compiler is aware of the fences and won't move out/in instructions from guarded areas.

I am fairly sure it isn't, but why is this needed if you use a parallelism oriented approach to the architecture and design?

Sorry to repeat, but whilst there are circumstances where this stuff is needed (operating systems), most other applications should be written without the need for locks, mutexes, fences, memory model, etc.; any need for that stuff should be covered in the frameworks used.

We need to be careful not to bring 1960s views of threads into 2010 programming. Sometimes they are needed, usually they are not.

--
Russel.
Jun 14 2017
> I am fairly sure it isn't, but why is this needed if you use a parallelism oriented approach to the architecture and design?
>
> Sorry to repeat but whilst there are circumstances where this stuff is needed (operating systems), most other applications should be written without the need for locks, mutexes, fences, memory model, etc. any need for that stuff should be covered in the frameworks used.
>
> We need to be careful not to bring 1960s views of threads into 2010 programming. Sometimes they are needed, usually they are not.

Atomics are not meant to replace the higher level abstractions (neither in C++ nor in any other language). They are meant for implementing new higher level abstraction layers as a library instead of as a language feature. As new parallel algorithms are discovered, they can be (not so) "easily" added to the language through libraries. How would you implement a lock-free list? Or a lock-free multiple-producer, single-consumer queue?

It's good to have a send/receive mechanism in a parallel world, but in my opinion the framework should be a library and not the language itself. And to ease the writing of such libraries, some good, reliable building blocks are required (mutex, atomic, fence, etc.). You are right that these features are not meant to be used much, but they are required to build more general parallel containers, schedulers, algorithms, etc.

Note: why do I keep mentioning C++11 with respect to atomics? Because some experts spent a lot of time finding a good, stable API for these things.
Jun 14 2017
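As a concrete instance of the "building blocks" argument above, here is a minimal C++ sketch of a multiple-producer, single-consumer container built from nothing but CAS and exchange. `MpscStack` and its members are hypothetical names, and it is a LIFO stack rather than a FIFO queue, chosen because it stays short; it is a sketch under those assumptions, not a production container.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Minimal multi-producer, single-consumer stack of ints (illustrative only).
class MpscStack {
    struct Node {
        int value;
        Node* next;
    };
    std::atomic<Node*> head_{nullptr};

public:
    MpscStack() = default;
    MpscStack(const MpscStack&) = delete;
    MpscStack& operator=(const MpscStack&) = delete;
    ~MpscStack() { drain(); }  // frees any nodes still linked

    // Safe to call from any number of threads.
    void push(int value) {
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        // On failure node->next is refreshed with the current head; release
        // on success publishes the node's contents to the consumer.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Single consumer: detaches the whole list with one exchange, which
    // sidesteps the ABA problem of popping nodes one at a time.
    std::vector<int> drain() {
        Node* node = head_.exchange(nullptr, std::memory_order_acquire);
        std::vector<int> values;  // most recently pushed value first
        while (node != nullptr) {
            values.push_back(node->value);
            Node* next = node->next;
            delete node;
            node = next;
        }
        return values;
    }
};

// Single-threaded sanity check of push/drain ordering.
inline void mpsc_example() {
    MpscStack stack;
    stack.push(1);
    stack.push(2);
    std::vector<int> got = stack.drain();
    assert((got == std::vector<int>{2, 1}));
    assert(stack.drain().empty());
}
```

Note that the consumer side relies on an unconditional exchange, which is exactly the primitive reported as missing from core.atomic at the time of this thread.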
On Wednesday, 14 June 2017 at 12:48:14 UTC, Russel Winder wrote:
> On Wed, 2017-06-14 at 10:40 +0000, gzp via Digitalmars-d wrote:
>> […]
>> cas
>> in all api I've seen on a failed swap, the current value is retrieved (in c/c++ there are intrinsic for them)
>
> This appears to be in core.atomic.

There is a misunderstanding here. cas() is in core.atomic, but it returns true/false rather than the read value. However, this is just fine for virtually all algorithms. In fact, the respective <atomic> functions in C++11 also return a boolean result.

>> exchange
>> no api for it and not implementable without spinning (in c/c++ there are intrinsic for them)
>>
>> atomicFence
>> No memory ordering is considered in the API
>> Even though it falls back to the strongest/slowest one for the current implementation it should be part of the API.
>
> Appears to be in core.atomic.

Where exactly would that be? There is no unconditional swap/xchg in core.atomic, and atomicFence() indeed only supports sequentially consistent semantics.

-- David
Jun 14 2017
On Wed, Jun 14, 2017 at 10:51 AM David Nadlinger via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> [...]
> Where exactly would that be? There is no unconditional swap/xchg in core.atomic, and atomicFence() indeed only supports sequentially consistent semantics.

FWIW, I fixed this.
https://github.com/dlang/druntime/pull/2745
Aug 19 2019
On Tuesday, 13 June 2017 at 06:12:46 UTC, gzp wrote:
> (...) There is no consume in D.

What do you currently use it for in C++? Its use is temporarily discouraged as of C++17.
Jun 14 2017
On Tuesday, 13 June 2017 at 06:12:46 UTC, gzp wrote:
> But what about compare_exchange (CAS) ? In C++ one has to provide memory ordering for success and failure, but not in D.

I already have some difficulty comprehending MemoryOrder.rel and MemoryOrder.acq. A cas with MemoryOrder.raw wouldn't be very useful.

> Does it mean it is the strongest, sequential one all the time, or does some explicit fence have to be provided?

It uses lock cmpxchg:
https://github.com/dlang/druntime/blob/ce0f089fec56f7ff5b1df689f5c81256218e415b/src/core/atomic.d#L769

So no additional fences are needed; it is already the strongest IIRC. IMHO, if a CAS requires additional memory barriers, it's a bit useless.
Jun 14 2017
Hi,

On Tuesday, 13 June 2017 at 06:12:46 UTC, gzp wrote:
> the docs are quite minimal

That's true. In fact, this applies not only to atomic intrinsics, but all of `shared`. We need to sit down and properly specify things at some point. Andrei has been trying to get an initiative going to do just that recently.

> There is no consume in D.

There is indeed no equivalent to memory_order_consume. Note, however, that consume is about to be deprecated in C/C++, as it turned out to be more or less unimplementable in its current form (at least while still being useful). Introducing the notion of source-level dependencies into a language that otherwise operates with as-if semantics on an abstract machine is a tricky business.

> But what about compare_exchange (CAS) ? […] Does it mean, it is the strongest sequential all the time

Yes, core.atomic.cas() is always seq_cst for the time being (we should fix this).

> Another issue is the fence. In D there is no memoryordering for fence, only the strongest one exists. Is it intentional?

No; it is just a questionable design decision/unnecessary limitation which can easily be remedied by adding an optional parameter.

-- David
Jun 14 2017
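For readers wondering what a fence with a selectable ordering buys, here is a minimal C++ sketch of the usual release/acquire fence pairing. `Mailbox`, `publish`, and `try_consume` are hypothetical names introduced for this example; the point is only that `std::atomic_thread_fence` accepts the ordering as a parameter, so the weaker fences can be used where a sequentially consistent one is not required.

```cpp
#include <atomic>
#include <cassert>

struct Mailbox {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // publication flag
};

// Writer side: the release fence orders the payload write before the
// relaxed store to the flag.
inline void publish(Mailbox& box, int value) {
    box.payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    box.ready.store(true, std::memory_order_relaxed);
}

// Reader side: returns true and fills `out` once the flag has been observed.
inline bool try_consume(Mailbox& box, int& out) {
    if (!box.ready.load(std::memory_order_relaxed)) {
        return false;
    }
    // Pairs with the release fence in publish(), making the payload write
    // visible here without requiring a seq_cst fence.
    std::atomic_thread_fence(std::memory_order_acquire);
    out = box.payload;
    return true;
}

// Single-threaded sanity check of the publish/consume protocol.
inline void fence_example() {
    Mailbox box;
    int value = 0;
    assert(!try_consume(box, value));
    publish(box, 42);
    assert(try_consume(box, value) && value == 42);
}
```

In real use `publish` and `try_consume` would run on different threads; the example calls them sequentially only to keep the check deterministic.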