digitalmars.D - Something needs to happen with shared, and soon.
- Alex Rønne Petersen (71/71) Nov 11 2012 Hi,
- Andrej Mitrovic (13/16) Nov 11 2012 I think most people probably don't even use shared due to lacking
- Chris Nicholson-Sauls (37/60) Nov 11 2012 Fix support for shared(T) in std.variant, and you will have fixed
- Benjamin Thaut (3/3) Nov 11 2012 Fully agree.
- martin (2/3) Nov 11 2012 +1
- Graham St Jack (4/8) Nov 11 2012 +1.
- deadalnix (2/10) Nov 11 2012 That isn't a bad thing in itself.
- Jonathan M Davis (50/63) Nov 11 2012 I don't think that it's really intended that shared be 100% easy to use...
- bearophile (5/6) Nov 11 2012 Maybe deprecate it and introduce something else that is rather
- nixda (4/81) Nov 11 2012 drop it in favour of :
- Michel Fortin (16/22) Nov 11 2012 I feel like the concurrency aspect of D2 was rushed in the haste of
- Timon Gehr (2/19) Nov 13 2012 I am always irritated by shared-by-default static variables.
- Michel Fortin (8/23) Nov 13 2012 I tend to have very little global state in my code, so
- Jonathan M Davis (5/8) Nov 13 2012 Thread-local by default is a _huge_ step forward, and in hindsight, it s...
- Timon Gehr (14/35) Nov 14 2012 So do I. A thread-local static variable does not imply global state.
- Michel Fortin (17/54) Nov 14 2012 I'd consider that poor style. Use a struct to encapsulate the state,
- Timon Gehr (12/60) Nov 14 2012 I'd consider this a poor statement to make. Universally quantified
- Michel Fortin (13/37) Nov 14 2012 Indeed. There's not enough context to judge fairly. I can accept the
- Walter Bright (25/28) Nov 11 2012 I think a couple things are clear:
- Benjamin Thaut (5/5) Nov 11 2012 The only problem being that you cannot really have user-defined shared...
- Walter Bright (4/7) Nov 11 2012 If you include an object designed to work only in a single thread (non-s...
- Benjamin Thaut (11/21) Nov 12 2012 I'm not talking about objects, I'm talking about value types.
- Johannes Pfau (101/111) Nov 12 2012 But there are also shared member functions and they're kind of annoying
- Walter Bright (20/120) Nov 12 2012 You can't get away from the fact that data that can be accessed from mul...
- luka8088 (12/151) Nov 12 2012 If I understood correctly there is no reason why this should not compile...
- deadalnix (3/14) Nov 12 2012 D has no ownership, so the compiler can't know what
- luka8088 (147/166) Nov 12 2012 Here is a wild idea:
- Johannes Pfau (53/113) Nov 12 2012 I know share can't automatically make the code thread safe. I
- Sean Kelly (10/63) Nov 14 2012 Yes. You end up having two methods for each function, one as a ...
- Regan Heath (24/30) Nov 12 2012 So what we actually want, in order to make the above "nice" is a "scoped...
- Regan Heath (29/57) Nov 12 2012 There was talk a while back about how to handle the existing object mute...
- deadalnix (5/25) Nov 12 2012 As already explain in the thread you mention, it is not gonna work. The
- Jacob Carlborg (4/32) Nov 12 2012 I'm just throwing it in here again, AST macros could probably solve this...
- Simen Kjaeraas (7/41) Nov 12 2012 Until someone writes a proper DIP on them, macros can write entire softw...
- Jacob Carlborg (4/8) Nov 12 2012 Sure, I can try and stop doing that :)
- FeepingCreature (3/12) Nov 13 2012 You know, AST macros could probably stop doing that.
- deadalnix (10/41) Nov 12 2012 The compiler is able to do some optimization on that, and, it never
- Manu (26/58) Nov 12 2012
- luka8088 (13/44) Nov 13 2012 This clarifies a lot, but still a lot of people get confused with:
- luka8088 (8/78) Nov 13 2012 Um, sorry, the following code:
- Sönke Ludwig (3/73) Nov 13 2012 Only std.concurrency (using spawn() and send()) enforces that unshared d...
- luka8088 (4/77) Nov 13 2012 In that case http://dlang.org/faq.html#shared_guarantees is wrong, it is...
- David Nadlinger (7/15) Nov 13 2012 You are right, it could probably be added to avoid confusion. But
- Sean Kelly (24/81) Nov 14 2012
- luka8088 (5/72) Nov 14 2012 Yes, that makes perfect sense... I just wanted to point out the
- Walter Bright (4/16) Nov 13 2012 Andrei is a proponent of having shared do memory barriers, I disagree wi...
- Peter Alexander (4/12) Nov 13 2012 FWIW, I'm with you on this one. Memory barriers would not make
- Andrei Alexandrescu (4/11) Nov 13 2012 Wait, then what would shared do? This is new to me as I've always
- Peter Alexander (10/25) Nov 13 2012 I'm speaking out of turn, but...
- Andrei Alexandrescu (8/30) Nov 13 2012 Oh ok, thanks. That does make sense. There's been quite a bit of
- deadalnix (2/24) Nov 13 2012 It cannot unless some ownership is introduced in D.
- Walter Bright (18/29) Nov 13 2012 I'm just not convinced that having the compiler add memory barriers:
- Andrei Alexandrescu (13/48) Nov 13 2012 I'm fine with these arguments. We'll need to break current uses of
- David Nadlinger (5/9) Nov 13 2012 You mean x.store(4)? Or am I completely misunderstanding your
- Andrei Alexandrescu (3/11) Nov 13 2012 Apologies, yes, store.
- Alex Rønne Petersen (7/58) Nov 13 2012 Is that meant to be an atomic store, or just a regular, but explicit, st...
- Andrei Alexandrescu (3/11) Nov 13 2012 Atomic and sequentially consistent.
- Alex Rønne Petersen (27/39) Nov 13 2012 OK, but then we have the problem I presented in the OP: This only works
- Alex Rønne Petersen (19/57) Nov 13 2012 Scratch that, make it this:
- deadalnix (2/60) Nov 13 2012 That list sound more reasonable.
- Andrei Alexandrescu (7/12) Nov 13 2012 When I wrote TDPL I looked at the contemporary architectures and it
- Alex Rønne Petersen (11/23) Nov 13 2012 I do not know of a single architecture apart from x86 that supports >
- Andrei Alexandrescu (5/26) Nov 13 2012 Intel does 128-bit atomic load and store, see
- Alex Rønne Petersen (7/34) Nov 13 2012 That's Itanium, though, not x86. Itanium is a fairly high-end,
- Rainer Schuetze (4/41) Nov 14 2012 On x86 you can use LOCK CMPXCHG16b to do the atomic read:
- deadalnix (4/42) Nov 13 2012 I wouldn't expected it to work for delegates, long, ulong, double and
- Alex Rønne Petersen (6/53) Nov 13 2012 8-byte atomic loads/stores is doable on all major architectures.
- Andrei Alexandrescu (3/4) Nov 13 2012 We're looking at 128-bit load, store, and CAS for 64-bit machines.
- Walter Bright (4/16) Nov 13 2012 Not going to portably work on long, ulong, double, slices, or delegates.
- Alex Rønne Petersen (9/29) Nov 13 2012 I amended that (see my other post). 8-byte loads/stores can be done
- deadalnix (2/29) Nov 13 2012 http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
- Alex Rønne Petersen (7/38) Nov 13 2012 Thanks, exactly that. No MIPS, though. I guess I'm going to have to go
- Walter Bright (2/4) Nov 13 2012 Our car doesn't have an electric starter yet, but it's still better than...
- Andrei Alexandrescu (5/11) Nov 13 2012 Please don't. This is "we're doing better than C++" in disguise and
- Jonathan M Davis (23/25) Nov 13 2012 At this point, I don't see how it could be otherwise. Having the shared
- deadalnix (15/35) Nov 13 2012 That is what Java's volatile does. It has several use cases, including
- Walter Bright (10/19) Nov 13 2012 Please, please file a bug report about this, rather than a vague stateme...
- deadalnix (3/28) Nov 13 2012 Why would you destroy something that isn't dead yet ?
- David Nadlinger (10/16) Nov 14 2012 What stops you from using core.atomic.{atomicLoad, atomicStore}?
- deadalnix (5/20) Nov 14 2012 It is a solution now (it wasn't at the time).
- David Nadlinger (6/9) Nov 14 2012 You mean moving non-atomic loads/stores across atomic
- Andrei Alexandrescu (8/23) Nov 14 2012 This is a simplification of what should be going on. The
- Alex Rønne Petersen (15/40) Nov 14 2012 They already work as they should:
- Andrei Alexandrescu (5/43) Nov 14 2012 The language definition should be made clear so as future optimizations
- David Nadlinger (17/52) Nov 14 2012 Sorry, I don't quite see where I simplified things. Yes, in the
- Andrei Alexandrescu (15/58) Nov 14 2012 First, there are more kinds of atomic loads and stores. Then, the fact
- Sean Kelly (11/13) Nov 14 2012 that the calls are not supposed to be reordered must be a guarantee of ...
- Andrei Alexandrescu (4/8) Nov 14 2012 I think we should focus on sequential consistency as that's where the
- Sean Kelly (9/11) Nov 14 2012 core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so ...
- deadalnix (3/7) Nov 15 2012 It is sufficient for monocore and mostly correct for x86. But isn't enou...
- Sean Kelly (13/23) Nov 15 2012 core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so ...
- deadalnix (3/15) Nov 16 2012 I'm not aware of D1 compiler inserting memory barrier, so any memory
- Sean Kelly (9/14) Nov 14 2012 core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so ...
- Jacob Carlborg (6/15) Nov 13 2012 If the compiler should/does not add memory barriers, then is there a
- Walter Bright (2/4) Nov 14 2012 Memory barriers can certainly be added using library functions.
- Jacob Carlborg (4/5) Nov 14 2012 Is there then any real advantage of having it directly in the language?
- Walter Bright (2/5) Nov 14 2012 Not that I can think of.
- Jacob Carlborg (5/6) Nov 14 2012 Then we might want to remove it since it's either not working or
- Andrei Alexandrescu (3/7) Nov 14 2012 Actually this hypothesis is false.
- Jacob Carlborg (6/7) Nov 14 2012 That we should remove it or that it's not working/nobody understands
- Andrei Alexandrescu (3/8) Nov 14 2012 The hypothesis that atomic primitives can be implemented as a library.
- Jacob Carlborg (4/5) Nov 14 2012 I don't know these kind of things, that's why I'm asking.
- deadalnix (4/7) Nov 14 2012 The compiler can do more reordering in regard to barriers. For instance,...
- Jacob Carlborg (4/7) Nov 14 2012 I see.
- Andrei Alexandrescu (3/6) Nov 14 2012 It's not an advantage, it's a necessity.
- Jacob Carlborg (6/7) Nov 14 2012 Walter seems to indicate that there is no technical reason for "shared"
- Andrei Alexandrescu (6/12) Nov 14 2012 Walter is a self-confessed dilettante in threading. To be frank I hope
- Jacob Carlborg (4/6) Nov 14 2012 Ok, thanks for the expatiation.
- Andrei Alexandrescu (4/9) Nov 14 2012 The compiler must understand the semantics of barriers such as e.g. it
- David Nadlinger (8/20) Nov 14 2012 Again, this is true, but it would be a fallacy to conclude that
- Andrei Alexandrescu (3/19) Nov 14 2012 Compiler intrinsics == built into the language.
- Iain Buclaw (13/41) Nov 14 2012
- Andrei Alexandrescu (3/42) Nov 14 2012 aware of what it is and what it does == built into the language.
- Sean Kelly (5/14) Nov 14 2012 doesn't hoist code above an acquire barrier or below a release barrier.
- Alex Rønne Petersen (10/21) Nov 14 2012 The volatile statement was too general. All relevant compiler back ends
- Sean Kelly (21/38) Nov 14 2012
- Alex Rønne Petersen (8/28) Nov 14 2012 Well, there's not much point in that when all compilers have intrinsics
- Andrei Alexandrescu (3/14) Nov 14 2012 Because it's better to associate volatility with data than with code.
- Sean Kelly (12/28) Nov 14 2012
- deadalnix (5/24) Nov 15 2012 Happy to see I'm not alone on that one.
- Sean Kelly (14/39) Nov 15 2012
- Andrei Alexandrescu (3/16) Nov 14 2012 The compiler must be in this so as to not do certain reorderings.
- Jonathan M Davis (12/19) Nov 13 2012 Being able to have double-checked locking work would be valuable, and ha...
- Walter Bright (3/6) Nov 14 2012 I'm not saying "memory barriers are bad". I'm saying that having the com...
- Andrei Alexandrescu (3/12) Nov 14 2012 Let's not hasten. That works for Java and C#, and is allowed in C++.
- Alex Rønne Petersen (13/26) Nov 14 2012 I need some clarification here: By memory barrier, do you mean x86's
- deadalnix (10/37) Nov 14 2012 In fact, x86 is mostly sequentially consistent due to its memory model.
- Alex Rønne Petersen (11/52) Nov 14 2012 I just used x86's fencing instructions as an example because most people...
- Andrei Alexandrescu (12/36) Nov 14 2012 Sorry, I was imprecise. We need to (a) define intrinsics for loading and...
- Alex Rønne Petersen (15/52) Nov 14 2012 Let's continue this part of the discussion in my other reply (the one
- David Nadlinger (22/32) Nov 14 2012 Sorry, I didn't see this message of yours before replying (the
- David Nadlinger (8/10) Nov 14 2012 Let my clarify that: We don't necessarily need to tuck on any
- Andrei Alexandrescu (11/39) Nov 14 2012 Yah, the whole point here is that we need something IN THE LANGUAGE
- Manu (38/85) Nov 15 2012
- deadalnix (2/7) Nov 15 2012 Can you elaborate on that ?
- Andrei Alexandrescu (9/26) Nov 15 2012 All contemporary languages that are serious about concurrency support
- Sean Kelly (8/17) Nov 15 2012 is
- Manu (15/49) Nov 16 2012 I'm not conflating the 2, I'm suggesting to stick with the primitives th...
- Pragma Tix (4/82) Nov 16 2012 Seems to me that Soenke's library solution went into the right
- Manu (14/81) Nov 16 2012 Looks reasonable to me, also Dmitry Olshansky and luka have both made
- Pragma Tix (10/131) Nov 16 2012 Hi Manu,
- Manu (10/34) Nov 16 2012 I can't resist... D may be serious about the *idea* of concurrency, but ...
- David Nadlinger (8/20) Nov 15 2012 What are these special properties? Sorry, it seems like we are
- Andrei Alexandrescu (4/11) Nov 15 2012 For example you can't hoist a memory operation before a shared load or
- David Nadlinger (10/26) Nov 15 2012 Well, to be picky, that depends on what kind of memory operation
- Sean Kelly (14/28) Nov 15 2012
- David Nadlinger (10/20) Nov 15 2012 Oh well, I was just being stupid when typing up my response: What
- Sean Kelly (15/24) Nov 15 2012 mean -- moving non-volatile loads/stores across volatile ones is ...
- Andrei Alexandrescu (3/21) Nov 15 2012 Shared must be sequentially consistent.
- deadalnix (3/19) Nov 18 2012 If it is known that the memory read/write is thread local, this is safe,...
- Andrei Alexandrescu (8/29) Nov 15 2012 In D that's fine (as long as in-thread SC is respected) because
- Walter Bright (3/38) Nov 14 2012 Yes. And also, I agree that having something typed as "shared" must prev...
- Andrei Alexandrescu (4/7) Nov 14 2012 It's the same issue at hand: ordering properly and inserting barriers
- Sean Kelly (8/14) Nov 14 2012 are two ways to ensure one single goal, sequential consistency. Same ...
- Andrei Alexandrescu (4/18) Nov 14 2012 Yah, but the baseline here is acquire-release which has subtle
- Sean Kelly (9/28) Nov 15 2012 differences that are all the more maddening.
- deadalnix (4/53) Nov 15 2012 I'm sorry but that is dumb.
- Sean Kelly (4/6) Nov 15 2012 load/stores if the CPU is allowed to do so ?
- David Nadlinger (5/12) Nov 15 2012 I think the question was: Why would you want to disable compiler
- Andrei Alexandrescu (4/15) Nov 15 2012 The compiler does whatever it takes to ensure sequential consistency for...
- David Nadlinger (9/32) Nov 15 2012 How does this have anything to do with deadalnix' question that I
- Sean Kelly (9/16) Nov 15 2012 such ability for the compiler right now.
- Jacob Carlborg (6/16) Nov 14 2012 If there is a problem with efficiency in some cases then the developer
- Benjamin Thaut (8/17) Nov 14 2012 I still don't agree with you there. The struct would have clearly
- Walter Bright (5/11) Nov 14 2012 If you know this for a fact, then cast it to thread local. The compiler ...
- Benjamin Thaut (18/32) Nov 14 2012 Could you please give an example where it would break?
- Walter Bright (8/23) Nov 14 2012 Thread 1:
- Benjamin Thaut (8/16) Nov 14 2012 But for passing a reference to a value type you would have to use a
- Walter Bright (7/24) Nov 14 2012 1. You can't escape pointers in safe code (well, it's a bug if you do).
- Benjamin Thaut (12/40) Nov 14 2012 So just to be clear, escaping pointers in a single threaded context is a...
- Walter Bright (8/10) Nov 14 2012 I hate to repeat myself, but:
- Andrei Alexandrescu (4/12) Nov 14 2012 That should be disallowed at least in safe code. If I had my way I'd
- Jacob Carlborg (6/13) Nov 15 2012 Why would the object be destroyed if there's still a reference to it? If...
- Jonathan M Davis (28/42) Nov 15 2012 Yeah. If the reference passed across were shared, then the runtime shoul...
- Benjamin Thaut (10/37) Nov 15 2012 Thank you, thats exatcly how I'm thinking too. And because of this it
- Dmitry Olshansky (15/26) Nov 15 2012 Ain't structs typically copied anyway?
- Jonathan M Davis (5/23) Nov 14 2012 Pointers are not considered unsafe at all and are perfectly legal in Saf...
- Jason House (12/18) Nov 14 2012 This is a fairly reasonable use of shared, but it is bypassing
- Andrei Alexandrescu (5/10) Nov 14 2012 This is very different from how I view we should do things (and how we
- Jonathan M Davis (19/31) Nov 14 2012 Well, this is clearly how things work now, and if you want to use shared...
- Michel Fortin (87/90) Nov 14 2012 One thing I'm confused about right now is how people are using shared.
- Regan Heath (47/109) Nov 15 2012
- Sean Kelly (19/33) Nov 15 2012 http://forum.dlang.org/thread/k7orpj$1tt5$1@digitalmars.com?page=2#pos...
- Dmitry Olshansky (38/48) Nov 15 2012 While the rest of proposal was more or less fine. I don't get why we
- Michel Fortin (74/125) Nov 16 2012 In case you want to protect two variables (or more) with the same
- Sönke Ludwig (9/26) Nov 16 2012 Can you have a look at my thread about this?
- Sean Kelly (13/43) Nov 16 2012
- Michel Fortin (22/36) Nov 17 2012 Perhaps it's just my style of coding, but when designing a class that
- Dmitry Olshansky (76/173) Nov 16 2012 Wrap in a struct and it would be even much clearer and safer.
- Michel Fortin (29/107) Nov 17 2012 Clever. But you forgot to access the variable somewhere. What's its
- Jacob Carlborg (5/12) Nov 17 2012 If a feature can be implemented in a library with the same syntax,
- Dmitry Olshansky (17/39) Nov 17 2012 Not having the name would imply you can't escape it :) But I agree it's
- foobar (7/27) Nov 19 2012
- Michel Fortin (9/36) Nov 19 2012 No solution will be foolproof in the general case unless we add new
- Andrej Mitrovic (6/7) Nov 14 2012 It says (on p.413) reading and writing shared values are guaranteed to
- Jonathan M Davis (6/14) Nov 14 2012 Actually, I think that what it comes down to is that shared works nicely...
- Andrei Alexandrescu (4/18) Nov 14 2012 TDPL 13.14 explains that inside synchronized classes, top-level shared
- Jonathan M Davis (21/23) Nov 15 2012 Then it's doing the casting for you. I suppose that that's an argument t...
- Sönke Ludwig (6/27) Nov 15 2012 There are three problems I currently see with this:
- Manu (17/27) Nov 15 2012 The pattern Walter describes is primitive and useful, I'd like to see
- Jacob Carlborg (13/20) Nov 15 2012 How about implementing a library function, something like this:
- Manu (9/28) Nov 15 2012 Interesting concept. Nice idea, could certainly be useful, but it doesn'...
- luka8088 (10/41) Nov 15 2012 I managed to make a simple example that works with the current
- Jacob Carlborg (4/13) Nov 15 2012 I don't understand how a template would cause problems.
- Jonathan M Davis (22/33) Nov 15 2012 1. It wouldn't stop you from needing to cast away shared at all, because...
- Manu (23/65) Nov 15 2012 I don't really see the difference, other than, as you say, the cast is
- Sean Kelly (5/13) Nov 15 2012 So what happens if you pass a reference to the now non-shared object to ...
- Jason House (7/21) Nov 17 2012 The constructive thing to do may be to try and figure out what
- deadalnix (3/24) Nov 18 2012 Nothing is safe if ownership cannot be statically proven. This is
- Sönke Ludwig (7/32) Nov 19 2012 But you can at least prove ownership under some limited circumstances. L...
- Jason House (15/47) Nov 20 2012 Bartosz's design was very explicit about ownership, but was
- Sönke Ludwig (141/148) Nov 12 2012 After reading Walter's comment, it suddenly seemed obvious that we are
- Regan Heath (21/37) Nov 12 2012
- Sönke Ludwig (13/49) Nov 12 2012 The only problem is that for this approach to be safe, any aliasing
- deadalnix (3/151) Nov 12 2012 With some kind of ownership in the type system, it can me made automagic...
- Sönke Ludwig (9/12) Nov 12 2012 Yes and I would love to have that, but I fear that we then basically get
- deadalnix (2/14) Nov 12 2012 Don't get me started on fibers /D
- Sönke Ludwig (5/5) Nov 12 2012 I generated some quick documentation with examples here:
- Sönke Ludwig (3/11) Nov 12 2012 All examples compile now. Put everything on github for reference:
- Jonathan M Davis (9/18) Nov 14 2012 Good to know, but none of that really has anything to do with the castin...
- Jonathan M Davis (45/71) Nov 15 2012 You could make casting away const implicit too, which would make some co...
- Manu (24/123) Nov 15 2012 ... no, they're not even the same thing. const things can not be changed...
- Mehrdad (3/3) Nov 15 2012 Would it be useful if 'shared' in D did something like 'volatile'
Hi, It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all. I'm certainly not alone in being annoyed by this state of affairs: http://d.puremagic.com/issues/show_bug.cgi?id=8993 I've posted rants about the state of shared before and, from the comments on those, it appears that what most people want shared to do is at least one (and usually multiple) of * make variables global (if appropriate in the context); * make the wrapped type completely separate from the unwrapped type; * make all operations be atomic; * make all operations result in memory barriers. At a glance, this looks fine. Exactly what you would want for shared types in a concurrent setting, right? Except, not really. I'll try to explain all of the unsolved problems with shared below... First of all, the fact that shared(T) is completely separate from T (i.e. no conversions allowed, except for primitive types) is a huge usability problem. In practice, it means that 99% of the standard library is unusable with shared types. Hell, even most of the runtime doesn't work with shared types. I don't know how to best solve this particular problem; I'm just pointing it out because anyone who tries to do anything non-trivial with shared will invariably run into this. Second, the idea of making shared insert atomic operations is an absolute fallacy. It only makes sense for primitive types for the most part, and even for those, what sizes are supported depends on the target architecture. A number of ideas have come up to solve this problem: * We make shared(T) not compile for certain Ts depending on the target architecture. I personally think this is a terrible idea because most code using shared will not be portable at all. * We require any architecture D targets to support atomic operations for a certain size S at the very least. This is fine for primitives up to 64 bits in size, but doesn't clear up the situation for larger types (real, complex types, cent/ucent, ...). * We make shared not insert atomic operations at all (thus making it kind of useless for anything but documentation). * (Possibly others I have forgotten; please let me know if this is the case.) I don't think any of these are particularly attractive, to be honest. If we do make shared insert atomic operations, we would also have to consider the memory ordering of those operations. Third, we have memory barriers. I strongly suspect that this is a misnomer in most cases where people have suggested this; it's generally not useful to have a compiler insert barriers because they are used to control ordering of load/store operations which is something the programmer will want to do explicitly. In any case, the compiler can't usefully figure out where to put barriers, so it would just result in really bad performance for no apparent gain. Fourth, there is implementation complexity. If shared is meant to insert specialized instructions, it will result in effectively two code paths for most code generation in any D compiler (read: maintenance nightmare). Fifth, it is completely unclear whether casting to and from shared is legal (but with a big fat "caution" sign like casting away const) or if it's undefined behavior. Making it undefined behavior would further increase the usability problem I described above. And finally, the worst part of all of this? 
People writing code that uses shared today are blindly assuming it actually does the right thing. It doesn't. Their code will break on any non-x86 platform. This is an absolutely horrifying situation now that ARM, MIPS, and PowerPC are starting to become viable targets for D. Something needs to be done about shared. I don't know what, but the current situation is -- and I'm really not exaggerating here -- laughable. I think we either need to just make it perfectly clear that shared is for documentation purposes and nothing else, or, figure out an alternative system to shared, because I don't see shared actually being useful for real world work no matter what we do with it. -- Alex Rønne Petersen alex lycus.org http://lycus.org
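To make the conversion problem described above concrete, here is a minimal sketch; the data and the choice of std.algorithm are illustrative only, not from the post:

import std.algorithm;

void main()
{
    shared int[] data = [3, 1, 2];
    // sort(data);          // rejected: shared elements don't form a sortable range
    sort(cast(int[])data);  // compiles, but only after casting shared away by hand
}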
Nov 11 2012
On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:And finally, the worst part of all of this? People writing code that uses shared today are blindly assuming it actually does the right thing. It doesn't.I think most people probably don't even use shared due to lacking Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036 Not even using the write functions worked on shared types until 2.059 (e.g. printing shared arrays). 'shared' has this wonderfully attractive name to it, but apparently it doesn't have many guarantees? E.g. Walter's comment here: http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1 So +1 from me just because I have no idea what shared is supposed to guarantee. I've just stubbornly used __gshared variables because std.concurrency.send() doesn't accept mutable data. send() doesn't work with shared either, so I have no clue.. :)
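A hedged sketch of the send() failure Andrej describes; the worker function and the message type are placeholders, not from the post:

import std.concurrency;

void worker()
{
    auto msg = receiveOnly!(immutable(int)[]); // receives the immutable copy
}

void main()
{
    auto tid = spawn(&worker);
    int[] data = [1, 2, 3];
    // send(tid, data);   // rejected: mutable aliasing may not cross threads
    send(tid, data.idup); // sending an immutable copy is the usual workaround
}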
Nov 11 2012
On Sunday, 11 November 2012 at 19:28:30 UTC, Andrej Mitrovic wrote:On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:Fix support for shared(T) in std.variant, and you will have fixed send() as well. Meanwhile, in common cases a simple wrapper struct suffices. module toy; import std.concurrency, std.stdio; struct SImpl { string s; int i; } alias shared( SImpl ) S; struct Msg { S s; } struct Quit {} S global = S( "global", 999 ); void main () { auto child = spawn( &task ); S s = S( "abc", 42 ); child.send( Msg( s ) ); child.send( Msg( global ) ); child.send( Quit() ); } void task () { bool sentinel = true; while ( sentinel ) { receive( ( Msg msg ) { writeln( msg.s.s, " -- ", msg.s.i ); }, ( Quit msg ) { sentinel = false; } ); } } grant aesgard ~/Projects/D/foo/shared_test $ dmd toy && ./toy abc -- 42 global -- 999 -- Chris Nicholson-SaulsAnd finally, the worst part of all of this? People writing code that uses shared today are blindly assuming it actually does the right thing. It doesn't.I think most people probably don't even use shared due to lacking Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036 Not even using the write functions worked on shared types until 2.059 (e.g. printing shared arrays). 'shared' has this wonderfully attractive name to it, but apparently it doesn't have much guarantees? E.g. Walter's comment here: http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1 So +1 from me just because I have no idea what shared is supposed to guarantee. I've just stubbornly used __gshared variables because std.concurrency.send() doesn't accept mutable data. send() doesn't work with shared either, so I have no clue.. :)
Nov 11 2012
Fully agree. Kind Regards Benjamin Thaut
Nov 11 2012
On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:Fully agree.+1
Nov 11 2012
On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:+1. I find it so broken that I have to avoid using it in all but the most trivial situations.Fully agree.+1
Nov 11 2012
On 11/11/2012 23:36, Graham St Jack wrote:On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:That isn't a bad thing in itself.On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:+1. I find it so broken that I have to avoid using it in all but the most trivial situations.Fully agree.+1
Nov 11 2012
On Monday, November 12, 2012 01:17:06 deadalnix wrote:On 11/11/2012 23:36, Graham St Jack wrote:On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:Fully agree.+1+1. I find it so broken that I have to avoid using it in all but the most trivial situations.That isn't a bad thing in itself.I don't think that it's really intended that shared be 100% easy to use. You're _supposed_ to use it sparingly. But at this point, it borders on being utterly unusable. We have a bit of a problem with the basic idea though in that you're not supposed to be using shared much, and it's supposed to be segregated such that having the shared equivalent of const (as in it works with both shared and non-shared) would pose a big problem (it's also probably untenable with memory barriers and the like), but if you _don't_ have something like that, you either can't use shared with much of anything, or you have to cast it away all over the place, which loses all of the memory barriers or whatnot. We have conflicting requirements which aren't being managed very well. I don't know how protected shared really needs to be though. Anything involving shared should make heavy use of mutexes and synchronized and whatnot, meaning that at least some of the protections that people want with shared are useless unless you're writing code which is being stupid and not using mutexes or whatnot. So, casting away shared might not actually be that big a deal so long as it's temporary to call a function (as opposed to stashing the variable away somewhere) and that call is protected by a mutex or other thread-protection mechanism. At the moment, I think that the only way to make stuff work with both shared and unshared (aside from using lots of casts) is to make use of templates, and since most of druntime and Phobos isn't tested with shared, things like Unqual probably screw with that pretty thoroughly. It's at least conceivable though that stuff like std.algorithm could work with shared just fine. I don't think that there's much question though that shared is the major chink in our armor with regards to thread-local by default. The basic idea is great, but the details still need some work. - Jonathan M Davis
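A small sketch of the template route Jonathan mentions; one template covers both qualifiers because each instantiation is type-checked separately (all names here are illustrative):

import std.stdio;

void describe(T)(ref T x)
{
    // describe!int and describe!(shared(int)) are distinct functions
    writeln(T.stringof, " -- shared? ", is(T == shared));
}

void main()
{
    int a;
    shared int b;
    describe(a); // int -- shared? false
    describe(b); // shared(int) -- shared? true
}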
Nov 11 2012
Alex Rønne Petersen:Something needs to be done about shared. I don't know what,Maybe deprecate it and introduce something else that is rather different and based on thought-out theory? Bye, bearophile
Nov 11 2012
drop it in favour of : http://forum.dlang.org/post/k7j1ta$2kv8$1@digitalmars.com On Sunday, 11 November 2012 at 18:46:12 UTC, Alex Rønne Petersen wrote:[full quote of the original post snipped]
Nov 11 2012
On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:Something needs to be done about shared. I don't know what, but the current situation is -- and I'm really not exaggerating here -- laughable. I think we either need to just make it perfectly clear that shared is for documentation purposes and nothing else, or, figure out an alternative system to shared, because I don't see shared actually being useful for real world work no matter what we do with it.I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/ -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 11 2012
On 11/12/2012 02:48 AM, Michel Fortin wrote:On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:I am always irritated by shared-by-default static variables.Something needs to be done about shared. I don't know what, but the current situation is -- and I'm really not exaggerating here -- laughable. I think we either need to just make it perfectly clear that shared is for documentation purposes and nothing else, or, figure out an alternative system to shared, because I don't see shared actually being useful for real world work no matter what we do with it.I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
Nov 13 2012
On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:On 11/12/2012 02:48 AM, Michel Fortin wrote:I tend to have very little global state in my code, so shared-by-default is not something I have to fight with very often. I do agree that thread-local is a better default. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/I am always irritated by shared-by-default static variables.
Nov 13 2012
On Tuesday, November 13, 2012 22:12:12 Michel Fortin wrote:I tend to have very little global state in my code, so shared-by-default is not something I have to fight with very often. I do agree that thread-local is a better default.Thread-local by default is a _huge_ step forward, and in hindsight, it seems pretty ridiculous that a language would do anything else. Shared by default is just too horrible. - Jonathan M Davis
Nov 13 2012
On 11/14/2012 04:12 AM, Michel Fortin wrote:On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:So do I. A thread-local static variable does not imply global state. (The execution stack is static.) Eg. in a few cases it is sensible to use static variables as implicit arguments to avoid having to pass them around by copying them all over the execution stack. private int x = 0; int foo(){ int xold = x; scope(exit) x = xold; x = new_value; bar(); // reads x return baz(); // reads x } Unfortunately, this destroys 'pure' even though it actually does not.On 11/12/2012 02:48 AM, Michel Fortin wrote:I tend to have very little global state in my code,I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/I am always irritated by shared-by-default static variables.so shared-by-default is not something I have to fight with very often. I do agree that thread-local is a better default.
Nov 14 2012
On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:On 11/14/2012 04:12 AM, Michel Fortin wrote:I'd consider that poor style. Use a struct to encapsulate the state, then make bar, and baz member functions of that struct. The resulting code is cleaner and easier to read: pure int foo() { auto state = State(new_value); state.bar(); return state.baz(); } You could achieve something similar with nested functions too.On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:So do I. A thread-local static variable does not imply global state. (The execution stack is static.) Eg. in a few cases it is sensible to use static variables as implicit arguments to avoid having to pass them around by copying them all over the execution stack. private int x = 0; int foo(){ int xold = x; scope(exit) x = xold; x = new_value; bar(); // reads x return baz(); // reads x }On 11/12/2012 02:48 AM, Michel Fortin wrote:I tend to have very little global state in my code,I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/I am always irritated by shared-by-default static variables.Unfortunately, this destroys 'pure' even though it actually does not.Using a local-scoped struct would work with pure, be more efficient (accessing thread-local variables takes more cycles), and be less error-prone while refactoring. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
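For reference, a runnable completion of Michel's sketch; the State fields and the bodies of bar and baz are made up here purely so that it compiles:

struct State
{
    int x;
    void bar() pure { x += 1; }             // stands in for "reads/writes x"
    int baz() const pure { return x * 2; }  // stands in for the final computation
}

pure int foo(int newValue)
{
    auto state = State(newValue);
    state.bar();
    return state.baz();
}

unittest { assert(foo(20) == 42); }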
Nov 14 2012
On 11/14/2012 01:42 PM, Michel Fortin wrote:On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification. "In a few cases" it is not, even if it is poor style "most of the time".On 11/14/2012 04:12 AM, Michel Fortin wrote:I'd consider that poor style.On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:So do I. A thread-local static variable does not imply global state. (The execution stack is static.) Eg. in a few cases it is sensible to use static variables as implicit arguments to avoid having to pass them around by copying them all over the execution stack. private int x = 0; int foo(){ int xold = x; scope(exit) x = xold; x = new_value; bar(); // reads x return baz(); // reads x }On 11/12/2012 02:48 AM, Michel Fortin wrote:I tend to have very little global state in my code,I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/I am always irritated by shared-by-default static variables.Use a struct to encapsulate the state, then make bar, and baz member functions of that struct.They could eg. be virtual member functions of a class already.Using a local-scoped struct would work with pure,It might.be more efficientNot necessarily.(accessing thread-local variables takes more cycles),It can be accessed sparsely, copying around the struct pointer is work too, and the fastest access path in a proper alternative design would potentially be even slower.and be less error-prone while refactoring.If done in such a way that it makes refactoring error prone, it is to be considered poor style.
Nov 14 2012
On 2012-11-14 14:30:19 +0000, Timon Gehr <timon.gehr gmx.ch> said:On 11/14/2012 01:42 PM, Michel Fortin wrote:Indeed. There's not enough context to judge fairly. I can accept the idea there are situations where it is really inconvenient or impossible to pass the state as an argument. That said, I disagree that this is not using global state. It might not be globally accessible (because x is private), but the state still exists globally since variable x exists in all threads irrespective of whether they use foo or not.On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification.So do I. A thread-local static variable does not imply global state. (The execution stack is static.) Eg. in a few cases it is sensible to use static variables as implicit arguments to avoid having to pass them around by copying them all over the execution stack. private int x = 0; int foo(){ int xold = x; scope(exit) x = xold; x = new_value; bar(); // reads x return baz(); // reads x }I'd consider that poor style.If done in such a way that it makes refactoring error prone, it is to be considered poor style.I guess we agree. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 14 2012
On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency; they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
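A hedged sketch of those five steps in code; the Counter type, its mutex, and the increment function are illustrative only, not part of Walter's post:

import core.sync.mutex;

struct Counter { int value; }

__gshared Mutex counterLock;   // guards all access to the shared data
shared Counter counter;

shared static this() { counterLock = new Mutex; }

void increment()
{
    counterLock.lock();                   // 1. ensure single threaded access
    scope (exit) counterLock.unlock();    // 5. release the mutex, even on exceptions
    auto local = cast(Counter*)&counter;  // 2. cast away shared
    local.value += 1;                     // 3. operate on the data
    // 4. the data is shared again once the unqualified pointer goes out of scope
}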
Nov 11 2012
The only problem being that you cannot really have user-defined shared (value) types: http://d.puremagic.com/issues/show_bug.cgi?id=8295 Kind Regards Benjamin Thaut
Nov 11 2012
On 11/11/2012 10:05 PM, Benjamin Thaut wrote:The only problem beeing that you can not really have user defined shared (value) types: http://d.puremagic.com/issues/show_bug.cgi?id=8295If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
Nov 11 2012
On 12.11.2012 07:50, Walter Bright wrote:On 11/11/2012 10:05 PM, Benjamin Thaut wrote:I'm not talking about objects, I'm talking about value types. And you can't make it work at all. If you do shared ~this() { buf = null; } it won't work either. You don't have _any_ option to destroy a shared struct. Kind Regards Benjamin ThautThe only problem being that you cannot really have user-defined shared (value) types: http://d.puremagic.com/issues/show_bug.cgi?id=8295If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
Nov 12 2012
On Sun, 11 Nov 2012 18:30:17 -0800, Walter Bright <newshound2 digitalmars.com> wrote:To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.But there are also shared member functions and they're kind of annoying right now:

* You can't call shared methods from non-shared methods or vice versa. This leads to code duplication: you basically have to implement everything twice:

----------
struct ABC
{
    Mutex mutex;
    void a()
    {
        aImpl();
    }
    shared void a()
    {
        synchronized(mutex) aImpl(); //not allowed
    }
    private void aImpl()
    {
    }
}
----------

The only way to avoid this is casting away shared in the shared a method, but that really is annoying.

* You can't have data members be included only for the shared version. In the above example, the mutex member will always be included, even if the ABC instance is thread-local. So you're often better off writing a non-thread safe struct and writing a wrapper struct. This way you don't have useless overhead in the non-thread safe implementation. But the nice instance syntax is lost:

shared(ABC) abc1; ABC abc2;
vs
SharedABC abc1; ABC abc2;

even worse, shared propagation won't work this way;

struct DEF
{
    ABC abc;
}
shared(DEF) def;
def.abc.a();

and then there's also the druntime issue: core.sync doesn't work with shared which leads to this schizophrenic situation:

struct A
{
    Mutex m;
    void a() //Doesn't compile with shared
    {
        m.lock();  //Compiles, but locks on a TLS mutex!
        m.unlock();
    }
}

struct A
{
    shared Mutex m;
    shared void a()
    {
        m.lock();                //Doesn't compile
        (cast(Mutex)m).unlock(); //Ugly
    }
}

So the only useful solution avoids using shared:

struct A
{
    __gshared Mutex m; //Good we have __gshared!
    shared void a()
    {
        m.lock();
        m.unlock();
    }
}

And then there are some open questions with advanced use cases:

* How do I make sure that a non-shared delegate is only accepted if I have an A, but a shared delegate should be supported for shared(A) and A? (calling a shared delegate from a non-shared function should work, right?)

struct A
{
    void a(T)(T v)
    {
        writeln("non-shared");
    }
    shared void a(T)(T v) if (isShared!v) //isShared doesn't exist
    {
        writeln("shared");
    }
}

And having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2

* What's the difference between: "void delegate() shared" and "shared(void delegate())"?

Error: cannot implicitly convert expression (&a.abc) of type void delegate() shared to shared(void delegate())

* So let's call it void delegate() shared instead:

void incrementA(void delegate() shared del)
/home/c684/c922.d(7): Error: const/immutable/shared/inout attributes are only valid for non-static member functions
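For reference, a hedged sketch of the cast-away-shared forwarding that the post calls annoying; initialization of the mutex is elided:

import core.sync.mutex;

struct ABC
{
    Mutex mutex;
    void a() { aImpl(); }
    shared void a()
    {
        auto self = cast(ABC*)&this;  // strip shared by hand...
        synchronized (self.mutex)     // ...so both the mutex and aImpl are usable
            self.aImpl();
    }
    private void aImpl() { }
}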
Nov 12 2012
On 11/12/2012 2:57 AM, Johannes Pfau wrote:But there are also shared member functions and they're kind of annoying right now: * You can't call shared methods from non-shared methods or vice versa. This leads to code duplication, you basically have to implement everything twice:You can't get away from the fact that data that can be accessed from multiple threads has to be dealt with in a *fundamentally* different way than single threaded code. You cannot share code between the two. There is simply no conceivable way that "share" can be added and then code will become thread safe. Most of the issues you're having seem to revolve around treating shared data access just like single threaded access, except "share" was added. This cannot work. The compiler error messages, while very annoying, are in their own obscure way pointing this out. It's my fault, I have not explained share very well, and have oversold it. It does not solve concurrency problems, it points them out.---------- struct ABC { Mutext mutex; void a() { aImpl(); } shared void a() { synchronized(mutex) aImpl(); //not allowed } private void aImpl() { } } ---------- The only way to avoid this is casting away shared in the shared a method, but that really is annoying.As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.* You can't have data members be included only for the shared version. In the above example, the mutex member will always be included, even if ABC instance is thread local. So you're often better off writing a non-thread safe struct and writing a wrapper struct. This way you don't have useless overhead in the non-thread safe implementation. But the nice instance syntax is lost: shared(ABC) abc1; ABC abc2; vs SharedABC abc1; ABC abc2; even worse, shared propagation won't work this way; struct DEF { ABC abc; } shared(DEF) def; def.abc.a(); and then there's also the druntime issue: core.sync doesn't work with shared which leads to this schizophrenic situation: struct A { Mutex m; void a() //Doesn't compile with shared { m.lock(); //Compiles, but locks on a TLS mutex! m.unlock(); } } struct A { shared Mutex m; shared void a() { m.lock(); //Doesn't compile (cast(Mutex)m).unlock(); //Ugly } } So the only useful solution avoids using shared: struct A { __gshared Mutex m; //Good we have __gshared! shared void a() { m.lock(); m.unlock(); } }Yes, mutexes will need to exist in a global space.And then there are some open questions with advanced use cases: * How do I make sure that a non-shared delegate is only accepted if I have an A, but a shared delegate should be supported for shared(A) and A? (calling a shared delegate from a non-shared function should work, right?) struct A { void a(T)(T v) { writeln("non-shared"); } shared void a(T)(T v) if (isShared!v) //isShared doesn't exist { writeln("shared"); } }First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?And having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2 * What's the difference between: "void delegate() shared" and "shared(void delegate())"? 
Error: cannot implicitly convert expression (&a.abc) of type void delegate() sharedThe delegate deals with shared data.to shared(void delegate())The variable holding the delegate is shared.* So let's call it void delegate() shared instead: void incrementA(void delegate() shared del) /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes are only valid for non-static member functions
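To make Walter's distinction concrete, a hedged sketch; exact conversions were still shaky in 2012 compilers, so treat this as illustrative only:

class A
{
    void abc() shared { } // a method that operates on shared data
}

void main()
{
    auto a = new shared(A);
    // suffix shared: the delegate's context pointer refers to shared data
    void delegate() shared dg = &a.abc;
    // outer shared: the delegate variable itself lives in shared storage
    shared(void delegate()) dg2;
}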
Nov 12 2012
If I understood correctly there is no reason why this should not compile ? import core.sync.mutex; class MyClass { void method () {} } void main () { auto myObject = new shared(MyClass); synchronized (myObject) { myObject.method(); } } On 12.11.2012 12:19, Walter Bright wrote:On 11/12/2012 2:57 AM, Johannes Pfau wrote:But there are also shared member functions and they're kind of annoying right now: * You can't call shared methods from non-shared methods or vice versa. This leads to code duplication, you basically have to implement everything twice:You can't get away from the fact that data that can be accessed from multiple threads has to be dealt with in a *fundamentally* different way than single threaded code. You cannot share code between the two. There is simply no conceivable way that "share" can be added and then code will become thread safe. Most of the issues you're having seem to revolve around treating shared data access just like single threaded access, except "share" was added. This cannot work. The compiler error messages, while very annoying, are in their own obscure way pointing this out. It's my fault, I have not explained share very well, and have oversold it. It does not solve concurrency problems, it points them out.---------- struct ABC { Mutext mutex; void a() { aImpl(); } shared void a() { synchronized(mutex) aImpl(); //not allowed } private void aImpl() { } } ---------- The only way to avoid this is casting away shared in the shared a method, but that really is annoying.As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.* You can't have data members be included only for the shared version. In the above example, the mutex member will always be included, even if ABC instance is thread local. So you're often better off writing a non-thread safe struct and writing a wrapper struct. This way you don't have useless overhead in the non-thread safe implementation. But the nice instance syntax is lost: shared(ABC) abc1; ABC abc2; vs SharedABC abc1; ABC abc2; even worse, shared propagation won't work this way; struct DEF { ABC abc; } shared(DEF) def; def.abc.a(); and then there's also the druntime issue: core.sync doesn't work with shared which leads to this schizophrenic situation: struct A { Mutex m; void a() //Doesn't compile with shared { m.lock(); //Compiles, but locks on a TLS mutex! m.unlock(); } } struct A { shared Mutex m; shared void a() { m.lock(); //Doesn't compile (cast(Mutex)m).unlock(); //Ugly } } So the only useful solution avoids using shared: struct A { __gshared Mutex m; //Good we have __gshared! shared void a() { m.lock(); m.unlock(); } }Yes, mutexes will need to exist in a global space.And then there are some open questions with advanced use cases: * How do I make sure that a non-shared delegate is only accepted if I have an A, but a shared delegate should be supported for shared(A) and A? (calling a shared delegate from a non-shared function should work, right?) struct A { void a(T)(T v) { writeln("non-shared"); } shared void a(T)(T v) if (isShared!v) //isShared doesn't exist { writeln("shared"); } }First, you have to decide what you mean by a shared delegate. 
Do you mean the variable containing the two pointers that make up a delegate is shared, or the delegate is supposed to deal with shared data?And having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2 * What's the difference between: "void delegate() shared" and "shared(void delegate())"?
> Error: cannot implicitly convert expression (&a.abc) of type void delegate() shared
The delegate deals with shared data.
> to shared(void delegate())
The variable holding the delegate is shared.
> * So let's call it void delegate() shared instead: void incrementA(void delegate() shared del) /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes are only valid for non-static member functions
Nov 12 2012
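For concreteness, the five-step recipe quoted above might look like this in code (a minimal sketch; counter, counterLock, and increment are invented names):
----------
import core.sync.mutex;

shared int counter;
__gshared Mutex counterLock;    // the mutex itself lives in global space

shared static this() { counterLock = new Mutex; }

void increment()
{
    counterLock.lock();                 // 1. ensure single-threaded access
    scope (exit) counterLock.unlock();  // 5. release the mutex on every path
    int* local = cast(int*) &counter;   // 2. cast away shared
    *local += 1;                        // 3. operate on the data
}                                       // 4. the unshared view ends here
----------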
On 12/11/2012 16:00, luka8088 wrote:If I understood correctly there is no reason why this should not compile? import core.sync.mutex; class MyClass { void method () {} } void main () { auto myObject = new shared(MyClass); synchronized (myObject) { myObject.method(); } }D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
Here is a wild idea: ////////// void main () { mutex x; // mutex is not a type but rather a keyword // x is a symbol in order to allow // different x in different scopes shared(x) int i; // ... or maybe use UDA ? // mutex x must be locked // in order to change i synchronized (x) { // lock x in a compiler-aware way i++; // compiler guarantees that i will not // be changed outside synchronized(x) } } ////////// so I tried something similar with the current implementation: ////////// import std.stdio; void main () { shared(int) i1; auto m1 = new MyMutex(); i1.attachMutex(m1); // m1 must be locked in order to modify i1 // i1++; // should throw a compiler error // sharedAccess(i1)++; // runtime exception, m1 is not locked synchronized (m1) { sharedAccess(i1)++; // ok, m1 is locked } } // some generic code import core.sync.mutex; class MyMutex : Mutex { @property bool locked = false; @trusted void lock () { super.lock(); locked = true; } @trusted void unlock () { locked = false; super.unlock(); } bool tryLock () { bool result = super.tryLock(); if (result) locked = true; return result; } } template unshared (T : shared(T)) { alias T unshared; } template unshared (T : shared(T)*) { alias T* unshared; } auto ref sharedAccess (T) (ref T value) { assert(value.attachMutex().locked); unshared!(T)* refVal = (cast(unshared!(T*)) &value); return *refVal; } MyMutex attachMutex (T) (T value, MyMutex mutex = null) { static __gshared MyMutex[T] mutexes; // this memory leak can be solved // but it's left like this to make the code simple synchronized if (value !in mutexes && mutex !is null) mutexes[value] = mutex; assert(mutexes[value] !is null); return mutexes[value]; } ////////// and another example with methods: ////////// import std.stdio; class a { int i; void increment () { i++; } } void main () { auto a1 = new shared(a); auto m1 = new MyMutex(); a1.attachMutex(m1); // m1 must be locked in order to modify a1 // a1.increment(); // compiler error // sharedAccess(a1).increment(); // runtime exception, m1 is not locked synchronized (m1) { sharedAccess(a1).increment(); // ok, m1 is locked } } // some generic code import core.sync.mutex; class MyMutex : Mutex { @property bool locked = false; @trusted void lock () { super.lock(); locked = true; } @trusted void unlock () { locked = false; super.unlock(); } bool tryLock () { bool result = super.tryLock(); if (result) locked = true; return result; } } template unshared (T : shared(T)) { alias T unshared; } template unshared (T : shared(T)*) { alias T* unshared; } auto ref sharedAccess (T) (ref T value) { assert(value.attachMutex().locked); unshared!(T)* refVal = (cast(unshared!(T*)) &value); return *refVal; } MyMutex attachMutex (T) (T value, MyMutex mutex = null) { static __gshared MyMutex[T] mutexes; // this memory leak can be solved // but it's left like this to make the code simple synchronized if (value !in mutexes && mutex !is null) mutexes[value] = mutex; assert(mutexes[value] !is null); return mutexes[value]; } ////////// In any case, if shared itself does not provide locking and does not fix problems but only points them out (not to be misunderstood, I completely agree with that) then I think that assigning a mutex to the variable is a must. Although the latter examples already work with the current implementation, I like the first one (or something similar to the first one) more; it looks cleaner and leaves space for additional optimizations.
On 12.11.2012 17:14, deadalnix wrote:On 12/11/2012 16:00, luka8088 wrote:If I understood correctly there is no reason why this should not compile? import core.sync.mutex; class MyClass { void method () {} } void main () { auto myObject = new shared(MyClass); synchronized (myObject) { myObject.method(); } }D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
On Monday, 12 November 2012 at 11:19:57 UTC, Walter Bright wrote:On 11/12/2012 2:57 AM, Johannes Pfau wrote:I know share can't automatically make the code thread safe. I just wanted to point out that this casting / code duplication is annoying, but I don't know how this could be solved either.But there are also shared member functions and they're kind of annoying right now: * You can't call shared methods from non-shared methods or vice versa. This leads to code duplication, you basically have to implement everything twice:You can't get away from the fact that data that can be accessed from multiple threads has to be dealt with in a *fundamentally* different way than single threaded code. You cannot share code between the two. There is simply no conceivable way that "share" can be added and then code will become thread safe.Yes, mutexes will need to exist in a global space.I'm not sure if I understand this. Don't you think shared(Mutex) should work? AFAICS that's only a library problem: Add shared to the lock / unlock methods in druntime and it should work? Or global as in not in the struct instance?I'm talking about a delegate pointing to a method declared with the "shared" keyword and the "this pointer" pointing to a shared object: struct A { shared void a(){} } shared A instance; auto del = &instance.a; //I'm talking about this type To explain that usecase: I think of a shared delegate as a delegate that can be safely called from different threads. So I can store it in a struct instance and later on call it from any thread: struct Signal { //The variable is shared _AND_ the method is shared shared(shared void delegate()) _handler; shared void call() //Can be called from any thread { //Would have to synchronize access to the variable in a real world case, //but the call itself wouldn't have to be synchronized shared void delegate() localHandler; synchronized(mutex) { localHandler = _handler; } localHandler (); } }And then there are some open questions with advanced use cases: * How do I make sure that a non-shared delegate is only accepted if I have an A, but a shared delegate should be supported for shared(A) and A? (calling a shared delegate from a non-shared function should work, right?) struct A { void a(T)(T v) { writeln("non-shared"); } shared void a(T)(T v) if (isShared!v) //isShared doesn't exist { writeln("shared"); } }First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate is shared, or the delegate is supposed to deal with shared data?OK so that's what I need but the compiler doesn't let me declare that type. alias void delegate() shared del; Error: const/immutable/shared/inout attributes are only valid for non-static member functionsAnd having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2 * What's the difference between: "void delegate() shared" and "shared(void delegate())"? Error: cannot implicitly convert expression (&a.abc) of type void delegate() sharedThe delegate deals with shared data.OK, but when it's used as a function parameter, which is pass-by-value for delegates and because of tail-shared there's effectively no difference, right? In that case it's not possible to pass a shared variable to the function as this will always create a copy?
void abcd(shared(void delegate()) del) which is the same as void abcd(shared void delegate() del) How would you pass del as a shared variable?
> to shared(void delegate())
The variable holding the delegate is shared.
> * So let's call it void delegate() shared instead: void incrementA(void delegate() shared del) /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes are only valid for non-static member functions
Nov 12 2012
On Nov 12, 2012, at 2:57 AM, Johannes Pfau <nospam example.com> wrote:On Sun, 11 Nov 2012 18:30:17 -0800, Walter Bright <newshound2 digitalmars.com> wrote:To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.But there are also shared member functions and they're kind of annoying right now: * You can't call shared methods from non-shared methods or vice versa. This leads to code duplication, you basically have to implement everything twice: ---------- struct ABC { Mutex mutex; void a() { aImpl(); } shared void a() { synchronized(mutex) aImpl(); //not allowed } private void aImpl() { } } ---------- The only way to avoid this is casting away shared in the shared a method, but that really is annoying.Yes. You end up having two methods for each function, one as a synchronized wrapper that casts away shared and another that does the actual work.and then there's also the druntime issue: core.sync doesn't work with shared which leads to this schizophrenic situation: struct A { Mutex m; void a() //Doesn't compile with shared { m.lock(); //Compiles, but locks on a TLS mutex! m.unlock(); } }Most of the reason for this was that I didn't like the old implications of shared, which was that shared methods would at some time in the future end up with memory barriers all over the place. That's been dropped, but I'm still not a fan of the wrapper method for each function. It makes for a crappy class design.
Nov 14 2012
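The two-methods-per-function pattern Sean describes, sketched (Counter, bump, and bumpImpl are invented names; the overload on the shared qualifier is the point):
----------
import core.sync.mutex;

class Counter
{
    private __gshared Mutex mtx;       // one global lock, per Walter's advice
    shared static this() { mtx = new Mutex; }

    private int value;

    void bump() { bumpImpl(); }        // thread-local entry point

    shared void bump()                 // shared wrapper: lock, strip, forward
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        (cast(Counter) this).bumpImpl();
    }

    private void bumpImpl() { ++value; }
}
----------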
On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright <newshound2 digitalmars.com> wrote:To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexSo what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj. R
Nov 12 2012
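A library-only approximation of Regan's ScopedLock is possible today; a sketch (the names are invented, and it assumes the struct destructor runs at the end of the enclosing scope):
----------
import core.sync.mutex;

struct ScopedLock(T)
{
    private T* payload;
    private Mutex mtx;

    @disable this(this);             // a copy must not unlock twice

    this(ref shared T obj, Mutex m)
    {
        mtx = m;
        mtx.lock();                  // (1) acquire the mutex
        payload = cast(T*) &obj;     // (1) shared is cast away
    }

    ~this() { mtx.unlock(); }        // (3) release when the scope ends

    ref T get() { return *payload; }
    alias get this;                  // (2) members reachable as if unshared
}

struct Obj { int foo; }

void example(ref shared Obj obj, Mutex lock)
{
    {
        auto locked = ScopedLock!Obj(obj, lock);
        locked.foo = 2;              // (2) plain, unshared access
    }                                // (3)/(4) destroyed, obj is shared again
}
----------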
On Mon, 12 Nov 2012 11:55:51 -0000, Regan Heath <regan netmail.co.nz> wrote:On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright <newshound2 digitalmars.com> wrote:There was talk a while back about how to handle the existing object mutex and synchronized{} statement blocks and this subject has me thinking back to that. My thinking has gone full circle and rather than bore you with all the details I want to present a conclusion which I am hoping is both implementable and useful. First off, IIRC object contains a mutex/monitor/critical section, which means all objects contain one. The last discussion saw many people wanting this removed for efficiency. I propose we do this. Then, if a class or struct is declared as "shared" or a "shared" instance of a class or struct is constructed we magically include one (compiler magic which I hope is possible). Secondly I say we make "shared" illegal on basic types. This is a limitation(*) but I believe in most cases a single int is unlikely to be shared without an accompanying group of other variables, and usually an algorithm operating on those variables. These variables and the algorithm should be encapsulated in a class or struct - which can in turn be shared. Now.. the synchronized() {} statement can do the magic described above (as ScopedLock) for us. It would be illegal to call it on a non "shared" instance. It would acquire the mutex and cast away "shared" inside the block/scope, at the end of the scope it would cast shared back and release the mutex. (*) for those rare cases where a single int or other basic type is all that is shared we can provide a wrapper struct which is declared as "shared". R To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexSo what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj.
Nov 12 2012
On 12/11/2012 13:25, Regan Heath wrote:First off, IIRC object contains a mutex/monitor/critical section, which means all objects contain one. The last discussion saw many people wanting this removed for efficiency. I propose we do this. Then, if a class or struct is declared as "shared" or a "shared" instance of a class or struct is constructed we magically include one (compiler magic which I hope is possible).As already explained in the thread you mention, it is not gonna work. The conclusion of the thread is that only synchronized classes should have one mutex field.Secondly I say we make "shared" illegal on basic types. This is a limitation(*) but I believe in most cases a single int is unlikely to be shared without an accompanying group of other variables, and usually an algorithm operating on those variables. These variables and the algorithm should be encapsulated in a class or struct - which can in turn be shared.Shared reference counting? Disruptor?Now.. the synchronized() {} statement can do the magic described above (as ScopedLock) for us. It would be illegal to call it on a non "shared" instance. It would acquire the mutex and cast away "shared" inside the block/scope, at the end of the scope it would cast shared back and release the mutex. (*) for those rare cases where a single int or other basic type is all that is shared we can provide a wrapper struct which is declared as "shared".
Nov 12 2012
On 2012-11-12 12:55, Regan Heath wrote:On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright <newshound2 digitalmars.com> wrote:I'm just throwing it in here again, AST macros could probably solve this. -- /Jacob CarlborgTo make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexSo what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj.
Nov 12 2012
On 2012-11-12, 15:11, Jacob Carlborg wrote:On 2012-11-12 12:55, Regan Heath wrote:Until someone writes a proper DIP on them, macros can write entire software packages, download Hitler, turn D into lisp, and bake bread. Can we please stop with the 'macros could do that' until there's any sort of consensus as to what macros *could* do? -- SimenOn Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright <newshound2 digitalmars.com> wrote:I'm just throwing it in here again, AST macros could probably solve this.To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexSo what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj.
Nov 12 2012
On 2012-11-12 17:57, Simen Kjaeraas wrote:Until someone writes a proper DIP on them, macros can write entire software packages, download Hitler, turn D into lisp, and bake bread. Can we please stop with the 'macros could do that' until there's any sort of consensus as to what macros *could* do?Sure, I can try and stop doing that :) -- /Jacob Carlborg
Nov 12 2012
On 11/12/12 20:08, Jacob Carlborg wrote:On 2012-11-12 17:57, Simen Kjaeraas wrote:You know, AST macros could probably stop doing that. Food for thought.Until someone writes a proper DIP on them, macros can write entire software packages, download Hitler, turn D into lisp, and bake bread. Can we please stop with the 'macros could do that' until there's any sort of consensus as to what macros *could* do?Sure, I can try and stop doing that :)
Nov 13 2012
On 12/11/2012 03:30, Walter Bright wrote:On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:The compiler is able to do some optimization on that, and it never forgets to put a barrier where I would. Some algorithms are safe to use concurrently, granted the right barriers are in place. Think of double-checked locking, for instance. This is the very reason why volatile was modified in Java 1.5 to include barriers. I wish D's shared would get semantics close to Java's volatile.It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.Agreed.To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.That is never gonna scale without some kind of ownership of data. Think about slices.
Nov 12 2012
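The double-checked locking deadalnix mentions can be written with today's core.atomic, placing the barriers by hand rather than having shared insert them; a sketch (Resource, initLock, and get are invented, and it assumes atomicLoad/atomicStore accept shared class references):
----------
import core.atomic;
import core.sync.mutex;

class Resource { }

__gshared Mutex initLock;
shared Resource cached;

shared static this() { initLock = new Mutex; }

Resource get()
{
    auto r = atomicLoad!(MemoryOrder.acq)(cached);   // unlocked first check
    if (r is null)
    {
        initLock.lock();
        scope (exit) initLock.unlock();
        r = atomicLoad!(MemoryOrder.raw)(cached);    // re-check under the lock
        if (r is null)
        {
            r = cast(shared) new Resource;
            atomicStore!(MemoryOrder.rel)(cached, r); // publish with release
        }
    }
    return cast(Resource) r;
}
----------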
On 12 November 2012 04:30, Walter Bright <newshound2 digitalmars.com> wrote:On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.I agree completely with the OP, shared is really very unhelpful right now. It just inconveniences you, and forces you to perform explicit casts (which may cast away other attributes like const). I've thought before that what might be useful and practical for shared to do is offer convenient methods to implement precisely what you describe above. Imagine a system where tagging a variable 'shared' would cause it to gain some properties: a mutex; implicit var.lock()/release() methods to call on either side of access to your shared variable; and, unlike the current situation where assignment is illegal, assignment works as usual, but the shared tag implies a runtime check to verify the item is locked when performing assignment (perhaps that runtime check would be removed in -release for performance). This would make implementing the logic you describe above convenient, and you wouldn't need to be declaring explicit mutexes around the place. It would also address the safety by asserting that it is locked whenever accessed.
Nov 12 2012
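Most of Manu's proposal, minus the implicit syntax, can be prototyped as a wrapper; a single-threaded sketch of the asserted-lock mechanism (Locked and all its members are invented names):
----------
import core.sync.mutex;

struct Locked(T)
{
    private T payload;
    private Mutex mtx;
    private bool held;

    this(Mutex m) { mtx = m; }

    void lock()    { mtx.lock(); held = true; }
    void release() { held = false; mtx.unlock(); }

    // the runtime check Manu suggests, removable in -release builds
    ref T get()
    {
        assert(held, "shared data accessed without holding its lock");
        return payload;
    }
}

void example()
{
    auto x = Locked!int(new Mutex);
    x.lock();
    x.get() = 4;                     // passes the check
    x.release();
}
----------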
On 12.11.2012 3:30, Walter Bright wrote:On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { shared int i; (new Thread({ i++; })).start(); }It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
Nov 13 2012
On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:On 12.11.2012 3:30, Walter Bright wrote:Um, sorry, the following code: ////////// import core.thread; void main () { int i; (new Thread({ i++; })).start(); }On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { shared int i; (new Thread({ i++; })).start(); }It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
Nov 13 2012
On 13.11.2012 10:14, luka8088 wrote:On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is a low-level module that just represents the OS functionality.On 12.11.2012 3:30, Walter Bright wrote:Um, sorry, the following code: ////////// import core.thread; void main () { int i; (new Thread({ i++; })).start(); }On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { shared int i; (new Thread({ i++; })).start(); }It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
Nov 13 2012
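Sönke's distinction can be seen directly: std.concurrency's spawn statically rejects arguments with unshared aliasing, while core.thread accepts any closure. A sketch (the worker function is invented):
----------
import std.concurrency;

void worker(shared(int)* p) { }

void main()
{
    int local;
    // spawn(&worker, &local);    // rejected: int* has unshared aliasing

    shared int s;
    spawn(&worker, &s);           // fine: shared data may cross threads
}
----------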
On 13.11.2012 10:20, Sönke Ludwig wrote:On 13.11.2012 10:14, luka8088 wrote:In that case http://dlang.org/faq.html#shared_guarantees is wrong, it is not a correct guarantee. Or at least that should be noted there. If nothing else it is confusing...On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is a low-level module that just represents the OS functionality.On 12.11.2012 3:30, Walter Bright wrote:Um, sorry, the following code: ////////// import core.thread; void main () { int i; (new Thread({ i++; })).start(); }On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { shared int i; (new Thread({ i++; })).start(); }It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
Nov 13 2012
On Tuesday, 13 November 2012 at 10:06:12 UTC, luka8088 wrote:On 13.11.2012 10:20, Sönke Ludwig wrote:You are right, it could probably be added to avoid confusion. But then, non-@safe code is not guaranteed to maintain any type system invariants at all if you don't pay attention to what its requirements are, so memory sharing is not really special in that regard… DavidOnly std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is a low-level module that just represents the OS functionality.In that case http://dlang.org/faq.html#shared_guarantees is wrong, it is not a correct guarantee. Or at least that should be noted there. If nothing else it is confusing...
Nov 13 2012
On Nov 13, 2012, at 1:14 AM, luka8088 <luka8088 owave.net> wrote:On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:On 12.11.2012 3:30, Walter Bright wrote:On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { int i; (new Thread({ i++; })).start(); }It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines: void main() { int i; auto t = new Thread({i++;}); t.start(); t.join(); write(i); } This is perfectly safe and has a deterministic result.
Nov 14 2012
On 14.11.2012 20:54, Sean Kelly wrote:On Nov 13, 2012, at 1:14 AM, luka8088<luka8088 owave.net> wrote:Yes, that makes perfect sense... I just wanted to point out the misguidance in the FAQ because (at least before this forum thread) there is not much written about shared and you can get a wrong idea from it (at least I did).On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines: void main() { int i; auto t = new Thread({i++;}); t.start(); t.join(); write(i); } This is perfectly safe and has a deterministic result.On 12.11.2012 3:30, Walter Bright wrote:////////// import core.thread; void main () { int i; (new Thread({ i++; })).start(); }On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error? And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error:It's starting to get outright embarrassing to talk to newcomers about D's concurrency support because the most fundamental part of it -- the shared type qualifier -- does not have well-defined semantics at all.I think a couple things are clear: 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream. 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute. However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption. To make a shared type work in an algorithm, you have to: 1. ensure single threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutex Also, all op= need to be disabled for shared types.
Nov 14 2012
On 11/13/2012 1:11 AM, luka8088 wrote:This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error?Andrei is a proponent of having shared add memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.And also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error: ////////// import core.thread; void main () { shared int i; (new Thread({ i++; })).start(); }I think it's a user bug.
Nov 13 2012
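Walter calls it a user bug; the conventional repair is an atomic read-modify-write from core.atomic in place of the plain increment:
----------
import core.atomic;
import core.thread;

void main()
{
    shared int i;
    auto t = new Thread({ atomicOp!"+="(i, 1); });  // race-free increment
    t.start();
    t.join();
    assert(atomicLoad(i) == 1);
}
----------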
On Tuesday, 13 November 2012 at 21:29:13 UTC, Walter Bright wrote:On 11/13/2012 1:11 AM, luka8088 wrote:FWIW, I'm with you on this one. Memory barriers would not make shared more useful, as they do not solve the issue with concurrency (as you have explained earlier).This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
On 11/13/12 1:28 PM, Walter Bright wrote:On 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. AndreiThis clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:On 11/13/12 1:28 PM, Walter Bright wrote:I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.On 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. AndreiThis clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
On 11/13/12 2:07 PM, Peter Alexander wrote:On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:Oh ok, thanks. That does make sense. There's been quite a bit of discussion between Bartosz, Walter, and myself about allowing transparent loads and stores as opposed to defining intrinsics x.load and x.store(y). In C++11 both transparent and explicit are allowed, and an emergent idiom is "always use the explicit versions because they clarify flow and cost". AndreiOn 11/13/12 1:28 PM, Walter Bright wrote:I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.On 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. AndreiThis clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error?Andrei is a proponent of having shared add memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
On 13/11/2012 23:07, Peter Alexander wrote:On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:It cannot unless some ownership is introduced in D.On 11/13/12 1:28 PM, Walter Bright wrote:I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.On 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. AndreiThis clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers Is it a FAQ error?Andrei is a proponent of having shared add memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:On 11/13/12 1:28 PM, Walter Bright wrote:I'm just not convinced that having the compiler add memory barriers: 1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us. 2. will result in efficient code I also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions. It just turns a multicore machine into (logically) a single core one, *not* a single threaded one. But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.On 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Nov 13 2012
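Walter's destruction point, sketched: the object is handed back to a single thread before its destructor runs (Connection and teardown are invented; the no-other-references guarantee is the caller's obligation, not something the compiler checks, and the sketch assumes druntime's destroy helper):
----------
class Connection
{
    ~this() { /* close the handle */ }
}

void teardown(shared Connection c)
{
    // By convention, no other thread can still reach c at this point,
    // so it may be treated as thread-local for its final moments.
    auto local = cast(Connection) c;
    destroy(local);                 // run the destructor single-threaded
}
----------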
On 11/13/12 2:22 PM, Walter Bright wrote:On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this: shared int x; ... x = 4; You'll need to use x.load(4) instead. Just for the record I'm okay with this breakage.On 11/13/12 1:28 PM, Walter Bright wrote:I'm just not convinced that having the compiler add memory barriers: 1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us. 2. will result in efficient codeOn 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.I also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions.It does eliminate all low-level races.It just turns a multicore machine into (logically) a single core one, *not* a single threaded one.This is very approximate.But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
Nov 13 2012
On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu wrote:shared int x; ... x = 4; You'll need to use x.load(4) instead.You mean x.store(4)? Or am I completely misunderstanding your message? David
Nov 13 2012
On 11/13/12 3:07 PM, David Nadlinger wrote:On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu wrote:Apologies, yes, store. Andreishared int x; ... x = 4; You'll need to use x.load(4) instead.You mean x.store(4)? Or am I completely misunderstanding your message? David
Nov 13 2012
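With today's core.atomic the explicit idiom Andrei describes already exists; under the proposed breakage, plain assignment to a shared int would stop compiling and this spelling would be the sanctioned one:
----------
import core.atomic;

shared int x;

void f()
{
    atomicStore(x, 4);             // sequentially consistent store, was: x = 4
    int v = atomicLoad(x);         // sequentially consistent load,  was: v = x
}
----------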
On 13-11-2012 23:33, Andrei Alexandrescu wrote:On 11/13/12 2:22 PM, Walter Bright wrote:Is that meant to be an atomic store, or just a regular, but explicit, store? (I know you meant store.)On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this: shared int x; ... x = 4; You'll need to use x.load(4) instead.On 11/13/12 1:28 PM, Walter Bright wrote:I'm just not convinced that having the compiler add memory barriers: 1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us. 2. will result in efficient codeOn 11/13/2012 1:11 AM, luka8088 wrote:Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.This clarifies a lot, but still a lot of people get confused with: http://dlang.org/faq.html#shared_memory_barriers is it a faq error ?Andrei is a proponent of having shared to memory barriers, I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.Just for the record I'm okay with this breakage.-- Alex Rønne Petersen alex lycus.org http://lycus.orgI also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions.It does eliminate all low-level races.It just turns a multicore machine into (logically) a single core one, *not* a single threaded one.This is very approximate.But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
Nov 13 2012
On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:On 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
On 14-11-2012 00:38, Andrei Alexandrescu wrote:On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ... So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code. So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegates -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
On 14-11-2012 00:43, Alex Rønne Petersen wrote:On 14-11-2012 00:38, Andrei Alexandrescu wrote:Scratch that, make it this: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * references * function pointers Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list. -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 11/13/12 3:28 PM, Alex Rønne Petersen wrote:OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ... So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code. So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesOn 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
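The reason slices and delegates fall off the list is their width: each is a two-word pair, so a 64-bit target would need a 16-byte atomic load/store. A quick check:
----------
// both are two machine words, hence 16 bytes on a 64-bit target
static assert((int[]).sizeof == 2 * size_t.sizeof);
static assert((void delegate()).sizeof == 2 * size_t.sizeof);
----------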
On 14/11/2012 00:48, Alex Rønne Petersen wrote:On 14-11-2012 00:43, Alex Rønne Petersen wrote:That list sounds more reasonable.On 14-11-2012 00:38, Andrei Alexandrescu wrote:Scratch that, make it this: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * references * function pointers Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ... So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code. So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesOn 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
On 14-11-2012 02:52, Andrei Alexandrescu wrote:On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit. (I deliberately talk in terms of bytes here because that's the nomenclature most architecture manuals use from what I've seen.) -- Alex Rønne Petersen alex lycus.org http://lycus.orgSlices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:On 14-11-2012 02:52, Andrei Alexandrescu wrote:Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". AndreiOn 11/13/12 3:48 PM, Alex Rønne Petersen wrote:I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
On 14-11-2012 03:02, Andrei Alexandrescu wrote:On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising. -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 14-11-2012 02:52, Andrei Alexandrescu wrote:Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". AndreiOn 11/13/12 3:48 PM, Alex Rønne Petersen wrote:I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
On 11/14/2012 3:05 AM, Alex Rønne Petersen wrote:On 14-11-2012 03:02, Andrei Alexandrescu wrote:On x86 you can use LOCK CMPXCHG16b to do the atomic read: http://stackoverflow.com/questions/9726566/atomic-16-byte-read-on-x64-cpus This just excludes a small number of early AMD processors.On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising.On 14-11-2012 02:52, Andrei Alexandrescu wrote:Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". AndreiOn 11/13/12 3:48 PM, Alex Rønne Petersen wrote:I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 14 2012
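The trick behind that Stack Overflow answer can be sketched through core.atomic.cas: compare-and-swap the location against itself, so the locked instruction performs the read. Whether druntime's cas accepts a particular 16-byte struct depends on the version and target (it needs cmpxchg16b), so treat this as a sketch of the instruction-level idea rather than guaranteed API behavior:

    import core.atomic : cas;

    struct Pair { long a, b; }   // 16 bytes on 64-bit, like a slice or delegate

    Pair atomicLoadPair(shared(Pair)* p)
    {
        // The plain read may be torn; the CAS then atomically compares the
        // snapshot with the current contents and only succeeds (rewriting
        // the same value) if the snapshot was consistent.
        Pair snapshot = *cast(Pair*) p;
        while (!cas(p, snapshot, snapshot))
            snapshot = *cast(Pair*) p;   // torn or stale read, retry
        return snapshot;
    }

One cost worth noting: even a pure read then writes the cache line, since LOCK CMPXCHG16B always performs a store.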
On 14/11/2012 00:43, Alex Rønne Petersen wrote:On 14-11-2012 00:38, Andrei Alexandrescu wrote:I wouldn't expect it to work for delegates, long, ulong, double and slices on every arch. If it does work, that is awesome, and adds to my determination that this is the thing to do.On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ... So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code. So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesOn 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
On 14-11-2012 01:09, deadalnix wrote:On 14/11/2012 00:43, Alex Rønne Petersen wrote:8-byte atomic loads/stores are doable on all major architectures. -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 14-11-2012 00:38, Andrei Alexandrescu wrote:I wouldn't expected it to work for delegates, long, ulong, double and slice on every arch. If it does work, that is awesome, and add to my determination that this is the thing to do.On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ... So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code. So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesOn 13-11-2012 23:33, Andrei Alexandrescu wrote:Atomic and sequentially consistent. Andreishared int x; ... x = 4; You'll need to use x.store(4) instead.Is that meant to be an atomic store, or just a regular, but explicit, store?
Nov 13 2012
On 11/13/12 5:33 PM, Alex Rønne Petersen wrote:8-byte atomic loads/stores is doable on all major architectures.We're looking at 128-bit load, store, and CAS for 64-bit machines. Andrei
Nov 13 2012
On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesNot going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
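The "issue an error where it won't work" idea is expressible today as a library constraint; a sketch, where the pointer-width cutoff is an assumption about the target rather than something queried from the compiler:

    import core.atomic : atomicStore;

    template fitsAtomically(T)
    {
        // assumption: up to pointer width is portably atomic;
        // anything wider needs a CAS loop or a lock
        enum fitsAtomically = T.sizeof <= (void*).sizeof;
    }

    void sharedStore(T)(ref shared T dst, T val)
        if (fitsAtomically!T)
    {
        atomicStore(dst, val);
    }

    static assert(fitsAtomically!int);
    static assert(!fitsAtomically!(void delegate()));  // two words: rejected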
On 14-11-2012 02:33, Walter Bright wrote:On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again... -- Alex Rønne Petersen alex lycus.org http://lycus.orgFWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesNot going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :On 14-11-2012 02:33, Walter Bright wrote:http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.htmlOn 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesNot going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
On 14-11-2012 03:00, deadalnix wrote:Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :Thanks, exactly that. No MIPS, though. I guess I'm going to have to go dig through their manuals. -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 14-11-2012 02:33, Walter Bright wrote:http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.htmlOn 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture: * ubyte, byte * ushort, short * uint, int * ulong, long * float, double * pointers * slices * references * function pointers * delegatesNot going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:As long as a cast is required along the way, we can't claim victory. I need to think about that scenario.Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Nov 13 2012
On 11/13/12 5:29 PM, Walter Bright wrote:On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:Please don't. This is "we're doing better than C++" in disguise and exactly the wrong frame of mind. I find few things more negatively disruptive than lulling into a false sense of achievement. AndreiAs long as a cast is required along the way, we can't claim victory. I need to think about that scenario.Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Nov 13 2012
On Tuesday, November 13, 2012 14:33:50 Andrei Alexandrescu wrote:As long as a cast is required along the way, we can't claim victory. I need to think about that scenario.At this point, I don't see how it could be otherwise. Having the shared equivalent of const would just lead to that being used everywhere and defeat the purpose of shared in the first place. If it's not segregated, it's not doing its job. But that leaves us with most functions not working with shared, which is also a problem. Templates are a partial solution, but they obviously don't work for everything. In general, I would expect that all uses of shared would be protected by a mutex or synchronized block or other similar construct. It's just going to cause problems to do otherwise. There are some cases where if you can guarantee that writes and reads are atomic, you're fine skipping the mutexes, but those are relatively rare, particularly when you consider the issues in making anything but extremely trivial writes or reads atomic. That being the case, it doesn't really seem all that unreasonable to me for it to be normal to have to cast shared to non-shared to pass to functions as long as all of that code is protected with a mutex or another, similar construct - though if those functions aren't pure, you _could_ run into entertaining problems when a non-shared reference to the data gets squirreled away somewhere in those function calls. But we seem to have contradictory requirements here of trying to segregate shared from normal, thread-local stuff but are still looking to be able to use shared with functions intended to be used with non-shared stuff. - Jonathan M Davis
Nov 13 2012
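The cast-under-lock idiom Jonathan describes, as a sketch with made-up names; keeping the thread-local reference from escaping the locked region is entirely the programmer's obligation:

    import core.sync.mutex : Mutex;
    import std.algorithm : sort;

    shared int[] results;
    __gshared Mutex resultsLock;

    shared static this() { resultsLock = new Mutex; }

    void sortResults()
    {
        synchronized (resultsLock)
        {
            // strip shared while the lock is held so ordinary
            // thread-local code can operate on the data
            auto local = cast(int[]) results;
            sort(local);
        } // the cast-away reference must not outlive the lock
    }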
On 13/11/2012 23:22, Walter Bright wrote:I'm just not convinced that having the compiler add memory barriers: 1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us. 2. will result in efficient code I also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions. It just turns a multicore machine into (logically) a single core one, *not* a single threaded one.That is what Java's volatile does. It has several use cases, including valid double-checked locking (it has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefulness and that it requires language support) and the disruptor, which I wanted to implement for message passing in D but couldn't because of lack of support at the time. See: http://www.slideshare.net/trishagee/introduction-to-the-disruptor So sequentially consistent reads/writes are useful.But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.Java and C# are multithreaded languages, have everything shared and are still able to have finalizers of some sort. Why couldn't a shared object be destroyed? Why should it be destroyed in a specific thread, as it can only refer to shared data because of transitivity?
Nov 13 2012
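For concreteness, correctly done double-checked locking needs exactly the ordered loads/stores under discussion. A sketch with illustrative names, assuming core.atomic's ordering parameter (spelled MemoryOrder in current druntime; older versions used msync):

    import core.atomic;
    import core.sync.mutex : Mutex;

    struct Config { int verbosity; }

    shared Config* cached;         // null until first use
    __gshared Mutex initLock;

    shared static this() { initLock = new Mutex; }

    shared(Config)* getConfig()
    {
        auto c = atomicLoad!(MemoryOrder.acq)(cached);    // fast path, no lock
        if (c is null)
        {
            synchronized (initLock)
            {
                c = atomicLoad!(MemoryOrder.raw)(cached); // re-check under the lock
                if (c is null)
                {
                    auto fresh = cast(shared(Config)*) new Config(3);
                    // the release store publishes the fully built object
                    atomicStore!(MemoryOrder.rel)(cached, fresh);
                    c = fresh;
                }
            }
        }
        return c;
    }

Without the acquire/release pair, the unlocked first check can observe a pointer to a not-yet-initialized object, which is the classic way this idiom goes wrong.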
On 11/13/2012 4:04 PM, deadalnix wrote:That is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM,Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.So sequentially consistent read/write are usefull.Sure, I agree with that.multithread, have everything shared and still are able to have finalizer of some sort.I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.Why couldn't a shared object be destroyed ? Why should it be destroyed in a specific thread as it can only refer shared data because of transitivity ?How can you destroy an object in one thread when another thread holds live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Nov 13 2012
On 14/11/2012 02:39, Walter Bright wrote:On 11/13/2012 4:04 PM, deadalnix wrote:http://d.puremagic.com/issues/show_bug.cgi?id=6607Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.Why would you destroy something that isn't dead yet?So sequentially consistent read/write are usefull.Sure, I agree with that.language multithread, have everything shared and still are able to have finalizer of some sort.I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.Why couldn't a shared object be destroyed ? Why should it be destroyed in a specific thread as it can only refer shared data because of transitivity ?How can you destroy an object in one thread when another thread holding live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Nov 13 2012
On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:That is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
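The memory-order parameter David mentions looks like this in use; MemoryOrder is the current spelling (earlier druntimes spelled it msync):

    import core.atomic;

    shared bool ready;

    bool isReady() { return atomicLoad(ready); }     // seq. consistent default

    void markReady()
    {
        atomicStore!(MemoryOrder.rel)(ready, true);  // explicit release store
    }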
On 14/11/2012 13:23, David Nadlinger wrote:On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:It is a solution now (it wasn't at the time). The main drawback with that solution is that the compiler can't optimize thread-local reads/writes independently of shared reads/writes. This is a wasted opportunity.That is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
On Wednesday, 14 November 2012 at 13:19:12 UTC, deadalnix wrote:The main drawback with that solution is that the compiler can't optimize thread local read/write regardless of shared read/write. This is wasted opportunity.You mean moving non-atomic loads/stores across atomic instructions? This is simply a matter of the compiler providing the right intrinsics for implementing the core.atomic functions. LDC already does it. David
Nov 14 2012
On 11/14/12 4:23 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). AndreiThat is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
On 14-11-2012 15:32, Andrei Alexandrescu wrote:On 11/14/12 4:23 AM, David Nadlinger wrote:They already work as they should: * DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work. * GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency). * LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe. I don't think there's anything that actually needs to be fixed there. -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). AndreiThat is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
On 11/14/12 7:11 AM, Alex Rønne Petersen wrote:On 14-11-2012 15:32, Andrei Alexandrescu wrote:The language definition should be made clear so that future optimizations of existing implementations, and future implementations, don't push things over the limit. AndreiOn 11/14/12 4:23 AM, David Nadlinger wrote:They already work as they should: * DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work. * GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency). * LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe. I don't think there's anything that actually needs to be fixed there.On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). AndreiThat is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei Alexandrescu wrote:On 11/14/12 4:23 AM, David Nadlinger wrote:Sorry, I don't quite see where I simplified things. Yes, in the implementation of atomicLoad/atomicStore, one would probably use compiler intrinsics, as done in LDC's druntime, or inline assembly, as done for DMD. But an optimizer will never move instructions across opaque function calls, because they could have arbitrary side effects. So, either we are fine by definition, or if the compiler inlines the atomicLoad/atomicStore calls (which is actually possible in LDC), then its optimizer will detect the presence of inline assembly resp. the load/store intrinsics, and take care of not reordering the instructions in an invalid way. I don't see how this makes my answer to deadalnix (that »volatile« is not necessary to implement sequentially consistent loads/stores) any less valid. DavidOn Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).That is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
Nov 14 2012
On 11/14/12 8:59 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei Alexandrescu wrote:First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.On 11/14/12 4:23 AM, David Nadlinger wrote:Sorry, I don't quite see where I simplified things.On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).That is what java's volatile do. It have several uses cases, including valid double check locking (It has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefullness and that it require language support) and disruptor which I wanted to implement for message passing in D but couldn't because of lack of support at the time.What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). DavidYes, in the implementation of atomicLoad/atomicStore, one would probably use compiler intrinsics, as done in LDC's druntime, or inline assembly, as done for DMD. But an optimizer will never move instructions across opaque function calls, because they could have arbitrary side effects.Nowhere in the language definition is it explained what an opaque function call is and what optimizations can and cannot be done in the presence of such.So, either we are fine by definition,s/definition/happenstance/or if the compiler inlines the atomicLoad/atomicStore calls (which is actually possible in LDC), then its optimizer will detect the presence of inline assembly resp. the load/store intrinsics, and take care of not reordering the instructions in an invalid way. I don't see how this makes my answer to deadalnix (that »volatile« is not necessary to implement sequentially consistent loads/stores) any less valid.Using load/store everywhere would make volatile unneeded (and for us, shared). But the advantage there is that you qualify the type/value once and then you don't need to remember to only use specific primitives to manipulate it. Andrei
Nov 14 2012
On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short-term thing.
Nov 14 2012
On 11/14/12 12:04 PM, Sean Kelly wrote:On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:I think we should focus on sequential consistency as that's where the industry is converging. AndreiFirst, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short term thing.
Nov 14 2012
On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
Nov 14 2012
On 14/11/2012 21:01, Sean Kelly wrote:On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
Nov 15 2012
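The memory-mapping use deadalnix means is about forcing real loads and stores, not about inter-thread ordering. Druntime eventually grew core.volatile for exactly this niche; a sketch with a made-up register address:

    import core.volatile : volatileLoad;

    bool deviceReady()
    {
        uint* statusReg = cast(uint*) 0xFFFF_0000;  // hypothetical MMIO register
        // volatileLoad forces an actual load on each call; a plain dereference
        // could legally be cached or hoisted out of a polling loop
        return (volatileLoad(statusReg) & 1) != 0;
    }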
On Nov 15, 2012, at 4:54 AM, deadalnix <deadalnix gmail.com> wrote:On 14/11/2012 21:01, Sean Kelly wrote:On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?
Nov 15 2012
On 15/11/2012 17:33, Sean Kelly wrote:On Nov 15, 2012, at 4:54 AM, deadalnix<deadalnix gmail.com> wrote:I'm not aware of any D1 compiler inserting memory barriers, so any memory operation reordering done by the CPU would have screwed things up.On 14/11/2012 21:01, Sean Kelly wrote:Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:It is sufficient for monocore and mostly correct for x86. But isn't enough. volatile isn't for concurency, but memory mapping.This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
Nov 16 2012
On Nov 14, 2012, at 12:01 PM, Sean Kelly <sean invisibleduck.org> wrote:On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.asm blocks. Darn auto-correct.
Nov 14 2012
On 2012-11-13 23:22, Walter Bright wrote:But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough? -- /Jacob Carlborg
Nov 13 2012
On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
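A sketch of such a library barrier, assuming x86 and DMD-style inline asm; GDC and LDC would reach the same effect through their intrinsics, and core.atomic's atomicFence is the portable spelling:

    void fullFence()
    {
        version (D_InlineAsm_X86_64)
            asm { mfence; }
        else version (D_InlineAsm_X86)
            asm { mfence; }
        else
        {
            import core.atomic : atomicFence;
            atomicFence();  // portable fallback
        }
    }

This works in practice because DMD also treats an asm block as a scheduling barrier; an opaque hardware fence alone would not stop compiler reordering, which is the crux of the disagreement in this thread.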
On 2012-11-14 10:20, Walter Bright wrote:Memory barriers can certainly be added using library functions.Is there then any real advantage of having it directly in the language? -- /Jacob Carlborg
Nov 14 2012
On 11/14/2012 1:31 AM, Jacob Carlborg wrote:On 2012-11-14 10:20, Walter Bright wrote:Not that I can think of.Memory barriers can certainly be added using library functions.Is there then any real advantage of having it directly in the language?
Nov 14 2012
On 2012-11-14 11:38, Walter Bright wrote:Not that I can think of.Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work. -- /Jacob Carlborg
Nov 14 2012
On 11/14/12 4:47 AM, Jacob Carlborg wrote:On 2012-11-14 11:38, Walter Bright wrote:Actually this hypothesis is false. AndreiNot that I can think of.Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work.
Nov 14 2012
On 2012-11-14 15:33, Andrei Alexandrescu wrote:Actually this hypothesis is false.That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true. -- /Jacob Carlborg
Nov 14 2012
On 11/14/12 7:14 AM, Jacob Carlborg wrote:On 2012-11-14 15:33, Andrei Alexandrescu wrote:The hypothesis that atomic primitives can be implemented as a library. AndreiActually this hypothesis is false.That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true.
Nov 14 2012
On 2012-11-14 18:36, Andrei Alexandrescu wrote:The hypothesis that atomic primitives can be implemented as a library.I don't know these kinds of things, that's why I'm asking. -- /Jacob Carlborg
Nov 14 2012
On 14/11/2012 10:31, Jacob Carlborg wrote:On 2012-11-14 10:20, Walter Bright wrote:The compiler can do more reordering when it is aware of the barriers. For instance, the compiler may reorder thread-local reads/writes across the barrier. This can't be done with a library solution.Memory barriers can certainly be added using library functions.Is there then any real advantage of having it directly in the language?
Nov 14 2012
On 2012-11-14 12:04, deadalnix wrote:The compiler can do more reordering in regard to barriers. For instance, the compiler may reorder thread local read write accross the barrier. This can't be done with a library solution.I see. -- /Jacob Carlborg
Nov 14 2012
On 11/14/12 1:31 AM, Jacob Carlborg wrote:On 2012-11-14 10:20, Walter Bright wrote:It's not an advantage, it's a necessity. AndreiMemory barriers can certainly be added using library functions.Is there then any real advantage of having it directly in the language?
Nov 14 2012
On 2012-11-14 15:22, Andrei Alexandrescu wrote:It's not an advantage, it's a necessity.Walter seems to indicate that there is no technical reason for "shared" to be part of the language. I don't know how these memory barriers work, that's why I'm asking. Does it need to be in the language or not? -- /Jacob Carlborg
Nov 14 2012
On 11/14/12 7:16 AM, Jacob Carlborg wrote:On 2012-11-14 15:22, Andrei Alexandrescu wrote:Walter is a self-confessed dilettante in threading. To be frank I hope he asks more and answers less in this thread.It's not an advantage, it's a necessity.Walter seems to indicate that there is no technical reason for "shared" to be part of the language.I don't know how these memory barriers work, that's why I'm asking. Does it need to be in the language or not?Memory ordering must be built into the language and understood by the compiler. Andrei
Nov 14 2012
On 2012-11-14 18:40, Andrei Alexandrescu wrote:Memory ordering must be built into the language and understood by the compiler.Ok, thanks for the explanation. -- /Jacob Carlborg
Nov 14 2012
On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such that, e.g., it doesn't hoist code above an acquire barrier or below a release barrier. AndreiIf the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:On 11/14/12 1:20 AM, Walter Bright wrote:Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«. DavidOn 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On 11/14/12 9:15 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:Compiler intrinsics ====== built into the language. AndreiOn 11/14/12 1:20 AM, Walter Bright wrote:Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On 14 November 2012 17:50, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 9:15 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.Compiler intrinsics ====== built into the language. AndreiNot necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it. Regards, -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 14 2012
On 11/14/12 11:21 AM, Iain Buclaw wrote:On 14 November 2012 17:50, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:aware of what it is and what it does ====== built into the language. AndreiOn 11/14/12 9:15 AM, David Nadlinger wrote:Not necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it.On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:Compiler intrinsics ====== built into the language. AndreiOn 11/14/12 1:20 AM, Walter Bright wrote:Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
Nov 14 2012
On 14-11-2012 21:00, Sean Kelly wrote:On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example. See also: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP20 -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 11/14/12 1:20 AM, Walter Bright wrote:That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
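Sean's acquire-before/release-after description can be approximated with today's fences; a sketch, assuming current core.atomic's atomicFence with its MemoryOrder parameter. Note it only constrains the compiler because atomicFence is compiler-known, which is the intrinsics-versus-opaque-calls point argued above:

    import core.atomic : atomicFence, MemoryOrder;

    // approximate the old volatile-statement semantics: an acquire
    // fence before the block and a release fence after it
    void volatileBlock(scope void delegate() statements)
    {
        atomicFence!(MemoryOrder.acq)();
        statements();
        atomicFence!(MemoryOrder.rel)();
    }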
On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:On 14-11-2012 21:00, Sean Kelly wrote:On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block. Or for a first cut just insert a full barrier at the beginning and end of the block. Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use. I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now. But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files. Druntime actually had this for core.atomic on PPC until not too long ago.
Nov 14 2012
On 14-11-2012 21:15, Sean Kelly wrote:On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:Well, there's not much point in that when all compilers have intrinsics anyway (e.g. GDC has __sync_* and __atomic_* and LDC has some intrinsics in ldc.intrinsics that map to certain LLVM instructions). -- Alex Rønne Petersen alex lycus.org http://lycus.orgOn 14-11-2012 21:00, Sean Kelly wrote:Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block. Or for a first cut just insert a full barrier at the beginning and end of the block. Either way, it should be pretty simply for a compiler to handle if the compiler supports mutex use. I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now. But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files. Druntime actually had this for core.atomic on PPC until not too long ago.On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.On 11/14/12 1:20 AM, Walter Bright wrote:That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On 11/14/12 12:00 PM, Sean Kelly wrote:On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:Because it's better to associate volatility with data than with code. AndreiOn 11/14/12 1:20 AM, Walter Bright wrote:That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 14 2012
On Nov 14, 2012, at 2:21 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 12:00 PM, Sean Kelly wrote:On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.Because it's better to associate volatility with data than with code.Fair enough. Though this may mean building a bunch of different forms of volatility into the language. I always saw "volatile" as a library tool anyway, so while making it code-related was a bit weird, it was a sufficient tool for the job.
Nov 14 2012
On 14/11/2012 23:21, Andrei Alexandrescu wrote:On 11/14/12 12:00 PM, Sean Kelly wrote:Happy to see I'm not alone on that one. Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient for concurrency.On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:Because it's better to associate volatility with data than with code.On 11/14/12 1:20 AM, Walter Bright wrote:That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.On 11/13/2012 11:37 PM, Jacob Carlborg wrote:The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.
Nov 15 2012
On Nov 15, 2012, at 5:10 AM, deadalnix <deadalnix gmail.com> wrote:On 14/11/2012 23:21, Andrei Alexandrescu wrote:On 11/14/12 12:00 PM, Sean Kelly wrote:On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 1:20 AM, Walter Bright wrote:On 11/13/2012 11:37 PM, Jacob Carlborg wrote:If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?Memory barriers can certainly be added using library functions.The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.Because it's better to associate volatility with data than with code.Happy to see I'm not alone on that one. Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient for concurrency.It's sufficient for concurrency when coupled with library code that does the hardware-level synchronization. In short, a program has two separate machines doing similar optimizations on it: the compiler and the CPU. In D we can use ASM to control CPU optimizations, and in D1 we had "volatile" to control compiler optimizations. "volatile" was the minimum required for handling the compiler portion and was easy to get wrong, but it used only one keyword and I suspect was relatively easy to implement on the compiler side as well.
Nov 15 2012
On 11/13/12 11:37 PM, Jacob Carlborg wrote:On 2012-11-13 23:22, Walter Bright wrote:But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?The compiler must be in this so as to not do certain reorderings. Andrei
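A minimal sketch of the destruction pattern Walter describes, assuming only one thread can still reach the object at shutdown; the Handle type and shutdown function are illustrative, not code from this thread:

struct Handle
{
    int fd;
    ~this() { /* release fd */ }
}

shared Handle g_handle;

void shutdown()
{
    // destroy(g_handle);  // rejected: what would destroying a value
    //                     // still reachable from other threads mean?

    // The programmer asserts that the object has become thread-local --
    // no other thread can touch it anymore -- and only then destroys it.
    destroy(*cast(Handle*) &g_handle);
}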
Nov 14 2012
On Tuesday, November 13, 2012 14:22:07 Walter Bright wrote:I'm just not convinced that having the compiler add memory barriers: 1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us. 2. will result in efficient codeBeing able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers. If there's a major performance penalty though, that might be a reason not to do it. Certainly, I don't think that there's any question that adding memory barriers won't make it so that you don't need mutexes or synchronized blocks or whatnot. shared's primary benefit is in logically separating normal code from code that must share data across threads and making it possible for the compiler to optimize based on the fact that it knows that a variable is thread-local. - Jonathan M Davis
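For reference, a minimal sketch of the double-checked locking pattern Jonathan mentions, written against core.atomic; the Config type, the getConfig function, and the bare synchronized statement are illustrative assumptions, not code from this thread:

import core.atomic;

class Config { /* expensive to build */ }

shared Config g_config;

Config getConfig()
{
    // Fast path: one acquire load, no lock taken once initialized.
    auto cfg = atomicLoad!(MemoryOrder.acq)(g_config);
    if (cfg is null)
    {
        synchronized  // statement-level global mutex; slow path only
        {
            cfg = atomicLoad!(MemoryOrder.acq)(g_config);
            if (cfg is null)
            {
                cfg = new shared(Config);
                // The release store publishes the fully built object.
                atomicStore!(MemoryOrder.rel)(g_config, cfg);
            }
        }
    }
    return cast(Config) cfg;  // cast away shared after safe publication
}

Without the acquire/release pairing (or the fences this thread is debating), the fast-path read can observe a partially constructed object, which is exactly why the pattern is brittle when barriers are left entirely to the programmer.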
Nov 13 2012
On 11/13/2012 11:56 PM, Jonathan M Davis wrote:Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers.I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Nov 14 2012
On 11/14/12 1:19 AM, Walter Bright wrote:On 11/13/2012 11:56 PM, Jonathan M Davis wrote:Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers.I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.Andrei
Nov 14 2012
On 14-11-2012 15:14, Andrei Alexandrescu wrote:On 11/14/12 1:19 AM, Walter Bright wrote:On 11/13/2012 11:56 PM, Jonathan M Davis wrote:Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers.I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.AndreiI need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining. (And note that you can't optimize this either; since the dependencies memory barriers are supposed to express are subtle and not detectable by a compiler, the compiler would always have to insert them because it can't know when it would be safe not to.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
On 14/11/2012 15:39, Alex Rønne Petersen wrote:On 14-11-2012 15:14, Andrei Alexandrescu wrote:I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load. See http://g.oswego.edu/dl/jmm/cookbook.html for more information on the barriers required on different architectures.(And note that you can't optimize this either; since the dependencies memory barriers are supposed to express are subtle and not detectable by a compiler, the compiler would always have to insert them because it can't know when it would be safe not to.)The compiler is aware of what is thread local and what isn't. It means the compiler can fully optimize TL stores and loads (like doing register promotion or reordering them across shared stores/loads). This has a cost, indeed, but is useful, and Walter's solution of casting away shared when a mutex is acquired is always available.
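A minimal sketch of the store-then-load case deadalnix refers to, in the style of Dekker's handshake (all names are illustrative); only the seq level implies the MFENCE on x86:

import core.atomic;

shared int x, y;

void thread1()
{
    atomicStore!(MemoryOrder.seq)(x, 1);  // seq store: full fence on x86
    int r1 = atomicLoad!(MemoryOrder.seq)(y);
    // ...
}

void thread2()
{
    atomicStore!(MemoryOrder.seq)(y, 1);
    int r2 = atomicLoad!(MemoryOrder.seq)(x);
    // ...
}

// Under sequential consistency, r1 == 0 && r2 == 0 is impossible. With
// plain (MemoryOrder.raw) operations, x86's store buffer can delay each
// store past the following load, and both threads may read 0 -- the one
// reordering x86 actually performs.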
Nov 14 2012
On 14-11-2012 15:50, deadalnix wrote:On 14/11/2012 15:39, Alex Rønne Petersen wrote:I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load.I just used x86's fencing instructions as an example because most people here are familiar with it. The problem is much, much bigger on architectures like ARM, MIPS, and PowerPC which are not in-order.See http://g.oswego.edu/dl/jmm/cookbook.html for more information on the barriers required on different architectures.The compiler is aware of what is thread local and what isn't. It means the compiler can fully optimize TL stores and loads (like doing register promotion or reordering them across shared stores/loads).Thread-local loads and stores are not atomic and thus do not take part in the reordering constraints that atomic operations impose. See e.g. the LLVM docs for atomicrmw and atomic load/store.This has a cost, indeed, but is useful, and Walter's solution of casting away shared when a mutex is acquired is always available.-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:On 14-11-2012 15:14, Andrei Alexandrescu wrote:I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.I think at this point we need to develop a better understanding of what's going on before issuing assessments. Andrei
Nov 14 2012
On 14-11-2012 16:08, Andrei Alexandrescu wrote:On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Let's continue this part of the discussion in my other reply (the one explaining how core.atomic is implemented in the various compilers).Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.I think at this point we need to develop a better understanding of what's going on before issuing assessments.I dunno. On low-end architectures like ARM the out-of-order processing is pretty much what makes them usable at all because they don't have the raw power that x86 does (I even recall an ARM Holdings executive saying that they couldn't possibly switch to a strong memory model with an in-order pipeline without severely reducing the efficiency of ARM). So I'm just putting that out there - it's definitely worth taking into consideration because very few architectures are actually fully in-order like x86.-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available, otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD inline asm is used, which prohibits compiler code motion). Thus, »we«, meaning on a language level, don't need to change anything about the current situation, with the possible exception of adding finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the duty of the compiler writers to provide the appropriate means to implement druntime on their code generation infrastructure – and indeed, the situation in DMD could be improved; using inline asm is hitting a fly with a sledgehammer. David [1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.
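For concreteness, a small sketch of the druntime primitives David refers to; the flag name and the two helper functions are illustrative assumptions:

import core.atomic;

shared int readyFlag;

// Publish with release semantics: earlier stores may not be sunk
// below this store, by either the compiler or the CPU.
void publish() { atomicStore!(MemoryOrder.rel)(readyFlag, 1); }

// Consume with acquire semantics: later loads may not be hoisted
// above this load.
bool isReady() { return atomicLoad!(MemoryOrder.acq)(readyFlag) == 1; }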
Nov 14 2012
On Wednesday, 14 November 2012 at 17:31:07 UTC, David Nadlinger wrote:Thus, »we«, meaning on a language level, don't need to change anything about the current situation, […]Let me clarify that: We don't necessarily need to tack on any extra semantics to the language other than what we currently have. However, what we must indeed do is clarify/specify the implicit consensus on which the current implementations are built. We really need a »The D Memory Model«-style document. David
Nov 14 2012
On 11/14/12 9:31 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available, otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD inline asm is used, which prohibits compiler code motion).Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.Thus, »we«, meaning on a language level, don't need to change anything about the current situation, with the possible exception of adding finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the duty of the compiler writers to provide the appropriate means to implement druntime on their code generation infrastructure – and indeed, the situation in DMD could be improved; using inline asm is hitting a fly with a sledgehammer.That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.David [1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.We could start with sequential consistency and then explore riskier/looser policies. Andrei
Nov 14 2012
On 14 November 2012 19:54, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 9:31 AM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available, otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD inline asm is used, which prohibits compiler code motion).Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day. This is a very big deal. I would be scared to see the compiler generate intrinsic calls to atomic synchronisation primitives. It's almost like banning architectures from the language. The Nintendo Wii for instance, not an unpopular machine, only sold 130 million units! Does not have synchronisation instructions in the architecture (insane, I know, but there it is. I've had to spend time working around this in the past). I'm sure it's not unique in this way. People getting fancy with lock-free/atomic operations will probably wrap it up in libraries. And they're not globally applicable; atomic memory operations don't magically solve problems, they require very specific structures and access patterns around them. I'm just not convinced they should be intrinsics issued by the language. They're just not as well standardised as 'int' or 'float'. Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed. It's simplistic, but it's safe, and it works with the same primitives that already exist and are proven. Let the programmer mark the lock/unlock moments, worry about sequencing, etc... at least for the time being. Don't try and do it automatically (yet). The broad use cases in D aren't yet known, but making 'shared' useful today would be valuable. (A sketch of this lockable-shared idea follows below.)Thus, »we«, meaning on a language level, don't need to change anything about the current situation, with the possible exception of adding finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the duty of the compiler writers to provide the appropriate means to implement druntime on their code generation infrastructure – and indeed, the situation in DMD could be improved; using inline asm is hitting a fly with a sledgehammer.That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.David[1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.We could start with sequential consistency and then explore riskier/looser policies.Andrei
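A minimal sketch of Manu's lockable-shared side note, done as a library type since the language doesn't provide it; the Locked name, its members, and the runtime-assert policy are all assumptions for illustration:

import core.sync.mutex;

final class Locked(T)
{
    private shared T payload;
    private Mutex mtx;
    private bool held;

    this() { mtx = new Mutex; }

    void lock()   { mtx.lock();  held = true; }
    void unlock() { held = false; mtx.unlock(); }

    // Access casts away shared in exactly one audited place, and a
    // runtime assert (heuristic, not a proof) insists the lock is held.
    ref T get()
    {
        assert(held, "shared value accessed without holding its lock");
        return *cast(T*) &payload;
    }
}

// Usage sketch:
//   auto counter = new Locked!int;
//   counter.lock();
//   counter.get() += 1;
//   counter.unlock();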
Nov 15 2012
On 15/11/2012 10:08, Manu wrote:The Nintendo Wii for instance, not an unpopular machine, only sold 130 million units! Does not have synchronisation instructions in the architecture (insane, I know, but there it is. I've had to spend time working around this in the past). I'm sure it's not unique in this way.Can you elaborate on that?
Nov 15 2012
On 11/15/12 1:08 AM, Manu wrote:On 14 November 2012 19:54, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote: Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT. I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed.This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused. Andrei
Nov 15 2012
On Nov 15, 2012, at 7:17 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/15/12 1:08 AM, Manu wrote:Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed.This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.By extension, I'd suggest postponing anything related to classes as well.
Nov 15 2012
On 15 November 2012 17:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/15/12 1:08 AM, Manu wrote:I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed.This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.I'm not conflating the two, I'm suggesting we stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term; long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, i.e., don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion: runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Nov 16 2012
On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:I'm not conflating the two, I'm suggesting we stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term; long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion: runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Nov 16 2012
On 16 November 2012 12:09, Pragma Tix <bizprac orange.fr> wrote:Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.comLooks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared. That's a minimum that people will expect: it's a proven, well known pattern that many are familiar with, and it can be done in the language right now. Sugaring a feature like that is simply about improving clarity, and reducing friction for users of something that D likes to advertise as being a core feature of the language.
Nov 16 2012
On Friday, 16 November 2012 at 10:59:02 UTC, Manu wrote:Looks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared.Hi Manu, point taken. But Dimitry and Luka just made suggestions. Soenke offers something concrete (working right NOW). I am afraid that we'll end up in a situation similar to the std.collections opera: just bla bla, and zero results. (And the collection situation hasn't been solved since the very beginning of D, not to talk about immutable collections.) Probably not En Vogue: for me, Transactional Memory Management makes sense.
Nov 16 2012
On 15 November 2012 17:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/15/12 1:08 AM, Manu wrote:I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it.I can't resist... D may be serious about the *idea* of concurrency, but it clearly isn't serious about concurrency yet. shared is a prime example of that. We do support atomic primitives 'one way or another'; there are intrinsics on all compilers. Libraries can use them. Again, this thread seemed to be about urgent action... D needs a LOT of work on its concurrency model, but something of an urgent fix to make a key language feature more useful needs to leverage what's there now.
Nov 16 2012
On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.What are these special properties? Sorry, it seems like we are talking past each other…[1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.We could start with sequential consistency and then explore riskier/looser policies.I'm not quite sure what you are saying here. The functions in core.atomic already exist, and currently offer four levels (raw, acq, rel, seq). Are you suggesting to remove the other options? David
Nov 15 2012
On 11/15/12 1:29 PM, David Nadlinger wrote:On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.What are these special properties? Sorry, it seems like we are talking past each other…For example you can't hoist a memory operation before a shared load or after a shared store. Andrei
Nov 15 2012
On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:On 11/15/12 1:29 PM, David Nadlinger wrote:What are these special properties? Sorry, it seems like we are talking past each other…For example you can't hoist a memory operation before a shared load or after a shared store.Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable. But still, you can't move memory operations across any other arbitrary function call either (unless you can prove it is safe by inspecting the callee's body, obviously), so I don't see where atomicLoad/atomicStore would be special here. David
Nov 15 2012
On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:For example you can't hoist a memory operation before a shared load or after a shared store.Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Nov 15 2012
On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«. DavidWell, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Nov 15 2012
On Nov 15, 2012, at 3:30 PM, David Nadlinger <see klickverbot.at> wrote:On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«.I think because of the as-if rule, the compiler can continue to optimize all it wants between volatile operations. Just not across them.
Nov 15 2012
On 11/15/12 3:30 PM, David Nadlinger wrote:On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:Shared must be sequentially consistent. AndreiOn Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«.Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Nov 15 2012
On 15/11/2012 15:22, Sean Kelly wrote:On Nov 15, 2012, at 3:05 PM, David Nadlinger<see klickverbot.at> wrote:Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.If it is known that the memory read/write is thread local, this is safe, even in the case of a mutex.
Nov 18 2012
On 11/15/12 3:05 PM, David Nadlinger wrote:On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:For example you can't hoist a memory operation before a shared load or after a shared store.Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.In D that's fine (as long as in-thread SC is respected) because non-shared vars are guaranteed to be thread-local.But still, you can't move memory operations across any other arbitrary function call either (unless you can prove it is safe by inspecting the callee's body, obviously), so I don't see where atomicLoad/atomicStore would be special here.It is special because e.g. on x86 the function is often a simple unprotected load or store. So after the inliner has at it, there's nothing to stand in the way of reordering. The point is the compiler must understand the semantics of acquire and release. Andrei
Nov 15 2012
On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.I think at this point we need to develop a better understanding of what's going on before issuing assessments.
Nov 14 2012
On 11/14/12 1:09 PM, Walter Bright wrote:Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing. Andrei
Nov 14 2012
On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 1:09 PM, Walter Bright wrote:Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
Nov 14 2012
On 11/14/12 4:50 PM, Sean Kelly wrote:On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:Yah, but the baseline here is acquire-release which has subtle differences that are all the more maddening. AndreiOn 11/14/12 1:09 PM, Walter Bright wrote:Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.
Nov 14 2012
On Nov 14, 2012, at 6:28 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 11/14/12 4:50 PM, Sean Kelly wrote:Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.Yah, but the baseline here is acquire-release which has subtle differences that are all the more maddening.Really? Acquire-release always seemed to have equivalent safety to me. Typically, the user doesn't even have to understand that optimization can occur upwards across the trailing boundary of the block, etc., to produce correct code. Though I do agree that the industry is moving towards sequential consistency, so there may be no point in trying for something weaker.
Nov 15 2012
On 14/11/2012 22:09, Walter Bright wrote:On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.I'm sorry but that is dumb. What is the point of ensuring that the compiler does not reorder loads/stores if the CPU is allowed to do so?
Nov 15 2012
On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:What is the point of ensuring that the compiler does not reorder loads/stores if the CPU is allowed to do so?Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway? DavidWhat is the point of ensuring that the compiler does not reorder load/stores if the CPU is allowed to do so ?Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
On 11/15/12 2:18 PM, David Nadlinger wrote:On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. AndreiOn Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?What is the point of ensuring that the compiler does not reorder load/stores if the CPU is allowed to do so ?Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
On Thursday, 15 November 2012 at 22:58:53 UTC, Andrei Alexandrescu wrote:On 11/15/12 2:18 PM, David Nadlinger wrote:How does this have anything to do with deadalnix' question that I rephrased at all? It is not at all clear that shared should do this (it currently doesn't), and the question was explicitly about Walter's statement that shared should disable compiler reordering, when at the same time *not* inserting barriers/atomic ops. Thus the »which are not atomic« qualifier in my message. DavidOn Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. AndreiOn Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?What is the point of ensuring that the compiler does not reorder load/stores if the CPU is allowed to do so ?Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
On Nov 15, 2012, at 2:18 PM, David Nadlinger <see klickverbot.at> wrote:On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:What is the point of ensuring that the compiler does not reorder loads/stores if the CPU is allowed to do so?Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?A barrier isn't always necessary to achieve the desired ordering on a given system. But I'd still call out to ASM to make sure the intended operation happened. I don't know that I'd ever feel comfortable with "volatile x=y" even if what I'd do instead is just a MOV.
Nov 15 2012
On 2012-11-14 08:56, Jonathan M Davis wrote:Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers. If there's a major performance penalty though, that might be a reason not to do it. Certainly, I don't think that there's any question that adding memory barriers won't make it so that you don't need mutexes or synchronized blocks or whatnot. shared's primary benefit is in logically separating normal code from code that must share data across threads and making it possible for the compiler to optimize based on the fact that it knows that a variable is thread-local.If there is a problem with efficiency in some cases then the developer can use __gshared and handle things manually. But of course, we don't want the developer to have to do this in most cases. -- /Jacob Carlborg
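A minimal sketch of the __gshared escape hatch Jacob mentions; the counter, its mutex, and the recordHit function are illustrative assumptions. The variable is process-global like shared data, but the type system applies no checks, so all synchronization is manual:

import core.sync.mutex;

__gshared int hits;
__gshared Mutex hitsLock;

shared static this() { hitsLock = new Mutex; }  // runs once per process

void recordHit()
{
    hitsLock.lock();
    scope (exit) hitsLock.unlock();
    ++hits;  // plain, unchecked access -- correctness rests on the lock
}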
Nov 14 2012
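The double-checked locking Jonathan mentions only works if the second check and the publishing store are ordered. A hedged sketch of how that can look with explicit atomics plus a mutex (illustrative names; assumes present-day core.atomic and core.sync.mutex, and that atomicLoad/atomicStore accept class references, which current druntime does):

    import core.atomic;
    import core.sync.mutex;

    shared Object instance;
    __gshared Mutex initLock;

    shared static this() { initLock = new Mutex; }

    Object getInstance()
    {
        // First check, without the lock - the read that needs a barrier.
        auto obj = atomicLoad!(MemoryOrder.acq)(instance);
        if (obj is null)
        {
            initLock.lock();
            scope (exit) initLock.unlock();
            obj = atomicLoad!(MemoryOrder.raw)(instance); // second check, locked
            if (obj is null)
            {
                obj = cast(shared) new Object;
                // Release store publishes the fully constructed object.
                atomicStore!(MemoryOrder.rel)(instance, obj);
            }
        }
        return cast(Object) obj;
    }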
On 13.11.2012 23:22, Walter Bright wrote:But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread-local in order to be destroyed.I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context. The same is done for classes by the GC, just that the GC ignores shared altogether. Kind Regards Benjamin Thaut
Nov 14 2012
On 11/14/2012 1:01 AM, Benjamin Thaut wrote:I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context.If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.The same is done for classes by the GC, just that the GC ignores shared altogether.That's different, because the GC verifies that there are *no* references to it from any thread first.
Nov 14 2012
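What "cast it to thread local" means in code - a minimal sketch reusing Benjamin's Value struct; its correctness rests entirely on the programmer-verified assumption that no other thread can still reach v:

    struct Value
    {
        ~this() { /* some cleanup */ }
    }

    shared Value v;

    shared static ~this()
    {
        // Assumption the compiler cannot check: all other threads are
        // gone, so v is effectively thread-local here.
        destroy(*cast(Value*) &v);
    }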
On 14.11.2012 10:18, Walter Bright wrote:On 11/14/2012 1:01 AM, Benjamin Thaut wrote:Could you please give an example where it would break? And what's the difference between: struct Value { ~this() { printf("destroy\n"); } } shared Value v; and: shared static ~this() { printf("destroy\n"); } Kind Regards Benjamin ThautI still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context.If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.The same is done for classes by the GC, just that the GC ignores shared altogether.That's different, because the GC verifies that there are *no* references to it from any thread first.
Nov 14 2012
On 11/14/2012 1:23 AM, Benjamin Thaut wrote:Could you please give an example where it would break?Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that objectAnd what's the difference between: struct Value { ~this() { printf("destroy\n"); } } shared Value v; and: shared static ~this() { printf("destroy\n"); }The struct declaration of ~this() has no idea what context it will be used in.
Nov 14 2012
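Walter's scenario, spelled out as deliberately broken code so the hazard is visible (illustrative names; uses core.thread for brevity):

    import core.thread;

    struct Payload
    {
        int x;
        ~this() { /* frees resources */ }
    }

    void main()
    {
        auto p = new shared(Payload);          // 1. create shared object

        auto t = new Thread({
            (cast(Payload*) p).x = 1;          // Thread 2: manipulate that object
        });
        t.start();                             // 2. reference handed to Thread 2

        destroy(*cast(Payload*) p);            // 3. destroy object - races with
                                               //    Thread 2's write above
        t.join();
    }

Nothing in the type system stops step 3 from running while Thread 2 is mid-write; that is the breakage Walter's compile error is trying to head off.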
On 14.11.2012 11:42, Walter Bright wrote:On 11/14/2012 1:23 AM, Benjamin Thaut wrote:But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared. Kind Regards Benjamin ThautCould you please give an example where it would break?Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that object
Nov 14 2012
On 11/14/2012 2:49 AM, Benjamin Thaut wrote:On 14.11.2012 11:42, Walter Bright wrote:Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).On 11/14/2012 1:23 AM, Benjamin Thaut wrote:But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.Could you please give an example where it would break?Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that objectAnd if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.1. You can't escape pointers in @safe code (well, it's a bug if you do). 2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in @safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
Nov 14 2012
On 14.11.2012 12:00, Walter Bright wrote:On 11/14/2012 2:49 AM, Benjamin Thaut wrote:On 14.11.2012 11:42, Walter Bright wrote:On 11/14/2012 1:23 AM, Benjamin Thaut wrote:So just to be clear, escaping pointers in a single-threaded context is a bug. But if you escape them in a multithreaded context it's ok? That sounds inconsistent to me. But if that is by design your argument is valid. I still can not think of any real-world use case though where this could actually be used. A small code example which would break as soon as we allow destructing of shared value types would really be nice. (maybe even in the language documentation, because I couldn't find anything) Kind Regards Benjamin ThautBut for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.Could you please give an example where it would break?Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that objectAnd if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).1. You can't escape pointers in @safe code (well, it's a bug if you do). 2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in @safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
Nov 14 2012
On 11/14/2012 3:14 AM, Benjamin Thaut wrote:A small code example which would break as soon as we allow destructing of shared value types would really be nice.I hate to repeat myself, but: Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that object
Nov 14 2012
On 11/14/12 1:06 PM, Walter Bright wrote:On 11/14/2012 3:14 AM, Benjamin Thaut wrote:That should be disallowed at least in @safe code. If I had my way I'd explore disallowing it in all code. AndreiA small code example which would break as soon as we allow destructing of shared value types would really be nice.I hate to repeat myself, but: Thread 1: 1. create shared object 2. pass reference to that object to Thread 2
Nov 14 2012
On 2012-11-14 22:06, Walter Bright wrote:I hate to repeat myself, but: Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that objectWhy would the object be destroyed if there's still a reference to it? If the object is manually destroyed, I don't see what threads have to do with it since you can do the same thing in a single-threaded application. -- /Jacob Carlborg
Nov 15 2012
On Thursday, November 15, 2012 10:22:22 Jacob Carlborg wrote:On 2012-11-14 22:06, Walter Bright wrote:Yeah. If the reference passed across were shared, then the runtime should see it as having multiple references, and if it's _not_ shared, that means that you cast shared away (unsafe, since it's a cast) and passed it across threads without making sure that it was the only reference on the original thread. In that case, you shot yourself in the foot by using an @system construct (casting) and not getting it right. I don't see why the runtime would have to worry about that. Unless the problem is that the object is a value type, so when it goes away on the first thread, it _has_ to be destroyed? If that's the case, then it's a pointer that was passed across rather than a reference, and then you've effectively done the same thing as returning a pointer to a local variable, which is @system and again only happens if you're getting @system wrong, which the compiler generally doesn't protect you from beyond giving you an error in the few cases where it can determine for certain that what you're doing is wrong (which is a fairly limited portion of the time). So, as far as I can see - unless I'm just totally missing something here - either you're dealing with shared objects on the heap here, in which case the object shouldn't be destroyed on the first thread unless you do it manually (in which case, you're doing something stupid in @system code), or you're dealing with passing pointers to shared value types across threads, which is essentially the equivalent of escaping a pointer to a local variable (in which case, you're doing something stupid in @system code). In either case, it's that you're doing something stupid in @system code, and I don't see why the runtime would have to worry about it. You shot yourself in the foot by incorrectly using @system code. If you want protection against that, then don't use @system code. - Jonathan M DavisI hate to repeat myself, but: Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that objectWhy would the object be destroyed if there's still a reference to it? If the object is manually destroyed, I don't see what threads have to do with it since you can do the same thing in a single-threaded application.
Nov 15 2012
On 15.11.2012 12:48, Jonathan M Davis wrote:Yeah. If the reference passed across were shared, then the runtime should see it as having multiple references, and if it's _not_ shared, that means that you cast shared away (unsafe, since it's a cast) and passed it across threads without making sure that it was the only reference on the original thread. In that case, you shot yourself in the foot by using an @system construct (casting) and not getting it right. I don't see why the runtime would have to worry about that. Unless the problem is that the object is a value type, so when it goes away on the first thread, it _has_ to be destroyed? If that's the case, then it's a pointer that was passed across rather than a reference, and then you've effectively done the same thing as returning a pointer to a local variable, which is @system and again only happens if you're getting @system wrong, which the compiler generally doesn't protect you from beyond giving you an error in the few cases where it can determine for certain that what you're doing is wrong (which is a fairly limited portion of the time). So, as far as I can see - unless I'm just totally missing something here - either you're dealing with shared objects on the heap here, in which case the object shouldn't be destroyed on the first thread unless you do it manually (in which case, you're doing something stupid in @system code), or you're dealing with passing pointers to shared value types across threads, which is essentially the equivalent of escaping a pointer to a local variable (in which case, you're doing something stupid in @system code). In either case, it's that you're doing something stupid in @system code, and I don't see why the runtime would have to worry about it. You shot yourself in the foot by incorrectly using @system code. If you want protection against that, then don't use @system code. - Jonathan M DavisThank you, that's exactly how I'm thinking too. And because of this it makes absolutely no sense to me to disallow the destruction of a shared struct if it is allocated on the stack or as a global. If it is allocated on the heap you can't destroy it manually anyway because delete is deprecated. And for exactly this reason I wanted a code example from Walter. Because just listing a few bullet points does not make a real-world use case. Kind Regards Benjamin Thaut
Nov 15 2012
On 11/15/2012 1:06 AM, Walter Bright wrote:On 11/14/2012 3:14 AM, Benjamin Thaut wrote:Ain't structs typically copied anyway? A reference would imply a pointer then. If the struct is on the stack (weird, but could be) then the thread that created it destroys the object once. The thing is as unsafe as escaping a pointer is. Personally I think that shared stuff allocated on the stack is here-be-dragons @system code in any case. Otherwise it's the GC's responsibility to destroy a heap-allocated struct when there are no references to it. What's so puzzling about it? BTW currently GC-allocated structs are not having their destructors called at all. The bug is however _minor_ ... http://d.puremagic.com/issues/show_bug.cgi?id=2834 -- Dmitry OlshanskyA small code example which would break as soon as we allow destructing of shared value types would really be nice.I hate to repeat myself, but: Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that object
Nov 15 2012
On Wednesday, November 14, 2012 11:49:22 Benjamin Thaut wrote:On 14.11.2012 11:42, Walter Bright wrote:Pointers are not considered unsafe at all and are perfectly legal in SafeD. It's pointer _arithmetic_ which is unsafe and therefore considered to be @system. - Jonathan M DavisOn 11/14/2012 1:23 AM, Benjamin Thaut wrote:But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.Could you please give an example where it would break?Thread 1: 1. create shared object 2. pass reference to that object to Thread 2 3. destroy object Thread 2: 1. manipulate that object
Nov 14 2012
On Monday, 12 November 2012 at 02:31:05 UTC, Walter Bright wrote:To make a shared type work in an algorithm, you have to: 1. ensure single-threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexThis is a fairly reasonable use of shared, but it is bypassing the type system. Once shared is cast away, it is free to be mixed with thread-local variables. Pieces can be assigned to non-shared globals, impure functions can stash references, weakly pure functions can mix their arguments together, etc... If locking converts shared(T) to bikeshed(T), I bet some of SafeD's logic for no escaping references could be used to improve things. It's also interesting to note that casting away shared after taking a lock implicitly means that everything was transitively owned by that lock. I wonder how well a library could promote/enforce such a thing?
Nov 14 2012
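Walter's five steps, written out - a minimal sketch assuming one shared integer guarded by a core.sync.mutex.Mutex (the names counter and counterLock are illustrative):

    import core.sync.mutex;

    shared int counter;
    __gshared Mutex counterLock;

    shared static this() { counterLock = new Mutex; }

    void increment()
    {
        counterLock.lock();                 // 1. ensure single-threaded access
        scope (exit) counterLock.unlock();  // 5. release the mutex on scope exit

        int* p = cast(int*) &counter;       // 2. cast away shared
        *p += 1;                            // 3. operate on the data
    }                                       // 4. the unshared view dies with p

As the follow-up posts note, nothing prevents the unshared pointer from escaping the locked region; that is exactly the hole being debated.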
On 11/11/12 6:30 PM, Walter Bright wrote:1. ensure single-threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexThis is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache. Andrei
Nov 14 2012
On Wednesday, November 14, 2012 18:30:56 Andrei Alexandrescu wrote:On 11/11/12 6:30 PM, Walter Bright wrote:Well, this is clearly how things work now, and if you want to use shared with much of anything, it's how things generally have to work, because almost nothing takes shared. Templated stuff will at least some of the time (though it's often untested for it and probably will get screwed by Unqual in quite a few cases), but there's no way aside from templates or casting to get shared variables to share the same functions as non-shared ones, leading to code duplication.1. ensure single-threaded access by acquiring a mutex 2. cast away shared 3. operate on the data 4. cast back to shared 5. release the mutexThis is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.From what I recall of what TDPL says, this doesn't really contradict it. It's just that TDPL doesn't really say much about the fact that almost nothing will work with shared, which means that casting is necessary. I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do. - Jonathan M Davis
Nov 14 2012
On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required.One thing I'm confused about right now is how people are using shared. If you're using shared with atomic operations, then you need barriers when accessing or mutating the variable. If you're using shared with mutexes, spin-locks, etc., you don't care about the barriers. But you can't use it with both at the same time. So which of these does shared stand for? In both of these cases, there's an implicit policy for accessing or mutating the variable. I think the language needs some way to express that policy. I suggested some time ago a way to protect variables with mutexes so that the compiler can actually help you use those mutexes correctly[1]. The idea was to associate a mutex with the variable declaration. This could be extended to support an atomic access policy. Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex: synchronized int i; // declaration i++; // error, variable shared synchronized (i) i++; // fine, variable is thread-local inside synchronized block Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds. Now, if you want a custom mutex class, write it like this: synchronized(SpinLock) int i; synchronized(i) { // implicit: i.mutexof.lock(); // implicit: scope (exit) i.mutexof.unlock(); i++; } If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration: Mutex m; synchronized(m) int i; synchronized(i) { // implicit: m.lock(); // implicit: scope (exit) m.unlock(); i++; } Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const: synchronized(RWMutex) int i; synchronized(const i) { // implicit: i.mutexof.constLock(); // implicit: scope (exit) i.mutexof.constUnlock(); i++; // error, i is const } And finally, if you want to use atomic operations, declare it this way: synchronized(Atomic) int i; You can't really synchronize on something protected by Atomic: synchronized(i) // cannot make a synchronized block, no lock/unlock method in Atomic {} But you can call operators on it while synchronized, it works for anything implemented by Atomic: synchronized(i)++; // implicit: Atomic.opUnary!"++"(i); Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template. Okay, this syntax probably still has some problems, feel free to point them out. I don't really care about the syntax though. The important thing is that you need a way to define the policy for accessing the shared data in a way that the compiler can actually enforce and that programmers can actually reuse. Because right now there is no policy. 
Having to cast things everywhere is equivalent to having to redefine the policy everywhere. Same for having to write encapsulation types that work with shared for everything you want to share: each type has to implement the policy. There's nothing worse than constantly rewriting the sharing policies. Concurrency is error-prone because of all the subtleties; you don't want to encourage people to write policies of their own every time they invent a new type. You need to reuse existing ones, and the compiler can help with that. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/ -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 14 2012
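To make the proposal above concrete, here is one plausible hand-lowering of Michel's `synchronized int i; synchronized (i) i++;` into today's D. Everything here is hypothetical - `mutexof` does not exist, and the per-declaration mutex is spelled out by hand:

    import core.sync.mutex;

    shared int i;                      // what `synchronized int i;` would declare
    __gshared Mutex i_mutex;           // hypothetical compiler-generated companion

    shared static this() { i_mutex = new Mutex; }

    void lowered()
    {
        // synchronized (i) i++;  would become roughly:
        i_mutex.lock();                // implicit: i.mutexof.lock();
        scope (exit) i_mutex.unlock(); // implicit: scope (exit) i.mutexof.unlock();
        auto tmp = cast(int*) &i;      // i viewed as thread-local inside the block
        ++*tmp;
    }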
On Thu, 15 Nov 2012 04:33:20 -0000, Michel Fortin <michel.fortin michelf.ca> wrote:On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required.Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex: synchronized int i; // declaration i++; // error, variable shared synchronized (i) i++; // fine, variable is thread-local inside synchronized block Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds. Now, if you want a custom mutex class, write it like this: synchronized(SpinLock) int i; synchronized(i) { // implicit: i.mutexof.lock(); // implicit: scope (exit) i.mutexof.unlock(); i++; } If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration: Mutex m; synchronized(m) int i; synchronized(i) { // implicit: m.lock(); // implicit: scope (exit) m.unlock(); i++; } Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const: synchronized(RWMutex) int i; synchronized(const i) { // implicit: i.mutexof.constLock(); // implicit: scope (exit) i.mutexof.constUnlock(); i++; // error, i is const } And finally, if you want to use atomic operations, declare it this way: synchronized(Atomic) int i; You can't really synchronize on something protected by Atomic: synchronized(i) // cannot make a synchronized block, no lock/unlock method in Atomic {} But you can call operators on it while synchronized, it works for anything implemented by Atomic: synchronized(i)++; // implicit: Atomic.opUnary!"++"(i); Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.+1 I suggested something similar, as did Sönke: http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk According to deadalnix the compiler magic I suggested to add the mutex isn't possible: http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc). So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is... 1. Remove the existing mutex from object. 2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface 3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*) 4.
Design a Shared(*) base class which contains a mutex and implements synchable(*) Then we design classes which are always shared using the base class, and we wrap other objects we want to share in Shared() and use them in synchronized statements. This would then relegate any built-in "shared" statement to be solely a storage class which makes the object global and not thread-local. (*) names up for debate R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 15 2012
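A rough library rendering of Regan's points 2-4 (every name is a placeholder, per his "(*) names up for debate"): the interface is what synchronized () {} would be required to accept, and the wrapper/base class carry their own mutex:

    import core.sync.mutex;

    interface Synchable                 // point 2: required by synchronized () {}
    {
        void lock();
        void unlock();
    }

    class SharedBase : Synchable        // point 4: base class carrying a mutex
    {
        private Mutex m;
        this() { m = new Mutex; }
        void lock()   { m.lock(); }
        void unlock() { m.unlock(); }
    }

    final class Shared(T) : SharedBase  // point 3: wrapper for arbitrary values
    {
        private T value;
        this(T v) { value = v; }
        ref T get() { return value; }   // caller must hold the lock
    }

    void example()
    {
        auto s = new Shared!int(0);
        s.lock();                       // what synchronized (s) would expand to
        scope (exit) s.unlock();
        s.get() += 1;
    }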
On Nov 15, 2012, at 3:16 AM, Regan Heath <regan netmail.co.nz> wrote:I suggested something similar, as did Sönke: http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk According to deadalnix the compiler magic I suggested to add the mutex isn't possible: http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc).If I understand you correctly, you don't need anything that explicitly contains the sync object. A global table of mutexes used according to the address of the value to be mutated should work.So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is... 1. Remove the existing mutex from object. 2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface 3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*) 4. Design a Shared(*) base class which contains a mutex and implements synchable(*)It would be nice to eliminate the mutex that's optionally built into classes now. The possibility of having to allocate a new mutex on whatever random function call happens to be the first one with "synchronized" is kinda not great.
Nov 15 2012
On 11/15/2012 8:33 AM, Michel Fortin wrote:If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration: Mutex m; synchronized(m) int i; synchronized(i) { // implicit: m.lock(); // implicit: scope (exit) m.unlock(); i++; }While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot. I'd say: "Need direct access to the mutex? - Go on with the manual way, it's still right there (and scope(exit) for that matter)". Another problem is that somebody clever can escape a reference to the unlocked 'i' inside of synchronized to somewhere else. But anyway we can make it in the library right about now. synchronized T ---> Synchronized!T synchronized(i){ ... } ---> i.access((x){ //will lock & cast away shared T inside of it ... }); I fail to see what it doesn't solve (aside from syntactic sugar). The key point is that Synchronized!T is otherwise an opaque type. We could pack in a few other simple primitives like 'load', 'store' etc. All of them will go through lock-unlock. Even escaping a reference can be solved by passing a proxy of T inside of 'access'. It could even assert that the lock is indeed locked. Same goes for Atomic!T, though the set of primitives is quite limited depending on T. (I thought that built-in shared(T) is already atomic though, so no need to reinvent this wheel.) It's time we finally agree that the 'shared' qualifier is an assembly language of multi-threading based on sharing. It just needs some safe patterns in the library. That, and clarifying explicitly what guarantees (aside from, well... being shared) it provides w.r.t. the memory model. Until reaching this thread I was under the impression that shared means: - globally visible - atomic operations for stuff that fits in one word - sequentially consistent guarantee - any other forms of access are disallowed except via casts -- Dmitry Olshansky
Nov 15 2012
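Dmitry's `i.access((x){ ... })` idiom as a minimal sketch (the gist he links later in the thread is the fuller version). The cast away from shared is confined to one line, and the payload is only handed out while the lock is held:

    import core.sync.mutex;

    final class Synchronized(T)
    {
        private shared T payload;
        private Mutex m;

        this() { m = new Mutex; }

        // Lock, present the payload as unshared to the delegate, unlock.
        void access(scope void delegate(ref T) dg)
        {
            m.lock();
            scope (exit) m.unlock();
            dg(*cast(T*) &payload);   // shared cast away only under the lock
        }
    }

    void example()
    {
        auto counter = new Synchronized!int;
        counter.access((ref int x) { x += 1; });
    }

As the thread notes, the delegate can still smuggle the reference out; that escape problem is what the rest of the discussion circles around.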
On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:On 11/15/2012 8:33 AM, Michel Fortin wrote:In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id; int addObject(Object o) { synchronized(next_id, objects_by_id) { objects_by_id[next_id] = o; return next_id++; } } Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id. I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration: Mutex m; synchronized(m) int i; synchronized(i) { // implicit: m.lock(); // implicit: scope (exit) m.unlock(); i++; }While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.But anyway we can make it in the library right about now. synchronized T ---> Synchronized!T synchronized(i){ ... } ---> i.access((x){ //will lock & cast away shared T inside of it ... }); I fail to see what it doesn't solve (aside from syntactic sugar).It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T: Synchronized!(Tuple!(int, Object[int])) objects_by_id; int addObject(Object o) { int id; objects_by_id.access((obj_by_id){ obj_by_id[1][obj_by_id[0]] = o; id = obj_by_id[0]++; }); return id; } I'm not sure if I have to explain why I prefer the first one or not; to me it's pretty obvious.The key point is that Synchronized!T is otherwise an opaque type. We could pack in a few other simple primitives like 'load', 'store' etc. All of them will go through lock-unlock.Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.Even escaping a reference can be solved by passing a proxy of T inside of 'access'. It could even assert that the lock is indeed locked.Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container. This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later. What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and the like impossible without casts at every corner, which makes things even more bug-prone than being able to escape references in the first place. There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the utmost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.Same goes for Atomic!T, though the set of primitives is quite limited depending on T.
(I thought that built-in shared(T) is already atomic though, so no need to reinvent this wheel.) It's time we finally agree that the 'shared' qualifier is an assembly language of multi-threading based on sharing. It just needs some safe patterns in the library. That, and clarifying explicitly what guarantees (aside from, well... being shared) it provides w.r.t. the memory model. Until reaching this thread I was under the impression that shared means: - globally visible - atomic operations for stuff that fits in one word - sequentially consistent guarantee - any other forms of access are disallowed except via castsBuilt-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what the conclusion will be, but the way I see things, atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads. I'm not thrilled by the idea of making everything atomic by default. That'll lure users onto the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) to second-class status. I think it's better that you just can't do anything with shared, or that shared simply disappear, and that those variables that must be shared be accessible only through some kind of access policy. Atomic access should be one of those access policies, on an equal footing with other ones. But if D2 is still "frozen" -- as it was meant to be when TDPL got out -- and only minor changes can be made to it now, I don't see much hope for its concurrency model. Your Synchronized!T and Atomic!T wrappers might be the best thing we can hope for, but they're nothing to set D apart from its rivals (I could implement that easily in C++ for instance). -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 16 2012
On 16.11.2012 14:17, Michel Fortin wrote:Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container. This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later. What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and the like impossible without casts at every corner, which makes things even more bug-prone than being able to escape references in the first place. There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the utmost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.Can you have a look at my thread about this? http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com I would of course favor a nicely integrated language solution that is able to lift as many restrictions as possible, while still keeping everything statically verified [I would also like to have a language solution to Rebindable!T ;)]. But as an alternative to a discussion lasting for years that does not lead to any agreed-upon solution, I'd much rather have such a library solution - it can do a lot, is reasonably pretty, and is (supposedly and with a small exception) fully safe.
Nov 16 2012
On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:On 11/15/2012 8:33 AM, Michel Fortin wrote:If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration: Mutex m; synchronized(m) int i; synchronized(i) { // implicit: m.lock(); // implicit: scope (exit) m.unlock(); i++; }While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id; int addObject(Object o) { synchronized(next_id, objects_by_id) { objects_by_id[next_id] = o; return next_id++; } } Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id. I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.
Nov 16 2012
On 2012-11-16 15:23:37 +0000, Sean Kelly <sean invisibleduck.org> said:On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:Perhaps it's just my style of coding, but when designing a class that needs to be shared in C++, I usually use one mutex to protect only a couple of variables inside the object. That might mean I have two mutexes in one class for two sets of variables if it fits the access pattern. I also make the mutex private so that derived classes cannot access it. The idea is to strictly control what happens when each mutex is locked so that I can make sure, without looking at the whole code base, that I never have two mutexes locked at the same time. This is to avoid deadlocks, and it also removes the need for recursive mutexes. I'd like the language to help me enforce this pattern, and what I'm proposing goes in that direction. Regarding setSameMutex, I'd argue that the semantics of having one mutex for a whole object isn't great. Mutexes shouldn't protect types, they should protect variables. Whether a class needs to protect its variables and how it does it is an implementation detail that shouldn't be leaked to the outside world. What the outside world should know is whether the object is thread-safe or not. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.In case you want to protect two variables (or more) with the same mutex.
Nov 17 2012
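The style Michel describes, transcribed into a D-flavored sketch: two private mutexes, each guarding its own set of fields and never held at the same time (illustrative names):

    import core.sync.mutex;

    final class Tracker
    {
        // Guarded by countLock only.
        private Mutex countLock;
        private long  hits;

        // Guarded by nameLock only.
        private Mutex nameLock;
        private string name;

        this() { countLock = new Mutex; nameLock = new Mutex; }

        void recordHit()
        {
            countLock.lock();
            scope (exit) countLock.unlock();
            ++hits;          // never touches name: no nested locks, no deadlock
        }

        void rename(string n)
        {
            nameLock.lock();
            scope (exit) nameLock.unlock();
            name = n;
        }
    }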
On 11/16/2012 5:17 PM, Michel Fortin wrote:On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:Wrap in a struct and it would be even much clearer and safer. struct ObjectRepository { int next_id; Object[int] objects_by_id; } //or whatever that combination indicates anyway synchronized ObjectRepository objRepo;In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id;int addObject(Object o) { synchronized(next_id, objects_by_id)...synchronized(objRepo) with(objRepo)... Though I'd rather use it as a struct directly.{ objects_by_id[next_id] = o; return next_id++; } } Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id.Yes. But we shouldn't close our eyes to the rest of the language for how to implement this. Moreover it makes more sense to pack related stuff (that is under a single lock) into a separate entity.I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.Easily - now the mutex is separate and there is no guarantee that it won't get used for something other than intended. The declaration implies the connection but I do not see anything preventing it from abuse.If we made a tiny change in the language that would allow a different syntax for passing delegates, mine would shine. Such a change at the same time enables a nicer way to abstract away control flow. Imagine: access(object_by_id){ ... }; to be convertible to: ((x){with(x){ ... }})(access(object_by_id)); More generally speaking, a lowering: expression { ... } --> (x){with(x){ ... }}(expression); AFAIK it doesn't conflict with anything. Or wait a sec. An even simpler idiom, and no extra features. Drop the idea of 'access' taking a delegate. The other library idiom is to return a RAII proxy that locks/unlocks an object on construction/destruction. with(lock(object_by_id)) { ... do what you like } Fine by me. And C++ can't do it ;)But anyway we can make it in the library right about now. synchronized T ---> Synchronized!T synchronized(i){ ... } ---> i.access((x){ //will lock & cast away shared T inside of it ... }); I fail to see what it doesn't solve (aside from syntactic sugar).It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T: Synchronized!(Tuple!(int, Object[int])) objects_by_id; int addObject(Object o) { int id; objects_by_id.access((obj_by_id){ obj_by_id[1][obj_by_id[0]] = o; id = obj_by_id[0]++; }); return id; } I'm not sure if I have to explain why I prefer the first one or not; to me it's pretty obvious.I kind of wanted to point out this disturbing thought about your proposal. That is, a lot of extra syntax and rules added buys us a very small gain - prettier syntax.The key point is that Synchronized!T is otherwise an opaque type. We could pack in a few other simple primitives like 'load', 'store' etc. All of them will go through lock-unlock.Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. 
They'll produce the same code and impose the same restrictions.It need not be 100% malicious-dumbass-proof. Basic foolproofness is OK. See my sketch, it could be vastly improved: https://gist.github.com/4089706 See also Ludwig's work. Though he is focused on classes and their monitor mutex.Even escaping a reference can be solved by passing a proxy of T inside of 'access'. It could even assert that the lock is indeed locked.Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container. This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later.What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and the like impossible without casts at every corner, which makes things even more bug-prone than being able to escape references in the first place.Well, it's kind of double-edged. However I do think we need more general tools in the language and niche ones in the library. Precisely because you can pack tons of niche and miscellaneous stuff on the bookshelf ;) Locks & the works are niche stuff enabling a lot of more common things.There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the utmost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.See below about high-level primitives. The code dealing with mutexes has to be small and isolated anyway. Encouraging a pattern of 'just grab the lock and you are golden' is even worse (because it won't break as fast and hard as e.g. naive atomics will).That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and whatnot. Even Java has some useful incarnations of these.That, and clarifying explicitly what guarantees (aside from, well... being shared) it provides w.r.t. the memory model. Until reaching this thread I was under the impression that shared means: - globally visible - atomic operations for stuff that fits in one word - sequentially consistent guarantee - any other forms of access are disallowed except via castsBuilt-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what the conclusion will be, but the way I see things, atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads. I'm not thrilled by the idea of making everything atomic by default. That'll lure users onto the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) to second-class status. I think it's better that you just can't do anything with shared, or that shared simply disappear, and that those variables that must be shared be accessible only through some kind of access policy. 
Atomic access should be one of those access policies, on an equal footing with other ones.This is where casts will be a most unwelcome obfuscator, and there is no sensible way to de-obscure it by using higher-level primitives. Having to say Atomic!X is workable though.But if D2 is still "frozen" -- as it was meant to be when TDPL got out -- and only minor changes can be made to it now, I don't see much hope for its concurrency model. Your Synchronized!T and Atomic!T wrappers might be the best thing we can hope for, but they're nothing to set D apart from its rivals (I could implement that easily in C++ for instance).Yeah, but we may tweak some syntax in terms of one lowering or a couple. I'm of the strong opinion that lock-based multi-threading needs no _specific_ built-in support in the language. The case is niche and hardly useful outside of certain help with doing safe high-level primitives in the library. As for client code, it doesn't care that much. Compared to C++ there is one big thing. That is no-shared by default. This alone should be immensely helpful, especially when dealing with third-party libraries that 'try hard to be thread-safe' except that they are usually not. -- Dmitry Olshansky
Nov 16 2012
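The RAII proxy Dmitry sketches, made concrete: `lock` returns a non-copyable struct whose destructor unlocks, and the payload is reachable only through it (illustrative names; as Michel notes next, naming the proxy explicitly is often clearer than `with`):

    import core.sync.mutex;

    struct Locked(T)
    {
        private T* payload;
        private Mutex m;

        @disable this(this);              // non-copyable: exactly one unlock
        ~this() { if (m !is null) m.unlock(); }

        ref T get() { return *payload; }
    }

    Locked!T lock(T)(ref T var, Mutex m)
    {
        m.lock();
        return Locked!T(&var, m);
    }

    void example(ref int[string] table, Mutex m)
    {
        auto locked = lock(table, m);     // Michel's explicit-variable spelling
        locked.get["answer"] = 42;
    }                                     // destructor unlocks here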
On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:On 11/16/2012 5:17 PM, Michel Fortin wrote:I guess that'd be fine too.In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id;Wrap in a struct and it would be even much clearer and safer. struct ObjectRepository { int next_id; Object[int] objects_by_id; } //or whatever that combination indicates anyway synchronized ObjectRepository objRepo;If we made a tiny change in the language that would allow a different syntax for passing delegates, mine would shine. Such a change at the same time enables a nicer way to abstract away control flow. Imagine: access(object_by_id){ ... }; to be convertible to: ((x){with(x){ ... }})(access(object_by_id)); More generally speaking, a lowering: expression { ... } --> (x){with(x){ ... }}(expression); AFAIK it doesn't conflict with anything. Or wait a sec. An even simpler idiom, and no extra features. Drop the idea of 'access' taking a delegate. The other library idiom is to return a RAII proxy that locks/unlocks an object on construction/destruction. with(lock(object_by_id)) { ... do what you like } Fine by me. And C++ can't do it ;)Clever. But you forgot to access the variable somewhere. What's its name within the with block? Your code would be clearer this way: { auto locked_object_by_id = lock(object_by_id); // … do what you like } And yes, you can definitely do that in C++. I maintain that the "synchronized (var)" syntax is still much clearer, and greppable too. That could be achieved with an appropriate lowering.Sometimes having something built into the language is important: it gives first-class status to some constructs. For instance: arrays. We don't need language-level arrays in D, we could just use a struct template that does the same thing. By integrating a feature into the language we're sending the message that this is *the* way to do it, as no other way can stand on equal footing, preventing infinite reimplementation of the concept within various libraries. You might be right, however, that mutex-protected variables do not deserve this first-class status.
That'll lure users onto the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) to second-class status.That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and whatnot. Even Java has some useful incarnations of these.I wouldn't say they shouldn't use mutexes at all, but perhaps you're right that they don't deserve first-class treatment. I still maintain that "synchronized (var)" should work, for clarity and consistency reasons, but using a template such as Synchronized!T when declaring the variable might be the best solution. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 17 2012
On 2012-11-17 14:22, Michel Fortin wrote:Sometimes having something built into the language is important: it gives first-class status to some constructs. For instance: arrays. We don't need language-level arrays in D, we could just use a struct template that does the same thing. By integrating a feature into the language we're sending the message that this is *the* way to do it, as no other way can stand on equal footing, preventing infinite reimplementation of the concept within various libraries.If a feature can be implemented in a library with the same syntax, semantics, and performance, I see no reason to put it in the language. -- /Jacob Carlborg
Nov 17 2012
On 11/17/2012 5:22 PM, Michel Fortin wrote:On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com>Not having the name would imply you can't escape it :) But I agree it's not always clear where the writes go when doing things inside the with block.Or wait a sec. An even simpler idiom, and no extra features. Drop the idea of 'access' taking a delegate. The other library idiom is to return a RAII proxy that locks/unlocks an object on construction/destruction. with(lock(object_by_id)) { ... do what you like } Fine by me. And C++ can't do it ;)Clever. But you forgot to access the variable somewhere. What's its name within the with block?Your code would be clearer this way: { auto locked_object_by_id = lock(object_by_id); // … do what you like } And yes, you can definitely do that in C++.Well, I actually did it in the past when C++0x was relatively new. I just thought 'with' makes it more interesting. As to how to access the variable - it depends on what it is.I maintain that the "synchronized (var)" syntax is still much clearer, and greppable too. That could be achieved with an appropriate lowering.Yes! If we could make synchronized user-hookable, this all would be clearer and generally useful. There was a discussion about providing user-defined semantics for synchronized blocks. It was clear and useful, and a lot of folks were in favor of it. Yet it wasn't submitted as a proposal. All other things being equal, I believe we should go in this direction - amend a couple of things (say, add a user-hookable synchronized) and start laying bricks for std.sharing. -- Dmitry Olshansky
Nov 17 2012
On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin wrote:On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:<snip> That solution does not work in the general case. More specifically, any graph-like data structure: e.g. linked lists, trees, etc. Think, for example, of an insert into a shared AVL tree.On 11/16/2012 5:17 PM, Michel Fortin wrote:I guess that'd be fine too.In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id;Wrap in a struct and it would be even much clearer and safer. struct ObjectRepository { int next_id; Object[int] objects_by_id; } //or whatever that combination indicates anyway synchronized ObjectRepository objRepo;
Nov 19 2012
On 2012-11-19 09:31:46 +0000, "foobar" <foo bar.com> said:On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin wrote:No solution will be foolproof in the general case unless we add new type modifiers to the language to prevent escaping references, something Walter is reluctant to do. So whatever we do with mutexes it'll always be a leaky abstraction. I'm not too thrilled by this either. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:<snip> That solution does not work in the general case. More specifically, any graph-like data structure: e.g. linked lists, trees, etc. Think, for example, of an insert into a shared AVL tree.On 11/16/2012 5:17 PM, Michel Fortin wrote:I guess that'd be fine too.In case you want to protect two variables (or more) with the same mutex. For instance: Mutex m; synchronized(m) int next_id; synchronized(m) Object[int] objects_by_id;Wrap in a struct and it would be even much clearer and safer. struct ObjectRepository { int next_id; Object[int] objects_by_id; } //or whatever that combination indicates anyway synchronized ObjectRepository objRepo;
Nov 19 2012
On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:From what I recall of what TDPL saysIt says (on p. 413) that reading and writing shared values is guaranteed to be atomic for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.
Nov 14 2012
On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
Nov 14 2012
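A sketch of the kind of type Jonathan describes: designed up front to be shared, with the mutex and every cast kept inside so callers never write one. The casts mirror the lock-then-cast pattern from earlier in the thread; names are illustrative:

    import core.sync.mutex;

    final class SharedQueue(T)
    {
        private T[]   items;
        private Mutex m;

        this() { m = new Mutex; }

        void push(T item) shared
        {
            auto self = cast(SharedQueue) this;   // the only casts live here
            self.m.lock();
            scope (exit) self.m.unlock();
            self.items ~= item;
        }

        bool tryPop(ref T item) shared
        {
            auto self = cast(SharedQueue) this;
            self.m.lock();
            scope (exit) self.m.unlock();
            if (self.items.length == 0) return false;
            item = self.items[0];
            self.items = self.items[1 .. $];
            return true;
        }
    }

    void example()
    {
        auto q = cast(shared) new SharedQueue!int; // built unshared, then published
        q.push(1);
    }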
On 11/14/12 7:24 PM, Jonathan M Davis wrote:On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted. AndreiI have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
Nov 14 2012
On Wednesday, November 14, 2012 20:32:35 Andrei Alexandrescu wrote:

> TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted.

Then it's doing the casting for you. I suppose that's an argument that using synchronized classes when dealing with shared is the way to go (which IIRC TDPL does argue), but that only applies to classes, and there are plenty of cases (maybe even the majority) where it's built-in types like arrays or AAs which people are trying to share, and synchronized classes won't help them there unless they create wrapper types. Explicit casting will be required for those. And of course, anyone wanting to use mutexes or synchronized blocks will have to use explicit casts regardless of what they're protecting, because it won't be inside a synchronized class. So, while synchronized classes make dealing with classes nicer, they only handle a very specific portion of what might be used with shared.

In any case, I clearly need to reread TDPL's threading stuff (and maybe the whole book). It's been a while since I read it, and I'm getting rusty on the details.

By the way, speaking of synchronized classes, as I understand it, they're still broken with regard to TDPL in that synchronized is still used on functions rather than classes as TDPL describes. So they aren't currently a solution regardless of what the language's actual design is supposed to be. Obviously, that should be fixed though.

- Jonathan M Davis
Nov 15 2012
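The wrapper-type approach Jonathan mentions can be sketched as a class whose methods take a lock and whose state never leaks; the SharedMap name and its methods are invented here, and constructor qualifiers are glossed over — the point is only that the unsafe casts stay confined to methods that always hold the mutex:

---
import core.sync.mutex : Mutex;

// Hypothetical wrapper type for sharing an AA. The cast away from
// shared is confined to these two methods, which always hold the
// mutex while touching the data, so no cast appears at call sites.
final class SharedMap(K, V)
{
	private Mutex m;
	private V[K] data;

	this() { m = new Mutex; }

	void put(K key, V value) shared
	{
		auto self = cast(SharedMap) this; // safe by convention: lock held below
		self.m.lock();
		scope (exit) self.m.unlock();
		self.data[key] = value;
	}

	bool tryGet(K key, out V value) shared
	{
		auto self = cast(SharedMap) this;
		self.m.lock();
		scope (exit) self.m.unlock();
		if (auto p = key in self.data) { value = *p; return true; }
		return false;
	}
}
---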
On 15.11.2012 05:32, Andrei Alexandrescu wrote:

> TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted.

There are three problems I currently see with this:

- It's not actually implemented
- It's not safe, because unshared references can be escaped or dragged in
- Synchronized classes provide no way to avoid the automatic locking in certain methods, but often it is necessary to have more fine-grained control, for efficiency reasons or to avoid deadlocks
Nov 15 2012
On 15 November 2012 04:30, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

> On 11/11/12 6:30 PM, Walter Bright wrote:
>> 1. ensure single threaded access by acquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>
> This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.

The pattern Walter describes is primitive and useful, and I'd like to see shared assist to that end (see my previous post). You can endeavour to do any other fancy stuff you like, but until some distant future when it's actually done, then proven and well supported, I'll keep doing this.

Not to repeat my prev post... but in reply to Walter's take on it, it would be interesting if 'shared' just added implicit lock()/unlock() methods to do the mutex acquisition and then removed the cast requirement, but had the language runtime assert that the object is locked whenever it is accessed (this guarantees the safety in a more useful way; the casts are really annoying). I can't imagine a simpler and more immediately useful solution.

In fact, it's a reasonably small step to this being possible with user-defined attributes. Although attributes have no current mechanism to add a mutex, and lock/unlock methods to the object being attributed (like
Nov 15 2012
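For reference, Walter's five steps look roughly like this in code; a minimal sketch, with the Queue type, its push() method, and the lock setup invented for illustration:

---
import core.sync.mutex : Mutex;

class Queue { void push(int x) { /* ... */ } }

shared Queue q;         // shared data
__gshared Mutex qLock;  // mutex guarding q; assumed set up elsewhere

void producer()
{
	qLock.lock();                 // 1. ensure single-threaded access
	scope (exit) qLock.unlock();  // 5. release the mutex on scope exit
	auto local = cast(Queue) q;   // 2. cast away shared
	local.push(42);               // 3. operate on the data
	// 4. the unshared reference dies with the scope; q stays shared
}
---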
On 2012-11-15 10:22, Manu wrote:

> Not to repeat my prev post... but in reply to Walter's take on it, it would be interesting if 'shared' just added implicit lock()/unlock() methods to do the mutex acquisition and then remove the cast requirement, but have the language runtime assert that the object is locked whenever it is accessed (this guarantees the safety in a more useful way, the casts are really annoying). I can't imagine a simpler and more immediately useful solution.

How about implementing a library function, something like this:

shared int i;

lock(i, (x) {
    // operate on x
});

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB

-- 
/Jacob Carlborg
Nov 15 2012
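One way such a lock() helper might be written; a sketch only — the per-address mutex registry is naive (it leaks and keys on addresses), and a real implementation would need something sturdier:

---
import core.sync.mutex : Mutex;

__gshared Mutex[void*] locks;  // naive per-address lock registry
__gshared Mutex registryLock;

shared static this() { registryLock = new Mutex; }

// Acquire the lock associated with `value`, strip shared while the
// lock is held, and hand the plain reference to the delegate.
void lock(T)(ref shared T value, scope void delegate(ref T) dg)
{
	Mutex m;
	registryLock.lock();
	if (auto p = cast(void*)&value in locks) m = *p;
	else m = locks[cast(void*)&value] = new Mutex;
	registryLock.unlock();

	m.lock();
	scope (exit) m.unlock();
	dg(*cast(T*)&value); // cast away shared only while locked
}
---

Usage would then match the example above: lock(i, (ref int x) { x += 1; });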
On 15 November 2012 12:14, Jacob Carlborg <doob me.com> wrote:

> How about implementing a library function, something like this:
>
> shared int i;
> lock(i, (x) {
>     // operate on x
> });

Interesting concept. Nice idea, could certainly be useful, but it doesn't address the problem as directly as my suggestion. There are still many problem situations, for instance, any time a template is involved. The template doesn't know to do that internally, but under my proposal, you lock it prior to the workload, and then the template works as expected. Templates won't just break and fail whenever shared is involved, because assignments would be legal. They'll just assert that the thing is locked at the time, which is the programmer's responsibility to ensure.
Nov 15 2012
On 15.11.2012 11:52, Manu wrote:

> [...] They'll just assert that the thing is locked at the time, which is the programmer's responsibility to ensure.

I managed to make a simple example that works with the current implementation:

http://dpaste.dzfl.pl/27b6df62
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=4#post-k7s0gs:241h45:241:40digitalmars.com

It seems to me that solving this shared issue cannot be done purely on a compiler basis but will require runtime support. Actually, I don't see how it can be done properly without telling "this lock must be locked when accessing this variable".

http://dpaste.dzfl.pl/edbd3e10
Nov 15 2012
On 2012-11-15 11:52, Manu wrote:

> There are still many problem situations, for instance, any time a template is involved. The template doesn't know to do that internally, but under my proposal, you lock it prior to the workload, and then the template works as expected. Templates won't just break and fail whenever shared is involved, because assignments would be legal. They'll just assert that the thing is locked at the time, which is the programmer's responsibility to ensure.

I don't understand how a template would cause problems.

-- 
/Jacob Carlborg
Nov 15 2012
On Thursday, November 15, 2012 11:22:30 Manu wrote:

> Not to repeat my prev post... but in reply to Walter's take on it, it would be interesting if 'shared' just added implicit lock()/unlock() methods to do the mutex acquisition and then remove the cast requirement, but have the language runtime assert that the object is locked whenever it is accessed (this guarantees the safety in a more useful way, the casts are really annoying). I can't imagine a simpler and more immediately useful solution. In fact, it's a reasonably small step to this being possible with user-defined attributes. Although attributes have no current mechanism to add a mutex, and lock/unlock methods to the object being attributed (like

1. It wouldn't stop you from needing to cast away shared at all, because without casting away shared, you wouldn't be able to pass it to anything, because the types would differ. Even if you were arguing that doing something like

void foo(C c) {...}

shared c = new C;
foo(c); // no cast required, lock automatically taken

it wouldn't work, because then foo could squirrel away a reference to c somewhere, and the type system would have no way of knowing that it was a shared variable that was being squirreled away as opposed to a thread-local one, which means that it'll likely generate incorrect code. That can happen with the cast as well, but at least in that case, you're forced to be explicit about it, and it's automatically @system. If it's done for you, it'll be easy to miss and screw up.

2. It's often the case that you need to lock/unlock groups of stuff together, such that locking specific variables is often of limited use and would just introduce pointless extra locks when dealing with multiple variables. It would also increase the risk of deadlocks, because you wouldn't have much - if any - control over what order locks were acquired in when dealing with multiple shared variables.

- Jonathan M Davis
Nov 15 2012
On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:

> 1. It wouldn't stop you from needing to cast away shared at all, because without casting away shared, you wouldn't be able to pass it to anything, because the types would differ. [...] That can happen with the cast as well, but at least in that case, you're forced to be explicit about it, and it's automatically @system. If it's done for you, it'll be easy to miss and screw up.

I don't really see the difference, other than, as you say, the cast is explicit. Obviously the possibility for the situation you describe exists; it's equally possible with the cast, except this way the usage pattern is made more convenient, the user has a convenient way to control the locks, and most importantly, it would work with templates.

That said, this sounds like another perfect application of 'scope'. Perhaps only scope parameters can receive a locked, shared thing... that would mechanically protect you against escape.

> 2. It's often the case that you need to lock/unlock groups of stuff together, such that locking specific variables is often of limited use and would just introduce pointless extra locks when dealing with multiple variables. It would also increase the risk of deadlocks, because you wouldn't have much - if any - control over what order locks were acquired in when dealing with multiple shared variables.

Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.

Perhaps a usage mechanism could be more like:

shared int x, y, z;

synchronised with(x, y, z)
{
    // do work with x, y, z, all locked together.
}
Nov 15 2012
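Manu's 'synchronised with' idea can be approximated in library code today; a sketch of a hypothetical synchronizedWith() that sorts its locks by address, so concurrent callers always agree on acquisition order — which addresses the lock-ordering deadlock worry raised above:

---
import core.sync.mutex : Mutex;
import std.algorithm : sort;

// Hypothetical helper: lock all mutexes in address order, run the
// delegate, then release them in reverse order.
void synchronizedWith(scope void delegate() dg, Mutex[] mutexes...)
{
	auto ordered = mutexes.dup;
	sort!((a, b) => cast(void*)a < cast(void*)b)(ordered);
	foreach (m; ordered) m.lock();
	scope (exit)
		foreach_reverse (m; ordered) m.unlock();
	dg();
}
---

A call site would then look like: synchronizedWith({ /* use x, y, z */ }, xLock, yLock, zLock);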
On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2 digitalmars.com> wrote:

> To make a shared type work in an algorithm, you have to:
>
> 1. ensure single threaded access by acquiring a mutex
> 2. cast away shared
> 3. operate on the data
> 4. cast back to shared
> 5. release the mutex

So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
Nov 15 2012
On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:

> On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2 digitalmars.com> wrote:
>> To make a shared type work in an algorithm, you have to:
>>
>> 1. ensure single threaded access by acquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>
> So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.

The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nov 17 2012
On 17/11/2012 05:49, Jason House wrote:

> The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.

Nothing is safe if ownership cannot be statically proven. This is completely useless.
Nov 18 2012
On 19.11.2012 05:57, deadalnix wrote:

> Nothing is safe if ownership cannot be statically proven. This is completely useless.

But you can at least prove ownership under some limited circumstances. Limited, but (without having tested on a large scale) still practical. Interest seems to be limited much more than those circumstances, but anyway:

http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com

(the same approach that I already posted in this thread, but in a state that should be more or less bullet proof)
Nov 19 2012
On Monday, 19 November 2012 at 04:57:16 UTC, deadalnix wrote:

> Nothing is safe if ownership cannot be statically proven. This is completely useless.

Bartosz's design was very explicit about ownership, but was deemed too complex for D2. Shared was kept simple, but underpowered. Here's what I remember of Bartosz's design:

- Shared object members are owned by the enclosing container unless explicitly marked otherwise
- Lock-free shared data is marked differently
- Non-lock-free shared objects required locking them prior to access, but did not require separate shared and non-shared code
- No sequential consistency

I really liked his design, but I think the explicit ownership part was considered too complex. There may still be something that can be done to improve D2, but I doubt it'd be a complete solution.
Nov 20 2012
On 11.11.2012 19:46, Alex Rønne Petersen wrote:

> Something needs to be done about shared. I don't know what, but the current situation is -- and I'm really not exaggerating here -- laughable. I think we either need to just make it perfectly clear that shared is for documentation purposes and nothing else, or, figure out an alternative system to shared, because I don't see shared actually being useful for real world work no matter what we do with it.

After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library based solution, and a nice way to use shared is to only use it for disabling access to non-shared members while the object's monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:

---
class MyClass {
	void method();
}

void main()
{
	auto inst = new shared(MyClass);
	//inst.method(); // forbidden

	{
		ScopedLock!MyClass l = lock(inst);
		l.method(); // now allowed as long as 'l' is in scope
	}

	// can also be called like this:
	inst.lock().method();
}
---

ScopedLock is non-copyable and handles the dirty details of locking and casting away 'shared' when it's safe to do so. No tagging of the class with 'synchronized' or 'shared' needs to be done, and everything works nicely without casts.

This comes with a restriction, though. Doing all this is only safe as long as the instance is known to not contain any unisolated aliasing*. So use would be restricted to types that contain only immutable or unique/isolated references. So I also implemented an Isolated!(T) type that is recognized by ScopedLock, as well as functions such as spawn(). The resulting usage can be seen in the example at the bottom. It doesn't provide all the flexibility that a built-in 'isolated' type would, but the possible use cases at least look interesting.

There are still some details to be worked out, such as writing a spawn() function that correctly moves Isolated!() parameters instead of copying, or the forward reference error mentioned in the example. I'll now try and see if some of my earlier multi-threading designs fit into this system.

---
import std.stdio;
import std.typecons;
import std.traits;
import stdx.typecons;

class SomeClass {
}

class Test {
	private {
		string m_test1 = "test 1";
		Isolated!SomeClass m_isolatedReference;
		// currently causes a size forward reference error:
		//Isolated!Test m_next;
	}

	this()
	{
		//m_next = ...;
	}

	void test1() const { writefln(m_test1); }
	void test2() const { writefln("test 2"); }
}

void main()
{
	writefln("Shared locking");
	// create a shared instance of Test - no members will
	// be accessible
	auto t = new shared(Test);
	{
		// temporarily lock t to make all non-shared members
		// safely available
		// lock() works only for objects with no unisolated
		// aliasing.
		ScopedLock!Test l = lock(t);
		l.test1();
		l.test2();
	}

	// passing a shared object to a different thread works as usual
	writefln("Shared spawn");
	spawn(&myThreadFunc1, t);

	// create an isolated instance of Test
	// currently, Test may not contain unisolated aliasing, but
	// this requirement may get lifted,
	// as long as only pure methods are called
	Isolated!Test u = makeIsolated!Test();

	// move ownership to a different function and recover
	writefln("Moving unique");
	Isolated!Test v = myThreadFunc2(u.move());

	// moving to a different thread also works
	writefln("Moving unique spawn");
	spawn(&myThreadFunc2, v.move());

	// another possibility is to convert to immutable
	auto w = makeIsolated!Test();
	writefln("Convert to immutable spawn");
	spawn(&myThreadFunc3, w.freeze());

	// or just lose the isolation and act on the base type
	writefln("Convert to mutable");
	auto x = makeIsolated!Test();
	Test xm = x.extract();
	xm.test1();
	xm.test2();
}

void myThreadFunc1(shared(Test) t)
{
	// call non-shared method on shared object
	t.lock().test1();
	t.lock().test2();
}

Isolated!Test myThreadFunc2(Isolated!Test t)
{
	// call methods as usual on an isolated object
	t.test1();
	t.test2();
	return t.move();
}

void myThreadFunc3(immutable(Test) t)
{
	t.test1();
	t.test2();
}

// fake spawn function just to test the type constraints
void spawn(R, ARGS...)(R function(ARGS) func, ARGS args)
{
	foreach (i, T; ARGS)
		static assert(!hasUnisolatedAliasing!T || !hasUnsharedAliasing!T,
			"Parameter "~to!string(i)~" of type "~T.stringof~
			" has unshared or unisolated aliasing. Cannot safely be passed to a different thread.");

	// TODO: do this in a different thread...
	// TODO: don't cheat with the 1-parameter move detection
	static if (__traits(compiles, func(args[0])))
		func(args);
	else
		func(args[0].move());
}
---

* shared aliasing would also be OK, but this is not yet handled by the implementation.
Nov 12 2012
On Mon, 12 Nov 2012 11:41:00 -0000, Sönke Ludwig <sludwig outerproduct.org> wrote:

> After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library based solution, and a nice way to use shared is to only use it for disabling access to non-shared members while the object's monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:

I had exactly the same idea:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnsrds954xghj:40puck.auriga.bhead.co.uk

But, then I went right back the other way:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnt4iyz54xghj:40puck.auriga.bhead.co.uk

I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 12 2012
On 12.11.2012 13:33, Regan Heath wrote:

> I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

The only problem is that for this approach to be safe, any aliasing outside of the object's reference tree that is not 'shared' must be disallowed. To get the maximum use out of this, some kind of 'isolated'/'unique' qualifier is needed again. So a built-in language solution - which would definitely be highly desirable - that allows this would also either have to introduce a new type qualifier, or recognize the corresponding library structure which implements this. Since for various reasons both possibilities have a questionable probability of being implemented, I decided to go and see what can be done with the current state. By now I would be more than happy to have _any_ decent solution that works and that can also be recommended to others.
Nov 12 2012
On 12/11/2012 12:41, Sönke Ludwig wrote:

> After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. [...] ScopedLock is non-copyable and handles the dirty details of locking and casting away 'shared' when it's safe to do so. No tagging of the class with 'synchronized' or 'shared' needs to be done, and everything works nicely without casts.

With some kind of ownership in the type system, it can be made automagic that shared is cast away on synchronized objects.
Nov 12 2012
On 12.11.2012 14:00, deadalnix wrote:

> With some kind of ownership in the type system, it can be made automagic that shared is cast away on synchronized objects.

Yes, and I would love to have that, but I fear that we would then basically end up where Bartosz Milewski was at the end of his research. And unfortunately that went too far to be considered for (mid-term) inclusion.

Besides its shortcomings, there are actually also some advantages to a library based solution. For example, it could be allowed to customize the lock()/unlock() functions so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network based distributed object systems.
Nov 12 2012
On 12/11/2012 14:23, Sönke Ludwig wrote:

> Besides its shortcomings, there are actually also some advantages to a library based solution. For example, it could be allowed to customize the lock()/unlock() functions so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network based distributed object systems.

Don't get me started on fibers /D
Nov 12 2012
I generated some quick documentation with examples here: http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html It does offer some nice improvements. No single cast and everything is statically checked.
Nov 12 2012
On 12.11.2012 16:27, Sönke Ludwig wrote:

> I generated some quick documentation with examples here:
>
> http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html
> http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html
> http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html
>
> It does offer some nice improvements. No single cast and everything is statically checked.

All examples compile now. Put everything on github for reference:

https://github.com/s-ludwig/d-isolated-test
Nov 12 2012
On Thursday, November 15, 2012 04:12:47 Andrej Mitrovic wrote:

> On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
>> From what I recall of what TDPL says
>
> It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.

Good to know, but none of that really has anything to do with the casting, which is what I was responding to. And looking at that list, it sounds reasonable that all of that would be guaranteed to be atomic, but I think that the fundamental problem that's affecting usability is all of the casting that's typically required. And I don't see any way around that other than writing code that doesn't need to pass shared objects around, or using templates very heavily.

- Jonathan M Davis
Nov 14 2012
On Thursday, November 15, 2012 14:32:47 Manu wrote:

> I don't really see the difference, other than, as you say, the cast is explicit. Obviously the possibility for the situation you describe exists; it's equally possible with the cast, except this way the usage pattern is made more convenient, the user has a convenient way to control the locks, and most importantly, it would work with templates.
>
> That said, this sounds like another perfect application of 'scope'. Perhaps only scope parameters can receive a locked, shared thing... that would mechanically protect you against escape.

You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat. _Maybe_ you could get away with it in very restricted circumstances where both pure and scope are being used, but then it becomes so restrictive that it's nearly useless anyway. And again, it would be hidden from the programmer, when this is something that _needs_ to be explicit. Having implicit locks happen on you could really screw with any code trying to do explicit locks, as would be needed anyway in all but the most basic cases.

> > 2. It's often the case that you need to lock/unlock groups of stuff together, such that locking specific variables is often of limited use [...]
>
> Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.

Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks (both in terms of memory and in terms of runtime cost). Not to mention, it would probably cause all kinds of issues for something like shared int* to have a mutex with it, because then its size is completely different from int*. It would also cause even worse problems when that shared int* was cast to int* (aside from the size issues), because all of the locking that was happening for the shared int* would be invisible. If you want automatic locks, use synchronized classes. That's what they're for.

Honestly, I really don't buy into the idea that it makes sense for shared to magically make multi-threaded code work without the programmer worrying about locks. Making it well-defined as to what's atomic is great for code that has any chance of being lock-free, but it's still up to the programmer to understand when locks are and aren't needed and how to use them correctly. I don't think that it can possibly work for it to be automatic. It's far too easy to introduce deadlocks, and it would only work in the simplest of cases anyway, meaning that the programmer needs to understand and properly solve the issues anyway. And if the programmer has to understand it all to get it right, why bother adding the extra overhead and deadlock potential caused by automatically locking anything? D provides some great synchronization primitives. People should use them.

I think that the only things that shared really needs to be solving are:

1. Indicating to the compiler via the type system that the object is not thread-local. This properly segregates shared and unshared code and allows the compiler to take advantage of thread locality for optimizations and avoid optimizations with shared code that screw up threading (e.g. double-checked locking won't work if the compiler does certain optimizations).

2. Making it explicit and well-defined as part of the language which operations can be assumed to be atomic (even if that set of operations is very small, having it be well-defined is valuable).

3. Ensuring sequential consistency so that it's possible to do lock-free code when atomic operations permit it and so that there are fewer weird issues due to undefined behavior.

- Jonathan M Davis
Nov 15 2012
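The double-checked locking remark in point 1 deserves a concrete illustration; a hedged sketch assuming core.atomic's sequentially consistent atomicLoad/atomicStore, with the Config type and getConfig() invented for the example:

---
import core.atomic;
import core.sync.mutex : Mutex;

class Config { /* settings, immutable after construction */ }

private shared Config g_config;
private __gshared Mutex g_lock;

shared static this() { g_lock = new Mutex; }

Config getConfig()
{
	// fast path: a sequentially consistent read; without shared (and
	// the optimizer barred from reordering) this check would be unsound
	auto c = atomicLoad(g_config);
	if (c is null)
	{
		g_lock.lock();
		scope (exit) g_lock.unlock();
		c = atomicLoad(g_config); // re-check under the lock
		if (c is null)
		{
			c = new shared(Config);
			atomicStore(g_config, c);
		}
	}
	return cast(Config) c; // the usual explicit cast away from shared
}
---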
On 15 November 2012 15:00, Jonathan M Davis <jmdavisProg gmx.com> wrote:

> You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat.

... no, they're not even the same thing. const things can not be changed. Shared things are still mutable things, and perfectly compatible with other non-shared mutable things; they just have some access control requirements.

> _Maybe_ you could get away with it in very restricted circumstances where both pure and scope are being used, but then it becomes so restrictive that it's nearly useless anyway. And again, it would be hidden from the programmer, when this is something that _needs_ to be explicit. Having implicit locks happen on you could really screw with any code trying to do explicit locks, as would be needed anyway in all but the most basic cases.

I think you must have misunderstood my suggestion; I certainly didn't suggest locking would be implicit. All locks would be explicit. All I suggested is that shared things would gain an associated mutex, and an implicit assert that said mutex is locked whenever they are accessed, rather than denying assignment between shared/unshared things. You could use lock methods, or a nice alternative would be to submit them to some sort of synchronised scope like luka illustrates. I'm of the opinion that for the time being, explicit lock control is mandatory (anything else is a distant dream), and atomic primitives may not be relied upon.

> Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks [...] D provides some great synchronization primitives. People should use them.

To all above: you've completely misunderstood my suggestion. It's basically the same as luka's. It's not that hard; shared just assists the user to do what they do anyway by associating a lock primitive, and implicitly asserts it is locked when accessed. No magic should be performed on the user's behalf.
Nov 15 2012
Would it be useful if 'shared' in D did something like 'volatile' in C++ (as in, Andrei's article on volatile-correctness)? http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766
Nov 15 2012