digitalmars.D - The `shared` debate, from my point of view
- Steven Schveighoffer (154/160) Oct 23 2018 I wrote in a previous buried post that I finally understood the benefits...
- Manu (137/322) Oct 23 2018 C++ has strict aliasing. C++ can treat pointers like they're the
- Steven Schveighoffer (81/338) Oct 24 2018 I admit, I'm not an expert on strict aliasing, but I thought that was
I wrote in a previous buried post that I finally understood the benefits of Manu's system of shared, and why he has proposed it with the implicit casting of unshared to shared. Here is the expansion on that.

Let me first start by trying to infer through his various posts and explanations where this thing came from. I'm going to spend a few paragraphs putting words in Manu's mouth, forgive me if I'm wrong (and please correct if necessary!), but I think it's important to understand the motivation for the proposal.

Obviously Manu has a lot of experience writing C++ code, and in C++, anything shared is "shared by convention". That is, you can share anything you want, there are no restrictions. The end result is that any pointer to any data must be treated like it's shared.

Treating every piece of data as if it was shared doesn't pan out well, because synchronizing memory across threads to avoid races isn't cheap, and pointers are used *everywhere*. So you must restrict yourself to a set of rules for sharing data.

What I envision has been practiced in this case is that certain types encapsulate COMPLETELY thread-safe behavior. This means that whether the type is on the stack or on the heap, shared or not, it's going to defensively use locks or atomics to make sure no races can happen with that specific type. This is similar to stdout in libc, which is generally thread-safe, but uses locks even when only one thread is using it.

(Side note: I myself have used things like shared_ptr in multi-threaded environments, and it does make things so much easier to have thread-safe primitives)

Somehow you must know which data is shared from outside the thread, and which data is local. This means that you have to have some way (probably by convention) for siloing this shared data away from your local data. The reason is because local data can be manipulated without synchronization, while shared data cannot.
But clearly, any shared data can only be manipulated via the "thread-safe" types that are fully encapsulated and can't cause problems. The other data, you are not allowed to touch (again, this is C++, so by convention). Knowing what data can be manipulated is easy to discern based on the type of the data (e.g. Atomic<int>).

To recap: the silo that contains data shared from elsewhere can ONLY be manipulated through the fully anointed "thread safe" types. Normal types cannot be touched, because some other thread (the owner thread) can manipulate that data without sync.

------

Now, let's look at the *current state* of D. In D, shared data and unshared data are STRICTLY separated. One cannot simply share any data (like an int *) because that would now mean that int * is shared. Having a shared alias to unshared data trivially causes a paradox that now will result in races. This is the reason why implicit casting either way isn't allowed.

But that doesn't FIT the fully encapsulated "I can use this type shared or not", which can be used whether it's thread local or shared.

So what Manu proposes is to remove this definition of shared, and instead of shared meaning "this data is shared", it means "you can only operate this data IF it provides a thread-safe interface". The thread-safe interface comes in the form of free-functions that accept the type as shared, including shared member functions.

So Manu's proposal (MP) is to do 2 simple things:

1. Basic data that is shared CANNOT be read or written via standard methods or operators. This enforces the convention of not touching data in the shared silo that is NOT thread-safe.
2. Standard data implicitly casts to shared.

The only way obviously to write or read shared basic types, therefore, is to cast away shared. But the intention from this plan is to only do this while inside a fully encapsulated and tightly controlled type.
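For readers who want something concrete, the two rules can be sketched in D roughly as follows. This is only an illustration under stated assumptions: `Atomic` is a hypothetical wrapper written for this example (not a druntime or Phobos type), `otherThreadWork` is a made-up name, and because current D has no implicit unshared-to-shared conversion, an explicit pointer cast stands in for what rule 2 would do implicitly.

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical thread-safe wrapper (assumption: not a library type).
struct Atomic(T)
{
    private T value;

    // Thread-safe interface: callable only through a shared reference.
    T load() shared { return atomicLoad(value); }
    T increment() shared { return atomicOp!"+="(value, 1); }
}

struct SharableType
{
    int x;          // no thread-safe interface: owner thread only
    Atomic!int y;   // has a thread-safe interface

    void threadlocalMethod() { ++x; }                  // needs an unshared reference
    void threadsafeMethod() shared { y.increment(); }  // usable via a shared reference
}

// Stand-in for code running on another thread (hypothetical name).
void otherThreadWork(ref shared SharableType s)
{
    s.threadsafeMethod();      // allowed: shared method
    // s.threadlocalMethod();  // does not compile: not callable on a shared object
    // s.x = 1;                // the kind of direct access rule 1 would forbid
}

void main()
{
    SharableType s;
    s.threadlocalMethod();     // the owner keeps full unshared access

    // Rule 2 would make this conversion implicit; current D needs the cast.
    auto p = cast(shared(SharableType)*) &s;
    otherThreadWork(*p);

    assert(s.x == 1);
    assert(p.y.load() == 1);
}
```

The sketch runs everything on one thread; it only shows which calls the type system accepts through a shared versus an unshared reference.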
And HERE is the key part I was missing -- those ints that have no thread-safe interface, are STILL USABLE by the original thread, because it still can have a non-shared reference to the data. All other threads can ONLY have a shared reference to the data, restricting them to the thread-safe portions of the type. In other words, you can use the Atomic!int type while the reference is shared, but not the int.

Manu's quote (with contexts by me) here explains it all:

> In practise, and in my direct experience, classes tend to have exactly one [thread-safe member], and either zero (pure utility), or many such [thread-local] members. Threadsafe API interacts with [the thread safe member], and the rest is just normal thread-local methods which interact with all members thread-locally, and may also interact with [the thread safe member] while not violating any threadsafety commitments.

This requires a different mindset when implementing shared data. You can NEVER have a function that takes a shared int * and does anything with it. So all of core.atomic changes to only accepting `ref int`, and not `shared ref int`. Essentially, in order for a type to have an encapsulated thread-safe interface, it cannot have any other thread-unsafe means of manipulating the data. Obviously, this means basic types are useless as shared types unless encapsulated into a specially written type.

You want these sharable types in their own modules, so it can't have any unforeseen hooks into the private data, and it will actually work. It is a convention, although not too hard to follow. Some have mentioned that there are still loopholes (like accessing tupleof) that need to be addressed, but those should be addressed anyway.

Therefore, the rules are simple, they are sound, and they do accomplish a certain view of sharing data that will be useful in many cases.
And it allows Manu's current model of sharing data to easily be implemented AND get rid of some of the convention in C++ by using compiler guarantees in D.

-----------

So here is my take on this:

I propose that we still make basic shared data unusable without casting, but do not allow implicit casting to shared. Manu's workflow and model is still doable without the implicit casting. Simply because, if you want shared data, declare it shared.

That is, if you have (with MP):

    struct SharableType
    {
        int x;
        Atomic!int y;
    }

just declare it:

    struct SharableType
    {
        int x;
        shared Atomic!int y;
    }

and share y instead of the whole thing. You still do not have to cast anything, and realistically, the other thread doesn't care about the other data it receives that isn't actually accessible. I see no reason to deal with the compiler preventing twiddling when it can be trivially prevented by not giving it to the other thread.

The objection I have seen most cited is that then the user is forced to cast data to shared to share it. I don't see how -- if you have the above you don't need to cast.

Simply put, casting unshared data to shared or vice versa means you have verified BY HAND that there are no other references to that data from that point forward. If the compiler can prove this, it can do the implicit cast. It works fine for immutable/mutable transitions, and can work here too. Casting to share data will not be a requirement for safe code, and will be rare in user code, if anywhere.

The only issue I see that can possibly cause problems is that it may not be easy or possible to separate the shared parts of data into its own type, which means you have to share it through an artificial reference type (one that contains only the sharable pieces). This can be automated and implemented via introspection.
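A minimal sketch of the "declare it shared and share only y" alternative described above, assuming the same kind of hypothetical `Atomic` wrapper (it is written out here for the example and is not a library type) and a made-up `doParallelWork` function standing in for whatever hands work to other threads:

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical thread-safe wrapper (assumption: not a library type).
struct Atomic(T)
{
    private T value;

    T load() shared { return atomicLoad(value); }
    T increment() shared { return atomicOp!"+="(value, 1); }
}

struct SharableType
{
    int x;                 // thread-local, never handed to other threads
    shared Atomic!int y;   // declared shared up front
}

// Hypothetical worker entry point: it receives only the shared piece.
void doParallelWork(ref shared Atomic!int counter)
{
    counter.increment();
}

void main()
{
    SharableType s;
    s.x = 42;              // plain access to the unshared member
    doParallelWork(s.y);   // no cast: y is already typed as shared
    assert(s.y.load() == 1);
    assert(s.x == 42);
}
```

Here the other side never sees `x` at all, which is the point being made: access is restricted by not passing the data rather than by the compiler rejecting it.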
One further benefit to keeping the cast explicit, is that one can write specific implementations knowing that data is not shared or is shared, giving a possibility of performance benefit that just isn't possible with MP (at least it isn't possible with compiler guarantees, obviously anything is possible if you follow conventions).

One thing that is problematic with MP, is that you can't actually pass ownership of thread-local data from one thread to another. This isn't actually possible without casting under the current shared regime, but with implicit casting from unshared to shared, you have introduced NO opt-in cast on the sharing side. This makes it impossible for the compiler or code reviewer to find the place at which you should be verifying the reference is unique (a requirement if you want to change ownership). The receiving side's cast back to thread-local can be abstracted (because you can wrap it in a type that assumes uniqueness and destroys the original).

Another thing that looks attractive from MP is you have this "carved out" section of your type that's only owned by your thread. This is great until you realize, you ONLY have access to it from your original reference. You can't send it away, get it back, and then manipulate the result. In this sense, it's VERY similar to const. So really it does you no good to associate the shared portions of the data with your local portions for the purpose of sending it away to other threads for a processing round-trip.

-------

To summarize, I think the reality is that we ACTUALLY can implement sharing as Manu wishes without implicit casting, albeit via library abstraction using introspection. I can easily see a library that allows you to pass a type that isn't shared, as long as it has shared pieces, and have that library simply restrict access to the thread-safe pieces via a wrapper. We don't need the compiler's help for that.
So Manu can have his cake, I can eat my cake, we'll have a great big sharing of cake party, where nobody is racing, and everything is roses and lollipops.

That's all I can think of for now.

-Steve
Oct 23 2018
On Tuesday, 23 October 2018 at 21:17:16 UTC, Steven Schveighoffer wrote:

> I wrote in a previous buried post that I finally understood the benefits of Manu's system of shared, and why he has proposed it with the implicit casting of unshared to shared. Here is the expansion on that.

> Let me first start by trying to infer through his various posts and explanations where this thing came from. I'm going to spend a few paragraphs putting words in Manu's mouth, forgive me if I'm wrong (and please correct if necessary!), but I think it's important to understand the motivation for the proposal.

> Obviously Manu has a lot of experience writing C++ code, and in C++, anything shared is "shared by convention". That is, you can share anything you want, there are no restrictions. The end result is that any pointer to any data must be treated like it's shared.

C++ has strict aliasing. C++ can treat pointers like they're the only reference to that data in the universe if it likes. It's not true that "data must be treated like it's shared". Data must be very deliberately handled if it IS shared. And if it is shared, and not handled, then basically, undefined behaviour. The rules are basically the same as D.

Just to be clear, I'm not talking about, and not contrasting to C++ at any point. To suggest so is not fair. I'm strictly interested in the reality of data access.

> Treating every piece of data as if it was shared doesn't pan out well, because synchronizing memory across threads to avoid races isn't cheap, and pointers are used *everywhere*. So you must restrict yourself to a set of rules for sharing data.

> What I envision has been practiced in this case is that certain types encapsulate COMPLETELY thread-safe behavior. This means that whether the type is on the stack or on the heap, shared or not, it's going to defensively use locks or atomics to make sure no races can happen with that specific type.
> This is similar to stdout in libc, which is generally thread-safe, but uses locks even when only one thread is using it.

> (Side note: I myself have used things like shared_ptr in multi-threaded environments, and it does make things so much easier to have thread-safe primitives)

> Somehow you must know which data is shared from outside the thread, and which data is local. This means that you have to have some way (probably by convention) for siloing this shared data away from your local data. The reason is because local data can be manipulated without synchronization, while shared data cannot. But clearly, any shared data can only be manipulated via the "thread-safe" types that are fully encapsulated and can't cause problems. The other data, you are not allowed to touch (again, this is C++, so by convention). Knowing what data can be manipulated is easy to discern based on the type of the data (e.g. Atomic<int>).

> To recap: the silo that contains data shared from elsewhere can ONLY be manipulated through the fully anointed "thread safe" types. Normal types cannot be touched, because some other thread (the owner thread) can manipulate that data without sync.

> ------

> Now, let's look at the *current state* of D. In D, shared data and unshared data are STRICTLY separated. One cannot simply share any data (like an int *) because that would now mean that int * is shared. Having a shared alias to unshared data trivially causes a paradox that now will result in races. This is the reason why implicit casting either way isn't allowed.

> But that doesn't FIT the fully encapsulated "I can use this type shared or not", which can be used whether it's thread local or shared.

> So what Manu proposes is to remove this definition of shared, and instead of shared meaning "this data is shared", it means "you can only operate this data IF it provides a thread-safe interface".
> The thread-safe interface comes in the form of free-functions that accept the type as shared, including shared member functions.

> So Manu's proposal (MP) is to do 2 simple things:
> 1. Basic data that is shared CANNOT be read or written via standard methods or operators. This enforces the convention of not touching data in the shared silo that is NOT thread-safe.
> 2. Standard data implicitly casts to shared.

> The only way obviously to write or read shared basic types, therefore, is to cast away shared. But the intention from this plan is to only do this while inside a fully encapsulated and tightly controlled type.

> And HERE is the key part I was missing -- those ints that have no thread-safe interface, are STILL USABLE by the original thread, because it still can have a non-shared reference to the data. All other threads can ONLY have a shared reference to the data, restricting them to the thread-safe portions of the type. In other words, you can use the Atomic!int type while the reference is shared, but not the int.

> Manu's quote (with contexts by me) here explains it all:

Yup. But let's be clear; this isn't actually a feature of my proposal, this is a feature of reality! There is *absolutely no world* where unregulated interaction with primitive data is safe. You MUST perform custom and deliberate threadsafe handling of any interaction with any data. Any access to raw data from a shared reference can never be safe, and should be strictly banned. atomicIncrement(shared int*) is unacceptable under any conceivable model, because `int` has an unsafe API (the intrinsic operators). Even in the current implementation, right now, core.atomic functions should change to int* and require unsafe casts.

>> In practise, and in my direct experience, classes tend to have exactly one [thread-safe member], and either zero (pure utility), or many such [thread-local] members.
>> Threadsafe API interacts with [the thread safe member], and the rest is just normal thread-local methods which interact with all members thread-locally, and may also interact with [the thread safe member] while not violating any threadsafety commitments.

> This requires a different mindset when implementing shared data. You can NEVER have a function that takes a shared int * and does anything with it. So all of core.atomic changes to only accepting `ref int`, and not `shared ref int`. Essentially, in order for a type to have an encapsulated thread-safe interface, it cannot have any other thread-unsafe means of manipulating the data. Obviously, this means basic types are useless as shared types unless encapsulated into a specially written type.

> You want these sharable types in their own modules, so it can't have any unforeseen hooks into the private data, and it will actually work. It is a convention, although not too hard to follow. Some have mentioned that there are still loopholes (like accessing tupleof) that need to be addressed, but those should be addressed anyway.

> Therefore, the rules are simple, they are sound, and they do accomplish a certain view of sharing data that will be useful in many cases. And it allows Manu's current model of sharing data to easily be implemented AND get rid of some of the convention in C++ by using compiler guarantees in D.

**Get rid of _all_ of our convention. Our infrastructure would become typesafe, and safe. I can't think of any data that is strictly shared. We don't have that, and I don't know what things are like that which aren't also immutable. All data has an owner, and that owner can do things to it that an owner should be allowed to do.

> -----------

> So here is my take on this: I propose that we still make basic shared data unusable without casting,

Indeed, I don't see any room for debate on this. It just needs to be right.

> but do not allow implicit casting to shared.
> Manu's workflow and model is still doable without the implicit casting. Simply because, if you want shared data, declare it shared. That is, if you have (with MP):
>
>     struct SharableType
>     {
>         int x;
>         Atomic!int y;
>     }
>
> just declare it:
>
>     struct SharableType
>     {
>         int x;
>         shared Atomic!int y;
>     }

Declaring `y` shared might be a useful choice in some cases to help catch cases of unshared functions accidentally accessing a member. In this case though it's a bit lame that now an unshared function can't access `y`, since it's Atomic!() and therefore perfectly fine to do. Any unshared function that wants to access `y` must do an unsafe cast.

But the real problem is here:

    void DoParallelFor(ref shared SharableType x)
    {
        x.threadsafeMethod();
    }

    void fun()
    {
        SharableType x;
        x.threadlocalMethod();
        DoParallelFor(x); // <- no implicit conversion requires unsafe cast! solame!
    }

So now, fun() must be unsafe.

This requirement to perform needless unsafe casts everywhere means my whole program becomes unsafe! And that goes for all forms of safety. We don't have an "allow unsafe shared casts, but enforce safety for other things" option... it's just that all things are unsafe now. By forcing totally needless unsafe interactions into user code, you are making the whole user-side program unsafe. That's a terrible design choice.

> and share y instead of the whole thing.

But it's `SharableType` that defines interesting interactions with `y`. `y` is private; it's an uninteresting implementation detail of `SharableType`.

> You still do not have to cast anything, and realistically, the other thread doesn't care about the other data it receives that isn't actually accessible. I see no reason to deal with the compiler preventing twiddling when it can be trivially prevented by not giving it to the other thread.

> The objection I have seen most cited is that then the user is forced to cast data to shared to share it.
> I don't see how -- if you have the above you don't need to cast.

My examples above should convince you that casts must exist. The casts will appear somewhere. Like I say above, sharing `y` is uninteresting, because it's `SharableType` that defines interesting interactions with `y`. I could add another layer in the middle, but then we just move the casts to any unshared methods of `SharableType` that want to call threadsafe functions of its member, and we've needlessly made the implementation of `SharableType` more complex and noisy.

Like I say, the casts will exist *somewhere*, this is just a matter of choice of where. My proposal makes the (I feel; 'objective') assumption, that the best place for unsafe casts to appear is:

* in the 5-10 core low-level library functions written by the threadsafety expert; that guy is trained to handle unsafety concerns properly
* NOT in the user code, causing ALL user code to become unsafe by necessitating that users perform unsafe casts

I'm not changing the landscape, I'm just shifting the safety guards into a more reasonable location. By putting the unsafety in the *core* library, the whole program is safe, and the number of unsafe interactions is minimised and contained.

It's also a matter of unsafe casting frequency. I've made the point that library:users is a 1:many ratio. Shifting the unsafe bits into the '1', and NOT scattering it among the 'many' is just common-sense.

> Simply put, casting unshared data to shared or vice versa means you have verified BY HAND that there are no other references to that data from that point forward. If the compiler can prove this, it can do the implicit cast. It works fine for immutable/mutable transitions, and can work here too. Casting to share data will not be a requirement for safe code, and will be rare in user code, if anywhere.

Can you elaborate on this claim: "Casting to share data will not be a requirement for safe code"

I went through the process you're going through now...
I've been all over this design landscape, but I can't produce any design that works other than the one I have.

> The only issue I see that can possibly cause problems is that it may not be easy or possible to separate the shared parts of data into its own type, which means you have to share it through an artificial reference type (one that contains only the sharable pieces). This can be automated and implemented via introspection.

This does indeed feel very awkward to me. Understand, you're enforcing this on a very large number of types in my ecosystem. Burden of complexity is best placed on the threadsafety author/expert and isolated/contained in core libs, not distributed among all users in all code everywhere.

> One further benefit to keeping the cast explicit, is that one can write specific implementations knowing that data is not shared or is shared, giving a possibility of performance benefit that just isn't possible with MP (at least it isn't possible with compiler guarantees, obviously anything is possible if you follow conventions).

I think what you mean is "one can write specific implementations *safely*..." And that's literally the single useful facet of the current design I can identify. My model can still implement 2 overloads to make the same assumptions, but it requires unsafe cast in the threadlocal implementation to implement the optimisation. I am completely happy with such an optimisation being unsafe, but I think safety by default is the sensible option.

The reality is though, that the thread-local functions are NOT the perf issue, almost by definition. Sharing something implies that its threadsafe methods will be called a great many times by many threads... the single instance would not represent 'the workload' in a shared world, it's just an arbiter, or book-keeper.

> One thing that is problematic with MP, is that you can't actually pass ownership of thread-local data from one thread to another.

Is this true? This feels like a problem for move semantics.
It seems like a general architectural problem, can you explain how MP affects this? How does this work now that would be ruined by my proposal?

> This isn't actually possible without casting under the current shared regime, but with implicit casting from unshared to shared, you have introduced NO opt-in cast on the sharing side.

You say "passing ownership", why would that API receive a `shared` one? That's backwards. It should receive *THE* one, ie, an unshared rvalue, and you would move your object.

> This makes it impossible for the compiler or code reviewer to find the place at which you should be verifying the reference is unique (a requirement if you want to change ownership). The receiving side's cast back to thread-local can be abstracted (because you can wrap it in a type that assumes uniqueness and destroys the original).

I think you're mistaken to think that `shared` has anything at all to do with passing ownership. That case is not-shared by definition.

> Another thing that looks attractive from MP is you have this "carved out" section of your type that's only owned by your thread. This is great until you realize, you ONLY have access to it from your original reference. You can't send it away, get it back, and then manipulate the result. In this sense, it's VERY similar to const. So really it does you no good to associate the shared portions of the data with your local portions for the purpose of sending it away to other threads for a processing round-trip.

I don't understand this point. You either lease it out to a cluster for processing (think parallel for), or if you want to 'send' it on a round-trip, then you are transferring ownership along the way. Both models work fine.

> -------

> To summarize, I think the reality is that we ACTUALLY can implement sharing as Manu wishes without implicit casting, albeit via library abstraction using introspection.
> I can easily see a library that allows you to pass a type that isn't shared, as long as it has shared pieces, and have that library simply restrict access to the thread-safe pieces via a wrapper. We don't need the compiler's help for that.

> So Manu can have his cake, I can eat my cake, we'll have a great big sharing of cake party, where nobody is racing, and everything is roses and lollipops.

we end up in. All of us will be happier in that world, so I think that's worth pursuing as a first goal. all code being unsafe. whole point :/
Oct 23 2018
On 10/23/18 9:22 PM, Manu wrote:

> On Tuesday, 23 October 2018 at 21:17:16 UTC, Steven Schveighoffer wrote:

I admit, I'm not an expert on strict aliasing, but I thought that was about how pointers of different *types* can be assumed not to point at the same thing.

>> I wrote in a previous buried post that I finally understood the benefits of Manu's system of shared, and why he has proposed it with the implicit casting of unshared to shared. Here is the expansion on that.

>> Let me first start by trying to infer through his various posts and explanations where this thing came from. I'm going to spend a few paragraphs putting words in Manu's mouth, forgive me if I'm wrong (and please correct if necessary!), but I think it's important to understand the motivation for the proposal.

>> Obviously Manu has a lot of experience writing C++ code, and in C++, anything shared is "shared by convention". That is, you can share anything you want, there are no restrictions. The end result is that any pointer to any data must be treated like it's shared.

> C++ has strict aliasing. C++ can treat pointers like they're the only reference to that data in the universe if it likes. It's not true that "data must be treated like it's shared". Data must be very deliberately handled if it IS shared. And if it is shared, and not handled, then basically, undefined behaviour.

> The rules are basically the same as D. Just to be clear, I'm not talking about, and not contrasting to C++ at any point. To suggest so is not fair. I'm strictly interested in the reality of data access.

I wasn't trying to say there is a C++ comparison, but in trying to reconstruct where your vast experience comes from, I thought maybe it was from your work in C++. It helps to understand where the ideas come from if you know the limitations of the system in which they were developed.
Sorry if I made it sound otherwise.

Very true, but in MP, the threadsafe API requires that there is ONLY a threadsafe API, whereas in current D, you can interact with shared data in a threadsafe manner (and if we implement rule 1 regardless, then only the threadsafe API exists).

>> This requires a different mindset when implementing shared data. You can NEVER have a function that takes a shared int * and does anything with it. So all of core.atomic changes to only accepting `ref int`, and not `shared ref int`. Essentially, in order for a type to have an encapsulated thread-safe interface, it cannot have any other thread-unsafe means of manipulating the data. Obviously, this means basic types are useless as shared types unless encapsulated into a specially written type.

> Yup. But let's be clear; this isn't actually a feature of my proposal, this is a feature of reality! There is *absolutely no world* where unregulated interaction with primitive data is safe.

> You MUST perform custom and deliberate threadsafe handling of any interaction with any data. Any access to raw data from a shared reference can never be safe, and should be strictly banned.

Concur.

> atomicIncrement(shared int*) is unacceptable under any conceivable model, because `int` has an unsafe API (the intrinsic operators).

No, only in MP, since implicit casting is allowed. If no implicit casting is allowed, then threadsafe API is still possible for shared ints.

> Even in the current implementation, right now, core.atomic functions should change to int* and require unsafe casts.

Not necessary if we implement rule 1.

Trivially, not all data is shared. So saying data is shared even though you can't use it is no different than saying it's not shared. Essentially, you ARE only sharing the threadsafe data in MP, since the other data isn't usable.

>> You want these sharable types in their own modules, so it can't have any unforeseen hooks into the private data, and it will actually work.
>> It is a convention, although not too hard to follow. Some have mentioned that there are still loopholes (like accessing tupleof) that need to be addressed, but those should be addressed anyway.

>> Therefore, the rules are simple, they are sound, and they do accomplish a certain view of sharing data that will be useful in many cases. And it allows Manu's current model of sharing data to easily be implemented AND get rid of some of the convention in C++ by using compiler guarantees in D.

> **Get rid of _all_ of our convention. Our infrastructure would become typesafe, and safe. I can't think of any data that is strictly shared. We don't have that, and I don't know what things are like that which aren't also immutable. All data has an owner, and that owner can do things to it that an owner should be allowed to do.

Nod. I'm 99% sure Walter is in favor of something like this as well, but I can't find the past evidence right now.

>> -----------

>> So here is my take on this: I propose that we still make basic shared data unusable without casting,

> Indeed, I don't see any room for debate on this. It just needs to be right.

Why not? An unshared function can access `y` just fine via its shared interface, just like it does today.

>> but do not allow implicit casting to shared. Manu's workflow and model is still doable without the implicit casting. Simply because, if you want shared data, declare it shared. That is, if you have (with MP):
>>
>>     struct SharableType
>>     {
>>         int x;
>>         Atomic!int y;
>>     }
>>
>> just declare it:
>>
>>     struct SharableType
>>     {
>>         int x;
>>         shared Atomic!int y;
>>     }

> Declaring `y` shared might be a useful choice in some cases to help catch cases of unshared functions accidentally accessing a member. In this case though it's a bit lame that now an unshared function can't access `y`, since it's Atomic!() and therefore perfectly fine to do.
> Any unshared function that wants to access `y` must do an unsafe cast.

> But the real problem is here:
>
>     void DoParallelFor(ref shared SharableType x)
>     {
>         x.threadsafeMethod();
>     }
>
>     void fun()
>     {
>         SharableType x;
>         x.threadlocalMethod();
>         DoParallelFor(x); // <- no implicit conversion requires unsafe cast! solame!
>     }

No, you just stick the method into the type you pass that is shared. If you need extra shared methods on the type, you wrap it, and make those methods shared. It's still a sub-type of the full type, or even a separate piece of data.

> So now, fun() must be unsafe.

No, you just pass the shared part:

    DoParallelFor(x.sharedPart);

Or pass a specialized introspection-generated wrapper, which only shares the sharable parts:

    DoParallelFor(x.makeShared);

> This requirement to perform needless unsafe casts everywhere means my whole program becomes unsafe! And that goes for all forms of safety. We don't have an "allow unsafe shared casts, but enforce safety for other things" option... it's just that all things are unsafe now. By forcing totally needless unsafe interactions into user code, you are making the whole user-side program unsafe. That's a terrible design choice.

Casts are not required, so this part is moot.

Then you subdivide the SharableType into the parts that are sharable and the parts that are not, putting the sharable interface there.

>> and share y instead of the whole thing.

> But it's `SharableType` that defines interesting interactions with `y`. `y` is private; it's an uninteresting implementation detail of `SharableType`.

[snip]

No, as mentioned above, no casts are needed.

>> The objection I have seen most cited is that then the user is forced to cast data to shared to share it. I don't see how -- if you have the above you don't need to cast.

> My examples above should convince you that casts must exist.

> I'm not changing the landscape, I'm just shifting the safety guards into a more reasonable location.
By putting the unsafety in the *core* library, the whole program is safe, and the number of unsafe interactions is minimised and contained. It's also a matter of unsafe casting frequency. I've made the point that library:users is a 1:many ratio. Shifting the unsafe bits into the '1', and NOT scattering it among the 'many' is just common-sense.
I think we can save the casting for the underlying library writers in either case.
You allocate it shared or define it shared to begin with. Because it's going to be shared. No casts are needed.
Simply put, casting unshared data to shared or vice versa means you have verified BY HAND that there are no other references to that data from that point forward. If the compiler can prove this, it can do the implicit cast. It works fine for immutable/mutable transitions, and can work here too. Casting to share data will not be a requirement for safe code, and will be rare in user code, if anywhere.
Can you elaborate on this claim: "Casting to share data will not be a requirement for safe code"
I went through the process you're going through now... I've been all over this design landscape, but I can't produce any design that works other than the one I have.
I don't want to say that you haven't looked at it this way before, but this argument really is an appeal to authority. I just can't accept that "because I have the authority" is the main reason why it can't work. You haven't explained exactly why it doesn't work in your system.
I think it would help very much to have an example of such a type, and how it's used. It's hard to think in abstract terms.
The only issue I see that can possibly cause problems is that it may not be easy or possible to separate the shared parts of data into its own type, which means you have to share it through an artificial reference type (one that contains only the sharable pieces). This can be automated and implemented via introspection.
This does indeed feel very awkward to me.
Understand, you're enforcing this on a very large number of types in my ecosystem.
Burden of complexity is best placed on the threadsafety author/expert and isolated/contained in core libs, not distributed among all users in all code everywhere.
Not disagreeing here, but again, this argument seems based on the faulty assumption that casts are required in user code.
Yes, that's what I meant.
One further benefit to keeping the cast explicit, is that one can write specific implementations knowing that data is not shared or is shared, giving a possibility of performance benefit that just isn't possible with MP (at least it isn't possible with compiler guarantees, obviously anything is possible if you follow conventions).
I think what you mean is "one can write specific implementations *safely*..." And that's literally the single useful facet of the current design I can identify.
My model can still implement 2 overloads to make the same assumptions, but it requires unsafe cast in the threadlocal implementation to implement the optimisation.
No, it won't be possible, unless the two overloads use different data members.
I am completely happy with such an optimisation being unsafe, but I think safety by default is the sensible option.
I'm fine with it too, but it's unnecessary. However, I don't see this being a trade-off. In the model where we only implement rule one, safety by default is present, *and* you can implement the optimized versions.
The reality is though, that the thread-local functions are NOT the perf issue, almost by definition. Sharing something implies that its threadsafe methods will be called a great many times by many threads... the single instance would not represent 'the workload' in a shared world, it's just an arbiter, or book-keeper.
It just allows the same type to be used in a shared or unshared capacity without paying the synchronization penalty when it's not necessary.
I maybe said this too forcefully. Clearly you can do it by convention.
It's just that you must manually verify there are no thread-local references to it remaining. The compiler cannot check this for you, which generally suggests you should require a cast. This means you are thwarting the mechanically checked safe code, and it requires more scrutiny. But yes, you can do it in either case.
One thing that is problematic with MP, is that you can't actually pass ownership of thread-local data from one thread to another.
Is this true? This feels like a problem for move semantics. It seems like a general architectural problem, can you explain how MP affects this?
Moving a current alias doesn't guarantee there are not other aliases. I still think a cast should be required. If it is provable by the compiler, then conversion to shared works fine. But yeah, you could make the function that passes the ownership accept an unshared value. It's just more dangerous and less safe.
This isn't actually possible without casting under the current shared regime, but with implicit casting from unshared to shared, you have introduced NO opt-in cast on the sharing side.
You say "passing ownership", why would that API receive a `shared` one? That's backwards. It should receive *THE* one, ie, an unshared rvalue, and you would move your object.
Any time you are going between threads, it's shared. Passing ownership means that both sides agree to a handshake between the two threads. In this case, it's temporarily shared while you are exchanging the data.
This makes it impossible for the compiler or code reviewer to find the place at which you should be verifying the reference is unique (a requirement if you want to change ownership). The receiving side's cast back to thread-local can be abstracted (because you can wrap it in a type that assumes uniqueness and destroys the original).
I think you're mistaken to think that `shared` has anything at all to do with passing ownership. That case is not-shared by definition.
Lease what? The shared parts or the thread-local parts?
In any case, you can't retain ownership via another reference and pass the data to a different thread safely. This would all have to be manually verified.
Another thing that looks attractive from MP is you have this "carved out" section of your type that's only owned by your thread. This is great until you realize, you ONLY have access to it from your original reference. You can't send it away, get it back, and then manipulate the result. In this sense, it's VERY similar to const. So really it does you no good to associate the shared portions of the data with your local portions for the purpose of sending it away to other threads for a processing round-trip.
I don't understand this point. You either lease it out to a cluster for processing (think parallel for), or if you want to 'send' it on a round-trip, then you are transferring ownership along the way. Both models work fine.
I hope you can see that's not the case. Declaring things how they are going to be does not require any casting.
-Steve
-------
To summarize, I think the reality is that we ACTUALLY can implement sharing as Manu wishes without implicit casting, albeit via library abstraction using introspection. I can easily see a library that allows you to pass a type that isn't shared, as long as it has shared pieces, and have that library simply restrict access to the thread-safe pieces via a wrapper. We don't need the compiler's help for that. So Manu can have his cake, I can eat my cake, we'll have a great big sharing of cake party, where nobody is racing, and everything is roses and lollipops.
All of us will be happier in that world, so I think that's worth pursuing as a first goal.
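For illustration only, here is a minimal hand-written sketch of the "declare it shared and pass only the shared part" approach argued for in this thread. It is not code from either poster: `sharedPart` is written by hand rather than generated by introspection, a plain `shared int` stands in for `Atomic!int`, `doParallelFor` is a single-threaded stand-in for a real parallel loop, and all of these names are assumptions.

```d
import core.atomic : atomicLoad, atomicOp;

struct SharableType
{
    int x;          // thread-local state, accessed without synchronization
    shared int y;   // declared shared up front, so sharing it needs no cast

    // Hypothetical accessor: hands out only the piece that may be shared.
    ref shared(int) sharedPart() return
    {
        return y;
    }
}

// Stand-in for the thread's DoParallelFor; it only ever sees shared data.
void doParallelFor(ref shared(int) counter)
{
    atomicOp!"+="(counter, 1);
}

int fun()
{
    SharableType s;
    s.x = 41;                        // plain thread-local access
    doParallelFor(s.sharedPart());   // no cast at the call site
    return s.x + atomicLoad(s.y);
}
```

The call site in `fun` contains no cast, which is the point being claimed. What the sketch does not address is Manu's objection that the interesting thread-safe interface lives on `SharableType` itself rather than on `y`.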
Oct 24 2018