digitalmars.D - A question about move() and a rant about shared
- Stanislav Blinov (190/190) Jan 24 2014 Ok, this is going to be a long one, so please bear with me.
- Stanislav Blinov (2/2) Jan 24 2014 Ouch. Please excuse the formatting. I didn't pay attention when
- Dmitry Olshansky (59/140) Jan 24 2014 2 unrelated questions would be better as 2 nice smaller posts.
- Stanislav Blinov (102/174) Jan 24 2014 I know. Sorry if it got you irritated. I have a feeling I'll be
- Dmitry Olshansky (39/172) Jan 25 2014 For functions only this pointer becomes shared. For data - yes,
- Stanislav Blinov (66/130) Jan 25 2014 It can already be inferred from the interface. The Queue uses
- Dmitry Olshansky (26/61) Jan 25 2014 Say front returns a reference to a first element in an array.
- Stanislav Blinov (21/66) Jan 25 2014 Aha, got it. Thanks, I'll be looking into that.
- Andrei Alexandrescu (3/16) Jan 25 2014 We plan to disallow taking address of a ref return from a function.
- Stanislav Blinov (11/13) Jan 25 2014 What about sneaky cases? E.g. like this:
- Stanislav Blinov (18/33) Jan 26 2014 I believe I need to clarify the question.
- Stanislav Blinov (3/4) Jan 26 2014 s/int/it/
Ok, this is going to be a long one, so please bear with me. I'll start with a question. 1. std.algorithm.move() and std.container TDPL describes when a compiler can and cannot perform a move automatically. For cases when it isn't done automatically but we explicitly require a move, we have std.algorithm.move(). This function comes in extremely handy especially when we want to pass around data that cannot be otherwise copied (disabled this(this)). For example, sinking some unique value into a thread or storing it in a container. Or even both. But why is there no practical way of storing such uncopyable data in standard containers? I.e. both Array and DList do try perform a copy when insert() is called, and happily fail when this(this) is disabled. Same with access: front() returns by value, so again no luck with disabled this(this). What is interesting though is that range interfaces for containers do allow for moveFront() et al., and for some containers they're even defined. So it's safe to move contents *out* but not *in*? Is there some deeper technical reasoning behind this that I fail to see? Below is a medium rant that's somewhat unrelated to the above, and is aimed at receiving insights from those who're interested, so if you're not, just skip it :) I'll use quotes here to distinguish words from language qualifier. This is mostly my current thoughts on "shared" and its usage, and I'd like that you could point out where I'm wrong in this sensitive topic, any feedback is greately appreciated. 2. "shared" is transitive. How transitive? Declaring something as "shared" means that all its representation is also "shared". This is a good thing, right?. But it does have certain implications. Consider a shared data structure (for example, a (multi)producer (multi)consumer queue). If it's designed to store anything that has indirections (pointers, references), those are better be either provably unique (not possble in D except for immutable data), or "shared". In fact, the whole stored type should just be "shared", which is enforced by the compiler. Thus, we come to this: shared class Queue(T) { private Container!T q; // that'll be shared(Container!T), // which will in turn store shared(T) alias shared(T) Type; // push and pop are of course synchronized void push(Type) { ... } Type pop() { ... } } But in case when there are no indirections (i.e. a primitive type, or, more practically, a struct with some primitive fields and a bunch of methods that reason about that data or maybe do something with it) it all comes down to usage. In case of that queue, no two threads could possibly access the same data simultaneously. Let me define it real quick (somewhat contrived but should state the intent): struct Packet { ulong ID; ubyte[32] header; ubyte[64] data; string type() inout property { ... } ulong checkSum() inout property { ... } Variant payload() inout property { ... } } Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static. As soon as a producer pushes such value, it releases ownership of it, and some consumer later gains ownership. Remember, there are no indirections, so no two threads could race against the same data. But I cannot just declare a plain struct and then start pushing it into that queue. It wouldn't work, because queue expects "shared" type. One solution would be to use a cast. On one hand, this is feasible: such data is really only logically shared when it's somewhere "in-between" threads, i.e. sits in a queue. The "shared" queue owns the data for a moment, and thus makes the data iself "shared". As soon as a consumer pops that value off a queue, it can be cast back to non-"shared". This is ugly: it imposes certain convention in handling one "shared" type (the queue) with another non-"shared" one (the struct): i.e. "always cast when push or pop". Convention is not a reasonable justification for overlooking type system. Another solution would be to instantiate those structs as "shared" in the first place. But that won't work either: now any methods that those structs have must also have "shared" overloads. In other words, I suddenly need to provide "shared" interface for my struct. Well ok, I can do that trivially, by just declaring the whole struct type as "shared". But this is wrong. "shared" advertises certain promise: this data is allowed to be accessed by more than one thread at a time. This implies that access to the data is better be synchronized. In other words, I would have to actually *write* the synchronization for something that would never *need* synchronization. If I don't do it and simply leave the struct declared as "shared" (or have all relevant "shared" overloads), I'm shooting someone in the foot: imagine that later someone starts using my types, sees that this struct is declared "shared" and happily assumes that it can be used concurrently. In that contrived example it would be easy to see that's not actually the case. But reality is cruel. Bang. So ideally I'd want the queue to handle this situation for me, and luckily I can: shared class Queue(T) { static if (hasUnsharedAliasing!T) { private Container!T q; // as before alias shared(T) Type; } else { private __gshared Container!T q; // nothing is "shared" here alias T Type; } // push and pop are still synchronized :) void push(Type) { ... } Type pop() { ... } } But that's not the end of it. As seen from that definition, I'm using some container (Container(T)) as actual storage. Current definition of Queue requires that Container(T) have "shared" interface. Either that, or implement the whole storing business myself right there in the Queue. The latter is certainly not feasible, especially since, depending on requirements for the Queue, I may need different storage capabilities. In short: I don't need that container to be "shared" at all (provided it's a sane container that doesn't do anything else with the data except for storing it). And in fact, if it were "shared" already, I wouldn't need to define Queue at all, I'd just use Container directly. Therefore, final iteration of Queue would look like this: shared class Queue(T) { static if (hasUnsharedAliasing!T) alias shared(T) Type; else alias T Type; private __gshared Container!Type q; // still synchronized :) void push(Type) { ... } Type pop() { ... } } All synchronization issues are handled by Queue, Container merely stores the data, which is in turn "shared" or not depending on its representation. This is conceptually how std.concurrency works: it allows you to send and receive plain value types without imposing any "shared" qualification on them, but as soon as you try to send a non-"shared" reference type or a struct with non-"shared" pointers, it won't allow it. There is, however, a nag with this: __gshared is not safe. But getting rid of it would mean only one thing: the Queue could only ever store shared(T), which kind of kills initial message. So, is "shared" really not as transitive as D wants it to be? I imagine by now you already have a big list of "you're incorrect"s to stick in my face, or you probably have already stopped reading :) I have some more doubts regarding my handling of "shared", but I'll leave them for later so as to not bore you to death.
Jan 24 2014
Ouch. Please excuse the formatting. I didn't pay attention when pasting this :|
Jan 24 2014
24-Jan-2014 21:07, Stanislav Blinov пишет:Ok, this is going to be a long one, so please bear with me. I'll start with a question.2 unrelated questions would be better as 2 nice smaller posts. Just saying.1. std.algorithm.move() and std.container[snip]But why is there no practical way of storing such uncopyable data in standard containers? I.e. both Array and DList do try perform a copy when insert() is called, and happily fail when this(this) is disabled. Same with access: front() returns by value, so again no luck with disabled this(this).Consider that std.container is an incomplete piece of work that is going to get an overhaul sometime "soon". Reasons are numerous but one is that sealed container concept couldn't be implemented well enough back then (hence need to copy or explicitly move on access). The second one is that the lack of allocators has pretty much blocked the development of std.container.What is interesting though is that range interfaces for containers do allow for moveFront() et al., and for some containers they're even defined. So it's safe to move contents *out* but not *in*? Is there some deeper technical reasoning behind this that I fail to see?moveFront/moveBack/moveAt is a shameful and unknown thing about ranges usually kept in a basement during public talks. Ask Andrei of where these things go. If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.2. "shared" is transitive. How transitive? Declaring something as "shared" means that all its representation is also "shared". This is a good thing, right?.It is a necessary thing.But in case when there are no indirections (i.e. a primitive type, or, more practically, a struct with some primitive fields and a bunch of methods that reason about that data or maybe do something with it) it all comes down to usage. In case of that queue, no two threads could possibly access the same data simultaneously. Let me define it real quick (somewhat contrived but should state the intent): struct Packet { ulong ID; ubyte[32] header; ubyte[64] data; string type() inout property { ... } ulong checkSum() inout property { ... } Variant payload() inout property { ... } } Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static.They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer; There are easily races on fields ID, header and data here. High level invariants that deal with ownership are not represented in D typesystem and hence exist only in the head of people writing code/comments and better be encapsulated in something manageable.As soon as a producer pushes such value, it releases ownership of it, and some consumer later gains ownership. Remember, there are no indirections, so no two threads could race against the same data. But I cannot just declare a plain struct and then start pushing it into that queue. It wouldn't work, because queue expects "shared" type.Depends on how you've defined push/pop. The thing is that both operation either accept local Packet or produce local Packet. void push(ref Packet p); //copy from local memory to the queue Packet pop(); //pop to the local memory But it is intrinsic to the way queue which is local (T1) --> shared --> local (T2) bridge of sorts T1, T2 being some threads.One solution would be to use a cast. On one hand, this is feasible: such data is really only logically shared when it's somewhere "in-between" threads, i.e. sits in a queue. The "shared" queue owns the data for a moment, and thus makes the data iself "shared". As soon as a consumer pops that value off a queue, it can be cast back to non-"shared".This is ugly: it imposes certain convention in handling one "shared" type (the queue) with another non-"shared" one (the struct): i.e. "always cast when push or pop". Convention is not a reasonable justification for overlooking type system.To simplify you put shared qualifier on type and then suddenly it doesn't do magic. Sad but you have to think at what happens where with sharing. Casts or no casts is an internal thing and sadly at the time pretty much required because compiler can't grasp ownership for you.Another solution would be to instantiate those structs as "shared" in the first place.And it doesn't work nicely because they are in fact not, only inside of the queue they are shared.In short: I don't need that container to be "shared" at all (provided it's a sane container that doesn't do anything else with the data except for storing it).There you are horribly wrong. All containers need to be aware of the fact that they are shared between threads else some ugly things are going to happen. (Simultaneous insert and removal from an shared array?)Therefore, final iteration of Queue would look like this: shared class Queue(T) { static if (hasUnsharedAliasing!T) alias shared(T) Type; else alias T Type; private __gshared Container!Type q; // still synchronized :) void push(Type) { ... } Type pop() { ... } }In essence all you need is a locking wrapper container that gives you shared interface around a mutable container or a container that is shared-aware by its nature (lock-free or whatever it may be inside). The need is well recognized and you are not alone in this.There is, however, a nag with this: __gshared is not safe. But getting rid of it would mean only one thing: the Queue could only ever store shared(T), which kind of kills initial message.Forget __gshared it's nothing but a transitive kludge.So, is "shared" really not as transitive as D wants it to be?It is. The problem with shared and immutable is that we humans by our very nature expect: immutable(SomeContainer!T) to mean the same as ImmutableSomeContainer!T Where ImmutableSomeContainer is almost the same thing but in fact is a different type that has immutable interface and handles immutable data well. Example - ImmutableVector vs Vector. ImmutableVector need not to known about things such as capacity, has different kind of range and is a different type in general not just Vector with some auto-magic restrictions bolted on top of it. The same could be said about shared. -- Dmitry Olshansky
Jan 24 2014
On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:2 unrelated questions would be better as 2 nice smaller posts. Just saying.I know. Sorry if it got you irritated. I have a feeling I'll be coming back with questions that are based on both move() and "shared" tied up together, hence I posted these together too.Consider that std.container is an incomplete piece of work...Understood. Thanks.moveFront/moveBack/moveAt is a shameful and unknown thing about ranges usually kept in a basement during public talks. Ask Andrei of where these things go.Now you've got me intrigued :)If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.Escaping... where? By "go with ref T" you mean return value of e.g. front()?Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.struct Packet { ulong ID; ubyte[32] header; ubyte[64] data; } Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static.They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;High level invariants that deal with ownership are not represented in D typesystem and hence exist only in the head of people writing code/comments and better be encapsulated in something manageable.But they actually are. immutable === doesn't care: share or own, you only ever can read it. Value type === single owner. Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.I'm losing you here. Without any __gshared "tricks" we have this: shared class Queue(T) { private Container!T q; void push(T); T pop(); } Note that the class Queue is "shared"-qualified, meaning that "q" is also "shared", meaning that (due to transitive nature of "shared"), the container ultimately stores shared(T). In fact, the definition above would even fail to compile in this case, because push should take shared(T) and pop should return shared(T) (since Queue will be dealing with a container of shared Ts, right?)It wouldn't work, because queue expects "shared" type.Depends on how you've defined push/pop.The thing is that both operation either accept local Packet or produce local Packet. void push(ref Packet p); //copy from local memory to the queue Packet pop(); //pop to the local memoryBut it is intrinsic to the way queue which is local (T1) --> shared --> local (T2) bridge of sorts T1, T2 being some threads.Umm... I did post an interface of Queue, did I not? Where'd that "ref Packet" come from? :) push() should take an rvalue, pop() should return an rvalue. The fact that pushing thread managed to create an rvalue to push means precisely that it no longer has any aliases to pushed data (remember, we're talking value types here). Same with popping thread: as soon as it gets the value, Queue no longer stores it. If Packet weren't value type (e.g. contained references), it'd break the first invariant I imposed in my original post and would fall into "must be "shared"" category.:D Was I that bad at explaining? Shared qualifier doesn't do any magic by itself anyway, it's our task as programmers to make it do what it means. I.e. put synchronization where it belongs.One solution would be to use a cast... This is ugly... Convention is not a reasonable justification for overlooking type system.To simplify you put shared qualifier on type and then suddenly it doesn't do magic.Sad but you have to think at what happens where with sharing.I do. Lately, probably too much :) That's why I posted this.Casts or no casts is an internal thing and sadly at the time pretty much required because compiler can't grasp ownership for you.It can't, and therefore it's my task to solve. I mean, in all of my example the only rock-solid shared thing is the queue. The queue would be the object all threads access simultaneously, the queue would be declared somewhere as "shared(Queue) theQueue;", the queue would be pushed and popped and fed and milked dry of its contents. But the contents, each individual packet, would *never* be accessed by more than one thread at a time, this is by design in this particular case. Generalizing that design to "always support shared" just doesn't make sense: why would I need to implement shared interface (and thus, shared access and all that comes with it) for those Packets if they're never used concurrently?Eggz-actly. And even this "sharing" is purely incidental, due to the very nature of both such queues and this particular type Packet. Consider an already existing example of all that jazz with producer-consumer queue: std.concurrency. Only in that case that would be multi producer single consumer. It is perfectly safe to send() value types all day. You don't have to design your structs (or your ints :) ) as shared in order to pass them from one thread to another in a message. That is as long as your types don't have any references or pointers, which would *have* to be "shared".Another solution would be to instantiate those structs as "shared" in the first place.And it doesn't work nicely because they are in fact not, only inside of the queue they are shared.There you are horribly wrong. All containers need to be aware of the fact that they are shared between threads else some ugly things are going to happen. (Simultaneous insert and removal from an shared array?)Umm... I don't follow the logic. As you suggest further below, all I'm doing is designing a container type instances of which can be safely shared between threads. I can't build that container using only other "shared"-enabled types, and why should I? The underlying "storage" container is forever encapsulated in my shared one, the shared one deals with synchronization.In essence all you need is a locking wrapper container that gives you shared interface around a mutable container or a container that is shared-aware by its nature (lock-free or whatever it may be inside).Not exactly. In essence what I need is a "shared" wrapper that deals with all synchronization issues (not necessarily locking). Why does whatever it uses for *storage* have to know anything about sharing? Classical implementations of such queues on top of dlists, ring buffers, other queues... I know that there are shared structures that can be implemented more efficiently entirely from scratch (Andrei proposes a couple as examples in TDPL, the latter being, in his own words, somewhat barbaric), i.e. that maintain their own storage. But that's not always required nor needed. If I'd reduce my Queue to single producer single consumer (a pipe) it can be happily built on top of a plain array plus a couple of pointers. Do I understand you correctly that *everything* down to the lowest level should be shared-enabled, i.e. thread-safe, i.e. synchronized? Should Phobos just have e.g. two versions each of Array, DList, RBTree, you-name-it, one shared, one - not? Following that logic, how would you build some shared conatiner on top of D arrays which are by their nature not synchronized?The need is well recognized and you are not alone in this.That's good to hear. That is, if I didn't lose your meaning completely :)It is. The problem with shared and immutable is that we humans by our very nature expect: immutable(SomeContainer!T) to mean the same as ImmutableSomeContainer!T Where ImmutableSomeContainer is almost the same thing but in fact is a different type that has immutable interface and handles immutable data well.I'd say that's more of a way of thinking imposed by some other language(s) rather than our human nature. Then again, the boundaries do blur sometimes :)Example - ImmutableVector vs Vector. ImmutableVector need not to known about things such as capacity, has different kind of range and is a different type in general not just Vector with some auto-magic restrictions bolted on top of it. The same could be said about shared.ImmutableVector is a strange concept altogether, but I understand what you mean here. What I still don't completely understand is the general message of "with "shared" everything should be "shared"". How can it be?
Jan 24 2014
25-Jan-2014 02:13, Stanislav Blinov пишет:On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:Which is the same as:2 unrelated questions would be better as 2 nice smaller posts. Just saying.I know. Sorry if it got you irritated. I have a feeling I'll be coming back with questions that are based on both move() and "shared" tied up together, hence I posted these together too.Consider that std.container is an incomplete piece of work...Understood. Thanks.moveFront/moveBack/moveAt is a shameful and unknown thing about ranges usually kept in a basement during public talks. Ask Andrei of where these things go.Now you've got me intrigued :)If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.Escaping... where? By "go with ref T" you mean return value of e.g. front()?Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.struct Packet { ulong ID; ubyte[32] header; ubyte[64] data; } Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static.They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;High level invariants that deal with ownership are not represented in D typesystem and hence exist only in the head of people writing code/comments and better be encapsulated in something manageable.But they actually are. immutable === doesn't care: share or own, you only ever can read it. Value type === single owner. Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.I'm losing you here. Without any __gshared "tricks" we have this: shared class Queue(T) { private Container!T q; void push(T); T pop(); }It wouldn't work, because queue expects "shared" type.Depends on how you've defined push/pop.class Queue(T) { shared private Container!T q; void push(T) shared; T pop() shared; } Note that the class Queue is "shared"-qualified, meaning that "q" is also "shared", meaning that (due to transitive nature of "shared"), the container ultimately stores shared(T).For functions only this pointer becomes shared. For data - yes, transitivity, turtles all the way down.In fact, the definition above would even fail to compile in this case, because push should take shared(T) and pop should return shared(T) (since Queue will be dealing with a container of shared Ts, right?)No Queue in your definition of Queue it may take local T no problem at all. It all depends on the contents of pop/push to understand if it compiles.Well seeing that Packet is such a fat type, I instinctively went for 'ref' :)The thing is that both operation either accept local Packet or produce local Packet. void push(ref Packet p); //copy from local memory to the queue Packet pop(); //pop to the local memoryBut it is intrinsic to the way queue which is local (T1) --> shared --> local (T2) bridge of sorts T1, T2 being some threads.Umm... I did post an interface of Queue, did I not? Where'd that "ref Packet" come from? :)push() should take an rvalue, pop() should return an rvalue. The fact that pushing thread managed to create an rvalue to push means precisely that it no longer has any aliases to pushed data (remember, we're talking value types here).If the queue ultimately copies the data to its store it doesn't matter if there are aliases to the original thread's local packet. From the type system point of view there already can't be aliases in the queue as it's shared an the argument/return type is not :)Same with popping thread: as soon as it gets the value, Queue no longer stores it. If Packet weren't value type (e.g. contained references), it'd break the first invariant I imposed in my original post and would fall into "must be "shared"" category.Given the interface alone it doesn't break anything. The whole "as soon as it gets value" is the only interesting part. There are different hack/tricks that could be used to make the queue go shared --copy-> local by hand. What I can deduce is that shared containers in general that return a copy may do a shallow unqual, i.e. mark one "level" of a indirections as not shared since it is a copy.Okay.:D Was I that bad at explaining? Shared qualifier doesn't do any magic by itself anyway, it's our task as programmers to make it do what it means. I.e. put synchronization where it belongs.One solution would be to use a cast... This is ugly... Convention is not a reasonable justification for overlooking type system.To simplify you put shared qualifier on type and then suddenly it doesn't do magic.Mmm as I've seen above there is no problem modeling it - that is taking local/mutable argument and put it into shared container.Sad but you have to think at what happens where with sharing.I do. Lately, probably too much :) That's why I posted this.Casts or no casts is an internal thing and sadly at the time pretty much required because compiler can't grasp ownership for you.It can't, and therefore it's my task to solve. I mean, in all of my example the only rock-solid shared thing is the queue. The queue would be the object all threads access simultaneously, the queue would be declared somewhere as "shared(Queue) theQueue;", the queue would be pushed and popped and fed and milked dry of its contents. But the contents, each individual packet, would *never* be accessed by more than one thread at a time, this is by design in this particular case.Generalizing that design to "always support shared" just doesn't make sense: why would I need to implement shared interface (and thus, shared access and all that comes with it) for those Packets if they're never used concurrently?You don't need shared interface for Packets to store them in a shared container. In this case only because of no indirections property of Packets.Yes, by locking wrapper I meant exactly this. I just have no idea how you'd generally wrap unknown beforehand container otherwise (well there are spinlocks and whatnot, but still locks).In essence all you need is a locking wrapper container that gives you shared interface around a mutable container or a container that is shared-aware by its nature (lock-free or whatever it may be inside).Not exactly. In essence what I need is a "shared" wrapper that deals with all synchronization issues (not necessarily locking). Why does whatever it uses for *storage* have to know anything about sharing? Classical implementations of such queues on top of dlists, ring buffers, other queues...Do I understand you correctly that *everything* down to the lowest level should be shared-enabled, i.e. thread-safe, i.e. synchronized? Should Phobos just have e.g. two versions each of Array, DList, RBTree, you-name-it, one shared, one - not?No, for many wrapper is enough or even all the boundary of what we could do. What's needed is more containers that are (by their nature) shared (thread-safe). See Java's ConcurrentHashMap.Following that logic, how would you build some shared conatiner on top of D arrays which are by their nature not synchronized?New type with a shared interface, lock and hacks with casts inside.There was a lot of confusion, partly because the problem statement was defined as "rant" ;) I've meant here the need for shared containers and/or wrappers around non-shared ones.The need is well recognized and you are not alone in this.That's good to hear. That is, if I didn't lose your meaning completely :)For _data_ it must else you allow low-level races. "Everything" is bit too general a word. -- Dmitry OlshanskyExample - ImmutableVector vs Vector. ImmutableVector need not to known about things such as capacity, has different kind of range and is a different type in general not just Vector with some auto-magic restrictions bolted on top of it. The same could be said about shared.ImmutableVector is a strange concept altogether, but I understand what you mean here. What I still don't completely understand is the general message of "with "shared" everything should be "shared"". How can it be?
Jan 25 2014
On Saturday, 25 January 2014 at 10:03:55 UTC, Dmitry Olshansky wrote:It can already be inferred from the interface. The Queue uses some Container as storage. Obviously push()/pop() would actually access that storage to store/remove elements. If Container does not support "shared", we're in trouble. Let's get a little more concrete: http://dpaste.dzfl.pl/5bd15697 A dastardly simple implementation of Queue. Note that Queue is being designed as shared type, what it uses for storage is its business. However, due to transitivity of "shared", it cannot really use anything *but* other "shared" containers. Therefore, we either need casts or __gshared :( Note that "T" for Queue is "Packet", bur for Container it's "shared(Packet)". Well, ok. Let's take a step back and forget about existing containers. I could implement the storage myself, inside the Queue type, right? Something like that: class Queue(T) { struct Node { T value; Node* prev, next; } private Node* head, tail; // Implementation follows... } I'd implement the whole business with doubly-linked list, maybe even employ some interesting algorithm to make the whole thing lock-free... But again, transitivity of "shared" now dictates that Queue can only store shared(T). Now about this Queue type, it has a very interesting property: it's a shared data structure that never allows more than one thread access to any single stored item at a time. It allows to push() and pop() concurrently (well, ok, this one is blocking). But it does not have random access, in fact, in its public interface it doesn't have anything *but* push() and pop()! Therefore, even though the Queue is a shared type (and instances of it will always be shared), it can *safely* store non-shared data. This does not mean that it *shouldn't* store shared data, it just means that the transitivity of "shared" can be lifted in one particular case: iff !hasUnsharedAliasing!(T). That's the same basis upon which std.concurrency is built. And that's why I went through that whole business with __gshared in my original post :)In fact, the definition above would even fail to compile in this case, because push should take shared(T) and pop should return shared(T) (since Queue will be dealing with a container of shared Ts, right?)No Queue in your definition of Queue it may take local T no problem at all. It all depends on the contents of pop/push to understand if it compiles.Aww, 'tis not fat, just a little overweight :DWhere'd that "ref Packet" come from? :)Well seeing that Packet is such a fat type, I instinctively went for 'ref' :)Eggz-actly. But this is only true for data types such as that Packet. If it contained pointers or references... no go.push() should take an rvalue, pop() should return an rvalue. The fact that pushing thread managed to create an rvalue to push means precisely that it no longer has any aliases to pushed data (remember, we're talking value types here).If the queue ultimately copies the data to its store it doesn't matter if there are aliases to the original thread's local packet.From the type system point of view there already can't be aliases in the queue as it's shared an the argument/return type is not :)You don't know that beforehand. Unlike its guts, which dictate that what it stores is in fact shared(T), it's interface does not say anything about T. T is T. Queue could be instantiated as Queue!(shared(FancyType)) too :)Given the interface alone it doesn't break anything. The whole "as soon as it gets value" is the only interesting part. There are different hack/tricks that could be used to make the queue go shared --copy-> local by hand.What I can deduce is that shared containers in general that return a copy may do a shallow unqual, i.e. mark one "level" of a indirections as not shared since it is a copy.Almost my thinking. In fact, there's no sense in *copying* shared data anyway. Moving - yes. OTOH, copying *references* to shared data is perfectly fine.Yes, but either by using casts, or __gshared :)But the contents, each individual packet, would *never* be accessed by more than one thread at a time, this is by design in this particular case.Mmm as I've seen above there is no problem modeling it - that is taking local/mutable argument and put it into shared container.You don't need shared interface for Packets to store them in a shared container. In this case only because of no indirections property of Packets.That would depend on a container. As you've demostrated earlier, if a container did allow concurrent access to its items, the Packets could provide a world of hurt. The Queue does not allow such things, therefore it's indeed safe.Yes, by locking wrapper I meant exactly this. I just have no idea how you'd generally wrap unknown beforehand container otherwise (well there are spinlocks and whatnot, but still locks).Of course it's not unknown beforehand. I've already mentioned that such queues could be built on top of dlists, other queues, ring buffers, you-name-it. But if those are not "shared", we again come back to all these shennanigans with casts or __gshared.Whew :)Do I understand you correctly that *everything* down to the lowest level should be shared-enabled?..No, for many wrapper is enough or even all the boundary of what we could do. What's needed is more containers that are (by their nature) shared (thread-safe). See Java's ConcurrentHashMap.Just like the Queue from my original post :DFollowing that logic, how would you build some shared conatiner on top of D arrays which are by their nature not synchronized?New type with a shared interface, lock and hacks with casts inside.Hmm... I'll take that into my reply to your next message ;)What I still don't completely understand is the general message of "with "shared" everything should be "shared"". How can it be?For _data_ it must else you allow low-level races. "Everything" is bit too general a word.
Jan 25 2014
25-Jan-2014 02:13, Stanislav Blinov пишет:On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely. Other links: http://forum.dlang.org/thread/bug-5541-3 http.d.puremagic.com/issues/ https://github.com/D-Programming-Language/dmd/pull/1903If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.Escaping... where? By "go with ref T" you mean return value of e.g. front()?It makes sense as in memory being visible from many threads. E.g. buffer[0].ID is accessible via atomic CAS operation. Same with header and data fields.Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.struct Packet { ulong ID; ubyte[32] header; ubyte[64] data; } Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static.They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;High level invariants that deal with ownership are not represented in D typesystem and hence exist only in the head of people writing code/comments and better be encapsulated in something manageable.But they actually are. immutable === doesn't care: share or own, you only ever can read it.Value type === single owner.No. You may have many pointers to one shared value type. And most certainly not '==='. Anyway I was talking about ownership as in objects that own other objects. See for example these 2: http://bartoszmilewski.com/2009/09/22/ownership-systems-against-data-races/ http://bartoszmilewski.com/2009/06/02/race-free-multithreading-ownership/Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.And that's the center point. D's type system doesn't statically ensure if there are references present or not. And shared just implies "may be referenced from many threads" and local/mutable is "may be referenced from current thread". Not a single thing in the D type system makes sure e.g. that unique value has exactly one reference to it. See e.g. Rust where authors introduce such notion in the type system. -- Dmitry Olshansky
Jan 25 2014
On Saturday, 25 January 2014 at 11:51:04 UTC, Dmitry Olshansky wrote:Aha, got it. Thanks, I'll be looking into that.Escaping... where? By "go with ref T" you mean return value of e.g. front()?Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely. Other links: http://forum.dlang.org/thread/bug-5541-3 http.d.puremagic.com/issues/ https://github.com/D-Programming-Language/dmd/pull/1903Yup, but that's not the case with Queue, as I hope is already established in my previous reply.I've got this case covered further down. Of course they can [introduce aliasing] this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.It makes sense as in memory being visible from many threads. E.g. buffer[0].ID is accessible via atomic CAS operation. Same with header and data fields.Pointers are references. Whatever one thread does with pointers to its own data is its business and has nothing to do with D's "shared" qualifier. Yes, it's sharing too, only it's not meant in the context "shared between threads". I didn't talk about ownership as in "one object owns another". I was talking about ownership of data in the context of threads. You cannot share a value type with another thread, unless you give it a pointer. And you can't do that in idiomatic D, because you're not allowed to "send" a pointer-to-non-"shared" to another thread.Value type === single owner.No. You may have many pointers to one shared value type. And most certainly not '==='.Anyway I was talking about ownership as in objects that own other objects. See for example these 2: http://bartoszmilewski.com/2009/09/22/ownership-systems-against-data-races/ http://bartoszmilewski.com/2009/06/02/race-free-multithreading-ownership/I've read that :)Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.And that's the center point. D's type system doesn't statically ensure if there are references present or not. And shared just implies "may be referenced from many threads" and local/mutable is "may be referenced from current thread". Not a single thing in the D type system makes sure e.g. that unique value has exactly one reference to it.I've been having fun with various approaches to implementing Unique. I have some ideas on where it can be improved, but without language support, this is gruesome indeed.See e.g. Rust where authors introduce such notion in the type system.I've never investigated Rust, never hurts to take a look, I guess...
Jan 25 2014
On 1/25/14 3:51 AM, Dmitry Olshansky wrote:25-Jan-2014 02:13, Stanislav Blinov пишет:We plan to disallow taking address of a ref return from a function. AndreiOn Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely.If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.Escaping... where? By "go with ref T" you mean return value of e.g. front()?
Jan 25 2014
On Saturday, 25 January 2014 at 18:47:17 UTC, Andrei Alexandrescu wrote:We plan to disallow taking address of a ref return from a function.What about sneaky cases? E.g. like this: NullableRef!T nref(T)(ref T v) { return NullableRef!T(&v); } struct Container(T) { //... ref inout(T) front() inout { ... } } Container!int cnt; //... auto n = nref(cnt.front());
Jan 25 2014
On Sunday, 26 January 2014 at 07:21:15 UTC, Stanislav Blinov wrote:On Saturday, 25 January 2014 at 18:47:17 UTC, Andrei Alexandrescu wrote:I believe I need to clarify the question. If a type stored in container has disabled this(this), then then only way to go about it is to return int by ref. Ranges iterating over this container should respect that, right? But in that case, most of the tasty features of std.range would cease to work, because they rely on an assumption that an element could be forwarded through the interfaces of all involved ranges during the front() call. For example, std.range.lockstep: it won't compile for ranges that return by ref, because isInputRange expects { auto x = r.front(); } to compile, and it won't (copying is disabled). Currently this could be circumvented by using that NullableRef "trick" from above. How would we go about forwarding ref through the interface of ranges if taking the address will not be allowed?We plan to disallow taking address of a ref return from a function.What about sneaky cases? E.g. like this: NullableRef!T nref(T)(ref T v) { return NullableRef!T(&v); } struct Container(T) { //... ref inout(T) front() inout { ... } } Container!int cnt; //... auto n = nref(cnt.front());
Jan 26 2014
On Sunday, 26 January 2014 at 12:01:18 UTC, Stanislav Blinov wrote:then only way to go about it is to return int by ref.s/int/it/
Jan 26 2014