www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - A question about move() and a rant about shared

reply "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
Ok, this is going to be a long one, so please bear with me.

I'll start with a question.

1. std.algorithm.move() and std.container

TDPL describes when a compiler can and cannot perform a move 
automatically.
For cases when it isn't done automatically but we explicitly 
require a move, we have
std.algorithm.move(). This function comes in extremely handy 
especially when we
want to pass around data that cannot be otherwise copied 
(disabled this(this)).
For example, sinking some unique value into a thread or storing 
it in a container.
Or even both.

But why is there no practical way of storing such uncopyable data 
in standard
containers? I.e. both Array and DList do try perform a copy when 
insert() is called,
and happily fail when this(this) is  disabled. Same with access: 
front() returns
by value, so again no luck with  disabled this(this).

What is interesting though is that range interfaces for containers
do allow for moveFront() et al., and for some containers they're 
even defined.
So it's safe to move contents *out* but not *in*?
Is there some deeper technical reasoning behind this that I fail 
to see?




Below is a medium rant that's somewhat unrelated to the above, 
and is
aimed at receiving insights from those who're interested,
so if you're not, just skip it :)

I'll use quotes here to distinguish words from language qualifier.
This is mostly my current thoughts on "shared" and its usage, and 
I'd like
that you could point out where I'm wrong in this sensitive topic, 
any feedback
is greately appreciated.


2. "shared" is transitive. How transitive?

Declaring something as "shared" means that all its representation 
is also "shared".
This is a good thing, right?. But it does have certain 
implications. Consider a shared
data structure (for example, a (multi)producer (multi)consumer 
queue). If it's designed
to store anything that has indirections (pointers, references), 
those are better be
either provably unique (not possble in D except for immutable 
data),
or "shared". In fact, the whole stored type should just be 
"shared",
which is enforced by the compiler. Thus, we come to this:

shared class Queue(T) {
	private Container!T q;  // that'll be shared(Container!T),
	                        // which will in turn store shared(T)
	alias shared(T) Type;

	// push and pop are of course synchronized
	void push(Type) { ... }
	Type pop() { ... }
}

But in case when there are no indirections (i.e. a primitive 
type, or, more practically,
a struct with some primitive fields and a bunch of methods that 
reason about that data
or maybe do something with it) it all comes down to usage. In 
case of that queue, no two threads
could possibly access the same data simultaneously.
Let me define it real quick (somewhat contrived but should state 
the intent):

struct Packet {
	ulong ID;
	ubyte[32] header;
	ubyte[64] data;

	string type() inout  property { ... }
	ulong checkSum() inout  property { ... }
	Variant payload() inout  property { ... }
}

Note that I do have arrays in there, but they cannot possibly 
introduce any aliasing,
since they're static.

As soon as a producer pushes such value, it releases ownership of 
it, and some consumer
later gains ownership. Remember, there are no indirections, so no 
two threads could race
against the same data. But I cannot just declare a plain struct 
and then start
pushing it into that queue. It wouldn't work, because queue 
expects "shared" type.

One solution would be to use a cast. On one hand, this is 
feasible: such data is
really only logically shared when it's somewhere "in-between" 
threads, i.e. sits in a queue.
The "shared" queue owns the data for a moment, and thus makes the 
data iself "shared".
As soon as a consumer pops that value off a queue, it can be cast 
back to non-"shared".

This is ugly: it imposes certain convention in handling one 
"shared"
type (the queue) with another non-"shared" one (the struct): i.e. 
"always cast when push or pop".
Convention is not a reasonable justification for overlooking type 
system.

Another solution would be to instantiate those structs as 
"shared" in the first place.
But that won't work either: now any methods that those structs 
have must also have
"shared" overloads. In other words, I suddenly need to provide 
"shared" interface
for my struct. Well ok, I can do that trivially, by just 
declaring the whole struct type
as "shared". But this is wrong. "shared" advertises certain 
promise: this data is allowed to be
accessed by more than one thread at a time. This implies that 
access to the data
is better be synchronized. In other words, I would have to 
actually *write* the synchronization
for something that would never *need* synchronization. If I don't 
do it and simply
leave the struct declared as "shared" (or have all relevant 
"shared" overloads), I'm
shooting someone in the foot: imagine that later someone starts 
using my types, sees that this struct
is declared "shared" and happily assumes that it can be used 
concurrently. In that
contrived example it would be easy to see that's not actually the 
case. But reality is cruel.
Bang.

So ideally I'd want the queue to handle this situation for me, 
and luckily I can:

shared class Queue(T) {
	static if (hasUnsharedAliasing!T) {
		private Container!T q;           // as before
		alias shared(T) Type;
	} else {
		private __gshared Container!T q; // nothing is "shared" here
		alias T Type;
	}

	// push and pop are still synchronized :)
	void push(Type) { ... }
	Type pop() { ... }
}

But that's not the end of it. As seen from that definition, I'm 
using some
container (Container(T)) as actual storage. Current definition of 
Queue
requires that Container(T) have "shared" interface. Either that, 
or implement
the whole storing business myself right there in the Queue. The 
latter is certainly not
feasible, especially since, depending on requirements for the 
Queue,
I may need different storage capabilities.
In short: I don't need that container to be "shared" at all 
(provided it's a sane container
that doesn't do anything else with the data except for storing 
it). And in fact, if it were
"shared" already, I wouldn't need to define Queue at all, I'd 
just use Container directly.

Therefore, final iteration of Queue would look like this:

shared class Queue(T) {
	static if (hasUnsharedAliasing!T)
		alias shared(T) Type;
	else
		alias T Type;

	private __gshared Container!Type q;

	// still synchronized :)
	void push(Type) { ... }
	Type pop() { ... }	
}

All synchronization issues are handled by Queue, Container merely 
stores the data,
which is in turn "shared" or not depending on its representation.

This is conceptually how std.concurrency works: it allows you to 
send and receive
plain value types without imposing any "shared" qualification on 
them, but as soon
as you try to send a non-"shared" reference type or a struct with 
non-"shared" pointers,
it won't allow it.

There is, however, a nag with this: __gshared is not  safe. But 
getting rid of it
would mean only one thing: the Queue could only ever store 
shared(T), which
kind of kills initial message.

So, is "shared" really not as transitive as D wants it to be?

I imagine by now you already have a big list of "you're 
incorrect"s to stick in my face,
or you probably have already stopped reading :)

I have some more doubts regarding my handling of "shared", but 
I'll leave them for later so as
to not bore you to death.
Jan 24 2014
next sibling parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
Ouch. Please excuse the formatting. I didn't pay attention when 
pasting this :|
Jan 24 2014
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Jan-2014 21:07, Stanislav Blinov пишет:
 Ok, this is going to be a long one, so please bear with me.

 I'll start with a question.
2 unrelated questions would be better as 2 nice smaller posts. Just saying.
 1. std.algorithm.move() and std.container
[snip]
 But why is there no practical way of storing such uncopyable data in
 standard
 containers? I.e. both Array and DList do try perform a copy when
 insert() is called,
 and happily fail when this(this) is  disabled. Same with access: front()
 returns
 by value, so again no luck with  disabled this(this).
Consider that std.container is an incomplete piece of work that is going to get an overhaul sometime "soon". Reasons are numerous but one is that sealed container concept couldn't be implemented well enough back then (hence need to copy or explicitly move on access). The second one is that the lack of allocators has pretty much blocked the development of std.container.
 What is interesting though is that range interfaces for containers
 do allow for moveFront() et al., and for some containers they're even
 defined.
 So it's safe to move contents *out* but not *in*?
 Is there some deeper technical reasoning behind this that I fail to see?
moveFront/moveBack/moveAt is a shameful and unknown thing about ranges usually kept in a basement during public talks. Ask Andrei of where these things go. If I understand the current direction is to go with ref T and make some language level/compiler level changes to the escaping reference.
 2. "shared" is transitive. How transitive?

 Declaring something as "shared" means that all its representation is
 also "shared".
 This is a good thing, right?.
It is a necessary thing.
 But in case when there are no indirections (i.e. a primitive type, or,
 more practically,
 a struct with some primitive fields and a bunch of methods that reason
 about that data
 or maybe do something with it) it all comes down to usage.
 In case of
 that queue, no two threads
 could possibly access the same data simultaneously.
 Let me define it real quick (somewhat contrived but should state the
 intent):

 struct Packet {
      ulong ID;
      ubyte[32] header;
      ubyte[64] data;

      string type() inout  property { ... }
      ulong checkSum() inout  property { ... }
      Variant payload() inout  property { ... }
 }

 Note that I do have arrays in there, but they cannot possibly introduce
 any aliasing,
 since they're static.
They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer; There are easily races on fields ID, header and data here. High level invariants that deal with ownership are not represented in D typesystem and hence exist only in the head of people writing code/comments and better be encapsulated in something manageable.
 As soon as a producer pushes such value, it releases ownership of it,
 and some consumer
 later gains ownership. Remember, there are no indirections, so no two
 threads could race
 against the same data. But I cannot just declare a plain struct and then
 start
 pushing it into that queue. It wouldn't work, because queue expects
 "shared" type.
Depends on how you've defined push/pop. The thing is that both operation either accept local Packet or produce local Packet. void push(ref Packet p); //copy from local memory to the queue Packet pop(); //pop to the local memory But it is intrinsic to the way queue which is local (T1) --> shared --> local (T2) bridge of sorts T1, T2 being some threads.
 One solution would be to use a cast. On one hand, this is feasible: such
 data is
 really only logically shared when it's somewhere "in-between" threads,
 i.e. sits in a queue.
 The "shared" queue owns the data for a moment, and thus makes the data
 iself "shared".
 As soon as a consumer pops that value off a queue, it can be cast back
 to non-"shared".
 This is ugly: it imposes certain convention in handling one "shared"
 type (the queue) with another non-"shared" one (the struct): i.e.
 "always cast when push or pop".
 Convention is not a reasonable justification for overlooking type system.
To simplify you put shared qualifier on type and then suddenly it doesn't do magic. Sad but you have to think at what happens where with sharing. Casts or no casts is an internal thing and sadly at the time pretty much required because compiler can't grasp ownership for you.
 Another solution would be to instantiate those structs as "shared" in
 the first place.
And it doesn't work nicely because they are in fact not, only inside of the queue they are shared.
 In short: I don't need that container to be "shared" at all (provided
 it's a sane container
 that doesn't do anything else with the data except for storing it).
There you are horribly wrong. All containers need to be aware of the fact that they are shared between threads else some ugly things are going to happen. (Simultaneous insert and removal from an shared array?)
 Therefore, final iteration of Queue would look like this:

 shared class Queue(T) {
      static if (hasUnsharedAliasing!T)
          alias shared(T) Type;
      else
          alias T Type;

      private __gshared Container!Type q;

      // still synchronized :)
      void push(Type) { ... }
      Type pop() { ... }
 }
In essence all you need is a locking wrapper container that gives you shared interface around a mutable container or a container that is shared-aware by its nature (lock-free or whatever it may be inside). The need is well recognized and you are not alone in this.
 There is, however, a nag with this: __gshared is not  safe. But getting
 rid of it
 would mean only one thing: the Queue could only ever store shared(T), which
 kind of kills initial message.
Forget __gshared it's nothing but a transitive kludge.
 So, is "shared" really not as transitive as D wants it to be?
It is. The problem with shared and immutable is that we humans by our very nature expect: immutable(SomeContainer!T) to mean the same as ImmutableSomeContainer!T Where ImmutableSomeContainer is almost the same thing but in fact is a different type that has immutable interface and handles immutable data well. Example - ImmutableVector vs Vector. ImmutableVector need not to known about things such as capacity, has different kind of range and is a different type in general not just Vector with some auto-magic restrictions bolted on top of it. The same could be said about shared. -- Dmitry Olshansky
Jan 24 2014
parent reply "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky 
wrote:

 2 unrelated questions would be better as 2 nice smaller posts.
 Just saying.
I know. Sorry if it got you irritated. I have a feeling I'll be coming back with questions that are based on both move() and "shared" tied up together, hence I posted these together too.
 Consider that std.container is an incomplete piece of work...
Understood. Thanks.
 moveFront/moveBack/moveAt is a shameful and unknown thing about 
 ranges usually kept in a basement during public talks. Ask 
 Andrei of where these things go.
Now you've got me intrigued :)
 If I understand the current direction is to go with ref T and 
 make some language
 level/compiler level changes to the escaping reference.
Escaping... where? By "go with ref T" you mean return value of e.g. front()?
 struct Packet {
     ulong ID;
     ubyte[32] header;
     ubyte[64] data;
 }

 Note that I do have arrays in there, but they cannot possibly 
 introduce any aliasing, since they're static.
They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;
Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.
 High level invariants that deal with ownership are not 
 represented in D typesystem and hence exist only in the head of 
 people writing code/comments and better be encapsulated in 
 something manageable.
But they actually are. immutable === doesn't care: share or own, you only ever can read it. Value type === single owner. Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.
 It wouldn't work, because queue expects "shared" type.
Depends on how you've defined push/pop.
I'm losing you here. Without any __gshared "tricks" we have this: shared class Queue(T) { private Container!T q; void push(T); T pop(); } Note that the class Queue is "shared"-qualified, meaning that "q" is also "shared", meaning that (due to transitive nature of "shared"), the container ultimately stores shared(T). In fact, the definition above would even fail to compile in this case, because push should take shared(T) and pop should return shared(T) (since Queue will be dealing with a container of shared Ts, right?)
 The thing is that both operation either accept local Packet or 
 produce local Packet.
 void push(ref Packet p);  //copy from local memory to the queue
 Packet pop(); //pop to the local memory
 But it is intrinsic to the way queue which is local (T1) --> 
 shared --> local (T2) bridge of sorts T1, T2 being some threads.
Umm... I did post an interface of Queue, did I not? Where'd that "ref Packet" come from? :) push() should take an rvalue, pop() should return an rvalue. The fact that pushing thread managed to create an rvalue to push means precisely that it no longer has any aliases to pushed data (remember, we're talking value types here). Same with popping thread: as soon as it gets the value, Queue no longer stores it. If Packet weren't value type (e.g. contained references), it'd break the first invariant I imposed in my original post and would fall into "must be "shared"" category.
 One solution would be to use a cast...
 This is ugly...
 Convention is not a reasonable justification for overlooking 
 type system.
To simplify you put shared qualifier on type and then suddenly it doesn't do magic.
:D Was I that bad at explaining? Shared qualifier doesn't do any magic by itself anyway, it's our task as programmers to make it do what it means. I.e. put synchronization where it belongs.
 Sad but you have to think at what happens where with sharing.
I do. Lately, probably too much :) That's why I posted this.
 Casts or no casts is an internal thing and sadly at the time 
 pretty much required because compiler can't grasp ownership for 
 you.
It can't, and therefore it's my task to solve. I mean, in all of my example the only rock-solid shared thing is the queue. The queue would be the object all threads access simultaneously, the queue would be declared somewhere as "shared(Queue) theQueue;", the queue would be pushed and popped and fed and milked dry of its contents. But the contents, each individual packet, would *never* be accessed by more than one thread at a time, this is by design in this particular case. Generalizing that design to "always support shared" just doesn't make sense: why would I need to implement shared interface (and thus, shared access and all that comes with it) for those Packets if they're never used concurrently?
 Another solution would be to instantiate those structs as 
 "shared" in
 the first place.
And it doesn't work nicely because they are in fact not, only inside of the queue they are shared.
Eggz-actly. And even this "sharing" is purely incidental, due to the very nature of both such queues and this particular type Packet. Consider an already existing example of all that jazz with producer-consumer queue: std.concurrency. Only in that case that would be multi producer single consumer. It is perfectly safe to send() value types all day. You don't have to design your structs (or your ints :) ) as shared in order to pass them from one thread to another in a message. That is as long as your types don't have any references or pointers, which would *have* to be "shared".
 There you are horribly wrong. All containers need to be aware 
 of the fact that they are shared between threads else some ugly 
 things are going to happen. (Simultaneous insert and removal 
 from an shared array?)
Umm... I don't follow the logic. As you suggest further below, all I'm doing is designing a container type instances of which can be safely shared between threads. I can't build that container using only other "shared"-enabled types, and why should I? The underlying "storage" container is forever encapsulated in my shared one, the shared one deals with synchronization.
 In essence all you need is a locking wrapper container that 
 gives you shared interface around a mutable container or a 
 container that is shared-aware by its nature (lock-free or 
 whatever it may be inside).
Not exactly. In essence what I need is a "shared" wrapper that deals with all synchronization issues (not necessarily locking). Why does whatever it uses for *storage* have to know anything about sharing? Classical implementations of such queues on top of dlists, ring buffers, other queues... I know that there are shared structures that can be implemented more efficiently entirely from scratch (Andrei proposes a couple as examples in TDPL, the latter being, in his own words, somewhat barbaric), i.e. that maintain their own storage. But that's not always required nor needed. If I'd reduce my Queue to single producer single consumer (a pipe) it can be happily built on top of a plain array plus a couple of pointers. Do I understand you correctly that *everything* down to the lowest level should be shared-enabled, i.e. thread-safe, i.e. synchronized? Should Phobos just have e.g. two versions each of Array, DList, RBTree, you-name-it, one shared, one - not? Following that logic, how would you build some shared conatiner on top of D arrays which are by their nature not synchronized?
 The need is well recognized and you are not alone in this.
That's good to hear. That is, if I didn't lose your meaning completely :)
 It is. The problem with shared and immutable is that we humans 
 by our very nature expect:

 immutable(SomeContainer!T)

 to mean the same as
 ImmutableSomeContainer!T

 Where ImmutableSomeContainer is almost the same thing but in 
 fact is a different type that has immutable interface and 
 handles immutable data well.
I'd say that's more of a way of thinking imposed by some other language(s) rather than our human nature. Then again, the boundaries do blur sometimes :)
 Example - ImmutableVector vs Vector.
 ImmutableVector need not to known about things such as 
 capacity, has different kind of range and is a different type 
 in general not just Vector with some auto-magic restrictions 
 bolted on top of it.

 The same could be said about shared.
ImmutableVector is a strange concept altogether, but I understand what you mean here. What I still don't completely understand is the general message of "with "shared" everything should be "shared"". How can it be?
Jan 24 2014
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Jan-2014 02:13, Stanislav Blinov пишет:
 On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:

 2 unrelated questions would be better as 2 nice smaller posts.
 Just saying.
I know. Sorry if it got you irritated. I have a feeling I'll be coming back with questions that are based on both move() and "shared" tied up together, hence I posted these together too.
 Consider that std.container is an incomplete piece of work...
Understood. Thanks.
 moveFront/moveBack/moveAt is a shameful and unknown thing about ranges
 usually kept in a basement during public talks. Ask Andrei of where
 these things go.
Now you've got me intrigued :)
 If I understand the current direction is to go with ref T and make
 some language
 level/compiler level changes to the escaping reference.
Escaping... where? By "go with ref T" you mean return value of e.g. front()?
 struct Packet {
     ulong ID;
     ubyte[32] header;
     ubyte[64] data;
 }

 Note that I do have arrays in there, but they cannot possibly
 introduce any aliasing, since they're static.
They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;
Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.
 High level invariants that deal with ownership are not represented in
 D typesystem and hence exist only in the head of people writing
 code/comments and better be encapsulated in something manageable.
But they actually are. immutable === doesn't care: share or own, you only ever can read it. Value type === single owner. Because you can only ever give away the value entirely (move() it) or a copy of it (which is coincidentally both shallow and deep copy). References (any kind thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice to have unique references). But that's at least a bare base.
 It wouldn't work, because queue expects "shared" type.
Depends on how you've defined push/pop.
I'm losing you here. Without any __gshared "tricks" we have this: shared class Queue(T) { private Container!T q; void push(T); T pop(); }
Which is the same as:
 class Queue(T) {
      shared private Container!T q;

      void push(T) shared;
      T pop() shared;
 }

 Note that the class Queue is "shared"-qualified, meaning that "q" is
 also "shared", meaning that (due to transitive nature of "shared"), the
 container ultimately stores shared(T).
For functions only this pointer becomes shared. For data - yes, transitivity, turtles all the way down.
 In fact, the definition above
 would even fail to compile in this case, because push should take
 shared(T) and pop should return shared(T) (since Queue will be dealing
 with a container of shared Ts, right?)
No Queue in your definition of Queue it may take local T no problem at all. It all depends on the contents of pop/push to understand if it compiles.
 The thing is that both operation either accept local Packet or produce
 local Packet.
 void push(ref Packet p);  //copy from local memory to the queue
 Packet pop(); //pop to the local memory
 But it is intrinsic to the way queue which is local (T1) --> shared
 --> local (T2) bridge of sorts T1, T2 being some threads.
Umm... I did post an interface of Queue, did I not? Where'd that "ref Packet" come from? :)
Well seeing that Packet is such a fat type, I instinctively went for 'ref' :)
 push() should take an rvalue, pop() should return
 an rvalue. The fact that pushing thread managed to create an rvalue to
 push means precisely that it no longer has any aliases to pushed data
 (remember, we're talking value types here).
If the queue ultimately copies the data to its store it doesn't matter if there are aliases to the original thread's local packet. From the type system point of view there already can't be aliases in the queue as it's shared an the argument/return type is not :)
 Same with popping thread: as
 soon as it gets the value, Queue no longer stores it.
 If Packet weren't value type (e.g. contained references), it'd break the
 first invariant I imposed in my original post and would fall into "must
 be "shared"" category.
Given the interface alone it doesn't break anything. The whole "as soon as it gets value" is the only interesting part. There are different hack/tricks that could be used to make the queue go shared --copy-> local by hand. What I can deduce is that shared containers in general that return a copy may do a shallow unqual, i.e. mark one "level" of a indirections as not shared since it is a copy.
 One solution would be to use a cast...
 This is ugly...
 Convention is not a reasonable justification for overlooking type
 system.
To simplify you put shared qualifier on type and then suddenly it doesn't do magic.
:D Was I that bad at explaining? Shared qualifier doesn't do any magic by itself anyway, it's our task as programmers to make it do what it means. I.e. put synchronization where it belongs.
Okay.
 Sad but you have to think at what happens where with sharing.
I do. Lately, probably too much :) That's why I posted this.
 Casts or no casts is an internal thing and sadly at the time pretty
 much required because compiler can't grasp ownership for you.
It can't, and therefore it's my task to solve. I mean, in all of my example the only rock-solid shared thing is the queue. The queue would be the object all threads access simultaneously, the queue would be declared somewhere as "shared(Queue) theQueue;", the queue would be pushed and popped and fed and milked dry of its contents. But the contents, each individual packet, would *never* be accessed by more than one thread at a time, this is by design in this particular case.
Mmm as I've seen above there is no problem modeling it - that is taking local/mutable argument and put it into shared container.
 Generalizing that design to "always support shared" just doesn't make
 sense: why would I need to implement shared interface (and thus, shared
 access and all that comes with it) for those Packets if they're never
 used concurrently?
You don't need shared interface for Packets to store them in a shared container. In this case only because of no indirections property of Packets.
 In essence all you need is a locking wrapper container that gives you
 shared interface around a mutable container or a container that is
 shared-aware by its nature (lock-free or whatever it may be inside).
Not exactly. In essence what I need is a "shared" wrapper that deals with all synchronization issues (not necessarily locking). Why does whatever it uses for *storage* have to know anything about sharing? Classical implementations of such queues on top of dlists, ring buffers, other queues...
Yes, by locking wrapper I meant exactly this. I just have no idea how you'd generally wrap unknown beforehand container otherwise (well there are spinlocks and whatnot, but still locks).
 Do I understand you correctly that *everything* down to the lowest level
 should be shared-enabled, i.e. thread-safe, i.e. synchronized? Should
 Phobos just have e.g. two versions each of Array, DList, RBTree,
 you-name-it, one shared, one - not?
No, for many wrapper is enough or even all the boundary of what we could do. What's needed is more containers that are (by their nature) shared (thread-safe). See Java's ConcurrentHashMap.
 Following that logic, how would you
 build some shared conatiner on top of D arrays which are by their nature
 not synchronized?
New type with a shared interface, lock and hacks with casts inside.
 The need is well recognized and you are not alone in this.
That's good to hear. That is, if I didn't lose your meaning completely :)
There was a lot of confusion, partly because the problem statement was defined as "rant" ;) I've meant here the need for shared containers and/or wrappers around non-shared ones.
 Example - ImmutableVector vs Vector.
 ImmutableVector need not to known about things such as capacity, has
 different kind of range and is a different type in general not just
 Vector with some auto-magic restrictions bolted on top of it.

 The same could be said about shared.
ImmutableVector is a strange concept altogether, but I understand what you mean here. What I still don't completely understand is the general message of "with "shared" everything should be "shared"". How can it be?
For _data_ it must else you allow low-level races. "Everything" is bit too general a word. -- Dmitry Olshansky
Jan 25 2014
parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Saturday, 25 January 2014 at 10:03:55 UTC, Dmitry Olshansky 
wrote:

 In fact, the definition above
 would even fail to compile in this case, because push should 
 take
 shared(T) and pop should return shared(T) (since Queue will be 
 dealing
 with a container of shared Ts, right?)
No Queue in your definition of Queue it may take local T no problem at all. It all depends on the contents of pop/push to understand if it compiles.
It can already be inferred from the interface. The Queue uses some Container as storage. Obviously push()/pop() would actually access that storage to store/remove elements. If Container does not support "shared", we're in trouble. Let's get a little more concrete: http://dpaste.dzfl.pl/5bd15697 A dastardly simple implementation of Queue. Note that Queue is being designed as shared type, what it uses for storage is its business. However, due to transitivity of "shared", it cannot really use anything *but* other "shared" containers. Therefore, we either need casts or __gshared :( Note that "T" for Queue is "Packet", bur for Container it's "shared(Packet)". Well, ok. Let's take a step back and forget about existing containers. I could implement the storage myself, inside the Queue type, right? Something like that: class Queue(T) { struct Node { T value; Node* prev, next; } private Node* head, tail; // Implementation follows... } I'd implement the whole business with doubly-linked list, maybe even employ some interesting algorithm to make the whole thing lock-free... But again, transitivity of "shared" now dictates that Queue can only store shared(T). Now about this Queue type, it has a very interesting property: it's a shared data structure that never allows more than one thread access to any single stored item at a time. It allows to push() and pop() concurrently (well, ok, this one is blocking). But it does not have random access, in fact, in its public interface it doesn't have anything *but* push() and pop()! Therefore, even though the Queue is a shared type (and instances of it will always be shared), it can *safely* store non-shared data. This does not mean that it *shouldn't* store shared data, it just means that the transitivity of "shared" can be lifted in one particular case: iff !hasUnsharedAliasing!(T). That's the same basis upon which std.concurrency is built. And that's why I went through that whole business with __gshared in my original post :)
 Where'd that "ref Packet" come from? :)
Well seeing that Packet is such a fat type, I instinctively went for 'ref' :)
Aww, 'tis not fat, just a little overweight :D
 push() should take an rvalue, pop() should return
 an rvalue. The fact that pushing thread managed to create an 
 rvalue to
 push means precisely that it no longer has any aliases to 
 pushed data
 (remember, we're talking value types here).
If the queue ultimately copies the data to its store it doesn't matter if there are aliases to the original thread's local packet.
Eggz-actly. But this is only true for data types such as that Packet. If it contained pointers or references... no go.
 From the type system point of view there already can't be 
 aliases in the queue as it's shared an the argument/return type 
 is not :)
You don't know that beforehand. Unlike its guts, which dictate that what it stores is in fact shared(T), it's interface does not say anything about T. T is T. Queue could be instantiated as Queue!(shared(FancyType)) too :)
 Given the interface alone it doesn't break anything. The whole 
 "as soon as it gets value" is the only interesting part. There 
 are different hack/tricks that could be used to make the queue 
 go shared --copy-> local by hand.
 What I can deduce is that shared containers in general that 
 return a copy may do a shallow unqual, i.e. mark one "level" of 
 a indirections as not shared since it is a copy.
Almost my thinking. In fact, there's no sense in *copying* shared data anyway. Moving - yes. OTOH, copying *references* to shared data is perfectly fine.
 But the
 contents, each individual packet, would *never* be accessed by 
 more than
 one thread at a time, this is by design in this particular 
 case.
Mmm as I've seen above there is no problem modeling it - that is taking local/mutable argument and put it into shared container.
Yes, but either by using casts, or __gshared :)
 You don't need shared interface for Packets to store them in a 
 shared container. In this case only because of no indirections 
 property of Packets.
That would depend on a container. As you've demostrated earlier, if a container did allow concurrent access to its items, the Packets could provide a world of hurt. The Queue does not allow such things, therefore it's indeed safe.
 Yes, by locking wrapper I meant exactly this. I just have no 
 idea how you'd generally wrap unknown beforehand container 
 otherwise (well there are spinlocks and whatnot, but still 
 locks).
Of course it's not unknown beforehand. I've already mentioned that such queues could be built on top of dlists, other queues, ring buffers, you-name-it. But if those are not "shared", we again come back to all these shennanigans with casts or __gshared.
 Do I understand you correctly that *everything* down to the 
 lowest level
 should be shared-enabled?..
No, for many wrapper is enough or even all the boundary of what we could do. What's needed is more containers that are (by their nature) shared (thread-safe). See Java's ConcurrentHashMap.
Whew :)
 Following that logic, how would you
 build some shared conatiner on top of D arrays which are by 
 their nature
 not synchronized?
New type with a shared interface, lock and hacks with casts inside.
Just like the Queue from my original post :D
 What I still don't completely understand is the general
 message of "with "shared" everything should be "shared"". How 
 can it be?
For _data_ it must else you allow low-level races. "Everything" is bit too general a word.
Hmm... I'll take that into my reply to your next message ;)
Jan 25 2014
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Jan-2014 02:13, Stanislav Blinov пишет:
 On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:
 If I understand the current direction is to go with ref T and make
 some language
 level/compiler level changes to the escaping reference.
Escaping... where? By "go with ref T" you mean return value of e.g. front()?
Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely. Other links: http://forum.dlang.org/thread/bug-5541-3 http.d.puremagic.com/issues/ https://github.com/D-Programming-Language/dmd/pull/1903
 struct Packet {
     ulong ID;
     ubyte[32] header;
     ubyte[64] data;
 }

 Note that I do have arrays in there, but they cannot possibly
 introduce any aliasing, since they're static.
They can. To copy the thing out of somewhere you need to reach into memory area that is shared between threads and then all bets are off. //Example: shared Packet[4] buffer;
Aha, I see you responded as you went through my post :) I've got this case covered further down. Of course they can this way. But instantiating them as "shared" doesn't make sense anyway, exactly because they are not "shared"-aware.
It makes sense as in memory being visible from many threads. E.g. buffer[0].ID is accessible via atomic CAS operation. Same with header and data fields.
 High level invariants that deal with ownership are not represented in
 D typesystem and hence exist only in the head of people writing
 code/comments and better be encapsulated in something manageable.
But they actually are. immutable === doesn't care: share or own, you only ever can read it.
 Value type === single owner.
No. You may have many pointers to one shared value type. And most certainly not '==='. Anyway I was talking about ownership as in objects that own other objects. See for example these 2: http://bartoszmilewski.com/2009/09/22/ownership-systems-against-data-races/ http://bartoszmilewski.com/2009/06/02/race-free-multithreading-ownership/
 Because you can only
 ever give away the value entirely (move() it) or a copy of it (which is
 coincidentally both shallow and deep copy).
 References (any kind
 thereof) === sharing. Granted, this system is lacking (i.e. it'd be nice
 to have unique references). But that's at least a bare base.
And that's the center point. D's type system doesn't statically ensure if there are references present or not. And shared just implies "may be referenced from many threads" and local/mutable is "may be referenced from current thread". Not a single thing in the D type system makes sure e.g. that unique value has exactly one reference to it. See e.g. Rust where authors introduce such notion in the type system. -- Dmitry Olshansky
Jan 25 2014
next sibling parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Saturday, 25 January 2014 at 11:51:04 UTC, Dmitry Olshansky 
wrote:

 Escaping... where? By "go with ref T" you mean return value of 
 e.g.
 front()?
Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely. Other links: http://forum.dlang.org/thread/bug-5541-3 http.d.puremagic.com/issues/ https://github.com/D-Programming-Language/dmd/pull/1903
Aha, got it. Thanks, I'll be looking into that.
 I've got this case covered further down. Of course they can 
 [introduce aliasing] this way. But
 instantiating them as "shared" doesn't make sense anyway, 
 exactly
 because they are not "shared"-aware.
It makes sense as in memory being visible from many threads. E.g. buffer[0].ID is accessible via atomic CAS operation. Same with header and data fields.
Yup, but that's not the case with Queue, as I hope is already established in my previous reply.
 Value type === single owner.
No. You may have many pointers to one shared value type. And most certainly not '==='.
Pointers are references. Whatever one thread does with pointers to its own data is its business and has nothing to do with D's "shared" qualifier. Yes, it's sharing too, only it's not meant in the context "shared between threads". I didn't talk about ownership as in "one object owns another". I was talking about ownership of data in the context of threads. You cannot share a value type with another thread, unless you give it a pointer. And you can't do that in idiomatic D, because you're not allowed to "send" a pointer-to-non-"shared" to another thread.
 Anyway I was talking about ownership as in objects that own 
 other objects.
 See for example these 2:
 http://bartoszmilewski.com/2009/09/22/ownership-systems-against-data-races/
 http://bartoszmilewski.com/2009/06/02/race-free-multithreading-ownership/
I've read that :)
 Because you can only
 ever give away the value entirely (move() it) or a copy of it 
 (which is
 coincidentally both shallow and deep copy).
 References (any kind
 thereof) === sharing. Granted, this system is lacking (i.e. 
 it'd be nice
 to have unique references). But that's at least a bare base.
 And that's the center point. D's type system doesn't statically 
 ensure if there are references present or not. And shared just 
 implies "may be referenced from many threads" and local/mutable 
 is "may be referenced from current thread". Not a single thing 
 in the D type system makes sure e.g. that unique value has 
 exactly one reference to it.
I've been having fun with various approaches to implementing Unique. I have some ideas on where it can be improved, but without language support, this is gruesome indeed.
 See e.g. Rust where authors introduce such notion in the type 
 system.
I've never investigated Rust, never hurts to take a look, I guess...
Jan 25 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/25/14 3:51 AM, Dmitry Olshansky wrote:
 25-Jan-2014 02:13, Stanislav Blinov пишет:
 On Friday, 24 January 2014 at 19:54:31 UTC, Dmitry Olshansky wrote:
 If I understand the current direction is to go with ref T and make
 some language
 level/compiler level changes to the escaping reference.
Escaping... where? By "go with ref T" you mean return value of e.g. front()?
Say front returns a reference to a first element in an array. Say user stores &someArray.front as pointer somewhere, and the does someArray.insert([1,2,3,4,5]); In case array got resized the pointer is now dangling. Sealed containers were meant to prevent such class of errors entirely.
We plan to disallow taking address of a ref return from a function. Andrei
Jan 25 2014
parent reply "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Saturday, 25 January 2014 at 18:47:17 UTC, Andrei Alexandrescu 
wrote:

 We plan to disallow taking address of a ref return from a 
 function.
What about sneaky cases? E.g. like this: NullableRef!T nref(T)(ref T v) { return NullableRef!T(&v); } struct Container(T) { //... ref inout(T) front() inout { ... } } Container!int cnt; //... auto n = nref(cnt.front());
Jan 25 2014
parent reply "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Sunday, 26 January 2014 at 07:21:15 UTC, Stanislav Blinov 
wrote:
 On Saturday, 25 January 2014 at 18:47:17 UTC, Andrei 
 Alexandrescu wrote:

 We plan to disallow taking address of a ref return from a 
 function.
What about sneaky cases? E.g. like this: NullableRef!T nref(T)(ref T v) { return NullableRef!T(&v); } struct Container(T) { //... ref inout(T) front() inout { ... } } Container!int cnt; //... auto n = nref(cnt.front());
I believe I need to clarify the question. If a type stored in container has disabled this(this), then then only way to go about it is to return int by ref. Ranges iterating over this container should respect that, right? But in that case, most of the tasty features of std.range would cease to work, because they rely on an assumption that an element could be forwarded through the interfaces of all involved ranges during the front() call. For example, std.range.lockstep: it won't compile for ranges that return by ref, because isInputRange expects { auto x = r.front(); } to compile, and it won't (copying is disabled). Currently this could be circumvented by using that NullableRef "trick" from above. How would we go about forwarding ref through the interface of ranges if taking the address will not be allowed?
Jan 26 2014
parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Sunday, 26 January 2014 at 12:01:18 UTC, Stanislav Blinov 
wrote:

 then only way to go about it is to return int by ref.
s/int/it/
Jan 26 2014