digitalmars.D - Difference between __gshared and shared.


"wobbles" <grogan.colin gmail.com> writes:

After reading the recent "Lessons Learned" article [1], and 
reading a few comments on the thread, there was a mention of 
using __gshared over shared.

What exactly is the difference here?
Are they two keywords that do the same thing, or are there specific 
use cases for each?
Are there plans to 'converge' them at some point?

[1] 
https://www.reddit.com/r/programming/comments/3cg1r0/lessons_learned_writing_a_filesystem_in_d/

Jul 08 2015

"Olivier Pisano" <olivier.pisano supersonicimagine.com> writes:

On Wednesday, 8 July 2015 at 09:20:58 UTC, wobbles wrote:
 After reading the recent "Lessons Learned" article [1], and 
 reading a few comments on the thread, there was a mention of 
 using __gshared over shared.

 What exactly is the difference here?
 Are they two keywords that do the same thing, or are there specific 
 use cases for each?
 Are there plans to 'converge' them at some point?

 [1] 
 https://www.reddit.com/r/programming/comments/3cg1r0/lessons_learned_writing_a_filesystem_in_d/

You can read about it here: 
http://dlang.org/migrate-to-shared.html#gshared

Basically, __gshared is for interfacing with C (where everything 
is shared by default) without marking data as shared.
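
As a minimal sketch of that distinction (the variable and function names are made up for illustration, and it assumes a reasonably recent druntime): a plain module-level variable is thread-local, `shared` puts the qualifier into the type, and `__gshared` gives a single global instance whose type stays unqualified, so the compiler checks nothing.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

int tlsCounter;                // thread-local by default: one copy per thread
shared int sharedCounter;      // one instance for all threads; type is shared(int)
__gshared int gsharedCounter;  // one instance for all threads; type is plain int

// Runs a worker thread that writes to all three variables, then returns
// what the calling thread observes afterwards.
int[3] observeAfterWorker()
{
    tlsCounter = 0;
    atomicStore(sharedCounter, 0);
    gsharedCounter = 0;

    auto worker = new Thread({
        tlsCounter = 42;                  // touches only the worker's own copy
        atomicOp!"+="(sharedCounter, 1);  // shared data goes through atomics or a lock
        gsharedCounter += 1;              // compiles unchecked; race-free here only
                                          // because the caller waits in join()
    });
    worker.start();
    worker.join();

    return [tlsCounter, atomicLoad(sharedCounter), gsharedCounter];
}

void main()
{
    static assert(is(typeof(sharedCounter) == shared(int)));
    static assert(is(typeof(gsharedCounter) == int));

    auto seen = observeAfterWorker();
    assert(seen[0] == 0);  // the worker's write to its TLS copy is invisible here
    assert(seen[1] == 1);
    assert(seen[2] == 1);
}
```

The two `static assert`s carry the main point: only `shared` is visible in the type system.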

Jul 08 2015

"wobbles" <grogan.colin gmail.com> writes:

On Wednesday, 8 July 2015 at 09:37:37 UTC, Olivier Pisano wrote:
 On Wednesday, 8 July 2015 at 09:20:58 UTC, wobbles wrote:
 After reading the recent "Lessons Learned" article [1], and 
 reading a few comments on the thread, there was a mention of 
 using __gshared over shared.

 What exactly is the difference here?
 Are they two keywords that do the same thing, or are there 
 specific use cases for each?
 Are there plans to 'converge' them at some point?

 [1] 
 https://www.reddit.com/r/programming/comments/3cg1r0/lessons_learned_writing_a_filesystem_in_d/

 You can read about it here: 
 http://dlang.org/migrate-to-shared.html#gshared

 Basically, __gshared is for interfacing with C (where 
 everything is shared by default) without marking data as shared.

Ok, so we should prioritise using 'shared' over __gshared as much 
as possible. Good to know!

Jul 08 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Wed, 08 Jul 2015 09:43:38 +0000, wobbles wrote:

 Ok, so we should prioritise using 'shared' over __gshared as much as
 possible. Good to know!

only `shared` is PITA...

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 09:54:01 UTC, ketmar wrote:
 On Wed, 08 Jul 2015 09:43:38 +0000, wobbles wrote:

 Ok, so we should prioritise using 'shared' over __gshared as 
 much as possible. Good to know!

 only `shared` is PITA...

The primary advantage of shared is that it allows most everything 
to be thread-local.

Though arguably, shared _should_ be a bit of a pain, since its 
usage should normally be very restricted. But we do need to 
revisit shared and figure out what we want/need to do with it. 
Synchronized classes as described in TDPL were never even 
implemented (though I contest that they really make shared usable 
in any kind of sane way; I really don't see how you can do 
anything other than really basic stuff with shared without 
requiring that the programmer deal with the locks and casting 
properly on their own). So, more work needs to be done there even 
if it's figuring out what shared _isn't_ going to be doing.

Really though, one of the bigger problems is dealing with 
allocation and deallocation of shared objects and passing objects 
across threads, since we keep wanting to be able to do stuff like 
have thread-specific allocators, but the way shared currently 
works doesn't actually allow for it. :|

Regardless, while I would very much like to see shared properly 
ironed out, I'm _very_ grateful that thread-local is the default 
in D. It's just so much saner.

- Jonathan M Davis

Jul 08 2015

"tcak" <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:

On Wednesday, 8 July 2015 at 10:10:58 UTC, Jonathan M Davis wrote:
 On Wednesday, 8 July 2015 at 09:54:01 UTC, ketmar wrote:
 On Wed, 08 Jul 2015 09:43:38 +0000, wobbles wrote:

 Ok, so we should prioritise using 'shared' over __gshared as 
 much as possible. Good to know!

 only `shared` is PITA...

 The primary advantage of shared is that it allows most 
 everything to be thread-local.

 Though arguably, shared _should_ be a bit of a pain, since its 
 usage should normally be very restricted. But we do need to 
 revisit shared and figure out what we want/need to do with it. 
 Synchronized classes as described in TDPL were never even 
 implemented (though I contest that they really make shared 
 usable in any kind of sane way; I really don't see how you can 
 do anything other than really basic stuff with shared without 
 requiring that the programmer deal with the locks and casting 
 properly on their own). So, more work needs to be done there 
 even if it's figuring out what shared _isn't_ going to be doing.

 Really though, one of the bigger problems is dealing with 
 allocation and deallocation of shared objects and passing 
 objects across threads, since we keep wanting to be able to do 
 stuff like have thread-specific allocators, but the way shared 
 currently works doesn't actually allow for it. :|

 Regardless, while I would very much like to see shared properly 
 ironed out, I'm _very_ grateful that thread-local is the 
 default in D. It's just so much saner.

 - Jonathan M Davis

I still haven't found my answer, though. I have three 
different use cases, and one of them is missing from the language.

1. Thread-local object.
2. Shared, but implemented as not to be synchronised.
3. Shared, and implemented to be synchronised.

There is no simple way to design a class so that you can 
implement it for both points 2 and 3. The only way is to use 
__gshared with point 1 to solve it.

I use shared in many of my classes. Thus I experience different 
situations.

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 10:49:05 UTC, tcak wrote:
 On Wednesday, 8 July 2015 at 10:10:58 UTC, Jonathan M Davis 
 wrote:
 On Wednesday, 8 July 2015 at 09:54:01 UTC, ketmar wrote:
 On Wed, 08 Jul 2015 09:43:38 +0000, wobbles wrote:

 Ok, so we should prioritise using 'shared' over __gshared as 
 much as possible. Good to know!

 only `shared` is PITA...

 The primary advantage of shared is that it allows most 
 everything to be thread-local.

 Though arguably, shared _should_ be a bit of a pain, since its 
 usage should normally be very restricted. But we do need to 
 revisit shared and figure out what we want/need to do with it. 
 Synchronized classes as described in TDPL were never even 
 implemented (though I contest that they really make shared 
 usable in any kind of sane way; I really don't see how you can 
 do anything other than really basic stuff with shared without 
 requiring that the programmer deal with the locks and casting 
 properly on their own). So, more work needs to be done there 
 even if it's figuring out what shared _isn't_ going to be 
 doing.

 Really though, one of the bigger problems is dealing with 
 allocation and deallocation of shared objects and passing 
 objects across threads, since we keep wanting to be able to do 
 stuff like have thread-specific allocators, but the way shared 
 currently works doesn't actually allow for it. :|

 Regardless, while I would very much like to see shared 
 properly ironed out, I'm _very_ grateful that thread-local is 
 the default in D. It's just so much saner.

 - Jonathan M Davis

 I still haven't found my answer, though. I have three 
 different use cases, and one of them is missing from the language.

 1. Thread-local object.
 2. Shared, but implemented as not to be synchronised.
 3. Shared, and implemented to be synchronised.

 There is no simple way to design a class so that you can 
 implement it for both points 2 and 3. The only way is to use 
 __gshared with point 1 to solve it.

 I use shared in many of my classes. Thus I experience different 
 situations.

By using __gshared, you're throwing away the compiler's help, and 
it's _much_ more likely that you're going to write code which 
causes the compiler to generate incorrect machine code, because 
it's assuming that an object is thread-local when it's not.

Generally what you have to do with shared is lock on a mutex, 
cast away shared on the object you want to operate on, do 
whatever you're going to do with it, and then release the lock 
after there are no more thread-local references to the shared 
object. And that's basically what you normally should be doing in 
C++ code except that you don't have to cast away shared, because 
C++ doesn't have it.
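
A minimal sketch of that lock / cast / operate / unlock pattern follows. The names `queue`, `queueLock`, `push` and `pushedCount` are hypothetical, and the sketch assumes a druntime recent enough that `core.sync.mutex.Mutex` can be constructed and locked as `shared`.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

shared Mutex queueLock;  // hypothetical lock guarding `queue`
shared int[] queue;      // hypothetical data reachable from several threads

shared static this()
{
    queueLock = new shared Mutex();
}

// Appends `value` while holding the lock.
void push(int value)
{
    queueLock.lock();
    scope (exit) queueLock.unlock();

    // Holding the lock is the programmer's promise that no other thread touches
    // `queue` right now, so a thread-local view is valid until unlock().
    auto local = cast(int[]*) &queue;
    *local ~= value;
    // `local` must not escape this function.
}

// Reads the length under the same lock.
size_t pushedCount()
{
    queueLock.lock();
    scope (exit) queueLock.unlock();
    return (*cast(int[]*) &queue).length;
}

void main()
{
    Thread[] workers;
    foreach (t; 0 .. 4)
    {
        workers ~= new Thread({
            foreach (i; 0 .. 1000)
                push(i);
        });
    }
    foreach (w; workers) w.start();
    foreach (w; workers) w.join();

    assert(pushedCount() == 4000);
}
```

The casts are the only places where the compiler's checking is switched off, so they are the places to audit if something goes wrong.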

I know that there are a number of people who get frustrated with 
shared and use __gshared instead, but unless you fully 
understand what you're doing and how the language works, and 
you're _really_ careful, you're going to shoot yourself in the 
foot in subtle ways if you do that.

- Jonathan M Davis

Jul 08 2015

Shachar Shemesh <shachar weka.io> writes:

On 08/07/15 15:08, Jonathan M Davis wrote:
 I know that there are a number of people who get frustrated with shared
 and use __gshared instead, but unless you fully understand what you're
 doing and how the language works, and you're _really_ careful, you're
 going to shoot yourself in the foot in subtle ways if you do that.

I guess my main issue with this statement is that I don't see how that 
is not the case when using shared.

Shachar

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 13:26:53 UTC, Shachar Shemesh wrote:
 On 08/07/15 15:08, Jonathan M Davis wrote:
 I know that there are a number of people who get frustrated 
 with shared and use __gshared instead, but unless you fully 
 understand what you're doing and how the language works, and 
 you're _really_ careful, you're going to shoot yourself in the 
 foot in subtle ways if you do that.

 I guess my main issue with this statement is that I don't see 
 how that is not the case when using shared.

Because unless you cast away shared, you're prevented from doing 
much of anything to the object, and the compiler clearly 
indicates which objects are shared, so the code that has to worry 
about getting locks right and dealing with casting away shared 
correctly is clearly marked and segregated from the rest of the 
program, unlike with a language like C++ or Java where 
_everything_ is shared, and you have no idea which objects are 
actually shared across threads and which are thread-local.

What we have is uglier than we'd like, but that ugliness 
highlights the small portion of code where you actually have to 
deal with synchronization and threading issues so that if you do 
have a problem with it, you have a very small amount of code to 
dig through to figure out how you screwed it up.

- Jonathan M Davis

Jul 08 2015

"Márcio Martins" <marcioapm gmail.com> writes:

On Wednesday, 8 July 2015 at 14:00:43 UTC, Jonathan M Davis wrote:
 On Wednesday, 8 July 2015 at 13:26:53 UTC, Shachar Shemesh 
 wrote:
 On 08/07/15 15:08, Jonathan M Davis wrote:
 I know that there are a number of people who get frustrated 
 with shared and use __gshared instead, but unless you fully 
 understand what you're doing and how the language works, and 
 you're _really_ careful, you're going to shoot yourself in the 
 foot in subtle ways if you do that.

 I guess my main issue with this statement is that I don't see 
 how that is not the case when using shared.

 Because unless you cast away shared, you're prevented from 
 doing much of anything to the object, and the compiler clearly 
 indicates which objects are shared, so the code that has to 
 worry about getting locks right and dealing with casting away 
 shared correctly is clearly marked and segregated from the rest 
 of the program, unlike with a language like C++ or Java where 
 _everything_ is shared, and you have no idea which objects are 
 actually shared across threads and which are thread-local.

 What we have is uglier than we'd like, but that ugliness 
 highlights the small portion of code where you actually have to 
 deal with synchronization and threading issues so that if you 
 do have a problem with it, you have a very small amount of code 
 to dig through to figure out how you screwed it up.

 - Jonathan M Davis

I think a good way to avoid this extra annoyance would be to have 
shared be implicitly convertible to non-shared, or rather, 
shared should simply be ignored inside synchronized blocks. 
The net result would be that the cast is done implicitly at 
the beginning of the block.

To solve the wrong mutex problem that Dmitry mentioned, perhaps a 
declarative approach could be used? I wouldn't mind the extra 
syntax, as it also provides documentation by giving clarity into 
which mutexes guard what data. With the implicit 
stripping/ignoring of shared it would become very succinct as 
well.

```
// Proposed syntax; none of this compiles in current D.
Mutex moomtx;
Mutex cheesemtx;

shared(moomtx) int[] moo;
shared(cheesemtx) float cheese;

// or perhaps with a UDA
// (either globalmtx or moomtx must be locked):
@lock(globalmtx, moomtx) shared int moo;

synchronized(moomtx) {
    moo = []; // ok
    cheese = 1.61803f; // error: mutex cheesemtx is not locked, even
                       // though we are inside a synchronized block

    moo.sort(); // shared automatically ignored
}

// ...
moo.sort(); // error: mutex moomtx is not locked
```

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 14:29:56 UTC, Márcio Martins wrote:
 I think a good way to avoid this extra annoyance would be to 
 have shared be implicitly convertible to non-shared, or 
 actually, shared should just be ignored inside synchronized 
 blocks, essentially the net result is that the cast is done 
 implicitly at the beginning of the block.

 To solve the wrong mutex problem that Dmitry mentioned, perhaps 
 a declarative approach could be used? I wouldn't mind the extra 
 syntax, as it also provides documentation by giving clarity 
 into which mutexes guard what data. With the implicit 
 stripping/ignoring of shared it would become very succinct as 
 well.

Synchronized classes would be able to remove the outer layer of 
shared, because they can actually guarantee that no one else has 
access to their member variables (e.g. it would be illegal to 
make them public). So, when it locks the mutex on a member 
function call, it can actually guarantee that stripping that 
outer layer of shared is safe. But it can only strip away the 
outer layer, because it can't make any guarantees beyond that.

For shared to work like you're suggesting, the compiler would 
essentially need to make it so that doing _anything_ to a shared 
object when the mutex wasn't locked would be illegal, and I don't 
know how feasible that really is. And even if it could do it, all 
it would be doing would be the same as a synchronized class and 
stripping away the outer layer of shared. But even then, it 
wouldn't be enough, because it has no way of stopping you from 
assigning that partially-unshared object to something and letting 
it escape, whereas with a synchronized class, everything you're 
doing is inside of a member function, so it's encapsulated, 
allowing the compiler to make better guarantees about what you 
are or aren't doing with the object. So, for what you're 
proposing, you pretty much might as well just create a 
synchronized class which uses opDispatch to forward calls to the 
member variable. It's slightly more verbose at the declaration 
site but less verbose wherever you use the object. And 
regardless, it doesn't solve the problem that all that you can 
strip away of shared is the outer layer. I'm not sure that it's 
possible to do any more than that with compiler guarantees.

Another thing to consider is that you might need to lock multiple 
objects with the same mutex or lock multiple mutexes when 
accessing an object, in which case, what you're proposing is 
definitely worse than what we'd get from synchronized classes, 
since they naturally lock multiple items at once (though they 
don't really cover the case where multiple mutexes need to be 
involved except that because all of their functionality is 
encapsulated, it's easier to put the extra locking where it needs 
to be without missing it).

Maybe we need to do something like you're suggesting, but really, 
it doesn't seem like it's improving particularly on synchronized 
classes.

- Jonathan M Davis

Jul 08 2015

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 8 July 2015 at 12:08:37 UTC, Jonathan M Davis wrote:
 By using __gshared, you're throwing away the compiler's help, 
 and it's _much_ more likely that you're going to write code 
 which causes the compiler to generate incorrect machine code, 
 because it's assuming that an object is thread-local when it's 
 not.

 Generally what you have to do with shared is lock on a mutex, 
 cast away shared on the object you want to operate on, do 
 whatever you're going to do with it, and then release the lock 
 after there are no more thread-local references to the shared 
 object. And that's basically what you normally should be doing 
 in C++ code except that you don't have to cast away shared, 
 because C++ doesn't have it.

 I know that there are a number of people who get frustrated 
 with shared and use __gshared instead, but unless you fully 
 understand what you're doing and how the language works, and 
 you're _really_ careful, you're going to shoot yourself in the 
 foot in subtle ways if you do that.

 - Jonathan M Davis

Amen

Jul 08 2015

"Márcio Martins" <marcioapm gmail.com> writes:

On Wednesday, 8 July 2015 at 21:15:19 UTC, deadalnix wrote:
 On Wednesday, 8 July 2015 at 12:08:37 UTC, Jonathan M Davis 
 wrote:

 I know that there are a number of people who get frustrated 
 with shared and use __gshared instead, but unless you fully 
 understand what you're doing and how the language works, and 
 you're _really_ careful, you're going to shoot yourself in the 
 foot in subtle ways if you do that.

 - Jonathan M Davis

 Amen

What sort of subtle ways? Can you give examples that are not 
effectively the same subtle ways you would encounter with 
pthreads in C/C++? I have been running with the assumption that 
__gshared effectively bypasses TLS, which again, feels sort of 
dirty to use a __ prefixed keyword for that, but, yeah...

I'm not sure why I don't see the magic with synchronized classes. 
To me, they have a fundamental flaw in the fact they are classes. 
While I don't mind much their existence, I would very much 
dislike if that is the only convenient way to use the 
shared/synchronized mechanisms.

Regarding your point about multiple pieces of data being guarded 
by multiple mutexes using the proposed design, we could perhaps 
do it this way:

@lock(mutex) shared {
   int moo;
   float foo;
}


We could implement synchronized classes/structs like this:

struct Foo {
   void cowsay() synchronized(moootex) {
      // your synchronized method implementation
   }

   void cowsay() {
     synchronized(moootex) {
       // no sugar
     }
   }

   @lock(moootex) private shared {
     int m_moo;
     float m_foo;
   }
}

I think we should avoid Java's nonsense of having to 
declare a class to do anything, and instead find generic ways to 
do things that are useful for multiple paradigms.

Jul 09 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
 On Wednesday, 8 July 2015 at 21:15:19 UTC, deadalnix wrote:
 On Wednesday, 8 July 2015 at 12:08:37 UTC, Jonathan M Davis 
 wrote:

 I know that there are a number of people who get frustrated 
 with shared and use __gshared instead, but unless you fully 
 understand what you're doing and how the language works, and 
 you're _really_ careful, you're going to shoot yourself in 
 the foot in subtle ways if you do that.

 - Jonathan M Davis

 Amen

 What sort of subtle ways? Can you give examples that are not 
 effectively the same subtle ways you would encounter with 
 pthreads in C/C++? I have been running with the assumption that 
 __gshared effectively bypasses TLS, which again, feels sort of 
 dirty to use a __ prefixed keyword for that, but, yeah...

Well, the compiler is free to assume that a variable that is not 
marked as shared is thread-local. So, it's free to make 
optimizations based on that. Take, for instance, this code:

auto foo = getFoo();
auto result1 = foo.constPureFunction(); // This function _cannot_ mutate foo
auto result2 = foo.constPureFunction(); // This function _cannot_ mutate foo
auto bar = foo;

So, the compiler knows that the value of bar is identical to the value of 
foo and that result1 and result2 are guaranteed to be the same, 
because it knows that no other thread can possibly have mutated 
foo within this code, and there's no way that this code mutated 
foo even through another reference to the same data on the same 
thread. And it can know that thanks to how const, pure, and TLS 
all work. The compiler is free to optimize the code or make other 
alterations to it based on that knowledge, so if it makes an 
optimization based on that, and foo is actually shared across 
threads (either because the object it refers to was originally 
__gshared or because shared was cast away incorrectly), then 
you're going to have incorrect machine code. And what 
optimizations the compiler does with code like this could change 
over time. And unless you're an expert in the language and in the 
compiler, you're not going to know when the compiler is going to 
make optimizations where the fact that the variable is in TLS 
factors in. So, you're not going to know when the compiler might 
optimize your code in ways that won't work with __gshared, and 
what optimizations it does or doesn't do right now won't 
necessarily be the same ones that it does or doesn't do later.
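
To make the reasoning above concrete, here is a minimal sketch with hypothetical names (`Foo`, `doubled`, `twice`, `twiceFolded`). It only shows the rewrite that the argument above says a compiler is entitled to make for thread-local data; it makes no claim about which optimizations any particular compiler actually performs.

```d
struct Foo
{
    int value;

    // const: cannot mutate this Foo through `this`.
    // pure: cannot read or write mutable global state.
    int doubled() const pure nothrow @nogc @safe
    {
        return value * 2;
    }
}

// The code as written: two calls with nothing in between.
int twice(ref const(Foo) foo) pure
{
    auto result1 = foo.doubled();
    auto result2 = foo.doubled();
    return result1 + result2;
}

// What the function above may legally be turned into if `foo` is assumed to
// be thread-local: the second call is replaced by the first result.
int twiceFolded(ref const(Foo) foo) pure
{
    auto result1 = foo.doubled();
    return result1 + result1;
}

void main()
{
    auto foo = Foo(21);
    assert(twice(foo) == 84);
    assert(twiceFolded(foo) == 84);
}
```

The two functions agree as long as nothing else changes `foo.value` between the calls. If `foo` really lives in `__gshared` storage and another thread writes to it, they can disagree, and nothing in the types warns about it.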

You have the same problem with shared, but in that case, the 
compiler makes it so that you have to cast away shared to get 
into this mess. It protects against doing stuff like accidentally 
passing a shared object around into code that will treat it as 
thread-local. Heck, with a __gshared object, if you change its 
type, it could go from being a value type where passing it to 
other code works just fine because it's truly copied to being a 
reference type (or partial reference type) where it's not copied 
(or only partially copied), and the compiler won't be able to 
help you catch the points where you were doing a full copy before 
but aren't now. And if it's your coworker that changed the 
definition of the type of the variable that you marked as 
__gshared, you could be screwed without knowing it.
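
A small sketch of that scenario, using a hypothetical `Config` type whose field has just been changed from a plain `int` to a slice. With `shared`, the call site that used to receive a full copy stops compiling. With `__gshared`, it keeps compiling and quietly hands out a reference to the global data.

```d
struct Config
{
    int[] limits;  // a plain int before the hypothetical change
}

shared Config sharedCfg;
__gshared Config gsharedCfg;

// Code written on the assumption that it receives a private, thread-local copy.
void consume(Config c) {}

// shared(Config) now contains a mutable indirection, so the implicit
// conversion to Config is rejected and the call site gets flagged.
static assert(!__traits(compiles, consume(sharedCfg)));

// The __gshared variable has the plain type Config, so the same call
// compiles without complaint.
static assert(__traits(compiles, consume(gsharedCfg)));

// Shows that a "copy" of the __gshared value still aliases the global data.
bool copyAliasesOriginal()
{
    gsharedCfg.limits = [1, 2, 3];
    Config copy = gsharedCfg;  // copies the slice, not the elements
    copy.limits[0] = 99;
    return gsharedCfg.limits[0] == 99;
}

void main()
{
    assert(copyAliasesOriginal());
}
```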

Really, we can't tell what subtle behavioral problems you're 
risking with __gshared, because that depends on what the compiler 
is currently able to do with the assumption that a variable is in 
TLS. You run into all of the problems that you risk with sharing 
variables in threads in C++ only worse, because the D compiler is 
free to assume that an object is thread-local unless it's marked 
as shared and thus can make optimizations based on that, whereas 
the C++ compiler can't. And you've thrown away all of the 
compiler's help by using __gshared. __gshared is intended 
specifically for use with interacting with C code where we don't 
really have a choice, and you have to be careful with it. For 
everything else, if you need to share data across threads, that's 
what shared is for, and the compiler then knows that it's shared, 
so it will optimize differently, and it'll yell at you when you 
misuse it. Ultimately, you still have the risk of screwing it up 
when you cast away shared when the object is protected by a lock, 
but then at least, even if you end up with a subtle bug, because 
a thread-local reference to the shared data escaped the lock, 
it's a lot easier to figure out where you've misused shared 
in D than figuring out where you might have screwed 
it up in C++, because all of your shared objects are explicitly 
marked as such, and it's the points where you cast away shared 
that risk problems, so you have a lot less code to look at.

As annoying as it can be, shared is your friend. __gshared is not.

- Jonathan M Davis

Jul 09 2015

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
 Well, the compiler is free to assume that a variable that is 
 not marked as shared is thread-local. So, it's free to make 
 optimizations based on that. So, for instance, it can know for 
 a fact that

 auto foo = getFoo();
 auto result1 = foo.constPureFunction(); // This function _cannot_ mutate foo
 auto result2 = foo.constPureFunction(); // This function _cannot_ mutate foo
 auto bar = foo;

 So, it knows that the value of bar is identical to the value of 
 foo and that result1 and result2 are guaranteed to be the same,

Pretty sure that's the same as in C++. Unless there was an 
acquire operation/barrier in there, the compiler is free to 
assume that two sequential reads to a memory location, without an 
intervening write (only considering the same thread), will return 
the same result. The optimisations that are forbidden in C++ are 
more subtle.

 Really, we can't tell what subtle behavioral problems you're 
 risking with __gshared, because that depends on what the 
 compiler is currently able to do with the assumption that a 
 variable is in TLS. You run into all of the problems that you 
 risk with sharing variables in threads in C++ only worse, 
 because the D compiler is free to assume that an object is 
 thread-local unless it's marked as shared and thus can make 
 optimizations based on that, whereas the C++ compiler can't. 
 And you've thrown away all of the compiler's help by using 
 __gshared. __gshared is intended specifically for use with 
 interacting with C code where we don't really have a choice, 
 and you have to be careful with it.

Basically, __gshared pretends to be compatible with C(++) globals, 
but in actual fact it doesn't share the same memory model so who 
knows what might happen... It's not just 
dangerous-so-be-very-careful, it's fundamentally broken and we're 
currently just getting away with it by relying on C(++) 
optimisers.

Jul 09 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, 9 July 2015 at 14:40:17 UTC, John Colvin wrote:
 Basically, __gshared pretends to be compatible with C(++) globals, 
 but in actual fact it doesn't share the same memory model so 
 who knows what might happen... It's not just 
 dangerous-so-be-very-careful, it's fundamentally broken and 
 we're currently just getting away with it by relying on C(++) 
 optimisers.

__gshared is required for interacting with C/C++ APIs, but 
really, even there, what you're mainly dealing with is primitive 
types like int, and access to it should normally be pretty 
minimal/restricted. That being said, C/C++ bindings in general 
are arguably a giant hole, because they're marked as non-shared 
when they're arguably shared. It usually works fine, because the 
C/C++ functions generally aren't doing anything with what you 
pass to them which would cause them to be used across multiple 
threads, and you're usually not doing a lot of passing around of 
data that you get from C/C++ functions, but it _is_ an area that 
is a bit of a minefield if you're not careful. You're basically 
dealing with the __gshared problem. Unfortunately, I'm not sure 
what we can do about it. Simply marking it all as shared would be 
problematic, since most of it really isn't, but having it all be 
treated as thread-local is also problematic. So, unfortunately, 
you just have to be very careful when dealing with C/C++ bindings 
and understand what the C/C++ code is doing. It is a problem 
though.

Still, using __gshared more than you have to is just going to 
make the problem bigger. It's bad enough that we have to deal 
with it at the C/C++ binding layer without having to worry about 
it in straight up D code.

- Jonathan M Davis

Jul 09 2015

"Márcio Martins" <marcioapm gmail.com> writes:

On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
 On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
[...]

 Well, the compiler is free to assume that a variable that is 
 not marked as shared is thread-local. So, it's free to make 
 optimizations based on that. So, for instance, it can know for 
 a fact that

 [...]

But this is what a C/C++ compiler would do, unless your data 
is qualified as volatile. I believe __gshared also implies the 
volatile behavior, right? It wouldn't make sense any other way.

So basically, __gshared is like saying "I want the C/C++ 
behavior, and I accept I am all on my own as the compiler will 
not help me".

Jul 09 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, 9 July 2015 at 14:57:56 UTC, Márcio Martins wrote:
 On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis 
 wrote:
 On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
[...]

 Well, the compiler is free to assume that a variable that is 
 not marked as shared is thread-local. So, it's free to make 
 optimizations based on that. So, for instance, it can know for 
 a fact that

 [...]

 But this is what a C/C++ compiler would do, unless your 
 data is qualified as volatile. I believe __gshared also implies 
 the volatile behavior, right? It wouldn't make sense any other 
 way.

 So basically, __gshared is like saying "I want the C/C++ 
 behavior, and I accept I am all on my own as the compiler will 
 not help me".

Sort of, but the assumptions that the D compiler is allowed to 
make aren't the same. Regardless of shared/__gshared itself, D's 
const is very different, and C++ doesn't have pure or immutable. 
And the D compiler devs can add whatever optimizations they want 
based on what those features guarantee so long as they can prove 
that they're correct, which changes what a D compiler is 
allowed to optimize in comparison to a C++ compiler. So, if you 
make assumptions on what's valid based purely on C++, you risk 
shooting yourself in the foot.

__gshared is really only meant for interacting with C APIs, and 
if you're using it for other stuff, you're just begging for 
trouble. You might get away with it at least some of the time, 
but it really isn't a good idea to try.

- Jonathan M Davis

Jul 09 2015

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 9 July 2015 at 14:57:56 UTC, Márcio Martins wrote:
 On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis 
 wrote:
 On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
[...]

 Well, the compiler is free to assume that a variable that is 
 not marked as shared is thread-local. So, it's free to make 
 optimizations based on that. So, for instance, it can know for 
 a fact that

 [...]

 But this is what a C/C++ compiler would do, unless your 
 data is qualified as volatile. I believe __gshared also implies 
 the volatile behavior, right? It wouldn't make sense any other 
 way.

 So basically, __gshared is like saying "I want the C/C++ 
 behavior, and I accept I am all on my own as the compiler will 
 not help me".

If you think volatile is going to help you with concurrency, 
you're gonna have a bad time.

The only thing volatile does is prevent register promotion of 
the variable. It is useful for MMIO, but it doesn't provide any 
guarantees for multithreading.

Jul 10 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Wed, 08 Jul 2015 10:10:55 +0000, Jonathan M Davis wrote:

 Regardless, while I would very much like to see shared properly ironed
 out, I'm _very_ grateful that thread-local is the default in D. It's
 just so much saner.

+1

Jul 08 2015

"Márcio Martins" <marcioapm gmail.com> writes:

On Wednesday, 8 July 2015 at 09:54:01 UTC, ketmar wrote:
 On Wed, 08 Jul 2015 09:43:38 +0000, wobbles wrote:

 Ok, so we should prioritise using 'shared' over __gshared as 
 much as possible. Good to know!

 only `shared` is PITA...

I write a fair amount of threaded code myself and must say the 
experience with shared has also been horrible. To me, it feels 
like an attempt at fixing something in the way we write threaded 
code that someone gave up on, and now it's just noise in the 
language. I have given up on it too, and am now using only 
__gshared. It's fine, but it feels dirty to be using a __ 
prefixed keyword so often. :)

If I remember correctly, it was also a pain to use most of Phobos 
with 'shared' without casting left and right.

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 11:02:19 UTC, Márcio Martins wrote:
 If I remember correctly, it was also a pain to use most of 
 Phobos with 'shared' without casting left and right.

In general, if you're doing much with a shared object other than 
passing it around, you're using shared incorrectly. Various 
operations can be done with core.atomic and shared objects, but a 
number of operations are outright forbidden by the language, and 
to use a non-thread local object correctly (be it __gshared or 
shared), you _have_ to protect it with a lock except in cases where 
you know what you're doing and are _extremely_ careful with 
atomic operations. D is effectively trying to force you to not 
try and do operations on a shared object unless it's protected by 
a lock.
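
For the simple cases that core.atomic does cover, a minimal sketch 
might look like the following. The names hits and countHits are made 
up for illustration, and this is only one way to do it, not a 
recommendation to go lock-free in general:

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

shared int hits; // hypothetical counter, for illustration only

int countHits(int nThreads, int perThread)
{
    atomicStore(hits, 0);
    Thread[] threads;
    foreach (i; 0 .. nThreads)
        threads ~= new Thread({
            foreach (j; 0 .. perThread)
                atomicOp!"+="(hits, 1); // atomic read-modify-write, no lock
        }).start();
    foreach (t; threads)
        t.join();
    return atomicLoad(hits);
}

void main()
{
    assert(countHits(4, 1000) == 4000);
}
```

Anything more involved than a single counter or flag gets hard to 
reason about quickly, which is why a lock is the default answer.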

Per TDPL, we're supposed to have synchronized classes which then 
strip off the outer layer of shared internally when operating on 
their member variables so that the sharedness can be stripped 
safely without relying on the programmer getting the locks and 
casting right, but synchronized classes have never been 
implemented (just synchronized functions), so that doesn't work 
at the moment, and stripping off just the outer layer of shared 
often isn't enough anyway. So, what you end up having to do is 
something along the lines of

synchronized(mutexObj)
{
     auto unshared = cast(T)sharedT;

     // ...
     // do stuff on unshared, since it's considered thread-local, and
     // normal code will work on it.
     // ...

     // make sure that there are no references to unshared before
     // leaving the synchronized block
}

That way, you can use the shared object with normal code as long 
as you've protected it properly with a mutex, and the language 
prevents you from accidentally operating on the shared object 
through the shared reference (which would then be when it's not 
protected by a lock). The problem is that unlike with 
synchronized classes, you have to explicitly cast away shared 
yourself and ensure that no thread-local references to that data 
escape the synchronized block, and if you screw it up, you could 
have subtle, entertaining bugs, whereas synchronized classes 
would be able to guarantee that the outer layer of shared was 
safely removed.
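
To make the idiom concrete, here is a fuller sketch that should 
compile as-is. Everything named in it (Counter, counter, counterLock, 
bump, runWorkers) is hypothetical, and using a plain Object's monitor 
as the lock is just one option; a core.sync.mutex Mutex would serve 
the same purpose:

```d
import core.thread : Thread;

// All names here are hypothetical and only for illustration.
struct Counter
{
    int value;
    void increment() { ++value; } // ordinary code, knows nothing of shared
}

shared Counter counter;    // the type system tracks this as shared
shared Object counterLock; // the object whose monitor guards counter

shared static this()
{
    counterLock = cast(shared) new Object; // runs once, before other threads
}

void bump()
{
    synchronized (counterLock)
    {
        // Cast away shared only while the lock is held.
        auto unshared = cast(Counter*) &counter;
        unshared.increment();
        // unshared goes out of scope here, so nothing escapes the block.
    }
}

int runWorkers(int nThreads, int perThread)
{
    synchronized (counterLock)
        (cast(Counter*) &counter).value = 0; // reset under the same lock

    Thread[] threads;
    foreach (i; 0 .. nThreads)
        threads ~= new Thread({
            foreach (j; 0 .. perThread)
                bump();
        }).start();
    foreach (t; threads)
        t.join();

    int total;
    synchronized (counterLock)
        total = (cast(Counter*) &counter).value;
    return total;
}

void main()
{
    assert(runWorkers(4, 1000) == 4000);
}
```

Note that nothing in the language checks that counterLock is the 
right lock for counter, or that the casted pointer stays inside the 
block. That part is pure convention.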

What we would ideally have would be a way to do the above code 
safely without having to do the cast explicitly and make sure 
that that was done correctly, but we haven't figured out how to 
do that yet. But this idiom ensures that you operate on shared 
objects only when they're protected by a lock. Actually operating 
on shared or __gshared objects without locking is just plain 
buggy unless you're talking about atomic operations and using 
core.atomic properly to deal with that (which isn't necessarily 
easy). So, if anything, the fact that folks find shared so 
annoying implies that they're trying to write code in a manner 
which is not thread-safe. That's not to say that shared doesn't 
need to be improved or that there aren't things that you can't do 
with it right now that you should be able to, but in general, the 
stuff that you can't do with a shared object is stuff that you 
shouldn't be doing with a shared object anyway.

- Jonathan M Davis

Jul 08 2015

"wobbles" <grogan.colin gmail.com> writes:

On Wednesday, 8 July 2015 at 12:21:22 UTC, Jonathan M Davis wrote:
 On Wednesday, 8 July 2015 at 11:02:19 UTC, Márcio Martins wrote:
 [...]



Interesting, so the main pain of using shared is the requirement 
to cast away the shared whenever you want to work on the data in 
a synchronized block.

Do you know of any links to the old conversations about what 
solutions there are for this?

My first thought is using the 'with' keyword.
shared int mySharedInt;
synchronized(mutexObj) with (mySharedInt){
     // any references to mySharedInt in this block are implicitly
     // converted to non-shared
}

Jul 08 2015

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 08-Jul-2015 15:39, wobbles wrote:
 On Wednesday, 8 July 2015 at 12:21:22 UTC, Jonathan M Davis wrote:
 On Wednesday, 8 July 2015 at 11:02:19 UTC, Márcio Martins wrote:
 [...]



 Interesting, so the main pain of using shared is the requirement to cast
 away the shared whenever you want to work on the data in a synchronized
 block.

 Is there any links do you know to the old conversations on what
 solutions are there for this?

 My first thought is using the 'with' keyword.
 shared int mySharedInt;
 synchronized(mutexObj) with (mySharedInt){
      // any references to mySharedInt in this block are implicitly
      // converted to non-shared
 }

It's good convention but still convention - who guarantees that the 
right mutex was locked? The locking protocol is outside of the 
competence of the compiler.

However there are synchronized _class_-es that might play nice with 
shared b/c every access to them is guarded by a built-in mutex.

-- 
Dmitry Olshansky

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 12:51:02 UTC, Dmitry Olshansky wrote:
 On 08-Jul-2015 15:39, wobbles wrote:
 On Wednesday, 8 July 2015 at 12:21:22 UTC, Jonathan M Davis 
 wrote:
 On Wednesday, 8 July 2015 at 11:02:19 UTC, Márcio Martins 
 wrote:
 [...]



 Interesting, so the main pain of using shared is the 
 requirement to cast
 away the shared whenever you want to work on the data in a 
 synchronized
 block.

 Is there any links do you know to the old conversations on what
 solutions are there for this?

 My first thought is using the 'with' keyword.
 shared int mySharedInt;
 synchronized(mutexObj) with (mySharedInt){
      // any references to mySharedInt in this block are implicitly
      // converted to non-shared
 }

 It's good convention but still convention - who guarantees that 
 the right mutex was locked? The locking protocol is outside of 
 the competence of the compiler.

Yes. That's the problem, and it would be great if we could find a 
solution for it. But as annoying as it is, we're still better off 
than what you get with C++.

 However there are synchronized _class_-es that might play nice 
 with shared b/c every access to them is guarded by a built-in 
 mutex.

Yes. If synchronized classes were actually implemented, then that 
would provide a compiler-guaranteed safe way to strip off the outer 
layer of shared, but it only strips off the outer layer (since 
it can't guarantee that stripping off any more would be safe), 
which would severely limit its usefulness, and it's rather clunky 
and verbose to have to declare whole classes just to operate on 
shared data. So, thus far, it's the best that we've come up with 
for safely casting away shared in a compiler-guaranteed way, but 
it's still not all that great, and it's not even implemented.

So, ultimately, even if we finally do get synchronized classes, I 
expect that there will be a fair bit of code that's going to have 
to rely on the programmer to correctly and safely cast away 
shared to operate on data, as unappealing as that may be.

Regardless, until we have synchronized classes or another 
solution which does something similar, the idiom that I described 
is pretty much what we have to do, even if it unfortunately 
relies on the programmer following convention and being careful. 
But it is basically what you have to do in languages like C++ 
anyway except that casting is involved, and the portion of the 
code where it's necessary is a lot clearer thanks to the fact 
that shared objects are explicitly marked as shared and that shared 
doesn't allow you to do much.

- Jonathan M Davis

Jul 08 2015

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 08-Jul-2015 16:57, Jonathan M Davis wrote:
 On Wednesday, 8 July 2015 at 12:51:02 UTC, Dmitry Olshansky wrote:
 On 08-Jul-2015 15:39, wobbles wrote:


 So, ultimately, even if we finally do get synchronized classes, I expect
 that there will be a fair bit of code that's going to have to rely on
 the programmer to correctly and safely cast away shared to operate on
 data, as unappealing as that may be.

I definitely saw synchronized classes in TDPL years ago. They ought to 
work today or am I missing something?


-- 
Dmitry Olshansky

Jul 08 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, 8 July 2015 at 18:05:13 UTC, Dmitry Olshansky wrote:
 I definitely saw synchronized classes in TDPL years ago. They 
 ought to work today or am I missing something?

They're described in TDPL, but they've never been implemented. 
Instead, we have synchronized functions like in Java (which TDPL 
specifically talks about being a bad idea). With a synchronized 
class, _all_ functions in a class would be synchronized, and a 
variety of restrictions are placed on the class so that it can 
implicitly remove the outer layer of shared on its member 
variables inside its member functions. With synchronized 
functions, all they do is lock when you enter them and unlock when 
you exit them and share the mutex with the other synchronized 
functions in the class, but they add no other abilities or 
guarantees.

So, arguably, we should implement synchronized classes as 
described in TDPL, and that would certainly help with the shared 
situation, but for whatever reason, no one has ever implemented 
them.

- Jonathan M Davis

Jul 08 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/8/15 5:20 AM, wobbles wrote:
 After reading the recent "Lessons Learned" article [1], and reading a
 few comments on the thread, there was a mention of using __gshared over
 shared.

I don't see any full answers to this so:

 What exactly is the difference here?

__gshared just puts the data in the global segment, but does NOT alter the 
type. shared DOES alter the type:

__gshared int x1;
shared int x2;

pragma(msg, typeof(x1)); // int
pragma(msg, typeof(x2)); // shared(int)

Why is this important? Because you can overload on shared data to do 
special things (i.e. reject pointers to shared data, or use mutex locks 
around only truly shared data). __gshared is strictly a storage 
class, so &x1 has exactly the same type as the address of a normal 
thread-local int, and code cannot treat the two differently.
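
A small sketch of what that means for overloading, assuming a 
made-up overload set called describe (nothing in Phobos or druntime 
has this name):

```d
import std.stdio : writeln;

__gshared int x1; // global storage, type stays int
shared int x2;    // global storage, type becomes shared(int)

static assert(is(typeof(x1) == int));
static assert(is(typeof(x2) == shared(int)));

// Hypothetical overload set, for illustration only.
string describe(int* p)         { return "unshared int*"; }
string describe(shared(int)* p) { return "shared(int)*"; }

void main()
{
    int local; // an ordinary thread-local variable

    writeln(describe(&local)); // unshared int*
    writeln(describe(&x1));    // unshared int*, same overload as &local
    writeln(describe(&x2));    // shared(int)*
}
```

The callee has no way to tell &x1 apart from &local, which is exactly 
why the compiler can't help you with __gshared data.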

As has been mentioned, __gshared data more accurately represents how C 
treats global data, but technically, you could use C to access shared 
variables. Both would have to be tagged with extern(C).

 Is there plans to 'converge' them at some point?

No, they are different concepts.

-Steve

Jul 08 2015

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 8 July 2015 at 11:20, wobbles via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 After reading the recent "Lessons Learned" article [1], and reading a few
 comments on the thread, there was a mention of using __gshared over shared.

 What exactly is the difference here?
 Are they 2 keywords to do the same thing, or are there specific use cases
 to both?
 Is there plans to 'converge' them at some point?

 [1]
 https://www.reddit.com/r/programming/comments/3cg1r0/lessons_learned_writing_a_filesystem_in_d/

http://forum.dlang.org/post/mailman.739.1431034764.4581.digitalmars-d puremagic.com

Iain

Jul 08 2015
