www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - More synchronized ideas

reply Michel Fortin <michel.fortin michelf.com> writes:
After trying to make sense of the thread "synchronized 
(this[.classinfo]) in druntime and phobos", I had to write my opinion 
on all this somewhere that wouldn't be instantly lost in a bazilion of 
posts. It turned out into something quite elaborate.

<http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 04 2012
next sibling parent reply "Jason House" <jason.james.house gmail.com> writes:
On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
The example about current D seems a bit lacking. You change behavior because it's easy to code a different way. If you really want to use synchronized classes, then you should have two of them. The map can be in one synchronized class and the counters can be in another. the main class in the example would simply call methods on the synchronized objects. That being said, I've never used synchronized classes in my multithreaded D1/D2 code. I used Tango's Mutexes and Conditions. They're more flexible. That being said, I think the use of a synchronized class for your counters is perfectly reasonable and would simplify validating the code for correctness.
Jun 04 2012
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2012-06-04 13:15:57 +0000, "Jason House" <jason.james.house gmail.com> said:

 The example about current D seems a bit lacking. You change behavior 
 because it's easy to code a different way. If you really want to use 
 synchronized classes, then you should have two of them. The map can be 
 in one synchronized class and the counters can be in another. the main 
 class in the example would simply call methods on the synchronized 
 objects.
Valid comment. I thought about creating yet another example illustrating that, but I gave up when realizing the silliness of it. I mean, yes you can make it work, but at the price of writing a lot of boilerplate code just for forwarding things around. Here's a modified implementation of that dictionary class, wrapping the translations AA and the two counters in two distinct classes: class Dictionary { private SynchronizedStringMap translations; private SynchronizedCounters counters; void addWord(string word, string foreignWord) { synchronized(translations) { translations[word] = foreignWord; counters.addUnconfirmed(); } } bool confirmWord(string word, string foreignWord) { synchronized(translations) { string * found = word in translation; if (!found) return false; if (*found != foreignWord) return false; } synchronized(counters) { counters.removeUnconfirmed(); counters.addConfirmed(); } globalNotifyWordConfirmed(word, foreignWord); return true; } } That done, I still have to implement two other classes: SynchronizedStringMap which needs to implement the '[]' and 'in' operators, and counters which needs to implement addUnconfirmed, and SynchronizedCounters which needs to implement addUnconfirmed, removeUnconfirmed, and addConfirmed. All those functions just forwarding calls to the wrapped variable(s). And all this for a simple toy example. That said, the other point is that it's too easy to shoot yourself in the foot using implicit synchronization. It's easy to forget you still have a mutex locked when adding the call to globalNotifyWordConfirmed, especially if you add this line of code later in the development process and you have forgotten about the "synchronized" keyword far away at the very top of the class declaration. And once you have a deadlock and you've identified the culprit, to fix the bug you need to split everything that needs to be synchronized into a separate class, a tiresome and bug-prone process, just because you can't opt-out of the implicit synchronization. Call me nuts if you want, but I think this is awful.
 That being said, I've never used synchronized classes in my 
 multithreaded D1/D2 code. I used Tango's Mutexes and Conditions. 
 They're more flexible.
I'd say it's a good choice. How does it work with shared variables in D2, or are you just ignoring the type system? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jun 04 2012
parent reply "Jason House" <jason.james.house gmail.com> writes:
On Monday, 4 June 2012 at 17:40:38 UTC, Michel Fortin wrote:
 On 2012-06-04 13:15:57 +0000, "Jason House" 
 <jason.james.house gmail.com> said:

 If you really want to use synchronized classes, then you should
 have two of them.
Valid comment. I thought about creating yet another example illustrating that, but I gave up when realizing the silliness of it. I mean, yes you can make it work, but at the price of writing a lot of boilerplate code just for forwarding things around. Here's a modified implementation of that dictionary class, wrapping the translations AA and the two counters in two distinct classes: ... That said, the other point is that it's too easy to shoot yourself in the foot using implicit synchronization. It's easy to forget you still have a mutex locked when adding the call to globalNotifyWordConfirmed, especially if you add this line of code later in the development process and you have forgotten about the "synchronized" keyword far away at the very top of the class declaration. And once you have a deadlock and you've identified the culprit, to fix the bug you need to split everything that needs to be synchronized into a separate class, a tiresome and bug-prone process, just because you can't opt-out of the implicit synchronization. Call me nuts if you want, but I think this is awful.
I was thinking something more like this: shared class Dictionary { private SynchronizedCounters counters; private SynchronizedStringMap translations(counters); void addWord(string word, string foreignWord) shared { // synchronized opIndexAssign call translations[word] = foreignWord; } bool confirmWord(string word, string foreignWord) shared { // synchronized opIndex call string candidate = translations[word]; if (candidate != foreignWord) return false; counters.confirmOneWord(); globalNotifyWordConfirmed(word, foreignWord); return true; } } // All counter operations are embedded here // No need to review for any unsafe data usage, // it's all here (future uses will add new methods) synchronized class SynchronizedCounters { private int confirmed, unconfirmed; void addUnconfirmed() { ++unconfirmed; } void confirmOneWord() { --unconfirmed; ++confirmed; } } // Similar concept, but embeds SynchronizedCounters // I don't like that, but it's the only way to fully // embrace D synchronized classes for toy example synchronized class SynchronizedStringMap { private string[string] translations; private SynchronizedCounters counters; this(SynchronizedCounters _counters) : counters(_counters) {} void opIndexAssign(string word, string foreignWord) { translations[word] = foreignWord; counters.addUnconfirmed(); } string opIndex(string word) { shared string *found = word in translations; if (found) return *found; else return null; } }
 That being said, I've never used synchronized classes in my 
 multithreaded D1/D2 code. I used Tango's Mutexes and 
 Conditions. They're more flexible.
I'd say it's a good choice. How does it work with shared variables in D2, or are you just ignoring the type system?
I did not find my mutex based code, but did find a lock free broadcast queue. It was written to be shared-aware. Here's the receive class: /// Receives delegates from the specified sender. Never blocks. class receiver{ private target parent; private shared sender source; private int nextMessageId = 1; this(target t, shared sender s){ parent = t; source = s; } bool receive(){ if (source.receive(nextMessageId, parent)){ nextMessageId++; return true; } return false; } } And here was the receive method in the sender class: private bool receive(int messageId, target t) shared { if (pending == 0 || id < messageId) return false; msg(t); atomicDecrement!(msync.raw)(pending); return true; } atomicDecrement in Tango was a template. Looking at the commit logs, it looks like I had to hack at isValidNumericType, but I did not have to change the Atomic.d (beyond removing the volatile keywords and fixing incorrect assembly)
Jun 04 2012
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2012-06-05 01:58:04 +0000, "Jason House" <jason.james.house gmail.com> said:

 I was thinking something more like this:
 
 shared class Dictionary
 {
    private SynchronizedCounters counters;
    private SynchronizedStringMap translations(counters);
 
    void addWord(string word, string foreignWord) shared
    {
      // synchronized opIndexAssign call
      translations[word] = foreignWord;
    }
    bool confirmWord(string word, string foreignWord) shared
    {
      // synchronized opIndex call
      string candidate = translations[word];
      if (candidate != foreignWord)
        return false;
      counters.confirmOneWord();
      globalNotifyWordConfirmed(word, foreignWord);
      return true;
    }
 }
 
 // All counter operations are embedded here
 // No need to review for any unsafe data usage,
 // it's all here (future uses will add new methods)
 synchronized class SynchronizedCounters
 {
    private int confirmed, unconfirmed;
    void addUnconfirmed() { ++unconfirmed; }
    void confirmOneWord() { --unconfirmed; ++confirmed; }
 }
 
 // Similar concept, but embeds SynchronizedCounters
 // I don't like that, but it's the only way to fully
 // embrace D synchronized classes for toy example
 synchronized class SynchronizedStringMap
 {
    private string[string] translations;
    private SynchronizedCounters counters;
    this(SynchronizedCounters _counters)
      : counters(_counters) {}
    void opIndexAssign(string word, string foreignWord)
    {
      translations[word] = foreignWord;
      counters.addUnconfirmed();
    }
    string opIndex(string word)
    {
      shared string *found = word in translations;
      if (found)
        return *found;
      else
        return null;
    }
 }
I see there are some errors, but I get the general concept. And I can see why you don't like that. It's no longer a "SynchronizedStringMap" once it contains the counters: it's just a shell encapsulating the parts of Dictionary that needs to be run while the mutex is locked. Because you can't lock Dictionary since it has callbacks, you have to split code that would normally be together and put the synchronized portion in another class just to fit the constrains of the synchronized class concept. I think this illustrates that associating mutexes to variables is a better approach. Beside, I wouldn't want to be the one who has to teach about writing concurrent code using this paradigm. Even the equivalent C++ code is easier to read and explain that this; shouldn't that raise a red flag? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jun 04 2012
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 After trying to make sense of the thread "synchronized  
 (this[.classinfo]) in druntime and phobos", I had to write my opinion on  
 all this somewhere that wouldn't be instantly lost in a bazilion of  
 posts. It turned out into something quite elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
I like this. But it needs a lot of work. A few comments: 1. This does not handle shared *at all*. Presumably, there is no reason to lock unshared data, so this has to be handled somewhere. If you say "synchronized implies shared", well, then how do you have a shared int inside an unshared class? My instinct is that all the methods that need to used synchronized need to be declared shared (meaning the whole class data is shared). But that sucks, because what if you have a thread-local instance? I have an idea to solve this. Since the mutexes are implicit, we can declare space for them, but only allocate them when the class instance is shared (allocated on construction). Then when synchronized goes to lock them, if they are null, do nothing. But what if some data is not marked synchronized? I can see why Bartosz had such trouble creating a sharing system in a simple manner... 2. As far as determining a mutex to protect multiple items of data, what about: synchronized(symbolname) int x, int y; or synchronized(symbolname) { int x; int y; } where you cannot do synchronized(x) or synchronized(y), and cannot read or write x or y without doing synchronized(symbolname). -Steve
Jun 04 2012
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2012-06-04 18:22:22 +0000, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:
 
 After trying to make sense of the thread "synchronized  
 (this[.classinfo]) in druntime and phobos", I had to write my opinion 
 on  all this somewhere that wouldn't be instantly lost in a bazilion of 
  posts. It turned out into something quite elaborate.
 
 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
I like this. But it needs a lot of work. A few comments: 1. This does not handle shared *at all*. Presumably, there is no reason to lock unshared data, so this has to be handled somewhere. If you say "synchronized implies shared", well, then how do you have a shared int inside an unshared class? My instinct is that all the methods that need to used synchronized need to be declared shared (meaning the whole class data is shared). But that sucks, because what if you have a thread-local instance?
To the type system, the synchronized variable is thread-local with one restriction: you can only access it inside a synchronized block. So without the synchronized block you can't read or write to it, and you can't take its address. Inside the synchronized block, any expression making use of that variable is tainted by the current scope and must be pure (weakly). Except if the only result of an expression contains no indirection (is entirely copied), or is immutable or shared (no synchronization needed), then the expression does not get tainted and its result can be sent anywhere. Note that this taint thing is only done locally inside the synchronized block: called functions are simply treated as an expression with inputs and outputs to check whether they are tainted or not.
 I have an idea to solve this.  Since the mutexes are implicit, we can  
 declare space for them, but only allocate them when the class instance 
 is  shared (allocated on construction).  Then when synchronized goes to 
 lock  them, if they are null, do nothing.
Or they could simply work like the monitors objects currently have. As I wrote, I think we need support for shared mutexes too (integrated with const if you will). Ideally, there'd be a way to choose your own mutex implementations, perhaps with "synchronized(Spinlock) int x".
 But what if some data is not marked synchronized?
You might actually want to restrict synchronized variables to shared classes and shared structs. In that case, variables not marked as synchronized will be shared and accessible using atomic operations.
 I can see why Bartosz had such trouble creating a sharing system in a  
 simple manner...
:-)
 2. As far as determining a mutex to protect multiple items of data, 
 what  about:
 
 synchronized(symbolname) int x, int y;
 
 or
 
 synchronized(symbolname)
 {
     int x;
     int y;
 }
 
 where you cannot do synchronized(x) or synchronized(y), and cannot read 
 or  write x or y without doing synchronized(symbolname).
Or we could be less original: use a struct. It's just a minor cosmetic problem actually. struct XY { int x, y; } synchronized XY xy; -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jun 04 2012
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jun 2012 18:26:50 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2012-06-04 18:22:22 +0000, "Steven Schveighoffer"  
 <schveiguy yahoo.com> said:

 On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin   
 <michel.fortin michelf.com> wrote:

 After trying to make sense of the thread "synchronized   
 (this[.classinfo]) in druntime and phobos", I had to write my opinion  
 on  all this somewhere that wouldn't be instantly lost in a bazilion  
 of  posts. It turned out into something quite elaborate.
  <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
I like this. But it needs a lot of work. A few comments: 1. This does not handle shared *at all*. Presumably, there is no reason to lock unshared data, so this has to be handled somewhere. If you say "synchronized implies shared", well, then how do you have a shared int inside an unshared class? My instinct is that all the methods that need to used synchronized need to be declared shared (meaning the whole class data is shared). But that sucks, because what if you have a thread-local instance?
To the type system, the synchronized variable is thread-local with one restriction: you can only access it inside a synchronized block. So without the synchronized block you can't read or write to it, and you can't take its address. Inside the synchronized block, any expression making use of that variable is tainted by the current scope and must be pure (weakly). Except if the only result of an expression contains no indirection (is entirely copied), or is immutable or shared (no synchronization needed), then the expression does not get tainted and its result can be sent anywhere. Note that this taint thing is only done locally inside the synchronized block: called functions are simply treated as an expression with inputs and outputs to check whether they are tainted or not.
This feels like the most significant part of this proposal, and I think is definitely the right track.
 I have an idea to solve this.  Since the mutexes are implicit, we can   
 declare space for them, but only allocate them when the class instance  
 is  shared (allocated on construction).  Then when synchronized goes to  
 lock  them, if they are null, do nothing.
Or they could simply work like the monitors objects currently have.
Some have stated there is a performance hit for lazy initialization of monitors. I think we should try and fix that.
 As I wrote, I think we need support for shared mutexes too (integrated  
 with const if you will).
I don't disagree with that approach, except the usage of shared as a keyword to mean something completely different than it does now.
 Ideally, there'd be a way to choose your own mutex implementations,  
 perhaps with "synchronized(Spinlock) int x".
I like this better, perhaps a synchronized(ReadWriteMutex) would suffice instead of synchronized(shared)
 But what if some data is not marked synchronized?
You might actually want to restrict synchronized variables to shared classes and shared structs. In that case, variables not marked as synchronized will be shared and accessible using atomic operations.
I kind of like the idea of having to implement a class/structure once, and then being able to use it penalty-free in a thread-local context. Perhaps that's asking too much -- a class that is meant to be shared may have not a lot in common with one that was meant to be thread-local. It sure feels like I should be able to (for instance) implement a container that can be shared or unshared, and change small pieces of it, or add a few synchronized calls and get it working.
 2. As far as determining a mutex to protect multiple items of data,  
 what  about:
  synchronized(symbolname) int x, int y;
  or
  synchronized(symbolname)
 {
     int x;
     int y;
 }
  where you cannot do synchronized(x) or synchronized(y), and cannot  
 read or  write x or y without doing synchronized(symbolname).
Or we could be less original: use a struct. It's just a minor cosmetic problem actually. struct XY { int x, y; } synchronized XY xy;
True. But sometimes the cosmetic things make all the difference! You could implement synchronized with structs also instead of having a keyword. One thing I don't like about this is the xy noise when referring to x and y. -Steve
Jun 06 2012
prev sibling parent reply "Nathan M. Swan" <nathanmswan gmail.com> writes:
On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
This encourages the bad practice (IMO) of shared data. Only a single thread should have the dictionary, with an AddWordMsg struct and a ConfirmWordMsg struct. Using a message passing approach, the client thread sends an AddWordMsg and continues while the word is added concurrently, making it more efficient. My non-expert opinion, NMS
Jun 04 2012
parent "Nathan M. Swan" <nathanmswan gmail.com> writes:
On Tuesday, 5 June 2012 at 05:14:36 UTC, Nathan M. Swan wrote:
 On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>
This encourages the bad practice (IMO) of shared data. Only a single thread should have the dictionary, with an AddWordMsg struct and a ConfirmWordMsg struct. Using a message passing approach, the client thread sends an AddWordMsg and continues while the word is added concurrently, making it more efficient. My non-expert opinion, NMS
Though on the flip side, part of the D ideology is not restricting the programmer to a single paradigm. So maybe I like the idea :) Data sharing should still be discouraged. NMS
Jun 04 2012