digitalmars.D - More synchronized ideas

Michel Fortin (9/9) Jun 04 2012 After trying to make sense of the thread "synchronized

Jason House (12/18) Jun 04 2012 The example about current D seems a bit lacking. You change

Michel Fortin (65/74) Jun 04 2012 Valid comment. I thought about creating yet another example

Jason House (84/114) Jun 04 2012 I was thinking something more like this:

Michel Fortin (17/74) Jun 04 2012 I see there are some errors, but I get the general concept.

Steven Schveighoffer (30/35) Jun 04 2012 I like this. But it needs a lot of work.

Michel Fortin (30/73) Jun 04 2012 To the type system, the synchronized variable is thread-local with one

Steven Schveighoffer (22/81) Jun 06 2012 This feels like the most significant part of this proposal, and I think ...

Nathan M. Swan (9/15) Jun 04 2012 This encourages the bad practice (IMO) of shared data. Only a

Nathan M. Swan (6/22) Jun 04 2012 Though on the flip side, part of the D ideology is not

Michel Fortin <michel.fortin michelf.com> writes:

After trying to make sense of the thread "synchronized 
(this[.classinfo]) in druntime and phobos", I had to write my opinion 
on all this somewhere that wouldn't be instantly lost in a bazilion of 
posts. It turned out into something quite elaborate.

<http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 04 2012

"Jason House" <jason.james.house gmail.com> writes:

On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

The example about current D seems a bit lacking. You change 
behavior because it's easy to code a different way. If you really 
want to use synchronized classes, then you should have two of 
them. The map can be in one synchronized class and the counters 
can be in another. the main class in the example would simply 
call methods on the synchronized objects.

That being said, I've never used synchronized classes in my 
multithreaded D1/D2 code. I used Tango's Mutexes and Conditions. 
They're more flexible. That being said, I think the use of a 
synchronized class for your counters is perfectly reasonable and 
would simplify validating the code for correctness.

Jun 04 2012

Michel Fortin <michel.fortin michelf.com> writes:

On 2012-06-04 13:15:57 +0000, "Jason House" <jason.james.house gmail.com> said:

 The example about current D seems a bit lacking. You change behavior 
 because it's easy to code a different way. If you really want to use 
 synchronized classes, then you should have two of them. The map can be 
 in one synchronized class and the counters can be in another. the main 
 class in the example would simply call methods on the synchronized 
 objects.

Valid comment. I thought about creating yet another example 
illustrating that, but I gave up when realizing the silliness of it. I 
mean, yes you can make it work, but at the price of writing a lot of 
boilerplate code just for forwarding things around.

Here's a modified implementation of that dictionary class, wrapping the 
translations AA and the two counters in two distinct classes:

	class Dictionary
	{
	    private SynchronizedStringMap translations;
	    private SynchronizedCounters counters;
	
	    void addWord(string word, string foreignWord)
	    {
	        synchronized(translations)
	        {
	            translations[word] = foreignWord;
	
	            counters.addUnconfirmed();
	        }
	    }

	
	    bool confirmWord(string word, string foreignWord)
	    {
	        synchronized(translations)
	        {
	            string * found = word in translation;
	            if (!found)
	                return false;
	            if (*found != foreignWord)
	                return false;
	        }
	        synchronized(counters)
	        {
	            counters.removeUnconfirmed();
	            counters.addConfirmed();
	        }
	        globalNotifyWordConfirmed(word, foreignWord);
	        return true;
	    }
	}

That done, I still have to implement two other classes: 
SynchronizedStringMap which needs to implement the '[]' and 'in' 
operators, and counters which needs to implement addUnconfirmed, and 
SynchronizedCounters which needs to implement addUnconfirmed, 
removeUnconfirmed, and addConfirmed. All those functions just 
forwarding calls to the wrapped variable(s). And all this for a simple 
toy example.

That said, the other point is that it's too easy to shoot yourself in 
the foot using implicit synchronization. It's easy to forget you still 
have a mutex locked when adding the call to globalNotifyWordConfirmed, 
especially if you add this line of code later in the development 
process and you have forgotten about the "synchronized" keyword far 
away at the very top of the class declaration. And once you have a 
deadlock and you've identified the culprit, to fix the bug you need to 
split everything that needs to be synchronized into a separate class, a 
tiresome and bug-prone process, just because you can't opt-out of the 
implicit synchronization. Call me nuts if you want, but I think this is 
awful.


 That being said, I've never used synchronized classes in my 
 multithreaded D1/D2 code. I used Tango's Mutexes and Conditions. 
 They're more flexible.

I'd say it's a good choice. How does it work with shared variables in 
D2, or are you just ignoring the type system?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 04 2012

"Jason House" <jason.james.house gmail.com> writes:

On Monday, 4 June 2012 at 17:40:38 UTC, Michel Fortin wrote:
 On 2012-06-04 13:15:57 +0000, "Jason House" 
 <jason.james.house gmail.com> said:

 If you really want to use synchronized classes, then you should
 have two of them.

 Valid comment. I thought about creating yet another example 
 illustrating that, but I gave up when realizing the silliness 
 of it. I mean, yes you can make it work, but at the price of 
 writing a lot of boilerplate code just for forwarding things 
 around.

 Here's a modified implementation of that dictionary class, 
 wrapping the translations AA and the two counters in two 
 distinct classes:

 ...

 That said, the other point is that it's too easy to shoot 
 yourself in the foot using implicit synchronization. It's easy 
 to forget you still have a mutex locked when adding the call to 
 globalNotifyWordConfirmed, especially if you add this line of 
 code later in the development process and you have forgotten 
 about the "synchronized" keyword far away at the very top of 
 the class declaration. And once you have a deadlock and you've 
 identified the culprit, to fix the bug you need to split 
 everything that needs to be synchronized into a separate class, 
 a tiresome and bug-prone process, just because you can't 
 opt-out of the implicit synchronization. Call me nuts if you 
 want, but I think this is awful.


I was thinking something more like this:

shared class Dictionary
{
   private SynchronizedCounters counters;
   private SynchronizedStringMap translations(counters);

   void addWord(string word, string foreignWord) shared
   {
     // synchronized opIndexAssign call
     translations[word] = foreignWord;
   }
   bool confirmWord(string word, string foreignWord) shared
   {
     // synchronized opIndex call
     string candidate = translations[word];
     if (candidate != foreignWord)
       return false;
     counters.confirmOneWord();
     globalNotifyWordConfirmed(word, foreignWord);
     return true;
   }
}

// All counter operations are embedded here
// No need to review for any unsafe data usage,
// it's all here (future uses will add new methods)
synchronized class SynchronizedCounters
{
   private int confirmed, unconfirmed;
   void addUnconfirmed() { ++unconfirmed; }
   void confirmOneWord() { --unconfirmed; ++confirmed; }
}

// Similar concept, but embeds SynchronizedCounters
// I don't like that, but it's the only way to fully
// embrace D synchronized classes for toy example
synchronized class SynchronizedStringMap
{
   private string[string] translations;
   private SynchronizedCounters counters;
   this(SynchronizedCounters _counters)
     : counters(_counters) {}
   void opIndexAssign(string word, string foreignWord)
   {
     translations[word] = foreignWord;
     counters.addUnconfirmed();
   }
   string opIndex(string word)
   {
     shared string *found = word in translations;
     if (found)
       return *found;
     else
       return null;
   }
}




 That being said, I've never used synchronized classes in my 
 multithreaded D1/D2 code. I used Tango's Mutexes and 
 Conditions. They're more flexible.

 I'd say it's a good choice. How does it work with shared 
 variables in D2, or are you just ignoring the type system?


I did not find my mutex based code, but did find a lock free 
broadcast queue.  It was written to be shared-aware.  Here's the 
receive class:
/// Receives delegates from the specified sender. Never blocks.
class receiver{
     private target parent;
     private shared sender source;
     private int nextMessageId = 1;
     this(target t, shared sender s){ parent = t; source = s; }
     bool receive(){
         if (source.receive(nextMessageId, parent)){
             nextMessageId++;
             return true;
         }
         return false;
     }
}

And here was the receive method in the sender class:
private bool receive(int messageId, target t) shared
{
     if (pending == 0 || id < messageId)
         return false;
     msg(t);
     atomicDecrement!(msync.raw)(pending);
     return true;
}

atomicDecrement in Tango was a template.  Looking at the commit 
logs, it looks like I had to hack at isValidNumericType, but I 
did not have to change the Atomic.d (beyond removing the volatile 
keywords and fixing incorrect assembly)

Jun 04 2012

Michel Fortin <michel.fortin michelf.com> writes:

On 2012-06-05 01:58:04 +0000, "Jason House" <jason.james.house gmail.com> said:

 I was thinking something more like this:
 
 shared class Dictionary
 {
    private SynchronizedCounters counters;
    private SynchronizedStringMap translations(counters);
 
    void addWord(string word, string foreignWord) shared
    {
      // synchronized opIndexAssign call
      translations[word] = foreignWord;
    }
    bool confirmWord(string word, string foreignWord) shared
    {
      // synchronized opIndex call
      string candidate = translations[word];
      if (candidate != foreignWord)
        return false;
      counters.confirmOneWord();
      globalNotifyWordConfirmed(word, foreignWord);
      return true;
    }
 }
 
 // All counter operations are embedded here
 // No need to review for any unsafe data usage,
 // it's all here (future uses will add new methods)
 synchronized class SynchronizedCounters
 {
    private int confirmed, unconfirmed;
    void addUnconfirmed() { ++unconfirmed; }
    void confirmOneWord() { --unconfirmed; ++confirmed; }
 }
 
 // Similar concept, but embeds SynchronizedCounters
 // I don't like that, but it's the only way to fully
 // embrace D synchronized classes for toy example
 synchronized class SynchronizedStringMap
 {
    private string[string] translations;
    private SynchronizedCounters counters;
    this(SynchronizedCounters _counters)
      : counters(_counters) {}
    void opIndexAssign(string word, string foreignWord)
    {
      translations[word] = foreignWord;
      counters.addUnconfirmed();
    }
    string opIndex(string word)
    {
      shared string *found = word in translations;
      if (found)
        return *found;
      else
        return null;
    }
 }

I see there are some errors, but I get the general concept.

And I can see why you don't like that. It's no longer a 
"SynchronizedStringMap" once it contains the counters: it's just a 
shell encapsulating the parts of Dictionary that needs to be run while 
the mutex is locked. Because you can't lock Dictionary since it has 
callbacks, you have to split code that would normally be together and 
put the synchronized portion in another class just to fit the 
constrains of the synchronized class concept. I think this illustrates 
that associating mutexes to variables is a better approach.

Beside, I wouldn't want to be the one who has to teach about writing 
concurrent code using this paradigm. Even the equivalent C++ code is 
easier to read and explain that this; shouldn't that raise a red flag?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 04 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 After trying to make sense of the thread "synchronized  
 (this[.classinfo]) in druntime and phobos", I had to write my opinion on  
 all this somewhere that wouldn't be instantly lost in a bazilion of  
 posts. It turned out into something quite elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

I like this.  But it needs a lot of work.

A few comments:

1. This does not handle shared *at all*.  Presumably, there is no reason  
to lock unshared data, so this has to be handled somewhere.  If you say  
"synchronized implies shared", well, then how do you have a shared int  
inside an unshared class?  My instinct is that all the methods that need  
to used synchronized need to be declared shared (meaning the whole class  
data is shared).  But that sucks, because what if you have a thread-local  
instance?

I have an idea to solve this.  Since the mutexes are implicit, we can  
declare space for them, but only allocate them when the class instance is  
shared (allocated on construction).  Then when synchronized goes to lock  
them, if they are null, do nothing.

But what if some data is not marked synchronized?

I can see why Bartosz had such trouble creating a sharing system in a  
simple manner...

2. As far as determining a mutex to protect multiple items of data, what  
about:

synchronized(symbolname) int x, int y;

or

synchronized(symbolname)
{
    int x;
    int y;
}

where you cannot do synchronized(x) or synchronized(y), and cannot read or  
write x or y without doing synchronized(symbolname).

-Steve

Jun 04 2012

Michel Fortin <michel.fortin michelf.com> writes:

On 2012-06-04 18:22:22 +0000, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:
 
 After trying to make sense of the thread "synchronized  
 (this[.classinfo]) in druntime and phobos", I had to write my opinion 
 on  all this somewhere that wouldn't be instantly lost in a bazilion of 
  posts. It turned out into something quite elaborate.
 
 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

 
 I like this.  But it needs a lot of work.
 
 A few comments:
 
 1. This does not handle shared *at all*.  Presumably, there is no 
 reason  to lock unshared data, so this has to be handled somewhere.  If 
 you say  "synchronized implies shared", well, then how do you have a 
 shared int  inside an unshared class? My instinct is that all the 
 methods that need  to used synchronized need to be declared shared 
 (meaning the whole class  data is shared).  But that sucks, because 
 what if you have a thread-local  instance?

To the type system, the synchronized variable is thread-local with one 
restriction: you can only access it inside a synchronized block. So 
without the synchronized block you can't read or write to it, and you 
can't take its address. Inside the synchronized block, any expression 
making use of that variable is tainted by the current scope and must be 
pure (weakly). Except if the only result of an expression contains no 
indirection (is entirely copied), or is immutable or shared (no 
synchronization needed), then the expression does not get tainted and 
its result can be sent anywhere.

Note that this taint thing is only done locally inside the synchronized 
block: called functions are simply treated as an expression with inputs 
and outputs to check whether they are tainted or not.


 I have an idea to solve this.  Since the mutexes are implicit, we can  
 declare space for them, but only allocate them when the class instance 
 is  shared (allocated on construction).  Then when synchronized goes to 
 lock  them, if they are null, do nothing.

Or they could simply work like the monitors objects currently have.

As I wrote, I think we need support for shared mutexes too (integrated 
with const if you will). Ideally, there'd be a way to choose your own 
mutex implementations, perhaps with "synchronized(Spinlock) int x".


 But what if some data is not marked synchronized?

You might actually want to restrict synchronized variables to shared 
classes and shared structs. In that case, variables not marked as 
synchronized will be shared and accessible using atomic operations.


 I can see why Bartosz had such trouble creating a sharing system in a  
 simple manner...

:-)


 2. As far as determining a mutex to protect multiple items of data, 
 what  about:
 
 synchronized(symbolname) int x, int y;
 
 or
 
 synchronized(symbolname)
 {
     int x;
     int y;
 }
 
 where you cannot do synchronized(x) or synchronized(y), and cannot read 
 or  write x or y without doing synchronized(symbolname).

Or we could be less original: use a struct. It's just a minor cosmetic 
problem actually.

	struct XY { int x, y; }
	synchronized XY xy;


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 04 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 04 Jun 2012 18:26:50 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2012-06-04 18:22:22 +0000, "Steven Schveighoffer"  
 <schveiguy yahoo.com> said:

 On Mon, 04 Jun 2012 07:17:45 -0400, Michel Fortin   
 <michel.fortin michelf.com> wrote:

 After trying to make sense of the thread "synchronized   
 (this[.classinfo]) in druntime and phobos", I had to write my opinion  
 on  all this somewhere that wouldn't be instantly lost in a bazilion  
 of  posts. It turned out into something quite elaborate.
  <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

  I like this.  But it needs a lot of work.
  A few comments:
  1. This does not handle shared *at all*.  Presumably, there is no  
 reason  to lock unshared data, so this has to be handled somewhere.  If  
 you say  "synchronized implies shared", well, then how do you have a  
 shared int  inside an unshared class? My instinct is that all the  
 methods that need  to used synchronized need to be declared shared  
 (meaning the whole class  data is shared).  But that sucks, because  
 what if you have a thread-local  instance?

 To the type system, the synchronized variable is thread-local with one  
 restriction: you can only access it inside a synchronized block. So  
 without the synchronized block you can't read or write to it, and you  
 can't take its address. Inside the synchronized block, any expression  
 making use of that variable is tainted by the current scope and must be  
 pure (weakly). Except if the only result of an expression contains no  
 indirection (is entirely copied), or is immutable or shared (no  
 synchronization needed), then the expression does not get tainted and  
 its result can be sent anywhere.

 Note that this taint thing is only done locally inside the synchronized  
 block: called functions are simply treated as an expression with inputs  
 and outputs to check whether they are tainted or not.

This feels like the most significant part of this proposal, and I think is  
definitely the right track.

 I have an idea to solve this.  Since the mutexes are implicit, we can   
 declare space for them, but only allocate them when the class instance  
 is  shared (allocated on construction).  Then when synchronized goes to  
 lock  them, if they are null, do nothing.

 Or they could simply work like the monitors objects currently have.

Some have stated there is a performance hit for lazy initialization of  
monitors.  I think we should try and fix that.

 As I wrote, I think we need support for shared mutexes too (integrated  
 with const if you will).

I don't disagree with that approach, except the usage of shared as a  
keyword to mean something completely different than it does now.

 Ideally, there'd be a way to choose your own mutex implementations,  
 perhaps with "synchronized(Spinlock) int x".

I like this better, perhaps a synchronized(ReadWriteMutex) would suffice  
instead of synchronized(shared)

 But what if some data is not marked synchronized?

 You might actually want to restrict synchronized variables to shared  
 classes and shared structs. In that case, variables not marked as  
 synchronized will be shared and accessible using atomic operations.

I kind of like the idea of having to implement a class/structure once, and  
then being able to use it penalty-free in a thread-local context.

Perhaps that's asking too much -- a class that is meant to be shared may  
have not a lot in common with one that was meant to be thread-local.  It  
sure feels like I should be able to (for instance) implement a container  
that can be shared or unshared, and change small pieces of it, or add a  
few synchronized calls and get it working.

 2. As far as determining a mutex to protect multiple items of data,  
 what  about:
  synchronized(symbolname) int x, int y;
  or
  synchronized(symbolname)
 {
     int x;
     int y;
 }
  where you cannot do synchronized(x) or synchronized(y), and cannot  
 read or  write x or y without doing synchronized(symbolname).

 Or we could be less original: use a struct. It's just a minor cosmetic  
 problem actually.

 	struct XY { int x, y; }
 	synchronized XY xy;

True.  But sometimes the cosmetic things make all the difference!  You  
could implement synchronized with structs also instead of having a keyword.

One thing I don't like about this is the xy noise when referring to x and  
y.

-Steve

Jun 06 2012

"Nathan M. Swan" <nathanmswan gmail.com> writes:

On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

This encourages the bad practice (IMO) of shared data. Only a 
single thread should have the dictionary, with an AddWordMsg 
struct and a ConfirmWordMsg struct.

Using a message passing approach, the client thread sends an 
AddWordMsg and continues while the word is added concurrently, 
making it more efficient.

My non-expert opinion,
NMS

Jun 04 2012

"Nathan M. Swan" <nathanmswan gmail.com> writes:

On Tuesday, 5 June 2012 at 05:14:36 UTC, Nathan M. Swan wrote:
 On Monday, 4 June 2012 at 11:17:45 UTC, Michel Fortin wrote:
 After trying to make sense of the thread "synchronized 
 (this[.classinfo]) in druntime and phobos", I had to write my 
 opinion on all this somewhere that wouldn't be instantly lost 
 in a bazilion of posts. It turned out into something quite 
 elaborate.

 <http://michelf.com/weblog/2012/mutex-synchonization-in-d/>

 This encourages the bad practice (IMO) of shared data. Only a 
 single thread should have the dictionary, with an AddWordMsg 
 struct and a ConfirmWordMsg struct.

 Using a message passing approach, the client thread sends an 
 AddWordMsg and continues while the word is added concurrently, 
 making it more efficient.

 My non-expert opinion,
 NMS

Though on the flip side, part of the D ideology is not 
restricting the programmer to a single paradigm. So maybe I like 
the idea :)

Data sharing should still be discouraged.

NMS

Jun 04 2012

D Programming

C/C++ Programming

Other

digitalmars.D - More synchronized ideas