digitalmars.D - Re: [std.concurrency] prioritySend is 1000 times slower than send?

Sean Kelly <sean invisibleduck.org> writes:

osa Wrote:

 I started using std.concurrency in some projects and overall it feels 
 like a solid (albeit minimalistic) design. However, current 
 implementation has some issues. For example, I've noticed that using 
 prioritySend slows everything considerably.


Thanks for this.  I can tell you that prioritySend performs an extra allocation
to account for a design requirement (if a priority message isn't received it's
thrown as PriorityMessage!(T), and this exception is generated when the send
occurs, since static type info isn't available at the receive side when it's
needed for this).  I had originally thought that the difference was just more
garbage collections, but calling GC.disable only increases the number of
priority messages sent by about 1000.  I'll have to look at the code to see if
I can figure out what's going on.

Sep 30 2010

Sean Kelly <sean invisibleduck.org> writes:

Sean Kelly Wrote:

 osa Wrote:
 
 I started using std.concurrency in some projects and overall it feels 
 like a solid (albeit minimalistic) design. However, current 
 implementation has some issues. For example, I've noticed that using 
 prioritySend slows everything considerably.


 Thanks for this.  I can tell you that prioritySend performs an extra
allocation to account for a design requirement (if a priority message isn't
received it's thrown as PriorityMessage!(T), and this exception is generated
when the send occurs, since static type info isn't available at the receive
side when it's needed for this).  I had originally thought that the difference
was just more garbage collections, but calling GC.disable only increases the
number of priority messages sent by about 1000.  I'll have to look at the code
to see if I can figure out what's going on.


Okay, I've fixed one issue with priority messages that, aside from broken
behavior, has increased performance somewhat.  Here are the timings:

Benchmark: 5944400 iterations in 5 seconds (1.18888e+06/second) -- built without -version=priority
Benchmark: 4900 iterations in 5.119 seconds (957.218/second) -- built with -version=priority, before fix
Benchmark: 39700 iterations in 5.001 seconds (7938.41/second) -- built with -version=priority, after fix
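
The benchmark source is not shown in this part of the thread. As a rough sketch only, a harness that switches between send and prioritySend with -version=priority might look like the following. The names consumer and runBenchmark, the batch size of 100, and the one-second run time are assumptions made for illustration, not the code that produced the timings above.

```d
import core.time : Duration, MonoTime, seconds;
import std.concurrency;
import std.stdio : writefln;

// Hypothetical receiver: counts non-negative ints and stops on a negative one.
void consumer()
{
    size_t count = 0;
    bool running = true;
    while (running)
    {
        receive((int n) {
            if (n < 0)
                running = false;
            else
                ++count;
        });
    }
    // Report how many messages were handled back to the spawning thread.
    ownerTid.send(count);
}

// Sends messages for roughly `limit` and returns how many the consumer counted.
size_t runBenchmark(Duration limit)
{
    auto tid = spawn(&consumer);
    auto start = MonoTime.currTime;
    while (MonoTime.currTime - start < limit)
    {
        // Send in batches so the clock is not read for every message.
        foreach (i; 0 .. 100)
        {
            version (priority)
                prioritySend(tid, 1);
            else
                send(tid, 1);
        }
    }
    send(tid, -1); // ordinary message used as a stop sentinel
    return receiveOnly!size_t();
}

void main()
{
    auto limit = 1.seconds;
    auto handled = runBenchmark(limit);
    writefln("Benchmark: %s iterations in %s seconds",
             handled, limit.total!"seconds");
}
```

Building once with plain dmd and once with -version=priority added gives the two variants being compared.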

The remaining issue has to do with the fact that the exception is constructed
when the send is issued and when this exception is constructed a stack trace is
generated as well.  I'll have to modify Throwable so that derived classes can
specify that no trace be generated.  That or eliminate constructing the
exception at the send site and change how that exception is represented.

Sep 30 2010

osa <osa aso.osa> writes:

On 09/30/2010 01:45 PM, Sean Kelly wrote:
 Benchmark: 5944400 iterations in 5 seconds (1.18888e+06/second) -- built without -version=priority
 Benchmark: 4900 iterations in 5.119 seconds (957.218/second) -- built with -version=priority, before fix
 Benchmark: 39700 iterations in 5.001 seconds (7938.41/second) -- built with -version=priority, after fix


Seems to be about an order of magnitude improvement. Not too bad.

 The remaining issue has to do with the fact that the exception is constructed
when the send is issued and when this exception is constructed a stack trace is
generated as well.  I'll have to modify Throwable so that derived classes can
specify that no trace be generated.  That or eliminate constructing the
exception at the send site and change how that exception is represented.


I've also thought about switching to 'send' if the receiver queue is 
empty, but there is no way in the std.concurrency API to check for that. Is 
there any serious issue with adding such a method? I understand that in a 
multi-threaded environment a queue reported as empty by an 'isEmpty' call may 
become non-empty before that fact is used, but in some situations an 
approximate result (meaning empty or almost empty) is fine.

Sep 30 2010

Sean Kelly <sean invisibleduck.org> writes:

osa Wrote:
 
 I've also thought about switching to 'send' if the receiver queue is 
 empty, but there is no way in the std.concurrency API to check for that. Is 
 there any serious issue with adding such a method? I understand that in a 
 multi-threaded environment a queue reported as empty by an 'isEmpty' call may 
 become non-empty before that fact is used, but in some situations an 
 approximate result (meaning empty or almost empty) is fine.


The current API is designed to apply to in-process and out-of-process
messaging, so a function like that doesn't really fit.  I think this is really
more of just a tuning issue.  In fact, having the PriorityMessage exception be
a template isn't feasible for out-of-process messaging, so this is an issue
that has to be addressed at some point anyway.  I think I'm going to change
the exception so that it is generated within receive() only if needed, have it
contain a Variant instead of a templated type, and possibly also not generate a
stack trace for it.  I haven't decided whether a trace is meaningful in this
context.  Getting a PriorityMessage exception could imply a failure to
receive() a type required by the application design so a trace might be a good
indication of where the error is... or maybe that's just wrong.

I'm looking into the hang issue as well... it's just less obvious where the
problem is there.

Sep 30 2010

osa <osa aso.osa> writes:

On 09/30/2010 03:33 PM, Sean Kelly wrote:
 osa Wrote:
 I've also thought about switching to 'send' if the receiver queue is
 empty, but there is no way in the std.concurrency API to check for that. Is
 there any serious issue with adding such a method? I understand that in a
 multi-threaded environment a queue reported as empty by an 'isEmpty' call may
 become non-empty before that fact is used, but in some situations an
 approximate result (meaning empty or almost empty) is fine.


 The current API is designed to apply to in-process and out-of-process
messaging, so a function like that doesn't really fit.


I see. It is reasonable if out-of-process messaging is going to be 
implemented.

  Getting a PriorityMessage exception could imply a failure to receive() a type
required by the application design so a trace might be a good indication of
where the error is... or maybe that's just wrong.


I'd say that having a trace for exceptions thrown by receive() may be 
useful only if you have many receive() calls scattered all over the 
code, with try...catch at the very top level. But my (limited) 
experience with the std.concurrency way of thread communication tells me 
that this is a bad idea; I'd use as few calls to receive() as possible and 
keep them close to each other. But people's mileage may vary.

Sep 30 2010

Sean Kelly <sean invisibleduck.org> writes:

osa Wrote:

 On 09/30/2010 03:33 PM, Sean Kelly wrote:
 osa Wrote:
 I've also thought about switching to 'send' if the receiver queue is
 empty, but there is no way in the std.concurrency API to check for that. Is
 there any serious issue with adding such a method? I understand that in a
 multi-threaded environment a queue reported as empty by an 'isEmpty' call may
 become non-empty before that fact is used, but in some situations an
 approximate result (meaning empty or almost empty) is fine.


 The current API is designed to apply to in-process and out-of-process
messaging, so a function like that doesn't really fit.


 I see. It is reasonable if out-of-process messaging is going to be 
 implemented.


It will be.  But I want to get the bumps smoothed out for in-process messaging
first.

Sep 30 2010

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Sean Kelly (sean invisibleduck.org)'s article
 Sean Kelly Wrote:
 osa Wrote:

 I started using std.concurrency in some projects and overall it feels
 like a solid (albeit minimalistic) design. However, current
 implementation has some issues. For example, I've noticed that using
 prioritySend slows everything considerably.


 Thanks for this.  I can tell you that prioritySend performs an extra
 allocation to account for a design requirement (if a priority message isn't
 received it's thrown as PriorityMessage!(T), and this exception is generated
 when the send occurs, since static type info isn't available at the receive
 side when it's needed for this).  I had originally thought that the difference
 was just more garbage collections, but calling GC.disable only increases the
 number of priority messages sent by about 1000.  I'll have to look at the code
 to see if I can figure out what's going on.

 Okay, I've fixed one issue with priority messages that, aside from broken
 behavior, has increased performance somewhat.  Here are the timings:

 Benchmark: 5944400 iterations in 5 seconds (1.18888e+06/second) -- built without -version=priority
 Benchmark: 4900 iterations in 5.119 seconds (957.218/second) -- built with -version=priority, before fix
 Benchmark: 39700 iterations in 5.001 seconds (7938.41/second) -- built with -version=priority, after fix

 The remaining issue has to do with the fact that the exception is constructed
 when the send is issued and when this exception is constructed a stack trace is
 generated as well.  I'll have to modify Throwable so that derived classes can
 specify that no trace be generated.  That or eliminate constructing the
 exception at the send site and change how that exception is represented.

I just made some functional changes to how priority messages are sent and added
a few performance tweaks to messaging in general.  The only visible
difference should be that PriorityMessageException is no longer a template
class but instead contains a Variant, which is something that would have been
necessary for inter-process messaging anyway.  Here are the timings:

--- Before ---

$ dmd -inline -release -O priority
Benchmark: 5749600 iterations in 5 seconds (1.14992e+06/second)
Benchmark: 5747800 iterations in 5 seconds (1.14956e+06/second)
Benchmark: 5748200 iterations in 5 seconds (1.14964e+06/second)

$ dmd -inline -release -O priority -version=priority
Benchmark: 39100 iterations in 5.01 seconds (7804.39/second)
Benchmark: 39100 iterations in 5.01 seconds (7804.39/second)
Benchmark: 39100 iterations in 5 seconds (7820/second)

--- After ---

$ dmd -inline -release -O priority
Benchmark: 7204200 iterations in 5 seconds (1.44084e+06/second)
Benchmark: 7167000 iterations in 5 seconds (1.4334e+06/second)
Benchmark: 7164400 iterations in 5 seconds (1.43288e+06/second)

$ dmd -inline -release -O priority -version=priority
Benchmark: 7442500 iterations in 5 seconds (1.4885e+06/second)
Benchmark: 7448600 iterations in 5 seconds (1.48972e+06/second)
Benchmark: 7421800 iterations in 5 seconds (1.48436e+06/second)
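
To illustrate the non-template PriorityMessageException mentioned above, here is a minimal sketch of what a receiver sees when a priority message matches none of its receive() handlers. The worker and unhandledPriorityPayload names and the payload value 42 are made up for the example, and it assumes the exception exposes the unmatched payload as a Variant field named message, as it does in current Phobos.

```d
import std.concurrency;

// Hypothetical worker whose receive() only knows how to handle strings.
void worker()
{
    try
    {
        receive((string s) { /* normal message handling would go here */ });
    }
    catch (PriorityMessageException e)
    {
        // The unmatched priority payload arrives wrapped in a Variant,
        // so the receiver has to ask for the type it expects.
        ownerTid.send(e.message.get!int);
    }
}

// Sends an int as a priority message to a worker that cannot match it
// and returns the payload the worker recovered from the exception.
int unhandledPriorityPayload()
{
    auto tid = spawn(&worker);
    prioritySend(tid, 42);
    return receiveOnly!int();
}

void main()
{
    assert(unhandledPriorityPayload() == 42);
}
```

Because the payload type is only recovered at the catch site, nothing about the exception's static type depends on what was sent, which is the property that matters for inter-process messaging.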

Oct 08 2010

osa <osa aso.osa> writes:

On 10/08/2010 04:29 PM, Sean Kelly wrote:
 I just made some functional changes to how priority messages are sent and
added a few performance tweaks to messaging in general.  The only visible
 difference should be that PriorityMessageException is no longer a template
class but instead contains a Variant, which is something that would have been
 necessary for inter-process messaging anyway.  Here are the timings:

 --- After ---

 $ dmd -inline -release -O priority
 Benchmark: 7204200 iterations in 5 seconds (1.44084e+06/second)
 Benchmark: 7167000 iterations in 5 seconds (1.4334e+06/second)
 Benchmark: 7164400 iterations in 5 seconds (1.43288e+06/second)

 $ dmd -inline -release -O priority -version=priority
 Benchmark: 7442500 iterations in 5 seconds (1.4885e+06/second)
 Benchmark: 7448600 iterations in 5 seconds (1.48972e+06/second)
 Benchmark: 7421800 iterations in 5 seconds (1.48436e+06/second)


Wow! This is a really good improvement. Thanks! I assume this is in 
Phobos SVN already, so I'll try to build my application (not the simplified 
benchmark) using the updated std.concurrency to see how it performs now. 
I'll let you know if something is wrong ;)

Oct 08 2010
