digitalmars.D.learn - Death by concurrency

Manfred Nowak <svv1999 hotmail.com> writes:

The well known shootout shows a negative mark for concurrency for D:

http://shootout.alioth.debian.org/benchmark.php?test=message&lang=all&sort=fullcpu

What is the reason?

-manfred

Nov 08 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Manfred Nowak" <svv1999 hotmail.com> wrote in message 
news:Xns97086B99494CDsvv1999hotmailcom 63.105.9.61...
 The well known shootout shows a negative mark for concurrency for D:

 http://shootout.alioth.debian.org/benchmark.php?test=message&lang=all&sort=fullcpu

 What is the reason?

 -manfred

It could be the busy-waiting. Instead of looping and yielding, a waiting 
thread should park itself. The ReentrantLock and Condition classes from 
http://home.comcast.net/~benhinkle/locks/locks.html should help, but I 
don't know if user libraries are allowed in the shootout like that.
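
A minimal sketch of what "parking" could look like, assuming present-day D with druntime's core.sync.mutex and core.sync.condition rather than the locks library linked above or the 2005 Phobos. Mailbox, makeWorker and passToken are hypothetical names made up for illustration, and the slot is assumed to receive at most one message at a time:

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical one-slot mailbox (an assumption, not a library class).
final class Mailbox
{
    private Mutex mutex;
    private Condition ready;
    private int message;
    private bool full;

    this()
    {
        mutex = new Mutex;
        ready = new Condition(mutex);
    }

    // Store a message and wake one parked consumer.
    void put(int m)
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        message = m;
        full = true;
        ready.notify();
    }

    // Park until a message is available instead of looping and yielding.
    int take()
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        while (!full)      // re-check the flag after every wakeup
            ready.wait();  // releases the mutex while the thread is parked
        full = false;
        return message;
    }
}

// Built in a helper so each delegate captures its own pair of mailboxes.
void delegate() makeWorker(Mailbox from, Mailbox to)
{
    return () { to.put(from.take() + 1); };
}

// Pass a token through `links` threads, each adding one to it.
int passToken(int links)
{
    auto boxes = new Mailbox[links + 1];
    foreach (ref b; boxes)
        b = new Mailbox;

    auto threads = new Thread[links];
    foreach (i; 0 .. links)
    {
        threads[i] = new Thread(makeWorker(boxes[i], boxes[i + 1]));
        threads[i].start();
    }

    boxes[0].put(0);
    int result = boxes[links].take();
    foreach (t; threads)
        t.join();
    return result;
}

void main()
{
    writeln(passToken(50)); // prints 50
}
```

Every worker sleeps inside take() until its predecessor calls put(), so no thread spends CPU time in a yield loop.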

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

Ben Hinkle wrote:
 "Manfred Nowak" <svv1999 hotmail.com> wrote in message 
 news:Xns97086B99494CDsvv1999hotmailcom 63.105.9.61...
  The well known shootout shows a negative mark for concurrency for D:

  http://shootout.alioth.debian.org/benchmark.php?test=message&lang=all&sort=fullcpu

  What is the reason?

 It could be the busy-waiting. Instead of looping and yielding a waiting 
 thread should park itself. The ReentrantLock and Condition classes from 
 http://home.comcast.net/~benhinkle/locks/locks.html should help - but I 
 don't know if user libraries are allowed in the shootout like that. 

I'm not sure what's wrong with their test.  I modified the shootout code 
to run on Ares with DMD .139 (since I'm too lazy to rebuild Phobos just 
for this test), and ptime reported it completing in 0.625 seconds on my 
laptop.  And this was with quite a lot of stuff running in the 
background.  In case anyone is interested, here is the test code.  I 
simply renamed 'wait' to 'join' and did output via printf instead of 
streams:

import std.thread, std.c.stdio, std.c.stdlib;

int main(char[][] args)
{
     const int length = 500;
     int n = args.length > 1 ? atoi(args[1]) : 1;

     EndLink chainEnd = new EndLink(length * n);
     chainEnd.start();

     Link chain = chainEnd;
     while(n--)
     {
         for(int i = 1; i < length; i++)
         {
             Link link = new Link(chain);
             chain = link;
         }

         chain.put(0);
         while(chain.next)
         {
             chain.start();
             chain.join();
             chain = chain.next;
         }
     }

     chainEnd.join();
     printf("%i\n", chainEnd.count);

     return 0;
}

class Link: Thread
{
private:
     int message = -1;

public:
     Link next;

     this(Link t)
     {
         next = t;
     }

     void run()
     {
         next.put(this.take());
     }

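     // Deposit a message for this link; -1 marks an empty slot.
     synchronized void put(int m)
     {
         message = m;
         yield();
     }

protected:
     // Remove the pending message and return it incremented by one,
     // or yield and return 0 if no message has arrived yet.
     synchronized int take()
     {
         if(message != -1)
         {
             int m = message;
             message = -1;
             return m + 1;
         }
         yield();
         return 0;
     }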
}

class EndLink: Link
{
private:
     int finalCount;

public:
     int count = 0;

     this(int i)
     {
         super(null);
         finalCount = i;
     }

     void run()
     {
         while(count < finalCount)
         {
             count += this.take();
             yield();
         }
     }
}

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

Oh, the shootout says the code should print '5000' and mine printed 
'500'.  I haven't taken the time to figure out why the result was 
different, though it's likely a bug in the shootout code.


Sean

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

Oops.  I just noticed that N is a command-line parameter.  For an N of 
10, ptime clocks this test at 5.210 seconds on my laptop, and '5000' is 
printed as expected.


Sean

Nov 08 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

If the code was making 500 threads it could also be that they ran the 
benchmark on linux and bumped into phobos's limitation on the number of 
threads allowed at once:
    static Thread[/*_POSIX_THREAD_THREADS_MAX*/ 100] allThreads;

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

Ben Hinkle wrote:
 If the code was making 500 threads it could also be that they ran the 
 benchmark on linux and bumped into phobos's limitation on the number of 
 threads allowed at once:
     static Thread[/*_POSIX_THREAD_THREADS_MAX*/ 100] allThreads;

Ah, good point.  Ares doesn't have this limitation as it uses an AA 
(associative array) for storing thread references.
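
To make the difference concrete, here is a rough sketch of the two storage strategies. It assumes present-day D, and Handle, FixedRegistry, GrowableRegistry and countAccepted are hypothetical stand-ins, not the actual Phobos or Ares code. What Phobos really does once the table is full is not shown in this thread, so the "return false" below is only an assumption:

```d
import std.stdio : writeln;

// Hypothetical stand-in for a thread handle (not the Phobos or Ares Thread).
final class Handle
{
    int id;
    this(int id) { this.id = id; }
}

// Fixed-capacity table in the style of the quoted allThreads declaration.
struct FixedRegistry
{
    Handle[100] slots;
    size_t used;

    // Assumption: refuse the entry once every slot is taken.
    bool add(Handle h)
    {
        if (used == slots.length)
            return false;
        slots[used++] = h;
        return true;
    }
}

// Table backed by an associative array, which grows on demand.
struct GrowableRegistry
{
    Handle[int] byId;

    bool add(Handle h)
    {
        byId[h.id] = h;
        return true;
    }
}

// Try to register n handles and report how many were accepted.
size_t countAccepted(R)(ref R registry, int n)
{
    size_t accepted;
    foreach (i; 0 .. n)
        if (registry.add(new Handle(i)))
            ++accepted;
    return accepted;
}

void main()
{
    FixedRegistry fixed;
    GrowableRegistry growable;
    writeln(countAccepted(fixed, 500));    // prints 100
    writeln(countAccepted(growable, 500)); // prints 500
}
```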


Sean

Nov 08 2005

pragma <pragma_member pathlink.com> writes:

In article <dkr5ac$me5$2 digitaldaemon.com>, Sean Kelly says...
Ben Hinkle wrote:
 If the code was making 500 threads it could also be that they ran the 
 benchmark on linux and bumped into phobos's limitation on the number of 
 threads allowed at once:
     static Thread[/*_POSIX_THREAD_THREADS_MAX*/ 100] allThreads;

Ah, good point.  Ares doesn't have this limitation as it used an AA for 
storing thread references.

Sean,
Out of curiosity, have you tried using Ares' Atomic lib for this task?  I wonder
what the difference in time would be when compared to 'synchronized'?

- EricAnderton at yahoo

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

pragma wrote:
 Out of curiosity, have you tried using Ares' Atomic lib for this task?  I wonder
 what the difference in time would be when compared to 'synchronized'?

See my reply to the OP.  I tried simply removing the 'synchronized' 
properties entirely and only saw a small performance increase (less than 
0.1 seconds average).  I suspect this is because the real time consumer 
in this case is thread creation.  I also tried disabling the GC and the 
test ran slower on average than with it enabled.  It would probably be 
difficult to optimize this test to perform noticeably better as the 500 
threads need to be created no matter what.
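
For anyone who wants to check the thread-creation cost on their own machine, a small sketch follows. It assumes present-day D (core.thread and std.datetime.stopwatch), not the DMD and Ares versions used for the numbers above, and startAndJoin is a hypothetical helper name. The elapsed time depends on the machine, so no figure is claimed here:

```d
import core.thread : Thread;
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Start and join `count` threads one after another, as the test's main loop
// does, and report how many ran together with the total elapsed time.
int startAndJoin(int count, out Duration elapsed)
{
    int completed;
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. count)
    {
        // Each thread does almost no work, so the loop mostly measures
        // creation, start-up and join overhead.
        auto t = new Thread(() { ++completed; });
        t.start();
        t.join();
    }
    elapsed = sw.peek();
    return completed;
}

void main()
{
    Duration elapsed;
    int ran = startAndJoin(500, elapsed);
    writefln("%s threads in %s ms", ran, elapsed.total!"msecs");
}
```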


Sean

Nov 08 2005

Sean Kelly <sean f4.ca> writes:

Manfred Nowak wrote:
 The well known shootout shows a negative mark for concurrency for D

I don't really like the way this test is structured, as what it is 
really testing is the efficiency of thread creation.  For any language with 
its roots in OS-level thread code, the performance should be pretty much 
equivalent.  I suspect the functional languages perform so well because 
they do user-level concurrency rather than kernel-level concurrency (and 
probably also because they don't allocate large chunks of memory for 
stack space and such in the process).  I'm quite surprised by the 
abysmal performance of the Scheme and OCaml tests however.  Is it simply 
because their interpreters stink?


Sean

Nov 08 2005

Georg Wrede <georg.wrede nospam.org> writes:

Sean Kelly wrote:
 Manfred Nowak wrote:
 
 The well known shootout shows a negative mark for concurrency for D


<snip>

                 I suspect the functional languages perform
 so well because they do user-level concurrency rather than
 kernel-level concurrency 

<snip>

I think we should have both in D.

I don't think it's too hard to imagine a situation where one would want 
to use a few real OS threads, and _within_ some of them a bunch of 
simple cooperating light weight threads. ("Fibers, if you like.")

Equally, preemptive threading is overkill for a lot of other things.
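
A sketch of the lightweight variant, assuming the Fiber class from present-day druntime (core.thread), which did not exist in this form when the thread was written. passTokenThroughFibers is a hypothetical name. All fibers run inside the one calling OS thread and switch only at explicit yield points:

```d
import core.thread : Fiber;
import std.stdio : writeln;

// Pass a token down a chain of cooperating fibers inside one OS thread.
int passTokenThroughFibers(int links)
{
    int token = 0;
    auto chain = new Fiber[links];

    foreach (i; 0 .. links)
    {
        chain[i] = new Fiber(() {
            Fiber.yield(); // hand control back until the token arrives
            token += 1;    // then take the token and add one to it
        });
    }

    // First pass: run every fiber up to its yield point.
    foreach (f; chain)
        f.call();

    // Second pass: resume each fiber in chain order so it handles the token.
    foreach (f; chain)
        f.call();

    // Every fiber has now run to completion.
    foreach (f; chain)
        assert(f.state == Fiber.State.TERM);

    return token;
}

void main()
{
    writeln(passTokenThroughFibers(500)); // prints 500
}
```

No kernel threads are created beyond the main one, which is the kind of saving the user-level schedulers of the functional languages are suspected of getting.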

Nov 09 2005

Dawid Ciężarkiewicz <dawid.ciezarkiewicz gmail.com> writes:

Georg Wrede wrote:

 I suspect the functional languages perform
 so well because they do user-level concurrency rather than
 kernel-level concurrency

 
 <snip>
 
 I think we should have both in D.
 
 I don't think it's too hard to imagine a situation where one would want
 to use a few real OS threads, and _within_ some of them a bunch of
 simple cooperating light weight threads. ("Fibers, if you like.")

I would like to see it too. I have to write my own "tasks" for a TCP server I'm
working on. Real threads are just too heavy to use in large numbers. And things
like this are often used IMO, so having them in the standard lib would be great.

Nov 12 2005
