
digitalmars.D - Comparing Parallelization in HPC with D, Chapel, and Go

reply "anon" <anonymous gmail.com> writes:
https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Nov 21 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
anon:

 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Thank you for the link, it's very uncommon to see papers that use D. But where's the D/Go/Chapel source code? What's the compiler/version used? (When you do floating point benchmarks there's a huge difference between LDC2 and DMD). Bye, bearophile
Nov 21 2014
parent reply "Kapps" <opantm2+spam gmail.com> writes:
On Friday, 21 November 2014 at 21:53:00 UTC, bearophile wrote:
 anon:

 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Thank you for the link, it's very uncommon to see papers that use D. But where's the D/Go/Chapel source code? What's the compiler/version used? (When you do floating point benchmarks there's a huge difference between LDC2 and DMD). Bye, bearophile
The flags make it likely that DMD was used (-O -inline -release). IIRC there were some problems with DMD that made it not perform too well in these types of benchmarks that use std.parallelism. Results would likely have been noticeably better with GDC or LDC.
Nov 21 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Kapps:

 The flags make it likely that DMD was used (-O -inline 
 -release).
But I use ldmd2 all the time with those arguments :-) Bye, bearophile
Nov 21 2014
prev sibling next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 2014-11-21 at 22:57 +0000, Kapps via Digitalmars-d wrote:
 On Friday, 21 November 2014 at 21:53:00 UTC, bearophile wrote:
 anon:

 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
 Thank you for the link, it's very uncommon to see papers that
 use D. But where's the D/Go/Chapel source code? What's the
 compiler/version used? (When you do floating point benchmarks
 there's a huge difference between LDC2 and DMD).

 Bye,
 bearophile
 The flags make it likely that DMD was used (-O -inline -release). IIRC there were some problems with DMD that made it not perform too well in these types of benchmarks that use std.parallelism. Results would likely have been noticeably better with GDC or LDC.
Sorry, I must have missed this thread earlier, hopefully I am not late ;-)

From a quick scan there appears to be no mention of how many cores the test machine had. Maybe there were only 4? Hopefully they were using ldc2 and not dmd. I suspect they were using gc and not gccgo.

The words used about the implementations imply there could be a lot better realizations of their algorithms in the three languages. Without actual code though there is very little to be said.

I believe it should be a requirement of academic, and indeed non-academic, publishing of any work involving timings that the code be made available. Without the code there is no reproducibility, and reproducibility is a cornerstone of scientific method.

On the upside, using Chapel, D and Go shows forward looking. I wonder about X10 and C++. Not to mention Rust, Java, Groovy, and Python.

I have emailed the author.

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Nov 22 2014
parent "Sean Kelly" <sean invisibleduck.org> writes:
Yes, I'd be curious to see the code.  I also suspect that the 
functionality in core may not be sufficiently advertised.  At one 
point he mentions using yieldForce to simulate a barrier, which 
suggests he wasn't aware of core.sync.barrier.
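
To make the barrier point concrete: a red-black SOR iteration needs every worker to finish the "red" cells before any worker starts the "black" cells. In D that is what core.sync.barrier's Barrier.wait() is for. The sketch below shows the same pattern in Go, one of the other languages in the paper, using sync.WaitGroup as the per-phase barrier. It is not the paper's code: the function names (newGrid, sweep, solve), the grid size, the boundary condition (top row held at 1.0) and omega = 1.5 are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// newGrid builds an n x n grid of zeros with the top row held at 1.0
// (a hypothetical boundary condition, chosen only for this sketch).
func newGrid(n int) [][]float64 {
	g := make([][]float64, n)
	for i := range g {
		g[i] = make([]float64, n)
	}
	for j := range g[0] {
		g[0][j] = 1.0
	}
	return g
}

// sweep relaxes every interior cell whose (i+j) parity equals colour.
// Rows are dealt out round-robin to `workers` goroutines. A cell of one
// colour only reads neighbours of the other colour, so the goroutines
// never read a cell that is being written in the same phase.
func sweep(g [][]float64, colour, workers int, omega float64) {
	n := len(g)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 1 + w; i < n-1; i += workers {
				for j := 1; j < n-1; j++ {
					if (i+j)%2 != colour {
						continue
					}
					avg := 0.25 * (g[i-1][j] + g[i+1][j] + g[i][j-1] + g[i][j+1])
					g[i][j] += omega * (avg - g[i][j])
				}
			}
		}(w)
	}
	// The barrier: the next phase must not start until all workers are done.
	wg.Wait()
}

// solve runs iters red-black iterations and returns the relaxed grid.
func solve(n, iters, workers int, omega float64) [][]float64 {
	g := newGrid(n)
	for it := 0; it < iters; it++ {
		sweep(g, 0, workers, omega) // red phase
		sweep(g, 1, workers, omega) // black phase
	}
	return g
}

func main() {
	g := solve(16, 200, 4, 1.5)
	fmt.Printf("centre value: %.4f\n", g[8][8])
}
```

Because each phase only reads the other colour, the result does not depend on the number of workers, which makes a handy sanity check when comparing serial and parallel versions.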
Nov 22 2014
prev sibling next sibling parent Ziad Hatahet via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sat, Nov 22, 2014 at 7:17 AM, Russel Winder via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 I have emailed the author.
Keep us posted!
Nov 23 2014
prev sibling next sibling parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sun, 2014-11-23 at 13:09 -0800, Ziad Hatahet via Digitalmars-d wrote:
 On Sat, Nov 22, 2014 at 7:17 AM, Russel Winder via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:
 
 I have emailed the author.
Keep us posted!
Author replied. He is issuing source code on a bilateral pseudo-NDA. I will read it to ensure no hidden problems later this evening, and then reply. Most likely affirmative…

--
Russel.
Nov 24 2014
prev sibling next sibling parent "ixid" <nuaccount gmail.com> writes:
On Friday, 21 November 2014 at 22:57:44 UTC, Kapps wrote:
 On Friday, 21 November 2014 at 21:53:00 UTC, bearophile wrote:
 anon:

 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Thank you for the link, it's very uncommon to see papers that use D. But where's the D/Go/Chapel source code? What's the compiler/version used? (When you do floating point benchmarks there's a huge difference between LDC2 and DMD). Bye, bearophile
The flags make it likely that DMD was used (-O -inline -release). IIRC there were some problems with DMD that made it not perform too well in these types of benchmarks that use std.parallelism. Results would likely have been noticeably better with GDC or LDC.
Whenever there is a benchmark like this, the D community outlines a number of speedups ranging from the obvious to the arcane. Our house needs to be in order such that the obvious choice is at least competitive with the speed claims made for D. DMD in particular, while not optimisation focused, should improve its floating point speed and avoid surprising 80-bit floating point behaviours, or at least try to be surprising in a manner more in line with what users of other languages are used to.
Nov 24 2014
prev sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sun, 2014-11-23 at 13:09 -0800, Ziad Hatahet via Digitalmars-d wrote:
 On Sat, Nov 22, 2014 at 7:17 AM, Russel Winder via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:
 I have emailed the author.
Keep us posted!
The author is currently having a vacation. He has though sent me the codes. I shall review them and report back to him, not publicly at this stage. When back from vacation his intention is to set the codes up for public availability, and hence wider review and debate. At this point the D community (and I hope the Go community) at large will be able to constructively chip in suggestions.

--
Russel.
Nov 25 2014
parent reply "Sparsh Mittal" <invalid emailaddress.com> writes:
I am the author of the paper "A Study of Successive Over-relaxation
Method Parallelization Over Modern HPC Languages".

The code has been made available for academic use at
https://www.academia.edu/9709444/Source_code_of_Parallel_and_Serial_Red-Black_SOR_Implementation_in_Chapel_D_and_Go_Languages

Questions and comments can be sent to my email address [although
note that use of software does not imply support].
Dec 10 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Sparsh Mittal:

 I am author of the paper "A Study of Successive Over-relaxation
 Method Parallelization Over Modern HPC Languages".

 The code has been made available for academic use at
 https://www.academia.edu/9709444/Source_code_of_Parallel_and_Serial_Red-Black_SOR_Implementation_in_Chapel_D_and_Go_Languages

 Questions and comments can be sent to my email address [although
 note that use of software does not imply support].
What compiler, compiler version, and compilation arguments did you use for the D code? (For this kind of benchmark DMD is the wrong compiler to use.) I have improved the serial version of the D code and made it more idiomatic: http://dpaste.dzfl.pl/a6743f2eceda Bye, bearophile
Dec 10 2014
parent "Sparsh Mittal" <invalid emailaddress.com> writes:
Thanks for your interest. The users are welcome to make 
improvements to the code and use in their research. Chapel, D and 
Go are all relatively new languages and certainly many 
optimizations are possible with them.

As shown in the paper, I ran the D code with "-inline -O 
-release". I ran the experiments when I was at Iowa State. We had 
departmental servers http://it.engineering.iastate.edu/remote/ 
and I ran the experiments on those with 24 cores (note that this 
link is very frequently updated to show the servers which are 
online).

Now I have moved from there and don't have access to the 
computer.  I am sorry that I don't exactly remember/know answers 
to the other questions.
Dec 12 2014
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 21 Nov 2014 21:29:09 +0000
schrieb "anon" <anonymous gmail.com>:

 
 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Did they upload the source code and input data somewhere? It looks like Chapel and D scale badly with the number of threads, while Go makes excellent use of CPU cores and, despite executing more slowly, beats the other two at >= 8 threads. Then again, they could have had much higher speed if they had used a GPU-driven approach. -- Marco
Nov 21 2014
prev sibling next sibling parent "Horse" <gohater gmail.com> writes:
On Friday, 21 November 2014 at 21:29:10 UTC, anon wrote:
 https://www.academia.edu/3982638/A_Study_of_Successive_Over-relaxation_SOR_Method_Parallelization_Over_Modern_HPC_Languages
Here is another where they compare Chapel, Go, Cilk and TBB. http://arxiv.org/pdf/1302.2837.pdf Conclusion: TBB is the best..
Nov 21 2014
prev sibling parent reply "Andrei Amatuni" <andrei.amatuni gmail.com> writes:
This prompted me to google for recent academic papers on D, which
led me to this:

http://research.ijcaonline.org/volume104/number7/pxc3898921.pdf

not exactly the most rigorous research, but it's pretty
favorable...
Nov 23 2014
next sibling parent Ziad Hatahet via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sun, Nov 23, 2014 at 7:48 PM, Andrei Amatuni via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 not exactly the most rigorous research, but it's pretty
 favorable...
Not even remotely rigorous. One has to wonder about the quality of the conference into which this paper was accepted.
Nov 24 2014
prev sibling parent reply "Craig Dillabaugh" <craig.dillabaugh gmail.com> writes:
On Monday, 24 November 2014 at 03:48:27 UTC, Andrei Amatuni wrote:
 This prompted me to google for recent academic papers on D, 
 which
 led me to this:

 http://research.ijcaonline.org/volume104/number7/pxc3898921.pdf

 not exactly the most rigorous research, but it's pretty
 favorable...
My main take away from that paper was that C is much slower than Java :o) Based on those results it likely would have been trounced by Python or Ruby too.
Nov 24 2014
next sibling parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 2014-11-24 at 11:53 +0000, Craig Dillabaugh via Digitalmars-d
wrote:
[…]
 My main take away from that paper was that C is much slower than 
 Java :o)
This can happen!
 Based on those results it likely would have been trounced by 
 Python or Ruby too.
I don't know about Ruby, but Python can now be more or less as fast as C and C++. I am not joking on this one, even my π by quadrature codes can show Python running computational loops as fast.

--
Russel.
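
For readers who have not met the "π by quadrature" benchmark: the sketch below assumes the usual form, which approximates π as the integral of 4/(1+x²) over [0, 1] with the midpoint rule, and splits the slices across goroutines. It is written in Go to match the languages compared in this thread and is not Russel's code. The function name piQuadrature, the slice count and the worker count are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// piQuadrature approximates pi as the integral of 4/(1+x^2) over [0,1],
// using the midpoint rule with n slices dealt out round-robin to
// `workers` goroutines. Each goroutine writes only its own partial sum.
func piQuadrature(n, workers int) float64 {
	delta := 1.0 / float64(n)
	partial := make([]float64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			sum := 0.0
			for i := w; i < n; i += workers {
				x := (float64(i) + 0.5) * delta // midpoint of slice i
				sum += 1.0 / (1.0 + x*x)
			}
			partial[w] = sum
		}(w)
	}
	wg.Wait()

	// Combine the partial sums once every goroutine has finished.
	total := 0.0
	for _, s := range partial {
		total += s
	}
	return 4.0 * delta * total
}

func main() {
	fmt.Printf("pi ~ %.10f\n", piQuadrature(1000000, 4))
}
```

The inner loop is pure floating point arithmetic with no memory traffic to speak of, which is why it is a popular test of how well a language runs tight computational loops.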
Nov 24 2014
prev sibling parent "Nemanja Boric" <4burgos gmail.com> writes:
 Compilers and interpreters used
 Turbo C++ IDE
:-)
On Monday, 24 November 2014 at 11:53:08 UTC, Craig Dillabaugh wrote:
 On Monday, 24 November 2014 at 03:48:27 UTC, Andrei Amatuni 
 wrote:
 This prompted me to google for recent academic papers on D, 
 which
 led me to this:

 http://research.ijcaonline.org/volume104/number7/pxc3898921.pdf

 not exactly the most rigorous research, but it's pretty
 favorable...
My main take away from that paper was that C is much slower than Java :o) Based on those results it likely would have been trounced by Python or Ruby too.
Nov 24 2014