digitalmars.D - D and Heterogeneous Computing
- Josh Klontz (21/21) Apr 07 2012 Greetings! As someone with a research interest in software
- Robert Jacques (2/23) Apr 07 2012 I've been using D with CUDA via a high-level wrapper around the driver A...
- Josh Klontz (10/49) Apr 08 2012 Yes, I certainly don't want to be in the business of writing
- Dmitry Olshansky (6/49) Apr 08 2012 Take a look at C++ AMP it's almost exactly this thing added to Visual
- Robert Jacques (2/53) Apr 09 2012 IIRC, doesn't OpenCL support jit-ing ASCII source files? Then, there wou...
- Josh Klontz (33/35) Apr 10 2012 Correct, and that's the underlying power I'm proposing to
- Dmitry Olshansky (25/60) Apr 10 2012 From the looks of it this kind of stuff should be easy with tokenzied
- Josh Klontz (1/25) Apr 10 2012 Awesome, thanks! Will chew on this for a while :)
- proxy (1/2) Apr 10 2012 Looking forward to it!! :)
Greetings! As someone with a research interest in software abstractions for image processing, the D programming language appears to offer unsurpassed language features for constructing beautiful and efficient programs. With that said, what would really get me to abandon C++ is if D supported a heterogeneous programming model. My personal inclination would be something closer to OpenACC than anything else I've seen available, though only in the sense that I like the idea of writing code once and being able to compile/run/debug it with or without automatic vectorization/kernelization. Presumably we could achieve more elegant syntax with tighter integration into the language. Has anyone been working on anything like this? Is this something the community would be interested in seeing? What should the solution look like? One path forward could be a patch to the compiler to generate and execute OpenCL kernels for appropriately marked-up D code. While I'm new to the D language, I'd be happy to work on a proof of concept of this if it is something the community thinks would be valuable and I could get specific feedback about the right way to approach it.
Apr 07 2012
On Sat, 07 Apr 2012 11:38:15 -0500, Josh Klontz <josh.klontz gmail.com> wrote:
> One path forward could be a patch to the compiler to generate and execute OpenCL kernels for appropriately marked-up D code.

I've been using D with CUDA via a high-level wrapper around the driver API. It works very nicely, but it doesn't address the language integration issues. Might I recommend looking into hooking up LDC to the PTX LLVM back-end. That would seem much faster than writing your own back-end.
Apr 07 2012
On Saturday, 7 April 2012 at 18:47:21 UTC, Robert Jacques wrote:
> Might I recommend looking into hooking up LDC to the PTX LLVM back-end. That would seem much faster than writing your own back-end.

Yes, I certainly don't want to be in the business of writing back-ends. Another idea that came to mind recently was implementing a keyword similar in spirit to "asm":

    opencl {
        // Valid opencl code here
    }

And have the compiler automatically handle memory copying of D variables referenced in the kernel code. Would be entirely back-end independent and perhaps pleasant to implement?
Apr 08 2012
On 09.04.2012 6:49, Josh Klontz wrote:
> Another idea that came to mind recently was implementing a keyword similar in spirit to "asm":
>
>     opencl {
>         // Valid opencl code here
>     }
>
> And have the compiler automatically handle memory copying of D variables referenced in the kernel code.

Take a look at C++ AMP, it's almost exactly this thing added to Visual C++ (but of course for now it's DirectCompute):
http://msdn.microsoft.com/en-us/library/hh265136(v=vs.110).aspx

--
Dmitry Olshansky
Apr 08 2012
On Sun, 08 Apr 2012 21:49:48 -0500, Josh Klontz <josh.klontz gmail.com> wrote:
> Another idea that came to mind recently was implementing a keyword similar in spirit to "asm":
>
>     opencl {
>         // Valid opencl code here
>     }
>
> And have the compiler automatically handle memory copying of D variables referenced in the kernel code. Would be entirely back-end independent and perhaps pleasant to implement?

IIRC, doesn't OpenCL support JIT-ing ASCII source files? Then there wouldn't be a need for any language changes.
Apr 09 2012
> IIRC, doesn't OpenCL support jit-ing ASCII source files? Then, there wouldn't be a need for any language changes.

Correct, and that's the underlying power I'm proposing to leverage. IMO, writing OpenCL code involves (at least) the following nuisances:

1) The kernel code needs to be written as a text string within the native code base.
2) Various function calls to the OpenCL library need to be made to manage the runtime, compile kernels, connect arguments to kernels, execute the kernels, and retrieve the results.
3) If you want to build an application both with and without OpenCL as the backend, then you have to maintain two versions of every algorithm, one as an OpenCL string and the other in the native language of your program.

To me there seems to be a huge opportunity to obviate the above issues and entice new developers to D via some careful engineering at either the compiler or the standard library level to support heterogeneous computing. Certainly technologies like C++ AMP are a step in the right direction, but to my knowledge there currently doesn't exist anything with the following desirable principles:

1) Write the algorithm once, compile for either serial execution on the CPU or massively parallel execution on an OpenCL-enabled device.
2) FOSS
3) Runs everywhere the underlying language runs.
4) The underlying language has a robust compiler, active and growing community, solid standard library, elegant language features, etc...

Perhaps I was wrong to suggest that this has to be solved at the compiler level. The EPGPU library seems to tackle some of the problems of mixing OpenCL kernels within C++, though the syntax is far from ideal. Thoughts?
Apr 10 2012
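A minimal sketch of nuisances 1 and 3 from the post above, written in D. The names `addKernelSource` and `addNative` are made up for illustration. The OpenCL host calls of nuisance 2 are only named in a comment, since they need an OpenCL binding that this sketch does not assume.

```d
import std.stdio : writeln;

// Nuisance 1: the device version lives in a string of OpenCL C.
// In a real program this string would be handed to the OpenCL runtime
// (clCreateProgramWithSource, clBuildProgram, clSetKernelArg,
// clEnqueueNDRangeKernel, clEnqueueReadBuffer: nuisance 2). Those calls
// need an OpenCL binding and are deliberately left out of this sketch.
enum string addKernelSource = `
__kernel void add(__global float* a, __global const float* b)
{
    size_t i = get_global_id(0);
    a[i] += b[i];
}
`;

// Nuisance 3: the same algorithm written a second time, by hand,
// for the build that runs on the CPU only.
void addNative(float[] a, const(float)[] b)
{
    assert(a.length == b.length);
    foreach (i; 0 .. a.length)
        a[i] += b[i];
}

void main()
{
    float[] a = [1.0f, 2.0f, 3.0f];
    float[] b = [10.0f, 20.0f, 30.0f];
    addNative(a, b);
    assert(a == [11.0f, 22.0f, 33.0f]);
    writeln(a);
}
```

Any change to the algorithm has to be made in both places, which is the maintenance cost the "write the algorithm once" principle is meant to remove.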
On 11.04.2012 0:31, Josh Klontz wrote:
> Perhaps I was wrong to suggest that this has to be solved at the compiler level. The EPGPU library seems to tackle some of the problems of mixing OpenCL kernels within C++, though the syntax is far from ideal. Thoughts?

From the looks of it, this kind of stuff should be easy with tokenized strings ( q{ code } ) + mixins + some "auto-magic" helpers being run for OpenCL behind the covers. The problematic part is checking that the fragment is using the correct subset of both languages. Ideally the API should work along the lines of this:

    float[] arr1, arr2;
    //init arr1 & arr2
    assert(arr1.length == arr2.length);
    auto length = arr1.length;
    compute!q{
        for(int i=0;i<length; i++)
            arr1[i] += arr2[i];
    }(arr1, arr2);

where compute works both with plain CPU and even without OpenCL (by simply mixing stuff in) and for OpenCL with a bit of extra binding magic inside the compute template. (compute is an eponymous template that is aliased to a static function inside, which in turn is generated by a mixin; for a concrete example, take a look at how the ctRegex template in std.regex does it.)

Of course, there are some painful details when you go for deeper things and error messages, but it should be perfectly doable in normal D even w/o, say, a CTFE parser.

--
Dmitry Olshansky
Apr 10 2012
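A minimal sketch of the CPU-only half of the compute idea above, assuming a fixed two-array signature. The post does not give an implementation, so everything here is an assumption: the parameter names `arr1` and `arr2` and the local `length` are simply chosen to match the names the fragment refers to, and the OpenCL path (handing the same string to the OpenCL runtime) is left out.

```d
import std.stdio : writeln;

// Hypothetical CPU fallback for compute: the token string is mixed in
// as ordinary D statements inside the function body. Identifiers used
// by the fragment (arr1, arr2, length) must match the names below.
void compute(string kernel)(float[] arr1, const(float)[] arr2)
{
    assert(arr1.length == arr2.length);
    immutable int length = cast(int) arr1.length;

    // An OpenCL build would instead pass `kernel` to the OpenCL runtime
    // and copy arr1/arr2 to and from the device (not shown here).
    mixin(kernel);
}

void main()
{
    float[] arr1 = [1.0f, 2.0f, 3.0f];
    float[] arr2 = [10.0f, 20.0f, 30.0f];

    // The fragment stays inside the common subset of D and OpenCL C.
    compute!q{
        for (int i = 0; i < length; i++)
            arr1[i] += arr2[i];
    }(arr1, arr2);

    assert(arr1 == [11.0f, 22.0f, 33.0f]);
    writeln(arr1);
}
```

Because q{ } only requires the fragment to be valid D tokens, the compiler does not check that it is also valid OpenCL C. That is the "correct subset of both languages" problem the post mentions.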
> Of course, there are some painful details when you go for deeper things and error messages but it should be perfectly doable in normal D even w/o say CTFE parser.

Awesome, thanks! Will chew on this for a while :)
Apr 10 2012
> Awesome, thanks! Will chew on this for a while :)

Looking forward to it!! :)
Apr 10 2012