digitalmars.D.announce - DCompute: First kernels run successfully
- Nicholas Wilson (66/66) Sep 11 2017 I'm pleased to announce that I have run the first dcompute kernel
- jmh530 (3/5) Sep 11 2017 Keep up the good work.
- kerdemdemir (12/16) Sep 11 2017 Hi Wilson,
- Nicholas Wilson (13/30) Sep 11 2017 Hi Erdem
- Walter Bright (3/5) Sep 11 2017 Excellent!
- Nicholas Wilson (5/10) Sep 11 2017 Indeed, let the world domination begin!
I'm pleased to announce that I have run the first dcompute kernel and it was a success! There is still a fair bit of polish needed on the driver to make the API sane and more complete, not to mention more similar to the (untested) OpenCL driver API. But it works! (Contributions are of course greatly welcomed.)

The kernel:

```
@compute(CompileFor.deviceOnly) module dcompute.tests.dummykernels;

import ldc.dcompute;
import dcompute.std.index;

@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    auto i = GlobalIndex.x;
    if (i >= N) return;
    res[i] = alpha * x[i] + y[i];
}
```

The host code:

```
import dcompute.driver.cuda;
import dcompute.tests.dummykernels : saxpy;

Platform.initialise();
auto devs = Platform.getDevices(theAllocator);
auto ctx  = Context(devs[0]); scope(exit) ctx.detach();

// Change the file to match your GPU.
Program.globalProgram = Program.fromFile("./.dub/obj/kernels_cuda210_64.ptx");

auto q = Queue(false);

enum size_t N = 128;
float alpha = 5.0;
float[N] res, x, y;
foreach (i; 0 .. N)
{
    x[i] = N - i;
    y[i] = i * i;
}

Buffer!(float) b_res, b_x, b_y;
b_res = Buffer!(float)(res[]); scope(exit) b_res.release();
b_x   = Buffer!(float)(x[]);   scope(exit) b_x.release();
b_y   = Buffer!(float)(y[]);   scope(exit) b_y.release();

b_x.copy!(Copy.hostToDevice); // not quite sold on this interface yet.
b_y.copy!(Copy.hostToDevice);

q.enqueue!(saxpy)                // <-- the main magic happens here
    ([N, 1, 1], [1, 1, 1])       // the grid
    (b_res, alpha, b_x, b_y, N); // the kernel arguments

b_res.copy!(Copy.deviceToHost);
foreach (i; 0 .. N)
    enforce(res[i] == alpha * x[i] + y[i]);
writeln(res[]); // [640, 636, ... 16134]
```

Simple as that!

Dcompute, as always, is at https://github.com/libmir/dcompute and on dub. To successfully run the dcompute CUDA test you will need a very recent LDC (less than two days old) with the NVPTX backend* enabled, along with a CUDA environment and an Nvidia GPU.

*Or wait for the LDC 1.4 release, coming real soon(™).
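As a side note, the printed values follow directly from the host setup: since x[i] = N - i, y[i] = i*i, and alpha = 5, the result is res[i] = 5*(N - i) + i*i. A quick shell sketch confirming the first and last values shown in the `writeln` output:

```shell
# Sanity-check the saxpy arithmetic from the host code above:
# x[i] = N - i, y[i] = i*i, alpha = 5, so res[i] = 5*(N - i) + i*i.
N=128
for i in 0 1 127; do
  echo "res[$i] = $(( 5 * (N - i) + i * i ))"
done
# res[0] = 640, res[1] = 636, res[127] = 16134 -- matching the writeln output.
```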
Thanks to the LDC folks for putting up with me ;) Have fun GPU programming, Nic
Sep 11 2017
On Monday, 11 September 2017 at 12:23:16 UTC, Nicholas Wilson wrote:

> I'm pleased to announce that I have run the first dcompute kernel and it was a success!

Keep up the good work.
Sep 11 2017
Hi Wilson,

Since I believe GPU-CPU hybrid programming is the future, I believe you are doing a great job for your own and D's future.

> To successfully run the dcompute CUDA test you will need a very recent LDC (less than two days old) with the NVPTX backend* enabled, along with a CUDA environment and an Nvidia GPU.
>
> *Or wait for the LDC 1.4 release, coming real soon(™).

Can you please describe a bit, for starters like me, how to build a recent LDC? Is this "NVPTX backend" a cmake option? And what should I do to make my "CUDA environment" ready? Which packages should I install?

Sorry if my questions are too basic; I hope I will be able to add an example.

Regards
Erdem
Sep 11 2017
On Monday, 11 September 2017 at 20:45:43 UTC, kerdemdemir wrote:

> Hi Wilson,
>
> Since I believe GPU-CPU hybrid programming is the future I believe you are doing a great job for your and D lang's future.
>
> Can you please describe a bit, for starters like me, how to build a recent LDC? Is this "NVPTX backend" a cmake option? And what should I do to make my "CUDA environment" ready? Which packages should I install?
>
> Regards
> Erdem

Hi Erdem,

Sorry, I've been a bit busy with uni.

To build LDC, just clone ldc, run `git submodule update --init`, and then run cmake, setting LLVM_CONFIG to /path/to/llvm/build/bin/llvm-config and LLVM_INTRINSIC_TD_PATH to /path/to/llvm/source/include/llvm/IR.

The NVPTX backend is enabled by setting LLVM's cmake variable LLVM_TARGETS_TO_BUILD to either "all", or "X86;NVPTX" along with any other archs you want to enable (without the quotes), and then building LLVM with cmake. This will get picked up by LDC automatically.

I just installed the CUDA SDK in its entirety, but I'm sure you don't need everything from it.
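The steps above might look roughly like this. This is a sketch only: all `/path/to/...` paths are placeholders, and the exact cmake options can differ between LLVM and LDC versions.

```shell
# 1. Build LLVM with the NVPTX backend enabled (placeholder paths).
mkdir llvm-build && cd llvm-build
cmake /path/to/llvm/source -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"
make
cd ..

# 2. Build LDC against that LLVM.
git clone https://github.com/ldc-developers/ldc.git
cd ldc && git submodule update --init && cd ..
mkdir ldc-build && cd ldc-build
cmake ../ldc \
      -DLLVM_CONFIG=/path/to/llvm/build/bin/llvm-config \
      -DLLVM_INTRINSIC_TD_PATH=/path/to/llvm/source/include/llvm/IR
make
```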
Sep 11 2017
On 9/11/2017 5:23 AM, Nicholas Wilson wrote:

> I'm pleased to announce that I have run the first dcompute kernel and it was a success!

Excellent!

https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAgvAAAAJDY4OTI4MmE0LTVlZDgtNGQzYy1iN2U1LWU5Nzk1NjlhNzIwNg.jpg
Sep 11 2017
On Monday, 11 September 2017 at 22:40:02 UTC, Walter Bright wrote:

> On 9/11/2017 5:23 AM, Nicholas Wilson wrote:
>
>> I'm pleased to announce that I have run the first dcompute kernel and it was a success!
>
> Excellent!
>
> https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAgvAAAAJDY4OTI4MmE0LTVlZDgtNGQzYy1iN2U1LWU5Nzk1NjlhNzIwNg.jpg

Indeed, let the world domination begin!

I just need to get some OpenCL 2.0 capable hardware to test that and we'll be well on the way.

Also, LDC 1.4 was just released. Yay!
Sep 11 2017