digitalmars.D.learn - ndslice: convert a sliced object to T[]

data pulverizer (6/6) Jun 14 2016 How do I unravel a sliced item T[].sliced(...) to an array T[]?

Seb (11/17) Jun 14 2016 A slice is just a _view_ on your memory, the easiest way is to

data pulverizer (5/26) Jun 14 2016 in that case:

Seb (4/8) Jun 14 2016 Are you sure you want to create a _copy_ of your data? In most

data pulverizer (5/15) Jun 15 2016 Thanks, I did.

Andrea Fontana (3/21) Jun 15 2016 Yes. You're forcing it to read all elements and copy them in a

data pulverizer (7/31) Jun 15 2016 I guess foreach would not copy the elements? for example:

Andrea Fontana (3/9) Jun 15 2016 The question is: why you need to put them inside an array? If you

data pulverizer (3/15) Jun 15 2016 I need this to work with external libraries that only deal with

Andrea Fontana (2/19) Jun 15 2016 Then I think the slice.byElement.array is the right solution.

data pulverizer (59/60) Jun 15 2016 The problem with that is that it slows down the code. I compared

Seb (58/120) Jun 15 2016 As said you can avoid the copy (see below). I also profiled it a

data pulverizer (14/32) Jun 15 2016 I didn't benchmark the RNG but I did notice it took a lot of time

data pulverizer (1/1) Jun 15 2016 Oh, I didn't see that rmat now returns a tuple.

Seb (44/80) Jun 15 2016 You wrote that too :-)

data pulverizer (3/7) Jun 15 2016 Very true!

Ilya Yaroshenko (3/6) Jun 15 2016 This would work only for slices with contiguous memory

data pulverizer <data.pulverizer gmail.com> writes:

How do I unravel a sliced item T[].sliced(...) to an array T[]?

For instance:

import std.experimental.ndslice;
auto slice = new int[12].sliced(3, 4);
int[] x = ??;

Thanks

Jun 14 2016

Seb <seb wilzba.ch> writes:

On Wednesday, 15 June 2016 at 02:43:37 UTC, data pulverizer wrote:
 How do I unravel a sliced item T[].sliced(...) to an array T[]?

 For instance:

 import std.experimental.ndslice;
 auto slice = new int[12].sliced(3, 4);
 int[] x = ??;

 Thanks

A slice is just a _view_ on your memory, the easiest way is to 
save a reference to your array like this:

```
int[] arr = new int[12];
auto slice = arr.sliced(3, 4);
slice[1, 1] = 42;
arr // [0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0]
```

For a general case, you should give `byElement` a try:

https://dlang.org/phobos/std_experimental_ndslice_selection.html#byElement

Jun 14 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 02:50:30 UTC, Seb wrote:
 On Wednesday, 15 June 2016 at 02:43:37 UTC, data pulverizer 
 wrote:
 How do I unravel a sliced item T[].sliced(...) to an array T[]?

 For instance:

 import std.experimental.ndslice;
 auto slice = new int[12].sliced(3, 4);
 int[] x = ??;

 Thanks

 A slice is just a _view_ on your memory, the easiest way is to 
 save a reference to your array like this:

 ```
 int[] arr = new int[12];
 auto slice = arr.sliced(3, 4);
 slice[1, 1] = 42;
 arr // [0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0]
 ```

 For a general case, you should give `byElement` a try:

 https://dlang.org/phobos/std_experimental_ndslice_selection.html#byElement

in that case:

import std.array : array;
int[] x = slice.byElement.array;

thanks, now I can go to bed!

Jun 14 2016

Seb <seb wilzba.ch> writes:

On Wednesday, 15 June 2016 at 03:11:23 UTC, data pulverizer wrote:
 in that case:

 import std.array : array;
 int[] x = slice.byElement.array;

Are you sure you want to create a _copy_ of your data? In most 
cases you don't need that ;-)

 thanks, now I can go to bed!

You are welcome. Sleep tight!

Jun 14 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 03:17:39 UTC, Seb wrote:
 On Wednesday, 15 June 2016 at 03:11:23 UTC, data pulverizer 
 wrote:
 in that case:

 import std.array : array;
 int[] x = slice.byElement.array;

 Are you sure you want to create a _copy_ of your data? In most 
 cases you don't need that ;-)

 thanks, now I can go to bed!

 You are welcome. Sleep tight!

Thanks, I did.

I definitely don't want to create a copy! I thought .byElement 
would provide a range, which I assumed was a reference. Am I 
forcing it to copy by using .array?

Jun 15 2016

Andrea Fontana <nospam example.com> writes:

On Wednesday, 15 June 2016 at 07:24:23 UTC, data pulverizer wrote:
 On Wednesday, 15 June 2016 at 03:17:39 UTC, Seb wrote:
 On Wednesday, 15 June 2016 at 03:11:23 UTC, data pulverizer 
 wrote:
 in that case:

 import std.array : array;
 int[] x = slice.byElement.array;

 Are you sure you want to create a _copy_ of your data? In most 
 cases you don't need that ;-)

 thanks, now I can go to bed!

 You are welcome. Sleep tight!

 Thanks, I did.

 I definitely don't want to create a copy! I thought .byElement 
 would provide a range, which I assumed was a reference. Am I 
 forcing it to copy by using .array?

Yes. You're forcing it to read all elements and copy them into a 
new array.
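
To make the difference concrete, here is a minimal sketch that uses only plain Phobos ranges as a stand-in for `byElement` (an assumption, so that the example does not depend on a particular ndslice version; `materialize` is a hypothetical helper name). It illustrates that `.array` allocates fresh memory and copies every element into it:

```d
import std.algorithm.iteration : map;
import std.array : array;

// Hypothetical helper: build a lazy range over `data`, then materialise it.
int[] materialize(int[] data)
{
    auto lazyView = data.map!(x => x); // lazy: nothing is allocated here
    return lazyView.array;             // eager: allocates and copies all elements
}

void main()
{
    int[] data = [1, 2, 3, 4, 5, 6];
    int[] copy = materialize(data);

    assert(copy == data);          // same contents...
    assert(copy.ptr !is data.ptr); // ...but stored in different memory

    copy[0] = 42;
    assert(data[0] == 1);          // the original is untouched
}
```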

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 07:45:12 UTC, Andrea Fontana wrote:
 On Wednesday, 15 June 2016 at 07:24:23 UTC, data pulverizer 
 wrote:
 On Wednesday, 15 June 2016 at 03:17:39 UTC, Seb wrote:
 On Wednesday, 15 June 2016 at 03:11:23 UTC, data pulverizer 
 wrote:
 in that case:

 import std.array : array;
 int[] x = slice.byElement.array;

 Are you sure you want to create a _copy_ of your data? In 
 most cases you don't need that ;-)

 thanks, now I can go to bed!

 You are welcome. Sleep tight!

 Thanks, I did.

 I definitely don't want to create a copy! I thought .byElement 
 would provide a range, which I assumed was a reference. Am I 
 forcing it to copy by using .array?

 Yes. You're forcing it to read all elements and copy them into a 
 new array.

I guess foreach would not copy the elements? For example:

foreach(el; slice.byElement)
		x ~= el;

But it feels wrong to be doing work pulling elements that already 
exist by using foreach. I feel as if I am missing something 
obvious but can't get it.

Jun 15 2016

Andrea Fontana <nospam example.com> writes:

On Wednesday, 15 June 2016 at 08:25:35 UTC, data pulverizer wrote:
 I guess foreach would not copy the elements? For example:

 foreach(el; slice.byElement)
 		x ~= el;

 But it feels wrong to be doing work pulling elements that 
 already exist by using foreach. I feel as if I am missing 
 something obvious but can't get it.

The question is: why do you need to put them inside an array? If you 
can, leave them in the lazy range and work on it.
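
As a rough illustration of working on a lazy range directly, the sketch below sums all elements of a row-major matrix without building an intermediate array. It uses plain Phobos (`chunks`, `joiner`) as a stand-in for an ndslice view, and `total` is a hypothetical helper name, so treat it as an analogy and not as the ndslice API:

```d
import std.algorithm.iteration : joiner, sum;
import std.range : chunks;

// Hypothetical helper: sum a row-major matrix stored flat in `data`
// with `ncol` columns, without allocating a flattened copy.
int total(int[] data, size_t ncol)
{
    auto rows = data.chunks(ncol); // lazy range of rows (each a slice of data)
    return rows.joiner.sum;        // flattened lazily and consumed by sum
}

void main()
{
    int[] data = [1, 2, 3, 4, 5, 6]; // a 2x3 matrix
    assert(total(data, 3) == 21);
}
```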

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 08:53:22 UTC, Andrea Fontana wrote:
 On Wednesday, 15 June 2016 at 08:25:35 UTC, data pulverizer 
 wrote:
 I guess foreach would not copy the elements? For example:

 foreach(el; slice.byElement)
 		x ~= el;

 But it feels wrong to be doing work pulling elements that 
 already exist by using foreach. I feel as if I am missing 
 something obvious but can't get it.

 The question is: why do you need to put them inside an array? If 
 you can, leave them in the lazy range and work on it.

I need this to work with external libraries that only deal with 
one dimensional arrays.

Jun 15 2016

Andrea Fontana <nospam example.com> writes:

On Wednesday, 15 June 2016 at 08:56:15 UTC, data pulverizer wrote:
 On Wednesday, 15 June 2016 at 08:53:22 UTC, Andrea Fontana 
 wrote:
 On Wednesday, 15 June 2016 at 08:25:35 UTC, data pulverizer 
 wrote:
 I guess foreach would not copy the elements? For example:

 foreach(el; slice.byElement)
 		x ~= el;

 But it feels wrong to be doing work pulling elements that 
 already exist by using foreach. I feel as if I am missing 
 something obvious but can't get it.

 The question is: why do you need to put them inside an array? If 
 you can, leave them in the lazy range and work on it.

 I need this to work with external libraries that only deal with 
 one dimensional arrays.

Then I think the slice.byElement.array is the right solution.

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 09:32:21 UTC, Andrea Fontana wrote:
 Then I think the slice.byElement.array is the right solution.

The problem with that is that it slows down the code. I compared 
matrix multiplication between R and D's cblas adaptor and ndslice.

n = 4000
Matrices: A, B
Sizes: both n by n
Engine: both call openblas

R Elapsed Time: 2.709 s
D's cblas and ndslice: 3.593 s

The R code:

n = 4000; A = matrix(runif(n*n), nr = n); B = matrix(runif(n*n), 
nr = n)
system.time(C <- A%*%B)

The D code:

import std.stdio : writeln;
import std.experimental.ndslice;
import std.random : Random, uniform;
import std.conv : to;
import std.array : array;
import cblas;
import std.datetime : StopWatch;


T[] runif(T)(ulong len, T min, T max){
	T[] arr = new T[len];
	Random gen;
	for(ulong i = 0; i < len; ++i)
		arr[i] = uniform(min, max, gen);
	return arr;
}

// Random matrix
auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
	return runif(nrow*ncol, min, max).sliced(nrow, ncol);
}

auto matrix_mult(T)(Slice!(2, T*) a, Slice!(2, T*) b){
	int M = to!int(a.shape[0]);
	int K = to!int(a.shape[1]);
	int N = to!int(b.shape[1]);
	int n_el = to!int(a.elementsCount);
	T[] A = a.byElement.array;
	T[] B = b.byElement.array;
	T[] C = new T[M*N];
	gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, M, N, 
K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
	return C.sliced(M, N);
}


void main()
{
	int n = 4000;
	auto A = rmat(n, n, 0., 1.);
	auto B = rmat(n, n, 0., 1. );
	StopWatch sw;
	sw.start();
	auto C = matrix_mult(A, B);
	sw.stop();
	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
}

In my system monitor I can see the copy phase in the D process 
running on a single core. There should be a way to go from 
ndslice to T[] without copying. Using a foreach loop is even 
slower.

Jun 15 2016

Seb <seb wilzba.ch> writes:

On Wednesday, 15 June 2016 at 11:19:20 UTC, data pulverizer wrote:
 On Wednesday, 15 June 2016 at 09:32:21 UTC, Andrea Fontana 
 wrote:
 Then I think the slice.byElement.array is the right solution.

 The problem with that is that it slows down the code. I 
 compared matrix multiplication between R and D's cblas adaptor 
 and ndslice.

 n = 4000
 Matrices: A, B
 Sizes: both n by n
 Engine: both call openblas

 R Elapsed Time: 2.709 s
 D's cblas and ndslice: 3.593 s

 The R code:

 n = 4000; A = matrix(runif(n*n), nr = n); B = 
 matrix(runif(n*n), nr = n)
 system.time(C <- A%*%B)

 The D code:

 import std.stdio : writeln;
 import std.experimental.ndslice;
 import std.random : Random, uniform;
 import std.conv : to;
 import std.array : array;
 import cblas;
 import std.datetime : StopWatch;


 T[] runif(T)(ulong len, T min, T max){
 	T[] arr = new T[len];
 	Random gen;
 	for(ulong i = 0; i < len; ++i)
 		arr[i] = uniform(min, max, gen);
 	return arr;
 }

 // Random matrix
 auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
 	return runif(nrow*ncol, min, max).sliced(nrow, ncol);
 }

 auto matrix_mult(T)(Slice!(2, T*) a, Slice!(2, T*) b){
 	int M = to!int(a.shape[0]);
 	int K = to!int(a.shape[1]);
 	int N = to!int(b.shape[1]);
 	int n_el = to!int(a.elementsCount);
 	T[] A = a.byElement.array;
 	T[] B = b.byElement.array;
 	T[] C = new T[M*N];
 	gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, M, 
 N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
 	return C.sliced(M, N);
 }


 void main()
 {
 	int n = 4000;
 	auto A = rmat(n, n, 0., 1.);
 	auto B = rmat(n, n, 0., 1. );
 	StopWatch sw;
 	sw.start();
 	auto C = matrix_mult(A, B);
 	sw.stop();
 	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
 }

 In my system monitor I can see the copy phase in the D process 
 running on a single core. There should be a way to go from 
 ndslice to T[] without copying. Using a foreach loop is even 
 slower.

As I said, you can avoid the copy (see below). I also profiled it a 
bit, and it was interesting to see that 50% of the runtime is 
spent on generating the random matrices. On my machine both 
scripts now take 1.5 s when compiled with

DFLAGS="-release -O3 -boundscheck=off" dub foo2.d --compiler=ldc
(`-b release` would also work)


/+ dub.sdl:
name "matrix_mult"
dependency "cblas" version="~master"
dependency "mir" version="~>0.15"
+/
import std.stdio : writeln;
import mir.ndslice;
import std.random : Random, uniform;
import std.conv : to;
import std.array : array;
import cblas;
import std.datetime : StopWatch;


T[] runif(T)(ulong len, T min, T max){
	T[] arr = new T[len];
	Random gen;
	for(ulong i = 0; i < len; ++i)
		arr[i] = uniform(min, max, gen);
	return arr;
}

// Random matrix
auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
	import std.typecons : tuple;
	auto arr = runif(nrow*ncol, min, max);
	return tuple(arr, arr.sliced(nrow, ncol));
}

auto matrix_mult(T)(T[] A, T[] B, Slice!(2, T*) a, Slice!(2, T*) b){
	int M = to!int(a.shape[0]);
	int K = to!int(a.shape[1]);
	int N = to!int(b.shape[1]);
	int n_el = to!int(a.elementsCount);
	T[] C = new T[M*N];
	gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, M, N, K,
		1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
	return C.sliced(M, N);
}

void main()
{
	int n = 4000;
	auto ta = rmat(n, n, 0., 1.);
	auto tb = rmat(n, n, 0., 1. );
	StopWatch sw;
	sw.start();
	auto C = matrix_mult(ta[0], tb[0], ta[1], tb[1]);
	sw.stop();
	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
}

For performance issues, you should definitely open an issue at 
mir (the development library of ndslice): 
https://github.com/libmir/mir

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 12:10:32 UTC, Seb wrote:
 As I said, you can avoid the copy (see below). I also profiled it 
 a bit, and it was interesting to see that 50% of the runtime is 
 spent on generating the random matrices. On my machine both 
 scripts now take 1.5 s when compiled with

I didn't benchmark the RNG, but I did notice it took a lot of time 
to generate the matrix. For now, though, I am focused on the BLAS 
side of things.

I am puzzled about how your code works:

Firstly:
I didn't know that you could substitute an array for its first 
element in D though I am aware that a pointer to an array's first 
element is equivalent to passing the array in C.
 auto matrix_mult(T)(T[] A, T[] B, Slice!(2, T*) a, Slice!(2, 
 T*) b){
 	...
     gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, 
 M, N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
 	return C.sliced(M, N);
 }

Secondly:
I am especially puzzled about using the second element to stand 
in for the slice itself. How does that work? And where can I find 
more cool tricks like that?

 void main()
 {
 	...
 	auto C = matrix_mult(ta[0], tb[0], ta[1], tb[1]);
 	sw.stop();
 	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
 }

Many thanks!

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

Oh, I didn't see that rmat now returns a tuple.

Jun 15 2016

Seb <seb wilzba.ch> writes:

On Wednesday, 15 June 2016 at 13:13:05 UTC, data pulverizer wrote:
 On Wednesday, 15 June 2016 at 12:10:32 UTC, Seb wrote:
 As I said, you can avoid the copy (see below). I also profiled it 
 a bit, and it was interesting to see that 50% of the runtime 
 is spent on generating the random matrices. On my machine both 
 scripts now take 1.5 s when compiled with

 I didn't benchmark the RNG, but I did notice it took a lot of 
 time to generate the matrix. For now, though, I am focused on 
 the BLAS side of things.

 I am puzzled about how your code works:

 Firstly:
 I didn't know that you could substitute an array for its first 
 element in D though I am aware that a pointer to an array's 
 first element is equivalent to passing the array in C.
 auto matrix_mult(T)(T[] A, T[] B, Slice!(2, T*) a, Slice!(2, 
 T*) b){
 	...
     gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, 
 M, N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
 	return C.sliced(M, N);
 }


You wrote that too :-)
For more info, see:

https://dlang.org/spec/arrays.html

However, working with raw pointers is very dangerous (indexing 
through a pointer is not bounds-checked), so just use slices 
wherever you can.
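
A small sketch of the pointer/slice relationship in plain D follows. `sumRaw` is a hypothetical stand-in for an external C-style routine that only accepts a pointer and a length; it is not part of cblas or ndslice:

```d
// Hypothetical C-style routine: it sees only a raw pointer and a length.
double sumRaw(const(double)* p, size_t n)
{
    double s = 0;
    foreach (i; 0 .. n)
        s += p[i]; // indexing a raw pointer is not bounds-checked
    return s;
}

void main()
{
    double[] arr = [1.0, 2.0, 3.0];

    // A D array is a (length, pointer) pair; .ptr exposes the pointer part.
    assert(sumRaw(arr.ptr, arr.length) == 6.0);

    // Slicing the pointer rebuilds a bounds-checked array over the same
    // memory, so no elements are copied.
    double[] again = arr.ptr[0 .. arr.length];
    assert(again.ptr is arr.ptr);
    again[0] = 10.0;
    assert(arr[0] == 10.0);
}
```

The caller is responsible for passing a length that matches the allocation, which is why keeping the slice around is usually the safer choice.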

 Secondly:
 I am especially puzzled about using the second element to stand 
 in for the slice itself. How does that work? And where can I 
 find more cool tricks like that?

 void main()
 {
 	...
 	auto C = matrix_mult(ta[0], tb[0], ta[1], tb[1]);
 	sw.stop();
 	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
 }

 Many thanks!


Btw you don't even need to save tuples, the pointer is already 
saved in the slice ;-)
N.B.: as far as I know, you need the latest version of mir, because 
std.experimental.ndslice in 2.071 doesn't expose the `.ptr` (yet).


// Random matrix
auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
	return runif(nrow*ncol, min, max).sliced(nrow, ncol);
}

auto matrix_mult(T)(Slice!(2, T*) a, Slice!(2, T*) b){
	int M = to!int(a.shape[0]);
	int K = to!int(a.shape[1]);
	int N = to!int(b.shape[1]);
	int n_el = to!int(a.elementsCount);
	T[] C = new T[M*N];
	gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans, M, N, K,
		1., a.ptr, K, b.ptr, N, 0, C.ptr, N);
	return C.sliced(M, N);
}

void main()
{
	int n = 4000;
	auto A = rmat(n, n, 0., 1.);
	auto B = rmat(n, n, 0., 1. );
	StopWatch sw;
	sw.start();
	auto C = matrix_mult(A, B);
	sw.stop();
	writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
}

If you really want to get the original T[] back, you could use 
something like

```
T[] a = slice.ptr[0.. slice.elementsCount];
```

but for most cases `byElement` would be a lot better, because all 
transformations etc. are of course only applied to your view, not 
to the underlying memory.

 And where can I find more cool tricks like that?

Browse the source code and the unittests. Phobos is an amazing 
resource :)

Jun 15 2016

data pulverizer <data.pulverizer gmail.com> writes:

On Wednesday, 15 June 2016 at 14:14:23 UTC, Seb wrote:
 On Wednesday, 15 June 2016 at 13:13:05 UTC, data pulverizer
 And where can I find more cool tricks like that?

 Browse the source code and the unittests. Phobos is an amazing 
 resource :)

Very true!
That's great, many thanks!

Jun 15 2016

Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:

On Wednesday, 15 June 2016 at 14:14:23 UTC, Seb wrote:
 ```
 T[] a = slice.ptr[0.. slice.elementsCount];
 ```

This would work only for slices with a contiguous memory 
representation and positive strides. -- Ilya
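
The caveat can be illustrated with plain D arrays and `std.range.stride` standing in for a strided ndslice view (an analogy, not the ndslice API; `firstColumn` and `naivePrefix` are hypothetical helper names). A column of a row-major matrix is not contiguous, so "pointer plus element count" returns the wrong elements:

```d
import std.array : array;
import std.range : stride;

// Hypothetical helper: first column of a row-major matrix with `ncol`
// columns, i.e. a view with stride `ncol`, copied out element by element.
int[] firstColumn(int[] buf, size_t ncol)
{
    return buf.stride(ncol).array;
}

// Hypothetical helper: the naive "ptr[0 .. count]" approach applied to
// the start address of that same view.
int[] naivePrefix(int[] buf, size_t count)
{
    return buf.ptr[0 .. count];
}

void main()
{
    // Row-major 3x4 matrix in one flat buffer.
    int[] buf = [0, 1, 2, 3,
                 4, 5, 6, 7,
                 8, 9, 10, 11];

    assert(firstColumn(buf, 4) == [0, 4, 8]); // the actual column
    assert(naivePrefix(buf, 3) == [0, 1, 2]); // contiguous memory instead
}
```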

Jun 15 2016
