digitalmars.D.learn - ndslice: convert a sliced object to T[]
- data pulverizer (6/6) Jun 14 2016 How do I unravel a sliced item T[].sliced(...) to an array T[]?
- Seb (11/17) Jun 14 2016 A slice is just a _view_ on your memory, the easiest way is to
- data pulverizer (5/26) Jun 14 2016 in that case:
- Seb (4/8) Jun 14 2016 Are you sure you want to create a _copy_ of your data? In most
- data pulverizer (5/15) Jun 15 2016 Thanks, I did.
- Andrea Fontana (3/21) Jun 15 2016 Yes. You're forcing it to read all elements and copy them in a
- data pulverizer (7/31) Jun 15 2016 I guess foreach would not copy the elements? for example:
- Andrea Fontana (3/9) Jun 15 2016 The question is: why you need to put them inside an array? If you
- data pulverizer (3/15) Jun 15 2016 I need this to work with external libraries that only deal with
- Andrea Fontana (2/19) Jun 15 2016 Then I think the slice.byElement.array is the right solution.
- data pulverizer (59/60) Jun 15 2016 The problem with that is that it slows down the code. I compared
- Seb (58/120) Jun 15 2016 As said you can avoid the copy (see below). I also profiled it a
- data pulverizer (14/32) Jun 15 2016 I didn't benchmark the RNG but I did notice it took a lot of time
- data pulverizer (1/1) Jun 15 2016 Oh, I didn't see that runif now returns a tuple.
- Seb (44/80) Jun 15 2016 You wrote that too :-)
- data pulverizer (3/7) Jun 15 2016 Very true!
- Ilya Yaroshenko (3/6) Jun 15 2016 This would work only for slices with continuous memory
How do I unravel a sliced item T[].sliced(...) to an array T[]?

For instance:

    import std.experimental.ndslice;
    auto slice = new int[12].sliced(3, 4);
    int[] x = ??;

Thanks
Jun 14 2016
On Wednesday, 15 June 2016 at 02:43:37 UTC, data pulverizer wrote:
> How do I unravel a sliced item T[].sliced(...) to an array T[]?
>
> For instance:
>
>     import std.experimental.ndslice;
>     auto slice = new int[12].sliced(3, 4);
>     int[] x = ??;
>
> Thanks

A slice is just a _view_ on your memory. The easiest way is to save a reference to your array like this:

```
int[] arr = new int[12];
auto slice = arr.sliced(3, 4);
slice[1, 1] = 42;
arr // [0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0]
```

For the general case, you should give `byElement` a try:
https://dlang.org/phobos/std_experimental_ndslice_selection.html#byElement
Jun 14 2016
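The "view on your memory" idea in the reply above can be tried without ndslice at all. The following is a minimal sketch that uses a built-in D slice as a stand-in for an ndslice view; `viewSharesStorage` is a hypothetical helper name introduced only for this illustration, and the example says nothing about how ndslice itself is implemented.

```d
import std.stdio : writeln;

/// Hypothetical helper: returns true if writing through a slice
/// is visible in the backing array.
bool viewSharesStorage()
{
    int[] arr = new int[12];      // backing storage, zero-initialised
    int[] view = arr[4 .. 8];     // (pointer, length) pair; no elements copied
    view[1] = 42;                 // lands in arr[5]
    return arr[5] == 42 && view.ptr is arr.ptr + 4;
}

void main()
{
    assert(viewSharesStorage());
    writeln("view shares storage: ", viewSharesStorage());
}
```

Keeping a reference to the original `int[]`, as suggested in the reply, relies on the same property: both names refer to one block of memory.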
On Wednesday, 15 June 2016 at 02:50:30 UTC, Seb wrote:
> For the general case, you should give `byElement` a try:
> https://dlang.org/phobos/std_experimental_ndslice_selection.html#byElement

In that case:

    import std.array : array;
    int[] x = slice.byElement.array;

Thanks, now I can go to bed!
Jun 14 2016
On Wednesday, 15 June 2016 at 03:11:23 UTC, data pulverizer wrote:
> In that case:
>
>     import std.array : array;
>     int[] x = slice.byElement.array;

Are you sure you want to create a _copy_ of your data? In most cases you don't need that ;-)

> Thanks, now I can go to bed!

You are welcome. Sleep tight!
Jun 14 2016
On Wednesday, 15 June 2016 at 03:17:39 UTC, Seb wrote:
> Are you sure you want to create a _copy_ of your data? In most cases you don't need that ;-)
>
> You are welcome. Sleep tight!

Thanks, I did. I definitely don't want to create a copy! I thought .byElement would provide a range, which I assumed was a reference. Am I forcing it to copy by using .array?
Jun 15 2016
On Wednesday, 15 June 2016 at 07:24:23 UTC, data pulverizer wrote:
> I definitely don't want to create a copy! I thought .byElement would provide a range, which I assumed was a reference. Am I forcing it to copy by using .array?

Yes. You're forcing it to read all elements and copy them into a new array.
Jun 15 2016
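The copy described in the reply above is easy to observe with Phobos alone. This is a small sketch, assuming a plain `int[]` wrapped in a lazy `map` as a stand-in for `slice.byElement`; `arrayMakesCopy` is a hypothetical name used only here.

```d
import std.algorithm.iteration : map;
import std.array : array;

/// Hypothetical helper: returns true if materialising a lazy range
/// with .array produces storage that is independent of the source.
bool arrayMakesCopy()
{
    int[] source = [1, 2, 3, 4];
    auto lazyRange = source.map!(x => x);   // lazy: nothing allocated yet
    int[] copy = lazyRange.array;           // eager: allocates and copies every element
    copy[0] = 99;                           // does not touch source
    return source[0] == 1 && copy.ptr !is source.ptr;
}

void main()
{
    assert(arrayMakesCopy());
}
```

The range itself only refers to the original data; it is the call to `.array` that walks the range and allocates.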
On Wednesday, 15 June 2016 at 07:45:12 UTC, Andrea Fontana wrote:
> Yes. You're forcing it to read all elements and copy them into a new array.

I guess foreach would not copy the elements? For example:

    foreach(el; slice.byElement)
        x ~= el;

But it feels wrong to be doing work pulling out elements that already exist by using foreach. I feel as if I am missing something obvious but can't get it.
Jun 15 2016
On Wednesday, 15 June 2016 at 08:25:35 UTC, data pulverizer wrote:
> I guess foreach would not copy the elements? For example:
>
>     foreach(el; slice.byElement)
>         x ~= el;
>
> But it feels wrong to be doing work pulling out elements that already exist by using foreach. I feel as if I am missing something obvious but can't get it.

The question is: why do you need to put them inside an array? If you can, leave them in the lazy range and work on it.
Jun 15 2016
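As a rough illustration of "leave them in the lazy range and work on it", the sketch below sums a row-wise matrix without flattening it first. It assumes an `int[][]` and Phobos' `joiner` as a stand-in for `byElement`; `lazyTotal` is a hypothetical helper name.

```d
import std.algorithm.iteration : joiner, sum;

/// Hypothetical helper: sums all elements of a row-wise matrix
/// without building a flattened array.
int lazyTotal(int[][] rows)
{
    // joiner visits the rows one element at a time; no new array is allocated.
    return rows.joiner.sum;
}

void main()
{
    int[][] rows = [[1, 2, 3], [4, 5, 6]];
    assert(lazyTotal(rows) == 21);
}
```

This only helps when the consumer accepts a range; an external library that requires a one-dimensional array or a raw pointer cannot use it.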
On Wednesday, 15 June 2016 at 08:53:22 UTC, Andrea Fontana wrote:
> The question is: why do you need to put them inside an array? If you can, leave them in the lazy range and work on it.

I need this to work with external libraries that only deal with one-dimensional arrays.
Jun 15 2016
On Wednesday, 15 June 2016 at 08:56:15 UTC, data pulverizer wrote:
> I need this to work with external libraries that only deal with one-dimensional arrays.

Then I think the slice.byElement.array is the right solution.
Jun 15 2016
On Wednesday, 15 June 2016 at 09:32:21 UTC, Andrea Fontana wrote:
> Then I think the slice.byElement.array is the right solution.

The problem with that is that it slows down the code. I compared matrix multiplication between R and D's cblas adaptor and ndslice.

n = 4000
Matrices: A, B
Sizes: both n by n
Engine: both call openblas

R elapsed time: 2.709 s
D's cblas and ndslice: 3.593 s

The R code:

    n = 4000; A = matrix(runif(n*n), nr = n); B = matrix(runif(n*n), nr = n)
    system.time(C <- A%*%B)

The D code:

    import std.stdio : writeln;
    import std.experimental.ndslice;
    import std.random : Random, uniform;
    import std.conv : to;
    import std.array : array;
    import cblas;
    import std.datetime : StopWatch;

    T[] runif(T)(ulong len, T min, T max){
        T[] arr = new T[len];
        Random gen;
        for(ulong i = 0; i < len; ++i)
            arr[i] = uniform(min, max, gen);
        return arr;
    }

    // Random matrix
    auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
        return runif(nrow*ncol, min, max).sliced(nrow, ncol);
    }

    auto matrix_mult(T)(Slice!(2, T*) a, Slice!(2, T*) b){
        int M = to!int(a.shape[0]);
        int K = to!int(a.shape[1]);
        int N = to!int(b.shape[1]);
        int n_el = to!int(a.elementsCount);
        T[] A = a.byElement.array;   // copies all of a
        T[] B = b.byElement.array;   // copies all of b
        T[] C = new T[M*N];
        gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans,
             M, N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
        return C.sliced(M, N);
    }

    void main()
    {
        int n = 4000;
        auto A = rmat(n, n, 0., 1.);
        auto B = rmat(n, n, 0., 1.);
        StopWatch sw;
        sw.start();
        auto C = matrix_mult(A, B);
        sw.stop();
        writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
    }

In my system monitor I can see the copy phase in the D process as a single-core process. There should be a way to go from ndslice to T[] without copying. Using a foreach loop is even slower.
Jun 15 2016
On Wednesday, 15 June 2016 at 11:19:20 UTC, data pulverizer wrote:
> The problem with that is that it slows down the code. I compared matrix multiplication between R and D's cblas adaptor and ndslice.
> [...]
> There should be a way to go from ndslice to T[] without copying.

As said, you can avoid the copy (see below). I also profiled it a bit, and it was interesting to see that 50% of the runtime is spent on generating the random matrix. On my machine both scripts now take 1.5 s when compiled with

    DFLAGS="-release -O3 -boundscheck=off" dub foo2.d --compiler=ldc

(`-b release` would also work)

    /+ dub.sdl:
    name "matrix_mult"
    dependency "cblas" version="~master"
    dependency "mir" version="~>0.15"
    +/
    import std.stdio : writeln;
    import mir.ndslice;
    import std.random : Random, uniform;
    import std.conv : to;
    import std.array : array;
    import cblas;
    import std.datetime : StopWatch;

    T[] runif(T)(ulong len, T min, T max){
        T[] arr = new T[len];
        Random gen;
        for(ulong i = 0; i < len; ++i)
            arr[i] = uniform(min, max, gen);
        return arr;
    }

    // Random matrix: returns both the backing array and the slice over it
    auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
        import std.typecons : tuple;
        auto arr = runif(nrow*ncol, min, max);
        return tuple(arr, arr.sliced(nrow, ncol));
    }

    auto matrix_mult(T)(T[] A, T[] B, Slice!(2, T*) a, Slice!(2, T*) b){
        int M = to!int(a.shape[0]);
        int K = to!int(a.shape[1]);
        int N = to!int(b.shape[1]);
        int n_el = to!int(a.elementsCount);
        T[] C = new T[M*N];
        gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans,
             M, N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
        return C.sliced(M, N);
    }

    void main()
    {
        int n = 4000;
        auto ta = rmat(n, n, 0., 1.);
        auto tb = rmat(n, n, 0., 1.);
        StopWatch sw;
        sw.start();
        auto C = matrix_mult(ta[0], tb[0], ta[1], tb[1]);
        sw.stop();
        writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
    }

For performance issues, you should definitely open an issue at mir (the development library of ndslice): https://github.com/libmir/mir
Jun 15 2016
On Wednesday, 15 June 2016 at 12:10:32 UTC, Seb wrote:
> As said, you can avoid the copy (see below). I also profiled it a bit, and it was interesting to see that 50% of the runtime is spent on generating the random matrix. On my machine both scripts now take 1.5 s when compiled with

I didn't benchmark the RNG, but I did notice it took a lot of time to generate the matrix. For now I am focused on the BLAS side of things. I am puzzled about how your code works:

Firstly: I didn't know that you could substitute an array for its first element in D, though I am aware that a pointer to an array's first element is equivalent to passing the array in C.

> auto matrix_mult(T)(T[] A, T[] B, Slice!(2, T*) a, Slice!(2, T*) b){
>     ...
>     gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans,
>          M, N, K, 1., A.ptr, K, B.ptr, N, 0, C.ptr, N);
>     return C.sliced(M, N);
> }

Secondly: I am especially puzzled about using the second element to stand in for the slice itself. How does that work? And where can I find more cool tricks like that?

> void main()
> {
>     ...
>     auto C = matrix_mult(ta[0], tb[0], ta[1], tb[1]);
>     sw.stop();
>     writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
> }

Many thanks!
Jun 15 2016
Oh, I didn't see that runif now returns a tuple.
Jun 15 2016
On Wednesday, 15 June 2016 at 13:13:05 UTC, data pulverizer wrote:
> Firstly: I didn't know that you could substitute an array for its first element in D, though I am aware that a pointer to an array's first element is equivalent to passing the array in C.

You wrote that too :-) For more info see: https://dlang.org/spec/arrays.html

However, that's very dangerous, so just use slices wherever you can.

> Secondly: I am especially puzzled about using the second element to stand in for the slice itself. How does that work?

Btw, you don't even need to save tuples; the pointer is already saved in the slice ;-)
N.b.: afaik you need the latest version of mir, because std.experimental.ndslice in 2.071 doesn't expose the `.ptr` (yet).

    // Random matrix
    auto rmat(T)(ulong nrow, ulong ncol, T min, T max){
        return runif(nrow*ncol, min, max).sliced(nrow, ncol);
    }

    auto matrix_mult(T)(Slice!(2, T*) a, Slice!(2, T*) b){
        int M = to!int(a.shape[0]);
        int K = to!int(a.shape[1]);
        int N = to!int(b.shape[1]);
        int n_el = to!int(a.elementsCount);
        T[] C = new T[M*N];
        // a.ptr and b.ptr point at the slices' own storage, so nothing is copied
        gemm(Order.ColMajor, Transpose.NoTrans, Transpose.NoTrans,
             M, N, K, 1., a.ptr, K, b.ptr, N, 0, C.ptr, N);
        return C.sliced(M, N);
    }

    void main()
    {
        int n = 4000;
        auto A = rmat(n, n, 0., 1.);
        auto B = rmat(n, n, 0., 1.);
        StopWatch sw;
        sw.start();
        auto C = matrix_mult(A, B);
        sw.stop();
        writeln("Time taken: \n\t", sw.peek().msecs, " [ms]");
    }

If you really want to get the original T[] back, you could use something like

```
T[] a = slice.ptr[0.. slice.elementsCount];
```

but for most cases `byElement` would be a lot better, because all transformations etc. are of course only applied to your view.

> And where can I find more cool tricks like that?

Browse the source code and the unittests. Phobos is an amazing resource :)
Jun 15 2016
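The `ptr[0 .. count]` idiom from the post above is ordinary D pointer slicing and can be tried without ndslice. A minimal sketch follows, assuming the pointer really does address `count` valid, adjacent elements; `asArray` is a hypothetical helper name, and the compiler cannot check the length for you.

```d
/// Hypothetical helper: rebuilds a T[] over `count` elements starting at `p`.
/// No elements are copied; the caller must guarantee the memory is valid.
T[] asArray(T)(T* p, size_t count)
{
    return p[0 .. count];
}

/// Returns true if the rebuilt array aliases the original storage.
bool roundTripAliases()
{
    double[] storage = new double[12];
    double* p = storage.ptr;                    // what a C library would receive
    double[] again = asArray(p, storage.length);
    again[5] = 1.5;                             // visible through storage too
    return storage[5] == 1.5 && again.ptr is storage.ptr;
}

void main()
{
    assert(roundTripAliases());
}
```

This is the "very dangerous" part mentioned in the post: a wrong `count` reads or writes past the end of the allocation without any bounds check at the point of slicing.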
On Wednesday, 15 June 2016 at 14:14:23 UTC, Seb wrote:
> Browse the source code and the unittests. Phobos is an amazing resource :)

Very true! That's great, many thanks!
Jun 15 2016
On Wednesday, 15 June 2016 at 14:14:23 UTC, Seb wrote:
> ```
> T[] a = slice.ptr[0.. slice.elementsCount];
> ```

This would work only for slices with a contiguous memory representation and positive strides.

-- Ilya
Jun 15 2016
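The caveat above can be seen with a strided view over a plain array. This is only a sketch, using Phobos' `std.range.stride` as a stand-in for a non-contiguous ndslice; `naiveMatchesStrided` is a hypothetical helper name.

```d
import std.algorithm.comparison : equal;
import std.range : stride;

/// Hypothetical helper: compares a stride-2 view with the result of
/// naively slicing the same number of elements from the base pointer.
bool naiveMatchesStrided()
{
    int[] storage = [0, 1, 2, 3, 4, 5];
    auto everyOther = storage.stride(2);   // logical elements: 0, 2, 4
    int[] naive = storage.ptr[0 .. 3];     // memory order:     0, 1, 2
    return everyOther.equal(naive);
}

void main()
{
    // The naive pointer slice picks up the wrong elements for a strided view.
    assert(!naiveMatchesStrided());
}
```

For a view like this, copying through the element range (as with `byElement.array` earlier in the thread) is the way to get the logical elements into one array.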