
digitalmars.D.learn - How to sum multidimensional arrays?

p.shkadzko <p.shkadzko gmail.com> writes:
I'd like to sum 2D arrays. Let's create 2 random 2D arrays and 
sum them.

```
import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : generate, take, chunks;
import std.array : array;

static T[][] rndMatrix(T)(T max, in int rows, in int cols)
{
     Xorshift rnd;
     rnd.seed(unpredictableSeed);
     const amount = rows * cols;
     return generate(() => uniform(0, max, rnd))
          .take(amount).array.chunks(cols).array;
}

void main() {
     int[][] m1 = rndMatrix(10, 2, 3);
     int[][] m2 = rndMatrix(10, 2, 3);

     auto c = m1[] + m2[];
}
```

This won't work because the compiler throws "Error: array 
operation m1[] + m2[] without destination memory not allowed". 
Looking at 
https://forum.dlang.org/thread/wnjepbggivhutgbyjagm@forum.dlang.org, 
I modified the code to:


```
void main() {
     int[][] m1 = rndMatrix(10, 2, 3);
     int[][] m2 = rndMatrix(10, 2, 3);

     int[][] c;
     c.length = m1.length;
     c[0].length = m1[0].length;
     c[1].length = m1[1].length;
	
     c[] = m1[] + m2[];
}
```
Well then I am getting

"/dlang/dmd/linux/bin64/../../src/druntime/import/core/internal/arra
/operations.d(165): Error: static assert:  "Binary + not supported for types
int[] and int[]."

Right, then I am trying the following

```
void main() {
     int[][] m1 = rndMatrix(10, 2, 3);
     int[][] m2 = rndMatrix(10, 2, 3);

     auto c = zip(m1[], m2[]).map!((a, b) => a + b);

}
```

Doesn't work either because

"Error: template D main.__lambda1 cannot deduce function from 
argument types !()(Tuple!(int[], int[])), candidates are:
onlineapp.d(21):        __lambda1
(...)


So, I have to flatten first, then zip + sum and then reshape back 
to the original.

```
auto c = zip(m1.joiner, m2.joiner)
    .map!(t => t[0] + t[1]).array.chunks(3).array;
```

This works but it does not look very efficient considering we 
flatten and then call array twice. It will get even worse with 
3D arrays. Is there a better way without relying on mir.ndslice?
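(A minimal sketch, not from the original post: the zip approach can 
also work row by row without flattening, as long as each result row 
gets its own destination storage. The helper name addRows is 
illustrative only.)

```
import std.algorithm : map;
import std.array : array;
import std.range : zip;

// Pair up corresponding rows, allocate a destination row, and let a
// per-row slice operation do the element-wise sum.
int[][] addRows(int[][] m1, int[][] m2)
{
    return zip(m1, m2).map!((t) {
        auto row = new int[t[0].length]; // destination memory for this row
        row[] = t[0][] + t[1][];         // element-wise sum into the row
        return row;
    }).array;
}
```

Usage would be `auto c = addRows(m1, m2);`, which keeps the original 
rows/cols shape without joiner or chunks.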
Feb 27 2020
p.shkadzko <p.shkadzko gmail.com> writes:
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 This works but it does not look very efficient considering we 
 flatten and then call array twice. It will get even worse 
 with 3D arrays.
And yes, benchmarks show that summing 2D arrays like in the 
example above is significantly slower than in numpy. But that is 
to be expected... I guess.

D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
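(A minimal sketch, not from the original post, of how such a timing 
could be taken with std.datetime.stopwatch; the figures above are the 
poster's own and the rndMatrix shape is reused from the first message.)

```
import std.algorithm : joiner, map;
import std.array : array;
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : chunks, generate, take, zip;
import std.stdio : writeln;

T[][] rndMatrix(T)(T max, in int rows, in int cols)
{
    auto rnd = Xorshift(unpredictableSeed);
    return generate(() => uniform(0, max, rnd))
        .take(rows * cols).array.chunks(cols).array;
}

void main()
{
    auto m1 = rndMatrix(10, 5000, 6000);
    auto m2 = rndMatrix(10, 5000, 6000);

    auto sw = StopWatch(AutoStart.yes);
    auto c = zip(m1.joiner, m2.joiner)
        .map!(t => t[0] + t[1])
        .array
        .chunks(6000)
        .array;
    sw.stop();
    writeln(sw.peek); // elapsed time for the sum alone
}
```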
Feb 27 2020
jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 This works but it does not look very efficient considering we 
 flatten and then call array twice. It will get even worse 
 with 3D arrays.
 And yes, benchmarks show that summing 2D arrays like in the 
 example above is significantly slower than in numpy. But that 
 is to be expected... I guess.

 D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
 numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
What's the performance of mir like? The code below seems to work 
without issue.

```
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import std.stdio : writeln;
import mir.random : Random, unpredictableSeed;
import mir.random.variable: UniformVariable;
import mir.random.algorithm: randomSlice;

auto rndMatrix(T)(T max, in int rows, in int cols)
{
    auto gen = Random(unpredictableSeed);
    auto rv = UniformVariable!T(0.0, max);
    return randomSlice(gen, rv, rows, cols);
}

void main()
{
    auto m1 = rndMatrix(10.0, 2, 3);
    auto m2 = rndMatrix(10.0, 2, 3);
    auto m3 = m1 + m2;

    writeln(m1);
    writeln(m2);
    writeln(m3);
}
```
Feb 27 2020
9il <ilyayaroshenko gmail.com> writes:
On Thursday, 27 February 2020 at 16:31:49 UTC, jmh530 wrote:
 On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko 
 wrote:
 This works but it does not look very efficient considering we 
 flatten and then call array twice. It will get even worse 
 with 3D arrays.
 And yes, benchmarks show that summing 2D arrays like in the 
 example above is significantly slower than in numpy. But that 
 is to be expected... I guess.

 D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
 numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
 What's the performance of mir like? The code below seems to 
 work without issue.

 [...]
The same as numpy for large matrices because the cost is memory 
access. Mir+LDC will be faster for small matrices because it will 
flatten the inner loop and use SIMD instructions.

A few performance nitpicks for your example, to be fair when 
benchmarking against the test:
1. Random (the default) is slower than Xorshift.
2. double is twice as large as int and requires twice as much 
memory, so it would be about twice as slow as int for large 
matrices.

Check the prev. post, we posted almost at the same time ;)
https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org
Feb 27 2020
jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 27 February 2020 at 16:39:15 UTC, 9il wrote:
 [snip]
 A few performance nitpicks for your example, to be fair when 
 benchmarking against the test:
 1. Random (the default) is slower than Xorshift.
 2. double is twice as large as int and requires twice as much 
 memory, so it would be about twice as slow as int for large 
 matrices.

 Check the prev. post, we posted almost at the same time ;)
 https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org
Those differences largely came from a lack of attention to 
detail. I didn't notice the Xorshift until after I posted. I used 
double because it's such a force of habit for me to use 
continuous distributions.

I came across this in the documentation:

UniformVariable!T uniformVariable(T = double)(in T a, in T b)
     if (isIntegral!T)

and did a double-take until I read the note associated with it in 
the source.
Feb 27 2020
bachmeier <no spam.net> writes:
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:

[...]

 This works but it does not look very efficient considering we 
 flatten and then call array twice. It will get even worse 
 with 3D arrays. Is there a better way without relying on 
 mir.ndslice?
Is there a reason you can't create a struct around a double[] 
like this?

struct Matrix {
     double[] data;
}

Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. 
But since I'm not sure what exactly you're doing, maybe that 
won't work.
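(A minimal sketch of that idea, not from the post; the names are 
illustrative only. The element-wise sum still needs destination 
memory, so the result's data is allocated first.)

```
struct Matrix
{
    double[] data; // flat storage; shape is tracked elsewhere
}

Matrix add(Matrix a, Matrix b)
{
    auto c = Matrix(new double[a.data.length]); // destination memory
    c.data[] = a.data[] + b.data[];             // one vectorized array op
    return c;
}
```

The whole matrix is summed with a single array operation instead of 
one operation per row.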
Feb 27 2020
p.shkadzko <p.shkadzko gmail.com> writes:
On Thursday, 27 February 2020 at 15:48:53 UTC, bachmeier wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:

 [...]

 This works but it does not look very efficient considering we 
 flatten and then call array twice. It will get even worse 
 with 3D arrays. Is there a better way without relying on 
 mir.ndslice?
 Is there a reason you can't create a struct around a double[] 
 like this?

 struct Matrix {
      double[] data;
 }

 Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. 
 But since I'm not sure what exactly you're doing, maybe that 
 won't work.
Right! Ok, here is how I do it.

```
import std.array : array;
import std.range : chunks;

struct Matrix(T) {
     T[] elems;
     int cols;

     T[][] to2D() {
         return elems.chunks(cols).array;
     }
}
```

and Matrix summing and random array generator functions

```
import std.random : Xorshift, uniform;
import std.range : generate, take;

auto matrixSum(Matrix!int m1, Matrix!int m2) {
     Matrix!int m3;
     m3.cols = m1.cols;
     m3.elems.length = m1.elems.length;
     m3.elems[] = m1.elems[] + m2.elems[];
     return m3.to2D;
}

static T[] rndArr(T)(in T max, in int elems) {
     Xorshift rnd;
     return generate(() => uniform(0, max, rnd)).take(elems).array;
}
```

Then we do the following

```
auto m1 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m2 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m3 = matrixSum(m1, m2);
```

And it works effortlessly!
The sum of two 5000 x 6000 int arrays takes just 0.105 sec (on a 
Windows machine with a weaker CPU, though).

I bet using mir.ndslice instead of D arrays would be even faster.
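(A minimal sketch, not posted in the thread: if overwriting one 
operand is acceptable, an in-place array operation on the flat 
storage avoids allocating a third buffer. It assumes the Matrix!T 
struct above; the helper name is made up.)

```
// Accumulate src into dst's storage; no new allocation.
void sumInto(T)(ref Matrix!T dst, Matrix!T src)
{
    assert(dst.elems.length == src.elems.length); // shapes must match
    dst.elems[] += src.elems[];                   // vectorized in-place add
}
```

After `sumInto(m1, m2);`, calling `m1.to2D` yields the summed matrix.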
Feb 27 2020
9il <ilyayaroshenko gmail.com> writes:
On Thursday, 27 February 2020 at 23:15:28 UTC, p.shkadzko wrote:
 And it works effortlessly!
 Sum of two 5000 x 6000 int arrays is just 0.105 sec! (on a 
 Windows machine though but with weaker CPU).

 I bet using mir.ndslice instead of D arrays would be even 
 faster.
Yes, the output for the following benchmark shows that Mir is 43% 
faster. However, when I checked the assembler output, both Mir and 
Std (really LDC in both cases) generate almost the same and best 
possible loops with AVX instructions for the summation. On the 
other hand, Mir is faster because it generates the random matrices 
faster and uses uninitialized memory for the summation target.

Output:
```
std: 426 ms, 432 μs, and 1 hnsec |10
mir: 297 ms, 694 μs, and 3 hnsecs |10
```

Run command: `dub --build=release --single --compiler=ldc2 test.d`
Note that the -mcpu=native flag is passed to LDC.

Source:
```
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
dflags "-mcpu=native" platform="ldc"
+/
int val;

void testStd()
{
    pragma(inline, false);

    static struct Matrix(T)
    {
        import std.range;

        T[] elems;
        int cols;

        T[][] to2D() { return elems.chunks(cols).array; }
    }

    static auto matrixSum(Matrix!int m1, Matrix!int m2)
    {
        Matrix!int m3;
        m3.cols = m1.cols;
        m3.elems.length = m1.elems.length;
        m3.elems[] = m1.elems[] + m2.elems[];
        return m3.to2D;
    }

    static T[] rndArr(T)(in T max, in int elems)
    {
        import std.random;
        import std.range;

        Xorshift rnd;
        return generate(() => uniform(0, max, rnd)).take(elems).array;
    }

    auto m1 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
    auto m2 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
    auto m3 = matrixSum(m1, m2);
    val = m3[$ - 1][$ - 1];
}

void testMir()
{
    pragma(inline, false);

    import mir.ndslice;
    import mir.random: threadLocal;
    import mir.random.variable: uniformVar;
    import mir.random.algorithm: randomSlice;
    import mir.random.engine.xorshift;

    auto m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [5000, 6000]);
    auto m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [5000, 6000]);
    auto m3 = slice(m1 + m2);
    val = m3[$ - 1][$ - 1];
}

void main()
{
    import std.datetime.stopwatch;
    import std.stdio;
    import core.memory;

    GC.disable;

    StopWatch clock;

    clock.reset;
    clock.start;
    testStd;
    clock.stop;
    writeln("std: ", clock.peek, " |", val);

    clock.reset;
    clock.start;
    testMir;
    clock.stop;
    writeln("mir: ", clock.peek, " |", val);
}
```
Feb 27 2020
9il <ilyayaroshenko gmail.com> writes:
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 Is there a better way without relying on mir.ndslice?
ndslice Poker Face

```
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import mir.ndslice;
import mir.random: threadLocal;
import mir.random.variable: uniformVar;
import mir.random.algorithm: randomSlice;
import mir.random.engine.xorshift;

void main()
{
    Slice!(int*, 2) m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) c = slice(m1 + m2);
}
```
Feb 27 2020
p.shkadzko <p.shkadzko gmail.com> writes:
On Thursday, 27 February 2020 at 16:31:07 UTC, 9il wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 Is there a better way without relying on mir.ndslice?
 ndslice Poker Face

 [...]
Yes, mir.ndslice is a straightforward choice for multidimensional 
arrays. I shall do some benchmarks with it next. But first, I'll 
try to do it with standard D ops and see what the rough difference 
is against numpy's C.
Feb 27 2020
Andrea Fontana <nospam example.com> writes:
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 void main() {
     int[][] m1 = rndMatrix(10, 2, 3);
     int[][] m2 = rndMatrix(10, 2, 3);

     auto c = m1[] + m2[];
 }
I think you're trying to do this:

```
int[][] m1 = rndMatrix(10, 2, 3);
int[][] m2 = rndMatrix(10, 2, 3);

int[][] m3;
m3.length = m1.length;

foreach(i; 0..m1.length)
{
     m3[i].length = m1[i].length;
     m3[i][] = m1[i][] + m2[i][];
}
```

But of course that's not the best solution :)
Feb 27 2020
AB <nospam nospam.me> writes:
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
 I'd like to sum 2D arrays. Let's create 2 random 2D arrays and 
 sum them.

 ```
 import std.random : Xorshift, unpredictableSeed, uniform;
 import std.range : generate, take, chunks;
 import std.array : array;

 static T[][] rndMatrix(T)(T max, in int rows, in int cols)
 {
     Xorshift rnd;
     rnd.seed(unpredictableSeed);
     const amount = rows * cols;
      return generate(() => uniform(0, max, rnd))
           .take(amount).array.chunks(cols).array;
 }

 void main() {
     int[][] m1 = rndMatrix(10, 2, 3);
     int[][] m2 = rndMatrix(10, 2, 3);

     auto c = m1[] + m2[];
 }
 ```
Maybe this is already clear, but it is important to highlight 
that 2D arrays and arrays of arrays are two different things. 
int[][] is an array of arrays: for each outer index, the element 
is an array that has its own allocated memory and length. 2D 
arrays are not provided by the language, but they can be 
implemented by defining a type with the required operator 
overloads. Using an array of arrays for a 10000x10000 matrix 
requires 10001 allocations, while a dedicated 2D array 
implementation needs only one.

Example of an array of arrays where the inner arrays have 
different lengths:
------------
module test;

void main()
{
    import std.stdio;

    int[][] a;
    a.length = 3;
    a[0] = [1, 2, 3];
    a[1] = [3, 4];
    a[2] = [];
    writeln(a);
}
------------

Your example with a minimal 2D array:
------------
module test2;

import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : generate, take, chunks;
import std.array : array;
import std.stdio : writeln;

struct Matrix(T)
{
    int rows;
    T[] data;
    alias data this;

    int cols() { return cast(int) data.length / rows; }
    this(int r, int c) { data = new T[r * c]; rows = r; }
    this(int r, int c, T[] d) { assert(r * c == d.length); data = d; rows = r; }
    auto opIndex(int r, int c) { return data[rows * c + r]; }
}

auto rndMatrix(T)(T max, in int rows, in int cols)
{
    Xorshift rnd;
    rnd.seed(unpredictableSeed);
    const amount = rows * cols;
    return Matrix!T(rows, cols,
        generate(() => uniform(0, max, rnd)).take(amount).array);
}

void main()
{
    auto m1 = rndMatrix(10, 2, 3);
    auto m2 = rndMatrix(10, 2, 3);
    auto c = Matrix!int(2, 3);
    c[] = m1[] + m2[];
    writeln(m1[1, 2]);
    writeln(m2[1, 2]);
    writeln(c[1, 2]);
}
----------------

See https://dlang.org/spec/operatoroverloading.html#array-ops for 
a better overview of the required operators, or mir.ndslice for an 
nD implementation.
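(A minimal sketch, not from the post: the "required operator 
overloads" mentioned above could also include an element-wise 
opBinary so that the sum returns a new matrix directly, instead of 
going through alias data this. The type and field names here are 
illustrative only.)

```
struct Mat(T)
{
    int rows, cols;
    T[] data; // flat row-major storage of length rows * cols

    // Element-wise sum producing a freshly allocated result matrix.
    Mat opBinary(string op : "+")(Mat rhs)
    {
        assert(rows == rhs.rows && cols == rhs.cols);
        auto result = Mat(rows, cols, new T[data.length]);
        result.data[] = data[] + rhs.data[]; // single vectorized array op
        return result;
    }
}
```

With that in place, `auto c = m1 + m2;` works for two Mat!int values 
of the same shape.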
Feb 28 2020
bachmeier <no spam.net> writes:
On Friday, 28 February 2020 at 16:51:10 UTC, AB wrote:

 See https://dlang.org/spec/operatoroverloading.html#array-ops 
 for a better overview of the required operators or mir.ndslice 
 for an nD implementation.
Here's an old version of some of the things I've been using:
https://bitbucket.org/bachmeil/dmdgretl/src/67a6c5dbf95f23fa753bfd590bc464147cbdc5cc/inst/gretl/matrix.d#lines-307
It has multidimensional slicing and such.
Feb 28 2020
p.shkadzko <p.shkadzko gmail.com> writes:
On Friday, 28 February 2020 at 16:51:10 UTC, AB wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
[...]
 Your example with a minimal 2D array:
 ------------
 module test2;

 import std.random : Xorshift, unpredictableSeed, uniform;
 import std.range : generate, take, chunks;
 import std.array : array;
 import std.stdio : writeln;

 struct Matrix(T)
 {
     int rows;
     T[] data;
     alias data this;

     int cols() { return cast(int) data.length / rows; }
     this(int r, int c) { data = new T[r * c]; rows = r; }
     this(int r, int c, T[] d) { assert(r * c == d.length); data = d; rows = r; }
     auto opIndex(int r, int c) { return data[rows * c + r]; }
 }
Can you please explain what is the purpose of "alias data this" 
in your Matrix struct? As I remember, "alias <member> this" is 
used for implicit type conversions, but I don't see where "data" 
is converted.
Feb 29 2020
AB <a a.a> writes:
On Saturday, 29 February 2020 at 19:04:12 UTC, p.shkadzko wrote:
 On Friday, 28 February 2020 at 16:51:10 UTC, AB wrote:
 On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko 
 wrote:
[...]
 Your example with a minimal 2D array:

 [...]

 Can you please explain what is the purpose of "alias data this" 
 in your Matrix struct? As I remember, "alias <member> this" is 
 used for implicit type conversions, but I don't see where "data" 
 is converted.
Without "alias data this" the call c[] = m1[] + m2[]; should be written as c.data[] = m1.data[] + m2.data[]; With "alias data this" if some members of Matrix are not defined, but they are available for Matrix.data, the members of Matrix.data will be used. It is more about names lookup rules than type conversions.
Mar 01 2020