digitalmars.D.announce - Multidimensional Arrays
- Oskar Linde (Sep 08 2006)
- Don Clugston (Sep 10 2006)
- Bill Baxter (Feb 16 2007)
- Neal Becker (Feb 17 2007)
- Bill Baxter (Feb 18 2007)
- Oskar Linde (Mar 16 2007)
- Bill Baxter (Mar 16 2007)
I have been implementing a strided multidimensional array type and am
interested in comments. It is basically designed after Norbert Nemec's
suggestion:

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html

+/- some things.

The basic array type looks like this:

    struct Array(T, int dims = 1) {
        T* ptr;
        size_t[dims] range;
        ptrdiff_t[dims] stride;
        const dimensions = dims;
        alias T ElementType;
        ...
    }

Usage:

Initialization:

    // Create a 3x3 array
    auto A = Array!(int,2)(3,3);
    // This could use a nicer syntax
    A.assign(5,6,7,
             3,1,9,
             8,8,1);
    writefln("A = %s", A.toString);
    /* A = (3x3)
       5 6 7
       3 1 9
       8 8 1 */

    auto B = Array!(int,2)(3,3);
    B.assign(0,0,1,
             0,2,1,
             7,8,9);

Element-wise operators:

    writefln("A*B = %s", (A*B).toString);

Scalar operators:

    writefln("5*B = %s", (5*B).toString);

As in Norbert's proposal:

C.transpose returns a transposed array view (all dimensions reversed).
C.transpose(dimA,dimB) swaps dimensions dimA and dimB.
C.diag returns an N-1 dimensional view of the diagonal of the matrix.

E.g. a 5x5 identity matrix:

    // zeros is just a convenience wrapper that generates a
    // zero-initialized array
    auto A = zeros!(int)(5,5);
    A.diag[] = 1;
    writefln("A = %s", A.toString);

    A = (5x5)
    1 0 0 0 0
    0 1 0 0 0
    0 0 1 0 0
    0 0 0 1 0
    0 0 0 0 1

C.reverse(dim) reverses the given dimension (inverts the stride).

Indexing and slicing:

    auto D = Array!(int,3)(7,7,7);  // a 7x7x7 array

    D[1,2,3]                                => int
    D[1,2,all]                              => 7-element slice
    D[1,all,all]                            => 7x7 slice
    D[range(1,end-1), all, range(1,end-1)]  => 5x7x5 subarray
    D[end-1, range(end-3,end-1), all]       => 2x7 slice

end works the same way $ does for regular arrays. range(a,b) is the
replacement for a..b (D doesn't support mixed indexing/slicing
otherwise).

Slices/subarrays are of course views that share the same data:

    auto M = zeros!(int)(4,4);
    M[range(0,2), range(0,2)][] = 1;
    M[range(2,end), range(2,end)][] = 1;
    writefln("M = %s", M.toString);

    M = (4x4)
    1 1 0 0
    1 1 0 0
    0 0 1 1
    0 0 1 1

.dup returns a compact copy of the array.

In-place map over a function:

    M.doMap((int y){return 1<<y;});

Functional map:

    writefln("cos(M) = %s", M.map(&cos).toString);

    cos(M) = (4x4)
    0.5403 0.5403 1.0000 1.0000
    0.5403 0.5403 1.0000 1.0000
    1.0000 1.0000 0.5403 0.5403
    1.0000 1.0000 0.5403 0.5403

Foreach iteration:

    auto N = zeros!(int)(5,5);
    int i = 0;
    foreach(inout x; N)
        x = ++i;

Of course, custom types are supported. Some examples just for fun: :)

    auto Q = ones!(Rational!(long))(5,5) * 3;
    writefln("Q/N = %s", (Q/N).toString);

    Q/N = (5x5)
    3    3/2  1    3/4  3/5
    1/2  3/7  3/8  1/3  3/10
    3/11 1/4  3/13 3/14 1/5
    3/16 3/17 1/6  3/19 3/20
    1/7  3/22 3/23 1/8  3/25

    auto U = zeros!(Uncertain!(double))(4,4);
    double val = -8;
    foreach(inout T; U) {
        // +- 10% uncertainty + 0.1
        T += Uncertain!(double)(val, abs(0.1*val)+0.1);
        val += 1;
    }
    writefln("U = %s", U.toString);

    U = (4x4)
    (-8.0 ± 0.9) (-7.0 ± 0.8) (-6.0 ± 0.7) (-5.0 ± 0.6)
    (-4.0 ± 0.5) (-3.0 ± 0.4) (-2.0 ± 0.3) (-1.0 ± 0.2)
    (0.0 ± 0.1)  (1.0 ± 0.2)  (2.0 ± 0.3)  (3.0 ± 0.4)
    (4.0 ± 0.5)  (5.0 ± 0.6)  (6.0 ± 0.7)  (7.0 ± 0.8)

    writefln("U/U = %s", (U/U).toString);

    U/U = (4x4)
    (1.0 ± 0.2) (1.0 ± 0.2) (1.0 ± 0.2) (1.0 ± 0.2)
    (1.0 ± 0.3) (1.0 ± 0.3) (1.0 ± 0.3) (1.1 ± 0.4)
    (nan ± inf) (1.1 ± 0.4) (1.0 ± 0.3) (1.0 ± 0.3)
    (1.0 ± 0.3) (1.0 ± 0.2) (1.0 ± 0.2) (1.0 ± 0.2)

    auto V = zeros!(Voltage)(4,4);
    val = 0;
    foreach(inout v; V) {
        v = val * volt;
        val += 0.1;
    }
    auto R = zeros!(Resistance)(4,4);
    val = 100;
    foreach(inout r; R) {
        r = val * ohm;
        val *= 2;
    }
    writefln("V = %s", V.toString);
    writefln("R = %s", R.toString);
    writefln("V/R = %s", (V/R).toString);

    V = (4x4)
    0 V    100 mV 200 mV   300 mV
    400 mV 500 mV 600 mV   700 mV
    800 mV 900 mV 1e+03 mV 1.1 V
    1.2 V  1.3 V  1.4 V    1.5 V

    R = (4x4)
    100 Ω   200 Ω   400 Ω   800 Ω
    1.6 kΩ  3.2 kΩ  6.4 kΩ  12.8 kΩ
    25.6 kΩ 51.2 kΩ 102 kΩ  205 kΩ
    410 kΩ  819 kΩ  1.64 MΩ 3.28 MΩ

    V/R = (4x4)
    0 A     500 µA  500 µA  375 µA
    250 µA  156 µA  93.7 µA 54.7 µA
    31.2 µA 17.6 µA 9.77 µA 5.37 µA
    2.93 µA 1.59 µA 854 nA  458 nA

Comments:

The only tricky part about all this was getting opIndex working with
compile-time variadic arguments. I have not yet gotten DMD to inline
the generated indexing functions, but it should be doable as they are
all trivial. I have not been able to pursue this due to a DMD inlining
bug that in some cases generates "xx is not an lvalue" when compiling
with -inline.

The main issue that remains is the op_r operators, which have to work a
bit differently from the non-_r operators (*).

I see this all more as a proof of concept than anything else. If anyone
is interested in the sources, I will find a way to put them up
somewhere.

Future extensions would be writing Matrix types and possibly hooking
them up with a BLAS library. Does anyone know of any BLAS
implementations that support 80-bit reals?

Other things I want to look at are adding stride support to the range()
type and investigating the possibility of generating expression
templates.

*) Currently, if you have an:

    opAdd(T)(T x) {...}

you can't also have an:

    opAdd_r(T)(T x) {...}

since this will create ambiguous overloads. I find myself wanting to
define opAdd_r for all cases where there is no regular non-_r operator,
but there is (AFAIK) no way to do that.

I would go as far as suggesting a change to D: if both a matching opX
and a matching opX_r overload exist, the opX overload is chosen. This
would demote the opX_r overloads a bit and is consistent with the way
commutative operators work, where if for example no a.opAdd(b) exists,
b.opAdd(a) is called.

/Oskar
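P.S. To make the footnote concrete, here is a minimal sketch of the
collision (hypothetical code, not from the array module itself):

    struct Vec
    {
        int[] data;

        // element-wise Vec + x, for any x
        Vec opAdd(T)(T x) { /* ... */ return *this; }

        // x + Vec, for any x
        Vec opAdd_r(T)(T x) { /* ... */ return *this; }
    }

    // Vec a, b;
    // a + 5;   // a.opAdd!(int)   -- fine
    // 5 + a;   // a.opAdd_r!(int) -- fine
    // a + b;   // a.opAdd!(Vec) and b.opAdd_r!(Vec) both match
    //          // => ambiguous, hence the suggested rule above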
Sep 08 2006
Oskar Linde wrote:
> I have been implementing a strided multidimensional array type and am
> interested in comments.

[snip]

This looks really promising -- that improved IFTI support in the last
release has clearly opened a lot of possibilities. For now, I'll just
comment on one thing.

> Future extensions would be writing Matrix types and possibly hooking
> them up with a BLAS library. Does anyone know of any BLAS
> implementations that support 80-bit reals?

This is a really interesting issue. Personally, I don't know of any
implementations, but I haven't really looked. My first reaction was,
"you'd always be using doubles or floats in matrices, so you'd use
SIMD". But even with 64-bit matrices, 80-bit temporary results can
still be useful, especially for operations on the diagonal elements of
matrices. Basically, all vector parameters would still be 64-bit
doubles, but scalar parameters and returned results would be 80-bit.

And then I got thinking -- it's not clear to me that SIMD would be very
beneficial when dealing with strided matrices, unless you're extremely
clever (and this is probably why the AMD BLAS libraries have
hand-crafted asm even for the BLAS3 functions; you don't get much
benefit otherwise). Consequently, I strongly suspect that you'd always
want to treat strided arrays differently from built-in arrays (i.e.,
stride == 1). I don't know to what extent expression templates could
keep track of whether an array has a stride or not, but there'd be a
massive performance boost if you could.

-Don
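P.S. By "treat strided arrays differently" I mean dispatching on
contiguity, something like this rough sketch (hypothetical function,
not from any real BLAS binding):

    void scale(T)(T* p, size_t n, ptrdiff_t stride, T factor)
    {
        if (stride == 1)
        {
            // contiguous: a tight loop that SIMD/BLAS code can chew on
            for (size_t i = 0; i < n; i++)
                p[i] *= factor;
        }
        else
        {
            // general strided fallback
            for (size_t i = 0; i < n; i++)
                p[cast(ptrdiff_t)i * stride] *= factor;
        }
    }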
Sep 10 2006
Oskar Linde wrote:
> I have been implementing a strided multidimensional array type and am
> interested in comments. It is basically designed after Norbert Nemec's
> suggestion:
> http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
> +/- some things.
>
> The basic array type looks like this:
>
>     struct Array(T, int dims = 1) {
>         T* ptr;
>         size_t[dims] range;
>         ptrdiff_t[dims] stride;
>         const dimensions = dims;
>         alias T ElementType;
>         ...
>     }

Hey Oskar,

I've been working on implementing something similar. I'd be interested
to see some parts of your implementation. Mine's at
http://www.dsource.org/projects/multiarray

The main difference between our approaches seems to be that I've made
'dims' non-const (and therefore not a template parameter). I thought
that would be better, since sometimes it's handy to reshape a 1-d array
into an equivalent 2-d Nx1 or 3-d Nx1x1. And you have the potential to
do broadcasting more easily that way too (e.g. multiply an N-element
1-d array by an Nx5 array just by duplicating the one axis implicitly).
The down side is that range and stride then become dynamic arrays. I
guess with your approach you can create a new view with the right
number of dims when needed. On the other hand, that probably means you
need more templated member functions, parameterized by Array dimension.

Another tradeoff is that I can also have indexing and slicing return
arrays of different dimension, but I can't have indexing return a
single element (because the return type of opIndex is Array, not float,
and you can't have two different return types). Instead I overloaded
opCall for that. So M(4,3) is the value of element 4,3, but M[4,3] is a
1-element view.

I also made it a class in the end, though I started out with it being a
struct. Partly that decision was motivated by the idea that generally I
don't really want to pass 6 or more 32-bit numbers around by value,
plus the presence of pointer member data, so using a class seemed
reasonable. Having to remember to say 'new' everywhere has been a pain,
though.

I have mine hooked up with some basic BLAS routines -- the routines for
VV, MV, and MM multiplication, and the simplest least squares and
linear system solvers. I don't know of any 80-bit capable BLAS, but
really single precision is good enough for most of what I do. Double is
mostly overkill, but I use it to be sure.

--bb
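P.S. In code, the opCall/opIndex split looks roughly like this (a
simplified sketch, not the actual multiarray source):

    class MultiArray(T)
    {
        T* ptr;
        size_t[] range;       // shape known only at runtime
        ptrdiff_t[] stride;

        // M(i,j): full indexing returns the element itself
        T opCall(size_t i, size_t j)
        {
            return ptr[cast(ptrdiff_t)(i*stride[0] + j*stride[1])];
        }

        // M[i,j]: opIndex must always return the same type, so it
        // returns a 1-element view instead of a T
        MultiArray opIndex(size_t i, size_t j)
        {
            auto v = new MultiArray;
            v.ptr = ptr + i*stride[0] + j*stride[1];
            v.range = new size_t[1];
            v.range[0] = 1;
            v.stride = new ptrdiff_t[1];
            v.stride[0] = 1;
            return v;
        }
    }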
Feb 16 2007
I think it makes a lot of sense to build arrays from 2 components. One component is the storage. The second component is the array, which is an interpretation of the storage. I think this simplifies slicing and projections.
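In code, that separation might look something like this (a rough sketch
with made-up names, just to illustrate the two components):

    // the storage: a plain block of memory
    class Storage(T)
    {
        T[] data;
        this(size_t n) { data = new T[n]; }
    }

    // the array: an interpretation (shape + layout) of that storage
    struct View(T)
    {
        Storage!(T) store;
        size_t offset;
        size_t[2] range;
        ptrdiff_t[2] stride;

        T get(size_t i, size_t j)
        {
            return store.data[offset + i*stride[0] + j*stride[1]];
        }

        // slicing/projection builds a new interpretation of the same
        // storage; no elements are copied
        View row(size_t i)
        {
            View v = *this;
            v.offset += i * stride[0];
            v.range[0] = 1;
            return v;
        }
    }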
Feb 17 2007
Neal Becker wrote:
> I think it makes a lot of sense to build arrays from 2 components. One
> component is the storage. The second component is the array, which is
> an interpretation of the storage. I think this simplifies slicing and
> projections.

You mean exactly how Oskar and I are doing it? Or do you mean a more
flexible notion of storage? I agree it would be nice to be able to use
different storage back ends besides strided memory. Diagonal,
triangular, sparse, block, banded, etc. would all be nice. The Matrix
Template Library did something like that, though I'd say with mixed
results:
http://www.osl.iu.edu/research/mtl/
The high-level idea is nice, but the end result was overly complicated
IMHO, both internally and externally. And it depended critically on
good optimization from the compiler to get decent performance, since
everything was in C++ (no tie-in with optimized Fortran back-ends like
ATLAS).

--bb
Feb 18 2007
Bill Baxter wrote:
> Oskar Linde wrote:
>> [snip]
>
> Hey Oskar,
> I've been working on implementing something similar. I'd be interested
> to see some parts of your implementation. Mine's at
> http://www.dsource.org/projects/multiarray

Hi Bill,

I'm sorry for the late reply. I've had a quite hectic last few months
and have not had time to dig into this. I have now at least extracted
my old multiarray code and turned it into something that compiles (and
converted it to use the new D variadic templates instead of the old
variadic.d hack):

http://www.csc.kth.se/~ol/multiarray.tar.gz

The multiarray module is sci.matrix.multiarray. Everything is in a very
crude shape.

> The main difference between our approaches seems to be that I've made
> 'dims' non-const (and therefore not a template parameter). [...] I
> guess with your approach you can create a new view with the right
> number of dims when needed.

Yes. This is very powerful. Given a 10x10x10 array arr, arr[0,all,all]
will be a 10x10 array, not a 1x10x10 array or anything else. And that
is all deduced at compile time.

> On the other hand, that probably means you need more templated member
> functions, parameterized by Array dimension.

Possibly.

> Another tradeoff is that I can also have indexing and slicing return
> arrays of different dimension, but I can't have indexing return a
> single element [...] So M(4,3) is the value of element 4,3, but M[4,3]
> is a 1-element view.

In addition to being able to have full indexing return a single element
vs partial indexing returning slices, I think the greatest advantage of
having the dimensionality as a compile-time constant is the improved
compile-time consistency checks. I believe you will extremely rarely
dynamically adjust the dimensionality of a multidimensional array. You
are almost always required to know the dimensionality at compile time
anyway.

You can always make slices of different dimensionality. By giving a
dimension a stride of 0, you could even duplicate the data over a
dimension such as [1,1,1,1,... (10000 times) ...,1] while only
occupying a single unit of memory. Reshaping has to be done explicitly
(which I see as a good thing).

In general, I think this falls back on the same conclusion as the
discussion a while ago regarding intrinsic vector operations and small
static arrays being value types: low dimensionalities will most of the
time turn out to be static, while larger dimensions are dynamic. The
dimensionality of a multidimensional array will typically be static and
of a low order (for example, I've personally never encountered a case
needing more than 6 dimensions).

> I also made it a class in the end, though I started out with it being
> a struct. Partly that decision was motivated by the idea that
> generally I don't really want to pass 6 or more 32-bit numbers around
> by value, plus the presence of pointer member data, so using a class
> seemed reasonable. Having to remember to say 'new' everywhere has been
> a pain, though.

The biggest disadvantage I see is that data shuffling (such as in a
loop assigning slices of one array to slices of another) will result in
the creation of lots of temporary heap-allocated class instances
instead of, when using a struct, a small constant stack memory usage.
One option could possibly be to make slicing return proxy structs that
(in the future, implicitly) convert to class instances. Note that the
by-value passing doesn't necessarily mean any reduction in performance,
if you are careful to make functions that don't change the shape of the
array take it as an inout (future const/readonly ref) parameter, while
those that do need a copy would have had to make the same data copy
when dup-ing the class anyway. Another option to avoid passing lots of
data by value could be to make a container struct containing a pointer
to another struct that holds all the data.

> I have mine hooked up with some basic BLAS routines -- the routines
> for VV, MV, and MM multiplication, and the simplest least squares and
> linear system solvers.

That sounds really interesting. I will definitely give it a look.

/Oskar
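P.S. The stride-0 trick as a tiny sketch (illustrative code, not from
the multiarray module):

    struct Repeat(T)
    {
        T* ptr;
        size_t range;     // logical length, e.g. 10000
        ptrdiff_t stride; // 0: every index maps to the same element

        T opIndex(size_t i)
        {
            assert(i < range);
            return ptr[cast(ptrdiff_t)i * stride];
        }
    }

    // int one = 1;
    // Repeat!(int) r;
    // r.ptr = &one; r.range = 10000; r.stride = 0;
    // r[0] .. r[9999] all read the same int:
    // [1,1,1,... (10000 times) ...,1] in one unit of memory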
Mar 16 2007
Oskar Linde wrote:
> Hi Bill,
>
> I'm sorry for the late reply. [...] I have now at least extracted my
> old multiarray code and turned it into something that compiles (and
> converted it to use the new D variadic templates instead of the old
> variadic.d hack):
>
> http://www.csc.kth.se/~ol/multiarray.tar.gz
>
> The multiarray module is sci.matrix.multiarray. Everything is in a
> very crude shape.

Great! Thanks for putting this online.

> Yes. This is very powerful. Given a 10x10x10 array arr, arr[0,all,all]
> will be a 10x10 array, not a 1x10x10 array or anything else. And that
> is all deduced at compile time.

Hmm, how do you accomplish that? I'll take a look at the code, but it
seems like you need a lot of partial specializations for the 0 case.
Any opIndex call that has a zero as one of its arguments needs to have
a special implementation, because it's returning a different type. But
I thought specialization and IFTI were mutually exclusive.

> You can always make slices of different dimensionality. [...]
> Reshaping has to be done explicitly (which I see as a good thing).

What about allowing flexible expanding of dimensions? I'm basically
trying to make a pseudo-clone of numpy (www.numpy.org), and one thing
it does a lot is "broadcasting":

http://www.scipy.org/EricsBroadcastingDoc
http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html

> The dimensionality of a multidimensional array will typically be
> static and of a low order (for example, I've personally never
> encountered a case needing more than 6 dimensions).

Yeh, there is a limit. Numpy is supposed to be dimension agnostic, but
there is a hard-coded MAX_DIMS in the C code that implements the
backend. I think it's set to something like 32.

> The biggest disadvantage I see is that data shuffling (such as in a
> loop assigning slices of one array to slices of another) will result
> in the creation of lots of temporary heap-allocated class instances
> instead of, when using a struct, a small constant stack memory usage.

Yep. But basically, since I was porting from Python code, my thinking
was "Python already does all those allocations, and millions more -- a
D implementation can only be faster" :-) I think the thing that really
got me to switch to class was that there were some IFTI bugs that only
surfaced when using structs and which went away when using a class.
They're fixed now, so maybe it's time to go back to a struct.
*Especially* if we get some new const-ref goodness.

> One option could possibly be to make slicing return proxy structs that
> (in the future, implicitly) convert to class instances. Note that the
> by-value passing doesn't necessarily mean any reduction in
> performance, if you are careful to make functions that don't change
> the shape of the array take it as an inout (future const/readonly ref)
> parameter [...]

Yeh, I hate to make things 'inout' when I really want const-reference
behavior. I'm going to go and rewrite a bunch of stuff as soon as we
get const ref.

Another nice thing about going the class route with current D is that
you can have a function which potentially returns the input unmodified.
For instance, a function that makes sure the input is at least 2-D, and
if so just returns the input as-is, otherwise creates a new view. With
structs, that turns into creating a new view no matter what. On the
other hand, you need those kinds of functions in Numpy a lot because
you tend to write functions that take an "A" where "A" could be a
python array, or a python tuple, or a numpy array of any dimension, or
any other sequence type. Using numpy.asarray_2d(A) at the top of your
functions is in many ways a substitute for not having static typing.

It's hard to say without actually going through the exercise. I don't
have much experience using big flexible array packages in statically
typed languages like C++. My experience is with the array handling in
Matlab and Numpy, both of which are dynamic. Maybe static dimension is
the way to go in a statically typed language.

> Another option to avoid passing lots of data by value could be to make
> a container struct containing a pointer to another struct that holds
> all the data.

Sounds like significantly more complexity for not so much benefit.

--bb
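P.S. The numpy broadcasting rule the links above describe boils down to
something like this (a rough sketch in D, my own illustration, not code
from either library):

    size_t[] broadcastShape(size_t[] a, size_t[] b)
    {
        // align shapes at the trailing dimension; missing leading
        // dimensions count as size 1
        size_t n = a.length > b.length ? a.length : b.length;
        size_t[] r = new size_t[n];
        for (size_t i = 0; i < n; i++)
        {
            size_t da = i < n - a.length ? 1 : a[i - (n - a.length)];
            size_t db = i < n - b.length ? 1 : b[i - (n - b.length)];
            // sizes must match, or one side must be 1 (that axis is
            // then repeated, e.g. via a 0 stride)
            assert(da == db || da == 1 || db == 1);
            r[i] = da > db ? da : db;
        }
        return r;
    }

    // broadcastShape([5], [3,5])   -> [3,5]
    // broadcastShape([3,1], [3,5]) -> [3,5]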
Mar 16 2007