www.digitalmars.com         C & C++   DMDScript  

D - Dynamic rectangular arrays?

reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Hi there,

As far as I can see, there is no way to have rectangular arrays if you don't
know their size at compile time? This definitely is a limitation! Of
course, there is some overhead, if you have to store the shape of an array
and calculate column offsets at run-time, but it should still be much more
efficient than to use arrays of pointers like in C.

My proposal, how this could be achived would be an extension of array
pointers to something like:

struct array3d(T) {
        T *ptr;
        size_t range0;
        size_t stride0;
        size_t range1;
        size_t stride1;
        size_t range2;
        size_t stride2;
};

Here you would have as many rang/stride pair as the array has dimensions. If
the compiler knows stride to be 1, it can optimize it away, reducing
everything to the current 1d arrays.

The range arguments are only needed for range checking. Element access only
needs access to the ptr and the stride arguments.

Additional benefits:
* slicing possible not only to blocks but also to arbitrary sub-grids.
* simple implementation of fortran-arrays and even reverse memory layouts

Of course, all of this could be done in the library, but then - since we
already have arrays as language elements, it would make sense to make
truely multidimensional arrays part of the language as well.

As for the syntax: I would propose to use A[a,b] as way to express
rectangular multidimensional arrays and leave A[a][b] for ragged arrays as
we have them now. Of course, the compiler could still optimize ragged
arrays with fixed size, but that would be another matter.

Array pointers, of course have to have a dimension known at compile time. If
their range is dynamic, the type would, of course, be something like
        int[,]

(This will, of course, look ugly for 17-dimensional arrays, but who needs
these?)

Ciao,
Nobbi
Apr 24 2004
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c6d8br$1un2$1 digitaldaemon.com...
 As far as I can see, there is no way to have rectangular arrays if you
don't
 know their size at compile time?
That's correct.
 This definitely is a limitation! Of
 course, there is some overhead, if you have to store the shape of an array
 and calculate column offsets at run-time, but it should still be much more
 efficient than to use arrays of pointers like in C.

 My proposal, how this could be achived would be an extension of array
 pointers to something like:

 struct array3d(T) {
         T *ptr;
         size_t range0;
         size_t stride0;
         size_t range1;
         size_t stride1;
         size_t range2;
         size_t stride2;
 };

 Here you would have as many rang/stride pair as the array has dimensions.
If
 the compiler knows stride to be 1, it can optimize it away, reducing
 everything to the current 1d arrays.

 The range arguments are only needed for range checking. Element access
only
 needs access to the ptr and the stride arguments.

 Additional benefits:
 * slicing possible not only to blocks but also to arbitrary sub-grids.
 * simple implementation of fortran-arrays and even reverse memory layouts

 Of course, all of this could be done in the library, but then - since we
 already have arrays as language elements, it would make sense to make
 truely multidimensional arrays part of the language as well.

 As for the syntax: I would propose to use A[a,b] as way to express
 rectangular multidimensional arrays and leave A[a][b] for ragged arrays as
 we have them now. Of course, the compiler could still optimize ragged
 arrays with fixed size, but that would be another matter.

 Array pointers, of course have to have a dimension known at compile time.
If
 their range is dynamic, the type would, of course, be something like
         int[,]

 (This will, of course, look ugly for 17-dimensional arrays, but who needs
 these?)
Your idea is good, and will work, but at the moment you can achieve nearly the same thing by declaring a class with [] operator overloads.
Apr 24 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve nearly
 the same thing by declaring a class with [] operator overloads.
I'm trying just this right now. Let's see what come from it.
Apr 24 2004
next sibling parent reply J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

Walter wrote:
  

Your idea is good, and will work, but at the moment you can achieve nearly
the same thing by declaring a class with [] operator overloads.
    
I'm trying just this right now. Let's see what come from it.
Could be something that goes into DTL... -- -Anderson: http://badmama.com.au/~anderson/
Apr 24 2004
parent "Matthew" <matthew.hat stlsoft.dot.org> writes:
"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:c6ddcb$2blh$1 digitaldaemon.com...
 Norbert Nemec wrote:

Walter wrote:


Your idea is good, and will work, but at the moment you can achieve nearly
the same thing by declaring a class with [] operator overloads.
I'm trying just this right now. Let's see what come from it.
Could be something that goes into DTL...
I think that should be pretty straightforward. I have two efficient (space/speed) and expressive rectangular array templates in STLSoft (http://stlsoft.org/) and I see no reason why I can't borrow them into DTL.
Apr 24 2004
prev sibling parent reply Drew McCormack <drewmccormack mac.com> writes:
On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:

 Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve nearly
 the same thing by declaring a class with [] operator overloads.
I'm trying just this right now. Let's see what come from it.
Did you try this Norbert? Any luck? Drew McCormack
Apr 28 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Drew McCormack wrote:

 On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:
 
 Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve
 nearly the same thing by declaring a class with [] operator overloads.
I'm trying just this right now. Let's see what come from it.
Did you try this Norbert? Any luck?
I have something working, but it needs some more cleanup before I can put it up for download with a clear conscience. Maybe I'll find the time for that tonight, otherwise, it will have to wait for next week.
Apr 28 2004
prev sibling parent reply Drew McCormack <drewmccormack mac.com> writes:
On 2004-04-24 10:31:48 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:

 Hi there,
 
 As far as I can see, there is no way to have rectangular arrays if you don't
 know their size at compile time? This definitely is a limitation! Of
 course, there is some overhead, if you have to store the shape of an array
 and calculate column offsets at run-time, but it should still be much more
 efficient than to use arrays of pointers like in C.
As someone coming from scientific programming, where Fortran is still king, I would like to chime in here. D is offering me hope that I may one day be able to leave Fortran behind, so I don't want you guys to mess it up at the last hurdle. Fortran is a pretty rotten language, but if it gets one thing right, it is arrays. Fortran 90 that is. If D basically implements the Fortran 90 approach, leaving the extremely primitive C approach largely behind, I think it could capture the imagination of a lot of scientific developers. And be a much more powerful language to boot. Since I assume most people here are from the C/C++ side of programming, here is some stuff you can do with fortran 90 arrays (in D-type syntax) 1) Create them on the stack, with a size not known at compile time (called automatic arrays) void someFunc( int n ) { float [n] array; } 2) Create them on the heap with a size not known at compile time int n, m; n = 5; m = 10; float [] array = new float[n][m]; 3) Query the size of any dimension float [10][5] array; array.size(1); // this is equal to 10 array.size(2); // this is equla to 5 4) Have lower and upper bounds float [2..10][2..4] array; array.lowerbound(1); // this is 2 array.upperbound(2); // this is 4 5) Slice arbitrary dimensions float [10][5] array; array[2..8][1..4]; // Gives a new rank 2 array 6) Include strides in array slices. These can also be negative. float[10][5] array; array[2..4:2][1..4:3]; // This is rank 2 array with indexes { { [2,1], [2,4] } { [4,1],[4,4] } } array[4..2:-2][1..4:3]; // This is rank 2 array with { { [4,1], [4,4] } { [2,1],[2,4] } } 7) Some powerful functions, like reductions, matrix multiplication, and 'broadcasting' float [10][5] matrix, resultmat; float [5] vector, resultvec; resultvec[] = matrixmult( matrix, vector ); resultmat[][] = broadcast( vector, 1, 10 ); // broadcast expands vector into matrix by duplicating Another way to do broadcasting is used in the python library Numpy: NewAxis. It is particularly elegant. It creates temporary arrays which are broadcasts of other arrays: matrixmultiply( vector1[][newaxis], vector2[] ); // Turns vector into a temporay matrix, and multiplies by vector2 What I don't want, is to have to rely of my own array class and templates. That is just the C++ situation, and leads to all sorts of problems. Better to make your arrays powerful, and just put some more advanced functions (like matrixmultiply, for example) in the library. Drew McCormack
Apr 27 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Drew McCormack wrote:
 4) Have lower and upper bounds
 float [2..10][2..4] array;
 array.lowerbound(1); // this is 2
 array.upperbound(2); // this is 4
I clearly vote against this: it adds to the runtime-overhead. Even if it might be convenient in some cases, it never is necessary and may even make the code harder to read.
 7) Some powerful functions, like reductions, matrix multiplication, and
 'broadcasting'
Here, the question is how much has to be in the language, and how much should be put into the library instead.
 What I don't want, is to have to rely of my own array class and
 templates. That is just the C++ situation, and leads to all sorts of
 problems. Better to make your arrays powerful, and just put some more
 advanced functions (like matrixmultiply, for example) in the library.
Just my position. Arrays are just too basic to leave them to the library.
Apr 27 2004