digitalmars.D - OpSlice for Ndim arrays?

Norbert Nemec (52/52) Mar 10 2006 Hi there,

Norbert Nemec (1/1) Mar 10 2006 Sorry the duplicate posting... :-(
S. Chancellor (9/78) Mar 10 2006 It would be nice if the .. operator just returned a variable of a

Oskar Linde (94/107) Mar 11 2006 The problem with vararg functions is that the argument types are resolve...

Ivan Senji (4/37) Mar 11 2006 I kind of like the idea, but not sure if it is maybe to late for

Oskar Linde (5/46) Mar 11 2006 There is fortunately an obvious fallback:

Norbert Nemec (2/138) Mar 12 2006

Norbert Nemec <Norbert_member pathlink.com> writes:

Hi there,

after two years of silence, I just found a little time to return my thoughts to
D and the issues of NDim-arrays I started back then. (No, I never completely
forgot about D, I just had other things that I had to give priority.)

My design proposal for NDim arrays needs a complete overhaul before discussion
of details makes sense. However, there is one cucial issue that needs to be
solved and I have no idea how it could be done:

Currently we have OpIndex and OpIndexAssign fit for overloading multidimensional
indexing. OpSlice and OpSliceAssign, however, work only for one dimension. I've
been thinking how to generalize these, but it turns out to be extremely tricky.
First though is: Why not simply extend OpSlice to take N pairs of boundaries?
But then, what about mixed expressions like 'a[1,4..7]' ?

Working with Python, I use expressions like that all the time, and I think it is
crucial that Ndim-arrays in D allow this kind of slicing as well. For native
Ndim arrays, it should be straightforward to handle such expressions in the
compiler. However, how should they be overloaded?!?

Python goes the way of taking a tuple of objects as indices where '4..7' would
then be translated into a "slice object". The indexing-overloading routine then
goes through all the objects at run-time and decides whether each one is a
slicing or an indexing operation. All of this is rather simple with dynamic
typing and accepting the run-time overhead.

In D, this is not an option. Indexing/Slicing has to be type-safe and the
resulting type has to be known at compile time.

One rather clumsy solution that I came up so far is to split the resolve the
expression into two consecutive function calls:
a[1,4..7]
would become
a.OpIndexPartial(0,1).OpSlice(4,7)
where the first function is defined as
OpIndexPartial(int dim,int idx)
and returns an array one rank lower.

The assignment expression:
a[1,4..7] = b[0..3]
would turn into
a.OpIndexPartial(0,1).OpSliceAssign(b[0..3],4,7)

The ugly thing about this solution is, that some kind of marshal-object is
needed which is returned from OpIndexAssign.


Another possibility would be to change
OpSlice(size_t start, size_t end)
into
OpIndex(size_t[2] slc)
and interpret a complex expression like
a[1,3..5,2,6..3]
as a call to
OpIndex(size_t idx0, size_t[2] slc1, size_t idx2, size_t[2] slc3)
I don't know, however, whether D templates would allow to implement the long
list of routines in some comfortable way.

All the really clean solution that I could think of would demand for tuple-types
and list-handling at compile time, both things that have been discussed before
but are at best a topic for the very far future.

Any ideas?

Norbert Nemec

Mar 10 2006

Norbert Nemec <Norbert_member pathlink.com> writes:

Sorry the duplicate posting... :-(

Mar 10 2006

S. Chancellor <dnewsgr mephit.kicks-ass.org> writes:

It would be nice if the .. operator just returned a variable of a 
sequence type template.  That would make it more useful than the 
current limitations of being in a [] operator.  Lots of other language 
features could take advantage of sequence types.  Then:

" But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't 
have alot of baring, it would just be a vararg function, one element 
would be an int, while another would be a sequence.

-S.


On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com> said:

 Hi there,
 
 after two years of silence, I just found a little time to return my thoughts to
 D and the issues of NDim-arrays I started back then. (No, I never completely
 forgot about D, I just had other things that I had to give priority.)
 
 My design proposal for NDim arrays needs a complete overhaul before discussion
 of details makes sense. However, there is one cucial issue that needs to be
 solved and I have no idea how it could be done:
 
 Currently we have OpIndex and OpIndexAssign fit for overloading 
 multidimensional
 indexing. OpSlice and OpSliceAssign, however, work only for one dimension. I've
 been thinking how to generalize these, but it turns out to be extremely tricky.
 First though is: Why not simply extend OpSlice to take N pairs of boundaries?
 But then, what about mixed expressions like 'a[1,4..7]' ?
 
 Working with Python, I use expressions like that all the time, and I 
 think it is
 crucial that Ndim-arrays in D allow this kind of slicing as well. For native
 Ndim arrays, it should be straightforward to handle such expressions in the
 compiler. However, how should they be overloaded?!?
 
 Python goes the way of taking a tuple of objects as indices where '4..7' would
 then be translated into a "slice object". The indexing-overloading routine then
 goes through all the objects at run-time and decides whether each one is a
 slicing or an indexing operation. All of this is rather simple with dynamic
 typing and accepting the run-time overhead.
 
 In D, this is not an option. Indexing/Slicing has to be type-safe and the
 resulting type has to be known at compile time.
 
 One rather clumsy solution that I came up so far is to split the resolve the
 expression into two consecutive function calls:
 a[1,4..7]
 would become
 a.OpIndexPartial(0,1).OpSlice(4,7)
 where the first function is defined as
 OpIndexPartial(int dim,int idx)
 and returns an array one rank lower.
 
 The assignment expression:
 a[1,4..7] = b[0..3]
 would turn into
 a.OpIndexPartial(0,1).OpSliceAssign(b[0..3],4,7)
 
 The ugly thing about this solution is, that some kind of marshal-object is
 needed which is returned from OpIndexAssign.
 
 
 Another possibility would be to change
 OpSlice(size_t start, size_t end)
 into
 OpIndex(size_t[2] slc)
 and interpret a complex expression like
 a[1,3..5,2,6..3]
 as a call to
 OpIndex(size_t idx0, size_t[2] slc1, size_t idx2, size_t[2] slc3)
 I don't know, however, whether D templates would allow to implement the long
 list of routines in some comfortable way.
 
 All the really clean solution that I could think of would demand for 
 tuple-types
 and list-handling at compile time, both things that have been discussed before
 but are at best a topic for the very far future.
 
 Any ideas?
 
 Norbert Nemec

Mar 10 2006

Oskar Linde <olREM OVEnada.kth.se> writes:

S. Chancellor wrote:

 It would be nice if the .. operator just returned a variable of a
 sequence type template.  That would make it more useful than the
 current limitations of being in a [] operator.  Lots of other language
 features could take advantage of sequence types.  Then:
 
 " But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't
 have alot of baring, it would just be a vararg function, one element
 would be an int, while another would be a sequence.

The problem with vararg functions is that the argument types are resolved at
runtime. This is what Norbert calls "not an option". First, the only way to
make this fast is to hope that the indexing function gets inlined, its
loops unrolled and the type information constantly folded in.
Multidimensional arrays are often used where speed matters, so indexing
should be as fast as possible. But this is not the main problem. The
resulting type of an indexing expression depends on whether dimensions are
indexed or sliced. A three dimensional array a, indexed like a[0..1,0,0..5]
should result in a two dimensional array (sliced dimensions remain, indexed
dimensions are removed). The return type has to be computed at compile
time.

I was going to post this at this moment, but reading Norberts post again
gave me an idea I had to test...:

 On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com>
 said: 
 All the really clean solution that I could think of would demand for
 tuple-types and list-handling at compile time, both things that have been
 discussed before but are at best a topic for the very far future.


It turns out this is not at all far away. Compile time varadic function
arguments are available with the IFTI support in DMD 0.149. That means
today!

(Allright, not unlimited number of varadic function arguments, but a
reasonable number of arguments at least.)

Here is a demo:

---
import std.stdio;

struct Empty {}

template TupleType(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F
= Empty, G = Empty /*,...*/) {
  static if (is(A == Empty))
    alias Empty TupleType;
  else
    alias List!(A,.TupleType!(B,C,D,E,F,G/*,...*/)) TupleType;
}

template Tuple(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F =
Empty, G = Empty /*,...*/) {
  TupleType!(A,B,C,D,E,F,G /*,...*/) Tuple(A a = A.init, B b = B.init, C c =
C.init, D d = D.init, E e = E.in\
it, F f = F.init, G g = G.init /*,...*/) {
    static if (is(A == Empty))
      return Empty.init;
    else
      return List!(A,TupleType!(B,C,D,E,F,G/*,...*/)
(a,.Tuple(b,c,d,e,f,g /*,...*/));
  }
}

struct List(A,B) {
  A head;
  B tail;
  static List opCall(A a, B b) {
    List!(A,B) ret;
    ret.head = a;
    ret.tail = b;
    return ret;
  }
}

template func(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F =
Empty, G = Empty /*, ...*/) {
    void func(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e =
E.init, F f = F.init, G g = G.ini\
t /*,...*/) {
      realFunc(Tuple(a,b,c,d,e,f,g /*,...*/));
  }
}

template realFunc(T) {
  void realFunc(T args) {
    static if (is(T == Empty)) {
      writefln("Done!");
    } else {
      writefln("Got argument: (%s) %s",typeid(typeof(args.head)),args.head);
      .realFunc(args.tail);
    }
  }
}

void main() {
  int a = 1;
  float b = 5.6;
  long c = 3;
  char[] d = "test";
  func(a,b,c,d);
}

---

And guess what? This prints:

Got argument: (int) 1
Got argument: (float) 5.6
Got argument: (long) 3
Got argument: (char[]) test
Done!

All types are deduced at compile time!

Notice the neat implementation of realFunc(). All other templates could be
put in a library. I bet the func()-wrapper could declared by a mixin!

So all we need to support multidimensional splice/indexing is to make ..
evaluate into:

struct Sequence {
        size_t start,end;
}

and we are set.

/Oskar

Mar 11 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

Oskar Linde wrote:

...

 
 void main() {
   int a = 1;
   float b = 5.6;
   long c = 3;
   char[] d = "test";
   func(a,b,c,d);
 }
 
 ---
 
 And guess what? This prints:
 
 Got argument: (int) 1
 Got argument: (float) 5.6
 Got argument: (long) 3
 Got argument: (char[]) test
 Done!
 
 All types are deduced at compile time!
 
 Notice the neat implementation of realFunc(). All other templates could be
 put in a library. I bet the func()-wrapper could declared by a mixin!
 
 So all we need to support multidimensional splice/indexing is to make ..
 evaluate into:
 
 struct Sequence {
         size_t start,end;
 }
 
 and we are set.
 


I kind of like the idea, but not sure if it is maybe to late for 
something like that, and not sure if it will cause any problems.

Mar 11 2006

Oskar Linde <olREM OVEnada.kth.se> writes:

Ivan Senji wrote:

 Oskar Linde wrote:
 
 ...
 
 
 void main() {
   int a = 1;
   float b = 5.6;
   long c = 3;
   char[] d = "test";
   func(a,b,c,d);
 }
 
 ---
 
 And guess what? This prints:
 
 Got argument: (int) 1
 Got argument: (float) 5.6
 Got argument: (long) 3
 Got argument: (char[]) test
 Done!
 
 All types are deduced at compile time!
 
 Notice the neat implementation of realFunc(). All other templates could
 be put in a library. I bet the func()-wrapper could declared by a mixin!
 
 So all we need to support multidimensional splice/indexing is to make ..
 evaluate into:
 
 struct Sequence {
         size_t start,end;
 }
 
 and we are set.
 

 
 
 I kind of like the idea, but not sure if it is maybe to late for
 something like that, and not sure if it will cause any problems.

There is fortunately an obvious fallback:

If the class has no opIndex(Sequence) overload, a depreciable
opSplice(s.start,s.end) gets called.

/Oskar

Mar 11 2006

Norbert Nemec <Norbert Nemec-online.de> writes:

Great! This might really be the way to go. Thanks a lot for the idea.



Oskar Linde wrote:
 S. Chancellor wrote:
 
 
It would be nice if the .. operator just returned a variable of a
sequence type template.  That would make it more useful than the
current limitations of being in a [] operator.  Lots of other language
features could take advantage of sequence types.  Then:

" But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't
have alot of baring, it would just be a vararg function, one element
would be an int, while another would be a sequence.

 
 
 The problem with vararg functions is that the argument types are resolved at
 runtime. This is what Norbert calls "not an option". First, the only way to
 make this fast is to hope that the indexing function gets inlined, its
 loops unrolled and the type information constantly folded in.
 Multidimensional arrays are often used where speed matters, so indexing
 should be as fast as possible. But this is not the main problem. The
 resulting type of an indexing expression depends on whether dimensions are
 indexed or sliced. A three dimensional array a, indexed like a[0..1,0,0..5]
 should result in a two dimensional array (sliced dimensions remain, indexed
 dimensions are removed). The return type has to be computed at compile
 time.
 
 I was going to post this at this moment, but reading Norberts post again
 gave me an idea I had to test...:
 
 
On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com>
said: 

All the really clean solution that I could think of would demand for
tuple-types and list-handling at compile time, both things that have been
discussed before but are at best a topic for the very far future.


 
 
 It turns out this is not at all far away. Compile time varadic function
 arguments are available with the IFTI support in DMD 0.149. That means
 today!
 
 (Allright, not unlimited number of varadic function arguments, but a
 reasonable number of arguments at least.)
 
 Here is a demo:
 
 ---
 import std.stdio;
 
 struct Empty {}
 
 template TupleType(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F
 = Empty, G = Empty /*,...*/) {
   static if (is(A == Empty))
     alias Empty TupleType;
   else
     alias List!(A,.TupleType!(B,C,D,E,F,G/*,...*/)) TupleType;
 }
 
 template Tuple(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F =
 Empty, G = Empty /*,...*/) {
   TupleType!(A,B,C,D,E,F,G /*,...*/) Tuple(A a = A.init, B b = B.init, C c =
 C.init, D d = D.init, E e = E.in\
 it, F f = F.init, G g = G.init /*,...*/) {
     static if (is(A == Empty))
       return Empty.init;
     else
       return List!(A,TupleType!(B,C,D,E,F,G/*,...*/)
 (a,.Tuple(b,c,d,e,f,g /*,...*/));
   }
 }
 
 struct List(A,B) {
   A head;
   B tail;
   static List opCall(A a, B b) {
     List!(A,B) ret;
     ret.head = a;
     ret.tail = b;
     return ret;
   }
 }
 
 template func(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F =
 Empty, G = Empty /*, ...*/) {
     void func(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e =
 E.init, F f = F.init, G g = G.ini\
 t /*,...*/) {
       realFunc(Tuple(a,b,c,d,e,f,g /*,...*/));
   }
 }
 
 template realFunc(T) {
   void realFunc(T args) {
     static if (is(T == Empty)) {
       writefln("Done!");
     } else {
       writefln("Got argument: (%s) %s",typeid(typeof(args.head)),args.head);
       .realFunc(args.tail);
     }
   }
 }
 
 void main() {
   int a = 1;
   float b = 5.6;
   long c = 3;
   char[] d = "test";
   func(a,b,c,d);
 }
 
 ---
 
 And guess what? This prints:
 
 Got argument: (int) 1
 Got argument: (float) 5.6
 Got argument: (long) 3
 Got argument: (char[]) test
 Done!
 
 All types are deduced at compile time!
 
 Notice the neat implementation of realFunc(). All other templates could be
 put in a library. I bet the func()-wrapper could declared by a mixin!
 
 So all we need to support multidimensional splice/indexing is to make ..
 evaluate into:
 
 struct Sequence {
         size_t start,end;
 }
 
 and we are set.
 
 /Oskar

Mar 12 2006

D Programming

C/C++ Programming

Other

digitalmars.D - OpSlice for Ndim arrays?