digitalmars.D - OpSlice for Ndim arrays?
- Norbert Nemec (52/52) Mar 10 2006 Hi there,
- Norbert Nemec (1/1) Mar 10 2006 Sorry the duplicate posting... :-(
- S. Chancellor (9/78) Mar 10 2006 It would be nice if the .. operator just returned a variable of a
- Oskar Linde (94/107) Mar 11 2006 The problem with vararg functions is that the argument types are resolve...
- Ivan Senji (4/37) Mar 11 2006 I kind of like the idea, but not sure if it is maybe to late for
- Oskar Linde (5/46) Mar 11 2006 There is fortunately an obvious fallback:
- Norbert Nemec (2/138) Mar 12 2006
Hi there, after two years of silence, I just found a little time to return my thoughts to D and the issues of NDim-arrays I started back then. (No, I never completely forgot about D, I just had other things that I had to give priority.) My design proposal for NDim arrays needs a complete overhaul before discussion of details makes sense. However, there is one cucial issue that needs to be solved and I have no idea how it could be done: Currently we have OpIndex and OpIndexAssign fit for overloading multidimensional indexing. OpSlice and OpSliceAssign, however, work only for one dimension. I've been thinking how to generalize these, but it turns out to be extremely tricky. First though is: Why not simply extend OpSlice to take N pairs of boundaries? But then, what about mixed expressions like 'a[1,4..7]' ? Working with Python, I use expressions like that all the time, and I think it is crucial that Ndim-arrays in D allow this kind of slicing as well. For native Ndim arrays, it should be straightforward to handle such expressions in the compiler. However, how should they be overloaded?!? Python goes the way of taking a tuple of objects as indices where '4..7' would then be translated into a "slice object". The indexing-overloading routine then goes through all the objects at run-time and decides whether each one is a slicing or an indexing operation. All of this is rather simple with dynamic typing and accepting the run-time overhead. In D, this is not an option. Indexing/Slicing has to be type-safe and the resulting type has to be known at compile time. One rather clumsy solution that I came up so far is to split the resolve the expression into two consecutive function calls: a[1,4..7] would become a.OpIndexPartial(0,1).OpSlice(4,7) where the first function is defined as OpIndexPartial(int dim,int idx) and returns an array one rank lower. The assignment expression: a[1,4..7] = b[0..3] would turn into a.OpIndexPartial(0,1).OpSliceAssign(b[0..3],4,7) The ugly thing about this solution is, that some kind of marshal-object is needed which is returned from OpIndexAssign. Another possibility would be to change OpSlice(size_t start, size_t end) into OpIndex(size_t[2] slc) and interpret a complex expression like a[1,3..5,2,6..3] as a call to OpIndex(size_t idx0, size_t[2] slc1, size_t idx2, size_t[2] slc3) I don't know, however, whether D templates would allow to implement the long list of routines in some comfortable way. All the really clean solution that I could think of would demand for tuple-types and list-handling at compile time, both things that have been discussed before but are at best a topic for the very far future. Any ideas? Norbert Nemec
Mar 10 2006
It would be nice if the .. operator just returned a variable of a sequence type template. That would make it more useful than the current limitations of being in a [] operator. Lots of other language features could take advantage of sequence types. Then: " But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't have alot of baring, it would just be a vararg function, one element would be an int, while another would be a sequence. -S. On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com> said:Hi there, after two years of silence, I just found a little time to return my thoughts to D and the issues of NDim-arrays I started back then. (No, I never completely forgot about D, I just had other things that I had to give priority.) My design proposal for NDim arrays needs a complete overhaul before discussion of details makes sense. However, there is one cucial issue that needs to be solved and I have no idea how it could be done: Currently we have OpIndex and OpIndexAssign fit for overloading multidimensional indexing. OpSlice and OpSliceAssign, however, work only for one dimension. I've been thinking how to generalize these, but it turns out to be extremely tricky. First though is: Why not simply extend OpSlice to take N pairs of boundaries? But then, what about mixed expressions like 'a[1,4..7]' ? Working with Python, I use expressions like that all the time, and I think it is crucial that Ndim-arrays in D allow this kind of slicing as well. For native Ndim arrays, it should be straightforward to handle such expressions in the compiler. However, how should they be overloaded?!? Python goes the way of taking a tuple of objects as indices where '4..7' would then be translated into a "slice object". The indexing-overloading routine then goes through all the objects at run-time and decides whether each one is a slicing or an indexing operation. All of this is rather simple with dynamic typing and accepting the run-time overhead. In D, this is not an option. Indexing/Slicing has to be type-safe and the resulting type has to be known at compile time. One rather clumsy solution that I came up so far is to split the resolve the expression into two consecutive function calls: a[1,4..7] would become a.OpIndexPartial(0,1).OpSlice(4,7) where the first function is defined as OpIndexPartial(int dim,int idx) and returns an array one rank lower. The assignment expression: a[1,4..7] = b[0..3] would turn into a.OpIndexPartial(0,1).OpSliceAssign(b[0..3],4,7) The ugly thing about this solution is, that some kind of marshal-object is needed which is returned from OpIndexAssign. Another possibility would be to change OpSlice(size_t start, size_t end) into OpIndex(size_t[2] slc) and interpret a complex expression like a[1,3..5,2,6..3] as a call to OpIndex(size_t idx0, size_t[2] slc1, size_t idx2, size_t[2] slc3) I don't know, however, whether D templates would allow to implement the long list of routines in some comfortable way. All the really clean solution that I could think of would demand for tuple-types and list-handling at compile time, both things that have been discussed before but are at best a topic for the very far future. Any ideas? Norbert Nemec
Mar 10 2006
S. Chancellor wrote:It would be nice if the .. operator just returned a variable of a sequence type template. That would make it more useful than the current limitations of being in a [] operator. Lots of other language features could take advantage of sequence types. Then: " But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't have alot of baring, it would just be a vararg function, one element would be an int, while another would be a sequence.The problem with vararg functions is that the argument types are resolved at runtime. This is what Norbert calls "not an option". First, the only way to make this fast is to hope that the indexing function gets inlined, its loops unrolled and the type information constantly folded in. Multidimensional arrays are often used where speed matters, so indexing should be as fast as possible. But this is not the main problem. The resulting type of an indexing expression depends on whether dimensions are indexed or sliced. A three dimensional array a, indexed like a[0..1,0,0..5] should result in a two dimensional array (sliced dimensions remain, indexed dimensions are removed). The return type has to be computed at compile time. I was going to post this at this moment, but reading Norberts post again gave me an idea I had to test...:On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com> said:It turns out this is not at all far away. Compile time varadic function arguments are available with the IFTI support in DMD 0.149. That means today! (Allright, not unlimited number of varadic function arguments, but a reasonable number of arguments at least.) Here is a demo: --- import std.stdio; struct Empty {} template TupleType(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*,...*/) { static if (is(A == Empty)) alias Empty TupleType; else alias List!(A,.TupleType!(B,C,D,E,F,G/*,...*/)) TupleType; } template Tuple(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*,...*/) { TupleType!(A,B,C,D,E,F,G /*,...*/) Tuple(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e = E.in\ it, F f = F.init, G g = G.init /*,...*/) { static if (is(A == Empty)) return Empty.init; else return List!(A,TupleType!(B,C,D,E,F,G/*,...*/) (a,.Tuple(b,c,d,e,f,g /*,...*/)); } } struct List(A,B) { A head; B tail; static List opCall(A a, B b) { List!(A,B) ret; ret.head = a; ret.tail = b; return ret; } } template func(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*, ...*/) { void func(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e = E.init, F f = F.init, G g = G.ini\ t /*,...*/) { realFunc(Tuple(a,b,c,d,e,f,g /*,...*/)); } } template realFunc(T) { void realFunc(T args) { static if (is(T == Empty)) { writefln("Done!"); } else { writefln("Got argument: (%s) %s",typeid(typeof(args.head)),args.head); .realFunc(args.tail); } } } void main() { int a = 1; float b = 5.6; long c = 3; char[] d = "test"; func(a,b,c,d); } --- And guess what? This prints: Got argument: (int) 1 Got argument: (float) 5.6 Got argument: (long) 3 Got argument: (char[]) test Done! All types are deduced at compile time! Notice the neat implementation of realFunc(). All other templates could be put in a library. I bet the func()-wrapper could declared by a mixin! So all we need to support multidimensional splice/indexing is to make .. evaluate into: struct Sequence { size_t start,end; } and we are set. /OskarAll the really clean solution that I could think of would demand for tuple-types and list-handling at compile time, both things that have been discussed before but are at best a topic for the very far future.
Mar 11 2006
Oskar Linde wrote: ...void main() { int a = 1; float b = 5.6; long c = 3; char[] d = "test"; func(a,b,c,d); } --- And guess what? This prints: Got argument: (int) 1 Got argument: (float) 5.6 Got argument: (long) 3 Got argument: (char[]) test Done! All types are deduced at compile time! Notice the neat implementation of realFunc(). All other templates could be put in a library. I bet the func()-wrapper could declared by a mixin! So all we need to support multidimensional splice/indexing is to make .. evaluate into: struct Sequence { size_t start,end; } and we are set.I kind of like the idea, but not sure if it is maybe to late for something like that, and not sure if it will cause any problems.
Mar 11 2006
Ivan Senji wrote:Oskar Linde wrote: ...There is fortunately an obvious fallback: If the class has no opIndex(Sequence) overload, a depreciable opSplice(s.start,s.end) gets called. /Oskarvoid main() { int a = 1; float b = 5.6; long c = 3; char[] d = "test"; func(a,b,c,d); } --- And guess what? This prints: Got argument: (int) 1 Got argument: (float) 5.6 Got argument: (long) 3 Got argument: (char[]) test Done! All types are deduced at compile time! Notice the neat implementation of realFunc(). All other templates could be put in a library. I bet the func()-wrapper could declared by a mixin! So all we need to support multidimensional splice/indexing is to make .. evaluate into: struct Sequence { size_t start,end; } and we are set.I kind of like the idea, but not sure if it is maybe to late for something like that, and not sure if it will cause any problems.
Mar 11 2006
Great! This might really be the way to go. Thanks a lot for the idea. Oskar Linde wrote:S. Chancellor wrote:It would be nice if the .. operator just returned a variable of a sequence type template. That would make it more useful than the current limitations of being in a [] operator. Lots of other language features could take advantage of sequence types. Then: " But then, what about mixed expressions like 'a[1,4..7]' ?" Wouldn't have alot of baring, it would just be a vararg function, one element would be an int, while another would be a sequence.The problem with vararg functions is that the argument types are resolved at runtime. This is what Norbert calls "not an option". First, the only way to make this fast is to hope that the indexing function gets inlined, its loops unrolled and the type information constantly folded in. Multidimensional arrays are often used where speed matters, so indexing should be as fast as possible. But this is not the main problem. The resulting type of an indexing expression depends on whether dimensions are indexed or sliced. A three dimensional array a, indexed like a[0..1,0,0..5] should result in a two dimensional array (sliced dimensions remain, indexed dimensions are removed). The return type has to be computed at compile time. I was going to post this at this moment, but reading Norberts post again gave me an idea I had to test...:On 2006-03-10 07:34:31 -0800, Norbert Nemec <Norbert_member pathlink.com> said:It turns out this is not at all far away. Compile time varadic function arguments are available with the IFTI support in DMD 0.149. That means today! (Allright, not unlimited number of varadic function arguments, but a reasonable number of arguments at least.) Here is a demo: --- import std.stdio; struct Empty {} template TupleType(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*,...*/) { static if (is(A == Empty)) alias Empty TupleType; else alias List!(A,.TupleType!(B,C,D,E,F,G/*,...*/)) TupleType; } template Tuple(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*,...*/) { TupleType!(A,B,C,D,E,F,G /*,...*/) Tuple(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e = E.in\ it, F f = F.init, G g = G.init /*,...*/) { static if (is(A == Empty)) return Empty.init; else return List!(A,TupleType!(B,C,D,E,F,G/*,...*/) (a,.Tuple(b,c,d,e,f,g /*,...*/)); } } struct List(A,B) { A head; B tail; static List opCall(A a, B b) { List!(A,B) ret; ret.head = a; ret.tail = b; return ret; } } template func(A = Empty, B = Empty, C = Empty, D = Empty, E = Empty, F = Empty, G = Empty /*, ...*/) { void func(A a = A.init, B b = B.init, C c = C.init, D d = D.init, E e = E.init, F f = F.init, G g = G.ini\ t /*,...*/) { realFunc(Tuple(a,b,c,d,e,f,g /*,...*/)); } } template realFunc(T) { void realFunc(T args) { static if (is(T == Empty)) { writefln("Done!"); } else { writefln("Got argument: (%s) %s",typeid(typeof(args.head)),args.head); .realFunc(args.tail); } } } void main() { int a = 1; float b = 5.6; long c = 3; char[] d = "test"; func(a,b,c,d); } --- And guess what? This prints: Got argument: (int) 1 Got argument: (float) 5.6 Got argument: (long) 3 Got argument: (char[]) test Done! All types are deduced at compile time! Notice the neat implementation of realFunc(). All other templates could be put in a library. I bet the func()-wrapper could declared by a mixin! So all we need to support multidimensional splice/indexing is to make .. evaluate into: struct Sequence { size_t start,end; } and we are set. /OskarAll the really clean solution that I could think of would demand for tuple-types and list-handling at compile time, both things that have been discussed before but are at best a topic for the very far future.
Mar 12 2006