digitalmars.D - Proposal: Multidimensional opSlice solution
- Norbert Nemec (50/50) Mar 08 2010 Hi there,
- Don (12/23) Mar 08 2010 A solution was suggested while you were away.
- Michel Fortin (6/8) Mar 08 2010 And that's sad, because there are many other uses for a Slice struct.
- Norbert Nemec (16/41) Mar 08 2010 I know this solution. It is exactly the "syntax sugar" issue that I see
- Don (17/63) Mar 08 2010 You are mistaken <g>.
- Lars T. Kyllingstad (3/39) Mar 08 2010 It will? For complex literals, then?
- Don (3/44) Mar 08 2010 Hmmm. Maybe not. Complex literals will probably just disappear. We don't...
- Norbert Nemec (45/71) Mar 08 2010 Indeed? If that is so, a builtin Slice type as syntactic sugar would be
- bearophile (8/14) Mar 08 2010 D won't be frozen forever, eventually few new things will be added, for ...
- Norbert Nemec (11/18) Mar 08 2010 I experienced this crucial difficulty with my very first request, five
- bearophile (7/11) Mar 09 2010 I don't agree, 2D arrays are useful in many other situations, for exampl...
- Ellery Newcomer (2/8) Mar 08 2010 Ternary ?:, I suppose.
- Norbert Nemec (6/19) Mar 09 2010 Why not simply split up the ternary a?b:c into a nested expression
- Fawzi Mohamed (43/65) Mar 09 2010 I don't see why the obvious solution from..to..step would have
- bearophile (5/9) Mar 09 2010 If .. is more descriptive for a range of indexes (and I agree, despite i...
- Norbert Nemec (21/71) Mar 09 2010 The ".." vs. ":" issue is really more cosmetics. It makes a bit of
- Fawzi Mohamed (19/95) Mar 09 2010 As far as I know for D1.0 and large multidimensional arrays, really
- Andrei Alexandrescu (8/12) Mar 09 2010 I'm not so sure about that. I was very enthusiastic about defining an
- Fawzi Mohamed (8/19) Mar 09 2010 yes and for dense storage, and may operations, the cost is exponential
- Fawzi Mohamed (12/37) Mar 09 2010 By the way on the whole I think that D2.0 improves several things in
- Norbert Nemec (19/30) Mar 09 2010 N-dimensional arrays are not tied to N-dimensional spaces.
- bearophile (5/6) Mar 09 2010 But they seem acceptable for AA literals:
- Ellery Newcomer (2/8) Mar 09 2010 Yeah, on second thought I don't think ?: poses any such problems.
- Don (13/36) Mar 09 2010 slicing "a..b:c"
- Norbert Nemec (11/13) Mar 09 2010 Indeed, cache usage is a challenge. My general approach would be fairly
- %fil (6/22) Jan 18 2011 Hi All,
Hi there,

in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.:

    A[4,7..8,7,2..5]

So far, no clean solution for overloading this has been suggested.

Looking at Python, there is a range operator a:b or a:b:c that takes two or three integers and returns a 'slice' object, i.e. a builtin type. The __getitem__ method (Python's equivalent to opIndex) then simply receives either integers or 'slice' objects for each dimension. The range operator only exists within indexing expressions.

The more D-ish equivalent of this solution is another opXXX method. How about the following:

----------------
* The slicing operation obj[] is overloaded by obj.opIndex()
* The slicing operation obj[a..b] is overloaded by obj.opIndex(obj.opRange(a,b))
* Within an indexing expression on an object 'obj', an expression of the form 'a..b' is transformed into a call obj.opRange(a,b). Furthermore, an expression of the form 'a..b:c' is transformed into obj.opRange(a,b,c), where 'c' is the stride.
* Alternatively, template functions of the form
      obj.opRange(int N)(a,b)
      obj.opRange(int N)(a,b,c)
  are tried, with N set to the dimension number within the indexing expression. It is an error for both opRange(a,b,c) and opRange(int N)(a,b,c) to exist in the same user defined type. For example
      obj[a..b,d..e:f]
  would become
      obj.opIndex(obj.opRange!(0)(a,b),obj.opRange!(1)(d,e,f))
* The opRange function may return an object of arbitrary type which is passed on to opIndex (or opOpIndex or opIndexAssign) to perform the actual slicing on the object. Proper overloading of these functions allows arbitrary mixing of slicing and indexing within the same expression.
* All op*Slice* operators become obsolete.
----------------

I am not sure whether opRange is the best name, as it clashes with the range concept in the libraries. I would have liked to reuse the name opSlice, which would become unused by this change. However, the transition to this completely different meaning would be rather painful. Alternative name suggestions are welcome.

Greetings,
Norbert
Mar 08 2010
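The lowering proposed above can be tried out by writing the calls by hand. The following is only a sketch under stated assumptions: `Span` and `Array4` are hypothetical names, `opRange` is an ordinary member function here (no compiler rewrites `a..b:c` into it), and `opIndex` merely reports the shape of the selection instead of returning real data.

```d
import std.stdio : writeln;

// Hypothetical result type of opRange: half-open bounds plus a stride.
struct Span
{
    size_t from, to, stride;
}

// Hypothetical 4-dimensional container; storage is omitted on purpose.
struct Array4
{
    // Stands in for both proposed forms: `a..b` (stride 1) and `a..b:c`.
    // N is the dimension the expression appears in; unused in this sketch.
    Span opRange(int N)(size_t a, size_t b, size_t c = 1)
    {
        assert(a <= b && c > 0);
        return Span(a, b, c);
    }

    // Receives one argument per dimension, either an integer or a Span.
    // Returns the shape of the selection: a Span keeps its dimension,
    // a plain integer index selects one element and drops the dimension.
    size_t[] opIndex(Args...)(Args args)
    {
        size_t[] shape;
        foreach (arg; args)
        {
            static if (is(typeof(arg) == Span))
                shape ~= (arg.to - arg.from + arg.stride - 1) / arg.stride;
        }
        return shape;
    }
}

void main()
{
    Array4 a;

    // Hand-written form of the proposed A[4,7..8,7,2..5]
    auto s1 = a.opIndex(4, a.opRange!1(7, 8), 7, a.opRange!3(2, 5));
    assert(s1.length == 2 && s1[0] == 1 && s1[1] == 3);

    // Hand-written form of the proposed A[1..4, 0..10:2, 0, 0]
    auto s2 = a.opIndex(a.opRange!0(1, 4), a.opRange!1(0, 10, 2), 0, 0);
    assert(s2.length == 2 && s2[0] == 3 && s2[1] == 5);

    writeln(s1, " ", s2);
}
```

Because `opIndex` is a variadic template, integers and `Span` values can be mixed in any position, which is the "arbitrary mixing" the proposal asks for.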
Norbert Nemec wrote:
> Hi there, in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.: A[4,7..8,7,2..5] So far, no clean solution for overloading this has been suggested.

A solution was suggested while you were away. You don't need a new opRange operator, a simple tuple struct like:

    struct Slice(T) { T from; T to; }

in std.object is enough. Note that:

    A[4, Slice(7,8), 7, Slice(2,5)]

will work with the existing compiler. So it's just a tiny syntax sugar issue.

And another simple possibility is to turn 7..8 into int[2][7,8].

However, Andrei argued that slicing of multidimensional arrays is so rarely used that syntax sugar is not necessary. Thus, it's not in D2.
Mar 08 2010
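A minimal sketch of the library-only approach from the post above, for a two-dimensional case. The assumptions are mine, not Don's: `Matrix` and the lower-case helper `slice` are hypothetical, the helper exists because (as far as I know) a templated struct does not deduce `T` from constructor arguments, and the `$` support relies on a compiler that accepts a per-dimension `opDollar` template.

```d
import std.stdio : writeln;

// The tuple struct from the post above.
struct Slice(T)
{
    T from;
    T to;
}

// Hypothetical helper so callers need not spell out Slice!size_t.
Slice!size_t slice(size_t from, size_t to)
{
    return Slice!size_t(from, to);
}

// Hypothetical dense row-major matrix, just enough to show the overloads.
struct Matrix
{
    double[] data;
    size_t rows, cols;

    // Lets `$` be used inside the brackets, separately for each dimension.
    size_t opDollar(size_t dim)() const
    {
        static if (dim == 0) return rows;
        else return cols;
    }

    // m[r, c]: a single element.
    double opIndex(size_t r, size_t c) const
    {
        return data[r * cols + c];
    }

    // m[r, slice(a, b)]: columns a..b of row r, returned as a copy.
    double[] opIndex(size_t r, Slice!size_t c) const
    {
        return data[r * cols + c.from .. r * cols + c.to].dup;
    }
}

void main()
{
    auto m = Matrix([1.0, 2, 3, 4, 5, 6], 2, 3);

    assert(m[1, 2] == 6);                         // plain indexing
    assert(m[0, slice(1, 3)] == [2.0, 3.0]);      // index mixed with a slice
    assert(m[1, slice(0, $ - 1)] == [4.0, 5.0]);  // `$` is the column count here

    writeln(m[1, slice(0, $ - 1)]);
}
```

Ordinary overloading of `opIndex` on the argument types is all that is needed; the missing piece the thread argues about is only the `7..8` spelling.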
On 2010-03-08 09:55:53 -0500, Don <nospam nospam.com> said:
> However, Andrei argued that slicing of multidimensional arrays is so rarely used that syntax sugar is not necessary. Thus, it's not in D2.

And that's sad, because there are many other uses for a Slice struct.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 08 2010
Don wrote:
> Norbert Nemec wrote:
>> Hi there, in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.: A[4,7..8,7,2..5] So far, no clean solution for overloading this has been suggested.
>
> A solution was suggested while you were away. You don't need a new opRange operator, a simple tuple struct like: struct Slice(T) { T from; T to; } in std.object is enough. Note that: A[4, Slice(7,8), 7, Slice(2,5)] will work with the existing compiler. So it's just a tiny syntax sugar issue.

I know this solution. It is exactly the "syntax sugar" issue that I see as the problem here: The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.

> And another simple possibility is to turn 7..8 into int[2][7,8].

That solution would unnecessarily get in the way of implementing indexing by an array of indices:

    auto A = MyArray(["zero","one","two","three"]);
    assert( A[ [2,3,0] ] == MyArray(["two","three","zero"]) );

> However, Andrei argued that slicing of multidimensional arrays is so rarely used that syntax sugar is not necessary. Thus, it's not in D2.

Outside of numerics, multidimensional arrays are indeed rarely used. In numerical programming, however, they are an essential ingredient. My ultimate goal is to make multidimensional arrays in D as comfortable as in Fortran, Matlab or Python/NumPy in order to make D a real competitor in that market.
Mar 08 2010
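The index-list example from the post above can be made runnable with plain overloading. This is a sketch only: the post does not define `MyArray`, so the struct below is a hypothetical stand-in with just enough members for the assertion to hold.

```d
import std.stdio : writeln;

// Hypothetical container; only what the quoted assertion needs.
struct MyArray
{
    string[] items;

    // A[i]: a single element.
    string opIndex(size_t i)
    {
        return items[i];
    }

    // A[[i, j, k]]: a new array holding the selected elements, in that order.
    MyArray opIndex(size_t[] idx)
    {
        string[] picked;
        foreach (i; idx)
            picked ~= items[i];
        return MyArray(picked);
    }
}

void main()
{
    auto A = MyArray(["zero", "one", "two", "three"]);

    assert(A[1] == "one");
    assert(A[[2, 3, 0]] == MyArray(["two", "three", "zero"]));

    // A two-element index list selects two elements; it is not a range.
    assert(A[[1, 3]] == MyArray(["one", "three"]));

    writeln(A[[2, 3, 0]].items);
}
```

If `1..3` were rewritten into a two-element array, the last assertion and a slice `A[1..3]` (elements 1 and 2) would reach the same overload, which is the conflict the post describes.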
Norbert Nemec wrote:
> Don wrote:
>> Norbert Nemec wrote:
>>> Hi there, in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.: A[4,7..8,7,2..5] So far, no clean solution for overloading this has been suggested.
>>
>> A solution was suggested while you were away. You don't need a new opRange operator, a simple tuple struct like: struct Slice(T) { T from; T to; } in std.object is enough. Note that: A[4, Slice(7,8), 7, Slice(2,5)] will work with the existing compiler. So it's just a tiny syntax sugar issue.
>
> I know this solution. It is exactly the "syntax sugar" issue that I see as the problem here: The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.

You are mistaken <g>. object, TypeInfo, AssociativeArray, ... Complex will be added to that list, too.

>> And another simple possibility is to turn 7..8 into int[2][7,8].
>
> That solution would unnecessarily get in the way of implementing indexing by an array of indices: auto A = MyArray(["zero","one","two","three"]); assert( A[ [2,3,0] ] == MyArray(["two","three","zero"]) );

But that syntax doesn't allow slices. It's only an issue if a dimension can be BOTH a slice T..T, AND T[2], and they are not equivalent. I tried to come up with a plausible scenario where it was really a problem, but failed. It requires that both T and T[2] are valid indices for the same dimension, AND that slicing is supported.

But this is all a bit academic since it's not going to happen. Maybe if you'd been around a few months back, things would be different.

>> However, Andrei argued that slicing of multidimensional arrays is so rarely used that syntax sugar is not necessary. Thus, it's not in D2.
>
> Outside of numerics, multidimensional arrays are indeed rarely used. In numerical programming, however, they are an essential ingredient. My ultimate goal is to make multidimensional arrays in D as comfortable as in Fortran, Matlab or Python/NumPy in order to make D a real competitor in that market.

Agreed. Multidimensional slicing is a costly operation, though, so I think the absence of syntax sugar is tolerable. I see syntax sugar for slicing as much less important than multidimensional $, where the alternative is really quite ugly. So I pushed hard for opDollar.
Mar 08 2010
Don wrote:
> Norbert Nemec wrote:
>> Don wrote:
>>> Norbert Nemec wrote:
>>>> Hi there, in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.: A[4,7..8,7,2..5] So far, no clean solution for overloading this has been suggested.
>>>
>>> A solution was suggested while you were away. You don't need a new opRange operator, a simple tuple struct like: struct Slice(T) { T from; T to; } in std.object is enough. Note that: A[4, Slice(7,8), 7, Slice(2,5)] will work with the existing compiler. So it's just a tiny syntax sugar issue.
>>
>> I know this solution. It is exactly the "syntax sugar" issue that I see as the problem here: The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.
>
> You are mistaken <g>. object, TypeInfo, AssociativeArray, ... Complex will be added to that list, too.

It will? For complex literals, then?

-Lars
Mar 08 2010
Lars T. Kyllingstad wrote:
> Don wrote:
>> Norbert Nemec wrote:
>>> Don wrote:
>>>> Norbert Nemec wrote:
>>>>> Hi there, in implementing multi-dimensional arrays, the current way of overloading the slicing operator does not scale up. Currently, there is opIndex working for an arbitrary number of indices, but opSlice works only for one dimension. Ultimately, it should be possible to allow slicing for more than one dimension, and mixing it with indexing for e.g.: A[4,7..8,7,2..5] So far, no clean solution for overloading this has been suggested.
>>>>
>>>> A solution was suggested while you were away. You don't need a new opRange operator, a simple tuple struct like: struct Slice(T) { T from; T to; } in std.object is enough. Note that: A[4, Slice(7,8), 7, Slice(2,5)] will work with the existing compiler. So it's just a tiny syntax sugar issue.
>>>
>>> I know this solution. It is exactly the "syntax sugar" issue that I see as the problem here: The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.
>>
>> You are mistaken <g>. object, TypeInfo, AssociativeArray, ... Complex will be added to that list, too.
>
> It will? For complex literals, then? -Lars

Hmmm. Maybe not. Complex literals will probably just disappear. We don't need them any more.
Mar 08 2010
Don wrote:
> Norbert Nemec wrote:
>> The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.
>
> You are mistaken <g>. object, TypeInfo, AssociativeArray, ... Complex will be added to that list, too.

Indeed? If that is so, a builtin Slice type as syntactic sugar would be just as good as my proposal. Just note that the slice operator "a..b" should also allow strided slicing "a..b:c"

>> auto A = MyArray(["zero","one","two","three"]); assert( A[ [2,3,0] ] == MyArray(["two","three","zero"]) );
>
> But that syntax doesn't allow slices. It's only an issue if a dimension can be BOTH a slice T..T, AND T[2], and they are not equivalent. I tried to come up with a plausible scenario where it was really a problem, but failed. It requires that both T and T[2] are valid indices for the same dimension, AND that slicing is supported.

Which is not only feasible but implemented and widely used in Fortran, Matlab and Python/NumPy:

    a[5]            --> returns a single element
    a[5..8]         --> returns a slice [ a[5],a[6],a[7] ]
    a[[5,3,1,4,6]]  --> returns a list of elements selected by the list of indices [a[5],a[3],a[1],a[4],a[6]]

In the last case, the list of indices can of course have just two entries, so

    a[[5,10]] == [ a[5],a[10] ]

> But this is all a bit academic since it's not going to happen. Maybe if you'd been around a few months back, things would be different.

Yea, seems like I missed some important decisions. May indeed be too late for some ideas. Still - not giving up quite yet.

> Agreed. Multidimensional slicing is a costly operation, though, so I think the absence of syntax sugar is tolerable.

Costly? Not at all! Just a bit of integer arithmetic that would need to be done anyway to access the data. Most numerical code consists of many nested loops containing very few operations. Strided multidimensional slicing operations in combination with array expressions allow expressing this equivalently in very few lines that can be optimized very effectively.

As for the necessity of syntactic sugar, consider one typical line that I picked from my own numerical Python code:

Python/NumPy:

    off = (y[:-1,:]*x[1:,:] - y[1:,:]*x[:-1,:]) / (x[1:,:]-x[:-1,:])

D no slicing:

    for(int i=0;i<off.shape[0]-1;i++)
        for(int j=0;j<off.shape[1];j++)
            off[i,j] = (y[i,j]*x[i+1,j]-y[i+1,j]*x[i,j]) / (x[i+1,j]-x[i,j])

D currently possible:

    off[] = (y[slice(0,$-1),slice(0,$)]*x[slice(1,$),slice(0,$)] - y[slice(1,$),slice(0,$)]*x[slice(0,$-1),slice(0,$)]) / (x[slice(1,$),slice(0,$)] - x[slice(0,$-1),slice(0,$)]);

D with syntactic sugar:

    off[] = (y[0..$-1,0..$]*x[1..$,0..$] - y[1..$,0..$]*x[0..$-1,0..$]) / (x[1..$,0..$] - x[0..$-1,0..$]);

D with defaults 0 and $ for missing boundaries:

    off[] = (y[..$-1,..]*x[1..,..] - y[1..,..]*x[..$-1,..]) / (x[1..,..] - x[..$-1,..]);

Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.

> I see syntax sugar for slicing as much less important than multidimensional $, where the alternative is really quite ugly. So I pushed hard for opDollar.

Thanks for that. It is indeed essential.
Mar 08 2010
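The "just a bit of integer arithmetic" argument from the post above can be illustrated with a strided view over flat storage. This is a sketch under my own assumptions: `View2D`, `rowMajor` and `sub` are hypothetical names, and `sub` is an ordinary method rather than any slicing syntax. Taking a strided sub-view only computes a new offset, shape and stride; the data is shared, not copied.

```d
import std.stdio : writeln;

// Hypothetical 2-D view onto shared flat storage.
struct View2D
{
    double[] data;     // flat storage, shared between all views
    size_t offset;     // position of element [0, 0] in data
    size_t[2] shape;   // number of elements per dimension
    size_t[2] stride;  // step in data per dimension

    // Builds a view of a dense row-major rows x cols block.
    static View2D rowMajor(double[] data, size_t rows, size_t cols)
    {
        View2D v;
        v.data = data;
        v.shape = [rows, cols];
        v.stride = [cols, 1];
        return v;
    }

    // v[i, j]: one multiplication and one addition per dimension.
    double opIndex(size_t i, size_t j)
    {
        return data[offset + i * stride[0] + j * stride[1]];
    }

    // Rows r0..r1 with step rs and columns c0..c1 with step cs (half-open).
    // Only offset, shape and stride change; no element is touched.
    View2D sub(size_t r0, size_t r1, size_t rs, size_t c0, size_t c1, size_t cs)
    {
        View2D v = this;
        v.offset = offset + r0 * stride[0] + c0 * stride[1];
        v.shape = [(r1 - r0 + rs - 1) / rs, (c1 - c0 + cs - 1) / cs];
        v.stride = [stride[0] * rs, stride[1] * cs];
        return v;
    }
}

void main()
{
    // 3 x 4 matrix with element (i, j) == 4*i + j
    auto data = new double[12];
    foreach (k, ref x; data)
        x = k;
    auto m = View2D.rowMajor(data, 3, 4);

    // Rows 1..3, every second column of 0..4
    auto s = m.sub(1, 3, 1, 0, 4, 2);
    assert(s.shape[0] == 2 && s.shape[1] == 2);
    assert(s[0, 0] == 4 && s[1, 1] == 10);

    // The view shares storage with the original matrix.
    data[10] = -1;
    assert(s[1, 1] == -1);

    writeln(s.shape, " ", s.stride);
}
```

The cost of `sub` grows with the number of dimensions, not with the number of elements; copying the selected elements would be the expensive variant.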
Norbert Nemec:
> Just note that the slice operator "a..b" should also allow strided slicing "a..b:c"

I have asked for this ages ago :o) But at that time there was no laziness and ranges in Phobos, so that was not very useful.

> Yea, seems like I missed some important decisions. May indeed be too late for some ideas. Still - not giving up quite yet.

D won't be frozen forever, eventually a few new things will be added, for example in D3. For example probably some of the limits of CTFE will be removed. Even the C language and Fortran keep having changes every ten years or so. What will be hard to do are changes/removals. It's important to tell apart additive changes (like the ones you ask, like using 0 and $ as default bounds when they are missing, or adding a stride) from breaking changes (like replacing .. with a : ). I didn't understand this essential difference until much too late.

> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.

I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why.

Bye,
bearophile
Mar 08 2010
bearophile wrote:
> D won't be frozen forever, eventually a few new things will be added, for example in D3. For example probably some of the limits of CTFE will be removed. Even the C language and Fortran keep having changes every ten years or so. What will be hard to do are changes/removals. It's important to tell apart additive changes (like the ones you ask, like using 0 and $ as default bounds when they are missing, or adding a stride) from breaking changes (like replacing .. with a : ). I didn't understand this essential difference until much too late.

I experienced this crucial difficulty with my very first request, five years ago when I pleaded to replace the names "ireal" and "creal". When no one seemed to catch the deep irony of these names, I learned to accept it as a bit of mathematical humor.

>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>
> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why.

I guess in one dimension, ".." does indeed look more intuitive than ":". It just does not scale up as nicely to higher dimensions. The fundamental issue that I am only gradually beginning to understand is that multidimensional arrays are really much more a niche feature than they seem. Outside of numerics, hardly anybody actually finds them particularly useful.
Mar 08 2010
Norbert Nemec:
> The fundamental issue that I am only gradually beginning to understand is that multidimensional arrays are really much more a niche feature than they seem. Outside of numerics, hardly anybody actually finds them particularly useful.

I don't agree, 2D arrays are useful in many other situations, for example in games to represent a 2D field of characteristics, etc, etc. But I agree that heavy usage of slices and strides and nD arrays even more complex than your example is common in numerics only:

    off = (y[:-1,:]*x[1:,:] - y[1:,:]*x[:-1,:]) / (x[1:,:]-x[:-1,:])

D has a chance to become important for the people doing numerical/array processing, so even if this is a niche purpose, it can be an important niche for D. The introduction of the improved $ by Don means that I am not the only one that shares this idea.

Bye,
bearophile
Mar 09 2010
On 03/08/2010 08:49 PM, bearophile wrote:
>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>
> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why. Bye, bearophile

Ternary ?:, I suppose.
Mar 08 2010
Ellery Newcomer wrote:
> On 03/08/2010 08:49 PM, bearophile wrote:
>>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>>
>> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why. Bye, bearophile
>
> Ternary ?:, I suppose.

Why not simply split up the ternary a?b:c into a nested expression

    a?(b:c)

The binary ":" could simply be an expression that may only appear in special contexts: on the right side of binary "?" expressions or within indexing expressions.
Mar 09 2010
On 2010-03-09 09:47:17 +0100, Norbert Nemec <Norbert Nemec-online.de> said:
> Ellery Newcomer wrote:
>> On 03/08/2010 08:49 PM, bearophile wrote:
>>>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>>>
>>> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why. Bye, bearophile
>>
>> Ternary ?:, I suppose.
>
> Why not simply split up the ternary a?b:c into a nested expression a?(b:c) The binary ":" could simply be an expression that may only appear in special contexts: on the right side of binary "?" expressions or within indexing expressions.

I don't see why the obvious solution from..to..step would have problems, so that one needs the ":", but maybe I did not think enough about the implications. Anyway I find .. more self descriptive for slices than :.

Anyway at the moment there is no syntactic sugar for that. You said that multidimensional arrays are a niche, well it might well be, but still there are some implementations in D. In particular there is my NArray implementation in blip http://dsource.org/projects/blip .

Using it your example

    off = (y[:-1,:]*x[1:,:] - y[1:,:]*x[:-1,:]) / (x[1:,:]-x[:-1,:])

becomes

    auto off=(y[Range(0,-2)]*x[Range(1,-1)]-y[Range(1,-1)]*x[Range(0,-2)])/(x[Range(1,-1)]-x[Range(0,-1)]);

not ideal, but not so terrible either.

The Range structure negative indexes are from the end, and are inclusive (thus Range(0,-1) means up to the end, the whole range). This removes problems with making $ work correctly.

NArrays basically never copy and have a reasonable performance. As they are based on a big flat storage, you can easily write very efficient code, and use already written libraries, for example blas and lapack are used by default for several operations. You can use scope to guarantee collection (well, it does not yet work as it should with all compilers). All operations can have the target array, so that you can avoid the allocation of temporary arrays. Thus for example

    auto off=(y[Range(0,-2)]*x[Range(1,-1)]-y[Range(1,-1)]*x[Range(0,-2)]);
    off /= (x[Range(1,-1)]-x[Range(0,-1)]);

would already be a bit better (one temporary less). Probably a pool for reusing temporaries would be good, but I did not do it.

Also there are no expression templates (à la blitz++), these work well for small expressions, but when you have to follow several read contexts it becomes unclear if it is really a win, so optimizing them can be very complicated. There are optimized versions of commonly used operations. Parallelization has been prepared, but not yet done (it is a rewrite of the ploop iteration away).

I think that the library is quite complete from the user point of view, and quite usable...

Fawzi
Mar 09 2010
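The convention described in the post above (negative bounds count from the end and the upper bound is inclusive) can be mapped onto ordinary half-open bounds with a few lines of arithmetic. This sketch is not blip's code: `IncRange` and `resolve` are hypothetical names, and it assumes for simplicity that a non-negative upper bound is inclusive as well, which may differ from the real `Range` type.

```d
import std.stdio : writeln;

// Hypothetical stand-in for an inclusive range with from-the-end bounds.
struct IncRange
{
    long from;
    long to;
}

// Converts r into half-open bounds [lo, hi) for a dimension of length len.
size_t[2] resolve(IncRange r, size_t len)
{
    immutable n = cast(long) len;
    // Negative values count from the end: -1 is the last element.
    immutable lo = r.from < 0 ? r.from + n : r.from;
    immutable last = r.to < 0 ? r.to + n : r.to;
    assert(0 <= lo && lo <= last + 1 && last < n, "range out of bounds");
    // The upper end is inclusive, so the half-open bound is last + 1.
    return [cast(size_t) lo, cast(size_t)(last + 1)];
}

void main()
{
    auto whole = resolve(IncRange(0, -1), 5);  // like 0..$
    assert(whole[0] == 0 && whole[1] == 5);

    auto head = resolve(IncRange(0, -2), 5);   // like 0..$-1
    assert(head[0] == 0 && head[1] == 4);

    auto tail = resolve(IncRange(1, -1), 5);   // like 1..$
    assert(tail[0] == 1 && tail[1] == 5);

    writeln(whole, " ", head, " ", tail);
}
```

The sign test runs once per range rather than once per element access, so its cost is paid when the slice is created.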
Fawzi Mohamed:
> I don't see why the obvious solution from..to..step would have problems, so that one needs the ":", but maybe I did not think enough about the implications. Anyway I find .. more self descriptive for slices than :.

If .. is more descriptive for a range of indexes (and I agree, despite it needs two chars instead of one) then to..step is not right, because to..step is not a range. So from..to:step seems more descriptive than from..to..step :-)

Bye,
bearophile
Mar 09 2010
Fawzi Mohamed wrote:
> On 2010-03-09 09:47:17 +0100, Norbert Nemec <Norbert Nemec-online.de> said:
>> Ellery Newcomer wrote:
>>> On 03/08/2010 08:49 PM, bearophile wrote:
>>>>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>>>>
>>>> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why. Bye, bearophile
>>>
>>> Ternary ?:, I suppose.
>>
>> Why not simply split up the ternary a?b:c into a nested expression a?(b:c) The binary ":" could simply be an expression that may only appear in special contexts: on the right side of binary "?" expressions or within indexing expressions.
>
> I don't see why the obvious solution from..to..step would have problems, so that one needs the ":", but maybe I did not think enough about the implications. Anyway I find .. more self descriptive for slices than :.

The ".." vs. ":" issue is really more cosmetics. It makes a bit of difference in readability when full slices are used in multiple dimensions. Simply imagine

    A[..,..,..,..]

vs.

    A[:,:,:,:]

> Anyway at the moment there is no syntactic sugar for that. You said that multidimensional arrays are a niche, well t might well be, but still there are some implementations in D. In particular there is my NArray implementation in blip http://dsource.org/projects/blip .

Indeed, there is a number of implementations. Will be interesting to do a systematic comparison. Ultimately, I would like to combine the strengths of all these different implementations into one library that has the potential to become standard. Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.

> Using it your example off = (y[:-1,:]*x[1:,:] - y[1:,:]*x[:-1,:]) / (x[1:,:]-x[:-1,:]) becomes auto off=(y[Range(0,-2)]*x[Range(1,-1)]-y[Range(1,-1)]*x[Range(0,-2)])/(x[Range(1,-1)]-x[Range(0,-1)]); not ideal, but not so terrible either.

The full slicing indices for the second dimension are missing. You could of course make those implicit, but then what happens if you need that expression to work on swapped dimensions?

> The Range structure negative indexes are from the end, and are inclusive (thus Range(0,-1) means up to the end, the whole range). This removes problems with making $ work correctly.

The negative sign solution is nice for scripting languages like Matlab or Python. In a compiled language it leads to unnecessary runtime overhead because every indexing operation needs to be checked for the sign.

> I think that the library is quite complete from the user point of view, and quite usable...

Good to know. I will have a closer look at it.
Mar 09 2010
On 2010-03-09 12:06:05 +0100, Norbert Nemec <Norbert Nemec-online.de> said:
> Fawzi Mohamed wrote:
>> On 2010-03-09 09:47:17 +0100, Norbert Nemec <Norbert Nemec-online.de> said:
>>> Ellery Newcomer wrote:
>>>> On 03/08/2010 08:49 PM, bearophile wrote:
>>>>>> Admitted, the last case does not work quite as nicely with ".." as it does with Python's ":". Still, the point should be clear.
>>>>>
>>>>> I have never understood why Walter has adopted .. for slices instead of the better : I'd like to know why. Bye, bearophile
>>>>
>>>> Ternary ?:, I suppose.
>>>
>>> Why not simply split up the ternary a?b:c into a nested expression a?(b:c) The binary ":" could simply be an expression that may only appear in special contexts: on the right side of binary "?" expressions or within indexing expressions.
>>
>> I don't see why the obvious solution from..to..step would have problems, so that one needs the ":", but maybe I did not think enough about the implications. Anyway I find .. more self descriptive for slices than :.
>
> The ".." vs. ":" issue is really more cosmetics. It makes a bit of difference in readability when full slices are used in multiple dimensions. Simply imagine A[..,..,..,..] vs. A[:,:,:,:]

I agree that for the full slice it looks uglier

>> Anyway at the moment there is no syntactic sugar for that. You said that multidimensional arrays are a niche, well t might well be, but still there are some implementations in D. In particular there is my NArray implementation in blip http://dsource.org/projects/blip .
>
> Indeed, there is a number of implementations. Will be interesting to do a systematic comparison. Ultimately, I would like to combine the strengths of all these different implementations into one library that has the potential to become standard. Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.

As far as I know for D1.0 and large multidimensional arrays, really usable libraries are just multiarray by braxissimo (that is a kind of abandoned experiment, as he focused on matrixes and vectors), and then my library. Matrix libraries are more (and my library supports dense matrixes/vectors), but really multidimensional libraries, I am not aware of other serious contenders, please share the others...

>> Using it your example off = (y[:-1,:]*x[1:,:] - y[1:,:]*x[:-1,:]) / (x[1:,:]-x[:-1,:]) becomes auto off=(y[Range(0,-2)]*x[Range(1,-1)]-y[Range(1,-1)]*x[Range(0,-2)])/(x[Range(1,-1)]-x[Range(0,-1)]); not ideal, but not so terrible either.
>
> The full slicing indices for the second dimension are missing. You could of course make those implicit, but then what happens if you need that expression to work on swapped dimensions?

Indeed missing = implicit, to swap dimensions you have to use Range(-1) (the full range from 0 to the last).

>> The Range structure negative indexes are from the end, and are inclusive (thus Range(0,-1) means up to the end, the whole range). This removes problems with making $ work correctly.
>
> The negative sign solution is nice for scripting languages like Matlab or Python. In a compiled language it leads to unnecessary runtime overhead because every indexing operation needs to be checked for the sign.

Indeed I don't accept negative indexes for indexing, just for Ranges (where the overhead should be acceptable).

>> I think that the library is quite complete from the user point of view, and quite usable...
>
> Good to know. I will have a closer look at it.

let me know if you have problems/comment about it (also in the IRC if you want).

ciao
Fawzi
Mar 09 2010
Norbert Nemec wrote:
> Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.

I'm not so sure about that. I was very enthusiastic about defining an infrastructure of arbitrary-dimensional arrays, and did a little research about it. It turns out the applicability of N-dimensional arrays falls off a cliff when N > 3. In turn, this is because high-dimensional spaces are weird - an N-dimensional space is not just like a 3-dimensional one, only with more dimensions; it's a downright weird beast.

Andrei
Mar 09 2010
On 9-mar-10, at 13:21, Andrei Alexandrescu wrote:
> Norbert Nemec wrote:
>> Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.
>
> I'm not so sure about that. I was very enthusiastic about defining an infrastructure of arbitrary-dimensional arrays, and did a little research about it. It turns out the applicability of N-dimensional arrays falls off a cliff when N > 3. In turn, this is because high-dimensional spaces are weird - an N-dimensional space is not just like a 3-dimensional one, only with more dimensions; it's a downright weird beast.

yes and for dense storage, and many operations, the cost is exponential in N, so that one normally doesn't really want to go beyond 3. Still I did need up to 3, and well if you go with dense storage, up to 3 or up to N is more or less the same amount of code if you use templates... But for non dense storage it gets messy really quick because design choices really influence the api that one can use efficiently.

Fawzi
Mar 09 2010
On 2010-03-09 13:35:36 +0100, Fawzi Mohamed <fawzi gmx.ch> said:
> On 9-mar-10, at 13:21, Andrei Alexandrescu wrote:
>> Norbert Nemec wrote:
>>> Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.
>>
>> I'm not so sure about that. I was very enthusiastic about defining an infrastructure of arbitrary-dimensional arrays, and did a little research about it. It turns out the applicability of N-dimensional arrays falls off a cliff when N > 3. In turn, this is because high-dimensional spaces are weird - an N-dimensional space is not just like a 3-dimensional one, only with more dimensions; it's a downright weird beast.
>
> yes and for dense storage, and many operations, the cost is exponential in N, so that one normally doesn't really want to go beyond 3. Still I did need up to 3, and well if you go with dense storage, up to 3 or up to N is more or less the same amount of code if you use templates... But for non dense storage it gets messy really quick because design choices really influence the api that one can use efficiently. Fawzi

By the way on the whole I think that D2.0 improves several things in the language, there are some design choices that I don't share so much (shared, increased use of thread local storage,...), but for sure it will make library implementations of multidimensional arrays better:

* immutability can be put to good use for shape and strides,
* maybe a struct implementation might be usable together with a good memory management (I have tried a couple of times to switch to a structure, but the drawbacks were more than the wins in D1.0, especially as you could not ensure deallocation of resources)
* the latest additions wrt. operator overloading are nice.

Now what is still missing for me is a good compiler for x86_64... :)
Mar 09 2010
Andrei Alexandrescu wrote:
> Norbert Nemec wrote:
>> Multidimensional arrays are the essence of the interfaces of most numerical libraries. Having several incompatible standards is a major roadblock for acceptance in the numerical community.
>
> I'm not so sure about that. I was very enthusiastic about defining an infrastructure of arbitrary-dimensional arrays, and did a little research about it. It turns out the applicability of N-dimensional arrays falls off a cliff when N > 3. In turn, this is because high-dimensional spaces are weird - an N-dimensional space is not just like a 3-dimensional one, only with more dimensions; it's a downright weird beast.

N-dimensional arrays are not tied to N-dimensional spaces.

If you view an array as data on a space-grid, then I agree that high dimensionality is awkward. I did work in high-energy physics, where four-dimensional space-time grids are commonly used for calculations. With the cost scaling as the fourth power of the resolution, this is extremely costly. Grid methods for N>4 are indeed rarely used.

Array dimensions, however, do not need to refer to space dimensions. I, for example, often handle electron spins as additional indices of range two. Or the number of particles in the system, or an index for the various configurations of the system that need to be handled. There are unlimited possibilities for what an index may mean.

Fortran 90 limits the number of dimensions to seven, and in our current code we actually were hit by this limit once, when adding another dimension would have been the cleanest solution.

Of course, these are rare cases and probably indicate a questionable design to begin with, but as scientific software grows, one often just wants to add a new feature in the simplest and cleanest way. Having unnecessary limitations can easily get in the way.
Mar 09 2010
Ellery Newcomer:
> Ternary ?:, I suppose.

But they seem acceptable for AA literals:

    auto aa = [1: 2, 3: 4];

Bye,
bearophile
Mar 09 2010
On 03/09/2010 04:49 AM, bearophile wrote:
> Ellery Newcomer:
>> Ternary ?:, I suppose.
>
> But they seem acceptable for AA literals:
>
>     auto aa = [1: 2, 3: 4];
>
> Bye,
> bearophile

Yeah, on second thought I don't think ?: poses any such problems.
Mar 09 2010
Norbert Nemec wrote:
> Don wrote:
>> Norbert Nemec wrote:
>>> The compiler would need to be aware of the data type Slice. It would therefore have to be something like a "builtin" type. If I am not mistaken, the language definition so far never makes use of "builtin" struct types.
>>
>> You are mistaken <g>. object, TypeInfo, AssociativeArray, ... Complex will be added to that list, too.
>
> Indeed? If that is so, a builtin Slice type as syntactic sugar would be just as good as my proposal. Just note that the slice operator "a..b" should also allow strided slicing "a..b:c"

That's a lot less clear. I don't want to touch that issue right now. Anyway, you've convinced me that the int[2] option doesn't work for slice information.

>> Agreed. Multidimensional slicing is a costly operation, though, so I think the absence of syntax sugar is tolerable.
>
> Costly? Not at all! Just a bit of integer arithmetic that would need to be done anyway to access the data. Most numerical code consists of many nested loops containing very few operations. Strided multidimensional slicing operations in combination with array expressions allow this to be expressed equivalently in very few lines that can be optimized very effectively.

You're correct about what they do, but I think the conclusion is wrong. Multidimensional slices normally result in appallingly inefficient use of caches. Unless you do something extremely clever. So I think the onus is now on library developers to show that it's possible to create a compelling library which makes efficient and intuitive use of multidimensional slices. I think by the time that task is accomplished (which I imagine may take a couple of years), the language will be ready for a minor update. It's only a minor syntax tweak.
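To make the "bit of integer arithmetic" claim in the quoted text concrete, here is a minimal D sketch of a strided two-dimensional view. Everything in it is an assumption made for illustration: `View2D`, its fields, and the `slice` method are hypothetical names, not part of Phobos or of the proposal, and the sketch assumes `double` storage and positive strides. It only shows that taking a strided slice adjusts an offset, a shape, and a pair of strides without copying elements; it says nothing about the cache behaviour Don raises.

```d
// Hypothetical 2-D strided view over a flat array (illustration only).
struct View2D
{
    double[] data;      // shared underlying storage
    size_t offset;      // position of element (0, 0) within data
    size_t[2] shape;    // number of elements along each dimension
    size_t[2] stride;   // distance in data between neighbours along each dimension

    // Element access: one multiply-add per dimension.
    ref double opIndex(size_t i, size_t j)
    {
        assert(i < shape[0] && j < shape[1]);
        return data[offset + i * stride[0] + j * stride[1]];
    }

    // Strided slice [a0 .. b0 : s0, a1 .. b1 : s1]; no element is copied.
    View2D slice(size_t a0, size_t b0, size_t s0,
                 size_t a1, size_t b1, size_t s1)
    {
        assert(a0 <= b0 && b0 <= shape[0] && s0 > 0);
        assert(a1 <= b1 && b1 <= shape[1] && s1 > 0);
        View2D v;
        v.data = data;
        v.offset = offset + a0 * stride[0] + a1 * stride[1];
        v.shape[0] = (b0 - a0 + s0 - 1) / s0;   // ceiling division
        v.shape[1] = (b1 - a1 + s1 - 1) / s1;
        v.stride[0] = stride[0] * s0;
        v.stride[1] = stride[1] * s1;
        return v;
    }
}

unittest
{
    auto storage = new double[](12);
    foreach (k, ref x; storage) x = k;             // 0, 1, ..., 11
    auto m = View2D(storage, 0, [3, 4], [4, 1]);   // 3x4, row-major
    auto s = m.slice(0, 3, 2, 1, 4, 2);            // rows 0 and 2, columns 1 and 3
    assert(s.shape[0] == 2 && s.shape[1] == 2);
    assert(s[0, 0] == 1);
    assert(s[1, 1] == 11);
    s[1, 0] = -1;                                  // writes through to storage
    assert(m[2, 1] == -1);
}
```

The same struct describes a column-major layout by swapping the strides (for the 3x4 case, `[1, 3]` instead of `[4, 1]`), which is one way a library could leave the memory layout under the user's control.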
Mar 09 2010
Don wrote:
> Multidimensional slices normally result in appallingly inefficient use of caches.

Indeed, cache usage is a challenge. My general approach would be fairly conservative: give the user full control over memory layout, but do this as comfortably as possible. Provide good performance for straightforward code but allow the user to tweak the details to improve performance.

A library function that takes several arrays as input and output should allow arbitrary memory layouts, but it should also specify which memory layout is most efficient.

In any case, I think that the expressiveness of multidimensional slices is worth having them even if the performance is not optimal in every case with the first generation of libraries.
Mar 09 2010
Hi All,

I'm still learning D(2) and was toying around with creating a matrix class and wanting to overload opSlice for this purpose. As I can only find very limited documentation on multidimensional arrays in the TDPL book (to my big surprise, given that D should be an attractive language for numerical coding), I stumbled across this post while looking elsewhere.

What was the result of this discussion in the end? Are there plans to allow opSlice to work for multidimensional arrays, or will this be done in Phobos?

Many thanks and sorry to bring up such an old post.

fil

Norbert Nemec Wrote:
> Don wrote:
>> Multidimensional slices normally result in appallingly inefficient use of caches.
>
> Indeed, cache usage is a challenge. My general approach would be fairly conservative: give the user full control over memory layout, but do this as comfortably as possible. Provide good performance for straightforward code but allow the user to tweak the details to improve performance.
>
> A library function that takes several arrays as input and output should allow arbitrary memory layouts, but it should also specify which memory layout is most efficient.
>
> In any case, I think that the expressiveness of multidimensional slices is worth having them even if the performance is not optimal in every case with the first generation of libraries.
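For readers arriving with the same question, a hedged note: the thread itself does not record the outcome, but as far as I know later D2 compilers lower `m[i .. j, k .. l]` to `m.opIndex(m.opSlice!0(i, j), m.opSlice!1(k, l))` and `$` in dimension `d` to `m.opDollar!d`. The sketch below assumes that lowering. `Matrix` and `Range` are hypothetical names chosen for illustration, the storage is assumed to be row-major, and the sub-block is copied only to keep the example short; a real library would more likely return a strided view.

```d
// Hypothetical matrix type using per-dimension opSlice (illustration only).
struct Matrix
{
    double[] data;      // row-major storage, rows * cols elements
    size_t rows, cols;

    // Half-open index range produced for one sliced dimension.
    static struct Range { size_t lo, hi; }

    // Called once per sliced dimension; dim is the dimension number.
    Range opSlice(size_t dim)(size_t lo, size_t hi) if (dim < 2)
    {
        assert(lo <= hi && hi <= (dim == 0 ? rows : cols));
        return Range(lo, hi);
    }

    // Value of `$` inside dimension dim.
    size_t opDollar(size_t dim)() const if (dim < 2)
    {
        return dim == 0 ? rows : cols;
    }

    // Single element access.
    ref double opIndex(size_t i, size_t j)
    {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }

    // Sub-block access; copies the selected elements into a new Matrix.
    Matrix opIndex(Range r, Range c)
    {
        auto result = Matrix(new double[]((r.hi - r.lo) * (c.hi - c.lo)),
                             r.hi - r.lo, c.hi - c.lo);
        foreach (i; r.lo .. r.hi)
            foreach (j; c.lo .. c.hi)
                result[i - r.lo, j - c.lo] = this[i, j];
        return result;
    }
}

unittest
{
    auto m = Matrix(new double[](12), 3, 4);
    foreach (k, ref x; m.data) x = k;   // 0, 1, ..., 11
    auto sub = m[1 .. $, 2 .. 4];       // rows 1 and 2, columns 2 and 3
    assert(sub.rows == 2 && sub.cols == 2);
    assert(sub[0, 0] == 6);
    assert(sub[1, 1] == 11);
}
```

Mixed forms such as `m[1, 0 .. 2]` are not handled here; they would need further `opIndex` overloads, or a single variadic template, which this sketch leaves out.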
Jan 18 2011