digitalmars.D - Status of multidimensional slicing
- Jared Miller (61/61) Mar 07 2014 I would like to revisit the topic of operator overloads for
- bearophile (5/6) Mar 07 2014 D needs to offer a nice syntax for user defined multidimensional
- H. S. Teoh (11/17) Mar 07 2014 [...]
- Andrei Alexandrescu (3/17) Mar 07 2014 I agree and sympathize.
- Jared Miller (12/14) Mar 10 2014 So are there any significant objections to Kenji's PR?
- Kenji Hara (4/5) Mar 12 2014 I finished to update my pull request #443. Now it is active.
- Mason McGill (81/87) Mar 14 2014 Hi all,
- bearophile (9/14) Mar 14 2014 Thank you for your help. An injection of experience is quite
- H. S. Teoh (8/25) Mar 14 2014 [...]
- Mason McGill (23/55) Mar 14 2014 True, but I think the issue at hand when discussing "sugary"
- Brad Roberts (11/15) Mar 07 2014 You expressed this as if there's actual correlation or causation between...
I would like to revisit the topic of operator overloads for multidimensional slicing.

Bottom line: opSlice is currently limited to 1 dimension/axis only. The cleanest workaround right now is to pass your own "slice" structs to opIndex. It works, but it's not too pretty.

----
// Suppose we have a user-defined type...
auto mat = Matrix( [ [0,1,2], [3,4,5], [6,7,8] ] );

// This type of indexing can be implemented:
auto cell = mat[1, $-1];

// But multidimensional slicing cannot:
// auto submatrix = mat[0..2, 1..$];

// "Cleanest" workaround with a slice struct S taken by opIndex
// (no $ capability):
auto submatrix = mat[ S(0,2), S(1,3) ];

// With a bit more hacking, something like this could be done:
auto submatrix = mat[ S[0..2], S[1..$] ];
----

Problem with current state of affairs and rationale for a fix:

* A stated design goal of D is to "Cater to the needs of numerical analysis programmers", and presumably HPC / scientific computing that's heavy on linear algebra and n-dimensional arrays. Well, it seems like the multidimensional slice/stride syntax in Matlab, NumPy, and even Fortran has been pretty popular with these folks. Syntactic sugar here is a clear win. I don't think it's a niche feature.
* The limitation on slicing is inconsistent with the capabilities of opIndex and opDollar, and workarounds are ugly.
* [...] but it was never implemented (despite opDollar getting done).
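For readers who want to try the slice-struct workaround described above, here is a minimal sketch. The `Matrix` layout (an array of rows) and the field names of `S` are assumptions made for illustration; the post does not show either definition. The sketch copies the selected block rather than returning a view, and it uses the templated per-dimension `opDollar` so that `$` works in plain index expressions.

```d
/// Hypothetical stand-in for the slice struct `S` in the post:
/// a half-open interval [lo, hi) along one axis.
struct S
{
    size_t lo, hi;
}

/// Hypothetical minimal matrix type, stored as an array of rows.
struct Matrix
{
    int[][] rows;

    /// Element access: mat[r, c]
    int opIndex(size_t r, size_t c)
    {
        return rows[r][c];
    }

    /// "Slicing" routed through opIndex: mat[S(r0, r1), S(c0, c1)].
    /// Copies the selected block into a new Matrix.
    Matrix opIndex(S r, S c)
    {
        int[][] result;
        foreach (row; rows[r.lo .. r.hi])
            result ~= row[c.lo .. c.hi].dup;
        return Matrix(result);
    }

    /// Lets `$` resolve to the length of the dimension it appears in.
    size_t opDollar(size_t dim)()
    {
        static if (dim == 0)
            return rows.length;
        else
            return rows.length ? rows[0].length : 0;
    }
}

unittest
{
    auto mat = Matrix([[0, 1, 2], [3, 4, 5], [6, 7, 8]]);
    assert(mat[1, $ - 1] == 5);

    auto sub = mat[S(0, 2), S(1, 3)];
    assert(sub.rows == [[1, 2], [4, 5]]);
}
```

Compile with `-unittest -main` to run the checks. Note that `$` cannot be used inside `S(...)` in a way that distinguishes dimensions without extra machinery, which is the "no $ capability" limitation the post mentions.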
Recap of discussions so far:

* 2009-10-10: DIP7 (http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7)
* 2010-03-08: "Proposal: Multidimensional opSlice solution" (http://forum.dlang.org/thread/hn2q9q$263e$1 digitalmars.com)
* [...] multidimensional indexing and slicing"
* 2012-06-01: "[Proposal] Additional operator overloadings for multidimentional indexing and slicing" (http://forum.dlang.org/thread/mailman.1202.1338515967.24740.digitalmars-d puremagic.com)
* 2012-11-19: "Multidimensional array operator overloading" (http://forum.dlang.org/thread/mailman.2065.1353348152.5162.digitalmars-d puremagic.com)
* 2012-12-19: "Multidimensional slice" (http://forum.dlang.org/thread/lglljlnzoathjxijomrn forum.dlang.org)
* 2013-04-06: "rationale for opSlice, opSliceAssign, vs a..b being syntax suger for a Slice struct?" (http://forum.dlang.org/thread/mailman.551.1365290408.4724.digitalmars-d-learn puremagic.com)
* 2013-05-12: Andrei asks for feedback on Kenji's 2011 pull
* 2013-10-11: "std.linalg" (http://forum.dlang.org/thread/rmyaglfeimzuggoluxvd forum.dlang.org)

Steps forward:

So I basically want to resurrect the topic and gauge support for fixing slice overloads. Then, core committers could revisit a solid near-term solution. Finally, perhaps a DIP for stride syntax/overloads?

Looking forward to discussion.
Mar 07 2014
Jared Miller:
> Looking forward to discussion.

D needs to offer a nice syntax for user defined multidimensional slicing.

Bye,
bearophile
Mar 07 2014
On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
> Jared Miller:
> > Looking forward to discussion.
> D needs to offer a nice syntax for user defined multidimensional slicing.
[...]

+1. I fully support Kenji's pull to extend the language in that direction.

I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).

T

-- 
Unix is my IDE. -- Justin Whear
Mar 07 2014
On 3/7/14, 2:30 PM, H. S. Teoh wrote:
> On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
> > Jared Miller:
> > > Looking forward to discussion.
> > D needs to offer a nice syntax for user defined multidimensional slicing.
> [...]
> +1. I fully support Kenji's pull to extend the language in that direction. I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).

I agree and sympathize.

Andrei
Mar 07 2014
So are there any significant objections to Kenji's PR? I think it's got a lot of things going for it, particularly in [...] likely be a top priority for most people, but it's got a lot of bang for your buck: a great benefit to an important subset of users for relatively little effort.

I'd love to see it on the official agenda for release this year. Is this the right location: http://wiki.dlang.org/Agenda, and is anybody welcome to offer edits?

Jared

On Saturday, 8 March 2014 at 01:24:50 UTC, Andrei Alexandrescu wrote:
> I agree and sympathize.
>
> Andrei
Mar 10 2014
2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> I agree and sympathize.

I finished updating my pull request #443. Now it is active.

Kenji Hara
Mar 12 2014
Hi all,

I think D has a lot to offer technical computing:
- the speed and modeling power of C++
- GC for clean API design
- reflection for automatic bindings

And technical computing has a lot to offer D:
- users
- API writers
- time in the minds of people who teach

Multidimensional array support is important for this exchange to happen, so as a D user and a computer vision researcher I'm glad to see it's being addressed!

However, I'm interested in hearing more about the rationale for the design decisions made concerning [...]. My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years. A bit of elaboration:

In Python, slicing and indexing were originally separate operations. Custom containers would have to define both `__getitem__(self, key)` and `__getslice__(self, start, end)`. This is where D is now.

Python then deprecated `__getslice__` and decided `container[start:end]` should translate to `container[slice(start, end)]`: the slicing syntax just became sugar for creating a lightweight slice object (i.e. a "range literal"), but it only worked inside an index expression. If I understand correctly, this is similar in spirit to the solution the D community seems to be converging upon. This solution enables multidimensional slicing, but needlessly prohibits the construction of range literals outside of an index expression.

So, why is this important? One point of view is that multidimensional slicing is just one of many use cases for a concise representation of a range of numbers. In more "specialized" scientific languages, like MATLAB/Octave and Julia, range literals are a critical component of readable, idiomatic code. In order to partially make up for this, SciPy is forced to subvert Python's indexing syntax for calling functions that may operate on numeric ranges, obfuscating code (e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).
I point this out because it (fortunately) seems like D is in a position to have range literals while maintaining backwards compatibility and reducing language complexity (details are below). I'd like to hear your thoughts about range literals as a solution for multidimensional indexing: whether it's been proposed, if so, why it was decided against, what its disadvantages might be, whether they're compatible with the work already done on this front, etc.

===================
Range Literals in D
===================

// Right now this works:
foreach (i; 0..10) doScience(i);

// And this works:
auto range = iota(0, 10);
foreach (i; range) doScience(i);

// So why shouldn't this work?
auto range = 0..10;
foreach (i; range) doScience(i);

// Or this?
auto range = 0..10;
myFavoriteArray[range] = fascinatingFindings();

// Or this?
auto range = 0..10;
myFavoriteMatrix[0..$, range] = fascinating2DFindings();

// `opSlice` would no longer be necessary...
myMap["key"];        // calls `opIndex(string);`
myVector[5];         // calls `opIndex(int);`
myMatrix[5, 0..10];  // calls `opIndex(int, NumericRange);`

// But old code that defines `opSlice` could still work (like in Python).
myVector[0..10];     // If `opIndex(NumericRange)` isn't defined,
                     // fall back to `opSlice`.

// `ForeachRangeStatement` would no longer need to exist as an odd special case.
// The following two statements are semantically equivalent, and with range
// literals, they'd be instances of the same looping syntax.
foreach (i; 0..10) doScience();
foreach (i; iota(0, 10)) doScience();

// Compilers would, of course, be free to special-case `foreach` loops
// over range literals, if it's helpful for performance.

On Wednesday, 12 March 2014 at 13:55:05 UTC, Kenji Hara wrote:
> 2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> > I agree and sympathize.
>
> Kenji Hara
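The range-literal idea in the post above can be approximated in today's D with an ordinary struct, which may help make the proposal concrete. This is only a sketch under stated assumptions: `NumericRange` is the name the post uses for the type a literal like `0..10` might lower to, but its fields and the `Vector` type here are hypothetical. The point is that a single value can serve both as a loop range and as an index argument.

```d
/// Hypothetical type that a literal such as `0..10` could lower to.
/// It is a half-open interval [lo, hi) and also an input range.
struct NumericRange
{
    size_t lo, hi;

    // Input range primitives, so `foreach` can iterate over the value.
    @property bool empty() const { return lo >= hi; }
    @property size_t front() const { return lo; }
    void popFront() { ++lo; }
}

/// Hypothetical vector whose `opIndex` accepts an index or a range,
/// so no separate `opSlice` is needed.
struct Vector
{
    double[] data;

    double opIndex(size_t i) { return data[i]; }
    double[] opIndex(NumericRange r) { return data[r.lo .. r.hi]; }
}

unittest
{
    auto range = NumericRange(0, 3);  // stands in for `auto range = 0..3;`

    // The same value drives a loop (foreach iterates over a copy)...
    size_t sum;
    foreach (i; range)
        sum += i;
    assert(sum == 3);

    // ...and an index expression.
    auto v = Vector([1.0, 2.0, 3.0, 4.0]);
    assert(v[range] == [1.0, 2.0, 3.0]);
    assert(v[3] == 4.0);
}
```

What the sketch cannot show is the syntax itself: writing `NumericRange(0, 3)` instead of `0..3` is exactly the verbosity the post argues against.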
Mar 14 2014
Mason McGill:
> My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years.

Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it already has a better design and a more refined implementation in several things related to numerical computing.

> // So why shouldn't this work?
> auto range = 0..10;
> foreach (i; range) doScience(i);

People suggested this a long time ago, again and again. So I put that question to Walter.

Bye,
bearophile
Mar 14 2014
On Fri, Mar 14, 2014 at 12:29:34PM +0000, bearophile wrote:
> Mason McGill:
> > My concern is that this design may be ignoring some of the lessons the SciPy community has learned over the past 10+ years.
> Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it already has a better design and a more refined implementation in several things related to numerical computing.
> > // So why shouldn't this work?
> > auto range = 0..10;
> > foreach (i; range) doScience(i);
> People suggested this a long time ago, again and again. So I put that question to Walter.
[...]

Replace the first line with:

    auto range = iota(0, 10);

and it will work. It's not *that* hard to learn, is it?

T

-- 
Klein bottle for rent ... inquire within. -- Stephen Mulraney
Mar 14 2014
True, but I think the issue at hand when discussing "sugary" syntax is clarity and expressiveness rather than completeness. In many domains, programmer working memory is at a premium, and code like this:

{
    auto samples = meshgrid(iota(0, 2), iota(0, 100), iota(0, 100));
    vector[StridedSlice(0, 10, 2)] = iota(1, 6);
    plot(iota(-10, 10), myFunction(iota(-10, 10)));
    foreach (i; square(iota(0, 10))) performSquareDance(i);
}

might not be as respectful of that resource as code like this:

{
    auto samples = meshgrid(0..2, 0..100, 0..100);
    vector[(0..10).by(2)] = 1..6;
    plot(-10..10, myFunction(-10..10));
    foreach (i; square(0..10)) performSquareDance(i);
}

Reference for `meshgrid`: http://www.mathworks.com/help/matlab/ref/meshgrid.html
Reference for strided indexing: http://docs.scipy.org/doc/numpy/user/basics.indexing.html

On Friday, 14 March 2014 at 14:36:29 UTC, H. S. Teoh wrote:
> > // So why shouldn't this work?
> > auto range = 0..10;
> > foreach (i; range) doScience(i);
>
> Replace the first line with:
>     auto range = iota(0, 10);
> and it will work. It's not *that* hard to learn, is it?
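For comparison with the hypothetical `(0..10).by(2)` syntax in the post above, strided numeric ranges can already be written with Phobos, either through the optional step argument of `iota` or through the `stride` adaptor from `std.range`. A small sketch (the concrete values are chosen only for illustration):

```d
import std.algorithm.comparison : equal;
import std.range : iota, stride;

unittest
{
    // `iota` takes an optional step as its third argument.
    assert(iota(0, 10, 2).equal([0, 2, 4, 6, 8]));

    // `stride` adapts an existing range, which is closer in spirit
    // to the hypothetical `(0..10).by(2)`.
    assert(iota(0, 10).stride(2).equal([0, 2, 4, 6, 8]));
}
```

Both forms produce lazy ranges. Using one as an index argument, as in `vector[...]`, would still require the container to define an `opIndex` overload that accepts that range type.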
Mar 14 2014
On 3/7/2014 2:30 PM, H. S. Teoh wrote:
> I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).

You expressed this as if there's actual correlation or causation between the two, when it's highly unlikely any exists. He's doing exactly what many, many others do: expressing concern about a problem encountered during recent use of some aspect of the D ecosystem.

It's an unfortunate but true aspect of the rate of D development combined with the relatively small community: old pulls get lost in the noise. For pulls to get attention, the author or proponents of a pull need to keep it alive. The rate of application of pulls (regardless of age) isn't bad, but when combined with the influx rate of new pull requests it's just not high enough to clear the backlog.
Mar 07 2014