digitalmars.D.learn - Questions about the slice operator
- ixid (21/21) Apr 03 2012 I understand the basic use to slice an array but what about these:
- bearophile (26/46) Apr 03 2012 The design of D language is a bit of a patchwork, it's not very
- ixid (1/1) Apr 03 2012 Thank you, very informative as always. =)
- Jonathan M Davis (34/55) Apr 03 2012 And what would it mean in the case of parallel(0 ..5)? Notice that
- ixid (4/4) Apr 03 2012 "And what would it mean in the case of parallel(0 ..5)?"
- Jonathan M Davis (19/24) Apr 03 2012 1. ".." would then be doing something very different than it does in all...
- ixid (2/2) Apr 03 2012 Thank you, very interesting to understand a little more about
- Jacob Carlborg (28/36) Apr 04 2012 Why couldn't the .. syntax be syntax sugar for some kind of library
- Simen Kjærås (2/43) Apr 04 2012 And what do we do with 3..$?
- Jacob Carlborg (14/15) Apr 04 2012 Hmm, that's a good point. The best I can think of for now is to
- Simen Kjærås (7/20) Apr 04 2012 Not enough:
- Jonathan M Davis (7/37) Apr 04 2012 I believe that we have opDollar already but that it's buggy.
- Simen Kjærås (46/102) Apr 04 2012 On Wed, 04 Apr 2012 14:16:54 +0200, Simen Kjærås
- Jacob Carlborg (4/24) Apr 04 2012 I don't think I really understand this idea of an index set.
- Simen Kjærås (21/51) Apr 04 2012 It's quite simple, really - an index set holds indices.
- Jacob Carlborg (4/19) Apr 04 2012 Ok, now I think I get it.
- Jonathan M Davis (18/67) Apr 04 2012 That might work, but it does make it so that ".." has very different mea...
- Jacob Carlborg (11/36) Apr 04 2012 Yeah, we don't have to stop any releases for this. It's one of those
- travert phare.normalesup.org (Christophe) (10/11) Apr 05 2012 Having a specific range for a .. operator allows you to have them as
- Jonathan M Davis (11/14) Apr 05 2012 As I said, all it does is give you syntactic sugar for iota which can't ...
- bearophile (5/7) Apr 05 2012 Such functions are also able to accept a Iota struct and then read its f...
- Timon Gehr (6/26) Apr 05 2012 It would be awkward to introduce it in a backwards compatible way,
I understand the basic use to slice an array but what about these:

    foreach(i;0..5)
        dostuff;

That works yet this does not:

    foreach(i;parallel(0..5))
        dostuff;

Why not let this work? It'd seem like a natural way of writing a parallel loop. For some reason:

    foreach(i;[0,1,2,3,4])
        dostuff;

This performs far more slowly than the first example, and is only as fast when parallelized with a ~150ms function for each iteration. What kind of data is it and how is it behaving? If it can do what it does in the first example why not let it do something like this:

    int[] arr = 0..5; //arr = [0,1,2,3,4]

One final thought: why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?
Apr 03 2012
ixid:

> I understand the basic use to slice an array but what about these:
>
>     foreach(i;0..5)
>         dostuff;
>
> That works yet this does not:
>
>     foreach(i;parallel(0..5))
>         dostuff;
>
> Why not let this work? It'd seem like a natural way of writing a parallel loop.

The design of the D language is a bit of a patchwork; it's not very coherent. So the ".." notation defines an iterable interval only in a foreach (.. is used for switch cases too, but there it includes the closing item). Generally a "patchwork design" has some clear disadvantages, but it often has some less visible advantages too.

Together with other people I have suggested a few times that a..b should denote a first-class lazy range in D, but Walter was not interested, I guess. I'd like this, but using iota(5) is not terrible (keep in mind, though, that iterating on an empty interval gives a different outcome from iterating on an empty iota. I have an open bug report on this).

> For some reason:
>
>     foreach(i;[0,1,2,3,4])
>         dostuff;
>
> This performs far more slowly than the first example

I don't know why, but maybe the cause is that an array literal like that induces a heap allocation. This doesn't happen with the lazy 0..5 syntax.

> If it can do what it does in the first example why not let it do something like this:
>
>     int[] arr = 0..5; //arr = [0,1,2,3,4]

Because a..b is not a first-class interval: laziness and ranges were introduced quite late in D and were not there from the beginning of its design, so lazy constructs are mostly library-defined and they don't act like built-ins.

> One final thought- why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?

Besides the answers I've already given, generally because implicit conversions are often a bad thing. Requiring some syntax to denote the lazy->eager conversion is positive, I think. I don't know of a language that performs such a conversion implicitly.

Bye,
bearophile
Apr 03 2012
On Wednesday, April 04, 2012 03:29:03 ixid wrote:

> I understand the basic use to slice an array but what about these:
>
>     foreach(i;0..5)
>         dostuff;
>
> That works yet this does not:
>
>     foreach(i;parallel(0..5))
>         dostuff;
>
> Why not let this work? It'd seem like a natural way of writing a parallel loop. For some reason:
>
>     foreach(i;[0,1,2,3,4])
>         dostuff;
>
> This performs far more slowly than the first example and only as fast as it when parallelized with a ~150ms function for each iteration.

And what would it mean in the case of parallel(0 .. 5)? Notice that foreach(i; 0 .. 5) and foreach(i; [0, 1, 2, 3, 4]) mean _completely_ different things. The first one doesn't involve arrays at all. It gets lowered to something like

    for(int i = 0; i < 5; ++i)

.. is _never_ used for generating an array. It's only ever used for indicating a range of values. If you want to generate a range, then use std.range.iota. .. wouldn't make sense in the contexts that you're describing. It would have to generate something. And if 0 .. 5 generated [0, 1, 2, 3, 4] in the general case, then foreach(i; ident(0 .. 5)) would be just as inefficient as foreach(i; [0, 1, 2, 3, 4]), even excluding the cost of ident (which presumably just returns the array).

foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.

> One final thought- why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?

That would be an incredibly bad idea. Converting from a lazy range to an array is expensive. You have to process the entire range and allocate memory for the array that you're stuffing it in. Sometimes, you need to do that, but you certainly don't want it to happen by accident. If such conversions were implicit, you'd get hidden performance hits all over the place if you weren't really careful.

And in general, D isn't big on implicit conversions anyway. They're useful in some cases, but they often cause bugs. So, D allows a lot fewer implicit conversions than C++ does, and ranges follow that pattern.

- Jonathan M Davis
Apr 03 2012
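A minimal sketch of the distinction drawn in the post above, assuming a reasonably recent Phobos: iota (std.range) supplies the lazy range that a bare 0 .. 5 is not, parallel (std.parallelism) accepts it, and array (std.array) is the explicit lazy-to-eager conversion. The helper names sumWithRangeForeach and squaresInParallel are made up for illustration and do not come from the thread.

```d
import std.array : array;
import std.parallelism : parallel;
import std.range : iota;

// foreach over a .. b is plain counting; no array or range object is involved.
int sumWithRangeForeach(int n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

// parallel() needs an actual range, so the interval is spelled iota(0, n).
int[] squaresInParallel(int n)
{
    auto result = new int[](n);
    foreach (i; parallel(iota(0, n)))
        result[i] = i * i; // each iteration writes only its own slot
    return result;
}

void main()
{
    // The lazy-to-eager conversion has to be requested explicitly.
    int[] arr = iota(0, 5).array;
    assert(arr == [0, 1, 2, 3, 4]);
    assert(sumWithRangeForeach(5) == 10);
    assert(squaresInParallel(5) == [0, 1, 4, 9, 16]);
}
```

Writing int[] arr = iota(0, 5); without the .array call is rejected by the compiler, which is the behaviour the replies here argue for.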
"And what would it mean in the case of parallel(0 ..5)?" Wouldn't it be a more elegant way of doing pretty much the same thing as parallel(iota(0,5))? Iterating over a range and carrying out your parallel task with that value.
Apr 03 2012
On Wednesday, April 04, 2012 04:45:43 ixid wrote:

> "And what would it mean in the case of parallel(0 ..5)?"
>
> Wouldn't it be a more elegant way of doing pretty much the same thing as parallel(iota(0,5))? Iterating over a range and carrying out your parallel task with that value.

1. ".." would then be doing something very different than it does in all other cases.

2. That's moving something into the language which works perfectly well in the library, and moving it into the language doesn't really buy us anything.

3. The trend is to move stuff _out_ of the language and into libraries rather than into the language. The overall take on it at this point (especially from Andrei) is that if it _can_ be done in a library, then it _should_ be done in the library. The language is already very powerful and is arguably overly complex already. So, the question at this point is very much why it should be in the language when it works in the library and _not_ why it's in the library when it could be in the language.

I can understand why you'd like to use ".." in more cases than is currently allowed, but given the current semantics of "..", it really wouldn't make sense to use it in the sort of cases that you'd like to. Even if they're conceptually similar, they're semantically _very_ different from the current use cases for "..". So, using ".." in place of iota really wouldn't be making the language more consistent, even if it might seem so at first glance.

- Jonathan M Davis
Apr 03 2012
Thank you, very interesting to understand a little more about what goes on underneath with conceptual vs semantic differences.
Apr 03 2012
On 2012-04-04 04:11, Jonathan M Davis wrote:

> foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.

Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays? We could implement a new library type, named "range", looking something like this:

    struct range
    {
        size_t start;
        size_t end;
        // implement the range interface or opApply
    }

    range r = 1 .. 5;

The above line would be syntax sugar for:

    range r = range(1, 5);

    void foo (range r)
    {
        foreach (e ; r) {}
    }

    foo(r);

This could then be taken advantage of in other parts of the language:

    class A
    {
        int opSlice (range r); // new syntax
        int opSlice (size_t start, size_t end); // old syntax
    }

I think this would be completely backwards compatible as well.

--
/Jacob Carlborg
Apr 04 2012
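The library half of the proposal above can already be written today; only the rewrite of a literal 1 .. 5 into a constructor call would need compiler support. A rough sketch, assuming the type only has to satisfy the input range primitives. The name Interval (chosen here to avoid clashing with the general notion of a range) and the helper total are hypothetical:

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isInputRange;

// Hypothetical library type; under the proposal "1 .. 5" would become Interval(1, 5).
struct Interval
{
    size_t start;
    size_t end;

    // Input range primitives, so foreach and std.algorithm can consume it.
    @property bool empty() const { return start >= end; }
    @property size_t front() const { return start; }
    void popFront() { ++start; }
}

static assert(isInputRange!Interval);

// An ordinary function can take the interval as a parameter.
size_t total(Interval r)
{
    size_t sum = 0;
    foreach (e; r)
        sum += e;
    return sum;
}

void main()
{
    auto r = Interval(1, 5); // what "range r = 1 .. 5;" would mean
    assert(total(r) == 10);
    assert(equal(r, [1, 2, 3, 4]));
}
```

This is close to what iota(1, 5) already returns, which is the objection raised later in the thread.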
On Wed, 04 Apr 2012 12:06:33 +0200, Jacob Carlborg <doob me.com> wrote:

> On 2012-04-04 04:11, Jonathan M Davis wrote:
>> foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays? We could implement a new library type, named "range", looking something like this:
>
>     struct range
>     {
>         size_t start;
>         size_t end;
>         // implement the range interface or opApply
>     }
>
>     range r = 1 .. 5;
>
> The above line would be syntax sugar for:
>
>     range r = range(1, 5);
>
>     void foo (range r)
>     {
>         foreach (e ; r) {}
>     }
>
>     foo(r);
>
> This could then be taken advantage of in other parts of the language:
>
>     class A
>     {
>         int opSlice (range r); // new syntax
>         int opSlice (size_t start, size_t end); // old syntax
>     }
>
> I think this would be completely backwards compatible as well.

And what do we do with 3..$?
Apr 04 2012
On 2012-04-04 14:16, Simen Kjærås wrote:

> And what do we do with 3..$?

Hmm, that's a good point. The best I can think of for now is to translate that to:

    range(3, size_t.max)

Or something like:

    struct range
    {
        size_t start;
        size_t end;
        bool dollar; // better name is needed
    }

    range(3, 0, true)

--
/Jacob Carlborg
Apr 04 2012
On Wed, 04 Apr 2012 14:21:01 +0200, Jacob Carlborg <doob me.com> wrote:

> On 2012-04-04 14:16, Simen Kjærås wrote:
>> And what do we do with 3..$?
>
> Hmm, that's a good point. The best I can think of for now is to translate that to:
>
>     range(3, size_t.max)
>
> Or something like:
>
>     struct range
>     {
>         size_t start;
>         size_t end;
>         bool dollar; // better name is needed
>     }
>
>     range(3, 0, true)

Not enough:

    $-3..$-2

This is a hard and unpleasant one, unless we go with $ being defined as the length of the array we're slicing, and only valid inside a slice operation (and of course some opDollar or the like for other containers).
Apr 04 2012
On Wednesday, April 04, 2012 14:37:54 Simen Kjærås wrote:

> On Wed, 04 Apr 2012 14:21:01 +0200, Jacob Carlborg <doob me.com> wrote:
>> On 2012-04-04 14:16, Simen Kjærås wrote:
>>> And what do we do with 3..$?
>>
>> Hmm, that's a good point. The best I can think of for now is to translate that to:
>>
>>     range(3, size_t.max)
>>
>> Or something like:
>>
>>     struct range
>>     {
>>         size_t start;
>>         size_t end;
>>         bool dollar; // better name is needed
>>     }
>>
>>     range(3, 0, true)
>
> Not enough:
>
>     $-3..$-2
>
> This is a hard and unpleasant one, unless we go with $ being defined as the length of the array we're slicing, and only valid inside a slice operation (and of course some opDollar or the like for other containers).

I believe that we have opDollar already but that it's buggy.

http://d.puremagic.com/issues/show_bug.cgi?id=7097
http://d.puremagic.com/issues/show_bug.cgi?id=7520

Several types in Phobos already have opDollar (generally an alias for length, it seems).

- Jonathan M Davis
Apr 04 2012
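For reference, a small sketch of what opDollar looks like on a user-defined type, assuming a current compiler in which the bugs linked above no longer get in the way. The Buffer type and the lastOf helper are hypothetical; the two-argument opSlice is the older form discussed in this thread.

```d
// Hypothetical container wrapping an int array.
struct Buffer
{
    int[] data;

    // Inside buf[...], $ is rewritten to a call to opDollar.
    @property size_t opDollar() const { return data.length; }

    int opIndex(size_t i) const { return data[i]; }

    // Two-argument opSlice receives the already-evaluated bounds.
    int[] opSlice(size_t start, size_t end) { return data[start .. end]; }
}

// Reads the last element through the overloaded $.
int lastOf(Buffer b)
{
    return b[$ - 1];
}

void main()
{
    auto buf = Buffer([10, 20, 30, 40, 50]);
    assert(lastOf(buf) == 50);
    assert(buf[3 .. $] == [40, 50]);
    assert(buf[$ - 3 .. $ - 2] == [30]);
}
```

Note that $ is resolved to a number before opSlice is called, so expressions such as $-3..$-2 work without the slice bounds needing any special "dollar" flag.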
On Wed, 04 Apr 2012 14:16:54 +0200, Simen Kjærås <simen.kjaras gmail.com> wrote:

> On Wed, 04 Apr 2012 12:06:33 +0200, Jacob Carlborg <doob me.com> wrote:
>> On 2012-04-04 04:11, Jonathan M Davis wrote:
>>> foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.
>>
>> Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays? We could implement a new library type, named "range", looking something like this:
>>
>>     struct range
>>     {
>>         size_t start;
>>         size_t end;
>>         // implement the range interface or opApply
>>     }
>>
>>     range r = 1 .. 5;
>>
>> The above line would be syntax sugar for:
>>
>>     range r = range(1, 5);
>>
>>     void foo (range r)
>>     {
>>         foreach (e ; r) {}
>>     }
>>
>>     foo(r);
>>
>> This could then be taken advantage of in other parts of the language:
>>
>>     class A
>>     {
>>         int opSlice (range r); // new syntax
>>         int opSlice (size_t start, size_t end); // old syntax
>>     }
>>
>> I think this would be completely backwards compatible as well.
>
> And what do we do with 3..$?

Actually, I've thought a little about this. And apart from the tiny idiosyncrasy of $, a..b as a more regular type can bring some interesting enhancements to the language.

Consider a..b as simply a set of indices, defined by a start point and an end point. A different index set may be [1,2,4,5], or Strided!(3,4). An index set then works as a filter on a range, returning only those elements whose indices are in the set.

We can now redefine opIndex to take either a single index or an index set, as follows:

    auto opIndex(S)(S set) if (isIndexSet!S)
    {
        return set.transform(this);
    }

For an AA, there would be another constraint that the type of elements of the index set match those of the AA keys, of course. Other containers may have other constraints.

An index set may or may not be iterable, but it should always supply functionality to check if an index is contained in it.

With this framework laid out, we can define these operations on arrays, and have any array be sliceable by an array of integral elements:

    assert(['a','b','c'][[0,2]] == ['a', 'c']);

The problem of $ is a separate one, and quite complex to handle. No doubt it is useful for arrays and their ilk, but for the generic array and index set, it's complex and unpleasant. Barring the use of expression templates, I see few other solutions than to introduce the function opDollar(size_t level), where level is 0 for the first index ([$]), 1 for the second ([_, $]), etc. This means there is no way to express the concept of next-to-last element outside of the opSlice call.

A different solution would be to use a specific type for $. Basically, this would be:

    struct Dollar(T)
    {
        T offset;
        alias offset this;
        // operator overloads here to assure typeof($+n) == typeof($)
    }

This complicates things a lot, and still does not really work. [1,2,3][0..foo($)] works in D today, but would not with the proposed type. Hence, the use of $ outside slice operations likely should not (indeed, can not) be possible.
Apr 04 2012
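Built-in arrays cannot be indexed by an array of indices, but the idea can be approximated on a user-defined type with an extra opIndex overload. The sketch below is only an illustration of the "index by an index set" operation; the wrapper name Picker is hypothetical, and the isIndexSet / transform machinery from the post is deliberately left out in favour of a plain array of indices.

```d
// Hypothetical wrapper that allows indexing by a set of indices.
struct Picker(T)
{
    T[] data;

    // Ordinary single-element indexing.
    T opIndex(size_t i) { return data[i]; }

    // Indexing by an index set: collect the elements at the given positions.
    T[] opIndex(const size_t[] indices)
    {
        T[] picked;
        foreach (i; indices)
            picked ~= data[i];
        return picked;
    }
}

void main()
{
    auto p = Picker!char(['a', 'b', 'c']);
    assert(p[1] == 'b');
    assert(p[[0, 2]] == ['a', 'c']); // the example from the post above
}
```

A generic version would replace the const size_t[] parameter with a template constrained on some index-set trait, as the post suggests.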
On 2012-04-04 15:01, Simen Kjærås wrote:

> Actually, I've thought a little about this. And apart from the tiny idiosyncrasy of $, a..b as a more regular type can bring some interesting enhancements to the language.
>
> Consider a..b as simply a set of indices, defined by a start point and an end point. A different index set may be [1,2,4,5], or Strided!(3,4). An index set then works as a filter on a range, returning only those elements whose indices are in the set.
>
> We can now redefine opIndex to take either a single index or an index set, as follows:
>
>     auto opIndex(S)(S set) if (isIndexSet!S)
>     {
>         return set.transform(this);
>     }
>
> For an AA, there would be another constraint that the type of elements of the index set match those of the AA keys, of course. Other containers may have other constraints.
>
> An index set may or may not be iterable, but it should always supply functionality to check if an index is contained in it.
>
> With this framework laid out, we can define these operations on arrays, and have any array be sliceable by an array of integral elements:
>
>     assert(['a','b','c'][[0,2]] == ['a', 'c']);

I don't think I really understand this idea of an index set.

--
/Jacob Carlborg
Apr 04 2012
On Wed, 04 Apr 2012 15:29:58 +0200, Jacob Carlborg <doob me.com> wrote:

> On 2012-04-04 15:01, Simen Kjærås wrote:
>> Actually, I've thought a little about this. And apart from the tiny idiosyncrasy of $, a..b as a more regular type can bring some interesting enhancements to the language.
>>
>> Consider a..b as simply a set of indices, defined by a start point and an end point. A different index set may be [1,2,4,5], or Strided!(3,4). An index set then works as a filter on a range, returning only those elements whose indices are in the set.
>>
>> We can now redefine opIndex to take either a single index or an index set, as follows:
>>
>>     auto opIndex(S)(S set) if (isIndexSet!S)
>>     {
>>         return set.transform(this);
>>     }
>>
>> For an AA, there would be another constraint that the type of elements of the index set match those of the AA keys, of course. Other containers may have other constraints.
>>
>> An index set may or may not be iterable, but it should always supply functionality to check if an index is contained in it.
>>
>> With this framework laid out, we can define these operations on arrays, and have any array be sliceable by an array of integral elements:
>>
>>     assert(['a','b','c'][[0,2]] == ['a', 'c']);
>
> I don't think I really understand this idea of an index set.

It's quite simple, really - an index set holds indices. For a regular array of N elements, the index set is [0..N-1]. For an AA, the index set is all the keys in the AA. Basically, an index set is the set of all values that will give meaningful results from container[index].

arr[2..4] thus means 'restrict the indices to those between 2 and 4'. For arrays though, it also translates the array so that what was 2 before, now is 0.

For a T[string] aa, one could imagine the operation aa["a".."c"] to produce a new AA with only those elements whose keys satisfy "a" <= key < "c".

As for the example given:

    assert(['a','b','c'][[0,2]] == ['a', 'c']);

This means 'grab the elements at position 0 and 2, and put them in a new array'. Hence, element 0 ('a') and element 2 ('c') are in the result.
Apr 04 2012
On 2012-04-04 16:40, Simen Kjærås wrote:

> It's quite simple, really - an index set holds indices. For a regular array of N elements, the index set is [0..N-1]. For an AA, the index set is all the keys in the AA. Basically, an index set is the set of all values that will give meaningful results from container[index].
>
> arr[2..4] thus means 'restrict the indices to those between 2 and 4'. For arrays though, it also translates the array so that what was 2 before, now is 0.
>
> For a T[string] aa, one could imagine the operation aa["a".."c"] to produce a new AA with only those elements whose keys satisfy "a" <= key < "c".
>
> As for the example given:
>
>     assert(['a','b','c'][[0,2]] == ['a', 'c']);
>
> This means 'grab the elements at position 0 and 2, and put them in a new array'. Hence, element 0 ('a') and element 2 ('c') are in the result.

Ok, now I think I get it.

--
/Jacob Carlborg
Apr 04 2012
On Wednesday, April 04, 2012 12:06:33 Jacob Carlborg wrote:

> On 2012-04-04 04:11, Jonathan M Davis wrote:
>> foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays? We could implement a new library type, named "range", looking something like this:
>
>     struct range
>     {
>         size_t start;
>         size_t end;
>         // implement the range interface or opApply
>     }
>
>     range r = 1 .. 5;
>
> The above line would be syntax sugar for:
>
>     range r = range(1, 5);
>
>     void foo (range r)
>     {
>         foreach (e ; r) {}
>     }
>
>     foo(r);

That might work, but it does make it so that ".." has very different meanings in different contexts, and I don't know that it really buys us much. iota already does the same thing (and with more functionality), just without the syntactic sugar. Also, we've had enough issues with moving AAs into druntime that I don't know how great an idea this sort of thing would be (though it should be much simpler). It would certainly make some folks (e.g. Bearophile) happy though.

> This could then be taken advantage of in other parts of the language:
>
>     class A
>     {
>         int opSlice (range r); // new syntax
>         int opSlice (size_t start, size_t end); // old syntax
>     }
>
> I think this would be completely backwards compatible as well.

Except that opSlice already works with "..". What would this buy you? It doesn't make sense to pass opSlice a range normally. Why treat this proposed "range" type any differently from any other range? This functionality already exists with the second declaration there.

If we added a range type like this, I'd be inclined to make it __range or somesuch and not ever have its name used explicitly anywhere. It would basically just be syntactic sugar for iota (though it wouldn't use iota specifically). I don't know what else you would be looking to get out of using its type specifically anywhere. That's not generally done with other range types.

- Jonathan M Davis
Apr 04 2012
On 2012-04-04 19:09, Jonathan M Davis wrote:

> That might work, but it does make it so that ".." has very different meanings in different contexts, and I don't know that it really buys us much. iota already does the same thing (and with more functionality), just without the syntactic sugar. Also, we've had enough issues with moving AAs into druntime that I don't know how great an idea this sort of thing would be (though it should be much simpler). It would certainly make some folks (e.g. Bearophile) happy though.

Yeah, we don't have to stop any releases for this. It's one of those features in the language that is not very consistent, and sometimes that's just a bit annoying.

>> This could then be taken advantage of in other parts of the language:
>>
>>     class A
>>     {
>>         int opSlice (range r); // new syntax
>>         int opSlice (size_t start, size_t end); // old syntax
>>     }
>>
>> I think this would be completely backwards compatible as well.
>
> Except that opSlice already works with "..". What would this buy you?

Nothing, but that's how the language could have looked if a first-class range type had been added to the language a long time ago.

> It doesn't make sense to pass opSlice a range normally. Why treat this proposed "range" type any differently from any other range? This functionality already exists with the second declaration there.
>
> If we added a range type like this, I'd be inclined to make it __range or somesuch and not ever have its name used explicitly anywhere. It would basically just be syntactic sugar for iota (though it wouldn't use iota specifically). I don't know what else you would be looking to get out of using its type specifically anywhere. That's not generally done with other range types.

In this case "range" is just the start and end of a list of numbers. Maybe "range" is not a good name; it conflicts with the concept of ranges.

No, instead we use templates like mad.

--
/Jacob Carlborg
Apr 04 2012
"Jonathan M Davis", in message (digitalmars.D.learn:34243), wrote:

> Except that opSlice already works with "..". What would this buy you?

Having a specific range for a .. operator allows you to have them as parameters of any function.

For example, this could be nice for multidimensional slicing:

    Matrix!(double, 6, 6) A;
    auto partOfA = A[1..3, 4..6];

Operations on several items of a container:

    Container B;
    B.remove(4..9); // remove 5 contiguous elements.

etc.
Apr 05 2012
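For readers arriving later: D releases made after this thread added a templated opSlice(size_t dim) form, under which each a..b inside the brackets is converted to a user-chosen bounds value and then passed to opIndex. The sketch below assumes such a compiler; the Grid type and its row-major layout are hypothetical and stand in for the Matrix example above.

```d
// Hypothetical row-major matrix used to illustrate multidimensional slicing.
struct Grid
{
    double[] cells;
    size_t rows, cols;

    // Each "a .. b" in dimension dim becomes a plain pair of bounds.
    size_t[2] opSlice(size_t dim)(size_t a, size_t b) { return [a, b]; }

    // Receives one bounds pair per dimension and copies that block out.
    double[][] opIndex(size_t[2] r, size_t[2] c)
    {
        double[][] block;
        foreach (i; r[0] .. r[1])
            block ~= cells[i * cols + c[0] .. i * cols + c[1]].dup;
        return block;
    }
}

void main()
{
    auto g = Grid([1.0, 2.0, 3.0,
                   4.0, 5.0, 6.0,
                   7.0, 8.0, 9.0], 3, 3);
    // Rows 0 and 1, columns 1 and 2.
    assert(g[0 .. 2, 1 .. 3] == [[2.0, 3.0], [5.0, 6.0]]);
}
```

This covers the indexing case only; passing 4..9 to an arbitrary function such as B.remove still requires a library value like iota(4, 9).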
On Thursday, April 05, 2012 14:58:41 Christophe wrote:

> "Jonathan M Davis", in message (digitalmars.D.learn:34243), wrote:
>> Except that opSlice already works with "..". What would this buy you?

As I said, all it does is give you syntactic sugar for iota which can't even do as much as iota can (since it lacks a step parameter). But my point that you're quoting has nothing to do with using .. with functions in general. It specifically has to do with creating a new overload for opSlice as Jacob suggested - i.e. when you do a[0 .. 5] where a is an instance of a user-defined type. That works just fine with auto opSlice (size_t start, size_t end). The range type buys you nothing for that, and in fact would be _more_ expensive, since it would have to allocate a struct rather than simply passing the two indices.

- Jonathan M Davis
Apr 05 2012
Christophe:

> Having a specific range for a .. operator allows you to have them as parameters of any function.

Such functions are also able to accept an Iota struct and then read its fields to find its bounds.

For Jonathan M Davis: the first-class intervals seem nice to have, but they aren't near the top of the list of my enhancement requests :-)

Bye,
bearophile
Apr 05 2012
On 04/04/2012 12:06 PM, Jacob Carlborg wrote:

> On 2012-04-04 04:11, Jonathan M Davis wrote:
>> foreach(i; 0 .. 5) is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case foreach(i; 0 .. 5) would become identical to foreach(i; [0, 1, 2, 3, 4]) and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays?
> ...
> I think this would be completely backwards compatible as well.

It would be awkward to introduce it in a backwards compatible way, because currently '..' binds weaker than any operator.

    auto x = 0..10;        // ok
    auto y = 0..10, z = 2; // error, z not defined
    x = 0..11;             // error: expression '11' has no effect
Apr 05 2012