digitalmars.D - Iterators for D
- Walter Bright (20/20) Nov 06 2006 It's becoming increasingly obvious that D needs iterators. While opApply...
- Walter Bright (3/4) Nov 06 2006 Make that .begin
- rm (16/22) Nov 06 2006 that would give something like this?
- Sean Kelly (8/26) Nov 06 2006 Oh right, that's what doesn't seem to work with the C++ model--such an
- rm (5/10) Nov 06 2006 I didn't read the discussion, but wasn't there a topic about
- Sean Kelly (12/23) Nov 06 2006 How? Say I do this:
- Hasan Aljudy (4/35) Nov 06 2006 instead of:
- Bill Baxter (5/46) Nov 06 2006 The point is that you could replace char[] with SomeCollection and the
- Dave (6/35) Nov 06 2006 Would this mechanism work for, say, arrays of structs (not struct*[] 's)...
- Dave (4/43) Nov 06 2006 I should add - will native arrays of structs (say) provide this by defau...
- Walter Bright (2/4) Nov 06 2006 Yes. It would be no different than for native arrays of ints.
- Dave (2/7) Nov 06 2006 Cool!
- Tom S (6/9) Nov 06 2006 Could you elaborate on how this would look like ? E.g. a usage example
- Walter Bright (2/5) Nov 06 2006 With an iterator, what is the key?
- Tom S (2/8) Nov 06 2006 You've got a point. I failed to see the big picture here.
- Bill Baxter (17/23) Nov 06 2006 It could just be the count. Say I'm trying to linearize a non-linear
- rm (3/9) Nov 06 2006 with an iterator, what is the "value"?
- Sean Kelly (8/28) Nov 06 2006 Is it worth discussing whether the pointer-style C++ iterators are the
- Walter Bright (2/8) Nov 06 2006 Hmm, or it could be done starting with .end and going to .begin.
- Sean Kelly (7/17) Nov 06 2006 Yup. About the only potentially weird thing there is that, assuming C++...
- Bill Baxter (51/71) Nov 06 2006 How about pointers (unknown length arrays)?
- Walter Bright (14/82) Nov 06 2006 It's simpler than that. Pointers can be converted to arrays:
- Bill Baxter (22/71) Nov 06 2006 The C++ way really embodies two design choices:
- Walter Bright (2/10) Nov 06 2006 That's a very good point. Got any ideas on that?
- Craig Black (8/18) Nov 07 2006 When writing custom C++ iterators, I find that end() is not ever necessa...
- Tom S (3/8) Nov 07 2006 Also in some cases, 'end()' is really unnatural to write. E.g. with
- Kristian Kilpi (42/50) Nov 07 2006 er
- Karen Lanrap (15/16) Nov 06 2006 In the general case the structure to be linearized is a directed
- Mike Capp (4/7) Nov 06 2006 Wouldn't this make iterators incompatible with e.g. linked lists? Seems ...
- Walter Bright (4/12) Nov 06 2006 To work, all it has to do is support [0], and throw an exception if the
- Bill Baxter (34/50) Nov 06 2006 This is one point where there's a huge divide between C++ iterators and
- Sean Kelly (11/53) Nov 06 2006 For what it's worth, C++98 style iterator traits won't work in D because...
- Kirk McDonald (63/92) Nov 06 2006 What are the arguments against that, again? I do not recall, and it's
- Walter Bright (18/92) Nov 06 2006 Because D has many cases of implicit dereferencing, I thought it would
- Kirk McDonald (34/156) Nov 06 2006 Wouldn't that be p[0] = ...?
- Walter Bright (21/80) Nov 06 2006 I must have misunderstood you. I thought you were asking if [n] worked
- Bill Baxter (31/45) Nov 06 2006 The only thing that comes to mind is if you get multiple indirections.
- Jarrett Billingsley (16/18) Nov 06 2006 Somehow..
- Walter Bright (8/32) Nov 06 2006 The problem is if the iterator is an actual pointer, there is no .value
- Jarrett Billingsley (4/11) Nov 06 2006 And yet another is to allow overriding of opDeref and opDerefAssign ;)
- Sean Kelly (5/34) Nov 06 2006 Is there really any reason to support pointers as iterators though? C++...
- Bill Baxter (10/24) Nov 06 2006 Well, they could. It's up to Mr. Compiler Writer if a pointer has a
- Walter Bright (10/16) Nov 06 2006 Consider the following struct:
- Bill Baxter (10/31) Nov 06 2006 It's a name conflict. They happen. I'd have trouble if I wanted to
- Charles D Hixson (15/36) Nov 07 2006 Maybe that's why Ada used "`" as a property marker rather than
- Sean Kelly (6/30) Nov 07 2006 I've used it a few times, but generally more when I'm throwing together
- Walter Bright (2/5) Nov 06 2006 Can you be more specific about what the problems and solutions are?
- Georg Wrede (8/16) Nov 07 2006 For one thing, if pointers still are supported, then all algorithms have...
- Sean Kelly (27/33) Nov 07 2006 A direct application of pointers as iterators couldn't be done exactly
- Walter Bright (3/14) Nov 07 2006 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
- Sean Kelly (5/20) Nov 07 2006 I went ahead and wrote everything up during my commute this morning.
- Benji Smith (29/40) Nov 07 2006 I agree completely. As someone with a background in Java, C#, and Python...
- Bill Baxter (13/28) Nov 07 2006 OOh, I like the direction this is heading.
- Ary Manzana (3/8) Nov 07 2006 In fact, Java has them:
- Daniel Keep (33/84) Nov 07 2006 Hang on; aren't we back to where we are *right now*?
- Bill Baxter (14/35) Nov 07 2006 Close, but right now we don't have a good way to iterate over multiple
- Daniel Keep (22/76) Nov 07 2006 Well, as I said, once you have your sequence expressed as a coroutine,
- Sean Kelly (7/13) Nov 08 2006 That's pretty much it. Saving the current register data, loading the
- Benji Smith (11/21) Nov 08 2006 What about containers where there are multiple valid orders of
- Chris Nicholson-Sauls (13/39) Nov 08 2006 It should be roughly equivelant in cost to explicitly calling the method...
- Sean Kelly (17/106) Nov 08 2006 Essentially. But D has no formal description of an iterator type. I
- Bill Baxter (15/137) Nov 08 2006 I wrote some mesh (graph) manipulation algorithms in Python. I really
- =?iso-8859-1?q?Knud_S=F8rensen?= (3/8) Nov 06 2006 What are those cases ? Maybe we can find a way to fix the problem with
- Walter Bright (4/14) Nov 06 2006 One such case is the usefulness of being able to provide an input
- Bill Baxter (12/27) Nov 06 2006 Another case is the desire to iterate over multiple containers at the
- KlausO (6/10) Nov 07 2006 Some questions:
- Chris Nicholson-Sauls (67/80) Nov 07 2006 Mostly agreed. For example, one case I seem to find mentioned a few tim...
- Bill Baxter (5/23) Nov 07 2006 Other way around. Tree traversal is harder to do with C++ style
- Craig Black (6/26) Nov 07 2006 It's not a huge problem, but passing an argument needlessly is both ugly...
- Oskar Linde (107/121) Nov 07 2006 I apologize for this long post. Writing too much has become a bad habit
- nazo (11/17) Nov 07 2006 I think that this is more better.
- Walter Bright (12/53) Nov 07 2006 Right.
- Dave (6/45) Nov 12 2006 For arrays, we already have arr.ptr, what if array.end was simply added?
- Burton Radons (53/53) Nov 10 2006 I don't like C++ iterators; they're a very old design made to fit a
- Walter Bright (2/2) Nov 10 2006 As nice as yield is for some types of problems, they aren't going to
It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: .being property returns an iterator that starts at the beginning .end returns an iterator that is at the end Overloading * doesn't work with D. But, instead: Overload opIndex for rvalue access Overload opIndexAssign for lvalue access Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.
Nov 06 2006
Walter Bright wrote:.being property returns an iterator that starts at the beginningMake that .begin Also, overloading += will be used to advance the iterator.
Nov 06 2006
Walter Bright wrote:Walter Bright wrote:that would give something like this? import std.stdio; void main() { char[] s = "hello"; // += for advancing an iterator, why not ++? for (iterator i = s.begin; i != s.end; i += 1) writefln(s[i]); // ??? i is an iterator or a char ??? foreach(i; s) // extend writefln to take iterators ? // but that would mean that it must be possible // to determine a value from an iterator writefln(i); }.being property returns an iterator that starts at the beginningMake that .begin Also, overloading += will be used to advance the iterator.
Nov 06 2006
rm wrote:Walter Bright wrote:Oh right, that's what doesn't seem to work with the C++ model--such an iterator wouldn't work in foreach: foreach( e; s.begin() ) // nowhere to supply s.end() I think the Java style may be more appropriate: foreach( e; s.fwdIter() ) foreach( e; s.revIter() ) SeanWalter Bright wrote:that would give something like this? import std.stdio; void main() { char[] s = "hello"; // += for advancing an iterator, why not ++? for (iterator i = s.begin; i != s.end; i += 1) writefln(s[i]);.being property returns an iterator that starts at the beginningMake that .begin Also, overloading += will be used to advance the iterator.
Nov 06 2006
I think the Java style may be more appropriate: foreach( e; s.fwdIter() ) foreach( e; s.revIter() )I didn't read the discussion, but wasn't there a topic about foreach_reverse? Should that not cover s.revIter()? Secondly, why must one give in the foreach statement the first iterator? I'd say that there the iterable must be given? roel
Nov 06 2006
rm wrote:How? Say I do this: foreach_reverse( e; s.begin() ) I've passed an iterator pointing to the beginning of a sequence to an algorithm that wants to iterate backwards, so best case the iterator is bidirectional and we will visit exactly one element (the theoretical "last" in this case) before exiting the loop.I think the Java style may be more appropriate: foreach( e; s.fwdIter() ) foreach( e; s.revIter() )I didn't read the discussion, but wasn't there a topic about foreach_reverse? Should that not cover s.revIter()?Secondly, why must one give in the foreach statement the first iterator? I'd say that there the iterable must be given?Something like that, yes. You need a unified object that knows when it's reached the end of the sequence. So assuming C++ style iterators you'd have to wrap them in an object that takes care of the comparisons and looping to work with foreach. Sean
Nov 06 2006
rm wrote:Walter Bright wrote:Why would I useWalter Bright wrote:that would give something like this? import std.stdio; void main() { char[] s = "hello"; // += for advancing an iterator, why not ++? for (iterator i = s.begin; i != s.end; i += 1) writefln(s[i]); // ??? i is an iterator or a char ??? foreach(i; s) // extend writefln to take iterators ? // but that would mean that it must be possible // to determine a value from an iterator writefln(i); }.being property returns an iterator that starts at the beginningMake that .begin Also, overloading += will be used to advance the iterator.for (iterator i = s.begin; i != s.end; i += 1) writefln(s[i]);instead of:for (int i = 0; i < s.length; i++) writefln(s[i]);?
Nov 06 2006
Hasan Aljudy wrote:rm wrote:The point is that you could replace char[] with SomeCollection and the code should still work. But the same mechanism should also work on the built-in types. --bbWalter Bright wrote:Why would I use > for (iterator i = s.begin; i != s.end; i += 1) > writefln(s[i]); instead of: > for (int i = 0; i < s.length; i++) > writefln(s[i]); ?Walter Bright wrote:that would give something like this? import std.stdio; void main() { char[] s = "hello"; // += for advancing an iterator, why not ++? for (iterator i = s.begin; i != s.end; i += 1) writefln(s[i]); // ??? i is an iterator or a char ??? foreach(i; s) // extend writefln to take iterators ? // but that would mean that it must be possible // to determine a value from an iterator writefln(i); }.being property returns an iterator that starts at the beginningMake that .begin Also, overloading += will be used to advance the iterator.
Nov 06 2006
Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: .being property returns an iterator that starts at the beginning .end returns an iterator that is at the end Overloading * doesn't work with D. But, instead: Overload opIndex for rvalue access Overload opIndexAssign for lvalue access Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.Would this mechanism work for, say, arrays of structs (not struct*[] 's) so that we could avoid the overhead of either a) copying the struct and/or b) the extra level of indirection in the loop? (In other words, could the iterator provide a reference to each element by default). Thanks, - Dave
Nov 06 2006
Dave wrote:Walter Bright wrote:I should add - will native arrays of structs (say) provide this by default w/o having to add specialized operators? Thanks.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: .being property returns an iterator that starts at the beginning .end returns an iterator that is at the end Overloading * doesn't work with D. But, instead: Overload opIndex for rvalue access Overload opIndexAssign for lvalue access Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.Would this mechanism work for, say, arrays of structs (not struct*[] 's) so that we could avoid the overhead of either a) copying the struct and/or b) the extra level of indirection in the loop? (In other words, could the iterator provide a reference to each element by default). Thanks, - Dave
Nov 06 2006
Dave wrote:I should add - will native arrays of structs (say) provide this by default w/o having to add specialized operators?Yes. It would be no different than for native arrays of ints.
Nov 06 2006
Walter Bright wrote:Dave wrote:Cool!I should add - will native arrays of structs (say) provide this by default w/o having to add specialized operators?Yes. It would be no different than for native arrays of ints.
Nov 06 2006
Walter Bright wrote:Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessCould you elaborate on how this would look like ? E.g. a usage example of iterators with the proposed design.foreach loops will not be able to have a 'key' parameter.What's the particular reason for that restriction ? -- Tomasz Stachowiak
Nov 06 2006
Tom S wrote:Walter Bright wrote:With an iterator, what is the key?foreach loops will not be able to have a 'key' parameter.What's the particular reason for that restriction ?
Nov 06 2006
Walter Bright wrote:Tom S wrote:You've got a point. I failed to see the big picture here.Walter Bright wrote:With an iterator, what is the key?foreach loops will not be able to have a 'key' parameter.What's the particular reason for that restriction ?
Nov 06 2006
Walter Bright wrote:Tom S wrote:It could just be the count. Say I'm trying to linearize a non-linear datastructure into a pre-existing array: foreach(int i, val v; hierarchical_thing.iterator()) { linearized[i] = val; } Also having the count there would make it easier to do something like iterating lock step in funky ways over different objects: foreach(int i, val v; thing.randomized_iterator()) { other_thing[i] = val; } Obviously you could count things yourself if you had to. But I find for instance in Python that I use 'enumerate()' quite often. enumerate() takes an iterator and returns an iterator that spits out (count,value) tuples. --bbWalter Bright wrote:With an iterator, what is the key?foreach loops will not be able to have a 'key' parameter.What's the particular reason for that restriction ?
Nov 06 2006
Walter Bright wrote:Tom S wrote:with an iterator, what is the "value"? Is it the key-value *pair*, or is it just the key?Walter Bright wrote:With an iterator, what is the key?foreach loops will not be able to have a 'key' parameter.What's the particular reason for that restriction ?
Nov 06 2006
Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: ..being property returns an iterator that starts at the beginning ..end returns an iterator that is at the endIs it worth discussing whether the pointer-style C++ iterators are the best approach? They certainly are if random access iterators are a requirement, but if we're just implementing forward and reverse iterators, the all-in-one Java approach may be better. In that vein, I assume alose that reverse iteration would work with reverse iterators and an .rbegin type method instead of .begin and foreach_reverse? Sean
Nov 06 2006
Sean Kelly wrote:Is it worth discussing whether the pointer-style C++ iterators are the best approach? They certainly are if random access iterators are a requirement, but if we're just implementing forward and reverse iterators, the all-in-one Java approach may be better. In that vein, I assume alose that reverse iteration would work with reverse iterators and an .rbegin type method instead of .begin and foreach_reverse?Hmm, or it could be done starting with .end and going to .begin.
Nov 06 2006
Walter Bright wrote:Sean Kelly wrote:Yup. About the only potentially weird thing there is that, assuming C++ style iterators, it would require reverse iterators and forward iterators to be comparable to one another. And, I suppose, for .end to return a reverse style iterator even for sequences that don't support reverse iteration? SeanIs it worth discussing whether the pointer-style C++ iterators are the best approach? They certainly are if random access iterators are a requirement, but if we're just implementing forward and reverse iterators, the all-in-one Java approach may be better. In that vein, I assume alose that reverse iteration would work with reverse iterators and an .rbegin type method instead of .begin and foreach_reverse?Hmm, or it could be done starting with .end and going to .begin.
Nov 06 2006
Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arraysHow about pointers (unknown length arrays)? int *some_numbers; Is it required that a pointer be a valid iterator? I.e. should it be possible to have this template function: print_all(some_numbers, some_numbers+theLength); Or will that case require a wrapper object that contains theLength? auto iter = PointerIterator(some_numbers, theLength); print_all(iter);2) Doesn't need to work with associative arraysWhy not? Do you mean it doesn't need to because you can just iterate over keys or values?3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyableSo here's one design: .being property returns an iterator that starts at the beginning .end returns an iterator that is at the endThat's like C++'s way. Iterator is basically a generalized pointer. is like a pointer that knows it's own limits. the C++ way working for D because, in practice, references and dereferencing get used pretty heavily there. On the other hand One of the major design goals for C++ was that for a basic array a regular pointer should be a valid iterator. Thus the iterator cannot know where it's own end() is, because a pointer does not have such knowledge. But aside from cute demos on SGI's STL web page where algorithms are applied to plain arrays using pointers, I've never seen that sort of thing used in the real world. You almost always use a std::vector or std::list or something besides a raw array. So pointer equivalence seems to be pretty useless to me. Even moreso with D. In the rare case you need to work with raw pointers, you can always make an iterator wrapper. On the plus side for C++ style: * iterators are lightweight (just one size_t/pointer in most cases) * algorithms and methods all take begin/end pairs making things uniform. Minuses * Iterators are dumb and error prone and easily go out of bounds * algorithms all require begin/end pairs making things cumbersome I think the minuses outweigh the plusses. But maybe it's useful to take one concrete example: A generic mergesort or quicksort algorithm. In C++ recursive calls just pass the limits of the range to the children mergesort(begin, end) { ... mergesort(begin, (end-begin)/2); mergesort((end-begin)/2, end); } It's very efficient. In the iterator-is-object approach I'm not sure how that works. If all you've got is .next() or .hasNext() I'm not sure what you're supposed to do to create an iterator over half the range represented by the original iterator.foreach loops will not be able to have a 'key' parameter.Don't understand that one. --bb
Nov 06 2006
Bill Baxter wrote:Walter Bright wrote:It's simpler than that. Pointers can be converted to arrays: auto array = some_numbers[0 .. theLength]; and then you can get the iterator.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arraysHow about pointers (unknown length arrays)? int *some_numbers; Is it required that a pointer be a valid iterator? I.e. should it be possible to have this template function: print_all(some_numbers, some_numbers+theLength); Or will that case require a wrapper object that contains theLength? auto iter = PointerIterator(some_numbers, theLength); print_all(iter);Because that falls into my big complaint with iterators - there's no way to efficiently 'linearize' a binary tree. Might as well just iterate over the keys or values.2) Doesn't need to work with associative arraysWhy not? Do you mean it doesn't need to because you can just iterate over keys or values?That's like C++'s way. Iterator is basically a generalized pointer.Yes.is like a pointer that knows it's own limits. the C++ way working for D because, in practice, references and dereferencing get used pretty heavily there. On the other handI think the C++ like way will be a lot more efficient, and I think it will work. Reference types are not needed.One of the major design goals for C++ was that for a basic array a regular pointer should be a valid iterator. Thus the iterator cannot know where it's own end() is, because a pointer does not have such knowledge. But aside from cute demos on SGI's STL web page where algorithms are applied to plain arrays using pointers, I've never seen that sort of thing used in the real world. You almost always use a std::vector or std::list or something besides a raw array. So pointer equivalence seems to be pretty useless to me. Even moreso with D. In the rare case you need to work with raw pointers, you can always make an iterator wrapper.I think it'll work fine, since pointers can be easily converted to dynamic arrays (see above).On the plus side for C++ style: * iterators are lightweight (just one size_t/pointer in most cases) * algorithms and methods all take begin/end pairs making things uniform. Minuses * Iterators are dumb and error prone and easily go out of bounds * algorithms all require begin/end pairs making things cumbersome I think the minuses outweigh the plusses. But maybe it's useful to take one concrete example: A generic mergesort or quicksort algorithm. In C++ recursive calls just pass the limits of the range to the children mergesort(begin, end) { ... mergesort(begin, (end-begin)/2); mergesort((end-begin)/2, end); } It's very efficient. In the iterator-is-object approach I'm not sure how that works. If all you've got is .next() or .hasNext() I'm not sure what you're supposed to do to create an iterator over half the range represented by the original iterator.Iterators can be any type, not just objects, that support [], ++, +=, and =.What's the key for an unknown iterator type?foreach loops will not be able to have a 'key' parameter.Don't understand that one.
Nov 06 2006
Walter Bright wrote:Bill Baxter wrote:Nice! Didn't know you could do that.Walter Bright wrote:It's simpler than that. Pointers can be converted to arrays: auto array = some_numbers[0 .. theLength]; and then you can get the iterator.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arraysHow about pointers (unknown length arrays)? int *some_numbers; Is it required that a pointer be a valid iterator? I.e. should it be possible to have this template function: print_all(some_numbers, some_numbers+theLength); Or will that case require a wrapper object that contains theLength? auto iter = PointerIterator(some_numbers, theLength); print_all(iter);I think the C++ like way will be a lot more efficient, and I think it will work. Reference types are not needed.The C++ way really embodies two design choices: 1) iterators are defined by 'concept' --> Means generic iteration functions all need to be templates --> The other alternative is define them by interfaces. --> these two aren't necessarily mutually exclusive though 2) an iterator is a pointer -- begin() and end() are kept separate. --> The other alternative is 'an iterator is a range' Definitely 1) will be more efficient than using interfaces, but I don't thing pointer semantics, 2), are required for 1) to work. It may be possible for D iterators to have the speed of C++ while having the convenience and safety of Java's iterators that know their bounds.I think the iterator-as-range idea is at least worth discussing. 99% of the time you use an iterator you also need to know where to stop. It's pretty similar to how arrays in D keep a length, because 99% of the time when you have an array you need to know how long it is too. So it makes sense to keep those two bits of information together. Similarly in C++, when you have an iterator, you almost always need the end() too. And any time you want to pass around iterators you end up having to pass the two bits of information around separately. --bbI think the minuses outweigh the plusses. But maybe it's useful to take one concrete example: A generic mergesort or quicksort algorithm. In C++ recursive calls just pass the limits of the range to the children mergesort(begin, end) { ... mergesort(begin, (end-begin)/2); mergesort((end-begin)/2, end); } It's very efficient. In the iterator-is-object approach I'm not sure how that works. If all you've got is .next() or .hasNext() I'm not sure what you're supposed to do to create an iterator over half the range represented by the original iterator.Iterators can be any type, not just objects, that support [], ++, +=, and =.
Nov 06 2006
Bill Baxter wrote:I think the iterator-as-range idea is at least worth discussing. 99% of the time you use an iterator you also need to know where to stop. It's pretty similar to how arrays in D keep a length, because 99% of the time when you have an array you need to know how long it is too. So it makes sense to keep those two bits of information together. Similarly in C++, when you have an iterator, you almost always need the end() too. And any time you want to pass around iterators you end up having to pass the two bits of information around separately.That's a very good point. Got any ideas on that?
Nov 06 2006
When writing custom C++ iterators, I find that end() is not ever necessary. If end() is not used, it means that a little more smarts have to be added to the iterator itself so that the iterator knows when to stop. In some cases this means that the iterator needs a pointer to the collection/container object in order to get that information. -Craig "Walter Bright" <newshound digitalmars.com> wrote in message news:eioqvl$26t0$2 digitaldaemon.com...Bill Baxter wrote:I think the iterator-as-range idea is at least worth discussing. 99% of the time you use an iterator you also need to know where to stop. It's pretty similar to how arrays in D keep a length, because 99% of the time when you have an array you need to know how long it is too. So it makes sense to keep those two bits of information together. Similarly in C++, when you have an iterator, you almost always need the end() too. And any time you want to pass around iterators you end up having to pass the two bits of information around separately.That's a very good point. Got any ideas on that?
Nov 07 2006
Craig Black wrote:When writing custom C++ iterators, I find that end() is not ever necessary. If end() is not used, it means that a little more smarts have to be added to the iterator itself so that the iterator knows when to stop. In some cases this means that the iterator needs a pointer to the collection/container object in order to get that information.Also in some cases, 'end()' is really unnatural to write. E.g. with cyclic containers.
Nov 07 2006
On Tue, 07 Nov 2006 16:03:13 +0200, Craig Black <cblack ara.com> wrote:When writing custom C++ iterators, I find that end() is not ever =necessary. If end() is not used, it means that a little more smarts have to be =added to the iterator itself so that the iterator knows when to stop. In some ==cases this means that the iterator needs a pointer to the collection/contain=erobject in order to get that information.A benefit of having a pointer to the iterated container is of course tha= t = boundary checks are always up to date, even if you change the container = = while iterating it. Well, I'm not saying that it would be the way to do = = the iterators, or that it's not the way to do them. Btw, I have used (in C++) iterators having the following functions: isEnd() //does the iterator point to the end position? (isNotEnd(= ) = etc provided for convenience) //there is also: isBegin(), isFirst(), isLast() toBegin() //to position -1 (which is outside the container) toFirst() //to position 0 toLast() //to position end - 1 toEnd() //the end position points outside the container toNext() toPrev() get() //returns the current item read() //=3D=3D get() + toNext() getNext() getPrev() readBackward() //=3D=3D get() + toPrev() ... I have named iterators capable of changing the container as 'Writer's = ('wr' + 'iter'). They have the following functions: set() write() //=3D=3D set() + toNext() ... Ok, there are a lot of functions... but my actual point is that the = iterator has two positions, 'begin' and 'end', that point *outside* the = = container. When writing (or setting) a value to 'end', the value is = appended to the end of the container (very natural operation for files).= = The same way "i.toBegin(); i.set(val);" inserts 'val' to the start of th= e = container (very unnatural operation for files, and actually causes an = error if tried for files).
Nov 07 2006
Walter Bright wrote:there's no way to efficiently 'linearize' a binary treeIn the general case the structure to be linearized is a directed graph. Every linearization has to at least implicitely build a spanning tree for that graph. A priori in spanning trees the number of childs of a node is limited only by the number of nodes of the structure. Therefore every linearizator must be able to handle trees of arbitrary degree. From tree grammars it is known, that top down attributions and bottom up attributions are incomparable to each other when only one run is allowed. Because the goal of any linearization is to compute some attribution of the structure every 'traverser' must assist the process outlined above, where in the general case more than one run is needed. Now: what is efficiency in this general case?
Nov 06 2006
Overload opIndex for rvalue access [...] Overloading opIndex also will work for random accessWouldn't this make iterators incompatible with e.g. linked lists? Seems to defeat much of the purpose. (It might be possible to implement random access for a list, but it's not something you'd want to encourage.)
Nov 06 2006
Mike Capp wrote:To work, all it has to do is support [0], and throw an exception if the index is anything but 0.Overload opIndex for rvalue access [...] Overloading opIndex also will work for random accessWouldn't this make iterators incompatible with e.g. linked lists? Seems to defeat much of the purpose.(It might be possible to implement random access for a list, but it's not something you'd want to encourage.)I agree.
Nov 06 2006
Walter Bright wrote:Mike Capp wrote:This is one point where there's a huge divide between C++ iterators and Java iterators. In C++ there is no such thing as *the* iterator type or interface. What C++ (or STL, rather) has is collection of *iterator concepts*. What that means is that if opIndex doesn't make sense for the the collection in question, you just leave it out. A 'concept' is like an interface but for templates. Here are the concepts in the STL: Trivial Iterator - http://www.sgi.com/tech/stl/trivial.html Input Iterator - http://www.sgi.com/tech/stl/InputIterator.html Output Iterator - http://www.sgi.com/tech/stl/OutputIterator.html Forward Iterator - http://www.sgi.com/tech/stl/ForwardIterator.html Bidirectional Iterator - http://www.sgi.com/tech/stl/BidirectionalIterator.html Random Access Iterator http://www.sgi.com/tech/stl/RandomAccessIterator.html If you want your container to support forward iteration, then your iterator can be of any type you wish, but it must support x++, ++x, and *x operations. pros and cons. * Main pro is that functions that take generic iterators can be ordinary functions rather than templates. * Main con is that such functions can't generally be as efficient as raw pointer manipulation, because you have to make a virtual method call through the interface pointer to get each next element. D is in a very interesting place in that it is high level enough that the C++ model doesn't quite make sense, but low level enough that the Java model doesn't quite make sense either. D may be able to have the best of both worlds. Define *both* standard concepts *and* standard interfaces that implement those concepts, and then libraries can write either templatized algorithms for speed, or regular function based algorithms for code-size, use with inheritance, etc. --bbTo work, all it has to do is support [0], and throw an exception if the index is anything but 0.Overload opIndex for rvalue access [...] Overloading opIndex also will work for random accessWouldn't this make iterators incompatible with e.g. linked lists? Seems to defeat much of the purpose.(It might be possible to implement random access for a list, but it's not something you'd want to encourage.)I agree.
Nov 06 2006
Bill Baxter wrote:Walter Bright wrote:For what it's worth, C++98 style iterator traits won't work in D because of D's simpler template lookup rules. I think we'll have to go with something much closer to C++0x concept checking if we want to overload algorithms for different iterator categories.Mike Capp wrote:This is one point where there's a huge divide between C++ iterators and Java iterators. In C++ there is no such thing as *the* iterator type or interface. What C++ (or STL, rather) has is collection of *iterator concepts*. What that means is that if opIndex doesn't make sense for the the collection in question, you just leave it out. A 'concept' is like an interface but for templates.To work, all it has to do is support [0], and throw an exception if the index is anything but 0.Overload opIndex for rvalue access [...] Overloading opIndex also will work for random accessWouldn't this make iterators incompatible with e.g. linked lists? Seems to defeat much of the purpose.(It might be possible to implement random access for a list, but it's not something you'd want to encourage.)I agree.pros and cons. * Main pro is that functions that take generic iterators can be ordinary functions rather than templates. * Main con is that such functions can't generally be as efficient as raw pointer manipulation, because you have to make a virtual method call through the interface pointer to get each next element.Hrm... you could still use a Java design without the base class and have everyone just use templates though.D is in a very interesting place in that it is high level enough that the C++ model doesn't quite make sense, but low level enough that the Java model doesn't quite make sense either. D may be able to have the best of both worlds. Define *both* standard concepts *and* standard interfaces that implement those concepts, and then libraries can write either templatized algorithms for speed, or regular function based algorithms for code-size, use with inheritance, etc.That's kind of what I was thinking, though if iterators all inherit from a common base class then the calls will be virtual regardless, unless the DMD optimizer gets a bit fancier. Sean
Nov 06 2006
Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: .begin property returns an iterator that starts at the beginning .end returns an iterator that is at the end Overloading * doesn't work with D. But, instead:What are the arguments against that, again? I do not recall, and it's worth at least reviewing why we don't have opDeref and opDerefAssign. As you know, Walter, a major benefit of C++-style iterators is that they are semantically as close as possible to pointers. Would [] be changed to dereference pointers? I sure hope not. If we go this route, I would suggest adding these unary-* overloads.Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessI think you mean opSlice and opSliceAssign. Bar bar; auto foo = bar.begin(); Bar baz = foo[]; foo[] = Bar.init;Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.So, I would assume these iterators would work something like this: If a class provides .begin and .end, and these both return the same type, and that type provides the requisite iterator methods, it is foreach-able. If a class provides .begin, .end, and opApply, one of the two iteration methods has to take precedence. I would hope it's opApply. For a type T, its associated iterator type should be available via T.iterator. This has to be standard. An iterable type T MUST provide these methods: I begin() I end() An iterator type I MUST provide the following overloads: E opSlice() I opPostInc() E is the type of an element in the collection. These are enough to describe a read-only forward iterator, which is adequate for foreach. In other words, given a type T that meets the requirements outlined above, the following is adequate to iterate through it: foreach(e; t) { // ... } 'e' would be of type E. We run into a curious problem involving pre-increment, which uses opAddAssign. Classes for which random-access is problematic should not be required to provide this, which is essentially a random-access method. There are two options here: 1) Only require that iterators support opPostInc. (This further removes them from actual pointers.) 2) Introduce a convention whereby a non-random-access iterator throws an exception if any value other than 1 is passed to its opAddAssign overload. Extending these rules for bidirectional iterators and random-access iterators, as well as read-write iterators, is left as an exercise to the reader. (And not a very difficult one.) All of these strange edge-cases and differences between pointers and possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR D. D, unlike C++, does not supply enough operator overloads for their semantics to be identical to those of pointers, which was the whole point of C++-style iterators in the first place. Instead, we should look to other languages for inspiration. I suggest Java and Python, whose iterator semantics are similar. Other posters have already made suggestions on this front. I would only suggest the "opIter" method for getting an iterator, and that the iterator class itself be available via T.iterator (whether it is a nested class or an alias shouldn't matter). Classes can provide methods for returning alternate iterators, as well. (A class providing opIter should be iterated over as easily as an iterator itself.) -- Kirk McDonald Pyd: Wrapping Python with D http://pyd.dsource.org
Nov 06 2006
Kirk McDonald wrote:Walter Bright wrote:Because D has many cases of implicit dereferencing, I thought it would cause more trouble than it's worth.Overloading * doesn't work with D. But, instead:What are the arguments against that, again?I do not recall, and it's worth at least reviewing why we don't have opDeref and opDerefAssign. As you know, Walter, a major benefit of C++-style iterators is that they are semantically as close as possible to pointers. Would [] be changed to dereference pointers?You've always been able to: int* p; p[1] = ...I sure hope not. If we go this route, I would suggest adding these unary-* overloads.No, I meant Index.Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessI think you mean opSlice and opSliceAssign.Yes, though they will return a value, not a type.Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.So, I would assume these iterators would work something like this: If a class provides .begin and .end, and these both return the same type, and that type provides the requisite iterator methods, it is foreach-able.If a class provides .begin, .end, and opApply, one of the two iteration methods has to take precedence. I would hope it's opApply.I was leaning towards the other way around.For a type T, its associated iterator type should be available via T.iterator. This has to be standard.It's not needed, as typeof(T.begin) will do it.An iterable type T MUST provide these methods: I begin() I end()Yes.An iterator type I MUST provide the following overloads: E opSlice()No. opIndex()I opPostInc()Probably opAddAssign()E is the type of an element in the collection.These are enough to describe a read-only forward iterator, which is adequate for foreach. In other words, given a type T that meets the requirements outlined above, the following is adequate to iterate through it: foreach(e; t) { // ... } 'e' would be of type E.Yes.We run into a curious problem involving pre-increment, which uses opAddAssign. Classes for which random-access is problematic should not be required to provide this, which is essentially a random-access method. There are two options here: 1) Only require that iterators support opPostInc. (This further removes them from actual pointers.) 2) Introduce a convention whereby a non-random-access iterator throws an exception if any value other than 1 is passed to its opAddAssign overload.I could go either way with that.Extending these rules for bidirectional iterators and random-access iterators, as well as read-write iterators, is left as an exercise to the reader. (And not a very difficult one.) All of these strange edge-cases and differences between pointers and possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR D. D, unlike C++, does not supply enough operator overloads for their semantics to be identical to those of pointers, which was the whole point of C++-style iterators in the first place.I think it does provide enough, in fact, it will be less wacky than in C++ (look at the wretched iterator/const_iterator dichotomy). It'll work with core arrays, and will not need those typedefs.Instead, we should look to other languages for inspiration. I suggest Java and Python, whose iterator semantics are similar. Other posters have already made suggestions on this front. I would only suggest the "opIter" method for getting an iterator, and that the iterator class itself be available via T.iterator (whether it is a nested class or an alias shouldn't matter). Classes can provide methods for returning alternate iterators, as well. (A class providing opIter should be iterated over as easily as an iterator itself.)
Nov 06 2006
Walter Bright wrote:Kirk McDonald wrote:Wouldn't that be p[0] = ...? Is that how you intend to use opIndex to "dereference" the iterator? That's /terrifying/. Although it is consistent with pointers, using [0] for everything is just asking for trouble with regards to non-random-access iterators. And though it's probably not totally meaningless to a random reader of the code, it comes pretty darned close.Walter Bright wrote:Because D has many cases of implicit dereferencing, I thought it would cause more trouble than it's worth.Overloading * doesn't work with D. But, instead:What are the arguments against that, again?I do not recall, and it's worth at least reviewing why we don't have opDeref and opDerefAssign. As you know, Walter, a major benefit of C++-style iterators is that they are semantically as close as possible to pointers. Would [] be changed to dereference pointers?You've always been able to: int* p; p[1] = ...s/these both return the same type/their return types are the same/I sure hope not. If we go this route, I would suggest adding these unary-* overloads.No, I meant Index.Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessI think you mean opSlice and opSliceAssign.Yes, though they will return a value, not a type.Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.So, I would assume these iterators would work something like this: If a class provides .begin and .end, and these both return the same type, and that type provides the requisite iterator methods, it is foreach-able.It is slightly easier and clearer to explicitly say: foreach(e; t.begin) { } than foreach(e; &t.opApply) { } opApply is an operator overload. .begin would just be a special method. I would expect the operator to take precedence.If a class provides .begin, .end, and opApply, one of the two iteration methods has to take precedence. I would hope it's opApply.I was leaning towards the other way around.Due to the way in which D's type inference from properties works, you'll need typeof(T.begin()). Otherwise, it will derive the type of the method itself. (Which isn't even a valid type, as such.)For a type T, its associated iterator type should be available via T.iterator. This has to be standard.It's not needed, as typeof(T.begin) will do it.I'm referring specifically to the no-index form of opSlice (meaning i[] would "dereference" the iterator). (A no-index opIndex doesn't work.) This of course is inconsistent with pointers, so saying i[0] is better on that count.An iterable type T MUST provide these methods: I begin() I end()Yes.An iterator type I MUST provide the following overloads: E opSlice()No. opIndex()Oh, there's another one: int opEquals(I)I opPostInc()Probably opAddAssign()Neither one is particularly elegant.E is the type of an element in the collection.These are enough to describe a read-only forward iterator, which is adequate for foreach. In other words, given a type T that meets the requirements outlined above, the following is adequate to iterate through it: foreach(e; t) { // ... } 'e' would be of type E.Yes.We run into a curious problem involving pre-increment, which uses opAddAssign. Classes for which random-access is problematic should not be required to provide this, which is essentially a random-access method. There are two options here: 1) Only require that iterators support opPostInc. (This further removes them from actual pointers.) 2) Introduce a convention whereby a non-random-access iterator throws an exception if any value other than 1 is passed to its opAddAssign overload.I could go either way with that.Without the ability to overload the dereference operator, any attempt to write a class that behaves like a pointer in D will be unwieldy, or at least ugly. Admittedly, this isn't much of an issue when just using foreach, but might be an issue with any kind of STL-like algorithms library. (Which is what you're inviting when using C++'s iterator semantics.)Extending these rules for bidirectional iterators and random-access iterators, as well as read-write iterators, is left as an exercise to the reader. (And not a very difficult one.) All of these strange edge-cases and differences between pointers and possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR D. D, unlike C++, does not supply enough operator overloads for their semantics to be identical to those of pointers, which was the whole point of C++-style iterators in the first place.I think it does provide enough, in fact, it will be less wacky than in C++ (look at the wretched iterator/const_iterator dichotomy). It'll work with core arrays, and will not need those typedefs.-- Kirk McDonald Pyd: Wrapping Python with D http://pyd.dsource.orgInstead, we should look to other languages for inspiration. I suggest Java and Python, whose iterator semantics are similar. Other posters have already made suggestions on this front. I would only suggest the "opIter" method for getting an iterator, and that the iterator class itself be available via T.iterator (whether it is a nested class or an alias shouldn't matter). Classes can provide methods for returning alternate iterators, as well. (A class providing opIter should be iterated over as easily as an iterator itself.)
Nov 06 2006
Kirk McDonald wrote:Walter Bright wrote:I must have misunderstood you. I thought you were asking if [n] worked on pointers. Yes, it does.You've always been able to: int* p; p[1] = ...Wouldn't that be p[0] = ...?Is that how you intend to use opIndex to "dereference" the iterator? That's /terrifying/. Although it is consistent with pointers, using [0] for everything is just asking for trouble with regards to non-random-access iterators. And though it's probably not totally meaningless to a random reader of the code, it comes pretty darned close.I agree it might be startling at first because it's unusual. Once one is past that, however, is it that bad?The: foreach(e; t.begin) { } would not be supported. It would be: foreach (e; t) { } If an aggregate had both, and iterators would be chosen by default, overriding with: foreach (e; &t.opApply) { } would not require any new conventions or syntax. So I think this is the right way to go.It is slightly easier and clearer to explicitly say: foreach(e; t.begin) { } than foreach(e; &t.opApply) { } opApply is an operator overload. .begin would just be a special method. I would expect the operator to take precedence.If a class provides .begin, .end, and opApply, one of the two iteration methods has to take precedence. I would hope it's opApply.I was leaning towards the other way around.Yes, you're right.Due to the way in which D's type inference from properties works, you'll need typeof(T.begin()). Otherwise, it will derive the type of the method itself. (Which isn't even a valid type, as such.)For a type T, its associated iterator type should be available via T.iterator. This has to be standard.It's not needed, as typeof(T.begin) will do it.Ok, I understand now where you were coming from with the slice.No. opIndex()I'm referring specifically to the no-index form of opSlice (meaning i[] would "dereference" the iterator). (A no-index opIndex doesn't work.) This of course is inconsistent with pointers, so saying i[0] is better on that count.Yes, I'd overlooked that.Oh, there's another one: int opEquals(I)I opPostInc()Probably opAddAssign()True, but that inelegance is hidden away in the library, but the flexibility is probably worth it.I could go either way with that.Neither one is particularly elegant.I still think it isn't necessary, but you might be thinking of a use case I haven't. Can you provide an example?I think it does provide enough, in fact, it will be less wacky than in C++ (look at the wretched iterator/const_iterator dichotomy). It'll work with core arrays, and will not need those typedefs.Without the ability to overload the dereference operator, any attempt to write a class that behaves like a pointer in D will be unwieldy, or at least ugly. Admittedly, this isn't much of an issue when just using foreach, but might be an issue with any kind of STL-like algorithms library. (Which is what you're inviting when using C++'s iterator semantics.)
Nov 06 2006
Walter Bright wrote:Kirk McDonald wrote:The only thing that comes to mind is if you get multiple indirections. For instance, in C++ I frequently end up creating things like vectors of lists or lists of vectors. Or vectors of pointers. Or sets of list iterators. Then to actually use the things I need two dereferences. But with built-in GC, you don't really need smart pointers so often. And with "dot-is-all-you-need" that also eliminates many cases for dereferences. But still I'm sure you could come up with situations where to get at the thing pointed to you'd need a string of [0]'s iteriteriter[0][0][0] Actually, in some ways it's an improvement over the prefix deref operator because it reads consistently from left to right. In c++ if you had iterator to vector of iterator and you wanted to deref it you'd need: *((*iterveciter)[i]) Or maybe the inner parens are unnecessary. I can't recall, which is another reason why using postfix for everything is better -- no question about precedence rules. So I'm sold on dereferencing being a postfix operation. But I would still like it better if there were some way to get a compile-time error if I try to do iter[2] // Error! This iterator isn't random access! Would introducing a special symbol be of any use? Like [*]? iter[*] It would just compile to opIndex(0), but it declares the intent "this is dereferencing, not arbitrary indexing". I guess it's kind of pointless if there's no compiler support for enforcing its use. I guess it would only be useful if there were an opDeref. Anyway, whatever you do with dereferencing -- I'm convinced you should absolutely not make it a unary prefix operator. It should be postfix. --bbI still think it isn't necessary, but you might be thinking of a use case I haven't. Can you provide an example?I think it does provide enough, in fact, it will be less wacky than in C++ (look at the wretched iterator/const_iterator dichotomy). It'll work with core arrays, and will not need those typedefs.Without the ability to overload the dereference operator, any attempt to write a class that behaves like a pointer in D will be unwieldy, or at least ugly. Admittedly, this isn't much of an issue when just using foreach, but might be an issue with any kind of STL-like algorithms library. (Which is what you're inviting when using C++'s iterator semantics.)
Nov 06 2006
"Walter Bright" <newshound digitalmars.com> wrote in message news:eio6u5$1j6o$1 digitaldaemon.com...Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessSomehow.. foreach(i; foo) { auto x = i[0]; i[0] = x + 1; } You often say that operator overloading should be used only what it should be used for, and taking the index of an iterator means.. what exactly? Would it be nicer to just use a .value property instead? foreach(i; foo) { auto x = i.value; i.value = x + 1; }
Nov 06 2006
Jarrett Billingsley wrote:"Walter Bright" <newshound digitalmars.com> wrote in message news:eio6u5$1j6o$1 digitaldaemon.com...The problem is if the iterator is an actual pointer, there is no .value property for a pointer. Another possibility is to have: *x rewritten to: x[0] if x is not a pointer.Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessSomehow.. foreach(i; foo) { auto x = i[0]; i[0] = x + 1; } You often say that operator overloading should be used only what it should be used for, and taking the index of an iterator means.. what exactly? Would it be nicer to just use a .value property instead? foreach(i; foo) { auto x = i.value; i.value = x + 1; }
Nov 06 2006
"Walter Bright" <newshound digitalmars.com> wrote in message news:eiosro$28g6$1 digitaldaemon.com...The problem is if the iterator is an actual pointer, there is no .value property for a pointer. Another possibility is to have: *x rewritten to: x[0] if x is not a pointer.And yet another is to allow overriding of opDeref and opDerefAssign ;) Which would, of course, be the _optimal_ solution, but..
Nov 06 2006
Walter Bright wrote:Jarrett Billingsley wrote:Is there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects. Sean"Walter Bright" <newshound digitalmars.com> wrote in message news:eio6u5$1j6o$1 digitaldaemon.com...The problem is if the iterator is an actual pointer, there is no .value property for a pointer.Overload opIndex for rvalue access Overload opIndexAssign for lvalue accessSomehow.. foreach(i; foo) { auto x = i[0]; i[0] = x + 1; } You often say that operator overloading should be used only what it should be used for, and taking the index of an iterator means.. what exactly? Would it be nicer to just use a .value property instead? foreach(i; foo) { auto x = i.value; i.value = x + 1; }
Nov 06 2006
Sean Kelly wrote:Well, they could. It's up to Mr. Compiler Writer if a pointer has a value property or not.Would it be nicer to just use a .value property instead? foreach(i; foo) { auto x = i.value; i.value = x + 1; }The problem is if the iterator is an actual pointer, there is no .value property for a pointer.Is there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.Agreed. I don't think I've ever actually used a raw pointer with a fancy STL algorithm. I was actually really surprised the first time I saw it in the examples on the SGI STL page. Other than those examples, I've never seen it in real code. And that should go double for D, where you rarely even see pointers to begin with. --bb
Nov 06 2006
Bill Baxter wrote:Sean Kelly wrote:Consider the following struct: struct Foo { int value; } Foo* p; p.value; Is p.value the entire contents of Foo (as it would be for a proposed .value property) or just Foo.value? "value" is a pretty common field name.Well, they could. It's up to Mr. Compiler Writer if a pointer has a value property or not.The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
Nov 06 2006
Walter Bright wrote:Bill Baxter wrote:It's a name conflict. They happen. I'd have trouble if I wanted to have a sizeof member in my struct too. If you want to be sure to get the struct's member then use p[0].value. 'Value' is too common, though. If you were going to go that route it should definitely be something like .iterVal or some other word less likely to conflict. I'm not saying it's the right way to go, just that it could be made to work if needed. --bbSean Kelly wrote:Consider the following struct: struct Foo { int value; } Foo* p; p.value; Is p.value the entire contents of Foo (as it would be for a proposed .value property) or just Foo.value? "value" is a pretty common field name.Well, they could. It's up to Mr. Compiler Writer if a pointer has a value property or not.The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
Nov 06 2006
Walter Bright wrote:Bill Baxter wrote:Maybe that's why Ada used "`" as a property marker rather than "."? Of course there's the Python approach...naming it __value__. A suitable solution might be to name it "valueof". That's a much less common field name. The problem is that this is something used quite frequently, so one would want it to be short. Perhaps "V"? Then one would write struct Foo { int value; } Foo* p; p.value; p.V; and have two distinct forms. (Yes, V is used occasionally as a variable name...but pretty rarely.)Sean Kelly wrote:Consider the following struct: struct Foo { int value; } Foo* p; p.value; Is p.value the entire contents of Foo (as it would be for a proposed .value property) or just Foo.value? "value" is a pretty common field name.Well, they could. It's up to Mr. Compiler Writer if a pointer has a value property or not.The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
Nov 07 2006
Bill Baxter wrote:Sean Kelly wrote:I've used it a few times, but generally more when I'm throwing together a quick test than writing actual production code. But since D arrays aren't the same as C++ arrays and we have the option of getting a robust object back (via a.begin() or whatever), why not take it? SeanWell, they could. It's up to Mr. Compiler Writer if a pointer has a value property or not.Would it be nicer to just use a .value property instead? foreach(i; foo) { auto x = i.value; i.value = x + 1; }The problem is if the iterator is an actual pointer, there is no .value property for a pointer.Is there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.Agreed. I don't think I've ever actually used a raw pointer with a fancy STL algorithm. I was actually really surprised the first time I saw it in the examples on the SGI STL page. Other than those examples, I've never seen it in real code.
Nov 07 2006
Sean Kelly wrote:Is there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.Can you be more specific about what the problems and solutions are?
Nov 06 2006
Walter Bright wrote:Sean Kelly wrote:For one thing, if pointers still are supported, then all algorithms have to be written so they can use them. If we skip "real pointer compatibility", then we can decide on smarter iterators. Example: With pointer-compatible iterators, it is impossible to traverse the tree. With smarter iterators one could.Is there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.Can you be more specific about what the problems and solutions are?
Nov 07 2006
Walter Bright wrote:Sean Kelly wrote:A direct application of pointers as iterators couldn't be done exactly the C++ way because C++ uses traits templates to describe the operations an iterator supports. And D will not overload templates from different modules, so the user could not provide overloads for his own iterators. This isn't a huge problem however, as we could probably write a set of generic traits templates that apply to classes/structs implemented the 'right' way rather than requiring the user to provide his own. This would be roughly similar to how a user can derive from std::iterator<> to add the required typedefs to his object in C++. More generally though, I find the need to use two iterators to identify a range to be somewhat cumbersome and annoying. And there have been enough others who feel this way that there has been at least an informal proposal for "all in one" iterators for C++. I haven't bothered to check whether it was ever formalized though. Basically, I'm wondering whether it might be easier and more D-like to use iterators for stepping and opIndex for random access, instead of iterators for everything. Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators. SeanIs there really any reason to support pointers as iterators though? C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.Can you be more specific about what the problems and solutions are?
Nov 07 2006
Sean Kelly wrote:Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.
Nov 07 2006
Walter Bright wrote:Sean Kelly wrote:I went ahead and wrote everything up during my commute this morning. It's posted separately as "Iterator straw-man." And I apologize for it being a bit longer than expected :-) SeanSince D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.
Nov 07 2006
Sean Kelly wrote:Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.(among others), but without significant C++ experience, my concept of an iterator is: "an object with methods for getting the current element, determining whether a 'next' element exists, and for advancing the cursor to the next element". The C++ iterators look very weird (and very very wrong) to me. Search and sort routines should use opIndex, opIndexAssign, and opSlice. If an object doesn't implement opIndex or opIndexAssign, it's not sortable using generic sorting algorithms. Tough luck. Iteration, though, is completely distinct from sorting and searching, and should use a different mechanism. Walter Bright wrote:Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.Perhaps an iterator always steps through an array (or an object implementing a List or Iterable interface), and in order for an object to be foreach-able, you'll need to first transform an object into an array (or construct a List or an Iterable wrapping the object). Arrays could still get their own special-case implementation of foreach, avoiding interfaces for maximum speed. And, if a class supports opIndex, then you could avoid making a virtual method call through the Iterable interface. But I think it's also desirable, in many many cases, to implement an Iterator class, without supporting opIndex. The only methods in the iterator would be hasNext() and next(). Foreach should support real arrays, classes with opIndex, and Java-style iterators. Searching and sorting should only use opIndex, opIndex, opCmp, and opEquals. Iterators should only be used for iteration. Anyhoo, those are my two cents. --Benji Smith
Nov 07 2006
Walter Bright wrote:Sean Kelly wrote:OOh, I like the direction this is heading. Sean I don't have time to read the straw-man right now but that definitely solves my recursive mergesort question. And Benji's comment about simple iterators and ranges being distinct things that shouldn't be lumped together also seems on target. If iterators don't have to be ranges then it's simple to see how to support iterators over infinite or circular containers. They just never stop spitting out 'next()'s But how do you handle generically pointing to an element then? Like you have with an iterator to a node in linked list. I guess you can have iterators with hasPrev() getPrev() (or --) type methods too. --bbSince D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.
Nov 07 2006
Bill Baxter escribió:But how do you handle generically pointing to an element then? Like you have with an iterator to a node in linked list. I guess you can have iterators with hasPrev() getPrev() (or --) type methods too. --bbIn fact, Java has them: http://java.sun.com/j2se/1.5.0/docs/api/java/util/ListIterator.html
Nov 07 2006
Walter Bright wrote:Sean Kelly wrote:Hang on; aren't we back to where we are *right now*? Things that need random access override opIndex and opIndexAssign. Things which you can take a range of values from override opSlice and opSliceAssign. Things that require lock-step iteration override opApply, or supply a function that returns a delegate that "runs" a foreach loop. About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series). I know some people want to be able to stop iteration and resume it, but couldn't that be solved by making opApply return a generator like what Python does? Those things are written the same way as a current opApply (sans all the return value from the delegate stuff), and can be advanced manually. I know this can be done. I wrote the coroutine module for StackThreads a while back, and one of the *very first* things I did was implement generic generators. The base generator class essentially just implemented opApply switching over to the coroutine every time it needed the next value. Here's a port of Python's range() function (well, the one and zero argument versions :P):Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.class range : Generator!(uint) { // These just let you use range() or range(n) static range opCall() { return range(uint.max); } static range opCall(uint limit) { return new range(limit); } protected: uint limit; // Or you can use new range(n) this(uint limit) { this.limit = limit; super(); } // Where the magic happens. override uint run() { uint i = 0; while( i < limit ) yield(i++); // yield in D: *very* sexy StopGenerator(); // throws an exception, imitating what // Python does to stop an iterator. } }Now, from that, we can implement opApply AND next/current in the base class. Going with coroutines kills both the opApply and forward iterator birds with one stone. As for bi-directional iterators... I actually can't think of anywhere I'd use them. I just keep thinking "I could just use random access to the same end". Python's never suffered from not having them... Anyway, just some random thoughts. Haven't had coffee yet :3 -- Daniel -- Unlike Knuth, I have neither proven or tried the above; it may not even make sense. v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
Nov 07 2006
Daniel Keep wrote:Walter Bright wrote:Close, but right now we don't have a good way to iterate over multiple things at once. for(i,j; i.hasNext()&&j.hasNext(); i++,j++) { // do something with current values of i and j } Or as you say to stop iteration or a generic way to return a pointer to a particular spot in a container.Sean Kelly wrote:Hang on; aren't we back to where we are *right now*?Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).[intersting stuff about generators and D]If generators can handle the above cases *AND* do it with code that simple to create and use *AND* do all that as efficiently as for loops, then it sounds great to me. My impression was that any sort of opApply/coroutine thing in D was not going to be so efficient. --bb
Nov 07 2006
Bill Baxter wrote:Daniel Keep wrote:Well, as I said, once you have your sequence expressed as a coroutine, writing hasNext() and getCurrent() is trivial. Once you have that, iterating multiple things in lockstep is also trivial:Walter Bright wrote:Close, but right now we don't have a good way to iterate over multiple things at once. for(i,j; i.hasNext()&&j.hasNext(); i++,j++) { // do something with current values of i and j } Or as you say to stop iteration or a generic way to return a pointer to a particular spot in a container.Sean Kelly wrote:Hang on; aren't we back to where we are *right now*?Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).[intersting stuff about generators and D]If generators can handle the above cases *AND* do it with code that simple to create and use *AND* do all that as efficiently as for loops, then it sounds great to me. My impression was that any sort of opApply/coroutine thing in D was not going to be so efficient. --bbclass zip : Generator!(Tuple!(T1,T2)) { // ... Generator!(T1) seq1; Generator!(T2) seq2; // ... void run() { while( seq1.hasNext && seq2.hasNext ) yield(new Tuple!(T1,T2)(seq1.next,seq2.next)); } // ... }Or something to that effect :P As for efficiency, I'm not sure how to go about testing that so that the results mean anything. I'm sure my current implementation could be made faster, but I don't know how. Just for loose comparison, the game EVE Online has the record for the most number of people simultaneously online on a single server at once. And it's servers are built using coroutines running in *Python*. I suppose the bottleneck (if it is one) is how fast you can do user-space context switches. Which is basically a function call, moving the base of the stack, and switching over stuff like the exception handling registers. Or somesuch... Mikola Lysenko wrote StackThreads; my attempt only mostly worked :P -- Daniel -- Unlike Knuth, I have neither proven or tried the above; it may not even make sense. v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
Nov 07 2006
Daniel Keep wrote:I suppose the bottleneck (if it is one) is how fast you can do user-space context switches. Which is basically a function call, moving the base of the stack, and switching over stuff like the exception handling registers. Or somesuch... Mikola Lysenko wrote StackThreads; my attempt only mostly worked :PThat's pretty much it. Saving the current register data, loading the other register data, and jumping to a new instruction address. There's no way this can be as efficient as an integer increment instruction (for loop iteration) but it's orders of magnitude faster than a kernel thread context switch. Sean
Nov 08 2006
Daniel Keep wrote:Things that need random access override opIndex and opIndexAssign. Things which you can take a range of values from override opSlice and opSliceAssign. Things that require lock-step iteration override opApply, or supply a function that returns a delegate that "runs" a foreach loop. About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).What about containers where there are multiple valid orders of iteration? For a binary tree class, I might want to perform iteration using depth-first, breadth-first, in-order, or reverse-order traversals. How does the opApply/opApplyReverse solution address these needs? And isn't a delegate-call about as computationally expensive as a virtual method call (in the case of an Iterable iterface)? If the delegate solution is really better than an Iterable interface, why create special cases for opApply and opApplyReverse? Why not just say that foreach always requires an iteration delegate? --benji
Nov 08 2006
Benji Smith wrote:Daniel Keep wrote:It should be roughly equivelant in cost to explicitly calling the method, meaning if the method is virtual then you will get that cost, and if it is final then it is like any function call. Since Iterable methods will very likely almost always be virtual, there is at least a chance of delegates being a little cheaper /some of the time/ -- when they can be pointers to final methods.Things that need random access override opIndex and opIndexAssign. Things which you can take a range of values from override opSlice and opSliceAssign. Things that require lock-step iteration override opApply, or supply a function that returns a delegate that "runs" a foreach loop. About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).What about containers where there are multiple valid orders of iteration? For a binary tree class, I might want to perform iteration using depth-first, breadth-first, in-order, or reverse-order traversals. How does the opApply/opApplyReverse solution address these needs? And isn't a delegate-call about as computationally expensive as a virtual method call (in the case of an Iterable iterface)?If the delegate solution is really better than an Iterable interface, why create special cases for opApply and opApplyReverse? Why not just say that foreach always requires an iteration delegate? --benjiWell, when foreach(;) was first added to D, I don't think anyone had thought about delegates-as-aggregates yet. In fact, it was just added a couple of versions back. So I suppose in a sense, opApply*Reverse is just legacy -- though it makes it straight-forward to iterate simple, linear-only collections. I suppose it would be safe enough to do away with opApply*Reverse now, but I don't think it hurts anything to leave them be, either. -- Chris Nicholson-Sauls
Nov 08 2006
Daniel Keep wrote:Walter Bright wrote:Essentially. But D has no formal description of an iterator type. I think it would be a useful convention to establish, and doing so requires determining what types of iterators should be supported, and how.Sean Kelly wrote:Hang on; aren't we back to where we are *right now*?Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.Things that need random access override opIndex and opIndexAssign. Things which you can take a range of values from override opSlice and opSliceAssign. Things that require lock-step iteration override opApply, or supply a function that returns a delegate that "runs" a foreach loop. About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).This wouldn't be too difficult to do by adding a prev() method and atBegin() or some such. Or maybe the Java style of hasNext(), next(), hasPrev(), prev().I know some people want to be able to stop iteration and resume it, but couldn't that be solved by making opApply return a generator like what Python does? Those things are written the same way as a current opApply (sans all the return value from the delegate stuff), and can be advanced manually. I know this can be done. I wrote the coroutine module for StackThreads a while back, and one of the *very first* things I did was implement generic generators. The base generator class essentially just implemented opApply switching over to the coroutine every time it needed the next value. Here's a port of Python's range() function (well, the one and zero argument versions :P):Interesting idea--I'll admit I don't have much experience with Python. But I think both approaches have value. Using coroutines just to step across two sequences simultaneously seems unnecessarily complex, and not terribly efficient compared to a simple pointer increment. But it does offer a tremendous amount of power for more complex cases.class range : Generator!(uint) { // These just let you use range() or range(n) static range opCall() { return range(uint.max); } static range opCall(uint limit) { return new range(limit); } protected: uint limit; // Or you can use new range(n) this(uint limit) { this.limit = limit; super(); } // Where the magic happens. override uint run() { uint i = 0; while( i < limit ) yield(i++); // yield in D: *very* sexy StopGenerator(); // throws an exception, imitating what // Python does to stop an iterator. } }Now, from that, we can implement opApply AND next/current in the base class. Going with coroutines kills both the opApply and forward iterator birds with one stone.As for bi-directional iterators... I actually can't think of anywhere I'd use them. I just keep thinking "I could just use random access to the same end". Python's never suffered from not having them...I don't think I've ever needed them either, but they can come in handy when implementing one container in terms of another. Say an iterable stack and queue (unlikely, I know) that are implemented via a doubly linked-list. Sean
Nov 08 2006
Sean Kelly wrote:Daniel Keep wrote:I wrote some mesh (graph) manipulation algorithms in Python. I really needed a linked list for that and a way to point to e.g. an edge in a list of edges coming out of a vertex that wouldn't be invalidated by inserting edges here and there before or after it, and I needed both to find the previous edge and the following edge. In C++ std::list would have been the clear choice. Doing that with indexes into to a standard python list just isn't practical (I started out trying it that way because Python FAQ's seem to say "you don't need a linked list, just use a python list". But with a python list, every time you insert one new item you have to go find everyone out there that's holding onto an index into this list and get them to update their values. Just not practical. So sometimes bidirectional iterators are needed. Even in Python. --bbWalter Bright wrote:Essentially. But D has no formal description of an iterator type. I think it would be a useful convention to establish, and doing so requires determining what types of iterators should be supported, and how.Sean Kelly wrote:Hang on; aren't we back to where we are *right now*?Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me. ie. sort( a.begin, a.begin + 5 ); // sort the first 5 elements or sort( a[0 .. 5] ); I find the second to be cleaner to look at. But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. That definitely sounds like a good area to explore.Things that need random access override opIndex and opIndexAssign. Things which you can take a range of values from override opSlice and opSliceAssign. Things that require lock-step iteration override opApply, or supply a function that returns a delegate that "runs" a foreach loop. About the only thing this *doesn't* cover are bi-directional iterators (say, iterating back and forth over an infinite series).This wouldn't be too difficult to do by adding a prev() method and atBegin() or some such. Or maybe the Java style of hasNext(), next(), hasPrev(), prev().I know some people want to be able to stop iteration and resume it, but couldn't that be solved by making opApply return a generator like what Python does? Those things are written the same way as a current opApply (sans all the return value from the delegate stuff), and can be advanced manually. I know this can be done. I wrote the coroutine module for StackThreads a while back, and one of the *very first* things I did was implement generic generators. The base generator class essentially just implemented opApply switching over to the coroutine every time it needed the next value. Here's a port of Python's range() function (well, the one and zero argument versions :P):Interesting idea--I'll admit I don't have much experience with Python. But I think both approaches have value. Using coroutines just to step across two sequences simultaneously seems unnecessarily complex, and not terribly efficient compared to a simple pointer increment. But it does offer a tremendous amount of power for more complex cases.class range : Generator!(uint) { // These just let you use range() or range(n) static range opCall() { return range(uint.max); } static range opCall(uint limit) { return new range(limit); } protected: uint limit; // Or you can use new range(n) this(uint limit) { this.limit = limit; super(); } // Where the magic happens. override uint run() { uint i = 0; while( i < limit ) yield(i++); // yield in D: *very* sexy StopGenerator(); // throws an exception, imitating what // Python does to stop an iterator. } }Now, from that, we can implement opApply AND next/current in the base class. Going with coroutines kills both the opApply and forward iterator birds with one stone.As for bi-directional iterators... I actually can't think of anywhere I'd use them. I just keep thinking "I could just use random access to the same end". Python's never suffered from not having them...I don't think I've ever needed them either, but they can come in handy when implementing one container in terms of another. Say an iterable stack and queue (unlikely, I know) that are implemented via a doubly linked-list.
Nov 08 2006
On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.What are those cases ? Maybe we can find a way to fix the problem with opApply.
Nov 06 2006
Knud Sørensen wrote:On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:One such case is the usefulness of being able to provide an input iterator to a parsing function, which may itself pass the iterator off to other parsing functions.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.What are those cases ? Maybe we can find a way to fix the problem with opApply.
Nov 06 2006
Walter Bright wrote:Knud Sørensen wrote:Another case is the desire to iterate over multiple containers at the same time, say as you would do in a sorted list merge operation. opApply could handle this one if D had real coroutines (i.e. multiple opApply's could be running simultaneously) Another case is the basic desire to hold onto a pointer to a container for later use. For instance if you have an STL style linked list container where the implementation (the actual nodes) are hidden from the user. You still want the user to be able to keep something that functions like a pointer to a particular insertion point in the list (and be able to use that in generic sequence manipulation algorithms). --bbOn Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:One such case is the usefulness of being able to provide an input iterator to a parsing function, which may itself pass the iterator off to other parsing functions.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.What are those cases ? Maybe we can find a way to fix the problem with opApply.
Nov 06 2006
Walter Bright wrote:One such case is the usefulness of being able to provide an input iterator to a parsing function, which may itself pass the iterator off to other parsing functions.Some questions: Is your data structure a graph or a tree ? What does your iterator iterating ? I ask because I currently investigate the visitor pattern to traverse graph like data structures (see attached code for an example).
Nov 07 2006
Knud Sørensen wrote:On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:Mostly agreed. For example, one case I seem to find mentioned a few times is providing depth-first versus breadth-first iteration on a tree -- supposedly this is hard to do with foreach/opApply. While I do agree that opApply is not optimal (because you can't tell opApply which way you want to go) I thought this was part of the utility of the recent delegates-as-aggregates feature? The following is just off the very top of my head, so it might not be ideal/perfect, but I do believe it provides this ability, using current D: <code> struct Tree(int B = 2, T) { version (Tree_NoWarnings) {} else { static if (B == 0) { static assert(false, "A tree with zero branches is just useless."); } else static if (B == 1) { static assert(false, "A tree with only one branch may as well be an array."); } } Tree*[B] children ; T value ; int walkDepth (int delegate(inout Tree!(T)) dlg) { int result ; foreach (x; children) { if (x && result = x.walkDepth(dlg)) { goto end; } } if (result = dlg(this)) { goto end; } end: return result; } int walkBreadth (int delegate(inout Tree!(T)) dlg) { int result ; Tree!(T)*[] gen = [this] , next ; while (gen.length) { foreach (x; gen) { if (result = dlg(x.value)) { goto end; } foreach (y; x.children) { if (y) { next ~= y; } } } gen = next.dup; next.length = 0; } end: return result; } } Tree!(2, int) binary_int_tree ; // ... foreach (node; &binary_int_tree.walkDepth) { // ... } foreach (node; &binary_int_tree.walkBreadth) { // ... } </code> Unless I'm missing something entirely, I don't see any major utility in this case for having a seperate Iterator entity... -- Chris Nicholson-SaulsIt's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.What are those cases ? Maybe we can find a way to fix the problem with opApply.
Nov 07 2006
Chris Nicholson-Sauls wrote:Knud Sørensen wrote:Other way around. Tree traversal is harder to do with C++ style iterators, easy to do with opApply. I assume your code is correct but didn't look at it too closely. --bbOn Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:Mostly agreed. For example, one case I seem to find mentioned a few times is providing depth-first versus breadth-first iteration on a tree -- supposedly this is hard to do with foreach/opApply.It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.What are those cases ? Maybe we can find a way to fix the problem with opApply.
Nov 07 2006
"Walter Bright" <newshound digitalmars.com> wrote in message news:eio6u5$1j6o$1 digitaldaemon.com...It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyable So here's one design: .being property returns an iterator that starts at the beginning .end returns an iterator that is at the end Overloading * doesn't work with D. But, instead: Overload opIndex for rvalue access Overload opIndexAssign for lvalue access Overloading opIndex also will work for random access foreach loops will not be able to have a 'key' parameter.It's not a huge problem, but passing an argument needlessly is both ugly and slightly inefficient. Perhaps opIndex and opIndexAssign could be overloaded without any parameters. -Craig
Nov 07 2006
Walter Bright wrote:It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult. So hear are the design goals: 1) Works with dynamic and stat arrays 2) Doesn't need to work with associative arrays 3) Should be possible to achieve "pointer efficiency" with it 4) Needs to be able to restrict lvalue access (what C++ does with const iterators) 5) Needs to work seamlessly with foreach 6) Iterators need to be copyableI apologize for this long post. Writing too much has become a bad habit of mine lately it seems. This took a while writing, and in the meantime new posts appeared that say much of the same. If you don't bother reading this, read Sean's posts instead, as I agree 100 % with him on this. :) What I suggest is that we should not try to copy the C++ "iterator == pointer" idiom, but instead aim for an iterator design that is self-contained and simple. For me, the iterator concept is simple. An iterator is an entity that allows an iterative way of accessing data. An iterator needs: 1. a way to reference the data 2. a way to traverse the data 3. a way to signal the end of the data second iterator that just signals the end of the iterative range.) An iterator could also be a generator. Here are some examples (translated to D) of how iterators are implemented in some languages other than C++: Java ---- class Iterator(T) { bool hasNext(); // self explanatory T* next(); // steps and returns reference to next element } while(i.hasNext()) writefln(*i.next()); ----------------------- class Iterator(T) { bool MoveNext(); // false if no more elements T Current(); // property getter } while(i.MoveNext) writefln(i.Current); Python ------ class Iterator(T) { T next(); } while(true) try writefln(i.next); catch (StopIteration) break; Taking inspiration from the above and following Walter's design goals above, here is my suggestion: interface Iterator(T) { T value; // getter (rvalue access) T value(T); // setter (lvalue access) int opApply(...) // support foreach style iteration } An alternative is supplying a value property that returns a pointer to the element, but that would not allow limiting lvalue access for read only iterators and introduces a pointer, which might be bad by itself. Usage: // explicit iteration while(i.step) { writefln(i.value); i.value = 5; } // implicit iteration foreach(o; i) writefln(o); I will briefly explain my reasoning: As I said, I believe the iterator should be clean and simple. Saying that it is OK for the iterator implementations to be dirty and messy, but that the implementations will be hidden away in libraries and only be the headache of library writers will lead to just that - only library writers will write iterators. As Sean hints at, "random access iterators" are not pure iterators. The random access part is orthogonal to the iterative part. In D, a perfect way to implement a random access view is defining a .length property and an opIndex method. A random access view doesn't need to be iterable and an iterator doesn't need to provide random access. A D slice is the perfect example of a random access view. I see no reason to use operator overloading when the result isn't replaceable with a pointer anyway. Bill Baxter's points regarding iterator as a concept vs an interface/abstract class are very valid. And as Bill says, maybe both should be supported. One of them could always be converted to the other. Regarding efficiency, here is one sample implementation that a compiler should(*) be able to make just as efficient as for/foreach loops: struct Iterator(T) { T* ptr; T* end; bool step() { return ++ptr != end; } T value() { return *ptr; } T value(T x) { return *ptr = x; } // mixin OpApplyMixin; } Iterator!(T) iter(T)(T[] x) { Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE ret.ptr = x.ptr-1; ret.end = x.ptr+x.length; return ret; } void main() { int[] arr = [1,2,3,4,5]; auto i = arr.iter(); while(i.step) { writefln("%s",i.value); i.value = 5; } } /Oskar
Nov 07 2006
interface Iterator(T) { T value; // getter (rvalue access) T value(T); // setter (lvalue access) int opApply(...) // support foreach style iteration }I think that this is more better. interface Iterator(T) { // also support in pointer with opBack(). bool opNext(int); // speed hack T opValue; // getter (rvalue access). also support in pointer T opValue(T); // setter (lvalue access). also support in pointer int opApply(...) // support foreach style iteration } while(iter.next)//call opNext() writefln(*iter);//call opValue()
Nov 07 2006
Oskar Linde wrote:As Sean hints at, "random access iterators" are not pure iterators. The random access part is orthogonal to the iterative part. In D, a perfect way to implement a random access view is defining a .length property and an opIndex method. A random access view doesn't need to be iterable and an iterator doesn't need to provide random access. A D slice is the perfect example of a random access view.You two might be right.I see no reason to use operator overloading when the result isn't replaceable with a pointer anyway.Right.Bill Baxter's points regarding iterator as a concept vs an interface/abstract class are very valid. And as Bill says, maybe both should be supported. One of them could always be converted to the other.Yes.Regarding efficiency, here is one sample implementation that a compiler should(*) be able to make just as efficient as for/foreach loops: struct Iterator(T) { T* ptr; T* end; bool step() { return ++ptr != end; } T value() { return *ptr; } T value(T x) { return *ptr = x; } // mixin OpApplyMixin; } Iterator!(T) iter(T)(T[] x) { Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE ret.ptr = x.ptr-1; ret.end = x.ptr+x.length; return ret; } void main() { int[] arr = [1,2,3,4,5]; auto i = arr.iter(); while(i.step) { writefln("%s",i.value); i.value = 5; } }It's good, but the pointer-before-the-start is problematic. This can cause problems on some architectures. Pointing past the end is explicitly supported, but before the beginning is not. I'd also prefer the step() get 'stuck' at the end, instead of going past it if it gets called again (this makes sense for input iterators). So I suggest instead: for (auto i = arr.iter(); !i.atEnd(); i.step()) ... in other words, separate out the 'end' test from the 'step' operation.
Nov 07 2006
Walter Bright wrote:For arrays, we already have arr.ptr, what if array.end was simply added? auto slice = array[50..100]; for(auto i = slice.ptr; i != slice.end; i++) { ... } Then for UDT custom iterators, ptr, end and op[Pre|Post][Inc|Dec] could be overloaded? Just a thought..struct Iterator(T) { T* ptr; T* end; bool step() { return ++ptr != end; } T value() { return *ptr; } T value(T x) { return *ptr = x; } // mixin OpApplyMixin; } Iterator!(T) iter(T)(T[] x) { Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE ret.ptr = x.ptr-1; ret.end = x.ptr+x.length; return ret; } void main() { int[] arr = [1,2,3,4,5]; auto i = arr.iter(); while(i.step) { writefln("%s",i.value); i.value = 5; } }It's good, but the pointer-before-the-start is problematic. This can cause problems on some architectures. Pointing past the end is explicitly supported, but before the beginning is not. I'd also prefer the step() get 'stuck' at the end, instead of going past it if it gets called again (this makes sense for input iterators). So I suggest instead: for (auto i = arr.iter(); !i.atEnd(); i.step()) ... in other words, separate out the 'end' test from the 'step' operation.
Nov 12 2006
I don't like C++ iterators; they're a very old design made to fit a language that was unmodifiable. They can't work with destructive or unbound iterators (well, they can, but you just gotta hope to hell the user doesn't use the interface in an unusual way), they can't work with associative arrays or tree structures without allocation (or wasting space in the structure for parent node pointers), and they're VERY verbose and distributed. Plus they allow iterating to an illegal index (the end index), which requires an additional state value for some iterables. Python's generators are a far better solution, like: void i_range (T) (T from, T to, T step = cast (T) 1, yield T index) { for (index = from; index < to; index += step) yield; } foreach (i; i_range (0, 100)) ... Or: yield T i_range (T) (T from, T to, T step = cast (T) 1) { T index; for (index = from; index < to; index += step) yield index; } You've been meaning to allow frame pointers for proper closures anyway, right? :P The implementation in iterators using the simplest interface I can think of would be something like: struct IRange (T, T from, T to, T step = cast (T) 1) { T index = from; static IRange opCall (T value) { IRange result; result.index = value; return result; } T opThis () { return index; } IRange opNext () { assert (index <= end); return opCall (index + 1); } IRange opEnd () { return opCall (to); } } foreach (i; IRange! (int, 0, 100)) ... Notice how the SINGLE line indicating how the arguments are interpreted are hidden within a whole bunch of boilerplate.
Nov 10 2006
As nice as yield is for some types of problems, they aren't going to happen in D anytime soon, because they're a lot of tricky work to implement.
Nov 10 2006