digitalmars.D - opNext: Simplified ranges
- Tomer Weka (59/59) Nov 04 2022 I find the empty/front/popNext protocol very cumbersome to
- Joseph Rushton Wakeling (15/20) Nov 05 2022 Out of curiosity: what do you think are the pros and cons of this
- rikki cattermole (20/20) Nov 05 2022 ```d
- rikki cattermole (4/5) Nov 05 2022 return !_haveValue;
- Steven Schveighoffer (20/23) Nov 05 2022 Not sure what this means. breaking a foreach does not consume the next
- Paul Backus (37/48) Nov 05 2022 You'd think, but if you actually try this, it turns out it
- Dukc (15/23) Nov 05 2022 This works:
- Steven Schveighoffer (24/79) Nov 05 2022 What is happening is the compiler is not using duck typing exactly, and
- Walter Bright (2/2) Nov 05 2022 We didn't use the "next" protocol for ranges because one cannot test for...
- Tomer Weka (5/7) Nov 06 2022 But that's exactly the issue - I have to produce the element in
- Imperatorn (2/10) Nov 06 2022 Doesn't one the solutions from Steven or Dukec work?
- Steven Schveighoffer (11/18) Nov 06 2022 The issue there is that you are making a copy, and the copy has already
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (6/8) Nov 06 2022 That's not a blocker if we can afford implementing a second
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (4/8) Nov 06 2022 It all boils down to fitting the design to the most common use
- Sebastiaan Koppe (6/10) Nov 06 2022 I actually think the most lowest range should be one with
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (4/6) Nov 06 2022 I don't grasp what you mean. Can you give enlighten me with a
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (3/9) Nov 06 2022 1. Ahh, is this inspired by the concurrency primitives?
- Tejas (10/20) Nov 06 2022 You have talked about redesigning ranges multiple times before as
- Sebastiaan Koppe (8/14) Nov 07 2022 I am afraid such a document, if done properly, is tantamount to
- Timon Gehr (3/14) Nov 07 2022 This does more than just check for existence; the range will have to
- Joseph Rushton Wakeling (12/14) Nov 10 2022 Aren't there cases (e.g. some kinds of IO) where that's
- Joseph Rushton Wakeling (10/13) Nov 10 2022 BTW, note that the `empty` check is a more information-limited
- victoroak (5/20) Nov 10 2022 Yes, in some cases you have to consume the inner stream in either
I find the empty/front/popNext protocol very cumbersome to implement for most cases (at least in the Weka codebase), which causes people to just prefer opApply (even though it has its own set of issues). Today I even ran into a problem that cannot be solved using the existing protocol: a consuming range (think a file stream) that should not consume the element if the foreach has been broken. Using opApply, this is easy (the delegate returns nonzero), but using popNext it's next to impossible (or at least really really cumbersome). I don't want to go into all the details, so let's just jump to the suggested protocol. ``` bool opNext(out T elem) {...} ``` `opNext` returns true if it "produced" an element, false if it reached the end. That way ``` foreach (elem; range) { ... } ``` Simply gets lowered to ``` T elem; while (range.opNext(elem)) { ... } ``` Of course this protocol can live side-by-side with the existing protocol, just take precedence in the lowering rules. A concrete example: ``` struct LinkedList { int value; LinkedList* next; struct Range { LinkedList* cursor; bool opNext(out int val) nothrow nogc { if (cursor is null) { return false; } val = cursor.value; return true; } } auto members() {return Range(&this);} } foreach (val; myList.members) { ... } ``` Notes: * `save()` can work for the new protocol as well, it just needs to duplicate the iterator's state * Maybe it would be wise to complement this with opPrev for bidirectional ranges, but I have to say I've rarely seen any use of them * I think chaining with map/filter/etc. is even easier to implement in the new protocol
Nov 04 2022
On Saturday, 5 November 2022 at 06:01:54 UTC, Tomer Weka wrote:``` bool opNext(out T elem) {...} ``` `opNext` returns true if it "produced" an element, false if it reached the end.Out of curiosity: what do you think are the pros and cons of this versus, say, something like: ``` Option!(T) opNext() { ... } ``` Where `Option` is an implementation of the option type, and hence encapsulates the question of whether an element was produced or not? (IIRC this has been another suggestion for a simplified/redesigned range concept.) An `Option`-based design has the advantage of not requiring a persistent `front`, which may fit better with true input ranges (e.g. reading from a stream). OTOH it may make things more clunky for the case where there is persistent underlying data that a `front` method could refer to.
Nov 05 2022
```d mixin template NextToRange(ElementType) { ElementType _value; bool _haveValue; property { bool empty() { return _haveValue; } ElementType front() { assert(_haveValue); return _value; } } void popFront() { _haveValue = nextValue(_value); } } ``` You don't need to break a large percentage of the standard library to use this ;)
Nov 05 2022
On 06/11/2022 12:34 AM, rikki cattermole wrote:return _haveValue;return !_haveValue; Oops, that should be reversed. Thanks Bruce Carneal!
Nov 05 2022
On 11/5/22 2:01 AM, Tomer Weka wrote:Today I even ran into a problem that cannot be solved using the existing protocol: a consuming range (think a file stream) that should not consume the element if the foreach has been broken.Not sure what this means. breaking a foreach does not consume the next element. Now, foreach *does* make a copy of the range. Which means that the copy is going to consume the current, while the original will not get it (and the copy is gone once the loop is over). This can be solved by a wrapper: ```d struct RefOf(T) { private T *_val; ref T _get() { return *_val; } alias _get this; } auto refOf(T)(return ref T val) { return RefOf!T(&val); } foreach(elem; range.refOf) // now doesn't make a copy. ``` Could also be solved with syntax changes to foreach. Maybe `foreach(elem; ref range)`? -Steve
Nov 05 2022
On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:This can be solved by a wrapper: ```d struct RefOf(T) { private T *_val; ref T _get() { return *_val; } alias _get this; } auto refOf(T)(return ref T val) { return RefOf!T(&val); } foreach(elem; range.refOf) // now doesn't make a copy. ```You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the `alias this`, and makes a copy of `range.refOf._get` (the actual range object) instead of `range.refOf` (the wrapper). So, for example, this usage code prints `[1, 2, 3]`, demonstrating that the range is *not* consumed by the `foreach` loop: ```d void main() { import std.range, std.stdio; auto r = only(1, 2, 3); foreach (e; refOf(r)) {} writeln(r); } ``` In the `-vcg-ast` output, you can see the call to `_get` inserted by the compiler: ```d void main() { import std.range; import std.stdio; OnlyResult!(int, int, int) r = only(1, 2, 3); { OnlyResult!(int, int, int) __r47 = refOf(r)._get().opSlice(); for (; !__r47.empty(); __r47.popFront()) { int e = __r47.front(); } } writeln(r); return 0; } ```
Nov 05 2022
On Saturday, 5 November 2022 at 18:08:39 UTC, Paul Backus wrote:On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:This works: ```D void main() { import std.range : inputRangeObject, refRange; import std.stdio; auto testPiece = [1,2,3,4,5]; foreach(el; (&testPiece).refRange.inputRangeObject) { if(el == 3) break; } writeln(testPiece); // [3, 4, 5] } ``` It's far from satisfying though. Ugly to begin with, and worse, incompatible with any function attributes.This can be solved by a wrapper:You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the `alias this`, and makes a copy of `range.refOf._get` (the actual range object) instead of `range.refOf` (the wrapper).
Nov 05 2022
On 11/5/22 2:08 PM, Paul Backus wrote:On Saturday, 5 November 2022 at 15:17:42 UTC, Steven Schveighoffer wrote:It's not "smart", just picky.This can be solved by a wrapper: ```d struct RefOf(T) { private T *_val; ref T _get() { return *_val; } alias _get this; } auto refOf(T)(return ref T val) { return RefOf!T(&val); } foreach(elem; range.refOf) // now doesn't make a copy. ```You'd think, but if you actually try this, it turns out it doesn't work: the compiler is "smart" enough to see through the `alias this`, and makes a copy of `range.refOf._get` (the actual range object) instead of `range.refOf` (the wrapper).So, for example, this usage code prints `[1, 2, 3]`, demonstrating that the range is *not* consumed by the `foreach` loop: ```d void main() { import std.range, std.stdio; auto r = only(1, 2, 3); foreach (e; refOf(r)) {} writeln(r); } ``` In the `-vcg-ast` output, you can see the call to `_get` inserted by the compiler: ```d void main() { import std.range; import std.stdio; OnlyResult!(int, int, int) r = only(1, 2, 3); { OnlyResult!(int, int, int) __r47 = refOf(r)._get().opSlice(); for (; !__r47.empty(); __r47.popFront()) { int e = __r47.front(); } } writeln(r); return 0; } ```What is happening is the compiler is not using duck typing exactly, and it also has some pretty puzzling logic. 1. If it has the ability to slice, it will *always do this at least once*. See the bug reported long ago: https://issues.dlang.org/show_bug.cgi?id=14619 2. The definition of a range is that it contains a `front` *member*. This cannot be via `alias this`. So this works: ```d struct RefOf(T) { private T *_val; auto front() { return _get.front; } RefOf opSlice() { return this; } ref T _get() { return *_val; } alias _get this; } ``` Take away from that what you will. (also I forgot there actually is something called `std.range.refRange` which *is supposed to do this*, but apparently still broken). -Steve
Nov 05 2022
We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.
Nov 05 2022
On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.But that's exactly the issue - I have to produce the element in the range's ctor, which essentially requires a "non-popping popNext" to determine if it's empty and assign the front -tomer
Nov 06 2022
On Sunday, 6 November 2022 at 07:55:13 UTC, Tomer Weka wrote:On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:Doesn't one the solutions from Steven or Dukec work?We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.But that's exactly the issue - I have to produce the element in the range's ctor, which essentially requires a "non-popping popNext" to determine if it's empty and assign the front -tomer
Nov 06 2022
On 11/6/22 2:55 AM, Tomer Weka wrote:On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:The issue there is that you are making a copy, and the copy has already consumed the front element. It would be no different with your API as well -- if you break from your range type, the element is gone. What we need is non-copyable input ranges. I have advocated for this for a while. In essence, a forward range should be copyable, an input range should not be, and we should remove `save`. Then when you foreach it, you *can't* make a copy, so it doesn't let you discard the element you are looking at. The other option is -- don't use foreach. Use a while loop like you wrote. -SteveWe didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.But that's exactly the issue - I have to produce the element in the range's ctor, which essentially requires a "non-popping popNext" to determine if it's empty and assign the front
Nov 06 2022
On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.That's not a blocker if we can afford implementing a second function for the "peek" case. This is the path Rust took via its `Peekable` trait and `peek()` member as described at https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html .
Nov 06 2022
On Sunday, 6 November 2022 at 21:06:01 UTC, Per Nordlöw wrote:That's not a blocker if we can afford implementing a second function for the "peek" case. This is the path Rust took via its `Peekable` trait and `peek()` member as described at https://doc.rust-lang.org/stable/std/iter/struct.Peekable.htmlIt all boils down to fitting the design to the most common use case. So is the most common case to be able to peek or not? I really don't know right. Does anybody have any insights into this?
Nov 06 2022
On Sunday, 6 November 2022 at 21:08:10 UTC, Per Nordlöw wrote:It all boils down to fitting the design to the most common use case. So is the most common case to be able to peek or not? I really don't know right. Does anybody have any insights into this?I actually think the most lowest range should be one with destructive read semantics; non-peekable, non-copyable. You can build everything on top of that. I also think ranges should be redesigned to have a 2 stage api. One to build the pipeline, one to iterate.
Nov 06 2022
On Sunday, 6 November 2022 at 22:32:00 UTC, Sebastiaan Koppe wrote:I also think ranges should be redesigned to have a 2 stage api. One to build the pipeline, one to iterate.I don't grasp what you mean. Can you give enlighten me with a show case?
Nov 06 2022
On Sunday, 6 November 2022 at 22:47:59 UTC, Per Nordlöw wrote:On Sunday, 6 November 2022 at 22:32:00 UTC, Sebastiaan Koppe wrote:1. Ahh, is this inspired by the concurrency primitives? 2. Are there any other languages that use such a 2 stage api?I also think ranges should be redesigned to have a 2 stage api. One to build the pipeline, one to iterate.I don't grasp what you mean. Can you give enlighten me with a show case?
Nov 06 2022
On Sunday, 6 November 2022 at 22:32:00 UTC, Sebastiaan Koppe wrote:On Sunday, 6 November 2022 at 21:08:10 UTC, Per Nordlöw wrote:You have talked about redesigning ranges multiple times before as well, plus have expierience using them non-trivially for quite some time. Could you please create a post/document that contains all your ideas/thoughts about what a redesigned range would look like? The maybe someone could actually work towards implementing it. If not directly within phobos, then atleast they could create a package and we could test the new ranges' efficacy for real world usageIt all boils down to fitting the design to the most common use case. So is the most common case to be able to peek or not? I really don't know right. Does anybody have any insights into this?I actually think the most lowest range should be one with destructive read semantics; non-peekable, non-copyable. You can build everything on top of that. I also think ranges should be redesigned to have a 2 stage api. One to build the pipeline, one to iterate.
Nov 06 2022
On Monday, 7 November 2022 at 02:49:13 UTC, Tejas wrote:Could you please create a post/document that contains all your ideas/thoughts about what a redesigned range would look like? The maybe someone could actually work towards implementing it. If not directly within phobos, then atleast they could create a package and we could test the new ranges' efficacy for real world usageI am afraid such a document, if done properly, is tantamount to doing the full redesign. Otherwise it is just a loose collection of rough ideas, valuable to no one but myself. The reason I brought them up is to gauge interest in and of them. The next step for me would be to experiment with them, and see if they hold up. If that is fruitful we will have something real to discuss.
Nov 07 2022
On 11/6/22 22:06, Per Nordlöw wrote:On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:This does more than just check for existence; the range will have to cache the next element, which is the issue OP is trying to fix I think.We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.That's not a blocker if we can afford implementing a second function for the "peek" case. This is the path Rust took via its `Peekable` trait and `peek()` member as described at https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html .
Nov 07 2022
On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.Aren't there cases (e.g. some kinds of IO) where that's unavoidable in any case? The "next" approach is surely the most information-limited case, where you really can't separate between checking for existence of the next element, and consuming it. Other possibilities, such as the presence of a check whether the range is `empty`, or the possibility to `peek`, would then be less information-limited options on top of that. I apologize if I'm out of date on some discussions which have already taken place around the range design, and so forcing you to re-explain something that's already been worked through.
Nov 10 2022
On Thursday, 10 November 2022 at 15:25:30 UTC, Joseph Rushton Wakeling wrote:Other possibilities, such as the presence of a check whether the range is `empty`, or the possibility to `peek`, would then be less information-limited options on top of that.BTW, note that the `empty` check is a more information-limited version of peekability. For example, an algorithmic range (such as a range-based view of random number generation) can tell you that it's not empty, but may not be able to tell you what the next value will be. OTOH I would imagine that most (all?) ranges across the contents of containers should be able both to tell you if the range is empty or not _and_ allow you to peek at the next in sequence.
Nov 10 2022
On Thursday, 10 November 2022 at 15:25:30 UTC, Joseph Rushton Wakeling wrote:On Sunday, 6 November 2022 at 01:48:04 UTC, Walter Bright wrote:Yes, in some cases you have to consume the inner stream in either the constructor or in the empty call. I would also like for a more simpler range with only `next`.We didn't use the "next" protocol for ranges because one cannot test for existence of the next element without consuming it.Aren't there cases (e.g. some kinds of IO) where that's unavoidable in any case? The "next" approach is surely the most information-limited case, where you really can't separate between checking for existence of the next element, and consuming it. Other possibilities, such as the presence of a check whether the range is `empty`, or the possibility to `peek`, would then be less information-limited options on top of that. I apologize if I'm out of date on some discussions which have already taken place around the range design, and so forcing you to re-explain something that's already been worked through.
Nov 10 2022