digitalmars.D.learn - User defined type and foreach

Tony <tonytdominguez aol.com> writes:
I made a stack data type and gave it an opIndex() so it could be
turned into a dynamic array, and also created empty() (unfortunate
name), front() and popFront() methods, which I read allow it to
be used with foreach.

However, when I use the class with foreach, opIndex() gets
called to create a dynamic array, rather than the
empty(), front() and popFront() routines being used. I would prefer
that it use the three methods rather than create a dynamic array.
Nov 16 2017
Andrea Fontana <nospam example.com> writes:
On Thursday, 16 November 2017 at 08:03:48 UTC, Tony wrote:
 I made a stack data type and created an opIndex() so it could 
 be turned into a dynamic array, and created empty() 
 (unfortunate name), front() and popFront() methods, which I 
 read allow it to be used with foreach.

 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use the 
 three methods, rather than create a dynamic array.
You can try to implement opApply(). Check: http://ddili.org/ders/d.en/foreach_opapply.html
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Thursday, 16 November 2017 at 08:26:25 UTC, Andrea Fontana 
wrote:
 On Thursday, 16 November 2017 at 08:03:48 UTC, Tony wrote:
 I made a stack data type and created an opIndex() so it could 
 be turned into a dynamic array, and created empty() 
 (unfortunate name), front() and popFront() methods, which I 
 read allow it to be used with foreach.

 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use the 
 three methods, rather than create a dynamic array.
You can try to implement opApply(). Check: http://ddili.org/ders/d.en/foreach_opapply.html
Thanks. Interesting that that page does not mention the behavior I am seeing, which is that foreach over a user-defined datatype can be implemented with only a 'T[] opIndex()' method.
Nov 16 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, November 16, 2017 08:43:17 Tony via Digitalmars-d-learn wrote:
 On Thursday, 16 November 2017 at 08:26:25 UTC, Andrea Fontana

 wrote:
 On Thursday, 16 November 2017 at 08:03:48 UTC, Tony wrote:
 I made a stack data type and created an opIndex() so it could
 be turned into a dynamic array, and created empty()
 (unfortunate name), front() and popFront() methods, which I
 read allow it to be used with foreach.

 However, when I use the class with foreach, the opindex gets
 called to create a dynamic array, rather than use the
 empty(),front(),popFront() routines. I would prefer it use the
 three methods, rather than create a dynamic array.
You can try to implement opApply(). Check: http://ddili.org/ders/d.en/foreach_opapply.html
Thanks. Interesting that that page does not mention the behavior I am seeing, which is that foreach over a user-defined datatype can be implemented with only a 'T[] opIndex()' method.
What you're seeing is intended for containers. If you implement opIndex or opSlice with no parameters so that you can slice the object with no indices, the idea is that it's going to return a range. foreach knows about this so that you can then iterate over a container with foreach rather than having to explicitly call a function on the container to get the range to then iterate over. e.g. with

RedBlackTree rbt;
...

you can do

foreach(e; rbt) {...}

instead of

foreach(e; rbt[]) {...}

though if you then pass the container to a range-based function, you're still going to have to explicitly slice it.

- Jonathan M Davis
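A minimal sketch of the container convention described above. The Stack type and its push method are hypothetical, made up for illustration (they are not the original poster's class), and a slice of the internal array stands in for the range that opIndex returns:

```d
import std.algorithm : sum;
import std.stdio : writeln;

// Hypothetical container, for illustration only.
struct Stack
{
    private int[] data;

    void push(int x) { data ~= x; }

    // Zero-argument opIndex: `s[]` yields a range over the contents.
    // Here the range is simply a slice of the internal array.
    int[] opIndex() { return data; }
}

void main()
{
    Stack s;
    s.push(1);
    s.push(2);
    s.push(3);

    // foreach slices the container implicitly...
    foreach (e; s)
        writeln(e);

    // ...but a range-based function needs the explicit slice.
    writeln(s[].sum);
}
```

Because the loop iterates over the slice returned by opIndex, the container itself is left untouched and can be iterated again.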
Nov 16 2017
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/16/17 3:03 AM, Tony wrote:
 I made a stack data type and created an opIndex() so it could be turned 
 into a dynamic array, and created empty() (unfortunate name), front() 
 and popFront() methods, which I read allow it to be used with foreach.
 
 However, when I use the class with foreach, the opindex gets called to 
 create a dynamic array, rather than use the empty(),front(),popFront() 
 routines. I would prefer it use the three methods, rather than create a 
 dynamic array.
Remove the opIndex, and see if it works. If it does, then this is a bug. The compiler should try to use the type as a range before seeing if opIndex gives it something.

-Steve
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Thursday, 16 November 2017 at 12:56:18 UTC, Steven 
Schveighoffer wrote:
 On 11/16/17 3:03 AM, Tony wrote:
 I made a stack data type and created an opIndex() so it could 
 be turned into a dynamic array, and created empty() 
 (unfortunate name), front() and popFront() methods, which I 
 read allow it to be used with foreach.
 
 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use the 
 three methods, rather than create a dynamic array.
Remove the opIndex, and see if it works. If it does, then this is a bug. The compiler should try to use the type as a range before seeing if opIndex gives it something.
Yes, if I remove opIndex() from the class, doing a foreach loop causes popFront(),empty(), and front() to be called and the foreach loop works.
Nov 16 2017
ag0aep6g <anonymous example.com> writes:
On 11/16/2017 09:03 AM, Tony wrote:
 However, when I use the class with foreach, the opindex gets called to 
 create a dynamic array, rather than use the empty(),front(),popFront() 
 routines. I would prefer it use the three methods, rather than create a 
 dynamic array.
https://issues.dlang.org/show_bug.cgi?id=14619
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Thursday, 16 November 2017 at 13:10:11 UTC, ag0aep6g wrote:
 On 11/16/2017 09:03 AM, Tony wrote:
 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use the 
 three methods, rather than create a dynamic array.
https://issues.dlang.org/show_bug.cgi?id=14619
And also this one which was marked as a duplicate of yours https://issues.dlang.org/show_bug.cgi?id=16374 "When lowering a foreach, the compiler gives priority to opSlice over front/popFront/empty, which is counter-intuitive (and also undocumented)." "opSlice" -> "opSlice or opIndex".
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Thursday, 16 November 2017 at 13:35:13 UTC, Tony wrote:
 On Thursday, 16 November 2017 at 13:10:11 UTC, ag0aep6g wrote:
 On 11/16/2017 09:03 AM, Tony wrote:
 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use 
 the three methods, rather than create a dynamic array.
https://issues.dlang.org/show_bug.cgi?id=14619
And also this one which was marked as a duplicate of yours https://issues.dlang.org/show_bug.cgi?id=16374 "When lowering a foreach, the compiler gives priority to opSlice over front/popFront/empty, which is counter-intuitive (and also undocumented)." "opSlice" -> "opSlice or opIndex".
That should be "opSlice or opIndex-with-no-parameters"
Nov 16 2017
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/16/17 8:10 AM, ag0aep6g wrote:
 On 11/16/2017 09:03 AM, Tony wrote:
 However, when I use the class with foreach, the opindex gets called to 
 create a dynamic array, rather than use the empty(),front(),popFront() 
 routines. I would prefer it use the three methods, rather than create 
 a dynamic array.
https://issues.dlang.org/show_bug.cgi?id=14619
I took a shot at fixing. Way more complex than I realized. -Steve
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Thursday, 16 November 2017 at 18:34:54 UTC, Steven 
Schveighoffer wrote:
 On 11/16/17 8:10 AM, ag0aep6g wrote:
 On 11/16/2017 09:03 AM, Tony wrote:
 However, when I use the class with foreach, the opindex gets 
 called to create a dynamic array, rather than use the 
 empty(),front(),popFront() routines. I would prefer it use 
 the three methods, rather than create a dynamic array.
https://issues.dlang.org/show_bug.cgi?id=14619
I took a shot at fixing. Way more complex than I realized.
I was initially miffed that I had added empty(), popFront() and pop() and they weren't being used, but I don't have a problem with using [] instead of them. Maybe call it a feature and document it. But I do have a complaint about the methods empty(), popFront() and pop(). I think they should have a special syntax or name to reflect that they are not general purpose methods. __empty() or preferably __forEachDone(). empty() is typically used to say if a container has no data, not if you are at the end of external foreach loop processing. pop() and popFront() also would typically have different meanings with certain containers and their names don't reflect that they have a special "external foreach loop" purpose.
Nov 16 2017
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Nov 17, 2017 at 01:06:31AM +0000, Tony via Digitalmars-d-learn wrote:
[...]
 But I do have a complaint about the methods empty(), popFront() and
 pop(). I think they should have a special syntax or name to reflect
 that they are not general purpose methods. __empty() or preferably
 __forEachDone().  empty() is typically used to say if a container has
 no data,  not if you are at the end of external foreach loop
 processing. pop() and popFront() also would typically have different
 meanings with certain containers and their names don't reflect that
 they have a special "external foreach loop" purpose.
It should be .empty, .popFront, and .front, not .pop.

Also, these methods are *range* primitives, and over time, we have come to a consensus that generally speaking, it's a bad idea to conflate containers with ranges over containers. The main thing is that iterating over a range is supposed to consume it, which is usually not what you want with a container.

The usual idiom is to separate the two concepts, and have the container provide a mechanism for returning a range over its contents, usually via .opIndex with no arguments, or .opSlice. Then you would just write:

	foreach (e; myContainer[]) { // [] calls .opIndex/.opSlice
		...
	}

Unfortunately, built-in arrays, which are also ranges, are one exception to this rule that, due to their ubiquity in D, also serve to mislead newcomers to D about when/where range primitives should be implemented. Generally speaking, built-in arrays should not be considered exemplary in this respect, but rather should be understood as exceptions. The general convention is to separate your containers from ranges over its contents, and to provide .opIndex / .opSlice that constructs a range over the container when needed.

The other consideration is that if you don't really need range functionality, i.e., the only thing you want to do with your container is to put it in a foreach loop, then you can sidestep this whole mess and just implement .opApply for your container and call it a day. Of course, then you won't be able to use generic algorithms like those in std.algorithm with your container, but if you didn't intend to anyway, it's not a big deal.


T

--
Heads I win, tails you lose.
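For the opApply route mentioned above, a minimal sketch might look like the following. The Bag type is hypothetical and exists only for illustration; the opApply signature is the usual delegate-taking form:

```d
import std.stdio : writeln;

// Hypothetical container, for illustration only.
struct Bag
{
    private int[] items;

    // foreach (e; bag) calls this with the loop body wrapped in a delegate.
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref item; items)
        {
            // A non-zero result means the loop body left early
            // (break, return, goto), so it is passed straight back.
            if (auto result = dg(item))
                return result;
        }
        return 0;
    }
}

void main()
{
    auto bag = Bag([10, 20, 30]);

    foreach (e; bag)
        writeln(e);

    // The container is not consumed, so a second loop sees everything again.
    foreach (e; bag)
        writeln(e);
}
```

As the post says, a type like this works in foreach but is not a range, so it cannot be handed to std.algorithm functions directly.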
Nov 16 2017
Tony <tonytdominguez aol.com> writes:
On Friday, 17 November 2017 at 01:16:38 UTC, H. S. Teoh wrote:

 It should be .empty, .popFront, and .front, not .pop.

 Also, these methods are *range* primitives, and over time, we 
 have come to a consensus that generally speaking, it's a bad 
 idea to conflate containers with ranges over containers.  The 
 main thing is that iterating over a range is supposed to 
 consume it, which is usually not what you want with a container.

 The usual idiom is to separate the two concepts, and have the 
 container provide a mechanism for returning a range over its 
 contents, usually via .opIndex with no arguments, or .opSlice. 
 Then you would just write:

 	foreach (e; myContainer[]) { // [] calls .opIndex/.opSlice
 		...
 	}

 Unfortunately, built-in arrays, which are also ranges, are one 
 exception to this rule that, due to their ubiquity in D, also 
 serve to mislead newcomers to D about when/where range 
 primitives should be implemented. Generally speaking, built-in 
 arrays should not be considered exemplary in this respect, but 
 rather should be understood as exceptions.  The general 
 convention is to separate your containers from ranges over its 
 contents, and to provide .opIndex / .opSlice that constructs a 
 range over the container when needed.

 The other consideration is that if you don't really need range 
 functionality, i.e., the only thing you want to do with your 
 container is to put it in a foreach loop, then you can sidestep 
 this whole mess and just implement .opApply for your container 
 and call it a day.  Of course, then you won't be able to use 
 generic algorithms like those in std.algorithm with your 
 container, but if you didn't intend to anyway, it's not a big 
 deal.


 T
Thanks T! Good information, especially "iterating over a range is supposed to consume it". I have been reading dlang.org->Documentation->Language Reference, but should have also read dlang.org->Dlang-Tour->Ranges. Although that page makes a distinction about "range consumption" with regard to a "reference type" or a "value type" and it isn't clear to me why there would be a difference.
Nov 16 2017
Mike Parker <aldacron gmail.com> writes:
On Friday, 17 November 2017 at 03:15:12 UTC, Tony wrote:

 Thanks T! Good information, especially "iterating over a range 
 is supposed to consume it". I have been reading 
 dlang.org->Documentation->Language Reference, but  should have 
 also read dlang.org->Dlang-Tour->Ranges. Although that page
You might also find use in this article (poorly adapted from Chapter 6 of Learning D by the publisher, but still readable): https://www.packtpub.com/books/content/understanding-ranges
 makes a distinction about "range consumption" with regard to a 
 "reference type" or a "value type" and it isn't clear to me why 
 there would be a difference.
With a value type, you're consuming a copy of the original range, so you can reuse it after. With a reference type, you're consuming the original range and therefore can't reuse it.

========
struct ValRange {
    int[] items;
    bool empty() @property { return items.length == 0; }
    int front() @property { return items[0]; }
    void popFront() { items = items[1 .. $]; }
}

class RefRange {
    int[] items;
    this(int[] src) { items = src; }
    bool empty() @property { return items.length == 0; }
    int front() @property { return items[0]; }
    void popFront() { items = items[1 .. $]; }
}

void main() {
    import std.stdio;

    int[] ints = [1, 2, 3];
    auto valRange = ValRange(ints);

    writeln("Val 1st Run:");
    foreach(i; valRange) writeln(i);
    assert(!valRange.empty);

    writeln("Val 2nd Run:");
    foreach(i; valRange) writeln(i);
    assert(!valRange.empty);

    auto refRange = new RefRange(ints);

    writeln("Ref 1st Run:");
    foreach(i; refRange) writeln(i);
    assert(refRange.empty);

    writeln("Ref 2nd Run:");
    foreach(i; refRange) writeln(i); // prints nothing
}
Nov 16 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, November 17, 2017 07:40:35 Mike Parker via Digitalmars-d-learn 
wrote:
 On Friday, 17 November 2017 at 03:15:12 UTC, Tony wrote:
 Thanks T! Good information, especially "iterating over a range
 is supposed to consume it". I have been reading
 dlang.org->Documentation->Language Reference, but  should have
 also read dlang.org->Dlang-Tour->Ranges. Although that page
You might also find use in this article (poorly adapted from Chapter 6 of Learning D by the publisher, but still readable): https://www.packtpub.com/books/content/understanding-ranges
 makes a distinction about "range consumption" with regard to a
 "reference type" or a "value type" and it isn't clear to me why
 there would be a difference.
With a value type, you're consuming a copy of the original range, so you can reuse it after. With a reference type, you're consuming the original range and therefore can't reuse it.
Technically, per the range API, you can _never_ reuse a range. The only legitimate way to get a copy of a range to then iterate over separately is to call save on the range (which of course requires it to then be a forward range and not just an input range). However, unfortunately, for many common range types (dynamic arrays included), calling save and copying the range have the same semantics. So, it's easy to write code that will work with many ranges without calling save anywhere but falls flat on its face as soon as you use a range that actually requires that save be called (typically because it's either a reference type, or it's a pseudo-reference type where only some state gets copied when the range is copied, and you get particularly weird behaviors when reusing the range that was copied).

So, while you can get away with reusing a range where save and copying the range do the same thing, it's an incredibly bad idea in general and definitely causes problems in generic code. Certainly, it should only be done when you're dealing with a specific range type where you know what the semantics of copying it are. In general, as soon as you've copied a range, it should never be used again unless it's assigned a new value.

Of course, if something is truly an input range (and not a forward range that merely doesn't have save declared like it should), then it's particularly bad to be trying to use copies of ranges, because any range that can't be a forward range is by definition either a reference type or a pseudo-reference type where a copy is not fully distinct from the original (typically where making a copy isn't possible or where it would be too expensive to do so).

Personally, I'm inclined to think that we should never have had save and should have required that reference type ranges which are forward ranges be wrapped in a struct where copying it does the same thing that save does now, but I seriously doubt that we could make a change that big now. And we'd still have to watch out for how input ranges are different, since copying them wouldn't and couldn't work the same (and while getting rid of save like that would really clean up some range stuff that uses forward ranges, it would make it a lot harder to distinguish between input and forward ranges). So, it's not like there would be a perfect solution even if we were redesigning things from scratch.

Ultimately, folks just need to be aware that they need to be calling save when they want to actually copy a range and make sure that they unit test their code well to make sure that it works with various range types if it's generic code (and that it works with the exact ranges that it uses if it's not generic). Unfortunately, it's usually the case that when you first test range-based code with a range that doesn't implicitly save when it's copied that you find that your code doesn't work with it, because save wasn't explicitly called when it needed to be. But at least you then catch it and can fix your code.

- Jonathan M Davis
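To make the role of save concrete, here is a small sketch of a class-based forward range. The Countdown class is hypothetical and only serves as an example of a reference-type range where copying and save differ:

```d
import std.range.primitives : isForwardRange;
import std.stdio : writeln;

// Hypothetical class-based range, for illustration only.
class Countdown
{
    private int n;
    this(int n) { this.n = n; }

    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    // save hands back an independent copy of the iteration state.
    @property Countdown save() { return new Countdown(n); }
}

static assert(isForwardRange!Countdown);

void main()
{
    auto r = new Countdown(3);

    foreach (e; r.save)   // iterates the saved copy
        writeln(e);
    assert(!r.empty);     // the original is untouched

    foreach (e; r)        // copies only the reference, so r itself is consumed
        writeln(e);
    assert(r.empty);
}
```

Generic code that calls save wherever it needs an independent copy behaves the same for this class as it does for a dynamic array.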
Nov 17 2017
Tony <tonytdominguez aol.com> writes:
On Friday, 17 November 2017 at 07:40:35 UTC, Mike Parker wrote:

 You might also find use in this article (poorly adapted from 
 Chapter 6 of Learning D by the publisher, but still readable):

 https://www.packtpub.com/books/content/understanding-ranges

 makes a distinction about "range consumption" with regard to a 
 "reference type" or a "value type" and it isn't clear to me 
 why there would be a difference.
With a value type, you're consuming a copy of the original range, so you can reuse it after. With a reference type, you're consuming the original range and therefore can't reuse it.
[...]
Thanks for the reference and the code. I will have to iterate over the packtpub text a while, consulting the docs.

I see that the code runs as you say, but I don't understand what's going on. You say with regard to a "value type": "you're consuming a copy of the original range", but I don't see anything different between the processing in the struct versus in the class. They both have a dynamic array variable that they re-assign a "slice" to (or maybe that is - that they modify to be the sliced version). Anyway, I can't see why the one in the struct shrinks and then goes back to what it was originally. It's like calls were made by the compiler that aren't shown.
Nov 17 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, November 17, 2017 17:37:01 Tony via Digitalmars-d-learn wrote:
 On Friday, 17 November 2017 at 07:40:35 UTC, Mike Parker wrote:
 You might also find use in this article (poorly adapted from
 Chapter 6 of Learning D by the publisher, but still readable):

 https://www.packtpub.com/books/content/understanding-ranges

 makes a distinction about "range consumption" with regard to a
 "reference type" or a "value type" and it isn't clear to me
 why there would be a difference.
With a value type, you're consuming a copy of the original range, so you can reuse it after. With a reference type, you're consuming the original range and therefore can't reuse it.
[...]
Thanks for the reference and the code. I will have to iterate over the packpub text a while consulting the docs. I see that the code runs as you say, but I don't understand what's going on. You say with regard to a "value type" : "you're consuming a copy of the original range" but I don't see anything different between the processing in the struct versus in the class. They both have a dynamic array variable that they re-assign a "slice" to (or maybe that is - that they modify to be the sliced version). Anyway, I can't see why the one in the struct shrinks and then goes back to what it was originally. It's like calls were made by the compiler that aren't shown.
When you have

foreach(e; range)

it gets lowered to something like

for(auto r = range; !r.empty; r.popFront())
{
    auto e = r.front;
}

So, the range is copied when you use it in a foreach. In the case of a class, it's just the reference that's copied. So, both "r" and "range" refer to the same object, but with a struct, you get two separate copies. So, when foreach iterates over "r", "range" isn't mutated.

So, in the general case, if you want to use a range in foreach without consuming the range, it needs to be a forward range, and you need to call save. e.g.

foreach(e; range.save)

For many ranges, copying a range is equivalent to calling save, but for some it is not (most notably classes, since copying a class reference just means that you get two references to the same object). So, it's pretty typical for folks to write code that doesn't use save where it should and that works just fine with dynamic arrays and many structs but which fails miserably when you pass it a class. Also, just because something is a struct doesn't mean that copying it does a deep enough copy. If multiple variables hold state in the struct, and some of them are reference types and some are value types, then copying the struct does not result in an independent copy - and you can get really weird results when that happens.

That's why it's important to test range-based functions with a variety of range types if it's intended to work with ranges in general as opposed to a specific type. Then you can ensure that you aren't accidentally relying on some aspect of a specific range type.

- Jonathan M Davis
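The "weird results" from a struct that mixes value and reference state can be sketched as follows. The Counted type is hypothetical and deliberately contrived: its slice is copied along with the struct, while its counter is shared through a pointer:

```d
import std.stdio : writeln;

// Hypothetical pseudo-reference range, for illustration only.
struct Counted
{
    int[] items;  // per-copy state: each copy has its own slice bounds
    int* popped;  // shared state: every copy points at the same counter

    @property bool empty() const { return items.length == 0; }
    @property int front() const { return items[0]; }
    void popFront() { items = items[1 .. $]; ++*popped; }
}

void main()
{
    auto r = Counted([1, 2, 3], new int);

    foreach (e; r) {}          // foreach iterates a copy of r

    writeln(r.items.length);   // 3: the original slice was never advanced
    writeln(*r.popped);        // 3: but the shared counter was
}
```

After the loop the original claims to hold three unread items while its own counter says three were popped, which is exactly the kind of half-copied state the post warns about.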
Nov 17 2017
Tony <tonytdominguez aol.com> writes:
On Friday, 17 November 2017 at 17:55:30 UTC, Jonathan M Davis 
wrote:
 When you have

 foreach(e; range)

 it gets lowered to something like

 for(auto r = range; !r.empty; r.popFront())
 {
     auto e = r.front;
 }

 So, the range is copied when you use it in a foreach. In the 
 case of a class, it's just the reference that's copied. So, 
 both "r" and "range" refer to the same object, but with a 
 struct, you get two separate copies. So, when foreach iterates 
 over "r", "range" isn't mutated.
Ah, I get it now ("r=range; process r"), thanks!
 So, in the general case, if you want to use a range in foreach 
 without consuming the range, it needs to be a forward range, 
 and you need to call save. e.g.

 foreach(e; range.save)
Seems like you can make class-based ranges work across multiple foreach calls without having to do save, although maybe it falls apart in other usage. It also doesn't appear that the compiler requires an @property annotation as specified in the interface:

import std.stdio : writeln;

class RefRange {
    int foreach_index;
    int[] items;
    this(int[] src)
    {
        items = src;
    }

    bool empty()
    {
        if (foreach_index == items.length)
        {
            foreach_index = 0; // reset for another foreach
            return true;
        }
        return false;
    }
    int front() { return items[foreach_index]; }
    void popFront() { foreach_index++; }
}

void main() {
    import std.stdio;

    int[] ints = [1, 2, 3];
    auto refRange = new RefRange(ints);

    writeln("Ref 1st Run:");
    foreach(i; refRange) writeln(i);
    assert( ! refRange.empty);

    writeln("Ref 2nd Run:");
    foreach(i; refRange) writeln(i); // works
}
------------------------------------------
Ref 1st Run:
1
2
3
Ref 2nd Run:
1
2
3
Nov 17 2017
Tony <tonytdominguez aol.com> writes:
On Saturday, 18 November 2017 at 05:24:30 UTC, Tony wrote:

Forgot to handle premature foreach exit:

import std.stdio : writeln;

class RefRange {
     int foreach_index;
     int[] items;
     this(int[] src)
     {
        items = src;
     }

     bool empty()
     {
        if (foreach_index == items.length)
        {
           foreach_index = 0; // reset for another foreach
           return true;
        }
        return false;
     }
     int front() { return items[foreach_index]; }
     void popFront() { foreach_index++; }
     void resetIteration() { foreach_index = 0; }
}

void main() {

     int[] ints = [1, 2, 3];
     auto refRange = new RefRange(ints);

     writeln("Ref 1st Run:");
     foreach(i; refRange)
     {
        writeln(i);
        if ( i == 2 )
        {
           refRange.resetIteration();
           break;
        }
     }
     assert( ! refRange.empty);
     writeln("Ref 2nd Run:");
     foreach(i; refRange) writeln(i); // works
}
-------------------------------
Ref 1st Run:
1
2
Ref 2nd Run:
1
2
3
Nov 17 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, November 18, 2017 05:24:30 Tony via Digitalmars-d-learn wrote:
 On Friday, 17 November 2017 at 17:55:30 UTC, Jonathan M Davis
 So, in the general case, if you want to use a range in foreach
 without consuming the range, it needs to be a forward range,
 and you need to call save. e.g.

 foreach(e; range.save)
Seems like you can make class-based ranges to work on multiple foreach calls without having to do save, although maybe it falls apart in other usage.
A range that was a class would just be completely consumed by foreach unless you used break to exit the loop early. But my point is that in generic code, you can't copy a range and then use it any more after copying it without assigning it a new range, because the behavior is unspecified, and different ranges will act differently. Similarly, in generic code, you must call save when you want to get another copy of the range that can then be iterated independently. In non-generic code, you can choose to depend on the behavior of the specific range that you're using, but for the same code to work with all types of ranges, you're a bit more restricted.
 It also doesn't appear that the compiler
 requires an @property annotation as specified in the interface :
@property does almost nothing in general. All it really does is affect how some stuff like typeof works. So, code introspection can be affected by @property, but whether you put @property on something like empty or front really doesn't matter. A lot of us do it out of habit and/or to make it clear that it's intended to be used as a property function, but any function which returns a value but has no parameters can be used as a getter property regardless of whether @property is used.

And what the range API really cares about is that the code in isInputRange, isForwardRange, etc. compiles, not that anything is marked with @property. It isn't even guaranteed that everything involved is a function - e.g. infinite ranges are ranges where empty is known to be false at compile time, and that's usually done by defining empty as an enum, in which case it's not a function at all. However, it can be called the same way, so the same code works with a range that defines empty as a function and a range that defines it as an enum or variable, since the range API expects empty to be called without parens.

Originally, @property was supposed to do more with optional parens going away, but it never really happened (optional parens became too popular once UFCS was added to the language). The main problem that it may yet be made to solve is property functions which return callables. Right now, if you use () on a property function (whether it has @property on it or not), they're the optional parens of the function call, whereas if the property were a variable, the parens would either trigger opCall or call a delegate (depending on the type of the variable). So, you can't currently turn a public variable that's a callable into a property function and have it work as a property function. @property could be made to indicate that in that case, the single set of parens should be called on the return value rather than the function itself, in which case, you could have a property that's a callable, but while that has been discussed, it's never happened.

So, for now at least, @property really doesn't do much.

- Jonathan M Davis
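A short sketch of the enum-empty point, assuming a hypothetical Naturals range invented for this example. Nothing in it is marked @property, and empty is not a function at all, yet the Phobos range traits accept it:

```d
import std.algorithm : equal;
import std.range : take;
import std.range.primitives : isInfinite, isInputRange;

// Hypothetical infinite range, for illustration only.
struct Naturals
{
    private int n;

    enum bool empty = false;         // known at compile time: never empty
    int front() const { return n; }  // plain function, no @property
    void popFront() { ++n; }
}

static assert(isInputRange!Naturals);
static assert(isInfinite!Naturals);

void main()
{
    // take bounds the infinite range so it can be compared with an array.
    assert(Naturals().take(3).equal([0, 1, 2]));
}
```

The traits only check that expressions such as r.empty and r.front compile, which is why an enum and a paren-less function call are interchangeable here.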
Nov 17 2017
Jim Balter <Jim Balter.name> writes:
On Friday, 17 November 2017 at 17:55:30 UTC, Jonathan M Davis 
wrote:

 When you have

 foreach(e; range)

 it gets lowered to something like

 for(auto r = range; !r.empty; r.popFront())
 {
     auto e = r.front;
 }

 So, the range is copied when you use it in a foreach.
Indeed, and the language spec says so, but this is quite wrong as it violates the specification and design of ranges ... only forward ranges are copyable and only via their `save` function. I have an input range that can only be iterated once; if you try to do so again it's empty ... but the foreach implementation breaks that. You should be able to break out of the foreach statement, then run it again (or another foreach) and it should continue from where it left off, but copying breaks that. I need to know how many elements of my range were consumed; copying breaks that. I got around this by having a pointer to my state so only the pointer gets copied.

I would also note that tutorials such as Ali Çehreli's "Programming in D – Tutorial and Reference" are unaware of this breakage:

"Those three member functions must be named as empty, popFront, and front, respectively. The code that is generated by the compiler calls those functions:

    for ( ; !myObject.empty(); myObject.popFront()) {
        auto element = myObject.front();
        // ... expressions ...
    }
"
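The effect described above can be reproduced with a small sketch. Counter and the two helper functions are hypothetical names chosen for illustration. The range is a plain struct that holds its own iteration state, which is the assumption that makes the copy visible.

```d
// A struct range that keeps its iteration state inside itself.
struct Counter
{
    int current;
    int limit;

    bool empty() const { return current >= limit; }
    int front() const { return current; }
    void popFront() { ++current; }
}

// Breaks out of a foreach at stopAt and reports where the original range is.
int positionAfterForeach(int stopAt)
{
    auto r = Counter(0, 5);
    foreach (e; r) // foreach iterates a copy of r
    {
        if (e == stopAt)
            break;
    }
    return r.current;
}

// Same loop written by hand, operating on r directly.
int positionAfterFor(int stopAt)
{
    auto r = Counter(0, 5);
    for (; !r.empty; r.popFront())
    {
        if (r.front == stopAt)
            break;
    }
    return r.current;
}

void main()
{
    assert(positionAfterForeach(2) == 0); // the original never advanced
    assert(positionAfterFor(2) == 2);     // the original was consumed up to 2
}
```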
Jan 19
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, January 19, 2024 3:49:29 AM MST Jim Balter via Digitalmars-d-learn 
wrote:
 On Friday, 17 November 2017 at 17:55:30 UTC, Jonathan M Davis

 wrote:
 When you have

 foreach(e; range)

 it gets lowered to something like

 for(auto r = range; !r.empty; r.popFront())
 {

     auto e = r.front;

 }

 So, the range is copied when you use it in a foreach.
No spec is being violated, and the behavior is completely expected. The core problem is that the range API doesn't actually specify the semantics of copying a range - and actually can't do so without making breaking changes.

D types in general fall into one of three categories with regards to their copy semantics:

1. value types
2. reference types
3. pseudo-reference types

When you copy a value type, you get two fully independent copies of the object (integers are a prime example of this; mutating a copy of an integer has no effect whatsoever on the original).

When you copy a reference type, you get two fully dependent copies. The type is either a pointer or a reference (or a struct that holds just a pointer or a reference), and copying it results in another pointer or reference to the same object. So, mutating the object affects all pointers or references to that object.

When you copy a pseudo-reference type, you get a partial copy. Typically, you're dealing with a struct which has both members which are value types and members which are reference types. The result is that some operations will affect only the object being mutated, whereas other operations will affect other copies of that object. Dynamic arrays are one example of this. They contain a pointer and a size_t which is the length of the array, and reducing the size of the array by mutating the pointer and the length has no effect on other dynamic arrays which point to the same data, but mutating the actual elements affects all dynamic arrays which point to the same data.

What this means for ranges is that depending on how they're implemented, you get one of four different behaviors when they're copied:

1. If the range is a value type, then copying the range results in two independent copies, so mutating the copy has no effect on the original. Code can iterate through both ranges independently.

2. If the range is a reference type, then copying the range results in two dependent copies, so mutating the copy mutates the original. Any code that iterates one of the two ranges then affects any code which would try to iterate the original, but the state is consistent across both ranges, because rather than containing their data, the data is elsewhere, and they both point to the same place.

3. If a range is a pseudo-reference type, and its iteration state is copied by value, then copying the range gives you the same behavior as a value type as far as iteration goes. Both the copy and the original can be iterated independently (though depending on the implementation, mutating the elements themselves could affect both ranges). Dynamic arrays fall in this category.

4. If a range is a pseudo-reference type, and its iteration state is not fully copied by value, then you end up with the copy and the original being partially dependent. This means that if you mutate one of them, it will only partially mutate the other, which tends to mean that the other ends up in an invalid state. A common situation where this can occur is if the range stores its front as a member variable, but the rest of its state is stored in another member variable which is a reference type. If you then call popFront on the copy, you'd end up with the copy's front changing, but the original's front wouldn't, and yet, the state they share for the rest of the range would be mutated, so if you then called popFront on the original, you wouldn't get the element that the copy got by calling popFront; you'd get the element after it - meaning that calling popFront on one range would then cause the other to skip an element. So, with pseudo-reference types that do not copy their iteration state by value, you basically can never use the original once you make a copy, because the original will be put into an invalid / inconsistent state by mutating the copy.
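The four behaviors can be seen side by side in a short sketch. ValueRange, RefRange and CachedRange are invented for this illustration and are deliberately minimal. CachedRange in particular is an assumed way of building the "cached front plus shared state" situation from case 4.

```d
import std.range.primitives : empty, front, popFront; // range primitives for arrays

// Case 1: value type; all iteration state lives in the struct itself.
struct ValueRange
{
    int i, n;
    bool empty() const { return i >= n; }
    int front() const { return i; }
    void popFront() { ++i; }
}

// Case 2: reference type; copying copies only the class reference.
final class RefRange
{
    int i, n;
    this(int i, int n) { this.i = i; this.n = n; }
    bool empty() const { return i >= n; }
    int front() const { return i; }
    void popFront() { ++i; }
}

// Case 4: pseudo-reference type; front is cached by value, while the
// value of the next element is shared through a pointer.
struct CachedRange
{
    int cached;
    int* next;
    int n;
    this(int n) { this.n = n; cached = 0; next = new int; *next = 1; }
    bool empty() const { return cached >= n; }
    int front() const { return cached; }
    void popFront() { cached = (*next)++; }
}

void main()
{
    // 1. Value type: the copy and the original iterate independently.
    auto v = ValueRange(0, 3);
    auto vc = v;
    vc.popFront();
    assert(v.front == 0 && vc.front == 1);

    // 2. Reference type: advancing the copy advances the original.
    auto r = new RefRange(0, 3);
    auto rc = r;
    rc.popFront();
    assert(r.front == 1 && rc.front == 1);

    // 3. Dynamic array: iteration state is copied, elements are shared.
    int[] a = [1, 2, 3];
    auto ac = a;
    ac.popFront();
    assert(a.front == 1 && ac.front == 2);
    ac[0] = 20; // ac[0] is the same element as a[1]
    assert(a[1] == 20);

    // 4. Partially shared state: the original ends up skipping element 1.
    auto p = CachedRange(5);
    auto pc = p;
    pc.popFront();
    assert(pc.front == 1 && p.front == 0);
    p.popFront();
    assert(p.front == 2);
}
```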
The result of all of this is that you cannot rely on the semantics of copying a range. When you make a copy, you can rely on that copy having the same elements in the same order that the original would have had if you hadn't made a copy (since otherwise, you couldn't pass ranges around), but you can't rely on what happens to the original when you mutate the copy. This means that in generic code, once you copy a range, you cannot safely do anything with the original (since you'll get completely different behavior from different range types).

In order for forward ranges to be able to get independent copies regardless of their copy semantics, we have the save function. This means that you can rely on save giving you an independent copy of the forward range, but you still can't rely on the semantics of copying the range, and there is no way to get independent copies of basic input ranges (and you can't rely on copies of basic input ranges being fully dependent either, since they could be pseudo-reference types).

This means that in generic code, once you pass a range to foreach, you cannot use the original any longer, because you've copied it, and the semantics of how the original is affected by iterating through the copy depend on how that particular range type is implemented. Non-generic code can get away with it by relying on the copy semantics of a particular range type, but generic code can't.

What this means in practice is that if you want foreach to use an independent copy of the range, you need to use save, and if you want to only partially iterate a range with a loop and then use the original afterwards, you cannot use foreach, because it copies the range, which then potentially puts the original range in an invalid state. And because you can't use save on a basic input range, once you pass a basic input range to foreach, you cannot safely use the original any longer. You either fully iterate through the range or you stop partway through and then never do anything with the rest of the range (since there is no safe way to do so in generic code), but there is no way to continue to iterate through the range once you pass it to foreach and then exit the foreach without relying on the copy semantics of a specific range type. So, for generic code, you'll need to use something other than foreach and avoid copying the range.

To fix this, we would need to make it so that the range API defined the copy semantics of ranges, but that doesn't work as long as forward ranges are treated as a more advanced version of basic input ranges, because basic input ranges and forward ranges inherently have different copy semantics. Basic input ranges cannot be value types (otherwise, they could implement save), which means that in order to have consistent copy semantics for them, we would have to require that they be reference types (e.g. by requiring that they be classes or pointers to structs). On the other hand, we can't make forward ranges require reference semantics, because that won't work with dynamic arrays (and really what we want for forward ranges is to have the same copying semantics as dynamic arrays - i.e. their iteration state is copied by value).

We could make forward ranges have consistent copy semantics by getting rid of save and requiring that copying a range results in a range that can be independently iterated (which would mean requiring that they be dynamic arrays or structs, forcing any reference type ranges to be wrapped in structs which do the equivalent of calling save internally when necessary to get an independent copy). And that way, we would have consistent copy semantics for basic input ranges and consistent copy semantics for forward ranges - but they wouldn't be the same copy semantics, so for generic code to be able to rely on the copy semantics, it would have to only support basic input ranges or only support forward ranges and not basic input ranges.

This means that in order to fix the problem, we need to change the range API, which would break code. Hopefully, we'll be able to do that with the next version of Phobos which is currently in the planning stages, but that will then only work with new code or code that is updated to use the new API. Code written to work with the current range API will always have the problem.

- Jonathan M Davis
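For reference, the two practical options mentioned above (use save with foreach, or avoid foreach when the original must stay usable) might look like the following in generic code. This is only a sketch: Counter, countRemaining and advanceTo are hypothetical names, and Counter is assumed to be a simple forward range whose save returns a copy of itself.

```d
import std.range.primitives : isForwardRange, isInputRange;

// A small forward range; save hands out an independent copy.
struct Counter
{
    int current, limit;

    bool empty() const { return current >= limit; }
    int front() const { return current; }
    void popFront() { ++current; }
    Counter save() { return this; }
}

static assert(isForwardRange!Counter);

// Counts the remaining elements without consuming the caller's range:
// foreach iterates the independent copy returned by save.
size_t countRemaining(R)(ref R r)
if (isForwardRange!R)
{
    size_t n = 0;
    foreach (e; r.save)
        ++n;
    return n;
}

// Partially iterates the caller's range in place. A plain for loop on a
// ref parameter makes no copy, so this also works for basic input ranges.
void advanceTo(R, E)(ref R r, E value)
if (isInputRange!R)
{
    for (; !r.empty; r.popFront())
    {
        if (r.front == value)
            break;
    }
}

void main()
{
    auto c = Counter(0, 5);

    advanceTo(c, 2);
    assert(c.front == 2);           // the original really advanced

    assert(countRemaining(c) == 3); // 2, 3 and 4 are left
    assert(c.front == 2);           // and counting did not consume them
}
```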
Jan 19
parent reply Jim Balter <Jim Balter.name> writes:
On Friday, 19 January 2024 at 18:13:55 UTC, Jonathan M Davis 
wrote:
 On Friday, January 19, 2024 3:49:29 AM MST Jim Balter via 
 Digitalmars-d-learn wrote:
 On Friday, 17 November 2017 at 17:55:30 UTC, Jonathan M Davis

 wrote:
 When you have

 foreach(e; range)

 it gets lowered to something like

 for(auto r = range; !r.empty; r.popFront())
 {

     auto e = r.front;

 }

 So, the range is copied when you use it in a foreach.
"No spec is being violated, and the behavior is completely expected."

This is not true, and your over-long response that is a lecture I didn't ask for and don't need misses my point and is largely irrelevant. The specification of ranges, which is independent of the D language, says that the way to copy a range is to use save(). Input ranges cannot be copied or restarted; that's the whole point of the difference between input ranges and forward ranges. You go on and on about the semantics of copying, which has nothing to do with what I wrote, which is that `foreach` copies when it shouldn't; the semantics of copying is not relevant to *not copying*. Input ranges are one-shots; they yield values once and that's it. `foreach` should have been implemented to call `save()` when that is available (when it is a forward range), and to do no copying when it is not available. Had that been done, then people wouldn't have written a bunch of ranges with missing save() functions that are reusable when they shouldn't be due to `foreach` making a copy, making `foreach` impossible to fix now.

"Basic input ranges cannot be value types (otherwise, they could implement save)"

This is not true, and again misses the point. My input range, the one I described in my original message, is a value type (a struct) which, when passed to functions, is passed by reference. It doesn't implement save because it's an input range, not a forward range. It's not supposed to be copied or restartable because it's an input range, not a forward range. Copying it has the wrong consequences because calling popFront on a copy doesn't advance the one-shot input range. It would work just fine if `foreach` didn't copy it. It works fine if I use it without `foreach`, e.g., if I use it in a for loop like in Ali's book:
    for ( ; !myObject.empty(); myObject.popFront()) {
        auto element = myObject.front();
        // ... expressions ...
    }
That works just fine with an input range, which is a range that is not restartable. The only thing that doesn't work with my input range is `foreach`, which copies the input range--which, by the specification of input ranges, was never intended. I managed to make it usable with a `foreach` loop by having the struct only store a pointer to its state so the `foreach` copy is harmless, in effect turning my value type range into a reference type and introducing unnecessary overhead.
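A workaround of that kind could be sketched as follows. This is not the actual range from the post, which was never shown. OneShot, its State struct and the consumed counter are assumptions made for illustration.

```d
// A one-shot input range whose only member is a pointer to its state, so
// the copy made by foreach shares that state with the original.
struct OneShot
{
    private static struct State
    {
        int next;
        int limit;
        size_t consumed;
    }

    private State* state;

    this(int limit) { state = new State(0, limit, 0); }

    bool empty() const { return state.next >= state.limit; }
    int front() const { return state.next; }
    void popFront() { ++state.next; ++state.consumed; }

    // Number of elements popped so far, across all copies.
    size_t consumed() const { return state.consumed; }
}

void main()
{
    auto src = OneShot(5);

    foreach (e; src) // copies src, but only the pointer
    {
        if (e == 2)
            break;
    }
    assert(src.consumed == 2); // 0 and 1 were popped
    assert(src.front == 2);

    int[] rest;
    foreach (e; src) // continues from where the first loop stopped
        rest ~= e;
    assert(rest == [2, 3, 4]);
    assert(src.empty);
}
```

The cost is the one mentioned above: every access goes through a heap-allocated state object.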
Jan 25
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, January 25, 2024 1:57:56 AM MST Jim Balter via Digitalmars-d-
learn wrote:
 The specification of ranges, which is independent of
 the D language, says that the way to copy a range is to use
 save().
I'm sorry, but you're misunderstanding the range specification if you think that save is the only way to copy a range or that the range specification makes any such guarantee. And if there is any official documentation that makes any such claim, then it needs to be fixed, because ranges have never worked that way, and they cannot work that way unless ranges in general are non-copyable, which is not the case at all.

What save does is give you a guaranteed way to get a copy which can be independently iterated. However, it's perfectly valid to simply copy a range and then use the copy. They're not non-copyable types, and range-based code in general copies ranges all over the place - both with basic input ranges and with forward ranges. They're copied when they're passed to functions. They're copied when they're wrapped by other ranges. They're copied when they're passed to foreach. This all falls within the expected use of ranges. What you don't get from those copies is the guarantee that the copy is independent, which is why save is necessary in cases where you need to be sure that you're getting an independent copy of a range.

I gave that long explanation to try to get across what was happening with the copy semantics of ranges and what the consequences of that are. If that wasn't helpful to you, then I'm sorry. Either way, if you think that it goes against the range API to copy a range any way other than by calling save, then you are misunderstanding how ranges work, and looking at how Phobos uses ranges should make that clear, because it copies them all over the place, including with code that works on basic input ranges.

The range API guarantees that copying a range will result in the copy having the same elements in the same order as the original would have had had it not been copied, because without that guarantee, range-based code in general simply wouldn't work. Range-based code in general relies on that ability to copy a range when passing it around. Now, what happens to the original after the copy has been made is not specified by the range API, and that's why save is necessary. But you can still copy a range and use that copy in generic code so long as you don't touch the original again.

So, I've tried to explain to you why the current behavior is expected and does not violate the range specification. If that's not enough for you, then I'm sorry. Maybe someone else can help you out, but if what you've said about save were correct, then Phobos as a whole would be violating the range specification - as would the language itself with foreach. But they've been working this way for well over a decade. We would like to improve the situation with the next major version of Phobos so that the copy semantics of ranges are cleaner, and the range API will likely see a redesign to fix that issue, among others, but the current range API is as I've explained it to you, warts and all.

- Jonathan M Davis
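To make the "copied all over the place" point concrete, here is a small sketch of ordinary range code in which every step copies the range and only the copy is used afterwards. Counter and addUp are hypothetical names; map and equal are the usual Phobos functions.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;

// A basic input range with no save.
struct Counter
{
    int current, limit;

    bool empty() const { return current >= limit; }
    int front() const { return current; }
    void popFront() { ++current; }
}

// Takes the range by value, so the caller's range is copied into r,
// and foreach then copies r once more.
int addUp(R)(R r)
{
    int total = 0;
    foreach (e; r)
        total += e;
    return total;
}

void main()
{
    // Copied when passed to a function.
    assert(addUp(Counter(0, 4)) == 6); // 0 + 1 + 2 + 3

    // Copied when wrapped by another range: map stores its own copy.
    auto doubled = Counter(0, 4).map!(a => a * 2);
    assert(doubled.equal([0, 2, 4, 6]));

    // A plain copy has the same elements in the same order. Only the copy
    // is used from here on, which is all that generic code may assume.
    auto original = Counter(0, 4);
    auto copy = original;
    assert(copy.equal([0, 1, 2, 3]));
}
```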
Jan 25
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Nov 17, 2017 at 02:50:59AM -0700, Jonathan M Davis via
Digitalmars-d-learn wrote:
[...]
 Personally, I'm inclined to think that we should never have had save
 and should have required that reference type ranges which are forward
 ranges be wrapped in a struct where copying it does the same thing
 that save does now, but I seriously doubt that we could make a change
 that big now.
Andrei also expressed, in retrospect, the same sentiment about input vs. forward ranges. But as you said, that ship has already sailed and we just have to make do with what we have now.

T

--
If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...
Nov 18 2017