www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - called copy constructor in foreach with ref on Range

reply Rogni <maktrefa gmail.com> writes:
https://pastebin.com/BYcRN8T2

why called copy constructor for ref Range struct type in foreach ?

```
struct MyRange {
     private int current_row = 0;
     private int rows_count = 5;
      disable this(this); // <===
     bool empty () { return current_row == rows_count; }
     int front() { return current_row*current_row; }
     void popFront() { ++current_row; }
}

struct Container {
      disable this(this); // <===
     int opApply(int delegate(int) dg) {
         foreach (int row; 0..5) dg(row*row);
         return 1;
     }
}

void foreach_iterable(Type)(ref Type iterable) {
     import std.stdio: writeln;
     foreach (int row_value; iterable) { // test_range.d(20): 
Error: struct test_range.MyRange is not copyable because it is 
annotated with  disable
         row_value.writeln;
     }
}

void call_iterable(Type)() {
     Type iterable;
     foreach_iterable(iterable);
}

void main() {
     call_iterable!Container(); // work
     call_iterable!MyRange(); // does`t work
}
```

ldc2 version: version   1.8.0 (DMD v2.078.3, LLVM 5.0.1)
Jun 21 2020
parent reply kinke <noone nowhere.com> writes:
A foreach over a custom range makes an initial copy, so that the 
original range is still usable after the foreach (not empty).
Jun 21 2020
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, June 21, 2020 2:25:37 PM MDT kinke via Digitalmars-d-learn wrote:
 A foreach over a custom range makes an initial copy, so that the
 original range is still usable after the foreach (not empty).
No, it's not so that the range will be useable afterwards. In fact, for generic code, you must _never_ use a range again after passing it to foreach, because the copying semantics of ranges are not specified, and you can get radically different behavior depending on what the copying semantics of a range are (e.g. if it's a class, then it's just copying the reference). In general, you should never use a range after it's been copied unless you know exactly what type of range you're dealing with and what its copying behavior is. If you want an independent copy, you need to use save. The reason that foreach copies the range is simply due to how the code is lowered. e.g. foreach(e; range) { } essentially becomes for(auto r = range; !r.empty; r.popFront()) { auto e = r.front; } And the fact that a copy is made is likely simply a result of it mimicking what happens with arrays. Either way, you should never be doing something like foreach(e; range) { } auto v = range.front; in generic code. It needs to be foreach(e; range.save) { } auto v = range.front; instead. - Jonathan M Davis
Jun 22 2020
parent reply kinke <kinke gmx.net> writes:
On Monday, 22 June 2020 at 16:25:11 UTC, Jonathan M Davis wrote:
 The reason that foreach copies the range is simply due to how 
 the code is lowered. e.g.

     foreach(e; range)
     {
     }

 essentially becomes

     for(auto r = range; !r.empty; r.popFront())
     {
         auto e = r.front;
     }

 And the fact that a copy is made is likely simply a result of 
 it mimicking what happens with arrays.
I was trying to explain/guess the rationale for that copy (not in terms of how it's implemented, but why it's implemented that way). This 'mimicking-an-array' doesn't make any sense to me. If the original idea wasn't to make sure the range is reusable afterwards, I guess it's done for implementation simplicity, to promote an rvalue range to the required lvalue. If copying a range is considered to be generally unsafe and a common pitfall (vs. the save() abomination), maybe range-foreach shouldn't allow any lvalue ranges in the 1st place, thus not making any copies and forcing the user to specify some rvalue (as returned by `range.save()`, or `move(range)` if destructive iteration is indeed intended).
Jun 22 2020
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 10:59:45 AM MDT kinke via Digitalmars-d-learn wrote:
 If copying a range is considered to be generally unsafe and a
 common pitfall (vs. the save() abomination), maybe range-foreach
 shouldn't allow any lvalue ranges in the 1st place, thus not
 making any copies and forcing the user to specify some rvalue (as
 returned by `range.save()`, or `move(range)` if destructive
 iteration is indeed intended).
Copying ranges isn't a problem. Almost all range-based code copies ranges quite a bit (e.g. almost all range-based functions take a range by value, and chaining range-based function calls wouldn't work if it didn't). Rather, it's using the original after a copy has been made that's a problem (at least in generic code), because the semantics of copying aren't part of the range API and can vary wildly depending on the type. Really, the issues with copying were not properly taken into account when the range API was created, and mistakes were made. If we were to rework the range API at this point, we would probably get rid of save and require that copying be equivalent to save for forward ranges (which would then make it illegal for classes to be forward ranges without wrapping them in a struct). That would fix the problem for forward ranges, but basic input ranges can't have those semantics, or they could be forward ranges, so exactly what the correct solution would be is debatable, and if generic code operates on both basic input ranges and forward ranges, the copying semantics won't always be the same, which is the problem we have now. So, if we were to rework the range API, it's one of the open problems that needs to be sorted out. Regardless, as things stand, because copying a range can have reference semantics, value semantics, or something in between, in practice, that means that generic code cannot use a range once it's been copied, because the semantics will vary from type to type. The copy can be used (and most range-based code relies on that), but the original needs to be left alone. Non-generic code has more leeway, because it can rely on the behavior of specific range types, but with generic code, you have to be careful. And foreach is just one of the places where the issue of not using the original after making a copy comes up. - Jonathan M Davis
Jun 22 2020
parent reply kinke <noone nowhere.com> writes:
On Monday, 22 June 2020 at 19:03:44 UTC, Jonathan M Davis wrote:
 in practice, that means that generic code cannot use a range 
 once it's been copied
Translating to a simple rule-of-thumb: never copy a (struct) range, always move.
Jun 22 2020
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 1:41:34 PM MDT kinke via Digitalmars-d-learn wrote:
 On Monday, 22 June 2020 at 19:03:44 UTC, Jonathan M Davis wrote:
 in practice, that means that generic code cannot use a range
 once it's been copied
Translating to a simple rule-of-thumb: never copy a (struct) range, always move.
You're unlikely to find much range-based code that does that and there really isn't much point in doing that. Again, copying isn't the problem. It's using the original after making the copy that's the problem. And moving doesn't fix anything, since the original variable is still there (just in its init state, which would still be invalid to use in generic code and could outright crash in some cases if you tried to use it - e.g. if it were a class reference, since it would then be null). So, code that does a move could accidentally use the original range after the move and have bugs just like code that copies the range has bugs if the original is used after the copy has been made. So, the rule of thumb is not that you should avoid copying ranges. It's that once you've copied a range, you should then use only the copy and not the original. - Jonathan M Davis
Jun 22 2020
next sibling parent reply kinke <noone nowhere.com> writes:
On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
 [...]
That's why I put the struct in parantheses. Moving a class ref makes hardly any sense, but I've also never written a *class* to represent a range. Moving is the no-brainer solution for transferring ownership of struct ranges and invalidating the original instance.
Jun 22 2020
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 3:10:28 PM MDT kinke via Digitalmars-d-learn wrote:
 On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
 [...]
That's why I put the struct in parantheses. Moving a class ref makes hardly any sense, but I've also never written a *class* to represent a range. Moving is the no-brainer solution for transferring ownership of struct ranges and invalidating the original instance.
Invalidating the instance doesn't actually prevent it from being misused though. At best, the fact that you moved the instance rather than copying it makes it more likely that accidentally using the instance will cause more extreme bugs or crash. The core issue that the original is potentially invalid but can still be used exists whether you copy the range or move it. Also, since the issue here is generic code, you have to take classes into account and cannot assume that the range is a struct. True, it's usually a bad idea to use classes for ranges, and ideally, we'd alter the range API so that classes weren't valid ranges, but the reality of the matter is that they are, and generic code has to take that into account. - Jonathan M Davis
Jun 22 2020
prev sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:

 You're unlikely to find much range-based code that does that 
 and there really isn't much point in doing that. Again, copying 
 isn't the problem. It's using the original after making the 
 copy that's the problem.
Copy *is* the problem. If you can't make a copy (i.e. you get a compilation error) - you don't have a problem, the compiler just found a bug for you. Sadly, historically D's libraries were built around lots of redundant copying, even though there are very few cases where copies are actually required. The reason why we're "unlikely to find much range-based code that does that [move]" is (a) sloppy or shortsighted design and (b) poor language support. The latter hopefully stands to change.
 And moving doesn't fix anything, since the original variable is 
 still there
 (just in its init state, which would still be invalid to use in 
 generic code and could outright crash in some cases if you 
 tried to use it - e.g. if it were a class reference, since it 
 would then be null).
Eh? A range in 'init' state should be an empty range. If you get a crash from that then there's a bug in that range's implementation, not in user code.
 So, code that does a move could accidentally use the original 
 range after the move and have bugs just like code that copies 
 the range has bugs if the original is used after the copy has 
 been made. So, the rule of thumb is not that you should avoid 
 copying ranges. It's that once you've copied a range, you 
 should then use only the copy and not the original.
That is not true. Copying of forward ranges is absolutely fine. It's what the current `save()` primitive is supposed to do. It's the copying of input ranges should just be rejected, statically.
Jun 22 2020
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jun 22, 2020 at 09:11:07PM +0000, Stanislav Blinov via
Digitalmars-d-learn wrote:
 On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
[...]
 And moving doesn't fix anything, since the original variable is
 still there (just in its init state, which would still be invalid to
 use in generic code and could outright crash in some cases if you
 tried to use it - e.g. if it were a class reference, since it would
 then be null).
Eh? A range in 'init' state should be an empty range. If you get a crash from that then there's a bug in that range's implementation, not in user code.
Don't be shocked when you find out how many Phobos ranges have .init states that are invalid (e.g., non-empty, but .front and .popFront will crash / return invalid values).
 So, code that does a move could accidentally use the original range
 after the move and have bugs just like code that copies the range
 has bugs if the original is used after the copy has been made. So,
 the rule of thumb is not that you should avoid copying ranges. It's
 that once you've copied a range, you should then use only the copy
 and not the original.
That is not true. Copying of forward ranges is absolutely fine. It's what the current `save()` primitive is supposed to do. It's the copying of input ranges should just be rejected, statically.
Jonathan is coming from the POV of generic code. The problem with move leaving the original range in its .init state isn't so much that it will crash or anything (even though as you said that does indicate a flaw in the range's implementation), but that the semantics of generic code changes in subtle ways. For example: auto myGenericFunc(R)(R r) { ... foreach (x; r) { doSomething(x); } if (!r.empty) doSomethingElse(r); ... } Suppose for argument's sake that the above foreach/if structure is an essential part of whatever algorithm myGenericFunc is implementing. Now there's a problem, because if R has array-like semantics, then the algorithm will do one thing, but if R has reference-like or move semantics, then the behaviour of the algorithm will be different, even if both ranges represent the same sequence of input values. Note that in real-life code, this problem can be far more subtle than a blatant foreach loop and if statement like the above. For example, consider a function that drops the first n elements of a range. Your generic function might want to pop the first n elements then do something else with the rest of the range. Well, if you write it the obvious way: auto myAlgo(R)(R r) { size_t n = ...; dropFirstN(r, n); ... // do something else with r } then you have a subtle bug, because the state of r after the call to dropFirstN might be completely different depending on whether r behaves like an array or like a by-reference or move type. T -- Truth, Sir, is a cow which will give [skeptics] no more milk, and so they are gone to milk the bull. -- Sam. Johnson
Jun 22 2020
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:
 Jonathan is coming from the POV of generic code.  The problem 
 with move leaving the original range in its .init state isn't 
 so much that it will crash or anything (even though as you said 
 that does indicate a flaw in the range's implementation), but 
 that the semantics of generic code changes in subtle ways. For 
 example:

 [...]
Seems to me like adding some unit tests with non-copyable input ranges to Phobos could expose a number of latent bugs.
Jun 22 2020
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 3:38:02 PM MDT Paul Backus via Digitalmars-d-learn 
wrote:
 On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:
 Jonathan is coming from the POV of generic code.  The problem
 with move leaving the original range in its .init state isn't
 so much that it will crash or anything (even though as you said
 that does indicate a flaw in the range's implementation), but
 that the semantics of generic code changes in subtle ways. For
 example:

 [...]
Seems to me like adding some unit tests with non-copyable input ranges to Phobos could expose a number of latent bugs.
At this point, non-copyable ranges really aren't a thing. foreach does not support them and neither does Phobos. isInputRange doesn't actually reject them, but they don't really work in practice, and it's unlikely that you're going to find much in Phobos that happens to work with them. isForwardRange does outright reject them though. - Jonathan M Davis
Jun 22 2020
prev sibling parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Monday, 22 June 2020 at 21:33:08 UTC, H. S. Teoh wrote:

 Don't be shocked when you find out how many Phobos ranges have 
 .init states that are invalid (e.g., non-empty, but .front and 
 .popFront will crash / return invalid values).
Which ones?
 Jonathan is coming from the POV of generic code.  The problem 
 with move leaving the original range in its .init state isn't 
 so much that it will crash or anything (even though as you said 
 that does indicate a flaw in the range's implementation), but 
 that the semantics of generic code changes in subtle ways.
Well that means that the code is not generic, i.e. the bug originates in the design, not implementation.
 	auto myGenericFunc(R)(R r) {
 		...
 		foreach (x; r) {
 			doSomething(x);
 		}
 		if (!r.empty)
 			doSomethingElse(r);
 		...
 	}

 Suppose for argument's sake that the above foreach/if structure 
 is an essential part of whatever algorithm myGenericFunc is 
 implementing. Now there's a problem, because if R has 
 array-like semantics, then the algorithm will do one thing, but 
 if R has reference-like or move semantics, then the behaviour 
 of the algorithm will be different, even if both ranges 
 represent the same sequence of input values.
Yep, definitely not generic. Which is exactly the kind of error that should be caught at compile time. Which we, sadly, can't do with the current range API.
 consider a function that drops the first n elements of a range.
 Your generic function might want to pop the first n elements 
 then do something else with the rest of the range.  Well, if 
 you write it the obvious way:

 	auto myAlgo(R)(R r) {
 		size_t n = ...;
 		dropFirstN(r, n);
 		... // do something else with r
 	}

 then you have a subtle bug, because the state of r after the 
 call to dropFirstN might be completely different depending on 
 whether r behaves like an array or like a by-reference or move 
 type.
Err... that one is actually fine. It would take the range by ref and pop off the elements. There already is such a function - the popFrontN. It's the functions that take ranges by value that present the not-so-subtle issue with reference types. For example, its chainable would-be-counterpart drop(). "Would-be" because that one takes the argument by value. We should move toward disallowing reference types to be ranges. A good deal of the rest can be solved with API and design changes (like disallowing copying of input ranges). It's kind of interesting how, with the ongoing discussion about range API in the general forum, a couple of range questions are brought up in learn. Something, as they say, is in the air.
Jun 22 2020
prev sibling next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 3:11:07 PM MDT Stanislav Blinov via Digitalmars-d-
learn wrote:
 On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
 You're unlikely to find much range-based code that does that
 and there really isn't much point in doing that. Again, copying
 isn't the problem. It's using the original after making the
 copy that's the problem.
Copy *is* the problem. If you can't make a copy (i.e. you get a compilation error) - you don't have a problem, the compiler just found a bug for you. Sadly, historically D's libraries were built around lots of redundant copying, even though there are very few cases where copies are actually required. The reason why we're "unlikely to find much range-based code that does that [move]" is (a) sloppy or shortsighted design and (b) poor language support. The latter hopefully stands to change.
As things stand, uncopyable ranges aren't really a thing, and common range idiomns rely on ranges being copyable. We'd need some major redesigning to make uncopyable ranges work, and personally, I don't think that it's worth the trouble.
 And moving doesn't fix anything, since the original variable is
 still there
 (just in its init state, which would still be invalid to use in
 generic code and could outright crash in some cases if you
 tried to use it - e.g. if it were a class reference, since it
 would then be null).
Eh? A range in 'init' state should be an empty range. If you get a crash from that then there's a bug in that range's implementation, not in user code.
The range API has no requirement that the init value of a range be empty, and any generic code which relies on such behavior is buggy. In the case of classes, it would outright crash, because the init value would be null. I agree that ideally the range API would require that the init state of a range be a valid, empty range, but that's simply not how it works right now. In order to make it work that way, we'd have to redesign the range API in at least some respects (e.g. getting rid of save and making it illegal for classes to be forward ranges).
 So, code that does a move could accidentally use the original
 range after the move and have bugs just like code that copies
 the range has bugs if the original is used after the copy has
 been made. So, the rule of thumb is not that you should avoid
 copying ranges. It's that once you've copied a range, you
 should then use only the copy and not the original.
That is not true. Copying of forward ranges is absolutely fine. It's what the current `save()` primitive is supposed to do. It's the copying of input ranges should just be rejected, statically.
As things stand, it is _not_ true that it's safe to copy forward ranges and then use the original. Sure, it will work for some ranges, but for others it won't. The entire reason that save exists is for ranges that are classes, because copying them does not result in an independent range. The range API does not require that copying a range result in an independent copy. It's not even guaranteed that copying a range that's a struct will result in an indpendent copy. Depending on the range type, copying it could result in an independent copy, or it could result in a reference to to the original range, or it could even result in part of the state being copied and part of it being a reference (e.g. if the current front is stored directly in the range, but the iteration state is stored by reference). If you rely on copying a range resulting in an independent copy, you will have buggy code - even if it's a forward range. The _only_ way that the range API specifies that you can get an independent copy of a range is to use save, and generic code should never rely on any other mechanism for that. Now, ideally, we'd get rid of save and require that copying a forward range result in an independent copy (which would then require that a class be wrapped in a struct to be a range), but that's simply not how the current range API works. Because the current range API does not make any guarantees about the semantics of copying a range, generic code cannot assume that those semantics are the same as save. As such, if you want an independent copy of a forward range in generic code, you must use save, or the code will be buggy. Similarly, generic code cannot use the original range after it's been copied (unless it simply never does anything with the copy), because mutating the copy may or may not mutate the original, and the original may or may not even be in a valid state if the copy is mutated. With non-generic code, you can rely on the behaviors of specific ranges and what will happen when you copy them (e.g. not bothering to call save when passing strings around), but with generic code, that's not true. And there's plenty of D code out there that works correctly with a specific range type but which would fail miserably if it were used generically, because it makes assumption which the range API does not guarantee (like the semantics of copying a range). - Jonathan M Davis
Jun 22 2020
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:

 As things stand, uncopyable ranges aren't really a thing, and 
 common range idiomns rely on ranges being copyable.
Which idioms are those? I mean, genuine idioms, not design flaws like e.g. references.
 We'd need some major redesigning to make uncopyable ranges 
 work, and personally, I don't think that it's worth the trouble.
Of course we would. Andrei is entertaining changing the whole input range API. Though he, like you, seems opposed to the idea of uncopyables.
 The range API has no requirement that the init value of a range 
 be empty, and any generic code which relies on such behavior is 
 buggy. In the case of classes, it would outright crash, because 
 the init value would be null.
Yeah, that's a shame.
 I agree that ideally the range API would require that the init 
 state of a range be a valid, empty range, but that's simply not 
 how it works right now. In order to make it work that way, we'd 
 have to redesign the range API in at least some respects (e.g. 
 getting rid of save and making it illegal for classes to be 
 forward ranges).
Better yet - making it illegal for classes to be ranges.
 As things stand, it is _not_ true that it's safe to copy 
 forward ranges and then use the original. Sure, it will work 
 for some ranges, but for others it won't. The entire reason 
 that save exists is for ranges that are classes, because 
 copying them does not result in an independent range. The range 
 API does not require that copying a range result in an 
 independent copy. It's not even guaranteed that copying a range 
 that's a struct will result in an indpendent copy. Depending on 
 the range type, copying it could result in an independent copy, 
 or it could result in a reference to to the original range, or 
 it could even result in part of the state being copied and part 
 of it being a reference (e.g. if the current front is stored 
 directly in the range, but the iteration state is stored by 
 reference).
If copying isn't making a copy, it's not copying.
 If you rely on copying a range resulting in an independent 
 copy, you will have buggy code - even if it's a forward range. 
 The _only_ way that the range API specifies that you can get an 
 independent copy of a range is to use save, and generic code 
 should never rely on any other mechanism for that.
That's exactly what I said. "Copying of forward ranges is absolutely fine. It's what the current `save()` primitive is supposed to do." ...Except, of course, that that shouldn't be the case. *Copying* should be creating a copy. :)
 Now, ideally, we'd get rid of save and require that copying a 
 forward range result in an independent copy (which would then 
 require that a class be wrapped in a struct to be a range)
Yes!
 but that's simply not how the current range API works.
No, it's not :(
Jun 22 2020
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 9:25:55 PM MDT Stanislav Blinov via Digitalmars-d-
learn wrote:
 On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:
 As things stand, uncopyable ranges aren't really a thing, and
 common range idiomns rely on ranges being copyable.
Which idioms are those? I mean, genuine idioms, not design flaws like e.g. references.
It is extremely common to wrap ranges in other ranges (and in fact, you basically have to in order to have lazy ranges). That really doesn't work very well - if at all - if you can't copy the range. It might be possible with a bunch of explicit calls to move, but that would result in range-based code in general being system without a clean way to use trusted, since whether it's really safe to mark such moves with trusted would depend on the specific range implementation, which then is a big problem with generic code. Regardless of whether it's actually possible to make it work though, it's a complete mess in comparison to simply copying. And the fact that chaining range-based calls is extremely common makes the problem that much worse. The way that ranges are routinely passed around and wrapped works as well as it does, because ranges are copyable.
 As things stand, it is _not_ true that it's safe to copy
 forward ranges and then use the original. Sure, it will work
 for some ranges, but for others it won't. The entire reason
 that save exists is for ranges that are classes, because
 copying them does not result in an independent range. The range
 API does not require that copying a range result in an
 independent copy. It's not even guaranteed that copying a range
 that's a struct will result in an indpendent copy. Depending on
 the range type, copying it could result in an independent copy,
 or it could result in a reference to to the original range, or
 it could even result in part of the state being copied and part
 of it being a reference (e.g. if the current front is stored
 directly in the range, but the iteration state is stored by
 reference).
If copying isn't making a copy, it's not copying.
The semantics of copying a variable or object vary wildly depending on the type regardless of whether we're talking about ranges. Copying a pointer or reference is still a copy even if it isn't a deep copy. Copying a range _always_ results in a copy, just like it does with any other type. It just doesn't necessarily result in a copy which can be independently iterated. - Jonathan M Davis
Jun 22 2020
next sibling parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Tuesday, 23 June 2020 at 03:52:23 UTC, Jonathan M Davis wrote:
 On Monday, June 22, 2020 9:25:55 PM MDT Stanislav Blinov via 
 Digitalmars-d- learn wrote:
 On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis 
 wrote:
 As things stand, uncopyable ranges aren't really a thing, 
 and common range idiomns rely on ranges being copyable.
Which idioms are those? I mean, genuine idioms, not design flaws like e.g. references.
It is extremely common to wrap ranges in other ranges (and in fact, you basically have to in order to have lazy ranges). That really doesn't work very well - if at all - if you can't copy the range. It might be possible with a bunch of explicit calls to move...
Where would "a bunch of explicit calls" come from??? See example implementations of map and filter here: https://gist.github.com/radcapricorn/d76d29c6df6fa822d7889e799937f39d If we disallow references as ranges, then any function that takes by value *owns that value*, and shouldn't needlessly copy it, only move. Yes, even when (in fact, especially when) that value is a mere wrapper over a pointer. Only when it would have to legitimately create a copy (i.e. fork the range) would it not move. At which point, if the range is non-copyable, it would be a compile error. Which would be a *good thing*. That is what all generic code that works with values *should* be doing: moving, moving, moving. Copy ctors should only be called when *necessary*, not "meh, whenever". Conceptually: // assume OtherRange is a struct, which is what ranges should be Range wrapRange(OtherRange)(OtherRange other) { // ...possibly pop some elements from `other` return Range(other); } That's what Phobos is doing now. What's going on there? It makes a copy of `other` and then destructs `other`. There's no other use of `other`. So, it can simply (no, it *should*) just move `other`. Moving it would leave `other` in .init state, which is destructible, and construct Range with a valid value, without calling any copy ctors. Walter's "moving, copying and forwarding" proposal, if it ever comes to fruition, would even let us drop the use of library moves in such cases.
 ...but that would result in range-based code in general being 
  system without a clean way to use  trusted, since
??? `move` infers.
 whether it's really safe to mark such moves with  trusted would 
 depend on the specific range implementation, which then is a 
 big problem with generic code.
A one-argument `move` basically never should be inferred system! (that is, if implementation isn't buggy, which it would seem it currently is in druntime...) Two-argument `move` might be system, if the destructor is system. Which means the range is system anyway.
 Regardless of whether it's actually possible to make it work 
 though, it's a complete mess in comparison to simply copying.
 And the fact that chaining range-based calls is extremely 
 common makes the problem that much worse.
You mean this? auto a = range.filter!foo.map!bar.array; It's not a problem in any way. If `range` is non-copyable: auto a = range.move.filter!foo.map!bar.array; ...which, if we do uphold the ".init is empty range" will *never* leave you with an invalid `range`, unlike copies that do who-knows-what.
 The way that ranges are routinely passed around and wrapped 
 works as well as it does, because ranges are copyable.
Now I see what you mean by "idioms". Copying things that shouldn't be copied :) Which is the bane of all Phobos.
 The semantics of copying a variable or object vary wildly 
 depending on the type regardless of whether we're talking about 
 ranges. Copying a pointer or reference is still a copy even if 
 it isn't a deep copy. Copying a range _always_ results in a 
 copy, just like it does with any other type. It just doesn't 
 necessarily result in a copy which can be independently 
 iterated.
:) Which means: - input ranges shouldn't be copyable -> no wild semantics anymore, statically enforced - forward ranges should migrate save() into copy ctors -> guarantee a copy does what it should - of course, .init must be an empty range, and all of this is moot so long as references are allowed to be ranges (all of this is one big "what if") I feel like a broken record. Maybe I just should take that gist and make a more practical implementation out of it, as a demo (no, not with Andrei's `fetchNext`).
Jun 22 2020
prev sibling parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 23 June 2020 at 03:52:23 UTC, Jonathan M Davis wrote:
 It is extremely common to wrap ranges in other ranges (and in 
 fact, you basically have to in order to have lazy ranges). That 
 really doesn't work very well - if at all - if you can't copy 
 the range. It might be possible with a bunch of explicit calls 
 to move, but that would result in range-based code in general 
 being  system without a clean way to use  trusted
`move` isn't system. Perhaps you are thinking of `moveEmplace`?
Jun 23 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jun 23, 2020 at 03:25:55AM +0000, Stanislav Blinov via
Digitalmars-d-learn wrote:
 On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis wrote:
 
 As things stand, uncopyable ranges aren't really a thing, and common
 range idiomns rely on ranges being copyable.
Which idioms are those? I mean, genuine idioms, not design flaws like e.g. references.
 We'd need some major redesigning to make uncopyable ranges work, and
 personally, I don't think that it's worth the trouble.
Of course we would. Andrei is entertaining changing the whole input range API. Though he, like you, seems opposed to the idea of uncopyables.
[...] I'm also wondering what's the motivation behind supporting non-copyable ranges, and whether it's worth the effort and inevitable complications to support it if it's a rare use-case. Do you have any compelling use-case in mind, that might make this worthwhile? T -- Любишь кататься - люби и саночки возить.
Jun 22 2020
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
 On Tue, Jun 23, 2020 at 03:25:55AM +0000, Stanislav Blinov via 
 Digitalmars-d-learn wrote:
 On Tuesday, 23 June 2020 at 02:41:55 UTC, Jonathan M Davis 
 wrote:
 We'd need some major redesigning to make uncopyable ranges 
 work, and personally, I don't think that it's worth the 
 trouble.
Of course we would. Andrei is entertaining changing the whole input range API. Though he, like you, seems opposed to the idea of uncopyables.
[...] I'm also wondering what's the motivation behind supporting non-copyable ranges, and whether it's worth the effort and inevitable complications to support it if it's a rare use-case. Do you have any compelling use-case in mind, that might make this worthwhile?
This compelling use case is staring at us from comments in this very thread. Like Jonathan said before - after you copied, you should only use the copy and not the original. Input range is a one-off iterable. You either consume it yourself, or give it away to someone else to consume. You can't do anything useful with it besides consuming it. But you can't give it away if it's copyable - you'd be giving someone a reference to mutable state which you also keep. Which brings about those very problems already talked about in this thread. So if you "only should use the copy", well then duh, let's make it official and force you to write a move then (turning attempts to copy into compile errors)! Assuming, of course, that we do establish that .init must be an empty range. You'd be left with a valid state, callee would be given a valid state, neither of you would stomp on each other. ByLine, just to become copyable, is made refcounted. It allocates its guts on the heap and molests an integer, all because it wants to remain copyable. But it is a call to "move" that is complication? Or, perhaps, a complication would be partial consumption? E.g. chaining a take + map combo, but still wanting to consume remainder. That is solved by making wrapping ranges propagate the `source`, naturally, by moving it. Pretty straightforward.
Jun 23 2020
parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 23 June 2020 at 07:30:29 UTC, Stanislav Blinov wrote:
 On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
 I'm also wondering what's the motivation behind supporting 
 non-copyable ranges, and whether it's worth the effort and 
 inevitable complications to support it if it's a rare use-case.
  Do you have any compelling use-case in mind, that might make 
 this worthwhile?
This compelling use case is staring at us from comments in this very thread.
I am a bit on the fence about having input ranges be non-copyable. Initially I liked the idea very much; I am a big fan of 'unique' semantics. But just because most people don't expect hot/destructive reads, doesn't mean we should impose such a hard restriction on input ranges. There are use cases for popping from an input range from different places. An obvious one would be parsing data from the network, where you pass the input range into helper functions that parse sub-sections of the data. They can be made to work for sure, but it doesn't make it cleaner. On the other hand, most of the time, non-copyable is exactly what you need...
Jun 23 2020
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/23/20 9:47 AM, Sebastiaan Koppe wrote:
 On Tuesday, 23 June 2020 at 07:30:29 UTC, Stanislav Blinov wrote:
 On Tuesday, 23 June 2020 at 05:24:37 UTC, H. S. Teoh wrote:
 I'm also wondering what's the motivation behind supporting 
 non-copyable ranges, and whether it's worth the effort and inevitable 
 complications to support it if it's a rare use-case.
  Do you have any compelling use-case in mind, that might make this 
 worthwhile?
This compelling use case is staring at us from comments in this very thread.
I am a bit on the fence about having input ranges be non-copyable. Initially I liked the idea very much; I am a big fan of 'unique' semantics. But just because most people don't expect hot/destructive reads, doesn't mean we should impose such a hard restriction on input ranges. There are use cases for popping from an input range from different places. An obvious one would be parsing data from the network, where you pass the input range into helper functions that parse sub-sections of the data.
You can pass a reference if you need to, there's nothing wrong with that.
 They can be made to work for sure, but it doesn't make it cleaner.
 
 On the other hand, most of the time, non-copyable is exactly what you 
 need...
Most of the time algorithms return rvalue ranges anyway, so move is not necessary. I think it would actually affect calls very little, and where it does affect them, you may find it actually fixes issues that you aren't aware of. -Steve
Jun 23 2020
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 22, 2020 3:33:08 PM MDT H. S. Teoh via Digitalmars-d-learn 
wrote:
 On Mon, Jun 22, 2020 at 09:11:07PM +0000, Stanislav Blinov via Digitalmars-
d-learn wrote:
 That is not true. Copying of forward ranges is absolutely fine. It's
 what the current `save()` primitive is supposed to do. It's the
 copying of input ranges should just be rejected, statically.
Jonathan is coming from the POV of generic code. The problem with move leaving the original range in its .init state isn't so much that it will crash or anything (even though as you said that does indicate a flaw in the range's implementation), but that the semantics of generic code changes in subtle ways. For example: auto myGenericFunc(R)(R r) { ... foreach (x; r) { doSomething(x); } if (!r.empty) doSomethingElse(r); ... } Suppose for argument's sake that the above foreach/if structure is an essential part of whatever algorithm myGenericFunc is implementing. Now there's a problem, because if R has array-like semantics, then the algorithm will do one thing, but if R has reference-like or move semantics, then the behaviour of the algorithm will be different, even if both ranges represent the same sequence of input values. Note that in real-life code, this problem can be far more subtle than a blatant foreach loop and if statement like the above. For example, consider a function that drops the first n elements of a range. Your generic function might want to pop the first n elements then do something else with the rest of the range. Well, if you write it the obvious way: auto myAlgo(R)(R r) { size_t n = ...; dropFirstN(r, n); ... // do something else with r } then you have a subtle bug, because the state of r after the call to dropFirstN might be completely different depending on whether r behaves like an array or like a by-reference or move type.
Exactly. It's because of issues like this that generic, range-based functions need to be tested with a variety of range types - including reference types. Without that, bugs are routinely missed. I think that a fairly large percentage of Phobos (maybe even all of it) gets that right now, because we've added tests for reference type ranges, but there used to be quite a few bugs in Phobos, because there were no such tests. It's frequently the case that people write range-based code under the assumption that ranges will act like dynamic arrays, and their code then also works with many ranges which have similar copying behavior, but it doesn't work with all range types. Such problems don't generally get caught without extensive testing. It matters a lot less for code within a program that doesn't use a large variety of range types, but for libraries using generic functions, it's a must. If we do actually rework the range API as has occasionally been discussed, it would be _really_ nice if we could more thoroughly restrict the copying semantics of ranges so that some of these problems go away, but without such a redesign, we're stuck with such problems. And even with a redesign, the best fix for it is far from obvious, because basic input ranges and forward ranges inherently have different copying semantics. We can probably require that copying forward ranges always results in an independent copy (thereby essentially having value semantics), but basic input ranges can't really be value types, because if they could, they could be forward ranges, meaning that they're generally going to either be reference types or pseudo-reference types, which of course have different copying semantics. So, as long as the range API is set up so that the same code can operate on both basic input and forward ranges, we pretty much inherently have a problem with the copying semantics in ranges being inconsistent - though requiring value semantics for forward ranges would still be a significant improvement. - Jonathan M Davis
Jun 22 2020
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jun 22, 2020 at 08:53:22PM -0600, Jonathan M Davis via
Digitalmars-d-learn wrote:
[...]
 Exactly. It's because of issues like this that generic, range-based
 functions need to be tested with a variety of range types - including
 reference types. Without that, bugs are routinely missed.
[...]
 If we do actually rework the range API as has occasionally been
 discussed, it would be _really_ nice if we could more thoroughly
 restrict the copying semantics of ranges so that some of these
 problems go away, but without such a redesign, we're stuck with such
 problems.
Redesign or not, I think the new range API should be much more specific in specifying exactly what behaviours ranges ought to have. The current API is under-specified, with two of the main problem points that I can remember being: 1) Transient-ness: is the value return by .front guaranteed to remain valid, or does .popFront invalidate it? This is not specified in the current spec, and we have ranges in Phobos of either type (e.g., iota and .byLine), with a lot of code just blindly assuming non-transience only to discover that things break down when handed a .byLine instance. 2) The exact semantics of copying (assigning) a range. What parts of a range's state is/isn't preserved when you assign a range to a local variable, for example? What happens if you pass a range to a function and then continue to operate on the original? All of this must be specified precisely in the range API so that we don't have one range acting one way and another range acting another way, thereby compromising the semantics of generic code. T -- Democracy: The triumph of popularity over principle. -- C.Bond
Jun 22 2020