digitalmars.D.learn - Difference between range `save` and copy constructor
- uranuz (10/10) Feb 15 2020 I am interested in current circumstances when we have new copy
- Steven Schveighoffer (12/22) Feb 15 2020 Nothing. IMO, any time you are doing anything in save other than `return...
- Jonathan M Davis (17/38) Feb 15 2020 Actually, as I understand it, the main reason that save was introduced w...
- Steven Schveighoffer (12/55) Feb 15 2020 I remember the discussions as being about how an actual implementation
- uranuz (53/55) Feb 16 2020 I have use of ranges as a classes in my code that rely on classes
- uranuz (13/13) Feb 16 2020 Also I see the problemme that someone can think that it creates
- uranuz (16/16) Feb 16 2020 I have reread presentation:
- Jonathan M Davis (43/59) Feb 16 2020 A range that can't be copied is basically useless. Not only do almost al...
- uranuz (18/18) Feb 16 2020 It's very bad. Because there seem that when I use range based
- Jonathan M Davis (54/72) Feb 16 2020 The current situation is definitely not ideal, but it can be used quite
- uranuz (14/14) Feb 16 2020 In general for value-semantics and ref-semantics the different
- uranuz (21/38) Feb 16 2020 This is working fine with disabled postblit...
- Jonathan M Davis (41/81) Feb 16 2020 That code compiles, because you're passing a temporary to foreach. So, t...
- Paul Backus (10/52) Feb 16 2020 On the other hand, this does work:
- Jonathan M Davis (18/88) Feb 16 2020 Given that the way that almost all range-based functions work is to copy...
- Paul Backus (6/9) Feb 16 2020 "Copy and then never use the original again" is conceptually the
- Jonathan M Davis (8/17) Feb 16 2020 We already have enough of a mess with save without making things even wo...
- uranuz (1/4) Feb 17 2020 OK. Thanks for instructions. I shall give it a try.
I am interested in current circumstances when we have new copy constructor feature what is the purpose of having range `save` primitive? For me they look like doing basicaly the same thing. And when looking in some source code of `range` module the most common thing that `save` does is that it use constructor typeof(this) to create a new instance and use `save` on the source range: https://github.com/dlang/phobos/blob/v2.090.1/std/range/package.d So what is conceptual difference between `save` and copy contructor of range?
Feb 15 2020
On 2/15/20 5:53 AM, uranuz wrote:I am interested in current circumstances when we have new copy constructor feature what is the purpose of having range `save` primitive? For me they look like doing basicaly the same thing. And when looking in some source code of `range` module the most common thing that `save` does is that it use constructor typeof(this) to create a new instance and use `save` on the source range: https://github.com/dlang/phobos/blob/v2.090.1/std/range/package.d So what is conceptual difference between `save` and copy contructor of range?Nothing. IMO, any time you are doing anything in save other than `return this;`, you shouldn't have implemented it. The original impetus for the save requirement was so that forward ranges could have a tangible checkable thing that allows introspection (does the range have a save method?). I'm not entirely sure if disabled postblit was even available at the time. The correct way to do it would be to treat ranges that can be copied (regardless of whether they have a copy constructor) as forward ranges, and treat ones that cannot be copied as input ranges. But it's hard to redo ranges like this with all existing code out there. -Steve
Feb 15 2020
On Saturday, February 15, 2020 7:34:42 AM MST Steven Schveighoffer via Digitalmars-d-learn wrote:On 2/15/20 5:53 AM, uranuz wrote:Actually, as I understand it, the main reason that save was introduced was so that classes could be forward ranges. While it would be possible to use the postblit constructor or copy constructor with structs, that obviously won't work for classes - hence when save is required. Personally, I think that we'd be better of simply requiring that forward rangse be copyable and force classes that want to be forward ranges to be wrapped by structs, but that would require reworking the range API, and it's far from a trivial change. In practice though, classes should almost never be used as forward ranges, because calling save on them would requires allocating a now object, and that gets expensive fast. As part of testing dxml, I tested it with forward ranges that were classes in order to make sure that they were handled correctly, and their performance was absolutely terrible in comparison to ranges that were structs or strings. - Jonathan M DavisI am interested in current circumstances when we have new copy constructor feature what is the purpose of having range `save` primitive? For me they look like doing basicaly the same thing. And when looking in some source code of `range` module the most common thing that `save` does is that it use constructor typeof(this) to create a new instance and use `save` on the source range: https://github.com/dlang/phobos/blob/v2.090.1/std/range/package.d So what is conceptual difference between `save` and copy contructor of range?Nothing. IMO, any time you are doing anything in save other than `return this;`, you shouldn't have implemented it. The original impetus for the save requirement was so that forward ranges could have a tangible checkable thing that allows introspection (does the range have a save method?). I'm not entirely sure if disabled postblit was even available at the time. The correct way to do it would be to treat ranges that can be copied (regardless of whether they have a copy constructor) as forward ranges, and treat ones that cannot be copied as input ranges. But it's hard to redo ranges like this with all existing code out there.
Feb 15 2020
On 2/15/20 9:45 AM, Jonathan M Davis wrote:On Saturday, February 15, 2020 7:34:42 AM MST Steven Schveighoffer via Digitalmars-d-learn wrote:I remember the discussions as being about how an actual implementation detail was required, not just a mark. I remember a suggestion for just putting an enum isForward member in the range being rejected because of this. Except people don't call save. They just copy, and it works for nearly all forward ranges in existence. And a class allocating a new class for saving a forward range is a mislabeled "cheap" operation IMO.On 2/15/20 5:53 AM, uranuz wrote:Actually, as I understand it, the main reason that save was introduced was so that classes could be forward ranges. While it would be possible to use the postblit constructor or copy constructor with structs, that obviously won't work for classes - hence when save is required.I am interested in current circumstances when we have new copy constructor feature what is the purpose of having range `save` primitive? For me they look like doing basicaly the same thing. And when looking in some source code of `range` module the most common thing that `save` does is that it use constructor typeof(this) to create a new instance and use `save` on the source range: https://github.com/dlang/phobos/blob/v2.090.1/std/range/package.d So what is conceptual difference between `save` and copy contructor of range?Nothing. IMO, any time you are doing anything in save other than `return this;`, you shouldn't have implemented it. The original impetus for the save requirement was so that forward ranges could have a tangible checkable thing that allows introspection (does the range have a save method?). I'm not entirely sure if disabled postblit was even available at the time. The correct way to do it would be to treat ranges that can be copied (regardless of whether they have a copy constructor) as forward ranges, and treat ones that cannot be copied as input ranges. But it's hard to redo ranges like this with all existing code out there.Personally, I think that we'd be better of simply requiring that forward rangse be copyable and force classes that want to be forward ranges to be wrapped by structs, but that would require reworking the range API, and it's far from a trivial change.Yep.In practice though, classes should almost never be used as forward ranges, because calling save on them would requires allocating a now object, and that gets expensive fast. As part of testing dxml, I tested it with forward ranges that were classes in order to make sure that they were handled correctly, and their performance was absolutely terrible in comparison to ranges that were structs or strings.Not surprised ;) -Steve
Feb 15 2020
Actually, as I understand it, the main reason that save was introduced was so that classes could be forward rangesI have use of ranges as a classes in my code that rely on classes and polymorthism, but it's usually an InputRange that implements Phobos interface: https://dlang.org/phobos/std_range_interfaces.html#.InputRange I have virtual opSlice operator that returns InputRange. And sometimes implementation of range is very different. So it's difficult to write one range as a struct. I have a pattern in my code that looks like the following: interface IContainer { InputRange opSlice(); } class MyContainer1: IContainer { class Range1: InputRange { //... one implementation } override InputRange opSlice() { return new Range1(this); } } class MyContainer2: IContainer { class Range2: InputRange { //... another implementation } override InputRange opSlice() { return new Range2(this); } } In this example I need a range to be a class, but not a struct. Another problemme is that `copy contructor` is defined only for structs, but not classes. For the class that uses another class instance of the `same` type to initialize from it would be a regular constructor with parameter. A copy constructor in struct semantics requires that the source would be actually `the same` type. But for classes source could be instance of another class that is inherited from current class. And we cannot prove statically that it's actually the same type. And also if we talk about range interface constructor cannot be a part of it. So we cannot add `copy contructor` (if we would have it for class) to interface and check for it's presence in generic code. So here we have this workaround with `save` method... I don't like that primitive concept has two ways to do the same thing. And it's unclear what is the primary way of doing this (copy constructor or save). It introduce the situation when half of the code would require range being copyable, but another part would require it to to have a save method. Not the situation is that there are a lot of algorothms in Phobos that are not working with ranges that have disabled postblit, but have `save` method that could be used to make a copy. Still I want to be able to create ranges as classes...
Feb 16 2020
Also I see the problemme that someone can think that it creates an input range, because he doesn't provide `save` method, but actually it creates forward range unexpectedly, because it is copyable. And it makes what is actually happening in code more difficult. Some algrorithm can take ranges by value, but others take them by reference. So result can be completely different. In first case range is being consumed, but in another in is not. Personally I prefer to take range by reference in all of my algrorithms except cases where I is always a class (because it's a reference already). But I still don't know what is the right way. There are no official guidelines about it. So every time it's a problemme. Although it looks like that range is a simple concept, but it's actually not.
Feb 16 2020
I have reread presentation: http://dconf.org/2015/talks/davis.pdf We declare that `pure` input range cannot be `unpoped` and we can't return to the previous position of it later at the time. So logically there is no sence of copying input range at all. So every Phobos algorithm that declares that it's is working with InputRange must be tested for working with range with disabled copy constructor and postblit. And if it is not it means that this algroithm actually requires a forward range and there we missing `save` calls? Because as it was written in this presentation a range copy is undefined (without call to save). So it's illegal to create copy of range in Phobos algorithms without `save`? So we need a test for every algorithm that it is working with range with disabled copy constructor and postblit if we declare that we only use `save` for range copy?
Feb 16 2020
On Sunday, February 16, 2020 3:41:31 AM MST uranuz via Digitalmars-d-learn wrote:I have reread presentation: http://dconf.org/2015/talks/davis.pdf We declare that `pure` input range cannot be `unpoped` and we can't return to the previous position of it later at the time. So logically there is no sence of copying input range at all. So every Phobos algorithm that declares that it's is working with InputRange must be tested for working with range with disabled copy constructor and postblit. And if it is not it means that this algroithm actually requires a forward range and there we missing `save` calls? Because as it was written in this presentation a range copy is undefined (without call to save). So it's illegal to create copy of range in Phobos algorithms without `save`? So we need a test for every algorithm that it is working with range with disabled copy constructor and postblit if we declare that we only use `save` for range copy?A range that can't be copied is basically useless. Not only do almost all range-based algorithms take their argumenst by value (and thus copy them), but foreach copies any range that it's given, meaning that if a range isn't copyable, you can't even use it with foreach. And since many range-based algorithms function by wrapping one range with another, the ability to copy ranges is fundamental to most range-based code. That being said, the semantics of copying a range are not actually defined by the range API. Whether iterating over a copy affects the original depends on how a range was implemented. e.g. In code such as void foo(R)(R r) if(isInputRange!R) { r.popFront(); } foo(range); whether the range in the original range in the calling code is affected by the element being popped from the copy inside of foo is implementation dependent. If it's a class or a struct that's a full-on reference type, then mutating the copy does affect the original, whereas if copying a range is equivalent to save, then mutating the copy has no effect on the original. And with pseudo-reference types, it's even worse, because you could end up _partially_ mutating the original by mutating the copy, meaning that you can get some pretty serious bugs if you attempt to use a range after it's been copied. This means that in practice, in generic code, you can never use a range once it's been copied unless you overwrite it with a new value. Passing a range to a function or using it with foreach basically means that you should not continue to use that range, and if you want to be able to continue to use it, you need to call save and pass that copy to the function or foreach instead of passing the range directly to a function or foreach. In order to fix it so that you can rely on the semantics of using a range after it's been copied, we'd have to rework the range API and make it so that the semantics of copying a range were well-defined, and that gets into a huge discussion on its own. As things stand, if you want to test range-based code to ensure that it works correctly (including calling save correctly), you have to test it with a variety of different range types, including both ranges where copying is equivalent to calling save and ranges which are reference types so that copying them simply results in another reference to the same data such that iterating one copy iterates all copies. - Jonathan M Davis
Feb 16 2020
It's very bad. Because there seem that when I use range based algorithm I need to take two things into account. The first is how algrorithm is implemented. If it creates copies of range inside or pass it by reference. And the second is how the range is implemented if it has value or reference semantics. So every time I need to look into implementation and I can't rely on API description in most of the cases. In a lot of cases Phobos uses value semantics. But there are cases where I want the range actually be consumed, but it's not. And the other problemme is when algorithm expects range to have value semantics, but it's not. So it's a buggy mess that it's hard to think about. In trivial cases this is working although. But in more complex cases it's simplier to implement some algorithms by own hands so that it would work as I expect it myself rather that thinking about all these value-ref-range mess. But still can't say that I implement it correctly, because range specification actually sucks as yo say. It's just horrible
Feb 16 2020
On Sunday, February 16, 2020 6:52:17 AM MST uranuz via Digitalmars-d-learn wrote:It's very bad. Because there seem that when I use range based algorithm I need to take two things into account. The first is how algrorithm is implemented. If it creates copies of range inside or pass it by reference. And the second is how the range is implemented if it has value or reference semantics. So every time I need to look into implementation and I can't rely on API description in most of the cases. In a lot of cases Phobos uses value semantics. But there are cases where I want the range actually be consumed, but it's not. And the other problemme is when algorithm expects range to have value semantics, but it's not. So it's a buggy mess that it's hard to think about. In trivial cases this is working although. But in more complex cases it's simplier to implement some algorithms by own hands so that it would work as I expect it myself rather that thinking about all these value-ref-range mess. But still can't say that I implement it correctly, because range specification actually sucks as yo say. It's just horribleThe current situation is definitely not ideal, but it can be used quite consistently. Just follow the rule that once you copy a range, you do not use that range ever again unless you assign it a new value. So, if you do something like auto result = r.find(e); then you should not be using r again unless you assign it a new value. e.g. r = r.find(e); r.popFront(); auto result = r.find(e); r = otherRange; r.popFront(); would be fine, because r was assigned a new value, whereas auto result = r.find(e); r.popFront(); should never happen in your code unless you know that typeof(r) is a type where copying it is equivalent to calling save on it. In generic code, that means that you should never use a range again after passing it to a function unless you assign it a new value, or that function accepted the argument by ref (which almost no range-based functions do). If you want to be able to use the range again after passing it to a function without assigning it a new value, then you need to call save so that you pass an independent copy. e.g. auto result = r.find.save(e); r.popFront(); The only copying going on here is with save, so there's no problem, whereas if r were passed to find directly, the behavior is implementation-dependent - hence why you should not be using a range after it's been copied (which includes passing it to a function). The only time that generic code can reuse a range after passing it to a function is if that function accepts its argument by ref, which almost no range-based code does. Far more frequently, it returns the result as a wrapper range, in which case, you can't use the original range any longer unless you used save when calling the function or if the code is not generic and you're coding based on the specific behavior of that particular range type - which usually isn't something that code should be doing. By no means do I claim that the status quo here is desirable, but if you just follow the simple rule that you don't ever use a range once it's been copied (unless that copy came from save), then you shouldn't run into problems related to the fact that different ranges have different copying semantics unless the function that you're calling is buggy. If you're going to run into a bug along those lines though, it's likely going to be because a function didn't call save when it was supposed to, and it was only tested with range types where copying them is equivalent to save. That's why it's important to test range-based code with both range types where copying them is equivalent to save and range types which are full-on reference types (and thus copying just results in another reference). In general though, any range that is a forward range should have copying it be equivalent to save, and using reference types for forward ranges tends to be inefficient and error-prone even if range-based functions (especially those in Phobos) should be able to handle them correctly. - Jonathan M Davis
Feb 16 2020
In general for value-semantics and ref-semantics the different code is actually needed. But generic algorithm try to pretend that the logic is the same. But it's not true. But in wide subset of trivial algorithm it's true. So it's incorrectly interpolated that it's true for every case. The very bad thing if range is passed by value it still can have value or reference semantic. And algorithm cannot say which is it actually. There is not such problemme for classes. So as I already said when passing ranges by ref in algorithms they behave predictible. And if I want algrorithm to operate on copy of algorithm then I can just create this copy before passing it to this algorithm. And again intention is more clear. But Phobos algorithms don't work like that. It's why I can't use them in some cases, because they are looking unpredictable for me.
Feb 16 2020
On Sunday, 16 February 2020 at 12:38:51 UTC, Jonathan M Davis wrote:On Sunday, February 16, 2020 3:41:31 AM MST uranuz via Digitalmars-d-learn wrote:This is working fine with disabled postblit... import std; struct SS { disable this(this); // Disabled copy bool _empty = false; bool empty() property { return _empty; } void popFront() { _empty = true; } int front() property { return 10; } } void main() { foreach( it; SS() ) { writeln(it); } } Am I missing something?I have reread presentation: http://dconf.org/2015/talks/davis.pdf We declare that `pure` input range cannot be `unpoped` and we can't return to the previous position of it later at the time. So logically there is no sence of copying input range at all. So every Phobos algorithm that declares that it's is working with InputRange must be tested for working with range with disabledA range that can't be copied is basically useless. Not only do almost all range-based algorithms take their argumenst by value (and thus copy them), but foreach copies any range that it's given, meaning that if a range isn't copyable, you can't even use it with foreach. And since many range-based algorithms function by wrapping one range with another, the ability to copy ranges is fundamental to most range-based code.
Feb 16 2020
On Sunday, February 16, 2020 7:29:11 AM MST uranuz via Digitalmars-d-learn wrote:On Sunday, 16 February 2020 at 12:38:51 UTC, Jonathan M Davis wrote:That code compiles, because you're passing a temporary to foreach. So, the compiler does a move instead of a copy. It's the difference between auto ss = SS(); and SS ss; auto ss2 = ss; If your main were void main() { SS ss; foreach( it; ss ) { writeln(it); } } then it would not compile. foreach(e; range) {...} basically gets lowered to for(auto __r = range; !__r.empty; __r.popFront()) { auto e = __r.front; ... } So, foreach( it; SS() ) { writeln(it); } would become for(auto __r = SS(); !__r.empty; __r.popFront()) { auto it = __r.front; writeln(it); } whereas SS ss; foreach( it; ss ) { writeln(it); } would become SS ss; for(auto __r = ss; !__r.empty; __r.popFront()) { auto it = __r.front; writeln(it); } - Jonathan M DavisOn Sunday, February 16, 2020 3:41:31 AM MST uranuz via Digitalmars-d-learn wrote:This is working fine with disabled postblit... import std; struct SS { disable this(this); // Disabled copy bool _empty = false; bool empty() property { return _empty; } void popFront() { _empty = true; } int front() property { return 10; } } void main() { foreach( it; SS() ) { writeln(it); } } Am I missing something?I have reread presentation: http://dconf.org/2015/talks/davis.pdf We declare that `pure` input range cannot be `unpoped` and we can't return to the previous position of it later at the time. So logically there is no sence of copying input range at all. So every Phobos algorithm that declares that it's is working with InputRange must be tested for working with range with disabledA range that can't be copied is basically useless. Not only do almost all range-based algorithms take their argumenst by value (and thus copy them), but foreach copies any range that it's given, meaning that if a range isn't copyable, you can't even use it with foreach. And since many range-based algorithms function by wrapping one range with another, the ability to copy ranges is fundamental to most range-based code.
Feb 16 2020
On Sunday, 16 February 2020 at 17:10:24 UTC, Jonathan M Davis wrote:On Sunday, February 16, 2020 7:29:11 AM MST uranuz viaOn the other hand, this does work: void main() { SS ss; foreach( it; move(ss) ) { writeln(it); } } So perhaps the correct approach is to use `move` when copying input ranges.This is working fine with disabled postblit... import std; struct SS { disable this(this); // Disabled copy bool _empty = false; bool empty() property { return _empty; } void popFront() { _empty = true; } int front() property { return 10; } } void main() { foreach( it; SS() ) { writeln(it); } } Am I missing something?That code compiles, because you're passing a temporary to foreach. So, the compiler does a move instead of a copy. It's the difference between auto ss = SS(); and SS ss; auto ss2 = ss; If your main were void main() { SS ss; foreach( it; ss ) { writeln(it); } } then it would not compile.
Feb 16 2020
On Sunday, February 16, 2020 10:53:36 AM MST Paul Backus via Digitalmars-d- learn wrote:On Sunday, 16 February 2020 at 17:10:24 UTC, Jonathan M Davis wrote:Given that the way that almost all range-based functions work is to copy the range that they're given (often then wrapping it in another range that's returned), I don't see how it would make sense to use move outside of very specific circumstances. If you pass a range to a function, and it's a basic input range, then you just use the range via the return value (be it the same range returned directly or returned within a wraper range), and if it's a forward range, you call save before passing it to the function if you want to be able to use the range directly again. Either way, generic code should never be using a range after it's been copied, and copying is a key part of how idiomatic, range-based code works in D. And really, using move just to be able to use an uncopyable range with foreach doesn't make a lot of sense, since if that's what you want to do, you can always just use a normal for loop. Regardless, there isn't much point in declaring a range type that can't be copied, since it's pretty much only going to work with code that you write. - Jonathan M DavisOn Sunday, February 16, 2020 7:29:11 AM MST uranuz viaOn the other hand, this does work: void main() { SS ss; foreach( it; move(ss) ) { writeln(it); } } So perhaps the correct approach is to use `move` when copying input ranges.This is working fine with disabled postblit... import std; struct SS { disable this(this); // Disabled copy bool _empty = false; bool empty() property { return _empty; } void popFront() { _empty = true; } int front() property { return 10; } } void main() { foreach( it; SS() ) { writeln(it); } } Am I missing something?That code compiles, because you're passing a temporary to foreach. So, the compiler does a move instead of a copy. It's the difference between auto ss = SS(); and SS ss; auto ss2 = ss; If your main were void main() { SS ss; foreach( it; ss ) { writeln(it); } } then it would not compile.
Feb 16 2020
On Sunday, 16 February 2020 at 18:11:11 UTC, Jonathan M Davis wrote:Either way, generic code should never be using a range after it's been copied, and copying is a key part of how idiomatic, range-based code works in D."Copy and then never use the original again" is conceptually the same thing as "move", right? In which case, generic code can accommodate non-copyable ranges *and* more clearly communicate its intent by using `move` instead of a naked copy.
Feb 16 2020
On Sunday, February 16, 2020 12:22:01 PM MST Paul Backus via Digitalmars-d- learn wrote:On Sunday, 16 February 2020 at 18:11:11 UTC, Jonathan M Davis wrote:We already have enough of a mess with save without making things even worse by trying to add moves into the mix. Also, non-copyable ranges have never really been a thing, and I really don't want to see things complicated even further trying to support such an uncommon use case. There are too many weird corner cases with ranges as it is. - Jonathan M DavisEither way, generic code should never be using a range after it's been copied, and copying is a key part of how idiomatic, range-based code works in D."Copy and then never use the original again" is conceptually the same thing as "move", right? In which case, generic code can accommodate non-copyable ranges *and* more clearly communicate its intent by using `move` instead of a naked copy.
Feb 16 2020
OK. Thanks for instructions. I shall give it a try.Either way, generic code should never be using a range after it's been copied, and copying is a key part of how idiomatic, range-based code works in D.
Feb 17 2020