digitalmars.D - Printing a range of ranges drains them

Steven Schveighoffer <schveiguy gmail.com> writes:
If you print a range of ranges (that are not arrays) with 
`writeln`, even if the nested range is a forward range, `writeln` 
will drain the nested ranges.

example:

```d
import std.stdio;
import std.range;
struct R
{
     int* ptr;
     size_t len;
     int front() {return  *ptr;}
     void popFront() { ++ptr; --len; }
     bool empty() {return len == 0;}
     typeof(this) save() { return this; }
}

static assert(isForwardRange!R);

void main()
{
     int[] arr = [1, 2, 3];
     auto r = R(arr.ptr, arr.length);
     R[] mdarr = [r, r, r];
     writeln(mdarr);
     writeln(mdarr);
}
```

Output:

```
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[[], [], []]
```

If you do this with nested arrays, it does not drain the inner 
arrays.

You can work around this by un-reffing the elements of the outer array, so that the formatter receives copies: 
`writeln(mdarr.map!(e => e));`
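
A minimal sketch of that workaround, reusing the `R` struct from the example above. The helpers `makeNested` and `renderCopy` are hypothetical names introduced only for illustration, and `std.format.format` is used instead of `writeln` so the result can be checked with `assert`. The assumption is that `map!(e => e)` yields each element by value, so only a temporary copy is advanced.

```d
import std.algorithm : map;
import std.format : format;
import std.range;

struct R
{
    int* ptr;
    size_t len;
    int front() { return *ptr; }
    void popFront() { ++ptr; --len; }
    bool empty() { return len == 0; }
    typeof(this) save() { return this; }
}

// Hypothetical helper: builds the same array of three identical ranges.
R[] makeNested()
{
    int[] arr = [1, 2, 3];
    auto r = R(arr.ptr, arr.length);
    return [r, r, r];
}

// Hypothetical helper: formats through map, so each inner range reaches
// the formatter as an rvalue copy rather than as a reference into `outer`.
string renderCopy(R[] outer)
{
    return format("%s", outer.map!(e => e));
}

void main()
{
    auto nested = makeNested();
    assert(renderCopy(nested) == "[[1, 2, 3], [1, 2, 3], [1, 2, 3]]");
    // A second call gives the same text because the stored ranges were not advanced.
    assert(renderCopy(nested) == "[[1, 2, 3], [1, 2, 3], [1, 2, 3]]");
    assert(nested[0].len == 3);
}
```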

So, does anyone expect this behavior? If so, can you explain why 
you think this is intentionally designed this way?

I wanted to file a bug, but I was shocked that this behavior as 
far as I can tell has always existed, and nobody has ever filed a 
bug on it.

-Steve
May 26 2024
Salih Dincer <salihdb hotmail.com> writes:
On Monday, 27 May 2024 at 00:25:42 UTC, Steven Schveighoffer 
wrote:

> So, does anyone expect this behavior? If so, can you explain 
> why you think this is intentionally designed this way?

I think everything is as it should be, because each element in mdarr is a copy of the others. It will appear blank unless you rewind. You can see the situation with the reward() function:

```d
//...
    auto a = R(arr.ptr, arr.length);
    auto arrs = [ a, a, a ];

    void reward(R[] r)
    {
        foreach(i, ref e; r)
        {
            ++i;
            foreach(_; 0..i)
            {
                --e.ptr;
                ++e.len;
            }
        }
    }

    writeln(arrs); // [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    reward(arrs);
    writeln(arrs); // [[3], [2, 3], [1, 2, 3]]
}
```

SDB 79
May 26 2024
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, May 26, 2024 6:25:42 PM MDT Steven Schveighoffer via Digitalmars-d 
wrote:
> If you print a range of ranges (that are not arrays) with
> `writeln`, even if the nested range is a forward range, `writeln`
> will drain the nested ranges.
>
> example:
>
> ```d
> import std.stdio;
> import std.range;
> struct R
> {
>      int* ptr;
>      size_t len;
>      int front() {return  *ptr;}
>      void popFront() { ++ptr; --len; }
>      bool empty() {return len == 0;}
>      typeof(this) save() { return this; }
> }
>
> static assert(isForwardRange!R);
>
> void main()
> {
>      int[] arr = [1, 2, 3];
>      auto r = R(arr.ptr, arr.length);
>      R[] mdarr = [r, r, r];
>      writeln(mdarr);
>      writeln(mdarr);
> }
> ```
>
> Output:
>
> ```
> [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
> [[], [], []]
> ```
>
> If you do this with nested arrays, it does not drain the inner
> arrays.
>
> You can fix by un-reffing the elements of the outer array:
> `writeln(mdarr.map!(e => e));`
>
> So, does anyone expect this behavior? If so, can you explain why
> you think this is intentionally designed this way?
>
> I wanted to file a bug, but I was shocked that this behavior as
> far as I can tell has always existed, and nobody has ever filed a
> bug on it.

I don't recall ever really thinking about it. I don't think that it's something that I've done very often, and when I have, it was probably for debugging. And in many cases, if you want readable output, it makes sense to use foreach to loop through the outer range and print out each inner range individually, in which case, you can call save on the inner ranges. That's usually what I'd do if I know that I'm printing out a range of ranges.

Given that writeln needs to work with basic input ranges, having it not consume the inner ranges would result in different behavior between basic input ranges and forward ranges, which wouldn't be great. So, arguably, having it consume them is the correct choice, but I'd have to spend a fair bit of time thinking through the implications to come to a properly informed conclusion.

Realistically though, I expect that it's an issue that was never really thought through, and the current behavior is accidental whether it's truly desirable behavior or not.

- Jonathan M Davis
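
The loop-and-save approach described above could look roughly like the following sketch. `renderEach` is a hypothetical helper name, `iota` stands in for any forward range, and `format` is used in place of `writeln` so the result can be asserted.

```d
import std.format : format;
import std.range;

// Hypothetical helper: formats each inner range separately, always
// handing the formatter a saved copy instead of the element itself.
string[] renderEach(Outer)(Outer outer)
if (isInputRange!Outer && isForwardRange!(ElementType!Outer))
{
    string[] lines;
    foreach (inner; outer)
        lines ~= format("%s", inner.save);
    return lines;
}

void main()
{
    auto num = iota(1, 4);
    auto nested = [num, num, num];

    assert(renderEach(nested) == ["[1, 2, 3]", "[1, 2, 3]", "[1, 2, 3]"]);
    // The stored inner ranges still hold all three elements.
    assert(nested[0].length == 3);
}
```

Calling `save` matters most when the inner ranges are reference types; for plain structs the `foreach` copy already protects the stored element.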
May 26 2024
Salih Dincer <salihdb hotmail.com> writes:
On Monday, 27 May 2024 at 06:31:37 UTC, Jonathan M Davis wrote:
> ...
> Given that writeln needs to work with basic input ranges, 
> having it not consume the inner ranges would result in 
> different behavior between basic input ranges and forward 
> ranges, which wouldn't be great. So, arguably, having it 
> consume them is the correct choice, but I'd have to spend a 
> fair bit of time thinking through the implications to come to a 
> properly informed conclusion.
> ...

The same situation can be shown with iota(), and it looks contradictory:

```d
import std.range;
import std.stdio;

alias strings = char[][];
enum form = "[%(%s, %)]";

void main()
{
    auto num = iota(1, 4);
    auto range = [num, num, num];

    // foreach copies each inner range, so nothing is consumed here
    write("[ ");
    foreach(rng; range) rng.write(" ");
    writeln("]");

    range.writefln!form;
    range.writefln!form;

    strings str;
    auto s = "123".dup;
    str = [s, s, s];

    str.writefln!form;
    str.writefln!form;
}
/*
[ [1, 2, 3] [1, 2, 3] [1, 2, 3] ]
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[[], [], []]
["123", "123", "123"]
["123", "123", "123"]
//*/
```

When we do the same experiment with the strings above, we get different results. Moreover, foreach() similarly does not consume the inner ranges... Now there is a contradiction on the writeln() side!

SDB 79
May 27 2024
Steven Schveighoffer <schveiguy gmail.com> writes:
On Monday, 27 May 2024 at 06:31:37 UTC, Jonathan M Davis wrote:
> I don't recall ever really thinking about it. I don't think 
> that it's something that I've done very often, and when I have, 
> it was probably for debugging. And in many cases, if you want 
> readable input, it makes sense to use foreach to loop through 
> the outer range and print out each inner range individually, in 
> which case, you can call save on the inner ranges. That's 
> usually what I'd do if I know that I'm printing out a range of 
> ranges.

This is what `writeln` does (loops over the individual elements). You can even format the nested ranges using `writefln` and the `%(...%)` format specifier.
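
As a small illustration of the `%(...%)` specifier applied to nested ranges, here is a sketch using plain nested arrays (which are not drained). `renderRows` is a hypothetical helper, and `format` takes the same format string that `writefln` would.

```d
import std.format : format;

// Hypothetical helper: the outer %( ... %) walks the rows and the inner
// %( ... %) walks each row. Text after the last specifier inside a
// compound specifier acts as the separator between elements.
string renderRows(int[][] rows)
{
    return format("%(%(%d, %) | %)", rows);
}

void main()
{
    int[][] rows = [[1, 2], [3, 4]];
    assert(renderRows(rows) == "1, 2 | 3, 4");
    assert(rows == [[1, 2], [3, 4]]); // arrays are left intact
}
```
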
> Given that writeln needs to work with basic input ranges, 
> having it not consume the inner ranges would result in 
> different behavior between basic input ranges and forward 
> ranges, which wouldn't be great. So, arguably, having it 
> consume them is the correct choice, but I'd have to spend a 
> fair bit of time thinking through the implications to come to a 
> properly informed conclusion.

I don't think you are grasping how surprising this is. If you are debugging something, and you want to see what something looks like at the moment, you print it. In this case, the act of printing modifies the thing you are debugging! And it doesn't even look like it did anything, because it printed fine. It's only on the second printing you see there is a problem. So you think "what happened between the first printing and the second printing?". This is actually the use case I was looking at yesterday when I discovered (probably rediscovered) this issue.

Would you expect printing a struct to modify the struct? Well, it does if it includes one of these range-of-ranges!

Note also, if you make the outer range by-ref, it will consume all the inner ranges *but not the outer range, even if it uses by-ref elements*. In other words, it has different behavior on the outer range, vs the inner range. This is because `writeln` accepts its parameters by value, but the underlying `formatValue` uses auto ref (to support non-copyable range elements).

And, by the way, nested arrays are not consumed, even if they are inside a range with lvalue elements. So that is another outlier. And also likely why nobody has complained about this -- most people use arrays for their ranges.
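
The by-value outer and `auto ref` inner interaction can be reproduced without Phobos' formatter. In the sketch below, `consume` and `consumeElements` are hypothetical stand-ins, not actual `std.format` functions: the first takes its argument by `auto ref` and walks it to the end, the second takes the outer range by value and hands each `front` to the first.

```d
import std.range;

// Hypothetical stand-in for an element formatter: `auto ref` binds
// lvalues by reference, so walking `r` advances the caller's range.
size_t consume(Rng)(auto ref Rng r)
if (isInputRange!Rng)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

// Hypothetical stand-in for the top-level call: `outer` is a by-value
// copy, but for arrays `outer.front` is still an lvalue into shared storage.
size_t consumeElements(Outer)(Outer outer)
if (isInputRange!Outer)
{
    size_t n;
    for (; !outer.empty; outer.popFront())
        n += consume(outer.front);
    return n;
}

void main()
{
    auto num = iota(1, 4);
    auto nested = [num, num, num];

    assert(consumeElements(nested) == 9);
    assert(nested.length == 3);   // the outer slice was only copied
    assert(nested[0].empty);      // the inner ranges were drained in place
    assert(consumeElements(nested) == 0);
}
```
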
> Realistically though, I expect that it's an issue that was 
> never really thought through, and the current behavior is 
> accidental whether it's truly desirable behavior or not.

I tend to agree. I'm going to file an issue on it. I think any forward ranges should be passed via `.save` to their respective formatters. This should fix the problem, and is what most people would expect.

When I posed this question, my thought was that the behavior was unintuitive, but given the length of time this has existed, I thought maybe someone has a good reason why the code is this way, and I'm just not seeing it.

Note for the range redesign -- this is going to make things tricky as we won't have a `save` to use. We will have to explicitly copy the range before passing to the `formatValue` function (as long as it's a forward range). This is kind of a drawback, I'll put that on the range redesign thread.

-Steve
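
A rough sketch of the "pass forward ranges via `.save`" idea, written as a standalone function rather than as a change to Phobos. `render` is a hypothetical name and only mimics the default `[a, b, c]` layout; it is not how `formatValue` is implemented.

```d
import std.format : format;
import std.range;
import std.traits : isSomeString;

// Hypothetical renderer: any forward range is iterated through a saved
// copy, at every nesting level, so the caller's ranges are never advanced.
string render(T)(auto ref T val)
{
    static if (isForwardRange!T && !isSomeString!T)
    {
        auto copy = val.save;
        string result = "[";
        bool first = true;
        for (; !copy.empty; copy.popFront())
        {
            if (!first)
                result ~= ", ";
            first = false;
            result ~= render(copy.front); // recursion saves the inner range too
        }
        return result ~ "]";
    }
    else
        return format("%s", val);
}

void main()
{
    auto num = iota(1, 4);
    auto nested = [num, num, num];

    assert(render(nested) == "[[1, 2, 3], [1, 2, 3], [1, 2, 3]]");
    assert(render(nested) == "[[1, 2, 3], [1, 2, 3], [1, 2, 3]]");
    assert(!nested[0].empty);
}
```

Basic input ranges without `save` would still have to be consumed, which is the behavioral split between input and forward ranges raised earlier in the thread.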
May 27 2024
monkyyy <crazymonkyyy gmail.com> writes:
On Monday, 27 May 2024 at 00:25:42 UTC, Steven Schveighoffer 
wrote:
 
> So, does anyone expect this behavior? If so, can you explain 
> why you think this is intentionally designed this way?

This is correct behavior for ref front ranges with imperative pop.

ref front is a violation of the "views of data" idea, but given the current API, how else is sorting going to work?

I tried functional pop in my API experiment; it was hard to get right and will come with tradeoffs I doubt people will accept.

Possible solutions are:

1. specialty n-depth range functions
2. an upper level to the API, so `auto i=foo[].find!F.key; foo[i]=...` is the correct way to mutate data
3. treat ranges of ranges as rare and unimportant
May 27 2024
Salih Dincer <salihdb hotmail.com> writes:
On Monday, 27 May 2024 at 16:28:28 UTC, monkyyy wrote:
> On Monday, 27 May 2024 at 00:25:42 UTC, Steven Schveighoffer 
> wrote:
>
>> So, does anyone expect this behavior? If so, can you explain 
>> why you think this is intentionally designed this way?
>
> This is correct behavior for ref front ranges with imperative pop

Is the situation the same as this example?

```d
void main()
{
    class R
    {
        wchar* ptr;
        size_t len;

        this(T)(T[] range)
        {
            ptr = cast(wchar*)range.ptr;
            len = range.length;
        }

        auto empty() => len == 0;
        auto front() => *ptr++;
        auto popFront() => len--;
        auto save()
        {
            auto r = new R([]);
            r.len = len;
            r.ptr = ptr;
            return r;
        }
    }

    auto c = ['€', '₺', '₽'];
    auto r = new R(c);
    assert(!r.empty);

    import std.conv : text;
    auto str = r.text; // "€₺₽"
    assert(r.empty);
}
```

Okay, the objections are about inner ranges. But while the class R above is consumed, if you rewrite it as a struct and remove the new operator, a copy of the range is taken instead. Or is the difference between a class and a struct down to the class being a reference type?

Thanks...

SDB 79
May 27 2024
Salih Dincer <salihdb hotmail.com> writes:
On Tuesday, 28 May 2024 at 05:54:36 UTC, Salih Dincer wrote:
> On Monday, 27 May 2024 at 16:28:28 UTC, monkyyy wrote:
>> On Monday, 27 May 2024 at 00:25:42 UTC, Steven Schveighoffer 
>> wrote:
>>
>>> So, does anyone expect this behavior? If so, can you explain 
>>> why you think this is intentionally designed this way?
>>
>> This is correct behavior for ref front ranges with imperative pop
>
> Is the situation the same as this example?

There's no reason why this issue can't be easily fixed. Because when you include narrow string or wchar, there is no problem of not being able to save(). Here is the proof:

```d
void main()
{
    ushort[] i = [1, 2, 3];
    auto r = R(i);
    auto arr = [r, r, r];

    import std.conv : text;
    auto str = arr.text;
    assert(!arr.empty);

    foreach(n; arr)
    {
        n.writefln!"%(%d, %)";
    } // no problem
}
```

SDB 79
May 27 2024
monkyyy <crazymonkyyy gmail.com> writes:
On Tuesday, 28 May 2024 at 05:54:36 UTC, Salih Dincer wrote:
> On Monday, 27 May 2024 at 16:28:28 UTC, monkyyy wrote:
>> On Monday, 27 May 2024 at 00:25:42 UTC, Steven Schveighoffer 
>> wrote:
>>
>>> So, does anyone expect this behavior? If so, can you explain 
>>> why you think this is intentionally designed this way?
>>
>> This is correct behavior for ref front ranges with imperative pop
>
> Is the situation the same as this example?

I think so. Should sorting fail if you use pointers rather than ref?

When writing generic code, a ref range and a pointer to a range probably aren't different (given pointer flattening on argument call); maybe there's some edge case that's detectable, but I would never write it, and it's probably incorrect for other pointers.

Consumption and mutability are part of the range spec, and have to be for file I/O and sorting to work with the current goals. So a mutable reference to a consuming range will drain unless something else prevents it.
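
The class-versus-struct question can be checked with a small sketch. `SRange`, `CRange` and `walk` are hypothetical names for illustration: `walk` takes its argument by value, which copies a struct's state but only copies the reference for a class.

```d
// Hypothetical value-type range over an int slice.
struct SRange
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    SRange save() { return this; }
}

// Hypothetical reference-type range with the same interface.
final class CRange
{
    int[] data;
    this(int[] data) { this.data = data; }
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    CRange save() { return new CRange(data); }
}

// Takes the range by value and walks it to the end.
size_t walk(Rng)(Rng r)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

void main()
{
    auto s = SRange([1, 2, 3]);
    assert(walk(s) == 3);
    assert(!s.empty);         // only the struct copy was consumed

    auto c = new CRange([1, 2, 3]);
    assert(walk(c.save) == 3);
    assert(!c.empty);         // the saved object was consumed instead
    assert(walk(c) == 3);
    assert(c.empty);          // same object, so the caller sees it drained
}
```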
May 28 2024