digitalmars.dip.ideas - Parameter storage classes on foreach variables
- Quirin Schroll (105/105) May 17 As of now, `foreach` admits `ref` variables as in `foreach (ref
- Timon Gehr (7/15) May 18 I contest that `in` is a great option every time mere avoiding of copies...
- Nick Treleaven (5/14) May 20 Did you mean "isn't a great option"?
- Timon Gehr (5/17) May 20 The negation is in the word "contest".
- Quirin Schroll (14/30) May 21 True. In generic code, one basically can’t use `const`, and
- Paul Backus (12/40) May 21 I don't like these special-case rewrites. Binding an
As of now, `foreach` admits `ref` variables as in `foreach (ref x; xs)`. There, `ref` can be used for two conceptually different things:

* Avoiding copies
* Mutating the values in place

If mutating in place is desired, `ref` is an excellent choice. However, if mere copy avoiding is desired, another great option would be `in`. On parameters, it avoids expensive copies, but does trivial ones.

A type supplying `opApply` can, in principle, easily provide an implementation where the callback takes an argument by `in` or `out`:

```d
struct Range
{
    int opApply(scope int delegate(size_t, in X) callback)
    {
        X x;
        if (auto result = callback(0, x)) return result;
        return 0;
    }
}
```

For `out`, it’s not really different. However, how do classical ranges (`empty`, `front`, `popFront`) fare with these? First `in`.

```d
foreach (in x; xs) { … }
// lowers to
{
    auto __xs = xs;
    for (; !__xs.empty; __xs.popFront)
    {
        static if (/* should be ref */)
            const scope ref x = __xs.front;
        else
            const scope x = __xs.front;
        …
    }
}
```

The first notable observation is that `out` makes no sense for input ranges. Rather, it would make sense for, well, output ranges: Every time the loop reaches the end, a `put` is issued, whereas `continue` means “this loop iteration did not produce a value, but continue” and `break` means “end the loop”:

```d
foreach (out T x; xs) { … }
// lowers to
{
    auto __xs = xs; // or xs[]
    for (; !__xs.empty /* or __xs.length > 0 or nothing */;)
    {
        auto x = T.init;
        …
        __xs.put(x); /* or similar */
    }
}
```

The program should assign `x` in its body. If control reaches the end of the loop, the value is `put` in the output range. As an output range, in general, need not be finite, the loop is endless by design, but if the range has an `empty` member, it’s being used, and for types with `length`, but no `empty`, the condition is `__xs.length > 0`. For arrays and slices, the `put` operation is `__xs[0] = x; __xs = __xs[1 .. $];`.
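For slices, the lowering described above can be written out by hand in today's D. The following is only a minimal sketch of that idea, not part of the proposal: `fillWithSquares` is a made-up helper, `rest` plays the role of `__xs`, and the loop body (assigning squares) is an arbitrary stand-in for user code.

```d
import std.stdio : writeln;

// Hypothetical helper: the proposed slice lowering, written by hand.
int[] fillWithSquares(int[] xs)
{
    auto rest = xs[];           // plays the role of __xs
    int n = 1;
    for (; rest.length > 0;)
    {
        auto x = int.init;      // the `out` variable starts at .init
        x = n * n;              // loop body: assign the value to emit
        ++n;
        rest[0] = x;            // "put" for a slice: store the value ...
        rest = rest[1 .. $];    // ... and advance to the next slot
    }
    return xs;
}

void main()
{
    auto storage = new int[4];
    writeln(fillWithSquares(storage)); // prints [1, 4, 9, 16]
}
```

Note that the loop ends on its own here only because a slice has a `length`; for an output range without `empty` or `length`, the body would need a `break`.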
If `T` is not explicitly given, and `xs` is not an array or slice, an attempt should be made to extract it from the single parameter of a non-overloaded `xs.put`. Otherwise, it’s an error.

Dynamic arrays and slices should support `size_t` keys as well:

```d
foreach (i, out x; xs) { … }
// lowers to
{
    auto __xs = xs[];
    for (size_t __i = 0; __xs.length > 0; ++__i)
    {
        size_t i = __i;
        auto x = typeof(xs[0]).init;
        …
        __xs[0] = x;
        __xs = __xs[1 .. $];
    }
}
```

Associative arrays specifically can be filled using `out` key and values:

```d
int[string] aa;
foreach (out key, out value; aa) { … }
// lowers to
{
    auto __aa = aa;
    for (;;)
    {
        KeyType key = KeyType.init;
        ValueType value = ValueType.init;
        …
        __aa[key] = value;
    }
}
```

At some point, a `break` is needed, otherwise the loop is infinite.
May 17
On 5/17/24 20:59, Quirin Schroll wrote:
> As of now, `foreach` admits `ref` variables as in `foreach (ref x; xs)`. There, `ref` can be used for two conceptually different things:
>
> * Avoiding copies
> * Mutating the values in place
>
> If mutating in place is desired, `ref` is an excellent choice. However, if mere copy avoiding is desired, another great option would be `in`.

I contest that `in` is a great option every time mere avoiding of copies is desired (because it implies transitive `const`).

In general, extending `foreach` to `in` and `out` makes some sense, but `out` is likely to be quite controversial, especially the output range lowering. When I think of `foreach`, I think of consuming a range, not producing one.
May 18
On Saturday, 18 May 2024 at 22:43:48 UTC, Timon Gehr wrote:
>> If mutating in place is desired, `ref` is an excellent choice. However, if mere copy avoiding is desired, another great option would be `in`.
>
> I contest that `in` is a great option every time mere avoiding of copies is desired (because it implies transitive `const`).

Did you mean "isn't a great option"? And if so, presumably we still need `auto ref`: https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1022.md

> In general, extending `foreach` to `in` and `out` makes some sense, but `out` is likely to be quite controversial, especially the output range lowering. When I think of `foreach`, I think of consuming a range, not producing one.

+1
May 20
On 5/20/24 16:29, Nick Treleaven wrote:
> On Saturday, 18 May 2024 at 22:43:48 UTC, Timon Gehr wrote:
>> ...
>>> If mutating in place is desired, `ref` is an excellent choice. However, if mere copy avoiding is desired, another great option would be `in`.
>>
>> I contest that `in` is a great option every time mere avoiding of copies is desired (because it implies transitive `const`).
>
> Did you mean "isn't a great option"?

The negation is in the word "contest". (Stated more clearly: Sometimes `in` cannot be used because `const` is transitive.)

> And if so, presumably we still need `auto ref`: https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1022.md
> ...

Would be good, also for local variables outside of `foreach`.
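The transitivity point can be illustrated with a small sketch. `Counter`, `touchByRef`, and `touchByIn` are made-up names for illustration only; the example assumes nothing beyond `in` implying `const` on the parameter.

```d
struct Counter
{
    int* hits; // mutable data reached through a pointer
}

void touchByRef(ref Counter c)
{
    ++*c.hits; // fine: `ref` avoids the copy and keeps mutability
}

void touchByIn(in Counter c)
{
    // `in` makes `c` const, and const is transitive, so the pointee is
    // const as well: the increment is rejected at compile time.
    static assert(!__traits(compiles, ++*c.hits));
}

void main()
{
    int n;
    auto c = Counter(&n);
    touchByRef(c);
    touchByIn(c);
    assert(n == 1); // only the `ref` version could modify the pointee
}
```

So a type whose normal use involves mutating through its indirections cannot be bound with `in`, even when the only goal is to avoid a copy.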
May 20
On Saturday, 18 May 2024 at 22:43:48 UTC, Timon Gehr wrote:
> On 5/17/24 20:59, Quirin Schroll wrote:
>> As of now, `foreach` admits `ref` variables as in `foreach (ref x; xs)`. There, `ref` can be used for two conceptually different things:
>>
>> * Avoiding copies
>> * Mutating the values in place
>>
>> If mutating in place is desired, `ref` is an excellent choice. However, if mere copy avoiding is desired, another great option would be `in`.
>
> I contest that `in` is a great option every time mere avoiding of copies is desired (because it implies transitive `const`).

True. In generic code, one basically can’t use `const`, and therefore `in`, as a type can become simply unusable. (Prime example would be delegate types once they’re fixed.) In case you know the type and it’s a type that works well being `const`, *then* `in` might be a great option.

> In general, extending `foreach` to `in` and `out` makes some sense, but `out` is likely to be quite controversial, especially the output range lowering. When I think of `foreach`, I think of consuming a range, not producing one.

I thought the same, but on the other hand, there’s a keyword, so it absolutely won’t happen accidentally. It may just surprise people to read it in someone else’s code.

My sense is that the stuff in a `foreach` header before the semicolon should support exactly the same things a lambda parameter list would, simply because it may become a lambda passed to `opApply`. If it isn’t, well, it’s up for discussion what to do with it. Making it invalid is always an option.
May 21
On Friday, 17 May 2024 at 18:59:13 UTC, Quirin Schroll wrote:
> ```d
> foreach (out T x; xs) { … }
> // lowers to
> {
>     auto __xs = xs; // or xs[]
>     for (; !__xs.empty /* or __xs.length > 0 or nothing */;)
>     {
>         auto x = T.init;
>         …
>         __xs.put(x); /* or similar */
>     }
> }
> ```
> [...]
> ```d
> int[string] aa;
> foreach (out key, out value; aa) { … }
> // lowers to
> {
>     auto __aa = aa;
>     for (;;)
>     {
>         KeyType key = KeyType.init;
>         ValueType value = ValueType.init;
>         …
>         __aa[key] = value;
>     }
> }
> ```

I don't like these special-case rewrites. Binding an array/AA/range element to an `out` loop variable should have *exactly* the same semantics as binding a function argument to an `out` parameter. That is,

* The element must be an lvalue.
* The element is bound by reference.
* Upon being bound, the element is set to its `.init` value.

So, no implicit calls to `put`, no implicit insertion of AA elements, etc.

Aside from that, this seems like a good idea to me. 👍
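The three rules above can be approximated in today's D, which may help to compare the two readings of `foreach (out x; xs)`. This is a sketch under assumptions, not a statement of how the feature would be implemented: `produce` and `refill` are made-up names, and the explicit reset inside the `ref` loop stands in for what an `out` loop variable would do implicitly.

```d
import std.stdio : writeln;

// An ordinary `out` parameter: bound by reference, reset to .init on entry.
void produce(out int slot, int value)
{
    assert(slot == int.init); // the caller's old value is already gone
    slot = value;
}

// Hypothetical emulation of `foreach (i, out x; xs)` under these semantics.
int[] refill(int[] xs)
{
    foreach (i, ref x; xs)
    {
        x = typeof(x).init;   // what `out x` would do implicitly
        if (i % 2 == 0)
            x = cast(int) i * 10;
        // odd indices are never assigned and therefore stay at .init
    }
    return xs;
}

void main()
{
    int y = 5;
    produce(y, 3);
    writeln(y);                    // prints 3
    writeln(refill([7, 8, 9, 6])); // prints [0, 0, 20, 0]
}
```

Under this reading the loop visits existing elements only, so its length is fixed by `xs` and no `break` is needed to terminate it.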
May 21