digitalmars.D - [DIP idea] out variables
- Q. Schroll (91/91) Jan 25 2021 Main goal: Make the `out` parameter storage class live up to
- 12345swordy (4/95) Jan 25 2021 in, out, inout badly need some reworking. There is a
- Q. Schroll (8/11) Jan 26 2021 While in and out are opposites in a sense, inout is something
- Tobias Pankrath (7/13) Jan 26 2021 I recently started using C# professionally which has this feature
- Max Haughton (22/28) Jan 26 2021 A few thoughts,
- Luhrel (33/126) Jan 26 2021 I would add "the icing on the cake" : As DMD would know if a
- Ogi (3/9) Jan 27 2021 Is there any reason to use out parameters at all instead of
- Max Haughton (2/12) Jan 27 2021 Struct ABI can mean overhead in places you don't expect
- Jacob Carlborg (9/10) Jan 29 2021 If proper tuples are built-in to the language the language can invent
- Afgdr (10/17) Jan 30 2021 Totally expected. "out" and "ref" parameters in a backend are
- Dukc (24/45) Jan 28 2021 It is more, at least potentially: an optimization aid. The
- Q. Schroll (14/25) Jan 30 2021 Exactly this. Something similar is true for an immutable
Main goal: Make the `out` parameter storage class live up to promises.

In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.

General Idea
============

The idea of an out variable is one that **must** be passed to a function in an `out` parameter position.

Basic example:

    int f(out int value);
    int g(int[] value...);
    int h(out int a, out int b);

    out int x;
    // g(x); // illegal: reads x, but x is not yet initialized.
    // h(x, x); // illegal:
    // reads the second x before the initialization of first x is complete.
    f(x); // initializes x.

An `out` variable cannot be read until initialized by a function call in an `out` parameter position. Since D has exact evaluation order, it is easily determined that one usage of `x` initializes it and another in the same overall expression reads it (and not the other way around):

    out int x, y;
    /*1*/ if (h(x, y) > 0 && x < y) { .. }
    /*2*/ g(f(x), f(y), x, y);

Evaluation order says in /*1*/ that h(x, y) is executed before x and y are read for testing `x < y`.
Evaluation order says in /*2*/ that f(x) and f(y) are executed before x and y are read for passing them to g.

Also, multiple execution paths can lead to different initialization points:

    out int x, y, z;
    if (g(0))
    {
        f(x); f(y); f(z);
    }
    else h(x, y);
    // x, y are initialized.
    g(x, y); // okay: x and y initialized on both branches
    g(z);    // invalid: z might not be initialized.

It is always possible to initialize `out` variables using an ordinary assignment:

    out int x, y, z;
    if (g(0)) { /*as above*/ }
    else { h(x, y); z = 0; }
    g(z); // valid: z initialized on both branches

Templates
=========

Similar to `ref`, there will be `auto out` which infers `out` based on the arguments passed. `auto out` can be combined with `ref` (meaning pass by reference always, but if the argument is an out value, this is its initialization) and `auto ref` (meaning pass by reference if possible, and if the argument is an out value, this is its initialization; it cannot be passed by value and be initialized).

With __traits(isOut, param) one can test whether `auto out` boiled down to `out` or not.

After being (potentially|definitely|?) initialized, `out` variables do not trigger `auto out` to become `out`.

In-place `out` Variables
========================

When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:

    if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
    if (g(0) && f(out x) > 0) { g(x); } else { .. }

The type of an in-place out variable can be left out, when it can be inferred from the called function. [Clearly it can be done in some cases and clearly it cannot be in all templates. Exact rules TBD.]

In the first else branch, `x` can be used, since regardless whether the `f(out int x) > 0 && x > 0` is true or false, evaluating it will initialize `x`.
In the second else branch, `x` cannot be used because `x` might not be initialized if g(0) is false.

The visibility of in-place out variables is limited to the statement they're declared in. For `if` statements this encompasses both branches, but for expression statements, it only encompasses that expression:

    x = f(out a) + a; // valid
    y = f(out b);
    // y += b; // error, b not visible
    out int c;
    f(c);
    z += c; // valid

One obvious use-case is functions that return a bool value indicating success and the result is an `out` parameter. Usually, these functions' names begin with try:

    if (tryParseInt(str, out x)) { use(x); }

Another could be unpacking:

    out T x;
    out S y;
    tuple.unpack(x, y);
    // or
    if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

What do you think? Worth it?
Jan 25 2021
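For comparison with the proposed `tryParseInt(str, out x)` call, here is a minimal sketch of how the same "try" pattern looks in D as it stands today, where the variable has to be declared before the call. The helper `tryParseInt` is hypothetical (the post only names it, and it is not a Phobos function); the body shown is just one assumed way to write it on top of `std.conv.to`.

```d
import std.conv : to, ConvException;

// Hypothetical helper, not part of Phobos: returns true on success and
// hands the parsed number back through the `out` parameter.
bool tryParseInt(string text, out int result)
{
    try
    {
        result = text.to!int;
        return true;
    }
    catch (ConvException)
    {
        return false;   // `result` keeps int.init, which it received on entry
    }
}

void main()
{
    int x;   // current D: declare first, then pass
    if (tryParseInt("123", x))
        assert(x == 123);

    int y = 5;
    assert(!tryParseInt("abc", y));
    assert(y == 0);   // the `out` parameter reset y on entry
}
```

The in-place form from the proposal would fold the declaration of `x` into the call itself; nothing else about the callee would need to change.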
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises. In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail. [...]
>
> What do you think? Worth it?

in, out, inout badly need some reworking. There is a preview for in, but sadly not for the others.

-Alex
Jan 25 2021
On Tuesday, 26 January 2021 at 02:44:20 UTC, 12345swordy wrote:
> > What do you think? Worth it?
>
> in, out, inout badly need some reworking. There is a preview for in, but sadly not for the others.

While in and out are opposites in a sense, inout is something completely unrelated. For the most part, I consider `in` to be fixed. With the preview, it works exactly as one would expect. On the other hand, `out` is nearly useless: in the current state, making `out` an alias for `ref` wouldn't be that much of a breaking change.
Jan 26 2021
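To make the "alias for `ref`" remark concrete, here is a small sketch of the one observable difference between the two storage classes in current D: an `out` parameter is reset to its type's `.init` value when the function is entered, while `ref` leaves the argument alone. The function names `viaRef` and `viaOut` are made up for this illustration.

```d
// Hypothetical functions that deliberately never assign their parameter.
void viaRef(ref int value) {}   // argument is left untouched
void viaOut(out int value) {}   // argument is reset to int.init on entry

void main()
{
    int a = 7, b = 7;
    viaRef(a);
    viaOut(b);
    assert(a == 7);   // unchanged
    assert(b == 0);   // overwritten with int.init
}
```

Code that relies on this reset is what would break if `out` simply became `ref`.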
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }

I recently started using C# professionally which has this feature already. It makes functions with out parameters so much more pleasant to use.

Many argue that we should not overload D with even more features, but I'd say, if it makes D more fun to use and it is just syntax sugar / a simple lowering, then we should consider it.
Jan 26 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises. In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail. [...]

A few thoughts,

I like the concept of out applied to lvalues to catch things being used too early.

The concept of introducing a new variable *inside* an expression sounds like a nightmare. I think the following construct is not only easier to implement but also more generally applicable elsewhere in the language:

    if(out x; expr(x))
    {
    }
    -- lowers to --
    out x;
    if(expr(x))
    {
    }

I have left out any types from the above. Although deferred type inference could be very useful, it would also have to be considered very carefully.

Also, finally, this would be yet another thing that rhymes with dataflow analysis in the core language, so it needs to be specified carefully.
Jan 26 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises.
>
> In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail.
>
> General Idea
> ============
>
> The idea of an out variable is one that **must** be passed to a function in an `out` parameter position.
>
> Basic example:
>
>     int f(out int value);
>     int g(int[] value...);
>     int h(out int a, out int b);
>
>     out int x;
>     // g(x); // illegal: reads x, but x is not yet initialized.
>     // h(x, x); // illegal:
>     // reads the second x before the initialization of first x is complete.
>     f(x); // initializes x.

I would add "the icing on the cake": as DMD would know whether an `out` variable is initialized or not, we should be able to throw a generic error like "error: variable `d` is not initialized." for this kind of code:

```
class D
{
    int x;
    void foo() { }
}

void main()
{
    D d;
    d.foo(); // error: variable `d` is not initialized.
}
```

... instead of a raw crash with signal 11. That would clearly save some time.

> An `out` variable cannot be read until initialized by a function call in an `out` parameter position. Since D has exact evaluation order, it is easily determined that one usage of `x` initializes it and another in the same overall expression reads it (and not the other way around):
>
>     out int x, y;
>     /*1*/ if (h(x, y) > 0 && x < y) { .. }
>     /*2*/ g(f(x), f(y), x, y);
>
> Evaluation order says in /*1*/ that h(x, y) is executed before x and y are read for testing `x < y`.
> Evaluation order says in /*2*/ that f(x) and f(y) are executed before x and y are read for passing them to g.
>
> Also, multiple execution paths can lead to different initialization points:
>
>     out int x, y, z;
>     if (g(0))
>     {
>         f(x); f(y); f(z);
>     }
>     else h(x, y);
>     // x, y are initialized.
>     g(x, y); // okay: x and y initialized on both branches
>     g(z);    // invalid: z might not be initialized.
>
> It is always possible to initialize `out` variables using an ordinary assignment:
>
>     out int x, y, z;
>     if (g(0)) { /*as above*/ }
>     else { h(x, y); z = 0; }
>     g(z); // valid: z initialized on both branches

I imagine that it will still be possible to call f()/h() with a non-`out` variable?

> Templates
> =========
>
> Similar to `ref`, there will be `auto out` which infers `out` based on the arguments passed. `auto out` can be combined with `ref`

`void f(T)(auto out ref T t);`?

> (meaning pass by reference always, but if the argument is an out value, this is its initialization) and `auto ref` (meaning pass by reference if possible, and if the argument is an out value, this is its initialization; it cannot be passed by value and be initialized).
>
> With __traits(isOut, param) one can test whether `auto out` boiled down to `out` or not.

ok.

> After being (potentially|definitely|?) initialized, `out` variables do not trigger `auto out` to become `out`.

That doesn't make sense.

> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }

I don't like that idea. That makes the code more difficult to read.

> The type of an in-place out variable can be left out, when it can be inferred from the called function. [Clearly it can be done in some cases and clearly it cannot be in all templates. Exact rules TBD.]
>
> In the first else branch, `x` can be used, since regardless whether the `f(out int x) > 0 && x > 0` is true or false, evaluating it will initialize `x`.
> In the second else branch, `x` cannot be used because `x` might not be initialized if g(0) is false.
>
> The visibility of in-place out variables is limited to the statement they're declared in. For `if` statements this encompasses both branches, but for expression statements, it only encompasses that expression:
>
>     x = f(out a) + a; // valid
>     y = f(out b);
>     // y += b; // error, b not visible
>     out int c;
>     f(c);
>     z += c; // valid

Meh, I really don't like the idea of declaring a variable inside a function's argument list. Also, I don't think that it will be easy to implement.

> One obvious use-case is functions that return a bool value indicating success and the result is an `out` parameter. Usually, these functions' names begin with try:
>
>     if (tryParseInt(str, out x)) { use(x); }
>
> Another could be unpacking:
>
>     out T x;
>     out S y;
>     tuple.unpack(x, y);
>     // or
>     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

Same as above.

> What do you think? Worth it?

Yes, except the `in-place` part.
Jan 26 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> Main goal: Make the `out` parameter storage class live up to promises. In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail. [...]

Is there any reason to use out parameters at all instead of returning a tuple?
Jan 27 2021
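The two styles being weighed in this sub-thread can be put side by side in current D. This is only a sketch: `halveOut` and `halveTuple` are invented names, and the tuple version uses the library type `std.typecons.Tuple`, not the built-in tuples that are discussed as a possible language feature.

```d
import std.typecons : Tuple;

// Hypothetical example functions, for illustration only.

// Style 1: status as the return value, result through an `out` parameter.
bool halveOut(int n, out int half)
{
    if (n % 2 != 0) return false;
    half = n / 2;
    return true;
}

// Style 2: status and result returned together as a named library tuple.
Tuple!(bool, "ok", int, "half") halveTuple(int n)
{
    if (n % 2 != 0) return typeof(return)(false, 0);
    return typeof(return)(true, n / 2);
}

void main()
{
    int h;
    assert(halveOut(10, h) && h == 5);

    auto r = halveTuple(10);
    assert(r.ok && r.half == 5);
    assert(!halveTuple(3).ok);
}
```

The library tuple is an ordinary struct, so it is returned under the usual struct-return rules of the target ABI.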
On Wednesday, 27 January 2021 at 09:34:36 UTC, Ogi wrote:
> On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> > Main goal: Make the `out` parameter storage class live up to promises. In current semantics, `out` is basically `ref` but with documented intent. The initialization of the parameter is more like a detail. [...]
>
> Is there any reason to use out parameters at all instead of returning a tuple?

Struct ABI can mean overhead in places you don't expect.
Jan 27 2021
On 2021-01-27 19:25, Max Haughton wrote:
> Struct ABI can mean overhead in places you don't expect

If proper tuples are built into the language, the language can invent its own ABI for that type, just like it does for arrays and delegates.

On the other hand, there are a bunch of existing C functions that encode out parameters as pointers. When declaring these in D, they can be declared with `out`, which is more descriptive and safer than a pointer. It shows the intent better.

--
/Jacob Carlborg
Jan 29 2021
On Saturday, 30 January 2021 at 07:27:16 UTC, Jacob Carlborg wrote:
> On 2021-01-27 19:25, Max Haughton wrote:
> > Struct ABI can mean overhead in places you don't expect
>
> If proper tuples are built-in to the language the language can invent its own ABI for that type. Just like it does for arrays and delegates. On the other hand, there are a bunch of existing C functions that encodes out parameter as pointers.

Totally expected. "out" and "ref" parameters in a backend are pointers. In a front end it's "just" an abstraction that allows special checks, typically:

1. accept only lvalues, and
2. don't try implicit conversions.

In addition, for "out" it adds a zeroinit before the call: since it's a kind of return value, it must be defined even if not modified by the callee.
Jan 30 2021
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
> In current semantics, `out` is basically `ref` but with documented intent.

It is more, at least potentially: an optimization aid. The calling function knows that the contents of the `out` variable won't affect the result, unlike with `ref`.

> General Idea
> ============
>
> The idea of an out variable is one that **must** be passed to a function in an `out` parameter position.
>
> Basic example:
>
>     int f(out int value);
>     int g(int[] value...);
>     int h(out int a, out int b);
>
>     out int x;
>     // g(x); // illegal: reads x, but x is not yet initialized.
>     // h(x, x); // illegal:
>     // reads the second x before the initialization of first x is complete.
>     f(x); // initializes x.

I don't like this. It is going to get annoying in cases like this:

```
int f(out int, int);

int func()
{
    out int x;
    if(someCond) x.f(0);
    else if(someOtherCond) x.f(1);
    return x;
}
```

What should the compiler do? It cannot know whether it's possible that x is returned uninitialized. It can issue an error just in case, and we hate to refactor code due to false alarms like that. Or it can ignore it, in which case the `out` storage class will sometimes work and sometimes silently fail. One is still going to need to void-initialize stuff to be sure to elide the default initialization.

> In-place `out` Variables
> ========================
>
> When calling a function with an `out` parameter, instead of passing an argument, a fresh variable can be declared instead:
>
>     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
>     if (g(0) && f(out x) > 0) { g(x); } else { .. }

This, however, sounds better. I'd only leave out the requirement for the caller to specify `out`, and also allow the same for `ref` parameters.
Jan 28 2021
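For reference, the `func` example from the post above can be written in current D with an ordinary local, which sidesteps the "maybe uninitialized" question because locals are default-initialized. This is a sketch under assumptions: the post only declares `f`, so the body given here is invented, and the two conditions are turned into parameters so the snippet is self-contained.

```d
// Hypothetical body for f; the post only shows its signature.
int f(out int result, int seed)
{
    result = seed + 1;
    return result;
}

int func(bool someCond, bool someOtherCond)
{
    int x;                            // default-initialized to 0 in current D
    if (someCond) x.f(0);             // UFCS call, same as f(x, 0)
    else if (someOtherCond) x.f(1);
    return x;                         // well-defined on every path
}

void main()
{
    assert(func(true, false) == 1);
    assert(func(false, true) == 2);
    assert(func(false, false) == 0);  // neither branch ran, x is still int.init
}
```

The cost is the default initialization of `x`, which is exactly what the post says one would otherwise have to suppress by void-initializing.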
On Thursday, 28 January 2021 at 09:17:47 UTC, Dukc wrote:
>     int f(out int, int);
>
>     int func()
>     {
>         out int x;
>         if(someCond) x.f(0);
>         else if(someOtherCond) x.f(1);
>         return x;
>     }
>
> What should the compiler do? It cannot know whether it's possible x can be returned uninitialized. It can issue an error just in case, and we hate to refactor code due to false alarms like that.

Exactly this. Something similar is true for an immutable constructor:

    struct S
    {
        int a;
        this(int x) immutable
        {
            if (x < 0) { a = 1; }
            else if (x > 0) { a = 2; }
            if (x == 0) a = 3; // error
        }
    }

This problem is inherent.
Jan 30 2021
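One common way to live with this kind of conservative check in current D is to do the branching on a mutable local and assign the immutable field exactly once. The sketch below reuses the `S` from the post above; the local name `value` is an addition for illustration, and this is only one possible refactoring, not something proposed in the thread.

```d
struct S
{
    int a;

    this(int x) immutable
    {
        int value;                  // ordinary mutable local
        if (x < 0) value = 1;
        else if (x > 0) value = 2;
        if (x == 0) value = 3;
        a = value;                  // the immutable field is initialized once
    }
}

void main()
{
    immutable neg  = immutable S(-5);
    immutable pos  = immutable S(5);
    immutable zero = immutable S(0);
    assert(neg.a == 1);
    assert(pos.a == 2);
    assert(zero.a == 3);
}
```

This is the sort of refactoring that a false alarm forces on the programmer, which is the annoyance both posts describe.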