digitalmars.D - auto ref deduction and common type deduction inconsistency

reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
Consider these two functions:

auto ref foo(ref int x) {
   if (condition) return x;
   return 3;
}

auto ref bar(ref int x) {
   return condition ? x : 3;
}

At first glance they appear to be equivalent; however, foo is a 
compile-time error ("constant 3 is not an lvalue"), while bar 
compiles fine and returns an rvalue int.

The rule in the spec is: "The lexically first ReturnStatement 
determines the ref-ness of [an auto ref] function".

Why is this? I think it would be more consistent and convenient 
to be: "An auto ref function returns by ref if all return paths 
return an lvalue, else it returns by value".

Am I missing something? I don't see why foo should be rejected at 
compile time when it can happily return by value.

It is especially problematic in generic code where you 
opportunistically want to return by ref when possible, e.g.:

auto ref f(alias g, alias h)()
{
   if (condition)
     return g();
   return h();
}

If g returns by ref while h returns by value then this fails to 
instantiate. It would be nice if it just returned by value (as 
return condition ? g() : h() would)
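
A minimal sketch of the single-return form mentioned above. The helpers `byRef` and `byValue`, the global `storage`, the name `pick`, and the `cond` parameter (standing in for `condition`) are hypothetical and not from the original post. It assumes that a conditional expression is an lvalue only when both branches are lvalues:

```d
int storage = 10;                      // hypothetical module-level state

ref int byRef()   { return storage; }  // yields an lvalue
int     byValue() { return 42; }       // yields an rvalue

// One return statement: the conditional expression decides the ref-ness.
auto ref pick(alias g, alias h)(bool cond)
{
    return cond ? g() : h();
}

// Both branches are lvalues, so the result can have its address taken.
static assert( __traits(compiles, &pick!(byRef, byRef)(true)));
// One branch is an rvalue, so the function returns by value.
static assert(!__traits(compiles, &pick!(byRef, byValue)(true)));

unittest
{
    pick!(byRef, byRef)(true) = 7;     // writes through the returned reference
    assert(storage == 7);
    assert(pick!(byRef, byValue)(false) == 42);
}
```

The cost of this form is that every return path has to be folded into one expression, which is awkward for longer function bodies.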
Aug 19 2014
next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 19 Aug 2014 22:28:25 +0000
Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 The rule in the spec is "The lexically first ReturnStatement
 determines the ref-ness of [an auto ref] function"

 Why is this? I think it would be more consistent and convenient
 to be: "An auto ref function returns by ref if all return paths
 return an lvalue, else it returns by value".
first: compilation speed. the compiler can stop looking at the function right after the first 'return'.
second: it's easier for a human to determine the actual return type this way. imagine that you want to return refs, and somewhere deep in your code you accidentally return some literal. it can take ages to figure out what happened.

just add something like "if (0) return 42;" to foo(). the compiler will eliminate the dead code, but will use 'return 42' to determine the function's return type.
Aug 19 2014
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 19 August 2014 at 23:56:12 UTC, ketmar via 
Digitalmars-d wrote:
 first: compilation speed. compiler can stop looking at function 
 just
 after the first 'return'.
No, it still has to check the other returns for errors anyway.
 second: it's easier to human to determine the actual return 
 type this
 way.
Well, the return type is already the common type of all return paths, so you need to look anyway. This is just whether the return is by ref or by value. In any case, I'd argue correct semantics are preferable to a slight convenience when reading.
 just add something like "if (0) return 42;" to foo(). compiler 
 will
 eliminate dead code, but will use 'return 42' to determine 
 function
 return type.
That doesn't help at all. I want return by ref when possible, not always return by value. If I wanted return by value, I'd just return by value!!
Aug 20 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 20 Aug 2014 14:44:40 +0000
Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Well, the return type is already the common type of all return
 paths
no, it's not. the return type will be taken from the first return statement in code.
 That doesn't help at all. I want return by ref when possible, not
 always return by value. If I wanted return by value, I'd just
 return by value!!
you can't return ref and non-ref simultaneously from one function.
Aug 20 2014
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Wednesday, 20 August 2014 at 14:52:59 UTC, ketmar via 
Digitalmars-d wrote:
 On Wed, 20 Aug 2014 14:44:40 +0000
 Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> 
 wrote:

 Well, the return type is already the common type of all return 
 paths
no, it's not. the return type will be taken from the first return statement in code.
auto foo() {
   if (1) return 1;
   return 2.0;
}

This returns double. Try for yourself.
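
A small sketch to check that claim at compile time, assuming a compiler that accepts mixed return types as described here. The `flag` parameter is a hypothetical stand-in for the constant `if (1)`, and `CommonType` and `ReturnType` come from `std.traits`:

```d
import std.traits : CommonType, ReturnType;

// Same shape as the snippet above, with a runtime flag instead of `if (1)`.
auto foo(bool flag)
{
    if (flag) return 1;   // int
    return 2.0;           // double
}

// The deduced return type is the common type of int and double.
static assert(is(ReturnType!foo == double));
static assert(is(CommonType!(int, double) == double));
// The conditional operator applies the same promotion.
static assert(is(typeof(true ? 1 : 2.0) == double));

unittest
{
    assert(foo(true) == 1.0);
    assert(foo(false) == 2.0);
}
```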
 That doesn't help at all. I want return by ref when possible, 
 not always return by value. If I wanted return by value, I'd 
 just return by value!!
you can't return ref and non-ref simultaneously from one function.
Of course, what I want is:

1. If both returns are lvalues, return by ref.
2. Otherwise, return by rvalue (regardless if one is an lvalue).
Aug 20 2014
next sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Wednesday, 20 August 2014 at 15:08:52 UTC, Peter Alexander 
wrote:
 On Wednesday, 20 August 2014 at 14:52:59 UTC, ketmar via 
 Digitalmars-d wrote:
 On Wed, 20 Aug 2014 14:44:40 +0000
 Peter Alexander via Digitalmars-d 
 <digitalmars-d puremagic.com> wrote:

 Well, the return type is already the common type of all 
 return paths
no, it's not. the return type will be taken from the first return statement in code.
auto foo() {
   if (1) return 1;
   return 2.0;
}

This returns double. Try for yourself.
Then either the compiler or the documentation is wrong :-( "If there are multiple ReturnStatements, the types of them must match exactly." http://dlang.org/function#auto-functions
Aug 20 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 08/20/2014 05:24 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
 On Wednesday, 20 August 2014 at 15:08:52 UTC, Peter Alexander wrote:
 On Wednesday, 20 August 2014 at 14:52:59 UTC, ketmar via Digitalmars-d
 wrote:
 On Wed, 20 Aug 2014 14:44:40 +0000
 Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 Well, the return type is already the common type of all return paths
no, it's not. the return type will be taken from the first return statement in code.
auto foo() {
   if (1) return 1;
   return 2.0;
}

This returns double. Try for yourself.
Then either the compiler or the documentation is wrong :-( "If there are multiple ReturnStatements, the types of them must match exactly." http://dlang.org/function#auto-functions
https://issues.dlang.org/show_bug.cgi?id=8307
Aug 20 2014
prev sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 20 Aug 2014 15:08:48 +0000
Peter Alexander via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 auto foo() {
    if (1) return 1;
    return 2.0;
 }

 This returns double. Try for yourself.
i wasn't talking about integer promotions, but yes, it works here. and i'm sure that it shouldn't -- i consider this a bug.
 1. If both returns are lvalues, return by ref.
 2. Otherwise, return by rvalue (regardless if one is an lvalue).
see above. now i understand you, but i think that any kind of type conversion should try to convert to the type of the first return (i.e. promote int to double if the first return returned 2.0, but emit an error if the first return returns int and the second tries to return float). and for (cond ? x : 42) the compiler should emit an error too, 'cause the exact return type can't be determined. maybe it should do this only for 'auto ref' though.
Aug 20 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Wednesday, 20 August 2014 at 15:29:05 UTC, ketmar via 
Digitalmars-d wrote:
 and for (cond ? x : 42) compiler should emit error too, 'cause
 exact return type can't be determined. maybe it should do this 
 only for
 'auto ref' though.
No, the type of (cond ? x : 42) is always `int`, the `ref` already gets lost inside the ternary operator. So in this case, it behaves correctly.
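
A short sketch of that point. The function `demo` and its locals are hypothetical names. It assumes that a conditional expression is an lvalue only when both branches are lvalues of the same type, while its type is `int` either way:

```d
void demo(bool cond)
{
    int x = 1, y = 2;

    // The type is int whether or not a literal is involved.
    static assert(is(typeof(cond ? x : 42) == int));
    static assert(is(typeof(cond ? x : y) == int));

    // Lvalue-ness differs: only the all-lvalue form can be addressed.
    static assert( __traits(compiles, &(cond ? x : y)));
    static assert(!__traits(compiles, &(cond ? x : 42)));

    // An lvalue result can be assigned through.
    (cond ? x : y) = 5;
    assert((cond ? x : y) == 5);
}
```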
Aug 20 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 20 Aug 2014 16:34:40 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No, the type of (cond ? x : 42) is always `int`, the `ref`
 already gets lost inside the ternary operator. So in this case,
 it behaves correctly.
it's slightly counterintuitive. yes, this is correct, but D already complains about things like (a&b == c), so it could complain about such returns too -- just to avoid ambiguity. something like "use return cast(int)".
Aug 20 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Wednesday, 20 August 2014 at 16:45:49 UTC, ketmar via 
Digitalmars-d wrote:
 On Wed, 20 Aug 2014 16:34:40 +0000
 via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 No, the type of (cond ? x : 42) is always `int`, the `ref` 
 already gets lost inside the ternary operator. So in this 
 case, it behaves correctly.
it's slightly counterintuitive. yes, this is correct, but D already complains about things like (a&b == c), so it could complain about such returns too -- just to avoid ambiguity. something like "use return cast(int)".
What would `cast(int)` do in this case? Note that `ref` is not part of the type -- it's a storage class associated only with the variable (= parameter).
Aug 20 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 20 Aug 2014 18:24:29 +0000
via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 What would `cast(int)` do in this case?
just clarifying the intentions. remember that programs are written for humans in the first place, and programs should be easy for humans to read. it's not so hard to type an extra nine chars, but the reader then knows what the author really wants and can be sure that the author didn't slip into a crack in the specs or a specific implementation. "(a&b == c)" is perfectly valid too, yet the compiler rejects it, demanding clarification from the author.
Aug 20 2014
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 19 August 2014 at 22:28:27 UTC, Peter Alexander wrote:
 Consider these two functions:

 auto ref foo(ref int x) {
   if (condition) return x;
   return 3;
 }

 auto ref bar(ref int x) {
   return condition ? x : 3;
 }

 At a first glance, they appear to be equivalent, however foo is 
 a compile-time error "constant 3 is not an lvalue" while bar 
 compiles fine and returns an rvalue int.

 The  rule in the spec is "The lexically first ReturnStatement 
 determines the ref-ness of [an auto ref] function"

 Why is this? I think it would be more consistent and convenient 
 to be: "An auto ref function returns by ref if all return paths 
 return an lvalue, else it returns by value".

 Am I missing something? I don't see why foo should be rejected 
 at compile time when it can happily return by value.

 It is especially problematic in generic code where you 
 opportunistically want to return by ref when possible, e.g.:

 auto ref f(alias g, alias h)()
 {
   if (condition)
     return g();
   return h();
 }

 If g returns by ref while h returns by value then this fails to 
 instantiate. It would be nice if it just returned by value (as 
 return condition ? g() : h() would)
While I agree, you must understand that this wildly increases the cost of the analysis required to infer the return type and/or ref-ness. It makes the compiler implementation *way* more complex and could increase the compilation cost a lot.

Consider that auto ref functions can happily call each other: you'd have to walk them as a graph, keeping a set of possible return types and ref-ness for each, aggregating this information per function, and removing cycles (so that a function's return type does not depend on itself).

This hack in the spec comes in handy and, unless we can come up with a good implementation of a more general spec, I'd argue for it to stay.
Aug 20 2014
parent reply Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 08/20/14 23:39, deadalnix via Digitalmars-d wrote:
 On Tuesday, 19 August 2014 at 22:28:27 UTC, Peter Alexander wrote:
 It is especially problematic in generic code where you opportunistically want
to return by ref when possible, e.g.:

 auto ref f(alias g, alias h)()
 {
   if (condition)
     return g();
   return h();
 }

 If g returns by ref while h returns by value then this fails to instantiate.
It would be nice if it just returned by value (as return condition ? g() : h()
would)
While I agree, you must understand that this wildly increases the cost of the analysis required to infer the return type and/or ref-ness. It makes the compiler implementation *way* more complex and could increase the compilation cost a lot. Consider that auto ref functions can happily call each other: you'd have to walk them as a graph, keeping a set of possible return types and ref-ness for each, aggregating this information per function, and removing cycles (so that a function's return type does not depend on itself). This hack in the spec comes in handy and, unless we can come up with a good implementation of a more general spec, I'd argue for it to stay.
While D's `ref` is a hack, it's /already/ part of the function type/signature.

The return type of a function is /already/ (i.e. in the D dialects supported by recent frontend releases) determined from *all* returned expressions.

What would be the advantage of propagating/inferring only the type, but not the lvalueness?...

artur
Aug 20 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 21 August 2014 at 05:24:13 UTC, Artur Skawina via 
Digitalmars-d wrote:
 While D's `ref` is a hack, it's /already/ part of the function 
 type/signature.
 The return type of a function is /already/ (ie in the D 
 dialects supported
 by recent frontend releases) determined from *all* returned 
 expression.
 What would be the advantage of propagating/inferring only the 
 type, but not
 the lvalueness?...
I think I understand the issue better now. D doesn't always deduce a common return type, e.g.

class A {}
class B {}

auto foo() {
   return new A();
   return new B();
}

This fails to compile with "mismatched function return type", even though it could easily return Object.

However, it seems to do some deduction of sorts with arithmetic types, e.g. this deduces a return type of double:

auto foo() {
   return 0;
   return 0.0;
   return 0UL;
}

I'm not sure what logic it is using to do common type deductions. I haven't investigated fully.

The problem comes with recursion, which we don't handle at the moment for auto or auto ref functions. Handling that becomes much easier when you just assume the return type is that of the first return statement, so I see the value in the described approach.
Aug 21 2014