digitalmars.D - Rvalue references - The resolution


Walter Bright <newshound2 digitalmars.com> writes:

Thanks to the many recent threads on this, and the DIPs on it, everyone was 
pretty much up to speed and ready to find a resolution. This resolution only 
deals with the memory safety issue.

The first point is that rvalues are turned into references by the simple 
expedient of creating a temporary, copying the rvalue into the temporary, and 
taking the address of that temporary. Therefore, the issue is really about 
returning references to stack variables that have gone out of scope. From a 
memory safety standpoint, this is unacceptable, as D strives to be a memory 
safe language. The solution in other languages of "just don't do that" is 
invalid for D.
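
A minimal sketch of that lowering, written out by hand. The names `bump` and 
`tmp` are made up for illustration; the compiler's own temporary would be 
unnamed, and this only approximates what it would do.

```d
// Hypothetical function taking its argument by ref.
void bump(ref int x) { x += 1; }

void main()
{
    // Conceptual lowering of a call such as bump(40 + 1):
    int tmp = 40 + 1;   // 1. create a stack temporary and copy the rvalue into it
    bump(tmp);          // 2. pass a reference to that temporary
    assert(tmp == 42);  // the callee wrote to the temporary, which dies at scope exit
}
```

Because `tmp` lives in the caller's stack frame, a ref to it that escapes 
through a return statement is the same problem as a ref to any other local.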

Cases where this can occur:

Case A:
     ref T fooa(ref T t) { return t; }
     ref T bar() { T t; return fooa(t); }

Case B:
     ref T foob(ref U u) { return u.t; }   // note that T is derivable from U
     ref T bar() { U u; return foob(u); }

Case C:
     struct S { T t; ref T fooc() { return t; } }
     ref T bar() { S s; return s.fooc(); }

Case D:
     Returning ref to uplevel local:

     ref T food() {
         T t;
         ref T bar() { return t; }
         return bar();
     }

Case E:
     Transitively calling other functions:

     ref T fooe(T t) { return fooa(t); }



Observations:

1. Always involves a return statement.
2. The return type must always be the type of the stack variable or a type 
derived from a stack variable's type via safe casting or subtyping.
3. Returning rvalues is the same issue, as rvalues are always turned into local 
stack temporaries.
4. Whether a function returns a ref derived from a parameter or not is not 
reflected in the function signature.
5. Always involves passing a local by ref to a function that returns by ref, 
and that function gets called in a return statement.

Scope Ref

http://wiki.dlang.org/DIP35 is one solution, but Andrei and I argued strongly 
against it due to the perceived complexity the user would face with it. I also 
argued against it due to Case C (where would the scope annotation go) and the 
possibility that functions returning ref would have to appear in pairs - one 
with scope ref parameters, the other without - and the copy/pasta duplication 
of the function bodies (which appears in C++ const& functions).

Andrei & I argued that we needed to make it work with just ref annotations.


Static Compiler Detection (in @safe mode):

1. Do not allow taking the address of a local variable, unless doing a safe 
type 'paint' operation.

2. In some cases, such as nested, private, and template functions, the source 
is always available so the compiler can error on those. Because of the .di file 
problem, doing this with auto return functions is problematic.

3. Issue error on return statements where the expression may contain a ref to a 
local that is going out of scope, taking into account the observations.

Runtime Detection

There are still a few cases that the compiler cannot statically detect. For 
these a runtime check is inserted, which compares the returned ref pointer to 
see if it lies within the stack frame of the exiting function, and if it does, 
halts the program. The cost will be a couple of CMP instructions and an LEA. 
These checks would be omitted if the -noboundscheck compiler switch was 
provided.

The runtime check would not be on all ref returning functions. It'll only be on 
those where the compiler cannot prove a ref to a local is not being returned.

The good thing about the runtime detection is that ref's use is restricted 
enough that merely executing all the code paths will check all the 
possibilities.
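
The shape of the runtime check described above can be sketched by hand as 
follows. This is only an illustration under stated assumptions: the helper 
name `pointsIntoFrame` is hypothetical, and a local array stands in for the 
exiting function's stack frame, whose real bounds only the compiler knows.

```d
/// Hypothetical helper: true if address `p` lies in [frameLo, frameHi).
/// Assumes frameLo <= frameHi.
bool pointsIntoFrame(size_t p, size_t frameLo, size_t frameHi)
{
    return p >= frameLo && p < frameHi;   // the "couple of CMP instructions"
}

void main()
{
    int[4] frame;        // stands in for the exiting function's stack frame
    static int global;   // storage that outlives any stack frame

    auto lo = cast(size_t) frame.ptr;
    auto hi = lo + frame.sizeof;

    // A ref into the frame would be caught; the program would halt here.
    assert(pointsIntoFrame(cast(size_t) &frame[2], lo, hi));

    // A ref to static storage passes the check and may be returned.
    assert(!pointsIntoFrame(cast(size_t) &global, lo, hi));
}
```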

May 04 2013

"David Nadlinger" <see klickverbot.at> writes:

On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Thanks to the many recent threads on this, and the dips on it, 
 everyone was pretty much up to speed and ready to find a 
 resolution. This resolution only deals with the memory safety 
 issue.

And to anybody who couldn't make it to DConf: You definitely 
missed something here. There were literally hours and hours of 
heated, yet focused debate about the issue. Although with all the 
smart people around, we should have probably tackled some much 
bigger problem, say world poverty… ;)

David

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 2:56 PM, David Nadlinger wrote:
 On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Thanks to the many recent threads on this, and the dips on it,
 everyone was pretty much up to speed and ready to find a resolution.
 This resolution only deals with the memory safety issue.

 And to anybody who couldn't make it to DConf: You definitely missed
 something here. There were literally hours and hours of heated, yet
 focused debate about the issue. Although with all the smart people
 around, we should have probably tackled some much bigger problem, say
 world poverty… ;)

 David

Next year.

Andrei

May 04 2013

"Tove" <tove fransson.se> writes:

On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Runtime Detection

 There are still a few cases that the compiler cannot statically 
 detect. For these a runtime check is inserted, which compares 
 the returned ref pointer to see if it lies within the stack 
 frame of the exiting function, and if it does, halts the 
 program. The cost will be a couple of CMP instructions and an 
 LEA. These checks would be omitted if the -noboundscheck 
 compiler switch was provided.

Thanks for taking the time to detail the solution, I was quite 
curious.

Runtime Detection and opt-out with "-noboundscheck" is a stroke 
of genius!

"couple of CMP instructions"
should be possible to reduce to only one with the "normal" 
unsigned range check idiom, no?
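
For reference, the idiom in question can be sketched like this. The helper 
names are hypothetical and the addresses are plain size_t values; this is not 
the code the compiler would emit, only the arithmetic behind it.

```d
/// Two-comparison form: lo <= p < hi.
bool inRangeTwoCmp(size_t p, size_t lo, size_t hi)
{
    return p >= lo && p < hi;
}

/// One-comparison form, assuming lo <= hi. size_t is unsigned, so when
/// p < lo the subtraction wraps around to a very large value and the
/// single comparison fails.
bool inRangeOneCmp(size_t p, size_t lo, size_t hi)
{
    return p - lo < hi - lo;
}

void main()
{
    size_t lo = 100;
    size_t hi = 116;
    size_t[] samples = [0, 99, 100, 107, 115, 116, size_t.max];

    // Both forms agree on every sample, including the boundaries.
    foreach (p; samples)
        assert(inRangeOneCmp(p, lo, hi) == inRangeTwoCmp(p, lo, hi));
}
```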

Looking forward to hearing more cool news. :)

May 04 2013

"Araq" <rumpf_a gmx.de> writes:

 Runtime Detection and opt-out with "-noboundscheck" is a stroke 
 of genius!

Thanks. ;-)

Araq wrote in January:

You can also look at how Algol solved this over 40 years ago:
Insert a runtime check that the escaping reference does not point
to the current stack frame which is about to be destroyed. The
check should be very cheap at runtime but it can be deactivated
in a release build for efficiency just like it is done for array
indexing.

http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d@puremagic.com?page=6

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 4:15 PM, Araq wrote:
 Runtime Detection and opt-out with "-noboundscheck" is a stroke of
 genius!

 Thanks. ;-)

 Araq wrote in January:

 You can also look at how Algol solved this over 40 years ago:
 Insert a runtime check that the escaping reference does not point
 to the current stack frame which is about to be destroyed. The
 check should be very cheap at runtime but it can be deactivated
 in a release build for efficiency just like it is done for array
 indexing.

 http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d puremagic.com?page=6

Whoa. Kudos!

Andrei

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 1:20 PM, Andrei Alexandrescu wrote:
 On 5/4/13 4:15 PM, Araq wrote:
 Runtime Detection and opt-out with "-noboundscheck" is a stroke of
 genius!

 Thanks. ;-)

 Araq wrote in January:

 You can also look at how Algol solved this over 40 years ago:
 Insert a runtime check that the escaping reference does not point
 to the current stack frame which is about to be destroyed. The
 check should be very cheap at runtime but it can be deactivated
 in a release build for efficiency just like it is done for array
 indexing.

 http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d puremagic.com?page=6

 Whoa. Kudos!

Araq for the win!

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 19:40:36 UTC, Tove wrote:
 On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Runtime Detection

 There are still a few cases that the compiler cannot 
 statically detect. For these a runtime check is inserted, 
 which compares the returned ref pointer to see if it lies 
 within the stack frame of the exiting function, and if it 
 does, halts the program. The cost will be a couple of CMP 
 instructions and an LEA. These checks would be omitted if the 
 -noboundscheck compiler switch was provided.

 Thanks for taking the time to detail the solution, I was quite 
 curious.

 Runtime Detection and opt-out with "-noboundscheck" is a stroke 
 of genius!

 "couple of CMP instructions"
 should be possible to reduce to only one with the "normal" 
 unsigned range check idiom, no?

 Looking forwards to hear more cool news. :)

It shouldn't be expensive. Additionally, consider that returning 
by reference is quite rare in practice.

Due to D's semantics, returning by reference isn't a performance 
improvement (you get full performance returning by value in D), 
so you only return by reference when you intend to keep identity 
(i.e., when you intend to modify a given value, in containers for 
instance).

I still think this is inferior to Rust's solution, and I would like 
to see ref as an equivalent of Rust's borrowed pointer. It achieves 
the same safety at compile time instead of at runtime, and incurs no 
extra complexity except in some very rare cases (when you have a 
function taking several arguments by ref and also returning by 
ref, and the lifetime of the returned ref isn't the union of the 
lifetimes of the ref parameters - a very specific case).

Talking with people at DConf, it seems that many of them didn't 
know how Rust solves that issue, so I'm not sure we should 
validate the proposal.

At first glance, it seems that the proposal allows for a rather 
painless later inclusion of the concept of borrowed pointers; if 
we can ensure that this is effectively the case, I'm definitely 
for it.

But we shouldn't close the door on that concept. After all, D is 
about doing as much as possible at compile time, and when we have 
the choice to trade a runtime check for a compile-time one, 
we must go for it.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 2:22 PM, deadalnix wrote:
 I still think this is inferior to Rust's solution and like to see ref as a
 equivalent of the Rust borrowed pointer. It achieves the same safety at compile
 time instead at runtime, and incurs no extra complexity except in some very
 rare cases (when you have a function taking several arguments by ref and
 returning also by ref and the lifetime of the returned ref isn't the union of
 the lifetime of the ref parameters - a very specific case).

As you say, D refs are analogous to Rust's borrowed pointers, and for the 
escaping ref problem, Rust requires additional annotations (much like the 
'scope ref' proposal).

http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers

The runtime check is because Andrei & I really didn't like requiring additional 
annotations.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 21:48:25 UTC, Walter Bright wrote:
 On 5/4/2013 2:22 PM, deadalnix wrote:
 I still think this is inferior to Rust's solution and like to 
 see ref as a
 equivalent of the Rust burrowed pointer. It achieve the same 
 safety at compile
 time instead at runtime, and incurs no extra complexity except 
 in some very rare
 cases (when you have a function taking several arguments by 
 ref and returning
 also by ref and the lifetime of the returned ref isn't the 
 union of the lifetime
 of the ref parameters - a very specific case).

 As you say, D ref's are analogous to Rust's borrowed pointers, 
 and for the escaping ref problem, Rust requires additional 
 annotations (much like the 'scope ref' proposal).

 http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers

 The runtime check is because Andrei & I really didn't like 
 requiring additional annotations.

Where you miss the point is that these annotations may be 
omitted (and they are most of the time). When nothing is 
specified, the lifetime of the returned reference is considered 
to be the union of the parameters' lifetimes, which is 
what you want in 99% of cases.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 21:57:14 UTC, deadalnix wrote:
 On Saturday, 4 May 2013 at 21:48:25 UTC, Walter Bright wrote:
 On 5/4/2013 2:22 PM, deadalnix wrote:
 I still think this is inferior to Rust's solution and like to 
 see ref as a
 equivalent of the Rust burrowed pointer. It achieve the same 
 safety at compile
 time instead at runtime, and incurs no extra complexity 
 except in some very rare
 cases (when you have a function taking several arguments by 
 ref and returning
 also by ref and the lifetime of the returned ref isn't the 
 union of the lifetime
 of the ref parameters - a very specific case).

 As you say, D ref's are analogous to Rust's borrowed pointers, 
 and for the escaping ref problem, Rust requires additional 
 annotations (much like the 'scope ref' proposal).

 http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers

 The runtime check is because Andrei & I really didn't like 
 requiring additional annotations.

 Where you miss the point, is that these annotations may be 
 omitted (and they are most of the time). When nothing is 
 specified, the lifetime of the returned reference is considered 
 to be the union of the lifetime of parameters lifetime, which 
 is what you want in 99% of cases.

Note: we may also decide that the lack of an explicit lifetime 
means a runtime check as proposed, instead of being an error.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 3:03 PM, deadalnix wrote:
 Where you miss the point, is that these annotations may be omitted (and they
 are most of the time). When nothing is specified, the lifetime of the returned
 reference is considered to be the union of the lifetime of parameters
 lifetime, which is what you want in 99% of cases.

 Note : We may also choose the lack of explicit lifetime means runtime check as
 proposed, instead of being an error.

D omits the check when it can prove that the returned ref is not a ref to one 
of the parameters that is local.

My other comments about 'scope ref' in the first posting in this thread apply 
as well to the Rust annotation scheme.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

To put it another way, we wish to solve the problem without introducing more 
annotations. Rust's solution requires additional annotations, and so is not 
what we're looking for.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 22:33:58 UTC, Walter Bright wrote:
 To put it another way, we wish to solve the problem without 
 introducing more annotations. Rust's solution requires 
 additional annotations, and so is not what we're looking for.

Require isn't the right word, or you have to explain yourself much 
more.

For instance, see : 
http://smallcultfollowing.com/babysteps/blog/2012/07/19/yet-another-tutorial-on-borrowed-pointers/

"So far we have always used the notation &T for a borrowed 
pointer. However, sometimes if a function takes many parameters, 
it is useful to be able to group those parameters by lifetime."

In other terms, you need to have several parameters passed 
by ref, plus a return by ref, plus a wish for a different lifetime 
for your returned value than the union of the parameters' lifetimes.

That is a very specific case, and it is far from "require". 
Most cases don't require anything, while 1% require an 
explicit lifetime. Unless you do some goofy stuff, you don't even 
need to know about it. And if we decide that no explicit lifetime 
== runtime check, it is never required at all. It is just provided 
for the dev who wants to have the runtime check removed.

To be more concrete, return by ref in D is mostly useful for 
collections and ranges. None of these would ever require an 
explicit lifetime.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 3:50 PM, deadalnix wrote:
 Require isn't the right word, or you hav to explain yourself much more.

You need an explicit annotation if a ref parameter is returned by ref by that 
function. This is what Rust's annotations do.

Consider:

     ref T foob(ref U u) { return u.t; }

     ref T bar() { U u; return foob(u); }

The compiler cannot know that the ref return of foob is referring to local u 
(as opposed to, say, a ref to a global) unless it is annotated to say so. Rust 
is no different.

May 04 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 05/05/2013 01:30 AM, Walter Bright wrote:
 On 5/4/2013 3:50 PM, deadalnix wrote:
 Require isn't the right word, or you hav to explain yourself much more.

 You need an explicit annotation if a ref parameter is returned by ref by
 that function. This is what Rust's annotations do.

 Consider:

      ref T foob(ref U u) { return u.t; }

      ref U bar() { U u; return foob(u); }

 The compiler cannot know that the ref return of foob is referring to
 local u (as opposed to, say, a ref to a global) unless it is annotated
 to say so. Rust is no different.

What is the point? Rust conservatively assumes this by default.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 4:51 PM, Timon Gehr wrote:
 What is the point? Rust conservatively assumes this by default.

We could do that, too, and then disallow all code that looks like:

       ref T foob(ref U u);

       ref T bar() { U u; return foob(u); }

which I doubt would be very popular. Or we could add "scope ref" annotations 
everywhere, which brings another set of problems as I pointed out.

I.e. there is no free lunch with this. Rust uses annotations, it doesn't have a 
clever way to not have them. The choices are:

1. use annotations
2. issue error on otherwise useful cases
3. add runtime check
4. put 'suspicious' locals on the heap, like what is done for closures

We decided that (3) was the most practical and was the easiest for users to 
deal with.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 23:30:01 UTC, Walter Bright wrote:
 On 5/4/2013 3:50 PM, deadalnix wrote:
 Require isn't the right word, or you hav to explain yourself 
 much more.

 You need an explicit annotation if a ref parameter is returned 
 by ref by that function. This is what Rust's annotations do.

 Consider:

     ref T foob(ref U u) { return u.t; }

     ref U bar() { U u; return foob(u); }

 The compiler cannot know that the ref return of foob is 
 referring to local u (as opposed to, say, a ref to a global) 
 unless it is annotated to say so. Rust is no different.

This code sample won't require any annotation in Rust. And it 
illustrates wonderfully what I'm saying: most people in the 
discussion (and it has been shown now that this includes you) 
were unaware of how Rust solves the problem.

I don't think excluding a solution that isn't understood is the 
smartest thing to do.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 5:02 PM, deadalnix wrote:
 On Saturday, 4 May 2013 at 23:30:01 UTC, Walter Bright wrote:
 On 5/4/2013 3:50 PM, deadalnix wrote:
 Require isn't the right word, or you hav to explain yourself much more.

 You need an explicit annotation if a ref parameter is returned by ref by that
 function. This is what Rust's annotations do.

 Consider:

     ref T foob(ref U u) { return u.t; }

     ref T bar() { U u; return foob(u); }

 The compiler cannot know that the ref return of foob is referring to local u
 (as opposed to, say, a ref to a global) unless it is annotated to say so. Rust
 is no different.

 This code sample won't require any annotation in Rust.

If the compiler accepts that code, it will crash at runtime. If it doesn't 
accept that code, then it will also disallow legitimate code like:

      ref T foob(ref U u) { static T t; return t; }

      ref T bar() { U u; return foob(u); }

 And it illustrate
 wonderfully what I'm saying : most people in the discussion (and it has been
 shown now that this includes you) were unaware of how does Rust solve the
 problem.

 I don't think excluding a solution that isn't understood is the smartest thing
 to do.

I suggest you enumerate the cases with a Rust-like system and show us how it 
solves the problem without annotations. Note that Rust has pretty much zero 
real world usage - it's one thing to say needing to use annotations is 'rare' 
and another to know it based on typical usage patterns of the language.

For example, if the default is "assume the ref return refers to the ref 
parameter", then some containers would require the annotation and some would 
not. This is not very viable when doing generic coding, unless you are willing 
to provide two copies of each such function - one with the annotations and the 
other without.

Note also that if you have A calls B calls C, the annotation on C doesn't 
propagate up to B, again leading to a situation where you're forced to make two 
versions of the functions.

(I say doesn't propagate because in a language that supports separate 
compilation, all the compiler knows about a function is its signature.)

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Sunday, 5 May 2013 at 00:47:00 UTC, Walter Bright wrote:
 If the compiler accepts that code, it will crash at runtime. If 
 it doesn't accept that code, then it will also disallow 
 legitimate code like:

      ref T foob(ref U u) { static T t; return t; }

      ref T bar() { U u; return foob(u); }

It doesn't accept it, with or without any combination of 
annotations. Now, the example with a static does effectively require an 
annotation.

 And it illustrate
 wonderfully what I'm saying : most people in the discussion 
 (and it has been
 shown now that this includes you) were unaware of how does 
 Rust solve the problem.

 I don't think excluding a solution that isn't understood is 
 the smartest thing
 to do.

 I suggest you enumerate the cases with a Rust-like system and 
 show us how it solves the problem without annotations. Note 
 that Rust has pretty much zero real world usage - it's one 
 thing to say needing to use annotations is 'rare' and another 
 to know it based on typical usage patterns of the language.

Rust assumes, when no annotation is present, that the returned ref's 
lifetime is the union of the ref parameters' lifetimes. I'm sure we can 
find an example of D code somewhere that doesn't fit into this, but 
real world usage in D would almost never require any annotation 
(this is the case for every D codebase I've played with so far, 
and I don't actually see any use case for examples like the static 
one mentioned above).

 For example, if the default is "assume the ref return refers to 
 the ref parameter", then some containers would require the 
 annotation and some would not. This is not very viable when 
 doing generic coding, unless you are willing to provide two 
 copies of each such function - one with the annotations and the 
 other without.

The default can't be that, as several parameters can be passed by 
ref. The default is that the returned ref's lifetime is the union of the 
ref parameters' lifetimes. I don't see any container that requires the 
annotation.

 Note also that if you have A calls B calls C, the annotation on 
 C doesn't propagate up to B, again leading to a situation where 
 you're forced to make two versions of the functions.

 (I say doesn't propagate because in a language that supports 
 separate compilation, all the compiler knows about a function 
 is its signature.)

It doesn't require code duplication. Named lifetimes make sense 
for the caller, not the callee (in which they are only identifiers 
that can be used to describe lifetime relations explicitly for 
the caller).

May 05 2013

Martin Nowak <code dawg.eu> writes:

On 05/05/2013 12:30 AM, Walter Bright wrote:
 On 5/4/2013 3:03 PM, deadalnix wrote:
 Where you miss the point, is that these annotations may be omitted (and they
 are most of the time). When nothing is specified, the lifetime of the returned
 reference is considered to be the union of the lifetime of parameters
 lifetime, which is what you want in 99% of cases.

 Note : We may also choose the lack of explicit lifetime means runtime check as
 proposed, instead of being an error.

 D omits the check when it can prove that the returned ref is not a ref
 to one of the parameters that is local.

ref int foo(ref int a, ref int b);

It's a very nice observation that calling foo with only non-local 
references means that the returned reference is non-local too.
In a way this works like inout but with a safe default so
that no annotation is needed.

In fact it's also possible to know that these don't return a reference 
to their parameter.

ref double foo(ref int a);

struct S {}
ref double foo(ref S a);

It can become somewhat complicated to check though.

Anyhow I think using flow-analysis to omit runtime checks is a nice 
approach.

May 26 2013

Timothee Cour <thelastmammoth gmail.com> writes:

 In fact it's also possible to know that these don't return a reference to
 their parameter.

Watch out for this:
struct S {double x;}
ref double foo(ref S a){return a.x;}

This sounds hacky. I've proposed a general solution here:
http://wiki.dlang.org/DIP38
either with user annotations of ref-return functions (scheme A) (just
distinguishing ref vs scope ref), or with compiler taking care of
annotations (scheme B).
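
The counterexample above can be checked with a minimal sketch. It reuses the names S and foo from the post and assumes plain non-@safe code; the main function and its assertions are added purely for illustration. Although the return type (double) differs from the parameter type (S), the returned reference still points into the parameter, so a purely type-based rule would miss it.

```d
struct S { double x; }

// Return type differs from the parameter type, yet the result
// is a reference to a field of the parameter.
ref double foo(ref S a) { return a.x; }

void main()
{
    S s = S(1.5);
    foo(s) = 2.5;              // writes through the returned reference
    assert(s.x == 2.5);        // the field of `s` was modified
    assert(&foo(s) == &s.x);   // same address: the ref aliases the parameter
}
```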

On Sun, May 26, 2013 at 1:21 PM, Martin Nowak <code dawg.eu> wrote:

 On 05/05/2013 12:30 AM, Walter Bright wrote:
 On 5/4/2013 3:03 PM, deadalnix wrote:
 Where you miss the point, is that these annotations may be omitted (and they
 are most of the time). When nothing is specified, the lifetime of the returned
 reference is considered to be the union of the lifetime of parameters
 lifetime, which is what you want in 99% of cases.

 Note : We may also choose the lack of explicit lifetime means runtime check as
 proposed, instead of being an error.

 D omits the check when it can prove that the returned ref is not a ref
 to one of the parameters that is local.

 ref int foo(ref int a, ref int b);

 It's a very nice observation that calling foo with only non-local
 references means that the returned reference is non-local too.
 In a way this works like inout but with a safe default so
 that no annotation is needed.

 In fact it's also possible to know that these don't return a reference to
 their parameter.

 ref double foo(ref int a);

 struct S {}
 ref double foo(ref S a);

 It can become somewhat complicated to check though.

 Anyhow I think using flow-analysis to omit runtime checks is a nice
 approach.

May 26 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 26 May 2013 18:56:58 -0400, Timothee Cour  
<thelastmammoth gmail.com> wrote:

 In fact it's also possible to know that these don't return a reference to
 their parameter.

 Watch out for this:
 struct S {double x;}
 ref double foo(ref S a){return a.x;}

That case is covered by the proposal.  It incurs a runtime check (worst  
case, best case it simply doesn't compile).

-Steve

May 28 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 5:22 PM, deadalnix wrote:
 I still think this is inferior to Rust's solution and would like to see ref as
 an equivalent of the Rust borrowed pointer.

http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html

Andrei

May 04 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html

The management of pointers is one of the most refined parts of 
the Rust design. It offers safety, allows per-thread GCs, and 
more. It's powerful but it also adds some complexity to the 
language.

Bye,
bearophile

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 5:42 PM, bearophile wrote:
 Andrei Alexandrescu:

 http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html

 The management of pointers is one of the most refined parts of the Rust design.
 It offers safety, allows per-thread GCs, and more. It's powerful but it also
 adds some complexity to the language.

Years ago, Bartosz proposed an ownership system for pointers. While sound, it 
was rather complicated.

I don't think a complex system is going to gain wide adoption.

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, May 04, 2013 19:07:25 Walter Bright wrote:
 On 5/4/2013 5:42 PM, bearophile wrote:
 Andrei Alexandrescu:
 http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html

 
 The management of pointers is one of the most refined parts of the Rust
 design. It offers safety, allows per-thread GCs, and more. It's powerful
 but it also adds some complexity to the language.

 
 Years ago, Bartosz proposed an ownership system for pointers. While sound,
 it was rather complicated.
 
 I don't think a complex system is going to gain wide adoption.

The trick is balancing it so that it's powerful enough and yet not too 
complicated to be useable by normal programmers. I think that we're okay, but 
I also think that we're pushing it as it is. Going with Bartosz's proposal would 
almost certainly have been too much.

As it is, we arguably didn't choose the best defaults with the attributes that 
we have (e.g. @system is the default instead of @safe, and impure is the 
default instead of pure). The result is that we have to use a lot of 
annotations if we want to properly take advantage of the various language 
features, whereas ideally, having to use annotations for stuff like @safety or 
purity would be the exception. Don was complaining that one reason that moving 
to D2 at Sociomantic looks unappealing in spite of the benefits is the fact 
that they're going to have to add so many extra annotations to their code.

- Jonathan M Davis

May 04 2013

"David Nadlinger" <see klickverbot.at> writes:

On Sunday, 5 May 2013 at 02:36:45 UTC, Jonathan M Davis wrote:
 Don was complaining that one reason that moving
 to D2 at Sociomantic looks unappealing in spite of the benefits 
 is the fact
 that they're going to have to add so many extra annotations to 
 their code.

When did he mention that? If I had noticed, I would have been 
interested in a closer rationale, as D2's extra annotations are 
pretty much opt-in only, even more so if you are using your own 
library anyway.

David

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, May 05, 2013 06:10:34 David Nadlinger wrote:
 On Sunday, 5 May 2013 at 02:36:45 UTC, Jonathan M Davis wrote:
 Don was complaining that one reason that moving
 to D2 at Sociomantic looks unappealing in spite of the benefits
 is the fact
 that they're going to have to add so many extra annotations to
 their code.

 
 When did he mention that?

It was during dinner Friday night when he and Manu were discussing stuff in D 
that was problematic for companies like the ones that they work for. You were 
far enough down the table that you wouldn't have heard.

 If I had noticed, I would have been
 interested in a closer rationale, as D2's extra annotations are
 pretty much opt-in only, even more so if you are using your own
 library anyway.

True, they're opt-in, but it's also true that we generally consider it good 
style to use them as much as possible, which tends to mean using them all over 
the place - to the point that it starts seeming very odd that @safe and pure 
aren't the default, particularly when @system code is generally supposed to be 
the minority of your program, and very few functions should need to access 
global state. The two reasons that they don't get used way more in Phobos are 
that it uses templates so heavily, and that some basic stuff that gets 
used all over the place isn't pure yet even though it's supposed to be.

I'm sure that Don could answer about his concerns better than I could, but I 
think that it pretty much came down to the fact that D2 had a bunch of new 
attributes that they then had to worry about, many of which more or less only 
provide theoretical benefits which may or may not materialize at some point in 
the future.

For instance, optimizations with pure don't really happen all that often. 
There just aren't enough cases where the arguments are immutable (or 
implicitly convertible to immutable) for it to apply frequently, and IIRC, 
optimizations are only applied within the same statement, meaning that when 
they _are_ applied, they don't generally remove many function calls. The 
compiler doesn't even try and optimize across multiple lines within the same 
function (since that would require flow analysis) let alone memoize the result 
(which it probably shouldn't be doing anyway, since that would require storing 
the result somewhere, but it's the sort of thing that people often think of 
with pure).

Now, I argued that pure's primary benefit isn't really in optimizations but 
rather in the fact that it guarantees that your code isn't accessing global 
state, but there's still the general concern that there's a lot of new 
attributes to worry about, whether you choose to use them or not. I don't 
think that it was a deal-breaker for Don or anything like that, but it was one 
of his concerns and one more item on the list of things that makes it more 
costly for them to move to D2, even if it alone doesn't necessarily add a huge 
cost.

- Jonathan M Davis

May 05 2013

"Tove" <tove fransson.se> writes:

On Sunday, 5 May 2013 at 07:22:06 UTC, Jonathan M Davis wrote:
 Now, I argued that pure's primary benefit isn't really in 
 optimizations but
 rather in the fact that it guarantees that your code isn't 
 accessing global
 state, but there's still the general concern that there's a lot 
 of new
 attributes to worry about, whether you choose to use them or 
 not. I don't
 think that it was a deal-breaker for Don or anything like that, 
 but it was one
 of his concerns and one more item on the list of things that 
 makes it more
 costly for them to move to D2, even if it alone doesn't 
 necessarily add a huge
 cost.

 - Jonathan M Davis

Assuming:
1. functioning attribute inference
2. attributes are expanded in the *.di file

Then, it would be trivial to create a tool which, upon request, 
merges "a defined set of attributes" back to the original d 
source file, this would reduce some of the burden and with full 
IDE integration even more so.

May 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 7:36 PM, Jonathan M Davis wrote:
 The trick is balancing it so that it's powerful enough and yet not too
 complicated to be useable by normal programmers. I think that we're okay, but
 I also think that we're pushing it as it is. Going with Bartosz's proposal would
 almost certainly have been too much.

Consider also that the appeal of dynamic languages is people don't have to 
annotate things with types.

May 04 2013

"Zach the Mystic" <reachzach gggggmail.com> writes:

On Sunday, 5 May 2013 at 02:36:45 UTC, Jonathan M Davis wrote:
 As it is, we arguably didn't choose the best defaults with the 
 attributes that
 we have (e.g. @system is the default instead of @safe, and 
 impure is the
 default instead of pure). The result is that we have to use a 
 lot of
 annotations if we want to properly take advantage of the 
 various language
 features, whereas ideally, having to use annotations for stuff 
 like @safety or
 purity would be the exception. Don was complaining that one 
 reason that moving
 to D2 at Sociomantic looks unappealing in spite of the benefits 
 is the fact
 that they're going to have to add so many extra annotations to 
 their code.

In the thread which appeared on github someone suggested 
'@infer', which I altered to '@auto', which gets all the 
attributes automatically, and creates the '.di' with the full 
attributes (which might actually be problematic if they change 
too often and force compilation too many times). I'm starting to 
think it might actually be quite valuable to have this annotation 
available to the programmer. What do you think?

May 05 2013

"Diggory" <diggsey googlemail.com> writes:

So just to be clear, "ref" parameters can now take rvalues?

There's one minor problem I see with this:

S currentVar;
void makeCurrent(ref S var) {
     currentVar = var;
}

makeCurrent(getRValue());

If "makeCurrent" knew that "var" was an rvalue it could avoid 
calling "postblit" on currentVar, because it's simply a move 
operation, thus saving a potentially costly deep copy operation 
and extra destructor call.
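
The cost difference described here can be sketched as follows. This is an illustration under stated assumptions, not part of the proposal: the struct layout, the copies counter, and the by-value makeCurrentMoved variant are hypothetical, and std.algorithm.mutation.move stands in for whatever an rvalue-aware makeCurrent would do.

```d
import std.algorithm.mutation : move;

struct S
{
    int[] data;
    static int copies;                          // counts postblit calls
    this(this) { data = data.dup; ++copies; }   // deep copy on every copy
}

S currentVar;

// As in the post: assigning from the ref parameter copies (runs postblit).
void makeCurrent(ref S var) { currentVar = var; }

// Hypothetical variant: takes ownership by value and moves instead of copying.
void makeCurrentMoved(S var) { currentVar = move(var); }

S getRValue() { return S([1, 2, 3]); }

void main()
{
    auto s = getRValue();
    S.copies = 0;
    makeCurrent(s);                 // lvalue path: one deep copy
    assert(S.copies == 1);

    S.copies = 0;
    makeCurrentMoved(getRValue());  // rvalue path: moved, no postblit
    assert(S.copies == 0);
    assert(currentVar.data == [1, 2, 3]);
}
```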

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 4:32 PM, Diggory wrote:
 So just to be clear, "ref" parameters can now take rvalues?

That part of the design isn't finished yet.

Andrei

May 04 2013

"Namespace" <rswhite4 googlemail.com> writes:

You mean DIP 36, not DIP 35. ;)

Any estimates as to when the whole is implemented?
So dmd 2.064, 2.070, etc.?

May 04 2013

"w0rp" <devw0rp gmail.com> writes:

 These checks would be omitted if the -noboundscheck compiler 
 switch was provided.

This reminds me of Tony Hoare's lecture on null references being 
a "billion dollar mistake." He mentioned that he asked his Algol 
customers if they wanted the option to disable array bounds 
checking, and they all said no. I like this solution, and I will 
personally never ever turn that safety off.

Diggory asked this same question already. Does all of this also 
mean that a function with a ref parameter will automagically work 
with r-values? My one and only attempt at a Phobos pull request 
thus far will be mostly obsolete if this is the case. (Which is a 
good thing.)

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

Yes.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 23:31:39 UTC, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with 
 r-values?

 Yes.

This is good, but now I'm a bit bitter about the whole code 
breakage from "slices are rvalues" that happened recently.

This is what I refer to when I complain about the way D is 
released. Both changes are super good, but we should have gotten 
both AT ONCE. And while the second isn't there, I should have 
none and still get bug fixes.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 4:36 PM, deadalnix wrote:
 This is good, but now I'm a bit bitter about the whole code breakage from
 "slices are rvalues" that happened recently.

I know that code breakage sux.

May 04 2013

"deadalnix" <deadalnix gmail.com> writes:

On Saturday, 4 May 2013 at 23:56:09 UTC, Walter Bright wrote:
 On 5/4/2013 4:36 PM, deadalnix wrote:
 This is good, but now I'm a bit bitter about the whole code breakage from
 "slices are rvalues" that happened recently.

 I know that code breakage sux.

And in this case, this was avoidable. We MUST get better at 
releasing versions of D.

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 Yes.

This is new to me. My understanding is that the discussed design 
addresses safety, and leaves the rvalue discussion for a future iteration.

Andrei

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, May 04, 2013 20:37:36 Andrei Alexandrescu wrote:
 On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 
 Yes.

 
 This is new to me. My understanding is that the discussed design
 addresses safety, and leaves the rvalue discussion for a future iteration.

That is definitely where things were when we ended the discussion on Wednesday 
night. Walter favored making ref accept rvalues, but we never agreed on that. 
Manu was still in favor of scope ref (and David Nadlinger agreed with him 
IIRC), and you and I were arguing for auto ref to designate that a function 
accepts rvalues. We all agreed on the bounds check solution for @safety, but 
we explicitly tabled the discussion about accepting rvalues, because it was 
getting late, and we'd already been discussing it / arguing about it for quite 
some time. So, unless further discussion occurred after that which I missed, 
there is still no agreement on how to handle having a parameter accept both 
lvalues and rvalues by ref.

- Jonathan M Davis

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 6:44 PM, Jonathan M Davis wrote:
 On Saturday, May 04, 2013 20:37:36 Andrei Alexandrescu wrote:
 On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 Yes.

 This is new to me. My understanding is that the discussed design
 addresses safety, and leaves the rvalue discussion for a future iteration.

 That is definitely where things were when we ended the discussion on Wednesday
 night. Walter favored making ref accept rvalues, but we never agreed on that.
 Manu was still in favor of scope ref (and David Nadlinger agreed with him
 IIRC), and you and I were arguing for auto ref to designate that a function
 accepts rvalues. We all agreed on the bounds check solution for @safety, but
 we explicitly tabled the discussion about accepting rvalues, because it was
 getting late, and we'd already been discussing it / arguing about it for quite
 some time. So, unless further discussion occurred after that which I missed,
 there is still no agreement on how to handle having a parameter accept both
 lvalues and rvalues by ref.

That wasn't my understanding. I thought we agreed that rvalues would be 
copied to locals, and then the issue was one of escaping local references.

We did explicitly defer discussion about what happens with "nop" rvalue
conversions.

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 10:04 PM, Walter Bright wrote:
 That wasn't my understanding. I thought we agreed that since rvalues
 would be copied to locals, and then the issue was one of escaping local
 references.

The short answer is no.

Andrei

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 7:16 PM, Andrei Alexandrescu wrote:
 On 5/4/13 10:04 PM, Walter Bright wrote:
 That wasn't my understanding. I thought we agreed that since rvalues
 would be copied to locals, and then the issue was one of escaping local
 references.

 The short answer is no.

Please explain your understanding of what we agreed on.

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 10:33 PM, Walter Bright wrote:
 Please explain your understanding of what we agreed on.

Just the factual events. We all said repeatedly in the beginning of the 
discussion that "we focus only on the safety aspect for now and then 
figure the rvalue references thing". I've heard you say it at least two 
times clear as day. We can't now construe a solution to the safety 
matter into a solution to binding rvalues to ref.

I'll post separately about the issues involved with binding rvalues to 
references, but I'm retorting to this rather strongly because we must be 
clear, before getting into any level of detail, that we are not done 
with rvalues and ref.


Thanks,

Andrei

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 10:15 PM, Andrei Alexandrescu wrote:
 On 5/4/13 10:33 PM, Walter Bright wrote:
 Please explain your understanding of what we agreed on.

 Just the factual events. We all said repeatedly in the beginning of the
 discussion that "we focus only on the safety aspect for now and then figure the
 rvalue references thing". I've heard you say it at least two times clear as day.
 We can't now construe a solution to the safety matter into a solution to binding
 rvalues to ref.

 I'll post separately about the issues involved with binding rvalues to
 references, but I'm retorting to this rather strongly because we must be clear,
 before getting into any level of detail, that we are not done with rvalues and ref.

What I was talking about was the "no-op" thing with rvalue references, and yes, 
we deferred that.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 10:15 PM, Andrei Alexandrescu wrote:
 Just the factual events. We all said repeatedly in the beginning of the
 discussion that "we focus only on the safety aspect for now and then figure the
 rvalue references thing". I've heard you say it at least two times clear as day.
 We can't now construe a solution to the safety matter into a solution to binding
 rvalues to ref.

Yes, I should have entitled the thread that it was a solution to the safety 
issue. I agree that we deferred the 'nop' issue.

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, May 04, 2013 19:04:21 Walter Bright wrote:
 On 5/4/2013 6:44 PM, Jonathan M Davis wrote:
 On Saturday, May 04, 2013 20:37:36 Andrei Alexandrescu wrote:
 On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 
 Yes.

 
 This is new to me. My understanding is that the discussed design
 addresses safety, and leaves the rvalue discussion for a future
 iteration.

 
 That is definitely where things were when we ended the discussion on
 Wednesday night. Walter favored making ref accept rvalues, but we never
 agreed on that. Manu was still in favor of scope ref (and David Nadlinger
 agreed with him IIRC), and you and I were arguing for auto ref to
 designate that a function accepts rvalues. We all agreed on the bounds
 check solution for @safety, but we explicitly tabled the discussion about
 accepting rvalues, because it was getting late, and we'd already been
 discussing it / arguing about it for quite some time. So, unless further
 discussion occurred after that which I missed, there is still no
 agreement on how to handle having a parameter accept both lvalues and
 rvalues by ref.

 
 That wasn't my understanding. I thought we agreed that since rvalues would
 be copied to locals, and then the issue was one of escaping local
 references.

The @safety issue is one of escaping local references, but Andrei and I were 
arguing that it's a maintenance issue for ref to always accept rvalues. If ref 
does not accept rvalues, then you can look at a function signature like

auto foo(ref int i);

and know that it's intended to alter its argument. However, if ref accepted 
rvalues, you couldn't know that anymore. People would be using ref all over 
the place for the efficiency gain - just like they do with const ref in C++ - so 
the fact that a parameter was ref would mean nothing about how it was used. 
So, you could see code like

[5, 6, 7].popFrontN(5);

and not know that it was effectively a no-op (in this case, it's fairly 
obvious, but if you're not already familiar with the function, it generally 
wouldn't be).

However, if we had an attribute which explicitly designated that a function 
accepted both rvalues and lvalues (which is what auto ref was originally 
supposed to do as Andrei proposed it), then if you saw

auto foo(ref int i);
auto bar(auto ref int i);

then you could be reasonably certain that foo was intending to alter its 
arguments and bar was not. And if you want the full guarantee that bar _can't_ 
alter its arguments, you use const

auto bar(auto ref const int i);

But given how restrictive D's const is, we can't really go with C++'s solution 
of const& for that. However, auto ref is then very similar to C++'s const&, 
except that it doesn't require const to do it (and it's @safe thanks to the 
new @safety solution for ref).

So, the primary difference between ref and auto ref would then be simply that 
auto ref accepted rvalues and ref wouldn't (though, the difference would be 
somewhat greater with templates, since in that case, it generates different 
templates for lvalues and rvalues in order to accept both, whereas the non-
templated version would effectively create a local variable to assign the 
rvalue to so that it could be passed to the function as an lvalue). But the 
distinction between ref and auto ref is very important when trying to 
understand what code does and therefore will have a definite impact on how 
maintainable code is.
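
The templated half of this distinction can be sketched with the auto ref that templates already support. The function names boundByRef and increment are hypothetical, and the rejection of increment(5) assumes the compiler's default behaviour, where rvalues do not bind to plain ref.

```d
// Templated auto ref: one instantiation per binding kind.
// Returns true when the argument was bound by reference (an lvalue).
bool boundByRef(T)(auto ref T value)
{
    return __traits(isRef, value);
}

// Plain ref: only lvalues are accepted, which signals intent to mutate.
void increment(ref int i) { ++i; }

void main()
{
    int x = 1;
    assert(boundByRef(x));    // lvalue: ref instantiation
    assert(!boundByRef(5));   // rvalue: by-value instantiation

    increment(x);
    assert(x == 2);

    // Passing an rvalue to plain ref does not compile by default.
    static assert(!__traits(compiles, increment(5)));
}
```

The non-templated auto ref discussed in the post is a proposal in this thread; the sketch only covers the template form.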

 We did explicitly defer discussion about what happens with "nop" rvalue
 conversions.

I'm not sure what you mean by nop rvalue conversions, at least not by name.

- Jonathan M Davis

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 7:30 PM, Jonathan M Davis wrote:
 On Saturday, May 04, 2013 19:04:21 Walter Bright wrote:
 On 5/4/2013 6:44 PM, Jonathan M Davis wrote:
 On Saturday, May 04, 2013 20:37:36 Andrei Alexandrescu wrote:
 On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 Yes.

 This is new to me. My understanding is that the discussed design
 addresses safety, and leaves the rvalue discussion for a future
 iteration.

 That is definitely where things were when we ended the discussion on
 Wednesday night. Walter favored making ref accept rvalues, but we never
 agreed on that. Manu was still in favor of scope ref (and David Nadlinger
 agreed with him IIRC), and you and I were arguing for auto ref to
 designate that a function accepts rvalues. We all agreed on the bounds
 check solution for @safety, but we explicitly tabled the discussion about
 accepting rvalues, because it was getting late, and we'd already been
 discussing it / arguing about it for quite some time. So, unless further
 discussion occurred after that which I missed, there is still no
 agreement on how to handle having a parameter accept both lvalues and
 rvalues by ref.

 That wasn't my understanding. I thought we agreed that since rvalues would
 be copied to locals, and then the issue was one of escaping local
 references.

 The @safety issue is one of escaping local references, but Andrei and I were
 arguing that it's a maintenance issue for ref to always accept rvalues. If ref
 does not accept rvalues, then you can look at a function signature like

 auto foo(ref int i);

 and know that it's intended to alter its argument. However, if ref accepted
 rvalues, you couldn't know that anymore. People would be using ref all over
 the place for the efficiency gain - just like they do with const ref in C++ - so
 the fact that a parameter was ref would mean nothing about how it was used.
 So, you could see code like

 [5, 6, 7].popFrontN(5);

 and not know that it was effectively a no-op (in this case, it's fairly
 obvious, but if you're not already familiar with the function, it generally
 wouldn't be).

 However, if we had an attribute which explicitly designated that a function
 accepted both rvalues and lvalues (which is what auto ref was originally
 supposed to do as Andrei proposed it), then if you saw

 auto foo(ref int i);
 auto bar(auto ref int i);

 then you could be reasonably certain that foo was intending to alter its
 arguments and bar was not. And if you want the full guarantee that bar _can't_
 alter its arguments, you use const

 auto bar(auto ref const int i);

 But given how restrictive D's const is, we can't really go with C++'s solution
 of const& for that. However, auto ref is then very similar to C++'s const&,
 except that it doesn't require const to do it (and it's @safe thanks to the
 new @safety solution for ref).

 So, the primary difference between ref and auto ref would then be simply that
 auto ref accepted rvalues and ref wouldn't (though, the difference would be
 somewhat greater with templates, since in that case, it generates different
 templates for lvalues and rvalues in order to accept both, whereas the non-
 templated version would effectively create a local variable to assign the
 rvalue to so that it could be passed to the function as an lvalue). But the
 distinction between ref and auto ref is very important when trying to
 understand what code does and therefore will have a definite impact on how
 maintainable code is.

 We did explicitly defer discussion about what happens with "nop" rvalue
 conversions.

 I'm not sure what you mean by nop rvalue conversions, at least not by name.

I meant exactly what you said: "and not know that it was effectively a no-op".

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, May 04, 2013 20:12:24 Walter Bright wrote:
 I'm not sure what you mean by nop rvalue conversions, at least not by
 name.

 
 I meant exactly what you said: "and not know that it was effectively a
 no-op".

Oh, okay. LOL. I was thinking you meant something lower level than that, 
and it didn't click. Yeah, distinguishing between functions that are meant to 
mutate their arguments and those that just want to pass them efficiently is the 
core issue with naked ref accepting rvalues, and we didn't come to an 
agreement on that.

- Jonathan M Davis

May 04 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 04 May 2013 19:30:21 -0700, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 However, if we had an attribute which explicitly designated that a  
 function
 accepted both rvalues and lvalues (which is what auto ref was originally
 supposed to do as Andrei proposed it), then if you saw

 auto foo(ref int i);
 auto bar(auto ref int i);

 then you could be reasonably certain that foo was intending to alter its
 arguments and bar was not.

The counter argument:

foo(makeRvalue()); // error:  cannot pass rvalues to ref

// programmer: WTF?  This is stupid, but ok:

auto x = makeRvalue();
foo(x);

In other words, explicit nops aren't any better than implicit nops.  Even  
if we *require* the user to be explicit (and it's not at all clear from a  
code-review perspective that the auto x line is to circumvent the  
requirements), the fact that this is trivially circumvented makes it a  
useless feature.  It's like having const you can cast away.

I think the larger issue with binding rvalues to refs is this:

int foo(int i);
int foo(ref int i);

what does foo(1) bind to?  It MUST bind to the non-ref, or there is no  
point for it.

If this can be solved, binding rvalues to refs is fine.
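
For comparison only, here is a minimal C++ sketch of an overload set that separates the two cases. It is an analogue, not D semantics: C++ uses a distinct `int&&` parameter type for rvalues, and the helper names `call_with_literal` and `call_with_variable` are made up for this illustration.

```cpp
#include <string>

// Overload taken when the argument is an lvalue.
std::string foo(int&) { return "lvalue reference"; }

// Overload taken when the argument is an rvalue such as the literal 1.
std::string foo(int&&) { return "rvalue"; }

// Hypothetical helpers that exercise each overload.
std::string call_with_literal() { return foo(1); }

std::string call_with_variable() {
    int x = 1;
    return foo(x);
}
```

With this pair, `foo(1)` never reaches the lvalue-reference overload, which is the property asked for above.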

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 12:10 PM, Steven Schveighoffer wrote:
 The counter argument:

 foo(makeRvalue()); // error: cannot pass rvalues to ref

 // programmer: WTF? This is stupid, but ok:

 auto x = makeRvalue();
 foo(x);

 In other words, explicit nops aren't any better than implicit nops. Even
 if we *require* the user to be explicit (and it's not at all clear from
 a code-review perspective that the auto x line is to circumvent the
 requirements), the fact that this is trivially circumvented makes it a
 useless feature. It's like having const you can cast away.

 I think the larger issue with binding rvalues to refs is this:

 int foo(int i);
 int foo(ref int i);

 what does foo(1) bind to? It MUST bind to the non-ref, or there is no
 point for it.

 If this can be solved, binding rvalues to refs is fine.

I think we can technically make the overloading work while also allowing 
binding rvalues to ref. But that wouldn't help any. Consider:

ref int min(ref int a, ref int b) { return b < a ? b : a; }
...
int x;
fun(min(x, 100));

Here the result of min may be bound to an lvalue or an rvalue depending 
on a condition. In the latter case, combined with D's propensity to 
destroy temporaries too early (immediately after function calls), the 
behavior is silently undefined; the code may pass unittests.

This is a known issue in C++. Allowing loose binding of rvalues to ref 
not only inherits C++'s mistake, but also adds a fresh one.


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 06:43:38 -0700, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I think we can technically make the overloading work while also allowing  
 binding rvalues to ref. But that wouldn't help any. Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue depending  
 on a condition. In the latter case, combined with D's propensity to  
 destroy temporaries too early (immediately after function calls), the  
 behavior is silently undefined; the code may pass unittests.

Wouldn't the new runtime check fix this?

 This is a known issue in C++. Allowing loose binding of rvalues to ref  
 not only inherits C++'s mistake, but also adds a fresh one.

I thought C++ would handle this kind of code.  I remember being able to  
use references to rvalues in ways that were unintuitive, but not undefined.

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 12:48 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 06:43:38 -0700, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I think we can technically make the overloading work while also
 allowing binding rvalues to ref. But that wouldn't help any. Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue
 depending on a condition. In the latter case, combined with D's
 propensity to destroy temporaries too early (immediately after
 function calls), the behavior is silently undefined; the code may pass
 unittests.

 Wouldn't the new runtime check fix this?

Depends how you define "fix". It would be a possibly rare bounds check 
violation on completely innocuous code.

 This is a known issue in C++. Allowing loose binding of rvalues to ref
 not only inherits C++'s mistake, but also adds a fresh one.

 I thought C++ would handle this kind of code. I remember being able to
 use references to rvalues in ways that were unintuitive, but not undefined.

template <class T> const T& min(const T& a, const T& b) {
     return b < a ? b : a;
}
...
int x = ...;
auto & weird = min(x, 100);

Have a nice day :o).


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 10:05:48 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 12:48 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 06:43:38 -0700, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I think we can technically make the overloading work while also
 allowing binding rvalues to ref. But that wouldn't help any. Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue
 depending on a condition. In the latter case, combined with D's
 propensity to destroy temporaries too early (immediately after
 function calls), the behavior is silently undefined; the code may pass
 unittests.

 Wouldn't the new runtime check fix this?

 Depends how you define "fix". It would be a possibly rare bounds check  
 violation on completely innocuous code.

By "completely innocuous" you mean valid?  I don't think the above is  
valid.

 This is a known issue in C++. Allowing loose binding of rvalues to ref
 not only inherits C++'s mistake, but also adds a fresh one.

 I thought C++ would handle this kind of code. I remember being able to
 use references to rvalues in ways that were unintuitive, but not  
 undefined.

 template <class T> const T& min(const T& a, const T& b) {
      return b < a ? b : a;
 }
 ...
 int x = ...;
 auto & weird = min(x, 100);

 Have a nice day :o).

It seems to compile and work for me, but I don't know what the point is,  
since you are being mysterious :)

A long time ago I wrote a logging feature for C++ that returned an rvalue  
(maybe it was an rvalue reference, it was a long time ago, and I don't  
have the code anymore).  That would collect log messages via the <<  
operator, and then when the line was through, the destructor would output  
that line to the logger.  The logging object fetched would either be a  
dummy no-output object, or a real logger, depending on the logging level  
selected.  If the logger was disabled, no message was constructed, making  
it somewhat lazy (any expressions in the line would obviously be executed,  
just like any standard logger).  It worked without a hitch as long as we  
used it.  The rvalue stayed allocated and valid throughout the whole line,  
even though it was passed into each << operation by reference.
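
A rough reconstruction of that kind of logger, as a sketch only: the original code is not available, so `LogLine`, `sink`, and `demo_logging` are hypothetical names and the in-memory sink stands in for a real logging backend.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sink standing in for the real logger backend.
std::vector<std::string>& sink() {
    static std::vector<std::string> lines;
    return lines;
}

// Collects one line through << and emits it from the destructor.
class LogLine {
public:
    explicit LogLine(bool enabled) : enabled_(enabled) {}
    ~LogLine() {
        if (enabled_) sink().push_back(buffer_.str());
    }

    template <class T>
    LogLine& operator<<(const T& value) {
        if (enabled_) buffer_ << value;  // a disabled line builds no message
        return *this;
    }

private:
    bool enabled_;
    std::ostringstream buffer_;
};

// Each temporary LogLine lives until the end of its full expression,
// so every chained << sees a valid object.
std::size_t demo_logging() {
    sink().clear();
    LogLine(true) << "x = " << 42;
    LogLine(false) << "never stored";
    return sink().size();
}
```

The temporary is created and consumed inside one statement, so no reference to it survives past its destruction.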

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 10:31 AM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 10:05:48 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 12:48 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 06:43:38 -0700, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I think we can technically make the overloading work while also
 allowing binding rvalues to ref. But that wouldn't help any. Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue
 depending on a condition. In the latter case, combined with D's
 propensity to destroy temporaries too early (immediately after
 function calls), the behavior is silently undefined; the code may pass
 unittests.

 Wouldn't the new runtime check fix this?

 Depends how you define "fix". It would be a possibly rare bounds check
 violation on completely innocuous code.

 By "completely innocuous" you mean valid? I don't think the above is valid.

I meant valid-looking.

 This is a known issue in C++. Allowing loose binding of rvalues to ref
 not only inherits C++'s mistake, but also adds a fresh one.

 I thought C++ would handle this kind of code. I remember being able to
 use references to rvalues in ways that were unintuitive, but not
 undefined.

 template <class T> const T& min(const T& a, const T& b) {
 return b < a ? b : a;
 }
 ...
 int x = ...;
 auto & weird = min(x, 100);

 Have a nice day :o).

 It seems to compile and work for me, but I don't know what the point is,
 since you are being mysterious :)

If x > 100, the code is saving a reference to a destroyed temporary. If 
you couldn't see it, how many do you expect would see similar issues in 
even simpler and cleaner D code?
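
To make the condition concrete, a small C++ sketch follows. `min_ref` mirrors the `min` template above; `same_object`, `result_is_x`, and `safe_min` are hypothetical helpers added for illustration. The sketch only inspects the reference while the temporary is still alive, so it stays within defined behavior.

```cpp
template <class T>
const T& min_ref(const T& a, const T& b) {
    return b < a ? b : a;
}

// True when r and x name the same object.
bool same_object(const int& r, const int& x) { return &r == &x; }

// Reports whether min_ref handed back x itself or the temporary 100.
// The temporary is still alive while same_object runs.
bool result_is_x(int x) {
    return same_object(min_ref(x, 100), x);
}

// Safe use: copy the value out before the full expression ends.
int safe_min(int x) {
    return min_ref(x, 100);
}
```

For `x = 5` the result refers to `x`; for `x = 500` it refers to the temporary, which is why storing the reference in `weird` is only a problem on some runs.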

 A long time ago I wrote a logging feature for C++ that returned an
 rvalue (maybe it was an rvalue reference, it was a long time ago, and I
 don't have the code anymore). That would collect log messages via the <<
 operator, and then when the line was through, the destructor would
 output that line to the logger. The logging object fetched would either
 be a dummy no-output object, or a real logger, depending on the logging
 level selected. If the logger was disabled, no message was constructed,
 making it somewhat lazy (any expressions in the line would obviously be
 executed, just like any standard logger). It worked without a hitch as
 long as we used it. The rvalue stayed allocated and valid throughout the
 whole line, even though it was passed into each << operation by reference.

Not relevant.


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 10:40:06 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 10:31 AM, Steven Schveighoffer wrote:

 By "completely innocuous" you mean valid? I don't think the above is  
 valid.

 I meant valid-looking.

OK.

 It seems to compile and work for me, but I don't know what the point is,
 since you are being mysterious :)

 If x > 100, the code is saving a reference to a destroyed temporary. If  
 you couldn't see it, how many do you expect would see similar issues in  
 even simpler and cleaner D code?

No, I was wondering whether the compiler detects this and keeps the  
temporary in scope (after all, it is in control of that temporary's  
lifetime).  I called cout with that temporary as the reference, and it  
seems to not have clobbered it (outputs 100).  I have not had such "lucky"  
experience with D.  Coming from the perspective of a complete compiler  
ignoramus, I have no idea what is really happening :)  I know that it's  
common practice to throw rvalues and catch them as references, which seems  
to be handled correctly by the C++ compiler.

 A long time ago I wrote a logging feature for C++ that returned an
 rvalue (maybe it was an rvalue reference, it was a long time ago, and I
 don't have the code anymore). That would collect log messages via the <<
 operator, and then when the line was through, the destructor would
 output that line to the logger. The logging object fetched would either
 be a dummy no-output object, or a real logger, depending on the logging
 level selected. If the logger was disabled, no message was constructed,
 making it somewhat lazy (any expressions in the line would obviously be
 executed, just like any standard logger). It worked without a hitch as
 long as we used it. The rvalue stayed allocated and valid throughout the
 whole line, even though it was passed into each << operation by  
 reference.

 Not relevant.

How so?  I thought the point is you were saying that we couldn't handle  
passing a ref bound to an rvalue to another function (because D destroys  
it early?), that is precisely what I did.  I felt it was completely  
on-point.

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 11:12 AM, Steven Schveighoffer wrote:
 If x > 100, the code is saving a reference to a destroyed temporary.
 If you couldn't see it, how many do you expect would see similar
 issues in even simpler and cleaner D code?

 No, I was wondering whether the compiler detects this and keeps the
 temporary in scope (after all, it is in control of that temporary's
 lifetime).

It can't.

Consider the body of min isn't known (eliminate templates etc). Then 
what the compiler sees is a function call that returns a const ref. All 
it can assume is it's a valid reference which it will subsequently bind 
to the name given by the caller. The reference will refer therefore to a 
destroyed rvalue (temporaries are destroyed at the end of the full 
expression).
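
The difference between the two cases can be observed with a destructor counter. This is a sketch under assumptions: `Tracker`, `pass_through`, and the two helper functions are hypothetical names, and the second helper deliberately discards the returned reference so that nothing dangling is ever read.

```cpp
// Counts how many Tracker objects have been destroyed.
struct Tracker {
    static int destroyed;
    ~Tracker() { ++destroyed; }
};
int Tracker::destroyed = 0;

// Opaque to the caller: all it knows is that a reference comes back.
const Tracker& pass_through(const Tracker& t) { return t; }

// Direct binding: the temporary's lifetime is extended to match `kept`.
int destroyed_after_direct_binding() {
    Tracker::destroyed = 0;
    const Tracker& kept = Tracker();
    (void)kept;
    return Tracker::destroyed;  // read before `kept` goes out of scope
}

// Binding through a call: the temporary dies at the end of the statement.
int destroyed_after_call() {
    Tracker::destroyed = 0;
    pass_through(Tracker());
    return Tracker::destroyed;
}
```

In the first helper the temporary is still alive when the counter is read; in the second it is already gone, so any reference saved from the call would have been dangling.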

Your example is irrelevant to this discussion because returning an 
rvalue and subsequently binding it to a const T& is a completely 
different scenario. It would be also sound if it weren't for this:

struct A {
   A(const T& x) : a(x) {}
   const T& a;
};

In _this_ case, initializing A with an rvalue of type T compiles and 
subsequently runs with undefined behavior.

I repeat: binding rvalues to ref would make every mistake C++ has done 
in the area, and add a few original ones. It is not a simple problem; if 
it seems so, more study is required.


Andrei

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 11:31 AM, Andrei Alexandrescu wrote:
 struct A {
 A(const T& x) : a(x) {}
 const T& a;
 };

 In _this_ case, initializing A with an rvalue of type T compiles and
 subsequently runs with undefined behavior.

I should add that I've seen this bug (causing mysterious crashes) 
several times at Facebook. We're working on adding a lint rule 
to disable the pattern statically.

Binding rvalues to references is fraught with peril.
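
One way to reject the pattern at compile time in C++ is to delete the rvalue overload of the constructor. This is a different technique from the lint rule mentioned above, shown only as a sketch; `NameView` and the helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Stores a reference, like struct A above, but refuses rvalue arguments.
class NameView {
public:
    explicit NameView(const std::string& s) : name_(s) {}
    // Rvalues prefer this overload, and selecting it is a compile error.
    explicit NameView(const std::string&&) = delete;

    const std::string& get() const { return name_; }

private:
    const std::string& name_;
};

// Compile-time view of what the constructor set accepts.
constexpr bool accepts_lvalue =
    std::is_constructible<NameView, std::string&>::value;
constexpr bool accepts_rvalue =
    std::is_constructible<NameView, std::string>::value;

// Normal use with a named string that outlives the view.
std::size_t length_via_view() {
    std::string s = "ref";
    NameView view(s);
    return view.get().size();
}
```

This catches only direct construction from an rvalue; a named string that dies before the view still produces a dangling member.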


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 11:31:05 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 11:12 AM, Steven Schveighoffer wrote:
 If x > 100, the code is saving a reference to a destroyed temporary.
 If you couldn't see it, how many do you expect would see similar
 issues in even simpler and cleaner D code?

 No, I was wondering whether the compiler detects this and keeps the
 temporary in scope (after all, it is in control of that temporary's
 lifetime).

 It can't.

 Consider the body of min isn't known (eliminate templates etc). Then  
 what the compiler sees is a function call that returns a const ref. All  
 it can assume is it's a valid reference which it will subsequently bind  
 to the name given by the caller. The reference will refer therefore to a  
 destroyed rvalue (temporaries are destroyed at the end of the full  
 expression).

Well, given that we intend to infer some special behavior given the types  
of the parameters, I wouldn't think it was impossible to do the same  
here.  This would make the rvalue live beyond the expression, so maybe  
that's not allowed in C++.

 Your example is irrelevant to this discussion because returning an  
 rvalue and subsequently binding it to a const T& is a completely  
 different scenario.

I quote from your original rebuttal:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

Which is returning an rvalue ref and subsequently binding it to a ref  
parameter of fun.

Isn't that the same thing?  I would note that my code continued to return  
the rvalue for chained operator<< calls.

 It would be also sound if it weren't for this:

 struct A {
    A(const T& x) : a(x) {}
    const T& a;
 };

 In _this_ case, initializing A with an rvalue of type T compiles and  
 subsequently runs with undefined behavior.

This seems like a separate ref problem.  But we don't have ref members, so  
it would require an address-of in D.  That should be forbidden, right?

 I repeat: binding rvalues to ref would make every mistake C++ has done  
 in the area, and add a few original ones. It is not a simple problem; if  
 it seems so, more study is required.

I never said it was a simple problem.  I said that if you have solved the  
escape problem, the logic problem is difficult to solve, but not  
necessarily required.  Even though it is pointless to bind rvalues to refs  
in some instances, it's not dangerous memory-wise.

If you are saying we haven't solved the escape problem, that is news to  
me.  I thought the runtime check solves that.

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 11:48 AM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 11:31:05 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Consider the body of min isn't known (eliminate templates etc). Then
 what the compiler sees is a function call that returns a const ref.
 All it can assume is it's a valid reference which it will subsequently
 bind to the name given by the caller. The reference will refer
 therefore to a destroyed rvalue (temporaries are destroyed at the end
 of the full expression).

 Well, given that we intend to infer some special behavior given the
 types of the parameters, I wouldn't think it was impossible to do the
 same here. This would make the rvalue live beyond the expression, so
 maybe that's not allowed in C++.

I'm not sure I understand what you're suggesting.

 Your example is irrelevant to this discussion because returning an
 rvalue and subsequently binding it to a const T& is a completely
 different scenario.

 I quote from your original rebuttal:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Which is returning an rvalue ref and subsequently binding it to a ref
 parameter of fun.

 Isn't that the same thing?

No. It's a very different thing handled by a special rule in C++.

 I would note that my code continued to return
 the rvalue for chained operator<< calls.

Of course.

 It would be also sound if it weren't for this:

 struct A {
 A(const T& x) : a(x) {}
 const T& a;
 };

 In _this_ case, initializing A with an rvalue of type T compiles and
 subsequently runs with undefined behavior.

 This seems like a separate ref problem. But we don't have ref members,
 so it would require an address-of in D. That should be forbidden, right?

Yes. My point was to illustrate that a special rule that works in a 
situation can't help another.

 I repeat: binding rvalues to ref would make every mistake C++ has done
 in the area, and add a few original ones. It is not a simple problem;
 if it seems so, more study is required.

 I never said it was a simple problem. I said that if you have solved the
 escape problem, the logic problem is difficult to solve, but not
 necessarily required. Even though it is pointless to bind rvalues to
 refs in some instances, it's not dangerous memory-wise.

 If you are saying we haven't solved the escape problem, that is news to
 me. I thought the runtime check solves that.

It does. But binding rvalues to ref makes bounds check failures more 
frequent, less predictable, and harder to debug. Failures will be more 
frequent because there's more chance that a ref refers to a defunct 
rvalue; less predictable because conditional execution may cause some 
paths to be rarely exercised; and harder to debug because rvalues come 
and go following implicit rules, not visible scopes.


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 12:03:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 11:48 AM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 11:31:05 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Consider the body of min isn't known (eliminate templates etc). Then
 what the compiler sees is a function call that returns a const ref.
 All it can assume is it's a valid reference which it will subsequently
 bind to the name given by the caller. The reference will refer
 therefore to a destroyed rvalue (temporaries are destroyed at the end
 of the full expression).

 Well, given that we intend to infer some special behavior given the
 types of the parameters, I wouldn't think it was impossible to do the
 same here. This would make the rvalue live beyond the expression, so
 maybe that's not allowed in C++.

 I'm not sure I understand what you're suggesting.

Not suggesting anything.  I was inferring that since the code worked,  
maybe the compiler was correctly handling it.  And given that we plan to  
have special rules regarding ref, it's not out of the question C++ might  
also.

 Your example is irrelevant to this discussion because returning an
 rvalue and subsequently binding it to a const T& is a completely
 different scenario.

 I quote from your original rebuttal:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Which is returning an rvalue ref and subsequently binding it to a ref
 parameter of fun.

 Isn't that the same thing?

 No. It's a very different thing handled by a special rule in C++.

This isn't helping.  You keep saying it's different but not how.  I repeat,  
isn't it possible to solve the problem of binding rvalues to references?   
Your examples and mine seem to say it works in C++, yet you say it's  
not feasible in D.  Why is C++ able to handle this while D is not?

 It would be also sound if it weren't for this:

 struct A {
 A(const T& x) : a(x) {}
 const T& a;
 };

 In _this_ case, initializing A with an rvalue of type T compiles and
 subsequently runs with undefined behavior.

 This seems like a separate ref problem. But we don't have ref members,
 so it would require an address-of in D. That should be forbidden, right?

 Yes. My point was to illustrate that a special rule that works in a  
 situation can't help another.

Another situation that's already solved?  Don't see the point.

 If you are saying we haven't solved the escape problem, that is news to
 me. I thought the runtime check solves that.

 It does. But binding rvalues to ref makes bounds check failures more  
 frequent, less predictable, and harder to debug. Failures will be more  
 frequent because there's more chance that a ref refers to a defunct  
 rvalue;

That is a lifetime issue.  We can make the lifetime last long enough for  
the current statement.

 less predictable because conditional execution may cause some paths to  
 be rarely exercised;

An existing problem, not made worse by rvalue references.

 and harder to debug because rvalues come and go following implicit  
 rules, not visible scopes.

What are the rules?  Maybe we should start there.

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 12:17 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 12:03:27 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 No. It's a very different thing handled by a special rule in C++.

 This isn't helping. You keep saying it's different but not how.

In one case a reference is returned, in the other an rvalue is returned.

 I repeat,
 isn't it possible to solve the problem of binding rvalues to references?
 Your examples and mine seem to say it works in C++, yet you say it's
 not feasible in D. Why is C++ able to handle this while D is not?

I explained twice: min and other similar C++ examples are broken.

 Yes. My point was to illustrate that a special rule that works in a
 situation can't help another.

 Another situation that's already solved? Don't see the point.

No. That situation leads to undefined behavior.


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 13:28:18 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 12:17 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 12:03:27 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 No. It's a very different thing handled by a special rule in C++.

 This isn't helping. You keep saying it's different but not how.

 In one case a reference is returned, in the other an rvalue is returned.

This is a trimmed down example:

int &foo(int &val) { return val; }

What I read from you (and I could be wrong) is you are saying this is not  
valid:

foo(foo(foo(1)));

Is that right?

 Yes. My point was to illustrate that a special rule that works in a
 situation can't help another.

 Another situation that's already solved? Don't see the point.

 No. That situation leads to undefined behavior.

In D that situation is invalid.  You can't have ref members.

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 1:45 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 13:28:18 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 12:17 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 12:03:27 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 No. It's a very different thing handled by a special rule in C++.

 This isn't helping. You keep saying it's different but not how.

 In one case a reference is returned, in the other an rvalue is returned.

 This is a trimmed down example:

 int &foo(int &val) { return val; }

 What I read from you (and I could be wrong) is you are saying this is
 not valid:

 foo(foo(foo(1)));

 Is that right?

No. I believe I was very specific about what I destroyed and in all 
likelihood so do you. Probably at this point we've reached violent 
agreement a couple of iterations back.
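
For reference, a C++ sketch of the chained case. As written above, `int &foo(int &val)` would not accept the literal `1` in C++, so this version assumes `const int&`; `pass` and `chained` are hypothetical names.

```cpp
// Returns its argument by reference, like the trimmed-down foo above.
const int& pass(const int& val) { return val; }

// The temporary holding 1 lives until the end of the full expression,
// so every nested call sees a valid object and the value is copied out
// before the temporary is destroyed.
int chained() {
    return pass(pass(pass(1)));
}
```

The chain is fine because the reference is consumed within one expression; the trouble starts only when the returned reference is stored beyond it.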

Long story short: binding rvalues to ref is fraught with peril and must 
be designed very carefully.


Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 13:53:10 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 1:45 PM, Steven Schveighoffer wrote:

 This is a trimmed down example:

 int &foo(int &val) { return val; }

 What I read from you (and I could be wrong) is you are saying this is
 not valid:

 foo(foo(foo(1)));

 Is that right?

 No. I believe I was very specific about what I destroyed and in all  
 likelihood so do you. Probably at this point we've reached violent  
 agreement a couple of iterations back.

OK, I was confused (seriously, I was not playing devil's advocate here).   
We are in agreement (at least on what should be possible).

 Long story short: binding rvalues to ref is fraught with peril and must  
 be designed very carefully.

I think empirical proof from this newsgroup is pretty good evidence.

-Steve

May 06 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/6/2013 8:31 AM, Andrei Alexandrescu wrote:
 In _this_ case, initializing A with an rvalue of type T compiles and
 subsequently runs with undefined behavior.

This is why D does not allow ref as a storage class for variables.

May 06 2013

"Rob T" <alanb ucora.com> writes:

On Monday, 6 May 2013 at 14:05:48 UTC, Andrei Alexandrescu wrote:
 template <class T> const T& min(const T& a, const T& b) {
     return b < a ? b : a;
 }
 ...
 int x = ...;
 auto & weird = min(x, 100);

What I see going on is an attempt to double up on the use of ref 
for two conflicting purposes. Perhaps part of the solution is to 
use a new variation of ref that allows rvalues and lvalues, while 
normal ref continues to disallow rvalues, e.g. ref vs refr.


void foo(ref T b)
{
...
}

ref T min(ref T a, refr T b) {
    ++b; // error: refr cannot be modified
    foo(b); // error: cannot pass refr to normal ref
    return b < a ? b : a; // error: cannot return refr
}

The "auto ref" system can then be extended to determine if normal 
ref or refr is required, and refuse to compile when the rules are 
violated rather than try and fake a real ref with a temporary, 
since I would think that's something you'd normally never want 
done anyway.

A runtime safety check will still be needed for returns of normal 
ref that may escape.

I can definitely agree on the runtime safety check, but I have 
doubts about the idea of faking a real ref.

--rt

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 12:48 PM, Steven Schveighoffer wrote:

(your clock seems to be messed up)

Andrei

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 10:07:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/6/13 12:48 PM, Steven Schveighoffer wrote:

 (your clock seems to be messed up)

 Andrei

Could be the time change, haven't rebooted my Mac since flying back.  My  
clock is correct, but Opera may be confused.

This is a test message, I restarted Opera.

-Steve

May 06 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Monday, May 06, 2013 10:16:57 Steven Schveighoffer wrote:
 Could be the time change, haven't rebooted my Mac since flying back. My
 clock is correct, but Opera may be confused.

Oh, the wonders of dealing with time... :)

- Jonathan M Davis

May 07 2013

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 6 May 2013 at 13:43:38 UTC, Andrei Alexandrescu wrote:
 I think we can technically make the overloading work while also 
 allowing binding rvalues to ref. But that wouldn't help any. 
 Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue 
 depending on a condition. In the latter case, combined with D's 
 propensity to destroy temporaries too early (immediately after 
 function calls), the behavior is silently undefined; the code 
 may pass unittests.

Now that you mention that, is the proposal for ref safety 
really safe?

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 11:34 AM, deadalnix wrote:
 On Monday, 6 May 2013 at 13:43:38 UTC, Andrei Alexandrescu wrote:
 I think we can technically make the overloading work while also
 allowing binding rvalues to ref. But that wouldn't help any. Consider:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue
 depending on a condition. In the latter case, combined with D's
 propensity to destroy temporaries too early (immediately after
 function calls), the behavior is silently undefined; the code may pass
 unittests.

 Now that you mention that, is the proposal for ref safety really safe?

Yes, because it's dynamically checked.

Andrei

May 06 2013

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 6 May 2013 at 15:39:07 UTC, Andrei Alexandrescu wrote:
 Yes, because it's dynamically checked.

The check will see that the reference is in the current stack 
frame and pass.

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 11:52 AM, deadalnix wrote:
 On Monday, 6 May 2013 at 15:39:07 UTC, Andrei Alexandrescu wrote:
 Yes, because it's dynamically checked.

 The check will see that the reference is in the current stack frame and
 pass.

No. The check will fail (unless wrongly written).

Andrei

May 06 2013

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 6 May 2013 at 16:03:51 UTC, Andrei Alexandrescu wrote:
 On 5/6/13 11:52 AM, deadalnix wrote:
 On Monday, 6 May 2013 at 15:39:07 UTC, Andrei Alexandrescu 
 wrote:
 Yes, because it's dynamically checked.

 The check will see that the reference is in the current stack 
 frame and
 pass.

 No. The check will fail (unless wrongly written).

You'll have to explain more as I don't see how to make the check 
work with temporaries that will live in the caller stack frame. 
By definition they'll be valid if only addresses are checked. But 
the returned reference will outlive the temporary it refers to.

May 06 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 06 May 2013 09:43:38 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue depending  
 on a condition. In the latter case, combined with D's propensity to  
 destroy temporaries too early (immediately after function calls), the  
 behavior is silently undefined; the code may pass unittests.

Focusing back on this, I think any rvalues should be treated as though  
they survive through the end of the statement.  If the compiler can prove  
they are not in use after partially executing a statement, they can be  
destroyed early.

Is there any reason this shouldn't be the case?

-Steve

May 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/6/13 12:20 PM, Steven Schveighoffer wrote:
 On Mon, 06 May 2013 09:43:38 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 ref int min(ref int a, ref int b) { return b < a ? b : a; }
 ...
 int x;
 fun(min(x, 100));

 Here the result of min may be bound to an lvalue or an rvalue
 depending on a condition. In the latter case, combined with D's
 propensity to destroy temporaries too early (immediately after
 function calls), the behavior is silently undefined; the code may pass
 unittests.

 Focusing back on this, I think any rvalues should be treated as though
 they survive through the end of the statement. If the compiler can prove
 they are not in use after partially executing a statement, they can be
 destroyed early.

 Is there any reason this shouldn't be the case?

That should probably be a prerequisite of any working solution.

Andrei

May 06 2013

"David Nadlinger" <see klickverbot.at> writes:

On Sunday, 5 May 2013 at 02:04:14 UTC, Walter Bright wrote:
 That wasn't my understanding. I thought we agreed that since 
 rvalues would be copied to locals, and then the issue was one 
 of escaping local references.

I think you, Manu and I agreed on this simplification, and thus 
consequently left rvalues out of the discussion entirely. The 
others might not even have commented on this though.

David

May 04 2013

"David Nadlinger" <see klickverbot.at> writes:

On Sunday, 5 May 2013 at 01:45:05 UTC, Jonathan M Davis wrote:
 On Saturday, May 04, 2013 20:37:36 Andrei Alexandrescu wrote:
 On 5/4/13 7:31 PM, Walter Bright wrote:
 On 5/4/2013 3:51 PM, w0rp wrote:
 Does all of this also mean that a
 function with a ref parameter will automagically work with 
 r-values?

 
 Yes.

 
 This is new to me. My understanding is that the discussed 
 design
 addresses safety, and leaves the rvalue discussion for a 
 future iteration.

 That is definitely where things were when we ended the 
 discussion on Wednesday
 night. Walter favored making ref accept rvalues, but we never 
 agreed on that.
 Manu was still in favor of scope ref (and David Nadlinger agreed 
 with him
 IIRC),

I was mostly arguing against Andrei's (in my opinion) overhasty 
dismissal of anything involving scope ref, because I didn't buy 
his argument about it being a drastic increase in perceived 
language complexity.

I fully agree with runtime-supported escape checking being the 
cleanest design, and like the fact that it is simple.

 and you and I were arguing for auto ref to designate that a 
 function
 accepts rvalues. We all agreed on the bounds check solution for 
 @safety, but
 we explicitly tabled the discussion about accepting rvalues, 
 because it was
 getting late, and we'd already been discussing it / arguing 
 about it for quite
 some time. So, unless further discussion occurred after that 
 which I missed,
 there is still no agreement on how to handle having a parameter 
 accept both
 lvalues and rvalues by ref.

I'd argue that if ref can safely accept rvalues, it should – 
simplicity at its best.

David

May 04 2013

Manu <turkeyman gmail.com> writes:

On 5 May 2013 10:37, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 5/4/13 7:31 PM, Walter Bright wrote:

 On 5/4/2013 3:51 PM, w0rp wrote:

 Does all of this also mean that a
 function with a ref parameter will automagically work with r-values?

 Yes.

 This is new to me. My understanding is that the discussed design addresses
 safety, and leaves the rvalue discussion for a future iteration.


I was left under the same impression that Walter also seems to be under.

May 08 2013

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 5/4/13, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrei & I argued that we needed to make it work with just ref annotations.

So to recap, 2.063 turns slices into r-values which will break code
that used ref, e.g.:

-----
void parse(ref int[] arr) { }

void main()
{
    int[] arr = [1, 2];
    parse(arr[]);  // ok in 2.062, error in 2.063
}
-----

Then the user might introduce a non-ref overload:

-----
void parse(ref int[] arr) { }
void parse(int[] arr) { }  // picks this one
-----

And later down the road, maybe even in 2.064, ref will take r-values
making the new code error because of ambiguity between the two
functions.

Has code breakage ever been taken into account during this dconf conversation?

I doubt a short verbal conversation can solve design problems or take
into account all edge cases; this is why we have the web, where we can
document all code samples and the flaws of a design spec.

This "resolution" should be a DIP that goes through a review just like
all the other DIPs, otherwise DIPs are pointless if they get overruled
by some behind-the-scenes conversation.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 4:03 PM, Andrej Mitrovic wrote:
 On 5/4/13, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrei & I argued that we needed to make it work with just ref annotations.

 So to recap, 2.063 turns slices into r-values which will break code
 that used ref, e.g.:

 -----
 void parse(ref int[] arr) { }

 void main()
 {
      int[] arr = [1, 2];
      parse(arr[]);  // ok in 2.062, error in 2.063
 }
 -----

Do you mean that is an error now with HEAD?


 Then the user might introduce a non-ref overload:

 -----
 void parse(ref int[] arr) { }
 void parse(int[] arr) { }  // picks this one
 -----

 And later down the road, maybe even in 2.064, ref will take r-values
 making the new code error because of ambiguity between the two
 functions.

 Has code breakage ever been taken into account during this dconf conversation?

I don't know of any code it would break.

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 4:34 PM, Walter Bright wrote:
 And later down the road, maybe even in 2.064, ref will take r-values
 making the new code error because of ambiguity between the two
 functions.

 Has code breakage ever been taken into account during this dconf conversation?


I see what you mean now. You mean how does an rvalue overload if faced with T 
and ref T. Currently:

   void foo(ref int i);
   void foo(int i);

   void main() {
     int i;
     foo(i);    // matches ref int
     foo(1);    // matches int
   }

I don't think that should change with this proposal.
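
For anyone who wants to check the current behaviour described above, here is a minimal sketch. The string return values are only there to make the chosen overload observable; they are an assumption of this example, not part of the original declarations.

```d
// Overload pair mirroring the foo example above; the return values are
// illustrative only, so the selected overload can be asserted on.
string foo(ref int i) { return "ref int"; }
string foo(int i)     { return "int"; }

void main()
{
    int i;
    assert(foo(i) == "ref int"); // lvalue argument selects the ref overload
    assert(foo(1) == "int");     // rvalue argument selects the by-value overload
}
```

If ref ever starts accepting rvalues, the second call is the one whose resolution rule would need to be pinned down.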

May 04 2013

"Diggory" <diggsey googlemail.com> writes:

On Saturday, 4 May 2013 at 23:44:27 UTC, Walter Bright wrote:
 On 5/4/2013 4:34 PM, Walter Bright wrote:
 And later down the road, maybe even in 2.064, ref will take 
 r-values
 making the new code error because of ambiguity between the two
 functions.

 Has code breakage ever been taken into account during this 
 dconf conversation?


 I see what you mean now. You mean how does an rvalue overload 
 if faced with T and ref T. Currently:

   void foo(ref int i);
   void foo(int i);

   void main() {
     int i;
     foo(i);    // matches ref int
     foo(1);    // matches int
   }

 I don't think that should change with this proposal.

What about this:

void foo(ref int i);
void foo(ref const(int) i);

void main() {
     int i;
     foo(i);
     foo(1);
}

What do they match here?

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 4:47 PM, Diggory wrote:
 What about this:

 void foo(ref int i);
 void foo(ref const(int) i);

 void main() {
      int i;
      foo(i);
      foo(1);
 }

 What do they match here?

An rvalue ref is not const, so (1) would match the same as (i) does.

May 04 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 05/05/2013 01:47 AM, Diggory wrote:
 On Saturday, 4 May 2013 at 23:44:27 UTC, Walter Bright wrote:
 On 5/4/2013 4:34 PM, Walter Bright wrote:
 And later down the road, maybe even in 2.064, ref will take r-values
 making the new code error because of ambiguity between the two
 functions.

 Has code breakage ever been taken into account during this dconf
 conversation?


 I see what you mean now. You mean how does an rvalue overload if faced
 with T and ref T. Currently:

   void foo(ref int i);
   void foo(int i);

   void main() {
     int i;
     foo(i);    // matches ref int
     foo(1);    // matches int
   }

 I don't think that should change with this proposal.

 What about this:

 void foo(ref int i);
 void foo(ref const(int) i);

 void main() {
      int i;
      foo(i);
      foo(1);
 }

 What do they match here?

Both match the first overload because that is an exact match whereas the 
second overload is only a match with conversion to const.
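
The exact-match rule can be sketched for lvalues with compilers that exist today. This assumes the same illustrative string return values as a way to observe the result; the foo(1) call is left out because, as far as I know, current compilers reject an rvalue for a ref parameter by default.

```d
// Overload pair mirroring Diggory's example; return values are illustrative.
string foo(ref int i)        { return "ref int"; }
string foo(ref const(int) i) { return "ref const(int)"; }

void main()
{
    int i;
    const int c = 0;
    assert(foo(i) == "ref int");        // exact match beats conversion to const
    assert(foo(c) == "ref const(int)"); // a const lvalue can only bind here
}
```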

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/4/13 7:03 PM, Andrej Mitrovic wrote:
 This "resolution" should be a DIP that goes through a review just like
 all the other DIPs, otherwise DIPs are pointless if they get overruled
 by some behind-the-scenes conversation.

Yes.

Andrei

May 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Couple amendments:

On 5/4/13 2:33 PM, Walter Bright wrote:
 Case B:
 ref T foob(ref U u) { return u.t; } // note that T is derivable from U
 ref U bar() { T t; return foob(t); }

That's not derivable, it's embedded: type U transitively has a member of 
type T.

Same case applies to statically-sized arrays:

ref T foob(ref T[42] u) { return u[13]; }
ref T[42] bar() { T[42] t; return foob(t); }

Here the notion that a statically-sized array behaves much like a 
struct is applicable. This case probably deserves notice too:

ref T fooa(ref T t) { return t; }
ref T bar() { T[42] t; return fooa(t[13]); }
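
A small sketch of why both static-array cases are the same problem: in each, the returned ref points into the storage of the caller's array. It uses int for T, and the `return ref` annotation from the later DIP25 work so that it compiles with recent compilers; neither is part of the code above.

```d
// `return ref` is the later DIP25 annotation; the thread's code predates it.
ref int foob(return ref int[42] u) { return u[13]; }
ref int fooa(return ref int t)     { return t; }

void main()
{
    int[42] t;
    int* viaArray   = &foob(t);     // address of the ref returned for the array
    int* viaElement = &fooa(t[13]); // address of the ref returned for one element

    assert(viaArray == &t[13]);
    assert(viaElement == &t[13]);
    // Both addresses lie inside t's storage, i.e. inside the caller's frame.
    assert(t.ptr <= viaArray && viaArray < t.ptr + t.length);
}
```

Taking the address is done here only to observe where the ref points; it is exactly the kind of operation the proposal restricts in safe code.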

 1. Always involves a return statement.

Except if pointers are used, which leaves the question of what we do 
when people take the address of refs returned by functions.

 2. The return type must always be the type of the stack variable or a
 type type derived from a stack variable's type via safe casting or
 subtyping.

That's not subtyping, it's transitive member access. Here transitive 
goes through members but not through indirections. Not sure what to call 
that without making it confusing.

 3. Returning rvalues is the same issue, as rvalues are always turned
 into local stack temporaries.

The complicating factor here is that lvalues have well-understood 
lifetimes, whereas rvalue lifetimes are subtler and open to 
interpretation. I think right now D destroys temporaries too early.

 4. Whether a function returns a ref derived from a parameter or not is
 not reflected in the function signature.

Yes! That's why any static solution is either conservative or 
complicates the language.

 5. Always involves passing a local by ref to a function that returns by
 ref, and that function gets called in a return statement.

There's also the case of e.g. "return *p;" and "return a[13];".


Andrei

May 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2013 5:28 PM, Andrei Alexandrescu wrote:
 Couple amendments:

 On 5/4/13 2:33 PM, Walter Bright wrote:
 Case B:
 ref T foob(ref U u) { return u.t; } // note that T is derivable from U
 ref U bar() { T t; return foob(t); }

 That's not derivable, it's embedded: type U transitively has a member of
 type T.

 Same case applies to statically-sized arrays:

 ref T foob(ref T[42] u) { return u[13]; }
 ref T[42] bar() { T[42] t; return foob(t); }

 Here the notion that a statically-sized array behaves much like a struct is
 applicable. This case probably deserves notice too:

 ref T fooa(ref T t) { return t; }
 ref T bar() { T[42] t; return fooa(t[13]); }

Yes.

 1. Always involves a return statement.

 Except if pointers are used, which leaves the question of what we do when
 people take the address of refs returned by functions.

Refs are a restricted form of pointer; the whole point of them is that we can 
do more reasoning about them. If we also allow converting them to pointers in 
safe code, everything falls apart.

 2. The return type must always be the type of the stack variable or a
 type type derived from a stack variable's type via safe casting or
 subtyping.

 That's not subtyping, it's transitive member access. Here transitive goes
 through members but not through indirections. Not sure how to call that to not
 make it confusing.

I know what you mean. I don't know what word to use, either.


 3. Returning rvalues is the same issue, as rvalues are always turned
 into local stack temporaries.

 The complicating factor here is that lvalues have well-understood lifetimes
 whereas rvalues are more subtle and opened to subtleties and interpretations. I
 think right now D destroys temporaries too early.

Considering that we only have to deal with return statement expressions here, 
where the lifetime of those temporaries would be restricted to those 
expressions regardless, that shouldn't be an issue.


 4. Whether a function returns a ref derived from a parameter or not is
 not reflected in the function signature.

 Yes! That's why any static solution is either conservative or complicates the
 language.

 5. Always involves passing a local by ref to a function that returns by
 ref, and that function gets called in a return statement.

 There's also the case of e.g. "return *p;"

Again, allowing ref <=> pointer conversions makes it impractical to reason 
about refs.

 and "return a[13];".

If 'a' is a local array allocated on the stack, this is trivially disallowed, 
just like:

     S s;
     return s.t;

would be.

May 04 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, May 05, 2013 01:03:17 Andrej Mitrovic wrote:
 On 5/4/13, Walter Bright <newshound2 digitalmars.com> wrote:
 Andrei & I argued that we needed to make it work with just ref
 annotations.

 
 So to recap, 2.063 turns slices into r-values which will break code
 that used ref, e.g.:
 
 -----
 void parse(ref int[] arr) { }
 
 void main()
 {
     int[] arr = [1, 2];
     parse(arr[]);  // ok in 2.062, error in 2.063
 }
 -----
 
 Then the user might introduce a non-ref overload:
 
 -----
 void parse(ref int[] arr) { }
 void parse(int[] arr) { }  // picks this one
 -----
 
 And later down the road, maybe even in 2.064, ref will take r-values
 making the new code error because of ambiguity between the two
 functions.
 
 Has code breakage ever been taken into account during this dconf
 conversation?
 
 I doubt a short verbal conversation can solve design problems or take
 into account all edge-cases, this is why we have the web where we can
 document all code samples and the flaws of some design spec.
 
 This "resolution" should be a DIP that goes through a review just like
 all the other DIPs, otherwise DIPs are pointless if they get overruled
 by some behind-the-scenes conversation.

The rvalue part wasn't agreed upon, just the @safety solution. I'm sure that 
the @safety solution can be discussed further if there's dissension, but it's 
a completely non-breaking change (the only case where you'd get an Error is one 
where the code was operating on a variable which had already left scope and 
been destroyed). So, I wouldn't expect there to be any real issues with that. 
The rvalue portion, however, definitely needs further discussion. Walter was in 
favor of ref accepting rvalues (I think that he thought that accepting rvalues 
was only a safety issue, but I'm not sure), so maybe that's why he was 
thinking that that was resolved, but there was certainly no agreement on it 
even between him and Andrei, let alone among the rest of us.

- Jonathan M Davis

May 04 2013

Jacob Carlborg <doob me.com> writes:

On 2013-05-04 20:33, Walter Bright wrote:

 These checks would be omitted if the -noboundscheck compiler switch was
 provided.

Perhaps a new flag for this.

-- 
/Jacob Carlborg

May 05 2013

"Dicebot" <m.strashun gmail.com> writes:

On Sunday, 5 May 2013 at 09:26:54 UTC, Jacob Carlborg wrote:
 On 2013-05-04 20:33, Walter Bright wrote:

 These checks would be omitted if the -noboundscheck compiler 
 switch was provided.

 Perhaps a new flag for this.

Or just rename it to the more general -noruntimesafetychecks

May 05 2013

Michel Fortin <michel.fortin michelf.ca> writes:

On 2013-05-04 18:33:10 +0000, Walter Bright <newshound2 digitalmars.com> said:

 Runtime Detection
 
 There are still a few cases that the compiler cannot statically detect. 
 For these a runtime check is inserted, which compares the returned ref 
 pointer to see if it lies within the stack frame of the exiting 
 function, and if it does, halts the program. The cost will be a couple 
 of CMP instructions and an LEA. These checks would be omitted if the 
 -noboundscheck compiler switch was provided.

I just want to note that this has the effect of making any kind of heap 
allocation not done by the GC unsafe. For instance, if you have a 
container struct that allocates using malloc/realloc and that container 
gives access to its elements by reference then you're screwed (it can't 
be detected).

The obvious answer is to not make @trusted the function returning a 
reference or a slice to malloced memory. But I remember Andrei wanting 
to make standard containers of this sort at one point, so I think it's 
important to note this limitation.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/

May 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 5/5/2013 4:43 AM, Michel Fortin wrote:
 On 2013-05-04 18:33:10 +0000, Walter Bright <newshound2 digitalmars.com> said:

 Runtime Detection

 There are still a few cases that the compiler cannot statically detect. For
 these a runtime check is inserted, which compares the returned ref pointer to
 see if it lies within the stack frame of the exiting function, and if it does,
 halts the program. The cost will be a couple of CMP instructions and an LEA.
 These checks would be omitted if the -noboundscheck compiler switch was
 provided.

 I just want to note that this has the effect of making any kind of heap
 allocation not done by the GC unsafe. For instance, if you have a container
 struct that allocates using malloc/realloc and that container gives access to
 its elements by reference then you're screwed (it can't be detected).

 The obvious answer is to not make @trusted the function returning a reference
 or a slice to malloced memory. But I remember Andrei wanting to make standard
 containers of this sort at one point, so I think it's important to note this
 limitation.

I know Andrei has thought about this, but I don't know what the solution is.

May 05 2013

Michel Fortin <michel.fortin michelf.ca> writes:

On 2013-05-05 18:19:26 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 5/5/2013 4:43 AM, Michel Fortin wrote:
 On 2013-05-04 18:33:10 +0000, Walter Bright <newshound2 digitalmars.com> said:
 
 Runtime Detection
 
 There are still a few cases that the compiler cannot statically detect. For
 these a runtime check is inserted, which compares the returned ref pointer to
 see if it lies within the stack frame of the exiting function, and if it does,
 halts the program. The cost will be a couple of CMP instructions and an LEA.
 These checks would be omitted if the -noboundscheck compiler switch was 
 provided.

 
 I just want to note that this has the effect of making any kind of heap
 allocation not done by the GC unsafe. For instance, if you have a container
 struct that allocates using malloc/realloc and that container gives access to
 its elements by reference then you're screwed (it can't be detected).
 
 The obvious answer is to not make @trusted the function returning a reference
 or a slice to malloced memory. But I remember Andrei wanting to make standard
 containers of this sort at one point, so I think it's important to note this
 limitation.

 
 I know Andrei has thought about this, but I don't know what the solution is.

Just rethrowing an idea that was already thrown here: support annotated 
lifetimes *in addition* to this runtime detection system. Those who use 
manual memory management will need it to make their code @safe. Those 
who stick to the GC won't have to. Anyway, you don't have to implement 
both right away, it can always be decided later.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/

May 05 2013

"deadalnix" <deadalnix gmail.com> writes:

On Sunday, 5 May 2013 at 23:45:21 UTC, Michel Fortin wrote:
 Just rethrowing an idea that was already thrown here: support 
 annotated lifetimes *in addition* to this runtime detection 
 system. Those who use manual memory management will need it to 
 make their code @safe. Those who stick to the GC won't have to. 
 Anyway, you don't have to implement both right away, it can 
 always be decided later.

Yes, that is also my point of view. We don't even need to support 
annotation now, simply ensure that we don't close the door to 
annotation.

May 05 2013

"Zach the Mystic" <reachzach gggggmail.com> writes:

On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Static Compiler Detection (in @safe mode):

 1. Do not allow taking the address of a local variable, unless 
 doing a safe type 'paint' operation.

 2. In some cases, such as nested, private, and template 
 functions, the source is always available so the compiler can 
 error on those. Because of the .di file problem, doing this 
 with auto return functions is problematic.

 3. Issue error on return statements where the expression may 
 contain a ref to a local that is going out of scope, taking 
 into account the observations.

 Runtime Detection

 There are still a few cases that the compiler cannot statically 
 detect. For these a runtime check is inserted, which compares 
 the returned ref pointer to see if it lies within the stack 
 frame of the exiting function, and if it does, halts the 
 program. The cost will be a couple of CMP instructions and an 
 LEA. These checks would be omitted if the -noboundscheck 
 compiler switch was provided.

This is a brilliant solution. I'm glad my DIP seems to have 
helped pivot the design process into this superior conclusion, 
which uses something, i.e. runtime checking, I simply didn't 
think of. I guess I didn't realize that the stack has "bounds", 
so to say.

I suppose that underneath the hood the compiler will still track 
the state of the return value using something like a 'scope' bit. 
It's just that the user code doesn't need to see this bit, which 
is probably how it should be. And it's great to realize that a 
suitable safety framework - -noboundscheck - has been found which 
already exists to encompass the checking.

I think the main data still to be researched is the slowdown with 
both compile and run times with this checking implemented - not 
that I see how to avoid it, but it's better to know than not to 
know, right?
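
For readers who, like me, had not thought of the stack as having "bounds": the quoted runtime check amounts to an address range test. Below is a rough sketch under stated assumptions. The helper name liesWithin is made up, and a local array stands in for the exiting function's frame; the real check would be emitted by the compiler against the actual frame limits.

```d
// Hypothetical helper: true if p lies in the half-open range [lo, hi).
bool liesWithin(const(void)* p, const(void)* lo, const(void)* hi) pure nothrow @nogc
{
    return lo <= p && p < hi;
}

void main()
{
    int[4] frame;        // stands in for the exiting function's stack frame
    int* heap = new int; // GC allocation, outside that range

    const(void)* lo = frame.ptr;
    const(void)* hi = frame.ptr + frame.length;

    assert(liesWithin(&frame[2], lo, hi)); // a ref like this would halt the program
    assert(!liesWithin(heap, lo, hi));     // a ref like this would be let through
}
```

Two comparisons plus computing the frame address is consistent with the "couple of CMP instructions and an LEA" cost estimate. Note that the test only looks at addresses, so anything outside the frame passes regardless of whether it is still alive.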

May 05 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 04.05.2013 20:33, Walter Bright wrote:
 Static Compiler Detection (in @safe mode):

 1. Do not allow taking the address of a local variable, unless doing a
 safe type 'paint' operation.

I'm not exactly sure what a "safe type paint operation" does, or 
whether the following has already been considered, but I'd just like to 
be assured it has:

Taking a slice of a stack allocated fixed-size array also includes 
taking its address, so it is also forbidden? This might disallow any 
range based algorithms on the static array.

May 09 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, May 09, 2013 21:30:00 Rainer Schuetze wrote:
 On 04.05.2013 20:33, Walter Bright wrote:
 Static Compiler Detection (in @safe mode):
 
 1. Do not allow taking the address of a local variable, unless doing a
 safe type 'paint' operation.

 
 I'm not exactly sure what a "safe type paint operation" does, and
 whether the following has already been considered, but I just like to be
 assured it has:
 
 Taking a slice of a stack allocated fixed-size array also includes
 taking its address, so it is also forbidden? This might disallow any
 range based algorithms on the static array.

Assuming that taking the slice of a static array is treated like ref (as @safe) 
rather than like taking the address of a local variable is (as @system), then 
we'll have to add similar runtime checks for arrays, and that would be way, 
way worse given that without purity, they could be assigned to a global 
dynamic array (or could be assigned to a member variable in a return value 
even with pure functions). It's fairly clean for ref simply because ref is a 
storage class and not a type constructor. Array slices on the other hand could 
escape all over the place.

I'm inclined to believe that taking a slice of a static array should be 
considered @system just like taking the address of a local variable is 
considered @system. If I could, I'd even disallow the implicit slicing of 
static arrays when passing them to functions taking dynamic arrays, but I 
doubt that Walter would go that far. But I don't know what we can do other 
than making slicing static arrays @system given how difficult it would be to 
have runtime checks catch that.
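
To make the concern concrete, here is a small sketch showing that a slice of a static array is just a pointer and length into the array's own (stack) storage, so it carries the same escape risk as the address of a local. The helper name pointsInto is hypothetical.

```d
// Hypothetical helper: true if the slice starts inside buf's storage.
bool pointsInto(int[] s, ref int[4] buf)
{
    return buf.ptr <= s.ptr && s.ptr < buf.ptr + buf.length;
}

void main()
{
    int[4] buf = [1, 2, 3, 4];
    int[] slice = buf[];             // no copy: pointer + length into buf

    assert(pointsInto(slice, buf));
    slice[0] = 42;
    assert(buf[0] == 42);            // writes go straight to the stack array

    int[] copy = buf[].dup;          // GC-allocated copy, safe to let escape
    assert(!pointsInto(copy, buf));
}
```

Once such a slice is stored in a global or a returned struct, no check at the return statement would see it.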

I'd brought this issue up in the past but had not remembered it during the 
recent discussions on ref safety. Good catch. We don't want any holes like 
this to persist.

- Jonathan M Davis

May 09 2013

"Maxim Fomin" <maxim maxim-fomin.ru> writes:

On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
 Thanks to the many recent threads on this, and the dips on it, 
 everyone was pretty much up to speed and ready to find a 
 resolution. This resolution only deals with the memory safety 
 issue.

...

What if an argument is captured by a delegate?

import std.stdio;

alias long[100] T;

T delegate() dg;

//ref T foo(ref T i) @safe
void foo(ref T i) @safe
{
    dg = { return i; } ;
    //return i;
}

//ref T bar()
void bar() @safe
{
    T i = 1;
    //return foo(i);
    foo(i);
}

void rewrite_stack() @safe
{
    T tmp = -1;
}

void main()
{
    //T i = bar();
    bar();
    rewrite_stack();
    writeln(dg());
}

I believe that even taking your runtime solution into account 
there is still a flaw in the code, caused by capturing a 
reference (pointer) to the passed object. Since the definition of 'foo' 
may be unavailable, the compiler cannot know when issuing the call in 
'bar' whether to allocate the argument on the stack or on the heap.

By the way, lazy+delegate is broken.

auto foo(lazy int i) @safe
{
    return { return i; } ;
}

auto bar() @safe
{
    int i = 4;
    return foo(i);
}

void baz() @safe
{
    int[1] arr = 2;
}

void main() @safe
{
    auto x = bar();
    baz();
    assert(x() is 2); // stack value hijacking
}

First example: http://dpaste.dzfl.pl/4c84a5e4
Second example: http://dpaste.dzfl.pl/9399adc6

May 09 2013
