digitalmars.D - casting away const and then mutating

anonymous (27/27) Jul 23 2015 On a GitHub pull request, Steven Schveighoffer (schveiguy),

Steven Schveighoffer (36/52) Jul 23 2015 Yes, IMO, this should simply work and be consistent. The compiler could

Timon Gehr (3/18) Jul 24 2015 No, it would be sufficient to have a simple form of constant propagation...

Steven Schveighoffer (3/22) Jul 24 2015 What do you mean?

Timon Gehr (12/35) Jul 24 2015 Assuming UB for modifying through a const reference, the compiler does

Jonathan M Davis (53/72) Jul 23 2015 It's come up time and time again with discussions for logical

Steven Schveighoffer (59/97) Jul 23 2015 This is not logical const. We are starting with mutable data, moving it

Jonathan M Davis (81/135) Jul 23 2015 It's come up a number of times in discussions on logical const,

Steven Schveighoffer (16/64) Jul 24 2015 The compiler knows everything that is going on inside a function. It can...

Jonathan M Davis (92/145) Jul 24 2015 You're assuming that no separate compilation is going on, which

Steven Schveighoffer (50/119) Jul 24 2015 No, I'm not. Using an inout function is like inserting a wrapper around

Timon Gehr (2/5) Jul 24 2015 But what is the point?

Steven Schveighoffer (40/47) Jul 24 2015 The original PR is to add a const and immutable version of upperBound

Jonathan M Davis (22/33) Jul 24 2015 Yeah. That needs to be fixed. As I understand it, it's feasible

anonymous (3/10) Jul 26 2015 Pull request for that:
Martin Nowak (6/18) Aug 06 2015 Please open a Bugzilla issue to keep track of this and raise

Jonathan M Davis (22/47) Jul 24 2015 The only except that makes any sense to me is when you're casting

"anonymous" <anonymous example.com> writes:

On a GitHub pull request, Steven Schveighoffer (schveiguy), 
Jonathan M Davis (jmdavis), and I (aG0aep6G) have been discussing 
if or when it's ok to cast away const and then mutate the data:

https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544

I've been under the impression that it's never allowed, i.e. it's 
always undefined behaviour. I think Jonathan is of the same 
opinion.

Steven disagrees and thinks that there are cases where it's ok. 
Namely, this simple case would be ok:

----
int x;
const int *y = &x;
*(cast(int *)y) = 5;
----

As I understand him, he's arguing that since the data is mutable, 
and since no function boundaries are crossed, compilers should 
not be allowed to do anything but the obvious with that code.

I think we've exchanged all arguments we have, yet no one has 
been convinced by the other side.

We've found the language spec to be a bit sparse on this. All I 
could find is essentially "you can't mutate through a const 
reference" [1], but no mention of if/when it's ok to cast a const 
reference to a mutable one (and then mutate).

So the questions are:
Is this specified somewhere?
If it isn't specified, how should it be specified?

[1] http://dlang.org/const3.html

Jul 23 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

Yes, IMO, this should simply work and be consistent. The compiler could 
use willful ignorance to assume x is still 0, but I find that to be 
counterproductive. It would have to implement flow analysis to determine 
that y must point at x, and then simply ignore the assignment during 
that analysis.

I'll note that the reason I want to allow this, is because we have a 
case where the same implementation must be used for the const and 
mutable version of the function, but the return value is a template that 
varies on constancy. Therefore, you necessarily need 2 function 
definitions -- the compiler isn't smart enough (or maybe it's too 
smart?) to use inout(T) as a template parameter to the Range struct, and 
auto convert that back on return.

The nice thing about inout, is the compiler guarantees one 
implementation of the function, and the implementation will guarantee 
const is preserved. But on returning, the function puts the data back to 
the way it was. That's exactly what we want.

In this case, we can't use inout. So we have to cast (or alternatively, 
copy-paste implementation) one result to the other.

My opinion is, we should execute the implementation as if the object 
were const, and then cast back to mutable if we are using the mutable 
entry point. This allows the compiler to check the function 
implementation for const-correctness, vs. the other suggestion: casting 
the const object to a mutable one and then hoping the mutable function 
implementation is const-correct without compiler help.

The proposed usage of casting also does not mutate the data in the 
function that casts. It receives in a mutable object, and it outputs a 
reference to the mutable object. The compiler would have to go through 
great lengths to see that the source of the mutable range it's receiving 
comes from a const range, and then ignore the type system in order to 
elide a reloading of something that is minuscule compared to the 
function call itself.
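
A minimal sketch of that pattern, assuming a hypothetical `Container` whose `upperBound` returns a `Range` templated on element constness. The names `Container`, `Range`, `front_` and `end_` are illustrative only and are not the code from the PR:

```d
// Hypothetical range type, templated on the constness of its elements.
struct Range(T)
{
    T* front_; // first element of the range
    T* end_;   // one past the last element
}

// Hypothetical container; not the RedBlackTree code from the PR.
struct Container
{
    int[] data; // assumed to be sorted in ascending order

    // The one real implementation. Because `this` is const here, the
    // compiler rejects any accidental mutation of the container.
    Range!(const int) upperBound(int v) const
    {
        size_t i = 0;
        while (i < data.length && data[i] <= v)
            ++i;
        return Range!(const int)(data.ptr + i, data.ptr + data.length);
    }

    // Mutable entry point: run the const implementation, then cast the
    // result back to the constness the caller started with.
    Range!int upperBound(int v)
    {
        const(Container)* self = &this;
        auto r = self.upperBound(v); // only the const overload is viable
        return Range!int(cast(int*) r.front_, cast(int*) r.end_);
    }
}

unittest
{
    auto c = Container([1, 2, 3, 4]);
    auto r = c.upperBound(2);
    static assert(is(typeof(r) == Range!int));
    assert(r.front_ is &c.data[2]); // same memory, no copy was made
}
```

The mutable overload itself never writes through the cast pointers; it only hands them back to a caller that already owned a mutable `Container`.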

 As I understand him, he's arguing that since the data is mutable, and
 since no function boundaries are crossed, compilers should not be
 allowed to do anything but the obvious with that code.

 I think we've exchanged all arguments we have, yet no one has been
 convinced by the other side.

 We've found the language spec to be a bit sparse on this. All I could
 find is essentially "you can't mutate through a const reference" [1],
 but no mention of if/when it's ok to cast a const reference to a mutable
 one (and then mutate).

Note that it does specifically mention immutable cannot be cast away and 
then modified (on that same page, see "Removing Immutable With A Cast"). 
It does not mention const; I assume that is on purpose.

-Steve

Jul 23 2015

Timon Gehr <timon.gehr gmx.ch> writes:

On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 Yes, IMO, this should simply work and be consistent. The compiler could
 use willful ignorance to assume x is still 0, but I find that to be
 counterproductive. It would have to implement flow analysis to determine
 that y must point at x, and then simply ignore the assignment during
 that analysis.
...

No, it would be sufficient to have a simple form of constant propagation 
to screw up here.

Jul 24 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/24/15 3:02 PM, Timon Gehr wrote:
 On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 Yes, IMO, this should simply work and be consistent. The compiler could
 use willful ignorance to assume x is still 0, but I find that to be
 counterproductive. It would have to implement flow analysis to determine
 that y must point at x, and then simply ignore the assignment during
 that analysis.
 ...

 No, it would be sufficient to have a simple form of constant propagation
 to screw up here.

What do you mean?

-Steve

Jul 24 2015

Timon Gehr <timon.gehr gmx.ch> writes:

On 07/24/2015 09:43 PM, Steven Schveighoffer wrote:
 On 7/24/15 3:02 PM, Timon Gehr wrote:
 On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 Yes, IMO, this should simply work and be consistent. The compiler could
 use willful ignorance to assume x is still 0, but I find that to be
 counterproductive. It would have to implement flow analysis to determine
 that y must point at x, and then simply ignore the assignment during
 that analysis.
 ...

 No, it would be sufficient to have a simple form of constant propagation
 to screw up here.

 What do you mean?

 -Steve

Assuming UB for modifying through a const reference, the compiler does 
not have to be clever at all to come up with the following semantics for 
that piece of code:

void main(){
     int x;
     const int* y=&x;
     *(cast(int*)y)=5;

     assert(x==0); // constant propagated
     assert(*y==5);
     assert(*&x==5);
}

Jul 24 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, 23 July 2015 at 18:43:03 UTC, anonymous wrote:
 On a GitHub pull request, Steven Schveighoffer (schveiguy), 
 Jonathan M Davis (jmdavis), and I (aG0aep6G) have been 
 discussing if or when it's ok to cast away const and then 
 mutate the data:

 https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544

 I've been under the impression that it's never allowed, i.e. 
 it's always undefined behaviour. I think Jonathan is of the 
 same opinion.

It's come up time and time again with discussions for logical 
const. If you cast away const, it's up to you to guarantee that 
the data being referenced is not mutated, and if it is mutated, 
it's undefined behavior.

Now, if you know that the data being referenced is actually 
mutable and not immutable, and you know that the compiler isn't 
going to make any assumptions based on const which are then wrong 
if you mutate the variable after casting away const, then you can 
get away with it. But it's still undefined behavior, and if the 
compiler later starts doing more than it does now based on the 
knowledge that you can't mutate via a const reference, then your 
code might stop working correctly. So, if you're _really_ 
careful, you can get away with casting away const and mutating a 
variable, but you are depending on undefined behavior.

 Steven disagrees and thinks that there are cases where it's ok. 
 Namely, this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 As I understand him, he's arguing that since the data is 
 mutable, and since no function boundaries are crossed, 
 compilers should not be allowed to do anything but the obvious 
 with that code.

Even if this were defined behavior, what would be the point? You 
have access to x. You could just mutate it directly. I don't see 
how it would make any sense to be attempting to mutate something 
via a const reference when you have access to it via a mutable 
reference. It's when you don't have access to it via a mutable 
reference that it becomes an issue - which means that you've 
crossed a function boundary.

As far as I can tell, making the above defined behavior buys you 
nothing. The times when you gain something from being able to 
cast away const and mutate are the times when you've crossed 
function boundaries and you have to assume that the calling code 
can't see that the cast is happening and thus can't see that the 
data it passed in might have been mutated even though it was 
const. The times where being able to cast away const and mutate 
would be valuable are exactly the times when that would be 
violating the purpose of const - that the data isn't changed via 
a const reference.

The only way to make casting away const and mutating defined 
behavior in general is to make it so that the compiler can't make 
assumptions based on const, which does tend to defeat the purpose 
of const on some level. And part of the whole deal with D's const 
is that it's actually physical const and not logical const or 
C++'s const or any other type of const, and if that's the case, 
then casting away const and mutating is _not_ something that 
should be defined behavior. If it were, then we wouldn't be 
dealing with physical const anymore. Instead we'd be in the same 
boat as C++ where const didn't actually mean that the object 
wasn't mutated, since you could cast away const and mutate - just with the 
caveat that you have to be sure that the data wasn't actually 
immutable, since mutating immutable data _definitely_ breaks 
immutable, and it could segfault, depending on where the data is 
stored.

So, I don't see how we say that it makes sense for const to ever 
be cast away and then mutated. That violates the guarantees that 
const is supposed to provide and puts us back in the C++ boat, 
only worse, since you still have to worry about immutable and not 
mutating const when it's actually immutable.

- Jonathan M Davis

Jul 23 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/23/15 7:57 PM, Jonathan M Davis wrote:
 On Thursday, 23 July 2015 at 18:43:03 UTC, anonymous wrote:
 On a GitHub pull request, Steven Schveighoffer (schveiguy), Jonathan M
 Davis (jmdavis), and I (aG0aep6G) have been discussing if or when it's
 ok to cast away const and then mutate the data:

 https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544


 I've been under the impression that it's never allowed, i.e. it's
 always undefined behaviour. I think Jonathan is of the same opinion.

 It's come up time and time again with discussions for logical const.

This is not logical const. We are starting with mutable data, moving it 
through a function that we *don't* want to mutate the data, and then 
using it as mutable again in the function where you (and the compiler) 
know it's mutable. But you aren't even mutating, just getting it back to 
the original constancy (though mutation should be OK, you still have a 
mutable reference).

 If
 you cast away const, it's up to you to guarantee that the data being
 referenced is not mutated, and if it is mutated, it's undefined behavior.

Still need a reference to the spec that says that. Note that the spec 
specifically says it's undefined to cast away and modify immutable, and 
is careful not to include const/mutable in that discussion.

 Now, if you know that the data being referenced is actually mutable and
 not immutable, and you know that the compiler isn't going to make any
 assumptions based on const which are then wrong if you mutate the
 variable after casting away const, then you can get away with it.
 But
 it's still undefined behavior, and if the compiler later starts doing
 more than it does now based on the knowledge that you can't mutate via a
 const reference, then your code might stop working correctly. So, if
 you're _really_ careful, you can get away with casting away const and
 mutating a variable, but you are depending on undefined behavior.

An example of what the compiler can "start doing more than it does now" 
would be helpful. I can't see how it can do anything based on this.

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 Even if this were defined behavior, what would be the point? You have
 access to x. You could just mutate it directly.

OK, but the point is you have run an algorithm that gets a *piece* of x 
(pretend x is not just a simple int), which you know to be mutable 
because x is mutable. But you don't want the algorithm to mutate x.

Basically, if we say this is undefined behavior, then inout is undefined 
behavior.

 I don't see how it would
 make any sense to be attempting to mutate something via a const
 reference when you have access to it via a mutable reference. It's when
 you don't have access to it via a mutable reference that it becomes an
 issue - which means that you've crossed a function boundary.

Exactly. You still have mutable access to it, and you know the const 
access is to the same object, just transformed via an algorithm.

 The only way to make casting away const and mutating defined behavior in
 general

This is NOT what is being asked. Not the general case of making it 
defined to cast away const on any item (which could turn out to be 
immutable).

I think it's pointless to argue over this. The behavior can't be defined 
any other way than what I'm asking. The question is if we want the 
*official* position to nonsensically call it undefined behavior.

Specifically, I would say you can cast away const on a reference that 
you have created within your own function on a mutable piece of data (in 
other words, you control the mutable data), then you can mutate via the 
cast reference. Otherwise, the inout feature is invalid, and we should 
remove it from the language, because that's EXACTLY what it does.

A simple example:

struct Node
{
    int val;
    Node *next;
}
const(Node) *find(const(Node)* n, int val)
{
    while(n && n.val != val) n = n.next;
    return n;
}
Node *find(Node *n, int val)
{
     const cn = n;
     return cast(Node *)find(cn, val);
}

Note that the mutable version of find doesn't mutate the node (checked 
by the compiler BTW), and its signature doesn't allow any const 
optimizations -- it gets in a mutable and returns a mutable. This can be 
rewritten like this:

inout(Node) *find(inout(Node)* n, int val)
{
     while(n && n.val != val) n = n.next;
     return n;
}

But in the case of the PR in question, we can't do this, because we 
can't inout our range and have it continue to be a range. So we are 
mimicking the behavior of inout, and the compiler should be fine with this.
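
As a usage sketch (the list contents and the `unittest` harness are assumptions added for illustration), the two overloads above look like this from the caller's side:

```d
struct Node
{
    int val;
    Node* next;
}

// const implementation: cannot mutate the list it walks.
const(Node)* find(const(Node)* n, int val)
{
    while (n && n.val != val)
        n = n.next;
    return n;
}

// Mutable entry point: forwards to the const overload and casts back.
Node* find(Node* n, int val)
{
    const(Node)* cn = n;
    return cast(Node*) find(cn, val);
}

unittest
{
    auto tail = new Node(2, null);
    auto head = new Node(1, tail);

    // Mutable in, mutable out; the result aliases a node of the list.
    Node* hit = find(head, 2);
    assert(hit is tail);

    // Const in, const out; the caller cannot write through `chit`.
    const(Node)* chead = head;
    auto chit = find(chead, 2);
    static assert(is(typeof(chit) == const(Node)*));
    static assert(!__traits(compiles, chit.val = 5));
    assert(chit is tail);
}
```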

-Steve

Jul 23 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer 
wrote:
 On 7/23/15 7:57 PM, Jonathan M Davis wrote:
 If
 you cast away const, it's up to you to guarantee that the data 
 being
 referenced is not mutated, and if it is mutated, it's 
 undefined behavior.

 Still need a reference to the spec that says that. Note that 
 the spec specifically says it's undefined to cast away and 
 modify immutable, and is careful not to include const/mutable 
 in that discussion.

It's come up a number of times in discussions on logical const, 
including from Walter. For it to be otherwise would mean that 
const is not actually physical const. I'm quite certain that the 
spec is wrong in this case.

 OK, but the point is you have run an algorithm that gets a 
 *piece* of x (pretend x is not just a simple int), which you 
 know to be mutable because x is mutable. But you don't want the 
 algorithm to mutate x.

 Basically, if we say this is undefined behavior, then inout is 
 undefined behavior.

inout is done by the compiler. It knows that it's safe to cast 
the return type to mutable (or immutable), because it knows that 
the return value was either the argument that it passed in or 
something constructed within the function and thus safe to cast. 
The compiler knows what's going on, so it can ensure that it 
doesn't violate the type system and is well-defined.
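
A small sketch of that compiler-driven substitution, reusing the `find` example from earlier in the thread (the `unittest` harness is an addition for illustration). The caller's result type is derived purely from the argument type and the `inout` signature:

```d
struct Node
{
    int val;
    Node* next;
}

// One body; the compiler checks it as if the data were const.
inout(Node)* find(inout(Node)* n, int val)
{
    while (n && n.val != val)
        n = n.next;
    return n;
}

unittest
{
    Node* m;
    const(Node)* c;
    immutable(Node)* i;

    // inout in the return type is replaced by the qualifier of the argument.
    static assert(is(typeof(find(m, 0)) == Node*));
    static assert(is(typeof(find(c, 0)) == const(Node)*));
    static assert(is(typeof(find(i, 0)) == immutable(Node)*));

    auto n = new Node(7, null);
    assert(find(n, 7) is n);
    assert(find(n, 8) is null);
}
```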

 An example of what the compiler can "start doing more than it 
 does now" would be helpful. I can't see how it can do anything 
 based on this.

Well, it could remove dead code. For instance, if you had

const(Foo) bar(T t)
{
     const myFoo = getFoo(t);
     auto value1 = pureFunc(myFoo);
     auto value2 = pureFunc2(myFoo);
     auto value3 = pureFunc3(value1, value2);
     return myFoo;
}

All of the lines with pureFunc* could be removed outright, 
because they're all pure function calls, and they can't possibly 
have mutated myFoo. I wouldn't expect a lot of dead code like 
that, and maybe something like that would never be implemented in 
the compiler, but it could be as long as the compiler can 
actually rely on const not being mutated.

But part of the problem with "start doing more than it does now" 
is that that could easily depend on ideas that folks come up with 
later. At some point in the future, someone might figure out how 
const interacts with some other set of attributes and be able to 
optimize based on that. So, if you're casting away const and 
mutating, relying on no one coming up with new optimizations, 
then you could be in trouble later when they do. And maybe they 
won't, but we don't know.

 Specifically, I would say you can cast away const on a 
 reference that you have created within your own function on a 
 mutable piece of data (in other words, you control the mutable 
 data), then you can mutate via the cast reference. Otherwise, 
 the inout feature is invalid, and we should remove it from the 
 language, because that's EXACTLY what it does.

 A simple example:

 struct Node
 {
    int val;
    Node *next;
 }
 const(Node) *find(const(Node)* n, int val)
 {
    while(n && n.val != val) n = n.next;
    return n;
 }
 Node *find(Node *n, int val)
 {
     const cn = n;
     return cast(Node *)find(cn, val);
 }

 Note that the mutable version of find doesn't mutate the node 
 (checked by the compiler BTW), and its signature doesn't allow 
 any const optimizations -- it gets in a mutable and returns a 
 mutable. This can be rewritten like this:

 inout(Node) *find(inout(Node)* n, int val)
 {
     while(n && n.val != val) n = n.next;
     return n;
 }

 But in the case of the PR in question, we can't do this, 
 because we can't inout our range and have it continue to be a 
 range. So we are mimicking the behavior of inout, and the 
 compiler should be fine with this.

It sounds like what you really want is a tail-inout range or 
somesuch, though since we can't even sort out tail-const ranges 
properly at this point, I expect that tail-inout ranges are a bit 
of a pipe dream.

In any case, regardless of whether what you're proposing is defined 
behavior or not, it'll work, because there's no way that the 
compiler could do any optimizations based on const after the cast 
is done, because it's only const within the function. It's when 
you cast away const on something that you were given as const 
that you have a real problem. e.g.

const(Foo) bar(const(Foo) foo)
{
     auto f = cast(Foo)foo;
     f.mutateMe();
     return foo;
}
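
A sketch of why this case is the problematic one, assuming a hypothetical class `Foo` with a `mutateMe` method: `bar` cannot tell whether its argument is a const view of mutable data or of immutable data, because both convert to `const(Foo)`.

```d
// Hypothetical class standing in for Foo.
class Foo
{
    int state;
    void mutateMe() { ++state; }
}

const(Foo) bar(const(Foo) foo)
{
    auto f = cast(Foo) foo; // compiles, but discards the const promise
    f.mutateMe();
    return foo;
}

// Both a mutable and an immutable Foo implicitly convert to const(Foo),
// so bar accepts either one and has no way to tell them apart.
static assert(is(Foo : const(Foo)));
static assert(is(immutable(Foo) : const(Foo)));

unittest
{
    auto m = new Foo;
    const(Foo) r = bar(m);
    assert(r is m); // bar hands back the very object it was given
}
```

With an immutable argument the mutation is the case the spec explicitly calls undefined; with a mutable argument it is the case being argued over here, so nothing is asserted about `state`.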

I agree that what you're asking for makes sense. If you pass in a 
mutable object to a function, and you know that the const object 
you get out is either the same object or a new one that is not 
immutable, and no references to that object escaped the function, 
then casting the return type to mutable should work, and I'm not 
against that being well-defined, but as I understand it, it 
technically isn't, because it involves casting away const and 
then mutating the result. And if it is well-defined, then we'd 
need a clear way to describe the circumstances to separate it from 
casting away const in general (even when the data itself is 
actually mutable).

I am quite sure that it is undefined behavior to cast away 
const and mutate, even if the spec doesn't say that, because it's 
come up time and time again in discussions on logical const. And 
in the general case, even without immutable, if it's 
well-defined, then the compiler can't assume that a const variable 
isn't going to be mutated, even when it knows that no mutable 
references could have mutated it, and it means that const really 
isn't physical const anymore, because you would be free to cast 
away const and mutate so long as the data wasn't immutable, thus 
making const pretty meaningless as far as compiler guarantees go 
(which is Walter's big beef with C++'s const). So, I don't see 
how we could allow casting away const and mutating to be 
well-defined aside from very specific cases like this one. But 
since the spec doesn't actually seem to say anything one way or 
the other (aside from with regards to immutable), I think that 
Walter is going to have to weigh in. Clearly, the best that I 
could do to convince you otherwise would be to dig through all of 
the old threads on const to find quotes from Walter.

- Jonathan M Davis

Jul 23 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.

 inout is done by the compiler. It knows that it's safe to cast the
 return type to mutable (or immutable), because it knows that the return
 value was either the argument that it passed in or something constructed
 within the function and thus safe to cast. The compiler knows what's
 going on, so it can ensure that it doesn't violate the type system and
 is well-defined.

The compiler knows everything that is going on inside a function. It can 
see the cast and knows that it should execute it, and also that the 
original variable is mutable and could be the one being mutated. This 
isn't any different.

 An example of what the compiler can "start doing more than it does
 now" would be helpful. I can't see how it can do anything based on this.

 Well, it could remove dead code. For instance, if you had

 const(Foo) bar(T t)
 {
      const myFoo = getFoo(t);
      auto value1 = pureFunc(myFoo);
      auto value2 = pureFunc2(myFoo);
      auto value3 = pureFunc3(value1, value2);
      return myFoo;
 }

 All of the lines with pureFunc* could be removed outright, because
 they're all pure function calls, and they can't possibly have mutated
 myFoo. I wouldn't expect a lot of dead code like that, and maybe
 something like that would never be implemented in the compiler, but it
 could be as long as the compiler can actually rely on const not being
 mutated.

And my interpretation of the spec doesn't change this. You can still 
elide those calls as none of them should be casting away const and 
mutating internally.

 But part of the problem with "start doing more than it does now" is that
 that could easily depend on ideas that folks come up with later. At some
 point in the future, someone might figure out how const interacts with
 some other set of attributes and be able to optimize based on that. So,
 if you're casting away const and mutating, relying on no one coming up
 with new optimizations, then you could be in trouble later when they do.
 And maybe they won't, but we don't know.

These kinds of "maybe someone someday can think of something" arguments 
are quite unconvincing.

 It sounds like what you really want is a tail-inout range or somesuch,
 though since we can't even sort out tail-const ranges properly at this
 point, I expect that tail-inout ranges are a bit of a pipe dream.

tail-const and tail-inout are the same problem, and will likely be 
solved at the same time. But yes, tail-inout would solve this problem 
nicely.

 In any case, regardless of whether what you're proposing is defined
 behavior or not, it'll work, because there's no way that the compiler
 could do any optimizations based on const after the cast is done,
 because it's only const within the function. It's when you cast away
 const on something that you were given as const that you have a real
 problem. e.g.

 const(Foo) bar(const(Foo) foo)
 {
      auto f = cast(Foo)foo;
      f.mutateMe();
      return foo;
 }

Right, I agree this can result in undefined behavior, and the compiler 
is free to assume foo isn't modified through this function (if it's pure).

-Steve

Jul 24 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, 24 July 2015 at 14:07:46 UTC, Steven Schveighoffer 
wrote:
 On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer 
 wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.

 inout is done by the compiler. It knows that it's safe to cast 
 the
 return type to mutable (or immutable), because it knows that 
 the return
 value was either the argument that it passed in or something 
 constructed
 within the function and thus safe to cast. The compiler knows 
 what's
 going on, so it can ensure that it doesn't violate the type 
 system and
 is well-defined.

 The compiler knows everything that is going on inside a 
 function. It can see the cast and knows that it should execute 
 it, and also that the original variable is mutable and could be 
 the one being mutated. This isn't any different.

You're assuming that no separate compilation is going on, which 
in the case of a template like you get with RedBlackTree, is 
true, because the source has to be there, but in the general 
case, separate compilation could make it so that the compiler 
can't see what's going on inside the function. It relies on the 
separate compilation step having verified the inout attribute 
appropriately when the function's body was compiled, and all it 
has to go on is the inout in the signature. If you were to try 
and implement inout yourself in non-templated code, the compiler 
wouldn't necessarily be able to see any of what's going on inside 
the function when it compiles the calling code. Without inout, in 
non-templated code, even if the compiler were being very smart 
about this, it wouldn't have a clue that when you cast away const 
on the return value that it was the same one that was passed in.

 An example of what the compiler can "start doing more than it 
 does
 now" would be helpful. I can't see how it can do anything 
 based on this.

 Well, it could remove dead code. For instance, if you had

 const(Foo) bar(T t)
 {
      const myFoo = getFoo(t);
      auto value1 = pureFunc(myFoo);
      auto value2 = pureFunc2(myFoo);
      auto value3 = pureFunc3(value1, value2);
      return myFoo;
 }

 All of the lines with pureFunc* could be removed outright, 
 because
 they're all pure function calls, and they can't possibly have 
 mutated
 myFoo. I wouldn't expect a lot of dead code like that, and 
 maybe
 something like that would never be implemented in the 
 compiler, but it
 could be as long as the compiler can actually rely on const 
 not being
 mutated.

 And my interpretation of the spec doesn't change this. You can 
 still elide those calls as none of them should be casting away 
 const and mutating internally.

You seem to be arguing that as long as you know that a const 
reference refers to mutable data, it is defined behavior to cast 
away const and mutate it. And if that were true, then if you knew 
that myFoo referred to mutable data, it would be valid to cast 
away const and mutate it inside of one of the pureFunc* 
functions, because you know that it's mutable and not immutable. 
And this shows why that isn't enough. And that is the major 
objection I have with what you're arguing here. In the general 
case, even if immutable is not used in the program even once, 
casting away const and mutating is not and cannot be defined 
behavior, or const guarantees nothing - just like in C++.
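
To illustrate the conflict, a sketch with a hypothetical `Foo` and a `pureFunc` that does exactly this. If such code were defined behavior, eliding the call (as in the dead-code example) would change what the caller observes, so no particular result is claimed here:

```d
// Hypothetical type standing in for Foo.
struct Foo
{
    int hits;
}

// The signature promises: pure, and only a const view of the argument.
int pureFunc(const(Foo)* foo) pure
{
    auto f = cast(Foo*) foo; // cast away const on data "known" to be mutable
    ++f.hits;                // mutate through the cast reference
    return f.hits;
}

int caller()
{
    Foo myFoo;
    const(Foo)* view = &myFoo;
    auto value1 = pureFunc(view); // result unused: a candidate for elision
    // If the call runs, hits was incremented; if an optimizer removed the
    // call based on pure + const, it was not.
    return myFoo.hits;
}
```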

The exact use case that you're looking for - essentially inout - 
works only because when you cast it back, no const reference that 
was generated by calling the function with a mutable reference 
escaped that function except via the return value, so there's no 
way for the compiler to optimize based on the const reference, 
because there isn't one anymore. And that's _way_ more restricted 
than saying that it's defined behavior to cast away const and 
mutate as long as you know that the underlying data is actually 
mutable.

So, I don't object to the behavior you're arguing for being 
well-defined. It's that you're arguing that the fact that you 
know that it's mutable underneath is enough to make it valid to 
cast away const and mutate. And that cannot be the case, 
regardless of what the spec does or doesn't say, or we have C++'s 
const - except that it's transitive. And Walter has made it 
abundantly clear that his intention is that D's const be physical 
const and provide actual guarantees. For that to be the case, 
casting away and mutating const cannot be well-defined except in 
very specific cases where we can guarantee that we're not 
violating the guarantee that const objects are not mutated via a 
const reference.

Your case works, because the const reference is gone after the 
cast, and there are no others that were created from the point 
that it temporarily became const. So, it's a very special case. 
And maybe the rule can be worded in a way that incorporates that 
nicely, whereas simply saying that it's undefined behavior to 
cast away const and mutate would not allow it. But we cannot say 
that it's defined behavior to cast away const and mutate simply 
because you know that the data is mutable, or we do not have 
physical const, and const provides no real guarantees.

It sounds to me like we need to come up with a way to word the 
rule that allows for what you're trying to do while not allowing 
the mutating of const in general (even if the data is actually 
mutable) and get it approved by Walter and in the spec. Walter 
has made it clear on several occasions that const is supposed to 
be physical const with real guarantees, and if that is not what 
the spec says (or if it does say that but not clearly), then it 
needs to be updated. And I do agree that the case that you have 
here should be well-defined, but it could be tricky to come up 
with a way to word the rule that doesn't require a lot of 
ancillary explanation about what exactly it means. The spec needs 
to be clear, not wishy-washy.

In either case, I think that it's clear that we need Walter to 
say something on the matter.

 These kinds of "maybe someone someday can think of something" 
 arguments are quite unconvincing.

The point is that if you're relying on undefined behavior to do 
whatever you're doing, what the compiler is doing could change 
later. Depending on how the compiler works now or on what 
improvements will or won't be able to be made later means risking 
writing code that will not work later. So, using undefined 
behavior and relying on the compiler's current behavior is just 
asking for trouble. I don't think that I should need to come up 
with a convincing argument about "someone someday can think of 
something," because that's the whole point of undefined behavior. 
You can't rely on it, because it's not defined.

And in the case of const, the compiler is supposed to be able to 
rely on data not being mutated via a const reference (that _is_ 
in the spec), so in almost all cases, casting away const and 
mutating cannot possibly be defined, because it would make it 
impossible for the compiler to rely on data not being mutated via 
a const reference.
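
A minimal sketch of the kind of assumption being described here. The names Foo, pureFunc and readTwice are hypothetical stand-ins, not code from the thread, and no current compiler is claimed to perform this optimization:

```d
struct Foo { int value; }

// Hypothetical stand-in for the pureFunc* functions discussed above.
// It is pure and takes its argument by const pointer.
int pureFunc(const(Foo)* f) pure
{
    return f.value * 2;
}

int readTwice(const(Foo)* f)
{
    int before = f.value;
    int r = pureFunc(f); // pure + const parameter: must not mutate *f
    int after = f.value; // a compiler relying on const could reuse `before`
    return before + after + r;
}

void main()
{
    auto foo = Foo(21);
    assert(readTwice(&foo) == 21 + 21 + 42);
}
```

If pureFunc were allowed to cast away const and write to f.value, reusing `before` for the second read would no longer be a valid transformation, which is the guarantee at stake.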

In any case, clearly, I need to figure out how to improve the 
spec's explanation of const so that the situation is clear and 
get that approved by Walter. The simple fact that we're arguing 
over this shows that the spec isn't clear enough.

- Jonathan M Davis

Jul 24 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/24/15 3:35 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 14:07:46 UTC, Steven Schveighoffer wrote:
 On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.

 inout is done by the compiler. It knows that it's safe to cast the
 return type to mutable (or immutable), because it knows that the return
 value was either the argument that it passed in or something constructed
 within the function and thus safe to cast. The compiler knows what's
 going on, so it can ensure that it doesn't violate the type system and
 is well-defined.

 The compiler knows everything that is going on inside a function. It
 can see the cast and knows that it should execute it, and also that
 the original variable is mutable and could be the one being mutated.
 This isn't any different.

 You're assuming that no separate compilation is going on, which in the
 case of a template like you get with RedBlackTree, is true, because the
 source has to be there, but in the general case, separate compilation
 could make it so that the compiler can't see what's going on inside the
 function.

No, I'm not. Using an inout function is like inserting a wrapper around 
the real function that casts the result back to the right type. The 
inout rules inside the function make the casting sane without having to 
examine the code, but the code itself inside does not do any casting.

 It relies on the separate compilation step having verified the
 inout attribute appropriately when the function's body was compiled, and
 all it has to go on is the inout in the signature. If you were to try
 and implement inout yourself in non-templated code, the compiler
 wouldn't necessarily be able to see any of what's going on inside the
 function when it compiles the calling code. Without inout, in
 non-templated code, even if the compiler were being very smart about
 this, it wouldn't have a clue that when you cast away const on the
 return value that it was the same one that was passed in.

The compiler doesn't see "inout", it sees mutable, const, immutable -- 
three versions of the function. It calls the right one, and inside the 
function, the casting happens outside the implementation. The compiler 
doesn't have to know it's the same value, it doesn't even have to care 
whether the value is modified. It just has to accept that it can't 
optimize out the loading of the mutable variable again.

In other words, if the compiler compiles this:

int *foo(int *x);

void main()
{
    int x;
    auto y = foo(&x);
    *y = 5;
}

It doesn't have to know whether y is pointing at x. All it knows is 
that x may have changed inside foo, and that y may be pointing at it 
(so the write through y may have changed it as well).
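
The caller-side view of inout described above can be sketched as follows. The function pick is a hypothetical example, not something from Phobos or the PR; the point is only that one inout signature behaves, for the caller, like mutable, const and immutable overloads:

```d
// Hypothetical example function; not taken from the thread.
inout(int)* pick(inout(int)* p)
{
    return p; // inside the body, *p cannot be modified
}

void main()
{
    int m = 1;
    const int c = 2;
    immutable int i = 3;

    // The return type follows the qualifier of the argument.
    static assert(is(typeof(pick(&m)) == int*));
    static assert(is(typeof(pick(&c)) == const(int)*));
    static assert(is(typeof(pick(&i)) == immutable(int)*));

    // With a mutable argument, the result can be written through.
    *pick(&m) = 5;
    assert(m == 5);
}
```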

 All of the lines with pureFunc* could be removed outright, because
 they're all pure function calls, and they can't possibly have mutated
 myFoo. I wouldn't expect a lot of dead code like that, and maybe
 something like that would never be implemented in the compiler, but it
 could be as long as the compiler can actually rely on const not being
 mutated.

 And my interpretation of the spec doesn't change this. You can still
 elide those calls as none of them should be casting away const and
 mutating internally.

 You seem to be arguing that as long as you know that a const reference
 refers to mutable data, it is defined behavior to cast away const and
 mutate it.

No. If you *create* a const reference to mutable data, you can cast away 
that const back to mutable, because everything is there for the compiler 
to see.

 And if that were true, then if you knew that myFoo referred
 to mutable data, it would be valid to cast away const and mutate it
 inside of one of the pureFunc* functions, because you know that it's
 mutable and not immutable.

No, because those pure functions don't know whether the data is mutable, 
and the compiler is allowed to infer from their signatures that they 
don't mutate it.

Basically, it's the difference between these 2 calls:

pure void foo(int *x) { *x = 5;}
pure void bar(const(int) *x) { *(cast(int *)x) = 10;}

void main()
{
    int x;
    const int *y = &x;
    // should be OK, can't be elided, and the compiler can see what is
    // going on here
    foo(cast(int *)y);
    bar(y); // BAD, compiler is free to remove
}

 And this shows why that isn't enough. And
 that is the major objection I have with what you're arguing here. In the
 general case, even if immutable is not used in the program even once,
 casting away const and mutating is not and cannot be defined behavior,
 or const guarantees nothing - just like in C++.

 The exact use case that you're looking for - essentially inout - works
 only because when you cast it back, no const reference that was
 generated by calling the function with a mutable reference escaped that
 function except via the return value, so there's no way for the compiler
 to optimize based on the const reference, because there isn't one
 anymore. And that's _way_ more restricted than saying that it's defined
 behavior to cast away const and mutate as long as you know that the
 underlying data is actually mutable.

I'm not saying that general statement. I'm saying in restricted 
situations, casting away const is not undefined behavior.

 Your case works, because the const reference is gone after the cast, and
 there are no others that were created from the point that it temporarily
 became const. So, it's a very special case. And maybe the rule can be
 worded in a way that incorporates that nicely, whereas simply saying
 that it's undefined behavior to cast away const and mutate would not
 allow it. But we cannot say that it's defined behavior to cast away
 const and mutate simply because you know that the data is mutable, or we
 do not have physical const, and const provides no real guarantees.

I agree, we can't just make the general case that you can cast away 
const if you know the data is mutable, given some configuration of 
function calls. There has to be complete visibility to the compiler 
within the same function to allow the possibility that some mutable data 
changed.

We can start with "casting away const and mutating, even if you know the 
underlying data is mutable, is UB, except for these situations:..."

And relax from there.

-Steve

Jul 24 2015

Timon Gehr <timon.gehr gmx.ch> writes:

On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
 We can start with "casting away const and mutating, even if you know the
 underlying data is mutable, is UB, except for these situations:..."

 And relax from there.

But what is the point?

Jul 24 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/24/15 4:20 PM, Timon Gehr wrote:
 On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
 We can start with "casting away const and mutating, even if you know the
 underlying data is mutable, is UB, except for these situations:..."

 And relax from there.

 But what is the point?

The original PR is to add a const and immutable version of upperBound 
and lowerBound to RedBlackTree.

Both of these functions are effectively const. But we must return a 
range that matches the constancy of the tree itself.

So for example:

RedBlackTree!int m;
const RedBlackTree!int c = m;

auto x = m.upperBound(5); // should return range over mutable ints
auto y = c.upperBound(5); // should return range over const ints.

The chosen implementation was to cast away const inside the const 
upperBound function, and run the mutable one, knowing that the actual 
algorithm doesn't modify any data. But I objected, saying that it's 
better to run the code as const, and cast away const at the end in the 
mutable version, since the compiler will then be mechanically ensuring 
the const promise in the case of a const RedBlackTree.
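
A rough sketch of the second approach, using a hypothetical singly linked List rather than RedBlackTree (Node, List and find are made up for illustration). The search runs as const, and the mutable overload casts the result back. Whether that cast is defined behavior is exactly what this thread is arguing about, so treat this as an illustration of the pattern, not a recommendation:

```d
struct Node { int value; Node* next; }

struct List
{
    Node* head;

    // Const implementation: the compiler checks that the search
    // itself does not modify anything.
    const(Node)* find(int v) const
    {
        for (const(Node)* n = head; n !is null; n = n.next)
            if (n.value == v)
                return n;
        return null;
    }

    // Mutable overload: run the const search, then cast const away
    // from the result, which points into data we know is mutable.
    Node* find(int v)
    {
        const(List)* self = &this; // forces the const overload
        return cast(Node*) self.find(v);
    }
}

void main()
{
    auto b = Node(2, null);
    auto a = Node(1, &b);
    auto list = List(&a);

    Node* hit = list.find(2);
    assert(hit is &b);
    hit.value = 20;
    assert(b.value == 20);

    // Through a const List, only a const(Node)* comes back.
    const(List) clist = list;
    static assert(is(typeof(clist.find(2)) == const(Node)*));
}
```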

The resulting discussion was that this is undefined behavior. But 
upperBound itself isn't modifying any data, it's just restoring the 
constancy of the range. But the range itself could potentially be used 
to modify the data. It didn't seem to me like this should be undefined 
behavior, since the compiler would have to make a very long connection 
through the various calls in order to see that everything would be const.

inout would work perfectly here, except you can't create a custom struct 
with an inout member that implicitly casts back to mutable/const/immutable.

So I don't know the answer. It seems very bad to cast away const to run 
a complex algorithm without mechanical checking. But ironically, that 
may be the only defined way to do it (aside from copy-paste 
implementation, or using a templated implementation).

The advantage of simply clarifying the spec is that the current compiler 
behavior (which should work) doesn't need to change, we just change the 
spec.

Ideally, we should just fix the situation with tail-const and we could 
have the best answer.

I think I'll give up on this argument. There isn't much use in putting 
in a rule for the spec that covers over a missing feature that we will 
likely add later.

Also, I just thought of a better way to do this that doesn't require any 
casting.

Forget this thread ever happened :)

-Steve

Jul 24 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, 24 July 2015 at 20:44:44 UTC, Steven Schveighoffer 
wrote:
 The advantage of simply clarifying the spec is that the current 
 compiler behavior (which should work) doesn't need to change, 
 we just change the spec.

 Ideally, we should just fix the situation with tail-const and 
 we could have the best answer.

Yeah. That needs to be fixed. As I understand it, it's feasible 
without any language improvements, but it's horrific. Jonathan 
Crapuchettes talked at one point about doing it at EMSI (and how 
hard it was). The last time I tried it, I ran into problems with 
recursive template definitions, though static if can probably 
solve those.

Regardless, the situation with it is ugly and not well 
understood, even if there is a solution, and ideally, we'd find a 
way to implement it that was a lot easier and cleaner. Without 
that, almost no one is going to be doing it - probably even if 
there's an article on dlang.org explaining how - simply because 
of how annoying it is to do.

 I think I'll give up on this argument. There isn't much use in 
 putting in a rule for the spec that covers over a missing 
 feature that we will likely add later.

 Also, I just thought of a better way to do this that doesn't 
 require any casting.

 Forget this thread ever happened :)

Well, regardless of whether mimicking inout like we're talking 
about with RedBlackTree should be considered defined behavior or 
not, I think that the spec should be updated so that the 
situation is clearer. It needs to be clear to the community at 
large that you _cannot_ be casting away const and mutating simply 
because you know that the data is mutable underneath rather than 
immutable.

- Jonathan M Davis

Jul 24 2015

"anonymous" <anonymous example.com> writes:

On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
 Well, regardless of whether mimicking inout like we're talking 
 about with RedBlackTree should be considered defined behavior 
 or not, I think that the spec should be updated so that the 
 situation is clearer. It needs to be clear to the community at 
 large that you _cannot_ be casting away const and mutating 
 simply because you know that the data is mutable underneath 
 rather than immutable.

Pull request for that:
https://github.com/D-Programming-Language/dlang.org/pull/1047

Jul 26 2015

"Martin Nowak" <code dawg.eu> writes:

On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
 Yeah. That needs to be fixed. As I understand it, it's feasible 
 without any language improvements, but it's horrific. Jonathan 
 Crapuchettes talked at one point about doing it at EMSI (and 
 how hard it was). The last time I tried it, I ran into problems 
 with recursive template definitions, though static if can 
 probably solve those.

 Regardless, the situation with it is ugly and not well 
 understood, even if there is a solution, and ideally, we'd find 
 a way to implement it that was a lot easier and cleaner. 
 Without that, almost no one is going to be doing it - probably 
 even if there's an article on dlang.org explaining how - simply 
 because of how annoying it is to do.

Please open a Bugzilla issue to keep track of this and raise 
awareness. If we're going to need a language feature we need to 
start collecting arguments, and maybe someone can still come up 
with a clean solution.
It's an important issue b/c it affects every container range.

Aug 06 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, 24 July 2015 at 20:08:11 UTC, Steven Schveighoffer 
wrote:
 On 7/24/15 3:35 PM, Jonathan M Davis wrote:
 Your case works, because the const reference is gone after the 
 cast, and
 there are no others that were created from the point that it 
 temporarily
 became const. So, it's a very special case. And maybe the rule 
 can be
 worded in a way that incorporates that nicely, whereas simply 
 saying
 that it's undefined behavior to cast away const and mutate 
 would not
 allow it. But we cannot say that it's defined behavior to cast 
 away
 const and mutate simply because you know that the data is 
 mutable, or we
 do not have physical const, and const provides no real 
 guarantees.

 I agree, we can't just make the general case that you can cast 
 away const if you know the data is mutable, given some 
 configuration of function calls. There has to be complete 
 visibility to the compiler within the same function to allow 
 the possibility that some mutable data changed.

 We can start with "casting away const and mutating, even if you 
 know the underlying data is mutable, is UB, except for these 
 situations:..."

The only exception that makes any sense to me is when you're casting 
away const from the last const reference, so there are no const 
references left for the compiler to make any assumptions - so the 
case where you're trying to mimic inout. Something like

----
int x;
const int *y = &x;
*(cast(int *)y) = 5;
----

should be completely invalid IMHO. I don't see any reason to make 
it valid to cast away const and mutate just because the compiler 
can see that that's what you're doing, especially when it doesn't 
buy you anything, since you have access to the mutable reference 
anyway. Allowing it would just complicate things.

It might be possible to word the spec in a way to essentially 
allow you to do your own inout when inout doesn't cut it, since 
you're not really violating what const is supposed to guarantee, 
but for the rest, I say leave it undefined, because in that case 
you are violating it.

- Jonathan M Davis

Jul 24 2015
