digitalmars.D - casting away const and then mutating

"anonymous" <anonymous example.com> writes:
On a GitHub pull request, Steven Schveighoffer (schveiguy), 
Jonathan M Davis (jmdavis), and I (aG0aep6G) have been discussing 
if or when it's ok to cast away const and then mutate the data:

https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544

I've been under the impression that it's never allowed, i.e. it's 
always undefined behaviour. I think Jonathan is of the same 
opinion.

Steven disagrees and thinks that there are cases where it's ok. 
Namely, this simple case would be ok:

----
int x;
const int *y = &x;
*(cast(int *)y) = 5;
----

As I understand him, he's arguing that since the data is mutable, 
and since no function boundaries are crossed, compilers should 
not be allowed to do anything but the obvious with that code.

I think we've exchanged all arguments we have, yet no one has 
been convinced by the other side.

We've found the language spec to be a bit sparse on this. All I 
could find is essentially "you can't mutate through a const 
reference" [1], but no mention of if/when it's ok to cast a const 
reference to a mutable one (and then mutate).

So the questions are:
Is this specified somewhere?
If it isn't specified, how should it be specified?

[1] http://dlang.org/const3.html
Jul 23 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----
Yes, IMO, this should simply work and be consistent. The compiler could use willful ignorance to assume x is still 0, but I find that to be counterproductive. It would have to implement flow analysis to determine that y must point at x, and then simply ignore the assignment during that analysis.

I'll note that the reason I want to allow this is that we have a case where the same implementation must be used for the const and mutable versions of the function, but the return value is a template that varies on constancy. Therefore, you necessarily need 2 function definitions -- the compiler isn't smart enough (or maybe it's too smart?) to use inout(T) as a template parameter to the Range struct, and auto convert that back on return.

The nice thing about inout is that the compiler guarantees one implementation of the function, and the implementation will guarantee const is preserved. But on returning, the function puts the data back to the way it was. That's exactly what we want.

In this case, we can't use inout. So we have to cast (or alternatively, copy-paste the implementation) one result to the other. My opinion is, we should execute the implementation as if the object were const, and then cast back to mutable if we are using the mutable entry point. This allows the compiler to check the function implementation for const-correctness, vs. the other suggestion: casting the const object to a mutable one and then hoping the mutable function implementation is const-correct without compiler help.

The proposed usage of casting also does not mutate the data in the function that casts. It receives a mutable object, and it outputs a reference to the mutable object. The compiler would have to go to great lengths to see that the source of the mutable range it's receiving comes from a const range, and then ignore the type system in order to elide a reloading of something that is minuscule compared to the function call itself.
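A minimal sketch of the arrangement described above, under loud assumptions: List, Range, and startingAt are hypothetical names made up for illustration and are not the RedBlackTree code from the PR. The const overload holds the only real implementation, and the mutable overload runs it and casts the result back. Whether the spec sanctions that cast is exactly the open question in this thread; the sketch only shows the shape of the code.

```d
struct Node
{
    int val;
    Node* next;
}

// Hypothetical range type, parameterised on the constness of its nodes.
struct Range(N)
{
    N* cur;

    bool empty() const { return cur is null; }
    ref N front() { return *cur; }
    void popFront() { cur = cur.next; }
}

// Hypothetical container; startingAt is an illustrative name.
struct List
{
    Node* head;

    // The one real implementation. It is const, so the compiler
    // checks that it never writes to the list.
    Range!(const Node) startingAt(int val) const
    {
        const(Node)* n = head;
        while (n && n.val != val)
            n = n.next;
        return Range!(const Node)(n);
    }

    // Mutable entry point: run the const implementation on our own
    // (mutable) data, then cast the resulting pointer back to mutable.
    Range!Node startingAt(int val)
    {
        const self = &this;              // view ourselves as const
        auto r = self.startingAt(val);   // picks the const overload
        return Range!Node(cast(Node*) r.cur);
    }
}

void main()
{
    auto c = new Node(3, null);
    auto b = new Node(2, c);
    auto list = List(new Node(1, b));

    auto r = list.startingAt(2);
    static assert(is(typeof(r) == Range!Node));
    assert(!r.empty && r.front.val == 2);

    // The caller owned mutable data all along and writes through the result.
    r.front.val = 20;
    assert(b.val == 20);

    const(List) clist = list;
    static assert(is(typeof(clist.startingAt(3)) == Range!(const Node)));
}
```

The design choice here is the one argued for above: the search logic is written once against const data, so a stray write inside it fails to compile, and the only unchecked step is the single cast in the mutable wrapper.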
 As I understand him, he's arguing that since the data is mutable, and
 since no function boundaries are crossed, compilers should not be
 allowed to do anything but the obvious with that code.

 I think we've exchanged all arguments we have, yet no one has been
 convinced by the other side.

 We've found the language spec to be a bit sparse on this. All I could
 find is essentially "you can't mutate through a const reference" [1],
 but no mention of if/when it's ok to cast a const reference to a mutable
 one (and then mutate).
Note that it does specifically mention that immutable cannot be cast away and then modified (on that same page, see "Removing Immutable With A Cast"). It does not mention const; I assume that is on purpose.

-Steve
Jul 23 2015
Timon Gehr <timon.gehr gmx.ch> writes:
On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----
Yes, IMO, this should simply work and be consistent. The compiler could use willful ignorance to assume x is still 0, but I find that to be counterproductive. It would have to implement flow analysis to determine that y must point at x, and then simply ignore the assignment during that analysis. ...
No, it would be sufficient to have a simple form of constant propagation to screw up here.
Jul 24 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/24/15 3:02 PM, Timon Gehr wrote:
 On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----
Yes, IMO, this should simply work and be consistent. The compiler could use willful ignorance to assume x is still 0, but I find that to be counterproductive. It would have to implement flow analysis to determine that y must point at x, and then simply ignore the assignment during that analysis. ...
No, it would be sufficient to have a simple form of constant propagation to screw up here.
What do you mean? -Steve
Jul 24 2015
Timon Gehr <timon.gehr gmx.ch> writes:
On 07/24/2015 09:43 PM, Steven Schveighoffer wrote:
 On 7/24/15 3:02 PM, Timon Gehr wrote:
 On 07/23/2015 10:20 PM, Steven Schveighoffer wrote:
 On 7/23/15 2:43 PM, anonymous wrote:
 Steven disagrees and thinks that there are cases where it's ok. Namely,
 this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----
Yes, IMO, this should simply work and be consistent. The compiler could use willful ignorance to assume x is still 0, but I find that to be counterproductive. It would have to implement flow analysis to determine that y must point at x, and then simply ignore the assignment during that analysis. ...
No, it would be sufficient to have a simple form of constant propagation to screw up here.
What do you mean? -Steve
Assuming UB for modifying through a const reference, the compiler does not have to be clever at all to come up with the following semantics for that piece of code:

void main(){
    int x;
    const int* y=&x;
    *(cast(int*)y)=5;
    assert(x==0); // constant propagated
    assert(*y==5);
    assert(*&x==5);
}
Jul 24 2015
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 23 July 2015 at 18:43:03 UTC, anonymous wrote:
 On a GitHub pull request, Steven Schveighoffer (schveiguy), 
 Jonathan M Davis (jmdavis), and I (aG0aep6G) have been 
 discussing if or when it's ok to cast away const and then 
 mutate the data:

 https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544

 I've been under the impression that it's never allowed, i.e. 
 it's always undefined behaviour. I think Jonathan is of the 
 same opinion.
It's come up time and time again with discussions for logical const. If you cast away const, it's up to you to guarantee that the data being referenced is not mutated, and if it is mutated, it's undefined behavior. Now, if you know that the data being referenced is actually mutable and not immutable, and you know that the compiler isn't going to make any assumptions based on const which are then wrong if you mutate the variable after casting away const, then you can get away with it. But it's still undefined behavior, and if the compiler later starts doing more than it does now based on the knowledge that you can't mutate via a const reference, then your code might stop working correctly. So, if you're _really_ careful, you can get away with casting away const and mutating a variable, but you are depending on undefined behavior.
 Steven disagrees and thinks that there are cases where it's ok. 
 Namely, this simple case would be ok:

 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----

 As I understand him, he's arguing that since the data is 
 mutable, and since no function boundaries are crossed, 
 compilers should not be allowed to do anything but the obvious 
 with that code.
Even if this were defined behavior, what would be the point? You have access to x. You could just mutate it directly. I don't see how it would make any sense to be attempting to mutate something via a const reference when you have access to it via a mutable reference. It's when you don't have access to it via a mutable reference that it becomes an issue - which means that you've crossed a function boundary.

As far as I can tell, making the above defined behavior buys you nothing. The times when you gain something from being able to cast away const and mutate are the times when you've crossed function boundaries and you have to assume that the calling code can't see that the cast is happening and thus can't see that the data it passed in might have been mutated even though it was const. The times where being able to cast away const and mutate would be valuable are exactly the times when that would be violating the purpose of const - that the data isn't changed via a const reference.

The only way to make casting away const and mutating defined behavior in general is to make it so that the compiler can't make assumptions based on const, which does tend to defeat the purpose of const on some level. And part of the whole deal with D's const is that it's actually physical const and not logical const or C++'s const or any other type of const, and if that's the case, then casting away const and mutating is _not_ something that should be defined behavior. If it were, then we wouldn't be dealing with physical const anymore. Instead we'd be in the same boat as C++ where const didn't actually mean that the object wasn't mutated, since you could cast away const and mutate - just with the caveat that you have to be sure that the data wasn't actually immutable, since mutating immutable data _definitely_ breaks immutable, and it could segfault, depending on where the data is stored.

So, I don't see how we say that it makes sense for const to ever be cast away and then mutated. That violates the guarantees that const is supposed to provide and puts us back in the C++ boat, only worse, since you still have to worry about immutable and not mutating const when it's actually immutable.

- Jonathan M Davis
Jul 23 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/23/15 7:57 PM, Jonathan M Davis wrote:
 On Thursday, 23 July 2015 at 18:43:03 UTC, anonymous wrote:
 On a GitHub pull request, Steven Schveighoffer (schveiguy), Jonathan M
 Davis (jmdavis), and I (aG0aep6G) have been discussing if or when it's
 ok to cast away const and then mutate the data:

 https://github.com/D-Programming-Language/phobos/pull/3501#issuecomment-124169544


 I've been under the impression that it's never allowed, i.e. it's
 always undefined behaviour. I think Jonathan is of the same opinion.
It's come up time and time again with discussions for logical const.
This is not logical const. We are starting with mutable data, moving it through a function that we *don't* want to mutate the data, and then using it as mutable again in the function where you (and the compiler) know it's mutable. But you aren't even mutating, just getting it back to the original constancy (though mutation should be OK, you still have a mutable reference).
 If
 you cast away const, it's up to you to guarantee that the data being
 referenced is not mutated, and if it is mutated, it's undefined behavior.
Still need a reference to the spec that says that. Note that the spec specifically says it's undefined to cast away and modify immutable, and is careful not to include const/mutable in that discussion.
 Now, if you know that the data being referenced is actually mutable and
 not immutable, and you know that the compiler isn't going to make any
 assumptions based on const which are then wrong if you mutate the
 variable after casting away const, then you can get away with it.
 But
 it's still undefined behavior, and if the compiler later starts doing
 more than it does now based on the knowledge that you can't mutate via a
 const reference, then your code might stop working correctly. So, if
 you're _really_ careful, you can get away with casting away const and
 mutating a variable, but you are depending on undefined behavior.
An example of what the compiler can "start doing more than it does now" would be helpful. I can't see how it can do anything based on this.
 ----
 int x;
 const int *y = &x;
 *(cast(int *)y) = 5;
 ----
Even if this were defined behavior, what would be the point? You have access to x. You could just mutate it directly.
OK, but the point is you have run an algorithm that gets a *piece* of x (pretend x is not just a simple int), which you know to be mutable because x is mutable. But you don't want the algorithm to mutate x. Basically, if we say this is undefined behavior, then inout is undefined behavior.
 I don't see how it would
 make any sense to be attempting to mutate something via a const
 reference when you have access to it via a mutable reference. It's when
 you don't have access to it via a mutable reference that it becomes an
 issue - which means that you've crossed a function boundary.
Exactly. You still have mutable access to it, and you know the const access is to the same object, just transformed via an algorithm.
 The only way to make casting away const and mutating defined behavior in
 general
This is NOT what is being asked. Not the general case of making it defined to cast away const on any item (which could turn out to be immutable).

I think it's pointless to argue over this. The behavior can't be defined any other way than what I'm asking. The question is if we want the *official* position to nonsensically call it undefined behavior.

Specifically, I would say that if you cast away const on a reference that you have created within your own function on a mutable piece of data (in other words, you control the mutable data), then you can mutate via the cast reference. Otherwise, the inout feature is invalid, and we should remove it from the language, because that's EXACTLY what it does.

A simple example:

struct Node
{
    int val;
    Node *next;
}
const(Node) *find(const(Node)* n, int val)
{
    while(n && n.val != val) n = n.next;
    return n;
}
Node *find(Node *n, int val)
{
    const cn = n;
    return cast(Node *)find(cn, val);
}

Note that the mutable version of find doesn't mutate the node (checked by the compiler BTW), and its signature doesn't allow any const optimizations -- it gets in a mutable and returns a mutable. This can be rewritten like this:

inout(Node) *find(inout(Node)* n, int val)
{
    while(n && n.val != val) n = n.next;
    return n;
}

But in the case of the PR in question, we can't do this, because we can't inout our range and have it continue to be a range. So we are mimicking the behavior of inout, and the compiler should be fine with this.

-Steve
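For readers who want to try the two variants side by side, here is a self-contained sketch built from the example above. The name findInout and the main function are additions for illustration only (the post calls both variants find). The cast in the mutable overload is the construct under debate, so treat the asserts as a description of what current compilers are expected to do and not as a statement of what the spec guarantees.

```d
struct Node
{
    int val;
    Node* next;
}

// const implementation: the compiler rejects any write through n.
const(Node)* find(const(Node)* n, int val)
{
    while (n && n.val != val)
        n = n.next;
    return n;
}

// Mutable entry point: reuse the const implementation, cast the result back.
Node* find(Node* n, int val)
{
    const cn = n;                        // const view of our own mutable data
    return cast(Node*) find(cn, val);    // only the const overload accepts cn
}

// Same search written once with inout (renamed here to avoid a clash).
inout(Node)* findInout(inout(Node)* n, int val)
{
    while (n && n.val != val)
        n = n.next;
    return n;
}

void main()
{
    auto c = new Node(3, null);
    auto b = new Node(2, c);
    auto a = new Node(1, b);

    // Both variants hand a mutable pointer back to a mutable caller.
    static assert(is(typeof(find(a, 2)) == Node*));
    static assert(is(typeof(findInout(a, 2)) == Node*));

    Node* hit = find(a, 2);
    assert(hit is b);
    assert(findInout(a, 2) is b);

    hit.val = 20;                        // write through the cast result
    assert(b.val == 20);
    assert(find(a, 42) is null);

    // A const caller gets a const pointer from either variant.
    const(Node)* ca = a;
    static assert(is(typeof(find(ca, 3)) == const(Node)*));
    static assert(is(typeof(findInout(ca, 3)) == const(Node)*));
}
```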
Jul 23 2015
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer 
wrote:
 On 7/23/15 7:57 PM, Jonathan M Davis wrote:
 If
 you cast away const, it's up to you to guarantee that the data 
 being
 referenced is not mutated, and if it is mutated, it's 
 undefined behavior.
Still need a reference to the spec that says that. Note that the spec specifically says it's undefined to cast away and modify immutable, and is careful not to include const/mutable in that discussion.
It's come up a number of times in discussions on logical const, including from Walter. For it to be otherwise would mean that const is not actually physical const. I'm quite certain that the spec is wrong in this case.
 OK, but the point is you have run an algorithm that gets a 
 *piece* of x (pretend x is not just a simple int), which you 
 know to be mutable because x is mutable. But you don't want the 
 algorithm to mutate x.

 Basically, if we say this is undefined behavior, then inout is 
 undefined behavior.
inout is done by the compiler. It knows that it's safe to cast the return type to mutable (or immutable), because it knows that the return value was either the argument that it passed in or something constructed within the function and thus safe to cast. The compiler knows what's going on, so it can ensure that it doesn't violate the type system and is well-defined.
 An example of what the compiler can "start doing more than it 
 does now" would be helpful. I can't see how it can do anything 
 based on this.
Well, it could remove dead code. For instance, if you had

const(Foo) bar(T t)
{
    const myFoo = getFoo(t);
    auto value1 = pureFunc(myFoo);
    auto value2 = pureFunc2(myFoo);
    auto value3 = pureFunc3(value1, value2);
    return myFoo;
}

All of the lines with pureFunc* could be removed outright, because they're all pure function calls, and they can't possibly have mutated myFoo. I wouldn't expect a lot of dead code like that, and maybe something like that would never be implemented in the compiler, but it could be as long as the compiler can actually rely on const not being mutated.

But part of the problem with "start doing more than it does now" is that that could easily depend on ideas that folks come up with later. At some point in the future, someone might figure out how const interacts with some other set of attributes and be able to optimize based on that. So, if you're casting away const and mutating, relying on no one coming up with new optimizations, then you could be in trouble later when they do. And maybe they won't, but we don't know.
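To make the dead-code example concrete, here is a compilable sketch. Everything in it beyond the shape of bar is a hypothetical stand-in: Foo, getFoo, and the bodies of the pureFunc* functions are invented for illustration and pointers are used so that the const view aliases the caller's object. It does not show any optimisation actually happening; it only shows code of the kind that an optimiser could, by the argument above, drop if const data is guaranteed not to change.

```d
struct Foo
{
    int x;
}

// Hypothetical stand-ins for the functions named in the post.
const(Foo)* getFoo(Foo* t) { return t; }
int pureFunc(const(Foo)* f) pure { return f.x + 1; }
int pureFunc2(const(Foo)* f) pure { return f.x * 2; }
int pureFunc3(int a, int b) pure { return a + b; }

const(Foo)* bar(Foo* t)
{
    const myFoo = getFoo(t);
    // value1..value3 are never used. The calls are pure and only see
    // myFoo through const, so under the reading in the post they cannot
    // have changed it and could be removed without changing the result.
    auto value1 = pureFunc(myFoo);
    auto value2 = pureFunc2(myFoo);
    auto value3 = pureFunc3(value1, value2);
    return myFoo;
}

void main()
{
    auto f = new Foo(5);
    auto r = bar(f);
    assert(r is f);      // same object comes back
    assert(f.x == 5);    // and nothing touched it
}
```

If one of the pureFunc* bodies were allowed to cast away const and write to f.x, removing the calls would change what main observes, which is the conflict being argued about.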
 Specifically, I would say you can cast away const on a 
 reference that you have created within your own function on a 
 mutable piece of data (in other words, you control the mutable 
 data), then you can mutate via the cast reference. Otherwise, 
 the inout feature is invalid, and we should remove it from the 
 language, because that's EXACTLY what it does.

 A simple example:

 struct Node
 {
    int val;
    Node *next;
 }
 const(Node) *find(const(Node)* n, int val)
 {
    while(n && n.val != val) n = n.next;
    return n;
 }
 Node *find(Node *n, int val)
 {
     const cn = n;
     return cast(Node *)find(cn, val);
 }

 Note that the mutable version of find doesn't mutate the node 
 (checked by the compiler BTW), and its signature doesn't allow 
 any const optimizations -- it gets in a mutable and returns a 
 mutable. This can be rewritten like this:

 inout(Node) *find(inout(Node)* n, int val)
 {
     while(n && n.val != val) n = n.next;
     return n;
 }

 But in the case of the PR in question, we can't do this, 
 because we can't inout our range and have it continue to be a 
 range. So we are mimicking the behavior of inout, and the 
 compiler should be fine with this.
It sounds like what you really want is a tail-inout range or somesuch, though since we can't even sort out tail-const ranges properly at this point, I expect that tail-inout ranges are a bit of a pipe dream.

In any case, regardless of whether what you're proposing is defined behavior or not, it'll work, because there's no way that the compiler could do any optimizations based on const after the cast is done, because it's only const within the function. It's when you cast away const on something that you were given as const that you have a real problem. e.g.

const(Foo) bar(const(Foo) foo)
{
    auto f = cast(Foo)foo;
    f.mutateMe();
    return foo;
}

I agree that what you're asking for makes sense. If you pass in a mutable object to a function, and you know that the const object you get out is either the same object or a new one that is not immutable, and no references to that object escaped the function, then casting the return type to mutable should work, and I'm not against that being well-defined, but as I understand it, it technically isn't, because it involves casting away const and then mutating the result. And if it is well-defined, then we'd need a clear way to describe the circumstances to separate it from casting away const in general (even when the data itself is actually mutable).

I am quite sure that it is undefined behavior to cast away const and mutate, even if the spec doesn't say that, because it's come up time and time again in discussions on logical const. And in the general case, even without immutable, if it's well-defined, then the compiler can't assume that a const variable isn't going to be mutated, even when it knows that no mutable references could have mutated it, and it means that const really isn't physical const anymore, because you would be free to cast away const and mutate so long as the data wasn't immutable, thus making const pretty meaningless as far as compiler guarantees go (which is Walter's big beef with C++'s const).

So, I don't see how we could allow casting away const and mutating to be well-defined aside from very specific cases like this one. But since the spec doesn't actually seem to say anything one way or the other (aside from with regards to immutable), I think that Walter is going to have to weigh in. Clearly, the best that I could do to convince you otherwise would be to dig through all of the old threads on const to find quotes from Walter.

- Jonathan M Davis
Jul 23 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.
inout is done by the compiler. It knows that it's safe to cast the return type to mutable (or immutable), because it knows that the return value was either the argument that it passed in or something constructed within the function and thus safe to cast. The compiler knows what's going on, so it can ensure that it doesn't violate the type system and is well-defined.
The compiler knows everything that is going on inside a function. It can see the cast and knows that it should execute it, and also that the original variable is mutable and could be the one being mutated. This isn't any different.
 An example of what the compiler can "start doing more than it does
 now" would be helpful. I can't see how it can do anything based on this.
Well, it could remove dead code. For instance, if you had

const(Foo) bar(T t)
{
    const myFoo = getFoo(t);
    auto value1 = pureFunc(myFoo);
    auto value2 = pureFunc2(myFoo);
    auto value3 = pureFunc3(value1, value2);
    return myFoo;
}

All of the lines with pureFunc* could be removed outright, because they're all pure function calls, and they can't possibly have mutated myFoo. I wouldn't expect a lot of dead code like that, and maybe something like that would never be implemented in the compiler, but it could be as long as the compiler can actually rely on const not being mutated.
And my interpretation of the spec doesn't change this. You can still elide those calls as none of them should be casting away const and mutating internally.
 But part of the problem with "start doing more than it does now" is that
 that could easily depend on ideas that folks come up with later. At some
 point in the future, someone might figure out how const interacts with
 some other set of attributes and be able to optimize based on that. So,
 if you're casting away const and mutating, relying on no one coming up
 with new optimizations, then you could be in trouble later when they do.
 And maybe they won't, but we don't know.
These kinds of "maybe someone someday can think of something" arguments are quite unconvincing.
 It sounds like what you really want is a tail-inout range or somesuch,
 though since we can't even sort out tail-const ranges properly at this
 point, I expect that tail-inout ranges are a bit of a pipe dream.
tail-const and tail-inout are the same problem, and will likely be solved at the same time. But yes, tail-inout would solve this problem nicely.
 In any case, regardless of whether what you're proposing defined
 behavior or not, it'll work, because there's no way that the compiler
 could do any optimizations based on const after the cast is done,
 because it's only const within the function. It's when you cast away
 const on something that you were given as const that you have a real
 problem. e.g.

 const(Foo) bar(const(Foo) foo)
 {
      auto f = cast(Foo)foo;
      f.mutateMe();
      return foo;
 }
Right, I agree this can result in undefined behavior, and the compiler is free to assume foo isn't modified through this function (if it's pure). -Steve
Jul 24 2015
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 24 July 2015 at 14:07:46 UTC, Steven Schveighoffer 
wrote:
 On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer 
 wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.
inout is done by the compiler. It knows that it's safe to cast the return type to mutable (or immutable), because it knows that the return value was either the argument that it passed in or something constructed within the function and thus safe to cast. The compiler knows what's going on, so it can ensure that it doesn't violate the type system and is well-defined.
The compiler knows everything that is going on inside a function. It can see the cast and knows that it should execute it, and also that the original variable is mutable and could be the one being mutated. This isn't any different.
You're assuming that no separate compilation is going on, which in the case of a template like you get with RedBlackTree, is true, because the source has to be there, but in the general case, separate compilation could make it so that the compiler can't see what's going on inside the function. It relies on the separate compilation step having verified the inout attribute appropriately when the function's body was compiled, and all it has to go on is the inout in the signature. If you were to try and implement inout yourself in non-templated code, the compiler wouldn't necessarily be able to see any of what's going on inside the function when it compiles the calling code. Without inout, in non-templated code, even if the compiler were being very smart about this, it wouldn't have a clue that when you cast away const on the return value that it was the same one that was passed in.
 An example of what the compiler can "start doing more than it 
 does
 now" would be helpful. I can't see how it can do anything 
 based on this.
Well, it could remove dead code. For instance, if you had

const(Foo) bar(T t)
{
    const myFoo = getFoo(t);
    auto value1 = pureFunc(myFoo);
    auto value2 = pureFunc2(myFoo);
    auto value3 = pureFunc3(value1, value2);
    return myFoo;
}

All of the lines with pureFunc* could be removed outright, because they're all pure function calls, and they can't possibly have mutated myFoo. I wouldn't expect a lot of dead code like that, and maybe something like that would never be implemented in the compiler, but it could be as long as the compiler can actually rely on const not being mutated.
And my interpretation of the spec doesn't change this. You can still elide those calls as none of them should be casting away const and mutating internally.
You seem to be arguing that as long as you know that a const reference refers to mutable data, it is defined behavior to cast away const and mutate it. And if that were true, then if you knew that myFoo referred to mutable data, it would be valid to cast away const and mutate it inside of one of the pureFunc* functions, because you know that it's mutable and not immutable. And this shows why that isn't enough. And that is the major objection I have with what you're arguing here. In the general case, even if immutable is not used in the program even once, casting away const and mutating is not and cannot be defined behavior, or const guarantees nothing - just like in C++.

The exact use case that you're looking for - essentially inout - works only because when you cast it back, no const reference that was generated by calling the function with a mutable reference escaped that function except via the return value, so there's no way for the compiler to optimize based on the const reference, because there isn't one anymore. And that's _way_ more restricted than saying that it's defined behavior to cast away const and mutate as long as you know that the underlying data is actually mutable.

So, I don't object to the behavior you're arguing for being well-defined. It's that you're arguing that the fact that you know that it's mutable underneath is enough to make it valid to cast away const and mutate. And that cannot be the case, regardless of what the spec does or doesn't say, or we have C++'s const - except that it's transitive. And Walter has made it abundantly clear that his intention is that D's const be physical const and provide actual guarantees. For that to be the case, casting away and mutating const cannot be well-defined except in very specific cases where we can guarantee that we're not violating the guarantee that const objects are not mutated via a const reference.

Your case works, because the const reference is gone after the cast, and there are no others that were created from the point that it temporarily became const. So, it's a very special case. And maybe the rule can be worded in a way that incorporates that nicely, whereas simply saying that it's undefined behavior to cast away const and mutate would not allow it. But we cannot say that it's defined behavior to cast away const and mutate simply because you know that the data is mutable, or we do not have physical const, and const provides no real guarantees.

It sounds to me like we need to come up with a way to word the rule that allows for what you're trying to do while not allowing the mutating of const in general (even if the data is actually mutable) and get it approved by Walter and in the spec. Walter has made it clear on several occasions that const is supposed to be physical const with real guarantees, and if that is not what the spec says (or if it does say that but not clearly), then it needs to be updated. And I do agree that the case that you have here should be well-defined, but it could be tricky to come up with a way to word the rule that doesn't require a lot of ancillary explanation about what exactly it means. The spec needs to be clear, not wishy-washy. In either case, I think that it's clear that we need Walter to say something on the matter.
 These kinds of "maybe someone someday can think of something" 
 arguments are quite unconvincing.
The point is that if you're relying on undefined behavior to do whatever you're doing, what the compiler is doing could change later. Depending on how the compiler works now or on what improvements will or won't be able to be made later means risking writing code that will not work later. So, using undefined behavior and relying on the compiler's current behavior is just asking for trouble.

I don't think that I should need to come up with a convincing argument about "someone someday can think of something," because that's the whole point of undefined behavior. You can't rely on it, because it's not defined. And in the case of const, the compiler is supposed to be able to rely on data not being mutated via a const reference (that _is_ in the spec), so in almost all cases, casting away const and mutating cannot possibly be defined, because it would make it impossible for the compiler to rely on data not being mutated via a const reference.

In any case, clearly, I need to figure out how to improve the spec's explanation of const so that the situation is clear and get that approved by Walter. The simple fact that we're arguing over this shows that the spec isn't clear enough.

- Jonathan M Davis
Jul 24 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/24/15 3:35 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 14:07:46 UTC, Steven Schveighoffer wrote:
 On 7/23/15 11:58 PM, Jonathan M Davis wrote:
 On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer wrote:
 Basically, if we say this is undefined behavior, then inout is
 undefined behavior.
inout is done by the compiler. It knows that it's safe to cast the return type to mutable (or immutable), because it knows that the return value was either the argument that it passed in or something constructed within the function and thus safe to cast. The compiler knows what's going on, so it can ensure that it doesn't violate the type system and is well-defined.
The compiler knows everything that is going on inside a function. It can see the cast and knows that it should execute it, and also that the original variable is mutable and could be the one being mutated. This isn't any different.
You're assuming that no separate compilation is going on, which in the case of a template like you get with RedBlackTree, is true, because the source has to be there, but in the general case, separate compilation could make it so that the compiler can't see what's going on inside the function.
No, I'm not. Using an inout function is like inserting a wrapper around the real function that casts the result back to the right type. The inout rules inside the function make the casting sane without having to examine the code, but the code itself inside does not do any casting.
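To make the comparison concrete, a minimal sketch of an inout function follows. The helper name larger is hypothetical. The point is only that the body is checked as if its arguments could not be modified, and the caller gets back whichever qualifier it passed in, with no cast written anywhere in user code.

```d
// Hypothetical helper: returns a pointer to the larger of two ints.
// inout stands in for mutable, const, or immutable at the call site.
inout(int)* larger(inout(int)* a, inout(int)* b)
{
    // Inside the body, *a and *b cannot be assigned to.
    return *a >= *b ? a : b;
}

void main()
{
    int x = 1, y = 2;
    const int cx = 3, cy = 4;

    // Mutable arguments give a mutable result.
    static assert(is(typeof(larger(&x, &y)) == int*));
    // Const arguments give a const result.
    static assert(is(typeof(larger(&cx, &cy)) == const(int)*));

    int* pm = larger(&x, &y);
    *pm = 10; // y was mutable all along, and no cast was needed
    assert(y == 10);

    const(int)* pc = larger(&cx, &cy);
    assert(*pc == 4);
}
```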
 It relies on the separate compilation step having verified the
 inout attribute appropriately when the function's body was compiled, and
 all it has to go on is the inout in the signature. If you were to try
 and implement inout yourself in non-templated code, the compiler
 wouldn't necessarily be able to see any of what's going on inside the
 function when it compiles the calling code. Without inout, in
 non-templated code, even if the compiler were being very smart about
 this, it wouldn't have a clue that when you cast away const on the
 return value that it was the same one that was passed in.
The compiler doesn't see "inout", it sees mutable, const, immutable -- three versions of the function. It calls the right one, and inside the function, the casting happens outside the implementation. The compiler doesn't have to know it's the same value, it doesn't even have to care whether the value is modified. It just has to accept that it can't optimize out the loading of the mutable variable again. In other words, if the compiler compiles this:

----
int *foo(int *x);

void main()
{
    int x;
    auto y = foo(&x);
    *y = 5;
}
----

It doesn't have to know that y is or is not pointing at x. What it just knows is that x may have changed inside foo, and that it's possible y is pointing at it (and therefore changed it as well).
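As a small illustration of why the reload is needed, here is one possible body for foo. The body and the wrapper name aliasDemo are assumptions added for this sketch, since the post only declares foo. With this body, y does alias x, so both the call and the later write through y change x.

```d
// One possible definition of the otherwise opaque foo: it writes to
// its argument and hands the same pointer back.
int* foo(int* x)
{
    *x = 1;
    return x;
}

// Hypothetical wrapper so the effect can be checked.
int aliasDemo()
{
    int x;
    auto y = foo(&x);
    assert(x == 1); // foo changed x
    *y = 5;         // y aliases x, so this changes x too
    return x;
}

void main()
{
    assert(aliasDemo() == 5);
}
```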
 All of the lines with pureFunc* could be removed outright, because
 they're all pure function calls, and they can't possibly have mutated
 myFoo. I wouldn't expect a lot of dead code like that, and maybe
 something like that would never be implemented in the compiler, but it
 could be as long as the compiler can actually rely on const not being
 mutated.
And my interpretation of the spec doesn't change this. You can still elide those calls as none of them should be casting away const and mutating internally.
You seem to be arguing that as long as you know that a const reference refers to mutable data, it is defined behavior to cast away const and mutate it.
No. If you *create* a const reference to mutable data, you can cast away that const back to mutable, because everything is there for the compiler to see.
 And if that were true, then if you knew that myFoo referred
 to mutable data, it would be valid to cast away const and mutate it
 inside of one of the pureFunc* functions, because you know that it's
 mutable and not immutable.
No, because those pure functions don't know whether the data is mutable, and the compiler is allowed to infer that they don't based on their signatures. Basically, it's the difference between these 2 calls:

----
pure void foo(int *x) { *x = 5;}
pure void bar(const(int) *x) { *(cast(int *)x) = 10;}

void main()
{
    int x;
    const int *y = &x;
    foo(cast(int *)y); // should be OK, can't be elided, and the compiler can see what is going on here
    bar(y); // BAD, compiler is free to remove
}
----
 And this shows why that isn't enough. And
 that is the major objection I have with what you're arguing here. In the
 general case, even if immutable is not used in the program even once,
 casting away const and mutating is not and cannot be defined behavior,
 or const guarantees nothing - just like in C++.

 The exact use case that you're looking for - essentially inout - works
 only because when you cast it back, no const reference that was
 generated by calling the function with a mutable reference escaped that
 function except via the return value, so there's no way for the compiler
 to optimize based on the const reference, because there isn't one
 anymore. And that's _way_ more restricted than saying that it's defined
 behavior to cast away const and mutate as long as you know that the
 underlying data is actually mutable.
I'm not saying that general statement. I'm saying in restricted situations, casting away const is not undefined behavior.
 Your case works, because the const reference is gone after the cast, and
 there are no others that were created from the point that it temporarily
 became const. So, it's a very special case. And maybe the rule can be
 worded in a way that incorporates that nicely, whereas simply saying
 that it's undefined behavior to cast away const and mutate would not
 allow it. But we cannot say that it's defined behavior to cast away
 const and mutate simply because you know that the data is mutable, or we
 do not have physical const, and const provides no real guarantees.
I agree, we can't just make the general case that you can cast away const if you know the data is mutable, given some configuration of function calls. There has to be complete visibility to the compiler within the same function to allow the possibility that some mutable data changed. We can start with "casting away const and mutating, even if you know the underlying data is mutable, is UB, except for these situations:..." And relax from there. -Steve
Jul 24 2015
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
 We can start with "casting away const and mutating, even if you know the
 underlying data is mutable, is UB, except for these situations:..."

 And relax from there.
But what is the point?
Jul 24 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/24/15 4:20 PM, Timon Gehr wrote:
 On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
 We can start with "casting away const and mutating, even if you know the
 underlying data is mutable, is UB, except for these situations:..."

 And relax from there.
But what is the point?
The original PR is to add a const and immutable version of upperBound and lowerBound to RedBlackTree. Both of these functions are effectively const. But we must return a range that matches the constancy of the tree itself. So for example:

----
RedBlackTree!int m;
const RedBlackTree!int c = m;
auto x = m.upperBound(5); // should return range over mutable ints
auto y = c.upperBound(5); // should return range over const ints.
----

The chosen implementation was to cast away const inside the const upperBound function, and run the mutable one, knowing that the actual algorithm doesn't modify any data. But I objected saying that it's better to run the code as const, and cast away const at the end in the mutable version, since the compiler will then be mechanically ensuring the const promise in the case of a const RedBlackTree.

The resulting discussion was that this is undefined behavior. But upperBound itself isn't modifying any data, it's just restoring the constancy of the range. But the range itself could potentially be used to modify the data. It didn't seem to me like this should be undefined behavior, since the compiler would have to make a very long connection through the various calls in order to see that everything would be const.

inout would work perfectly here, except you can't create a custom struct with an inout member that implicitly casts back to mutable/const/immutable.

So I don't know the answer. It seems very bad to cast away const to run a complex algorithm without mechanical checking. But ironically, that may be the only defined way to do it (aside from copy-paste implementation, or using a templated implementation).

The advantage of simply clarifying the spec is that the current compiler behavior (which should work) doesn't need to change, we just change the spec.

Ideally, we should just fix the situation with tail-const and we could have the best answer.

I think I'll give up on this argument. There isn't much use in putting in a rule for the spec that covers over a missing feature that we will likely add later.

Also, I just thought of a better way to do this that doesn't require any casting.

Forget this thread ever happened :)

-Steve
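A rough sketch of the pattern described in this post, written once against const data with the mutable entry point casting the result back, might look like the following. Box is a hypothetical stand-in and not the Phobos RedBlackTree; a plain slice and a linear scan replace the tree and its range type. Whether later writing through the returned slice is defined behaviour is exactly the question the thread leaves open, so the sketch only reads from it.

```d
// Hypothetical stand-in for a container with a const and a mutable
// upperBound. Assumes data is kept sorted in ascending order.
struct Box
{
    int[] data;

    // The search logic is written once against const data, so the
    // compiler checks that it never mutates anything.
    const(int)[] upperBound(int v) const
    {
        size_t i = 0;
        while (i < data.length && data[i] <= v)
            ++i;
        return data[i .. $];
    }

    // The mutable entry point reuses the const implementation and
    // casts the result back to mutable.
    int[] upperBound(int v)
    {
        const(Box)* self = &this; // forces the const overload
        return cast(int[]) self.upperBound(v);
    }
}

void main()
{
    auto m = Box([1, 3, 5, 7]);
    const Box c = m;

    int[] x = m.upperBound(3);        // slice of mutable ints
    const(int)[] y = c.upperBound(5); // slice of const ints

    assert(x == [5, 7]);
    assert(y == [7]);
}
```

The alternative discussed in the PR goes the other way: the const overload casts away const and calls the mutable implementation, which gives up the compiler's check on the search logic.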
Jul 24 2015
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 24 July 2015 at 20:44:44 UTC, Steven Schveighoffer 
wrote:
 The advantage of simply clarifying the spec is that the current 
 compiler behavior (which should work) doesn't need to change, 
 we just change the spec.

 Ideally, we should just fix the situation with tail-const and 
 we could have the best answer.
Yeah. That needs to be fixed. As I understand it, it's feasible without any language improvements, but it's horrific. Jonathan Crapuchettes talked at one point about doing it at EMSI (and how hard it was). The last time I tried it, I ran into problems with recursive template definitions, though static if can probably solve those.

Regardless, the situation with it is ugly and not well understood, even if there is a solution, and ideally, we'd find a way to implement it that was a lot easier and cleaner. Without that, almost no one is going to be doing it - probably even if there's an article on dlang.org explaining how - simply because of how annoying it is to do.
 I think I'll give up on this argument. There isn't much use in 
 putting in a rule for the spec that covers over a missing 
 feature that we will likely add later.

 Also, I just thought of a better way to do this that doesn't 
 require any casting.

 Forget this thread ever happened :)
Well, regardless of whether mimicking inout like we're talking about with RedBlackTree should be considered defined behavior or not, I think that the spec should be updated so that the situation is clearer. It needs to be clear to the community at large that you _cannot_ be casting away const and mutating simply because you know that the data is mutable underneath rather than immutable. - Jonathan M Davis
Jul 24 2015
next sibling parent "anonymous" <anonymous example.com> writes:
On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
 Well, regardless of whether mimicking inout like we're talking 
 about with RedBlackTree should be considered defined behavior 
 or not, I think that the spec should be updated so that the 
 situation is clearer. It needs to be clear to the community at 
 large that you _cannot_ be casting away const and mutating 
 simply because you know that the data is mutable underneath 
 rather than immutable.
Pull request for that: https://github.com/D-Programming-Language/dlang.org/pull/1047
Jul 26 2015
prev sibling parent "Martin Nowak" <code dawg.eu> writes:
On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
 Yeah. That needs to be fixed. As I understand it, it's feasible 
 without any language improvements, but it's horrific. Jonathan 
 Crapuchettes talked at one point about doing it at EMSI (and 
 how hard it was). The last time I tried it, I ran into problems 
 with recursive template definitions, though static if can 
 probably solve those.

 Regardless, the situation with it is ugly and not well 
 understood, even if there is a solution, and ideally, we'd find 
 a way to implement it that was a lot easier and cleaner. 
 Without that, almost no one is going to be doing it - probably 
 even if there's an article on dlang.org explaining how - simply 
 because of how annoying it is to do.
Please open a Bugzilla issue to keep track of this and raise awareness. If we're going to need a language feature we need to start collecting arguments, and maybe someone can still come up with a clean solution. It's an important issue b/c it affects every container range.
Aug 06 2015
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 24 July 2015 at 20:08:11 UTC, Steven Schveighoffer 
wrote:
 On 7/24/15 3:35 PM, Jonathan M Davis wrote:
 Your case works, because the const reference is gone after the 
 cast, and
 there are no others that were created from the point that it 
 temporarily
 became const. So, it's a very special case. And maybe the rule 
 can be
 worded in a way that incorporates that nicely, whereas simply 
 saying
 that it's undefined behavior to cast away const and mutate 
 would not
 allow it. But we cannot say that it's defined behavior to cast 
 away
 const and mutate simply because you know that the data is 
 mutable, or we
 do not have physical const, and const provides no real 
 guarantees.
I agree, we can't just make the general case that you can cast away const if you know the data is mutable, given some configuration of function calls. There has to be complete visibility to the compiler within the same function to allow the possibility that some mutable data changed. We can start with "casting away const and mutating, even if you know the underlying data is mutable, is UB, except for these situations:..."
The only exception that makes any sense to me is when you're casting away const from the last const reference, so there are no const references left for the compiler to make any assumptions - so the case where you're trying to mimic inout. Something like

----
int x;
const int *y = &x;
*(cast(int *)y) = 5;
----

should be completely invalid IMHO. I don't see any reason to make it valid to cast away const and mutate just because the compiler can see that that's what you're doing, especially when it doesn't buy you anything, since you have access to the mutable reference anyway. Allowing it would just complicate things.

It might be possible to word the spec in a way to essentially allow you to do your own inout when inout doesn't cut it, since you're not really violating what const is supposed to guarantee, but for the rest, I say leave it undefined, because in that case you are violating it.

- Jonathan M Davis
Jul 24 2015