digitalmars.D - Possible D2 solution to the upcasting array problem, and a related

Stewart Gordon <smjg_1998 yahoo.com> writes:
I was just looking at this
http://d.puremagic.com/issues/show_bug.cgi?id=2544
which describes how it's possible to bypass const by doing this:

     const(int)[] answers = [42];
     int[][] unconsted = [[]];
     const(int)[][] unsafe = unconsted;
     unsafe[0] = answers;
     unconsted[0][0] = 43;

The problem is that converting from int[][] to const(int)[][] isn't 
safe, even though the language/compiler seems to think it is.

Really, it's another version of how you can use a DerivedClass[] as a 
BaseClass[] and thereby place in it an object that isn't of type 
DerivedClass.

There's actually a simple solution to this: specify that, where 
DerivedClass derives from BaseClass, DerivedClass[] cannot be implicitly 
converted to BaseClass[], but only to const(BaseClass)[].
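
A minimal sketch of that rule, assuming a present-day D2 compiler (which, as far as I know, already behaves this way; dmd at the time of this thread did not).  The class bodies and the helper sumIds are made up purely for illustration.

```d
import std.stdio;

class BaseClass { int id() const { return 0; } }
class DerivedClass : BaseClass { override int id() const { return 1; } }

// Assumption: current compilers refuse the mutable upcast of the array
// but accept the read-only one.
static assert(!is(DerivedClass[] : BaseClass[]));
static assert( is(DerivedClass[] : const(BaseClass)[]));

// Hypothetical helper: reads the objects through a read-only view.
int sumIds(const(BaseClass)[] view)
{
    int total = 0;
    foreach (obj; view)
        total += obj.id();
    return total;
}

void main()
{
    DerivedClass[] derived = [new DerivedClass, new DerivedClass];

    // Permitted: the elements are const, so nothing can be stored in them.
    const(BaseClass)[] view = derived;

    // Putting a plain BaseClass into the array through the view is rejected.
    static assert(!__traits(compiles, view[0] = new BaseClass));

    writeln(sumIds(derived));
}
```

The view still shares its elements with derived, so nothing is copied; only the ability to write through it is lost.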

Java has had something like this for a while, albeit not with arrays. 
That is, IIRC, you can assign a DataStructure<DerivedClass> to a 
variable of type DataStructure<? extends BaseClass> (or even 
DataStructure<? extends DerivedClass>) - this creates a read-only view 
of the data structure.  My proposal implements the same basic concept as 
this, but in a simpler way.  (Java also supports write-only 'views' with 
DataStructure<? super FurtherDerivedClass>, but I'm not sure we need 
anything like this in D at the moment.)

Now let's apply the same principle to the example in the bug report. 
Try defining that, in general, T[][] can be converted to const(T[])[] 
but not const(T)[][].  Then

     const(int)[] answers = [42];
     int[][] unconsted = [[]];
     const(int)[][] unsafe = unconsted;

would be illegal.  One would have to do

     const(int[])[] safe = unconsted;

and now

     safe[0] = answers;

is illegal.
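
The same steps as a compilable sketch, assuming a present-day D2 compiler.  The function name readThroughSafeView is made up for illustration, and unconsted is given one real element (rather than [[]]) so that the final write is in bounds.

```d
import std.stdio;

// The conversion from the bug report.  Assumption: a current compiler
// rejects it; dmd of early 2009 accepted it.
static assert(!is(int[][] : const(int)[][]));

// Returns what a const(int[])[] view sees after a write made through
// the original mutable array.
int readThroughSafeView()
{
    const(int)[] answers = [42];
    int[][] unconsted = [[0]];

    // The proposed alternative: each inner slice is const as a whole.
    const(int[])[] safe = unconsted;

    // The inner slices cannot be rebound through 'safe', so 'answers'
    // can never end up inside 'unconsted'.
    static assert(!__traits(compiles, safe[0] = answers));

    unconsted[0][0] = 43;   // only ever touches mutable data
    return safe[0][0];
}

void main()
{
    writeln(readThroughSafeView());
}
```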

In order to deal with the original, slightly more complex testcase, the 
principle needs to be applied in the same way to further levels of array 
nesting.  It would need applying to pointer types as well as array types 
- so, for example,

     int[]*[]

would be implicitly convertible to

     const(int[]*)[]

but not

     const(int[])*[]
     const(int)[]*[]

We could combine two applications of the principle, and get

     DerivedClass[][]

convertible to

     const(DerivedClass[])[]
     const(BaseClass[])[]

but not

     const(DerivedClass)[][]
     const(BaseClass)[][]
     BaseClass[][]
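
These lists can be written down as compile-time checks.  This is only a sketch: it assumes a present-day D2 compiler, whose implicit conversion rules I believe match the lists above, and the two empty classes are placeholders.

```d
class BaseClass {}
class DerivedClass : BaseClass {}

// A pointer inside an array: only the fully const element type is accepted.
static assert( is(int[]*[] : const(int[]*)[]));
static assert(!is(int[]*[] : const(int[])*[]));
static assert(!is(int[]*[] : const(int)[]*[]));

// Two applications of the principle at once.
static assert( is(DerivedClass[][] : const(DerivedClass[])[]));
static assert( is(DerivedClass[][] : const(BaseClass[])[]));
static assert(!is(DerivedClass[][] : const(DerivedClass)[][]));
static assert(!is(DerivedClass[][] : const(BaseClass)[][]));
static assert(!is(DerivedClass[][] : BaseClass[][]));

void main() {}
```

If any of these assumptions is wrong for a given compiler, the corresponding static assert fails at compile time, which makes the file a quick way to check a particular release.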


To summarise, the rules would be:

- Generalise the definition of an upcast to be any of the following:
-- conversion of a class type to a class type further up the hierarchy
-- conversion of a type to a const version of that type
-- a legal implicit conversion according to the following rule

- If U is an upcast of T, then a legal implicit conversion is from T[] 
to const(U)[], or T* to const(U)*.  In particular, conversion from T[] 
to U[] or T* to U* is illegal, except in the cases where T and const(T) 
are exactly the same.


If I've worked it out right, then these rules will make the const system 
safe, while at the same time making upcasting of object arrays safe.  
At least, before you consider invariant - a little more thought is 
needed to work out how this would be dealt with.

And it shouldn't break too much existing code.  Code that is already 
broken will just promote this breakage from runtime to compiletime, and 
those uses that were already 'correct' will be easily fixed by updating 
the declarations.  The unsafe conversions could be deprecated before 
being removed altogether, with messages such as

     conversion from int[][] to const(int)[][] is unsafe and deprecated, use const(int[])[] instead
     conversion from DerivedClass[] to BaseClass[] is unsafe and deprecated, use const(BaseClass)[] instead

These unsafe conversions could still be allowed by explicit casts, 
should they be needed for something, in which case you'd be expected to 
know what you're doing.

What does everyone think?  Even better, can anyone find any holes that 
my proposal misses?

Stewart.
Jan 02 2009
Stewart Gordon <smjg_1998 yahoo.com> writes:
I've looked at invariant a bit, and found a few things.

     void func(const(int)[] answers) {
         invariant(int)[][] invar = [[]];
         const(int)[][] unsafe = invar;
         unsafe[0] = answers;
     }

This is unsafe, since answers, which is merely a read-only view, has 
been sneaked into invar, thereby disguising it as something that will 
never change.  So invariant is just like plain mutable in this instance. 
  OTOH, it would be OK assigned to a const(int[])[], or even a 
const(invariant(int)[])[].  (My experiment shows that the transitivity 
of invariant overrides that of const, though this doesn't seem to be in 
the spec.)


OTOH, an invariant class array can be upcast no problem

     invariant(DerivedClass)[]
to
     invariant(BaseClass)[]
or to either of these replacing invariant with const.  But we still need 
const on levels that are further out:
     invariant(DerivedClass)[][]
to
     const(invariant(DerivedClass)[])[]
     const(invariant(BaseClass)[])[]
     const(BaseClass[])[]


So we can rewrite my proposed set of rules:

1. Generalise the definition of an upcast to be any of the following:
(a) conversion of a class type to a class type further up the hierarchy
(b) conversion of a mutable or invariant type to a const version of that 
type
(c) an implicit conversion permitted by rule 2

2. If U is an upcast of T, then a legal implicit conversion is any of:
(a) T[] to const(U)[]
(b) T* to const(U)*
(c) invariant(T)[] to invariant(U)[]
(d) invariant(T)* to invariant(U)*
Any other conversion from T[] to U[] or T* to U* is illegal.
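
A sketch of how rule 2 plays out, written as compile-time checks.  It assumes a present-day D2 compiler, where invariant is spelled immutable and where, as far as I know, these conversions are accepted or rejected as shown; the two empty classes are placeholders.

```d
class BaseClass {}
class DerivedClass : BaseClass {}

// Rule 2(c): an immutable class array can be viewed as an array of its
// base class, either immutable or const.
static assert( is(immutable(DerivedClass)[] : immutable(BaseClass)[]));
static assert( is(immutable(DerivedClass)[] : const(BaseClass)[]));

// The unsafe conversion used in func() above, and the two safe
// replacements for it.
static assert(!is(immutable(int)[][] : const(int)[][]));
static assert( is(immutable(int)[][] : const(int[])[]));
static assert( is(immutable(int)[][] : const(immutable(int)[])[]));

void main() {}
```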


Notice that:

- the case of const(T)[] to const(U)[] is covered, since const(const(U)) 
is the same as const(U)

- given invariant(int)[][], both possible conversions are covered
-- const(int[])[] by first applying 1(b) to invariant(int) and then by 
applying 2(a) with T = const(int)[]
-- const(invariant(int)[])[] by just applying 1(b) to invariant(int)[]


That leaves AAs to consider....

Stewart.
Jan 02 2009
Luther Tychonievich <lat7h virginia.edu> writes:
Stewart Gordon Wrote:

 That leaves AAs to consider....
I can't find any conceptual difference here between any type of
indirection: pointers, "ref" variables, static arrays, dynamic arrays,
associative arrays, even aggregate types if the typing engine bothered
with them:

     import std.stdio;

     void main(string[] args) {
         const(int)[string] answers = ["L,tU,&E":42];               // AA
         void refAA(ref const(int)[string] arg) { arg = answers; }  // ref of AA
         int[string] tmp = ["junk":0];
         int[string]* arrayPointer = &tmp;                          // pointer to AA
         refAA(*arrayPointer);                                      // TODO: ban this cast
         (*arrayPointer)["L,tU,&E"] = 43;                           // reassignment
         writeln(answers);                                          // check that it changed
     }

I'd generalize your proposed rule 2:

 2. If U is an upcast of T, then a legal implicit conversion is any of:
 (a) T[] to const(U)[]
 (b) T* to const(U)*
 (c) invariant(T)[] to invariant(U)[]
 (d) invariant(T)* to invariant(U)*
 Any other conversion from T[] to U[] or T* to U* is illegal.

with the more general:

2. If U is an upcast of T, then a legal implicit conversion is any of:
(a) indirect(T) to indirect(const(U))
(b) indirect(invariant(T)) to indirect(invariant(U))
where indirect is any type of array, pointer, or "ref" type.

Ref is a little strange because it only shows up on the "to" side,
being implicit on the "from" side.

Associative arrays are conceptually aggregates of two types (key and
value), and both should follow the safe upcasting rules; in practice,
however, the implementation (at least dmd 2.022) automatically
const-s all but the outermost level of indirection within the key
type, such that "int[int[]]" parses as "int[const(int)[]]", so the
rule will never rule out a key type.
(incidentally, I just noticed that the AA initializer expression
"[[1]:2]" won't parse in dmd...)

There may be some subtleties to the "all objects are implicitly
pointers" rule (implying that class types are a kind of indirection),
but at first glance they seem to be avoided because const-ness cannot
get inside that implicit pointer.
Jan 02 2009
Stewart Gordon <smjg_1998 yahoo.com> writes:
Luther Tychonievich wrote:
 Stewart Gordon Wrote:
 
 That leaves AAs to consider....
 I can't find any conceptual difference here between any type of
 indirection: pointers, "ref" variables, static arrays, dynamic arrays,
 associative arrays, even aggregate types if the typing engine bothered
 with them:

Static arrays aren't a type of indirection.  They have value semantics. 
Hence const(char[26]) and const(char)[26] are identical.  (There seems 
to be a bug at the moment whereby DMD treats them as different types.)

<snip>
 Associative arrays are conceptually aggregates of two types (key and
 value), and both should follow the safe upcasting rules; in practice,
 however, the implementation (at least dmd 2.022) automatically
 const-s all but the outermost level of indirection within the key
 type, such that "int[int[]]" parses as "int[const(int)[]]", so the
 rule will never rule out a key type.
So effectively, key types are implicitly tail-consted.
 (incidentally, I just noticed that the AA initializer expression
 "[[1]:2]" won't parse in dmd...)
 
 There may be some  subtleties to the "all objects are implicitly
 pointers" rule (implying that class types are a kind of indirection),
 but at first glance they seem to be avoided because const-ness cannot
 get inside that implicit pointer.
Indeed, classes used as AA keys seem to be mutable by default.  But this 
does seem to be an inconsistency.

There is std.typecons.Rebindable.  But ISTM it's mainly a syntax 
deficiency that this can't be expressed as a D builtin.

Stewart.
Jan 03 2009
Stewart Gordon <smjg_1998 yahoo.com> writes:
<snip>
 - given invariant(int)[][], both possible conversions are covered
 -- const(int[])[] by first applying 1(b) to invariant(int) and then by 
 applying 2(a) with T = const(int)[]
 -- const(invariant(int)[])[] by just applying 1(b) to invariant(int)[]
<snip>

Hang on ... this involves applying 2(a) as well.

Stewart.
Jan 03 2009
Jason House <jason.james.house gmail.com> writes:
Stewart Gordon wrote:

 Now let's apply the same principle to the example in the bug report.
 Try defining that, in general, T[][] can be converted to const(T[])[]
 but not const(T)[][].  Then
 
      const(int)[] answers = [42];
      int[][] unconsted = [[]];
      const(int)[][] unsafe = unconsted;
 
 would be illegal.  One would have to do
 
      const(int[])[] safe = unconsted;
 
 and now
 
      safe[0] = answers;
 
 is illegal.
What about this code?

     const int[] answers = [42];
     int[][] unconsted = [[]];
     const(int[])[] safe = unconsted;
     safe[0] = answers;
     safe[0][0] = 43;
Jan 04 2009
"Denis Koroskin" <2korden gmail.com> writes:
On Mon, 05 Jan 2009 04:16:43 +0300, Jason House <jason.james.house gmail.com>
wrote:

 What about this code?

      const int[] answers = [42];
      int[][] unconsted = [[]];
      const(int[])[] safe = unconsted;
      safe[0] = answers;
      safe[0][0] = 43;

Err... "safe[0][0] = 43;" shouldn't compile (it's const)
Jan 04 2009
Jason House <jason.james.house gmail.com> writes:
Denis Koroskin wrote:

 On Mon, 05 Jan 2009 04:16:43 +0300, Jason House
 <jason.james.house gmail.com> wrote:
 
 What about this code?

      const int[] answers = [42];
      int[][] unconsted = [[]];
      const(int[])[] safe = unconsted;
      safe[0] = answers;
      safe[0][0] = 43;

 Err... "safe[0][0] = 43;" shouldn't compile (it's const)

Oops... The following is what I meant.  unconsted should be used to
change the content of answers:

What about this code?

     const int[] answers = [42];
     int[][] unconsted = [[]];
     const(int[])[] safe = unconsted;
     safe[0] = answers;
     unconsted[0][0] = 43;
Jan 04 2009
"Denis Koroskin" <2korden gmail.com> writes:
On Mon, 05 Jan 2009 06:24:37 +0300, Jason House <jason.james.house gmail.com>
wrote:

 Denis Koroskin wrote:

 On Mon, 05 Jan 2009 04:16:43 +0300, Jason House
 <jason.james.house gmail.com> wrote:

 What about this code?

      const int[] answers = [42];
      int[][] unconsted = [[]];
      const(int[])[] safe = unconsted;
      safe[0] = answers;
      safe[0][0] = 43;

 Err... "safe[0][0] = 43;" shouldn't compile (it's const)

 Oops... The following is what I meant.  unconsted should be used to
 change the content of answers:

 What about this code?

      const int[] answers = [42];
      int[][] unconsted = [[]];
      const(int[])[] safe = unconsted;
      safe[0] = answers;

You can't mutate safe because it is const.

      unconsted[0][0] = 43;
Jan 04 2009
Stewart Gordon <smjg_1998 yahoo.com> writes:
Stewart Gordon wrote:
<snip>
 These unsafe conversions could still be allowed by explicit casts, 
 should they be needed for something, IWC you'd be expected to know what 
 you're doing.
<snip>

Just thinking about it, I wonder if this would be a good time to 
reconsider Janice's 1⅓-year-old proposal for preventing accidental 
casting away of const or invariant.  Without it, my proposal would leave 
open some cases where you really want an explicit cast, but not an 
unsafe one.  This applies both to casting DerivedClass[] to BaseClass[] 
and to such things as int[][] to const(int)[][].

In what follows, I will describe how it would work in conjunction with 
the implicit conversion rules I have already proposed.  Casting away 
const and casting away invariant would work in the same way, but I will 
describe it in terms of const.  (cast(!const) will probably still do for 
this, no need for cast(!invariant) as a distinct concept.)

There could be two forms of the proposed !const notation, which would 
become the only legal ways of casting away const.

     cast(!const)

would discard all levels of constancy from the type, but otherwise leave 
the type unchanged.

     cast(!const int[][])

would convert something to an int[][], obeying the usual type conversion 
rules except that any constancy in the original type is ignored. 
Similarly,

     cast(!const BaseClass[])

would be the legal way to convert a DerivedClass[] to a BaseClass[], if 
that's what you really want.

Perhaps !const could be combined with const and invariant type modifiers 
within the type:

     cast(!const const(int)[][])

would cast a const(int[])[] or an int[][] to a const(int)[][].

I'm not sure whether it should be legal to use the bracket notation with 
!const

     cast(!const(const(int)[])[])

but it might be of use, e.g.

     cast(!const(int[])[][])

to ensure that const is cast away only from a certain level down - the 
cast would be illegal if const or invariant is specified at a higher 
level.  So this cast would convert either const(int)[][][] or 
const(int[])[][] to int[][][], but be illegal on const(int[][])[].  But 
I'm not sure whether this little detail has enough practical use to be 
worth it....

Comments?

Stewart.
Jan 06 2009
"Denis Koroskin" <2korden gmail.com> writes:
On Wed, 07 Jan 2009 03:49:38 +0300, Stewart Gordon <smjg_1998 yahoo.com> wrote:

Nice, but I don't like the proposed syntax much, especially these:

     cast(!const BaseClass[])
     cast(!const const(int)[][])

I believe it would be better to split them into two separate casts as
follows:

     BaseClass[] base = cast(BaseClass[])cast(!const)derived;
Jan 06 2009
Stewart Gordon <smjg_1998 yahoo.com> writes:
Denis Koroskin wrote:
<snip>
 Nice, but I don't like the proposed syntax much, especially these:
 
 cast(!const BaseClass[])
 cast(!const const(int)[][])
 
 I believe it would be better to split them into two separate casts as 
 follows:
 
 BaseClass[] base = cast(BaseClass[])cast(!const)derived;
Won't work.  DerivedClass[] still needs to convert to 
const(BaseClass)[], not to BaseClass[].  Would have to be the more 
long-winded

     BaseClass[] base = cast(!const) cast(const(BaseClass)[]) derived;

and moreover, there'd be no way to do the casts to const(int)[][]. 
Unless you want cast(!const) to cast to a purely internal intermediate 
type that suppresses const-checking altogether.

Stewart.
Jan 06 2009