digitalmars.D - Array type conversion

Mark Burnett <unstained gmail.com> writes:

I have spent much of the last couple of weeks trying to choose a language in
which to write the code for my PhD thesis (in computational physics).  I had
very nearly decided on using C++, when yesterday I stumbled upon D.  So far I'm
ecstatic about its feature set.

Still there are one or two things that strike me as odd:  in particular that
arrays of a derived type can be converted to an array of a base type.  As
pointed out by Marshall Cline,
[http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.4] this
is dangerous.  Is this possibly a holdover from C++?  It is explicitly
mentioned in the array page that they behave this way, so I am not convinced
that is the case.

Fortunately not all of the problems associated with doing this in C++ exist in
D (see attached code).  What D seems to do is treat all derived[] as base[],
which is silly because if I want a base[], I would just declare it that way.
Asking for a derived[] is how I say that I *only* want derived objects in
there.

The attached code generates this output using gdc 0.23 on OSX:
Here are the different apples we have:
A P P L E -- Red
A P P L E -- Red
A P P L E -- Red
Orange -- Orange
A P P L E -- Red

Please keep in mind that this test is the first D I have written, and I don't
claim to understand the language.  Array type promotion just seems odd to
include, and I would like to understand the motivation for doing so.

Apr 28 2007

torhu <fake address.dude> writes:

Mark Burnett wrote:
Cline, 
[http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.4] 
this is dangerous.  Is this possibly a holdover from c++?  It is 
explicitly mentioned in the array page that they behave this way, so I 
am not convinced that is the case.

I guess you're already aware that objects in D are reference types, so 
the specific problem mentioned on that page does not apply?  If you want 
value types, you would use structs, which cannot be subclassed.

Apr 28 2007

Mark Burnett <unstained gmail.com> writes:

torhu Wrote:

 Mark Burnett wrote:
 Cline, 
 [http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.4] 
 this is dangerous.  Is this possibly a holdover from c++?  It is 
 explicitly mentioned in the array page that they behave this way, so I 
 am not convinced that is the case.
 
 I guess you're already aware that objects in D are reference types, so 
 the specific problem mentioned on that page does not apply?  If you want 
 value types, you would use structs, which cannot be subclassed.

Right, the specific problem of size differences in C++ does not exist (Orange
has extra data members to demonstrate this), but the "a container of derived is
not a container of base" problem still does.

Imagine adding a core function to Apple (and not to Orange, of course).  Then,
after an Orange is added to the array, you loop through and core all the
Apples... only one of them isn't an Apple, and you have undefined behavior.

The attached file again compiles with gdc, and crashes at runtime (though of
course it could do almost anything).

Apr 28 2007

Manfred Nowak <svv1999 hotmail.com> writes:

Mark Burnett wrote

 only one of them isn't an Apple and you have undefined behavior. 

From the specs:
| Multiple dynamic arrays can share all or parts of the array data.

With the assignment:
  justsomefruits = lotsofapples;
you did pointer assignments and thereby declared that the array data 
can be interpreted as both: Fruits and Apples. The handling of this 
declaration is up to your intelligence. If you fail, then you might 
have tricked yourself.

If you wanted a componentwise array copy you should have written:
  justsomefruits[] = lotsofapples; // observe the []
and the compiler would have served you with appropriate error messages.

-manfred
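
To make the sharing-versus-copying distinction above concrete, here is a minimal sketch. It assumes a current D2 compiler (the thread itself uses D1-era syntax) and plain int arrays; the names `original`, `aliased`, `copied`, and `dest` are illustrative only and do not come from the attached code.

```d
void main()
{
    int[] original = [1, 2, 3];

    int[] aliased = original;       // copies only the pointer and length
    int[] copied  = original.dup;   // allocates new storage, copies elements

    int[] dest = new int[3];
    dest[] = original[];            // element-wise copy into existing storage

    aliased[0] = 99;                // writes through to the shared data

    assert(aliased.ptr is original.ptr);
    assert(original[0] == 99);      // the change is visible via both slices
    assert(copied[0] == 1);         // the copies are unaffected
    assert(dest[0] == 1);
}
```

Only the plain assignment leaves two names referring to the same elements, which is the situation the rest of the thread worries about.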

Apr 28 2007

Mark Burnett <unstained gmail.com> writes:

Manfred Nowak Wrote:

 Mark Burnett wrote
 
 only one of them isn't an Apple and you have undefined behavior. 

 
 From the specs:
 | Multiple dynamic arrays can share all or parts of the array data.
 
 With the assignment:
   justsomefruits = lotsofapples;
 you did pointer assignments and thereby declared that the array data 
 can be interpreted as both: Fruits and Apples. The handling of this 
 declaration is up to your intelligence. If you fail, then you might 
 have tricked out yourself.
 
 If you wanted a componentwise array copy you should have written:
   justsomefruits[] = lotsofapples; // observe the []
 and the compiler would have served you with appropriate error messages.
 
 -manfred

Certainly making a copy prevents this problem, but what I'm really curious
about is the motivation for including these implicit conversions (they are
mentioned specifically on the Array description page on digitalmars.com).

They're not safe in general, so I imagine library designers will end up using
their own user-defined array/vector type just as in C++, so that such errors
are caught at compile time.  This seems to limit the usefulness of these
built-in array types.  Sure they're bounds-safe, but they don't seem 100% type-safe.

For example:

std::vector<derived> ad;
std::vector<base> ab;

ab = ad; // c++ compiler error

A compiler error is more what I would expect.  Treating a container of derived
as a container of base is an error.

Not to speculate too much on the reasons, but is it just too much overhead for
a built-in type to disallow this behavior?  Am I underestimating the usefulness
of these built-in arrays in library design?

Mark

PS Thanks for your replies so far ;)

Apr 28 2007

Walter Bright <newshound1 digitalmars.com> writes:

Mark Burnett wrote:
 Certainly making a copy prevents this problem, but what I'm really curious
about is the motivation for including these implicit conversions (they are
mentioned specifically on the Array description page on digitalmars.com).
 
 They're not safe in general,

Why not?

 so I imagine library designers will end up using their own user-defined
array/vector type just as in c++, so that such errors are caught at compile
time.  This seems to limit the usefulness of these built in array types.  Sure
they're bounds-safe, but they don't seem 100% type-safe.
 
 For example:
 
 std::vector<derived> ad;
 std::vector<base> ab;
 
 ab = ad; // c++ compiler error
 
 A compiler error is more what I would expect.  Treating a container of derived
as a container of base is an error.

But that isn't what is happening with D. base[] is an array of 
*references* to base, so the slicing problem one has in C++ is not 
possible in D.

 
 Not to speculate too much on the reasons, but is it just too much overhead for
a built-in type to disallow this behavior?  Am I underestimating the usefulness
of these built-in arrays in library design?
 
 Mark
 
 PS Thanks for your replies so far ;)

Apr 28 2007

James Dennett <jdennett acm.org> writes:

Walter Bright wrote:
 Mark Burnett wrote:
 Certainly making a copy prevents this problem, but what I'm really
 curious about is the motivation for including these implicit
 conversions (they are mentioned specifically on the Array description
 page on digitalmars.com).

 They're not safe in general,

 
 Why not?
 
 so I imagine library designers will end up using their own
 user-defined array/vector type just as in c++, so that such errors are
 caught at compile time.  This seems to limit the usefulness of these
 built in array types.  Sure they're bounds-safe, but they don't seem
 100% type-safe.

 For example:

 std::vector<derived> ad;
 std::vector<base> ab;

 ab = ad; // c++ compiler error

 A compiler error is more what I would expect.  Treating a container of
 derived as a container of base is an error.

 
 But that isn't what is happening with D. base[] is an array of
 *references* to base, so the slicing problem one has in C++ is not
 possible in D.

Slicing isn't the big issue.  The big issue is semantics;
an array of derived is not an array of base, by LSP.

An array of (pointers/references to) derived is usable
as an *immutable* array of base (for a suitable English
meaning of immutable, matching C++'s notion of the
array (equivalently, the pointers it contains) being
const).

Java has runtime checks required because it allows
conversion from array of Derived to array of Base,
and that (as you know) also uses reference semantics.
The conversion is widely viewed as a mistake in Java;
if I pass a Derived[] around, the language should
not silently allow one of its elements to refer to
a Base object.

-- James
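
One way to express the read-only view described above is sketched below. It assumes a current D2 compiler, where `const(Fruit)[]` is available; the `const` type constructor postdates the compiler used in this thread, so this is an assumption about later language versions rather than something the posters had. `Fruit`, `Apple`, and `countNamed` are hypothetical names for illustration.

```d
class Fruit
{
    string name() const { return "fruit"; }
}

class Apple : Fruit
{
    override string name() const { return "apple"; }
}

// Hypothetical helper: reads the basket through a const view.
// Elements can be inspected, but none can be replaced here.
size_t countNamed(const(Fruit)[] basket, string wanted)
{
    size_t n = 0;
    foreach (f; basket)
        if (f.name() == wanted)
            ++n;
    return n;
}

void main()
{
    Apple[] apples = [new Apple, new Apple];

    // Apple[] is accepted where const(Fruit)[] is expected.
    assert(countNamed(apples, "apple") == 2);
    assert(countNamed(apples, "fruit") == 0);

    // A statement such as `basket[0] = new Fruit;` inside countNamed
    // would be rejected, because the elements are const.
}
```

Because the callee cannot store a different kind of Fruit into the array, the caller's Apple[] keeps holding only Apples.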

Apr 28 2007

Walter Bright <newshound1 digitalmars.com> writes:

James Dennett wrote:
 An array of (pointers/references to) derived is usable
 as an *immutable* array of base (for suitable English
 meaning of immutable, matching C++'s notion of the
 array (equivalently, the pointers it contains) being
 const).
 
 Java has runtime checks required because it allows
 conversion from array of Derived to array of Base,
 and that (as you know) also uses reference semantics.
 The conversion is widely viewed as a mistake in Java;
 if I pass a Derived[] around, the language should
 not silently allow one of its elements to refer to
 a Base object.

But a derived reference can always be implicitly converted to a base 
reference anyway. That's the point of polymorphism.

Apr 28 2007

James Dennett <jdennett acm.org> writes:

Walter Bright wrote:
 James Dennett wrote:
 An array of (pointers/references to) derived is usable
 as an *immutable* array of base (for suitable English
 meaning of immutable, matching C++'s notion of the
 array (equivalently, the pointers it contains) being
 const).

 Java has runtime checks required because it allows
 conversion from array of Derived to array of Base,
 and that (as you know) also uses reference semantics.
 The conversion is widely viewed as a mistake in Java;
 if I pass a Derived[] around, the language should
 not silently allow one of its elements to refer to
 a Base object.

 
 But a derived reference can always be implicitly converted to a base
 reference anyway. That's the point of polymorphism.

That's not the issue here either.  One more level of
indirection is present when dealing with arrays of
references or references to references.

The point is that a reference to a derived reference
must *not* be converted to a reference to a base
reference, just as an array of derived references
must not be converted to an array of base references
in case any is changed to a reference to an object
that is not a derived.

-- James

Apr 28 2007

jovo <jovo at.home> writes:

James Dennett Wrote:
 
 The point is that a reference to a derived reference
 must *not* be converted to a reference to a base
 reference, just as an array of derived references
 must not be converted to an array of base references
 in case any is changed to a reference to an object
 that is not a derived.
 

Interestingly, given:

class B{}
class D: B{}    

void f(inout B x){
    x = new B();
}

void main(){
    D[] a1 = new D[3];
    B[] a2 = a1;

    f(a1[1]);   // Error: cast(B)(a1[1u]) is not an lvalue
    f(a2[1]);   // naturally works
}

jovo
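
The same asymmetry can be sketched for a current D2 compiler, where `ref` is the spelling of what the example above writes as `inout`. This is an illustration under that assumption; `Fruit`, `Apple`, `Orange`, and `replaceWithOrange` are hypothetical names.

```d
class Fruit {}
class Apple : Fruit {}
class Orange : Fruit {}

// Takes a reference to a Fruit reference and rebinds it.
void replaceWithOrange(ref Fruit slot)
{
    slot = new Orange;
}

void main()
{
    Fruit f = new Apple;
    replaceWithOrange(f);              // fine: f is declared as Fruit
    assert(cast(Orange) f !is null);   // f now refers to an Orange

    Apple a = new Apple;
    // A reference to an Apple reference is not accepted as a reference
    // to a Fruit reference, so this call does not compile.
    static assert(!__traits(compiles, replaceWithOrange(a)));
}
```

The compiler refuses the single-variable case for exactly the reason James gives; the array conversion discussed in this thread is the same situation with many slots instead of one.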

Apr 29 2007

Mark Burnett <unstained gmail.com> writes:

Walter Bright Wrote:

 James Dennett wrote:
 An array of (pointers/references to) derived is usable
 as an *immutable* array of base (for suitable English
 meaning of immutable, matching C++'s notion of the
 array (equivalently, the pointers it contains) being
 const).
 
 Java has runtime checks required because it allows
 conversion from array of Derived to array of Base,
 and that (as you know) also uses reference semantics.
 The conversion is widely viewed as a mistake in Java;
 if I pass a Derived[] around, the language should
 not silently allow one of its elements to refer to
 a Base object.

 
 But a derived reference can always be implicitly converted to a base 
 reference anyway. That's the point of polymorphism.

Java can allow treating D[] as a mutable B[] only because of its runtime
checks.  That way it can just throw an exception when you do things like call
DerivedA.foo on DerivedB.  D doesn't have this, and so its arrays are just as
type unsafe as C++'s.
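
For comparison, a store-time check of the kind Java performs can be approximated by hand in D. The sketch below assumes a current D2 compiler and relies on the fact that casting a class reference to a more derived class yields null when the object is not of that class. `tryStore` is a hypothetical helper, not part of any library.

```d
class Fruit {}
class Apple : Fruit {}
class Orange : Fruit {}

// Hypothetical helper: stores value into arr[i] only if it really is a T.
// Returns false, leaving the array untouched, when the dynamic cast fails.
bool tryStore(T)(T[] arr, size_t i, Object value)
{
    auto typed = cast(T) value;   // dynamic cast: null if value is not a T
    if (typed is null)
        return false;
    arr[i] = typed;
    return true;
}

void main()
{
    Apple[] apples = new Apple[2];

    assert(tryStore(apples, 0, new Apple));    // accepted
    assert(!tryStore(apples, 1, new Orange));  // rejected at run time
    assert(apples[0] !is null);
    assert(apples[1] is null);                 // slot was never written
}
```

This moves the failure to the point of the bad store instead of a later method call, at the cost of a check on every write.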

You really should have a look at Chapter 24 of Marshall Cline's excellent FAQ
again.  He describes the issue perhaps better than I can.

I am actually a little surprised that there is a difference in the way
conversions from D[] -> B[] and D[] -> I[] work.

It seems that the easiest way to fix this is to remove the implicit D[] -> B[].
 Though as James suggests, it would be safe (and useful) to pass D[] as an
immutable B[] *or* immutable I[].

Mark

Apr 29 2007

Mark Burnett <unstained gmail.com> writes:

Walter Bright Wrote:

 
 They're not safe in general,

 
 Why not?

You can end up performing operations associated with one type on an object that
is not that type, as demonstrated in fruit3.d with the core() function.  I
appreciate that slicing is not the problem.  It's this (seemingly?) undefined
behavior that is.

I did, however, find how to achieve the behavior I was looking for with
interfaces.  Arrays of objects are not implicitly converted to arrays of their
interfaces.  Which brings me to a quick tangent question:  Are there still
plans to implement interface contracts?  I was just reading an old usenet
thread about the possibility.

FYI, contracts and integrated unittest are two major feature draws for me
(scientists often write awful code).  All in all I am strongly leaning toward D
for my project.  I hope more people start catching on to how good it looks ;)

Thanks again,
Mark

Apr 28 2007

Mike Capp <mike.capp gmail.com> writes:

Walter Bright wrote:
 Mark Burnett wrote:

 Treating a container of derived as a container
 of base is an error.

 But that isn't what is happening with D. base[]
 is an array of *references* to base, so the
 slicing problem one has in C++ is not possible in D

I don't think he's talking about slicing, I think he's talking about the
type-system hole. It's not unreasonable to assume when reading/debugging code
that any members of a Foo[] will be of type Foo or, if not, that a cast will
have been required somewhere to indicate that fishy things are afoot (afin?).
This conversion subverts that.

OP: I raised this about a year ago and nobody seemed bothered then; I don't
imagine that's changed.

cheers
Mike

Apr 28 2007

Mark Burnett <unstained gmail.com> writes:

 Treating a container of derived as a container
 of base is an error.



  OP: I raised this about a year ago and nobody seemed bothered then; I don't
 imagine that's changed.

After playing around a bit more, I've discovered that (at least with gdc)
rewriting the addfavoritefruit function to use the ~= operator to append
institutes copy on write by default, while this is *not* the case for the
previously posted indexed version.

If the keyword "in" guaranteed that all internal writes were copies, then it
would probably be unnecessary to change the implicit conversion behavior.  No
one in their right mind would pass derived[] to a f(inout base[]), so the only
time the issue would arise is in the local scope, which at least narrows the
potential bug down for the programmer.

So basically, is there a reason for the following distinction?

foo(in baseT [] base)
{
base[27] = new derived2;  // does not create a copy -- not correct, see below

base ~= new derived2;  // creates a copy
}

If not, a consistent change to the implementation would have practically no
effect on the language specification, while (for practical purposes) fixing
this problem.

Correction:  as I was experimenting with this some more I noticed that base[i]
= ...; *does* seem to create a copy, but it does so *after* the assignment
takes place (during the return?).  If this is really the case, it's almost
certainly unintended, and changing it would eliminate this problem for many
purposes.
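
The difference between indexed writes and appends can be reproduced without any class hierarchy. The sketch below assumes a current D2 compiler and plain int arrays; `modify` is a hypothetical function. A dynamic array parameter is a pointer/length pair passed by value, so an indexed write goes to the caller's elements, while `~=` only updates the callee's local pair.

```d
// Hypothetical function showing both kinds of write on a slice parameter.
void modify(int[] arr)
{
    arr[0] = 42;   // writes into the storage the caller also sees
    arr ~= 7;      // changes only the local slice; the caller's length stays put
}

void main()
{
    int[] data = [1, 2, 3];
    modify(data);

    assert(data[0] == 42);      // the indexed write is visible to the caller
    assert(data.length == 3);   // the appended element is not
}
```

Seen this way, the append does not protect the caller's existing elements; it simply never touches them, which may explain the behavior observed with gdc.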

Mark

Apr 28 2007

janderson <askme me.com> writes:

Mark Burnett wrote:
 I have spent much of the last couple of weeks trying to choose a language in
which to write the code for my PhD thesis (in computational physics).  I had
very nearly decided on using c++, when yesterday I stumbled upon D.  So far I'm
ecstatic about its feature set.
 
 Still there are one or two things that strike me as odd:  in particular that
arrays of a derived type can be converted to an array of a base type.  As
pointed out by Marshall Cline,
[http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.4] this
is dangerous.  Is this possibly a holdover from c++?  It is explicitly
mentioned in the array page that they behave this way, so I am not convinced
that is the case.
 
 Fortunately not all of the problems associated with doing this in c++ exist in
d (see attached code).  What d seems to do is treat all derived[] as base[],
which is silly because if i want a base[], I would just declare it that way. 
Asking for a derived[] is how I say that I  *only* want derived objects in
there.
 
 The attached code generates this output using gdc 0.23 on OSX:
 Here are the different apples we have:
 A P P L E -- Red
 A P P L E -- Red
 A P P L E -- Red
 Orange -- Orange
 A P P L E -- Red
 
 Please, keep in mind that this test is the first d I have written, and I don't
claim to understand the language.  Array type promotion just seems odd to
include, and I would like to understand the motivation for doing so.

What is happening is a pointer copy from lotsofapples to justsomefruits. 
I think that D should really require the programmer to write: 
justsomefruits.ptr = lotsofapples.ptr.

Other than that, though, I don't really have a problem with this type of 
conversion since it is really useful for polymorphism.  Consider that 
you may want to write some sort of generic function that takes the 
base class, like:

void sort(Fruit [] basket)
{
	
}

Fruit [] justsomefruits; // I only have an array of fruits here because
                         // this particular class only wants to work on fruits
                         // (i.e. fruit has extra properties I know about).

You start to see how beneficial it can be. In my opinion it's a neat feature.

-Joel

Apr 28 2007

janderson <askme me.com> writes:

Mark Burnett wrote:
 I have spent much of the last couple of weeks trying to choose a language in
which to write the code for my PhD thesis (in computational physics).  I had
very nearly decided on using c++, when yesterday I stumbled upon D.  So far I'm
ecstatic about its feature set.
 
 Still there are one or two things that strike me as odd:  in particular that
arrays of a derived type can be converted to an array of a base type.  As
pointed out by Marshall Cline,
[http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.4] this
is dangerous.  Is this possibly a holdover from c++?  It is explicitly
mentioned in the array page that they behave this way, so I am not convinced
that is the case.
 
 Fortunately not all of the problems associated with doing this in c++ exist in
d (see attached code).  What d seems to do is treat all derived[] as base[],
which is silly because if i want a base[], I would just declare it that way. 
Asking for a derived[] is how I say that I  *only* want derived objects in
there.
 
 The attached code generates this output using gdc 0.23 on OSX:
 Here are the different apples we have:
 A P P L E -- Red
 A P P L E -- Red
 A P P L E -- Red
 Orange -- Orange
 A P P L E -- Red
 
 Please, keep in mind that this test is the first d I have written, and I don't
claim to understand the language.  Array type promotion just seems odd to
include, and I would like to understand the motivation for doing so.


Ok looking at your example again:  I think the real issue is this:
// Code to test array type promotion.

import std.stdio;

class Fruit
{
	enum color { Red, Orange, Fuchsia };
	static char[] [color] colorlist;

	color mycolor;
	char [] name;

	static this() // Love these :)
	{
		colorlist[color.Red] = "Red";
		colorlist[color.Orange] = "Orange";
		colorlist[color.Fuchsia] = "Fuchsia";
	}

	this()
	{
		mycolor = color.Fuchsia;
		name = "Generic Fruit";
	}

	void whatkind()
	{
		writefln("%s -- %s", name, colorlist[mycolor]);
	}
}

class Apple : Fruit
{
	this()
	{
		mycolor = color.Red;
		name = "A P P L E";
	}


	void DropApple()
	{
		writefln("DropApple");
	}
}

class Orange : Fruit
{
	struct orangesarebiggerthanapples
	{
		int numberofbumps = 42;
		double ph = 5.3;
	}
	this()
	{
		mycolor = color.Orange;
		name = "Orange";
	}

	void EatOrange()
	{
		writefln("EatOrange");
	}
}


void addfavoritefruit(Fruit [] basket, int index)
{
	basket[index] = new Orange;
}


void main(char [] [] args)
{
	Apple [] lotsofapples;
	Fruit [] justsomefruits;

	lotsofapples.length = 5;
	lotsofapples[0] = new Apple;
	lotsofapples[1] = new Apple;
	lotsofapples[2] = new Apple;

	justsomefruits = lotsofapples; // Dangerous!

	justsomefruits.addfavoritefruit(3);
	
	lotsofapples[4] = new Apple;

	writefln("Here are the different apples we have:");
	foreach (apple; lotsofapples)
	{
		apple.DropApple();
	}

	while(true) {}
}

DropApple
DropApple
DropApple
EatOrange  //What the hell, I never called this function.
DropApple


//2

//Even worse, remove the EatOrange

DropApple
DropApple
DropApple
Error: Access Violation


The problem is the refcopy.  I'm not sure this sort of conversion should 
be banned.  However, I'm not sure what type of checking should 
be used.  Maybe it should only be converted when passed into the 
function (otherwise require a .ptr qualifier).  It would still have 
potential issues but I think it would be less error prone.

-Joel

Apr 28 2007

Mark Burnett <unstained gmail.com> writes:

janderson Wrote:

 DropApple
 DropApple
 DropApple
 EatOrange  //What the hell, I never called this function.
 DropApple
 
 
 //2
 
 //Even worse, remove the EatOrange
 
 DropApple
 DropApple
 DropApple
 Error: Access Violation
 
 
 The problem is the refcopy.  I'm not sure this sort of conversion should 
 be banned.  However, I'm not sure what type of checking should 
 be used.  Maybe it should only be converted when passed into the 
 function (otherwise require a .ptr qualifier).  It would still have 
 potential issues but I think it would be less error prone.
 
 -Joel

Bingo.  D does not exhibit this behavior with interfaces, however, so designers
are safe using arrays of interfaces (only!).

Apr 28 2007

Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Mark Burnett schrieb am 2007-04-28:
 Still there are one or two things that strike me as odd:  in particular
 that arrays of a derived type can be converted to an array of a base type.

[...]

Below is a simplified sample:

derived[0].y);


Thomas


Apr 30 2007

Manfred Nowak <svv1999 hotmail.com> writes:

Thomas Kuehne wrote
 Below is a simplified sample:


That assignment above is totally useless, unless one wants to drop some 
data that can only be held by derived.

But if one drops some of that data, every access through derived might 
behave unpredictably.

The obvious fault is that derived is not nulled immediately after that 
assignment in order to prevent every further access.

Is that what the compiler is supposed to do automatically: null out the 
RHS as a side effect?

-manfred

Apr 30 2007
