D - Design questions : Copying and primitive-likes


davids argia.net writes:

I'm pretty excited by D; I've always liked C++, but been annoyed by the known
problems that implementations were prevented from fixing due to C compatibility.
I really like the idea of a language similar in spirit and feel to C++, sans the
known problems and plus a few additional handy features.

But looking over the design, a couple things puzzle me. I'm not sure if they've
been discussed before here, as there's no search mechanism on the web interface
and my ISP doesn't have NNTP, so if I'm inadvertently restarting an old flame
war, hand me a link and I'll go quiet-like. On to the questions:

First off, I notice that none of the operators you'd need in order to create
classes that act like arrays and associative arrays are in the list of
overloadable operators. Is this omission intentional or is it just a matter of
waiting till someone gets around to implementing it?

In the design overview, I notice that the Java concept of allocating all class
objects to the heap is adopted. This has some definite advantages: doing things
on the heap by default means that the safest behavior polymorphically is the
standard one, and it avoids unnecessary copying.

However, I don't really understand why this hasn't simply been made the default
as opposed to the only option, and I also don't understand why primitives are
made to behave so differently from class objects. This introduces some problems:

- There's no template-safe way to make a copy of something. Sure, you could have
a standard function 'clone()' or somesuch, but having it done by the language is
better since it's official, and it could be implemented by D itself in most
cases.
- Similarly, there's no template-safe way to make a reference of something. This
is because the behavior of class objects differs from primitives; if you do A =
B in a template where A and B are of the parameterized type, this will produce
different results depending on whether or not it's a primitive or a class object,
and there's no safe way to check except to specialize for each individual
primitive type the template supports.
- You cannot make a class object that behaves just like a primitive object in
terms of assignment, since you cannot overload operator= (or is this just an
accidental omission from the overloadable ops list?)
- Things using Object cannot work with primitives.
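
The assignment difference described in the list above can be sketched as follows. This is only an illustration, written for a present-day D compiler whose template syntax postdates this thread; PointS, PointC and assignmentAliases are made-up names, not anything from D's library.

```d
// Value type: assignment copies the data.
struct PointS { int x; }

// Reference type: assignment copies only the reference.
class PointC { int x; }

// Hypothetical helper: assigns the argument to a second variable,
// mutates that variable, and reports whether the first one changed too.
bool assignmentAliases(T)(T original)
{
    T other = original;   // a copy for structs, an alias for classes
    other.x = 42;
    return original.x == 42;
}

void main()
{
    auto s = PointS(1);
    auto c = new PointC;
    c.x = 1;

    assert(!assignmentAliases(s)); // struct: 'other' was an independent copy
    assert(assignmentAliases(c));  // class: both names refer to one object
}
```

The same template body therefore means two different things depending on T, which is the inconsistency the list points at.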

A good example of a case for class objects needing to have primitive-like
semantics is a math library with types for rectangles, points, and so on. These
types are not suitable for D structs, since they need methods, but they are not
suitable for D classes either, since they're small enough to be considered
primitive-like and to be noticeably more efficient on the stack, and since a
user of this library could easily be using them all over the place as temporary
variables in their own algorithms.

Feb 21 2003

"Sean L. Palmer" <seanpalmer directvinternet.com> writes:

<davids argia.net> wrote in message news:b3779f$fu5$1 digitaldaemon.com...
 I'm pretty excited by D; I've always liked C++, but annoyed by the known
 problems that implementations were prevented from fixing due to C
 compatibility.
 I really like the idea of a language similar in spirit and feel to C++,
 sans the known problems and plus a few additional handy features.

If you ask me, D seems more like a cross between Java and C++ and C.  Some
features of all 3.  It's not attempting to go the same route C++ went, even
though it has generics and operator overloading.

 But looking over the design, a couple things puzzle me. I'm not sure if
 they've been discussed before here, as there's no search mechanism on the
 web interface and my ISP doesn't have NNTP, so if I'm inadvertently
 restarting an old flame war, hand me a link and I'll go quiet-like. On to
 the questions:

Discussion is always good.

 First off, I notice that none of the operators you'd need in order to
 create classes that act like arrays and associative arrays are in the list
 of overloadable operators. Is this omission intentional or is it just a
 matter of waiting till someone gets around to implementing it?

There was a vote, and not enough people voted for them I guess.  I think
it's handy to overload array access.  I'd want to be able to overload
higher-dimensional forms too, such as operator [int x, int y, int z].

If you can overload array indexing you should also be able to overload
iteration.  There's no standard iteration mechanism yet in D though.  There
was some debate a long time ago about being able to apply an operation
designed for an array element to all the elements in an array or slice, but
this has never actually been implemented yet.  That's the closest thing to
iteration that has been seriously considered so far that I've seen.  Lots of
talk about it and many proposals but nothing that looks like a possible
consensus has been approached.  Do you have any suggestions?

 In the design overview, I notice that the Java concept of allocating all
 class objects to the heap is adopted. This has some definite advantages:
 doing things on the heap by default means that the safest behavior
 polymorphically is the standard one, and it avoids unnecessary copying.

For smallish classes it creates unnecessary allocation.  You can always pass
structs by reference.  What it does is make it safer to share pointers to
objects.  It doesn't completely solve any safety or robustness problems, and
doesn't make things much easier for itself say for implementing a much more
efficient form of garbage collection.  Still the same old malloc with a
spiffy new generic garbage collector bolted on almost as an afterthought.

 However, I don't really understand why this hasn't simply been made the
 default as opposed to the only option, and I also don't understand why
 primitives are made to behave so differently from class objects. This
 introduces some problems:
 - There's no template-safe way to make a copy of something. Sure, you
 could have a standard function 'clone()' or somesuch, but having it done
 by the language is better since it's official, and it could be implemented
 by D itself in most cases.

Agreed.  There are different forms of operator == (and === (that's 3 ='s) )
which compare by value or by reference, but it's unclear what they do if
applied to the wrong kind of type (what does === do on a value type?)  But
operator = is used to copy by value *and* for copy by reference.  Yes it can
be confusing.  There is no standard way of getting what most would call a
deep copy of a class object.

 - Similarly, there's no template-safe way to make a reference of
 something. This is because the behavior of class objects differs from
 primitives; if you do A = B in a template where A and B are of the
 parameterized type, this will produce different results depending on
 whether or not it's a primitive or a class object, and there's no safe way
 to check except to specialize for each individual primitive type the
 template supports.

I'd personally prefer more integration of types myself.

 - You cannot make a class object that behaves just like a primitive object
 in terms of assignment, since you cannot overload operator= (or is this
 just an accidental omission from the overloadable ops list?)

D assumes that assignment isn't something you would want to overload.   I
don't really agree.

 - Things using Object cannot work with primitives.



 A good example of a case for class objects needing to have primitive-like
 semantics is a math library with types for rectangles, points, and so on.
 These types are not suitable for D structs, since they need methods, but
 they are not suitable for D classes either, since they're small enough to
 be considered primitive-like and to be noticeably more efficient on the
 stack, and since a user of this library could easily be using them all
 over the place as temporary variables in their own algorithms.

D structs can have methods and overloaded operators.  They just can't have
constructors and destructors.  In fact, D structs are intended for just such
primitives as points and rectangles.
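
As a rough illustration of that point, here is a struct with a method and an overloaded operator. It is a sketch for a present-day D compiler; the operator-overloading syntax available in 2003 may have differed, and Point and manhattanLength are names invented for the example.

```d
struct Point
{
    int x, y;

    // An ordinary method on a value type.
    int manhattanLength() const
    {
        return (x < 0 ? -x : x) + (y < 0 ? -y : y);
    }

    // Overloaded + using current D's opBinary template.
    Point opBinary(string op)(Point rhs) const
        if (op == "+")
    {
        return Point(x + rhs.x, y + rhs.y);
    }
}

void main()
{
    Point a = Point(1, 2);
    Point b = a;          // plain value copy, no heap allocation
    b.x = 10;

    Point c = a + b;
    assert(a.x == 1);                   // the copy did not touch a
    assert(c == Point(11, 4));
    assert(c.manhattanLength() == 15);
}
```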

Sean

Feb 22 2003

Farmer <itsFarmer. freenet.de> writes:

"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in 
news:b37dbj$lbe$1 digitaldaemon.com: 

 <davids argia.net> wrote in message 
 news:b3779f$fu5$1 digitaldaemon.com... 
 In the design overview, I notice that the Java concept of allocating all
 class objects to the heap is adopted. This has some definite advantages:
 doing things on the heap by default means that the safest behavior
 polymorphically is the standard one, and it avoids unnecessary copying.

 
 For smallish classes it creates unnecessary allocation.  You can 
 always pass structs by reference.  What it does is make it safer to 
 share pointers to objects.  It doesn't completely solve any safety or 
 robustness problems, and doesn't make things much easier for itself 
 say for implementing a much more efficient form of garbage collection. 
  Still the same old malloc with a spiffy new generic garbage collector 
 bolted on almost as an afterthought. 
 

IMO having classes on the gc heap and structs on stack (you cannot bring 
them on the gc heap, only the malloc heap) makes life much easier: 
-virtual methods are always dispatched dynamically
-no risk of slicing off data when passing parameters by value
-no risk of slicing off data when creating arrays or hashtables for objects
-objects can be returned efficiently by functions
-behaviour of assignments is consistent and predictable for maintainers
-no risk of returning invalid pointers to objects on the stack

Actually since DMD 0.56 you can allocate objects on the stack, but I did 
not check that out (yet). But I guess, this feature should be removed from 
D. Looks unsafe and incomplete, e.g. I want objects embedded within other 
objects without costly indirection. 
Anyway, a compiler could automatically detect when it is safe to put 
objects onto the stack. I read a paper "Marmot: An Optimizing Compiler for 
Java" about a native Java compiler that did this.


 However, I don't really understand why this hasn't simply been made the
 default as opposed to the only option, and I also don't understand why
 primitives are made to behave so differently from class objects. This
 introduces some problems:
 - There's no template-safe way to make a copy of something.


Maybe there is one, I think you can specialize the templates to simple 
types and classes (types derived from Object), but a bug in the compiler 
currently prevents this.


 Sure, you could have a standard function 'clone()' or somesuch, but having
 it done by the language is better since it's official, and it could be
 implemented by D itself in most cases.


Most objects cannot simply be copied; it takes extra thought by the
programmer to make things work right.
If you want to make shallow copies or deep copies of your objects you can
create a clone method. Implementation is straightforward:
1) Move all member variables that should be shallow-copied into a struct.
2) Use this struct as the member variable of the class.
3) Create a new instance in the clone method and assign the struct member to
the new instance.
4) If some members (e.g. some pointers) require a deep copy, you must copy
each of them and assign these copies to the new instance.

The code generated by DMD is as efficient as the implicit copy constructor 
of C++-compilers, as far as I can judge that.
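
The four steps above might look roughly like this. It is a sketch in present-day D, not code from the thread; Account, Fields and history are hypothetical names chosen for the example.

```d
class Account
{
    // Steps 1 and 2: everything that can be copied member by member
    // lives in one struct, which is the class's data member.
    struct Fields
    {
        int id;
        double balance;
    }
    Fields fields;

    // A member that needs a deep copy (step 4).
    int[] history;

    // Step 3: create a new instance and assign the struct in one go.
    Account clone()
    {
        auto copy = new Account;
        copy.fields = fields;        // copies every member of Fields
        copy.history = history.dup;  // duplicates the array contents
        return copy;
    }
}

void main()
{
    auto a = new Account;
    a.fields = Account.Fields(7, 100.0);
    a.history = [1, 2, 3];

    auto b = a.clone();
    b.fields.balance = 0.0;
    b.history[0] = 99;

    assert(a.fields.balance == 100.0); // original untouched
    assert(a.history[0] == 1);
    assert(b.fields.id == 7);
}
```

Adding a new shallow-copied member then only means adding it to Fields; clone() itself does not change.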
 

 
 Agreed.  There are different forms of operator == (and === (that's 3 
 ='s) ) which compare by value or by reference, but it's unclear what 
 they do if applied to the wrong kind of type (what does === do on a 
 value type?)  But operator = is used to copy by value *and* for copy 
 by reference.  Yes it can be confusing.  There is no standard way of 
 getting what most would call a deep copy of a class object. 

We could use a new operator := for copy-wise assignment and keep operator 
= for reference-only semantics. But I think that even operator === 
should be removed. It will just cause too many bugs for too many people.

	Object o;
	if (o==null)  //BUG must be if (o===null)
	{}
If the D compiler at least issued a warning here, it would help in 
most cases.
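
For comparison, a small sketch of the two kinds of comparison in present-day D, where the identity test written === in this thread is spelled `is`. Box is a hypothetical class used only for illustration.

```d
class Box
{
    int v;
    this(int v) { this.v = v; }

    // Value comparison: two boxes are equal if they hold the same number.
    override bool opEquals(Object o)
    {
        auto other = cast(Box) o;
        return other !is null && other.v == v;
    }
}

void main()
{
    auto a = new Box(1);
    auto b = new Box(1);
    Box n;                 // class references default to null

    assert(a == b);        // equal by value, via opEquals
    assert(a !is b);       // but two distinct objects
    assert(n is null);     // identity test is safe on a null reference
}
```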


 
 - Similarly, there's no template-safe way to make a reference of
 something. This is because the behavior of class objects differs from
 primitives; if you do A = B in a template where A and B are of the
 parameterized type, this will produce different results depending on
 whether or not it's a primitive or a class object, and there's no safe way
 to check except to specialize for each individual primitive type the
 template supports.

 
 I'd personally prefer more integration of types myself. 
 
 - You cannot make a class object that behaves just like a primitive object
 in terms of assignment, since you cannot overload operator= (or is this
 just an accidental omission from the overloadable ops list?)

 
 D assumes that assignment isn't something you would want to overload. 
  I don't really agree. 
 
 - Things using Object cannot work with primitives. 

 



[...] attracting Visual Basic people.
Primitive types and Object types are made so different for performance
reasons.
D could make everything behave as an Object, but not the other way round.
[...]
wrapper object for the primitive value, generated by the compiler behind
your back. I guess that if templates were part of the initial release of
[...]
mainly for container classes.

 A good example of a case for class objects needing to have primitive-like
 semantics is a math library with types for rectangles, points, and so on.
 These types are not suitable for D structs, since they need methods, but
 they are not suitable for D classes either, since they're small enough to
 be considered primitive-like and to be noticeably more efficient on the
 stack, and since a user of this library could easily be using them all
 over the place as temporary variables in their own algorithms.

 
 D structs can have methods and overloaded operators.  They just can't 
 have constructors and destructors.  In fact, D structs are intended 
 for just such primitives as points and rectangles. 

I guess constructors and destructors could be added to structs without 
performance loss, as long as any copy constructor or assignment operator is 
disallowed. But why is Walter against them? Just because people would ask 
for copy constructors as soon as he adds constructors? That does not seem 
rational to me.


Just some thoughts.

Farmer

Feb 26 2003

Farmer <itsFarmer. freenet.de> writes:

I wrote
 IMO having classes on the gc heap and structs on stack (you cannot
 bring them on the gc heap, only the malloc heap) makes life much
 easier:
 -virtual methods are always dispatched dynamically
 -no risk of slicing off data when passing parameters by value
 -no risk of slicing off data when creating arrays or hashtables for
 objects
 -objects can be returned efficiently by functions
 -behaviour of assignments is consistent and predictable for maintainers
 -no risk of returning invalid pointers to objects on the stack

The primary problem with arrays (or hashtables, depends on implementation) 
is not slicing off data, but corrupt data when accessing array elements 
polymorphically. 
But for D this would currently be no issue, as covariance of arrays is not 
allowed.



 Just a correction.
 
 Farmer

Mar 07 2003

David Simon <David_member pathlink.com> writes:

 If you want to make shallow copies or deep copies of your objects you can
 create a clone method. Implementation is straightforward:
 1) Move all member variables that should be shallow-copied into a struct.
 2) Use this struct as the member variable of the class.
 3) Create a new instance in the clone method and assign the struct member to
 the new instance.
 4) If some members (e.g. some pointers) require a deep copy, you must copy
 each of them and assign these copies to the new instance.

The approach you have above has some problems; it doesn't work with templates
unless they understand your convention for naming the clone function, and
accessing the members of the substruct requires a bit. Another issue is that it's
just sort of using a property of struct (copying) as a property of class by
using a struct; but why not just make this a property of classes in the first
place?

The biggest problem I had was not whether or not things are placed on the heap
or the stack, but that class objects have different semantics for the same
operation as non-class objects not on the gc heap. This just seems inconsistent,
and it forces any coder writing templates to very carefully avoid any use of the
'=' operator that assumes a copy was or was not made, which is pretty much every
use that I can think of...

This has no chance of happening so late in the design phase, but what might have
achieved this would be to use copy-on-write for class objects, and to give them =
syntax for copying. Then they'd be consistent with regular stack objects in terms
of assignment, but you'd still get polymorphic behavior naturally in containers
and arguments, avoid slicing, and avoid unnecessary copies.

The best way to do class copying automatically is probably to differentiate
owning and non-owning pointers (this could also make auto-serialization systems
easier as well). This is what a lot of smart pointer systems in C++ do,
including Boost's and the standard library auto_ptr. The difference between deep
copy and shallow copy isn't important; the point of making a copy is to create
an entirely separate object from the original, which pretty much implies a deep
copy (or something that acts like one, like a copy-on-write).

Mar 08 2003

Farmer <itsFarmer. freenet.de> writes:

Hi,

comments are embedded.

David Simon <David_member pathlink.com> wrote in
news:b4c929$312g$1 digitaldaemon.com: 

 If you want to make shallow copies or deep copies of your objects you
 can create a clone method. Implementation is straightforward:
 1) Move all member variables that should be shallow-copied into a
 struct.
 2) Use this struct as the member variable of the class.
 3) Create a new instance in the clone method and assign the struct
 member to the new instance.
 4) If some members (e.g. some pointers) require a deep copy, you must
 copy each of them and assign these copies to the new instance.

 
 The approach you have above has some problems; it doesn't work with
 templates unless they understand your convention for naming the clone
 function, and accessing the members of the substruct requires a bit.
 Another issue is that it's just sort of using a property of struct
 (copying) as a property of class by using a struct; but why not just
 make this a property of classes in the first place?

One design goal of D is to keep the standard rather simple. 

I only see two minor problems:
-More typing for programmers (solution: wasting less time in pointless 
meetings, so one *has* plenty of time for typing D code ;-).
-Templates must understand a clonable convention: That's possible with 
interfaces in D. 
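
One way such a convention could look, as a sketch only: it relies on `static if` and `is()` from present-day D, which were not available when this was posted, and Clonable, Node and copyOf are hypothetical names rather than anything in D's library.

```d
// Hypothetical convention: a class that supports copying implements this.
interface Clonable
{
    Object clone();
}

class Node : Clonable
{
    int value;
    this(int value) { this.value = value; }
    Object clone() { return new Node(value); }
}

// Returns an independent copy for value types and for Clonable classes.
T copyOf(T)(T source)
{
    static if (is(T : Clonable))
        return cast(T) source.clone();   // class: go through the interface
    else static if (is(T == class))
        static assert(false, T.stringof ~ " must implement Clonable");
    else
        return source;                   // value type: returning it copies it
}

void main()
{
    auto n = new Node(5);
    auto m = copyOf(n);
    m.value = 6;
    assert(n.value == 5);   // the original node is unchanged

    int i = 3;
    int j = copyOf(i);
    j = 4;
    assert(i == 3);
}
```

A template that needs a real copy can then call copyOf instead of relying on what '=' happens to do for its parameter type.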

 The biggest problem I had was not whether or not things are placed on
 the heap or the stack, but that class objects have different semantics
 for the same operation as non-class objects not on the gc heap. This
 just seems inconsistent [...]

That's the best point about it. Class objects are objects, everything else 
is not an object. No superficial unification of language types. An int is 
not an object; it's more likely to be a register in the CPU ;-).  The 
programming model that is induced by D's semantics for objects is likely to 
[...]

 [...], and it forces any coder writing templates to
 very carefully avoid any use of the '=' operator that assumes a copy
 was or was not made, which is pretty much every use that I can think
 of... 

Could you please post just one of them?
I wanted to write a template that unifies the copying behaviour of the 
template parameter. Other templates could use this template if they 
required the unified behaviour for their parameters. But I did not succeed 
for two reasons:
-due to a bug in the compiler, I could not specialize for objects. It's 
fixed now.
-I was not able to think of an example that requires a unified behaviour.



 
 This has no chance of happening so late in the design phase, but what
 might have achieved this would be to use copy-on-write for class objects,
 and to give them = syntax for copying. Then they'd be consistent with
 regular stack objects in terms of assignment, but you'd still get
 polymorphic behavior naturally in containers and arguments, avoid
 slicing, and avoid unnecessary copies.
 
 The best way to do class copying automatically is probably to
 differentiate owning and non-owning pointers (this could also make
 auto-serialization systems easier as well). This is what a lot of
 smart pointer systems in C++ do, including Boost's and the standard
 library auto_ptr. The difference between deep copy and shallow copy
 isn't important; the point of making a copy is to create an entirely
 separate object from the original, which pretty much implies a deep
 copy (or something that acts like one, like a copy-on-write).
 

I don't think that the concepts you mentioned here would be useful for D. D's 
GC deals with the issues solved by smart pointers and COW classes in C++ 
in a much simpler way.



Farmer.

Mar 08 2003
