
D - Design questions : Copying and primitive-likes

reply davids argia.net writes:
I'm pretty excited by D; I've always liked C++, but annoyed by the known
problems that implementations were prevented from fixing due to C compatibility.
I really like the idea of a language similar in spirit and feel to C++, sans the
known problems and plus a few additional handy features.

But looking over the design, a couple things puzzle me. I'm not sure if they've
been discussed before here, as there's no search mechanism on the web interface
and my ISP doesn't have NNTP, so if I'm inadvertently restarting an old flame
war, hand me a link and I'll go quiet-like. On to the questions:

First off, I notice that none of the operators you'd need in order to create
classes that act like arrays and associative arrays are in the list of
overloadable operators. Is this omission intentional or is it just a matter of
waiting till someone gets around to implementing it?

In the design overview, I notice that the Java concept of allocating all class
objects to the heap is adopted. This has some definite advantages: doing things
on the heap by default means that the safest behavior polymorphically is the
standard one, and it avoids unnecessary copying.

However, I don't really understand why this hasn't simply been made the default
as opposed to the only option, and I also don't understand why primitives are
made to behave so differently from class objects. This introduces some problems:

- There's no template-safe way to make a copy of something. Sure, you could have
a standard function 'clone()' or somesuch, but having it done by the language is
better since it's official, and it could be implemented by D itself in most
cases.
- Similarly, there's no template-safe way to make a reference to something. This
is because the behavior of class objects differs from primitives; if you do A =
B in a template where A and B are of the parameterized type, this will produce
different results depending on whether it's a primitive or a class object,
and there's no safe way to check except to specialize for each individual
primitive type the template supports.
- You cannot make a class object that behaves just like a primitive object in
terms of assignment, since you cannot overload operator= (or is this just an
accidental omission from the overloadable ops list?)
- Things using Object cannot work with primitives.
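
The assignment problem in the list above can be sketched in a few lines. This is a minimal illustration in present-day D syntax (which postdates this thread); `Box`, `Pair` and `mutationVisibleThroughCopy` are hypothetical names invented for the example.

```d
// Hypothetical types, for illustration only.
class Box            // class: a variable holds a reference to the object
{
    int value;
    this(int v) { value = v; }
}

struct Pair          // struct: a variable holds the value itself
{
    int value;
}

// Assigns inside a template, changes the new variable, and reports
// whether that change is also visible through the original.
bool mutationVisibleThroughCopy(T)(T original)
{
    T other = original;                 // reference copy for Box, value copy for Pair
    other.value = original.value + 1;
    return other.value == original.value;
}
```

The same template body reports `true` for a `Box` argument and `false` for a `Pair` argument, which is the inconsistency described in the second bullet.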

A good example of a case for class objects needing to have primitive-like
semantics is a math library with types for rectangles, points, and so on. These
types are not suitable for D structs, since they need methods, but they are not
suitable for D classes either, since they're small enough to be considered
primitive-like and to be noticeably more efficient on the stack, and since a
user of this library could easily be using them all over the place as temporary
variables in their own algorithms.
Feb 21 2003
parent reply "Sean L. Palmer" <seanpalmer directvinternet.com> writes:
<davids argia.net> wrote in message news:b3779f$fu5$1 digitaldaemon.com...
 I'm pretty excited by D; I've always liked C++, but annoyed by the known
 problems that implementations were prevented from fixing due to C compatibility.
 I really like the idea of a language similar in spirit and feel to C++, sans the
 known problems and plus a few additional handy features.
If you ask me, D seems more like a cross between Java and C++ and C. Some features of all 3. It's not attempting to go the same route C++ went, even though it has generics and operator overloading.
 But looking over the design, a couple things puzzle me. I'm not sure if they've
 been discussed before here, as there's no search mechanism on the web interface
 and my ISP doesn't have NNTP, so if I'm inadvertently restarting an old flame
 war, hand me a link and I'll go quiet-like. On to the questions:
Discussion is always good.
 First off, I notice that none of the operators you'd need in order to create
 classes that act like arrays and associative arrays are in the list of
 overloadable operators. Is this omission intentional or is it just a matter of
 waiting till someone gets around to implementing it?
There was a vote, and not enough people voted for them I guess. I think it's handy to overload array access. I'd want to be able to overload higher-dimensional forms too, such as operator [int x, int y, int z]. If you can overload array indexing you should also be able to overload iteration. There's no standard iteration mechanism yet in D though. There was some debate a long time ago about being able to apply an operation designed for an array element to all the elements in an array or slice, but this has never actually been implemented yet. That's the closest thing to iteration that has been seriously considered so far that I've seen. Lots of talk about it and many proposals but nothing that looks like a possible consensus has been approached. Do you have any suggestions?
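
For reference, present-day D does let a user-defined type overload indexing, including the multi-argument form wished for above. The following is a sketch in current D syntax, not something the compiler of this thread accepted; `Grid3` and its members are hypothetical names.

```d
// Hypothetical 3-D grid of ints stored in one flat array.
struct Grid3
{
    private int[] cells;
    private size_t width, height, depth;

    this(size_t w, size_t h, size_t d)
    {
        width = w; height = h; depth = d;
        cells = new int[w * h * d];     // elements start at 0
    }

    // Maps (x, y, z) to a position in the flat array.
    private size_t offset(size_t x, size_t y, size_t z) const
    {
        assert(x < width && y < height && z < depth);
        return (z * height + y) * width + x;
    }

    // Called for the expression grid[x, y, z]
    int opIndex(size_t x, size_t y, size_t z) const
    {
        return cells[offset(x, y, z)];
    }

    // Called for the statement grid[x, y, z] = value
    void opIndexAssign(int value, size_t x, size_t y, size_t z)
    {
        cells[offset(x, y, z)] = value;
    }
}
```

With these two members, `g[1, 2, 3] = 7;` and `g[1, 2, 3]` read like built-in array access.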
 In the design overview, I notice that the Java concept of allocating all class
 objects to the heap is adopted. This has some definite advantages: doing things
 on the heap by default means that the safest behavior polymorphically is the
 standard one, and it avoids unnecessary copying.
For smallish classes it creates unnecessary allocation. You can always pass structs by reference. What it does is make it safer to share pointers to objects. It doesn't completely solve any safety or robustness problems, and doesn't make things much easier for itself say for implementing a much more efficient form of garbage collection. Still the same old malloc with a spiffy new generic garbage collector bolted on almost as an afterthought.
 However, I don't really understand why this hasn't simply been made the default
 as opposed to the only option, and I also don't understand why primitives are
 made to behave so differently from class objects. This introduces some problems:
 - There's no template-safe way to make a copy of something. Sure, you could have
 a standard function 'clone()' or somesuch, but having it done by the language is
 better since it's official, and it could be implemented by D itself in most
 cases.
Agreed. There are different forms of operator == (and === (that's 3 ='s) ) which compare by value or by reference, but it's unclear what they do if applied to the wrong kind of type (what does === do on a value type?) But operator = is used to copy by value *and* for copy by reference. Yes it can be confusing. There is no standard way of getting what most would call a deep copy of a class object.
 - Similarly, there's no template-safe way to make a reference to something. This
 is because the behavior of class objects differs from primitives; if you do A =
 B in a template where A and B are of the parameterized type, this will produce
 different results depending on whether it's a primitive or a class object,
 and there's no safe way to check except to specialize for each individual
 primitive type the template supports.
I'd personally prefer more integration of types myself.
 - You cannot make a class object that behaves just like a primitive object in
 terms of assignment, since you cannot overload operator= (or is this just an
 accidental omission from the overloadable ops list?)
D assumes that assignment isn't something you would want to overload. I don't really agree.
 - Things using Object cannot work with primitives.
 A good example of a case for class objects needing to have primitive-like
 semantics is a math library with types for rectangles, points, and so on. These
 types are not suitable for D structs, since they need methods, but they are not
 suitable for D classes either, since they're small enough to be considered
 primitive-like and to be noticeably more efficient on the stack, and since a
 user of this library could easily be using them all over the place as temporary
 variables in their own algorithms.
D structs can have methods and overloaded operators. They just can't have constructors and destructors. In fact, D structs are intended for just such primitives as points and rectangles.
Sean
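
A small sketch of such a value type, written in present-day D syntax (older compilers spelled operator overloads differently). `Point`, `lengthSquared` and `midpoint` are hypothetical names chosen for the illustration.

```d
// Hypothetical value type with a method and an overloaded operator.
struct Point
{
    double x = 0, y = 0;

    // An ordinary method on a struct.
    double lengthSquared() const
    {
        return x * x + y * y;
    }

    // Overloads the binary + operator for two points.
    Point opBinary(string op)(Point rhs) const if (op == "+")
    {
        return Point(x + rhs.x, y + rhs.y);
    }
}

// Points are passed and returned by value; nothing here touches the gc heap.
Point midpoint(Point a, Point b)
{
    auto sum = a + b;
    return Point(sum.x / 2, sum.y / 2);
}
```

Because `Point` is a struct, temporaries such as `sum` live on the stack, which is the efficiency the original question was after.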
Feb 22 2003
parent reply Farmer <itsFarmer. freenet.de> writes:
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in 
news:b37dbj$lbe$1 digitaldaemon.com: 

 <davids argia.net> wrote in message
 news:b3779f$fu5$1 digitaldaemon.com...
 In the design overview, I notice that the Java concept of allocating all class
 objects to the heap is adopted. This has some definite advantages: doing things
 on the heap by default means that the safest behavior polymorphically is the
 standard one, and it avoids unnecessary copying.
For smallish classes it creates unnecessary allocation. You can always pass structs by reference. What it does is make it safer to share pointers to objects. It doesn't completely solve any safety or robustness problems, and doesn't make things much easier for itself say for implementing a much more efficient form of garbage collection. Still the same old malloc with a spiffy new generic garbage collector bolted on almost as an afterthought.
IMO having classes on the gc heap and structs on stack (you cannot bring them on the gc heap, only the malloc heap) makes life much easier:
-virtual methods are always dispatched dynamically
-no risk of slicing off data when passing parameters by value
-no risk of slicing off data when creating arrays or hashtables for objects
-objects can be returned efficiently by functions
-behaviour of assignments is consistent and predictable for maintainers
-no risk of returning invalid pointers to objects on the stack
Actually since DMD 0.56 you can allocate objects on the stack, but I did not check that out (yet). But I guess this feature should be removed from D. It looks unsafe and incomplete, e.g. I want objects embedded within other objects without costly indirection. Anyway, a compiler could automatically detect when it is safe to put objects onto the stack. I read a paper "Marmot: An Optimizing Compiler for Java" about a native Java compiler that did this.
 However, I don't really understand why this hasn't simply been made the default
 as opposed to the only option, and I also don't understand why primitives are
 made to behave so differently from class objects. This introduces some problems:
 - There's no template-safe way to make a copy of something.
Maybe there is one, I think you can specialize the templates to simple types and classes (types derived from Object), but a bug in the compiler currently prevents this.
 Sure, you could have
 a standard function 'clone()' or somesuch, but having it done by the language is
 better since it's official, and it could be implemented by D itself in most
 cases.
Most objects cannot simply be copied; it requires extra thought by the programmer to make things work right. If you want to make shallow copies or deep copies of your objects you can create a clone method. Implementation is straightforward:
1) Move all member variables that should be shallow-copied into a struct.
2) Use this struct as the member variable of the class.
3) Create a new instance in the clone method and assign the struct member to the new instance.
4) If some members (e.g. some pointers) require a deep copy, you must copy each of them and assign these copies to the new instance.
The code generated by DMD is as efficient as the implicit copy constructor of C++ compilers, as far as I can judge.
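
The four steps above might look roughly like this. It is only a sketch in present-day D syntax; `Polygon`, `Plain` and the accessor names are hypothetical, and a dynamic array stands in for "a member that needs a deep copy".

```d
// Hypothetical class following the clone recipe described above.
class Polygon
{
    // Steps 1 and 2: members that can be copied shallowly live in a struct member.
    private struct Plain
    {
        int id;
        double scale;
    }
    private Plain plain;

    // A member that must be deep-copied (step 4).
    private int[] vertices;

    this(int id, double scale, int[] vertices)
    {
        plain = Plain(id, scale);
        this.vertices = vertices.dup;
    }

    Polygon clone()
    {
        auto copy = new Polygon(0, 0, null);
        copy.plain = plain;              // step 3: one struct assignment copies all plain members
        copy.vertices = vertices.dup;    // step 4: deep copy of the array contents
        return copy;
    }

    int identifier() const { return plain.id; }
    int vertex(size_t i) const { return vertices[i]; }
    void setVertex(size_t i, int v) { vertices[i] = v; }
}
```

After `auto b = a.clone();`, changing a vertex of `b` leaves `a` untouched, while `b = a;` alone would have made both names refer to the same object.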
 
 Agreed.  There are different forms of operator == (and === (that's 3 
 ='s) ) which compare by value or by reference, but it's unclear what 
 they do if applied to the wrong kind of type (what does === do on a 
 value type?)  But operator = is used to copy by value *and* for copy 
 by reference.  Yes it can be confusing.  There is no standard way of 
 getting what most would call a deep copy of a class object. 
We could use a new operator := for making copywise assignment and operator = for copy-only-reference semantics. But I think that even operator === should be removed. It will cause just too many bugs for too many people.
Object o;
if (o==null) //BUG must be if (o===null)
{}
At least if the D compiler would issue a warning here, it would help in most cases.
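
To make the difference between the two comparisons concrete, here is a sketch in present-day D, where the identity operator written `===` in this thread is spelled `is`. `Label` and `isUnset` are hypothetical names used only for the illustration.

```d
// Hypothetical class with a value-based equality test.
class Label
{
    string text;
    this(string t) { text = t; }

    // Used by ==: compares contents, not identity.
    override bool opEquals(Object other)
    {
        auto rhs = cast(Label) other;
        return rhs !is null && text == rhs.text;
    }
}

// Identity test against null; never calls a member through the reference.
bool isUnset(Object o)
{
    return o is null;
}
```

Two distinct `Label` objects with the same text compare equal with `==` but are not identical under `is`, and a null check written with the identity operator is safe for an uninitialised reference.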
 
 - Similarly, there's no template-safe way to make a reference to something. This
 is because the behavior of class objects differs from primitives; if you do A =
 B in a template where A and B are of the parameterized type, this will produce
 different results depending on whether it's a primitive or a class object,
 and there's no safe way to check except to specialize for each
 individual primitive type the template supports.
I'd personally prefer more integration of types myself.
 - You cannot make a class object that behaves just like a primitive object in
 terms of assignment, since you cannot overload operator= (or is this just an
 accidental omission from the overloadable ops list?)
D assumes that assignment isn't something you would want to overload. I don't really agree.
 - Things using Object cannot work with primitives. 
attracting Visual Basic people. Primitive types and Object types are made so different for performance reasons. D could make everything behave as an Object, but not the other way round. wrapper-object for the primitive value, generated by the compiler behind your back. I guess that if templates were part of the initial release of mainly for container classes.
 A good example of a case for class objects needing to have
 primitive-like semantics is a math library with types for rectangles,
 points, and so on. These
 types are not suitable for D structs, since they need methods, but they are not
 suitable for D classes either, since they're small enough to be
 considered primitive-like and to be noticeably more efficient on the
 stack, and since a
 user of this library could easily be using them all over the place as temporary
 variables in their own algorithms.
D structs can have methods and overloaded operators. They just can't have constructors and destructors. In fact, D structs are intended for just such primitives as points and rectangles.
I guess constructors and destructors could be added to structs without performance loss, as long as any copy-constructor or assignment operator is disallowed. But why is Walter against them? Just because people would ask for copy-constructors as soon as he adds constructors? That does not seem rational to me.
Just some thoughts.
Farmer
Feb 26 2003
next sibling parent Farmer <itsFarmer. freenet.de> writes:
I wrote
 IMO having classes on the gc heap and structs on stack (you cannot
 bring them on the gc heap, only the malloc heap) makes life much
 easier:
 -virtual methods are always dispatched dynamically
 -no risk of slicing off data when passing parameters by value
 -no risk of slicing off data when creating arrays or hashtables for
 objects
 -objects can be returned efficiently by functions
 -behaviour of assignments is consistent and predictable for maintainers
 -no risk of returning invalid pointers to objects on the stack
The primary problem with arrays (or hashtables, depending on the implementation) is not slicing off data, but corrupt data when accessing array elements polymorphically. But for D this would currently be no issue, as covariance of arrays is not allowed.
Just a correction.
Farmer
Mar 07 2003
prev sibling parent reply David Simon <David_member pathlink.com> writes:
 If you want to make shallow copies or deep copies of your objects you can
 create a clone method. Implementation is straightforward:
 1) Move all member variables that should be shallow-copied into a struct.
 2) Use this struct as the member variable of the class.
 3) Create a new instance in the clone method and assign the struct member to
 the new instance.
 4) If some members (e.g. some pointers) require a deep copy, you must copy
 each of them and assign these copies to the new instance.
The approach you have above has some problems; it doesn't work with templates unless they understand your convention for naming the clone function, and accessing the members of the substruct requires a bit. Another issue is that it's just sort of using a property of struct (copying) as a property of class by using a struct; but why not just make this a property of classes in the first place?

The biggest problem I had was not whether or not things are placed on the heap or the stack, but that class objects have different semantics for the same operation as non-class objects not on the gc heap. This just seems inconsistent, and it forces any coder writing templates to very carefully avoid any use of the '=' operator that assumes a copy was or was not made, which is pretty much every use that I can think of...

This has no chance of happening so late in the design phase, but what might have achieved this would be to use copy-on-write for class objects, and to give them = syntax for copying. Then they'd be consistent with regular stack objects in terms of assignment, but you'd still get polymorphic behavior naturally in containers and arguments, avoid slicing, and avoid unnecessary copies.

The best way to do class copying automatically is probably to differentiate owning and non-owning pointers (this could also make auto-serialization systems easier). This is what a lot of smart pointer systems in C++ do, including Boost's and the standard library auto_ptr. The difference between deep copy and shallow copy isn't important; the point of making a copy is to create an entirely separate object from the original, which pretty much implies a deep copy (or something that acts like one, like a copy-on-write).
Mar 08 2003
parent Farmer <itsFarmer. freenet.de> writes:
Hi,

comments are embedded.

David Simon <David_member pathlink.com> wrote in
news:b4c929$312g$1 digitaldaemon.com: 

 If you want to make shallow copies or deep copies of your objects you
 can create a clone method. Implementation is straightforward:
 1) Move all member variables that should be shallow-copied into a struct.
 2) Use this struct as the member variable of the class.
 3) Create a new instance in the clone method and assign the struct
 member to the new instance.
 4) If some members (e.g. some pointers) require a deep copy, you must
 copy each of them and assign these copies to the new instance.
The approach you have above has some problems; it doesn't work with templates unless they understand your convention for naming the clone function, and accessing the members of the substruct requires a bit. Another issue is that it's just sort of using a property of struct (copying) as a property of class by using a struct; but why not just make this a property of classes in the first place?
One design goal of D is to keep the standard rather simple. I only see two minor problems:
-More typing for programmers (solution: wasting less time in pointless meetings, so one *has* plenty of time for typing D code ;-).
-Templates must understand a clonable convention: That's possible with interfaces in D.
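
As a rough sketch of the interface idea, in present-day D syntax: `Clonable`, `Circle` and `duplicate` are hypothetical names, not part of any D library.

```d
// Hypothetical convention: anything that can copy itself implements Clonable.
interface Clonable
{
    Clonable clone();
}

class Circle : Clonable
{
    double radius;
    this(double r) { radius = r; }

    // Returns a new, independent Circle with the same radius.
    Clonable clone()
    {
        return new Circle(radius);
    }
}

// A template that relies only on the interface, not on any particular class.
T duplicate(T)(T original) if (is(T : Clonable))
{
    return cast(T) original.clone();
}
```

A template written this way accepts any class that implements `Clonable` and rejects, at compile time, types that do not follow the convention.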
 The biggest problem I had was not whether or not things are placed on
 the heap or the stack, but that class objects have different semantics
 for the same operation as non-class objects not on the gc heap. This
 just seems inconsistent [...]
That's the best point about it. Class objects are objects, everything else is not an object. No superficial unification of language types. An int is not an object; it's more likely to be a register in the CPU ;-). The programming model that is induced by D's semantics for objects is likely to
, and it forces any coder writing templates to
 very carefully avoid any use of the '=' operator that assumes a copy
 was or was not made, which is pretty much every use that I can think
 of... 
Could you please post just one of them? I wanted to write a template that unifies the copying behaviour of the template parameter. Other templates could use this template if they required the unified behaviour for their parameters. But I did not succeed for two reasons:
-due to a bug in the compiler, I could not specialize for objects. It's fixed now.
-I was not able to think of an example that requires a unified behaviour.
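
A template of the kind described here could be sketched as follows in present-day D, using `static if` in place of explicit specializations. `Matrix` and `copyOf` are hypothetical names, and the sketch assumes classes follow a `clone()` naming convention.

```d
// Hypothetical class that follows a clone() convention.
class Matrix
{
    double[] cells;
    this(double[] c) { cells = c.dup; }

    Matrix clone() { return new Matrix(cells); }
}

// Gives "make an independent copy" one spelling for classes and plain values.
// Slices and raw pointers passed directly would still alias; this sketch
// does not handle them.
T copyOf(T)(T source)
{
    static if (is(T == class))
        return source.clone();   // assumes the class provides clone()
    else
        return source;           // ints, structs etc. were already copied when passed in
}
```

Other templates could then call `copyOf(x)` instead of relying on what `=` happens to mean for their parameter type.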
 
 This has no chance of happening so late in the design phase, but what
 might've done this would be to use copy-on-write for class objects,
 and to give them = syntax for copying. Then, they'd be consistent
 regular stack objects in terms of assignment, but you'd still get
 polymorphic behavior naturally in containers and arguments, avoid
 slicing, and avoid unnecessary copies. 
 
 The best way to do class copying automatically is probably to
 differentiate owning and non-owning pointers (this could also make
 auto-serialization systems easier as well). This is what a lot of
 smart pointer systems in C++ do, including Boost's and the standard
 library auto_ptr. The difference between deep copy and shallow copy
 isn't important; the point of making a copy is to create an entirely
 separate object from the original, which pretty much implies a deep 
 copy (or something that acts like one, like a copy-on-write).
 
I don't think that the concepts you mentioned here would be useful for D. D's GC deals with the issues solved by smart pointers and COW classes in C++ in a much simpler way.
Farmer.
Mar 08 2003