digitalmars.D - Writing const-correct code in D
- Kevin Bealer (164/164) Mar 08 2006 Since people want the benefits of const, I'm showing a way to get them
- Brad Roberts (13/26) Mar 08 2006 I must have missed something somewhere along the way.. when did copying
- Kevin Bealer (40/66) Mar 08 2006 Yeah - this is a tradeoff, but as I understand it, the copy constructor ...
- xs0 (21/211) Mar 09 2006 OK, while it might work, your approach has several problems:
- Kevin Bealer (61/272) Mar 09 2006 True. This could be done with a wrapper too, particularly with IFTI, bu...
Since people want the benefits of const, I'm showing a way to get them by following coding conventions. This requires *no* changes to D. Also, this is not full "C++ const", only parameter passing and const methods, which seem to be the most popular parts of the const idea. It seems like it should require more syntax than C++, but it only takes a small amount.

When working with types like "int", use "in" - const is not too much of an issue here. The same is true for struct: it gets copied in, which is fine for small structs. For larger structs, you might want to pass by "in *", i.e. use "in Foo *". You can modify this technique to use struct; for that, see the last item in the numbered list at the end.

For classes, the issue is that the pointer will not be modified with the "in" convention, but the values in the class may be.

: // "Problem" code
:
: class Bar {...}
:
: class Foo {
:     this(Bar b)
:     { x1 = b; }
:
:     this(const_Foo b)
:     {
:         x1 = b.x1.dup;
:     }
:
:     // Modifies this Foo
:     void changeBar(Bar b2)
:     { x1 = b2; }
:
:     // Does not modify this Foo
:     int doesWork() {...}
:
: protected:
:     Bar x1;
: };
:
: // NOTE: changes foo1
: void barfoo(in Foo foo1, in Bar b)
: {
:     foo1.changeBar(b);
: }

We'd like barfoo() not to modify foo1 - we want to guarantee it. To deal with this, you can write a "const interface" for your class. I recommend the prefix "const_" so that it looks a little like the C++ version.

This interface definition is quite simple to do. Note that a Foo is-a const_Foo, and passing it to a const_Foo interface is legal. But trying to modify it through that interface will not compile, because const_Foo has no modifying methods.

NOTE: You don't need any extra method code, except constructors and optionally the "clone()" method. What we are doing is SPLITTING the personality of Foo into two halves - read and write.

: // The read stuff
:
: class const_Foo {
:     this(Bar b)
:     { x1 = b; }
:
:     this(Foo b)
:     { x1 = b.x1.dup; }
:
:     // Does not modify Foo
:     int doesWork() {...}
:
:     Foo clone() // how to un-const (optional)
:     {
:         return new Foo(this); // use const->nonconst ctor
:     }
:
: protected:
:     Bar x1;
: };
:
: // The write stuff - can also do read stuff of course.
:
: class Foo : const_Foo {
:     this(Bar b)
:     {
:         super(b); // forward to the const_Foo constructor
:     }
:
:     this(const_Foo b) // const->nonconst ctor
:     {
:         super(b.x1.dup); // copy the Bar so the two objects don't share it
:     }
:
:     void changeBar(Bar b2)
:     {
:         x1 = b2;
:     }
: };
:
: // Can only call this with non-const Foo.
: void barfoo(in Foo foo1, in Bar b)
: {
:     foo1.changeBar(b);
: }
:
: // Can call this with either const_Foo or Foo.
: void barfaa(in const_Foo foo1)
: {
:     int q = foo1.doesWork();
: }

1. In C++, you need to make the same division into const and non-const, since every method must be labeled as "const" or not labeled (and thus unusable in a const object). So there is no extra "design burden".

2. You can easily change any method's constness by cut/pasting it to the other class. All implementation code/data is shared.

3. Relationships are enforced! If doesWork() calls changeBar(), the compiler will complain.

4. The class author decides whether "clone()" and the other special methods are written at all - so if "Bar" is uncloneable for some reason (e.g. maybe it's a File), don't write clone() for Foo, or find a way to get around copying it. This work needs to be done in C++ too.

5. Users of const_Foo don't need to know what the editable Foo does. Their code can't break unless the const_ side is changed. It's now very hard to miss the distinction between const/non-const, which is easy to miss in C++ when writing methods, for example.

6. Easy to use as a Copy-On-Write design: If you need to store an object, and don't know if it is const or not, use a const_Foo reference. In the event you need to modify it, you can test whether it is const with a dynamic cast. If it is, clone it first!

7. In C++, you can also define distinct const and non-const methods for a class. This happens automatically here - the non-const method (if one exists) just overrides the const one.

8. Finally, for OOD/OOP purists: Although the non-const version is not really "is-a" const, the relationship still holds once you realize that const is really a "subtracting" adjective - we could use the terms readable and read/writeable, where it is easy to see that a read/writeable thing is-a readable thing.

9. You can have "in Foo" parameters and "out const_Foo" without it being a contradiction. The first means "I don't want to change what it points to -- something the caller might also want to know -- but I might modify it". The second is a way to return something. [The semantics of input and output (argument and return value) are normally different in OO programming, since one is covariant and the other contravariant. (This is true in D, right?)]

10. For structs you can do a similar thing:

: // read-write version
: struct X {
:     int opIndex(int i) { ... }
:     int opIndexAssign(int value, int i) { ... }
:
: private:
:     int[1024] data_;
: };
:
: // read-only version
: struct const_X {
:     int opIndex(int i) { return impl[i]; }
:
:     X * clone()
:     {
:         X * copy = new X; // structs have no .dup, so copy by assignment
:         *copy = impl;
:         return copy;
:     }
:
: private:
:     X impl;
: };

If people like this, maybe something along these lines would be useful for the C++ programmer intro on the D site? I can make a more thorough version if so. If people use this technique, it might be good for them to follow the same style, i.e. method names.

Kevin
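The copy-on-write idea from item 6 can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain int stands in for Bar, the helper name `writable` is made up for this example, and the parameters are written without "in" because current D compilers treat "in" as const, which would reject the calls shown in the post.

```d
import std.stdio : writeln;

// Read side. A plain int stands in for the post's Bar (assumption).
class const_Foo
{
    this(int value) { x1 = value; }

    // Read-only operation: available on both const_Foo and Foo.
    int doesWork() { return x1 * 2; }

    // Explicit "un-const": builds an independent, writable copy.
    Foo clone() { return new Foo(x1); }

protected:
    int x1;
}

// Write side: inherits all the read operations.
class Foo : const_Foo
{
    this(int value) { super(value); }

    // Mutating operation: only reachable through a Foo reference.
    void changeBar(int value) { x1 = value; }
}

// Hypothetical copy-on-write helper: returns a writable Foo, cloning
// only when the stored object is not actually a Foo.
Foo writable(const_Foo stored)
{
    Foo f = cast(Foo) stored;   // dynamic cast; null if not a Foo
    return f !is null ? f : stored.clone();
}

void main()
{
    const_Foo readOnly = new const_Foo(1);
    const_Foo editable = new Foo(2);

    Foo a = writable(readOnly);  // a fresh clone
    Foo b = writable(editable);  // the same object as editable
    a.changeBar(10);
    b.changeBar(20);

    writeln(readOnly.doesWork()); // 2: the original was not touched
    writeln(editable.doesWork()); // 40: modified in place
}
```

The clone is made only on the path where the dynamic cast fails, so objects that were created writable are never copied.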
Mar 08 2006
On Thu, 9 Mar 2006, Kevin Bealer wrote:

> The same is true for struct, it gets copied in, which is fine for small structs. For larger structs, you might want to pass by "in *", i.e. use "in Foo *". You can modify this technique to use struct, for that see the last item in the numbered list at the end. For classes, the issue is that the pointer will not be modified with the "in" convention, but the values in the class may be.
>
> : this(const_Foo b)
> : {
> :     x1 = b.x1.dup;
> : }

I must have missed something somewhere along the way.. when did copying imply const? To me, something that's const can't be modified. That doesn't mean just by the caller, but also by the callee. It's a mechanism for saying "this object shouldn't be changed". Const by duplication doesn't help with the last part and makes it entirely probable that code will at some point be changed to modify parts of the passed-in data with the expectation that those changes actually propagate back up to the caller.

Sorry, const by dup is in some ways even worse than not having const. I see how it solves some use cases though, so it's not totally worse. :)

Later,
Brad
Mar 08 2006
In article <Pine.LNX.4.64.0603081855530.30259 bellevue.puremagic.com>, Brad Roberts says...

> I must have missed something somewhere along the way.. when did copying imply const? To me, something that's const can't be modified. That doesn't mean just to the caller, but also to the callee. It's a mechanism for saying "this object shouldn't be changed". Const by duplication doesn't help with the last part and make it entirely probable that code will at some point be change to modify parts of the passed in data with the expectation that those changes actually occur on up through to the caller.
>
> Sorry, const by dup is in some ways even worse than not having const. I see how it solves some usecases though, so it's not totally worse. :)

Yeah - this is a tradeoff, but as I understand it, the copy constructor in D (unlike C++) can't be used for automatic conversions. So the person receiving a const_Foo has to do this:

: void dofoo(in const_Foo x)
: {
:     Foo y = new Foo(x); // explicit copy of value(s) from x
:     ...
: }

..in order to get a new one. Now, normally one would not expect y to propagate changes back to x, if the syntax looks like the above, right?

The incorrect uses below won't even compile:

: void doAAA(in Foo x)
: {
:     x.modifyStuff(); // okay here
: }
:
: void doBBB(in const_Foo x)
: {
:     x.modifyStuff(); // failure here - const_Foo doesn't have this method
: }
:
: const_Foo bar;
:
: doAAA(bar); // error: can't convert const_Foo to Foo
: doBBB(bar); // okay here

So -- both are caught at *compile* time. No duplication unless requested explicitly.

The clone() method and clone() constructors are designed to do as deep a copy as necessary, which means they need to be user defined. That's why all the proposals for deep .dup don't work - it requires developer input, the compiler doesn't know enough.

If you wanted to really prevent modification or copy, you can just omit that method, and not provide a way to go from const_Foo to Foo. I'm proposing this as the obvious standard way to de-const -- with a constructor -- for cases where you want that behavior.

My thinking is that we could set up rules for people who want to do const, maybe because they have a huge C++ project they are rewriting in D, and it uses const in complex ways. Maybe because they just like the const facility.

Kevin
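The compile-time claims above can be checked mechanically. The following is a minimal sketch, assuming a current D compiler and using an int payload in place of real data; `static assert` with `__traits(compiles, ...)` records which calls the compiler accepts, and the parameters are left unqualified because "in" now implies const.

```d
class const_Foo
{
    this(int v) { value = v; }
    int doesWork() { return value; }
protected:
    int value;
}

class Foo : const_Foo
{
    this(int v) { super(v); }

    // Explicit copy from the read-only side (the "de-const" constructor).
    this(const_Foo other) { super(other.doesWork()); }

    void modifyStuff() { value += 1; }
}

// A Foo converts to const_Foo implicitly, never the other way round.
static assert( is(Foo : const_Foo));
static assert(!is(const_Foo : Foo));

// Calling the mutator through a const_Foo reference does not compile.
static assert( __traits(compiles, (Foo x)       { x.modifyStuff(); }));
static assert(!__traits(compiles, (const_Foo x) { x.modifyStuff(); }));

void main()
{
    const_Foo x = new const_Foo(5);
    Foo y = new Foo(x);     // explicit copy of the value from x
    y.modifyStuff();        // changes y only

    assert(x.doesWork() == 5);
    assert(y.doesWork() == 6);
}
```

Because the copy is spelled out as `new Foo(x)`, nothing in the call site suggests that changes to y flow back to x.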
Mar 08 2006
OK, while it might work, your approach has several problems:

- applicability
  - there is no support for arrays and pointers to primitive types; especially arrays are a problem
- coding efficiency
  - maintenance of an additional class is bad, having to write a wrapper for structs is even worse
  - while arbitrarily flexible, the approach is also error-prone, like all manual methods
- runtime efficiency
  - say you have a struct in your object; you can't return a readonly pointer to it, so you either have to heap-allocate a new struct, or make the wrapper contain a pointer, incurring double dereferencing
  - if you do use a pointer, you have to heap-allocate the wrapped struct, otherwise you can't pass the read-only version around freely
  - two classes means two vtbls, two TypeInfos, ...

Any thoughts on that? :)

xs0
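To make the runtime-efficiency point concrete, here is a rough sketch of the "wrapper contains a pointer" variant. The `Owner` class and its `view` and `set` methods are invented for this illustration, and the layout of X follows the struct example from the original post; it is one possible arrangement, not a recommendation.

```d
// Read-write struct, as in the original post.
struct X
{
    int opIndex(size_t i) { return data_[i]; }
    void opIndexAssign(int value, size_t i) { data_[i] = value; }
private:
    int[1024] data_;
}

// Read-only view that holds a pointer instead of a copy.
struct const_X
{
    this(X* target) { impl = target; }

    // Each read goes through the pointer: the extra dereference.
    int opIndex(size_t i) { return (*impl)[i]; }
private:
    X* impl;
}

// Hypothetical owner of the struct.
class Owner
{
    // Heap-allocate the payload so views stay valid when passed around.
    this() { payload = new X; }

    const_X view() { return const_X(payload); }
    void set(size_t i, int v) { (*payload)[i] = v; }
private:
    X* payload;
}

void main()
{
    auto o = new Owner;
    o.set(3, 42);

    const_X v = o.view();
    assert(v[3] == 42);

    o.set(3, 7);
    assert(v[3] == 7);   // the view sees later writes: nothing was copied

    // The view offers no way to assign an element.
    static assert(!__traits(compiles, v[3] = 1));
}
```

The view is cheap to pass by value (one pointer), at the cost of the heap allocation and the extra indirection described in the list above.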
Mar 09 2006
In article <dupfcr$erh$1 digitaldaemon.com>, xs0 says...

> OK, while it might work, your approach has several problems:
>
> - applicability
>   - there is no support for arrays and pointers to primitive types; especially arrays are a problem

True. This could be done with a wrapper too, particularly with IFTI, but coding efficiency suffers a little.

> - coding efficiency
>   - maintenance of an additional class is bad, having to write a wrapper for structs is even worse

C++ requires you to maintain two personalities in one class - you have to make the same decisions, and can make most of the same errors.

>   - while arbitrarily flexible, the approach is also error-prone, like all manual methods

If you don't write the special methods, there is very little extra work to do, really just adding "class A : B {" and "};" to your code. It should not be much more error prone. If you do write the special methods (clone, special constructors) then this is true. So I agree, partially... see my comments at the end.

> - runtime efficiency
>   - say you have a struct in your object; you can't return a readonly pointer to it, so you either have to heap-allocate a new struct, or make the wrapper contain a pointer, incurring double dereferencing

But, by heap allocating it, you avoid the cost of copying it during the return. The way I see it, you have three options:

1. Return by value - in which case you probably don't need const, unless the object contains stuff that you want to protect.

2. Wrap in a readonly struct (no pointers) and return this by value. The copy into the readonly struct costs efficiency, about the same as returning the original by value.

3. Heap allocations... avoids future copies (good), requires heap allocation (bad).

So, a tradeoff. I admit though, the approach doesn't work as well for struct as for class.

>   - if you do use a pointer, you have to heap-allocate the wrapped struct, otherwise you can't pass the read-only version around freely

I think you can wrap in a struct and pass the wrapper by value, which copies the internal struct. The value is copied as with any struct, but the wrapper doesn't have the non-const methods, right?
A bigger annoyance is the fact that you can't write one function signature that takes either type, unless it takes a template parameter. I.e. structs don't have inheritance, so there's no Foo -> const_Foo implicit conversion.

>   - two classes means two vtbls, two TypeInfos, ...

Small potatoes, and not entirely harmful (see below).

> Any thoughts on that? :)
>
> xs0

You're right - this approach has definite limits and is not free. However, it allows some flexibility that C++ (for instance) does not.

: const_Foo f = get_Object();
:
: Foo f_mod = cast(Foo) f;
:
: if (f_mod !is null) {
:     // modify f, only if allowed
: }

With this technique, you can create real "const_Foo" objects (i.e. they were never a Foo) and test which kind you have at runtime. Normally all "Foo" objects are created as "Foo", but this approach enables you to build objects that are designed to never change - someone has to do the clone operation to get a modifiable version.

This could be really useful in (for instance) cache designs - get an object from the cache and pass it anywhere; since the const_Foo class uses "immutable" semantics (like a Java String), this is safe to do. It requires classes that can be built entirely in the constructor, but many people write code like this now.

Similarly, I could create a Baz object that has all the readonly methods of Foo, and derive it from const_Foo. Now you can treat it as a const_Foo, but cannot cast it to a Foo (since it isn't one).

Why would I do this? Imagine I have a string class Foo - I could create a const_Foo derived class where the actual data was a memory mapped file. It has all the *readonly* properties of a string, but you can't resize it.

Or a readonly version of a database interface, which would dynamically provide DB information for user queries, but would not allow storage into the database. I could use this database interface as a front end for any number of data sources that are not really databases.
You can't do any of these in C++, since C++'s const acts like a template - all the work happens at compile time. Having an extra vtbl allows you to do runtime tricks too - and still have compile-time static type correctness for code that just uses Foo without thinking about const.

Kevin
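A small sketch of the "read-only sibling" idea described above, for illustration only. The class names `const_Text`, `Text` and `FixedText` and the helper `tryAppend` are invented here; `FixedText` simply wraps a fixed buffer where the post imagines a memory-mapped file.

```d
// Read-only personality: everything a reader may do with the text.
class const_Text
{
    this(string s) { data = s.dup; }
    size_t length() { return data.length; }
    char opIndex(size_t i) { return data[i]; }
protected:
    char[] data;
}

// Read/write personality.
class Text : const_Text
{
    this(string s) { super(s); }
    void append(string s) { data ~= s; }
}

// A reader-only sibling: it is a const_Text but was never a Text,
// so no cast can turn it into something resizable.
class FixedText : const_Text
{
    this(string s) { super(s); }
}

// Appends only when the object really is a Text; reports whether it did.
bool tryAppend(const_Text t, string s)
{
    if (auto w = cast(Text) t)   // dynamic cast yields null for FixedText
    {
        w.append(s);
        return true;
    }
    return false;
}

void main()
{
    const_Text a = new Text("abc");
    const_Text b = new FixedText("abc");

    assert(tryAppend(a, "de") && a.length == 5);
    assert(!tryAppend(b, "de") && b.length == 3);
}
```

Code that only reads can take a const_Text and never care which of the three classes it was given; only code that wants to write has to ask at runtime.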
Mar 09 2006