digitalmars.D - Review of DIP49
- Andrei Alexandrescu (75/75) Feb 02 2014 Here are a few questions and comments:
- Timon Gehr (25/50) Feb 02 2014 I think it is a weakness of the 'inout' qualifier (or more generally,
- Andrei Alexandrescu (6/9) Feb 02 2014 A bit of history: indeed that was the motivation back in the day, but at...
- Walter Bright (2/2) Feb 02 2014 Linky:
- Walter Bright (4/6) Feb 02 2014 I agree with this so strongly that I feel unique expressions should get ...
- Kenji Hara (102/176) Feb 03 2014 First of all, thanks for your reviewing!
- Andrei Alexandrescu (20/37) Feb 04 2014 And thank you for your work and this reply.
- Timon Gehr (10/35) Feb 07 2014 How so?
Here are a few questions and comments:

* Why is there an immutable postblit? I wonder if there are cases in which that's needed. Most of the time it's fine to just memcpy() the data - it's immutable anyway so nothing will change! In fact there is one use case for immutable postblit, and that's refcounting (when copying data the reference count must be incremented). But that would be an increment via a pointer, which is not allowed in your proposal. (I think it is ok for the postblit proposal to not cover refcounting - we can build it into the language.)

* The inout case looks good, but again we need to have some good examples of when it would be necessary.

* What happens if an object defines both the mutable and the inout version of postblit? I assume the mutable one has priority when duplicating a mutable object. The DIP should clarify that.

* Const postblit is spelled "this(this) const" but that's misleading because it's really "this(whatever1 this) whatever2" with whatever1 and whatever2 being arbitrary qualifiers. We should probably find a more descriptive syntax for that. "this(this) auto" comes to mind. We can even add a contextual keyword such that the compiler recognizes "this(this) unique" without otherwise giving "unique" any special meaning.

* My main beef is with this constructor. I'll refer to it henceforth as "the unique constructor".

* Qualifiers obey a hierarchy, i.e. const(T) is a supertype of both T and immutable(T) as shown below with ASCII art.

        const(T)
          /\
         /  \
        T    immutable(T)

  That imparts structure over the mixed-qualifier constructors. However, that structure is missing from the unique constructor; all mixed-qualifier constructors are heaped into one. This makes the rules forced for certain constructors (e.g. constructing a const(T) from a T should be much less restricted than constructing a T from a const(T)). The proposed unique constructor is not sensitive to such distinctions.

* The "unique expression" definition is quite strong and we should refine it and reuse it in other contexts as well.

* The section "Overloading of qualified postblits" is great because it brings back the subtyping structure! It says: "If mutable postblit is defined, it is always used for the copies: (a) mutable to mutable; (b) mutable to const". Indeed that is correct because mutable to const is a copy to a supertype. To respect subtyping, we should also add (c) immutable to const; (d) const to const; (e) immutable to immutable. That way we have a simple rule: "this(this) is invoked for all upcasts obtained by qualifiers". Wonderful!

* If we go forward with the idea above, we only need to treat the remaining cases of downcasts and cross-casts across the subtyping hierarchy:
  (a) downcasts: const(T) -> T and const(T) -> immutable(T)
  (b) cross-casts: immutable(T) -> T and T -> immutable(T)

* The use of subtyping above would replace the elaborate rules in section "Overloading of qualified postblits". In fact they seem to agree 95% of the time. By the way there are some confusing negations e.g. "if mutable postblit is not defined, it will be used for the copies". I assume the "not" should be removed?

* So we're left with the following postblitting rules as the maximum:

    struct T {
        this(this);                  // all upcasts including identity
        this(const this);            // construct T from const(T)
        this(const this) immutable;  // construct immutable(T) from const(T)
        this(immutable this);        // construct T from immutable(T)
        this(this) immutable;        // construct immutable(T) from T
    }

  Some could be missing, some could be deduced, but this is the total set.

* Consider a conversion like "this(this) immutable" which constructs an immutable(T) from a T. This is tricky to typecheck because fields of T have a mutable type when first read and an immutable type after having been written to. That raises the question whether the entire notion of postblitting is too complicated for its own good. Should we leave it as is and go with classic C++-style copy construction in which source and destination are distinct objects? I think that would simplify both the language definition and its implementation.

* The section "Why 'const' postblit will called to copy arbitrary qualified object?" alludes to the subtyping relationship among qualifiers without stating it.

Andrei
Feb 02 2014
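The qualifier hierarchy drawn in the post above can be checked directly with compile-time assertions. This is only a sketch of how current D treats implicit qualifier conversions for a struct with a mutable indirection. The struct `T` and the helper `len` are made up for illustration and are not taken from the DIP.

```d
// Hypothetical struct with a mutable indirection, so qualifiers matter.
struct T { int[] data; }

// Upcasts: both T and immutable(T) convert implicitly to const(T).
static assert( is(T : const(T)));
static assert( is(immutable(T) : const(T)));

// Downcasts from const(T) are rejected.
static assert(!is(const(T) : T));
static assert(!is(const(T) : immutable(T)));

// Cross-casts between T and immutable(T) are rejected as well.
static assert(!is(immutable(T) : T));
static assert(!is(T : immutable(T)));

// A function taking the supertype accepts any of the three.
size_t len(const(T) t) { return t.data.length; }
```

The two accepted conversions are the "upcasts" of the post. The four rejected ones are the downcasts and cross-casts that a copy operation would have to handle explicitly.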
On 02/03/2014 12:54 AM, Andrei Alexandrescu wrote:
> * Const postblit is spelled "this(this) const" but that's misleading because it's really "this(whatever1 this) whatever2" with whatever1 and whatever2 being arbitrary qualifiers. We should probably find a more descriptive syntax for that. "this(this) auto" comes to mind. We can even add a contextual keyword such that the compiler recognizes "this(this) unique" without otherwise giving "unique" any special meaning.

I think it is a weakness of the 'inout' qualifier (or more generally, the polymorphic part of the type system) that this cannot be expressed in an orthogonal way. (It only allows one type constructor variable per scope.) Effectively, one would want something like:

    this(inout this) inout' { ... }

i.e. two unrelated wildcards. (Not an actual syntax proposal.)

> * So we're left with the following postblitting rules as the maximum:
>
>     struct T {
>         this(this);                  // all upcasts including identity
>         this(const this);            // construct T from const(T)
>         this(const this) immutable;  // construct immutable(T) from const(T)
>         this(immutable this);        // construct T from immutable(T)
>         this(this) immutable;        // construct immutable(T) from T
>     }
>
> Some could be missing, some could be deduced, but this is the total set.
> ...

We'd also want to consider inout.

> * Consider a conversion like "this(this) immutable" which constructs an immutable(T) from a T. This is tricky to typecheck because fields of T have a mutable type when first read and immutable type after having been written to.

The language needs to track initialized fields in constructors anyways, so I think that effort can be shared to some extent.

> That raises the question whether the entire notion of postblitting is too complicated for its own good. Should we leave it as is and go with classic C++-style copy construction in which source and destination are distinct objects? I think that would simplify both the language definition and its implementation.

If the source object can be obtained by ref, it also increases modelling power slightly. (e.g. initialize it on copy in order to have actual reference semantics for structs with default initialization.) This also has the benefit that one does not have to pay for postblits and destructors of fields one is going to reinitialize anyway.

The main drawback of copy-construction with regard to postblit is that (potentially less efficient?) boilerplate is required to copy all fields over manually, which I think was what motivated the concept.

Of course, copy construction does not address the need to copy from any qualifier to any qualifier, which would IMO rather be achieved by improvements to other parts of the type system anyway. (A weakness of DIP49 is that it is impossible to abstract out identical code snippets that use unique postblit at different type qualifiers.)
Feb 02 2014
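The remark about 'inout' allowing only one type constructor variable per scope can be seen on an ordinary function. In the sketch below, the helper `firstTwo` is a made-up example. The single wildcard ties the qualifier of the result to the qualifier of the argument, so there is no way to let source and target qualifiers vary independently.

```d
// One `inout` wildcard: the result carries whatever qualifier the argument had.
inout(int)[] firstTwo(inout(int)[] a)
{
    return a[0 .. 2];
}

// The wildcard resolves to the same qualifier on both sides of the call.
static assert(is(typeof(firstTwo((int[]).init)) == int[]));
static assert(is(typeof(firstTwo((const(int)[]).init)) == const(int)[]));
static assert(is(typeof(firstTwo((immutable(int)[]).init)) == immutable(int)[]));
```

A postblit that copies from any qualifier to any other would need a second, unrelated wildcard, which is what the hypothetical `inout'` in the post stands for.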
On 2/2/14, 4:28 PM, Timon Gehr wrote:
> The main drawback of copy-construction with regard to postblit is that (potentially less efficient?) boilerplate is required to copy all fields over manually, which I think was what motivated the concept.

A bit of history: indeed that was the motivation back in the day, but at that point in time D's introspection abilities were just emerging. We simply didn't anticipate that D (as it is today) makes it trivially easy to write a function that initializes all fields of an object from another.

Andrei
Feb 02 2014
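As a rough illustration of the introspection mentioned above, a generic field-by-field copy can be written with `.tupleof`. This is a minimal sketch, not code from the thread: the names `copyFields` and `Point` are hypothetical, and both arguments are taken as mutable so that qualifier conversions do not come into play.

```d
// Hypothetical example type.
struct Point
{
    int x;
    string name;
    double weight;
}

// Copies every field of `src` into `dst` by iterating over the field tuple.
void copyFields(S)(ref S dst, ref S src)
    if (is(S == struct))
{
    foreach (i, _; dst.tupleof)
        dst.tupleof[i] = src.tupleof[i];
}
```

The loop is unrolled at compile time, one assignment per field, so a user-written copy constructor could reuse such a helper instead of listing fields by hand.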
On 2/2/2014 3:54 PM, Andrei Alexandrescu wrote:
> * The "unique expression" definition is quite strong and we should refine it and reuse in other contexts as well.

I agree with this so strongly that I feel unique expressions should get their own, independent DIP, such as:

http://wiki.dlang.org/DIP29
Feb 02 2014
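For readers unfamiliar with the idea, D already applies one form of uniqueness reasoning: the result of a pure function whose parameters carry no mutable indirections cannot be aliased by anything else, so it converts implicitly to any qualifier. The sketch below shows that existing behaviour only. It is not the definition used in DIP49 or DIP29, and the function name `iota3` is made up.

```d
// Pure, and the only parameter is a plain int, so the returned array is unique.
int[] iota3(int start) pure
{
    return [start, start + 1, start + 2];
}

void demo()
{
    int[] m = iota3(1);             // used as mutable
    const(int)[] c = iota3(1);      // used as const
    immutable(int)[] i = iota3(1);  // implicitly converted to immutable
}
```

A unique postblit would rely on the same kind of guarantee: the value written to a field must not be reachable through any other reference.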
2014-02-03 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> Here are a few questions and comments:

First of all, thanks for your review!

> * Why is there an immutable postblit? I wonder if there are cases in which that's needed. Most of the time it's fine to just memcpy() the data - it's immutable anyway so nothing will change! In fact there is one use case for immutable postblit, and that's refcounting (when copying data the reference count must be incremented). But that would be increment via a pointer, which is not allowed in your proposal. (I think it is ok for the postblit proposal to not cover refcounting - we can make it built in the language.)

The immutable postblit "this(this) immutable" is necessary to support immutable(T) to immutable(T) copy. (That is impossible with "this(this)".) In the immutable postblit body, modifying referenced data is not allowed. But rebinding references is still allowed, because the first assignment to a field is treated as field initialization.

    struct S {
        int[] arr;
        this(this) immutable {
            //arr[0] = 10;   // NG, but...
            arr = [1,2,3];   // OK, because this is initialization of the 'arr' field.
        }
    }

> * The inout case looks good, but again we need to have some good examples on when it would be necessary.

If the mutable postblit and the immutable postblit have the same code, you can merge the two into one inout postblit.

    struct S {
        int[] arr;
        this(this) { arr = arr.dup; }
        this(this) immutable { arr = arr.idup; }

        // you can merge the two postblits above into:
        this(this) inout { arr = arr.dup; }
        // arr.dup makes unique data, so implicit conversion to inout(int[]) is allowed.
    }

> * What happens if an object defines both the mutable and the inout version of postblit? I assume the mutable has priority when duplicating mutable object. The DIP should clarify that.

It is clarified in the "Overloading of qualified postblits" section.

If immutable postblit is defined, it is always used for the copies:
- immutable to const
- immutable to immutable

This priority order is defined based on the following rule:
- If the source is mutable or immutable, the most specialized postblit (mutable/immutable postblit) will be used, if it exists.

> * Const postblit is spelled "this(this) const" but that's misleading because it's really "this(whatever1 this) whatever2" with whatever1 and whatever2 being arbitrary qualifiers. We should probably find a more descriptive syntax for that. "this(this) auto" comes to mind. We can even add a contextual keyword such that the compiler recognizes "this(this) unique" without otherwise giving "unique" any special meaning.

I have deliberately avoided a "contextual keyword" because it does not currently exist in the D grammar. But unfortunately, using an existing keyword for the new meaning seems to just cause confusion. I still don't have a strong opinion on the syntax ...

> * My main beef is with this constructor. I'll refer to it henceforth as "the unique constructor".
>
> * Qualifiers obey a hierarchy, i.e. const(T) is a supertype of both T and immutable(T) as shown below with ASCII art.
>
>         const(T)
>           /\
>          /  \
>         T    immutable(T)
>
> That imparts structure over the mixed-qualifier constructors. However, that structure is missing from the unique constructor; all mixed-qualifier constructors are heaped into one. This makes the rules forced for certain constructors (e.g. constructing a const(T) from a T should be much less restricted than constructing a T from a const(T)). The proposed unique constructor is not sensitive to such distinctions.

??? A unique constructor is always less specialized than any other constructor (mutable, immutable, and inout). The specialization order is:

    mutable == immutable > inout > unique postblit

So, by overloading postblits, you can control copy cost as you expect. For example:

    struct Array(T) {
        T[] data;
        this(this) unique { data = data.dup; }
    }

Array!T allows copying objects between arbitrary qualifiers. But it also needs data duplication. If you want to share data where possible, overload the postblit as follows.

    struct Array(T) {
        T[] data;
        this(this) inout { /* nothing to do.*/; }
        this(this) unique { data = data.dup; }
    }

If the copy target has a weaker or equal qualifier than the copy source, the inout postblit is used, and it will always copy the object by just a bit blit (== memcpy). If you need to avoid data sharing between mutable objects, you can add a mutable postblit for that.

    struct Array(T) {
        T[] data;
        this(this) { data = data.dup; }
        this(this) inout { /* nothing to do.*/; }
        this(this) unique { data = data.dup; }
    }

> * The "unique expression" definition is quite strong and we should refine it and reuse in other contexts as well.

OK, I'll separate the concept into another DIP.

> * The section "Overloading of qualified postblits" is great because it brings back the subtyping structure! It says: "If mutable postblit is defined, it is always used for the copies: (a) mutable to mutable; (b) mutable to const". Indeed that is correct because mutable to const is a copy to a supertype. To respect subtyping, we should also add (c) immutable to const; (d) const to const; (e) immutable to immutable. That way we have a simple rule: "this(this) is invoked for all upcasts obtained by qualifiers". Wonderful!

Yes. The rule is defined to support both the simple case and the complicated case.

> * If we go forward with the idea above, we only need to treat the remaining cases of downcasts and cross-casts across the subtyping hierarchy:
>   (a) downcasts: const(T) -> T and const(T) -> immutable(T)
>   (b) cross-casts: immutable(T) -> T and T -> immutable(T)

The unique postblit exists to support exactly those.

> * The use of subtyping above would replace the elaborate rules in section "Overloading of qualified postblits". In fact they seem to agree 95% of the time.

The purpose of the section is to describe priority between postblits. It is simple:

    (mutable == immutable) > inout > unique postblit

> By the way there are some confusing negations e.g. "if mutable postblit is not defined, it will be used for the copies". I assume the "not" should be removed?

The inout postblit is less specialized than the (mutable|immutable) postblits. So the "not" is necessary.

> * So we're left with the following postblitting rules as the maximum:
>
>     struct T {
>         this(this);                  // all upcasts including identity
>         this(const this);            // construct T from const(T)
>         this(const this) immutable;  // construct immutable(T) from const(T)
>         this(immutable this);        // construct T from immutable(T)
>         this(this) immutable;        // construct immutable(T) from T
>     }
>
> Some could be missing, some could be deduced, but this is the total set.

The DIP *does not* support defining different operations for "T from immutable(T)" and "immutable(T) from T". They are always defined by the "unique postblit". I believe that supporting it will never make the language more useful. For example, if an object supports immutable to mutable copy, but disables mutable to mutable copy, what benefit would there be?

In D, the type qualifier is a very important feature (especially 'immutable'). And that's why we should keep the object copying rule (== the way postblits are defined) simple.

Note that, currently D has 9 qualifiers:
- (mutable)
- const
- inout
- inout const (added from 2.065. See issue 6930)
- shared
- shared const
- shared inout
- shared inout const (added from 2.065. See issue 6930)
- immutable

So, if we allow defining arbitrary postblit from A to B copy, you can define 9 * 9 = 81 postblits! I don't want to see such horrible D code.

> * Consider a conversion like "this(this) immutable" which constructs an immutable(T) from a T. This is tricky to typecheck because fields of T have a mutable type when first read and immutable type after having been written to. That raises the question whether the entire notion of postblitting is too complicated for its own good. Should we leave it as is and go with classic C++-style copy construction in which source and destination are distinct objects? I think that would simplify both the language definition and its implementation.

??? The immutable postblit cannot be used for immutable(T) to T copy. If you want to do that, you should define a unique postblit. There is no other way.

> * The section "Why 'const' postblit will called to copy arbitrary qualified object?" alludes to the subtyping relationship among qualifiers without stating it.

If we select the "this(this) unique" syntax for unique postblit, the section will be unnecessary.

Kenji Hara
Feb 03 2014
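The qualified forms used in the post above ("this(this) immutable", "this(this) inout", "this(this) unique") are the DIP's proposed syntax. Only the plain mutable postblit is shown here, as a minimal sketch of the mutable-to-mutable case from the Array example. The non-template struct `IntArray` is a simplification made up for this illustration.

```d
// Simplified, hypothetical variant of the Array!T example with a fixed element type.
struct IntArray
{
    int[] data;

    // Mutable postblit: runs after the bitwise copy and un-shares the payload.
    this(this)
    {
        data = data.dup;
    }
}
```

After `auto b = a;` the two objects own separate arrays, so writing through `b.data` leaves `a.data` untouched. This is the duplication cost that the proposed inout overload is meant to avoid when sharing is safe.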
On 2/3/14, 6:07 AM, Kenji Hara wrote:
> 2014-02-03 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
> > Here are a few questions and comments:
>
> First of all, thanks for your reviewing!

And thank you for your work and this reply. This part:

> Note that, currently D has 9 qualifiers:
> - (mutable)
> - const
> - inout
> - inout const (added from 2.065. See issue 6930)
> - shared
> - shared const
> - shared inout
> - shared inout const (added from 2.065. See issue 6930)
> - immutable
>
> So, if we allow defining arbitrary postblit from A to B copy, you can define 9 * 9 = 81 postblits! I don't want to see such horrible D code.

convinced me something has definitely gotten out of hand.

We must devise a radically simpler solution to object copying, one that aims straight at solving the fundamental problems we're facing. Far as I can tell these are it:

1. Reference counting for all objects, including const/immutable/shared. If this(this) can't be reasonably made to solve that, we will build it into the language and raise the question whether this(this) should exist at all. In particular, calling .dup inside this(this) is an antipattern. Objects must be cheap to copy. So any example using someArray.dup inside this(this) is an instant indication we're solving the wrong problem (and with a wrong solution as well).

2. Removing head const. You have added the nice conversion qualified(T[]) -> qualified(T)[] upon calling a function. We must look at ways to allow the user a similar generalization for type constructors.

If we figure this(this) doesn't help these two problems, its entire existence must be put in question.

Andrei
Feb 04 2014
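The conversion qualified(T[]) -> qualified(T)[] mentioned in point 2 can be observed through template argument deduction. The sketch below assumes current D behaviour. The names `deducedAs` and `Slice` are hypothetical, and `Slice` merely stands in for a user-defined type constructor that gets no such treatment.

```d
// Hypothetical user-defined wrapper around a slice.
struct Slice(E) { E[] payload; }

// Reports whether the type deduced for `arg` is exactly `Expected`.
bool deducedAs(Expected, T)(T arg)
{
    return is(T == Expected);
}
```

Passing a `const(int[])` deduces `T` as `const(int)[]`, with the head qualifier stripped. Passing a `const(Slice!int)` deduces `T` as `const(Slice!int)` unchanged, which is the gap the post asks to close for user types.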
On 02/04/2014 08:53 PM, Andrei Alexandrescu wrote:
> This part:
>
> > Note that, currently D has 9 qualifiers:
> > - (mutable)
> > - const
> > - inout
> > - inout const (added from 2.065. See issue 6930)
> > - shared
> > - shared const
> > - shared inout
> > - shared inout const (added from 2.065. See issue 6930)
> > - immutable
> >
> > So, if we allow defining arbitrary postblit from A to B copy, you can define 9 * 9 = 81 postblits! I don't want to see such horrible D code.
>
> convinced me something has definitely gotten out of hand.

How so?

1. We have _4_ qualifiers. (DMD appears to implement them as 9 separate qualifiers which has led to problems such as not all combinations being implemented from the start.)

2. We can also overload a usual 1-argument/1-return function based only on qualifiers in the stated 81 ways. Why would this be unexpected or even noteworthy?

> We must devise a radically simpler solution to object copying, one that aims straight at solving the fundamental problems we're facing. Far as I can tell these are it:
>
> 1. Reference counting for all objects, including const/immutable/shared. ...
>
> 2. Removing head const. You have added the nice conversion qualified(T[]) -> qualified(T)[] upon calling a function. We must look at ways to allow the user a similar generalization for type constructors.
>
> If we figure this(this) doesn't help these two problems,

It doesn't help with 2.

> its entire existence must be put in question.

Make sure to keep it possible to mark structs non-copyable.
Feb 07 2014
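For reference, marking a struct non-copyable is done in D by disabling the postblit. A minimal sketch follows. The struct name `Handle` and its field are made up for illustration.

```d
// Hypothetical resource wrapper that must never be duplicated.
struct Handle
{
    int fd;

    @disable this(this);   // any copy of a Handle is a compile-time error
}

// Copying is rejected, while taking the address of an existing object still works.
static assert(!__traits(compiles, { Handle a; Handle b = a; }));
static assert( __traits(compiles, { Handle a; auto p = &a; }));
```

Whatever replaces or supplements this(this) would need an equivalent way to reject copies at compile time.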