digitalmars.D - Review of DIP49

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Here are a few questions and comments:

* Why is there an immutable postblit? I wonder if there are cases in 
which that's needed. Most of the time it's fine to just memcpy() the 
data - it's immutable anyway so nothing will change! In fact there is 
one use case for immutable postblit, and that's refcounting (when 
copying data the reference count must be incremented). But that would be 
increment via a pointer, which is not allowed in your proposal. (I think 
it is ok for the postblit proposal to not cover refcounting - we can 
make it built in the language.)

* The inout case looks good, but again we need to have some good 
examples on when it would be necessary.

* What happens if an object defines both the mutable and the inout 
version of postblit? I assume the mutable has priority when duplicating 
mutable object. The DIP should clarify that.

* Const postblit is spelled "this(this) const" but that's misleading 
because it's really "this(whatever1 this) whatever2" with whatever1 and 
whatever2 being arbitrary qualifiers. We should probably find a more 
descriptive syntax for that. "this(this) auto" comes to mind. We can 
even add a contextual keyword such that the compiler recognizes 
"this(this) unique" without otherwise giving "unique" any special meaning.

* My main beef is with this constructor. I'll refer to it henceforth as 
"the unique constructor".

* Qualifiers obey a hierarchy, i.e. const(T) is a supertype of both T 
and immutable(T) as shown below with ASCII art.

  const(T)
   /\
  /  \
T   immutable(T)

That imparts structure over the mixed-qualifier constructors. However, 
that structure is missing from the unique constructor; all 
mixed-qualifier constructors are heaped into one. This makes the rules 
forced for certain constructors (e.g. constructing a const(T) from a T 
should be much less restricted than constructing a T from a const(T)). 
The proposed unique constructor is not sensitive to such distinctions.
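
The hierarchy can be checked with implicit-conversion tests. This is a minimal sketch, assuming a hypothetical struct `S` with one mutable indirection (without an indirection, value copies would make every conversion legal):

```d
// Hypothetical struct with one mutable indirection, so qualifier
// conversions are not trivially allowed by a plain value copy.
struct S { int[] arr; }

// Upcasts: both S and immutable(S) convert implicitly to const(S).
static assert( is(S : const(S)));
static assert( is(immutable(S) : const(S)));

// Downcasts and cross-casts are rejected without an explicit copy or cast.
static assert(!is(const(S) : S));
static assert(!is(const(S) : immutable(S)));
static assert(!is(S : immutable(S)));
static assert(!is(immutable(S) : S));
```

Only the two upcasts are free; the four remaining directions are the ones a copy operation would have to handle explicitly.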

* The "unique expression" definition is quite strong and we should 
refine it and reuse in other contexts as well.

* The section "Overloading of qualified postblits" is great because it 
brings back the subtyping structure! It says: "If mutable postblit is 
defined, it is always used for the copies: (a) mutable to mutable; (b) 
mutable to const". Indeed that is correct because mutable to const is a 
copy to a supertype. To respect subtyping, we should also add (c) 
immutable to const; (d) const to const; (e) immutable to immutable. That 
way we have a simple rule: "this(this) is invoked for all upcasts 
obtained by qualifiers". Wonderful!

* If we go forward with the idea above, we only need to treat the 
remaining cases of downcasts and cross-casts across the subtyping 
hierarchy:

(a) downcasts: const(T) -> T and const(T) -> immutable(T)
(b) cross-casts: immutable(T) -> T and T -> immutable(T)

* The use of subtyping above would replace the elaborate rules in 
section "Overloading of qualified postblits". In fact they seem to agree 
95% of the time. By the way there are some confusing negations e.g. "if 
mutable postblit is not defined, it will be used for the copies". I 
assume the "not" should be removed?

* So we're left with the following postblitting rules as the maximum:

struct T {
   this(this); // all upcasts including identity
   this(const this); // construct T from const(T)
   this(const this) immutable; // construct immutable(T) from const(T)
   this(immutable this); // construct T from immutable(T)
   this(this) immutable; // construct immutable(T) from T
}

Some could be missing, some could be deduced, but this is the total set.

* Consider a conversion like "this(this) immutable" which constructs an 
immutable(T) from a T. This is tricky to typecheck because fields of T 
have a mutable type when first read and immutable type after having been 
written to. That raises the question whether the entire notion of 
postblitting is too complicated for its own good. Should we leave it as 
is and go with classic C++-style copy construction in which source and 
destination are distinct objects? I think that would simplify both the 
language definition and its implementation.

* The section "Why 'const' postblit will called to copy arbitrary 
qualified object?" alludes to the subtyping relationship among 
qualifiers without stating it.



Andrei

Feb 02 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 02/03/2014 12:54 AM, Andrei Alexandrescu wrote:
 * Const postblit is spelled "this(this) const" but that's misleading
 because it's really "this(whatever1 this) whatever2"  with whatever1 and
 whatever2 being arbitrary qualifiers.  We should probably find a more
 descriptive syntax for that. "this(this) auto" comes to mind. We can
 even add a contextual keyword such that the compiler recognizes
 "this(this) unique" without otherwise giving "unique" any special meaning.


I think it is a weakness of the 'inout' qualifier (or more generally, 
the polymorphic part of the type system) that this cannot be expressed 
in an orthogonal way. (It only allows one type constructor variable per 
scope.)

Effectively, one would want something like:

this(inout this) inout' { ... }

i.e. two unrelated wildcards.
(Not an actual syntax proposal.)
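
For reference, this is how the single wildcard behaves in a regular function today. The sketch below uses a hypothetical free function `firstHalf`; it only illustrates that the argument and result share one `inout` variable, so their qualifiers cannot vary independently:

```d
// A single `inout` wildcard: the qualifier of the argument is carried over
// to the result, so the two cannot be chosen separately.
inout(int)[] firstHalf(inout(int)[] a)
{
    return a[0 .. $ / 2];
}

unittest
{
    int[] m = [1, 2, 3, 4];
    immutable(int)[] i = [1, 2, 3, 4];
    const(int)[] c = m;

    static assert(is(typeof(firstHalf(m)) == int[]));
    static assert(is(typeof(firstHalf(i)) == immutable(int)[]));
    static assert(is(typeof(firstHalf(c)) == const(int)[]));
}
```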

 * So we're left with the following postblitting rules as the maximum:

 struct T {
    this(this); // all upcasts including identity
    this(const this); // construct T from const(T)
    this(const this) immutable; // construct immutable(T) from const(T)
    this(immutable this); // construct T from immutable(T)
    this(this) immutable; // construct immutable(T) from T
 }

 Some could be missing, some could be deduced, but this is the total set.
 ...

We'd also want to consider inout.

 * Consider a conversion like "this(this) immutable" which constructs an
 immutable(T) from a T. This is tricky to typecheck because fields of T
 have a mutable type when first read and immutable type after having been
 written to.

The language needs to track initialized fields in constructors anyways, 
so I think that effort can be shared to some extent.

 That raises the question whether the entire notion of
 postblitting is too complicated for its own good. Should we leave it as
 is and go with classic C++-style copy construction in which source and
 destination are distinct objects? I think that would simplify both the
 language definition and its implementation.

If the source object can be obtained by ref, it also increases modelling 
power slightly. (e.g. initialize it on copy in order to have actual 
reference semantics for structs with default initialization.)

This also has the benefit that one does not have to pay for postblits 
and destructors of fields one is going to reinitialize anyway.

The main drawback of copy-construction with regard to postblit is that 
(potentially less efficient?) boilerplate is required to copy all fields 
over manually, which I think was what motivated the concept.

Of course, copy construction does not address the need to copy from any 
qualifier to any qualifier, which would IMO rather be achieved by 
improvements to other parts of the type system anyway. (A weakness of 
DIP49 is that it is impossible to abstract out identical code snippets 
that use unique postblit at different type qualifiers.)

Feb 02 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/2/14, 4:28 PM, Timon Gehr wrote:
 The main drawback of copy-construction with regard to postblit is that
 (potentially less efficient?) boilerplate is required to copy all fields
 over manually, which I think was what motivated the concept.

A bit of history: indeed that was the motivation back in the day, but at 
that point in time D's introspection abilities were just emerging. We 
simply didn't anticipate that D (as it is today) makes it trivially easy 
to write a function that initializes all fields of an object from another.
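
A minimal sketch of such a function, assuming both objects are mutable and of the same struct type. The names `copyFields` and `Point` are made up for illustration:

```d
// Hypothetical helper: assigns every field of `src` to the matching
// field of `dst`, using compile-time introspection over the fields.
void copyFields(T)(ref T dst, ref T src)
if (is(T == struct))
{
    static foreach (i; 0 .. T.tupleof.length)
        dst.tupleof[i] = src.tupleof[i];
}

// Illustrative type; any plain struct works.
struct Point
{
    int x;
    string label;
    int[] data;
}

unittest
{
    auto a = Point(1, "a", [1, 2]);
    Point b;
    copyFields(b, a);
    assert(b.x == 1 && b.label == "a");
    assert(b.data is a.data); // shallow copy: the slice is shared
}
```

The copy is field-wise and shallow, which matches what a bit blit followed by no postblit would produce.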

Andrei

Feb 02 2014

Walter Bright <newshound2 digitalmars.com> writes:

Linky:

http://wiki.dlang.org/DIP49

Feb 02 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/2/2014 3:54 PM, Andrei Alexandrescu wrote:
 * The "unique expression" definition is quite strong and we should refine it
 and reuse in other contexts as well.

I agree with this so strongly that I feel unique expressions should get their 
own, independent DIP, such as:

http://wiki.dlang.org/DIP29

Feb 02 2014

Kenji Hara <k.hara.pg gmail.com> writes:

2014-02-03 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
 Here are a few questions and comments:

First of all, thanks for your reviewing!

* Why is there an immutable postblit? I wonder if there are cases in which
 that's needed. Most of the time it's fine to just memcpy() the data - it's
 immutable anyway so nothing will change! In fact there is one use case for
 immutable postblit, and that's refcounting (when copying data the reference
 count must be incremented). But that would be increment via a pointer,
 which is not allowed in your proposal. (I think it is ok for the postblit
 proposal to not cover refcounting - we can make it built in the language.)

immutable postblit "this(this) immutable" is necessary to support
immutable(T) to immutable(T) copy.
(It's impossible by "this(this)".)

In immutable postblit body, modifying referenced data is not allowed. But,
rebinding references is still allowed
because first field assignment is treated as field initializing.

struct S {
    int[] arr;
    this(this) immutable {
        //arr[0] = 10;    // NG: modifying referenced data is not allowed, but...
        arr = [1,2,3];    // OK: the first assignment initializes the 'arr' field.
    }
}
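
The same "first assignment is initialization" rule already applies to immutable constructors in current D, which may help to picture the proposed postblit behaviour. This is only an analogy using a constructor; the parameter `first` is made up for the sketch:

```d
struct S
{
    int[] arr;

    this(int first) immutable
    {
        // arr[0] = first;    // would not compile: modifies immutable data
        arr = [first, 2, 3];  // accepted: the first assignment initializes the field
    }
}

unittest
{
    auto s = immutable(S)(7);
    assert(s.arr == [7, 2, 3]);
}
```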


 * The inout case looks good, but again we need to have some good examples
 on when it would be necessary.

If the mutable postblit and the immutable postblit have the same code, you
can merge the two into one inout postblit.

struct S {
    int[] arr;
    this(this) { arr = arr.dup; }
    this(this) immutable { arr = arr.idup; }

    // you can merge above two postblits into:
    this(this) inout { arr = arr.dup; }
    // arr.dup makes unique data, so implicit conversion to inout(int[])
    // is allowed.
}
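
The uniqueness argument can be seen in current D with a pure function that returns freshly allocated data. This is a sketch under that assumption and not the DIP's mechanism; `freshCopy` is a hypothetical helper:

```d
// Hypothetical helper: `pure`, takes const input and returns a fresh mutable
// array, so the compiler can treat the result as unique.
int[] freshCopy(const(int)[] src) pure
{
    auto result = new int[](src.length);
    result[] = src[];
    return result;
}

unittest
{
    const(int)[] src = [1, 2, 3];
    int[] m = freshCopy(src);             // mutable result
    immutable(int)[] i = freshCopy(src);  // same call, implicitly immutable
    assert(m == i);
}
```

Because nothing else can refer to the returned array, the same expression is usable at either qualifier.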



 * What happens if an object defines both the mutable and the inout version
 of postblit? I assume the mutable has priority when duplicating mutable
 object. The DIP should clarify that.

It is clarified in "Overloading of qualified postblits" section.

 If immutable postblit is defined, it is always used for the copies:
 - immutable to const
 - immutable to immutable

 This priority order is defined based on the following rule:
 - If source is mutable or immutable, most specialized postblits
   (mutable/immutable postblit) will be used, if they exist.

* Const postblit is spelled "this(this) const" but that's misleading
 because it's really "this(whatever1 this) whatever2" with whatever1 and
 whatever2 being arbitrary qualifiers. We should probably find a more
 descriptive syntax for that. "this(this) auto" comes to mind. We can even
 add a contextual keyword such that the compiler recognizes "this(this)
 unique" without otherwise giving "unique" any special meaning.

I have deliberately avoided "contextual keyword" because it does not
currently exist in D grammar.
But unfortunately, using existing keyword for the new meaning seems to
cause just confusion.

I still don't have a strong opinion for the syntax ...


 * My main beef is with this constructor. I'll refer to it henceforth as
 "the unique constructor".

 * Qualifiers obey a hierarchy, i.e. const(T) is a supertype of both T and
 immutable(T) as shown below with ASCII art.

  const(T)
   /\
  /  \
 T   immutable(T)

 That imparts structure over the mixed-qualifier constructors. However,
 that structure is missing from the unique constructor; all mixed-qualifier
 constructors are heaped into one. This makes the rules forced for certain
 constructors (e.g. constructing a const(T) from a T should be much less
 restricted than constructing a T from a const(T)). The proposed unique
 constructor is not sensitive to such distinctions.

??? A unique constructor is always less specialized than any other
constructor (mutable, immutable, and inout).
Specialization order is:

  mutable == immutable > inout > unique postblit

So, by overloading postblits, you can control copy cost as you expect.
For example:

struct Array(T) {
    T[] data;
    this(this) unique { data = data.dup; }
}

Array!T allows copying objects between arbitrary qualifiers. But it also
needs data duplication.

If you want to share data where possible, overload the postblit as follows.

struct Array(T) {
    T[] data;
    this(this) inout { /* nothing to do.*/; }
    this(this) unique { data = data.dup; }
}

If the copy target's qualifier is weaker than or equal to the copy source's,
the inout postblit is used, and it always copies the object by a plain bit
blit (== memcpy).

If you need to avoid data sharing between mutable objects, you can add a
mutable postblit for that.

struct Array(T) {
    T[] data;
    this(this) { data = data.dup; }
    this(this) inout { /* nothing to do.*/; }
    this(this) unique { data = data.dup; }
}


 * The "unique expression" definition is quite strong and we should refine
 it and reuse in other contexts as well.

OK, I'll separate the concept to other DIP.


 * The section "Overloading of qualified postblits" is great because it
 brings back the subtyping structure! It says: "If mutable postblit is
 defined, it is always used for the copies: (a) mutable to mutable; (b)
 mutable to const". Indeed that is correct because mutable to const is a
 copy to a supertype. To respect subtyping, we should also add (c) immutable
 to const; (d) const to const; (e) immutable to immutable. That way we have
 a simple rule: "this(this) is invoked for all upcasts obtained by
 qualifiers". Wonderful!

Yes. The rule is defined to support both the simple case and the complicated case.


 * If we go forward with the idea above, we only need to treat the
 remaining cases of downcasts and cross-casts across the subtyping hierarchy:

 (a) downcasts: const(T) -> T and const(T) -> immutable(T)
 (b) cross-casts: immutable(T) -> T and T -> immutable(T)

The unique postblit exists to support exactly those.

* The use of subtyping above would replace the elaborate rules in section
 "Overloading of qualified postblits". In fact they seem to agree 95% of the
 time.


The purpose of the section is to describe priority between postblits. It is
simple:

  (mutable == immutable) > inout > unique postblit

By the way there are some confusing negations e.g. "if mutable postblit is
 not defined, it will be used for the copies". I assume the "not" should be
 removed?

inout postblit is less specialized than (mutable|immutable) postblits. So
"not" is necessary.


 * So we're left with the following postblitting rules as the maximum:

 struct T {
   this(this); // all upcasts including identity
   this(const this); // construct T from const(T)
   this(const this) immutable; // construct immutable(T) from const(T)
   this(immutable this); // construct T from immutable(T)
   this(this) immutable; // construct immutable(T) from T
 }

 Some could be missing, some could be deduced, but this is the total set.

The DIP *does not* support defining different operations for "T from
immutable(T)" and "immutable(T) from T".
They are always defined by the unique postblit.

I believe that supporting it would never make the language more useful. For
example, if an object supports immutable to mutable copy, but disables
mutable to mutable copy, what benefit would there be?
In D, type qualifiers are a very important feature (especially 'immutable').
And that's why we should keep the object copying rule (== the way postblits
are defined) simple.

Note that, currently D has 9 qualifiers:
- (mutable)
- const
- inout
- inout const (added from 2.065. See issue 6930)
- shared
- shared const
- shared inout
- shared inout const (added from 2.065. See issue 6930)
- immutable

So, if we allow defining arbitrary postblit from A to B copy, you can
define 9 * 9 = 81 postblits!
I don't want to see such horrible D code.


 * Consider a conversion like "this(this) immutable" which constructs an
 immutable(T) from a T. This is tricky to typecheck because fields of T have
 a mutable type when first read and immutable type after having been written
 to. That raises the question whether the entire notion of postblitting is
 too complicated for its own good. Should we leave it as is and go with
 classic C++-style copy construction in which source and destination are
 distinct objects? I think that would simplify both the language definition
 and its implementation.

??? immutable postblit cannot be used for immutable(T) to T copy.
If you want to do that, you should define a unique postblit. There's no
other way.


 * The section "Why 'const' postblit will called to copy arbitrary
 qualified object?" alludes to the subtyping relationship among qualifiers
 without stating it.


If we select "this(this) unique" syntax for unique postblit, the section
will be unnecessary.

Kenji Hara

Feb 03 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/3/14, 6:07 AM, Kenji Hara wrote:
 2014-02-03 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

     Here are a few questions and comments:

 First of all, thanks for your reviewing!

And thank you for your work and this reply.

This part:

 Note that, currently D has 9 qualifiers:
 - (mutable)
 - const
 - inout
 - inout const (added from 2.065. See issue 6930)
 - shared
 - shared const
 - shared inout
 - shared inout const (added from 2.065. See issue 6930)
 - immutable

 So, if we allow defining arbitrary postblit from A to B copy, you can
 define 9 * 9 = 81 postblits!
 I don't want to see such horrible D code.

convinced me something has definitely gotten out of hand. We must devise 
a radically simpler solution to object copying, one that aims straight 
at solving the fundamental problems we're facing. Far as I can tell 
these are it:

1. Reference counting for all objects, including const/immutable/shared. 
If this(this) can't be reasonably made to solve that, we will build it 
into the language and raise the question whether this(this) should exist 
at all. In particular, calling .dup inside this(this) is an antipattern. 
Objects must be cheap to copy. So any example using someArray.dup inside 
this(this) is an instant indication we're solving the wrong problem (and 
with a wrong solution as well).

2. Removing head const. You have added the nice conversion 
qualified(T[]) -> qualified(T)[] upon calling a function. We must look 
at ways to allow the user a similar generalization for type constructors.

If we figure this(this) doesn't help these two problems, its entire 
existence must be put in question.
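
To make problem 1 concrete, here is a minimal sketch of a reference-counted handle written with today's mutable postblit. `RcBuffer` and its members are hypothetical. The counter is incremented through a pointer, which works for mutable objects but is exactly the step that const and immutable copies do not permit:

```d
// Hypothetical reference-counted handle; names are illustrative only.
struct RcBuffer
{
    private int[] payload;
    private size_t* count;

    this(size_t n)
    {
        payload = new int[](n);
        count = new size_t;
        *count = 1;
    }

    // Copying is cheap: share the payload, bump the counter through a pointer.
    this(this)
    {
        if (count) ++*count;
    }

    ~this()
    {
        if (count && --*count == 0)
            payload = null; // last owner: release the payload
    }

    size_t refs() const { return count ? *count : 0; }
}

unittest
{
    auto a = RcBuffer(4);
    assert(a.refs == 1);
    {
        auto b = a;          // postblit runs on b
        assert(a.refs == 2);
    }                        // b's destructor runs here
    assert(a.refs == 1);
}
```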


Andrei

Feb 04 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 02/04/2014 08:53 PM, Andrei Alexandrescu wrote:
 This part:

 Note that, currently D has 9 qualifiers:
 - (mutable)
 - const
 - inout
 - inout const (added from 2.065. See issue 6930)
 - shared
 - shared const
 - shared inout
 - shared inout const (added from 2.065. See issue 6930)
 - immutable

 So, if we allow defining arbitrary postblit from A to B copy, you can
 define 9 * 9 = 81 postblits!
 I don't want to see such horrible D code.

 convinced me something has definitely gotten out of hand.

How so?

1. We have _4_ qualifiers. (DMD appears to implement them as 9 separate 
qualifiers which has led to problems such as not all combinations being 
implemented from the start.)

2. We can also overload a usual 1-argument/1-return function based only 
on qualifiers in the stated 81 ways. Why would this be unexpected or 
even noteworthy?
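
Point 1 can be illustrated with type identities: combining the qualifiers does not always yield a new type, which is why the list is shorter than the full set of combinations. A small sketch using `int` as the payload type:

```d
// immutable absorbs const and shared.
static assert(is(const(immutable(int)) == immutable(int)));
static assert(is(shared(immutable(int)) == immutable(int)));

// Applying the same qualifier twice changes nothing.
static assert(is(const(const(int)) == const(int)));

// The order in which const and shared are applied does not matter.
static assert(is(shared(const(int)) == const(shared(int))));

// shared const is still distinct from plain const.
static assert(!is(shared(const(int)) == const(int)));
```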

 We must devise a radically simpler solution to object copying,
 one that aims straight at solving the fundamental problems we're facing.
 Far as I can tell these are it:

 1. Reference counting for all objects, including const/immutable/shared.
 ...
 2. Removing head const. You have added the nice conversion
 qualified(T[]) -> qualified(T)[] upon calling a function. We must look
 at ways to allow the user a similar generalization for type constructors.

 If we figure this(this) doesn't help these two problems,

It doesn't help with 2.

 its entire existence must be put in question.

Make sure to keep it possible to mark structs non-copyable.
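
For reference, this is how a struct is marked non-copyable today by disabling the postblit. The struct names are made up for the sketch:

```d
import std.traits : isCopyable;

// Hypothetical handle type that must not be copied.
struct UniqueHandle
{
    int fd;
    @disable this(this);
}

// Ordinary struct for comparison.
struct Plain { int fd; }

static assert(!isCopyable!UniqueHandle);
static assert( isCopyable!Plain);

unittest
{
    auto h = UniqueHandle(3);
    // auto copy = h;   // would not compile: UniqueHandle is not copyable
    assert(h.fd == 3);
}
```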

Feb 07 2014
