digitalmars.D - Copy constructors for lazy initialization


Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter has had a great idea last night: allow classes to define

this(ref S src);

where S is the name of the struct being defined, as an alternative for

this(this);

The result would be a function similar to a C++ copy constructor.

Such an abstraction allows us to perform lazy initialization in a way 
that avoids the kind of problems associated with non-shared hash tables:

void foo(int[int] x)
{
    x[5] = 5;
}

void main()
{
    int[int] x;
    foo(x);
    assert(x[5] == 5); // fails
}

If you replace the first line of main() with

int[int] x = [ 42:42 ];

the assert passes.
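
For reference, the two cases can be put side by side in one program. This is only a sketch: the variable names a and b are made up here, and the `in` operator is used instead of indexing so that the null case can be checked without a range error.

```d
void foo(int[int] x)
{
    x[5] = 5; // allocates a fresh table if x is null
}

void main()
{
    int[int] a;            // null: no table allocated yet
    foo(a);
    assert(5 !in a);       // the insertion went into a table only foo could see

    int[int] b = [42: 42]; // non-null: the table already exists
    foo(b);
    assert(b[5] == 5);     // caller and callee share the same table
}
```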

The idea of the copy constructor is to lazily initialize both the source 
and the target if the source has null state, so that the two end up 
sharing the same state. That would take care of this problem and the 
similar problems for shared state.
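
A minimal sketch of how that could look, assuming a compiler that accepts a copy constructor of the form this(ref S). The names Table, Payload, fill, set and has are hypothetical and only serve the illustration; this is not a description of how the built-in associative array is implemented.

```d
struct Table
{
    // Hypothetical heap-allocated state shared by all copies.
    private static struct Payload { int[int] data; }
    private Payload* p; // null until first needed

    // Copy constructor taking the source by mutable ref.
    this(ref Table src)
    {
        if (src.p is null)
            src.p = new Payload; // lazily initialize the source ...
        p = src.p;               // ... and share its state with the copy
    }

    void set(int key, int value)
    {
        if (p is null) p = new Payload; // methods still need their own check
        p.data[key] = value;
    }

    bool has(int key)
    {
        return p !is null && (key in p.data) !is null;
    }
}

void fill(Table t) { t.set(5, 5); }

void main()
{
    Table t;          // null state, nothing allocated
    fill(t);          // copying t initializes it before the state is shared
    assert(t.has(5)); // the insertion is visible to the caller
}
```

Note that the copy constructor writes to its source, which is the whole point of taking it by ref rather than by const ref.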

There is still a possibility to call a method against an object with 
null state. I think that's acceptable, particularly because lazy 
initialization saves some state allocation.

What do you think?


Andrei

May 28 2010

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu Wrote:

 this(ref S src);
 this(this);

 What do you think?

At the moment I am too sleepy to follow the semantics of what you are proposing.

But I can say something about syntax: that this(this) syntax is bad, it's
cryptic, I prefer something that uses/contains some English word/name that I
read and reminds me of what it does.

The this(ref S src) syntax makes things even worse in this regard. Please don't
turn D into a puzzle language (note that I am not saying your feature is bad,
far from it, I am just saying that the syntax you have proposed is very far
from being easy to understand from the way it is written).

Regardless of what Don has said, here I'd probably like something like a
readable @attribute to replace this(this) :-)

Bye,
bearophile

May 28 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

bearophile wrote:

 Andrei Alexandrescu Wrote:
 
 this(ref S src);
 this(this);

 
 What do you think?

 
 In this moment I am too much sleepy to understand the semantics of what
 you say.
 
 But I can say something about syntax: that this(this) syntax is bad, it's
 cryptic, I prefer something that uses/contains some English word/name that
 I read and reminds me of what it does.
 
 The this(ref S src) syntax makes things even worse in this regard. Please
 don't turn D into a puzzle language (note that I am not saying your
 feature is bad, far from it, I am just saying that the syntax you have
 proposed is very far from being easy to understand from the way it is
 written).
 
 Regardless of what Don has said, here I'd probably like something like a
 readable  attribute to replace this(this) :-)
 
 Bye,
 bearophile

Well, as long as S is the name of the struct, it's essentially what's done 
in C++ all the time. So, we get

S(ref S src)

instead of

S(const S& src)


The weird thing here is that you're actually altering the parameter that you 
passed in, which is normally a major no-no with copy constructors.

- Jonathan M Davis

May 28 2010

Walter Bright <newshound1 digitalmars.com> writes:

Jonathan M Davis wrote:
 The weird thing here is that you're actually altering the parameter that you 
 passed in, which is normally a major no-no with copy constructors.

Yup.

One subtle but important distinction from C++ is that D can omit copy 
construction completely if the compiler can determine there are no further uses 
of the source object, and substitute a simple bit copy. This should result in a 
fundamental performance improvement.

May 28 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Andrei Alexandrescu wrote:

 What do you think?
 
 
 Andrei

Certainly, in the case provided, it's a definite win. I'm not sure what the 
overall implications would be though. Part of the problem stems from the 
fact that the array is initialized to null, and yet you can still add stuff 
to it. My first reaction (certainly without having messed around with it in 
D) would be that x[5] = 5 would fail because the array was null. However, 
instead of blowing up, D just makes the null array into an empty one and 
does the assignment. If D didn't allow the assignment without having first 
truly created an empty array rather than a null one, then we wouldn't have 
the problem.

Now, there may be very good reasons for the current behavior, and this 
suggestion would fix the problem as it stands. But it would still require 
the programmer to be aware of the issue and use this(ref S src) instead of 
this(this) if they were writing the constructor or be aware of which it was 
if they didn't write the constructor.

Not knowing what other implications there are, I'm fine with the change, but 
the fact that D creates the array when you try and insert into it (or append 
to it in the case of normal arrays IIRC) rather than blowing up on null 
seems like a source of subtle bugs and that perhaps it's not the best design 
choice. But maybe there was a good reason for that that I'm not aware of, so 
it could be that it really should stay as-is. It's just that it seems 
danger-prone and that the situation that you're trying to fix wouldn't be an 
issue if the array stayed null until the programmer made it otherwise.

- Jonathan M Davis

May 28 2010

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 29/05/2010 03:02, Jonathan M Davis wrote:
 Andrei Alexandrescu wrote:

 Not knowing what other implications there are, I'm fine with the change, but
 the fact that D creates the array when you try and insert into it (or append
 to it in the case of normal arrays IIRC) rather than blowing up on null
 seems like a source of subtle bugs and that perhaps it's not the best design
 choice. But maybe there was a good reason for that that I'm not aware of, so
 it could be that it really should stay as-is. It's just that it seems
 danger-prone and that the situation that you're trying to fix wouldn't be an
 issue if the array stayed null until the programmer made it otherwise.

 - Jonathan M Davis

Yeah, I agree. I mean, for me the problem here is that the 
map/associative-array is not really a value type, nor a reference type, 
but something in between. I think it might be better for it to be just a 
proper reference type.

I'm not saying that Andrei's suggestion has no merit though. Rather, I 
have to admit I am not familiar with these C++ idioms and techniques. 
Can someone explain to me why we need a copy constructor in the first 
place, instead of just using a reference object, aka a class, and an 
optional clone method?



-- 
Bruno Medeiros - Software Engineer

Jun 03 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/03/2010 06:30 AM, Bruno Medeiros wrote:
 On 29/05/2010 03:02, Jonathan M Davis wrote:
 Andrei Alexandrescu wrote:

 Not knowing what other implications there are, I'm fine with the
 change, but
 the fact that D creates the array when you try and insert into it (or
 append
 to it in the case of normal arrays IIRC) rather than blowing up on null
 seems like a source of subtle bugs and that perhaps it's not the best
 design
 choice. But maybe there was a good reason for that that I'm not aware
 of, so
 it could be that it really should stay as-is. It's just that it seems
 danger-prone and that the situation that you're trying to fix wouldn't
 be an
 issue if the array stayed null until the programmer made it otherwise.

 - Jonathan M Davis

 Yeah, I agree. I mean, for me the problem here is that the
 map/associative-array is not really a value type, nor a reference type,
 but something in between. I think it might be better for it to be just a
 proper reference type.

An associative array is a reference type. The problem discussed above 
applies to all reference types.

 I'm not saying that Andrei suggestion has no merit though. Rather, I
 have to admit I am not familiar with these C++ idioms and techniques.
 Can someone explain me why we need a copy constructor in the first
 place, instead of just using a reference object, aka a class, and an
 optional clone method?

The problem is that null objects don't obey reference semantics. The 
example discussed in this group is:

void fun(int[int] a) { a[5] = 10; }

void main(string[] args) {
     int[int] x;
     if (args.length & 1) x[0] = 42;
     fun(x);
     assert(x[5] == 10);
}

The program will fail or not depending on the number of arguments passed.


Andrei

Jun 03 2010

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 03/06/2010 14:48, Andrei Alexandrescu wrote:
 On 06/03/2010 06:30 AM, Bruno Medeiros wrote:
 On 29/05/2010 03:02, Jonathan M Davis wrote:
 Andrei Alexandrescu wrote:

 Not knowing what other implications there are, I'm fine with the
 change, but
 the fact that D creates the array when you try and insert into it (or
 append
 to it in the case of normal arrays IIRC) rather than blowing up on null
 seems like a source of subtle bugs and that perhaps it's not the best
 design
 choice. But maybe there was a good reason for that that I'm not aware
 of, so
 it could be that it really should stay as-is. It's just that it seems
 danger-prone and that the situation that you're trying to fix wouldn't
 be an
 issue if the array stayed null until the programmer made it otherwise.

 - Jonathan M Davis

 Yeah, I agree. I mean, for me the problem here is that the
 map/associative-array is not really a value type, nor a reference type,
 but something in between. I think it might be better for it to be just a
 proper reference type.

 An associative array is a reference type. The problem discussed above
 applies to all reference types.

 I'm not saying that Andrei suggestion has no merit though. Rather, I
 have to admit I am not familiar with these C++ idioms and techniques.
 Can someone explain me why we need a copy constructor in the first
 place, instead of just using a reference object, aka a class, and an
 optional clone method?

 The problem is that null objects don't obey reference semantics. The
 example discussed in this group is:

 void fun(int[int] a) { a[5] = 10; }

 void main(string[] args) {
 int[int] x;
 if (args.length & 1) x[0] = 42;
 fun(x);
 assert(x[5] == 10);
 }

 The program will fail or not depending on the number of arguments passed.


 Andrei

In my notion of reference type (which you may argue is not the correct 
one, but that's a different issue) a null object cannot be used at all 
(other than checking its identity against other objects). Thus that 
program would always fail, either on 'a[5] = 10;' or 'x[0] = 42;', 
according to these semantics.


-- 
Bruno Medeiros - Software Engineer

Jun 03 2010

Lionello Lunesu <lio lunesu.remove.com> writes:

Nice. This could also be used to implement unique_ptr(T), with move
semantics.
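
A rough sketch of that idiom, assuming a compiler that accepts a copy constructor of the form this(ref S). Unique, empty and get are hypothetical names chosen for the illustration, not an existing library type.

```d
struct Unique(T)
{
    private T* ptr;

    this(T* p) { ptr = p; }

    // "Copying" transfers ownership: the source is left empty.
    this(ref Unique src)
    {
        ptr = src.ptr;
        src.ptr = null;
    }

    bool empty() const { return ptr is null; }
    ref T get() { return *ptr; }
}

void main()
{
    auto a = Unique!int(new int(7));
    auto b = a;          // invokes the copy constructor, which moves
    assert(a.empty);     // a no longer owns the value
    assert(b.get == 7);  // b does
}
```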

L.

On 29-5-2010 9:26, Andrei Alexandrescu wrote:
 Walter has had a great idea last night: allow classes to define
 
 this(ref S src);
 
 where S is the name of the struct being defined, as an alternative for
 
 this(this);
 
 The result would be a function similar with a C++ copy constructor.
 
 Such an abstraction allows us to perform lazy initialization in a way
 that allows the kind of problems associated with non-shared hash tables:
 
 void foo(int[int] x)
 {
    x[5] = 5;
 }
 
 void main()
 {
    int[int] x;
    foo(x);
    assert(x[5] == 5); // fails
 }
 
 If you change the first line of main() with
 
 int[int] x = [ 42:42 ];
 
 the assert passes.
 
 The idea of the copy constructor is to lazy initialize the source and
 the target if the source has null state. That would take care of this
 problem and the similar problems for shared state.
 
 There is still a possibility to call a method against an object with
 null state. I think that's acceptable, particularly because lazy
 initialization saves some state allocation.
 
 What do you think?
 
 
 Andrei

May 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 05/28/2010 09:18 PM, Lionello Lunesu wrote:
 Nice. This could also be used to implement unique_ptr(T), with move
 semantics.

Yah, a number of interesting idioms spring to life.

Andrei

May 28 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 28 May 2010 21:26:50 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Walter has had a great idea last night: allow classes to define

I'm almost positive you meant "allow *structs* to define"

 this(ref S src);

 where S is the name of the struct being defined, as an alternative for

 this(this);

[snip]

 The idea of the copy constructor is to lazy initialize the source and  
 the target if the source has null state. That would take care of this  
 problem and the similar problems for shared state.

 There is still a possibility to call a method against an object with  
 null state. I think that's acceptable, particularly because lazy  
 initialization saves some state allocation.

 What do you think?

It is a good effort to solve the problem.  The problem I see with such  
constructs is inherently the lazy construction.  Because you must lazily  
construct such a container, every method meant to be called on the struct  
must first check and initialize the container if not done already.  This  
results in an undue burden on the struct author to make sure he covers  
every method.  The first method in which he forgets to lazily initialize  
the struct results in an obscure bug.
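
To make the burden concrete, here is a small sketch of what the struct author ends up writing. LazyMap, Payload and ensure are made-up names for the illustration; the point is only that every method has to begin with the same call.

```d
struct LazyMap
{
    private static struct Payload { int[string] counts; }
    private Payload* p; // null until some method needs it

    // Must be the first statement of every method below.
    private void ensure()
    {
        if (p is null) p = new Payload;
    }

    void add(string key)
    {
        ensure();
        ++p.counts[key];
    }

    int count(string key)
    {
        ensure();
        if (auto v = key in p.counts) return *v;
        return 0;
    }

    size_t length()
    {
        ensure(); // leaving this out would dereference a null pointer
        return p.counts.length;
    }
}

void main()
{
    LazyMap m;
    assert(m.length == 0); // works on a never-initialized struct
    m.add("a");
    m.add("a");
    assert(m.count("a") == 2);
    assert(m.count("b") == 0);
}
```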

I wonder, would it be possible to go even further?  For example, something  
like this:

struct S
{
   lazy this()
   {
      // create state
   }
}

which would be called if the struct still has not been initialized?  lazy  
this() should be prepared to accept an already initialized struct, which  
should be no problem because an initialized struct will almost always  
differ from the .init value.

The compiler could optimize this call out when it can statically determine  
that a struct has already been initialized.

This of course, does not cover copy construction, but it would be called  
before the copy constructor.

BTW, I'm glad you guys are looking at this problem.

-Steve

May 28 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Actually, I have to ask what the purpose behind this delayed initialization 
is in the first place. The following works just fine as things are:

    int[] a;

    assert(!a);

    a ~= 42;

    assert(a);


If a were an object, this wouldn't work at all - even if it implemented the 
concatenation assignment operator. It would be null until you actually 
assigned it an object. Arrays - both normal and associative - don't seem to 
operate this way at all. This has the advantage that it's a bit hard to get 
the program to blow up on a null array, but it's not like that would 
generally be hard to find and fix. It also makes it a bit hard to have an 
actual null array that stays that way. It would be easy to make an array null and 
accidentally add something to it, resulting in a bug which would have been 
found if the array didn't create itself upon concatenation. Also, you get 
the problem which started this thread - that of it getting created in a 
function that it's passed to and not ending up in the function that did the 
passing.
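
The last point can be shown with a plain dynamic array as well. This is just a sketch with a made-up helper name, append. Note that with slices the caller never sees the new length, null or not, because the length is copied along with the pointer.

```d
void append(int[] a)
{
    a ~= 42; // creates the array inside the callee only
}

void main()
{
    int[] a;
    assert(a is null);

    append(a);
    assert(a is null);     // nothing came back to the caller

    a ~= 42;               // appending to a null array just works
    assert(a !is null);
    assert(a.length == 1);
}
```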

Other than having to update existing code, it doesn't seem that onerous to 
me to require

int[] a = new int[](0);

to have an empty array if you want one (though it is a little weird to 
create an array of length 0).

Are there bugs that I'm not thinking of which the current behavior of 
creating the array for you avoids? Or am I just missing something here? It 
really seems to me like you're creating a workaround for a problem in the 
language. And while that workaround may be great for other stuff too, just 
making arrays stay null until the programmer assigns them another value 
fixes the bug that you're trying to fix - at least as far as I can tell. I 
don't understand why the current behavior was chosen. It does simplify array 
creation somewhat, but it seems to me that it's more likely to cause bugs 
than avoid them.

- Jonathan M Davis

May 29 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-05-28 21:26:50 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Walter has had a great idea last night: allow classes to define
 
 this(ref S src);
 
 where S is the name of the struct being defined, as an alternative for
 
 this(this);
 
 The result would be a function similar with a C++ copy constructor.
 
 Such an abstraction allows us to perform lazy initialization in a way 
 that allows the kind of problems associated with non-shared hash tables:

At this point I'll put the lazy initialization into question. If as 
soon as you make a copy of the struct you must allocate, it means that 
the container will be initialized as soon as you pass it to some 
function (unless the argument is passed by 'ref', but you want 
reference semantics precisely to avoid that, am I right?).

If the container is to be initialized as soon as you make a copy, the 
lazy initialization becomes of limited utility; it'll only be useful 
when you have a container you don't pass to another function *and* you 
never put anything in it. This makes the tradeoff of lazy 
initialization less worth it, as the extra logic checking every time if 
the container is already initialized will rarely serve a purpose.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

May 29 2010

Michael Rynn <michaelrynn optusnet.com.au> writes:

On Fri, 28 May 2010 20:26:50 -0500, Andrei Alexandrescu wrote:

 Walter has had a great idea last night: allow classes to define
 
 this(ref S src);
 
 where S is the name of the struct being defined, as an alternative for
 
 this(this);
 
 The result would be a function similar with a C++ copy constructor.
 
 Such an abstraction allows us to perform lazy initialization in a way
 that allows the kind of problems associated with non-shared hash tables:
 
 void foo(int[int] x)
 {
     x[5] = 5;
 }
 
 void main()
 {
     int[int] x;
     foo(x);
     assert(x[5] == 5); // fails
 }
 
 If you change the first line of main() with
 
 int[int] x = [ 42:42 ];
 
 the assert passes.
 
 The idea of the copy constructor is to lazy initialize the source and
 the target if the source has null state. That would take care of this
 problem and the similar problems for shared state.
 
 There is still a possibility to call a method against an object with
 null state. I think that's acceptable, particularly because lazy
 initialization saves some state allocation.
 
 What do you think?
 
 
 Andrei

There is nothing wrong with yet another widely available constructor tool 
if the user optionally wants to have it available and use it that way, 
and its usage is well defined.

It is good as long as there is no cost when the programmer does not use 
the facility.

What seems to be missing, in this example of the unintentional creation 
of two AAs because a null AA is passed initially, is a way of explicitly 
initialising the AA first without having to insert something into it. 
That subject seems to be taboo.  Of course, an empty but set-up AA can be 
kludged by adding an arbitrary value and then removing it, which shows 
the need for a setup property / function.
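
The kludge looks roughly like this. It is a sketch that relies on the runtime keeping the table allocated after the last key is removed; foo is the function from the original example.

```d
void foo(int[int] x)
{
    x[5] = 5;
}

void main()
{
    int[int] x;
    x[0] = 0;       // force the table to be allocated
    x.remove(0);    // empty again, but no longer null
    assert(x.length == 0);

    foo(x);
    assert(x[5] == 5); // the callee's insertion is now visible
}
```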


May 29 2010

Michael Rynn <michaelrynn optusnet.com.au> writes:

On Fri, 28 May 2010 20:26:50 -0500, Andrei Alexandrescu wrote:

 Walter has had a great idea last night: allow classes to define
 
 this(ref S src);
 
 where S is the name of the struct being defined, as an alternative for
 
 this(this);
 
 The result would be a function similar with a C++ copy constructor.
 
 S.
 
 What do you think?
 
 
 Andrei


That's great.
Yet another way to initialize an AA, without having to insert and then 
remove a value.  I just have to make a dummy function and pass the AA to 
it.

Maybe a setup property / function would be easier.

May 29 2010
