digitalmars.D.learn - Behaviour of append (~=)

reply Lionello Lunesu <lio lunesu.remove.com> writes:
Why does the appending to a 'null' array cause the contents to be copied?

int[] ar1; // some array, not constant

int[] ar2 = null;
ar2 ~= ar1;
// ar2 !is ar1

But it could be like this:
if (ar2)
   ar2 ~= ar1;
else
   ar2 = ar1;

Wouldn't it be a good optimization for ~= to check for null first, to 
prevent the copy?

L.
May 30 2006
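
A minimal sketch of the behaviour asked about above, written for a current D2 compiler (an assumption; the thread dates from the D1 era). The function name `appendToNullCopies` is made up for illustration. It appends to a null array, writes through the result, and checks that the source array is unaffected.

```d
import std.stdio;

// Hypothetical helper: returns true when appending to a null array
// produced storage that is distinct from src.
// Assumes src is non-empty and src[0] is not -1.
bool appendToNullCopies(int[] src)
{
    int[] dst = null;
    dst ~= src;      // allocates fresh storage and copies src's elements
    dst[0] = -1;     // write through dst only
    return dst.ptr !is src.ptr && src[0] != -1;
}

void main()
{
    int[] ar1 = [1, 2, 3];
    writeln(appendToNullCopies(ar1));
    writeln(ar1);    // ar1 keeps its original contents
}
```

The pointer comparison is the direct equivalent of the `ar2 !is ar1` comment in the post; the write through `dst` shows why the distinction matters to callers.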
next sibling parent reply "Derek Parnell" <derek psych.ward> writes:
On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu  
<lio lunesu.remove.com> wrote:

 Why does the appending to a 'null' array cause the contents to be copied?

 int[] ar1; // some array, not constant

 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1

 But it could be like this:
 if (ar2)
    ar2 ~= ar1;
 else
    ar2 = ar1;

 Wouldn't it be a good optimization for ~= to check for null first, to  
 prevent the copy?
Not so sure, because for consistency's sake it's good to know that 'append'
will always do a copy. Then we can rely on this to happen.

-- 
Derek Parnell
Melbourne, Australia
May 30 2006
next sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Derek Parnell wrote:
 On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu 
 <lio lunesu.remove.com> wrote:
 
 Why does the appending to a 'null' array cause the contents to be copied?

 int[] ar1; // some array, not constant

 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1

 But it could be like this:
 if (ar2)
    ar2 ~= ar1;
 else
    ar2 = ar1;

 Wouldn't it be a good optimization for ~= to check for null first, to 
 prevent the copy?
 Not so sure, because for consistency's sake it's good to know that 'append'
 will always do a copy. Then we can rely on this to happen.
 -- Derek Parnell Melbourne, Australia

But take std.string.replace, for example. It only copies when needed. Isn't
this "exactly what you expect it to do"? I thought this was kind of understood
with COW.

Also, I don't think it's safe to rely on the copy to happen. If everything
follows COW rules, then that's all you need to know, and AFAIC ~= copying only
iff lhs !is null is a normal COW rule.

L.
May 30 2006
parent reply "Derek Parnell" <derek psych.ward> writes:
On Tue, 30 May 2006 23:30:21 +1000, Lionello Lunesu  
<lio lunesu.remove.com> wrote:

 Derek Parnell wrote:
 On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu  
 <lio lunesu.remove.com> wrote:

 Why does the appending to a 'null' array cause the contents to be  
 copied?

 int[] ar1; // some array, not constant

 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1

 But it could be like this:
 if (ar2)
    ar2 ~= ar1;
 else
    ar2 = ar1;

 Wouldn't it be a good optimization for ~= to check for null first, to  
 prevent the copy?
 Not so sure, because for consistency's sake it's good to know that 'append'
 will always do a copy. Then we can rely on this to happen.
 -- Derek Parnell Melbourne, Australia

 But take std.string.replace, for example. It only copies when needed. Isn't
 this "exactly what you expect it to do"? I thought this was kind of
 understood with COW.

But what do the 'replace' function and CoW have to do with this discussion?
All I'm saying is that the append operation is always going to copy the
right-hand data. If you don't want that behaviour (such as with CoW), then only
use the append operator when you need to copy the data.
 Also, I don't think it's safe to rely on the copy to happen. If  
 everything follows COW rules, then that's all you need to know, and  
 AFAIC ~= copying only iff lhs !is null is a normal COW rule.
But not everything uses or needs CoW. In fact, the append operator is a useful
method to copy something during a CoW function.

-- 
Derek Parnell
Melbourne, Australia
May 30 2006
parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
 But not everything uses or needs CoW. In fact, the append operator is a 
 useful method to copy something during a CoW function.
I think all of you have made it clear that it would indeed not be a good idea
to only copy the data sometimes. It's just that I've encountered code using ~=
where the "if(ar) ar~= else ar=" would yield some performance benefit. But I
guess we'll have to optimize those cases manually.

Thanks for your points of view. They are appreciated!

L.
May 30 2006
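
The manual optimization mentioned above could be wrapped in a small helper, sketched here for a current D2 compiler. `appendOrAlias` is a hypothetical name, not a Phobos function. It tests `.length` rather than the pointer, so null and empty left-hand sides are treated alike, and the caller explicitly accepts that the result may share storage with the right-hand side.

```d
import std.stdio;

// Hypothetical helper (not part of any library): alias rhs when lhs is
// empty, otherwise fall back to the normal copying append.
void appendOrAlias(T)(ref T[] lhs, T[] rhs)
{
    if (lhs.length == 0)
        lhs = rhs;    // no copy: lhs now shares rhs's storage
    else
        lhs ~= rhs;   // regular append: rhs's elements are copied
}

void main()
{
    int[] src = [1, 2, 3];
    int[] dst;
    appendOrAlias(dst, src);
    writeln(dst is src);   // same pointer and length
    dst[0] = 42;           // this write is also visible through src
    writeln(src);
}
```

Because the aliasing is opt-in at the call site, code that relies on `~=` always copying is unaffected.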
prev sibling next sibling parent "Chris Miller" <chris dprogramming.com> writes:
On Tue, 30 May 2006 07:47:22 -0400, Derek Parnell <derek psych.ward> wrote:

 On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu  
 <lio lunesu.remove.com> wrote:

 Why does the appending to a 'null' array cause the contents to be  
 copied?

 int[] ar1; // some array, not constant

 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1

 But it could be like this:
 if (ar2)
    ar2 ~= ar1;
 else
    ar2 = ar1;

 Wouldn't it be a good optimization for ~= to check for null first, to  
 prevent the copy?
 Not so sure, because for consistency's sake it's good to know that 'append'
 will always do a copy. Then we can rely on this to happen.

Exactly; many, many times I rely on this behavior. If it is changed, a lot of
memory will be overwritten unintentionally and broken code will result.
Consider the following code if ~= is changed:

   foo ~= bar[0 .. n];
   foo ~= baz; // likely corruption after bar[n].

It's kind of like asking why ~ is required to copy when it could just use and
overwrite the memory after the first operand. Too many people rely on ~
copying.
May 30 2006
prev sibling parent "Derek Parnell" <derek psych.ward> writes:
On Tue, 30 May 2006 23:20:45 +1000, Oskar Linde  
<oskar.lindeREM OVEgmail.com> wrote:

 Wouldn't it be a good optimization for ~= to check for null first, to  
 prevent the copy?
 Not so sure, because for consistency's sake it's good to know that 'append'
 will always do a copy. Then we can rely on this to happen.

 But does it? Aren't n repeated appends guaranteed to give at most log(n)
 allocations? How can that be if all appends force a copy? A simple test:

I didn't mention allocations. I was talking about copying.
 char[] a = "abcdefgh";
 char[] b = a[0..3];
 b ~= "xx";
 writefln("a = %s",a);

 prints:
 abcxxfgh
Yes, but the "xx" was copied, wasn't it? That is, 'a' does not contain a slice
of the literal "xx".

-- 
Derek Parnell
Melbourne, Australia
May 30 2006
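
For comparison, a sketch of the same test adapted to a current D2 compiler (an assumption; the output quoted above comes from the 2006 runtime). String literals are immutable in D2, so the sketch starts from a `.dup`. As far as I know, today's runtime also refuses to extend a slice in place when it does not end at the used end of its memory block, so the append reallocates and `a` is left untouched. The function name `appendThroughSlice` is made up for illustration.

```d
import std.stdio;

// Runs the test from the post on a mutable copy of the literal and
// returns the original array as seen after appending through the slice.
char[] appendThroughSlice()
{
    char[] a = "abcdefgh".dup;  // D2 literals are immutable, so copy first
    char[] b = a[0 .. 3];       // b aliases the first three elements of a
    b ~= "xx";                  // the characters of "xx" are copied into b
    assert(b == "abcxx");
    return a;
}

void main()
{
    writefln("a = %s", appendThroughSlice());
}
```

Either way, Derek's point holds: the characters of "xx" are copied, and neither array ends up holding a slice of the literal.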
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Lionello Lunesu wrote:
 Why does the appending to a 'null' array cause the contents to be copied?
 
 int[] ar1; // some array, not constant
 
 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1
 
 But it could be like this:
 if (ar2)
   ar2 ~= ar1;
 else
   ar2 = ar1;
 
 Wouldn't it be a good optimization for ~= to check for null first, to
 prevent the copy?
 
 L.
I think the problem with this is that it's an edge case. With your suggestion,
appending to a null array and appending to an *empty* array would have
completely different semantics. Now, every programmer who uses arrays has to
watch out for this one special case.

Yes, it would be better performance-wise, but it would be hell on programmers
since it's non-obvious behaviour.

	-- Daniel

-- 
v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D
a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP    http://hackerkey.com/
May 30 2006
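
The null versus empty distinction can be made concrete with a short sketch (current D2 assumed; `emptyButNotNull` is a hypothetical helper). A zero-length slice of a real array has a non-null pointer, so a pointer test and a length test disagree about it.

```d
import std.stdio;

// Hypothetical helper: returns a zero-length slice that still points
// at real storage.
int[] emptyButNotNull()
{
    int[] backing = [1, 2, 3];
    return backing[0 .. 0];
}

void main()
{
    int[] nullArr = null;
    int[] emptyArr = emptyButNotNull();

    writeln(nullArr is null);                    // identity test on the null array
    writeln(emptyArr is null);                   // identity test on the empty slice
    writeln(nullArr.length == emptyArr.length);  // length test
    writeln(nullArr == emptyArr);                // element-wise comparison
}
```

An append that special-cased only the null pointer would treat `nullArr` and `emptyArr` differently even though both have length 0 and compare equal with `==`.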
parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Daniel Keep wrote:
 
 Lionello Lunesu wrote:
 Why does the appending to a 'null' array cause the contents to be copied?

 int[] ar1; // some array, not constant

 int[] ar2 = null;
 ar2 ~= ar1;
 // ar2 !is ar1

 But it could be like this:
 if (ar2)
   ar2 ~= ar1;
 else
   ar2 = ar1;

 Wouldn't it be a good optimization for ~= to check for null first, to
 prevent the copy?

 L.
I think the problem with this is that it's an edge case. With your suggestion, appending to a null array and appending to an *empty* array would have completely different semantics. Now, every programmer who uses arrays has to watch out for this one special case.
It doesn't have to be different. It only depends on what the compiler is testing. If it tests "ar2.length" then the two cases will be treated the same way (no copy).
 Yes, it would be better performance-wise, but it would be hell on
 programmers since it's non-obvious behaviour.
I don't think those programmers should write code that depends on the compiler
copying the data in those cases. You got the array you asked for, so? If you
want to make sure you have a copy, you probably needed to .dup yourself anyway.

L.
May 30 2006
parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
(I posted another thing to this thread that I later canceled. It was 
just me misunderstanding what Derek meant. I'm just too tired right now...)

Lionello Lunesu skrev:

 I don't think those programmers should write code that depends on the 
 compiler copying the data in those cases. You got the array you asked 
 for, so? If you want to make sure you have a copy, you probably needed to 
 .dup yourself anyway.
I would put it the other way around. IMHO, the more guarantees the language can
give you the better. Defensive .dup-ing is never good. D doesn't have any
notion of ownership; you have to remember what you own and what you don't.
Currently, if you create an array and add (append) data to it, you are
guaranteed that you still own the data. With your suggestion, that would no
longer be true.

It is very common to append data that you don't own:

   a ~= "static read-only string constant";

or

   a ~= itoa(7); // May refer to static string data.

/Oskar
May 30 2006
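
A small sketch of the ownership guarantee described above, for a current D2 compiler (an assumption). The name `buildAndEdit` is hypothetical. Because the append copies the literal's characters into the array's own storage, writing to the result is safe. In D2 the type system also enforces part of this: a direct `char[] a = "literal";` is rejected because literals are immutable.

```d
import std.stdio;

// Hypothetical example: builds a mutable buffer by appending an
// immutable literal, then edits the buffer in place.
char[] buildAndEdit()
{
    char[] a;
    a ~= "static read-only string constant"; // copied into a's own storage
    a[0] = 'S';                              // safe: a owns this memory
    return a;
}

void main()
{
    writeln(buildAndEdit());
}
```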
parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
 It is very common to append data that you don't own:
 a ~= "static read-only string constant";
 or
 a ~= itoa(7); // May refer to static string data.
I can see how that would cause problems if it didn't copy :)

Thanks for the example.

L.
May 30 2006