www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - making COW and ownership

reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  I was thinking briefly and glancing over documentation regarding 
cluts and non-cluts (Reference vs value types), and considering 
my BitArray implementation. I was having a problem trying to 
solve how to avoid duplicating it unless it actually was needed; 
And I've come to a possible solution.

  There are data types that must have new copies, but what if we 
delay the duplicating unless it actually tries to make a change. 
Mind you this won't be usable in all cases; But if we can put an 
ownership tag on it, perhaps it might do a good portion of the 
job.

  Here's a quick thrown together example of what I'm talking about.

[code]
import std.stdio;
import std.conv;

struct COWArray(T) {
   COWArray *owner;
   T[] data;

   this(int i) {
     owner = &this;
     data.length = i;
   }

   inout(T) opIndex(int i) inout pure nothrow {
     return data[i];
   }

   ref COWArray opIndexAssign(T value, int i) pure {
     if (owner != &this) {
       owner = &this;
       data = data.dup;
     }

     data[i] = value;
     return this;
   }
}

unittest {
   auto ca = COWArray!int(4);

   writeln("\nOriginal:");
   foreach(i; 0 .. 4) {
     ca[i] = 10 + i;
     writeln(ca);
   }

   auto new_ca = ca;

   writeln("\nNew section: ");
   foreach(i; 0 .. 4) {
     new_ca[i] = 200 + i;
     writeln(new_ca);
   }

   writeln("\nPost processing:");
   writeln(ca);
   writeln(new_ca);
}
[/code]

  This produces:

Original:
COWArray!(int)(18FDCC, [10, 0, 0, 0])
COWArray!(int)(18FDCC, [10, 11, 0, 0])
COWArray!(int)(18FDCC, [10, 11, 12, 0])
COWArray!(int)(18FDCC, [10, 11, 12, 13])

New section:
COWArray!(int)(18FDE4, [200, 11, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 203])

Post processing:
COWArray!(int)(18FDCC, [10, 11, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 203])
Nov 20 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/21/2012 05:23 AM, Era Scarecrow wrote:
   I was thinking briefly and glancing over documentation regarding cluts
 and non-cluts (Reference vs value types), and considering my BitArray
 implementation. I was having a problem trying to solve how to avoid
 duplicating it unless it actually was needed; And I've come to a
 possible solution.

   There are data types that must have new copies, but what if we delay
 the duplicating unless it actually tries to make a change. Mind you this
 won't be usable in all cases; But if we can put an ownership tag on it,
 perhaps it might do a good portion of the job.

   Here's a quick thrown together example of what I'm talking about.

 [code]
 import std.stdio;
 import std.conv;

 struct COWArray(T) {
    COWArray *owner;
    T[] data;

    this(int i) {
      owner = &this;
      data.length = i;
    }

    inout(T) opIndex(int i) inout pure nothrow {
      return data[i];
    }

    ref COWArray opIndexAssign(T value, int i) pure {
      if (owner != &this) {
        owner = &this;
        data = data.dup;
      }

      data[i] = value;
      return this;
    }
 }

 unittest {
    auto ca = COWArray!int(4);

    writeln("\nOriginal:");
    foreach(i; 0 .. 4) {
      ca[i] = 10 + i;
      writeln(ca);
    }

    auto new_ca = ca;

    writeln("\nNew section: ");
    foreach(i; 0 .. 4) {
      new_ca[i] = 200 + i;
      writeln(new_ca);
    }

    writeln("\nPost processing:");
    writeln(ca);
    writeln(new_ca);
 }
 [/code]

   This produces:

 Original:
 COWArray!(int)(18FDCC, [10, 0, 0, 0])
 COWArray!(int)(18FDCC, [10, 11, 0, 0])
 COWArray!(int)(18FDCC, [10, 11, 12, 0])
 COWArray!(int)(18FDCC, [10, 11, 12, 13])

 New section:
 COWArray!(int)(18FDE4, [200, 11, 12, 13])
 COWArray!(int)(18FDE4, [200, 201, 12, 13])
 COWArray!(int)(18FDE4, [200, 201, 202, 13])
 COWArray!(int)(18FDE4, [200, 201, 202, 203])

 Post processing:
 COWArray!(int)(18FDCC, [10, 11, 12, 13])
 COWArray!(int)(18FDE4, [200, 201, 202, 203])
This also duplicates the data if you move the struct in memory and then mutate. Probably you just need to have a boolean owner flag and set it to false on postblit.
Nov 21 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 21 November 2012 at 10:15:52 UTC, Timon Gehr wrote:
 This also duplicates the data if you move the struct in memory 
 and then mutate. Probably you just need to have a boolean owner 
 flag and set it to false on postblit.
Hadn't thought of a move being involved.. I was trying to use a way to identify it at the off chance it copied without a postblit, then it could still identify itself... (casting could easily do that). I have code that forcibly converts a struct into a raw array, since reversing that is needed at times. In those cases the owner address would still identify it where with a flag it would overwrite the referenced data believing it was the owner. Let's see... Here's the convert struct/type to array, modified to simplify. void[] getArrayOfType(TYPE, V ...)() { TYPE[] t; t.length = 1; static if (V.length) t[0] = TYPE(V); return cast(void[]) t; }
Nov 21 2012
prev sibling parent "Dan" <dbdavidson yahoo.com> writes:
On Wednesday, 21 November 2012 at 04:23:56 UTC, Era Scarecrow 
wrote:
  Here's a quick thrown together example of what I'm talking 
 about.
I think I see where you are going. After the copy creation of new_ca you have two objects referencing exactly the same data. It doesn't show it, but the benefit is that a whole bunch of reading can be going on against two handles to the same data without conflict (i.e. before the first assignment 'new_ca[i] = 200+i;' and you have delayed work. I think for this to be effective or sane you have to lock all the doors of change. So, for example, you would need 'data' to be private, otherwise data could be changed without successfully triggering the copy via opAssignIndex. Also, if ca itself were changed after the assignment to new_ca, but before mutation on new_ca got its copy then both would see that change, which is probably not desired. An alternative might be to use the const system. Assume that the whole lot of reading is going on in multiple pieces of code - different functions. Just give them the original bit array as const. Then they can not write to it directly, but can read from it. Eventually when they need to mutate they would need to copy just before beginning mutation. The const system helps, but what is missing is a way to copy a const object. No way exists yet - but I'm proposing and have a 'gdup' that does for this case. Any comments on it appreciated. https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf https://github.com/patefacio/d-help/blob/master/d-help/opmix/mix.d The idea is that the transitive nature of const in D is perfect for providing those copy on write semantics without any extra work if you can copy the const object when needed. Thanks Dan -------------------------------------------------- import std.stdio; import std.conv; import opmix.mix; struct COWArray(T) { private T[] data; this(int i) { data.length = i; } inout(T) opIndex(int i) inout pure nothrow { return data[i]; } ref COWArray opIndexAssign(T value, int i) pure { data[i] = value; return this; } } void give_away_ref(T)(ref const(COWArray!T) ca) { // Do lots of reading auto new_ca = ca.gdup; writeln("\nNew section: "); foreach(i; 0 .. 4) { new_ca[i] = 200 + i; writeln(new_ca); } } unittest { auto ca = COWArray!int(4); writeln("\nOriginal:"); foreach(i; 0 .. 4) { ca[i] = 10 + i; writeln(ca); } give_away_ref(ca); writeln("\nPost processing:"); writeln(ca); }
Nov 21 2012