digitalmars.D - More on const-related issue

bearophile <bearophileHUGS lycos.com> writes:

With the latest DMD versions it's easy to create an immutable array of structs,
with a pure function (I think this gives advantages similar to the Transients
of the Clojure language):


struct Foo {
    int x;
    bool b;
}
immutable(Foo[]) generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}



But _very_ often I have to create arrays of structs that have mixed
mutable/immutable fields. To create them I have to use append, which is OK for
small arrays, but it's not nice when the arrays get large and when the rules
to fill them become complex:


struct Foo {
    int x;
    immutable bool b;
}
Foo[] generate() pure nothrow {
    Foo[] arr;
    foreach (i; 0 .. 10) {
        if (i == 3) // the rule to fill arr
            arr ~= Foo(i, true);
        else
            arr ~= Foo(i, false);
    }
    return arr;
}
void main() {
    auto a = generate();
}



As an alternative, to avoid the append, I can use read-only properties for
the immutable fields, but this requires a more complex struct (and I think
currently DMD can't perform const-related optimizations on fields like this
b__, because in D inside a module the "private" attribute is ignored. I think

because "private" works at module level too, so a read-only property is similar
to a const value. I'd like to know more about this optimization-related
situation):


struct Foo {
    int x;
    private bool b__ = false;
    // readonly
    @property bool b() const pure nothrow { return b__; }
    this(in int x_, in bool b_=false) pure nothrow {
        this.x = x_;
        this.b__ = b_;
    }
}
Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;    
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}




This kind of code (currently the type system raises an error here) is not safe
even using just "const" for the field "b", so it's not a solution, because
inside generate the const is broken/ignored, so the compiler must perform zero
const-related optimizations regarding the field b (and its sub-tree data):


struct Foo {
    int x;
    const bool b;
}
Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}



I think code like this is a mess (this is kind of the opposite of the "nested
constantness" I discussed some time ago):


struct Foo {
    int x;
    const bool b;
}
Foo[] generate() pure nothrow {
    with (mutable Foo.b) {
        auto arr = new Foo[10];
        foreach (i; 0 .. 10)
            arr[i].x = i;
        arr[3] = Foo(3, true);
    }
    return arr;
}
void main() {
    auto a = generate();
}


At the moment I don't have further ideas on this.

Bye,
bearophile

Sep 08 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 08, 2011 05:33:01 bearophile wrote:
 With the latest DMD versions it's easy to create an immutable array of
 structs, with a pure function (I think this gives advantages similar to the
 Transients of the Clojure language):

[snip]

My take on it is that it's pretty much just not a good idea to create structs
with const or immutable fields. Once they're const or immutable, you can't ever
assign to them again, and as you point out, it really doesn't work to put them
in arrays. Structs are generally meant to be assignable, and because it's so
easy to end up in a situation where a variable has to hold the struct's init
value first, that matters all the more. Giving them const or immutable fields
tends to mess with that pretty thoroughly.

I'd argue that if you want a field in a struct to be treated as const or 
immutable, then you should make it private and give it a getter property 
function which returns a const or immutable version of it (and it can even 
return it by ref if you want to avoid unnecessary copying).
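A minimal sketch of that suggestion, assuming a field name b_ and a two-argument constructor that are purely illustrative. The struct stays assignable, so it can live in a preallocated array, while code outside the module only sees a read-only view of the flag:

```d
struct Foo {
    int x;
    private bool b_;   // hypothetical name; writable only inside this module

    this(int x, bool b = false) pure nothrow {
        this.x = x;
        this.b_ = b;
    }

    // Read-only accessor; returns by ref to const to avoid a copy.
    @property ref const(bool) b() const pure nothrow return {
        return b_;
    }
}

Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;
    arr[3] = Foo(3, true);   // whole-struct assignment still works
    return arr;
}

unittest {
    auto a = generate();
    assert(a[3].b && !a[4].b);
}
```

Returning by ref hardly matters for a bool; it is shown only because it would matter for a larger field. Note that "private" in D is module-level, so code in the same module can still write b_ directly.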

So, yes. The situation is a bit of a pain, but init pretty much fries any 
chance of making const or immutable struct fields easy to deal with just like 
it fries the ability to have default constructors for structs.

- Jonathan M Davis

Sep 08 2011

bearophile <bearophileHUGS lycos.com> writes:

 At the moment I don't have further ideas on this.

Sorry, another solution I've used to solve that problem is to split the array
arr of structs into two parallel arrays, one with just the mutable fields and
one with just the immutable ones. But this increases the program complexity,
makes the program a bit more bug-prone, and in your program all your contracts
have to assert that the two parallel arrays have the same length:


struct FooA {
    int x;
    // other mutable fields here
}
struct FooB {
    bool b;
    // other immutable fields here
}
FooA[] generateA() pure nothrow {
    auto arra = new FooA[10];
    foreach (i; 0 .. 10)
        arra[i].x = i;
    return arra;
}
immutable(FooB[]) generateB() pure nothrow {
    auto arrb = new FooB[10];
    arrb[3].b = true;
    return arrb;
}

void main() {
    auto a1 = generateA();
    auto a2 = generateB();
}
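
To make the "contracts have to assert the same length" cost concrete, here is a small sketch of a consumer of the two parallel arrays. The function name sumFlagged is hypothetical and not part of the code above, and the in/do contract syntax assumes a reasonably recent compiler (older ones spell "do" as "body"):

```d
struct FooA { int x; }
struct FooB { bool b; }

// Sums the x fields whose matching immutable flag is set.
int sumFlagged(const FooA[] a, immutable FooB[] b) pure nothrow
in {
    // Every function taking both arrays needs a check like this one.
    assert(a.length == b.length, "parallel arrays must have equal length");
}
do {
    int total = 0;
    foreach (i; 0 .. a.length)
        if (b[i].b)
            total += a[i].x;
    return total;
}

unittest {
    auto a = [FooA(10), FooA(20), FooA(30)];
    immutable FooB[] b = [FooB(false), FooB(true), FooB(true)];
    assert(sumFlagged(a, b) == 50);
}
```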

--------------------

In the end what's the difference between code like this:



struct Foo {
    immutable bool b;
}
Foo[] generate() pure nothrow {
    Foo[] arr;
    foreach (i; 0 .. 10) {
        if (i == 3)
            arr ~= Foo(true);
        else
            arr ~= Foo(false);
    }
    return arr;
}
void main() {
    auto a = generate();
}


And code like this?

struct Foo {
    immutable bool b;
}
Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    arr[3].b = true;
    return arr;
}
void main() {
    auto a = generate();
}


In the first case, by code construction the compiler has guarantees that you
will not write each struct more than one time (because appending never appends
two times on the same array slot), and you will not read array slots that are
not yet initialized (because the array items yet to be appended can't be read
in any way, they don't exist yet). For the compiler it's hard to understand
that the second program has the same qualities (well, arr[3] gets initialized
twice, but structs don't allow argument-less constructors, so I think the
constructor is like a pure function, so what matters is just the latest value
assigned to arr[3]).

Bye,
bearophile

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 11:56 AM, bearophile wrote:
 [snip]

 In the first case, by code construction the compiler has guarantees that you
will not write each struct more than one time (because appending never appends
two times on the same array slot), and you will not read array slots that are
not yet initialized (because the array items yet to be appended can't be read
in any way, they don't exist yet). For the compiler it's hard to understand
that the second program has the same qualities (well, arr[3] gets initialized
twice, but structs don't allow argument-less constructors, so I think the
constructor is like a pure function, so what matters is just the latest value
assigned to arr[3]).

I think the compiler just needs to be smart enough to understand that 
after the execution of code of the form:

T[] a;
foreach(i;x..y) {
     if(c1) a~=v1;
     else if(c2) a~=v2;
     ... // exactly one one-element-append per branch
}

the array a will have length y-x. Maybe value range propagation can help.

Then it can be optimized not to use the runtime that heavily. This would 
be useful in general.
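
A small sketch of the pattern in question, with a hypothetical function name build. Each iteration appends exactly one element, whichever branch is taken, so the final length is known before the loop runs, and that is the fact a compiler would have to prove in order to preallocate:

```d
int[] build(int x, int y) pure nothrow {
    int[] a;
    foreach (i; x .. y) {
        // Exactly one one-element append per branch.
        if (i % 3 == 0)
            a ~= 0;
        else if (i % 3 == 1)
            a ~= 1;
        else
            a ~= 2;
    }
    return a;
}

unittest {
    // The loop runs y - x times, so the result has y - x elements.
    assert(build(2, 12).length == 10);
}
```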

(alternatively, something like this could be allowed:

Foo[] generate() {
     Foo[10] arr = void; // no way to create an uninitialized array on the heap =(
     foreach(i,ref x; arr){
         x = Foo(i == 3);
     }
     return arr.dup; // only one runtime call
})

That would be fine, because using the contents of uninitialized memory 
is disallowed anyways. (?)

What about

void generate(this Foo[] self, int len) this{ // maybe there is a better syntax
     // init self
}

?

This would just have the same restrictions as a class/struct constructor 
that initializes immutable fields.

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 03:02 PM, Timon Gehr wrote:
 [snip]

 What about

 void generate(this Foo[] self, int len) this{ // maybe there is a better syntax
 // init self
 }

Better:

void generate(out this Foo[] self, int len){}


 ?

 This would just have the same restrictions as a class/struct constructor
 that initializes immutable fields.

Sep 08 2011

bearophile <bearophileHUGS lycos.com> writes:

Timon Gehr:

 What about
 
 void generate(this Foo[] self, int len) this{ // maybe there is a better syntax
      // init self
 }
 
 ?
 
 This would just have the same restrictions as a class/struct constructor 
 that initializes immutable fields.

At a first look I like this idea. Is this idea going to work well?

Note that currently you can't initialize a const array element by element in a
constructor (I think this is a bug):

struct Foo {
    const int i;
    const int[3] array;
    this(int x) {
        this.i = x;
        this.array[0] = x; // Error: this.array[0] isn't mutable
    }
}
void main() {}
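
A possible workaround sketch, assuming that initializing the whole const field at once is accepted in a constructor the same way it is for the scalar field i. The values stored in the array are arbitrary:

```d
struct Foo {
    const int i;
    const int[3] array;

    this(int x) {
        this.i = x;
        // Initialize the whole array in one step rather than per element.
        this.array = [x, x, x];
    }
}

unittest {
    auto f = Foo(5);
    assert(f.i == 5);
    assert(f.array == [5, 5, 5]);
}
```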

Bye,
bearophile

Sep 08 2011

Jason House <jason.james.house gmail.com> writes:

bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):
 
 
 ...

Two alternate ideas:
1. Reserve the required space then start appending
2. Use a const-free struct to build the array and then cast array to the proper
type on return.
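
A minimal sketch of idea 1, using the Foo struct from the original post. It assumes the built-in reserve function for dynamic arrays; the appends themselves still go through the runtime, so this only avoids repeated reallocation:

```d
struct Foo {
    int x;
    immutable bool b;
}

Foo[] generate() pure nothrow {
    Foo[] arr;
    arr.reserve(10);             // ask for room for all elements up front
    foreach (i; 0 .. 10)
        arr ~= Foo(i, i == 3);   // each element is constructed exactly once
    return arr;
}

unittest {
    auto a = generate();
    assert(a.length == 10 && a[3].b);
}
```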

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 03:07 PM, Jason House wrote:
 bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):


 ...

 Two alternate ideas:
 1. Reserve the required space then start appending

This is still pretty slow at the moment because afaik each append is a 
runtime call.

 2. Use a const-free struct to build the array and then cast array to the
proper type on return.

That is a nice idea, it could be implemented using a constFree!Struct 
template struct.
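
A hand-written sketch of idea 2, without the constFree template. FooBuilder is a hypothetical const-free twin of Foo with the same layout. The cast bypasses the type system, so this is only reasonable if no mutable reference to the temporary array survives the function:

```d
struct Foo {
    int x;
    immutable bool b;
}

// Hypothetical const-free twin; must have the same layout as Foo.
struct FooBuilder {
    int x;
    bool b;
}

static assert(Foo.sizeof == FooBuilder.sizeof);

Foo[] generate() pure nothrow @trusted {
    auto tmp = new FooBuilder[10];
    foreach (i, ref item; tmp) {
        item.x = cast(int) i;
        item.b = (i == 3);
    }
    // Reinterpret the finished array; tmp must not be used after this.
    return cast(Foo[]) tmp;
}

unittest {
    auto a = generate();
    assert(a.length == 10 && a[3].b && !a[4].b);
}
```

A constFree!Struct template would presumably generate FooBuilder from Foo automatically, which would avoid keeping the two declarations in sync by hand.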

Sep 08 2011

Jason House <jason.james.house gmail.com> writes:

Timon Gehr Wrote:

 On 09/08/2011 03:07 PM, Jason House wrote:
 bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):


 ...

 Two alternate ideas:
 1. Reserve the required space then start appending

 
 This is still pretty slow at the moment because afaik each append is a 
 runtime call.

If you do it a bit more manually, you can use emplace.


 2. Use a const-free struct to build the array and then cast array to the
proper type on return.

 
 That is a nice idea, it could be implemented using a constFree!Struct 
 template struct.

Thanks. A constFree template is likely very useful. I wonder how frequently it
would get abused.

Sep 08 2011
