www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - More on const-related issue

reply bearophile <bearophileHUGS lycos.com> writes:
With the latest DMD versions it's easy to create an immutable array of structs,
with a pure function (I think this gives advantages similar to the Transients
of the Clojure language):


struct Foo {
    int x;
    bool b;
}
immutable(Foo[]) generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}



But _very_ often I have to create arrays of structs that have mixed
mutable/immutable fields. To create them I have to use append, that is OK for
little arrays, but it's not nice when the arrays get large and when the rules
to fill them become complex:


struct Foo {
    int x;
    immutable bool b;
}
Foo[] generate() pure nothrow {
    Foo[] arr;
    foreach (i; 0 .. 10) {
        if (i == 3) // the rule to fill arr
            arr ~= Foo(i, true);
        else
            arr ~= Foo(i, false);
    }
    return arr;
}
void main() {
    auto a = generate();
}



As alternative, to avoid the append, I am able to use read-only properties for
the immutable fields, but this requires a more complex struct (and I think
currently DMD can't perform const-related optimizations on fileds like this
b__, because in D inside a module the "private" attribute is ignored. I think

because "private" works at module level too, so a read-only property is similar
to a const value. I'd like to know more about this optimization-related
situation):


struct Foo {
    int x;
    private bool b__ = false;
    // readonly
     property bool b() const pure nothrow { return b__; }
    this(in int x_, in bool b_=false) pure nothrow {
        this.x = x_;
        this.b__ = b_;
    }
}
Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;    
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}




This kind of code (currently the type system raises an error here) is not safe
even using just "const" for the field "b", so it's not a solution, because
inside generate the const is broken/ignored, so the compiler must perform zero
const-related optimizations regarding the field b (and its sub-tree data):


struct Foo {
    int x;
    const bool b;
}
Foo[] generate() pure nothrow {
    auto arr = new Foo[10];
    foreach (i; 0 .. 10)
        arr[i].x = i;
    arr[3] = Foo(3, true);
    return arr;
}
void main() {
    auto a = generate();
}



I think code like this is a mess (this is kind of the opposite of the "nested
constantness" I discussed some time ago):


struct Foo {
    int x;
    const bool b;
}
Foo[] generate() pure nothrow {
    with (mutable Foo.b) {
        auto arr = new Foo[10];
        foreach (i; 0 .. 10)
            arr[i].x = i;
        arr[3] = Foo(3, true);
    }
    return arr;
}
void main() {
    auto a = generate();
}


At the moment I don't have further ideas on this.

Bye,
bearophile
Sep 08 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 08, 2011 05:33:01 bearophile wrote:
 With the latest DMD versions it's easy to create an immutable array of
 structs, with a pure function (I think this gives advantages similar to the
 Transients of the Clojure language):
[snip] My take on it is pretty much just not a good idea create structs with const or immutable fields. Once they're const or immutable, you can't ever assign to them again, and as you point out, it really doesn't work to put them in arrays. Structs are generally meant to be assignable - and the fact that it's so easy to end up in a situation where a variable needs to be the struct's init property first, it's that much critical. Giving them const or immutable fields tends to mess with that pretty thoroughly. I'd argue that if you want a field in a struct to be treated as const or immutable, then you should make it private and give it a getter property function which returns a const or immutable version of it (and it can even return it by ref if you want to avoid unnecessary copying). So, yes. The situation is a bit of a pain, but init pretty much fries any chance of making const or immutable struct fields easy to deal with just like it fries the ability to have default constructors for structs. - Jonathan M Davis
Sep 08 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
 At the moment I don't have further ideas on this.
Sorry, another solution I've used to solve that problem is to split the array arr of structs in two parallel arrays, one with just the mutable fields and one with just the immutable ones. But this increases the program complexity, makes the program a bit more bug-prone, and in your program all your contracts have to assert the two parallel arrays have the same length: struct FooA { int x; // other mutable fields here } struct FooB { bool b; // other immutable fields here } FooA[] generateA() pure nothrow { auto arra = new FooA[10]; foreach (i; 0 .. 10) arra[i].x = i; return arra; } immutable(FooB[]) generateB() pure nothrow { auto arrb = new FooB[10]; arrb[3].b = true; return arrb; } void main() { auto a1 = generateA(); auto a2 = generateB(); } -------------------- In the end what's the difference between code like this: struct Foo { immutable bool b; } Foo[] generate() pure nothrow { Foo[] arr; foreach (i; 0 .. 10) { if (i == 3) arr ~= Foo(true); else arr ~= Foo(false); } return arr; } void main() { auto a = generate(); } And code like this? struct Foo { immutable bool b; } Foo[] generate() pure nothrow { auto arr = new Foo[10]; arr[3].b = true; return arr; } void main() { auto a = generate(); } In the first case, by code construction the compiler has guarantees that you will not write each struct more than one time (because appending never appends two times on the same array slot), and you will not read array slots that are not yet initialized (because the array items yet to be appended can't be read in any way, they don't exist yet). For the compiler it's hard to understand that the second program has the same qualities (well, arr[3] gets initialized twice, but structs don't allow argument-less constructors, so I think the constructor is like a pure function, so what matters is just the latest value assigned to arr[3]). Bye, bearophile
Sep 08 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 11:56 AM, bearophile wrote:
 At the moment I don't have further ideas on this.
Sorry, another solution I've used to solve that problem is to split the array arr of structs in two parallel arrays, one with just the mutable fields and one with just the immutable ones. But this increases the program complexity, makes the program a bit more bug-prone, and in your program all your contracts have to assert the two parallel arrays have the same length: struct FooA { int x; // other mutable fields here } struct FooB { bool b; // other immutable fields here } FooA[] generateA() pure nothrow { auto arra = new FooA[10]; foreach (i; 0 .. 10) arra[i].x = i; return arra; } immutable(FooB[]) generateB() pure nothrow { auto arrb = new FooB[10]; arrb[3].b = true; return arrb; } void main() { auto a1 = generateA(); auto a2 = generateB(); } -------------------- In the end what's the difference between code like this: struct Foo { immutable bool b; } Foo[] generate() pure nothrow { Foo[] arr; foreach (i; 0 .. 10) { if (i == 3) arr ~= Foo(true); else arr ~= Foo(false); } return arr; } void main() { auto a = generate(); } And code like this? struct Foo { immutable bool b; } Foo[] generate() pure nothrow { auto arr = new Foo[10]; arr[3].b = true; return arr; } void main() { auto a = generate(); } In the first case, by code construction the compiler has guarantees that you will not write each struct more than one time (because appending never appends two times on the same array slot), and you will not read array slots that are not yet initialized (because the array items yet to be appended can't be read in any way, they don't exist yet). For the compiler it's hard to understand that the second program has the same qualities (well, arr[3] gets initialized twice, but structs don't allow argument-less constructors, so I think the constructor is like a pure function, so what matters is just the latest value assigned to arr[3]).
I think the compiler just needs to be smart enough to understand that after the execution of code of the form: T[] a; foreach(i;x..y) { if(c1) a~=v1; else if(c2) a~=v2; ... // exactly one one-element-append per branch } the array a will have length y-x. Maybe value range propagation can help. then it can be optimized not to use the runtime that heavily. This would be useful in general. (alternatively, something like this could be allowed: Foo[] generate() { Foo[10] arr = void; // no way to create an uninitialized array on the heap =( foreach(i,ref x; arr){ x = Foo(i == 3); } return arr.dup; // only one runtime call }) That would be fine, because using the contents of uninitialized memory is disallowed anyways. (?) What about void generate(this Foo[] self, int len) this{ // maybe there is a better syntax // init self } ? This would just have the same restrictions as a class/struct constructor that initializes immutable fields.
Sep 08 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 03:02 PM, Timon Gehr wrote:
 On 09/08/2011 11:56 AM, bearophile wrote:
 At the moment I don't have further ideas on this.
Sorry, another solution I've used to solve that problem is to split the array arr of structs in two parallel arrays, one with just the mutable fields and one with just the immutable ones. But this increases the program complexity, makes the program a bit more bug-prone, and in your program all your contracts have to assert the two parallel arrays have the same length: struct FooA { int x; // other mutable fields here } struct FooB { bool b; // other immutable fields here } FooA[] generateA() pure nothrow { auto arra = new FooA[10]; foreach (i; 0 .. 10) arra[i].x = i; return arra; } immutable(FooB[]) generateB() pure nothrow { auto arrb = new FooB[10]; arrb[3].b = true; return arrb; } void main() { auto a1 = generateA(); auto a2 = generateB(); } -------------------- In the end what's the difference between code like this: struct Foo { immutable bool b; } Foo[] generate() pure nothrow { Foo[] arr; foreach (i; 0 .. 10) { if (i == 3) arr ~= Foo(true); else arr ~= Foo(false); } return arr; } void main() { auto a = generate(); } And code like this? struct Foo { immutable bool b; } Foo[] generate() pure nothrow { auto arr = new Foo[10]; arr[3].b = true; return arr; } void main() { auto a = generate(); } In the first case, by code construction the compiler has guarantees that you will not write each struct more than one time (because appending never appends two times on the same array slot), and you will not read array slots that are not yet initialized (because the array items yet to be appended can't be read in any way, they don't exist yet). For the compiler it's hard to understand that the second program has the same qualities (well, arr[3] gets initialized twice, but structs don't allow argument-less constructors, so I think the constructor is like a pure function, so what matters is just the latest value assigned to arr[3]).
I think the compiler just needs to be smart enough to understand that after the execution of code of the form: T[] a; foreach(i;x..y) { if(c1) a~=v1; else if(c2) a~=v2; ... // exactly one one-element-append per branch } the array a will have length y-x. Maybe value range propagation can help. then it can be optimized not to use the runtime that heavily. This would be useful in general. (alternatively, something like this could be allowed: Foo[] generate() { Foo[10] arr = void; // no way to create an uninitialized array on the heap =( foreach(i,ref x; arr){ x = Foo(i == 3); } return arr.dup; // only one runtime call }) That would be fine, because using the contents of uninitialized memory is disallowed anyways. (?) What about void generate(this Foo[] self, int len) this{ // maybe there is a better syntax // init self }
Better: void generate(out this Foo[] self, int len){}
 ?

 This would just have the same restrictions as a class/struct constructor
 that initializes immutable fields.
Sep 08 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr:

 What about
 
 void generate(this Foo[] self, int len) this{ // maybe there is a better syntax
      // init self
 }
 
 ?
 
 This would just have the same restrictions as a class/struct constructor 
 that initializes immutable fields.
At a first look I like this idea. Is this idea going to work well? Note that currently you can't initialize a const array in a constructor (I think this is bug): struct Foo { const int i; const int[3] array; this(int x) { this.i = x; this.array[0] = x; // Error: this.array[0] isn't mutable } } void main() {} Bye, bearophile
Sep 08 2011
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):
 
 
 ...
Two alternate ideas: 1. Reserve the required space then start appending 2. Use a const-free struct to build the array and then cast array to the proper type on return.
Sep 08 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 03:07 PM, Jason House wrote:
 bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):


 ...
Two alternate ideas: 1. Reserve the required space then start appending
This is still pretty slow at the moment because afaik each append is a runtime call.
 2. Use a const-free struct to build the array and then cast array to the
proper type on return.
That is a nice idea, it could be implemented using a constFree!Struct template struct.
Sep 08 2011
parent Jason House <jason.james.house gmail.com> writes:
Timon Gehr Wrote:

 On 09/08/2011 03:07 PM, Jason House wrote:
 bearophile Wrote:

 With the latest DMD versions it's easy to create an immutable array of
structs, with a pure function (I think this gives advantages similar to the
Transients of the Clojure language):


 ...
Two alternate ideas: 1. Reserve the required space then start appending
This is still pretty slow at the moment because afaik each append is a runtime call.
If you do it a bit more manually, you can use emplace.
 2. Use a const-free struct to build the array and then cast array to the
proper type on return.
That is a nice idea, it could be implemented using a constFree!Struct template struct.
Thanks. A constFree template is likely very useful. I wonder how frequently it would get abused.
Sep 08 2011