digitalmars.D - Troubles with immutable arrays

"bearophile" <bearophileHUGS lycos.com> writes:
After a discussion in D.learn started by someone else, Timon Gehr 
filed a bug report at my suggestion:

http://d.puremagic.com/issues/show_bug.cgi?id=8400

But the bug was fixed in the opposite way of what I was thinking.

The problem was that the length of global immutable arrays is 
seen as a compile-time constant.
Instead of fixing that, the fix for Issue 8400 has done the 
opposite: now even the length of local immutable arrays is 
sometimes seen as a compile-time constant. An example:


int[] foo(in int n) pure nothrow {
     int[] a;
     foreach (i; 0 .. n)
         a ~= i * 10;
     return a;
}
void main() {
     import core.stdc.stdio: printf;
     immutable int[] A = foo(5);
     int[A.length] B; // compiles only if A.length is a compile-time constant
     printf("%zu\n", B.length); // B.length is an unsigned size_t
}



The asm, compiled with -release:

_D4temp3fooFNaNbxiZAi   comdat
L0:     enter   018h,0
         push    EBX
         push    ESI
         mov dword ptr -018h[EBP],0
         mov dword ptr -014h[EBP],0
         mov dword ptr -010h[EBP],0
         mov -0Ch[EBP],EAX
L1E:        mov EAX,-010h[EBP]
         cmp EAX,-0Ch[EBP]
         jge L48
         lea ECX,-010h[EBP]
         mov EDX,[ECX]
         lea EBX,[EDX*4][EDX]
         add EBX,EBX
         push    EBX
         lea ESI,-018h[EBP]
         push    ESI
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push    EAX
         call    near ptr __d_arrayappendcT
         add ESP,0Ch
         inc dword ptr -010h[EBP]
         jmp short   L1E
L48:        mov EDX,-014h[EBP]
         mov EAX,-018h[EBP]
         pop ESI
         pop EBX
         leave
         ret

__Dmain comdat
L0:     enter   018h,0
         push    EBX
         push    ESI
         push    5
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push    EAX
         call    near ptr __d_arrayliteralTX
         mov dword ptr [EAX],0
         mov dword ptr 4[EAX],0Ah
         mov dword ptr 8[EAX],014h
         mov dword ptr 0Ch[EAX],01Eh
         mov dword ptr 010h[EAX],028h
         mov ECX,EAX
         mov EBX,5
         lea EDX,-018h[EBP]
         xor EAX,EAX
         mov [EDX],EAX
         mov 4[EDX],EAX
         mov 8[EDX],EAX
         mov 0Ch[EDX],EAX
         mov 010h[EDX],EAX
         push    5
         mov ESI,offset FLAT:_DATA
         push    ESI
         call    near ptr _printf
         xor EAX,EAX
         add ESP,010h
         pop ESI
         pop EBX
         leave
         ret



This code compiles too, so A is sometimes computed at run-time 
and sometimes at compile-time:

int[] foo(in int n) pure nothrow {
     int[] a;
     foreach (i; 0 .. n)
         a ~= i * 10;
     return a;
}
void main() {
     import core.stdc.stdio: printf;
     int n = 5;
     immutable int[] A = foo(n);
}



Now immutable arrays are sometimes seen as enums. I think this is 
a problem. I think in D, compile-time evaluation should run only 
when an expression appears in a context where a compile-time value 
is required. But now the situation is muddier, because merely 
knowing n at compile-time is not a way of asking for A to be 
computed at compile-time.
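
To make the distinction concrete, here is a minimal sketch of 
contexts that require a compile-time value, next to an ordinary 
run-time call. It reuses foo from above; tableLength and buffer 
are names invented for this illustration only.

```d
int[] foo(in int n) pure nothrow {
    int[] a;
    foreach (i; 0 .. n)
        a ~= i * 10;
    return a;
}

// Contexts that require a compile-time value, so foo is run through CTFE:
enum tableLength = foo(3).length;   // enum initializer (hypothetical name)
static assert(foo(2) == [0, 10]);   // static assert condition
int[foo(4).length] buffer;          // static array dimension (hypothetical name)

void main() {
    int n = 5;                      // ordinary run-time variable
    auto r = foo(n);                // ordinary run-time call
    assert(r == [0, 10, 20, 30, 40]);
    assert(tableLength == 3 && buffer.length == 4);
}
```

In each of the first three uses the request for compile-time 
evaluation is visible in the code itself, which is the property 
I would like to keep.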

Another problem is that compile-time arrays are not efficient in 
many situations: they get copied every time you use them, and 
I think that __d_arrayliteralTX performs a heap allocation. So 
now both enum and immutable arrays perform heap allocations every 
time you use them, but only in some situations.
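
A small sketch of the copying cost for enum arrays, compared with 
an array kept in static data. The names viaEnum and viaData are 
hypothetical, and this shows only the enum case, not the local 
immutable case from the asm above.

```d
enum int[] viaEnum = [0, 10, 20];             // manifest constant (hypothetical name)
static immutable int[] viaData = [0, 10, 20]; // single array in static data (hypothetical name)

void main() {
    int[] a = viaEnum;   // the literal is materialized here...
    int[] b = viaEnum;   // ...and again here, as a separate array
    a[0] = 99;
    assert(b[0] == 0);          // b is unaffected, so a and b do not share memory
    assert(a.ptr !is b.ptr);

    immutable(int)[] c = viaData;
    immutable(int)[] d = viaData;
    assert(c.ptr is d.ptr);     // both slices refer to the same stored array
}
```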

I think this is a messy situation. I think the fix for bug 8400 
should be undone, and Issue 8400 should be fixed the other way, 
by turning global immutable arrays into run-time entities too.

The bug was fixed by Hara and accepted by Walter, both of them 
are very intelligent, so maybe I am just very wrong :-)

Bye,
bearophile
Jul 23 2012
kenji hara <k.hara.pg gmail.com> writes:
Oh, no. I accidentally made that *fix*, even though it is contrary
to my own position.

I don't like the cross-talk between compile-time and run-time variables.
A related issue is bug 3449, but it has been opposed by Walter.

http://d.puremagic.com/issues/show_bug.cgi?id=3449

A small example of bug 3449:
----
void main()
{
    struct Foo1 { const int bar; }
    pragma(msg, Foo1.sizeof);       // Prints "4"

    Foo1 foo1;
    auto p1 = &foo1.bar;            // Compiles, as expected.

    struct Foo2 { const int bar = 123; }
    pragma(msg, Foo2.sizeof);       // Prints "1", not "4"

    Foo2 foo2;
    auto p2 = &foo2.bar;    // Error: constant 123 is not an lvalue
}

Why can't we take the address of foo2.bar?
The answer is: the compiler makes Foo2.bar a manifest constant, because
its type is not mutable and it has an initializer.

---

With current dmd, *all* variable declarations that have a non-mutable
type and an initializer are _speculatively_ interpreted at compile time
(== CTFE-ed). If that succeeds, the declaration is treated the same as
a manifest constant. That is the reason for the *bug* you described
and for bug 3449.

I think the *implicit interpretation* is inherited from D1 and, going
back further, from C++ constant variables.
BUT, in D2, we have the 'enum' declaration, which expresses that the
declaration really is a manifest constant.
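
As a minimal sketch of that distinction (the names manifest and stored
are invented for illustration): an enum has no storage and no address,
while an immutable module-level variable does.

```d
enum int manifest = 123;       // manifest constant: no storage, no address
immutable int stored = 123;    // variable placed in static data

// Taking the address of a manifest constant is rejected by the compiler.
static assert(!__traits(compiles, &manifest));
// Both values are still readable at compile time.
static assert(manifest + stored == 246);

void main() {
    immutable(int)* p = &stored;   // fine: stored is an lvalue
    assert(*p == 123);
}
```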

So the muddy interpretation just confuses many D users and has little benefit.
I think we should separate run-time variable declarations from compile-time
ones, to flatten the learning curve.

Regards.

Kenji Hara

Jul 23 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Thank you for your answer and your understanding, Hara. I hope 
you will improve the situation.

Regarding this issue: 
http://d.puremagic.com/issues/show_bug.cgi?id=3449
Maybe Walter is wrong about it, so I suggest people express 
their opinions.

Bye,
bearophile
Jul 23 2012
Don Clugston <dac nospam.com> writes:
On 23/07/12 15:29, bearophile wrote:
> After a discussion in D.learn started by someone else, after a
> suggestion of mine Timon Gehr has added a bug report:
>
> http://d.puremagic.com/issues/show_bug.cgi?id=8400
>
> But the bug was fixed in the opposite way of what I was thinking.
>
> The problem was that the length of global immutable arrays is
> seen as a compile-time constant.
> Instead of fixing that, Issue 8400 has done the opposite, now even the
> length of local immutable arrays is seen sometimes as a compile-time
> constant, an example:

Sorry bearophile, I completely disagree with this post. Currently, when a compile-time value is required, CTFE is attempted. If it fails, an error message is generated. You are asking for a corner case to be introduced. Under certain circumstances (which aren't clearly defined), you want CTFE to *not* be attempted.

>     immutable int[] A = foo(5);
>     int[A.length] B;
> This code too compiles, so A is sometimes computed at run-time and
> sometimes at compile-time:
>     immutable int[] A = foo(n);
> Now immutable arrays are sometimes seen as enums.

That is not correct: an immutable array is always different from an enum. For example, an enum is simply a manifest constant, and does not have an address. An immutable array always has an address.

> I think this is a
> problem. I think in D compile-time is run only if it's evaluated in a
> context where compile-time values are required. But now the situation is
> more muddy, because knowing n at compile-time is not a way to ask A to
> be computed at compile-time.

The only consequence is that if you don't require it at compile time, a particular optimization might not happen. There is no change to semantics.

> Another problem is that compile-time arrays in many situations are not
> efficient, they get copied every time you use them, and I think that
> __d_arrayliteralTX performs a heap allocation. So now both enum and
> immutable arrays perform heap allocations every time you use them, but
> only in some situations.

That's the famous bug 2356, which is completely irrelevant to this situation.

> I think this is a messy situation, I think the fix for bug 8400 should
> be undone, and I think Issue 8400 should be fixed the other way, turning
> global immutable arrays too into run-time entities.

Did you even know that initializers of global variables, including arrays, including even mutable arrays, are ALWAYS CTFE'd?
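
A minimal sketch of that last point, reusing foo from the original post. The names mutableTable and fixedTable are invented for this illustration.

```d
int[] foo(in int n) pure nothrow {
    int[] a;
    foreach (i; 0 .. n)
        a ~= i * 10;
    return a;
}

// Module-level initializers must be compile-time values, so both calls
// to foo below are evaluated through CTFE, mutable or not.
int[] mutableTable = foo(3);            // hypothetical name
immutable int[] fixedTable = foo(5);    // hypothetical name

// The immutable global can also be read back at compile time.
static assert(fixedTable.length == 5);

void main() {
    assert(mutableTable == [0, 10, 20]);
    mutableTable ~= 30;                 // still an ordinary mutable array at run time
    assert(mutableTable.length == 4);
    assert(fixedTable == [0, 10, 20, 30, 40]);
}
```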
Jul 25 2012