digitalmars.D - Actual immutability enforcement by placing immutable data into
- Siarhei Siamashka (16/16) Dec 19 2022 Right now D compilers place string literals into a read-only
- Siarhei Siamashka (15/16) Dec 19 2022 BTW, I tried to experiment with DMD code and can place some array
- bauss (7/21) Dec 19 2022 Isn't it going to be difficult to properly implement? Since you
- bauss (4/29) Dec 19 2022 Of course literals can be placed in read-only and should be, so
- IGotD- (29/32) Dec 19 2022 https://dlang.org/articles/const-faq.html
- bauss (23/57) Dec 19 2022 Yes, but it's not the reality. Immutable data can be constructed
- IGotD- (2/24) Dec 19 2022 Couldn't D just have used the 'const' keyword for such data?
- Nick Treleaven (3/5) Dec 21 2022 Then you wouldn't be able to share it across threads. Besides you
- Siarhei Siamashka (7/31) Dec 19 2022 The compiler will reject your constructor if you change
- Tejas (3/11) Dec 19 2022 That is why he specified `static immutable` rather than
- Siarhei Siamashka (72/78) Dec 19 2022 I did mention static immutable and CTFE in my initial message.
Right now D compilers place string literals into a read-only section, but most of the other types of `static immutable` data have no protection against rogue writes.

https://forum.dlang.org/post/cmtaeuedmdwxjecpcrjh@forum.dlang.org is an example of a non-obvious case of immutable data corruption. What's happening there is that [druntime modifies](https://github.com/dlang/dmd/blob/v2.101.1/druntime/src/rt/deh.d#L46) the static immutable instance of Exception when throwing it. The old bug report https://issues.dlang.org/show_bug.cgi?id=12118 is also related to throwing an immutable Exception, but there the corruption is done by the user code in the catch block.

Troubleshooting such problems would be so much easier if immutable objects were actually placed in a read-only section and any write attempt triggered a segfault at runtime.

I think that [bare metal code for microcontrollers](https://forum.dlang.org/post/rkrpdgjnhwdysqnnb lf forum.dlang.org) could also potentially benefit from this, because it would enable placing immutable data generated by CTFE into NOR flash instead of wasting SRAM space.

What do you think about it? Does this require a new DIP?
Dec 19 2022
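The claim in the post above can be checked on a given toolchain without performing any rogue write. The following is a minimal sketch, assuming Linux (it parses `/proc/self/maps`); the names `mappingPerms` and `table` are hypothetical and exist only for this illustration. It prints the permission flags of the memory mapping that holds a string literal and of the mapping that holds a `static immutable` array. The result depends on the compiler, linker and platform, so no expected output is given.

```d
import std.algorithm.searching : findSplit;
import std.array : split;
import std.conv : to;
import std.stdio : File, writeln;

// Hypothetical sample data: a static immutable array initialized at compile time.
static immutable int[4] table = [1, 2, 3, 4];

/// Returns the permission column (for example "r--p" or "rw-p") of the
/// mapping that contains `addr`, or null if no mapping matches.
/// Assumption: Linux, where /proc/self/maps lists "start-end perms ...".
string mappingPerms(const(void)* addr)
{
    immutable target = cast(ulong) cast(size_t) addr;
    foreach (line; File("/proc/self/maps").byLine)
    {
        auto fields = line.split();              // whitespace-separated columns
        if (fields.length < 2)
            continue;
        auto bounds = fields[0].findSplit("-");  // "start-end" in hexadecimal
        immutable lo = bounds[0].to!ulong(16);
        immutable hi = bounds[2].to!ulong(16);
        if (lo <= target && target < hi)
            return fields[1].idup;               // byLine reuses its buffer
    }
    return null;
}

void main()
{
    writeln("string literal:         ", mappingPerms("hello".ptr));
    writeln("static immutable array: ", mappingPerms(table.ptr));
}
```

A `w` in the second column would mean that the data sits in a writable mapping, so a write through a cast would not be caught by the hardware.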
On Monday, 19 December 2022 at 12:13:08 UTC, Siarhei Siamashka wrote:
> What do you think about it? Does this require a new DIP?

BTW, I tried to experiment with the DMD code and can place some array literals into a read-only section: https://github.com/ssvb/dmd/commit/44c3a7c312b042fa7fafd357775aedf904ba0700

But much more seems to be needed to get it right. Nested array literals, such as the `[1,2]` part of `[[1,2],[3,4]]`, don't seem to have the immutable flag set when checked from https://github.com/dlang/dmd/blob/v2.101.1/compiler/src/dmd/todt.d#L456-L490

Additionally, the immutable flag seems to be stripped at https://github.com/dlang/dmd/blob/v2.101.1/compiler/src/dmd/tocsym.d#L189-L219 from immutable class and struct instances if they have constructors. But does this really matter for the data generated by CTFE?

Detecting whether the data was generated by CTFE also doesn't seem to be very obvious. I tried to check the `.ownedByCtfe` field, but I'm getting strange results. Can anyone give me some hints?
Dec 19 2022
On Monday, 19 December 2022 at 12:13:08 UTC, Siarhei Siamashka wrote:
> [...] What do you think about it? Does this require a new DIP?

Isn't it going to be difficult to implement properly? You can't really just place data into read-only memory; you have to protect whole pages, e.g. with VirtualProtect() on Windows. This is especially tricky since immutable data can still be allocated through the GC.

Or am I not understanding something about this at all?
Dec 19 2022
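The page-granularity concern raised above can be illustrated with a minimal sketch, assuming a POSIX system (`mmap` and `mprotect` are the POSIX counterparts of the `VirtualProtect()` call mentioned in the post). The helper name `freezeOnOwnPage` is hypothetical. It is not how any D compiler or the GC works today; it only shows that runtime protection is applied to a whole page at a time, which is why data sharing a page with mutable data cannot be protected individually.

```d
import core.sys.posix.sys.mman;
import core.sys.posix.unistd : sysconf, _SC_PAGESIZE;

/// Copies `values` into a freshly mapped page and then removes write access
/// from that page. Hypothetical helper for illustration only; the page is
/// never unmapped, so the memory is intentionally leaked.
immutable(int)[] freezeOnOwnPage(const(int)[] values)
{
    immutable pageSize = cast(size_t) sysconf(_SC_PAGESIZE);
    if (values.length * int.sizeof > pageSize)
        throw new Exception("this sketch handles at most one page");

    void* page = mmap(null, pageSize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (page == MAP_FAILED)
        throw new Exception("mmap failed");

    // Initialize the data while the page is still writable.
    int[] ints = (cast(int*) page)[0 .. values.length];
    ints[] = values[];

    // The protection covers the entire page, not just the bytes in use.
    if (mprotect(page, pageSize, PROT_READ) != 0)
        throw new Exception("mprotect failed");

    return cast(immutable(int)[]) ints;
}

void main()
{
    auto frozen = freezeOnOwnPage([10, 20, 30]);
    assert(frozen == [10, 20, 30]);   // reads still work
    // A write through a cast, e.g. (cast(int*) frozen.ptr)[0] = 1,
    // would now be stopped by the hardware instead of silently succeeding.
}
```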
On Monday, 19 December 2022 at 14:06:50 UTC, bauss wrote:
> On Monday, 19 December 2022 at 12:13:08 UTC, Siarhei Siamashka wrote:
>> [...] What do you think about it? Does this require a new DIP?
>
> Isn't it going to be difficult to properly implement? [...]

Of course literals can be placed in read-only memory and should be, so in that case I think this would be good, BUT I don't think it's really possible to do for __all__ immutable data.
Dec 19 2022
On Monday, 19 December 2022 at 14:07:49 UTC, bauss wrote:
> Of course literals can be placed in read-only memory and should be, so in that case I think this would be good, BUT I don't think it's really possible to do for __all__ immutable data.

https://dlang.org/articles/const-faq.html

*What is immutable good for?*

*Immutable data, once initialized, is never changed. This has many uses:*

- *Access to immutable data need not be synchronized when multiple threads read it.*
- *Data races, tearing, sequential consistency, and cache consistency are all non-issues when working with immutable data.*
- *When doing a deep copy of a data structure, the immutable portions need not be copied.*
- *Invariance allows a large chunk of data to be treated as a value type even if it is passed around by reference (strings are the most common case of this).*
- *Immutable types provide more self-documenting information to the programmer.*
- ***Immutable data can be placed in hardware protected read-only memory, or even in ROMs.***
- *If immutable data does change, it is a sure sign of a memory corruption bug, and it is possible to automatically check for such data integrity.*
- *Immutable types provide for many program optimization opportunities.*

*const acts as a bridge between the mutable and immutable worlds, so a single function can be used to accept both types of arguments.*

I always interpreted immutable as something that must be constructed during compile time and put in the RO section of the program.
Dec 19 2022
On Monday, 19 December 2022 at 15:25:17 UTC, IGotD- wrote:
> On Monday, 19 December 2022 at 14:07:49 UTC, bauss wrote:
>> Of course literals can be placed in read-only memory and should be, so in that case I think this would be good, BUT I don't think it's really possible to do for __all__ immutable data.
>
> https://dlang.org/articles/const-faq.html [...] I always interpreted immutable as something that must be constructed during compile time and put in the RO section of the program.

Yes, but it's not the reality. Immutable data can be constructed at runtime, and it happens all the time in shared static constructors etc. I think it would be too big a breaking change if you suddenly couldn't do that anymore.

For example, the following program is valid:

```D
import std.stdio : writeln;
import std.datetime : Clock;

// No initializer here: the value is not known at compile time.
immutable int a;

// Runs once at program startup, before main().
shared static this()
{
    a = Clock.currTime().year;
}

void main()
{
    writeln(a);
}
```

In the above example `a` cannot be placed in read-only memory. Of course my example isn't something you would do in an everyday program, BUT it could be substituted for values loaded from a file etc.
Dec 19 2022
On Monday, 19 December 2022 at 15:34:35 UTC, bauss wrote:
> Yes, but it's not the reality. Immutable data can be constructed at runtime and it happens all the time in shared static constructors etc. [...]

Couldn't D just have used the 'const' keyword for such data?
Dec 19 2022
On Monday, 19 December 2022 at 15:52:29 UTC, IGotD- wrote:
> Couldn't D just have used the 'const' keyword for such data?

Then you wouldn't be able to share it across threads. Besides, you can still do `new immutable int` at runtime, so it is consistent.
Dec 21 2022
On Monday, 19 December 2022 at 15:34:35 UTC, bauss wrote:
> On Monday, 19 December 2022 at 15:25:17 UTC, IGotD- wrote:
>> [...] I always interpreted immutable as something that must be constructed during compile time and put in the RO section of the program.
>
> Yes, but it's not the reality. Immutable data can be constructed at runtime and it happens all the time in shared static constructors etc. [...] In the above example "a" cannot be placed in read-only memory.

The compiler will reject your constructor if you change `immutable int a;` to `immutable int a = 2030;`:

test.d(8): Error: cannot modify `immutable` expression `a`

If a variable is both declared and initialized simultaneously, then it's probably safe to place it into a read-only section. Please correct me if I'm wrong.
Dec 19 2022
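The distinction drawn in the post above, between data whose initializer is evaluated at compile time and data assigned in a module constructor, can be sketched as follows. This is only an illustration under stated assumptions: `makeSquares`, `squares`, `loadSetting` and `startupValue` are hypothetical names, and `loadSetting` is a stand-in for something like reading a value from a file. The sketch says nothing about which section a particular compiler actually chooses for either variable.

```d
import std.stdio : writeln;

/// Ordinary function; it runs under CTFE when used in a static initializer.
int[8] makeSquares() pure nothrow @nogc @safe
{
    int[8] result;
    foreach (i, ref slot; result)
        slot = cast(int)(i * i);
    return result;
}

// Declared and initialized together: the value is computed by CTFE, so it is
// fully known when the object file is written. This is the kind of data the
// post suggests could go into a read-only section.
static immutable int[8] squares = makeSquares();
static assert(squares[3] == 9);   // readable at compile time

/// Hypothetical stand-in for a value that is only available at run time.
int loadSetting()
{
    return 42;
}

// Declared without an initializer and assigned once at startup: the storage
// has to be writable at least until the module constructor has run.
immutable int startupValue;

shared static this()
{
    startupValue = loadSetting();
}

void main()
{
    writeln(squares);
    writeln(startupValue);
}
```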
On Monday, 19 December 2022 at 14:06:50 UTC, bauss wrote:
> On Monday, 19 December 2022 at 12:13:08 UTC, Siarhei Siamashka wrote:
>> [...]
>
> Isn't it going to be difficult to properly implement? [...]

That is why he specified `static immutable` rather than only `immutable`.
Dec 19 2022
On Monday, 19 December 2022 at 14:06:50 UTC, bauss wrote:
>> What do you think about it? Does this require a new DIP?
>
> Isn't it going to be difficult to properly implement? Since you can't really place data into read-only memory, but you have to protect whole pages ex. VirtualProtect() on Windows. Especially with how immutable data can still be allocated through GC. Or am I not understanding something about this at all?

I did mention static immutable and CTFE in my initial message. Some of the immutable data is generated at compile time and can safely go into read-only sections. Right now I'm only interested in trying to improve just this.

But since you mentioned catching write accesses to the immutable data backed by GC allocations, this can be done with some help from extra tools or instrumentation. For example, I did use valgrind to debug the code from https://forum.dlang.org/post/cmtaeuedmdwxjecpcrjh@forum.dlang.org

```C
#include <stddef.h>
#include <valgrind/memcheck.h>

/* Name the block so that it shows up in valgrind reports, then mark it as
   inaccessible so that memcheck reports every access to it. */
void vg_mark_block(void *p, size_t size)
{
    int valgrind_handle = VALGRIND_CREATE_BLOCK(p, size, "MARKED BLOCK");
    VALGRIND_MAKE_MEM_NOACCESS(p, size);
}
```

```D
extern(C) void vg_mark_block(void *p, size_t size) @nogc;

void main() @nogc
{
    try
    {
        static immutable e = new Exception("test");
        // Mark the whole class instance before it is thrown.
        vg_mark_block(cast(void*)e, __traits(classInstanceSize, typeof(e)));
        throw e;
    }
    catch (Exception e)
    {
        assert(e.msg == "test");
    }
}
```

```
==3369== Invalid write of size 8
==3369==    at 0x4D5BEAE: _d_createTrace (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5D4F9: _d_throwdwarf (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x1091C2: _Dmain (in /tmp/test/test)
==3369==    by 0x4D5CEBE: void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll().__lambda2() (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5CD6D: void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).tryExec(scope void delegate()) (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5CE46: void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5CD6D: void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).tryExec(scope void delegate()) (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5CCD6: _d_run_main2 (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x4D5CA9F: _d_run_main (in /usr/lib64/libphobos2.so.0.99.1)
==3369==    by 0x10923F: main (in /tmp/test/test)
==3369==  Address 0x10c098 is 56 bytes inside a MARKED BLOCK of size 76 client-defined
==3369==    at 0x1095A1: vg_mark_block (in /tmp/test/test)
==3369==    by 0x1091B3: _Dmain (in /tmp/test/test)
```

Unfortunately valgrind reports both read and write accesses to this area in the log, so the noise about "invalid reads" needs to be filtered out. It doesn't support marking an address range as read-only out of the box: https://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs (but maybe this can be improved?).

ASAN instrumentation could also potentially be useful in the future for catching write accesses to the immutable data backed by GC allocations. And I'm pleasantly surprised to see that ASAN is [already available in LDC](http://johanengelen.github.io/ldc/2017/12/25/LDC-and-AddressSanitizer.html). However, just like valgrind, right now ASAN doesn't support poisoning a memory area as read-only: https://www.mail-archive.com/address-sanitizer@googlegroups.com/msg01948.html
Dec 19 2022
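The filtering step mentioned in the post above (dropping the "invalid read" noise and keeping the "invalid write" reports) could be sketched as a small D program. This is a hedged illustration, not a tool used in the thread: the function name `invalidWriteReports` is hypothetical, and the sketch assumes the usual memcheck log layout, in which every line starts with a `==PID==` prefix and reports are separated by lines that contain nothing but that prefix.

```d
import std.algorithm.searching : canFind, endsWith;
import std.string : lineSplitter, strip;

/// Returns the lines of every report block whose first line mentions
/// "Invalid write". Assumption: blocks are separated by lines consisting
/// only of the "==PID==" prefix (or by empty lines).
string[] invalidWriteReports(string log)
{
    string[] kept;
    bool blockStart = true;   // the next content line opens a new block
    bool keeping = false;     // whether the current block is being kept
    foreach (line; log.lineSplitter)
    {
        auto text = line.strip;
        if (text.length == 0 || text.endsWith("=="))
        {
            // Separator line: the current block is over.
            blockStart = true;
            keeping = false;
            continue;
        }
        if (blockStart)
        {
            keeping = text.canFind("Invalid write");
            blockStart = false;
        }
        if (keeping)
            kept ~= text;
    }
    return kept;
}

void main()
{
    import std.array : join;
    import std.stdio : stdin, writeln;

    // Reads a complete valgrind log from standard input.
    foreach (line; invalidWriteReports(stdin.byLineCopy.join("\n")))
        writeln(line);
}
```

Since valgrind writes its report to stderr by default, the log would have to be redirected (for example with `2>&1`) or written to a file with valgrind's `--log-file=` option before being fed to such a filter.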