digitalmars.D - Memory safety, C#, D and more
- bearophile (114/115) May 05 2009
Here I have collected a few more bits that may be interesting for D development/design.

-------------------

movable variable. The fixed statement is only permitted in an unsafe context:
http://msdn.microsoft.com/en-us/library/f58wzh21.aspx
http://msdn.microsoft.com/en-us/library/aa664784(VS.71).aspx

So it "pins" a variable, so the GC can't move it in memory anymore. This way you can avoid a conservative GC and keep a moving one. You can use it for example like this:

    int[,,] a = new int[2, 3, 4];
    unsafe {
        fixed (int* p = a) {
            for (int i = 0; i < a.Length; ++i) // treat as linear
                p[i] = i;
        }
    }

Where int[,,] are built-in multi-dimensional arrays made of a single block of memory.

that save some memory and improve cache locality a bit (but sometimes on modern CPUs I have seen they may end up a bit slower, because they may require integer multiplications to find items if a bitshift can't be used).

fixed can also be nested if you want to pin two or more pointers:

    fixed (...) fixed (...) { ... }

The pointer is pinned only inside that scope.

When you use "fixed" to take the char* of a string, the compiler calls toStringz automatically.

You can also use fixed to call another function with a pointer:

    class Test {
        unsafe static void Fill(int* p, int count, int value) {
            for (; count != 0; count--)
                *p++ = value;
        }
        static void Main() {
            int[] a = new int[100];
            unsafe {
                fixed (int* p = a)
                    Fill(p, 100, -1);
            }
        }
    }

I guess the compiler makes sure to never relocate the "a" array inside that Fill() method.

everything possible to increase flexibility. D starts from an unsafe situation and does more to give some safety.

This explains a bit how "fixed" interacts with the generational GC:
http://www.codeproject.com/KB/dotnet/pointers.aspx

>Pinning has a HUGE cost to the garbage collector. I assume that you are familiar with the generational algorithm of the garbage collection.
Let us say we allocated enough memory to fill Gen 0 Heap (the youngest), and that an additional allocation will trigger a collection. If that very last allocation at the end of the heap was pinned, the pinned object moves to generation 1. (Call GC.GetGeneration(obj) and see). Gen 1 is guaranteed to grow to include the pinned memory at the very end of the Gen 0 Heap. Even if all other memory in Gen 0 was freed, that would still leave a huge unreclaimed space of memory and Gen 0 will begin allocating starting from its previous limit. That is how bad "pinning" is. [...] when you use fixed, do whatever you have do quickly and avoid any memory allocation in the process, which can potentially trigger a garbage collection. If a garbage collection did occur inside a fixed block, most likely the pinned memory was close to the end of Gen 0 heap.<

For example if you run the following code (not in debug mode):

    int* a = stackalloc int[n];
    for (int i = 0; i < 3 * n; i++) {
        a[i] = i;
        Console.WriteLine("a[i] = {0}", a[i]);
    }

With n=10 it stops running just after i=10 (1 past the length). So the runtime is able to catch the trespassing outside the allowed memory anyway, and the docs say it stops the program as soon as possible to avoid malicious code, avoid troubles, etc.

that's a stack safety, not a heap one.

(often the compiler/runtime isn't able to remove array bound checks, even though this is a supported feature) and slower than equivalent "release mode" D code.

uses a canary, or sets the memory after the array as not writeable.

After a small test with the following code that performs reads only:

    int* a = stackalloc int[n];
    for (int i = 0; i < 30 * n; i++) {
        Console.WriteLine("a[{0}] = {1}", i, a[i]);
    }

Now the run doesn't stop early: with n=10 it stops printing when i = 299. So there's write-safety only.

I have tried with dmd a stack-based "array":

    import std.conv: toInt;
    import std.c.stdlib: alloca;
    void main(string[] args) {
        int n = args.length == 2 ? toInt(args[1]) : 10;
        int* a = cast(int*)alloca(n * int.sizeof);
        for (int i = 0; i < 30 * n; i++) {
            a[i] = i;
            printf("a[%d] = %d\n", i, a[i]);
        }
    }

It stops printing after i = 12 (3 items after the last one). If inside the loop I keep only the printf, it prints up to 300 and more, so there's no read safety.

While the following code with a heap-based array:

    import std.conv: toInt;
    void main(string[] args) {
        int n = args.length == 2 ? toInt(args[1]) : 10;
        auto aa = new int[n];
        auto a = aa.ptr;
        for (int i = 0; i < 3000 * n; i++) {
            a[i] = i;
            printf("a[%d] = %d\n", i, a[i]);
        }
    }

generates an Access Violation only after i=15391, so there's not much write safety.

    using System;
    unsafe sealed class test {
        static unsafe void Main(string[] args) {
            int n = args.Length > 0 ? Int32.Parse(args[0]) : 10;
            int[] a = new int[n];
            unsafe {
                fixed (int* p = a) {
                    for (int i = 0; i < 1000 * n; ++i) {
                        p[i] = i;
                        Console.WriteLine("p[{0}] = {1}", i, p[i]);
                    }
                }
            }
        }
    }

prints items up to i=20 and then throws an exception: System.IO.IOException, "The handle is invalid" (in debug code it stops when i is about 25).

So even with heap memory and in

faster, because the program stops very close to where the bug is).

Having such safety when working with pointer-based arrays is a very good thing. I'd like to have it in D too when I am not compiling in release mode. Is this doable?

-----------------------------

0,1,2,3... of items, but the compiler sees them as powers of two, so they can be combined bitwise:
http://weblogs.asp.net/wim/archive/2004/04/07/109095.aspx

    [Flags]
    public enum ClientStates {
        Ordinary,
        HasDiscount,
        IsSupplier,
        IsBlackListed,
        IsOverdrawn
    }
    ClientStates c = ClientStates.HasDiscount | ClientStates.IsSupplier;

for D2 too.

-----------------------------

Unrelated. (Java) 'new' considered harmful:
http://www.ddj.com/java/184405016

Bye,
bearophile
May 05 2009