
digitalmars.D.learn - Generating struct .init at run time?

Ali Çehreli <acehreli yahoo.com> writes:
Normally, struct .init values are known at compile time. Unfortunately, 
they add to binary size:

enum elementCount = 1024 * 1024;

struct S {
   double[elementCount] a;
}

void main() {
     S s;
     assert(typeid(S).initializer.length == double.sizeof * elementCount);
     assert(typeid(S).initializer.ptr !is null);
}

Both asserts pass: S.init is 800M and is embedded into the compiled program.

Of course, the solution is to define members with '= void':

enum elementCount = 1024 * 1024;

struct S {
   double[elementCount] a = void;  // <-- HERE
}

void main() {
     S s;
     assert(typeid(S).initializer.length == double.sizeof * elementCount);
     assert(typeid(S).initializer.ptr is null);
}

Now the program binary is 800M shorter. (Note .ptr is now null.) Also 
note that I did NOT use the following syntax because there is a dmd bug:

   auto s = S(); // Segfaults: https://issues.dlang.org/show_bug.cgi?id=21004

My question is: Is there a function that I can call to initialize 's' to 
the same .init value that compiler would have used:

S sInit;

shared static this() {
   defaultInitValue(&sInit);  // Does this exist?
}

I can then use sInit to copy over the bytes of all S objects in the 
program. (Both the structs and their object instantiations are all 
code-generated; so there is no usability issue. There are thousands of 
structs and the current binary size is 2G! :) )

If not, I am planning on writing the equivalent of defaultInitValue() 
that will zero-init the entire struct and then overwrite float, double, 
char, wchar, and dchar members with their respective .init values, 
recursively. Does that make sense?

Ali
Jul 02 2020
IGotD- <nise nise.com> writes:
On Thursday, 2 July 2020 at 07:51:29 UTC, Ali Çehreli wrote:
 Both asserts pass: S.init is 800M and is embedded into the 
 compiled program.
Not an answer to your problem, but what on earth are those extra 800MB? The array size is 8MB, so if the program just copied the data it would take only 8MB. Does the binary have this size even with the debugging info stripped?

Also, this is an obvious optimization that can be implemented: the program does an initialization loop instead of putting the data in the data segment when the array size is above a certain size and all elements are supposed to have the same value.
Jul 02 2020
Ali Çehreli <acehreli yahoo.com> writes:
On 7/2/20 2:37 AM, IGotD- wrote:

 what on earth are those extra 800MB?
I'm losing my mind. :) Of course it's just 8M. Too many digits for me to handle. :p
 Also, this is an obvious optimization that can be implemented, that the
 program does an initialization loop instead of putting it in the data
 segment when the array size is above a certain size and they are
 supposed to have the same value.
+1

Ali
Jul 02 2020
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 2 July 2020 at 07:51:29 UTC, Ali Çehreli wrote:
 Normally, struct .init values are known at compile time. 
 Unfortunately, they add to binary size:

 [...]
memset() is the function you want. The initializer is an element generated in the data segment (or in a read-only segment) that will be copied to the variable by an internal call to memcpy. The same happens in C, except that the compilers are often clever and replace the copy with a memset().
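
A minimal sketch of that idea in D, assuming a hypothetical helper named defaultInitValue (the name is taken from the question; it is not a druntime function). It copies the bytes returned by typeid(T).initializer into the target and falls back to memset when the initializer pointer is null, which is the case the asserts earlier in the thread show for an elided .init:

```d
import core.stdc.string : memcpy, memset;

/// Hypothetical helper: writes T's default-initialization bytes into *target
/// at run time, using the TypeInfo initializer.
void defaultInitValue(T)(T* target)
    if (is(T == struct))
{
    const(void)[] init = typeid(T).initializer();

    if (init.ptr is null)
    {
        // No init image is stored in the binary; zero the memory instead.
        memset(target, 0, init.length);
    }
    else
    {
        // Copy the stored init image.
        memcpy(target, init.ptr, init.length);
    }
}
```

Calling defaultInitValue(&sInit) from a shared static this() would then fill sInit once at program start. Note that for a struct whose members are all '= void', this yields zeros rather than NaN or 0xFF values.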
Jul 02 2020
kinke <noone nowhere.com> writes:
On Thursday, 2 July 2020 at 07:51:29 UTC, Ali Çehreli wrote:
 Of course, the solution is to define members with '= void'
Since when? https://issues.dlang.org/show_bug.cgi?id=11331 and your https://issues.dlang.org/show_bug.cgi?id=16956 are still open. For recent LDC versions, the 'solution' is to (statically) initialize the array with zeros, as fully zero-initialized structs don't feature any explicit .init symbols anymore.
 enum elementCount = 1024 * 1024;

 struct S {
   double[elementCount] a = void;  // <-- HERE
 }

 void main() {
     S s;
     assert(typeid(S).initializer.length == double.sizeof * elementCount);
     assert(typeid(S).initializer.ptr is null);
 }

 Now the program binary is 800M shorter.
So you're saying you have a *stack* that can deal with an 800M struct (assuming you used a different `elementCount` for the actual tests)?! Even 8 MB should be too large without extra compiler/linker options, as that's the default stack size on Linux IIRC (on Windows, 2 MB IIRC). I don't think a struct should ever be that large, as it can probably only live on the heap anyway and only passed around by refs. I'd probably use a thin struct instead, containing and managing a `double[]` member (or `double[elementCount]*`).
Jul 02 2020
Basile B. <b2.temp gmx.com> writes:
On Thursday, 2 July 2020 at 10:37:27 UTC, kinke wrote:
 I don't think a struct should ever be that large, as it can 
 probably only live on the heap anyway and only passed around by 
 refs. I'd probably use a thin struct instead, containing and 
 managing a `double[]` member (or `double[elementCount]*`).
So right, but the compiler should definitely not crash.
Jul 02 2020
Ali Çehreli <acehreli yahoo.com> writes:
On 7/2/20 3:37 AM, kinke wrote:

 On Thursday, 2 July 2020 at 07:51:29 UTC, Ali Çehreli wrote:
 Of course, the solution is to define members with '= void'
Since when? https://issues.dlang.org/show_bug.cgi?id=11331 and your https://issues.dlang.org/show_bug.cgi?id=16956 are still open.
Wow! I didn't remember that one. According to its date, it was written when I was working for Weka. Apparently, ldc took care of it for them after all.
 For recent LDC versions, the 'solution' is to (statically) initialize
 the array with zeros, as fully zero-initialized structs don't feature
 any explicit .init symbols anymore.
What about floating point and char types? Their .init values are not all zeros in D spec. (I don't think this matters in my case but still.)
 So you're saying you have a *stack* that can deal with an 800M struct
Sorry, my test code was too simplistic. The actual code constructs these objects in dynamic memory for that exact reason.
 I don't think a struct should ever be that large, as it can probably
 only live on the heap anyway and only passed around by refs. I'd
 probably use a thin struct instead, containing and managing a `double[]`
 member (or `double[elementCount]*`).
Exactly.

These structs are code-generated to reflect ROS interface message types. Just like in D, arrays have dynamic/static distinction in ROS so I blindly translated the types to D without remembering this .init issue.

The following are the options I am considering:

a) Move to ldc

b) As you and IGotD- suggest, define all members with '= void' and memset to zero at runtime. (I will decide whether to take care of char and floating point types specially e.g. by setting doubles to NaN; this distinction may not be important in our use case.) Luckily, issue 16956 you mention above does not affect us because these are non-template structs.

c) Again, as you say, define static arrays as dynamic arrays, code-generate a default constructor that sets the length to the actual static length, which requires some magic as struct default constructor cannot be defined for structs.

d) ?

Ali
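
A rough sketch of the thin-struct variant, assuming a static factory function stands in for the default constructor that D structs cannot define. The name make is hypothetical, and the code generator would have to emit calls to it instead of plain declarations:

```d
enum elementCount = 1024 * 1024;

struct S
{
    // Dynamic array instead of double[elementCount]: the struct itself
    // is only a pointer/length pair, so its .init image is tiny.
    double[] a;

    /// Hypothetical factory replacing the missing default constructor.
    static S make()
    {
        S s;
        // GC-allocated; elements are default-initialized to double.init (NaN).
        s.a = new double[](elementCount);
        return s;
    }
}
```

The trade-off is that a plain `S s;` still compiles and leaves `a` empty, so every construction site has to go through the factory.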
Jul 02 2020
kinke <noone nowhere.com> writes:
On Thursday, 2 July 2020 at 15:20:23 UTC, Ali Çehreli wrote:
 According to its date, it was written when I was working for 
 Weka. Apparently, ldc took care of it for them after all.
If so, then without them posting any issue beforehand or giving any feedback afterwards.
 For recent LDC versions, the 'solution' is to (statically) initialize
 the array with zeros, as fully zero-initialized structs don't feature
 any explicit .init symbols anymore.
What about floating point and char types? Their .init values are not all zeros in D spec. (I don't think this matters in my case but still.)
That's why all you have to do, in order not to have recent LDC emit the struct's init symbol, is to initialize these members manually with zeros:

struct S {
    double[elementCount] a = 0;
}

void foo() {
    S s; // compiler does a memset
}

`= void` for members doesn't work and, I dare say, won't work anytime soon if ever.
Jul 02 2020
kinke <noone nowhere.com> writes:
On Thursday, 2 July 2020 at 16:51:52 UTC, kinke wrote:
 `= void` for members doesn't work and, I dare say, not work 
 anytime soon if ever.
I've quickly checked; `= void` for members has initialize-with-zeros semantics too, so with LDC, it's equivalent to `= 0` but applicable to user-defined types as well.

For DMD, `= void` for non-default-zero-initialized members can be used for the same effect. If all members are effectively zero-initialized, the init symbol isn't emitted, and the compiler initializes the whole struct with zeros. With `= 0`, DMD still emits the init symbol into the object file, but doesn't use it (at least not for stack allocations).

TLDR: Seems like initializing (all non-default-zero-initialized) members with `= void` is the portable solution to elide the init symbols *and* have the compiler initialize the whole struct with zeros, so a manual memset isn't required.
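
One way to check this on a given compiler is to look at the TypeInfo initializer pointer, as the asserts in the first post do. The sketch below assumes a hypothetical helper hasInitImage and a deliberately small elementCount; the zero-filling behaviour itself is what this thread reports for DMD and LDC, not something the language spec promises for `= void`:

```d
enum elementCount = 1024; // small on purpose; the thread uses 1024 * 1024

// Default-initialized doubles are NaN, so an init image must be stored.
struct Plain
{
    double[elementCount] a;
}

// `id` is zero by default and `a` is void-initialized, so every member is
// "effectively zero-initialized" in the sense described above.
struct Elided
{
    int id;
    double[elementCount] a = void;
}

/// Hypothetical helper: true if the TypeInfo carries a non-null init image.
bool hasInitImage(T)()
{
    return typeid(T).initializer().ptr !is null;
}
```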
Jul 02 2020
Ali Çehreli <acehreli yahoo.com> writes:
On 7/2/20 10:51 AM, kinke wrote:
 On Thursday, 2 July 2020 at 16:51:52 UTC, kinke wrote:
 `= void` for members doesn't work and, I dare say, not work anytime 
 soon if ever.
I've quickly checked; `= void` for members has initialize-with-zeros semantics too, so with LDC, it's equivalent to `= 0` but applicable to user-defined types as well. For DMD, `= void` for non-default-zero-initialized members can be used for the same effect. If all members are effectively zero-initialized, the init symbol isn't emitted, and the compiler initializes the whole struct with zeros. With `= 0`, DMD still emits the init symbol into the object file, but doesn't use it (at least not for stack allocations). TLDR: Seems like initializing (all non-default-zero-initialized) members with `= void` is the portable solution to elide the init symbols *and* have the compiler initialize the whole struct with zeros, so a manual memset isn't required.
Thank you! I just checked: Even 2.084 behaves the same. I will deal with double.nan, etc. for structs where they matter.

Ali
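
For the structs where NaN and 0xFF defaults do matter, something along these lines could run after the zero-fill. This is only a sketch: restoreNonZeroInit is a hypothetical name, and it handles just floating point and character members, static arrays of them, and nested structs, matching the recursive plan from the first post:

```d
import std.traits : isFloatingPoint, isSomeChar;

/// Hypothetical helper: overwrites floating point and character members of a
/// zero-filled struct with their language-defined .init values, recursively.
void restoreNonZeroInit(T)(ref T s)
    if (is(T == struct))
{
    foreach (ref member; s.tupleof)
    {
        alias M = typeof(member);

        static if (isFloatingPoint!M || isSomeChar!M)
        {
            member = M.init;            // NaN for floats, 0xFF-style for chars
        }
        else static if (is(M == E[n], E, size_t n))
        {
            // Static array: fill or recurse depending on the element type.
            static if (isFloatingPoint!E || isSomeChar!E)
                member[] = E.init;
            else static if (is(E == struct))
                foreach (ref element; member)
                    restoreNonZeroInit(element);
        }
        else static if (is(M == struct))
        {
            restoreNonZeroInit(member);
        }
    }
}
```

Members with explicit non-zero initializers of other types (for example `int n = 7;`) are not covered and would need their own handling in the generated code.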
Jul 02 2020