digitalmars.D - Static arrays problem
- bearophile (84/84) Dec 17 2008 At the bottom of this post there are two little test programs that may s...
- Sean Kelly (10/20) Dec 17 2008 arrays.
- bearophile (6/13) Dec 17 2008 Thank you, that fixes the problem, the exe size is now a more normal 180...
- BCS (2/8) Dec 17 2008 OTOH that could end up large boot times.
- bearophile (4/5) Dec 17 2008 Why? Is a "cleaning loop" with memset (done with modern CPU instructions...
- BCS (7/16) Dec 17 2008 It depends. If the array is not accessed all at once, the OS needs only ...
- bearophile (10/13) Dec 17 2008 I have timed tree versions of that little code:
- BCS (3/19) Dec 17 2008 That dosn't test what I was looking at. Try it without the init loop, I'...
- Sean Kelly (9/10) Dec 17 2008 done with a loop, avoiding the creation of such huge executables.
- Denis Koroskin (6/95) Dec 18 2008 This is a linker issue. DMD stores a, b,c AND a.init, b.init, c.init
At the bottom of this post there are two little test programs that may show a problem with static arrays. D code compiled with DMD 1.038: dmd test_d.d Compile time: 8.29 s Resulting exe and obj: 17.756.188 test_d.exe 2.390 test_d.map 17.734.261 test_d.obj C code compiled with DMC 8.42n: dmc test_c.c Compile time: 0.04 s Resulting exe: 51.740 test_c.exe C code compiled with GCC (4.2.1): gcc test_c.c -o test_c Compile time: 0.16 s Resulting exe: 16.781 test_c.exe C code compiled with LLVM-GCC (2.5): llvm-gcc test_c.c -o test_c Compile time: 0.11 s Resulting exe: 13.462 test_c.exe (The following code is just reduced toy, but the full program runs up to about 13 times faster with static arrays, so I'd like to use them instead of the dynamic ones.) Using obj2asm on the test_d.obj it's made mostly of: _D6test_d1aG3G65G65G130f: db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh ... ------------------------- // C code: #include "stdio.h" #define N 65 #define M (N+N) float a[3][N][N][M]; float b[4][N][N][M]; float c[N][N][M]; int main() { int i, j, k; for (i = 0; i < N; i++) for (j = 0; j < N; j++) for (k = 0; k < M; k++) { a[0][i][j][k] = 0.0; a[1][i][j][k] = 0.0; a[2][i][j][k] = 0.0; b[0][i][j][k] = 0.0; b[1][i][j][k] = 0.0; b[2][i][j][k] = 0.0; b[3][i][j][k] = 0.0; c[i][j][k] = 0.0; } printf("%f %f %f\n", a[2][10][10][10], a[2][10][10][10], c[10][10][10]); return 0; } ------------------------- // D code, it's the same, // I have just changed the first 3 lines. import std.c.stdio: printf; const int N = 65; const int M = (N+N); float a[3][N][N][M]; float b[4][N][N][M]; float c[N][N][M]; int main() { int i, j, k; for (i = 0; i < N; i++) for (j = 0; j < N; j++) for (k = 0; k < M; k++) { a[0][i][j][k] = 0.0; a[1][i][j][k] = 0.0; a[2][i][j][k] = 0.0; b[0][i][j][k] = 0.0; b[1][i][j][k] = 0.0; b[2][i][j][k] = 0.0; b[3][i][j][k] = 0.0; c[i][j][k] = 0.0; } printf("%f %f %f\n", a[2][10][10][10], a[2][10][10][10], c[10][10][10]); return 0; } Bye, bearophile
Dec 17 2008
== Quote from bearophile (bearophileHUGS lycos.com)'s articleAt the bottom of this post there are two little test programs that may show a problem with staticarrays. Could you describe what the problem is?------------------------- // D code, it's the same, // I have just changed the first 3 lines. import std.c.stdio: printf; const int N = 65; const int M = (N+N); float a[3][N][N][M]; float b[4][N][N][M]; float c[N][N][M];Consider changing this to: float a[3][N][N][M] = void; float b[4][N][N][M] = void; float c[N][N][M] = void; You don't need default initialization if you're just going to manually initialize to 0.0 anyway, so eliminate it. Sean
Dec 17 2008
Sean Kelly:Could you describe what the problem is?The problem is that the D code produces an huge executable.Consider changing this to: float a[3][N][N][M] = void; float b[4][N][N][M] = void; float c[N][N][M] = void; You don't need default initialization if you're just going to manually initialize to 0.0 anyway, so eliminate it.Thank you, that fixes the problem, the exe size is now a more normal 180.252 bytes :-) I think the D compiler has to take a look at the size of the static arrays, if they are too much large, then their initialization has to be done with a loop, avoiding the creation of such huge executables. Bye, bearophile
Dec 17 2008
Reply to bearophile,the exe size is now a more normal 180.252 bytes :-) I think the D compiler has to take a look at the size of the static arrays, if they are too much large, then their initialization has to be done with a loop, avoiding the creation of such huge executables.OTOH that could end up large boot times.
Dec 17 2008
BCS:OTOH that could end up large boot times.Why? Is a "cleaning loop" with memset (done with modern CPU instructions) slower than the loading of a 17 MB executable? Bye, bearophile
Dec 17 2008
Reply to bearophile,BCS:It depends. If the array is not accessed all at once, the OS needs only to set up the virtual memory maps and then page fault in the pages as needed. If the array is larger than the amount of memory that the OS would like to allow the program to keep in memory it could get even worse as it will start paging stuff out before the app even gets to main() (OTOH IIRC OPTLINK won't allow that big a static array).OTOH that could end up large boot times.Why? Is a "cleaning loop" with memset (done with modern CPU instructions) slower than the loading of a 17 MB executable? Bye, bearophile
Dec 17 2008
BCS:It depends. If the array is not accessed all at once, the OS needs only to set up the virtual memory maps and then page fault in the pages as needed. If [...]I have timed tree versions of that little code: D without = void: first run 1.85 s, successive ones: 0.11 s D with = void: first run 0.24 s, successive ones: 0.10 s C compiled with DMC: first run: 0.10 s, successive ones: 0.06 s C compiled with GCC: first run: 0.07 s, successive ones: 0.05 s (Compilations with no arguments. GCC compiled with -s -Os). So that huge executable is bad for the speed too. Bye, bearophile
Dec 17 2008
Reply to bearophile,BCS:That dosn't test what I was looking at. Try it without the init loop, I'll bet you won't see any difference between the cases at all.It depends. If the array is not accessed all at once, the OS needs only to set up the virtual memory maps and then page fault in the pages as needed. If [...]I have timed tree versions of that little code: D without = void: first run 1.85 s, successive ones: 0.11 s D with = void: first run 0.24 s, successive ones: 0.10 s C compiled with DMC: first run: 0.10 s, successive ones: 0.06 s C compiled with GCC: first run: 0.07 s, successive ones: 0.05 s (Compilations with no arguments. GCC compiled with -s -Os). So that huge executable is bad for the speed too. Bye, bearophile
Dec 17 2008
== Quote from bearophile (bearophileHUGS lycos.com)'s articleI think the D compiler has to take a look at the size of the static arrays, if they are too much large, then their initialization has to bedone with a loop, avoiding the creation of such huge executables. TypeInfo should probably contain an init(void[]) routine instead of a void[] member which is byte copied onto the new block. Then, initializers that are easy to decompose could at least be turned into a series of memset operations. For built-in types, this should be very easy to do, since the functions could be hand-coded as opposed to requiring code generation from the compiler. Sean
Dec 17 2008
On Thu, 18 Dec 2008 00:52:43 +0300, bearophile <bearophileHUGS lycos.com> wrote:At the bottom of this post there are two little test programs that may show a problem with static arrays. D code compiled with DMD 1.038: dmd test_d.d Compile time: 8.29 s Resulting exe and obj: 17.756.188 test_d.exe 2.390 test_d.map 17.734.261 test_d.obj C code compiled with DMC 8.42n: dmc test_c.c Compile time: 0.04 s Resulting exe: 51.740 test_c.exe C code compiled with GCC (4.2.1): gcc test_c.c -o test_c Compile time: 0.16 s Resulting exe: 16.781 test_c.exe C code compiled with LLVM-GCC (2.5): llvm-gcc test_c.c -o test_c Compile time: 0.11 s Resulting exe: 13.462 test_c.exe (The following code is just reduced toy, but the full program runs up to about 13 times faster with static arrays, so I'd like to use them instead of the dynamic ones.) Using obj2asm on the test_d.obj it's made mostly of: _D6test_d1aG3G65G65G130f: db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh db 000h,000h,0c0h,07fh,000h,000h,0c0h,07fh ... ------------------------- // C code: #include "stdio.h" #define N 65 #define M (N+N) float a[3][N][N][M]; float b[4][N][N][M]; float c[N][N][M]; int main() { int i, j, k; for (i = 0; i < N; i++) for (j = 0; j < N; j++) for (k = 0; k < M; k++) { a[0][i][j][k] = 0.0; a[1][i][j][k] = 0.0; a[2][i][j][k] = 0.0; b[0][i][j][k] = 0.0; b[1][i][j][k] = 0.0; b[2][i][j][k] = 0.0; b[3][i][j][k] = 0.0; c[i][j][k] = 0.0; } printf("%f %f %f\n", a[2][10][10][10], a[2][10][10][10], c[10][10][10]); return 0; } ------------------------- // D code, it's the same, // I have just changed the first 3 lines. import std.c.stdio: printf; const int N = 65; const int M = (N+N); float a[3][N][N][M]; float b[4][N][N][M]; float c[N][N][M]; int main() { int i, j, k; for (i = 0; i < N; i++) for (j = 0; j < N; j++) for (k = 0; k < M; k++) { a[0][i][j][k] = 0.0; a[1][i][j][k] = 0.0; a[2][i][j][k] = 0.0; b[0][i][j][k] = 0.0; b[1][i][j][k] = 0.0; b[2][i][j][k] = 0.0; b[3][i][j][k] = 0.0; c[i][j][k] = 0.0; } printf("%f %f %f\n", a[2][10][10][10], a[2][10][10][10], c[10][10][10]); return 0; } Bye, bearophileThis is a linker issue. DMD stores a, b,c AND a.init, b.init, c.init (doubling data size). Intelligent linker could strip a.init, b.init and c.init if they are never accessed.
Dec 18 2008