digitalmars.D - D stack allocation
- van eeshan (95/117) Aug 10 2004 needs
- Andy Friesen (7/15) Aug 10 2004 You'll notice, though, that there's absolutely nothing preventing D from...
- van eeshan (6/21) Aug 10 2004 Yep; agreed (and noted below). It just doesn't do that today, so I thoug...
- van eeshan (205/205) Aug 11 2004 Attached a little graph that shows arrays allocated on the heap versus
- Sean Kelly <sean f4.ca>s (8/19) Aug 11 2004 Regan, if I remember correctly, reported a bug where executable size is
- stonecobra (8/20) Aug 11 2004 That's why I think the ability to have it on by default is great. If it...
- van eeshan (23/42) Aug 11 2004 testing
- Sean Kelly (14/39) Aug 11 2004 Ah okay. I thought perhaps the current code generation involving arrays...
- Regan Heath (21/79) Aug 11 2004 I'm not sure it was me who reported it initially, Arcane Jill was
- Walter (7/19) Aug 11 2004 stack-based
- Walter (43/65) Aug 11 2004 to
- van eeshan (59/101) Aug 12 2004 The whole point of that post was to selectively and optionally /inhibit/
- Derek Parnell (10/37) Aug 12 2004 Sounds like a good idea. It allows a coder to take some responsibility f...
- Ben Hinkle (27/65) Aug 12 2004 As a general principle I agree it would be nice to be able to skip
- van eeshan (65/116) Aug 12 2004 easily
- Walter (28/37) Aug 12 2004 the
- van eeshan (57/96) Aug 12 2004 deal.
- Walter (32/71) Aug 12 2004 approach
- Michael (3/3) Aug 13 2004 Just out of curiosity, how does alloca() work? I guess it just manipulat...
- Walter (8/11) Aug 13 2004 the
- Michael (3/13) Aug 13 2004 Ahh, I hadn't thought about expression temps. Very interesting stuff. I ...
- van eeshan (58/118) Aug 13 2004 out
- Sean Kelly (8/13) Aug 13 2004 I think Walter meant that this is typically the proper behavior for an
- Walter (11/23) Aug 13 2004 that
- van eeshan (21/45) Aug 13 2004 May I point you towards this post, which has an attempted summary of the
- van eeshan (24/61) Aug 12 2004 Addendum.
- Walter (39/59) Aug 12 2004 I'm confused - you wrote: "The allocated (temporary) workspace is typica...
- Arcane Jill (5/8) Aug 12 2004 That's probably good news, but:
- Walter (4/11) Aug 13 2004 (Please
- Arcane Jill (16/34) Aug 13 2004 That is a very good point. In fact, it's an excellent point, and one whi...
- Walter (10/16) Aug 13 2004 with
- van eeshan (88/123) Aug 13 2004 flaw
- Arcane Jill (10/22) Aug 13 2004 I think you may have misunderstood Walter. Arbitrary random data can ind...
- van eeshan (21/43) Aug 13 2004 Fair point Jill.
- Walter (17/33) Aug 13 2004 that
- van eeshan (37/96) Aug 30 2004 So, the resolution on this has apparently been implemented. If you write...
- Ilya Minkov (11/11) Aug 12 2004 Ahem, you're speaking of allocating a *buffer*. A large thing, perhaps
- Ben Hinkle (7/17) Aug 13 2004 another option is to look for "void" as an initialization value:
- Arcane Jill (8/15) Aug 13 2004 /another/ option is for the compiler, on seeing the code:
- Ben Hinkle (12/28) Aug 13 2004 large
- Andy Friesen (8/16) Aug 13 2004 I like this approach. The programmer didn't say where he wants the
- van eeshan (10/26) Aug 13 2004 With respect Andy, the programmer was quite explicit about where the mem...
- Andy Friesen (32/38) Aug 13 2004 One of the greatest things about D is that we have a chance to change
- van eeshan (24/64) Aug 13 2004 All good points. As long as there is some easy means of avoiding unwante...
- Andy Friesen (12/32) Aug 13 2004 I actually didn't mean to address implicit initialization directly at
- Ben Hinkle (5/9) Aug 13 2004 Perl! ouch! ;-)
- van eeshan (44/50) Aug 13 2004 In light of what we now know (but still don't quite know :-) I think it'...
- ben 0x539.de (41/41) Aug 15 2004 Pretty ignorant of some of this thread, I would like to make some points...
- Derek Parnell (13/60) Aug 15 2004 This idea is stilling making sense to me, even when taking Walter's
- van eeshan (3/20) Aug 13 2004 That's a very nice idea Ben ...
- Ben Hinkle (6/148) Aug 11 2004 Have you tried alloca? I haven't tried running it but I'm thinking of
- van eeshan (3/8) Aug 11 2004 Tried that, Ben; alloca() also clears everything to zero.
- Ben Hinkle (6/16) Aug 11 2004 Hmm. I just tried on Linux and got the times:
- Walter (6/15) Aug 11 2004 No, it does not. It simply calls C's alloca(), and gets whatever C's
- van eeshan (34/182) Aug 11 2004 Well, blow me down with a feather. I must have really screwed that test ...
- Arcane Jill (12/15) Aug 12 2004 My experiments seem to suggest that the following actually works rather ...
- van eeshan (24/40) Aug 12 2004 Now that is exactly what the compiler ought to (optionally) provide for,...
- van eeshan (7/23) Aug 12 2004 I've been trying to convert this to a mixin, but not having much luck. G...
"Walter" <newshound digitalmars.com> wrote in message news:cfbip7$ora$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfb72j$9l7$1 digitaldaemon.com...needs"Walter" <newshound digitalmars.com> wrote in message news:cfb5v9$76u$1 digitaldaemon.com...The problem, however, is that since Java does not allow any objects or arrays on the stack (or in static data), whenever you need one ittoThanks for clarifying that Walter. Would it be prudent to note that D allocates auto Objects on the heap? Stack allocation may be allowed for in the spec, but that's not how the compiler operates. Also, there is that point about heap allocations being a lot more expensive than stack allocations. Naturally, one would lean towards wholeheartedly agreeing; but this is just not true when it comes to D. Here's a little (contrived) test jig to illustrate: private import std.c.time; private import std.c.stdlib; private import std.c.windows.windows; void heap() { void *x = malloc (16 * 1024); free (x); } void stack() { ubyte[16 * 1024] x; } void trace (uint startTime) { FILETIME ft; GetSystemTimeAsFileTime(&ft); printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000); } void main() { FILETIME start; GetSystemTimeAsFileTime(&start); trace (start.dwLowDateTime); sleep (3); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) heap(); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) stack(); trace (start.dwLowDateTime); } The results? On this machine it takes the heap() routine ~500ms to execute, and the stack() version ~3500ms. Yep, that's seven times longer. Why? What wasn't stated is that D ensures all stack variables are initialized ... including arrays. The innocuous stack allocation actually clears the 16KB to zero each time, which is why it takes so long. For a 32KB 'buffer', the stack() execution time goes up to ~8600ms versus the ~500ms for heap(); you can see there's some rather non-linear behavior for the stack version <g> The point here is not about the benefits/detriments of deterministic initialization. It's about clarifying a point of performance-tuning that is far from obvious. Without wishing to put too fine a point on it, the quoted post was somewhat misleading. That is; allocating an array on the D stack can consume far more cycles than the equivalent heap allocation. It depends upon the size of the array, the bus speed, the size of the cache, heap fragmentation and many other things. D stack allocations are simply not like C/C++ at all. Far from it. For the curious, here's the disassembly for both routines. It's worth noting that the array initialization could be sped up. Also worth noting is that the loop/call overhead was 10ms. 10: { 11: void *x = malloc (16 * 1024); 12: free (x); 00402011 push 4000h 00402016 call _malloc (0040574c) 0040201B add esp,4 0040201E push eax 0040201F call _free (004057a4) 00402024 add esp,4 13: } 00402027 pop eax 00402028 ret 14: 15: 16: void stack() 0040202C push ebp 0040202D mov ebp,esp 0040202F mov edx,4 00402034 sub esp,1000h 0040203A test dword ptr [esp],esp 0040203D dec edx 0040203E jne _D5hello5stackFZv+8 (00402034) 00402040 sub esp,edx 00402042 mov ecx,4000h 00402047 xor eax,eax 17: { 18: ubyte[16 * 1024] x; 00402049 push edi 0040204A lea edi,[x] 00402050 rep stos byte ptr [edi] 19: } 00402052 pop edi 00402053 mov esp,ebp 00402055 pop ebp 00402056 retJavabe allocated on the heap. This is a lot more expensive than stack allocation. Since one winds up allocating a lot more heap objects instack? For auto objects, it is allowed for the compiler to put them there. But I was primarilly referring to structs and arrays.than one does in C/C++, the end result is slower. Java offers no alternatives to gc allocation. This is why D allows stack based objects and value objects.Are you saying that D allows one to create Objects entirely upon theAnd entirely within static data?Structs and arrays, yes. Java does not allow structs, nor does it allow arrays on the stack.
Aug 10 2004
van eeshan wrote:Thanks for clarifying that Walter. Would it be prudent to note that D allocates auto Objects on the heap? Stack allocation may be allowed for in the spec, but that's not how the compiler operates. Also, there is that point about heap allocations being a lot more expensive than stack allocations. Naturally, one would lean towards wholeheartedly agreeing; but this is just not true when it comes to D. Here's a little (contrived) test jig to illustrate:You'll notice, though, that there's absolutely nothing preventing D from performing these allocations on the stack as an optimization. All the compiler needs is to be certain that the object's lifetime is limited to the scope. Auto references offer precisely this. (incidently, it is legal to write "auto int[] x = new int[...];") -- andy
Aug 10 2004
Yep; agreed (and noted below). It just doesn't do that today, so I thought it was worth a clarification (for those who didn't know). "Andy Friesen" <andy ikagames.com> wrote in message news:cfbpc5$13eh$1 digitaldaemon.com...van eeshan wrote:inThanks for clarifying that Walter. Would it be prudent to note that D allocates auto Objects on the heap? Stack allocation may be allowed forexpensivethe spec, but that's not how the compiler operates. Also, there is that point about heap allocations being a lot morethan stack allocations. Naturally, one would lean towards wholeheartedly agreeing; but this is just not true when it comes to D. Here's a little (contrived) test jig to illustrate:You'll notice, though, that there's absolutely nothing preventing D from performing these allocations on the stack as an optimization. All the compiler needs is to be certain that the object's lifetime is limited to the scope. Auto references offer precisely this. (incidently, it is legal to write "auto int[] x = new int[...];") -- andy
Aug 10 2004
Attached a little graph that shows arrays allocated on the heap versus arrays declared as local (stack) variables. You can see that stack allocations are faster only for array sizes of below ~64 bytes (on this machine), and quickly become significantly slower once you start using local arrays for anything bigger than around 512-bytes. There's an interesting quirk at the 4096-byte mark: the codegen changes strategy slightly at this point, and I'd guess it is doing something akin to paging (ensuring the stack page-set is available), though can't be sure. Take a look at the previously posted assembler for reference. If local (stack) arrays were to use the C/C++ approach, the time consumed would be ~10ms across the board, rather than the ~500ms for heap allocations. That is, for a 32KB local array, the initialization cost is ~865 times /more/ than it would be for the equivalent C/C++ approach (~10ms versus ~8650ms); almost three orders of magnatude. This indicates two things: 1) making claims about D's perfomance superiority over Java due to the former's ability to use the stack (versus always on the heap) is based on speculation; not fact. Testing shows this to be entirely the opposite for arrays beyond 64 bytes in length. Similarly, any kind of performance testing against C/C++ within this realm will be viewed as a negative mark against D. 2) Implicit initialization is great. However, it can clearly and easily be demonstrated to be a serious impediment. Given that D is supposed to be a 'systems' language, I humbly suggest that a mechanism be provided to bypass such nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier). What do you folks think? begin 666 stack allocation.PNG MB5!.1PT*& H````-24A$4 ```C$```$L" (```"]K75[````!&=!34$````` MBR5 30```"!C2%)-``````````````````````````````````````````" M`0;P. $;KI%$P"D;!*! %`PS =1! `(W62:- %(R"43 *!AC ZR" `*)5 M>($/$$#TZ M$*V3``)H8,;NB.<28_ H& 6C8!2,`C0P1.LD `:DI,W0\NUHV 4C()10'\P M"D;!*!B&8(C620`!-%HGC8)1, I&P3 $0[1.` B T3II%(R"43 *AB$8HG42 MZZ11, I&P2 8U("! 9S:98C620`!-%HGC8)1, I&P: &(ZI.` B T3II%(R" M43 *!B\ KT+Z/V3K)( `&JV31L$H& 6C8) "LBND_T.V3 ((H-$Z:12, E$P M" 8I&(%U$D `C=9)HV 4C()1,! !)172_R%;)P$$T&B=- I&P2 8!8,1C,PZ M`.45TO\A6R<!!!!-ZB3,. :901(7E_E4=.TH& 6C8!0,*C"2ZR2 `*)'/XD! M!M#M'JV31L$H& 6C`!50I4+Z/V3K)( `HE.=A,P8K9-&P2 8!:,`*Z!6A?1_ M^C]DZR2 `!JMDT;!*! %HV P6B=]!^I5 <(H-$Z:12, E$P" 8,T*A"^C]D M==(H& 6C8!30&]"Z0OH_9.LD `:K9-&P2 8!:. KH .%=+_(5LG`030:)TT M"D;!*! %= 6C=1(F )?J``$T6B>- E$P"D8!_0!]*J3_0[9.` B T3II%(R" M43 *Z 3H5B']'[)U$D `C=9)HV 4C()10"<P6B?A`O!2'2" 1NND43 *1L$H MH > 9X7T?\C620`!-%HGC8)1, I&`<T!G2ND_T.V3 ((H-$Z:12, E$P"F . M1NLD_ !>J ,$T&B=- I&P2 8!;0%]*^0_ _9. D &A5)V$U$&$K&""+X^$2 M:? H& 6C8!0,3C!:)Q$$\%(=((!H4B=AK53 L U$T$N+O.IZ-I1, I&P2B M'1B0"NG_D*V3``*(3OTDY/IFM$X:!:- %(P0,% 5TO\A6R<!!! ]ZB22*J'1 M.FD4C()1,&S :)U$)("7Z `!1*<Z"0T JR&I3D(S812, E$P" 8M&, *Z?^0 MP6"HD/X/V3H)((!&ZZ11, I&P2B )ABMD\ `\%(=((!&ZZ11, I&P2B &A D M%=+_(5LG`030:)TT"D;!*! %U &#IT+Z/V3K)( `&JV31L$H& 6C #I M$XB M&\!+=8 `&JV31L$H& 6C I 4%5(_X=LG0000*-UTB 8!:- %% !C-9)E !X MJ0X00*-UTB 8!:- %% *!EN%]'_(UDD``31:)XV"43 *1 %%8!!62/^';)T$ M%(R"43 *R &#N4+Z/V3K)( `&JV31L$H& 6C !PP6B=1$<!+=8 `&JV31L$H M)G429AV#S"")B\M\*KIV%(R"43 *B 1#I4+Z/V3K)( `HE,_"5EPM$X:!:- M% Q1,%HGT0C 2W6 `*)?G819V8S62:- %(R"(02&4(7T?\C620`!-,3J) 88 MH*YK1\$H& 6C " 8K9-H!^"E.D `T6^-`Q:[1_M)HV 4C((A`H96A?1_R-9) M:(7T?\C620`!-%HGC8)1, I&`4XP6B?1!\!+=8 `&JV31L$H& 6C`#L8NA72 M_R%;)P$$T&B=- I&P2 8!=C!:)U$-P`OU0$":+1.& 6C8!2,`BQ 2%=(_X=L MG0000*-UTB 8!:- %*"#H5XA_1^R=1) `(W62:- %(R"48 .1NLD. -XJ0X0 M0*-UTB 8!:- %*" 85 A_1^R=1) `(W62:- %(R"48 "1NLD^ -XJ0X00*-U M"^"E.D `C8[=C8)1, I&.AA^%=+_(5LG`030:)TT"D;!*!CI8+1.&G `+]4! M`HA6=1*: 6 S3R1Q"1H^"D;!*! %9(-A62']'[)U$D `D5PG(73BUH)9QR S M-JA8)Z%5K:- %(R"44 V&*V3!AQ %O ``42/-0ZC_:11, I&P6 #P[M"^C]D M_KJ[`02#P0VC8!2, J$+1NLDJ ."QA)9)P$$T&B=- I&P2 866 D5$C_:5\G M(8]I,6 #R,K^8ZM^L H"!!"MQNYH"D;KI%$P"D8!>6"$5$C_:5PG81J")H)9 M#Z M6B>- E$P"D8*&#D5TG]ZS2>A52IXN$3VDP`"B)PZ:73L;A2, E$P%,%HG?2? MCH)1, H&/QBMD^! 4)6?F'420 "-CMV- E$P"H8Y&($5TO\A6R<!!-!HG30* M%<& "M-1, I&P: %([9"^C]DZR2 `")SS^QHG30*1L$H&/Q M$["!(.J_,2L MUDE8Q0=YG0000*-K'$;!*! %PQ",\ KI/]W/%H(+HBF #ZIA[<E UDD``411 M/VET[&X4C()1,#C!:)U$5)W$0#I M""!Z])-&ZZ11, I&`=W :)T$`8-J?Q(N2S$+?( `&H"QN]$Z:12, E% (S!: M(<'!$*V3``*(G#II MD,%HA80)AFB=!!! =%H+3ETP&-PP"D;!*! \8+1.P 1#M MP= &HQ425C!$ZR2 `!JMDT;!*! %0QN,UDE8P1"MDP`":+1.& 6C8!0,83!: M(>$"0[1.` B T3II%(R"43"$P6B=A L,T3H)((!&ZZ11, I&P5 %HQ42'C!$ MZR2 `!JMDT;!*! %0Q6,UDEXP!"MDP`":+1.& 6C8!0,23!:(>$'0[1.` B MT3II%(R"43 DP6B=A!\,T3H)((!&ZZ11, I&P= #HQ4203!$ZR2 `!JMDT;! M*! %0P^,UDD$P1"MDP`":+1.& 6C8!0,,3!:(1$#AFB=!!! HW72*! %HV"( M =$ZB1 P1.LD `:K9-&P2 8!4,)C%9(1((A6B<!!-!HG30*1L$H&$I M$XB M$ S1. D .A4)Z%=MD02%ZMIM'#D*! %HV"0 ]$*B7 P1.LD "B1YV$L S, M((F+W\!1, I&P8 "HW42\6"(UDD``42G. FYZS-:)XV"43 *R "C%1))8(C6 M` B (58G$90:!:- % Q+,%HGD0J&:)T$$$ #L.X.<WH)#Q>7:31U[2 8!:- M$AE B-9)``$T6B>- E% !3!::-(.C(8M>6"(UDD``31:)XV"44 I !2:HT4G MC<!HP)('AFB=!!! HW72*! %% %XB3E:=-("C(8JV6"(UDD``31:)XV"44 ^ M.1 -0\K!$*V3``)HM$X:!:. !$!D63E:I%((1 .0<C!$ZR2 `!JMDT;!*" 6 MD%10CI:J9(/1H*,*&*)U$D `C=9)HV 4$ 5(+2A'"U:RP6C0404,T3H)((!& MZZ11, H(`_)*R=&RE0PP&FC4`D.T3 ((H-$Z:12,` * [%)RM' E`XP&&K7 M$*V3``)HM$X:!:,`'Z"PB!PM84D"H\%%13!$ZR2 `!JMDT;!*, )*"\B1PM9 M`IH6CJ,E+RXP&C*T`$.T3 (((#K526 `V5*2ZB0T$T;!** BH'7).%KR8 6C MP4(C,$3K)( `&H U#J/]I%$P" %]2L;1\A<3C(8)C< 0K9,``FA ZB2TWA(> M+AYS1L$HH!: 6[$X6OZB =$`H1T8HG420 "-[D\:!2,=T+E8'"V%D<%H:- . M-!QH"H9HG0000*-UTB 8H6! "\31XG T!& -AFB=!!! HW72*!B)8, +Q %W MP("#T1" -1BB=1) `(W62:- Q(%!4AH.$F?0'P`]/F+]3D\P1.LD `:K9-& MK9,``FBT3AH%PPH,W<)QB#K[_U!V^? &0[1.` B T3II% P3,'1K(S 8<NX? M!F$^C,$0K9,``FBT3AH%0QX,FY)Q:/EB:+EV!((A6B<!!-!HG30*AC 8-K41 MHX 2,.QK(S 8;-X<.2$_G, 0K9,``HA^=1)F93-:)XT"(L%(*Q,'E6<'E6-& MW:/]I%& &XSP`G% _3["`W]X "%:)P$$$)W6."#W;]#Z.OBYN MJ]$`'[% B-9)``$T6B>-`GJ T6*1$D!&N(T&^" 8HG420 "-UDFC (9 M&2D M #I M%BD-< ,V]$P'P5XP!"MDP`":+1.& 7D ]$N$3T!/)!'PWP4$ .&:)T$ M&JV31 $^,#HZ-PI&P1 %0[1.` B T3II%" `< TT6 ^- E$PI,$0K9,``FBT M3AJ)`+/N&:V!1L$H&&9 B-9)``$T6B<-6X"KXAFM>T;!*! )8(C620`!-%HG M20B7X782.:X=\*[W(.O7CX)1, J&-QBB=1) `(V8. F?C;3LP8R"43 *1L% M "%:)P$$T$BIDT8KF%$P"D;!B )#M$X""* A5B<QC()1, I&P2 ` S1. D M (98G410BA8N&;5E\- R;#PR:LL M&+4E $$<-<"!-! K), `+\:^KADU);! MHV 4C()1, H&&,#K M8 "O P`":+1.& 6C8!2, E$PP !>!P$$X-Z,<0``01CH_U_M1 Q8H-K!R.AP M;0G$B9__)'?G5)X]253L42N!Y"2A(L1=B0E5G/G<!LM'*$1FY3I85=,01S(` M2::1EXF(CV7*?80G6G$E8^)M(9 !\1M+HY*-_ #N?H `W%4Q$ ` "/K_JVMK M2$!3N^MR<PA25#ZI1X10MP4<#N4EYV MS2>)2Y(M)%42Y%E!O&D45A $N63'#C M#R8 M $&\#H(((!&ZZ11, I&P2 8!0,,X'400 "-UDFC8!2, E$P" 88P.L `: MC_\8W2!D76B,_WCK)/S=IO^CV7 4C& `3_P``31:)XV"40`%:-4)FC N]5 5 MHV"D`7CB!PB T3II%(P", '^7$-&GAK-AJ- Q )XX <(H-$Z:12,`O(!KJ4* MHQ72*! %) %X^ <(H-$Z:12, E$P"D;!``-X'0000*-UTB 8!:- %(R"`0;P M. $;KI%$P"D;!*! %`PS =1! `(W62:- %(R"43 *!AC ZR" `!JMDT;! M& 6C8!2, $&\#H((("P'"(Y"D;!*! %HV 4T!E *B. `!KM'HV"43 *1L$H J&"P`((!&ZZ11, I&P2 8!8,%``08`&FN<"M[6$Y2`````$E%3D2N0F"" ` end
Aug 11 2004
In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...1) making claims about D's perfomance superiority over Java due to the former's ability to use the stack (versus always on the heap) is based on speculation; not fact. Testing shows this to be entirely the opposite for arrays beyond 64 bytes in length. Similarly, any kind of performance testing against C/C++ within this realm will be viewed as a negative mark against D.Regan, if I remember correctly, reported a bug where executable size is proportional to the size of static arrays the application contains. This performance issue sounds like it may be related.2) Implicit initialization is great. However, it can clearly and easily be demonstrated to be a serious impediment. Given that D is supposed to be a 'systems' language, I humbly suggest that a mechanism be provided to bypass such nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).An impediment how? Just in terms of performance? In general, I would say that default initialization is worth the performance cost for its ability to prevent bugs. Sean
Aug 11 2004
That's why I think the ability to have it on by default is great. If it were able to be turned off, that is no different from me reusing a buffer. The buffer is only initialized once, yet I re-use it thousands of times without much problem. Performance does and will always matter, regardless of the secret contract between Intel and Microsoft to castrate newer, faster PCs. MHO, of course. Scott2) Implicit initialization is great. However, it can clearly and easily be demonstrated to be a serious impediment. Given that D is supposed to be a 'systems' language, I humbly suggest that a mechanism be provided to bypass such nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).An impediment how? Just in terms of performance? In general, I would say that default initialization is worth the performance cost for its ability to prevent bugs.
Aug 11 2004
"Sean Kelly s" <sean f4.ca> wrote in message news:cfe4hu$19cf$1 digitaldaemon.com...In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...testing1) making claims about D's perfomance superiority over Java due to the former's ability to use the stack (versus always on the heap) is based on speculation; not fact. Testing shows this to be entirely the opposite for arrays beyond 64 bytes in length. Similarly, any kind of performanceD.against C/C++ within this realm will be viewed as a negative mark againstRegan, if I remember correctly, reported a bug where executable size is proportional to the size of static arrays the application contains. This performance issue sounds like it may be related.AFAIK this is not related in the slightest, I'm afraid. This topic is purely about CPU cycles, and the stack. Nothing to do with static data or executable size.be2) Implicit initialization is great. However, it can clearly and easilybypassdemonstrated to be a serious impediment. Given that D is supposed to be a 'systems' language, I humbly suggest that a mechanism be provided tothatsuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).An impediment how? Just in terms of performance? In general, I would saydefault initialization is worth the performance cost for its ability topreventbugs.I don't think anything negative was said about default initialization as a feature, Sean. If it was then I apologize. Default/implicit initialization is great (as was noted; so we agree), but there are times when you need to take full control over what's going on. It may not matter to you specifically, but there are many situations whereby one needs to take full responsibility over a select set of variables (such as temporary stack-based buffers). Providing the mechanism as an /option/ is not detrimental to the language. It's like applying "const", or one of the other optional qualifiers. I'm just illustrating that the language could really use a means to disable default-initialization for those cases where it is deemed both completely unecessary and too costly. Surely that's not too much to consider?
Aug 11 2004
In article <cfe60s$1a2i$1 digitaldaemon.com>, van eeshan says..."Sean Kelly s" <sean f4.ca> wrote in message news:cfe4hu$19cf$1 digitaldaemon.com...Ah okay. I thought perhaps the current code generation involving arrays had issues that might be corrected in the future.Regan, if I remember correctly, reported a bug where executable size is proportional to the size of static arrays the application contains. This performance issue sounds like it may be related.AFAIK this is not related in the slightest, I'm afraid. This topic is purely about CPU cycles, and the stack. Nothing to do with static data or executable size.Not at all. However this might violate Walter's rule against pragmas. Even so, I had overlooked the fact that declaring an array would implicitly initialize all array elements. This kind of stinks from a performance perspective, since 90% of the time I use arrays I have no need for them to be initialized. Perhaps a new attribute? declare int x[10000000]; declare { float a[100000]; char b[1000000]; } SeanAn impediment how? Just in terms of performance? In general, I would saythatdefault initialization is worth the performance cost for its ability topreventbugs.I don't think anything negative was said about default initialization as a feature, Sean. If it was then I apologize. Default/implicit initialization is great (as was noted; so we agree), but there are times when you need to take full control over what's going on. It may not matter to you specifically, but there are many situations whereby one needs to take full responsibility over a select set of variables (such as temporary stack-based buffers). Providing the mechanism as an /option/ is not detrimental to the language. It's like applying "const", or one of the other optional qualifiers. I'm just illustrating that the language could really use a means to disable default-initialization for those cases where it is deemed both completely unecessary and too costly. Surely that's not too much to consider?
Aug 11 2004
On Wed, 11 Aug 2004 15:21:50 -0700, van eeshan <vanee hotmail.net> wrote:"Sean Kelly s" <sean f4.ca> wrote in message news:cfe4hu$19cf$1 digitaldaemon.com...I'm not sure it was me who reported it initially, Arcane Jill was discussing it more recently.. I think what Sean was referring to by mentioning the size of executables is an indication of how static arrays are currently implemented, the array seems to be compiled into the executable, whether this is to increase performance or decrease complexity, we don't know.In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...testing1) making claims about D's perfomance superiority over Java due to the former's ability to use the stack (versus always on the heap) is basedonspeculation; not fact. Testing shows this to be entirely the oppositeforarrays beyond 64 bytes in length. Similarly, any kind of performanceD.against C/C++ within this realm will be viewed as a negative markagainstRegan, if I remember correctly, reported a bug where executable size is proportional to the size of static arrays the application contains. This performance issue sounds like it may be related.AFAIK this is not related in the slightest, I'm afraid. This topic is purely about CPU cycles, and the stack. Nothing to do with static data or executable size.After seeing the cost of default initialization, I can see when someone might not want it, it should be optional IMO. Perhaps it's a good default behaviour, I am un-convinced, AFAIKS un-initialised data generally causes a spectaclar and obvious failure, in fact shouldn't good DBC catch them, so, if I have to choose between default initialization and efficient arrays I choose the latter. default initialization also seems to be responsible for hiding my un-used variables? or is dmd just not looking for them, I guess an 'error' is overkill for an un-used variable, and there might possibly be a reason to use one, can anyone think of one? I am a tidy programmer and these bug me. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/be2) Implicit initialization is great. However, it can clearly and easilybypassdemonstrated to be a serious impediment. Given that D is supposed tobe a'systems' language, I humbly suggest that a mechanism be provided tothatsuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).An impediment how? Just in terms of performance? In general, I would saydefault initialization is worth the performance cost for its ability topreventbugs.I don't think anything negative was said about default initialization as a feature, Sean. If it was then I apologize. Default/implicit initialization is great (as was noted; so we agree), but there are times when you need to take full control over what's going on. It may not matter to you specifically, but there are many situations whereby one needs to take full responsibility over a select set of variables (such as temporary stack-based buffers). Providing the mechanism as an /option/ is not detrimental to the language. It's like applying "const", or one of the other optional qualifiers. I'm just illustrating that the language could really use a means to disable default-initialization for those cases where it is deemed both completely unecessary and too costly. Surely that's not too much to consider?
Aug 11 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfe60s$1a2i$1 digitaldaemon.com...I don't think anything negative was said about default initialization as a feature, Sean. If it was then I apologize. Default/implicit initialization is great (as was noted; so we agree), but there are times when you need to take full control over what's going on. It may not matter to you specifically, but there are many situations whereby one needs to take full responsibility over a select set of variables (such as temporarystack-basedbuffers). Providing the mechanism as an /option/ is not detrimental to the language. It's like applying "const", or one of the other optional qualifiers. I'm just illustrating that the language could really use a means todisabledefault-initialization for those cases where it is deemed both completely unecessary and too costly. Surely that's not too much to consider?You can take full control in D by doing: byte* p = malloc(size); and p will point to an uninitialized buffer.
Aug 11 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfdoer$13pt$1 digitaldaemon.com...There's an interesting quirk at the 4096-byte mark: the codegen changes strategy slightly at this point, and I'd guess it is doing something akintopaging (ensuring the stack page-set is available), though can't be sure. Take a look at the previously posted assembler for reference.That's because of how the underlying OS allocates space to the stack. This shift in strategy happens with C/C++, too.If local (stack) arrays were to use the C/C++ approach, the time consumed would be ~10ms across the board, rather than the ~500ms for heap allocations. That is, for a 32KB local array, the initialization cost is ~865 times /more/ than it would be for the equivalent C/C++ approach(~10msversus ~8650ms); almost three orders of magnatude.To do an apples/apples comparison, benchmark against calloc/free, not malloc/free. I normally use calloc in C++, not malloc, because the uninitialization bugs aren't worth it.This indicates two things: 1) making claims about D's perfomance superiority over Java due to the former's ability to use the stack (versus always on the heap) is based on speculation; not fact. Testing shows this to be entirely the opposite forNo, it does not show that at all: 1) Your test was with C++, not Java. 2) You cannot allocate uninitialized arrays (or any uninitialized data) in Java. 3) I've written a Java compiler and have done a lot of performance tuning of Java apps trying to make them go faster. I'm very familiar with where the cycles got eaten. 4) I presume that someone allocating an array in C++ would presumably store something in it - just allocating it, doing nothing with it, and freeing it is about as representative as benchmarking the sequence of code: x = x + 1; x = x - 1;arrays beyond 64 bytes in length.Most lightweight objects are less than 64 bytes in length. Allocating 1000 byte objects on the stack is very rare, not the usual case.Similarly, any kind of performance testing against C/C++ within this realm will be viewed as a negative mark againstD. I've had problems with inappropriate use of benchmarks since my compiler executed the following loop: void foo() { int i; double d; for (i = 0; i < 1000; i++) d += 1; } in orders of magnitude faster than the competition. The journalist disassembled the code, found that the compiler had removed the floating point instructions, and so declared the compiler to be "buggy" in his writeup. He never bothered to contact us to find out why. What actually happened was that it was the ONLY compiler reviewed that did data flow analysis, and concluded that the benchmark routine did nothing and so deleted it. Today, nearly all compilers do that.2) Implicit initialization is great. However, it can clearly and easily be demonstrated to be a serious impediment. Given that D is supposed to be a 'systems' language, I humbly suggest that a mechanism be provided tobypasssuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).The solution is improving the optimizer so it can determine when the initialization is redundant. It already does this for most variables, just not for arrays.
Aug 11 2004
"Walter" <newshound digitalmars.com> wroteconsumedIf local (stack) arrays were to use the C/C++ approach, the timeThe whole point of that post was to selectively and optionally /inhibit/ initialization, where that eats far too many cycles. That's why malloc() was used rather than calloc(). There are plenty times where you need to avoid the (totally superfluous) overhead with workspace buffers. You were stating that heap allocation is a lot slower than stack allocation. Well, that is absolutely not true when you don't want the initialization; heap allocation is (bizarrely) faster within D since you /cannot/ inhibit the darned array initialization when you really need to. C/C++ stack allocations are blindingly fast by comparision; just a couple of instructions in many cases.would be ~10ms across the board, rather than the ~500ms for heap allocations. That is, for a 32KB local array, the initialization cost is ~865 times /more/ than it would be for the equivalent C/C++ approach(~10msversus ~8650ms); almost three orders of magnatude.To do an apples/apples comparison, benchmark against calloc/free, not malloc/free. I normally use calloc in C++, not malloc, because the uninitialization bugs aren't worth it.1) Your test was with C++, not Java.That was D source-code.4) I presume that someone allocating an array in C++ would presumablystoresomething in it - just allocating it, doing nothing with it, and freeingitis about as representative as benchmarking the sequence of code: x = x + 1; x = x - 1;Thanks Walter ... you have never heard of allocating a stack buffer, and then passing it off to another function for population? Yes, something else is done with the buffer, but often there's only a very /small/ part of it used. The overhead of initialization totally swamps the typical processing time. Do that often enough (in, say, something like text processing) and it adds up rapidly. You've never done this kind of thing before? Fine. This was a test to see where the unknown overhead was coming from during a C port; naturally, the test was whittled down to isolate the culprit. Isn't that how you like them?Most lightweight objects are less than 64 bytes in length. Allocating 1000 byte objects on the stack is very rare, not the usual case.What usual case? Do you mean in all software? Please be a little more specific, as "usual case" is terribly vague. Allocating double[1024], char[8192], and often much bigger, on the stack is not at all rare; far from it. How about just one example? Microsoft does it /all/ the time in their Operating Systems, IE, and probably all their other mainstream software. Why do you think AMD and Intel have the NX instruction in their CPUs now? What's nice about D is that you always know the length of the array, and debug options catch out-of-bounds conditions. Rare? I'd claim it's perhaps a rather usual case in existing C and C++ code. May I ask where you get your use-cases from, please?I've had problems with inappropriate use of benchmarks since my compiler executed the following loop: void foo() { int i; double d; for (i = 0; i < 1000; i++) d += 1; } in orders of magnitude faster than the competition. The journalist disassembled the code, found that the compiler had removed the floating point instructions, and so declared the compiler to be "buggy" in his writeup. He never bothered to contact us to find out why. What actually happened was that it was the ONLY compiler reviewed that did data flow analysis, and concluded that the benchmark routine did nothing and so deleted it. Today, nearly all compilers do that.You didn't bother to ask about the code-paths that prompted this topic, Walter; yet seem to go out of your way to belittle any and all points made; and then vehemently deny any possible existence of the problem. I really don't understand why this post has apparently inflamed you so, but do offer sincere and profound apologize. One would have thought knowing where the codegen was seriously non-optimal for given scenarios would be helpful to you. Would you rather not know?be2) Implicit initialization is great. However, it can clearly and easilyademonstrated to be a serious impediment. Given that D is supposed to beGreat! Will it eliminate initialization for this type of (trivail) usage: void A (void delegate (char[]) formatter) { char [1024 * 32] workspace; formatter (workspace); } Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea. It's very clear what's going on, it's entirely optional, and the compiler will do exactly what the programmer intends it to do. No under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's just a suggestion.'systems' language, I humbly suggest that a mechanism be provided tobypasssuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).The solution is improving the optimizer so it can determine when the initialization is redundant. It already does this for most variables, just not for arrays.
Aug 12 2004
On Thu, 12 Aug 2004 00:31:38 -0700, van eeshan wrote:"Walter" <newshound digitalmars.com> wrote[snip]Sounds like a good idea. It allows a coder to take some responsibility for their own work. There is also nothing wrong with D's auto-initialized arrays *when* the coder accepts them too. I feel that D ought to support both ideas. A new keyword is probably the way to go. -- Derek Melbourne, Australia 12/Aug/04 5:54:15 PMThe solution is improving the optimizer so it can determine when the initialization is redundant. It already does this for most variables, just not for arrays.Great! Will it eliminate initialization for this type of (trivail) usage: void A (void delegate (char[]) formatter) { char [1024 * 32] workspace; formatter (workspace); } Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea. It's very clear what's going on, it's entirely optional, and the compiler will do exactly what the programmer intends it to do. No under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's just a suggestion.
Aug 12 2004
[snip]As a general principle I agree it would be nice to be able to skip initialization just like it's nice to be able to turn off array-bounds checking. But in practice (in D) I wonder how often that would pay off. Allocating a 32K array on the stack is pretty evil, and I wonder how often that really happens. What C code were you porting that was doing that? I assume it was allocating such a big array not because it was going to use it all (as you said only a small portion of the array gets used) but because the coder was lazy and was trying to avoid worrying about array bounds errors. As we all know overwriting array bounds invites exploits and bugs. In D the way I've been coding APIs that take buffers as input is to take a dynamic array as "a typical size buffer" and if the buffer turns out to be too small then the function increases the length field of the local variable which could potentially allocate new GC'ed memory. For example in std/stream.d there is the function char[] readLine(char[] buffer); which takes an input buffer and fills it with the next line of text from the stream. If the line fits in the buffer the output array's length is the number of characters read and the char* is the buffer's char*. If the line didn't fit then readLine (potentially) allocates a new buffer (just by increasing the length field of the local copy of the buffer variable) and returns the char* of the new buffer. The "overflow" buffer is garbage collected eventually so there isn't any memory leak going on. So in essence the input buffer is just a suggested buffer for readLine but there is no guarantee that the buffer will contain the result of the funtion call when it is done. This approach means you can target your buffers to the typical usage and avoid array-bounds errors.be2) Implicit initialization is great. However, it can clearly and easilyademonstrated to be a serious impediment. Given that D is supposed to beGreat! Will it eliminate initialization for this type of (trivail) usage: void A (void delegate (char[]) formatter) { char [1024 * 32] workspace; formatter (workspace); } Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea. It's very clear what's going on, it's entirely optional, and the compiler will do exactly what the programmer intends it to do. No under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's just a suggestion.'systems' language, I humbly suggest that a mechanism be provided tobypasssuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).The solution is improving the optimizer so it can determine when the initialization is redundant. It already does this for most variables, just not for arrays.
Aug 12 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfg7af$2dbv$1 digitaldaemon.com...[snip]easily2) Implicit initialization is great. However, it can clearly andbebedemonstrated to be a serious impediment. Given that D is supposed tousage:aGreat! Will it eliminate initialization for this type of (trivail)'systems' language, I humbly suggest that a mechanism be provided tobypasssuch nicities when one is prepared to assume full responsibility. For example, there might be a qualifier for variables that disables the automatic initialization (along the lines of the 'auto' qualifier).The solution is improving the optimizer so it can determine when the initialization is redundant. It already does this for most variables, just not for arrays.Is allocating a 2K array on the stack equally evil? From the reply to AJ: 2K allocations: heap: 504 stack[]: 5681 alloca(): 57 asm: 10 32K allocations: heap: 520 stack[]: 8812 alloca(): 118 asm: 10 You can see that the 2K array initialization is almost equally obnoxious, compared to the traditional [asm] C/C++ uninitialized approach. The whole point is that here's a scenario where the compiler thinks it knows best, and won't hand over control unless we run rings around it. Perhaps worse, the whole issue is squirrelled away under the covers; you have to dig like a rabbit just to find the culprit.void A (void delegate (char[]) formatter) { char [1024 * 32] workspace; formatter (workspace); } Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea. It's very clear what's going on, it's entirely optional, and the compiler will do exactly what the programmer intends it to do. No under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's just a suggestion.As a general principle I agree it would be nice to be able to skip initialization just like it's nice to be able to turn off array-bounds checking. But in practice (in D) I wonder how often that would pay off. Allocating a 32K array on the stack is pretty evil, and I wonder how often that really happens.What C code were you porting that was doing that? I assume it was allocating such a big array not because it was going to use it all (as you said only a small portion of the array gets used) but because the coder was lazy and was trying to avoid worrying about array bounds errors.Heavy-duty text processing. The coder wasn't being lazy at all, as out-of-bounds conditions are being checked for. The allocated (temporary) workspace is typically used in only a small amounts, but sporadically will get used more fully. The approach taken was apparently "we'll check for end-of-buffer, but if anything exceeds these limits, then it's so beyond the design-specification that it's an error condition". I understand that to be a rather common approach. It's certainly valid, and rather effective. The D approach to this is, " ...you /vill/ pay ze initialization cost, regardless of vot ze design!" (that is a caricature; no offense intended to anyone) The existing code simply took advantage of the fact that it's perfectly legitimate, and very fast, to do that kind of thing in C and C++. If D is going to outlaw such approaches then it had better get it down in the "manifesto". However, I didn't realize that D was such an ideological exercise. <snip>it is done. This approach means you can target your buffers to the typical usage and avoid array-bounds errors.Yes, this is all well and good. But /redesigning/ existing software to suit is just not a solution. While the approach you discuss is perfectly valid, so was the original (given its apparent design scope) since it did not have array-bounds or buffer-overrun conditions. Further, you can't reliably place an extensible buffer on the stack; thus, you are forced to forego the performance benefits of doing so. This thread is not about a design-methodology issue ~ if it were, nobody would be allowed to use the stack for array allocation, at all! It's a simple issue of the D language not providing the programmer with "accessible" tools to decide what's best. In this particular case it showed up in ported code. There's nothing to say that it won't show up in original D software either. As you say, Ben; as a general principal, D should (absolutely) permit the skipping of array initialization. The whole default initialization thing is arguably wonderful, but then not providing accessible control over it is akin to tinpot-dictatorship. For example, I will probably resort to assembly to get around this issue: that's not exactly an accessible or acceptable solution, for obvious reasons. I think we can agree on that basic tenet. Your other points are perhaps more about a design philosophy: that's a different ball-game and, applicable or not, unfortunately does not jive with porting existing code. The optional "keyword qualifier" example (at the top of the page) is likely to be as trivial as one could hope for within the compiler; combined with a little documentation, that would resolve such issues in a pragmatic and considerate manner. Surely that is far simpler than building an optimizer to read the programmers mind? Or, is there some fundamental ideology at stake here?
Aug 12 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfge6g$2g1j$1 digitaldaemon.com...Heavy-duty text processing. The coder wasn't being lazy at all, as out-of-bounds conditions are being checked for. The allocated (temporary) workspace is typically used in only a small amounts, but sporadically will get used more fully. The approach taken was apparently "we'll check for end-of-buffer, but if anything exceeds these limits, then it's so beyondthedesign-specification that it's an error condition". I understand that tobea rather common approach. It's certainly valid, and rather effective. TheDapproach to this is, " ...you /vill/ pay ze initialization cost,regardlessof vot ze design!"Here's an example of an alternate way to do it, for code that would typically use only a small amount, but occaisionally could use a great deal. It can also be set up to never overflow (at least until all memory is exhausted) by repeatedly resizing it. This has the advantage of not needing to set up any error recovery, or create and document an error message for the limits. import std.stdio; char[] formatter(char[] s) { if (s.length < 14) // if our passed buffer isn't big enough s.length = 14; // yes, we can reallocate things that were originally on the stack s[0..14] = "abcdefghijklmn"; // copy result into s[] return s; } void main() { char[4] workspace; // assume 4 is size typically needed in formatter() char[] s; s = formatter(workspace); writefln(s); }
Aug 12 2004
"Walter" <newshound digitalmars.com> wrote ...Here's an example of an alternate way to do it, for code that would typically use only a small amount, but occaisionally could use a greatdeal. Thanks Walter, for the alternate implementation. Ben had already suggested that, and I am aware of various other approaches too. However, let's not confuse this topic with that of design-methodology. As was noted (in the post you replied to) this is a port. Do you think it reasonable to state, " ... Now listen up you ...scurrilous ... miserable bunch of C coders! If you want to join the D party, then you're gonna' have to shape-up, and drop all those implementation approaches that were valid and comfortable. We don't have any of that nonsense over here! If you want your damned existing code to run fast, well then, by golly you'd better make sure you redesign it too!" ? ... and all because there's no accessible way to inhibit default initialization? The issues at hand are several: 1) the original implementation was, and still is, a perfectly valid approach (given the requirements). It is also executes rather efficiently. The same pattern is quite likely to occur in original D code also. Let's try to avoid whether other approaches are better or worse; that doesn't help. Not even the best education can thoroughly resolve such problems. For the record, your alternative still needs an additional boundary-check plus error-message for equivalent pathological conditions (out of memory). 2) what you are now suggesting is that the program should be redesigned instead. That is simply not realistic for working, optimal code. The code compiled and executed without too much effort, but it exhibits performance issues. That's what drove the foray into this topic. I will probably fix it with the assembly-code thoughtfully provided by AJ (and thanks to Ben also for both his suggestions) but that's hardly an accessible, or portable, resolution in the general sense. 3) the solution offered (by both yourself and by Ben) is not allocated on the stack. The whole point here is to utilize the stack because it's a much faster means of providing a temporary workspace. Going to the heap means a partial redesign, and the program will still run slower in D than it did originally. Forcing stack arrays to initialize will /always/ be dramatically slower than the equivalent C/C++ code, whereas a simple embellishment to the existing initialization mechanism would resolve that completely; and in a jiffy. This is about choice, Walter, and supporting the programmer in what they are doing. 4) the approach to addressing this seems to be one of "guilty until proven innocent". That is, there doesn't seem to be any notion on your part that this might, actually, be a valid observation. With solid number to back it up. You're basically saying, " ... the approach never had, and never will, have any validity whatsoever. Rip out the code and do it just like I would". I do sincerely hope that I'm misunderstanding you, but that's kinda' how it has been coming across. May I ask why some easy manual control over default-initialization is such a terrible thing? Is it really /so/ impossible that you'd rather force people into re-implementing and retesting parts of their existing codebase instead (as you suggest above) ? Or is there perhaps some fundamental ideology at stake, that I just don't see? "Walter" <newshound digitalmars.com> wrote in message news:cfgqv4$2lq9$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfge6g$2g1j$1 digitaldaemon.com...(temporary)Heavy-duty text processing. The coder wasn't being lazy at all, as out-of-bounds conditions are being checked for. The allocatedwillworkspace is typically used in only a small amounts, but sporadicallyTheget used more fully. The approach taken was apparently "we'll check for end-of-buffer, but if anything exceeds these limits, then it's so beyondthedesign-specification that it's an error condition". I understand that tobea rather common approach. It's certainly valid, and rather effective.Ddeal.approach to this is, " ...you /vill/ pay ze initialization cost,regardlessof vot ze design!"Here's an example of an alternate way to do it, for code that would typically use only a small amount, but occaisionally could use a greatIt can also be set up to never overflow (at least until all memory is exhausted) by repeatedly resizing it. This has the advantage of notneedingto set up any error recovery, or create and document an error message for the limits. import std.stdio; char[] formatter(char[] s) { if (s.length < 14) // if our passed buffer isn't big enough s.length = 14; // yes, we can reallocate things that were originally on the stack s[0..14] = "abcdefghijklmn"; // copy result into s[] return s; } void main() { char[4] workspace; // assume 4 is size typically needed in formatter() char[] s; s = formatter(workspace); writefln(s); }
Aug 12 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfgvi5$2npu$1 digitaldaemon.com...1) the original implementation was, and still is, a perfectly validapproach(given the requirements).And it is valid and works in D.It is also executes rather efficiently. The same pattern is quite likely to occur in original D code also. Let's try toavoidwhether other approaches are better or worse; that doesn't help. Not even the best education can thoroughly resolve such problems. For the record, your alternative still needs an additional boundary-checkYes.plus error-message for equivalent pathological conditions (out of memory).This is not necessary, because very few programs can proceed if they run out of memory, so they just issue an error message and abort. In D, this is handled automatically - running out of memory throws an exception with a message in it, if you don't catch the exception, the message is printed and the program aborted. For nearly all programs, this is the right behavior.2) what you are now suggesting is that the program should be redesigned instead. That is simply not realistic for working, optimal code. The code compiled and executed without too much effort, but it exhibits performance issues. That's what drove the foray into this topic. I will probably fixitwith the assembly-code thoughtfully provided by AJ (and thanks to Ben also for both his suggestions) but that's hardly an accessible, or portable, resolution in the general sense.I suggest the alloca() approach, and do some timing tests on the real app with it vs AJ's approach. I'm willing to bet you a beer that you won't be able to measure the difference, especially if the typical use of the array will fill it 2 or 3 Kb worth.3) the solution offered (by both yourself and by Ben) is not allocated on the stack.The alloca() version is.The whole point here is to utilize the stack because it's a much faster means of providing a temporary workspace. Going to the heap means a partial redesign, and the program will still run slower in D than it did originally. Forcing stack arrays to initialize will /always/ bedramaticallyslower than the equivalent C/C++ code, whereas a simple embellishment totheexisting initialization mechanism would resolve that completely; and in a jiffy. This is about choice, Walter, and supporting the programmer in what they are doing. 4) the approach to addressing this seems to be one of "guilty until proven innocent". That is, there doesn't seem to be any notion on your part that this might, actually, be a valid observation. With solid number to back it up. You're basically saying, " ... the approach never had, and never will, have any validity whatsoever. Rip out the code and do it just like Iwould".I do sincerely hope that I'm misunderstanding you, but that's kinda' howithas been coming across. May I ask why some easy manual control over default-initialization is suchaterrible thing? Is it really /so/ impossible that you'd rather forcepeopleinto re-implementing and retesting parts of their existing codebaseinstead(as you suggest above) ? Or is there perhaps some fundamental ideology at stake, that I just don't see?Uninitialized data can be a source of bugs and trouble, as I pointed out in my other posting, even when used correctly. One design goal of D is to improve reliability and portability by eliminating sources of undefined behavior, and uninitialized data is one huge source of undefined, unportable, erratic and unpredictable behavior. I'm trying to squeeze that out of the normal constructs in D - but access to C functions and inline assembler is there for folks who really need it. Give the other methods presented a chance, and time them in context.
Aug 12 2004
Just out of curiosity, how does alloca() work? I guess it just manipulates the Stack and Base Pointers? My apologies for going a bit off topic, I've just never seen alloca() before.
Aug 13 2004
"Michael" <Michael_member pathlink.com> wrote in message news:cfhr6d$9qj$1 digitaldaemon.com...Just out of curiosity, how does alloca() work? I guess it just manipulatestheStack and Base Pointers? My apologies for going a bit off topic, I've just never seen alloca()before. It manipulates the ESP register. There is some complexity in it due to needing to 'touch' each 4k page allocated, and to relocate expression temporaries that are pushed on the stack. The source for it comes on the DMC++ CD.
Aug 13 2004
Ahh, I hadn't thought about expression temps. Very interesting stuff. I think I may order that CD. Thanks very much for the information.Just out of curiosity, how does alloca() work? I guess it just manipulatestheStack and Base Pointers? My apologies for going a bit off topic, I've just never seen alloca()before. It manipulates the ESP register. There is some complexity in it due to needing to 'touch' each 4k page allocated, and to relocate expression temporaries that are pushed on the stack. The source for it comes on the DMC++ CD.
Aug 13 2004
"Walter" <newshound digitalmars.com> wrote in message news:cfhk9n$5jh$1 digitaldaemon.com...outplus error-message for equivalent pathological conditions (out of memory).This is not necessary, because very few programs can proceed if they runof memory, so they just issue an error message and abort. In D, this is handled automatically - running out of memory throws an exception with a message in it, if you don't catch the exception, the message is printedandthe program aborted. For nearly all programs, this is the right behavior.Perhaps you haven't had the opportunity to write resiliant software? For example, halting an entire web-site, because some file is not in the correct format, is just not a solution. Frankly, I'm really surprised you'd encourage something so obviously naiive in this day and age. With a server, you do whatever is necessary to track and release allocations to ensure the server stays alive as long as it can. Even better; you focus on temporarily /stack/ allocations (surprise!), and thread-local reusable buffers, so there's never any issue about running out of memory (as a bonus, it's all thread-safe). Please don't waste your efforts on me in this respect: Been there. Done that. Seen the movie a gazillion times. Optimize it while I sleep.code2) what you are now suggesting is that the program should be redesigned instead. That is simply not realistic for working, optimal code. Theperformancecompiled and executed without too much effort, but it exhibitsalsoissues. That's what drove the foray into this topic. I will probably fixitwith the assembly-code thoughtfully provided by AJ (and thanks to BenYou are well aware that this is not about alloca() versus assembly.for both his suggestions) but that's hardly an accessible, or portable, resolution in the general sense.I suggest the alloca() approach, and do some timing tests on the real app with it vs AJ's approach. I'm willing to bet you a beer that you won't be able to measure the difference, especially if the typical use of the array will fill it 2 or 3 Kb worth.on3) the solution offered (by both yourself and by Ben) is not allocatedWas referring to the example Ben gave that looked rather like your own one. The one he noted in the post you recently replied to.the stack.The alloca() version is.aThe whole point here is to utilize the stack because it's a much faster means of providing a temporary workspace. Going to the heap meansapartial redesign, and the program will still run slower in D than it did originally. Forcing stack arrays to initialize will /always/ bedramaticallyslower than the equivalent C/C++ code, whereas a simple embellishment totheexisting initialization mechanism would resolve that completely; and inwhatjiffy. This is about choice, Walter, and supporting the programmer inproventhey are doing. 4) the approach to addressing this seems to be one of "guilty untilthatinnocent". That is, there doesn't seem to be any notion on your partitthis might, actually, be a valid observation. With solid number to backwill,up. You're basically saying, " ... the approach never had, and neversuchhave any validity whatsoever. Rip out the code and do it just like Iwould".I do sincerely hope that I'm misunderstanding you, but that's kinda' howithas been coming across. May I ask why some easy manual control over default-initialization isadon'tterrible thing? Is it really /so/ impossible that you'd rather forcepeopleinto re-implementing and retesting parts of their existing codebaseinstead(as you suggest above) ? Or is there perhaps some fundamental ideology at stake, that I justinsee?Uninitialized data can be a source of bugs and trouble, as I pointed outmy other posting, even when used correctly. One design goal of D is to improve reliability and portability by eliminating sources of undefined behavior, and uninitialized data is one huge source of undefined, unportable, erratic and unpredictable behavior. I'm trying to squeeze that out of the normal constructs in D - but access to C functions and inline assembler is there for folks who really need it.Most admirable, Walter. And agreed, as I've stated in virtually every single post on this topic. But here's where that position truly falls apart ... what about those multitude of cases where buffers are reused over and over again? What about thread-local workspaces? What about pooled buffers? what about all those X, Y, and Z variations? They do /not/ get automagically re-initialized. Your argument, as you state it, is one-shot only. That is, the whole basis of said argument hold validity only as long as the "uninitialized data" is used once, and once only. Anyone who produces high-performance designs and/or code will chortle with glee at this glaring logic hole (where such buffer are deliberately reused thousands, sometimes millions, of times per second). Such systems do not have the time to spend clearing out each little byte of data ... they simply ensure that it's never a problem. Can we please get past this level of discussion? You know; in the time you've taken over these postings, you could probably have implemented a keyword qualifier to optionally inhibit the default initialization. Again, we're talking about a deliberate action by the programmer to say " I don't want that frickin CPU gobblin initialization!". From the developers position, your noble ideology has already been placed to one side at that point. The D language should be able to help there, rather than just get in the way. Remember? It's a "Practical Language for Practical Programmers". Please consider.
Aug 13 2004
van eeshan wrote:Perhaps you haven't had the opportunity to write resiliant software? For example, halting an entire web-site, because some file is not in the correct format, is just not a solution. Frankly, I'm really surprised you'd encourage something so obviously naiive in this day and age.I think Walter meant that this is typically the proper behavior for an out of memory condition. He also notes that an exception is thrown so "resilient software" has a chance to do something about the problem instead of aborting. C++ behaves the exact same way, however C++ does not automatically catch the error and display a meaningful message if the programmer forgot to write a try/catch block. Sean
Aug 13 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfhv05$bl9$1 digitaldaemon.com...thatUninitialized data can be a source of bugs and trouble, as I pointed outinmy other posting, even when used correctly. One design goal of D is to improve reliability and portability by eliminating sources of undefined behavior, and uninitialized data is one huge source of undefined, unportable, erratic and unpredictable behavior. I'm trying to squeezeaboutout of the normal constructs in D - but access to C functions and inline assembler is there for folks who really need it.But here's where that position truly falls apart ... what about those multitude of cases where buffers are reused over and over again? Whatthread-local workspaces? What about pooled buffers? what about all thoseX,Y, and Z variations? They do /not/ get automagically re-initialized.Right, they don't. However, data buffers are unlikely to contain garbage that looks like pointers into the gc heap, whereas garbage on the stack *is* very likely to. Next, detritus in recycled data buffers is not going to consist of objects that are no longer valid, the source of the other problem I mentioned. And lastly, D can't close every hole, but that doesn't mean it shouldn't close what it can.
Aug 13 2004
May I point you towards this post, which has an attempted summary of the discussion thus far? news:cfhqhu$9jj$1 digitaldaemon.com Would very much appreciate your comments on those points. I think, perhaps, this unfolding picture indicates it's not a question of ideology per se; but one of stack-usage being somewhat incompatible with the current GC design. If there were some means to quickly eliminate what look like old GC references from the (reused) portion of the stack that's still live, then what Ben has suggested would presumably be more palatable? Currently, the method employed to achieve this is to wipe everything to a predefined default value; which can be agonizingly slow compared to the C/C++ approach. Is that a fair assessment? "Walter" <newshound digitalmars.com> wrote in message news:cfisfb$s39$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfhv05$bl9$1 digitaldaemon.com...outUninitialized data can be a source of bugs and trouble, as I pointedundefinedinmy other posting, even when used correctly. One design goal of D is to improve reliability and portability by eliminating sources ofinlinethatbehavior, and uninitialized data is one huge source of undefined, unportable, erratic and unpredictable behavior. I'm trying to squeezeout of the normal constructs in D - but access to C functions andthoseaboutassembler is there for folks who really need it.But here's where that position truly falls apart ... what about those multitude of cases where buffers are reused over and over again? Whatthread-local workspaces? What about pooled buffers? what about allX,*is*Y, and Z variations? They do /not/ get automagically re-initialized.Right, they don't. However, data buffers are unlikely to contain garbage that looks like pointers into the gc heap, whereas garbage on the stackvery likely to. Next, detritus in recycled data buffers is not going to consist of objects that are no longer valid, the source of the otherproblemI mentioned. And lastly, D can't close every hole, but that doesn't meanitshouldn't close what it can.
Aug 13 2004
Addendum. I misread your code somewhat, Water, but what you suggest has a fatal flaw in it: you are still initializing the original stack-based array, and you're making the assumption that the typical size is very small. Further, you are then extending the array via the heap which will, I imagine, initialize the extended part of the array. That just doesn't work because: a) there's a significant amount of effort to convert all the "workers" to do what you suggest. Plus, the code is less efficient, there's more of it, and there's possible heap-fragmentation involved. b) the initial size needs to be around the 2 to 3KB mark. Given that, your example is hardly faster than the original code. In fact, the initialization cost is ~57000% slower than the equivalent C/C++ array allocation. That's better than ~88000%, but who's to quibble at these levels? I wish it were so simple as your suggestion conveyed. Conversely, if there were a syntactic means of optionally inhibiting the array initialization, there would be no related issue at all. We could all be radiantly happy, and have a beer or something. "Walter" <newshound digitalmars.com> wrote in message news:cfgqv4$2lq9$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfge6g$2g1j$1 digitaldaemon.com...(temporary)Heavy-duty text processing. The coder wasn't being lazy at all, as out-of-bounds conditions are being checked for. The allocatedwillworkspace is typically used in only a small amounts, but sporadicallyTheget used more fully. The approach taken was apparently "we'll check for end-of-buffer, but if anything exceeds these limits, then it's so beyondthedesign-specification that it's an error condition". I understand that tobea rather common approach. It's certainly valid, and rather effective.Ddeal.approach to this is, " ...you /vill/ pay ze initialization cost,regardlessof vot ze design!"Here's an example of an alternate way to do it, for code that would typically use only a small amount, but occaisionally could use a greatIt can also be set up to never overflow (at least until all memory is exhausted) by repeatedly resizing it. This has the advantage of notneedingto set up any error recovery, or create and document an error message for the limits. import std.stdio; char[] formatter(char[] s) { if (s.length < 14) // if our passed buffer isn't big enough s.length = 14; // yes, we can reallocate things that were originally on the stack s[0..14] = "abcdefghijklmn"; // copy result into s[] return s; } void main() { char[4] workspace; // assume 4 is size typically needed in formatter() char[] s; s = formatter(workspace); writefln(s); }
Aug 12 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfh14k$2ouk$1 digitaldaemon.com...I misread your code somewhat, Water, but what you suggest has a fatal flaw in it: you are still initializing the original stack-based array, and you're making the assumption that the typical size is very small.I'm confused - you wrote: "The allocated (temporary) workspace is typically used in only a small amounts, but sporadically will get used more fully." My suggestion was tuned to that.Further, you are then extending the array via the heap which will, I imagine, initialize the extended part of the array.That's correct.That just doesn't work because: a) there's a significant amount of effort to convert all the "workers" todowhat you suggest.Correct, but resizing dynamic arrays in D is pretty easy. It's not as klunky and error-prone as in C.Plus, the code is less efficient,Since the resize will only happen in atypical cases, its efficiency should not impact overall performance.there's more of it, and there's possible heap-fragmentation involved.Heap fragmentation is much less of an issue than it used to be, with large memory pools and virtual memory. I know how to build a gc that will compact memory to deal with fragmentation when it does arise, so although this is not in the current D it is not a technical problem to do it.b) the initial size needs to be around the 2 to 3KB mark. Given that, your example is hardly faster than the original code. In fact, theinitializationcost is ~57000% slower than the equivalent C/C++ array allocation. That's better than ~88000%, but who's to quibble at these levels?If your app typically is going to fill 2 to 3 Kb, I suggest using malloc() (or alloca()) for it. The handful of extra instructions to allocate via malloc as opposed to the stack will be lost in the thousands of instructions needed to fill 2 to 3 Kb.I wish it were so simple as your suggestion conveyed. Conversely, if there were a syntactic means of optionally inhibiting the array initialization, there would be no related issue at all. We could all be radiantly happy,andhave a beer or something.Allocating large, uninitialized arrays on the stack has its own problems: 1) The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory. Since the uninitialized data consists of old D stack frames, it is highly likely that some of that garbage will look like references into the gc heap, and the gc memory will not get freed. This problem really does happen, and can be pretty frustrating to track down. Some of my C code that uses GC gets a memset() on the stack frame for that reason. 2) It's possible for a function to pass out of it a reference to data on that function's stack frame. By then allocating a new stack frame over the old data, and not initializing, the reference to the old data may still appear to be valid. The program will then behave erratically. Initializing all data on the stack frame will greatly increase the probability of forcing that bug into the open in a repeatable manner. 3) Large arrays on the stack bring up the specter of stack exhaustion with even modest amounts of recursion. That said, there are things I can do to generate faster code for the initialization.
Aug 12 2004
In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...I know how to build a gc that will compact memory to deal with fragmentation when it does arise, so although this is not in the current D it is not a technical problem to do it.That's probably good news, but: Will it be possible to mark specific arrays on the heap as "immovable"? (Please say yes. I'm sure it is not a technical problem). Arcane Jill
Aug 12 2004
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cfhoe3$793$1 digitaldaemon.com...In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...(PleaseI know how to build a gc that will compact memory to deal with fragmentation when it does arise, so although this is not in the current D it is not a technical problem to do it.That's probably good news, but: Will it be possible to mark specific arrays on the heap as "immovable"?say yes. I'm sure it is not a technical problem).Yes. It's called "pinning". And I can guess why you need this feature <g>.
Aug 13 2004
In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...Allocating large, uninitialized arrays on the stack has its own problems: 1) The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory. Since the uninitialized data consists of old D stack frames, it is highly likely that some of that garbage will look like references into the gc heap, and the gc memory will not get freed. This problem really does happen, and can be pretty frustrating to track down. Some of my C code that uses GC gets a memset() on the stack frame for that reason.That is a very good point. In fact, it's an excellent point, and one which I had not previously considered. Although, I suppose you could get round it with something like this:2) It's possible for a function to pass out of it a reference to data on that function's stack frame. By then allocating a new stack frame over the old data, and not initializing, the reference to the old data may still appear to be valid. The program will then behave erratically. Initializing all data on the stack frame will greatly increase the probability of forcing that bug into the open in a repeatable manner.That's another very good point.3) Large arrays on the stack bring up the specter of stack exhaustion with even modest amounts of recursion.I'm not sure about this one. I thought that in the 80386+ architecture, the stack(s) can grow indefinitely. Or is it just that Windows doesn't take advantage of this possibility.That said, there are things I can do to generate faster code for the initialization.I suspect that if you succeed in making this lightening-fast, everyone will stop complaining. You only have to look at the figures which have been posted recently to see what everybody's complaining about.
Aug 13 2004
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cfhppt$97t$1 digitaldaemon.com...In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...with3) Large arrays on the stack bring up the specter of stack exhaustiontheeven modest amounts of recursion.I'm not sure about this one. I thought that in the 80386+ architecture,stack(s) can grow indefinitely. Or is it just that Windows doesn't take advantage of this possibility.Stacks are allocated a range of memory. Once that's filled up, since the stack cannot be relocated, the program comes to a (crashing) halt. The problem gets worse if you create a lot of threads, because each thread will necessarilly have a fixed maximum stack size. The stack size only appears to be infinite to those of us used to 16 bit machines <g>.
Aug 13 2004
"Walter" <newshound digitalmars.com> wrote in message news:cfhhbe$3jf$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfh14k$2ouk$1 digitaldaemon.com...flawI misread your code somewhat, Water, but what you suggest has a fataltypicallyin it: you are still initializing the original stack-based array, and you're making the assumption that the typical size is very small.I'm confused - you wrote: "The allocated (temporary) workspace isused in only a small amounts, but sporadically will get used more fully."Mysuggestion was tuned to that.This topic had become attuned somewhat to 16KB and 32KB buffers, so the typical population of 2 to 3KB is relatively small; I can understand how that might have been misinterpreted to become the < 5 bytes you had tuned for.yourb) the initial size needs to be around the 2 to 3KB mark. Given that,That'sexample is hardly faster than the original code. In fact, theinitializationcost is ~57000% slower than the equivalent C/C++ array allocation.instructionsbetter than ~88000%, but who's to quibble at these levels?If your app typically is going to fill 2 to 3 Kb, I suggest using malloc() (or alloca()) for it. The handful of extra instructions to allocate via malloc as opposed to the stack will be lost in the thousands ofneeded to fill 2 to 3 Kb.Good that you should note the thousands of instructions, because that's effectively what the default initialization does (fills the entire array with default data). This particular app is actually very efficient in filling the 'typical' space used; you might think of it as effectively a quick lookup followed by a memcpy(). If you consider that as the core element of the algorithm; then the default initialization will slow that core down by almost a factor of two (clear the entire buffer, then fill the whole thing again). That slows down the overall application quite noticeably. One is rather likely to notice that level of redundant overhead in any CPU-bound application that doesn't completely toss the notion of efficiency out of the proverbial window.Allocating large, uninitialized arrays on the stack has its own problems: 1) The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory. Sincetheuninitialized data consists of old D stack frames, it is highly likelythatsome of that garbage will look like references into the gc heap, and thegcmemory will not get freed. This problem really does happen, and can be pretty frustrating to track down. Some of my C code that uses GC gets a memset() on the stack frame for that reason.So what you are saying is that the current GC is incompatible with stack allocation in general? What you describe is not restricted to uninitialized data, Walter ~ what you appear to be saying is that /any/ data on the stack could potentially look like a reference into the gc heap (as you describe it). If this is the case, the use of alloca() will have exactly the same issues, yet you recommend that as a solution a few paragraphs above. Where does this really stand? Are you really saying that the D GC is incompatible with long-term stack content? That is, if I place some arbitrary values in a stack array during main(), will those values cause the GC confusion throughout the remainder of program execution if they happen to look like GC references?2) It's possible for a function to pass out of it a reference to data on that function's stack frame. By then allocating a new stack frame over the old data, and not initializing, the reference to the old data may still appear to be valid. The program will then behave erratically. Initializing all data on the stack frame will greatly increase the probability offorcingthat bug into the open in a repeatable manner.Oh ... I thought we were already well past the general case here, and into the realm of where the programmer had alreay taken full responsibility for their actions. Specifically, not initializing the stack content. Let's face it: it's quite feasible for any function to barf all over the heap, stack, and everywhere else until the hardware catches it. And what about the cases whereby a stack buffer is consistently reused within the same, or subordinate, function? It's not suddenly cleared all over again before each subsequent usage (e.g. between foreach loops), is it? This appears to be a completely moot (and potentially misleading) point, but please educate me if I'm wrong. Seriously ...3) Large arrays on the stack bring up the specter of stack exhaustion with even modest amounts of recursion.Anyone who has read this thread is well aware that we're not talking recursion. Regardless; your sudden concern didn't stop this being a common and successful approach in C/C++. I mean, did you outlaw large(ish) stack-arrays in the DM C++ compiler? That is not a rhetorical question ... Besides, as noted previously, it's best we not confuse this topic with design-methodologies; as even the best education cannot reconcile such issues. I hope we can agree on that.That said, there are things I can do to generate faster code for the initialization.Yes, I'm aware of that. You could, for example, initialize in chunks of four rather than one. However, that will still not even get close to a C/C++ style array-allocation on the stack. But the thought is both appreciated and noted. To summarize thus far: a) we have a D language facility to initialize all variables with default values. This is arguably a wonderful facility, and can help catch certain bugs in the case of general programming. b) unfortunately there's no easy "accessible" means to inhibit that initialization for stack-based arrays; leaving the programmer to eat a potentially (and repeated) vast overhead of CPU consumption, or resort to some non-obvious guru-tricks (none of which will easily work on anything other than a single dimensional array) ... Not to mention that this CPU cost is completely hidden from source-level view in the first place. c) you indicate, Walter, that the GC has problems with uninitialized data on the stack. Well, that will happen regardless of whether or not the language supports non-initialization. That is; a valid stack array can contain arbitrary data patterns ~ said content may look exactly like gc references. Please correct me if I'm wrong. d) The language explicitly exposes a built-in assembler, to do whatever the heck anyone might want to (barring the usual platform issues). Yet, there seems to be a massive pushback against providing an easy to use, symmetrical, high performance, and thoroughly useful counter to the default initialization ~ the alternative to which is to drop into assembly (or similar guru-grok), to simulate what C/C++ provides as its default, highly efficient, behavior. There's not even the vaguest counter-argument relating to implementation complexity. Please forgive the unsavory mess while this latter, wildly contradictory concept, blows my tiny non-academic mind all over the inside of your screen ... Let me finish with the sincere hope that my (obviously) thoroughly weak grasp on reality will soon be at an end. Can you please help me wrangle that into shape Walter?
Aug 13 2004
In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...Sincetheuninitialized data consists of old D stack frames, it is highly likelythatsome of that garbage will look like references into the gc heap, and thegcmemory will not get freed.Are you really saying that the D GC is incompatible with long-term stack content? That is, if I place some arbitrary values in a stack array during main(), will those values cause the GC confusion throughout the remainder of program execution if they happen to look like GC references?I think you may have misunderstood Walter. Arbitrary random data can indeed give the GC false positives, but this usually doesn't matter, since it is rare. What Walter said is that uninitialized stack data is /not random/ - in that it is almost certain to contain fragments of some previous stack frame, and therefore also /real, valid/ (but out-of-date) references to GC-controlled memory. So the number of false positives will shoot up from "almost none" to "very many". (And my previous suggestion of temporarily disabling the GC won't help either, so I withdraw the suggestion). Jill
Aug 13 2004
Fair point Jill. But at that stage, one might as well state that the stack is entirely off limits for general array usage. You cannot predict the frequency of what might or might not look like GC references for any given application. It would require dynamic sampling to find a trend and try to offset that against GC behavior. The real problem, in this particular case, is the GC ~ it's simply incompatible with arbitrary (yet valid) stack-based array content. How much of a problem is that? It may actually be a red-herring. On the other hand, I may be waning considerably at this very moment (eight hours behind you). "Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cfhvbq$bra$1 digitaldaemon.com...In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...theSincetheuninitialized data consists of old D stack frames, it is highly likelythatsome of that garbage will look like references into the gc heap, andin agcmemory will not get freed.Are you really saying that the D GC is incompatible with long-term stack content? That is, if I place some arbitrary valuesGCstack array during main(), will those values cause the GC confusion throughout the remainder of program execution if they happen to look likeindeed givereferences?I think you may have misunderstood Walter. Arbitrary random data canthe GC false positives, but this usually doesn't matter, since it is rare.WhatWalter said is that uninitialized stack data is /not random/ - in that itisalmost certain to contain fragments of some previous stack frame, andthereforealso /real, valid/ (but out-of-date) references to GC-controlled memory.So thenumber of false positives will shoot up from "almost none" to "very many". (And my previous suggestion of temporarily disabling the GC won't helpeither,so I withdraw the suggestion). Jill
Aug 13 2004
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cfhvbq$bra$1 digitaldaemon.com...In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...thatSince the uninitialized data consists of old D stack frames, it is highly likelythesome of that garbage will look like references into the gc heap, andin agc memory will not get freed.Are you really saying that the D GC is incompatible with long-term stack content? That is, if I place some arbitrary valuesGCstack array during main(), will those values cause the GC confusion throughout the remainder of program execution if they happen to look likeindeed givereferences?I think you may have misunderstood Walter. Arbitrary random data canthe GC false positives, but this usually doesn't matter, since it is rare.WhatWalter said is that uninitialized stack data is /not random/ - in that itisalmost certain to contain fragments of some previous stack frame, andthereforealso /real, valid/ (but out-of-date) references to GC-controlled memory.So thenumber of false positives will shoot up from "almost none" to "very many".You're right, I'd forgotten to make that clear. Uninitialized data is "garbage", but that's not at all the same thing as "random". In my extensive experiments with this, text and integer "garbage" rarely looks like a valid pointer into the gc heap, but old stack frames are very likely to. Also, it is not a matter of "incompatibility" with gc, it's more a matter of the gc being more effective.
Aug 13 2004
So, the resolution on this has apparently been implemented. If you write: { char[] x; const int size = 2048; x = cast(char[]) alloca(size)[0..size]; } ... then you will get the equivalent of a C/C++ stack array, without all the prior initialization overhead involved. While this is a VeryGoodThing, it could use some syntactic sugar ... There were a number of suggestions made throughout this topic; here's a brief recap of three: { uninitialized char[2048] x; char[2048] x = void; char[] x = stackalloc char[2048]; } The first two may find use outside of stack-allocation, whilst the third is clearly stack-based. The second suggestion avoids creating an additional language 'qualifier'. How about it? "Walter" <newshound digitalmars.com> wrote in message news:cfhhbe$3jf$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfh14k$2ouk$1 digitaldaemon.com...flawI misread your code somewhat, Water, but what you suggest has a fataltypicallyin it: you are still initializing the original stack-based array, and you're making the assumption that the typical size is very small.I'm confused - you wrote: "The allocated (temporary) workspace isused in only a small amounts, but sporadically will get used more fully."Mysuggestion was tuned to that.toFurther, you are then extending the array via the heap which will, I imagine, initialize the extended part of the array.That's correct.That just doesn't work because: a) there's a significant amount of effort to convert all the "workers"doklunkywhat you suggest.Correct, but resizing dynamic arrays in D is pretty easy. It's not asand error-prone as in C.compactPlus, the code is less efficient,Since the resize will only happen in atypical cases, its efficiency should not impact overall performance.there's more of it, and there's possible heap-fragmentation involved.Heap fragmentation is much less of an issue than it used to be, with large memory pools and virtual memory. I know how to build a gc that willmemory to deal with fragmentation when it does arise, so although this is not in the current D it is not a technical problem to do it.yourb) the initial size needs to be around the 2 to 3KB mark. Given that,That'sexample is hardly faster than the original code. In fact, theinitializationcost is ~57000% slower than the equivalent C/C++ array allocation.instructionsbetter than ~88000%, but who's to quibble at these levels?If your app typically is going to fill 2 to 3 Kb, I suggest using malloc() (or alloca()) for it. The handful of extra instructions to allocate via malloc as opposed to the stack will be lost in the thousands ofneeded to fill 2 to 3 Kb.thereI wish it were so simple as your suggestion conveyed. Conversely, ifinitialization,were a syntactic means of optionally inhibiting the arraythethere would be no related issue at all. We could all be radiantly happy,andhave a beer or something.Allocating large, uninitialized arrays on the stack has its own problems: 1) The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory. Sinceuninitialized data consists of old D stack frames, it is highly likelythatsome of that garbage will look like references into the gc heap, and thegcmemory will not get freed. This problem really does happen, and can be pretty frustrating to track down. Some of my C code that uses GC gets a memset() on the stack frame for that reason. 2) It's possible for a function to pass out of it a reference to data on that function's stack frame. By then allocating a new stack frame over the old data, and not initializing, the reference to the old data may still appear to be valid. The program will then behave erratically. Initializing all data on the stack frame will greatly increase the probability offorcingthat bug into the open in a repeatable manner. 3) Large arrays on the stack bring up the specter of stack exhaustion with even modest amounts of recursion. That said, there are things I can do to generate faster code for the initialization.
Aug 30 2004
Ahem, you're speaking of allocating a *buffer*. A large thing, perhaps used partially. Normally, i would feel somewhat uneasy to do that on the stack - the heap does it quite well, and, well, if there is still a performance problem, a buffer can be cached/shared/whatever, fairly easily. And if i remember correctly, there was some circumstance which could make the OS break if it had to extend the stack by more than 1 page at a time. The bulk of performance loss from heap allocation comes from many little objects, much more than from a few big ones... So what's the fuss about? Besides, you'd be using it by a pointer anyway. -eye
Aug 12 2004
Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea.another option is to look for "void" as an initialization value: { char[1024*32] workspace = void; formatter (workspace); } that way we don't need another keyword. If the initializer is void the variable isn't initialized.
Aug 13 2004
In article <cfiam0$hhc$1 digitaldaemon.com>, Ben Hinkle says...another option is to look for "void" as an initialization value: { char[1024*32] workspace = void; formatter (workspace); } that way we don't need another keyword. If the initializer is void the variable isn't initialized./another/ option is for the compiler, on seeing the code: would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mighty large number of bytes to be putting on the stack. I think I'll allocate it from the heap instead. The programmer won't care. Hell, the programmer won't even /notice/ unless they start looking at the generated assembler!" Arcane Jill
Aug 13 2004
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cfidel$j0u$1 digitaldaemon.com...In article <cfiam0$hhc$1 digitaldaemon.com>, Ben Hinkle says...largeanother option is to look for "void" as an initialization value: { char[1024*32] workspace = void; formatter (workspace); } that way we don't need another keyword. If the initializer is void the variable isn't initialized./another/ option is for the compiler, on seeing the code: would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mightynumber of bytes to be putting on the stack. I think I'll allocate it fromtheheap instead.That would be very unusual for a compiler to do - and it raises various questions like what "large" is. Plus unless it uses malloc to allocate it still won't be as fast as C due to the initialization overhead.The programmer won't care. Hell, the programmer won't even /notice/ unless they start looking at the generated assembler!"uhh - that's obviously not completely true. Some programmers will care and will definitely notice. The question is how many (and how bad the problem is). Walter thinks the number is small (ie-not large enough to change things). The OP thinks the number is significant. I'm in the camp of nice-to-have but not a high priority problem.Arcane Jill
Aug 13 2004
Arcane Jill wrote:/another/ option is for the compiler, on seeing the code: would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mighty large number of bytes to be putting on the stack. I think I'll allocate it from the heap instead. The programmer won't care. Hell, the programmer won't even /notice/ unless they start looking at the generated assembler!"I like this approach. The programmer didn't say where he wants the memory to come from, so it's only logical to assume that he isn't thinking about it and is therefore relying on the compiler to do the right thing for him. If the programmer wants to force the storage to one place or another, he can say so explicitly. (using calloc, malloc, gc.malloc, etc) -- andy
Aug 13 2004
With respect Andy, the programmer was quite explicit about where the memory should come from ... it's a locally-declared array, and the language documentation points out such declarations are place on the stack (just like C/C++) because there are some very good reasons to do so. If you abstract that notion away, then we're not discussing D anymore. Note that the declaration does not have static, const, or any other qualifier either. "Andy Friesen" <andy ikagames.com> wrote in message news:cfim8j$nq2$1 digitaldaemon.com...Arcane Jill wrote:mighty large/another/ option is for the compiler, on seeing the code: would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's afrom thenumber of bytes to be putting on the stack. I think I'll allocate itheap instead. The programmer won't care. Hell, the programmer won't even /notice/ unless they start looking at the generated assembler!"I like this approach. The programmer didn't say where he wants the memory to come from, so it's only logical to assume that he isn't thinking about it and is therefore relying on the compiler to do the right thing for him. If the programmer wants to force the storage to one place or another, he can say so explicitly. (using calloc, malloc, gc.malloc, etc) -- andy
Aug 13 2004
van eeshan wrote:With respect Andy, the programmer was quite explicit about where the memory should come from ... it's a locally-declared array, and the language documentation points out such declarations are place on the stack (just like C/C++) because there are some very good reasons to do so. If you abstract that notion away, then we're not discussing D anymore. Note that the declaration does not have static, const, or any other qualifier either.One of the greatest things about D is that we have a chance to change it. Therefore, we are discussing D even if we abstract that notion away, because D itself may follow suit if we can convince Walter. :) The declaration, taken purely at face value: char[1024 * 32] workspace; declares 32768 consecutive chars under the name 'workspace'. Since the 'new' keyword is not used, this storage is only guaranteed to exist within the function's scope. It doesn't necessarily have to say any more than that: the D language (not the D compiler implementation) is much more than spec for a portable assembler, (as C was) and it makes perfect sense to describe it in conceptual terms where possible because concepts are by far the most common use case. (code is rarely optimized before it is written :) ) Ideally, the compiler knows what the output has to do, but is completely free to achieve that any way it wants. (this is the main reason O'Caml is so fast, as I understand it) However, there's no denying that D is a systems language, so it goes without saying that there absolutely must be a convenient way to say that you *do* care precisely where that storage comes from and how you want it to be handled. its name, allocates stack storage. By being a language primitive, it presents plenty of information to the code optimizer. So, hypothetically, we would be able to write // want chars. Don't care where they came from. char[1024*32] workspace; // get 'new' storage from the GC. char[] workspace = new char[1024*32]; // allocate stack storage char[] workspace = stackalloc char[1024*32]; -- andy
Aug 13 2004
All good points. As long as there is some easy means of avoiding unwanted initialization overhead (particularly vis-a-vis the stack, since it is by far the fastest place to allocate locally-scoped storage), then I'd certainly be all for it. Regarding the stack in particular, the declaration:// allocate stack storage char[] workspace = stackalloc char[1024*32];... would likely map to the alloca() function (which does not waste valuable time initializing all of the requested storage space). However, as Walter has described it, the GC is apparently incompatible with usage of alloca() also ~ because potentially old GC references will suddenly appear back in the "live" stack area again. If this is truly the case, then you should expect the same pushback on this notion as I've had with the notion of an "uninitialized" qualifier, or Ben's alternative "=void". After all, your "stackalloc" keyword is a variation on the theme. On the other hand, if the GC turns out to be compatible with alloca (as it is today), then we can all sing and dance with joy when Walter wraps that function with some "more accessible" syntax <g>. I prefer Ben's approach though, because it's potentially applicable in more places than just the stack. Bon Chance. "Andy Friesen" <andy ikagames.com> wrote in message news:cfivtl$vcr$1 digitaldaemon.com...van eeshan wrote:memoryWith respect Andy, the programmer was quite explicit about where thelikeshould come from ... it's a locally-declared array, and the language documentation points out such declarations are place on the stack (justabstractC/C++) because there are some very good reasons to do so. If youthat notion away, then we're not discussing D anymore. Note that the declaration does not have static, const, or any other qualifier either.One of the greatest things about D is that we have a chance to change it. Therefore, we are discussing D even if we abstract that notion away, because D itself may follow suit if we can convince Walter. :) The declaration, taken purely at face value: char[1024 * 32] workspace; declares 32768 consecutive chars under the name 'workspace'. Since the 'new' keyword is not used, this storage is only guaranteed to exist within the function's scope. It doesn't necessarily have to say any more than that: the D language (not the D compiler implementation) is much more than spec for a portable assembler, (as C was) and it makes perfect sense to describe it in conceptual terms where possible because concepts are by far the most common use case. (code is rarely optimized before it is written :) ) Ideally, the compiler knows what the output has to do, but is completely free to achieve that any way it wants. (this is the main reason O'Caml is so fast, as I understand it) However, there's no denying that D is a systems language, so it goes without saying that there absolutely must be a convenient way to say that you *do* care precisely where that storage comes from and how you want it to be handled. its name, allocates stack storage. By being a language primitive, it presents plenty of information to the code optimizer. So, hypothetically, we would be able to write // want chars. Don't care where they came from. char[1024*32] workspace; // get 'new' storage from the GC. char[] workspace = new char[1024*32]; // allocate stack storage char[] workspace = stackalloc char[1024*32]; -- andy
Aug 13 2004
van eeshan wrote:All good points. As long as there is some easy means of avoiding unwanted initialization overhead (particularly vis-a-vis the stack, since it is by far the fastest place to allocate locally-scoped storage), then I'd certainly be all for it. Regarding the stack in particular, the declaration:I actually didn't mean to address implicit initialization directly at all so much as a need for some way to let the compiler decide what's faster. I think the same reasoning applies, though: as a language which allows the programmer to hit the metal when he needs to, D needs to offer some mechanism that communicates that the programmer knows better than the compiler. Using the 'void' keyword for this purpose sounds a lot like the Perl practice of assigning arbitrary meaning to symbols for the sake of terseness. That being said, I can't think of anything better off the top of my head, so I'll shut up now. :) -- andy// allocate stack storage char[] workspace = stackalloc char[1024*32];... would likely map to the alloca() function (which does not waste valuable time initializing all of the requested storage space). However, as Walter has described it, the GC is apparently incompatible with usage of alloca() also ~ because potentially old GC references will suddenly appear back in the "live" stack area again. If this is truly the case, then you should expect the same pushback on this notion as I've had with the notion of an "uninitialized" qualifier, or Ben's alternative "=void". After all, your "stackalloc" keyword is a variation on the theme.
Aug 13 2004
Using the 'void' keyword for this purpose sounds a lot like the Perl practice of assigning arbitrary meaning to symbols for the sake of terseness. That being said, I can't think of anything better off the top of my head, so I'll shut up now. :)Perl! ouch! ;-) Void currently means "no type" so it seems very natural to me to extend that to mean "no initial value" when used as an initializer. On a related note, I think the only way to get uninitialized memory from the heap should be through malloc.
Aug 13 2004
"Andy Friesen" wrote ..I actually didn't mean to address implicit initialization directly at all so much as a need for some way to let the compiler decide what'sfaster.I think the same reasoning applies, though: as a language which allows the programmer to hit the metal when he needs to, D needs to offer some mechanism that communicates that the programmer knows better than the compiler.In light of what we now know (but still don't quite know :-) I think it's worth recapturing the initial request. I hope you folks will correct and/or embellish the following, as my knowledge of the internal GC operation is meager: a) there are two things that a programmer uses the stack for: 1) nonchalant locally-scoped variables, and 2) a deliberate and explicit means of allocating temporary storage very quickly ( ~50x faster than the heap). b) the default initialization scheme works great for most variables, and even arrays, where they belong to the nonchalant-usage group. storage requested is of anything other than a truly trivial size (~5 bytes). These issues apparently (still to be clarified) are borne due to the nature and behavior of GC stack-scans. That is, the GC apparently performs more efficiently where the "live" stack does not contain "old" things that look attempt to eliminate such references from the stack content (wipe everything to a default value). This begs the question, "... how often does the GC run?" And also, "... how much impact do "old" GC references have on its performance?" Let's make an assertion here first: If a programmer is explicitly and assume said storage will be given up very soon. After all, if the execution time is long-lived then one might as well allocate on the heap instead; right? The core issue brought up by this thread/topic is about those cases where execution-time is very short-lived, such that the overhead of initialization is enough to be noticeably detrimental. Let's give this short execution-window a name: X ... Okay; so given this assertion, how many times will the GC become active during X? One might expect something close to zero ... especially so because the currently executing thread will not be making any heap requests. If it did, then the benefits of the original stack-allocation would likely be drowned out anyway. The problem remains that /another/ thread may cause the GC to scan the (now multiple) stacks during X. So, how often does that happen? Within this short window? One might expect "very few". Another approach would presumably be to disable GC stack-walks during X; that is, inhibit the GC from scanning a specific stack (the one belonging to the X-thread) for the duration of X. Either way, the upshot is this: the programmer wants to take full control over what's going on; and D needs a decent mechanism to enable such control. Placing blinkers over the eyes of the programmer is not a good approach to this.
Aug 13 2004
Pretty ignorant of some of this thread, I would like to make some points anyway. Firstly, I would consider the ability to request storage on the stack without it being initialised a rather important feature for a system programming language. Yes, I absolutely want this piece of storage on the stack, and no, my dear compiler, I do not want you to waste my time by filling it with zeroes. ;) There have been some fine suggestions that address this point, yet they appear to be less accessible (like alloca, or even asm which is not even portable) than, or bring the semantics of the language too far away from, the way C or C++ do it. I recognise that D need not be another iteration of C++, yet I consider the ease of adoption of D for folks used to C'ish languages to be an important strength. If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''), it would probably appear to be unintuitive. But that is just my unbased personal opinion. Therefore I consider some simple qualifier with obvious semantics like `` uninitialised '' to be worth the possible danger. This does not mean that everybody should default to using uninitialised memory unless they absolutely need initialisation, for methological reasons that were well stated by Walter. But if you want it, here it is, no need for alloca or asm hacks. This might confuse the garbage collector, yet I do not see how that is much of a problem. Some of the uses of uninitialised arrays demonstrated in this thread, I think, quickly overwrote the array with memcpy'd data, and in others the array might not even live long enough to `` block '' memory from being free'd for any timespan that would matter. Of course the problem exists, but `` false positives '' are not exactly fatal. As another approach to this problem, I think it might be a worthwhile hack to the semantics of the language itself to have all pointers' memory be zeroed when it goes out of scope. This basically means that pointers do the opposite of what arrays do, being `` initialised '' once we do not need them anymore instead of being initialised when they are created. So when the memory occupied by them goes back in scope due to any means of allocating uninitialised memory on the stack, what used to be used for pointers does now not confuse the garbage collector. Considering we already have a lot of initialisation going on, I do not think that it would be a terrible overhead to have deinitialisation, but I might be wrong, of course. Please excuse my amateurish intervention. I certainly have no qualifications at all to point at anything you folks have said and say `` NO YOU ARE WRONG! '', yet I hope that I might have helped.
Aug 15 2004
On Mon, 16 Aug 2004 04:43:43 +0000 (UTC), ben 0x539.de wrote:Pretty ignorant of some of this thread, I would like to make some points anyway. Firstly, I would consider the ability to request storage on the stack without it being initialised a rather important feature for a system programming language. Yes, I absolutely want this piece of storage on the stack, and no, my dear compiler, I do not want you to waste my time by filling it with zeroes. ;) There have been some fine suggestions that address this point, yet they appear to be less accessible (like alloca, or even asm which is not even portable) than, or bring the semantics of the language too far away from, the way C or C++ do it. I recognise that D need not be another iteration of C++, yet I consider the ease of adoption of D for folks used to C'ish languages to be an important strength. If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''), it would probably appear to be unintuitive. But that is just my unbased personal opinion. Therefore I consider some simple qualifier with obvious semantics like `` uninitialised '' to be worth the possible danger. This does not mean that everybody should default to using uninitialised memory unless they absolutely need initialisation, for methological reasons that were well stated by Walter. But if you want it, here it is, no need for alloca or asm hacks. This might confuse the garbage collector, yet I do not see how that is much of a problem. Some of the uses of uninitialised arrays demonstrated in this thread, I think, quickly overwrote the array with memcpy'd data, and in others the array might not even live long enough to `` block '' memory from being free'd for any timespan that would matter. Of course the problem exists, but `` false positives '' are not exactly fatal. As another approach to this problem, I think it might be a worthwhile hack to the semantics of the language itself to have all pointers' memory be zeroed when it goes out of scope. This basically means that pointers do the opposite of what arrays do, being `` initialised '' once we do not need them anymore instead of being initialised when they are created. So when the memory occupied by them goes back in scope due to any means of allocating uninitialised memory on the stack, what used to be used for pointers does now not confuse the garbage collector. Considering we already have a lot of initialisation going on, I do not think that it would be a terrible overhead to have deinitialisation, but I might be wrong, of course. Please excuse my amateurish intervention. I certainly have no qualifications at all to point at anything you folks have said and say `` NO YOU ARE WRONG! '', yet I hope that I might have helped.This idea is stilling making sense to me, even when taking Walter's objections into consideration. Okay, so an uninitialized stack array might be a likely source of bugs. But if its not the default behaviour, and the coder explicitly takes responsibility of it by coding the keyword 'uninitialized' or similar, then why shouldn't D say "Okay if that's what you want, that's what you'll get!" D allows 'goto' which is probably a more common source of bugs and high maintenance costs. -- Derek Melbourne, Australia 16/Aug/04 3:26:36 PM
Aug 15 2004
That's a very nice idea Ben ... "Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfiam0$hhc$1 digitaldaemon.com...Because, if not, then the following modification is the sort of (keyword qualifier) thing that would hand control back to the programmer: { uninitialized char[1024 * 32] workspace; formatter (workspace); } That's perhaps not the best name for it, but you get the idea.another option is to look for "void" as an initialization value: { char[1024*32] workspace = void; formatter (workspace); } that way we don't need another keyword. If the initializer is void the variable isn't initialized.
Aug 13 2004
Have you tried alloca? I haven't tried running it but I'm thinking of void stack2() { void* x = alloca(16*1024); } van eeshan wrote:"Walter" <newshound digitalmars.com> wrote in message news:cfbip7$ora$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfb72j$9l7$1 digitaldaemon.com...needs"Walter" <newshound digitalmars.com> wrote in message news:cfb5v9$76u$1 digitaldaemon.com...The problem, however, is that since Java does not allow any objects or arrays on the stack (or in static data), whenever you need one ittoThanks for clarifying that Walter. Would it be prudent to note that D allocates auto Objects on the heap? Stack allocation may be allowed for in the spec, but that's not how the compiler operates. Also, there is that point about heap allocations being a lot more expensive than stack allocations. Naturally, one would lean towards wholeheartedly agreeing; but this is just not true when it comes to D. Here's a little (contrived) test jig to illustrate: private import std.c.time; private import std.c.stdlib; private import std.c.windows.windows; void heap() { void *x = malloc (16 * 1024); free (x); } void stack() { ubyte[16 * 1024] x; } void trace (uint startTime) { FILETIME ft; GetSystemTimeAsFileTime(&ft); printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000); } void main() { FILETIME start; GetSystemTimeAsFileTime(&start); trace (start.dwLowDateTime); sleep (3); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) heap(); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) stack(); trace (start.dwLowDateTime); } The results? On this machine it takes the heap() routine ~500ms to execute, and the stack() version ~3500ms. Yep, that's seven times longer. Why? What wasn't stated is that D ensures all stack variables are initialized ... including arrays. The innocuous stack allocation actually clears the 16KB to zero each time, which is why it takes so long. For a 32KB 'buffer', the stack() execution time goes up to ~8600ms versus the ~500ms for heap(); you can see there's some rather non-linear behavior for the stack version <g> The point here is not about the benefits/detriments of deterministic initialization. It's about clarifying a point of performance-tuning that is far from obvious. Without wishing to put too fine a point on it, the quoted post was somewhat misleading. That is; allocating an array on the D stack can consume far more cycles than the equivalent heap allocation. It depends upon the size of the array, the bus speed, the size of the cache, heap fragmentation and many other things. D stack allocations are simply not like C/C++ at all. Far from it. For the curious, here's the disassembly for both routines. It's worth noting that the array initialization could be sped up. Also worth noting is that the loop/call overhead was 10ms. 10: { 11: void *x = malloc (16 * 1024); 12: free (x); 00402011 push 4000h 00402016 call _malloc (0040574c) 0040201B add esp,4 0040201E push eax 0040201F call _free (004057a4) 00402024 add esp,4 13: } 00402027 pop eax 00402028 ret 14: 15: 16: void stack() 0040202C push ebp 0040202D mov ebp,esp 0040202F mov edx,4 00402034 sub esp,1000h 0040203A test dword ptr [esp],esp 0040203D dec edx 0040203E jne _D5hello5stackFZv+8 (00402034) 00402040 sub esp,edx 00402042 mov ecx,4000h 00402047 xor eax,eax 17: { 18: ubyte[16 * 1024] x; 00402049 push edi 0040204A lea edi,[x] 00402050 rep stos byte ptr [edi] 19: } 00402052 pop edi 00402053 mov esp,ebp 00402055 pop ebp 00402056 retJavabe allocated on the heap. This is a lot more expensive than stack allocation. Since one winds up allocating a lot more heap objects instack? For auto objects, it is allowed for the compiler to put them there. But I was primarilly referring to structs and arrays.than one does in C/C++, the end result is slower. Java offers no alternatives to gc allocation. This is why D allows stack based objects and value objects.Are you saying that D allows one to create Objects entirely upon theAnd entirely within static data?Structs and arrays, yes. Java does not allow structs, nor does it allow arrays on the stack.
Aug 11 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfe87l$1akg$1 digitaldaemon.com...Have you tried alloca? I haven't tried running it but I'm thinking of void stack2() { void* x = alloca(16*1024); }Tried that, Ben; alloca() also clears everything to zero.
Aug 11 2004
van eeshan wrote:"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfe87l$1akg$1 digitaldaemon.com...Hmm. I just tried on Linux and got the times: heap: .17sec stack: 14 stack2: .06 looks like on Linux alloca is the fastest.Have you tried alloca? I haven't tried running it but I'm thinking of void stack2() { void* x = alloca(16*1024); }Tried that, Ben; alloca() also clears everything to zero.
Aug 11 2004
"van eeshan" <vanee hotmail.net> wrote in message news:cfe8ob$1b85$1 digitaldaemon.com..."Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfe87l$1akg$1 digitaldaemon.com...No, it does not. It simply calls C's alloca(), and gets whatever C's alloca() does. Note, however, that under linux the OS seems to clear pages that are faulted into the stack, so it could *appear* to be 0 initialized when it really is just fresh snow <g>.Have you tried alloca? I haven't tried running it but I'm thinking of void stack2() { void* x = alloca(16*1024); }Tried that, Ben; alloca() also clears everything to zero.
Aug 11 2004
Well, blow me down with a feather. I must have really screwed that test up; thank you Ben ~ that gets me out of a hole for now. So, in light of this, where are we now? Currently, if you want to quickly allocate an array on the stack you must do something like the following: char[] array = (cast(char *) alloca(Size))[0..Size]; I feel it would be beneficial to both programmers and to the language reputation to provide something rather more "accessible" than this ... thing. Perhaps a globally available template? Something, where taking advantage of the stack is not hidden away in the darkest recesses. This is why I figured something like an optional qualifier on the variable (to inhibit default initialization) would be ideal: it's easy to apply, easy to understand, easy to document, and very likely easy to implement. If this sort of thing remains poorly documented, or is not easily accessible, people will be left wondering why their supposed fast stack-arrays are actually eating serious chunks of CPU time. After all, porting some C code is what led to this foray (temporary stack arrays are commonplace in C). Finally, the alloca() approach is still ~700% slower than the C/C++ standard mechanism, yet it is faster than simply declaring a "char[20]" D local variable (that's kinda' hoky when you think about it ... a char[20] declaration is ~750% slower in D than in C). What to do? Can we please have a simple, clean, optional mechanism for inhibiting default initialization? Is it really so much trouble? "Ben Hinkle" <bhinkle4 juno.com> wrote in message news:cfe87l$1akg$1 digitaldaemon.com...Have you tried alloca? I haven't tried running it but I'm thinking of void stack2() { void* x = alloca(16*1024); } van eeshan wrote:it"Walter" <newshound digitalmars.com> wrote in message news:cfbip7$ora$1 digitaldaemon.com..."van eeshan" <vanee hotmail.net> wrote in message news:cfb72j$9l7$1 digitaldaemon.com..."Walter" <newshound digitalmars.com> wrote in message news:cfb5v9$76u$1 digitaldaemon.com...The problem, however, is that since Java does not allow any objects or arrays on the stack (or in static data), whenever you need oneinneedstobe allocated on the heap. This is a lot more expensive than stack allocation. Since one winds up allocating a lot more heap objectsIJavastack? For auto objects, it is allowed for the compiler to put them there. Butthan one does in C/C++, the end result is slower. Java offers no alternatives to gc allocation. This is why D allows stack based objects and value objects.Are you saying that D allows one to create Objects entirely upon theinwas primarilly referring to structs and arrays.Thanks for clarifying that Walter. Would it be prudent to note that D allocates auto Objects on the heap? Stack allocation may be allowed forAnd entirely within static data?Structs and arrays, yes. Java does not allow structs, nor does it allow arrays on the stack.longer.the spec, but that's not how the compiler operates. Also, there is that point about heap allocations being a lot more expensive than stack allocations. Naturally, one would lean towards wholeheartedly agreeing; but this is just not true when it comes to D. Here's a little (contrived) test jig to illustrate: private import std.c.time; private import std.c.stdlib; private import std.c.windows.windows; void heap() { void *x = malloc (16 * 1024); free (x); } void stack() { ubyte[16 * 1024] x; } void trace (uint startTime) { FILETIME ft; GetSystemTimeAsFileTime(&ft); printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000); } void main() { FILETIME start; GetSystemTimeAsFileTime(&start); trace (start.dwLowDateTime); sleep (3); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) heap(); trace (start.dwLowDateTime); for (int i=1_000_000; --i;) stack(); trace (start.dwLowDateTime); } The results? On this machine it takes the heap() routine ~500ms to execute, and the stack() version ~3500ms. Yep, that's seven timesactuallyWhy? What wasn't stated is that D ensures all stack variables are initialized ... including arrays. The innocuous stack allocationforclears the 16KB to zero each time, which is why it takes so long. For a 32KB 'buffer', the stack() execution time goes up to ~8600ms versus the ~500ms for heap(); you can see there's some rather non-linear behaviorDthe stack version <g> The point here is not about the benefits/detriments of deterministic initialization. It's about clarifying a point of performance-tuning that is far from obvious. Without wishing to put too fine a point on it, the quoted post was somewhat misleading. That is; allocating an array on theItstack can consume far more cycles than the equivalent heap allocation.cache,depends upon the size of the array, the bus speed, the size of theheap fragmentation and many other things. D stack allocations are simply not like C/C++ at all. Far from it. For the curious, here's the disassembly for both routines. It's worth noting that the array initialization could be sped up. Also worth noting is that the loop/call overhead was 10ms. 10: { 11: void *x = malloc (16 * 1024); 12: free (x); 00402011 push 4000h 00402016 call _malloc (0040574c) 0040201B add esp,4 0040201E push eax 0040201F call _free (004057a4) 00402024 add esp,4 13: } 00402027 pop eax 00402028 ret 14: 15: 16: void stack() 0040202C push ebp 0040202D mov ebp,esp 0040202F mov edx,4 00402034 sub esp,1000h 0040203A test dword ptr [esp],esp 0040203D dec edx 0040203E jne _D5hello5stackFZv+8 (00402034) 00402040 sub esp,edx 00402042 mov ecx,4000h 00402047 xor eax,eax 17: { 18: ubyte[16 * 1024] x; 00402049 push edi 0040204A lea edi,[x] 00402050 rep stos byte ptr [edi] 19: } 00402052 pop edi 00402053 mov esp,ebp 00402055 pop ebp 00402056 ret
Aug 11 2004
In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...Currently, if you want to quickly allocate an array on the stack you must do something like the following: char[] array = (cast(char *) alloca(Size))[0..Size];My experiments seem to suggest that the following actually works rather well.
Aug 12 2004
Now that is exactly what the compiler ought to (optionally) provide for, at the high level :-) 16 bytes allocations (versus relative execution time): heap: 269 stack[]: 50 alloca(): 57 asm: 6 2K allocations: heap: 504 stack[]: 5681 alloca(): 57 asm: 10 32K allocations: heap: 520 stack[]: 8812 alloca(): 118 asm: 10 There's that 32KB stack-array initialization being ~88000% slower than the traditional [asm] C/C++ uninitialized style. Barring further issue, I will use that ~ thanks very much Jill. Perhaps one of the Wiki owners will take an interest in noting these workarounds. "Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cff6a9$1ugb$1 digitaldaemon.com...In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...well.Currently, if you want to quickly allocate an array on the stack you must do something like the following: char[] array = (cast(char *) alloca(Size))[0..Size];My experiments seem to suggest that the following actually works rather
Aug 12 2004
I've been trying to convert this to a mixin, but not having much luck. Guess I don't know enough about templates ... can you point me in the right direction please Jill? Or will a mixin always place this inside its own private function scope (and thus restore ESP before the array can be used) ? "Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cff6a9$1ugb$1 digitaldaemon.com...In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...well.Currently, if you want to quickly allocate an array on the stack you must do something like the following: char[] array = (cast(char *) alloca(Size))[0..Size];My experiments seem to suggest that the following actually works rather
Aug 12 2004