digitalmars.D.bugs - array .dup bug in 0.97 and 0.98? - matrix_dup_bug.zip
- Dave (97/97) Aug 05 2004 Wow! Great language Walter!!
- Nick (20/24) Aug 05 2004 I'm not quite sure I understand the problem. Are you trying to copy a do...
- Dave (31/37) Aug 05 2004 Yes, I was trying to copy the whole matrix with a dup. on m1.
- Dave (15/26) Aug 05 2004 I apologize - you are exactly correct in what is happening. I guess it w...
- Regan Heath (34/74) Aug 05 2004 char[] doesn't exactly have copy-on-write semantics, for example:
- Dave (17/19) Aug 05 2004 Got it - thanks for the quick replies.
- J C Calvarese (20/49) Aug 05 2004 From: http://www.digitalmars.com/d/future.html
- Juanjo Álvarez (6/36) Aug 06 2004 Somebody with time should implement and run the "Great Language
- Dave (22/27) Aug 06 2004 Where is the updated version? I've already run half the tests, thanks to...
- Juanjo Álvarez (6/29) Aug 06 2004 The objective would be to fix the slow parts of the language, if the nat...
- Dave (12/26) Aug 06 2004 Can't believe I missed it the first time - been a long week, TGIF.
- Regan Heath (25/65) Aug 08 2004 The built in D strings do not handle concatenation too well either, they...
- Dave (8/26) Aug 09 2004 If 'reserve' is added, another suggestion could be to preemptively expan...
- Juanjo Álvarez (4/7) Aug 19 2004 An algorithm that I've find good enough for most situations and simple t...
- Dave (17/26) Aug 19 2004 Good idea - I've written some wicked fast C++ string classes (for, among
Wow! Great language Walter!!

Please see attached. It contains the .d and two .cod files (one with the bug and one without). Comments in the .d file explain the potential problem fairly well, hopefully.

Also, would you like reports and examples on performance issues should I run into any at this time? I would be more than happy to take the time and provide this info. if it is wanted.

Thanks,
Dave

[Attachment: matrix_dup_bug.zip]
 Aug 05 2004
In article <cetin9$1oo1$1 digitaldaemon.com>, Dave says...
> Wow! Great language Walter!! Please see attached. It contains the .d and two .cod files (one with the bug and one without). Comments in the .d file explain the potential problem fairly well, hopefully.

I'm not quite sure I understand the problem. Are you trying to copy a double (two-dimensional) array with a dup? In a double array such as int[][] mm, each element mm[i] is a reference to an array int[]. Writing mm.dup gives you a copy of the table of row pointers, but the pointers still point to the same actual data. If you want to copy the entire matrix you would write something like a copyMatrix function that dups each row in turn.

Incidentally, you can also write the last part as m1 = m2.copyMatrix(), but this only works on arrays (as discussed recently on one of these NGs.)

Nick
 Aug 05 2004
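Nick's original code did not survive in this archive, so the following is only a sketch of what a row-by-row copy might look like. The name `copyMatrix` comes from his post; the body is an assumption, and the code is written for a current D2 compiler rather than the DMD 0.9x under discussion.

```d
// Hypothetical reconstruction: copies the row table and every row.
int[][] copyMatrix(int[][] src)
{
    int[][] dst;
    dst.length = src.length;      // new table of row references
    foreach (i, row; src)
        dst[i] = row.dup;         // each row gets its own storage
    return dst;
}

void main()
{
    int[][] m2 = [[1, 2], [3, 4]];
    int[][] m1 = m2.copyMatrix(); // same as copyMatrix(m2)
    m1[0][0] = 99;
    assert(m2[0][0] == 1);        // the source matrix is untouched
}
```

Because every row is dup'ed separately, writes to `m1` no longer show up in `m2`, which is the behaviour Dave expected from a plain `.dup`.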
In article <cetnkl$1r0r$1 digitaldaemon.com>, Nick says...
> I'm not quite sure I understand the problem. Are you trying to copy a double array with a dup? In a double array such as int[][] mm, each element mm[] is a pointer to an array int[]. Writing mm.dup gives you a copy of the table of row pointer, but the pointers still point to the same actual data. If you want to copy the entire matrix you would write something like

Yes, I was trying to copy the whole matrix with a dup. on m1.

I guess what you say makes sense (if you are used to the C/C++ 'way' at least), but I gotta say it is not explained well in the docs., the compiler didn't complain at all, and the thing didn't crash at runtime.

Besides that, I tried this:

    int[][] m1 = mkmatrix(SIZE, SIZE); // mkmatrix allocates/init's each element
    printf("%d, %d, %d\n", m1.length, m1[0].length, m1[m1.length - 1].length);
    printf("%d %d %d %d\n", m1[0][0], m1[2][3], m1[3][2], m1[4][4]);
    int[][] mx = m1.dup;
    printf("%d, %d, %d\n", mx.length, mx[0].length, mx[mx.length - 1].length);
    printf("%d %d %d %d\n", mx[0][0], mx[2][3], mx[3][2], mx[4][4]);

The results were the same for both:

    30, 30, 30
    1 64 93 125
    30, 30, 30
    1 64 93 125

The way I look at it and how the compiler apparently looks at it is that m1 and mx are both type int[][], so therefore a 'dup' that is applied directly to m1 should therefore allocate for and copy the entire thing.

Either way, there is a bug somewhere in the compiler - either it is copying the whole matrix as it is supposed to and the buglet is further along in the code or it is copying the entire matrix and is _not_ supposed to do that. If it is not supposed to allocate and copy everything, then at least a warning should be issued because the m1.dup code is pretty intuitive (and nice! if that is how it is supposed to work).

I think the dup part is working the way it should but the buglet is further along in the code, like in the mmult(...) part. Maybe the compiler is getting references mixed up after a 'copy on write' or something.

- Dave
 Aug 05 2004
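Printing lengths and element values, as in the test above, gives the same output for a shallow and a deep copy, so it cannot tell the two apart. One way to see what `.dup` actually did is to compare the `.ptr` of the row table and of each row. The helper name `sharesRows` is made up for this sketch, and the code assumes a current D2 compiler.

```d
import std.stdio : writeln;

// Hypothetical helper: true when every row of a and b uses the same storage.
bool sharesRows(int[][] a, int[][] b)
{
    if (a.length != b.length)
        return false;
    foreach (i; 0 .. a.length)
        if (a[i].ptr !is b[i].ptr)
            return false;
    return true;
}

void main()
{
    int[][] m1 = [[1, 2, 3], [4, 5, 6]];
    int[][] mx = m1.dup;

    writeln(m1.ptr is mx.ptr);   // is the row table itself shared?
    writeln(sharesRows(m1, mx)); // are the rows shared?

    mx[1][2] = 42;               // write through the copy...
    writeln(m1[1][2]);           // ...and read through the original
}
```

If the row table differs but the rows are shared, the copy is shallow, and a write through `mx` will be visible through `m1`.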
In article <cetvkv$1vpp$1 digitaldaemon.com>, Dave says...
> In article <cetnkl$1r0r$1 digitaldaemon.com>, Nick says...
>> I'm not quite sure I understand the problem. Are you trying to copy a double array with a dup? In a double array such as int[][] mm, each element mm[] is a pointer to an array int[]. Writing mm.dup gives you a copy of the table of row pointer, but the pointers still point to the same actual data. If you want to copy the entire matrix you would write something like
> Yes, I was trying to copy the whole matrix with a dup. on m1.

I apologize - you are exactly correct in what is happening. I guess it would not necessarily be considered a bug and my reply just confused things more. After the dup and subsequent writes to mm, the m1 data was being modified to be the same as mm, as you implied.

What was further confusing me was the copy-on-write stuff, for example, for operations on char[]. Which begs the question, since 1-D arrays are copy-on-write, shouldn't the 2nd dimension in a dup'ed matrix also be copy-on-write?? Or does copy-on-write only apply to char[]'s and not any other type of array??

I think that is really non-intuitive. I think of dup as a synonym for 'copy' and most object.copy() type of methods I've used and written actually make a deep copy of the object and its contents (in this case, I'm looking at an int[][] as an object).

- Dave
 Aug 05 2004
On Thu, 5 Aug 2004 20:54:08 +0000 (UTC), Dave <Dave_member pathlink.com> wrote:
> I apologize - you are exactly correct in what is happening. I guess it would not necessarily be considered a bug and my reply just confused things more. After the dup and subsequent writes to mm, the m1 data was being modified to be the same as mm, as you implied. What was further confusing me was the copy-on-write stuff, for example, for operations on char[].

char[] doesn't exactly have copy-on-write semantics, for example:

    char[] a = "regan";
    char[] b = a;
    b[0] = 'm';
    assert(b[0] == a[0]);

will _not_ assert, this is because a and b are references, in this case to the same data. Similarly, this..

    char[] a = "regan";
    char[] b = a[1..3];
    b[0] = 'o';
    assert(a[1] == b[0]);

will also not assert, this is because b simply references the same data as a. This however...

    char[] a = "regan";
    char[] b;
    b = a ~ " was here";

will copy the contents of a, for b, and append the new data. The docs specifically state that an append will copy.

> Which begs the question, since 1-D arrays are copy-on-write, shouldn't the 2nd dimension in a dup'ed matrix also be copy-on-write?? Or does copy-on-write only apply to char[]'s and not any other type of array??

An 'int[][]' is a reference, to an array of references, to arrays of ints. So...

    int[][] a;
    a.dup

says duplicate the data 'a' references, which is an array of references, to arrays of ints. The array of references gets duplicated, but the arrays of ints they refer to do not. Nick's function was duplicating the arrays of ints as well.

> I think that is really non-intuitive. I think of dup as a synonym for 'copy' and most object.copy() type of methods I've used and written actually make a deep copy of the object and its contents (in this case, I'm looking at an int[][] as an object).

And it does, the difference is what is considered the object in this case.

Regan

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
 Aug 05 2004
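The three cases Regan describes (plain assignment, slicing, concatenation) can be checked in one small program. This is a sketch for a current D2 compiler, where string literals are immutable, so each example starts from a `.dup` of the literal; the function names are invented for illustration.

```d
// Writes through a second reference to the same data.
char[] writeThroughAlias()
{
    char[] a = "regan".dup;
    char[] b = a;          // b refers to the same characters as a
    b[0] = 'm';
    return a;              // a sees the change
}

// Writes through a slice of the original array.
char[] writeThroughSlice()
{
    char[] a = "regan".dup;
    char[] b = a[1 .. 3];  // b is a window onto a[1] and a[2]
    b[0] = 'o';
    return a;              // a[1] was changed
}

// Writes to the result of a concatenation.
char[] writeAfterConcat()
{
    char[] a = "regan".dup;
    char[] b = a ~ " was here".dup; // ~ builds a new array
    b[0] = 'x';
    return a;              // a is unaffected
}

void main()
{
    assert(writeThroughAlias() == "megan");
    assert(writeThroughSlice() == "rogan");
    assert(writeAfterConcat() == "regan");
}
```

Only the concatenation produces independent storage; the other two are views of the same data, which is also why a `.dup` of an int[][] still shares its rows.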
In article <opsb99pvnp5a2sq9 digitalmars.com>, Regan Heath says...
> And it does, the difference is what is considered the object in this case.
>
> Regan

Got it - thanks for the quick replies.

In that same attachment with the original post, there is a performance comparison between dmd, dmc, Intel and gcc for that code. I suspect the question of array performance is going to be on many potential users' minds and gcc and Intel perform 3x better for this. Of course, Java is still living bad perf. down even though most runtimes are now pretty good, at least for native data types, and I would like to see D avoid that rap.

I know, this code is kind-of trivial, artificial, etc. but a 3x difference is still pretty large (and yes, I used -O -inline -release).

Not complaining - just trying to make the language and tools better. Actually am quite pleased with things like OutBuffer performance, etc. Other things, like the native associative arrays though, seem pretty slow compared to C++ hash_map<> and Java HashMap().

Does anyone know if there are planned improvements before what I presume will be the big release of 1.0? Is there any "roadmap" around I could look at?

Thanks.
 Aug 05 2004
Dave wrote:
> Does anyone know if there are planned improvements before what I presume will be the big release of 1.0? Is there any "roadmap" around I could look at?

From: http://www.digitalmars.com/d/future.html

Future Directions

The following new features for D are planned, but the details have not been worked out:
1. Mixins.
2. Template inheritance.
3. Array literal expressions.

I get the impression that once the important bugs are squashed (and don't ask me which ones are important because I don't know), it'll be stamped D 1.0 and sent out the door. I suspect that the only major features that might be added before 1.0 would be the ones inspired/demanded by the D Template Library project.

Then, I think Walter's planning on adding optimizations in the trek to 2.0. And new features that fit into the D paradigm would be accumulated, too.

--
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
 Aug 05 2004
Dave wrote:
> In that same attachment with the original post, there is a performance comparison between dmd, dmc, Intel and gcc for that code. I suspect the question of array performance is going to be on many potential users' minds and gcc and Intel perform 3x better for this.

Somebody with time should implement and run the "Great Language Shootout" (updated version) to find problems like that in the current DMD compiler.

http://shootout.alioth.debian.org/

PS: How does GDC compare to DMD on array handling?
 Aug 06 2004
In article <cevk5l$3e4$1 digitaldaemon.com>, Juanjo Álvarez says...
> Somebody with time should implement and run the "Great Language Shootout" (updated version) to find problems like that in the current DMD compiler. http://shootout.alioth.debian.org/
>
> PS: How GDC compare to DMD on array handling?

Where is the updated version? I've already run half the tests, thanks to a quick start from http://www.functionalfuture.com/d/

For this particular test, GDC actually took over 2x as long compiled with: '-O3 -fomit-frame-pointer -funroll-loops -mtune=pentium4 -static'. Based on similar differences between g++ and gcj, I think this has a lot to do with the implementation of the frontend with the gcc backend and doesn't have anything to do with D itself (gdc is brand-new, right)?

I would be happy to start on this comparison (but I can't 'promise' that I'll have the time to finish it anytime real soon). I therefore need to know thoughts on:

- Use the built-in Associative Array, or not? If not, is there a DTL hash_map-like class I should use? I would vote for the built-in AA because that's what'll be used most and so far it seems pretty slow.
- Use the built-in string, OutBuffer or DTL string (if there is one)?
- I plan to always use the built-in arrays for the numeric stuff, unless somebody can give me a real good reason to use something else (like DTL).

Or...

- Just make it go as fast as possible (barring inline assembler) no matter what the intent of the language design?

Thanks..
 Aug 06 2004
Dave wrote:
>> Somebody with time should implement and run the "Great Language Shootout" (updated version) to find problems like that in the current DMD compiler. http://shootout.alioth.debian.org/
> Where is the updated version? I've already run half the tests, thanks to a quick start from http://www.functionalfuture.com/d/

The URL posted is the one for the updated version.

> I would be happy to start on this comparison (but I can't 'promise' that I'll have the time to finish it anytime real soon). I therefore need to know thoughts on:
> - Use the built-in Associative Array, or not? If not, is there a DTL hash_map-like class I should use? I would vote for the built-in AA because that's what'll be used most and so far it seems pretty slow.

The objective would be to fix the slow parts of the language; if the native associative array is damn slow, use it so maybe big W fixes it :)

> - Use the built-in string, OutBuffer or DTL string (if there is one)?
> - I plan to always use the built-in arrays for the numeric stuff, unless somebody can give me a real good reason to use something else (like DTL).

Native, it will show better how nice string manipulation can be in D.

> - Just make it go as fast as possible (barring inline assembler) no matter what the intent of the language design?

Please, no.
 Aug 06 2004
In article <cf13rk$14jn$1 digitaldaemon.com>, Juanjo Álvarez says...
> The url posted is the one of the updated version.

Can't believe I missed it the first time - been a long week, TGIF.

> The objective would be to fix the slow parts of the language, if the native associative array is damn slow use it so maybe big W fix it :)

Makes total sense, but being new to the language I thought I'd ask because I don't know the history or intended use of the built-ins. For example, Java has String also, but it is not meant for heavy concatenation and the recommendation has always been to use StringBuffer (and now StringBuilder for Java1.5 when it doesn't need to be thread synchronized).

I will use the built-in AA, string and indexed arrays for everything applicable, but I bet when the comparisons come out, I'll get lots of the "hey, you should've used class X instead" type of complaints ;).

- Dave
 Aug 06 2004
On Sat, 7 Aug 2004 03:39:21 +0000 (UTC), Dave <Dave_member pathlink.com> wrote:
> Makes total sense, but being new to the language I thought I'd ask because I don't know the history or intended use of the built-ins. For example, Java has String also, but it is not meant for heavy concatenation and the recommendation has always been to use StringBuffer (and now StringBuilder for Java1.5 when it doesn't need to be thread synchronized).

The built in D strings do not handle concatenation too well either, they reallocate each time, creating only enough space for each concatenation. You can however use a little trick like so:

    char[] s = "test";
    int keep;
    keep = s.length;
    s.length = 1000;
    s.length = keep;
    s = s ~ "a";

to pre-allocate the space you think you might need. However there is no guarantee it won't release that memory at some point during your concatenations.

I have requested a .reserve property for strings that reserves space as the above does but with the guarantee not to release it. It also has the benefit of more intuitive and simple syntax, e.g.

    char[] s = "test";
    s.reserve = 1000;
    s = s ~ "a";

So far no luck.

> I will use the built-in AA, string and indexed arrays for everything applicable, but I bet when the comparisons come out, I'll get lots of the "hey, you should've used class X instead" type of complaints ;).

To avoid that, comment your intent at the top of the code.

Regan

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
 Aug 08 2004
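For readers on a current compiler: the D2 runtime provides `reserve` and `capacity` for built-in arrays, which cover the use case requested above. They are not part of the DMD 0.9x releases this thread is about. The sketch below assumes D2; the function name `appendsInPlace` is made up, and it reports whether a run of appends stayed in the reserved block.

```d
// Hypothetical helper: reserves room, appends, and reports whether the
// array's storage stayed at the same address throughout.
bool appendsInPlace(size_t reserveFor, size_t appends)
{
    char[] s = "test".dup;
    s.reserve(reserveFor);      // request room for at least reserveFor chars
    assert(s.capacity >= reserveFor);

    auto start = s.ptr;         // address after the reservation
    foreach (i; 0 .. appends)
        s ~= 'a';               // fits in the reserved block
    return s.ptr is start;
}

void main()
{
    // 4 initial chars plus 500 appended stay well inside 1000 reserved.
    assert(appendsInPlace(1000, 500));
}
```

Note the use of `~=` rather than `s = s ~ "a"`: the binary `~` operator builds a new array each time, so it would not benefit from the reservation.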
In article <opscftmwn55a2sq9 digitalmars.com>, Regan Heath says...
> The built in D strings do not handle concatenation too well either, they reallocate each time creating only enough space for each concatenation. You can however use a little trick like so:
>
>     char[] s = "test";
>     int keep;
>     keep = s.length;
>     s.length = 1000;
>     s.length = keep;
>     s = s ~ "a";
>
> to pre-allocate the space you think you might need. However there is no guarantee it won't release that memory at some point during your concatenations. I have requested a .reserve property for strings that reserves space as the above does but with the guarantee not to release it. It also has the benefit of more intuitive and simple syntax eg.
>
>     char[] s = "test";
>     s.reserve = 1000;
>     s = s ~ "a";

If 'reserve' is added, another suggestion could be to preemptively expand the buffer as the string grows from concatenation, to lower the copying/reallocations. I believe most C++ std::string implementations do this.

Maybe for most situations, the programmer would be able to avoid using string.reserve and performance would be better for the times it is tough to make a good guess at the size at runtime.

Maybe this is how OutBuffer currently works as well.
 Aug 09 2004
Dave wrote:
> Maybe for most situations, the programmer would be able to avoid using string.reserve and performance would be better for the times it is tough to make a good guess at the size at runtime.

An algorithm that I've found good enough for most situations and simple to code is to double the reserved size of the string every time it runs out of space. I think that's also the way Python lists work.
 Aug 19 2004
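The doubling strategy described above can be sketched as a small append buffer. Everything here (the `GrowBuffer` name, the starting size of 16, the `grows` counter) is an assumption made for illustration and is not taken from OutBuffer or any other library; it assumes a current D2 compiler.

```d
// Hypothetical append buffer that doubles its storage when it runs out.
struct GrowBuffer
{
    char[] store;   // allocated storage, possibly larger than needed
    size_t used;    // number of characters actually in use
    size_t grows;   // how many times the storage had to be enlarged

    void put(const(char)[] text)
    {
        size_t needed = used + text.length;
        if (needed > store.length)
        {
            size_t cap = store.length ? store.length : 16;
            while (cap < needed)
                cap *= 2;           // double until the text fits
            store.length = cap;
            ++grows;
        }
        store[used .. needed] = text[]; // copy the new characters in
        used = needed;
    }

    // The characters written so far, without the spare room.
    const(char)[] data() const { return store[0 .. used]; }
}

void main()
{
    GrowBuffer buf;
    foreach (i; 0 .. 1000)
        buf.put("a");
    assert(buf.data.length == 1000);
    assert(buf.grows == 7);  // sizes 16, 32, 64, 128, 256, 512, 1024
}
```

A thousand one-character appends cost seven enlargements here, instead of one reallocation per append.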
Juanjo Álvarez wrote:
> Dave wrote:
>> Maybe for most situations, the programmer would be able to avoid using string.reserve and performance would be better for the times it is tough to make a good guess at the size at runtime.
> An algorithm that I've find good enough for most situations and simple to code is to double the reserved size of the string every time it gets out of space. I think that's also the way Python lists work.

Good idea - I've written some wicked fast C++ string classes (for, among other things, 'buffering' very large dynamically generated HTML pages) that would basically add 10% - 20% (can't remember exactly) whenever things were realloc'd. This avoided a bunch of swapping once the strings got large and the perf. difference for the smaller requirements was negligible.

Pretty simple algorithm, but it worked. No special attention to alignment or anything else - just realloc a bit more than requested and perf. improved dramatically in tight loops.

I've never dug into what std::basic_string<> is doing internally, but it looks to be something along those lines rather than doubling it. And basic_string<> is as fast or faster for concatenation as anything else I've seen, including other built-ins like Borland Object Pascal.

Matthew, if you happen to read this, what do the STLSoft implementations do in these cases?

Thanks,
Dave
 Aug 19 2004
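The trade-off between doubling and adding 10% - 20% can be made concrete by counting growth steps. The function below is only arithmetic, not a benchmark; the name `growthSteps` and the starting capacity of 16 are assumptions for this sketch, written for a current D2 compiler.

```d
import std.stdio : writeln;

// Counts how many times a buffer must grow to hold `target` elements when
// every growth step adds `percent` percent of the current capacity.
size_t growthSteps(size_t target, size_t percent)
{
    size_t cap = 16;                 // assumed starting capacity
    size_t steps = 0;
    while (cap < target)
    {
        size_t extra = cap * percent / 100;
        if (extra == 0)
            extra = 1;               // always make some progress
        cap += extra;
        ++steps;
    }
    return steps;
}

void main()
{
    writeln(growthSteps(1_000_000, 100)); // doubling
    writeln(growthSteps(1_000_000, 15));  // adding 15% each time
}
```

Doubling needs fewer steps but can leave up to half of the final block unused, while a small percentage takes more steps and wastes less memory, which fits the observation above that modest growth avoided swapping on very large strings.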








 
  
  
 
 J C Calvarese <jcc7 cox.net>