D - toStringz: BUG and suggestion

Some colleagues and I examined how arrays behave when they are reallocated.
As a result, we found that toStringz can return a pointer to an illegal memory block in
a case such as:

char[] a = "12345678";
puts(toStringz(a ~ a));  // <- Here!


toStringz behaves as follows:

[1] If the length is 0, it returns "".
[2] If (&str[0])[str.length] is 0, it returns &str[0].
[3] Otherwise, it returns a copy with a 0 appended.

And a string produced by an operation such as concatenation behaves as follows:

1) If the new length is at most 16, the capacity is 16.
2) If the new length is more than 16 and at most 32, the capacity is 32.
3) If the new length isn't equal to the capacity, the unused area is initialized
to 0.

This means:

If the new length is 16, the capacity is 16 due to 1).
So, if the result of the operation has length 16 (as a ~ a does above),
[2] accesses an illegal memory block,
because (&str[0])[str.length] lies outside the capacity.
If that byte happens to be 0, toStringz doesn't copy the string,
and it returns a pointer whose terminating 0 lies in an illegal memory block.
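
The failure case can be modeled in plain C. This is only a sketch of the rules
listed above, not the D runtime or the Phobos source: ModelString,
model_concat and step2_reads_outside are hypothetical names, and only the two
capacity classes (16 and 32) mentioned in this post are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a string: pointer, length, and the size of the
   block that was actually allocated for it. */
typedef struct {
    char  *ptr;
    size_t length;
    size_t capacity;
} ModelString;

/* Models rules 1)-3): capacity is 16 or 32, and the unused area is zeroed.
   Lengths above 32 are not described in the post, so they are rejected. */
ModelString model_concat(const char *a, const char *b)
{
    ModelString s;
    size_t la = strlen(a), lb = strlen(b);

    s.length = la + lb;
    assert(s.length <= 32);
    s.capacity = (s.length <= 16) ? 16 : 32;
    s.ptr = (char *)calloc(s.capacity, 1);   /* calloc zero-fills the block */
    assert(s.ptr != NULL);
    memcpy(s.ptr, a, la);
    memcpy(s.ptr + la, b, lb);
    return s;
}

/* Returns 1 if step [2] would read ptr[length] outside the allocated block.
   The read itself is never performed here; only the index is checked. */
int step2_reads_outside(const ModelString *s)
{
    return s->length != 0 && s->length >= s->capacity;
}
```

With this model, model_concat("12345678", "12345678") gives length 16 and
capacity 16, so step2_reads_outside reports 1. A 15-character result still has
one zeroed spare byte, and the check in [2] stays inside the block.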

Therefore, I think toStringz has to check the capacity, too.
This would not only fix the bug,
but also make copying less frequent.
e.g.

{1} If the length is 0, return "".
{2} If the length is equal to the capacity, return a copy with a 0 appended.
{3} Otherwise, set (&str[0])[str.length] = 0 and return &str[0].

{3} appears to be contrary to the spirit of "copy-on-write",
but {3} doesn't need to touch the string itself,
i.e. it changes only the region that is outside the string but inside the capacity.
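
The proposed rules {1}-{3} could look roughly like this in C. Again this is a
sketch under assumptions: ModelString and model_toStringz are hypothetical
names, and the capacity is assumed to be available as a plain field, which may
not match how the real runtime exposes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a string with a known capacity. */
typedef struct {
    char  *ptr;
    size_t length;
    size_t capacity;
} ModelString;

/* Proposed rules {1}-{3}. *copied is set to 1 when a new block was
   allocated (the caller then owns it), and to 0 otherwise. */
const char *model_toStringz(ModelString *s, int *copied)
{
    *copied = 0;
    if (s->length == 0)
        return "";                              /* {1} empty string */

    assert(s->length <= s->capacity);
    if (s->length == s->capacity) {             /* {2} no spare byte: copy */
        char *copy = (char *)malloc(s->length + 1);
        assert(copy != NULL);
        memcpy(copy, s->ptr, s->length);
        copy[s->length] = '\0';
        *copied = 1;
        return copy;
    }

    s->ptr[s->length] = '\0';                   /* {3} write into the spare area */
    return s->ptr;
}
```

Because {3} writes the terminator itself, the result no longer depends on what
the spare byte happened to contain, and a copy is made only when the block is
completely full.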

Robert (Japanese)
Dec 11 2003