www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - toStringz and toUTFz potentially unsafe

reply Johann MacDonagh <johann.macdonagh.no spam.gmail.com> writes:
Both toStringz and toUTFz do something potentially unsafe. Both check 
whether the character after the end of the string is NULL. If so, then 
it simply returns a pointer to the original string. This is a good 
optimization in theory because this code:

string s = "abc";

will be a slice to a read-only section of the executable. The compiler 
will insert a NULL after the string in the read-only section. So this:

auto x = toStringz("abc");

is efficient. No relocations.

As  AndrejMitrovic commented in Phobos pull request 123 
https://github.com/D-Programming-Language/phobos/pull/123, this has 
potential issues:

import std.string;
import std.stdio;

struct A
{
     immutable char[2] foo;
     char[2] bar;
}

void main()
{
     auto a = A("aa", "\0b");
     auto charptr = toStringz(a.foo[]);

     a.bar = "bo";
     printf(charptr);   // two chars, then garbage
}

Another issue not mentioned is with slices. If I do...

string s = "abc";
string y = s[];
string z = y[];

z ~= '\0';

auto c = toStringz(y);

assert(c.ptr == y.ptr);

... what happens if I change that last character of z before I pass c to 
the C routine? Bad news. I think this optimization is great, but doesn't 
it go against D's motto of doing "the right thing by default"?

The question is, how can we keep this optimization so that:

toStringz("abc");

remains efficient?

The capacity field is 0 if the string is in a read-only section *or* if 
the string is on the stack:

auto x = "abc";
assert(x.capacity == 0);
char[3] y = "abc";
assert(x.capacity == 0);

So, this isn't safe either. This code:

     char[3] x = "abc";
     char y = '\0';

will put y right after x, so changing y after calling toStringz will 
cause issues.

In reality, the only time it's safe to do the "peek after end" is if the 
string is in the read-only section. Otherwise, there are potential 
issues (even if they are edge cases).

Do we care about this? Is there something we can add to druntime arrays 
that will tell whether or not the data backing a slice is in read-only 
memory (or perhaps an enum: read-only, stack, heap, other)? In reality, 
the only time this changes is when a read-only / stack heap is appended 
to, so performance issues are minimal.

Comments? Ideas?
Jul 24 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/24/11 7:41 PM, Johann MacDonagh wrote:
 Both toStringz and toUTFz do something potentially unsafe. Both check
 whether the character after the end of the string is NULL. If so, then
 it simply returns a pointer to the original string. This is a good
 optimization in theory because this code:

 string s = "abc";

 will be a slice to a read-only section of the executable. The compiler
 will insert a NULL after the string in the read-only section. So this:

 auto x = toStringz("abc");

 is efficient. No relocations.

 As  AndrejMitrovic commented in Phobos pull request 123
 https://github.com/D-Programming-Language/phobos/pull/123, this has
 potential issues:

 import std.string;
 import std.stdio;

 struct A
 {
 immutable char[2] foo;
 char[2] bar;
 }

 void main()
 {
 auto a = A("aa", "\0b");
 auto charptr = toStringz(a.foo[]);

 a.bar = "bo";
 printf(charptr); // two chars, then garbage
 }

 Another issue not mentioned is with slices. If I do...

 string s = "abc";
 string y = s[];
 string z = y[];

 z ~= '\0';

 auto c = toStringz(y);

 assert(c.ptr == y.ptr);

 ... what happens if I change that last character of z before I pass c to
 the C routine? Bad news. I think this optimization is great, but doesn't
 it go against D's motto of doing "the right thing by default"?

 The question is, how can we keep this optimization so that:

 toStringz("abc");

 remains efficient?

 The capacity field is 0 if the string is in a read-only section *or* if
 the string is on the stack:

 auto x = "abc";
 assert(x.capacity == 0);
 char[3] y = "abc";
 assert(x.capacity == 0);

 So, this isn't safe either. This code:

 char[3] x = "abc";
 char y = '\0';

 will put y right after x, so changing y after calling toStringz will
 cause issues.

 In reality, the only time it's safe to do the "peek after end" is if the
 string is in the read-only section. Otherwise, there are potential
 issues (even if they are edge cases).

 Do we care about this? Is there something we can add to druntime arrays
 that will tell whether or not the data backing a slice is in read-only
 memory (or perhaps an enum: read-only, stack, heap, other)? In reality,
 the only time this changes is when a read-only / stack heap is appended
 to, so performance issues are minimal.

 Comments? Ideas?
I'm not too worried. I think it is fair to guarantee the pointer returned from toStringz is guaranteed to point to a zero-terminated string only up until the first relevant change. It is difficult to define what a relevant change is, but practically I think it is understood what's going on. If there's a need for a persistent stringz, creating a private copy immediately after the call to toStringz is always an option. Andrei
Jul 24 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 24 July 2011 20:41:47 Johann MacDonagh wrote:
 Both toStringz and toUTFz do something potentially unsafe. Both check
 whether the character after the end of the string is NULL. If so, then
 it simply returns a pointer to the original string. This is a good
 optimization in theory because this code:
 
 string s = "abc";
 
 will be a slice to a read-only section of the executable. The compiler
 will insert a NULL after the string in the read-only section. So this:
 
 auto x = toStringz("abc");
 
 is efficient. No relocations.
 
 As  AndrejMitrovic commented in Phobos pull request 123
 https://github.com/D-Programming-Language/phobos/pull/123, this has
 potential issues:
 
 import std.string;
 import std.stdio;
 
 struct A
 {
      immutable char[2] foo;
      char[2] bar;
 }
 
 void main()
 {
      auto a = A("aa", "\0b");
      auto charptr = toStringz(a.foo[]);
 
      a.bar = "bo";
      printf(charptr);   // two chars, then garbage
 }
 
 Another issue not mentioned is with slices. If I do...
 
 string s = "abc";
 string y = s[];
 string z = y[];
 
 z ~= '\0';
 
 auto c = toStringz(y);
 
 assert(c.ptr == y.ptr);
 
 ... what happens if I change that last character of z before I pass c to
 the C routine? Bad news. I think this optimization is great, but doesn't
 it go against D's motto of doing "the right thing by default"?
 
 The question is, how can we keep this optimization so that:
 
 toStringz("abc");
 
 remains efficient?
 
 The capacity field is 0 if the string is in a read-only section *or* if
 the string is on the stack:
 
 auto x = "abc";
 assert(x.capacity == 0);
 char[3] y = "abc";
 assert(x.capacity == 0);
 
 So, this isn't safe either. This code:
 
      char[3] x = "abc";
      char y = '\0';
 
 will put y right after x, so changing y after calling toStringz will
 cause issues.
 
 In reality, the only time it's safe to do the "peek after end" is if the
 string is in the read-only section. Otherwise, there are potential
 issues (even if they are edge cases).
 
 Do we care about this? Is there something we can add to druntime arrays
 that will tell whether or not the data backing a slice is in read-only
 memory (or perhaps an enum: read-only, stack, heap, other)? In reality,
 the only time this changes is when a read-only / stack heap is appended
 to, so performance issues are minimal.
 
 Comments? Ideas?
The _only_ time that this matters is if you actually keep the result of toStringz or toUTFz around. If you do what is almost always done - that is pass the the result of toStringz/toUTFz directly to a function and not save it at all - then there is _zero_ problem. If you want to save the char* instead, then you can simply choose to append '\0' instead of calling toStringz/toUTFz. toStringz has pretty much 0 value if it's forced to copy the string in all cases, and unless checking past the end of the string _and_ guaranteeing that the character one past the end won't change can be done and done efficiently, then the only way to guarantee that then value one past the end won't change is to make a copy, in which case toStringz is essentially pointless. The documentation for toUTFz has an appropriate warning on it, so the issue should be clear. If you want to avoid it, simply append '\0' instead of calling toStringz or toUTFz. In 99.99% of cases, the char* is passed directly to a C function anyway, in which case there is not problem. So, I really don't think that this is really an issue. If we can find an efficient way to make better guarantees, then that would be great, but the risk at this point is _very_ minimal, and there's an easy way to get around it in the _rare_ case where it could be a problem (simply append '\0' instead of calling toStringz/toUTFz). - Jonathan M Davis
Jul 24 2011
prev sibling next sibling parent reply Brad Roberts <braddr puremagic.com> writes:
On Sunday, July 24, 2011 5:56:04 PM, Jonathan M Davis wrote:
 On Sunday 24 July 2011 20:41:47 Johann MacDonagh wrote:
 Both toStringz and toUTFz do something potentially unsafe. Both check
 whether the character after the end of the string is NULL. If so, then
 it simply returns a pointer to the original string. This is a good
 optimization in theory because this code:

 string s = "abc";

 will be a slice to a read-only section of the executable. The compiler
 will insert a NULL after the string in the read-only section. So this:

 auto x = toStringz("abc");

 is efficient. No relocations.

 As  AndrejMitrovic commented in Phobos pull request 123
 https://github.com/D-Programming-Language/phobos/pull/123, this has
 potential issues:

 import std.string;
 import std.stdio;

 struct A
 {
      immutable char[2] foo;
      char[2] bar;
 }

 void main()
 {
      auto a = A("aa", "\0b");
      auto charptr = toStringz
(a.foo[]);
      a.bar = "bo";
      printf(charptr);   // two chars, then garbage
 }

 Another issue not mentioned is with slices. If I do...

 string s = "abc";
 string y = s[];
 string z = y[];

 z ~= '\0';

 auto c = toStringz(y);

 assert(c.ptr == y.ptr);

 ... what happens if I change that last character of z before I pass c to
 the C routine? Bad news. I think this optimization is great, but doesn't
 it go against D's motto of doing "the right thing by default"?

 The question is, how can we keep this optimization so that:

 toStringz("abc");

 remains efficient?

 The capacity field is 0 if the string is in a read-only section *or* if
 the string is on the stack:

 auto x = "abc";
 assert(x.capacity == 0);
 char[3] y = "abc";
 assert(x.capacity == 0);

 So, this isn't safe either. This code:

      char[3] x = "abc";
      char y = '\0';

 will put y right after x, so changing y after c
alling toStringz will
 cause issues.

 In reality, the only time it's safe to do the "peek after end" is if the
 string is in the read-only section. Otherwise, there are potential
 issues (even if they are edge cases).

 Do we care about this? Is there something we can add to druntime arrays
 that will tell whether or not the data backing a slice is in read-only
 memory (or perhaps an enum: read-only, stack, heap, other)? In reality,
 the only time this changes is when a read-only / stack heap is appended
 to, so performance issues are minimal.

 Comments? Ideas?
The _only_ time that this matters is if you actually keep the result of toStringz or toUTFz around. If you do what is almost always done - that is pass the the result of toStringz/toUTFz directly to a function and not save it at all - then there is _zero_ problem. If you want to save the char* instead, then you can simply choose to append '\0' instead of calling toStrin
gz/toUTFz. toStringz has pretty much 0 value if
 it's forced to copy the string in all cases, and unless checking past the end 
 of the string _and_ guaranteeing that the character one past the end won't 
 change can be done and done efficiently, then the only way to guarantee that 
 then value one past the end won't change is to make a copy, in which case 
 toStringz is essentially pointless.
 
 The documentation for toUTFz has an appropriate warning on it, so the issue 
 should be clear. If you want to avoid it, simply append '\0' instead of 
 calling toStringz or toUTFz. In 99.99% of cases, the char* is passed directly 
 to a C function anyway, in which case there is not problem.
 
 So, I really don't think that this is really an issue. If we can find an 
 efficient way to make better guarantees, then that would be great, but the
risk 
 at this point is _very_ minimal, and there's an easy way to get around it in 
 the _rare_ case where it could be a problem
(simply append '\0' instead of
 calling toStringz/toUTFz).
 
 - Jonathan M Davis
Not entire true. There's one more very important way this can cause a serious problem: when the string in question is at the end of a block of memory and the +1 location is outside that. The result will be a segv. While I agree the chances of problem is low, it's certainly not 0 and is the sort of thing that if we saw another language do it we'd laugh and point. This kind of code produces hard to debug problems and doesn't belong in the standard library.
Jul 24 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/24/11 8:05 PM, Brad Roberts wrote:
 Not entire true.  There's one more very important way this can cause a
 serious problem:  when the string in question is at the end of a block
 of memory and the +1 location is outside that.  The result will be a
 segv.
Good point. In fact the implementation already takes care of that. https://github.com/D-Programming-Language/phobos/blob/master/std/string.d#L403 Andrei
Jul 24 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 24 July 2011 17:56:04 Jonathan M Davis wrote:
 On Sunday 24 July 2011 20:41:47 Johann MacDonagh wrote:
 Both toStringz and toUTFz do something potentially unsafe. Both check
 whether the character after the end of the string is NULL. If so, then
 it simply returns a pointer to the original string. This is a good
 optimization in theory because this code:
 
 string s = "abc";
 
 will be a slice to a read-only section of the executable. The compiler
 will insert a NULL after the string in the read-only section. So this:
 
 auto x = toStringz("abc");
 
 is efficient. No relocations.
 
 As  AndrejMitrovic commented in Phobos pull request 123
 https://github.com/D-Programming-Language/phobos/pull/123, this has
 potential issues:
 
 import std.string;
 import std.stdio;
 
 struct A
 {
 
      immutable char[2] foo;
      char[2] bar;
 
 }
 
 void main()
 {
 
      auto a = A("aa", "\0b");
      auto charptr = toStringz(a.foo[]);
      
      a.bar = "bo";
      printf(charptr);   // two chars, then garbage
 
 }
 
 Another issue not mentioned is with slices. If I do...
 
 string s = "abc";
 string y = s[];
 string z = y[];
 
 z ~= '\0';
 
 auto c = toStringz(y);
 
 assert(c.ptr == y.ptr);
 
 ... what happens if I change that last character of z before I pass c to
 the C routine? Bad news. I think this optimization is great, but doesn't
 it go against D's motto of doing "the right thing by default"?
 
 The question is, how can we keep this optimization so that:
 
 toStringz("abc");
 
 remains efficient?
 
 The capacity field is 0 if the string is in a read-only section *or* if
 the string is on the stack:
 
 auto x = "abc";
 assert(x.capacity == 0);
 char[3] y = "abc";
 assert(x.capacity == 0);
 
 So, this isn't safe either. This code:
      char[3] x = "abc";
      char y = '\0';
 
 will put y right after x, so changing y after calling toStringz will
 cause issues.
 
 In reality, the only time it's safe to do the "peek after end" is if the
 string is in the read-only section. Otherwise, there are potential
 issues (even if they are edge cases).
 
 Do we care about this? Is there something we can add to druntime arrays
 that will tell whether or not the data backing a slice is in read-only
 memory (or perhaps an enum: read-only, stack, heap, other)? In reality,
 the only time this changes is when a read-only / stack heap is appended
 to, so performance issues are minimal.
 
 Comments? Ideas?
The _only_ time that this matters is if you actually keep the result of toStringz or toUTFz around. If you do what is almost always done - that is pass the the result of toStringz/toUTFz directly to a function and not save it at all - then there is _zero_ problem. If you want to save the char* instead, then you can simply choose to append '\0' instead of calling toStringz/toUTFz. toStringz has pretty much 0 value if it's forced to copy the string in all cases, and unless checking past the end of the string _and_ guaranteeing that the character one past the end won't change can be done and done efficiently, then the only way to guarantee that then value one past the end won't change is to make a copy, in which case toStringz is essentially pointless. The documentation for toUTFz has an appropriate warning on it, so the issue should be clear. If you want to avoid it, simply append '\0' instead of calling toStringz or toUTFz. In 99.99% of cases, the char* is passed directly to a C function anyway, in which case there is not problem. So, I really don't think that this is really an issue. If we can find an efficient way to make better guarantees, then that would be great, but the risk at this point is _very_ minimal, and there's an easy way to get around it in the _rare_ case where it could be a problem (simply append '\0' instead of calling toStringz/toUTFz).
The real question is what to do with to!(char*)(str). The plan is to make it call toUTFz, but at that point, the warning about toUTFz is not as obvious (though it can re-iterate the warning or point you to the toUTFz documentation to read it). Also, since you already have toUTFz, calling to!(char*) is kind of pointless. So, I think that there's a good argument for forcing to!(char*) to append '\0' instead of checking one past the end. Then when you want a guarantee that the '\0' isn't going to change, you can use to!(char*), and if you want the efficiency, you can call toUTFz. But it is debatable whether we should do that or just have to!(char*) call toUTFz in all cases. I'm leaning towards making it always copy though. - Jonathan M Davis
Jul 24 2011
parent reply Johann MacDonagh <johann.macdonagh.no spam.gmail.com> writes:
On 7/24/2011 9:06 PM, Jonathan M Davis wrote:
 On Sunday 24 July 2011 17:56:04 Jonathan M Davis wrote:
 The real question is what to do with to!(char*)(str). The plan is to make it
 call toUTFz, but at that point, the warning about toUTFz is not as obvious
 (though it can re-iterate the warning or point you to the toUTFz documentation
 to read it). Also, since you already have toUTFz, calling to!(char*) is kind
 of pointless. So, I think that there's a good argument for forcing to!(char*)
 to append '\0' instead of checking one past the end. Then when you want a
 guarantee that the '\0' isn't going to change, you can use to!(char*), and if
 you want the efficiency, you can call toUTFz. But it is debatable whether we
 should do that or just have to!(char*) call toUTFz in all cases. I'm leaning
 towards making it always copy though.

 - Jonathan M Davis
In that case, maybe we should implement schveiguy's suggestion. immutable(char)* toStringz(string s, bool unsafe = true) pure nothrow That way the user can decide whether to take the optimization risk (or if they know the string is on the stack, etc...). In addition, always copying is wasteful. We're usually able to append a NULL to a dynamic array without relocation / copying.
Jul 24 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 24 July 2011 21:57:34 Johann MacDonagh wrote:
 On 7/24/2011 9:06 PM, Jonathan M Davis wrote:
 On Sunday 24 July 2011 17:56:04 Jonathan M Davis wrote:
 The real question is what to do with to!(char*)(str). The plan is to
 make it call toUTFz, but at that point, the warning about toUTFz is not
 as obvious (though it can re-iterate the warning or point you to the
 toUTFz documentation to read it). Also, since you already have toUTFz,
 calling to!(char*) is kind of pointless. So, I think that there's a
 good argument for forcing to!(char*) to append '\0' instead of checking
 one past the end. Then when you want a guarantee that the '\0' isn't
 going to change, you can use to!(char*), and if you want the
 efficiency, you can call toUTFz. But it is debatable whether we should
 do that or just have to!(char*) call toUTFz in all cases. I'm leaning
 towards making it always copy though.
 
 - Jonathan M Davis
In that case, maybe we should implement schveiguy's suggestion. immutable(char)* toStringz(string s, bool unsafe = true) pure nothrow That way the user can decide whether to take the optimization risk (or if they know the string is on the stack, etc...). In addition, always copying is wasteful. We're usually able to append a NULL to a dynamic array without relocation / copying.
If you always append a '\0', then there's no point to toStringz at all. So, sure we _could_ add the unsafe parameter like that, but I seriously question the value of it. The primary value in toStringz is to give you the optimization of looking one past the end of the string and attempting to completely avoid any chance of reallocation. And yes, you'd use ~= which _might_ copy rather than forcing a copy every time, but the cases where you could have just checked one past the end of the array and done nothing if it were '\0' are generally going to be the cases where ~= has to reallocate. So, in reality, you're pretty much going to copy every time that you could have avoided the copy if you had gone with true for unsafe rather than false. - Jonathan M Davis
Jul 24 2011
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Mon, 25 Jul 2011 05:09:17 +0300, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 If you always append a '\0', then there's no point to toStringz at all.
Semantics? toStringz(s) is much more readable than (s~'\0').ptr -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 25 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Jul 2011 20:41:47 -0400, Johann MacDonagh  
<johann.macdonagh.no spam.gmail.com> wrote:

 Both toStringz and toUTFz do something potentially unsafe. Both check  
 whether the character after the end of the string is NULL. If so, then  
 it simply returns a pointer to the original string.
It is of critical importance to note that this *ONLY* happens for immutable data. It is not done for const or mutable strings (instead a zero is always appended).
 Another issue not mentioned is with slices. If I do...

 string s = "abc";
 string y = s[];
 string z = y[];

 z ~= '\0';

 auto c = toStringz(y);

 assert(c.ptr == y.ptr);

 ... what happens if I change that last character of z before I pass c to  
 the C routine?
Not possible. There are two important errors in your assumptions. First, z is an immutable(char)[], so you cannot change the last character without a cast. And if you do that, it's undefined behavior. Second, z.ptr != y.ptr. You cannot append to a ROM string, it will reallocate. However, it's quite possible that *unallocated* data may change. That is: string s = "abc".dup; // it's possible that the data after "abc" is a 0. The GC does not default initialize the unallocated portion of a data block if you are allocating a NO_SCAN block (for efficiency). auto c = toStringz(s); // might just return s.ptr. s ~= 'a'; // now if c == s.ptr, c is affected. However, I still am not concerned about this (see my statement at the end of this message).
 Bad news. I think this optimization is great, but doesn't it go against  
 D's motto of doing "the right thing by default"?

 The question is, how can we keep this optimization so that:

 toStringz("abc");

 remains efficient?

 The capacity field is 0 if the string is in a read-only section *or* if  
 the string is on the stack:

 auto x = "abc";
 assert(x.capacity == 0);
 char[3] y = "abc";
 assert(x.capacity == 0);

 So, this isn't safe either. This code:

      char[3] x = "abc";
      char y = '\0';

 will put y right after x, so changing y after calling toStringz will  
 cause issues.
A not-so-small flaw in this argument is char[3] x is not immutable -- as I stated above, only an immutable string causes the optimization to occur. It's highly unlikely anyone would ever write: immutable char[3] y = "abc";
 In reality, the only time it's safe to do the "peek after end" is if the  
 string is in the read-only section. Otherwise, there are potential  
 issues (even if they are edge cases).

 Do we care about this? Is there something we can add to druntime arrays  
 that will tell whether or not the data backing a slice is in read-only  
 memory (or perhaps an enum: read-only, stack, heap, other)? In reality,  
 the only time this changes is when a read-only / stack heap is appended  
 to, so performance issues are minimal.
I think there could be a compromise solution: If the slice is for immutable data (i.e. string), AND the slice is of a heap-allocated array (the runtime can determine this via a block lookup), AND the characer after the slice is part of the used block data, it would be safe to assume that data is also immutable. There are remote possibilities that this could not be the case, but this involves improbable casting scenarios. Note that the lookup cost is not trivial -- it's O(lgn), and it requires a GC lock. This, of course, would have to be in addition to checking if the string is in ROM. I don't know if this is possible... BTW, I agree with Jonathan that this is a very rare corner-case (it might even be non-existent in real code). I've only ever used toStringz or seen it used as an expression for a parameter -- I've never actually saved the result and did other stuff before passing it. I think the warning he states in documentation is sufficient. -Steve
Jul 25 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday 25 July 2011 09:27:34 Steven Schveighoffer wrote:
 On Sun, 24 Jul 2011 20:41:47 -0400, Johann MacDonagh
 
 <johann.macdonagh.no spam.gmail.com> wrote:
 Both toStringz and toUTFz do something potentially unsafe. Both check
 whether the character after the end of the string is NULL. If so, then
 it simply returns a pointer to the original string.
It is of critical importance to note that this *ONLY* happens for immutable data. It is not done for const or mutable strings (instead a zero is always appended).
That's not quite true actually. It _is_ true for toStringz, because it _always_ returns immutable char*, but because toUTFz has more control over the return type, it does it in every case where it could conceivably avoid a copy. So, C[] -> C*, C[] -> const C*, immutable (C)[] -> const(C)[], and immutable(C)[] -> immutable C* all check one past the end of the array. Now, the non-immutable strings probably end up appending anyway, because they don't get the embedded '\0' insert one past their end by the compiler, but they still try and avoid it. But the non-immutable strings aren't really any more likely to have the problem than immutable ones, since it all hinges on saving the returned pointer, which just isn't done normally. The only real increased risk is if you slice a mutable array where the first element after the slice is '\0', since then you could easily change it, where you couldn't if it were immutable, but again, you have to save the pointer, and the warning makes it clear that that's a situation which would be a problem. - Jonathan M Davis
Jul 25 2011