www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 15136] New: If we want toStringz to be fully correct, it

https://issues.dlang.org/show_bug.cgi?id=15136

          Issue ID: 15136
           Summary: If we want toStringz to be fully correct, it needs to
                    stop checking for '\0'
           Product: D
           Version: D2
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P1
         Component: phobos
          Assignee: nobody puremagic.com
          Reporter: issues.dlang jmdavisProg.com

I believe that this has been brought before (by Andrej Mitrovic IIRC), but
toStringz has never been changed to fix the problem. This code succeeds:

import std.string;

void main()
{
    auto s = S("01234");
    auto str = s.str.toStringz;
    assert(str == s.str.ptr);
    assert(*(str + 5) == 0); // Null terminated.
    s.foo = 42;
    assert(*(str + 5) == 42); // No longer null terminated.
}

struct S
{
    immutable char[5] str;
    ubyte foo;

    this(char[5] str)
    {
        this.str = str;
    }
}

We have managed to create a null-terminated string which stops being
null-terminated after its creation, because the null terminator that toStringz
found was actually in the member variable after the static array, not in the
static array itself. Now, fixing it so that toStringz had a separate overload
for static arrays would fix the code as it stands, but simply slicing the
static array before passing it to toStringz would get around that, making this
a problem again.

What we _really_ want for toStringz to do is to detect string literals, and if
we could do that, then we'd be home free, because toStringz could return the
ptr value for a string literal and append '\0' to everything else, but AFAIK,
we have no way to detect string literals.

Realistically, this problem is going to be extremely rare, but it _is_
possible, and it does potentially result in buffer overflows, which could mean
that a program using toStringz could end up with a security problem in spite of
D's array bounds checking and supposed guarantee from toStringz that the string
is null-terminated. So, unless we want to take the approach that this example
is so unlikely that we don't want to worry about it and just assume that no one
is going to have buffer overflows because of this, I think that we're forced to
make it so that toStringz always appends even for immutable(char)[] - either
that, or we find a way to detect string literals (which would be the ideal
solution), but I have no idea how that could be done.

--
Oct 01 2015