digitalmars.D.bugs - [Issue 203] New: std.format.doFormat() pads width incorrectly on Unicode strings

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=203

           Summary: std.format.doFormat() pads width incorrectly on Unicode
                    strings
           Product: D
           Version: 0.160
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Keywords: wrong-code
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: deewiant gmail.com


import std.string;

void main() {
        assert(format("%8s", "foo")             == "     foo");
        assert(format("%8s", "foobar")          == "  foobar");
        assert(format("%8s", "hello")           == "   hello");
        assert(format("%8s", "h\u00e9ll\u00f4") == "   h\u00e9ll\u00f4");
        // this passes, though it shouldn't:
        // assert(format("%8s", "h\u00e9ll\u00f4") == " h\u00e9ll\u00f4");
}
--

In the above, the last assertion fails.

One would expect the last two strings, having five characters each, to both be
padded in the front by three spaces: however, it appears the byte count is
being used for determining the length and not the actual character count, and
so the last string is padded by only one space.
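
The arithmetic behind the one-space padding can be checked directly. The sketch below assumes a current D2 compiler and Phobos, where `walkLength` from `std.range` decodes a `string` into code points; it is not the code path `doFormat()` used at the time of the report.

```d
import std.range : walkLength;

void main() {
    string s = "h\u00e9ll\u00f4";   // "héllô"

    // .length on a string counts UTF-8 code units (bytes):
    // h, l, l take one byte each; é and ô take two bytes each.
    assert(s.length == 7);

    // walkLength decodes the string and counts code points.
    assert(s.walkLength == 5);

    // Padding to width 8 by byte count leaves 8 - 7 = 1 space;
    // padding by code point count leaves 8 - 5 = 3 spaces.
    assert(8 - s.length == 1);
    assert(8 - s.walkLength == 3);
}
```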


-- 
Jun 17 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=203


thomas-dloop kuehne.cn changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|wrong-code                  |spec
         OS/Version|Windows                     |All





 One would expect the last two strings, having five characters each,
 to both be padded in the front by three spaces: however, it appears
 the byte count is being used for determining the length and not the
 actual character count, and so the last string is padded by only one
 space.

The only relevant documentation I found is:
 Width
    Specifies the minimum field width. If the width is a *, the next
    argument, which must be of type int, is taken as the width. If
    the width is negative, it is as if the - was given as a Flags
    character.

"Field width" could be interpreted as either "byte length" or "UTF codepoint count".

-- 
Apr 29 2007
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=203






I suggest it's codepoint count, as field width is for display purposes.
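
A minimal sketch of what padding by code point count would look like, assuming a current D2 compiler and Phobos. The helper name `padLeftByCodePoints` is hypothetical and is not part of Phobos or of the eventual fix; it only illustrates the suggested interpretation of field width.

```d
import std.array : replicate;
import std.range : walkLength;

// Hypothetical helper, not part of Phobos: right-justify `s` in a field
// of `width` code points by prepending spaces.
string padLeftByCodePoints(string s, size_t width) {
    immutable n = s.walkLength;   // code points, not bytes
    // Strings already at or beyond the field width are returned unchanged.
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

void main() {
    assert(padLeftByCodePoints("hello", 8) == "   hello");
    assert(padLeftByCodePoints("h\u00e9ll\u00f4", 8) == "   h\u00e9ll\u00f4");
    assert(padLeftByCodePoints("foobarbaz", 8) == "foobarbaz");
}
```

With this interpretation, "hello" and "h\u00e9ll\u00f4" both receive three leading spaces, which is the behaviour the original report expects.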


-- 
Jun 23 2008
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=203


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED





Fixed in dmd 1.032 and 2.016.


-- 
Jul 09 2008