digitalmars.D.bugs - [Issue 203] New: std.format.doFormat() pads width incorrectly on Unicode strings
- d-bugmail puremagic.com (30/30) Jun 17 2006 http://d.puremagic.com/issues/show_bug.cgi?id=203
- d-bugmail puremagic.com (11/21) Apr 29 2007 http://d.puremagic.com/issues/show_bug.cgi?id=203
- d-bugmail puremagic.com (4/4) Jun 23 2008 http://d.puremagic.com/issues/show_bug.cgi?id=203
- d-bugmail puremagic.com (9/9) Jul 09 2008 http://d.puremagic.com/issues/show_bug.cgi?id=203
http://d.puremagic.com/issues/show_bug.cgi?id=203

           Summary: std.format.doFormat() pads width incorrectly on Unicode strings
           Product: D
           Version: 0.160
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Keywords: wrong-code
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: deewiant gmail.com

    import std.string;

    void main() {
        assert(format("%8s", "foo")    == "     foo");
        assert(format("%8s", "foobar") == "  foobar");
        assert(format("%8s", "hello")  == "   hello");
        assert(format("%8s", "h\u00e9ll\u00f4") == "   h\u00e9ll\u00f4");
        // this passes, though it shouldn't:
        assert(format("%8s", "h\u00e9ll\u00f4") == " h\u00e9ll\u00f4");
    }

In the above, the fourth assertion fails and the last one passes. One would expect the last two strings, having five characters each, to both be padded in the front by three spaces: however, it appears the byte count is being used for determining the length and not the actual character count, and so the last string is padded by only one space.

--
Jun 17 2006
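The arithmetic behind the report can be checked directly. The following is a minimal sketch, assuming a D2 Phobos in which `std.utf.count` is available; the helper names `padByCodeUnits` and `padByCodePoints` are hypothetical and are not part of Phobos or of `doFormat()`.

```d
import std.stdio : writeln;
import std.utf : count;

/// Hypothetical helper: padding needed when the field width is
/// measured in UTF-8 code units (bytes), i.e. via .length.
size_t padByCodeUnits(string s, size_t width)
{
    return s.length >= width ? 0 : width - s.length;
}

/// Hypothetical helper: padding needed when the field width is
/// measured in code points, i.e. via std.utf.count.
size_t padByCodePoints(string s, size_t width)
{
    immutable n = count(s);
    return n >= width ? 0 : width - n;
}

void main()
{
    string plain = "hello";
    string accented = "h\u00e9ll\u00f4";

    // Both strings have five code points, but the accented one
    // occupies seven bytes in UTF-8 (U+00E9 and U+00F4 take two each).
    writeln(plain.length, " ", count(plain));       // 5 5
    writeln(accented.length, " ", count(accented)); // 7 5

    writeln(padByCodeUnits(accented, 8));  // 1, the behaviour reported
    writeln(padByCodePoints(accented, 8)); // 3, the behaviour expected
}
```

For pure ASCII input the two measures agree, which is why only the last string in the report is affected.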
http://d.puremagic.com/issues/show_bug.cgi?id=203

thomas-dloop kuehne.cn changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|wrong-code                  |spec
         OS/Version|Windows                     |All

> One would expect the last two strings, having five characters each, to both
> be padded in the front by three spaces: however, it appears the byte count is
> being used for determining the length and not the actual character count, and
> so the last string is padded by only one space.

The only relevant documentation I found is:

> Width
> Specifies the minimum field width. If the width is a *, the next argument,
> which must be of type int, is taken as the width. If the width is negative,
> it is as if the - was given as a Flags character.

"field width" could be interpreted as either "byte length" or "UTF codepoint count".

--
Apr 29 2007
http://d.puremagic.com/issues/show_bug.cgi?id=203

I suggest it's codepoint count, as field width is for display purposes.

--
Jun 23 2008
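The codepoint-count interpretation suggested above could be sketched as follows. This is only an illustration under stated assumptions: `padLeftByCodePoints` is a hypothetical helper, not the Phobos implementation or the actual fix, and it relies on `std.utf.count` and `std.array.replicate` from a D2 Phobos.

```d
import std.array : replicate;
import std.stdio : writeln;
import std.utf : count;

/// Hypothetical helper: right-align `s` in a field of `width`
/// code points, padding on the left with spaces.
string padLeftByCodePoints(string s, size_t width)
{
    immutable n = count(s); // code points, not bytes
    if (n >= width)
        return s;           // width is a minimum, so never truncate
    return replicate(" ", width - n) ~ s;
}

void main()
{
    // Both calls pad with three spaces, matching the expectation
    // in the original report.
    writeln("[", padLeftByCodePoints("hello", 8), "]");
    writeln("[", padLeftByCodePoints("h\u00e9ll\u00f4", 8), "]");
}
```

Code points are still only an approximation of display width, since combining marks and double-width characters do not each occupy one column.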
http://d.puremagic.com/issues/show_bug.cgi?id=203

bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

Fixed dmd 1.032 and 2.016

--
Jul 09 2008