digitalmars.D.bugs - [Issue 5173] New: std.process.shell cannot handle non-UTF8 output
- d-bugmail puremagic.com (57/57) Nov 05 2010 http://d.puremagic.com/issues/show_bug.cgi?id=5173
- d-bugmail puremagic.com (6/6) Nov 05 2010 http://d.puremagic.com/issues/show_bug.cgi?id=5173
- d-bugmail puremagic.com (10/10) Nov 05 2010 http://d.puremagic.com/issues/show_bug.cgi?id=5173
http://d.puremagic.com/issues/show_bug.cgi?id=5173

           Summary: std.process.shell cannot handle non-UTF8 output
           Product: D
           Version: D2
          Platform: All
        OS/Version: Windows
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: lars.holowko gmail.com

std.process.shell dies with an exception when the utility returns UTF-16.
For example:

    import std.process, std.stdio, std.string;

    int main(string[] args)
    {
        auto output = shell("wmic NTDOMAIN GET DomainName /value");
        writefln("Output: %s", output);
        return 0;
    }

produces this output:

    dchar decode(in char[], ref size_t): Invalid UTF-8 sequence [255, 254, 13, 0, 10, 0, 13, 0, 10, 0, 68, 0, 111, 0, 109, 0, 97, 0, 105, 0, 110, 0, 78, 0, 97, 0, 109, 0, 101, 0, 61, 0, 13, 0, 10, 0, 13, 0, 10, 0, 13, 0, 10, 0] around index 0

wmic's output looks like UTF-16 (little endian).

As a workaround, I modified std.process.shell slightly to return a wstring
instead:

    import std.array, std.random, std.file, std.format, std.exception, std.process;

    wstring shell2(string cmd)
    {
        // Build a temporary file name from 8 hex-formatted random numbers.
        auto a = appender!string();
        foreach (ref e; 0 .. 8)
        {
            formattedWrite(a, "%x", rndGen.front);
            rndGen.popFront;
        }
        auto filename = a.data;
        scope(exit) if (exists(filename)) remove(filename);
        // Redirect the command's output to the file, then read it back as UTF-16.
        errnoEnforce(system(cmd ~ "> " ~ filename) == 0);
        return readText!wstring(filename);
    }

With that change, things seem to work for this case. A proper fix, however,
would be to make readText determine the encoding from the prefix (the byte
order mark) and do the necessary conversion before calling std.utf.validate.

readText currently looks like this:

    S readText(S = string)(in char[] name)
    {
        auto result = cast(S) read(name);
        std.utf.validate(result);
        return result;
    }

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 05 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5173

Forgot to mention: this is on 2.050.
Nov 05 2010
http://d.puremagic.com/issues/show_bug.cgi?id=5173

Created an attachment (id=801)
replacement std.file.readText that would fix the issue

The attached std.file.readText function uses the UTF encoding detection
"algorithm" described in TDPL and does the necessary conversions to fix the
described bug.
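
The attachment itself is not reproduced in this thread, so the following is only a minimal sketch of the general idea discussed in the report: look at the leading bytes for a byte order mark, convert to UTF-8, then validate. The helper name decodeWithBom is hypothetical and not part of Phobos. The sketch assumes that input without a BOM is UTF-8, and it leaves out UTF-32 for brevity (a UTF-32LE mark also begins with FF FE, so a complete version would have to test for it first).

```d
import std.utf : toUTF8, validate;

/// Hypothetical helper (not part of Phobos): converts raw bytes to a UTF-8
/// string, choosing the source encoding from a leading byte order mark.
string decodeWithBom(const(ubyte)[] raw)
{
    // Assemble UTF-16 code units by hand so that neither the alignment of
    // the buffer nor the endianness of the host matters. A trailing odd
    // byte, if any, is ignored.
    static wstring utf16(const(ubyte)[] bytes, bool littleEndian)
    {
        auto units = new wchar[bytes.length / 2];
        foreach (i, ref u; units)
        {
            immutable b0 = bytes[2 * i];
            immutable b1 = bytes[2 * i + 1];
            u = cast(wchar)(littleEndian ? (b0 | (b1 << 8)) : ((b0 << 8) | b1));
        }
        return units.idup;
    }

    string text;
    if (raw.length >= 3 && raw[0] == 0xEF && raw[1] == 0xBB && raw[2] == 0xBF)
        text = (cast(const(char)[]) raw[3 .. $]).idup;   // UTF-8 BOM: strip it
    else if (raw.length >= 2 && raw[0] == 0xFF && raw[1] == 0xFE)
        text = toUTF8(utf16(raw[2 .. $], true));         // UTF-16 little endian
    else if (raw.length >= 2 && raw[0] == 0xFE && raw[1] == 0xFF)
        text = toUTF8(utf16(raw[2 .. $], false));        // UTF-16 big endian
    else
        text = (cast(const(char)[]) raw).idup;           // no BOM: assume UTF-8

    validate(text); // throws if the result is not well-formed UTF-8
    return text;
}
```

The byte dump in the error message above starts with 255, 254 (FF FE), the UTF-16 little-endian mark, so input like wmic's would take the second branch here instead of failing UTF-8 validation at index 0.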
Nov 05 2010