digitalmars.D.learn - xml utf-8 encoding error
- graw-prog (60/65) Aug 28 2017 Hi, I'm having some trouble getting an xml file using
- Kagamin (3/4) Aug 29 2017 Should be
- Adam D. Ruppe (2/4) Aug 29 2017 I'm pretty sure both are equally legal.
- ag0aep6g (3/8) Aug 29 2017 HTTP allows a quoted string there.
- Adam D. Ruppe (8/12) Aug 29 2017 It looks like a bug in Phobos:
Hi, I'm having some trouble getting an XML file using std.net.curl. I'm using get() to receive device info from a Roku television using this code:

    char[] inputQuery(string input)
    {
        string url = ip ~ "query/" ~ input;
        auto client = HTTP();
        auto content = get(url,client);
        return content;
    }

I've had no problem in the past using similar code to receive JSON data from a web server. However, in this case I run into this error:

    std.encoding.EncodingException std/encoding.d(2346): Unrecognized Encoding: "utf-8"
    ----------------
    ??:? std.encoding.EncodingScheme std.encoding.EncodingScheme.create(immutable(char)[]) [0x9773ff]
    /usr/include/dmd/phobos/std/net/curl.d:1196 char[] std.net.curl._decodeContent!(char)._decodeContent(ubyte[], immutable(char)[]) [0x7951cb]
    /usr/include/dmd/phobos/std/net/curl.d:1049 char[] std.net.curl._basicHTTP!(char)._basicHTTP(const(char)[], const(void)[], std.net.curl.HTTP) [0x793559]
    /usr/include/dmd/phobos/std/net/curl.d:540 char[] std.net.curl.get!(std.net.curl.HTTP, char).get(const(char)[], std.net.curl.HTTP) [0x795b77]
    source/backend.d:26 immutable(char)[] backend.inputQuery(immutable(char)[]) [0x7886bd]

When I use cURL directly to get the info I get this:

    curl -v --request GET http://192.168.1.140:8060/query/device-info
    > GET /query/device-info HTTP/1.1
    > Host: 192.168.1.140:8060
    > User-Agent: curl/7.47.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: Roku UPnP/1.0 MiniUPnPd/1.4
    < Content-Length: 1826
    < Cache-Control: no-cache
    < Content-Type: text/xml; charset="utf-8"
    <
    <?xml version="1.0" encoding="UTF-8" ?>

This seems to be the relevant code in curl.d:

    private auto _decodeContent(T)(ubyte[] content, string encoding)
    {
        static if (is(T == ubyte))
        {
            return content;
        }
        else
        {
            import std.format : format;

            // Optimally just return the utf8 encoded content
            if (encoding == "UTF-8"||encoding == "utf-8")
                return cast(char[])(content);

            // The content has to be re-encoded to utf8
            auto scheme = EncodingScheme.create(encoding);
            enforce!CurlException(scheme !is null,
                                  format("Unknown encoding '%s'", encoding));

I'm not sure what the problem is. It seems it may be the lowercase 'utf-8' in the charset section, but I'm not sure if the problem is some mistake I made, a bug in DMD, or just lousy XML. Either way, is there any way around this issue?
Aug 28 2017
On Tuesday, 29 August 2017 at 04:41:34 UTC, graw-prog wrote:
> < Content-Type: text/xml; charset="utf-8"

Should be

    Content-Type: text/xml; charset=utf-8
Aug 29 2017
On Tuesday, 29 August 2017 at 15:41:58 UTC, Kagamin wrote:
> Should be
> Content-Type: text/xml; charset=utf-8

I'm pretty sure both are equally legal.
Aug 29 2017
On 08/29/2017 05:41 PM, Kagamin wrote:
> On Tuesday, 29 August 2017 at 04:41:34 UTC, graw-prog wrote:
>> < Content-Type: text/xml; charset="utf-8"
>
> Should be
> Content-Type: text/xml; charset=utf-8

HTTP allows a quoted string there.
https://tools.ietf.org/html/rfc7231#section-3.1.1.1
Aug 29 2017
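To illustrate the point made in the reply above, here is a minimal sketch of reading the charset parameter so that both the quoted and the unquoted form yield the same value. The helper name charsetOf is hypothetical and not part of Phobos. The sketch also simplifies the grammar: it splits naively on ';' and ignores backslash escapes inside quoted strings.

```d
import std.array : split;
import std.string : indexOf, strip, toLower;

/// Hypothetical helper (not a Phobos function): returns the charset
/// parameter of a Content-Type header value, or null if there is none.
string charsetOf(string contentType)
{
    // Naive split: assumes no ';' appears inside a quoted parameter value.
    auto parts = contentType.split(";");
    if (parts.length < 2)
        return null;

    foreach (param; parts[1 .. $])
    {
        auto eq = param.indexOf('=');
        if (eq < 0)
            continue;

        // Parameter names are case-insensitive.
        if (param[0 .. eq].strip.toLower != "charset")
            continue;

        auto value = param[eq + 1 .. $].strip;

        // Accept the quoted-string form by dropping the surrounding quotes.
        if (value.length >= 2 && value[0] == '"' && value[$ - 1] == '"')
            value = value[1 .. $ - 1];

        return value;
    }
    return null;
}
```

With this, charsetOf(`text/xml; charset="utf-8"`) and charsetOf("text/xml; charset=utf-8") both give utf-8. Since charset names are case-insensitive, a caller would probably also want to compare the result without regard to case.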
On Tuesday, 29 August 2017 at 04:41:34 UTC, graw-prog wrote:
> I'm not sure what the problem is. It seems it may be the lowercase 'utf-8' in the charset section, but I'm not sure if the problem is some mistake I made, a bug in DMD, or just lousy XML. Either way, is there any way around this issue?

It looks like a bug in Phobos:

http://dpldocs.info/experimental-docs/source/std.net.curl.d.html#L2470

That's where it populates the charset that it passes to Phobos' (woefully inadequate, btw) encoding decoder... and it doesn't handle the quotes correctly according to the HTTP standard.

I guess you could probably hack it by editing your copy of Phobos, or change your server to remove the quotes.
Aug 29 2017
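Besides the two options in the reply above, one possible workaround is to sidestep the charset handling altogether. The _decodeContent snippet in the original post returns the content untouched when T is ubyte, so requesting raw bytes should avoid the EncodingScheme lookup. The sketch below assumes the response body really is UTF-8, as the XML declaration in the cURL output claims. The names bytesToUtf8 and inputQueryRaw are hypothetical, and unlike the original inputQuery this variant takes the full URL as its argument.

```d
import std.net.curl : get, HTTP;
import std.utf : validate;

/// Hypothetical helper: reinterprets raw bytes as UTF-8 text.
/// Throws a UTFException if the bytes are not valid UTF-8.
char[] bytesToUtf8(ubyte[] raw)
{
    auto text = cast(char[]) raw;
    validate(text);
    return text;
}

/// Hypothetical variant of inputQuery from the original post.
/// Assumption: the server sends UTF-8, whatever the charset parameter says.
char[] inputQueryRaw(string url)
{
    auto client = HTTP();

    // With T == ubyte, std.net.curl returns the body without decoding it,
    // so the quoted charset value is never passed to EncodingScheme.create.
    ubyte[] raw = get!(HTTP, ubyte)(url, client);

    return bytesToUtf8(raw);
}
```

The trade-off is that the Content-Type charset is ignored entirely, so this is only reasonable when the encoding of the device's responses is known in advance.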
On Tuesday, 29 August 2017 at 15:55:50 UTC, Adam D. Ruppe wrote:
> http://dpldocs.info/experimental-docs/source/std.net.curl.d.html#L2470

Ow, annotated sources, cool.

    pre {
        box-sizing: border-box;
        overflow: auto;
        max-width: 800px; /* The script sets the real one */
        max-width: calc(80vw - 16em - 4em);
    }

Hmm... AFAIK free side space on pages is left so that the content is not too wide in characters, not because people like free side space :) But for preformatted text such a limit makes little sense; it's only for word-wrapped text. I'd say code should take all the width it wants.
Aug 29 2017
On Tuesday, 29 August 2017 at 15:55:50 UTC, Adam D. Ruppe wrote:
> On Tuesday, 29 August 2017 at 04:41:34 UTC, graw-prog wrote:
>> I'm not sure what the problem is. It seems it may be the lowercase 'utf-8' in the charset section, but I'm not sure if the problem is some mistake I made, a bug in DMD, or just lousy XML. Either way, is there any way around this issue?
>
> It looks like a bug in Phobos:
>
> http://dpldocs.info/experimental-docs/source/std.net.curl.d.html#L2470
>
> That's where it populates the charset that it passes to Phobos' (woefully inadequate, btw) encoding decoder... and it doesn't handle the quotes correctly according to the HTTP standard.
>
> I guess you could probably hack it by editing your copy of Phobos, or change your server to remove the quotes.

Thank you for the explanation. I guess I'll have to take a look in Phobos and see if I can figure out how to make it work. I'm getting the data using the API built into the TV, so I don't think I can change anything on the server side. Thank you everybody for your help.
Aug 29 2017