digitalmars.D.bugs - [Issue 9621] New: std.conv.parseEscape fails on octals and named
- d-bugmail puremagic.com (38/38) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (14/14) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (19/24) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (11/14) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (26/37) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (6/12) Mar 01 2013 I've meant lower, obviously.
- d-bugmail puremagic.com (8/12) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
- d-bugmail puremagic.com (30/32) Mar 01 2013 http://d.puremagic.com/issues/show_bug.cgi?id=9621
http://d.puremagic.com/issues/show_bug.cgi?id=9621

           Summary: std.conv.parseEscape fails on octals and named
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: monarchdodra gmail.com

D allows this:

void main()
{
    string s1 = "\&amp;";
    string s2 = "\141";
    assert(s1 == "&");
    assert(s2 == "a");
}

But parse doesn't allow it (these escapes are not supported in parseEscape).

//----
import std.conv, std.stdio;

void main()
{
    string s1 = `[ "\&amp;", "\141" ]`;
    writeln(parse!(string[])(s1));
}
//----

Can't parse string: Unknown escape character &
Can't parse string: Unknown escape character 1

--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 01 2013
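For readers who want to see what handling the two escape forms from the report involves, here is a minimal sketch in D. It is not the Phobos implementation: `decodeEscapeBody` is a hypothetical helper, and the four-entry entity table is an assumption made for illustration (the full HTML entity list is far larger). The function receives the text that follows the backslash and advances past what it consumed.

```d
import std.exception : enforce;

// Hypothetical helper, not part of Phobos: decodes the escape body that
// follows a backslash and advances `s` past the consumed characters.
dchar decodeEscapeBody(ref string s)
{
    enforce(s.length > 0, "empty escape sequence");

    // Octal escape: one to three digits in the range '0'..'7'.
    // This sketch does not range-check the resulting value.
    if (s[0] >= '0' && s[0] <= '7')
    {
        uint value = 0;
        size_t n = 0;
        while (n < 3 && n < s.length && s[n] >= '0' && s[n] <= '7')
        {
            value = value * 8 + (s[n] - '0');
            ++n;
        }
        s = s[n .. $];
        return cast(dchar) value;
    }

    // Named entity: '&' name ';', looked up in a deliberately tiny table
    // (an assumption for illustration only).
    if (s[0] == '&')
    {
        size_t end = 1;
        while (end < s.length && s[end] != ';')
            ++end;
        enforce(end < s.length, "unterminated named entity");
        string name = s[1 .. end];
        s = s[end + 1 .. $];
        switch (name)
        {
            case "amp":  return '&';
            case "lt":   return '<';
            case "gt":   return '>';
            case "quot": return '"';
            default:     throw new Exception("unknown entity: " ~ name);
        }
    }

    throw new Exception("unsupported escape");
}

void main()
{
    string a = "141 rest";
    assert(decodeEscapeBody(a) == 'a' && a == " rest");

    string b = "&amp;!";
    assert(decodeEscapeBody(b) == '&' && b == "!");
}
```

The octal branch needs no data at all, while the named-entity branch is only as complete as its lookup table.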
http://d.puremagic.com/issues/show_bug.cgi?id=9621

Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com

02:59:43 PST ---
Is it documented anywhere that std.conv.parse should follow D lexer conventions on parsing?

If not, I guess we shouldn't pretend it does and pull the whole freaking table of HTML4/5 entities into *every* program that uses parse to read a couple of ints.
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

> Is it documented anywhere that std.conv.parse should follow D lexer conventions on parsing??

Well, it's kind of implied, isn't it? Why would parse follow a convention other than D's? No, it's not documented, but I do remember Jonathan (I think it was him) saying somewhere in the threads that the idea was to allow parsing pretty much anything that's valid D.

> If not I guess we shouldn't pretend it does and pull the whole freaking table of HTML4/5 entities in *every* program that uses parse to read a couple of ints.

I disagree, because the function *is* named parse, and it is capable of parsing a string and returning the parsed object (in this case a string). If "\"" is a valid D string, then I'd expect parse not to choke on it. As long as the user is parsing a string to an int, then no, he shouldn't need it; but if the parse outcome is a string, there is no excuse not to do it right.

Shouldn't the fact that the table would only ever be used in a template function (parse) mean the compiler is able to know whether or not to link in said table? Or would importing std.conv immediately link the table into the final executable?
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

> If not I guess we shouldn't pretend it does and pull the whole freaking table of HTML4/5 entities in *every* program that uses parse to read a couple of ints.

How does std.uni do it? I mean, if I want to know whether a unicode character is white space, does that mean I'll have to pull in the entire unicode tables for isUpper etc. etc. etc.?

I'm not trying to justify by comparison, but trying to see how other modules deal with this "problem".
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

04:12:34 PST ---
> > If not I guess we shouldn't pretend it does and pull the whole freaking table of HTML4/5 entities in *every* program that uses parse to read a couple of ints.
>
> How does std.uni does it?

That's why I'm increasingly against adding tables that are hidden behind an opaque interface. I feel uneasy about it. That's why I exposed all I could about tables & predefined sets in std.uni. For instance, any set is usable for more than std.uni purposes. I also took tremendous effort not to include tables unless user code needs them, and I will seek new ways to avoid it.

Having a dead HTML5 entity table buried beneath an innocent-looking function is NOT good enough. If we do it, there HAS to be a way to tap into HTML entities so that people wouldn't have to include the VERY SAME table twice should they need full access to HTML5 entities.

> I mean, in the case I want to know if unicode character is white, does it mean I'll have to pull the entire unicode tables for isUpper etc. etc. etc.

Something I'm going to change. Technically there is no reason to pull these tables. Also in case of parse the cost to benefit is far greater since if you use isXXX you surely need the table, period. In case of parse you may easily never hit an escape sequence, or even mean to unescape it in your data, but you'd pay all the same.

> I'm not trying to justify by comparison, but trying to see how other modules work with this "problem".

I thought the goal of std.conv.parse was closer to C's sscanf. In other words, that it's the backbone behind formattedRead, readf etc. If the goal is to parse whatever D strings are, I fail to see the use case, as e.g. std.d.lexer would 100% likely use its own tricks to process escapes etc. to be more efficient.
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

04:13:40 PST ---
> Something I'm going to change. Technically there is no reason to pull these tables. Also in case of parse the cost to benefit is far

I've meant lower, obviously.

> since if you use isXXX you surely need the table, period. In case of parse you may easily never hit escape sequence or even mean to unescape it in your data but you'd pay all the same.
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

04:33:15 PST ---
> > Something I'm going to change. Technically there is no reason to pull these tables. Also in case of parse the cost to benefit is far
>
> I've meant lower, obviously.

Looks like I'm on a streak... for std.conv.parse it's a *higher* cost-to-benefit ratio after all. Sorry for the confusion.
Mar 01 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9621

> I thought std.conv.parse goal was closer to sscanf of C. In other words that it's a backbone behind the formattedRead, readf etc.

I guess the whole discussion then boils down to "what should/does formattedRead accept?" Given that it is "higher order" and capable of parsing arrays of stuff, what happens when it parses a string that represents an array of strings?

I mean, imagine this program:

string s1 = ...
string[] s2;
formattedRead(s1, "%s", &s2);

The question is: what are legal s1 values?

s1 = `["a", "b"]`;            => ["a", "b"]
s1 = `["a", "b", ]`;          => ["a", "b"]     (1)
s1 = `["ab", ['a', 'b']]`     => ["ab", "ab"]
s1 = `["\t", "\n"]`;          => ["\t", "\n"]
s1 = `["\0"]`;                => ["\0"]         (2)
s1 = `["\141"]`;              => ["a"]
s1 = `["\x61"]`;              => ["a"]
s1 = `["\u0061"]`;            => ["a"]
s1 = `["\U00000061"]`;        => ["a"]
s1 = `["\&amp;"]`;            => ["&"]          (3)

(1) //Not currently supported
(2) //Not currently supported
(3) //Not currently supported

Unless formattedRead can document what it does (or should) support and what it doesn't, we'll just run around in circles...
Mar 01 2013
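One way to turn the list of candidate inputs above into something checkable is a small probe that feeds each one to `parse!(string[])` and reports whether it was accepted. This is only a sketch: `accepts` is a hypothetical helper name, the inputs are copied from the list above, and which ones pass depends on the Phobos version in use, so no expected output is given.

```d
import std.conv : parse;
import std.stdio : writefln;

// Hypothetical helper: true if parse!(string[]) consumes `input`
// without throwing. `input` is a by-value copy, so parse may advance it.
bool accepts(string input)
{
    try
    {
        auto parsed = parse!(string[])(input);
        return true;
    }
    catch (Exception)
    {
        return false;
    }
}

void main()
{
    // Candidate inputs from the discussion; results vary by Phobos version.
    string[] inputs = [
        `["a", "b"]`,
        `["a", "b", ]`,
        `["ab", ['a', 'b']]`,
        `["\t", "\n"]`,
        `["\0"]`,
        `["\141"]`,
        `["\x61"]`,
        `["\u0061"]`,
        `["\U00000061"]`,
        `["\&amp;"]`,
    ];

    foreach (input; inputs)
        writefln("%-24s %s", input, accepts(input) ? "accepted" : "rejected");
}
```

Running the probe against a given compiler release gives a concrete starting point for documenting what is and is not supported.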