digitalmars.D.bugs - inconsistent behavior of std.string.split
- zwang (20/20) Aug 20 2005 According to the documentation:
- Jarrett Billingsley (7/27) Aug 20 2005 Yeah, the one that takes a delimiter string should skip any zero-length
- zwang (5/40) Aug 20 2005 Keeping zero-length strings is sometimes useful, for example, when parsi...
- Jarrett Billingsley (3/7) Aug 20 2005 Good point.
According to the documentation:

<spec>
char[][] split(char[] s)
    Split s[] into an array of words, using whitespace as the delimiter.

char[][] split(char[] s, char[] delim)
    Split s[] into an array of words, using delim[] as the delimiter.
</spec>

Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v"). But the former function discards empty lines while the latter does not. The following example demonstrates the difference.

<code>
import std.stdio;
import std.string;

void main(){
    writefln(std.string.split("0  3"," "));  //[0,,3]
    writefln(std.string.split("0  3"));      //[0,3]
    writefln(std.string.split("    "," "));  //[,,,,]
    writefln(std.string.split("    "));      //[]
}
</code>
Aug 20 2005
"zwang" <nehzgnaw gmail.com> wrote in message news:de7c7e$17au$1 digitaldaemon.com...According to the documentation: <spec> char[][] split(char[] s) Split s[] into an array of words, using whitespace as the delimiter. char[][] split(char[] s, char[] delim) Split s[] into an array of words, using delim[] as the delimiter. </spec> Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v"). But the former function discards empty lines while the latter does not. The following example demonstrates the difference. <code> import std.stdio; import std.string; void main(){ writefln(std.string.split("0 3"," ")); //[0,,3] writefln(std.string.split("0 3")); //[0,3] writefln(std.string.split(" "," ")); //[,,,,] writefln(std.string.split(" ")); //[] } </code>Yeah, the one that takes a delimiter string should skip any zero-length strings in-between delimiters. The whitespace one will keep skipping characters until it hits a non-whitespace one, but the delimiter one will create a new string after every delimiter, when it should just keep reading delimiters until it hits a non-delimiter sequence.
Aug 20 2005
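For illustration, the behavior described in the reply above (keep reading delimiters until a non-delimiter sequence starts) could be sketched as follows. This is a minimal sketch in present-day D rather than the 2005 Phobos under discussion; `splitSkipEmpty` is a hypothetical name, not a library function.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical helper (not part of Phobos): split s on delim and drop the
// zero-length pieces that consecutive delimiters would otherwise produce.
string[] splitSkipEmpty(string s, string delim)
{
    assert(delim.length > 0, "delimiter must not be empty");
    string[] words;
    size_t start = 0;
    while (start <= s.length)
    {
        // Position of the next delimiter relative to start, or -1 if none.
        auto rel = s[start .. $].indexOf(delim);
        size_t end = (rel < 0) ? s.length : start + cast(size_t) rel;
        if (end > start)            // skip zero-length pieces
            words ~= s[start .. end];
        start = end + delim.length; // continue after the delimiter
    }
    return words;
}

void main()
{
    writeln(splitSkipEmpty("0  3", " ")); // ["0", "3"]
    writeln(splitSkipEmpty("    ", " ")); // []
}
```

With this variant the delimiter overload gives the same results as the whitespace overload for the inputs in the original post, at the cost of losing empty fields.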
Jarrett Billingsley wrote:
> Yeah, the one that takes a delimiter string should skip any zero-length
> strings in-between delimiters.

Keeping zero-length strings is sometimes useful, for example, when parsing a CSV or tab-delimited file. A better solution might be two versions of split that handle consecutive delimiters differently, or another two overloaded split functions for the special case of whitespace delimiters.
Aug 20 2005
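The "two versions of split" idea from the post above might look roughly like this. It is only a sketch in present-day D, assuming `splitter` and `filter` from `std.algorithm.iteration`; the names `splitFields` and `splitWords` are made up for the example and do not exist in Phobos.

```d
import std.algorithm.iteration : filter, splitter;
import std.array : array;
import std.stdio : writeln;

// Hypothetical: keeps zero-length pieces, so field positions are preserved
// (what a CSV or tab-delimited parser wants).
string[] splitFields(string s, string delim)
{
    return s.splitter(delim).array;
}

// Hypothetical: drops zero-length pieces, so runs of delimiters collapse
// (what a word splitter wants).
string[] splitWords(string s, string delim)
{
    return s.splitter(delim).filter!(w => w.length > 0).array;
}

void main()
{
    writeln(splitFields("a,,c", ",")); // ["a", "", "c"]
    writeln(splitWords("a,,c", ","));  // ["a", "c"]
}
```

Keeping the two behaviors under separate names makes the choice explicit at the call site, so the empty middle field of a record such as "a,,c" is not silently lost.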
"zwang" <nehzgnaw gmail.com> wrote in message news:de7e7l$18se$1 digitaldaemon.com...Keeping zero-length strings is sometimes useful, for example, when parsing a CSV or tab-delimited file. A better solution might be two versions of split that handle consecutive delimiters differently. Or another two overloaded split functions for the special case of whitespace delimiterGood point.
Aug 20 2005