digitalmars.D.bugs - inconsistent behavior of std.string.split
- zwang (20/20) Aug 20 2005 According to the documentation:
- Jarrett Billingsley (7/27) Aug 20 2005 Yeah, the one that takes a delimiter string should skip any zero-length
- zwang (5/40) Aug 20 2005 Keeping zero-length strings is sometimes useful, for example, when parsi...
- Jarrett Billingsley (3/7) Aug 20 2005 Good point.
According to the documentation:
<spec>
char[][] split(char[] s)
Split s[] into an array of words, using whitespace as the delimiter.
char[][] split(char[] s, char[] delim)
Split s[] into an array of words, using delim[] as the delimiter.
</spec>
Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v").
But the former function discards empty strings while the latter does not.
The following example demonstrates the difference.
<code>
import std.stdio;
import std.string;

void main() {
    writefln(std.string.split("0  3", " ")); // [0,,3]
    writefln(std.string.split("0  3"));      // [0,3]
    writefln(std.string.split("    ", " ")); // [,,,,]
    writefln(std.string.split("    "));      // []
}
</code>
Aug 20 2005
Yeah, the one that takes a delimiter string should skip any zero-length
strings in-between delimiters. The whitespace one will keep skipping
characters until it hits a non-whitespace one, but the delimiter one will
create a new string after every delimiter, when it should just keep reading
delimiters until it hits a non-delimiter sequence.
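The behaviour described here, treating a run of consecutive delimiters as a single separator, could be sketched roughly as follows. This is only an illustration, assuming a present-day D2 compiler (so `string` rather than the `char[]` in the quoted spec); `splitSkippingEmpty` is a made-up name and not part of Phobos.

```d
import std.string : indexOf;

/// Hypothetical helper (not a Phobos function): splits `s` on `delim`
/// and drops the zero-length pieces between consecutive delimiters.
string[] splitSkippingEmpty(string s, string delim)
{
    assert(delim.length > 0, "delimiter must be non-empty");

    string[] words;
    size_t start = 0;
    while (start <= s.length)
    {
        // Position of the next delimiter relative to `start`, or -1 if none.
        auto rel = s[start .. $].indexOf(delim);
        size_t end = (rel < 0) ? s.length : start + rel;

        // Only keep the piece if it is non-empty.
        if (end > start)
            words ~= s[start .. end];

        // Continue after the delimiter (or past the end, which ends the loop).
        start = end + delim.length;
    }
    return words;
}

unittest
{
    assert(splitSkippingEmpty("0  3", " ") == ["0", "3"]);
    assert(splitSkippingEmpty("    ", " ").length == 0);
}
```

With this variant, the two-argument calls in the example above would give the same results as the whitespace-only calls.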
Aug 20 2005
Keeping zero-length strings is sometimes useful, for example when parsing a CSV or tab-delimited file. A better solution might be two versions of split that handle consecutive delimiters differently, or two further overloads of split for the special case of whitespace delimiters.
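The "two versions" idea might look something like the sketch below, which shows why a tab-delimited row needs its empty fields kept. It assumes a current D2 toolchain and leans on `std.array.split` and `std.algorithm.iteration.filter`; the names `splitKeepEmpty` and `splitSkipEmpty` are hypothetical and only serve the illustration.

```d
import std.algorithm.iteration : filter;
import std.array : array, split;

/// Hypothetical variant 1: every delimiter starts a new field,
/// so empty fields survive (what a CSV/TSV parser wants).
string[] splitKeepEmpty(string s, string delim)
{
    return s.split(delim);
}

/// Hypothetical variant 2: same split, but zero-length fields are dropped,
/// so consecutive delimiters behave like one.
string[] splitSkipEmpty(string s, string delim)
{
    return s.split(delim).filter!(field => field.length > 0).array;
}

unittest
{
    // A tab-delimited row whose middle column is empty.
    auto row = "id\t\tname";
    assert(splitKeepEmpty(row, "\t") == ["id", "", "name"]);
    assert(splitSkipEmpty(row, "\t") == ["id", "name"]);
}
```

Dropping the empty field would shift "name" into the second column, which is the failure mode that keeping zero-length strings avoids.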
Aug 20 2005
Good point.
Aug 20 2005
"Jarrett Billingsley" <kb3ctd2 yahoo.com>