digitalmars.D.learn - odd behavior of split() function
- Bedros (16/16) Jun 07 2013 I would like to split "A+B+C+D" into "A", "B", "C", "D"
- Jonathan M Davis (24/48) Jun 07 2013 That would be because of your misuse of printf. If you used
- Bedros (6/69) Jun 07 2013 first of all, many thanks for the quick reply.
- Benjamin Thaut (6/81) Jun 07 2013 You can use printf if you want to, the correct usage is not so nice thou...
I would like to split "A+B+C+D" into "A", "B", "C", "D",
but when using split() I get "A+B+C+D", "B+C+D", "C+D", "D".

The code is below:

    import std.stdio;
    import std.string;
    import std.array;

    int main()
    {
        string[] str_list;
        string test_str = "A+B+C+D";
        str_list = test_str.split("+");
        foreach (item; str_list)
            printf("%s\n", cast(char*)item);
        return 0;
    }
Jun 07 2013
On Friday, June 07, 2013 09:18:57 Bedros wrote:
> I would like to split "A+B+C+D" into "A", "B", "C", "D"
> but when using split() I get "A+B+C+D", "B+C+D", "C+D", "D"

That would be because of your misuse of printf. If you used

    foreach(item; str_list)
        writeln(item);

you would have been fine. D string literals do happen to have a null character one past their end so that you can pass them directly to C functions, but D strings in general are _not_ null terminated, and printf expects strings to be null terminated. If you want to convert a D string to a null terminated string, you need to use std.string.toStringz, not a cast. You should pretty much never cast a D string to char* or const char* or any variant thereof. So, you could have done

    printf("%s\n", toStringz(item));

but I don't know why you'd want to use printf rather than writeln or writefln - both of which (unlike printf) are typesafe and understand D types.

You got "A+B+C+D", "B+C+D", "C+D", "D" because the original string (being a string literal) had a null character one past its end, and each of the strings returned by split was a slice of the original string, and printf blithely ignored the actual boundaries of the slice looking for the next null character that it happened to find in memory, which - because they were all slices of the same string literal - happened to be the end of the original string literal. And the strings printed differed, because each slice started in a different portion of the underlying array.

- Jonathan M Davis
Jun 07 2013
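As a minimal sketch of the explanation above (not code from the thread), the following checks that the elements returned by split are slices into the original literal and then prints them in the two ways the reply recommends. The variable names `testStr` and `parts` are made up for illustration, and the pointer check assumes split hands back slices of its input rather than copies, as the reply describes.

```d
import core.stdc.stdio : printf;
import std.array : split;
import std.stdio : writeln;
import std.string : toStringz;

void main()
{
    string testStr = "A+B+C+D";
    string[] parts = testStr.split("+");

    // split itself behaves as expected: four one-character strings.
    assert(parts == ["A", "B", "C", "D"]);

    // Each element is a (pointer, length) slice into the original literal.
    // "B" starts two bytes in and has length 1, with no terminator of its own.
    assert(parts[1].ptr == testStr.ptr + 2);
    assert(parts[1].length == 1);

    foreach (item; parts)
    {
        // writeln understands D strings and respects the slice length.
        writeln(item);

        // toStringz returns a null-terminated C string, so "%s" stops
        // at the end of the slice instead of running on to the literal's end.
        printf("%s\n", toStringz(item));
    }
}
```

Because "B" has length 1 but is followed in memory by "+C+D" and the literal's terminator, a bare `%s` on its pointer reads past the slice, which matches the output in the original question.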
First of all, many thanks for the quick reply.

I'm learning D, and it was just out of habit that I unconsciously used printf instead of writef.

Thanks again.

-Bedros

On Friday, 7 June 2013 at 07:29:48 UTC, Jonathan M Davis wrote:
> That would be because of your misuse of printf.
Jun 07 2013
On 07.06.2013 09:53, Bedros wrote:
> I'm learning D and it's just because of the habit I unconsciously
> used printf instead of writef

You can use printf if you want to; the correct usage is not so nice, though:

    string str = "test";
    // %.*s takes its precision as an int, so the size_t length is cast
    printf("%.*s", cast(int) str.length, str.ptr);

Kind Regards
Benjamin Thaut
Jun 07 2013
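The `%.*s` form mentioned above can be applied to the original split example roughly as follows. This is a sketch, not code from the thread: the names `testStr`, `parts`, and `buf` are illustrative, and the cast to int is an addition here, since C's `%.*s` reads its precision as an int while `.length` is a size_t. It assumes each slice is shorter than int.max.

```d
import core.stdc.stdio : printf, snprintf;
import std.array : split;

void main()
{
    string testStr = "A+B+C+D";
    string[] parts = testStr.split("+");

    foreach (item; parts)
    {
        // "%.*s" consumes an int precision and then a char pointer, so
        // printf reads at most item.length bytes and needs no terminator.
        printf("%.*s\n", cast(int) item.length, item.ptr);
    }

    // Same format through snprintf, so the result can be checked in code:
    // only the single byte belonging to the slice "B" is written.
    char[8] buf;
    int n = snprintf(buf.ptr, buf.length, "%.*s",
                     cast(int) parts[1].length, parts[1].ptr);
    assert(n == 1);
    assert(buf[0 .. n] == "B");
}
```

Unlike toStringz, this approach does not allocate a null-terminated copy, at the cost of passing two arguments per string and keeping their types in step with the format string by hand.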