digitalmars.D.learn - strtok?
I have this C program (not written by me) that uses strtok. To port it to D, I wrote this:

//--------------------------------------------------------------
extern(C) char * strtok (char * strToken, char * strDelimit);

char [] tokenize(char [] str, char [] sep)
{
    char * arg1, arg2, res;
    arg2 = toStringz(sep);
    arg1 = (str.length>0) ? toStringz(str) : null;
    res = strtok(arg1,arg2);
    return toString(res);
}
//--------------------------------------------------------------

I would like to have a D only version of this. However, I'm not sure what strtok does. Does anybody know how to do this?

-- 
Carlos Santander Bernal
JP2, you'll always live in our minds
Apr 09 2005
On Sat, 09 Apr 2005 11:32:02 -0500, Carlos Santander B. <csantander619 gmail.com> wrote:

> I have this C program (not written by me) that uses strtok. [...]
> I would like to have a D only version of this. However, I'm not sure what strtok does.

From MSDN:

    char *strtok( char *strToken, const char *strDelimit );
    wchar_t *wcstok( wchar_t *strToken, const wchar_t *strDelimit );
    unsigned char *_mbstok( unsigned char *strToken, const unsigned char *strDelimit );

All of these functions return a pointer to the next token found in strToken. They return NULL when no more tokens are found. Each call modifies strToken by substituting a NULL character for each delimiter that is encountered.

The strtok function finds the next token in strToken. The set of characters in strDelimit specifies possible delimiters of the token to be found in strToken on the current call. wcstok and _mbstok are wide-character and multibyte-character versions of strtok. The arguments and return value of wcstok are wide-character strings; those of _mbstok are multibyte-character strings. These three functions behave identically otherwise.

On the first call to strtok, the function skips leading delimiters and returns a pointer to the first token in strToken, terminating the token with a null character. More tokens can be broken out of the remainder of strToken by a series of calls to strtok. Each call to strtok modifies strToken by inserting a null character after the token returned by that call. To read the next token from strToken, call strtok with a NULL value for the strToken argument. The NULL strToken argument causes strtok to search for the next token in the modified strToken. The strDelimit argument can take any value from one call to the next so that the set of delimiters may vary.

Warning: Each of these functions uses a static variable for parsing the string into tokens. If multiple or simultaneous calls are made to the same function, a high potential for data corruption and inaccurate results exists. Therefore, do not attempt to call the same function simultaneously for different strings and be aware of calling one of these functions from within a loop where another routine may be called that uses the same function. However, calling this function simultaneously from multiple threads does not have undesirable effects.

> Does anybody know how to do this?

D can do much better than C: using slices, you can tokenize a string without modifying it and return all the results in an array.

import std.stdio;
import std.string;

char[][] tokenise(char[] input, char[] tokens)
{
    char[][] res = null;
    int start = -1;    // index where the current token began, -1 if none is open

    foreach(int i, char c; input)
    {
        if (tokens.find(c) == -1)
        {
            // c is not a delimiter: open a token if none is open
            if (start == -1) start = i;
        }
        else
        {
            // c is a delimiter: close the open token, if any
            if (start != -1)
            {
                res ~= input[start..i];
                start = -1;
            }
        }
    }
    // a token that runs to the end of the input
    if (start != -1) res ~= input[start..$];
    return res;
}

void main()
{
    char[] input = ",ab.c,,..def,.,g,,h..i,,jkl,";
    writefln(input);
    foreach(char[] s; tokenise(input,",."))
        writefln(s);
}

Regan
Apr 10 2005
Regan Heath wrote:

> From MSDN:
> char *strtok( char *strToken, const char *strDelimit );
> [...]

Thanks for that.

> D can do much better than C [...]

And especially thanks for that!

-- 
Carlos Santander Bernal
JP2, you'll always live in our minds
Apr 10 2005