digitalmars.D - toStringz and predictability
- Ben Hinkle (31/31) Jan 18 2005 There's something about toStringz that has me uncomfortable. Consider th...
- Walter (14/23) Jan 18 2005 It's "COW" (Copy On Write) to the rescue. The idea is only modify a string...
- Ben Hinkle (6/29) Jan 18 2005 But the string doesn't necessarily own the byte after the string. It's a...
- Anders F Björklund (7/19) Jan 18 2005 Yes, it does work for string literals and for dynamic arrays...
- Ben Hinkle (19/20) Jan 18 2005 Actually it doesn't even work for dynamic arrays:
- Anders F Björklund (9/53) Jan 18 2005 Funny, I was just writing that :-)
- Ben Hinkle (28/51) Jan 19 2005 In case you need another example, I can imagine just the act of calling a...
- parabolis (14/23) Jan 19 2005 Would this implementation work?
- Ben Hinkle (18/41) Jan 19 2005 Nice idea. I think it's on the right track. I've cleaned it up a bit:
- Lukas Pinkowski (19/29) Jan 19 2005 Hm, doesn't D initialize uninitialized chars to 0 (here str[length-1]), ...
- Ben Hinkle (9/38) Jan 19 2005 the initializer for char is 0xFF.
- Georg Wrede (18/23) Jan 24 2005 What bothers me is, if a string gets repeatedly passed, say, between a
- Anders F Björklund (14/51) Jan 24 2005 Isn't that just what "string.length = string.length + 1" does, anyway ?
- Ben Hinkle (5/5) Jan 19 2005 another version:
- Ben Hinkle (12/35) Jan 20 2005 ok, one last try. Walter, I can't tell if you still think this counts as...
- Anders F Björklund (26/50) Jan 18 2005 That's dependent on the compiler, and the alignment:
- Ben Hinkle (10/35) Jan 18 2005 That's because the "new" allocates space on the heap and so it has nothi...
- Anders F Björklund (4/8) Jan 18 2005 Never mind, I was thinking in C (just because it is implemented
- Anders F Björklund (9/28) Jan 18 2005 Right, I think I only got lucky because how it allocates memory...
- parabolis (18/19) Jan 19 2005 There is something else that you should be uncomfortable about - the
- Ben Hinkle (6/14) Jan 19 2005 I hate to disagree but.. that doesn't bother me. I don't see anything wr...
- parabolis (4/25) Jan 19 2005 ----------------------------------------------------------------
- Matthew (2/18) Jan 21 2005 Has there been debate about unless/until? If so, count me on the list of...
- parabolis (8/35) Jan 21 2005 Yes back around the time the digitalmars.d newsgroup started:
There's something about toStringz that has me uncomfortable. Consider this code:

    import std.string;
    int main() {
        char* x;
        uint b1;
        char[4] y;
        uint b2;
        y[0] = 'a';
        y[1] = 'b';
        y[2] = 'c';
        y[3] = 'd';
        x = toStringz(y);
        printf("x length is %d, ptr %p b1 %p b2 %p\n",strlen(x),x,&b1,&b2);
        b1 = 0x11223344;
        b2 = 0x11223344;
        printf("x length is %d, ptr %p b1 %p b2 %p\n",strlen(x),x,&b1,&b2);
        return 0;
    }

Here's what it prints when I run it:

    x length is 4, ptr 0xfefff870 b1 0xfefff86c b2 0xfefff874
    x length is 17, ptr 0xfefff870 b1 0xfefff86c b2 0xfefff874

The reason why the length changed is that toStringz looks at one past the length of the string to see if it is 0 and does nothing to the string if it is. But the sample code then changes the byte past the string by touching a completely different variable and so the toStringz result is "corrupted". I have toStringz calls sprinkled through my code when I call C functions and now I'm starting to get nervous about the lifespans of those strings and how to figure out if they are valid or not. Thoughts? Walter, is there a guideline I should follow? The most extreme one that comes to mind is "only call toStringz for strings that get immediately copied".

-Ben
Jan 18 2005
"Ben Hinkle" <Ben_member pathlink.com> wrote in message news:csj4hq$1cvi$1 digitaldaemon.com...
> The reason why the length changed is that toStringz looks at one past the length of the string to see if it is 0 and does nothing to the string if it is. But the sample code then changes the byte past the string by touching a completely different variable and so the toStringz result is "corrupted". I have toStringz calls sprinkled through my code when I call C functions and now I'm starting to get nervous about the lifespans of those strings and how to figure out if they are valid or not. Thoughts? Walter, is there a guideline I should follow? The most extreme one that comes to mind is "only call toStringz for strings that get immediately copied".

It's "COW" (Copy On Write) to the rescue. The idea is only modify a string that you know is unique. If you don't know it is unique, make a copy of it before modifying it. After the toStringz(), you're modifying the argument to toStringz() but there's another reference to that string that expects it to not change.
Jan 18 2005
In article <csjffu$1qtp$1 digitaldaemon.com>, Walter says...
> "Ben Hinkle" <Ben_member pathlink.com> wrote in message news:csj4hq$1cvi$1 digitaldaemon.com...
>> The reason why the length changed is that toStringz looks at one past the length of the string to see if it is 0 and does nothing to the string if it is. But the sample code then changes the byte past the string by touching a completely different variable and so the toStringz result is "corrupted". I have toStringz calls sprinkled through my code when I call C functions and now I'm starting to get nervous about the lifespans of those strings and how to figure out if they are valid or not. Thoughts? Walter, is there a guideline I should follow? The most extreme one that comes to mind is "only call toStringz for strings that get immediately copied".
>
> It's "COW" (Copy On Write) to the rescue. The idea is only modify a string that you know is unique. If you don't know it is unique, make a copy of it before modifying it.

But the string doesn't necessarily own the byte after the string. It's a random piece of memory. Even if the string is living on the heap the byte one past the array can be changed at pretty much any time by anything. Modifying the byte following a string is different than modifying a string.

> After the toStringz(), you're modifying the argument to toStringz() [...]

Actually I'm not. I'm modifying another variable.
Jan 18 2005
Ben Hinkle wrote:
> But the string doesn't necessarily own the byte after the string. It's a random piece of memory. Even if the string is living on the heap the byte one past the array can be changed at pretty much any time by anything. Modifying the byte following a string is different than modifying a string.

The bug is in std/string.d:

    p = &string[0] + string.length;

    // Peek past end of string[], if it's 0, no conversion necessary.
    // Note that the compiler will put a 0 past the end of static
    // strings, and the storage allocator will put a 0 past the end
    // of newly allocated char[]'s.
    if (*p == 0)
        return string;

Yes, it does work for string literals and for dynamic arrays...
But it doesn't work for slices of pointers, or static arrays?

Unless there is a way to separate them, it should be avoided.
(since with the pointers/statics, the byte after is off-limits)

--anders
Jan 18 2005
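The copy-based alternative to the peek shown above can be sketched in present-day D (the thread predates D2, so the syntax differs from the snippets in these posts). This is only a sketch: the name `toStringzCopy` is hypothetical and it is not the Phobos implementation. It just illustrates that a freshly allocated buffer owns its own terminator byte, so nothing outside the result can overwrite it.

```d
import core.stdc.string : strlen;

// Hypothetical helper (not the Phobos implementation): always allocate
// a fresh buffer, so the terminator lives in memory the result owns.
char* toStringzCopy(const(char)[] s)
{
    auto copy = new char[s.length + 1];
    copy[0 .. s.length] = s[];
    copy[s.length] = '\0';
    return copy.ptr;
}

void main()
{
    char[4] y = "abcd";          // static array with no terminator of its own
    char* x = toStringzCopy(y[]);
    assert(strlen(x) == 4);

    y[] = 'z';                   // later writes to y cannot reach the copy
    assert(strlen(x) == 4);
    assert(x[0 .. 4] == "abcd");
}
```

The cost is one allocation per call, which is exactly what the peek was trying to avoid for literals.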
> Yes, it does work for string literals and for dynamic arrays...

Actually it doesn't even work for dynamic arrays:

    import std.string;
    int main() {
        char* x;
        char[] y = new char[32];
        y[] = 0;
        char[] z = new char[32];
        z[] = 32;
        x = toStringz(z);
        printf("x length is %d\n",strlen(x));
        y[] = 32;
        printf("x length is %d\n",strlen(x));
        return 0;
    }

outputs

    x length is 32
    x length is 67

This is due to how the memory manager allocates memory.

-Ben
Jan 18 2005
Ben Hinkle wrote:
>> Yes, it does work for string literals and for dynamic arrays...
>
> Actually it doesn't even work for dynamic arrays:

Funny, I was just writing that :-)

It breaks down for certain powers of two. (16, 32, 64, 128, 256, 512, 1024, and so on)

Sample test program:

    import std.string;
    void main() {
        for (int x = 15; x <= 17; x++) {
            char[] a = new char[x];
            char[] b = new char[x];
            char[] c = new char[x];
            a[0] = 0; b[0] = 0; c[0] = 0;
            printf("%d %p\n",a);
            printf("%d %p\n",b);
            printf("%d %p\n",c);
            char *p = &a[0] + a.length;
            if(*p != 0) printf("not 0\n"); else printf("is 0\n");
            for(int i = 0; i < b.length; i++) b[i] = 'A' + i;
            char *z = toStringz(b);
            for(int i = 0; i < a.length; i++) a[i] = 'X';
            for(int i = 0; i < c.length; i++) c[i] = 'X';
            printf("%s\n",z);
        }
    }

Prints:

    15 0xbf498fe0
    15 0xbf498fd0
    15 0xbf498fc0
    is 0
    ABCDEFGHIJKLMNO
    16 0xbf498fb0
    16 0xbf498fa0
    16 0xbf498f90
    not 0
    ABCDEFGHIJKLMNOPXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    17 0xbf497fa0
    17 0xbf497f80
    17 0xbf497f60
    is 0
    ABCDEFGHIJKLMNOPQ

Perhaps a bit contrived, but shows how it works... std.string.toStringz is broken.

--anders
Jan 18 2005
"Walter" <newshound digitalmars.com> wrote in message news:csjffu$1qtp$1 digitaldaemon.com...
> "Ben Hinkle" <Ben_member pathlink.com> wrote in message news:csj4hq$1cvi$1 digitaldaemon.com...
>> The reason why the length changed is that toStringz looks at one past the length of the string to see if it is 0 and does nothing to the string if it is. But the sample code then changes the byte past the string by touching a completely different variable and so the toStringz result is "corrupted". I have toStringz calls sprinkled through my code when I call C functions and now I'm starting to get nervous about the lifespans of those strings and how to figure out if they are valid or not. Thoughts? Walter, is there a guideline I should follow? The most extreme one that comes to mind is "only call toStringz for strings that get immediately copied".
>
> It's "COW" (Copy On Write) to the rescue. The idea is only modify a string that you know is unique. If you don't know it is unique, make a copy of it before modifying it. After the toStringz(), you're modifying the argument to toStringz() but there's another reference to that string that expects it to not change.

In case you need another example, I can imagine just the act of calling a function could corrupt a toStringz result. Suppose the char[] was stored on the stack, the last element of the array is at the very top of the stack, and the next item after the stack is zero (and the stack grows up in memory). Then calling toStringz (also suppose that call was inlined, just for simplicity) wouldn't make a copy. But calling a function after that would push another stack frame, which could potentially set a non-zero byte immediately following the array. That would corrupt the result of toStringz. I couldn't get this to happen on any machine I have around here, since it depends on the stack architecture and how function calls work, but the problem is still there for some architectures.

So I have a suggestion. Have toStringz always copy if the array is on the stack. Have it never copy if the array is in the data segment (so literals behave as they do today) and have it check the GC capacity to ask the GC for control over the byte following the array (though the length of the array would be unchanged). To implement this toStringz would probably have to be moved out of std.string and into internal. If it copied everything except literals then I can see keeping it in std.string. Anyhow, I agree with Anders that something should be done.

-Ben
Jan 19 2005
Ben Hinkle wrote:
> So I have a suggestion. Have toStringz always copy if the array is on the stack. Have it never copy if the array is in the data segment (so literals behave as they do today) and have it check the GC capacity to ask the GC for control over the byte following the array (though the length of the array would be unchanged). To implement this toStringz would probably have to be moved out of std.string and into internal. If it copied everything except literals then I can see keeping it in std.string. Anyhow, I agree with Anders that something should be done.

Would this implementation work?

----------------------------------------------------------------
char* toStringzz(char[] str) {
    str.length++;
    str[length-1] = cast(char)0x00;
    return cast(char*)&str;
}
----------------------------------------------------------------

That is to say, is the array resizing implementation sufficient to determine whether str is dynamic or static on its own and, if it is dynamic, deal wisely with cases where incrementing length might be sufficient?

Can you break toStringzz in any of the cases that toStringz breaks?
Jan 19 2005
"parabolis" <parabolis softhome.net> wrote in message news:csmbbh$444$1 digitaldaemon.com...
> Ben Hinkle wrote:
>> So I have a suggestion. Have toStringz always copy if the array is on the stack. Have it never copy if the array is in the data segment (so literals behave as they do today) and have it check the GC capacity to ask the GC for control over the byte following the array (though the length of the array would be unchanged). To implement this toStringz would probably have to be moved out of std.string and into internal. If it copied everything except literals then I can see keeping it in std.string. Anyhow, I agree with Anders that something should be done.
>
> Would this implementation work?
> ----------------------------------------------------------------
> char* toStringzz(char[] str) {
>     str.length++;
>     str[length-1] = cast(char)0x00;
>     return cast(char*)&str;
> }
> ----------------------------------------------------------------
> That is to say is the array resizing implementation sufficient to determine whether str is dynamic or static on its own and if it is dynamic deal wisely with cases where incrementing length might be sufficient? Can you break toStringzz in any of the cases that toStringz breaks?

Nice idea. I think it's on the right track. I've cleaned it up a bit:

    char* toStringzz(char[] str) {
        str.length = str.length+1;
        str[length-1] = 0;
        return str.ptr;
    }

Also it copies string literals. If there is an easy way to check if something is a string literal we can add that to your code and have a good solution, I think.
Jan 19 2005
Ben Hinkle wrote:
> Nice idea. I think it's on the right track. I've cleaned it up a bit:
>
>     char* toStringzz(char[] str) {
>         str.length = str.length+1;
>         str[length-1] = 0;
>         return str.ptr;
>     }
>
> Also it copies string literals. If there is an easy way to check if something is a string literal we can add that to your code and have a good solution, I think.

Hm, doesn't D initialize uninitialized chars to 0 (here str[length-1]), so you can leave out the str[length-1] = 0; part?

Thus better:

    char* toStringzz(char[] str) {
        str.length = str.length+1;
        return str.ptr;
    }

But this actually alters the parameter (is this intended?)

My version would be:

    char* toStringz( in char[] str ) {
        char[] new_str;
        new_str.length = str.length + 1;
        new_str[0 .. length-2] = str[0 .. length-1];
        return &new_str[0];
    }

Creating a copy of the parameter, thus not changing it as you would think for in-parameters. I checked and it works for string literals, too.
Jan 19 2005
"Lukas Pinkowski" <Lukas.Pinkowski web.de> wrote in message news:csmfl4$a4c$1 digitaldaemon.com...
> Ben Hinkle wrote:
>> Nice idea. I think it's on the right track. I've cleaned it up a bit:
>>
>>     char* toStringzz(char[] str) {
>>         str.length = str.length+1;
>>         str[length-1] = 0;
>>         return str.ptr;
>>     }
>>
>> Also it copies string literals. If there is an easy way to check if something is a string literal we can add that to your code and have a good solution, I think.
>
> Hm, doesn't D initialize uninitialized chars to 0 (here str[length-1]), so you can leave out the str[length-1] = 0; part?

The initializer for char is 0xFF.

> Thus better:
>
>     char* toStringzz(char[] str) {
>         str.length = str.length+1;
>         return str.ptr;
>     }
>
> But this actually alters the parameter (is this intended?)

An array is a pointer to data and a length. Those are passed by value, so changing the length does not change the original string passed to the function.

> My version would be:
>
>     char* toStringz( in char[] str ) {
>         char[] new_str;
>         new_str.length = str.length + 1;
>         new_str[0 .. length-2] = str[0 .. length-1];
>         return &new_str[0];
>     }
>
> Creating a copy of the parameter, thus not changing it as you would think for in-parameters. I checked and it works for string literals, too.

Watch out for the case when new_str.ptr is str.ptr since I expect the array copy will error if you try to copy overlapping arrays.
Jan 19 2005
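The point made above, that an array argument is a (pointer, length) pair passed by value, can be checked with a small sketch in present-day D. The function name `grow` is hypothetical and exists only for this illustration.

```d
// Hypothetical function for illustration: it resizes its own copy of the
// slice (pointer + length), not the caller's.
size_t grow(char[] str)
{
    str.length = str.length + 1;
    str[$ - 1] = '\0';
    return str.length;
}

void main()
{
    char[] s = "abc".dup;
    assert(grow(s) == 4);    // the callee saw a 4-element slice
    assert(s.length == 3);   // the caller's length is unchanged
    assert(s == "abc");      // and so are the caller's visible contents
}
```

Whether the resize extends the block in place or reallocates is up to the runtime; either way the caller's slice keeps its original length.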
> char* toStringzz(char[] str) {
>     str.length = str.length+1;
>     str[length-1] = 0;
>     return str.ptr;
> }

(Actually, I refer here to several examples in this thread.)

What bothers me is, if a string gets repeatedly passed, say, between a library and the main program, and the library functions pass the string on to the OS or another library, every time using toStringz -- then what keeps the string from growing at each iteration?

Finally we end up with a (possibly short) string with a lot of zeros at the end.

It seems harmless at first glance, but what if later this kind of strings are concatenated (in D code) and passed on to a C-written parser? It would see a lot of "empty strings" between real data. Or am I missing something?

In the same manner, should toStringz guarantee a valid C string? I.e. no internal zeros? At the _very least_ in the non-release build!

----

The name toStringz is misleading. Since the only use for it is to make strings edible for C code, it should be renamed toStringC. Normally, if a programmer _wants_ to slap a zero at the end, he'd use ~, wouldn't he.

Misnomers like this introduce parallax, and in this case so subtle that we don't even notice. And that's where it _really_ counts!
Jan 24 2005
Georg Wrede wrote:
> It seems harmless at first glance, but what if later this kind of strings are concatenated (in D code) and passed on to a C-written parser? It would see a lot of "empty strings" between real data. Or am I missing something?

It would probably be easier to remove the hack altogether and just copy?

    body
    {
        if (string.length == 0)
            return "";

        // Need to make a copy
        char[] copy = new char[string.length + 1];
        copy[0..string.length] = string;
        copy[string.length] = 0;
        return copy;
    }

Isn't that just what "string.length = string.length + 1" does, anyway ?

It would be neat if it could be optimized for string literals, but not at the expense of making the whole function unstable? (like it is now)

> In the same manner, should toStringz guarantee a valid C string? I.e. no internal zeros? At the _very least_ in the non-release build!

The contract for toStringz specifies that the char[] is *without* '\0':

    in
    {
        if (string)
        {
            // No embedded 0's
            for (uint i = 0; i < string.length; i++)
                assert(string[i] != 0);
        }
    }
    out (result)
    {
        if (result)
        {
            assert(strlen(result) == string.length);
            assert(memcmp(result, string, string.length) == 0);
        }
    }

It also (implicitly) returns a "" string, for an input param of null.

> The name toStringz is misleading. Since the only use for it is to make strings edible for C code, it should be renamed toStringC. Normally, if a programmer _wants_ to slap a zero at the end, he'd use ~, wouldn't he.

It converts a char[], to a zero-terminated char*. No "C" about that ??

(I'm not sure why it doesn't just 'return (string ~ "\0");', anyone ?)

==>

    body
    {
        return ((string.length == 0) ? "" : string ~ "\0");
    }

Besides, most of the C functions do not accept UTF-8 input anyway... To be usable from regular C, it would need to be converted to byte* ? (and that would most likely involve charset encoding conversion too)

--anders
Jan 24 2005
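The concatenation variant asked about above can be sketched in present-day D. The name `toStringzCat` is hypothetical. The sketch relies on the documented behavior that binary `~` creates a new array from its operands, so the result never aliases the input and carries its own terminator.

```d
import core.stdc.string : strlen;

// Hypothetical helper: binary ~ builds a new array, so the returned
// pointer refers to storage separate from the argument.
const(char)* toStringzCat(const(char)[] s)
{
    return (s ~ '\0').ptr;
}

void main()
{
    char[] d = "hello".dup;
    const(char)* c = toStringzCat(d);
    assert(c != d.ptr);      // distinct storage
    assert(strlen(c) == 5);

    d[0] = 'j';              // mutating the D string leaves the C copy alone
    assert(c[0] == 'h');
}
```

Because the result is a pointer rather than a longer D array, repeated conversions of the same D string do not accumulate trailing zeros in it.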
another version:

    char* toStringzz(char[] str) {
        str ~= 0;
        return str.ptr;
    }
Jan 19 2005
In article <csjffu$1qtp$1 digitaldaemon.com>, Walter says...
> "Ben Hinkle" <Ben_member pathlink.com> wrote in message news:csj4hq$1cvi$1 digitaldaemon.com...
>> The reason why the length changed is that toStringz looks at one past the length of the string to see if it is 0 and does nothing to the string if it is. But the sample code then changes the byte past the string by touching a completely different variable and so the toStringz result is "corrupted". I have toStringz calls sprinkled through my code when I call C functions and now I'm starting to get nervous about the lifespans of those strings and how to figure out if they are valid or not. Thoughts? Walter, is there a guideline I should follow? The most extreme one that comes to mind is "only call toStringz for strings that get immediately copied".
>
> It's "COW" (Copy On Write) to the rescue. The idea is only modify a string that you know is unique. If you don't know it is unique, make a copy of it before modifying it. After the toStringz(), you're modifying the argument to toStringz() but there's another reference to that string that expects it to not change.

ok, one last try. Walter, I can't tell if you still think this counts as COW. So let me boil it down to a question. Given the code

    char[1] str;
    char* cstr = toStringz(str);
    ubyte x = 1;

what is strlen(cstr)? I claim the answer is compiler dependent and depends on whether the compiler stuck the storage location for x immediately following str. Sure, running the code doesn't have a problem due to word alignment etc., but following the language definition and the definition of toStringz the strlen is unknown.

-Ben
Jan 20 2005
Ben Hinkle wrote:
> There's something about toStringz that has me uncomfortable. Consider this code:
>
>     import std.string;
>     int main() {
>         char* x;
>         uint b1;
>         char[4] y;
>         uint b2;
>         y[0] = 'a';
>         y[1] = 'b';
>         y[2] = 'c';
>         y[3] = 'd';
>         x = toStringz(y);
>         printf("x length is %d, ptr %p b1 %p b2 %p\n",strlen(x),x,&b1,&b2);
>         b1 = 0x11223344;
>         b2 = 0x11223344;
>         printf("x length is %d, ptr %p b1 %p b2 %p\n",strlen(x),x,&b1,&b2);
>         return 0;
>     }
>
> Here's what it prints when I run it:
>
>     x length is 4, ptr 0xfefff870 b1 0xfefff86c b2 0xfefff874
>     x length is 17, ptr 0xfefff870 b1 0xfefff86c b2 0xfefff874

That's dependent on the compiler, and the alignment:

GDC Linux:

    x length is 4, ptr 0xbff772b8 b1 0xbff772bc b2 0xbff772b0
    x length is 25, ptr 0xbff772b8 b1 0xbff772bc b2 0xbff772b0

GDC Mac OS X:

    x length is 4, ptr 0xbffffaa0 b1 0xbffffa9c b2 0xbffffaa8
    x length is 4, ptr 0xbffffaa0 b1 0xbffffa9c b2 0xbffffaa8

But why are you calling toStringz on a simple (char*), without having it properly NUL-terminated at the end ?

If you change the code to :

    char[] y = new char[4];

Then it prints:

    x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c
    x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c

A more interesting question is why :

    x = toStringz(y[0..4]);

does *not* make a copy of the converted pointer-to-characters, just because the next byte in memory happens to be a NUL char? (ie. it works if first byte of "b1" is 42, but not if it's 0)

Having to use

    x = toStringz(y[0..4].dup);

just because of this little "optimization" feature is not exactly a given...

There should probably be a small warning printed about using toStringz on slices (since it works with literals and arrays)

But that it fails on pointers and static arrays is not surprising?

--anders

PS. If you add a -O on Mac OS X, then it prints "12" instead. So just because it printed 4 above doesn't mean it works.
Jan 18 2005
> That's dependent on the compiler, and the alignment:
>
> GDC Linux:
>     x length is 4, ptr 0xbff772b8 b1 0xbff772bc b2 0xbff772b0
>     x length is 25, ptr 0xbff772b8 b1 0xbff772bc b2 0xbff772b0
>
> GDC Mac OS X:
>     x length is 4, ptr 0xbffffaa0 b1 0xbffffa9c b2 0xbffffaa8
>     x length is 4, ptr 0xbffffaa0 b1 0xbffffa9c b2 0xbffffaa8

even more interesting...

> But why are you calling toStringz on a simple (char*), without having it properly NUL-terminated at the end ?

The point of toStringz is to make a D string null terminated.

> If you change the code to :
>     char[] y = new char[4];
> Then it prints:
>     x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c
>     x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c

That's because the "new" allocates space on the heap and so it has nothing to do with b1 and b2 after that. To corrupt the string on the heap you'll have to wait until something else gets allocated right after that string and then assign something to the first byte.

> A more interesting question is why :
>     x = toStringz(y[0..4]);
> does *not* make a copy of the converted pointer-to-characters, just because the next byte in memory happens to be a NUL char? (ie. it works if first byte of "b1" is 42, but not if it's 0)
>
> Having to use
>     x = toStringz(y[0..4].dup);
> just because of this little "optimization" feature is not exactly a given...
>
> There should probably be a small warning printed about using toStringz on slices (since it works with literals and arrays)

I'm starting to think the only safe usage of toStringz is on arrays where you can guarantee the byte after the string is owned by the string - which includes literals and maybe some other special cases.

> But that it fails on pointers and static arrays is not surprising?
>
> --anders
>
> PS. If you add a -O on Mac OS X, then it prints "12" instead. So just because it printed 4 above doesn't mean it works.

ok.
Jan 18 2005
Ben Hinkle wrote:
>> But why are you calling toStringz on a simple (char*), without having it properly NUL-terminated at the end ?
>
> The point of toStringz is to make a D string null terminated.

Never mind, I was thinking in C (just because it is implemented that way), and forgot that D treats static arrays as having lengths...

--anders
Jan 18 2005
Ben Hinkle wrote:
>> If you change the code to :
>>     char[] y = new char[4];
>> Then it prints:
>>     x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c
>>     x length is 4, ptr 0xbf429fe0 b1 0xbff3c758 b2 0xbff3c74c
>
> That's because the "new" allocates space on the heap and so it has nothing to do with b1 and b2 after that. To corrupt the string on the heap you'll have to wait until something else gets allocated right after that string and then assign something to the first byte.

Right, I think I only got lucky because of how it allocates memory...

I couldn't find any traces of "the storage allocator will put a 0 past the end of newly allocated char[]'s", so that must be just DMC.

In fact, I'm not sure that even DMD does it ? This test program:

    void main() {
        for (int i = 1; i <= 1024; i++) {
            char[] a = new char[i];
            char *p = &a[0] + a.length;
            if(*p != 0) printf("%d\n",i);
        }
    }

Prints out 16,32,64,128,256,512,1024 for *all* the various D compilers.

So that toStringz peeks beyond the length of the array is clearly a bug! Perhaps if it could tell that the argument is a string literal ? Naah...

--anders
Jan 18 2005
Ben Hinkle wrote:
> There's something about toStringz that has me uncomfortable. Consider this code:

There is something else that you should be uncomfortable about - the domains of C strings and D strings are not the same. The toStringz function is so named because C strings are 'Z'ero (or null) terminated. That implies they cannot contain a null character yet D strings have no such silly limitations. So the toStringz function should probably look like this:

----------------------------------------------------------------
char* toStringz(char[] dStr) {
    char[] cStr = new char[dStr.length+1];
    foreach(int i, char dChar; dStr) {
        if(!(cStr[i] = dChar))
            throw new Exception("Null char");
    }
    return &cStr;
}
----------------------------------------------------------------

Now seems like a great time for plugging the unless/until feature of Perl as being nice in this context allowing:

    unless(cStr[i] = dChar) throw new Exception("Null char");
Jan 19 2005
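The strict variant proposed above can be sketched in present-day D as follows. The name `toStringzStrict` is hypothetical, and throwing a plain `Exception` is an assumption made for brevity. Note that the terminator is written explicitly, since (as mentioned elsewhere in the thread) the default initializer for char is 0xFF rather than 0.

```d
import core.stdc.string : strlen;
import std.exception : assertThrown;

// Hypothetical strict variant: copy, terminate, and refuse input that a
// C callee would silently truncate at the first embedded zero.
char* toStringzStrict(const(char)[] s)
{
    auto copy = new char[s.length + 1];
    foreach (i, c; s)
    {
        if (c == '\0')
            throw new Exception("embedded NUL in string");
        copy[i] = c;
    }
    copy[s.length] = '\0';   // explicit: char.init is 0xFF, not 0
    return copy.ptr;
}

void main()
{
    assert(strlen(toStringzStrict("abc")) == 3);
    assertThrown(toStringzStrict("a\0b"));
}
```

The check costs nothing extra beyond the copy loop, but it turns the existing debug-only contract into behavior that also holds in release builds.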
"parabolis" <parabolis softhome.net> wrote in message news:csmiqa$edp$1 digitaldaemon.com...
> Ben Hinkle wrote:
>> There's something about toStringz that has me uncomfortable. Consider this code:
>
> There is something else that you should be uncomfortable about - the domains of C strings and D strings are not the same. The toStringz function is so named because C strings are 'Z'ero (or null) terminated. That implies they cannot contain a null character yet D strings have no such silly limitations.

I hate to disagree but.. that doesn't bother me. I don't see anything wrong with ignoring interior zeros. toStringz just makes sure it is zero-terminated - not that there aren't any internal zeros.

[snip]
Jan 19 2005
Ben Hinkle wrote:
> "parabolis" <parabolis softhome.net> wrote in message news:csmiqa$edp$1 digitaldaemon.com...
>> Ben Hinkle wrote:
>>> There's something about toStringz that has me uncomfortable. Consider this code:
>>
>> There is something else that you should be uncomfortable about - the domains of C strings and D strings are not the same. The toStringz function is so named because C strings are 'Z'ero (or null) terminated. That implies they cannot contain a null character yet D strings have no such silly limitations.
>
> I hate to disagree but.. that doesn't bother me. I don't see anything wrong with ignoring interior zeros. toStringz just makes sure it is zero-terminated - not that there aren't any internal zeros.
>
> [snip]

----------------------------------------------------------------
char* toStringz(char[] dStr, bit ignoreNullsInString = true)
----------------------------------------------------------------
Jan 19 2005
"parabolis" <parabolis softhome.net> wrote in message news:csmiqa$edp$1 digitaldaemon.com...
> Ben Hinkle wrote:
>> There's something about toStringz that has me uncomfortable. Consider this code:
>
> There is something else that you should be uncomfortable about - the domains of C strings and D strings are not the same. The toStringz function is so named because C strings are 'Z'ero (or null) terminated. That implies they cannot contain a null character yet D strings have no such silly limitations. So the toStringz function should probably look like this:
>
> ----------------------------------------------------------------
> char* toStringz(char[] dStr) {
>     char[] cStr = new char[dStr.length+1];
>     foreach(int i, char dChar; dStr) {
>         if(!(cStr[i] = dChar))
>             throw new Exception("Null char");
>     }
>     return &cStr;
> }
> ----------------------------------------------------------------
>
> Now seems like a great time for plugging the unless/until feature of Perl as being nice in this context allowing:
>
>     unless(cStr[i] = dChar) throw new Exception("Null char");

Has there been debate about unless/until? If so, count me on the list of 'wanting'. :-)
Jan 21 2005
Matthew wrote:
> "parabolis" <parabolis softhome.net> wrote in message news:csmiqa$edp$1 digitaldaemon.com...
>> ----------------------------------------------------------------
>> char* toStringz(char[] dStr) {
>>     char[] cStr = new char[dStr.length+1];
>>     foreach(int i, char dChar; dStr) {
>>         if(!(cStr[i] = dChar))
>>             throw new Exception("Null char");
>>     }
>>     return &cStr;
>> }
>> ----------------------------------------------------------------
>>
>> Now seems like a great time for plugging the unless/until feature of Perl as being nice in this context allowing:
>>
>>     unless(cStr[i] = dChar) throw new Exception("Null char");
>
> Has there been debate about unless/until? If so, count me on the list of 'wanting'. :-)

Yes, back around the time the digitalmars.d newsgroup started:

http://www.digitalmars.com/d/archives/digitalmars/D/1714.html

Walter wrote:
> "Brian Hammond" <d at brianhammond dot comBrian_member xx pathlink.com> wrote in message news:c8lmu2$vdm$1 xx digitaldaemon.com...
>> I really like the unless because it reads so well. "do this unless this is true"
>
> That just seems backwards to me <g>. I like things to execute forwards, not backwards.

However, Walter's response was long before "is" replaced "===", and so I think it at least deserves another consideration, as Perl's unless construct would give us "unless(A is null)" instead of the awkward and much maligned "if(!(A is null))".
Jan 21 2005