digitalmars.D.learn - Why are string literals zero-terminated?
- awishformore (19/65) Jul 20 2010 Following this discussion on announce, I was wondering why string
- Lars T. Kyllingstad (3/7) Jul 20 2010 So you can pass them to C functions.
- Lars T. Kyllingstad (16/24) Jul 20 2010 Note that even though string literals are zero terminated, the actual
- awishformore (7/31) Jul 20 2010 Hey.
Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know?

/Max

>>>> Did you test with a string that was not in the code itself, e.g. from a config file? String literals are null terminated so you wouldn't have had an issue if all your strings were literals. Utf8 doesn't contain the string length, so you will run into problems eventually. You have to use toStringz or your own null terminator. Unless of course you know that the function will always be taking string literals. But even then leaving something like that up to the programmer to remember is not exactly fool proof.
>>>>
>>>> Enjoy.
>>>> ~Rory
>>>
>>> Hey again and thanks for the hint. I tried finding something on the DM page about string literals being null terminated and while the section about string literals didn't even mention it, it was said some place else. That explains why using string literals works even though I expected it to fail. It's indeed good to know and adding std.string.toStringz is probably a good idea ;). Thanks.
>>>
>>> Greetings, Max.
>>
>> When thinking about it, it makes sense to have string literals null terminated in order to have C functions work with them. However, I wonder about some stuff, for instance:
>>
>>     string s = "string";      // is s == "string\0" now?
>>     char[] c = cast(char[])s; // is c[6] == '\0' now?
>>     char* p = s.ptr;          // is *(p+6) == '\0' now?
>>
>> I think use of the zero terminator should be consistent. Either make every string (and char[] for that matter) zero terminated in the underlying memory for backwards compatibility with C or leave it to the user in all cases.
>>
>> /Max
>
> sure, I must admit it is annoying when the same code can do different things just because of where the data came from. It would be easier to notice the bug if d never added a null on literals, but then there would also be a lot more usages of toStringz. I think if you want to test it you can do:
>
>     auto s = "blah";
>     open(s[0..$].dup.ptr); // duplicating it should put it somewhere else
>                            // just slicing will not test
>
> perhaps the NULL is there because it's there in the executable file? NULL is also often after a dynamic array simply because of d always initializing memory, and when you get an allocation often a larger amount is allocated which remains NULL.
Jul 20 2010
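The three questions in the post above can be checked directly. What follows is a minimal sketch, assuming a current D2 compiler: there `s.ptr` has type `immutable(char)*` rather than `char*`, and the cast to `char[]` is replaced by a `const(char)[]` view of the same memory. The variable `copy` is not from the post; it is added here to mirror the `.dup` test suggested in the quoted reply.

```d
void main()
{
    string s = "string";

    // Q1: the terminator is not part of the array, so the
    // contents and the length exclude it.
    assert(s != "string\0");
    assert(s.length == 6);

    // Q2: a second view of the same memory has the same length,
    // so index 6 is out of bounds for ordinary array indexing.
    const(char)[] c = s;
    assert(c.length == 6);

    // Q3: unchecked pointer access one past the end does find
    // the '\0', but only because s refers to a literal.
    immutable(char)* p = s.ptr;
    assert(*(p + 6) == '\0');

    // A duplicate lives in freshly allocated memory. Nothing is
    // promised about the byte after it, so it is not read here.
    char[] copy = s.dup;
    assert(copy.length == 6);
}
```

The last case is the one that matters for C interop: whatever happens to follow `copy` in memory is unspecified, so a terminator seen there in a test run is a coincidence, not a guarantee.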
On Tue, 20 Jul 2010 14:59:18 +0200, awishformore wrote:
> Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know?

So you can pass them to C functions.

-Lars
Jul 20 2010
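A minimal sketch of what "pass them to C functions" looks like in practice, assuming D2 and the C `strlen` binding from `core.stdc.string` (chosen here only because it depends on the terminator and is easy to check; the thread itself does not mention it):

```d
import core.stdc.string : strlen;

void main()
{
    // A string literal converts implicitly to a C-style pointer,
    // and the compiler stores a '\0' right after its last character.
    const(char)* p = "hello";
    assert(strlen(p) == 5);

    // For the same reason the literal can be passed directly.
    assert(strlen("hello") == 5);
}
```

The implicit conversion applies to literals only. A `string` variable has to be converted explicitly, for example with `std.string.toStringz`.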
On Tue, 20 Jul 2010 13:26:56 +0000, Lars T. Kyllingstad wrote:
> On Tue, 20 Jul 2010 14:59:18 +0200, awishformore wrote:
>> Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know?
>
> So you can pass them to C functions.

Note that even though string literals are zero terminated, the actual string (the array, that is) doesn't contain the zero character. It's located at the memory position immediately following the string.

    string s = "hello";
    assert (s[$-1] != '\0');  // Last character of s is 'o', not '\0'
    assert (s.ptr[s.length] == '\0');

Why is it only so for literals? That is because the compiler can only guarantee the zero-termination of string literals. The memory following a string in general could contain anything.

    string s = getStringFromSomewhere();
    // I have no idea where s is coming from, so I don't
    // know whether it is zero-terminated or not. Better
    // make sure.
    someCFunction(toStringz(s));

-Lars
Jul 20 2010
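One concrete case where "the memory following a string could contain anything" is a slice of a longer string. The sketch below assumes D2 with `std.string.toStringz` and the C `strlen` binding. The names `whole`, `part` and `z` are made up for illustration.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string whole = "hello world";
    string part = whole[0 .. 5];   // a view into whole, no copy
    assert(part == "hello");

    // The byte after the slice is the next character of whole,
    // not a terminator.
    assert(part.ptr[part.length] == ' ');

    // A C function handed part.ptr would read on to the end of whole.
    assert(strlen(part.ptr) == 11);

    // toStringz returns a pointer to zero-terminated data with
    // the same contents as the slice.
    const(char)* z = toStringz(part);
    assert(strlen(z) == 5);
}
```

Here the overrun at least stops at the terminator of the enclosing literal. For data read at run time, nothing bounds it.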
On 20.07.2010 15:38, Lars T. Kyllingstad wrote:
> On Tue, 20 Jul 2010 13:26:56 +0000, Lars T. Kyllingstad wrote:
>> On Tue, 20 Jul 2010 14:59:18 +0200, awishformore wrote:
>>> Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know?
>>
>> So you can pass them to C functions.
>
> Note that even though string literals are zero terminated, the actual string (the array, that is) doesn't contain the zero character. It's located at the memory position immediately following the string.
>
>     string s = "hello";
>     assert (s[$-1] != '\0');  // Last character of s is 'o', not '\0'
>     assert (s.ptr[s.length] == '\0');
>
> Why is it only so for literals? That is because the compiler can only guarantee the zero-termination of string literals. The memory following a string in general could contain anything.
>
>     string s = getStringFromSomewhere();
>     // I have no idea where s is coming from, so I don't
>     // know whether it is zero-terminated or not. Better
>     // make sure.
>     someCFunction(toStringz(s));
>
> -Lars

Hey.

Yes, that indeed makes a lot of sense. I didn't actually try those asserts because I'm currently not on a dev machine, but what you point out basically is the behaviour I was hoping for.

Thanks for clearing this up.

/Max
Jul 20 2010