digitalmars.D.learn - peculiarities with char[] and std.string
- Kyle K (38/38) Jun 19 2006 Greetings.
- xs0 (22/64) Jun 19 2006 Well, you didn't touch the memory you didn't allocate :) If you had
- Kyle K (5/14) Jun 19 2006 Ah ok, that makes sense. So using 'in' with arrays and aggregate types w...
- BCS (28/47) Jun 19 2006 Actually "in" always gives you a copy of the actual "thing". Arrays are
- Kyle K (2/34) Jun 19 2006 Got it, thanks a bunch. I knew it had to be something simple... :D
Greetings.

I was poking around the std.string lib, and was wondering if someone could answer a few questions about it. I'm relatively new to D, so I'm sure there are pretty obvious answers.

I notice that most of the functions, like toStringz() and tolower(), implement the copy-on-write convention... but since the default function parameter is in, is there not already an implicit copy of the data being made? For example,

    import std.stdio;

    int main()
    {
        char[] str, str2;
        str = "foo";
        str2 = bob(str);
        writefln("%s:%s", str, str2); // should print "foo:keke"
        return 0;
    }

    char[] bob(in char[] str)
    {
        str = "keke";
        return str;
    }

Works fine with my copy of DMD. Is this behavior not to be relied on, as you shouldn't ever touch memory you didn't allocate (according to the FAQ)?

Also, why is the following the case:

    printf("%s", "hello\0");              // Fails with access violation
    printf("%s", cast(char *)"hello\0");  // OK

Is the implicit casting from char[] to char * doing something I'm not aware of in terms of the length of the string, like chopping off the \0?

My last question is which is the preferred method of making a copy of a string? Suppose I want str2 to be a copy of str, then:

    str2.length = str.length;
    str2[] = str;
    // These two equivalent?
    str2 = str.dup;

Sorry for all the questions and thanks for the help. Let me know if this info is somewhere obvious... I wasn't able to find it in the spec.

Regards
Kyle K.
Jun 19 2006
Kyle K wrote:
> Greetings.
>
> I was poking around the std.string lib, and was wondering if someone could answer a few questions about it. I'm relatively new to D, so I'm sure there are pretty obvious answers.
>
> I notice that most of the functions, like toStringz() and tolower(), implement the copy-on-write convention... but since the default function parameter is in, is there not already an implicit copy of the data being made?

No, just a copy of the _reference_ is made, but both point to the same data.

> For example,
>
>     import std.stdio;
>
>     int main()
>     {
>         char[] str, str2;
>         str = "foo";
>         str2 = bob(str);
>         writefln("%s:%s", str, str2); // should print "foo:keke"
>         return 0;
>     }
>
>     char[] bob(in char[] str)
>     {
>         str = "keke";
>         return str;
>     }
>
> Works fine with my copy of DMD. Is this behavior not to be relied on, as you shouldn't ever touch memory you didn't allocate (according to the FAQ)?

Well, you didn't touch the memory you didn't allocate :) If you had

    char[] bob(in char[] str)
    {
        str[0] = 'a';
        return str;
    }

You'd get "aoo:aoo" as output (or a crash, as you can't write into constants on some platforms).

> Also, why is the following the case:
>
>     printf("%s", "hello\0");              // Fails with access violation
>     printf("%s", cast(char *)"hello\0");  // OK
>
> Is the implicit casting from char[] to char * doing something I'm not aware of in terms of the length of the string, like chopping off the \0?

"hello\0" is a D char[] array, which is composed of length + char*. printf doesn't know about D arrays, so it reads the length as if it were the pointer to the data, which fails for obvious reasons. When you cast it to char*, you lose the length, keep the pointer, and it works. I think you should use something like

    printf("%.*s", "hello"); // no zero needed/wanted in this case..

Better yet, use writef/ln instead - it knows all about D's types.

> My last question is which is the preferred method of making a copy of a string? Suppose I want str2 to be a copy of str, then:
>
>     str2.length = str.length;
>     str2[] = str;
>     // These two equivalent?
>     str2 = str.dup;

Generally, .dup is/could/should be faster, as it's obvious you want a copy, so there's no need to initialize the destination array on resizing, for example.

Hope that helped :)

xs0
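The printf and copying points above can be sketched as follows. This is only a sketch, and it assumes a current D2 compiler rather than the 2006 DMD used in the thread: there, string literals are immutable, so mutable strings are made with .dup, and the length and pointer are passed to printf explicitly instead of relying on how an array happens to be laid out in the variadic arguments. The helper names copyWithDup and copyWithSliceAssign are made up for this illustration.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

// Hypothetical helper: .dup allocates and copies in one step.
char[] copyWithDup(const(char)[] src)
{
    return src.dup;
}

// Hypothetical helper: allocate first, then copy with a slice assignment.
char[] copyWithSliceAssign(const(char)[] src)
{
    auto dst = new char[src.length]; // elements are default-initialized here
    dst[] = src[];                   // both sides must have the same length
    return dst;
}

void main()
{
    string s = "hello";

    // "%.*s" expects an int length followed by a char pointer,
    // so the two halves of the slice are passed explicitly.
    printf("%.*s\n", cast(int) s.length, s.ptr);

    // toStringz yields a zero-terminated C string suitable for "%s".
    printf("%s\n", toStringz(s));

    char[] original = "foo".dup;
    char[] a = copyWithDup(original);
    char[] b = copyWithSliceAssign(original);

    // Both copies have equal contents but their own storage.
    assert(a == original && a.ptr !is original.ptr);
    assert(b == original && b.ptr !is original.ptr);
}
```

Both helpers produce an independent copy; the slice-assignment version pays for initializing the new array before overwriting it, which is the extra work xs0 mentions.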
Jun 19 2006
In article <e76aq8$qsr$1 digitaldaemon.com>, xs0 says...
> Well, you didn't touch the memory you didn't allocate :) If you had
>
>     char[] bob(in char[] str)
>     {
>         str[0] = 'a';
>         return str;
>     }
>
> You'd get "aoo:aoo" as output (or a crash, as you can't write into constants on some platforms)

Ah ok, that makes sense. So using 'in' with arrays and aggregate types will always still give you a reference? I assume with primitives the semantics remain pass-by-value, such that foo(in int b) will never modify the caller's data?

> Hope that helped :)

It did, thanks a lot! :D
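The difference between rebinding an array parameter and writing through it, and the by-value behaviour of an int, can be sketched like this. It assumes a current D2 compiler, where an in parameter is const and could not be assigned to at all, so the sketch uses plain parameters instead. The function names rebind, overwriteFirst and bump are invented for the illustration.

```d
import std.stdio;

// Rebinding the parameter changes only the callee's copy of the slice.
char[] rebind(char[] s)
{
    s = "keke".dup;
    return s;
}

// Writing through the slice changes the characters the caller also sees.
char[] overwriteFirst(char[] s)
{
    s[0] = 'a';
    return s;
}

// An int is copied outright, so the caller's variable is never affected.
int bump(int b)
{
    b += 1;
    return b;
}

void main()
{
    char[] str = "foo".dup; // mutable copy; D2 literals are immutable

    char[] r = rebind(str);
    assert(str == "foo" && r == "keke");

    char[] w = overwriteFirst(str);
    assert(str == "aoo" && w.ptr is str.ptr); // same underlying data

    int n = 1;
    assert(bump(n) == 2 && n == 1);

    writefln("%s:%s:%s", str, r, w);
}
```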
Jun 19 2006
Kyle K wrote:
> In article <e76aq8$qsr$1 digitaldaemon.com>, xs0 says...
>> Well, you didn't touch the memory you didn't allocate :) If you had
>>
>>     char[] bob(in char[] str)
>>     {
>>         str[0] = 'a';
>>         return str;
>>     }
>>
>> You'd get "aoo:aoo" as output (or a crash, as you can't write into constants on some platforms)
>
> Ah ok, that makes sense. So using 'in' with arrays and aggregate types will always still give you a reference? I assume with primitives the semantics remain pass-by-value, such that foo(in int b) will never modify the caller's data?

Actually "in" always gives you a copy of the actual "thing". Arrays are reference types, so you get a copy of the reference. Same with objects, as they are also reference types. Structs, on the other hand, are not reference types and as such will get passed by value.

    import std.stdio;

    class fooC { int i; }
    struct fooS { int i; }

    void main()
    {
        fooC c1 = new fooC, c2;
        c1.i = 0;
        c2 = fn(c1);
        writef(c1.i, " ", c2.i, \n); // prints "1 1"

        fooS s1, s2;
        s1.i = 0;
        s2 = fn(s1);
        writef(s1.i, " ", s2.i, \n); // prints "0 1"
    }

    fooC fn(in fooC v) { v.i = 1; return v; }
    fooS fn(in fooS v) { v.i = 1; return v; }
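For readers on a current compiler, the same class-versus-struct comparison might look like the sketch below. It assumes D2, where in parameters are const (so the functions take plain parameters) and the bare \n token is no longer accepted. The names FooC, FooS, setC and setS are chosen for this sketch only.

```d
import std.stdio;

class FooC { int i; }
struct FooS { int i; }

// A class parameter is a copy of the reference: both refer to one object.
FooC setC(FooC v)
{
    v.i = 1;
    return v;
}

// A struct parameter is a copy of the whole value.
FooS setS(FooS v)
{
    v.i = 1;
    return v;
}

void main()
{
    auto c1 = new FooC;
    auto c2 = setC(c1);
    assert(c1 is c2);                 // same object
    assert(c1.i == 1 && c2.i == 1);

    FooS s1;
    auto s2 = setS(s1);
    assert(s1.i == 0 && s2.i == 1);   // caller's struct untouched

    writeln(c1.i, " ", c2.i);
    writeln(s1.i, " ", s2.i);
}
```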
Jun 19 2006
In article <e76jri$1ds7$1 digitaldaemon.com>, BCS says...
>> Ah ok, that makes sense. So using 'in' with arrays and aggregate types will always still give you a reference? I assume with primitives the semantics remain pass-by-value, such that foo(in int b) will never modify the caller's data?
>
> Actually "in" always gives you a copy of the actual "thing". Arrays are reference types, so you get a copy of the reference. Same with objects, as they are also reference types. Structs, on the other hand, are not reference types and as such will get passed by value.
>
>     import std.stdio;
>
>     class fooC { int i; }
>     struct fooS { int i; }
>
>     void main()
>     {
>         fooC c1 = new fooC, c2;
>         c1.i = 0;
>         c2 = fn(c1);
>         writef(c1.i, " ", c2.i, \n); // prints "1 1"
>
>         fooS s1, s2;
>         s1.i = 0;
>         s2 = fn(s1);
>         writef(s1.i, " ", s2.i, \n); // prints "0 1"
>     }
>
>     fooC fn(in fooC v) { v.i = 1; return v; }
>     fooS fn(in fooS v) { v.i = 1; return v; }

Got it, thanks a bunch. I knew it had to be something simple... :D
Jun 19 2006