digitalmars.D - .init property for char[] type
- Justin Johansson (10/10) Sep 22 2009 In a templated class (D1.0) along lines ...
- Jeremie Pelletier (12/28) Sep 22 2009 You could use a custom type, which would solve your .init problem:
- Jarrett Billingsley (9/19) Sep 22 2009 empty string rather than effectively a null pointer. Is there a conv...
- Justin Johansson (8/14) Sep 22 2009 Big difference if you pass char[] variable .ptr to a C function.
- Justin Johansson (15/17) Sep 22 2009 Justin Johansson Wrote:
- Daniel Keep (3/7) Sep 22 2009 In general, if you pass a string to a C function you should send it
- Justin Johansson (4/7) Sep 22 2009 Agreed .. fair enough.
- Jeremie Pelletier (14/25) Sep 22 2009 It isn't the same semantics:
- Justin Johansson (10/13) Sep 22 2009 Consistency. Since when is that an argument?
- Jeremie Pelletier (4/21) Sep 22 2009 Obviously the nan floating points, which has annoyed me quite many
- Andrei Alexandrescu (6/23) Sep 22 2009 You forgot
- Jeremie Pelletier (3/32) Sep 22 2009 Actually, dchar.init is "\U0000ffff".
- Justin Johansson (5/32) Sep 22 2009 Shhh; don't tell anybody; I left those out of the quiz to weigh in favou...
- Michel Fortin (9/26) Sep 23 2009 Well, I see this as a problem because I've often relied on default
- Jeremie Pelletier (8/34) Sep 23 2009 pragma(msg, char.init.stringof);
- Walter Bright (7/10) Sep 24 2009 That's exactly what drove the design choices.
- Steven Schveighoffer (26/36) Sep 22 2009 A null string *is* an empty string, but an empty string may not be a nul...
- Justin Johansson (8/40) Sep 22 2009 Good write-up Steve; thanks.
- bearophile (5/19) Sep 23 2009 One small disadvantage of some of those init values (for not int variabl...
In a templated class (D1.0) along lines ...

    class Foo(T)
    {
        //..
        static T bar() { return T.init; }
        //..
    }

Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.

I'd much prefer (at least for my purposes) that (char[]).init returned an empty string rather than effectively a null pointer. Is there a convenient solution for this, e.g. by specializing just the bar method of class Foo when T is char[], or by some other means?

Maybe this type of question is best asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.

Thanks to all.
Sep 22 2009
Justin Johansson wrote:
> In a templated class (D1.0) along lines ...
>
>     class Foo(T)
>     {
>         //..
>         static T bar() { return T.init; }
>         //..
>     }
>
> Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.
>
> I'd much prefer (at least for my purposes) that (char[]).init returned an empty string rather than effectively a null pointer. Is there a convenient solution for this, e.g. by specializing just the bar method of class Foo when T is char[], or by some other means?

You could use a custom type, which would solve your .init problem:

    typedef string myString = "";

Or you could specialize your bar():

    static T bar()
    {
        static if(isSomeString!T) return "";
        else return T.init;
    }

I myself favor a null initializer: since char[] is a reference type, not a value type, it only makes sense to initialize it to a null reference.
Sep 22 2009
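A minimal sketch of the specialization suggested above, assuming a current D2 compiler rather than the D1.0 of the original post. In D2 a "" literal is a string (immutable(char)[]) and cannot be returned as a mutable char[], so this sketch specializes on string. Foo and bar follow the original post; the is(T == string) test is just one way to write the check.

```d
// Foo and bar mirror the names in the original question.
class Foo(T)
{
    static T bar()
    {
        // Compile-time branch: only the matching arm is compiled for a given T.
        static if (is(T == string))
            return "";       // zero length, but a non-null pointer
        else
            return T.init;   // 0 for int, null for arrays, and so on
    }
}

unittest
{
    assert(Foo!int.bar() == 0);
    assert(Foo!string.bar().length == 0);
    assert(Foo!string.bar().ptr !is null);  // literal "" is not a null slice
    assert(Foo!(int[]).bar() is null);      // other arrays keep the null default
}
```

Running this with unit tests enabled (for example with the -unittest and -main compiler switches) exercises all four cases.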
On Tue, Sep 22, 2009 at 8:07 AM, Justin Johansson <procode adam-dott-com.au> wrote:
> In a templated class (D1.0) along lines ...
>
>     class Foo(T)
>     {
>         //..
>         static T bar() { return T.init; }
>         //..
>     }
>
> Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.
>
> I'd much prefer (at least for my purposes) that (char[]).init returned an empty string rather than effectively a null pointer. Is there a convenient solution for this, e.g. by specializing just the bar method of class Foo when T is char[], or by some other means?
>
> Maybe this type of question best be asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.
>
> Thanks to all.

There's no real difference between an empty string and a null reference. Both have 0 length.
Sep 22 2009
Jarrett Billingsley Wrote:
> On Tue, Sep 22, 2009 at 8:07 AM, Justin Johansson wrote:
>> Maybe this type of question best be asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.
> There's no real difference between an empty string and a null reference. Both have 0 length.

Big difference if you pass char[] variable .ptr to a C function.

Tks Jeremie got specialized method working with

    static if ( typeid(T) is typeid(char[])) {
    }
    else {
        init_sequence = new ExactlyOne!(T)( T.init);
    }
Sep 22 2009
Justin Johansson Wrote:

Scratch that last garbled reply .. finger trouble. Was going to say that ...

> There's no real difference between an empty string and a null reference. Both have 0 length.

Big difference if you pass char[] variable .ptr to a C function.

And thanks Jeremie, got specialized method working with

    static if ( typeid(T) is typeid(char[])) {
        // ..
    }
    else {
        // ..
    }

Cheers
Justin Johansson
Sep 22 2009
In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.

Justin Johansson wrote:
>> There's no real difference between an empty string and a null reference. Both have 0 length.
> Big difference if you pass char[] variable .ptr to a C function.
Sep 22 2009
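A short sketch of the toStringz advice above, assuming a current D2 compiler with Phobos. cLength is a hypothetical helper invented for this illustration; toStringz (std.string) and strlen (core.stdc.string) are the real library functions.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical helper: hand a D slice to a C function safely.
// toStringz returns a pointer to a zero-terminated copy (or to an
// already terminated literal), so strlen never runs off the end.
size_t cLength(const(char)[] s)
{
    return strlen(toStringz(s));
}

unittest
{
    char[] unset;                       // default-initialized: null pointer
    assert(unset.ptr is null);          // passing unset.ptr to C would be a bug
    assert(cLength(unset) == 0);        // toStringz yields a valid empty C string

    char[] buf = "hello world".dup;
    assert(cLength(buf[0 .. 5]) == 5);  // the slice itself has no terminator
}
```

The slice case is the important one: buf[0 .. 5] is followed in memory by a space, not a zero byte, so passing its .ptr directly would make the C side read past the intended end.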
Daniel Keep Wrote:
>> Big difference if you pass char[] variable .ptr to a C function.
> In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.

Agreed .. fair enough.

Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et al. context)?
Sep 22 2009
Justin Johansson wrote:
> Daniel Keep Wrote:
>>> Big difference if you pass char[] variable .ptr to a C function.
>> In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
> Agreed .. fair enough. Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et. al. context)?

It isn't the same semantics: a null array is {0, null}, while an empty array is {0, &zero} where zero is of type 'char zero = 0;' since string literals are zero terminated.

Their usage is mostly the same: you can concatenate both of them, append to both of them, and so on, all giving the same results. Where it makes a difference is when you need to enforce an invariant that .ptr is not null. Calling toStringz on either will give the same C string: a pointer to a zero value.

You have to remember that arrays are reference types; they are perfectly valid without referenced data. Think of pointers or objects for example, which are also reference types.

Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Sep 22 2009
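The {0, null} versus {0, &zero} distinction described above can be checked directly. This is a sketch assuming a current D2 compiler; hasNullPtr is a hypothetical helper named for this example only.

```d
// Hypothetical helper: true when the slice carries a null pointer.
bool hasNullPtr(const(char)[] s)
{
    return s.ptr is null;
}

unittest
{
    char[] unset;         // default-initialized: length 0, null pointer
    string empty = "";    // literal: length 0, pointer to static data

    // Both are empty, and == compares contents, so they are equal.
    assert(unset.length == 0 && empty.length == 0);
    assert(unset == empty);

    // Identity ("is") also looks at the pointer, so they differ there.
    assert(unset is null);
    assert(empty !is null);
    assert(hasNullPtr(unset) && !hasNullPtr(empty));

    // Appending works on the null array; it simply allocates.
    unset ~= 'a';
    assert(unset == "a" && !hasNullPtr(unset));
}
```

In practice this means code that tests s.length == 0 treats both the same, while code that tests s is null or inspects s.ptr does not.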
Jeremie Pelletier Wrote:
> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.

Consistency. Since when is that an argument?

Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)

    short.init     0
    int.init       0
    bool.init      false
    byte.init      0
    double.init    double.nan
    long.init      0L
Sep 22 2009
Justin Johansson wrote:
> Jeremie Pelletier Wrote:
>> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
> Consistency. Since when is that an argument?
>
> Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
>
>     short.init     0
>     int.init       0
>     bool.init      false
>     byte.init      0
>     double.init    double.nan
>     long.init      0L

Obviously the NaN floating points, which have annoyed me quite a few times. Every other type in D inits to zeroed memory, with the exception of void initializers.
Sep 22 2009
Justin Johansson wrote:
> Jeremie Pelletier Wrote:
>> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
> Consistency. Since when is that an argument?
>
> Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
>
>     short.init     0
>     int.init       0
>     bool.init      false
>     byte.init      0
>     double.init    double.nan
>     long.init      0L

You forgot

    char.init      0xFF
    wchar.init     0xFFFF
    dchar.init     0xFFFFFFFF

Andrei
Sep 22 2009
Andrei Alexandrescu wrote:
> You forgot
>
>     char.init      0xFF
>     wchar.init     0xFFFF
>     dchar.init     0xFFFFFFFF
>
> Andrei

Actually, dchar.init is "\U0000ffff".

Jeremie
Sep 22 2009
Andrei Alexandrescu Wrote:
> You forgot
>
>     char.init      0xFF
>     wchar.init     0xFFFF
>     dchar.init     0xFFFFFFFF
>
> Andrei

Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.)

Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors.

Hmm, guess one could argue the init issue for eons.

-- Justin
Sep 22 2009
On 2009-09-22 18:08:24 -0400, Justin Johansson <procode adam-dott-com.au> said:
>> You forgot
>>
>>     char.init      0xFF
>>     wchar.init     0xFFFF
>>     dchar.init     0xFFFFFFFF
>>
>> Andrei
> Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.)
>
> Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors.
>
> Hmm, guess one could argue the init issue for eons.

Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contrary to what Andrei wrote) and wchar.init == 0xFFFF.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Sep 23 2009
Michel Fortin wrote:
> Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contraty to what Andrei wrote) and wchar.init == 0xFFFF.

    pragma(msg, char.init.stringof);

outputs '\xff' in D2; wchar and dchar have the same initializer: '\U0000FFFF'.

If you rely on the char initializer being the null character, use char c = 0; otherwise your char gets initialized to an invalid character, just like floats get initialized to NaN. Other types either use null as their invalid value, or have no invalid value and use 0.
Sep 23 2009
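The default initializers debated above can be pinned down with a few assertions. This sketch assumes a current D2 compiler, whose values may differ from what a 2009 D1 build reported; freshBuffer is a hypothetical helper used only to show how the defaults surface in freshly allocated arrays.

```d
import std.math : isNaN;

// Hypothetical helper: allocate n default-initialized elements of type T.
T[] freshBuffer(T)(size_t n)
{
    return new T[n];
}

unittest
{
    // Character types default to values that are invalid as UTF code units.
    assert(char.init == 0xFF);
    assert(wchar.init == 0xFFFF);
    assert(dchar.init == 0x0000FFFF);

    // Floating point defaults to NaN; integers and bool default to zero.
    assert(isNaN(double.init));
    assert(int.init == 0 && bool.init == false);

    // The same defaults fill new arrays, so a zero-filled char buffer
    // has to be requested explicitly.
    assert(freshBuffer!int(4)[0] == 0);
    assert(freshBuffer!char(4)[0] == 0xFF);
    char[4] zeroed = 0;
    assert(zeroed[3] == 0);
}
```

An algorithm that silently assumes zeroed character storage would therefore need the explicit = 0 initializer shown on the last array.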
Justin Johansson wrote:
> Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors.

That's exactly what drove the design choices. If there was a nan value for integers, D would use that. But there isn't, so 0 is the best we can do.

Andrei and I were talking last night about the purity of software design principles and the reality, and how the reality forces compromise on the purity if you wanted to get anything done.
Sep 24 2009
On Tue, 22 Sep 2009 09:53:52 -0400, Justin Johansson <procode adam-dott-com.au> wrote:
> Daniel Keep Wrote:
>>> Big difference if you pass char[] variable .ptr to a C function.
>> In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
> Agreed .. fair enough. Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et. al. context)?

A null string *is* an empty string, but an empty string may not be a null string.

The subtle difference is that the pointer points to null versus some data. A non-null empty string:

- May be pointing to heap data, therefore keeping the data from being collected.
- May reallocate in place on appending (a null string always must allocate new data on append).

It's a difficult concept to get, but an array is really a hybrid type between a reference and a value type. The array is actually a value type struct with a pointer reference and a length value. If the length is zero, then the pointer value technically isn't needed, but in subtle cases, it makes a difference. When you copy the array, the length behaves like a value type (changing the length of one array doesn't affect the other), but the array data is referenced (changing an element of the array *does* affect the other).

I think plans are to make the array a full reference type, and leave slices as these structs (in D2). This probably will clear up a lot of confusion people have.

I hope this helps...

Oh, and BTW, you can pass string literals to C functions, but *not* char[] variables. Always pass them through toStringz. It generally does not take much time/resources to add the zero.

-Steve
Sep 22 2009
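The hybrid behaviour described above (length copied by value, elements shared by reference) can be sketched as follows, assuming a current D2 compiler. writeThenShrink is a hypothetical helper named for this example.

```d
// Hypothetical helper: receives a copy of the caller's {length, pointer}
// pair, writes through the shared pointer, then shortens its own copy.
int[] writeThenShrink(int[] s)
{
    s[0] = 99;                  // visible to the caller: same element data
    s.length = s.length - 1;    // not visible to the caller: local length
    return s;
}

unittest
{
    int[] a = [1, 2, 3];
    int[] b = writeThenShrink(a);

    assert(a[0] == 99);       // element change affected the original
    assert(a.length == 3);    // length change did not
    assert(b.length == 2);
    assert(b.ptr is a.ptr);   // both still refer to the same memory
}
```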
Steven Schveighoffer Wrote:
> A null string *is* an empty string, but an empty string may not be a null string.
> [snip]
> It's a difficult concept to get, but an array is really a hybrid type between a reference and a value type. The array is actually a value type struct with a pointer reference and a length value.
> [snip]
> -Steve

Good write-up Steve; thanks.

Being relatively new to D, but from a strong C++ and assembler background, I did the usual interrogation for interest:

    writefln( "(char[]).sizeof=%d", (char[]).sizeof);

8 bytes.

So if you wanted to intern string data to conserve memory, and reference such data with a single 32-bit pointer, it sounds like you would have to do this with either a char* or perhaps a pointer to a char[], rather than a full char[] field in your class or struct. There's less reason to want to intern string data if you still need 8 bytes to reference said data.

Justin
Sep 22 2009
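The 8-byte figure above comes from a 32-bit build: a slice holds a length and a pointer, each the size of size_t. A sketch that states this in a way that also holds on 64-bit targets, assuming a current D2 compiler (WithSlice and WithPointer are hypothetical types for illustration):

```d
// Hypothetical types comparing the two ways of referencing string data.
struct WithSlice   { char[] text; }   // length + pointer
struct WithPointer { char*  text; }   // pointer only

// A slice is exactly one size_t for the length plus one pointer.
static assert((char[]).sizeof == size_t.sizeof + (char*).sizeof);

// So a slice field costs twice what a bare pointer field costs:
// 8 vs 4 bytes on 32-bit targets, 16 vs 8 bytes on 64-bit targets.
static assert(WithSlice.sizeof == 2 * WithPointer.sizeof);
```

The trade-off with the bare pointer is that the length has to be recovered some other way, for example from a zero terminator or from the interned data itself.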
Justin Johansson (I think):
>     short.init     0
>     int.init       0
>     bool.init      false
>     byte.init      0
>     double.init    double.nan
>     long.init      0L

Andrei Alexandrescu:
> You forgot
>
>     char.init      0xFF
>     wchar.init     0xFFFF
>     dchar.init     0xFFFFFFFF

One small disadvantage of some of those init values (for non-int variables) is that if you have a large global static array of floats or chars in your program, its memory ends up in the binary, which can become huge (you can avoid that by setting the static array to void, and then I think the LDC compiler or the operating system resets such memory to zero anyway).

To avoid such huge binaries D could keep small static arrays as they are now, but initialize the large static arrays at run time (before main).

Bye,
bearophile
Sep 23 2009