digitalmars.D - Suggestion: char.init, wchar.init and dchar.init
- Arcane Jill (19/19) Jun 07 2004 Hi,
- Ilya Minkov (2/2) Jun 07 2004 Gets my vote!
- Walter (13/32) Jun 07 2004 That's a good idea.
- Hauke Duden (11/33) Jun 07 2004 I like the 0 initialization. It is consistent and easy to understand and...
- Ben Hinkle (1/4) Jun 07 2004 .init?
- Arcane Jill (10/13) Jun 07 2004 You're not supposed to /test/ for uninitialized variables - you're simpl...
Hi,

The default value of NaN for floating point numbers is an excellent idea. I suggest that we do the same thing for chars, wchars and dchars.

The init value for char should (IMO) be 0xFF. Rationale - char by definition contains a UTF-8 fragment. The byte 0xFF will never occur in a valid UTF-8 sequence. It is a clear indication of an unassigned value.

The init value for wchar and dchar should be 0xFFFF (that is, 0x0000FFFF for dchar). Rationale - wchar and dchar by definition contain UTF-16 and UTF-32 (equivalent to plain Unicode within their defined ranges). The codepoint U+FFFF is not a legitimate Unicode character, and, furthermore, it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character. This codepoint will remain forever unassigned, precisely so that it may be used for purposes such as this.

Be it noted that the codepoint 0 is a bad choice for a default value. It might have made sense in C, where '\0' has special meaning as a string terminator, but in D '\0' is just another character. Unicode defines '\0' as a control character whose interpretation is implementation dependent. Better, I feel, to use a value with universal meaning.

Jill
Jun 07 2004
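The claim that the byte 0xFF never occurs in valid UTF-8 can be checked by brute force. The following is only a sketch in D, assuming a current Phobos in which std.utf.encode accepts every non-surrogate code point up to U+10FFFF; the helper name byteOccursInUtf8 is made up for illustration and is not part of any library.

```d
import std.utf : encode;

/// Returns true if the UTF-8 encoding of at least one code point
/// contains the byte `target`. (Hypothetical helper, for illustration.)
bool byteOccursInUtf8(ubyte target)
{
    char[4] buf;
    foreach (uint cp; 0 .. 0x110000)
    {
        // Surrogates are not encodable code points, so skip them.
        if (cp >= 0xD800 && cp <= 0xDFFF)
            continue;

        // encode writes the UTF-8 form into buf and returns its length.
        immutable len = encode(buf, cast(dchar) cp);
        foreach (b; buf[0 .. len])
            if (b == target)
                return true;
    }
    return false;
}

void main()
{
    assert(!byteOccursInUtf8(0xFF)); // 0xFF appears in no encoding
    assert(byteOccursInUtf8(0x00));  // 0x00 is the encoding of U+0000
}
```

The check says nothing about whether 0xFF is a good default; it only confirms that the value cannot be confused with part of a well-formed UTF-8 string.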
That's a good idea.

"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:ca17qq$224t$1 digitaldaemon.com...
> Hi,
> The default value of NaN for floating point numbers is an excellent idea. I suggest that we do the same thing for chars, wchars and dchars.
> The init value for char should (IMO) be 0xFF. Rationale - char by definition contains a UTF-8 fragment. The byte 0xFF will never occur in a valid UTF-8 sequence. It is a clear indication of an unassigned value.
> The init value for wchar and dchar should be 0xFFFF (that is, 0x0000FFFF for dchar). Rationale - wchar and dchar by definition contain UTF-16 and UTF-32 (equivalent to plain Unicode within their defined ranges). The codepoint U+FFFF is not a legitimate Unicode character, and, furthermore, it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character. This codepoint will remain forever unassigned, precisely so that it may be used for purposes such as this.
> Be it noted that the codepoint 0 is a bad choice for a default value. It might have made sense in C, where '\0' has special meaning as a string terminator, but in D '\0' is just another character. Unicode defines '\0' as a control character whose interpretation is implementation dependent. Better, I feel, to use a value with universal meaning.
> Jill
Jun 07 2004
Arcane Jill wrote:
> Hi,
> The default value of NaN for floating point numbers is an excellent idea. I suggest that we do the same thing for chars, wchars and dchars.
> The init value for char should (IMO) be 0xFF. Rationale - char by definition contains a UTF-8 fragment. The byte 0xFF will never occur in a valid UTF-8 sequence. It is a clear indication of an unassigned value.
> The init value for wchar and dchar should be 0xFFFF (that is, 0x0000FFFF for dchar). Rationale - wchar and dchar by definition contain UTF-16 and UTF-32 (equivalent to plain Unicode within their defined ranges). The codepoint U+FFFF is not a legitimate Unicode character, and, furthermore, it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character. This codepoint will remain forever unassigned, precisely so that it may be used for purposes such as this.
> Be it noted that the codepoint 0 is a bad choice for a default value. It might have made sense in C, where '\0' has special meaning as a string terminator, but in D '\0' is just another character. Unicode defines '\0' as a control character whose interpretation is implementation dependent. Better, I feel, to use a value with universal meaning.

I like the 0 initialization. It is consistent and easy to understand and remember. And it has an important function: if anyone ever passes an uninitialized D memory block to functions that expect a 0-terminated string, then nothing bad will happen.

But then again, I also don't like that floats are initialized to NaN.

If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".

Hauke
Jun 07 2004
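The point about 0-terminated strings can be illustrated with a small D sketch. It fills the buffers explicitly rather than relying on any compiler default, and it searches only inside the buffer; a real C function such as strlen would keep reading past the end when no terminator is present. The helper terminatorIndex is hypothetical and exists only for this example.

```d
/// Index of the first '\0' in buf, or -1 if the buffer holds no terminator.
/// (Hypothetical helper, for illustration.)
ptrdiff_t terminatorIndex(const(char)[] buf)
{
    foreach (i, c; buf)
        if (c == '\0')
            return i;
    return -1;
}

void main()
{
    // Stands in for a block whose chars default to 0.
    char[8] zeroFilled = '\0';
    // Stands in for a block whose chars default to 0xFF.
    char[8] ffFilled = cast(char) 0xFF;

    // A C-style reader stops immediately: the block looks like "".
    assert(terminatorIndex(zeroFilled[]) == 0);
    // No terminator anywhere inside the block.
    assert(terminatorIndex(ffFilled[]) == -1);
}
```

Under these assumptions the 0-filled block behaves like an empty C string, while the 0xFF-filled block offers a C reader nothing to stop on.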
> If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".

.init?
Jun 07 2004
In article <ca2754$h5k$1 digitaldaemon.com>, Hauke Duden says...
> If it HAS to be done then there should definitely be an easy-to-remember property for the char types to test for this. Otherwise many programmers will have a hard time remembering which value means "not a char".

You're not supposed to /test/ for uninitialized variables - you're simply supposed to initialize them! And failing to do so is, of course, exactly the error we're trying to catch. Anyway, you could always test for "if (c == char.init)" no matter what char.init was.

By the way, I got to look at your Unichar code today. Excellent stuff. It's on my machine now. Also, you were right about doxygen, judging by the quality of your documentation - it really does rock.

Jill
Jun 07 2004
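The "if (c == char.init)" test can be written once for all three character types. This is a minimal sketch in D; the template name isUnset is invented for the example, and the code makes no assumption about what value .init actually has.

```d
/// True if `c` still holds the default value of its character type.
/// (Hypothetical helper, for illustration.)
bool isUnset(T)(T c)
if (is(T == char) || is(T == wchar) || is(T == dchar))
{
    return c == T.init;
}

void main()
{
    char c;   // default-initialized, so it holds char.init
    wchar w;  // holds wchar.init
    dchar d;  // holds dchar.init
    assert(isUnset(c) && isUnset(w) && isUnset(d));

    c = 'a';  // any ordinary letter differs from the default
    assert(!isUnset(c));
}
```

Such a check cannot tell an unset variable from one deliberately assigned the default value, so it is only as useful as the default is unlikely to be real data.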