digitalmars.D - Whitespace for Walter
- Arcane Jill (28/28) Jun 26 2004 Another mad suggestion coming up, but this one might actually make some ...
Another mad suggestion coming up, but this one might actually make some sort of sense.

Unicode whitespace is defined as any of the following characters, and no other:

0009..000D  <control-0009>..<control-000D>
0020        SPACE
0085        <control-0085>
00A0        NO-BREAK SPACE
1680        OGHAM SPACE MARK
180E        MONGOLIAN VOWEL SEPARATOR
2000..200A  EN QUAD..HAIR SPACE
2028        LINE SEPARATOR
2029        PARAGRAPH SEPARATOR
202F        NARROW NO-BREAK SPACE
205F        MEDIUM MATHEMATICAL SPACE
3000        IDEOGRAPHIC SPACE

How straightforward would it be to allow the DMD compiler to accept /precisely/ this list as whitespace in a D source file?

Java got itself into a bit of a pickle by defining whitespace differently from Unicode. They ended up having to have two separate functions (which from memory I think are called isWhitespace() and isJavaWhitespace(), but I could be wrong).

It would be quite cool to have D whitespace and Unicode whitespace as one and the same thing, don't you think?

Arcane Jill

PS. I /don't/ recommend changing the value of const char[] whitespace; in std.string, however. To do so would set an AWFUL precedent which const char letters would NOT want to follow. You might, however, consider renaming those constants to ASCII_WHITESPACE, ASCII_LETTERS, etc., once the new Unicode stuff is up.
Jun 26 2004
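As a rough illustration of the proposal above, the listed code points can be turned into a simple membership test of the kind a lexer might call when skipping whitespace. This is only a sketch in Java (the language the post compares against); the class name UnicodeWhitespaceList and the method isListedWhitespace are hypothetical and are not part of DMD or Phobos. The ranges are copied from the list in the post, which reflects the Unicode version current at the time; later Unicode versions have revised the set.

```java
// Sketch only: membership test for the code points listed in the post.
// Class and method names are hypothetical, not taken from DMD or Phobos.
final class UnicodeWhitespaceList {

    // Returns true if cp is one of the code points in the quoted list.
    static boolean isListedWhitespace(int cp) {
        return (cp >= 0x0009 && cp <= 0x000D)   // <control-0009>..<control-000D>
            || cp == 0x0020                     // SPACE
            || cp == 0x0085                     // <control-0085>
            || cp == 0x00A0                     // NO-BREAK SPACE
            || cp == 0x1680                     // OGHAM SPACE MARK
            || cp == 0x180E                     // MONGOLIAN VOWEL SEPARATOR
            || (cp >= 0x2000 && cp <= 0x200A)   // EN QUAD..HAIR SPACE
            || cp == 0x2028                     // LINE SEPARATOR
            || cp == 0x2029                     // PARAGRAPH SEPARATOR
            || cp == 0x202F                     // NARROW NO-BREAK SPACE
            || cp == 0x205F                     // MEDIUM MATHEMATICAL SPACE
            || cp == 0x3000;                    // IDEOGRAPHIC SPACE
    }

    public static void main(String[] args) {
        // A few sample code points: three from the list, two outside it.
        int[] samples = {0x0020, 0x00A0, 0x3000, 0x0041, 0x200B};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %b%n", cp, isListedWhitespace(cp));
        }
    }
}
```

A range check like this keeps the test cheap for the common ASCII case, since the first two comparisons settle most characters found in ordinary source files.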
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cbkqcc$usj$1 digitaldaemon.com...

> Another mad suggestion coming up, but this one might actually make some sort of sense.
>
> Unicode whitespace is defined as any of the following characters, and no other:
>
> 0009..000D  <control-0009>..<control-000D>
> 0020        SPACE
> 0085        <control-0085>
> 00A0        NO-BREAK SPACE
> 1680        OGHAM SPACE MARK
> 180E        MONGOLIAN VOWEL SEPARATOR
> 2000..200A  EN QUAD..HAIR SPACE
> 2028        LINE SEPARATOR
> 2029        PARAGRAPH SEPARATOR
> 202F        NARROW NO-BREAK SPACE
> 205F        MEDIUM MATHEMATICAL SPACE
> 3000        IDEOGRAPHIC SPACE
>
> How straightforward would it be to allow the DMD compiler to accept /precisely/ this list as whitespace in a D source file?
>
> Java got itself into a bit of a pickle by defining whitespace differently from Unicode. They ended up having to have two separate functions (which from memory I think are called isWhitespace() and isJavaWhitespace(), but I could be wrong).

In Java:

Character.isSpace(char c) is deprecated and replaced by Character.isWhitespace(char c).

Also Character.isSpaceChar(char ch) for the Unicode space char.

There is no "isJavaWhitespace()" or "Whitespace()".

There is Character.isJavaLetterOrDigit(char c), which is deprecated; maybe you were confusing it with this.

Phill.
Jun 26 2004
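To make the difference between the two Java methods named above concrete, here is a small sketch that prints what Character.isWhitespace and Character.isSpaceChar report for a handful of code points. The class name JavaSpaceChecks and the helper classify are made up for this illustration; only the two Character methods come from the Java standard library. Going by their documented behaviour, isWhitespace excludes the no-break spaces and includes a few control characters, while isSpaceChar follows the Unicode space separator categories, so neither lines up exactly with the list at the top of the thread.

```java
// Sketch only: compares Java's two whitespace predicates on sample code points.
// JavaSpaceChecks and classify are hypothetical names used for illustration.
final class JavaSpaceChecks {

    // Reports both predicates for one code point as a short string.
    static String classify(int cp) {
        return "isWhitespace=" + Character.isWhitespace(cp)
             + " isSpaceChar=" + Character.isSpaceChar(cp);
    }

    public static void main(String[] args) {
        // TAB, SPACE, NEL (U+0085), NO-BREAK SPACE, FILE SEPARATOR (U+001C).
        int[] samples = {0x0009, 0x0020, 0x0085, 0x00A0, 0x001C};
        for (int cp : samples) {
            System.out.printf("U+%04X %s%n", cp, classify(cp));
        }
    }
}
```

Of these samples, only SPACE satisfies both predicates. TAB and U+001C satisfy isWhitespace alone, NO-BREAK SPACE satisfies isSpaceChar alone, and U+0085 satisfies neither even though it appears in the Unicode list quoted in the thread.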
I think it's a good idea.

"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cbkqcc$usj$1 digitaldaemon.com...

> Another mad suggestion coming up, but this one might actually make some sort of sense.
>
> Unicode whitespace is defined as any of the following characters, and no other:
>
> 0009..000D  <control-0009>..<control-000D>
> 0020        SPACE
> 0085        <control-0085>
> 00A0        NO-BREAK SPACE
> 1680        OGHAM SPACE MARK
> 180E        MONGOLIAN VOWEL SEPARATOR
> 2000..200A  EN QUAD..HAIR SPACE
> 2028        LINE SEPARATOR
> 2029        PARAGRAPH SEPARATOR
> 202F        NARROW NO-BREAK SPACE
> 205F        MEDIUM MATHEMATICAL SPACE
> 3000        IDEOGRAPHIC SPACE
>
> How straightforward would it be to allow the DMD compiler to accept /precisely/ this list as whitespace in a D source file?
>
> Java got itself into a bit of a pickle by defining whitespace differently from Unicode. They ended up having to have two separate functions (which from memory I think are called isWhitespace() and isJavaWhitespace(), but I could be wrong).
>
> It would be quite cool to have D whitespace and Unicode whitespace as one and the same thing, don't you think?
>
> Arcane Jill
>
> PS. I /don't/ recommend changing the value of const char[] whitespace; in std.string, however. To do so would set an AWFUL precedent which const char letters would NOT want to follow. You might, however, consider renaming those constants to ASCII_WHITESPACE, ASCII_LETTERS, etc., once the new Unicode stuff is up.
Jun 26 2004