digitalmars.D.bugs - char code
- Hiroshi Sakurai (22/22) May 16 2005 Hi.
- Ben Hinkle (11/34) May 17 2005 I'm confused. Is the problem with raw strings like r"blah" or with
- Uwe Salomon (24/47) May 17 2005 Yes, and the boxes are U+FFFD, that is the Unicode replacement character...
- Uwe Salomon (4/7) May 17 2005 Hm, as i see now, Thomas already accomplished that (how?). Please ignore...
- Thomas Kuehne (10/14) May 17 2005 -----BEGIN PGP SIGNED MESSAGE-----
- Uwe Salomon (5/7) May 17 2005 Yes, put salt on the open wound! :-P
- Thomas Kuehne (7/15) May 18 2005 There are input modules for X11 and gtk that support quite a range of
- Thomas Kuehne (15/35) May 17 2005 -----BEGIN PGP SIGNED MESSAGE-----
Hi.

This topic was written up on the 2ch BBS:
http://pc8.2ch.net/test/read.cgi/tech/1109933426/567
and in the Japanese D language wiki bug tracker:
http://f17.aaa.livedoor.jp/~labamba/?BugTrack%2F13

Illegal non-ASCII WYSIWYG string.
Version: dmd 0.123

/* code page utf8 */
private import std.stream;

void main()
{
    // valid
    char[] str = "ワロスｗ";
    stdout.writeString(str);
    // valid output : E3 83 AF E3 83 AD E3 82 B9 EF BD 97

    // invalid
    char[] str2 = r"ワロスｗ"; // or char[] str = `ワロスｗ`;
    stdout.writeString(str2);
    // invalid output : E3 E3 E3 EF

    return;
}

Thanks,
Hiroshi Sakurai.

Sorry, my English is very poor. OTL
May 16 2005
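For readers who want to repeat the check on a current toolchain, here is a minimal sketch. It assumes a modern D compiler with Phobos (`std.format`, `std.string.representation`) rather than the `std.stream` module of dmd 0.123, a source file saved as UTF-8, and a full-width `ｗ` (U+FF57) as the last character, which is what the reported bytes `EF BD 97` imply. The helper name `hexBytes` is made up for illustration. On a correct compiler all three literal forms should yield the same bytes.

```d
import std.format : format;
import std.stdio : writeln;
import std.string : representation;

/// Hypothetical helper: the UTF-8 bytes of `s` as space-separated hex.
string hexBytes(string s)
{
    // "%( ... %)" formats every element of a range; the text after the
    // element spec (a single space here) is used as the separator.
    return format("%(%02X %)", s.representation);
}

void main()
{
    string normal = "ワロスｗ";   // ordinary double-quoted literal
    string raw    = r"ワロスｗ";  // r"..." wysiwyg literal
    string tick   = `ワロスｗ`;   // backquoted wysiwyg literal

    writeln(hexBytes(normal));
    writeln(hexBytes(raw));
    writeln(hexBytes(tick));

    // The report above is that dmd 0.123 violated this for wysiwyg literals.
    assert(normal == raw && raw == tick);
}
```

If the assertion holds, each line printed should be the "valid output" sequence from the report.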
"Hiroshi Sakurai" <Hiroshi_member pathlink.com> wrote in message news:d6bm67$cfr$1 digitaldaemon.com...
> Hi.
>
> This topic was written up on the 2ch BBS:
> http://pc8.2ch.net/test/read.cgi/tech/1109933426/567
> and in the Japanese D language wiki bug tracker:
> http://f17.aaa.livedoor.jp/~labamba/?BugTrack%2F13
>
> Illegal non-ASCII WYSIWYG string.
> Version: dmd 0.123
>
> /* code page utf8 */
> private import std.stream;
>
> void main()
> {
>     // valid
>     char[] str = "f叔糠X,-";
>     stdout.writeString(str);
>     // valid output : E3 83 AF E3 83 AD E3 82 B9 EF BD 97
>
>     // invalid
>     char[] str2 = r"f叔糠X,-"; // or char[] str = `f叔糠X,-`;
>     stdout.writeString(str2);
>     // invalid output : E3 E3 E3 EF
>
>     return;
> }
>
> Thanks,
> Hiroshi Sakurai.
>
> Sorry, my English is very poor. OTL

I'm confused. Is the problem with raw strings like r"blah" or with std.stream? Stream.writeString doesn't look at encodings, so whatever is going wrong is happening before the call to writeString.

Since I don't have the proper fonts or encoding support in my news reader, I only see the raw string r"f..." with boxes in it, so I can't tell what is actually in the source file you are trying to compile. A raw string is a sequence of bytes assumed to be in UTF-8 encoding. Is that what is in your source file?

-Ben
May 17 2005
>> void main()
>> {
>>     // valid
>>     char[] str = "fソスfソスfX,-";
>>     stdout.writeString(str);
>>     // valid output : E3 83 AF E3 83 AD E3 82 B9 EF BD 97
>>
>>     // invalid
>>     char[] str2 = r"fソスfソスfX,-"; // or char[] str = `fソスfソスfX,-`;
>>     stdout.writeString(str2);
>>     // invalid output : E3 E3 E3 EF
>>
>>     return;
>> }
>>
>> Thanks,
>> Hiroshi Sakurai.
>>
>> Sorry, my English is very poor. OTL
>
> I'm confused. Is the problem with raw strings like r"blah" or with std.stream? Stream.writeString doesn't look at encodings, so whatever is going wrong is happening before the call to writeString.
> Since I don't have the proper fonts or encoding support in my news reader, I only see the raw string r"f..." with boxes in it, so I can't tell what is actually in the source file you are trying to compile.

Yes, and the boxes are U+FFFD, that is the Unicode replacement character. Whatever he typed in, it didn't make its way to us. But it is interesting to note that dmd's behaviour for the normal and the wysiwyg string is still different:

UTF8: 66 ef bf bd 66 ef bf bd 66 58 2c 2d
UTF16: 66 fffd 66 fffd 66 58 2c 2d

This is the normal string in UTF8 and UTF16 (note the U+FFFD replacement character).

UTF8: 66 ef 66 ef 66 58 2c 2d
UTF16: 66 f9af 66 58 2c 2d

And this one is the wysiwyg string, with the contents of the other one copied and pasted. Note that dmd omitted the "BF BD" after "66 EF". That produces illegal Unicode, as you can see from the UTF16 translation (which is simply wrong, because the conversion algorithm does not check for invalid input).

Hmm, after some more thinking I found that the whole f?f?fX,- sequence is wrong; it just does not match the "valid output" he gives above. He wants to input the following:

UTF8: e3 83 af e3 83 ad e3 82 b9 ef bd 97
UTF16: 30ef 30ed 30b9 ff57

Does anybody know how to input these characters with Linux? I don't have any input device for that :) Or easier, Hiroshi, could you please send your input file over the list?

Ciao
uwe
May 17 2005
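The byte arithmetic in the post above can be checked with a small sketch. It assumes a modern D compiler with Phobos (`std.utf.toUTF16`, `std.string.representation`, `std.algorithm.filter`), a UTF-8 source file, and a full-width `ｗ` (U+FF57) as the last character of the intended string; the helper names `utf16Hex` and `leadBytesHex` are hypothetical. It prints the UTF-16 code units of the intended string, and it also shows that keeping only the first byte of each UTF-8 sequence gives exactly the bytes reported as "invalid output". That second part is only an observation about the bytes, not a diagnosis of what the compiler did internally.

```d
import std.algorithm : filter;
import std.format : format;
import std.stdio : writeln;
import std.string : representation;
import std.utf : toUTF16;

/// Hypothetical helper: UTF-16 code units of `s` as spaced lowercase hex.
string utf16Hex(string s)
{
    return format("%(%04x %)", s.toUTF16.representation);
}

/// Hypothetical helper: only the bytes that start a UTF-8 sequence,
/// i.e. everything except continuation bytes of the form 10xxxxxx.
string leadBytesHex(string s)
{
    return format("%(%02X %)",
                  s.representation.filter!(b => (b & 0xC0) != 0x80));
}

void main()
{
    string intended = "ワロスｗ";

    writeln(utf16Hex(intended));     // 30ef 30ed 30b9 ff57
    writeln(leadBytesHex(intended)); // E3 E3 E3 EF
}
```

The first line agrees with the UTF16 values listed in the post, and the second matches the four bytes from the original report.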
> Does anybody know how to input these characters with Linux? I don't have any input device for that :) Or easier, Hiroshi, could you please send your input file over the list?

Hm, as I see now, Thomas already accomplished that (how?). Please ignore my posting. :(

Ciao
uwe
May 17 2005
Uwe Salomon wrote on Tue, 17 May 2005 19:16:21 +0200:
>> Does anybody know how to input these characters with Linux? I don't have any input device for that :) Or easier, Hiroshi, could you please send your input file over the list?
>
> Hm, as I see now, Thomas already accomplished that (how?).

Where is the pröbļém with Unicode on Linux 蜷 ?

Thomas
May 17 2005
>> Hm, as I see now, Thomas already accomplished that (how?).
>
> Where is the pröbļém with Unicode on Linux 蜷 ?

Yes, put salt on the open wound! :-P

I don't have problems with Unicode, but I don't know a program/method to insert arbitrary Unicode characters into text... Thus I can only insert the characters that are on my keyboard ( ナやぎツカナァ竊絶凪津クテセツィテヲテ淌ート打桔トクナ etc.)

uwe
May 17 2005
Uwe Salomon wrote:
>>> Hm, as I see now, Thomas already accomplished that (how?).
>>
>> Where is the pröbļém with Unicode on Linux 蜷 ?
>
> Yes, put salt on the open wound! :-P
> I don't have problems with Unicode, but I don't know a program/method to insert arbitrary Unicode characters into text... Thus I can only insert the characters that are on my keyboard ( ナやぎツカナァ竊絶凪津クテセツィテヲテ淌ート打桔トクナ etc.)

There are input modules for X11 and GTK that support quite a range of scripts. Last time I checked, Qt/KDE didn't have any way to add native input modules.

If you are desperate you might try http://yudit.org/ (X-based) or http://sourceforge.net/projects/jgim/ (Java based) to input "simple" languages.

Thomas
May 18 2005
Hiroshi Sakurai wrote on Tue, 17 May 2005 02:50:47 +0000 (UTC):
> Hi.
>
> This topic was written up on the 2ch BBS:
> http://pc8.2ch.net/test/read.cgi/tech/1109933426/567
> and in the Japanese D language wiki bug tracker:
> http://f17.aaa.livedoor.jp/~labamba/?BugTrack%2F13
>
> Illegal non-ASCII WYSIWYG string.
> Version: dmd 0.123
>
> /* code page utf8 */
> private import std.stream;
>
> void main()
> {
>     // valid
>     char[] str = "ワロスｗ";
>     stdout.writeString(str);
>     // valid output : E3 83 AF E3 83 AD E3 82 B9 EF BD 97
>
>     // invalid
>     char[] str2 = r"ワロスｗ"; // or char[] str = `ワロスｗ`;
>     stdout.writeString(str2);
>     // invalid output : E3 E3 E3 EF
>
>     return;
> }

Added to DStress as
http://dstress.kuehne.cn/run/u/unicode_08_A.d
http://dstress.kuehne.cn/run/u/unicode_08_B.d
http://dstress.kuehne.cn/run/u/unicode_08_C.d
http://dstress.kuehne.cn/run/u/unicode_08_D.d

> Sorry, my English is very poor. OTL

I could understand your message, thus your English can't be that bad ;)

Thomas
May 17 2005