digitalmars.D - Chinese characters in a string(GDC)
- ryuka (9/9) May 15 2007 Hey, I'm a Chinese D user. My D compiler (gdc) can't work well when ...
- Aziz K. (5/5) May 15 2007 Hello ryuka,
- ryuka (4/10) May 15 2007 Thank you. My editor is Code::Blocks, and I found the code page setting...
- Roberto Mariottini (12/13) May 15 2007 The problem is that the current D console API doesn't translate from
- Carlos Santander (8/30) May 15 2007 That's Windows-only. On Linux and Mac OS X, where the consoles use UTF-8...
- Aziz K. (26/31) May 15 2007 It doesn't need to convert to the codepage set in the console. There is ...
- Aziz K. (3/3) May 15 2007 You can review my command-line parser now, if you like. I committed it a...
- Julio César Carrascal Urquijo (3/12) May 15 2007 You should save your source code as either UTF-8, UTF-16 or UTF-32 (with...
Hey, I'm a Chinese D user. My D compiler (gdc) doesn't work well with strings that contain Chinese characters. When I type "您好" or "您好"w in my source code as a wchar[], gdc displays the following compile errors:

hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
:: === Build finished: 4 errors, 0 warnings ===

I don't know if dmd has the same problem, but in gdc I can't find a way to type Chinese characters into a wchar[]. I don't want to use hex values, so I came here for some help. Thanks.
May 15 2007
Hello ryuka, it looks to me like your editor doesn't save the source file as Unicode. I'm pretty sure it saves the file in a code page; otherwise the DMD front end (which is the same for dmd and gdc) wouldn't complain.
May 15 2007
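The diagnosis above can be checked at the byte level. The following is a minimal sketch in Python (used here only because it makes the bytes easy to inspect; it is not part of the thread). It assumes the literal was 您好 and that the editor saved it in code page 936 (GBK). The helper name `is_valid_utf8` is made up for this illustration.

```python
# Assumption: the literal was 您好, saved by the editor as code page 936 (GBK).
text = "您好"
gbk_bytes = text.encode("gbk")

# The four bytes the compiler actually sees in the source file.
print(gbk_bytes.hex(" "))  # c4 fa ba c3


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(gbk_bytes))             # False: not acceptable as UTF-8 source
print(is_valid_utf8(text.encode("utf-8")))  # True: what a UTF-8 lexer expects

# Read as Latin-1, the same GBK bytes turn into typical mojibake.
print(gbk_bytes.decode("latin-1") == "ÄúºÃ")  # True
```

None of the four bytes starts a valid UTF-8 sequence in this position, which would be consistent with the four "invalid UTF-8 sequence" errors reported for the line.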
Aziz K. wrote:
Hello ryuka, it looks to me like your editor doesn't save the source file as Unicode. I'm pretty sure it saves the file in a code page; otherwise the DMD front end (which is the same for dmd and gdc) wouldn't complain.

Thank you. My editor is Code::Blocks, and I found the code page setting: it is windows-936. g++, however, works fine with this code page when using Chinese characters. Anyway, I tried changing the code page to others. When the D source is stored in some of those other code pages, the compiler works (no compile errors), but my application displays strange characters that aren't the ones I typed in the source. g++ programs display the same strange characters when using these code pages. So there is still a problem with using Chinese on my OS. Thank you.
May 15 2007
ryuka wrote:
[...] Anyway, I tried changing the code page to others. When the D source is stored in some of those other code pages, the compiler works (no compile errors), but my application displays strange characters that aren't the ones I typed in the source. g++ programs display the same strange characters when using these code pages. So there is still a problem with using Chinese on my OS.

The problem is that the current D console API doesn't translate from the internal Unicode representation to the code page currently selected in your console. So even if you write your source as UTF-something (so the compiler can understand it), when you call writef (or printf or the like) it will not translate those UTF sequences back to the console code page, and you'll get messed-up characters on the screen. This makes D console programs currently unusable for any language other than English.

Ciao
May 15 2007
Roberto Mariottini wrote:
ryuka wrote:
[...] Anyway, I tried changing the code page to others. When the D source is stored in some of those other code pages, the compiler works (no compile errors), but my application displays strange characters that aren't the ones I typed in the source. g++ programs display the same strange characters when using these code pages. So there is still a problem with using Chinese on my OS.

The problem is that the current D console API doesn't translate from the internal Unicode representation to the code page currently selected in your console. So even if you write your source as UTF-something (so the compiler can understand it), when you call writef (or printf or the like) it will not translate those UTF sequences back to the console code page, and you'll get messed-up characters on the screen. This makes D console programs currently unusable for any language other than English.

Ciao

That's Windows-only. On Linux and Mac OS X, where the consoles use UTF-8, such characters show up correctly.

ryuka, you can use UTF-8 on Windows by using the Lucida Console font and doing "chcp 65001" (IIRC). Or you can convert your text to the local code page before sending it to the console.

--
Carlos Santander Bernal
May 15 2007
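The mismatch Roberto describes, and the "convert to the local code page first" workaround Carlos mentions, can be sketched as follows. This is an illustration in Python, not D, and not anything Phobos provides; `to_console_bytes` is a hypothetical helper, and cp936 is assumed to be the console code page because that is what ryuka reported for his editor.

```python
def to_console_bytes(text: str, codepage: str) -> bytes:
    """Encode text for a console that uses the given code page (hypothetical helper).

    Characters the code page cannot represent are replaced with '?'.
    """
    return text.encode(codepage, errors="replace")


text = "您好"

# What a program emits if it writes its UTF-8 string unconverted.
utf8_bytes = text.encode("utf-8")

# What a cp936 console makes of those bytes: not the original text.
misread = utf8_bytes.decode("cp936", errors="replace")
print(misread == text)  # False

# Converting first yields bytes that the cp936 console decodes correctly.
converted = to_console_bytes(text, "cp936")
print(converted.decode("cp936") == text)  # True

# A code page without Chinese characters can only substitute.
print(to_console_bytes(text, "cp1252"))  # b'??'
```

The last line shows the limit of this approach: conversion only helps when the console's code page actually contains the characters being printed.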
Roberto Mariottini wrote:
The problem is that the current D console API doesn't translate from the internal Unicode representation to the code page currently selected in your console.

It doesn't need to convert to the code page set in the console. There is a neat function called WriteConsoleW() which can print any Unicode character to the console regardless of the current code page.

Other than that, there is the issue with the command line, because the arguments aren't passed to the main function as Unicode. To remedy that, I've written a function that gets the command line with GetCommandLineW() and parses it into an array of wchar[]s (applying the weird escaping rules cmd.exe uses). I've only used calloc and realloc, so the function can be used before the garbage collector has been initialized. I'll submit the function to Bugzilla when I've finished it. At first I'd like to see it in Phobos for a while, just in case some bugs crop up; if it stands the test of time, Walter could move it to dmain2.d so that every D application has proper support for Unicode command-line arguments by default :-)

Apart from these problems, another one comes to mind: the font set in the console has to support the code points you want to print, otherwise you will only get small boxes. I don't know how to switch to a TrueType font other than Lucida Console (the dialog restricts you to only two fonts), but as far as I can remember there is a guide out there explaining how to do it.

PS: Go to http://openquran.googlecode.com/svn/trunk/src/main.d if you want to see how I'm using WriteConsoleW in the Windows version of my program.

Roberto Mariottini wrote:
This makes D console programs currently unusable for any language other than English.

Yes, but fortunately not any longer.

Regards.
May 15 2007
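For readers who want to see the shape of a WriteConsoleW call, here is a rough sketch using Python's ctypes. It is not the code from main.d; the names `write_console` and `utf16_units` are invented for this example. It assumes the standard Win32 signatures of GetStdHandle and WriteConsoleW, and it falls back to ordinary stdout on other platforms or when the call fails (for instance when output is redirected to a file, since WriteConsoleW only works on a real console).

```python
import ctypes
import sys

STD_OUTPUT_HANDLE = -11  # Win32 constant for the standard output handle


def utf16_units(text: str) -> int:
    """Length of text in UTF-16 code units, the unit WriteConsoleW counts in."""
    return len(text.encode("utf-16-le")) // 2


def write_console(text: str) -> None:
    """Print text via WriteConsoleW on Windows, via plain stdout elsewhere."""
    if sys.platform != "win32":
        sys.stdout.write(text)
        return

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.WriteConsoleW.restype = wintypes.BOOL
    kernel32.WriteConsoleW.argtypes = [
        wintypes.HANDLE,                 # hConsoleOutput
        wintypes.LPCWSTR,                # lpBuffer (UTF-16 text)
        wintypes.DWORD,                  # nNumberOfCharsToWrite
        ctypes.POINTER(wintypes.DWORD),  # lpNumberOfCharsWritten
        wintypes.LPVOID,                 # lpReserved, must be NULL
    ]

    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    written = wintypes.DWORD(0)
    ok = kernel32.WriteConsoleW(
        handle, text, utf16_units(text), ctypes.byref(written), None
    )
    if not ok:
        # Not a console (e.g. redirected output): fall back to stdout.
        sys.stdout.write(text)
```

A call such as `write_console("您好\n")` would then bypass the console code page on Windows. As noted in the post, whether the characters render as glyphs or as boxes still depends on the console font.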
You can review my command-line parser now, if you like. I committed it a few minutes ago. http://openquran.googlecode.com/svn/trunk/src/CmdLine.d
May 15 2007
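As a rough idea of what such a parser has to do, here is a simplified sketch in Python. It is not a port of CmdLine.d. It assumes the backslash-and-quote conventions documented for the Microsoft C runtime (2n backslashes before a quote give n backslashes and a delimiting quote; 2n+1 give n backslashes and a literal quote), which may or may not match every rule the linked file implements. The function name `split_command_line` is made up, and special handling of the program name is left out.

```python
def split_command_line(cmdline: str) -> list:
    """Split a command line using simplified Microsoft C runtime quoting rules."""
    args = []
    current = []
    in_quotes = False
    have_arg = False  # lets an empty "" count as an argument
    i = 0
    n = len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Count the run of backslashes, then look at what follows it.
            start = i
            while i < n and cmdline[i] == "\\":
                i += 1
            count = i - start
            if i < n and cmdline[i] == '"':
                current.append("\\" * (count // 2))
                if count % 2 == 1:
                    # Odd run: the quote is literal and is consumed here.
                    current.append('"')
                    i += 1
                # Even run: the quote is a delimiter, handled next iteration.
            else:
                # Backslashes not followed by a quote are taken literally.
                current.append("\\" * count)
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


print(split_command_line('prog "a b" c'))  # ['prog', 'a b', 'c']
```

Because the input is already a Unicode string (as GetCommandLineW would supply), non-ASCII arguments such as 您好 pass through unchanged.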
ryuka wrote:
Hey, I'm a Chinese D user. My D compiler (gdc) doesn't work well with strings that contain Chinese characters. When I type "您好" or "您好"w in my source code as a wchar[], gdc displays the following compile errors:

hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
hello.d:33: invalid UTF-8 sequence
:: === Build finished: 4 errors, 0 warnings ===

I don't know if dmd has the same problem, but in gdc I can't find a way to type Chinese characters into a wchar[]. I don't want to use hex values, so I came here for some help. Thanks.

You should save your source code as either UTF-8, UTF-16 or UTF-32 (with signature).
May 15 2007
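If the editor cannot be persuaded to save as Unicode, an existing file can be re-encoded once. The sketch below, in Python, rewrites a code page 936 file as UTF-8 with a signature (byte order mark). The function name `reencode_source` and the file name hello.d are placeholders, and cp936 is assumed as the original encoding because that is what ryuka reported.

```python
import codecs
import os
import tempfile


def reencode_source(path: str, source_encoding: str = "cp936") -> None:
    """Rewrite a source file in place as UTF-8 with a signature (BOM)."""
    with open(path, "r", encoding=source_encoding, newline="") as f:
        text = f.read()
    # The "utf-8-sig" codec prepends the bytes EF BB BF when writing.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


# Demonstration on a throwaway file; hello.d is only a stand-in name.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.d")
    with open(path, "w", encoding="cp936", newline="") as f:
        f.write('wchar[] greeting = "您好"w;\n')

    reencode_source(path)

    with open(path, "rb") as f:
        data = f.read()

print(data.startswith(codecs.BOM_UTF8))  # True
print(data[3:].decode("utf-8") == 'wchar[] greeting = "您好"w;\n')  # True
```

Passing the wrong `source_encoding` would silently produce different characters, so it is worth opening the result in an editor before compiling.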