digitalmars.D - Error: invalid UTF-8 sequence
- jicman (21/21) Feb 21 2005 Greetings!
- Lars Ivar Igesund (6/37) Feb 21 2005 DMD don't "understand" non-ASCII chars unless the source file is stored
- Anders F Björklund (7/23) Feb 21 2005 Works For Me:
- jicman (4/6) Feb 21 2005 Ok, this is interesting Windows at its best! I have to completely retyp...
- Regan Heath (5/12) Feb 21 2005 What? Why? Can't you open it, then do a save-as, or copy/paste into
- jicman (9/14) Feb 21 2005 :-) I know exactly how you said that> :-)
- Regan Heath (14/31) Feb 21 2005 :)
- Lars Ivar Igesund (6/23) Feb 22 2005 Here is my UTF-part of _vimrc:
- jicman (2/6) Feb 22 2005 thanks. I didn't have that.
- Charles Hixson (23/48) Feb 23 2005 Well, Kate does, but that's Linux (KDE) only.
- Regan Heath (21/42) Feb 21 2005 The 50 cents answer is, ensure your editor is saving the source file as ...
- Anders F Björklund (12/22) Feb 21 2005 This source code will "work", even in ISO-8859-*...
- Lars Ivar Igesund (5/12) Feb 21 2005 Yep, the 65001 cp is the one for UTF-8. In addition, the console font
Greetings!

Admire this complex piece of code: :-)

  import std.stdio;
  int main(char[][] args)
  {
    printf("josé" ~ "\n");
    writefln("josé");
    return (0);
  }

When I try to compile it, I get:

  16:21:30.97>dmd name.d
  name.d(4): invalid UTF-8 sequence
  name.d(5): invalid UTF-8 sequence

The 50 cents question is, how can I get rid of it?

The real reason I ask is that I am downloading a bunch of XML, and some of the names are accented in different languages. I am getting this error when I try to print (writefln) a variable with an accented name. However, an interesting outcome is that when I use printf, the above problem is not encountered. HUH!

Thanks much!

josé :-)
Feb 21 2005
DMD doesn't "understand" non-ASCII chars unless the source file is stored as UTF-8. Either there is a config setting in your editor that lets you do that, or you should change editor ASAP :) Note that converting non-UTF-8 files to UTF-8 might produce artifacts.

Lars Ivar Igesund

jicman wrote:
> when I try to compile it, I get,
>   16:21:30.97>dmd name.d
>   name.d(4): invalid UTF-8 sequence
>   name.d(5): invalid UTF-8 sequence
> The 50 cents question is, how can I get rid of it?
Feb 21 2005
jicman wrote:
> admire this complex piece of code: :-)
>   import std.stdio;
>   int main(char[][] args)
>   {
>     printf("josé" ~ "\n");
>     writefln("josé");
>     return (0);
>   }
> when I try to compile it, I get,
>   16:21:30.97>dmd name.d
>   name.d(4): invalid UTF-8 sequence
>   name.d(5): invalid UTF-8 sequence

Works For Me:

  josé
  josé

> The 50 cents question is, how can I get rid of it?

Save your file as UTF-8, and use a UTF-8 console... D *only* supports Unicode, not any legacy encodings.

--anders
Feb 21 2005
Anders F Björklund says...
>> The 50 cents question is, how can I get rid of it?
> Save your file as UTF-8, and use an UTF-8 console...

Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing. Ok, thanks.
Feb 21 2005
On Mon, 21 Feb 2005 22:27:04 +0000 (UTC), jicman <jicman_member pathlink.com> wrote:
> Anders F Björklund says...
>>> The 50 cents question is, how can I get rid of it?
>> Save your file as UTF-8, and use an UTF-8 console...
> Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing.

What? Why? Can't you open it, then do a save-as, or copy/paste into another editor then do a save-as?

Regan
Feb 21 2005
In article <opsmkl5qje23k2f5 ally>, Regan Heath says...
>> Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing.
> What? Why? Can't you open it, then do a save-as, or copy/paste into another editor then do a save-as?

:-) I know exactly how you said that. :-)

Yes, I tried that. I even opened the same program with Notepad (that's as Windows as Windows can get) and tried to compile it and got the same error. Somehow, my dual keyboard system does not like those accented vowels. I am now searching for a new editor. I love vim, but this is going too far.

Which freeware editors have D syntax highlighting? I am downloading one called Zeus that a D lover had on his page.

thanks.
Feb 21 2005
On Mon, 21 Feb 2005 23:39:37 +0000 (UTC), jicman <jicman_member pathlink.com> wrote:
> In article <opsmkl5qje23k2f5 ally>, Regan Heath says...
>>> Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing.
>> What? Why? Can't you open it, then do a save-as, or copy/paste into another editor then do a save-as?
> :-) I know exactly how you said that. :-)

:)

> Yes, I tried that. I even opened the same program with Notepad (that's as Windows as Windows can get) and tried to compile it and got the same error. Somehow, my dual keyboard system does not like those accented vowels. I am now searching for a new editor. I love vim, but this is going too far.

I have Windows XP SP2, and...

NotePad will save as:
  Unicode
  Unicode Big Endian
  UTF-8
(see "encoding" drop down in save-as dialog)

WordPad will save as a "unicode document". I'm guessing that means UTF-16, hopefully with a BOM. (see "save as type" drop down in save-as dialog)

> Which freeware editors have D syntax highlighting? I am downloading one called Zeus that a D lover had on his page.

Try: http://www.prowiki.org/wiki4d/wiki.cgi?EditorSupport

Regan
Feb 21 2005
jicman wrote:
> In article <opsmkl5qje23k2f5 ally>, Regan Heath says...
>>> Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing.
>> What? Why? Can't you open it, then do a save-as, or copy/paste into another editor then do a save-as?
> :-) I know exactly how you said that. :-)
> Yes, I tried that. I even opened the same program with Notepad (that's as Windows as Windows can get) and tried to compile it and got the same error. Somehow, my dual keyboard system does not like those accented vowels. I am now searching for a new editor. I love vim, but this is going too far.

Here is my UTF-part of _vimrc:

  set bomb
  set ff=unix
  set enc=utf-8 fileencodings=

Lars Ivar Igesund
Feb 22 2005
In article <cvft8k$s5q$1 digitaldaemon.com>, Lars Ivar Igesund says...
> Here is my UTF-part of _vimrc:
>   set bomb
>   set ff=unix
>   set enc=utf-8 fileencodings=

thanks. I didn't have that.
Feb 22 2005
jicman wrote:
> In article <opsmkl5qje23k2f5 ally>, Regan Heath says...
>>> Ok, this is interesting. Windows at its best! I have to completely retype that whole program! :-) Not a good thing.
>> What? Why? Can't you open it, then do a save-as, or copy/paste into another editor then do a save-as?
> :-) I know exactly how you said that. :-)
> Yes, I tried that. I even opened the same program with Notepad (that's as Windows as Windows can get) and tried to compile it and got the same error. Somehow, my dual keyboard system does not like those accented vowels. I am now searching for a new editor. I love vim, but this is going too far.
> Which freeware editors have D syntax highlighting? I am downloading one called Zeus that a D lover had on his page. thanks.

Well, Kate does, but that's Linux (KDE) only. And with NEdit you can make one, but that's X Window only. Then there's KEdit, but that's the same story as Kate.

You could look up MultiEdit. That's what I used for odd languages when I was on MSWind. Again, it's a "define your own language" kind of thing. I seem to remember hearing of others, but since they were MSWind only, I ignored them. And again, you would need to define your own language.

I hear that there's a version of KDE for MSWind now, but that seems like an awful lot of work to go to for an editor, and besides, I don't know how well it works. (It's still in the very early days.)

What I did when I started finding MSWind too much of a bother was to get a second disk, and run Linux from that. OTOH, if you don't need to boot frequently you could get a MEPIS CD (or Knoppix) and boot from that. I'm pretty sure that MEPIS will let you save your files into files on a MSWind partition. (Not certain, though, so a floppy might be needed for certainty.) Still, that's mainly a demo disk. Booting from a CD is SLOW, and again, every time you need to load something that isn't already in RAM everything turns into molasses.
Feb 23 2005
On Mon, 21 Feb 2005 21:23:32 +0000 (UTC), jicman <jicman_member pathlink.com> wrote:
> Greetings! admire this complex piece of code: :-)
>   import std.stdio;
>   int main(char[][] args)
>   {
>     printf("josé" ~ "\n");
>     writefln("josé");
>     return (0);
>   }
> when I try to compile it, I get,
>   16:21:30.97>dmd name.d
>   name.d(4): invalid UTF-8 sequence
>   name.d(5): invalid UTF-8 sequence
> The 50 cents question is, how can I get rid of it?

The 50 cents answer is, ensure your editor is saving the source file as UTF-8, UTF-16 (with a BOM) or UTF-32 (also with a BOM).

> The real reason I ask is that I am downloading a bunch of XML, and some of the names are accented in different languages. I am getting this error when I try to print (writefln) a variable with an accented name. However, an interesting outcome is that when I use printf, the above problem is not encountered. HUH!

This is a somewhat complex area, and I'm not sure I have it 100% sorted myself, but I'll give this a go. I _know_ someone will set us both straight if I have it wrong.

Things to consider/know:
- D source files must be saved in a UTF encoding.
- On Windows your console _might_ be in UTF, or it might be in something else, e.g. Latin-1.
- printf is a C function, it is oblivious to UTF etc.
- writef is a D function, it ensures you're writing in UTF.

So, what I suspect is happening to you is either:
1. You're reading these names from something which is not in UTF format.
2. Your source is not in UTF format.

You might see odd results once you get it working. This will be due to your console not being in UTF mode. I don't know how to change console modes, someone else will have to chip in here.

Regan
Feb 21 2005
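The difference between a Latin-1 "josé" and a UTF-8 "josé" can be checked byte by byte. The following is a minimal sketch, assuming a current D compiler with Phobos (std.utf.validate and UTFException) rather than the 2005-era toolchain used in this thread. isValidUtf8 is a hypothetical helper name, not something from the posts. It shows that the Latin-1 bytes are not well-formed UTF-8, which is the kind of input that gets rejected as an "invalid UTF-8 sequence".

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Hypothetical helper: true when `bytes` is a well-formed UTF-8 sequence.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // std.utf.validate throws UTFException on malformed input.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // "josé" as Latin-1 (ISO-8859-1): é is the single byte 0xE9.
    immutable ubyte[] latin1 = [0x6A, 0x6F, 0x73, 0xE9];
    // "josé" as UTF-8: é is the two-byte sequence 0xC3 0xA9.
    immutable ubyte[] utf8 = [0x6A, 0x6F, 0x73, 0xC3, 0xA9];

    writeln("Latin-1 bytes valid UTF-8? ", isValidUtf8(latin1)); // false
    writeln("UTF-8 bytes valid UTF-8?   ", isValidUtf8(utf8));   // true
}
```

The same kind of check could be applied to names read from downloaded XML before they are handed to writefln, on the assumption that the data may arrive in a legacy encoding.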
Regan Heath wrote:
> - D source files must be saved in UTF encoding.

One simple such UTF encoding is (escaped) ASCII:

  import std.stdio;
  int main(char[][] args)
  {
    printf("jos\u00e9\n");
    writefln("jos\u00e9");
    return (0);
  }

This source code will "work", even in ISO-8859-*...

> - on windows your console _might_ be in UTF, it might be in something else i.e. latin-1

On Linux and other platforms, the console might also be in e.g. Latin-1. If you see something like "josÃ©", then D does not like your console...

Other languages, like C and Java for instance, support other encodings. But D only does Unicode, preferably in the form of the UTF-8 encoding.

On Linux and Mac OS X it is simple to set the console to UTF-8, and if someone could detail the steps needed on Windows that would be great? I've heard some rumors that the "chcp 65001" command works on Win 2K... (although you might also have to change the default font being used?)

--anders
Feb 21 2005
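To see why the escaped form sidesteps the editor problem, it helps to look at the bytes the compiler stores for the literal. This is a small sketch, assuming a current D compiler with Phobos (std.string.representation). utf8Bytes is a hypothetical helper name introduced only for illustration. The source text is pure ASCII, yet the string ends up holding the two-byte UTF-8 encoding of é.

```d
import std.stdio : writef, writeln;
import std.string : representation;

/// Hypothetical helper: the raw UTF-8 code units stored for `s`.
immutable(ubyte)[] utf8Bytes(string s)
{
    return s.representation;
}

void main()
{
    // The source here is plain ASCII; \u00e9 names the code point U+00E9 (é).
    string name = "jos\u00e9";

    // Four characters, but five UTF-8 code units, because é takes two bytes.
    writeln("length in code units: ", name.length);

    // Print each stored byte in hex.
    foreach (b; utf8Bytes(name))
        writef("%02X ", b);
    writeln();
}
```

Because the file itself contains only ASCII, it compiles the same way whether the editor saved it as ISO-8859-1 or UTF-8. What the console then displays is a separate question.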
Anders F Björklund wrote:
> On Linux and Mac OS X it is simple to set the console to UTF-8, and if someone could detail the steps needed on Windows that would be great? I've heard some rumors that the "chcp 65001" command works on Win 2K... (although you might also have to change the default font being used?)
>
> --anders

Yep, the 65001 cp is the one for UTF-8. In addition, the console font must support Unicode. AFAIK, none of the raster fonts work, which leaves Lucida Console as the only feasible alternative on my comp.

Lars Ivar Igesund
Feb 21 2005
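The code page switch discussed above can also be requested from inside a program instead of typing "chcp 65001" first. What follows is only a sketch under stated assumptions: a current D compiler whose druntime provides the Win32 binding SetConsoleOutputCP in core.sys.windows.wincon, and a hypothetical helper named useUtf8Console that does not come from the posts. It does not change the console font, so the Lucida Console caveat still applies.

```d
import std.stdio : writeln;

version (Windows)
{
    // Win32 console API binding shipped with druntime (assumed available).
    import core.sys.windows.wincon : SetConsoleOutputCP;
}

/// Hypothetical helper: asks the console to interpret output as UTF-8.
/// Returns true on success; on non-Windows systems it does nothing
/// and returns true.
bool useUtf8Console()
{
    version (Windows)
    {
        // 65001 is the Windows code page identifier for UTF-8.
        return SetConsoleOutputCP(65001) != 0;
    }
    else
    {
        return true;
    }
}

void main()
{
    if (!useUtf8Console())
        writeln("could not switch the console to code page 65001");

    // Escaped so that the source file stays pure ASCII.
    writeln("jos\u00e9");
}
```

Whether the accented character then renders correctly still depends on the console font and the Windows version, as noted in the posts above.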