digitalmars.D - using encodings other than UTF-8 - part 2
- Piotr Dworaczyk (28/28) Oct 06 2007 Hi,
- Marcin Kuszczak (21/50) Oct 06 2007 I use something like this for converting from other encodings:
- Piotr Dworaczyk (5/17) Oct 06 2007 Welcome to The Club :)
- Jay Norwood (3/27) Oct 07 2007 The fox tools project had a big effort to add a bunch of text codecs ove...
Hi,

is there any way to process non-ASCII characters in encodings other than UTF-8? I asked a similar question some time ago (http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=54417) about Polish national characters, but haven't found any examples since.

[...] language. As a hobby programmer and CS teacher I have already thought about using it as a teaching tool, but, please understand, the character encoding issues are a no-go.

To show the significance of the problem: besides UTF-8 (which still isn't very popular), there are two major encodings of the Polish national characters: windows-1250 (cp-1250) and iso-8859-2.

I already thought about running a conversion to UTF-8 before the D program's launch, and a conversion from UTF-8 back to the appropriate encoding afterwards, but that makes little sense.

So is there a way, or could this be added in future versions of the standard library?

Thanks for your answers,
Piotr Dworaczyk
Oct 06 2007
Piotr Dworaczyk wrote:
> Hi, is there any way to process non-ASCII characters in encodings other than UTF-8?
> [...]
> So is there a way, or could there be a possibility to add it in future versions of the standard library?

I use something like this for converting from other encodings:

    import std.file, std.windows.charset;

    char[] readFile(char[] name) {
        // fromMBSz needs a zero-terminated string
        char[] file = cast(char[]) std.file.read(name) ~ '\0';
        // 1250 = Polish Windows code page; the result is UTF-8
        return std.windows.charset.fromMBSz(cast(char*) file, 1250);
    }

and similarly for writing:

    std.windows.charset.toMBSz(content, 1250); // 1250 - Polish Windows code page

It seems to work on Windows (and only on Windows). But I have to agree that support for code pages should be much better (e.g. easy detection of the current code page)...

For docs, see: http://www.digitalmars.com/d/phobos/std_windows_charset.html

--
Regards
Marcin Kuszczak (Aarti_pl)
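For a route that does not depend on the Windows API, a table-driven decoder is one option. The sketch below is only an illustration, not part of Phobos: it assumes a current D2 compiler (where `string` is immutable UTF-8), the function names `latin2PolishToUnicode` and `latin2PolishToUtf8` are made up for this example, and the table covers only ASCII plus the 18 Polish letters of iso-8859-2. Every other byte is mapped to U+FFFD.

```d
import std.stdio : writeln;

/// Maps one ISO-8859-2 byte to a Unicode code point.
/// Assumption: only ASCII and the 18 Polish letters are covered;
/// every other byte is reported as U+FFFD (replacement character).
dchar latin2PolishToUnicode(ubyte b)
{
    if (b < 0x80)
        return cast(dchar) b; // ASCII is identical in both encodings

    switch (b)
    {
        case 0xA1: return '\u0104'; // Ą
        case 0xB1: return '\u0105'; // ą
        case 0xC6: return '\u0106'; // Ć
        case 0xE6: return '\u0107'; // ć
        case 0xCA: return '\u0118'; // Ę
        case 0xEA: return '\u0119'; // ę
        case 0xA3: return '\u0141'; // Ł
        case 0xB3: return '\u0142'; // ł
        case 0xD1: return '\u0143'; // Ń
        case 0xF1: return '\u0144'; // ń
        case 0xD3: return '\u00D3'; // Ó
        case 0xF3: return '\u00F3'; // ó
        case 0xA6: return '\u015A'; // Ś
        case 0xB6: return '\u015B'; // ś
        case 0xAC: return '\u0179'; // Ź
        case 0xBC: return '\u017A'; // ź
        case 0xAF: return '\u017B'; // Ż
        case 0xBF: return '\u017C'; // ż
        default:   return '\uFFFD'; // not in this partial table
    }
}

/// Decodes a buffer of ISO-8859-2 bytes into a UTF-8 string.
string latin2PolishToUtf8(const(ubyte)[] input)
{
    string result;
    foreach (b; input)
        result ~= latin2PolishToUnicode(b); // appending a dchar encodes it as UTF-8
    return result;
}

void main()
{
    // The word "Łódź" as ISO-8859-2 bytes
    ubyte[] raw = [0xA3, 0xF3, 0x64, 0xBC];
    string text = latin2PolishToUtf8(raw);
    assert(text == "\u0141\u00F3d\u017A");
    writeln(text);
}
```

A windows-1250 variant would need a different table in six positions: there Ą, ą, Ś, ś, Ź and ź sit at 0xA5, 0xB9, 0x8C, 0x9C, 0x8F and 0x9F, while the other twelve Polish letters share their byte values with iso-8859-2.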
Oct 06 2007
> I use something like this for converting from other encodings:
>
>     char[] readFile(char[] name) {
>         char[] file = cast(char[])std.file.read(name) ~ '\0';
>         file = std.windows.charset.fromMBSz(cast(char*)file, 1250);
>         return file;
>     }
>
> and similarly for writing:
>
>     std.windows.charset.toMBSz(content, 1250); //1250 - polish windows codepage

Thanks / Dzieki / for the code.

> But I have to agree that support for codepages should be much better (e.g. easy detecting current codepage)...

Welcome to The Club :)
Oct 06 2007
Piotr Dworaczyk wrote:
>> But I have to agree that support for codepages should be much better (e.g. easy detecting current codepage)...
> Welcome to The Club :)

The FOX Toolkit project has put a lot of effort into adding a set of text codecs over the last couple of years. Perhaps porting that library to D would supply the support you want.

http://www.fox-toolkit.org/fox.html
Oct 07 2007