digitalmars.D.learn - latin-1 encoding
- Simen Haugen (2/2) Jan 11 2007 I'm just starting to look at D, but I can't seem to find any encodings f...
- Johan Granberg (3/5) Jan 11 2007 What are you trying to do? It would be helpful to know if you want to r...
- Simen Haugen (2/5) Jan 11 2007 Reading and writing files.
- Johan Granberg (6/12) Jan 12 2007 There are no string manipulation functions in the standard library that wi...
- Frits van Bommel (34/40) Jan 12 2007 Now I'm no expert in character encodings, but isn't Latin-1 just the
- Frank Benoit (keinfarbton) (3/7) Jan 12 2007 you can try the mango project. It has a package called ICU, that does
I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...
Jan 11 2007
Simen Haugen wrote:
> I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...

What are you trying to do? It would be helpful to know if you want to read files in latin-1 or if you want your whole program to use it internally.
Jan 11 2007
"Johan Granberg" wrote:
> What are you trying to do? It would be helpful to know if you want to read files in latin-1 or if you want your whole program to use it internally.

Reading and writing files.
Jan 11 2007
Simen Haugen wrote:
> "Johan Granberg" wrote:
>> What are you trying to do? It would be helpful to know if you want to read files in latin-1 or if you want your whole program to use it internally.
> Reading and writing files.

There are no string manipulation functions in the standard library that will help you there, but you could read the files as usual and store the contents in a ubyte[] instead of a char[]. If you want to use string manipulation functions, the easiest approach would be to convert to UTF-8; there was some discussion of how to do that a couple of weeks ago.
Jan 12 2007
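The suggestion above (keep Latin-1 file contents in a ubyte[] rather than a char[]) could look roughly like the following. This is only a sketch, assuming a current Phobos where std.file.read returns void[]; the file name "latin1.txt" is hypothetical and the file is created just for the demonstration.

```d
import std.file : read, write, remove;

void main()
{
    // Hypothetical file name, used only for this sketch.
    enum fileName = "latin1.txt";

    // "blåbær" in Latin-1: exactly one byte per character.
    ubyte[] original = [0x62, 0x6C, 0xE5, 0x62, 0xE6, 0x72];
    write(fileName, original);

    // read() returns void[]; the cast reinterprets the buffer as raw
    // bytes, so no UTF validation is ever applied to the Latin-1 data.
    auto data = cast(ubyte[]) read(fileName);
    assert(data == original);

    remove(fileName); // clean up the demonstration file
}
```

Keeping the data as ubyte[] avoids handing invalid UTF-8 to functions that expect char[] to be well-formed.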
Simen Haugen wrote:
> "Johan Granberg" wrote:
>> What are you trying to do? It would be helpful to know if you want to read files in latin-1 or if you want your whole program to use it internally.
> Reading and writing files.

Now I'm no expert in character encodings, but isn't Latin-1 just the first 256 code points of Unicode, packed into a single byte per character? If so, it should be pretty trivial to convert Latin-1 characters to Unicode, either to wchar[]/dchar[] by direct one-to-one assignment (no multibyte sequences possible) or to char[] by using std.utf.encode, like this:
-----
// warning: incomplete, untested code
import std.utf;

ubyte[] data_lat1;
// ... fill data_lat1 array
char[] data_utf8;   // perhaps preallocate this to a reasonable length
foreach (c; data_lat1) {
    // each Latin-1 byte is the Unicode code point with the same value
    std.utf.encode(data_utf8, cast(dchar) c);
}
-----
And UTF to Latin-1 should be pretty easy too:
-----
// again: incomplete, untested code
import std.utf;

char[] data_utf;    // wchar[] and dchar[] should work as well
ubyte[] data_lat1;  // again, preallocate a reasonable array if you want
size_t i = 0;
while (i < data_utf.length) {
    dchar c = std.utf.decode(data_utf, i);  // advances i
    assert(c < 0x100);                      // make sure it fits
    data_lat1 ~= cast(ubyte) c;             // explicit narrowing to one byte
}
-----
I should note that by 'preallocate' I mean '"new" an array and set the length to 0'. Setting the length to 0 is important since otherwise your output will get appended to the end of a default-initialized array, which isn't what you want ;)
Jan 12 2007
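For reference, the two conversion loops from the post above can be wrapped into a small self-contained program. This is a minimal sketch, assuming a current Phobos; the helper names latin1ToUtf8 and utf8ToLatin1 are hypothetical and not part of any library, and the assert from the post is swapped for an exception so that the check also runs in release builds.

```d
import std.utf : decode, encode;

/// Hypothetical helper: converts Latin-1 bytes to UTF-8.
char[] latin1ToUtf8(const(ubyte)[] latin1)
{
    char[] utf8;
    utf8.reserve(latin1.length); // lower bound; non-ASCII bytes need 2 chars
    foreach (b; latin1)
        encode(utf8, cast(dchar) b); // byte value == Unicode code point
    return utf8;
}

/// Hypothetical helper: converts UTF-8 to Latin-1.
/// Throws if a code point does not fit into a single byte.
ubyte[] utf8ToLatin1(const(char)[] utf8)
{
    ubyte[] latin1;
    latin1.reserve(utf8.length);
    size_t i = 0;
    while (i < utf8.length)
    {
        dchar c = decode(utf8, i); // advances i past the decoded sequence
        if (c > 0xFF)
            throw new Exception("code point outside Latin-1");
        latin1 ~= cast(ubyte) c;
    }
    return latin1;
}

void main()
{
    // "blåbær" in Latin-1: one byte per character.
    ubyte[] original = [0x62, 0x6C, 0xE5, 0x62, 0xE6, 0x72];

    char[] utf8 = latin1ToUtf8(original);
    assert(utf8 == "blåbær");               // source file must be UTF-8
    assert(utf8.length == 8);               // å and æ take two bytes each
    assert(utf8ToLatin1(utf8) == original); // round trip is lossless
}
```

Whether to throw, assert, or substitute a replacement character such as '?' for code points above 0xFF is a design choice that depends on how lossy the output is allowed to be.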
Simen Haugen wrote:
> I'm just starting to look at D, but I can't seem to find any encodings for latin-1 in the standard library...

You can try the Mango project. It has a package called ICU that does conversions between various encodings and Unicode.
Jan 12 2007