digitalmars.D.learn - Internationalization vs. Unicode
- Tyro[17] (14/14) Apr 26 2013 There are myriad encoding schemes. D natively supports Unicode and
- H. S. Teoh (12/26) Apr 26 2013 [...]
- Jacob Carlborg (7/19) Apr 27 2013 Would ICU do the work? If that's the case you can take a look at this:
- Tyro[17] (15/33) Apr 29 2013 This might work. Not sure yet. The first thing that caught my eyes is
- Jesse Phillips (3/5) Apr 29 2013 You'll find the ported Java source:
There are myriad encoding schemes. D natively supports Unicode and provides functionality for it via Phobos. Since ASCII is a subset of Unicode, a byproduct is that D also natively supports ASCII. This is a plus for the language, but what of the other encoding schemes? What library functionality is provided to manipulate them or to convert between them and Unicode?

I need to convert from CJK encodings (presently EUC-JP and Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is there a standalone library that does this? If so, can someone point me to it? If not, is such functionality planned for inclusion in Phobos, or am I doomed to resort to Java or some other language for this task (at least until I'm educated enough to do it myself)?

Thanks,
Andrew
Apr 26 2013
On Fri, Apr 26, 2013 at 06:09:48PM -0400, Tyro[17] wrote:
> I have a need to convert from CJK encodings (presently EUC-JP and
> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is
> there a standalone library that does this? If so, can someone point me
> to it? If not, is there planned functionality for inclusion in Phobos
> or am I doomed to resorting to Java or some other language to
> accomplish this task (or at least until I'm educated enough to do it
> myself)?
[...]

If you're using a Posix system, you could look into the 'recode' utility to convert from those legacy formats to Unicode before running your program on them. You may be able to figure out how to do the conversion yourself by looking at recode's source code.

But AFAIK there is currently no way to do it in D. Maybe someone should invent std.recode and submit it for inclusion into Phobos. ;-)

T

-- 
People tell me that I'm paranoid, but they're just out to get me.
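For what it's worth, here is a minimal sketch of how such a conversion could be done from D on a Posix system by calling the C library's iconv functions directly. This is only an illustration, not an existing Phobos facility: the bindings are hand-written, the helper name toUtf8 is made up for this example, and it assumes the platform's iconv recognizes the names "SHIFT_JIS" and "EUC-JP" (glibc does; other implementations may spell them differently).

```d
import std.exception : enforce;
import std.string : toStringz;

// Hand-written bindings to the POSIX iconv API. Assumption: the C library
// on this platform provides these symbols.
extern (C) nothrow @nogc
{
    alias iconv_t = void*;
    iconv_t iconv_open(const(char)* tocode, const(char)* fromcode);
    size_t iconv(iconv_t cd, char** inbuf, size_t* inbytesleft,
                 char** outbuf, size_t* outbytesleft);
    int iconv_close(iconv_t cd);
}

/// Hypothetical helper: converts `input`, encoded as `fromCode`, to UTF-8.
string toUtf8(const(ubyte)[] input, string fromCode)
{
    auto cd = iconv_open("UTF-8", fromCode.toStringz);
    enforce(cd != cast(iconv_t) -1, "iconv does not know encoding " ~ fromCode);
    scope (exit) iconv_close(cd);

    // Generous output buffer: legacy Japanese encodings expand to at most
    // three UTF-8 bytes per input byte, so four per byte leaves headroom.
    auto output = new char[input.length * 4 + 4];

    // iconv advances these pointers and decrements the counters as it goes.
    char* inPtr = cast(char*) input.ptr;
    size_t inLeft = input.length;
    char* outPtr = output.ptr;
    size_t outLeft = output.length;

    auto rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
    enforce(rc != cast(size_t) -1, "invalid or incomplete input sequence");

    // Only the part of the buffer that iconv actually filled is returned.
    return cast(string) output[0 .. output.length - outLeft];
}

void main()
{
    import std.stdio : writeln;

    // The bytes 93 FA 96 7B are the Shift-JIS encoding of the two kanji 日本.
    immutable ubyte[] sjis = [0x93, 0xFA, 0x96, 0x7B];
    writeln(toUtf8(sjis, "SHIFT_JIS"));
}
```

On systems where iconv lives in a separate library rather than in libc, an extra linker flag such as -L-liconv is probably needed. The sketch converts a whole buffer in one call; a real tool would likely process input in chunks and handle a partial multibyte sequence at the end of a chunk.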
Apr 26 2013
On 2013-04-27 00:09, Tyro[17] wrote:
> I have a need to convert from CJK encodings (presently EUC-JP and
> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is
> there a standalone library that does this? If so, can someone point me
> to it?

Would ICU do the work? If that's the case you can take a look at this:

https://github.com/d-widget-toolkit/com.ibm.icu

It will most likely not compile with the latest version of DMD. Also I don't know how complete it is.

-- 
/Jacob Carlborg
Apr 27 2013
On 4/27/13 6:37 AM, Jacob Carlborg wrote:
> On 2013-04-27 00:09, Tyro[17] wrote:
>> I have a need to convert from CJK encodings (presently EUC-JP and
>> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is
>> there a standalone library that does this? If so, can someone point
>> me to it?
>
> Would ICU do the work? If that's the case you can take a look at this:
>
> https://github.com/d-widget-toolkit/com.ibm.icu
>
> It will most likely not compile with the latest version of DMD. Also I
> don't know how complete it is.

This might work. Not sure yet. The first thing that caught my eye is

    import java.lang.all;
    import java.math.BigInteger;
    import java.text.CharacterIterator;
    import java.text.ParsePosition;
    import java.util.Comparator;
    import java.util.Date;

and I was immediately confused. What? We can directly import and use Java in D? Let me try this... Oh! No! Not really! We can't.

Well, since D uses the file system to organize its modules, I should be able to find a java folder with these class signatures, or their D equivalents, somewhere in the project folder. No... I don't see one anywhere.

Looks like I will have to file ICU on my list of things to get educated about. For now I will continue to use the Java implementation I've got.

Thanks.
Apr 29 2013
On Monday, 29 April 2013 at 18:36:32 UTC, Tyro[17] wrote:
> This might work. Not sure yet. The first thing that caught my eye is

You'll find the ported Java source here:

https://github.com/d-widget-toolkit/base/tree/master/src
Apr 29 2013