digitalmars.D - Phobos uni methods
- Andrew (10/10) Aug 22 2016 Hi,
- Cauterite (5/5) Aug 22 2016 On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
- Lodovico Giaretta (11/17) Aug 22 2016 Well, the Unicode consortium is famous for having backward
- Andrew (6/8) Aug 27 2016 Is there a tool somewhere that parses the UnicodeData.txt and
- Dmitry Olshansky (5/13) Aug 28 2016 An awful oversight. The tool still sits in its GSOC 2012 repo:
- Jack Stouffer (2/12) Aug 22 2016 Please make a bug report on issues.dlang.org
Hi,

It appears as though the Phobos Unicode methods (such as std.uni.isAlpha()) are compatible with Unicode 6.2 (i.e. the 2012 standard, which is now nearly 5 years old). Unicode is now up to version 9.0. Changes do include changes to std.uni.isAlpha(), and other methods.

Is there either an updated version of std.uni, or are there plans to update it?

Thanks
Andrew.
Aug 22 2016
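For readers who want to see what this looks like in practice, here is a minimal sketch that probes std.uni.isAlpha() with one long-established letter and one letter first encoded in Unicode 9.0 (U+1E900, from the Adlam block). The helper name `describe` is made up for this illustration. No expected output is given for the second code point, since the result depends on which Unicode tables your Phobos build ships with.

```d
import std.format : format;
import std.stdio : writeln;
import std.uni : isAlpha;

/// Hypothetical helper: formats the result of std.uni.isAlpha for one code point.
string describe(dchar c)
{
    return format("U+%05X isAlpha: %s", cast(uint) c, isAlpha(c));
}

void main()
{
    // U+0041 LATIN CAPITAL LETTER A: alphabetic in every Unicode version.
    writeln(describe('\u0041'));
    // U+1E900 ADLAM CAPITAL LETTER ALIF: first encoded in Unicode 9.0,
    // so the answer depends on the tables built into your Phobos.
    writeln(describe('\U0001E900'));
}
```

If the second line reports false, the installed std.uni tables predate Unicode 9.0.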
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:

Note that changing isAlpha() can potentially break any D code with Unicode in its identifiers, because the DMD frontend uses isAlpha() to determine which characters are allowed in identifiers.
Aug 22 2016
On Monday, 22 August 2016 at 10:26:35 UTC, Cauterite wrote:
> On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
> Note that changing isAlpha() can potentially break any D code with unicode in its identifiers, because the DMD frontend uses isAlpha() to determine which characters are allowed in identifiers.

Well, the Unicode consortium is famous for having backward compatibility as a priority (in fact the Unicode standard has many strange things that are conceptually wrong but are needed to maintain compatibility). So, updating the std.uni methods should not break anything, but at most allow more inputs to be accepted.

So I think that the possibility of updating std.uni should be taken into account and further investigated, to see if it's doable.

By the way, the core team is very busy so if Andrew (the OP) wants to make a PR himself, it would be welcome.
Aug 22 2016
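The "at most allow more inputs" argument amounts to saying that the old set of accepted code points is a subset of the new one. A rough way to test that claim, assuming you had both versions of a table available as std.uni CodepointSet values, is to check that subtracting the new set from the old one leaves nothing. The function name `onlyGrows` and the two sample sets below are hypothetical stand-ins, not real Unicode 6.2 or 9.0 data.

```d
import std.uni : CodepointSet;

/// Hypothetical check: true if every code point in `older` is also in `newer`.
bool onlyGrows(CodepointSet older, CodepointSet newer)
{
    // Set subtraction leaves the code points that were dropped.
    return (older - newer).empty;
}

void main()
{
    // Stand-ins for two versions of an "alphabetic" table (not real data).
    auto v1 = CodepointSet('A', 'Z' + 1, 'a', 'z' + 1);
    auto v2 = v1 | CodepointSet(0x1E900, 0x1E944); // adds a later range

    assert(onlyGrows(v1, v2));  // the update only added code points
    assert(!onlyGrows(v2, v1)); // the reverse direction removes some
}
```

Whether the real tables satisfy this check is exactly what would need to be investigated before an update.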
On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta wrote:
> By the way, the core team is very busy so if Andrew (the OP) wants to make a PR himself, it would be welcome.

Is there a tool somewhere that parses the UnicodeData.txt and PropList.txt and generates all the tries? I took a quick look but didn't see one alongside the std.uni source code.

Andrew.
Aug 27 2016
On 8/27/16 9:40 AM, Andrew wrote:
> On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta wrote:
>> By the way, the core team is very busy so if Andrew (the OP) wants to make a PR himself, it would be welcome.
> Is there a tool somewhere that parses the UnicodeData.txt and PropList.txt and generates all the tries? I took a quick look but didn't see one alongside the std.uni source code.
> Andrew.

An awful oversight. The tool still sits in its GSOC 2012 repo:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.d

And the script to run it:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.sh
Aug 28 2016
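To give a feel for the parsing half of such a tool, here is a small sketch that reads text in the UnicodeData.txt layout (semicolon-separated fields, code point in field 0, General_Category in field 2) and collects the letters into a CodepointSet. This is not how gen_uni.d works. The function name `lettersFrom` is invented, the First/Last range entries of the real file are ignored, no tries are generated, and "General_Category starts with L" is only an approximation of the Alphabetic property.

```d
import std.algorithm.searching : startsWith;
import std.array : split;
import std.conv : to;
import std.string : lineSplitter;
import std.uni : CodepointSet;

/// Hypothetical helper: collects code points whose General_Category
/// (field 2) starts with 'L'. Assumes one code point per line.
CodepointSet lettersFrom(string unicodeData)
{
    CodepointSet set;
    foreach (line; unicodeData.lineSplitter)
    {
        auto fields = line.split(";");
        if (fields.length < 3)
            continue; // skip blank or malformed lines
        if (fields[2].startsWith("L"))
            set |= cast(dchar) fields[0].to!uint(16); // field 0 is hex
    }
    return set;
}

void main()
{
    // Two lines in the UnicodeData.txt field layout.
    enum sample = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n"
                ~ "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;";
    auto letters = lettersFrom(sample);
    assert(letters['A']);
    assert(!letters['1']);
}
```

A real generator would also need PropList.txt and the derived property files, plus the step that turns the resulting sets into the packed tries std.uni uses.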
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
> Hi,
>
> It appears as though the Phobos Unicode methods (such as std.uni.isAlpha()) are compatible with Unicode 6.2 (i.e. the 2012 standard, which is now nearly 5 years old). Unicode is now up to version 9.0. Changes do include changes to std.uni.isAlpha(), and other methods.
>
> Is there either an updated version of std.uni, or are there plans to update it?
>
> Thanks
> Andrew.

Please make a bug report on issues.dlang.org
Aug 22 2016