
digitalmars.D - Phobos uni methods

reply Andrew <andrew trotman.com> writes:
Hi,

It appears as though the Phobos Unicode methods (such as 
std.uni.isAlpha()) are compatible with Unicode 6.2 (i.e. the 2012 
standard, which is now nearly 5 years old). Unicode is now up to 
version 9.0.  Changes do include changes to std.uni.isAlpha(), 
and other methods.

Is there either an updated version of std.uni, or are there plans 
to update it?

Thanks
Andrew.
Aug 22 2016
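
One way to check which Unicode version an installed Phobos tracks is to probe std.uni.isAlpha() with code points that were only assigned in later versions. The sketch below is a hedged illustration, not part of the original post. The sample code points are my own choice: U+A7AE and U+1E900 (an Adlam letter) were both added in Unicode 9.0, so a Phobos built from 6.2 tables would be expected to report them as non-alphabetic.

```d
import std.stdio : writefln;
import std.uni : isAlpha;

void main()
{
    // U+0041 is a long-standing letter; the other two were assigned in
    // Unicode 9.0 (sample code points chosen for illustration only).
    dchar[] samples = ['\u0041', '\uA7AE', '\U0001E900'];

    foreach (c; samples)
    {
        // The result for the newer code points depends on which Unicode
        // version the installed std.uni tables were generated from.
        writefln("U+%04X  isAlpha = %s", cast(uint) c, isAlpha(c));
    }
}
```

No expected output is shown because the answer for the two newer code points depends on the Phobos release in use.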
next sibling parent reply Cauterite <cauterite gmail.com> writes:
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:

Note that changing isAlpha() can potentially break any D code 
with Unicode in its identifiers, because the DMD frontend uses 
isAlpha() to determine which characters are allowed in 
identifiers.
Aug 22 2016
parent reply Lodovico Giaretta <lodovico giaretart.net> writes:
On Monday, 22 August 2016 at 10:26:35 UTC, Cauterite wrote:
 On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:

 Note that changing isAlpha() can potentially break any D code 
 with Unicode in its identifiers, because the DMD frontend uses 
 isAlpha() to determine which characters are allowed in 
 identifiers.
Well, the Unicode Consortium is famous for treating backward compatibility as a priority (in fact, the Unicode standard contains many strange things that are conceptually wrong but are needed to maintain compatibility). So updating the std.uni methods should not break anything; at most it would allow more inputs to be accepted.

I think the possibility of updating std.uni should be investigated further to see whether it's doable. By the way, the core team is very busy, so if Andrew (the OP) wants to make a PR himself, it would be welcome.
Aug 22 2016
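
The claim that an update would only accept more inputs can be tested mechanically: if the old set of alphabetic code points is a subset of the new one, nothing that used to pass will start failing. The following is a minimal sketch of that check using std.uni.CodepointSet. The helper name `onlyAdds` and both sets are hypothetical toy data, NOT the real Unicode 6.2 or 9.0 tables; a real check would load the two generated tables instead.

```d
import std.uni : CodepointSet;

/// Returns true if every code point in `older` is also in `newer`,
/// i.e. moving from `older` to `newer` removes nothing.
/// (Hypothetical helper, named for this illustration only.)
bool onlyAdds(CodepointSet older, CodepointSet newer)
{
    // Set difference: code points in `older` that are missing from `newer`.
    return (older - newer).empty;
}

void main()
{
    // Toy data: intervals are half-open [a, b). These stand in for real tables.
    auto older = CodepointSet(0x41, 0x5B, 0x61, 0x7B);       // A-Z, a-z
    auto newer = older | CodepointSet(0x1E900, 0x1E944);     // plus a later block

    assert(onlyAdds(older, newer));   // the update only adds code points
    assert(!onlyAdds(newer, older));  // the reverse direction would remove some
}
```

If such a check failed for the real tables, identifiers that compile today could stop compiling, which is the breakage scenario raised earlier in the thread.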
parent reply Andrew <andrew trotman.com> writes:
On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta 
wrote:
 By the way, the core team is very busy so if Andrew (the OP) 
 wants to make a PR himself, it would be welcome.
Is there a tool somewhere that parses the UnicodeData.txt and PropList.txt and generates all the tries? I took a quick look but didn't see one alongside the std.uni source code.

Andrew.
Aug 27 2016
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 8/27/16 9:40 AM, Andrew wrote:
 On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta wrote:
 By the way, the core team is very busy so if Andrew (the OP) wants to
 make a PR himself, it would be welcome.
 Is there a tool somewhere that parses the UnicodeData.txt and PropList.txt and generates all the tries? I took a quick look but didn't see one alongside the std.uni source code.
 Andrew.
An awful oversight. The tool still sits in its GSOC 2012 repo:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.d
And the script to run it:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.sh
Aug 28 2016
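
For readers unfamiliar with the input files, the sketch below shows roughly what the first stage of such a generator has to do: read PropList.txt-style lines (`start..end ; Property # comment`) and collect the ranges for one property into a CodepointSet. This is an assumption-laden illustration, not how gen_uni.d is actually written; the function name `parseProperty` is hypothetical, and the trie generation step is omitted entirely.

```d
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.string : strip;
import std.uni : CodepointSet;

/// Collects all code points tagged with `property` from lines in the
/// UCD property-file format. (Hypothetical helper for illustration.)
CodepointSet parseProperty(string[] lines, string property)
{
    CodepointSet set;
    foreach (line; lines)
    {
        // Drop the trailing "# ..." comment, then skip blank lines.
        auto data = line.findSplit("#")[0].strip;
        if (data.length == 0)
            continue;

        // Split "range ; Property" and keep only the requested property.
        auto parts = data.findSplit(";");
        if (parts[2].strip != property)
            continue;

        // The range is either a single code point or "start..end" (inclusive).
        auto range = parts[0].strip.findSplit("..");
        uint lo = range[0].to!uint(16);
        uint hi = range[2].length ? range[2].to!uint(16) : lo;

        // CodepointSet.add takes a half-open interval [a, b).
        set.add(lo, hi + 1);
    }
    return set;
}

void main()
{
    // A few lines in the PropList.txt layout, used here as sample input.
    string[] sample = [
        "# comment line",
        "0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>",
        "0020          ; White_Space # Zs       SPACE",
        "002D          ; Dash # Pd       HYPHEN-MINUS",
    ];

    auto ws = parseProperty(sample, "White_Space");
    assert(ws[0x0009] && ws[0x000D] && ws[0x0020]);
    assert(!ws[0x002D]);   // belongs to a different property
}
```

A real generator would read the files from disk and then build the lookup tries from the resulting sets.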
prev sibling parent Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
 Hi,

 It appears as though the Phobos Unicode methods (such as 
 std.uni.isAlpha()) are compatible with Unicode 6.2 (i.e. the 
 2012 standard, which is now nearly 5 years old). Unicode is now 
 up to version 9.0.  Changes do include changes to 
 std.uni.isAlpha(), and other methods.

 Is there either an updated version of std.uni, or are there 
 plans to update it?

 Thanks
 Andrew.
Please file a bug report on issues.dlang.org.
Aug 22 2016