digitalmars.D - unicode combining mark / std.uni question
- ikod (49/49) Dec 05 2017 Hello,
- Dmitry Olshansky (4/13) Dec 08 2017 Well std.uni gives you ability to build fast lookup tables and a
Hello,

I have to create a very basic IDNA (Internationalized Domain Names in Applications) library. There are two parts to IDNA: user input checks and punycode encoding/decoding. The punycode part is already complete, and now I have to add the checks, but I'm weak in Unicode and can't find a proper way to express these tests using std.uni.

Here is the list of prohibited domain labels (https://tools.ietf.org/html/rfc5891):

o Labels whose first character is a combining mark (see The Unicode Standard, Section 2.11 [Unicode]).

o Labels containing prohibited code points, i.e., those that are assigned to the "DISALLOWED" category of the Tables document [RFC5892].

o Labels containing code points that are identified in the Tables document as "CONTEXTJ", i.e., requiring exceptional contextual rule processing on lookup, but that do not conform to those rules. Note that this implies that a rule must be defined, not null: a character that requires a contextual rule but for which the rule is null is treated in this step as having failed to conform to the rule.

o Labels containing code points that are identified in the Tables document as "CONTEXTO", but for which no such rule appears in the table of rules. Applications resolving DNS names or carrying out equivalent operations are not required to test contextual rules for "CONTEXTO" characters, only to verify that a rule is defined (although they MAY make such tests to provide better protection or give better information to the user).

o Labels containing code points that are unassigned in the version of Unicode being used by the application, i.e., in the UNASSIGNED category of the Tables document.

Can anybody help with this task?

Thanks!
Dec 05 2017
On Tuesday, 5 December 2017 at 20:04:29 UTC, ikod wrote:
> Hello, I have to create a very basic IDNA (Internationalized Domain Names in Applications) library. There are two parts to IDNA: user input checks and punycode encoding/decoding. The punycode part is already complete, and now I have to add the checks, but I'm weak in Unicode and can't find a proper way to express these tests using std.uni.
> Here is the list of prohibited domain labels (https://tools.ietf.org/html/rfc5891):

Well, std.uni gives you the ability to build fast lookup tables, plus a type for sets of code points (CodepointSet). I don't think any of the sets you listed come prepared in std. The combining marks may be; check the docs.
Dec 08 2017
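To make the suggestion above a bit more concrete, here is a minimal sketch of how the first two checks might look with std.uni. It assumes that "combining mark" can be read as general category Mn, Mc or Me, which is what std.uni.isMark tests. The helper names startsWithCombiningMark and containsAny are hypothetical, and sampleSet is only a placeholder covering ASCII uppercase letters; the real DISALLOWED, CONTEXTJ, CONTEXTO and UNASSIGNED sets would have to be generated from the RFC 5892 tables, since they do not appear to ship with Phobos.

```d
import std.range.primitives : empty, front;
import std.uni : CodepointSet, isMark;

/// Hypothetical helper: true if the first code point of the label is a
/// combining mark (general category Mn, Mc or Me, as tested by isMark).
bool startsWithCombiningMark(string label)
{
    // front decodes the first UTF-8 sequence of the string into a dchar
    return !label.empty && isMark(label.front);
}

/// Hypothetical helper: true if any code point of the label is in the set.
bool containsAny(string label, CodepointSet set)
{
    foreach (dchar c; label) // foreach with dchar decodes UTF-8 on the fly
        if (set[c])
            return true;
    return false;
}

unittest
{
    // U+0301 COMBINING ACUTE ACCENT is a nonspacing mark (Mn)
    assert(startsWithCombiningMark("\u0301abc"));
    assert(!startsWithCombiningMark("abc"));
    assert(!startsWithCombiningMark(""));

    // Placeholder set only: the half-open interval [U+0041, U+005B), i.e. A-Z.
    // A real implementation would build its sets from the RFC 5892 tables.
    auto sampleSet = CodepointSet(0x41, 0x5B);
    assert(containsAny("Abc", sampleSet));
    assert(!containsAny("abc", sampleSet));
}
```

The unittest block can be run with something like `dmd -unittest -main file.d`. If membership tests turn out to be a bottleneck, std.uni also offers trie-based lookup tables that can be built from a CodepointSet; the plain set is used here only to keep the sketch short.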