digitalmars.D.bugs - [Issue 13268] New: Implement greedy alternation in std.regex
- via Digitalmars-d-bugs (27/27) Aug 07 2014 https://issues.dlang.org/show_bug.cgi?id=13268
https://issues.dlang.org/show_bug.cgi?id=13268

          Issue ID: 13268
           Summary: Implement greedy alternation in std.regex
           Product: D
           Version: D2
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: Phobos
          Assignee: nobody puremagic.com
          Reporter: hsteoh quickfur.ath.cx

Currently, the | operator works on a first-match basis, such that a pattern like (ab)|(abcd) will never match the second alternative, because (ab) is always matched first.

It would be nice if there were a way to do greedy matching between alternations, such that an alternation a|b|c|... always prefers the longest match.

This will probably have performance implications, so perhaps a "greedy alternation" operator distinct from | should be used. Something like |* might be a possible syntax: (ab)|*(abcd) will capture (abcd) if the input contains "abcd", and fall back to (ab) only if the input contains "ab" but not "abcd".

Precedents for greedy alternation include lex / flex, which take a list of input regexes and always perform longest-match on them. In essence, given a list of patterns P1, P2, ..., the equivalent of P1 |* P2 |* ... is performed.
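For illustration, here is a minimal sketch of how the requested longest-match behavior could be emulated today with the existing std.regex API. It is not an implementation of the proposed |* operator, which does not exist. The helper name longestMatch is hypothetical, and the sketch assumes, as lex / flex do, that all alternatives are tried at the same input position (here, the start of the input).

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper (not part of std.regex): try every alternative
// anchored at the start of the input and keep the longest hit.
// Returns an empty string if no alternative matches.
string longestMatch(string input, string[] patterns)
{
    string best = null;
    foreach (p; patterns)
    {
        // Wrap each alternative in a non-capturing group and anchor it.
        auto m = matchFirst(input, regex("^(?:" ~ p ~ ")"));
        if (!m.empty && m.hit.length > best.length)
            best = m.hit;
    }
    return best;
}

void main()
{
    // Ordinary alternation: per the report, first-match semantics
    // mean the (ab) alternative wins here even though (abcd) also fits.
    auto first = matchFirst("abcd", regex("(ab)|(abcd)"));
    writeln("first-match hit:   ", first.hit);

    // Emulated longest-match over the same two alternatives.
    writeln("longest-match hit: ", longestMatch("abcd", ["ab", "abcd"]));
}
```

This compiles one regex per alternative and runs each of them, so it costs more than a single alternation. That is in line with the performance concern raised above, and a built-in operator could presumably do better.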
Aug 07 2014