digitalmars.D.learn - negative assertion support for RegExp?
- Thomas Kühne (39/39) Aug 13 2005 Is there any D library that offers regular expressions with negative...
- Manfred Nowak (7/9) Aug 13 2005 Thomas Kühne wrote:
- AJG (4/12) Aug 14 2005 To save himself that bit of programming? ;) Regexes are currently somewh...
- Thomas Kühne (11/29) Aug 14 2005 What I gave was a very simple regex. The production ones are nested,...
- Manfred Nowak (14/17) Aug 14 2005 Thomas Kühne wrote:
- AJG (23/62) Aug 14 2005 Hi Thomas,
- Derek Parnell (7/14) Aug 14 2005 I don't see that O-O is a requirement. A simple procedural API is quite
- Thomas Kühne (17/50) Aug 14 2005 Thanks for the code :))) The main.d sample requires two small changes...
Is there any D library that offers regular expressions with negative assertion support?

There seems to be no documented way to use negative assertions in Phobos' regular expressions (http://digitalmars.com/ctg/regular.html). Usually the syntax "(?!doNotMatch)" is used for that on Linux systems.

Thomas

-- sample code ---
import std.regexp;
import std.stdio;

int main(){
    // two log lines; only the second one should be rewritten
    char[] log = "IP:127.0.0.1; USER:some; additional info\n"
                 "IP:123.3.8.17; USER:other; additional info\n";
    // (?!...) is the negative assertion that std.regexp does not document
    char[] pattern = "^(?!IP:(127[.]0[.]0[.]1)); USER:([^; ]*);";
    char[] format = "; USER:$2 $1;";
    char[] attributes = "g";

    char[] filtered = sub(log, pattern, format, attributes);

    writef("---unfiltered---\n%s\n", log);
    writef("---filtered---\n%s\n", filtered);
    return 0;
}

/* Expected Output:
---unfiltered---
IP:127.0.0.1; USER:some; additional info
IP:123.3.8.17; USER:other; additional info

---filtered---
IP:127.0.0.1; USER:some; additional info
IP:123.3.8.17; USER:other 123.3.8.17; additional info
*/
Aug 13 2005
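For comparison, here is a minimal sketch of the requested behaviour in present-day D. It assumes the D2 Phobos module std.regex, which is not the std.regexp module discussed in this thread and which documents "(?!...)" as negative lookahead. The function name rewriteLog is hypothetical, and the pattern is restructured from the sample so that the IP address is captured outside the assertion.

```d
import std.regex : regex, replaceAll;
import std.stdio : write;

/// Appends the IP after the user name on every line that does NOT
/// come from 127.0.0.1 (hypothetical helper, not part of Phobos).
string rewriteLog(string log)
{
    // "m" lets ^ match at the start of every line.
    // (?!127[.]0[.]0[.]1;) rejects the loopback address without consuming it.
    // Group 1 = IP address, group 2 = user name.
    auto re = regex(r"^IP:(?!127[.]0[.]0[.]1;)([^;]*); USER:([^; ]*);", "m");
    return replaceAll(log, re, "IP:$1; USER:$2 $1;");
}

void main()
{
    string log = "IP:127.0.0.1; USER:some; additional info\n"
               ~ "IP:123.3.8.17; USER:other; additional info\n";
    write(rewriteLog(log));
}
```

Because a lookahead group never keeps a capture when it is negated, the IP has to be captured by a separate group after the assertion; that is the only structural change from the sample pattern.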
Thomas Kühne <thomas-dloop kuehne.THISISSPAM.cn> wrote:

[...]
> Is there any D library that offers regular expressions with negative assertion support?
[...]

Why do you need this? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.

-manfred
Aug 13 2005
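The "little bit of programming" suggested here could look roughly like the sketch below: the exclusion is checked by hand on each line, and only a plain pattern without assertions is handed to the regex engine. This is an illustration in present-day D using std.regex and std.string.splitLines rather than the std.regexp, split, find and rfind of 2005; the name filterLog and its parameters are hypothetical.

```d
import std.algorithm.searching : startsWith;
import std.array : join;
import std.regex : regex, replaceFirst;
import std.stdio : writeln;
import std.string : splitLines;

/// Rewrites every line except those that start with excludedPrefix
/// (hypothetical helper; the prefix test replaces the negative assertion).
string filterLog(string log, string excludedPrefix)
{
    // Plain pattern, no lookahead: group 1 = IP address, group 2 = user name.
    auto re = regex(r"^IP:([^;]*); USER:([^; ]*);");
    string[] result;
    foreach (line; log.splitLines())
    {
        if (line.startsWith(excludedPrefix))
            result ~= line; // excluded line, copied unchanged
        else
            result ~= replaceFirst(line, re, "IP:$1; USER:$2 $1;");
    }
    return result.join("\n");
}

void main()
{
    string log = "IP:127.0.0.1; USER:some; additional info\n"
               ~ "IP:123.3.8.17; USER:other; additional info\n";
    writeln(filterLog(log, "IP:127.0.0.1;"));
}
```

This works for a single, line-anchored exclusion; each further assertion in the pattern needs its own hand-written check.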
In article <ddmogt$19aq$1 digitaldaemon.com>, Manfred Nowak says...
> Thomas Kühne <thomas-dloop kuehne.THISISSPAM.cn> wrote:
> [...]
>> Is there any D library that offers regular expressions with negative assertion support?
> [...]
> Why do you need such? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.

To save himself that bit of programming? ;) Regexes are currently somewhat limited in phobos. I find myself missing Perl features all the time.

--AJG.
Aug 14 2005
AJG wrote:
> In article <ddmogt$19aq$1 digitaldaemon.com>, Manfred Nowak says...
>> Thomas Kühne <thomas-dloop kuehne.THISISSPAM.cn> wrote:
>> [...]
>>> Is there any D library that offers regular expressions with negative assertion support?
>> [...]
>> Why do you need such? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.
> To save himself that bit of programming? ;) Regexes are currently somewhat limited in phobos. I find myself missing Perl features all the time.

What I gave was a very simple regex. The production ones are nested, include alternatives and contain more than one negative assertion.

Thomas
Aug 14 2005
Thomas Kühne <thomas-dloop kuehne.THISISSPAM.cn> wrote:

[...]
> What I gave was a very simple regex. The production ones are nested, include alternatives and contain more than one negative assertion.
[...]

Then I do not believe that an approach with REs and "assertions" is feasible in terms of run-time requirements in the first place, nor in terms of time for development and maintenance, because you are implementing some sort of lexer/parser for a language for which you have neither an explicit formal grammar nor definitions of the lexical tokens.

I do not know the details of the implementation of PCRE, but I do not believe that a tool that has its emphasis on REs incidentally also implements an LALR parser.

-manfred
Aug 14 2005
Hi Thomas,

Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors. The only problem is that it's not object-oriented (it's the C API).

Anyway, I'm going to upload the code and maybe you can use that. You can find example code in main.d. All you need essentially is:

And off you go. If you have Build you can do:

% build main

And that's it. Let me know if you find it useful. If there's enough interest, I could develop a D-based OO interface for it, and maybe Walter will consider it for inclusion in phobos to replace the old regex.

Some technical notes: I ported the code with SUPPORT_UTF8, but _not_ with SUPPORT_UCP because that was just a lot of bloat. Also, the LINK_SIZE I selected was 2, the default.

Here's the link:
http://pantheon.yale.edu/~ajg36/pcre.zip

Enjoy!
--AJG.

In article <ddkoss$2u5m$1 digitaldaemon.com>, Thomas Kühne says...
> Is there any D library that offers regular expressions with negative assertion support?
>
> There seems to be no documented way to use negative assertions in Phobos' regular expressions (http://digitalmars.com/ctg/regular.html). Usually the syntax "(?!doNotMatch)" is used for that on Linux systems.
> [...]
Aug 14 2005
On Sun, 14 Aug 2005 08:02:41 +0000 (UTC), AJG wrote:
> Hi Thomas, Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors. The only problem is that it's not object-oriented (it's the C API).

I don't see that O-O is a requirement. A simple procedural API is quite satisfactory.

--
Derek Parnell
Melbourne, Australia
14/08/2005 11:07:28 PM
Aug 14 2005
AJG wrote:
> Hi Thomas, Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors. The only problem is that it's not object-oriented (it's the C API).
> [...]
> Here's the link: http://pantheon.yale.edu/~ajg36/pcre.zip

Thanks for the code :)))

The main.d sample requires two small changes:

    line 1
    < private import pcre_c;
    > private import pcre;

    line 8
    < pcre *re;
    > pcre.pcre *re;

I think PCRE_D - after a bit of clean up and some unittests - might become a valuable Phobos addon.

Thomas
Aug 14 2005