digitalmars.D - Formal Review of std.regex (FReD)
- Jesse Phillips (31/31) Oct 08 2011 Hello everyone,
- Walter Bright (7/9) Oct 08 2011 1. There are many different regular expressions for strings. Should incl...
- Dmitry Olshansky (8/18) Oct 08 2011 While I do mention ECMA-262 falvor, I agree a table right there is far
- Walter Bright (3/4) Oct 08 2011 Welcs. And I might add that I do greatly appreciate the work you've done...
- Dmitry Olshansky (13/43) Oct 08 2011 Thanks, updated and now it works on linux for me. Though it wasn't that
- Andrei Alexandrescu (4/8) Oct 08 2011 That may be a bug in the compiler. A symbol shouldn't be visible unless
- Christian Kamm (9/17) Oct 09 2011 It's definitely a bug. Once an import is processed, the package is visib...
- Andrei Alexandrescu (4/21) Oct 09 2011 Hm, this is important. But what is the contribution of a.d to the
- Christian Kamm (2/29) Oct 09 2011 Yes, 'dmd b.d' fails, 'dmd a.d b.d' succeeds.
- Brad Roberts (2/29) Oct 09 2011 Isn't this bug #314? Very well known, super old, highly voted for, etc,...
- Christian Kamm (6/8) Oct 09 2011 No, bug 314 is about privately imported symbols being accessible even th...
- Christian Kamm (3/16) Oct 09 2011 Heh, I actually reported it a while ago and then forgot about it. :)
- Jacob Carlborg (4/12) Oct 09 2011 I think it's a bug, but sometimes it can be useful.
- Jacob Carlborg (5/15) Oct 09 2011 What's the difference between Regex and RegEx? I can see RegEx in the
- Dmitry Olshansky (6/23) Oct 09 2011 RegEx is a template parameter (it's that usual abstract 'T'), that in
- Jacob Carlborg (5/28) Oct 09 2011 I don't think the documentation should refer to RegEx if it's not
- Dmitry Olshansky (4/32) Oct 09 2011 Yes, I think I see the typo now, thanks.
- Jacob Carlborg (5/41) Oct 09 2011 The second parameter type of the match function (and a couple of other
- Dmitry Olshansky (10/49) Oct 09 2011 No, that's what I tried to point out but failed obviously.
- Jacob Carlborg (5/54) Oct 09 2011 Aha, ok, I see. Could RegEx be explained in the docs so it won't cause
- Dmitry Olshansky (11/65) Oct 10 2011 Mm... it could get even more confusing.
- Alix Pexton (9/9) Oct 09 2011 I've not had a proper look at the code yet, but I recall from when I
- Jerry (7/7) Oct 11 2011 I have 2 thoughts.
- Dmitry Olshansky (11/18) Oct 12 2011 Looks like I was tricked by their technical standard then.
- Dmitry Olshansky (6/6) Oct 12 2011 Fresh version of documentation is here:
- kennytm (2/7) Oct 12 2011 The '.' really matches any character, including the new line '\n'?
- Dmitry Olshansky (4/11) Oct 12 2011 Hm, yes. Is that a problem?
- kennytm (6/17) Oct 12 2011 Most regex flavors don't match '\n' by default unless you supply the "s"
- Jesse Phillips (3/6) Oct 12 2011 Really? Sense when? I didn't know there was any that didn't match \n. If...
- Andrei Alexandrescu (5/11) Oct 12 2011 Kenny's right.
- Dmitry Olshansky (12/24) Oct 13 2011 The funny thing is that multiline mode affects only ^ & $ anchors. And
- Jacob Carlborg (9/19) Oct 12 2011 Shouldn't "." exclude newlines? I think this is a good reference:
- Dmitry Olshansky (6/10) Oct 15 2011 Updated, with single-line mode and a few documentation fixes.
- Jesse Phillips (5/46) Oct 22 2011 Please note that the review will be ending this weekend in just 32 hours...
- Rainer Schuetze (18/64) Oct 22 2011 I haven't followed the discussion closely, and I cannot really comment
- Dmitry Olshansky (28/44) Oct 22 2011 Coincidentally, you still can access re.ir property in this way.
- Fawzi Mohamed (5/21) Oct 22 2011 that you'd get:
- Rainer Schuetze (10/58) Oct 23 2011 I think, this might be confused with normal usage, like "is this regex
- Dmitry Olshansky (11/80) Oct 23 2011 "" is a valid regex that matches anywhere, with global flag it will
- Rainer Schuetze (4/66) Oct 24 2011 You may be right. Maybe 'initialized', otherwise 'empty' isn't too bad
- Marco Leise (4/15) Oct 24 2011 but I prefer some speaking name here. Otherwise I'd believe 're' is a
Hello everyone,

I have taken the role of review manager for the std.regex replacement by Dmitry Olshansky. The review period begins now, 2011-10-08, and will end on 2011-10-23 at midnight UTC. A voting thread on inclusion into Phobos will be held after the review, assuming such is appropriate. The voting period is one week.

Please note that you can try FReD as part of Phobos (Code) or by itself (Package of FReD), which includes docs.

Doc: http://nascent.freeshell.org/fred/doc/
Code: https://github.com/blackwhale/phobos MASTER
Package of FReD: https://github.com/downloads/blackwhale/FReD/FReD.zip

Remember this will be replacing the current std.regex and is intended to be a drop-in replacement. This project is also part of GSoC.

Dmitry, I ask that you apply this patch to posix.mak (adding to internal modules).

--- a/posix.mak
+++ b/posix.mak
@@ -184,7 +184,8 @@
 std/c/, fenv locale math process stdarg stddef stdio stdlib time wcharh)
 EXTRA_MODULES += $(EXTRA_DOCUMENTABLES) $(addprefix \
 std/internal/math/, biguintcore biguintnoasm biguintx86 \
- gammafunction errorfunction) std/internal/processinit
+ gammafunction errorfunction) std/internal/processinit \
+ std/internal/uni std/internal/uni_tab
 D_MODULES = crc32 $(STD_MODULES) $(EXTRA_MODULES) $(STD_NET_MODULES)
Oct 08 2011
On 10/8/2011 12:56 PM, Jesse Phillips wrote:
> Doc: http://nascent.freeshell.org/fred/doc/

1. There are many different regular expressions for strings. Should include a link to whichever one fred uses. Feel free to crib from http://www.digitalmars.com/ctg/regular.html

2. Many of the examples can be wrapped in a void main(){ ... } so that they are compilable using cut & paste.

3. "Advanced Syntax" and other headings need to be bold faced.
Oct 08 2011
On 09.10.2011 0:30, Walter Bright wrote:
> 1. There are many different regular expressions for strings. Should include a link to whichever one fred uses. Feel free to crib from http://www.digitalmars.com/ctg/regular.html

While I do mention the ECMA-262 flavor, I agree that a table right there is far preferable. Will do.

> 2. Many of the examples can be wrapped in a void main(){ ... } so that they are compilable using cut & paste.

Indeed, I just thought it wasn't Phobos style. Now, looking through again, I see there are a lot of examples with void main().

> 3. "Advanced Syntax" and other headings need to be bold faced.

Right, thanks.

-- Dmitry Olshansky
Oct 08 2011
On 10/8/2011 1:43 PM, Dmitry Olshansky wrote:
> Right, thanks.

Welcs. And I might add that I do greatly appreciate the work you've done on this, I think it could be a showcase for D's capabilities.
Oct 08 2011
On 08.10.2011 23:56, Jesse Phillips wrote:
> Dmitry, I ask that you apply this patch to posix.mak (adding to internal modules).

Thanks, updated and now it works on Linux for me. Though it wasn't that simple.

I've found out what caused my builds to break. The thing is that both std.file & std.stdio use fully qualified std.c.stdio.func calls but never actually import std.c.stdio in any way. I wasn't even aware that's possible.

So I changed it to core.stdc in std.file and added a static import to std.stdio (some functions from std.c are not present in core.stdc, apparently). If there is any problem with that I can revert it and investigate why it affects only me ;)

-- Dmitry Olshansky
Oct 08 2011
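The workaround described in the post above, importing the C module explicitly so that a fully qualified call no longer depends on another module having pulled it in, can be sketched roughly as follows. This is a minimal illustration rather than the actual std.file or std.stdio change: the helper name cLength is hypothetical, and core.stdc.string is chosen only because its result is easy to check.

```d
// `static import` makes the module available under its fully qualified
// name only, so nothing unqualified leaks into this scope.
static import core.stdc.string;

// Hypothetical helper, for illustration only.
size_t cLength(const(char)* s)
{
    // Legal because the module is explicitly (statically) imported here.
    return core.stdc.string.strlen(s);
}

void main()
{
    // D string literals are zero-terminated, so they can be passed to C.
    assert(cLength("hello") == 5);
}
```

With the explicit import in place, the file compiles on its own and no longer relies on which other modules happen to be on the command line.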
On 10/8/11 3:34 PM, Dmitry Olshansky wrote:
> I've found out what caused my builds to break. The thing is that both std.file & std.stdio use fully qualified std.c.stdio.func calls but never actually import std.c.stdio in any way. I wasn't even aware that's possible.

That may be a bug in the compiler. A symbol shouldn't be visible unless e.g. publicly imported from an imported module (could that be the case)?

Andrei
Oct 08 2011
Andrei Alexandrescu wrote:
> That may be a bug in the compiler. A symbol shouldn't be visible unless e.g. publicly imported from an imported module (could that be the case)?

It's definitely a bug. Once an import is processed, the package is visible globally as long as the parent package is accessible. This compiles:

touch dmd2/src/phobos/std/empty.d

a.d:
import std.stdio;

b.d:
import std.empty;
void main() { std.stdio.writeln("hi!"); }
Oct 09 2011
On 10/9/11 2:26 AM, Christian Kamm wrote:
> It's definitely a bug. Once an import is processed, the package is visible globally as long as the parent package is accessible. This compiles:
> touch dmd2/src/phobos/std/empty.d
> a.d: import std.stdio;
> b.d: import std.empty; void main() { std.stdio.writeln("hi!"); }

Hm, this is important. But what is the contribution of a.d to the example? Do you compile it together with b.d?

Andrei
Oct 09 2011
Andrei Alexandrescu wrote:
> Hm, this is important. But what is the contribution of a.d to the example? Do you compile it together with b.d?

Yes, 'dmd b.d' fails, 'dmd a.d b.d' succeeds.
Oct 09 2011
On 10/9/2011 12:30 AM, Andrei Alexandrescu wrote:
> Hm, this is important. But what is the contribution of a.d to the example? Do you compile it together with b.d?

Isn't this bug #314? Very well known, super old, highly voted for, etc, ...
Oct 09 2011
Brad Roberts wrote:
> etc.

No, bug 314 is about privately imported symbols being accessible even though they shouldn't be. This problem is about modules that aren't imported at all in a file or any of its imports still being accessible.

current dmd/master: https://github.com/D-Programming-Language/dmd/pull/190
Oct 09 2011
Christian Kamm wrote:
> It's definitely a bug. Once an import is processed, the package is visible globally as long as the parent package is accessible.

Heh, I actually reported it a while ago and then forgot about it. :)
http://d.puremagic.com/issues/show_bug.cgi?id=6307
Oct 09 2011
On 2011-10-08 23:37, Andrei Alexandrescu wrote:
> That may be a bug in the compiler. A symbol shouldn't be visible unless e.g. publicly imported from an imported module (could that be the case)?

I think it's a bug, but sometimes it can be useful.

-- /Jacob Carlborg
Oct 09 2011
On 2011-10-08 21:56, Jesse Phillips wrote:
> Doc: http://nascent.freeshell.org/fred/doc/

What's the difference between Regex and RegEx? I can see RegEx in the documentation but I cannot find its definition in the docs.

-- /Jacob Carlborg
Oct 09 2011
On 09.10.2011 14:33, Jacob Carlborg wrote:
> What's the difference between Regex and RegEx? I can see RegEx in the documentation but I cannot find its definition in the docs.

RegEx is a template parameter (it's that usual abstract 'T') that in the end is deduced as StaticRegex!Char or Regex!Char, where Char is char/wchar/dchar.

-- Dmitry Olshansky
Oct 09 2011
On 2011-10-09 16:09, Dmitry Olshansky wrote:
> RegEx is a template parameter (it's that usual abstract 'T'), that in the end deduced as StaticRegex!Char or Regex!Char where Char is char/wchar/dchar.

I don't think the documentation should refer to RegEx if it's not defined in the docs.

-- /Jacob Carlborg
Oct 09 2011
On 09.10.2011 18:49, Jacob Carlborg wrote:
> I don't think the documentation should refer to RegEx if it's not defined in the docs.

Yes, I think I see the typo now, thanks.

-- Dmitry Olshansky
Oct 09 2011
On 2011-10-09 17:01, Dmitry Olshansky wrote:
> Yes, I think I see the typo now, thanks.

The second parameter type of the match function (and a couple of other functions) is RegEx, is that possible to fix as well?

-- /Jacob Carlborg
Oct 09 2011
On 09.10.2011 19:09, Jacob Carlborg wrote:
> The second parameter type of the match function (and a couple of other functions) is RegEx, is that possible to fix as well?

No, that's what I tried to point out but obviously failed. The thing is that it is a template parameter, and due to the constraint it can be either StaticRegex!Char or Regex!Char. They represent a pattern compiled as machine code or as bytecode, respectively, for the character width of Char. The 6 versions of compiled patterns do not have a common type in the end, nor is one technically possible (without some quite bad performance trade-offs).

-- Dmitry Olshansky
Oct 09 2011
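To make the point about the RegEx template parameter concrete, here is a small sketch of a function whose second parameter type is deduced from whichever kind of compiled pattern is passed in. It is written against present-day std.regex (regex, ctRegex, matchFirst), which may differ from the 2011 FReD code under review, and containsMatch is a hypothetical helper name.

```d
import std.regex;

// RegEx is an ordinary template type parameter: it is deduced from the
// argument, so the same body serves both a pattern compiled at run time
// and one compiled at compile time. Hypothetical helper.
bool containsMatch(R, RegEx)(R input, RegEx re)
{
    return !matchFirst(input, re).empty;
}

void main()
{
    auto runtimeRe = regex(`[0-9]+`);     // compiled when the program runs
    auto staticRe  = ctRegex!(`[0-9]+`);  // compiled during compilation

    assert(containsMatch("abc123", runtimeRe));
    assert(containsMatch("abc123", staticRe));
    assert(!containsMatch("abc", runtimeRe));
}
```

The caller never spells out the pattern type; only the function signature mentions RegEx, which is why the name shows up in the generated documentation.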
On 2011-10-09 17:29, Dmitry Olshansky wrote:
> No, that's what I tried to point out but failed obviously. The thing is that it is a templated parameter and due to constraint it could be either StaticRegex!Char or Regex!Char.

Aha, ok, I see. Could RegEx be explained in the docs so it won't cause further confusion?

-- /Jacob Carlborg
Oct 09 2011
On 09.10.2011 22:47, Jacob Carlborg wrote:
> Aha, ok, I see. Could RegEx be explained in the docs so it won't cause further confusion?

Mm... it could get even more confusing. I guess putting "The RegEx parameter can be either Regex!Char or StaticRegex!Char depending on the actual type of pattern passed" all over the place won't cut it. Placing it somewhere at the top has the disadvantage of lacking any prior context, and most users will miss it anyway.

Maybe I'll just add a Params: section with a short description to all functions that still lack one.

-- Dmitry Olshansky
Oct 10 2011
I've not had a proper look at the code yet, but I recall from when I read the docs during the pre-review period that the introduction was a little on the informal side. It doesn't seem to have changed since then, and IMHO the introduction/description needs a bit of a polish to bring it up to the standard that is required of official documentation. I'll be busy over the next few weeks, but I will try to make time to assemble some more specific comments. I just wanted to let you know that I thought the docs needed some work, just in case. A...
Oct 09 2011
I have 2 thoughts.

1) Minor doc typo: Long form for hex notation should be \U00YYYYYY.

2) Unicode set syntax
If you're going to provide unicode set support, why not use ICU syntax rather than invent another one?

Jerry
Oct 11 2011
On 12.10.2011 0:04, Jerry wrote:
> 1) Minor doc typo: Long form for hex notation should be \U00YYYYYY.

Yeah, \U it is.

> 2) Unicode set syntax
> If you're going to provide unicode set support, why not use ICU syntax rather than invent another one?

Looks like I was tricked by their technical standard then. I can't immediately recall where this syntax was ever used, but:
http://unicode.org/reports/tr18/#Subtraction_and_Intersection

The prime reason cited there is that e.g. '--' is (almost) unambiguous with the range notation '-' and also allows skipping [] where applicable: [\p{letter}--a-z] vs [[\p{letter}]-[a-z]]. Come to think of it, '--' is cleaner in this case.

-- Dmitry Olshansky
Oct 12 2011
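As a rough illustration of the '--' set-difference syntax discussed above, the following sketch uses present-day std.regex, which may not match the 2011 FReD build exactly. It writes the letter property in its short form \p{L} rather than \p{letter}, and firstNonLowercaseRun is a hypothetical helper name.

```d
import std.regex;

// Returns the first run of letters that are not in a-z, or "" if none.
// Hypothetical helper, for illustration only.
string firstNonLowercaseRun(string input)
{
    // Set difference: all letters minus the range a-z. No inner brackets
    // are needed around either operand.
    auto re = regex(`[\p{L}--a-z]+`);
    auto m = matchFirst(input, re);
    return m.empty ? "" : m.hit;
}

void main()
{
    assert(firstNonLowercaseRun("abcXYdef") == "XY");
    assert(firstNonLowercaseRun("abc") == "");
}
```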
Fresh version of documentation is here: http://blackwhale.github.com/ This fixes all typos reported so far, adds missing overload of replace (ouch!) and introduces a brand new syntax table. -- Dmitry Olshansky
Oct 12 2011
Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> Fresh version of documentation is here: http://blackwhale.github.com/
> This fixes all typos reported so far, adds missing overload of replace (ouch!) and introduces a brand new syntax table.

The '.' really matches any character, including the new line '\n'?
Oct 12 2011
On 12.10.2011 23:32, kennytm wrote:
> The '.' really matches any character, including the new line '\n'?

Hm, yes. Is that a problem?

-- Dmitry Olshansky
Oct 12 2011
Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> On 12.10.2011 23:32, kennytm wrote:
>> The '.' really matches any character, including the new line '\n'?
> Hm, yes. Is that a problem?

Most regex flavors don't match '\n' by default unless you supply the "s" flag -- including ECMAScript (well it doesn't even provide the "s" flag to allow '.' to match all characters).

While I am OK with having "s" turned on by default, this should at least be documented explicitly.
Oct 12 2011
On Wed, 12 Oct 2011 23:35:49 +0000, kennytm wrote:
> Most regex flavors don't match '\n' by default unless you supply the "s" flag -- including ECMAScript (well it doesn't even provide the "s" flag to allow '.' to match all characters).

Really? Since when? I didn't know there was any that didn't match \n. If you want to match everything that is not a new line, use [^\n].
Oct 12 2011
On 10/12/11 9:50 PM, Jesse Phillips wrote:
> Really? Sense when? I didn't know there was any that didn't match \n. If you want to match everything not a new line [^\n].

Kenny's right. http://www.regular-expressions.info/dot.html

Engines have special options for multiline.

Andrei
Oct 12 2011
On 13.10.2011 8:38, Andrei Alexandrescu wrote:
> Kenny's right. http://www.regular-expressions.info/dot.html
> Engines have special options for multiline.

The funny thing is that multiline mode affects only the ^ and $ anchors, and single-line mode affects only whether . matches \r and \n. So it's entirely possible to use both at the same time.

But anyway, I guess I have to bite the bullet: add the 's' option and introduce classic semantics by default.

BTW, in Unicode, end of line is much more than just \r or \n and, among other things, includes the "unbreakable" two-codepoint sequence '\r\n'. I wonder if any engine matches . in the middle of \r\n, or whether they detect and stop on any of the other end-of-line characters.

-- Dmitry Olshansky
Oct 13 2011
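The distinction drawn in the post above, that "s" changes only what '.' matches while "m" changes only where ^ and $ anchor, can be checked with a short sketch. It assumes the flag letters and the "classic" default of present-day std.regex, which is what the post proposes to adopt; found is a hypothetical helper name.

```d
import std.regex;

// True if `pattern`, compiled with `flags`, matches somewhere in `input`.
// Hypothetical helper, for illustration only.
bool found(string input, string pattern, string flags = "")
{
    return !matchFirst(input, regex(pattern, flags)).empty;
}

void main()
{
    // Classic default: '.' does not cross a line break.
    assert(!found("a\nb", "a.b"));
    // "s" (single-line mode): '.' matches '\n' as well.
    assert(found("a\nb", "a.b", "s"));

    // Default: '^' cannot anchor in the middle of the input.
    assert(!found("a\nb\nc", "^b$"));
    // "m" (multi-line mode): anchors also match at line boundaries.
    assert(found("a\nb\nc", "^b$", "m"));

    // The flags are independent, so they can be combined.
    assert(found("a\nb\nc", "^b.c$", "sm"));
}
```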
On 2011-10-12 21:41, Dmitry Olshansky wrote:
> On 12.10.2011 23:32, kennytm wrote:
>> The '.' really matches any character, including the new line '\n'?
> Hm, yes. Is that a problem?

Shouldn't "." exclude newlines? I think this is a good reference: http://www.regular-expressions.info/reference.html

Which says: "Matches any single character except line break characters \r and \n. Most regex flavors have an option to make the dot match line break characters too."

-- /Jacob Carlborg
Oct 12 2011
On 12.10.2011 22:17, Dmitry Olshansky wrote:
> Fresh version of documentation is here: http://blackwhale.github.com/
> This fixes all typos reported so far, adds missing overload of replace (ouch!) and introduces a brand new syntax table.

Updated, with single-line mode and a few documentation fixes.

Source code is still here: https://github.com/blackwhale/phobos

-- Dmitry Olshansky
Oct 15 2011
Please note that the review will be ending this weekend, in just 32 hours, at which point voting will begin. Please do not wait for voting to criticize the library.

Updated documentation: http://blackwhale.github.com/

On Sat, 08 Oct 2011 19:56:32 +0000, Jesse Phillips wrote:
> Hello everyone, I have taken the role of review manager of the std.regex replacement by Dmitry Olshansky. The review period begins now 2011-10-8 and will end on 2011-10-23 at midnight UTC.
Oct 22 2011
I haven't followed the discussion closely, and I cannot really comment on the core regex functionality, but I did actually use FReD as a replacement of a buggy std.regex once.

In that case I wanted to have a lazily created static regex, but I did not find an official way to test whether a Regex has been initialized:

    static Regex!char re;
    if(!isInitializedRE(re))
        re = regex(r"^(.*)\(([0-9]+)\):(.*)$");

So I implemented isInitializedRE() as "re.ir !is null" for std.regex and "re.captures() > 0" for FReD, but that means FReD fails to be a "drop-in replacement" here. I think both versions use implementation specifics; maybe there should be a documented way to test for being initialized.

I also noticed that "auto match(R, RegEx)(R input, RegEx re);" appears twice in the documentation, same for "bmatch". I guess they should not appear together with the string versions.

Rainer

On 22.10.2011 18:21, Jesse Phillips wrote:
> Please note that the review will be ending this weekend in just 32 hours.
> [...]
Oct 22 2011
On 22.10.2011 20:56, Rainer Schuetze wrote:
> I haven't followed the discussion closely, and I cannot really comment on the core regex functionality, but I did actually use FReD as a replacement of a buggy std.regex once.
> In that case I wanted to have a lazily created static regex, but I did not find an official way to test whether a Regex has been initialized:
>
>     static Regex!char re;
>     if(!isInitializedRE(re))
>         re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
>
> So I implemented isInitializedRE() as "re.ir !is null" for std.regex and "re.captures() > 0" for fred, but that fails for being a "drop-in replacement".

Coincidentally, you still can access the re.ir property in this way. Wow, I wonder how far with backwards compatibility I can go :)
In both cases this relies on undocumented features. Even now I can suggest a more portable and entirely generic way:

    if(re == Regex!(char).init)
    {
        //create re
    }

Though that risks doing more work than needed.

> I think, both versions use implementation specifics, maybe there should be a documented way to test for being initialized.

Definitely. How about adding an empty property + opCast to bool, with that you'd get:

    if(!re)
    {
        //create re
    }

and a bit more verbose:

    if(re.empty)
    {
        //create re
    }

> I also noticed, that "auto match(R, RegEx)(R input, RegEx re);" appears twice in the documentation, same for "bmatch". I guess they should not appear together with the string versions.

I gather that happens because there is another overload specifically for C-T regexes. Its docs state just that, but lacking the template constraint the signatures are the same, so it indeed can cause some confusion. Maybe it would be better to just combine the docs together, and leave one overload undocumented.

-- 
Dmitry Olshansky
Oct 22 2011
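The lazily created static regex discussed above could be sketched as follows. This assumes the `empty` property proposed in the thread, which, as far as I know, the present-day std.regex `Regex` type provides; the function name `lineRegex` and the sample input are hypothetical and only serve the illustration.

```d
import std.regex : Regex, matchFirst, regex;

/// Hypothetical accessor: compiles the pattern on first use and
/// hands back the cached Regex on every later call.
Regex!char lineRegex()
{
    static Regex!char re;   // thread-local, starts out as Regex!char.init
    if (re.empty)           // nothing compiled yet
        re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
    return re;
}

void main()
{
    // Input shaped like a compiler message: file(line): text
    auto m = matchFirst("file.d(42): some error", lineRegex());
    assert(!m.empty);
    assert(m[1] == "file.d");
    assert(m[2] == "42");
    assert(m[3] == " some error");
}
```

Unlike comparing against `Regex!(char).init`, the `empty` check does not depend on how equality is defined for the struct, which is the extra work the comparison risks.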
On Oct 22, 2011, at 12:05 PM, Dmitry Olshansky wrote:
> On 22.10.2011 20:56, Rainer Schuetze wrote:
>> [...]
>> I think, both versions use implementation specifics, maybe there should be a documented way to test for being initialized.
>
> Definitely. How about adding an empty property + opCast to bool, with that you'd get:
>
>     if(!re)
>     {
>         //create re
>     }

I think this is better, should one ever want to switch to a plain pointer..., also you need less thinking if it works like for classes.

> and a bit more verbose:
>
>     if(re.empty)
>     {
>         //create re
>     }
Oct 22 2011
On 22.10.2011 21:05, Dmitry Olshansky wrote:
> On 22.10.2011 20:56, Rainer Schuetze wrote:
>> I haven't followed the discussion closely, and I cannot really comment on the core regex functionality, but I did actually use FReD as a replacement of a buggy std.regex once.
>> In that case I wanted to have a lazily created static regex, but I did not find an official way to test whether a Regex has been initialized:
>>
>>     static Regex!char re;
>>     if(!isInitializedRE(re))
>>         re = regex(r"^(.*)\(([0-9]+)\):(.*)$");
>>
>> So I implemented isInitializedRE() as "re.ir !is null" for std.regex and "re.captures() > 0" for fred, but that fails for being a "drop-in replacement".
>
> Coincidentally, you still can access re.ir property in this way. Wow, I wonder how far with backwards compatibility I can go :)
> In both cases this relies on undocumented features. Even now I can suggest a more portable and entirely generic way:
>
>     if(re == Regex!(char).init)
>     {
>         //create re
>     }
>
> Though that risks doing more work than needed.
>
>> I think, both versions use implementation specifics, maybe there should be a documented way to test for being initialized.
>
> Definitely. How about adding an empty property + opCast to bool, with that you'd get:
>
>     if(!re)
>     {
>         //create re
>     }
>
> and a bit more verbose:
>
>     if(re.empty)
>     {
>         //create re
>     }

I think this might be confused with normal usage, like "is this regex the empty string?" (Is "" a valid regex?). Maybe a more explicit "valid()" predicate would be fine.

>> I also noticed, that "auto match(R, RegEx)(R input, RegEx re);" appears twice in the documentation, same for "bmatch". I guess they should not appear together with the string versions.
>
> I gather that happens because there is another overload specifically for C-T regexes. Its docs state just that, but lacking the template constraint the signatures are the same, so it indeed can cause some confusion. Maybe it would be better to just combine docs together, and leave one overload undocumented.

As RegEx is a template argument here, it can stand for both Regex and StaticRegex, and that should be mentioned. Whether it has two different implementations is an implementation detail that does not need to bother the user.
If you want to keep the second entries, I'd recommend renaming the argument to StaticRegEx.
Oct 23 2011
On 23.10.2011 11:28, Rainer Schuetze wrote:
> On 22.10.2011 21:05, Dmitry Olshansky wrote:
>> On 22.10.2011 20:56, Rainer Schuetze wrote:
>>> [...]
>>> I think, both versions use implementation specifics, maybe there should be a documented way to test for being initialized.
>>
>> Definitely. How about adding an empty property + opCast to bool, with that you'd get:
>>
>>     if(!re)
>>     {
>>         //create re
>>     }
>>
>> and a bit more verbose:
>>
>>     if(re.empty)
>>     {
>>         //create re
>>     }
>
> I think, this might be confused with normal usage, like "is this regex the empty string?" (Is "" a valid regex?). Maybe a more explicit "valid()" predicate would be fine.

"" is a valid regex that matches anywhere; with the global flag it will match before any codepoint + once at the end.
I'm not sure using 'valid' is good, it may mislead the user into checking it all over the place, e.g.:

    auto r = regex("blah");
    if(r.valid()) ...

>>> I also noticed, that "auto match(R, RegEx)(R input, RegEx re);" appears twice in the documentation, same for "bmatch". I guess they should not appear together with the string versions.
>>
>> I gather that happens because there is another overload specifically for C-T regexes. Its docs state just that, but lacking the template constraint the signatures are the same, so it indeed can cause some confusion. Maybe it would be better to just combine docs together, and leave one overload undocumented.
>
> As RegEx is a template argument here, it can stand for both Regex and StaticRegex, and that should be mentioned. Whether it has two different implementations is an implementation detail that does not need to bother the user.

OK, will do.

> If you want to keep the second entries, I'd recommend renaming the argument to StaticRegEx.

-- 
Dmitry Olshansky
Oct 23 2011
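The distinction drawn above, between a Regex that was never compiled and a Regex compiled from the empty pattern, can be sketched like this. It assumes the `empty` property and `matchFirst` of the present-day std.regex; whether FReD behaved identically at the time of the review is not something this sketch can confirm.

```d
import std.regex : Regex, matchFirst, regex;

void main()
{
    Regex!char re;              // default-initialized, nothing compiled yet
    assert(re.empty);           // reports "no pattern compiled"

    re = regex("");             // "" is a valid pattern
    assert(!re.empty);          // compiled, although the pattern text is empty

    auto m = matchFirst("abc", re);
    assert(!m.empty);           // it matches right at the start of the input...
    assert(m.hit.length == 0);  // ...and consumes no characters
}
```

Read this way, `empty` answers "has anything been compiled?" rather than "is the pattern the empty string?", which is the ambiguity raised in the thread.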
On 23.10.2011 17:46, Dmitry Olshansky wrote:
> On 23.10.2011 11:28, Rainer Schuetze wrote:
>> On 22.10.2011 21:05, Dmitry Olshansky wrote:
>>> On 22.10.2011 20:56, Rainer Schuetze wrote:
>>>> [...]
>>>> I think, both versions use implementation specifics, maybe there should be a documented way to test for being initialized.
>>>
>>> Definitely. How about adding an empty property + opCast to bool, with that you'd get:
>>>
>>>     if(!re)
>>>     {
>>>         //create re
>>>     }
>>>
>>> and a bit more verbose:
>>>
>>>     if(re.empty)
>>>     {
>>>         //create re
>>>     }
>>
>> I think, this might be confused with normal usage, like "is this regex the empty string?" (Is "" a valid regex?). Maybe a more explicit "valid()" predicate would be fine.
>
> "" is a valid regex that matches anywhere, with global flag it will match before any codepoint + once at end.
> I'm not sure using 'valid' is good, it may mislead user to check it all over the place e.g.:
>
>     auto r = regex("blah");
>     if(r.valid()) ....

You may be right. Maybe 'initialized'; otherwise 'empty' isn't too bad either. But I think it should be explicit, so I would not add opCast to bool.
Oct 24 2011
On 22.10.2011 at 21:05, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> Definitely. How about adding an empty property + opCast to bool, with that you'd get:
>
>     if(!re)
>     {
>         //create re
>     }

It is nice that you *can* do this,

> and a bit more verbose:
>
>     if(re.empty)
>     {
>         //create re
>     }

but I prefer a descriptive name here. Otherwise I'd believe 're' is a pointer or a boolean, and it is harder to look up in the documentation.
Oct 24 2011