digitalmars.D.bugs - [Issue 7260] New: "g" on default in std.regex.match
- d-bugmail puremagic.com (37/53) Jan 09 2012 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (16/16) Feb 24 2012 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (14/22) Feb 24 2012 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (9/9) Apr 19 2012 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (11/11) Apr 19 2012 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (15/15) Jan 24 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (20/27) Jan 25 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (36/43) Mar 10 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (9/21) Mar 10 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (17/38) Mar 10 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (9/20) Mar 10 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (18/18) Aug 17 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
- d-bugmail puremagic.com (14/14) Sep 22 2013 http://d.puremagic.com/issues/show_bug.cgi?id=7260
http://d.puremagic.com/issues/show_bug.cgi?id=7260

           Summary: "g" on default in std.regex.match
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc

D2 code:

    import std.stdio: write, writeln;
    import std.regex: regex, match;
    void main() {
        string text = "abc312de";
        foreach (c; text.match("1|2|3|4"))
            write(c, " ");
        writeln();
        foreach (c; text.match(regex("1|2|3|4", "g")))
            write(c, " ");
        writeln();
    }

It outputs (DMD 2.058 Head):

    ["3"]
    ["3"] ["1"] ["2"]

In my code I have seen that the "g" option (which means "repeat over the whole input") is usually what I want. So what do you think about making "g" the default?

Note: I have not marked this issue as "enhancement" because of this comment by Dmitry Olshansky (found by drey_ on IRC #D):

http://dfeed.kimsufi.thecybershadow.net/discussion/thread/jc9hrl$2lpp$1 digitalmars.com#post-jc9mag:2430tq:241:40digitalmars.com

> Yet I have to issue yet another warning about the new std.regex compared with the old one:
>
>     import std.stdio;
>     import std.regex;
>     void main() {
>         string src = "4.5.1";
>         foreach (c; match(src, regex(r"(\d+)")))
>             writeln(c.hit);
>     }
>
> Previously this would find all matches; now it finds only the first one. To get all of the matches, use the "g" option. Seems like 100% compatibility was next to impossible.

--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 09 2012
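The difference reported above can be reproduced with a small helper. This is only a minimal sketch, not part of the original report: the helper name `hits` is hypothetical, and it assumes a Phobos in which `std.regex.match` still honors the "g" flag the way this thread describes.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper (not a Phobos function): collect the text of every
// match that std.regex.match yields for the given pattern and flags.
string[] hits(string text, string pattern, string flags)
{
    return match(text, regex(pattern, flags)).map!(m => m.hit).array;
}

void main()
{
    // Without "g" the range stops after the first match.
    writeln(hits("abc312de", "1|2|3|4", ""));
    // With "g" the range repeats over the whole input.
    writeln(hits("abc312de", "1|2|3|4", "g"));
}
```

Passing the flags as a parameter keeps the only difference between the two calls visible at the call site, which is the behavior the report is about.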
http://d.puremagic.com/issues/show_bug.cgi?id=7260

Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com

12:21:44 PST ---
I dunno how to "fix" this bug. "g" by default implies there is a way to override it. regex("blah","") ?

Leaving it as is now breaks old codebases that rely on "g" (though there should be more legacy std.regexp code out there).

Turning "g" on by default affects old code only inside foreach and generic constructs that show all matches or iterate over them; it's rare but non-zero.

Another way would be to ditch the current API, which is not ideal btw ;)
Feb 24 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7260

> I dunno how to "fix" this bug. "g" by default implies there is a way to override it. regex("blah","") ?
>
> Leaving it as is now breaks old codebases that rely on "g" (though there should be more legacy std.regexp code out there).
>
> Turning "g" on by default affects old code only inside foreach and generic constructs that show all matches or iterate over them; it's rare but non-zero.
>
> Another way would be to ditch the current API, which is not ideal btw ;)

Fully ditching the currently used API is probably too much.

A possible idea:

    regex("blah")      <<== repeat over the whole input.
    regex("blah","")   <<== repeat over the whole input.
    regex("blah","g")  <<== repeat over the whole input.
    regex("blah","d")  <<== doesn't repeat over the whole input.

So far you have done good work on the regular expression implementation, so I trust your work. Thank you.
Feb 24 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7260

SomeDude <lovelydear mailmetrash.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lovelydear mailmetrash.com
           Severity|normal                      |enhancement
Apr 19 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7260

bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal

This is not an enhancement request (I consider it more like a little Phobos regression).
Apr 19 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7260

bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|"g" on default in           |"g" on default in std.regex
                   |std.regex.match             |

If changing std.regex.regex is not possible, then an alternative solution is to introduce a new little function, "std.regex.re", that repeats by default, like this:

    re(someString)      ===  regex(someString, "g")
    re(someString, "d") ===  regex(someString, "dg")
Jan 24 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

12:22:46 PST ---
> If changing std.regex.regex is not possible, then an alternative solution is to introduce a new little function, "std.regex.re", that repeats by default, like this:
>
>     re(someString)      ===  regex(someString, "g")
>     re(someString, "d") ===  regex(someString, "dg")

Frankly this is stupid (sorry).

Obviously the wrong turn is that people (rightfully so) associate "find all" vs "find first" with the operation, that is "match"/"replace", not with the "regex" as in the pattern itself.

Personally I think that we had better go with explicit overrides on "match"/"replace"/etc. and very slowly deprecate the "g" switch. How the override will look is then up for debate.

    match(someString, pattern).all    // range of all matches
    match(someString, pattern).first  // only the first one
    match(someString, pattern)        // using the "g" flag to decide

Or pass the override as an optional parameter to match:

    match(someString, pattern, Regex.all);
    match(someString, pattern, Regex.first);
    match(someString, pattern);       // use the flag

I'll probably open a poll to pick the better one.
Jan 25 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

10:43:30 PDT ---
> If changing std.regex.regex is not possible, then an alternative solution is to introduce a new little function, "std.regex.re", that repeats by default, like this:
>
>     re(someString)      ===  regex(someString, "g")
>     re(someString, "d") ===  regex(someString, "dg")

Here is a plan based on one of my previous ideas that I think is clean enough, given the circumstances and the fact that this Perl-ism is fairly popular in certain circles. (Namely attaching the mode of operation to the pattern itself, as in /`pattern`/`mode-suffix`.)

What we do is first specify that "g" serves only as the intended default "mode" of this pattern. Then introduce a simple and elegant way to explicitly specify what mode of matching to use: first, all, or the default for this pattern.

Then your code looks like this (I'm still pondering better names/ways for overriding the default):

    void main() {
        string text = "abc312de";
        foreach (c; text.match("1|2|3|4").first)
            write(c, " ");
        writeln();
        foreach (c; text.match(regex("1|2|3|4")).all) // could use string pattern as above
            write(c, " ");
        writeln();
    }

Then I'd try to do the same with replace. No overrides used would imply "use whatever the default mode is".

How does it sound?

Then we place a nice bold warning that use of the "g" option is discouraged, is provided only for compatibility, and is going to be deprecated in the future. A year later, and depending on the mood of people, it finally gets deprecated and slowly shifted towards oblivion.

I'll probably cross-post this to the NG to collect opinions, since this is the largest pain point of the otherwise fine interface.
Mar 10 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

>     match(someString, pattern).all    // range of all matches
>     match(someString, pattern).first  // only the first one
>     match(someString, pattern)        // using the "g" flag to decide

> No overrides used would imply "use whatever the default mode is".
>
> How does it sound?
>
> Then we place a nice bold warning that use of the "g" option is discouraged, is provided only for compatibility, and is going to be deprecated in the future. A year later, and depending on the mood of people, it finally gets deprecated and slowly shifted towards oblivion.

Once "g" is deprecated, what does match(someString, pattern) (without all and first) do?
Mar 10 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

11:54:55 PDT ---
>     match(someString, pattern).all    // range of all matches
>     match(someString, pattern).first  // only the first one
>     match(someString, pattern)        // using the "g" flag to decide

> Once "g" is deprecated, what does match(someString, pattern) (without all and first) do?

Could go both ways. The other possibility I just thought about is:

    match(...).first - the same as the current match(...).front, i.e. simplify the interface for the case when 1 match is needed
    match(...).all   - the same as the current match(... with "g" overridden), i.e. a range

Then once "g" is off we could either make .all a nop. The alternative is to make it an opaque object that has only 2 methods, .first/.all.

The third alternative is to add alias this to make .first implicit. I feel it won't work reliably with range-based templates, as it would make it "2 ranges in one".

So only the first 2 are viable. I'd go with the 1st, which gets upgraded to the second once people forget about the "g" switch entirely.
Mar 10 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

12:38:45 PDT ---
> Then once "g" is off we could either make .all a nop. The alternative is to make it an opaque object that has only 2 methods, .first/.all.
>
> The third alternative is to add alias this to make .first implicit. I feel it won't work reliably with range-based templates, as it would make it "2 ranges in one".
>
> So only the first 2 are viable. I'd go with the 1st, which gets upgraded to the second once people forget about the "g" switch entirely.

Typo: I meant make it an opaque object, then sometime later make .all implicit. It would still have the potential to break code, so it seems that just making .all implicit is better.
Mar 10 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7260

05:33:49 PDT ---
The problem should now be addressed by this pull:
https://github.com/D-Programming-Language/phobos/pull/1470

There are now matchAll/matchFirst calls, which are the preferred way to go about matching. Currently they simply override the global flag if present.

Returning to the original example:

    foreach (c; text.matchAll("1|2|3|4")) // this spins over captures of each match
        write(c, " ");
    writeln();
    foreach (c; text.matchFirst("1|2|3|4")) // this spins over submatches of the 1st match
        write(c, " ");
    writeln();

To me there is little else to do aside from slooowly deprecating the old flag-based match/replace interface.
Aug 17 2013
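As a rough illustration of the matchAll/matchFirst calls mentioned above, here is a minimal sketch. The helper names `allHits` and `firstHit` are hypothetical and exist only for this example; the sketch assumes a Phobos release that already contains the functions from the pull request.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : matchAll, matchFirst;
import std.stdio : writeln;

// Hypothetical helper: text of every match, regardless of any "g" flag.
string[] allHits(string text, string pattern)
{
    return matchAll(text, pattern).map!(m => m.hit).array;
}

// Hypothetical helper: text of the first match, or null if there is none.
string firstHit(string text, string pattern)
{
    auto caps = matchFirst(text, pattern);
    return caps.empty ? null : caps.hit;
}

void main()
{
    writeln(allHits("abc312de", "1|2|3|4"));  // every match in the input
    writeln(firstHit("abc312de", "1|2|3|4")); // only the first match
}
```

Note that matchFirst returns the captures of a single match rather than a range of matches, so the "first or all" choice is made by the function name and not by a flag on the pattern.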
http://d.puremagic.com/issues/show_bug.cgi?id=7260

Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

01:12:35 PDT ---
Flags are to be gone one day, and "g" by default is not going to happen. This IMHO makes it a won't fix.

Anyhow, the core issue should now be addressed by using the new API, which is clearer.
Sep 22 2013
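The thread mentions replace alongside match. As a hedged sketch of what the flag-free style can look like on the replace side, assuming the installed Phobos provides `replaceFirst` and `replaceAll` in std.regex (the helper names below are hypothetical and not taken from the thread):

```d
import std.regex : regex, replaceAll, replaceFirst;
import std.stdio : writeln;

// Hypothetical helper: replace only the first digit with '#'.
string maskFirstDigit(string text)
{
    return replaceFirst(text, regex(`[0-9]`), "#");
}

// Hypothetical helper: replace every digit with '#'.
string maskAllDigits(string text)
{
    return replaceAll(text, regex(`[0-9]`), "#");
}

void main()
{
    writeln(maskFirstDigit("abc312de"));
    writeln(maskAllDigits("abc312de"));
}
```

As with matching, the choice between "first" and "all" sits in the function name, so the same compiled pattern can be reused for both.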