digitalmars.D - RegExp.find() now crippled
- Steve Teale (12/12) Nov 14 2010 Some time ago in phobos2, the following:
- KennyTM~ (4/16) Nov 15 2010 Isn't std.regexp replaced by std.regex? Why are both of them still in
- Steve Teale (5/33) Nov 15 2010 I guess std.regexp is still there because not all of us necessarily want...
- Jesse Phillips (6/10) Nov 15 2010 That has nothing to do with expressiveness, familiarity/ease of use sure...
- Andrei Alexandrescu (38/83) Nov 15 2010 I am sorry for the inadvertent change, it wasn't meant to change
- Lutger Blijdestijn (13/67) Nov 15 2010 I'm pretty sure that can be filed as a bug. The behavior is still docume...
- Steve Teale (8/58) Nov 16 2010 Andrei,
- Steven Schveighoffer (11/30) Nov 16 2010 The standard library should not have something to please everyone. If
- Steve Teale (5/45) Nov 16 2010 Yes Steve, of course I can, but other much more popular languages like f...
- Steven Schveighoffer (21/75) Nov 16 2010 I don't object to having multiple styles to do things, I even maintain a...
- Steve Teale (6/10) Nov 16 2010 Steve,
- Steven Schveighoffer (18/30) Nov 16 2010 I don't know much about php's lib because I haven't used it enough to kn...
- Andrei Alexandrescu (4/49) Nov 16 2010 It's probably common courtesy that should be preserved. I just committed...
- sybrandy (22/59) Nov 16 2010 This actually sounds interesting. If I'm understanding things right,
- Jonathan M Davis (26/28) Nov 16 2010 Agreed. Ideally, the standard library would be very uniform in approach....
- spir (25/58) Nov 16 2010 That
- Andrei Alexandrescu (24/96) Nov 16 2010 I think that's not a good design. Ranges are a cross-cutting
- bearophile (6/7) Nov 16 2010 We are in the initial phase of Phobos development, so frequent large chan...
- Steve Teale (3/8) Nov 17 2010 Thanks Andrei. When the next version is released I'll remove the tempora...
Some time ago in phobos2, the following:

    RegExp wsr = RegExp("(\\s+)");
    int p = wsr.find("<thingie att1=\"whatever\">");
    writefln("%s|%s|%s %d",wsr.pre(), wsr.match(1), wsr.post(), p);

would print:

    <thingie| |att1="whatever"> 7

Now it prints

    <thingie| |att1="whatever"> 1

The new return value is pretty useless, equivalent to returning a bool. It seems to me that the 'find' verb's subject should be the string, not the RegExp object.

This looks like a case of the implementation being changed to match the documentation, when in fact it would have been better to change the documentation to match the implementation.

Either that, or RegExp should have an indexOf method that behaves like string.indexOf.

Steve
Nov 14 2010
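For readers on a current compiler, here is a minimal sketch of how the same "where does the first match start?" question can be answered with today's std.regex. It assumes the modern `matchFirst` API, not the 2010 std.regexp discussed in this thread; `firstMatchIndex` and the input string are hypothetical names chosen for illustration.

```d
import std.regex : matchFirst, regex;
import std.stdio : writefln;

/// Hypothetical helper: offset of the first match of `pattern` in `input`,
/// or -1 when there is no match.
ptrdiff_t firstMatchIndex(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    // m.pre is the slice of input before the match, so its length is the
    // offset (in code units) at which the match begins.
    return m.empty ? -1 : cast(ptrdiff_t) m.pre.length;
}

void main()
{
    auto input = "width 100"; // hypothetical input, not the string from the post
    auto m = matchFirst(input, regex(`(\s+)`));
    if (!m.empty)
    {
        // Same pre|match|post layout as the writefln call in the post.
        writefln("%s|%s|%s %d", m.pre, m[1], m.post,
                 firstMatchIndex(input, `(\s+)`));
    }
}
```

The index is derived from the match object here, which is one way to get indexOf-like behaviour without a dedicated method.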
Isn't std.regexp replaced by std.regex? Why are both of them still in Phobos 2?

(Oh, and std.regex is missing a documented .index (= .src_start) property.)
Nov 15 2010
KennyTM~ Wrote:
> Isn't std.regexp replaced by std.regex? Why are both of them still in Phobos 2?

I guess std.regexp is still there because not all of us necessarily want to iterate a range to simply find out the position of the first whitespace in a string.

Part of the expressiveness of languages is that one should be free to use the style that suits, and not have to read the documentation every time one uses it. Give me options in Phobos by all means.

D2 is not going to succeed by forcing its users to use unfamiliar, and maybe not yet very fashionable constructions.

I'm pissed off because this change broke a lot of my code, which I had not used for some time, but now have a paying customer for. The code did not break because of D language evolution. It broke because somebody decided they did not like the style of std.regexp.

All I wanted was plain old regular expressions, similar to JavaScript, or PHP, or other popular languages, and std.regexp did that pretty well at one time.

Steve
Nov 15 2010
Steve Teale Wrote:
> I guess std.regexp is still there because not all of us necessarily want to iterate a range to simply find out the position of the first whitespace in a string.

I'm pretty sure it is still there for the same reason many are: trying to figure out when it should be removed.

> Part of the expressiveness of languages is that one should be free to use the style that suits, and not have to read the documentation every time one uses it. Give me options in Phobos by all means.

That has nothing to do with expressiveness; familiarity/ease of use, sure.

> D2 is not going to succeed by forcing its users to use unfamiliar, and maybe not yet very fashionable constructions.

Not providing does not mean forcing to use.

> I'm pissed off because this change broke a lot of my code, which I had not used for some time, but now have a paying customer for. The code did not break because of D language evolution. It broke because somebody decided they did not like the style of std.regexp. All I wanted was plain old regular expressions, similar to JavaScript, or PHP, or other popular languages, and std.regexp did that pretty well at one time.

I agree, there is no reason a module that is scheduled for deletion should have changes made that would cause existing code to break. But looking at the history, there don't seem to be any such changes for at least the last year. The only questionable change (one that wasn't just type changes to auto/spacing) happened 3 months ago, but I don't think the behavior was intended to change:

http://www.dsource.org/projects/phobos/changeset/1923/trunk/phobos/std/regexp.d
Nov 15 2010
I am sorry for the inadvertent change, it wasn't meant to change semantics of existing code. I'm not sure whether one of my unrelated 64-bit changes messed things up. You may want to file a bug report.

There are a number of good reasons for which I was compelled to split std.regex from std.regexp. I'm sure you or others would have found them just as compelling if you saw things the same way.

Phobos 1 has experimented in std.string and std.regexp with juxtaposing APIs of various languages (PHP, Ruby, Python). The reasoning was that people familiar with either of those languages could feel right at home by using APIs with similar nomenclatures and semantics. The result was some strange bedfellows in std.string such as "column" or "capwords" and an outright mess in std.regexp.

The interface of std.regexp is without a doubt the worst I've ever seen, by a long shot. I have never been able to use it without poring through the documentation _several times_ and without confirming to myself via a small test case that I'm doing the right thing. The simplest problem is this: std.regexp uses the words "exec", "find", "match", "search", and "test" - all to mean regular expression matching. There is absolutely no logic to how meanings are ascribed to words, and there is absolutely no recourse than rote memorization of various arbitrary decisions. The resulting FrankenAPI is likely familiar to anyone except those who've actually spent time learning it, in spite of it trying to be familiar to anyone.

So I spawned std.regex in an attempt to sanitize the API (I made minor, if any, changes to the engine; I am in fact having significant trouble maintaining it). The advantages of std.regex are:

* No more class definition. Nobody is supposed to inherit RegExp anyway so it's useless to brand the object as a class.
* Engine is separated from matches, which means that engines can be memoized for efficiency. Currently regex() only memoizes the last engine.
* The new engine works with any character size.
* Simpler API: create a regex, call match() against that regex and a string, look at the resulting RegexMatch object.

If this all annoys you more than the old API, I will need to disagree. If you have suggestions on how std.regex can be improved, I'm all ears.

Andrei
Nov 15 2010
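The "create a regex, match it against a string, look at the result" workflow described above can be sketched as follows. This is an illustration written against the current std.regex (`matchFirst` and `matchAll`), which is an assumption: the 2010 module exposed `match()` and may have differed in details. The input strings are made up.

```d
import std.regex : matchAll, matchFirst, regex;
import std.stdio : writeln;

void main()
{
    // Compile the pattern once. The compiled object holds no match state,
    // so the same engine can be reused against any number of inputs.
    auto ws = regex(`\s+`);

    foreach (line; ["a b", "no-space", "x  y z"])
    {
        auto m = matchFirst(line, ws);
        if (m.empty)
            writeln(line, ": no match");
        else
            writeln(line, ": first whitespace at offset ", m.pre.length);
    }

    // matchAll returns a range of matches, for callers who want every hit
    // rather than just the first one.
    foreach (m; matchAll("x  y z", ws))
        writeln("hit of length ", m.hit.length, " at offset ", m.pre.length);
}
```

Keeping the compiled pattern separate from the per-input match objects is what makes the reuse in the first loop possible.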
I'm pretty sure that can be filed as a bug. The behavior is still documented as returning index of match, and the standalone std.regexp.find works that way.

Patch:

    -1045,7 +1045,7
         {
             int i = test(string);
             if (i)
    -            i = pmatch[0].rm_so != 0;
    +            i = pmatch[0].rm_so;
             else
                 i = -1;             // no match
             return i;
Nov 15 2010
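To see why the one-line change matters, here is a small stand-alone sketch of the two return expressions. It does not use the real std.regexp internals: `matched` is a hypothetical stand-in for the result of test(), and `start` for pmatch[0].rm_so.

```d
import std.stdio : writeln;

// Before the patch: `start != 0` is a comparison, so the offset collapses
// to 0 or 1 regardless of where the match actually begins.
int findBeforePatch(bool matched, int start)
{
    return matched ? cast(int)(start != 0) : -1;
}

// After the patch: the offset itself is returned; -1 still means "no match".
int findAfterPatch(bool matched, int start)
{
    return matched ? start : -1;
}

void main()
{
    writeln(findBeforePatch(true, 7), " ", findAfterPatch(true, 7));   // 1 7
    writeln(findBeforePatch(true, 0), " ", findAfterPatch(true, 0));   // 0 0
    writeln(findBeforePatch(false, 0), " ", findAfterPatch(false, 0)); // -1 -1
}
```

Under these assumptions the pre-patch version can only ever report -1, 0 or 1, which matches the "equivalent to returning a bool" complaint in the original post.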
Andrei,

Maybe it is time that the structure of the standard library became more generalized. At the moment we have std... and core...

Perhaps we need another branch in the hierarchy, like ranges... Then there could be a std.range module that was the gateway into ranges...

The library could then expand in an orderly fashion, with a wider range of users becoming responsible for the maintenance of different branches against changes in the language, not against changes in fashion.

Then you could have ranges.regex, that suits you, and the people who were happy with the status quo could continue to use std.regexp, which should continue to behave like it did in DMD2.029 or whatever it was when I wrote my 'legacy' code.

The current system, where modules of the library can get arbitrarily deprecated and at some point removed because they are unfashionable, is very unfriendly.

I recognize that you are young, hyper-intelligent, and motivated toward fame. But there are other users, like me, who are older, but not senile, and have more conservative attitudes, including the desire to use code they wrote in the past at some point in the future.

Steve
Nov 16 2010
The standard library should not have something to please everyone. If there are 5 different styles to do the same thing, it will be a failure.

Can you just copy std.regex from 2.029 and compile it in your project? I.e. instead of phobos adding a range branch for the new range style, you add a branch Teale for your style and copy what you like in there. Then you have what you want (may take a little effort on your part, but then you control the results).

Also, 2.029 is still available via download, you can still use it.

-Steve
Nov 16 2010
Steven Schveighoffer Wrote:
> Can you just copy std.regex from 2.029 and compile it in your project?

Yes Steve, of course I can, but other much more popular languages like for instance PHP seem to do OK with the suit-everyone style.

I am just upset that code I put a lot of effort into gets broken because somebody else does not like the style of the library.

Which should be preserved - style, or substance?

Steve
Nov 16 2010
On Tue, 16 Nov 2010 13:46:48 -0500, Steve Teale <steve.teale britseyeview.com> wrote:
> Yes Steve, of course I can, but other much more popular languages like for instance PHP seem to do OK with the suit-everyone style.

I don't object to having multiple styles to do things, I even maintain a library (dcollections) that is not even close to the style of std.container. I just object to everything being included in the standard library. The standard library should do things one way, and if you want something different, use an add-on library.

I'm guessing you are referring to php's pcre vs posix regex? I think posix is marked as deprecated...

> I am just upset that code I put a lot of effort into gets broken because somebody else does not like the style of the library.

Well, the library isn't finished. As much as I understand your pain, I also don't think phobos should be write-only. We should not be stuck with mistakes or designs of the past until we have stated the library is released, and then we can deal with backwards compatibility in a reasonable way. Until then, you shouldn't expect everything to be set in stone. Sorry if this is confusing or annoying.

That being said, if std.regexp is broken, I don't think it was intentional. In fact, in the bug report, someone mentions a one-line fix, does that solve your problem? AFAIK, regexp is not even deprecated yet, which means it should be supported. I think Andrei said as much.

> Which should be preserved - style, or substance?

Substance. AFAIK, substance is preserved, or am I misunderstanding you?

-Steve
Nov 16 2010
Steven Schveighoffer Wrote:
> I'm guessing you are referring to php's pcre vs posix regex? I think posix is marked as deprecated...

Steve,

No. I just meant that the library that comes with PHP seems happy to provide different ways of doing the same thing, as in for example, CURL, DOMDocument, and standard file operation wrappers.

I've been following D on and off for about seven or eight years now, so I don't subscribe too much to the 'when it's finished' argument. By now, it needs to work for real projects.

But I've had my gripe - I'll shut up for now.

Steve
Nov 16 2010
On Tue, 16 Nov 2010 14:24:32 -0500, Steve Teale <steve.teale britseyeview.com> wrote:
> Steve, No. I just meant that the library that comes with PHP seems happy to provide different ways of doing the same thing, as in for example, CURL, DOMDocument, and standard file operation wrappers.

I don't know much about php's lib because I haven't used it enough to know the tendencies of library acceptance. But two different APIs for doing regex strike me as way more overlapping than CURL and file operation wrappers.

> I've been following D on and off for about seven or eight years now, so I don't subscribe too much to the 'when it's finished' argument. By now, it needs to work for real projects.

D2 is much younger than that. D1 is complete (to use the term loosely), if you want to use that, its API will not change. There are quite a few projects using D1 for real work (I wrote one a few years ago).

D2 is changing monthly, to the point where newer versions of phobos require newer versions of the compiler due to compiler bugs fixed or features added. I can't see how it can be considered finished.

Don has recently brought up on the mailing list that we should identify the status of each module in the ddoc so people can understand the plans for that module before basing their work on it. I can see how spending lots of time working with something only to have it disappear can be hugely frustrating.

-Steve
Nov 16 2010
On 11/16/10 10:46 AM, Steve Teale wrote:
> Which should be preserved - style, or substance?

It's probably common courtesy that should be preserved. I just committed the fix prompted by Lutger (thanks).

Andrei
Nov 16 2010
On 11/16/2010 01:30 PM, Steven Schveighoffer wrote:
> On Tue, 16 Nov 2010 13:16:13 -0500, Steve Teale <steve.teale britseyeview.com> wrote:
>> Perhaps we need another branch in the hierarchy, like ranges... Then there could be a std.range module that was the gateway into ranges...

This actually sounds interesting. If I'm understanding things right, std.range.* would provide a range interface to specific libraries, such as regex. So, in theory, there could be different interfaces to the same functionality. E.g. std.range.regex, std.oo.regex, and std.proc.regex for a range interface, an OO interface, or a procedural interface. Underneath, you could have the same core functionality, but people can access it in the way they feel most comfortable or that better fits the design of the program being written. As new paradigms are invented, they can be added as well and be based on the existing interfaces.

Is this something we want to do? Don't know. I don't even know how feasible it is. However, I do like the concept and if the goal is to make the language as friendly as possible, perhaps it should be looked into. There's the chance that it will cause some confusion, but how much will actually occur?

The biggest issue I see is having certain libraries that don't fit well into all of the different paradigms. E.g. a date library can have a nice OO interface and a nice procedural interface, but it doesn't make much sense to have a range interface.

Anyway, food for thought.

Casey
Nov 16 2010
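As a rough illustration of the "one core, several interfaces" idea floated above, the sketch below puts a procedural, an object-style and a range-style facade over the same matching core. Everything here is hypothetical: the names `coreIndexOf`, `indexOfPattern`, `Pattern` and `matchOffsets` are invented for the example, and the core is built on today's std.regex, not on any module layout that was actually proposed.

```d
import std.algorithm.iteration : map;
import std.regex : matchAll, matchFirst, regex;
import std.stdio : writeln;

// Shared core (hypothetical name): offset of the first match, or -1 if none.
ptrdiff_t coreIndexOf(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? -1 : cast(ptrdiff_t) m.pre.length;
}

// Procedural facade: a free function over the core.
ptrdiff_t indexOfPattern(string input, string pattern)
{
    return coreIndexOf(input, pattern);
}

// Object-style facade: the pattern lives in an object, the core does the work.
final class Pattern
{
    private string source;

    this(string source) { this.source = source; }

    ptrdiff_t find(string input)
    {
        return coreIndexOf(input, source);
    }
}

// Range-style facade: lazily yields the offset of every match.
auto matchOffsets(string input, string pattern)
{
    return matchAll(input, regex(pattern)).map!(m => m.pre.length);
}

void main()
{
    writeln(indexOfPattern("a b  c", `\s+`));
    writeln(new Pattern(`\s+`).find("a b  c"));
    writeln(matchOffsets("a b  c", `\s+`));
}
```

The facades add no logic of their own, which is the appeal of the idea; the cost raised elsewhere in the thread is that users then have to choose between three spellings of the same operation.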
On Tuesday, November 16, 2010 10:30:03 Steven Schveighoffer wrote:
> The standard library should not have something to please everyone. If there is 5 different styles to do the same thing, it will be a failure.

Agreed. Ideally, the standard library would be very uniform in approach. That makes it easier to learn and use. If it's schizophrenic about its approach - especially if it has multiple ways of doing everything - then it's going to be much harder to learn and use. Everyone would be asking why you'd choose one way over another and what the differences between them are. It would just cause confusion.

Ranges are a key element of how Phobos does things in D2. The truth of the matter is that if you want to effectively use Phobos, you're going to have to use ranges. If ranges aren't appropriate for a particular module or problem, then they shouldn't be used, but Phobos is generally being built around using them, and the more of Phobos which functions in essentially the same way, the easier it will be to understand, learn, and use.

The old code is indeed available for modules which are going to be deprecated/removed, and the license is usually Boost, so you're pretty free to do what you want with it if you prefer it. And there's nothing wrong with creating your own libraries if you'd prefer. Plenty of folks have done that in the past.

The standard library needs to be fairly uniform in approach, however, and some of the current modules are older and don't follow that approach or have licensing or design issues which were not addressed in the past. Once all of those modules have been updated, replaced, or removed, Phobos will be more uniform and its parts will interact better. And over time, it's unlikely that modules will continue to be deprecated like that. It's happening now because D2 Phobos is still fairly early in its evolution.

- Jonathan M Davis
Nov 16 2010
On Tue, 16 Nov 2010 11:24:02 -0800 Jonathan M Davis <jmdavisProg gmx.com> wrote:
> On Tuesday, November 16, 2010 10:30:03 Steven Schveighoffer wrote:
>> The standard library should not have something to please everyone. If there are 5 different styles to do the same thing, it will be a failure.
>
> Agreed. Ideally, the standard library would be very uniform in approach. That makes it easier to learn and use. If it's schizophrenic about its approach - especially if it has multiple ways of doing everything - then it's going to be much harder to learn and use. Everyone would be asking why you'd choose one way over another and what the differences between them are. It would just cause confusion.
>
> Ranges are a key element of how Phobos does things in D2. The truth of the matter is that if you want to effectively use Phobos, you're going to have to use ranges. If ranges aren't appropriate for a particular module or problem, then they shouldn't be used, but Phobos is generally being built around using them, and the more of Phobos which functions in essentially the same way, the easier it will be to understand, learn, and use.
>
> The old code is indeed available for modules which are going to be deprecated/removed, and the license is usually Boost, so you're pretty free to do what you want with it if you prefer it. And there's nothing wrong with creating your own libraries if you'd prefer. Plenty of folks have done that in the past.
>
> The standard library needs to be fairly uniform in approach, however, and some of the current modules are older and don't follow that approach or have licensing or design issues which were not addressed in the past. Once all of those modules have been updated, replaced, or removed, Phobos will be more uniform and its parts will interact better. And over time, it's unlikely that modules will continue to be deprecated like that. It's happening now because D2 Phobos is still fairly early in its evolution.
>
> - Jonathan M Davis

+++

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Nov 16 2010
On 11/16/10 10:16 AM, Steve Teale wrote:
> Andrei Alexandrescu Wrote:
>> I am sorry for the inadvertent change, it wasn't meant to change semantics of existing code. I'm not sure whether one of my unrelated 64-bit changes messed things up. You may want to file a bug report.
>>
>> There are a number of good reasons for which I was compelled to split std.regex from std.regexp. I'm sure you or others would have found them just as compelling if you saw things the same way.
>>
>> Phobos 1 has experimented in std.string and std.regexp with juxtaposing APIs of various languages (PHP, Ruby, Python). The reasoning was that people familiar with either of those languages could feel right at home by using APIs with similar nomenclatures and semantics.
>>
>> The result was some strange bedfellows in std.string such as "column" or "capwords" and an outright mess in std.regexp. The interface of std.regexp is without a doubt the worst I've ever seen, by a long shot. I have never been able to use it without poring through the documentation _several times_ and without confirming to myself via a small test case that I'm doing the right thing.
>>
>> The simplest problem is this: std.regexp uses the words "exec", "find", "match", "search", and "test" - all to mean regular expression matching. There is absolutely no logic to how meanings are ascribed to words, and there is absolutely no recourse than rote memorization of various arbitrary decisions. The resulting FrankenAPI is likely unfamiliar to anyone except those who've actually spent time learning it, in spite of it trying to be familiar to everyone.
>>
>> So I spawned std.regex in an attempt to sanitize the API (I made minor, if any, changes to the engine; I am in fact having significant trouble maintaining it). The advantages of std.regex are:
>>
>> * No more class definition. Nobody is supposed to inherit RegExp anyway so it's useless to brand the object as a class.
>> * Engine is separated from matches, which means that engines can be memoized for efficiency. Currently regex() only memoizes the last engine.
>> * The new engine works with any character size.
>> * Simpler API: create a regex, call match() against that regex and a string, look at the resulting RegexMatch object.
>>
>> If this all annoys you more than the old API, I will need to disagree. If you have suggestions on how std.regex can be improved, I'm all ears.
>>
>> Andrei
>
> Andrei,
>
> Maybe it is time that the structure of the standard library became more generalized. At the moment we have std... and core... Perhaps we need another branch in the hierarchy, like ranges... Then there could be a std.range module that was the gateway into ranges...
>
> The library could then expand in an orderly fashion, with a wider range of users becoming responsible for the maintenance of different branches against changes in the language, not against changes in fashion.
>
> Then you could have ranges.regex, that suits you, and the people who were happy with the status quo, could continue to use std.regexp, which should continue to behave like it did in DMD2.029 or whatever it was when I wrote my 'legacy' code.

I think that's not a good design. Ranges are a cross-cutting abstraction. One wouldn't put all code using exceptions under std.exceptions or code using floating point under std.floating_point. Better, ranges, exceptions, or floating point should be used wherever it makes sense to use them.

> The current system, where modules of the library can get arbitrarily deprecated and at some point removed because they are unfashionable, is very unfriendly.

I agree we need to have a rather long deprecation schedule. Fashionable has, however, little to do with the rationale for deprecation. You may want to tune in to the Phobos developers' mailing list for more details.

> I recognize that you are young, hyper-intelligent, and motivated toward fame.

I have enumerated a list of technical reasons for which std.regexp is inadequate, followed by a list of improvements brought about by std.regex. Ranges are nowhere on that list, nor is being fashionable. It's all good old design stuff that I'm sure you have down better than me: make an API small and simple, separate concerns (engine/matches), use the right tool for the job (struct not class), generalize within reason (character width).

Would have been great to have a discussion along those lines. Instead, I see you chose to ignore all technical arguments and go with a presupposition, no matter how assuming and stereotypical.

> But there are other users, like me, who are older, but not senile, and have more conservative attitudes, including the desire to use code they wrote in the past at some point in the future.

Backward compatibility is indeed important, and again we need to have a long deprecation schedule. At the same time, I think there are many more users in D's future than in its past, and I cannot inflict std.regexp on them.

Andrei
Nov 16 2010
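For readers following the API discussion, the "create a regex, call match(), look at the resulting RegexMatch object" workflow described above might look roughly like the sketch below. It assumes the std.regex interface as found in current Phobos (`regex`, `match`, and the `pre`, `post`, `captures` and `empty` members of the result), which may differ in detail from the 2010 release discussed in this thread. `indexOfMatch` is a hypothetical helper, not part of Phobos; it only illustrates how the match offset that the old `RegExp.find()` returned could be recovered from the length of the text before the match.

```d
import std.regex : match, regex;
import std.stdio : writefln;

// Hypothetical helper (not a Phobos function): offset of the first
// match of `pattern` in `input`, or -1 if there is no match.
ptrdiff_t indexOfMatch(string input, string pattern)
{
    auto m = match(input, regex(pattern));
    // The text before the match ends exactly where the match starts.
    return m.empty ? -1 : cast(ptrdiff_t) m.pre.length;
}

void main()
{
    auto input = `<thingie att1="whatever">`;

    // Step 1: build the engine. Step 2: match it against a string.
    auto m = match(input, regex(`(\s+)`));

    // Step 3: inspect the result object.
    if (!m.empty)
        writefln("%s|%s|%s %d", m.pre, m.captures[1], m.post,
                 indexOfMatch(input, `(\s+)`));
}
```

Keeping the engine (`regex(...)`) separate from the match result is what allows one compiled pattern to be reused across many input strings.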
Steve Teale:
> The current system, where modules of the library can get arbitrarily deprecated and at some point removed because they are unfashionable, is very unfriendly.

We are in the initial phase of Phobos development, so frequent large changes are expected. Surely one year from now Phobos will be more careful in its changes. The other things you have said are too silly to comment on.

And thank you to Andrei for the improvements to the D regex API, I'd love to see other good people give other similar improvements to Phobos :-) We must be grateful to Andrei for improving that API.

Bye,
bearophile
Nov 16 2010
Andrei Alexandrescu Wrote:
> It's probably common courtesy that should be preserved. I just committed the fix prompted by Lutger (thanks).
>
> Andrei

Thanks Andrei. When the next version is released I'll remove the temporary findRex() function from my current code.

Steve ;=)
Nov 17 2010