digitalmars.D - undocumented opMatch induced semantic changes to while-loops
- Oskar Linde (70/70) Feb 16 2006 Hello,
- Oskar Linde (7/12) Feb 16 2006 I apologize calling this undocumented. I overlooked the documentation on
- Derek Parnell (8/21) Feb 16 2006 You make a whole lot of sense, Oskar. I agree that 'while' is expected t...
- Walter Bright (7/11) Feb 16 2006 It is a very good point. Perhaps it should be instead:
- Walter Bright (3/7) Feb 16 2006 That's right, I goofed that.
Hello,

Among the language changes introduced by the new ~~ operator is an undocumented(?) semantic change to while-loops. This is a quite surprising behavior:

    import std.stdio;
    import std.string;

    char[] getLine()
    {
        static int x = 0;
        if (x > 32)
            return "";
        return format("%s:%s:%s:%s:",x++,x++,x++,x++);
    }

    int main()
    {
        while("([0-9]*):" ~~ getLine())
        {
            writefln("Match: ",_match.match(1));
        }
        return 0;
    }

Prints:

    Match: 3
    Match: 2
    Match: 1
    Match: 0

(To get this to compile on linux, I had to manually include ~/dmd/src/phobos/internal/match.d for the _d_match() function. I guess this is just missing from the precompiled phobos library on linux)

I would assume getLine() to be evaluated for each iteration of the while loop, but this is apparently not the case. This changes behavior -- that people for more than 30 years have been expecting and relying on -- of the while loop in C-like languages...!

What is it that happens here? Looking at the dmd front end sources, statement.c line 513:

    Statement *WhileStatement::semantic(Scope *sc)
    {
        if (condition->op == TOKmatch)
        {
            /* Rewrite while (condition) body as:
             *   if (condition)
             *     do
             *       body
             *     while ((_match = _match.opNext), _match);
             */
            ... [snip code that does the rewrite and injects a _match identifier]
        }
        ...
    }

So we now have an opNext() that gets called on the match for each following iteration of the while loop...

While if - do - while loops generally are very good for branch prediction on modern cpus, it feels very odd to silently change the semantics of the while-loop like this.

This whole opMatch thing feels like a hack. In this context, I would define a hack to be something that adds many new special cases rather than making general ones. Examples:

1) Injecting a _match variable in the scope of if() and while() loops of the result of the condition, but only if the condition is an opMatch() expression.
2) Changing the semantics of the while loop, but only if the condition is an opMatch() expression.

We now have two custom ways of iterating over collections:

    the classic opApply() - used by foreach
    the new opMatch() - opNext() automatically used by while-loops

Suffice to say that this feels like a mistake. To iterate over the matches of a string, one should use foreach(), not make semantic changes to the while loop. IMHO of course.

Disclaimer: This post was written before breakfast :)

Regards,
Oskar
Feb 16 2006
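The if / do / while rewrite quoted from statement.c can be sketched in self-contained D without the ~~ operator. This is only an illustration under stated assumptions: FakeMatch, condition(), plainWhile() and rewrittenWhile() are hypothetical names, and FakeMatch.next() merely plays the role that opNext has in the rewrite. The sketch counts how often the condition expression is evaluated under ordinary while semantics versus the rewritten form.

```d
import std.stdio;

// Hypothetical stand-in for a match result; not the real _match type.
struct FakeMatch
{
    string[] rest;                                         // captures still to be visited
    bool valid() { return rest.length > 0; }
    FakeMatch next() { return FakeMatch(rest[1 .. $]); }   // plays the role of opNext
}

int calls;  // how many times the condition expression has been evaluated

FakeMatch condition()
{
    ++calls;
    return FakeMatch(["0", "1", "2", "3"]);
}

// Ordinary while: the condition expression runs before every iteration.
// The limit is needed because a fresh, valid match is produced each time.
int plainWhile(int limit)
{
    calls = 0;
    int iterations = 0;
    while (iterations < limit && condition().valid)
        ++iterations;
    return calls;
}

// Shape of the rewrite: evaluate the condition once, then advance the stored match.
int rewrittenWhile()
{
    calls = 0;
    auto m = condition();
    if (m.valid)
    {
        do
        {
            // the loop body would go here
        }
        while ((m = m.next()).valid);
    }
    return calls;
}

void main()
{
    writeln("plain while evaluated the condition ", plainWhile(4), " times");
    writeln("rewritten loop evaluated it ", rewrittenWhile(), " time(s)");
}
```

In the plain loop the call count grows with the number of iterations; in the rewritten loop it stays at one, which is the behavior the post objects to.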
Oskar Linde wrote:
> Hello,
>
> Among the language changes introduced by the new ~~ operator is an undocumented(?) semantic change to while-loops. This is a quite surprising behavior:

I apologize for calling this undocumented. I overlooked the documentation on while statements.

My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.

/Oskar
Feb 16 2006
On Thu, 16 Feb 2006 22:12:42 +1100, Oskar Linde <olREM OVEnada.kth.se> wrote:
> Oskar Linde wrote:
>> Hello,
>>
>> Among the language changes introduced by the new ~~ operator is an undocumented(?) semantic change to while-loops. This is a quite surprising behavior:
>
> I apologize for calling this undocumented. I overlooked the documentation on while statements.
>
> My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.

You make a whole lot of sense, Oskar. I agree that 'while' is expected to re-evaluate the expression on each iteration and 'foreach' is expected to evaluate it just the once. I'm surprised that Walter has changed this.

--
Derek Parnell
Melbourne, Australia
Feb 16 2006
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1j71$196$1 digitaldaemon.com...My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.It is a very good point. Perhaps it should be instead: foreach (MatchExpression) { } ??
Feb 16 2006
In article <dt2aig$qn3$4 digitaldaemon.com>, Walter Bright says...
> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1j71$196$1 digitaldaemon.com...
>> My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.
>
> It is a very good point. Perhaps it should be instead:
>
>     foreach (MatchExpression)
>     {
>     }
>
> ??

I think it should - just seems a better match overall, especially considering they are both D built-in's as compared to C and C++.

As an aside, most Perl users at least will be pretty comfortable with D's built-in 'foreach' (the syntax is different of course, but the idea is firmly implanted).

- Dave
Feb 16 2006
In article <dt2cp1$t7g$1 digitaldaemon.com>, Dave says...
> In article <dt2aig$qn3$4 digitaldaemon.com>, Walter Bright says...
>> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1j71$196$1 digitaldaemon.com...
>>> My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.
>>
>> It is a very good point. Perhaps it should be instead:
>>
>>     foreach (MatchExpression)
>>     {
>>     }
>>
>> ??
>
> I think it should - just seems a better match overall, especially considering they are both D built-in's as compared to C and C++.
>
> As an aside, most Perl users at least will be pretty comfortable with D's built-in 'foreach' (the syntax is different of course, but the idea is firmly implanted).
>
> - Dave

What if:

    if(MatchExpression)      // ME not compiled
    foreach(MatchExpression) // ME compiled

Because I can see where things like this will be used a lot:

    char[][] recs = split(cast(char[])read("path/to/file"),"\n");
    foreach(char[] rec; recs)
    {
        if("<regex>" ~~ rec)
        {
            ..
        }
    }

Would this make sense?

Thanks,

- Dave
Feb 16 2006
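The per-record test described above can be sketched with present-day Phobos. The assumption here is loud: this uses std.regex (regex, matchFirst) and std.string.splitLines rather than the ~~ operator under discussion, and firstCaptures is a hypothetical helper name. The pattern is compiled once outside the loop, and each record gets a plain yes/no match with no iteration state carried between records.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;
import std.string : splitLines;

// Hypothetical helper: returns capture group 1 of every record that matches.
// Assumes `pattern` contains at least one capture group.
string[] firstCaptures(string text, string pattern)
{
    auto re = regex(pattern);            // compiled once, outside the loop
    string[] result;
    foreach (rec; text.splitLines())
    {
        auto m = matchFirst(rec, re);    // independent test for each record
        if (!m.empty)
            result ~= m[1];
    }
    return result;
}

void main()
{
    writeln(firstCaptures("10:a\nno match here\n20:b", `([0-9]+):`));
}
```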
In article <dt2aig$qn3$4 digitaldaemon.com>, Walter Bright says...
> "Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1j71$196$1 digitaldaemon.com...
>> My other points still remain. foreach would be much more natural to overload with this behavior than while. Foreach is expected to evaluate its argument once, while is expected to evaluate it each iteration.
>
> It is a very good point. Perhaps it should be instead:
>
>     foreach (MatchExpression)
>     {
>     }

Please? It just makes more sense that way.

Of course this opens the door for an implicit iterator syntax, ala the '$' suggestion (semantic scope operator?) earlier.

    foreach(<Expression>){
        // compiler inserts: foreach(auto _match; <something>){
        // where _match is aliased to '$', and matches are harvested via opApply()
        writefln("match: %s",$);
    }

When you look at the problem from a more generalized standpoint, <Expression> doesn't necessarily have to be a <RegularExpression> at all. Likewise the implicit '_match' token could be more general too like '_loop', '_iter' or '_value'.

So if we want to use the shorthand we can, but if we're using nested loops or desire an explicit variable, we can do it the old-fashioned way:

    foreach(auto myMatch; <Expression>)

- Eric Anderton at yahoo
Feb 16 2006
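The explicit-variable form, with matches harvested via opApply(), can be sketched as follows. This is a minimal illustration, not the proposed compiler behavior: Matches, findMatches() and collect() are hypothetical names, and the match list is hard-coded. It shows the property the thread keeps returning to, namely that foreach evaluates its aggregate expression once and then drives the loop body through opApply.

```d
import std.stdio;

// Hypothetical container of match results; name and layout are illustrative only.
struct Matches
{
    string[] items;

    // foreach hands the loop body to opApply as a delegate.
    int opApply(scope int delegate(ref string) dg)
    {
        foreach (ref item; items)
        {
            if (auto result = dg(item))
                return result;          // non-zero: the body left the loop early
        }
        return 0;
    }
}

int evaluations;  // how often the aggregate expression has been evaluated

Matches findMatches()
{
    ++evaluations;
    return Matches(["0", "1", "2", "3"]);
}

string[] collect()
{
    evaluations = 0;
    string[] seen;
    foreach (myMatch; findMatches())    // explicit loop variable, as in the post
        seen ~= myMatch;
    return seen;
}

void main()
{
    writeln(collect());
    writeln("aggregate evaluated ", evaluations, " time(s)");
}
```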
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message news:dt1g6c$309n$1 digitaldaemon.com...(To get this to compile on linux, I had to manually include ~/dmd/src/phobos/internal/match.d for the _d_match() function. I guess this is just missing from the precompiled phobos library on linux)That's right, I goofed that.
Feb 16 2006