digitalmars.D.bugs - [Bug 93] New: Template regex example fails without -release switch
- d-bugmail puremagic.com (478/478) Apr 08 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
- d-bugmail puremagic.com (8/8) Apr 11 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
- d-bugmail puremagic.com (12/12) Apr 11 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
- Don Clugston (14/32) Apr 11 2006 That category list really should be changed, it is completely
- Dave (19/60) Apr 11 2006 I appreciate your concerns and believe it or not put some thought into
- Don Clugston (13/78) Apr 11 2006 It's just a bit of proof-of-concept code showing what's possible with D
- d-bugmail puremagic.com (7/7) Apr 12 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
- d-bugmail puremagic.com (52/56) Apr 12 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
- d-bugmail puremagic.com (7/7) Apr 28 2006 http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

           Summary: Template regex example fails without -release switch
           Product: D
           Version: 0.152
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: blocker
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: godaves yahoo.com

Without the -release switch, the template example for the 2006 SDWest Presentation fails on both Linux and Windows.

http://www.digitalmars.com/d/templates-revisited.html

The linker error on Windows is:

 Error 42: Symbol Undefined _array_5regex
--- errorlevel 1

The linker error on Linux is:

test_regex.o(.gnu.linkonce.t_D5regex49__T10regexMatchVG12aa12_5b612d7a5d2a5c732a5c772aZ10regexMatchFAaZAAa+0x3a): In function `_D5regex49__T10regexMatchVG12aa12_5b612d7a5d2a5c732a5c772aZ10regexMatchFAaZAAa':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi+0x16): In function `_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi+0x33): In function `_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex78__T14testZeroOrMoreS55_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZiZ14testZeroOrMoreFAaZi+0x3d): In function `_D5regex78__T14testZeroOrMoreS55_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZiZ14testZeroOrMoreFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi+0x15): In function `_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi+0x29): more undefined references to `_array_5regex' follow
collect2: ld returned 1 exit status
--- errorlevel 1

Source Code
-----------

test_regex.d:
-------------
import std.stdio;
import temp_regex;

void main()
{
    auto exp = &regexMatch!(r"[a-z]*\s*\w*");
    writefln("matches: %s", exp("hello world"));
}

;--- temp_regex.d ------------
module temp_regex;

const int testFail = -1;

/**
 * Compile pattern[] and expand to a custom generated
 * function that will take a string str[] and apply the
 * regular expression to it, returning an array of matches.
 */
template regexMatch(char[] pattern)
{
    char[][] regexMatch(char[] str)
    {
        char[][] results;
        int n = regexCompile!(pattern).fn(str);
        if (n != testFail && n > 0)
            results ~= str[0..n];
        return results;
    }
}

/******************************
 * The testXxxx() functions are custom generated by templates
 * to match each predicate of the regular expression.
 *
 * Params:
 *     char[] str    the input string to match against
 *
 * Returns:
 *     testFail      failed to have a match
 *     n >= 0        matched n characters
 */

/// Always match
template testEmpty()
{
    int testEmpty(char[] str) { return 0; }
}

/// Match if testFirst(str) and testSecond(str) match
template testUnion(alias testFirst, alias testSecond)
{
    int testUnion(char[] str)
    {
        int n1 = testFirst(str);
        if (n1 != testFail)
        {
            int n2 = testSecond(str[n1 .. $]);
            if (n2 != testFail)
                return n1 + n2;
        }
        return testFail;
    }
}

/// Match if first part of str[] matches text[]
template testText(char[] text)
{
    int testText(char[] str)
    {
        if (str.length &&
            text.length <= str.length &&
            str[0..text.length] == text
           )
            return text.length;
        return testFail;
    }
}

/// Match if testPredicate(str) matches 0 or more times
template testZeroOrMore(alias testPredicate)
{
    int testZeroOrMore(char[] str)
    {
        if (str.length == 0)
            return 0;
        int n = testPredicate(str);
        if (n != testFail)
        {
            int n2 = testZeroOrMore!(testPredicate)(str[n .. $]);
            if (n2 != testFail)
                return n + n2;
            return n;
        }
        return 0;
    }
}

/// Match if term1[0] <= str[0] <= term2[0]
template testRange(char[] term1, char[] term2)
{
    int testRange(char[] str)
    {
        if (str.length && str[0] >= term1[0]
                       && str[0] <= term2[0])
            return 1;
        return testFail;
    }
}

/// Match if ch[0]==str[0]
template testChar(char[] ch)
{
    int testChar(char[] str)
    {
        if (str.length && str[0] == ch[0])
            return 1;
        return testFail;
    }
}

/// Match if str[0] is a word character
template testWordChar()
{
    int testWordChar(char[] str)
    {
        if (str.length &&
            (
             (str[0] >= 'a' && str[0] <= 'z') ||
             (str[0] >= 'A' && str[0] <= 'Z') ||
             (str[0] >= '0' && str[0] <= '9') ||
             str[0] == '_'
            )
           )
        {
            return 1;
        }
        return testFail;
    }
}

/*****************************************************/

/**
 * Returns the front of pattern[] up until
 * the end or a special character.
 */
template parseTextToken(char[] pattern)
{
    static if (pattern.length > 0)
    {
        static if (isSpecial!(pattern))
            const char[] parseTextToken = "";
        else
            const char[] parseTextToken =
                pattern[0..1] ~ parseTextToken!(pattern[1..$]);
    }
    else
        const char[] parseTextToken="";
}

/**
 * Parses pattern[] up to and including terminator.
 * Returns:
 *     token[]     everything up to terminator.
 *     consumed    number of characters in pattern[] parsed
 */
template parseUntil(char[] pattern,char terminator,bool fuzzy=false)
{
    static if (pattern.length > 0)
    {
        static if (pattern[0] == '\\')
        {
            static if (pattern.length > 1)
            {
                const char[] nextSlice = pattern[2 .. $];
                alias parseUntil!(nextSlice,terminator,fuzzy) next;
                const char[] token = pattern[0 .. 2] ~ next.token;
                const uint consumed = next.consumed+2;
            }
            else
            {
                pragma(msg,"Error: expected character to follow \\");
                static assert(false);
            }
        }
        else static if (pattern[0] == terminator)
        {
            const char[] token="";
            const uint consumed = 1;
        }
        else
        {
            const char[] nextSlice = pattern[1 .. $];
            alias parseUntil!(nextSlice,terminator,fuzzy) next;
            const char[] token = pattern[0..1] ~ next.token;
            const uint consumed = next.consumed+1;
        }
    }
    else static if (fuzzy)
    {
        const char[] token = "";
        const uint consumed = 0;
    }
    else
    {
        pragma(msg,"Error: expected " ~ terminator ~ " to terminate group expression");
        static assert(false);
    }
}

/**
 * Parse contents of character class.
 * Params:
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function
 *     consumed = number of characters in pattern[] parsed
 */
template regexCompileCharClass2(char[] pattern)
{
    static if (pattern.length > 0)
    {
        static if (pattern.length > 1)
        {
            static if (pattern[1] == '-')
            {
                static if (pattern.length > 2)
                {
                    alias testRange!(pattern[0..1], pattern[2..3]) termFn;
                    const uint thisConsumed = 3;
                    const char[] remaining = pattern[3 .. $];
                }
                else // length is 2
                {
                    pragma(msg, "Error: expected char following '-' in char class");
                    static assert(false);
                }
            }
            else // not '-'
            {
                alias testChar!(pattern[0..1]) termFn;
                const uint thisConsumed = 1;
                const char[] remaining = pattern[1 .. $];
            }
        }
        else
        {
            alias testChar!(pattern[0..1]) termFn;
            const uint thisConsumed = 1;
            const char[] remaining = pattern[1 .. $];
        }
        alias regexCompileCharClassRecurse!(termFn,remaining) recurse;
        alias recurse.fn fn;
        const uint consumed = recurse.consumed + thisConsumed;
    }
    else
    {
        alias testEmpty!() fn;
        const uint consumed = 0;
    }
}

/**
 * Used to recursively parse character class.
 * Params:
 *     termFn = generated function up to this point
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function including termFn and
 *          parsed character class
 *     consumed = number of characters in pattern[] parsed
 */
template regexCompileCharClassRecurse(alias termFn,char[] pattern)
{
    static if (pattern.length > 0 && pattern[0] != ']')
    {
        alias regexCompileCharClass2!(pattern) next;
        alias testOr!(termFn,next.fn,pattern) fn;
        const uint consumed = next.consumed;
    }
    else
    {
        alias termFn fn;
        const uint consumed = 0;
    }
}

/**
 * At start of character class. Compile it.
 * Params:
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function
 *     consumed = number of characters in pattern[] parsed
 */
template regexCompileCharClass(char[] pattern)
{
    static if (pattern.length > 0)
    {
        static if (pattern[0] == ']')
        {
            alias testEmpty!() fn;
            const uint consumed = 0;
        }
        else
        {
            alias regexCompileCharClass2!(pattern) charClass;
            alias charClass.fn fn;
            const uint consumed = charClass.consumed;
        }
    }
    else
    {
        pragma(msg,"Error: expected closing ']' for character class");
        static assert(false);
    }
}

/**
 * Look for and parse '*' postfix.
 * Params:
 *     test = function compiling regex up to this point
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function
 *     consumed = number of characters in pattern[] parsed
 */
template regexCompilePredicate(alias test, char[] pattern)
{
    static if (pattern.length > 0 && pattern[0] == '*')
    {
        alias testZeroOrMore!(test) fn;
        const uint consumed = 1;
    }
    else
    {
        alias test fn;
        const uint consumed = 0;
    }
}

/**
 * Parse escape sequence.
 * Params:
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function
 *     consumed = number of characters in pattern[] parsed
 */
template regexCompileEscape(char[] pattern)
{
    static if (pattern.length > 0)
    {
        static if (pattern[0] == 's')
        {
            // whitespace char
            alias testRange!("\x00","\x20") fn;
        }
        else static if (pattern[0] == 'w')
        {
            //word char
            alias testWordChar!() fn;
        }
        else
        {
            alias testChar!(pattern[0 .. 1]) fn;
        }
        const uint consumed = 1;
    }
    else
    {
        pragma(msg,"Error: expected char following '\\'");
        static assert(false);
    }
}

/**
 * Parse and compile regex represented by pattern[].
 * Params:
 *     pattern[] = rest of pattern to compile
 * Output:
 *     fn = generated function
 */
template regexCompile(char[] pattern)
{
    static if (pattern.length > 0)
    {
        static if (pattern[0] == '[')
        {
            const char[] charClassToken = parseUntil!(pattern[1 .. $],']').token;
            alias regexCompileCharClass!(charClassToken) charClass;
            const char[] token = pattern[0 .. charClass.consumed+2];
            const char[] next = pattern[charClass.consumed+2 .. $];
            alias charClass.fn test;
        }
        else static if (pattern[0] == '\\')
        {
            alias regexCompileEscape!(pattern[1..pattern.length]) escapeSequence;
            const char[] token = pattern[0 .. escapeSequence.consumed+1];
            const char[] next = pattern[escapeSequence.consumed+1 .. $];
            alias escapeSequence.fn test;
        }
        else
        {
            const char[] token = parseTextToken!(pattern);
            static assert(token.length > 0);
            const char[] next = pattern[token.length .. $];
            alias testText!(token) test;
        }
        alias regexCompilePredicate!(test, next) term;
        const char[] remaining = next[term.consumed .. next.length];
        alias regexCompileRecurse!(term,remaining).fn fn;
    }
    else
        alias testEmpty!() fn;
}

template regexCompileRecurse(alias term,char[] pattern)
{
    static if (pattern.length > 0)
    {
        alias regexCompile!(pattern) next;
        alias testUnion!(term.fn, next.fn) fn;
    }
    else
        alias term.fn fn;
}

/// Utility function for parsing
template isSpecial(char[] pattern)
{
    static if (
        pattern[0] == '*' ||
        pattern[0] == '+' ||
        pattern[0] == '?' ||
        pattern[0] == '.' ||
        pattern[0] == '[' ||
        pattern[0] == '{' ||
        pattern[0] == '(' ||
        pattern[0] == ')' ||
        pattern[0] == '$' ||
        pattern[0] == '^' ||
        pattern[0] == '\\'
    )
        const isSpecial = true;
    else
        const isSpecial = false;
}
Apr 08 2006
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

clugdbug yahoo.com.au changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|blocker                     |major

This isn't a blocker.
Apr 11 2006
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

godaves yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |blocker
            Version|0.152                       |0.153

"Blocker: Blocks development and/or testing work."

It's a blocker if you run into that bug and want to use Contract Programming during the course of development and testing. After all, that's a major part of the language. Let Walter make the call.
Apr 11 2006
d-bugmail puremagic.com wrote:
> http://d.puremagic.com/bugzilla/show_bug.cgi?id=93
>
> godaves yahoo.com changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>            Severity|major                       |blocker
>             Version|0.152                       |0.153
>
> "Blocker: Blocks development and/or testing work."
>
> It's a blocker if you run into that bug and want to use Contract Programming during the course of development and testing. After all, that's a major part of the language. Let Walter make the call.

That category list really should be changed; it is completely inappropriate for a compiler. Almost every bug affects development and testing work in that sense! (And segfaults of the compiler are not as bad as incorrect code generation.)

The fact that a particular example does not compile with -release is not a blocker. I can assure you that contract programming works in general.

Blockers are very rare; one example occurred in an early DMD release where almost any program would fail to compile. I doubt that any blockers will be discovered that aren't regressions. (An example of a blocker would be: "dmd can no longer be used with build".)

To have any chance of this being fixed, you need to have a go at cutting down the error. Walter generally ignores bug reports which are longer than 20 lines. I suspect he'll completely ignore the severity.
Apr 11 2006
Don Clugston wrote:
> That category list really should be changed, it is completely inappropriate for a compiler. Almost every bug affects development and testing work in that sense! (And segfaults of the compiler are not as bad as incorrect code generation).
>
> The fact that a particular example does not compile with -release is not a blocker. I can assure you that contract programming works in general.
>
> Blockers are very rare, one example occurred in an early DMD release where almost any program would fail to compile. I doubt that any blockers will be discovered that aren't regressions. (An example of a blocker would be: "dmd can no longer be used with build").
>
> To have any chance of this being fixed, you need to have a go at cutting down the error. Walter generally ignores bug reports which are longer than 20 lines. I suspect he'll completely ignore the severity.

I appreciate your concerns and, believe it or not, put some thought into the original report severity, etc. If Walter wants to ignore it, that is his prerogative. If Walter wants to 'downgrade' it, that is fine w/ me. Believe me, I'm not doing this stuff to make Walter's job harder.

I did not try to reduce the error any more than it is because the summary of the example says: "What follows is a cut-down version of Eric Anderton's regex compiler. It is just enough to compile the regular expression above, serving to illustrate how it is done."

In fact I went to the extra 'trouble' of copying and pasting the code to put it all in one spot, and tested it both on Windows and Linux. I agree it is probably a recent regression - all the more reason IMHO to get it taken care of right away, because Walter knows what he's changed recently in that area.

I also agree that perhaps some better bug report descriptions could be developed, but I hesitate to say that because I don't have the time right now to come up with suggestions and/or make the changes myself.

- Dave
Apr 11 2006
Dave wrote:
> I appreciate your concerns and believe it or not put some thought into the original report severity, etc. If Walter wants to ignore it that is his prerogative. If Walter wants to 'downgrade' it that is fine w/ me. Believe me, I'm not doing this stuff to make Walter's job harder.
>
> I did not try to reduce the error any more than it is because the summary of the example says: "What follows is a cut-down version of Eric Anderton's regex compiler. It is just enough to compile the regular expression above, serving to illustrate how it is done."

It's just a bit of proof-of-concept code showing what's possible with D templates. No-one should be using the code for any other purpose. Minimal for a regexp does not mean minimal for a bug report. The whole regexp thing is completely irrelevant to this bug.

> In fact I went to the extra 'trouble' of copying and pasting the code to put it all in one spot, and tested it both on Windows and Linux. I agree it probably a recent regression - all the more reason IMHO to get it taken care of right away because Walter knows what he's changed recently in that area.

Actually, the template part of the compiler has changed a lot since Eric wrote that code. I'm a little surprised that it compiles at all. (My compile-time regex, which greatly improves upon that one and was written against a much more recent compiler, is currently broken due to improvements in the template syntax.)

> I also agree that perhaps some better bug report descriptions could be developed, but I hesitate to say that because I don't have the time right now to come up with suggestions and/or make the changes myself.

When bugzilla was set up, Walter proposed some definitions which made a lot of sense. I don't understand why the default inappropriate ones were retained. A compiler is so different to a normal app.
Apr 11 2006
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

I've tried to reproduce this on Windows with DMD 0.153. It always compiles for me.
(There's no contract programming in this code.)
Apr 12 2006
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

godaves yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|blocker                     |trivial
           Priority|P2                          |P5

> I've tried to reproduce this on Windows with DMD 0.153. It always compiles for me.
> (there's no contract programming in this code).

The linker error happens because of array bounds checking code that is omitted with -release. I recreated it, but it is arguably my mistake (read on).

I copied the code into two files, test_regex.d and temp_regex.d. Then I recompiled:

C:\Zz\temp>dmd test_regex.d
C:\dmd\bin\..\..\dm\bin\link.exe test_regex,,,user32+kernel32/noi;
OPTLINK (R) for Win32 Release 7.50B1
Copyright (C) Digital Mars 1989 - 2001 All Rights Reserved

test_regex.obj(test_regex)
 Error 42: Symbol Undefined _array_10temp_regex
--- errorlevel 1

Then I recompiled again with -release and ran it:

C:\Zz\temp>dmd test_regex.d -release
C:\dmd\bin\..\..\dm\bin\link.exe test_regex,,,user32+kernel32/noi;

C:\Zz\temp>test_regex
matches: [hello]

That recreates the problem, and I should have specified the exact steps better.

But, if I recompile w/o -release like so:

C:\Zz\temp>dmd test_regex.d temp_regex.d
C:\dmd\bin\..\..\dm\bin\link.exe test_regex+temp_regex,,,user32+kernel32/noi;

C:\Zz\temp>test_regex
matches: [hello]

Then it works.

The reason I didn't compile in temp_regex.d (or link in the .obj compiled separately) is that the code in temp_regex.d is all either const or template code. Being used to C/C++ #include <header>, I just compiled the main() module.

So under normal circumstances (e.g. the regex code is linked into a lib and that lib is linked with the app) this 'bug' would probably not have happened, so along with the other things you pointed out, I lowered the Severity for it to 'trivial' and priority to 'informational'.

This is a potentially frustrating inconsistency between the compiler switches because, as the templates are always instantiated in the declarative scope, the compiler-generated stuff is (correctly) generated for the same scope. I say potentially frustrating because sometimes compiler-generated stuff is "out of sight, out of mind", at least for me.

Walter probably spotted this right away from the linker error and just ignored it, or sat back and chuckled as the e-mails went back and forth <g>

(The reference to contract programming is because the -release switch omits pre and post contracts, along with asserts, invariants, etc. So, what I was referring to is that if you ran into this bug, then in order to get it to compile the -release switch would remove your CP code, hence "blocker".)

Thanks,

- Dave
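For readers unfamiliar with what -release strips, here is a minimal sketch. It is not taken from the bug report: the function `half` is hypothetical, and the code is written in present-day D syntax (`do` where 2006-era D used `body`), so treat it as an illustration under those assumptions rather than a reproduction of the linker error.

```d
import std.stdio : writeln;

/// Hypothetical example function: halve an even number.
/// The in/out contracts below are compiled in only when the
/// build does not pass -release.
int half(int n)
in { assert(n % 2 == 0, "n must be even"); }
out (result) { assert(result * 2 == n); }
do
{
    return n / 2;
}

void main()
{
    // version (assert) is set while assert checks are being emitted,
    // i.e. in a build without -release.
    version (assert)
        writeln("checked build: asserts and contracts are active");
    else
        writeln("-release build: asserts and contracts were stripped");

    writeln(half(10)); // satisfies both contracts
}
```

Building this once with and once without -release should print a different first line each time. This is the trade-off described above: working around the linker error with -release also removes the contract checks.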
Apr 12 2006
http://d.puremagic.com/bugzilla/show_bug.cgi?id=93

godaves yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID
Apr 28 2006