digitalmars.D.learn - regex.d(6050): not enough preallocated memory
- Paul (27/27) Jun 05 2012 I am trying to see if all regex matches in one file are present
- Dmitry Olshansky (15/39) Jun 05 2012 To get next match engine is run again, then again for the next match and...
I am trying to see if all regex matches in one file are present in another file. The code works, but partway through the nested foreach loops I get the error listed in the subject line. I would have expected this error to come up when the regex expressions were executed, not while I'm iterating through the resulting matches. Is there a better way to do this, or can I just allocate more memory? Thanks.

    // Execute Regex expressions
    auto uniCapturesOld = match(uniFileOld, regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));
    auto uniCapturesNew = match(uniFileNew, regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));

    // Iterate through match collections to see if both files contain the same matches.
    foreach (matchOld; uniCapturesOld)
    {
        cntOld++;
        found = false;
        foreach (matchNew; uniCapturesNew)
        {
            cntNew++;
            // Following line is for troubleshooting.
            writeln(cntOld," ",cntNew," ",matchOld.hit," ",matchNew.hit);
            if (matchOld.hit == matchNew.hit) { found = true; break; }
        }
        if (!found) writeln(cntNF++," ",matchOld.hit," not found");
    }
Jun 05 2012
On 06.06.2012 0:25, Paul wrote:
> I am trying to see if all regex matches in one file are present in
> another file. The code works; but, part way through the nested
> foreach(s) I get the error listed in the subject line. I would think
> this error would come up when the Regex expressions were executed not
> when I'm iterating through the resultant matches.

To get the next match, the engine is run again, then again for the match after that, and so on. It's lazy evaluation at its finest (who knows, maybe you'll break out of the loop halfway through). Evidently it either loses some RAM between calls or it simply bugs out when it reaches some specific text.

> Is there a better way to do this or can I just allocate more memory?
> Thanks.

Looks like you found a bug, meaning that I probably miscalculated the required amount of RAM or I lose some free list nodes between calls. File a bug report, and keep in mind that I need the data to reproduce it.

Until I figure it out, I recommend falling back on the bmatch function, which is slower and in general unbounded in memory use, but should work.

Another idea: try modifying one of the regexes insignificantly, so that they don't reuse data structures internally (just in case it has to do with that).

> // Execute Regex expressions
> auto uniCapturesOld = match(uniFileOld, regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));
> auto uniCapturesNew = match(uniFileNew, regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));
>
> // Iterate through match collections to see if both files contain the same matches.
> foreach (matchOld; uniCapturesOld)
> {
>     cntOld++;
>     found = false;
>     foreach (matchNew; uniCapturesNew)
>     {
>         cntNew++;
>         // Following line is for troubleshooting.
>         writeln(cntOld," ",cntNew," ",matchOld.hit," ",matchNew.hit);
>         if (matchOld.hit == matchNew.hit) { found = true; break; }
>     }
>     if (!found) writeln(cntNF++," ",matchOld.hit," not found");
> }

-- 
Dmitry Olshansky
Jun 05 2012
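
Because each step through a match range runs the engine again, the nested foreach in the question rescans the new file once for every match in the old file. A minimal sketch of one alternative follows: scan each file once, keep the hits from the new file in an associative array, and look each old hit up in it. The helper name missingHits and the sample strings are hypothetical and used only for illustration. This does not address the underlying bug and has not been checked against the data that triggered the error; it only reduces how many times the engine runs.

```d
import std.regex;
import std.stdio;

// Hypothetical helper: returns every hit found in oldText that has no
// identical hit in newText. Each text is scanned exactly once.
string[] missingHits(string oldText, string newText, Regex!char re)
{
    // Collect all hits from the new text into a set (associative array).
    bool[string] seen;
    foreach (m; match(newText, re))
        seen[m.hit] = true;

    // Walk the old text once and record the hits that are absent from the set.
    string[] missing;
    foreach (m; match(oldText, re))
    {
        if (m.hit !in seen)
            missing ~= m.hit;
    }
    return missing;
}

void main()
{
    // Same pattern and flags as in the original post.
    auto re = regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)", "gm");

    // Made-up sample data standing in for the two files.
    string oldText = "NAME = a:b\nNAME = c\nNAME = d:e\n";
    string newText = "NAME = c\nNAME = a:b\n";

    foreach (hit; missingHits(oldText, newText, re))
        writeln(hit, " not found");
}
```

If the error still appears with this structure, the fallback suggested in the reply can be tried by replacing both match calls with bmatch, which takes the same arguments.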