
digitalmars.D.learn - regex matching but not capturing

reply Paul <phshaffer gmail.com> writes:
My regex is matching but doesn't seem to be capturing.  You may 
recognize this from the AOC challenges.

file contains...
**Valve AA has flow rate=0; tunnels lead to valves DD, II, BB**
**Valve BB has flow rate=13; tunnels lead to valves CC, AA**
**Valve CC has flow rate=2; tunnels lead to valves DD, BB**
**... etc**

```d
auto s = readText(filename);
auto ctr = ctRegex!(`Valve ([A-Z]{2}).*=(\d+).+valves(,* [A-Z]{2})+`);
foreach(c;matchAll(s, ctr)) {
	fo.writeln(c);
}
```

produces...
**["Valve AA has flow rate=0; tunnels lead to valves DD, II, BB", 
"AA", "0", ", BB"]**
**["Valve BB has flow rate=13; tunnels lead to valves CC, AA", 
"BB", "13", ", AA"]**
**["Valve CC has flow rate=2; tunnels lead to valves DD, BB", 
"CC", "2", ", BB"]**

what I'm attempting to achieve and expect is, for instance, on 
the 1st line...
[....lead to valves DD, II, BB", "AA", "0", **", DD", ", II", ", 
BB"]**
Apr 06 2023
parent reply Alex Bryan <abryancs gmail.com> writes:
My understanding from browsing the documentation is that matchAll returns a range of Captures (struct documented at https://dlang.org/phobos/std_regex.html#Captures). In your foreach loop I think c[0] will contain the current full match (the current line that matches), c[1] will contain the first captured group ("AA" for the first line), c[2] will contain "0" for the first line, etc. Passing c to writeln prints it as a range, so you see the full match followed by each group, as in the output you posted.
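
A minimal sketch of that indexing, assuming the sample text is passed in as a string rather than read from a file. The helper name `groupsPerLine` is made up for illustration; the pattern is the one from the original post. It also shows why the output ends in ", BB": a group that repeats under `+` keeps only the text of its last repetition.

```d
import std.regex : ctRegex, matchAll;
import std.stdio : writeln;

// Hypothetical helper: collects c[0]..c[3] for every matching line.
string[][] groupsPerLine(string text)
{
    // Pattern copied from the original post.
    auto ctr = ctRegex!(`Valve ([A-Z]{2}).*=(\d+).+valves(,* [A-Z]{2})+`);

    string[][] result;
    foreach (c; matchAll(text, ctr))
    {
        // c[0] is the whole match, c[1] and c[2] are the first two groups.
        // c[3] is the repeated group, which holds only its last repetition.
        string[] row = [c[0], c[1], c[2], c[3]];
        result ~= row;
    }
    return result;
}

void main()
{
    auto line = "Valve AA has flow rate=0; tunnels lead to valves DD, II, BB";
    foreach (row; groupsPerLine(line))
        writeln(row);
}
```

So the regex is capturing, but there are only ever three groups, however many valves the line lists.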
Apr 06 2023
parent reply Paul <phshaffer gmail.com> writes:
On Thursday, 6 April 2023 at 16:27:23 UTC, Alex Bryan wrote:

 My understanding from browsing the documentation is that matchAll 
 returns a range of Captures (struct documented at 
 https://dlang.org/phobos/std_regex.html#Captures). In your foreach 
 loop I think c[0] will contain the current full match (current 
 line that matches), c[1] will contain the first captured match 
 ("AA" for first line), c[2] will contain "0" for first 
 line, etc.
Thanks Alex. Read some more and tried some different ways to access those repetitive ", cc" s on the end. I don't think my regex is capturing them.
Apr 06 2023
parent Ali Çehreli <acehreli yahoo.com> writes:
On 4/6/23 11:08, Paul wrote:
 ways to access 
 those repetitive ", cc" s on the end.  I don't think my regex is 
 capturing them.
Some internets think you are in parser territory:

https://stackoverflow.com/questions/1407435/how-do-i-regex-match-with-grouping-with-unknown-number-of-groups

Ali
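
One way to stay with std.regex is a two-step approach: capture the whole valve list as a single group, then split it in a second pass. The following is only a sketch. The `Valve` struct and `parseValves` function are hypothetical names, and it assumes every line uses the plural "valves" with names separated by ", ", as in the sample lines from the post.

```d
import std.array : split;
import std.conv : to;
import std.regex : ctRegex, matchAll;
import std.stdio : writeln;

// Hypothetical record type for one parsed line.
struct Valve
{
    string name;
    int rate;
    string[] tunnels;
}

Valve[] parseValves(string text)
{
    // Group 3 takes the whole comma-separated list in one piece
    // instead of repeating a group once per valve name.
    auto re = ctRegex!(`Valve ([A-Z]{2}).*=(\d+).+valves ([A-Z, ]+)`);

    Valve[] valves;
    foreach (c; matchAll(text, re))
    {
        // Second pass: split the captured list into individual names.
        valves ~= Valve(c[1], c[2].to!int, c[3].split(", "));
    }
    return valves;
}

void main()
{
    auto text = "Valve AA has flow rate=0; tunnels lead to valves DD, II, BB\n"
              ~ "Valve BB has flow rate=13; tunnels lead to valves CC, AA\n";
    foreach (v; parseValves(text))
        writeln(v.name, " ", v.rate, " ", v.tunnels);
}
```

Splitting the captured list avoids needing an unknown number of groups, which is the limitation the linked question describes.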
Apr 06 2023