digitalmars.D.learn - regex format string problem
- yawniek (9/9) Nov 22 2015 hi!
- Rikki Cattermole (21/28) Nov 22 2015 I take it that browscap[0] does not do what you want?
- yawniek (17/32) Nov 23 2015 Hi Rikki,
- Rikki Cattermole (16/47) Nov 23 2015 So like this?
Hi!

How can I format a string with captures from a regular expression? Basically, make this pass: https://gist.github.com/f17647fb2f8ff2261d42

Context: I'm trying to write an implementation for https://github.com/ua-parser, where the regular expressions as well as the format strings are given.
Nov 22 2015
On 23/11/15 12:41 PM, yawniek wrote:
> Hi! How can I format a string with captures from a regular expression? Basically, make this pass: https://gist.github.com/f17647fb2f8ff2261d42
>
> Context: I'm trying to write an implementation for https://github.com/ua-parser, where the regular expressions as well as the format strings are given.

I take it that browscap[0] does not do what you want? I have a generator at [1]. Feel free to steal.

Also, once you do get yours working, you'll want to use ctRegex and generate a file with all of them in it. That'll increase performance significantly.

Regarding regex, if you want a named sub-part, use:

    (?P<text>[a-z]*)

where [a-z]* is just an example.

I would recommend learning how input ranges work. They are how you get the matches out, e.g.

    auto rgx = ctRegex!`([a-z])[123]`;
    // matchAll takes the input first, then the regex
    foreach (match; matchAll("b3", rgx)) {
        writeln(match.hit);
    }

Or something along those lines; I did it off the top of my head.

[0] https://github.com/rikkimax/Cmsed/blob/master/tools/browser_detection/browscap.ini
[1] https://github.com/rikkimax/Cmsed/blob/master/tools/browser_detection/generator.d
Nov 22 2015
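A minimal sketch of the named-group and range-iteration advice in the reply above. The helper name `letters`, the group name `letter`, and the sample input are made up for illustration; it uses run-time `regex` rather than `ctRegex` only to keep the example quick to compile.

```d
import std.regex : regex, matchAll;

/// Hypothetical helper: collects the named group from every match in `input`.
string[] letters(string input)
{
    // (?P<letter>...) names the first group; it is still reachable as index 1.
    auto rgx = regex(`(?P<letter>[a-z])[123]`);

    string[] found;
    // matchAll returns a lazy input range of Captures, one element per match.
    foreach (m; matchAll(input, rgx))
        found ~= m["letter"];
    return found;
}

void main()
{
    import std.stdio : writeln;
    // "x9" is skipped because 9 is not in [123].
    writeln(letters("b3 x9 c1"));
}
```

Inside the loop, `m.hit` is the whole matched slice and `m[1]` refers to the same group as `m["letter"]`.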
Hi Rikki,

On Monday, 23 November 2015 at 03:57:06 UTC, Rikki Cattermole wrote:
> I take it that browscap[0] does not do what you want? I have a generator at [1]. Feel free to steal.

This looks interesting, thanks for the hint. However, it might be a bit limited: I have 15M+ different user agents with all kinds of weird cases, and sometimes not even the extensive ua-core regexes work. (If you're interested in testing, let me know.)

> Also, once you do get yours working, you'll want to use ctRegex and generate a file with all of them in it. That'll increase performance significantly.

That was my plan.

> Regarding regex, if you want a named sub-part, use:
>
>     (?P<text>[a-z]*)
>
> where [a-z]* is just an example. I would recommend learning how input ranges work. They are how you get the matches out, e.g.
>
>     auto rgx = ctRegex!`([a-z])[123]`;
>     foreach (match; matchAll("b3", rgx)) {
>         writeln(match.hit);
>     }

I'm aware of how this works; the problem is a different one: I have a second string that contains $n's, which can occur in any order. Of course I can just write another regex and replace them, job done. But from looking at std.regex, this seems to be built in; I just failed to get it to work properly, see my gist. I hoped this would be a one-liner.
Nov 23 2015
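One possible way to get the "built in" behaviour asked about above, as a sketch rather than a confirmed fix for the gist: the format-string overloads of `replaceFirst` and `replaceAll` in std.regex accept `$1`, `$2`, ... in any order. The helper name `formatCaptures`, the pattern, and the sample user agent are assumptions made up for this example.

```d
import std.regex : regex, matchFirst, replaceFirst;

/// Hypothetical helper (not part of std.regex): returns `fmt` with $1, $2, ...
/// filled in from the first match of `re` in `input`, or null if nothing matches.
string formatCaptures(RegEx)(string input, RegEx re, string fmt)
{
    auto m = matchFirst(input, re);
    if (m.empty)
        return null;
    // Run the substitution on the matched slice only, so the text before
    // and after the match is not carried into the result.
    return replaceFirst(m.hit, re, fmt);
}

void main()
{
    import std.stdio : writeln;
    auto re = regex(`(Firefox)/(\d+)\.(\d+)`);
    // The references may appear in any order in the format string.
    writeln(formatCaptures("Mozilla/5.0 Gecko/20100101 Firefox/42.0", re, "$2.$3 $1"));
}
```

The second pass re-matches `re` against `m.hit`, so this sketch assumes a pattern without anchors or lookaround that depend on the surrounding text. If the pattern always consumes the whole input, `input.replaceFirst(re, fmt)` alone is the one-liner.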
On 23/11/15 9:22 PM, yawniek wrote:
> I'm aware of how this works; the problem is a different one: I have a second string that contains $n's, which can occur in any order. Of course I can just write another regex and replace them, job done. But from looking at std.regex, this seems to be built in; I just failed to get it to work properly, see my gist. I hoped this would be a one-liner.

So like this?

    import std.regex;
    import std.stdio : readln, writeln, write, stdout;
    import std.string : chomp;

    auto REG = ctRegex!(`(\S+)(?: (.*))?`);

    void main() {
        for (;;) {
            write("> ");
            stdout.flush;
            // chomp drops the trailing newline and is safe on an empty line at EOF
            string line = readln().chomp;
            if (line.length == 0)
                return;
            // $1 in the format string is replaced by the first capture group
            writeln("< ", line.replaceAll(REG, "Unknown program: $1"));
        }
    }
Nov 23 2015
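A non-interactive sketch of what the pattern from the reply above produces, for checking the substitution without typing at a prompt. The function name `report` and the sample command line are hypothetical; the pattern is the one from the post, compiled at run time here.

```d
import std.regex : regex, replaceAll;

/// Hypothetical wrapper: applies `fmt` to every match of the reply's pattern.
string report(string line, string fmt)
{
    // Group 1 is the first word; group 2 is the optional rest of the line.
    auto re = regex(`(\S+)(?: (.*))?`);
    return line.replaceAll(re, fmt);
}

void main()
{
    import std.stdio : writeln;
    writeln(report("ls -la /tmp", "Unknown program: $1"));
    // References can be reordered freely in the format string.
    writeln(report("ls -la /tmp", "args=[$2] cmd=$1"));
}
```

Because the pattern consumes the whole line, the result is only the formatted string, with nothing left over from the input.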