digitalmars.D.learn - Why do the same work about 'IndexOfAny' and 'indexOf' function?
- FrankLike (22/22) Jan 07 2015 I want to know whether the string strs contains
- John Colvin (5/28) Jan 07 2015 std.algorithm.canFind will do what you want, including telling
- FrankLike (12/45) Jan 07 2015 Sorry, 'std.algorithm.find' do this work:Finds an individual
- bearophile (5/8) Jan 07 2015 Seems this:
- Tobias Pankrath (3/11) Jan 07 2015 Which uses this overload:
- John Colvin (5/45) Jan 07 2015 std.algorithm.find has several overloads, one of which takes
- FrankLike (11/15) Jan 07 2015 string strs ="hello.exe";
- H. S. Teoh via Digitalmars-d-learn (5/5) Jan 07 2015 Try this:
- FrankLike (21/24) Jan 08 2015 You mean ? The result is not that I want to get!
- FrankLike (24/27) Jan 08 2015 Thank you,it can work. but it's not what I want.
- Robert burner Schadek (3/3) Jan 08 2015 use canFind like such:
- FrankLike (8/11) Jan 08 2015 canFind is work for such as :
- ketmar via Digitalmars-d-learn (29/46) Jan 08 2015 On Fri, 09 Jan 2015 07:10:14 +0000
- FrankLike (9/60) Jan 09 2015 Sorry,it's only a example .Thank you work hard,but it's
- ketmar via Digitalmars-d-learn (26/33) Jan 09 2015 On Fri, 09 Jan 2015 09:36:01 +0000
- FrankLike (9/31) Jan 09 2015 The code is the best,and it's better than indexOfAny in C#:
- ketmar via Digitalmars-d-learn (5/15) Jan 09 2015 On Fri, 09 Jan 2015 12:46:53 +0000
- FrankLike (42/52) Jan 09 2015 Thank you.
- ketmar via Digitalmars-d-learn (8/71) Jan 09 2015 On Fri, 09 Jan 2015 13:06:09 +0000
- Robert burner Schadek (8/17) Jan 09 2015 IMO that is not sound advice. Creating the state machine and
- ketmar via Digitalmars-d-learn (9/28) Jan 09 2015 On Fri, 09 Jan 2015 13:54:00 +0000
- Robert burner Schadek (5/15) Jan 09 2015 even with CTFE regex still uses a state machine _mm256_cmpeq_epi8
- ketmar via Digitalmars-d-learn (5/23) Jan 09 2015 On Fri, 09 Jan 2015 14:11:49 +0000
- Robert burner Schadek (4/6) Jan 09 2015 I don't see your point, anyway I think he got his help or at
- FrankLike (14/46) Jan 09 2015 import std.regex;
- ketmar via Digitalmars-d-learn (7/56) Jan 09 2015 On Fri, 09 Jan 2015 15:36:21 +0000
- FrankLike (5/64) Jan 09 2015 Yes. regex doing 'a lot more keywords and a lot longer strings'
I want to know whether the string strs contains any of the substrings "exe", "dll", "a" or "lib". Elsewhere I can write:

    int index = indexofany(strs, ["exe", "dll", "a", "lib"]);

but in D I have to do it like this:

    findStr(strs, ["exe", "lib", "dll", "a"]);

    bool findStr(string strIn, string[] strFind)
    {
        bool bFind = false;
        foreach (str; strFind)
        {
            if (strIn.indexOf(str) != -1)
            {
                bFind = true;
                break;
            }
        }
        return bFind;
    }

Could a function like this be added to Phobos' string.d, so that indexOfAny works better?

Thank you.
Frank
Jan 07 2015
On Wednesday, 7 January 2015 at 14:54:51 UTC, FrankLike wrote:
> I want to know whether the string strs contains [...]
> bool findStr(string strIn,string[] strFind) [...]

std.algorithm.canFind will do what you want, including telling you which of ["exe","lib","dll","a"] was found. If you need to know where in strs it was found as well, you can use std.algorithm.find.
Jan 07 2015
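A minimal sketch of the multi-needle overloads mentioned above, assuming a reasonably recent Phobos. The point it tries to show is that the needles are passed as separate arguments rather than as one array; the variable names are only illustrative.

```d
import std.algorithm : canFind, find;
import std.stdio : writeln;

void main()
{
    string strs = "hello.exe";

    // Needles go in as separate arguments, not as one array.
    // The result is non-zero when any of the needles occurs in strs.
    auto which = strs.canFind("exe", "dll", "a", "lib");
    writeln("found: ", which != 0);

    // find returns a tuple: the rest of the haystack starting at the match,
    // and the 1-based position of the needle that matched (0 if none did).
    auto r = strs.find("exe", "dll", "a", "lib");
    writeln("rest: ", r[0], ", needle #", r[1]);
}
```

This only works when the list of needles is known at compile time; a list held in a string[] at run time needs a different approach.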
On Wednesday, 7 January 2015 at 15:11:57 UTC, John Colvin wrote:
> std.algorithm.canFind will do what you want, including telling you which of ["exe","lib","dll","a"] was found. If you need to know where in strs it was found as well, you can use std.algorithm.find

Sorry, but 'std.algorithm.find' does this work: "Finds an individual element in an input range", and its parameters are:

    InputRange haystack    The range searched in.
    Element needle         The element searched for.

What I want to know is whether a string (like "hello.exe", "hello.a", "hello.dll" or "hello.lib") contains any of ["exe","dll","a","lib"].

My function 'findStr' works fine. I would be happy if string.d's 'indexOfAny' did this work (but right now 'indexOfAny' and 'indexOf' do the same work).

Thank you.
Jan 07 2015
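For comparison, a small sketch of what the two std.string functions do as I understand them: indexOf looks for the whole substring, while indexOfAny treats its second argument as a set of single characters. The sample string is only an illustration.

```d
import std.stdio : writeln;
import std.string : indexOf, indexOfAny;

void main()
{
    auto s = "hello.exe";

    writeln(s.indexOf("exe"));    // index where the substring "exe" starts
    writeln(s.indexOfAny("xz"));  // index of the first 'x' or 'z'
    writeln(s.indexOfAny("exe")); // index of the first 'e' or 'x', not of "exe"
}
```

So the two give the same answer only when the first character of the set happens to sit where the substring starts; neither accepts an array of strings.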
FrankLike:
> But now I want to know in a string (like "hello.exe" or "hello.a", or "hello.dll" or "hello.lib") whether contains any of them: ["exe","dll","a","lib"].

Seems this:
http://rosettacode.org/wiki/File_extension_is_in_extensions_list#D

Bye,
bearophile
Jan 07 2015
On Wednesday, 7 January 2015 at 16:02:25 UTC, bearophile wrote:
> Seems this:
> http://rosettacode.org/wiki/File_extension_is_in_extensions_list#D

Which uses this overload:

    size_t canFind(Range, Ranges...)(Range haystack, Ranges needles)
Jan 07 2015
On Wednesday, 7 January 2015 at 15:57:18 UTC, FrankLike wrote:
> Sorry, 'std.algorithm.find' do this work: Finds an individual element in an input range, and it's Parameters:
> InputRange haystack The range searched in.
> Element needle The element searched for.

std.algorithm.find has several overloads, one of which takes multiple needles. The same is true for std.algorithm.canFind.

Quoting from the relevant std.algorithm.find overload docs: "Finds two or more needles into a haystack."
Jan 07 2015
> std.algorithm.find has several overloads, one of which takes multiple needles. The same is true for std.algorithm.canFind
> [...] "Finds two or more needles into a haystack."

    string strs = "hello.exe";
    string[] s = ["lib", "exe", "a", "dll"];
    auto a = canFind!(string, string[])(strs, s);
    writeln("a is ", a);
    string strsb = "hello.";
    auto b = canFind!(string, string[])(strsb, s);
    writeln("b is ", b);

This gives the error:

    does not match template declaration canFind(alias pred = "a ==b")

You can test it.

Thank you.
Jan 07 2015
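One possible reading of the error above is that the first template parameter of canFind is the predicate, and that a string[] is treated as a single needle rather than a list of needles. A hedged sketch of a way around it for a list that only exists at run time, combining std.algorithm.any with canFind; `containsAny` is a hypothetical helper name, not something in Phobos.

```d
import std.algorithm : any, canFind;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): true if haystack contains
// at least one of the strings in needles.
bool containsAny(string haystack, string[] needles)
{
    return needles.any!(n => haystack.canFind(n));
}

void main()
{
    string[] s = ["lib", "exe", "a", "dll"];
    writeln("a is ", containsAny("hello.exe", s));
    writeln("b is ", containsAny("hello.", s));
}
```

The lambda captures `haystack`, so this is meant as the simple version rather than the fastest one.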
Try this:

T

--
MACINTOSH: Most Applications Crash, If Not, The Operating System Hangs
Jan 07 2015
On Wednesday, 7 January 2015 at 17:08:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:
> Try this:

You mean ? The result is not what I want to get!

---------------test.d--------------
import std.stdio, std.algorithm, std.string;

auto ext = ["exe", "lib", "a", "dll"];
auto strs = "hello.exe";

void main()
{
    auto b = findAmong(ext, strs);
    writeln("b is ", b);
}
---------result-----
b is ["exe","lib","a","dll"]
--------------------

Note:
1. I only want to find out whether the given string 'hello.exe' includes any of the strings in ["exe","lib","a","dll"].
2. I think the 'indexOfAny' function in string.d does the same work as 'indexOf'. This is not as it should be.

Frank
Jan 08 2015
On Wednesday, 7 January 2015 at 17:08:55 UTC, H. S. Teoh via Digitalmars-d-learn wrote:
> Try this:

Thank you, it works, but it's not what I want.

---------------test.d--------------
import std.stdio, std.algorithm, std.string;

auto ext = ["exe", "lib", "a", "dll"];
auto strs = "hello.dll";

void main()
{
    auto b = findAmong(ext, strs);
    writeln("b is ", b);
}
---------result-----
b is ["dll"]
--------------------

I think it would be fine if the 'indexOfAny' function in string.d did this work, such as:

    auto b = "hello.dll".indexOfAny(["exe", "lib", "a", "dll"]);
    writeln("b is ", b);

The result should be 'true', if it worked.

Can you suggest that Phobos update the 'indexOfAny' function?

Thank you.
Frank
Jan 08 2015
use canFind like such:

    bool a = canFind(strs, s) >= 1;

let the compiler figure out what the types of the parameters are.
Jan 08 2015
On Thursday, 8 January 2015 at 15:15:59 UTC, Robert burner Schadek wrote:
> use canFind like such:
> bool a = canFind(strs,s) >= 1;
> let the compiler figger out what the types of the parameter are.

canFind works for something like:

    bool x = canFind(["exe", "lib", "a", "dll"], "a");

but it can't work for:

    canFind(["exe", "lib", "a", "dll"], "hello.lib");

So I would very much like the function 'indexOfAny' to do this work.

Thank you.
Frank
Jan 08 2015
On Fri, 09 Jan 2015 07:10:14 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> canFind is work for such as :
> bool x = canFind(["exe","lib","a","dll"],"a" );
> but can't work for canFind(["exe","lib","a","dll"],"hello.lib");
>
> So I very want to let the function 'indexOfAny' do the same work.

be creative! ;-)

    import std.algorithm, std.stdio;

    void main () {
      string fname = "hello.exe";
      import std.path : extension;
      if (findAmong([fname.extension], [".exe", ".lib", ".a", ".dll"]).length) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

note the dots in extension list.

yet you can do it even easier:

    import std.algorithm, std.stdio;

    void main () {
      string fname = "hello.exe";
      import std.path : extension;
      if ([".exe", ".lib", ".a", ".dll"].canFind(fname.extension)) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

as you are obviously interested in the extension here -- check only that part! ;-)
Jan 08 2015
On Friday, 9 January 2015 at 07:41:07 UTC, ketmar via Digitalmars-d-learn wrote:
> be creative! ;-)
> [...]
> as you obviously interested in extension here -- check only that part! ;-)

Sorry, it was only an example. Thank you for the hard work, but it's not what I want. The 'indexOfAny' function should do this work:

    "he is at home", ["home", "office", "sea", "plane"]

I know findAmong can do it, but that uses two functions.

Thank you.
Jan 09 2015
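If the goal is a single call that also reports which word matched, one way to sketch it is with std.algorithm.countUntil and a predicate. `indexOfAnyWord` is a hypothetical name made up for this illustration, and what it returns is an index into the word list, not a position inside the sentence.

```d
import std.algorithm : canFind, countUntil;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): index into needles of the
// first needle that occurs somewhere in haystack, or -1 if none does.
ptrdiff_t indexOfAnyWord(string haystack, string[] needles)
{
    return needles.countUntil!(n => haystack.canFind(n));
}

void main()
{
    auto places = ["home", "office", "sea", "plane"];
    writeln(indexOfAnyWord("he is at home", places));
    writeln(indexOfAnyWord("He is in the sea.", places));
    writeln(indexOfAnyWord("nowhere", places));
}
```

A result other than -1 can then be used both as the yes/no answer and to look up the matching word in the list.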
On Fri, 09 Jan 2015 09:36:01 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> Sorry,it's only a example .Thank you work hard,but it's not what I want.
> 'indexOfAny ' function should do this work.
> "he is at home" ,["home","office","sea","plane"]
> I know findAmong can do it,but use two function .

be creative! ;-)

    import std.algorithm, std.stdio;

    void main () {
      string s = "he is at plane";
      if (findAmong!((string a, string b) => b.canFind(a))([s], ["home", "office", "sea", "plane"]).length) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }

or:

    import std.algorithm, std.stdio;

    void main () {
      string s = "he is at home";
      if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
        writeln("got it!");
      } else {
        writeln("alas...");
      }
    }
Jan 09 2015
> be creative! ;-)
> [...]
> if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
> [...]

The code is the best, and it's better than indexOfAny in C#:

    import std.algorithm, std.stdio;

    void main ()
    {
        auto places = ["home", "office", "sea", "plane"];
        auto strWhere = "He is in the sea.";
        auto where = places.canFind!(a => strWhere.canFind(a));
        writeln("Result is ", where);
    }
Jan 09 2015
On Fri, 09 Jan 2015 12:46:53 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> import std.algorithm, std.stdio;
> void main ()
> {
> auto places = [ "home", "office", "sea","plane"];
> auto strWhere = "He is in the sea.";
> auto where = places.canFind!(a => strWhere.canFind(a));
> writeln("Result is ",where);
> }

this does unnecessary upvalue access (`strWhere`). try to avoid such stuff whenever it is possible.
Jan 09 2015
On Friday, 9 January 2015 at 10:02:53 UTC, ketmar via Digitalmars-d-learn wrote:
> import std.algorithm, std.stdio;
> [...]
> if (["home", "office", "sea", "plane"].canFind!((a, string b) => b.canFind(a))(s)) {
> [...]

Thank you.

    /* places.canFind!(a => strWhere.canFind(a)); */

By

    auto r = benchmark!(f0, f1, f2, f3, f4)(10_0000);

the result is:

    filter           is 42ms 85us
    findAmong        is 37ms 268us
    foreach indexOf  is 37ms 841us
    canFind          is 13ms
    canFind indexOf  is 39ms 455us

-----------------------5 functions--------------------------
import std.stdio, std.algorithm, std.string;

auto places = ["home", "office", "sea", "plane"];
auto strWhere = "He is in the sea.";

void main()
{
    auto where = places.filter!(a => strWhere.indexOf(a) != -1);
    writeln("0 Result is ", where);

    auto where1 = findAmong(places, strWhere);
    writeln("1 Result is ", where1);

    string where2;
    foreach (a; places)
    {
        if (strWhere.indexOf(a) != -1)
        {
            where2 = a;
            break;
        }
    }
    writeln("2 Result is ", where2);

    auto where3 = places.canFind!(a => strWhere.canFind(a));
    writeln("3 Result is ", where3);

    auto where4 = places.canFind!(a => strWhere.indexOf(a) != -1);
    writeln("4 Result is ", where4);
}

Frank
Jan 09 2015
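The post above does not show the f0..f4 wrappers, so here is a hedged sketch of how two of the candidates might be wrapped for timing. It assumes a current Phobos, where benchmark lives in std.datetime.stopwatch (in 2015 it was in std.datetime); the names `viaCanFind`, `viaIndexOf` and `sink` are invented for the illustration.

```d
import std.algorithm : canFind;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;
import std.string : indexOf;

auto places = ["home", "office", "sea", "plane"];
auto strWhere = "He is in the sea.";
bool sink; // each candidate stores its answer here so the work is not discarded

// Candidate: canFind with a predicate over the word list.
void viaCanFind()
{
    sink = places.canFind!(a => strWhere.canFind(a));
}

// Candidate: plain loop with indexOf, stopping at the first hit.
void viaIndexOf()
{
    sink = false;
    foreach (a; places)
    {
        if (strWhere.indexOf(a) != -1)
        {
            sink = true;
            break;
        }
    }
}

void main()
{
    // 100_000 iterations, the same count as 10_0000 in the post above.
    auto r = benchmark!(viaCanFind, viaIndexOf)(100_000);
    writeln("canFind:         ", r[0]);
    writeln("foreach indexOf: ", r[1]);
}
```

Timings from a run like this depend heavily on compiler flags and on the input, so they are only comparable within one build on one machine.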
On Fri, 09 Jan 2015 13:06:09 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> By auto r = benchmark!(f0,f1, f2, f3,f4)(10_0000);
>
> Result is :
> filter is 42ms 85us
> findAmong is 37ms 268us
> foreach indexOf is 37ms 841us
> canFind is 13ms
> canFind indexOf is 39ms 455us
> [...]

if you are *really* concerned with speed here, you'd better consider using regular expressions, as a regular expression can be precompiled and can then search for multiple words with only one pass over the source string. i believe that std.regex will use a variation of the Thompson algorithm for regular expressions when it is able to do so.
Jan 09 2015
On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via Digitalmars-d-learn wrote:
> if you *really* concerned with speed here, you'd better consider using regular expressions. as regular expression can be precompiled and then search for multiple words with only one pass over the source string. i believe that std.regex will use variation of Thomson algorithm for regular expressions when it is able to do so.

IMO that is not sound advice. Creating the state machine and running it will be more costly than using canFind or indexOf, which basically only compare char by char.

If speed is really needed, use strstr and check whether it uses SSE to compare multiple chars at a time. Anyway, benchmark and then benchmark some more.
Jan 09 2015
On Fri, 09 Jan 2015 13:54:00 +0000
Robert burner Schadek via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> IMO that is not sound advice. Creating the state machine and running will be more costly than using canFind or indexOf how basically only compare char by char.
>
> If speed is really need use strstr and look if it uses sse to compare multiple chars at a time. Anyway benchmark and then benchmark some more.

std.regex can use CTFE to compile regular expressions (yet it is sometimes slower than the non-CTFE variant), and i mean that we compile the regexp before doing a lot of searches, not before each single search. if you have a lot of words to match or a lot of strings to check, a regexp can give a huge boost.

sure, it all depends on code patterns.
Jan 09 2015
On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via Digitalmars-d-learn wrote:
> std.regex can use CTFE to compile regular expressions (yet it sometimes slower than non-CTFE variant), and i mean that we compile regexp before doing alot of searches, not before each single search. if you have alot of words to match or alot of strings to check, regexp can give a huge boost.
>
> sure, it all depends of code patterns.

Even with CTFE, regex still uses a state machine. _mm256_cmpeq_epi8 will beat that even for multiple strings. Basically all lexers are handwritten; if regex were fast enough, nobody would do that work.
Jan 09 2015
On Fri, 09 Jan 2015 14:11:49 +0000
Robert burner Schadek via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> even with CTFE regex still uses a state machine _mm256_cmpeq_epi8 will beat that even for multiple strings. Basically all lexer are handwritten, if regex where fast enough nobody would do the work.

heh. regexps *are* fast enough. it's hard to beat well-optimised generated thingy on a complex grammar. ;-)
Jan 09 2015
On Friday, 9 January 2015 at 14:21:04 UTC, ketmar via Digitalmars-d-learn wrote:
> heh. regexps *are* fast enough. it's hard to beat well-optimised generated thingy on a complex grammar. ;-)

I don't see your point. Anyway, I think he got his help, or at least some help.
Jan 09 2015
On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via Digitalmars-d-learn wrote:
> std.regex can use CTFE to compile regular expressions (yet it sometimes slower than non-CTFE variant), and i mean that we compile regexp before doing alot of searches, not before each single search. if you have alot of words to match or alot of strings to check, regexp can give a huge boost.
>
> sure, it all depends of code patterns.

    import std.regex;
    auto ctr = ctRegex!(`(home|office|sea|plane)`);
    auto c2 = !matchFirst("He is in the sea.", ctr).empty;

----------------------------------------------------------
Test by

    auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_0000);

Result is:

    filter           is 42ms 85us
    findAmong        is 37ms 268us
    foreach indexOf  is 37ms 841us
    canFind          is 13ms
    canFind indexOf  is 39ms 455us
    ctRegex          is 138ms
Jan 09 2015
On Fri, 09 Jan 2015 15:36:21 +0000
FrankLike via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> import std.regex;
> auto ctr = ctRegex!(`(home|office|sea|plane)`);
> auto c2 = !matchFirst("He is in the sea.", ctr).empty;
> ----------------------------------------------------------
> Test by auto r = benchmark!(f0,f1, f2, f3,f4,f5)(10_0000);
>
> Result is :
> [...]
> ctRegex is 138ms

1. stop doing captures in the regexp, this will speed up the comparison.
2. your sample is very artificial. i was talking about a lot more keywords and a lot longer strings. sorry, i didn't say that clearly enough.
Jan 09 2015
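A hedged sketch of the first point: the same alternation written without the surrounding parentheses, so that no capture group has to be recorded, with the regex built once and reused for every input. `matchesAny` is a hypothetical helper name, and a run-time regex is used here instead of ctRegex only to keep the example small; whether either variant is actually faster has to be measured.

```d
import std.regex : matchFirst, regex, Regex;
import std.stdio : writeln;

// Hypothetical helper: true if the precompiled pattern matches anywhere in s.
bool matchesAny(string s, Regex!char re)
{
    return !s.matchFirst(re).empty;
}

void main()
{
    // No parentheses around the alternation, so there is no capture group.
    auto re = regex(`home|office|sea|plane`);

    // The regex is compiled once above and reused for each string.
    foreach (s; ["He is in the sea.", "He is on the moon."])
        writeln(s, " -> ", matchesAny(s, re));
}
```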
On Friday, 9 January 2015 at 15:57:21 UTC, ketmar via Digitalmars-d-learn wrote:
> 1. stop doing captures in regexp, this will speedup the comparison.
> 2. your sample is very artificial. i was talking about alot more keywords and alot longer strings. sorry, i wasn't told that clear enough.

Yes. Regex will do better with 'a lot more keywords and a lot longer strings'.

Thank you.
Jan 09 2015