digitalmars.D.learn - Too slow readln
- unDEFER (49/49) Jul 16 2017 Hello, there!
- Jon Degenhardt (8/10) Jul 16 2017 GNU grep is pretty fast, it's tough to beat it reading one line
- unDEFER (5/16) Jul 16 2017 Thank you. I understand yet another trick:
- unDEFER (3/3) Jul 16 2017 I understand the main problem. dirEntries by default follows
- Ali Çehreli (5/9) Jul 16 2017 Another fast GNU utility was on Reddit a month ago:
Hello, there!

I have the following "grep" code: https://dpaste.dzfl.pl/7b7273f96ab2

And I have a directory to run it on:

$ time /home/undefer/MyFiles/Projects/TEST/D/grep "HELLO" .
./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO
real 1m17.096s
user 0m54.828s
sys 0m13.340s

I get the same result with ldc2.

The same with bash and grep:

$ time for i in `find .`; do file -b "$i" | grep -q text && grep -a "HELLO" "$i"; done
[pid 18365] write(1, "HELLO\n", 6HELLO
real 0m42.461s
user 0m23.244s
sys 0m22.300s

Only `file` for all files:

$ time find . -exec file {} + >/dev/null
real 0m15.013s
user 0m14.556s
sys 0m0.436s

Only grep for all files:

$ for i in `find .`; do file -b "$i" | grep -q text && echo "$i"; done > LIST1
$ time for i in `cat LIST1`; do grep -a "HELLO" "$i"; done
[pid 18365] write(1, "HELLO\n", 6HELLO
real 0m4.431s
user 0m1.112s
sys 0m3.148s

So 15 + 4.4 is much less than 42.46. Why? How can "find" run "file" so many times so quickly? And why is 42.461s so much less than 1m17.096s?

The second version of grep: https://dpaste.dzfl.pl/9db5bc2f0a26

$ time /home/undefer/MyFiles/Projects/TEST/D/grep2 "HELLO" `cat LIST1`
./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO
real 0m1.871s
user 0m1.824s
sys 0m0.048s

$ time grep -a "HELLO" `cat LIST1`
./strace.log:[pid 18365] write(1, "HELLO\n", 6HELLO
real 0m0.075s
user 0m0.044s
sys 0m0.028s

The profiler says that readln eats the CPU time. So why is 0m0.075s so much less than 0m1.871s? How can I write a grep in D that is no slower than GNU grep?
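The pasted programs are not reproduced in this thread, so the following is only a guess at the kind of loop involved, not the poster's code. In Phobos, `readln()` called without arguments allocates a new string for every line, while `File.byLine` reuses one internal buffer. A minimal sketch of a line-by-line matcher built on `byLine` (the function name `countMatches` and the command-line layout are made up for illustration):

```d
import std.algorithm.searching : canFind;
import std.stdio : File, writeln;
import std.string : representation;

/// Counts the lines of `path` that contain `needle`.
size_t countMatches(string path, const(char)[] needle)
{
    size_t count;
    // byLine reuses one internal buffer, so `line` is only valid
    // until the next iteration; copy it (e.g. with .idup) to keep it.
    foreach (line; File(path).byLine)
    {
        // Compare raw bytes so that non-UTF-8 input is not decoded.
        if (line.representation.canFind(needle.representation))
            ++count;
    }
    return count;
}

void main(string[] args)
{
    if (args.length < 3)
    {
        writeln("usage: prog NEEDLE FILE");
        return;
    }
    writeln(countMatches(args[2], args[1]));
}
```

This still visits the input one line at a time, so it should not be expected to match GNU grep; it only removes the per-line allocation.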
Jul 16 2017
On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote:
> [snip] How to write in D grep not slower than GNU grep?

GNU grep is pretty fast; it's tough to beat it reading one line at a time. That's because it can play a bit of a trick and do the initial match ignoring line boundaries and correct line boundaries later. There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel):

https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

--Jon
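The idea of matching first and finding line boundaries afterwards can be sketched in D roughly as follows. This is a simplified illustration, not GNU grep's actual algorithm: it uses the plain `indexOf` from `std.string`, and the function name `matchingLines` is invented for the example. The buffer is searched as a whole, and newlines are looked for only around each hit.

```d
import std.stdio : writeln;
import std.string : indexOf, lastIndexOf;

/// Returns every line of `data` that contains `needle`.
/// The search runs over the whole buffer; line boundaries are
/// located only around each hit.
const(char)[][] matchingLines(const(char)[] data, const(char)[] needle)
{
    const(char)[][] result;
    size_t pos = 0;
    while (pos < data.length)
    {
        // Search the remaining buffer without splitting it into lines.
        auto hit = data[pos .. $].indexOf(needle);
        if (hit < 0)
            break;
        size_t at = pos + hit;

        // Only now look for the enclosing line boundaries.
        auto prev = data[0 .. at].lastIndexOf('\n');
        size_t start = prev < 0 ? 0 : prev + 1;
        auto next = data[at .. $].indexOf('\n');
        size_t end = next < 0 ? data.length : at + next;

        result ~= data[start .. end];
        pos = end + 1; // continue after this line so it is reported once
    }
    return result;
}

void main()
{
    auto text = "first line\nsay HELLO twice HELLO\nlast HELLO";
    foreach (line; matchingLines(text, "HELLO"))
        writeln(line);
}
```

Run as is, this prints the two lines "say HELLO twice HELLO" and "last HELLO". In a real tool the buffer would come from reading a file in large blocks rather than from a string literal.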
Jul 16 2017
On Sunday, 16 July 2017 at 17:37:34 UTC, Jon Degenhardt wrote:
> On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote:
>> [snip] How to write in D grep not slower than GNU grep?
> GNU grep is pretty fast, it's tough to beat it reading one line at a time. That's because it can play a bit of a trick and do the initial match ignoring line boundaries and correct line boundaries later. There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel): https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html --Jon

Thank you. I have understood yet another trick:

$ find . -exec file -bi {} +

is the same as

$ file -bi `find .`
Jul 16 2017
I have understood the main problem: dirEntries follows symlinks by default. With symlink following turned off, my first grep takes only 28.338s. That's really cool!
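For reference, `dirEntries` in `std.file` takes a third parameter, `followSymlink`, which defaults to `true`. A minimal sketch of a walk that passes `false` (the helper name `regularFiles` and the choice of `SpanMode.depth` are assumptions for the example, not taken from the pasted code):

```d
import std.file : dirEntries, SpanMode;
import std.stdio : writeln;

/// Collects the names of regular files under `root`
/// without descending into symlinked directories.
string[] regularFiles(string root)
{
    string[] names;
    // The third argument is followSymlink; it defaults to true.
    foreach (entry; dirEntries(root, SpanMode.depth, false))
    {
        // Skip symlinks themselves and anything that is not a file.
        if (!entry.isSymlink && entry.isFile)
            names ~= entry.name;
    }
    return names;
}

void main(string[] args)
{
    auto root = args.length > 1 ? args[1] : ".";
    foreach (name; regularFiles(root))
        writeln(name);
}
```

The sketch does no error handling, so an unreadable directory will end the walk with an exception.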
Jul 16 2017
On 07/16/2017 10:37 AM, Jon Degenhardt wrote:
> There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel): https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html --Jon

Another fast GNU utility was on Reddit a month ago:

https://www.reddit.com/r/programming/comments/6gxf02/how_is_gnus_yes_so_fast_xpost_runix/

Ali
Jul 16 2017