digitalmars.D.learn - Too slow readln

unDEFER <undefer gmail.com> writes:

Hello, there!

I have the following "grep" code:
https://dpaste.dzfl.pl/7b7273f96ab2

Here is a run on my test directory:
$ time /home/undefer/MyFiles/Projects/TEST/D/grep "HELLO" .
./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO

real	1m17.096s
user	0m54.828s
sys	0m13.340s

I get the same result with ldc2.

The same search with bash and grep:
$ time for i in `find .`; do file -b "$i" | grep -q text && grep -a "HELLO" "$i"; done
[pid 18365] write(1, "HELLO\n", 6HELLO

real	0m42.461s
user	0m23.244s
sys	0m22.300s

Only `file` for all files:
$ time find . -exec file {} + >/dev/null

real	0m15.013s
user	0m14.556s
sys	0m0.436s

Only grep for all files:
$ for i in `find .`; do file -b "$i" | grep -q text && echo "$i"; done > LIST1
$ time for i in `cat LIST1`; do grep -a "HELLO" "$i"; done
[pid 18365] write(1, "HELLO\n", 6HELLO

real	0m4.431s
user	0m1.112s
sys	0m3.148s

So 15 + 4.4 is much less than 42.46. Why? How can "find" run 
"file" so many times so quickly?
And why is 42.461s so much less than 1m17.096s?

The second version of grep:
https://dpaste.dzfl.pl/9db5bc2f0a26

$ time /home/undefer/MyFiles/Projects/TEST/D/grep2 "HELLO" `cat LIST1`
./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO

real	0m1.871s
user	0m1.824s
sys	0m0.048s

$ time grep -a "HELLO" `cat LIST1`
./strace.log:[pid 18365] write(1, "HELLO\n", 6HELLO

real	0m0.075s
user	0m0.044s
sys	0m0.028s

The profiler says that readln eats the CPU time. So why is 
0m0.075s so much less than 0m1.871s?

How can I write a grep in D that is not slower than GNU grep?
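
For readers following along: the linked code is not reproduced here, so the following is only a sketch of one common way to cut per-line overhead. It assumes the slow version calls the allocating form of readln, which returns a fresh string for every line. File.byLine reuses a single internal buffer instead. The function name countMatches is made up for this illustration.

```d
import std.algorithm.searching : canFind;
import std.stdio : File, writeln;

/// Counts the lines of `path` that contain `needle`.
/// (Hypothetical helper, not taken from the linked code.)
size_t countMatches(string path, string needle)
{
    auto f = File(path, "r");
    size_t n;
    // byLine hands out a slice of one reused buffer, so no per-line
    // allocation happens; call .idup on a line only if it must outlive
    // the current iteration.
    foreach (line; f.byLine)
    {
        if (line.canFind(needle))
            ++n;
    }
    return n;
}

void main(string[] args)
{
    if (args.length < 3)
    {
        writeln("usage: prog PATTERN FILE...");
        return;
    }
    foreach (path; args[2 .. $])
        writeln(path, ": ", countMatches(path, args[1]));
}
```

This still reads line by line, so it should not be expected to reach GNU grep's speed; it only removes the allocation cost from the loop.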

Jul 16 2017

Jon Degenhardt <jond noreply.com> writes:

On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote:
 [snip]

 How to write in D grep not slower than GNU grep?

GNU grep is pretty fast, it's tough to beat it reading one line 
at a time. That's because it can play a bit of a trick and do the 
initial match ignoring line boundaries and correct line 
boundaries later. There's a good discussion in this thread ("Why 
GNU grep is fast" by Mike Haertel): 
https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

--Jon
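
A rough D sketch of the idea described above, not GNU grep's actual implementation: search the whole buffer for the pattern first, and work out the line boundaries only around each hit. The function name matchingLines is hypothetical, the input is assumed to fit in memory as one string, and a plain substring search stands in for grep's real matcher.

```d
import std.stdio : writeln;
import std.string : indexOf, lastIndexOf;

/// Returns every line of `data` that contains `needle`.
/// The buffer is searched as a whole; newlines are only looked for
/// around a hit. (Hypothetical helper for illustration.)
string[] matchingLines(string data, string needle)
{
    string[] result;
    if (needle.length == 0)
        return result;

    size_t pos = 0;
    while (pos < data.length)
    {
        // Search the rest of the buffer, ignoring line boundaries.
        immutable hit = data[pos .. $].indexOf(needle);
        if (hit < 0)
            break;
        immutable size_t at = pos + cast(size_t) hit;

        // Correct the boundaries afterwards: find the enclosing line.
        immutable prevNl = data[0 .. at].lastIndexOf('\n');
        immutable size_t start = prevNl < 0 ? 0 : cast(size_t) prevNl + 1;
        immutable nextNl = data[at .. $].indexOf('\n');
        immutable size_t end = nextNl < 0 ? data.length : at + cast(size_t) nextNl;

        result ~= data[start .. end];
        // Continue after this line so it is reported only once.
        pos = end + 1;
    }
    return result;
}

void main()
{
    string text = "alpha\nsay HELLO twice HELLO\nbeta\nHELLO";
    foreach (line; matchingLines(text, "HELLO"))
        writeln(line);
}
```

Lines without a match are never split out at all, which is where the saving over a line-at-a-time loop comes from.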

Jul 16 2017

unDEFER <undefer gmail.com> writes:

On Sunday, 16 July 2017 at 17:37:34 UTC, Jon Degenhardt wrote:
 On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote:
 [snip]

 How to write in D grep not slower than GNU grep?

 GNU grep is pretty fast, it's tough to beat it reading one line 
 at a time. That's because it can play a bit of a trick and do 
 the initial match ignoring line boundaries and correct line 
 boundaries later. There's a good discussion in this thread 
 ("Why GNU grep is fast" by Mike Haertel): 
 https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

 --Jon

Thank you. I have figured out another trick:
$ find . -exec file -bi {} +
is the same as
$ file -bi `find .`

Jul 16 2017

unDEFER <undefer gmail.com> writes:

I found the main problem: dirEntries follows symlinks by 
default.
With that turned off, my first grep takes only 28.338s. That's really cool!
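
A minimal sketch of that change, assuming the traversal looks roughly like this (the linked code is not shown here, and the helper name regularFiles is made up). The third argument of std.file.dirEntries is followSymlink, which defaults to true.

```d
import std.file : dirEntries, SpanMode;
import std.stdio : writeln;

/// Collects regular files under `root` without descending into
/// symlinked directories. (Hypothetical helper for illustration.)
string[] regularFiles(string root)
{
    string[] files;
    // Passing false as followSymlink keeps the walk inside the real tree.
    foreach (entry; dirEntries(root, SpanMode.depth, false))
    {
        // Skip symlinks themselves, then keep only regular files.
        if (!entry.isSymlink && entry.isFile)
            files ~= entry.name;
    }
    return files;
}

void main(string[] args)
{
    string root = args.length > 1 ? args[1] : ".";
    foreach (name; regularFiles(root))
        writeln(name);
}
```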

Jul 16 2017

Ali Çehreli <acehreli yahoo.com> writes:

On 07/16/2017 10:37 AM, Jon Degenhardt wrote:

 There's a good discussion in this thread ("Why GNU grep is fast" by Mike
 Haertel):
 https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

 --Jon

Another fast GNU utility was on Reddit a month ago:

https://www.reddit.com/r/programming/comments/6gxf02/how_is_gnus_yes_so_fast_xpost_runix/

Ali

Jul 16 2017
