digitalmars.D.learn - Reading bigger file
- bioinfornatics (12/12) Mar 08 2013 Hi,
- Chris Cain (11/13) Mar 08 2013 ----
- Marco Leise (14/29) Mar 08 2013 Ha ha!
- bioinfornatics (3/16) Mar 08 2013 Oh ok, I thought sw.start() reset it to 0. Let's just say it's
Hi, I have already asked some questions about this, and I am back with a new one ^^

Why, when reading a huge file, does it take longer to get a line the further I advance into the file? The first line is fetched in 0 msec and the time keeps increasing, which is a problem when your file is around 30 GB!

For example, uncompress one fastq file from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/sequence_read/ and use this tiny code: http://dpaste.dzfl.pl/47838d8d

You will see that the higher the line number, the longer it takes to get the line.
Mar 08 2013
On Friday, 8 March 2013 at 15:25:02 UTC, bioinfornatics wrote:
> Why, when reading a huge file, does it take longer to get a line the further I advance into the file?

----
StopWatch sw;
while( !fastq1.empty ){
    sw.start();
    auto q1 = fastq1.next();
    sw.stop();
    writeln( sw.peek().msecs() );
}
----

That's because you never reset the StopWatch.
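
For illustration, here is a minimal sketch of a loop that reports per-iteration time rather than cumulative time. It assumes the current `std.datetime.stopwatch` module, where `peek()` returns a `Duration` (the loop above uses the older `std.datetime` API with `peek().msecs()`). The helper name `timeEachCall` and the sleeping workload are hypothetical stand-ins for the `fastq1.next()` call.

```d
import core.thread : Thread;
import core.time : Duration, msecs;
import std.datetime.stopwatch : StopWatch;
import std.stdio : writeln;

/// Times each call to `work` on its own by resetting the stopwatch
/// at the top of every iteration. (`timeEachCall` is a hypothetical helper.)
Duration[] timeEachCall(void delegate() work, size_t iterations)
{
    Duration[] perCall;
    StopWatch sw;
    foreach (i; 0 .. iterations)
    {
        sw.reset();  // discard the time accumulated by earlier iterations
        sw.start();
        work();      // stand-in for: auto q1 = fastq1.next();
        sw.stop();
        perCall ~= sw.peek();
    }
    return perCall;
}

void main()
{
    // Each reported value should stay near the sleep time instead of growing.
    auto timings = timeEachCall(delegate() { Thread.sleep(5.msecs); }, 3);
    foreach (t; timings)
        writeln(t.total!"msecs", " ms");
}
```

Without the `reset()` call, `start()` resumes from the previously accumulated time, so every printed value includes all earlier iterations.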
Mar 08 2013
On Fri, 08 Mar 2013 16:31:45 +0100, "Chris Cain" <clcain uncg.edu> wrote:
> On Friday, 8 March 2013 at 15:25:02 UTC, bioinfornatics wrote:
>> Why, when reading a huge file, does it take longer to get a line the further I advance into the file?
>
> ----
> StopWatch sw;
> while( !fastq1.empty ){
>     sw.start();
>     auto q1 = fastq1.next();
>     sw.stop();
>     writeln( sw.peek().msecs() );
> }
> ----
>
> That's because you never reset the StopWatch.

Ha ha!

On a different note... if you still just parse linearly without seeking inside the file, you should consider parsing the compressed file directly. GZIP decompression is very fast. You may get 80 MiB/s for the decompression as well as for HDD read speed. So as long as you parallelize reading new data and decompression, that's your actual read speed. Now, the compression factor you indicated is ~15x, so that makes it effectively a 15 * 80 MiB/s = 1200 MiB/s, or roughly 1.2 GiB/s, read speed. Sounds good? :)

-- Marco
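
The arithmetic in this estimate can be written out as a small sketch. It assumes that both the disk rate and the decompression rate are measured in MiB/s of compressed input, and that reading and decompression overlap completely, so the slower of the two stages sets the pace. The function name `effectiveMiBps` is hypothetical, and the 80 MiB/s and 15x figures are the ones quoted in the post, not measurements.

```d
import std.algorithm : min;
import std.stdio : writefln;

/// Effective rate at which uncompressed data becomes available when disk
/// reads and decompression run in parallel.
/// Assumption: both input rates are in MiB/s of *compressed* data.
double effectiveMiBps(double diskMiBps, double inflateMiBps, double ratio)
{
    // The slower stage limits the pipeline; each compressed MiB
    // expands to `ratio` MiB of plain data.
    return min(diskMiBps, inflateMiBps) * ratio;
}

void main()
{
    immutable rate = effectiveMiBps(80, 80, 15);
    writefln("%.0f MiB/s = %.2f GiB/s", rate, rate / 1024);
    // prints "1200 MiB/s = 1.17 GiB/s"
}
```

If either stage is slower than the other, the estimate drops accordingly, which is why the post stresses running the two in parallel.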
Mar 08 2013
On Friday, 8 March 2013 at 15:31:46 UTC, Chris Cain wrote:
> On Friday, 8 March 2013 at 15:25:02 UTC, bioinfornatics wrote:
>> Why, when reading a huge file, does it take longer to get a line the further I advance into the file?
>
> ----
> StopWatch sw;
> while( !fastq1.empty ){
>     sw.start();
>     auto q1 = fastq1.next();
>     sw.stop();
>     writeln( sw.peek().msecs() );
> }
> ----
>
> That's because you never reset the StopWatch.

Oh ok, I thought sw.start() reset it to 0. Let's just say it's because it's Friday ... thanks ^^
Mar 08 2013