digitalmars.D.learn - How to speedup file processing?

Tyro (24/24) May 01 2007 Good day all,

Brad Anderson (6/34) May 01 2007 You might be better off making a file tailored specifically for your db'...

Tyro (23/59) May 02 2007 Thanks Brad,

Brad Anderson (6/70) May 02 2007 Hmm, sqlite... not sure of a bulk-load facility there. I have more expe...

Tyro (3/79) May 02 2007 0.698 seconds to process 9286 records.

Tyro <ridimz yahoo.com> writes:

Good day all,

running the below code snippet against a 800Kb file takes an ungodly
amount of time (anywhere from 25 to 35 minutes). Is there anything
I can do to speed the process up? 

// Load ID Standards File if no db exists
char[] idfile = cast(char[])read("idfile.ids");

while(idfile.length)
{
	//Avoid utf-8 Error message. Ideal situation would be to 
	//display the contents as it appears in the file but I'm
	// unsure how to accomplish that.
	if(!isascii(idfile[11])) {
		idfile = idfile[80..$];
		continue;
	}

	db.exec("INSERT INTO idstd
		values( null,
		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
	)");
	idfile = idfile[80..$];
}

May 01 2007

Brad Anderson <brad dsource.org> writes:

Tyro wrote:
 Good day all,
 
 running the below code snippet against a 800Kb file takes an ungodly
 amount of time (anywhere from 25 to 35 minutes). Is there anything
 I can do to speed the process up? 
 
 // Load ID Standards File if no db exists
 char[] idfile = cast(char[])read("idfile.ids");
 
 while(idfile.length)
 {
 	//Avoid utf-8 Error message. Ideal situation would be to 
 	//display the contents as it appears in the file but I'm
 	// unsure how to accomplish that.
 	if(!isascii(idfile[11])) {
 		idfile = idfile[80..$];
 		continue;
 	}
 
 	db.exec("INSERT INTO idstd
 		values( null,
 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
 	)");
 	idfile = idfile[80..$];
 }

You might be better off making a file tailored specifically for your db's bulk
copy program and using that.  I suspect your issue is all the inserts, not the
processing of your code.  Bulk copy usually turns off any logging during the
one huge insert, so that helps as well.

BA

May 01 2007

Tyro <ridimz yahoo.com> writes:

Brad Anderson Wrote:

 Tyro wrote:
 Good day all,
 
 running the below code snippet against a 800Kb file takes an ungodly
 amount of time (anywhere from 25 to 35 minutes). Is there anything
 I can do to speed the process up? 
 
 // Load ID Standards File if no db exists
 char[] idfile = cast(char[])read("idfile.ids");
 
 while(idfile.length)
 {
 	//Avoid utf-8 Error message. Ideal situation would be to 
 	//display the contents as it appears in the file but I'm
 	// unsure how to accomplish that.
 	if(!isascii(idfile[11])) {
 		idfile = idfile[80..$];
 		continue;
 	}
 
 	db.exec("INSERT INTO idstd
 		values( null,
 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
 	)");
 	idfile = idfile[80..$];
 }

 
 You might be better off making a file tailored specifically for your db's bulk
 copy program and using that.  I suspect your issue is all the inserts, not the
 processing of your code.  Bulk copy usually turns off any logging during the
 one huge insert, so that helps as well.
 
 BA

Thanks Brad,

Following that explanation I did some googleing and came up with the solution
below:

db.exec("BEGIN");
     // Insert while loop here
db.exec("COMMIT");

This led me to an arrayBounds Error on line 199 of sqlite_oo which happens to
be the fetch() originally implemented as such:

public char[][] fetch()
{
	return data[rowCursor++];
} 

After reimplemening it as:

public char[][] fetch()
{
	if(rowCursor < data.length)
		return data[rowCursor++];
	else
		return null;
}

The program compiles and runs correctly.

Again,
Thanks for your assistance.

Tyro

May 02 2007

Brad Anderson <brad dsource.org> writes:

Tyro wrote:
 Brad Anderson Wrote:
 
 Tyro wrote:
 Good day all,

 running the below code snippet against a 800Kb file takes an ungodly
 amount of time (anywhere from 25 to 35 minutes). Is there anything
 I can do to speed the process up? 

 // Load ID Standards File if no db exists
 char[] idfile = cast(char[])read("idfile.ids");

 while(idfile.length)
 {
 	//Avoid utf-8 Error message. Ideal situation would be to 
 	//display the contents as it appears in the file but I'm
 	// unsure how to accomplish that.
 	if(!isascii(idfile[11])) {
 		idfile = idfile[80..$];
 		continue;
 	}

 	db.exec("INSERT INTO idstd
 		values( null,
 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
 	)");
 	idfile = idfile[80..$];
 }

 You might be better off making a file tailored specifically for your db's bulk
 copy program and using that.  I suspect your issue is all the inserts, not the
 processing of your code.  Bulk copy usually turns off any logging during the
 one huge insert, so that helps as well.

 BA

 
 Thanks Brad,
 
 Following that explanation I did some googleing and came up with the solution
below:
 
 db.exec("BEGIN");
      // Insert while loop here
 db.exec("COMMIT");

ok, setting up a transaction.

 
 This led me to an arrayBounds Error on line 199 of sqlite_oo which happens to
be the fetch() originally implemented as such:

Hmm, sqlite... not sure of a bulk-load facility there.  I have more experience
w/ MSSQL, PGSQL, and MySQL

 
 public char[][] fetch()
 {
 	return data[rowCursor++];
 } 
 
 After reimplemening it as:
 
 public char[][] fetch()
 {
 	if(rowCursor < data.length)
 		return data[rowCursor++];
 	else
 		return null;
 }
 
 The program compiles and runs correctly.

Cool.  How fast?

BA

May 02 2007

Tyro <ridimz yahoo.com> writes:

Brad Anderson Wrote:

 Tyro wrote:
 Brad Anderson Wrote:
 
 Tyro wrote:
 Good day all,

 running the below code snippet against a 800Kb file takes an ungodly
 amount of time (anywhere from 25 to 35 minutes). Is there anything
 I can do to speed the process up? 

 // Load ID Standards File if no db exists
 char[] idfile = cast(char[])read("idfile.ids");

 while(idfile.length)
 {
 	//Avoid utf-8 Error message. Ideal situation would be to 
 	//display the contents as it appears in the file but I'm
 	// unsure how to accomplish that.
 	if(!isascii(idfile[11])) {
 		idfile = idfile[80..$];
 		continue;
 	}

 	db.exec("INSERT INTO idstd
 		values( null,
 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
 	)");
 	idfile = idfile[80..$];
 }

 You might be better off making a file tailored specifically for your db's bulk
 copy program and using that.  I suspect your issue is all the inserts, not the
 processing of your code.  Bulk copy usually turns off any logging during the
 one huge insert, so that helps as well.

 BA

 
 Thanks Brad,
 
 Following that explanation I did some googleing and came up with the solution
below:
 
 db.exec("BEGIN");
      // Insert while loop here
 db.exec("COMMIT");

 
 ok, setting up a transaction.
 
 
 This led me to an arrayBounds Error on line 199 of sqlite_oo which happens to
be the fetch() originally implemented as such:

 
 Hmm, sqlite... not sure of a bulk-load facility there.  I have more experience
 w/ MSSQL, PGSQL, and MySQL
 
 
 public char[][] fetch()
 {
 	return data[rowCursor++];
 } 
 
 After reimplemening it as:
 
 public char[][] fetch()
 {
 	if(rowCursor < data.length)
 		return data[rowCursor++];
 	else
 		return null;
 }
 
 The program compiles and runs correctly.

 
 Cool.  How fast?
 
 BA

0.698 seconds to process 9286 records.

Tyro

May 02 2007

D Programming

C/C++ Programming

Other

digitalmars.D.learn - How to speedup file processing?