
digitalmars.D.learn - Testing D database calls code for regression

reply aberba <karabutaworld gmail.com> writes:
How would you test D code which makes calls to a database, to detect 
bugs and regressions? Unlike cases where you can inject data, like 
assert(2 + 1 == 3), database interfacing code will be crazy... Or is 
there some mocking available for such cases? Especially when more 
features are developed on top.
Mar 16 2018
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via Digitalmars-d-learn wrote:
 How will you test D code which makes calls to database to detect bugs
 and regression. Unlike where you can inject data like assert (2+1 ==
 3), database interfacing code will be crazy... Or there's some mocking
 available for such cases. Especially when more features are developed
 on top.
The usual way I do this is to decouple the code from the real database backend by templatizing the database driver. Then in my unittest I can instantiate the template with a mock database driver that only implements the bare minimum to run the test.

For example, instead of:

	import database : Database;

	auto myQueryFunc(Args...)(Database db, Args args) {
		return db.query(...);
	}

Do this:

	import database : Database;

	auto myQueryFunc(Db = database.Database, Args...)(Db db, Args args) {
		return db.query(...);
	}

Then regular calls to myQueryFunc will call the real database backend, as usual. But in the unittest:

	unittest {
		struct FakeDb {
			auto query(...) {
				// mock implementation here
			}
		}
		FakeDb db;

		// test away
		assert(myQueryFunc(db, ...) == ...); // uses FakeDb
	}

This applies not only to database backends, but to just about anything you need to insert mockups for. For example, for testing complicated file I/O, I've found it useful to do this:

	auto myFunc(File = std.stdio.File, Args...)(Args args) {
		auto f = File(...);
		// do stuff with f
	}

	unittest {
		struct FakeFile {
			this(...) { ... } // mockup here
		}
		assert(myFunc!FakeFile(...) == ...);
	}

Using this method, you can even create tests for error handling, like a simulated filesystem that returns random (simulated) I/O errors, or exhibits various disk-full conditions (without actually filling up your real disk!), etc. I've created tests for code that searches directories for files by substituting a fake filesystem containing pre-determined sets of files whose contents only exist inside the unittest. This way, I can run these tests without actually modifying my real filesystem in any way.

If you push this idea far enough, you might be able to write unittests for simulated syscalls, too. :-D (Maybe that's something we could do in druntime... :-P)

T
-- 
May you live all the days of your life. -- Jonathan Swift
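A self-contained variant of the pattern above that actually compiles might look like this (FakeDb, the query string, and the pre-baked count are all invented for illustration; in real code the Db parameter would default to your actual driver type):

```d
// Mock driver implementing only what this one test needs.
// The query string and answer are invented for illustration.
struct FakeDb
{
    int query(string sql)
    {
        assert(sql == "SELECT COUNT(*) FROM users");
        return 42; // pre-baked answer, no real database involved
    }
}

// Production callers would instantiate this with the real driver type.
auto countUsers(Db)(ref Db db)
{
    return db.query("SELECT COUNT(*) FROM users");
}

unittest
{
    FakeDb db;
    assert(countUsers(db) == 42); // exercises the code path, sans DB
}
```

Since Db is inferred from the argument, the calling code looks identical whether a real or fake driver is passed in.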
Mar 16 2018
parent reply aberba <karabutaworld gmail.com> writes:
On Friday, 16 March 2018 at 21:15:33 UTC, H. S. Teoh wrote:
 On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via 
 Digitalmars-d-learn wrote:
 [...]
The usual way I do this is to decouple the code from the real database backend by templatizing the database driver. Then in my unittest I can instantiate the template with a mock database driver that only implements the bare minimum to run the test. [...]
Mocking a fake database can also be a huge pain, especially when something like transactions and prepared statements are involved. Imagine testing your mock for bugs introduced by future extensions.
Mar 18 2018
next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, March 18, 2018 19:51:18 aberba via Digitalmars-d-learn wrote:
 On Friday, 16 March 2018 at 21:15:33 UTC, H. S. Teoh wrote:
 On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via

 Digitalmars-d-learn wrote:
 [...]
The usual way I do this is to decouple the code from the real database backend by templatizing the database driver. Then in my unittest I can instantiate the template with a mock database driver that only implements the bare minimum to run the test. [...]
Mocking a fake database can also be a huge pain, especially when something like transactions and prepared statements are involved. Imagine testing your mock for bugs introduced by future extensions.
The other way would be to create a test database (or databases) and use those with the normal code, though you have less control over some stuff that way. What makes the most sense depends on what you're doing and how much you're able to really unit test the pieces, as opposed to component-testing large chunks of the code at once. And the reality of the matter is that sometimes testing is a pain, though in the long run it pretty much always saves time and pain, even if it's a pain to get set up.

- Jonathan M Davis
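One common way to keep a test database pristine between runs is to wrap each test in a transaction that is always rolled back. The sketch below uses an invented FakeConn to stand in for a real connection type (such as those in mysql-native or dpq2), so the rollback pattern itself can be demonstrated and tested without a server:

```d
// Stand-in for a real database connection; it just records every
// statement so the demo below can inspect what happened.
struct FakeConn
{
    string[] log;

    void exec(string sql) { log ~= sql; }
}

// Run `test` inside a transaction that is rolled back afterwards,
// whether the test passes or throws.
void withRollback(Conn)(ref Conn conn, void delegate() test)
{
    conn.exec("BEGIN");
    scope (exit) conn.exec("ROLLBACK"); // runs even on failure
    test();
}

unittest
{
    FakeConn conn;
    withRollback(conn, () { conn.exec("INSERT INTO t VALUES (1)"); });
    assert(conn.log == ["BEGIN", "INSERT INTO t VALUES (1)", "ROLLBACK"]);
}
```

With a real driver, the same withRollback wrapper lets the normal application code run unchanged against the test database while leaving no residue behind.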
Mar 18 2018
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Mar 18, 2018 at 07:51:18PM +0000, aberba via Digitalmars-d-learn wrote:
 On Friday, 16 March 2018 at 21:15:33 UTC, H. S. Teoh wrote:
 On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via Digitalmars-d-learn
 wrote:
 [...]
The usual way I do this is to decouple the code from the real database backend by templatizing the database driver. Then in my unittest I can instantiate the template with a mock database driver that only implements the bare minimum to run the test. [...]
Mocking a fake database can also be a huge pain, especially when something like transactions and prepared statements are involved.
It depends on what your test is looking for. The idea is that the mock database only implements a (small!) subset of a real database, basically just enough for the test to run, and nothing more. Of course, sometimes it may not be easy to do this, if the code being tested is very complex.
 Imagine testing your mock for bugs introduced by future extensions.
If you find yourself needing to test your mock database, then you're doing it wrong. :-D It's supposed to be helping you test your code, not to create more code that itself needs to be tested!

Basically, this kind of testing imposes certain requirements on the way you write your code. Certain kinds of code are easier to test than others. For example, imagine trying to test a complex I/O pipeline implemented as nested loops. It's basically impossible to test except as a black box (certain input sets must produce certain output sets). It's usually impractical for the test to target specific code paths nested deep inside a nested loop. The only thing you can do is hope and pray that your black-box tests cover enough of the code paths to ensure things are correct. But you're likely to miss certain exceptional cases.

But if said I/O pipeline is implemented as a series of range compositions, for example, then it becomes very easy to test each component of the composition. Each component is decoupled from the others, so it's easy for the unittest to check all code paths. Then it's much easier to have confidence that the composed pipeline itself is correct.

I/O pipelines are an easy example, but understandably, in real-world code things are rarely that clean. So you'll have to find a way of designing your database code such that it's more easily testable. Otherwise, it's going to be a challenge no matter what. No matter what you do, testing a function made of loops nested 5 levels deep is going to be very hard. Similarly, if your database code has very complex interdependencies, then it's going to be hard to test no matter how you try.

Anyway, on the more practical side of things, depending on what your test is trying to do, a mock database could be as simple as:

	struct MockDb {
		string prebakedResponse;
		auto query(string sql) {
			if (sql == "SELECT * FROM data")
				return prebakedResponse;
			else if (sql == "UPDATE * SET ... ")
				prebakedResponse = ...
			else
				assert(0, "Time to rewrite your unittest :-P");
		}
	}

I.e., you literally only need to implement what the test case will actually invoke. Anything that isn't strictly required is fair game to just outright ignore.

Also, keep in mind that MockDb can be a completely different thing per unittest. Trying to use the same mock DB for all unittests will just end up with you writing your own database engine, which kinda defeats the purpose. :-P But the ability to do this depends on how decoupled the code is. Code with complex interdependencies will generally give you a much harder time than more modular, decoupled code.

T
-- 
Knowledge is that area of ignorance that we arrange and classify. -- Ambrose Bierce
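A filled-in, compilable version of that MockDb sketch could look like this (the table name, statements, and responses are invented; the point is that query handles exactly the statements this one test issues and nothing else):

```d
// Minimal mock: knows one SELECT and one UPDATE, loudly rejects
// anything else so a drifting test fails fast.
struct MockDb
{
    string prebakedResponse = "alice;bob";

    string query(string sql)
    {
        if (sql == "SELECT * FROM data")
            return prebakedResponse;
        else if (sql == "UPDATE data SET name = 'carol'")
        {
            prebakedResponse = "carol;bob"; // simulate the update
            return "";
        }
        else
            assert(0, "Time to rewrite your unittest :-P");
    }
}

unittest
{
    MockDb db;
    assert(db.query("SELECT * FROM data") == "alice;bob");
    db.query("UPDATE data SET name = 'carol'");
    assert(db.query("SELECT * FROM data") == "carol;bob");
}
```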
Mar 18 2018
parent reply aberba <karabutaworld gmail.com> writes:
On Monday, 19 March 2018 at 00:56:26 UTC, H. S. Teoh wrote:
 On Sun, Mar 18, 2018 at 07:51:18PM +0000, aberba via 
 Digitalmars-d-learn wrote:
 On Friday, 16 March 2018 at 21:15:33 UTC, H. S. Teoh wrote:
 On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via 
 Digitalmars-d-learn wrote:
 [...]
The usual way I do this is to decouple the code from the real database backend by templatizing the database driver. Then in my unittest I can instantiate the template with a mock database driver that only implements the bare minimum to run the test. [...]
Mocking a fake database can also be a huge pain, especially when something like transactions and prepared statements are involved.
It depends on what your test is looking for. The idea is that the mock database only implements a (small!) subset of a real database, basically just enough for the test to run, and nothing more. Of course, sometimes it may not be easy to do this, if the code being tested is very complex.
 Imagine testing your mock for bugs introduced by future extensions.
If you find yourself needing to test your mock database, then you're doing it wrong. :-D It's supposed to be helping you test your code, not to create more code that itself needs to be tested!

Basically, this kind of testing imposes certain requirements on the way you write your code. Certain kinds of code are easier to test than others. For example, imagine trying to test a complex I/O pipeline implemented as nested loops. It's basically impossible to test except as a black box (certain input sets must produce certain output sets). It's usually impractical for the test to target specific code paths nested deep inside a nested loop. The only thing you can do is hope and pray that your black-box tests cover enough of the code paths to ensure things are correct. But you're likely to miss certain exceptional cases.

But if said I/O pipeline is implemented as a series of range compositions, for example, then it becomes very easy to test each component of the composition. Each component is decoupled from the others, so it's easy for the unittest to check all code paths. Then it's much easier to have confidence that the composed pipeline itself is correct.

I/O pipelines are an easy example, but understandably, in real-world code things are rarely that clean. So you'll have to find a way of designing your database code such that it's more easily testable. Otherwise, it's going to be a challenge
The thing about functional programming, where functions are decoupled and testable, doesn't seem to apply to database call code. I guess it's because databases introduce a different kind of state... another point of failure.
 no matter what.  No matter what you do, testing a function made 
 of loops nested 5 levels deep is going to be very hard.  
 Similarly, if your database code has very complex 
 interdependencies, then it's going to be hard to test no matter 
 how you try.
My code logic is a mix of file uploads which lead to saving file info into the db, plus some general queries... My worry has been that adding a feature might cause a regression in other, rarely executed code... It feels like I have to test all features/REST calls after every major change. I don't know how others do this... when there's some tight coupling involved.
 Anyway, on the more practical side of things, depending on what 
 your test is trying to do, a mock database could be as simple 
 as:

 	struct MockDb {
 		string prebakedResponse;
 		auto query(string sql) {
 			if (sql == "SELECT * FROM data")
 				return prebakedResponse;
 			else if (sql == "UPDATE * SET ... ")
 				prebakedResponse = ...
 			else
 				assert(0, "Time to rewrite your unittest :-P");
 		}
 	}

 I.e., you literally only need to implement what the test case 
 will actually invoke. Anything that isn't strictly required is 
 fair game to just outright ignore.

 Also, keep in mind that MockDb can be a completely different 
 thing per unittest. Trying to use the same mock DB for all 
 unittests will just end up with writing your own database 
 engine, which kinda defeats the purpose. :-P  But the ability 
 to do this depends on how decoupled the code is.  Code with 
 complex interdependencies will generally give you a much harder 
 time than more modular, decoupled code.


 T
Mar 19 2018
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 19, 2018 at 06:45:49PM +0000, aberba via Digitalmars-d-learn wrote:
[...]
 The thing about functional programming, where functions are
 decoupled and testable, doesn't seem to apply to database call code.
 I guess it's because databases introduce a different kind of
 state... another point of failure.
Not necessarily; in some cases it may be possible to design the code such that its logic can be tested independently of an actual database. But that may not be practical in your case, since it will likely involve a major rewrite.

Basically, it's pretty rare for an application to actually need the full range of the SQL language plus *all* of the features your database backend provides. Usually, the "business logic", so to speak, boils down to just some simple primitives: uploadFile(), createAccount(), loginUser(), logoutUser(), deleteAccount(), retrieveFile(), etc. Ideally, the business-logic part of the code should not even care whether there's a database in the back supporting these operations; it should be higher-level code built on top of these high-level primitives. There should definitely not be any literal SQL statements anywhere at this level. The business-logic side of the code should be completely testable with a mock API (with stubs for uploadFile, createAccount, etc.), and should not need to touch a real database at all.

In the middle level, where these primitives are implemented, is where you actually translate the high-level operations into SQL. If the high-level API is well designed, each operation should be pretty well encapsulated and should not cause unexpected conflicts with other operations.

[...]
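The layering described above can be sketched roughly as follows. All the names here (Storage, uploadFile, fileExists, MemStorage) are hypothetical stand-ins, not an API from the thread:

```d
// High-level primitives the business logic is written against.
// No SQL appears at this level.
interface Storage
{
    void uploadFile(string name, const(ubyte)[] data);
    bool fileExists(string name);
}

// Business logic: testable against any Storage implementation.
bool uploadUnlessPresent(Storage s, string name, const(ubyte)[] data)
{
    if (s.fileExists(name))
        return false; // already there, skip the upload
    s.uploadFile(name, data);
    return true;
}

unittest
{
    // In-memory stub; the real implementation would translate these
    // calls into SQL in the middle layer.
    class MemStorage : Storage
    {
        const(ubyte)[][string] files;
        void uploadFile(string name, const(ubyte)[] data) { files[name] = data; }
        bool fileExists(string name) { return (name in files) !is null; }
    }

    auto s = new MemStorage;
    assert(uploadUnlessPresent(s, "a.txt", [1, 2, 3]));
    assert(!uploadUnlessPresent(s, "a.txt", [4])); // second upload refused
}
```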
 My code logic is a mix of file uploads which lead to saving file info
 into the db, plus some general queries... My worry has been that
 adding a feature might cause a regression in other, rarely executed
 code... It feels like I have to test all features/REST calls after
 every major change. I don't know how others do this... when there's
 some tight coupling involved.
[...]

Sounds like you're not really doing *unit* testing anymore; that's a large-scale, application-wide regression test. For that, probably your best bet is to create test databases and do external testing with a mock network / test DB server. E.g., basically what the dmd testsuite does today: a directory of input files and expected output files, and some simple tools to automatically run through all of them. You could build up a library of test cases that you run your program through before release, to make sure nothing that has worked in the past stops working now.

T
-- 
If it breaks, you get to keep both pieces. -- Software disclaimer notice
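A dmd-testsuite-style runner like the one described can be sketched in a few lines. The directory layout (*.in files with matching *.out files) and the runCase hook are assumptions for illustration:

```d
import std.file : dirEntries, readText, SpanMode;
import std.path : setExtension;

// Run every test case in `dir`: feed each *.in file to `runCase` and
// compare the result against the sibling *.out file. Returns the
// number of mismatches.
int runAll(string dir, string function(string input) runCase)
{
    int failures;
    foreach (entry; dirEntries(dir, "*.in", SpanMode.shallow))
    {
        auto expected = readText(entry.name.setExtension("out"));
        if (runCase(readText(entry.name)) != expected)
            failures++;
    }
    return failures;
}
```

Here runCase would invoke the program under test (or one of its entry points); growing the directory of input/expected pairs over time gives you exactly the before-release regression library described above.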
Mar 19 2018
prev sibling next sibling parent nani <eduardomcbt gmail.com> writes:
On Friday, 16 March 2018 at 20:17:49 UTC, aberba wrote:
 How will you test D code which makes calls to database to 
 detect bugs and regression. Unlike where you can inject data 
 like assert (2+1 == 3), database interfacing code will be 
 crazy... Or there's some mocking available for such cases. 
 Especially when more features are developed on top.
Would type providers (https://docs.microsoft.com/en-us/dotnet/fsharp/tutorials/type-providers/) be possible with CTFE? That would be one way to test, at compile time, the functions that use the db.
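Nothing as elaborate as F# type providers exists in D, but CTFE does let you validate query strings at compile time, which captures part of the idea. The checker below is a deliberately naive illustration, not a real SQL parser:

```d
// Naive compile-time sanity check for a query string (illustration
// only; a real checker would parse the SQL properly).
bool looksLikeSelect(string sql)
{
    import std.algorithm.searching : canFind, startsWith;
    return sql.startsWith("SELECT ") && sql.canFind(" FROM ");
}

// The query is a template argument, so the constraint is evaluated
// via CTFE: a malformed query becomes a compile error.
string checkedQuery(string sql)()
if (looksLikeSelect(sql))
{
    return sql; // a real version would hand this to the driver
}

unittest
{
    static assert(__traits(compiles, checkedQuery!"SELECT x FROM t"()));
    static assert(!__traits(compiles, checkedQuery!"DELETE EVERYTHING"()));
}
```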
Mar 16 2018
prev sibling parent Ali <fakeemail example.com> writes:
On Friday, 16 March 2018 at 20:17:49 UTC, aberba wrote:
 How will you test D code which makes calls to database to 
 detect bugs and regression. Unlike where you can inject data 
 like assert (2+1 == 3), database interfacing code will be 
 crazy... Or there's some mocking available for such cases. 
 Especially when more features are developed on top.
Well, I am not really sure what you are looking for, but to test database code there are frameworks for this.

Check for example tsqlt ( http://tsqlt.org/ ). This framework is MS SQL specific and is the one I have experience with. There is also dbfit ( http://dbfit.github.io/dbfit/ ), which seems to support more database management systems.

tsqlt is very good for unit testing SQL procedures or statements. At a high level, this is how it was set up to be used:

1. You prepare an SQL procedure to create the tables that will be used in testing.
2. You prepare an SQL procedure with insert statements to create the data sample you want to be used for testing.
3. You write a script that executes the two procedures from the first two steps, then executes the procedure or statement you want to test, and at the end executes some assert statements. That is basically your unit test.

How the setup used those scripts:

1. The setup started a transaction.
2. The setup dropped everything in the database.
3. The setup executed the scripts from point 3 above to create your database, followed by the insert-statement (data creation) scripts, followed by executing the statement to be tested, and finally executing the assert statements.
4. Finally, the setup rolled back everything.

This setup was supported by the tsqlt framework, though honestly I don't know how much of it was specific to the environment where I worked. But you can use tsqlt to achieve this.

D is not a database language; D is not SQL. You should clearly separate testing the D code that calls the SQL statements from testing the SQL statements themselves. The above frameworks will help you test the SQL code in isolation.
Mar 19 2018