digitalmars.D - Summary on unit testing situation

bearophile <bearophileHUGS lycos.com> writes:

I have already written one or two times about this topic, but I think
summarizing the situation again a little can't hurt. Feel free to ignore this
post.

1) D follows Walter's theory that programmers are often lazy or in a rush, that
they are often not trained to use unit tests (especially if they come from C or
C++), and that they don't like to learn overly complex things. So it's better to
put into D the simplest possible means of doing something useful, in this case
writing unit tests. I was already "test-infected" before learning D, so I have
used unit tests in D from almost day zero, and I have found them very easy to
use. There's very little to learn: you just add some unittest{} blocks spread
across your modules, filled with normal code and asserts, plus a -unittest
argument for the compiler. (Catching expected exceptions and testing expected
compile-time errors is less easy: I have written a Throws!() function for the
first, and I use is() for the second. I also add a comment that says what I am
testing inside each unittest, and every thing gets a separate unittest, to keep
things a little more tidy.) It can't be simpler than this, so I think Walter
was right. In the future I hope to see D code in the wild that uses a good
amount of unittests (though I think Phobos currently does not have enough tests).
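
To make the workflow in point 1 concrete, here is a minimal sketch. `parsePositive` is a hypothetical function invented for the illustration, and Phobos's `std.exception.assertThrown` is used in place of the `Throws!()` helper mentioned above, whose code is not shown in this post.

```d
import std.conv : to, ConvException;
import std.exception : assertThrown, enforce;

/// Hypothetical function under test: parses a strictly positive integer.
int parsePositive(string s)
{
    immutable n = s.to!int;                  // throws ConvException on bad input
    enforce(n > 0, "value must be positive"); // throws Exception otherwise
    return n;
}

unittest // normal results
{
    assert(parsePositive("42") == 42);
}

unittest // expected exceptions
{
    assertThrown!ConvException(parsePositive("abc"));
    assertThrown!Exception(parsePositive("-3"));
}

unittest // expected compile-time errors
{
    // Passing an int where a string is required must not compile.
    static assert(!is(typeof(parsePositive(42))));
    static assert(!__traits(compiles, parsePositive(42)));
}
```

With a current dmd, something like `dmd -unittest -main -run example.d` should compile the file and run the three blocks before an empty `main`; the file name is only an example.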

2) Dynamic languages perform fewer sanity checks on the code, so programmers are
trained to write more unit tests. In Python/Ruby you must write unit tests, and
a good amount of them. In theory, in a statically typed language like D you can
skip some tests because the type system catches some problems for you, saving
you the time of writing them. In practice, most of the things you have to check
in normal D unit tests are not enforced by normal type systems (even by a type
system like D2's, which is better than Java's). I have seen that the tests I
don't need to put in D unit tests (because the type system catches those
errors) are only the very simple ones. All the other, slightly more complex
tests must be written in D unit tests too, as in Python. So the time saved is
not much. In D I write about 2-2.5 lines of tests for every line of code. In
Python I write about 2.5-3 lines of tests for every line of code. (But in
Python I often use doctests, which are an even faster way to write tests than D
unittests.)

3) The problem is that D unit tests are a toy. If you start writing programs
composed of many modules you want more flexibility. In the past I have written
about some of the important things missing from D unit testing, and I won't
repeat them here; ask me if you want another list. If you take a look at the
unit test systems of other languages, you can see that, for professional use,
the D ones are a toy. For example, in the Python standard library there are two
different unit test systems (they can be combined), and both are considerably
more refined than the D one. And people often use a third, external library
that ties things together, like the one called "nose".

How to solve this situation? There are various possibilities:
I) Remove the built-in unit testing from D, and wait for someone to write an
external "professional" unit test system for D. Such an external system may not
have nice/clean syntax/semantics.
II) Keep the built-in unit testing of D, but essentially all serious future
programmers will ignore it and use an external unit test system. This wastes
code in the compiler (and information in the heads of programmers, but not a
lot, because the built-in tests are very simple to learn) and it has the
disadvantages of solution I too. D newbies will be advised to move away from
the built-in unit tests as soon as possible.
III) Keep the built-in unit testing of D, and improve it until it becomes fit
for serious usage. This can make the compiler a bit too complex, and Walter
already has enough to do with the core of the front-end. Developing and
improving a serious unit test system is not too hard, but it's a full-time job
or close to it. Another drawback is that unit testing is not set in stone: in
ten years someone may invent a better way to do it, and at that point it will
be hard to change the compiler to support the newer kind of tests.
IV) Keep the built-in unit testing of D, keep it almost as simple as it is now,
but somehow add hooks and flexibility that allow external D code to refine *it*
as much as needed, so it can be used in professional situations too. (This
"external" code can be a Phobos module, or a third-party library written by
other people, or it can be born as an external lib and added to Phobos later,
as often happens in Python. That's why they say Python has "batteries
included"; such batteries often were not born in the std library.) This will
increase the complexity of the built-in unit tests, but probably not by much.
It may increase the complexity of the compiler a little, but I think this extra
complexity (some reflection, maybe) can then be used for other purposes too.

If nothing is done, the situation will most likely evolve to outcome 'II'
listed above, because the built-in tests are simply not good enough. (The
development of Tango to replace the not-good-enough Phobos1 is a clear example
of this. If the built-in is not good enough for serious usage AND there's no
good way to extend/improve its basic structure, then the community of D
programmers is forced to reject it entirely and build something
better/different. This is what happened with Tango in D1, and it can naturally
happen again with unit testing.)

Among those four solutions, the one I like most is 'IV', because it keeps the
work of developing the library out of Walter's busy hands, yet produces
something that can have nice enough syntax, with a compiler that is not too
complex, and it probably leaves room for future changes in how people do tests.
It also allows writing both very simple unit tests for novices or
single-module programs, as now, and professional/complex unit tests for harder
situations or larger projects.

If you agree with me that the best solution is IV, then those hooks/reflection
have to be designed first.

Bye,
bearophile

Mar 23 2010

Pelle Månsson <pelle.mansson gmail.com> writes:

I'm not sure I understand, could you explain?

I am not experienced with unittest frameworks, and would like to 
understand what the D system lacks.

Thank you.

Mar 23 2010

bearophile <bearophileHUGS lycos.com> writes:

Pelle M.:

>I'm not sure I understand, could you explain?<

That was my best explanation, sorry.


>I am not experienced with unittest frameworks, and would like to understand
what the D system lacks.<

I think I have written a list of those missing things twice in the past. To
give a good answer to your question I have to write a lot, and it's not nice to
write a lot when the words get ignored. So first the devs have to agree that a
problem exists, and only then can we design things to improve the situation.
Otherwise it's just a waste of my energy, like talking in a vacuum.

Unit testing has to continue when tests fail. All code must be testable,
compile-time code too. You need a way to assert that things go wrong when they
are designed to: exceptions, asserts, compile-time asserts, etc. It's good to
have a way to give names to tests. Unit test systems also benefit from some
reflection to organize themselves and to attach tests to code automatically.
During development you want to test only parts of the code, not the whole
program. Unit testing OOP code has other needs, because in a test you may need
to break the data hiding of classes and structs. If you unit test hundreds of
classes you soon need something that helps create fake testing objects, and
tools for creating mock test objects (objects that simulate external
resources). You need help with performance tests and with printing reports of
the testing. You need layers of testing: slow tests, and quick tests that you
can run every few minutes or seconds of programming. Generally, the more the
unit test system does automatically the better, because you want to write and
use unit tests as quickly as possible. Those things are useful, but putting
most of them inside a compiler is not a good idea.


add some unit tests, so you can learn what's useful and what is not. All unit
test systems have some documentation, you can start reading that too. In two
days you can learn more than I can ever tell you. If you don't try to use unit
testing you probably will not be able to understand my words :-)

Bye,
bearophile

Mar 23 2010

Pelle Månsson <pelle.mansson gmail.com> writes:

On 03/23/2010 08:29 PM, bearophile wrote:
> Pelle M.:
>
> >I'm not sure I understand, could you explain?<
>
> That was my best explanation, sorry.
>
> >I am not experienced with unittest frameworks, and would like to understand
> what the D system lacks.<
>
> I think two times in the past I have written a list of those lacking things.
> To give a good answer to your question I have to write a lot, and it's not nice
> to write a lot when the words get ignored. So first devs have to agree that a
> problem exists, then later we can design things to improve the situation.
> Otherwise it's just a waste of my energy, like trying to talk in vacuum.
>
> Unit testing has to continue when tests fail. All code must be testable,
> compile-time code too. You need a way to assert that things go wrong too, like
> exceptions, asserts, compile-time asserts, etc when they are designed to. It's
> good to have a way to give a name to tests. And unit test systems enjoy some
> reflection to organize themselves, to attach tests to code automatically.
> During development you want to test only parts of the code, not the whole
> program. Unit testing OOP code has other needs, because in a test you may need
> to break data hiding of classes and structs. If you unit test hundred of
> classes you soon find the necessity of something to help creation of fake
> testing objects. You need some tools for creating mock test objects (objects
> that simulate external resources). You need a help to perform performance
> tests, to print reports of the testing. You need layers of testing, slow tests
> and quick tests that you can run every few minutes or seconds of programming.
> Generally the more the unit test system does automatically the better it is,
> because you want to write and use unit tests in the most fast way possible.
> Those things are useful, but putting most of those things inside a compiler is
> not a good idea.
>
> add some unit tests, so you can learn what's useful and what is not. All unit
> test systems have some documentation, you can start reading that too. In two
> days you can learn more than I can ever tell you. If you don't try to use unit
> testing you probably will not be able to understand my words :-)
>
> Bye,
> bearophile

I see, and I think most of these problems are solvable within the 
language. For example, you could choose not to use asserts in unittests, 
and __traits should help in other cases.

Some of the problems may need a separate framework, so you are probably 
right about the need for improvement.

Mar 23 2010

Fawzi Mohamed <fawzi gmx.ch> writes:

On 23-mar-10, at 20:29, bearophile wrote:

> Pelle M.:
>
> >I'm not sure I understand, could you explain?<
>
> That was my best explanation, sorry.
>
> >I am not experienced with unittest frameworks, and would like to
> understand what the D system lacks.<
>
> I think two times in the past I have written a list of those lacking
> things. To give a good answer to your question I have to write a
> lot, and it's not nice to write a lot when the words get ignored. So
> first devs have to agree that a problem exists, then later we can
> design things to improve the situation. Otherwise it's just a waste
> of my energy, like trying to talk in vacuum.

Actually, there are some hooks in Tango (and, I believe, similar ones in
Phobos) to do what you want.

The module info contains the unittests, and you can replace the default
runner. For example, the unittester of Tango looks like this:

// import all modules to be tested here
import tango.io.Stdout;
import tango.core.Runtime;
import tango.core.stacktrace.TraceExceptions;

bool tangoUnitTester()
{
    uint countFailed = 0;
    uint countTotal = 1; // 1-based index of the next module to report
    Stdout ("NOTE: This is still fairly rudimentary, and will only report the").newline;
    Stdout ("    first error per module.").newline;
    foreach ( m; ModuleInfo )  // _moduleinfo_array )
    {
        if ( m.unitTest) {
            Stdout.format ("{}. Executing unittests in '{}' ", countTotal, m.name).flush;
            countTotal++;
            try {
                // runs all unittest blocks of the module as one function
                m.unitTest();
            }
            catch (Exception e) {
                // the first failure ends this module's tests; go on with the next module
                countFailed++;
                Stdout(" - Unittest failed.").newline;
                e.writeOut(delegate void(char[]s){ Stdout(s); });
                continue;
            }
            Stdout(" - Success.").newline;
        }
    }

    Stdout.format ("{} out of {} tests failed.", countFailed, countTotal - 1).newline;
    return true;
}

static this() {
    // install the runner above in place of the default one
    Runtime.moduleUnitTester( &tangoUnitTester );
}

void main() {}

One can do something fancier if one wants.
To really run every test, one would need an array (or iterator)
in the module information instead of a single global unittest
function. Alternatively, one could pass some flags to the unittest
function to control its execution.
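
As an illustration of the "array (or iterator)" idea: later D2 compilers expose each unittest block individually through `__traits(getUnitTests, ...)`, which was not available when this thread was written. The sketch below assumes such a compiler and the `-unittest` switch. The `name` attribute, `ArithmeticTests` and `runEach` are hypothetical names made up for the example, not part of Tango, Phobos or druntime.

```d
import core.runtime : Runtime;
import std.stdio : writefln;

/// User-defined attribute used here to name a test (an assumption of this
/// sketch; the language itself does not define it).
struct name { string value; }

/// Hypothetical group of tests. A struct is used so that the example does
/// not depend on the name of the module it is pasted into.
struct ArithmeticTests
{
    @name("addition") unittest { assert(1 + 1 == 2); }
    @name("division") unittest { assert(7 / 2 == 3); }
    @name("deliberate failure") unittest
    {
        int two = 2;
        assert(two * two == 5, "2 * 2 is not 5");
    }
}

shared static this()
{
    // Replace the default runner so the blocks are not also run (and
    // aborted on) at startup; returning true lets main() proceed.
    Runtime.moduleUnitTester = { return true; };
}

/// Runs every unittest block of `Group` separately and returns the number
/// of failures. Without -unittest the list of tests is empty.
size_t runEach(Group)()
{
    size_t failed = 0;
    foreach (test; __traits(getUnitTests, Group))
    {
        string label = "(anonymous)";
        foreach (attr; __traits(getAttributes, test))
        {
            static if (is(typeof(attr) == name))
                label = attr.value;
        }
        try
        {
            test();
            writefln("%s: ok", label);
        }
        catch (Throwable t) // a failed assert throws AssertError
        {
            ++failed;
            writefln("%s: FAILED (%s)", label, t.msg);
        }
    }
    return failed;
}
```

Calling `runEach!ArithmeticTests()` from `main` then reports each block on its own line and keeps going after the failing one. Catching `Throwable` is a shortcut that is reasonable only inside a test runner.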

> Unit testing has to continue when tests fail. All code must be
> testable, compile-time code too. You need a way to assert that
> things go wrong too, like exceptions, asserts, compile-time asserts,
> etc when they are designed to. It's good to have a way to give a
> name to tests. And unit test systems enjoy some reflection to
> organize themselves, to attach tests to code automatically. During
> development you want to test only parts of the code, not the whole
> program. Unit testing OOP code has other needs, because in a test
> you may need to break data hiding of classes and structs. If you
> unit test hundred of classes you soon find the necessity of
> something to help creation of fake testing objects. You need some
> tools for creating mock test objects (objects that simulate external
> resources). You need a help to perform performance tests, to print
> reports of the testing. You need layers of testing, slow tests and
> quick tests that you can run every few minutes or seconds of
> programming.
> Generally the more the unit test system does automatically the
> better it is, because you want to write and use unit tests in the
> most fast way possible. Those things are useful, but putting most of
> those things inside a compiler is not a good idea.

I think that what you want goes beyond the usual requests. Executing all
tests, the tests of one module, or a single test: yes, that should be
relatively simple.
More complex test series/combinations are probably better served by a
specialized regression tester.

Actually, I use a specialized tester that is parallel, and whose basic
building block is a testing function whose arguments are generated
automatically (derived types have to implement generating functions).
This is a somewhat different way of looking at tests than the usual one
(it is inspired by Haskell's QuickCheck), but one that I prefer.
In the end the power is the same: instead of fixtures that prepare a
test environment, you can define a derived type whose generating
function does the fixture's work, and then write the tests as functions
taking that type as argument.
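
A minimal sketch of this style of testing, assuming nothing about the actual tester described here (which is not shown in the thread): the argument type carries its own generating function, and the test is a function taking that type. `IntPair`, `generate` and `forAll` are hypothetical names, and a struct with a static function stands in for the derived types mentioned above.

```d
import std.random : Random, uniform;
import std.stdio : writefln;

/// Hypothetical generated argument type: a pair of small integers.
struct IntPair
{
    int a, b;

    /// The "generating function": builds a random instance.
    static IntPair generate(ref Random rng)
    {
        return IntPair(uniform(-1000, 1000, rng), uniform(-1000, 1000, rng));
    }
}

/// Calls `property` with `trials` generated arguments and returns true if
/// it held for all of them.
bool forAll(Arg)(bool function(Arg) property, uint trials = 100, uint seed = 42)
{
    auto rng = Random(seed); // fixed seed keeps a failing run reproducible
    foreach (i; 0 .. trials)
    {
        auto arg = Arg.generate(rng);
        if (!property(arg))
        {
            writefln("property failed for %s", arg);
            return false;
        }
    }
    return true;
}

unittest
{
    // Addition commutes for every generated pair.
    assert(forAll!IntPair((IntPair p) => p.a + p.b == p.b + p.a));
}
```

A fixture then becomes just another type with a `generate` function, for example one that opens a temporary file or builds a populated container before the test sees it.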

I normally organize test suites like the package structure.

What I have, and what would also be useful in Tango/Phobos, is a
prewritten main-like function, so that one can easily
create test suites for pieces of code.
I have, for example,
	int mainTestFun(char[][] argStr,SingleRTest testSuite)
which can be used to create a unittester that recognizes flags to
initialize it, perform subtests,...

Fawzi

Mar 23 2010

Bane <branimir.milosavljevic gmail.com> writes:

Pelle Månsson Wrote:

> I'm not sure I understand, could you explain?
>
> I am not experienced with unittest frameworks, and would like to
> understand what the D system lacks.
>
> Thank you.


Me too.

Mar 23 2010

Trip Volpe <mraccident gmail.com> writes:

bearophile Wrote:
> Among those four solutions the one I like more is the 'IV'. Because it keeps
> the work of developing the library out of the busy hands of Walter, but
> produces something that can have nice enough syntax, with a not too much
> complex compiler, and it probably allows for some future changes in how people
> do tests. It can also allow to write both very simple unit tests for novices or
> single-module programs as now, and professional/complex unit tests for harder
> situations or larger projects.
>
> If you agree with me that the better solution is the IV, then those
> hooks/reflection have to be designed first.

I definitely agree with this. The built-in unit tests are a great convenience,
but the provided facilities just aren't good enough _as they are_ for actual
use. Particular problems include the fact that all tests are anonymous, no
detailed contextual feedback is provided, and if an assert fails in one test in
a module, _all_ subsequent tests in that module will be aborted, even though
this makes no sense.

I've actually been doing a bit of solution IV already for my own projects.
Starting with the Runtime.moduleUnitTester() function, I've built a simple
framework that runs all the unit tests in the project but keeps track of
context as well, so every failure (checked with an accompanying set of
non-throwing "expect" functions) is logged by module, specific test, source
line, and nature of failure.
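
The framework described above is not shown in the thread, so the following is only a guess at the general shape of non-throwing "expect" functions. All names (`expect`, `expectEqual`, `Failure`, `currentTest`, `report`) are hypothetical; the only language feature relied on is that `__FILE__` and `__LINE__` used as default arguments are evaluated at the call site.

```d
import std.format : format;
import std.stdio : writefln;

/// One recorded expectation failure.
struct Failure
{
    string test;    // name of the test that was running
    string file;
    size_t line;
    string message;
}

Failure[] failures;        // collected instead of thrown
string currentTest = "?";  // set by each test before it starts checking

/// Non-throwing check: records a failure and lets the test continue.
bool expect(bool condition, lazy string message = "expectation failed",
            string file = __FILE__, size_t line = __LINE__)
{
    if (!condition)
        failures ~= Failure(currentTest, file, line, message);
    return condition;
}

/// Convenience wrapper that puts both values into the message.
bool expectEqual(T)(T actual, T expected,
                    string file = __FILE__, size_t line = __LINE__)
{
    return expect(actual == expected,
                  format("expected %s, got %s", expected, actual), file, line);
}

/// Prints everything that was recorded.
void report()
{
    foreach (f; failures)
        writefln("%s(%s): [%s] %s", f.file, f.line, f.test, f.message);
    writefln("%s failure(s)", failures.length);
}

unittest
{
    currentTest = "arithmetic";
    expectEqual(1 + 1, 2);
    expectEqual(2 * 2, 5); // recorded; execution continues
    expectEqual(3 - 1, 2); // still runs
}
```

Because nothing is thrown, a custom runner installed through `Runtime.moduleUnitTester` can call `report()` once after all modules have run.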

Just a few extra hooks provided to the programmer would enable the construction
of some very useful unit testing on the basis of D's provided primitives (which
I agree were an excellent idea). For one example, being able to name each test
or otherwise associate it with some form of contextual or identifying
information would be very nice.

I also posted some thoughts a while ago (
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=105682
) on how template alias parameters could be expanded to
alias complete expressions to allow for automatic logging of the particular
expression that caused an expectation failure. This is the sort of helpful
diagnostic already provided by any serious C or C++ unit testing system, and it
would be great to see it made possible in D.
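
One workaround that needs no language change is to pass the expression as a string and mix it in at the call site, so that its text can be logged as well as evaluated. This is a sketch of that workaround only, not of the alias extension proposed above; `check`, `recordCheck` and `failedChecks` are hypothetical names.

```d
import std.stdio : writefln;

size_t failedChecks;

/// Records the outcome of one check together with its source text.
void recordCheck(bool ok, string exprText, string file, size_t line)
{
    if (!ok)
    {
        ++failedChecks;
        writefln("%s(%s): expectation failed: %s", file, line, exprText);
    }
}

/// Builds the code for one check at compile time. The expression arrives as
/// a string, so it can be both evaluated and quoted in the log.
enum check(string expr) =
    "recordCheck((" ~ expr ~ "), q{" ~ expr ~ "}, __FILE__, __LINE__);";

unittest
{
    failedChecks = 0;
    int x = 2;
    mixin(check!"x + 1 == 3"); // passes, logs nothing
    mixin(check!"x * x == 5"); // fails, logs the text of the expression
    assert(failedChecks == 1);
}
```

The cost is the noisy `mixin(check!"...")` syntax at every call site, which is the kind of thing a language-level hook could remove.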

Mar 23 2010

Trip Volpe <mraccident gmail.com> writes:

Trip Volpe Wrote:
> ...and if an assert fails in one test in a module, _all_ subsequent tests in
> that module will be aborted, even though this makes no sense.

Actually I said this wrong. It's worse than that: after one assert failure,
_all_ further execution is aborted, meaning that even unit tests in _other_
modules will be prevented from running. And you can't change this behavior,
even if you override the assert failure handler, since for some reason the
compiler expects the handler to throw an AssertError, and if it doesn't, a
segfault may result.

Mar 23 2010

Pelle Månsson <pelle.mansson gmail.com> writes:

On 03/23/2010 08:10 PM, Trip Volpe wrote:
> Trip Volpe Wrote:
> > ...and if an assert fails in one test in a module, _all_ subsequent tests in
> > that module will be aborted, even though this makes no sense.
>
> Actually I said this wrong. It's worse than that: after one assert failure,
> _all_ further execution is aborted, meaning that even unit tests in _other_
> modules will be prevented from running. And you can't change this behavior,
> even if you override the assert failure handler, since for some reason the
> compiler expects the handler to throw an AssertError, and if it doesn't, a
> segfault may result.

The solution to this would be not to use asserts in unittests.

Mar 23 2010

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Wikipedia is usually a good place to start:

http://en.wikipedia.org/wiki/Unit_test

A couple of the references at the end of the Wikipedia article are pretty
good, also. The Unit Testing Guidelines gives a pretty good breakdown of what
to expect (or not) from unit testing.

Paul


Pelle Månsson Wrote:

> On 03/23/2010 08:10 PM, Trip Volpe wrote:
> > Trip Volpe Wrote:
> > > ...and if an assert fails in one test in a module, _all_ subsequent tests in
> > > that module will be aborted, even though this makes no sense.
> >
> > Actually I said this wrong. It's worse than that: after one assert failure,
> > _all_ further execution is aborted, meaning that even unit tests in _other_
> > modules will be prevented from running. And you can't change this behavior,
> > even if you override the assert failure handler, since for some reason the
> > compiler expects the handler to throw an AssertError, and if it doesn't, a
> > segfault may result.
>
> The solution to this would be not to use asserts in unittests.

Mar 23 2010

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Sorry, this should have replied to the message asking for more info!

Paul

Paul D. Anderson Wrote:

> Wikipedia is usually a good place to start:
>
> http://en.wikipedia.org/wiki/Unit_test
>
> A couple of the references at the end of the Wikipedia article are pretty
> good, also. The Unit Testing Guidelines gives a pretty good breakdown of what
> to expect (or not) from unit testing.
>
> Paul
>
> Pelle Månsson Wrote:
>
> > On 03/23/2010 08:10 PM, Trip Volpe wrote:
> > > Trip Volpe Wrote:
> > > > ...and if an assert fails in one test in a module, _all_ subsequent tests in
> > > > that module will be aborted, even though this makes no sense.
> > >
> > > Actually I said this wrong. It's worse than that: after one assert failure,
> > > _all_ further execution is aborted, meaning that even unit tests in _other_
> > > modules will be prevented from running. And you can't change this behavior,
> > > even if you override the assert failure handler, since for some reason the
> > > compiler expects the handler to throw an AssertError, and if it doesn't, a
> > > segfault may result.
> >
> > The solution to this would be not to use asserts in unittests.

Mar 23 2010

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

bearophile Wrote:

<snip>

> IV) Keep the built-in unit testing of D, keep them almost as simple as they
> are now, but somehow add hooks and flexibility to allow to external D code to
> refine *them* as much as needed (this "external" code can be a Phobos module,
> or it can be a third-part library written by other people, or it can be born as
> external lib and added to Phobos later, as it happens often in Python, that's
> why they say it has "batteries included", such batteries often were not born in
> the std library), so they can be used in professional situations too. This will
> increase the complexity of the built-in unit tests, but probably not much. It
> can increase the complexity of the compiler a little, but I think this extra
> complexity (some reflection, maybe) can be then used for other purposes too.

<snip>

> If you agree with me that the better solution is the IV, then those
> hooks/reflection have to be designed first.
>
> Bye,
> bearophile

I think your analysis is accurate. Having the simple unit testing built in is
better than not having it at all. I use it as much as I can, but I haven't
written complex applications -- just library modules, where it's perhaps more
suitable.

I'm not having much luck at conceptualizing the hooks/reflection that you refer
to. (Might just be having a slow day.) (Or, I might just be slow!)

It seems like we need some of the xUnit kind of tools -- test suites, more
elaborate assertions, test result reporting (not just halting), named tests, and
so on.

What is needed to support that?

* More elaborate asserts can be built from the basic assert. A library of
assert templates or functions doesn't need additional compiler support.

* Named tests are essential. I'm surprised names (and qualified names --
test.math.divide, etc.) aren't already available. So this would have to be a
part of the package.

* Test suites would depend, I think, on having names available. Again,
qualification may be necessary -- perhaps to include the module name.

* Test running needs to be extended. Running the tests before executing main is
better than not running the tests. But, as you say, that's really only suitable
for toy programs. We'd need some kind of control -- order of execution, action
on failure, etc.
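
As a rough idea of how little is needed for named tests, suites and control over failures, here is a library-only sketch. It assumes tests are registered by hand as plain functions; `NamedTest`, `register`, `Summary` and `runSuite` are hypothetical names, and nothing here hooks into the built-in unittest blocks.

```d
import std.algorithm.searching : startsWith;
import std.exception : enforce;
import std.stdio : writefln;

/// A test with a qualified name such as "math.divide".
struct NamedTest
{
    string name;
    void function() run;
}

NamedTest[] registry; // kept in registration order, which is the run order

void register(string name, void function() run)
{
    registry ~= NamedTest(name, run);
}

struct Summary { size_t ran, failed; }

/// Runs every registered test whose name starts with `prefix` (a "suite").
/// With `stopOnFailure` the run ends at the first failing test.
Summary runSuite(string prefix = "", bool stopOnFailure = false)
{
    Summary s;
    foreach (t; registry)
    {
        if (!t.name.startsWith(prefix))
            continue;
        ++s.ran;
        try
        {
            t.run();
        }
        catch (Exception e)
        {
            ++s.failed;
            writefln("%s: FAILED (%s)", t.name, e.msg);
            if (stopOnFailure)
                break;
        }
    }
    return s;
}

unittest
{
    registry = null;
    register("math.add",    { enforce(1 + 1 == 2); });
    register("math.divide", { enforce(7 / 2 == 4, "integer division truncates"); });
    register("text.length", { enforce("abc".length == 3); });

    auto s = runSuite("math."); // only the two math tests
    assert(s.ran == 2 && s.failed == 1);
}
```

What a library cannot do on its own is collect the existing `unittest` blocks under such names, which is where compiler support would come in.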

I don't know enough about unit testing or compiler writing to know how much
work is involved, but it seems that just a few "small additions" would go a
long way.

Paul

Mar 23 2010
