digitalmars.D - Summary on unit testing situation
- bearophile (14/14) Mar 23 2010 I have already written one or two times about this topic, but I think su...
- Pelle Månsson (4/4) Mar 23 2010 I'm not sure I understand, could you explain?
- bearophile (7/9) Mar 23 2010 I think two times in the past I have written a list of those lacking thi...
- Pelle Månsson (7/16) Mar 23 2010 I see, and I think most of these problems are solvable within the
- Fawzi Mohamed (71/102) Mar 23 2010 actually there are some hooks in tango (and I believe similarly in
- Bane (2/8) Mar 23 2010 Me too.
- Trip Volpe (5/9) Mar 23 2010 I definitely agree with this. The built-in unit tests are a great conven...
- Trip Volpe (2/3) Mar 23 2010 Actually I said this wrong. It's worse than that: after one assert failu...
- Pelle Månsson (2/5) Mar 23 2010 The solution to this would be not to use asserts in unittests.
- Paul D. Anderson (5/13) Mar 23 2010 Wikipedia is usually a good place to start:
- Paul D. Anderson (3/23) Mar 23 2010 Sorry, this should have replied to the message asking for more info!
- Paul D. Anderson (13/19) Mar 23 2010
I have already written one or two times about this topic, but I think summarizing the situation again can't hurt. Feel free to ignore this post.

1) D follows Walter's theory that programmers are often lazy or in a rush, that they are often not trained to use unit tests (especially if they come from C or C++), and that they don't like to learn overly complex things. So it's better to put into D the simplest possible means to do something useful, in this case to write unit tests.

I was already "test-infected" before learning D, so I have used unit tests in D from almost day zero, and I have found them very easy to use. There's very little to learn: just add some unittest{} blocks spread through the modules, filled with normal code and asserts, plus the -unittest argument for the compiler. (Catching expected exceptions and testing expected compile-time errors is less easy: I have written a Throws!() function for the first, and I use is() for the second. I also add a comment that says what I am testing inside each unittest, and every thing gets a separate unittest, to keep things a little tidier.) It can't be simpler than this, so I think Walter was right. In future I hope to see D code in the wild that uses a good amount of unittests (but I think Phobos currently does not have enough tests).

2) Dynamic languages perform fewer sanity checks on the code, so programmers are trained to write more unit tests. In Python/Ruby you must write unit tests, a good amount of them. In theory, in a statically typed language like D you can avoid some tests because the type system catches some problems for you, saving you the time to write them. In practice most of the things you have to check in normal D unit tests are not enforced by normal type systems (even a type system like the D2 one, which is better than Java's). I have seen that the tests I don't need to put in D unit tests (because the type system catches them) are only the very simple ones. All the other, slightly more complex tests must be written in D unit tests too, as in Python, so the time saved is not much. In D I write about 2-2.5 lines of tests for every line of code; in Python I write about 2.5-3 lines of tests for every line of code. (But in Python I often use doctests, which are a way to write tests that's even faster than D unittests.)

3) The problem is that D unit tests are a toy. If you start writing programs composed of many modules you want more flexibility. I have written in the past about some of the important things missing in D unit testing, and I won't repeat them here; ask me if you want another list. Compared with unit test systems meant for professional use, they are a toy. For example, in the Python standard library there are two different unit test systems (they can be combined), and both are considerably more refined than the D one. And people often use a third, external library that ties things together, like one called "nose".

How to solve this situation? There are various possibilities:

I) Remove the built-in unit testing of D, and wait for someone to write an external "professional" unit test system for D. This external unit test system may not have a nice/clean syntax/semantics.

II) Keep the built-in unit testing of D, but essentially all serious future programmers will ignore it and use an external unit test system. This wastes code in the compiler (and information in the head of programmers, but not a lot, because the built-in tests are very simple to learn) and has the disadvantages of solution I too. D newbies will be advised to move away from the built-in unit tests as soon as possible.

III) Keep the built-in unit testing of D, and improve it until it becomes fit for serious usage. This can make the compiler a bit too complex. Walter has enough to do already with the core of the front-end. Developing and improving a serious unit test system is not too hard, but it's a full-time or almost full-time job. Another bad thing is that unit testing is not set in stone: in ten years someone can invent a better way to do it, and at that point it will be hard to change the compiler to support the newer type of tests.

IV) Keep the built-in unit testing of D, keep it almost as simple as it is now, but somehow add hooks and flexibility that allow external D code to refine *it* as much as needed, so it can be used in professional situations too. (This "external" code can be a Phobos module, or a third-party library written by other people, or it can be born as an external lib and added to Phobos later, as often happens in Python; that's why they say it has "batteries included", and such batteries often were not born in the std library.) This will increase the complexity of the built-in unit tests, but probably not much. It can increase the complexity of the compiler a little, but I think this extra complexity (some reflection, maybe) can then be used for other purposes too.

If nothing is done then the situation will most likely evolve to outcome II listed above, because the built-in tests are simply not good enough. (The development of Tango to replace the not good enough Phobos1 is a clear example of this. If the built-in is not good enough for serious usage AND there's no good way to extend/improve its basic structure, then the community of D programmers is forced to reject it totally and build something better/different. This is what has happened with Tango in D1, and it can naturally happen again with unit testing.)

Among those four solutions the one I like most is IV, because it keeps the work of developing the library out of the busy hands of Walter, but produces something that can have nice enough syntax, with a compiler that is not too complex, and it probably allows for some future changes in how people do tests. It can also allow writing both very simple unit tests for novices or single-module programs, as now, and professional/complex unit tests for harder situations or larger projects.

If you agree with me that the best solution is IV, then those hooks/reflection have to be designed first.

Bye,
bearophile
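
For readers who have not used the built-in facility, the workflow described in point 1 might look roughly like the sketch below. It assumes a current D2 compiler and Phobos; `half` is a hypothetical function invented for the illustration, and `std.exception.assertThrown` stands in for the poster's own Throws!() helper, which is not shown in the thread.

```d
import std.exception : assertThrown;

/// Hypothetical function under test: halves an even number, rejects odd ones.
int half(int x)
{
    if (x % 2 != 0)
        throw new Exception("odd input");
    return x / 2;
}

// Normal behaviour, checked with plain asserts.
unittest
{
    assert(half(4) == 2);
    assert(half(0) == 0);
}

// An expected exception.
unittest
{
    assertThrown!Exception(half(3));
}

// An expected compile-time error: half must not accept a string.
unittest
{
    static assert(!is(typeof(half("four"))));
}

void main() {}
```

Compiling with something like `dmd -unittest -run example.d` runs the three blocks before main; without -unittest they are not compiled in.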
Mar 23 2010
I'm not sure I understand, could you explain? I am not experienced with unittest frameworks, and would like to understand what the D system lacks. Thank you.
Mar 23 2010
Pelle M.:
> I'm not sure I understand, could you explain?

That was my best explanation, sorry.

> I am not experienced with unittest frameworks, and would like to understand what the D system lacks.

I think two times in the past I have written a list of those lacking things. To give a good answer to your question I have to write a lot, and it's not nice to write a lot when the words get ignored. So first the devs have to agree that a problem exists, then later we can design things to improve the situation. Otherwise it's just a waste of my energy, like trying to talk in a vacuum.

Unit testing has to continue when tests fail. All code must be testable, compile-time code too. You need a way to assert that things go wrong when they are designed to: exceptions, asserts, compile-time asserts, etc. It's good to have a way to give a name to tests. Unit test systems also benefit from some reflection to organize themselves and to attach tests to code automatically. During development you want to test only parts of the code, not the whole program. Unit testing OOP code has other needs, because in a test you may need to break the data hiding of classes and structs. If you unit test hundreds of classes you soon find you need something to help create fake testing objects, and tools for creating mock test objects (objects that simulate external resources). You need help to run performance tests and to print reports of the testing. You need layers of testing: slow tests, and quick tests that you can run every few minutes or seconds of programming.

Generally the more the unit test system does automatically the better it is, because you want to write and use unit tests in the fastest way possible. Those things are useful, but putting most of those things inside a compiler is not a good idea.

[...] add some unit tests, so you can learn what's useful and what is not. All unit test systems have some documentation, you can start reading that too. In two days you can learn more than I can ever tell you. If you don't try to use unit testing you probably will not be able to understand my words :-)

Bye,
bearophile
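
As a small illustration of the "fake testing objects" point above, here is a minimal hand-written fake in plain D, with no framework involved. All names (`Clock`, `FakeClock`, `isExpired`) are hypothetical and chosen only for the sketch; a mocking library would generate something like `FakeClock` automatically.

```d
/// Hypothetical external resource: something that reports the current time.
interface Clock
{
    long now();
}

/// Hand-written fake used only by tests; it returns whatever the test sets.
class FakeClock : Clock
{
    long current;
    this(long start) { current = start; }
    long now() { return current; }
}

/// Code under test: it depends on the interface, not on the real clock.
bool isExpired(Clock clock, long deadline)
{
    return clock.now() > deadline;
}

unittest
{
    auto clock = new FakeClock(100);
    assert(!isExpired(clock, 150)); // before the deadline
    clock.current = 200;            // "advance time" without waiting
    assert(isExpired(clock, 150));  // after the deadline
}

void main() {}
```

Writing one such fake is easy; the post's point is that doing it by hand for hundreds of classes is what calls for tool support.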
Mar 23 2010
On 03/23/2010 08:29 PM, bearophile wrote:
> Unit testing has to continue when tests fail. All code must be testable, compile-time code too. You need a way to assert that things go wrong too, like exceptions, asserts, compile-time asserts, etc when they are designed to. It's good to have a way to give a name to tests.
> <snip>
> Generally the more the unit test system does automatically the better it is, because you want to write and use unit tests in the most fast way possible. Those things are useful, but putting most of those things inside a compiler is not a good idea.
> <snip>

I see, and I think most of these problems are solvable within the language. For example, you could choose not to use asserts in unittests, and __traits should help in other cases. Some of the problems may need a separate framework, so you are probably right about the need for improvement.
Mar 23 2010
On 23-mar-10, at 20:29, bearophile wrote:
> Pelle M.:
>> I am not experienced with unittest frameworks, and would like to understand what the D system lacks.
> I think two times in the past I have written a list of those lacking things.
> <snip>

Actually there are some hooks in Tango (and I believe similarly in Phobos) to do what you want. The module info contains the unittests, and you can replace the default runner. For example, the unittester of Tango looks like this:

    // import all modules to be tested
    import tango.io.Stdout;
    import tango.core.Runtime;
    import tango.core.stacktrace.TraceExceptions;

    bool tangoUnitTester()
    {
        uint countFailed = 0;
        uint countTotal = 1;
        Stdout ("NOTE: This is still fairly rudimentary, and will only report the").newline;
        Stdout (" first error per module.").newline;
        // each module exposes a single function that runs all its unittests
        foreach ( m; ModuleInfo ) // _moduleinfo_array )
        {
            if ( m.unitTest )
            {
                Stdout.format ("{}. Executing unittests in '{}' ", countTotal, m.name).flush;
                countTotal++;
                try
                {
                    m.unitTest();
                }
                catch (Exception e)
                {
                    // record the failure and move on to the next module
                    countFailed++;
                    Stdout(" - Unittest failed.").newline;
                    e.writeOut(delegate void(char[] s){ Stdout(s); });
                    continue;
                }
                Stdout(" - Success.").newline;
            }
        }
        Stdout.format ("{} out of {} tests failed.", countFailed, countTotal - 1).newline;
        return true;
    }

    static this()
    {
        // install the runner in place of the default one
        Runtime.moduleUnitTester( &tangoUnitTester );
    }

    void main() {}

One can do something fancier if one wants. To really have all tests one would need an array (or iterator) in the module information instead of a single global unittest function. Alternatively one could pass some flags to the unittest function to control its execution.

> Unit testing has to continue when tests fail.
> <snip>
> You need layers of testing, slow tests and quick tests that you can run every few minutes or seconds of programming.
> <snip>

I think that what you want is beyond normal requests. Executing all tests, the tests of one module, or a single test: yes, that should be relatively simple. More complex test series/combinations are probably better served by a specialized regression tester.

Actually I use a specialized tester that is parallel, and whose basic testing building block is a testing function whose arguments are generated automatically (derived types have to implement generating functions). This is a somewhat different way to look at tests than the usual one (inspired by Haskell's QuickCheck), but one that I prefer. In the end the power is the same: instead of fixtures to prepare a test environment, you can define a derived type whose generating function does the fixture's work, and then write the tests as functions taking that type as argument. Test suites I normally organize like the package structure.

What I have, and something like it would also be useful in Tango/Phobos, is a pre-written main-like function, so that one can easily create test suites for pieces of code. I have for example

    int mainTestFun(char[][] argStr, SingleRTest testSuite)

which can be used to create a unittester that recognizes flags to initialize it, perform subtests, ...

Fawzi
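
For comparison, a rough D2 counterpart of the Tango runner shown in this post could look like the sketch below. It assumes a reasonably recent druntime that provides `Runtime.extendedModuleUnitTester` and `UnitTestResult` in `core.runtime`; the function name `reportingUnitTester` is made up for the example, and this is not the code any poster in the thread actually used.

```d
import core.runtime : Runtime, UnitTestResult;
import std.stdio : writefln;

/// Runs every module's unittest function, continuing after a failure.
UnitTestResult reportingUnitTester()
{
    UnitTestResult result;
    foreach (m; ModuleInfo)
    {
        if (m is null)
            continue;
        auto test = m.unitTest; // null when the module has no unittests
        if (test is null)
            continue;
        ++result.executed;
        try
        {
            test();
            ++result.passed;
            writefln("%s: ok", m.name);
        }
        catch (Throwable t)
        {
            // Report and carry on with the next module.
            writefln("%s: FAILED (%s)", m.name, t.msg);
        }
    }
    result.summarize = true;                           // let druntime print totals
    result.runMain = result.executed == result.passed; // skip main on failure
    return result;
}

shared static this()
{
    // Install the runner in place of the default one.
    Runtime.extendedModuleUnitTester = &reportingUnitTester;
}

unittest { assert(1 + 1 == 2); }

void main() {}
```

The limitation pointed out in the post still applies: each module exposes one combined unittest function, so the first failing assert in a module still skips the rest of that module's tests, while other modules go on running.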
Mar 23 2010
Pelle Månsson Wrote:
> I'm not sure I understand, could you explain? I am not experienced with unittest frameworks, and would like to understand what the D system lacks. Thank you.

Me too.
Mar 23 2010
bearophile Wrote:
> Among those four solutions the one I like most is IV, because it keeps the work of developing the library out of the busy hands of Walter, but produces something that can have nice enough syntax, with a compiler that is not too complex, and it probably allows for some future changes in how people do tests. It can also allow writing both very simple unit tests for novices or single-module programs, as now, and professional/complex unit tests for harder situations or larger projects.
>
> If you agree with me that the best solution is IV, then those hooks/reflection have to be designed first.

I definitely agree with this. The built-in unit tests are a great convenience, but the provided facilities just aren't good enough _as they are_ for actual use. Particular problems include the fact that all tests are anonymous, no detailed contextual feedback is provided, and if an assert fails in one test in a module, _all_ subsequent tests in that module will be aborted, even though this makes no sense.

I've actually been doing a bit of solution IV already for my own projects. Starting with the Runtime.moduleUnitTester() function, I've built a simple framework that runs all the unit tests in the project but keeps track of context as well, so every failure (checked with an accompanying set of non-throwing "expect" functions) is logged by module, specific test, source line, and nature of failure.

Just a few extra hooks provided to the programmer would enable the construction of some very useful unit testing on the basis of D's provided primitives (which I agree were an excellent idea). For one example, being able to name each test or otherwise associate it with some form of contextual or identifying information would be very nice.

I also posted some thoughts a while ago ( http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=105682 ) on how template alias parameters could be expanded to alias complete expressions, to allow automatic logging of the particular expression that caused an expectation failure. This is the sort of helpful diagnostic already provided by any serious C or C++ unit testing system, and it would be great to see it made possible in D.
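
The non-throwing "expect" functions mentioned in this post are not shown in the thread. A minimal sketch of the idea, with all names (`Failure`, `failures`, `expect`, `expectEqual`) invented for the illustration, might be:

```d
import std.stdio : writefln;

/// One recorded expectation failure.
struct Failure
{
    string file;
    size_t line;
    string message;
}

Failure[] failures; // thread-local log, enough for a single-threaded run

/// Non-throwing check: records a failure and lets the test carry on.
bool expect(bool condition, string message = "expectation failed",
            string file = __FILE__, size_t line = __LINE__)
{
    if (!condition)
        failures ~= Failure(file, line, message);
    return condition;
}

/// Convenience wrapper that reports both values on a mismatch.
bool expectEqual(T)(T actual, T expected,
                    string file = __FILE__, size_t line = __LINE__)
{
    import std.conv : text;
    return expect(actual == expected,
                  text("got ", actual, ", expected ", expected), file, line);
}

unittest
{
    expectEqual(2 + 2, 4);
    expectEqual(2 * 3, 7);      // recorded, but the next line still runs
    expect("abc".length == 3);
    assert(failures.length == 1);
    assert(failures[0].message == "got 6, expected 7");
    failures = null;            // reset so main reports only real failures
}

void main()
{
    foreach (f; failures)
        writefln("%s(%s): %s", f.file, f.line, f.message);
}
```

Because `__FILE__` and `__LINE__` are default arguments, they are filled in at the call site, which gives the "source line" part of the logging described above. The module and test names would still have to be supplied by hand, which is exactly the missing hook the post asks for.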
Mar 23 2010
Trip Volpe Wrote:
> ...and if an assert fails in one test in a module, _all_ subsequent tests in that module will be aborted, even though this makes no sense.

Actually I said this wrong. It's worse than that: after one assert failure, _all_ further execution is aborted, meaning that even unit tests in _other_ modules will be prevented from running. And you can't change this behavior, even if you override the assert failure handler, since for some reason the compiler expects the handler to throw an AssertError, and if it doesn't, a segfault may result.
Mar 23 2010
On 03/23/2010 08:10 PM, Trip Volpe wrote:
> Trip Volpe Wrote:
>> ...and if an assert fails in one test in a module, _all_ subsequent tests in that module will be aborted, even though this makes no sense.
> Actually I said this wrong. It's worse than that: after one assert failure, _all_ further execution is aborted, meaning that even unit tests in _other_ modules will be prevented from running. And you can't change this behavior, even if you override the assert failure handler, since for some reason the compiler expects the handler to throw an AssertError, and if it doesn't, a segfault may result.

The solution to this would be not to use asserts in unittests.
Mar 23 2010
Wikipedia is usually a good place to start: http://en.wikipedia.org/wiki/Unit_test

A couple of the references at the end of the Wikipedia article are pretty good, also. The Unit Testing Guidelines gives a pretty good breakdown of what to expect (or not) from unit testing.

Paul

Pelle Månsson Wrote:
> On 03/23/2010 08:10 PM, Trip Volpe wrote:
>> <snip>
> The solution to this would be not to use asserts in unittests.
Mar 23 2010
Sorry, this should have replied to the message asking for more info!

Paul

Paul D. Anderson Wrote:
> Wikipedia is usually a good place to start: http://en.wikipedia.org/wiki/Unit_test
> <snip>
Mar 23 2010
bearophile Wrote:
> <snip>
> IV) Keep the built-in unit testing of D, keep it almost as simple as it is now, but somehow add hooks and flexibility that allow external D code to refine *it* as much as needed, so it can be used in professional situations too. (This "external" code can be a Phobos module, or a third-party library written by other people, or it can be born as an external lib and added to Phobos later, as often happens in Python; that's why they say it has "batteries included", and such batteries often were not born in the std library.) This will increase the complexity of the built-in unit tests, but probably not much. It can increase the complexity of the compiler a little, but I think this extra complexity (some reflection, maybe) can then be used for other purposes too.
> <snip>
> If you agree with me that the best solution is IV, then those hooks/reflection have to be designed first.
>
> Bye,
> bearophile

I think your analysis is accurate. Having the simple unit testing built in is better than not having it at all. I use it as much as I can, but I haven't written complex applications -- just library modules, where it's perhaps more suitable.

I'm not having much luck at conceptualizing the hooks/reflection that you refer to. (Might just be having a slow day.) (Or, I might just be slow!) It seems like we need some of the xUnit kind of tools -- test suites, more elaborate assertions, test result reporting (not just halting), named tests, and so on. What is needed to support that?

* More elaborate asserts can be built from the basic assert. A library of assert templates or functions doesn't need additional compiler support.
* Named tests are essential. I'm surprised names (and qualified names -- test.math.divide, etc.) aren't already available. So this would have to be a part of the package.
* Test suites would depend, I think, on having names available. Again, qualification may be necessary -- perhaps to include the module name.
* Test running needs to be extended. Running the tests before executing main is better than not running the tests. But, as you say, that's really only suitable for toy programs. We'd need some kind of control -- order of execution, action on failure, etc.

I don't know enough about unit testing or compiler writing to know how much work is involved, but it seems that just a few "small additions" would go a long way.

Paul
Mar 23 2010