digitalmars.D.announce - Unit Testing in Action

Mike Parker <aldacron gmail.com> writes:
After a couple of weeks of quiet on the D blog, it's about to get 
noisy again. The latest is a post by Mario Kröplin of Funkwerk 
describing how the company now uses D's built-in tests in their 
codebase after several years of using third-party frameworks.

Blog:
https://dlang.org/blog/2017/10/20/unit-testing-in-action/

Reddit:
https://www.reddit.com/r/programming/comments/77m8r8/ds_builtin_unit_testing_in_action/
Oct 20 2017
qznc <qznc web.de> writes:
On Friday, 20 October 2017 at 14:04:25 UTC, Mike Parker wrote:
 After a couple of weeks of quiet on the D blog, it's about to 
 get noisy again. The latest is a post by Mario Kröplin of 
 Funkwerk describing how the company now uses D's built-in tests 
 in their codebase after several years of using third-party 
 frameworks.

 Blog:
 https://dlang.org/blog/2017/10/20/unit-testing-in-action/

 Reddit:
 https://www.reddit.com/r/programming/comments/77m8r8/ds_builtin_unit_testing_in_action/
Thanks for this post. Personally, I have not really hit the pain points described here, so I learned something. It is a valuable comparison of different unit-testing libraries and which aspects they tackle.

I took the following items from the post:

* Phobos should provide a UnitTestError class, so we can separate expectation libraries (which throw) from execution libraries (which catch). The community is not ready to decide on the best library, so we need to try things, and this separation would make that easier.
* fluent-asserts is considered the best expectations library. Syntax is `(x + y).should.equal(42).because("of test reasons");` and it gives nice output with code snippets.
* unit-threaded is considered the best execution library, because it shows description strings for each test. The parallelization feature did not work out for the author.
* Coverage is not sufficiently solved. The author suggests reformatting code so that short-circuit evaluations span multiple lines?
* Fixtures and test parameters do not require special support, because built-in features like static foreach are sufficient.
Oct 20 2017
Martin Nowak <code dawg.eu> writes:
On Friday, 20 October 2017 at 21:26:35 UTC, qznc wrote:
 * coverage is not sufficiently solved. The author suggests to 
 reformat code so short-circuit evaluations become multiple 
 lines?
If you can use gdc or ldc, branch coverage should be supported out of the box. Other tools support regions to be marked as unreachable, e.g. GCOVR_EXCL_START/GCOVR_EXCL_STOP.

I'd also err on the side that unittests themselves should not be part of coverage, but an option in druntime and more metadata from dmd might solve this. Filed under https://issues.dlang.org/show_bug.cgi?id=17923.
Oct 21 2017
Walter Bright <newshound2 digitalmars.com> writes:
On 10/21/2017 6:14 AM, Martin Nowak wrote:
 On Friday, 20 October 2017 at 21:26:35 UTC, qznc wrote:
 * coverage is not sufficiently solved. The author suggests to reformat code so 
 short-circuit evaluations become multiple lines?
 If you can use gdc or ldc, branch coverage should be supported out of 
 the box. Other tools support regions to be marked as unreachable, e.g. 
 GCOVR_EXCL_START/GCOVR_EXCL_STOP. I'd also err on the side that 
 unittests themselves should not be part of coverage, but an option in 
 druntime and more metadata from dmd might solve this. Filed under 
 https://issues.dlang.org/show_bug.cgi?id=17923.
Not sure what is meant by branch coverage.

Consider:

          x = 2;
          if (x == 1 || x == 2)

Coverage would give:

    1|    x = 2;
    2|    if (x == 1 || x == 2)

I.e. the second line gets an execution count of 2. By contrast,

    1|    x = 1;
    1|    if (x == 1 || x == 2)

What's happening here is each of the operands of || are considered to be separate statements as far as coverage analysis goes. It becomes clearer if it is reformatted as:

    1|    x = 2;
    1|    if (x == 1 ||
    1|        x == 2)

or:

    3|    x = 2; if (x == 1 || x == 2)

It's usually possible to trivially suss out the coverage of the clauses by looking at the preceding and succeeding line counts. Putting the clauses on separate lines also works.

If there's a better way to display the various counts, please add it to the bugzilla report.
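
The counting described above can be imitated by hand. The following is a minimal sketch, not how dmd's `-cov` is implemented: the `hits` array and the `counted`, `check` and `lineCount` helpers are made up for illustration. Each operand of `||` bumps its own counter, and the per-line figure is the sum of the counters for the operands on that line.

```d
import std.stdio : writeln;

// Hypothetical counters, one per operand of `||` (not dmd internals).
int[2] hits;

// Records that operand `i` was evaluated, then passes its value through.
bool counted(size_t i, bool value)
{
    ++hits[i];
    return value;
}

// Same condition as in the post, with each operand counted separately.
// The right operand is only evaluated when the left one is false.
bool check(int x)
{
    return counted(0, x == 1) || counted(1, x == 2);
}

// Sum of both operand counters for a single call: the number a
// line-based report would show when both operands share one line.
int lineCount(int x)
{
    hits[] = 0;
    check(x);
    return hits[0] + hits[1];
}

void main()
{
    writeln(lineCount(2)); // x == 1 is false, so x == 2 runs too: prints 2
    writeln(lineCount(1)); // x == 1 is true, so x == 2 is skipped: prints 1
}
```

The two printed values correspond to the counts of 2 and 1 shown on the `if` line in the listings above.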
Oct 21 2017
Nordlöw <per.nordlow gmail.com> writes:
On Saturday, 21 October 2017 at 22:50:51 UTC, Walter Bright wrote:
 What's happening here is each of the operands of || are 
 considered to be separate statements as far as coverage 
 analysis goes. It becomes clearer if it is reformatted as:

 1|    x = 2;
 1|    if (x == 1 ||
 1|        x == 2)

 or:

 3|    x = 2; if (x == 1 || x == 2)
What about (adding a flag) making coverage operate at the expression level instead? Meaning that each coverage result would be associated with a column offset and length as well as the line number. Of course the program would run much slower...
Oct 22 2017
Martin Nowak <code dawg.eu> writes:
On Saturday, 21 October 2017 at 22:50:51 UTC, Walter Bright wrote:
 Coverage would give:

 1|    x = 2;
 2|    if (x == 1 || x == 2)

 I.e. the second line gets an execution count of 2. By contrast,

 1|    x = 1;
 1|    if (x == 1 || x == 2)
Interesting point, but would likely fail for more complex stuff.

    1|    stmt;
    2|    if (api1 == 1 && api2 == 2 ||
                api2 == 2 && api3 == 3)

Anyhow, I think the current state is good enough and there are gdc/ldc for further coverage features.
Oct 23 2017
Walter Bright <newshound2 digitalmars.com> writes:
On 10/23/2017 4:44 AM, Martin Nowak wrote:
 On Saturday, 21 October 2017 at 22:50:51 UTC, Walter Bright wrote:
 Coverage would give:

 1|    x = 2;
 2|    if (x == 1 || x == 2)

 I.e. the second line gets an execution count of 2. By contrast,

 1|    x = 1;
 1|    if (x == 1 || x == 2)
 Interesting point, but would likely fail for more complex stuff.

 1|    stmt;
 2|    if (api1 == 1 && api2 == 2 ||
             api2 == 2 && api3 == 3)
There would be a separate coverage count for line 3, which would be the sum of counts for (api2 == 2) and (api3 == 3).

Generally, if this is inadequate, just split the expression into more lines. The same goes for `for` loop statements and the ?: operator.
 Anyhow, I think the current state is good enough and there are gdc/ldc for 
 further coverage features.
Oct 24 2017
qznc <qznc web.de> writes:
On Tuesday, 24 October 2017 at 20:51:36 UTC, Walter Bright wrote:
 On 10/23/2017 4:44 AM, Martin Nowak wrote:
 On Saturday, 21 October 2017 at 22:50:51 UTC, Walter Bright 
 wrote:
 Coverage would give:

 1|    x = 2;
 2|    if (x == 1 || x == 2)

 I.e. the second line gets an execution count of 2. By 
 contrast,

 1|    x = 1;
 1|    if (x == 1 || x == 2)
 Interesting point, but would likely fail for more complex stuff.

 1|    stmt;
 2|    if (api1 == 1 && api2 == 2 ||
             api2 == 2 && api3 == 3)
 There would be a separate coverage count for line 3 which would be the 
 sum of counts for (api2 == 2) and (api3 == 3).

 Generally, if this is inadequate, just split the expression into more 
 lines.
An example for inadequate is when you cannot see which expression is not covered:

    2|    if (api1 == 1 && api2 == 2 || api3 == 3)

Just splitting the expression is suggested in the blog post, but in an automatic fashion via dfmt. That is not elegant. The information is there just not expressed in a useable way.
Oct 24 2017
Ali Çehreli <acehreli yahoo.com> writes:
On 10/24/2017 01:51 PM, Walter Bright wrote:
 On 10/23/2017 4:44 AM, Martin Nowak wrote:
 There would be a separate coverage count for line 3 which would be the
 sum of counts for (api2 == 2) and (api3 == 3).

 Generally, if this is inadequate, just split the expression into more
 lines.
It would be very useful if the compiler could do that automatically.

Ali
Oct 24 2017
Walter Bright <newshound2 digitalmars.com> writes:
On 10/24/2017 3:06 PM, Ali Çehreli wrote:
 It would be very useful if the compiler could do that automatically.
On 10/24/2017 2:58 PM, qznc wrote:
 The information is there just not expressed in a useable way.
The problem is how to display it in a text file with the original source code.
Oct 24 2017
Ali Çehreli <acehreli yahoo.com> writes:
On 10/24/2017 07:15 PM, Walter Bright wrote:
 On 10/24/2017 3:06 PM, Ali Çehreli wrote:
 It would be very useful if the compiler could do that automatically.
 On 10/24/2017 2:58 PM, qznc wrote:
 The information is there just not expressed in a useable way.
 The problem is how to display it in a text file with the original 
 source code.
I wouldn't mind as ugly as needed. The following original code

    if (api1 == 1 && api2 == 2 || api2 == 2 && api3 == 3) {
        foo();
    }

could be broken like the following and I wouldn't mind:

    if (api1 == 1 &&
        api2 == 2 ||
        api2 == 2 &&
        api3 == 3) {
        foo();
    }

I would go work on the original code anyway.

Ali
Oct 24 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, October 24, 2017 19:15:35 Walter Bright via Digitalmars-d-announce wrote:
 On 10/24/2017 3:06 PM, Ali Çehreli wrote:
 It would be very useful if the compiler could do that automatically.
 On 10/24/2017 2:58 PM, qznc wrote:
 The information is there just not expressed in a useable way.
 The problem is how to display it in a text file with the original 
 source code.
One option would be to add some sort of blank line (or a line with some kind of comment), kind of like what github does when it breaks up a line to show a diff. Github shows a line number next to the start of the actual line, and no line numbers on the subsequent lines until you get to the line that's actually the start of a new line. I don't know exactly how we'd do the same thing with an .lst file (maybe by having a line that doesn't start with |), but it at least seems like trying _something_ like that might work well. It does have the downside though that some extra lines that aren't the actual source code would be in there, which may be acceptable but isn't entirely desirable.

Another option would be to present multiple numbers with + signs, e.g. instead of

    5|    if(foo || bar)

do something like

    3+2|    if(foo || bar)

That might push the code farther to the right than might be desirable if you have a line with a lot of branches and/or the branches are executed a lot of times, but if someone really wants to see the info for each branch, then that's not necessarily unreasonable.

- Jonathan M Davis
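
The `+` notation suggested above is easy to prototype. This is only a sketch of the formatting step: `annotate` is a hypothetical helper, and it assumes the per-branch counts for a line are already available as an array, which the current .lst output does not provide.

```d
import std.algorithm : map;
import std.array : join;
import std.conv : to;

// Hypothetical helper: joins the per-branch counts of one source line
// with '+' and puts them in front of the line, .lst style.
string annotate(const(uint)[] counts, string source)
{
    return counts.map!(c => c.to!string).join("+") ~ "| " ~ source;
}

void main()
{
    import std.stdio : writeln;

    // Two branches on one line, executed 3 and 2 times.
    writeln(annotate([3u, 2u], "if(foo || bar)")); // prints "3+2| if(foo || bar)"
}
```

A real listing would also need to pad the count column so that the source text stays aligned across lines, which this sketch leaves out.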
Oct 25 2017
Jacob Carlborg <doob me.com> writes:
On 2017-10-25 04:15, Walter Bright wrote:

 The problem is how to display it in a text file with the original source 
 code.
An option to output the result in XML or JSON would allow an editor or IDE more options to display the result, for example, hover on different expressions to show the result.

--
/Jacob Carlborg
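
As a rough idea of what such machine-readable output could carry, here is a minimal sketch using Phobos' `std.json`. The record layout is an assumption made for this example only: `coverageEntry` and the field names `file`, `line`, `column`, `length` and `count` are invented, and no compiler emits this format today. The column and length fields follow the expression-level idea raised earlier in the thread.

```d
import std.json : JSONValue;

// Hypothetical record for one covered expression. The field names are
// invented for this sketch and are not an existing dmd output format.
JSONValue coverageEntry(string file, int line, int column, int length, int count)
{
    auto entry = JSONValue(["file": JSONValue(file)]);
    entry["line"] = JSONValue(line);     // 1-based source line
    entry["column"] = JSONValue(column); // offset of the expression in the line
    entry["length"] = JSONValue(length); // number of characters it spans
    entry["count"] = JSONValue(count);   // how often it was evaluated
    return entry;
}

void main()
{
    import std.stdio : writeln;

    // The `x == 2` operand of `if (x == 1 || x == 2)`, evaluated once.
    auto entry = coverageEntry("app.d", 2, 16, 6, 1);
    writeln(entry.toString());
}
```

An editor could then map each record back to a source range and show the count on hover.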
Oct 25 2017
Mario Kröplin <linkrope github.com> writes:
On Friday, 20 October 2017 at 21:26:35 UTC, qznc wrote:
 * fluent-asserts is considered the best expectations library. 
 Syntax is `(x + y).should.equal(42).because("of test 
 reasons");` and it gives nice output with code snippets.
The code snippets were the prominent feature from the announcement of fluent-asserts. But this feature was the reason why I originally dismissed the library. In my opinion, the goal is that the failure message describes the issue without the need to look at the test implementation.

The diff of lengthy strings is what I was always looking for. Back then, I wrote a lightweight kind of diff for dunit. In writing the blog post, I rechecked code.dlang.org. To my surprise, Sönke Ludwig ported google-diff-match-patch to D in 2014. (The status is "build: error", but there is hope that it's only corner cases that don't work.) Further investigation revealed that fluent-asserts uses this port.

So, it's this "hidden feature" that currently makes fluent-asserts my favorite.
Oct 21 2017
Anton Fediushin <fediushin.anton yandex.ru> writes:
On Friday, 20 October 2017 at 14:04:25 UTC, Mike Parker wrote:
 After a couple of weeks of quiet on the D blog, it's about to 
 get noisy again. The latest is a post by Mario Kröplin of 
 Funkwerk describing how the company now uses D's built-in tests 
 in their codebase after several years of using third-party 
 frameworks.

 Blog:
 https://dlang.org/blog/2017/10/20/unit-testing-in-action/

 Reddit:
 https://www.reddit.com/r/programming/comments/77m8r8/ds_builtin_unit_testing_in_action/
Yay! My app, covered, is in this post! It's so cool when somebody uses your code. Thank you, Mario.
Oct 23 2017
Atila Neves <atila.neves gmail.com> writes:
On Friday, 20 October 2017 at 14:04:25 UTC, Mike Parker wrote:
 After a couple of weeks of quiet on the D blog, it's about to 
 get noisy again. The latest is a post by Mario Kröplin of 
 Funkwerk describing how the company now uses D's built-in tests 
 in their codebase after several years of using third-party 
 frameworks.

 Blog:
 https://dlang.org/blog/2017/10/20/unit-testing-in-action/

 Reddit:
 https://www.reddit.com/r/programming/comments/77m8r8/ds_builtin_unit_testing_in_action/
"Only for troubleshooting should you switch to unit-threaded. You have to be careful, however, to only use compatible features."

I probably should have made it more widely known that unit-threaded now has a `unitThreadedLight` version geared towards fast compile times. It uses the default test runner you get normally with just `-unittest`, but implements all of the custom assertions as plain asserts for faster turnaround times when the tests pass.

"parallel test execution (from it’s name, the main goal of unit-threaded) was quite problematic with the first test suite we converted"

I'd love to know what the problems were, especially since it's possible to run in just one thread with a command-line option, or to use UDAs to run certain tests in a module in the same thread (sometimes one has to change global state, as much as that is usually not a good idea).

"With the new static foreach feature however, it is easy to implement parameterized tests without the support of a framework"

It is, but it's the same problem with plain asserts in terms of knowing what went wrong unless the parameterised value happens to be in the assertion. And there's also the issue of running the test only for the value/type that it failed for instead of going through the whole static foreach every time.

Atila
Oct 23 2017
Mario Kröplin <linkrope github.com> writes:
On Monday, 23 October 2017 at 12:38:01 UTC, Atila Neves wrote:
 "parallel test execution (from it’s name, the main goal of 
 unit-threaded) was quite problematic with the first test suite 
 we converted"

 I'd love to know what the problems were, especially since it's 
 possible to run in just one thread with a command-line option, 
 or to use UDAs to run certain tests in a module in the same 
 thread (sometimes one has to change global state, as much as 
 that is usually not a good idea).
Delays are our business, so we use the clock and timers everywhere. Long ago, we introduced Singletons to be able to replace the implementations for unit testing. By now, lots of tests fake time and it's a problem if they do so in parallel. It's not too hard, however, to change this to thread-local replacements of the clock and timers. Another problem was using the same port number for different test cases. We now apply "The port 0 trick" (https://www.dnorth.net/2012/03/17/the-port-0-trick/).
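
For readers who have not met the port 0 trick, here is a minimal sketch with Phobos' `std.socket`. It is not the code from the test suite described above; `listenOnFreePort` is a name made up for this example. Binding to port 0 lets the operating system pick a free port, so test cases running in parallel no longer fight over one hard-coded number.

```d
import std.socket : InternetAddress, TcpSocket;

// Hypothetical helper: binds the listener to port 0 so that the OS
// chooses a free port, then returns the port that was actually assigned.
ushort listenOnFreePort(TcpSocket listener)
{
    listener.bind(new InternetAddress("127.0.0.1", InternetAddress.PORT_ANY));
    listener.listen(1);
    return (cast(InternetAddress) listener.localAddress).port;
}

void main()
{
    import std.stdio : writeln;

    auto listener = new TcpSocket();
    scope (exit) listener.close();

    // The test would hand this port number to the client under test.
    writeln("listening on port ", listenOnFreePort(listener));
}
```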
 "With the new static foreach feature however, it is easy to 
 implement parameterized tests without the support of a 
 framework"

 It is, but it's the same problem with plain asserts in terms of 
 knowing what went wrong unless the parameterised value happens 
 to be in the assertion. And there's also the issue of running 
 the test only for the value/type that it failed for instead of 
 going through the whole static foreach everytime.
That's why I recommend putting the `static foreach` around the `unittest`. My example shows how to instantiate test descriptions (with CTFE of `format`) so that these string attributes are used to report failures or to selectively execute a test in isolation.
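
A minimal sketch of that arrangement, assuming a toy function `twice` and made-up parameter values (this is not the example from the blog post): the `static foreach` sits outside, so every parameter gets its own `unittest` block with its own description string built at compile time.

```d
import std.format : format;

// Toy function under test, invented for this sketch.
int twice(int x)
{
    return 2 * x;
}

// One separate unittest block per parameter. Each block carries a
// description generated at compile time as a string attribute.
static foreach (n; [1, 2, 3])
{
    @(format!"twice(%s) should be %s"(n, 2 * n))
    unittest
    {
        assert(twice(n) == 2 * n);
    }
}
```

This can be tried with `dmd -unittest -main` on the file. Whether the description strings show up in the report, or can be used to select a single test, depends on the test runner reading the string attributes.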
Oct 24 2017
John Carter <john.carter taitradio.com> writes:
On Friday, 20 October 2017 at 14:04:25 UTC, Mike Parker wrote:

 https://dlang.org/blog/2017/10/20/unit-testing-in-action/
I'm somewhat late to this party... but anyway, here are my two cents on the way unit testing needs to be tweaked.

One of the values of unit testing is defect localization, i.e. in a well-designed unit test suite, tell me which test failed and I will tell you, to within a few lines, where the bug is.

However, in the presence of setup and teardown failures, we lose that. Ideally we should differentiate between failures that occur during setup and teardown, versus exceptions occurring in the behaviour under test, or assertions on the validity of the result.

I.e. failures under setup and teardown are not failures of the behaviour under test. The only thing we can say about the behaviour under test in these cases is that it “hasn’t been tested”. Hopefully the behaviour that failed during the setup and teardown is explicitly tested elsewhere.

I.e. we should stop at the first test that fails at a point other than setup and teardown, as this is likely to be the cause of the cascade of failures in setup and teardown of other tests.
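
The distinction could look roughly like this in a test runner. This is only a sketch of the idea, not an existing framework API: the `Outcome` enum, `runCase`, and the `succeeds` and `throws` stand-ins are all invented for illustration.

```d
// Hypothetical three-way result: a failure in setup or teardown means
// the behaviour under test was never really judged.
enum Outcome { passed, failed, notTested }

Outcome runCase(void function() setup,
                void function() behaviour,
                void function() teardown)
{
    // Throwable is caught on purpose, so failed asserts count as well.
    try
    {
        setup();
    }
    catch (Throwable)
    {
        return Outcome.notTested; // the behaviour never ran
    }

    auto outcome = Outcome.passed;
    try
    {
        behaviour();
    }
    catch (Throwable)
    {
        outcome = Outcome.failed; // a real failure of the behaviour under test
    }

    try
    {
        teardown();
    }
    catch (Throwable)
    {
        // A genuine failure stays a failure. A pass followed by a broken
        // teardown is reported as "not tested" instead.
        if (outcome == Outcome.passed)
            outcome = Outcome.notTested;
    }
    return outcome;
}

// Stand-ins for real setup, test and teardown code.
void succeeds() {}
void throws() { throw new Exception("something went wrong"); }

void main()
{
    import std.stdio : writeln;

    writeln(runCase(&throws, &succeeds, &succeeds)); // prints notTested
    writeln(runCase(&succeeds, &throws, &succeeds)); // prints failed
    writeln(runCase(&succeeds, &succeeds, &succeeds)); // prints passed
}
```

A runner built this way could stop at the first `failed` result and list the `notTested` cases separately, which keeps the cascade out of the failure report.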
Nov 28 2017