
digitalmars.D - Thoughts about unittest run order

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
In theory, the order in which unittests are run ought to be irrelevant.
In practice, however, the order can either make debugging code changes
quite easy, or very frustrating.

I came from a C/C++ background, and so out of pure habit write things
"backwards", i.e., main() is at the bottom of the file and the stuff
that main() calls come just before it, and the stuff *they* call come
before them, etc., and at the top are type declarations and low-level
functions that later stuff in the module depends on.  After reading one
of Walter's articles recently about improving the way you write code, I
decided on a whim to write a helper utility in one of my projects "right
side up", since D doesn't actually require declarations before usage
like C/C++ do.  That is, main() goes at the very top, then the stuff
that main() calls, and so on, with the low-level stuff all the way at
the bottom of the file.

It was all going well, until I began to rewrite some of the low-level
code in the process of adding new features. D's unittests have been
immensely helpful when I refactor code, since they catch any obvious
bugs and regressions early on so I don't have to worry too much about
making large changes.  So I set about rewriting some low-level stuff
that required extensive changes, relying on the unittests to catch
mistakes.

But then I ran into a problem: because D's unittests are currently
defined to run in lexical order, that means the unittests for
higher-level functions will run first, followed by the lower-level
unittests, because of the order I put the code in.  So when I
accidentally introduced a bug in lower-level code, it was a high-level
unittest that failed first -- which is too high up to figure out where
exactly the real bug was. I had to gradually narrow it down from the
high-level call through the middle-level calls and work my way to the
low-level function where the bug was introduced.
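To make the scenario concrete, here is a minimal sketch (hypothetical function names) of a "right side up" module where the high-level unittest comes first lexically, so under the default runner it fails before the low-level test that would actually pinpoint a bug in the helper:

```d
// order_demo.d -- compile with: dmd -unittest -main -run order_demo.d
module order_demo;

// High-level function first ("right side up" layout)
int doubledWordCount(string s)
{
    return wordCount(s) * 2;
}

unittest
{
    // Runs FIRST under the default runner (lexical order),
    // even when the actual bug lives in wordCount() below.
    assert(doubledWordCount("one two") == 4);
}

// Low-level helper at the bottom of the file
int wordCount(string s)
{
    import std.array : split;
    return cast(int) s.split(" ").length;
}

unittest
{
    // This test pinpoints a bug in wordCount(), but it only runs
    // second -- a failure above aborts the module's test run first.
    assert(wordCount("one two") == 2);
}
```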

This is quite the opposite from my usual experience with "upside-down
order" code: since the low-level code and unittests would appear first
in the module, any bugs in the low-level code would trigger failure in
the low-level unittests first, right where the problem was. Once I fix
the code to pass those tests, then the higher-level unittests would run
to ensure the low-level changes didn't break any behaviour the
higher-level functions were depending on.  This made development faster
as less time was spent narrowing down why a high-level unittest was
failing.

So now I'm tempted to switch back to "upside-down" coding order.

What do you guys think about this?


T

-- 
You have to expect the unexpected. -- RL
May 06
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2019-05-06 20:13, H. S. Teoh wrote:
 In theory, the order in which unittests are run ought to be irrelevant.
 In practice, however, the order can either make debugging code changes
 quite easy, or very frustrating.
 
 [...]
 
 So now I'm tempted to switch back to "upside-down" coding order.
 
 What do you guys think about this?
There are different schools of thought on how to order code in a file. Some say to put all the public functions first and then the private ones. Some say code should read like a newspaper article: first an overview, and the deeper you read, the more detail you get. Others say to put related code next to each other, regardless of whether the symbols are public or private. I usually put the public symbols first and then the private ones.

When it comes to the order of unit tests, I think they should run in random order. If a test fails, it should print a seed value; running the tests again with that seed reproduces the same order as before. This helps with debugging when tests accidentally depend on each other.

The problem you're facing, I'm guessing, is that you run with the default unit test runner? If a single test fails, it will stop and run no other tests in that module (tests in other modules will still run). If you pick an existing unit test framework, or write your own test runner, you can have the unit tests continue after a failure. Then you would see that the lower-level tests are failing as well.

Writing a custom unit test runner with the help of the "getUnitTests" trait [1] would let you do additional things, like looking for UDAs that set the order of the unit tests or group them in various ways. You could group them into high-level and low-level groups and have the runner run the low-level tests first.

For your particular problem, it seems it would be enough to keep running the other tests in the same module after one has failed. I think "silly" [2] looks really interesting. I haven't had time to try it out yet, so I don't know whether it continues after a failed test.

[1] https://dlang.org/spec/traits.html#getUnitTests
[2] https://code.dlang.org/packages/silly

-- 
/Jacob Carlborg
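As a rough illustration of the getUnitTests idea, here is a minimal sketch of a runner that uses a UDA to run tagged "low-level" tests before the rest. The `lowLevel` UDA and module name are hypothetical, and a real runner would also override `Runtime.moduleUnitTester` so the default runner doesn't run the tests a second time before `main`:

```d
// udatest.d -- compile with: dmd -unittest -run udatest.d
module udatest;
import std.traits : hasUDA;

// Hypothetical UDA marking low-level tests that should run first
struct lowLevel {}

@lowLevel unittest { assert(1 + 1 == 2); }  // low-level: pass 0
unittest { assert("abc".length == 3); }     // high-level: pass 1

void main()
{
    // Two passes over this module's unittest blocks: tagged tests
    // first, untagged tests second.  -unittest is required for the
    // trait to see the tests at all.
    static foreach (pass; 0 .. 2)
    {
        static foreach (test; __traits(getUnitTests, mixin(__MODULE__)))
        {
            static if ((pass == 0) == hasUDA!(test, lowLevel))
                test();
        }
    }
}
```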
May 06
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/6/2019 11:13 AM, H. S. Teoh wrote:
 What do you guys think about this?
That thought never occurred to me, thanks for bringing it up. It suggests perhaps the order of unittests should be determined by a dependency graph, and should start with the leaves.
May 06
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, May 06, 2019 at 10:30:01PM -0700, Walter Bright via Digitalmars-d wrote:
 On 5/6/2019 11:13 AM, H. S. Teoh wrote:
 What do you guys think about this?
That thought never occurred to me, thanks for bringing it up. It suggests perhaps the order of unittests should be determined by a dependency graph, and should start with the leaves.
That was also my first thought, but how would you construct such a graph? In my case, almost all of the unittests are at module level, and call various module-level functions. It's not obvious how the compiler would divine which ones should come first just by looking at the unittest body. You'd have to construct the full function call dependency graph of the entire module to get that information.

T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
May 07
prev sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Monday, 6 May 2019 at 18:13:37 UTC, H. S. Teoh wrote:
 In theory, the order in which unittests are run ought to be 
 irrelevant. In practice, however, the order can either make 
 debugging code changes quite easy, or very frustrating.

 [...]
Use a test runner that runs all the tests regardless of previous errors? (And does them in multiple threads, hooray!)

https://github.com/atilaneves/unit-threaded

Then you'll at least get to know everything that failed, instead of just whatever happened to be lexically first.

I agree that some ordering system might improve the time-to-narrow-down-bug-location a bit, but the above might be acceptable nonetheless.
May 07
next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 7 May 2019 at 08:49:15 UTC, John Colvin wrote:
 On Monday, 6 May 2019 at 18:13:37 UTC, H. S. Teoh wrote:
 In theory, the order in which unittests are run ought to be 
 irrelevant. In practice, however, the order can either make 
 debugging code changes quite easy, or very frustrating.

 [...]
Use a test runner that runs all the tests regardless of previous errors? (and does them in multiple threads, hooray!) https://github.com/atilaneves/unit-threaded
unit-threaded can also run the tests in random order and reuse a seed, as Jacob mentioned above.

If the order tests run in is important, the tests are coupled... friends don't let friends couple their tests.
May 07
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via Digitalmars-d wrote:
[...]
 If the order tests run in is important, the tests are coupled...
 friends don't let friends couple their tests.
How do you decouple the tests of two functions F and G in which F calls G? If a code change broke the behaviour of G, then both tests should fail, and then we run into this problem with the default test runner.

To make F's tests independent of G would require that they pass *regardless* of the behaviour of G, which seems like an unattainable goal unless you also decouple F from G, which implies that every tested function must be a leaf function. That seems unrealistic.

T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and over.
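The F-calls-G situation can be sketched in a few lines (hypothetical functions, lowercase here to be valid D):

```d
// coupling_demo.d -- compile with: dmd -unittest -main -run coupling_demo.d
module coupling_demo;

int g(int x) { return x + 1; }      // low-level leaf function

int f(int x) { return g(x) * 2; }   // high-level, calls g

unittest { assert(g(1) == 2); }     // exercises g directly

unittest { assert(f(1) == 4); }     // passes only while g behaves as expected

// A bug introduced into g fails BOTH tests: f's test cannot be made
// to pass regardless of g's behaviour unless g is mocked away,
// i.e. unless f is decoupled from g for testing purposes.
```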
May 07
parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 7 May 2019 at 11:29:43 UTC, H. S. Teoh wrote:
 On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via 
 Digitalmars-d wrote: [...]
 If the order tests run in is important, the tests are 
 coupled... friends don't let friends couple their tests.
How do you decouple the tests of two functions F and G in which F calls G?
It depends. If F and G are both public functions that are part of the API, then one can't. Otherwise I'd just test F, since G is an implementation detail.

I consider keeping tests around for implementation details an anti-pattern. Sometimes it's useful to write the tests when doing TDD or debugging, but afterwards I delete them.
May 07
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, May 07, 2019 at 04:50:23PM +0000, Atila Neves via Digitalmars-d wrote:
 On Tuesday, 7 May 2019 at 11:29:43 UTC, H. S. Teoh wrote:
 On Tue, May 07, 2019 at 09:40:27AM +0000, Atila Neves via Digitalmars-d
 wrote: [...]
 If the order tests run in is important, the tests are coupled...
 friends don't let friends couple their tests.
How do you decouple the tests of two functions F and G in which F calls G?
It depends. If F and G are both public functions that are part of the API, then one can't. Otherwise I'd just test F since G is an implementation detail. I consider keeping tests around for implementation details an anti-pattern. Sometimes it's useful to write the tests if doing TDD or debugging, but afterwards I delete them.
I almost never delete unittests. IME, they usually wind up catching a regression that would've been missed otherwise.

T

-- 
If you want to solve a problem, you need to address its root cause, not just its symptoms. Otherwise it's like treating cancer with Tylenol...
May 07
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, May 07, 2019 at 08:49:15AM +0000, John Colvin via Digitalmars-d wrote:
 On Monday, 6 May 2019 at 18:13:37 UTC, H. S. Teoh wrote:
 In theory, the order in which unittests are run ought to be
 irrelevant.  In practice, however, the order can either make
 debugging code changes quite easy, or very frustrating.
 [...]
Use a test runner that runs all the tests regardless of previous errors? (and does them in multiple threads, hooray!)
That's certainly one way to go about it. But perhaps what I'm really looking for is a way to invoke a *specific* unittest (probably identified by starting line number, just like what dmd does to mangle unittests), so that I can iterate on a specific problem case until it's fixed before running through all the tests again.
 https://github.com/atilaneves/unit-threaded
 
 Then you'll at least get to know everything that failed instead of
 just whatever happened to be lexically first.
 
 I agree that some ordering system might improve the
 time-to-narrow-down-bug-location a bit, but the above might be
 acceptable nonetheless.
Yeah, not aborting immediately upon test failure would help a lot in this respect.

T

-- 
"If you're arguing, you're losing." -- Mike Thomas
May 07
parent reply "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 5/7/19 1:22 PM, H. S. Teoh wrote:
 
 But perhaps what I'm really looking for is a way to invoke a *specific*
 unittest
unit-threaded. Seriously, it's awesome. Use it. You'll be happy :)

But as far as the default test runner and order of code layout go, those are some really interesting points. With a low-level-to-high-level ordering, in many cases you wouldn't even need to point at a particular test you want to focus on.
May 07
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, May 07, 2019 at 05:30:03PM -0400, Nick Sabalausky (Abscissa) via
Digitalmars-d wrote:
 On 5/7/19 1:22 PM, H. S. Teoh wrote:
 
 But perhaps what I'm really looking for is a way to invoke a
 *specific* unittest
unit-threaded. Seriously, it's awesome. Use it. You'll be happy :)
OK, point taken. I'll go check it out. :-P
 But as far as the default test runner and order of code layout, those
 are some really interesting points. With a low-level-to-high-level
 ordering, then in many cases you wouldn't even need to point at a
 particular test you want to focus on.
Well yes, and being the lazy coder that I am, this least-effort path is particularly appealing to me.

T

-- 
WINDOWS = Will Install Needless Data On Whole System -- CompuMan
May 07