digitalmars.D - Serious Problems with the Test Suite
- Walter Bright (43/43) Jun 17 2020 A good test suite should:
- Avrina (5/9) Jun 17 2020 I've run into these problems with, for example, optlink. When
- H. S. Teoh (22/31) Jun 17 2020 Whoa, holey miss the point batman! Optlink may have its own share of
- Avrina (8/24) Jun 18 2020 There are issues with optlink, I've seen them manifest in
- Walter Bright (2/4) Jun 18 2020 I've run those tests more than anyone, and have not seen an optlink heis...
- H. S. Teoh (8/15) Jun 18 2020 I think it's because Walter uses advanced quantum technology that can
- Walter Bright (2/11) Jun 18 2020 That's not an optlink issue.
- Mathias LANG (22/38) Jun 18 2020 Starting a new thread as not to derail the original topic, which
- Stefan Koch (8/11) Jun 17 2020 Most of those could be fixed with an improved test runner. If we
- Walter Bright (3/3) Jun 18 2020 I've added a new keyword TestSuite and here are the current test suite b...
A good test suite should:

1. verify that things that are supposed to work do work
2. when things don't verify, point to where the problem is

The D test suite fails miserably at point 2. The only bright spot is the autotester, where when one of the tests fails it's quick to find the problem source. But I cringe every time something else fails, because then I know I'm in for hours or even DAYS trying to figure out what and where things went wrong.

For example, https://github.com/dlang/dmd/pull/11287 has several failures, all of which come with USELESS log files. I have no idea what went wrong.

Some principles for log files:

1. If the log file says ERROR, it should be an ERROR, i.e. the test should fail. I'm often confronted with log files that list multiple ERRORs, but never mind, those errors don't need to pass. All benign ERROR messages, all deprecation messages, all warning messages need to be fixed, so that when the log file says ERROR, that's why the test failed.

2. The ERROR that causes the test to fail should be the LAST line in the log file, not 300 lines back.

3. Log files need to contain comment text at each step to SAY WHAT THEY ARE DOING.

4. Makefiles should NEVER, EVER be run in "quiet" mode, for the simple reason that one has no idea what it was trying to do when it failed.

5. Test files must either include a URL to the bugzilla issue they fix or have some clue in the comments what they are doing.

6. Running tests multi-process makes them go faster, but since the log files randomly interleave the output from them, it makes it impossible to figure out where the failure is.

7. Any test that fails because of a network error, or other environmental error unrelated to what is being tested, should automatically sleep for a minute or ten, then try again.

8. Any timeout terminations MUST say which test timed out.

9. Tests should not be Rube Goldberg Machines with layers and layers of complexity before the actual test is even run. Tests should be a THIN layer over the test.

10. Many tests are UTTERLY UNDOCUMENTED. For example, https://github.com/dlang/dmd/tree/master/test/unit What is that? What does it do? Is it one test or many tests? Let's look at: https://github.com/dlang/dmd/blob/master/test/unit/frontend.d Not a SINGLE COMMENT in it. What it is, what it does, etc., is all left to the imagination. This is completely unacceptable for production code, and it is also unacceptable for any code accepted into the D repository.

11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
Jun 17 2020
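Principles 7 and 8 from the post above could be sketched roughly as follows. This is an illustrative Python sketch only, not how the D test suite or its runner actually works. The `run_test` helper, its parameters, and the convention that exit code 75 marks an environmental failure are all assumptions made up for the example.

```python
import subprocess
import sys
import time

# Hypothetical convention: a test exits with this code when the failure is
# environmental (network down, mirror unreachable), not a real test failure.
ENVIRONMENT_ERROR = 75


def run_test(name, command, timeout_s=60.0, retries=2, retry_delay_s=60.0):
    """Run one test command and return (passed, message)."""
    for attempt in range(1, retries + 2):
        try:
            result = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            # Principle 8: a timeout termination names the test that timed out.
            return False, f"ERROR: test '{name}' timed out after {timeout_s}s"
        if result.returncode == 0:
            return True, f"PASS: {name}"
        if result.returncode == ENVIRONMENT_ERROR and attempt <= retries:
            # Principle 7: sleep, then try again instead of failing the run.
            print(f"{name}: environmental error on attempt {attempt}, "
                  f"retrying in {retry_delay_s}s")
            time.sleep(retry_delay_s)
            continue
        return False, (f"ERROR: test '{name}' failed with exit code "
                       f"{result.returncode}")
```

A call such as `run_test("frontend", [sys.executable, "frontend_test.py"])` (a hypothetical test script) would return a message that always names the test, so a timeout or a failure after exhausted retries can be traced without reading back through the log.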
On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.

I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.
Jun 17 2020
On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via Digitalmars-d wrote:
> On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> > 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
>
> I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.

Whoa, holey miss the point batman! Optlink may have its own share of issues, but the problem here isn't with this or that piece of software, it's with the structure of the testsuite.

Tests that are non-deterministic or depend on external state, strictly speaking, shouldn't be in the test suite. This includes tests that involve downloading some remote resource over the network, tests that assume things about the host OS and filesystem, etc. There are a couple of these in the test suite, and they put you at the mercy of external state which is beyond your control. (I remember one time there was a heisenbug that had to do with random number generators, meaning its probability of arbitrary, totally coincidental failure was non-zero. Sigh.)

These tests ought to be removed, or at least disabled in CI. Any time you depend on external state, it really does not belong in the test suite, or at least, it does not belong in the autotester, because it just leads to tons of wasted time trying to track down exactly what it is that failed, which most of the time isn't even relevant to the PR you're trying to push through.

T

--
MASM = Mana Ada Sistem, Man!
Jun 17 2020
On Thursday, 18 June 2020 at 02:34:42 UTC, H. S. Teoh wrote:
> On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via Digitalmars-d wrote:
> > On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> > > 11. Every time we run into "oh, that's just a heisenbug, try re-running the test" that is a BUG in the test suite and needs to be fixed. Those are gigantic time and resource wasting problems.
> >
> > I've run into these problems with, for example, optlink. When trying to get optlink removed, you prevent it. These heisenbugs exist because, a lot of the time, you aren't willing to chop off dead weight.
>
> Whoa, holey miss the point batman! Optlink may have its own share of issues, but the problem here isn't with this or that piece of software, it's with the structure of the testsuite.

There are issues with optlink. I've seen them manifest in the testsuite, and just running the test again "fixes" it. It's not the only problem where this has occurred. I'm sure there are more problems with the test suite, and it is rather messy and has grown slow. I was replying specifically to the point about "heisenbugs", some of which are of Walter's own creation due to his refusal to accept change.
Jun 18 2020
On 6/18/2020 7:38 AM, Avrina wrote:
> There are issues with optlink, I've seen them manifest in testsuite and just running the test again "fix" it. It's not the only problem where this has occurred.

I've run those tests more than anyone, and have not seen an optlink heisenbug.
Jun 18 2020
On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
> On 6/18/2020 7:38 AM, Avrina wrote:
> > There are issues with optlink, I've seen them manifest in testsuite and just running the test again "fix" it. It's not the only problem where this has occurred.
>
> I've run those tests more than anyone, and have not seen an optlink heisenbug.

I think it's because Walter uses advanced quantum technology that can directly handle quantum-superimposed computation states [1], so none of these heisenbugs affect him. ;-)

[1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com

T

--
English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similarly, Perl was designed to be a mess, though in the nicest of all possible ways. -- Larry Wall
Jun 18 2020
On 6/18/2020 3:20 PM, H. S. Teoh wrote:
> On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
> > I've run those tests more than anyone, and have not seen an optlink heisenbug.
>
> I think it's because Walter uses advanced quantum technology that can directly handle quantum-superimposed computation states [1], so none of these heisenbugs affect him. ;-)
>
> [1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com

That's not an optlink issue.
Jun 18 2020
On Friday, 19 June 2020 at 00:54:15 UTC, Walter Bright wrote:
> On 6/18/2020 3:20 PM, H. S. Teoh wrote:
> > On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
> > > I've run those tests more than anyone, and have not seen an optlink heisenbug.
> >
> > I think it's because Walter uses advanced quantum technology that can directly handle quantum-superimposed computation states [1], so none of these heisenbugs affect him. ;-)
> >
> > [1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com
>
> That's not an optlink issue.

Starting a new thread so as not to derail the original topic, which contained valid points.

Optlink has been a pain for everyone on x86 Windows for a while. I personally use Linux and Mac OSX, but tried doing some work on Windows recently, and the first thing I got was a linker crash.

There have been active steps taken to limit its use / reduce the exposure of new users to it, among them:

- Dub defaults to mscoff since v1.15.0, and that has drastically improved the UX for new users. See https://github.com/dlang/dub/pull/1661 for the many reasons this was done.
- Vibe.d recently dropped support for it because it was causing crashes / timeouts: https://github.com/vibe-d/vibe.d/pull/2445
- This was tried in DMD, and you obviously shut it down: https://github.com/dlang/dmd/pull/8347 . I will just quote the last post by Manu here: "I don't have the energy to pursue this. I do think it's important though."

And yes, they are documented, advertised, and have been advocated for years, yet you refused to listen to the feedback countless users have given.
Jun 18 2020
On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
> A good test suite should: 1. verify that things that are supposed to work do work [...]

Most of those could be fixed with an improved test runner, for example if we did a timeout per test. Another obvious improvement would be printing only the tests which failed.

As for the missing comments, I think that's a plus. When introducing a change in how dmd interprets D's semantics, one should be forced to scratch their head.
Jun 17 2020
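The runner improvements suggested above (a timeout per test, printing only the tests that failed) can be combined with per-test log capture so that parallel runs do not interleave their output. The following is a minimal Python sketch under stated assumptions. It is not the existing dmd test runner, and `run_one` and `run_suite` are hypothetical names chosen for the example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(name, command, timeout_s):
    """Run one test; return (name, passed, log) with the log kept per test."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return name, False, f"ERROR: test '{name}' timed out after {timeout_s}s\n"
    log = result.stdout + result.stderr
    if log and not log.endswith("\n"):
        log += "\n"
    if result.returncode != 0:
        # The line explaining the failure goes last in that test's log.
        log += f"ERROR: test '{name}' exited with code {result.returncode}\n"
    return name, result.returncode == 0, log


def run_suite(tests, jobs=4, timeout_s=60.0):
    """Run (name, command) pairs in parallel; report failed tests only."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda t: run_one(t[0], t[1], timeout_s), tests))
    report = []
    for name, passed, log in results:
        if not passed:
            report.append(f"=== {name} ===\n{log}")
    return "".join(report)
```

Because each test's output is captured separately and only emitted after the test finishes, the report stays readable even with several jobs running at once. The trade-off is that nothing is shown while a test is still running.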
I've added a new keyword TestSuite and here are the current test suite bugs that I found: https://issues.dlang.org/buglist.cgi?keywords=TestSuite&list_id=231900
Jun 18 2020