On Disabling Tests
Today I want to talk about a common technique: disabling failing tests to allow a feature to ship. Maybe sometimes you gotta do it. But long-term I think it causes more problems than it solves.
Here’s the scenario: we’re shipping an important hotfix, but there’s one red test blocking a green suite, and thus our deploy. This test is flaky: it’s known to pass and fail on the same commit. Folks on our team are suggesting disabling it. Wanting to be a team player, we disable the test with a short commit, “Disable this test for now”, and ship our code.
What Happens Next
At this moment, we have every intention of coming back and re-enabling the test. Let’s give ourselves the benefit of the doubt and predict that is going to happen. What goes on in the meantime?
First, bugs. For every commit that ships to production, our disabled test isn’t protecting against regressions. If it was a meaningful test, losing it temporarily makes our software more vulnerable to bugs.
Second, degrading context. We don’t have context now; that’s why we disabled the test. Instead of earning that context, we deferred the work. Often, if someone does re-enable and fix the test, that person will not be us. We have moved teams or left the company. That unlucky person possibly doesn’t even know why the test was disabled.
Third, the chance that it will be re-enabled rapidly approaches zero. In ten years of experience, I’ve almost never seen a disabled test become active again after a day or two without herculean effort. Despite everyone’s good intentions, it’s never a priority. The same culture that created the disabled test ensures that it stays disabled.
Why? It’s a slog. Integration tests in particular are often the most painful to fix, requiring advanced frontend and backend skills plus a good amount of curiosity and patience.
Additionally, the reason the test was flaking may not be the reason (or reasons) that it’s failing now. It hasn’t been maintained, so any number of things could be wrong. Maybe the test flaked because some JavaScript was slow to load, but when we try to re-enable it, the data is no longer valid, and a route has changed, and a dependency is missing. One problem became three.
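To make the first of those failures concrete: a test that flakes on slow-loading JavaScript is often one that pauses for a fixed time and then asserts. One way to address that, sketched here in Python, is to poll for the condition up to a deadline instead. The `wait_until` helper and the simulated page below are hypothetical names made up for illustration; they aren’t from any particular test framework, and many frameworks ship their own equivalent.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met in time, False otherwise.
    (Hypothetical helper for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulate a page whose script finishes loading 0.2 seconds from now.
    ready_at = time.monotonic() + 0.2

    def script_loaded() -> bool:
        return time.monotonic() >= ready_at

    # A fixed short sleep followed by an assertion would sometimes fail here;
    # polling passes as soon as the condition holds, within the timeout.
    assert wait_until(script_loaded, timeout=2.0)
    print("script loaded")
```

The tradeoff is that a genuine failure now takes the full timeout to report, so the timeout should stay as small as the slowest legitimate load allows.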
Solution
I hope I’ve made the case that disabling tests is the opposite of a solution. So, what’s the path forward? Delete or fix them.
For tests that are really a problem, I want you to consider deleting them. Tests are not sacred; they don’t sell products or save the world. They are imperfect reflections of our codebase that are supposed to help us ship. When they aren’t helping, they should be fighting for their life.
I’m not arguing for casually deleting previous work. I’m arguing that programming is about tradeoffs, and re-running the CI build five times a day to catch one flaky test passing has a cost.
For the salvageable tests, take a minute and fix them. If you believe, as I do, that computers are understandable, then try to understand what’s going wrong. Flaky tests are part of every codebase I’ve ever seen. There are a lot of things you can do to avoid them, but a test suite that never flakes is rare.
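A reasonable first step toward understanding a flaky test is to measure how often it actually fails. Here’s a minimal sketch in Python that runs a test function many times and reports the failure fraction. Both `measure_flake_rate` and `make_flaky_test` are hypothetical names invented for this example, and the “flaky test” is simulated with a seeded random number generator rather than taken from a real suite.

```python
import random
from typing import Callable


def measure_flake_rate(test: Callable[[], None], runs: int = 50) -> float:
    """Run `test` `runs` times; return the fraction that raised AssertionError."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = 0
    for _ in range(runs):
        try:
            test()
        except AssertionError:
            failures += 1
    return failures / runs


def make_flaky_test(
    rng: random.Random, failure_probability: float
) -> Callable[[], None]:
    """Build a stand-in test that fails intermittently (simulation only)."""

    def flaky_test() -> None:
        if rng.random() < failure_probability:
            raise AssertionError("simulated intermittent failure")

    return flaky_test


if __name__ == "__main__":
    flaky = make_flaky_test(random.Random(0), failure_probability=0.2)
    rate = measure_flake_rate(flaky, runs=200)
    print(f"observed flake rate: {rate:.0%}")
```

A number like this turns “it fails sometimes” into something you can compare before and after a fix, which also helps with the delete-or-fix decision.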
So take a breath and fix the issue. As I wrote for the Hashrocket blog in ‘Avoiding Code Catastrophes’, you almost always have more time than you think. So take the time and fix them.