The Perils of “Obvious” Testing


The Cynefin Framework

Like many, I’ve been stunned by the VW Emissions Scandal. But I’ve also been fascinated. When I heard that software had been written to circumvent the EPA’s tests, I immediately started wondering how it was done. I assumed it would be an extremely complicated process, but that was because I failed to understand the nature of the testing.

As this article in the Washington Post points out, the EPA’s tests are heavily scripted. They follow a predictable and repeatable routine where speeds have to be kept within “two miles per hour of the required speed at any given moment.” With such a standardized approach, I immediately recognized how comparatively simple it would be to put the necessary monitors into the software to detect the start of such a test, adjust performance throughout, and guarantee a passing result.
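To make that concrete, here is a minimal sketch of why a fixed, tightly toleranced script is so easy to recognize. Everything in it is an assumption for illustration: the schedule values, the names, and the matching logic are invented, and it says nothing about how VW's software actually worked or what the real EPA drive cycle looks like. It only shows that once the required speeds are known in advance, checking whether a drive is "the test" takes a few lines.

```python
from typing import Sequence

# Hypothetical scripted schedule: required speed (mph) at each second of a test.
# These numbers are invented for illustration; they are not a real EPA cycle.
SCRIPTED_SCHEDULE_MPH = [0, 5, 10, 15, 20, 25, 25, 25, 20, 15, 10, 5, 0]

# The "two miles per hour" band quoted in the article.
TOLERANCE_MPH = 2.0


def follows_script(
    observed_mph: Sequence[float],
    schedule_mph: Sequence[float] = SCRIPTED_SCHEDULE_MPH,
    tolerance_mph: float = TOLERANCE_MPH,
) -> bool:
    """Return True if every observed speed stays within the tolerance of the schedule."""
    if len(observed_mph) != len(schedule_mph):
        return False
    return all(
        abs(observed - required) <= tolerance_mph
        for observed, required in zip(observed_mph, schedule_mph)
    )


# A run that tracks the script within tolerance, and an unscripted drive.
scripted_run = [0, 6, 11, 14, 21, 24, 25, 26, 19, 15, 9, 5, 0]
everyday_drive = [0, 8, 18, 30, 42, 45, 38, 40, 33, 20, 12, 4, 0]

print(follows_script(scripted_run))    # True: every sample is within 2 mph
print(follows_script(everyday_drive))  # False: the second sample is already 3 mph off
```

The tighter and more repeatable the script, the more reliable this kind of match becomes, which is exactly the weakness of an "obvious" test.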

I’m glad the EPA quickly recognized that it needs to revise its testing approach. I hope that it takes the deeper lesson from this experience and adjusts its testing so that it is not “obvious.” I am deliberately using the term for the Cynefin Domain, because I think the sense-making framework can help us understand what happened and draw broader lessons for how to develop more effective tests.

The Obvious Cynefin Domain was previously called “simple.” It is tightly constrained; there are no degrees of freedom; and best practices can be developed and disseminated for dealing with problems in this space. This is an accurate description of the EPA’s testing approach and also applies to other forms of testing. It also helps explain what happened with the VW diesels, because when Obvious solutions fail, they do not fail mildly.

Instead, they tend to fail catastrophically by passing through the boundary into the Chaotic Domain. This shatters the existing dynamic. It is, I think, a perfect encapsulation of the VW Emissions Scandal; it was a catastrophic failure that undermined our faith in the automotive industry.

I have seen similar failures (on a much smaller scale) in software, where teams used scripted approaches to testing that failed to adequately model user behavior and expectations. The tests “passed,” but the user experience missed the mark. In these circumstances, the software “failed.” Some of these failures have been similarly catastrophic. How can we avoid similar types of failures in our own testing approaches?

One answer is to ensure that our tests are not so “obvious.” They must not be scripted and predictable. If they are, they invite “gaming” and will lead to failure as user expectations change (and they will, once users begin working with the software). Exploratory Testing is an excellent way to avoid this. Exploratory tests are not “obvious”; from a Cynefin perspective they are Complex, because the tester follows a “probe-sense-respond” paradigm while working with the software. Assumptions about what it should and should not do are subordinated to learning what it actually does. This offers the potential for learning and discovery. Coupled with a rapid feedback cycle, it is a much better approach and will lead to better solutions.

Perhaps the EPA should consider something similar.