The Perils of “Obvious” Testing


The Cynefin Framework

Like many, I’ve been stunned by the VW Emissions Scandal. But I’ve also been fascinated. When I heard that software had been written to circumvent the EPA’s tests, I immediately started wondering how it was done. I thought it would be an extremely complicated process, but that was because I failed to understand the nature of the testing.

As this article in the Washington Post points out, the EPA’s tests are very scripted. They follow a predictable and repeatable routine where speeds have to be kept within “two miles per hour of the required speed at any given moment.” With such a standardized approach, I immediately recognized how it would be comparatively simple to put the necessary monitors into the software to detect the start of such a test, adjust performance throughout, and guarantee a passing result.
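VW’s actual code has never been published, but the basic mechanism is easy to imagine. Here’s a hypothetical sketch (every name and number in it is invented): if the observed speed trace tracks a known test profile within the allowed tolerance, the car can infer it is on the dynamometer.

```python
# Hypothetical sketch of scripted-test detection; the real VW
# implementation is not public. An EPA drive cycle is a fixed
# speed-vs-time profile, so matching against it is straightforward.

TOLERANCE_MPH = 2.0  # the "two miles per hour" window from the test protocol

def matches_test_cycle(observed_speeds, test_profile, tolerance=TOLERANCE_MPH):
    """Return True if every observed speed stays within `tolerance`
    of the scripted profile -- i.e., the car appears to be on the dyno."""
    if len(observed_speeds) < len(test_profile):
        return False
    return all(
        abs(obs - expected) <= tolerance
        for obs, expected in zip(observed_speeds, test_profile)
    )

# A fragment of a scripted profile (mph at successive sample times)
profile   = [0, 5, 10, 15, 20, 25, 25, 20]
on_dyno   = [0, 4, 11, 14, 21, 24, 26, 19]  # within 2 mph everywhere
real_road = [0, 7, 14, 13, 25, 24, 26, 15]  # drifts outside the window

print(matches_test_cycle(on_dyno, profile))    # True
print(matches_test_cycle(real_road, profile))  # False
```

Once the software knows it is being tested, switching to a low-emissions engine map is a simple branch. The predictability of the test is what makes the whole trick cheap.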

I’m glad the EPA quickly recognized that it needs to revise its testing approach. I hope that they take the deeper lesson from this experience and adjust their testing so that it is not “obvious.” I am deliberately using the term for the Cynefin Domain, because I think the sense-making framework can help us understand what happened and draw broader lessons for how to develop more effective tests.

The Obvious Cynefin Domain was previously called “simple.” It is tightly constrained; there are no degrees of freedom; and best practices can be developed and disseminated for dealing with problems in this space. This is an accurate description of the EPA’s testing approach and also applies to other forms of testing. It also helps explain what happened with the VW diesels, because when Obvious solutions fail, they do not fail mildly.

Instead, they tend to fail catastrophically by passing through the boundary into the Chaotic Domain. This shatters the existing dynamic. It is, I think, a perfect encapsulation of the VW Emissions Scandal; it was a catastrophic failure that undermined our faith in the automotive industry.

I have seen similar failures (on a much smaller scale) in software, where teams used scripted approaches to testing that failed to adequately model user behavior and expectations. The tests “passed,” but the user experience missed the mark. In these circumstances, the software “failed.” Some of these failures have been similarly catastrophic. How can we avoid similar types of failures in our own testing approaches?

One answer is to ensure that our tests are not so “obvious.” They must not be scripted and predictable. If they are, they invite “gaming” and will lead to failure as user expectations change (and they will, once they begin to use the software). Exploratory Testing is an excellent way to do this. Exploratory tests are not “obvious;” from a Cynefin perspective they are Complex, because the tester follows a “probe-sense-respond” paradigm as they work with the software. Assumptions about what it should and should not do are subordinated to learning what it actually does. This offers the potential for learning and discovery. Coupled with a rapid feedback cycle, it is a much better approach and will lead to better solutions.
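For automated tests, one practical way to escape the Obvious domain is to generate inputs rather than script them. The sketch below (my own illustration, not drawn from any particular tool) contrasts a scripted test, which a “gamed” implementation can pass, with a simple property-based test that checks invariants across randomized inputs:

```python
import random

# A scripted ("obvious") test checks one fixed, predictable input:
def scripted_test(sort_fn):
    assert sort_fn([3, 1, 2]) == [1, 2, 3]

# A property-based test probes many generated inputs, checking invariants
# rather than specific answers -- closer to probe-sense-respond.
def property_test(sort_fn, trials=200, seed=42):
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        result = sort_fn(data)
        # Invariant 1: the output is ordered
        assert all(a <= b for a, b in zip(result, result[1:]))
        # Invariant 2: the output is a permutation of the input
        assert sorted(data) == sorted(result)

# A buggy "sort" that was gamed to pass the scripted test:
def gamed_sort(xs):
    return [1, 2, 3] if xs == [3, 1, 2] else xs

scripted_test(gamed_sort)  # passes -- the obvious test was gamed
try:
    property_test(gamed_sort)
except AssertionError:
    print("gamed_sort caught by the property test")
```

Libraries like Hypothesis (Python) or QuickCheck (Haskell) do this far more thoroughly, including shrinking failing inputs to minimal examples. This doesn’t replace human exploratory testing, but it moves automated checks out of the easily gamed, fully scripted space.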

Perhaps the EPA should consider something similar.  

Thoughts on Agile Coach Camp US

I attended Agile Coach Camp US last week. It was a wonderful experience and a great way to explore new ideas. Here are some of my personal highlights of the event.


Olaf Lewitz gave an excellent introduction to the concept of Temenos. It’s an approach that emphasizes creating space for effective conversations and mutual understanding so that we can become aware of our choices and take more deliberate (and positive) action in the future.

Olaf triggered a sharing exercise by asking us to tell stories from our lives where it felt difficult to say “no.” We began hesitantly, but soon we were relating our experiences, building on each other, and exposing how difficult it can be to turn down our friends, relatives, and colleagues.

While this was happening, we observed several things. The pace of the sharing accelerated. Sharing led to more sharing, more openness, and an increased sense of connection with each other. As we discussed our experiences, I—and others as well—became more aware of options. Our choices to say “yes” or “no” were deliberate; we did not “have” to make them the way that we did.

That’s when I was struck by the power of the approach. Instead of drilling into why an event occurred or why a decision was made, we were being focused on options and future possibilities. We talked about our view of the past, but the analysis in my head did not dwell on the past; it was looking to the future. I felt that I had more choices than before. I felt that I could be more deliberate. It was a fascinating effect to observe.

It was made more powerful by the knowledge that the most successful retrospectives I’ve facilitated worked much the same way. I’ve always placed great emphasis on sharing diverse perspectives. I didn’t call these perspectives stories, but they worked similarly and broadened our view of the past so that we became more aware of future options. From there, it is easier to make a deliberate choice about what to do in the future. I think I can make my approach even more effective with Temenos and want to learn more.

Remaining Curious

Sue Johnston led us through a fun conversation about the value of remaining curious as a coach. The most impressive aspect of this session was how we focused in on questions and questioning style. Questioning is an important aspect of curiosity, of course, but there are different ways to ask questions and we agreed that some are more effective than others.

One of the most interesting suggestions was to try to avoid the use of “why” questions. We ask “why” questions all the time; the “5 whys” is a well-known technique. However, they can be very dangerous. “Why” questions can easily lead to blame. Consider the difference between these two questions:

“Why did you do that?”

“What made it seem like that was a good idea?”

These two are very different. The first question is framed in such a way that an individual (or a group) immediately feels responsible for doing something wrong (the “that”). Judgment is implicit. This will likely trigger hostility and/or fear. We risk shutting the conversation down.

The second question divorces the action (the “that”) from the individual or team. We operate from an assumption of best intentions and allow them to share their perspective. We remain open to the possibility that they know something we do not and that their choice might have been the best. This approach increases the potential for a broader level of shared understanding. The suggestion was that we should try to reframe “why” questions as “what” questions whenever possible.

Another valuable point was Sue’s introduction of the idea of the “arc” of a coaching conversation. This was presented as a series of different questions, each appropriate for a different moment in the arc.

  • In the beginning, we need to learn and so we ask, “What?”
  • Once we understand the situation better, we need to develop a better sense of the context and so we ask, “So What?”
  • With a sense of the context, we can move to helping draw out factors that may not be immediately obvious and so we ask, “What Else?”
  • Finally, to help someone understand what they can do with the knowledge gained, we ask “Now What?”

I thought this was an excellent little frame for thinking about coaching conversations that can help keep the focus on curiosity.

The Unprintable Work of Derek W. Wade

Derek ran a pair of sessions; both had great names that drew people in, but they’re inappropriate for a family blog.

In the first, we explored the power of venting frustrations in a positive way. We shared techniques we use to deal with frustration and affinitized them. Then we vented our frustrations, not as stories, but just as a word or phrase, going around the group several times. We had plenty of frustrations. Once we had a good list compiled, we shared a few stories and tried to relate them back to the brainstormed techniques. I think the best vision to come out was the concept of simultaneously drinking wine and juggling chainsaws. That gave us all a good laugh!

The second session explored techniques for dealing with leaders who seem resistant to change. We had all encountered people in powerful positions who appeared unwilling to support the improvement of their teams. Although it was not billed as such, the session was a great look into the dangers of fundamental attribution error. We spent a lot of time discussing how these individuals have their own valuable perspective and we need to gain a better understanding of it in order to communicate with them effectively.

Beneficial Conversations

In addition to the sessions, I had several really useful conversations which helped me see things in new ways.

Michael J. Tardiff and I had a fascinating exchange. We talked about infection and subversion as two different models for organizational change. We agreed that an infection model starts out localized and spreads once others see or hear about the value. With infection there is no inherent threat to existing power structures; an effective practice, like TDD, might spread this way. Subversion differs in that it deliberately seeks to undermine existing authority and is driven by interested parties. The two can be combined; they’re not mutually exclusive. But they have very different implications. Our talk got me thinking about the importance of accounting for power dynamics and vocabulary when we discuss organizational change.

Andrea Chiou reminded me of the value of creating a shared set of experiences for increasing empathy and accelerating understanding. The most effective workshops I’ve run all seem to have had some element of this, even if I didn’t intentionally create it.

Bryan Beecham explained his concept of “human refactoring” and used some very effective software code analogies to describe it. He believes we can use this approach to change our behavior and make better choices. I agree. I’m less confident in his goal to live for 200 years, but I wish him luck.

Weighing in on #NoEstimates


“… plans are useless, but planning is indispensable” – Dwight D. Eisenhower

I’m a big fan of the debates that have been triggered by #NoEstimates. It’s wonderful to have the dialog, even though many of us often end up talking past each other. I think the root of some of the misunderstanding rests in our assumptions about planning and how we think about risk.

I like the #NoEstimates concept because it challenges us to get away from the belief that we can accurately predict the future. Many of us gravitate towards the idea that if we could just estimate our tasks better, then we would become more predictable. This idea has a lot of appeal because, as professionals, we strongly believe we should be able to accurately assess how long our work will take.

And this is true, broadly speaking. We can get a rough sense of how long work will take. However, there will always be an element of uncertainty remaining, if not in the task itself, then in our environment or circumstances. I experienced a great example of this earlier this week when the power went out just as I was bringing a new printer online. The task duration grew dramatically because of this external block.

If uncertainty is unavoidable then how can we plan? This is the question that I think many discussions of #NoEstimates get hung up on. How should we plan?

Planning is beneficial. It can help us determine the right thing to work on and when to work on it; it can align the work of multiple teams (or multiple disciplines); it can give us a window into how effective we are at delivering value, and how often we do so; and it can ensure that everyone is working together, towards a common goal.

All these outcomes are possible, even in conditions of uncertainty, provided we make planning a recurring activity. If the future is unpredictable, then our plans are inherently flawed. We must revisit them periodically to update them as we learn more. Otherwise, their value decays rapidly with time.

I think an emphasis on planning, not plans, is a natural consequence of the #NoEstimates concept. However, it’s easy to conclude that #NoEstimates makes planning impossible or irrelevant, because how can we plan without any idea of how long a task will take?

The answer is that we can spend just enough time conceptualizing, categorizing, or breaking down the work to make the planning activity valuable. This can be done without estimating. Some teams I work with just get their stories “small enough;” others try to conceptualize them in terms of “units” of work. None of these teams believe they “estimate,” yet they plan effectively.
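As one illustration of planning without estimating, here is a minimal throughput-based forecast. The team simply counts finished stories per iteration and projects from that history; the numbers below are invented:

```python
import statistics

# Completed stories per iteration, from a hypothetical team's history.
throughput_history = [6, 4, 7, 5, 6, 3]

def forecast_iterations(backlog_size, history):
    """Project how many iterations a backlog may take, using the best and
    worst observed throughput plus a typical (median) case. No individual
    story is ever estimated -- we just count what got done."""
    best, worst = max(history), min(history)
    typical = int(statistics.median(history))
    ceil = lambda n, d: -(-n // d)  # integer ceiling division
    return {
        "optimistic": ceil(backlog_size, best),
        "typical": ceil(backlog_size, typical),
        "pessimistic": ceil(backlog_size, worst),
    }

print(forecast_iterations(30, throughput_history))
# {'optimistic': 5, 'typical': 6, 'pessimistic': 10}
```

Presenting the forecast as a range, rather than a single date, also keeps the uncertainty visible instead of hiding it inside an estimate.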

Closely tied to concerns about plans and planning is how we think about risk. Risk is inherent to what we do; it’s in the uncertainty of our predictions and in the new and innovative solutions we try to bring to customers. When we emphasize regular planning, we recognize this and provide a release valve—a way for risks to be rapidly identified and discussed as a team. This creates an environment where risk becomes a shared responsibility.

#NoEstimates reflects experience with a very different type of environment. Often when development groups are asked for an estimate, what’s really happening is that risk is being transferred. An estimate of two months triggers an expectation of delivery in two months. If there is no commitment to regularly revising this plan, then any uncertainty becomes the responsibility of the team that gave the estimate. Risk is not a shared responsibility; it’s been transferred exclusively to the team, and their estimate was the transfer mechanism.

Not surprisingly, many teams want to avoid this dynamic. Since the estimate was the mechanism that gave them responsibility for all the risk, #NoEstimates has great appeal. If it can trigger a recurring approach to planning, then the team will develop a healthier approach to managing risk and uncertainty.

For some more specifics on how this might work, check out this post from Paul Boos.

Nominated for Brickell Key

I’ve been nominated for the Brickell Key Award! I’m tremendously excited about this. The award highlights excellence with Kanban, honoring people who have shown outstanding achievement, leadership, and contribution to the Kanban community. I’m also rather stunned.

I don’t think of myself as having shown outstanding leadership or achievement. I really value Kanban though, and I talk about it almost everywhere I go, so maybe there’s something to it.

What I appreciate most about Kanban is that it is a deliberate attempt to create a shared cognitive framework, a shared view of what we are doing. This makes it much easier to work together, cooperate, and collaborate towards a common end. I’ve seen it with the teams I work with, and I’ve also seen it in my own home.

We hear a great deal about the importance of culture and how Agile and Lean require a specific cultural mindset. In most organizations this requires cultural change. I have to agree with this. Although my work is generally considered “process improvement,” the vast majority of my efforts focus on improving how teams work together and relate to each other; this improves their culture, or at least changes their perception of it. In many cases, my work would be better described as “cultural change” rather than “process change;” the two go hand in hand.

The wonderful thing about Kanban is that it gives us a tool to work on culture directly, without ever mentioning the concept. By providing a shared frame of reference, a Kanban creates a new cognitive framework. This can overcome existing biases and assumptions and help bring a team together. It’s a powerful constraint that can trigger changes in the way people work and relate to each other. Seeing a team move from infighting and division to collaborative self-organization in this way is a wonderful thing.

And it is disheartening to see it abused. A Kanban can be a wonderful tool, but it can also be a powerful mechanism for division and control. I’ve seen managers construct Kanban systems that enhanced their power and disenfranchised subordinates. It’s rare, but it can happen.

This hasn’t reduced my enthusiasm for it though. And I’m excited to share what I’ve learned and collaborate with others at Lean Kanban North America 2015 in June. I’m also excited that I might be honored with the Brickell Key Award.

If you’ve worked with me and want to put in a good word of support, please follow this link.

I Always See the Gorilla

By now most of you have probably seen one or more variants of the “Gorilla Video.” I’ll do my best not to spoil it in case you haven’t, as the experience can be quite illuminating. What I’ve always found most interesting about those videos is that I always see the gorilla, and in similar videos, I tend to notice more of the “changes” or the “hidden elements” than most.

As part of the LeanUX14 Conference I attended a Cynefin Workshop hosted by Dave Snowden. It was a wonderful experience and he provided an explanation for why most of us miss the gorilla. We normally take in a very small portion of what’s in front of us; 5% appears to be typical for those of us from a western background. We don’t “see” everything. We filter things out based on a system of pattern recognition, even if those things—like the gorilla—are right in front of us. In other words, we observe what we’ve been conditioned to see.

This thought sat fermenting in my head for a good while. It made perfect sense from a scientific and evolutionary perspective. What was it then about my own experience—my own conditioning—that made me more likely to observe more than my peers? I wasn’t ready to accept the idea that I was just unusual; I like to have explanations for things.

It wasn’t until Michael Cheveldave (a colleague of Snowden’s) was giving a presentation later in the conference that a potential explanation hit me. Michael was talking about the concept of Cynefin and its meaning as “place of multiple belongings.” He had a picture of a green valley in Western Canada up on the screen, and he said that although he travels many places, that place—that green valley—was his home.

I don’t have a place like that. My family moved around a lot when I was young. I can feel “at home” in the upper Midwest where I was born, in Indiana and Illinois, and in Virginia where I live now. Wild places, like the Boundary Waters of Northern Minnesota and the desert plateau of Utah, feel like home. London, Paris, and Munich feel as much like home as Washington, DC. In a similar way, the streets of Chennai always feel welcoming when I return to them.

Many places feel like “home” to me, but I have no place like Michael’s valley, no place that is definitively and absolutely home. When I realized that, I thought I had an explanation for why I see the gorilla.

We are conditioned by our surroundings. We learn what to expect. When we have a home, we adjust to it; we learn what we need to pay attention to and what we can safely ignore. I think my youth and my frequent movements conditioned me to expect new situations—new places, new things, new contexts. For me, being able to see and take in subtle variations meant the difference between a fun day at school and an unwanted bullying. I not only learned to pay attention but became conditioned to do so.

The important implication is that our environments—our management systems, our Kanban boards, and our tools—are conditioning us. Are yours triggering the kind of behavior you want to see? Are they enabling people? Are they fostering learned helplessness? What gorillas do you see? Which ones are you missing?

The Danger of Singular Attractors

“Responding to Change over Following a Plan”

That phrase is one of my favorites from the Agile Manifesto. In mulling it over the other day—and thinking about how to explain its value to others—I felt that a historical example could help explain the dangers of adhering too strongly to a specific plan when circumstances change. Looking at how the Imperial Japanese Navy (IJN) approached the idea of “decisive battle” before and during World War Two is an excellent way of illustrating the value of “responding to change over following a plan.”

The Japanese planned for war assuming that an American Fleet would steam across the Pacific from its base in Hawaii. The fleet would recapture—or relieve, if Japanese attempts to capture them failed—the Philippine Islands and also seek bases from which war could be brought to the Japanese homeland, such as Okinawa or Formosa. The Japanese plan was to prevent this by engaging the American Fleet in a large naval action, crippling it, and ending its journey. Victory in this “decisive battle” was how the IJN expected to win the war.

This objective would be made easier by subjecting the American Fleet to attritional attacks as it moved west. Before the decisive battle, attacks by airplanes, submarines, and surface forces would reduce the strength of the American force, increasing the odds of victory in the decisive fleet action. Japanese light forces and submarines would attack from bases in the Marshall and Caroline Islands, granted to Japan in the aftermath of World War One. These islands sat astride the most direct route of advance across the Pacific, the one the Americans planned to take.

Japanese weapons, doctrine, tactics, and force structure reflected this emphasis on attritional attacks and decisive battle. They invested in powerful, long-ranged torpedoes, perfect for night attacks on enemy formations. Land-based and carrier planes were extremely long-ranged, so that they could attack at ranges from which the Americans could not respond. Ships were built with an emphasis on offensive action and were powerful for their size. Relatively little attention was devoted to survivability, maintainability, or logistics. The focus on decisive battle trumped other considerations.1

There were good reasons for the IJN to invest so heavily in this core assumption. It reflected their recent history, and the lessons they had drawn from prior wars. The Russo-Japanese War had ended in a decisive action, with the Russian Baltic Fleet steaming around the world to its destruction at Tsushima.2 The Sino-Japanese War had also featured a major fleet action, victory in which gave the Japanese command of the sea. This historical experience was augmented by Japanese interpretations of the writings of Alfred Thayer Mahan, one of the most influential thinkers on naval strategy of the late 19th Century. His theories had a significant influence on Japanese naval strategy leading up to World War Two.3 The result was that the IJN focused almost exclusively on the decisive battle and the attritional campaign leading up to it.

The daring attack on Pearl Harbor was the culmination of this thinking. In the years before the war, modifications to the basic plan had moved the anticipated location of the battle farther eastward and closer in time to the initiation of the war. The IJN’s quest for longer ranges and earlier action influenced its thinking strategically as well as tactically. But Pearl Harbor proved to be just the beginning.

In the language of complexity, the concept of the decisive battle became a “singular attractor” for the IJN. Plans, tactics, and doctrine gravitated towards it, to the exclusion of other potential alternatives. This singular focus began to blind the IJN to other possibilities; the leadership ignored “weak signals” that suggested different paths to victory.

This is the underlying reason why the Agile Manifesto places greater emphasis on “responding to change” than “following a plan.” When we follow a specific plan—when we emphasize its employment over alternatives—we risk creating a singular attractor that blinds us to alternatives. We focus on the plan; we invest ourselves in it; and we try to ensure that we conform to it. We do this even when other, more effective, options arise.

This focus on a specific plan can trigger all manner of problems, but the most insidious is that the more we invest in the plan, the more power we give it as a singular attractor. The more attached we become, the less responsive we are to alternatives that could allow us to achieve the same objective with less effort. We become willingly ignorant of the weak signals in the environment; we miss potential opportunities with more favorable outcomes.

This is what happened to the IJN. The faith it placed in the decisive battle was misplaced. American strategic mobility and industrial might permitted an advance on two fronts—through the South Pacific and across the Central Pacific. The Japanese lacked the forces to hold back both; their attempts to do so spread them too thin and left them vulnerable.

The investment in attritional tactics did not play out as hoped. Attritional combat injured the Japanese as much as the Americans. In mid-1944, when the opportunity arose for the decisive battle, the Japanese lacked the capability to execute their prewar plans. The result was shattering defeat at Philippine Sea and Leyte Gulf. With their prewar plans finally thwarted, the Japanese formally adopted kamikaze tactics.

Software teams that adhere to specific plans despite changing circumstances often adopt similar—but much less lethal—approaches, sacrificing themselves and the future of their applications in a fury of long hours and weekends. They, like the IJN before them, fall victim to an inordinate focus on a singular attractor. They are often no more successful, and the long-term harm done to morale and application architecture often bears a certain resemblance to the brave—but futile—kamikaze tactics of the Japanese.

1. Kaigun: Strategy, Tactics, and Technology in the Imperial Japanese Navy, 1887-1941, David C. Evans and Mark R. Peattie (Naval Institute Press, 1997) and Sunburst: The Rise of Japanese Naval Air Power, 1909-1941, Mark R. Peattie (Naval Institute Press, 2003)

2. Battle of Tsushima

3. From Mahan to Pearl Harbor: The Imperial Japanese Navy and the United States, Sadao Asada (Naval Institute Press, 2006)

Thoughts on Slack and WIP

The other day, I was reading this excellent posting by Matt Heusser on the dangers and consequences of having too much work in progress (WIP). It mirrors my own experiences over the past few months.

I have a number of techniques in place to manage my own work and keep WIP at a productive level, but I’ve had an unanticipated number of requests from colleagues for assistance. And I’m always happy to help… You can easily see where that leads. Before I knew it, I was overwhelmed.

After a stimulating conversation with Adam Yuret last night, I realized that Matt’s posting only looks at part of the story. It uses physical systems, like traffic and networks, to illustrate the negative results of having too much WIP. I do this too when I talk about WIP; it makes the concept readily accessible and works really well. But it misses something. It doesn’t look at the benefits of slack.

We humans are not mechanical; the costs of high WIP are even greater for us than they are for physical systems. This is because our brains continue to work on and consume varied ideas and experiences subconsciously. When we have too much to do, when we’re too focused on task, there’s too little time to step away from problems and allow these ideas to find their way to the forefront of our minds. This creates stress and tension.

When I took time away last night to have that conversation with Adam, I created slack time. I took my mind off of the topics I’d been working with for several days. I forgot my own challenges for a little while. And when the talk was over, I was hit with a wave of creativity. New ideas bubbled up; I started considering potential solutions for problems I’d been mulling over for months. I had at least one epiphany, and what I hope will be a few other good ideas.

Without taking the time to make some slack, I don’t think those ideas would ever have made it to my conscious mind. I needed that slack. I think all of us do.

So there are two sides to the high WIP problem. The first is that it pushes us beyond our capacity and bogs us down. The second is that it suppresses our creativity. Either one of these can be crippling, but when they combine, the challenges can seem insurmountable.
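The “bogs us down” half can even be quantified with Little’s Law, a standard queueing result often cited in Kanban circles: average cycle time equals average WIP divided by average throughput. A tiny sketch, with invented numbers:

```python
# Little's Law: average cycle time = average WIP / average throughput.
# Illustrative figures only: suppose I finish about 5 tasks per week.
throughput = 5.0  # tasks completed per week

def cycle_time(wip, throughput):
    """Average time (in weeks) a task spends in progress."""
    return wip / throughput

for wip in (5, 10, 25):
    print(f"WIP {wip:>2}: each task takes ~{cycle_time(wip, throughput):.1f} weeks")
```

Taking on five times the work doesn’t make anything finish faster; it just makes every individual task take five times as long to get done. The creativity cost of lost slack comes on top of that.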

How Can We Learn When Lessons Take So Long?


Alfred Thayer Mahan

Last week I was discussing the idea of software rewrites with a good friend. It was a relevant topic; different teams that we work with are being asked to, or are in the process of, rewriting various pieces of software. But our conversation wasn’t about the mechanics of rewriting applications; it was about the decision to do so, and whether that decision was the appropriate one.

Rewrites are costly. The costs are almost always larger than anticipated—both in time and effort—and failure to anticipate them correctly provides competing businesses with an opportunity. While the organization focuses on the rewrite, and tries to build to parity with their “legacy” solution, new features get pushed lower on the priority list. This creates a gap with evolving customer expectations; the longer the rewrite takes, the larger this gap tends to become. If competing businesses can step into it, they can seize market share while the rewriting organization is busy working towards “feature parity.” Both of us had seen this happen. It seems to be a common theme with major software rewrites.

My friend and I had learned from this experience. Since both of us had been party to rewrites that took longer than anticipated and had important business consequences, we were wary of them and argued for approaches that accounted for these business risks. Our experiences made us wonder if organizations—not just ours, but organizations in general—could effectively learn from these experiences.

The question is an important one. The tenure of engineering managers and their immediate superiors is relatively brief, at least compared to the cycle of software rewrites. Software systems can last for a generation, twenty years or more. A software engineer might see one or two major rewrites during a career. Would that give enough experiential knowledge to avoid a poor decision? We thought it unlikely, especially if managers and decision-makers were moving on to new responsibilities about once every five years.

This wasn’t an inspiring conclusion. We couldn’t help but wonder if software organizations might have real difficulties learning important lessons because of these dynamics. If average leadership tenure is less than ten years and the feedback cycle from a rewrite is double that, how can an organization be expected to learn from one?

In a world where Agile approaches and fast feedback loops have become so common, there are still aspects to our systems that have long cycles, and these can inhibit effective learning.

Watching President Obama’s speech to the U.N. the other day, as he laid out the case for a campaign against ISIL, I wondered if the same might not be true for the U.S. government. It is dangerous to draw specific parallels between American involvement in Southeast Asia—or more specifically Vietnam—and the recent entanglements in the Middle East, but for students of history, it is almost impossible not to. Similar themes reemerge, such as overconfidence in military force, an emphasis on winning tactical victories rather than defining strategic goals, and relative ignorance of the importance of historic and cultural contexts. Presidents, just like software managers, can have difficulty with long feedback loops because of their limited tenure.

In the late nineteenth and early twentieth century, with limited experience in fighting naval wars, the U.S. Navy attempted to solve this problem through the study of history. This was a core aspect of the approach of Alfred Thayer Mahan and the work of his colleagues at the Naval War College. Historical study augmented experiential knowledge and was used to illustrate broad themes. These broad themes became principles that formed the foundation of the Navy’s approach to tactics and doctrine in the early part of the twentieth century. If the performance of the Navy in World War Two is any indication, Mahan’s approach was successful.

Do software teams need something similar? Does the U.S. Government?

What and When to Automate?

This post springs out of a Twitter conversation with Marc Burgauer and Kim B. They will also be sharing their thoughts on what and when to automate (here and here, respectively).

My simple answer is that automation is most valuable when it can provide rapid feedback into the decisions people make.

When the question came up, I immediately thought about my experiences developing software, and the automation of testing cycles. I have developed an ingrained assumption that some types of automated testing are inherently “good.” It was fortunate that Kim was so pointed in her questioning. I was forced to revisit my assumptions and come at the question in another way in order to respond with a considered answer.

I believe the development of the US Navy’s early surface fire control systems is a useful illustration of effective automation. These systems were intended to allow a moving ship to fire its guns accurately and hit another ship at ranges of five to ten miles or more. At the time these systems were developed—between 1905 and 1918—these were significant distances; hitting a moving target at these ranges was not easy.

The core of these systems was a representative model of the movements of the target. At first, this model was developed manually. Large rangefinders observed the target and estimated its range. Other instruments tracked the target and recorded its bearing. These two—bearing and range—if observed and recorded over time, could be combined to develop a plot of the target’s movements. At first, the US Navy’s preferred approach was to plot the movements of the firing ship and the target separately. This produced a bird’s eye plot which could be used to predict the future location of the target, where the guns would have to be aimed to secure a hit.
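The manual plotting step can be sketched in a few lines of code. This is an illustrative model only, not the Navy's actual procedure: the coordinate convention, units, and function names are my own assumptions, and the geometry is simplified to a flat plane centered on the observing ship.

```python
import math

# Hedged sketch: converting timed observations of range and bearing into
# positions on a bird's-eye plot. Coordinates are (east, north) relative
# to the observing ship; bearing is measured clockwise from north.

def plot_position(range_yds: float, bearing_deg: float) -> tuple:
    """Convert a (range, bearing) observation to (east, north) coordinates."""
    rad = math.radians(bearing_deg)
    return (range_yds * math.sin(rad), range_yds * math.cos(rad))

# Successive observations trace the target's track across the plot:
track = [plot_position(r, b) for r, b in [(10_000, 45.0), (9_800, 46.0)]]
```

Plot enough of these positions over time and the target's course and speed emerge, which is exactly what the plotting teams were trying to recover by hand.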

Feedback was incorporated into the system to allow it to be successful. At first there was only a single feedback loop. A “spotter” high in the masts of the ship watched the movement of the target and observed the splashes of the shells that missed. To make this process easier, the Navy preferred “salvo fire,” which meant firing all available guns in a battery at once, maximizing the number of splashes. Depending on where these shells landed, the spotter would call for corrections. These corrections would be fed back into the model, in order to improve it.

The process did not work well. Building the model manually required numerous observations and took a lot of time. A different approach was adopted which involved measuring rates of change—particularly the rate at which range was changing—and aiming the guns based on that. This was less desirable, as it was not a comprehensive “model” of the target’s movements. However, automatic devices could be used to roughly predict future ranges once the current rate of change was known, allowing the future position of the target to be predicted more rapidly.

These “Range Clocks” were a simple form of automation. They took two inputs—the current range and the rate at which it was changing—and gave an output based on simple timing. They reduced workload, but did not provide feedback. They also did not account for situations where the rate of range change was itself changing. Automation would have been better focused on something else, and ultimately it was.
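The range clock's logic amounts to linear extrapolation. A minimal sketch, with invented names and illustrative units:

```python
# Hedged sketch of the "range clock" idea: dead-reckon the future range
# from the current range and its observed rate of change. This assumes
# the rate is constant, which is precisely the device's limitation.

def predicted_range(current_range_yds: float,
                    range_rate_yds_per_min: float,
                    minutes_ahead: float) -> float:
    """Linear extrapolation of range; breaks down when the rate changes."""
    return current_range_yds + range_rate_yds_per_min * minutes_ahead

# A target closing at 200 yards per minute from 10,000 yards:
predicted_range(10_000, -200, 2)  # 9,600 yards after two minutes
```

The single assumption baked into that one line, a constant range rate, is exactly what failed whenever either ship maneuvered.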

The early fire control systems reached maturity when the model of the target’s movements was automated. The Navy introduced its first system of this type in 1916. Called a “Rangekeeper,” this device was a mechanical computer that used the same basic observations of the target (range and bearing, along with estimates of course and speed) to develop a model of its movements.

The great advantage of this approach over previous systems was that the model embedded in the Rangekeeper allowed for the introduction of another level of feedback into the system. The face of the device provided a representation of the target. This representation graphically displayed the computed target heading and speed. Overlaid above this representation were two lines that indicated observed target bearing and observed target range.

If the model computed by the Rangekeeper was accurate, the two lines indicating observed bearing and range would meet above the representation of the target course and speed. This meant that if the model was not accurate—due to a change of course by the target or bad inputs—the operator could recognize it and make the necessary corrections. This made for faster and more accurate refinements of the model. Automation in this case led to faster feedback, better decisions, and ultimately more accurate gunfire.
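The Rangekeeper's feedback loop can be sketched as follows. Everything here is an illustrative simplification, not the actual mechanism: flat-plane geometry, a stationary observing ship, and invented names and tolerances.

```python
import math

# Hedged sketch of the Rangekeeper's feedback loop: a model of the target's
# position and velocity generates predicted range and bearing; when the
# predictions diverge from fresh observations, the operator knows the model
# needs correcting.

class TargetModel:
    def __init__(self, x, y, vx, vy):
        # Position (yards east/north of own ship) and velocity (yds/min).
        self.x, self.y, self.vx, self.vy = x, y, vx, vy

    def advance(self, minutes):
        """Run the model forward in time."""
        self.x += self.vx * minutes
        self.y += self.vy * minutes

    def predicted(self):
        """The model's predicted (range, bearing) for comparison dials."""
        rng = math.hypot(self.x, self.y)
        brg = math.degrees(math.atan2(self.x, self.y))  # clockwise from north
        return rng, brg

def model_disagrees(model, observed_range, observed_bearing,
                    range_tol=100.0, bearing_tol=1.0):
    """True when observation and prediction no longer 'meet' on the dial."""
    rng, brg = model.predicted()
    return (abs(rng - observed_range) > range_tol or
            abs(brg - observed_bearing) > bearing_tol)
```

The point of the design is in `model_disagrees`: the device made the gap between model and observation visible at a glance, so corrections could be made as fast as observations arrived rather than after a slow manual replot.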

When we think about automating in software, I believe it is better to concentrate on this type of automation—the kind that can lead to more rapid feedback and better decision-making. Automated unit tests can provide exactly this, telling us immediately when a build is broken, and many teams use them in just this way.
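A trivial example of the pattern. The function and business rule here are invented for illustration; the point is that the test fails the moment a change breaks the rule, giving the developer feedback within seconds rather than after a release:

```python
# A hypothetical pricing rule and the unit test that guards it.

def discounted_price(price: float, quantity: int) -> float:
    """Apply a 10% discount on orders of ten or more items."""
    return price * quantity * (0.9 if quantity >= 10 else 1.0)

def test_bulk_discount():
    assert discounted_price(5.0, 10) == 45.0  # discount applies at ten
    assert discounted_price(5.0, 2) == 10.0   # no discount below ten
```

Like the Rangekeeper's dials, the test's value is not the computation itself but the immediate, visible signal when model and reality diverge.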

When we approach the problem this way, we’re not just providing an automated mechanism for a time-consuming repetitive task. There is some value to that—this is the approach the Navy took with the range clock—but it is more valuable if our automation enables better decisions through faster feedback. Decisions are difficult; often there is less information than we would like. The more we can leverage automation to improve decision-making, the better off we will be. This is the approach the Navy took with the Rangekeeper, and I think it’s a valuable lesson for us today.

Making Sense of “The Good, the Bad, and the Ugly”

I recently attended a Cynefin and Sense-Making Workshop given by Dave Snowden and Michael Cheveldave of Cognitive Edge. It was an excellent course and a useful introduction to how to apply concepts from complex adaptive systems, biology, and anthropology to better understand human approaches to problem solving.

The Cynefin framework is an elegant expression of these ideas. It posits five domains that reflect the three types of systems we encounter in the world: ordered systems, in which outcomes are predictable and repeatable; chaotic systems, which are inherently unpredictable and temporary; and complex systems, in which the system and the actors within it interact to shape an unpredictable future.

We can use the Cynefin framework to help us make sense of our current situation and understand what course of action might be best at a given moment. If we are dealing with an ordered system, then we are in one of the ordered domains, either “Obvious” or “Complicated.” In either of these circumstances, we can reason our way to the right answer, provided we have the necessary experience and expertise. The predictability of the system permits this.

If, however, we are in the “Chaotic” domain, the system is wholly unpredictable. The “Complex” domain embraces complex adaptive systems: those that are governed by some level of constraint yet remain unpredictable. Think of the foot traffic in your local shopping mall, and you can get some idea of how these systems manifest: you can purposefully walk from one end to the other, but if the mall is crowded, you can’t predict the course you’ll have to take to get there.

A fifth domain, “Disorder,” exists to explain those times where our current state is unknown.

To increase our familiarity with how to use the Cynefin framework, we performed a number of exercises. In one of them, my tablemates (including Adam Yuret and Marc Burgauer) and I tried to make sense of the final, climactic scene of “The Good, the Bad, and the Ugly.” Spoilers follow, so if you haven’t seen it, now’s a good time to bail out.

The scene involves a three-way standoff between “Blondie” (Clint Eastwood), “Angel” (Lee Van Cleef), and “Tuco” (Eli Wallach). The three gunslingers stand in a rough triangle at the center of a graveyard. Blondie’s written the location of the treasure on the bottom of a rock, and placed it at the center of the triangle. None of them wants to share the treasure.

At first blush, it seems to be an ideal example of a complex system. As soon as any one of them acts, the others will fire, and the standoff will end, but no one can predict how. That’s why each of them stands there, eyeing one another cautiously, as the tension builds to Ennio Morricone’s music.

But that’s not the truth of the matter. Blondie is no fool. He’d gotten the drop on Tuco and had time to unload Tuco’s weapon. As we watch the scene, we don’t know this, but for Blondie, the situation is well-ordered. All he needs to do is pick the right time to gun Angel down. Blondie knows Tuco’s not a threat.

The other two must deal with more unknowns. It’s not a chaotic system for them. There is a certain level of predictability. Someone will shoot. But the details of who that will be—and when he will fire—are uncertain. What happens after that is anyone’s guess. Both Tuco and Angel want to trigger a specific outcome—their survival and the death of the other two—but exactly how to manage this outcome is impossible to predict given the other elements of the system. It’s a perfect example of a complex adaptive system.

We thought this was an extremely useful example to help us “make sense” of Cynefin and the concepts it embraces. I hope you do too.