Taiichi Ohno was one of the inventors of the Toyota Production System. His book Toyota Production System: Beyond Large-Scale Production is a fascinating read, even though it’s decidedly non-practical. After reading it, you might not even realize that there are cars involved in Toyota’s business. Yet there is one specific technique that I learned most clearly from this book: asking “Why?” five times.
When something goes wrong, we tend to see it as a crisis and seek to blame. A better way is to see it as a learning opportunity, and to ask “why” five times to get to the root cause of the problem.
How “The Five Whys” Works
Let’s say you notice that your website is down. Obviously, your first priority is to get it back up. But as soon as the crisis is past, you have the discipline to have a post-mortem in which you start asking why:
- Why was the website down? The CPU utilization on all our front-end servers went to 100%
- Why did the CPU usage spike? A new bit of code contained an infinite loop!
- Why did that code get written? So-and-so made a mistake
- Why did his mistake get checked in? He didn’t write a unit test for the feature
- Why didn’t he write a unit test? He’s a new employee, and he was not properly trained in TDD
So far, this isn’t much different from the kind of analysis any competent operations team would conduct for a site outage. The next step is this: you have to commit to making a proportional investment in corrective action at every level of the analysis. So, in the example above, we’d have to take five corrective actions:
- Bring the site back up
- Remove the bad code
- Help so-and-so understand why his code doesn’t work as written
- Train so-and-so in the principles of TDD
- Change the new engineer orientation to include TDD
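If it helps to see the pairing written down, here is a minimal sketch of the analysis above as data, with one corrective action attached to each level. The names `Why` and `unaddressed_levels` are hypothetical, invented for this illustration; they don't come from Ohno's book or from any particular tool. The only point is that a level with no corrective action should be easy to spot.

```python
from dataclasses import dataclass


@dataclass
class Why:
    """One level of the analysis (hypothetical structure, for illustration)."""
    question: str
    answer: str
    corrective_action: str


def unaddressed_levels(whys):
    """Return the levels that do not have a corrective action yet."""
    return [w for w in whys if not w.corrective_action.strip()]


# The website-outage example, one corrective action per level.
analysis = [
    Why("Why was the website down?",
        "CPU utilization on all front-end servers went to 100%",
        "Bring the site back up"),
    Why("Why did the CPU usage spike?",
        "A new bit of code contained an infinite loop",
        "Remove the bad code"),
    Why("Why did that code get written?",
        "So-and-so made a mistake",
        "Help so-and-so understand why the code doesn't work as written"),
    Why("Why did the mistake get checked in?",
        "No unit test was written for the feature",
        "Train so-and-so in the principles of TDD"),
    Why("Why wasn't a unit test written?",
        "New employee, not properly trained in TDD",
        "Change the new engineer orientation to include TDD"),
]

if __name__ == "__main__":
    for level, why in enumerate(analysis, start=1):
        print(f"{level}. {why.question} {why.answer}")
        print(f"   Action: {why.corrective_action}")
    missing = unaddressed_levels(analysis)
    print(f"Levels without a corrective action: {len(missing)}")
```

Whether you track this in code, a wiki page, or an email hardly matters. What matters is that each answer in the chain gets its own investment, however small.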
This technique should be used for all kinds of defects, not just site outages. Each time, we use the defect as an opportunity to find out what’s wrong with our process, and make a small adjustment. By continuously adjusting, we eventually build up a robust series of defenses that prevent problems from happening. This approach is at the heart of breaking down the “time/quality/cost pick two” paradox, because these small investments cause the team to go faster over time.
In the example above, what started as a technical problem actually turned out to be a human and process problem. This is completely typical. Our bias as technologists is to over-focus on the product part of the problem, and five whys tends to counteract that tendency.
Putting It Into Practice
How do you get started with five whys? Start with a specific team and a specific class of problems, and choose a single person to be the “Five Whys” master. This person will run the post-mortem whenever anyone on the team identifies a problem. Don’t let them do it by themselves; it’s important to get everyone who was involved with the problem (including those who diagnosed or debugged it) into a room together. Have the “Five Whys” master lead the discussion, and give them the power to assign responsibility for the solution to anyone in the room.
Once that responsibility has been assigned, have the person who now owns the solution email the whole company with the results of the analysis. This last step is difficult, but I think it’s very helpful. Sharing this information widely gives everyone insight into the kinds of problems the team is facing and how the team is tackling those problems.
Over time, people get used to the rhythm of five whys, and it becomes completely normal to make incremental investments. Most of the time, you invest in things that otherwise would have taken tons of meetings to decide to do. And you’ll start to see people from all over the company chime in with interesting suggestions for how you could make things better. Now, everyone is learning together – about your product, process, and team. Each five whys email is a teaching document.
So thank you, Taiichi Ohno. I think you would have liked seeing all the waste we’ve been able to drive out of our systems and processes, all in an industry that didn’t exist when you started your journey at Toyota. And I especially thank you for proving that this technique can work in one of the most difficult and slow-moving industries on earth: automobiles. You’ve made it hard for any of us to use the most pathetic excuse of all: surely, that can’t work in my business, right? If it can work for cars, it can work for you.