Chaos Monkey

First things first: I’m a big fan of monkeys.

Second: We all know they can cause trouble.

Bill Conerly, a columnist for Forbes, wrote this short piece on the idea of a Chaos Monkey. Besides being a great idea for a tattoo, the Chaos Monkey is a program Netflix created to test its own systems. In the words of Netflix's engineers:

If our recommendations system is down, we degrade the quality of our responses to our customers, but we still respond. We’ll show popular titles instead of personalized picks. If our search system is intolerably slow, streaming should still work perfectly fine.

One of the first systems our engineers built … is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
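
For the programmers in the audience, here is a rough sketch of both ideas in Python: a homepage that falls back to popular titles when recommendations are down, and a monkey that randomly kills one running service. This is not Netflix's code. The service names, the titles, and every function here are made up for illustration.

```python
import random

# Hypothetical fallback list, shown when personalized picks are unavailable.
POPULAR_TITLES = ["Popular Title A", "Popular Title B", "Popular Title C"]


def personalized_picks(user, service_up):
    """Stand-in for a recommendations service; raises when it is down."""
    if not service_up:
        raise ConnectionError("recommendations service is down")
    return [f"pick {i} for {user}" for i in range(1, 4)]


def homepage_rows(user, service_up):
    """Degrade instead of failing: fall back to popular titles."""
    try:
        return personalized_picks(user, service_up)
    except ConnectionError:
        return POPULAR_TITLES


def chaos_monkey(instances, rng):
    """Randomly pick one running instance, mark it dead, and return its name."""
    running = [name for name, up in instances.items() if up]
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = False
    return victim


# Toy "architecture": service name -> is it running?
instances = {"recommendations": True, "search": True}
victim = chaos_monkey(instances, random.Random())
rows = homepage_rows("alice", instances["recommendations"])
print(f"killed: {victim}; homepage still shows {len(rows)} titles")
```

Whichever service the monkey picks, the homepage still comes back with something to show. That is the whole point of the exercise.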

What happens if one of your key employees quits? What happens if your supplier raises fertilizer prices by a factor of three? What happens if your bank calls in your loan?

Really bad things probably won’t happen, but what if they did? What if all those things happened at once?

Sometime in the next few weeks, sit down and think about it.

Be your own Chaos Monkey.


