Extreme and Surprising Events Happen More Often Than You Think
I just read The Black Swan, by Nassim Taleb. To summarize the whole book in 10 words: Extreme and surprising events happen more often than you think.
Many psychology studies have shown that humans are inherently bad at dealing with improbable events. (See my recent blog on Shark Island.) Taleb talks about this, but to a mathematically minded person like me, his real point is much scarier. He argues that bell curves and standard deviations, the tools that number people use to understand probability, often fail in the real world. With Gaussian bell curves, the probability of extreme events goes down faster than exponentially as you get further from the average. But in many real-world situations, the probability of extreme events goes down much more slowly. The tail is fatter. If you trust standard statistics, you could end up in big trouble.
Taleb uses concrete examples to build intuition. People's height follows a normal bell curve, but wealth does not. Suppose you randomly select ten people out of the entire world, and check their height. The average will be six feet, or whatever it is. Now take the world's tallest person and add him to the mix. The average only goes up three inches. Increase your random sample to a hundred, and throwing in the tallest person changes things by less than an inch.
Now try the same thing with wealth. Take ten random people worldwide, and their average income is $10,000 or whatever—remember I said worldwide. But add Bill Gates into the mix, and the average goes up many thousand-fold. Even if you increase the sample size to 1000, adding Bill makes the average several hundred times higher—from ten thousand dollars to millions.
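The arithmetic behind both thought experiments is the same one-line update to an average. Here is a minimal Python sketch of it. The specific inputs are illustrative assumptions, not figures from the book: a tallest person of about 8'11" (107 inches) and a top annual income of $4 billion, picked only because it is roughly consistent with the ratios quoted above.

```python
def mean_after_adding(sample_mean, sample_size, outlier):
    """Mean of a sample after one extra value is added to it."""
    return (sample_mean * sample_size + outlier) / (sample_size + 1)

# Assumed figures, for illustration only.
AVG_HEIGHT_IN = 72.0     # six feet, as in the text
TALLEST_IN = 107.0       # assumption: roughly 8'11"
AVG_INCOME = 10_000.0    # the "$10,000 or whatever" from the text
TOP_INCOME = 4e9         # assumption: a hypothetical $4 billion per year

for n in (10, 100, 1000):
    height_shift = mean_after_adding(AVG_HEIGHT_IN, n, TALLEST_IN) - AVG_HEIGHT_IN
    income_ratio = mean_after_adding(AVG_INCOME, n, TOP_INCOME) / AVG_INCOME
    print(f"n={n:5d}  height +{height_shift:.2f} in   income x{income_ratio:,.0f}")
```

With these assumed inputs, the height average for ten people moves by a little over three inches, while the income average for a thousand people is still multiplied by roughly four hundred.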
What if height were distributed the way wealth is? Six feet might be the most common height, but there would be many ten-foot people wandering around, and even some hundred-footers. Mathematically speaking, this is the difference between a normal bell curve and a power curve, but an example can show the difference clearly even if you don't care about the math. Consider a town with a million people, and compare how many tall ones there are with a bell curve versus a power curve. (Check the endnote for math-nerd details.)
                       Bell Curve   Power Curve
Count of all people     1,000,000     1,000,000
People over 6'            500,000       500,000
People over 6'3"          158,655       158,655
People over 6'6"           22,750        81,067
People over 7'                 32        34,790
People over 8'                  0        13,143
People over 10'                 0         4,584
People over 100'                0            27
At first the two curves look similar: half the people are over six feet, and 158,655 are over six-three, exactly the same in both cases. But at extreme heights, things are very different. With a bell curve, you will never see a ten-foot person. There is some theoretical probability that it might happen, but it's so rare that you can count on never seeing it in your life. With a power curve, there are not only more than 4,000 ten-footers, but 27 one-hundred-footers! For the planet as a whole, the power curve predicts several dozen people over ten thousand feet tall.
With power curves, extreme and surprising events happen more often than you think.
When you are managing probability (or risk), it is obviously very important to understand which curve applies. How would you design hospital beds for your town of a million if you knew that hundreds of citizens were over thirty feet tall? What if you thought height was a bell curve, and built eight-foot beds, but it turned out later to be a power curve?
Power curves are very common where big guys can grow at the expense of little guys. Tall people can't take height from short people, but large companies (e.g. Microsoft) can take business from small ones. Popular websites (e.g. Google) take web-hits from small ones. As a result, power curves are very common in business statistics. Power curves can also result when there are many interactions between elements. I suspect that failure events in interconnected infrastructures, like the nation's electric grid or enterprise data centers, follow power curve rules.
Taleb mostly worries about the implications for investors, but I see lessons for companies as well. Don't trust your plans. You still have to make plans, but they will change more than you think. Don't trust statistics on small or medium-sized samples unless you know the underlying distribution is a bell curve. If you suspect a sample might follow a power curve, don't trust bell curve statistics at all. Expect surprises.
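The warning about small and medium-sized samples can be illustrated with a quick simulation. This is only a sketch under stated assumptions: heights above six feet are drawn either from the upper half of a normal curve (standard deviation 3 inches) or from a power curve with exponent 1.656 whose tail shrinks by a fixed factor at each 3-inch-scaled doubling. The sample size of 100, the 200 repetitions, and the random seed are arbitrary choices.

```python
import random

ALPHA = 1.656               # exponent from the height example
rng = random.Random(2007)   # fixed seed so the run is repeatable

def bell_height():
    """Height in inches, six feet or taller: upper half of a normal curve."""
    return 72 + 3 * abs(rng.gauss(0, 1))

def power_height():
    """Height in inches from the assumed power curve (inverse-transform sampling)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so u is never zero
    return 72 + 3 * (u ** (-1 / ALPHA) - 1)

def spread_of_sample_means(draw, sample_size=100, trials=200):
    """Range (max minus min) of the sample average across repeated samples."""
    means = [sum(draw() for _ in range(sample_size)) / sample_size
             for _ in range(trials)]
    return max(means) - min(means)

bell_spread = spread_of_sample_means(bell_height)
power_spread = spread_of_sample_means(power_height)
print(f"bell curve:  sample averages vary by {bell_spread:.2f} inches")
print(f"power curve: sample averages vary by {power_spread:.2f} inches")
```

Under the bell curve, the average of a hundred people barely moves from one sample to the next. Under the power curve, a single extreme draw can drag a sample's average far from the others, which is why the usual small-sample statistics are unreliable there.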
[Math-nerd note: In my example, the average for the bell curve is 6 feet, and the standard deviation is 3 inches. For the power curve, I set 6 feet as the starting point in a half million person population, made 6'3" the first doubling point, and selected the exponent to make the probability match the bell curve at 6'3" for easy comparison. The exponent was 1.656. I ignored the population below 6 feet because power curves don't handle the left side of a peaked distribution well. You obviously don't have bazillions of people six inches tall, or people negative a thousand feet tall, so you have to cut off the left side somehow. I'm only half-good with math, so I probably made some mistakes, but I think it's accurate enough to make the point.]
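The note does not spell out the exact formula, so the Python sketch below is a reconstruction under an assumed functional form: the number of people taller than h is taken to be 500,000 × (1 + (h − 72)/3)^(−exponent), with h in inches, and the exponent is solved so both curves agree at 6'3". That form is an assumption, though it does give figures close to the ones in the table.

```python
import math

MEAN_IN = 72.0         # six feet
SD_IN = 3.0            # three inches
TOWN = 1_000_000
HALF_TOWN = TOWN / 2   # people at or above six feet

def bell_count(height_in):
    """Expected number of people taller than height_in under the normal curve."""
    z = (height_in - MEAN_IN) / SD_IN
    return TOWN * 0.5 * math.erfc(z / math.sqrt(2))

# Exponent chosen so both curves agree at 6'3" (one standard deviation up).
ALPHA = -math.log2(bell_count(MEAN_IN + SD_IN) / HALF_TOWN)

def power_count(height_in):
    """Expected number taller than height_in under the assumed power curve."""
    steps = (height_in - MEAN_IN) / SD_IN   # 3-inch steps above six feet
    return HALF_TOWN * (1 + steps) ** (-ALPHA)

print(f"exponent = {ALPHA:.3f}")
for label, inches in [("6'", 72), ("6'3\"", 75), ("6'6\"", 78), ("7'", 84),
                      ("8'", 96), ("10'", 120), ("100'", 1200)]:
    print(f"over {label:6s} bell {bell_count(inches):12,.0f}"
          f"   power {power_count(inches):12,.0f}")
```

Calling power_count(360) gives the thirty-foot figure behind the hospital bed question, a few hundred people in a town of a million.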




In government land, people like to use “smoking crater” as the canonical datacenter disaster. While possible, I don’t like that example because it just doesn’t sound all that probable.
My preferred example happened directly to a gentleman (a datacenter operations manager) next to me on a car rental shuttle in January. It seems his company’s primary datacenter in Austin had just died. To keep this short, Austin was experiencing many days of record cold temperatures. Ice on the power lines? Yes, but not the problem. Water lines froze? Yes, that was a problem, but there was no flood damage. It did, however, shut down the chillers. What are the odds that his datacenter overheated to the point of widespread hardware damage in the middle of a record cold spell? 100%.
Posted by: Chad | July 25, 2007 at 06:40 AM