# How to Measure Anything

## Metadata
- Author: [[Douglas W. Hubbard]]
- Full Title: How to Measure Anything
- Category: #books
## Highlights
- For many decision makers, it is simply a habit to default to labeling something as intangible when the measurement method isn’t immediately apparent. (Location 481)
- Yes, I Mean Anything The reader should try this exercise: Before going on to the next chapter, write down those things you believe are immeasurable or, at least, you are not sure how to measure. After reading this book, my goal is that you will be able to identify methods for measuring each and every one of them. Don’t hold back. We will be talking about measuring such seemingly immeasurable things as the number of fish in the ocean, the value of a happy marriage, and even the value of a human life. Whether you want to measure phenomena related to business, government, education, art, or anything else, the methods herein apply. (Location 503)
- So, if your problem happens to be something that isn’t specifically analyzed in this book—such as measuring the value of better product labeling laws, the quality of a movie script, or the effectiveness of motivational seminars—don’t be dismayed. Just read the entire book and apply the steps described. Your immeasurable will turn out to be entirely measurable. (Location 522)
- Consider the following points. Decision makers usually have imperfect information (i.e., uncertainty) about the best choice for a decision. These decisions should be modeled quantitatively because (as we will see) quantitative models have a favorable track record compared to unaided expert judgment. Measurements inform uncertain decisions. (Location 539)
- The benefits of modeling decisions quantitatively may not be obvious and may even be controversial to some. I have known managers who simply presume the superiority of their intuition over any quantitative model (Location 553)
- Unless someone is planning on selling the information or using it for their own entertainment, they shouldn’t care about measuring something if it doesn’t inform a significant bet of some kind. So don’t confuse the proposition that anything can be measured with the notion that everything should be measured. (Location 572)
- So what does a decision-oriented, information-value-driven measurement process look like? This framework happens to be the basis of the method I call Applied Information Economics (AIE). (Location 578)
- Applied Information Economics: A Universal Approach to Measurement (Location 580)
  1. Define the decision.
  2. Determine what you know now.
  3. Compute the value of additional information. (If none, go to step 5.)
  4. Measure where information value is high. (Return to steps 2 and 3 until further measurement is not needed.)
  5. Make a decision and act on it. (Return to step 1 and repeat as each action creates new decisions.)
- Some of the power tools referred to in this book are in the form of spreadsheets available for download on this book’s website at www.howtomeasureanything.com. This free online library includes many of the more detailed calculations shown in this book. (Location 617)
- In ancient Greece, a man estimated the circumference of Earth by looking at the lengths of shadows in different cities at noon and by applying some simple geometry. A Nobel Prize–winning physicist taught his students how to estimate values initially unknown to them like the number of piano tuners in Chicago. (Location 678)
- A nine-year-old girl set up an experiment that debunked the growing medical practice of “therapeutic touch” and, two years later, became the youngest person ever to be published in the Journal of the American Medical Association (JAMA). (Location 681)
- He didn’t even embark on a risky and potentially lifelong attempt at circumnavigating the Earth. Instead, while in the Library of Alexandria, he read that a certain deep well in Syene (a city in southern Egypt) would have its bottom entirely lit by the noon sun one day a year. This meant the sun must be directly overhead at that point in time. He also observed that at the same time, vertical objects in Alexandria (almost directly north of Syene) cast a shadow. (Location 690)
- Eratosthenes’s calculation was a huge improvement on previous knowledge, and his error was much less than the error modern scientists had just a few decades ago for the size and age of the universe. Even 1,700 years later, Columbus was apparently unaware of or ignored Eratosthenes’s result; his estimate was fully 25% short. (This is one of the reasons Columbus thought he might be in India, not another large, intervening landmass where I reside.) In fact, a more accurate measurement than Eratosthenes’s would not be available for another 300 years after Columbus. (Location 700)
- Fermi concluded that the yield must be greater than 10 kilotons. This would have been news, since other initial observers of the blast did not know that lower limit. Could the observed blast be less than 5 kilotons? Less than 2? These answers were not obvious at first. (As it was the first atomic blast on the planet, nobody had much of an eye for these things.) After much analysis of the instrument readings, the final yield estimate was determined to be 18.6 kilotons. Like Eratosthenes, Fermi was aware of a rule relating one simple observation—the scattering of confetti in the wind—to a quantity he wanted to measure. The point of this story is not to teach you enough physics to estimate like Fermi (or enough geometry to be like Eratosthenes, either), but rather that you should start thinking about measurements as a multistep chain of thought. Inferences can be made from highly indirect observations. (Location 721–724)
- Fermi would start by asking them to estimate other things about pianos and piano tuners that, while still uncertain, might seem easier to estimate. These included the current population of Chicago (a little over 3 million in the 1930s to 1950s), the average number of people per household (two or three), the share of households with regularly tuned pianos (not more than 1 in 10 but not less than 1 in 30), the required frequency of tuning (perhaps once a year, on average), how many pianos a tuner could tune in a day (four or five, including travel time), and how many days a year the tuner works (say, 250 or so). The result would be computed: (Location 735)
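The decomposition above is easy to reproduce. A minimal sketch in Python, using only the ranges quoted in the highlight:

```python
# Fermi decomposition of "How many piano tuners are in Chicago?"
# All inputs are the rough figures quoted in the highlight above.
population = 3_000_000          # Chicago, roughly, in Fermi's era
people_per_household = 2.5      # "two or three"
households = population / people_per_household

tunings_per_tuner_per_year = 4.5 * 250   # 4-5 pianos/day, ~250 workdays/year

# Share of households with a regularly tuned piano: between 1/30 and 1/10,
# each piano tuned about once a year.
for share in (1 / 30, 1 / 10):
    pianos_tuned_yearly = households * share
    tuners = pianos_tuned_yearly / tunings_per_tuner_per_year
    print(f"share {share:.3f} -> about {tuners:.0f} tuners")
# -> about 36 tuners at the low end, about 107 at the high end
```

The range is wide, but each input is separately inspectable, which is exactly what makes the decomposition useful.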
- This approach to solving a Fermi question is known as a Fermi decomposition or Fermi solution. This method helped to estimate the uncertain quantity but also gave the estimator a basis for seeing where uncertainty about the quantity came from. Was the big uncertainty about the share of households that had tuned pianos, how often a piano needed to be tuned, how many pianos a tuner can tune in a day, or something else? The biggest source of uncertainty would point toward a measurement that would reduce the uncertainty the most. (Location 744)
- A Fermi Decomposition for a New Business Chuck McKay, with the firm Wizard of Ads, encourages companies to use Fermi questions to estimate the market size for a product in a given area. An insurance agent once asked Chuck to evaluate an opportunity to open a new office in Wichita Falls, Texas, for an insurance carrier that currently had no local presence there. Is there room for another carrier in this market? To test the feasibility of this business proposition, McKay answered a few Fermi questions with some Internet searches. Like Fermi, McKay started with the big population questions and proceeded from there. (Location 754)
- According to City-Data.com in 2006, there were 62,172 cars in Wichita Falls. According to the Insurance Information Institute, the average automobile insurance annual premium in the state of Texas was $837.40. McKay assumed that almost all cars have insurance, since it is mandatory, so the gross insurance revenue in town was $52,062,833 each year. The agent knew the average commission rate was 12%, so the total commission pool was $6,247,540 per year. According to Switchboard.com, there were 38 insurance agencies in town, a number that is very close to what was reported in Yellowbook.com. When the commission pool is divided by those 38 agencies, the average agency commissions are $164,409 per year. This market was probably getting tight since City-Data.com also showed the population of Wichita Falls fell from 104,197 in 2000 to 99,846 in 2005. Furthermore, a few of the bigger firms probably wrote the majority of the business, so the revenue would be even less than that—and all this before taking out office overhead. McKay’s conclusion: A new insurance agency with a new brand in town didn’t have a good chance of being very profitable, and the agent should pass on the opportunity. (Note: These are all exact numbers. But soon we will discuss how to do the same kind of analysis when all you have are inexact ranges.) (Location 759)
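McKay’s arithmetic, reproduced step by step (all figures are the ones quoted above):

```python
# McKay's market-size arithmetic for Wichita Falls, step by step.
cars = 62_172                # City-Data.com, 2006
avg_premium = 837.40         # average TX auto premium (Insurance Information Institute)
commission_rate = 0.12       # the agent's figure
agencies = 38                # Switchboard.com

gross_revenue = cars * avg_premium                  # ~$52,062,833 per year
commission_pool = gross_revenue * commission_rate   # ~$6,247,540 per year
per_agency = commission_pool / agencies             # ~$164,409 per year

print(f"gross ${gross_revenue:,.0f}, pool ${commission_pool:,.0f}, "
      f"per agency ${per_agency:,.0f}")
```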
- In 1996, Emily saw her mother, Linda, watching a videotape on a growing industry called “therapeutic touch,” a controversial method of treating ailments by manipulating the patients’ “energy fields.” While the patient lay still, a therapist would move his or her hands just inches away from the patient’s body to detect and remove “undesirable energies,” which presumably caused various illnesses. Linda was a nurse and a long-standing member of the National Council Against Health Fraud (NCAHF). But it was Emily who first suggested to her mother that she might be able to conduct an experiment on such a claim. (Location 778)
- This made a total of 280 individual attempts by 21 separate therapists (14 had 10 attempts each while another 7 had 20 attempts each) to feel Emily’s energy field. They correctly identified the position of Emily’s hand just 44% of the time. Left to chance alone, they should get about 50% right, with a 95% confidence interval of ±6%. (If you flipped 280 coins, there is a 95% chance that between 44% and 56% would be heads.) So the therapists may have been a bit unlucky (since they ended up on the bottom end of the range), but their results are not out of bounds of what could be explained by chance alone. In other words, people “uncertified” in therapeutic touch—you or I—could have just guessed and done as well as or better than the therapists. (Location 794)
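The chance-only baseline quoted above follows from the normal approximation to the binomial; a quick check:

```python
import math

# Chance-alone baseline for 280 yes/no guesses at p = 0.5,
# using the usual normal approximation to the binomial.
n, p = 280, 0.5
se = math.sqrt(p * (1 - p) / n)      # standard error of the sample proportion
lo, hi = p - 1.96 * se, p + 1.96 * se
print(f"95% interval under pure guessing: {lo:.0%} to {hi:.0%}")  # 44% to 56%
# The therapists' 44% sits at the bottom edge: unlucky, but consistent with chance.
```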
- With these results, Linda and Emily thought the work might be worthy of publication. In April 1998, Emily, then 11 years old, had her experiment published in JAMA. That earned her a place in the Guinness Book of World Records as the youngest person ever to have research published in a major scientific journal and a $1,000 award from the James Randi Educational Foundation. (Location 799)
- Randi created the $1 million “Randi Prize” for anyone who can scientifically prove extrasensory perception (ESP), clairvoyance, dowsing, and the like. Randi dislikes labeling his efforts as “debunking” paranormal claims since he just assesses the claim with scientific objectivity. But since hundreds of applicants have been unable to claim the prize by passing simple scientific tests of their paranormal claims, debunking has been the net effect. Even before Emily’s experiment was published, Randi was also interested in therapeutic touch and was trying to test it. But, unlike Emily, he managed to recruit only one therapist who would agree to an objective test—and that person failed. (Location 804)
- Randi has run into retroactive excuses to explain failures to demonstrate paranormal skills so often that he has added another small demonstration to his tests. (Location 817)
- Real scientific methods report numbers in ranges, such as “the average yield of corn farms using this new seed increased between 10% and 18% (95% confidence interval).” (Location 985)
- Shannon proposed a mathematical definition of information as the amount of uncertainty reduction in a signal, which he discussed in terms of the “entropy” removed by a signal. To Shannon, the receiver of information could be described as having some prior state of uncertainty. That is, the receiver already knew something, and the new information merely removed some, not necessarily all, of the receiver’s uncertainty. (Location 999)
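A toy calculation makes Shannon’s “uncertainty reduction” concrete (the binary-entropy formula is standard; the four-outcome scenario below is illustrative, not from the book):

```python
import math

def entropy_bits(probs):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy receiver: four equally likely outcomes before the signal (2 bits of
# uncertainty), two equally likely outcomes after it (1 bit). The signal
# reduced, but did not eliminate, the receiver's uncertainty.
prior = [0.25] * 4
posterior = [0.5] * 2
print(entropy_bits(prior) - entropy_bits(posterior))  # 1.0 bit removed
```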
- A measurement doesn’t have to be about a quantity in the way that we normally think of it. Note that the definition I offer for measurement says a measurement is “quantitatively expressed.” The uncertainty, at least, has to be quantified, but the subject of observation might not be a quantity itself—it could be entirely qualitative, such as a membership in a set. (Location 1010)
- Nominal and ordinal scales in particular might challenge our preconceptions about what “scale” really means, but they can still be useful for observations. To a geologist, it is useful to know that one rock is harder than another, without necessarily having to know by how much. (Location 1031)
- Keep in mind that, while subjective, the uncertainty we refer to is not just irrational and capricious. We need subjective uncertainties to at least be mathematically coherent as well as consistent with repeated, subsequent observations. A rational person can’t simply say, for instance, that there is a 75% chance of winning a bid for a government contract and an 82% chance of losing it (these two possibilities should have a total probability of 100%). Also, if someone keeps saying they are 100% certain of their predictions and they are consistently wrong, then we can reject their subjective uncertainties on objective grounds just as we would with the readings of a broken digital scale or ammeter. (Location 1082)
- When someone asks how to measure “strategic alignment” or “flexibility” or “customer satisfaction,” I simply ask: “What do you mean, exactly?” It is interesting how often people further refine their use of the term in a way that almost answers the measurement question by itself. (Location 1112)
- Once managers figure out what they mean and why it matters, the issue in question starts to look a lot more measurable. This is usually my first level of analysis when I conduct what I’ve called “clarification workshops.” It’s simply a matter of clients stating a particular, but initially ambiguous, item they want to measure. I then follow up by asking “What do you mean by <fill in the blank>?” and “Why do you care?” (Location 1119)
- In 2000, when the Department of Veterans Affairs asked me to help define performance metrics for IT security, I asked: “What do you mean by ‘IT security’?” and over the course of two or three workshops, the department staff defined it for me with increasingly specific language. They eventually revealed that what they meant by “IT security” were things like a reduction in unauthorized intrusions and virus attacks. They proceeded to explain that these things impact the organization through fraud losses, lost productivity, or even potential legal liabilities. (Location 1124)
- The clarification chain is just a short series of connections that should bring us from thinking of something as an intangible to thinking of it as tangible. First, we recognize that if X is something that we care about, then X, by definition, must be detectable in some way. How could we care about things like “quality,” “risk,” “security,” or “public image” if these things were totally undetectable, in any way, directly or indirectly? (Location 1133)
- Clarification Chain (Location 1149)
  1. If it matters at all, it is detectable/observable.
  2. If it is detectable, it can be detected as an amount (or range of possible amounts).
  3. If it can be detected as a range of possible amounts, it can be measured.
- For example, I might be asked to help someone measure the value of crime reduction. But when I ask why they care about measuring that, I might find that what they really are interested in is building a business case for a specific biometric identification system for criminals. (Location 1175)
- Or I might be asked how to measure collaboration only to find that the purpose of such a measurement is to resolve whether a new document management system is required. In each case, the purpose of the measurement gives us clues about what the measure really means and how to measure it. (Location 1177)
- At first, the only technique for measuring something about a population was to attempt to conduct a complete count of the entire population—a census. In those days in particular, a census was extremely expensive and was sometimes such a long process that the population might change quite a bit during the census. So, for practical reasons, this evolved into a set of methods that can be used to make inferences about a larger population based on some samples and indirect observations. Obviously, one can’t see the entire population of a state at once, but one can sample it economically. (Location 1197)
- Suppose, instead, you just randomly pick five people. There are some other issues we’ll get into later about what constitutes “random,” but, for now, let’s just say you cover your eyes and pick names from the employee directory. Contact these people and arrange to record their actual commute times on some randomly selected day for each person. Let’s suppose the values you get are 30, 60, 45, 80, and 60 minutes. Can you use this sample of only five to estimate the median of the entire population (the point at which half the population is lower and half is higher)? Note that in this case the “population” is not just the number of employees but the number of individual commute times (for which there are many varying values even for the same employee). (Location 1231)
- I’ve been presenting sampling problems with just five samples to attendees of my seminars and conference sessions for years. I ask who thinks the sample is “statistically significant.” Those who remember something about that idea seem only to remember that it creates some kind of difficult threshold that makes meager amounts of data useless (more on that to come). In some conferences, almost every attendee would say the sample is not statistically significant. I suspect that a large proportion of those who abstained from answering were just questioning whether they understood what statistically significant means. As it turns out, this latter group was the more self-aware of the two. Unlike the first group, they at least knew they didn’t know what it really meant. (Location 1237)
- I then ask what the chance is that the median of the population is between the highest and lowest values in the sample of five (30 and 80). Most answers I’ve gotten were around 50%, and some were as low as 10%. After all, out of a population of 10,000 people (and perhaps millions of individual commute times per year), what could a mere sample of five tell us? (Location 1243)
- But, when we do the math, we see there is a 93.75% chance that the median of the entire population of employees is between those two numbers. I call this the “Rule of Five.” The Rule of Five is simple, it works, and it can be proven to be statistically valid for a wide variety of problems. With a sample this small, the range might be very wide, but if it is significantly narrower than your previous range, then it counts as a measurement. (Location 1246)
- It might seem impossible to be 93.75% certain about anything based on a random sample of just five, but it works. To understand why this method works, it is important to note that the Rule of Five estimates only the median of a population. Remember, the median is the point where half the population is above it and half is below it. If we randomly picked five values that were all above the median or all below it, then the median would be outside our range. But what is the chance of that, really? (Location 1251)
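The 93.75% figure follows directly from that argument, and a simulation confirms it does not depend on the population’s shape or size; a sketch:

```python
import random
import statistics

# Rule of Five, closed form: the median misses the sample's min-max range
# only if all five draws land on the same side of it.
print(1 - 2 * 0.5 ** 5)   # 0.9375

# Simulation check against an arbitrary, skewed population.
population = [random.lognormvariate(3, 1) for _ in range(100_000)]
median = statistics.median(population)
trials = 20_000
hits = sum(
    min(sample) <= median <= max(sample)
    for sample in (random.sample(population, 5) for _ in range(trials))
)
print(hits / trials)      # ~0.9375, regardless of the distribution's shape
```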
- A sample of five doesn’t seem like much, but, if you are starting out with a lot of uncertainty, it might be possible to make a useful inference on even less data. Suppose you wanted to estimate a percentage of some population that has some characteristic. We call this a “population proportion” problem. A population proportion could refer to the percentage of employees who take public transportation to work, the percentage of farmers in Kenya who use a particular farming technique, the percentage of people with a particular gene, or the percentage of trucks on the road that are overweight. In each of these cases, how many employees, farmers, or trucks would I have to sample to estimate the stated population proportions? Obviously, if I conducted a complete census, I would know the population proportion exactly. But what can be inferred from a smaller sample? (Location 1267)
- The Single Sample Majority Rule (i.e., The Urn of Mystery Rule) Given maximum uncertainty about a population proportion—such that you believe the proportion could be anything between 0% and 100% with all values being equally likely—there is a 75% chance that a single randomly selected sample is from the majority of the population. (Location 1297)
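A short Monte Carlo check of the 75% figure, where the uniform prior is the “maximum uncertainty” stated above:

```python
import random

# Urn of Mystery: draw the urn's green proportion uniformly from 0-1,
# then draw one marble. How often does that marble's color match the
# urn's majority color?
trials = 200_000
matches = 0
for _ in range(trials):
    p_green = random.random()                 # maximum-uncertainty prior
    marble_is_green = random.random() < p_green
    majority_is_green = p_green > 0.5
    matches += marble_is_green == majority_is_green
print(matches / trials)   # ~0.75 (analytically: E[max(p, 1-p)] = 3/4)
```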
- The Rule of Five and the Single Sample Majority Inference are clearly counterintuitive to most people. But the math is right about these methods, so it is our intuition that is wrong. Why is our intuition wrong? Often people object that the sample is too small compared to the population size. There is sometimes a misconception that an informative sample should be a significant percentage of the entire population. (If this were the requirement, no measurement in biology or physics would be remotely possible, since population sizes are often—and literally—astronomical.) But mathematically we can prove that the Rule of Five and the Single Sample Majority Inference work even with infinite population sizes. (Location 1307)
- In order to appreciate the effect of these small samples, it’s important to remember how little we knew before the sample. The information from a very small sample is underestimated when a decision maker starts with a high degree of uncertainty, which is the case with the Single Sample Majority Inference. Indeed, the initial state of uncertainty couldn’t possibly be any higher for a population proportion than to say it’s somewhere between 0% and 100% with a uniform distribution. This is essentially the same as knowing nothing other than the logical limits of a population proportion. (Location 1312)
- The only valid reason to say that a measurement shouldn’t be made is that the cost of the measurement exceeds its benefits. This situation certainly happens in the real world. In 1995, I developed the method I called Applied Information Economics—a method for assessing uncertainty, risks, and intangibles in any type of big, risky decision you can imagine. A key step in the process (in fact, the reason for the name) is the calculation of the economic value of information. I’ll say more about this later, but a proven formula from the field of decision theory allows us to compute a monetary value for a given amount of uncertainty reduction. (Location 1358)
- If you are betting a lot of money on the outcome of a variable that has a lot of uncertainty, then even a marginal reduction in your uncertainty has a computable monetary value. For example, suppose you think developing an expensive new product feature will increase sales in one particular demographic by up to 12%, but it could be a lot less. Furthermore, you believe the initiative is not cost-justified unless sales are improved by at least 9%. If you make the investment and the increase in sales turns out to be less than 9%, then your effort will not reap a positive return. If the increase in sales is very low, or even possibly negative, then the new feature will be a disaster and a lot of money will have been lost. Measuring this would have a very high value. When someone says a variable is “too expensive” or “too difficult” to measure, we have to ask “Compared to what?” If the information value of the measurement is literally or virtually zero, of course, no measurement is justified. But if the measurement has any significant value, we must ask: “Is there any measurement method at all that can reduce uncertainty enough to justify the cost of the measurement?” Once we recognize the value of even partial uncertainty reduction, the answer is usually “Yes.” (Location 1373)
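A hedged sketch of that calculation: the 9% break-even and the up-to-12% upside are from the passage, while the uniform prior and the $1M-per-point payoff are invented here purely for illustration:

```python
import random

# Hypothetical value-of-information sketch for the feature decision above.
# From the passage: break-even at a 9% sales lift, upside up to 12%.
# Our assumptions: lift ~ Uniform(0%, 12%); each point above (or below)
# break-even is worth (or costs) $1M.
threshold, payoff_per_point = 9.0, 1_000_000
lifts = [random.uniform(0.0, 12.0) for _ in range(200_000)]

# Best action without more information: invest only if expected payoff > 0.
ev_invest = sum(x - threshold for x in lifts) / len(lifts) * payoff_per_point
ev_now = max(ev_invest, 0.0)   # negative here, so the default is "don't invest"

# With perfect information, invest only when the lift clears break-even.
ev_perfect = sum(max(x - threshold, 0.0) for x in lifts) / len(lifts) * payoff_per_point
print(f"value of perfect information ~ ${ev_perfect - ev_now:,.0f}")  # ~$375,000
```

Even under these made-up payoffs, the point stands: a variable with lots of uncertainty around a decision threshold can justify a substantial measurement budget.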
- Consider the Urn of Mystery. The information value of sampling a single marble is the difference between what your average payoff would be with the marble and what the average payoff would be without the marble. Suppose now we had equal wins (you win $10 if I guess the wrong majority color and I win $10 if I’m right) but you charged me $2 on each urn to take a single marble sample. Should I pay the $2? In this case, my average payoff now without the sample is $0 (I win $10 half the time and lose $10 half the time). But if I took one sample from the urn before each bet, that information increased my average net win to 75% × $10 + 25% × (–$10) = $5. So the value of the information is $5 and the cost is $2 for a net gain of $3 per bet. (Location 1382)
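The same arithmetic in code, using the payoffs from the highlight:

```python
# Betting arithmetic for the Urn of Mystery, verified directly.
win, lose, cost = 10, -10, 2

ev_without = 0.50 * win + 0.50 * lose   # $0: guessing without a sample
ev_with = 0.75 * win + 0.25 * lose      # $5: guess the sampled marble's color
print(ev_with - ev_without)             # information value of one marble: $5
print(ev_with - ev_without - cost)      # net gain after the $2 fee: $3
```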
- Meehl’s extensive research soon reached outside of psychology and showed that simple statistical models were outperforming subjective expert judgments in almost every area of judgment he investigated, including predictions of business failures and the outcomes of sporting events. (Location 1406)
- Another researcher conducted one of the largest and longest running studies of the performance of experts at predictions. Philip Tetlock tracked the forecasts of 284 experts in many topics over a 20-year period. In total, he had gathered more than 82,000 individual forecasts covering elections, wars, economics, and more. Tetlock summarized these findings in his book Expert Political Judgment: How Good Is It? How Can We Know?13 His conclusion was perhaps even more strongly worded than Meehl’s: (Location 1421)
- In response to the skeptics of statistical models he met in his own profession, Paul Meehl proposed a variation on the game of Russian roulette.15 In his modified version there are two revolvers: one with one bullet and five empty chambers and one with five bullets and one empty chamber. Meehl then asks us to imagine that he is a “sadistic decision-theorist” running experiments in a detention camp. Meehl asks, “Which revolver would you choose under these circumstances? Whatever may be the detailed, rigorous, logical reconstruction of your reasoning processes, can you honestly say that you would let me pick the gun or that you would flip a coin to decide between them?” Meehl summarized the responses: “I have asked quite a few persons this question, and I have not yet encountered anybody who alleged that he would just as soon play his single game of Russian roulette with the five-shell weapon.” Clearly, those who answered Meehl’s question didn’t really think probabilities were meaningless. (Location 1470)
- As we showed with the Rule of Five and the Single Sample Majority Inference, small samples can be informative, especially when you start from a position of minimal information. In fact, mathematically speaking, when you know almost nothing, almost anything will tell you something. (Location 1648)
- Prior to making a measurement, we need to answer the following (Location 1786):
  1. What is the decision this measurement is supposed to support?
  2. What is the definition of the thing being measured in terms of observable consequences and how, exactly, does this thing matter to the decision being asked (i.e., how do we compute outcomes based on the value of this variable)?
  3. How much do you know about it now (i.e., what is your current level of uncertainty)?
  4. How does uncertainty about this variable create risk for the decision (e.g., is there a “threshold” value above which one action is preferred and below which another is preferred)?
  5. What is the value of additional information?
- That covers the next three chapters. In the Applied Information Economics (AIE) method I have been using, these are the first questions I ask with respect to anything I am asked to measure. The answers to these questions often completely change not just how organizations should measure (Location 1795)
- Define a decision problem and the relevant uncertainties. If people ask “How do we measure X?” they may already be putting the cart before the horse. The first question is “What is your dilemma?” Then we can define all of the variables relevant to the dilemma and determine what we really mean by ambiguous ideas like “training quality” or “economic opportunity.” (This step is the focus of this chapter.) (Location 1829)
- Determine what you know now. We need to quantify your uncertainty about unknown quantities in the identified decision. This is done by learning how to describe your uncertainty in terms of ranges and probabilities. This is a teachable skill. Defining the relevant decision and how much uncertainty we have about it helps us determine the risk involved (covered in Chapters 5 and 6).
- Compute the value of additional information. Information has value because it reduces risk in decisions. Knowing the “information value” of a measurement allows us both to identify what to measure and to inform us about how to measure it (covered in Chapter 7). If there are no variables with information values that justify the cost of any measurement approaches, skip to step 5.
- Apply the relevant measurement instrument(s) to high-value measurements. We cover some of the basic measurement instruments, such as random sampling, controlled experiments, and some more obscure variations on these. We also talk about methods that allow us to squeeze more out of limited data, how to isolate the effects of one variable, how to quantify “soft” preferences, how new technologies can be exploited for measurement, and how to make better use of human experts (covered in Chapters 9 to 13). Repeat step 3.
- Make a decision and act on it. When the economically justifiable amount of uncertainty has been removed, decision makers face a risk-versus-return decision. Any remaining uncertainty is part of this choice. To optimize this decision, the risk aversion of the decision maker can be quantified. An optimum choice can be calculated even in situations where there are enormous combinations of possible choices. We will build on these methods further with a discussion about quantifying risk aversion and other preferences and attitudes of decision makers. This and all of the previous steps are combined into practical project steps (covered in Chapters 11, 12, and 14). Repeat step 1. (Even the subsequent tracking of results about a decision just made is always in the context of future decisions.) (Location 1833)
- Managers may say they need to measure their carbon footprint or corporate image simply because these things are important. They are. But they are only important to measure if knowledge of the value could cause us to take different actions. (Location 1857)