# base rate fallacy

[ temporary import ]

**please note:**

- the content below is remote from Wikipedia

- it has been imported raw for GetWiki

The **base rate fallacy**, also called **base rate neglect** or **base rate bias**, is a fallacy: if presented with related base rate information (i.e. generic, general information) and specific information (information pertaining only to a certain case), the mind tends to ignore the former and focus on the latter (Fallacy Files, "The Base Rate Fallacy"). Base rate neglect is a specific form of the more general extension neglect.

## False positive paradox

One type of base rate fallacy is the **false positive paradox**, in which false positive test results are more probable than true positive results. This occurs when the overall population has a low incidence of a condition and the incidence rate is lower than the test's false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population (Rheinfurth & Howell, 1998). When the incidence, the proportion of those who have a given condition, is lower than the test's false positive rate, even a test that has a very low chance of giving a false positive *in an individual case* will give more false than true positives *overall* (Vacher, 2003). So, in a society with very few infected people, fewer proportionately than the test gives false positives, there will actually be more people who test positive for a disease incorrectly and don't have it than people who test positive accurately and do. The paradox has surprised many (Schoenfeld & Madison, 2007).

It is especially counter-intuitive when interpreting a positive result in a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population. If the false positive rate of the test is higher than the proportion of the *new* population with the condition, then a test administrator whose experience has been drawn from testing in a high-incidence population may conclude from experience that a positive test result usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.

## Examples

### Example 1: Disease

#### High-incidence population

| Number of people | Infected | Uninfected | Total |
|---|---:|---:|---:|
| Test positive | 400 (true positive) | 30 (false positive) | 430 |
| Test negative | 0 | 570 (true negative) | 570 |
| Total | 400 | 600 | 1000 |

Consider testing a population *A* of 1000 persons, in which 40% are infected. The test has a false positive rate of 5% (0.05) and no false negative rate. The expected outcome of the 1000 tests on population *A* would be:

- Infected and test indicates disease (true positive): 1000 × 40/100 = 400 people would receive a true positive
- Uninfected and test indicates disease (false positive): 1000 × (100 − 40)/100 × 0.05 = 30 people would receive a false positive

The remaining 570 tests are correctly negative.

So, in population *A*, a person receiving a positive test could be over 93% confident (400/430) that it correctly indicates infection.

#### Low-incidence population

| Number of people | Infected | Uninfected | Total |
|---|---:|---:|---:|
| Test positive | 20 (true positive) | 49 (false positive) | 69 |
| Test negative | 0 | 931 (true negative) | 931 |
| Total | 20 | 980 | 1000 |

Now consider the same test applied to a population *B*, in which only 2% is infected. The expected outcome of 1000 tests on population *B* would be:

- Infected and test indicates disease (true positive): 1000 × 2/100 = 20 people would receive a true positive
- Uninfected and test indicates disease (false positive): 1000 × (100 − 2)/100 × 0.05 = 49 people would receive a false positive

The remaining 931 (= 1000 − (49 + 20)) tests are correctly negative.

In population *B*, only 20 of the 69 total people with a positive test result are actually infected. So, the probability of actually being infected after one is told that one is infected is only 29% (20/69) for a test that otherwise appears to be "95% accurate".

A tester with experience of group *A* might find it a paradox that in group *B*, a result that had usually correctly indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a health-threatening test result.

### Example 2: Drunk drivers

A group of police officers have breathalyzers displaying false drunkenness in 5% of the cases in which the driver is sober. However, the breathalyzers never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers then stop a driver at random, and force the driver to take a breathalyzer test. It indicates that the driver is drunk. We assume you don't know anything else about him or her. How high is the probability he or she really is drunk?

Many would answer as high as 95%, but the correct probability is about 2%.

An explanation for this is as follows: on average, for every 1,000 drivers tested,

- 1 driver is drunk, and it is 100% certain that for that driver there is a *true* positive test result, so there is 1 true positive test result;
- 999 drivers are not drunk, and among those drivers there are 5% *false* positive test results, so there are 49.95 false positive test results.

Therefore, the probability that a driver who receives a positive test result really is drunk is about 1 / (1 + 49.95) ≈ 0.02.

More formally, the probability sought is

p(\mathrm{drunk} \mid D)

where *D* means that the breathalyzer indicates that the driver is drunk. Bayes's theorem tells us that

p(\mathrm{drunk} \mid D) = \frac{p(D \mid \mathrm{drunk})\, p(\mathrm{drunk})}{p(D)}.

We were told the following in the first paragraph:

p(\mathrm{drunk}) = 0.001,
p(\mathrm{sober}) = 0.999,
p(D \mid \mathrm{drunk}) = 1.00, and
p(D \mid \mathrm{sober}) = 0.05.

As you can see from the formula, one needs *p*(*D*) for Bayes' theorem, which one can compute from the preceding values using the law of total probability:

p(D) = p(D \mid \mathrm{drunk})\, p(\mathrm{drunk}) + p(D \mid \mathrm{sober})\, p(\mathrm{sober}),

which gives

p(D) = (1.00 \times 0.001) + (0.05 \times 0.999) = 0.05095.

Plugging these numbers into Bayes' theorem, one finds that

p(\mathrm{drunk} \mid D) = \frac{1.00 \times 0.001}{0.05095} = 0.019627.
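The calculation above can be checked with a few lines of Python (variable names are illustrative):

```python
# Rates as stated in the drunk-driver example.
p_drunk = 0.001          # prior: 1 in 1000 drivers is drunk
p_d_given_drunk = 1.00   # the breathalyzer never misses a drunk driver
p_d_given_sober = 0.05   # false positive rate on sober drivers

# Law of total probability: overall chance of a positive reading.
p_d = p_d_given_drunk * p_drunk + p_d_given_sober * (1 - p_drunk)

# Bayes' theorem: posterior probability that the driver is drunk.
p_drunk_given_d = p_d_given_drunk * p_drunk / p_d
print(round(p_drunk_given_d, 6))  # ≈ 0.019627
```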

### Example 3: Terrorist identification

In a city of 1 million inhabitants let there be 100 terrorists and 999,900 non-terrorists. To simplify the example, it is assumed that all people present in the city are inhabitants. Thus, the base rate probability of a randomly selected inhabitant of the city being a terrorist is 0.0001, and the base rate probability of that same inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic facial recognition software.

The software has two failure rates of 1%:

- The false negative rate: If the camera scans a terrorist, a bell will ring 99% of the time, and it will fail to ring 1% of the time.
- The false positive rate: If the camera scans a non-terrorist, a bell will not ring 99% of the time, but it will ring 1% of the time.
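The imported excerpt stops before working out the consequence, but the posterior follows from the stated rates by the same Bayes'-theorem calculation used in the previous example; a sketch (variable names are illustrative):

```python
p_terrorist = 100 / 1_000_000    # base rate: 0.0001
p_ring_given_terrorist = 0.99    # bell rings for 99% of terrorists
p_ring_given_innocent = 0.01     # bell rings for 1% of non-terrorists

# Overall probability that the bell rings for a random inhabitant.
p_ring = (p_ring_given_terrorist * p_terrorist
          + p_ring_given_innocent * (1 - p_terrorist))

# Posterior probability that someone who rings the bell is a terrorist.
p_terrorist_given_ring = p_ring_given_terrorist * p_terrorist / p_ring
print(round(p_terrorist_given_ring, 4))  # ≈ 0.0098, i.e. about 1%
```

Despite the system's "99% accuracy", a person who triggers the bell is overwhelmingly likely to be a false positive, because the base rate of terrorists (0.0001) is far below the false positive rate (0.01).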

## Findings in psychology

In experiments, people have been found to prefer individuating information over general information when the former is available (Bar-Hillel, 1980; Tversky & Kahneman, 1985). In some experiments, students were asked to estimate the grade point averages (GPAs) of hypothetical students. When given relevant statistics about GPA distribution, students tended to ignore them if given descriptive information about the particular student, even if the new descriptive information was obviously of little or no relevance to school performance. This finding has been used to argue that interviews are an unnecessary part of the college admissions process because interviewers are unable to pick successful candidates better than basic statistics.

Psychologists Daniel Kahneman and Amos Tversky attempted to explain this finding in terms of a simple rule or "heuristic" called representativeness. They argued that many judgments relating to likelihood, or to cause and effect, are based on how representative one thing is of another, or of a category (Kahneman & Tversky, 1973). Kahneman considers base rate neglect to be a specific form of extension neglect (Kahneman, 2000). Richard Nisbett has argued that some attributional biases, like the fundamental attribution error, are instances of the base rate fallacy: people do not use the "consensus information" (the "base rate") about how others behaved in similar situations and instead prefer simpler dispositional attributions (Nisbett et al., 1976).

There is considerable debate in psychology on the conditions under which people do or do not appreciate base rate information (Koehler, 1996; Barbey & Sloman, 2007). Researchers in the heuristics-and-biases program have stressed empirical findings showing that people tend to ignore base rates and make inferences that violate certain norms of probabilistic reasoning, such as Bayes' theorem. The conclusion drawn from this line of research was that human probabilistic thinking is fundamentally flawed and error-prone (Tversky & Kahneman, 1974). Other researchers have emphasized the link between cognitive processes and information formats, arguing that such conclusions are not generally warranted (Cosmides & Tooby, 1996; Gigerenzer & Hoffrage, 1995).

Consider again Example 2 from above. The required inference is to estimate the (posterior) probability that a (randomly picked) driver is drunk, given that the breathalyzer test is positive. Formally, this probability can be calculated using Bayes' theorem, as shown above. However, there are different ways of presenting the relevant information. Consider the following, formally equivalent variant of the problem:
1 out of 1000 drivers is driving drunk. The breathalyzers never fail to detect a truly drunk person. For 50 out of the 999 drivers who are not drunk, the breathalyzer falsely displays drunkenness. Suppose the police then stop a driver at random and force them to take a breathalyzer test. It indicates that they are drunk. We assume you don't know anything else about them. How high is the probability that they really are drunk?

In this case, the relevant numerical information, *p*(drunk), *p*(*D* | drunk), and *p*(*D* | sober), is presented in terms of natural frequencies with respect to a certain reference class (see reference class problem). Empirical studies show that people's inferences correspond more closely to Bayes' rule when information is presented this way, helping to overcome base-rate neglect in laypeople and experts (Hoffrage et al., 2000). As a consequence, organizations like the Cochrane Collaboration recommend using this kind of format for communicating health statistics (Akl et al., 2011). Teaching people to translate these kinds of Bayesian reasoning problems into natural frequency formats is more effective than merely teaching them to plug probabilities (or percentages) into Bayes' theorem (Sedlmeier & Gigerenzer, 2001). It has also been shown that graphical representations of natural frequencies (e.g., icon arrays) help people to make better inferences (Brase, 2009; Edwards, Elwyn & Mulley, 2002).

Why are natural frequency formats helpful? One important reason is that this information format facilitates the required inference because it simplifies the necessary calculations. This can be seen when using an alternative way of computing the required probability *p*(drunk | *D*):

p(\mathrm{drunk} \mid D) = \frac{N(\mathrm{drunk} \cap D)}{N(D)} = \frac{1}{51} = 0.0196

where *N*(drunk ∩ *D*) denotes the number of drivers that are drunk and get a positive breathalyzer result, and *N*(*D*) denotes the total number of cases with a positive breathalyzer result. The equivalence of this equation to the one above follows from the axioms of probability theory, according to which *N*(drunk ∩ *D*) = *N* × *p*(*D* | drunk) × *p*(drunk). Importantly, although this equation is formally equivalent to Bayes' rule, it is not psychologically equivalent. Using natural frequencies simplifies the inference because the required mathematical operation can be performed on natural numbers instead of normalized fractions (i.e., probabilities), because it makes the high number of false positives more transparent, and because natural frequencies exhibit a "nested-set structure" (Girotto & Gonzalez, 2001).

Not every frequency format facilitates Bayesian reasoning (Hoffrage et al., 2002; Gigerenzer & Hoffrage, 1999). Natural frequencies refer to frequency information that results from *natural sampling* (Kleiter, 1994), which preserves base rate information (e.g., the number of drunken drivers when taking a random sample of drivers). This is different from *systematic sampling*, in which base rates are fixed a priori (e.g., in scientific experiments). In the latter case it is not possible to infer the posterior probability *p*(drunk | positive test) from comparing the number of drivers who are drunk and test positive with the total number of people who get a positive breathalyzer result, because base rate information is not preserved and must be explicitly re-introduced using Bayes' theorem.
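The count-based calculation can be written out in a few lines (variable names are illustrative; the counts are those of the natural-frequency variant above):

```python
# Natural-frequency version of the drunk-driver problem:
# out of 1000 drivers, 1 is drunk (and always tests positive),
# and 50 of the 999 sober drivers test positive anyway.
drunk_and_positive = 1
sober_and_positive = 50

# No normalized probabilities needed: just compare counts of positives.
p_drunk_given_positive = drunk_and_positive / (drunk_and_positive + sober_and_positive)
print(round(p_drunk_given_positive, 4))  # 1/51 ≈ 0.0196
```

The single division on natural numbers replaces the prior, likelihoods, and total-probability step of the explicit Bayes'-theorem computation, which is exactly the simplification the natural-frequency format provides.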

## See also

- Bayesian probability
- Bayes' theorem
- Data dredging
- False positive paradox
- Inductive argument
- List of cognitive biases
- List of paradoxes
- Misleading vividness
- Prevention paradox
- Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability
- Simpson's paradox, another error in statistical reasoning dealing with comparing groups
- Stereotype

## References

- Akl, E. A., Oxman, A. D., Herrin, J., Vist, G. E., Terrenato, I., Sperati, F., Costiniuk, C., Blank, D., & Schünemann, H. (2011). Using alternative statistical formats for presenting risks and risk reductions. *The Cochrane Library*, (3), CD006776.
- Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. *Acta Psychologica*, 44(3), 211–233.
- Barbey, A. K., & Sloman, S. A. (2007). Base-rate respect: From ecological rationality to dual processes. *Behavioral and Brain Sciences*, 30(3), 241–254.
- Brase, G. L. (2009). Pictorial representations in statistical reasoning. *Applied Cognitive Psychology*, 23(3), 369–381.
- Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all? Rethinking some conclusions of the literature on judgment under uncertainty. *Cognition*, 58(1), 1–73.
- Edwards, A., Elwyn, G., & Mulley, A. (2002). Explaining risks: Turning numerical data into meaningful pictures. *BMJ*, 324(7341), 827–830.
- Fallacy Files. "Logical Fallacy: The Base Rate Fallacy". fallacyfiles.org. Retrieved 2013-06-15.
- Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. *Psychological Review*, 102(4), 684.
- Gigerenzer, G., & Hoffrage, U. (1999). Overcoming difficulties in Bayesian reasoning: A reply to Lewis and Keren (1999) and Mellers and McGraw (1999). *Psychological Review*, 106(2), 425.
- Girotto, V., & Gonzalez, M. (2001). Solving probabilistic and statistical problems: A matter of information structure and question form. *Cognition*, 78(3), 247–276.
- Gonick, L., & Smith, W. (1993). *The Cartoon Guide to Statistics*. New York: Harper Collins.
- Hoffrage, U., Gigerenzer, G., Krauss, S., & Martignon, L. (2002). Representation facilitates reasoning: What natural frequencies are and what they are not. *Cognition*, 84(3), 343–352.
- Hoffrage, U., Lindsey, S., Hertwig, R., & Gigerenzer, G. (2000). Medicine: Communicating statistical information. *Science*, 290(5500), 2261–2262.
- Kahneman, D. (2000). Evaluation by moments, past and future. In D. Kahneman & A. Tversky (Eds.), *Choices, Values and Frames*.
- Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. *Psychological Review*, 80(4), 237–251.
- Kleiter, G. D. (1994). Natural sampling: Rationality without base rates. In *Contributions to Mathematical Psychology, Psychometrics, and Methodology* (pp. 375–388). Recent Research in Psychology.
- Koehler, J. J. (1996). The base rate fallacy reconsidered: Descriptive, normative, and methodological challenges. *Behavioral and Brain Sciences*, 19, 1–17.
- Nisbett, R. E., Borgida, E., Crandall, R., & Reed, H. (1976). Popular induction: Information is not always informative. In J. S. Carroll & J. W. Payne (Eds.), *Cognition and Social Behavior* (Vol. 2, pp. 227–236).
- Rheinfurth, M. H., & Howell, L. W. (1998). *Probability and Statistics in Aerospace Engineering*. NASA.
- Schoenfeld, A. H., & Madison, B. L. (2007). Mathematical proficiency for citizenship. In *Assessing Mathematical Proficiency*. Cambridge University Press.
- Sedlmeier, P., & Gigerenzer, G. (2001). Teaching Bayesian reasoning in less than two hours. *Journal of Experimental Psychology: General*, 130(3), 380.
- Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. *Science*, 185(4157), 1124–1131.
- Tversky, A., & Kahneman, D. (1985). Evidential impact of base rates. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), *Judgment Under Uncertainty: Heuristics and Biases* (pp. 153–160).
- Vacher, H. L. (2003). Quantitative literacy: Drug testing, cancer screening, and the identification of igneous rocks. *Journal of Geoscience Education*.

## External links

- The Base Rate Fallacy The Fallacy Files
- Psychology of Intelligence Analysis: Base Rate Fallacy
- The base rate fallacy explained visually (Video)
- Interactive page for visualizing statistical information and Bayesian inference problems
- Current 'best practice' for communicating probabilities in health according to the International Patient Decision Aid Standards (IPDAS) Collaboration

**- content above as imported from Wikipedia**

- "**base rate fallacy**" does not exist on GetWiki (yet)
- time: 8:43am EDT - Tue, Aug 20 2019

[ this remote article is provided by Wikipedia ]

© 2019 M.R.M. PARROTT | ALL RIGHTS RESERVED