CHAPTER 19

OVERCONFIDENCE

From The Psychology of Judgment and Decision Making

By Scott Plous

The odds of a meltdown are one in 10,000 years. -Vitali Sklyarov, Minister of Power and Electrification in the Ukraine, two months before the Chernobyl accident (cited in Rylsky, 1986, February)

No problem in judgment and decision making is more prevalent and more potentially catastrophic than overconfidence. As Irving Janis (1982) documented in his work on groupthink, American overconfidence enabled the Japanese to destroy Pearl Harbor in World War II. Overconfidence also played a role in the disastrous decision to launch the U.S. space shuttle Challenger. Before the shuttle exploded on its twenty-fifth mission, NASA's official launch risk estimate was 1 catastrophic failure in 100,000 launches (Feynman, 1988, February). This risk estimate is roughly equivalent to launching the shuttle once per day and expecting to see only one accident in three centuries.
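The arithmetic behind this comparison is easy to verify (a quick illustrative check, not part of the original text):

```python
# NASA's pre-Challenger estimate: 1 catastrophic failure per 100,000 launches.
# At one launch per day, how many years would pass per expected accident?
launches_per_accident = 100_000
launches_per_year = 365.25

years_per_accident = launches_per_accident / launches_per_year
print(f"{years_per_accident:.0f} years")  # ~274 years, i.e., roughly three centuries
```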

THE CASE OF JOSEPH KIDD

Was NASA genuinely overconfident of success, or did it simply need to appear confident? Because true confidence is hard to measure in such situations, the most persuasive evidence of overconfidence comes from carefully controlled experiments.

One of the earliest and best known of these studies was published by Stuart Oskamp in 1965. Oskamp asked 8 clinical psychologists, 18 psychology graduate students, and 6 undergraduates to read the case study of "Joseph Kidd," a 29-year-old man who had experienced "adolescent maladjustment." The case study was divided into four parts. Part 1 introduced Kidd as a war veteran who was working as a business assistant in a floral decorating studio. Part 2 discussed Kidd's childhood through age 12. Part 3 covered Kidd's high school and college years. And Part 4 chronicled Kidd's army service and later activities.

Subjects answered the same set of questions four times—once after each part of the case study. These questions were constructed from factual material in the case study, but they required subjects to form clinical judgments based on general impressions of Kidd's personality. Questions always had five forced-choice alternatives, and following each item, subjects estimated the likelihood that their answer was correct. These confidence ratings ranged from 20 percent (no confidence beyond chance levels of accuracy) to 100 percent (absolute certainty).

Somewhat surprisingly, there were no significant differences among the ratings from psychologists, graduate students, and undergraduates, so Oskamp combined all three groups in his analysis of the results. What he found was that confidence increased with the amount of information subjects read, but accuracy did not.

After reading the first part of the case study, subjects answered 26 percent of the questions correctly (slightly more than what would be expected by chance), and their mean confidence rating was 33 percent. These figures show fairly close agreement. As subjects read more information, though, the gap between confidence and accuracy grew (see Figure 19.1). The more material subjects read, the more confident they became--even though accuracy did not increase significantly with additional information. By the time they finished reading the fourth part of the case study, more than 90 percent of Oskamp's subjects were overconfident of their answers.

FIGURE 19.1 Stuart Oskamp (1965) found that as subjects read more information from a case study, the gap between their estimated accuracy (confidence) and true accuracy increased.

In the years since this experiment, a number of studies have found that people tend to be overconfident of their judgments, particularly when accurate judgments are difficult to make. For example, Sarah Lichtenstein and Baruch Fischhoff (1977) conducted a series of experiments in which they found that people were 65 to 70 percent confident of being right when they were actually correct about 50 percent of the time.

In the first of these experiments, Lichtenstein and Fischhoff asked people to judge whether each of 12 children's drawings came from Europe or Asia, and to estimate the probability that each judgment was correct. Even though only 53 percent of the judgments were correct (very close to chance performance), the average confidence rating was 68 percent.

In another experiment, Lichtenstein and Fischhoff gave people market reports on 12 stocks and asked them to predict whether the stocks would rise or fall in a given period. Once again, even though only 47 percent of these predictions were correct (slightly less than would be expected by chance), the mean confidence rating was 65 percent.

After several additional studies, Lichtenstein and Fischhoff drew the following conclusions about the correspondence between accuracy and confidence in two-alternative judgments:

-Overconfidence is greatest when accuracy is near chance levels.

-Overconfidence diminishes as accuracy increases from 50 to 80 percent, and once accuracy exceeds 80 percent, people often become underconfident. In other words, the gap between accuracy and confidence is smallest when accuracy is around 80 percent, and it grows larger as accuracy departs from this level.

-Discrepancies between accuracy and confidence are not related to a decision maker's intelligence.

Although early critics of this work claimed that these results were largely a function of asking people questions about obscure or trivial topics, recent studies have replicated Lichtenstein and Fischhoff's findings with more commonplace judgments. For example, in a series of experiments involving more than 10,000 separate judgments, Lee Ross and his colleagues found roughly 10 to 15 percent overconfidence when subjects were asked to make a variety of predictions about their behavior and the behavior of others (Dunning, Griffin, Milojkovic, & Ross, 1990; Vallone, Griffin, Lin, & Ross, 1990).

This is not to say that people are always overconfident. David Ronis and Frank Yates (1987) found, for instance, that overconfidence depends partly on how confidence ratings are elicited and what type of judgments are being made (general knowledge items seem to produce relatively high degrees of overconfidence). There is also some evidence that expert bridge players, professional oddsmakers, and National Weather Service forecasters--all of whom receive regular feedback following their judgments--exhibit little or no overconfidence (Keren, 1987; Lichtenstein, Fischhoff, & Phillips, 1982; Murphy & Brown, 1984; Murphy & Winkler, 1984). Still, for the most part, research suggests that overconfidence is prevalent.

EXTREME CONFIDENCE

What if people are virtually certain that an answer is correct? How often are they right in such cases? In 1977, Baruch Fischhoff, Paul Slovic, and Sarah Lichtenstein conducted a series of experiments to investigate this issue. In the first experiment, subjects answered hundreds of general knowledge questions and estimated the probability that their answers were correct. For example, they answered whether absinthe is a liqueur or a precious stone, and they estimated their confidence on a scale from .50 to 1.00 (this problem appears as Item #21 of the Reader Survey).  Fischhoff, Slovic, and Lichtenstein then examined the accuracy of only those answers about which subjects were absolutely sure.

What they found was that people tended to be only 70 to 85 percent correct when they reported being 100 percent sure of their answer. How confident were you of your answer to Item #21? The correct answer is that absinthe is a liqueur, though many people confuse it with a precious stone called amethyst.

Just to be certain their results were not due to misconceptions about probability, Fischhoff, Slovic, and Lichtenstein (1977) conducted a second experiment in which confidence was elicited in terms of the odds of being correct. Subjects in this experiment were given more than 106 items in which two causes of death were listed--for instance, leukemia and drowning. They were asked to indicate which cause of death was more frequent in the United States and to estimate the odds that their answer was correct (i.e., 2:1, 3:1, etc.). This way, instead of having to express 75 percent confidence in terms of a probability, subjects could express their confidence as 3:1 odds of being correct.
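The relation between the two formats is simple (a standard conversion, sketched here for illustration): odds of k:1 in favor correspond to a probability of k/(k + 1), so 3:1 odds mean 75 percent confidence and 100:1 odds mean about 99 percent.

```python
# Standard conversion between "k:1" odds of being correct and probability.
def odds_to_prob(k: float) -> float:
    """Odds of k:1 in favor -> probability of being correct."""
    return k / (k + 1.0)

def prob_to_odds(p: float) -> float:
    """Probability of being correct -> odds of p/(1-p) : 1 in favor."""
    return p / (1.0 - p)

print(odds_to_prob(3))      # 0.75   -> 3:1 odds = 75% confidence
print(odds_to_prob(100))    # ~0.990 -> 100:1 odds = 99% confidence
print(prob_to_odds(0.73))   # ~2.7   -> 73% accuracy warrants only about 2.7:1 odds
```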

What Fischhoff, Slovic, and Lichtenstein (1977) found was that confidence and accuracy were aligned fairly well up to confidence estimates of about 3:1, but as confidence increased from 3:1 to 100:1, accuracy did not increase appreciably. When people set the odds of being correct at 100:1, they were actually correct 73 percent of the time. Even when people set the odds between 10,000:1 and 1,000,000:1--indicating virtual certainty--they were correct only 85 to 90 percent of the time (and should have given a confidence rating between 6:1 and 9:1).*

* Although these results may seem to contradict Lichtenstein and Fischhoff's earlier claim that overconfidence is minimal when subjects are 80 percent accurate, there is really no contradiction. The fact that subjects average only 70 to 90 percent accuracy when they are highly confident does not mean that they are always highly confident when 70 to 90 percent accurate.

Finally, as an added check to make sure that subjects understood the task and were taking it seriously, Fischhoff, Slovic, and Lichtenstein (1977) conducted three replications. In one replication, the relation between odds and probability was carefully explained in a twenty-minute lecture. Subjects were given a chart showing the correspondence between various odds estimates and probabilities, and they were told about the subtleties of expressing uncertainty as an odds rating (with a special emphasis on how to use odds between 1:1 and 2:1 to express uncertainty). Even with these instructions, subjects showed unwarranted confidence in their answers. They assigned odds of at least 50:1 when the odds were actually about 4:1, and they gave odds of 1000:1 when they should have given odds of 5:1.

In another replication, subjects were asked whether they would accept a monetary bet based on the accuracy of answers that they rated as having 50:1 or better odds of being correct. Of 42 subjects, 39 were willing to gamble--even though their overconfidence would have led to a total of more than $140 in losses. And in a final replication, Fischhoff, Slovic, and Lichtenstein (1977) actually played subjects' bets. In this study, 13 of 19 subjects agreed to gamble on the accuracy of their answers, even though they were incorrect on 12 percent of the questions to which they had assigned odds of 50:1 or greater (and all would have lost from $1 to $11, had the experimenters not waived the loss). These results suggest that (1) people are overconfident even when virtually certain they are correct, and (2) overconfidence is not simply a consequence of taking the task lightly or misunderstanding how to make confidence ratings. Indeed, Joan Sieber (1974) found that overconfidence increased with incentives to perform well.

WHEN OVERCONFIDENCE BECOMES A CAPITAL OFFENSE

Are people overconfident when more is at stake than a few dollars? Although ethical considerations obviously limit what can be tested in the laboratory, at least one line of evidence suggests that overconfidence operates even when human life hangs in the balance. This evidence comes from research on the death penalty.

In a comprehensive review of wrongful convictions, Hugo Bedau and Michael Radelet (1987) found 350 documented instances in which innocent defendants were convicted of capital or potentially capital crimes in the United States--even though the defendants were apparently judged "guilty beyond a reasonable doubt." In five of these cases, the error was discovered prior to sentencing. The other defendants were not so lucky: 67 were sentenced to prison for terms of up to 25 years, 139 were sentenced to life in prison (terms of 25 years or more), and 139 were sentenced to die. At the time of Bedau and Radelet's review, 23 of the people sentenced to die had been executed.

CALIBRATION

 "Calibration" is the degree to which confidence matches accuracy. A decision maker is perfectly calibrated when, across all judgments at a given level of confidence, the proportion of accurate judgments is identical to the expected probability of being correct. In other words, 90 percent of all judgments assigned a .90 probability of being correct are accurate, 80 percent of all judgments assigned a probability of .80 are accurate, 70 percent of all judgments assigned a probability of .70 are accurate, and so forth.

When individual judgments are considered alone, it doesn't make much sense to speak of calibration. How well calibrated is a decision maker who answers ".70" to Item #21b of the Reader Survey? The only way to reliably assess calibration is by comparing accuracy and confidence across hundreds of judgments (Lichtenstein, Fischhoff, & Phillips, 1982).

Just as there are many ways to measure confidence, there are several techniques for assessing calibration. One way is simply to calculate the difference between average confidence ratings and the overall proportion of accurate judgments. For instance, a decision maker might average 80 percent confidence on a set of general knowledge items but be correct on only 60 percent of the items. Such a decision maker would be overconfident by 20 percent.
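In code, this crude measure is just a difference of two means (a minimal sketch using made-up data):

```python
# Simplest calibration measure: mean confidence minus overall proportion correct.
# The data below are hypothetical, chosen to mirror the example in the text.
confidences = [0.8, 0.9, 0.7, 0.8, 0.8]  # stated probability of being correct, per item
correct     = [1,   0,   1,   0,   1]    # 1 = answered correctly, 0 = answered incorrectly

overconfidence = sum(confidences) / len(confidences) - sum(correct) / len(correct)
print(f"{overconfidence:+.2f}")  # +0.20 -> overconfident by 20 percentage points
```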

Although this measure of calibration is convenient, it can be misleading at times. Consider, for example, a decision maker whose overall accuracy and average confidence are both 80 percent. Is this person perfectly calibrated? Not necessarily. The person may be 60 percent confident on half the judgments and 100 percent confident on the others (averaging out to 80 percent confidence), yet 80 percent accurate at both levels of confidence. Such a person would be underconfident when 60 percent sure and overconfident when 100 percent sure.

A somewhat more refined approach is to examine accuracy over a range of confidence levels. When accuracy is calculated separately for different levels of confidence, it is possible to create a "calibration curve" in which the horizontal axis represents confidence and the vertical axis represents accuracy. Figure 19.2 contains two calibration curves--one for weather forecasters' predictions of precipitation, and the other for physicians' diagnoses of pneumonia. As you can see, the weather forecasters are almost perfectly calibrated; on the average, their predictions closely match the weather (contrary to popular belief!). In contrast, the physicians are poorly calibrated; most of their predictions lie below the line, indicating overconfidence.
FIGURE 19.2 This figure contains calibration curves for weather forecasters' predictions of precipitation (hollow circles) and physicians' diagnoses of pneumonia (filled circles). Although the weather forecasters are almost perfectly calibrated, the physicians show substantial overconfidence (i.e., unwarranted certainty that patients have pneumonia). The data on weather forecasters comes from a report by Allan Murphy and Robert Winkler (1984), and the data on physicians comes from a study by Jay Christensen-Szalanski and James Bushyhead (1981).
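A curve like those in Figure 19.2 can be computed by grouping judgments according to stated confidence and taking the proportion correct within each group (a minimal sketch; the data are hypothetical):

```python
from collections import defaultdict

def calibration_curve(confidences, correct):
    """Map each stated confidence level to the proportion of correct judgments
    made at that level. Perfect calibration: every proportion equals its level."""
    hits, counts = defaultdict(int), defaultdict(int)
    for conf, ok in zip(confidences, correct):
        counts[conf] += 1
        hits[conf] += ok
    return {conf: hits[conf] / counts[conf] for conf in sorted(counts)}

# Hypothetical judgments, overconfident at the highest confidence level.
confs = [0.6, 0.6, 0.8, 0.8, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0]
right = [1,   0,   1,   1,   0,   1,   1,   1,   0,   1]
print(calibration_curve(confs, right))
# {0.6: 0.5, 0.8: 0.75, 1.0: 0.75} -- accuracy below confidence indicates overconfidence
```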

There are additional ways to assess calibration, some of them involving complicated mathematics. For instance, one of the most common techniques is to calculate a number known as a "Brier score" (named after statistician Glenn Brier). Brier scores can be partitioned into three components, one of which corresponds to calibration. The Brier score component for calibration is a weighted average of the mean squared differences between the proportion correct in each category and the probability associated with that category (for a good introduction to the technical aspects of calibration, see Yates, 1990).
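In symbols, using the standard decomposition, the calibration component is CAL = (1/N) * sum over categories j of n_j * (f_j - p_j)^2, where judgments are sorted into categories by stated probability f_j, n_j is the number of judgments in category j, and p_j is the proportion correct in that category. A minimal sketch (zero means perfect calibration; the data are hypothetical):

```python
from collections import defaultdict

def brier_calibration(confidences, correct):
    """Calibration component of the Brier score:
    (1/N) * sum over categories of n_j * (f_j - p_j)**2."""
    hits, counts = defaultdict(int), defaultdict(int)
    for f, ok in zip(confidences, correct):
        counts[f] += 1
        hits[f] += ok
    n_total = len(confidences)
    return sum(n * (f - hits[f] / n) ** 2 for f, n in counts.items()) / n_total

confs = [0.6, 0.6, 0.8, 0.8, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0]
right = [1,   0,   1,   1,   0,   1,   1,   1,   0,   1]
print(f"{brier_calibration(confs, right):.3f}")  # 0.028 for these hypothetical data
```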

One of the most interesting measures of calibration is known as the "surprise index." The surprise index is used for interval judgments of unknown quantities. For example, suppose you felt 90 percent confident that the answer to Item #12 of the Reader Survey was somewhere between an inch and a mile (see Item #12b for your true 90 percent confidence interval). Because the correct answer is actually greater than a mile, this answer would be scored as a surprise. The surprise index is simply the percentage of judgments that lie beyond the boundaries of a confidence interval.
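Scoring a surprise index is equally direct (a minimal sketch; the intervals and answers below are invented for illustration):

```python
# Surprise index: percentage of true values falling outside the stated intervals.
intervals = [(30, 50), (1000, 5000), (8, 15)]  # hypothetical 90% confidence intervals
answers   = [39,       4132,         20]       # hypothetical true values

surprises = sum(not (low <= ans <= high) for (low, high), ans in zip(intervals, answers))
print(f"{100 * surprises / len(answers):.0f}% surprises")  # 33% -- one miss in three
```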

In a major review of calibration research, Sarah Lichtenstein, Baruch Fischhoff, and Lawrence Phillips (1982) examined several studies in which subjects had been asked to give 98 percent confidence intervals (i.e., intervals that had a 98 percent chance of including the correct answer). In every study, the surprise index exceeded 2 percent. Averaging across all experiments for which information was available--a total of nearly 15,000 judgments--the surprise index was 32 percent. In other words, when subjects were 98 percent sure that an interval contained the correct answer, they were right 68 percent of the time. Once again, overconfidence proved the rule rather than the exception.

Are you overconfident? Edward Russo and Paul Schoemaker (1989) developed an easy self-test to measure overconfidence on general knowledge questions (reprinted in Figure 19.3). Although a comprehensive assessment of calibration requires hundreds of judgments, this test will give you a rough idea of what your surprise index is with general knowledge questions at one level of confidence. Russo and Schoemaker administered the test to more than 1000 people and found that less than 1 percent of the respondents got nine or more items correct. Most people missed four to seven items (a surprise index of 40 to 70 percent), indicating a substantial degree of overconfidence.

SELF-TEST OF OVERCONFIDENCE

For each of the following ten items, provide a low and high guess such that you are 90 percent sure the correct answer falls between the two. Your challenge is to be neither too narrow (i.e., overconfident) nor too wide (i.e., underconfident). If you successfully meet this challenge, you should have 10 percent misses--that is, exactly one miss.

90% Confidence Range: for each item, write a LOW guess and a HIGH guess.

1. Martin Luther King's age at death
2. Length of the Nile River
3. Number of countries that are members of OPEC
4. Number of books in the Old Testament
5. Diameter of the moon in miles
6. Weight of an empty Boeing 747 in pounds
7. Year in which Wolfgang Amadeus Mozart was born
8. Gestation period (in days) of an Asian elephant
9. Air distance from London to Tokyo
10. Deepest (known) point in the ocean (in feet)


FIGURE 19.3 This test will give you some idea of whether you are overconfident on general knowledge questions (reprinted with permission from Russo and Schoemaker, 1989).

THE CORRELATION BETWEEN CONFIDENCE AND ACCURACY

Overconfidence notwithstanding, it is still possible for confidence to be correlated with accuracy. To take an example, suppose a decision maker were 50 percent accurate when 70 percent confident, 60 percent accurate when 80 percent confident, and 70 percent accurate when 90 percent confident. In such a case confidence would be perfectly correlated with accuracy, even though the decision maker would be uniformly overconfident by 20 percent.
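This distinction is easy to verify numerically (a sketch using the hypothetical figures above; statistics.correlation requires Python 3.10 or later):

```python
from statistics import correlation  # Python 3.10+

confidence = [0.70, 0.80, 0.90]
accuracy   = [0.50, 0.60, 0.70]

print(round(correlation(confidence, accuracy), 2))  # 1.0 -- perfectly correlated
print([round(c - a, 2) for c, a in zip(confidence, accuracy)])
# [0.2, 0.2, 0.2] -- yet uniformly overconfident by 20 percentage points
```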

The question arises, then, whether confidence is correlated with accuracy--regardless of whether decision makers are overconfident. If confidence ratings increase when accuracy increases, then accuracy can be predicted as a function of how confident a decision maker feels.  If not, then confidence is a misleading indicator of accuracy.  

Many studies have examined this issue, and the results have often shown very little relationship between confidence and accuracy. To illustrate, consider the following two problems concerning military history:

Problem 1. The government of a country not far from Superpower A, after discussing certain changes in its party system, began broadening its trade with Superpower B. To reverse these changes in government and trade, Superpower A sent its troops into the country and militarily backed the original government. Who was Superpower A--the United States or the Soviet Union? How confident are you that your answer is correct?

Problem 2. In the 1960s Superpower A sponsored a surprise invasion of a small country near its border, with the purpose of overthrowing the regime in power at the time. The invasion failed, and most of the original invading forces were killed or imprisoned. Who was Superpower A, and again, how sure are you of your answer?

A version of these problems appeared as Items #9 and #10 of the Reader Survey. If you guessed the Soviet Union in the first problem and the United States in the second, you were right on both counts. The first problem describes the 1968 Soviet invasion of Czechoslovakia, and the second describes the American invasion of the Bay of Pigs in Cuba.

Most people miss at least one of these problems, despite whatever confidence they feel.

In the November 1984 issue of Psychology Today magazine, Philip Zimbardo and I published the results of a reader survey that contained both of these problems and a variety of others on superpower conflict. The survey included 10 descriptions of events, statements, or policies related to American and Soviet militarism, but in each description, all labels identifying the United States and Soviet Union were removed. The task for readers was to decide whether "Superpower A" was the United States or the Soviet Union, and to indicate on a 9-point scale how confident they were of each answer.  

Based on surveys from 3500 people, we were able to conclude two things. First, respondents were not able to tell American and Soviet military actions apart. Even though they would have averaged 5 items correct out of 10 just by flipping a coin, the overall average from readers of Psychology Today--who were more politically involved and educated than the general public--was 4.9 items correct. Only 54 percent of the respondents correctly identified the Soviet Union as Superpower A in the invasion of Czechoslovakia, and 25 percent mistook the United States for the Soviet Union in the Bay of Pigs invasion. These findings suggested that Americans were condemning Soviet actions and policies largely because they were Soviet, not because they were radically different from American actions and policies. 

The second thing we found was that people's confidence ratings were virtually unrelated to their accuracy (the average correlation between confidence and accuracy for each respondent was only .08, very close to zero). On the whole, people who got nine or ten items correct were no more confident than less successful respondents, and highly confident respondents scored about the same as less confident respondents.

This does not mean that confidence ratings were made at random; highly confident respondents differed in a number of ways from other respondents. Two-thirds of all highly confident respondents (i.e., those who averaged more than 8 on the 9-point confidence scale) were male, even though the general sample was split evenly by gender, and 80 percent were more than 30 years old. Twice as many of the highly confident respondents wanted to increase defense spending as did less confident respondents, and nearly twice as many felt that the Soviet government could not be trusted at all. Yet the mean score these respondents achieved on the survey was 5.1 items correct--almost exactly what would be expected by chance responding. Thus, highly confident respondents could not discriminate between Soviet and American military actions, but they were very confident of misperceived differences and advocated increased defense spending.

As mentioned earlier, many other studies have found little or no correlation between confidence and accuracy (Paese & Sniezek, 1991; Ryback, 1967; Sniezek & Henry, 1989, 1990; Sniezek, Paese, & Switzer, 1990). This general pattern is particularly apparent in research on eyewitness testimony. By and large, these studies suggest that the confidence eyewitnesses feel about their testimony bears little relation to how accurate the testimony actually is (Brown, Deffenbacher, & Sturgill, 1977; Clifford & Scott, 1978; Leippe, Wells, & Ostrom, 1978). In a review of 43 separate research findings on the relation between accuracy and confidence in eye- and earwitnesses, Kenneth Deffenbacher (1980) found that in two-thirds of the "forensically relevant" studies (e.g., studies in which subjects were not instructed in advance to watch for a staged crime), the correlation between confidence and accuracy was not significantly positive.  Findings such as these led Elizabeth Loftus (1979, p. 101), author of Eyewitness Testimony, to caution: "One should not take high confidence as any absolute guarantee of anything."

Similar results have been found in clinical research.  In one of the first experiments to explore this topic, Lewis Goldberg (1959) assessed the correlation between confidence and accuracy in clinical diagnoses. Goldberg was interested in whether clinicians could accurately detect organic brain damage on the basis of protocols from the Bender-Gestalt test (a test widely used to diagnose brain damage). He presented 30 different test results to four experienced clinical psychologists, ten clinical trainees, and eight non-psychologists (secretaries).  Half of these protocols were from patients who had brain damage, and half were from psychiatric patients who had nonorganic problems. Judges were asked to indicate whether each patient was "organic" or "nonorganic," and to indicate their confidence on a rating scale labeled "Positively," "Fairly certain," "Think so," "Maybe," or "Blind guess."

Goldberg found two surprising results. First, all three groups of judges--experienced clinicians, trainees, and non-psychologists--correctly classified 65 to 70 percent of the patients. There were no differences based on clinical experience; secretaries performed as well as psychologists with four to ten years of clinical experience. Second, there was no significant relationship between individual diagnostic accuracy and degree of confidence. Judges were generally as confident on cases they misdiagnosed as on cases they diagnosed correctly. Subsequent studies have found miscalibration in diagnoses of cancer, pneumonia (see Figure 19.2), and other serious medical problems (Centor, Dalton, & Yates, 1984; Christensen-Szalanski & Bushyhead, 1981; Wallsten, 1981).

HOW CAN OVERCONFIDENCE BE REDUCED?

In a pair of experiments on how to improve calibration, Lichtenstein and Fischhoff (1980) found that people who were initially overconfident could learn to be better calibrated after making 200 judgments and receiving intensive performance feedback. Likewise, Hal Arkes and his associates found that overconfidence could be eliminated by giving subjects feedback after five deceptively difficult problems (Arkes, Christensen, Lai, & Blumer, 1987). These studies show that overconfidence can be unlearned, although their applied value is somewhat limited.  Few people will ever undergo special training sessions to become well calibrated.

What would be useful is a technique that decision makers could carry with them from judgment to judgment--something lightweight, durable, and easy to apply in a range of situations. And indeed, there does seem to be such a technique. The most effective way to improve calibration seems to be very simple:

Stop to consider reasons why your judgment might be wrong.

The value of this technique was first documented by Asher Koriat, Sarah Lichtenstein, and Baruch Fischhoff (1980). In this research, subjects answered two sets of two-alternative general knowledge questions, first under control instructions and then under reasons instructions. Under control instructions, subjects chose an answer and estimated the probability (between .50 and 1.00) that their answer was correct. Under reasons instructions, they were asked to list reasons for and against each of the alternatives before choosing an answer.  

Koriat, Lichtenstein, and Fischhoff found that under control instructions, subjects showed typical levels of overconfidence, but after generating pro and con reasons, they became extremely well calibrated (roughly comparable to subjects who were given intensive feedback in the study by Lichtenstein and Fischhoff). After listing reasons for and against each of the alternatives, subjects were less confident (primarily because they used .50 more often and 1.00 less often) and more accurate (presumably because they devoted more thought to their answers).

In a follow-up experiment, Koriat, Lichtenstein, and Fischhoff found that it was not the generation of reasons per se that led to improved calibration; rather, it was the generation of opposing reasons. When subjects listed reasons in support of their preferred answers, overconfidence was not reduced. Calibration improved only when subjects considered reasons why their preferred answers might be wrong. Although these findings may be partly a function of "social demand characteristics" (i.e., subjects feeling cued by instructions to tone down their confidence levels), other studies have confirmed that the generation of opposing reasons improves calibration (e.g., Hoch, 1985).

These results are reminiscent of the study by Paul Slovic and Baruch Fischhoff (1977) discussed in Chapter 3, in which hindsight biases were reduced when subjects thought of reasons why certain experimental results might have turned out differently than they did. Since the time of Slovic and Fischhoff’s study, several experiments have shown how various judgment biases can be reduced by considering the possibility of alternative outcomes or answers (Griffin, Dunning, & Ross, 1990; Hoch, 1985; Lord, Lepper, & Preston, 1984). 

FIGURE 19.4 The difficult task of considering multiple perspectives. (Calvin and Hobbes copyright 1990 Watterson. Dist. by Universal Press Syndicate. Reprinted with permission. All rights reserved.)

As Charles Lord, Mark Lepper, and Elizabeth Preston (1984, p. 1239) pointed out: "The observation that humans have a blind spot for opposite possibilities is not a new one. In 1620, Francis Bacon wrote that 'it is the peculiar and perpetual error of human intellect to be more moved and excited by affirmatives than by negatives.'" In Chapter 20, this blind spot--and some of its consequences--will be explored in detail.

CONCLUSION

It is important to keep research on overconfidence in perspective. In most studies, average confidence levels do not exceed accuracy by more than 10 to 20 percent. Consequently, overconfidence is unlikely to be catastrophic unless decision makers are nearly certain that their judgments are correct. As the explosion of the space shuttle Challenger illustrates, the most devastating form of miscalibration is inappropriate certainty.

Taken together, the studies in this chapter suggest several strategies for dealing with miscalibration: 

First, you may want to flag certain judgments for special consideration. Overconfidence is greatest when judgments are difficult or confidence is extreme. In such cases, it pays to proceed cautiously.

Second, you may want to "recalibrate" your confidence judgments and the judgments of others. As Lichtenstein and Fischhoff (1977) observed, if a decision maker is 90 percent confident but only 70 to 75 percent accurate, it is probably best to treat "90 percent confidence" as though it were "70 to 75 percent confidence."

Along the same lines, you may want to automatically convert judgments of "100 percent confidence" to a lesser degree of confidence. One hundred percent confidence is especially unwarranted when predicting how people will behave (Dunning, Griffin, Milojkovic, & Ross, 1990).

Above all, if you feel extremely confident about an answer, consider reasons why a different answer might be correct. Even though you may not change your mind, your judgments will probably be better calibrated.