# Modeling Extreme Model Uncertainty

By Holden Karnofsky

*Thanks to Jacob Steinhardt, Dario Amodei, Nick Beckstead, Jonah Sinick and Jared Kaplan for in-depth discussion of earlier drafts and substantial help clarifying this content.*

My post on sequence vs. cluster thinking is informal and impressionistic, aiming to give a picture of two different, imperfect ways of approximating what I see as the "ideal" thought process, and discussing what I see as the strengths and weaknesses of each style. This piece attempts a more formal illustration of some of the key aspects of what I see as the "ideal" thought process. It is a continuation of previous attempts to lay out my model of Bayesian reasoning under conditions of high uncertainty (here and here), which is important when estimating the relative merits of very different (and high-uncertainty) giving opportunities.

I am not at all confident that this piece lays out a model that captures all the important aspects of my thinking on the topic, and I also think it's quite possible that the model is badly flawed. My aim is not to assert a formal demonstration that my views are correct; rather, I'm attempting to use formalism as a way of better clarifying (for technically-minded people) the underlying reasoning behind my claims about sequence vs. cluster thinking, and thereby making it easier for people to understand and critique what I believe.

*Added July 18: Note that the below works with medians rather than means, because the use of extremely fat-tailed distributions produces unrealistically high means. The qualitative points made apply to the means as well as the medians, but it is hard to construct an extremely fat-tailed distribution that matches other intuitive properties and has an intuitive mean, and we have not put in the effort to do so. More at this comment exchange.*

*Published: June 2014*

## Motivating challenge: risk vs. uncertainty

This piece focuses on the conundrum of "risk vs. uncertainty" and how to express it in the language of expected value. Specifically, how should one model the difference between the following two statements:

*Case 1. I will make $100 if the result of a fair coin flip is "heads" (P1=50%). Therefore, my expected winnings are $100*(50%) = $50.*

*Case 2. I will make $100 if the business I'm investing in ends up going public. I know very little about the business and I have no idea how to assess the probability that it ends up going public; my best guess is 60%, but I'll adjust it down to P2=50% to be conservative. Therefore, my expected winnings are $100*(50%) = $50.*

"Risk" refers to the probabilities seen in well-understood and robust models, such as P1, such that one can be confident in what would happen over a large number of trials but not over a small number. "Uncertainty" refers to fuzzy, poorly modeled lack of understanding (or to "missing pieces of one's model"), and is more relevant for the 2nd case.

The difference can be extreme for the sorts of questions GiveWell deals with. When trying to estimate the good accomplished by a giving opportunity, one often needs to rely on highly speculative models with substantial guesswork, which sometimes produce extremely high or low expected values with extreme uncertainty. This situation applies to the analysis behind our top charities and is becoming yet more relevant as we broaden our scope. It is a situation that is not commonly addressed in the academic literature on decision theory that I've seen.

To the extent I have seen attempts to formalize the best approaches to these situations (usually though not always outside of academia), I have found them unsatisfying, largely because they tend to treat the two calculations above - the first well-understood, the second less so - as equivalent, in the sense that they produce the same expected value. I believe that doing so causes problems and partly explains some of the weaknesses of relying too much on explicit expected-value estimation. In this piece, I try to take an alternative approach, and in so doing to clarify the intended meaning of earlier posts on the importance of robustness (here and here).

I start by describing the approach I see most often in the relevant domains; I then lay out my approach. Both attempt to approximate ideal Bayesian decision theory, but they go about it differently.

## The familiar approach to modeling extreme uncertainty

The approach I most often see to dealing with extreme uncertainty implicitly asserts some combination of the following:

1. My belief about the likelihood that X is true can be fully captured in a subjective probability, P(X) (which is a scalar). I can determine my P(X) by posing the following thought experiment to myself: "What is N1 such that I would be willing to bet that X is true at N1:1 odds? Once I have answered this, P(X) ≥ 1/(N1+1). What is N2 such that I would be willing to bet that X is false at N2:1 odds? Once I have answered this, P(X) ≤ 1-1/(N2+1)." Having the two resulting bounds on P(X) be close to each other is a sign of intellectual rigor and virtue, suggesting that I have introspected well and found my true P(X).
2. When I lack confidence in my model (when I have high uncertainty), I should avoid "extreme" probabilities (probabilities close to 0 or 1). Any model I build to estimate a probability could be flawed, and the mere fact that my model could be flawed suggests that I should not let it hand me excessively low probabilities that I take at face value. It's generally a sign of extreme confidence to assign a probability of less than 10^-6 to any particular difficult-to-assess and debatable proposition (or its negation).
3. When estimating a probability, I might make a series of adjustments, such as adjusting it up or down to account for uncertainty or priors. However, once I've made such adjustments, I can use my P(X) to estimate expected value, and should then act consistently with such estimates. If P(X) implies that a certain action has higher expected value than any other action, I should take said action.
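
To make the betting thought experiment concrete, here is a minimal sketch of how the two answers bracket P(X). The function name `probability_bounds` and the example odds are hypothetical, introduced only for illustration.

```python
def probability_bounds(n1: float, n2: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on P(X) implied by two betting answers.

    n1: the odds N1:1 at which I would bet that X is true.
    n2: the odds N2:1 at which I would bet that X is false.
    """
    lower = 1.0 / (n1 + 1.0)        # P(X) >= 1/(N1+1)
    upper = 1.0 - 1.0 / (n2 + 1.0)  # P(X) <= 1 - 1/(N2+1)
    return lower, upper


# Hypothetical answers: bet on X at 3:1, bet against X at 1:1.
low, high = probability_bounds(3, 1)
print(f"{low:.2f} <= P(X) <= {high:.2f}")  # 0.25 <= P(X) <= 0.50
```

The closer the two answers pull these bounds together, the more this approach treats P(X) as pinned down.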

I believe this approach is unsatisfactory because:

- It treats the two cases at the opening of the post equivalently, when intuitively it seems like there must be an important difference. Specifically, cluster thinking implies that one should be more hesitant to take action when one is more uncertain about the reasoning behind the action, and thus implies that the 2nd case (from the beginning of this piece) is an iffier proposition than the 1st case. Even assuming that the "50%" in the 2nd case has tried to take factors such as Bayesian adjustment into account, it remains the case that the 2nd expected-value estimate is on shakier ground, and should - in my view - be given less weight in decision-making.
- The "betting" thought experiment is useful for many cases, but seems to become problematic in some cases, particularly when "extreme" probabilities (as defined above) are involved. For example, say that I'm trying to find my subjective P("A flying spaghetti monster is behind my head"). I would be hesitant to offer anyone a million-to-one or better odds, at any meaningful stakes, on just about *any* bet, because my potential losses would be so large; does this mean that my P("A flying spaghetti monster is behind my head") should not be much smaller than 10^-6? The truth is that at such odds, I'd prefer not to bet in either direction regarding this proposition, and so the thought experiment fails to be helpful; but this situation often seems to be interpreted as implying that assigning too low a probability would be "overconfident." (Note that there are alternative thought experiments for the same purpose, some of which I find noticeably superior, but the "betting" frame is the one I most often see used, and I think all such thought experiments have serious limitations.)
- #2 above is particularly problematic in my view. When evaluating a proposition such as "Donating to Charity X will save the equivalent of 10^50 future lives," one shouldn't be reasoning from *confusion and uncertainty* ("I don't feel confident enough to assign a probability under 10^-6, so I'll say the probability is at least 10^-6") to *high certainty in a surprising conclusion* ("Donating to Charity X has an expected value of at least 10^44 lives saved (10^50 * 10^-6) and is therefore overwhelmingly valuable"). Yet I believe I have seen some people seem to use exactly that chain of reasoning. Additionally, it seems difficult to reconcile "assigning reasonably non-extreme probabilities whenever one has low confidence" with "having probabilities over mutually exclusive events that do not sum to more than 1."
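
The arithmetic behind the last objection is short enough to write out. The sketch below uses exact fractions; the figure of two million mutually exclusive claims is a hypothetical chosen only for illustration.

```python
from fractions import Fraction

FLOOR = Fraction(1, 10**6)  # the "not too extreme" probability floor
CLAIMED_LIVES = 10**50      # size of the claimed payoff

# Expected value implied by taking the floor at face value.
implied_ev = FLOOR * CLAIMED_LIVES
print(implied_ev == 10**44)  # True

# Hypothetical: two million mutually exclusive claims, each given the floor.
n_claims = 2 * 10**6
total_probability = FLOOR * n_claims
print(total_probability)     # 2, which is more than 1
```

A probability floor applied to enough mutually exclusive propositions cannot be a coherent probability assignment.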

Some have interpreted my previous writing as suggesting the *reverse* of #3: that rather than adjusting small probabilities *upward* to account for uncertainty, we should adjust such probabilities *downward* - i.e., we should treat the probability of an event as extremely small if we have high uncertainty about what the "correct" probability should be. I have not intended to communicate this, and believe such an approach would be highly flawed as well. In fact, I do not advocate using Bayesian adjustment to adjust one's probability estimates as suggested in #3.

## Proposed alternative approach

Here I outline one approach that seems superior to me. My description of this approach is inelegant, and is intended to stimulate thought and communicate an underlying concept. I also provide a worked example to better illustrate this approach.

### Definitions and summary

m_1, m_2, … m_n = the several predictive models I use to assess the outcome of a contemplated action.

e_i = the *expected* utility of the contemplated action according to model i.

f_i = the expected utility that model i *would* imply, if model i were improved (missing parameters added, etc.). This is unknown and is equal to e_i in expectation.

F_i(X) = probability distribution over f_i(X). F_i has mean e_i by construction, but other aspects of F_i (variance, functional form) can vary.

u_i = standard deviation of F_i, representing the "uncertainty" associated with model m_i

If all F_i = N(e_i,u_i), and if the probability distributions are combined in a relatively simple and standard way, the resulting probability distribution has expectation:

$$\frac{\sum_{i=1}^{n} e_i / u_i^2}{\sum_{i=1}^{n} 1 / u_i^2}$$

*Updated Oct. 10: this also assumes that the F_i are uncorrelated. See this blog comment for a formula relaxing that assumption.*

This has the property that models with more uncertainty (higher u_i) have less impact on the ultimate estimate of expected value, and models with sufficiently high uncertainty have negligible impact. (More on these properties here.)
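
As a rough numerical illustration of this property, the sketch below assumes the Gaussian, uncorrelated case, in which multiplying the densities yields a precision-weighted average of the e_i (each weighted by 1/u_i^2). The function name `combined_expectation` and the numbers are hypothetical, chosen only to show how a high-u_i model fades out.

```python
def combined_expectation(estimates, uncertainties):
    """Precision-weighted mean of model outputs e_i with standard deviations u_i."""
    weights = [1.0 / (u * u) for u in uncertainties]
    weighted_sum = sum(w * e for w, e in zip(weights, estimates))
    return weighted_sum / sum(weights)


# Hypothetical numbers: one model says 1, the other says 1000.
equal_trust = combined_expectation([1.0, 1000.0], [1.0, 1.0])
shaky_second = combined_expectation([1.0, 1000.0], [1.0, 1000.0])

print(equal_trust)   # 500.5: equal uncertainty gives a plain average
print(shaky_second)  # very close to 1: the high-uncertainty model barely matters
```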

Using non-normal distributions generally preserves these qualitative properties, as in the detailed worked example.

### More explanation

In line with my take on sequence thinking vs. cluster thinking, I think of myself as having multiple internal models of the world; when I consider a potential action I might take, each model produces a separate output for the expected value of this action. I call the models m_1, m_2, … m_n and their respective outputs for "expected value" of a given action e_1, e_2, … e_n. (By "model" I mean any function for mapping a contemplated action to an estimate of the action's expected value. Note that my "prior" can be modeled as one or more of the m_i.)

I recognize that each model is incomplete and has potential flaws and missing parameters. For each model, "correcting" the model - specifying it more intelligently, adding in missing parameters, and generally turning it into "what it would be if I were more intelligent and informed" (which isn't the same as turning it into a perfect model of reality) - would change its output. Call the hypothetical *modified* outputs f_1, f_2, … f_n. In *expectation*, f_i = e_i, but I might have a probability distribution over each f_i, which I'll call F_i. In order to incorporate all the information I have and arrive at an estimate of what a "corrected" set of models would imply about the expected value of the action, I need to use some form of model combination and adjustment that incorporates the F_i.

Uncertainty for model m_i can be thought of as the variance or "fatness" of F_i. The fatter the probability distribution F_i, the less confidence this implies in my model, and the less weight model m_i will generally (depending on the specifics of combination method) carry after combining the models. To give a simplified example, suppose that all the F_i are Gaussian with standard deviation u_i, and that the combination method is to use the product or geometric mean of the probability distributions (more on why these combination methods make sense at the link). In this case, the overall expected value of taking the contemplated action would be:

$$\frac{\sum_{i=1}^{n} e_i / u_i^2}{\sum_{i=1}^{n} 1 / u_i^2}$$

Using non-Gaussian distributions and/or other combination methods would complicate the actual formula for calculating overall expected value, but in general would not change the qualitative picture: when combining two probability distributions, it is robustly true that a "fatter" distribution will cause less of an update from the distribution it is combined with, and that a sufficiently "fat" distribution (approximating constant probability density) will cause negligible such updating regardless of where its midpoint lies. (The worked example I provide uses non-Gaussian, very fat-tailed probability distributions.)
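
The qualitative claim can be checked numerically for a fat-tailed case. The sketch below is an illustration under stated assumptions, not the worked example referred to above: it assumes Cauchy-shaped distributions, combines two of them by pointwise product on a grid, and reports the median, since the mean of such distributions is not well behaved. All names and numbers are hypothetical.

```python
import math


def cauchy_pdf(x, center, scale):
    """Density of a Cauchy distribution, a very fat-tailed shape."""
    z = (x - center) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def combined_median(center_a, scale_a, center_b, scale_b,
                    lo=-500.0, hi=500.0, steps=100_001):
    """Median of the pointwise product of two Cauchy densities on a grid."""
    dx = (hi - lo) / (steps - 1)
    xs = [lo + i * dx for i in range(steps)]
    product = [cauchy_pdf(x, center_a, scale_a) * cauchy_pdf(x, center_b, scale_b)
               for x in xs]
    half = sum(product) / 2.0
    running = 0.0
    for x, p in zip(xs, product):
        running += p
        if running >= half:  # first grid point at which half the mass is covered
            return x
    return xs[-1]


# First distribution centered at 0; second centered at 10, equally fat or far fatter.
same_fatness = combined_median(0.0, 1.0, 10.0, 1.0)
much_fatter = combined_median(0.0, 1.0, 10.0, 100.0)
print(same_fatness, much_fatter)
```

With equal scales the combined median lands about midway between the two centers; when the second distribution is made 100 times fatter, the combined median stays very close to the first distribution's center.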

### Some applications

Using this framework gives an alternative interpretation of the difference between case 1 and case 2 from the beginning of this piece: each represents a single model of one's expected value, outputting e_i = $50. However, the probability distribution over f_i looks different for the two cases: in case 1, F_i is tightly concentrated around f_i = $50, whereas in case 2, F_i has a higher-variance distribution.

This matters because an explicit expected-value estimate is almost never the only model we have for evaluating an action. We also have a variety of other perspectives for evaluating the action: does this action deviate greatly from "normality?" Does it deviate from what expert opinion would suggest? Does this action pattern-match (even superficially) to other successful actions or to other failed actions? (More at my post on sequence vs. cluster thinking.) Such perspectives can be interpreted as alternative mental models, additional m_i's that each have their own implications for the expected value of the action. We ought to arrive at our final conclusion by combining the probability distributions from all available perspectives (including those that are generally considered part of one's prior). When we do so, a model such as Case 1 will carry more weight (have more impact on the final, all-things-considered expected-value estimate) than a model such as Case 2.

The upshot is that higher-uncertainty models carry less *weight in one's final decision* than lower-uncertainty models, no matter what they conclude. In other words, beliefs and calculations that one feels are robust end up playing a larger role in one's actions than beliefs and calculations that one feels are not robust. This dynamic is the same as what I previously associated with "cluster thinking."

Further examples of how this framework can be used to interpret specific cases:

- Say that I am contemplating the opportunities discussed at the beginning of this piece, and say that I have a poor (though not overwhelmingly poor) track record when it comes to investing (only 1/3 of my bets are winners). The very fact that my explicit calculation of the expected value is positive should - in light of my track record - cause me to become suspicious of my own conclusion to some degree. However, in the first case, I have high certainty that my calculation is correct, so I should take the opportunity despite this concern. In the second case, I have high uncertainty in my calculation, so the fact that it implies a high expected value means that the calculation has a reasonably high probability of having erred in the positive direction. More specifically, if we call my expected-value calculation m_1 and my "outside view" of my own track record m_2, in the first case F_1 has little variance, and in the second case F_1 has higher variance, so the two can lead to different conclusions: perhaps, despite the fact that I initially estimated an EV of $50 in each case, I should take the first opportunity and not the second.
- Say that I am contemplating the proposition, "Donating $X to this charity will save the equivalent of 10^50 lives." The model in which I consider this proposition and try (via thought experiment) to assign a probability might be termed m_1. I am entirely uncertain about how to assign such a probability; my best guess is 10^-6 (producing e_1 = 10^44 lives saved), but I'm sufficiently uncertain that F_1 has roughly even (in log space) probability mass across the whole range from ~0 to 10^49 lives saved. At the same time, I might have another model (termed m_2) that represents my expectation for the general reference class of "donating $X to a charity," which places 67% of its probability mass between -1 and +1 standard deviations of average, while having a very fat tail allowing reasonable probability of much more extreme figures. Since F_1 places roughly constant probability mass over the range of 0 to 10^49 lives saved, combining it with m_2 will yield something very close to m_2 (only having the effect of ruling out extremely small or extremely large estimates, as well as skewing estimates slightly to the left due to the log scale). In other words, the near-constant probability distribution over f_1 (reflecting massive uncertainty about my estimate) will cause essentially no update to the "prior" represented by m_2, even assuming fat tails for both distributions. (A more worked out example of this sort of reasoning is available here.)

Note that the reasoning in the above paragraph holds independently of the particular choice of mean value for m_2, as long as the mean of m_2 lies within the bulk of the probability mass given by m_1. Therefore, whenever m_1 gives very uncertain estimates, and these estimates are not incompatible with the estimates given by m_2, the combined model will be very close to m_2. Thus, even if one does not know the mean for m_2, one can still know that combining m_2 with m_1 will give an estimate that is close to m_2; in other words, if one's explicit expected value estimate has wide scope for uncertainty, the combined model predicts an adjusted expected value that is close to the mean value of all giving opportunities (independently of what that mean value is). This is important because it allows one to apply the above line of reasoning without the necessity of actually calculating the expected value of the average giving opportunity; even without doing so, one can conclude that the value of the contemplated giving opportunity *relative to other giving opportunities* is not likely to be extraordinary.
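
A small numerical sketch of this point, under loud assumptions: the grid is over log10(lives saved); F_1 is taken to be exactly flat between 10^-10 and 10^49 (the lower cutoff is my own stand-in for "~0"); and m_2 is given a fat-tailed Student-t shape with 2 degrees of freedom. None of these choices come from the text above; they only illustrate that the combined median tracks the prior's median wherever the prior is centered.

```python
def t_like_shape(x, center, scale):
    """Fat-tailed bell shape (Student-t, 2 degrees of freedom, unnormalized)."""
    z = (x - center) / scale
    return (1.0 + z * z / 2.0) ** -1.5


def flat_f1(x):
    """Assumed F_1: even mass in log space from 10^-10 to 10^49, zero outside."""
    return 1.0 if -10.0 <= x <= 49.0 else 0.0


def median_on_grid(xs, weights):
    """First grid point at which the running weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for x, w in zip(xs, weights):
        running += w
        if running >= half:
            return x
    return xs[-1]


# Grid over log10(lives saved), from 10^-20 to 10^60 in steps of 0.01.
xs = [-20.0 + 0.01 * i for i in range(8001)]

results = {}
for prior_center in (0.0, 2.0, 5.0):  # hypothetical centers for m_2
    prior = [t_like_shape(x, prior_center, 1.0) for x in xs]
    combined = [p * flat_f1(x) for x, p in zip(xs, prior)]
    results[prior_center] = (median_on_grid(xs, prior), median_on_grid(xs, combined))
    print(prior_center, results[prior_center])
```

In this sketch the flat F_1 only trims the far tails of m_2, so the combined median stays very close to the prior median for each choice of center.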

"Outside views" usually are best defined as views about the *overall desirability* of an action, a concept closer to "expected value of an action" than to "probability of a proposition." It's for this reason that I advocate performing Bayesian adjustment on probability distributions over f_i, rather than using it to adjust the probability parameters within a model.

For those seeking a more concrete worked example, I provide one at the following link:

### Rethinking introspected probabilities with this framework in hand

My view is that when we perform thought experiments like "At what odds would I bet on X?" we are implicitly introspecting particular models within our system - not introspecting our *overall* subjective probability of X, which could only be arrived at by a multidimensional integration over every possible "corrected" version of each of the m_i's. In other words, it may be computationally harder to introspect our "overall probability" of a particular event than to compute either (a) our overall expected value, or (b) our probability according to a particular model. (Consistent with this, I often have more confidence in my view about whether an action is a good idea than in any particular view about the action's possible outcomes, and using "betting" thought experiments to consider the latter often does not change this.)

According to my view, and using the framework laid out above, it is not correct to say that an introspected probability must be "non-extreme" in order to express "low confidence." One might assign an extremely low probability to a proposition, while having very little confidence (fat F_i) in the model. In such a case, the low introspected probability indicates neither high confidence nor lack of openness to new evidence.

It is often easier to say what I think one should do than why I think one should do it (and to name the subjective probabilities that go into the "why"). A particularly vivid example is to imagine that I've been offered the chance to buy Apple stock at 50% of its market price: I might be confident that this action has good expected value without having a confident picture of the probability that the stock will go to any particular price P. Sufficient integration over all my internal models could produce such a probability in theory, but such a figure would be both difficult to produce and not very meaningful. A "would I bet" thought experiment will produce probabilities according to a particular model that may not match these not-very-meaningful overall probabilities.

### Pascal's Mugging

The aspect of Pascal's Mugging I find most interesting is that it seems *immediately obvious* that one should not be "mugged"; yet it becomes far less obvious if we imagine that the mugger presents a *reliable and precise* probability. I'm aware that there are arguments in favor of having a bounded utility function (and believe that a bounded utility function may be appropriate), but appealing solely to a utility function bound doesn't seem to reproduce these qualities of the proper response to Pascal's Mugging. I wish to be the sort of person who would happily pay $1 for a robust (reliable, true, correct) 10/N probability of saving N lives, for astronomically huge N - while simultaneously refusing to pay $1 to a random person on the street claiming s/he will save N lives with it.

The picture laid out above makes it fairly easy to model such a situation. There is no need to explicitly assign an infinitesimally low prior probability to the mugger's claim; instead, all one needs is (a) a modestly informative prior (or other mental model, such as "giving money to strangers is often a bad idea and rarely a good one"); (b) a probability estimate of being "mugged" that is highly *uncertain*, such that the resulting mental model has an approximately constant F_i distribution and is approximately uninformative.

Of course, the combination of models still will presumably imply *some* subjective probability that the mugger's claim is valid, but this probability need not be particularly low (one could imagine many scenarios in which the mugger's claim is valid but other actions have similarly large effects unbeknownst to the actor, for example) and, if it is low, need not represent a lack of openness to new evidence.

## Clarifications regarding previous posts

I have previously written about why "regression to the prior" means we cannot take expected-value estimates literally and should seek to maximize the impact of our actions by examining them from many different angles. These posts have caused some confusion, and with the above discussion in hand, I hope to clarify my views.

Below, I give what I see as some common misinterpretations of past posts, in quotes, followed by my response.

- "Holden is resistant to speculative, hard-to-quantify longshot bets." This is not the case. (In fact, I have always seen GiveWell itself as a speculative, hard-to-quantify longshot bet.) There are many cases in which the odds of success are hard to quantify precisely, yet an action still seems highly worthwhile because (a) the odds of success seem *reasonably high* (robustly above a certain level, and thus implying expected value robustly above a certain level, even if the probability is hard to precisely quantify); and/or (b) other heuristics recommend the action. (For example, "one should take precautions against the most threatening risks one can think of" is a heuristic I believe in, and that recommends e.g. global catastrophic risk prevention even in the absence of precise expected-value estimates.)
- "Holden believes that hard-to-quantify or otherwise uncertain probabilities should be adjusted downward and be assumed to be very low." I believe the best framework for handling uncertain probabilities likely involves something closer to the sort of "probability distribution over [improved/corrected] expected value" described above, rather than assigning a fixed numerical probability to each proposition and working from these probabilities toward conclusions about expected value. There are cases in which, according to the above framework, a highly uncertain probability should be implicitly assumed to be high; the key question is how robust the overall model is and what other available models (e.g., "outside views") suggest.
- "Holden believes that extremely good outcomes are overwhelmingly unlikely." I believe there are a couple of confusions here:
    - The statement confuses "hypothetical modified expected value" (f_i) with "actual future value." The "prior" I have discussed in the past is over the former, not the latter. One can simultaneously believe that a substantial percentage of likely outcomes would be extraordinarily good, and that no particular action has extraordinarily high *expected value.* (This is exactly what I believe about financial investments made by a naive investor: out of 100 such investments, I'd expect several to end up having enormous gains, yet I'd model each as having an *expected value* less than zero.) When thinking about the likelihood that an action will have extraordinarily good outcomes, one ought to think mostly about the frequency of good outcomes in the past; when thinking about the likelihood that an action has extraordinarily high *expected value* according to an intelligent model, one ought instead to be thinking about topics related to broad market efficiency: how likely is it that I am in position to take an unusually intelligent and promising action? I believe the second perspective is more likely to call for a normal-like distribution, though I also feel this isn't a key point (see next point).
    - Much of the reaction to my earlier posts focused on my choice of a normal distribution for illustrative purposes, and seemed to assume that the "thin tails" of a normal distribution (assigning extremely low probabilities to outcomes far from expectation) were key to my claims. I believe they are not, as discussed in the next point.
- "Holden's beliefs about the world, including how he resolves 'Pascal's Mugging,' depend on a normally-distributed prior, and such a probability distribution assigns low probabilities to good outcomes that are inconsistent with (a) history and (b) openness to new evidence."
    - I believe that the qualitative picture I've previously laid out is quite robust to the choice of distribution. Generally, any time one has a modestly informative prior and a near-uninformative estimate (one whose probability density function over f_i as I have defined it approximates a constant), the estimate will have negligible impact on one's conclusion. The "peaked" charts I've previously laid out will emerge from such a dynamic for a wide variety of distributions, though the location of the peak will vary.
    - In order to reject high-uncertainty claims of high expected value, one need not explicitly assign infinitesimally low probabilities to anything in particular. One need only combine a modestly informative prior with a near-uninformative estimate. The combination of the two will ultimately imply very low probability of some propositions, but assigning very low probability to some propositions seems like something one must do in any case, and doing so need not be associated with being overconfident: low probability can be consistent with low confidence in one's model and a fat probability distribution over its f_i. Assigning low probabilities also need not be in tension with being open to new evidence, as I argued at the bottom of this comment.

- "Holden believes that Bayesian adjustment is sufficient to rule out donating to any charity focused on a low-probability, high-upside goal such as averting human extinction." I do not believe this; I have long considered global catastrophic risk reduction to be a promising philanthropic cause. The comments I have made about how to model and handle uncertainty are not sufficient, by themselves, to reach any particular conclusion about where one should donate. What they are intended to establish is that when the arguments for a giving opportunity are sufficiently weak, one should generally avoid the giving opportunity, and should certainly not reason from "trying not to be overconfident" to "assigning not-too-low probability to a highly implausible claim" to "attributing extremely high expected value to acting on this claim." However, if one has gotten to know the relevant people and relevant issues enough to reduce one's uncertainty (along the lines of the first bullet point here), one can be justified in donating to charities with very small estimated probability of success. (Note: upon review of this piece, MIRI wished me to add that it believes itself to be working on a high-probability risk, not a low-probability one.)
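
The naive-investor point in the clarifications above (several big winners out of 100, yet negative expected value for each investment) can be made concrete with a little arithmetic. The win probability and payoff multiple below are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only to illustrate the distinction.
n_investments = 100
p_win = Fraction(3, 100)  # chance that a given investment pays off hugely
payoff_multiple = 20      # a winner returns 20x the stake; a loser returns nothing
stake = 1                 # dollars put into each investment

expected_winners = n_investments * p_win
expected_value_each = p_win * payoff_multiple * stake - stake

print(expected_winners)     # 3
print(expected_value_each)  # -2/5
```

Under these assumptions one expects a few enormous gains among the 100 investments, while each individual investment still loses money in expectation.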