We’re going deep into science today, fasten your seatbelts.
I came across a couple of really interesting articles recently that profoundly call into question our assumptions about how our world works:
While the former looks at decision-making and the latter at statistical analysis, both challenge the assumptions that underlie the common methods. And in both cases the culprit is an error of omission: ignoring some data about the real world leads the method to yield sub-optimal, if not completely flawed, conclusions.
Ergodicity Economics highlights the challenges with the assumption that people use an “expected utility” strategy when making decisions under conditions of uncertainty.
The expected utility strategy posits that, given a choice between several options, people should choose the option with the highest expected utility, calculated by summing, over all possible scenarios, the probability of each scenario multiplied by the value of its outcome.
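As a toy illustration (the options and payoff numbers here are invented for the example), the calculation just weights each outcome by its probability and picks the option with the highest total:

```python
# Hypothetical choice: a guaranteed payoff vs. a 50/50 gamble.
# Each option is a list of (probability, outcome_value) scenarios.
options = {
    "safe":  [(1.0, 100)],              # guaranteed 100
    "risky": [(0.5, 250), (0.5, 0)],    # 50/50 shot at 250 or nothing
}

def expected_utility(scenarios):
    """Sum of probability-weighted outcome values."""
    return sum(p * value for p, value in scenarios)

best = max(options, key=lambda name: expected_utility(options[name]))
print({name: expected_utility(s) for name, s in options.items()})
print("choice:", best)
```

The risky option wins on expectation (125 vs. 100) even though half the time it pays nothing.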
The challenge with this strategy is that it ignores an important aspect of real life: time. Or more specifically, the fact that life is a sequence of decisions, so no decision is made in isolation: each takes into account the consequences of the decisions already made and the potential consequences of the decisions still to come.
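The standard illustration from the ergodicity economics literature (the coin-flip gamble with Peters' usual parameters, not an example from this article) makes the role of time concrete: each round, wealth grows 50% on heads and shrinks 40% on tails. The expected value of a round is +5%, but the time-average growth factor is √(1.5 × 0.6) ≈ 0.95 per round, so the typical player who repeats the gamble loses wealth:

```python
import random

random.seed(42)

def play(rounds, wealth=100.0):
    """Bet everything each round: wealth grows 50% on heads, shrinks 40% on tails."""
    for _ in range(rounds):
        wealth *= 1.5 if random.random() < 0.5 else 0.6
    return wealth

# Ensemble view: each round multiplies wealth by 0.5*1.5 + 0.5*0.6 = 1.05
# in expectation. Time view: the per-round growth factor experienced by a
# single player over many rounds is sqrt(1.5 * 0.6) ~= 0.9487.
results = [play(100) for _ in range(10_000)]
mean_wealth = sum(results) / len(results)
median_wealth = sorted(results)[len(results) // 2]
print(f"mean:   {mean_wealth:.2f}")    # pulled up by a handful of lucky runs
print(f"median: {median_wealth:.4f}")  # the typical player ends far below 100
```

The gap between the mean (the ensemble average that expected utility sees) and the median (what a typical individual experiences over time) is the whole point.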
This has some profound implications for cooperation and competition and the conditions under which they are beneficial strategies. Expected utility suggests that people or businesses should cooperate only if, by working together, they can do better than by working alone. For example, if the different parties have complementary skills or resources. Without the potential of a beneficial exchange, it would make no sense for the party with more resources to share or pool them together with the party who has less.
But when we expand the lens to look not just at a single point in time but a period of time in which a series of risky activities must be taken, the optimal strategy changes. Pooling resources provides all parties with a kind of insurance policy protecting them against occasional poor outcomes of the risks they face. If a number of parties face independent risks, it is highly unlikely that all will experience bad outcomes at the same time. By pooling resources, those who do can be aided by others who don’t. Cooperation can be thought of as a “risk diversification” strategy that, mathematically at least, grows the wealth of all parties. Even those with more resources do better by cooperating with those who have less.
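A minimal simulation of this pooling effect (my own sketch, reusing the hypothetical +50%/−40% gamble rather than anything from the article): two cooperators face independent risks but split their combined wealth every round, and over a long horizon they outgrow two otherwise identical solo players:

```python
import random

random.seed(0)

ROUNDS = 10_000

def gamble(wealth):
    """One round of an independent multiplicative risk: +50% or -40%."""
    return wealth * (1.5 if random.random() < 0.5 else 0.6)

solo_a = solo_b = 100.0   # two players who keep their winnings separate
coop_a = coop_b = 100.0   # two players who pool and split every round

for _ in range(ROUNDS):
    solo_a, solo_b = gamble(solo_a), gamble(solo_b)
    coop_a, coop_b = gamble(coop_a), gamble(coop_b)
    coop_a = coop_b = (coop_a + coop_b) / 2   # the "insurance policy"

print(f"solo players: {solo_a:.3e}, {solo_b:.3e}")
print(f"cooperators:  {coop_a:.3e} each")
```

Sharing outcomes raises the time-average growth rate for both parties, which is why, mathematically, even the richer party does better by pooling.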
Consider the following story (paraphrased from Clayton’s piece):
A woman notices a suspicious lump in her breast and goes in for a mammogram. The report comes back that the lump is malignant. She needs to decide whether to undergo the painful, exhausting, and expensive cancer treatment, and therefore wants to know the chance that the diagnosis is wrong. Her doctor answers that these scans find nearly 100% of true cancers and misidentify a benign lump as cancer only about 5% of the time. Given the relatively low probability of a false positive (5%), she decides to undergo the treatment.
While the story seems relatively straightforward, it ignores an important piece of data: the overall likelihood that a discovered lump will be cancerous, regardless of whether a mammogram was taken. If we assume, for example, that about 99% of the time a similar patient finds a lump it turns out to be benign, how would that impact her decision?
This is where Bayes’ Rule comes to our rescue:
We’re trying to find P(A|B), which in our case is P(benign|positive result).
P(A) = P(benign) = 99% (the new data we just added), and therefore P(malignant) = 1 - P(benign) = 1%
P(B|A) = P(positive result|benign) = 5%, the false-positive rate the doctor quoted.
The doctor also told us that P(positive result|malignant) = 100%
Which then helps us find P(B) = P(positive result), since we can decompose it into: P(positive result|benign)*P(benign) + P(positive result|malignant)*P(malignant).
Now we can plug everything into Bayes’ rule to find that:
P(benign|positive result) = (0.05*0.99)/(0.05*0.99+1*0.01) = approx 83%
So the likelihood that the positive result is a false alarm is about 16 times higher than what we thought it was (83% vs. 5%). Would you still move forward with the treatment?
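The arithmetic above is easy to check in a few lines (numbers straight from the story: a 99% base rate of benign lumps, a 5% false-positive rate, and ~100% sensitivity):

```python
p_benign = 0.99                      # base rate: a discovered lump is benign
p_malignant = 1 - p_benign           # 0.01
p_pos_given_benign = 0.05            # false-positive rate
p_pos_given_malignant = 1.0          # sensitivity: the scan catches all cancers

# Law of total probability: overall chance of a positive result.
p_pos = (p_pos_given_benign * p_benign
         + p_pos_given_malignant * p_malignant)

# Bayes' rule: probability the lump is benign despite the positive scan.
p_benign_given_pos = p_pos_given_benign * p_benign / p_pos
print(f"P(benign | positive) = {p_benign_given_pos:.1%}")  # ~83%
```

In other words, roughly 83 out of 100 positive results on lumps like hers would be false alarms, despite the scan's seemingly reassuring 5% error rate.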
Clayton’s case is that this “error of omission” extends beyond life-and-death situations like the one described above and into the broader use of statistical significance as the sole method for drawing statistical conclusions from experiments.
In Clayton’s view, this is one of the root causes of the replication crisis that the scientific community is now faced with and is beautifully illustrated by the following example:
In 2012, Professor Norenzayan at UBC randomly assigned 57 college students to two groups, each asked to look at an image of a sculpture and then rate their belief in God on a scale of 1 to 100. The first group was asked to look at Rodin’s “The Thinker” and the second at Myron’s “Discobolus”. Subjects who had been exposed to “The Thinker” reported a significantly lower mean God-belief score of 41.42 vs. the control group’s 61.55, a 33% reduction in belief in God. The probability of observing a difference at least this large by chance alone was about 3 percent. So he and his coauthor concluded that “The Thinker” had prompted their participants to think analytically and that “a novel visual prime that triggers analytic thinking also encouraged disbelief in God.”
According to the study, the results were about 12 times more probable under an assumption of an effect of the observed magnitude than they would have been under an assumption of pure chance.
Despite the highly surprising result (some might even say “ridiculous” or “crazy”), since it was “statistically significant” the paper was accepted for publication in Science. A later attempt to replicate the same procedure with almost ten times as many participants found no significant difference in God-belief between the two groups (62.78 vs. 58.82).
What if instead we took a Bayesian approach: assume a general prior likelihood of 0.1% that viewing Rodin’s “The Thinker” causes disbelief in God (and therefore a corresponding 99.9% prior of “no change in beliefs”), and then figure out what P(disbelief|results) is?
We know from the study that P(results|disbelief) = 12*P(results|no change). Plugging this into Bayes’ Rule we get:
P(disbelief|results) = 12*0.001 / (12*0.001 + 1*0.999) = 0.012 / 1.011 = approximately 1.2%, a far cry from the originally stated 33% reduction…
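The same few lines verify this calculation, working directly with the likelihood ratio (the 12× figure from the study) rather than the individual likelihoods:

```python
prior_effect = 0.001              # prior: the sculpture really changes belief
prior_no_change = 1 - prior_effect

# The study's data were 12x more likely under the effect hypothesis than
# under pure chance, so only the ratio of the likelihoods matters.
likelihood_ratio = 12

# Bayes' rule with the common likelihood factor cancelled out.
posterior_effect = (likelihood_ratio * prior_effect /
                    (likelihood_ratio * prior_effect + 1 * prior_no_change))
print(f"P(disbelief | results) = {posterior_effect:.1%}")  # ~1.2%
```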