It’s flu season and for the past two days you’ve had a headache and sore throat. You learn that 90% of people who actually have the flu also have those symptoms, which makes you worry. Does that mean the chance that you have the flu is 90%? In other words, if there’s a 90% chance of having a headache and sore throat given that you have the flu, does that mean there’s a 90% chance of having the flu given that you have a headache and sore throat?
We can use symbols to express this question as follows: Pr(Flu | Symptoms) = Pr(Symptoms | Flu) = 90%?
The answer is no. Why?
If you think about it you’ll realize that there are other things besides the flu that can give you a combination of a headache and sore throat, such as a cold or an allergy, so that having those symptoms is certainly not the same thing as having the flu. Similarly, while fire produces smoke, the old saying that “where there’s smoke there’s fire” is wrong because it’s quite possible to produce smoke without fire.
Fortunately, there’s a nice way to account for this.
How Bayes’ Theorem Works
Suppose you learn, in addition to Pr(Symptoms | Flu) = 90%, that the probability of a randomly chosen person having a headache and sore throat this season, regardless of the cause, is 10% – i.e. Pr(Symptoms) = 10% – and that only one person in 100 will get the flu this season – i.e. Pr(Flu) = 1%. How does this information help?
Again, what we want to know is the chance of having the flu given these symptoms, Pr(Flu | Symptoms). To find that, we first need the probability of having those symptoms if we have the flu (90%) times the probability of having the flu (1%). In other words, there’s a 90% chance of having those symptoms if in fact we do have the flu, and the chance of having the flu is only 1%. That means Pr(Symptoms | Flu) x Pr(Flu) = 0.90 x 0.01 = 0.009 or 0.9%, a bit less than one chance in 100.
Finally, we need to divide that result by the probability of having a headache and sore throat regardless of the cause, Pr(Symptoms), which is 10% or 0.10, because we need to know what share of all the headache-and-sore-throat cases that occur are caused by the flu.
So, putting it all together, the answer to the question, “What is the probability that your Symptoms are caused by the Flu?” is as follows:
Pr(Flu | Symptoms) = [Pr(Symptoms | Flu) x Pr(Flu)] ÷ Pr(Symptoms) = 0.90 x 0.01 ÷ 0.10 = 0.09 or 9%.
So if you have a headache and sore throat there’s only a 9% chance, not 90%, that you have the flu, which I’m sure will come as a relief!
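The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `bayes` and the town of 10,000 people used for the head count are illustrative choices, not part of the original example, while the three input probabilities are the ones given in the text.

```python
def bayes(likelihood, prior, evidence):
    """Return Pr(A | B) = Pr(B | A) x Pr(A) / Pr(B)."""
    return likelihood * prior / evidence


p_symptoms_given_flu = 0.90  # Pr(Symptoms | Flu)
p_flu = 0.01                 # Pr(Flu)
p_symptoms = 0.10            # Pr(Symptoms)

p_flu_given_symptoms = bayes(p_symptoms_given_flu, p_flu, p_symptoms)
print(f"Pr(Flu | Symptoms) = {p_flu_given_symptoms:.0%}")

# The same answer by counting heads in a hypothetical town of 10,000 people.
population = 10_000
with_flu = round(population * p_flu)                            # people with the flu
with_flu_and_symptoms = round(with_flu * p_symptoms_given_flu)  # flu and symptoms
with_symptoms = round(population * p_symptoms)                  # symptoms, any cause
share = with_flu_and_symptoms / with_symptoms
print(f"{with_flu_and_symptoms} of {with_symptoms} people with symptoms "
      f"have the flu ({share:.0%})")
```

The head count makes the logic visible: the people who have both the flu and the symptoms are a small slice of everyone who has the symptoms.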
This particular approach to calculating “conditional probabilities” is called Bayes’ Theorem, after Thomas Bayes, the 18th-century Presbyterian minister who came up with it. The example above is one that I got out of this wonderful little book.
Muslims and Terrorism
Now, according to some sources (here and here), 10% of Terrorists are Muslim. Does this mean that there’s a 10% chance that a Muslim person you meet at random is a terrorist? Again, the answer is emphatically no.
To see why, let’s apply Bayes’ theorem to the question, “What is the probability that a Muslim person is a Terrorist?” Or, stated more formally, “What is the probability that a person is a Terrorist, given that she is a Muslim?” or Pr(Terrorist | Muslim)?
Let’s calculate this the same way we did for the flu using some sources that I Googled and that appeared to be reliable. I haven’t done a thorough search, however, so I won’t claim my result here to be anything but a ballpark figure.
So I want to find Pr(Terrorist | Muslim), which according to Bayes’ Theorem is equal to…
1) Pr(Muslim | Terrorist): The probability that a person is a Muslim given that she’s a Terrorist is about 10% according to the sources I cited above, which report that around 90% of Terrorists are Non-Muslims.
2) Pr(Terrorist): The probability that someone in the United States is a Terrorist of any kind. I calculated this first by taking the total number of known terrorist incidents in the U.S. back through 2000, which I tallied as 121 from this source and as 49 from this source. At the risk of over-stating the incidence of terrorism, I took the higher figure and rounded it to 120. Next, I multiplied this by 10 under the assumption that on average 10 persons lent material support for each terrorist act (which may be high), and then multiplied that result by 5 under the assumption that only one in five planned attacks is actually carried out (which may be low). (I just made up these multipliers because the data are hard to find; these numbers seem to be at the higher and lower ends of what is likely the case, and I’m trying to make the connection as strong as I can. I’m certainly willing to entertain evidence showing different numbers.) This gives 120 x 10 x 5 = 6,000 Terrorists in America between 2000 and 2016. That figure assumes that no person participated in more than one terrorist attempt (not likely) and that all these persons were active terrorists in the U.S. during those 17 years (not likely), all of which means 6,000 is probably an over-estimate of the number of Terrorists.
If we then divide 6,000 by 300 million people in the U.S. during this period (again, I’ll over-state the probability by not counting tourists and visitors), that gives us Pr(Terrorist) = 0.00002 or 0.002%, or 2 chances in 100,000.
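The estimate of Pr(Terrorist) is a chain of multiplications followed by one division, which can be sketched as follows. The variable names are illustrative, and the two multipliers are the made-up assumptions described above rather than measured data.

```python
incidents = 120               # known incidents since 2000 (121, rounded)
supporters_per_incident = 10  # assumed multiplier from the text (may be high)
planned_per_executed = 5      # assumed: one in five planned attacks is carried out
us_population = 300_000_000   # rough U.S. population, excluding visitors

estimated_terrorists = incidents * supporters_per_incident * planned_per_executed
p_terrorist = estimated_terrorists / us_population

print(f"Estimated Terrorists: {estimated_terrorists:,}")
print(f"Pr(Terrorist) = {p_terrorist:.5f}")
print(f"About {p_terrorist * 100_000:.0f} in 100,000")
```

Changing either multiplier shows how sensitive the final figure is to those guesses: halving one of them halves Pr(Terrorist).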
Now, multiply 1) by 2) and divide the result by…
3) Pr(Muslim): The probability that someone in the U.S. is a Muslim, which is about 1%.
Putting it all together gives the following:
Pr(Terrorist | Muslim) = [Pr(Muslim | Terrorist) x Pr(Terrorist)] ÷ Pr(Muslim) = 10% x 0.002% ÷ 1% = 0.0002 or 0.02%.
One interpretation of this result is that the probability that a Muslim person whom you encounter at random in the U.S. is a Terrorist is about 1/50th of one percent. In other words, around one in 5,000 Muslim persons you meet at random is a Terrorist. And keep in mind that the values I chose for this calculation deliberately over-state that probability, probably by a lot, so the probability that a Muslim person is a Terrorist is likely much lower than 0.02%.
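Here is the same calculation as a short Python sketch, using the ballpark inputs from the three steps above. The helper name `bayes` is illustrative, and the inputs are this essay’s rough estimates, not authoritative statistics.

```python
def bayes(likelihood, prior, evidence):
    """Return Pr(A | B) = Pr(B | A) x Pr(A) / Pr(B)."""
    return likelihood * prior / evidence


p_muslim_given_terrorist = 0.10  # step 1: Pr(Muslim | Terrorist)
p_terrorist = 0.00002            # step 2: Pr(Terrorist), deliberately over-stated
p_muslim = 0.01                  # step 3: Pr(Muslim)

p_terrorist_given_muslim = bayes(p_muslim_given_terrorist, p_terrorist, p_muslim)
one_in = round(1 / p_terrorist_given_muslim)
gap = p_muslim_given_terrorist / p_terrorist_given_muslim

print(f"Pr(Terrorist | Muslim) = {p_terrorist_given_muslim:.2%}")
print(f"About 1 in {one_in:,}")
print(f"Pr(Muslim | Terrorist) is about {gap:.0f} times larger")
```

The last line compares the two conditional probabilities directly, which is the whole point of the exercise: swapping the order of the condition changes the answer by orders of magnitude.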
Moreover, the probability that a Muslim person is a Terrorist (0.02%) is 500 times lower than the probability that a Terrorist is a Muslim (10%).
(William Easterly of New York University applies Bayes’ theorem to the same question, using estimates that don’t over-state as much as mine do, and calculates the difference not at 500 times but 13,000 times lower!)
As low as the probability of a Muslim person being a Terrorist is, the same data do indicate that a Non-Muslim person is much less likely to be a Terrorist. By substituting values where appropriate – Pr(Non-Muslim | Terrorist) = 90% and Pr(Non-Muslim) = 99% – Bayes’ theorem gives us the following:
Pr(Terrorist | Non-Muslim) = [Pr(Non-Muslim | Terrorist) x Pr(Terrorist)] ÷ Pr(Non-Muslim) = 90% x 0.002% ÷ 99% ≈ 0.00002 or 0.002%.
So one interpretation of this is that a randomly chosen Non-Muslim person is around one-tenth as likely to be a Terrorist as a Muslim person (i.e. 0.002%/0.02%). Naturally, the probabilities will be higher or lower if you’re at a terrorist convention or at an anti-terrorist peace rally. And if you have additional data that further differentiate among various groups – such as Wahhabi Sunni Muslims versus Salafist Muslims or Tamil Buddhists versus Tibetan Buddhists – the results will again be more accurate.
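The comparison between the two groups can be recomputed from the same inputs. The sketch below uses illustrative variable names and the essay’s ballpark estimates; it also adds a consistency check that is not in the original text, namely that the two conditional probabilities, weighted by each group’s share of the population, add back up to Pr(Terrorist).

```python
def bayes(likelihood, prior, evidence):
    """Return Pr(A | B) = Pr(B | A) x Pr(A) / Pr(B)."""
    return likelihood * prior / evidence


p_terrorist = 0.00002  # Pr(Terrorist), the over-stated estimate
p_muslim = 0.01        # Pr(Muslim)
p_non_muslim = 0.99    # Pr(Non-Muslim)

p_t_given_muslim = bayes(0.10, p_terrorist, p_muslim)          # Pr(Terrorist | Muslim)
p_t_given_non_muslim = bayes(0.90, p_terrorist, p_non_muslim)  # Pr(Terrorist | Non-Muslim)
ratio = p_t_given_muslim / p_t_given_non_muslim

# Law of total probability: the weighted pieces should rebuild Pr(Terrorist).
total = p_muslim * p_t_given_muslim + p_non_muslim * p_t_given_non_muslim

print(f"Pr(Terrorist | Muslim)     = {p_t_given_muslim:.4%}")
print(f"Pr(Terrorist | Non-Muslim) = {p_t_given_non_muslim:.4%}")
print(f"Ratio = {ratio:.1f}")
print(f"Rebuilt Pr(Terrorist) = {total:.5f}")
```

With these inputs the ratio comes out near 11, which is what “around one-tenth as likely” summarizes.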
But whether you’re trying to educate yourself about the flu or terrorism, common sense suggests using relevant information as best you can. Bayes’ theorem is a good way to do that.
(I wish to thank Roger Koppl for helping me with an earlier version of this essay. Any remaining errors, however, are mine alone.)
Author: Sandy Ikeda (FEE)