A Primer on Probability

Note

This post is a work-in-progress, but I wanted to publish this early so I could link to it in my other posts.

The Basics

Probability is a number between 0 and 1 that measures how likely an event is to happen. If I have a fair coin, then it’s equally likely to land heads or tails. Each has a 50% probability, that is, P(H) = P(T) = 1/2. What are the chances that I flip a heads and then a tails? Well, since this is a fair coin and every flip is independent, meaning it doesn’t affect any other flip, the probability of getting exactly heads then tails is P(H) × P(T) = 1/2 × 1/2 = 1/4, or 25%.
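Here is a minimal sketch of that multiplication in Python. The variable names are my own, and I'm using `Fraction` from the standard library only so the answer prints as an exact fraction instead of a rounded decimal.

```python
from fractions import Fraction

# Assumed names for illustration only
p_heads = Fraction(1, 2)  # fair coin
p_tails = 1 - p_heads

# Independent events multiply: P(H then T) = P(H) * P(T)
p_heads_then_tails = p_heads * p_tails

print(p_heads_then_tails)         # 1/4
print(float(p_heads_then_tails))  # 0.25
```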

What is the probability of flipping two coins and getting exactly 1 heads? You add together the probabilities of all the sequences that give you exactly 1 heads. In this case, two coin flips can result in HH, HT, TH, or TT. Each of those sequences has a 25% chance of happening, because the flips are independent and P(H) = P(T) = 1/2. Two of the four sequences (HT and TH) have exactly 1 heads, so we have a 25% + 25% = 50% chance of getting exactly 1 heads from two coin flips.
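One way to check this is to list every sequence and count the ones we care about. This is only a sketch, and it leans on the assumption that the coin is fair, so every sequence is equally likely.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of two flips
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# For a fair coin, each sequence is equally likely
p_each = Fraction(1, len(outcomes))

# Keep only the sequences with exactly one heads
exactly_one_heads = [seq for seq in outcomes if seq.count("H") == 1]
p_exactly_one = p_each * len(exactly_one_heads)

print(outcomes)           # ['HH', 'HT', 'TH', 'TT']
print(exactly_one_heads)  # ['HT', 'TH']
print(p_exactly_one)      # 1/2
```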

What about flipping two coins and getting at least 1 heads? This one is a little tricky to think about. If I pick a number of events that can’t happen at the same time (mutually exclusive events), then the probability of any one of them happening is the sum of the individual probabilities. The question above was “what is the probability of getting exactly 1 heads when flipping two coins?” You could restate this as “what is the probability of getting HT or TH when flipping two coins?”, which is why we add their two probabilities together.

Now, we can take advantage of the fact that the probabilities of all the possible outcomes have to add up to 1 (i.e. something has to happen). The only two events that matter for our question are “at least 1 heads” and “0 heads”; there are no other options. It’s really easy to calculate the chance of flipping no heads: that’s just P(T) raised to the number of flips. In the case of two flips, P(TT) = 1/2 × 1/2 = 1/4. Since there’s only one other kind of event, we just subtract 1/4 from 1 to get 3/4, or a 75% chance of flipping at least 1 heads when flipping two coins.
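The "subtract from 1" trick works for any number of flips, so here is a rough sketch of it as a function. The name `p_at_least_one_heads` and its parameters are hypothetical, and it assumes every flip is independent with the same chance of heads.

```python
from fractions import Fraction

def p_at_least_one_heads(n_flips, p_heads=Fraction(1, 2)):
    """Probability of at least one heads in n_flips independent flips."""
    p_tails = 1 - p_heads
    p_no_heads = p_tails ** n_flips  # every single flip lands tails
    return 1 - p_no_heads            # everything else has at least one heads

print(p_at_least_one_heads(2))   # 3/4
print(p_at_least_one_heads(10))  # 1023/1024
```

Counting the sequences directly would mean listing 1,024 outcomes for ten flips, which is why the complement is the easier route here.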

If that confused you, let’s try thinking about it using a die. Let’s say now we’re going to roll a fair, six-sided die, so the chance we roll a six is 1/6. What are the chances we don’t roll a six? It’s 5/6. You can either add up the other options (the chance you roll a one, a two, a three, a four, or a five, each 1/6) or you can subtract the chance you roll a six from 1: 1 − 1/6 = 5/6.
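Both routes to 5/6 can be written out in a few lines. This is a small sketch with made-up variable names, assuming a fair die where every face has the same probability.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
p_face = Fraction(1, len(faces))  # fair die: each face has probability 1/6

# Way 1: add up the five faces that are not a six
p_not_six_by_counting = sum(p_face for face in faces if face != 6)

# Way 2: subtract the chance of a six from 1
p_not_six_by_complement = 1 - p_face

print(p_not_six_by_counting)    # 5/6
print(p_not_six_by_complement)  # 5/6
```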

Bayes’ Theorem

P(x|y) = \frac{P(y|x)P(x)}{P(y)}

This basically says that the probability of x given y is equal to the probability of y given x times the probability of x, divided by the probability of y. What does that mean? Let’s use an example. Since I’m writing this during the COVID-19 pandemic, let’s use that and say x is being infected with the coronavirus and y is testing positive for COVID-19. The question is now “what is the probability I have COVID-19 if I test positive for it?” We’ll say the test has a 90% true positive rate (meaning if you have the coronavirus, 90% of the time you will test positive) and a 5% false positive rate (meaning if you don’t have the virus, 5% of the time it will still come out positive). Finally, let’s say we know that only 3% of people actually have the coronavirus.

\begin{aligned}
P(x=1)&= \text{probability of having the coronavirus} &&= 0.03\\
P(x=0) &= \text{probability of not having the coronavirus}&&= 0.97\\
P(y|x=1) &= \text{probability of testing positive if you have the coronavirus}&&=0.90\\
P(y|x=0) &= \text{probability of testing positive if you don't have the virus} &&= 0.05\\
\end{aligned}

So what’s the probability of getting a positive test overall? Well, there are two ways to test positive: you have the virus and the test catches it (a true positive), or you don’t have the virus and the test comes back positive anyway (a false positive). You weight each rate by how common that group is, then add the two together.

\begin{aligned}
P(y) &= P(y|x=1) \times P(x=1) + P(y|x=0) \times P(x=0)\\
&= 0.9 \times 0.03 + 0.05 \times 0.97\\
&= 0.0755
\end{aligned}
\\
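The same sum is easy to check in Python. This is just a sketch with variable names I picked for this example, using the 3%, 90%, and 5% figures from above.

```python
p_infected = 0.03                 # P(x=1)
p_not_infected = 1 - p_infected   # P(x=0)
p_pos_given_infected = 0.90       # P(y|x=1), true positive rate
p_pos_given_not_infected = 0.05   # P(y|x=0), false positive rate

# Weight each rate by how common its group is, then add
p_positive = (p_pos_given_infected * p_infected
              + p_pos_given_not_infected * p_not_infected)

print(round(p_positive, 4))  # 0.0755
```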

Now we can solve our original question. If you test positive, what’s the chance you actually have the coronavirus?

x = 1, \text{you have COVID-19}\\
y = 1, \text{you tested positive for COVID-19}  
\\  
\begin{aligned}
P(x|y) &= \frac{P(y|x)P(x)}{P(y)} \\
 &= \frac{0.9 \times 0.03}{0.0755}\\
 &\approx 0.3576 \text{ or 35.76\%}
\end{aligned}
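To play with the numbers yourself, the whole calculation fits in one small function. Treat this as a sketch: `posterior` and its parameter names are hypothetical, and it only covers the simple case of a yes/no condition and a positive test result.

```python
def posterior(prior, p_pos_given_true, p_pos_given_false):
    """Bayes' theorem: P(condition | positive test)."""
    # Overall chance of a positive test (true positives + false positives)
    p_positive = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / p_positive

p = posterior(prior=0.03, p_pos_given_true=0.90, p_pos_given_false=0.05)
print(f"{p:.4f}")  # 0.3576
```

Even with a test that catches 90% of infections, a positive result here means only about a 36% chance of infection, because so few people have the virus that false positives outnumber true positives.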
