The binomial probability formula
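For a random variable X that counts the successes in n trials, the probability of exactly k successes is
$$P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}, \qquad k = 0, 1, \dots, n,$$
where the binomial coefficient counts the number of ways to choose k successes from n trials:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
Here p is the probability of success on each trial and 1 - p is the probability of failure. The term $p^{k}(1-p)^{n-k}$ is the probability of one particular arrangement of k successes and n - k failures; the coefficient counts how many such arrangements there are.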
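As a minimal sketch of the formula, the Python function below multiplies the number of arrangements by the probability of a single arrangement. The name `binomial_pmf` and the coin-flip values are illustrative choices, not part of the original text.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        return 0.0  # impossible counts have probability zero
    # comb(n, k) counts the arrangements of k successes among n trials;
    # p**k * (1 - p)**(n - k) is the probability of any one arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values: exactly 3 heads in 10 flips of a fair coin.
prob = binomial_pmf(3, 10, 0.5)
print(f"P(X = 3) = {prob:.4f}")  # prints "P(X = 3) = 0.1172"
```

Here the coefficient is 120 and each arrangement has probability 1/1024, so the result is 120/1024.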
When can I use the binomial distribution?
- Fixed number of trials (n) - you decide n in advance.
- Two outcomes - each trial is a success or a failure.
- Independent trials - one trial does not affect any other.
- Constant probability (p) - p is the same on every trial.
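One way to see what the four conditions mean in practice is to simulate a process that satisfies all of them. The sketch below is only an illustration: `simulate_binomial` is a hypothetical helper, and n = 20 and p = 0.3 are arbitrary values.

```python
import random


def simulate_binomial(n: int, p: float, rng: random.Random) -> int:
    """Count the successes in n independent trials with success probability p."""
    # Fixed n: the loop length is set before any trial runs.
    # Two outcomes: each draw either falls below p (success) or not (failure).
    # Independence and constant p: every draw is a fresh uniform number
    # compared against the same p.
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)  # seeded only so repeated runs give the same draws
draws = [simulate_binomial(20, 0.3, rng) for _ in range(5)]
print(draws)
```

By contrast, drawing cards from a deck without replacement changes the success probability after each draw, so the count of successes there is not binomial.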
Mean and variance
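For X following a binomial distribution with parameters n and p:
$$E[X] = np, \qquad \operatorname{Var}(X) = np(1-p).$$
These come from X being the sum of n independent Bernoulli trials, each with mean p and variance p(1-p): means always add, and variances add because the trials are independent.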
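As a quick numerical check, the sketch below computes the mean and variance directly from the probabilities and compares them with np and np(1-p). The values n = 12 and p = 0.25 are arbitrary, and `binomial_pmf` is an illustrative helper.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 12, 0.25  # illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance from their definitions as probability-weighted sums.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean     = {mean:.6f}, np      = {n * p:.6f}")
print(f"variance = {variance:.6f}, np(1-p) = {n * p * (1 - p):.6f}")
```

The two columns should agree up to floating-point rounding.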
Normal approximation
When n is large and p is not too close to 0 or 1, the binomial is well approximated by a normal distribution with the same mean and variance:
$$X \approx N\big(np,\; np(1-p)\big).$$
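A common rule of thumb is that this works well when np > 5 and n(1-p) > 5. Because the binomial is discrete and the normal curve is continuous, use a continuity correction when moving between them: for example, approximate P(X ≤ k) by the normal probability of a value at most k + 0.5, and P(X ≥ k) by the normal probability of a value at least k - 0.5.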
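The sketch below compares an exact binomial probability with its normal approximation, with and without a half-unit continuity correction. The values n = 40, p = 0.4, and k = 18 are arbitrary choices that satisfy the rule of thumb, and the helper names are illustrative; the normal CDF is built from the standard library's `math.erf`.

```python
from math import comb, erf, sqrt


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(Y <= x) for Y ~ Normal(mu, sigma**2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


n, p, k = 40, 0.4, 18  # illustrative: np = 16 and n(1-p) = 24
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = binomial_cdf(k, n, p)
uncorrected = normal_cdf(k, mu, sigma)       # normal curve cut at k
corrected = normal_cdf(k + 0.5, mu, sigma)   # cut at k + 0.5 instead

print(f"exact       = {exact:.4f}")
print(f"uncorrected = {uncorrected:.4f}")
print(f"corrected   = {corrected:.4f}")
```

For these values the corrected figure lands noticeably closer to the exact probability than the uncorrected one.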