Questions such as
- How long should we expect to flip a coin before it turns up heads?
- How many times should we expect to roll a die until we get a 1?
can be answered using the geometric distribution.
Bernoulli Distribution¶
Insurance plans often have a deductible, where the insured individual is responsible for costs up to the deductible, and then costs above the deductible are shared between the individual and insurance company.
Suppose an insurance company found that 70% of people they insure stay below their deductible in any given year. These people can be thought of as a trial. If a person does not exceed the deductible, we label them a success. If they do exceed the deductible, we label them a failure. Since 70% of individuals stay below their deductible, we denote the probability of a success as \(p=0.7\).
When an individual trial only has two possible outcomes (often labeled success and failure) it is called a Bernoulli random variable. Usually a 1 is used for success and a 0 for failure, which makes things convenient:
- Suppose we have ten trials: 1, 1, 1, 0, 1, 0, 0, 1, 1, 0
- The sample proportion is the sample mean of these observations: \(p = \frac{successes}{trials}=(1+1+1+1+1+1) \div 10 = 0.6\)
💡 Bernoulli Random Variable: If \(X\) is a random variable that takes value 1 with a probability of success \(p\) and value 0 with probability \(1-p\), then \(X\) is a Bernoulli random variable with mean and standard deviation of \(\mu = p\) \(\sigma=\sqrt{p(1-p)}\)
Geometric Distribution¶
Geometric distribution is used to describe how many trials it takes to observe a success.
What are the chances that the first person will not exceed their deductible (i.e. be a success)? The second? Third?
The probability of stopping after the first person is the chance that this person will not hit their deductible - 0.7.
There is a 30% chance that the first person will be a failure. So to calculate the second person, we do \((0.3)(0.7)=0.21\)
The third case would be \((0.3)(0.3)(0.7)\)
If the first success is on the \(n^{th}\) person, then there are \(n-1\) failures and 1 success, which corresponds to the probability \((0.3)^{n-1}(0.7)\), which is the same as \((1-0.7)^{n-1}(0.7)\)
💡 Geometric Distribution If the probability of a success in one trial is \(p\) and the probability of a failure is \(1-p\), then the probability of finding the first success in the \(n^{th}\) trial is given by \(\large (1-p)^{n-1}p\)
The mean (i.e. expected value), variance, and standard deviation of this wait time are given by \(\large \mu=1/p\) \(\large \sigma^2=(1-p)/p^2\) \(\sigma=\sqrt{\frac{1-p}{p^2}}\)