[!def] Marginal and Joint Probabilities If a probability is based on a single variable, it is a marginal probability. The probability of outcomes for two or more variables or processes is called a joint probability. Table proportions can be used to summarize joint probabilities.
Probability table for a machine learning algorithm predicting whether a photo is about fashion or not.

A conditional probability is computed under a certain condition. There are two parts to a conditional probability: the outcome of interest and the condition. It is generally written as \(P(A|B)\), where \(A\) is the outcome of interest, \(B\) is the condition, and \(|\) is read as “given.”
[!def] Conditional Probability The conditional probability of outcome \(A\) given condition \(B\) is computed as \(P(A|B)=\frac{P(A \wedge B)}{P(B)}\)
[!def] General Multiplication Rule If A and B represent two outcomes or events, then \(P(A \wedge B)=P(A|B) \cdot P(B)\) It is useful to think of A as the outcome of interest and B as the condition.
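As a rough illustration of how marginal, joint, and conditional probabilities relate, the sketch below works from a small table of counts. The counts are made up for illustration (they are not the values from the photo-classifier table), and the helper names `joint`, `marginal_pred`, and `conditional_truth_given_pred` are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only.
# Keys are (prediction, truth) pairs.
counts = {
    ("pred_fashion", "fashion"): 40,
    ("pred_fashion", "not_fashion"): 10,
    ("pred_not", "fashion"): 20,
    ("pred_not", "not_fashion"): 130,
}
total = sum(counts.values())


def joint(pred, truth):
    """Joint probability P(prediction and truth)."""
    return Fraction(counts[(pred, truth)], total)


def marginal_pred(pred):
    """Marginal probability P(prediction), summing over every truth value."""
    return sum(joint(p, t) for (p, t) in counts if p == pred)


def conditional_truth_given_pred(truth, pred):
    """P(truth | prediction) = P(truth and prediction) / P(prediction)."""
    return joint(pred, truth) / marginal_pred(pred)


p_joint = joint("pred_fashion", "fashion")
p_pred = marginal_pred("pred_fashion")
p_cond = conditional_truth_given_pred("fashion", "pred_fashion")

# General multiplication rule: P(A and B) = P(A|B) * P(B)
assert p_cond * p_pred == p_joint
print(p_joint, p_pred, p_cond)  # prints: 1/5 1/4 4/5
```

Exact fractions are used here so that the multiplication rule can be checked with `==` rather than a floating-point tolerance.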
[!example] Consider the smallpox data set. Suppose we only know that 96.08% of residents were not inoculated, and 85.88% of residents who were not inoculated ended up surviving. How could we compute the probability that a resident was not inoculated and lived? We want to determine \(P(A\wedge \overline{B})\), where \(A\) represents living, and \(B\) represents being inoculated. We are given that \(P(A|\overline{B})=0.8588\) and \(P(\overline{B})=0.9608\). Therefore, \(P(A \wedge \overline{B})=0.8588 \cdot 0.9608 = 0.8251\)
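The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for readability and do not come from the data set.

```python
# Values given in the smallpox example.
p_lived_given_not_inoculated = 0.8588  # P(A | not B)
p_not_inoculated = 0.9608              # P(not B)

# General multiplication rule: P(A and not B) = P(A | not B) * P(not B)
p_lived_and_not_inoculated = p_lived_given_not_inoculated * p_not_inoculated

print(round(p_lived_and_not_inoculated, 4))  # prints: 0.8251
```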
Sum of Conditional Probabilities
[!def] Sum of conditional probabilities Let \(A_1,...,A_k\) represent all the disjoint outcomes for a variable or process. Then if \(B\) is an event, possibly for another variable or process, we have: \(P(A_1|B)+...+P(A_k|B)=1\) The rule for complements also holds when an event and its complement are conditioned on the same information: \(P(A|B)=1-P(A^c|B)\)
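A small sketch of the complement rule under a shared condition, reusing the survival rate from the smallpox example. It assumes that "lived" and "died" are the only two outcomes, so they are disjoint and exhaustive; the variable names are illustrative.

```python
# P(lived | not inoculated), taken from the smallpox example.
p_lived_given_not_inoculated = 0.8588

# Complement rule under the same condition:
# P(died | not inoculated) = 1 - P(lived | not inoculated)
# Assumption: "lived" and "died" are the only possible outcomes.
p_died_given_not_inoculated = 1 - p_lived_given_not_inoculated

# The conditional probabilities of all disjoint outcomes sum to 1.
conditional_probs = [p_lived_given_not_inoculated, p_died_given_not_inoculated]
assert abs(sum(conditional_probs) - 1) < 1e-12

print(round(p_died_given_not_inoculated, 4))  # prints: 0.1412
```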
[!info] Gambler’s fallacy A player is at the roulette table. The previous 5 spins have landed on black. He knows that the chance of getting black six times in a row is very small, so he bets on red. This is a logical fallacy: the spins are independent, so the probability of black on the 6th spin does not depend on the previous ones.
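The fallacy can be checked against the definition of conditional probability. The sketch below assumes a European wheel (18 black pockets out of 37) and independent spins; the function names are hypothetical. It computes \(P(\text{6th black}|\text{5 blacks})\) exactly and then estimates the same quantity by simulation.

```python
import random
from fractions import Fraction

# Assumption: European wheel with 18 black, 18 red, and 1 green pocket.
P_BLACK = Fraction(18, 37)


def prob_black_after_streak(streak_len):
    """Exact P(next spin is black | previous streak_len spins were black)."""
    p_streak = P_BLACK ** streak_len                 # P(B)
    p_streak_then_black = P_BLACK ** (streak_len + 1)  # P(A and B)
    return p_streak_then_black / p_streak            # P(A|B)


def simulate(n_spins, streak_len, seed=0):
    """Estimate the same conditional probability from simulated spins."""
    rng = random.Random(seed)
    run = 0          # consecutive blacks seen before the current spin
    followed = 0     # spins that came right after a qualifying streak
    black_next = 0   # how many of those spins were black
    for _ in range(n_spins):
        is_black = rng.random() < float(P_BLACK)
        if run >= streak_len:
            followed += 1
            black_next += is_black
        run = run + 1 if is_black else 0
    return black_next / followed


exact = prob_black_after_streak(5)
estimate = simulate(200_000, 5)
print(exact, round(float(exact), 4))
print(round(estimate, 3))
```

The exact value is the same as the unconditional chance of black, 18/37, and the simulated estimate should land close to it.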
Tree Diagrams
Tree diagrams are useful for organizing outcomes and probabilities around the structure of the data. They are most useful when two or more processes occur in a sequence and each process is conditioned on its predecessors.

Bayes’ Theorem
Sometimes we are given a conditional probability of the form \(P(\text{statement about variable 1}|\text{statement about variable 2})\), but we really want to know the inverse: \(P(\text{statement about variable 2}|\text{statement about variable 1})\).
Sometimes tree diagrams can help us find the second probability when given the first, but when it is not possible to draw the scenario in a tree diagram, Bayes’ theorem is useful.
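Consider the following conditional probability for variable 1 and variable 2:
\(P(\text{outcome } A_1 \text{ of variable 1}|\text{outcome } B \text{ of variable 2})\)
Bayes’ Theorem states that this conditional probability can be identified as the following fraction:
\(\frac{P(B|A_1)P(A_1)}{P(B|A_1)P(A_1)+P(B|A_2)P(A_2)+...+P(B|A_k)P(A_k)}\)
where \(A_2,...,A_k\) represent all the other possible outcomes of the first variable.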
To apply Bayes’ Theorem correctly, you need to do two preparatory steps:
- Identify the marginal probabilities of each possible outcome of the first variable: \(P(A_1), P(A_2),...,P(A_k)\)
- Identify the probability of the outcome \(B\), conditioned on each possible scenario for the first variable: \(P(B|A_1),P(B|A_2),...,P(B|A_k)\)
[!example] There are academic events on 35% of nights, sporting events on 20% of nights, and no events on 45% of nights. Joe visits campus every Thursday. The table below shows the frequency of events on campus and how often the parking lot fills up for each type of event. If Joe comes to campus and finds the garage full, what is the probability that there is a sporting event?
| Event | Frequency (%) | How often lot fills (%) |
|---|---|---|
| Academic | 35 | 25 |
| Sporting | 20 | 70 |
| None | 45 | 5 |
| Total | 100 | 100 |

1. The probability that there is a sporting event and the garage is full is \(0.20 \cdot 0.70 = 0.14\).
2. The probability that the garage is full is \((0.35\cdot 0.25)+(0.14)+(0.45\cdot 0.05)=0.0875+0.14+0.0225=0.25\).
3. The ratio of these probabilities is the solution: \(\frac{0.14}{0.25}=0.56\). If the garage is full, there is a 56% chance there is a sporting event.
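The same three steps can be sketched in Python. The function `bayes_posterior` and the dictionary names are hypothetical, introduced only for this illustration; the numbers are the ones from the parking example.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(A_i | B) for every outcome A_i.

    priors[a]      is P(A_i), the marginal probability of outcome a.
    likelihoods[a] is P(B | A_i), the probability of B given outcome a.
    """
    # Step 1: joint probabilities P(A_i and B) = P(B | A_i) * P(A_i)
    joint = {a: priors[a] * likelihoods[a] for a in priors}
    # Step 2: marginal probability P(B), summed over all outcomes A_i
    p_b = sum(joint.values())
    # Step 3: the ratio gives P(A_i | B)
    return {a: joint[a] / p_b for a in joint}


# Values from the parking example.
priors = {"academic": 0.35, "sporting": 0.20, "none": 0.45}
fill_rates = {"academic": 0.25, "sporting": 0.70, "none": 0.05}

posterior = bayes_posterior(priors, fill_rates)
for event, p in posterior.items():
    print(f"P({event} | garage full) = {p:.2f}")
# prints:
# P(academic | garage full) = 0.35
# P(sporting | garage full) = 0.56
# P(none | garage full) = 0.09
```

Because the function returns the posterior for every outcome at once, it also shows that the three conditional probabilities given a full garage sum to 1.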