Games of Chance¶

Table of Contents¶

  1. Introduction
  2. Problem 1 - Simple Expectations
  3. Problem 2 - Expectations on Consecutive Events
  4. Problem 3 - Conditional Expectations
  5. Problem 4 - The Birthday Paradox
  6. Problem 5 - The Monty Hall Problem
  7. Problem 6 - Bayesian Drug Testing
  8. Closing Thoughts

Introduction¶

I have recently become interested in how difficult probabilistic outcomes can be to evaluate accurately without the aid of probability theory or computation. This notebook explores several puzzles about games of chance, focusing on their solutions and their epistemological implications. For each problem I use a Monte Carlo simulation to answer it computationally before working through the analytical solution where possible. Together these give two routes to solutions which at first glance may seem very counterintuitive.

Problem 1 - Simple Expectations¶

What is the expected number of rolls required to roll two sixes in a single roll of two dice?

In [5]:
import random
import seaborn as sns
import matplotlib.pyplot as plt

def roll_dice(num_dice):
    # Return result of a given number of standard die rolls
    results = [random.randint(1, 6) for _ in range(num_dice)]
    return results

num_dice = 2
runningCount = list()

for j in range(0,10000):
    i = 0
    results = [0,0]
    while results != [6,6]:
        results = roll_dice(num_dice)
        i = i + 1
    runningCount.append(i)
    
print("Expected number of rolls:", round(sum(runningCount) / len(runningCount),0))
Expected number of rolls: 36.0

Problem 2 - Expectations on Consecutive Events¶

What is the expected number of rolls required to roll two sixes on consecutive rolls of a single die?

In [9]:
num_dice = 1
runningCount = list()

for j in range(0,10000):
    results = [0,0]
    i = 0
    while results != [6,6]:
        results[0] = results[1]
        results[1] = roll_dice(num_dice)[0]
        i = i + 1
    runningCount.append(i)
    
print("Expected number of rolls:", round(sum(runningCount) / len(runningCount),0))
Expected number of rolls: 42.0

Discussion of Problems 1-2¶

A reasonable presumption would be that these two scenarios have the same expected outcome. The expected number of successes $E(X)$ in $n$ independent trials, each succeeding with probability $p$, is:

$$E(X) = n \cdot p$$

The probability of rolling two sixes may be computed as the intersection of independent events:

$$ \frac{1}{6} \times \frac{1}{6} = \frac{1}{36} $$

Setting $E(X) = 1$ (a single success) and rearranging the previous formula for $n$ leads us to an expected number of rolls equating to 36.

This is, as foreshadowed by the previous simulations, correct for the first problem and incorrect for the second. To solve the latter analytically, one may use recursion:

$$E[6,6] = 6 + \frac{1}{6} \cdot 1 + \frac{5}{6} (E[6,6] + 1) = 42$$

In other words, it takes six rolls on average to land a first six. On the subsequent roll I have a one-in-six chance of landing the second six; otherwise I have to restart, having 'spent' a roll. Solving the above formula for $E[6,6]$ leads to an expected number of rolls equating to 42. The two problems also differ in what counts as a success: imagine you roll [2,6], followed by [6,1]. In the first problem, this would not be counted as successful, since the stopping condition is based on the outcome of two dice in a single roll. In the second problem, however, the same throws would be recorded as the sequence [2,6,6,1], which contains two consecutive sixes and therefore meets the stopping condition. Note also that a roll in the first problem throws two dice, so 36 rolls correspond to 72 individual throws, against 42 in the second.
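As a quick check on the algebra, the two expectations can be computed exactly rather than simulated. This is a minimal sketch using Python's `fractions` module; the variable names are my own and simply mirror the terms of the recursion above.

```python
from fractions import Fraction

p_six = Fraction(1, 6)

# Problem 1: every roll of two dice succeeds independently with p = 1/36,
# so the waiting time is geometric with mean 1/p.
p_double_six = p_six * p_six
expected_problem_1 = 1 / p_double_six

# Problem 2: E = 6 + (1/6)*1 + (5/6)*(E + 1).
# Collecting the E terms gives E * (1 - 5/6) = 6 + 1/6 + 5/6.
rolls_to_first_six = 1 / p_six
constant_term = rolls_to_first_six + p_six * 1 + (1 - p_six) * 1
expected_problem_2 = constant_term / (1 - (1 - p_six))

print("Problem 1, expected rolls:", expected_problem_1)
print("Problem 2, expected rolls:", expected_problem_2)
```

Working in exact fractions avoids any rounding, so the results can be compared directly with the simulated averages.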

Problem 3 - Conditional Expectations¶

What is the expected number of rolls required to roll a six conditioned on the event that all rolls were even numbers?

In [14]:
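def simulate_expected_rolls():
    # Count rolls until the run ends: either a six appears, or an odd
    # number appears (which breaks the "all rolls even" condition)
    total_rolls = 0
    while True:
        total_rolls += 1
        roll = random.randint(1, 6)
        if roll == 6 or roll % 2 != 0:
            break
    return total_rolls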

num_simulations = 10000
total_rolls_sum = 0

for _ in range(num_simulations):
    total_rolls_sum += simulate_expected_rolls()

expected_value = total_rolls_sum / num_simulations

print("Expected number of rolls:", round(expected_value,2))
Expected number of rolls: 1.49

Discussion of Problem 3¶

The intuitive answer to this question would be three: one assumes that the die can only land on even numbers, which would suggest three equally likely outcomes with a probability of a third each. This is incorrect because the question does not say that only even outcomes can occur, but rather that only even numbers happen to appear before the game terminates with a six.

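Extending the recursive approach presented earlier, let $A$ denote the set of rolls which terminate the game (an odd number also ends a run, since that run no longer satisfies the condition), and let $E[A]$ be the expected number of rolls until one of them appears:

$$ A = \{1, 3, 5, 6\} $$

$$E[A] = \frac{4}{6} \cdot 1 + \frac{2}{6} (E[A] + 1) = \frac{3}{2}$$

Because the number of rolls before termination does not depend on which terminating value appears, the same expectation holds for the runs that end on a six.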
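The simulation above stops at the first six or odd number rather than conditioning explicitly. As a cross-check, here is a minimal sketch that conditions directly by rejection sampling: it plays each game until a six appears and keeps only the games in which every roll was even. The function names and the `num_accepted` parameter are my own and are not part of the original notebook.

```python
import random

def roll_until_six():
    """Roll one die until a six appears.

    Returns the number of rolls and whether every roll was even.
    """
    num_rolls = 0
    all_even = True
    while True:
        num_rolls += 1
        roll = random.randint(1, 6)
        if roll % 2 != 0:
            all_even = False
        if roll == 6:
            return num_rolls, all_even

def conditional_expected_rolls(num_accepted=10000):
    """Average game length over games in which all rolls were even."""
    accepted_lengths = []
    while len(accepted_lengths) < num_accepted:
        num_rolls, all_even = roll_until_six()
        if all_even:  # discard any game containing an odd roll
            accepted_lengths.append(num_rolls)
    return sum(accepted_lengths) / len(accepted_lengths)

print("Conditional expected number of rolls:",
      round(conditional_expected_rolls(), 2))
```

Roughly three in four games are discarded, so this version is slower than the original, but it should land close to the same value of 3/2.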

Problem 4 - The Birthday Paradox¶

What is the probability that at least two of 23 individuals at a dinner party share the same birthday?

In [7]:
def assign_birthday(num_people):
    results = [random.randint(1, 365) for _ in range(num_people)]
    return results

runningCount = list()
for i in range(0,10000):
    results = assign_birthday(23)
    if len(results) != len(set(results)):
        runningCount.append(1)
    else:
        runningCount.append(0)

print('Probability of shared birthday:', round(sum(runningCount)/len(runningCount),2))
Probability of shared birthday: 0.51

Discussion of Problem 4¶

This counter-intuitive result comes from the high number of possible pairs of individuals who could share a birthday:

$$ C(n, k) = \frac{n!}{k! \cdot (n - k)!} = \frac{23!}{2! \cdot (23 - 2)!} = 253 $$

The probability of at least one pair sharing a birthday may be approximated by subtracting from one the probability that none of the 253 pairs shares a birthday, treating the pairs as independent:

$$ P(\text{Shared Birthday}) \approx 1 - \left(\frac{364}{365}\right)^{253}\approx 50\% $$
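The formula above treats the 253 pairs as if they were independent, which is only approximately true. A minimal sketch of the exact calculation multiplies the probabilities that each successive guest avoids all earlier birthdays. The function names are my own, and like the simulation this assumes 365 equally likely birthdays.

```python
def p_shared_birthday_exact(num_people, days=365):
    """Exact probability that at least two people share a birthday."""
    p_all_distinct = 1.0
    for k in range(num_people):
        # The (k+1)-th person must avoid the k birthdays already taken
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct

def p_shared_birthday_pairwise(num_people, days=365):
    """Approximation that treats every pair as independent."""
    num_pairs = num_people * (num_people - 1) // 2
    return 1 - ((days - 1) / days) ** num_pairs

print("Exact:   ", round(p_shared_birthday_exact(23), 4))
print("Pairwise:", round(p_shared_birthday_pairwise(23), 4))
```

For 23 people the two values differ by less than one percentage point, with the exact figure sitting closer to the simulated result.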

Because we do not condition on a specific pair (i.e. Bob and Hannah sharing a birthday), the probability of some pair sharing an otherwise rare commonality becomes increasingly likely as the group grows. To demonstrate this, we can adjust the above code to iterate through varying numbers of attendees at our dinner party. The probability converges to one as attendance increases, to the point that around sixty attendees are enough for a shared birthday to become a near certainty.

In [11]:
p_shared_birthday = list()
for i in range(1,100,5):
    runningCount = list()
    for j in range(0,1000):
        results = assign_birthday(i)
        if len(results) != len(set(results)):
            runningCount.append(1)
        else:
            runningCount.append(0)
    p_shared_birthday.append(sum(runningCount)/len(runningCount))
    
plt.plot(range(1,100,5), p_shared_birthday)
plt.xlabel('Number of attendees')
plt.ylabel('Probability of shared birthday')
plt.show()

Problem 5 - The Monty Hall Problem¶

There are three doors on a game show: behind one is a prize, and behind the other two are goats. You choose one door, and then the host, who knows what's behind each door, opens one of the other two doors to reveal a goat. Do you stick with your initial choice or switch to the other unopened door?

In [9]:
def monty_hall_simulation(num_simulations):
    stay_wins = 0
    switch_wins = 0

    for _ in range(num_simulations):
        # Randomly place the car behind one of the three doors
        doors = ["goat"] * 3
        car_location = random.randint(0, 2)
        doors[car_location] = "car"

        # Contestant's initial choice
        initial_choice = random.randint(0, 2)

        # Monty reveals one of the goat doors that you didn't pick
        goat_doors = [i for i, prize in enumerate(doors) if prize == "goat" and i != initial_choice]
        monty_opens = random.choice(goat_doors)

        # Determine the door to switch to
        switch_to = [i for i in range(3) if i != initial_choice and i != monty_opens][0]

        # Check if you win by sticking or switching
        if doors[initial_choice] == "car":
            stay_wins += 1
        if doors[switch_to] == "car":
            switch_wins += 1

    return stay_wins, switch_wins

num_simulations = 10000
stay_wins, switch_wins = monty_hall_simulation(num_simulations)
p_win_stay = stay_wins / num_simulations
p_win_switch = switch_wins / num_simulations
print('Probability of winning by staying with the initial door is:', round(p_win_stay,2))
print('Probability of winning by switching doors is:', round(p_win_switch,2))
Probability of winning by staying with the initial door is: 0.33
Probability of winning by switching doors is: 0.67

Discussion of Problem 5¶

The counterintuitive result of the Monty Hall problem can be explained through conditional probability. Initially, when you choose one of the three doors, there is a 1/3 chance that you've picked the door with the prize and a 2/3 chance that you've chosen a door with a goat. When Monty Hall, who knows what is behind each door, reveals a goat behind one of the other two doors you didn't pick, the odds for the remaining unopened door change. The key insight is that Monty's action conveys information. If your initial choice was the prize (1/3 chance), switching would lose it; but if you initially chose a goat (2/3 chance), switching guarantees you win the prize. So the probability of winning if you switch is 2/3, and if you stick with your initial choice, it remains at 1/3.
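The case analysis above can also be enumerated exactly instead of simulated. The following is a minimal sketch that walks through every combination of prize location and initial choice; it assumes, as the simulation does, that Monty picks at random when he has two goat doors to choose from. The variable names are my own.

```python
from fractions import Fraction
from itertools import product

stay_win = Fraction(0)
switch_win = Fraction(0)

# Nine equally likely (prize location, initial choice) combinations
for car, choice in product(range(3), repeat=2):
    p_case = Fraction(1, 9)
    # Doors Monty may open: not the contestant's door and not the prize
    goat_doors = [d for d in range(3) if d != choice and d != car]
    for opened in goat_doors:
        p_branch = p_case / len(goat_doors)
        remaining = next(d for d in range(3) if d not in (choice, opened))
        if choice == car:
            stay_win += p_branch
        if remaining == car:
            switch_win += p_branch

print("P(win | stay):  ", stay_win)
print("P(win | switch):", switch_win)
```

The two probabilities sum to one, since exactly one of the two remaining doors hides the prize.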

Problem 6 - Bayesian Drug Testing¶

If a drug test has a true positive rate of 90% and a false positive rate of 20%, what is the probability of someone being a drug user given a positive test if the prevalence of drug users is 5%?

In [18]:
# Define the true positive rate, false positive rate, and prevalence
true_positive_rate = 0.90
false_positive_rate = 0.20
prevalence = 0.05

# Number of simulations
num_simulations = 10000
positive_tests = 0
users_given_positive_test = 0

for _ in range(num_simulations):
    # Simulate a random person
    is_user = random.random() < prevalence  # True if the person is a drug user

    # Simulate the drug test
    if is_user:
        # The person is a drug user
        if random.random() < true_positive_rate:
            positive_tests += 1
            users_given_positive_test += 1
    else:
        # The person is not a drug user
        if random.random() < false_positive_rate:
            positive_tests += 1

# Calculate the probability of being a user given a positive test
probability_user_given_positive_test = users_given_positive_test / positive_tests

print("Probability of drug user given positive test:", round(probability_user_given_positive_test,2))
Probability of drug user given positive test: 0.19

Discussion of Problem 6¶

Given the high true positive rate, one might intuit that the answer to this question should also be relatively high. The actual solution may be found by plugging into Bayes' theorem as follows:

$$ P(\text{User}|\text{Positive}) = \frac{P(\text{Positive}|\text{User}) \cdot P(\text{User})}{P(\text{Positive}|\text{User}) \cdot P(\text{User}) + P(\text{Positive}|\text{Non-user}) \cdot P(\text{Non-user})}= \frac{0.90 \cdot 0.05}{0.90 \cdot 0.05 + 0.20 \cdot 0.95} \approx 19\% $$

As shown, once the low prevalence of users is taken into account, a positive test has a relatively low probability of actually identifying a drug user.
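To see how strongly the prevalence drives this result, Bayes' theorem can be wrapped in a small function and evaluated at a few prevalence levels. This is a minimal sketch: the function name is my own, and the prevalence values other than 5% are purely illustrative.

```python
def posterior_given_positive(true_positive_rate, false_positive_rate, prevalence):
    """P(user | positive test) via Bayes' theorem."""
    p_positive = (true_positive_rate * prevalence
                  + false_positive_rate * (1 - prevalence))
    return true_positive_rate * prevalence / p_positive

# Illustrative prevalence levels; only 0.05 comes from the problem statement
for prevalence in (0.01, 0.05, 0.20, 0.50):
    posterior = posterior_given_positive(0.90, 0.20, prevalence)
    print(f"prevalence={prevalence:.2f} -> P(user | positive)={posterior:.2f}")
```

With the test characteristics held fixed, the posterior rises steadily with prevalence, which is why the same test is far more informative in a population where use is common.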

Closing Thoughts¶

An interesting fact that I discovered is that the Monty Hall problem has actually been run as an experiment in which pigeons were tested to see whether, after successive trials, they would adapt their strategy accordingly. The experiment was then replicated on human participants. Whereas the pigeons adjusted to the optimal strategy, humans failed to do the same. Although the result is somewhat surprising, it is at the same time not hard to infer the reason behind it. The pigeons, unrestricted by traditional logic or assumed heuristics, were able to adapt their strategy more quickly based on empirical success rates. The humans, on the other hand, were anchored to their beliefs about what the probabilities should be.

To me, this is very telling of how we operate in our daily lives - we draw on heuristics which allow us to make decisions at a rather frenetic pace, but which can often mislead our probabilistic inferences. Although brute force computation would be one way to arrive at more accurate estimates, most real-world problems are unfortunately too complex (and time-consuming) to model accurately. In that regard we are inevitably bound to heuristic-based thinking - this isn't necessarily a bad thing, but it is at the very least worth considering as a limitation in our evolutionary pathway when building confidence intervals around what we categorically think we know, or have yet to learn.
