Monte Carlo Simulation of the Monty Hall Problem: Does The Math Actually Work? With Python
Introduction
If you ever took a stats class back in high school or college, then you were probably told about the infamous Monty Hall problem. The premise is very simple:
- The contestant is presented with 3 doors. Behind one of the doors there is a car, and behind each of the other two there is a goat.
- The contestant picks a door. The host, who knows where the car is, then opens one of the other two doors to reveal a goat.
- The contestant then has a choice: should she switch doors or stick with what she picked first?
- You can play the game at this link. Try it out for yourself!
What you were told
According to your stats professor, after the host of the show reveals one of the doors, you should always switch! Indeed, he says, it’s a simple Bayesian stats problem, summarized in this equation:
When the host reveals what’s behind one of the doors, you are given new information, and that information changes the probabilities you should assign to the two doors that are still closed.
According to the equation above, if you switch doors, your probability of winning becomes 2/3 (about 66.7%), versus 1/3 (about 33.3%) if you decide to keep your original choice.
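For readers who want to see the arithmetic, here is one way to write out the Bayes calculation. This is a sketch under the standard assumptions, not necessarily the exact form your professor used: the contestant picks door 1, the host always opens a goat door among the other two, and he picks at random when both of them hide goats. The symbols are my own labels: C_i means "the car is behind door i" and H_3 means "the host opens door 3".

```latex
\begin{align*}
P(C_2 \mid H_3)
  &= \frac{P(H_3 \mid C_2)\,P(C_2)}
          {P(H_3 \mid C_1)\,P(C_1) + P(H_3 \mid C_2)\,P(C_2) + P(H_3 \mid C_3)\,P(C_3)} \\
  &= \frac{1 \cdot \tfrac{1}{3}}
          {\tfrac{1}{2}\cdot\tfrac{1}{3} + 1\cdot\tfrac{1}{3} + 0\cdot\tfrac{1}{3}}
   = \frac{1/3}{1/2} = \frac{2}{3}, \\
P(C_1 \mid H_3)
  &= \frac{\tfrac{1}{2}\cdot\tfrac{1}{3}}{\tfrac{1}{2}} = \frac{1}{3}.
\end{align*}
```

The key term is P(H_3 | C_2) = 1: if the car is behind door 2, the host has no choice but to open door 3, which is why seeing door 3 opened favors door 2.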
For more info on the stats behind Monty Hall, check out this link.
Sounds a bit far-fetched though, no? How does switching doors increase your chances of winning like that?
Let’s verify it by simulating it 10,000 times
As a scientist, I always like to verify theoretical concepts with concrete evidence, and by leveraging the power of computer simulation (Monte Carlo), we will know once and for all if the math in the Bayes equations actually adds up! Let’s go.
# Start by importing our favorite data science libraries!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random
%matplotlib inline
Now we need to come up with a plan before writing our code… after a few minutes of sketching (but hours of rendering), here is what I came up with:
Looks like this can work! We just have to randomly assign the 3 doors, and loop this 10,000 times. Now since the code is a bit lengthy, I will just post it on my website here. Check it out!
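If you just want something to run right away, here is a minimal sketch of what such a simulation could look like. It is not the full code from my website: the function names `play_round` and `simulate`, the `seed` parameter, and the use of only the standard `random` module are choices made for this illustration. It assumes the host always opens a goat door that the contestant did not pick.

```python
import random


def play_round(rng, n_doors=3):
    """Play one game and report whether keeping / switching would win."""
    car = rng.randrange(n_doors)    # door hiding the car
    pick = rng.randrange(n_doors)   # contestant's first choice

    # The host may only open a door that is neither the pick nor the car.
    openable = [d for d in range(n_doors) if d != pick and d != car]
    opened = rng.choice(openable)

    # With 3 doors, exactly one door is left to switch to.
    switch_to = next(d for d in range(n_doors) if d != pick and d != opened)

    return pick == car, switch_to == car


def simulate(n_games=10_000, seed=0):
    """Count wins for both strategies over n_games independent rounds."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    results = {"Keep Choice": 0, "Switch Choice": 0}
    for _ in range(n_games):
        keep_wins, switch_wins = play_round(rng)
        results["Keep Choice"] += keep_wins
        results["Switch Choice"] += switch_wins
    return results


if __name__ == "__main__":
    print(simulate())
```

Because exactly one of the two strategies wins every round, the two counts always add up to the number of games played. The exact split depends on the seed, so do not expect to reproduce any particular pair of numbers.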
The Results
Okay, the results are in! After playing the game 10,000 times, here is the distribution of wins vs losses if you switch vs keep.
It looks like Bayes was right after all: if you do decide to switch, your chances of winning are higher, right around 66% (actual numbers are {'Keep Choice': 3339, 'Switch Choice': 6661}).
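How close is "right around 66%" to the theoretical 2/3? One quick, rough check is a 95% confidence interval for the observed switch win rate, using the normal approximation for a proportion. The helper name `proportion_ci` is made up for this sketch, and the inputs are simply the counts reported above.

```python
import math


def proportion_ci(wins, n_games, z=1.96):
    """Normal-approximation confidence interval for a win rate."""
    p_hat = wins / n_games
    std_err = math.sqrt(p_hat * (1 - p_hat) / n_games)
    return p_hat, p_hat - z * std_err, p_hat + z * std_err


# Counts reported in the results: 6661 switch wins out of 10,000 games.
p_hat, low, high = proportion_ci(6661, 10_000)
print(f"switch win rate: {p_hat:.4f}, 95% CI: ({low:.4f}, {high:.4f})")
print("2/3 inside the interval:", low < 2 / 3 < high)
```

The interval is a bit under one percentage point wide on each side and contains 2/3, so the simulated result is consistent with the Bayes calculation.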
An easier way to think about it
This analogy really helped me wrap my head around how this is possible (thank you Sylvia).
Say that instead of 3 doors, you have 1,000 doors. If you randomly pick door #2, for example, then the probability that the car is behind it is 1 / 1000, which is 0.1%. Now say the host opens 998 of the other doors, all hiding goats, and keeps just 1 other door closed (door #1 in the illustration below). Would you stick to your original choice or switch?
If you stick to your original choice, your chances of winning are what they were at the beginning of the game, when you randomly selected the door: 0.1%. The one other door left closed most likely has the car (with probability 99.9%), and you should switch!
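The many-doors intuition is easy to check numerically too. The sketch below is an illustration only: `switch_win_rate` is a hypothetical helper, and it assumes the host opens every door except the contestant's pick and one other, never revealing the car.

```python
import random


def switch_win_rate(n_doors, n_games=10_000, seed=0):
    """Estimate how often switching wins when the host leaves one other door closed."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        if pick == car:
            # First pick was right: the host leaves a random goat door closed.
            remaining = rng.choice([d for d in range(n_doors) if d != pick])
        else:
            # First pick was wrong: the host must leave the car's door closed.
            remaining = car
        wins += remaining == car
    return wins / n_games


if __name__ == "__main__":
    for n in (3, 10, 1000):
        print(n, "doors -> switch wins about", switch_win_rate(n))
```

Switching only loses when the first pick happened to be right, which occurs with probability 1 / n_doors, so the switch win rate should approach (n_doors - 1) / n_doors as the number of games grows.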
Conclusion
I hope you enjoyed reading this article as much as I enjoyed writing it! I teach data science on the side at www.thepythonacademy.com, so if you’d like further training, or even want to learn it all from scratch, feel free to contact us on the website. I also plan to publish many articles on Machine Learning and AI on here, so feel free to follow me as well. Please share, like, connect, and comment, as I always love hearing from you. Thank you!