URL: https://fivethirtyeight.com/features/can-you-beat-the-goat-monty-hall-problem/

Riddler Express:

The Monty Hall problem is a classic case of conditional probability. In the original problem, there are three doors, two of which have goats behind them, while the third has a prize. You pick one of the doors, and then Monty (who knows in advance which door has the prize) will always open another door, revealing a goat behind it. It’s then up to you to choose whether to stay with your initial guess or to switch to the remaining door. Your best bet is to switch doors, in which case you will win the prize two-thirds of the time.

Now suppose Monty changes the rules. First, he will randomly pick a number of goats to put behind the doors: zero, one, two or three, each with a 25 percent chance. After the number of goats is chosen, they are assigned to the doors at random, and each door has at most one goat. Any doors that don’t have a goat behind them have an identical prize behind them.

At this point, you choose a door. If Monty is able to open another door, revealing a goat, he will do so. But if no other doors have goats behind them, he will tell you that is the case.

It just so happens that when you play, Monty is able to open another door, revealing a goat behind it. Should you stay with your original selection or switch? And what are your chances of winning the prize?

Approach:

Setup

Lets generate some helper functions…

library(dplyr)

#Function to randomly generate number of goats
num_goats <- function(max.count = 3){
  goat.count <- sample(0:max.count, 1)
  return(goat.count)
}

# Function that takes the number of goats, and generates a vector
# representing the doors setup.  1 means goat behind door, 0 means no goat!
door_setup <- function(num.goats, num.doors = 3) {
  # There should not be more goats than doors
  if(num.goats > num.doors) {stop("Error: Too many goats fool!")}
  
  order <- c(rep(1, num.goats), # 1 for each goat
             rep(0, num.doors - num.goats) # 0 for each empty door
            ) %>% 
    sample() # to randomly reorder results
  
  return(order)
}

# Given a door setup and my guess, how many other goats are there?
other_goat <- function(setup.list, door.guess, value=1) {
  door.setup <- unlist(setup.list)
  door.setup <- door.setup[-door.guess] 
  sum(door.setup)
}

# Simple helper to show if there was a goat behind my initial guess
guess_outcome <- function(setup.list, door.guess, value=0) {
  door.setup <- unlist(setup.list)
  guess.outcome <- door.setup[door.guess]
  return(guess.outcome)
} 

Now that we have some helpers, lets generate a simulation. I want to create a million trials and verify the results make sense with some data checks.

trial.count <- 100000

sim <- tibble::tibble(
  trial = 1:trial.count,
  goats = replicate(trial.count, num_goats()),
  setup = purrr::map(goats, door_setup),
  my.guess = sample(1:3, trial.count, replace = TRUE),
  my.guess.result = purrr::map2_dbl(setup, my.guess, guess_outcome),
  num.goats.other.doors = purrr::map2_dbl(setup, my.guess, other_goat) 
) 

sim$goats %>% janitor::tabyl() %>% janitor::adorn_pct_formatting()
##  .     n percent
##  0 24776   24.8%
##  1 25041   25.0%
##  2 25027   25.0%
##  3 25156   25.2%
sim$my.guess %>% janitor::tabyl() %>% janitor::adorn_pct_formatting()
##  .     n percent
##  1 33275   33.3%
##  2 33509   33.5%
##  3 33216   33.2%
sim$my.guess.result %>% janitor::tabyl() %>% janitor::adorn_pct_formatting()
##  .     n percent
##  0 49941   49.9%
##  1 50059   50.1%

Looking at the percent breakdown it looks like everything makes sense. But those are just all the possible scenarios. We know if Monty can open another door so we can eliminate rows where the number of goats behind other doors are zero.

sim <- sim %>% 
  filter(num.goats.other.doors>0) %>% 
  mutate(
    result.switch = case_when(
      num.goats.other.doors == 2 ~ 1,
      num.goats.other.doors == 1 ~ 0
    )
  )

final.result <- sim %>% 
  summarise(trials = n(),
            win.stay = sum(ifelse(my.guess.result == 0, 1, 0)) / trials,
            win.switch = sum(ifelse(result.switch == 0, 1, 0)) / trials) %>% 
  janitor::adorn_pct_formatting()

final.result
## # A tibble: 1 x 3
##   trials win.stay win.switch
##    <int> <chr>    <chr>     
## 1  66851 37.6%    49.7%

Riddler Express Answer:

You want to switch, given you have a 49.7% of winning if you switch compare to an initial 37.6% change of winning if you stayed put.