BDA3 Chapter 3 Exercise 2

Posted on 15 September, 2018 by Brian
Tags: bda chapter 3, solutions, bayes, multinomial, dirichlet, beta
Category: bda3

Here’s my solution to exercise 2, chapter 3, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.

survey	bush	dukakis	other	total
pre-debate	294	307	38	639
post-debate	288	332	19	639

Let \(\theta_\text{pre} \sim \ddirichlet_3(\alpha_\text{pre})\) and \(\theta_\text{post} \sim \ddirichlet_3(\alpha_\text{post})\) be the pre- and post-debate priors. Then the posteriors are

\[ \begin{align} \theta_\text{pre} \mid y &\sim \ddirichlet\left(294 + \alpha_{(\text{pre}, \text{b})}, 307 + \alpha_{(\text{pre}, \text{d})}, 38 + \alpha_{(\text{pre}, \text{o})} \right) \\ \theta_\text{post} \mid y &\sim \ddirichlet\left(288 + \alpha_{(\text{post}, \text{b})}, 332 + \alpha_{(\text{post}, \text{d})}, 19 + \alpha_{(\text{post}, \text{o})} \right) . \end{align} \]

Denote the proportion of support for Bush amongst Bush/Dukakis supporters by \(\phi_\bullet := \frac{\theta_{(\bullet, \text{b})}}{\theta_{(\bullet, \text{b})} + \theta_{(\bullet, \text{d})}}\). The results of the previous exercise show that we can treat this as a beta-binomial problem by simply ignoring the results for other. More precisely,

\[ \begin{align} \phi_\text{pre} \mid y &\sim \dbeta\left(294 + \alpha_{(\text{pre}, \text{b})}, 307 + \alpha_{(\text{pre}, \text{d})} \right) \\ \phi_\text{post} \mid y &\sim \dbeta\left(288 + \alpha_{(\text{post}, \text{b})}, 332 + \alpha_{(\text{post}, \text{d})} \right) \end{align} . \]

Let’s plot these posteriors and their difference \(\delta := (\phi_\text{post} \mid y) - (\phi_\text{pre} \mid y)\).

alpha_bush <- 1
alpha_dukakis <- 1

posteriors <- expand.grid(
    value = seq(0, 1, 0.001),
    survey = df$survey
  ) %>% 
  as_tibble() %>% 
  inner_join(df, by = 'survey') %>% 
  mutate(
    post_bush = bush + alpha_bush,
    post_dukakis = dukakis + alpha_dukakis,
    density = dbeta(value, post_bush, post_dukakis)
  )

N <- 10000

shift <- expand.grid(
    draw = 1:N, 
    survey = df$survey
  ) %>% 
  as_tibble() %>% 
  inner_join(df, by = 'survey') %>% 
  mutate(
    post_bush = bush + alpha_bush,
    post_dukakis = dukakis + alpha_dukakis,
    value = rbeta(n(), post_bush, post_dukakis)
  ) %>% 
  select(draw, survey, value) %>% 
  spread(survey, value) %>% 
  mutate(difference = `post-debate` - `pre-debate`)

There is a 19% probability (the area above 0) that there was a positive shift towards Bush.