BDA3 Chapter 5 Exercise 14
Here’s my solution to exercise 14, chapter 5, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.
We’ll use the same dataset as before but only use the total traffic counts.
df <- read_csv('data/chapter_03_exercise_08.csv') %>%
  filter(type == 'residential' & bike_route) %>%
  transmute(
    i = 1:n(),
    total = bikes + other
  )
| i | total |
|---|-------|
| 1 | 74 |
| 2 | 99 |
| 3 | 58 |
| 4 | 70 |
| 5 | 122 |
| 6 | 77 |
| 7 | 104 |
| 8 | 129 |
| 9 | 308 |
| 10 | 119 |
The hyperprior from exercise 13 was \(p(\alpha, \beta) \propto (\alpha + \beta)^{-\frac{5}{2}}\), where \(\alpha\) and \(\beta\) were the parameters of a beta distribution. As a first attempt, we’ll try the same hyperprior here, with \(\alpha\) and \(\beta\) now the parameters of a gamma distribution. Since the hyperparameters have the same support \((0, \infty)\) in both cases, this at least makes some sense.
The joint posterior is
\[ \begin{align} p(\alpha, \beta, \theta \mid y) &\propto p(\alpha, \beta) \cdot \prod_{j = 1}^J p(\theta_j \mid \alpha, \beta) \cdot p(y_j \mid \theta_j) \\ &\propto (\alpha + \beta)^{-\frac{5}{2}} \prod_{j = 1}^J \frac{\beta^\alpha}{\Gamma(\alpha)} \theta_j^{\alpha - 1} e^{-\beta \theta_j} \theta_j^{y_j} e^{-\theta_j} \\ &= (\alpha + \beta)^{-\frac{5}{2}} \frac{\beta^{J\alpha}}{\Gamma(\alpha)^J} e^{-(\beta + 1) \sum_j \theta_j} \prod_{j = 1}^J \theta_j^{y_j + \alpha - 1} . \end{align} \]
Here the constant \(1 / y_j!\) factors from the Poisson densities have been dropped. I have no idea whether this posterior has a finite integral (i.e. whether it is proper), so we’ll just use it as is for the rest of the exercise.
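To make the algebra concrete, here’s the unnormalised log posterior written as a plain R function. This is only a sanity-check sketch of the expression above (the function and argument names are my own); it isn’t used in the fit below.
log_posterior <- function(alpha, beta, theta, y) {
  # hyperprior: (alpha + beta)^(-5/2) on the log scale
  -(5 / 2) * log(alpha + beta) +
    # gamma prior on each theta_j
    sum(dgamma(theta, shape = alpha, rate = beta, log = TRUE)) +
    # Poisson likelihood (the 1 / y! terms are constant in the parameters)
    sum(dpois(y, lambda = theta, log = TRUE))
}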
Here’s the model definition in Stan.
model <- rstan::stan_model('src/ex_05_14.stan')
S4 class stanmodel 'ex_05_14' coded as follows:
data {
  int<lower = 1> n;
  int<lower = 0> total[n];
}
parameters {
  real<lower = 0> alpha;
  real<lower = 0> beta;
  vector<lower = 0>[n] theta;
}
model {
  // hyperprior: p(alpha, beta) proportional to (alpha + beta)^(-5/2),
  // added directly to the log density since it's not a built-in distribution
  target += -(5. / 2.) * log(alpha + beta);
  // theta prior
  theta ~ gamma(alpha, beta);
  // likelihood
  total ~ poisson(theta);
}
The model can be fit using some tidybayes helpers.
fit <- model %>%
  rstan::sampling(
    data = tidybayes::compose_data(df),
    warmup = 1000,
    iter = 5000
  ) %>%
  tidybayes::recover_types(df)
Now extract the posterior draws in a tidy format.
draws <- fit %>%
  tidybayes::spread_draws(alpha, beta, theta[i])
The posterior joint distribution of \(\alpha, \beta\) looks fairly reasonable. It’s concentrated along the diagonal where \(\beta \approx \alpha / 100\), mainly around \(\alpha \approx 2.5\), \(\beta \approx 0.025\). That corresponds to a prior mean of \(\alpha / \beta \approx 100\) for the \(\theta_j\), which matches the scale of the observed totals.
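Here’s a rough sketch of how such a plot could be made from the draws with ggplot2 (assumed to be loaded along with the rest of the tidyverse; the styling choices are my own):
draws %>%
  filter(i == 1) %>%  # one row per posterior draw of (alpha, beta)
  ggplot(aes(alpha, beta)) +
  geom_point(alpha = 0.1) +
  labs(x = expression(alpha), y = expression(beta))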
There is little deviation between the observed totals and the estimated values of \(\theta_j\).
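One way to check this is to compare each street’s observed total with the posterior median of its \(\theta_j\); a minimal sketch (the theta_hat and residual columns are my own naming):
draws %>%
  group_by(i) %>%
  summarise(theta_hat = median(theta)) %>%
  left_join(df, by = 'i') %>%
  mutate(residual = total - theta_hat)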
To estimate total traffic for an unobserved street, we draw \(\alpha, \beta\) from the posterior, draw \(\theta \sim \operatorname{Gamma}(\alpha, \beta)\), then draw \(\tilde y \sim \operatorname{Poisson}(\theta)\). The quantiles of \(\tilde y\) are then our posterior predictive interval.
# For each posterior draw of (alpha, beta), simulate a new street's theta and
# total y, then summarise both with medians and 95% intervals.
cis <- draws %>%
  filter(i == 1) %>%  # keep one row per posterior draw of (alpha, beta)
  mutate(
    theta = rgamma(n(), alpha, beta),
    y = rpois(n(), theta)
  ) %>%
  tidybayes::median_qi() %>%
  select(matches('y|theta'))
The 95% posterior interval for \(\tilde\theta\) is (19, 295). The 95% posterior predictive interval for \(\tilde y\) is (19, 298), which includes 9 of the 10 observed values (all except the street with total 308).
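As a quick check of that coverage claim, we can count directly how many observed totals fall inside the predictive interval (this assumes the y.lower / y.upper column names that median_qi produced above):
sum(df$total >= cis$y.lower & df$total <= cis$y.upper)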