BDA3 Chapter 2 Exercise 8
Here’s my solution to exercise 8, chapter 2, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.
With prior $\theta \sim \operatorname{normal}(180, 40)$, sampling distribution $y \mid \theta \sim \operatorname{normal}(\theta, 20)$, and $n$ sampled students with average weight $\bar y = 150$, it follows from 2.11 that the posterior mean and precision are
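$$
\begin{aligned}
\mu := \mathrm{E}(\theta \mid \bar y) &= \frac{\frac{180}{1600} + \frac{150 n}{400}}{\frac{1}{1600} + \frac{n}{400}} = \frac{60(3 + 10n)}{1600} \cdot \frac{1600}{1 + 4n} = \frac{60(3 + 10n)}{1 + 4n}, \\
\frac{1}{\sigma^2} := \frac{1}{\mathrm{V}(\theta \mid \bar y)} &= \frac{1}{1600} + \frac{n}{400} = \frac{1 + 4n}{1600}.
\end{aligned}
$$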
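So $\theta \mid \bar y \sim \operatorname{normal}\left(\frac{60(3 + 10n)}{1 + 4n}, \frac{40}{\sqrt{1 + 4n}}\right)$. When $n = 0$ this is exactly the prior, and as $n \to \infty$ it concentrates at 150 (the observed mean) with variance tending to zero.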
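As a sanity check on the algebra, here is a minimal sketch that computes the posterior directly from the precision-weighted formula and compares it with the simplified closed forms. The helper `posterior_params` and the names `mu_closed` and `sigma_closed` are hypothetical, introduced only for this check.

```r
# Hypothetical helper (not part of the original solution): posterior of a
# normal mean with known data sd, using the precision-weighted formula.
posterior_params <- function(n, ybar = 150, prior_mean = 180,
                             prior_sd = 40, data_sd = 20) {
  prior_precision <- 1 / prior_sd^2
  data_precision <- n / data_sd^2
  post_precision <- prior_precision + data_precision
  post_mean <- (prior_precision * prior_mean + data_precision * ybar) / post_precision
  list(mean = post_mean, sd = sqrt(1 / post_precision))
}

# Simplified closed forms for this exercise
mu_closed <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
sigma_closed <- function(n) 40 / sqrt(1 + 4 * n)

# Largest disagreement between the two routes over a few sample sizes
n_values <- c(0, 1, 10, 100)
general <- posterior_params(n_values)
max_mean_gap <- max(abs(general$mean - mu_closed(n_values)))
max_sd_gap <- max(abs(general$sd - sigma_closed(n_values)))

# Limiting cases: n = 0 recovers the prior, large n approaches the sample mean
prior_check <- posterior_params(0)
large_n <- posterior_params(1e6)
```

Both gaps should be zero up to floating-point error, `prior_check` should give back the prior mean and standard deviation, and `large_n` should have a mean near 150 with a standard deviation near zero.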
It follows from the calculations shown in the book that the posterior predictive distribution is $\tilde y \mid y \sim \operatorname{normal}\left(\mu, \sqrt{\sigma^2 + 400}\right)$.
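For reference, the predictive mean and variance can be sketched with the laws of total expectation and total variance. This is a summary under the model above, with data variance $20^2 = 400$, and not a quotation of the book's derivation.

```latex
\begin{aligned}
\mathrm{E}(\tilde y \mid y)
  &= \mathrm{E}\big(\mathrm{E}(\tilde y \mid \theta, y) \mid y\big)
   = \mathrm{E}(\theta \mid y) = \mu, \\
\mathrm{V}(\tilde y \mid y)
  &= \mathrm{E}\big(\mathrm{V}(\tilde y \mid \theta, y) \mid y\big)
   + \mathrm{V}\big(\mathrm{E}(\tilde y \mid \theta, y) \mid y\big)
   = 20^2 + \mathrm{V}(\theta \mid y) = \sigma^2 + 400.
\end{aligned}
```

Normality follows because $\tilde y$ is the sum of $\theta$ and an independent normal error term, both of which are normal given $y$.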
We can obtain 95% posterior intervals as follows.
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)  # posterior mean
sigma <- function(n) 40 / sqrt(1 + 4 * n)  # posterior standard deviation
percentiles <- c(0.025, 0.975)  # bounds of a central 95% interval
theta_posterior_interval <- qnorm(percentiles, mu(10), sigma(10))
y_posterior_interval <- qnorm(percentiles, mu(10), sqrt(sigma(10)^2 + 400))
With a sample of size 10, we get $\theta \in [138.5, 163.0]$ and $\tilde y \in [109.7, 191.8]$.
theta_posterior_interval <- qnorm(percentiles, mu(100), sigma(100))
y_posterior_interval <- qnorm(percentiles, mu(100), sqrt(sigma(100)^2 + 400))
With a sample of size 100, we get $\theta \in [146.2, 154.0]$ and $\tilde y \in [110.7, 189.5]$.
Both of these posterior intervals for $\theta$ are very similar to the frequentist confidence intervals, especially in the case $n = 100$.
qnorm(percentiles, 150, 20 / sqrt(10))
## [1] 137.6041 162.3959
qnorm(percentiles, 150, 20 / sqrt(100))
## [1] 146.0801 153.9199
We would expect them to become more similar as n increases, because both means and standard deviations converge to the same values for large n.
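The convergence can be checked numerically. The sketch below redefines the posterior mean and standard deviation so that it runs on its own, then compares them with the sample mean and the standard error $20 / \sqrt{n}$ for increasing $n$. The data frame `comparison` and its column names are introduced here for illustration only.

```r
# Posterior mean and sd for this exercise (same closed forms as above)
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
sigma <- function(n) 40 / sqrt(1 + 4 * n)

n <- c(10, 100, 1000, 10000)
comparison <- data.frame(
  n = n,
  mean_gap = mu(n) - 150,               # posterior mean minus sample mean
  sd_ratio = sigma(n) / (20 / sqrt(n))  # posterior sd over standard error
)
print(comparison)
```

Algebraically, `mean_gap` equals $30 / (1 + 4n)$ and `sd_ratio` equals $\sqrt{4n / (1 + 4n)}$, so the gap shrinks to zero and the ratio rises to one as $n$ grows.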