BDA3 Chapter 2 Exercise 8

Here’s my solution to exercise 8, chapter 2, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.

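With prior $\theta \sim \operatorname{normal}(180, 40)$, sampling distribution $y \mid \theta \sim \operatorname{normal}(\theta, 20)$ (the second parameter being the standard deviation), and $n$ sampled students with average weight $\bar y = 150$, it follows from 2.11 that the posterior mean and precision are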

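$$
\begin{aligned}
\mu := \mathbb{E}(\theta \mid \bar y) &= \frac{\frac{180}{1600} + \frac{150n}{400}}{\frac{1}{1600} + \frac{n}{400}} = \frac{60(3 + 10n)}{1600} \cdot \frac{1600}{1 + 4n} = \frac{60(3 + 10n)}{1 + 4n}, \\
\frac{1}{\sigma^2} := \frac{1}{\mathbb{V}(\theta \mid \bar y)} &= \frac{1}{1600} + \frac{n}{400} = \frac{1 + 4n}{1600}.
\end{aligned}
$$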

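So $\theta \mid \bar y \sim \operatorname{normal}\left(\frac{60(3 + 10n)}{1 + 4n}, \frac{40}{\sqrt{1 + 4n}}\right)$. When $n = 0$ this is exactly the prior, and as $n \to \infty$ it concentrates at 150 (the observed mean) with zero variance.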
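As a quick sanity check on the algebra, the sketch below compares the closed form with the general precision-weighted formula for a normal mean with known sampling variance. The helper `posterior_params` and its argument names are my own, chosen for illustration; they are not from the book.

```r
# Precision-weighted posterior for a normal mean with known sampling sd.
# posterior_params and its arguments are illustrative names (an assumption).
posterior_params <- function(n, ybar = 150, prior_mean = 180,
                             prior_sd = 40, y_sd = 20) {
  precision <- 1 / prior_sd^2 + n / y_sd^2
  post_mean <- (prior_mean / prior_sd^2 + n * ybar / y_sd^2) / precision
  list(mean = post_mean, sd = sqrt(1 / precision))
}

# Closed forms for this exercise.
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
sigma <- function(n) 40 / sqrt(1 + 4 * n)

# The two routes should agree for any n, including n = 0 (the prior).
for (n in c(0, 10, 100)) {
  p <- posterior_params(n)
  stopifnot(isTRUE(all.equal(p$mean, mu(n))),
            isTRUE(all.equal(p$sd, sigma(n))))
}

posterior_params(0)  # recovers the prior: mean 180, sd 40
```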

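It follows from the calculations shown in the book that the posterior predictive distribution is $\tilde y \mid y \sim \operatorname{normal}\left(\mu, \sqrt{\sigma^2 + 400}\right)$, that is, normal with mean $\mu$ and variance $\sigma^2 + 400$.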
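The predictive variance can also be checked by simulation: draw θ from its posterior, then draw a new observation given each θ. This is only a rough Monte Carlo sketch; the seed, the number of draws, and the choice n = 10 are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n <- 10
mu_n <- 60 * (3 + 10 * n) / (1 + 4 * n)   # posterior mean
sigma_n <- 40 / sqrt(1 + 4 * n)           # posterior sd

draws <- 1e6
theta <- rnorm(draws, mu_n, sigma_n)      # theta given the data
y_tilde <- rnorm(draws, theta, 20)        # new observation given theta

# The simulated variance should be close to sigma_n^2 + 400.
c(simulated = var(y_tilde), analytic = sigma_n^2 + 400)
```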

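We can obtain central posterior intervals as follows; note that the percentiles 0.05 and 0.95 give 90% intervals rather than the 95% intervals the exercise asks for.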

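# Posterior mean of theta after n observations with sample mean 150
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
# Posterior standard deviation of theta after n observations
sigma <- function(n) 40 / sqrt(1 + 4 * n)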

percentiles <- c(0.05, 0.95)

theta_posterior_interval <- qnorm(percentiles, mu(10), sigma(10))
y_posterior_interval <- qnorm(percentiles, mu(10), sqrt(sigma(10)^2 + 400))

With a sample of size 10, we get $\theta \in [140.5, 161.0]$ and $\tilde y \in [116.3, 185.2]$.

theta_posterior_interval <- qnorm(percentiles, mu(100), sigma(100))
y_posterior_interval <- qnorm(percentiles, mu(100), sqrt(sigma(100)^2 + 400))

With a sample of size 100, we get $\theta \in [146.8, 153.4]$ and $\tilde y \in [117.0, 183.1]$.
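Strictly, a central 95% interval uses the 2.5% and 97.5% percentiles rather than 0.05 and 0.95. A minimal sketch of that variant follows; `interval_95` is a hypothetical helper name, and `mu` and `sigma` are repeated so the block runs on its own.

```r
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
sigma <- function(n) 40 / sqrt(1 + 4 * n)

# Central 95% intervals for theta and for a new observation y-tilde.
# interval_95 is an illustrative name (an assumption, not from the book).
interval_95 <- function(n) {
  probs <- c(0.025, 0.975)
  list(
    theta = qnorm(probs, mu(n), sigma(n)),
    y_tilde = qnorm(probs, mu(n), sqrt(sigma(n)^2 + 400))
  )
}

interval_95(10)
interval_95(100)
```

These intervals are wider than the ones above, because the normal quantile moves from about 1.64 to about 1.96 standard deviations either side of the mean.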

Both of these posterior intervals for $\theta$ are very similar to the corresponding frequentist confidence intervals, which are centred on $\bar y = 150$ with standard error $20 / \sqrt{n}$, especially in the case $n = 100$.

qnorm(percentiles, 150, 20 / sqrt(10))
## [1] 139.597 160.403
qnorm(percentiles, 150, 20 / sqrt(100))
## [1] 146.7103 153.2897

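We would expect them to become more similar as $n$ increases: the posterior mean $\frac{60(3 + 10n)}{1 + 4n}$ converges to the sample mean 150, and the posterior standard deviation $\frac{40}{\sqrt{1 + 4n}}$ approaches the standard error $\frac{20}{\sqrt{n}}$.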
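To see this numerically, the sketch below measures the largest gap between corresponding endpoints of the two intervals for increasing sample sizes. The helper `endpoint_gap` and the grid of sample sizes are my own choices for illustration.

```r
mu <- function(n) 60 * (3 + 10 * n) / (1 + 4 * n)
sigma <- function(n) 40 / sqrt(1 + 4 * n)
percentiles <- c(0.05, 0.95)

# Largest endpoint gap between the posterior and frequentist intervals.
# endpoint_gap is an illustrative name (an assumption).
endpoint_gap <- function(n) {
  bayes <- qnorm(percentiles, mu(n), sigma(n))
  freq <- qnorm(percentiles, 150, 20 / sqrt(n))
  max(abs(bayes - freq))
}

ns <- c(10, 100, 1000, 10000)
data.frame(n = ns, gap = sapply(ns, endpoint_gap))
```

The gap shrinks steadily with n, which is consistent with the prior's influence vanishing as the data accumulate.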