6 Bayesian Methods
6.1 Reminder example: conditional probabilities
Related readings: Hoff Chs. 1, 3
6.2 Bayesian methods: Introduction via simple example
Suppose you want to estimate the fraction of a population that is infected with some disease.
\(\theta \in [0,1]\) : true value
Test a random sample of \(20\) from the population.
\(Y \in \{0,1,\ldots,20\}\) : # of positive results.
Question: What does realized value of \(Y\) tell us about the true value of \(\theta\)?
6.2.1 Sampling model
\(Y \mid \theta \sim \text{binomial}(20,\theta)\), i.e. the \(20\) individual test results are i.i.d. Bernoulli\((\theta)\). For \(y = 0, 1, \ldots, 20\),
\[l(y|\theta) = \Pr(Y=y | \theta) = {20 \choose y} \theta^y (1-\theta)^{(20-y)}\]
where \(\binom{n}{k} = \frac{n!}{k!(n-k)!}\)
\(l(y|\theta)\) is called the likelihood function.
Idea: For any \(0< \theta < 1\), all values of \(Y\) are possible, but some are more likely than others.
The likelihood function tells us how likely each possible observation is, for a given \(\theta\).
If, say, \(Y = 15\), that provides evidence that \(\theta\) is not small.
Core of Bayesian reasoning: work out all the different combinations of \(Y, \theta\) that could have generated the observed sample data.
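To make this concrete, here is a minimal sketch (using scipy's binomial pmf; the grid of \(\theta\) values and the hypothetical observation \(y = 15\) are choices made for illustration) that evaluates \(l(y|\theta)\) at a few candidate values of \(\theta\):

```python
import numpy as np
from scipy.stats import binom

n, y = 20, 15  # sample size and a hypothetical observed count of positives

# Evaluate the binomial likelihood l(y | theta) at a few candidate theta values.
thetas = np.array([0.05, 0.25, 0.50, 0.75, 0.95])
likelihoods = binom.pmf(y, n, thetas)

for t, l in zip(thetas, likelihoods):
    print(f"theta = {t:.2f}: l(15 | theta) = {l:.3g}")
```

Small values of \(\theta\) assign a vanishingly small probability to observing \(y = 15\), which is the sense in which \(Y = 15\) is evidence against small \(\theta\).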
6.2.2 Prior information
Suppose we have some background knowledge about the likely values of \(\theta\).
Represent this knowledge by means of a prior distribution \(\pi(\theta)\) over \([0,1]\).
Obviously, there are many (infinitely many) possible such distributions.
For convenience, we typically model the prior as a member of a parametrized family of distributions.
6.2.3 The Beta distribution
\[\theta \sim \text{beta}(a,b)\]
Then
\[E[\theta] = \frac{a}{a+b}\]
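For reference (the density itself is not stated above, but it is standard), the beta\((a,b)\) distribution has density on \([0,1]\)
\[\pi(\theta) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, \theta^{a-1}(1-\theta)^{b-1}, \quad 0 \le \theta \le 1,\]
from which the mean above follows.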
6.2.4
For our case, let’s suppose our prior beliefs correspond to:
\[\theta \sim \text{beta}(2,20)\]
6.2.5
\[\theta \sim \text{beta}(2,20)\]
implies
\[E[\theta] = \frac{2}{2+20} \approx 0.09,\]
i.e. the prior puts most of its mass on small values of \(\theta\).
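A small sketch (using scipy's beta distribution; the particular summaries printed are my own choice) of what this prior encodes:

```python
from scipy.stats import beta

a, b = 2, 20  # prior hyperparameters for the running example

prior = beta(a, b)
print(f"Prior mean E[theta]      : {prior.mean():.3f}")    # a / (a + b) = 2/22
print(f"Prior standard deviation : {prior.std():.3f}")
print(f"Pr(theta < 0.10)         : {prior.cdf(0.10):.3f}")  # most prior mass on small theta
```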
6.3 Bayes Theorem
Let \(\pi(\theta | y)\) denote our posterior distribution over values of \(\theta\).
This means: our updated beliefs about how probable the various values of \(\theta\) are, after we’ve received our test results.
Bayes Theorem says:
\[\pi(\theta | y) = \frac{l(y|\theta)\, \pi(\theta)}{\Pr(Y = y)} = \frac{l(y|\theta)\, \pi(\theta)}{\int_\Theta l(y|\tilde{\theta})\,\pi(\tilde{\theta})\, d\tilde{\theta}}\]
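As a rough sketch of what this formula computes, the integral in the denominator can be approximated numerically on a grid of \(\theta\) values (the observed count \(y = 0\) matches the example in the next subsection; the grid size is arbitrary):

```python
import numpy as np
from scipy.stats import binom, beta

n, y = 20, 0   # observed data: 0 positives out of 20 (as in the example below)
a, b = 2, 20   # beta prior hyperparameters

# Grid approximation: evaluate prior x likelihood at each theta, then normalize.
thetas = np.linspace(0.0005, 0.9995, 2000)
dtheta = thetas[1] - thetas[0]

unnormalized = binom.pmf(y, n, thetas) * beta.pdf(thetas, a, b)
posterior = unnormalized / (unnormalized.sum() * dtheta)  # approximates the denominator integral

post_mean = (thetas * posterior).sum() * dtheta
print(f"Approximate posterior mean: {post_mean:.3f}")  # close to 2/42, the beta(2,40) mean below
```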
6.3.1
Can be shown:
If \(\theta \sim \text{beta}(2,20)\) and \(Y = 0\), then \(\theta | y \sim \text{beta}(2,40)\).
More generally:
If \(\theta \sim \text{beta}(a,b)\) and \(Y = y\), then \(\theta | y \sim \text{beta}(a+y,b+20-y)\).
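A sketch of why this holds (dropping factors that do not depend on \(\theta\)):
\[\pi(\theta|y) \propto l(y|\theta)\,\pi(\theta) \propto \theta^{y}(1-\theta)^{20-y}\cdot\theta^{a-1}(1-\theta)^{b-1} = \theta^{(a+y)-1}(1-\theta)^{(b+20-y)-1},\]
which is the kernel of a beta\((a+y,\, b+20-y)\) density, so normalizing gives the stated posterior.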