
# Introduction to Bayesian statistics - Bolstad M.

Bolstad M. Introduction to Bayesian statistics - Wiley Publishing, 2004. - 361 p.
ISBN 0-471-27020-2

∫₀¹ g(π) × f(y|π) dπ
which requires an integration. Depending on the prior g(π) chosen, there may not be a closed form for the integral, so it may be necessary to do the integration numerically. We will look at some possible priors.
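As a minimal numerical sketch of that normalization, the following approximates the posterior density on a grid. The prior and the data (y = 7 successes in n = 10 trials) are made-up examples, not from the text:

```python
# Numerically normalizing the posterior for an arbitrary prior g(pi).
# The prior and data below are hypothetical, chosen only for illustration.

def likelihood(p, y, n):
    # binomial likelihood with the constant (n choose y) dropped
    return p**y * (1 - p)**(n - y)

def posterior(p, prior, y, n, steps=10000):
    # midpoint-rule approximation of the integral of prior * likelihood
    grid = [(i + 0.5) / steps for i in range(steps)]
    norm = sum(prior(q) * likelihood(q, y, n) for q in grid) / steps
    return prior(p) * likelihood(p, y, n) / norm

g = lambda p: 2 * p                      # hypothetical prior: a beta(2, 1) density
result = posterior(0.5, g, y=7, n=10)    # posterior density at pi = 0.5
print(round(result, 4))
```

With a prior that is itself a beta density, this numeric answer matches the closed-form beta posterior developed in the sections below, which is what makes the numerical route unnecessary in that case.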
8.1 USING A UNIFORM PRIOR
If we don't have any idea beforehand what the proportion π is, we might like to choose a prior that does not favor any one value over another. Or, we may want to be as objective as possible and not put our personal belief into the inference. In that case we should use the uniform prior, which gives equal weight to all possible values of the success probability π. Although this does not achieve universal objectivity (which is impossible to achieve), it is objective for this formulation of the problem:2

g(π) = 1 for 0 < π < 1.

1We know that for independent events (or random variables) the joint probability (or density) is the product of the marginal probabilities (or density functions). If they are not independent, this does not hold. Likelihoods come from probability functions or probability density functions, so the same pattern holds: they can only be multiplied when they are independent.
Clearly, we see that in this case, the posterior density is proportional to the likelihood:
g(π|y) ∝ (n choose y) π^y (1 − π)^(n−y) for 0 < π < 1.
We can ignore the part that doesn't depend on π. It is a constant for all values of π, so it doesn't affect the shape of the posterior. When we examine the part of the formula that shows the shape of the posterior as a function of π, we recognize that this is a beta(a, b) distribution where a = y + 1 and b = n − y + 1. So in this case the posterior distribution of π given y is easily obtained. All that is necessary is to look at the exponents of π and (1 − π). We didn't have to do the integration.
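As a sketch of reading the posterior off the exponents, the snippet below uses made-up data (y = 7 successes in n = 10 trials) and evaluates the resulting beta density with the standard gamma-function formula:

```python
from math import gamma

# Posterior under the uniform prior: read the exponents off the likelihood.
# Hypothetical data: y = 7 successes in n = 10 trials.
n, y = 10, 7
a, b = y + 1, n - y + 1          # beta(y + 1, n - y + 1) posterior

def beta_pdf(p, a, b):
    """Density of the beta(a, b) distribution at p."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * p**(a - 1) * (1 - p)**(b - 1)

print((a, b))                            # -> (8, 4)
density = beta_pdf(0.6, a, b)            # posterior density at pi = 0.6
print(round(density, 4))
```

No integration is performed anywhere: the normalizing constant comes for free from recognizing the beta form.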
8.2 USING A BETA PRIOR
Suppose a beta(a, b) prior density is used for π:

g(π; a, b) = [Γ(a + b) / (Γ(a)Γ(b))] π^(a−1) (1 − π)^(b−1) for 0 < π < 1.
The posterior is proportional to prior times likelihood. We can ignore the constants in the prior and likelihood that don't depend on the parameter, since we know multiplying either the prior or the likelihood by a constant won't affect the results of Bayes' theorem. This gives
g(π|y) ∝ π^(a+y−1) (1 − π)^(b+n−y−1) for 0 < π < 1,
which is the shape of the posterior as a function of π. We recognize that this is the beta distribution with parameters a′ = a + y and b′ = b + n − y. That is, we add the number of successes to a, and add the number of failures to b:
g(π|y) = [Γ(n + a + b) / (Γ(y + a)Γ(n − y + b))] π^(y+a−1) (1 − π)^(n−y+b−1)

for 0 < π < 1. Again, the posterior density of π has been easily obtained without having to go through the integration.
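The updating rule can be sketched as a one-line function; the prior parameters and data below are hypothetical examples:

```python
def update_beta(a, b, y, n):
    """Posterior beta parameters: add successes to a, failures to b."""
    return a + y, b + (n - y)

# Hypothetical example: beta(2, 3) prior, then y = 7 successes in n = 10 trials.
prior_params = (2, 3)
post_params = update_beta(*prior_params, 7, 10)
print(post_params)   # -> (9, 6)
```

The posterior mean (a′ / (a′ + b′)) and other summaries then follow directly from the beta(a′, b′) form.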
Figure 8.1 shows the shapes of beta(a, b) densities for values of a = .5, 1, 2, 3 and b = .5, 1, 2, 3. This shows the variety of shapes members of the beta(a, b) family can take. When a < b, the density has more weight in the lower half. The opposite is true when a > b. When a = b, the beta(a, b) density is symmetric. We note that the uniform prior is a special case of the beta(a, b) prior, where a = 1 and b = 1.
2There are many possible parameterizations of the problem. Any one-to-one function of the parameter would also be a suitable parameter. The prior density for the new parameter could be found from the prior density of the original parameter using the change-of-variable formula, and it would not be flat. In other words, it would favor some values of the new parameter over others. A prior can be objective in a given parameterization, but it will not be objective in the new formulation. Universal objectivity is not attainable.
Figure 8.1 Some beta distributions. (The panels show beta(a, b) densities for each combination of a = .5, 1, 2, 3 and b = .5, 1, 2, 3.)
The Conjugate Family of Priors for a Binomial Observation Is the Beta Family
When we examine the shape of the binomial likelihood function as a function of π, we see that it is of the same form as the beta(a, b) distribution: a product of π to a power times (1 − π) to another power. When we multiply the beta prior by the binomial likelihood, we add the exponents of π and (1 − π), respectively. So if we start with a beta prior, we get a beta posterior by the simple rule "add successes to a, add failures to b." This makes using beta(a, b) priors particularly easy when we have binomial observations. Using Bayes' theorem moves us to another member of the same family.
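A practical consequence of conjugacy, sketched below with made-up batch counts: because the posterior is again a beta, data can be processed in batches, and sequential updating gives the same answer as a single combined update.

```python
def update_beta(a, b, y, n):
    # conjugate update: add successes to a, failures to b
    return a + y, b + n - y

# Hypothetical batches: 3 successes in 5 trials, then 4 successes in 5 trials.
a, b = 1, 1                          # uniform prior, beta(1, 1)
a, b = update_beta(a, b, 3, 5)       # after the first batch
a, b = update_beta(a, b, 4, 5)       # after the second batch
print((a, b))                        # -> (8, 4)

# One combined update with all 10 trials gives the same posterior.
print(update_beta(1, 1, 7, 10))      # -> (8, 4)
```

This batch-invariance is exactly the "moves us to another member of the same family" property: each update's output is a valid input for the next.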