# Introduction to Bayesian statistics - Bolstad M.

ISBN 0-471-27020-2


• The union $B_1 \cup B_2 \cup \cdots \cup B_n = U$, the universe, and

• Every distinct pair of the events is disjoint, $B_i \cap B_j = \emptyset$ for $i = 1, \ldots, n$, $j = 1, \ldots, n$, and $i \neq j$.

Then we say the set of events $B_1, \ldots, B_n$ partitions the universe. An observable event $A$ will be partitioned into parts by the partition: $A = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n)$. $(A \cap B_i)$ and $(A \cap B_j)$ are disjoint since $B_i$ and $B_j$ are disjoint. Hence

$$P(A) = \sum_{j=1}^{n} P(A \cap B_j).$$
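The decomposition of $A$ into disjoint parts can be checked on a small finite universe. The following is a minimal sketch, assuming twelve equally likely outcomes and a hypothetical choice of the sets `B` and `A` (none of these values come from the text). It verifies that the events partition the universe and that the probability of `A` equals the sum of the probabilities of its disjoint parts.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical universe of 12 equally likely outcomes (an assumption for illustration).
universe = set(range(12))

# Hypothetical events B_1, ..., B_4 and a hypothetical event A.
B = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8}, {9, 10, 11}]
A = {2, 3, 4, 8, 9}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(universe))


def is_partition(events, universe):
    """True if the events are pairwise disjoint and their union is the universe."""
    covers = set().union(*events) == universe
    disjoint = all(not (x & y) for x, y in combinations(events, 2))
    return covers and disjoint


parts = [A & b for b in B]            # the disjoint parts A ∩ B_j
total = sum(prob(p) for p in parts)   # sum over j of P(A ∩ B_j)

print(is_partition(B, universe))
print(total == prob(A))
```

Exact fractions are used here so that the equality check does not depend on floating-point rounding.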

This is known as the law of total probability. It just says the probability of an event A is the sum of the probabilities of its disjoint parts. Using the multiplication rule on each joint probability gives

$$P(A) = \sum_{j=1}^{n} P(A|B_j) \times P(B_j).$$

The conditional probability $P(B_i|A)$ for $i = 1, \ldots, n$ is found by dividing each joint probability by the probability of the event $A$.

$$P(B_i|A) = \frac{P(A \cap B_i)}{P(A)}.$$

Using the multiplication rule to find the joint probability in the numerator, and the law of total probability in the denominator gives

$$P(B_i|A) = \frac{P(A|B_i) \times P(B_i)}{\sum_{j=1}^{n} P(A|B_j) \times P(B_j)}. \qquad (4.4)$$
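Equation 4.4 translates directly into a few lines of code. The sketch below is one possible implementation, not the book's; the function name `bayes_posterior` and the numerical values are hypothetical and chosen only for illustration. It applies the multiplication rule to get each joint probability, sums them to get $P(A)$ by the law of total probability, and divides.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(B_i | A) for each i, following Equation 4.4.

    priors[i]      is P(B_i); the values must sum to 1.
    likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")

    # Multiplication rule: P(A ∩ B_i) = P(A | B_i) * P(B_i).
    joints = [lik * pri for pri, lik in zip(priors, likelihoods)]

    # Law of total probability: P(A) = sum over j of P(A ∩ B_j).
    p_a = sum(joints)
    if p_a == 0:
        raise ValueError("P(A) is zero, so conditioning on A is undefined")

    return [joint / p_a for joint in joints]


# Hypothetical numbers chosen only for illustration.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.7]
posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

With these assumed inputs the third event has the smallest prior but the largest likelihood, and it ends up with the largest posterior probability.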


Figure 4.8 Four events $B_i$ for $i = 1, \ldots, 4$ that partition the universe $U$, and event $A$.

Figure 4.9 The reduced universe given event $A$ has occurred, together with the four events partitioning the universe.

This result is known as Bayes’ theorem. It was published in 1763, after the death of its discoverer, Reverend Thomas Bayes.

Example 5 Suppose $n = 4$. Figure 4.8 shows the four unobservable events $B_1, \ldots, B_4$ that partition the universe $U$, and an observable event $A$. Now let us look at the conditional probability of $B_i$ given $A$ has occurred. Figure 4.9 shows the reduced universe, given event $A$ has occurred. The conditional probabilities are the probabilities on the reduced universe, scaled up so they sum to 1. They are given by Equation 4.4.

Bayes’ theorem is really just a restatement of the conditional probability formula, where the joint probability in the numerator is found by the multiplication rule, and the marginal probability in the denominator is found using the law of total probability followed by the multiplication rule. Note how the events $A$ and $B_i$ for $i = 1, \ldots, n$ are not treated symmetrically. The events $B_i$ for $i = 1, \ldots, n$ are considered unobservable. We never know which one of them occurred. The event $A$ is an observable event. The marginal probabilities $P(B_i)$ for $i = 1, \ldots, n$ are assumed known before we start, and are called our prior probabilities.

Bayes’ Theorem: The Key to Bayesian Statistics

To see how we can use Bayes’ theorem to revise our beliefs on the basis of evidence, we need to look at each part. Let $B_1, \ldots, B_n$ be a set of unobservable events which partition the universe. We start with $P(B_i)$ for $i = 1, \ldots, n$, the prior probabilities of the events $B_i$. This distribution gives the weight we attach to each of the $B_i$ from our prior belief. Then we find that $A$ has occurred.

The likelihood of the unobservable events $B_1, \ldots, B_n$ is the conditional probability that $A$ has occurred given $B_i$ for $i = 1, \ldots, n$. Thus the likelihood of event $B_i$ is given by $P(A|B_i)$. We see the likelihood is a function defined on the events $B_1, \ldots, B_n$. The likelihood is the weight given to each of the events $B_i$ by the occurrence of $A$.

$P(B_i|A)$ for $i = 1, \ldots, n$ is the posterior probability of event $B_i$ given $A$ has occurred. This distribution contains the weight we attach to each of the events $B_i$ for $i = 1, \ldots, n$ after we know event $A$ has occurred. It combines our prior beliefs with the evidence given by the occurrence of event $A$.
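One way to read the three parts together is that the posterior is proportional to the prior times the likelihood, with the likelihood acting only as a set of relative weights. The sketch below illustrates this under assumed values; the helper names `normalize` and `posterior` and all the numbers are hypothetical. It shows that the likelihood values need not sum to 1, and that multiplying all of them by the same constant leaves the posterior unchanged.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior(priors, likelihoods):
    """Posterior is proportional to prior times likelihood."""
    return normalize([pri * lik for pri, lik in zip(priors, likelihoods)])


# Hypothetical values for four unobservable events B_1, ..., B_4.
priors = [0.4, 0.3, 0.2, 0.1]          # P(B_i), sums to 1
likelihoods = [0.2, 0.5, 0.1, 0.9]     # P(A | B_i), need not sum to 1

post = posterior(priors, likelihoods)

# Multiplying every likelihood by the same constant changes nothing,
# because the constant cancels when the products are normalized.
scaled = [3.0 * lik for lik in likelihoods]
post_scaled = posterior(priors, scaled)

for i, (pri, po) in enumerate(zip(priors, post), start=1):
    print(f"B_{i}: prior = {pri:.3f}, posterior = {po:.3f}")
```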

The Bayesian universe. We can get better insight into Bayes’ theorem if we think of the universe as having two dimensions, one observable and one unobservable. We let the observable dimension be horizontal, and let the unobservable dimension be vertical. The unobservable events no longer partition the universe haphazardly. Instead, they partition the universe as rectangles that cut completely across the universe in a horizontal direction. The whole universe consists of these horizontal rectangles in a vertical stack. Since we never observe which of these events occurred, we never know our vertical position in the Bayesian universe.

Observable events are vertical rectangles that cut the universe from top to bottom. We observe that vertical rectangle $A$ has occurred, so we observe the horizontal position in the universe.

Each event $B_i \cap A$ is a rectangle at the intersection of $B_i$ and $A$. The probability of the event $B_i \cap A$ is found by multiplying the prior probability of $B_i$ by the conditional probability of $A$ given $B_i$. This is the multiplication rule.

The event $A$ is the union of the disjoint parts $A \cap B_i$ for $i = 1, \ldots, n$. The probability of $A$ is clearly the sum of the probabilities of each of the disjoint parts. The probability of $A$ is found by summing the probabilities of each disjoint part down the vertical column represented by $A$. This is the marginal probability of $A$.
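The two-dimensional picture can be laid out as a small table, with one row per unobservable event and one column per observable event. The following is a minimal sketch under assumptions that go beyond the text: it supposes three unobservable events and three observable events that together cover the horizontal dimension, and every number is hypothetical. It builds the joint probabilities with the multiplication rule, sums down the observed column to get the marginal probability, and rescales that column to get the posterior.

```python
# Hypothetical Bayesian universe: rows are unobservable events B_1..B_3,
# columns are observable events A_1..A_3 (all numbers are assumptions).
priors = [0.5, 0.3, 0.2]  # P(B_i): heights of the horizontal rectangles

# likelihood[i][k] = P(A_k | B_i); each row sums to 1 because the observable
# events are assumed here to partition the horizontal dimension.
likelihood = [
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]

# Multiplication rule: P(B_i ∩ A_k) = P(B_i) * P(A_k | B_i).
joint = [[priors[i] * likelihood[i][k] for k in range(3)] for i in range(3)]

observed = 1  # index of the observed column, i.e. A_2 occurred

# Marginal probability of the observed event: sum down its column.
column = [joint[i][observed] for i in range(3)]
p_a = sum(column)

# Posterior: each cell of the column divided by the column total.
posterior = [cell / p_a for cell in column]

print(p_a)
print(posterior)
```

Only the observed column is used to form the posterior; the other columns describe events that did not occur and drop out of the calculation.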
