# Introduction to Bayesian statistics - Bolstad M.

ISBN 0-471-27020-2



We note that if we square the term in brackets, break the sum into three sums, and factor the constant terms out of each sum, we get

Var(Y) = Σₖ yₖ² × f(yₖ) - 2μ × Σₖ yₖ f(yₖ) + μ² × Σₖ f(yₖ)

= E(Y²) - μ².

Since μ = E(Y), this gives another useful formula for computing the variance:

Var(Y) = E(Y²) - [E(Y)]². (5.3)

Example 6 Let Y be a discrete random variable with probability function given in the following table.

| yᵢ | f(yᵢ) |
|----|-------|
| 0  | .20   |
| 1  | .15   |
| 2  | .25   |
| 3  | .35   |
| 4  | .05   |

To find E(Y) we use Equation 5.1, which gives

E(Y) = 0 × .20 + 1 × .15 + 2 × .25 + 3 × .35 + 4 × .05

= 1.90.
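As a check on this arithmetic, the sum in Equation 5.1 can be computed directly; a minimal Python sketch using the table from Example 6 (the name pf is just illustrative):

```python
# Probability function from Example 6: value -> probability
pf = {0: 0.20, 1: 0.15, 2: 0.25, 3: 0.35, 4: 0.05}

# Expected value E(Y) = sum over k of y_k * f(y_k)  (Equation 5.1)
mean = sum(y * p for y, p in pf.items())
print(round(mean, 2))  # 1.9
```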

80 DISCRETE RANDOM VARIABLES

Note that the expected value does not have to be a possible value of the random variable Y. It represents an average. We will find Var(Y) in two ways and see that they give equivalent results. First, we use the definition of variance given in Equation 5.2.

Var(Y) = (0 - 1.90)² × .20 + (1 - 1.90)² × .15 + (2 - 1.90)² × .25 + (3 - 1.90)² × .35 + (4 - 1.90)² × .05

= 1.49.

Second, we will use Equation 5.3. We calculate

E(Y²) = 0² × .20 + 1² × .15 + 2² × .25 + 3² × .35 + 4² × .05

= 5.10.

Putting that result in Equation 5.3, we get

Var(Y) = 5.10 - 1.90²

= 1.49.
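Both routes to the variance are easy to verify numerically; a short Python sketch using the Example 6 table (the names pf, var_def, and var_short are just illustrative):

```python
# Probability function from Example 6
pf = {0: 0.20, 1: 0.15, 2: 0.25, 3: 0.35, 4: 0.05}
mean = sum(y * p for y, p in pf.items())            # E(Y) = 1.90

# Definition (Equation 5.2): Var(Y) = sum of (y - mu)^2 * f(y)
var_def = sum((y - mean) ** 2 * p for y, p in pf.items())

# Shortcut (Equation 5.3): Var(Y) = E(Y^2) - [E(Y)]^2
ey2 = sum(y ** 2 * p for y, p in pf.items())        # E(Y^2) = 5.10
var_short = ey2 - mean ** 2

print(round(var_def, 2), round(var_short, 2))  # 1.49 1.49
```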

The Mean and Variance of a Linear Function of a Random Variable

Suppose W = a × Y + b, where Y is a discrete random variable. Clearly, W is another number that is the outcome of the same random experiment that Y came from. Thus W, a linear function of the random variable Y, is itself a random variable. We wish to find its mean.

E(aY + b) = Σₖ (ayₖ + b) × f(yₖ)

= Σₖ ayₖ × f(yₖ) + Σₖ b × f(yₖ)

= a Σₖ yₖ f(yₖ) + b Σₖ f(yₖ).

Since Σₖ yₖ f(yₖ) = μ and Σₖ f(yₖ) = 1, the mean of the linear function is the linear function of the mean:

E(aY + b) = aE(Y) + b. (5.4)

Similarly we may wish to know its variance.

Var(aY + b) = Σₖ (ayₖ + b - E(aY + b))² f(yₖ)

= Σₖ [a(yₖ - E(Y)) + b - b]² f(yₖ)

= a² Σₖ (yₖ - E(Y))² f(yₖ).


Thus the variance of a linear function is the square of the multiplicative constant a times the variance of Y:

Var(aY + b) = a² Var(Y). (5.5)

The additive constant b doesn’t enter into it.

Example 6 (continued) Suppose W = -2Y + 3. Then from Equation 5.4, we have

E(W) = -2E(Y) + 3

= -2 × 1.90 + 3

= -.80

and from Equation 5.5, we have

Var(W) = (-2)² × Var(Y)

= 4 × 1.49 = 5.96.
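Equations 5.4 and 5.5 can be checked against a direct computation over the distribution of W = aY + b; a Python sketch with the Example 6 table (variable names are illustrative):

```python
# Probability function from Example 6, and the linear function W = -2Y + 3
pf = {0: 0.20, 1: 0.15, 2: 0.25, 3: 0.35, 4: 0.05}
a, b = -2, 3

# Direct computation: treat a*y + b as the values of W
ew = sum((a * y + b) * p for y, p in pf.items())
varw = sum((a * y + b - ew) ** 2 * p for y, p in pf.items())

# Shortcut formulas (Equations 5.4 and 5.5) applied to E(Y) and Var(Y)
mean = sum(y * p for y, p in pf.items())
var = sum((y - mean) ** 2 * p for y, p in pf.items())

print(round(ew, 2), round(a * mean + b, 2))    # -0.8 -0.8
print(round(varw, 2), round(a ** 2 * var, 2))  # 5.96 5.96
```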

5.3 BINOMIAL DISTRIBUTION

Let us look at three situations and see what characteristics they have in common.

Coin tossing. Suppose we toss the same coin n times and count the number of heads that occur. We consider that any one toss is not influenced by the outcomes of previous tosses; in other words, the outcome of one toss is independent of the outcomes of previous tosses. Since we are always tossing the same coin, the probability of getting a head on any particular toss remains constant for all tosses. The possible values of the total number of heads observed in the n tosses are 0, ..., n.

Drawing from an urn with replacement. An urn contains balls of two colors, red and green. The proportion of red balls is π. We draw a ball at random from the urn, record its color, then return it to the urn and remix the balls before the next random draw. We make a total of n draws and count the number of times we drew a red ball. Since we replace and remix the balls between draws, each draw takes place under identical conditions. The outcome of any particular draw is not influenced by the previous draw outcomes. The probability of getting a red ball on any particular draw remains equal to π, the proportion of red balls in the urn. The possible values of the total number of red balls drawn are 0, ..., n.

Random sampling from a very large population. Suppose we draw a random sample of size n from a very large population. The proportion of items in the population having some attribute is π. We count the number of items in the sample that have the attribute. Since the population is very large compared to the sample size, removing a few items from the population does not perceptibly change the proportion of remaining items having the attribute; for all intents and purposes it remains π. The random draws are taken under almost identical conditions, and the outcome of any draw is not influenced by the previous outcomes. The possible values of the number of items drawn that have the attribute are 0, ..., n.
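Any of these scenarios is easy to simulate; a sketch of the urn scenario in Python (the red-ball proportion 0.4 and the number of draws 10 are arbitrary illustrative choices):

```python
import random

random.seed(1)
pi, n = 0.4, 10   # proportion of red balls and number of draws (illustrative)

def count_reds(pi, n):
    """One experiment: n independent draws with replacement; count the reds."""
    return sum(1 for _ in range(n) if random.random() < pi)

# Repeat the experiment many times; each result is an integer in 0, ..., n,
# and the long-run average count should be close to n * pi = 4.0.
trials = [count_reds(pi, n) for _ in range(100_000)]
print(sum(trials) / len(trials))  # close to 4.0
```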


Characteristics of the Binomial Distribution

These three cases all have the following things in common.

• There are n independent trials. Each trial can result either in a "success" or a "failure."

• The probability of "success" is constant over all the trials. Let π be the probability of "success."

• Y is the number of "successes" that occurred in the n trials. Y can take on integer values 0,1,...,n.

These are the characteristics of the binomial(n, π) distribution. The probability function of the binomial random variable Y given the parameter value π is written

f(y|π) = (n choose y) π^y (1 - π)^(n-y)  for y = 0, 1, ..., n,

where (n choose y) = n!/(y!(n - y)!) is the binomial coefficient, the number of ways to choose which y of the n trials are the successes.
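The binomial probability function is straightforward to compute; a Python sketch using math.comb for the binomial coefficient (the values n = 5 and π = 0.3 are arbitrary illustrative choices):

```python
from math import comb

def binomial_pmf(y, n, pi):
    """P(Y = y) for the binomial(n, pi) distribution."""
    return comb(n, y) * pi ** y * (1 - pi) ** (n - y)

n, pi = 5, 0.3   # illustrative parameter values
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]

print(round(sum(probs), 10))                               # 1.0  (probabilities sum to 1)
print(round(sum(y * p for y, p in enumerate(probs)), 10))  # 1.5  (mean = n * pi)
```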

Mean of binomial. The mean of the binomial(n, π) distribution is the sample size times the probability of success, since
