# Statistical analysis of finite mixture distributions - Smith A.F.M.

ISBN 0-470-90763-4


(c) Quasi-Bayes learning (QB)

Introduced in this context by Makov and Smith (1977a, b), this replaces $A_{11}$ and $A_{12}$ by their expectations, $w_1$ and $(1 - w_1)$, respectively, and so approximates $p(\pi|x_1)$ by

$$p(\pi|x_1) = B(\pi;\, \alpha_0 + w_1,\, \beta_0 + 1 - w_1). \tag{6.2.12}$$

Subsequent updating proceeds in an identical manner: with $p(\pi|x^{n-1}) = B(\pi;\, \alpha_{n-1}, \beta_{n-1})$, we update at the $n$th stage ($n > 1$) to $p(\pi|x^n) = B(\pi;\, \alpha_n, \beta_n)$, where

$$\alpha_n = \alpha_{n-1} + w_n, \qquad \beta_n = \beta_{n-1} + 1 - w_n,$$

and where $w_n$ is given by (6.2.6) with $\pi$ replaced by the current estimate $\pi^{(n-1)} = \alpha_{n-1}/(\alpha_{n-1} + \beta_{n-1})$.

Sequential problems and procedures

This method is thus seen to be one of 'fractional updating' (Titterington, 1976) of the parameters within the reproducing family of beta distributions. As we have noted, the required computation is minimal. Decisions regarding the sources of observations are again based upon the $w_n$, which are, in turn, defined by the $\pi^{(n-1)}$. It follows that the success of the procedure depends on the convergence to $\pi$ of the sequence $\pi^{(n)}$, $n = 1, 2, \ldots$. We shall consider the asymptotic properties of this and other procedures in Section 6.2.3. For the present, we simply note that the $\pi^{(n)}$ satisfy the recursive relationship

$$\pi^{(n+1)} = \pi^{(n)} - (\alpha_{n+1} + \beta_{n+1})^{-1}\left(\pi^{(n)} - w_{n+1}\right). \tag{6.2.13}$$
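The fractional-updating step can be sketched in Python. The two normal component densities and the data below are illustrative assumptions, not taken from the text; only the update rule itself follows (6.2.12):

```python
import math

def quasi_bayes_update(alpha, beta, x, f1, f2):
    """One quasi-Bayes step: replace the unknown component label by its
    expectation w_n, then do fractional updating of the beta parameters."""
    pi_hat = alpha / (alpha + beta)      # current point estimate of pi
    w = pi_hat * f1(x) / (pi_hat * f1(x) + (1.0 - pi_hat) * f2(x))
    return alpha + w, beta + 1.0 - w     # alpha_n, beta_n as in (6.2.12)

# Illustrative densities: N(0,1) and N(3,1) components.
f1 = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
f2 = lambda x: math.exp(-0.5 * (x - 3.0) ** 2) / math.sqrt(2.0 * math.pi)

alpha, beta = 1.0, 1.0                   # B(1, 1) uniform prior on pi
for x in [0.1, 2.9, -0.4, 3.2, 0.3]:
    alpha, beta = quasi_bayes_update(alpha, beta, x, f1, f2)
pi_n = alpha / (alpha + beta)            # posterior-mean estimate of pi
```

Note that each observation adds exactly one unit to $\alpha + \beta$ (since $w_n + (1 - w_n) = 1$), which is what makes the updating "fractional".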

(d) Learning with a probabilistic editor (PE)

This is the name given to the approach where $A_{n1}$, $A_{n2}$ are chosen so that the first two moments of (6.2.10), the beta distribution approximation, are identical to those of the mixture distribution (6.2.8). Variants of this approach in more complex situations are discussed in Owen (1975), where it is referred to as 'restricted Bayes', Harrison and Stevens (1976), and Athans, Whiting and Gruber (1977). We shall consider the procedure in more detail in Section 6.5, when we discuss mixture distributions arising in the context of multiprocess Kalman filtering.
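The moment-matching idea can be sketched as follows, assuming the exact one-step posterior is the two-component beta mixture $w\,B(\alpha + 1, \beta) + (1 - w)\,B(\alpha, \beta + 1)$ (the form that arises from a beta prior and a two-class mixture likelihood); the input numbers are illustrative:

```python
def beta_mean_m2(a, b):
    """Mean and second moment of a Beta(a, b) distribution."""
    m = a / (a + b)
    m2 = a * (a + 1.0) / ((a + b) * (a + b + 1.0))
    return m, m2

def pe_update(alpha, beta, w):
    """Probabilistic-editor step: fit a single beta whose first two moments
    match those of the mixture w*Beta(a+1, b) + (1-w)*Beta(a, b+1)."""
    m1a, m2a = beta_mean_m2(alpha + 1.0, beta)
    m1b, m2b = beta_mean_m2(alpha, beta + 1.0)
    m = w * m1a + (1.0 - w) * m1b        # mixture mean
    m2 = w * m2a + (1.0 - w) * m2b       # mixture second moment
    v = m2 - m * m                       # mixture variance
    s = m * (1.0 - m) / v - 1.0          # matched "sample size" a + b
    return m * s, (1.0 - m) * s

a, b = pe_update(2.0, 3.0, 0.7)          # illustrative prior B(2,3), w = 0.7
```

The design choice here is the standard beta moment inversion: given a target mean $m$ and variance $v$, the matching parameters are $\alpha = ms$, $\beta = (1 - m)s$ with $s = m(1 - m)/v - 1$.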

(e) The method of moments (MM)

This is defined by the simple recursion

$$\pi^{(n+1)} = \pi^{(n)} - \frac{1}{n+1}\left(\pi^{(n)} - \frac{x_{n+1} - m_2}{m_1 - m_2}\right), \tag{6.2.14}$$

where $m_i$ is the mean of the distribution corresponding to $f_i(\cdot)$, $i = 1, 2$. The method is based on the fact that $n^{-1}(x_1 + \cdots + x_n)$ converges, with probability one, to $\pi m_1 + (1 - \pi)m_2$ (see Patrick, Carayannopoulos, and Costello, 1966, Johnson, 1973, and Odell and Basu, 1976, for details and applications).
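Since the recursion (6.2.14) with gain $1/(n+1)$ is just a running average of the one-observation estimates $(x_{n+1} - m_2)/(m_1 - m_2)$, it is easy to check numerically; the component means below are illustrative:

```python
def mm_update(pi_n, x_next, n, m1, m2):
    """One method-of-moments step (6.2.14): shrink pi^(n) towards the
    one-observation estimate (x - m2)/(m1 - m2) with gain 1/(n+1)."""
    target = (x_next - m2) / (m1 - m2)
    return pi_n - (pi_n - target) / (n + 1.0)

# With m1 = 1 and m2 = 0 the estimate reduces to the sample mean of the x's,
# so the final value is exactly mean([1, 0, 1, 1]) = 0.75.
pi = 0.5                                  # initial estimate pi^(0)
xs = [1.0, 0.0, 1.0, 1.0]
for n, x in enumerate(xs):
    pi = mm_update(pi, x, n, 1.0, 0.0)
```

Note that the first step (gain 1) discards the initial estimate entirely, which is why the result does not depend on $\pi^{(0)}$.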

We note in passing, however, that the form of recursion given in (6.2.14) could be viewed as a special case of (6.2.11), where we set

$$w_{n+1} = \frac{x_{n+1} - m_2}{m_1 - m_2}.$$

6.2.2 The two-class problem: a maximum likelihood related procedure

In the case of an identifiable mixture, Kazakos (1977) has shown that the likelihood function defined by (6.1.2) is log-concave, so that, for fixed $n$, the maximum likelihood estimate is unique. However, its implementation for fixed $n$ requires the numerical solution of a set of non-linear equations, and the form of solution is non-recursive in nature. To overcome this difficulty, Kazakos develops a recursive algorithm based on a Newton-Raphson-type gradient algorithm for finding the minimum of the Kullback-Leibler directed divergence measure. Let $\pi^{(1)}, \pi^{(2)}, \ldots$ denote a sequence of estimates of $\pi$, where $\pi^{(n)}$ is based on $x_1, \ldots, x_n$,

and write

$$I(\pi^{(n)}, \pi) = \int p(x|\pi)\log\frac{p(x|\pi)}{p(x|\pi^{(n)})}\,dx = E[\log p(x|\pi)\,|\,\pi] - E[\log p(x|\pi^{(n)})\,|\,\pi], \tag{6.2.15}$$

where $p(x|\pi)$ is given by (6.2.1). It is easily shown that $I(\pi^{(n)}, \pi) \geq 0$ for $0 \leq \pi^{(n)} \leq 1$, with $I(\pi^{(n)}, \pi) = 0$ only if $\pi^{(n)} = \pi$, so that a minimal requirement for a 'good' sequence of estimates is that it should tend to the minimum of $I(\pi^{(n)}, \pi)$. If we approach the construction of such a sequence by means of a Newton-Raphson gradient algorithm, we are led to consider a recursive sequence of the form

$$\pi^{(n+1)} = \pi^{(n)} - a_{n+1}\left[\frac{\partial}{\partial \alpha} I(\alpha, \pi)\right]_{\alpha = \pi^{(n)}}, \tag{6.2.16}$$

where $a_1, a_2, \ldots$ is a suitable 'gain' sequence. If, in place of

$$\frac{\partial}{\partial \alpha} I(\alpha, \pi) = -E\left[\frac{\partial}{\partial \alpha}\log p(x|\alpha)\,\Big|\,\pi\right]$$

we substitute the obvious estimate

$$-\left[f_1(x_{n+1}) - f_2(x_{n+1})\right]\Big/\left[\pi^{(n)} f_1(x_{n+1}) + (1 - \pi^{(n)}) f_2(x_{n+1})\right],$$

(6.2.16) suggests the general recursive form

$$\pi^{(n+1)} = \pi^{(n)} + a_{n+1} L(\pi^{(n)})\,\frac{f_1(x_{n+1}) - f_2(x_{n+1})}{\pi^{(n)} f_1(x_{n+1}) + (1 - \pi^{(n)}) f_2(x_{n+1})}, \tag{6.2.17}$$

where $\pi^{(n)}$ is the estimate of $\pi$ after observing $x_1, \ldots, x_n$, $\pi^{(0)}$ is an initial estimate of $\pi$, and $L(\pi^{(n)})$ is an adjustable gain function assumed to be real-valued, positive, and bounded.
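The recursion (6.2.17) can be sketched directly. The component densities, the data, and the particular gain function $L(\pi) = \pi(1 - \pi)$ (a convenient positive, bounded choice on $(0, 1)$) are illustrative assumptions, not specified by the text:

```python
import math

def norm_pdf(x, mu):
    """Standard-variance normal density, used as an illustrative component."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def kazakos_update(pi_n, x, gain, f1, f2):
    """One step of (6.2.17): a stochastic-gradient step on the KL divergence,
    with the illustrative gain function L(pi) = pi * (1 - pi)."""
    num = f1(x) - f2(x)
    den = pi_n * f1(x) + (1.0 - pi_n) * f2(x)
    pi_next = pi_n + gain * pi_n * (1.0 - pi_n) * num / den
    return min(max(pi_next, 1e-6), 1.0 - 1e-6)   # keep the estimate in (0, 1)

f1 = lambda x: norm_pdf(x, 0.0)   # illustrative component 1
f2 = lambda x: norm_pdf(x, 3.0)   # illustrative component 2

pi = 0.5                          # initial estimate pi^(0)
data = [0.2, 3.1, -0.3, 0.1, 2.8, 0.4]
for n, x in enumerate(data):
    pi = kazakos_update(pi, x, 1.0 / (n + 1.0), f1, f2)
```

The ratio in the update is exactly the substituted estimate of $-\partial I/\partial\alpha$ above: the correction pushes $\pi^{(n)}$ up when $x_{n+1}$ is more plausible under $f_1$ and down when it is more plausible under $f_2$.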

6.2.3 Asymptotic and finite-sample comparisons of the quasi-Bayes and Kazakos procedures

A convenient way to approach the study of the asymptotic properties of the various proposed procedures is through the theory of stochastic approximation, which exploits the martingale structure implicit in recursions such as (6.2.13) and (6.2.17). General accounts of available results and their application to recursive estimation are given by Hall and Heyde (1980), and Nevel'son and Has'minskii (1973); see also Fabian (1978). For our purposes here, it will suffice to draw attention to two specific results, which give the flavour of the kinds of theorem available.


Consider the following recursively defined sequence of random variables $X_1, X_2, \ldots$ ($X_0$ arbitrary):

$$X_n = X_{n-1} - a_n\left[M(X_{n-1}) - \alpha\right] - a_n Z(X_{n-1}), \tag{6.2.18}$$
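A recursion of the form (6.2.18) is the classical Robbins-Monro scheme: under regularity conditions on $M$, the gains $a_n$, and the zero-mean noise $Z$, the iterates converge to the root of $M(x) = \alpha$. A minimal sketch, with an illustrative linear $M$ and Gaussian noise (both assumptions, not from the text):

```python
import random

def robbins_monro(M, alpha_target, x0, gains, noise):
    """Iterate X_n = X_{n-1} - a_n [M(X_{n-1}) - alpha] - a_n Z(X_{n-1}),
    as in (6.2.18), returning the final iterate."""
    x = x0
    for a in gains:
        x = x - a * (M(x) - alpha_target) - a * noise(x)
    return x

random.seed(0)
M = lambda x: 2.0 * x + 1.0                  # illustrative regression function
noise = lambda x: random.gauss(0.0, 0.1)     # Z(X_{n-1}): zero-mean noise
gains = [1.0 / n for n in range(1, 2001)]    # a_n = 1/n satisfies the usual
                                             # conditions sum a_n = inf, sum a_n^2 < inf
root = robbins_monro(M, 5.0, 0.0, gains, noise)  # true root of M(x) = 5 is x = 2
```

The gain choice $a_n = 1/n$ is the standard one: the steps are large enough to reach the root but shrink fast enough to average out the noise.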
