Smith A.F.M Statistical analysis of mixture distribution - Wiley publishing , 1985. - 130 p.
ISBN 0-470-90763-4

$$\hat\theta_{n+1} = \hat\theta_n - n^{-1} G(\hat\theta_n)\, y(\hat\theta_n, x_{n+1}), \qquad (6.3.1)$$
where, given $\theta$, the observations $x_n$ are independently distributed with common density $p(x|\theta)$, $\hat\theta_{n+1}$ is the estimate of $\theta$ based on $x_1, \ldots, x_{n+1}$, $G(\hat\theta_n)$ is an adjustable gain function whose explicit form will be discussed later, and $y(\theta, x)$ is defined by
$$y(\theta, x) = -\frac{\partial}{\partial\theta} \log p(x|\theta). \qquad (6.3.2)$$
The stochastic approximation recursion (6.3.1) is clearly aimed at finding the root of $y(\cdot)$ or, equivalently, through (6.3.2), the extremum of $\log p(x|\theta)$, which would coincide with the maximum likelihood estimator of $\theta$. Patrick (1972) gives a general discussion of such recursions, and we have already come across similar forms in our discussion of case A.
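As a concrete illustration (not from the text), the recursion (6.3.1) can be run in the simplest setting where $p(x|\theta)$ is the $N(\theta, 1)$ density, so that (6.3.2) gives $y(\theta, x) = \theta - x$; the constant gain $G = 1$, the true mean, and the starting value below are all arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

theta_true = 2.0   # the unknown mean; an arbitrary value for the illustration
G = 1.0            # constant gain, an illustrative choice for G(.)

def y(theta, x):
    # (6.3.2) for the N(theta, 1) density: -d/dtheta log p(x|theta) = theta - x
    return theta - x

theta_hat = 0.0    # arbitrary starting value
for n in range(1, 100001):
    x = rng.normal(theta_true, 1.0)                    # observation x_{n+1}
    theta_hat = theta_hat - G * y(theta_hat, x) / n    # recursion (6.3.1)

print(theta_hat)   # converges towards theta_true
```

For this particular model and gain the recursion reproduces the running sample mean, which makes the convergence easy to see.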
Defining
$$g(z, \theta) = E_\theta\, y(z, X), \qquad (6.3.3)$$
where the expectation is with respect to $p(x|\theta)$, we note the following.

Lemma 6.3.1
$g(z, \theta)$ has the following properties:
(a) $g(\theta, \theta) = 0$;
(b) there exists $\Theta' \subseteq \Theta$, a neighbourhood of $\theta$, such that
$$\inf\, (z - \theta)\, g(z, \theta) > 0 \quad \text{for } z \in \Theta',\ z \neq \theta. \qquad (6.3.4)$$
Proof: We note that
$$g(z, \theta) = \frac{\partial}{\partial z} J(z, \theta),$$
where
$$J(z, \theta) = \int \log\left[\frac{p(x|\theta)}{p(x|z)}\right] p(x|\theta)\, dx$$
is the Kullback-Leibler directed divergence between $p(x|\theta)$ and $p(x|z)$; (a) and (b) follow immediately, since it is well known that $J(z, \theta) \geq 0$, with equality if and only if $z = \theta$.
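The property $J(z, \theta) \geq 0$, with equality only at $z = \theta$, can be checked numerically; the sketch below (an illustration, not part of the text) takes $p(x|\theta)$ to be the $N(\theta, 1)$ density, for which the divergence has the closed form $(z - \theta)^2/2$:

```python
import numpy as np

# Kullback-Leibler directed divergence J(z, theta) between the unit-variance
# normal densities p(x|theta) and p(x|z), by numerical integration on a grid;
# for this example the closed form is (z - theta)**2 / 2.

def J(z, theta, xs):
    p_theta = np.exp(-0.5 * (xs - theta) ** 2) / np.sqrt(2.0 * np.pi)
    p_z = np.exp(-0.5 * (xs - z) ** 2) / np.sqrt(2.0 * np.pi)
    dx = xs[1] - xs[0]
    return float(np.sum(p_theta * np.log(p_theta / p_z)) * dx)

xs = np.linspace(-12.0, 12.0, 48001)   # integration grid
for z in (-1.0, 0.0, 0.5, 2.0):
    print(z, J(z, 0.0, xs))   # approximately 0.5, 0.0, 0.125 and 2.0
```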
Suppose we now assume that:

A1: $\theta \in (-M, M)$ for some known $M > 0$, such that $(-M, M) \cap \Theta' \neq \emptyset$;
A2: $z \in (-M, M) \cap \Theta'$;
A3: $\sup_z E[y^2(z, X)\,|\,\theta] < \infty$, $\theta \in \Theta'$;
A4: $G(z)$ is positive and bounded, with a bounded first derivative,

and define
$$U(z, \theta) = -G(z)\, g(z, \theta) \qquad (6.3.5)$$
and
$$R(z, x) = G(z)[y(z, x) - g(z, \theta)]. \qquad (6.3.6)$$
We can then establish the following properties.
Lemma 6.3.2
If assumptions A1 to A4 are satisfied, the quantities defined by (6.3.5) and (6.3.6) satisfy:
(a) $U(\theta, \theta) = 0$ and $(z - \theta)\, U(z, \theta) < 0$ for all $z \neq \theta$;
(b) $|U(z, \theta)| \leq k|z - \theta|$, for some $k > 0$, and $\inf |U(z, \theta)| > 0$, for $\varepsilon < |z - \theta| < \varepsilon^{-1}$ and for all $\varepsilon > 0$ such that assumption A2 is satisfied;
(c) $U(z, \theta) = \alpha(z - \theta) + o(|z - \theta|)$, for some $\alpha < 0$;
(d) (i) $\sup_z E[R^2(z, X)\,|\,z] < \infty$,
    (ii) $\lim_{z \to \theta} E[R^2(z, X)\,|\,z] = S(\theta) < \infty$;
(e) given $z$, the $R(z, X_i)$, $i = 1, 2, \ldots$, are identically distributed.
Proof: Part (a) follows immediately from (6.3.5) and Lemma 6.3.1.
To establish (b), we use the Taylor expansion
$$U(z, \theta) = U(\theta, \theta) + (z - \theta)\, U'(z^*, \theta) = (z - \theta)\, U'(z^*, \theta),$$
where $z^*$ lies between $z$ and $\theta$ and $U'(z^*, \theta)$ is the derivative of $U(z, \theta)$ with respect to $z$, evaluated at $z^*$. Our assumptions ensure that $|U'(z^*, \theta)|$ is uniformly bounded, which establishes the first part; the second part follows from the remarks made in the proof of Lemma 6.3.1.
For (c), we use the expansion
$$U(z, \theta) = -[(z - \theta)\, g'(\theta, \theta) + o(|z - \theta|)][G(\theta) + O(|z - \theta|)] = [-G(\theta)\, g'(\theta, \theta)](z - \theta) + o(|z - \theta|),$$
and note that
$$-g'(\theta, \theta) = \int \frac{\partial^2}{\partial\theta^2} \log p(x|\theta)\, p(x|\theta)\, dx = -I(\theta),$$
where $[I(\theta)]^{-1} > 0$ is the Cramer-Rao lower bound for a single observation from $p(x|\theta)$. The choice $\alpha = -G(\theta)\, I(\theta) < 0$ satisfies (c).

To establish (d), we note that $E[R(z, X)\,|\,z] = 0$, and so
$$E[R^2(z, X)\,|\,z] = G^2(z)\{E[y^2(z, X)\,|\,z] - g^2(z, \theta)\},$$
which is bounded by virtue of assumptions A1 to A4. We note that
$$E[y^2(z, X)\,|\,z] = E\left[\left(\frac{\partial}{\partial z} \log p(X|z)\right)^2 \,\Big|\, z\right]$$
and hence, taking the limit as $z$ tends to $\theta$ in the above, we see from the expression for $g'(\theta, \theta)$ that
$$\lim_{z \to \theta} E[R^2(z, X)\,|\,z] = G^2(\theta)\, I(\theta).$$
The choice $S(\theta) = G^2(\theta)\, I(\theta) < \infty$ satisfies (d).

The final property, (e), follows straightforwardly from the assumed independence, given $z$, of the $X_i$, $i = 1, 2, \ldots$.
The following lemma will be used to establish the asymptotic properties of (6.3.1).

Lemma 6.3.3
Suppose that the conditions of Lemma 6.3.2 are satisfied and that $|\alpha| > \tfrac{1}{2}$. Then, for the recursion defined by (6.3.1), $n^{1/2}(\hat\theta_n - \theta)$ is asymptotically normally distributed with zero mean and variance
$$V = S(\theta)(2|\alpha| - 1)^{-1}.$$
Proof: This follows from an application of Theorem 6.2.2.
The main result is now the following.
Theorem 6.3.1
If, in the recursion defined by (6.3.1), $G(\hat\theta_n) > [2I(\theta)]^{-1}$ and assumptions A1 to A4 are satisfied, then $n^{1/2}(\hat\theta_n - \theta)$ is asymptotically normally distributed, with zero mean and variance
$$\frac{G^2(\theta)\, I(\theta)}{2G(\theta)\, I(\theta) - 1}. \qquad (6.3.7)$$
Proof: Truncation does not, of course, affect convergence properties (see, for example, Davisson, 1970) and the result follows immediately from Lemmas 6.3.2 and 6.3.3.
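A Monte Carlo sketch (illustrative, not from the text) can be used to check (6.3.7): taking $p(x|\theta)$ to be $N(\theta, 1)$, so that $I(\theta) = 1$, and an arbitrary constant gain $c > \tfrac{1}{2}$, the sample variance of $n^{1/2}(\hat\theta_n - \theta)$ over repeated runs should approach $c^2 I(\theta)/[2cI(\theta) - 1]$:

```python
import numpy as np

rng = np.random.default_rng(1)

theta = 1.5        # true parameter, arbitrary for the illustration
c = 1.0            # constant gain G(.) = c; note c > [2 I(theta)]^{-1} = 0.5
n_steps, reps = 2000, 4000

# For N(theta, 1), I(theta) = 1, so (6.3.7) predicts variance
# c**2 * I / (2 * c * I - 1).
predicted = c ** 2 / (2.0 * c - 1.0)

est = np.zeros(reps)   # `reps` independent runs of recursion (6.3.1)
for n in range(1, n_steps + 1):
    x = rng.normal(theta, 1.0, size=reps)
    est -= c * (est - x) / n    # y(z, x) = z - x is the normal score (6.3.2)

v = n_steps * np.mean((est - theta) ** 2)   # estimate of n * var(theta_hat_n)
print(v, predicted)                          # the two should be close
```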
Corollary 6.3.1
A fully asymptotically efficient procedure corresponds to the choice $G(z) = [I(z)]^{-1}$.

Proof: With this choice, (6.3.7) reduces to $V_{\text{opt}} = [I(\theta)]^{-1}$.
Corollary 6.3.2
Given A1 to A4, the relative asymptotic efficiency of $\hat\theta_n$ with constant gain $G(z) = c > [2I(\theta)]^{-1}$ is given by
$$\frac{V_{\text{opt}}}{V} = \frac{2cI(\theta) - 1}{[cI(\theta)]^2}.$$
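Substituting $G(z) = c$ into (6.3.7) and dividing $V_{\text{opt}} = [I(\theta)]^{-1}$ by the result gives the ratio $(2cI(\theta) - 1)/[cI(\theta)]^2$, which attains its maximum of 1 at $c = [I(\theta)]^{-1}$; a quick sketch (the value $I(\theta) = 1$ below is an arbitrary illustration):

```python
# Relative asymptotic efficiency V_opt / V for constant gain c, from (6.3.7):
# V = c**2 * I / (2*c*I - 1) and V_opt = 1 / I, so eff = (2*c*I - 1) / (c*I)**2.

def efficiency(c, I):
    if c <= 1.0 / (2.0 * I):
        raise ValueError("gain must exceed [2 I(theta)]^{-1}")
    return (2.0 * c * I - 1.0) / (c * I) ** 2

I = 1.0  # Fisher information of a unit-variance normal, as an illustration
print(efficiency(1.0, I))   # the optimal gain c = 1/I attains efficiency 1
print(efficiency(2.0, I))   # a larger constant gain loses efficiency (3/4)
```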
We shall now use these properties of the general recursion (6.3.1) to examine a number of specific important practical case B problems and some proposed sequential estimation procedures.
6.3.2 Unsupervised learning for signal versus noise
We shall consider the case of a sequence of observations, $x_1, x_2, \ldots$, each of which is either a signal, assumed to have a normal distribution with unit variance and unknown mean $\theta$, or noise, assumed to have a normal distribution with unit variance and zero mean. The corresponding normal densities will be denoted by $f_1(x|\theta)$ and $f_2(x|\theta) = f_2(x)$, respectively. The a priori probabilities of signal and noise will be assumed constant and known, and are denoted by $\pi_1$ and $\pi_2$ ($= 1 - \pi_1$). Given $\theta$, $\pi_1$, $\pi_2$, observations will be assumed independent, with common mixture density