
Statistical analysis of mixture distribution - Smith A.F.M

Smith A.F.M Statistical analysis of mixture distribution - Wiley publishing, 1985. - 130 p.
ISBN 0-470-90763-4

1 0.25 0.51 0.34 0.51 0.42 0.51
(33.01) (25.12) (46.71) (63.11) (25.00) (43.39)
2 7.29 10.08 9.36 10.08 10.51 10.08
(22.05) (17.74) (25.73) (16.26) (16.28) (14.51)
3 31.41 35.92 35.13 35.92 36.78 35.92
(19.57) (23.54) (43.91) (29.63) (29.01) (23.46)
quite informative, despite the fact that an iterative technique will be required to obtain estimates of parameters. Figure 5.7.1 shows the discriminant boundaries obtained with the two-dimensional sepal data from the Iris versicolour and Iris setosa samples of Fisher (1936). In the diagram, taken from O’Neill (1978), the boundaries that result from taking the data as ‘fully categorized’ and ‘uncategorized’ are remarkably close together. The two samples are, however, very well separated.
Further empirical results are given by Titterington (1976) and Makov (1980a) using the sequential updating methods described in Chapter 6. In spite of the comparative ease of implementation of the sequential procedures, the use of maximum likelihood estimates, when practicable, is clearly desirable. Han (1978) considers the situation where there are some fully categorized cases and a further sample, all of which come from the same, but unknown, component population. (The underlying model is assumed to be a mixture of two multivariate normals with equal covariance matrices.) The problem of updating a logistic discriminant function is mentioned in Anderson (1979), who suggests the use of maximum likelihood estimates from M1 data. Of course, the logistic approach is based on the diagnostic paradigm and it would appear that, on the basis of previous remarks, the uncategorized cases have nothing to offer. However, Anderson (1979) mixes the two parameterizations, using the n in (5.7.3) and p in (5.7.4) as basic parameters. Thus y in (5.7.4) is expressed as a function of p and n and the uncategorized data are therefore worth incorporating into the discriminant rule.
Ganesalingam and McLachlan (1979b) compare, empirically, the discriminant rule from maximum likelihood with one estimated using the ‘cluster analysis’ approach of Section 4.3.4.
McLachlan (1975, 1977) discusses a method which lies somewhere between these two approaches and which is used for M2 data, containing both fully categorized and uncategorized cases. The method is based on an iterative procedure somewhat reminiscent of, but not equivalent to, the EM algorithm.
[Figure 5.7.1 Linear discriminants for Iris setosa and Iris versicolour data. Boundaries shown: MLE from unclassified observations; MLE from classified observations; preliminary partition boundary on which initial iterates for the EM algorithm were based. Axis label: sepal length. Reproduced by permission of the American Statistical Association from O’Neill (1978).]
In general, these parametric procedures rely either on maximum likelihood estimation or on some approximation thereto, possibly involving sequential incorporation of the uncategorized cases. Given consistency, it should pay, on average, to incorporate the uncategorized cases if an ‘imitator’ of the optimal discriminant rule is then used. As is clear from the above, most of the detailed work has been concentrated on mixtures of two multivariate normals with equal covariance matrices.
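As a rough illustration of the setting on which most of this work concentrates, the sketch below runs EM for a mixture of two multivariate normals with a common covariance matrix, using categorized and uncategorized cases together. It is not the procedure of any of the authors cited. The function name `fit_partially_labelled`, the coding of uncategorized cases as `-1`, and the fixed iteration count are illustrative assumptions, and the categorized cases are assumed to have been sampled from the mixture, so that their labels also inform the mixing weight.

```python
import numpy as np

def fit_partially_labelled(X, labels, n_iter=200):
    """EM sketch: two normal components sharing one covariance matrix.

    X      : (N, d) array of observations.
    labels : length-N array, 0 or 1 for categorized cases, -1 for uncategorized.
    Assumption: at least one categorized case is available from each component.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    N = X.shape[0]
    known = labels >= 0
    # r[i] = weight of case i on component 1; fixed at the label when known.
    r = np.where(known, labels, 0.5).astype(float)
    for _ in range(n_iter):
        # M-step: weighted mixing proportion, means and pooled covariance.
        pi1 = r.mean()
        mu1 = r @ X / r.sum()
        mu0 = (1.0 - r) @ X / (1.0 - r).sum()
        D1, D0 = X - mu1, X - mu0
        S = ((D1 * r[:, None]).T @ D1 + (D0 * (1.0 - r)[:, None]).T @ D0) / N
        # E-step: with equal covariances the posterior log-odds are linear in x.
        w = np.linalg.solve(S, mu1 - mu0)
        b = np.log(pi1 / (1.0 - pi1)) - 0.5 * (mu1 + mu0) @ w
        post = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        r = np.where(known, labels, post)  # categorized cases keep their labels
    return {"pi1": pi1, "mu0": mu0, "mu1": mu1, "cov": S, "w": w, "b": b}
```

The returned `w` and `b` define a linear rule (allocate x to component 1 when `x @ w + b > 0`), which plays the role of the ‘imitator’ of the optimal discriminant rule. A fixed number of iterations is used only to keep the sketch short; a convergence check on the log-likelihood would be the more usual choice.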
More general procedures are possible if the assumption of a parametric model is dropped. Several ad hoc suggestions are made by Murray and Titterington (1978) based on methods described in Example 4.3.9. Also mentioned in Example 4.3.9, in the context of density estimation, was the penalized likelihood method. It can be used to obtain non-parametric estimates of the density ratio itself, thus providing a direct non-parametric version of the linear logistic approach. Suppose $k = 2$, $d = 1$, and we have a set of mixture data along with a set of observations from the first component. The two sample sizes are $n$ and $n_1$, respectively, and the densities are $p(x)$ and $f_1(x)$. The objective is to estimate
$$v(x) = p(x)/f_1(x) = \pi_1 + (1 - \pi_1) f_2(x)/f_1(x).$$
A plot of $v(x)$ against $x$ is approximately constant, at level $\pi_1$, in regions where $f_2(x)/f_1(x)$ is small. If $n$ and $n_1$ are reasonably large, we may regard the data as realizations of two independent inhomogeneous Poisson processes and, if an estimate $\hat{\mu}(x)$ can be made of $\mu(x)$, the ratio of the intensity of the mixture process to that of the pure process, an estimate of $v(x)$ is given by
$$\hat{v}(x) = (n_1/n)\,\hat{\mu}(x).$$
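A small numerical check may make the two statements above concrete. The sketch assumes two unit-variance normal components with means 0 and 4, a mixing weight of 0.3 on the first component, and sample sizes 500 and 200; all of these values are hypothetical and chosen only for illustration. It also takes the intensity ratio to be n p(x) / (n_1 f_1(x)), which is how the rescaling by n_1/n recovers the density ratio.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))

def density_ratio(x, pi1, f1, f2):
    """v(x) = p(x) / f1(x), where p = pi1 * f1 + (1 - pi1) * f2."""
    return pi1 + (1.0 - pi1) * f2(x) / f1(x)

def v_hat(mu_hat, n, n1):
    """Rescale an estimated intensity ratio into an estimate of v(x)."""
    return (n1 / n) * mu_hat

# Hypothetical components, mixing weight and sample sizes.
pi1 = 0.3
f1 = lambda x: normal_pdf(x, 0.0, 1.0)
f2 = lambda x: normal_pdf(x, 4.0, 1.0)
n, n1 = 500, 200

x = np.linspace(-3.0, 1.0, 5)
v = density_ratio(x, pi1, f1, f2)  # close to pi1 where f2/f1 is negligible

# With the true intensity ratio in place of an estimate, v is recovered exactly.
mu = n * (pi1 * f1(x) + (1.0 - pi1) * f2(x)) / (n1 * f1(x))
v_recovered = v_hat(mu, n, n1)
```

In the left-hand tail, where the second component contributes almost nothing, the values in `v` sit at the mixing weight, and they rise as the second component starts to matter.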
Suppose the combined order statistic is $z_1, \ldots, z_N$, where $N = n + n_1$. Then, so far as the intensity ratio is concerned, we may restrict our attention (Silverman, 1978) to the conditional log-likelihood
$$\mathcal{L}(\alpha) = \sum_{i=1}^{N} \left( e_i\,\alpha(z_i) - \log\{1 + \exp[\alpha(z_i)]\} \right),$$
where $\alpha(z) = \log[\mu(z)]$ and $e_i = 1$ if $z_i$ comes from the mixture data, $= 0$ otherwise.
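The conditional log-likelihood is simple to evaluate once values of the log intensity ratio at the ordered points are given. The sketch below is only an evaluation routine, not an estimation method; the name `conditional_loglik` and the five indicator values are hypothetical. It compares a constant log-ratio with one that follows the indicators steeply.

```python
import numpy as np

def conditional_loglik(alpha, e):
    """Sum over i of e_i * alpha_i - log(1 + exp(alpha_i)).

    alpha : values of the log intensity ratio at the ordered points z_1..z_N.
    e     : indicators, 1 if the point comes from the mixture sample, else 0.
    """
    alpha = np.asarray(alpha, dtype=float)
    e = np.asarray(e, dtype=float)
    # log(1 + exp(a)) is computed stably as logaddexp(0, a).
    return float(np.sum(e * alpha - np.logaddexp(0.0, alpha)))

# Hypothetical indicators: three mixture points and two pure-component points.
e = np.array([1, 0, 1, 1, 0])

# Constant log-ratio equal to the log-odds of the observed proportions.
ll_const = conditional_loglik(np.full(5, np.log(3 / 2)), e)

# A log-ratio that tracks the indicators point by point.
ll_steep = conditional_loglik(np.where(e == 1, 20.0, -20.0), e)
```

Every term of the sum is negative, so the log-likelihood is bounded above by zero, and `ll_steep` lies far closer to that bound than `ll_const` does.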
Since we may maximize $\mathcal{L}$ without any parametric restriction, we obtain the degenerate solution $\alpha(z_i) = \infty$ if $e_i = 1$ and $\alpha(z_i) = -\infty$ otherwise. To avoid this, a roughness penalty is imposed on the form of $\alpha(\cdot)$. Specifically, we maximize, for some constant $K$,