in black and white
Main menu
Home About us Share a book
Biology Business Chemistry Computers Culture Economics Fiction Games Guide History Management Mathematical Medicine Mental Fitnes Physics Psychology Scince Sport Technics

Kalman Filtering Neural Networks - Haykin S.

Haykin S. Kalman Filtering Neural Networks - Wiley publishing , 2001. - 202 p.
ISBNs: 0-471-36998-5
Download (direct link): kalmanfilteringneuralnetworks2001.pdf
Previous << 1 .. 56 57 58 59 60 61 < 62 > 63 64 65 66 67 68 .. 72 >> Next

1 N ...
P(xkIY0) = N £ _ 4 )’
N i=i
where the random samples {x^; i — 1,..., N}, are drawn from ^(xk|Y0)
and d(-) denotes the Dirac delta function. The posterior filtering density pxk | Yo) is a marginal of the full posterior density given by pXo | Y0). Consequently, any expectations of the form
For example, letting g(x) — x yields the optimal MMSE estimate Xk — E[xk| Yo]. The particles are assumed to be independent and identically distributed (i.i.d) for the approximation to hold. As N goes to infinity, the estimate converges to the true expectation almost surely. Sampling from the filtering posterior is only a special case of Monte Carlo simulation, which in general deals with the complete posterior density px01 Yo). We shall use this more general form to derive the particle filter algorithm.
It is often impossible to sample directly from the posterior density function. However, we can circumvent this difficulty by making use of importance sampling and alternatively sampling from a known proposal distribution q(x01 Yo). The exact form of this distribution is a critical design issue, and is usually chosen in order to facilitate easy sampling. The details of this are discussed later. Given this proposal distribution, we can make use of the following substitution:
E(g(xk)) — g(xkMxk 1 Yo) dxk
may be approximated by the following estimate:
where the variables wk(X0) are known as the unnormalized importance weights,
PY0 I x0)p(x0)
Wk q(x0 |Y0)
We can get rid of the unknown normalizing density pY0) as follows:
E(gk(X0)) -
gk(X0)wk(X0)q(X0|Y0) dX0
pY0), f gk(X0)wk(X0)q(X0|Y0) dX0
j g|§dX0
f gk(X0)wk(X0)q(X0|Y0) dX0 Jwk(X0)q(X0|Y0) dX0
Eq(.|Y0)K (X0)gk (X0)) Eq(.|Y0)(wk (X0))
where the notation Eq(.|Yk) has been used to emphasize that the expectations are taken over the proposal distribution q(.|Y0).
A sequential update to the importance weights is achieved by expanding the proposal distribution as q(X0|Y0) = q(X0—1|Y0—1 )q(xk|X0—1, Y0), where we are making the assumption that the current state is not dependent on future observations. Furthermore, under the assumption that the states correspond to a Markov process and that the observations are conditionally independent given the states, we can arrive at the recursive update:
piik IxtMxt lxk-1) (n ^
Wk=Wk-1 ,(xtlX$-1.Y0) ■ (7:99)
Equation (7.99) provides a mechanism to sequentially update the importance weights given an appropriate choice of proposal distribution, q(xk|X0-1, Y0). Since we can sample from the proposal distribution and evalute the likelihoodpyk|xk) and transition probabilities p(xk|xk-1). all we need to do is generate a prior set of samples and iteratively compute the importance weights. This procedure then allows us to evaluate the expectations of interest by the following estimate:
N ^ g(xfewk(x0i)k)
i= 1
= E g(xoi)k)lwk«kX
N wk (xfe
where the normalized importance weights w® — w£V £j— 1 wj and x0i)k denotes the ith sample trajectory drawn from the proposal distribution q(xk | xQ-1, Yq). This estimate asymptotically converges if the expectation and variance of g(xQ) and wk exist and are bounded, and if the support of the proposal distribution includes the support of the posterior distribution. Thus, as N tends to infinity, the posterior density function can be approximated arbitrarily well by the point-mass estimate
p(x0 |Y0) — £ wk!)d(x0 - *&) (7.101)
7 — 1
and the posterior filtering density by
P(xk|YQ) — £ Wk!)d(xk - 4). (7.102)
In the case of filtering, we do not need to keep the whole history of the sample trajectories, in that only the current set of samples at time k is needed to calculate expectations of the form given in Eq. (7.96) and
(7.97). To do this, we simply set, g(xQ) — g(xk). These point-mass estimates can approximate any general distribution arbitrarily well, limited only by the number of particles used and how well the above-mentioned
importance sampling conditions are met. In contrast, the posterior distri-
bution calculated by the EKF is a minimum-variance Gaussian approximation to the true distribution, which inherently cannot capture complex structure such as multimodalities, skewness, or other higher-order moments.
Resampling and MCMC Step The sequential importance sampling (SIS) algorithm discussed so far has a serious limitation: the variance of the importance weights increases stochastically over time. Typically, after a few iterations, one of the normalized importance weights tends to unity, while the remaining weights tend to zero. A large number of samples are thus effectively removed from the sample set because their importance weights become numerically insignificant. To avoid this degeneracy, a resampling or selection stage may be used to eliminate samples with low importance weights and multiply samples with high importance weights. This is often followed by a Markov-chain Monte Carlo (MCMC) move step, which introduces sample variety without affecting the posterior distribution they represent.
Previous << 1 .. 56 57 58 59 60 61 < 62 > 63 64 65 66 67 68 .. 72 >> Next