# Chemometrics: From Basics to Wavelet Transform - Chau F.T

ISBN 0-471-20242-8


where I is an identity matrix.

digital smoothing and filtering methods


From the discussion above, it can be seen that the Kalman gain vector can be deduced through Equation (2.16) if the initial values of x(k) and P(k), say, x(0) and P(0), are known. Then, the next x(k) and P(k) can be computed through Equations (2.15) and (2.17) until convergence is attained.

In summary, the procedure of Kalman filtering can be carried out via the following steps:

1. Setting the initial values:

$$x(0) = 0, \qquad P(0) = \gamma^2 I \tag{2.18}$$

where $\gamma^2$ is an initial estimate of the variance of the measurement noise, which may be obtained from the following empirical formula:

$$\gamma^2 = a\,\frac{r(1)}{[h(1)^{\mathrm{t}} h(1)]^{1/2}} \tag{2.19}$$

The factor $a$ influences the accuracy of the calculation and can take values from 10 to 100. Note that the initial value $P(0)$ is crucial for the estimation. If it is too small, the estimates can be biased; if it is too large, it is difficult for the computation to converge to the desired value.

2. Recursive calculation loop:

$$g(k) = P(k-1)\,h(k)\,[h(k)^{\mathrm{t}} P(k-1)\,h(k) + r(k)]^{-1}$$

$$x(k) = x(k-1) + g(k)\,[y(k) - h(k)^{\mathrm{t}} x(k-1)]$$

$$P(k) = [I - g(k)\,h(k)^{\mathrm{t}}]\,P(k-1)\,[I - g(k)\,h(k)^{\mathrm{t}}]^{\mathrm{t}} + g(k)\,r(k)\,g(k)^{\mathrm{t}}$$

where $r(k)$ is the variance of the measurement noise, which can be estimated from the variance of the real noise. The loop is repeated until the estimates become stable.
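The two steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the book's implementation: it assumes scalar measurements y(k) with a known vector h(k), takes the initial variance as an argument (called `p0` here, the diagonal value of P(0)), and updates P(k) with the gain computed in the same iteration. The function name `kalman_filter` and the example data are hypothetical.

```python
def kalman_filter(y, H, r, p0):
    """Estimate a constant parameter vector x from scalar measurements
    y[k] = h(k)^t x + noise, where H[k] is the vector h(k) and r[k] is
    the noise variance of measurement k."""
    n = len(H[0])
    # Step 1: initial values x(0) = 0 and P(0) = p0 * I.
    x = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    innovations = []
    # Step 2: recursive loop over the measurements.
    for yk, h, rk in zip(y, H, r):
        # Gain: g = P h / (h^t P h + r); the bracket is a scalar here.
        Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
        s = sum(h[i] * Ph[i] for i in range(n)) + rk
        g = [v / s for v in Ph]
        # Innovation: measurement minus its prediction from x(k-1).
        u = yk - sum(h[i] * x[i] for i in range(n))
        innovations.append(u)
        # State update.
        x = [x[i] + g[i] * u for i in range(n)]
        # Covariance update: (I - g h^t) P (I - g h^t)^t + g r g^t.
        A = [[(1.0 if i == j else 0.0) - g[i] * h[j] for j in range(n)]
             for i in range(n)]
        AP = [[sum(A[i][m] * P[m][j] for m in range(n)) for j in range(n)]
              for i in range(n)]
        P = [[sum(AP[i][m] * A[j][m] for m in range(n)) + g[i] * rk * g[j]
              for j in range(n)] for i in range(n)]
    return x, P, innovations


# Hypothetical two-component mixture: each row of H holds the responses of
# the two pure components at one measurement point (made-up numbers).
H = [[1.0, 0.2], [0.8, 0.5], [0.5, 0.9], [0.2, 1.0], [0.1, 0.6], [0.05, 0.3]]
true_x = [2.0, 3.0]
y = [h[0] * true_x[0] + h[1] * true_x[1] for h in H]  # noise-free signal
r = [1e-4] * len(y)
x_hat, P_final, innov = kalman_filter(y, H, r, p0=100.0)
```

In this made-up example the estimate `x_hat` approaches the two concentrations used to build the signal, and the diagonal of `P_final` shrinks as more measurements are processed.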

In the Kalman filtering algorithm, the innovative series is very important because it indicates whether the results obtained are reliable. The innovative series is obtained from the following equation:

$$u(k) = y(k) - h(k)^{\mathrm{t}} x(k-1) \tag{2.20}$$

The series is the difference between each measurement and its estimate, and can be regarded as the residual at point $k$. If the filtering model used is correct, the innovative series should be white noise with zero mean. Otherwise, the results obtained are not reliable.
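A rough way to inspect the innovative series is to look at its mean and at the correlation between neighbouring values, both of which should be close to zero for white noise. The sketch below is only an informal check, not a formal whiteness test; the helper name `innovation_diagnostics` and the two example series are hypothetical.

```python
def innovation_diagnostics(u):
    """Return the mean and the lag-1 autocorrelation of the series u."""
    n = len(u)
    mean = sum(u) / n
    dev = [v - mean for v in u]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        # A constant series has no variation to correlate.
        return mean, 0.0
    lag1 = sum(dev[i] * dev[i + 1] for i in range(n - 1)) / denom
    return mean, lag1


# Two made-up series: one that alternates in sign and one that drifts.
alternating = [1.0, -1.0, 1.0, -1.0]
drifting = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_alt, lag_alt = innovation_diagnostics(alternating)
mean_drift, lag_drift = innovation_diagnostics(drifting)
```

A mean far from zero or a lag-1 autocorrelation far from zero, as in the drifting series, suggests that the filtering model does not describe the data well.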

36 one-dimensional signal processing techniques in chemistry

Kalman filtering can be applied to filtering, smoothing, and prediction. Its most common application is in multicomponent analysis.

2.1.4. Spline Smoothing

In addition to the smoothing methods based on digital filters discussed previously, another widely used approach in signal processing is based on spline functions. The main advantage of spline functions is their differentiability over the entire measurement domain.

Among various spline functions, the cubic spline function is the most common one and is defined as follows

$$y = S(x) = A_k (x - x_k)^3 + B_k (x - x_k)^2 + C_k (x - x_k) + D_k \tag{2.21}$$

where $A_k$, $B_k$, $C_k$, and $D_k$ are the spline coefficients at data point $k$. The cubic spline function $S(x)$, or $y$, for observations on the abscissa intervals $x_1 < x_2 < \cdots < x_n$ satisfies the following conditions:

1. The intervals are called knots. The knots may be identical with the index points on the x axis (abscissa).

2. At the knots, $S(x)$ and its first two derivatives are continuous.

3. $S(x)$ is a cubic function in each subrange $[x_k, x_{k+1}]$ for $k = 1, \ldots, n-1$.

4. Outside the range from $x_1$ to $x_n$, $S(x)$ is a straight line.

For a fixed interval between the data points $x_k$ and $x_{k+1}$, the following relationships hold for the signal values and their derivatives:

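$$y_k = D_k$$

$$y_{k+1} = A_k (x_{k+1} - x_k)^3 + B_k (x_{k+1} - x_k)^2 + C_k (x_{k+1} - x_k) + D_k$$

$$y'_k = S'(x_k) = C_k$$

$$y'_{k+1} = 3A_k (x_{k+1} - x_k)^2 + 2B_k (x_{k+1} - x_k) + C_k$$

$$y''_k = S''(x_k) = 2B_k$$

$$y''_{k+1} = 6A_k (x_{k+1} - x_k) + 2B_k$$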
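The relationships above follow from evaluating one cubic piece of Equation (2.21) and its first two derivatives at the two ends of the interval. The short sketch below shows this numerically; it assumes the piece is written in powers of (x - x_k) with the constant term D_k included, and the function name `cubic_piece` and the coefficient values are made up for illustration.

```python
def cubic_piece(A, B, C, D, xk, x):
    """Evaluate S(x), S'(x) and S''(x) for the cubic piece
    S(x) = A (x - xk)^3 + B (x - xk)^2 + C (x - xk) + D."""
    d = x - xk
    s = ((A * d + B) * d + C) * d + D      # function value
    s1 = (3.0 * A * d + 2.0 * B) * d + C   # first derivative
    s2 = 6.0 * A * d + 2.0 * B             # second derivative
    return s, s1, s2


# Made-up coefficients for one interval from xk = 1 to xk1 = 3.
A, B, C, D = 1.0, 2.0, 3.0, 4.0
xk, xk1 = 1.0, 3.0
left = cubic_piece(A, B, C, D, xk, xk)    # values at the left knot
right = cubic_piece(A, B, C, D, xk, xk1)  # values at the right knot
```

At the left knot the three values reduce to D, C, and 2B, which is why the coefficients of each piece can be read directly from the signal value and its derivatives at x_k.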

The spline coefficients can be determined by a method that smooths the data under study at the same time. The ordinate values $\hat{y}_k$ are calculated such that their differences from the observed values $y_k$ are proportional, with a positive factor, to the jumps $r_k$ in the third derivative at point $x_k$:

$$r_k = S'''(x_k) - S'''(x_{k+1}) \tag{2.22}$$

$$r_k = p_k\,(y_k - \hat{y}_k) \tag{2.23}$$

The proportionality factors $p_k$ are determined by cross-validation. In contrast with polynomials, spline functions can be used to approximate and smooth any kind of curve shape. Many more coefficients must be estimated and stored than for polynomial filters, because different coefficients apply in each interval. A disadvantage of smoothing splines is that the parameter estimates are biased. It is therefore more difficult to describe the statistical properties of spline functions than those of linear regression.

In MATLAB, the function `csaps` provides a cubic smoothing spline. The call `csaps(X, Y, p, X)` returns a smoothed version of the input data (X, Y), and the result depends on the value of the smoothing parameter p (from 0 to 1). For p = 0, the smoothing spline corresponds to the least-squares straight-line fit to the data, while at the other extreme, p = 1, it is the natural or variational cubic spline interpolant. The transition region between these two extremes is usually only a rather small range of values of p, and the optimal value depends strongly on the nature of the data. Figure 2.4 shows an example of smoothing by a cubic spline smoother with different p values. The plots in the figure show that the choice of the right value for p is crucial. The smoothing results are satisfactory if a good choice is made, as depicted in Figure 2.4c. To make the smoothing procedure with the cubic spline smoother easier to follow, a MATLAB source code is given in the following frame:
