# Chemometrics from basics to wavelet transform - Chau F.T

ISBN 0-471-20242-8


Leung et al. [36] used WT to reduce IR spectral data for library searching. In this work, FWT and WPT were applied to compress IR spectra for storage and spectral searching. The coefficient position-retaining (CPR) method was discussed in detail for handling data of any length. IR spectra of 20 organic compounds with similar structures were compressed at the fourth resolution level using the Daubechies wavelet function (L = 16). After compression, selected coefficients were used to build a spectral library for future searching. Searching this library was found to perform better than searching one compressed by FFT, especially with respect to visual comparison in some cases. The scale coefficients obtained from FWT and WPT can be used effectively for preliminary searching in a large spectral library. The proposed methods can minimize the search time by using a direct matching method.
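The compress-then-search idea can be illustrated with a minimal sketch. It is not the procedure of [36]: it swaps in the Haar wavelet for the Daubechies (L = 16) filter to stay dependency-free, it requires a length divisible by 2^4 instead of implementing CPR, and the spectra, compound names, and helper functions are all hypothetical.

```python
import math

def haar_step(signal):
    """One Haar DWT level: return (scale, detail) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    return [(a + b) * s for a, b in pairs], [(a - b) * s for a, b in pairs]

def scale_coefficients(signal, level=4):
    """Scale (approximation) coefficients after `level` Haar steps."""
    if len(signal) % (2 ** level) != 0:
        # The CPR method in the text handles arbitrary lengths; this sketch does not.
        raise ValueError("length must be divisible by 2**level in this sketch")
    approx = list(signal)
    for _ in range(level):
        approx = haar_step(approx)[0]  # keep only the scale coefficients
    return approx

def synthetic_spectrum(centers, n=256, width=6.0):
    """Hypothetical spectrum: a sum of Gaussian bands at the given positions."""
    return [sum(math.exp(-0.5 * ((i - c) / width) ** 2) for c in centers)
            for i in range(n)]

def distance(u, v):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_library(spectra, level=4):
    """Store only the compressed representation of each spectrum."""
    return {name: scale_coefficients(s, level) for name, s in spectra.items()}

def search(library, query, level=4):
    """Direct matching: compress the query, return the nearest library entry."""
    q = scale_coefficients(query, level)
    return min(library, key=lambda name: distance(library[name], q))

# Made-up compounds with partly overlapping band positions.
spectra = {
    "compound_A": synthetic_spectrum([40, 120, 200]),
    "compound_B": synthetic_spectrum([60, 120, 180]),
    "compound_C": synthetic_spectrum([40, 150, 220]),
}
library = build_library(spectra)  # 256 points -> 16 coefficients per entry
query = [v + 0.01 * math.sin(i) for i, v in enumerate(spectra["compound_B"])]
print(search(library, query))
```

Each stored entry is 16 times shorter than the original spectrum, so the distance computation during searching is correspondingly cheaper.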

A wavelet neural network (WNN) has also been applied to the compression of IR spectra [37]. In this study, the Morlet wavelet function was employed as the transfer function. The wavenumber and the transmittance of the IR spectrum were chosen as the input and output of the network, respectively. With proper training, the weighting factor for each neuron and the parameters of the wavelet function can be optimized. The original spectrum can then be represented, in compressed form, by the optimized weighting factors and wavelet function parameters. In this work, compression ratios of 50 and 80% were obtained when the wavenumber interval of the IR spectrum was 2.0 and 0.1 cm⁻¹, respectively.
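A minimal sketch of this idea follows. It is not the network of [37]: the wavelet form cos(1.75t)·exp(−t²/2) is a real-valued Morlet-type function assumed here for illustration, the training rule (gradient descent with step halving) is an assumption, the data are synthetic, and the "fraction saved" printed at the end is one possible definition of compression, not necessarily the one behind the quoted ratios.

```python
import math
import random

def morlet(t):
    """Real-valued Morlet-type wavelet (assumed form, see lead-in)."""
    return math.cos(1.75 * t) * math.exp(-0.5 * t * t)

def morlet_deriv(t):
    """Derivative of morlet(t) with respect to t."""
    return -(1.75 * math.sin(1.75 * t) + t * math.cos(1.75 * t)) * math.exp(-0.5 * t * t)

def predict(params, x):
    """Network output: weighted sum of dilated (a) and translated (b) wavelets."""
    return sum(w * morlet((x - b) / a) for w, a, b in params)

def loss(params, xs, ys):
    """Mean squared error between network output and target signal."""
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(params, xs, ys):
    """Gradient of the loss with respect to (w, a, b) of every neuron."""
    grads = [[0.0, 0.0, 0.0] for _ in params]
    for x, y in zip(xs, ys):
        err = 2.0 * (predict(params, x) - y) / len(xs)
        for g, (w, a, b) in zip(grads, params):
            t = (x - b) / a
            g[0] += err * morlet(t)                          # d/dw
            g[1] += err * w * morlet_deriv(t) * (-t / a)     # d/da
            g[2] += err * w * morlet_deriv(t) * (-1.0 / a)   # d/db
    return grads

def train(xs, ys, n_neurons=6, steps=200, lr=0.05, seed=0):
    """Return (params, initial_loss, final_loss); a step is kept only if it helps."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    params = [[rng.uniform(-0.5, 0.5), (hi - lo) / n_neurons, rng.uniform(lo, hi)]
              for _ in range(n_neurons)]
    initial = current = loss(params, xs, ys)
    for _ in range(steps):
        grads = gradients(params, xs, ys)
        trial = [[p - lr * g for p, g in zip(ps, gs)] for ps, gs in zip(params, grads)]
        for ps in trial:
            ps[1] = max(ps[1], 1e-3)  # keep dilations positive
        trial_loss = loss(trial, xs, ys)
        if trial_loss < current:
            params, current = trial, trial_loss
        else:
            lr *= 0.5  # step too large: shrink it and retry
    return params, initial, current

# Synthetic absorption-like signal on a normalized wavenumber axis (hypothetical data).
xs = [i / 99 for i in range(100)]
ys = [math.exp(-0.5 * ((x - 0.3) / 0.04) ** 2)
      + 0.6 * math.exp(-0.5 * ((x - 0.7) / 0.06) ** 2) for x in xs]
params, initial, final = train(xs, ys)
stored = 3 * len(params)  # one (w, a, b) triple per neuron
print(f"MSE {initial:.4f} -> {final:.4f}; "
      f"{stored} numbers stored instead of {len(xs)} "
      f"({100 * (1 - stored / len(xs)):.0f}% saved under this definition)")
```

Only the neuron parameters need to be stored; the spectrum is reconstructed by evaluating `predict` at the desired wavenumbers.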

Bos and Vrielink [38] reported their results on the identification of mono- and disubstituted benzenes utilizing WT preprocessing. The aim of the work was to examine whether the localization property of WT in both position and scale can be used to extract features of an IR spectrum effectively. After WT treatment of the IR spectrum, the coefficients obtained were employed as inputs for an identification process based on linear and nonlinear neural network classifiers. It was shown that, by using the extracted information instead of the full spectrum, the classification was greatly improved in both speed and quality. This study also showed that WT with the Daubechies wavelet is a good method for feature extraction that can reduce IR spectral data by more than 20-fold.

The Fourier transform IR spectrum of a rock contains information about its constituent minerals. Stark et al. [39] employed WT to predict the mass fraction of a given constituent in a mixture. With WT, the authors roughly separated the mineralogical information in the FT-IR spectrum from the noise, using an extensive set of mineral spectra of known mineralogy as training data. Wavelet coefficients that varied either too much or too little were ignored, because the former are likely to reflect analytical noise and the latter do not help one discriminate between different minerals. The remaining coefficients were used as the data for estimating the mineralogy of the sample. In this work, an empirical affine estimator was also developed to estimate the mass fraction of a given mineral in a mixture. The estimator was found to typically perform better than weighted nonnegative least squares.
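The coefficient-screening step can be sketched as follows. It assumes that "varied too much or too little" can be measured by the variance of each wavelet coefficient across the training spectra. The thresholds `low` and `high` and the function name are hypothetical, and the affine estimator itself is not reproduced.

```python
from statistics import pvariance

def select_by_variance(coeff_rows, low, high):
    """Return indices of coefficients whose variance across the training set
    lies in [low, high].

    coeff_rows: one list of wavelet coefficients per training spectrum.
    Variance above `high` is treated as noise-like; variance below `low`
    is treated as carrying no discriminating information.
    """
    columns = list(zip(*coeff_rows))  # one tuple per coefficient position
    variances = [pvariance(col) for col in columns]
    return [j for j, v in enumerate(variances) if low <= v <= high]

# Toy training set: coefficient 0 never changes, coefficient 2 swings wildly.
training = [
    [1.0, 0.2, 50.0],
    [1.0, 0.4, -40.0],
    [1.0, 0.6, 90.0],
    [1.0, 0.8, -70.0],
]
print(select_by_variance(training, low=0.01, high=1.0))
```

The retained indices would then define the reduced coefficient vector that is passed to whatever estimator is used for the mass fraction.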

Variable selection and compression are often used to produce more parsimonious regression models. When they are applied directly in the original spectral domain, however, it is not easy to determine the type of feature that the selected variables represent. In another study, Alsberg et al. [40] showed that it is possible to identify important variables as being part of short- or large-scale features by performing variable selection in the wavelet domain. The suggested method can be used to extract information about the selected variables that would otherwise have been inaccessible, and to obtain information about the location of these features in the original domain. In this article, three types of variable selection methods were applied in the wavelet domain: selection of the optimal combination of scales, thresholding based on mutual information, and truncation of weight vectors in the PLS regression algorithm. Truncation of weight vectors in PLS was found to be the most effective method for selecting variables. Two experimental datasets were investigated. For these, approximately the same prediction error was obtained using fewer than 1% and 10% of the original variables, respectively. It was also found that the selected variables were restricted to a limited number of wavelet scales. This information can be used to suggest whether the underlying features are dominated by narrow peaks (indicated by variables in short-wavelet-scale regions) or by broader regions (indicated by variables in long-wavelet-scale regions). The study also concluded that the selected variables are not unique when variable selection is applied to collinear data such as spectral profiles of complex mixtures. In most cases, we cannot expect to find a very limited number of unique variables, but rather regions of interest where good representative wavenumber candidates are found. This suggests that, instead of performing variable selection in the original domain, a compressed-domain representation may be more fruitful.
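A minimal sketch can show how selection in the wavelet domain reveals the scale of the relevant feature. It is not the algorithm of [40]: it uses the Haar wavelet, truncates only the first PLS weight vector (the normalized covariance between each centered coefficient and the response), and runs on made-up data. In this sketch level 1 denotes the finest scale and level 0 the final scale coefficient.

```python
import math

def haar_decompose(signal):
    """Full Haar DWT. Returns (coeffs, levels) ordered from coarsest to finest;
    levels[j] is the scale level of coeffs[j]."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two length")
    s = 1.0 / math.sqrt(2.0)
    approx, details, level = list(signal), [], 0
    while len(approx) > 1:
        level += 1
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append((level, [(a - b) * s for a, b in pairs]))
        approx = [(a + b) * s for a, b in pairs]
    coeffs, levels = approx, [0]
    for lev, det in reversed(details):
        coeffs = coeffs + det
        levels = levels + [lev] * len(det)
    return coeffs, levels

def first_pls_weights(X, y):
    """First PLS1 weight vector: normalized X^T y on mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [sum((row[j] - x_mean[j]) * (yi - y_mean) for row, yi in zip(X, y))
         for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    return [v / norm for v in w]

def select_in_wavelet_domain(spectra, y, keep):
    """Keep the `keep` largest-magnitude weights; return (index, level) pairs."""
    decomposed = [haar_decompose(s) for s in spectra]
    X = [coeffs for coeffs, _ in decomposed]
    levels = decomposed[0][1]
    w = first_pls_weights(X, y)
    order = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return [(j, levels[j]) for j in sorted(order[:keep])]

# Toy data: a one-point peak at position 20 scales with y, while a flat
# baseline varies independently of y.
y = [1.0, 2.0, 3.0, 4.0]
baseline = [1.0, -1.0, -1.0, 1.0]
spectra = [[c + (yi if i == 20 else 0.0) for i in range(64)]
           for yi, c in zip(y, baseline)]
for index, level in select_in_wavelet_domain(spectra, y, keep=3):
    print(f"coefficient {index} at level {level}")
```

Because the informative feature is a narrow peak, the retained coefficients fall on the finer levels, while the coarse coefficient carrying the baseline is discarded.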
