6.3.1 COMPARISON BETWEEN CHEMOMETRICS AND 2D CORRELATION
It is instructive first to compare the techniques used in chemometrics with generalized 2D correlation spectroscopy. While both rely on the same mathematical operations, namely the manipulation of data using standard matrix algebra, each tends to focus on somewhat different aspects of spectral data structure. For example, conventional chemometric techniques often treat each spectrum of a data set as a whole entity defined within some chosen spectral region and represent it as a linear combination of a set of representative loading vectors. In contrast, 2D correlation treatment has historically dealt with the dynamics of local spectral features, such as individual peaks and bands. Thus, 2D correlation spectroscopy and chemometrics are often used as complementary but essentially independent data analysis techniques.1519
Some direct comparisons between features of outcomes from chemometrics and 2D correlation analyses have also been made.1516 For example, it has been pointed out that the autopower spectrum located at the diagonal position of a synchronous 2D correlation spectrum often resembles the first PCA loading vector of mean-centered data. Likewise, some similarity was noted between certain slices of asynchronous 2D correlation spectra and the second loading vector of mean-centered data. There have also been some more detailed discussions about the similarity and difference between the two approaches, but the extent of discussion has been limited to the analysis of synchronous spectra.
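The resemblance between the autopower spectrum and the first loading has a simple algebraic origin, which a small numerical sketch can make concrete. If the mean-centered data matrix has singular values s_k and loading vectors v_k, the autopower at each spectral point is a weighted sum of squared loadings, so it tracks the squared first loading whenever one factor dominates. The synthetic bands, the perturbation profiles, and the function name below are illustrative assumptions, not data from the studies cited.

```python
import numpy as np

def make_dynamic_spectra(n=20, m=150, noise=1e-3, seed=0):
    """Hypothetical two-band data set, mean-centered along the perturbation axis."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, m)             # spectral axis (arbitrary units)
    t = np.linspace(0.0, 1.0, n)[:, None]    # perturbation variable
    band1 = np.exp(-((x - 0.3) / 0.05) ** 2)
    band2 = np.exp(-((x - 0.7) / 0.05) ** 2)
    raw = t * band1 + 0.6 * (1.0 - t**2) * band2
    raw += noise * rng.standard_normal((n, m))
    return raw - raw.mean(axis=0)

A = make_dynamic_spectra()
n = A.shape[0]

# Autopower spectrum: diagonal of the synchronous spectrum A^T A / (n - 1).
autopower = np.einsum("ij,ij->j", A, A) / (n - 1)

# PCA loadings of the mean-centered data via SVD (rows of Vt).
_, s, Vt = np.linalg.svd(A, full_matrices=False)

# Identity: autopower_j = sum_k s_k^2 * v_kj^2 / (n - 1).
weighted = (s[:, None] ** 2 * Vt**2).sum(axis=0) / (n - 1)

# Similarity between the autopower and the squared first loading.
similarity = np.corrcoef(autopower, Vt[0] ** 2)[0, 1]
print(f"correlation with squared first loading: {similarity:.4f}")
```

In this sketch the comparison is made against the squared loading, since the autopower is a quadratic quantity; the sign information carried by the loading itself is not available from the diagonal alone.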
A truly meaningful comparison between 2D correlation spectroscopy and chemometrics must eventually delve into the fundamental concept of asynchronicity, which is central to the practice of 2D correlation analysis. Evolving factor analysis (EFA) and related techniques,20,21 for example, may share some conceptual common ground with 2D correlation analysis: both place emphasis on the order and sequence of spectral data collection. However, significant differences remain in how each approach exploits a sequentially ordered spectral data set. Future development of chemometric techniques capable of sorting out the sequential order of spectral signal changes would certainly bring the two fields much closer.
6.3.2 FACTOR ANALYSIS
In factor analysis, a well-established chemometric technique, one decomposes the original spectral data matrix into a small set of underlying factors, often expressed as a product of score and loading vector matrices.1214 The factors are separated into a significant set, representing linear combinations of the spectral contributions of the actual components of interest, and a remaining set dominated by noise or by contributions insignificant to the analysis. The basic hypothesis of factor analysis is that an improved proxy of the original data matrix can be reconstructed from only a limited number of significant factors. Three direct practical outcomes are: (1) determination of the number of components (i.e., number of significant factors) involved in the description of the data matrix; (2) rejection of noise and insignificant information by discarding the interfering factors; and (3) reduction of the information into a compact set of factors.
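The three outcomes listed above can be illustrated with a minimal sketch based on the singular value decomposition, which is one common way to obtain scores and loadings. The two-component synthetic system, the noise level, and the rule of picking the largest gap between successive singular values are all assumptions made for this illustration; they are not prescribed by the text, and real data often need a more careful significance criterion.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, sigma = 30, 200, 0.02                 # spectra, spectral points, noise level
x = np.linspace(0.0, 1.0, m)
t = np.linspace(0.0, 1.0, n)

# Two hypothetical components: pure-component bands and their profiles.
bands = np.vstack([np.exp(-((x - 0.35) / 0.05) ** 2),
                   np.exp(-((x - 0.65) / 0.05) ** 2)])        # 2 x m
profiles = np.column_stack([np.exp(-3.0 * t),
                            1.0 - np.exp(-3.0 * t)])           # n x 2

clean = profiles @ bands                                       # noise-free data
data = clean + sigma * rng.standard_normal((n, m))             # "measured" data

# Factor decomposition of the data matrix.
U, s, Vt = np.linalg.svd(data, full_matrices=False)

# (1) Number of significant factors: largest gap between singular values.
ratios = s[:-1] / s[1:]
n_factors = int(np.argmax(ratios)) + 1

# (3) Compact representation: scores (n x r) and loadings (m x r).
scores = U[:, :n_factors] * s[:n_factors]
loadings = Vt[:n_factors].T

# (2) Noise rejection: rebuild the data from the significant factors only.
reconstructed = scores @ loadings.T

err_raw = np.linalg.norm(data - clean)
err_rec = np.linalg.norm(reconstructed - clean)
print(n_factors, err_raw, err_rec)
```

Because the noise-free matrix is known in this synthetic case, the two error norms show directly how much closer the reconstructed matrix lies to the underlying signal than the noisy data do.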
We will explore a simple example of combining a chemometric technique with 2D correlation by using an abstract factor analysis known as principal component analysis (PCA).22 This technique is especially well suited for identifying the number of factors governing the data structure, and for identifying and rejecting noise components in a raw spectral data set prior to 2D correlation analysis. The latter feature is especially useful for asynchronous 2D correlation analysis. Asynchronous spectra based on noisy raw data are often contaminated by artifactual peaks, which are attributed to the fortuitous correlation of noise. Using a data matrix reconstructed only from the significant factors substantially reduces this problem.
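A deliberately simple sketch shows how noise alone can generate asynchronous peaks and how factor-based reconstruction removes them. It assumes the Hilbert-Noda matrix form of the asynchronous spectrum used in the general 2D correlation literature (it is not defined in this section), and a hypothetical system in which two bands change strictly in phase, so that the noise-free asynchronous spectrum is zero. All names and parameter values are illustrative.

```python
import numpy as np

def hilbert_noda(n):
    """Hilbert-Noda matrix: N[j, k] = 1 / (pi * (k - j)), zero on the diagonal."""
    idx = np.arange(n)
    diff = idx[None, :] - idx[:, None]
    N = np.zeros((n, n))
    off = diff != 0
    N[off] = 1.0 / (np.pi * diff[off])
    return N

def asynchronous(A):
    """Asynchronous spectrum of mean-centered dynamic spectra A (n x m)."""
    n = A.shape[0]
    return A.T @ hilbert_noda(n) @ A / (n - 1)

rng = np.random.default_rng(2)
n, m, sigma = 30, 200, 0.02
x = np.linspace(0.0, 1.0, m)
t = np.linspace(0.0, 1.0, n)[:, None]

# Hypothetical system: two bands that change strictly in phase, so any
# asynchronous peak in the noisy data is an artifact of the noise.
pattern = (np.exp(-((x - 0.35) / 0.05) ** 2)
           - 0.5 * np.exp(-((x - 0.65) / 0.05) ** 2))
raw = np.exp(-3.0 * t) * pattern + sigma * rng.standard_normal((n, m))
A = raw - raw.mean(axis=0)

# Keep only the single significant factor and rebuild the data matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pca = s[0] * np.outer(U[:, 0], Vt[0])

artifact_raw = np.abs(asynchronous(A)).max()
artifact_pca = np.abs(asynchronous(A_pca)).max()
print(artifact_raw, artifact_pca)
```

With a single retained factor the artifacts vanish to within rounding error, because the Hilbert-Noda matrix is antisymmetric and a single score vector cannot be out of phase with itself. When two or more significant factors are retained, one would expect the reduction to be partial rather than exact.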
6.3.3 PRINCIPAL COMPONENT ANALYSIS (PCA)
Let us consider a raw data matrix $A$, comprising the original set of perturbation-induced dynamic spectra, to be an $n \times m$ matrix with $n$ dynamic spectra and $m$ wavenumber points. The loading vector matrix $V$ is an $m \times r$ matrix, where each column is a loading (i.e., an eigenvector of the dispersion matrix $A^{T}A$) obtained by PCA. Here, $A^{T}$ stands for the transpose of $A$. It should be pointed out that the dispersion matrix is proportional to the covariance matrix, $\frac{1}{n-1}A^{T}A$, which in turn is equivalent to the discrete case of a synchronous 2D correlation spectrum. The total number of loading vectors $r$ selected for the analysis must be less than or equal to $n$. It is customary to normalize each column of $V$ (i.e., each loading), such that the product $V^{T}V$ is an identity matrix. Associated with the PCA loading vectors are the scores (sometimes called latent variables). The score matrix, $W = AV$, is a relatively small $n \times r$ matrix.
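The definitions above translate almost line by line into a short numerical sketch. The random placeholder data and the dimensions chosen below are assumptions for illustration only; the eigenvectors are obtained with NumPy's symmetric eigensolver, which is one of several equivalent routes to the loadings.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, r = 12, 60, 3                     # spectra, wavenumber points, loadings kept
raw = rng.standard_normal((n, m))       # placeholder for measured spectra
A = raw - raw.mean(axis=0)              # dynamic spectra (mean-centered)

dispersion = A.T @ A                    # m x m dispersion matrix
covariance = dispersion / (n - 1)       # discrete synchronous 2D spectrum

# Loadings: eigenvectors of the dispersion matrix, largest eigenvalues first.
eigvals, eigvecs = np.linalg.eigh(dispersion)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1][:r]
V = eigvecs[:, order]                   # m x r, columns have unit norm
W = A @ V                               # n x r score matrix

# Data matrix rebuilt from the r retained factors.
A_r = W @ V.T

print(V.shape, W.shape)
print(np.allclose(V.T @ V, np.eye(r)))
```

The score matrix has only n x r entries, against n x m for the data, which is the sense in which it is relatively small. The columns of W are mutually orthogonal, with squared norms equal to the corresponding eigenvalues of the dispersion matrix.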