Analytica Chimica Acta (v.595, #1-2)
Editorial Board (CO1).
Editorial by Lutgarde Buydens; Márcia M.C. Ferreira; Sarah Rutan (1-2).
Chemometric analysis applied in 1H HR-MAS NMR and FT-IR data for chemotaxonomic distinction of intact lichen samples by Glaucia Braz Alcantara; Neli Kika Honda; Márcia Miguel Castro Ferreira; Antonio Gilberto Ferreira (3-8).
This paper describes the potential of chemometric analysis applied to 1H HR-MAS NMR and FT-IR data for lichen chemotaxonomic investigations. Lichens are difficult to differentiate morphologically, and chemical analyses are frequently employed for their taxonomic classification, mainly because the secondary metabolites are relatively constant in these organisms. Lichen chemotaxonomic classification is usually carried out by color reactions, chromatography, fluorescence and mass spectrometry analysis, with identification obtained by one or more techniques. Some papers use the carbohydrate content in chemotaxonomic investigations. However, the majority of these techniques involve laborious and time-consuming sample pre-treatment. This work focuses on the application of 1H high resolution magic angle spinning – nuclear magnetic resonance (HR-MAS NMR) and Fourier transform infrared (FT-IR) spectroscopy, associated with chemometric analysis, to intact samples. In comparison to traditional techniques, 1H HR-MAS NMR and FT-IR allied with chemometrics provided a fast and economical method for lichen chemotaxonomy. Both methods were useful for lichen analysis and permitted satisfactory distinction among families, genera and species, although better results were achieved with the FT-IR data.
Keywords: Lichens; High resolution magic angle spinning – nuclear magnetic resonance; Fourier transform infrared; Chemometrics;
MCR of the quenching of the EEM of fluorescence of dissolved organic matter by metal ions by Maria Cristina G. Antunes; Cláudia C.C. Pereira; Joaquim C.G. Esteves da Silva (9-18).
The quenching of the fluorescence of excitation emission matrices (EEM) of two samples of dissolved organic matter (DOM) [fulvic acid from a dam water (FA) and a commercial humic acid (HA)], provoked by the metal ions Cu(II), UO₂²⁺, Fe(III) and Hg(II), was studied by principal component analysis (PCA) and multivariate curve resolution with alternating least squares (MCR-ALS). PCA of the individual EEM, of sets of EEM and the sequential analysis of EEM make it possible to determine the number of components that provoke linearly independent variations in the EEM and to assess the trend and stability of the quenching process. Four and three components were detected in the EEM of FA and HA, respectively. Some of these components show quenching in the presence of the studied metal ions and others are not affected. The occurrence of scattering due to hydrolysis of metal ions is also detected in the PCA sequential analysis of EEM as a function of the metal ion concentration. Excitation and emission spectra and quenching profiles were extracted from the EEM using MCR-ALS with non-negativity constraints. Stability constants of FA and HA with the studied metal ions were estimated by a modified Stern–Volmer equation.
Keywords: EEM; Fluorescence quenching; Dissolved organic matter; Principal component analysis (PCA); Multivariate curve resolution (MCR);
On the measurement of consistent long-term retention factor values in micellar liquid chromatography by José María Bermúdez-Saldaña; Laura Escuder-Gilabert; Rosa María Villanueva-Camañas; María José Medina-Hernández; Salvador Sagrado (19-27).
In the field of quantitative structure–retention and retention–activity relationships (QSRR and QRAR), it is crucial to obtain consistent retention factors (k). For this purpose, two unbiased approaches to estimate k are used: (i) the IUPAC approach (based on the extra-column time correction) and (ii) the ‘2-references’ approach (based on estimating k with respect to two prefixed reference k values). Three reference chemicals were selected on the basis of their retention time, chemical stability and non-ionic character. Consistent retention factor values for these references were estimated for C18 chromatographic columns and Brij35 solutions as mobile phases after statistical analysis. The comparison between the k values estimated for a set of 65 test chemicals using the IUPAC and the ‘2-references’ approaches shows that the latter gives consistent long-term values useful for QRAR and QSRR modelling purposes.
Keywords: Retention factor estimation; Liquid chromatography; Reference retention factors; Intermediate precision; Uncertainty;
Mixture–mixture design for the fingerprint optimization of chromatographic mobile phases and extraction solutions for Camellia sinensis by Cleber N. Borges; Roy E. Bruns; Aline A. Almeida; Ieda S. Scarminio (28-37).
A composite simplex centroid–simplex centroid mixture design is proposed for simultaneously optimizing two mixture systems. The complementary model is formed by multiplying special cubic models for the two systems. The design was applied to the simultaneous optimization of both mobile phase chromatographic mixtures and extraction mixtures for the Camellia sinensis Chinese tea plant. The extraction mixtures investigated contained varying proportions of ethyl acetate, ethanol and dichloromethane, while the mobile phase was made up of varying proportions of methanol, acetonitrile and a methanol–acetonitrile–water (MAW) 15%:15%:70% mixture. The experiments were block randomized corresponding to a split–plot error structure to minimize laboratory work and reduce environmental impact. Coefficients of an initial saturated model were obtained using Scheffé-type equations. A cumulative probability graph was used to determine an approximate reduced model. The split–plot error structure was then introduced into the reduced model by applying generalized least squares equations with variance components calculated using the restricted maximum likelihood approach. A model was developed to calculate the number of peaks observed with the chromatographic detector at 210 nm. A 20-term model contained essentially all the statistical information of the initial model and had a root mean square calibration error of 1.38. The model was used to predict the number of peaks eluted in chromatograms obtained from extraction solutions that correspond to axial points of the simplex centroid design. The significant model coefficients are interpreted in terms of interacting linear, quadratic and cubic effects of the mobile phase and extraction solution components.
Keywords: Mixture designs; Split–plot; HPLC; Extraction media; Camellia sinensis;
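As an illustration of how the combined model in the abstract above is formed, the following minimal sketch (not the authors' code) enumerates the terms obtained by multiplying two three-component special cubic models. The labels x1..x3 (extraction solvents) and z1..z3 (mobile phase components) are hypothetical names introduced only for this example.

```python
from itertools import combinations, product


def special_cubic_terms(names):
    """Return the terms of a special cubic mixture model as tuples of component names."""
    terms = []
    for order in (1, 2, 3):  # linear, binary and ternary blending terms
        terms.extend(combinations(names, order))
    return terms


# Hypothetical labels: x = extraction solvents, z = mobile phase components.
extraction = special_cubic_terms(["x1", "x2", "x3"])
mobile = special_cubic_terms(["z1", "z2", "z3"])

# Each term of the combined model is the product of one term from each system.
combined = [e + m for e, m in product(extraction, mobile)]

print(len(extraction), len(mobile), len(combined))
```

Each three-component special cubic model has 7 terms, so the full product model has 49; the 20-term model reported in the abstract is a reduced version of such a model.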
X-ray scattering and multivariate analysis for classification of organic samples: A comparative study using Rh tube and synchrotron radiation by Gisele Gonçalves Bortoleto; Simone Soares de Oliveira Borges; Maria Izabel Maretti Silveira Bueno (38-42).
This work compares the performance of X-ray tube induced and synchrotron induced energy dispersive X-ray fluorescence in generating scattering profiles when organic samples are irradiated. In the first case, this effect produces sharper peaks, well defined in the spectra, whereas synchrotron-induced scatter is seen as broad bands. These effects may be used for classifying simple materials like graphite, coke, activated carbon and carbon nanotubes, all having the same composition but different structures, using multivariate data analysis. In a second sample set, the method was applied to liquid samples of different alcohols (methanol, ethanol, 1-propanol and 2-propanol). Classifications were obtained in both cases independent of the X-ray source (synchrotron radiation or X-ray tube radiation), showing that the use of chemometric tools combined with X-ray spectrometry can efficiently distinguish organic samples by using scattering effects.
Keywords: X-ray scattering; Rh tube; Synchrotron radiation; Multivariate analysis; Organic samples;
Application of chemometric tools for automatic classification and profile extraction of DNA samples in forensic tasks by Isneri Talavera Bustamante; Francisco Silva Mata; Noslen Hernández González; Ricardo González Gazapo; Juan Palau; Marcia M. Castro Ferreira (43-50).
This paper presents a method for the automatic classification of DNA spots and the extraction of the associated profiles in DNA polyacrylamide gel electrophoresis; it integrates image processing techniques and chemometric tools. Software implementing this method was developed; for feature extraction, a combination of PCA and a C4.5 decision tree was used. To obtain good results in the profile extraction only DNA spots are useful; therefore, it was necessary to solve a two-class classification problem between DNA spots and non-DNA spots. In order to perform the classification with high speed, effectiveness and robustness, comparative classification studies among support vector machine (SVM), K-NN and PLS-DA classifiers were made. The best results, obtained with the SVM classifier, demonstrated the advantages attributed to it in the literature as a two-class classifier. A Sequential Cluster Leader Algorithm and another algorithm developed for the restoration of missing pattern spots were needed to complete the profile extraction step. The experimental results show that this method is computationally very effective and provides a very useful tool to decrease the time and increase the quality of the specialist responses.
Keywords: Classification; Support vector machine; Image analysis;
Application of genetic algorithm for selection of variables for the BLLS method applied to determination of pesticides and metabolites in wine by Renato L. Carneiro; Jez W.B. Braga; Carla B.G. Bottoli; Ronei J. Poppi (51-58).
A variable selection methodology based on genetic algorithm (GA) was applied in a bilinear least squares model (BLLS) with second-order advantage, in three distinct situations, for determination by HPLC–DAD of the pesticides carbaryl (CBL), methyl thiophanate (TIO), simazine (SIM) and dimethoate (DMT) and the metabolite phthalimide (PTA) in wine. The chromatographic separation was carried out using an isocratic elution with 50:50 (v/v) acetonitrile:water as mobile phase. Preprocessing methods were applied to correct the chromatographic time shifts, baseline variation and background. Optimization by GA significantly reduced the errors: for SIM and PTA a threefold decrease relative to the values obtained using all variables was observed, together with an improvement in the distribution of the errors, which reduced the observed bias in the results. Comparing the RMSEP of the optimized model with the uncertainty estimates of the reference values, it is observed that GA can be a very useful tool in second-order models.
Keywords: BLLS; Genetic algorithm; Pesticides; Wine; HPLC–DAD;
Comparison of Plackett–Burman and supersaturated designs in robustness testing by B. Dejaegher; M. Dumarey; X. Capron; M.S. Bloomfield; Y. Vander Heyden (59-71).
An optimized FIA assay of l-N-monomethylarginine (LNMMA) was validated. The linearity, precision, accuracy and range of the analytical method were evaluated and robustness testing was performed. Several experimental designs for robustness testing containing different numbers of experiments (N) were compared. Both Plackett–Burman (N = 8 or 12) and supersaturated designs (N = 6) were examined. The latter design results were analyzed with the Fixing Effects and Adding Rows (FEAR) method, based on the initial addition of zero effect rows to the model matrix, which then are iteratively replaced by fixed effects. It was evaluated whether, on reducing the number of experiments from 12 to 8 or 6, similar effects are estimated and considered (non-)significant. The FIA method was found to be linear in a range of 70–130% of the LNMMA concentration in the samples, and precise and accurate in a range of 80–120%. The estimated factor effects and the critical effects were found to be comparable for all examined designs, although there are indications that some effects from the supersaturated designs tend to be overestimated. The method was considered robust, since no significant effects occurred for the response describing the quantitative aspect of the method. For other responses, such as peak height and residence time, significant effects occur. However, only the most important effects are found with all designs. The effects reported from a supersaturated design based on the FEAR method remain a subject for further research.
Keywords: Method validation; Robustness tests; Plackett–Burman designs; Supersaturated designs; Fixing Effects and Adding Rows method;
Genetic algorithm optimisation combined with partial least squares regression and mutual information variable selection procedures in near-infrared quantitative analysis of cotton–viscose textiles by A. Durand; O. Devos; C. Ruckebusch; J.P. Huvenne (72-79).
In this work, different approaches for variable selection are studied in the context of near-infrared (NIR) multivariate calibration of textiles. First, a model-based regression method is proposed. It consists of genetic algorithm optimisation combined with partial least squares regression (GA–PLS). The second approach is a relevance measure of spectral variables based on mutual information (MI), which can be performed independently of any given regression model. As MI makes no assumption on the relationship between X and Y, non-linear methods such as feed-forward artificial neural networks (ANN) are thus encouraged for modelling in a prediction context (MI–ANN). GA–PLS and MI–ANN models are developed for NIR quantitative prediction of cotton content in cotton–viscose textile samples. The results are compared to a full-spectrum (480 variables) PLS model (FS-PLS). This full-spectrum model requires 11 latent variables and yields a 3.74% RMS prediction error in the range 0–100%. GA–PLS provides a more robust model based on 120 variables and slightly enhanced prediction performance (3.44% RMS error). With the MI variable selection procedure, a great improvement is obtained, as only 12 variables are retained. On the basis of these variables, a 12-input ANN model is trained and the corresponding prediction error is 3.43% RMS.
Keywords: Near-Infrared Spectroscopy; Textile; Multivariate calibration; Genetic algorithm; Mutual information; Artificial neural network;
Scale-up of batch kinetic models by Maryann Ehly; Paul J. Gemperline; Alison Nordon; David Littlejohn; J. Katy Basford; Martin De Cecco (80-88).
The scale-up of batch kinetic models was studied by examining the kinetic fitting results of batch esterification reactions completed in 75 mL and 5 L reactors. Different temperatures, amounts of catalysts, and amounts of initial starting reagents were used to completely characterize the reaction. A custom written Matlab toolbox called GUIPRO was used to fit first-principles kinetic models directly to in-line NIR and Raman spectroscopic data. Second-order kinetic models provided calibration-free estimates of kinetic and thermodynamic reaction parameters, time dependent concentration profiles, and pure component spectra of reagents and product. The estimated kinetic and thermodynamic parameters showed good agreement between small-scale and large-scale reactions. The accuracy of pure component spectra estimates was validated by comparison to collected NIR and Raman pure component spectra. The model estimated product concentrations were also validated by comparison to concentrations measured by off-line GC analysis. Based on the good agreement between kinetic and thermodynamic parameters and comparison between actual and estimated concentration and spectral profiles, it was concluded that the scale-up of batch kinetic models was successful.
Keywords: Reaction modeling; Online spectroscopy; Nonlinear regression; Reaction kinetics; Model scale-up;
Chemometric study on the TiO2-photocatalytic degradation of nitrilotriacetic acid by Carina A. Emilio; Jorge F. Magallanes; Marta I. Litter (89-97).
A chemometric study on the TiO2-photocatalytic degradation of nitrilotriacetic acid (NTA) in aqueous media under UV radiation has been carried out, taking into account the multiple variables that take part in the system. To avoid redundant experiments, the system has been studied using chemometric techniques for several variables, such as NTA and TiO2 concentrations, pH and irradiation time. Multiway analysis of variance (MANOVA) has been applied to find the statistically significant variables. An artificial neural network (ANN) has been used to build an empirical model of the system. All measurements have been carried out according to experimental designs: a full-factorial design (FFD) was used to analyze significant factors through MANOVA, and a Doehlert design, modified by spatial rotation, was applied in order to have a satisfactory number of levels for the factor time to be able to train the ANN. The study makes it possible to understand and predict the behavior of the system, to work out kinetic parameters and to optimize its variables. The kinetic parameters obtained with the neural network agreed with independent experimental results, confirming a Langmuir–Hinshelwood kinetic regime. The difference between NTA and ethylenediaminetetraacetic acid (EDTA), which has been previously studied, is also established.
Keywords: Heterogeneous photocatalysis; Advanced oxidation technologies; Chemometrics; Experimental designs; Artificial neural networks;
How to avoid over-fitting in multivariate calibration—The conventional validation approach and an alternative by N.M. Faber; R. Rajkó (98-106).
This paper critically reviews the problem of over-fitting in multivariate calibration and the conventional validation-based approach to avoid it. It proposes a randomization test that enables one to assess the statistical significance of each component that enters the model. This alternative is compared with cross-validation and independent test set validation for the calibration of a near-infrared spectral data set using partial least squares (PLS) regression. The results indicate that the alternative approach is more objective, since, unlike the validation-based approach, it does not require the use of ‘soft’ decision rules. The alternative approach therefore appears to be a useful addition to the chemometrician's toolbox.
Keywords: Multivariate calibration; PLS; Component selection; Cross-validation; Test set validation; Randomization test; Near-infrared spectroscopy;
Multivariate near infrared spectroscopy models for predicting methanol and water content in biodiesel by Pedro Felizardo; Patrícia Baptista; José C. Menezes; M. Joana Neiva Correia (107-113).
The transesterification of vegetable oils, animal fats or waste oils with an alcohol (such as methanol) in the presence of a homogeneous catalyst (sodium hydroxide or methoxide) is commonly used to produce biodiesel. The quality control of the final product is an important issue and near infrared (NIR) spectroscopy has recently emerged as an appealing alternative to the conventional analytical methods. The use of NIR spectroscopy for this purpose first involves the development of calibration models to relate the near infrared spectrum of biodiesel to the analytical data. The type of pre-processing technique applied to the data prior to the development of the calibration may greatly influence the performance of the model. This work analyses the effect of some commonly used pre-processing techniques, applied prior to partial least squares (PLS) and principal components regression (PCR), on the quality of the calibration models developed to relate the near infrared spectrum of biodiesel to its content of methanol and water. The results confirm the importance of testing various pre-processing techniques. For the water content, the smallest validation and prediction errors were obtained by a combination of a second order Savitzky–Golay derivative followed by mean centring prior to PLS and PCR, whereas for methanol calibration the best results were obtained with a first order Savitzky–Golay derivative plus mean centring followed by orthogonal signal correction.
Keywords: Biodiesel; Near infrared; Calibration models; Data pre-processing;
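The pre-processing chain named in the abstract above (a Savitzky–Golay derivative followed by mean centring) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the window half-width and polynomial order are assumed values that the abstract does not give, "second order" is read here as the second derivative, and the demo data are synthetic polynomials rather than NIR spectra.

```python
import numpy as np
from math import factorial


def savgol_coeffs(half_width, polyorder, deriv):
    """Least-squares polynomial (Savitzky-Golay) filter weights for unit spacing."""
    offsets = np.arange(-half_width, half_width + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient of
    # offsets**deriv; multiplying by deriv! gives the derivative at the centre.
    return np.linalg.pinv(design)[deriv] * factorial(deriv)


def preprocess(spectra, half_width=5, polyorder=2, deriv=2):
    """Savitzky-Golay derivative along each row, then column mean centring."""
    weights = savgol_coeffs(half_width, polyorder, deriv)
    # Reversing the weights turns np.convolve into a correlation;
    # mode="valid" drops the edge points where the window does not fit.
    filtered = np.array(
        [np.convolve(row, weights[::-1], mode="valid") for row in spectra]
    )
    return filtered - filtered.mean(axis=0)


# Synthetic rows with known second derivatives (2 and 6), not real spectra.
x = np.arange(30, dtype=float)
demo = np.vstack([x**2, 3 * x**2 + x])
print(preprocess(demo).shape)
```

In practice the window and polynomial order would be tuned together with the calibration model, which is the kind of comparison the abstract reports.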
Non-destructive method for determination of hydroxyl value of soybean polyol by LS-SVM using HATR/FT-IR by Marco F. Ferrão; Simone C. Godoy; Annelise E. Gerbase; Cesar Mello; João Carlos Furtado; Cesar L. Petzhold; Ronei Jesus Poppi (114-119).
This paper presents the use of a least-squares support vector machine (LS-SVM) for the quantitative determination of the hydroxyl value (OHV) of hydroxylated soybean oils by horizontal attenuated total reflection Fourier transform infrared (HATR/FT-IR) spectroscopy. An LS-SVM calibration model for the prediction of OHV was developed using the range 1805.1–649.9 cm−1. Validation of the method was carried out by comparing the OHV of a series of hydroxylated soybean oils predicted by the LS-SVM model to the values obtained by the AOCS standard method. A correlation coefficient of 0.989 and an RMSEP of 4.96 mg of KOH/g were obtained. This study demonstrates the better prediction ability of the LS-SVM technique for determining OHV in hydroxylated soybean oil samples from HATR/FT-IR spectra.
Keywords: Least-squares support vector machine; Hydroxyl value; Hydroxylated soybean oil; Horizontal attenuated total reflectance; Chemometrics;
Study of the application of multiway multivariate techniques to model data from an industrial fermentation process by Ana P. Ferreira; João A. Lopes; José C. Menezes (120-127).
Several multivariate statistical techniques have been proposed for monitoring industrial processes. In this paper, multiway extensions of two such techniques, multiway principal component analysis (MPCA) and multiway partial least squares regression (MPLS), were applied to a large data set from an industrial pilot-scale fermentation process to improve process knowledge. The MPCA model is able to diagnose faults occurring in the process whether or not they affect process productivity, while the MPLS model enables the prediction of final product concentration and the detection of faults that will influence the fermentation productivity.
Keywords: Industrial fermentation; Multivariate analysis; Batch process; Multiway principal component analysis; Multiway partial least squares;
Screening Brazilian C gasoline quality: Application of the SIMCA chemometric method to gas chromatographic data by Danilo Luiz Flumignan; Aristeu G. Tininis; Fabrício de O. Ferreira; José Eduardo de Oliveira (128-135).
A total of 2400 samples of commercial Brazilian C gasoline were collected over a 6-month period from different gas stations in São Paulo state, Brazil, and analysed with respect to 12 physicochemical parameters according to regulation 309 of the Brazilian Government Petroleum, Natural Gas and Biofuels Agency (ANP). The percentages (v/v) of hydrocarbons (olefins, aromatics and saturates) were also determined. Hierarchical cluster analysis (HCA) was employed to select 150 representative samples that exhibited the least similarity on the basis of their physicochemical parameters and hydrocarbon compositions. The chromatographic profiles of the selected samples were measured by gas chromatography with flame ionisation detection and analysed using the soft independent modelling of class analogy (SIMCA) method in order to create a classification scheme to identify conforming gasolines according to ANP regulation 309. Following the optimisation of the SIMCA algorithm, it was possible to classify correctly 96% of the commercial gasoline samples present in the training set of 100. In order to check the quality of the model, an external group of 50 gasoline samples (the prediction set) was analysed and the developed SIMCA model classified 94% of these correctly. The developed chemometric method is recommended for screening commercial gasoline quality and detecting potential adulteration.
Keywords: Commercial gasoline; Quality control; ANP regulation 309; Gas chromatography; SIMCA model;
Geographic origins and compositions of virgin olive oils determinated by chemometric analysis of NIR spectra by O. Galtier; N. Dupuy; Y. Le Dréau; D. Ollivier; C. Pinatel; J. Kister; J. Artaud (136-144).
The authentication of virgin olive oil samples usually requires sophisticated and time-consuming analytical techniques. There is a need for fast and simple analytical techniques for quality control. Virgin olive oils present characteristic NIR spectra. Chemometric treatment of NIR spectra was assessed for the quantification of fatty acids and triacylglycerols in virgin olive oil samples (n = 125) and for their classification (PLS1-DA) into five geographically very close registered designations of origin (RDOs) of French virgin olive oils (“Aix-en-Provence”, “Haute-Provence”, “Nice”, “Nyons” and “Vallée des Baux”). The spectroscopic interpretation of regression vectors showed that each RDO was correlated to one or two specific components of virgin olive oils according to their cultivar compositions. The results were quite satisfactory, in spite of the similarity of cultivar compositions between two denominations of origin (“Aix-en-Provence” and “Vallée des Baux”). Chemometric treatments of NIR spectra give results similar to those obtained by time-consuming analytical techniques such as GC and HPLC, and constitute a fast and robust aid for the authentication of these French virgin olive oils.
Keywords: Virgin olive oil; Near infrared; Partial least square discriminant analysis; Chemometric; Traceability;
Improvement deoxyribo nucleic acid spots classification in polyacrilamide gel images using photometric normalization algorithms by Eduardo Garea Llano; Francisco Silva Mata; Isneri Talavera Bustamante; Noslen Hernández González; Ricardo González Gazapo (145-151).
A comparative study of four enhancement algorithms traditionally used in computer vision for photometric normalization of images affected by illumination changes is presented in this paper. We evaluated the performance of these approaches in reducing the low-frequency multiplicative noise that is produced by non-homogeneous illumination or a non-homogeneous chemical development process in polyacrylamide gel electrophoresis images, for the purpose of automatic classification of deoxyribonucleic acid (DNA). The algorithms are tested on a database and their results are compared in a system for feature extraction and DNA classification.
Keywords: Image enhancement; Deoxyribo nucleic acid images;
Complex numbers in chemometrics by Paul Geladi; Andrew Nelson; Britta Lindholm-Sethson (152-159).
Electrical impedance gives multivariate complex number data as results. Two examples of multivariate electrical impedance data measured on lipid monolayers in different solutions give rise to matrices (16 × 50 and 38 × 50) of complex numbers. Multivariate data analysis by principal component analysis (PCA) or singular value decomposition (SVD) can be used for complex data and the necessary equations are given. The scores and loadings obtained are vectors of complex numbers. It is shown that the complex number PCA and SVD are better at concentrating information in a few components than the naïve juxtaposition method and that Argand diagrams can replace score and loading plots. Different concentrations of Magainin and Gramicidin A give different responses and also the role of the electrolyte medium can be studied. An interaction of Gramicidin A in the solution with the monolayer over time can be observed.
Keywords: Complex number linear algebra; Monomolecular lipid layers; Multivariate electrical impedance; Magainin; Gramicidin A; Dioleoyl phosphatidylcholine;
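The complex-valued decomposition described in the abstract above can be sketched with a standard SVD, which NumPy computes directly for complex matrices. This is a minimal illustration under stated assumptions, not the authors' equations: the 16 × 50 matrix is random synthetic data standing in for impedance measurements, and column mean centring is an assumed pre-treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for impedance data: 16 samples x 50 frequencies, complex.
Z = rng.normal(size=(16, 50)) + 1j * rng.normal(size=(16, 50))

Zc = Z - Z.mean(axis=0)  # column mean centring (an assumption of this sketch)
U, s, Vh = np.linalg.svd(Zc, full_matrices=False)

scores = U * s            # complex scores, one row per sample
loadings = Vh.conj().T    # complex loadings, one column per component

# The decomposition reproduces the centred data: Zc = scores @ loadings^H.
reconstruction = scores @ loadings.conj().T
explained = s**2 / np.sum(s**2)  # fraction of total sum of squares per component

# Juxtaposition for comparison: real and imaginary parts side by side.
J = np.hstack([Zc.real, Zc.imag])
s_j = np.linalg.svd(J, compute_uv=False)
explained_j = s_j**2 / np.sum(s_j**2)
```

Because the scores and loadings are complex, each component can be plotted in the complex plane (real part against imaginary part), which is the Argand diagram the abstract mentions. Comparing `explained` with `explained_j` on real measurements shows how the two approaches distribute information across components; the random data used here carry no such structure.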
Recognition of protozoa and metazoa using image analysis tools, discriminant analysis, neural networks and decision trees by Y.P. Ginoris; A.L. Amaral; A. Nicolau; M.A.Z. Coelho; E.C. Ferreira (160-169).
Protozoa and metazoa are considered good indicators of the treatment quality in activated sludge systems because these organisms are fairly sensitive to physical, chemical and operational processes. Therefore, it is possible to establish close relationships between the predominance of certain species or groups of species and several operational parameters of the plant, such as the biotic indices, namely the Sludge Biotic Index (SBI). This procedure requires the identification, classification and enumeration of the different species, which is usually done manually and requires both time and expertise. Digital image analysis combined with multivariate statistical techniques has proved to be a useful tool to classify and quantify organisms in an automatic and non-subjective way. This work presents a semi-automatic image analysis procedure for protozoa and metazoa recognition developed in the Matlab language. The morphological descriptors obtained were analyzed using discriminant analysis, neural networks and decision trees to identify and classify each protozoan or metazoan. The procedure was quite adequate for distinguishing between the non-sessile protozoa classes and also for the metazoa classes, with high values for the overall species recognition, with the exception of sessile protozoa. In terms of the assessment of wastewater conditions, the results were found to be suitable for the prediction of these conditions. Finally, the discriminant analysis and neural network results were found to be quite similar, whereas the decision trees technique was less appropriate.
Keywords: Discriminant analysis; Decision trees; Neural networks; Protozoa; Metazoa; Image analysis;
X-ray spectrometry and chemometrics in sugar classification, correlation with degree of sweetness and specific rotation of polarized light by Karen Goraieb; Thais L. Alexandre; Maria Izabel M.S. Bueno (170-175).
This work presents correlations of conventional energy dispersive X-ray fluorescence spectra of common sugars with degrees of sweetness obtained via sensorial tests and with specific rotations of polarized light, both sets of data taken from the literature. Classifications of sugars are also achieved based on their specific structures. Principal component analysis and partial least squares chemometric tools are used to build these models. Once again it is demonstrated that a common bench-top X-ray spectrometer can be used not only for inorganic analysis, but also shows potential in studies of organic constituents.
Keywords: X-ray spectrometry; Sugar sweetness degree; Specific rotation; Multivariate analysis;
The use of multivariate modelling of near infra-red spectra to predict the butter fat content of spreads by Patricia C.M. Heussen; Hans-Gerd Janssen; Irene B.M. Samwel; John P.M. van Duynhoven (176-181).
In order to obtain a rapid method that can detect adulteration of butter fats with cheaper vegetable fats, the use of NIR spectroscopy and multivariate modelling was explored. For model building and validation, an extensive set of samples was collected, consisting of 152 butter samples, 42 oils and 200 blends thereof. Variations in butter fat composition are reflected in distinct NIR spectral regions. Principal component analysis and partial least squares discriminant analysis were used to inspect the variation within the sample set. As reference values for training partial least squares models, butter fat levels as declared by suppliers were taken, as well as C4:0 fatty acid levels as measured directly by GC. All samples were used for training, except for 100 blends, which were used later for validation. Different pre-processing and PLS approaches were explored, resulting in models that had RMSEPs for butter fat and C4:0 fatty acid level in the range of 4.3–8.2 and 0.33–0.38% (w/w), respectively. The performance of NIR in the assessment of C4:0 fatty acid levels is lower than that of GC, but this disadvantage is outweighed by shorter measurement times and the lower skill levels required. Furthermore, NIR is able to assess overall levels of butter fat, in addition to the indirect indicator provided by the C4:0 fatty acid level.
Keywords: Multivariate analysis; Near infrared; Butter fats; Principal component analysis; Partial least squares; Gas chromatography;
Classification of perovskites with supervised self-organizing maps by Igor Kuzmanovski; Sandra Dimitrovska-Lazova; Slobotka Aleksovska (182-189).
In this work supervised self-organizing maps were used for structural classification of perovskites. For this purpose, structural data for a total of 286 perovskites, belonging to ABO3 and/or A2BB′O6 types, were collected from the literature: 130 of these are cubic, 85 orthorhombic and 71 monoclinic. For classification purposes, the effective ionic radii of the cations, the electronegativities of the cations in the B-position, and the oxidation states of these cations were used as input variables. The parameters of the developed models, as well as the most suitable variables for classification purposes, were selected using genetic algorithms. Two-thirds of all the compounds were used in the training phase. During the optimization process the performances of the models were checked using leave-1/10-out cross-validation. The performances of the obtained solutions were checked using the test set composed of the remaining one-third of the compounds. The obtained models for classification of these three classes of perovskite compounds show very good results. Namely, the classification of the compounds in the test set resulted in a small number of discrepancies (4.2–6.4%) between the actual crystallographic class and the one predicted by the models. All these results are strong arguments for the validity of supervised self-organizing maps for performing such types of classification. Therefore, the proposed procedure could be successfully used for crystallographic classification of perovskites into one of these three classes.
Keywords: Perovskites; Structural classifications; Supervised self-organizing maps; Genetic algorithms;
Predicting the drug concentration in starch acetate matrix tablets from ATR-FTIR spectra using multi-way methods by Sanni Matero; Jari Pajander; Anne-Marie Soikkeli; Satu-Pia Reinikainen; Maija Lahtela-Kakkonen; Ossi Korhonen; Jarkko Ketolainen; Antti Poso (190-197).
The amounts of drug and excipient were predicted from ATR-FTIR spectra using two multi-way modelling techniques, parallel factor analysis (PARAFAC) and multi-linear partial least squares (N-PLS). Data matrices consisted of dissolved and undissolved parallel samples having different drug content and spectra, which were collected at axially cut surface of the flat-faced matrix tablets. Spectra were recorded comprehensively at different points on the axially cut surface of the tablet. The sample drug concentrations varied between 2 and 16% v/v. The multi-way methods together with ATR-FTIR spectra seemed to represent an applicable method for the determination of drug and excipient distribution in a tablet during the release process. The N-PLS calibration method was more robust for accurate quantification of the amount of components in the sample whereas the PARAFAC model provided approximate relative amounts of components.
Keywords: Parallel factor analysis; N-mode partial least projection to latent structures; Attenuated total reflectance-Fourier transform infra red; Starch acetate matrix; Drug release;
pH- and time-dependent hemoglobin transitions: A case study for process modelling by Glòria Muñoz; Anna de Juan (198-208).
The study of the pH- and time-dependent transitions of the hemoglobin is presented as a biochemical problem of interest and as a very complete example of the situations that can be encountered in the modelling of complex processes. Therefore, the aim is two-fold: providing a complete explanation of the biochemical phenomena studied and explaining the modelling strategies used to solve this problem that can be generally applied in processes of different origin. Multivariate curve resolution-alternating least squares (MCR-ALS) is the basic method used to recover the process contributions, expressed as the concentration profile and the pure spectrum of each of the compounds involved. What is the benefit of using multitechnique or multiexperiment data arrangements, how constraints should be selected and applied and how hybrid approaches combining hard- and soft-modelling can allow for the use of a partially known model when available are among the main issues presented.
Keywords: Protein processes; Multivariate curve resolution; Process analysis; Hybrid hard- and soft-modelling;
Effect of missing values in estimation of mean of auto-correlated measurement series by Maaret Paakkunainen; Jarmo Kilpeläinen; Satu-Pia Reinikainen; Pentti Minkkinen (209-215).
Sampling and the estimation of sampling uncertainty are important tasks when industrial processes are monitored. Missing values and unequal sources can cause problems in almost all industrial fields. One major problem is that samples may not be collected during weekends. On the other hand, a composite sample may be collected during the weekend. These systematically occurring missing values (gaps) will have an effect on the uncertainties of the measurements. Another type of missing value is the random missing value. These random gaps are caused, for example, by instrument failures. Pierre Gy's sampling theory includes tools to evaluate all error components that are involved in sampling of heterogeneous materials. Variograms, introduced by Gy's sampling theory, have been developed to estimate the uncertainty of auto-correlated process measurements. Variographic experiments are utilized for estimating the variance for different sample selection strategies. The different sample selection strategies are random sampling, stratified random sampling and systematic sampling. In this paper the effects of both systematic and random gaps were estimated by using simulations and real process data. These process data were taken from bark boilers of pulp and paper mills (combustion processes). When systematic gaps were examined, linear interpolation was utilized. Cases introducing composite sampling were also studied. The aims of this paper are to determine: (1) how reliable the variogram calculated from data with systematic gaps is as an estimate of the process variogram, and (2) how the uncertainty caused by missing values can be estimated when reporting time-averages of auto-correlated time series measurements. The results show that when systematic gaps were filled by linear interpolation only minor changes in the values of the variogram were observed. The differences between the variograms were consistently smallest with composite samples. When estimating the effect of random gaps, the results show that for non-periodic processes the stratified random sampling strategy gives more reliable results than the systematic sampling strategy. Therefore stratified random sampling should be used when estimating the uncertainty of random gaps in reporting time-averages of auto-correlated time series measurements.
Keywords: Sampling; Sampling uncertainty; Missing values; Variographic analysis; Pierre Gy's sampling theory; Auto-correlated data; Process data;
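As an illustration of the gap-filling step mentioned in the abstract above, the following is a minimal sketch of linear interpolation over missing values followed by an experimental variogram. It assumes the common absolute form V(j) = Σ(x[i+j] − x[i])² / (2(N − j)); the abstract does not state which variogram form or software the authors used, and the function names and the example series are hypothetical.

```python
import numpy as np


def fill_gaps_linear(values):
    """Replace NaN gaps with linearly interpolated values.

    Gaps at the very start or end are filled with the nearest known
    value, because np.interp holds the edge values constant.
    """
    x = np.asarray(values, dtype=float)
    idx = np.arange(x.size)
    known = ~np.isnan(x)
    return np.interp(idx, idx[known], x[known])


def variogram(values, max_lag):
    """Absolute experimental variogram V(j) for lags j = 1..max_lag."""
    x = np.asarray(values, dtype=float)
    n = x.size
    return np.array([
        np.sum((x[j:] - x[:-j]) ** 2) / (2.0 * (n - j))
        for j in range(1, max_lag + 1)
    ])


# Hypothetical series with a two-point "weekend" gap.
series = [0.0, 1.0, np.nan, np.nan, 4.0, 5.0]
filled = fill_gaps_linear(series)
v = variogram(filled, max_lag=2)
```

Comparing `variogram` on the complete series with `variogram` on the gap-filled series is one simple way to see how much the interpolation changes the variogram values.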
A study of physicochemical and biopharmaceutical properties of Amoxicillin tablets using full factorial design and PCA biplot by Kerly F.M. Pasqualoto; Reinaldo F. Teófilo; Marcos Guterres; Flávia S. Pereira; Márcia M.C. Ferreira (216-220).
The variables that influence tablets obtained by the direct compression method deserve to be studied in order to minimize formulation costs and to improve the physicochemical and biopharmaceutical properties of the compacts. Here, we explore the effects of adjuvants on amoxicillin tablet formulations considering multiple responses, and indicate the most suitable formulation composition. A 2³ full factorial design was built for different amoxicillin formulations, each one containing three replicate batches, and eight responses (physicochemical and biopharmaceutical parameters) were obtained. The microcrystalline cellulose (MCC) type Avicel® PH-102 (low) or PH-200 (high), the absence (low) or presence (high) of spray-dried lactose (LAC), and the absence (low) or presence (high) of disintegrant (DIS) were the levels investigated. The most relevant responses to the distinct formulations from the experimental design were hardness, friability, and the amount of amoxicillin dissolved during the first hour. The PCA biplot indicated high values of the amount of drug dissolved in 60 min as advantageous responses for the investigated amoxicillin tablet formulations and high values of friability as undesirable. Considering the individual response evaluation, the most suitable amoxicillin tablet formulation should contain the MCC type Avicel® PH-102 (low level) and the superdisintegrant agent croscarmellose sodium (DIS high level), but no LAC (low level).
Keywords: Factorial design; PCA biplot; Direct compression; Physicochemical and biopharmaceutical properties; Amoxicillin tablets;
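To make the design in the abstract above concrete, here is a minimal sketch of a two-level full factorial design in three factors (eight runs) and of main-effect estimation. The factor labels follow the abstract (MCC, LAC, DIS); the response values are synthetic and purely illustrative, and the function names are hypothetical.

```python
from itertools import product

FACTORS = ("MCC", "LAC", "DIS")  # labels taken from the abstract


def full_factorial(n_factors):
    """All 2**n_factors runs, coded -1 (low level) and +1 (high level)."""
    return list(product((-1, +1), repeat=n_factors))


def main_effects(design, responses):
    """Main effect of each factor: mean response at high minus mean at low."""
    half = len(design) / 2
    effects = []
    for k in range(len(design[0])):
        high = sum(y for run, y in zip(design, responses) if run[k] == +1)
        low = sum(y for run, y in zip(design, responses) if run[k] == -1)
        effects.append((high - low) / half)
    return effects


design = full_factorial(len(FACTORS))
# Synthetic response (not data from the paper): depends on MCC and DIS only.
responses = [10 + 2 * mcc - 3 * dis for mcc, lac, dis in design]
effects = dict(zip(FACTORS, main_effects(design, responses)))
```

With coded levels of -1 and +1, each main effect equals twice the corresponding coefficient of the underlying linear model.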
Simultaneously calibrating solids, sugars and acidity of tomato products using PLS2 and NIR spectroscopy by André M.K. Pedro; Márcia M.C. Ferreira (221-227).
In this work, the development of a robust spectroscopic procedure for determining, simultaneously and non-destructively, relevant quality parameters of processed tomato products (total and soluble solids, total acidity, total sugars, glucose and fructose), is described. Samples of tomato concentrate products with total solids content ranging from 6.9 to 35.9% were collected from Latin America, the US and Europe and NIR spectra were acquired in the 4000–10,000 cm−1 region. The original spectra were pre-processed by mean-smoothing or by Fourier filter, followed by multiplicative signal correction (MSC) or derivatives. Partial least squares (PLS2 and PLS1) models were built and their predictive abilities were compared through the RMSEP of external validation. The PLS2 regression had better predictive abilities for four out of the six properties under study, namely total solids, total sugars, glucose and fructose. Besides, the model was less complex than the PLS1 models in the sense that only four factors were demanded whilst from 4 to 11 factors were necessary for building the PLS1 models. The standard error of prediction (SEP%) of the PLS2 model for each property was: total solids, 2.67; soluble solids, 1.14; total acidity, 9.60; total sugar, 18.69; glucose, 11.60; and fructose, 13.45.
Keywords: Fourier filter; Multivariate calibration; Tomato quality; Pre-processing; Convolution function;
Alternative calibration approaches for LC–MS quantitative determination of coeluted compounds in complex environmental mixtures using multivariate curve resolution by Emma Peré-Trepat; Silvia Lacorte; Romà Tauler (228-237).
Different calibration approaches including external calibration, standard addition and internal standard are evaluated for quantification of coeluted compounds in liquid chromatography with MS spectrometry detection in scan mode and using multivariate curve resolution. These different calibration approaches are proposed to cope with sensitivity changes and matrix effects encountered in the analysis of complex natural environmental samples. By using them, multivariate curve resolution analysis of MS data in scan mode gave similar quantitative results to those obtained by LC–MS in selected ion monitoring (SIM) mode (in both cases errors were below 16% for internal standard combined with standard addition strategy), and it provided at the same time a means of analyte confirmation via their resolved pure MS spectra, and a means to gather a larger amount of information about the whole chromatographic process and to facilitate the simultaneous determination of multiple analytes in the same chromatographic run using the same experimental and instrumental conditions.
Keywords: Multivariate curve resolution; HPLC-MS; Internal standard; Standard addition; Calibration approaches; Coelution; Environmental samples;
Descriptive sensory analysis in different classes of orange juice by a robust free-choice profile method by Jesús Pérez Aparicio; M. Ángeles Toledano Medina; Victoria Lafuente Rosales (238-247).
Free-choice profile (FCP), developed in the 1980s, is a sensory analysis method that can be carried out by untrained panels. The participants need only to be able to use a scale and be consumers of the product under evaluation. The data are analysed by sophisticated statistical methodologies like Generalized Procrustean Analysis (GPA) or STATIS. To facilitate a wider use of the free-choice profiling procedure, different authors have advocated simpler methods based on principal components analysis (PCA) of merged data sets. The purpose of this work was to apply another easy procedure to this type of data by means of a robust PCA. The most important characteristic of the proposed method is that quality responsible managers could use this methodology without any scale evaluation. Only the free terms generated by the assessors are necessary to apply the script, thus avoiding the error associated with scale utilization by inexpert assessors. Also, it is possible to use the application with missing data and with differences in the assessors’ attendance at sessions. An example was performed to generate the descriptors from different orange juice types. The results were compared with the STATIS method and with the PCA on the merged data sets. The samples evaluated were fresh orange juices with differences in storage days and pasteurized, concentrated and orange nectar drinks from different brands. Eighteen assessors with a low-level training program were used in a six-session free-choice profile framework. The results proved that this script could be of use in marketing decisions and product quality program development.
Keywords: Free-choice profile; Robust analysis; Quality food; Orange juice;
Multivariate statistical analysis of a multi-step industrial processes by Satu-Pia Reinikainen; Agnar Höskuldsson (248-256).
Monitoring and quality control of industrial processes often produce information on how the data have been obtained. In batch processes, for instance, the process is carried out in stages; some process or control parameters are set at each stage. However, the obtained data might not be utilized efficiently, even if this information may reveal significant knowledge about process dynamics or ongoing phenomena. When studying the process data, it may be important to analyse the data in the light of the physical or time-wise development of each process step. In this paper, a unified approach to analyse multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors. This approach will show how the process develops from a data point of view. The procedure is illustrated on a relatively simple industrial batch process, but it is also applicable in a general context, where knowledge about the variables is available.
Keywords: Priority regression; CovProc; Partial least squares (PLS); Monitoring;
Spectroscopic on-line monitoring of reactions in dispersed medium: Chemometric challenges by Marlon M. Reis; Pedro H.H. Araújo; Claudia Sayer; Reinaldo Giudici (257-265).
Emulsion and suspension polymerizations are important industrial processes for polymer production. The end-user properties of polymers depend strongly on how the polymerization reactions proceed in time (i.e. batch or semicontinuous operation, rate of reagent feeding, etc.). In other words, these reactions are process dependent, which makes successful process control a key point in ensuring high-quality products. In several process control strategies the on-line monitoring of reaction performance is required. Due to the multiphase nature of the emulsion and suspension processes, there is a lack of sensors to perform successful on-line monitoring. Near infrared and Raman spectroscopies have been pointed out as useful approaches for monitoring emulsion and suspension polymerizations and several applications have been described. In such instances, the chemometric approach of relating near infrared and Raman spectra to polymer properties is widely used and has proven to be useful. Nevertheless, the multiphase nature of emulsion and suspension polymerizations also represents a challenge for the chemometric approach based on multivariate calibration models and demands the development of new methods. In this work, a set of novel results is presented from the monitoring of 15 batch emulsion reactions that show the chemometric challenge to be faced in the development of new methods for successful monitoring of processes carried out in dispersed medium. In order to discuss these results, several chemometric approaches were reviewed. It is shown that Raman and NIR spectroscopic techniques are suitable for on-line monitoring of monomer concentration and polymer content during the polymerizations, as well as medium heterogeneity properties, i.e. average particle size. It is also shown that Hotelling and Q statistics, widely used in chemometrics, might fail in monitoring these reactions, while an approach based on principal curves is able to overcome such a restriction.
Keywords: On-line monitoring; Emulsion polymerization; Raman; Near infrared;
Factorial analysis of the trihalomethanes formation in water disinfection using chlorine by Pedro M.S.M. Rodrigues; Joaquim C.G. Esteves da Silva; Maria Cristina G. Antunes (266-274).
The factors that affect trihalomethane (THM) (chloroform, bromodichloromethane, chlorodibromomethane and bromoform) formation from the chlorination of aqueous solutions of hydrophobic fulvic acids (FA) were investigated in a prototype laboratory simulation using factorial analysis. This strategy involved a fractional factorial design (16 plus 5 center experiments) of five factors (fulvic acids concentration, chlorine dose, temperature, pH and bromide concentration) and a Box–Behnken design (12 plus 3 center experiments) for the detailed analysis of three factors (FA concentration, chlorine dose and temperature). The concentration of THM was determined by headspace analysis by GC–ECD. The most significant factors affecting the production of the four THM were the following: chloroform: FA concentration and temperature; bromodichloromethane: FA concentration and chlorine dose; chlorodibromomethane: chlorine dose; and bromoform: chlorine dose and bromide concentration. Moreover, linear models were obtained for the four THM concentrations in the disinfection solution as a function of the FA concentration, chlorine dose and temperature, and it was observed that the complexity of the models (number of significant factors and interactions) increased with an increasing number of bromine atoms in the THM. Also, this study shows that when the FA concentration is reduced, the relative amount of brominated THM increases.
Keywords: Factorial analysis; Response surface methodology; Chlorine water disinfection; Hydrophobic fulvic acids; Trihalomethanes formation;
Self-modeling curve resolution (SMCR) by particle swarm optimization (PSO) by Hideyuki Shinzawa; Jian-Hui Jiang; Makio Iwahashi; Isao Noda; Yukihiro Ozaki (275-281).
Particle swarm optimization (PSO) combined with alternating least squares (ALS) is introduced to self-modeling curve resolution (SMCR) in this study to obtain an effective initial estimate. The proposed method uses PSO to search for the concentration profiles or pure spectra that give the best resolution result. SMCR sometimes yields insufficient resolution results by getting trapped in a local minimum when the initial estimates are poor. Owing to the advantages of PSO, the proposed method makes it possible to reduce this undesirable effect of local minima in SMCR. Moreover, a new criterion based on the global phase angle is also proposed for more effective performance of SMCR. It takes full advantage of the data structure; that is to say, a sequential change with respect to a perturbation can be considered in SMCR with the criterion. To demonstrate its potential, SMCR by PSO is applied to concentration-dependent near-infrared (NIR) spectra of mixture solutions of oleic acid (OA) and ethanol. Its curve resolution performance is compared with that of SMCR with evolving factor analysis (EFA). The results show that SMCR by PSO yields significantly better curve resolution performance than SMCR with EFA. It is revealed that SMCR by PSO is less sensitive to local minima in SMCR and that it can be a new effective tool for curve resolution analysis.
Keywords: Self-modeling curve resolution (SMCR); Particle swarm optimization (PSO); Global phase angle; Two-dimensional correlation spectroscopy; Near-infrared (NIR) spectra; Oleic acid (OA);
Direct determination of propranolol in urine by spectrofluorimetry with the aid of second order advantage by Lucas C. Silva; Marcello G. Trevisan; Ronei J. Poppi; Marcelo M. Sena (282-288).
This work presents an application of the second-order advantage provided by parallel factor analysis (PARAFAC), aimed at the direct determination of propranolol (PRO), a β-blocker also used as a doping agent, in human urine by spectrofluorimetry. The adopted strategy combined the use of PARAFAC, for extraction of the pure analyte signal, with the standard addition method, for a determination in the presence of an individual matrix effect caused by the quenching action of the proteins present in the urine. The urine samples were diluted 100-fold beforehand. For each sample, four standard additions were performed, in triplicate. A specific PARAFAC model was built for each triplicate of each sample, from three-way arrays formed by 231 emission wavelengths, 8 excitation wavelengths and 5 measurements (sample plus 4 additions). The models were built with three factors and always explained more than 99.87% of the total variance. The obtained loadings were related to PRO and two background interferences. The scores related to PRO were used for a linear regression in the standard addition method. The determinations obtained in the PRO concentration range from 5.0 to 20.0 μg ml−1 provided recoveries between 91.1 and 108.4%.
Keywords: PARAFAC; Molecular fluorescence; Clinical analysis; β-Blocker; Second-order standard addition method;
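The regression step of the standard addition method described in the abstract above can be sketched as follows. This is only an illustration of the classical univariate extrapolation, applied here to a vector standing in for the PARAFAC scores of the analyte; the numbers are synthetic, the function name is hypothetical, and the PARAFAC decomposition itself is not shown.

```python
import numpy as np


def standard_addition(added_conc, signal):
    """Estimate the analyte concentration in the measured solution.

    Fits signal = slope * added_conc + intercept and returns
    intercept / slope, i.e. the magnitude of the x-axis intercept.
    """
    slope, intercept = np.polyfit(added_conc, signal, 1)
    return intercept / slope


# Synthetic example: sample plus 4 additions (5 measurements in total).
added = np.array([0.0, 5.0, 10.0, 15.0, 20.0])  # added concentration
true_conc = 7.5                                 # hypothetical value
scores = 0.8 * (added + true_conc)              # ideal linear response
estimate = standard_addition(added, scores)
```

If the sample was diluted before measurement, the estimate has to be multiplied by the dilution factor to recover the concentration in the original sample.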
Application of non-linear optimization methods to the estimation of multivariate curve resolution solutions and of their feasible band boundaries in the investigation of two chemical and environmental simulated data sets by R. Tauler (289-298).
Although alternating least squares algorithms have proved extremely useful and flexible for solving multivariate curve resolution problems, other approaches based on non-linear optimization algorithms using non-linear constraints are possible. Once the subspaces defined by PCA solutions are identified, appropriate rotation and perturbation of these solutions can produce solutions fulfilling the constraints imposed by the physical nature of the investigated systems. In order to perform such a rotation, an optimization algorithm based on the fulfilment of constraints is proposed, and some examples of application in chemistry and environmental chemistry are given. It is shown that the solutions obtained either by alternating least squares or by the newly proposed algorithm are rather similar and that they are both within the boundaries of the band of feasible solutions obtained by an algorithm previously developed to estimate them.
Keywords: Multivariate curve resolution (MCR); Rotation ambiguities; Feasible band boundaries; Non-linear constrained optimization;
Visualisation and interpretation of Support Vector Regression models by B. Üstün; W.J. Melssen; L.M.C. Buydens (299-309).
This paper introduces a technique to visualise the information content of the kernel matrix and a way to interpret the ingredients of the Support Vector Regression (SVR) model. Recently, the use of Support Vector Machines (SVM) for solving classification (SVC) and regression (SVR) problems has increased substantially in the field of chemistry and chemometrics. This is mainly due to their high generalisation performance and their ability to model non-linear relationships in a unique and global manner. Modeling of non-linear relationships is enabled by applying a kernel function. The kernel function transforms the input data, usually non-linearly related to the associated output property, into a high dimensional feature space where the non-linear relationship can be represented in a linear form. Usually, SVMs are applied as a black box technique. Hence, the model cannot be interpreted in the way that, e.g., Partial Least Squares (PLS) models can. For example, the PLS scores and loadings make it possible to visualise and understand the driving force behind the optimal PLS machinery. In this study, we have investigated the possibilities to visualise and interpret the SVM model. Here, we have focused exclusively on Support Vector Regression to demonstrate these visualisation and interpretation techniques. Our observations show that we are now able to turn an SVR black box model into a transparent and interpretable regression modeling technique.
Keywords: Support Vector Regression; Feature space; Kernel functions; Non-linear regression; Model visualisation and interpretation;
Simultaneous multiresponse optimization applied to epinastine determination in human serum by using capillary electrophoresis by Luciana Vera-Candioti; Alejandro C. Olivieri; Héctor C. Goicoechea (310-318).
Experimental design and optimization techniques were implemented for the development of a rapid and simple capillary zone electrophoresis (CZE) method for the determination of epinastine hydrochloride in human serum. The effects of five factors were studied on the resolution between the peaks for the target analyte (epinastine hydrochloride) and lidocaine hydrochloride, used as internal standard, as well as on the analysis time. The factors were the concentration and pH of the buffer, the injection time, the injection voltage and the separation voltage. The separation was carried out by using an uncoated silica capillary with 50 μm i.d. and total length 64.5 cm (150 μm of path length) and UV detection (200 nm). Multiple response simultaneous optimization by using the desirability function was used to find experimental conditions where the system generates desirable results. The optimum conditions were: sodium phosphate buffer solution, 16.0 mmol L−1; pH 8.50; injection voltage, 20.0 kV; injection time, 30 s; separation voltage, 26.7 kV. The method was confirmed to be linear in the range of 2.0–12 ng mL−1. The injection repeatability of the method was evaluated by six injections at three concentration levels, while intra-assay precision was assessed by analysing a single concentration level, yielding CVs of ca. 1% for standard and 2% for serum samples. Accuracy was evaluated by recovery assays and by comparing with an HPLC method, the results being acceptable according to regulatory agencies. The ruggedness was evaluated by means of an experimental Plackett–Burman design, in which the accuracy was assessed when small changes were set in the studied parameters. Clean-up of human serum samples was carried out by means of a liquid–liquid extraction procedure, which gave a high extraction yield for epinastine hydrochloride (93.00%).
Keywords: Multiresponse optimization; Capillary electrophoresis; Epinastine;
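The simultaneous optimization of two responses (peak resolution and analysis time) with a desirability function, as described in the abstract above, can be sketched as below. The sketch assumes the widely used Derringer–Suich one-sided desirability forms combined by a geometric mean; the abstract does not state which form, limits or weights the authors used, so all numeric limits and function names here are hypothetical.

```python
import math


def d_larger_is_better(y, low, high, weight=1.0):
    """Desirability rising from 0 at `low` to 1 at `high` (e.g. resolution)."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


def d_smaller_is_better(y, low, high, weight=1.0):
    """Desirability falling from 1 at `low` to 0 at `high` (e.g. analysis time)."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** weight


def overall_desirability(ds):
    """Geometric mean of the individual desirabilities (0 if any is 0)."""
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


# Hypothetical candidate condition: resolution 2.0, analysis time 4.0 min.
d_res = d_larger_is_better(2.0, low=1.0, high=3.0)
d_time = d_smaller_is_better(4.0, low=2.0, high=10.0)
D = overall_desirability([d_res, d_time])
```

Because the geometric mean is zero whenever any single desirability is zero, a condition that fails one response completely cannot be rescued by a good value of the other.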
On-line determination and control of the water content in a continuous conversion reactor using NIR spectroscopy by H.W. Ward; Frank E. Sistare (319-322).
On-line near infrared spectroscopy was used to determine the water content in a continuous conversion reactor. The NIR predictions were incorporated into the distributed control system (DCS) which then controlled the addition of water to the reactor. The conversion reaction utilizes methanol, water, sodium carbonate and a reactant. Control of the water content is important for a number of reasons. At elevated water levels, a competing hydrolysis reaction increases along with the product solubility in the mother liquor, leading to product losses. At reduced water levels, the product becomes anhydrous and the reaction mixture becomes gelatinous, necessitating a shutdown of the reactor for cleaning. The previous procedure for monitoring water was to remove a sample once per hour, transfer the sample to the laboratory, and run a Karl Fischer assay. Upon obtaining results from the lab, an operator would manually adjust the water inlet valve to the reactor. NIR spectroscopy in an on-line mode allows spectra to be collected every 200 s, markedly increasing the frequency of results. A partial least squares model was constructed, validated and successfully implemented to predict the water content of the reactor. Further, by feeding the results to the process DCS, water additions to the reactor were fully automated. The increased frequency of sampling by NIR led to an improvement in the control of the water content and decreased the normal amount of equipment downtime. These factors improved process stability and recovery, thereby generating an estimated $500,000 in savings over the course of the campaign.
Keywords: Near-infrared (NIR); Control; Water; Spectroscopy;
Finding relevant spectral regions between spectroscopic techniques by use of cross model validation and partial least squares regression by Frank Westad; Nils Kristian Afseth; Rasmus Bro (323-327).
In this paper, we extend the concept of cross model validation (CMV) to multiple X and Y variables where different spectroscopic techniques serve as X and Y data in a regression context. For the first dataset on marzipan samples the main objective was to find significant regions in the spectral data, and to discuss the issue of false discovery, i.e. combinations of variables that erroneously are found to be significant. A permutation test within the framework of CMV showed that no regression coefficients in the partial least squares regression (PLSR) model between FT-IR and VIS/NIR spectra show significance at the 5% level. We believe the reason is that the CMV acts as a strong filter against spurious correlations. Corresponding CH- and OH-bands between FT-IR and NIR spectra gave significant regions. For the second dataset, the results from CMV are interpreted in more detail with chemical background knowledge in mind. Most of the significant regions found between the Raman and NIR spectra could be interpreted from the chemical composition of the oil mixtures. Some regions were more difficult to interpret, which could be due to systematic baseline effects in the NIR data.
Keywords: Spectroscopy; Cross model validation; Partial least squares regression; Multiple tests; Permutation;
Comparative analysis of volatile components from Clematis species growing in China by Ying-Xu Zeng; Chen-Xi Zhao; Yi-Zeng Liang; Hong Yang; Hong-Zhuang Fang; Lun-Zhao Yi; Zhong-Da Zeng (328-339).
The volatile components of stems and roots, and of five Clematis species from China, were compared and analyzed by gas chromatography–mass spectrometry (GC–MS) combined with alternative moving window factor analysis (AMWFA), a new chemometric resolution method. Identification of the compounds was also assisted by comparison of temperature-programmed retention indices (PTRIs) on HP-5MS with authentic samples included in our own laboratory database under construction. A total of 153 different compounds accounting for 86.6–96.5% were identified and significant qualitative and quantitative differences were observed among the samples. The major volatile components in the different essential oils from Clematis species were n-hexadecanoic acid and (Z,Z)-9,12-octadecadienoic acid. Our work further demonstrated that chemometric resolution techniques applied to the two-dimensional data, together with PTRIs, can provide a complementary and convenient method for fast and accurate analysis of complex essential oils.
Keywords: Essential oils; Clematis; Gas chromatography–mass spectrometry (GC–MS); Temperature-programmed retention indices (PTRIs); Alternative moving window factor analysis (AMWFA);