Parallel genetic algorithm co-optimization of spectral pre-processing and wavelength selection for PLS regression

Bibliographic Details
Published in: Chemometrics and intelligent laboratory systems 2011-05, Vol.107 (1), p.50-58
Main Authors: Devos, Olivier, Duponchel, Ludovic
Format: Article
Language: English
Description
Summary: Spectral pre-processing and variable selection are often used to produce PLS regression models with better prediction abilities. We propose here to optimize the spectral pre-processing and the variable selection simultaneously for PLS regression. The method is based on a parallel genetic algorithm with a single chromosome coding for both pre-processing and variable selection. A pool of 31 pre-processing functions with various settings is tested. Several pre-processing steps can be combined in the same chromosome. Three near infrared spectroscopic datasets have been used to evaluate the methodology. The efficacy of the co-optimization is evaluated by comparing the prediction ability of the PLS models with that of models obtained after pre-processing optimization only. The effect of the number of successive pre-processing steps has also been tested. Across the datasets used here, two different behaviors are observed. In the first case, the GA co-optimization procedure is found to perform well, leading to a substantial improvement in prediction ability, especially when three consecutive pre-processing techniques are applied. In the second case, pre-processing optimization alone is enough to obtain an optimal model. All these models are optimal and more accurate than the classical models (built with “trial and error” methods).
ISSN: 0169-7439, 1873-3239
DOI: 10.1016/j.chemolab.2011.01.008
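
The summary describes a single chromosome that codes for both the pre-processing steps and the selected variables. The record does not give the actual encoding, so the following is only a minimal sketch under stated assumptions: the chromosome is taken to be three integer genes (indices into a pre-processing pool) followed by one binary gene per wavelength, and the pool holds three stand-in functions rather than the 31 used in the article. All names (`POOL`, `N_STEPS`, `random_chromosome`, `decode`) are hypothetical.

```python
import numpy as np

# Hypothetical stand-in pool; the article tests 31 functions with various settings.
def identity(X):
    return X

def snv(X):
    # Standard normal variate: centre and scale each spectrum (row).
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def first_derivative(X):
    # Finite-difference derivative along the wavelength axis (column count is kept).
    return np.gradient(X, axis=1)

POOL = [identity, snv, first_derivative]
N_STEPS = 3  # assumed number of consecutive pre-processing steps

def random_chromosome(n_wavelengths, rng):
    # Assumed layout: N_STEPS pool indices, then a 0/1 flag per wavelength.
    steps = rng.integers(0, len(POOL), size=N_STEPS)
    mask = rng.integers(0, 2, size=n_wavelengths)
    return np.concatenate([steps, mask])

def decode(chromosome, X):
    # Apply the coded pre-processing steps in order, then keep the flagged wavelengths.
    steps = chromosome[:N_STEPS]
    mask = chromosome[N_STEPS:].astype(bool)
    for s in steps:
        X = POOL[int(s)](X)
    return X[:, mask]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))          # 5 synthetic "spectra", 10 wavelengths
chrom = random_chromosome(X.shape[1], rng)
X_model = decode(chrom, X)            # matrix that a PLS model would be fitted on
```

In a genetic algorithm, the fitness of each chromosome would then be a prediction error (for example a cross-validated error of a PLS model fitted on `X_model`); that step is left out here because the record does not specify the criterion used.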