can attain the two objectives of highlighting the relevant variables (genes) and possibly improving classification results.
In this paper, we propose a wrapper approach to gene selection for the classification of gene expression data, using simulated annealing
together with supervised classification. The proposed approach can perform global combinatorial searches through the space of
all possible input subsets, can handle numerical, categorical, or mixed inputs, and is able to find (sub-)optimal
subsets of inputs that yield low classification errors. The method has been tested on publicly available bioinformatics data sets
using support vector machines and on a mixed-type data set using classification trees. We also propose some heuristics that
speed up convergence. The experimental results highlight the ability of the method to select minimal sets of relevant
features.
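
The wrapper search summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-feature flip move, the geometric cooling schedule, and the subset-size penalty are all assumptions made for illustration, and a leave-one-out nearest-centroid classifier stands in for the support vector machines and classification trees used in the paper so that the example needs only the Python standard library.

```python
import math
import random


def nearest_centroid_error(X, y, subset):
    """Leave-one-out error of a nearest-centroid classifier on the chosen columns.

    Hypothetical stand-in for the SVM / classification tree used in the paper.
    """
    if not subset:
        return 1.0  # no inputs selected: treat as the worst possible error
    cols = sorted(subset)
    errors = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j in range(len(X)):
            if j == i:
                continue  # hold out sample i
            s = sums.setdefault(y[j], [0.0] * len(cols))
            for k, c in enumerate(cols):
                s[k] += X[j][c]
            counts[y[j]] = counts.get(y[j], 0) + 1
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            d = sum((X[i][c] - s[k] / counts[label]) ** 2
                    for k, c in enumerate(cols))
            if d < best_dist:
                best_label, best_dist = label, d
        errors += best_label != y[i]
    return errors / len(X)


def anneal_feature_subset(X, y, n_iter=300, t0=0.1, cooling=0.99,
                          penalty=0.01, seed=0):
    """Simulated annealing over feature subsets (all parameters are assumptions)."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def cost(subset):
        # Wrapper objective: classifier error plus an assumed size penalty
        # that favors small subsets.
        return nearest_centroid_error(X, y, subset) + penalty * len(subset)

    current = {f for f in range(n_features) if rng.random() < 0.5}
    current_cost = cost(current)
    best, best_cost = set(current), current_cost
    t = t0
    for _ in range(n_iter):
        # Move: flip one randomly chosen feature in or out of the subset.
        candidate = current ^ {rng.randrange(n_features)}
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = set(current), current_cost
        t *= cooling  # geometric cooling
    return best, best_cost
```

Calling `anneal_feature_subset(X, y)` with `X` as a list of numeric rows and `y` as a list of class labels returns the best subset of column indices visited and its cost. Because every candidate subset requires retraining and re-evaluating the classifier, the cost function dominates the running time, which is presumably why convergence heuristics matter for this kind of search.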
- Content Type: Journal Article
- DOI: 10.1007/s00500-010-0597-8
- Authors:
  - Maurizio Filippone, University of Glasgow, Department of Computing Science, Sir Alwyn Williams Building, G12 8QQ Glasgow, UK
  - Francesco Masulli, University of Genova, Department of Computer and Information Sciences, Genoa, Italy
  - Stefano Rovetta, University of Genova, Department of Computer and Information Sciences, Genoa, Italy
- Journal: Soft Computing – A Fusion of Foundations, Methodologies and Applications
- Online ISSN: 1433-7479
- Print ISSN: 1432-7643