Selecting a small number of relevant genes for accurate classification
of samples is essential for developing diagnostic tests. We present
the Bayesian Model Averaging (BMA) method for gene selection and
classification of microarray data. Typical gene selection and
classification procedures ignore model uncertainty and use a single set
of relevant genes (model) to predict the class. BMA accounts for the
uncertainty about the best set to choose by averaging over multiple
models (sets of potentially overlapping relevant genes).
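
The averaging step can be written compactly. The notation is assumed here for illustration and is not defined in the text: M_1, ..., M_K are the candidate models (gene sets), D is the training data, x is a new sample, and y is its class label. Under the standard BMA formulation, each model's prediction is weighted by that model's posterior probability:

```latex
\[
P(y \mid x, D) \;=\; \sum_{k=1}^{K} P(y \mid x, M_k, D)\, P(M_k \mid D),
\qquad \sum_{k=1}^{K} P(M_k \mid D) = 1 .
\]
```
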
We show that BMA selects fewer relevant genes than other methods while
achieving high prediction accuracy on three microarray datasets. Our BMA
algorithm is applicable to microarray datasets with any
number of classes, and outputs posterior probabilities for the selected genes
and models. The selected models typically consist of only a few genes. The
combination of high accuracy, small numbers of genes, and posterior
probabilities for the predictions should make BMA a powerful tool for
developing diagnostics from expression data.
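
As a minimal sketch of how posterior probabilities for genes and averaged predictions can be derived from a set of selected models, consider the toy example below. The gene names, model weights, and per-model class probabilities are hypothetical values chosen for illustration, not results from the paper, and the helper functions are not part of any published BMA software. It assumes the usual BMA convention that a gene's posterior probability is the sum of the posterior probabilities of the models that contain it.

```python
from collections import defaultdict

# Hypothetical selected models (illustrative values only). Each entry is:
# (gene set, posterior model probability, predicted P(class 1) for a new sample).
models = [
    ({"geneA", "geneB"}, 0.5, 0.9),
    ({"geneA", "geneC"}, 0.3, 0.8),
    ({"geneB"}, 0.2, 0.4),
]


def gene_posteriors(models):
    """Sum the posterior probabilities of all models that include each gene."""
    probs = defaultdict(float)
    for genes, weight, _ in models:
        for gene in genes:
            probs[gene] += weight
    return dict(probs)


def averaged_prediction(models):
    """Weight each model's predicted probability by its posterior probability."""
    return sum(weight * p_class1 for _, weight, p_class1 in models)


gene_probs = gene_posteriors(models)
pred = averaged_prediction(models)

for gene in sorted(gene_probs):
    print(f"{gene}: posterior probability {gene_probs[gene]:.2f}")
print(f"Averaged P(class 1) = {pred:.2f}")
```

In this toy setting, a gene that appears in several high-probability models receives a high posterior probability even though no single model is treated as the definitive answer.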