Bayes Optimal Feature Selection

Summary and Contributions

In many data science applications, the raw input data is very high dimensional, and one must select a small number of input dimensions, or features, before performing any further analysis. While many feature selection methods have been proposed and studied in the literature, most do not account in a principled manner for the downstream task to be performed, or for which aspects of the data must be preserved for that task. For settings where the downstream task involves supervised learning, we developed a Bayes optimal feature selection approach that minimizes the ‘Bayes error’ with respect to the target performance measure in the reduced feature space. The approach preserves as much task-relevant information as possible, and recovers the classical mutual information based feature selection criterion as a special case.
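
To make the objective concrete, the following sketch (in our own notation, which need not match the paper's) writes out the subset-selection criterion and the standard argument by which a log-loss instantiation reduces to mutual information based selection.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Sketch of the selection objective (notation ours, not necessarily the paper's):
% among all feature subsets S of size k, pick one whose Bayes risk R*(S) -- the
% risk of the best possible predictor that sees only X_S, measured by the target
% performance measure -- is smallest.
\[
  S^\star \in \operatorname*{arg\,min}_{S \subseteq \{1,\dots,d\},\ |S| = k} R^\ast(S).
\]

% Special case: under the logarithmic loss, the Bayes risk of the reduced
% representation X_S is the conditional entropy H(Y | X_S), so minimizing it
% is equivalent to maximizing the mutual information I(X_S; Y):
\[
  \operatorname*{arg\,min}_{|S|=k} H(Y \mid X_S)
  \;=\;
  \operatorname*{arg\,max}_{|S|=k} I(X_S; Y),
  \qquad I(X_S; Y) = H(Y) - H(Y \mid X_S).
\]

\end{document}
```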

Relevant Publications

  • Saneem Ahmed, Harikrishna Narasimhan and Shivani Agarwal.
    Bayes optimal feature selection for supervised learning with general performance measures.
    In Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence (UAI), 2015.
    [pdf]
