Overview
My research interests lie primarily in the computational, mathematical, and statistical foundations of machine learning and data science, spanning both theory and algorithms. A recurring theme of my research is to extract the core computational, mathematical, and statistical principles underlying complex problems in machine learning and data science, and to exploit the insights thus gained to develop principled algorithms with both computational and statistical guarantees. My interests also include applications of machine learning, particularly in the life sciences, as well as connections between machine learning and other disciplines such as economics, operations research, and psychology, especially in relation to statistical ranking and choice modeling. More broadly, I am excited by research at the intersection of computer science, mathematics, and statistics, and by its ability to turn data into actionable insights in both the natural and social sciences.
Our group’s research has been published in venues such as the Journal of Machine Learning Research, Nature Communications, the International Conference on Machine Learning (ICML), the Conference on Neural Information Processing Systems (NeurIPS), the Conference on Learning Theory (COLT), the International Conference on Artificial Intelligence and Statistics (AISTATS), the Conference on Uncertainty in Artificial Intelligence (UAI), and the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), among others. Our work has been selected four times for spotlight presentations at the NeurIPS conference.
Core Machine Learning: Theory and Algorithms
- Surrogate loss functions in machine learning (and convex calibration dimension)
- Consistent output coding algorithms for multiclass and multi-label learning
- Strongly proper losses
- Hierarchical classification
- Learning with complex (multivariate/non-decomposable) performance measures
- Bayes optimal feature selection
- Learning from noisy data
- Bipartite ranking, area under the ROC curve (AUC), and partial AUC (see the illustrative sketch after this list)
- Bayes consistency vs. H-consistency
- PAC learning
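As a small, purely illustrative example of one of the topics above, the Python sketch below computes the empirical AUC in the bipartite ranking setting as the fraction of positive-negative pairs that a scoring function orders correctly (counting ties as half); the scores and the function name are hypothetical choices for this illustration and are not drawn from the papers above.

```python
# Minimal illustrative sketch (hypothetical example, not from the work cited above):
# the empirical AUC of a scorer is the fraction of (positive, negative) pairs
# whose scores are ranked in the correct order, with ties counted as 1/2.

def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly by the scores."""
    correct = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                correct += 1.0
            elif sp == sn:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))

if __name__ == "__main__":
    # Hypothetical scores assigned by some model to positive and negative examples.
    pos = [0.9, 0.8, 0.4]
    neg = [0.7, 0.3, 0.2, 0.1]
    print(f"Empirical AUC: {empirical_auc(pos, neg):.3f}")  # 11/12 pairs correct
```

Optimizing such pairwise ranking criteria (or their partial-AUC variants) directly is difficult, which is one motivation for studying the surrogate losses and consistency questions listed above.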
Applications: Life Sciences, Cheminformatics, Computer Vision
- Precision medicine: Predicting anti-cancer drug response
- Predicting patient survival in intensive care units (ICUs)
- Ranking chemical compounds for drug discovery
- Object detection in computer vision