Bipartite Ranking, Area Under the ROC Curve (AUC), and Partial AUC
[Summary & Contributions] | [Relevant Publications]
Summary and Contributions
The problem of bipartite ranking, where one is given training examples with positive and negative labels and the goal is to learn a scoring function that ranks new positive instances higher than negative ones, has many applications, ranging from information retrieval to recommender systems and from credit risk scoring to drug discovery. We developed new theoretical results for this problem, including the introduction of ‘bipartite rank-shatter coefficients’ (analogous to the standard shatter coefficients/growth function in VC dimension theory for binary classification) and an extension of stability based generalization error bounds, and later extended some of these results to other types of ranking problems. We also subsequently answered some foundational questions about bipartite ranking that had remained open for many years, including the question of whether one could design statistically consistent algorithms for bipartite ranking using univariate (rather than pairwise) losses, and the question of elucidating the precise relationship between bipartite ranking, binary classification, and binary class probability estimation. Most of the works above focus on the area under the ROC curve (AUC), a popular performance measure for bipartite ranking; in some applications, however, performance is measured not in terms of the full AUC, but rather in terms of the partial AUC between two specified false positive rates (see figure below). With this in mind, we also developed novel support vector algorithms for optimizing the partial AUC; a variant of one of these algorithms was used in our work on predicting anti-cancer drug response in patients.
Relevant Publications
- Harikrishna Narasimhan and Shivani Agarwal.
Support vector algorithms for optimizing the partial area under the ROC curve.
Neural Computation, 29(7):1919-1963, 2017.
[link] - Biswanath Majumder, Ulaganathan Baraneedharan, Saravanan Thiyagarajan, Padhma Radhakrishnan, Harikrishna Narasimhan, Muthu Dhandapani, Nilesh Brijwani, Dency D. Pinto, Arun Prasath, Basavaraja U. Shanthappa, Allen Thayakumar, Rajagopalan Surendran, Govind K. Babu, Ashok M. Shenoy, Moni A. Kuriakose, Guillaume Bergthold, Peleg Horowitz, Massimo Loda, Rameen Beroukhim, Shivani Agarwal, Shiladitya Sengupta, Mallikarjun Sundaram and Pradip K. Majumder.
Predicting clinical response to anticancer drugs using an exvivo platform that captures tumor heterogeneity.
Nature Communications, 6:6169, 2015.
[pdf] - Shivani Agarwal.
Surrogate regret bounds for bipartite ranking via strongly proper losses.
Journal of Machine Learning Research, 15:1653-1674, 2014.
[pdf] - Harikrishna Narasimhan and Shivani Agarwal.
On the relationship between binary classification, bipartite ranking, and binary class probability estimation.
In Advances in Neural Information Processing Systems (NIPS), 2013.
Spotlight paper.
[pdf] [spotlight slides] - Harikrishna Narasimhan and Shivani Agarwal.
SVM_pAUC^tight: A new support vector method for optimizing partial AUC based on a tight convex upper bound.
In Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2013.
[pdf] - Shivani Agarwal.
Surrogate regret bounds for the area under the ROC curve via strongly proper losses.
In Proceedings of the 26th Annual Conference on Learning Theory (COLT), 2013.
[pdf] - Harikrishna Narasimhan and Shivani Agarwal.
A structural SVM based approach for optimizing partial AUC.
In Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
[pdf] [supplementary material] - Shivani Agarwal.
The Infinite Push: A new support vector ranking algorithm that directly optimizes accuracy at the absolute top of the list.
In Proceedings of the SIAM International Conference on Data Mining (SDM), 2011.
[pdf] - Shivani Agarwal.
Learning to rank on graphs.
Machine Learning, 81(3):333-357, 2010.
[pdf] - Shivani Agarwal and Partha Niyogi.
Generalization bounds for ranking algorithms via algorithmic stability.
Journal of Machine Learning Research, 10:441-474, 2009.
[pdf] - Shivani Agarwal.
Ranking on graph data.
In Proceedings of the 23rd International Conference on Machine Learning (ICML), 2006.
[pdf] [errata] - Shivani Agarwal and Partha Niyogi.
Stability and generalization of bipartite ranking algorithms.
In Proceedings of the 18th Annual Conference on Learning Theory (COLT), 2005.
[pdf] - Shivani Agarwal and Dan Roth.
Learnability of bipartite ranking functions.
In Proceedings of the 18th Annual Conference on Learning Theory (COLT), 2005.
[pdf] - Shivani Agarwal, Thore Graepel, Ralf Herbrich, Sariel Har-Peled and Dan Roth.
Generalization bounds for the area under the ROC curve.
Journal of Machine Learning Research, 6:393-425, 2005.
[pdf] - Shivani Agarwal, Sariel Har-Peled and Dan Roth.
A uniform convergence bound for the area under the ROC curve.
In Proceedings of the 10th International Conference on Artificial Intelligence and Statistics (AISTATS), 2005.
[pdf] - Shivani Agarwal, Thore Graepel, Ralf Herbrich and Dan Roth.
A large deviation bound for the area under the ROC curve.
In Proceedings of the 18th Annual Conference on Neural Information Processing Systems (NIPS), 2004.
Published as Advances in Neural Information Processing Systems 17, pages 9-16, MIT Press, 2005.
[pdf]