The Effect of Class Imbalance on Precision-Recall Curves

In this note I study how the precision of a classifier depends on the ratio r of positive to negative cases in the test set, as well as on the classifier's true and false positive rates. This relationship, which seems not to be well known, allows prediction of how the precision-recall curve will change with r. It also allows prediction of how F_β and the Precision Gain and Recall Gain measures of Flach and Kull (2015) vary with r.
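
The dependence described above can be sketched from the standard definitions alone. With r = P/N, TP = TPR · P and FP = FPR · N, so precision = r · TPR / (r · TPR + FPR). The snippet below is a minimal illustration under that assumption. The function names `precision` and `f_beta` are hypothetical, and the notation in the full text may differ.

```python
def precision(tpr: float, fpr: float, r: float) -> float:
    """Precision at one operating point, where r = positives / negatives.

    Follows from TP = tpr * P and FP = fpr * N with r = P / N.
    """
    denom = r * tpr + fpr
    if denom == 0:
        raise ValueError("no predicted positives: precision is undefined")
    return r * tpr / denom


def f_beta(tpr: float, fpr: float, r: float, beta: float = 1.0) -> float:
    """F_beta at one operating point; recall equals the true positive rate."""
    if tpr == 0:
        return 0.0  # no true positives, so F_beta is zero by convention
    p = precision(tpr, fpr, r)
    b2 = beta * beta
    return (1 + b2) * p * tpr / (b2 * p + tpr)


if __name__ == "__main__":
    # Same classifier (TPR = 0.8, FPR = 0.1) evaluated at three class ratios.
    for r in (1.0, 0.1, 0.01):
        print(f"r={r}: precision={precision(0.8, 0.1, r):.4f}, "
              f"F1={f_beta(0.8, 0.1, r):.4f}")
```

Because recall (the true positive rate) does not depend on r, each point on the precision-recall curve moves only vertically as r changes. Precision falls as positives become rarer.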
