AUC: a misleading measure of the performance of predictive distribution models

Lobo, J. M., Jiménez-Valverde, A., & Real, R. (2008). AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography, 17(2), 145–151. http://doi.org/10.1111/j.1466-8238.2007.00358.x


 

As predictive distribution models become more widely used, especially in species niche modeling, many researchers are turning to the area under the receiver operating characteristic curve (AUC) to assess the predictive accuracy of their models. Lobo et al. raise five main issues with using AUC in this manner. According to Lobo et al., AUC…

1) is insensitive to transformations of the predicted probabilities as long as ranks are preserved, meaning that well-fit models may have poor discrimination and vice versa
2) summarizes test performance over regions of extreme false-positive and false-negative rates that researchers are rarely interested in, leading the authors to suggest partial AUC
3) weights omission and commission errors equally. In presence-absence data, false absences are more likely than false presences, so the two kinds of error should not be treated as equal
4) plots do not describe the spatial distribution of errors, so researchers cannot use them to examine whether errors are spatially heterogeneous
5) does not reliably measure accuracy if the environmental range is larger than the geographical extent of the presence data, as is the case for most SDM predictions
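
Point 1 can be illustrated with a minimal Python sketch. The toy labels and probabilities are hypothetical, and the Brier score is used here only as one convenient measure of fit; it is my choice for the illustration, not something taken from the paper. AUC is computed as the proportion of presence-absence pairs in which the presence receives the higher score.

```python
def auc(labels, scores):
    """AUC as the share of (presence, absence) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(labels, scores):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


# Hypothetical toy data: 1 = presence, 0 = absence.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

# A monotone transformation: every probability shrinks, but the ranking is unchanged.
squashed = [p ** 4 for p in probs]

auc_original = auc(labels, probs)
auc_squashed = auc(labels, squashed)
brier_original = brier(labels, probs)
brier_squashed = brier(labels, squashed)

print("AUC:  ", auc_original, auc_squashed)      # identical, because ranks are preserved
print("Brier:", brier_original, brier_squashed)  # differs, because the probabilities changed
```

The two AUC values match even though the transformed probabilities fit the observations much worse, which is the sense in which AUC says nothing about how well the predicted probabilities are fit.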

Additionally, AUC is often used to determine a ‘threshold’ probability of species occurrence when converting an SDM to a binary map, in spite of the fact that a supposed ‘benefit’ of AUC is that it is independent of the chosen threshold and of the subjectivity that comes with choosing one. The only case in which the authors encourage the use of AUC is in distinguishing species whose distributions are more general (low AUC score) from those whose distributions are restricted. To combat the failings of AUC, Lobo et al. suggest that sensitivity and specificity also be reported, and that AUC only be used to compare models of the same species over an identical extent. I think another important point to include would be the quality of the data. Several of these problems stem from bias in the absence data for species distributions, so extra effort to combat this bias and to ensure more complete presence-absence data sets would reduce the bias introduced by AUC.
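
As a rough sketch of what reporting sensitivity and specificity alongside AUC might look like, the Python snippet below computes both at two candidate thresholds. The data, the thresholds, and the function name `sensitivity_specificity` are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are mapped to presence."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = presence, 0 = absence.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

for threshold in (0.35, 0.65):
    sens, spec = sensitivity_specificity(labels, probs, threshold)
    # 1 - sensitivity is the omission rate; 1 - specificity is the commission rate.
    print(f"threshold={threshold}: sensitivity={sens}, specificity={spec}")
```

The same set of predictions yields no omission errors at the lower threshold and no commission errors at the higher one, a trade-off that a single AUC value hides.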