Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model

Hijmans, R.J., 2012. Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model. Ecology.

Spatial sorting bias (the tendency of testing-presence points to lie closer to training-presence points than testing-absence points do) and the credibility of cross-validation for assessing model accuracy remain large issues for SDMs. Hijmans (2012) evaluates two ways of selecting testing-presence data and two ways of selecting testing-absence data to better understand how this bias and cross-validation may lead to inflated confidence in SDMs. He found that a null model based solely on distance to the nearest presence point performed about as well (AUC 0.69) as Bioclim (0.64) and Maxent (0.73). This suggests that uncalibrated cross-validation results, which are what most SDM studies report, are difficult to interpret directly, and that calibrating against a null model could give more reliable estimates of model accuracy. The study calls into question many results from SDMs, especially those using data that are inherently clumpy (e.g. museum records). I think this is an especially open area for research, with questions such as: How can knowledge of a species' biology be used to pre-process (filter) occurrence data before they are input into SDMs? And how does clumpiness of occurrence data affect the predictability of a species' range?
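To make the null-model idea concrete, here is a minimal Python sketch on synthetic data. It is not the paper's code. It assumes planar coordinates (so plain Euclidean distance stands in for geographic distance), and all function names are hypothetical. The calibration is assumed to take the form cAUC = AUC + 0.5 - max(0.5, null AUC), and the bias is summarized as the ratio of mean nearest-training-presence distances for testing-presence versus testing-absence points; treat both as assumptions about the exact definitions rather than a faithful reproduction of the method.

```python
import math
import random


def auc(pos_scores, neg_scores):
    """Rank-based AUC: chance that a random presence outscores a random absence (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def nearest_distance(point, references):
    """Euclidean distance from `point` to the closest reference point."""
    return min(math.dist(point, r) for r in references)


def null_model_scores(test_points, train_presence):
    """Distance-only null model: closer to a training presence = higher score. No environmental data."""
    return [-nearest_distance(p, train_presence) for p in test_points]


def sorting_bias_ratio(test_presence, test_absence, train_presence):
    """Mean nearest-training-presence distance of testing presences divided by that of testing absences.
    Values near 1 suggest little bias; values near 0 suggest strong bias (assumed definition)."""
    d_p = sum(nearest_distance(p, train_presence) for p in test_presence) / len(test_presence)
    d_a = sum(nearest_distance(a, train_presence) for a in test_absence) / len(test_absence)
    return d_p / d_a


def calibrated_auc(model_auc, null_auc):
    """Assumed calibration: subtract whatever the null model achieves above 0.5."""
    return model_auc + 0.5 - max(0.5, null_auc)


# Synthetic, deliberately clumpy example: presences cluster near the origin,
# absences are spread uniformly over a much larger square.
rng = random.Random(0)
presence = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
train_presence, test_presence = presence[:40], presence[40:]
test_absence = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(200)]

null_auc = auc(
    null_model_scores(test_presence, train_presence),
    null_model_scores(test_absence, train_presence),
)
bias = sorting_bias_ratio(test_presence, test_absence, train_presence)

print("null-model AUC:", round(null_auc, 3))
print("sorting bias ratio:", round(bias, 3))
# Applying the assumed calibration to the Maxent and null-model values quoted above:
print("calibrated AUC for 0.73 vs null 0.69:", round(calibrated_auc(0.73, 0.69), 3))
```

With clumped presences and widely scattered absences, the distance-only null model scores well despite knowing nothing about the environment, which is the inflation the paper warns about. Under the assumed calibration, a raw AUC that barely beats the null model ends up only slightly above 0.5.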