Phillips, S. J., Elith, J. (2013). On estimating probability of presence from use-availability or presence-background data. Ecology, 94: 1409-1419. DOI:
Ecologists studying a wide range of species wish to map species distributions and/or predict the suitability of sites for occupation and persistence. This paper investigates statistical methods that estimate the probability that a species is present at a site as determined by environmental covariates. Exponential models are most often used for presence-background data, and provide maximum-likelihood estimates of the relative probability of presence. This output is proportional to the absolute probability of presence. Because the constant of proportionality is unknown, models of absolute probability of presence may be preferable. Five logistic methods (LK, LI, SC, EM, SB) for using presence-background data are presented and tested in an experimental comparison. Seven simulated species with defined probability-of-presence functions were modeled in an environment with a single predictor variable distributed uniformly from 0 to 1 across the landscape. For models that required an estimate of prevalence, the known simulated prevalence was supplied first, followed by the true prevalence with an error of 0.1, to assess sensitivity to that estimate. There was a stark contrast between two groups of models: the LI and LK methods (no species prevalence parameter) had higher RMS errors than the SC, EM, and SB methods (which include a species prevalence parameter). The EM, SC, and SB methods performed well in the experiments when given an estimate of prevalence. These models may be useful because they can estimate the absolute probability of occurrence at a location, which can aid in the management and conservation of species across a region. Because a substantially incorrect estimate of prevalence can cause over- or under-prediction, in cases where estimates of prevalence are unreliable the use of MaxEnt, or other methods like it, may be better.
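The role of prevalence in the summary above can be illustrated with a minimal numerical sketch. It is not one of the five methods from the paper. It only shows why a relative output leaves the absolute probability undetermined, and how an assumed prevalence fixes the scale. The function true_prob, the constant 7.3, and the helper names rescale and rms_error are all hypothetical choices made for illustration; the single predictor on a uniform 0 to 1 landscape and the prevalence error of 0.1 follow the setup described in the summary.

```python
import numpy as np

def true_prob(x):
    # Hypothetical probability-of-presence function (an assumption,
    # not one of the seven simulated species in the paper).
    return 0.2 + 0.6 * x

def rescale(relative, prevalence):
    # Scale a relative output so that its mean over the landscape
    # equals the assumed prevalence.
    return relative * prevalence / relative.mean()

def rms_error(estimate, truth):
    # Root-mean-square error between estimated and true probabilities.
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

# Single predictor spread evenly from 0 to 1 across the landscape.
x = np.linspace(0.0, 1.0, 1001)
p = true_prob(x)

# Prevalence is the mean probability of presence over the landscape.
prevalence = p.mean()

# A relative output: proportional to p, with a constant that the
# modeller does not know (7.3 is arbitrary).
relative = p / 7.3

est_true = rescale(relative, prevalence)        # correct prevalence
est_off = rescale(relative, prevalence + 0.1)   # prevalence off by 0.1

print("RMS error, true prevalence:", rms_error(est_true, p))
print("RMS error, prevalence + 0.1:", rms_error(est_off, p))
```

With the correct prevalence the rescaled output matches the true probability up to floating-point error, whereas the offset prevalence inflates every prediction by the same factor. This is the over-prediction risk that the summary attributes to a substantially incorrect prevalence estimate.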