Overcoming limitations of modelling rare species by using ensembles of small models

Modeling the distributions of rare species is inherently difficult because the number of occurrence points is low relative to the number of explanatory variables. One remedy is to reduce the number of predictors so that there is roughly one predictor for every 10 occurrence points. This becomes problematic for extremely rare species: 20 occurrence points would allow only two predictors. At the same time, interest in modeling rare species is high because of their conservation importance. The mismatch between the need to model rare species distributions and the difficulty of doing so is termed the “rare-species modeling paradox.” To get around it, a method has been proposed in which many models, each containing only a few predictors, are fitted and then averaged using weights based on model performance. This method, called an ensemble of small models (ESM), performed reasonably well for a single endemic species of the Iberian Peninsula but had not yet been tested against traditional species distribution models (SDMs). This paper compares ESMs with traditional SDMs for 107 vascular plant species of varying rarity, split into three groups: very rare, rare, and less rare. GLMs, GBMs, Maxent models, and their ensemble prediction (EP) were constructed for all species using 11 predictors. A linear mixed-effects model was used to test for the effects of modeling strategy (ESM, traditional SDM), modeling technique (GLM, GBM, Maxent, EP), sample size, and the two-way interactions between these factors. The effects of all factors on model performance were significant, and ESMs nearly always outperformed their standard counterparts. Modeling strategy interacted with sample size: species with a low sample size (very rare) benefitted the most from the use of ESMs. The paper shows that ensembles of small models can be an effective way to model rare species. Because the study used only vascular plants, the method will need to be tested on other taxonomic groups to confirm its validity, though the authors anticipate similar success.
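
The averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the small models are built from pairs of predictors, that each model is weighted by Somers' D (2 × AUC − 1) computed on held-out data, and that models no better than random get zero weight. The function names and the toy predictions are hypothetical.

```python
from itertools import combinations


def max_predictors(n_occurrences, points_per_predictor=10):
    """Largest predictor count allowed by the one-predictor-per-10-points rule."""
    return n_occurrences // points_per_predictor


def auc(scores, labels):
    """Chance that a random presence scores above a random absence (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def esm_average(small_model_preds, labels):
    """Weighted mean of small-model predictions.

    Assumption: weight = max(0, Somers' D), with Somers' D = 2 * AUC - 1.
    `labels` should come from evaluation data the models were not fitted on.
    """
    weights = {
        name: max(0.0, 2.0 * auc(pred, labels) - 1.0)
        for name, pred in small_model_preds.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no small model performs better than random")
    ensemble = [
        sum(weights[name] * pred[i] for name, pred in small_model_preds.items()) / total
        for i in range(len(labels))
    ]
    return ensemble, weights


# With 11 predictors, fitting one small model per predictor pair gives 55 models.
predictors = [f"x{i}" for i in range(1, 12)]
predictor_pairs = list(combinations(predictors, 2))

# Hypothetical predictions from three small models at four evaluation sites.
labels = [1, 1, 0, 0]  # 1 = presence, 0 = absence/background
small_model_preds = {
    ("x1", "x2"): [0.9, 0.8, 0.2, 0.1],  # ranks sites perfectly
    ("x1", "x3"): [0.6, 0.4, 0.5, 0.3],  # partly informative
    ("x2", "x3"): [0.1, 0.2, 0.8, 0.9],  # worse than random, so weight 0
}
ensemble, weights = esm_average(small_model_preds, labels)
```

In this toy case the poorly performing model is dropped and the ensemble leans toward the best small model, which is the intuition behind weighting by performance rather than taking a plain mean.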

Breiner, F. T., Guisan, A., Bergamini, A., & Nobis, M. P. (2015). Overcoming limitations of modelling rare species by using ensembles of small models. Methods in Ecology and Evolution, 6(10), 1210-1218. DOI: 10.1111/2041-210X.12403

 

On estimating probability of presence from use-availability or presence-background data

Phillips, S. J., & Elith, J. (2013). On estimating probability of presence from use-availability or presence-background data. Ecology, 94, 1409-1419. DOI: 10.1890/12-1520.1

Ecologists studying a wide range of species wish to map species distributions and/or predict the suitability of sites for occupation and persistence. This paper investigates statistical methods that estimate the probability that a species is present at a site as determined by environmental covariates. Exponential models are most often used for presence-background data and provide maximum-likelihood estimates of the relative probability of presence. This output is proportional to the absolute probability of presence, but because the constant of proportionality is unknown, models of absolute probability of presence may be preferable. Five logistic methods (LK, LI, SC, EM, SB) for using presence-background data are presented and tested in an experimental comparison. Seven simulated species with defined probability-of-presence functions were modeled in an environment with a single predictor variable distributed uniformly between 0 and 1 across the landscape. For methods that required an estimate of prevalence, the known simulated prevalence was used first, followed by the true prevalence with an error of 0.1 to assess sensitivity to that estimate. There was a stark contrast between two groups of methods: LI and LK (no species prevalence parameter) had higher RMS errors than SC, EM, and SB (which include a species prevalence parameter). The EM, SC, and SB methods performed well in the experiments given an estimate of prevalence. These methods may be useful because they can estimate the absolute probability of occurrence at a location, which can aid in the management and conservation of species across a region. However, a substantially incorrect estimate of prevalence can lead to over- or under-prediction, so where estimates of prevalence are unreliable, MaxEnt or similar methods may be the better choice.
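
To see why the unknown constant of proportionality matters, the sketch below rescales a relative output into absolute probabilities using a prevalence estimate. It is not one of the five methods compared in the paper. It only uses the identity that prevalence is the landscape-wide mean of the absolute probability of presence, so the constant is prevalence divided by the mean relative value. The function name and the numbers are hypothetical.

```python
def absolute_from_relative(relative, prevalence):
    """Rescale relative probabilities so that their landscape mean equals `prevalence`.

    Assumption: `relative` holds one value per equally weighted landscape cell.
    Values are capped at 1.0, which slightly lowers the mean if the cap is hit.
    """
    mean_relative = sum(relative) / len(relative)
    scale = prevalence / mean_relative  # the otherwise unknown constant
    return [min(1.0, scale * r) for r in relative]


# Hypothetical relative output for four cells along a single predictor.
relative = [1.0, 2.0, 3.0, 4.0]
true_prevalence = 0.25

exact = absolute_from_relative(relative, true_prevalence)
too_high = absolute_from_relative(relative, true_prevalence + 0.1)
too_low = absolute_from_relative(relative, true_prevalence - 0.1)
```

Every cell's absolute probability moves in the same direction as the prevalence error, which is the over- or under-prediction risk noted above when prevalence is poorly known.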

The Combined Use of Correlative and Mechanistic Species Distribution Models Benefits Low Conservation Status Species.

Rougier, T., Lassalle, G., Drouineau, H., Dumoulin, N., Faure, T., Deffuant, G., Rochard, E. and Lambert, P. 2015. The Combined Use of Correlative and Mechanistic Species Distribution Models Benefits Low Conservation Status Species. PLoS ONE, 10, 21. DOI: 10.1371/journal.pone.0139194

The spatial distribution of a species' suitable habitat has typically been projected using correlative species distribution models (SDMs). Increasing evidence now suggests that rapid evolutionary change, dispersal, the spatial structure of the environment, and population dynamics are also important in determining future species ranges. This paper seeks to develop a framework for the joint analysis of correlative and mechanistic SDMs in order to increase the robustness of model-derived conclusions and aid resource managers involved in species conservation planning. Two previously constructed models, one correlative and one mechanistic, were used along with biological data collected from the literature and the EuroDiad 3.2 database, which contains distribution information for European diadromous fishes from 1750 to 2010. Due to computational constraints, a subset (73 of the 197 river basins included in the database) was used to predict species distribution with the mechanistic model. Both the correlative and the mechanistic model correctly predicted historical presences (before 1900) for the river basins used, though neither predicted absences in the historical data set well. Both models predict a high probability of self-sustaining populations of allis shad under moderate and pessimistic climate change scenarios. The study concludes that, when available, predictions from correlative and mechanistic modeling should be used in a complementary way to help guide conservation efforts in light of climate change. While the paper provides a framework for jointly analyzing a correlative and a mechanistic distribution model, it only briefly addresses the much greater data and computational demands of the mechanistic SDM. For now, cases in which correlative and mechanistic SDMs can be used together as presented in this paper may be rare.

Species distribution models that do not incorporate global data misrepresent potential distributions: a case study using Iberian diving beetles

Species distribution models have been used since the 1980s to predict probable distributions from a combination of species occurrence data and environmental predictors thought to influence distribution. While distribution modeling offers a way to predict species distributions from incomplete data, using data that do not encompass the entire range of a species may introduce geographic bias into the predicted potential distribution. This study aims to determine whether models built from regionally biased data predict incomplete potential distributions and to examine why regional data may not adequately describe the potential distribution. The results show that distributions predicted with regional data provide an incomplete description of the environmental limits of a species when compared with distributions modeled using data covering the entire species range. The authors therefore recommend that potential distributions be modeled using data from all known populations or from a subsample of populations across the entire range. While this study demonstrates the importance of using data from across the entire known range when predicting climate-based potential distributions, it does not consider other factors that may influence distribution. Some areas within the range of the beetles have no presence records, which may reflect limits on the natural dispersal of these species rather than the climate in those areas.
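
The truncation effect can be illustrated with a deliberately simple stand-in for a distribution model: a min-max climatic envelope on one variable. This is not the modeling approach used in the study, and the temperature values, threshold, and function names below are hypothetical.

```python
def envelope(values):
    """Observed environmental limits (min, max) of a set of occurrence records."""
    return min(values), max(values)


def inside(value, env):
    """True if a site's value falls within the envelope."""
    low, high = env
    return low <= value <= high


# Hypothetical mean annual temperatures (deg C) at all known records of a species.
global_records = [4.0, 6.5, 9.0, 12.0, 15.5, 18.0]
# Regional subset: only the records from the warm end of the range.
regional_records = [t for t in global_records if t >= 12.0]

global_env = envelope(global_records)      # (4.0, 18.0)
regional_env = envelope(regional_records)  # (12.0, 18.0)

candidate_site = 7.0  # a cool site outside the sampled region
suitable_by_regional = inside(candidate_site, regional_env)
suitable_by_global = inside(candidate_site, global_env)
```

The regional envelope rejects a site that the full-range envelope accepts, because the regional records never sampled the cooler conditions the species tolerates elsewhere.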

 

Sanchez-Fernandez, D., Lobo, J. M. and Hernandez-Manrique, O. L. 2011. Species distribution models that do not incorporate global data misrepresent potential distributions: a case study using Iberian diving beetles. Diversity and Distributions, 17, 163-171. DOI: 10.1111/j.1472-4642.2010.00716.x

Changing habitat areas and static reserves: challenges to species protection under climate change

Garden, J. G., O’Donnell, T. and Catterall, C. P. 2015. Changing habitat areas and static reserves: challenges to species protection under climate change. Landscape Ecology, 30, 1959-1973. DOI: 10.1007/s10980-015-0223-3

Changing climates can lead to shifts in the spatial distribution of a species and its suitable habitat, potentially altering the effectiveness of fixed protected areas. This paper develops a broad approach to characterizing species' climate-induced distributional changes, whether through location displacement or refugial dynamics, along with the effectiveness of the protected area network. Distributional, climate, and other environmental data were used to produce species distribution models for 13 species. Areas of suitable habitat for each species were predicted under three climate regimes and overlaid with GIS maps of protected areas. Suitable habitat extent decreased across climate regimes for all 13 species, as did the proportion of refugia extent within the original suitable habitat extent. The amount of protected habitat decreased under future climates, though this is likely due to the overall decrease in habitat extent, as the proportion of habitat protected in the study area did not change over time. The study forecasts a decline in suitable habitat for forest-obligate species within the study area as the climate changes. Patterns of species response to the changing climate were better characterized by refugial dynamics than by location displacement. These findings are consistent with species ranges shrinking around refugia within or near the current distribution rather than shifting in location. Because the purpose of the study was to predict the impact of climate change on the habitat extent of these species, other threats to habitat, such as deforestation, were intentionally not considered. To better predict suitable habitat extent, future research would need to include all threats to habitat in the study area.
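
The overlay logic in this summary can be sketched with sets of grid-cell IDs. The sketch assumes that refugia are cells suitable under both current and future climate and that displacement shows up as newly suitable cells; these definitions, the function name, and the cell IDs are illustrative assumptions rather than the study's GIS workflow.

```python
def habitat_summary(current, future, protected):
    """Compare suitable habitat under two climates against a fixed reserve network.

    All arguments are sets of grid-cell IDs (hypothetical, equal-area cells).
    """
    refugia = current & future  # cells that stay suitable
    gained = future - current   # newly suitable cells (displacement)
    return {
        "refugia_share_of_current": len(refugia) / len(current),
        "gained_cells": len(gained),
        "protected_cells_current": len(current & protected),
        "protected_cells_future": len(future & protected),
        "protected_share_current": len(current & protected) / len(current),
        "protected_share_future": len(future & protected) / len(future),
    }


current_habitat = {1, 2, 3, 4, 5, 6}
future_habitat = {3, 4, 5, 7}
protected_areas = {1, 2, 3, 7}

summary = habitat_summary(current_habitat, future_habitat, protected_areas)
```

In this toy landscape the number of protected habitat cells falls from three to two while the protected share stays at one half, mirroring the pattern reported above: less protected habitat in absolute terms, but an unchanged proportion.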