The ENM was performed using the maximum entropy approach implemented in MaxEnt, v.3.4.1 (Phillips et al., 2004), which does not require absence data and is among the least sensitive to sampling biases when studying large geographic areas (Qiao et al., 2015).
Two different sets of predictor variables were used to ascertain the robustness of the results (Araújo et al., 2019). Both sets were selected based on a reiterative jackknife procedure of model construction and stepwise removal of the least contributing variables (Zeng et al., 2016), but each used a different dataset and metric (i.e., training gain or test gain) to measure the contribution of the variables to the model. After removing the uninformative variables by the jackknife procedure, the final set of variables was produced in each case by removing one variable from each pair of correlated variables, based on a correlation matrix of the climate layers (cut‐off of r ˂ 0.8; Merow et al., 2013). The first set of predictors (Set 1) was selected based on the training gain using the species‐level occurrences as the training dataset, and it contained Mean Diurnal Range (BIO 2), Temperature Annual Range (BIO 7), Mean Temperature of Wettest Quarter (BIO 8), Mean Temperature of Driest Quarter (BIO 9), Precipitation Seasonality (BIO 15), Precipitation of Wettest Quarter (BIO 16), Precipitation of Driest Quarter (BIO 17), and Precipitation of Warmest Quarter (BIO 18). The second set (Set 2) was selected using Western Siberia as an independent test area where we generated a grid with cell size of 0.5° × 0.5° (n = 948 cells) within the outline of the bank vole distribution range and used the cell centroids as the test data. Western Siberia is considered as an area with the current climatic conditions resembling those in Europe during the LGM, and its use as the test area helped ensure the transferability of the models (Fløjgaard et al., 2009) and reduce any bias from the limited number of bank vole occurrences available for the Siberian part of its distribution range (Vega et al., 2010). The variables included in Set 2 based on the test gain were Isothermality (BIO 3), Mean Temperature of Warmest Quarter (BIO 10), Precipitation of Wettest Quarter (BIO 16), and Precipitation of Warmest Quarter (BIO 18).
Niche models were built independently with Set 1 and Set 2. A total of 50 replicates of each model were generated by the subsampling method in MaxEnt, which randomly selected 25% of the occurrence points reserved as test data (Phillips et al., 2004). Subsequently, the estimates are based on the overall mean of the replicates. To minimize the possible effect of inadequate representation of the environmental background (Guevara et al., 2018), 1,000,000 background points were used. Default values were used for the other parameters, as recommended when comparing models at different evolutionary levels (i.e., species and intraspecific lineages) and with different sampling efforts (Merow et al., 2013; Phillips & Dudík, 2008).
Our study area encompasses the bank vole distribution range and surrounding areas, covering Europe, western Siberia, the Anatolian Peninsula, and the Caucasus. We had no a priori reasons to exclude specific parts of Europe as inaccessible to colonization by the different bank vole lineages. Geographic distributions of the lineages do not align with major geographic barriers, and it is known that European small mammals responded individualistically to the end‐glacial warming in terms of the contribution of the different refugia to colonization of the different parts of Europe (with some species colonizing most of Europe from a single refugium), indicating that the present phylogeographic patterns are, to large extent, the product of species’ adaptive niches and habitat availability (Michaux et al., 2005). Furthermore, the evidence shows that during the end‐glacial colonization bank vole lineages occupied areas where they are no longer present, and which are currently occupied by other lineages (Figure 1; Kotlík et al., 2018; Searle et al., 2009). Thus, the study area encompassing the entire distribution range of the bank vole should reasonably represent the broader background landscape likely to have been “tested” by various lineages for suitability, even though not presently occupied by them (Barve et al., 2011).
To evaluate the performance of the models, we first calculated the average area under the receiver operating characteristic (AUC) curve for the test data (Guisan & Zimmermann, 2000), including 95% confidence intervals. We then calculated the partial AUC (pAUC) ratio using Niche Tool Box (Osorio‐Olvera et al., 2020; Peterson et al., 2008) and the sensitivity index based on the omission rate at the minimum training presence (the lowest value among all training records; MTP) and 10% training presence (the value below which 10% of all training records fall) thresholds (Kumar et al., 2015).
The models were projected to the CCSM4 and MIROC‐ESM bioclimatic layers for the LGM and mid‐Holocene. The logistic format and ascii type were used to generate the raster output. Response curves were constructed to illustrate the effect of each variable on the modeled habitat suitability. Three different thresholds were applied to generate binary maps of suitable habitat: the minimum training presence, the fifth percentile training presence, and the 10th percentile training presence (Liu et al., 2005). To assess the ability of the model constructed for each lineage to predict the known distribution range of the other lineages, we calculated how many occurrences available for each lineage were successfully predicted by the binary maps (suitable/unsuitable habitats) constructed for each other lineage (Peterson, 1999), applying the three thresholds in order to account for the sensitivity of binary presence–absence predictions to the threshold choice (Li & Guo, 2013). A composite prediction from the models constructed for the individual lineages (Chardon et al., 2020) was generated by merging the binary maps across the lineages for each threshold. As a summary of the results across the four predictor (Set 1 and Set 2) and GCM (CCSM4 and MIROC‐ESM) combinations, frequency ensemble maps were generated showing the number of models predicting the presence of a given lineage in any particular area based on the MTP threshold, separately for the mid‐Holocene and LGM (Araújo & New, 2007).
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.