We used four evaluation criteria to assess which modeling approach performed best. The first two were the AUC and the mean absolute error (MAE) as proposed by Konowalik and Nosol (2021). Both metrics were calculated for each FFME‐run on spatially separated test data using the R package “Metrics” (version 0.1.4; Hamner & Frasco, 2018). The MAE is defined as the average absolute deviation between the observed value (= 1) and the predicted value (in [0, 1]) at presence points. To calculate the AUC, we randomly sampled as many background points as there were presence points. Because the AUC was originally intended to be calculated on absence data rather than on background points, we follow the suggestion of Yackulic et al. (2013) and hereafter refer to it as AUC presence‐only (AUCPO) to clearly distinguish between AUC values calculated on PA and PO data. We are aware of the general problems associated with these metrics, especially the AUC (Lobo et al., 2008), and therefore used them only to compare modeling approaches, not to make statements about absolute model performance. As a third metric, we calculated the Boyce‐Index (Boyce et al., 2002) with the R package “ecospat” (version 3.3; Di Cola et al., 2017) using the prediction raster and the spatially separated test data. As a fourth metric, we used the number of parameters of each model as an indicator of model complexity.
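The two point-based metrics described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the function names (`mae_at_presences`, `auc_rank`, `auc_po`) are hypothetical, the AUC is computed with a simple pairwise rank formulation chosen for this illustration, and the `seed` argument is an assumption added only to make the background sample reproducible.

```python
import random


def mae_at_presences(predicted):
    """MAE at presence points: the observed value is 1 at every point."""
    return sum(abs(1.0 - p) for p in predicted) / len(predicted)


def auc_rank(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    point has the higher prediction; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_po(presence_pred, background_pool_pred, seed=0):
    """AUCPO sketch: draw as many background points as there are presence
    points, then treat the background sample as the negative class.
    Assumes the background pool is at least as large as the presence set."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    background = rng.sample(background_pool_pred, len(presence_pred))
    return auc_rank(presence_pred, background)


# Illustrative predictions at test presence points (made-up values).
presence = [0.9, 0.8, 0.6, 0.7]
background_pool = [0.1, 0.2, 0.3, 0.4, 0.5, 0.65]
print(mae_at_presences(presence))
print(auc_po(presence, background_pool))
```

Because the background sample is random, repeated calls with different seeds can give slightly different AUCPO values for the same model.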
These four metrics were determined for each fold of the FFME, and their median was calculated for each species separately. For each species and metric, the best modeling approach was the one with the highest Boyce‐Index, the highest AUCPO, the lowest MAE, or the fewest parameters (i.e., the least complex model). We compared the metric values of each species and assigned the species to the approach with the best value. The best modeling approach for each metric was then defined as the one with the highest number of assigned species (exemplary calculation in Table 1).
Table 1. Exemplary determination of the best modeling approach for AUCPO values.
Note: Median values for each modeling approach over all forward‐fold‐metric‐estimation folds. The result of the best modeling approach is indicated in bold.
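The assignment procedure for a single metric can be sketched as follows. The species and approach names and all values are hypothetical placeholders, and the tie handling (the first approach in iteration order wins) is an assumption, since the text does not state how ties were resolved.

```python
from collections import Counter


def best_approach_per_species(medians, higher_is_better=True):
    """Assign each species to the approach with the best median metric value.

    medians: {species: {approach: median value over all FFME folds}}
    Ties go to the first approach in iteration order (an assumption).
    """
    pick = max if higher_is_better else min
    return {
        species: pick(values, key=values.get)
        for species, values in medians.items()
    }


def overall_best(medians, higher_is_better=True):
    """Return the approach with the most assigned species, plus all counts."""
    assigned = best_approach_per_species(medians, higher_is_better)
    counts = Counter(assigned.values())
    return counts.most_common(1)[0][0], counts


# Hypothetical median AUCPO values for three species and two approaches.
aucpo_medians = {
    "species_1": {"approach_a": 0.70, "approach_b": 0.80},
    "species_2": {"approach_a": 0.90, "approach_b": 0.60},
    "species_3": {"approach_a": 0.50, "approach_b": 0.75},
}
winner, counts = overall_best(aucpo_medians, higher_is_better=True)
print(winner, dict(counts))
```

For MAE and the number of parameters, the same functions would be called with `higher_is_better=False`.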
To express the overall performance of each modeling approach in conjunction with model complexity, we created a single performance‐complexity‐index (PCI) based on all four metrics. For each species, we scaled each metric across all models from 0 to 1, inverting the scales of MAE and the number of parameters so that a higher scaled value always indicates a better (or less complex) model. The sum of the four scaled metrics per species and modeling approach formed the PCI (exemplary calculation in Table 2).
Table 2. Exemplary calculation of the Performance‐Complexity‐Index (PCI) for the species “awt01.”
Note: For each metric (AUCPO, MAE, Boyce, and number of parameters), the values for all four modeling approaches are scaled from 0 to 1 with inverted scales of MAE and the number of parameters. The sum of all four scaled metrics per species and modeling approach formed the PCI.
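The PCI calculation for one species can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the metric keys, approach names, and values are hypothetical, min-max scaling is assumed as the way of scaling "from 0 to 1", and the handling of a metric on which all approaches tie (scaled to 0 here) is not specified in the text.

```python
# Metrics whose scale is inverted because lower raw values are better.
INVERTED = {"mae", "n_params"}


def min_max(values, invert=False):
    """Scale {approach: value} to [0, 1] across approaches; optionally invert."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Assumption: the text does not say how identical values are scaled.
        return {approach: 0.0 for approach in values}
    scaled = {a: (v - lo) / (hi - lo) for a, v in values.items()}
    if invert:
        scaled = {a: 1.0 - s for a, s in scaled.items()}
    return scaled


def pci(metrics):
    """Sum the scaled metrics per approach for one species.

    metrics: {metric name: {approach: median value}}
    """
    totals = {}
    for name, values in metrics.items():
        for approach, s in min_max(values, invert=name in INVERTED).items():
            totals[approach] = totals.get(approach, 0.0) + s
    return totals


# Hypothetical medians for one species and three approaches.
species_metrics = {
    "aucpo": {"approach_a": 0.8, "approach_b": 0.6, "approach_c": 0.7},
    "mae": {"approach_a": 0.2, "approach_b": 0.4, "approach_c": 0.3},
    "boyce": {"approach_a": 0.9, "approach_b": 0.5, "approach_c": 0.7},
    "n_params": {"approach_a": 10, "approach_b": 4, "approach_c": 7},
}
print(pci(species_metrics))
```

With four metrics each scaled to the range 0 to 1, the PCI of an approach lies between 0 and 4.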