2.4. Biomod2—Species distribution modeling

Duncan Ray; Maurizio Marchi; Andrew Rattey; Alice Broome

This method is extracted from research article: Ecol Evol, Jun 2021

A multi‐data ensemble approach for predicting woodland type distribution: Oak woodland in Britain

DOI: 10.1002/ece3.7752

Biomod2 provides an ensemble platform of ten species distribution model (SDM) algorithms, of which we initially selected six as ensemble candidates: generalized linear model (GLM), gradient boosting machine (GBM), generalized additive model (GAM), artificial neural network (ANN), random forest classifier (RF), and maximum entropy model (MAXENT). Algorithms that could not be fitted successfully in every NFI region were removed. This excluded GAM and MAXENT and left the same four algorithms (GLM, GBM, ANN, and RF) to model all the NFI regions.
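
The selection rule above (keep only the algorithms that fit every region) can be sketched as follows. This is a minimal illustration rather than the authors' code: the region names and the pattern of failed fits are invented, and only the six candidate names and the four retained algorithms come from the text.

```python
# Candidate algorithms named in the text.
CANDIDATES = ["GLM", "GBM", "GAM", "ANN", "RF", "MAXENT"]

# Hypothetical fit outcomes per region (True = model fitted successfully).
# Region names and the failure pattern are assumptions for illustration.
fit_ok = {
    "region_A": {"GLM": True, "GBM": True, "GAM": True,
                 "ANN": True, "RF": True, "MAXENT": False},
    "region_B": {"GLM": True, "GBM": True, "GAM": False,
                 "ANN": True, "RF": True, "MAXENT": True},
    "region_C": {"GLM": True, "GBM": True, "GAM": True,
                 "ANN": True, "RF": True, "MAXENT": True},
}


def algorithms_fitting_all_regions(candidates, fit_ok):
    """Return the candidates that fitted successfully in every region."""
    return [
        algo for algo in candidates
        if all(region_fits[algo] for region_fits in fit_ok.values())
    ]


retained = algorithms_fitting_all_regions(CANDIDATES, fit_ok)
print(retained)  # ['GLM', 'GBM', 'ANN', 'RF']
```

Dropping an algorithm everywhere, rather than only in the regions where it fails, keeps the ensemble composition identical across regions, which is the property the text asks for.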

Single-algorithm oak probability rasters were averaged to give an ensemble prediction. The literature suggests that absences may be sampled at random and combined with presence records (Barbet‐Massin et al., 2012). Because far fewer presence records were available than candidate absence records, it was necessary to balance the presence and absence data while still covering the full environmental variation of each NFI region. We therefore randomly sampled 15 replicate sets of absence data, each with the same number of points as the presence data. Using more than one absence dataset allowed us to cover a wider range of environmental conditions than a single absence dataset would. Each of the 15 absence sets was a random subset of the absence pixels from broadleaved woodland patches (NFI map) that coincided with PFE map polygons of forest without oak. To limit overfitting, cross-validation was applied with a 50% data partition (Lobo & Tognelli, 2011). In biomod2, the cross-validation procedure was repeated 30 times for each of the 15 presence–absence groups.
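
The sampling and partitioning design described above can be sketched in plain Python to make the run count explicit. This is a hedged illustration, not the biomod2 implementation: the record counts, the seed, and the helper names `sample_absence_sets` and `split_records` are hypothetical, and the split here is not stratified by class, which the text does not specify.

```python
import random

N_ABSENCE_SETS = 15   # replicate absence datasets (from the text)
N_CV_RUNS = 30        # repeated partitions per dataset (from the text)
TRAIN_FRACTION = 0.5  # 50% data partition (from the text)


def sample_absence_sets(candidate_absences, n_presences, n_sets, rng):
    """Draw n_sets random absence subsets, each as large as the presence set."""
    return [rng.sample(candidate_absences, n_presences) for _ in range(n_sets)]


def split_records(records, train_fraction, rng):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(42)  # arbitrary seed, for repeatability only

# Hypothetical pixel records: far more candidate absences than presences.
presences = [("presence", i) for i in range(200)]
candidate_absences = [("absence", i) for i in range(5000)]

absence_sets = sample_absence_sets(
    candidate_absences, len(presences), N_ABSENCE_SETS, rng
)

runs = []
for set_id, absences in enumerate(absence_sets):
    data = presences + absences  # balanced presence-absence group
    for run_id in range(N_CV_RUNS):
        train, test = split_records(data, TRAIN_FRACTION, rng)
        runs.append((set_id, run_id, train, test))

print(len(runs))  # 450 = 15 absence sets x 30 partitions
```

Under these assumptions each algorithm is fitted 450 times per region, once for every combination of absence set and partition.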

Using this procedure, we calculated the true skill statistic (TSS) for each replicate, run, and algorithm, and used it to evaluate the models. The TSS ranges from −1 to 1 (Allouche et al., 2006): values at or below 0 indicate prediction no better than random, and 1 indicates perfect prediction. A single oak probability raster was calculated as the weighted mean of the four algorithm predictions, using the TSS scores as weights. The SDMs were parameterized separately for each of the 14 NFI regions in Britain, so that the models could reflect regional variation in the management, site selection, and site condition of oak woodland stands.
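
The evaluation and weighting steps above can be illustrated with a short sketch. It assumes that predictions have already been thresholded to 0/1 before scoring and that each algorithm contributes a single summary TSS as its weight. Neither detail is stated in the text, and all numbers below are invented for illustration.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary (0/1) labels.

    Assumes both classes occur in `observed`, otherwise a rate is undefined.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def tss_weighted_mean(probabilities, tss_scores):
    """TSS-weighted mean of per-algorithm probabilities for one pixel."""
    total_weight = sum(tss_scores[algo] for algo in probabilities)
    weighted_sum = sum(
        probabilities[algo] * tss_scores[algo] for algo in probabilities
    )
    return weighted_sum / total_weight


# Hypothetical held-out labels and thresholded predictions.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
print(true_skill_statistic(observed, predicted))  # 0.5

# Hypothetical per-algorithm oak probabilities for one pixel, and TSS weights.
pixel_probabilities = {"GLM": 0.6, "GBM": 0.8, "ANN": 0.5, "RF": 0.9}
tss_scores = {"GLM": 0.5, "GBM": 0.7, "ANN": 0.4, "RF": 0.8}
ensemble_probability = tss_weighted_mean(pixel_probabilities, tss_scores)
```

Applied to every pixel, the weighted mean gives algorithms with higher TSS more influence on the ensemble raster than a simple average would. The weights must be positive for the result to stay within the range of the inputs.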

Copyright and License information: © 2021 Crown copyright. Published by John Wiley & Sons Ltd. This article is published with the permission of the Controller of HMSO and the Queen's Printer for Scotland.

This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
