Overall approach

PT Paul Taconet
AP Angélique Porciani
DS Dieudonné Diloma Soma
KM Karine Mouline
FS Frédéric Simard
AK Alphonsine Amanan Koffi
CP Cedric Pennetier
RD Roch Kounbobr Dabiré
MM Morgan Mangeas
NM Nicolas Moiroux

We used a two-step statistical modeling approach to study the relationships between the biting rates of each vector species and environmental conditions. We first calculated correlation coefficients between the biting rates and the environmental variables at each buffer size and time lag considered. This bivariate analysis had two objectives: (i) to better understand several aspects of the ecology of the vectors in the study area, and (ii) to screen variables for the multivariate analysis. In the second step, we integrated the selected variables into multivariate algorithmic models, which we then analyzed using interpretable machine-learning tools to search for potentially complex links (nonlinear relationships, relevant thresholds) between the environmental factors and the biting rates.
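The bivariate screening step can be illustrated with a minimal sketch. It assumes a Spearman rank correlation (the text above does not name the coefficient) and a selection rule that keeps, for one environmental variable, the buffer size and time lag with the largest absolute correlation. The function names, the (buffer in meters, lag in weeks) keys, and the toy values are all hypothetical.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def best_scale(biting, candidates):
    """Score each (buffer, lag) candidate and return the strongest one.

    candidates maps a hypothetical (buffer_m, lag_weeks) key to the values of
    one environmental variable, aligned with the biting-rate observations.
    """
    scores = {key: spearman(biting, vals) for key, vals in candidates.items()}
    best_key = max(scores, key=lambda k: abs(scores[k]))
    return best_key, scores

# Toy data (invented): bites per human per night, and one environmental
# variable extracted at two buffer size / time lag combinations.
biting = [0, 0, 1, 3, 7, 12]
candidates = {
    (250, 0): [5, 1, 4, 2, 6, 3],
    (500, 2): [10, 12, 15, 20, 30, 45],
}
best, scores = best_scale(biting, candidates)
```

In this toy case the variable measured in the 500 m buffer with a 2-week lag tracks the biting rate much more closely, so it would be the version carried forward to the multivariate step under the assumed rule.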

We ran the whole modeling framework separately for each species, as they might exhibit different ecological preferences.

From a statistical point of view, most algorithmic machine-learning models, although nonparametric, have difficulty coping with zero-inflated negative binomial response variables [50, 51], which are typical of insect count data such as mosquito biting rates [52]. An alternative approach to modeling such data is the hurdle model, which treats the data as the result of two processes: one determining zero versus nonzero values, and a second explaining the nonzero counts [53]. A hurdle methodology built on a widely used algorithmic model (random forest) has been proposed elsewhere to deal with such data distributions [51]. This separation is also biologically pertinent, since it has been shown that the drivers of presence might differ from those of abundance [17, 44, 54]. Lastly, modeling presence and abundance separately might enable us to identify distinct targets for vector control, corresponding respectively to eradication (absence of bites) and control (reduction of the number of bites) [17].
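As a reading aid, the two-part structure of a hurdle model can be written generically as follows. The symbols are illustrative assumptions, not notation from the article: Y is the number of bites per human per night and π is the probability of at least one bite.

```latex
\[
\Pr(Y = 0) = 1 - \pi, \qquad
\Pr(Y > 0) = \pi, \qquad
\mathbb{E}[Y] = \pi \cdot \mathbb{E}[Y \mid Y > 0]
\]
```

The first part (π) is estimated from all observations, and the second part (the distribution of Y given Y > 0) only from the nonzero counts.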

We therefore separately modeled the probability of human–vector contact (called “presence” models in the rest of this article) and the positive counts of human–vector contact (called “abundance” models). Given that HLC data are used as a proxy for human-biting rate, presence models analyzed the probability of at least one individual biting a human during a night, while abundance models analyzed the number of bites received by one human in one night conditional on vector presence (i.e. zero-truncated data). Hence, in our presence models, the dependent variable was the presence/absence of vectors (binarized as 1/0) collected during 1512 nights of HLC (27 villages × 4 collection sites × 2 places (indoors and outdoors) × 7 surveys), while in the abundance models, the dependent variable was the number of bites per human during the positive catch sessions (i.e. the sessions with at least one bite).
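The construction of the two dependent variables can be sketched as below. This is only an illustration of the data split described above, not the authors' code; the record layout (`village`, `place`, `bites`) and the toy values are hypothetical.

```python
def hurdle_split(records):
    """Build the two hurdle datasets from per-session catch records.

    records: list of dicts with at least a 'bites' count (hypothetical layout).
    Returns (presence, abundance):
      presence  - every session, with a binary 'present' flag (1/0)
      abundance - only positive sessions, i.e. zero-truncated counts
    """
    presence = [{**r, "present": int(r["bites"] > 0)} for r in records]
    abundance = [r for r in records if r["bites"] > 0]
    return presence, abundance

# Number of HLC sessions implied by the sampling design in the text:
# 27 villages x 4 collection sites x 2 places x 7 surveys.
n_sessions = 27 * 4 * 2 * 7

# Toy records (invented) for one species.
records = [
    {"village": "A", "place": "indoors", "bites": 0},
    {"village": "A", "place": "outdoors", "bites": 3},
    {"village": "B", "place": "indoors", "bites": 0},
    {"village": "B", "place": "outdoors", "bites": 11},
]
presence, abundance = hurdle_split(records)
```

The presence dataset keeps one row per catch session, whereas the abundance dataset shrinks to the positive sessions only, so the two models are generally fitted on different numbers of observations.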
