Model calibration

Kimberly VanderWaal, Lora Black, Judy Hodge, Addisalem Bedada, Scott Dee

We tuned the deterministic model to the epidemiological data (cumulative clinical cases and testing results) from each plant. Because the daily incidence data for two plants (Plants A and B) were based on self-reported PCR testing from private health care providers, and data for one plant (Plant C) were based on ongoing company-provided testing of symptomatic workers, we assumed that workers who sought and received testing were experiencing clinical disease, as testing was generally restricted to symptomatic patients at that time. Self-reported PCR-positive workers were also absent from work; therefore, we assumed that the number of self-reported cases (Plants A and B) was equivalent to the at-home “H” class in our model. The percentage of workers that were IgG-positive during company-initiated testing was assumed to be equivalent to the percentage of workers in the R class at a particular point in time. Although it is possible for IgG to be detectable within two days of symptom onset [18], generally <70% of people had detectable IgG by 10 days after symptom onset [18, 19], while >85% had detectable IgG after 11–15 days [19–21]. In our model, the length of time between infection and recovery was 11–15 days, which is why we believe that the percentage of recovered individuals in the model is a reasonable approximation of the percentage IgG-positive in the observed data.

Because most parameter values in the model are uncertain, we conducted a multivariate calibration exercise on uncertain parameters (Table A in S1 Appendix) using Latin hypercube sampling (LHS) and rejection sampling [22–24]. We generated 10,000 parameter sets by sampling a Latin hypercube, which is an efficient method for sampling multivariable parameter space. Model results were generated for each parameter set using the deterministic model. Parameter sets were then rejected if the modeled outbreak did not sufficiently resemble the observed data, based on a set of criteria that was specific to each plant (cumulative number at-home, percent infected/PCR-positive or recovered/IgG-positive; see Table C in S1 Appendix for the criteria for each plant). Goodness-of-fit criteria were based on epidemiological data from the earlier phases of plant-based outbreaks, up until the completion of company-initiated testing. Parameter value medians and interquartile ranges were summarized from candidate parameter sets that met the goodness-of-fit criteria. The median value was considered to be the most-likely value and was used in subsequent model exploration.
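To illustrate this calibration step, the sketch below draws 10,000 parameter sets from a Latin hypercube, runs a model for each set, rejects sets that fail goodness-of-fit criteria, and summarizes the accepted sets. The parameter names, bounds, model function, and acceptance thresholds are placeholders for illustration only, not the actual values from Tables A and C in S1 Appendix.

```python
# Minimal sketch of LHS + rejection-sampling calibration.
# Parameters, bounds, model, and criteria are hypothetical placeholders.
import numpy as np
from scipy.stats import qmc

# Hypothetical uncertain parameters with (lower, upper) bounds.
param_bounds = {
    "beta": (0.1, 1.0),        # transmission rate
    "latent_days": (2, 7),     # latent period
    "p_clinical": (0.2, 0.8),  # fraction developing clinical disease
}
names = list(param_bounds)
lo = np.array([param_bounds[n][0] for n in names])
hi = np.array([param_bounds[n][1] for n in names])

# 10,000 parameter sets from a Latin hypercube.
sampler = qmc.LatinHypercube(d=len(names), seed=1)
sets = qmc.scale(sampler.random(n=10_000), lo, hi)

def run_deterministic_model(params):
    """Placeholder for the deterministic compartmental model; returns
    summary outputs for one parameter set (a toy closed form here,
    rather than an ODE solve)."""
    beta, latent_days, p_clinical = params
    return {"cum_home": beta * 400 * p_clinical,              # cumulative at-home cases
            "pct_R": 100 * beta * p_clinical / latent_days}   # % recovered (IgG+)

def meets_criteria(out):
    """Placeholder plant-specific goodness-of-fit criteria
    (the study's real criteria are in Table C of S1 Appendix)."""
    return 50 <= out["cum_home"] <= 150 and 10 <= out["pct_R"] <= 25

# Rejection step: keep only parameter sets whose simulated outbreak
# sufficiently resembles the observed data.
accepted = np.array([p for p in sets
                     if meets_criteria(run_deterministic_model(p))])

# Summarize accepted sets; the median is taken as the most-likely value.
for i, n in enumerate(names):
    q25, med, q75 = np.percentile(accepted[:, i], [25, 50, 75])
    print(f"{n}: median={med:.3f}, IQR=({q25:.3f}, {q75:.3f})")
```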

Model fit was checked by plotting the observed epidemiological data against the model’s predictions. Here, 1,000 simulations were performed with the stochastic model to estimate variability in model outcomes given the most-likely parameter values. For plants that experienced outbreaks of longer duration (B and C), it was apparent that although the calibrated parameter values produced simulated outbreaks that resembled the observed data in the early phase of the outbreak, the number of clinical cases was overestimated in the post-testing period. Therefore, we allowed R and b to be re-calibrated based on the post-testing data to account for altered disease dynamics, which may have emerged as a result of higher adherence to biosafety protocols (reduced R) or changes in disease prevalence in the community (reduced b).
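This fit check can be sketched as follows: run many stochastic replicates at the most-likely parameter values and overlay the observed series on the resulting simulation envelope. The model function, parameter values, and observed series below are stand-ins, not the study's actual implementation.

```python
# Minimal sketch of the fit check: 1,000 stochastic replicates at the
# calibrated (median) parameter values, compared against observed data.
import numpy as np
import matplotlib.pyplot as plt

def run_stochastic_model(params, n_days, rng):
    """Placeholder stochastic analogue of the deterministic model;
    returns a daily cumulative at-home case series."""
    daily = rng.poisson(lam=params["beta"] * 5, size=n_days)
    return np.cumsum(daily)

rng = np.random.default_rng(42)
most_likely = {"beta": 0.4}   # hypothetical median from accepted LHS sets
n_days, n_sims = 60, 1000

runs = np.array([run_stochastic_model(most_likely, n_days, rng)
                 for _ in range(n_sims)])

# Summarize simulation variability as median and 2.5-97.5 percentiles.
med = np.median(runs, axis=0)
lo, hi = np.percentile(runs, [2.5, 97.5], axis=0)

observed = np.cumsum(rng.poisson(2, n_days))  # stand-in for plant data
days = np.arange(n_days)
plt.fill_between(days, lo, hi, alpha=0.3, label="95% simulation interval")
plt.plot(days, med, label="model median")
plt.plot(days, observed, "k.", label="observed cumulative cases")
plt.xlabel("Day")
plt.ylabel("Cumulative at-home cases")
plt.legend()
plt.show()
```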

We also performed a sensitivity analysis for the stochastic model using Latin hypercube sampling and random forest analyses, which is a common approach for global sensitivity analysis in simulation modeling [22–25] (Supplementary text B in S1 Appendix).
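A minimal sketch of this LHS-plus-random-forest approach, under assumed parameter names and a placeholder outcome function, is shown below: a random forest is fit to map sampled parameters onto a model output, and parameters are then ranked by permutation importance.

```python
# Minimal sketch of a global sensitivity analysis via LHS + random forest.
# Parameter names, bounds, and the outcome function are illustrative only.
import numpy as np
from scipy.stats import qmc
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

names = ["beta", "latent_days", "p_clinical"]
lo, hi = np.array([0.1, 2, 0.2]), np.array([1.0, 7, 0.8])
X = qmc.scale(qmc.LatinHypercube(d=3, seed=0).random(2000), lo, hi)

def outcome(params, rng):
    """Placeholder for a stochastic-model output, e.g. final outbreak size."""
    beta, latent, p_clin = params
    return beta * p_clin / latent + rng.normal(scale=0.01)

rng = np.random.default_rng(0)
y = np.array([outcome(x, rng) for x in X])

# Fit a random forest to the parameter-outcome pairs, then rank
# parameters by how much permuting each one degrades predictions.
rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
for n, m in sorted(zip(names, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{n}: importance={m:.3f}")
```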
