N-mixtures Transition Probability Matrix

Matthew R. P. Parker; Laura L. E. Cowen; Jiguo Cao; Lloyd T. Elliott


This method is extracted from research article: J Agric Biol Environ Stat, Sep 2022

Computational Efficiency and Precision for Replicated-Count and Batch-Marked Hidden Population Models

DOI: 10.1007/s13253-022-00509-y

The likelihood function for the open-population N-mixtures model from Dail and Madsen (2011) is given in Eq. (5). The model has four parameters: the probability of detection $p$, the initial site abundance $λ$, the mean population growth rate $γ$, and the survival probability $ω$. There are also two study-specific constants: the number of sampling sites $R$ and the number of sampling occasions $T$. The total population at site $i$ and time $t$ is given by the latent variable $N_{it}$, and the observed population count is denoted $n_{it}$.
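The display for Eq. (5) is not reproduced in this extract. As a reconstruction only, assuming the usual form of the Dail and Madsen (2011) likelihood with the infinite summations truncated at the upper bound $K$ introduced in the next paragraph, it can be written as:

$$
L(p, λ, γ, ω \mid \{n_{it}\}) = \prod_{i=1}^{R} \left[ \sum_{N_{i1} = n_{i1}}^{K} \cdots \sum_{N_{iT} = n_{iT}}^{K} \left\{ \prod_{t=1}^{T} Bin(n_{it} ; N_{it}, p) \right\} Pois(N_{i1} ; λ) \prod_{t=2}^{T} P_{N_{i,t-1}, N_{it}} \right]
$$

The layout and index names here are assumptions and may differ from the published equation.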

The function $P_{a, b}$ shown in Eq. (6) gives the transition probability of moving from a population of size $a$ to a population of size $b$. Evaluating the likelihood requires computing $P_{a, b}$ at most $R T (K + 1) (T - 1)$ times (when each $n_{it} = 0$), where $K$ is the upper bound on the summations and thus also the upper bound on the population size. Computing $P_{a, b}$ is therefore the main bottleneck in evaluating the likelihood function. Because $P_{a, b}$ is a convolution of two discrete distributions, it can be calculated efficiently by convolution via the fast Fourier transform (FFT).
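The display for Eq. (6) is also missing from this extract. Based on the sum that appears in the discussion of Eq. (7), where $c$ counts the individuals that survive from one occasion to the next, it can be reconstructed as:

$$
P_{a, b} = \sum_{c = 0}^{\min\{a, b\}} Bin(c ; a, ω) \cdot Pois(b - c ; γ)
$$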

Here $Bin$ and $Pois$ denote the binomial and Poisson distribution functions, respectively. We define the transition probability matrix $M_{K}$ to be the matrix of $P_{a, b}$ values, where a and b vary from 0 to K.
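A minimal sketch of the direct ("manual") computation of $M_{K}$ follows, written in Python with the standard library only. It assumes $0 < ω < 1$ and $γ > 0$, evaluates both mass functions on the log scale, and uses the sum $P_{a, b} = \sum_{c} Bin(c ; a, ω) \cdot Pois(b - c ; γ)$ over $c = 0, \dots, \min\{a, b\}$. The function names are illustrative and are not taken from the article's code.

```python
import math


def binom_pmf(c, a, omega):
    """Bin(c; a, omega): probability that c of a individuals survive."""
    if c < 0 or c > a:
        return 0.0
    log_p = (math.lgamma(a + 1) - math.lgamma(c + 1) - math.lgamma(a - c + 1)
             + c * math.log(omega) + (a - c) * math.log1p(-omega))
    return math.exp(log_p)


def pois_pmf(j, gamma):
    """Pois(j; gamma): probability of j new arrivals."""
    if j < 0:
        return 0.0
    return math.exp(j * math.log(gamma) - gamma - math.lgamma(j + 1))


def transition_prob(a, b, omega, gamma):
    """P_{a,b}: sum over the number of survivors c = 0, ..., min(a, b)."""
    return sum(binom_pmf(c, a, omega) * pois_pmf(b - c, gamma)
               for c in range(min(a, b) + 1))


def transition_matrix(K, omega, gamma):
    """M_K as a list of rows: entry [a][b] is P_{a,b} for a, b in 0..K."""
    return [[transition_prob(a, b, omega, gamma) for b in range(K + 1)]
            for a in range(K + 1)]


M = transition_matrix(5, omega=0.5, gamma=1.0)
print(M[0][0])  # equals Pois(0; 1) = exp(-1), since an empty site has no survivors
```

Each entry costs up to $K + 1$ multiplications, so building the full matrix this way takes on the order of $K^{3}$ operations.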

Equation (7) illustrates the relationships between $P_{a, b}$, $M_{K}$, convolution, and the FFT. Let $x_{a, c} = Bin(c ; a, ω)$, so that $\underline{x}_{a} = \{x_{a, 0}, x_{a, 1}, \dots, x_{a, a}, 0, 0, \dots, 0\}$ (right-padded with zeroes until $\underline{x}_{a}$ has $K + 1$ elements), and let $y_{b, c} = Pois(b - c ; γ)$, so that $\underline{y} = \{y_{0, 0}, y_{1, 0}, \dots, y_{K, 0}\}$; that is, entry $j$ of $\underline{y}$ is $Pois(j ; γ)$. Then $(\underline{x}_{a} * \underline{y})_{b} = \sum_{c = 0}^{\min\{a, b\}} x_{a, c} \cdot y_{b, c} = P_{a, b}$, and $(\underline{x}_{a} * \underline{y}) = \{(\underline{x}_{a} * \underline{y})_{0}, (\underline{x}_{a} * \underline{y})_{1}, \dots, (\underline{x}_{a} * \underline{y})_{K}\}$ is the row of $M_{K}$ indexed by $a$.
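The convolution view can be sketched in Python as follows. This is an illustration under stated assumptions, not the article's implementation: it uses a small recursive radix-2 FFT written with the standard library so that the block is self-contained, it assumes $0 < ω < 1$ and $γ > 0$, and all function names are hypothetical. Both vectors are zero-padded to a power-of-two length of at least $2K + 1$ so that the circular convolution computed by the FFT agrees with the linear convolution on entries $0, \dots, K$.

```python
import cmath
import math


def binom_pmf(c, a, omega):
    """Bin(c; a, omega), evaluated on the log scale."""
    log_p = (math.lgamma(a + 1) - math.lgamma(c + 1) - math.lgamma(a - c + 1)
             + c * math.log(omega) + (a - c) * math.log1p(-omega))
    return math.exp(log_p)


def pois_pmf(j, gamma):
    """Pois(j; gamma), evaluated on the log scale."""
    return math.exp(j * math.log(gamma) - gamma - math.lgamma(j + 1))


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unnormalised (the caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def pad(seq, size):
    """Convert to complex and right-pad with zeros to the given length."""
    return [complex(v) for v in seq] + [0j] * (size - len(seq))


def transition_matrix_fft(K, omega, gamma):
    """M_K built row by row as the FFT convolution of x_a with y."""
    size = 1
    while size < 2 * K + 1:
        size *= 2
    # y depends only on gamma, so its transform is computed once.
    fy = fft(pad([pois_pmf(j, gamma) for j in range(K + 1)], size))
    rows = []
    for a in range(K + 1):
        # x_a has a + 1 nonzero entries; the padding supplies the zeros.
        fx = fft(pad([binom_pmf(c, a, omega) for c in range(a + 1)], size))
        conv = fft([p * q for p, q in zip(fx, fy)], inverse=True)
        rows.append([v.real / size for v in conv[: K + 1]])
    return rows


M = transition_matrix_fft(6, omega=0.7, gamma=2.0)
```

Each row costs on the order of $K \log K$ operations, so the whole matrix takes roughly $K^{2} \log K$ rather than $K^{3}$. Because the FFT works in floating point, entries whose exact value is extremely small may come back as rounding noise near machine precision, possibly with a negative sign.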

Computing $M_{K}$ accounts for a large portion of the computational cost of evaluating the likelihood function in open-population N-mixture models. Fig. 1 shows the median computation times for calculating $M_{K}$ for increasing values of $K$ using both the manual convolution and the FFT convolution techniques. Each plotted value is the median computing time over 100 runs for that value of $K$. In each run, a random value of $ω$ was drawn from a $\text{Beta}(a = 10, b = 10)$ distribution and a random value of $γ$ was drawn from a $χ_{K}^{2}$ distribution. $M_{K}$ was then calculated with the same $ω$ and $γ$ by both the manual convolution method and the FFT convolution method, and computation times were measured using the R package microbenchmark (Mersmann 2021). The error bars in Fig. 1 indicate the minimum and maximum observed computing times. For both methods, the upper and lower quantiles are indistinguishable from the median computing times on the log plot.
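The timing protocol described above can be sketched as follows. The study used the R package microbenchmark; this Python harness is a stand-in under that caveat, so absolute timings will not match the published figure. The names `benchmark` and `transition_matrix` are hypothetical, and only a direct-summation builder is included to keep the block self-contained. A $χ_{K}^{2}$ draw is generated as a gamma variate with shape $K / 2$ and scale 2.

```python
import math
import random
import statistics
import time


def transition_matrix(K, omega, gamma):
    """Direct-summation builder for M_K, used here as the function under test."""
    pois = [math.exp(j * math.log(gamma) - gamma - math.lgamma(j + 1))
            for j in range(K + 1)]
    matrix = []
    for a in range(K + 1):
        surv = [math.exp(math.lgamma(a + 1) - math.lgamma(c + 1)
                         - math.lgamma(a - c + 1)
                         + c * math.log(omega) + (a - c) * math.log1p(-omega))
                for c in range(a + 1)]
        matrix.append([sum(surv[c] * pois[b - c] for c in range(min(a, b) + 1))
                       for b in range(K + 1)])
    return matrix


def benchmark(build, K, runs=100, seed=1):
    """Return (min, median, max) wall-clock time in milliseconds over the runs."""
    rng = random.Random(seed)
    times_ms = []
    for _ in range(runs):
        omega = rng.betavariate(10, 10)         # Beta(a = 10, b = 10)
        gamma = rng.gammavariate(K / 2.0, 2.0)  # chi-square with K degrees of freedom
        start = time.perf_counter()
        build(K, omega, gamma)
        times_ms.append(1000.0 * (time.perf_counter() - start))
    return min(times_ms), statistics.median(times_ms), max(times_ms)


for K in (1, 10, 50):
    lo, med, hi = benchmark(transition_matrix, K, runs=20)
    print(f"K={K}: min={lo:.3f} ms, median={med:.3f} ms, max={hi:.3f} ms")
```

Passing a second builder with the same signature and the same seed would time it on identical draws of $ω$ and $γ$, which mirrors the paired comparison described in the text.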

Fig. 1 Plot of the log median computation time in milliseconds (over 100 runs with randomised $γ$ and $ω$) versus $K$ for computation of the transition probability matrix $M_{K}$ using the manual (grey) and fast Fourier transform (black) convolution methods. Error bars indicate the minimum and maximum computing times over the 100 runs per $K$. We considered $K = 1, 10, 50, 100, 250, 500,$ and $1000$.

Often, time-varying covariates for the population dynamics parameters $γ$ and $ω$ are incorporated into N-mixture models. When this is done, the matrix $M_{K}$ is calculated once for each time point $t \in \{2, \dots, T\}$. Thus the computing time savings when using the FFT are increased by a factor of up to $T - 1$ when time covariates are considered.
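In symbols, and using a superscript $(t)$ and subscripted parameters $γ_{t}$, $ω_{t}$ as notation assumed here rather than taken from the article, the time-varying case replaces the single matrix with one matrix per transition:

$$
P^{(t)}_{a, b} = \sum_{c = 0}^{\min\{a, b\}} Bin(c ; a, ω_{t}) \cdot Pois(b - c ; γ_{t}), \qquad M_{K}^{(t)} = \left[ P^{(t)}_{a, b} \right]_{a, b = 0}^{K}, \qquad t = 2, \dots, T.
$$

Each of these $T - 1$ matrices costs as much to build as the single matrix of the constant-parameter model, which is where the factor of up to $T - 1$ comes from.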

In Sect. 3.1 we use a simulation study to illustrate the improved computational efficiency of using the FFT method for calculating $M_{K}$ when fitting N-mixture models. In Sect. 4.1 we apply the FFT method to improve computing efficiency for an Ancient Murrelet chick counts application.

Copyright and License information: International Biometric Society ©2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
