When using the population sizes inferred in the previous section to build lookup tables for pyrho, we made several approximations to reduce the computational cost. The population size functions returned by the smc++ plot command are piecewise constant, with many pieces. To reduce the number of pieces, we started at the present and combined adjacent pieces, replacing them with the harmonic mean of their population sizes (weighted by their lengths), provided that every combined piece had a population size within 10% of the resulting harmonic mean. Furthermore, computing the initial stationary distribution of two-locus configurations, which depends on the most ancient population size, is computationally expensive, so after reducing the number of pieces we set the most ancient size to 19,067 for all populations. Computing the exact two-locus likelihoods requires $O(n^6)$ time, where $n$ is the sample size, which is computationally prohibitive for sample sizes in the hundreds across 26 populations. In previous work (23), we showed that downsampling approximate two-locus likelihoods computed for a larger sample size, $N$, results in little loss in accuracy; these approximate likelihoods may be computed in $O(N^3)$ time and downsampled in $O(N^3 (N - n))$ time. We therefore used this approximation with $N = 256$ for each population, downsampling to the observed sample size, which ranged from $n = 122$ to $n = 226$ haploids.
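The piece-merging step can be illustrated with a minimal Python sketch. This is not the code used in the study: the function names, the greedy left-to-right grouping, and the reading of "within 10%" as a relative deviation from the harmonic mean are all assumptions made for illustration. The sketch treats only the finite pieces (ordered from the present backward in time) and appends the most ancient, infinitely long piece separately with the fixed size of 19,067.

```python
from typing import List, Tuple

ANCIENT_SIZE = 19_067  # most ancient size stated in the text


def weighted_harmonic_mean(sizes: List[float], lengths: List[float]) -> float:
    """Harmonic mean of population sizes, weighted by piece duration."""
    return sum(lengths) / sum(l / s for s, l in zip(sizes, lengths))


def merge_pieces(
    sizes: List[float], lengths: List[float], tol: float = 0.10
) -> List[Tuple[float, float]]:
    """Greedily merge adjacent finite pieces, starting at the present.

    Hypothetical helper: sizes[i] and lengths[i] describe the i-th piece,
    ordered from the present backward in time. A piece joins the current
    group only if every size in the enlarged group lies within `tol`
    (relative) of the group's weighted harmonic mean.
    """
    merged = []
    group_sizes, group_lengths = [sizes[0]], [lengths[0]]
    for s, l in zip(sizes[1:], lengths[1:]):
        cand_sizes = group_sizes + [s]
        cand_lengths = group_lengths + [l]
        h = weighted_harmonic_mean(cand_sizes, cand_lengths)
        if all(abs(x - h) <= tol * h for x in cand_sizes):
            # Accept the enlarged group and keep extending it.
            group_sizes, group_lengths = cand_sizes, cand_lengths
        else:
            # Close the current group and start a new one at this piece.
            merged.append(
                (weighted_harmonic_mean(group_sizes, group_lengths),
                 sum(group_lengths))
            )
            group_sizes, group_lengths = [s], [l]
    merged.append(
        (weighted_harmonic_mean(group_sizes, group_lengths), sum(group_lengths))
    )
    return merged


# Toy input (made-up numbers): four finite pieces, then the ancient piece.
finite = merge_pieces([1000.0, 1050.0, 2000.0, 2000.0], [10.0, 10.0, 5.0, 5.0])
history = finite + [(ANCIENT_SIZE, float("inf"))]
```

In the toy input, the first two pieces differ by about 5% and are merged, while the jump to 2,000 exceeds the tolerance and starts a new group. Other grouping strategies that satisfy the same 10% criterion are possible.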
