Published: Aug 20, 2023 DOI: 10.21769/BioProtoc.4787 Views: 245
Abstract
MAPGD (https://github.com/LynchLab/MAPGD) is a likelihood-based software package for population genomic data analysis. Here, we present a simple tutorial for using MAPGD to analyze Arabidopsis population data. We estimate allele and genotype frequencies, average heterozygosity, relatedness, and other population parameters. This pipeline can be applied to further plant population genomic studies.
Keywords: MAPGD
Background
MAPGD (Ackerman, 2016; Ackerman et al., 2022) is a series of related programs that estimate allele frequency, heterozygosity, Hardy-Weinberg disequilibrium, linkage disequilibrium, and identity-by-descent coefficients from population genomic data using a statistically rigorous maximum likelihood approach. The MAPGD software was primarily designed for the analysis of populations of the planktonic crustacean Daphnia (Lynch et al., 2017; Maruki et al., 2019; Ye et al., 2019). It is most useful for the analysis of low-coverage population genomic data or pooled data, where many individuals are used to prepare a single sample (Lynch et al., 2014). Since its release, MAPGD has been widely used in population genomic studies of various organisms, including the nematode Caenorhabditis elegans (Adams et al., 2022), stickleback fish (Jeffries et al., 2022), and human gut microbiota (Shoemaker, 2022). A benchmark study showed that MAPGD could outperform other single nucleotide polymorphism (SNP) callers and has great potential in different applications (Guirao-Rico and González, 2021). Here, we demonstrate the use of MAPGD in the analysis of a publicly available Arabidopsis population dataset (The 1001 Genomes Consortium, 2016). We show that MAPGD can be used for plant population genomic studies.
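To make the idea of a maximum likelihood estimate from low-coverage reads more concrete, the following Python sketch estimates an alternate-allele frequency at a single biallelic site by grid search. It is a deliberately simplified illustration and not MAPGD's implementation: it assumes Hardy-Weinberg genotype proportions, a fixed sequencing error rate, and independent reads, and the function names (`individual_likelihood`, `estimate_allele_frequency`) and the example read counts are hypothetical.

```python
import math


def individual_likelihood(n_ref, n_alt, p, error=0.01):
    """Likelihood of one individual's read counts given alt-allele frequency p.

    Assumption: genotypes follow Hardy-Weinberg proportions. The binomial
    coefficient is omitted because it does not depend on p.
    """
    # Prior probability of genotypes ref/ref, ref/alt, alt/alt.
    priors = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    # Probability that a single read shows the alternate allele, per genotype.
    alt_read_prob = (error, 0.5, 1 - error)
    return sum(
        prior * (q ** n_alt) * ((1 - q) ** n_ref)
        for prior, q in zip(priors, alt_read_prob)
    )


def estimate_allele_frequency(read_counts, error=0.01, grid_size=999):
    """Grid-search maximum likelihood estimate of the alt-allele frequency.

    read_counts is a list of (n_ref, n_alt) pairs, one pair per individual.
    Returns (p_hat, expected heterozygosity 2 * p_hat * (1 - p_hat)).
    """
    # Candidate frequencies strictly between 0 and 1 (0.001, 0.002, ..., 0.999).
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]

    def log_likelihood(p):
        return sum(
            math.log(individual_likelihood(n_ref, n_alt, p, error))
            for n_ref, n_alt in read_counts
        )

    p_hat = max(grid, key=log_likelihood)
    return p_hat, 2 * p_hat * (1 - p_hat)


if __name__ == "__main__":
    # Hypothetical read counts for five individuals at one site.
    counts = [(4, 0), (2, 3), (0, 5), (3, 2), (6, 0)]
    p_hat, heterozygosity = estimate_allele_frequency(counts)
    print(f"estimated alt-allele frequency: {p_hat:.3f}")
    print(f"expected heterozygosity: {heterozygosity:.3f}")
```

The grid search keeps the sketch easy to follow. The quantities listed above, such as Hardy-Weinberg disequilibrium, would need additional parameters that this toy model fixes by assumption.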
Software and datasets
Software
SAMtools (Li et al., 2009; Li, 2011; Danecek et al., 2021; version 1.9; http://www.htslib.org/)
MAPGD (Lynch et al., 2014; Maruki and Lynch, 2014 and 2015; Ackerman et al., 2017 and 2022; version 0.4.40; https://github.com/LynchLab/MAPGD). The complete user manual is available at https://github.com/LynchLab/MAPGD/tree/master/docs/man. Users may also find the software performance testing results at https://github.com/LynchLab/genomics_simulation.
R (R Core Team, 2022; version 4.1.3; https://www.r-project.org/) for visualization
Input data
The sample BAM files were downloaded from the Arabidopsis 1001 Genomes Project (The 1001 Genomes Consortium, 2016). All five BAM files used were from the “JGIHeazlewood2011” subset. The subset project page is https://1001genomes.org/projects/JGIHeazlewood2011/index.html.
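As a rough illustration of how the five BAM files might be prepared for downstream analysis, the Python sketch below assembles a `samtools mpileup` command line without running it. It assumes that the workflow passes pileup output to MAPGD, which is not stated in this section. The file names (`sample1.bam` to `sample5.bam`, `reference.fa`) and the helper `build_mpileup_command` are hypothetical placeholders, not names from the protocol.

```python
import shlex


def build_mpileup_command(bam_files, reference_fasta=None):
    """Return a samtools mpileup command as a list of arguments.

    The command is only constructed, not executed, so the sketch can be
    inspected without SAMtools installed.
    """
    if not bam_files:
        raise ValueError("at least one BAM file is required")
    command = ["samtools", "mpileup"]
    if reference_fasta is not None:
        # -f supplies the reference FASTA used to report reference bases.
        command += ["-f", reference_fasta]
    command += list(bam_files)
    return command


if __name__ == "__main__":
    # Hypothetical file names standing in for the five downloaded BAM files.
    bams = [f"sample{i}.bam" for i in range(1, 6)]
    command = build_mpileup_command(bams, reference_fasta="reference.fa")
    print(shlex.join(command))
```

Building the argument list separately from execution makes it straightforward to review the exact command before running it, for example with `subprocess.run(command, check=True)`.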
Procedure
© 2023 The Author(s); This is an open access article under the CC BY-NC license (https://creativecommons.org/licenses/by-nc/4.0/).
Category
Bioinformatics and Computational Biology
Plant Science > Plant molecular biology > Genetic analysis