Phylogenetic inference and estimation of pairwise genomic distances

AM Anatolie Marta
TT Tomáš Tichopád
OB Oldřich Bartoš
JK Jiří Klíma
MS Mujahid Ali Shah
VB Vendula Šlechtová Bohlen
JB Joerg Bohlen
KH Karel Halačka
LC Lukáš Choleva
MS Matthias Stöck
DD Dmitrij Dedukh
KJ Karel Janko
AS Andrea Sweigart
A Anonymous
A Anonymous

Mitochondrial loci were excluded from all downstream SNP analyses, which focused only on nuclear exon sequences. The VCF dataset was used to calculate pairwise SNP-based p-distances between all individuals using VCF2Dis v1.47 (Subramanian et al., 2019).
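To illustrate what a pairwise SNP p-distance measures, the following Python sketch computes it from a toy genotype table. It is not the VCF2Dis implementation: it assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2), a per-site distance of 0, 0.5 or 1 depending on how many alleles differ, and that sites missing in either individual are skipped. The function names and sample names are hypothetical.

```python
from itertools import combinations


def p_distance(g1, g2):
    """Mean per-site distance between two individuals.

    Genotypes are alternate-allele counts (0, 1 or 2); None marks a
    missing call. Sites missing in either individual are skipped
    (an assumption of this sketch).
    """
    total = 0.0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        # 0 alleles differ -> 0.0, one -> 0.5, both -> 1.0
        total += abs(a - b) / 2.0
        compared += 1
    return total / compared if compared else float("nan")


def pairwise_p_distances(genotypes):
    """Return {(sample_i, sample_j): p-distance} for all sample pairs."""
    return {
        (s1, s2): p_distance(genotypes[s1], genotypes[s2])
        for s1, s2 in combinations(sorted(genotypes), 2)
    }


# Toy data: three hypothetical individuals, five SNP sites.
toy = {
    "ind_A": [0, 0, 2, 1, None],
    "ind_B": [0, 2, 2, 1, 0],
    "ind_C": [2, 2, 0, None, 0],
}
distances = pairwise_p_distances(toy)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
```

In practice the genotype table would be parsed from the VCF file after removing mitochondrial loci; the parsing step is omitted here.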

To reconstruct phylogenetic relationships among the crossed species, we used the GATK4 FastaAlternateReferenceMaker tool to create locus-specific consensus sequences for each sample. Regions with low read depth in each consensus were identified using the depth option of samtools (Danecek et al., 2021); the output was converted to BED format, and sites with depth <10 were masked with ‘N’ using the maskfasta option of bedtools (Quinlan and Hall, 2010). To mitigate locus dropout in distant species, we then used custom R scripts (R Core Team, 2020) with functions from the seqinr package (Charif and Lobry, 2007) to select the alignments in which reads from all investigated species were present. The final phylogenetic analysis was thus based on 1960 loci with length >750 bp and >30 parsimony-informative sites, in which all species had correctly read sequence variants at more than 70% of sites.

Individual maximum likelihood gene trees were reconstructed with IQ-TREE v. 2.0.3 (Nguyen et al., 2015) using extended model selection with free-rate heterogeneity in combination with 1000 ultrafast bootstrap replicates (Kalyaanamoorthy et al., 2017; Hoang et al., 2018). The consensus species tree was estimated with ASTRAL v. 5.5.6 (Zhang et al., 2018). Gene concordance factors (gCF) and site concordance factors (sCF) were estimated (Minh et al., 2020), and the resulting trees were processed using the ape package (Paradis and Schliep, 2019) and plotted with DensiTree v2.0 (Bouckaert, 2010).
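The locus-selection criteria described above (length >750 bp, >30 parsimony-informative sites, and more than 70% called sites in every species) can be sketched as a simple filter. The authors used custom R scripts with seqinr; the Python version below is only an illustration under stated assumptions: each locus is an already aligned set of equal-length sequences, masked or missing positions are anything other than A, C, G or T, and the 70% threshold is applied per species. All function and variable names are hypothetical.

```python
from collections import Counter

CALLED = set("ACGT")  # anything else (N, gaps, ambiguity codes) counts as missing


def called_fraction(seq):
    """Fraction of positions in a sequence with an unambiguous base."""
    if not seq:
        return 0.0
    return sum(base in CALLED for base in seq) / len(seq)


def count_parsimony_informative(seqs):
    """Count columns with at least two states, each seen in >= 2 sequences."""
    n_informative = 0
    for column in zip(*seqs):
        counts = Counter(base for base in column if base in CALLED)
        states_seen_twice = sum(1 for c in counts.values() if c >= 2)
        if states_seen_twice >= 2:
            n_informative += 1
    return n_informative


def keep_locus(alignment, min_len=750, min_pis=30, min_called=0.70):
    """Return True if an aligned locus passes all three thresholds.

    `alignment` maps species name -> aligned sequence string.
    Thresholds are strict (">"), following the wording of the protocol.
    """
    seqs = [s.upper() for s in alignment.values()]
    if not seqs or len({len(s) for s in seqs}) != 1:
        return False  # empty or not a proper alignment
    if len(seqs[0]) <= min_len:
        return False
    if any(called_fraction(s) <= min_called for s in seqs):
        return False
    return count_parsimony_informative(seqs) > min_pis


# Toy alignment with relaxed thresholds so the example fits on screen.
toy_locus = {
    "species_1": "ACGTAC",
    "species_2": "ACGTAC",
    "species_3": "TCGAAN",
    "species_4": "TCGANN",
}
n_pis = count_parsimony_informative(list(toy_locus.values()))
passes_relaxed = keep_locus(toy_locus, min_len=5, min_pis=1, min_called=0.6)
passes_default = keep_locus(toy_locus)
print(n_pis, passes_relaxed, passes_default)
```

With the default thresholds the toy locus is rejected because it is far shorter than 750 bp; the relaxed call only demonstrates how the three criteria combine.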
