We reconstructed phylogenetic relationships using a molecular dataset comprising six mitochondrial (12S, 16S, COI, CytB, ND1, and ND2) and six nuclear (B-Fib, MUSK, ODC, RAG1, TGFB2, and ZENK) gene regions, totaling 18,246 base pairs. We obtained sequences for 220 bird species and a two-taxon outgroup (crocodiles). All 61 species for which we examined morphology and ROM were included in this sampling. The additional 159 bird species (220 − 61) were included to aid in estimating divergence times among avian clades. All sequences were downloaded from GenBank (see data S2 for accession numbers and information on genetic sampling).

We aligned each gene region separately using the built-in algorithm in Geneious v9.1.6 (33). Flanking regions that contained sequences from fewer than 60% of taxa were excluded. To identify the best-fitting model of nucleotide substitution for each gene, we used PartitionFinder v2.0 (34). In each case, the best fit (assessed via Akaike information criterion and Bayesian information criterion scores) was a GTR + I + Γ model or a close variant thereof.
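The flank-trimming rule described above can be illustrated with a minimal Python sketch. It is not the Geneious procedure itself: the function name `trim_flanks`, the toy alignment, and the choice of `-` and `?` as missing-data symbols are assumptions made for illustration only.

```python
# Assumed symbols for alignment gaps and missing data (hypothetical choice).
MISSING = {"-", "?"}


def trim_flanks(alignment, min_coverage=0.6):
    """Drop leading and trailing columns in which fewer than
    min_coverage of the taxa have a base. Interior columns are kept."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    n_taxa = len(alignment)

    def covered(col):
        present = sum(1 for seq in alignment.values() if seq[col] not in MISSING)
        return present / n_taxa >= min_coverage

    # Walk inward from each end until a sufficiently covered column is found.
    start = 0
    while start < n_cols and not covered(start):
        start += 1
    end = n_cols
    while end > start and not covered(end - 1):
        end -= 1
    return {taxon: seq[start:end] for taxon, seq in alignment.items()}


# Toy alignment (made-up sequences, five hypothetical taxa).
toy = {
    "taxon_A": "--ACGTAC--",
    "taxon_B": "-GACGTACG-",
    "taxon_C": "---CGTACGT",
    "taxon_D": "--ACGTA---",
    "taxon_E": "TGACGTAC--",
}
trimmed = trim_flanks(toy)
for taxon, seq in trimmed.items():
    print(taxon, seq)
```

In this toy case the two leftmost and two rightmost columns are removed because at most two of the five taxa have a base there; gaps inside the retained block are left untouched.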

Using SequenceMatrix v1.7.8 (35), we concatenated the aligned sequences into a supermatrix. We partitioned this supermatrix by molecular marker and performed a maximum-likelihood analysis in RAxML (36) under a GTR + Γ model, with a bootstrap analysis of 1000 pseudoreplicates. The phylogenetic tree with the best likelihood score was retained to guide further analyses (data S3).
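As a rough illustration of what concatenation into a partitioned supermatrix involves, the following Python sketch joins per-marker alignments and records the column range of each marker. It does not reproduce SequenceMatrix; the function name `concatenate`, the toy data, and the use of `?` to pad taxa lacking a marker are assumptions. The printed lines follow the `DNA, name = start-end` layout used in RAxML partition files.

```python
def concatenate(alignments, missing="?"):
    """Join per-marker alignments into one supermatrix.

    alignments: dict mapping marker name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where partitions maps each marker
    to its 1-based inclusive column range in the supermatrix.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = {}
    position = 0
    for marker, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {marker} differ in length")
        length = lengths.pop()
        partitions[marker] = (position + 1, position + length)
        for taxon in taxa:
            # Taxa without this marker are padded with the missing symbol.
            pieces[taxon].append(aln.get(taxon, missing * length))
        position += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy input (made-up sequences, hypothetical taxon labels).
toy_alignments = {
    "COI": {"sp1": "ACGT", "sp2": "ACGA"},
    "RAG1": {"sp1": "GGC", "sp3": "GGT"},
}
supermatrix, partitions = concatenate(toy_alignments)
for marker, (start, end) in partitions.items():
    print(f"DNA, {marker} = {start}-{end}")
```

Keeping the column ranges alongside the matrix is what allows each marker to receive its own substitution model in downstream analyses.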

We then used BEAST v2.4.7 (8) to simultaneously estimate topology, branch lengths, and divergence times via Bayesian Markov chain Monte Carlo (MCMC). Using a relaxed lognormal clock model, we partitioned the supermatrix by sequence and fit a separate substitution model to each partition based on our results from PartitionFinder. To estimate divergence times, we placed informative parametric priors on nodes of the tree to reflect the available paleontological history of the group (table S1). Descendant members of each specified node were based on the topology of the maximum-likelihood tree from RAxML.

To ensure that the BEAST MCMC sampling converged on the target distribution, we ran nine separate chains, each from a different random starting tree. Each MCMC chain ran for 200 million generations, sampling every 20,000 generations. We assessed convergence by plotting likelihood versus generation and estimating the effective sample size (ESS) of each parameter. In a similar analysis in which the supermatrix was not partitioned, most MCMC chains did not attain stationarity.
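To make the ESS diagnostic concrete, here is a minimal Python sketch of one common estimator: the number of samples divided by an autocorrelation time, with the autocorrelation sum truncated at the first non-positive lag. This is an assumption for illustration and not necessarily the estimator used by the software in this study; the function name and the simulated traces are hypothetical.

```python
import random


def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of lag autocorrelations),
    truncating the sum at the first non-positive autocorrelation."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("ESS is undefined for a constant trace")
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Simulated traces (not data from the study): one with independent draws,
# one strongly autocorrelated, as a poorly mixing chain would be.
rng = random.Random(1)
independent = [rng.gauss(0, 1) for _ in range(2000)]
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print(f"independent trace ESS: {ess_independent:.0f}")
print(f"autocorrelated trace ESS: {ess_sticky:.0f}")
```

Both traces contain 2000 samples, but the autocorrelated one carries far less independent information, which is why ESS rather than raw sample count is used to judge whether a chain has been run long enough.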

After discarding the burn-in from each chain (the first ~20%), we combined chains using LogCombiner v2.4.7 (8). Within each chain, the ESSs of all parameters were generally >200, with the lowest ESS still >100. After we combined all results, most parameters had an ESS of >2000. The combined set included >500 million trees, which we used to assemble the maximum clade credibility tree in TreeAnnotator v2.4.7 (8). The resulting maximum clade credibility tree (data S4) was used for all phylogenetically informed analyses of comparative data (and to generate the main text figures). In addition, 1000 trees were sampled from the posterior distribution to assess phylogenetic sensitivity in all comparative analyses (see Fig. 5).
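The burn-in removal, pooling, and posterior subsampling steps can be sketched in a few lines of Python. This is only a schematic of the logic, not LogCombiner itself: the function names, the toy chains (integers standing in for sampled trees), and the fixed random seed are assumptions for illustration.

```python
import random


def discard_burnin(samples, fraction=0.2):
    """Drop the first `fraction` of a chain's samples as burn-in."""
    n_drop = int(len(samples) * fraction)
    return samples[n_drop:]


def combine_chains(chains, fraction=0.2):
    """Remove burn-in from every chain, then pool what remains."""
    pooled = []
    for chain in chains:
        pooled.extend(discard_burnin(chain, fraction))
    return pooled


# Toy chains: three runs of ten samples each (placeholders for trees).
toy_chains = [
    list(range(0, 10)),
    list(range(100, 110)),
    list(range(200, 210)),
]
pooled = combine_chains(toy_chains)

# Draw a random subset of the pooled posterior without replacement,
# analogous to sampling a fixed number of trees for sensitivity analyses.
rng = random.Random(0)  # fixed seed (arbitrary) so the draw is repeatable
subset = rng.sample(pooled, 5)
print(len(pooled), "pooled samples;", len(subset), "subsampled")
```

Burn-in is removed per chain before pooling because each chain starts from a different random tree and approaches the target distribution independently.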
