We tested the effect of host–plant association on diversification by estimating speciation and extinction rates with five methods to cross-test hypotheses and corroborate results. Analyses were performed on 100 dated trees randomly sampled from the Bayesian dating analyses to take into account the uncertainty in age estimates. We used the following approaches: (1) ML-based trait-dependent diversification143,144; (2) ML-based time-dependent diversification145; (3) Bayesian analysis of macroevolutionary mixture146; (4) Bayesian branch-specific diversification rates147; and (5) Bayesian episodic birth–death model148. It is worth mentioning that each method differs at several points in their estimation of speciation and extinction rates. For instance, trait-dependent birth–death models estimate constant speciation and extinction rates144, whereas time-dependent birth–death models estimate clade-specific speciation and extinction rates and their variation through time145,147. Therefore, we expect some differences in the values of estimated diversification rates that are inherent to each approach. Our diversification analyses should be seen as complementary to the inferred diversification trend rather than corroborating the values and magnitude of speciation and extinction rates.
First, we computed the probability of obtaining a clade as large as size n, given the crown age of origin, the overall net diversification rate of the family, and an extinction rate as a fraction of speciation rate following the approach in Condamine et al.34 relying on the method of moments149. We used the R-package LASER 2.3150 to estimate the net diversification rates of Papilionidae and six clades shifting to new host plants with the bd.ms function (providing crown age and total species diversity). Then, we used the crown.limits function to estimate the mean expected clade size for each clade shifting to new host plants given clades’ crown age and overall net diversification rates, and we finally computed the probability to observe such clade size using the crown.p function. All rate estimates were calculated with three ε values (ε = 0/0.5/0.9), knowing that the extinction rate in swallowtails is usually low34 (supported by the results of this study).
Second, we relied on the state-dependent speciation and extinction (SSE) model, in which speciation and extinction rates are associated with phenotypic evolution of a trait along a phylogeny143. In particular, we used the MuSSE144 implemented in the R-package diversitree 0.9-10151, which allows multiple character states to be studied. Larval host–plant data were taken from previous works5,18,30,34,112,113,152. The following ten host–plant character states and corresponding ratios of sampled species in the tree of all known species for each character (sampling fractions) were used: 1 = Aristolochiaceae (110/152), 2 = Annonaceae (69/138), 3 = Lauraceae (33/39), 4 = Apiaceae (9/10), 5 = Rutaceae (119/163), 6 = Crassulaceae (19/19), 7 = Papaveraceae (44/44), 8 = Fabaceae (1/1), 9 = Zygophyllaceae (2/2), and 10 = Magnoliaceae (2/2). Data at a lower taxonomic level than plant family were not used because of the large number of multiple associations exhibited by genera that could alter the phylogenetic signal. We assigned a single state to each species by selecting the food plant with the maximum number of collections for each species. We did not employ multiple states per species, which represents a lesser problem because (1) few swallowtail species feed on multiple plant families, (2) current shared-state models can only model two states, and (3) the addition of multi-plant states to the MuSSE analysis would have greatly increased the number of parameters. We performed both ML and Bayesian MCMC analyses (10,000 steps) performed using an exponential (1/(2 × net diversification rate)) prior with starting parameter values obtained from the best-fitting ML model and resulting speciation, extinction and transition rates. After a burn-in of 500 steps, we estimated posterior density distribution for speciation, extinction and transition rates. There have been concerns about the power of SSE models to infer diversification dynamics from a distribution of species traits153–155, hence other birth–death models were used to corroborate the results obtained with SSE models.
Third, to provide an independent assessment of the relationship between diversification rates and host specificity, we used the ML approach of Morlon et al.145 implemented in the R-package RPANDA 1.3156. This is a birth–death method in which speciation and/or extinction rates may change continuously through time. This method has the advantage of not assuming a constant extinction rate over time (unlike BAMM146), and allows clades to have declining diversity since extinction can exceed speciation, meaning that diversification rates can be negative145. For each clade that shifted to a new host family, we designed and fitted six diversification models: (1) a Yule model, where speciation is constant and extinction is null; (2) a constant birth–death model, where speciation and extinction rates are constant; (3) a variable speciation rate model without extinction; (4) a variable speciation rate model with constant extinction; (5) a rate-constant speciation and variable extinction rate model; and (6) a model in which both speciation and extinction rates vary. Models were compared by computing the ML estimate of each model and the resulting Akaike information criterion corrected by sample size. We then plotted rates through time with the best-fit model for each clade, and the rates for the family as a whole for comparison purpose.
Fourth, we performed models that allow diversification rates to vary among clades across the whole phylogeny. BAMM 2.5146,157 was used to explore for differential diversification dynamic regimes among clades differing in their host–plant feeding. BAMM can automatically detect rate shifts and sample distinct evolutionary dynamics that explain the diversification dynamics of a clade without a priori hypotheses on how many and where these shifts might occur. Evolutionary dynamics can involve time-variable diversification rates; in BAMM, speciation is allowed to vary exponentially through time while extinction is maintained constant: subclades in a tree may diversify faster (or slower) than others. This Bayesian approach can be useful in detecting shifts of diversification potentially associated with key innovations157. BAMM analyses were run with four MCMC for 20 million generations, sampling every 20,000th and with three different values (1, 5 and 10; Supplementary Table 3) of the compound Poisson prior (CPP) to ensure the posterior is independent of the prior158. We accounted for non-random incomplete taxon sampling using the implemented analytical correction; we set a sampling fraction per genus based on the known species diversity of each genus. Mixing and convergence among runs (ESS > 200 after 15% burn-in) were assessed with the R-package BAMMtools 2.1159 to estimate (1) the mean global rates of diversification through time, (2) the estimated number of rate shifts evaluating alternative diversification models comparing priors and posterior probabilities and (3) the clade-specific rates through time when a distinct macroevolutionary regime is identified.
Fifth, BAMM has been criticized for incorrectly modelling rate shifts on extinct lineages, that is, unobserved (extinct or non-sampled) lineages inherit the ancestral diversification process and cannot experience subsequent diversification-rate shifts158,160. To solve this, we used a Bayesian approach implemented in RevBayes 1.0.10161 that models rate shifts consistently on extinct lineages by using the SSE framework147,158. Although there is no information of rate shifts for unobserved/extinct lineages in a phylogeny including extant species only, these types of events must be accounted for in computing the likelihood. The number of rate categories is fixed in the analysis but RevBayes allows any number to be specified, thus allowing direct comparison of different macroevolutionary regimes.
Finally, we evaluated the impact of abrupt changes in diversification using the Bayesian episodic birth–death model of CoMET148 implemented in the R-package TESS 2.1162. These models allow detection of discrete changes in speciation and extinction rates concurrently affecting all lineages in a tree, and estimate changes in diversification rates at discrete points in time, but can also infer mass extinction events (sampling events in which the extant diversity is reduced by a fraction163). Speciation and extinction rates can change at those points, but remain constant within time intervals. In addition, TESS uses independent CPPs to simultaneously detect mass extinction events and discrete changes in speciation and extinction rates, while TreePar estimates the magnitude and timing of speciation and extinction changes independently to the occurrence of mass extinctions (i.e. the three parameters cannot be estimated simultaneously due to parameter identifiability issues163). We performed two independent analyses allowing and disallowing mass extinction events. Bayes factor comparisons were used to assess the model fit between models with varying number and time of changes in speciation/extinction rates and mass extinctions.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.
Tips for asking effective questions
+ Description
Write a detailed description. Include all information that will help others answer your question including experimental processes, conditions, and relevant images.