Statistical analyses

A. L. Hargreaves, Esteban Suárez, Klaus Mehltreter, Isla Myers-Smith, Sula E. Vanderplank, Heather L. Slinn, Yalma L. Vargas-Rodriguez, Sybille Haeussler, Santiago David, Jenny Muñoz, R. Carlos Almazán-Núñez, Deirdre Loughnan, John W. Benning, David A. Moeller, Jedediah F. Brodie, Haydn J.D. Thomas, P. A. Morales M.

This protocol is extracted from research article:

Seed predation increases from the Arctic to the Equator and from high to low elevations

**Sci Adv**, Feb 20, 2019; DOI: 10.1126/sciadv.aau4403

Procedure

*Latitudinal and elevational patterns*. We used GLMMs to quantify the effects of latitude, elevation, seed type, and their interactions on seed predation using the *lme4* package in R (3.3.3). Because seed predation data are a binomial proportion, we used a binomial error distribution and a logit-link function. As sites on a transect may vary together temporally in seed predation (e.g., with regional pest outbreaks), and repeated measures of a single site are not independent, all models include a random intercept for the start date of the experiment (the mean date if the experiment was run over >2 days) and a random intercept for each site. We also analyzed raw data including the individual seed depots as the base level of the hierarchy and additional random factors for “site × date,” to account for nonindependence of depots at a given site on a given date, and “depot,” an individual-level random factor to resolve overdispersion. These models yielded the same fixed effects in the final model, the same ranking of factors within fixed effects, and the same slope direction for continuous variables as models on averaged data; however, they produced multiple convergence warnings. Hence, we present results from models on data averaged across depots within runs.
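The averaging step and the logit link described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R with *lme4*); the record layout, site names, dates, and counts below are hypothetical. The sketch pools seed counts across depots within a run, which equals the mean of depot proportions only under the assumption that every depot held the same number of seeds.

```python
import math
from collections import defaultdict


def logit(p):
    """Logit link: maps a proportion in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse logit: maps a linear-predictor value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def average_depots(records):
    """Collapse depot-level records to one row per (site, date, seed type).

    Each record is (site, date, seed_type, seeds_eaten, seeds_offered).
    Returns {key: (eaten, offered, proportion_eaten)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for site, date, seed_type, eaten, offered in records:
        key = (site, date, seed_type)
        totals[key][0] += eaten
        totals[key][1] += offered
    return {
        key: (eaten, offered, eaten / offered)
        for key, (eaten, offered) in totals.items()
    }


# Hypothetical depot-level data: two depots per seed type at one site and date.
records = [
    ("siteA", "2016-07-01", "sunflower", 3, 5),
    ("siteA", "2016-07-01", "sunflower", 1, 5),
    ("siteA", "2016-07-01", "oat", 0, 5),
    ("siteA", "2016-07-01", "oat", 2, 5),
]
runs = average_depots(records)
```

Each resulting row carries the binomial numerator and denominator, so a binomial model can weight the proportion by the number of seeds offered.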

We ran one model per hypothesis with the following fixed effects. In model 1, we tested whether total seed predation differed between temperate and tropical zones, including transect latitude as a categorical variable (>23.5°N = temperate, <23.5°N = tropical), elevation, and seed type. Categorical latitude is consistent with biogeographic hypotheses that compare “the tropics” to “the temperate zone,” without necessarily invoking a continuous gradient (*15*). All other analyses consider continuous latitude. In model 2, we tested whether total seed predation declined continuously with increasing latitude and elevation, including site latitude (decimal degrees), elevation (masl), and seed type. In model 3, we tested whether invertebrate-only predation varied with latitude and elevation and whether these patterns differed from patterns in total predation. This model considers only sites and dates where invertebrate predation was measured and only sunflower seeds as we did not cage oats. Factors were latitude, elevation, and exclusion treatment (all predators versus invertebrates only).
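As a small illustration of the categorical latitude variable in model 1, the Python sketch below assigns a zone from a decimal-degree latitude. The function name and example latitudes are hypothetical. Two choices here are assumptions not stated in the text: absolute latitude is used so that any southern site would be treated symmetrically, and a site at exactly 23.5° is labeled tropical.

```python
TROPIC_LIMIT_DEG = 23.5  # threshold used in the text to split temperate from tropical


def latitude_zone(latitude_deg):
    """Return 'temperate' or 'tropical' for a latitude in decimal degrees.

    Assumptions: absolute latitude is used, and exactly 23.5 counts as tropical.
    """
    return "temperate" if abs(latitude_deg) > TROPIC_LIMIT_DEG else "tropical"


# Hypothetical site latitudes, ordered from high latitude to near the Equator.
zones = [latitude_zone(lat) for lat in (64.8, 45.0, 19.5, 0.2)]
```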

Models initially included all possible interactions among factors. We assessed interaction significance using sequential likelihood ratio tests comparing models with and without the interaction using a χ² distribution (*46*). Nonsignificant interactions (α = 0.05) were dropped from models, as model simplification improved mixed model convergence. Estimates of means and confidence intervals for different factor levels within categorical variables (e.g., categorical latitude and seed type) were extracted from reduced models using the R lsmeans package, while trend lines, confidence intervals, and partial residuals for continuous factors were extracted using the R visreg package. GLMM results are shown in Table 1.
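The likelihood ratio test used to decide whether an interaction is retained can be sketched as follows. This is an illustrative Python sketch, not the authors' R code, and the log-likelihood values are hypothetical. It covers only the case where the interaction adds a single parameter (1 degree of freedom), for which the χ² tail probability has the closed form erfc(√(x/2)); interactions involving factors with more than two levels add more parameters and need a general χ² survival function.

```python
import math

ALPHA = 0.05  # significance threshold stated in the text


def lrt_one_df(loglik_full, loglik_reduced):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, p value). The statistic is 2 * (logLik_full -
    logLik_reduced), compared against a chi-square distribution with 1 df.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for models with and without one interaction term.
stat, p_value = lrt_one_df(loglik_full=-100.0, loglik_reduced=-102.0)
keep_interaction = p_value < ALPHA
```

With these hypothetical values the interaction would be retained; a nonsignificant result would lead to dropping the term and refitting the simpler model before testing the next interaction.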

*Effect of biome (categorical mechanism)*. To test whether latitudinal and elevational patterns were explained by differences in predation among biomes, we classified each site relative to local treelines: above upper treeline (alpine, tundra, and páramo), below lower treeline (grassland and desert), or between treelines (forest; table S1). In model 4, we first tested whether total seed predation (model 4t) and invertebrate seed predation (model 4i) differed among biomes, including latitude, elevation, and seed type as additional predictors [full model: seed predation ~ biome × latitude × elevation × seed type + (1|siteID) + (1|date)]. Model reduction followed the procedure above, and biome estimates were extracted from the reduced model in the same way. While we included latitude and elevation to account for their effects, we did not use these models to test for latitudinal and elevational effects within biomes, as we did not have even latitudinal coverage above treeline (only two tropical sites) or below treeline (only mid-latitudes; Fig. 2A). Instead, we ran separate models testing for latitudinal and elevational patterns in total (model 5t) and invertebrate (model 5i) seed predation in forested sites, for which we had good geographic coverage. GLMM results are shown in Table 1.

*Continuous mechanisms (climate, productivity, and biodiversity)*. We tested for correlations among mechanistic variables (climate variables, productivity, and species richness), latitude, and elevation. For both the entire dataset of 79 sites and for the 60 sites at which the vertebrate exclusion treatment was added, latitude was significantly correlated with most variables, which were also generally correlated with each other (fig. S6). We used structural equation modeling to test the mechanistic relationships among correlated predictor variables (*47*).

Additional manipulations made data suitable for structural equation modeling. First, to deal with repeated measures of individual sites, we averaged the data a second time to get one data point per seed type per caging treatment per site. We arcsin-transformed data to make errors normally distributed. This yielded *n* = 79 data points for total predation on sunflower seeds, *n* = 79 for total predation on oats, and *n* = 60 for invertebrate predation on sunflower seeds. We analyzed these three seed × predator types independently to allow for varying biogeographic patterns in consumption by different predator guilds (*8*, *9*). To improve model fits, we standardized the response and predictors in each dataset to mean = 0 and SD = 1.
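The transformation and standardization steps can be sketched in Python as below. The text says only that data were arcsin-transformed; the sketch assumes the common arcsine square-root form, asin(√p), and assumes the sample standard deviation (n − 1 denominator) for standardizing. The proportions are hypothetical.

```python
import math
import statistics


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1] (assumed form)."""
    return math.asin(math.sqrt(p))


def standardize(values):
    """Rescale values to mean 0 and SD 1 (sample SD, an assumption)."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical site-level proportions of seeds eaten (one value per site).
proportions = [0.05, 0.20, 0.50, 0.35, 0.80]
transformed = [arcsine_sqrt(p) for p in proportions]
standardized = standardize(transformed)
```

Standardizing puts the response and all predictors on a common scale, so path coefficients in the fitted models can be compared directly.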

We first constructed a conceptual model, which was too complex to test with the collected data but clarified our understanding about how predictors affect each other and seed predation (fig. S5). From this conceptual model and results of an earlier analysis of climate and productivity on oat predation (*23*), we generated 15 simpler SEMs, representing biologically motivated simplifications of our conceptual model. These SEMs are illustrated in fig. S7, and the hypotheses that they represent are described fully in the Supplementary Materials. Briefly, latitude and elevation are always exogenous variables, whose values do not rely on values of other modeled variables. Climate variables were modeled either as a latent variable, “Climate,” or independently and in various combinations. Although productivity (AET and NPP) is positively correlated with species richness (fig. S6), global analyses suggest that high productivity does not cause high richness [e.g., (*48*)]; thus, we modeled them as affected by climate but independent of each other. Higher seed predator populations could arise from more productive ecosystems (more seeds or other food available) or more diverse predator assemblages (“species packing”); hence, we modeled direct effects of productivity and species richness on seed predation. All models include a direct effect of elevation on seed predation, as grid cells for climate, productivity, and richness data were large enough to encompass multiple elevational sites along steep gradients.

SEM1 to SEM8 tested various pathways of direct and indirect effects of multiple variables on seed predation intensity. SEM9 to SEM15 compared the simplest possible model for each variable thought to have direct effects on seed predation intensity (SEM9 to SEM12) or that significantly affected predation in Orrock *et al*. (*23*) (SEM13 and SEM14) to a model with latitude and no mechanistic predictors (SEM15). SEM9 to SEM15 also included elevation, with both variables modeled as exogenous (structure shown in fig. S7, SEM9). SEMs were run using the R package lavaan. We assessed model goodness of fit using the Tucker-Lewis Index and root mean square error of approximation. Model selection was done using the Akaike information criterion (AIC).
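For readers unfamiliar with the two fit indices, the Python sketch below computes them from model and baseline χ² statistics using their standard definitions. It is an illustration only; the authors obtained these values from lavaan. The χ² values and degrees of freedom are hypothetical, and the RMSEA here uses N − 1 in the denominator, whereas some software uses N, so small differences from package output are possible.

```python
import math


def tucker_lewis_index(chi2_model, df_model, chi2_base, df_base):
    """TLI: improvement in chi-square/df of the model over the baseline model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def rmsea(chi2_model, df_model, n_obs):
    """RMSEA with the N - 1 convention (an assumption; some software uses N)."""
    excess = max(chi2_model - df_model, 0.0)
    return math.sqrt(excess / (df_model * (n_obs - 1)))


# Hypothetical fit statistics for one candidate SEM fitted to 79 sites.
tli_value = tucker_lewis_index(chi2_model=4.0, df_model=2, chi2_base=100.0, df_base=5)
rmsea_value = rmsea(chi2_model=4.0, df_model=2, n_obs=79)
```

Higher TLI and lower RMSEA indicate better fit; a model whose χ² does not exceed its degrees of freedom has an RMSEA of 0.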

To assess whether results were affected by the additional data manipulations required for SEMs, we also compared binomial generalized linear models using our main dataset (i.e., one data point per date per site per seed type and caging treatment). We ran one model for each explanatory variable (mean annual temperature, AET, NPP, species richness, temperature annual range, annual precipitation, precipitation seasonality, and latitude) for total seed predation (models included seed type and elevation) and invertebrate predation (models included elevation). These models are equivalent to the simplest SEMs (SEM9 to SEM15). Models were compared using AIC corrected for small sample sizes (AICc) and yielded the same model ranking as structural equation modeling; thus, we present SEMs only.
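The AICc comparison can be sketched in Python as follows, using AIC = 2k − 2 logLik and AICc = AIC + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the sample size. The log-likelihoods, parameter counts, and sample size below are hypothetical and do not reproduce the authors' results.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2.0 * k - 2.0 * loglik


def aicc(loglik, k, n):
    """AIC with the small-sample correction term."""
    return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(fits, n):
    """Rank models by AICc; returns [(name, delta AICc)] from best to worst.

    fits maps a model name to (log-likelihood, number of parameters).
    """
    scores = {name: aicc(loglik, k, n) for name, (loglik, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, score - best) for name, score in scores.items()),
        key=lambda item: item[1],
    )


# Hypothetical single-predictor models, each with the same number of parameters.
fits = {
    "mean annual temperature": (-120.0, 4),
    "NPP": (-121.0, 4),
    "latitude": (-123.5, 4),
}
ranking = rank_models(fits, n=79)
```

When all candidate models have the same number of parameters, as in this sketch, the correction term is identical across models and the ranking depends on the log-likelihoods alone.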
