Monophyletic avian species complexes, and species, endemic or with their distributions in largest part within the Amazon region, known to be strongly tied to upland terra firme humid forest, were included in this study on the basis of (i) availability of gene sequence data for at least one mitochondrial marker and at least one nuclear marker, and (ii) availability of sufficient data from across the geographic distribution of the complex/species in the AB [i.e., genetic and occurrence data from all main Amazonian geological provinces and major interfluvia (8, 33, 34) occupied by the group].

For each species or species complex analyzed, a concatenated phylogeny was used initially as a guide tree to further test lineage boundaries based on multilocus coalescent methods. Essentially, highly supported, reciprocally monophyletic groups (i.e., subspecies and populations within species, or species within a species complex, that tend to replace each other geographically) were treated as hypothesized species (in the case of species complexes) or independent evolutionary units (for species datasets), whose limits were then tested with multilocus coalescent methods (see below). The gene sequence data were drawn from both published and unpublished studies developed by the Aleixo laboratory group (see table S1 for a complete list of references, GenBank accession numbers, species/species complexes, lineages, and sample sizes).

Occurrence records were drawn from data associated with specimens in the ornithological collection of the Museu Paraense Emílio Goeldi (MPEG), as well as from scientific collections providing access to data via Sample sizes ranged from 71 to 346 unique occurrence localities per species or complex of species (full dataset available at

All occurrences were checked carefully for consistency and correct placement with respect to interfluvia and river positions. Climate data for the present day (1950–2000) were drawn from the WorldClim climate archive (35). In particular, to represent overall tendencies and variation in temperature and precipitation, we explored the “bioclimatic” coverages in that data archive. To avoid calibrating niche models in an overly high-dimensional environmental space, we plotted 1000 random points across the region of analysis (see below) and measured pairwise Pearson product-moment correlation coefficients for all variables. We eliminated one variable of each pair that showed a high correlation (r ≥ 0.80) (36), except for the maximum temperature of the warmest month and the minimum temperature of the coldest month; we retained both because previous analyses had indicated that these two variables, albeit correlated, are highly informative for AB bird distributions (37). Hence, we retained the following for analysis: annual mean temperature, mean diurnal temperature range, isothermality, temperature seasonality, maximum temperature of the warmest month, minimum temperature of the coldest month, annual precipitation, precipitation of the driest month, and precipitation seasonality. All analyses were developed at a spatial resolution of 2.5′ (~5 km at the Equator), reflecting the approximate spatial accuracy of the georeferencing.
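The variable-screening step can be illustrated with a minimal sketch. It is not the authors' code: the function names (`pearson`, `filter_correlated`), the use of the absolute value of r, the rule that the later variable of a correlated pair is the one dropped, and the synthetic values standing in for the 1000 random points are all assumptions made for illustration. Only the 0.80 threshold and the exemption for the two temperature extremes come from the text.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_correlated(samples, threshold=0.80, keep_together=()):
    """Return variable names to retain.

    samples: dict mapping variable name -> values at the random points.
    A variable is dropped when |r| >= threshold with an already retained
    variable (assumption: order decides which member of a pair is dropped),
    unless the pair is listed in keep_together.
    """
    exempt = {frozenset(pair) for pair in keep_together}
    kept = []
    for name, values in samples.items():
        redundant = any(
            frozenset((name, other)) not in exempt
            and abs(pearson(values, samples[other])) >= threshold
            for other in kept
        )
        if not redundant:
            kept.append(name)
    return kept


# Synthetic stand-in for climate values sampled at 1000 random points.
random.seed(0)
n_points = 1000
tmax = [random.gauss(0, 1) for _ in range(n_points)]
samples = {
    "max_temp_warmest_month": tmax,
    "min_temp_coldest_month": [t + random.gauss(0, 0.1) for t in tmax],
    "mean_temp_warmest_quarter": [t + random.gauss(0, 0.1) for t in tmax],
    "annual_precip": [random.gauss(0, 1) for _ in range(n_points)],
}
kept = filter_correlated(
    samples,
    threshold=0.80,
    keep_together=[("max_temp_warmest_month", "min_temp_coldest_month")],
)
print(kept)
```

In this synthetic case the two exempted temperature extremes are both retained despite being strongly correlated, whereas the third temperature variable is dropped as redundant.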

To summarize glacial climates, we used Last Glacial Maximum (LGM; about 20 to 21 thousand years ago) climate datasets developed by R. J. Hijmans (35) to parallel the WorldClim current climate data, both at the same spatial resolution. Climate model projections are available only for the LGM and not for any of the previous glacial maxima; in climatic terms, however, if not in their accumulating geological effects, the different glacial maxima presented relatively similar conditions (30). LGM data were drawn from the outputs of general circulation model (GCM) simulations from the Community Climate System Model (CCSM) (38) and the Model for Interdisciplinary Research On Climate (MIROC) (39), downloaded from the website of the Paleoclimate Modelling Intercomparison Project 2 ( The GCM data had a native spatial resolution of 2.8° (approximately 300 km at the Equator) and were downscaled via the following procedure. First, the difference between the GCM output for historical (LGM) and recent conditions was calculated. These differences were then interpolated to a 2.5′-resolution grid using the spline function in ArcInfo (ESRI, Redlands, CA) with the tension option. The interpolated differences were then added to the high-resolution current climate datasets from WorldClim. Last, established routines ( were used for generating the so-called bioclimatic datasets from raw monthly temperature and precipitation data. This procedure has the dual advantage of producing paleoclimatic datasets at resolutions relevant to the spatial scale of analysis and of calibrating the simulated climate change data to the actual observed (present-day) climate data.
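The three downscaling steps (difference, interpolation, addition) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as run in ArcInfo: bilinear interpolation is swapped in for the tension spline, grids are plain nested lists, and the function names (`bilinear_resample`, `delta_downscale`) and the toy temperature values are hypothetical.

```python
def bilinear_resample(coarse, out_rows, out_cols):
    """Resample a 2-D grid (at least 2 x 2) to out_rows x out_cols.

    Corner cells of the coarse and fine grids are aligned; values in
    between are bilinearly interpolated (a stand-in for the tension
    spline named in the text).
    """
    rows, cols = len(coarse), len(coarse[0])
    fine = []
    for i in range(out_rows):
        y = i * (rows - 1) / (out_rows - 1)  # fractional coarse row
        y0 = min(int(y), rows - 2)
        ty = y - y0
        fine_row = []
        for j in range(out_cols):
            x = j * (cols - 1) / (out_cols - 1)  # fractional coarse column
            x0 = min(int(x), cols - 2)
            tx = x - x0
            top = coarse[y0][x0] * (1 - tx) + coarse[y0][x0 + 1] * tx
            bottom = coarse[y0 + 1][x0] * (1 - tx) + coarse[y0 + 1][x0 + 1] * tx
            fine_row.append(top * (1 - ty) + bottom * ty)
        fine.append(fine_row)
    return fine


def delta_downscale(gcm_past, gcm_recent, observed_fine):
    """Add the interpolated GCM anomaly (past minus recent) to observed data."""
    # Step 1: coarse-resolution difference between past and recent GCM output.
    anomaly = [
        [p - r for p, r in zip(past_row, recent_row)]
        for past_row, recent_row in zip(gcm_past, gcm_recent)
    ]
    # Step 2: interpolate the anomaly to the resolution of the observed grid.
    fine_anomaly = bilinear_resample(
        anomaly, len(observed_fine), len(observed_fine[0])
    )
    # Step 3: add the interpolated anomaly to the high-resolution observations.
    return [
        [obs + delta for obs, delta in zip(obs_row, delta_row)]
        for obs_row, delta_row in zip(observed_fine, fine_anomaly)
    ]


# Toy example: a 2 x 2 GCM grid and a 3 x 3 observed grid (degrees Celsius).
gcm_lgm = [[20.0, 22.0], [21.0, 23.0]]
gcm_recent = [[25.0, 26.0], [25.0, 26.0]]
observed = [[26.0, 26.0, 26.0] for _ in range(3)]
lgm_fine = delta_downscale(gcm_lgm, gcm_recent, observed)
for row in lgm_fine:
    print(row)
```

Because only the anomaly is interpolated, the fine-scale spatial pattern of the observed present-day data is preserved in the downscaled surface, which is the calibration advantage described in the text.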

We required an approximation of the area that has been accessible to the species over relevant periods of time, which is the appropriate area for model calibration (40). For lineage-wide analyses (i.e., covering most or all of the AB), we buffered the limits of the Amazon (41) by 400 km, in light of the broad extent of interdigitation between Amazon forest and Cerrado and Llanos vegetation types to the south and north, respectively. We delimited this area further to remove parts of the Andes presenting elevations ≥2500 m. All analyses were thus developed within this area but further delimited by the interfluvia (Fig. 1), in which each of the species/species complexes is known to occur.

Because broad spatial autocorrelation in environmental variables can lead to pseudoreplication of environmental signals and exacerbate overfitting in niche models (42), we assessed spatial lags (i.e., a radius of nonindependence of points in environmental terms) across the AB. We calculated autocorrelation lags for each climate dimension separately, based on 12 bins of distances. Over the nine variables, lag distances ranged from 0.8° to 4.2°, but those for annual mean temperature and annual precipitation were both below 1° (115 km at the Equator). We therefore subsampled the occurrence data such that no point was closer than 1° to any other, and developed five replicate subsamplings for each species.
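A minimal sketch of this kind of distance-constrained subsampling is given below. The source does not state which algorithm or software was used, so the randomized greedy approach, the function names (`thin_occurrences`, `replicate_thinnings`), the use of planar distance in degree space, and the synthetic coordinates are all assumptions; only the 1° minimum spacing and the five replicates come from the text.

```python
import math
import random


def thin_occurrences(points, min_dist_deg=1.0, seed=0):
    """Randomized greedy thinning of (lon, lat) points.

    Points are visited in random order and retained only if they lie at
    least min_dist_deg from every point retained so far. Distance is
    planar in degree space (assumption; reasonable near the Equator).
    """
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    kept = []
    for lon, lat in shuffled:
        if all(
            math.hypot(lon - klon, lat - klat) >= min_dist_deg
            for klon, klat in kept
        ):
            kept.append((lon, lat))
    return kept


def replicate_thinnings(points, n_replicates=5, min_dist_deg=1.0):
    """Produce several thinned subsamples, one per random seed."""
    return [
        thin_occurrences(points, min_dist_deg, seed=s)
        for s in range(n_replicates)
    ]


# Synthetic occurrence localities in a hypothetical 20 x 10 degree box.
random.seed(42)
occurrences = [
    (random.uniform(-70.0, -50.0), random.uniform(-10.0, 0.0))
    for _ in range(200)
]
replicates = replicate_thinnings(occurrences, n_replicates=5, min_dist_deg=1.0)
print([len(r) for r in replicates])
```

Different random orders retain different subsets of localities, so running models on several replicates gives some indication of how sensitive the results are to the particular subsample.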
