Operational taxonomic units based on rpoB sequences (OTUrpoB) were defined at a 0.01 nucleotide dissimilarity cutoff using patristic distances as implemented in RAMI (75). This dissimilarity cutoff roughly delineates the genetic divergence between characterized Streptomyces species (28). For the Wisconsin and North Carolina sites, sequences were aggregated across two or three soil samples, respectively, to ensure that >30 strains were available to represent each region. The decision to aggregate was justified by similarities in climate, geography, and soil characteristics across the aggregated samples. In addition, previous analyses had indicated that Streptomyces species are broadly distributed at regional scales (>1,000 km) (23, 24). This aggregation yielded 12 sites with 77 ± 30 (mean ± SD) strains characterized per site. Beta diversity was evaluated through hierarchical clustering implemented within UniFrac (76). Mantel correlations between matrices of geographic distance and either UniFrac or Bray-Curtis distances were computed with the R package ecodist (77) using the Pearson correlation method and 1,000 permutations. Patterns of OTUrpoB sharing were visualized using Cytoscape 2.8 and the yFiles organic layout (78).
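For readers unfamiliar with the Mantel procedure, the following is a minimal sketch of a Pearson Mantel test with row and column permutation. It is written in Python with NumPy for illustration only and is not the ecodist implementation used in the study; the function name `mantel`, the one-tailed p-value, and the fixed random seed are assumptions made for this example.

```python
import numpy as np

def mantel(x, y, permutations=1000, seed=0):
    """Pearson Mantel correlation between two square, symmetric distance matrices.

    Returns the observed correlation and a one-tailed permutation p-value
    (an assumption of this sketch; other tail conventions exist).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Use only the upper triangle, since each pairwise distance appears twice.
    iu = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(x[iu], y[iu])[0, 1]

    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(permutations):
        # Permute rows and columns of one matrix together (i.e., relabel sites).
        p = rng.permutation(n)
        r_perm = np.corrcoef(x[p][:, p][iu], y[iu])[0, 1]
        if r_perm >= r_obs:
            count += 1
    # Count the observed value as one member of the null distribution.
    p_value = (count + 1) / (permutations + 1)
    return r_obs, p_value
```

In this sketch, `x` would hold pairwise geographic distances among sites and `y` the corresponding UniFrac or Bray-Curtis distances, with sites in the same order in both matrices.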
Values for the NRI, nearest taxon index (NTI), and Faith's PD were calculated using Phylocom v.4.2 (79). For both the NRI and NTI, positive values indicate phylogenetic clustering (i.e., closely related taxa co-occur more often than expected by chance), negative values indicate overdispersion or phylogenetic evenness, and values close to zero suggest a phylogenetically random assembly of species. Significance was determined by permutation (n = 999) in comparison to a null model in which taxa are assigned to each site by random draw without replacement from the list of all taxa. The MRD was calculated as the average number of nodes separating the species in a site from the root of their phylogenetic tree (80).
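The logic of the two indices can be sketched as follows, assuming the standard definitions in which the NRI is the negated standardized effect size of the mean pairwise phylogenetic distance (MPD) and the NTI is the negated standardized effect size of the mean nearest taxon distance (MNTD). This Python example is illustrative and is not the Phylocom implementation; the function names, the use of a precomputed patristic distance matrix, and the sample standard deviation (`ddof=1`) are assumptions of this sketch.

```python
import numpy as np

def mpd(dist, idx):
    """Mean pairwise distance among the taxa indexed by idx."""
    sub = dist[np.ix_(idx, idx)]
    iu = np.triu_indices(len(idx), k=1)
    return sub[iu].mean()

def mntd(dist, idx):
    """Mean distance from each taxon in idx to its nearest relative in idx."""
    sub = dist[np.ix_(idx, idx)].astype(float)
    np.fill_diagonal(sub, np.inf)  # ignore each taxon's distance to itself
    return sub.min(axis=1).mean()

def standardized_index(dist, idx, metric, permutations=999, seed=0):
    """Negated standardized effect size of `metric` (NRI for mpd, NTI for mntd).

    Null model: the same number of taxa is drawn at random, without
    replacement, from the full list of taxa in `dist`.
    """
    rng = np.random.default_rng(seed)
    n_pool = dist.shape[0]
    obs = metric(dist, idx)
    null = np.array([
        metric(dist, rng.choice(n_pool, size=len(idx), replace=False))
        for _ in range(permutations)
    ])
    # Positive values: taxa at the site are more closely related than expected.
    return -(obs - null.mean()) / null.std(ddof=1)
```

Here `dist` is a hypothetical square matrix of patristic distances among all taxa in the pool, and `idx` lists the positions of the taxa observed at one site; calling `standardized_index(dist, idx, mpd)` gives an NRI-like value and `standardized_index(dist, idx, mntd)` an NTI-like value.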
Haplotype networks were created using a statistical parsimony procedure (81, 82) as implemented in TCS v1.18 (83). Closed loops representing network ambiguities were resolved using the nesting rules proposed by Templeton et al. (82). The final nested clade information was used as input for the program GeoDis v2.2 (84). GeoDis analyzes the nested haplotype network to infer the processes that could have produced the association between haplotype distribution and geography. Both TCS v1.18 and GeoDis v2.2 were run within the ANeCA platform (85).