We processed our cDNA sequences in mothur v.1.34.4 (87). Before downstream processing, we trimmed barcodes and primers and simultaneously removed sequences that met any of the following criteria: an average quality score below 25 over a 50-nt window, more than 1 mismatch to the barcode, more than 3 mismatches to the primers, homopolymers longer than 6 nt, any ambiguous nucleotide, or an amplicon length outside the range of 359 to 373 bp. The remaining sequences were aligned in mothur against a reference alignment of 445 fungal class II peroxidase sequences, which we had previously obtained from GenBank, FunGene (88), and MycoCosm (89) and aligned with MAFFT (90). After aligning the cDNA sequences, we removed those that began or ended at inappropriate positions within the alignment. Finally, we removed chimeras identified with UCHIME (91) and preclustered the cDNA sequences to denoise them.
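The removal criteria listed above can be illustrated with a minimal Python sketch. This is not mothur's own implementation, and the exact way mothur applies the quality window may differ. The function names, the use of already trimmed reads, and the idea of passing barcode and primer mismatch counts as arguments are all assumptions made for illustration; only the thresholds come from the text.

```python
import re

# Thresholds taken from the text; all names below are hypothetical.
MIN_WINDOW_QUALITY = 25
WINDOW_SIZE = 50
MAX_BARCODE_MISMATCHES = 1
MAX_PRIMER_MISMATCHES = 3
MAX_HOMOPOLYMER = 6
MIN_LENGTH, MAX_LENGTH = 359, 373


def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def min_window_average(quals, window=WINDOW_SIZE):
    """Return the lowest mean quality over any sliding window.

    Reads shorter than the window are averaged over their full length.
    """
    if not quals:
        return 0.0
    if len(quals) <= window:
        return sum(quals) / len(quals)
    running = sum(quals[:window])
    lowest = running
    for i in range(window, len(quals)):
        running += quals[i] - quals[i - window]
        lowest = min(lowest, running)
    return lowest / window


def passes_filters(seq, quals, barcode_mismatches, primer_mismatches):
    """Return True if a trimmed read meets none of the removal criteria.

    Assumes seq and quals describe the read after barcode and primer
    trimming, and that mismatch counts were computed upstream.
    """
    seq = seq.upper()
    if min_window_average(quals) < MIN_WINDOW_QUALITY:
        return False
    if barcode_mismatches > MAX_BARCODE_MISMATCHES:
        return False
    if primer_mismatches > MAX_PRIMER_MISMATCHES:
        return False
    if longest_homopolymer(seq) > MAX_HOMOPOLYMER:
        return False
    if set(seq) - set("ACGT"):  # any ambiguous nucleotide
        return False
    return MIN_LENGTH <= len(seq) <= MAX_LENGTH
```

A read is kept only when every check passes, which mirrors the "any of the following criteria" wording: failing a single criterion is enough for removal.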
To prepare denoised fungal peroxidase cDNA sequences for phylogenetic composition analyses in pplacer (92), we removed three ambient N deposition plots (two from site D and one from site A), each of which yielded fewer than 100 sequences, and subsampled the remaining plots to obtain equal numbers of sequences from each plot. Because removing these plots left unequal numbers of ambient and experimental N deposition plots at sites A and D, we excluded the same number of experimental N deposition plots at those sites before performing statistical analyses of phylogenetic fungal peroxidase cDNA composition. We tested all possible plot combinations, i.e., we jackknifed the procedure (93), to ensure that our selection of plots for exclusion did not influence the outcome of the analysis.
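The plot-balancing and subsampling steps can be sketched as follows. This is an illustrative sketch, not the authors' code: the plot identifiers, the number of experimental plots per site, and the function names are hypothetical. The sketch enumerates every way of dropping the required number of experimental plots at each site, so that an analysis can be repeated once per combination, and subsamples each plot to a common depth.

```python
import random
from itertools import combinations, product


def exclusion_sets(experimental_plots, n_to_drop):
    """Yield every combination of experimental plots that could be excluded.

    experimental_plots: dict mapping site -> list of plot identifiers
    n_to_drop: dict mapping site -> number of plots to exclude at that site
    """
    per_site = [
        list(combinations(experimental_plots[site], n_to_drop[site]))
        for site in sorted(n_to_drop)
    ]
    for choice in product(*per_site):
        yield frozenset(plot for group in choice for plot in group)


def subsample(seqs_by_plot, depth, seed=0):
    """Randomly draw `depth` sequences, without replacement, from each plot."""
    rng = random.Random(seed)
    return {plot: rng.sample(seqs, depth) for plot, seqs in seqs_by_plot.items()}


# Hypothetical layout: three experimental plots at each affected site.
experimental = {"A": ["A4", "A5", "A6"], "D": ["D4", "D5", "D6"]}
to_drop = {"A": 1, "D": 2}  # matches the ambient plots lost at each site

for excluded in exclusion_sets(experimental, to_drop):
    # Re-run the composition analysis here with `excluded` plots left out.
    pass
```

Under the hypothetical three-plot layout, this produces nine exclusion sets (three choices at site A times three at site D), each of which would be analyzed separately to check that the result does not depend on which plots were dropped.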
We clustered OTUs at the 92% sequence similarity level, a cutoff similar to that employed in another recent study of expressed fungal peroxidases in soil (47). We deemed this cutoff appropriate for the downstream analysis of our data because it minimized both artificial inflation and deflation of peroxidase OTU diversity in our mock community (data not shown). Singleton OTUs were removed from the data set prior to all downstream OTU analyses. The most abundant sequence from each OTU was selected as its representative sequence. For statistical comparisons of OTU composition, we used data from all plots and sites but converted OTU abundances to proportions to normalize samples prior to statistical analysis. For statistical comparisons of OTU richness, we excluded the three plots that yielded fewer than 100 sequences after sequence processing and normalized sequence numbers across plots prior to calculating richness.
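The OTU table handling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the nested-dictionary table layout and the function names are hypothetical, a singleton is assumed to mean an OTU with a single sequence across the whole data set, and random subsampling to a fixed depth is assumed as the way sequence numbers were normalized for richness.

```python
import random


def drop_singletons(otu_counts):
    """Remove OTUs represented by one sequence across all plots (assumed definition)."""
    totals = {}
    for counts in otu_counts.values():
        for otu, n in counts.items():
            totals[otu] = totals.get(otu, 0) + n
    return {
        plot: {otu: n for otu, n in counts.items() if totals[otu] > 1}
        for plot, counts in otu_counts.items()
    }


def to_proportions(otu_counts):
    """Convert per-plot OTU counts to proportions of that plot's total."""
    result = {}
    for plot, counts in otu_counts.items():
        total = sum(counts.values())
        result[plot] = {otu: n / total for otu, n in counts.items()} if total else {}
    return result


def rarefied_richness(counts, depth, seed=0):
    """Count distinct OTUs in a random subsample of `depth` sequences."""
    rng = random.Random(seed)
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    return len(set(rng.sample(pool, depth)))


# Hypothetical table: plot -> {OTU -> sequence count}
table = {"plot1": {"otu1": 3, "otu2": 1}, "plot2": {"otu1": 1, "otu3": 1}}
filtered = drop_singletons(table)       # otu2 and otu3 are removed
proportions = to_proportions(filtered)  # each plot now sums to 1
```

Proportions allow all plots to be compared for composition regardless of sequencing depth, whereas richness is sensitive to depth, which is why a common subsampling depth is used for that comparison.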