To analyse the global structure of genetic variation, a principal component analysis (PCA) was performed on the 56,181 filtered SNPs using the dudi.pca function from the “ade4” package [34] in R version 4.0.2 [35]. Isolates were plotted according to their coordinates on the first two PCA axes using the “factoextra” package in R [36]. To detect possible finer genetic structure among isolates, three additional PCA plots were produced to represent the distribution of each isolate in three other two-dimensional spaces involving the next two PCA axes (i.e. Axes 2 and 3, Axes 2 and 4, and Axes 3 and 4).

Based on the 56,181 initial filtered SNPs, individual observed heterozygosity (Hobs) was first calculated for each isolate. The difference in mean Hobs between species was assessed using a Mann-Whitney test in R. According to the results of the clustering analysis (see Results), the six nominal S. bovis isolates were assigned to a S. bovis cluster, and the 11 nominal S. haematobium isolates together with the hybrid isolate from Corsica were assigned to a S. haematobium cluster. This assignment of isolates to the two clusters was retained, and the F1 hybrid generated in the laboratory was excluded, in all subsequent analyses. Second, the average pairwise nucleotide difference among isolates (Π) was computed within each cluster, and the pairwise nucleotide differences computed at each site (N = 56,181) were compared between the S. bovis and S. haematobium clusters using a non-parametric paired Wilcoxon test in R. Third, Tajima’s D was estimated for each species. To statistically compare Tajima’s D between species, a series of Tajima’s D values was computed along non-overlapping 50-kb windows over all genomic scaffolds within each species. Windows that contained fewer than eight SNPs were discarded, and only the 50-kb windows in which Tajima’s D could be estimated for both S. bovis and S. haematobium (i.e. with at least one segregating site within each species) were kept (N = 4035). The two series of Tajima’s D values were then compared using a paired Wilcoxon test. Finally, overall absolute genomic divergence (Dxy) between S. bovis and S. haematobium was estimated. All nucleotide diversity and divergence estimates were computed using the ‘PopGenome’ R package [37].
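As an illustration of two of the statistics named above, the following minimal sketch computes individual observed heterozygosity and Tajima’s D from the standard textbook formulas. It is written in Python for illustration only and is not the ‘PopGenome’ code used in the study. The function names, the 0/1/2 genotype coding (number of alternate alleles, None for a missing call) and the per-site allele-count input are assumptions made for this example.

```python
from math import sqrt

def observed_heterozygosity(genotypes):
    """Fraction of non-missing SNP calls that are heterozygous (coded 1)."""
    called = [g for g in genotypes if g is not None]
    return sum(1 for g in called if g == 1) / len(called)

def tajimas_d(alt_counts, n):
    """Tajima's D from per-site alternate-allele counts among n sampled chromosomes.

    Returns None when the statistic is undefined (no segregating site, or n < 4,
    for which the variance term is zero).
    """
    segregating = [c for c in alt_counts if 0 < c < n]
    s = len(segregating)
    if s == 0 or n < 4:
        return None
    # Mean number of pairwise differences, summed over segregating sites.
    pi = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in segregating)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

With six diploid isolates (n = 12 chromosomes), a window made only of singleton variants gives a negative D, and a window made only of intermediate-frequency variants gives a positive D, which is the contrast the between-species comparison relies on.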

To investigate any potential spatial genetic structure among isolates within each species, we computed matrices of pairwise Nei’s genetic distances [38] between isolates within each species (i.e. S. haematobium = all 11 nominal S. haematobium isolates + the Corsican hybrid isolate; and S. bovis = the six nominal S. bovis isolates), based on the initial filtered 56,181 SNPs and using the ‘dartR’ package [39]. The correlation between each matrix of pairwise Nei’s distances and the matrix of pairwise geographical distances between the origins of the isolates was then tested within each species using two independent Mantel tests as implemented in the ‘ape’ package [40] in R. The significance of each correlation was assessed by comparing the observed Z-score to 10,000 Z-values generated after bootstrapping. Because the precise sampling location was uncertain for some isolates, we used the centroid of each country of origin, obtained from QGIS v. 2.8.12 (QGIS Development Team).
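The logic of a Mantel test can be sketched as follows. This is a minimal Python illustration, not the ‘ape’ implementation: the Z statistic is taken as the sum of products of matching off-diagonal entries, and the null distribution is generated by permuting the isolate labels of one matrix, which is the usual Mantel procedure. Whether this matches the exact resampling scheme and the one-sided p-value used in the study is an assumption, and the function names and toy data are hypothetical.

```python
import random

def mantel_z(a, b):
    """Mantel Z statistic: sum of products over the lower triangles of two distance matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(i))

def mantel_test(a, b, n_perm=10000, seed=0):
    """Return (observed Z, one-sided p-value) from n_perm label permutations of matrix b."""
    rng = random.Random(seed)
    n = len(a)
    z_obs = mantel_z(a, b)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Permute rows and columns of b together, i.e. relabel the isolates.
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if mantel_z(a, permuted) >= z_obs:
            hits += 1
    # The observed arrangement is counted as one of the possible arrangements.
    return z_obs, (hits + 1) / (n_perm + 1)

# Toy example: five isolates along a line, genetic distance equal to geographic distance.
coords = [0, 1, 3, 6, 10]
geo = [[abs(p - q) for q in coords] for p in coords]
z_obs, p_value = mantel_test(geo, geo)
```

In this toy case the two matrices are identical, so very few permutations reach the observed Z and the p-value is small. With only six S. bovis isolates there are just 720 possible relabelings, which limits the smallest p-value such a test can return.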

Introgression events are expected to result in the transfer of genomic tracts from one species to another; these tracts can persist, and thus be detectable, within the introgressed lineage over time. Their persistence depends on the time since the introgression event, the genomic context (e.g. genomic locality) of the foreign DNA tracts in the introgressed genome, the recombination rate, and the selective advantage of the introgressed DNA tracts [1,3]. Using a genomic sliding-window approach, we therefore searched for genomic regions that display significantly decreased divergence between each isolate and all isolates from its sister species (i.e. one versus all). To detect possible genomic tracts that could have been introgressed from S. bovis into each of the S. haematobium isolates, we computed Dxy estimates between each S. haematobium isolate (including the hybrid Corsican lineage) and all the S. bovis isolates along 50-kb windows sliding in 10-kb steps over all genomic scaffolds. The same genomic windows were used to compute Dxy estimates between each S. bovis isolate and all the S. haematobium isolates (including the hybrid Corsican lineage) to detect potential traces of introgression from S. haematobium into each of the S. bovis isolates. The Dxy computations along the 50-kb sliding windows were performed using the R package ‘PopGenome’ [37]. All 50-kb windows that contained fewer than eight SNPs were discarded from subsequent analyses. A sliding window was considered to display significantly decreased divergence when its Dxy value fell among the lowest 0.1% of the values obtained over all windows (with > eight SNPs) and over all isolates (N = 239,076; i.e. 13,282 windows × 18 isolates). This 0.1% threshold corresponded to a Dxy value of 0.102.
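The one-versus-all window scan can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the ‘PopGenome’ computation: Dxy is averaged per SNP from alternate-allele frequencies (0, 0.5 or 1 for a single diploid isolate), windows are half-open intervals, and the threshold is a simple empirical quantile. All function names are hypothetical, and the normalisation used in the study may differ.

```python
def sliding_windows(scaffold_length, size=50_000, step=10_000):
    """Yield (start, end) half-open windows of `size` bp every `step` bp along a scaffold."""
    start = 0
    while start + size <= scaffold_length:
        yield start, start + size
        start += step

def dxy_one_vs_all(focal_freqs, sister_freqs):
    """Mean per-SNP probability that an allele drawn from the focal isolate
    differs from an allele drawn from the pooled sister-species isolates."""
    per_site = [p * (1 - q) + q * (1 - p) for p, q in zip(focal_freqs, sister_freqs)]
    return sum(per_site) / len(per_site)

def window_dxy(positions, focal_freqs, sister_freqs, window, min_snps=8):
    """Dxy for the SNPs falling inside one window, or None if it holds too few SNPs."""
    start, end = window
    idx = [i for i, pos in enumerate(positions) if start <= pos < end]
    if len(idx) < min_snps:
        return None
    return dxy_one_vs_all([focal_freqs[i] for i in idx],
                          [sister_freqs[i] for i in idx])

def lowest_fraction_threshold(values, fraction=0.001):
    """Largest value among the lowest `fraction` of all values (at least one value)."""
    ordered = sorted(values)
    k = max(1, int(len(ordered) * fraction))
    return ordered[k - 1]
```

In this sketch the threshold would be computed once from the pooled Dxy values of all retained windows and all isolates, and each window would then be compared against that single cut-off.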

Along each scaffold, unique genomic tracts were reconstructed from all overlapping sliding windows that displayed significantly decreased divergence according to the above 0.1% threshold, including the windows with fewer than eight SNPs. The absolute size, the total number of SNPs and individual Dxy values (i.e. Dxy computed between each isolate and all isolates from its sister species) were computed for each genomic tract. Finally, we searched for potential coding genes within the resulting genomic tracts based on the ShV2_May19_SwissProtAnno.gff3 annotation file [41], using IGV 2.3.92 [42].
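Reconstructing tracts from overlapping windows amounts to merging intervals. The following minimal Python sketch shows one way to do this for the flagged windows of a single scaffold, together with the size and SNP count of a tract. The function names and the half-open (start, end) convention are assumptions for illustration, not the procedure used in the study.

```python
def merge_windows(windows):
    """Merge overlapping (start, end) windows from one scaffold into non-overlapping tracts."""
    tracts = []
    for start, end in sorted(windows):
        if tracts and start < tracts[-1][1]:
            # The window overlaps the current tract: extend the tract if needed.
            tracts[-1][1] = max(tracts[-1][1], end)
        else:
            # No overlap with the previous tract: start a new one.
            tracts.append([start, end])
    return [tuple(t) for t in tracts]

def tract_summary(tract, snp_positions):
    """Absolute size in bp and number of SNPs located inside a tract."""
    start, end = tract
    n_snps = sum(1 for pos in snp_positions if start <= pos < end)
    return {"size_bp": end - start, "n_snps": n_snps}
```

For example, two flagged 50-kb windows starting 10 kb apart merge into a single 60-kb tract, while a flagged window further along the scaffold remains a separate tract.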
