Reads were mapped to each species’ genome with BWA-MEM version 0.7.12 [62] using the default parameters, including the option to discard any alignment that has more than 10,000 exact matches in the genome (-c 10000). For each read mapping to fewer than 10,000 locations, BWA-MEM reported the location with the highest mapping score or, when several locations shared the same score, a randomly selected one of them. Therefore, reads with an exact repeat in 9,999 or more other genomic locations would not have been used for downstream analyses, while reads with fewer exact repeats might have been misplaced in the genome. Reads with low mapping quality were filtered out using SAMtools view version 1.3 with the -q 1 option [63], which retains only alignments with a mapping quality of at least 1. Duplicates were removed with the Picard Tools MarkDuplicates program version 2.8.3 (https://broadinstitute.github.io/picard/). Mapping statistics were calculated using SAMtools flagstat version 1.3. To estimate the signal-to-noise ratio, we checked that the relative strand correlation (RSC) was above 0.8 for all libraries using phantompeakqualtools version 1.14 [64]. The mapping and RSC results are available in Additional file 3: Table S3.
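The steps described above could be chained roughly as in the following Python sketch, which only assembles the command lines and does not run them. It is an illustration under stated assumptions, not the authors' actual scripts: the file names, the `build_commands` and `passes_rsc` helpers, the coordinate-sorting step, the `REMOVE_DUPLICATES=true` setting, and the location of `picard.jar` and `run_spp.R` are all hypothetical. The RSC helper further assumes the tab-separated phantompeakqualtools result line in which RSC is the tenth column.

```python
import shlex


def build_commands(ref, fastqs, prefix, picard_jar="picard.jar"):
    """Return shell command strings for mapping, filtering, deduplication and QC.

    All paths are placeholders; nothing is executed here.
    """
    q = shlex.quote
    reads = " ".join(q(f) for f in fastqs)
    filtered = f"{prefix}.q1.sorted.bam"
    dedup = f"{prefix}.dedup.bam"
    return [
        # Map with BWA-MEM, keep alignments with MAPQ >= 1, then sort.
        # Sorting is an assumption: MarkDuplicates expects coordinate-sorted input.
        f"bwa mem -c 10000 {q(ref)} {reads}"
        f" | samtools view -b -q 1 -"
        f" | samtools sort -o {q(filtered)} -",
        # Remove duplicates (REMOVE_DUPLICATES=true is an assumed setting).
        f"java -jar {q(picard_jar)} MarkDuplicates"
        f" I={q(filtered)} O={q(dedup)}"
        f" M={q(prefix + '.dup_metrics.txt')} REMOVE_DUPLICATES=true",
        # Mapping statistics.
        f"samtools flagstat {q(dedup)} > {q(prefix + '.flagstat.txt')}",
        # Strand cross-correlation with phantompeakqualtools.
        f"Rscript run_spp.R -c={q(dedup)} -out={q(prefix + '.spp.txt')}",
    ]


def rsc_from_spp_line(line):
    """Extract RSC, assumed to be the 10th tab-separated column of the result line."""
    fields = line.rstrip("\n").split("\t")
    return float(fields[9])


def passes_rsc(line, threshold=0.8):
    """True when the library's RSC is above the threshold used in the text."""
    return rsc_from_spp_line(line) > threshold


if __name__ == "__main__":
    for cmd in build_commands("genome.fa", ["library.fastq"], "library"):
        print(cmd)
```

Building the commands as strings keeps the sketch inspectable before anything is run; in practice each string could be passed to a workflow manager or to `subprocess.run(cmd, shell=True, check=True)`.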