For DELs and DUPs, called SVs were divided into four and three fractions, respectively, depending on their size, and precision and recall were calculated for each SV type and for each size range. Precision was calculated by dividing the number of truly called sites by the total number of called sites, and recall was calculated by dividing the number of truly called sites by the total number of corresponding reference SVs. Calls were judged to be true positives (TPs) when the called DELs, DUPs, and INVs exhibited ≥ 80% reciprocal overlap (≥ 60% reciprocal overlap for SVs ≤ 1 kb) with the reference SVs for the simulated data or ≥ 50% reciprocal overlap for the real data, or when the BPs of the called INSs were placed within 200 bp of those of the reference INSs. We further determined the SV calls exhibiting Mendelian inheritance errors with the WGS datasets of the NA12878, NA12891, and NA12892 trio. When an SV call of the child NA12878 overlapped with no call in either of the parent SV call sets (≤ 200 bp distance for INSs and ≥ 50% overlap for the others), the corresponding site was regarded as a Mendelian inheritance error. Because these sites could be attributable to false negatives in the parents, we used parent WGS datasets with 1.7-fold the coverage of the child data to minimize false negatives in the parents. Called DELs or DUPs were divided into size ranges and searched against the total DEL or DUP reference set, not against the divided reference set for the corresponding size range, because the overlap-based search sometimes hits sites outside the size range. When size-ranged DEL/DUP calls matched the reference, the matched calls were used as true calls for calculating precision for the size range of the call; in contrast, for the calculation of recall, the matched calls were counted in the size range of the matched reference site. INSs and DUPs are sometimes complementary [84] and could be confused with each other by several types of algorithms.
Thus, to judge whether the called INSs are true, we also searched them against the reference DUPs when the called INSs had no matched INS references. When INS calls were matched with the DUP references, the number of hits was added to both the TP calls and the INS references to calculate precision and recall, respectively. Similarly, called DUPs were also searched against the reference INSs. The precision and recall values for many algorithms varied depending on the RSS threshold values. For several algorithms (e.g., CNVnator, readDepth), information on RSS values was lacking, and thus other information, such as read depth or scores, was converted to a provisional RSS value (see Additional file 4: Supplemental methods). To determine the best precision/recall points for each algorithm and for each SV category, we selected an RSS threshold at which the number of calls for an SV type approximates but does not exceed 90% of the corresponding simulated reference data or of the expected SV number in an individual (DEL: 3500, DUP: 550, INS: 3000, and INV: 100, estimated from previous studies).
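The matching and thresholding rules described above can be illustrated with a minimal Python sketch. It is not the code used in the study: all function names are hypothetical, intervals are assumed to be half-open (start, end) pairs on the same chromosome, the ≤ 1 kb rule is assumed to apply to the length of the reference SV, and calls are assumed to be retained when their RSS value is at or above the chosen threshold.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open (start, end) on one chromosome


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of overlap/len(a) and overlap/len(b)."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


def required_overlap(ref: Interval, simulated: bool) -> float:
    """Thresholds from the text: 0.8 (0.6 for SVs <= 1 kb) for simulated
    data and 0.5 for real data."""
    if not simulated:
        return 0.5
    # Assumption: the <= 1 kb rule is keyed on the reference SV length.
    return 0.6 if (ref[1] - ref[0]) <= 1000 else 0.8


def is_tp_interval(call: Interval, ref: Interval, simulated: bool) -> bool:
    """TP test for DELs, DUPs, and INVs."""
    return reciprocal_overlap(call, ref) >= required_overlap(ref, simulated)


def is_tp_insertion(call_bp: int, ref_bp: int, max_dist: int = 200) -> bool:
    """TP test for INSs: breakpoints within 200 bp of each other."""
    return abs(call_bp - ref_bp) <= max_dist


def precision_recall(n_true: int, n_calls: int, n_ref: int) -> Tuple[float, float]:
    """Precision = true calls / all calls; recall = true calls / reference SVs."""
    precision = n_true / n_calls if n_calls else 0.0
    recall = n_true / n_ref if n_ref else 0.0
    return precision, recall


def pick_rss_threshold(rss_values: List[int], expected: int,
                       fraction: float = 0.9) -> int:
    """Return the lowest RSS cutoff at which the number of retained calls
    (RSS >= cutoff) does not exceed fraction * expected."""
    cap = fraction * expected
    candidates = sorted(set(rss_values))
    for cutoff in candidates:
        if sum(1 for r in rss_values if r >= cutoff) <= cap:
            return cutoff
    # No observed cutoff keeps few enough calls: go one above the maximum.
    return candidates[-1] + 1 if candidates else 0
```

For example, `pick_rss_threshold(rss_values, expected=3500)` would select a DEL cutoff for real data under these assumptions. Scanning cutoffs from lowest to highest returns the cutoff that keeps the most calls while staying at or below the 90% cap, which is one way to read "approximates but does not exceed".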