For several analyses, we randomly down-sampled each dataset 10 times to a fraction of its cells (we used 95%), preserving the original singlet-to-doublet ratio. Each doublet annotation method was then run on each down-sampled dataset, yielding 10 annotation scores per cell, method, and dataset. Because of its longer running time, solo was run on only five sub-samples instead of 10 (for run times, see Supplementary Fig. S5). To test for a difference in performance between two methods on a specific dataset, we used a Wilcoxon rank-sum test. To test for a performance difference across all datasets, we used a paired Wilcoxon (signed-rank) test, where the pairing accounts for systematic differences between datasets. When comparing solo to other methods, we used only the five repetitions on which solo was run.
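As a minimal sketch of this procedure (Python with NumPy and SciPy), the snippet below shows stratified down-sampling that preserves the singlet-to-doublet ratio, an unpaired rank-sum test within a dataset, and a paired signed-rank test across datasets. The score arrays are placeholders standing in for the actual per-method AUPRC values, which are not reproduced here:

```python
# Illustrative sketch, not the authors' pipeline; score arrays are placeholders.
import numpy as np
from scipy.stats import ranksums, wilcoxon

rng = np.random.default_rng(0)

def downsample(is_doublet, frac=0.95, rng=rng):
    """Return indices of a subsample keeping the singlet:doublet ratio.

    `is_doublet` is a boolean array, True for doublets.
    """
    keep = []
    for label in (False, True):
        idx = np.flatnonzero(is_doublet == label)
        n = int(round(frac * idx.size))
        keep.append(rng.choice(idx, size=n, replace=False))
    return np.sort(np.concatenate(keep))

# Within one dataset: 10 AUPRC values per method (5 for solo), one per
# subsample; an unpaired Wilcoxon rank-sum test compares two methods.
auprc_a = rng.uniform(0.3, 0.5, size=10)  # placeholder scores, method A
auprc_b = rng.uniform(0.3, 0.5, size=10)  # placeholder scores, method B
stat, p_within = ranksums(auprc_a, auprc_b)

# Across datasets: pair the per-dataset averages of the two methods so that
# systematic dataset differences cancel (paired Wilcoxon signed-rank test).
mean_a = rng.uniform(0.3, 0.5, size=16)   # placeholder per-dataset means
mean_b = rng.uniform(0.3, 0.5, size=16)
stat, p_across = wilcoxon(mean_a, mean_b)
```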
To obtain a standard deviation for a performance measure averaged across datasets (e.g. Fig. 3), we estimated the standard deviation for each dataset and then applied standard error propagation.
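Concretely, for an average M = (1/n) Σᵢ mᵢ of independent per-dataset estimates with standard deviations sᵢ, standard error propagation gives SD(M) = √(Σᵢ sᵢ²) / n. A minimal sketch with hypothetical per-dataset values:

```python
# Error propagation for a mean of independent estimates; SDs are hypothetical.
import numpy as np

per_dataset_sd = np.array([0.02, 0.015, 0.03])  # placeholder per-dataset SDs
sd_of_mean = np.sqrt(np.sum(per_dataset_sd**2)) / per_dataset_sd.size
```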
Vaeda performs well compared with other doublet annotation methods. (A) Average precision (AUPRC, in percent) for doublet annotation methods (columns) across datasets (rows) is shown. The row labeled 'mean' denotes the average across datasets. A ranking of method performance is shown for each dataset; lighter entries indicate worse performance, while darker entries indicate better performance in each row. (B) Average precision when repeatedly down-sampling to 95% of cells is shown across datasets for each method; error bars indicate three standard deviations.