For several analyses, we randomly down-sampled each dataset 10 times to a fraction of its cells (we used 95%), preserving the original singlet-to-doublet ratio. Each doublet annotation method was then run on each down-sampled dataset, yielding 10 annotation scores per cell, method, and dataset. Because of its longer running time, solo was run on only five sub-samples instead of 10 (for run times, see Supplementary Fig. S5). To test for a difference in performance between two methods on a specific dataset, we used a Wilcoxon rank-sum test. To test for a performance difference across all datasets, we used a paired Wilcoxon (signed-rank) test, where the pairing accounts for systematic differences between datasets. When comparing solo to other methods, we used only the five repetitions on which solo was run.
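As a minimal sketch of this procedure (Python with NumPy and SciPy), the snippet below shows stratified down-sampling that preserves the singlet-to-doublet ratio, an unpaired rank-sum test within a dataset, and a paired signed-rank test across datasets. The score arrays are placeholders standing in for the actual per-method AUPRC values, which are not reproduced here:

```python
# Illustrative sketch, not the authors' pipeline; score arrays are placeholders.
import numpy as np
from scipy.stats import ranksums, wilcoxon

rng = np.random.default_rng(0)

def downsample(is_doublet, frac=0.95, rng=rng):
    """Return indices of a subsample keeping the singlet:doublet ratio.

    `is_doublet` is a boolean array, True for doublets.
    """
    keep = []
    for label in (False, True):
        idx = np.flatnonzero(is_doublet == label)
        n = int(round(frac * idx.size))
        keep.append(rng.choice(idx, size=n, replace=False))
    return np.sort(np.concatenate(keep))

# Within one dataset: 10 AUPRC values per method (5 for solo), one per
# subsample; an unpaired Wilcoxon rank-sum test compares two methods.
auprc_a = rng.uniform(0.3, 0.5, size=10)  # placeholder scores, method A
auprc_b = rng.uniform(0.3, 0.5, size=10)  # placeholder scores, method B
stat, p_within = ranksums(auprc_a, auprc_b)

# Across datasets: pair the per-dataset averages of the two methods so that
# systematic dataset differences cancel (paired Wilcoxon signed-rank test).
mean_a = rng.uniform(0.3, 0.5, size=16)   # placeholder per-dataset means
mean_b = rng.uniform(0.3, 0.5, size=16)
stat, p_across = wilcoxon(mean_a, mean_b)
```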
To obtain a standard deviation for a performance measure averaged across datasets (e.g. Fig. 3), we estimated the standard deviation for each dataset and then applied standard error propagation.
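Concretely, for an average M = (1/n) Σᵢ mᵢ of independent per-dataset estimates with standard deviations sᵢ, standard error propagation gives SD(M) = √(Σᵢ sᵢ²) / n. A minimal sketch with hypothetical per-dataset values:

```python
# Error propagation for a mean of independent estimates; SDs are hypothetical.
import numpy as np

per_dataset_sd = np.array([0.02, 0.015, 0.03])  # placeholder per-dataset SDs
sd_of_mean = np.sqrt(np.sum(per_dataset_sd**2)) / per_dataset_sd.size
```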
Vaeda performs well compared with other doublet annotation methods. (A) Average precision (AUPRC, in percent) for doublet annotation methods (columns) across datasets (rows) is shown. The row labeled 'mean' denotes the average across datasets. A ranking of method performance is shown for each dataset; lighter entries indicate worse performance, while darker entries indicate better performance in each row. (B) Average precision when repeatedly down-sampling to 95% of cells is shown across datasets for each method; error bars indicate three standard deviations.