Doublet detection using Scrublet

Rongxin Fang
Sebastian Preissl
Yang Li
Xiaomeng Hou
Jacinta Lucero
Xinxin Wang
Amir Motamedi
Andrew K. Shiau
Xinzhu Zhou
Fangming Xie
Eran A. Mukamel
Kai Zhang
Yanxiao Zhang
M. Margarita Behrens
Joseph R. Ecker
Bing Ren

To identify doublets in the secondary motor cortex single-nucleus ATAC-seq datasets, we used Scrublet (ref. 37), a doublet detection algorithm developed for single-cell RNA-seq. Briefly, Scrublet identifies doublets as follows: (1) it performs normalization, gene filtering, and principal component analysis (PCA) to project the high-dimensional data into a low-dimensional space; (2) it simulates doublets by adding the unnormalized counts of randomly sampled pairs of observed transcriptomes; (3) it projects the simulated doublets into the low-dimensional embedding computed in step 1. The more of a cell's nearest neighbors are simulated doublets, the more likely that cell is itself a “doublet”. Based on this idea, a k-nearest-neighbor (KNN) classifier is then used to estimate a doublet score for each cell.
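The three steps described above can be illustrated with a minimal NumPy sketch. It is not the Scrublet implementation: it skips feature filtering, uses a simple log-normalization, and reports the raw fraction of simulated doublets among each cell's nearest neighbors rather than Scrublet's own score. The function name `doublet_scores` and the parameters `n_sim`, `n_pcs`, and `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def doublet_scores(counts, n_sim=None, n_pcs=10, k=15, seed=0):
    """Toy Scrublet-style score: fraction of simulated doublets
    among the k nearest neighbors of each observed cell.

    counts: (cells x features) matrix of raw, unnormalized counts.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]
    n_sim = n_sim or 2 * n_cells

    # Step 2: simulate doublets by adding the raw counts of random cell pairs.
    pairs = rng.integers(0, n_cells, size=(n_sim, 2))
    simulated = counts[pairs[:, 0]] + counts[pairs[:, 1]]

    def normalize(x):
        # Scale each cell to 10,000 total counts, then log-transform.
        totals = x.sum(axis=1, keepdims=True)
        totals[totals == 0] = 1.0
        return np.log1p(x / totals * 1e4)

    # Step 1: normalize and fit PCA on the observed cells only.
    obs_norm = normalize(counts)
    mean = obs_norm.mean(axis=0)
    _, _, vt = np.linalg.svd(obs_norm - mean, full_matrices=False)
    components = vt[:n_pcs].T
    obs_pcs = (obs_norm - mean) @ components

    # Step 3: project the simulated doublets into the same embedding.
    sim_pcs = (normalize(simulated) - mean) @ components

    # KNN over observed + simulated points, queried from observed cells.
    all_pcs = np.vstack([obs_pcs, sim_pcs])
    is_sim = np.r_[np.zeros(n_cells, dtype=bool), np.ones(n_sim, dtype=bool)]
    sq = (all_pcs ** 2).sum(axis=1)
    d2 = sq[:n_cells, None] + sq[None, :] - 2.0 * obs_pcs @ all_pcs.T
    d2[np.arange(n_cells), np.arange(n_cells)] = np.inf  # exclude self
    neighbors = np.argpartition(d2, k, axis=1)[:, :k]

    # Score = share of each cell's k neighbors that are simulated doublets.
    return is_sim[neighbors].mean(axis=1)
```

Cells whose score is high relative to the rest of the dataset would be the doublet candidates; choosing a cutoff is left out of this sketch.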

Because Scrublet was designed for detecting doublets in single-cell RNA-seq, it was unclear whether it could be applied to single-cell ATAC-seq. To test this, we applied Scrublet to a single-cell ATAC-seq dataset of mixed human and mouse cells, in which “ground-truth” doublets can be identified from the ratio of reads aligning to the human and mouse genomes. Compared with the ground truth, Scrublet identified over 90% of the doublets in this dataset with ~90% accuracy (Supplementary Fig. 26). This result suggests that, although Scrublet was not developed for single-cell ATAC-seq, it can detect doublets in scATAC-seq datasets with reasonable accuracy and sensitivity.
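As a rough illustration of this kind of benchmark, the sketch below labels a barcode as a ground-truth doublet when a substantial share of its reads align to both genomes, then compares predicted doublets against those labels. The 0.2 cutoff, the function names, and the use of precision as the second metric are assumptions made for the example; the text above does not specify the threshold or how accuracy was computed.

```python
import numpy as np


def call_ground_truth_doublets(human_reads, mouse_reads, min_minor_fraction=0.2):
    """Flag barcodes whose minor-species read fraction reaches a cutoff.

    The 0.2 default is a hypothetical threshold, not taken from the source.
    """
    human = np.asarray(human_reads, dtype=float)
    mouse = np.asarray(mouse_reads, dtype=float)
    total = human + mouse
    minor = np.minimum(human, mouse)
    # Barcodes with no reads get a fraction of 0 and are never flagged.
    fraction = np.divide(minor, total, out=np.zeros_like(total), where=total > 0)
    return fraction >= min_minor_fraction


def sensitivity_and_precision(truth, predicted):
    """Return (sensitivity, precision) of predicted doublets against truth."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    true_positives = np.sum(truth & predicted)
    sensitivity = true_positives / truth.sum() if truth.sum() else float("nan")
    precision = true_positives / predicted.sum() if predicted.sum() else float("nan")
    return float(sensitivity), float(precision)
```

In use, `predicted` would be the doublet calls from the detection step and `truth` the species-mixing labels for the same barcodes, in the same order.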
