The scRNA-seq data downloaded from GEO were processed following the standard procedures provided in the Seurat package (Version 4.3.0).(30) Briefly, the downloaded filtered count matrix, along with the barcodes and features, was loaded using the Read10X function in Seurat and then passed to the CreateSeuratObject function with min.cells = 3 to produce a Seurat object. The Seurat object was preprocessed with the SCTransform function using default parameters. PCA, neighborhood graph construction, cell clustering with the Leiden algorithm, and dimension reduction via UMAP were then performed using functions provided in Seurat. Additionally, a heatmap with hierarchical clustering was created from the first 50 principal components using the ComplexHeatmap R package (Version 2.14.0).(31) For the construction of the neighborhood graph, principal components were selected to account for a cumulative variance of > 95%, while retaining individual variances ≥ 5%. For UMAP, the Euclidean distance was used.
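As a rough illustration of the two variance thresholds mentioned above, the following Python sketch computes the percentage of variance carried by each principal component and reports both quantities. It is not the code used in this study. The function name, the input (a list of PC standard deviations), and the choice to express variance relative to the computed PCs only are assumptions. Because the text does not state how the two criteria were combined, the sketch simply returns both.

```python
def pc_selection_summary(stdevs, cum_threshold=95.0, indiv_threshold=5.0):
    """Summarise per-PC variance against two thresholds (hypothetical helper).

    stdevs: standard deviations of the computed principal components.
    Returns a tuple of:
      - percent of variance per PC (relative to the PCs supplied, an assumption),
      - the number of leading PCs needed for the cumulative percent to exceed
        cum_threshold (None if it is never exceeded),
      - the 1-based indices of PCs whose individual percent is >= indiv_threshold.
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    pct = [100.0 * v / total for v in variances]

    # Smallest number of leading PCs whose cumulative percent exceeds the threshold.
    n_cum = None
    running = 0.0
    for i, p in enumerate(pct, start=1):
        running += p
        if running > cum_threshold:
            n_cum = i
            break

    # PCs that individually carry at least indiv_threshold percent.
    large = [i for i, p in enumerate(pct, start=1) if p >= indiv_threshold]
    return pct, n_cum, large


# Toy standard deviations, invented for illustration only.
toy_stdevs = [6.0, 4.0, 3.0, 2.0, 1.0, 1.0]
toy_pct, toy_n_cum, toy_large = pc_selection_summary(toy_stdevs)
```

In this toy case both criteria point to the same leading components, but with real data they may disagree, and the rule for reconciling them would need to be taken from the original analysis.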
To identify clusters potentially containing hybrid cells, the FindMarkers function was first applied to two cell types defined by the annotations of the original authors(13): macrophages/monocytes and tumor cells (encompassing all tumor cell types, such as Class 1 or Class 2 and Prame+ or Prame− tumor cells). Subsequently, the AddModuleScore function was invoked twice, with nbin = 24 for all samples except UMM063, which used nbin = 12. The first call used the 50 genes with the highest differential expression scores in tumor cells from FindMarkers, and the resulting score was denoted ‘Tum_Score’. The second call used the 50 genes with the highest differential expression scores in macrophages/monocytes, and the resulting score was denoted ‘Mac_Score’. These scores were then plotted as violin plots with the VlnPlot function, which were used for the manual identification of hybrid clusters in individual samples.
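The idea behind a module score can be sketched in a few lines of Python: the average expression of a gene set in each cell is compared with the average expression of control genes drawn from the same expression bins. This is a simplified stand-in and not Seurat's AddModuleScore implementation. The function name, the dictionary input format, the equal-size binning, and the toy expression values are all assumptions made for illustration.

```python
import random


def module_score(expr, gene_set, nbin=24, ctrl=5, seed=0):
    """Simplified per-cell module score (hypothetical helper, not Seurat code).

    expr: dict mapping gene name -> list of per-cell expression values.
    gene_set: genes whose combined expression is scored.
    nbin: number of expression bins used to match control genes.
    ctrl: number of control genes drawn per gene in the set.
    """
    rng = random.Random(seed)
    n_cells = len(next(iter(expr.values())))

    # Rank genes by average expression and split them into nbin bins.
    avg = {g: sum(v) / n_cells for g, v in expr.items()}
    ranked = sorted(expr, key=lambda g: avg[g])
    bin_of = {g: i * nbin // len(ranked) for i, g in enumerate(ranked)}
    bins = {}
    for g in ranked:
        bins.setdefault(bin_of[g], []).append(g)

    # For each gene in the set, draw control genes from the same bin.
    controls = set()
    for g in gene_set:
        pool = [c for c in bins[bin_of[g]] if c not in gene_set]
        controls.update(rng.sample(pool, min(ctrl, len(pool))))

    def mean_over(genes):
        if not genes:
            return [0.0] * n_cells
        return [sum(expr[g][c] for g in genes) / len(genes) for c in range(n_cells)]

    set_mean = mean_over(list(gene_set))
    ctrl_mean = mean_over(sorted(controls))
    return [s - c for s, c in zip(set_mean, ctrl_mean)]


# Toy data: cell 0 resembles a tumor cell, cell 1 a macrophage (invented values).
toy_expr = {f"B{i}": [1.0, 1.0] for i in range(1, 9)}
toy_expr.update({"T1": [5.0, 0.0], "T2": [5.0, 0.0],
                 "M1": [0.0, 5.0], "M2": [0.0, 5.0]})
tum_score = module_score(toy_expr, ["T1", "T2"], nbin=2)
mac_score = module_score(toy_expr, ["M1", "M2"], nbin=2)
```

A cell with high values for both scores would be a candidate hybrid cell under this logic. In the study itself, that judgment was made manually from violin plots rather than by a fixed cutoff.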
To identify doublets and distinguish them from hybrid cells, three simulation-based algorithms were executed: (1) doubletFinder_v3 in the DoubletFinder package (Version 2.0.3)(32) with the following parameters: PCs = 1:10, pN = 0.25, pK = 0.09, nExp = 4494, reuse.pANN = FALSE, sct = TRUE; (2) computeDoubletDensity from scDblFinder (Version 1.12.0)(33); and (3) Scrublet(34), which is implemented as a Python package. To run Scrublet, the raw count matrix for each sample was exported from the Seurat object to a CSV file and then imported into a Python script. The Scrublet results were subsequently exported to CSV files and re-imported into the Seurat objects for further analysis and visualization. A cluster-based algorithm, implemented as the findDoubletClusters function in scDblFinder(33), was also employed to identify doublets that may result from the fusion of cells from two clusters. However, we believe the results from this algorithm may not align with the goal of this project, which focuses on identifying hybrid cells that can arise from the fusion of two distinct cell types (e.g., macrophages and tumor cells). Therefore, the results produced by this algorithm are not presented.
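The principle shared by simulation-based doublet detectors can be illustrated with a minimal Python sketch: artificial doublets are created by averaging pairs of real cells, merged with the real cells, and each real cell is scored by the fraction of artificial doublets among its nearest neighbors. This is not the DoubletFinder, scDblFinder, or Scrublet implementation. The function names, the direct use of low-dimensional coordinates without normalization or PCA, the fixed neighbor count k, and the optional `pairs` argument are assumptions. Here pN is treated as the proportion of artificial doublets in the merged data.

```python
import math
import random


def n_artificial(n_real, pN=0.25):
    """Number of artificial doublets so that they make up a fraction pN
    of the merged (real + artificial) data."""
    return round(n_real * pN / (1.0 - pN))


def simulate_pann(cells, pN=0.25, k=3, seed=0, pairs=None):
    """Proportion of artificial nearest neighbors for each real cell.

    cells: list of coordinate lists, one per real cell (hypothetical input).
    pairs: optional explicit list of (i, j) index pairs to merge; if None,
           pairs are drawn at random (an illustrative convenience).
    """
    rng = random.Random(seed)
    n_real = len(cells)
    if pairs is None:
        pairs = [tuple(rng.sample(range(n_real), 2))
                 for _ in range(n_artificial(n_real, pN))]

    # An artificial doublet is the average profile of two real cells.
    artificial = [[(x + y) / 2.0 for x, y in zip(cells[a], cells[b])]
                  for a, b in pairs]
    merged = list(cells) + artificial
    is_art = [False] * n_real + [True] * len(artificial)

    pann = []
    for i in range(n_real):
        # Distances from real cell i to every other point in the merged data.
        dists = sorted((math.dist(merged[i], merged[j]), j)
                       for j in range(len(merged)) if j != i)
        neighbours = [j for _, j in dists[:k]]
        pann.append(sum(is_art[j] for j in neighbours) / k)
    return pann


# Toy data: two tight clusters plus one cell lying between them (invented values).
toy_cells = [[0, 0], [0, 1], [1, 0], [1, 1],
             [10, 10], [10, 11], [11, 10], [11, 11],
             [5.5, 5.5]]
toy_pairs = [(0, 7), (1, 6), (2, 5)]  # cross-cluster pairs chosen by hand
toy_pann = simulate_pann(toy_cells, k=3, pairs=toy_pairs)
```

In this toy setup the cell lying between the two clusters is surrounded by artificial doublets and receives the highest score. A genuine hybrid cell expressing markers of both cell types could score similarly, which is why such scores alone cannot separate doublets from hybrid cells.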