2.2. Single-cell sequencing data processing

Yang Li, Yuan Chen, Danqiong Wang, Ling Wu, Tao Li, Na An, Haikun Yang

For the single-cell RNA sequencing (scRNA-seq) analysis, we used the R package “Seurat” for cell-level processing (48, 49). To maintain data quality, we retained cells expressing between 200 and 2,500 genes and removed cells with more than 5% mitochondrial gene content. To correct batch effects between tumor and normal samples, we used the “harmony” R package. Following normalization with Seurat’s “NormalizeData” function, the data were converted into Seurat objects. We identified the top 2,000 variable genes using “FindVariableFeatures” and then reduced dimensionality by principal component analysis (PCA) using “RunPCA”. Significant principal components (PCs) were determined by JackStraw analysis, with selection based on variance ratios, and were used for subsequent cell clustering. We clustered cells using “FindNeighbors” and “FindClusters” and visualized cell populations with uniform manifold approximation and projection (UMAP) (50). To identify cluster-specific genes, we applied a Wilcoxon rank-sum test using Seurat’s “FindAllMarkers” and “FindMarkers” functions together with the “scran” R package (51), and annotated cell types using the CellMarker database (http://xteam.xbio.top/CellMarker/index.jsp).
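The quality-control rule in this paragraph (200 to 2,500 detected genes, no more than 5% mitochondrial content) can be sketched in a few lines. The following is a minimal Python illustration of the filtering logic only, not the Seurat code used in the study. The `CellQC` record, the `passes_qc` helper, the barcodes, and the per-cell values are all hypothetical, and treating the gene-count bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical record, for illustration only)."""
    barcode: str
    n_genes: int      # number of genes detected in the cell
    pct_mito: float   # percentage of counts from mitochondrial genes


def passes_qc(cell, min_genes=200, max_genes=2500, max_pct_mito=5.0):
    """Keep a cell only if its gene count is within range and its
    mitochondrial percentage does not exceed the cutoff.
    Inclusive bounds are an assumption, not stated in the text."""
    in_gene_range = min_genes <= cell.n_genes <= max_genes
    low_mito = cell.pct_mito <= max_pct_mito
    return in_gene_range and low_mito


# Made-up example cells, one per failure mode plus two that pass.
cells = [
    CellQC("CELL_1", 1200, 2.1),   # passes
    CellQC("CELL_2", 150, 1.0),    # too few genes
    CellQC("CELL_3", 3100, 3.0),   # too many genes
    CellQC("CELL_4", 900, 7.5),    # mitochondrial content too high
    CellQC("CELL_5", 2500, 5.0),   # on both boundaries, kept
]

kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

A cell must satisfy both conditions to be kept, so a cell with a normal gene count but high mitochondrial content is still removed.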

Intercellular communication, particularly in relation to the immune profile of HCC, was analyzed using the “CellChat” R package, which infers communication probabilities from gene expression and known interactions among ligands, receptors, and cofactors (52). We scored a set of 185 glycosyltransferase-associated genes with five scoring algorithms, including AddModuleScore, AUCell, UCell, and singscore. Seurat’s “AddModuleScore” scores a gene set based on mean gene expression, whereas “AUCell”, “UCell”, and “singscore” are rank-based methods that quantify gene set enrichment or activity within single cells or samples. Using several scoring methods in parallel made the results less dependent on any single algorithm.
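To make the difference between mean-based and rank-based gene set scoring concrete, here is a minimal Python sketch of two simplified scores for a single cell. It does not reproduce AddModuleScore, AUCell, UCell, or singscore. The first function uses all non-signature genes as the control set, which is a simplifying assumption, and the second computes the fraction of (signature gene, background gene) pairs in which the signature gene is more highly expressed, which is equivalent to a normalized Mann-Whitney U statistic. Function names, gene names, and expression values are hypothetical.

```python
def split_expression(expr, signature):
    """Split one cell's expression values into signature and background genes."""
    sig = [v for g, v in expr.items() if g in signature]
    bg = [v for g, v in expr.items() if g not in signature]
    if not sig or not bg:
        raise ValueError("need at least one signature gene and one background gene")
    return sig, bg


def mean_difference_score(expr, signature):
    """Mean-based score: average signature expression minus average
    background expression (all other genes serve as controls here)."""
    sig, bg = split_expression(expr, signature)
    return sum(sig) / len(sig) - sum(bg) / len(bg)


def rank_auc_score(expr, signature):
    """Rank-based score in [0, 1]: share of signature/background gene pairs
    where the signature gene has the higher value; ties count as 0.5."""
    sig, bg = split_expression(expr, signature)
    wins = 0.0
    for s in sig:
        for b in bg:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(sig) * len(bg))


# Hypothetical signature and two made-up cells.
signature = {"GENE_A", "GENE_B"}
cell_high = {"GENE_A": 5.0, "GENE_B": 3.0, "GENE_C": 1.0, "GENE_D": 0.0}
cell_low = {"GENE_A": 0.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 4.0}

for name, cell in [("cell_high", cell_high), ("cell_low", cell_low)]:
    print(name, mean_difference_score(cell, signature), rank_auc_score(cell, signature))
```

The rank-based score depends only on the ordering of genes within a cell, so it is unaffected by monotonic rescaling of the expression values, whereas the mean-based score changes with the scale of the data.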

For macrophage-related analyses, we used the “limma” R package to perform differential expression analysis and intersected the results with the glycosyltransferase gene set to identify key macrophage-associated glycosyltransferase genes (53, 54). Pseudotime trajectory analysis, which traces cellular state transitions relevant to tumorigenesis, was performed using the “Monocle” package. Additionally, “CellCall” was used to infer intercellular communication networks by combining ligand-receptor signaling with intracellular transcription factor activity to construct the ligand-receptor-transcription factor (L-R-TF) axis, and by integrating pathway activity analysis to identify pathway alterations driven by intercellular communication.
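The step of combining differential analysis results with the glycosyltransferase gene set amounts to a filtered set intersection. The Python sketch below shows that logic under stated assumptions: the text does not report significance or fold-change cutoffs, so the `max_adj_p` and `min_abs_logfc` defaults are placeholders, and the function name, gene names, and values are hypothetical.

```python
def overlap_with_gene_set(de_results, gene_set, max_adj_p=0.05, min_abs_logfc=1.0):
    """Return genes that are both differentially expressed and in gene_set.

    de_results maps gene -> (log fold change, adjusted p-value).
    The cutoffs are assumed placeholders, not values from the study.
    """
    return sorted(
        gene
        for gene, (logfc, adj_p) in de_results.items()
        if gene in gene_set and adj_p < max_adj_p and abs(logfc) >= min_abs_logfc
    )


# Made-up differential expression table: gene -> (logFC, adjusted p-value).
de_results = {
    "GENE_A": (2.3, 0.001),    # significant, in the gene set
    "GENE_B": (-1.8, 0.010),   # significant (down), in the gene set
    "GENE_C": (0.4, 0.020),    # fold change too small
    "GENE_D": (3.1, 0.200),    # not significant
    "GENE_E": (2.0, 0.001),    # significant, but not in the gene set
}
gene_set = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}

key_genes = overlap_with_gene_set(de_results, gene_set)
print(key_genes)
```

Using the absolute log fold change keeps both up- and down-regulated genes; restricting to one direction would be a different, equally plausible choice.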
