Data source

Chenkai Huang
Juanjuan Zhou
Yuan Nie
Guihai Guo
Anjiang Wang
Xuan Zhu

All gene expression data for liver hepatocellular carcinoma (LIHC) were obtained from the TCGA database (portal.gdc.cancer.gov/projects/TCGA-LIHC) and the GEO database. The LIHC expression data and the corresponding clinical information, which are freely available, were downloaded with the R package TCGAbiolinks as previously described [12, 13]. The TCGA data comprised 424 samples (374 LIHC tissues and 50 healthy tissues) with RNA-seq count data for 19,601 genes. Missing values were replaced with "Not Available (N/A)" for further analysis, and all genes with low abundance were excluded following the edgeR package recommendation.
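The low-abundance filtering step can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it is a simplified counts-per-million (CPM) rule, not the edgeR routine the authors followed. The function names, the toy gene labels, and the thresholds `min_cpm` and `min_samples` are all assumptions made for illustration; the protocol does not state which cutoffs were applied.

```python
def counts_per_million(counts):
    """Scale a {gene: [count per sample]} table to counts per million (CPM)."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts observed in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] if lib_sizes[j] else 0.0
               for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_abundance(counts, min_cpm=10.0, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples.

    Both thresholds are hypothetical defaults, not values from the protocol.
    """
    cpm = counts_per_million(counts)
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(v >= min_cpm for v in values) >= min_samples
    }


# Toy table: three hypothetical genes measured in four samples.
counts = {
    "GENE_A": [500000, 600000, 550000, 450000],
    "GENE_B": [500000, 400000, 450000, 550000],
    "GENE_C": [2, 0, 3, 1],  # a few reads only, so its CPM stays far below 10
}

kept = filter_low_abundance(counts)
print(sorted(kept))
```

In this toy table only the sparsely expressed gene is dropped. A per-sample CPM threshold is used instead of a raw count threshold so that samples sequenced to different depths are judged on the same scale.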

Additionally, the R package GEOquery was used to retrieve and normalize the expression profiles of GSE25097 from the GEO database (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25097). GSE25097 consisted of 268 tumor samples and 243 paired normal tissue samples from patients with LIHC, plus 6 healthy liver samples. As a result, approximately 2,887 genes were selected for further analysis.
