Gene expression and disease correlation analysis to identify BPHL

Pengfei Ren
Jianxue Zhai
Xuelian Wang
Yucheng Yin
Zuju Lin
Kaican Cai
Haofei Wang

We downloaded RNA-sequencing (RNA-seq) data from the tumor and adjacent normal tissues of 57 cases, together with the corresponding pathology data, from the LUAD (lung adenocarcinoma) category of The Cancer Genome Atlas (TCGA) database. The data were normalized using the Trimmed Mean of M-values (TMM) method. Gene expression values were expressed on a log2 scale, and a threshold of ±1 was applied when estimating the dispersion. The BPHL gene was identified as a top hit (Figure 1 and Table 1) (8-13). The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
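The log2 scaling and ±1 threshold described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses plain counts-per-million (CPM) scaling in place of full TMM normalization, and the gene names, counts, library sizes, and pseudocount are hypothetical values chosen only for illustration.

```python
import math


def cpm(counts, library_size):
    """Scale raw read counts for one sample to counts per million (CPM)."""
    return {gene: c * 1e6 / library_size for gene, c in counts.items()}


def log2_fold_change(cancer_cpm, normal_cpm, pseudocount=1.0):
    """Per-gene log2(cancer / normal); the pseudocount (an assumption) avoids division by zero."""
    return {
        gene: math.log2(
            (cancer_cpm[gene] + pseudocount) / (normal_cpm[gene] + pseudocount)
        )
        for gene in cancer_cpm
    }


def classify(log2fc, threshold=1.0):
    """Apply the ±1 log2 threshold: >= 1 is 'up', <= -1 is 'down', otherwise 'unchanged'."""
    if log2fc >= threshold:
        return "up"
    if log2fc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical counts for one tumor / adjacent-normal pair (not TCGA data).
tumor_counts = {"GENE_A": 400, "GENE_B": 50, "GENE_C": 100}
normal_counts = {"GENE_A": 100, "GENE_B": 200, "GENE_C": 110}

# Library size is assumed to be one million reads so CPM equals the raw count here.
tumor_cpm = cpm(tumor_counts, library_size=1_000_000)
normal_cpm = cpm(normal_counts, library_size=1_000_000)

lfc = log2_fold_change(tumor_cpm, normal_cpm)
calls = {gene: classify(value) for gene, value in lfc.items()}
print(calls)
```

A log2 ratio of 1 corresponds to a twofold increase in the tumor sample and a ratio of −1 to a twofold decrease, which is why the threshold is symmetric around zero.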

Bioinformatics analysis of the TCGA database to identify BPHL as a disease-associated gene. (A) TMM was used for data normalization, and statistical analysis of the paired samples was conducted. The BCV was examined for quality control; normal samples (tissues adjacent to cancer) and cancer samples were clearly separated along Dim1. (B) The log2 ratio (Cancer/Normal) was used for statistical analysis of the multiple paired samples, and the filtering criterion was set at ≥1 or ≤−1 to estimate the dispersion. Genes with a P value below 0.05 were considered differentially expressed, that is, the null hypothesis of no difference was rejected (marked in red in the figure); genes for which the null hypothesis was not rejected are shown as black dots. (C) Line chart of the original BPHL expression data in each TCGA RNA-seq sample. The vertical axis shows the original expression value of each sample, and the horizontal axis shows the cancer and adjacent tissues. Each line represents the data of one sample, and the trend of the lines shows the change in gene expression across all samples. FC, fold change; CPM, counts per million; BPHL, biphenyl hydrolase-like; TCGA, The Cancer Genome Atlas; TMM, Trimmed Mean of M-values; BCV, biological coefficient of variation.
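To make the red/black labeling in panel B concrete, the sketch below tests one gene's paired log2 (Cancer/Normal) ratios against the null hypothesis of no difference and labels the gene by P < 0.05. The source does not name the statistical test used, so an exact sign-flip permutation test is substituted here purely as an assumption; the function names and the ratios are hypothetical.

```python
from itertools import product


def paired_sign_flip_pvalue(log2_ratios):
    """Two-sided exact sign-flip permutation test on the mean paired log2 ratio.

    Under the null hypothesis of no tumor/normal difference, each paired
    ratio is equally likely to be positive or negative, so every sign
    assignment is enumerated and compared with the observed mean.
    """
    n = len(log2_ratios)
    observed = abs(sum(log2_ratios)) / n
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * x for s, x in zip(signs, log2_ratios))) / n
        # Small tolerance guards against floating-point rounding in the comparison.
        if stat >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


def plot_color(p_value, alpha=0.05):
    """Red for genes where the null hypothesis is rejected, black otherwise."""
    return "red" if p_value < alpha else "black"


# Hypothetical paired log2 ratios for two genes across six cases (not TCGA data).
consistent_gene = [1.2, 0.9, 1.5, 1.1, 0.8, 1.3]
mixed_gene = [0.3, -0.4, 0.1, -0.2, 0.5, -0.3]

p_consistent = paired_sign_flip_pvalue(consistent_gene)
p_mixed = paired_sign_flip_pvalue(mixed_gene)
print(plot_color(p_consistent), plot_color(p_mixed))
```

Full enumeration is only practical for a handful of pairs; with 57 cases there are 2^57 sign assignments, so a random sample of sign flips or a parametric paired test would be needed instead.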

Genes whose function and clinical significance in the cancers under study had already been reported in the literature, genes encoding multi-pass transmembrane proteins, and genes with unclear annotation (such as genes annotated only as an open reading frame) were excluded. The remaining genes were then cross-referenced with the gene-disease database, and the resulting list was randomly reduced to give the final gene list for analysis. TCGA, The Cancer Genome Atlas; BPHL, biphenyl hydrolase-like.
