TCGA and GTEx data acquisition, normalization and quality control

Saiful Effendi Syafruddin; Wan Fahmi Wan Mohamad Nazarie; Nurshahirah Ashikin Moidu; Bee Hong Soon; M. Aiman Mohtar

This method is extracted from research article: BMC Cancer, Jul 2021

Integration of RNA-Seq and proteomics data identifies glioblastoma multiforme surfaceome signature

DOI: 10.1186/s12885-021-08591-0

The analysis combined RNA-Seq read count data from TCGA-GBM tumours and GTEx normal brain tissue. Raw gene-level read counts for GBM were downloaded from the Genomic Data Commons (GDC) Data Portal (https://portal.gdc.cancer.gov). Raw gene-level read counts for normal brain tissue (cortex, frontal cortex and anterior cingulate cortex) were downloaded from the GTEx Portal (https://gtexportal.org/home/datasets) on 29/03/19. This allowed differential gene expression analysis on 166 GBM tumour samples from TCGA and 408 normal brain tissue samples from GTEx.

Pre-processing of the raw read counts involved data filtering and data normalization. Both data sets were then normalized at the gene level (using the mean) as log₂-counts per million. This adjusts the raw data for factors that prevent direct comparison of expression measures and ensures that the expression distributions are similar for each sample across the whole experiment. Data that were unlikely to be informative, or were simply erroneous, were removed using a variance filter (less than 15) and a low-abundance filter (less than 4).
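The pre-processing described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and several points are assumptions because the text does not state them: "low abundance (less than 4)" is read as a mean raw count below 4, "variance filter (less than 15)" is read as dropping the 15% of genes with the lowest variance, a prior count of 1 is added before taking log₂, and filtering is applied before normalization. All function names and the toy gene names are hypothetical, and only the standard library is used.

```python
import math
from statistics import mean, pvariance


def merge_cohorts(tumour, normal):
    """Keep genes present in both tables; tumour samples first, then normal."""
    shared = sorted(set(tumour) & set(normal))
    return {g: list(tumour[g]) + list(normal[g]) for g in shared}


def filter_low_abundance(counts, min_mean_count=4):
    """Drop genes whose mean raw count is below the threshold (assumed meaning of 4)."""
    return {g: row for g, row in counts.items() if mean(row) >= min_mean_count}


def log2_cpm(counts, prior=1.0):
    """Convert raw counts to log2(CPM + prior), using per-sample library sizes."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        g: [math.log2(c / lib_sizes[j] * 1e6 + prior) for j, c in enumerate(row)]
        for g, row in counts.items()
    }


def filter_low_variance(expr, drop_percent=15):
    """Drop the lowest-variance genes (assumed meaning of 15: a percentage of genes)."""
    variances = {g: pvariance(row) for g, row in expr.items()}
    ranked = sorted(variances, key=variances.get)  # lowest variance first
    n_drop = int(len(ranked) * drop_percent / 100)
    dropped = set(ranked[:n_drop])
    return {g: row for g, row in expr.items() if g not in dropped}


# Toy data: gene -> raw counts per sample (invented values, not TCGA or GTEx data).
tumour = {"GENE_A": [100, 200], "GENE_B": [1, 0], "GENE_C": [50, 50], "GENE_X": [5, 5]}
normal = {"GENE_A": [10, 20, 30], "GENE_B": [2, 1, 0], "GENE_C": [50, 50, 50]}

merged = merge_cohorts(tumour, normal)    # GENE_X is absent from normal, so it is dropped
abundant = filter_low_abundance(merged)   # GENE_B has a mean count below 4
expr = log2_cpm(abundant)
kept = filter_low_variance(expr)
print(sorted(kept))
```

With real data the same steps would more typically be written with a data-frame library; the dictionary form here only keeps the sketch self-contained.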

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
