Last updated: Nov 6, 2021
GenScore Analytical Protocol
This protocol is implemented in the open-source R programming language.
Step 1: Data download and expression matrices
Gene expression data from the TCGA GBM, LUSC and OV cancer datasets was downloaded from the following sources:
Data for all TCGA cancers (including those above) were also downloaded from the GDC Data Portal. FPKM-UQ RNA-Seq files of the selected samples were added to the cart and downloaded directly or, for large files, with the GDC Data Transfer Tool.
Each dataset consisted of a gene expression matrix containing samples as rows, identified by their TCGA ID, and genes as columns, identified by their name.
Gene expression values were scaled into gene-centered z-scores, by first transposing the gene expression matrix and then applying the R function scale.
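A minimal sketch of this scaling step on a small synthetic matrix. It assumes the matrix starts with genes as rows, so that transposing places genes in columns before scale is applied (scale standardizes each column). The matrix, gene names and sample names below are made up for illustration.

```r
set.seed(1)

# Hypothetical raw matrix: 5 genes (rows) x 8 samples (columns).
raw <- matrix(rexp(40, rate = 0.1), nrow = 5,
              dimnames = list(paste0("gene_", 1:5), paste0("sample_", 1:8)))

# Transpose so that samples are rows and genes are columns.
expr <- t(raw)

# scale() centers and standardizes each column, i.e. each gene.
z <- scale(expr)

# Each gene should now have mean ~0 and standard deviation 1 across samples.
print(round(colMeans(z), 10))
print(apply(z, 2, sd))
```

If the matrix already has genes as columns, the transposition is not needed before calling scale.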
Step 2: GenScore value of signatures
The gsva function of the GSVA package (version 1.34) is used. This package is freely available from Bioconductor.
The genScore Genetic Score metric library is used. This can be downloaded and installed from GitHub (https://github.com/pujana-lab/genScore). The library has been developed ad hoc for this project.
Process:
1. ssGSEA values:
2. genScore values:
The example provided on the genScore website can be followed to compute the values. The method does not require any additional parameters; for example: genScore::genScore(ssgsea$up_tgfb, ssgsea$alt_ej).
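The ssGSEA step that produces an object like ssgsea could look roughly as follows. This is a sketch under several assumptions: it targets the gsva interface of GSVA 1.34 (more recent GSVA releases have moved to a parameter-object interface, so the call may need adapting), it uses the z-scored matrix as input, and the expression values and gene lists are synthetic placeholders rather than the actual signatures.

```r
# Sketch only: assumes GSVA 1.34, where gsva() accepts method = "ssgsea".
library(GSVA)

set.seed(1)

# Synthetic stand-in for the z-scored matrix: 20 samples (rows) x 50 genes (columns).
z <- matrix(rnorm(20 * 50), nrow = 20,
            dimnames = list(paste0("sample_", 1:20), paste0("gene_", 1:50)))

# Hypothetical gene lists; the real signature contents are not given here.
signatures <- list(up_tgfb = paste0("gene_", 1:10),
                   alt_ej  = paste0("gene_", 11:20))

# gsva() expects genes as rows and samples as columns, hence t().
es <- gsva(t(z), signatures, method = "ssgsea")

# Rearrange to one row per sample and one column per signature.
ssgsea <- as.data.frame(t(es))

# With genScore installed from GitHub, the call quoted in the text would be:
# gs <- genScore::genScore(ssgsea$up_tgfb, ssgsea$alt_ej)
```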
To group samples by tertiles, the categorizeSamples function from the same library can be used. It takes three arguments: the array of signature scores, the lower threshold and the upper threshold. Thus, to extract the lower and upper tertiles, the following command can be used: genScore::categorizeSamples(scores, lowThreshold=1/3, highThreshold=2/3). This yields the High βAlt and Low βAlt groups; the classification omits the samples in the middle group.
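For readers who want to see what tertile grouping amounts to, here is a base-R illustration. It is not the genScore implementation: it assumes the thresholds are interpreted as quantile probabilities, and the handling of values exactly at a boundary is an arbitrary choice made here. The scores are synthetic.

```r
set.seed(42)

# Synthetic scores standing in for genScore values of 90 samples.
scores <- rnorm(90)

# Cut points at the 1/3 and 2/3 quantiles.
cuts <- unname(quantile(scores, probs = c(1/3, 2/3)))

# Bottom tertile -> "Low", top tertile -> "High", middle tertile -> NA (omitted).
group <- ifelse(scores <= cuts[1], "Low",
                ifelse(scores > cuts[2], "High", NA))

# Put "Low" first so that it acts as the reference level in later models.
group <- factor(group, levels = c("Low", "High"))

print(table(group, useNA = "ifany"))
```

Samples with NA correspond to the middle group and would be dropped before the survival comparison.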
Step 3: Survivals
A survival data file is needed. For TCGA studies, we used the TCGA Pan-Cancer Clinical Data Resource published by Liu et al. (2018) (https://pubmed.ncbi.nlm.nih.gov/29625055/). Alternatively, cBioPortal data were used: LUSC (lusc_tcga_pan_can_atlas_2018_clinical_data), GBM (gbm_tcga_pan_can_atlas_2018_clinical_data) and OV (ov_tcga_pan_can_atlas_2018_clinical_data).
The Surv, survfit, and coxph functions of the survival package are used, together with the ggsurvplot function from the survminer package to draw the survival curves. Both packages are available from CRAN (https://cran.r-project.org/).
The patient/tumor groups obtained in the previous step are used in an automated way to compare survival between the High βAlt and Low βAlt groups.
Cox survival models use Low βAlt as the reference group and include, whenever possible, the covariates age and tumor grade/stage. The ggsurvplot function is used to report the log-rank test p-values.
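The survival comparison described above might be sketched as follows. The clinical table is entirely synthetic, and its column names (group, age, stage, time, status) are hypothetical; real analyses would use the fields of the clinical data file. The explicit survdiff call is added here only to show the log-rank test on its own.

```r
library(survival)

set.seed(7)
n <- 120

# Synthetic clinical table; "Low" is listed first so it is the reference level.
clin <- data.frame(
  group = factor(rep(c("Low", "High"), each = n / 2), levels = c("Low", "High")),
  age   = round(rnorm(n, mean = 60, sd = 10)),
  stage = factor(sample(c("I-II", "III-IV"), n, replace = TRUE))
)

# Synthetic follow-up times and event indicator (1 = event, 0 = censored).
clin$time   <- rexp(n, rate = ifelse(clin$group == "High", 0.08, 0.04))
clin$status <- rbinom(n, size = 1, prob = 0.7)

# Kaplan-Meier curves per group.
km <- survfit(Surv(time, status) ~ group, data = clin)

# Log-rank test comparing High vs Low.
lr <- survdiff(Surv(time, status) ~ group, data = clin)
print(lr)

# Cox model with Low as reference, adjusted for age and stage.
cox <- coxph(Surv(time, status) ~ group + age + stage, data = clin)
print(summary(cox)$coefficients)

# Optional plot; pval = TRUE adds the log-rank p-value to the figure.
if (requireNamespace("survminer", quietly = TRUE)) {
  print(survminer::ggsurvplot(km, data = clin, pval = TRUE))
}
```

The groupHigh row of the coefficient table gives the hazard ratio estimate for High relative to Low.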