Gene set enrichment analysis (GSEA) of single-cell RNA-seq data using output from the Self-Assembling Manifold (SAM) algorithm
Github link: https://github.com/atarashansky/self-assembling-manifold
This protocol was generated using SAM v0.7.4 and the Windows version of GSEA v4.1.
- Run SAM analysis.
- Save the resulting SAM weights (sorted in descending order) to a tab-separated `.rnk` file:
`sam.adata.var['weights'].sort_values(ascending=False).to_csv('filename.rnk', sep='\t')`, where `sam` is the SAM object.
- Open the `.rnk` file in a text editor and add a “#” to the start of the first (header) row to comment it out.
- Download and install GSEA v4.1 from the GSEA website.
- Click on “Load data” in the GSEA interface.
- Load the “.rnk” file generated previously.
- Click on “RUN GSEAPreranked” in the GSEA interface.
- Choose the gene sets database of interest (e.g. “h.all.v7.1.symbols.gmt [Hallmarks]”).
- In the SAM study, we did not remap or collapse the gene symbols (“No_Collapse”). If your dataset uses different gene IDs, you may want to remap them to standardized gene symbols. In this case, select “Remap_Only” and choose the associated chip file (a table that translates each of your identifiers to a standardized gene symbol).
- Choose parameters such as the number of permutations, maximum set size, minimum set size, and enrichment statistic. We chose the “weighted” enrichment statistic, set the minimum set size to 5, and used the default parameters for everything else.
- Click the ‘Run’ button.
- The job should show up in the ‘GSEA Reports Processes’ panel. When it completes, click on the green ‘Success’, and an HTML report should show up.
- From here, the upregulated and downregulated enrichment results can be downloaded in TSV format.
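
The export and header-commenting steps above can also be done in one pass from Python, which avoids editing the file by hand. The following is a minimal sketch, not part of the SAM package: the helper names `format_rnk` and `write_rnk` and the header labels `gene` and `weight` are hypothetical, and the toy `weights` Series stands in for `sam.adata.var['weights']`.

```python
import pandas as pd


def format_rnk(weights: pd.Series) -> str:
    """Return the contents of a .rnk file: a '#'-commented header line,
    then one 'gene<TAB>weight' line per gene, highest weight first."""
    ranked = weights.dropna().sort_values(ascending=False)
    lines = ["#gene\tweight"]  # header labels are illustrative only
    lines += [f"{gene}\t{weight}" for gene, weight in ranked.items()]
    return "\n".join(lines) + "\n"


def write_rnk(weights: pd.Series, path: str) -> None:
    """Write the ranked list to `path` (hypothetical convenience wrapper)."""
    with open(path, "w", newline="\n") as handle:
        handle.write(format_rnk(weights))


# Toy weights standing in for sam.adata.var['weights'].
weights = pd.Series({"GeneA": 0.2, "GeneB": 0.9, "GeneC": 0.5})
rnk_text = format_rnk(weights)
print(rnk_text)
```

With a real SAM object, the call would be `write_rnk(sam.adata.var['weights'], 'filename.rnk')`, and the resulting file can be loaded into GSEA directly.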