RNA-seq data analysis

S. Marthandan; M. Baumgart; S. Priebe; M. Groth; J. Schaer; C. Kaether; R. Guthke; A. Cellerino; M. Platzer; S. Diekmann; P. Hemmerich




This method is extracted from research article: PLoS One, May 2016

Conserved Senescence Associated Genes and Pathways in Primary Human Fibroblasts Detected by RNA-Seq

DOI: 10.1371/journal.pone.0154531


Raw sequencing data were received in FASTQ format. Reads were mapped with TopHat 2.0.6 [55] against the human genome reference assembly GRCh37 (http://feb2012.archive.ensembl.org/). The resulting SAM alignment files were processed with the HTSeq Python framework, using the corresponding GTF gene annotation from the Ensembl database [56], to obtain gene counts. Gene counts were further processed in the R programming language [57] and normalized to Reads Per Kilobase of transcript per Million mapped reads (RPKM) values. To examine the variance and relationships in global gene expression across samples, several correlation measures were computed, including Spearman’s correlation of gene counts and Pearson’s correlation of log2 RPKM values. The resulting correlation values were visualized as multidimensional scaling (MDS) plots and heatmaps (S2 Fig).
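The normalization and correlation steps can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the toy counts and gene lengths, and the pseudocount of 1 added before the log2 transform are all assumptions made for illustration (the source does not state how zero values were handled).

```python
import math

def rpkm(counts, lengths_bp):
    """RPKM for one sample: reads per kilobase of transcript per million mapped reads."""
    total = sum(counts)  # total mapped reads assigned to genes in this sample
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]

def log2_values(values, pseudocount=1.0):
    """log2 transform; the pseudocount is an assumption, not taken from the source."""
    return [math.log2(v + pseudocount) for v in values]

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: four genes, two samples.
gene_lengths = [1000, 2000, 500, 4000]   # transcript lengths in bp
sample_a = [100, 400, 50, 450]           # raw gene counts
sample_b = [200, 800, 100, 900]          # same profile at twice the depth

rpkm_a = rpkm(sample_a, gene_lengths)
rpkm_b = rpkm(sample_b, gene_lengths)
rho_counts = spearman(sample_a, sample_b)
r_log_rpkm = pearson(log2_values(rpkm_a), log2_values(rpkm_b))
print(rho_counts, r_log_rpkm)
```

Because sample_b is sample_a sequenced at twice the depth, both samples yield the same RPKM values, which shows how RPKM removes the library-size effect. A pairwise matrix of such correlations, converted to distances (for example 1 minus the correlation), is the kind of input an MDS plot or heatmap would take.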

Subsequently, the Bioconductor packages DESeq [58] and edgeR [59] were used to identify differentially expressed genes (DEG). Both packages test for differential expression in digital gene expression data using a model based on the negative binomial distribution. Raw, non-normalized gene counts were used as input, since both packages apply their own internal normalization procedures. The resulting p-values were adjusted using the Benjamini and Hochberg procedure for controlling the false discovery rate (FDR) [60]. Genes with an adjusted p-value < 0.05 in both packages were classified as differentially expressed.
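The multiple-testing correction and the consensus rule can be sketched as follows. This is an illustrative Python version only: the analysis itself was run with DESeq and edgeR in R, and the function names, gene identifiers, and p-values below are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest so that monotonicity
    # can be enforced with a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

def consensus_deg(padj_first, padj_second, alpha=0.05):
    """Genes whose adjusted p-value is below alpha in both result sets."""
    return {
        gene
        for gene, padj in padj_first.items()
        if padj < alpha and padj_second.get(gene, 1.0) < alpha
    }

# Hypothetical raw p-values per gene from the two packages.
genes = ["geneA", "geneB", "geneC", "geneD"]
deseq_p = [0.001, 0.02, 0.4, 0.03]
edger_p = [0.002, 0.2, 0.5, 0.01]

deseq_padj = dict(zip(genes, bh_adjust(deseq_p)))
edger_padj = dict(zip(genes, bh_adjust(edger_p)))
deg = consensus_deg(deseq_padj, edger_padj)
print(sorted(deg))
```

Requiring significance in both packages is a conservative choice: geneB passes the threshold in the first hypothetical result set but not in the second, so it is excluded from the final list.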

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
