2.13. RNA-seq data processing

Kazuya Toriumi; Stefano Berto; Shin Koike; Noriyoshi Usui; Takashi Dan; Kazuhiro Suzuki; Mitsuhiro Miyashita; Yasue Horiuchi; Akane Yoshikawa; Mai Asakura; Kenichiro Nagahama; Hsiao-Chun Lin; Yuki Sugaya; Takaki Watanabe; Masanobu Kano; Yuki Ogasawara; Toshio Miyata; Masanari Itokawa; Genevieve Konopka; Makoto Arai

This method is extracted from research article: Redox Biol, Jun 2021

Combined glyoxalase 1 dysfunction and vitamin B6 deficiency in a schizophrenia model system causes mitochondrial dysfunction in the prefrontal cortex

DOI: 10.1016/j.redox.2021.102057

An in-house pipeline was established for RNA-seq data processing and analysis. Reads were aligned to the mouse mm10 reference genome using STAR 2.5.2b [18] with the following parameters: --outFilterMultimapNmax 10 --alignSJoverhangMin 10 --alignSJDBoverhangMin 1 --outFilterMismatchNmax 3 --twopassMode Basic. The GENCODE annotation for mm10 (version vM11) was used as the reference to build the STAR indexes and to annotate alignments [19]. For each sample, a BAM file was produced containing mapped and unmapped reads, including reads spanning splice junctions. Secondary alignments and multi-mapped reads were then removed using in-house scripts, so that only uniquely mapped reads were retained for further analysis.
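The in-house scripts themselves are not given in the text, so the following is only a minimal Python sketch of how the alignment call and the unique-read filter could look. The alignment parameters are the ones listed above; the other STAR options (--runThreadN, --genomeDir, --readFilesIn, --outFileNamePrefix, --outSAMtype, --outSAMunmapped), the function names, and the decision to define "uniquely mapped" as a mapped, non-secondary record with NH:i:1 are assumptions made for illustration. The filter works on SAM text lines, for example as produced by `samtools view -h`.

```python
def star_command(genome_dir, fastq_files, out_prefix, threads=8):
    """Build a STAR argument list using the alignment parameters from the text.

    Everything except the five alignment parameters is an assumption.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),            # assumption
        "--genomeDir", genome_dir,               # assumption: prebuilt mm10/vM11 index
        "--readFilesIn", *fastq_files,           # assumption
        "--outFileNamePrefix", out_prefix,       # assumption
        "--outSAMtype", "BAM", "Unsorted",       # assumption: BAM output
        "--outSAMunmapped", "Within",            # assumption: keep unmapped reads in the BAM
        # Parameters stated in the method:
        "--outFilterMultimapNmax", "10",
        "--alignSJoverhangMin", "10",
        "--alignSJDBoverhangMin", "1",
        "--outFilterMismatchNmax", "3",
        "--twopassMode", "Basic",
    ]


def is_unique_primary(sam_line):
    """Return True for a mapped, non-secondary alignment reported at one locus."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:      # read is unmapped
        return False
    if flag & 0x100:    # secondary alignment
        return False
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            # NH is the number of reported alignments for this read
            return int(tag[len("NH:i:"):]) == 1
    return False        # no NH tag: treated conservatively as not unique


def filter_unique(sam_lines):
    """Yield header lines and uniquely mapped primary alignments only."""
    for line in sam_lines:
        if line.startswith("@") or is_unique_primary(line):
            yield line
```

The list returned by star_command can be passed to subprocess.run, and filter_unique can be applied to any iterable of SAM lines before converting the result back to BAM.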

Overall quality control metrics were computed with RSeQC using the provided UCSC mm10 gene model [20]. These metrics include the number of reads remaining after multiple-step filtering, ribosomal RNA read depletion, and the number of reads mapped to exons, UTRs, and intronic regions.

Gene-level expression was calculated using HTSeq version 0.6.0 in intersection-strict mode over exonic regions [21]. Counts were based on the protein-coding gene annotation from the mm10 GENCODE GTF file (version vM11). Counts per million reads (CPM) were calculated using edgeR [22]. To retain only expressed genes, we applied a “by condition” log2(CPM + 1) cutoff: a gene was considered expressed if log2(CPM + 1) ≥ 0.5 in all four replicates of a given condition (WT/VB6(+), WT/VB6(−), KO/VB6(+), KO/VB6(−)). In total, 13,428 protein-coding genes were detected as expressed in our data, and these genes were used for the differential expression and co-expression analyses.
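The expression filter can be illustrated with a small pure-Python sketch. It is not the edgeR implementation: CPM is computed here from raw library sizes only (no normalization factors), the function names and data layout are hypothetical, and "a given condition" is read as "at least one condition", which is an interpretation of the text and not something it states explicitly.

```python
import math


def cpm(counts):
    """Counts per million from raw library sizes.

    counts: dict mapping gene -> list of raw counts, one per sample.
    Assumes every sample has a non-zero total count.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def expressed_genes(counts, conditions, cutoff=0.5):
    """Keep genes with log2(CPM + 1) >= cutoff in all replicates of
    at least one condition (assumed reading of the "by condition" rule).

    conditions: list of condition labels, one per sample column.
    """
    # Group sample column indices by condition label
    groups = {}
    for j, cond in enumerate(conditions):
        groups.setdefault(cond, []).append(j)

    kept = []
    for gene, row in cpm(counts).items():
        logged = [math.log2(v + 1) for v in row]
        if any(all(logged[j] >= cutoff for j in idx) for idx in groups.values()):
            kept.append(gene)
    return kept


# Toy data with two replicates per condition (the study used four)
toy_counts = {
    "GeneA": [1000, 1200, 900, 1100],  # expressed everywhere
    "GeneB": [0, 0, 50, 60],           # expressed in one condition only
    "GeneC": [5, 0, 0, 7],             # a zero in every condition
}
toy_conditions = ["WT/VB6(+)", "WT/VB6(+)", "KO/VB6(-)", "KO/VB6(-)"]
print(expressed_genes(toy_counts, toy_conditions))  # ['GeneA', 'GeneB']
```

Under this reading, a gene that is silent in three conditions but consistently detected in the fourth is still retained, which keeps condition-specific genes available for the downstream differential expression analysis.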

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
