2.5. Alignment and Transcript Assembly

Ling Kui; Min Li; Xiaonan Yang; Ling Yang; Qinghua Kong; Yunbing Pan; Zetan Xu; Shouling Wang; Dandan Mo; Yang Dong; Yao Liu; Jianhua Miao

This method is extracted from research article: Biomed Res Int, Aug 2021

High-Throughput Screen of Natural Compounds and Biomarkers for NSCLC Treatment by Differential Expression and Weighted Gene Coexpression Network Analysis (WGCNA)

DOI: 10.1155/2021/5955343

After quality control, the general RNA-seq analysis was carried out in four main steps: (1) aligning the reads to the reference genome; (2) assembling the alignments into full-length transcripts; (3) quantifying the expression of genes and transcripts; and (4) calculating the differential expression of all genes between experimental conditions. The “new Tuxedo” suite, comprising HISAT, StringTie, and Ballgown, was used for this process. HISAT [40] aligned the RNA-seq reads to the genome, and StringTie [41] assembled the alignments into transcripts and reconstructed isoforms to estimate gene expression. Ballgown [42] took the StringTie assembly results to calculate gene expression, reported as FPKM (fragments per kilobase of transcript per million mapped fragments). The input data were generated on the BGISEQ-500 instrument; the pipeline outputs included the assembled transcripts, gene expression values (FPKM), the list of differentially expressed genes (DEGs), and the merged statistical results. The detailed steps are shown in Table 2.
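To make the FPKM unit concrete, the following is a minimal sketch of the standard FPKM definition. It is not the internal implementation of StringTie or Ballgown, and the function name, argument names, and numbers are hypothetical values chosen for illustration only.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Standard FPKM definition (illustrative helper, not a tool's API).

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2000, 20_000_000)
print(example_value)
```

Because the value is normalized by both transcript length and library size, it can be compared across transcripts of different lengths and across samples of different sequencing depth, within the usual limits of FPKM.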

Table 2. Detailed information on the RNA-seq analysis pipeline.

The table lists the analysis steps, software, and main scripts in our pipeline, starting from the input FASTQ files produced by sequencing and ending with the candidate medicines and genes for NSCLC research.
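The alignment, assembly, and quantification steps can be sketched as command lines in the style of the published "new Tuxedo" workflow. This is an illustration, not the authors' scripts. It rests on several assumptions that the text does not state: the `hisat2` executable (HISAT2), paired-end reads, a `samtools sort` step, and all file names, sample names, and thread counts. The code only builds and prints the commands; it does not run them.

```python
import shlex


def per_sample_commands(sample, reads_1, reads_2, index_prefix, annotation_gtf, threads=4):
    """Build alignment and assembly commands for one sample (nothing is executed)."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # 1. Spliced alignment; --dta tailors alignments for transcript assemblers.
        ["hisat2", "-p", str(threads), "--dta", "-x", index_prefix,
         "-1", reads_1, "-2", reads_2, "-S", sam],
        # 2. Coordinate-sort the alignments (assumed step; StringTie needs sorted input).
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # 3. Assemble transcripts, guided by the reference annotation.
        ["stringtie", bam, "-p", str(threads), "-G", annotation_gtf,
         "-o", f"{sample}.gtf", "-l", sample],
    ]


def merge_command(gtf_list_file, annotation_gtf, merged_gtf="merged.gtf", threads=4):
    """Merge the per-sample assemblies into one non-redundant transcript set."""
    return ["stringtie", "--merge", "-p", str(threads), "-G", annotation_gtf,
            "-o", merged_gtf, gtf_list_file]


def quantify_command(sample, merged_gtf="merged.gtf", threads=4):
    """Re-estimate abundances (-e) and write Ballgown input tables (-B)."""
    return ["stringtie", "-e", "-B", "-p", str(threads), "-G", merged_gtf,
            "-o", f"ballgown/{sample}/{sample}.gtf", f"{sample}.sorted.bam"]


if __name__ == "__main__":
    # Hypothetical sample and file names, for illustration only.
    commands = per_sample_commands("sample_1", "sample_1_1.fq.gz", "sample_1_2.fq.gz",
                                   "genome_index", "reference.gtf")
    commands.append(merge_command("mergelist.txt", "reference.gtf"))
    commands.append(quantify_command("sample_1"))
    for command in commands:
        print(shlex.join(command))
```

The differential expression step itself is done with Ballgown in R on the `ballgown/` output directory and is not shown here.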

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
