We mapped each replicate of directional paired-end RNA-seq data to the human reference genome (hg19/GRCh37) using TopHat v2.0.10 [59, 60] before assembling transcripts using both Cufflinks [61] and Scripture [62]. The TopHat settings were as follows:
tophat -p 8 --library-type fr-firststrand --mate-inner-dist 50 --mate-std-dev 50 --microexon-search --GTF genes.gtf -o <output_folder> <index_of_reference_genome> Reads_end1.fastq Reads_end2.fastq
The reference gene annotation in GTF format (genes.gtf) was downloaded from the University of California, Santa Cruz (UCSC) Genome Browser [63]. We then assembled transcripts with Cufflinks, using the TopHat output BAM file as input and the following settings:
cufflinks -p 8 --max-bundle-frags 100000000 --library-type fr-firststrand --frag-bias-correct --multi-read-correct -o <output_folder> <tophat_output_bam_file>
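The --max-bundle-frags option was set to 100,000,000 so that loci of highly expressed genes, which Cufflinks skips when the number of fragments in a bundle exceeds this limit, would be included in the output.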
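The paired-end mapping and assembly steps above could be scripted per replicate roughly as follows. This is a minimal sketch, not the authors' pipeline: the replicate names (rep1, rep2), the file naming scheme, the INDEX and GENOME_FA paths, and the DRY_RUN switch are all hypothetical. The Cufflinks documentation describes --frag-bias-correct as taking a genome FASTA file; the command in the text does not show one, so GENOME_FA here is only a placeholder. By default the script prints the commands without running them.

```shell
#!/bin/sh
# Sketch: print (or run) the TopHat and Cufflinks commands for each replicate.
# Hypothetical names: rep1/rep2, INDEX, GENOME_FA, DRY_RUN, output folder names.
set -eu

DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands (default)
INDEX="${INDEX:-hg19_index}"       # placeholder Bowtie index prefix
GENOME_FA="${GENOME_FA:-genome.fa}" # placeholder FASTA for bias correction

run() {
  # Print the command line; execute it only when DRY_RUN=0.
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

map_and_assemble() {
  # $1 = replicate name; reads are assumed to be <name>_end1/_end2.fastq
  run tophat -p 8 --library-type fr-firststrand \
    --mate-inner-dist 50 --mate-std-dev 50 --microexon-search \
    --GTF genes.gtf -o "tophat_$1" "$INDEX" \
    "$1_end1.fastq" "$1_end2.fastq"
  # TopHat writes its alignments to accepted_hits.bam in the output folder.
  run cufflinks -p 8 --max-bundle-frags 100000000 \
    --library-type fr-firststrand --frag-bias-correct "$GENOME_FA" \
    --multi-read-correct -o "cufflinks_$1" "tophat_$1/accepted_hits.bam"
}

for rep in rep1 rep2; do
  map_and_assemble "$rep"
done
```

Setting DRY_RUN=0 in the environment would execute the commands instead of only printing them, which requires TopHat and Cufflinks on the PATH.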
For analysis in Scripture, we used the following TopHat settings:
tophat -p 4 --microexon-search --GTF genes.gtf -o <output_folder> <index_of_reference_genome> <Reads_one_end.fastq>
We used Scripture (beta2 version) to assemble transcripts, following the protocol for transcript assembly (http://www.broadinstitute.org/software/scripture/). All transcripts assembled by Cufflinks and/or Scripture were then merged into a single list with Cuffmerge [61].
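The merging step might look like the sketch below. The text does not give the Cuffmerge options that were used, so the thread count, the output folder name (merged_asm), the list file name, and the DRY_RUN switch are assumptions. Cuffmerge takes a text file listing one GTF path per line, so the sketch also assumes the Scripture assemblies are already available in GTF format.

```shell
#!/bin/sh
# Sketch: build a GTF manifest and print (or run) a cuffmerge command.
# Hypothetical names: DRY_RUN, merged_asm, and all file paths passed in.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command (default)

run() {
  # Print the command line; execute it only when DRY_RUN=0.
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

write_assembly_list() {
  # $1 = manifest file to write; remaining arguments = GTF paths, one per line
  list="$1"
  shift
  printf '%s\n' "$@" > "$list"
}

merge_assemblies() {
  # $1 = manifest file listing the assemblies to merge
  run cuffmerge -p 8 -o merged_asm "$1"
}
```

For example, `write_assembly_list assemblies.txt cufflinks_rep1/transcripts.gtf scripture_rep1.gtf` followed by `merge_assemblies assemblies.txt` would print the corresponding cuffmerge command; the Scripture file name is illustrative only.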