Simulation

Mehrnush Forutan; Elizabeth Ross; Amanda J. Chamberlain; Loan Nguyen; Brett Mason; Stephen Moore; Josie B. Garner; Ruidong Xiang; Ben J. Hayes

This method is extracted from research article: Commun Biol, Jul 2021

Evolution of tissue and developmental specificity of transcription start sites in Bos taurus indicus

DOI: 10.1038/s42003-021-02340-6

Selecting an appropriate alignment tool for CAGE-Seq data can be difficult because the reads are short. We therefore generated simulated single-end sequence datasets with a read length of 27 bp, similar to the length of the trimmed CAGE-Seq reads, and compared the alignment quality of BWA and Bowtie2⁵² (version 2.3.4.3). Simulated datasets were generated from chr1 of the Bos taurus genome (GenBank: ARS-UCD1.2) using dwgsim⁵³, with the default per-base sequencing error rate of 0.02. Three datasets, each comprising 20 samples, were generated with average sequencing depths of 10–25x (high), 5–10x (medium), and 1–5x (low). The sequencing coverage of each sample in each dataset was drawn at random within that dataset's coverage bounds. All simulated reads were mapped to chr1 of the Bos taurus genome assembly.
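The sampling design above can be sketched in Python. This is an illustration only, not the authors' script: uniform sampling within the bounds is an assumption (the text says only that coverage was random), the reference file name `chr1.fa` and the output prefixes are hypothetical, and the dwgsim flags (`-e`, `-1`, `-2`, `-C`) are given as we understand them from the dwgsim usage message and should be checked against the installed version.

```python
import random

# Coverage bounds (x) for the three datasets, as stated in the text.
COVERAGE_BOUNDS = {"high": (10.0, 25.0), "medium": (5.0, 10.0), "low": (1.0, 5.0)}
SAMPLES_PER_DATASET = 20
READ_LENGTH = 27    # bp, matching the trimmed CAGE-Seq reads
ERROR_RATE = 0.02   # per-base sequencing error rate


def sample_coverages(seed=0):
    """Draw one coverage value per sample within each dataset's bounds.

    Uniform sampling is an assumption; the source only says "random".
    """
    rng = random.Random(seed)
    return {
        name: [rng.uniform(lo, hi) for _ in range(SAMPLES_PER_DATASET)]
        for name, (lo, hi) in COVERAGE_BOUNDS.items()
    }


def dwgsim_command(reference, prefix, coverage):
    """Build a hypothetical dwgsim argument list for one single-end sample."""
    return [
        "dwgsim",
        "-e", str(ERROR_RATE),     # error rate of the first read
        "-1", str(READ_LENGTH),    # length of the first read
        "-2", "0",                 # no second read, i.e. single-end
        "-C", f"{coverage:.2f}",   # mean coverage
        reference,
        prefix,
    ]


coverages = sample_coverages(seed=42)
commands = [
    dwgsim_command("chr1.fa", f"{name}_sample{i:02d}", cov)
    for name, covs in coverages.items()
    for i, cov in enumerate(covs, start=1)
]
```

Each entry of `commands` could be passed to `subprocess.run` once the flags have been verified; fixing the seed keeps the sampled coverages reproducible.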

The parameters used for running BWA were the same as those used for the real data. For Bowtie2, the default parameters were used. Two standard performance measures, precision and recall, were used to evaluate the aligners. Recall (sensitivity) is the number of correctly aligned reads divided by the total number of reads that should have been aligned, and precision is the number of correctly aligned reads divided by the total number of aligned reads. The measures were calculated using the dwgsim_eval program of dwgsim⁵³. To assess the overall performance of the two aligners, the area under the precision-recall curve (PR-AUC) was computed. PR-AUC ranges between 0 and 1, with a larger area indicating better performance. The overall score under our evaluation criteria was slightly higher for BWA than for Bowtie2 (0.41 ± 0.0 vs. 0.32 ± 0.0008), indicating higher accuracy with BWA for the sequencing parameters used.
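The two measures and the area under the curve can be written out as a short Python sketch. It assumes the curve is traced by relaxing a mapping-quality threshold and that the area is taken with the trapezoidal rule; the text does not state how the curve points were obtained or integrated, and the counts below are made up for illustration rather than taken from the study.

```python
def precision_recall(correct, aligned, total):
    """Return (precision, recall) from read counts.

    precision = correct / aligned
    recall    = correct / total reads that should have been aligned
    """
    precision = correct / aligned if aligned else 1.0  # convention when nothing is aligned
    recall = correct / total if total else 0.0
    return precision, recall


def pr_auc(points):
    """Trapezoidal area under a precision-recall curve.

    points: iterable of (recall, precision) pairs.
    """
    pts = sorted(points)
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


# Made-up cumulative counts at progressively looser mapping-quality
# thresholds: (correctly aligned, aligned).
TOTAL_READS = 1000
counts = [(0, 0), (400, 400), (700, 750), (800, 1000)]

# precision_recall returns (precision, recall); reverse to (recall, precision).
curve = [precision_recall(c, a, TOTAL_READS)[::-1] for c, a in counts]
auc = pr_auc(curve)
```

With this convention the area covers only the recall range the aligner actually reaches, so an aligner that leaves many reads unaligned or misaligned scores well below 1 even when its precision is high.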

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
