Data Preprocessing

Maria Angela Diroma; Alessandra Modi; Martina Lari; Luca Sineo; David Caramelli; Stefania Vai



This method is extracted from research article: Front Genet, Feb 2021

New Insights Into Mitochondrial DNA Reconstruction and Variant Detection in Ancient Samples

DOI: 10.3389/fgene.2021.619950


The complete bioinformatics pipeline is shown in Figure 1. Guidelines for all command-line tools used are provided as Supplementary Data. After a quality check with FastQC (RRID:SCR_014583, v0.11.7), paired-end sequencing data in FASTQ format were first merged using the Clip&Merge function (v1.7.6) of the EAGER software (v1.92.37) (Peltzer et al., 2016), which also removed adapters from both paired- and single-end reads. Reads shorter than 30 nucleotides were discarded (-l 30), and the minimum base quality for quality trimming was set to 30 (-q 30). Reads were also clipped when as little as one nucleotide aligned with the adapter sequence (-m 1).
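The three thresholds described above can be illustrated with a minimal Python sketch. It is not the Clip&Merge implementation. It assumes exact-match adapter detection at the 3' end, quality trimming from the 3' end only, and Phred+33 quality encoding. The function names and the adapter sequence in the usage note are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple

MIN_LENGTH = 30           # mirrors "-l 30": discard reads shorter than this
MIN_BASE_QUAL = 30        # mirrors "-q 30": quality-trimming threshold
MIN_ADAPTER_OVERLAP = 1   # mirrors "-m 1": clip on a 1-nt adapter match
PHRED_OFFSET = 33         # assumption: Phred+33 encoded FASTQ qualities


def clip_adapter(seq: str, adapter: str,
                 min_overlap: int = MIN_ADAPTER_OVERLAP) -> int:
    """Return how many leading bases to keep after 3' adapter removal.

    Simplification: only exact matches are considered.
    """
    full = seq.find(adapter)
    if full != -1:
        return full  # the whole adapter occurs inside the read
    # Otherwise look for the longest adapter prefix that ends the read.
    for overlap in range(min(len(adapter) - 1, len(seq)), min_overlap - 1, -1):
        if overlap > 0 and seq.endswith(adapter[:overlap]):
            return len(seq) - overlap
    return len(seq)


def trim_low_quality_tail(qual: str, min_qual: int = MIN_BASE_QUAL,
                          offset: int = PHRED_OFFSET) -> int:
    """Return how many leading bases to keep after 3' quality trimming."""
    keep = len(qual)
    while keep > 0 and ord(qual[keep - 1]) - offset < min_qual:
        keep -= 1
    return keep


def preprocess_read(seq: str, qual: str,
                    adapter: str) -> Optional[Tuple[str, str]]:
    """Clip the adapter, trim the low-quality tail, then apply the length filter."""
    keep = clip_adapter(seq, adapter)
    seq, qual = seq[:keep], qual[:keep]
    keep = trim_low_quality_tail(qual)
    seq, qual = seq[:keep], qual[:keep]
    if len(seq) < MIN_LENGTH:
        return None  # read is discarded
    return seq, qual
```

For example, `preprocess_read("ACGT" * 10 + "AGATC", "I" * 38 + "#" * 7, "AGATCGGAAGAGC")` clips the five adapter bases, trims two low-quality bases, and keeps a 38-nt read. Note that with a minimum overlap of 1, any read that merely ends in the first adapter base loses that base, which is the aggressive clipping behaviour the text describes.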

Figure 1. Computational pipeline for ancient mitochondrial DNA (mtDNA) analysis. Our computational pipeline comprises five main steps: (1) read alignment, with preprocessing and postprocessing; (2) contamination analysis with schmutzi and consensus sequence assembly; (3) variant calling with GATK Mutect2, variant filtering, and consensus sequence assembly; (4) haplogroup prediction; and (5) variant annotation. Alignment to the revised Cambridge Reference Sequence (rCRS) was required to produce suitable input for schmutzi, whereas mtDNA reads aligned to the whole genome (hg19 with rCRS as the mitochondrial reference sequence) were used for variant calling. AD10, minimum variant allele depth = 10; AF50, minimum allele fraction = 50%; RD10, minimum reference allele depth = 10.
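The threshold abbreviations defined in the caption can be expressed as simple per-site checks. The sketch below is only an illustration: the `SiteCounts` class and `check_thresholds` function are hypothetical, the allele fraction is assumed to be the variant depth divided by the sum of reference and variant depths, and each threshold is reported separately because this excerpt does not state how the filters are combined.

```python
from dataclasses import dataclass
from typing import Dict

MIN_ALT_DEPTH = 10          # AD10: minimum variant allele depth
MIN_ALLELE_FRACTION = 0.5   # AF50: minimum allele fraction (50%)
MIN_REF_DEPTH = 10          # RD10: minimum reference allele depth


@dataclass
class SiteCounts:
    """Read support at one mtDNA position (hypothetical container)."""
    ref_depth: int
    alt_depth: int

    @property
    def allele_fraction(self) -> float:
        # Assumption: fraction = variant reads / (reference + variant reads).
        total = self.ref_depth + self.alt_depth
        return self.alt_depth / total if total else 0.0


def check_thresholds(site: SiteCounts) -> Dict[str, bool]:
    """Report, independently, which of the three thresholds a site meets."""
    return {
        "AD10": site.alt_depth >= MIN_ALT_DEPTH,
        "AF50": site.allele_fraction >= MIN_ALLELE_FRACTION,
        "RD10": site.ref_depth >= MIN_REF_DEPTH,
    }
```

A site with 2 reference reads and 18 variant reads would meet AD10 and AF50 but not RD10, whereas a site with 12 reference reads and 4 variant reads would meet only RD10.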

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
