The workflow of SQUAT assessment is shown in Fig. 1. The whole process takes sequencing reads and their assembly as input and generates both pre-assembly and post-assembly HTML reports to help users examine their data from different perspectives. To begin with, we randomly sample one million reads from the original dataset for a quick examination. Note that users can change the default sample size of ‘one mega’ (one million reads) or bypass the sampling process, as shown in the manual. The pre-assembly workflow is shown on the left side of Fig. 1. The quality statistics module takes the sampled reads as input and generates tables and distributions in the HTML report for evaluating base quality and read quality in detail. It also presents a pie chart at the top of the report showing the proportions of poor-, medium- and high-quality reads for an overall assessment.
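The random sampling step can be illustrated with a minimal sketch. The text does not say how SQUAT draws its sample, so reservoir sampling is used here purely as an assumption; the names `read_fastq` and `sample_reads` are hypothetical, and the parser assumes plain four-line FASTQ records.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator, quality string.
FastqRecord = Tuple[str, ...]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield FASTQ records, assuming exactly four lines per record."""
    block: List[str] = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == 4:
            yield tuple(block)
            block = []


def sample_reads(records: Iterable[FastqRecord],
                 k: int = 1_000_000,
                 seed: int = 0) -> List[FastqRecord]:
    """Draw up to k records uniformly at random in one pass.

    Reservoir sampling (Algorithm R) is an assumption made for this
    sketch, not SQUAT's documented method.
    """
    rng = random.Random(seed)
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = record
    return reservoir
```

A single pass keeps memory bounded by the sample size, which matters when the original dataset is much larger than the one million reads retained.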
Fig. 1. SQUAT assessment workflow
The post-assembly workflow, depicted in the middle and right parts of Fig. 1, first maps the sampled reads onto the input scaffolds of the genome assembly using an end-to-end aligner, BWA backtrack [16], and a local aligner, BWA MEM [17]. Then, the Analysis module 1) categorizes the reads into seven groups, as shown in Fig. 2 and Table 1 (described later in the sub-section Post-assembly analysis), and 2) generates the percentage of poorly-mapped reads by considering not only unmapped reads, but also reads with abnormal densities of substitutions and clips.
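As a rough illustration of the second step, the sketch below computes a percentage of poorly-mapped reads from SAM records. It is not SQUAT's implementation: the function names and both thresholds (`max_clip_frac`, `max_edit_frac`) are hypothetical, and the `NM` tag (edit distance, which also counts indels) stands in for substitution density.

```python
import re
from typing import Iterable

# Matches one CIGAR operation, e.g. "50S" or "100M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_poorly_mapped(sam_line: str,
                     max_clip_frac: float = 0.2,   # hypothetical threshold
                     max_edit_frac: float = 0.1    # hypothetical threshold
                     ) -> bool:
    """Flag a read as poorly mapped if it is unmapped, heavily clipped,
    or carries a high density of edits (NM used as a proxy)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 0x4 or cigar == "*":
        return True  # unmapped read
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    clipped = sum(n for n, op in ops if op in "SH")
    read_len = sum(n for n, op in ops if op in "MIS=XH")
    aligned = sum(n for n, op in ops if op in "M=X")
    if read_len == 0:
        return True
    edits = 0
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            edits = int(tag[5:])
    if clipped / read_len > max_clip_frac:
        return True
    return aligned > 0 and edits / aligned > max_edit_frac


def poorly_mapped_percentage(sam_lines: Iterable[str]) -> float:
    """Percentage of primary records that are poorly mapped."""
    total = poor = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        if int(line.split("\t")[1]) & 0x900:
            continue  # skip secondary and supplementary alignments
        total += 1
        poor += is_poorly_mapped(line)
    return 100.0 * poor / total if total else 0.0
```

Counting clipped and edit-dense reads alongside unmapped ones means a read that aligns only partially, or only with many mismatches, still contributes to the poorly-mapped percentage.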
Table 1. Summary of post-assembly read label tagging with icons
Fig. 2. Classification of reads by read mapping analysis. The descriptions and icons of these read labels are shown in Table 1.