Shotgun sequencing produced 2 × 47,868,586 paired-end reads (2 × 100 bp) with an insert size of 300–500 bp. The reads were further processed with CLC Genomics Workbench 11.0 (Qiagen, Valencia, CA, USA) as follows: (1) adapters were removed from all reads; (2) all reads were quality-trimmed; (3) reads were subsampled to a maximum average coverage of 100×; (4) reads were assembled de novo and the resulting contigs were scaffolded.
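The subsampling in step (3) follows from the usual depth estimate: average coverage is the total number of sequenced bases divided by the genome size. The sketch below illustrates that arithmetic only; it is not the CLC Genomics Workbench procedure. The genome size is a hypothetical placeholder (it is not given in the text), and the function names are illustrative.

```python
def average_coverage(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    total_bases = 2 * read_pairs * read_length  # two mates per pair
    return total_bases / genome_size


def sampling_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of read pairs to keep so average coverage does not exceed the target."""
    return min(1.0, target_coverage / current_coverage)


READ_PAIRS = 47_868_586    # from the text
READ_LENGTH = 100          # bp, from the text
GENOME_SIZE = 40_000_000   # bp, HYPOTHETICAL: not stated in the text

coverage = average_coverage(READ_PAIRS, READ_LENGTH, GENOME_SIZE)
fraction = sampling_fraction(coverage, target_coverage=100.0)
pairs_to_keep = int(READ_PAIRS * fraction)

print(f"estimated coverage: {coverage:.1f}x")
print(f"keep {fraction:.3f} of read pairs (about {pairs_to_keep:,})")
```

With a different genome size the fraction changes proportionally; if the estimated coverage is already at or below 100×, the function returns 1.0 and no reads are discarded.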
Structural and functional genome annotations were performed using the Funannotate pipeline v1.5.0 (https://github.com/nextgenusfs/funannotate).
The structural annotation step included: (1) repeat masking with the RepeatMasker package (http://www.repeatmasker.org/) using the RepBase repeat libraries [27]; (2) ab initio protein-coding gene prediction with self-trained GeneMark-ES [28] and with AUGUSTUS [29] trained on BUSCO 2.0 [30] gene models (Phanerochaete chrysosporium was selected as a closely related species); (3) ab initio tRNA-coding gene prediction with tRNAscan-SE [31]; (4) integration and filtering of the obtained gene models.
Functional annotation was performed with the Pfam [32], InterPro [33], eggNOG [34], dbCAN [35], MEROPS [36], antiSMASH [37], and BUSCO [30] databases. The prediction of transmembrane topologies and signal peptides was performed with Phobius [38] and SignalP [39], respectively.