Transposable elements characterization

Alexandre Freitas da Silva
Filipe Zimmer Dezordi
Elgion Lucio Silva Loreto
Gabriel Luz Wallau

The characterization and evolutionary study of wasp TEs were performed following the pipeline shown in Additional file 1.

We used two complementary approaches to characterize the mobilome from raw Illumina reads.

I) Raw reads were used as input for the RepeatExplorer (RE) analysis pipeline with default parameters, and the dataset of each wasp was analyzed independently. RepeatExplorer clusters raw reads through an all-to-all similarity comparison, building a graph in which each group of reads corresponds to a repetitive element [14]. It then annotates the reads of each assembled cluster using RepeatMasker (http://www.repeatmasker.org) [15] against the Repbase database [32]. We then sought to characterize the top clusters (clusters representing more than 0.01% of the reads used) in which the majority of reads had a BLAST hit to a known Repbase TE. The contigs of these top clusters were reassembled using CAP3 [33] with the following parameters (-a 20 -b 20 -c 12 -d 200 -e 30 -f 20 -g 6 -m 2 -n 5 -p 80 -r 1 -s 900 -t 300 -u 3 -v 2 -o 40), as used by others [17, 34] (Additional file 1).

II) dnaPipeTE [10] was run with two Trinity iterations and varying numbers of reads to evaluate its performance and find the best parameter set. The final parameters were as follows: -sample_size 14,000,000 (the maximum number of reads allowed with two Trinity iterations, given that we had around 28 million reads for each wasp species), -sample_number 2 and -RM_t 0.5.

These two approaches were used to estimate the proportion of each TE class and superfamily in the two genomes (Additional file 1).
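The top-cluster selection described in approach I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary keys and the example counts are hypothetical and do not come from RepeatExplorer output, and "majority" is interpreted here as more than half of a cluster's reads.

```python
def select_top_clusters(clusters, total_reads, min_fraction=0.0001):
    """Return IDs of clusters that hold more than min_fraction of all reads
    (0.0001 = 0.01%) and whose reads mostly hit a known TE.

    Assumption: "majority" means more than half of the cluster's reads.
    """
    selected = []
    for cluster in clusters:
        share = cluster["reads"] / total_reads
        te_majority = cluster["te_hit_reads"] > cluster["reads"] / 2
        if share > min_fraction and te_majority:
            selected.append(cluster["id"])
    return selected


# Hypothetical read counts, for illustration only.
example_clusters = [
    {"id": "CL1", "reads": 5000, "te_hit_reads": 4000},  # large, mostly TE hits
    {"id": "CL2", "reads": 5000, "te_hit_reads": 1000},  # large, few TE hits
    {"id": "CL3", "reads": 50, "te_hit_reads": 50},      # below the 0.01% cutoff
]

top = select_top_clusters(example_clusters, total_reads=1_000_000)
print(top)
```

With these made-up counts only CL1 passes both filters: CL2 lacks a majority of TE hits and CL3 falls below the read-share threshold.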

Additionally, we performed a de novo characterization using RepeatScout 1.0.5 (RS) [35] on the original assembly obtained from Ortiz et al. 2015. Finally, we clustered the three TE libraries generated by these programs plus the Ortiz et al. 2015 TE library, characterized by BLASTn against Repbase, using CD-HIT-EST 4.6 [36] with parameters (-c 0.8 -G 0 -aS 0.8 -g 1 -M 50000 -T 8 -n 5) to generate the final TE dataset for each wasp (Additional file 1).
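As a rough sketch, the final clustering step could be assembled as a command line from Python as shown below. The clustering parameters are those listed above; the input and output file names are hypothetical placeholders, and the snippet assumes the four TE libraries have already been concatenated into one FASTA file. It only builds and prints the command, it does not run CD-HIT-EST.

```python
import shlex


def cd_hit_est_command(merged_fasta, output_prefix):
    """Build the cd-hit-est argument list with the parameters from the text.

    -i and -o are the standard CD-HIT input and output options; the file
    names passed in are placeholders chosen for this example.
    """
    return [
        "cd-hit-est",
        "-i", merged_fasta,    # concatenated TE libraries (assumed file)
        "-o", output_prefix,   # clustered, non-redundant output
        "-c", "0.8",           # sequence identity threshold
        "-G", "0",             # local rather than global identity
        "-aS", "0.8",          # alignment coverage of the shorter sequence
        "-g", "1",             # assign each sequence to its most similar cluster
        "-M", "50000",         # memory limit in MB
        "-T", "8",             # number of threads
        "-n", "5",             # word length
    ]


cmd = cd_hit_est_command("merged_TE_libraries.fasta", "final_TE_dataset.fasta")
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` once per wasp species with the appropriate file names.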
