While the basic network architecture we used is similar to the one proposed in [11], our training procedure differs substantially from that of dna-brnn.
Training, retraining and evaluation were performed on Linux-based computers. The hyperparameter optimization was run on an Intel® Xeon® Gold 6142 CPU with 12 inter-operation and 12 intra-operation threads, using Hyperopt with the Tree of Parzen Estimators algorithm [51]. Model performance is measured by the multi-class Matthews correlation coefficient, denoted by MCC_k, where k is the number of classes [52]. MCC_k is a single value characterizing a complete confusion table [52], even if the classes are very imbalanced [53]. In our application we have four different classes of repeats and the no-repeat class, so k is 5. We chose the model maximizing MCC_k; this model is termed the best model.
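As an illustration, the following Python sketch shows how such a search could be set up with Hyperopt's TPE algorithm and a multi-class MCC objective; the search space and the train_and_validate() helper are hypothetical placeholders and not part of DeepGRP's actual code.

```python
# Minimal sketch of a TPE-based hyperparameter search, assuming a Python/Hyperopt
# setup; train_and_validate() and the listed hyperparameters are hypothetical.
from hyperopt import fmin, tpe, hp, STATUS_OK
from sklearn.metrics import matthews_corrcoef

def objective(params):
    # Hypothetical helper: trains a model with the given hyperparameters and
    # returns true and predicted labels for the validation set
    # (5 classes: 4 repeat classes + "no repeat").
    y_true, y_pred = train_and_validate(**params)
    mcc = matthews_corrcoef(y_true, y_pred)  # multi-class MCC (MCC_k)
    # Hyperopt minimizes the objective, so return the negative MCC.
    return {"loss": -mcc, "status": STATUS_OK}

search_space = {
    "learning_rate": hp.loguniform("learning_rate", -9, -3),
    "units": hp.choice("units", [16, 32, 64]),
    "rho": hp.uniform("rho", 0.0, 1.0),
}

# Tree of Parzen Estimators (TPE) as the optimization algorithm [51].
best = fmin(fn=objective, space=search_space, algo=tpe.suggest, max_evals=100)
print("Best hyperparameters:", best)
```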
The retraining of the best model and all evaluations, including training and evaluation of both tools, were performed on an Intel® Core™ i7-5820K CPU using 10 threads. DeepGRP additionally utilized a GPU on an Nvidia GeForce GTX 960 graphics card using the XLA domain-specific compiler; see [54] for details. Training of DeepGRP required 6 to 15 minutes (median: 11 min) per model.
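The snippet below shows one generic way to enable XLA JIT compilation in TensorFlow; it is an assumption about the setup and does not reproduce DeepGRP's actual code.

```python
# Generic example of enabling XLA JIT compilation in TensorFlow; this is an
# assumption and not necessarily how DeepGRP enables XLA.
import tensorflow as tf

# Enable XLA auto-clustering for the whole program ...
tf.config.optimizer.set_jit(True)

# ... or compile an individual training step with XLA.
@tf.function(jit_compile=True)
def train_step(model, optimizer, x, y):
    with tf.GradientTape() as tape:
        loss = tf.keras.losses.sparse_categorical_crossentropy(
            y, model(x, training=True), from_logits=True
        )
        loss = tf.reduce_mean(loss)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```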
During the training, the composition of the batches is adjusted by a hyperparameter ρ ∈ [0,1]. A batch of size b is constructed such that it contains at least ⌈ρ·b⌉ sequence windows, each of which has at least one position annotated as a repetitive element. This is further restricted such that, from these ⌈ρ·b⌉ sequence windows and for the set R of all repetitive element classes, ⌈ρ·b/|R|⌉ sequence windows per repetitive element class are present in one batch. Therefore, repeats that are underrepresented in the training data are sampled more often during training, to account for the imbalanced occurrence of the different repeat classes.
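A minimal sketch of this batch composition is given below, assuming the training windows have been pre-grouped by repeat class; all identifiers (compose_batch, windows_by_class, background_windows) are illustrative and do not correspond to DeepGRP's implementation.

```python
# Sketch of class-balanced batch composition controlled by rho; names are
# illustrative, not DeepGRP's actual identifiers.
import math
import random

def compose_batch(windows_by_class, background_windows, batch_size, rho):
    """Build one batch in which at least ceil(rho * batch_size) windows contain
    annotated repeats, split evenly over the repeat classes."""
    n_repeat = math.ceil(rho * batch_size)
    per_class = math.ceil(n_repeat / len(windows_by_class))
    batch = []
    # Sample (with replacement) the same number of windows per repeat class,
    # so rare repeat classes are over-sampled relative to their frequency.
    for windows in windows_by_class.values():
        batch.extend(random.choices(windows, k=per_class))
    # Fill the remainder of the batch with windows that need not contain repeats.
    n_fill = max(batch_size - len(batch), 0)
    batch.extend(random.choices(background_windows, k=n_fill))
    random.shuffle(batch)
    return batch[:batch_size]
```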
DeepGRP was compared to dna-brnn [11], which is implemented in C using its own neural network framework. Both programs use, as far as possible, the same hyperparameter values. dna-brnn does not provide an interface for hyperparameter tuning. Moreover, several parameters and flags of dna-brnn are hard-coded or are ignored even if they have been specified on the command line. We concluded that hyperparameter tuning for dna-brnn would require modifications of its source code and the development of wrappers. As a consequence, we did not perform hyperparameter tuning for dna-brnn. The same hyperparameter values were used for training and prediction.
dna-brnn uses a fixed number of epochs and a user-defined random seed, whereas DeepGRP uses early stopping and a varying random seed. To the best of our knowledge, dna-brnn cannot use validation data, and its training cannot be resumed from previously saved states. Hence, it does not seem possible to apply early stopping in dna-brnn to prevent overfitting of its parameters. For DeepGRP we trained five models. Using different random seeds, we also trained five models for dna-brnn, in contrast to [11], in which only one model was trained.
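As an illustration of early stopping on validation data, a minimal Keras sketch is shown below; the monitored metric, patience value and the model/dataset objects are assumptions, not DeepGRP's actual settings.

```python
# Minimal sketch of early stopping with validation data in Keras; parameter
# values are illustrative assumptions.
import tensorflow as tf

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",        # stop when the validation loss stops improving
    patience=5,                # tolerated number of epochs without improvement
    restore_best_weights=True, # roll back to the best validation checkpoint
)

# model, train_ds and val_ds are assumed to be defined elsewhere.
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=200,                # upper bound; early stopping usually ends sooner
    callbacks=[early_stopping],
)
```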