2.5.3 Deep learning method

A self-built long short-term memory network with an attention mechanism (ATT-LSTM) was used to evaluate disease severity. The architecture of the designed ATT-LSTM model is shown in Figure 2.

Figure 2. The architecture of the ATT-LSTM model.

The first block was the ATT block, which was realized by two dense layers:

$$Y_{ATT} = X \otimes f_{relu}\left(W_2 f_{relu}(W_1 X + b_1) + b_2\right)$$

where $Y_{ATT}$ is the output of the ATT block, $X$ denotes the input data, $W_1$, $W_2$ and $b_1$, $b_2$ represent the weights and biases of the two dense layers, $\otimes$ indicates matrix multiplication, and $f_{relu}$ is the ReLU activation function. The number of neurons in the first dense layer was set to 128, and that of the second dense layer was set to the number of input variables. The attention mechanism helps the model assign different weights to the input variables, making the network pay more attention to important information (Li et al., 2020). Thus, the ATT block could also be used to extract feature variables: the value of $f_{relu}(W_2 f_{relu}(W_1 X + b_1) + b_2)$ was defined as the ATT weight, and variables with larger weight values are more important. In this study, the important wavelengths were screened based on the ATT weight.
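A minimal sketch of the ATT block in MXNet Gluon is given below. The class name, the input shape of (batch, number of wavelengths), and the per-variable (element-wise) interpretation of the reweighting step are illustrative assumptions, not the authors' exact code.

```python
from mxnet.gluon import nn

class ATTBlock(nn.HybridBlock):
    """Two dense layers that produce one attention weight per input variable."""
    def __init__(self, n_inputs, **kwargs):
        super().__init__(**kwargs)
        self.fc1 = nn.Dense(128, activation='relu')       # first dense layer, 128 neurons
        self.fc2 = nn.Dense(n_inputs, activation='relu')  # second dense layer, one output per input variable

    def hybrid_forward(self, F, x):
        # ATT weight: f_relu(W2 f_relu(W1 X + b1) + b2)
        att_weight = self.fc2(self.fc1(x))
        # Reweight the input; larger weights mark more important wavelengths
        return x * att_weight
```

After training, the `att_weight` values can be read out to rank wavelengths by importance, which is how the wavelength screening described above would be performed.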

The second part was the LSTM block. It contained two LSTM units with 128 and 32 hidden units, respectively, and each LSTM unit was followed by a max pooling layer and a batch normalization layer. LSTM is a special recurrent neural network that stores long-term states by adding a memory cell. The cell can decide which states are forgotten or retained, thus mitigating the vanishing and exploding gradient problems of standard recurrent neural networks (Zhelev and Avresky, 2019). It is therefore well suited to processing time-series data. A detailed introduction to LSTM can be found at http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
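A corresponding Gluon sketch of the LSTM block follows. The sequence layout, the pooling size of 2, and the transposes needed to move between the LSTM's (batch, time, channel) layout and the pooling layer's (batch, channel, width) layout are assumptions the text does not specify.

```python
from mxnet.gluon import nn, rnn

class LSTMBlock(nn.HybridBlock):
    """Two LSTM units (128 and 32 hidden units), each followed by
    max pooling and batch normalization."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.lstm1 = rnn.LSTM(128, layout='NTC')  # first LSTM unit
        self.lstm2 = rnn.LSTM(32, layout='NTC')   # second LSTM unit
        self.pool1 = nn.MaxPool1D(pool_size=2)    # pool size assumed
        self.pool2 = nn.MaxPool1D(pool_size=2)
        self.bn1 = nn.BatchNorm()
        self.bn2 = nn.BatchNorm()

    def hybrid_forward(self, F, x):
        # x: (batch, time, features)
        y = self.lstm1(x)                   # (N, T, 128)
        y = F.transpose(y, axes=(0, 2, 1))  # to (N, C, W) for pooling
        y = self.bn1(self.pool1(y))         # max pooling + batch normalization
        y = F.transpose(y, axes=(0, 2, 1))  # back to (N, T, C)
        y = self.lstm2(y)                   # (N, T/2, 32)
        y = F.transpose(y, axes=(0, 2, 1))
        y = self.bn2(self.pool2(y))
        return F.flatten(y)
```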

The LSTM block was followed by a fully connected layer containing 128 neurons, and a dropout layer was used to prevent overfitting. Finally, an output layer predicted the category; in this study, the output was the disease severity, with six categories.
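Putting the pieces together, the full network could be assembled as sketched below. The number of input wavelengths (200), the dropout rate (0.5), and the treatment of the reweighted spectrum as a one-feature sequence for the LSTM are placeholders, since the text does not state them.

```python
from mxnet.gluon import nn

n_bands = 200  # hypothetical number of input wavelengths

net = nn.HybridSequential()
net.add(
    ATTBlock(n_bands),                                       # attention reweighting
    nn.HybridLambda(lambda F, x: F.expand_dims(x, axis=2)),  # (N, bands) -> (N, bands, 1) sequence
    LSTMBlock(),                                             # two LSTM units + pooling + batch norm
    nn.Dense(128, activation='relu'),                        # fully connected layer, 128 neurons
    nn.Dropout(0.5),                                         # dropout rate assumed
    nn.Dense(6),                                             # six disease-severity categories
)
```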

The deep learning model was implemented in the MXNET framework. The Softmax cross-entropy loss function and the adaptive moment estimation (Adam) optimizer were applied to train the model. In the training phase, the batch size was set to 20, and a dynamic learning rate was used: a relatively large learning rate of 0.001 was set to speed up training for the first 500 epochs, and it was then reduced to 0.0001 for the next 300 epochs.
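Under the stated settings, the training loop could be sketched as follows. Here `train_data` is a hypothetical Gluon DataLoader of (spectrum, label) batches, and the Xavier initializer is an assumption, not stated in the text.

```python
import mxnet as mx
from mxnet import autograd, gluon

net.initialize(mx.init.Xavier())  # initializer assumed
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.001})

for epoch in range(800):                    # 500 + 300 epochs
    if epoch == 500:
        trainer.set_learning_rate(0.0001)   # reduce the learning rate after 500 epochs
    for X, y in train_data:                 # hypothetical DataLoader of (spectrum, label)
        with autograd.record():
            loss = loss_fn(net(X), y)
        loss.backward()
        trainer.step(batch_size=20)         # batch size of 20
```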
