Long short-term memory (LSTM) is a specific type of recurrent neural network that models dependencies between elements in a sequence through recurrent connections. Here, we use a forward LSTM to compute a hidden state $\overrightarrow{h_t} \in \mathbb{R}^{d_2}$ of the sequence $X$ from left to right at the $t$-th time step, and a backward LSTM to compute a hidden state $\overleftarrow{h_t} \in \mathbb{R}^{d_2}$ of the same sequence in reverse, where $d_2$ is the dimension of the hidden state. The two hidden states are then concatenated to form the final output of the BiLSTM layer at the $t$-th time step, $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}] \in \mathbb{R}^{2d_2}$.
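As a minimal sketch of this step (assuming a PyTorch implementation; the dimensions `d1` and `d2` and all variable names are illustrative, not taken from the original):

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the original)
d1, d2 = 100, 128   # word-embedding size, per-direction hidden size

# bidirectional=True runs a forward and a backward LSTM and
# concatenates their hidden states at every time step
bilstm = nn.LSTM(input_size=d1, hidden_size=d2,
                 batch_first=True, bidirectional=True)

X = torch.randn(1, 10, d1)   # a batch with one sequence of 10 words
H, _ = bilstm(X)             # H has shape (1, 10, 2 * d2)
# H[:, t, :] is the concatenation [forward h_t ; backward h_t]
# for the t-th time step, i.e. the BiLSTM output described above
```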
After that, the output of the BiLSTM is fed to a two-layer fully-connected neural network (FC) with tanh activation to predict a confidence score for each possible label of the word, which can be written as follows:
$$s_t = W_2 \tanh(W_1 h_t + b_1) + b_2,$$
where $W_1$, $b_1$, $W_2$, and $b_2$ are the parameters that need to be trained, and $k$ is the number of distinct labels, so that $s_t \in \mathbb{R}^{k}$.
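The scoring head can be sketched in the same way (again a hedged PyTorch sketch; the intermediate width `d3` and the label count `k` are assumptions for illustration):

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the original)
d2, d3, k = 128, 64, 9   # per-direction hidden size, FC width, label count

# Two-layer FC with tanh: s_t = W2 * tanh(W1 * h_t + b1) + b2
scorer = nn.Sequential(
    nn.Linear(2 * d2, d3),   # W1, b1 act on the concatenated BiLSTM state
    nn.Tanh(),
    nn.Linear(d3, k),        # W2, b2 map to one confidence score per label
)

h_t = torch.randn(1, 2 * d2)   # BiLSTM output at one time step
s_t = scorer(h_t)              # s_t has shape (1, k): one score per label
```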