2.5 Masked-LM


Fig. 2A shows a schematic view of Masked-LM. This task masks, randomly replaces, or keeps each input token with a certain probability and then estimates the original tokens. Estimating not only the masked tokens but also the replaced or kept ones helps the model maintain a distributional contextual representation of every input token. Although the selection probability of the tokens is arbitrary, we used the 15% reported in the original paper [2].
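For concreteness, the selection and corruption step might be implemented as in the following sketch. The 80%/10%/10% split among masking, random replacement, and keeping a selected token follows the original paper [2]; the [MASK] token ID and vocabulary size below are placeholders, not values from this study.

```python
import random

MASK_ID = 103          # placeholder: id of [MASK] in the model's vocabulary
VOCAB_SIZE = 30522     # placeholder: vocabulary size

def mask_tokens(token_ids, select_prob=0.15):
    """Select each token with probability `select_prob` (15% here) and,
    following the scheme in [2], replace 80% of the selected tokens with
    [MASK], 10% with a random token, and keep 10% unchanged. Returns the
    corrupted sequence and the prediction targets (-1 where no prediction
    is required)."""
    corrupted, targets = [], []
    for tid in token_ids:
        if random.random() < select_prob:
            targets.append(tid)            # the model must recover the original token
            r = random.random()
            if r < 0.8:
                corrupted.append(MASK_ID)  # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted.append(random.randrange(VOCAB_SIZE))  # 10%: random token
            else:
                corrupted.append(tid)      # 10%: keep the original token
        else:
            corrupted.append(tid)
            targets.append(-1)             # no loss computed at this position
    return corrupted, targets
```

Because targets are recorded for replaced and kept tokens as well as masked ones, the loss covers every selected position, which is what preserves the contextual representation of unmasked input tokens.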

Fig. 2. A. Masked-LM predicts the original tokens for the masked, replaced, or kept tokens. B. Next Sentence Prediction predicts whether the second sentence in the pair is the subsequent sentence in the original document. The roles of the special symbols are as follows: [CLS] is added in front of every input text, and its output vector is used for the Next Sentence Prediction task; [MASK] marks a masked token in the Masked-LM task; [SEP] marks a break between sentences; [UNK] denotes an unknown token that does not appear in the vocabulary.
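As a minimal illustration of the input format described in the caption (the token strings here are invented for this example, not taken from the study's corpus), a sentence pair might be assembled as:

```python
def build_pair_input(tokens_a, tokens_b):
    """Assemble a sentence pair in the input format above: [CLS] in front
    (its output vector feeds Next Sentence Prediction) and [SEP] after each
    sentence."""
    return ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]

# Out-of-vocabulary words appear as [UNK] after tokenization.
print(build_pair_input(["the", "patient", "was", "admitted"],
                       ["treatment", "began", "[UNK]", "immediately"]))
```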
