AI Glossary: What Is Masked Language Modeling (MLM)? Definition & Meaning

Masked Language Modeling (MLM) is a technique used in Natural Language Processing (NLP) to train language models by predicting missing words in a sentence. The core idea behind MLM is to randomly mask a portion of the input tokens (words or subwords) in a sequence and then train the model to predict the original tokens from the surrounding context. This approach lets the model learn deeper representations of language by capturing the relationships between words and how they are used in context.

MLM is a core pretraining objective of transformer-based encoder models such as BERT (Bidirectional Encoder Representations from Transformers), which use it to achieve strong performance on NLP tasks including text classification, named entity recognition, and question answering. During training, a fraction of the input tokens (15% in the original BERT setup) is selected for prediction, and most of the selected tokens are replaced with a special [MASK] token. The model then attempts to predict the original tokens at those positions using the non-masked tokens in the sentence, and in doing so learns to capture the underlying semantics and syntax of the language.
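The masking step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not BERT's actual preprocessing: the function name `mask_tokens`, the `mask_prob` and `seed` parameters, and the use of `None` for unmasked labels are all assumptions made for this example. It also simplifies by always substituting [MASK], whereas BERT sometimes keeps the original token or swaps in a random one.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly mask tokens for an MLM-style training example.

    Returns (masked_tokens, labels). labels[i] holds the original token
    at each masked position and None everywhere else, so a training loss
    would only be computed where labels[i] is not None.
    """
    rng = random.Random(seed)  # local RNG so the sketch is reproducible
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        # Each token is masked independently with probability mask_prob.
        if rng.random() < mask_prob:
            labels[i] = token        # what the model should predict
            masked[i] = MASK_TOKEN   # what the model actually sees
    return masked, labels


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=0)
    print(masked)
    print(labels)
```

In a real pipeline the tokens would be subword IDs from a tokenizer and the labels would feed a cross-entropy loss over the vocabulary, computed only at the masked positions.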

One of the key advantages of MLM is its ability to use bidirectional context, meaning the model can consider both the left and right context of a masked word. This contrasts with traditional unidirectional models, which process text in a single direction and condition each prediction only on the tokens that came before. As a result, models trained with MLM tend to produce more accurate and contextually relevant predictions, making them highly effective for a wide range of applications in AI and NLP.
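To make the contrast concrete, the short sketch below shows which tokens are available when predicting a single position under each setup. The helper `visible_context` and its `bidirectional` flag are hypothetical names chosen for illustration. Real models enforce this through attention masks rather than list slicing.

```python
def visible_context(tokens, position, bidirectional=True):
    """Return the (left, right) context available for predicting tokens[position].

    A bidirectional (MLM-style) model sees tokens on both sides of the
    target; a unidirectional left-to-right model sees only the tokens
    that precede it.
    """
    left = tokens[:position]
    right = tokens[position + 1:] if bidirectional else []
    return left, right


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    # Predicting "sat" (index 2) with and without right-hand context.
    print(visible_context(sentence, 2, bidirectional=True))
    print(visible_context(sentence, 2, bidirectional=False))
```

With the right-hand context ("on the mat") available, a verb like "sat" is far easier to pin down than from "the cat" alone, which is the intuition behind the bidirectional advantage.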