Non-local Block
A non-local block is a type of neural network layer designed to capture long-range dependencies in data, particularly in tasks involving sequences or images. Unlike conventional convolutional layers, which operate on a local neighborhood of each position, non-local blocks compute relationships between all positions in the input.
In a typical non-local operation, each element in the input feature map is compared with every other element, allowing the model to learn contextual relationships across the entire feature space. This is achieved through a mechanism that calculates attention scores, which weight the contribution of each element according to its relevance to the element being processed.
The basic structure of a non-local block can be divided into three main components:
- Query, Key, and Value (QKV): The input is transformed into three separate representations. The query represents the element currently being processed, while the keys and values are computed for every element of the same feature map.
- Attention score computation: The attention score between each query and each key is computed, typically using a dot product followed by a softmax operation that normalizes the scores across all key positions, so that the weights for each query sum to one.
- Output generation: The output for each query is a weighted sum of the values, where the weights are the attention scores.
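The three components above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the feature map is assumed to be already flattened into N positions with C channels each, the projection matrices `w_q`, `w_k`, and `w_v` are hypothetical random stand-ins for learned weights, and extras that practical blocks often add (such as a residual connection or an output projection) are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_op(x, w_q, w_k, w_v):
    """x has shape (N, C): N positions of the flattened feature map, C channels."""
    # 1. Query, key, and value projections, each of shape (N, D).
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    # 2. Dot-product scores between every query and every key: shape (N, N).
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Softmax across key positions, so each query's weights sum to one.
    weights = [softmax(r) for r in scores]
    # 3. Each output row is a weighted sum of all value rows: shape (N, D).
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or real features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_positions, channels, dim = 6, 4, 3  # illustrative sizes only
x = random_matrix(n_positions, channels, rng)
w_q = random_matrix(channels, dim, rng)
w_k = random_matrix(channels, dim, rng)
w_v = random_matrix(channels, dim, rng)

out, weights = non_local_op(x, w_q, w_k, w_v)
```

Here `weights` is an N x N matrix with one row per query position, and `out` has one D-dimensional row per position. Every output row depends on every input position, which is what distinguishes the operation from a local convolution.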
Non-local blocks have become popular in various applications, including image classification, object detection, and natural language processing, as they enhance the model’s ability to consider global contextual information. However, they come at a computational cost: because every position is compared with every other position, an input with N positions requires N x N pairwise scores, so compute and memory grow quadratically with the input size. This can lead to increased memory usage and longer training times. Researchers continue to explore ways to make these blocks more efficient.
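To make the quadratic growth concrete, the following rough estimate computes the memory needed just to store the pairwise score matrix for a 2D feature map. The helper `attention_matrix_bytes` is a hypothetical name introduced for illustration. It assumes one 4-byte (32-bit float) entry per pair of positions and ignores batch size, multiple attention maps, and all other activations.

```python
def attention_matrix_bytes(height, width, bytes_per_element=4):
    """Bytes needed for the N x N score matrix of one height x width feature map."""
    n_positions = height * width
    return n_positions * n_positions * bytes_per_element


small = attention_matrix_bytes(64, 64)    # N = 4096 positions
large = attention_matrix_bytes(128, 128)  # N = 16384 positions

print(small // 2**20, "MiB")  # 64 MiB
print(large // 2**30, "GiB")  # 1 GiB
```

Doubling the side length of the feature map quadruples the number of positions and therefore multiplies the size of the score matrix by 16. This is one reason non-local operations are often applied to lower-resolution feature maps.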