AI Glossary: What Is Batch Normalization (BN)? Definition & Meaning

Batch Normalization (BN) is a technique used in deep learning to speed up and stabilize the training of neural networks. It was introduced by Ioffe and Szegedy in 2015 to address the problem of internal covariate shift, which refers to changes in the distribution of each layer's inputs as the weights of the preceding layers are updated during training. By normalizing the inputs of each layer, Batch Normalization helps to stabilize these distributions, leading to faster convergence and often to improved performance.

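Batch Normalization normalizes a layer's activations feature by feature: for each feature, it subtracts the mean computed over the current mini-batch and divides by the mini-batch standard deviation (with a small constant added to the variance for numerical stability). During training, this gives each feature a mean of zero and a variance of approximately one within every mini-batch. At inference time, running averages of the mean and variance collected during training are typically used in place of the batch statistics. In addition to this normalization, Batch Normalization introduces two learnable parameters per feature: a scale (commonly written gamma) and a shift (commonly written beta). These parameters preserve the network's representational power by letting it learn the optimal scale and offset for the normalized activations, including undoing the normalization if that is what training favors.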
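The normalization step described above can be sketched in a few lines of NumPy. This is a minimal illustration of the training-time forward pass only, assuming a 2-D input of shape (batch, features); the function name batch_norm_forward, the parameter names gamma and beta, and the eps default of 1e-5 are choices made for this example rather than part of any particular library.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-time batch norm for x of shape (batch, features).

    gamma (scale) and beta (shift) are the learnable parameters,
    one value per feature. eps is an assumed stability constant.
    """
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, ~unit variance
    return gamma * x_hat + beta               # learnable scale and shift

# Example: 64 samples, 4 features, deliberately off-center and spread out.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 4))
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
```

With gamma set to ones and beta to zeros, each column of out has a mean very close to zero and a variance very close to one, regardless of the mean and spread of the input.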

One of the key benefits of Batch Normalization is that it allows for the use of higher learning rates and reduces sensitivity to the initialization of network weights. This can lead to faster training and better overall performance of the network. Furthermore, because the mean and variance are estimated from a different mini-batch at every step, the normalization adds a small amount of noise to the activations. This noise can act as a form of regularization, potentially reducing the need for Dropout or other regularization techniques.

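In its original formulation, Batch Normalization is applied after a layer's linear or convolutional transformation and before the activation function, although some practitioners place it after the activation instead. It has become a standard component of many state-of-the-art models, particularly convolutional networks in computer vision. In natural language processing, related techniques such as layer normalization are more commonly used.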
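Placement varies in practice. As one illustration, the sketch below shows the ordering used in the original formulation: linear transformation, then Batch Normalization, then the activation. The helper names batch_norm and dense_bn_relu are hypothetical, and the layer bias is left out on the assumption that the shift parameter beta takes over its role, since subtracting the batch mean cancels any constant offset.

```python
import numpy as np

def batch_norm(z, gamma, beta, eps=1e-5):
    # Per-feature normalization over the batch axis, then scale and shift.
    mean = z.mean(axis=0)
    var = z.var(axis=0)
    return gamma * (z - mean) / np.sqrt(var + eps) + beta

def dense_bn_relu(x, W, gamma, beta):
    z = x @ W                          # linear transform (no bias term)
    z_bn = batch_norm(z, gamma, beta)  # normalize the pre-activations
    return np.maximum(z_bn, 0.0)       # ReLU applied after normalization

# Example: a batch of 32 inputs with 8 features mapped to 16 units.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))
W = rng.normal(size=(8, 16))
h = dense_bn_relu(x, W, gamma=np.ones(16), beta=np.zeros(16))
```

To try the alternative ordering, the last two steps of dense_bn_relu can be swapped so that the activation is computed first and its output is normalized.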