AI Glossary: What Is a Class Activation Map (CAM)? Definition & Meaning

Class Activation Maps (CAMs) are a technique used in computer vision to interpret and visualize the decisions of convolutional neural networks (CNNs). They show which parts of an input image are most influential in determining the output class. This is especially useful for understanding model behavior and improving transparency in AI applications.

The basic idea behind CAMs is to generate a heatmap that highlights the regions of the image that contribute most to the prediction of a specific class. The heatmap is built from the final convolutional layer of the CNN, whose feature maps still retain a spatial layout, with each channel corresponding to a different learned feature. Taking a weighted sum of these feature maps, where the weights are those connecting each feature map to the target class in the classification layer, produces a visual representation of the areas that most strongly support that class.
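
The weighted combination can be written compactly. The notation here is an assumption introduced for illustration: f_k(x, y) is the activation of feature map k at spatial position (x, y), w_k^c is the classification-layer weight linking feature map k to class c, H and W are the height and width of the feature maps, and the classification layer is assumed to have no bias term.

```latex
M_c(x, y) = \sum_{k} w_k^{c} \, f_k(x, y),
\qquad
S_c = \sum_{k} w_k^{c} \, \frac{1}{HW} \sum_{x, y} f_k(x, y)
    = \frac{1}{HW} \sum_{x, y} M_c(x, y)
```

Under these assumptions the class score S_c is the spatial average of the map M_c, which is why large values in M_c can be read as regions that push the score for class c upward.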

To create a CAM, the following steps are typically performed. First, the image is passed through the CNN to obtain the feature maps from the last convolutional layer. Next, global average pooling reduces each feature map to a single value, and the fully connected classification layer that follows maps these values to class scores; the weights of that layer for the target class indicate how much each feature map matters. Finally, the feature maps are combined using these weights to generate a heatmap, which is upsampled to the input size and overlaid on the original image for visualization. This process not only aids in debugging and understanding the model but can also help in identifying potential biases and guiding improvements to model performance.
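
The steps above can be sketched in NumPy. This is a minimal illustration rather than a reference implementation: the function names, the array shapes, the random stand-in data, and the upsampling factor of 32 are all assumptions, and the classification layer is assumed to have no bias. In practice the feature maps and weights would come from a trained CNN.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Weighted sum of feature maps for one class.

    feature_maps: array of shape (K, H, W) from the last conv layer.
    fc_weights:   array of shape (C, K) from the classification layer.
    Returns an (H, W) map.
    """
    weights = fc_weights[class_idx]                    # shape (K,)
    return np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)

def normalize(cam, eps=1e-8):
    """Rescale a map to the range [0, 1] for display."""
    cam = cam - cam.min()
    return cam / (cam.max() + eps)

def upsample_nearest(cam, factor):
    """Nearest-neighbor upsampling by an integer factor."""
    return np.repeat(np.repeat(cam, factor, axis=0), factor, axis=1)

# Hypothetical stand-in data: 4 feature maps of size 7x7, 3 classes.
rng = np.random.default_rng(0)
feature_maps = rng.random((4, 7, 7))
fc_weights = rng.standard_normal((3, 4))

# Forward pass of the head: global average pooling, then linear layer.
pooled = feature_maps.mean(axis=(1, 2))   # shape (K,)
logits = fc_weights @ pooled              # shape (C,)
class_idx = int(np.argmax(logits))

raw_cam = class_activation_map(feature_maps, fc_weights, class_idx)
heatmap = upsample_nearest(normalize(raw_cam), 32)  # 224x224 map
```

To visualize the result, the heatmap can be blended with the input image, for example as `0.5 * image + 0.5 * colored_heatmap` after mapping the heatmap to colors. Nearest-neighbor upsampling is used here only to keep the sketch dependency-free; bilinear interpolation generally gives smoother overlays.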

In summary, Class Activation Maps are an essential tool in the field of AI and machine learning, particularly in enhancing model interpretability and fostering trust in AI systems.