IG (Integrated Gradients)
Introduction
IG (Integrated Gradients) is an XAI (explainable artificial intelligence) technique that sheds light on the decision-making process of a machine learning model. Specifically, IG calculates attribution scores, which express the relative importance of each input feature to the model's output.
The fundamental principle of IG is to compute the gradients of the model's output with respect to its inputs, then integrate those gradients along a path from a reference input to the actual input. The reference input, called the baseline, is chosen to represent the "absence" of any significant input features (for example, an all-black image). Integrating the gradients along this path yields an attribution score for each input feature that shows how much that feature contributed to the output.
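The principle above can be written compactly as follows. The symbols are introduced here for illustration and do not appear in the original text: $F$ is the model, $x$ the actual input, $x'$ the baseline, and $m$ the number of steps in the numerical approximation. The third line states the completeness property that is commonly used as a sanity check.

```latex
\mathrm{IG}_i(x) = (x_i - x'_i) \int_{0}^{1}
    \frac{\partial F\bigl(x' + \alpha (x - x')\bigr)}{\partial x_i} \, d\alpha

\mathrm{IG}_i(x) \approx (x_i - x'_i) \, \frac{1}{m} \sum_{k=1}^{m}
    \frac{\partial F\bigl(x' + \tfrac{k}{m} (x - x')\bigr)}{\partial x_i}

\sum_i \mathrm{IG}_i(x) = F(x) - F(x')
```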
One benefit of IG is that it can be used with a variety of models and input types, including image data, text data, and structured data. IG is also a useful tool for understanding how machine learning models arrive at their predictions because the scores it generates are intuitive and simple to interpret.
In short, IG gives each input feature an attribution score that quantifies its contribution to the model's final output. The score is obtained by integrating the gradients of the output with respect to the input along a path from the baseline to the actual input: the baseline stands for the "absence" of any relevant features, and the integration accumulates how much each feature changes the output as the input moves away from the baseline. Because it only needs gradients, IG is a flexible and simple method that applies to many model and input types.
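As a minimal sketch of the process described above, the example below applies IG to a small hand-written function instead of a neural network, so that the gradient can be written analytically. The toy function, the helper names `model` and `model_grad`, and the midpoint-rule approximation are assumptions made for illustration only.

```python
import numpy as np

def model(x):
    # Toy differentiable "model": F(x) = x0^2 + 3*x1 + x0*x2
    return x[0] ** 2 + 3.0 * x[1] + x[0] * x[2]

def model_grad(x):
    # Analytic gradient of F with respect to each input feature
    return np.array([2.0 * x[0] + x[2], 3.0, x[0]])

def integrated_gradients(x, baseline, steps=100):
    # Midpoints of `steps` equal sub-intervals of [0, 1]
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight line from the baseline to the input
    path = baseline + alphas[:, None] * (x - baseline)
    # Average the gradients along the path, then scale by (input - baseline)
    grads = np.array([model_grad(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)  # "absence" of all features
attributions = integrated_gradients(x, baseline)

print(attributions)                                   # per-feature scores
print(attributions.sum(), model(x) - model(baseline)) # should agree
```

The attributions sum to the difference between the model output at the input and at the baseline, which is a quick way to check that the approximation used enough steps.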
Methods
In the literature, Integrated Gradients (IG) has been proposed in a variety of forms. The principal techniques are as follows:
- Basic Integrated Gradients: This technique integrates the gradients along a straight line between a reference input and the actual input to obtain the attribution scores.
- SmoothGrad Integrated Gradients: This approach extends basic IG by adding noise to the input and averaging the attribution scores over many noisy samples. As a result, the effect of noise in the attributions may be lessened and the stability of the attribution scores may be increased.
- Expected Integrated Gradients: Rather than using a single baseline input, this technique takes the expected value of the attributions over a distribution of baseline inputs. This can improve the accuracy of the attribution scores and helps to reflect the behavior of the model across a larger variety of inputs.
- Deconvolutional IG: Rather than using the real gradients, this approach backpropagates the gradients from the output to the input using a deconvolutional network. By doing so, the vanishing gradient problem may be avoided and more accurate attribution scores may result.
- SmoothGrad-Squared IG: This method, which is an extension of SmoothGrad IG, averages the squared attribution scores over the noisy samples rather than the raw scores. This could help reduce the effect of noise even more and improve the stability of the attribution scores.
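The noise-averaging and baseline-averaging variants in the list above can be sketched on top of a basic IG routine. This is a simplified illustration under stated assumptions, not a reference implementation of any of the published methods: the toy gradient function, the function names, and the parameters `noise_std`, `n_samples`, and `squared` are all hypothetical.

```python
import numpy as np

def model_grad(x):
    # Gradient of a toy model F(x) = sum(x**3); works on single points or batches
    return 3.0 * x ** 2

def integrated_gradients(x, baseline, steps=50):
    # Basic IG along the straight line from baseline to x (midpoint rule)
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    return (x - baseline) * model_grad(path).mean(axis=0)

def smoothgrad_ig(x, baseline, noise_std=0.1, n_samples=20, squared=False, seed=0):
    # Average IG over noisy copies of the input; with squared=True the
    # squared attributions are averaged instead of the raw ones
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        noisy_x = x + rng.normal(0.0, noise_std, size=x.shape)
        attr = integrated_gradients(noisy_x, baseline)
        samples.append(attr ** 2 if squared else attr)
    return np.mean(samples, axis=0)

def expected_ig(x, baselines):
    # Average IG over several baselines drawn from some distribution
    return np.mean([integrated_gradients(x, b) for b in baselines], axis=0)

x = np.array([1.0, -2.0])
baseline = np.zeros_like(x)
plain = integrated_gradients(x, baseline)
smooth = smoothgrad_ig(x, baseline)
smooth_sq = smoothgrad_ig(x, baseline, squared=True)
baselines = np.random.default_rng(1).normal(0.0, 0.5, size=(10, 2))
expected = expected_ig(x, baselines)
```

Note that the squared variant discards the sign of the attributions, so it indicates how strongly a feature matters but not in which direction.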
Working
Applications
- Model debugging: IG can help pinpoint which model inputs are responsible for its output as well as which features the model is ignoring.
- Model selection: IG can be used to compare the importance of features across several models and help choose the most suitable one.
- Feature engineering: IG can help determine which features are most important for predicting a specific outcome. This is especially helpful when working with high-dimensional data because it can guide feature selection or engineering.
- Regulatory compliance: IG can shed light on the prediction process of a model and help to ensure that the model is not biased against particular groups. For instance, if IG shows that a certain demographic group is being disproportionately penalized by the model, this may reveal a bias in the data or in the model itself.
- Medical diagnosis: The predictions of medical diagnostic models can be interpreted using IG, which also offers insights into the model's decision-making process.
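For the debugging and feature-engineering uses listed above, per-example attributions are often aggregated into a global ranking. The sketch below illustrates one way to do this on synthetic tabular data; the logistic model, its `weights`, and the choice of mean absolute attribution as the aggregate are assumptions made for illustration.

```python
import numpy as np

# Hypothetical logistic model; feature 1 has zero weight, so the model ignores it
weights = np.array([2.0, 0.0, -1.0, 0.5])

def model(X):
    # Sigmoid of a linear score, for a batch of rows
    return 1.0 / (1.0 + np.exp(-(X @ weights)))

def model_grad(X):
    # Gradient of the sigmoid output with respect to each input feature
    p = model(X)
    return (p * (1.0 - p))[:, None] * weights

def integrated_gradients(x, baseline, steps=50):
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    return (x - baseline) * model_grad(path).mean(axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))   # synthetic dataset
baseline = np.zeros(4)

# One attribution vector per example, then the mean absolute value per feature
attributions = np.array([integrated_gradients(x, baseline) for x in X])
global_importance = np.abs(attributions).mean(axis=0)
ranking = np.argsort(global_importance)[::-1]  # most important feature first

print(global_importance)
print(ranking)
```

The ignored feature receives zero attribution and lands at the bottom of the ranking, which is the kind of signal that can guide feature selection.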
Implementation
# Import the required libraries
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input
import numpy as np
import matplotlib.pyplot as plt


def integrated_gradients(model, inp, baseline, target_class, steps):
    # Add a batch dimension and convert both images to float32 tensors
    x = tf.convert_to_tensor(np.expand_dims(inp, axis=0), dtype=tf.float32)
    baseline = tf.convert_to_tensor(np.expand_dims(baseline, axis=0), dtype=tf.float32)

    # Interpolation coefficients from the baseline (0.0) to the input (1.0)
    alphas = tf.linspace(start=0.0, stop=1.0, num=steps)

    # Gradient of the target-class score with respect to the input images
    @tf.function
    def gradient(images):
        with tf.GradientTape() as tape:
            tape.watch(images)
            output = model(images)
            score = output[:, target_class]
        return tape.gradient(score, images)

    # Compute the gradient at each interpolated image on the straight-line path
    gradient_images = []
    for alpha in alphas:
        interpolated_image = baseline + alpha * (x - baseline)
        gradient_images.append(gradient(interpolated_image))

    # Average the gradients and scale by (input - baseline)
    avg_gradients = tf.reduce_mean(tf.stack(gradient_images, axis=0), axis=0)
    attributions = avg_gradients * (x - baseline)
    return attributions.numpy()[0]


# Load the pre-trained VGG16 model
model = tf.keras.applications.VGG16(weights='imagenet', include_top=True)

# Load the image
img_path = '/content/flower_photos/daisy/100080576_f52e8ee070_n.jpg'
img = image.load_img(img_path, target_size=(224, 224))
img = image.img_to_array(img)

# Keep an RGB copy in [0, 1] for plotting (preprocess_input changes the values)
display_img = img / 255.0

# Use a black image as the baseline to represent the "absence" of features
baseline = np.zeros((224, 224, 3), dtype=np.float32)

# Preprocess the image and the baseline
img = preprocess_input(img)
baseline = preprocess_input(baseline)

# Define the number of steps for the integration
num_steps = 50

# Attribute the class that the model predicts for this image
preds = model.predict(np.expand_dims(img, axis=0))
target_class = int(np.argmax(preds[0]))

# Compute the integrated gradients
ig = integrated_gradients(model, img, baseline, target_class, num_steps)

# Postprocess the attribution map: sum the absolute attributions over the
# color channels and scale the result to [0, 1]
attribution_mask = np.sum(np.abs(ig), axis=-1)
attribution_mask /= np.max(attribution_mask)

# Show the original image
plt.imshow(image.load_img(img_path))
plt.title('Original Image')
plt.axis('off')
plt.show()

# Show the output image after applying the attribution map
plt.imshow(display_img * attribution_mask[..., np.newaxis])
plt.title('Output Image')
plt.axis('off')
plt.show()
- The code takes a flower image as input (the path above points to a photo in the daisy folder of the flower_photos dataset) and passes it through a pre-trained image classification model (VGG16). It then uses integrated gradients to produce an attribution map that shows which areas of the image are most responsible for the model's prediction. The output image applies the attribution map to the original image, so you can see which areas of the input were important for the prediction.
- Areas that appear brighter in the attribution map are those that matter more for the model's prediction. In our illustration, the attribution map indicated that the flower's center and petals were the most important regions. This is plausible because these areas contain the flower's most identifiable characteristics, such as its distinctive central disk and petals.
- Multiplying the input image by the attribution map highlights these regions even further in the output image, making it simpler to identify which areas of the image are most essential to the model's prediction.
Key Points to Remember
Integrated gradients (IG) is a common technique for determining the relative importance of each input feature to the output of a deep neural network. Here are some essential IG reminders:
- The foundation of IG is a path integral of the gradient of the model's output with respect to its input.
- By calculating this integral along a path from a baseline input to the actual input, IG assigns importance to the features of an input.
- The baseline input is a reference input that represents the absence of features; it is frequently all zeros or a blank input, and the model's output at the baseline serves as the reference value.
- The path from the baseline to the actual input can be defined in various ways; a straight line is the standard choice, but curved paths are also possible.
- Each feature of the input is given a score by IG, indicating how much it contributed to the model's output.
- IG can be used for a variety of purposes, including debugging models, comparing models, and understanding the model's decision-making process.
- IG comes in a variety of forms, including SmoothGrad and Guided IG, which can alleviate some of the original method's drawbacks.
- For big models and inputs in particular, IG is computationally expensive because it needs one gradient computation per step, and it requires careful tuning of hyperparameters such as the number of steps along the path (which determines the step size).
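One common way to choose the number of steps mentioned in the last point is to check how closely the attributions sum to the difference between the model output at the input and at the baseline. The sketch below illustrates this check on a toy function; the function, the helper name `completeness_gap`, and the step counts tried are assumptions made for illustration.

```python
import numpy as np

def model(x):
    # Toy non-linear "model"
    return np.tanh(x).sum()

def model_grad(x):
    # Analytic gradient of the toy model
    return 1.0 - np.tanh(x) ** 2

def integrated_gradients(x, baseline, steps):
    # Right-endpoint Riemann sum along the straight-line path
    alphas = np.arange(1, steps + 1) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    return (x - baseline) * model_grad(path).mean(axis=0)

def completeness_gap(x, baseline, steps):
    # |sum of attributions - (F(x) - F(baseline))|; ideally close to zero
    attr = integrated_gradients(x, baseline, steps)
    return abs(attr.sum() - (model(x) - model(baseline)))

x = np.array([3.0, 2.0, 0.5])
baseline = np.zeros_like(x)
gaps = {m: completeness_gap(x, baseline, m) for m in (5, 20, 100, 500)}

for m, gap in gaps.items():
    print(f"steps={m:4d}  completeness gap={gap:.5f}")
```

The gap shrinks as the number of steps grows, while the cost grows linearly with it, so a practical choice is the smallest step count whose gap is acceptable for the task.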
Conclusion