IG(Integrated Gradients)

Introduction

IG (Integrated Gradients) is a technique used in the context of XAI (explainable artificial intelligence) that tries to shed light on the decision-making process of a machine learning model. In particular, IG is a technique for calculating attribution scores, which express the relative importance of each input feature to the model's output.

The fundamental principle of IG is to compute the output's gradients with regard to its inputs, then integrate those gradients along the path from a reference input to the input itself. The baseline input serves as a reference input that is selected to symbolize the "absence" of any significant input features. IG calculates an attribution score for each input feature that shows how much it contributed by integrating the gradients along this path.

One benefit of IG is that it can be used with a variety of models and input types, including image data, text data, and structured data, as long as the model is differentiable. It also produces per-feature scores that are straightforward to interpret, which makes it a useful tool for understanding how machine learning models arrive at their predictions.

In short, IG gives each input feature an attribution score that quantifies its contribution to the model's output. The score is obtained by accumulating the gradients of the output with respect to the input along a path from the baseline, which stands for the "absence" of any pertinent features, to the actual input.
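For reference, the idea can be written compactly. The notation here is an assumption rather than something defined in this post: F is the model output being explained, x is the input, x' is the baseline, i indexes the input features, and m is the number of interpolation steps used to approximate the integral.

```latex
\mathrm{IG}_i(x) = (x_i - x'_i) \int_{0}^{1}
    \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha
\;\approx\; (x_i - x'_i) \cdot \frac{1}{m} \sum_{k=1}^{m}
    \frac{\partial F\bigl(x' + \tfrac{k}{m}\,(x - x')\bigr)}{\partial x_i}
```

The sum on the right is what an implementation computes in practice: gradients are evaluated at m points on the straight line between the baseline and the input, averaged, and scaled by the difference between the input and the baseline.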

Methods

In the literature, Integrated Gradients (IG) has been proposed in a variety of forms. The principal techniques are as follows:

Basic Integrated Gradients: This technique calculates the attribution scores by integrating the gradients along a straight line between a reference input and the actual input.

SmoothGrad Integrated Gradients: This approach extends basic IG by adding random noise to the input and averaging the attribution scores over many noisy samples. The averaging can reduce the effect of noise and make the attribution scores more stable.

Expected Integrated Gradients: Rather than using a single baseline input, this technique takes the expected value of the attributions over a distribution of baseline inputs (for example, samples drawn from the training data). This can make the attribution scores less dependent on any one baseline and help to reflect the behavior of the model across a larger variety of inputs.

Deconvolutional IG: Rather than using the real gradients, this approach backpropagates the gradients from the output to the input using a deconvolutional network. By doing so, the vanishing gradient problem may be avoided and more accurate attribution scores may result.

SmoothGrad-Squared Integrated Gradients: This method, which is an extension of SmoothGrad IG, averages the squared attribution scores over the noisy samples rather than the raw scores. This could lessen the effect of noise even more and improve the stability of the attribution scores, although squaring discards the sign of each attribution.

These strategies are all variations of the fundamental IG strategy, but they differ in their implementation details as well as in their advantages and disadvantages. The right method depends on the particular application and the user's needs.
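To make the differences between the variants concrete, here is a minimal NumPy sketch of basic IG, the two SmoothGrad variants, and the expected (multi-baseline) variant. It is an illustration under stated assumptions, not a reference implementation: the toy function `f`, its hand-written gradient `grad_f`, and the parameter names `noise_std` and `n_samples` are all hypothetical and chosen only for this example.

```python
import numpy as np

def f(x):
    # Hypothetical toy "model" with three input features
    return np.tanh(x[0] * x[1]) + x[2] ** 2

def grad_f(x):
    # Analytic gradient of f with respect to x
    t = 1.0 - np.tanh(x[0] * x[1]) ** 2
    return np.array([t * x[1], t * x[0], 2.0 * x[2]])

def basic_ig(x, baseline, steps=100):
    # Average the gradients at the midpoints of `steps` segments of the
    # straight line from baseline to x, then scale by (x - baseline)
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

def smoothgrad_ig(x, baseline, noise_std=0.1, n_samples=20, squared=False, seed=0):
    # Average IG (or squared IG) over noisy copies of the input
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        noisy_x = x + rng.normal(0.0, noise_std, size=x.shape)
        attr = basic_ig(noisy_x, baseline)
        samples.append(attr ** 2 if squared else attr)
    return np.mean(samples, axis=0)

def expected_ig(x, baselines):
    # Average IG over several baselines instead of a single one
    return np.mean([basic_ig(x, b) for b in baselines], axis=0)

x = np.array([1.0, 0.5, -2.0])
baseline = np.zeros(3)
basic = basic_ig(x, baseline)
smooth = smoothgrad_ig(x, baseline)
smooth_sq = smoothgrad_ig(x, baseline, squared=True)
expected = expected_ig(x, [np.zeros(3), np.full(3, 0.1)])
print(basic, smooth, smooth_sq, expected, sep="\n")
```

All three variants reuse `basic_ig` and only change what is averaged: noisy inputs, squared scores, or different baselines. That also shows their cost, since each one multiplies the number of gradient evaluations by the number of samples or baselines.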

Working

The fundamental principle of IG is to compute the gradients of the output with respect to the inputs and then integrate these gradients along a path from a baseline input to the actual input. The baseline serves as a reference input that is selected to symbolize the "absence" of any significant input features. By integrating the gradients along this path, IG computes an attribution score for each input feature that shows how much the feature contributed to the final result.

IG Visualization

The procedures for calculating the IG attribution scores are as follows:

Establish a baseline input: A baseline input is selected to symbolize the "absence" of any pertinent input features. As an illustration, the baseline input for an image classification task might be a black image.

Compute the gradients: Backpropagation is used to compute the gradients of the output with respect to the inputs at points along the path from the baseline to the actual input. At each point, this yields a gradient vector with one entry per input feature.

Integrate the gradients: The gradients are integrated along the path from the baseline input to the actual input. The path is usually a straight line, that is, a linear interpolation between the baseline and the actual input, although a curved path can also be used.

Calculate the attribution scores: The integrated gradient for each input feature, scaled by the difference between the input and the baseline for that feature, is the attribution score that reflects the contribution of that feature to the model's final output.

By calculating these attribution scores, IG sheds light on the judgments made by a machine learning model and the input features that are most crucial to those conclusions. The results can also be used to spot biases or errors in the way the model makes decisions.
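The four steps above can be sketched on a tiny model where everything is visible. This is only an illustration under assumptions: the logistic unit with weights `W` and bias `B` is a hypothetical stand-in for a real network, its gradient is written by hand instead of coming from backpropagation, and the integral is approximated with a midpoint rule.

```python
import numpy as np

# Hypothetical toy model: a single logistic unit with fixed weights
W = np.array([2.0, -1.0, 0.5])
B = -0.25

def model(x):
    # Scalar model output F(x)
    return 1.0 / (1.0 + np.exp(-(x @ W + B)))

def model_grad(x):
    # Analytic gradient of F with respect to x
    s = model(x)
    return s * (1.0 - s) * W

def integrated_gradients_toy(x, baseline, steps=200):
    # Step 1: the baseline is supplied by the caller (all zeros below)
    # Step 2: compute gradients at points on the straight-line path
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.stack([model_grad(p) for p in path])
    # Step 3: integrate, approximated here by averaging the gradients
    avg_grad = grads.mean(axis=0)
    # Step 4: scale by the input difference to get attribution scores
    return (x - baseline) * avg_grad

x = np.array([1.0, 2.0, -1.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients_toy(x, baseline)
print("attributions:", attributions)
print("sum of attributions:", attributions.sum())
print("F(x) - F(baseline):", model(x) - model(baseline))
```

The last two printed values should agree closely. This is a useful sanity check for any IG implementation: the attributions should add up to the difference between the model's output at the input and at the baseline.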

Applications

Integrated Gradients (IG) is a popular interpretability technique in Explainable Artificial Intelligence (XAI) that can be used with deep neural networks and other differentiable machine learning models. By assigning a score to each feature of the input data, IG offers a way to explain model predictions. Here are some examples of how IG is used:

Model debugging: IG can help pinpoint which model inputs are responsible for its output as well as which features the model is ignoring.

Model selection: IG can be used to compare the significance of features across several models and assist in choosing the most suitable one.

Feature engineering: IG can assist in determining which features are most crucial for predicting a specific result. This is especially helpful when working with high-dimensional data because it can help direct feature selection or engineering.

Regulatory compliance: IG can shed light on the prediction process of a model and help to check that the model is not biased against particular groups. For instance, if the attributions show that a certain demographic group is being disproportionately penalized by the model, this may point to a bias in the data or in the model itself.

Medical diagnosis: The predictions of medical diagnostic models can be interpreted using IG, which also offers insights into the model's decision-making process.
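As a small illustration of the debugging and feature engineering use cases, the sketch below ranks features by their mean absolute attribution over a dataset and flags features the model never uses. Everything in it is an assumption made for the example: the toy model with weights `W`, the random data, and the `1e-12` threshold are hypothetical and not taken from any real project.

```python
import numpy as np

# Hypothetical model that silently ignores features 2 and 3
W = np.array([1.5, -2.0, 0.0, 0.0])

def model_grad(x):
    # Analytic gradient of tanh(x @ W) with respect to x
    return (1.0 - np.tanh(x @ W) ** 2) * W

def ig(x, baseline, steps=50):
    # Basic IG with a midpoint approximation of the path integral
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 4))   # 100 random samples with 4 features
baseline = np.zeros(4)

# Attribution scores for every sample, then a global importance per feature
attrs = np.stack([ig(row, baseline) for row in data])
importance = np.abs(attrs).mean(axis=0)

ranking = np.argsort(importance)[::-1]          # most important first
ignored = np.flatnonzero(importance < 1e-12)    # features with no influence
print("importance:", importance)
print("ranking:", ranking)
print("ignored features:", ignored)
```

Averaging absolute attributions is only one possible way to aggregate per-sample scores into a global ranking. It hides the sign of each contribution, so the per-sample scores are still worth inspecting when debugging a specific prediction.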

Implementation

In this implementation, I have taken a flower image and applied the IG concepts to it in Python, using TensorFlow and a pretrained VGG16 model.

Source Code

# Import the required libraries
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input
import numpy as np
import matplotlib.pyplot as plt


def integrated_gradients(inp, baseline, steps, target_class=None):
    # Convert the inputs to float32 tensors with a leading batch dimension.
    # Note: this function uses the global `model` defined below.
    x = tf.convert_to_tensor(np.expand_dims(inp, axis=0), dtype=tf.float32)
    baseline = tf.convert_to_tensor(np.expand_dims(baseline, axis=0), dtype=tf.float32)

    # Explain the top predicted class unless the caller specifies one
    if target_class is None:
        target_class = int(np.argmax(model(x).numpy()[0]))

    # Gradient of the target-class score with respect to the input.
    # (The mean over all softmax outputs is constant, so its gradient
    # would carry no information.)
    def gradient(images):
        with tf.GradientTape() as tape:
            tape.watch(images)
            output = model(images)
            score = output[:, target_class]
        return tape.gradient(score, images)

    # Linear interpolation between the baseline and the input
    def interpolate_images(baseline, image, alpha):
        return baseline + alpha * (image - baseline)

    # Compute the gradients for each interpolated image
    alphas = tf.linspace(start=0.0, stop=1.0, num=steps)
    gradient_images = []
    for alpha in alphas:
        interpolated = interpolate_images(baseline, x, alpha)
        gradient_images.append(gradient(interpolated))

    # Average the gradients along the path and scale by (input - baseline)
    avg_gradients = tf.reduce_mean(tf.stack(gradient_images, axis=0), axis=0)
    attributions = avg_gradients * (x - baseline)

    return attributions.numpy()[0]


# Load the pre-trained VGG16 model
model = tf.keras.applications.VGG16(weights='imagenet', include_top=True)

# Load the image
img_path = '/content/flower_photos/daisy/100080576_f52e8ee070_n.jpg'
img = image.load_img(img_path, target_size=(224, 224))

# Use a black image as the baseline, i.e. the "absence" of input features
baseline = np.zeros((224, 224, 3), dtype=np.uint8)

# Preprocess the image and the baseline
img = image.img_to_array(img)
baseline = image.img_to_array(baseline)
img = preprocess_input(img)
baseline = preprocess_input(baseline)

# Define the number of steps for the integration
num_steps = 50

# Compute the integrated gradients
ig = integrated_gradients(img, baseline, num_steps)

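# Postprocess the attribution map: sum the absolute attributions over the
# color channels and scale the result to [0, 1]
attribution_map = np.sum(np.abs(ig), axis=-1)
attribution_map /= np.max(attribution_map) + 1e-8

# Show the original image
original = image.load_img(img_path, target_size=(224, 224))
plt.imshow(original)
plt.title('Original Image')
plt.axis('off')
plt.show()

# Show the output image: the attribution map overlaid on the original image
plt.imshow(original)
plt.imshow(attribution_map, cmap='inferno', alpha=0.6)
plt.title('Output Image')
plt.axis('off')
plt.show()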

Obtained Output:

The code takes a flower image as input (the path above points to a photo in the daisy folder) and passes it through a pretrained image classification model (VGG16). It then uses integrated gradients to produce an attribution map that shows which areas of the image are most responsible for the model's prediction. You can observe which areas of the input image were crucial for the prediction by looking at the output image, which overlays the attribution map on top of the original image.

Areas of the image that are displayed in brighter colors on the attribution map are those that are more crucial for the model's prediction. In our illustration, the attribution map revealed that the flower head and petals were crucial for the model's prediction. This is plausible because the most identifiable characteristics of the flower, such as the shape of its head and its petals, are found in these areas.

These significant regions stand out in the output image, which combines the input image with the attribution map, making it simpler to identify which areas of the image are most essential to the model's prediction.

Key Points to Remember

Integrated gradients (IG) is a common technique for determining how much each input feature contributes to the output of a deep neural network. Here are some essential IG reminders:

The foundation of IG is the concept of a path integral over the output gradient with regard to the input of the model.

By calculating the integral of the gradient of the output with respect to the input along a path from a baseline input to the actual input, IG assigns importance to the features of an input.

The baseline input is a reference input chosen to represent the absence of features; it's frequently an all-zero or blank input.

There are various ways to describe the path from the baseline to the actual input, such as a straight line or a curved path.

Each feature of the input is given a score by IG, indicating how much it contributed to the model's output.

IG can be used for a variety of purposes, including debugging models, comparing models, and comprehending the model's decision-making process.

IG comes in a variety of forms, including SmoothGrad and Guided IG, which can alleviate some of the original method's drawbacks.

For big models and inputs in particular, IG is computationally expensive because it needs one gradient computation per step along the path, and it requires careful tuning of hyperparameters such as the number of steps.
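The effect of the number of steps can be checked on a toy example. The sketch below is an assumption-laden illustration, not a benchmark: the steep sigmoid `f` is a hypothetical function chosen because its gradient changes quickly along the path, and the error is measured as the gap between the sum of the attributions and the difference in model output between the input and the baseline.

```python
import numpy as np

def f(x):
    # Hypothetical steep sigmoid: hard to integrate with few steps
    return 1.0 / (1.0 + np.exp(-10.0 * (x.sum() - 1.0)))

def grad_f(x):
    # Analytic gradient of f with respect to x
    s = f(x)
    return 10.0 * s * (1.0 - s) * np.ones_like(x)

def ig(x, baseline, steps):
    # Basic IG with a midpoint approximation of the path integral
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 1.0])
baseline = np.zeros(2)
target = f(x) - f(baseline)

# Gap between the sum of attributions and F(x) - F(baseline) per step count
errors = {m: abs(ig(x, baseline, m).sum() - target) for m in (5, 20, 100, 500)}
for m, err in errors.items():
    print(f"steps={m:4d}  error={err:.6f}")
```

Each extra step costs one more gradient computation, so a practical approach is to increase the number of steps until this error is small enough for the application rather than picking a large value by default.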

Conclusion

Integrated Gradients is a technique for computing feature attributions in deep neural networks. It can be used with any differentiable model and offers a way to explain the model's predictions by assigning a score to each input feature. The approach is based on a path integral: the gradients of the model's output with respect to the input are integrated along a straight line between a baseline and the input. As a result, each input feature's contribution to the model's output is represented by a feature attribution score. Uses of the Integrated Gradients approach include debugging models, assessing model fairness, and improving model interpretability.



Yagna Dakshina
