Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

Interpreting and Comparing Convolutional Neural Networks: A Quantitative Approach

Version 1 : Received: 26 January 2021 / Approved: 28 January 2021 / Online: 28 January 2021 (12:29:20 CET)
Version 2 : Received: 20 November 2021 / Approved: 22 November 2021 / Online: 22 November 2021 (14:06:52 CET)

How to cite: Islam, M.M.; Tushar, Z.H. Interpreting and Comparing Convolutional Neural Networks: A Quantitative Approach. Preprints 2021, 2021010579.


A convolutional neural network (CNN) is sometimes regarded as a black box: while it can approximate any function, studying its structure gives us little insight into the nature of the function being approximated. In other words, discriminative ability does not reveal much about a network's latent representation. This research aims to establish a framework for interpreting CNNs by profiling them in terms of interpretable visual concepts and verifying the profiles by means of Integrated Gradients. We also ask, "Are different input classes related, or are they independent?" For instance, could there be an overlapping set of highly active neurons that identify different classes? Could there be a set of neurons that is useful for one input class yet misleading for another? Intuition answers these questions positively, implying the existence of a structured set of neurons inclined toward a particular class. Knowing this structure has significant value: it provides a principled way of identifying redundancies across classes. Here, interpretability profiling is performed by evaluating the correspondence between individual hidden neurons and a set of human-understandable visual semantic concepts. We also propose an Integrated Gradients-based, class-specific relevance mapping approach that accounts for the spatial position of the region of interest in the input image. Our relevance score verifies the interpretability scores in terms of neurons tuned to a particular concept or class. Further, we perform network ablation and measure the network's performance under our approach.
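The abstract's verification step relies on Integrated Gradients, which attributes a model's output to its input features by averaging gradients along a straight-line path from a baseline to the input. The sketch below is a minimal, self-contained illustration of that attribution formula using NumPy and a toy analytically differentiable function; it is not the paper's implementation, and the function names (`integrated_gradients`, `f_grad`) are illustrative assumptions.

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    f_grad   : callable returning the gradient of the model output w.r.t. its input
    x        : input vector to attribute
    baseline : reference input (e.g. an all-zeros "black" image)
    """
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1]
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        # Gradient evaluated at a point interpolated between baseline and input.
        total += f_grad(baseline + a * (x - baseline))
    avg_grad = total / steps
    # Attribution: (input - baseline) scaled by the path-averaged gradient.
    return (x - baseline) * avg_grad

# Toy model: F(x) = sum(x^2), whose analytic gradient is 2x.
f = lambda x: np.sum(x ** 2)
f_grad = lambda x: 2 * x

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(f_grad, x, baseline)

# Completeness axiom: attributions sum to F(x) - F(baseline).
print(np.isclose(attr.sum(), f(x) - f(baseline)))
```

For a real CNN, `f_grad` would be replaced by automatic differentiation of the class logit with respect to the input image (e.g. via a deep learning framework), and the per-pixel attributions form the relevance map described in the abstract.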

Supplementary and Associated Material: link to the GitHub code repository.


Keywords: Network Interpretation; Image Classification; Convolutional Neural Network; Integrated Gradient


Subject: Computer Science and Mathematics, Computer Vision and Graphics
