Towards Robust, Trustworthy, and Explainable Computer Vision

8 am - 12 pm on Oct 11, 2021. ICCV Tutorial, Montreal, Canada.

[Instructions for joining our Panel Discussion]

This half-day tutorial introduces participants to aspects of computer vision models beyond performance: robustness, trustworthiness, and explainability.


08:00 AM Talk 1: "Explaining Model Decisions and Fixing Them via Focused Feedback" by Ramprasaath R. Selvaraju (Slides, Video)
08:45 AM Talk 2: "Characterizing Bias and Developing Trustworthy AI Models" by Sara Hooker (Slides, Video)
09:30 AM Talk 3: "Human-centric AI for Computer Vision and Machine Autonomy" by Bolei Zhou (Slides, Video)
10:15 AM Talk 4: "Adversarially Robust Models as Visual Priors" by Aleksander Madry (Slides, Video)
11:00 AM Panel Discussion


Ramprasaath R. Selvaraju
Salesforce Research
Bolei Zhou
Chinese University of Hong Kong --> UCLA
Sara Hooker
Google Research


Convolutional Neural Networks (CNNs) and other deep networks have enabled unprecedented breakthroughs in a variety of computer vision tasks, from image classification to object detection, semantic segmentation, image captioning, visual question answering, and visual dialog. While these models deliver superior performance, they cannot be decomposed into individually intuitive components, which makes them hard to interpret. Consequently, when today’s intelligent systems fail, they often fail spectacularly, without warning or explanation, leaving a user staring at an incoherent output and wondering why the system did what it did. To build trust in intelligent systems and move towards their meaningful integration into our everyday lives, we must build 'transparent' models that can explain why they predict what they predict.

This tutorial will introduce participants to different aspects of computer vision models beyond performance. Ramprasaath R. Selvaraju will focus on explainable-AI methodologies and how understanding the decision process helps fix various characteristics of the model. Sara Hooker will address the trustworthiness and social impact of vision models. Bolei Zhou will focus on the interactive aspect of dissected vision models and its implications for visual editing applications. Aleksander Madry will focus on the robustness of vision models. Together, these talks unify perspectives beyond test-set performance that are just as important for vision models.

Speaker Relevance

The tutorial lectures will be given by several well-known researchers who specialize in computer vision and in topics related to the explainability, fairness, generalization, and robustness of visual models. For example, Dr. Selvaraju has worked on generating visual explanations for decisions emanating from any deep network, in order to debug and diagnose network errors, enable knowledge transfer between humans and AI, and correct unwanted biases that a network may learn during training. Prof. Zhou has published several works on visualizing and interpreting the semantic units of deep neural networks for both discriminative and generative models. Prof. Madry has done much work on identifying biases learned by deep models, has introduced several benchmarks to evaluate the robustness of vision models, and has worked on adversarial machine learning. Sara Hooker has worked on benchmarking interpretability techniques and on understanding the biases introduced during network compression, in order to build fair and trustworthy AI systems.

We believe that this tutorial will not only give the vision community an educational crash course on explainable, robust, and trustworthy AI, but also inspire deeper thinking about the visual models we are training.