Tutorial : Explainable and Robust Machine Learning for Images
By: Ghassan AlRegib, Mohit Prabhushankar, Gukyeong Kwon, and Jinsol Lee, Georgia Institute of Technology
Abstract: In recent years, artificial intelligence systems have achieved state-of-the-art performance in image classification tasks. Specifically, classification algorithms have surpassed the top-5 human error rate of 5.1% on ImageNet. Even though these advancements are promising, the images in these datasets do not cover the diversity of real-world scenarios. Real-world distortions include perceptually unpleasant camera-related issues such as blur, motion blur, overexposure, underexposure, and noise.
Moreover, environmental conditions such as rain, snow, and frost can affect the field of view. These non-ideal conditions impact the performance of artificial intelligence (AI) algorithms. Furthermore, gaining insight into the decisions made by an AI algorithm under such scenarios is crucial for building robust AI models. Over the past few years, there has been considerable progress in AI explainability. As an example, Grad-CAM has been widely used to visually justify the decision made by a classification network by answering 'Why P?', where P is its prediction. We add context and relevance to this question by answering 'Why P, rather than Q?', where Q is some contrast prediction. In some cases, such context can be more descriptive for interpretability. For instance, in autonomous driving applications that recognize traffic signs, knowing why a particular traffic sign was chosen over another is informative when analyzing decisions in case of accidents. This has a broader impact in applications such as medical and subsurface image analysis. Other modalities of the considered question include 'Why P, rather than P?' and 'Why P, rather than all other classes?'. We show that explicitly tying the inference to these questions makes neural networks more robust while also providing justifications that are contextually relevant. Specifically, we analyze robustness in the domains of recognition, out-of-distribution detection, novelty detection, open-set recognition, and image quality assessment.
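As a small illustration of the 'Why P?' style of explanation discussed above, the following sketch computes a Grad-CAM heat map for a pretrained PyTorch classifier. The choice of ResNet-50, the target layer, and the hook mechanics are illustrative assumptions rather than the tutorial's actual code; a contrastive 'Why P, rather than Q?' variant would backpropagate a signal contrasting the logits of P and Q instead of the logit of P alone.

# Hedged Grad-CAM sketch; model and layer choices are assumptions.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights="IMAGENET1K_V1").eval()
target_layer = model.layer4  # last convolutional block

activations, gradients = {}, {}
target_layer.register_forward_hook(
    lambda m, i, o: activations.update(value=o.detach()))
target_layer.register_full_backward_hook(
    lambda m, gi, go: gradients.update(value=go[0].detach()))

def grad_cam(image, class_idx=None):
    """Answer 'Why P?' with a class-discriminative heat map for one image."""
    logits = model(image)                        # image: (1, 3, H, W), normalized
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()  # the prediction P
    model.zero_grad()
    logits[0, class_idx].backward()
    # Global-average-pool the gradients to weight each activation channel.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)              # (1, 1, H, W) map in [0, 1]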
Tutorial : Compression of Immersive Media
By: Ioan Tabus, Tampere University, Finland
Abstract:
The tutorial considers the compression of the new data types arising in immersive technologies: light field data represented as an array of views, and point cloud data. The objective of the tutorial is to present the challenges raised by the compression of these new data types and to show some of the current solutions. We consider solutions to the common problem of predictive model structure selection for sparse modeling, which arises in the complex settings of plenoptic image compression and point cloud compression, both of which are currently major standardization topics in the JPEG and MPEG communities. We have developed the main 4D Prediction mode, which is at the core of the forthcoming JPEG Pleno Light Field standard, and have contributed to the standardization activities several schemes for the compression of both high-density camera-array images and plenoptic camera images. The regularities and similarities between neighboring angular views are exploited to achieve efficient compression within a flexible system offering desirable functionalities, such as a hierarchical organization that allows random access to the views and flexible interconnection with existing 2D image compression standards. We discuss architectural and algorithmic solutions to the modeling and compression problems, with examples from both plenoptic image compression and point cloud compression.
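To make the notion of predictive model structure selection for sparse modeling concrete, here is a minimal sketch of a greedy, orthogonal-matching-pursuit-style selection of neighboring views used to predict a target view. The neighbor layout, sparsity level, and toy data are assumptions for illustration, not the scheme adopted in the JPEG Pleno 4D Prediction mode.

# Hedged sketch: greedy sparse predictor selection for one light-field view.
import numpy as np

def greedy_sparse_predictor(neighbors, target, max_regressors=3):
    """Select a sparse subset of neighbor views and fit least-squares weights.

    neighbors : (K, N) array, each row a vectorized candidate regressor view
    target    : (N,)  array, the vectorized view to be predicted
    """
    residual = target.copy()
    chosen, weights = [], None
    for _ in range(max_regressors):
        # Pick the unused regressor most correlated with the current residual.
        scores = [abs(neighbors[k] @ residual) if k not in chosen else -1.0
                  for k in range(neighbors.shape[0])]
        chosen.append(int(np.argmax(scores)))
        # Re-fit all chosen weights jointly by least squares.
        A = neighbors[chosen].T                    # (N, |chosen|)
        weights, *_ = np.linalg.lstsq(A, target, rcond=None)
        residual = target - A @ weights
    return chosen, weights

# Toy usage: 8 candidate neighbor views of 64x64 pixels each.
rng = np.random.default_rng(0)
views = rng.standard_normal((8, 64 * 64))
target = 0.6 * views[2] + 0.4 * views[5] + 0.01 * rng.standard_normal(64 * 64)
idx, w = greedy_sparse_predictor(views, target, max_regressors=2)
print(idx, np.round(w, 2))  # expected to select views 2 and 5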
Tutorial : Subjective Data Collection
By: Seyed Ali Amirshahi, Norwegian University of Science and Technology (NTNU)
Abstract:
Many different methods and techniques in the fields of image processing and computer vision depend heavily on some form of subjective data. This ranges from ground-truth data collected for object detection or segmentation techniques to subjective scores collected for quality evaluation metrics, mainly in image quality, video quality, and aesthetic quality assessment. While our reliance on subjective data is good evidence of its importance, not enough attention has been paid to how such data is collected. In recent years, with the increasing use of state-of-the-art machine learning techniques, and more specifically convolutional neural networks, the need for large-scale datasets has introduced a new challenge in subjective data collection.
In this tutorial, we will first provide the audience with an in-depth overview of the use of subjective data and the different types of subjective data collected in different fields of research. We will then introduce different approaches for collecting subjective data, both in a controlled environment and on online platforms. At this stage, we will also discuss the different parameters to consider when collecting subjective data.
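As a small, hedged example of what happens after subjective scores are collected, the sketch below computes mean opinion scores (MOS) with 95% confidence intervals and applies a simplified observer-screening rule. The screening criterion is a stand-in assumption, not the full ITU-R BT.500 observer-rejection procedure a real study would follow.

# Hedged sketch: MOS and confidence intervals from raw subjective ratings.
import numpy as np

def mos_with_ci(ratings):
    """Per-stimulus MOS and 95% CI.

    ratings : (n_observers, n_stimuli) array of opinion scores, e.g. on a
              1-5 absolute category rating (ACR) scale.
    """
    mos = ratings.mean(axis=0)
    sem = ratings.std(axis=0, ddof=1) / np.sqrt(ratings.shape[0])
    return mos, 1.96 * sem  # normal-approximation 95% confidence interval

def screen_observers(ratings, min_corr=0.5):
    """Drop observers whose scores correlate weakly with everyone else's
    (a simplified stand-in for the BT.500 rejection procedure)."""
    keep = []
    for i in range(ratings.shape[0]):
        others = np.delete(ratings, i, axis=0).mean(axis=0)
        if np.corrcoef(ratings[i], others)[0, 1] >= min_corr:
            keep.append(i)
    return ratings[keep]

# Toy usage: 15 observers rate 10 stimuli whose true quality is shared.
rng = np.random.default_rng(1)
true_quality = rng.uniform(1, 5, size=10)
scores = np.clip(np.round(true_quality + rng.normal(0, 0.7, (15, 10))), 1, 5)
mos, ci = mos_with_ci(screen_observers(scores))
print(np.round(mos, 2), np.round(ci, 2))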
The tutorial will take advantage of the virtual nature of this year's EUVIP: the audience will have the chance to work in groups in breakout rooms to design different types of subjective tests and to participate in tests as observers. This approach not only gives attendees hands-on experience in designing a subjective experiment; by taking part in another group's experiment, they also gain a better understanding of the challenges faced by the observers in such studies.
Tutorial : DeepFake – Creation, Detection and Generalization
By: Kiran Raja, Norwegian University of Science and Technology (NTNU)
Abstract:
The ultra-realistic content creation capabilities of DeepFakes could endanger various societal functions, for instance by spreading propaganda that threatens democracy. Although there are only a few documented cases in which DeepFakes have had a significant negative impact on democratic processes, the threat cannot be overlooked. While a number of works for detecting DeepFakes have been proposed in the recent past, they are constantly challenged by newer DeepFake generation mechanisms. One of the critical challenges in devising a reliable DeepFake detection approach is generalizability across different datasets and different generation mechanisms.
In this tutorial, we will first provide the audience with an in-depth overview of the current state of the art in DeepFake generation, discussing the advantages and limitations of the various approaches and listing a few new directions for DeepFake generation. We will then proceed to detection, presenting approaches based on both classical machine learning and deep learning. We will supplement the tutorial with sample code for participants who wish to try out DeepFake detection.
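For readers who want a feel for the classical machine learning route before the tutorial, here is a hedged baseline that classifies face crops as real or fake from azimuthally averaged Fourier power spectra, in the spirit of frequency-artifact detectors. The feature choice, toy data, and classifier are assumptions that stand in for, rather than reproduce, the tutorial's sample code.

# Hedged sketch: spectral features + SVM as a classical DeepFake baseline.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def radial_spectrum(gray, n_bins=64):
    """1-D azimuthal average of the 2-D power spectrum of a grayscale crop."""
    f = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = f.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2).astype(int)
    counts = np.bincount(r.ravel())
    radial = np.bincount(r.ravel(), weights=f.ravel()) / np.maximum(counts, 1)
    bins = np.linspace(0, len(radial) - 1, n_bins).astype(int)
    return np.log1p(radial[bins])

# Toy usage with random stand-in "images"; real use needs aligned face crops
# from genuine and generated videos.
rng = np.random.default_rng(2)
real = [radial_spectrum(rng.standard_normal((128, 128))) for _ in range(50)]
fake = [radial_spectrum(1.1 * rng.standard_normal((128, 128))) for _ in range(50)]
X = np.vstack(real + fake)
y = np.array([0] * 50 + [1] * 50)  # 0 = real, 1 = fake
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
print(clf.score(X, y))  # training accuracy on the toy data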