Generative Dataset Preprocessing for Model Explainability

Supervisor(s):	Nicolas Müller
Status:	finished
Topic:	Others
Author:	Jochen Jacobs
Submission:	2022-10-17
Type of Thesis:	Master's Thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching
Description

When training a machine learning model, the quality of the training data is of utmost importance. Poor training data can lead to bad generalization of the trained model. A particular issue with training data is the presence of shortcuts: features that are highly predictive of the training target but semantically unconnected to the problem. A model trained on such data often learns to rely only on the shortcut, as no incentive to learn further features exists. Because test data is typically collected the same way as the training data, and thus contains the same shortcuts, the model's reliance on shortcuts often remains undiscovered.

This thesis presents an approach to detect these shortcuts in training data and automatically neutralize them: a small image-to-image network, called "lens", is prepended to the classifier network. The lens is trained adversarially to remove the features that the classifier is paying attention to. The limited capacity of the lens and an additional reproduction loss ensure that only simple and local features can be removed. The output of the lens also gives visual feedback on which features are removed. The model is evaluated on data with synthetically added shortcuts as well as on a real-world chest X-ray dataset. We find that the lens reliably detects, in-paints, and neutralizes shortcuts. At the same time, classifier performance is not impacted if the data does not contain shortcuts.
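The interplay between the adversarial term and the reproduction loss can be illustrated with a small sketch. This is not the thesis implementation: the mean-squared-error reproduction loss, the cross-entropy classifier loss, the subtraction used to combine them, and the names `reproduction_loss`, `cross_entropy`, `lens_objective`, and `adv_weight` are all assumptions made for illustration. Images are represented as flat lists of pixel values.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under the classifier's probabilities."""
    return -math.log(max(probs[label], 1e-12))


def reproduction_loss(original, lensed):
    """Mean squared pixel difference between the input image and the lens output."""
    return sum((o - l) ** 2 for o, l in zip(original, lensed)) / len(original)


def lens_objective(original, lensed, probs, label, adv_weight=1.0):
    """Hypothetical lens loss (assumed form, not taken from the thesis).

    The lens is rewarded for staying close to the input (low reproduction loss)
    and for making the classifier's prediction worse (high cross-entropy).
    """
    return reproduction_loss(original, lensed) - adv_weight * cross_entropy(probs, label)


# Toy image with four pixels; pixel 2 plays the role of a shortcut.
image = [0.2, 0.8, 1.0, 0.3]
edited = [0.2, 0.8, 0.0, 0.3]  # lens output with the shortcut pixel in-painted

# Classifier is confident while the shortcut is present, unsure once it is removed.
keep_shortcut = lens_objective(image, image, probs=[0.1, 0.9], label=1)
remove_shortcut = lens_objective(image, edited, probs=[0.5, 0.5], label=1)
print(remove_shortcut < keep_shortcut)
```

In this toy setting, editing a single pixel costs little reproduction loss but strongly reduces the classifier's confidence, so the assumed lens objective favors removing the shortcut. Larger edits would raise the reproduction term, which is the intuition behind restricting the lens to simple, local changes.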
