Transferring Anomaly Detection to Audio Spoofing Detection

Supervisor(s): Nicolas Müller
Status: finished
Topic: Anomaly Detection
Author: Franziska Dieckmann
Submission: 2022-01-17
Type of Thesis: Master's thesis
Thesis topic in cooperation with the Fraunhofer Institute for Applied and Integrated Security (AISEC), Garching

Description

The term deepfake describes synthetically generated media content, often created with
malicious intent. Audio deepfakes are the sub-genre concerned with spoofed audio
recordings. Advances in research, computational power, and ever-growing datasets lead to
better and more convincing deepfakes. This, in turn, invites misuse such as fraud or
defamation, which underlines the importance of countermeasures. Current research focuses
almost exclusively on the detection of deepfakes, i.e. "Is this media content
computer-generated or real?". But, as in other subfields of cybersecurity, knowing
"Who attacked me?" is just as essential for setting up defenses. This thesis therefore
sets out to identify the attacker, i.e. the creator of an audio deepfake. We present
both traditional and machine-learning-based methods for creating a so-called attacker
signature, which is unique to each attacker. These methods are then evaluated on two
large audio deepfake datasets: one where the attackers are known (ASVspoof19) and one
where they are not (ASVspoof21). We observe that the traditional approach is not
suitable for creating attacker signatures, i.e. for identifying attackers. An
embedding-based model, on the other hand, is able to form clusters for each attacker
even when the audio recordings or the attacker are unknown. In summary, this means it
is possible to identify who created an audio deepfake.
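To make the clustering idea more concrete, below is a minimal, hypothetical sketch (not the thesis's actual model or pipeline): assuming per-recording embeddings produced by some attacker-signature model, it clusters them with scikit-learn's KMeans and checks how well the clusters agree with known attacker labels, as would be possible on ASVspoof19, where attacker identities are available. The number of attackers, the embedding dimension, and the synthetic embeddings themselves are illustrative assumptions.

# Minimal sketch (not the thesis's actual pipeline): cluster hypothetical
# attacker-signature embeddings and compare the clusters to known attacker labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 6 attackers, 200 recordings each,
# 64-dimensional embeddings drawn around per-attacker centroids.
n_attackers, per_attacker, dim = 6, 200, 64
centroids = rng.normal(0, 5, size=(n_attackers, dim))
embeddings = np.concatenate(
    [rng.normal(c, 1.0, size=(per_attacker, dim)) for c in centroids]
)
labels = np.repeat(np.arange(n_attackers), per_attacker)

# Cluster the embeddings without using the labels.
pred = KMeans(n_clusters=n_attackers, n_init=10, random_state=0).fit_predict(embeddings)

# Agreement between clusters and true attacker identities;
# a score near 1.0 means each cluster corresponds to one attacker.
print("Adjusted Rand index:", adjusted_rand_score(labels, pred))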