Analyzing Malware Detection Models with Explainable AI

Supervisor(s):	Daniel Kowatsch
Status:	finished
Topic:	Others
Author:	Achraf Flah
Submission:	2024-03-31
Type of Thesis:	Guided Research
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching
Description:	Malware, short for malicious software, is one of the most harmful threats in the digital world. It can cause severe damage, ranging from Denial of Service (DoS) to theft of sensitive data. Consequently, malware detection is a crucial area of cyber security. Malware is also hard to detect, as malware development and obfuscation techniques grow more advanced. Machine Learning (ML) methods, on the other hand, have shown great capability for automating and improving malware detection. However, because they are used in a critical area, ML-based methods should be transparent to security experts, yet most of them, especially deep learning methods, are known for their opacity. To mitigate this challenge, explainable artificial intelligence (XAI) methods were developed with the aim of interpreting ML models. In this work, we survey ML-based methods for malware detection and classification, as well as XAI methods that can analyze them. We then use XAI to analyze multiple models from different model families. The analysis explains the models' decisions, captures unusual behavior, and uncovers the reasons for the models' failures. We show that, given an appropriate configuration and depending on the model and its specific use case, XAI techniques can be employed for model analysis. This analysis facilitates interpretation of a model's results, identifies failure points for debugging, and yields recommendations for improving its overall performance. For instance, it revealed that the model uses the compilation date of the binary as an indicator. Furthermore, the analysis can support malware analysis by highlighting the most interesting parts of the malware, such as the code blocks responsible for malicious activities.
