Layered Android Malware Detection Using Program Dependence Graph Embedding and Manifest Features


Supervisor(s): Peng Xu
Status: finished
Topic: Machine Learning Methods
Author: Asbat El Khairi
Submission: 2020-02-14
Type of Thesis: Master's thesis
Proof of Concept useful

Abstract:

As Android’s popularity has grown, so has the volume of malware targeting it, which is arguably one of the most pressing problems on mobile platforms. Various approaches to detecting Android malware have recently been introduced; most rely on the contextual data of Android applications, mainly dangerous permissions and sensitive hardware features. While these approaches add an extra security layer to the Android platform, they fail to demonstrate efficiency and robustness against bytecode-level obfuscation. In this work, we explore how combining structural and contextual analysis of Android applications can be used to cope with the growing sophistication of Android malware. We therefore present a multi-layer approach that uses machine learning, natural language processing (NLP), and graph embedding techniques to address the threat of Android malware. Specifically, the first layer of our detection approach acts on the application’s properties declared in the Manifest file, whereas the second layer operates on the structural relationships in the application’s code. Large-scale experiments on 30,113 malware samples show that the context-based approach yields an accuracy of 91%, which is nearly comparable to state-of-the-art techniques, while the structure-based method attains an accuracy of 99%, which outperforms various related works. Finally, for optimum Android malware detection, we introduce a hook-based anti-malware application that uses the complementary strengths of our multi-layer approach to scan applications before installation.
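
To make the first (context-based) layer more concrete, the following is a minimal sketch of how permissions and hardware features declared in a Manifest file could be turned into a binary feature vector for a classifier. It is an illustration only, not the thesis's implementation: the function names, the sample manifest, and the small vocabulary are hypothetical, and the sketch assumes the manifest has already been decoded to plain-text XML (manifests inside an APK are stored in a binary XML format).

```python
import xml.etree.ElementTree as ET

# The Android XML namespace; attributes such as android:name live under it.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"


def extract_manifest_features(manifest_xml):
    """Collect the names declared in <uses-permission> and <uses-feature>.

    Hypothetical helper: assumes `manifest_xml` is decoded, plain-text XML.
    """
    root = ET.fromstring(manifest_xml)
    features = set()
    for tag in ("uses-permission", "uses-feature"):
        for element in root.iter(tag):
            name = element.get(NAME_ATTR)
            if name:
                features.add(name)
    return features


def to_binary_vector(features, vocabulary):
    """Encode the feature set as 0/1 indicators over a fixed vocabulary."""
    return [1 if term in features else 0 for term in vocabulary]


# Illustrative input, not taken from the thesis dataset.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
    <uses-permission android:name="android.permission.SEND_SMS"/>
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-feature android:name="android.hardware.camera"/>
</manifest>"""

# Assumed vocabulary; a real one would be built from the training corpus.
VOCABULARY = [
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
    "android.hardware.camera",
]

sample_features = extract_manifest_features(SAMPLE_MANIFEST)
sample_vector = to_binary_vector(sample_features, VOCABULARY)
```

A vector of this kind could then be passed to any standard classifier. Because it depends only on Manifest declarations, it is unaffected by changes to the bytecode, but it also carries no information about code structure, which is the gap the second, structure-based layer is meant to cover.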