Vulnerability Detection in Source Code using Graph Neural Networks on Code Property Graphs

Supervisor(s):	Daniel Kowatsch, Tobias Specht
Status:	finished
Topic:	Others
Author:	Philip Haitzer
Submission:	2025-04-14
Type of Thesis:	Master's Thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching
Description:	This thesis introduces a modular Graph Neural Network (GNN) framework for vulnerability detection in C/C++ source code using Code Property Graphs (CPGs), supporting architectures such as GraphSAGE and GAT. The framework is first pre-trained in a self-supervised manner with feature masking on a diverse Debian codebase and then fine-tuned for node-level classification of CWE-457 (Use of Uninitialized Variable) on the Juliet test suite. The evaluation addresses the effectiveness of the GNN approach and the specific benefit derived from pre-training. Experiments with GraphSAGE demonstrate considerable effectiveness in detecting CWE-457 patterns. Notably, pre-training improved performance on the test set, primarily by raising classifier precision with minimal impact on recall, which resulted in a higher overall F1-score. Qualitative analysis shows that pre-training fosters more structured embeddings, but it also reveals increased sensitivity to code context. The findings affirm the viability of GNNs on CPGs for vulnerability detection and demonstrate a clear, positive impact of the employed pre-training strategy for this specific downstream task.
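The description does not give implementation details, so the following is only a minimal sketch of the two ideas it names: feature masking as a self-supervised pre-training signal and a GraphSAGE-style mean aggregation over a node's neighbors. All function names, the toy graph, and the mask rate are hypothetical, and the aggregation step has no learned weights. The thesis framework itself presumably relies on a GNN library and real CPG node features.

```python
import random


def mask_node_features(features, mask_rate=0.15, mask_value=0.0, seed=0):
    """Hide a random subset of feature entries (hypothetical helper).

    Returns the masked copy and the list of (node, feature) positions that
    were hidden; a pre-training objective would try to reconstruct them.
    """
    rng = random.Random(seed)
    masked = [row[:] for row in features]
    positions = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < mask_rate:
                positions.append((i, j))
                row[j] = mask_value
    return masked, positions


def sage_mean_layer(features, neighbors):
    """One GraphSAGE-style step without learned weights (illustrative only).

    Each node's output is its own feature vector concatenated with the mean
    of its neighbors' feature vectors.
    """
    out = []
    for i, row in enumerate(features):
        nbrs = neighbors.get(i, [])
        if nbrs:
            mean = [
                sum(features[n][k] for n in nbrs) / len(nbrs)
                for k in range(len(row))
            ]
        else:
            mean = [0.0] * len(row)
        out.append(row + mean)
    return out


def reconstruction_loss(predicted, original, positions):
    """Mean squared error, computed over the masked positions only."""
    if not positions:
        return 0.0
    total = sum((predicted[i][j] - original[i][j]) ** 2 for i, j in positions)
    return total / len(positions)


# Toy stand-in for a CPG: three nodes with two features each (assumed data).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_neighbors = {0: [1, 2], 1: [0], 2: [0]}

masked_features, masked_positions = mask_node_features(toy_features, mask_rate=0.5)
embeddings = sage_mean_layer(masked_features, toy_neighbors)
```

In a full pipeline, a trainable decoder would predict the hidden entries from `embeddings`, and `reconstruction_loss` would drive pre-training before a classification head is fine-tuned on labeled nodes.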
