TUM Logo

Incremental Construction of Code Property Graphs

Incremental Construction of Code Property Graphs

Supervisor(s): Konrad Weiss, Dr. Julian Schütte
Status: open
Topic: Others
Type of Thesis: Masterthesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching

Description

In cooperation with Fraunhofer AISEC

Incremental Construction of Code Property Graphs

Announcement: Master’s Thesis

Motivation and Topic

A Code Property Graph represents source code in a labelled directed multigraph. Various abstract representations of source code are merged into one supergraph. An Abstract Syntax Tree captures the Codes structure. A Control Flow Graph and Evaluation Order Graph represent the order in which statements or expressions are evaluated. Data Dependency Edges show the conditional and data dependencies between statements. Graph mining techniques allow storing large Code Property Graphs into a database. Graph traversals allow us to search through the supergraph and find patterns of structure and behavior.

This allows searching for known security vulnerabilities by expressing them as graph traversal patterns. In contrast to static analysis based on intermediate representations, no compiled binaries are needed, and incomplete or snippets of code can be analyzed.

When analyzing code as part of an editor plugin, the construction of the entire graph upon small changes is a time consuming and threatens the applicability. Adapting the graph construction to rebuild only the necessary parts of the graph upon a change is a promising approach to make real-time applications viable.

The goal of this thesis is first to research the current state of the art of incremental static analysis and the abstract representations used for Code Property Graphs. The scope will be limited to adapt an existing implementation for CPG construction to an incremental approach. In a second step, the incremental construction variant has to be compared to the monolithic approach in effectiveness and efficiency.

Requirements

Programming skills in Java
An understanding of Java and C++
Interest in static code analysis, graphs and programming languages

Ability to work self-directed and systematically

The thesis can be written in English or German.

Contact

Fraunhofer Institute for Applied and Integrated Security (AISEC)

Konrad Weiss, Dr. Julian Schütte
E-Mail: konrad.weiss@aisec.fraunhofer.de
Phone: +49 89 32299 86 113