Data-Flow Analysis using Synchronized Pushdown Systems on Code Property Graphs

Supervisor(s): Konrad Weiss
Status: finished
Topic: Others
Author: Yuling Sun
Submission: 2020-10-15
Type of Thesis: Bachelor's thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching

Description

As the size and complexity of computer programs increase, so do the risk of security
vulnerabilities and the difficulty of finding them. Data-Flow Analysis, a kind of static
program analysis that is typically used for compiler optimization, is also applied to
vulnerability detection. Data-Flow Analysis concerns itself with the definitions and uses of
variables as well as with how their data is propagated through the program. This information can
be used by software developers and security analysts to better understand the behavior of a
program. Synchronized Pushdown Systems are a concept to encode data-flow analysis problems
in a flow-sensitive, context-sensitive and field-sensitive manner, albeit at the cost of some
false positives. Code Property Graphs are an intermediate representation of computer
programs that combines the Abstract Syntax Tree, the Control Flow Graph and the
Program Dependence Graph in a single representation. They can be used to perform
vulnerability analyses at the source-code level and thus do not require the program to
compile. This thesis aims to perform Data-Flow Analysis using Synchronized Pushdown
Systems on Code Property Graphs in order to improve the sensitivity of Data-Flow Analysis
on Code Property Graphs compared to previous research.
This thesis presents a method to build a Synchronized Pushdown System using
information extracted from the Code Property Graph and to perform Data-Flow Analysis on it.
It also presents an example implementation, which is evaluated by comparing its results with
the data-flow analysis results produced by another program as well as with a benchmark. The results
demonstrate that this is a viable approach to improving the sensitivity of Data-Flow Analysis
on Code Property Graphs, despite a decrease in accuracy. Further research will be necessary to increase
the accuracy of this approach.