Description
With the steadily growing size of modern shared libraries and operating systems (OSes), the threat of attackers exploiting code gadgets in software bloat increases significantly. This applies in particular to stale artifacts in the code base that are no longer maintained and may expose vulnerabilities that attackers can exploit to take over the system. Furthermore, accumulated software bloat provides a multitude of code gadgets, giving adversaries a broader attack surface for mounting code-reuse attacks (CRAs) against shared libraries or OSes. Likewise, most OS kernels expose a vast collection of system calls that far exceeds the set of services actually requested by user space applications. By removing unused code from the code base, we can reduce the number of vulnerabilities and code gadgets, which in turn improves security. State-of-the-art solutions aim to reduce code bloat by identifying and pruning the unused paths for an entry point in the program’s control-flow graph (CFG). However, they are either too coarse-grained or do not consider the program states across successive calls to library entry points.
In this thesis, we present a novel approach to statically determine a fine-grained call graph for program libraries. We extend ICARUS, a static analysis tool developed at the I20 Chair, with a flow- and context-sensitive analysis. ICARUS takes user-provided arguments to application or program library entry points and performs an inter-procedural constant propagation analysis to determine the set of functions reachable in the analyzed program. For program libraries with multiple entry points, including OS kernels, we preserve the program states between successive entry point analyses, which enables a stateful and context-sensitive analysis across program routines. This allows ICARUS to eliminate dead code and generate a fine-grained call graph that is used to debloat the analyzed programs.
The evaluation of our prototype shows promising accuracy and performance on both user space applications and program libraries with multiple entry points. Although the proposed technique does not yet scale to the entire Linux kernel code base, we were able to confirm that it is a first step towards a stateful flow- and context-sensitive analysis of commodity OS kernels.
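To illustrate the idea of a stateful analysis across entry points, the following is a minimal sketch in Python. It is not ICARUS's implementation: the toy statement format, the function names (`lib_init`, `lib_process`, `fast_path`, `legacy_path`, `legacy_helper`) and the helper `reachable_functions` are all hypothetical and chosen only for this example. The sketch propagates constants from user-provided entry point arguments, follows only the branch a known condition selects, and optionally carries global state from one entry point analysis to the next.

```python
UNKNOWN = None  # lattice top: the value is not known to be a single constant


def eval_expr(expr, local_vars, global_vars):
    """Evaluate an int constant, a local name, or a ("global", name) reference."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, tuple) and expr[0] == "global":
        return global_vars.get(expr[1], UNKNOWN)
    return local_vars.get(expr, UNKNOWN)


def join(a, b):
    """Merge two states: keep a value only where both branches agree."""
    return {k: a.get(k) if a.get(k) == b.get(k) else UNKNOWN
            for k in a.keys() | b.keys()}


def analyze_block(program, block, local_vars, global_vars, reachable, stack):
    for stmt in block:
        kind = stmt[0]
        if kind == "assign":
            local_vars[stmt[1]] = eval_expr(stmt[2], local_vars, global_vars)
        elif kind == "setglobal":
            global_vars[stmt[1]] = eval_expr(stmt[2], local_vars, global_vars)
        elif kind == "call":
            args = [eval_expr(a, local_vars, global_vars) for a in stmt[2]]
            analyze_function(program, stmt[1], args, global_vars, reachable, stack)
        elif kind == "if":
            _, cond, then_block, else_block = stmt
            value = eval_expr(cond, local_vars, global_vars)
            if value is not UNKNOWN:
                # Condition is a known constant: only one branch is reachable.
                taken = then_block if value != 0 else else_block
                analyze_block(program, taken, local_vars, global_vars, reachable, stack)
            else:
                # Condition unknown: analyze both branches on copies, then merge.
                l1, g1 = dict(local_vars), dict(global_vars)
                l2, g2 = dict(local_vars), dict(global_vars)
                analyze_block(program, then_block, l1, g1, reachable, stack)
                analyze_block(program, else_block, l2, g2, reachable, stack)
                merged_l, merged_g = join(l1, l2), join(g1, g2)
                local_vars.clear()
                local_vars.update(merged_l)
                global_vars.clear()
                global_vars.update(merged_g)


def analyze_function(program, name, args, global_vars, reachable, stack=()):
    reachable.add(name)
    if name in stack:
        return  # simplification of this sketch: recursive calls are cut off
    params, body = program[name]
    local_vars = dict(zip(params, args))
    analyze_block(program, body, local_vars, global_vars, reachable, stack + (name,))


def reachable_functions(program, entry_calls, stateful=True):
    """Analyze entry points in order; if stateful, globals persist between them."""
    reachable, global_vars = set(), {}
    for name, args in entry_calls:
        if not stateful:
            global_vars = {}  # forget everything learned from earlier entry points
        analyze_function(program, name, args, global_vars, reachable)
    return reachable


# Hypothetical library: name -> (parameter names, body).
PROGRAM = {
    "lib_init": (["mode"], [("setglobal", "mode", "mode")]),
    "lib_process": ([], [("if", ("global", "mode"),
                          [("call", "fast_path", [])],
                          [("call", "legacy_path", [])])]),
    "fast_path": ([], []),
    "legacy_path": ([], [("call", "legacy_helper", [])]),
    "legacy_helper": ([], []),
}

calls = [("lib_init", [1]), ("lib_process", [])]
stateful = reachable_functions(PROGRAM, calls, stateful=True)
stateless = reachable_functions(PROGRAM, calls, stateful=False)
removable = set(PROGRAM) - stateful
```

In this toy library, the stateful run remembers that `lib_init(1)` set the global `mode`, so `lib_process` only reaches `fast_path`, and `legacy_path` and `legacy_helper` end up in `removable`. The stateless run has to treat `mode` as unknown and keeps every function, which corresponds to the coarser result the abstract attributes to analyses that ignore state between entry point calls.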