Automated Fuzz Target Generation for C Libraries

Supervisor(s): Dieter Schuster, Julian Horsch
Status: finished
Topic: Others
Author: Jonas Bogenberger
Submission: 2020-01-15
Type of Thesis: Bachelor's thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching

Description

Fuzzing is a widely used method for testing information technology products, mostly software,
and for penetration testing them. Fuzzers can roughly be divided into two types: fuzzers that
use semantic descriptions of the targeted interface, and guided fuzzers, such as American
Fuzzy Lop or LLVM libFuzzer, which typically measure code coverage to assess the quality of
fuzz inputs. While semantic fuzzing requires considerable preparation and knowledge about the
target, guided fuzzers can sometimes work without any preparation. Nonetheless, fuzzing a
library still requires the manual design of a fuzz target, or fuzz harness, which translates
generic fuzz input into calls into the targeted library. The goal of this work was the design
and implementation of an automatic fuzz target generator for C libraries that uses header
files as input, thus speeding up and simplifying the otherwise manual process required for
fuzzing libraries.
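
To make the notion of a fuzz target concrete, the following is a minimal handwritten sketch for LLVM libFuzzer. Only the entry point `LLVMFuzzerTestOneInput` is part of libFuzzer; the function `count_digits` is a hypothetical stand-in for a library function and does not come from the thesis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical library function (an assumption for illustration):
 * counts the ASCII digits in a byte buffer. */
static size_t count_digits(const uint8_t *buf, size_t len) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= '0' && buf[i] <= '9') {
            n++;
        }
    }
    return n;
}

/* libFuzzer calls this entry point once per generated input.
 * The harness forwards the raw bytes to the library call. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t digits = count_digits(data, size);
    /* Simple sanity check: the count can never exceed the input length. */
    assert(digits <= size);
    return 0; /* libFuzzer expects 0 from the callback */
}
```

With Clang, such a file would typically be built with `-fsanitize=fuzzer`, which links in the libFuzzer driver that supplies `main`.
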
For this purpose, a concept capable of providing the required data to a harness-generating
program was developed. This included designing the generic structure of an automatically
generated harness, automatically parsing and analyzing a library’s header file, translating
the fuzzer’s input into the arguments of the fuzzed functions, and proposing different
variants of the generated harness. Furthermore, the harness generator was implemented
according to this concept, which yielded insights into applying the concept to the
programming language C, as well as into its limitations.
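
The translation of fuzz input into function arguments might look roughly like the sketch below. It is not the structure used in the thesis: the `fuzz_input` cursor, the `consume` helper, the selector byte, and the two target functions `saturating_add` and `low_bits` are all hypothetical names chosen for illustration. The idea shown is that one byte selects a function and the following bytes are copied into typed arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cursor over the raw fuzz input (hypothetical helper type). */
typedef struct {
    const uint8_t *data;
    size_t remaining;
} fuzz_input;

/* Copy n bytes into out. Returns 1 on success, 0 if too few bytes remain. */
static int consume(fuzz_input *in, void *out, size_t n) {
    if (in->remaining < n) {
        return 0;
    }
    memcpy(out, in->data, n);
    in->data += n;
    in->remaining -= n;
    return 1;
}

/* Hypothetical library functions, as they might be declared in a header. */
static int32_t saturating_add(int32_t a, int32_t b) {
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

static uint32_t low_bits(uint32_t value, uint8_t count) {
    if (count >= 32) return value;
    return value & ((UINT32_C(1) << count) - 1u);
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    fuzz_input in = { data, size };
    uint8_t selector;
    if (!consume(&in, &selector, sizeof selector)) return 0;

    switch (selector % 2) { /* first byte picks the function to call */
    case 0: {
        int32_t a, b;
        if (!consume(&in, &a, sizeof a) || !consume(&in, &b, sizeof b)) return 0;
        int32_t ab = saturating_add(a, b);
        int32_t ba = saturating_add(b, a);
        assert(ab == ba); /* simple oracle: addition is commutative */
        break;
    }
    case 1: {
        uint32_t value;
        uint8_t count;
        if (!consume(&in, &value, sizeof value) ||
            !consume(&in, &count, sizeof count)) return 0;
        (void)low_bits(value, count);
        break;
    }
    }
    return 0;
}
```

Inputs that are too short for the selected signature are simply rejected, so the harness never reads past the end of the fuzz data.
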
The harness generator is able to automatically generate a fuzzable harness for a targeted
library, requiring only the library’s header files as input. During evaluation, the
performance of the generated harnesses, and thus of the harness generator, was measured by
comparing the coverage achieved during fuzzing with generated and handwritten harnesses for
different targeted libraries. This comparison showed that generated and handwritten harnesses
achieved equal coverage. However, fuzzing with generated harnesses produced a vast number of
unique crashes, some of which occurred in the harness instead of the targeted library and are
therefore false positives. On the other hand, a number of crashes were found to be caused by
real bugs in the targeted library. In addition, the number of libraries that can be fuzzed
using a generated harness is significantly reduced by the limitations of relying solely on
header files, as well as by further C-specific features.
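
The abstract does not name the specific limitations, so the following sketch only illustrates one plausible example of information that a C header does not carry. The prototype `void fill_pattern(uint8_t *buf, size_t len)` is hypothetical; nothing in such a declaration says that `len` must not exceed the size of `buf`. A harness that drew the two arguments independently from the fuzz input could overflow the buffer without any defect in the library. The sketch assumes the relation is known and bounds `len` by the allocation size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BUF 256 /* harness-side buffer size (an assumption) */

/* Hypothetical library function. Its prototype alone does not reveal
 * that len must be at most the number of bytes buf points to. */
static void fill_pattern(uint8_t *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        buf[i] = (uint8_t)(i & 0xFF);
    }
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1) return 0;

    /* One input byte yields a length in 0..255, always below MAX_BUF,
     * so the call below stays inside the allocation. */
    size_t len = data[0];

    uint8_t *buf = malloc(MAX_BUF);
    if (buf == NULL) return 0;

    fill_pattern(buf, len);
    /* Check the last written byte to confirm the call did its work. */
    assert(len == 0 || buf[len - 1] == (uint8_t)((len - 1) & 0xFF));

    free(buf);
    return 0;
}
```

Encoding such a relation requires knowledge from documentation or source code, which a generator working only from declarations does not have.
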