Automated Fuzz Target Generation for C Libraries

Supervisor(s):	Dieter Schuster, Julian Horsch
Status:	finished
Topic:	Others
Author:	Jonas Bogenberger
Submission:	2020-01-15
Type of Thesis:	Bachelorthesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching
Description
Fuzzing is a widely used method for testing information technology products, mostly software, and for penetration testing them. Fuzzers can be roughly divided into two types: fuzzers that use semantic descriptions of the targeted interface, and guided fuzzers, such as American Fuzzy Lop or LLVM libFuzzer, which typically measure code coverage to assess the quality of fuzz inputs. While semantic fuzzing requires considerable preparation and knowledge about the target, guided fuzzers can sometimes work without any preparation. Nonetheless, fuzzing a library still requires the manual design of a fuzz target, or fuzz harness, which translates generic fuzz input into calls into the targeted library.
The goal of this work was to design and implement automatic fuzz target generation for C libraries, using header files as input and thereby speeding up and simplifying the manual process required for fuzzing libraries. For this purpose, a concept capable of providing the required data to a harness-generating program was developed. This included designing the generic structure of an automatically generated harness, automatically parsing and analyzing a library's header file, translating the fuzzer's input into the arguments of the fuzzed functions, and proposing different variants of the generated harness. Furthermore, the harness generator was implemented according to the developed concept, which yielded insights into applying the concept in the programming language C, as well as into its limitations. The harness generator is able to automatically generate a fuzzable harness for a targeted library, requiring only the header files of that library as input.
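To illustrate what translating generic fuzz input into the arguments of a fuzzed function can look like, the following is a minimal sketch of a libFuzzer-style harness. It is not taken from the thesis or its generator: `target_checksum` is a hypothetical stand-in for a function declared in a library header, and the `fuzz_input` cursor, `take_bytes` and `call_from_fuzz_input` are names assumed here for illustration. Only the entry point `LLVMFuzzerTestOneInput` is the actual libFuzzer interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a function declared in the target's header.
 * A real harness would include the header and link against the library. */
static int target_checksum(const uint8_t *buf, size_t len, uint32_t seed)
{
    uint32_t sum = seed;
    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return (int)(sum & 0x7fffffffu);
}

/* Cursor over the raw fuzz input (illustrative helper, not a libFuzzer API). */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
} fuzz_input;

/* Copy n bytes from the fuzz input into out.
 * Returns 1 on success, 0 if the input has fewer than n bytes left. */
static int take_bytes(fuzz_input *in, void *out, size_t n)
{
    assert(in->pos <= in->size); /* internal invariant of the cursor */
    if (in->size - in->pos < n)
        return 0;
    memcpy(out, in->data + in->pos, n);
    in->pos += n;
    return 1;
}

/* Build the arguments of target_checksum from the fuzz input and call it.
 * Assumed layout: 4 bytes seed, 1 byte length, then `length` bytes of data.
 * Returns 1 if the target was called, 0 if the input was too short. */
static int call_from_fuzz_input(const uint8_t *data, size_t size, int *result)
{
    fuzz_input in = { data, size, 0 };
    uint32_t seed;
    uint8_t len;
    uint8_t buf[255];

    if (!take_bytes(&in, &seed, sizeof seed))
        return 0;
    if (!take_bytes(&in, &len, sizeof len))
        return 0;
    if (!take_bytes(&in, buf, len))
        return 0;

    *result = target_checksum(buf, len, seed);
    return 1;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int result;
    (void)call_from_fuzz_input(data, size, &result);
    return 0; /* inputs that are too short are simply skipped */
}
```

With clang, such a file would typically be built with `-fsanitize=fuzzer,address`. Rejecting inputs that are too short, rather than reading past the end of the buffer, is one way to keep crashes inside the harness itself from being reported as findings in the library.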
During evaluation, the performance of the generated harnesses, and thus of the harness generator, was measured by comparing the coverage achieved when fuzzing different target libraries with generated and with handwritten harnesses. This comparison showed that generated and handwritten harnesses achieved equal coverage. However, fuzzing with generated harnesses produced a large number of unique crashes, some of which occurred in the harness rather than in the targeted library and are therefore false positives. On the other hand, a number of crashes were found to be caused by real bugs in the targeted library. In addition, the number of libraries that can be fuzzed with a generated harness is significantly reduced by the limitations of relying solely on header files, as well as by further C-specific features.
