Communicating through Firewalls by using Reinforcement Learning

Supervisor(s): Konstantin Böttinger, Dieter Schuster
Status: finished
Topic: Others
Author: Daniel Fomin
Submission: 2022-04-15
Type of Thesis: Master's thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching


Testing whether a firewall policy correctly filters out prohibited
packets is a difficult task that requires considerable skill and
experience. In this thesis, we propose a novel approach to
automatically probe firewall policies for weaknesses so that they can be
fixed by a system administrator. This approach involves training
reinforcement learning models to find different firewall evasion
strategies by probing a set of existing firewall policies. The goal of
our approach is to simplify and accelerate firewall policy testing.

We implemented a proof of concept for this approach that supports both
single-policy and multi-policy learning modes and can probe UDP-based
firewall policies. For the evaluation, we focused on the single-policy
mode of the implementation; first insights into the feasibility of the
multi-policy learning mode are given as well. The evaluation compares
the performance of the trained reinforcement learning models with
brute-force baselines. Additionally, we present a performance
comparison with results from related work.
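
To make the comparison between a learned prober and a brute-force
baseline concrete, the following is a minimal sketch only. It is not the
thesis implementation: it uses an epsilon-greedy bandit as a
deliberately simplified stand-in for the reinforcement learning models,
and the toy policy, the port lists, and all function names
(toy_policy_allows, train_bandit, brute_force_baseline) are hypothetical
assumptions introduced for illustration.

```python
import random

# Hypothetical toy policy (not from the thesis): a UDP packet passes only if
# it targets destination port 53 from a non-privileged source port.
def toy_policy_allows(src_port, dst_port):
    return dst_port == 53 and src_port >= 1024

# Hypothetical, deliberately tiny action space of (src_port, dst_port) probes.
SRC_PORTS = [53, 123, 1024, 40000]
DST_PORTS = [22, 53, 80, 161, 443]
ACTIONS = [(s, d) for s in SRC_PORTS for d in DST_PORTS]

def train_bandit(episodes=500, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: estimate, per probe, how often it passes."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # running mean reward per probe
    count = {a: 0 for a in ACTIONS}    # how often each probe was sent
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: random probe
        else:
            # exploit: pick among the best-valued probes, ties broken randomly
            best = max(value.values())
            action = rng.choice([a for a in ACTIONS if value[a] == best])
        reward = 1.0 if toy_policy_allows(*action) else 0.0
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value, count

def brute_force_baseline():
    """Baseline: try every probe once and keep those that pass."""
    return [a for a in ACTIONS if toy_policy_allows(*a)]

if __name__ == "__main__":
    value, count = train_bandit()
    learned = [a for a in ACTIONS if value[a] > 0.5]
    print("probes the bandit learned to pass:", learned)
    print("probes found by brute force:", brute_force_baseline())
```

In this toy setting the brute-force baseline is trivially cheap because
the action space has only 20 probes. The sketch shows the shape of such a
comparison, not the results reported in the thesis.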

Our evaluation shows that the proof-of-concept implementation can train
models that incorporate properties of a firewall policy in a reasonable
amount of time and with a significant success rate. From the
evaluation, we conclude that the approach is feasible and should
therefore be studied further. We consider this an important objective,
since a simple, fast, and reliable method for testing firewall policies
would make infrastructures easier to secure overall.