A Longitudinal Study on WebAssembly

Supervisor(s): Julian Kirsch, Clemens Jonischkeit
Status: finished
Topic: Others
Author: Christopher Pfefferle
Submission: 2019-11-15
Type of Thesis: Bachelor's Thesis

Description

WebAssembly (Wasm) is a new approach to running code on the web. It allows
high-level programming languages such as C/C++ or Rust to be compiled to a
binary and run in a browser with near-native performance. However, due to a
lack of studies, it is unclear whether and how WebAssembly is used on the web.
In addition, the analysis of WebAssembly binaries remains difficult. First, this
thesis tries to detect the use cases in which WebAssembly binaries are employed
by scanning the web with a web crawler. Second, an Identifier program is
developed to fingerprint the source language and toolchain of an unknown binary.
The web crawler shows that one in 40247 scanned web pages (about 0.0025%) uses
WebAssembly, and no more than 7 use cases were identified. Over 50% of the
binaries and web pages found deal with benchmarks and games. With the
Identifier, the source language could be identified for 75% of the 24 binaries
found and for 77% of a set of 150 WebAssembly binaries provided by a different
study, with C/C++ source code dominating in each case. While the use cases are
still limited in number, they are broadly based and suggest rising usage. This
will further increase the need for in-depth analysis of WebAssembly binaries and
for appropriate tooling, which may be able to adopt the approaches introduced
here.
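
To make the two steps above more concrete, the following is a minimal sketch of how a crawler might recognize a WebAssembly binary and collect one simple fingerprinting hint. It is not the thesis's crawler or Identifier. It assumes only the standard binary layout (a 4-byte magic value "\0asm", a 4-byte version, then sections consisting of an id byte, a LEB128-encoded size, and the contents) and the convention that some toolchains emit a custom section named "producers". The function names is_wasm, read_u32_leb128, and custom_section_names are hypothetical.

```python
WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary


def is_wasm(data: bytes) -> bool:
    """Return True if data starts with the Wasm magic and has a version field."""
    return len(data) >= 8 and data[:4] == WASM_MAGIC


def read_u32_leb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def custom_section_names(data: bytes) -> list[str]:
    """List the names of all custom sections (section id 0) in a Wasm binary.

    Assumption: section names such as "producers" are only a hint about the
    toolchain; a real fingerprinting tool would need further heuristics.
    """
    if not is_wasm(data):
        raise ValueError("not a WebAssembly binary")
    names = []
    pos = 8  # skip magic (4 bytes) and version (4 bytes)
    while pos < len(data):
        section_id = data[pos]
        size, body_start = read_u32_leb128(data, pos + 1)
        if section_id == 0:
            # A custom section starts with its name: a length, then UTF-8 bytes.
            name_len, name_start = read_u32_leb128(data, body_start)
            raw_name = data[name_start:name_start + name_len]
            names.append(raw_name.decode("utf-8", errors="replace"))
        pos = body_start + size  # jump to the next section
    return names
```

A crawler could apply is_wasm to every downloaded resource and pass the matches to custom_section_names. Stripped binaries carry no such sections, so a check like this can only complement a deeper analysis.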