
Private Computation

Private Computation / Privacy-Preserving Machine Learning

With the rapid development of machine learning and deep learning in many fields, data privacy raises increasing concern. Privacy-preserving machine learning, as one of the primary applications of private computation, therefore attracts much attention from both the academic community and industrial partners. Preserving the convenience and efficiency of conventional machine learning and deep learning systems without degrading data privacy is a challenging task. Private computation offers several solutions, such as multi-party computation (MPC), homomorphic encryption (HE), differential privacy (DP), trusted execution environments (TEE), and federated learning (FL).

In my work, I concentrate, on the one hand, on optimizing MPC-, HE-, and TEE-based solutions in order to close the gap between academic results and industrial usage. On the other hand, I focus on extending my anomaly detection systems (malware detection, fraud detection, and phishing detection) with private computation, so that anomaly detection systems built on a central-node model can be extended to a distributed, coordinated model without leaking the original raw data and labels.

Researcher: Peng Xu

1. PAD-MPC: Private Anomaly Detection in Cyber-Security with Multi-Party Computation

Collaborative machine learning and deep learning attract increasing attention from various fields, such as sensitive health prediction, product recommendation, and bank credit ranking systems. However, user privacy issues constrain the development of collaborative machine learning and deep learning systems. As a classic example, anomaly detection systems in cyber-security, such as malicious code detection, phishing detection, and fraud detection, suffer from this weakness as well.
In this paper, a multi-party computation (MPC) based anomaly detection system, PAD-MPC, is presented. We consider three different scenarios in cyber-security - malware detection, fraud detection, and phishing detection - with a secret-sharing-based multi-party computation scheme. We evaluate our work against real-world datasets and find reasonable performance under the privacy-preserving scheme. To the best of our knowledge, our work is the first to introduce multi-party-computation-based privacy preservation into anomaly detection systems.
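The abstract does not specify which secret-sharing scheme PAD-MPC uses; as a minimal illustration of the building block behind many secret-sharing-based MPC protocols, additive secret sharing lets parties jointly add values that no single party ever sees in the clear (the function names and the field modulus below are illustrative choices, not taken from the paper):

```python
import random

P = 2**61 - 1  # prime field modulus (illustrative choice)

def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares mod P."""
    return sum(shares) % P

# Secure addition: each party adds its shares of x and y locally,
# so the sum is computed without revealing x or y to any party.
x_shares = share(12, 3)
y_shares = share(30, 3)
z_shares = [(xs + ys) % P for xs, ys in zip(x_shares, y_shares)]
assert reconstruct(z_shares) == 42
```

Real MPC frameworks extend this idea with multiplication protocols and malicious-security checks, but the local-computation-on-shares pattern is the same.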

2. FAD: Federated Anomaly Detection in Cyber-Security

Collaborative machine learning and deep learning attract increasing attention from various fields, such as sensitive health prediction, product recommendation, and bank credit ranking systems. However, user privacy issues constrain the development of collaborative machine learning and deep learning systems. As a classic example, anomaly detection systems in cyber-security, such as malicious code detection, phishing detection, and fraud detection, suffer from this weakness as well.
In this paper, a federated learning (FL) based anomaly detection system, FAD, is presented. We consider two different scenarios - malware detection and phishing detection - with horizontal and vertical federated anomaly detection. Additionally, we consider both manually specified and automatically trainable features in our system. For the trainable features, we use graph-based transfer learning to prepare them. Meanwhile, comparable results for centralized and federated-learning-based collaborative anomaly detection are presented. To the best of our knowledge, our work is the first to introduce federated learning (both horizontal and vertical) into an anomaly detection system.
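The abstract does not detail FAD's training procedure; as a generic illustration of the horizontal setting, one round of federated averaging (FedAvg) can be sketched as follows, where clients train on their private data locally and only model weights reach the server (the logistic-regression model, data, and hyperparameters here are hypothetical stand-ins):

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few gradient steps of logistic regression."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)      # logistic-loss gradient
        w -= lr * grad
    return w

def fed_avg(client_weights, client_sizes):
    """Server step: average client models weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy round: two clients, each holding its own private labelled samples.
rng = np.random.default_rng(0)
global_w = np.zeros(4)
clients = [(rng.normal(size=(20, 4)), rng.integers(0, 2, 20).astype(float)),
           (rng.normal(size=(30, 4)), rng.integers(0, 2, 30).astype(float))]
for _ in range(10):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = fed_avg(updates, [len(y) for _, y in clients])
```

Vertical federated learning differs in that clients hold different features of the same samples, which requires entity alignment and per-feature partial computations rather than simple weight averaging.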