
SAFAIR AI Contest

What is the SAFAIR AI Contest?

Deep learning models are deployed in production ever more frequently, and have found especially widespread use in computer vision tasks. With this widespread use, it is essential that the models are robust and secure: as our dependence on deep learning models grows, we need ways to increase our confidence in their predictions.

Evaluating robustness is not a trivial task, however. Robustness against one particular kind of attack is not a sufficient measure of a model's overall robustness. To have confidence in a model's predictions, one needs to check its robustness against a variety of attack techniques.

The SAFAIR AI Contest (part of the SPARTA EU project) evaluates the robustness of defence techniques by means of a two-player game. Participants register in either the Attack or the Defence track, and the attack and defence teams are then continuously pitted against each other. This encourages the creation of deep learning models that are robust to a variety of attack methods. In the contest, Defence teams create a deep learning model based on a technique of their choice. Attack teams know the input data on which these defence models are trained, and can run gradient-based updates to generate adversarial perturbations. However, Attack teams have no information about the actual defence technique used by the Defence teams.

As such, the aim of this contest (inspired by the NIPS adversarial attacks and defences competitions) is to encourage the creation of more robust deep learning models, as well as to find adversarial attack methods that can effectively fool the target system across a variety of defence techniques.

Tasks and Tracks

We propose four tracks for the contest: two attack tracks and a corresponding defence track for each.

  1. Targeted Face Re-Identification. In this track, participants are given a set of face images and target identities. The purpose of the targeted face re-identification attack is to modify each input image so that it is classified as the given target identity.
  2. Face Attributes Alteration. In this track, participants are given a set of face images and k attribute IDs. The purpose of the face attributes alteration attack is to modify the input image so that the k specified attributes are classified wrongly.
  3. Defence against Attribute Alteration. In this track, participants design models robust to the perturbations used for face attribute alteration. The goal is a machine learning model that is robust to adversarial perturbations in the attribute alteration scenario, for instance by accurately detecting adversarial images.
  4. Defence against Targeted Face Re-Identification. In this track, participants design models robust to the perturbations used for face re-identification. The goal is a machine learning model that is robust to adversarial perturbations crafted to make it classify a sample image as a particular target class.
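As an illustration of the gradient-based setting the Attack tracks allow, here is a minimal sketch of a single targeted FGSM step against a toy linear softmax classifier. The model, dimensions, and epsilon are all illustrative stand-ins, not the contest baseline; a real submission would attack a CNN via the dev_toolkit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear softmax classifier standing in for a face re-identification model.
# W, b, and the input dimensions are hypothetical.
n_classes, n_feat = 5, 8
W = rng.normal(size=(n_classes, n_feat))
b = rng.normal(size=n_classes)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict(x):
    return softmax(W @ x + b)

def targeted_fgsm(x, target, eps):
    """One targeted FGSM step: move x toward the target class while keeping
    the perturbation inside an L-infinity ball of radius eps around x."""
    p = predict(x)
    onehot = np.zeros(n_classes)
    onehot[target] = 1.0
    grad = W.T @ (p - onehot)        # gradient of cross-entropy(target) w.r.t. x
    x_adv = x - eps * np.sign(grad)  # descend the targeted loss
    return np.clip(x_adv, x - eps, x + eps)

x = rng.normal(size=n_feat)
target = 3
x_adv = targeted_fgsm(x, target, eps=0.5)
# Target-class probability before and after the step:
print(round(float(predict(x)[target]), 3), round(float(predict(x_adv)[target]), 3))
```

The same loop, iterated with a smaller step size and a projection after each step, gives the PGD-style attacks referenced later in the evaluation.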

Contest Schedule

The proposed schedule for the contest is the following:

  • March 1 – May 15, 2021. The contest starts on the first of March. Participants work on their solutions; in the meantime, we organise a few intermediate rounds of evaluation.
  • May 15, 2021. Deadline for the final submission.
  • May 15 – May 31, 2021. Organizers evaluate submissions.
  • May 31, 2021. Announce contest results and release the evaluation set of images.

Dataset

In the SAFAIR AI Contest, attacks against and countermeasures for machine-learning-based classifiers are evaluated on two specific tasks: Face Re-Identification and Facial Attribute Alteration. For this purpose, we use the CelebA dataset to train the models. CelebA is a large-scale dataset of more than 200k celebrity face images covering roughly 10K unique identities, with large pose variations and background clutter. Each image is annotated with 40 binary facial attributes.

CelebA is publicly available. We release a dev_toolkit with a PyTorch dataloader to simplify access to the data, along with PyTorch code for baseline models. We expect classification models to be trained on CelebA. For testing, the images are chosen by the contest organisers: we have collected 1000 test images similar to the training data, and all images from the test set will be kept secret until after the end of the contest. Participants should use only the CelebA dataset and its publicly available "train-val-test" split to train their models; the dev_toolkit provides easy access to the various splits and to the training pipeline for the model.
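For reference, CelebA distributes its official split as a `list_eval_partition.txt` file mapping each image to partition 0 (train), 1 (validation), or 2 (test). A minimal sketch of parsing it follows; the inline sample lines are illustrative, and the dev_toolkit presumably wraps this for you.

```python
# CelebA's list_eval_partition.txt has one "<filename> <partition>" pair per
# line, where 0 = train, 1 = validation, 2 = test. Sample lines (illustrative):
SAMPLE = """000001.jpg 0
000002.jpg 0
162771.jpg 1
182638.jpg 2
"""

def read_partition(text):
    """Return {'train': [...], 'val': [...], 'test': [...]} from partition text."""
    names = {0: "train", 1: "val", 2: "test"}
    splits = {"train": [], "val": [], "test": []}
    for line in text.strip().splitlines():
        fname, part = line.split()
        splits[names[int(part)]].append(fname)
    return splits

splits = read_partition(SAMPLE)
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))  # → 2 1 1
```

Sticking to this official partition keeps submissions comparable, since every team trains and validates on the same images.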

How can I participate?

To register for the contest, please fill out the form carefully and send it to the dedicated registration address: register_aicontest@sec.in.tum.de
Furthermore, we use a hidden test set for the official evaluation of submitted solutions. Here's a tutorial walking you through the official evaluation of your model or attack. Once your model or attack has been evaluated officially, your scores will be added to the leaderboard.

Rules

  1. TUM employees cannot participate in the contest.
  2. You must register for the contest with one valid email address. If you participate in the contest with multiple email addresses, you will be disqualified.
  3. The registration times are listed in the schedule section. If you register after these periods, your evaluation will not be considered.
  4. The contest is divided into four separate tasks: Targeted Face Re-Identification (Attack and Defence) and Face Attribute Alteration (Attack and Defence). Within a task, participants can be part of either the Attack track or the Defence track, but not both. However, participants in the Attack track of Face Attribute Alteration can compete in the Defence track of Targeted Face Re-Identification (but not in the Defence track of Face Attribute Alteration), and vice versa.
  5. Participants are required to release the code of their submissions as open source on our cloud systems.
  6. A model submitted by a defence team must be capable of handling valid inputs. If the model fails to classify an input due to an error in the model, we penalise it by counting the input as a failure to handle an adversarial example.
  7. Attack teams must produce adversarial examples in the grey-box setting: they can use model gradients to create adversarial perturbations. If execution fails on a valid input due to a logical error in the submission, we penalise the Attack team by skipping that example and assuming that the defence handled the perturbation without any misclassification.
  8. Although we use a hidden test set for evaluation, each classifier must still be stateless: it should not simply memorise the training images.
  9. Defence models must not produce different outputs for the same input at different points in time. Participants should make sure that their models are deterministic.
  10. The number of submissions is limited to three per team over the contest period; we will select the best of the three.
  11. The 1000 test samples must be perturbed within 4 hours. If this time limit is exceeded, the perturbation process is interrupted and the team is penalised.
  12. Attack teams are ranked by the accuracy decrease they cause, highest first. Defence teams are ranked by the decrease in their accuracy, smallest first: the team with the minimum decrease is ranked first and the team with the maximum decrease last. A separate ranking is prepared for each of the four sub-tasks.
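The determinism requirement in rule 9 is commonly met by fixing random seeds and running models in eval mode, so that layers such as dropout stop injecting randomness at inference. A minimal PyTorch sketch (the architecture is a hypothetical stand-in, not the contest baseline):

```python
import torch

# Fix seeds for reproducibility; a submission would also pin any other
# randomness sources it uses (NumPy, Python's random, CUDA settings).
torch.manual_seed(0)

# Hypothetical toy model: dropout is random in train mode, identity in eval mode.
model = torch.nn.Sequential(
    torch.nn.Linear(8, 16),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(16, 4),
)
model.eval()  # deterministic forward passes

x = torch.randn(1, 8)
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
print(torch.equal(out1, out2))  # → True: same input, same output
```

A model left in train mode would fail this check whenever a stochastic layer is present, which rule 9 would treat as a non-deterministic submission.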

Evaluation Criterion

All evaluations are based on the L∞ norm.
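Concretely, an L∞ bound limits the largest change to any single pixel. A sketch of projecting a candidate perturbation back into an ε-ball around the clean image (ε and the pixel values are illustrative):

```python
import numpy as np

def project_linf(x_adv, x, eps):
    """Project an adversarial image back into the L-infinity ball of radius
    eps around the clean image x, then into the valid pixel range [0, 1]."""
    x_adv = np.clip(x_adv, x - eps, x + eps)
    return np.clip(x_adv, 0.0, 1.0)

x = np.array([0.2, 0.5, 0.9])        # clean "pixels"
x_adv = np.array([0.6, 0.45, 1.2])   # candidate perturbed pixels
projected = project_linf(x_adv, x, eps=0.1)
print(np.round(projected, 2).tolist())  # → [0.3, 0.45, 1.0]
```

Any attack submission has to respect such a bound: perturbations larger than ε in any coordinate are clipped back to the feasible ball.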

  • Attack: For the Attack team, the goal is to decrease the accuracy of the model. Let M be the model and S the set of hidden test samples. We run M on S and compute its initial accuracy A_initial. We then run M on the same set S under the participant's attack method T and compute the new accuracy A_final. The score for method T is delta = A_initial - A_final. Attacks are ranked by delta, highest first.
  • Defence: For the defence team, the goal is to suffer the minimum decrease from the initial accuracy reported on unperturbed samples. As above, let M be the model and S the set of hidden test samples: we compute the initial accuracy A_initial, run M on S under an attack method T to obtain A_final, and score delta = A_initial - A_final. Defence teams are ranked by delta, smallest first.
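The scoring above can be sketched directly (the accuracy values and team names are illustrative):

```python
def delta_score(a_initial, a_final):
    """Score for an attack T against model M: the accuracy drop it causes."""
    return a_initial - a_final

# Illustrative numbers: a model at 85% clean accuracy, 55% under attack.
delta = delta_score(85.0, 55.0)
print(delta)  # → 30.0

# Attacks rank by highest delta first; defences rank by lowest delta first.
attacks = {"T1": 30.0, "T2": 12.0, "T3": 21.0}
print(sorted(attacks, key=attacks.get, reverse=True))  # → ['T1', 'T3', 'T2']
```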

Initial Evaluation

We use the following steps to select the top-5 attacks and top-5 defence models.

    • Attack: We provide a baseline model trained with adversarial training using the FGSM attack. This model is pitted against each attack submission T, and we evaluate the decrease in its accuracy under T. For instance, if the original accuracy of the model on the task was 80% and the perturbed samples decrease it to 60%, the delta score is 20. The top 5 submissions by delta value become our finalists on the attack side.

    • Defence: For each defence method D applied to the initial model M, we obtain an improved model M', and the defence is evaluated on how well M' behaves against the FGSM, BIM, and PGD attacks. Since the model is tested against three attack techniques in the initial round, we record a weighted sum of the delta values, assigning FGSM, BIM, and PGD weight coefficients of 0.2, 0.4, and 0.4, respectively, and select the top-5 teams with the minimum weighted delta. For instance, if the model has an initial accuracy of 100% and FGSM brings the accuracy down to 80%, BIM to 60%, and PGD to 20%, the final delta value is:

      (100 - 80) * 0.2 + (100 - 60) * 0.4 + (100 - 20) * 0.4 = 4 + 16 + 32 = 52
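The same weighted sum can be sketched in code, reproducing the worked example above:

```python
# Weights for the initial defence round, as stated in the text.
WEIGHTS = {"FGSM": 0.2, "BIM": 0.4, "PGD": 0.4}

def weighted_delta(a_initial, accuracy_under_attack):
    """Combine per-attack accuracy drops using the contest's weight coefficients."""
    return sum(w * (a_initial - accuracy_under_attack[name])
               for name, w in WEIGHTS.items())

# Worked example: 100% clean accuracy; 80/60/20 under FGSM/BIM/PGD.
print(weighted_delta(100, {"FGSM": 80, "BIM": 60, "PGD": 20}))  # → 52.0
```

Weighting PGD and BIM more heavily than FGSM means a defence cannot place well by resisting only the weakest single-step attack.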

Final Evaluation

  • Attack: The top-5 attacks are pitted against the top-5 defence methods. For each attack team T, we take the mean of its delta scores across the top-5 defence models. The winning team is the one with the highest mean delta.
  • Defence: Each defence method D (yielding M') is pitted against each of the top-5 attack methods. For each model M', we take the mean of its delta scores across the top-5 attack methods. The winning team is the one with the lowest mean delta.
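The final ranking thus reduces to row and column means of a delta matrix, with attacks as rows and defences as columns. A sketch with illustrative values (only two attack rows shown for brevity):

```python
# Each entry deltas[attack][defence] is the accuracy drop that attack caused
# against that defence. All values are illustrative.
deltas = {
    "A1": {"D1": 10, "D2": 30, "D3": 5,  "D4": 20, "D5": 15},
    "A2": {"D1": 25, "D2": 10, "D3": 30, "D4": 5,  "D5": 20},
}

def attack_score(attack):
    """Mean delta across all defences: attack teams want this HIGH."""
    row = deltas[attack]
    return sum(row.values()) / len(row)

def defence_score(defence):
    """Mean delta across all attacks: defence teams want this LOW."""
    col = [deltas[a][defence] for a in deltas]
    return sum(col) / len(col)

print(attack_score("A1"))   # → 16.0
print(defence_score("D1"))  # → 17.5
```

Averaging over all top-5 opponents rewards attacks that transfer across defence techniques, and defences that hold up against diverse attacks, rather than a lucky pairwise matchup.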

Contact

For any further questions, please contact us at the email address dedicated to the contest: info_aicontest@sec.in.tum.de
Please be aware that the registration email address is only for the registration process; questions about the contest sent there will not be answered.

 

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 830892.
 

Leaderboard