Attacking Large Pre-trained Programming Language Models (PLMs) via Backdoors

Starting Date: Summer 2023
Prerequisites: Good programming skills in a high-level programming language (e.g., Python), experience in PLM, LLM or Machine Learning (e.g., model training with Python, Sklearn, etc)
Will results be assigned to University: No

Project Description:

Backdoors refer to a class of Machine Learning (ML) attacks where an adversary trains an ML model to intentionally misclassify any input to a specific label [1]. This is typically achieved by poisoning the training data, such that inputs are misclassified to a target label when the backdoor trigger is present. For instance, poisoning a few training images (adding a backdoor trigger, e.g., simple pixel patterns like white boxes) have been shown to cause misclassifications in computer vision ML models [1]. Even though ML models (e.g., NLP models [3]) are known to be vulnerable to backdoor attacks, it is unknown whether Large Pre-trained Programming Language Models (PLMs) models are vulnerable to backdoor attacks. PLMs (e.g., CodeBERT, Codex, etc) have become popularly deployed to address critical software engineering tasks such as vulnerability prediction, code clone detection, code summarization, comment generation, etc. Hence, it is important to ensure they are secure against backdoor attacks.

In this project, we aim to investigate whether PLM-based models are vulnerable to backdoor attacks, e.g., by poisoning training samples during model fine-tuning. The goal is to develop methods to effectively inject backdoors into PLMs, and defend against such attacks. The project will also analyze the limitations of the proposed backdoor injection and defence methods.

Required Skills:

Knowledge of the following:
* Good programming skills in a high-level programming language, e.g., Python
* Experience in PLM or Machine Learning, model training with Python, Sklearn, etc

Deliverables:

(a) An automated attack method to effectively inject backdoors into PLMs during fine-tuning
(b) An automated defence technique to detect and mitigate such attacks
(c) A report documenting the experiments, implementation and findings from above ((a) and (b) using real-world use cases

Will results be assigned to University: No

Why Should I Apply?:

This project provides opportunities to
* Develop skills in SE research, including SE for ML, ML for SE and software security
* Build a new attack vector, security measures and a security analysis tool
* Contribute to cutting-edge SE research leading to potential scientific contribution in SE venue

Previous Related Works: In previous works, we have studied the vulnerability of robust ML models to backdoor attacks [1]. We have also developed a PLM for addressing SE tasks [2].

References:

[1] Soremekun, Ezekiel, Sakshi Udeshi, and Sudipta Chattopadhyay. “Towards Backdoor Attacks and Defense in Robust Machine Learning Models.” Computers & Security (2023): 103101.

[2] Ma, Wei, Mengjie Zhao, Ezekiel Soremekun, Qiang Hu, Jie M. Zhang, Mike Papadakis, Maxime Cordy, Xiaofei Xie, and Yves Le Traon. “GraphCode2Vec: generic code embedding via lexical and program dependence analyses.” In Proceedings of the 19th International Conference on Mining Software Repositories, pp. 524-536. 2022.

[3] Li, Shaofeng, Hui Liu, Tian Dong, Benjamin Zi Hao Zhao, Minhui Xue, Haojin Zhu, and Jialiang Lu. “Hidden backdoors in human-centric language models.” In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, pp. 3123-3140. 2021.