Blogs and papers on backdoor attacks & defenses.
ECCV2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
ICLR2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
CVPR2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
ACM MM2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
AAAI2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
NIPS2022 Adversarial Attack & Defense Paper Roundup | Li's Blog (tuoli9.github.io)
THUYimingLi/backdoor-learning-resources: A list of backdoor learning resources (github.com)
Contents
Survey
Toolbox
Dissertation and Thesis
Image and Video Classification
Poisoning-based Attack
Non-poisoning-based Attack
Backdoor Defense
Attack and Defense Towards Other Paradigms and Tasks
Federated Learning
Transfer Learning
Reinforcement Learning
Semi-Supervised and Self-Supervised Learning
Quantization
Natural Language Processing
Graph Neural Networks
Point Cloud
Acoustics Signal Processing
Medical Science
Cybersecurity
Others
Evaluation and Discussion
Backdoor Attack for Positive Purposes
Competition
Survey
- Backdoor Learning: A Survey. [pdf]
  - Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. IEEE Transactions on Neural Networks and Learning Systems, 2022.
- Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review. [pdf]
  - Yansong Gao, Bao Gia Doan, Zhi Zhang, Siqi Ma, Anmin Fu, Surya Nepal, and Hyoungshick Kim. arXiv, 2020.
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses. [pdf]
  - Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, and Tom Goldstein. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
- A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning. [link]
  - Zhiyi Tian, Lei Cui, Jie Liang, and Shui Yu. ACM Computing Surveys, 2022.
- Backdoor Attacks and Defenses in Federated Learning: State-of-the-art, Taxonomy, and Future Directions. [link]
  - Xueluan Gong, Yanjiao Chen, Qian Wang, and Weihan Kong. IEEE Wireless Communications, 2022.
- Backdoor Attacks on Image Classification Models in Deep Neural Networks. [link]
  - Quanxin Zhang, Wencong Ma, Yajie Wang, Yaoyuan Zhang, Zhiwei Shi, and Yuanzhang Li. Chinese Journal of Electronics, 2022.
- Defense against Neural Trojan Attacks: A Survey. [link]
  - Sara Kaviani and Insoo Sohn. Neurocomputing, 2021.
- A Survey on Neural Trojans. [pdf]
  - Yuntao Liu, Ankit Mondal, Abhishek Chakraborty, Michael Zuzak, Nina Jacobsen, Daniel Xing, and Ankur Srivastava. ISQED, 2020.
- A Survey of Neural Trojan Attacks and Defenses in Deep Learning. [pdf]
  - Jie Wang, Ghulam Mubashar Hassan, and Naveed Akhtar. arXiv, 2022.
- Threats to Pre-trained Language Models: Survey and Taxonomy. [pdf]
  - Shangwei Guo, Chunlong Xie, Jiwei Li, Lingjuan Lyu, and Tianwei Zhang. arXiv, 2022.
- An Overview of Backdoor Attacks Against Deep Neural Networks and Possible Defences. [pdf]
  - Wei Guo, Benedetta Tondi, and Mauro Barni. arXiv, 2021.
- Deep Learning Backdoors. [pdf]
  - Shaofeng Li, Shiqing Ma, Minhui Xue, and Benjamin Zi Hao Zhao. arXiv, 2020.
Toolbox
- BackdoorBox
- TrojanZoo
- OpenBackdoor
- Backdoor Toolbox
- BackdoorBench
- backdoors101
- ART
Dissertation and Thesis
- Defense of Backdoor Attacks against Deep Neural Network Classifiers. [pdf]
  - Zhen Xiang. Ph.D. Dissertation at The Pennsylvania State University, 2022.
- Towards Adversarial and Backdoor Robustness of Deep Learning. [link]
  - Yifan Guo. Ph.D. Dissertation at Case Western Reserve University, 2022.
- Toward Robust and Communication Efficient Distributed Machine Learning. [pdf]
  - Hongyi Wang. Ph.D. Dissertation at University of Wisconsin–Madison, 2021.
- Towards Robust Image Classification with Deep Learning and Real-Time DNN Inference on Mobile. [pdf]
  - Pu Zhao. Ph.D. Dissertation at Northeastern University, 2021.
- Countermeasures Against Backdoor, Data Poisoning, and Adversarial Attacks. [pdf]
  - Henry Daniel. Ph.D. Dissertation at University of Texas at San Antonio, 2021.
- Understanding and Mitigating the Impact of Backdooring Attacks on Deep Neural Networks. [pdf]
  - Kang Liu. Ph.D. Dissertation at New York University, 2021.
- Un-fair Trojan: Targeted Backdoor Attacks against Model Fairness. [pdf]
  - Nicholas Furth. Master's Thesis at New Jersey Institute of Technology, 2022.
- Check Your Other Door: Creating Backdoor Attacks in the Frequency Domain. [pdf]
  - Hasan Abed Al Kader Hammoud. Master's Thesis at King Abdullah University of Science and Technology, 2022.
- Backdoor Attacks in Neural Networks. [link]
  - Stefanos Koffas. Master's Thesis at Delft University of Technology, 2021.
- Backdoor Defenses. [pdf]
  - Andrea Milakovic. Master's Thesis at Technische Universität Wien, 2021.
- Geometric Properties of Backdoored Neural Networks. [pdf]
  - Dominic Carrano. Master's Thesis at University of California at Berkeley, 2021.
- Detecting Backdoored Neural Networks with Structured Adversarial Attacks. [pdf]
  - Charles Yang. Master's Thesis at University of California at Berkeley, 2021.
- Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf]
  - Emily Willson. Master's Thesis at University of Chicago, 2020.
Image and Video Classification
Poisoning-based Attack
2022
- Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection. [pdf] [code]
  - Yiming Li, Yang Bai, Yong Jiang, Yong Yang, Shu-Tao Xia, and Bo Li. NeurIPS, 2022.
- DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints. [pdf]
  - Zhendong Zhao, Xiaojun Chen, Yuexin Xuan, Ye Dong, Dakui Wang, and Kaitai Liang. CVPR, 2022.
- An Invisible Black-box Backdoor Attack through Frequency Domain. [pdf] [code]
  - Tong Wang, Yuan Yao, Feng Xu, Shengwei An, Hanghang Tong, and Ting Wang. ECCV, 2022.
- BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning. [pdf] [code]
  - Zhenting Wang, Juan Zhai, and Shiqing Ma. CVPR, 2022.
- Dynamic Backdoor Attacks Against Machine Learning Models. [pdf]
  - Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, and Yang Zhang. EuroS&P, 2022.
- Imperceptible Backdoor Attack: From Input Space to Feature Representation. [pdf] [code]
  - Nan Zhong, Zhenxing Qian, and Xinpeng Zhang. IJCAI, 2022.
- Stealthy Backdoor Attack with Adversarial Training. [link]
  - Le Feng, Sheng Li, Zhenxing Qian, and Xinpeng Zhang. ICASSP, 2022.
- Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks. [link]
  - Huy Phan, Yi Xie, Jian Liu, Yingying Chen, and Bo Yuan. ICASSP, 2022.
- Dynamic Backdoors with Global Average Pooling. [pdf]
  - Stefanos Koffas, Stjepan Picek, and Mauro Conti. AICAS, 2022.
- Poison Ink: Robust and Invisible Backdoor Attack. [pdf]
  - Jie Zhang, Dongdong Chen, Qidong Huang, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, and Nenghai Yu. IEEE Transactions on Image Processing, 2022.
- Enhancing Backdoor Attacks with Multi-Level MMD Regularization. [link]
  - Pengfei Xia, Hongjing Niu, Ziqiang Li, and Bin Li. IEEE Transactions on Dependable and Secure Computing, 2022.
- PTB: Robust Physical Backdoor Attacks against Deep Neural Networks in Real World. [link]
  - Mingfu Xue, Can He, Yinghao Wu, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. Computers & Security, 2022.
- IBAttack: Being Cautious about Data Labels. [link]
  - Akshay Agarwal, Richa Singh, Mayank Vatsa, and Nalini Ratha. IEEE Transactions on Artificial Intelligence, 2022.
- BlindNet Backdoor: Attack on Deep Neural Network using Blind Watermark. [link]
  - Hyun Kwon and Yongchul Kim. Multimedia Tools and Applications, 2022.
- Natural Backdoor Attacks on Deep Neural Networks via Raindrops. [link]
  - Feng Zhao, Li Zhou, Qi Zhong, Rushi Lan, and Leo Yu Zhang. Security and Communication Networks, 2022.
- Dispersed Pixel Perturbation-based Imperceptible Backdoor Trigger for Image Classifier Models. [pdf]
  - Yulong Wang, Minghui Zhao, Shenghong Li, Xin Yuan, and Wei Ni. arXiv, 2022.
- FRIB: Low-poisoning Rate Invisible Backdoor Attack based on Feature Repair. [pdf]
  - Hui Xia, Xiugui Yang, Xiangyun Qian, and Rui Zhang. arXiv, 2022.
- Augmentation Backdoors. [pdf] [code]
  - Joseph Rance, Yiren Zhao, Ilia Shumailov, and Robert Mullins. arXiv, 2022.
- Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation. [pdf]
  - Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
- Natural Backdoor Datasets. [pdf]
  - Emily Wenger, Roma Bhattacharjee, Arjun Nitin Bhagoji, Josephine Passananti, Emilio Andere, Haitao Zheng, and Ben Y. Zhao. arXiv, 2022.
- Backdoor Attacks on Vision Transformers. [pdf] [code]
  - Akshayvarun Subramanya, Aniruddha Saha, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, and Hamed Pirsiavash. arXiv, 2022.
- Enhancing Clean Label Backdoor Attack with Two-phase Specific Triggers. [pdf]
  - Nan Luo, Yuanzhang Li, Yajie Wang, Shangbo Wu, Yu-an Tan, and Quanxin Zhang. arXiv, 2022.
- Circumventing Backdoor Defenses That Are Based on Latent Separability. [pdf] [code]
  - Xiangyu Qi, Tinghao Xie, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information. [pdf]
  - Yi Zeng, Minzhou Pan, Hoang Anh Just, Lingjuan Lyu, Meikang Qiu, and Ruoxi Jia. arXiv, 2022.
- CASSOCK: Viable Backdoor Attacks against DNN in The Wall of Source-Specific Backdoor Defences. [pdf]
  - Shang Wang, Yansong Gao, Anmin Fu, Zhi Zhang, Yuqing Zhang, and Willy Susilo. arXiv, 2022.
- Trojan Horse Training for Breaking Defenses against Backdoor Attacks in Deep Learning. [pdf]
  - Arezoo Rajabi, Bhaskar Ramasubramanian, and Radha Poovendran. arXiv, 2022.
- Label-Smoothed Backdoor Attack. [pdf]
  - Minlong Peng, Zidi Xiong, Mingming Sun, and Ping Li. arXiv, 2022.
- Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks. [pdf]
  - Mingfu Xue, Shifeng Ni, Yinghao Wu, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2022.
- Compression-Resistant Backdoor Attack against Deep Neural Networks. [pdf]
  - Mingfu Xue, Xin Wang, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2022.
2021
- Invisible Backdoor Attack with Sample-Specific Triggers. [pdf] [code]
  - Yuezun Li, Yiming Li, Baoyuan Wu, Longkang Li, Ran He, and Siwei Lyu. ICCV, 2021.
- Manipulating SGD with Data Ordering Attacks. [pdf]
  - Ilia Shumailov, Zakhar Shumaylov, Dmitry Kazhdan, Yiren Zhao, Nicolas Papernot, Murat A. Erdogdu, and Ross Anderson. NeurIPS, 2021.
- Backdoor Attack with Imperceptible Input and Latent Modification. [pdf]
  - Khoa Doan, Yingjie Lao, and Ping Li. NeurIPS, 2021.
- LIRA: Learnable, Imperceptible and Robust Backdoor Attacks. [pdf]
  - Khoa Doan, Yingjie Lao, Weijie Zhao, and Ping Li. ICCV, 2021.
- Blind Backdoors in Deep Learning Models. [pdf] [code]
  - Eugene Bagdasaryan and Vitaly Shmatikov. USENIX Security, 2021.
- Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf] [Master Thesis]
  - Emily Wenger, Josephine Passananti, Yuanshun Yao, Haitao Zheng, and Ben Y. Zhao. CVPR, 2021.
- Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification. [pdf] [code]
  - Siyuan Cheng, Yingqi Liu, Shiqing Ma, and Xiangyu Zhang. AAAI, 2021.
- WaNet - Imperceptible Warping-based Backdoor Attack. [pdf] [code]
  - Tuan Anh Nguyen and Anh Tuan Tran. ICLR, 2021.
- AdvDoor: Adversarial Backdoor Attack of Deep Learning System. [pdf] [code]
  - Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, and Yu Jiang. ISSTA, 2021.
- Invisible Poison: A Blackbox Clean Label Backdoor Attack to Deep Neural Networks. [pdf]
  - Rui Ning, Jiang Li, ChunSheng Xin, and Hongyi Wu. INFOCOM, 2021.
- Backdoor Attack in the Physical World. [pdf] [extension]
  - Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. ICLR Workshop, 2021.
- Defense-Resistant Backdoor Attacks against Deep Neural Networks in Outsourced Cloud Environment. [link]
  - Xueluan Gong, Yanjiao Chen, Qian Wang, Huayang Huang, Lingshuo Meng, Chao Shen, and Qian Zhang. IEEE Journal on Selected Areas in Communications, 2021.
- A Master Key Backdoor for Universal Impersonation Attack against DNN-based Face Verification. [link]
  - Wei Guo, Benedetta Tondi, and Mauro Barni. Pattern Recognition Letters, 2021.
- Backdoors Hidden in Facial Features: A Novel Invisible Backdoor Attack against Face Recognition Systems. [link]
  - Mingfu Xue, Can He, Jian Wang, and Weiqiang Liu. Peer-to-Peer Networking and Applications, 2021.
- Use Procedural Noise to Achieve Backdoor Attack. [link] [code]
  - Xuan Chen, Yuena Ma, and Shiwei Lu. IEEE Access, 2021.
- A Multitarget Backdooring Attack on Deep Neural Networks with Random Location Trigger. [link]
  - Xiao Yu, Cong Liu, Mingwen Zheng, Yajie Wang, Xinrui Liu, Shuxiao Song, Yuexuan Ma, and Jun Zheng. International Journal of Intelligent Systems, 2021.
- Simtrojan: Stealthy Backdoor Attack. [link]
  - Yankun Ren, Longfei Li, and Jun Zhou. ICIP, 2021.
- DBIA: Data-free Backdoor Injection Attack against Transformer Networks. [pdf] [code]
  - Peizhuo Lv, Hualong Ma, Jiachen Zhou, Ruigang Liang, Kai Chen, Shengzhi Zhang, and Yunfei Yang. arXiv, 2021.
- A Statistical Difference Reduction Method for Escaping Backdoor Detection. [pdf]
  - Pengfei Xia, Hongjing Niu, Ziqiang Li, and Bin Li. arXiv, 2021.
- Backdoor Attack through Frequency Domain. [pdf]
  - Tong Wang, Yuan Yao, Feng Xu, Shengwei An, and Ting Wang. arXiv, 2021.
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain. [pdf]
  - Hasan Abed Al Kader Hammoud and Bernard Ghanem. arXiv, 2021.
- Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch. [pdf] [code]
  - Hossein Souri, Micah Goldblum, Liam Fowl, Rama Chellappa, and Tom Goldstein. arXiv, 2021.
- RABA: A Robust Avatar Backdoor Attack on Deep Neural Network. [pdf]
  - Ying He, Zhili Shen, Chang Xia, Jingyu Hua, Wei Tong, and Sheng Zhong. arXiv, 2021.
- Robust Backdoor Attacks against Deep Neural Networks in Real Physical World. [pdf]
  - Mingfu Xue, Can He, Shichang Sun, Jian Wang, and Weiqiang Liu. arXiv, 2021.
2020
- Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features. [pdf]
  - Junyu Lin, Lei Xu, Yingqi Liu, and Xiangyu Zhang. CCS, 2020.
- Input-Aware Dynamic Backdoor Attack. [pdf] [code]
  - Anh Nguyen and Anh Tran. NeurIPS, 2020.
- Bypassing Backdoor Detection Algorithms in Deep Learning. [pdf]
  - Te Juin Lester Tan and Reza Shokri. EuroS&P, 2020.
- Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation. [pdf]
  - Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, and David Miller. ACM CODASPY, 2020.
- Clean-Label Backdoor Attacks on Video Recognition Models. [pdf] [code]
  - Shihao Zhao, Xingjun Ma, Xiang Zheng, James Bailey, Jingjing Chen, and Yu-Gang Jiang. CVPR, 2020.
- Escaping Backdoor Attack Detection of Deep Learning. [link]
  - Yayuan Xiong, Fengyuan Xu, Sheng Zhong, and Qun Li. IFIP SEC, 2020.
- Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks. [pdf] [code]
  - Yunfei Liu, Xingjun Ma, James Bailey, and Feng Lu. ECCV, 2020.
- Live Trojan Attacks on Deep Neural Networks. [pdf] [code]
  - Robby Costales, Chengzhi Mao, Raphael Norwitz, Bryan Kim, and Junfeng Yang. CVPR Workshop, 2020.
- Backdooring and Poisoning Neural Networks with Image-Scaling Attacks. [pdf]
  - Erwin Quiring and Konrad Rieck. IEEE S&P Workshop, 2020.
- One-to-N & N-to-One: Two Advanced Backdoor Attacks against Deep Learning Models. [pdf]
  - Mingfu Xue, Can He, Jian Wang, and Weiqiang Liu. IEEE Transactions on Dependable and Secure Computing, 2020.
- Invisible Backdoor Attacks on Deep Neural Networks via Steganography and Regularization. [pdf] [arXiv Version (2019)]
  - Shaofeng Li, Minhui Xue, Benjamin Zi Hao Zhao, Haojin Zhu, and Xinpeng Zhang. IEEE Transactions on Dependable and Secure Computing, 2020.
- HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
  - Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.
- FaceHack: Triggering Backdoored Facial Recognition Systems Using Facial Characteristics. [pdf]
  - Esha Sarkar, Hadjer Benkraouda, and Michail Maniatakos. arXiv, 2020.
- Light Can Hack Your Face! Black-box Backdoor Attack on Face Recognition Systems. [pdf]
  - Haoliang Li, Yufei Wang, Xiaofei Xie, Yang Liu, Shiqi Wang, Renjie Wan, Lap-Pui Chau, and Alex C. Kot. arXiv, 2020.
2019
- A New Backdoor Attack in CNNS by Training Set Corruption Without Label Poisoning. [pdf]
  - M. Barni, K. Kallas, and B. Tondi. ICIP, 2019.
- Label-Consistent Backdoor Attacks. [pdf] [code]
  - Alexander Turner, Dimitris Tsipras, and Aleksander Madry. arXiv, 2019.
2018
- Trojaning Attack on Neural Networks. [pdf] [code]
  - Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, and Juan Zhai. NDSS, 2018.
2017
- BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. [pdf] [journal]
  - Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. arXiv, 2017 (IEEE Access, 2019).
- Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. [pdf] [code]
  - Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. arXiv, 2017.
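The poisoning-based attacks above all build on the recipe BadNets introduced: stamp a trigger pattern into a small fraction of the training set and relabel the stamped samples as the attacker's target class, so the trained model associates the trigger with that class. A minimal sketch of that recipe (the function name, white-square trigger, and parameter choices are illustrative assumptions, not any paper's exact setup):

```python
import numpy as np

def poison_badnets(images, labels, target_label, rate=0.1, patch=3, rng=None):
    """BadNets-style data poisoning sketch: stamp a white square trigger into
    the bottom-right corner of a random fraction of the images and relabel
    them to the target class. Returns poisoned copies plus poisoned indices."""
    if rng is None:
        rng = np.random.default_rng()
    images, labels = images.copy(), labels.copy()
    n_poison = int(rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -patch:, -patch:] = 1.0  # the trigger: a small white patch
    labels[idx] = target_label           # dirty-label: relabel to the target
    return images, labels, idx
```

Clean-label variants (e.g., Label-Consistent Backdoor Attacks above) keep `labels[idx]` unchanged and instead perturb the images so the trigger is learned without suspicious relabeling.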
Non-poisoning-based Attack
Weights-oriented Attack
- Handcrafted Backdoors in Deep Neural Networks. [pdf]
  - Sanghyun Hong, Nicholas Carlini, and Alexey Kurakin. NeurIPS, 2022.
- Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips. [pdf] [code]
  - Jiawang Bai, Kuofeng Gao, Dihong Gong, Shu-Tao Xia, Zhifeng Li, and Wei Liu. ECCV, 2022.
- ProFlip: Targeted Trojan Attack with Progressive Bit Flips. [pdf]
  - Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. ICCV, 2021.
- TBT: Targeted Neural Network Attack with Bit Trojan. [pdf] [code]
  - Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. CVPR, 2020.
- How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data. [pdf]
  - Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, and Xu Sun. ICLR, 2022.
- Can Adversarial Weight Perturbations Inject Neural Backdoors? [pdf]
  - Siddhant Garg, Adarsh Kumar, Vibhor Goel, and Yingyu Liang. CIKM, 2020.
- TrojViT: Trojan Insertion in Vision Transformers. [pdf]
  - Mengxin Zheng, Qian Lou, and Lei Jiang. arXiv, 2022.
- Versatile Weight Attack via Flipping Limited Bits. [pdf]
  - Jiawang Bai, Baoyuan Wu, Zhifeng Li, and Shu-Tao Xia. arXiv, 2022.
- Toward Realistic Backdoor Injection Attacks on DNNs using Rowhammer. [pdf]
  - M. Caner Tol, Saad Islam, Berk Sunar, and Ziming Zhang. 2022.
- TrojanNet: Embedding Hidden Trojan Horse Models in Neural Network. [pdf]
  - Chuan Guo, Ruihan Wu, and Kilian Q. Weinberger. arXiv, 2020.
- Backdooring Convolutional Neural Networks via Targeted Weight Perturbations. [pdf]
  - Jacob Dumford and Walter Scheirer. arXiv, 2018.
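The bit-flip attacks in this section (TBT, ProFlip, the Rowhammer-based injection) exploit the fact that flipping a single bit of a stored IEEE-754 float32 weight can change its value by orders of magnitude; the attack then searches for the few flips that implant the backdoor. A sketch of just that primitive, assuming weights stored as raw 32-bit words (the helper name is ours):

```python
import numpy as np

def flip_weight_bit(weights, index, bit):
    """Flip one bit of the IEEE-754 binary representation of weights[index].
    Returns a modified float32 copy; the input array is left untouched."""
    w = np.array(weights, dtype=np.float32)   # copy, ensure float32 layout
    mask = np.uint32(1 << bit)                # the single bit to flip
    w.view(np.uint32)[index] ^= mask          # reinterpret bits, XOR in place
    return w

w = np.array([0.5, -1.25, 2.0], dtype=np.float32)
flipped = flip_weight_bit(w, 0, 30)  # bit 30 is the top exponent bit
```

Flipping a high exponent bit turns 0.5 into roughly 2^127, which is why hardware fault attacks need so few flips to corrupt a model's behavior.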
Structure-modified Attack
- LoneNeuron: a Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks. [pdf]
  - Zeyan Liu, Fengjun Li, Zhu Li, and Bo Luo. CCS, 2022.
- Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks. [pdf] [code]
  - Xiangyu Qi, Tinghao Xie, Ruizhe Pan, Jifeng Zhu, Yong Yang, and Kai Bu. CVPR, 2022.
- Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification. [link] [code]
  - Árpád Berta, Gábor Danner, István Hegedűs, and Márk Jelasity. ACM IH&MMSec, 2022.
- Stealthy and Flexible Trojan in Deep Learning Framework. [link]
  - Yajie Wang, Kongyang Chen, Yu-An Tan, Shuxin Huang, Wencong Ma, and Yuanzhang Li. IEEE Transactions on Dependable and Secure Computing, 2022.
- FooBaR: Fault Fooling Backdoor Attack on Neural Network Training. [link] [code]
  - Jakub Breier, Xiaolu Hou, Martín Ochoa, and Jesus Solano. IEEE Transactions on Dependable and Secure Computing, 2022.
- DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection. [pdf]
  - Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, and Yunxin Liu. ICSE, 2021.
- An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks. [pdf] [code]
  - Ruixiang Tang, Mengnan Du, Ninghao Liu, Fan Yang, and Xia Hu. KDD, 2020.
- BadRes: Reveal the Backdoors through Residual Connection. [pdf]
  - Mingrui He, Tianyu Chen, Haoyi Zhou, Shanghang Zhang, and Jianxin Li. arXiv, 2022.
- Architectural Backdoors in Neural Networks. [pdf]
  - Mikel Bober-Irizar, Ilia Shumailov, Yiren Zhao, Robert Mullins, and Nicolas Papernot. arXiv, 2022.
- Planting Undetectable Backdoors in Machine Learning Models. [pdf]
  - Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. arXiv, 2022.
Other Attacks
- ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks. [pdf] [website] [code]
  - Tim Clifford, Ilia Shumailov, Yiren Zhao, Ross Anderson, and Robert Mullins. arXiv, 2022.
- Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks. [pdf]
  - Ahmed Salem, Michael Backes, and Yang Zhang. arXiv, 2020.
Backdoor Defense
Preprocessing-based Empirical Defense
- Backdoor Attack in the Physical World. [pdf] [extension]
  - Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. ICLR Workshop, 2021.
- DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation. [pdf] [code]
  - Han Qiu, Yi Zeng, Shangwei Guo, Tianwei Zhang, Meikang Qiu, and Bhavani Thuraisingham. AsiaCCS, 2021.
- Februus: Input Purification Defense Against Trojan Attacks on Deep Neural Network Systems. [pdf] [code]
  - Bao Gia Doan, Ehsan Abbasnejad, and Damith C. Ranasinghe. ACSAC, 2020.
- Neural Trojans. [pdf]
  - Yuntao Liu, Yang Xie, and Ankur Srivastava. ICCD, 2017.
- Defending Deep Neural Networks against Backdoor Attack by Using De-trigger Autoencoder. [pdf]
  - Hyun Kwon. IEEE Access, 2021.
- Defending Backdoor Attacks on Vision Transformer via Patch Processing. [pdf]
  - Khoa D. Doan, Yingjie Lao, Peng Yang, and Ping Li. arXiv, 2022.
- ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks. [pdf]
  - Miguel Villarreal-Vasquez and Bharat Bhargava. arXiv, 2021.
- Model Agnostic Defense against Backdoor Attacks in Machine Learning. [pdf]
  - Sakshi Udeshi, Shanshan Peng, Gerald Woo, Lionell Loh, Louth Rawshan, and Sudipta Chattopadhyay. arXiv, 2019.
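Preprocessing defenses like these exploit the fact that most static triggers are location- and scale-sensitive: transforming the input before inference (shrinking, warping, inpainting) breaks the spatial alignment the backdoor was trained on while barely affecting clean accuracy. A rough ShrinkPad-style sketch using nearest-neighbour shrinking (parameter choices and the function name are illustrative, not any paper's exact pipeline):

```python
import numpy as np

def shrink_and_pad(image, ratio=0.9, rng=None):
    """Shrink the image by nearest-neighbour subsampling, then paste it back
    onto a zero canvas of the original size at a random offset, so a static
    trigger no longer sits where the backdoored model expects it."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    nh, nw = int(h * ratio), int(w * ratio)
    rows = np.arange(nh) * h // nh        # nearest-neighbour source rows
    cols = np.arange(nw) * w // nw
    small = image[rows][:, cols]
    out = np.zeros_like(image)
    top = rng.integers(0, h - nh + 1)     # random re-padding offset
    left = rng.integers(0, w - nw + 1)
    out[top:top + nh, left:left + nw] = small
    return out
```

The transformation is applied at test time only, so it needs no retraining and no knowledge of the trigger.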
Model Reconstruction based Empirical Defense
- Adversarial Unlearning of Backdoors via Implicit Hypergradient. [pdf] [code]
  - Yi Zeng, Si Chen, Won Park, Z. Morley Mao, Ming Jin, and Ruoxi Jia. ICLR, 2022.
- Data-free Backdoor Removal based on Channel Lipschitzness. [pdf] [code]
  - Runkai Zheng, Rongjun Tang, Jianze Li, and Li Liu. ECCV, 2022.
- Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation. [pdf]
  - Jun Xia, Ting Wang, Jieping Ding, Xian Wei, and Mingsong Chen. IJCAI, 2022.
- Adversarial Neuron Pruning Purifies Backdoored Deep Models. [pdf] [code]
  - Dongxian Wu and Yisen Wang. NeurIPS, 2021.
- Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks. [pdf] [code]
  - Yige Li, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Xixiang Lyu, and Bo Li. ICLR, 2021.
- Interpretability-Guided Defense against Backdoor Attacks to Deep Neural Networks. [link]
  - Wei Jiang, Xiangyu Wen, Jinyu Zhan, Xupeng Wang, and Ziwei Song. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2021.
- Boundary augment: A data augment method to defend poison attack. [link]
  - Xuan Chen, Yuena Ma, Shiwei Lu, and Yu Yao. IET Image Processing, 2021.
- Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness. [pdf] [code]
  - Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, and Xue Lin. ICLR, 2020.
- Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. [pdf] [code]
  - Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. RAID, 2018.
- Neural Trojans. [pdf]
  - Yuntao Liu, Yang Xie, and Ankur Srivastava. ICCD, 2017.
- Test-time Adaptation of Residual Blocks against Poisoning and Backdoor Attacks. [pdf]
  - Arnav Gudibande, Xinyun Chen, Yang Bai, Jason Xiong, and Dawn Song. CVPR Workshop, 2022.
- Disabling Backdoor and Identifying Poison Data by using Knowledge Distillation in Backdoor Attacks on Deep Neural Networks. [pdf]
  - Kota Yoshida and Takeshi Fujino. CCS Workshop, 2020.
- Defending against Backdoor Attack on Deep Neural Networks. [pdf]
  - Hao Cheng, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, and Xue Lin. KDD Workshop, 2019.
- Defense against Backdoor Attacks via Identifying and Purifying Bad Neurons. [pdf]
  - Mingyuan Fan, Yang Liu, Cen Chen, Ximeng Liu, and Wenzhong Guo. arXiv, 2022.
- Turning a Curse Into a Blessing: Enabling Clean-Data-Free Defenses by Model Inversion. [pdf]
  - Si Chen, Yi Zeng, Won Park, and Ruoxi Jia. arXiv, 2022.
- Adversarial Fine-tuning for Backdoor Defense: Connect Adversarial Examples to Triggered Samples. [pdf]
  - Bingxu Mu, Le Wang, and Zhenxing Niu. arXiv, 2022.
- Neural Network Laundering: Removing Black-Box Backdoor Watermarks from Deep Neural Networks. [pdf]
  - William Aiken, Hyoungshick Kim, and Simon Woo. arXiv, 2020.
- HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
  - Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.
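Fine-Pruning, listed above, reconstructs the model by removing the channels that stay dormant on clean inputs, on the hypothesis that the backdoor hides in exactly those channels, and then fine-tunes on clean data. The pruning step in isolation can be sketched as follows (a toy version over a matrix of recorded activations; the helper name and fraction are our choices):

```python
import numpy as np

def fine_prune_mask(clean_activations, prune_fraction=0.2):
    """Given channel activations recorded on clean data
    (shape: samples x channels), return a 0/1 mask that silences the
    channels that are least active on clean inputs."""
    mean_act = clean_activations.mean(axis=0)       # mean activation per channel
    n_prune = int(len(mean_act) * prune_fraction)
    dormant = np.argsort(mean_act)[:n_prune]        # least-active channels first
    mask = np.ones_like(mean_act)
    mask[dormant] = 0.0                             # prune the dormant channels
    return mask
```

In practice the mask multiplies the output of the last convolutional layer, and pruning continues until clean accuracy starts to drop, after which clean fine-tuning recovers it.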
Trigger Synthesis based Empirical Defense
- Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free. [pdf] [code]
  - Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, and Zhangyang Wang. CVPR, 2022.
- Better Trigger Inversion Optimization in Backdoor Scanning. [pdf] [code]
  - Guanhong Tao, Guangyu Shen, Yingqi Liu, Shengwei An, Qiuling Xu, Shiqing Ma, Pan Li, and Xiangyu Zhang. CVPR, 2022.
- Few-shot Backdoor Defense Using Shapley Estimation. [pdf]
  - Jiyang Guan, Zhuozhuo Tu, Ran He, and Dacheng Tao. CVPR, 2022.
- AEVA: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis. [pdf] [code]
  - Junfeng Guo, Ang Li, and Cong Liu. ICLR, 2022.
- Trigger Hunting with a Topological Prior for Trojan Detection. [pdf] [code]
  - Xiaoling Hu, Xiao Lin, Michael Cogswell, Yi Yao, Susmit Jha, and Chao Chen. ICLR, 2022.
- Backdoor Defense with Machine Unlearning. [pdf]
  - Yang Liu, Mingyuan Fan, Cen Chen, Ximeng Liu, Zhuo Ma, Li Wang, and Jianfeng Ma. INFOCOM, 2022.
- Black-box Detection of Backdoor Attacks with Limited Information and Data. [pdf]
  - Yinpeng Dong, Xiao Yang, Zhijie Deng, Tianyu Pang, Zihao Xiao, Hang Su, and Jun Zhu. ICCV, 2021.
- Backdoor Scanning for Deep Neural Networks through K-Arm Optimization. [pdf] [code]
  - Guangyu Shen, Yingqi Liu, Guanhong Tao, Shengwei An, Qiuling Xu, Siyuan Cheng, Shiqing Ma, and Xiangyu Zhang. ICML, 2021.
- Towards Inspecting and Eliminating Trojan Backdoors in Deep Neural Networks. [pdf] [previous version] [code]
  - Wenbo Guo, Lun Wang, Xinyu Xing, Min Du, and Dawn Song. ICDM, 2020.
- GangSweep: Sweep out Neural Backdoors by GAN. [pdf]
  - Liuwan Zhu, Rui Ning, Cong Wang, Chunsheng Xin, and Hongyi Wu. ACM MM, 2020.
- Detection of Backdoors in Trained Classifiers Without Access to the Training Set. [pdf]
  - Zhen Xiang, David J. Miller, and George Kesidis. IEEE Transactions on Neural Networks and Learning Systems, 2020.
- Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. [pdf] [code]
  - Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. IEEE S&P, 2019.
- Defending Neural Backdoors via Generative Distribution Modeling. [pdf] [code]
  - Ximing Qiao, Yukun Yang, and Hai Li. NeurIPS, 2019.
- DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks. [pdf]
  - Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. IJCAI, 2019.
- Identifying Physically Realizable Triggers for Backdoored Face Recognition Networks. [link]
  - Ankita Raj, Ambar Pal, and Chetan Arora. ICIP, 2021.
- Revealing Perceptible Backdoors in DNNs Without the Training Set via the Maximum Achievable Misclassification Fraction Statistic. [pdf]
  - Zhen Xiang, David J. Miller, Hang Wang, and George Kesidis. MLSP, 2020.
- Adaptive Perturbation Generation for Multiple Backdoors Detection. [pdf]
  - Yuhang Wang, Huafeng Shi, Rui Min, Ruijia Wu, Siyuan Liang, Yichao Wu, Ding Liang, and Aishan Liu. arXiv, 2022.
- Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer. [pdf]
  - Tong Wang, Yuan Yao, Feng Xu, Miao Xu, Shengwei An, and Ting Wang. arXiv, 2022.
- One-shot Neural Backdoor Erasing via Adversarial Weight Masking. [pdf]
  - Shuwen Chai and Jinghui Chen. arXiv, 2022.
- Defense Against Multi-target Trojan Attacks. [pdf]
  - Haripriya Harikumar, Santu Rana, Kien Do, Sunil Gupta, Wei Zong, Willy Susilo, and Svetha Venkatesh. arXiv, 2022.
- Model-Contrastive Learning for Backdoor Defense. [pdf]
  - Zhihao Yue, Jun Xia, Zhiwei Ling, Ting Wang, Xian Wei, and Mingsong Chen. arXiv, 2022.
- CatchBackdoor: Backdoor Testing by Critical Trojan Neural Path Identification via Differential Fuzzing. [pdf]
  - Haibo Jin, Ruoxi Chen, Jinyin Chen, Yao Cheng, Chong Fu, Ting Wang, Yue Yu, and Zhaoyan Ming. arXiv, 2021.
- Detect and Remove Watermark in Deep Neural Networks via Generative Adversarial Networks. [pdf]
  - Haoqi Wang, Mingfu Xue, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2021.
- TAD: Trigger Approximation based Black-box Trojan Detection for AI. [pdf]
  - Xinqiao Zhang, Huili Chen, and Farinaz Koushanfar. arXiv, 2021.
- Scalable Backdoor Detection in Neural Networks. [pdf]
  - Haripriya Harikumar, Vuong Le, Santu Rana, Sourangshu Bhattacharya, Sunil Gupta, and Svetha Venkatesh. arXiv, 2020.
- NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs. [pdf] [code]
  - Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, and Siddharth Garg. arXiv, 2020.
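Trigger-synthesis defenses such as Neural Cleanse reverse-engineer a minimal candidate trigger for each class, then flag classes whose reversed trigger is an abnormally small outlier, measured in median-absolute-deviation (MAD) units. The outlier-detection step in isolation looks roughly like this (helper name and threshold are our choices; Neural Cleanse's actual pipeline also includes the trigger-optimization stage):

```python
import numpy as np

def mad_outlier_classes(trigger_norms, threshold=2.0):
    """Flag classes whose reversed-trigger L1 norm is an abnormally *small*
    outlier: small triggers that flip everything to one class are the
    hallmark of an injected backdoor."""
    norms = np.asarray(trigger_norms, dtype=float)
    med = np.median(norms)
    mad = 1.4826 * np.median(np.abs(norms - med))  # consistency constant for Gaussian data
    anomaly_index = (med - norms) / mad            # positive only for small norms
    return np.where(anomaly_index > threshold)[0]
```

For a clean model, all per-class trigger norms cluster together; for a backdoored model the target class stands out with a far smaller norm.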
Model Diagnosis based Empirical Defense
-
Complex Backdoor Detection by Symmetric Feature Differencing. [pdf] [code]
- Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, and Xiangyu Zhang. CVPR, 2022.
-
Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios. [pdf] [code]
- Zhen Xiang, David J. Miller, and George Kesidis. ICLR, 2022.
-
An Anomaly Detection Approach for Backdoored Neural Networks: Face Recognition as a Case Study. [pdf]
- Alexander Unnervik and Sébastien Marcel. BIOSIG, 2022.
-
Critical Path-Based Backdoor Detection for Deep Neural Networks. [link]
- Wei Jiang, Xiangyu Wen, Jinyu Zhan, Xupeng Wang, Ziwei Song, and Chen Bian. IEEE Transactions on Neural Networks and Learning Systems, 2022.
-
Detecting AI Trojans Using Meta Neural Analysis. [pdf]
- Xiaojun Xu, Qi Wang, Huichen Li, Nikita Borisov, Carl A. Gunter, and Bo Li. IEEE S&P, 2021.
-
Topological Detection of Trojaned Neural Networks. [pdf]
- Songzhu Zheng, Yikai Zhang, Hubert Wagner, Mayank Goswami, and Chao Chen. NeurIPS, 2021.
-
Black-box Detection of Backdoor Attacks with Limited Information and Data. [pdf]
- Yinpeng Dong, Xiao Yang, Zhijie Deng, Tianyu Pang, Zihao Xiao, Hang Su, and Jun Zhu. ICCV, 2021.
-
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs. [pdf] [code]
- Soheil Kolouri, Aniruddha Saha, Hamed Pirsiavash, and Heiko Hoffmann. CVPR, 2020.
-
One-Pixel Signature: Characterizing CNN Models for Backdoor Detection. [pdf]
- Shanjiaoyang Huang, Weiqi Peng, Zhiwei Jia, and Zhuowen Tu. ECCV, 2020.
-
Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases. [pdf] [code]
- Ren Wang, Gaoyuan Zhang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong, and Meng Wang. ECCV, 2020.
-
Detecting Backdoor Attacks via Class Difference in Deep Neural Networks. [pdf]
- Hyun Kwon. IEEE Access, 2020.
-
Baseline Pruning-Based Approach to Trojan Detection in Neural Networks. [pdf]
- Peter Bajcsy and Michael Majurski. ICLR Workshop, 2021.
-
Attention Hijacking in Trojan Transformers. [pdf]
- Weimin Lyu, Songzhu Zheng, Tengfei Ma, Haibin Ling, and Chao Chen. arXiv, 2022.
-
Universal Post-Training Backdoor Detection. [pdf]
- Hang Wang, Zhen Xiang, David J. Miller, and George Kesidis. arXiv, 2022.
-
Trojan Signatures in DNN Weights. [pdf]
- Greg Fields, Mohammad Samragh, Mojan Javaheripi, Farinaz Koushanfar, and Tara Javidi. arXiv, 2021.
-
EX-RAY: Distinguishing Injected Backdoor from Natural Features in Neural Networks by Examining Differential Feature Symmetry. [pdf]
- Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, and Xiangyu Zhang. arXiv, 2021.
-
TOP: Backdoor Detection in Neural Networks via Transferability of Perturbation. [pdf]
- Todd Huster and Emmanuel Ekwedike. arXiv, 2021.
-
Detecting Trojaned DNNs Using Counterfactual Attributions. [pdf]
- Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, and Ajay Divakaran. arXiv, 2021.
-
Adversarial examples are useful too! [pdf] [code]
- Ali Borji. arXiv, 2020.
-
Cassandra: Detecting Trojaned Networks from Adversarial Perturbations. [pdf]
- Xiaoyu Zhang, Ajmal Mian, Rohit Gupta, Nazanin Rahnavard, and Mubarak Shah. arXiv, 2020.
-
Odyssey: Creation, Analysis and Detection of Trojan Models. [pdf] [dataset]
- Marzieh Edraki, Nazmul Karim, Nazanin Rahnavard, Ajmal Mian, and Mubarak Shah. arXiv, 2020.
-
Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]
- N. Benjamin Erichson, Dane Taylor, Qixuan Wu, and Michael W. Mahoney. arXiv, 2020.
-
NeuronInspect: Detecting Backdoors in Neural Networks via Output Explanations. [pdf]
- Xijie Huang, Moustafa Alzantot, and Mani Srivastava. arXiv, 2019.
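The weight-analysis detectors above (e.g. the "Trojan Signatures in DNN Weights" line of work) share a simple core idea: a backdoor target class often leaves an anomalous footprint in the classifier's final-layer weights. A minimal, hypothetical sketch of that idea follows; the max-weight statistic and the toy weight matrix are illustrative, not any listed paper's exact method.

```python
import numpy as np

def class_weight_outlier_scores(final_layer_w):
    """Score each class by how far its max outgoing weight deviates from
    the other classes, via median absolute deviation. A backdoored target
    class often carries abnormally large weights from trigger-sensitive
    features. final_layer_w: (num_classes, feature_dim)."""
    stats = final_layer_w.max(axis=1)
    med = np.median(stats)
    mad = np.median(np.abs(stats - med)) + 1e-12
    return np.abs(stats - med) / mad  # large score => suspicious class

# Toy model: 5 classes, 16 features; class 2 gets a planted oversized weight.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(5, 16))
w[2, 3] = 2.0
scores = class_weight_outlier_scores(w)
suspect = int(scores.argmax())  # flags class 2
```

Real detectors replace the single max-weight statistic with richer per-class features and calibrate the anomaly threshold across a population of clean models.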
Poison Suppression based Empirical Defense
-
Backdoor Defense via Decoupling the Training Process. [pdf] [code]
- Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, and Kui Ren. ICLR, 2022.
-
Training with More Confidence: Mitigating Injected and Natural Backdoors During Training. [pdf] [code]
- Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. NeurIPS, 2022.
-
Anti-Backdoor Learning: Training Clean Models on Poisoned Data. [pdf] [code]
- Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, and Xingjun Ma. NeurIPS, 2021.
-
Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]
- Min Du, Ruoxi Jia, and Dawn Song. ICLR, 2020.
-
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Trade-off. [pdf]
- Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, and Arjun Gupta. ICASSP, 2021.
-
What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors. [pdf]
- Jonas Geiping, Liam Fowl, Gowthami Somepalli, Micah Goldblum, Michael Moeller, and Tom Goldstein. ICLR Workshop, 2021.
-
Removing Backdoor-Based Watermarks in Neural Networks with Limited Data. [pdf]
- Xuankai Liu, Fengting Li, Bihan Wen, and Qi Li. ICPR, 2021.
-
On the Effectiveness of Adversarial Training against Backdoor Attacks. [pdf]
- Yinghua Gao, Dongxian Wu, Jingfeng Zhang, Guanhao Gan, Shu-Tao Xia, Gang Niu, and Masashi Sugiyama. arXiv, 2022.
-
Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches. [pdf]
- Reena Zelenkova, Jack Swallow, M. A. P. Chamikara, Dongxi Liu, Mohan Baruwal Chhetri, Seyit Camtepe, Marthie Grobler, and Mahathir Almashor. arXiv, 2022.
-
SanitAIs: Unsupervised Data Augmentation to Sanitize Trojaned Neural Networks. [pdf]
- Kiran Karra and Chace Ashcraft. arXiv, 2021.
-
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping. [pdf] [code]
- Sanghyun Hong, Varun Chandrasekaran, Yiğitcan Kaya, Tudor Dumitraş, and Nicolas Papernot. arXiv, 2020.
-
DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations. [pdf]
- Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, and Tom Goldstein. arXiv, 2021.
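Several of the poison-suppression defenses above (Du et al., ICLR 2020; DP-InstaHide) build on differentially private training: per-example gradient clipping bounds the influence any single poisoned sample can exert on an update, and added Gaussian noise masks what remains. A minimal sketch of one such aggregation step, with illustrative hyperparameters rather than any paper's exact recipe:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.0, seed=0):
    """One DP-style aggregation step: clip each per-example gradient to
    clip_norm, sum, then add Gaussian noise scaled to the clip bound.
    Clipping caps the influence of any single (possibly poisoned) example;
    the noise hides what survives. Sketch only: real DP-SGD also accounts
    for the privacy budget across steps."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

# Nine benign gradients plus one huge poisoned gradient (norm 200):
# after clipping, the outlier contributes at most norm 1 to the sum.
grads = [np.full(4, 0.1)] * 9 + [np.full(4, 100.0)]
update = dp_sgd_step(grads)
```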
Sample Filtering based Empirical Defense
-
The "Beatrix" Resurrections: Robust Backdoor Detection via Gram Matrices. [pdf] [code]
- Wanlun Ma, Derui Wang, Ruoxi Sun, Minhui Xue, Sheng Wen, and Yang Xiang. NDSS, 2023.
-
Towards Effective and Robust Neural Trojan Defenses via Input Filtering. [pdf] [code]
- Kien Do, Haripriya Harikumar, Hung Le, Dung Nguyen, Truyen Tran, Santu Rana, Dang Nguyen, Willy Susilo, and Svetha Venkatesh. ECCV, 2022.
-
Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples. [pdf] [code]
- Weixin Chen, Baoyuan Wu, and Haoqian Wang. NeurIPS, 2022.
-
Can We Mitigate Backdoor Attack Using Adversarial Detection Methods? [link]
- Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, and Ting Liu. IEEE Transactions on Dependable and Secure Computing, 2022.
-
LinkBreaker: Breaking the Backdoor-Trigger Link in DNNs via Neurons Consistency Check. [link]
- Zhenzhu Chen, Shang Wang, Anmin Fu, Yansong Gao, Shui Yu, and Robert H. Deng. IEEE Transactions on Information Forensics and Security, 2022.
-
Similarity-based Integrity Protection for Deep Learning Systems. [link]
- Ruitao Hou, Shan Ai, Qi Chen, Hongyang Yan, Teng Huang, and Kongyang Chen. Information Sciences, 2022.
-
A Feature-Based On-Line Detector to Remove Adversarial-Backdoors by Iterative Demarcation. [pdf]
- Hao Fu, Akshaj Kumar Veldanda, Prashanth Krishnamurthy, Siddharth Garg, and Farshad Khorrami. IEEE ACCESS, 2022.
-
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective. [pdf] [code]
- Yi Zeng, Won Park, Z. Morley Mao, and Ruoxi Jia. ICCV, 2021.
-
Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection. [pdf] [code]
- Di Tang, XiaoFeng Wang, Haixu Tang, and Kehuan Zhang. USENIX Security, 2021.
-
SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics. [pdf] [code]
- Jonathan Hayase, Weihao Kong, Raghav Somani, and Sewoong Oh. ICML, 2021.
-
CLEANN: Accelerated Trojan Shield for Embedded Neural Networks. [pdf]
- Mojan Javaheripi, Mohammad Samragh, Gregory Fields, Tara Javidi, and Farinaz Koushanfar. ICCAD, 2020.
-
Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]
- Min Du, Ruoxi Jia, and Dawn Song. ICLR, 2020.
-
Simple, Attack-Agnostic Defense Against Targeted Training Set Attacks Using Cosine Similarity. [pdf] [code]
- Zayd Hammoudeh and Daniel Lowd. ICML Workshop, 2021.
-
SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems. [pdf]
- Edward Chou, Florian Tramèr, and Giancarlo Pellegrino. IEEE S&P Workshop, 2020.
-
STRIP: A Defence Against Trojan Attacks on Deep Neural Networks. [pdf] [extension] [code]
- Yansong Gao, Chang Xu, Derui Wang, Shiping Chen, Damith C. Ranasinghe, and Surya Nepal. ACSAC, 2019.
-
Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering. [pdf] [code]
- Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. AAAI Workshop, 2019.
-
Deep Probabilistic Models to Detect Data Poisoning Attacks. [pdf]
- Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, and Omesh Tickoo. NeurIPS Workshop, 2019.
-
Spectral Signatures in Backdoor Attacks. [pdf] [code]
- Brandon Tran, Jerry Li, and Aleksander Madry. NeurIPS, 2018.
-
An Adaptive Black-box Defense against Trojan Attacks (TrojDef). [pdf]
- Guanxiong Liu, Abdallah Khreishah, Fatima Sharadgah, and Issa Khalil. arXiv, 2022.
-
Fight Poison with Poison: Detecting Backdoor Poison Samples via Decoupling Benign Correlations. [pdf] [code]
- Xiangyu Qi, Tinghao Xie, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
-
PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks. [pdf]
- Yue Wang, Wenqing Li, Esha Sarkar, Muhammad Shafique, Michail Maniatakos, and Saif Eddin Jabari. arXiv, 2022.
-
Neural Network Trojans Analysis and Mitigation from the Input Domain. [pdf]
- Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. arXiv, 2022.
-
A General Framework for Defending Against Backdoor Attacks via Influence Graph. [pdf]
- Xiaofei Sun, Jiwei Li, Xiaoya Li, Ziyao Wang, Tianwei Zhang, Han Qiu, Fei Wu, and Chun Fan. arXiv, 2021.
-
NTD: Non-Transferability Enabled Backdoor Detection. [pdf]
- Yinshan Li, Hua Ma, Zhi Zhang, Yansong Gao, Alsharif Abuadbba, Anmin Fu, Yifeng Zheng, Said F. Al-Sarawi, and Derek Abbott. arXiv, 2021.
-
A Unified Framework for Task-Driven Data Quality Management. [pdf]
- Tianhao Wang, Yi Zeng, Ming Jin, and Ruoxi Jia. arXiv, 2021.
-
TESDA: Transform Enabled Statistical Detection of Attacks in Deep Neural Networks. [pdf]
- Chandramouli Amarnath, Aishwarya H. Balwani, Kwondo Ma, and Abhijit Chatterjee. arXiv, 2021.
-
Traceback of Data Poisoning Attacks in Neural Networks. [pdf]
- Shawn Shan, Arjun Nitin Bhagoji, Haitao Zheng, and Ben Y. Zhao. arXiv, 2021.
-
Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility. [pdf]
- Charles Jin, Melinda Sun, and Martin Rinard. arXiv, 2021.
-
Online Defense of Trojaned Models using Misattributions. [pdf]
- Panagiota Kiourti, Wenchao Li, Anirban Roy, Karan Sikka, and Susmit Jha. arXiv, 2021.
-
Detecting Backdoor in Deep Neural Networks via Intentional Adversarial Perturbations. [pdf]
- Mingfu Xue, Yinghao Wu, Zhiyu Wu, Jian Wang, Yushu Zhang, and Weiqiang Liu. arXiv, 2021.
-
Exposing Backdoors in Robust Machine Learning Models. [pdf]
- Ezekiel Soremekun, Sakshi Udeshi, and Sudipta Chattopadhyay. arXiv, 2020.
-
HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
- Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.
-
Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks. [pdf]
- Alvin Chan and Yew-Soon Ong. arXiv, 2019.
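Many of the filtering defenses above score samples in feature space. Spectral Signatures (Tran et al., NeurIPS 2018) is the simplest to sketch: center the features of one class, project onto the top singular direction, and flag the highest-scoring samples. A toy reproduction of that scoring rule, where the synthetic "trigger" shift is purely illustrative:

```python
import numpy as np

def spectral_scores(feats):
    """Per-sample outlier score: squared projection of the centered feature
    vector onto the top right-singular direction of the class's feature
    matrix, following the Spectral Signatures scoring rule."""
    centered = feats - feats.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

# Toy class: 95 clean feature vectors plus 5 sharing a strong common
# direction (a synthetic +10 shift along coordinate 0).
rng = np.random.default_rng(1)
feats = rng.normal(size=(100, 32))
feats[95:] += 10.0 * np.eye(32)[0]
scores = spectral_scores(feats)
suspects = np.argsort(scores)[-5:]  # the five highest-scoring samples
```

The defense then retrains after removing the top-scoring fraction of each class; later entries in this section (SPECTRE, Beatrix) refine the statistic to survive adaptive attacks.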
Certificated Defense
-
BagFlip: A Certified Defense against Data Poisoning. [pdf] [code]
- Yuhao Zhang, Aws Albarghouthi, and Loris D'Antoni. NeurIPS, 2022.
-
RAB: Provable Robustness Against Backdoor Attacks. [pdf] [code]
- Maurice Weber, Xiaojun Xu, Bojan Karlas, Ce Zhang, and Bo Li. IEEE S&P, 2022.
-
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks. [pdf]
- Jinyuan Jia, Yupei Liu, Xiaoyu Cao, and Neil Zhenqiang Gong. AAAI, 2022.
-
Deep Partition Aggregation: Provable Defense against General Poisoning Attacks. [pdf] [code]
- Alexander Levine and Soheil Feizi. ICLR, 2021.
-
Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks. [pdf] [code]
- Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. AAAI, 2021.
-
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing. [pdf]
- Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, and J. Zico Kolter. ICML, 2020.
-
On Certifying Robustness against Backdoor Attacks via Randomized Smoothing. [pdf]
- Binghui Wang, Xiaoyu Cao, Jinyuan Jia, and Neil Zhenqiang Gong. CVPR Workshop, 2020.
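Several certified defenses above (Deep Partition Aggregation, the bagging certificates) reduce to the same voting argument: each poisoned training sample lands in at most one partition, so it can change at most one base model's vote, and a large vote gap certifies the prediction. A simplified sketch; the tie-breaking here is coarser than the papers' exact certificates.

```python
from collections import Counter

def dpa_predict(votes):
    """Majority vote over base models trained on disjoint hash partitions
    of the training set. One poisoned sample affects exactly one partition,
    hence at most one vote, so the gap between the top two vote counts
    bounds how many poisons the prediction tolerates. The radius below is
    a simplified floor((n1 - n2) / 2) bound."""
    ranked = Counter(votes).most_common()
    top, n1 = ranked[0]
    n2 = ranked[1][1] if len(ranked) > 1 else 0
    return top, (n1 - n2) // 2

votes = ["cat"] * 7 + ["dog"] * 3   # predictions of 10 partition models
label, radius = dpa_predict(votes)  # ("cat", 2)
```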
Attack and Defense Towards Other Paradigms and Tasks
Federated Learning
-
Neurotoxin: Durable Backdoors in Federated Learning. [pdf]
- Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Joseph E. Gonzalez, Kannan Ramchandran, and Prateek Mittal. ICML, 2022.
-
FLAME: Taming Backdoors in Federated Learning. [pdf]
- Thien Duc Nguyen, Phillip Rieger, Huili Chen, Hossein Yalame, Helen Möllering, Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Shaza Zeitouni, Farinaz Koushanfar, Ahmad-Reza Sadeghi, and Thomas Schneider. USENIX Security, 2022.
-
DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection. [pdf]
- Phillip Rieger, Thien Duc Nguyen, Markus Miettinen, and Ahmad-Reza Sadeghi. NDSS, 2022.
-
Defending Label Inference and Backdoor Attacks in Vertical Federated Learning. [pdf]
- Yang Liu, Zhihao Yi, Yan Kang, Yuanqin He, Wenhan Liu, Tianyuan Zou, and Qiang Yang. AAAI, 2022.
-
An Analysis of Byzantine-Tolerant Aggregation Mechanisms on Model Poisoning in Federated Learning. [link]
- Mary Roszel, Robert Norvill, and Radu State. MDAI, 2022.
-
Against Backdoor Attacks In Federated Learning With Differential Privacy. [link]
- Lu Miao, Wei Yang, Rong Hu, Lu Li, and Liusheng Huang. ICASSP, 2022.
-
Secure Partial Aggregation: Making Federated Learning More Robust for Industry 4.0 Applications. [link]
- Jiqiang Gao, Baolei Zhang, Xiaojie Guo, Thar Baker, Min Li, and Zheli Liu. IEEE Transactions on Industrial Informatics, 2022.
-
Backdoor Attacks-resilient Aggregation based on Robust Filtering of Outliers in Federated Learning for Image Classification. [link]
- Nuria Rodríguez-Barroso, Eugenio Martínez-Cámara, M. Victoria Luzón, and Francisco Herrera. Knowledge-Based Systems, 2022.
-
Defense against Backdoor Attack in Federated Learning. [link] [code]
- Shiwei Lu, Ruihu Li, Wenbin Liu, and Xuan Chen. Computers & Security, 2022.
-
Privacy-Enhanced Federated Learning against Poisoning Adversaries. [link]
- Xiaoyuan Liu, Hongwei Li, Guowen Xu, Zongqi Chen, Xiaoming Huang, and Rongxing Lu. IEEE Transactions on Information Forensics and Security, 2021.
-
Coordinated Backdoor Attacks against Federated Learning with Model-Dependent Triggers. [link]
- Xueluan Gong, Yanjiao Chen, Huayang Huang, Yuqing Liao, Shuai Wang, and Qian Wang. IEEE Network, 2022.
-
CRFL: Certifiably Robust Federated Learning against Backdoor Attacks. [pdf]
- Chulin Xie, Minghao Chen, Pin-Yu Chen, and Bo Li. ICML, 2021.
-
Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning. [pdf]
- Syed Zawad, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, and Feng Yan. AAAI, 2021.
-
Defending Against Backdoors in Federated Learning with Robust Learning Rate. [pdf]
- Mustafa Safa Ozdayi, Murat Kantarcioglu, and Yulia R. Gel. AAAI, 2021.
-
BaFFLe: Backdoor Detection via Feedback-based Federated Learning. [pdf]
- Sébastien Andreina, Giorgia Azzurra Marson, Helen Möllering, and Ghassan Karame. ICDCS, 2021.
-
PipAttack: Poisoning Federated Recommender Systems for Manipulating Item Promotion. [pdf]
- Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Quoc Viet Hung Nguyen, and Lizhen Cui. WSDM, 2021.
-
Mitigating the Backdoor Attack by Federated Filters for Industrial IoT Applications. [link]
- Boyu Hou, Jiqiang Gao, Xiaojie Guo, Thar Baker, Ying Zhang, Yanlong Wen, and Zheli Liu. IEEE Transactions on Industrial Informatics, 2021.
-
Stability-Based Analysis and Defense against Backdoor Attacks on Edge Computing Services. [link]
- Yi Zhao, Ke Xu, Haiyang Wang, Bo Li, and Ruoxi Jia. IEEE Network, 2021.
-
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning. [pdf]
- Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, and Dimitris Papailiopoulos. NeurIPS, 2020.
-
DBA: Distributed Backdoor Attacks against Federated Learning. [pdf]
- Chulin Xie, Keli Huang, Pin-Yu Chen, and Bo Li. ICLR, 2020.
-
The Limitations of Federated Learning in Sybil Settings. [pdf] [extension] [code]
- Clement Fung, Chris J.M. Yoon, and Ivan Beschastnikh. RAID, 2020 (arXiv, 2018).
-
How to Backdoor Federated Learning. [pdf]
- Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. AISTATS, 2020 (arXiv, 2018).
-
BEAS: Blockchain Enabled Asynchronous & Secure Federated Machine Learning. [pdf]
- Arup Mondal, Harpreet Virk, and Debayan Gupta. AAAI Workshop, 2022.
-
Backdoor Attacks and Defenses in Feature-partitioned Collaborative Learning. [pdf]
- Yang Liu, Zhihao Yi, and Tianjian Chen. ICML Workshop, 2020.
-
Can You Really Backdoor Federated Learning? [pdf]
- Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, and H. Brendan McMahan. NeurIPS Workshop, 2019.
-
Invariant Aggregator for Defending Federated Backdoor Attacks. [pdf]
- Xiaoyang Wang, Dimitrios Dimitriadis, Sanmi Koyejo, and Shruti Tople. arXiv, 2022.
-
Shielding Federated Learning: Mitigating Byzantine Attacks with Less Constraints. [pdf]
- Minghui Li, Wei Wan, Jianrong Lu, Shengshan Hu, Junyu Shi, and Leo Yu Zhang. arXiv, 2022.
-
Federated Zero-Shot Learning for Visual Recognition. [pdf]
- Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, and Zi Huang. arXiv, 2022.
-
Assisting Backdoor Federated Learning with Whole Population Knowledge Alignment. [pdf]
- Tian Liu, Xueyang Hu, and Tao Shu. arXiv, 2022.
-
FL-Defender: Combating Targeted Attacks in Federated Learning. [pdf]
- Najeeb Jebreel and Josep Domingo-Ferrer. arXiv, 2022.
-
Backdoor Attack is A Devil in Federated GAN-based Medical Image Synthesis. [pdf]
- Ruinan Jin and Xiaoxiao Li. arXiv, 2022.
-
SafeNet: Mitigating Data Poisoning Attacks on Private Machine Learning. [pdf] [code]
- Harsh Chaudhari, Matthew Jagielski, and Alina Oprea. arXiv, 2022.
-
PerDoor: Persistent Non-Uniform Backdoors in Federated Learning using Adversarial Perturbations. [pdf] [code]
- Manaar Alam, Esha Sarkar, and Michail Maniatakos. arXiv, 2022.
-
Towards a Defense against Backdoor Attacks in Continual Federated Learning. [pdf]
- Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, and Sewoong Oh. arXiv, 2022.
-
Client-Wise Targeted Backdoor in Federated Learning. [pdf]
- Gorka Abad, Servio Paguada, Stjepan Picek, Víctor Julio Ramírez-Durán, and Aitor Urbieta. arXiv, 2022.
-
Backdoor Defense in Federated Learning Using Differential Testing and Outlier Detection. [pdf]
- Yein Kim, Huili Chen, and Farinaz Koushanfar. arXiv, 2022.
-
ARIBA: Towards Accurate and Robust Identification of Backdoor Attacks in Federated Learning. [pdf]
- Yuxi Mi, Jihong Guan, and Shuigeng Zhou. arXiv, 2022.
-
More is Better (Mostly): On the Backdoor Attacks in Federated Graph Neural Networks. [pdf]
- Jing Xu, Rui Wang, Kaitai Liang, and Stjepan Picek. arXiv, 2022.
-
Low-Loss Subspace Compression for Clean Gains against Multi-Agent Backdoor Attacks. [pdf]
- Siddhartha Datta and Nigel Shadbolt. arXiv, 2022.
-
Backdoors Stuck at The Frontdoor: Multi-Agent Backdoor Attacks That Backfire. [pdf]
- Siddhartha Datta and Nigel Shadbolt. arXiv, 2022.
-
Federated Unlearning with Knowledge Distillation. [pdf]
- Chen Wu, Sencun Zhu, and Prasenjit Mitra. arXiv, 2022.
-
Model Transferring Attacks to Backdoor HyperNetwork in Personalized Federated Learning. [pdf]
- Phung Lai, NhatHai Phan, Abdallah Khreishah, Issa Khalil, and Xintao Wu. arXiv, 2022.
-
Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis. [pdf]
- Zihang Zou, Boqing Gong, and Liqiang Wang. arXiv, 2021.
-
On Provable Backdoor Defense in Collaborative Learning. [pdf]
- Ximing Qiao, Yuhua Bai, Siping Hu, Ang Li, Yiran Chen, and Hai Li. arXiv, 2021.
-
SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification. [pdf]
- Ashwinee Panda, Saeed Mahloujifar, Arjun N. Bhagoji, Supriyo Chakraborty, and Prateek Mittal. arXiv, 2021.
-
Robust Federated Learning with Attack-Adaptive Aggregation. [pdf] [code]
- Ching Pui Wan and Qifeng Chen. arXiv, 2021.
-
Meta Federated Learning. [pdf]
- Omid Aramoon, Pin-Yu Chen, Gang Qu, and Yuan Tian. arXiv, 2021.
-
FLGUARD: Secure and Private Federated Learning. [pdf]
- Thien Duc Nguyen, Phillip Rieger, Hossein Yalame, Helen Möllering, Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Ahmad-Reza Sadeghi, Thomas Schneider, and Shaza Zeitouni. arXiv, 2021.
-
Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy. [pdf]
- Mohammad Naseri, Jamie Hayes, and Emiliano De Cristofaro. arXiv, 2020.
-
Backdoor Attacks on Federated Meta-Learning. [pdf]
- Chien-Lun Chen, Leana Golubchik, and Marco Paolieri. arXiv, 2020.
-
Dynamic Backdoor Attacks against Federated Learning. [pdf]
- Anbu Huang. arXiv, 2020.
-
Federated Learning in Adversarial Settings. [pdf]
- Raouf Kerkouche, Gergely Ács, and Claude Castelluccia. arXiv, 2020.
-
BlockFLA: Accountable Federated Learning via Hybrid Blockchain Architecture. [pdf]
- Harsh Bimal Desai, Mustafa Safa Ozdayi, and Murat Kantarcioglu. arXiv, 2020.
-
Mitigating Backdoor Attacks in Federated Learning. [pdf]
- Chen Wu, Xian Yang, Sencun Zhu, and Prasenjit Mitra. arXiv, 2020.
-
Learning to Detect Malicious Clients for Robust Federated Learning. [pdf]
- Suyi Li, Yong Cheng, Wei Wang, Yang Liu, and Tianjian Chen. arXiv, 2020.
-
Attack-Resistant Federated Learning with Residual-based Reweighting. [pdf] [code]
- Shuhao Fu, Chulin Xie, Bo Li, and Qifeng Chen. arXiv, 2019.
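Across the federated-learning defenses above, two server-side mechanisms recur: clipping each client update to a fixed norm, so a single backdoored client has bounded influence, and replacing the mean with a robust statistic such as the coordinate-wise median. A minimal sketch combining both, for illustration only; schemes like FLAME additionally cluster updates and add noise.

```python
import numpy as np

def robust_aggregate(client_updates, clip_norm=1.0):
    """Clip every client update to clip_norm, then take the coordinate-wise
    median across clients. Clipping bounds each attacker's contribution;
    the median resists a minority of arbitrarily chosen outliers."""
    clipped = []
    for u in client_updates:
        n = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / max(n, 1e-12)))
    return np.median(np.stack(clipped), axis=0)

honest = [np.array([0.1, -0.1, 0.05])] * 9
malicious = [np.array([50.0, 50.0, 50.0])]  # a backdoor-carrying update
agg = robust_aggregate(honest + malicious)
```

Here the malicious client's clipped update is outvoted coordinate-wise by the nine honest ones, so the aggregate equals the honest update.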
Transfer Learning
-
Incremental Learning, Incremental Backdoor Threats. [link]
- Wenbo Jiang, Tianwei Zhang, Han Qiu, Hongwei Li, and Guowen Xu. IEEE Transactions on Dependable and Secure Computing, 2022.
-
Robust Backdoor Injection with the Capability of Resisting Network Transfer. [link]
- Le Feng, Sheng Li, Zhenxing Qian, and Xinpeng Zhang. Information Sciences, 2022.
-
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation. [pdf]
- Yunjie Ge, Qian Wang, Baolin Zheng, Xinlu Zhuang, Qi Li, Chao Shen, and Cong Wang. ACM MM, 2021.
-
Hidden Trigger Backdoor Attacks. [pdf] [code]
- Aniruddha Saha, Akshayvarun Subramanya, and Hamed Pirsiavash. AAAI, 2020.
-
Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]
- Keita Kurita, Paul Michel, and Graham Neubig. ACL, 2020.
-
Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models. [pdf]
- Shuo Wang, Surya Nepal, Carsten Rudolph, Marthie Grobler, Shangyu Chen, and Tianle Chen. IEEE Transactions on Services Computing, 2020.
-
Latent Backdoor Attacks on Deep Neural Networks. [pdf]
- Yuanshun Yao, Huiying Li, Haitao Zheng, and Ben Y. Zhao. CCS, 2019.
-
Architectural Backdoors in Neural Networks. [pdf]
- Mikel Bober-Irizar, Ilia Shumailov, Yiren Zhao, Robert Mullins, and Nicolas Papernot. arXiv, 2022.
-
Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks. [pdf] [code]
- Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, and Maosong Sun. arXiv, 2021.
Reinforcement Learning
-
Provable Defense against Backdoor Policies in Reinforcement Learning. [pdf] [code]
- Shubham Kumar Bharti, Xuezhou Zhang, Adish Singla, and Jerry Zhu. NeurIPS, 2022.
-
MARNet: Backdoor Attacks against Cooperative Multi-Agent Reinforcement Learning. [link]
- Yanjiao Chen, Zhicong Zheng, and Xueluan Gong. IEEE Transactions on Dependable and Secure Computing, 2022.
-
BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning. [pdf]
- Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, and Dawn Song. IJCAI, 2021.
-
Stop-and-Go: Exploring Backdoor Attacks on Deep Reinforcement Learning-based Traffic Congestion Control Systems. [pdf]
- Yue Wang, Esha Sarkar, Michail Maniatakos, and Saif Eddin Jabari. IEEE Transactions on Information Forensics and Security, 2021.
-
Agent Manipulator: Stealthy Strategy Attacks on Deep Reinforcement Learning. [link]
- Jinyin Chen, Xueke Wang, Yan Zhang, Haibin Zheng, Shanqing Yu, and Liang Bao. Applied Intelligence, 2022.
-
TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning. [pdf] [code]
- Panagiota Kiourti, Kacper Wardega, Susmit Jha, and Wenchao Li. DAC, 2020.
-
Poisoning Deep Reinforcement Learning Agents with In-Distribution Triggers. [pdf]
- Chace Ashcraft and Kiran Karra. ICLR Workshop, 2021.
-
A Temporal-Pattern Backdoor Attack to Deep Reinforcement Learning. [pdf]
- Yinbo Yu, Jiajia Liu, Shouqing Li, Kepu Huang, and Xudong Feng. arXiv, 2022.
-
Backdoor Detection in Reinforcement Learning. [pdf]
- Junfeng Guo, Ang Li, and Cong Liu. arXiv, 2022.
-
Design of Intentional Backdoors in Sequential Models. [pdf]
- Zhaoyuan Yang, Naresh Iyer, Johan Reimann, and Nurali Virani. arXiv, 2019.
Semi-Supervised and Self-Supervised Learning
-
Backdoor Attacks on Self-Supervised Learning. [pdf] [code]
- Aniruddha Saha, Ajinkya Tejankar, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. CVPR, 2022.
-
Poisoning and Backdooring Contrastive Learning. [pdf]
- Nicholas Carlini and Andreas Terzis. ICLR, 2022.
-
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning. [pdf] [code]
- Jinyuan Jia, Yupei Liu, and Neil Zhenqiang Gong. IEEE S&P, 2022.
-
DeHiB: Deep Hidden Backdoor Attack on Semi-Supervised Learning via Adversarial Perturbation. [pdf]
- Zhicong Yan, Gaolei Li, Yuan Tian, Jun Wu, Shenghong Li, Mingzhe Chen, and H. Vincent Poor. AAAI, 2021.
-
Deep Neural Backdoor in Semi-Supervised Learning: Threats and Countermeasures. [link]
- Zhicong Yan, Jun Wu, Gaolei Li, Shenghong Li, and Mohsen Guizani. IEEE Transactions on Information Forensics and Security, 2021.
-
Backdoor Attacks in the Supply Chain of Masked Image Modeling. [pdf]
- Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, and Yang Zhang. arXiv, 2022.
-
Watermarking Pre-trained Encoders in Contrastive Learning. [pdf]
- Yutong Wu, Han Qiu, Tianwei Zhang, Jiwei Li, and Meikang Qiu. arXiv, 2022.
Quantization
-
RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN. [pdf] [code]
- Huy Phan, Cong Shi, Yi Xie, Tianfang Zhang, Zhuohang Li, Tianming Zhao, Jian Liu, Yan Wang, Yingying Chen, and Bo Yuan. ECCV, 2022.
-
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes. [pdf] [code]
- Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yiğitcan Kaya, and Tudor Dumitraş. NeurIPS, 2021.
-
Understanding the Threats of Trojaned Quantized Neural Network in Model Supply Chains. [pdf]
- Xudong Pan, Mi Zhang, Yifan Yan, and Min Yang. ACSAC, 2021.
-
Quantization Backdoors to Deep Learning Models. [pdf]
- Hua Ma, Huming Qiu, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said Al-Sarawi, and Derek Abbott. arXiv, 2021.
-
Stealthy Backdoors as Compression Artifacts. [pdf]
- Yulong Tian, Fnu Suya, Fengyuan Xu, and David Evans. arXiv, 2021.
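The quantization-backdoor papers above exploit one mechanism: rounding error in post-training quantization can push a model across a decision boundary, so a model can behave benignly in float32 yet maliciously once deployed in int8. A toy demonstration with hand-picked illustrative numbers:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization:
    round(w / s) * s with scale s = max|w| / 127."""
    s = np.abs(w).max() / 127.0
    return np.round(w / s) * s

# Illustrative weights for a tiny linear "model": the float logit is
# slightly positive, but int8 rounding nudges one weight just enough to
# flip the sign -- the effect the attacks above engineer deliberately.
w = np.array([0.5, -0.503])
x = np.array([1.007, 1.0])
float_logit = float(w @ x)                 # slightly positive
quant_logit = float(quantize_int8(w) @ x)  # slightly negative: sign flipped
```

An attacker who controls training can drive many weights toward such rounding boundaries, keeping the trigger dormant until the model is compressed.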
Natural Language Processing
-
BadPrompt: Backdoor Attacks on Continuous Prompts. [pdf] [code]
- Xiangrui Cai, Haidong Xu, Sihan Xu, Ying Zhang, and Xiaojie Yuan. NeurIPS, 2022.
-
Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models. [pdf] [code]
- Biru Zhu, Yujia Qin, Ganqu Cui, Yangyi Chen, Weilin Zhao, Chong Fu, Yangdong Deng, Zhiyuan Liu, Jingang Wang, Wei Wu, Maosong Sun, and Ming Gu. NeurIPS, 2022.
-
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. [pdf] [code]
- Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, and Maosong Sun. NeurIPS, 2022.
-
Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures. [pdf] [code]
- Eugene Bagdasaryan and Vitaly Shmatikov. IEEE S&P, 2022.
-
PICCOLO: Exposing Complex Backdoors in NLP Transformer Models. [pdf] [code]
- Yingqi Liu, Guangyu Shen, Guanhong Tao, Shengwei An, Shiqing Ma, and Xiangyu Zhang. IEEE S&P, 2022.
-
Triggerless Backdoor Attack for NLP Tasks with Clean Labels. [pdf]
- Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Shangwei Guo, and Chun Fan. NAACL, 2022.
-
A Study of the Attention Abnormality in Trojaned BERTs. [pdf] [code]
- Weimin Lyu, Songzhu Zheng, Tengfei Ma, and Chao Chen. NAACL, 2022.
-
The Triggers that Open the NLP Model Backdoors Are Hidden in the Adversarial Samples. [link]
- Kun Shao, Yu Zhang, Junan Yang, Xiaoshuai Li, and Hui Liu. Computers & Security, 2022.
-
BDDR: An Effective Defense Against Textual Backdoor Attacks. [pdf]
- Kun Shao, Junan Yang, Yang Ai, Hui Liu, and Yu Zhang. Computers & Security, 2021.
-
BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models. [pdf]
- Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, and Chun Fan. ICLR, 2022.
-
Exploring the Universal Vulnerability of Prompt-based Learning Paradigm. [pdf] [code]
- Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, and Zhiyuan Liu. NAACL-Findings, 2022.
-
Backdoor Pre-trained Models Can Transfer to All. [pdf]
- Lujia Shen, Shouling Ji, Xuhong Zhang, Jinfeng Li, Jing Chen, Jie Shi, Chengfang Fang, Jianwei Yin, and Ting Wang. CCS, 2021.
-
BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements. [pdf] [arXiv-20]
- Xiaoyi Chen, Ahmed Salem, Dingfan Chen, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, and Yang Zhang. ACSAC, 2021.
-
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning. [pdf]
- Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, and Xipeng Qiu. EMNLP, 2021.
-
T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification. [pdf]
- Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Jiameng Pu, Mobin Javed, Chandan K. Reddy, and Bimal Viswanath. USENIX Security, 2021.
-
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models. [pdf] [code]
- Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, and Xu Sun. EMNLP, 2021.
-
ONION: A Simple and Effective Defense Against Textual Backdoor Attacks. [pdf]
- Fanchao Qi, Yangyi Chen, Mukai Li, Zhiyuan Liu, and Maosong Sun. EMNLP, 2021.
-
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer. [pdf] [code]
- Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, and Maosong Sun. EMNLP, 2021.
-
Rethinking Stealthiness of Backdoor Attack against NLP Models. [pdf] [code]
- Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, and Xu Sun. ACL, 2021.
-
Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution. [pdf]
- Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, and Maosong Sun. ACL, 2021.
-
Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger. [pdf] [code]
- Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, and Maosong Sun. ACL, 2021.
-
Mitigating Data Poisoning in Text Classification with Differential Privacy. [pdf]
- Chang Xu, Jun Wang, Francisco Guzmán, Benjamin I. P. Rubinstein, and Trevor Cohn. EMNLP-Findings, 2021.
-
BFClass: A Backdoor-free Text Classification Framework. [pdf] [code]
- Zichao Li, Dheeraj Mekala, Chengyu Dong, and Jingbo Shang. EMNLP-Findings, 2021.
-
Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models. [pdf] [code]
- Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, and Bin He. NAACL-HLT, 2021.
-
Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects. [pdf]
- Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, and Bin He. NAACL-HLT, 2021.
-
Text Backdoor Detection Using An Interpretable RNN Abstract Model. [link]
- Ming Fan, Ziliang Si, Xiaofei Xie, Yang Liu, and Ting Liu. IEEE Transactions on Information Forensics and Security, 2021.
-
Textual Backdoor Attack for the Text Classification System. [pdf]
- Hyun Kwon and Sanghyun Lee. Security and Communication Networks, 2021.
-
Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]
- Keita Kurita, Paul Michel, and Graham Neubig. ACL, 2020.
-
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder. [pdf]
- Alvin Chan, Yi Tay, Yew-Soon Ong, and Aston Zhang. EMNLP-Findings, 2020.
-
A Backdoor Attack Against LSTM-based Text Classification Systems. [pdf]
- Jiazhu Dai, Chuanshuai Chen, and Yufeng Li. IEEE Access, 2019.
-
PerD: Perturbation Sensitivity-based Neural Trojan Detection Framework on NLP Applications. [pdf]
- Diego Garcia-soto, Huili Chen, and Farinaz Koushanfar. arXiv, 2022.
-
Kallima: A Clean-label Framework for Textual Backdoor Attacks. [pdf]
- Xiaoyi Chen, Yinpeng Dong, Zeyu Sun, Shengfang Zhai, Qingni Shen, and Zhonghai Wu. arXiv, 2022.
-
Textual Backdoor Attacks with Iterative Trigger Injection. [pdf] [code]
- Jun Yan, Vansh Gupta, and Xiang Ren. arXiv, 2022.
-
WeDef: Weakly Supervised Backdoor Defense for Text Classification. [pdf]
- Lesheng Jin, Zihan Wang, and Jingbo Shang. arXiv, 2022.
-
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense. [pdf]
- Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, and Xiangyu Zhang. arXiv, 2022.
-
Rethink Stealthy Backdoor Attacks in Natural Language Processing. [pdf]
- Lingfeng Shen, Haiyun Jiang, Lemao Liu, and Shuming Shi. arXiv, 2022.
-
Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks. [pdf]
- Yangyi Chen, Fanchao Qi, Zhiyuan Liu, and Maosong Sun. arXiv, 2021.
-
Spinning Sequence-to-Sequence Models with Meta-Backdoors. [pdf]
- Eugene Bagdasaryan and Vitaly Shmatikov. arXiv, 2021.
-
Defending against Backdoor Attacks in Natural Language Generation. [pdf] [code]
- Chun Fan, Xiaoya Li, Yuxian Meng, Xiaofei Sun, Xiang Ao, Fei Wu, Jiwei Li, and Tianwei Zhang. arXiv, 2021.
-
Hidden Backdoors in Human-Centric Language Models. [pdf]
- Shaofeng Li, Hui Liu, Tian Dong, Benjamin Zi Hao Zhao, Minhui Xue, Haojin Zhu, and Jialiang Lu. arXiv, 2021.
-
Detecting Universal Trigger’s Adversarial Attack with Honeypot. [pdf]
- Thai Le, Noseong Park, and Dongwon Lee. arXiv, 2020.
-
Mitigating Backdoor Attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification. [pdf]
- Chuanshuai Chen and Jiazhu Dai. arXiv, 2020.
-
Trojaning Language Models for Fun and Profit. [pdf]
- Xinyang Zhang, Zheng Zhang, and Ting Wang. arXiv, 2020.
Graph Neural Networks
-
Transferable Graph Backdoor Attack. [pdf]
- Shuiqiao Yang, Bao Gia Doan, Paul Montague, Olivier De Vel, Tamas Abraham, Seyit Camtepe, Damith C. Ranasinghe, and Salil S. Kanhere. RAID, 2022.
-
More is Better (Mostly): On the Backdoor Attacks in Federated Graph Neural Networks. [pdf]
- Jing Xu, Rui Wang, Kaitai Liang, and Stjepan Picek. ACSAC, 2022.
-
Graph Backdoor. [pdf] [code]
- Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. USENIX Security, 2021.
-
Backdoor Attacks to Graph Neural Networks. [pdf]
- Zaixi Zhang, Jinyuan Jia, Binghui Wang, and Neil Zhenqiang Gong. SACMAT, 2021.
-
Defending Against Backdoor Attack on Graph Nerual Network by Explainability. [pdf]
- Bingchen Jiang and Zhao Li. arXiv, 2022.
-
Link-Backdoor: Backdoor Attack on Link Prediction via Node Injection. [pdf] [code]
- Haibin Zheng, Haiyang Xiong, Haonan Ma, Guohan Huang, and Jinyin Chen. arXiv, 2022.
-
Neighboring Backdoor Attacks on Graph Convolutional Network. [pdf]
- Liang Chen, Qibiao Peng, Jintang Li, Yang Liu, Jiawei Chen, Yong Li, and Zibin Zheng. arXiv, 2022.
-
Dyn-Backdoor: Backdoor Attack on Dynamic Link Prediction. [pdf]
- Jinyin Chen, Haiyang Xiong, Haibin Zheng, Jian Zhang, Guodong Jiang, and Yi Liu. arXiv, 2021.
-
Explainability-based Backdoor Attacks Against Graph Neural Networks. [pdf]
- Jing Xu, Minhui Xue, and Stjepan Picek. arXiv, 2021.
Point Cloud
-
A Backdoor Attack against 3D Point Cloud Classifiers. [pdf] [code]
- Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, and George Kesidis. ICCV, 2021.
-
PointBA: Towards Backdoor Attacks in 3D Point Cloud. [pdf]
- Xinke Li, Zhiru Chen, Yue Zhao, Zekun Tong, Yabang Zhao, Andrew Lim, and Joey Tianyi Zhou. ICCV, 2021.
-
Imperceptible and Robust Backdoor Attack in 3D Point Cloud. [pdf]
- Kuofeng Gao, Jiawang Bai, Baoyuan Wu, Mengxi Ya, and Shu-Tao Xia. arXiv, 2022.
-
Detecting Backdoor Attacks Against Point Cloud Classifiers. [pdf]
- Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, and George Kesidis. arXiv, 2021.
-
Poisoning MorphNet for Clean-Label Backdoor Attack to Point Clouds. [pdf]
- Guiyu Tian, Wenhao Jiang, Wei Liu, and Yadong Mu. arXiv, 2021.
Acoustics Signal Processing
-
Backdoor Attack against Speaker Verification. [pdf] [code]
- Tongqing Zhai, Yiming Li, Ziqi Zhang, Baoyuan Wu, Yong Jiang, and Shu-Tao Xia. ICASSP, 2021.
-
DriNet: Dynamic Backdoor Attack against Automatic Speech Recognization Models. [link]
- Jianbin Ye, Xiaoyuan Liu, Zheng You, Guowei Li, and Bo Liu. Applied Sciences, 2022.
-
Can You Hear It? Backdoor Attacks via Ultrasonic Triggers. [pdf] [code]
- Stefanos Koffas, Jing Xu, Mauro Conti, and Stjepan Picek. WiseML, 2022.
-
Going in Style: Audio Backdoors Through Stylistic Transformations. [pdf]
- Stefanos Koffas, Luca Pajola, Stjepan Picek, and Mauro Conti. arXiv, 2022.
Medical Science
-
FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis. [pdf]
- Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, and Dacheng Tao. CVPR, 2022.
-
Exploiting Missing Value Patterns for a Backdoor Attack on Machine Learning Models of Electronic Health Records: Development and Validation Study. [link]
- Byunggill Joe, Yonghyeon Park, Jihun Hamm, Insik Shin, and Jiyeon Lee. JMIR Medical Informatics, 2022.
-
Machine Learning with Electronic Health Records is vulnerable to Backdoor Trigger Attacks. [pdf]
- Byunggill Joe, Akshay Mehra, Insik Shin, and Jihun Hamm. AAAI Workshop, 2021.
-
Explainability Matters: Backdoor Attacks on Medical Imaging. [pdf]
- Munachiso Nwadike, Takumi Miyawaki, Esha Sarkar, Michail Maniatakos, and Farah Shamout. AAAI Workshop, 2021.
-
TRAPDOOR: Repurposing Backdoors to Detect Dataset Bias in Machine Learning-based Genomic Analysis. [pdf]
- Esha Sarkar and Michail Maniatakos. arXiv, 2021.
Cybersecurity
-
VulnerGAN: A Backdoor Attack through Vulnerability Amplification against Machine Learning-based Network Intrusion Detection Systems. [link] [code]
- Guangrui Liu, Weizhe Zhang, Xinjie Li, Kaisheng Fan, and Shui Yu. Information Sciences, 2022.
-
Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers. [pdf]
- Giorgio Severi, Jim Meyer, Scott Coull, and Alina Oprea. USENIX Security, 2021.
-
Backdoor Attack on Machine Learning Based Android Malware Detectors. [link]
- Chaoran Li, Xiao Chen, Derui Wang, Sheng Wen, Muhammad Ejaz Ahmed, Seyit Camtepe, and Yang Xiang. IEEE Transactions on Dependable and Secure Computing, 2021.
-
Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers. [pdf]
- Limin Yang, Zhi Chen, Jacopo Cortellazzi, Feargus Pendlebury, Kevin Tu, Fabio Pierazzi, Lorenzo Cavallaro, and Gang Wang. arXiv, 2022.
Others
-
Few-Shot Backdoor Attacks on Visual Object Tracking. [pdf] [code]
- Yiming Li, Haoxiang Zhong, Xingjun Ma, Yong Jiang, and Shu-Tao Xia. ICLR, 2022.
-
Backdoor Attacks on Crowd Counting. [pdf]
- Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, and Lichao Sun. ACM MM, 2022.
-
Backdoor Attacks on the DNN Interpretation System. [pdf]
- Shihong Fang and Anna Choromanska. AAAI, 2022.
-
The Devil is in the GAN: Defending Deep Generative Models Against Backdoor Attacks. [pdf] [code] [demo]
- Ambrish Rawat, Killian Levacher, and Mathieu Sinn. ESORICS, 2022.
-
Object-Oriented Backdoor Attack Against Image Captioning. [link]
- Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, and Sheng Li. ICASSP, 2022.
-
When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution. [link]
- Vardaan Taneja, Pin-Yu Chen, Yuguang Yao, and Sijia Liu. ICASSP, 2022.
-
An Interpretive Perspective: Adversarial Trojaning Attack on Neural-Architecture-Search Enabled Edge AI Systems. [link]
- Ship Peng Xu, Ke Wang, Md. Rafiul Hassan, Mohammad Mehedi Hassan, and Chien-Ming Chen. IEEE Transactions on Industrial Informatics, 2022.
-
A Triggerless Backdoor Attack and Defense Mechanism for Intelligent Task Offloading in Multi-UAV Systems. [link]
- Shafkat Islam, Shahriar Badsha, Ibrahim Khalil, Mohammed Atiquzzaman, and Charalambos Konstantinou. IEEE Internet of Things Journal, 2022.
-
Multi-Target Invisibly Trojaned Networks for Visual Recognition and Detection. [pdf]
- Xinzhe Zhou, Wenhao Jiang, Sheng Qi, and Yadong Mu. IJCAI, 2021.
-
Hidden Backdoor Attack against Semantic Segmentation Models. [pdf]
- Yiming Li, Yanjie Li, Yalei Lv, Yong Jiang, and Shu-Tao Xia. ICLR Workshop, 2021.
-
Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models. [link]
- Muhammad Umer and Robi Polikar. IJCNN, 2021.
-
Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks. [pdf]
- Muhammad Umer, Glenn Dawson, and Robi Polikar. IJCNN, 2020.
-
Trojan Attacks on Wireless Signal Classification with Adversarial Machine Learning. [pdf]
- Kemal Davaslioglu and Yalin E. Sagduyu. DySPAN, 2019.
-
BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label. [pdf]
- Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan He, and Hai Jin. arXiv, 2022.
-
A Temporal Chrominance Trigger for Clean-label Backdoor Attack against Anti-spoof Rebroadcast Detection. [pdf]
- Wei Guo, Benedetta Tondi, and Mauro Barni. arXiv, 2022.
-
MACAB: Model-Agnostic Clean-Annotation Backdoor to Object Detection with Natural Trigger in Real-World. [pdf]
- Hua Ma, Yinshan Li, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said F. Al-Sarawi, Nepal Surya, and Derek Abbott. arXiv, 2022.
-
BadDet: Backdoor Attacks on Object Detection. [pdf]
- Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang, and Jun Zhou. arXiv, 2022.
-
Backdoor Attacks on Bayesian Neural Networks using Reverse Distribution. [pdf]
- Zhixin Pan and Prabhat Mishra. arXiv, 2022.
-
Backdooring Explainable Machine Learning. [pdf]
- Maximilian Noppel, Lukas Peter, and Christian Wressnegger. arXiv, 2022.
-
Clean-Annotation Backdoor Attack against Lane Detection Systems in the Wild. [pdf]
- Xingshuo Han, Guowen Xu, Yuan Zhou, Xuehuan Yang, Jiwei Li, and Tianwei Zhang. arXiv, 2022.
-
Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World. [pdf]
- Hua Ma, Yinshan Li, Yansong Gao, Alsharif Abuadbba, Zhi Zhang, Anmin Fu, Hyoungshick Kim, Said F. Al-Sarawi, Nepal Surya, and Derek Abbott. arXiv, 2022.
-
Targeted Trojan-Horse Attacks on Language-based Image Retrieval. [pdf]
- Fan Hu, Aozhu Chen, and Xirong Li. arXiv, 2022.
-
Is Multi-Modal Necessarily Better? Robustness Evaluation of Multi-modal Fake News Detection. [pdf]
- Jinyin Chen, Chengyu Jia, Haibin Zheng, Ruoxi Chen, and Chenbo Fu. arXiv, 2022.
-
Dual-Key Multimodal Backdoors for Visual Question Answering. [pdf]
- Matthew Walmer, Karan Sikka, Indranil Sur, Abhinav Shrivastava, and Susmit Jha. arXiv, 2021.
-
Clean-label Backdoor Attack against Deep Hashing based Retrieval. [pdf]
- Kuofeng Gao, Jiawang Bai, Bin Chen, Dongxian Wu, and Shu-Tao Xia. arXiv, 2021.
-
Backdoor Attacks on Network Certification via Data Poisoning. [pdf]
- Tobias Lorenz, Marta Kwiatkowska, and Mario Fritz. arXiv, 2021.
-
Backdoor Attack and Defense for Deep Regression. [pdf]
- Xi Li, George Kesidis, David J. Miller, and Vladimir Lucic. arXiv, 2021.
-
BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models. [pdf]
- Ahmed Salem, Yannick Sautter, Michael Backes, Mathias Humbert, and Yang Zhang. arXiv, 2020.
-
DeepObliviate: A Powerful Charm for Erasing Data Residual Memory in Deep Neural Networks. [pdf]
- Yingzhe He, Guozhu Meng, Kai Chen, Jinwen He, and Xingbo Hu. arXiv, 2021.
-
Backdoors in Neural Models of Source Code. [pdf]
- Goutham Ramakrishnan and Aws Albarghouthi. arXiv, 2020.
-
EEG-Based Brain-Computer Interfaces Are Vulnerable to Backdoor Attacks. [pdf]
- Lubin Meng, Jian Huang, Zhigang Zeng, Xue Jiang, Shan Yu, Tzyy-Ping Jung, Chin-Teng Lin, Ricardo Chavarriaga, and Dongrui Wu. arXiv, 2020.
-
Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks. [pdf]
- Kang Liu, Benjamin Tan, Gaurav Rajavendra Reddy, Siddharth Garg, Yiorgos Makris, and Ramesh Karri. arXiv, 2020.
Evaluation and Discussion
-
TROJANZOO: Everything You Ever Wanted to Know about Neural Backdoors (But were Afraid to Ask). [pdf] [code]
- Ren Pang, Zheng Zhang, Xiangshan Gao, Zhaohan Xi, Shouling Ji, Peng Cheng, and Ting Wang. EuroS&P, 2022.
-
BackdoorBench: A Comprehensive Benchmark of Backdoor Learning. [pdf] [code] [website]
- Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Chao Shen, and Hongyuan Zha. NeurIPS, 2022.
-
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. [pdf] [code]
- Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, and Maosong Sun. NeurIPS, 2022.
-
Backdoor Defense via Decoupling the Training Process. [pdf] [code]
- Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, and Kui Ren. ICLR, 2022.
-
How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data. [pdf]
- Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, and Xu Sun. ICLR, 2022.
-
Defending against Model Stealing via Verifying Embedded External Features. [pdf] [code]
- Yiming Li, Linghui Zhu, Xiaojun Jia, Yong Jiang, Shu-Tao Xia, and Xiaochun Cao. AAAI, 2022. (Discusses the limitations of using backdoor attacks for model watermarking.)
-
Susceptibility & Defense of Satellite Image-trained Convolutional Networks to Backdoor Attacks. [link]
- Ethan Brewer, Jason Lin, and Dan Runfola. Information Sciences, 2021.
-
Data-Efficient Backdoor Attacks. [pdf] [code]
- Pengfei Xia, Ziqiang Li, Wei Zhang, and Bin Li. IJCAI, 2022.
-
Excess Capacity and Backdoor Poisoning. [pdf]
- Naren Sarayu Manoj and Avrim Blum. NeurIPS, 2021.
-
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks. [pdf] [code]
- Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, and Tom Goldstein. ICML, 2021.
-
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective. [pdf]
- Yi Zeng, Won Park, Z. Morley Mao, and Ruoxi Jia. ICCV, 2021.
-
Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf] [Master Thesis]
- Emily Wenger, Josephine Passananti, Yuanshun Yao, Haitao Zheng, and Ben Y. Zhao. CVPR, 2021.
-
Can Optical Trojans Assist Adversarial Perturbations? [pdf]
- Adith Boloor, Tong Wu, Patrick Naughton, Ayan Chakrabarti, Xuan Zhang, and Yevgeniy Vorobeychik. ICCV Workshop, 2021.
-
On the Trade-off between Adversarial and Backdoor Robustness. [pdf]
- Cheng-Hsin Weng, Yan-Ting Lee, and Shan-Hung Wu. NeurIPS, 2020.
-
A Tale of Evil Twins: Adversarial Inputs versus Poisoned Models. [pdf] [code]
- Ren Pang, Hua Shen, Xinyang Zhang, Shouling Ji, Yevgeniy Vorobeychik, Xiapu Luo, Alex Liu, and Ting Wang. CCS, 2020.
-
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers. [pdf]
- Loc Truong, Chace Jones, Brian Hutchinson, Andrew August, Brenda Praggastis, Robert Jasper, Nicole Nichols, and Aaron Tuor. CVPR Workshop, 2020.
-
On Evaluating Neural Network Backdoor Defenses. [pdf]
- Akshaj Veldanda and Siddharth Garg. NeurIPS Workshop, 2020.
-
Attention Hijacking in Trojan Transformers. [pdf]
- Weimin Lyu, Songzhu Zheng, Tengfei Ma, Haibin Ling, and Chao Chen. arXiv, 2022.
-
Game of Trojans: A Submodular Byzantine Approach. [pdf]
- Dinuka Sahabandu, Arezoo Rajabi, Luyao Niu, Bo Li, Bhaskar Ramasubramanian, and Radha Poovendran. arXiv, 2022.
-
Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior. [pdf] [code]
- Jean-Stanislas Denain and Jacob Steinhardt. arXiv, 2022.
-
Natural Backdoor Datasets. [pdf]
- Emily Wenger, Roma Bhattacharjee, Arjun Nitin Bhagoji, Josephine Passananti, Emilio Andere, Haitao Zheng, and Ben Y. Zhao. arXiv, 2022.
-
Can Backdoor Attacks Survive Time-Varying Models? [pdf]
- Huiying Li, Arjun Nitin Bhagoji, Ben Y. Zhao, and Haitao Zheng. arXiv, 2022.
-
Dynamic Backdoor Attacks with Global Average Pooling [pdf] [code]
- Stefanos Koffas, Stjepan Picek, and Mauro Conti. arXiv, 2022.
-
Planting Undetectable Backdoors in Machine Learning Models. [pdf]
- Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. arXiv, 2022.
-
Towards A Critical Evaluation of Robustness for Deep Learning Backdoor Countermeasures. [pdf]
- Huming Qiu, Hua Ma, Zhi Zhang, Alsharif Abuadbba, Wei Kang, Anmin Fu, and Yansong Gao. arXiv, 2022.
-
Neural Network Trojans Analysis and Mitigation from the Input Domain. [pdf]
- Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. arXiv, 2022.
-
Widen The Backdoor To Let More Attackers In. [pdf]
- Siddhartha Datta, Giulio Lovisotto, Ivan Martinovic, and Nigel Shadbolt. arXiv, 2021.
-
Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions. [pdf]
- Antonio Emanuele Cinà, Kathrin Grosse, Sebastiano Vascon, Ambra Demontis, Battista Biggio, Fabio Roli, and Marcello Pelillo. arXiv, 2021.
-
Rethinking the Trigger of Backdoor Attack. [pdf]
- Yiming Li, Tongqing Zhai, Baoyuan Wu, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. arXiv, 2020.
-
Poisoned Classifiers are Not Only Backdoored, They are Fundamentally Broken. [pdf] [code]
- Mingjie Sun, Siddhant Agarwal, and J. Zico Kolter. ICLR Workshop, 2021.
-
Effect of Backdoor Attacks over the Complexity of the Latent Space Distribution. [pdf] [code]
- Henry D. Chacon and Paul Rad. arXiv, 2020.
-
Trembling Triggers: Exploring the Sensitivity of Backdoors in DNN-based Face Recognition. [pdf]
- Cecilia Pasquini and Rainer Böhme. EURASIP Journal on Information Security, 2020.
-
Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]
- N. Benjamin Erichson, Dane Taylor, Qixuan Wu, and Michael W. Mahoney. arXiv, 2020.
Backdoor Attack for Positive Purposes
-
Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection. [pdf] [code]
- Yiming Li, Yang Bai, Yong Jiang, Yong Yang, Shu-Tao Xia, and Bo Li. NeurIPS, 2022.
-
Membership Inference via Backdooring. [pdf] [code]
- Hongsheng Hu, Zoran Salcic, Gillian Dobbie, Jinjun Chen, Lichao Sun, and Xuyun Zhang. IJCAI, 2022.
-
Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects. [pdf]
- Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, and Bin He. NAACL-HLT, 2021.
-
One Step Further: Evaluating Interpreters using Metamorphic Testing. [pdf]
- Ming Fan, Jiali Wei, Wuxia Jin, Zhou Xu, Wenying Wei, and Ting Liu. ISSTA, 2022.
-
What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors. [pdf]
- Yi-Shan Lin, Wen-Chuan Lee, and Z. Berkay Celik. KDD, 2021.
-
Using Honeypots to Catch Adversarial Attacks on Neural Networks. [pdf]
- Shawn Shan, Emily Wenger, Bolun Wang, Bo Li, Haitao Zheng, and Ben Y. Zhao. CCS, 2020. (Note: this defense was recently bypassed by Nicholas Carlini. [arXiv])
-
Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring. [pdf] [code]
- Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet. USENIX Security, 2018.
-
Open-sourced Dataset Protection via Backdoor Watermarking. [pdf] [code]
- Yiming Li, Ziqi Zhang, Jiawang Bai, Baoyuan Wu, Yong Jiang, and Shu-Tao Xia. NeurIPS Workshop, 2020.
-
Protecting Deep Cerebrospinal Fluid Cell Image Processing Models with Backdoor and Semi-Distillation. [link]
- FangQi Li, Shilin Wang, and Zhenhai Wang. DICTA, 2021.
-
Debiasing Backdoor Attack: A Benign Application of Backdoor Attack in Eliminating Data Bias. [pdf]
- Shangxi Wu, Qiuyang He, Yi Zhang, and Jitao Sang. arXiv, 2022.
-
Watermarking Graph Neural Networks based on Backdoor Attacks. [pdf]
- Jing Xu and Stjepan Picek. arXiv, 2021.
-
CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning. [pdf]
- Zhensu Sun, Xiaoning Du, Fu Song, Mingze Ni, and Li Li. arXiv, 2021.
-
What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. [pdf]
- Shihao Zhao, Xingjun Ma, Yisen Wang, James Bailey, Bo Li, and Yu-Gang Jiang. arXiv, 2021.
-
A Stealthy and Robust Fingerprinting Scheme for Generative Models. [pdf]
- Guanlin Li, Shangwei Guo, Run Wang, Guowen Xu, and Tianwei Zhang. arXiv, 2021.
-
Towards Probabilistic Verification of Machine Unlearning. [pdf] [code]
- David Marco Sommer, Liwei Song, Sameer Wagh, and Prateek Mittal. arXiv, 2020.
Competition
- Trojan Detection Challenge
- IARPA TrojAI Competition