* indicates corresponding author

Teach to Reason Safely: Policy-Guided Safety Tuning for MLRMs
Jingyu Zhang, Kun Yang*, Ming Wen, Zhuoer Xu, Zeyang Sha*, Shiwen Cui, Zhaohui Yang
ICLR 2026 [arXiv]

Single AI Agent Runtime Security Testing Standards
Ant Group
World Digital Technology Academy [WDTA]

Agent Safety Alignment via Reinforcement Learning
Zeyang Sha, Hanling Tian, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weiqiang Wang
arXiv [arXiv]

A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures
Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, Yufeng Li, Yilun Zhang, Zeyang Sha, Yuyuan Li, Changting Lin, Xun Wang, Xuan Liu, Muhammad Khurram Khan, Ningyu Zhang, Chaochao Chen, Meng Han
arXiv [arXiv]

SEM: Reinforcement Learning for Search-Efficient Large Language Models
Zeyang Sha, Shiwen Cui, Weiqiang Wang
arXiv [arXiv]

FragFake: A Dataset for Fine-Grained Detection of Edited Images with Vision Language Models
Zhen Sun, Ziyi Zhang, Zeren Luo, Zeyang Sha, Tianshuo Cong, Zheng Li, Shiwen Cui, Weiqiang Wang, Jiaheng Wei, Xinlei He, Qi Li, Qian Wang
arXiv [arXiv]

Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
Yule Liu, Jingyi Zheng, Zhen Sun, Zifan Peng, Wenhan Dong, Zeyang Sha, Shiwen Cui, Weiqiang Wang, Xinlei He
arXiv [arXiv]

Conversation Reconstruction Attack Against GPT Models
Junjie Chu, Zeyang Sha*, Michael Backes, Yang Zhang*
EMNLP 2024 [arXiv] [Code]

ZeroFake: Zero-Shot Detection of Fake Images Generated and Edited by Text-to-Image Generation Models
Zeyang Sha, Yicong Tan, Mingjie Li, Michael Backes, Yang Zhang
CCS 2024 [arXiv] [Code]

Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang
ICWSM 2024 [arXiv]

Prompt Stealing Attacks Against Large Language Models
Zeyang Sha, Yang Zhang
arXiv [arXiv]

Comprehensive Assessment of Toxicity in ChatGPT
Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang
arXiv [arXiv]

DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang
CCS 2023 [arXiv] [Code]
Best Paper Finalist · CSAW Europe 2024

Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha, Xinlei He, Ning Yu, Michael Backes, Yang Zhang
CVPR 2023 [arXiv] [Code]

From Visual Prompt Learning to Zero-Shot Transfer: Mapping Is All You Need
Ziqing Yang, Zeyang Sha, Michael Backes, Yang Zhang
arXiv [arXiv]

Fine-Tuning Is All You Need to Mitigate Backdoor Attacks
Zeyang Sha, Xinlei He, Pascal Berrang, Mathias Humbert, Yang Zhang
arXiv [arXiv]