2025
Single AI Agent Runtime Security Testing Standards
Ant Group; World Digital Technology Academy
Agent Safety Alignment via Reinforcement Learning
Zeyang Sha, Hanling Tian, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weiqiang Wang; Arxiv
A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures
Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, Yufeng Li, Yilun Zhang, Zeyang Sha, Yuyuan Li, Changting Lin, Xun Wang, Xuan Liu, Muhammad Khurram Khan, Ningyu Zhang, Chaochao Chen, Meng Han; Arxiv
SEM: Reinforcement Learning for Search-Efficient Large Language Models
Zeyang Sha, Shiwen Cui, Weiqiang Wang; Arxiv
FragFake: A Dataset for Fine-Grained Detection of Edited Images with Vision Language Models
Zhen Sun, Ziyi Zhang, Zeren Luo, Zeyang Sha, Tianshuo Cong, Zheng Li, Shiwen Cui, Weiqiang Wang, Jiaheng Wei, Xinlei He, Qi Li, Qian Wang; Arxiv
arxiv
Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
Yule Liu, Jingyi Zheng, Zhen Sun, Zifan Peng, Wenhan Dong, Zeyang Sha, Shiwen Cui, Weiqiang Wang, Xinlei He; Arxiv
arxiv
2024
Conversation Reconstruction Attack Against GPT Models
Junjie Chu, Zeyang Sha, Michael Backes, Yang Zhang; EMNLP 2024
arxiv code
ZeroFake: Zero-Shot Detection of Fake Images Generated and Edited by Text-to-Image Generation Models
Zeyang Sha, Yicong Tan, Mingjie Li, Michael Backes, Yang Zhang; CCS 2024
Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang; ICWSM 2024
arxiv
Prompt Stealing Attacks Against Large Language Models
Zeyang Sha, Yang Zhang; Arxiv
Comprehensive Assessment of Toxicity in ChatGPT
Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang; Arxiv
arxiv
2023
DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang; CCS 2023
arxiv code
Best paper finalist at CSAW Europe 2024