Publications

You can also find my articles on my Google Scholar profile.

2025

Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification

Boyang Zhang, Yicong Tan, Yun Shen, Ahmed Salem, Michael Backes, Savvas Zannettou, Yang Zhang; EMNLP 2025

pdf arxiv

Defeating Cerberus: Concept-Guided Privacy-Leakage Mitigation in Multimodal Language Models

Boyang Zhang, Istemi Ekin Akkus, Ruichuan Chen, Alice Dethise, Klaus Satzke, Ivica Rimac, Yang Zhang; arXiv

pdf arxiv

2024

The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective

Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang; EMNLP 2024

pdf

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang; USENIX Security 2024

pdf arxiv code

Comprehensive Assessment of Toxicity in ChatGPT

Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang; arXiv

pdf arxiv

2023

A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots

Boyang Zhang, Xinlei He, Yun Shen, Tianhao Wang, Yang Zhang; USENIX Security 2023

pdf arxiv code