TY - GEN
T1 - Profiling LLM’s Copyright Infringement Risks under Adversarial Persuasive Prompting
AU - Long, Jikai
AU - Liu, Ming
AU - Chen, Xiusi
AU - Xu, Jialiang
AU - Li, Shenglan
AU - Xu, Zhaozhuo
AU - Zhang, Denghui
N1 - Publisher Copyright:
©2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - Large Language Models (LLMs) have demonstrated impressive capabilities in text generation but raise concerns regarding potential copyright infringement. While prior research has explored mitigation strategies like content filtering and alignment, the impact of adversarial persuasion techniques in eliciting copyrighted content remains underexplored. This paper investigates how structured persuasion strategies, including logical appeals, emotional framing, and compliance techniques, can be used to manipulate LLM outputs and potentially increase copyright risks. We introduce a structured persuasion workflow, incorporating query mutation, intention-preserving filtering, and few-shot prompting, to systematically analyze the influence of persuasive prompts on LLM responses. Through experiments on state-of-the-art LLMs, including GPT-4o-mini and Claude-3-haiku, we quantify the effectiveness of different persuasion techniques and assess their implications for AI safety. Our results highlight the vulnerabilities of LLMs to adversarial persuasion and provide empirical evidence of the increased risk of generating copyrighted content under such influence. We conclude with recommendations for strengthening model safeguards and future directions for enhancing LLM robustness against manipulation. Code is available at https://github.com/Rongite/Persuasion.
AB - Large Language Models (LLMs) have demonstrated impressive capabilities in text generation but raise concerns regarding potential copyright infringement. While prior research has explored mitigation strategies like content filtering and alignment, the impact of adversarial persuasion techniques in eliciting copyrighted content remains underexplored. This paper investigates how structured persuasion strategies, including logical appeals, emotional framing, and compliance techniques, can be used to manipulate LLM outputs and potentially increase copyright risks. We introduce a structured persuasion workflow, incorporating query mutation, intention-preserving filtering, and few-shot prompting, to systematically analyze the influence of persuasive prompts on LLM responses. Through experiments on state-of-the-art LLMs, including GPT-4o-mini and Claude-3-haiku, we quantify the effectiveness of different persuasion techniques and assess their implications for AI safety. Our results highlight the vulnerabilities of LLMs to adversarial persuasion and provide empirical evidence of the increased risk of generating copyrighted content under such influence. We conclude with recommendations for strengthening model safeguards and future directions for enhancing LLM robustness against manipulation. Code is available at https://github.com/Rongite/Persuasion.
UR - https://www.scopus.com/pages/publications/105028958534
UR - https://www.scopus.com/pages/publications/105028958534#tab=citedBy
U2 - 10.18653/v1/2025.findings-emnlp.855
DO - 10.18653/v1/2025.findings-emnlp.855
M3 - Conference contribution
AN - SCOPUS:105028958534
T3 - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
SP - 15799
EP - 15823
BT - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
A2 - Christodoulopoulos, Christos
A2 - Chakraborty, Tanmoy
A2 - Rose, Carolyn
A2 - Peng, Violet
T2 - 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
Y2 - 4 November 2025 through 9 November 2025
ER -