TY - GEN
T1 - LLMs and Copyright Risks
T2 - 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2025
AU - Zhang, Denghui
AU - Xu, Zhaozhuo
AU - Zhao, Weijie
N1 - Publisher Copyright:
©2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - Large Language Models (LLMs) have revolutionized natural language processing, but their widespread use has raised significant copyright concerns. This tutorial addresses the complex intersection of LLMs and copyright law, providing researchers and practitioners with essential knowledge and tools to navigate this challenging landscape. The tutorial begins with an overview of relevant copyright principles and their application to AI, followed by an examination of specific copyright issues in LLM development and deployment. A key focus will be on technical approaches to copyright risk assessment and mitigation in LLMs. We will introduce benchmarks for evaluating copyright-related risks, including memorization detection and probing techniques. The tutorial will then cover practical mitigation strategies, such as machine unlearning, efficient fine-tuning methods, and alignment approaches to reduce copyright infringement risks. Ethical considerations and future directions in copyright-aware AI development will also be discussed.
AB - Large Language Models (LLMs) have revolutionized natural language processing, but their widespread use has raised significant copyright concerns. This tutorial addresses the complex intersection of LLMs and copyright law, providing researchers and practitioners with essential knowledge and tools to navigate this challenging landscape. The tutorial begins with an overview of relevant copyright principles and their application to AI, followed by an examination of specific copyright issues in LLM development and deployment. A key focus will be on technical approaches to copyright risk assessment and mitigation in LLMs. We will introduce benchmarks for evaluating copyright-related risks, including memorization detection and probing techniques. The tutorial will then cover practical mitigation strategies, such as machine unlearning, efficient fine-tuning methods, and alignment approaches to reduce copyright infringement risks. Ethical considerations and future directions in copyright-aware AI development will also be discussed.
UR - https://www.scopus.com/pages/publications/105027097546
UR - https://www.scopus.com/pages/publications/105027097546#tab=citedBy
U2 - 10.18653/v1/2025.naacl-tutorial.7
DO - 10.18653/v1/2025.naacl-tutorial.7
M3 - Conference contribution
AN - SCOPUS:105027097546
T3 - Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies: Long Papers, NAACL-HLT 2025
SP - 44
EP - 50
BT - Tutorial Abstracts
A2 - Lomeli, Maria
A2 - Swayamdipta, Swabha
A2 - Zhang, Rui
Y2 - 29 April 2025 through 4 May 2025
ER -