Empirical analysis of multi-task learning for reducing identity bias in toxic comment detection

Ameya Vaidya, Feng Mai, Yue Ning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

34 Scopus citations

Abstract

With the recent rise of toxicity in online conversations on social media platforms, using modern machine learning algorithms for toxic comment detection has become a central focus of many online applications. Researchers and companies have developed a variety of models to identify toxicity in online conversations, reviews, or comments with mixed success. However, many existing approaches have learned to incorrectly associate non-toxic comments that contain certain trigger words (e.g., gay, lesbian, black, muslim) with toxicity. In this paper, we evaluate several state-of-the-art models with the specific focus of reducing model bias towards these commonly-attacked identity groups. We propose a multi-task learning model with an attention layer that jointly learns to predict the toxicity of a comment as well as the identities present in the comment in order to reduce this bias. We then compare our model to an array of shallow and deep-learning models using metrics designed specifically to test for unintended model bias within these identity groups.
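A minimal sketch of the multi-task idea described in the abstract, assuming a shared BiLSTM encoder with attention pooling feeding two heads, one for toxicity and one for identity mentions. The layer sizes, number of identity labels, and equal loss weighting are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class MultiTaskToxicityModel(nn.Module):
    """Illustrative multi-task model: a shared encoder with attention pooling
    feeding a toxicity head and an identity head. Hyper-parameters are
    placeholders, not the values used in the paper."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_identities=9):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.attention = nn.Linear(2 * hidden_dim, 1)        # per-token attention scores
        self.toxicity_head = nn.Linear(2 * hidden_dim, 1)    # binary toxicity logit
        self.identity_head = nn.Linear(2 * hidden_dim, num_identities)  # multi-label identities

    def forward(self, token_ids):
        states, _ = self.encoder(self.embedding(token_ids))      # (B, T, 2H)
        weights = torch.softmax(self.attention(states), dim=1)   # (B, T, 1)
        pooled = (weights * states).sum(dim=1)                   # attention-weighted summary
        return self.toxicity_head(pooled), self.identity_head(pooled)

# Joint objective: the auxiliary identity task regularizes the shared
# representation so identity terms alone are not treated as evidence of toxicity.
model = MultiTaskToxicityModel(vocab_size=50_000)
tox_logits, id_logits = model(torch.randint(1, 50_000, (4, 64)))
loss = (nn.BCEWithLogitsLoss()(tox_logits.squeeze(-1), torch.rand(4)) +
        nn.BCEWithLogitsLoss()(id_logits, torch.rand(4, 9)))
```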

Original language: English
Title of host publication: Proceedings of the 14th International AAAI Conference on Web and Social Media, ICWSM 2020
Pages: 683-693
Number of pages: 11
ISBN (Electronic): 9781577357889
State: Published - 2020
Event: 14th International AAAI Conference on Web and Social Media, ICWSM 2020 - Atlanta, Virtual, United States
Duration: 8 Jun 2020 - 11 Jun 2020

Publication series

Name: Proceedings of the 14th International AAAI Conference on Web and Social Media, ICWSM 2020

Conference

Conference: 14th International AAAI Conference on Web and Social Media, ICWSM 2020
Country/Territory: United States
City: Atlanta, Virtual
Period: 8/06/20 - 11/06/20
