Scam Detection in Twitter

Xiaoling Chen, Rajarathnam Chandramouli, Koduvayur P. Subbalakshmi

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review

13 Scopus citations

Abstract

Twitter is one of the fastest-growing social networking services. This growth has led to an increase in Twitter scams (e.g., intentional deception), yet relatively little effort has gone into identifying scams on Twitter. In this chapter, we propose a semi-supervised Twitter scam detector that requires only a small amount of labeled data. The scam detector combines self-learning and clustering analysis, and uses a suffix tree data structure. Model building based on the Akaike and Bayesian Information Criteria is investigated and combined with the classification step. Our experiments show that, among other results, 87% accuracy is achievable with only 9 labeled samples and 4000 unlabeled samples.
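
The abstract does not say how the Akaike and Bayesian Information Criteria are applied in the chapter. As a minimal sketch under the standard textbook definitions (AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L), model selection amounts to scoring each candidate model and keeping the one with the lowest score. The function names, the candidate tuples, and the numbers below are hypothetical and are not taken from the chapter.

```python
import math


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return num_params * math.log(num_samples) - 2 * log_likelihood


def select_model(candidates, num_samples, criterion="bic"):
    """Return the name of the candidate with the lowest criterion score.

    `candidates` is a list of (name, log_likelihood, num_params) tuples.
    This tuple layout is an illustrative assumption, not the chapter's API.
    """
    def score(candidate):
        _, log_likelihood, num_params = candidate
        if criterion == "aic":
            return aic(log_likelihood, num_params)
        return bic(log_likelihood, num_params, num_samples)

    return min(candidates, key=score)[0]


# Hypothetical candidates, e.g. clusterings with different numbers of clusters.
candidates = [
    ("2 clusters", -120.0, 4),
    ("3 clusters", -110.0, 6),
    ("4 clusters", -109.0, 8),
]
best_by_aic = select_model(candidates, num_samples=4000, criterion="aic")
best_by_bic = select_model(candidates, num_samples=4000, criterion="bic")
```

Both criteria trade goodness of fit against model size. BIC's penalty grows with the sample count, so on large unlabeled sets it tends to favor smaller models than AIC does.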

Original language: English
Title of host publication: Studies in Big Data
Pages: 133-150
Number of pages: 18
State: Published - 2014

Publication series

Name: Studies in Big Data
Volume: 3
ISSN (Print): 2197-6503
ISSN (Electronic): 2197-6511

Keywords

  • Akaike Information Criterion
  • Bayesian Information Criterion
  • Latent Semantic Analysis
  • Suffix Tree
  • Unlabeled Data
