TY - CHAP
T1 - Scam Detection in Twitter
AU - Chen, Xiaoling
AU - Chandramouli, Rajarathnam
AU - Subbalakshmi, Koduvayur P.
N1 - Publisher Copyright: © 2014, Springer-Verlag Berlin Heidelberg.
PY - 2014
Y1 - 2014
N2 - Twitter is one of the fastest growing social networking services. This growth has led to an increase in Twitter scams (e.g., intentional deception). Relatively little effort has gone into identifying scams in Twitter. In this chapter, we propose a semi-supervised Twitter scam detector based on a small labeled data set. The scam detector combines self-learning and clustering analysis. A suffix tree data structure is used. Model building based on Akaike and Bayes Information Criteria is investigated and combined with the classification step. Our experiments show that 87% accuracy is achievable with only 9 labeled samples and 4000 unlabeled samples, among other results.
AB - Twitter is one of the fastest growing social networking services. This growth has led to an increase in Twitter scams (e.g., intentional deception). Relatively little effort has gone into identifying scams in Twitter. In this chapter, we propose a semi-supervised Twitter scam detector based on a small labeled data set. The scam detector combines self-learning and clustering analysis. A suffix tree data structure is used. Model building based on Akaike and Bayes Information Criteria is investigated and combined with the classification step. Our experiments show that 87% accuracy is achievable with only 9 labeled samples and 4000 unlabeled samples, among other results.
KW - Akaike Information Criterion
KW - Bayesian Information Criterion
KW - Latent Semantic Analysis
KW - Suffix Tree
KW - Unlabeled Data
UR - http://www.scopus.com/inward/record.url?scp=84943162287&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84943162287&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-45252-9_9
DO - 10.1007/978-3-642-45252-9_9
M3 - Chapter
AN - SCOPUS:84943162287
T3 - Studies in Big Data
SP - 133
EP - 150
BT - Studies in Big Data
ER -