TY - GEN
T1 - Authorship similarity detection from email messages
AU - Chen, Xiaoling
AU - Hao, Peng
AU - Chandramouli, R.
AU - Subbalakshmi, K. P.
PY - 2011
Y1 - 2011
AB - It is easy to hide the true identity of the author of an email. The author's actual name, email address, etc. can be changed arbitrarily to deceive an email receiver. For example, a sender can change his/her identity in the email header to send different emails to various recipients. Therefore, in this paper, we investigate techniques for authorship similarity detection from the text content of a short, topic-free email. We identify 150 stylistic cues for this problem and propose a method based on frequent patterns and machine learning. Extensive experimental results on the Enron email data set are also presented.
KW - Authorship similarity
KW - Enron email
KW - Frequent pattern
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=80052332124&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052332124&partnerID=8YFLogxK
DO - 10.1007/978-3-642-23199-5_28
M3 - Conference contribution
AN - SCOPUS:80052332124
SN - 9783642231988
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 375
EP - 386
BT - Machine Learning and Data Mining in Pattern Recognition - 7th International Conference, MLDM 2011, Proceedings
T2 - 7th International Conference on Machine Learning and Data Mining, MLDM 2011
Y2 - 30 August 2011 through 3 September 2011
ER -