TY - GEN
T1 - Learning multilingual topics from incomparable corpora
AU - Hao, Shudong
AU - Paul, Michael J.
N1 - Publisher Copyright:
© 2018 COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora. Most models require parallel or comparable training corpora, which limits their ability to generalize. In this paper, we first demystify the knowledge transfer mechanism behind multilingual topic models by defining an alternative but equivalent formulation. Based on this analysis, we then relax the assumption of training data required by most existing models, creating a model that only requires a dictionary for training. Experiments show that our new method effectively learns coherent multilingual topics from partially and fully incomparable corpora with limited amounts of dictionary resources.
AB - Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora. Most models require parallel or comparable training corpora, which limits their ability to generalize. In this paper, we first demystify the knowledge transfer mechanism behind multilingual topic models by defining an alternative but equivalent formulation. Based on this analysis, we then relax the assumption of training data required by most existing models, creating a model that only requires a dictionary for training. Experiments show that our new method effectively learns coherent multilingual topics from partially and fully incomparable corpora with limited amounts of dictionary resources.
UR - http://www.scopus.com/inward/record.url?scp=85080795914&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080795914&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85080795914
T3 - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
SP - 2595
EP - 2609
BT - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
A2 - Bender, Emily M.
A2 - Derczynski, Leon
A2 - Isabelle, Pierre
T2 - 27th International Conference on Computational Linguistics, COLING 2018
Y2 - 20 August 2018 through 26 August 2018
ER -