Recent years have seen growing interest in deep neural networks (DNNs) and representation learning, with applications to a myriad of NLP and data mining problems. The success of DNNs depends heavily on the availability of labeled data. However, obtaining labeled data is a major challenge in many real-world problems. In such cases, a DNN model can leverage labeled and unlabeled data from a related domain, but it has to deal with the shift in data distributions between the domains. In this paper, we study the problem of classifying social media posts during a crisis event (e.g., an earthquake). For that, we use labeled and unlabeled data from similar past events (e.g., a flood) and unlabeled data for the current event. We propose a novel model that performs adversarial learning-based domain adaptation to deal with distribution drifts and graph-based semi-supervised learning to leverage unlabeled data within a single unified deep learning framework. Our experiments with two real-world crisis datasets collected from Twitter demonstrate significant improvements over several baselines.
Domain Adaptation with Adversarial Training and Graph Embeddings
Firoj Alam, Shafiq Joty, and Muhammad Imran. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL'18), pages 1077–1087, 2018.