In this work, we present a method for classifying the quality of blog comments using Linear-Chain Conditional Random Fields (CRFs). This approach is found to yield high accuracy on binary classification of high-quality comments, with conversational features contributing strongly to the accuracy. We also present a new corpus of blog data in conversational form, complete with user-generated quality moderation labels from the science and technology news blog Slashdot.
Exploiting Conversational Features to Detect High-Quality Blog Comments
Nicholas FitzGerald, Giuseppe Carenini, Gabriel Murray, and Shafiq Joty. In Advances in Artificial Intelligence - 24th Canadian Conference on Artificial Intelligence Proceedings (CAI'11), pages 122-127, 2011.