We propose a novel approach for developing a two-stage document-level discourse parser. Our parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intrasentential parsing and the other for multisentential parsing. We present two approaches to combine these two stages of discourse parsing effectively. A set of empirical evaluations over two different datasets demonstrates that our discourse parser significantly outperforms the state-of-the-art, often by a wide margin.
Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
Shafiq Joty, Giuseppe Carenini, Raymond T. Ng, and Yashar Mehdad. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL'13), pages 486-496, 2013.