When a standard document search engine returns a ranked list of relevant documents, the user's search task is usually not over. The user still has to read through each document to judge its relevance and to find the precise piece of information sought. Query-relevant summarization tries to lift this burden from the end user by providing more condensed and direct access to relevant information. It is the task of synthesizing a fluent, well-organized summary of the document collection that answers the user's questions. We extracted several types of features (lexical, lexical semantic, statistical, and cosine similarity) for each sentence in the document collection in order to measure its relevance to the user query. We experimented with two well-known unsupervised statistical machine learning techniques, the K-means and EM algorithms, and evaluated their performance. For each of these summary generation methods, we show the effects of the different kinds of features.
Unsupervised Approach for Selecting Sentences in Query-based Summarization.
Yllias Chali* and Shafiq Joty*. In Proceedings of the Twenty-First International FLAIRS Conference (FLAIRS'08), pages 47-52, 2008.
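
The sentence-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only two toy features per sentence (bag-of-words cosine similarity to the query and the fraction of query terms covered), a two-cluster K-means with a deterministic initialization, and the rule that the cluster with the larger centroid is the query-relevant one. The EM variant and the lexical semantic features mentioned in the abstract are not shown, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def features(sentence, query):
    """Assumed toy feature vector: (cosine similarity, query-term coverage)."""
    s, q = tokenize(sentence), tokenize(query)
    coverage = len(set(s) & set(q)) / len(set(q)) if q else 0.0
    return (cosine(s, q), coverage)


def kmeans_two(points, iters=20):
    """Two-cluster K-means; centroids start at the lowest and highest point."""
    ordered = sorted(points, key=sum)
    centroids = [ordered[0], ordered[-1]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min((0, 1), key=lambda k: math.dist(p, centroids[k])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def select_sentences(sentences, query):
    """Return the sentences in the cluster whose centroid scores highest."""
    points = [features(s, query) for s in sentences]
    labels, centroids = kmeans_two(points)
    relevant = max((0, 1), key=lambda k: sum(centroids[k]))
    return [s for s, lab in zip(sentences, labels) if lab == relevant]
```

Calling `select_sentences(sentences, query)` returns the sentences assigned to the higher-scoring cluster, in their original order. A real system would still need to rank those sentences and cut the summary to a length limit.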