Features in Extractive Supervised Single-document Summarization: Case of Persian News
Text summarization has been one of the most challenging areas of research in NLP. Much effort has been made to overcome this challenge by using either the abstractive or extractive methods. Extractive methods are more popular, due to their simplicity compared with the more elaborate abstractive methods. In extractive approaches, the system will not generate sentences. Instead, it learns how to score sentences within the text by using some textual features and subsequently selecting those with the highest-rank. Therefore, the core objective is ranking and it highly depends on the document. This dependency has been unnoticed by many state-of-the-art solutions. In this work, the features of the document are integrated into vectors of every sentence. In this way, the system becomes informed about the context, increases the precision of the learned model and consequently produces comprehensive and brief summaries.
READ FULL TEXT