Authors: Luis Perez, Kevin Eskici
We tackle the problem of multi-document extractive summarization by implementing two well-known algorithms for single-document summarization, {\sc TextRank} and {\sc GrassHopper}. We use ROUGE-1 and ROUGE-2 precision scores on the DUC 2004 Task 2 data set to measure the performance of these two algorithms, with parameters set to the optimized values reported in their respective papers ($\alpha = 0.25$ and $\lambda = 0.5$ for {\sc GrassHopper} and $d = 0.85$ for {\sc TextRank}). We compare these adapted algorithms to common baselines as well as to novel, non-naive baselines, and we present the resulting ROUGE-1 and ROUGE-2 recall scores. We then implement two novel algorithms as extensions of {\sc GrassHopper} and {\sc TextRank}, termed {\sc ModifiedGrassHopper} and {\sc ModifiedTextRank}, respectively. The modified algorithms intuitively attempt to ``maximize'' diversity across the summary, and we present their ROUGE scores as well. We expect that with further optimization, this unsupervised approach to extractive text summarization will prove useful in practice.
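As a rough illustration of the kind of graph-based ranking the abstract refers to, the following is a minimal sketch of TextRank-style sentence scoring with damping factor $d = 0.85$. It is not the authors' implementation: the function names (`tokenize`, `similarity`, `textrank`, `summarize`) are hypothetical, and the similarity measure is a simplified word-overlap score computed over sets of distinct words, which is an assumption made here for brevity.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Shared distinct words, normalized by the log of each sentence's size.

    Assumption: sets of distinct words are used, a simplification chosen
    for this sketch rather than a detail taken from the paper.
    """
    wa, wb = set(tokenize(a)), set(tokenize(b))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # log(1) == 0 would make the denominator degenerate
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, d=0.85, iterations=100, tol=1e-8):
    """Return one score per sentence via weighted PageRank-style iteration."""
    n = len(sentences)
    if n == 0:
        return []
    # Symmetric edge weights between distinct sentences; no self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight of each node
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            # Weighted votes from every neighbor j that has outgoing weight.
            rank = sum(w[j][i] / out[j] * scores[j]
                       for j in range(n) if out[j] > 0)
            new.append((1 - d) / n + d * rank)
        delta = sum(abs(x - y) for x, y in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


def summarize(sentences, k=2, d=0.85):
    """Pick the k highest-scoring sentences, kept in their original order."""
    scores = textrank(sentences, d=d)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

For a multi-document setting, one simple option (again an assumption, not necessarily what the paper does) is to pool the sentences of all documents in a cluster into a single list before calling `summarize`. A sentence that shares no words with any other receives only the baseline score $(1 - d)/n$.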
Comments: 24 Pages.
[v1] 2017-12-16 00:38:28