https://arxiv.org/pdf/1611.04244v1.pdf Classify or Select: Neural Architectures for Extractive Document Summarization

We present two novel and contrasting Recurrent Neural Network (RNN) based architectures for extractive summarization of documents. The Classifier-based architecture sequentially accepts or rejects each sentence, in the original document order, for membership in the final summary. The Selector architecture, on the other hand, is free to pick one sentence at a time in any arbitrary order to piece together the summary. Our models under both architectures jointly capture the notions of salience and redundancy of sentences. In addition, these models are highly interpretable, since they allow visualization of their predictions broken up by abstract features such as information content, salience and redundancy. We show that our models reach or outperform state-of-the-art supervised models on two different corpora. Based on experimental evidence, we also identify the conditions under which one architecture is superior to the other.
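
A minimal sketch of the two decoding styles, assuming a PyTorch-style setup with illustrative layer names and sizes (not the authors' code): the Classifier scores each sentence in document order, while the Selector repeatedly conditions on what it has already picked and chooses the next sentence from those that remain.

```python
# Illustrative PyTorch sketch of the two extractive decoders (not the authors' code).
import torch
import torch.nn as nn

class ClassifierExtractor(nn.Module):
    """Visit sentences in document order and accept/reject each one."""
    def __init__(self, sent_dim=256, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(sent_dim, hidden, batch_first=True)
        self.accept = nn.Linear(hidden, 1)

    def forward(self, sent_embs):                   # (batch, n_sents, sent_dim)
        states, _ = self.rnn(sent_embs)             # contextual sentence states
        return torch.sigmoid(self.accept(states)).squeeze(-1)  # P(include sentence)

class SelectorExtractor(nn.Module):
    """Pick one sentence at a time, in any order, up to a fixed budget."""
    def __init__(self, sent_dim=256, hidden=256, budget=3):
        super().__init__()
        self.cell = nn.GRUCell(sent_dim, hidden)    # tracks what was already picked
        self.score = nn.Linear(hidden + sent_dim, 1)
        self.budget = budget

    def forward(self, sent_embs):                   # (batch, n_sents, sent_dim)
        b, n, _ = sent_embs.shape
        h = sent_embs.new_zeros(b, self.cell.hidden_size)
        mask = sent_embs.new_zeros(b, n)            # 1 = sentence already selected
        picked = []
        for _ in range(self.budget):
            feats = torch.cat([h.unsqueeze(1).expand(b, n, -1), sent_embs], dim=-1)
            scores = self.score(feats).squeeze(-1).masked_fill(mask.bool(), float("-inf"))
            idx = scores.argmax(dim=-1)             # greedy pick (illustration only)
            picked.append(idx)
            mask = mask.scatter(1, idx.unsqueeze(1), 1.0)
            h = self.cell(sent_embs[torch.arange(b), idx], h)  # condition on history
        return torch.stack(picked, dim=1)           # indices of selected sentences
```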

http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html

https://arxiv.org/abs/1704.04530v1 Neural Extractive Summarization with Side Information

Most extractive summarization methods focus on the main body of the document from which sentences are to be extracted. However, the gist of the document often lies in its side information, such as the title and image captions, which is readily available for newswire articles. We propose to explore side information in the context of single-document extractive summarization. We develop a framework composed of a hierarchical document encoder and an attention-based extractor that attends over the side information. We evaluate our models on a large-scale news dataset and show that extractive summarization with side information consistently outperforms its counterpart that does not use any side information, in terms of both informativeness and fluency.
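
One way to picture the framework, as a hedged PyTorch-style sketch (dimensions and module names are assumptions, not from the paper): a word-level RNN produces sentence vectors, a sentence-level RNN contextualizes them, and each sentence attends over embeddings of the side information (title, captions) before the extraction decision is made.

```python
# Hedged PyTorch sketch: hierarchical encoder plus attention over side-information
# embeddings (e.g. title and captions). Sizes and names are illustrative.
import torch
import torch.nn as nn

class SideAwareExtractor(nn.Module):
    def __init__(self, emb=128, hidden=256):
        super().__init__()
        self.word_rnn = nn.GRU(emb, hidden // 2, batch_first=True, bidirectional=True)
        self.sent_rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, hidden, bias=False)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, word_embs, side_embs):
        # word_embs: (batch, n_sents, n_words, emb); side_embs: (batch, n_side, hidden)
        b, s, w, e = word_embs.shape
        _, h_n = self.word_rnn(word_embs.reshape(b * s, w, e))   # word-level encoding
        sent_vecs = h_n.transpose(0, 1).reshape(b, s, -1)        # one vector per sentence
        doc_states, _ = self.sent_rnn(sent_vecs)                 # sentence-level encoding
        scores = torch.bmm(self.attn(doc_states), side_embs.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)                  # attention over side info
        side_ctx = torch.bmm(weights, side_embs)                 # per-sentence side context
        logits = self.out(torch.cat([doc_states, side_ctx], dim=-1))
        return torch.sigmoid(logits).squeeze(-1)                 # P(extract sentence)
```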

https://arxiv.org/abs/1704.04368v1 Get To The Point: Summarization with Pointer-Generator Networks

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition.
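
The two ingredients reduce to a couple of small formulas; a hedged sketch of them (tensor names are illustrative, not the paper's code):

```python
# Hedged sketch of the pointer-generator mixture and the coverage penalty
# (tensor names are illustrative; this is not the paper's code).
import torch

def final_distribution(vocab_dist, attn_dist, p_gen, src_ids):
    """P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on copies of w."""
    # vocab_dist: (batch, vocab), attn_dist: (batch, src_len),
    # p_gen: (batch, 1), src_ids: (batch, src_len) source-token vocabulary ids
    gen = p_gen * vocab_dist
    copy = (1.0 - p_gen) * attn_dist
    return gen.scatter_add(1, src_ids, copy)     # route copy mass onto vocabulary slots

def coverage_loss(attn_dist, coverage):
    """Penalize attending again to already-covered source positions."""
    # coverage: running sum of all previous attention distributions (batch, src_len)
    return torch.sum(torch.min(attn_dist, coverage), dim=1)

# per decoder step: loss += coverage_loss(attn_dist, coverage); coverage += attn_dist
```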

https://arxiv.org/abs/1704.05550 Extractive Summarization: Limits, Compression, Generalized Model and Heuristics

https://arxiv.org/abs/1704.06877v1 Learning to Skim Text

https://arxiv.org/pdf/1705.04304v1.pdf A Deep Reinforced Model for Abstractive Summarization

We introduce a neural network model with intra-attention and a new training method that combines standard supervised word prediction with reinforcement learning (RL). Models trained only with supervised word prediction often exhibit "exposure bias": they assume the ground truth is provided at each step during training, yet at test time they must condition on their own previous predictions. When standard word prediction is combined with the global sequence prediction training of RL, the resulting summaries become more readable.
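
The combined objective is typically written as L = gamma * L_rl + (1 - gamma) * L_ml with a self-critical, ROUGE-based reward; a hedged sketch of that mixture (argument names and the reward plumbing are assumptions, not the authors' code):

```python
# Hedged sketch of the mixed objective L = gamma * L_rl + (1 - gamma) * L_ml with a
# self-critical reward (e.g. ROUGE of a sampled summary vs. the greedy baseline).
import torch

def mixed_loss(sum_log_probs, sampled_reward, baseline_reward, ml_loss, gamma=0.9):
    # sum_log_probs: summed log-probabilities of the sampled summary tokens, (batch,)
    # sampled_reward / baseline_reward: sequence-level rewards, (batch,)
    # ml_loss: teacher-forced negative log-likelihood, (batch,)
    rl_loss = (baseline_reward - sampled_reward) * sum_log_probs   # policy-gradient term
    return (gamma * rl_loss + (1.0 - gamma) * ml_loss).mean()
```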

https://arxiv.org/abs/1706.01678 Text Summarization using Abstract Meaning Representation

In this work we develop a full-fledged pipeline to generate summaries of news articles using the Abstract Meaning Representation (AMR). We first generate the AMR graphs of the stories, then extract summary graphs from the story graphs, and finally generate sentences from the summary graph. Extracting summary AMRs from the story AMRs is a two-step process: we first find the important sentences in the text and then extract the summary AMRs from those selected sentences.

https://einstein.ai/research/your-tldr-by-an-ai-a-deep-reinforced-model-for-abstractive-summarization

https://arxiv.org/abs/1708.02977v1 Hierarchically-Attentive RNN for Album Summarization and Storytelling

We address the problem of end-to-end visual storytelling. Given a photo album, our model first selects the most representative (summary) photos, and then composes a natural language story for the album. For this task, we make use of the Visual Storytelling dataset and a model composed of three hierarchically-attentive Recurrent Neural Nets (RNNs) to: encode the album photos, select representative (summary) photos, and compose the story. Automatic and human evaluations show our model achieves better performance on selection, generation, and retrieval than baselines.
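
As a rough, hedged sketch of the three-level idea (PyTorch-style, toy sizes, deliberately simplified decoder; not the authors' model):

```python
# Rough PyTorch-style sketch of the three-level idea (toy sizes, deliberately
# simplified decoder; not the authors' model).
import torch
import torch.nn as nn

class AlbumStoryteller(nn.Module):
    def __init__(self, feat=512, hidden=512, vocab=10000):
        super().__init__()
        self.photo_rnn = nn.GRU(feat, hidden, batch_first=True)   # encode the photo sequence
        self.select = nn.Linear(hidden, 1)                        # summary-photo scores
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)   # story decoder (toy)
        self.vocab_out = nn.Linear(hidden, vocab)

    def forward(self, photo_feats):                  # (batch, n_photos, feat)
        states, _ = self.photo_rnn(photo_feats)
        select_probs = torch.sigmoid(self.select(states)).squeeze(-1)
        summary = states * select_probs.unsqueeze(-1)             # soft photo selection
        out, _ = self.decoder(summary)               # a real model unrolls a word decoder
        return select_probs, self.vocab_out(out)     # photo scores, per-step word logits
```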

https://arxiv.org/abs/1801.10198 Generating Wikipedia by Summarizing Long Sequences

https://arxiv.org/abs/1804.05685v1 A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents

https://arxiv.org/abs/1808.10792 Bottom-Up Abstractive Summarization

https://arxiv.org/abs/1810.05739 Unsupervised Neural Multi-document Abstractive Summarization

https://arxiv.org/pdf/1811.01824.pdf Structured Neural Summarization

Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data such as text. In an extensive evaluation, we show that the resulting hybrid sequence-graph models outperform both pure sequence models as well as pure graph models on a range of summarization tasks.

We presented a framework for extending sequence encoders with a graph component that can leverage rich additional structure. In an evaluation on three different summarization tasks, we have shown that this augmentation improves the performance of a range of different sequence models across all tasks.
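
A minimal sketch of the general idea under assumed names and sizes (not the authors' framework): run an ordinary sequence encoder over the tokens, then perform one round of GGNN-style message passing along an explicit adjacency matrix over the same tokens, so each state reflects both the sequence and the graph view.

```python
# Minimal sketch under assumed names/sizes: a GRU sequence encoder followed by one
# round of GGNN-style message passing along a given adjacency matrix.
import torch
import torch.nn as nn

class SequenceGraphEncoder(nn.Module):
    def __init__(self, emb=128, hidden=128):
        super().__init__()
        self.seq = nn.GRU(emb, hidden, batch_first=True)
        self.msg = nn.Linear(hidden, hidden)         # transform neighbour states
        self.update = nn.GRUCell(hidden, hidden)     # gated node update

    def forward(self, token_embs, adj):
        # token_embs: (batch, n, emb); adj: (batch, n, n) edge weights/indicators
        h, _ = self.seq(token_embs)                  # sequence view of the text
        messages = torch.bmm(adj, self.msg(h))       # aggregate messages from neighbours
        b, n, d = h.shape
        h_graph = self.update(messages.reshape(b * n, d), h.reshape(b * n, d))
        return h_graph.view(b, n, d)                 # hybrid sequence + graph states
```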