Text Summarization In NLP


What Is Text Summarization

Advances in NLP have produced machine learning techniques such as text summarization, which automatically shortens long texts and produces summaries of individual sections without losing the core message. In a society constrained by time and focused on efficiency, text summarization is a practical way to process information: it reduces reading time and makes it easier to research, process, and digest the right information.

In the modern business landscape, data is a commodity, and its value in the 21st century is often compared to that of oil in the 20th. Businesses across industries analyze vast amounts of data to inform their decisions. Research predicts that by 2025 the total amount of data globally will exceed 180 zettabytes, a meteoric rise from the projected 4.4 zettabytes of data circulating the world in 2013.

The vast amount of data available represents a tremendous opportunity, especially in tapping into unstructured data. Natural Language Processing (NLP) is a subset of Artificial Intelligence (AI) that enables computers to interpret and analyze human language in efficient and useful ways. Its goal is to give machines something close to a human-level understanding of language. While tools such as content analytics platforms can understand the context of language, a computer cannot yet “read between the lines” or intuit language the way a human does.

Text Summarization Examples

NLP uses machine learning to extract context from human language. In NLP, text summarization computationally shortens a body of text to a subset containing its most meaningful information. It works in two different ways:

Extraction-Based Summarization

This summarization technique operates by extracting keywords from the document and combining them into a summary. The extracted text remains unchanged, and the keywords are selected according to predefined metrics.

Extraction-Based Examples

Source text

Jill and Jack went to a farm to pick roses in bloom in Vermont. Jill picked a yellow rose to give to Jane.

Extractive Summary

Jill and Jack pick roses in Vermont. Jill gives it to Jane.

The summary is built largely from words taken directly from the source text. Because keywords are extracted rather than rewritten, the summary may not always be grammatically accurate.
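To make the idea of a "predefined metric" concrete, here is a minimal Python sketch of one common way to do extraction. It is an illustration under stated assumptions, not the method any particular product uses: it scores whole sentences (rather than individual keywords) by the average frequency of their non-trivial words and keeps the top-scoring ones verbatim. The function names and the tiny stop-word list are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "to", "in", "of", "is", "it",
              "for", "on", "with", "was", "were"}


def split_sentences(text):
    """Naive split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def extractive_summary(text, max_sentences=1):
    """Return the highest-scoring sentences, unchanged and in source order."""
    sentences = split_sentences(text)
    # The assumed metric: how often each keyword occurs in the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average keyword frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    source = ("Jill and Jack went to a farm to pick roses in bloom in Vermont. "
              "Jill picked a yellow rose to give to Jane.")
    print(extractive_summary(source, max_sentences=1))
```

Because the selected sentences are copied as-is, every sentence in the output appears word for word in the source, which is the defining property of extraction. A keyword-level variant like the Jill and Jack example above would copy individual words instead of whole sentences, which is why its output can read less smoothly.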

Abstraction-Based Summarization

This summarization technique works by paraphrasing and shortening parts of the source document. Because it generates new text instead of stitching together extracted keywords, abstraction does not suffer from the same grammatical inconsistencies, and its summaries read more like those a human would write. It creates new phrases and sentences containing only the most relevant information from the text.

Abstraction-Based Summarization Examples

Abstractive Summary

Jack and Jill went to a farm in Vermont to pick roses, and Jill picked one for Jane.

Executive Summary

Jill picked a rose for Jane at a farm in Vermont.


Abstraction-based summarization is better at producing grammatical output and capturing the context of the text. However, abstraction-based summarization algorithms are harder to develop, so extraction-based summarization remains the most widely used method. As NLP technology progresses in understanding the nuances of language, text summarization algorithms will improve with it, making information more accessible and meaningful to businesses.
