Zero-Shot Text Classification Made Easy

No data? No problem! Zero-shot text classification models to the rescue.

In August of 2019, a team at the University of Pennsylvania proposed a way to use pre-trained natural language inference (NLI) models as zero-shot text classification models [1]. Facebook AI then fine-tuned and released a bart-large model optimized for this task, which we will be using in this tutorial [2].

Goal

We'll be completing a mini-project to demonstrate zero-shot text classification models. For this project, we will be classifying text into one of two categories: text about travel and text about work. We'll be accomplishing this without using any training data.

Installation

First off, we're going to pip install Hugging Face's transformers library.

pip install transformers

Instantiation  

We'll be using Hugging Face's pipeline function to create our classifier. It requires two inputs: task and model. The task parameter is a string specifying the kind of task we'll be performing; a list of potential tasks can be found here. For our purposes, we'll use the string "zero-shot-classification". The model parameter specifies which zero-shot model we wish to use; a list of potential models can be found here. We'll be using a model called "facebook/bart-large-mnli", which is, as of writing, the most downloaded zero-shot classification model.

from transformers import pipeline

task = "zero-shot-classification"
model = "facebook/bart-large-mnli"
classifier = pipeline(task, model)

Usage

Let's first create two test cases that each contain text that's indicative of one of the two categories.

text_travel = "Let's book a flight to the Bali"
text_work = "I have to code late tonight on the new software update"

Now, we'll create a list of the categories the text can be classified into. Feel free to append more categories to the list.

labels = ["travel", "work"]

From here, we can begin performing text classification. Call the classifier with the text to be classified as the first argument and the list of labels as the second.

result_travel = classifier(text_travel, labels)
print(result_travel)

Output: {'sequence': "Let's book a flight to the Bali", 'labels': ['travel', 'work'], 'scores': [0.9847909808158875, 0.01520907785743475]}

The classifier returns a dictionary with three keys:

sequence: the text that was classified

labels: the provided labels, sorted from highest to lowest score

scores: the corresponding scores in descending order
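Since labels and scores are parallel lists, they can be zipped together to iterate over the ranked predictions. Here's a minimal sketch using the sample output from above:

```python
# Sample output copied from the travel example above
result = {
    "sequence": "Let's book a flight to the Bali",
    "labels": ["travel", "work"],
    "scores": [0.9847909808158875, 0.01520907785743475],
}

# labels and scores are parallel lists sorted by descending score,
# so zip pairs each label with its score
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.4f}")
# travel: 0.9848
# work: 0.0152
```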

The code below shows how to extract the label with the highest score, along with its corresponding score.

result_travel = classifier(text_travel, labels)
print(result_travel["labels"][0])
print(result_travel["scores"][0])

Output:

travel

0.9847909808158875

Let's repeat this process for the other category.

result_work = classifier(text_work, labels)
print(result_work["labels"][0])
print(result_work["scores"][0])

Output:

work

0.9939967393875122
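The zero-shot pipeline also accepts a couple of optional parameters worth knowing. Setting multi_label=True scores each label independently, so the scores no longer sum to one, which is useful when a text could belong to several categories at once. The hypothesis_template parameter changes the sentence each label is slotted into before being compared against the text. A minimal sketch (the example text and the "about" template are my own choices):

```python
from transformers import pipeline

# Recreate the classifier from earlier in the tutorial
classifier = pipeline("zero-shot-classification", "facebook/bart-large-mnli")
labels = ["travel", "work"]

# multi_label=True scores each label independently against the text,
# so a sentence touching both topics can score high on both labels.
# hypothesis_template controls the premise each label is inserted into.
text_both = "I'm flying to a conference to demo our new software"
result_both = classifier(
    text_both,
    labels,
    multi_label=True,
    hypothesis_template="This example is about {}.",
)
print(result_both["labels"])
print(result_both["scores"])
```

Because each label is scored on its own, don't expect the scores to sum to 1 in this mode.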

Conclusion

With pre-trained zero-shot text classification models, you can classify text into an arbitrary list of categories. This technology paves the way for implementing text classification without any training data. It also eliminates the need for data preprocessing and training, which can simplify AI pipelines.

If you're interested in implementing normal text classification models, then I suggest you check out Happy Transformer. Happy Transformer allows you to implement and train text classification models with just a few lines of code.

Be sure to sign up for Vennify AI's newsletter for more articles like this!

YouTube

Subscribe to us on YouTube for new videos on NLP.

Colab

Code from this tutorial: https://colab.research.google.com/drive/1qBpngUcmMBCRb9f_KvBTGuAgOLmFX9rh?usp=sharing

References

[1] https://arxiv.org/abs/1909.00161

[2] https://huggingface.co/facebook/bart-large-mnli