Aug 2, 2021 | NLP

Styleformer: Convert Casual Text to Formal Text and Vice Versa

How to leverage a powerful T5 Transformer model to change the formality of text

Styleformer is a brand new Python library that allows you to change the style of text using a powerful Transformer model called T5. This tutorial focuses on its ability to convert casual text to formal text and vice versa. Changing the formality of text has many applications. For example, perhaps you wish to increase the formality of a work email – just run Styleformer with only a few lines of code.

Installation

Styleformer is not on PyPI, but you can install it directly from GitHub with the following command.

pip install git+https://github.com/PrithivirajDamodaran/Styleformer.git

Casual to Formal

Now, let's import a class called Styleformer which we'll use to load models.

from styleformer import Styleformer

Let’s load the model that converts casual text to formal text. This model’s style ID is 0, so when instantiating a Styleformer object, we’ll set its “style” parameter to 0.

styleformer_c_t_f = Styleformer(style=0)

From here we can immediately start converting text by calling styleformer_c_t_f.transfer(). We'll provide the text we wish to convert as the first and only positional argument. Then, we'll set the "inference_on" parameter to 0 to use a CPU or 1 to use a GPU.

result_1 = styleformer_c_t_f.transfer("Yo, I love coding in Python. ", inference_on=1)

print(result_1)

Result: I enjoy coding in Python.
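If you're not sure whether the machine has a GPU, you can choose the "inference_on" value at runtime instead of hard-coding it. The sketch below is only an illustration: pick_inference_on is a hypothetical helper, not part of Styleformer, and it assumes the 0 = CPU and 1 = GPU convention described above, which may differ in other releases of the package.

```python
def pick_inference_on() -> int:
    """Return 1 (GPU) if PyTorch reports a CUDA device, otherwise 0 (CPU).

    Hypothetical helper. The 0/1 values follow the convention used in this
    tutorial; check the README of the Styleformer version you installed.
    """
    try:
        import torch  # installed as a Styleformer dependency
    except ImportError:
        # Without PyTorch there is no GPU check to run, so default to CPU.
        return 0
    return 1 if torch.cuda.is_available() else 0


inference_on = pick_inference_on()
print(inference_on)
```

The returned value can then be passed along, for example styleformer_c_t_f.transfer(text, inference_on=inference_on).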

Here's another example.

result_2 = styleformer_c_t_f.transfer("I'm going to go buy some stuff like apples at the store  ", inference_on=1)

print(result_2)

Result: I am going to buy some items, such as apples, from the store.

Not too bad! Now, let's discuss the inverse – converting formal text to casual text.

Formal to Casual

Let's load the model to convert formal text to casual text. The style ID for this model is 1.

styleformer_f_t_c = Styleformer(style=1)

We can now call the "transfer()" method as before.

result_3 = styleformer_f_t_c.transfer("Let's discuss our plans for this evening", inference_on=1)

print(result_3)

Result: Lets talk about what we'll be doing tonight

Multiple Sentences

Although it's not stated in the documentation, I believe that the package is only intended to be used on one sentence at a time. After reviewing the source code, I noticed that "early_stopping" is enabled, which means that the model stops producing text when the "end of sentence" token is reached. In addition, the internal settings cap the output at 32 tokens per call, where a token is typically a word, a piece of a word, or a symbol. Below is a quick overview of how you could break an input down by sentence using a Python package called TextBlob.

TextBlob is available on PyPI, and we can install it with a simple pip command.

pip install textblob

Now, let's import a class called TextBlob that we'll use to break a string down by sentence.

from textblob import TextBlob

Let’s create a TextBlob object by providing a string to the TextBlob class. Notice how the string contains multiple sentences. We’ll break the text down by sentence and then convert it from casual to formal using the styleformer_c_t_f object.

text = "Hey man, what's up? We should hang out and watch the Olympics. Then maybe go grab some food"

blob = TextBlob(text)

TextBlob is built on top of a Python framework called NLTK. Since NLTK is a dependency of TextBlob, it was installed along with it, so we do not need to install it separately. However, some NLTK resources must be downloaded before TextBlob can use them. Let's download a tokenizer called 'punkt' from NLTK.

import nltk

nltk.download('punkt')

We now have everything we need to start tokenizing the string by sentence.

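result_4_list = []

for sent in blob.sentences:
    # Convert one sentence at a time
    temp_result = styleformer_c_t_f.transfer(sent.string)
    # Keep the original sentence if transfer() returns nothing
    result_4_list.append(temp_result if temp_result else sent.string)

result_4 = " ".join(result_4_list)
print(result_4)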

Result: Hello, what are you doing? I recommend that we spend time together and watch the Olympics. Afterwards, perhaps grab some food.
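The sentence-by-sentence loop can also be wrapped in a small reusable function. The sketch below is one possible way to do that, not part of Styleformer or TextBlob: transfer_by_sentence and split_sentences are hypothetical names, the default splitter is a naive regular expression standing in for TextBlob, and fake_transfer is a stand-in so the example runs without downloading a model.

```python
import re
from typing import Callable, List, Optional


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def transfer_by_sentence(
    text: str,
    transfer: Callable[[str], Optional[str]],
    split: Callable[[str], List[str]] = split_sentences,
) -> str:
    """Apply `transfer` to each sentence and join the results with spaces.

    If `transfer` returns None or an empty string for a sentence, the
    original sentence is kept instead.
    """
    results = []
    for sentence in split(text):
        converted = transfer(sentence)
        results.append(converted if converted else sentence)
    return " ".join(results)


# Stand-in for a real model call so the example runs offline (hypothetical).
def fake_transfer(sentence: str) -> Optional[str]:
    return None if "Olympics" in sentence else sentence.upper()


text = "Hey man, what's up? We should watch the Olympics. Then grab food"
print(transfer_by_sentence(text, fake_transfer))
# prints: HEY MAN, WHAT'S UP? We should watch the Olympics. THEN GRAB FOOD
```

With a real model, the second argument would be something like lambda s: styleformer_c_t_f.transfer(s, inference_on=0), and the splitter could be swapped for one based on TextBlob.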

Styleformer With Happy Transformer

You may want to use a more mature package like Hugging Face's Transformers library or my very own Happy Transformer package to run the model. Unlike Styleformer, both of these packages are available on PyPI and allow you to modify the text generation settings.

There are several reasons why you may want to modify the text generation settings. For example, by default, the Styleformer package generates at most 32 tokens per inference, and there is no way to adjust this. In addition, by modifying the settings, you can change how often less likely words are selected, which allows you to alter the "creativity" of the text.

Below is a GitHub gist that demonstrates how to use the Styleformer model with Happy Transformer. Perhaps I'll write a full article on how to use Styleformer with Happy Transformer, but for now, the gist below should suffice for most people.
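For reference, here is a minimal sketch of how the casual-to-formal model could be run through Happy Transformer. It is not the gist itself. The model name and the prompt prefix are assumptions based on my reading of the Styleformer source, so verify them before relying on them. The function names build_prompt, load_model, and formalize are made up for this example, and the sampling values are only illustrative.

```python
# Assumed values taken from a reading of the Styleformer source; verify them.
MODEL_NAME = "prithivida/informal_to_formal_styletransfer"
PREFIX = "transfer Casual to Formal: "


def build_prompt(text: str) -> str:
    """Prepend the task prefix the model is assumed to expect."""
    return PREFIX + text.strip()


def load_model():
    """Load the model with Happy Transformer (downloads it on first use)."""
    # Imported lazily so the rest of this file works without the package.
    from happytransformer import HappyTextToText

    return HappyTextToText("T5", MODEL_NAME)


def formalize(happy_tt, text: str, max_length: int = 128) -> str:
    """Generate a formal version of `text` with adjustable settings."""
    from happytransformer import TTSettings

    # Unlike the Styleformer package, the length limit is ours to choose.
    settings = TTSettings(
        do_sample=True, top_k=50, top_p=0.95, max_length=max_length
    )
    result = happy_tt.generate_text(build_prompt(text), args=settings)
    return result.text


print(build_prompt("Yo, I love coding in Python. "))
# prints: transfer Casual to Formal: Yo, I love coding in Python.
```

To run the model, call happy_tt = load_model() once and then formalize(happy_tt, "Yo, I love coding in Python."). Note that this sketch returns a single sampled output and does not reproduce whatever candidate filtering Styleformer applies internally.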

You can also check out this article, which provides an in-depth explanation of how to use Happy Transformer with a similar model created by the author of Styleformer. This model is called "Gramformer" and corrects the grammar of input text.

Conclusion

I hope you learned a lot! I'm looking forward to reading about future applications of this model. Be sure to use Styleformer before emailing your boss, and stay happy everyone!