Aug 17, 2022

Keywords to Text With GPT-Neo

Learn how to produce original text from a set of keywords with GPT-Neo by using prompt engineering

In this article we'll discuss a useful application of text generation Transformer models like GPT-3, GPT-J, and GPT-Neo. These models are able to produce original text from a set of keywords by leveraging a technique called prompt engineering. To perform prompt engineering, we must craft an input to help guide the model to perform a specific task. We'll use my very own Happy Transformer library, which is built on top of Hugging Face's Transformers library to simplify training and using Transformer models.

Setup

First off, we need to install Happy Transformer.

pip install happytransformer

Now, we'll import a class called HappyGeneration to load the model and a class called GENSettings to set the inference settings.

from happytransformer import HappyGeneration, GENSettings

We'll load the largest GPT-Neo model a Google Colab Pro instance can handle, which is called "EleutherAI/gpt-neo-1.3B". To load the model, we'll provide the model type ("GPT-NEO") as the first positional parameter of the HappyGeneration class and the model name as the second.

happy_gen = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-1.3B")

Prompt Engineering

Let's now perform prompt engineering to design a text input. The prompt can be broken down into two components: the training cases and the current case. The training cases are example input/output pairs that show the model the task. The model's weights are not updated; it only sees the examples as part of its input. The current case contains the keywords we wish to produce a prediction for. It is important that all of the training cases and the current case are structured the same way. After completing the prompt, we can provide it to the model, which will attempt to continue the text and should therefore produce an output based on the keywords of the current case.

Let's start by defining the training cases.

training_cases = """Keywords: Canada, AI, fast
Output: Canada's AI industry is growing fast. 
###
Keywords: purchase, desk, adjustable
Output: I just purchased a new height adjustable desk. 
###
Keywords: museum, art, painting, Ottawa
Output: I went to an art museum in Ottawa and saw some beautiful paintings. I'm excited to revisit. 
###
Keywords: exam, success, study
Output: My first exam was a success! I think I aced it because of your help with studying. 
###"""

Notice how each case starts with "Keywords:" followed by the keywords, and then a newline followed by "Output:" where the result follows. Each case also ends with a newline followed by three pound symbols.
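The same structure can also be generated from a list of pairs, which helps keep every case formatted identically. The sketch below is only an illustration: `format_case` and `build_training_cases` are hypothetical helper names and are not part of Happy Transformer.

```python
# Hypothetical helpers (not part of Happy Transformer) that build the
# training-case string from (keywords, output) pairs.

SEPARATOR = "###"  # the delimiter that ends each case in this article


def format_case(keywords, output):
    """Format one case as a 'Keywords:' line, an 'Output:' line, and '###'."""
    return f"Keywords: {', '.join(keywords)}\nOutput: {output}\n{SEPARATOR}"


def build_training_cases(pairs):
    """Join the formatted cases with newlines so they all share one structure."""
    return "\n".join(format_case(keywords, output) for keywords, output in pairs)


pairs = [
    (["Canada", "AI", "fast"], "Canada's AI industry is growing fast."),
    (["purchase", "desk", "adjustable"], "I just purchased a new height adjustable desk."),
]

training_cases_sketch = build_training_cases(pairs)
print(training_cases_sketch)
```

Adding or removing an example then only means editing the list of pairs, and the delimiter cannot be forgotten.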

We'll define some keywords by adding them to a list of strings.

keywords = ["dog", "cat", "play"]

Let's create a function that, given the training cases and the keywords, generates a prompt we can provide directly to the model.

def create_prompt(training_cases, keywords):
  keywords_string = ", ".join(keywords)
  prompt = training_cases + "\nKeywords: "+ keywords_string + "\nOutput:"
  return prompt

Let's create a prompt for our model.

prompt = create_prompt(training_cases, keywords)
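Before calling the model, it can help to print the prompt and confirm that the current case is formatted like the training cases and ends with "Output:". The snippet below repeats create_prompt so that it runs on its own, and uses a single shortened training case (`demo_cases`, an illustrative name) rather than the full set.

```python
def create_prompt(training_cases, keywords):
    # Same logic as the function defined above.
    keywords_string = ", ".join(keywords)
    return training_cases + "\nKeywords: " + keywords_string + "\nOutput:"


# One shortened training case, used here only to keep the printout small.
demo_cases = "Keywords: Canada, AI, fast\nOutput: Canada's AI industry is growing fast.\n###"

demo_prompt = create_prompt(demo_cases, ["dog", "cat", "play"])
print(demo_prompt)
# Keywords: Canada, AI, fast
# Output: Canada's AI industry is growing fast.
# ###
# Keywords: dog, cat, play
# Output:
```

The prompt stops right after "Output:", so the text the model generates next is its prediction for the current case.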

Inference

We can now start generating text with our model. We'll first use a text generation algorithm called beam search, a deterministic algorithm that produces the same result each time we run it with the same prompt and settings.
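To give a feel for how beam search works, here is a toy sketch. It is not the implementation used by Transformers, which also handles settings such as no_repeat_ngram_size and early_stopping. `TOY_MODEL` is a made-up table of next-token probabilities, assumed purely for illustration.

```python
import math

# Made-up next-token probabilities, keyed by the previous token.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"dog": 0.5, "cat": 0.5},
    "a": {"dog": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}


def beam_search(num_beams, max_steps=3):
    """Keep the num_beams highest-scoring sequences at every step."""
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens)
    for _ in range(max_steps):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "</s>":  # finished sequences carry over unchanged
                candidates.append((score, tokens))
                continue
            for token, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Sort by score, breaking ties by token order, so nothing is random.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        beams = candidates[:num_beams]
    return beams[0][1]


print(beam_search(num_beams=1))  # ['<s>', 'the', 'cat', '</s>']
print(beam_search(num_beams=2))  # ['<s>', 'a', 'dog', '</s>']
```

With a single beam the search commits to "the" because it is the most likely first token, and ends with a sequence of probability 0.3. With two beams it also keeps "a" alive and finds "a dog", whose probability is 0.36. No step involves randomness, which is why the result is repeatable.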

args_beam = GENSettings(num_beams=5, no_repeat_ngram_size=3, early_stopping=True, min_length=1, max_length=50)

We can now produce a prediction by providing the prompt and the settings to happy_gen's generate_text method.

result = happy_gen.generate_text(prompt, args=args_beam)
print(result)

Result:

GenerationResult(text=' We had a dog and a cat for a few days. They were very playful and loved to play with each other. They also loved to chase each other around the house. I think they were a lot of fun to have around. \n')

The output is a dataclass with a single field called text. We can isolate this field with the code below.

print(result.text)

Result:

We had a dog and a cat for a few days. They were very playful and loved to play with each other. They also loved to chase each other around the house. I think they were a lot of fun to have around.

Pretty good!
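Because the model simply continues the prompt, a longer generation may run past the answer and begin a new "###" case. A small post-processing step can keep only the text before the first delimiter. This is a sketch: `extract_output` is a hypothetical helper, and the `raw` string is a made-up example rather than real model output.

```python
def extract_output(generated_text, separator="###"):
    """Return the text before the first separator, stripped of whitespace."""
    return generated_text.split(separator, 1)[0].strip()


# Made-up generation that runs into a new case (not real model output).
raw = " The dog and the cat like to play.\n###\nKeywords: tea, morning\nOutput:"

print(extract_output(raw))  # The dog and the cat like to play.
```

With the real model, the same helper could be applied to result.text.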

Since we used a deterministic algorithm, the result will be the same each time we rerun the model. We could instead use a non-deterministic algorithm like top-k sampling so that each time we call the model a new result is produced. We can also modify some of the parameters for this algorithm to make it more or less "creative."

We'll define new settings using GENSettings to implement top-k sampling. There are two main parameters we are concerned with: "top_k" and "temperature". The top_k value specifies how many of the most probable candidate tokens are considered at each step. The temperature is a positive number that rescales the token probabilities: values below 1 favour the most likely tokens, and the model becomes more likely to select low-probability tokens as the value increases. So, as the top_k and temperature parameters increase, the text will become more creative. The settings below set only top_k and leave the temperature at its default.
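The interaction between top_k and temperature can be sketched in plain Python. This is a toy illustration over made-up scores (`toy_logits`), not the code Transformers runs internally, and `sample_top_k` is a hypothetical function name.

```python
import math
import random


def sample_top_k(logits, top_k, temperature, rng=random):
    """Sample one token from the top_k highest-scoring candidates."""
    # Keep only the top_k candidates by score.
    candidates = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Divide the scores by the temperature, then apply a softmax.
    scaled = [score / temperature for _, score in candidates]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - max_scaled) for s in scaled]
    tokens = [token for token, _ in candidates]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for the next token.
toy_logits = {"play": 3.0, "run": 2.0, "sleep": 1.0, "sing": 0.0, "fly": -1.0}

rng = random.Random(0)  # seeded only so the demo is repeatable
samples = [sample_top_k(toy_logits, top_k=2, temperature=0.7, rng=rng) for _ in range(20)]
print(samples)
```

With top_k=2, only "play" and "run" can ever be chosen. Lowering the temperature concentrates the probability on "play", and raising it moves the two candidates closer to an even split.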

args_top_k = GENSettings(do_sample=True, top_k=5, no_repeat_ngram_size=3, early_stopping=True, min_length=1, max_length=50)
print(happy_gen.generate_text(prompt, args=args_top_k).text)

Result:

A cat named "Puppy" came and played with my dog.

Additional Examples

In this section I provide two more examples, along with explanations, to give you additional insight into how to use the model. For each example, I used the beam search settings we defined earlier.

Example 1

["AI", "positive"]: Thank you so much for helping me with my AI project. I am very happy with the results and I am looking forward to working with you in the future.

Sometimes the model does not output the exact keywords. However, I believe it does take the meaning of the unused keywords into account. In this example, the model did not use the word "positive", but the sentiment of the text was indeed positive.
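A quick way to see which keywords were left out is to compare them with the words in the output. The helper below, `missing_keywords`, is a hypothetical name and a deliberately simple sketch: it only finds exact, case-insensitive word matches, so an inflected form such as "purchased" would not count as a match for "purchase".

```python
import re


def missing_keywords(keywords, text):
    """Return the keywords that do not appear in text as whole words."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return [keyword for keyword in keywords if keyword.lower() not in words]


output = (
    "Thank you so much for helping me with my AI project. I am very happy "
    "with the results and I am looking forward to working with you in the future."
)

print(missing_keywords(["AI", "positive"], output))  # ['positive']
```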

Example 2

["kayaking", "ocean", "friends"]: We went kayaking on the Ottawa River and had a great time. We had a lot of fun and learned a lot about the Ottawa area.

It seems like the model reused terminology from the training cases. The third training case mentioned "Ottawa," so I do not think it was a coincidence that the output also mentioned Ottawa. You should keep this in mind when crafting your own set of training cases to form the prompt you'll use.
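One rough way to spot this kind of reuse is to look for training-case keywords that show up in the output even though they were not requested. The sketch below assumes the "Keywords:" line format used in this article. `training_keywords` and `borrowed_keywords` are hypothetical helper names, and `cases` is a shortened version of the training cases.

```python
import re


def training_keywords(training_cases):
    """Collect the lowercased keywords from every 'Keywords:' line."""
    found = set()
    for line in training_cases.splitlines():
        if line.startswith("Keywords:"):
            found.update(k.strip().lower() for k in line[len("Keywords:"):].split(","))
    return found


def borrowed_keywords(training_cases, keywords, output):
    """Training keywords that appear in the output but were not requested."""
    requested = {keyword.lower() for keyword in keywords}
    words = set(re.findall(r"[a-z0-9']+", output.lower()))
    return sorted((training_keywords(training_cases) & words) - requested)


cases = (
    "Keywords: museum, art, painting, Ottawa\n"
    "Output: I went to an art museum in Ottawa and saw some beautiful paintings.\n"
    "###"
)
output = (
    "We went kayaking on the Ottawa River and had a great time. "
    "We had a lot of fun and learned a lot about the Ottawa area."
)

print(borrowed_keywords(cases, ["kayaking", "ocean", "friends"], output))  # ['ottawa']
```

If a check like this flags the same term across many outputs, varying the training cases may help.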

Conclusion

And that's it! You just learned how to use a text generation Transformer model to produce text from a set of keywords. I suggest you read other articles on similar topics from Vennify's website and subscribe to Vennify's YouTube channel. In particular, here's an article and video on how to use and fine-tune GPT-Neo.