
GPT-Neo Made Easy. Run and Train a GPT-3 Like Model

Learn how to implement a GPT-3 like Transformer model with just a few lines of code

By now, I'm sure everyone reading this article has heard of GPT-3. Hugo Cen from Entrepreneur.com wrote an article titled "This Is the Most Powerful Artificial Intelligence Tool in the World," in which he discussed potential applications such as chatbots and email writing [1]. However, GPT-3 is only available through a beta API that is currently waitlisted, and you must complete an application to join the beta [2].

What if you want to leverage the power of GPT-3 but don't want to wait for OpenAI to approve your application? Enter GPT-Neo, an open-source Transformer model that resembles GPT-3 in both design and performance. In this article, we will discuss how to implement GPT-Neo with just a few lines of code.

Note: the largest version of GPT-Neo is about the same size as the smallest version of GPT-3.


We'll be using a library called Happy Transformer to implement the model. Happy Transformer is built on top of Hugging Face's Transformers library to make it easier to implement and train models. Happy Transformer version 2.2.2 was used for this tutorial.

pip install happytransformer


We'll be using Happy Transformer's HappyGeneration class.

HappyGeneration requires two positional inputs: model_type and model_name. In this case, the model type is "GPT-NEO" and the model name is "EleutherAI/gpt-neo-1.3B". A list of other GPT-Neo models can be found here. If your hardware can handle it, you may wish to use "EleutherAI/gpt-neo-2.7B" instead. Likewise, if you wish to downsize, you may use "EleutherAI/gpt-neo-125M".

from happytransformer import HappyGeneration

happy_gen = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-1.3B")
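To get a rough feel for whether your hardware can handle a given model, here is a back-of-the-envelope sketch. It assumes the parameter counts are the nominal sizes in the model names and that each weight is stored as a 32-bit float (4 bytes). The helper name weight_memory_gb is made up for this illustration, and the estimate covers the weights only, not activations or training overhead.

```python
# Nominal parameter counts, read off the model names (an assumption,
# not exact figures).
NOMINAL_PARAMS = {
    "EleutherAI/gpt-neo-125M": 125_000_000,
    "EleutherAI/gpt-neo-1.3B": 1_300_000_000,
    "EleutherAI/gpt-neo-2.7B": 2_700_000_000,
}


def weight_memory_gb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate size of the weights in gigabytes (10**9 bytes).

    bytes_per_param=4 assumes float32 storage.
    """
    return num_params * bytes_per_param / 1e9


for name, params in NOMINAL_PARAMS.items():
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights")
```

Actual memory use will be higher than these figures, so treat them as a lower bound when picking a model.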


From here, we can start generating text with a single line of code using happy_gen's generate_text method. This method returns a dataclass object with a single attribute called text, which contains the generated text as a string.

result = happy_gen.generate_text("Artificial intelligence will ")
print(result)


GenerationResult(text="\nbe the next big thing in the world of\nbusiness.\nAnd it's going to be a big deal.\nAnd it's going to be a big deal.\nAnd it's going to be a big deal.\nAnd it's")
With the newlines removed:
be the next big thing in the world of business. And it's going to be a big deal. And it's going to be a big deal. And it's going to be a big deal. And it's
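If you'd like to flatten the output yourself, here is a minimal sketch. To keep it runnable without downloading a model, it uses a small stand-in dataclass in place of the real result object; the only thing it assumes is the text attribute described above.

```python
from dataclasses import dataclass


@dataclass
class GenerationResult:
    # Stand-in for the object returned by generate_text(); only the
    # `text` attribute is modelled here.
    text: str


result = GenerationResult(
    text="\nbe the next big thing in the world of\nbusiness.\nAnd it's going to be a big deal."
)

# str.split() with no arguments splits on any run of whitespace, so
# re-joining with single spaces removes the newlines and the leading blank.
flattened = " ".join(result.text.split())
print(flattened)
```

The same one-liner works on the text attribute of a real result.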



Notice how the model kept repeating itself. This is quite common with the default settings. Now, we're going to modify the settings so that no sequence of two tokens (a 2-gram) appears more than once in the generated text. For this, we'll use a class called GENSettings.

from happytransformer import GENSettings

args = GENSettings(no_repeat_ngram_size=2)
result = happy_gen.generate_text("Artificial intelligence will ", args=args)
print(result.text)


be the next big thing in the world of business. And it's going to be a big deal. And it's not going away. It's just going to get bigger and bigger. And it will change
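To make the idea behind no_repeat_ngram_size more concrete, here is a small sketch of the rule in plain Python. It is an illustration of the concept rather than the library's actual implementation: the function name banned_next_tokens is made up, and whole words stand in for the model's real tokens.

```python
def banned_next_tokens(tokens, n):
    """Return the tokens that would complete an n-gram already in `tokens`."""
    if n < 1 or len(tokens) < n:
        return set()
    # The last n-1 tokens form the prefix that the next token would extend.
    prefix = tuple(tokens[len(tokens) - n + 1:])
    banned = set()
    for i in range(len(tokens) - n + 1):
        ngram = tokens[i:i + n]
        # If an earlier n-gram starts with the same prefix, its final
        # token would create a repeat, so it is ruled out.
        if tuple(ngram[:-1]) == prefix:
            banned.add(ngram[-1])
    return banned


history = ["And", "it's", "going", "to", "be", "a", "big", "deal.", "And", "it's"]
print(banned_next_tokens(history, 2))
```

With a size of 2, "it's going" has already appeared, so "going" is ruled out as the next word, which pushes the model toward a continuation like "it's not".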

top_k sampling:

The text we just generated is still fairly uncreative and repetitive. To solve this, we can use a different generation algorithm. By default, a greedy algorithm is used, which repeatedly selects the most likely next token. However, we can use many other algorithms, such as top_k sampling, which keeps only the k most likely tokens at each step and then samples among them, so tokens with a higher probability have a greater chance of being selected. See this web page for more details.

args = GENSettings(no_repeat_ngram_size=2, do_sample=True, early_stopping=False, top_k=50, temperature=0.7)

result = happy_gen.generate_text("Artificial intelligence will ", args=args)
print(result.text)


Output: create a whole new world. So let's start talking about the impact that this will have on the workplace. And more specifically, how it will impact jobs in manufacturing or in the auto industry.

That's much better!!!
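If you're curious what the top_k and temperature settings are doing, here is a simplified sketch of a single generation step in plain Python. The scores are made up for illustration and the function names are hypothetical. The real model works over its entire vocabulary, but the idea is the same: greedy always takes the top token, while top_k sampling draws from the k best candidates.

```python
import math
import random


def greedy_pick(logits):
    """Greedy step: always take the highest-scoring token."""
    return max(logits, key=logits.get)


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample one token from the k highest-scoring candidates."""
    # Keep only the k most likely tokens.
    top = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [score / temperature for _, score in top]
    # Softmax weights over the survivors (subtract the max for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    tokens = [token for token, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for the token after "Artificial intelligence will ".
logits = {"be": 3.0, "create": 2.5, "change": 2.2, "never": 0.5, "banana": -4.0}

print(greedy_pick(logits))
rng = random.Random(0)  # seeded so repeated runs draw the same samples
print([top_k_sample(logits, k=3, temperature=0.7, rng=rng) for _ in range(5)])
```

Greedy always returns the same token for the same scores, which is why the earlier outputs were so predictable. Sampling with k=3 can return any of the three best candidates, but never the unlikely ones.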


The topic of training GPT-Neo models deserves a whole article to itself. However, I will briefly discuss how to fine-tune a pretrained GPT-Neo model. This may be helpful if you want to give your model a "personality" based on your own training text. We will be using happy_gen's train() method. This method requires only a single input: a path to a text file.

If you're using Google Colab, even with a Pro account, you must downgrade the model to "EleutherAI/gpt-neo-125M" before training due to hardware limitations.

happy_gen.train("train.txt")

And that's it!

But of course, you may want to modify various learning parameters, like the learning rate and the number of epochs. To do so, we must import a class called GENTrainArgs. A full list of the parameters you can modify can be found here.

from happytransformer import GENTrainArgs

args = GENTrainArgs(learning_rate=1e-5, num_train_epochs=1)
happy_gen.train("train.txt", args=args)

There we go! Now you can continue to generate text using the "generate_text()" method with your newly fine-tuned model.
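If you don't have a training file yet, here is one possible way to assemble one from a list of strings. This is only a sketch: the helper name write_training_file and the one-sample-per-line layout are my own assumptions for illustration, since all the train() method asks for is a path to a text file.

```python
import tempfile
from pathlib import Path


def write_training_file(samples, path):
    """Write the non-empty samples to a UTF-8 text file, one per line."""
    path = Path(path)
    # Collapse internal whitespace so each sample stays on a single line.
    cleaned = [" ".join(sample.split()) for sample in samples if sample.strip()]
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


# Placeholder samples; replace these with your own text.
samples = [
    "Artificial intelligence will change how we write.",
    "   ",  # blank entries are skipped
    "Language models\ncan be fine-tuned on custom text.",
]

# A temporary directory keeps the sketch from overwriting a real train.txt.
train_path = write_training_file(samples, Path(tempfile.mkdtemp()) / "train.txt")
print(train_path.read_text(encoding="utf-8"))
```

The resulting path can then be passed to the train() method in place of "train.txt".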






[2] https://beta.openai.com

[3] https://github.com/EleutherAI/gpt-neo


Read my latest content, support me, and get a full membership to Medium by signing up to Medium with this link: https://medium.com/@ericfillion/membership