
Any Python Programmer Can Use State-of-the-art AI Models… Including You!

You don't need a PhD to implement and train the best natural language processing models

Yes, that's right! Any Python programmer can implement and train state-of-the-art natural language processing (NLP) models. This may seem surprising, but just because a technology is powerful, it doesn't mean it's hard to use.

Top NLP models are clearly powerful. They are often evaluated on the General Language Understanding Evaluation (GLUE) benchmark, which is composed of 11 tests that measure a model's ability to understand language. Currently, the top models outperform the human baseline on the benchmark.

Transformer models have revolutionized the field of NLP in recent years and are the reason models can outperform humans on the GLUE benchmark. They were proposed by Google in 2017 and have since dominated nearly every NLP task. In addition, companies have been spending tens of thousands of dollars on fine-tuning these models and have been releasing them to the public. For example, OpenAI made headlines for initially withholding a model called "GPT-2" from the public over concerns that it was too powerful and could be misused. They eventually released the model, including the largest version, which you'll learn how to implement in this article.

Currently, the most popular way to implement these Transformer models is to use a library called “transformers” created by a company called Hugging Face. But this library can be challenging to use in some cases, especially when it comes to fine-tuning. So, in early 2020, my team published a Python package called Happy Transformer that makes it easy to implement and train state-of-the-art Transformer models.

In this article, I'll discuss how to use NLP Transformer models for two common applications. I'll start by discussing how to implement and train a model for text generation. Then I'll move on to discussing how to use a model for text classification. So, whether you're just starting out with Python or have a PhD in NLP, you should be able to follow along and learn from this article as I show you how to implement arguably the most powerful AI technologies available to the public with just a few lines of code.

Text Generation


First off, let’s install Happy Transformer. Happy Transformer is available on PyPI, and thus we can install it with a simple pip command.

pip install happytransformer

Now let’s import a class called HappyGeneration from Happy Transformer. This class allows us to download and use text generation models that are available on Hugging Face’s model distribution network.

from happytransformer import HappyGeneration

For this tutorial, we’ll use a model called GPT-Neo. GPT-Neo is a fully open-source alternative to the renowned GPT-3 model created by OpenAI. At this time, there are three GPT-Neo models of different sizes you can download and train with Happy Transformer. For this tutorial, we’ll use the smallest model, which has 125M parameters.

happy_gen = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-125M")

Note: to use the largest GPT-2 model, which has 1.5B parameters, pass "GPT2" and "gpt2-xl" as the model type and model name instead.

Now, we can generate text with just one more line of code. The model takes in a string input and attempts to continue it. So, to perform text generation, we can call generate_text() from happy_gen.

text = "I went to the store to buy "
result = happy_gen.generate_text(text)

Result: some clothes. I was in the store with my friend, and she was in the store with me. I was in the store with her. I was in the store with her. I was in the store with her. I was in the store

Notice how the model repeated itself? That’s because, by default, the program uses a very simple text generation algorithm called “greedy” that is prone to repetition. This algorithm simply selects the most likely next token at each step, where a token is typically a word or symbol. Instead, we can use an algorithm called “top-k sampling,” which randomly samples the next token from the k most likely candidates. This algorithm typically results in text that is more creative and less repetitive.
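To make the difference concrete, here is a minimal sketch of the two selection rules using only Python's standard library. The probabilities are invented for illustration and the helper names greedy_pick and top_k_pick are hypothetical; a real model scores its entire vocabulary at every step, and this is not how Happy Transformer implements decoding internally.

```python
import random

# Hypothetical next-token probabilities for the prompt "I went to the store to buy".
# These numbers are invented for illustration only.
next_token_probs = {
    "some": 0.40,
    "a": 0.30,
    "milk": 0.15,
    "groceries": 0.10,
    "bananas": 0.05,
}


def greedy_pick(probs):
    """Greedy decoding: always return the single most likely token."""
    return max(probs, key=probs.get)


def top_k_pick(probs, k, rng):
    """Top-k sampling: keep the k most likely tokens, then sample among
    them in proportion to their probabilities."""
    top_k = sorted(probs, key=probs.get, reverse=True)[:k]
    weights = [probs[token] for token in top_k]
    return rng.choices(top_k, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so repeated runs draw the same samples
greedy_choice = greedy_pick(next_token_probs)
sampled_choices = [top_k_pick(next_token_probs, k=3, rng=rng) for _ in range(10)]

print("greedy:", greedy_choice)
print("top-k :", sampled_choices)
```

Greedy selection returns the same token every time for the same input, which is one reason it can get stuck in loops. Top-k sampling can return any of the k candidates, so the continuation varies from run to run.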

To adjust the text generation algorithm we first need to import a class called “GENSettings.”

from happytransformer import GENSettings

Now, let’s create a GENSettings object using top-k sampling parameters.

top_k_sampling_settings = GENSettings(do_sample=True, top_k=50, temperature=0.5, max_length=50)

You can learn more about other text generation algorithms and what the different GENSettings parameters mean here.

result = happy_gen.generate_text(text, top_k_sampling_settings)

Result: a new sweater. The store was small, so I took a look at it. I saw that there were no more than 30-40 pairs of shoes on the shelf. I was afraid that the store was going to have a lot of people selling


We can also fine-tune the model on our own text. Perhaps you have conversational data from your favourite fictional character and want the model to imitate them. Or maybe you have text data from a textbook and want to increase the model's knowledge of a particular domain. In either case, you can accomplish this with a single line of code, assuming you have your data in a text file.

To train the model, create a text file that contains the text you want the model to learn from. There are no strict formatting guidelines: just structure the data the way you expect your model to encounter it in production.

After creating a text file containing your training data, call happy_gen.train() and provide the path to the text file as the only positional parameter.
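As a rough sketch of those two steps, the snippet below writes a small training file and wraps the training call in a helper. The file name train.txt, the sample dialogue, and the helper fine_tune are all assumptions made for illustration; only the happy_gen.train() call with a file path comes from the description above.

```python
from pathlib import Path

# Hypothetical training text; replace it with your own data.
training_text = (
    "Customer: Where is my order?\n"
    "Agent: It shipped this morning and should arrive on Friday.\n"
)

# "train.txt" is an assumed file name, not one the library requires.
train_path = Path("train.txt")
train_path.write_text(training_text, encoding="utf-8")


def fine_tune(happy_gen, path):
    """Fine-tune an existing HappyGeneration object on the text file at `path`."""
    happy_gen.train(str(path))
```

With the happy_gen object created earlier, calling fine_tune(happy_gen, train_path) would run the training step.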


From here, you can resume performing inference as described above, except now you’ll be using a fine-tuned model. So, simply call happy_gen.generate_text() as before.

Text Classification

Text classification is a fundamental task in NLP. For example, perhaps you wish to classify incoming emails as spam or ham. Or maybe you want to detect the sentiment of movie reviews. In either case, you would be performing text classification, and Transformer models have dominated text classification tasks in recent years.

With Happy Transformer, you can download pre-trained text classification models from Hugging Face’s model distribution network. You can also fine-tune your own model for custom tasks with your own data.

Pretrained Models

There are over 1000 text classification Transformer models you can download from Hugging Face’s model distribution network. In this article, we’ll discuss how to use the most downloaded model, which is a DistilBERT model that was trained for sentiment analysis.

First off, we’ll import a class called HappyTextClassification from Happy Transformer.

from happytransformer import HappyTextClassification

Now, we can download the model by creating a HappyTextClassification object. The first positional parameter is the model type in all caps, so in this case, “DISTILBERT.” The second parameter is for the model name, so “distilbert-base-uncased-finetuned-sst-2-english.” This model achieved a 91.3% accuracy on binary sentiment analysis for movie reviews. Finally, we’ll set the “num_labels” parameter to the number of classes we have, which is 2: positive and negative.

model_type = "DISTILBERT"
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
happy_tc = HappyTextClassification(model_type, model_name, num_labels=2)

From here, we can classify text with just a single line of code by calling happy_tc.classify_text() and providing whatever text we wish to classify.

result = happy_tc.classify_text("Wow that movie was great!")


TextClassificationResult(label='POSITIVE', score=0.9998310208320618)

The output is a dataclass object with two values: label and score. We can isolate each value by reading the attribute of the same name, result.label or result.score.
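Here is a minimal sketch of that attribute access. So that it runs without downloading a model, it defines a stand-in dataclass that mirrors only the two fields shown in the output above; the real class ships with Happy Transformer, and the result object would normally come from happy_tc.classify_text().

```python
from dataclasses import dataclass


# Stand-in for the object returned by classify_text(). It only mirrors
# the two fields visible in the printed output above.
@dataclass
class TextClassificationResult:
    label: str
    score: float


result = TextClassificationResult(label="POSITIVE", score=0.9998310208320618)

label = result.label  # the predicted class as a string
score = result.score  # the score the model assigned to that class

print(label)
print(f"{score:.4f}")
```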






Transformer models allow you to perform “transfer learning.” Essentially, we can start with a model that has already been trained with tens of thousands of dollars' worth of resources and then fine-tune it with under a dollar of resources to perform a new task. For this example, we’ll load a pre-trained model called “RoBERTa” and use its base version. Then, we’ll train it to detect if a movie review is positive or negative.

model_type = "ROBERTA"
model_name = "roberta-base"
happy_tc_roberta = HappyTextClassification(model_type, model_name, num_labels=2)

Now, let’s discuss how to format the data. This is just a toy example, so we’ll only use two training examples. Create a CSV file with two columns: text and label. Then, within each row, add the text for the particular case along with its label in the form of an integer. For this example, the value 0 indicates negative while 1 indicates positive. The CSV file we’ll use for this example is shown below.

text,label
I really enjoyed the movie,1
I hated the movie,0
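If you would rather create this file from Python than by hand, the sketch below writes the same two rows with the standard csv module. The file name train.csv is an assumption; any path works as long as you pass the same one when training.

```python
import csv

# The two toy examples from above as (text, label) pairs,
# where 1 means positive and 0 means negative.
rows = [
    ("I really enjoyed the movie", 1),
    ("I hated the movie", 0),
]

# "train.csv" is an assumed file name.
with open("train.csv", "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["text", "label"])  # header row with the two column names
    writer.writerows(rows)
```

Using csv.writer rather than joining strings by hand means any review text that contains a comma or a quote is escaped correctly.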

Once you have a CSV file formatted as described above, you can immediately begin training the model by calling happy_tc_roberta.train() and providing the path to the training file.
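A mistake in the CSV is easier to catch before training starts than halfway through it. The sketch below checks the column names and labels first and only then calls train(). Both helpers, check_training_rows and fine_tune_classifier, are hypothetical names introduced here and are not part of Happy Transformer; only the train() call with a file path comes from the description above.

```python
import csv
import io


def check_training_rows(csv_file):
    """Return the number of data rows after confirming that the columns
    are text and label and that every label is 0 or 1."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != ["text", "label"]:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    count = 0
    for row in reader:
        if row["label"] not in {"0", "1"}:
            raise ValueError(f"unexpected label: {row['label']!r}")
        count += 1
    return count


def fine_tune_classifier(happy_tc, csv_path):
    """Check the CSV at csv_path, then fine-tune the given
    HappyTextClassification object on it."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        check_training_rows(csv_file)
    happy_tc.train(csv_path)


# Quick check on an in-memory copy of the toy data, so no file is needed here.
sample = io.StringIO("text,label\nI really enjoyed the movie,1\nI hated the movie,0\n")
row_count = check_training_rows(sample)
print(row_count)
```

With the object created earlier, fine_tune_classifier(happy_tc_roberta, "train.csv") would validate the file and then run the training step, assuming the file is saved under that name.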


And that’s it, we just trained the model! We can also adjust the learning parameters by importing a class called “TCTrainArgs.” Visit this webpage for a list of learning parameters you can adjust and this webpage for an explanation of each. You can also view this article, which provides a more in-depth explanation of how to modify learning parameters.

We can now call the classify_text() method as before to perform inference. The two labels that will be outputted are called "LABEL_0" and "LABEL_1," indicating if the output is negative or positive. Of course, since we've only used two training examples, the model would not fare much better than guessing.
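Since "LABEL_0" and "LABEL_1" are not very descriptive, you may want to translate them back into the names used in the training data. The small sketch below does that with a dictionary; LABEL_NAMES and readable_label are hypothetical names chosen for this example.

```python
# Mapping from the generic label names to the meaning used in this tutorial,
# where 0 was negative and 1 was positive in the training CSV.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def readable_label(raw_label):
    """Translate a raw label such as "LABEL_1" into "positive" or "negative"."""
    return LABEL_NAMES[raw_label]


print(readable_label("LABEL_1"))
```

In practice you would pass in the label attribute of the classification result, for example readable_label(result.label).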


You just learned how to implement and train state-of-the-art Transformer models for two different applications! We’ve only just scratched the surface of everything you can do with Happy Transformer. I’ve included a few links below to resources you can use to learn more about the library.

Stay happy everyone!


Full course on how to implement and train GPT-Neo along with how to create a web app to display it.

Support Happy Transformer by giving it a star 🌟🌟🌟

Article: more depth into training a text generation model

Article: How to modify text classification learning parameters

Code used in this tutorial
