AI models can predict future events. From forecasting the weather to forecasting sports matches, they use past events to anticipate future ones. But what if it were possible to predict future events within a story? For example, imagine providing a language model with the first half of a fictional story and having it predict the outcomes for the rest. If humans can make predictions about fictional stories with some accuracy, then in theory language models should be able to as well. In this article, I outline a method that could accomplish this, and I encourage NLP practitioners to attempt it using the approach I'm about to propose.
My fourth-year capstone project team found a very interesting application of text generation language models. We found that if you provided a language model with background information followed by a question, it would often successfully answer the question. We used GPT-2 as the language model, a Transformer model that specializes in text generation. We later found that a team of researchers from the Allen Institute for AI had discovered a similar application of language models before us and named it "self-talk."
Here are a couple of examples from my capstone project. The first shows that it is possible to generate background information using just a question as the prompt; there, the model assumed that "library" meant a technical database rather than a place where books are stored. The second shows that providing content before the question can improve performance.
| Strategy | Prompt | Generated Background Information |
|----------|--------|----------------------------------|
| None | What is a library? | A library is an object that can be used to store data in a database |
| Nouns | Book. Article. What is a library? | A library is an organization of books, magazines, and other materials that can be used for research, teaching, or other purposes |
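The self-talk setup behind the table above can be sketched with the Hugging Face transformers library. The helper names are my own and the code is an illustrative sketch, not my capstone's exact implementation; only `pipeline("text-generation", model="gpt2")` is a real API call.

```python
# Sketch of the self-talk strategy: optionally prepend context (e.g. nouns
# from the question) to the question itself, then let GPT-2 continue the
# text, producing background information.

def build_self_talk_prompt(question: str, context: str = "") -> str:
    """Combine optional context with a question to form a generation prompt."""
    if context:
        return f"{context} {question}"
    return question

def generate_background(question: str, context: str = "") -> str:
    """Ask GPT-2 to continue the prompt; the continuation is the 'answer'."""
    # Requires `pip install transformers`; model weights download on first use.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    prompt = build_self_talk_prompt(question, context)
    output = generator(prompt, max_new_tokens=40, num_return_sequences=1)
    return output[0]["generated_text"]
```

For example, `generate_background("What is a library?", context="Book. Article.")` reproduces the "Nouns" strategy from the table: the added nouns nudge the model toward the books-and-magazines sense of "library."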
Both my capstone project and the paper published by the Allen Institute for AI focused on generating background information for simple topics using self-talk. However, as I will discuss, I believe we both just scratched the surface of the potential applications of this technology.
Here's an article on how to perform self-talk.
Predicting the Future
I believe that self-talk could be used to predict future events in complex stories. The prompt would contain a question appended to a body of text containing the story and background information. The text generation model would then attempt to continue the text and, in doing so, answer the question. Of course, this would require a considerable amount of common-sense reasoning, and even if it is not possible today, it may become so as deep learning models advance.
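The prompt format just described can be sketched as follows. The function names and the "Question:/Answer:" framing are my own illustrative assumptions, not an established API; `pipeline` is the real transformers entry point.

```python
# Sketch of the proposed story-prediction prompt: the story so far and a
# question about future events are concatenated, so the model's continuation
# doubles as an answer to the question.

def build_prediction_prompt(story: str, question: str) -> str:
    """Append a question about future events to the story so far."""
    return f"{story.strip()}\n\nQuestion: {question.strip()}\nAnswer:"

def predict_story_outcome(story: str, question: str) -> str:
    """Generate a continuation that (hopefully) answers the question."""
    # Requires `pip install transformers`; model weights download on first use.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    prompt = build_prediction_prompt(story, question)
    output = generator(prompt, max_new_tokens=60, num_return_sequences=1)
    # Return only the newly generated text that follows the prompt.
    return output[0]["generated_text"][len(prompt):].strip()
```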
One current limitation is the input size of state-of-the-art text generation models. GPT-3, for example, accepts at most 2048 tokens. A text summarization model could be used to reduce larger inputs to an acceptable size.
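One way to handle oversized inputs is to summarize them first. Below is a minimal sketch of that idea; the four-characters-per-token heuristic and the helper names are my assumptions (a real pipeline would count tokens with the model's own tokenizer), while the summarization `pipeline` call and the `facebook/bart-large-cnn` model are real transformers features.

```python
# Sketch: shrink inputs that exceed a model's context window by summarizing
# them before they are used as a prompt.

def needs_compression(text: str, token_limit: int = 2048,
                      chars_per_token: int = 4) -> bool:
    """Rough check: English text averages about four characters per token."""
    return len(text) > token_limit * chars_per_token

def compress(text: str) -> str:
    """Summarize oversized text so it fits the generation model's input."""
    # Requires `pip install transformers`; model weights download on first use.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return summarizer(text, max_length=150, min_length=40)[0]["summary_text"]

def prepare_input(text: str) -> str:
    """Pass short texts through unchanged; summarize long ones."""
    return compress(text) if needs_compression(text) else text
```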
Every day we're making complex predictions based on data that can be reduced to text data. For example, every time you watch a movie or read a book, you're likely making predictions on what will happen next. It may be possible for these same predictions to be accurately generated using self-talk.
There are applications beyond fiction. For example, perhaps one day it will be possible to provide a model with the latest news about a particular company, ask it "Will 'x' stock increase or decrease in value today?", and have it answer with some accuracy. Similarly, a sufficiently advanced and fine-tuned model may be able to predict the outcome of world events from news segments.
In this article, I proposed a method to predict future events with large language models. The method is quite simple: provide the model with background content followed by a question and have it produce a continuation of the text. If successful, the continuation answers the question, which in this case concerns a future event within a story. If language models are truly able to comprehend text and perform common-sense reasoning, then I believe it is possible for a model to successfully make complex predictions about future events.
I encourage NLP practitioners to perform experiments to prove or disprove my hypothesis. In either case, I'm looking forward to potentially reading what you produce.
Author: Eric Fillion
Title: Predict Future Events With Transformer Models (NLP Research Idea)
Organization: Vennify AI
Date: July 26, 2021