GPT-3: What Is It, and Why All the Excitement?

Nov 9, 2020

In the world of Artificial Intelligence (AI), there is plenty of discussion about how far the technology can go, how much of what we currently call ‘work’ we will eventually be able to automate, and what the most beneficial applications for business will be.

How 2020’s most talked about development in AI fits into those conversations takes a little explaining. Forget quasi-sentient super brains, self-managing digital systems, or humanoid robots – the biggest news in AI this year is a text autocomplete tool.

OK, comparing GPT-3 with the autocomplete function you now get as standard in Gmail and on other platforms, without any further explanation, does it a disservice. But at its core, that is exactly what GPT-3 is – a piece of software that predicts what a person (or a machine) wants to write, and makes suggestions that complete the sentence, paragraph, or text for them.

So what, you might well think – improving the vocabulary and grammar of your emails is hardly going to change the world. But what has got people really excited about GPT-3 is the pattern recognition capabilities that underpin it. Pattern recognition is one of the keys to machine learning. With GPT-3, many tech commentators believe we have broken new ground that will usher in a new phase of AI development.

Natural Language Processing

GPT-3 is the work of San Francisco-based software house OpenAI. It is the third iteration of a programme called, to give it its full name, the Generative Pre-trained Transformer. That full title gives a pretty useful, if succinct, explanation of what it does and how it works – generative in the sense that it generates text (or, as we will see, other types of content too), and pre-trained in the sense that it has been ‘taught’ in advance to recognise patterns in massive volumes of text (which is how deep learning systems in general work).

The ‘Transformer’ part refers to the specific deep learning technique GPT-3 uses. Transformer is a neural network architecture introduced in 2017 to deal with one of the central challenges of AI-based language processing (known as Natural Language Processing, or NLP) – the fact that language always involves some kind of translation, the conversion of a symbol or a sound into meaning based on conventions and contextual connections. To process and use language the way people do, therefore, a system needs a way to store and learn all of these conventions and connections. That is what the Transformer was designed to provide.
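
To make that idea a little more concrete, here is a minimal sketch, in plain Python with NumPy, of the scaled dot-product attention operation at the heart of the Transformer. The tiny dimensions and random inputs are purely illustrative and bear no relation to GPT-3’s actual scale.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Core Transformer operation: every position in a sequence weighs up
    how relevant every other position is, then blends their values."""
    d_k = queries.shape[-1]
    # Relevance score of each token with respect to every other token
    scores = queries @ keys.T / np.sqrt(d_k)
    # Softmax turns raw scores into attention weights that sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output vector is a context-aware mixture of the value vectors
    return weights @ values

# Toy example: a 'sentence' of 4 tokens, each an 8-dimensional vector
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
contextualised = scaled_dot_product_attention(tokens, tokens, tokens)
print(contextualised.shape)  # (4, 8): the same tokens, now informed by context
```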

OpenAI has been using the Transformer architecture to, in effect, train an NLP programme to understand and use language the way people do. What makes GPT-3 so exciting is that the latest incarnation of that work, released in June 2020, seems to have a linguistic range beyond what any individual human could match.

The reason is that GPT-3 has had a linguistic education no person could ever hope to emulate. Its training involved scouring something in the region of one billion articles on the web, many of them more than once, looking for patterns not just in grammar and syntax but in style, tone, purpose and so on.

As this article in The Verge points out, GPT-3 has been thoroughly drilled in the composition of news articles, recipes, poetry, coding manuals, fanfiction, religious prophecy – any and every type of text you can find and read on the internet. Run through the Transformer, these texts have been broken down into a mathematical language of statistical patterns, which the model has stored and can now use to predict which words need to go where in any given context. Thanks to the thoroughness of its training, GPT-3 has 175 billion parameters to draw on when deciding on its next autocomplete suggestion.
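
As a loose illustration of what a ‘mathematical language of statistical patterns’ means in its crudest form, the sketch below builds a toy bigram model: it simply counts which word follows which in a tiny made-up corpus and suggests the most frequent continuation. This is emphatically not GPT-3’s actual method, which relies on 175 billion learned parameters rather than raw counts, but it captures the basic idea of predicting the next word from patterns seen in training.

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus standing in for the billions of words GPT-3 has seen
corpus = (
    "the cat sat on the mat . "
    "the cat sat on the rug . "
    "the dog chased the cat ."
).split()

# Count how often each word follows each other word (a 'bigram' model)
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def suggest_next(word):
    """Return the continuation most often seen after this word in training."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(suggest_next("the"))  # 'cat' – the most frequent word after 'the' above
print(suggest_next("sat"))  # 'on'
```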

We’re only at the very early stages of understanding what this could be capable of in practice, beyond writing complete, cogent and convincing pieces of text from scratch. The early indications are that, without being specifically trained to do so, GPT-3 has managed to grasp some of the fundamental patterns of how language works, something centuries of academic linguistic study have arguably not yet achieved.

We don’t know for sure, but what makes people believe this is the fact that GPT-3 seems able to take a very small sample of examples and apply them correctly to create new content. In the trials so far, this ability has been used variously to ‘translate’ pieces of text from one style to another, write original guitar tab compositions and even write computer code in several different programming languages.
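
What that ‘very small sample of examples’ looks like in practice is easiest to show as a prompt. The snippet below is a hypothetical example of the few-shot prompting style used with GPT-3: a couple of demonstrations of an informal-to-formal rewrite followed by a fresh input, with the model simply asked to continue the text. The task and wording here are invented for illustration.

```python
# A hypothetical few-shot prompt: two worked examples of a task, then a new
# case. GPT-3 is never told the rules; it is only asked to continue the text,
# and in predicting the continuation it effectively performs the task.
prompt = """Rewrite each sentence in a formal register.

Informal: gotta bounce, catch you later
Formal: I must leave now; I will speak with you later.

Informal: that meeting was a total train wreck
Formal: The meeting did not go well.

Informal: can u send me the doc asap
Formal:"""
```

Given a prompt like this, the model’s most likely continuation is a formal rewrite of the final sentence – no retraining or explicit instructions required.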

If you haven’t heard of GPT-3 up until now, don’t be surprised – OpenAI only made its GPT project commercially available with this third version, releasing it in June this year as part of an API that is still in beta testing. It has, however, already signed a commercial licensing agreement with Microsoft.
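
For the curious, here is roughly what a call to that beta API looked like from Python at the time, using OpenAI’s openai client library. The engine name and parameter values are indicative of the 2020 beta rather than a definitive reference, and the call requires an API key issued by OpenAI to beta participants.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # issued by OpenAI to beta participants

# Hand the model a prompt and ask it to predict what comes next
response = openai.Completion.create(
    engine="davinci",        # the largest GPT-3 model available in the beta
    prompt="Artificial intelligence will change the workplace by",
    max_tokens=60,           # how much text to generate
    temperature=0.7,         # higher values produce more varied suggestions
)

print(response["choices"][0]["text"])
```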

Expect to be hearing a lot more about GPT-3 in the coming months and years.