About AI, GPT-3, and ChatGPT
An explainer about some AI things at a nice basic level, prompted by OpenAI's recent popularity
I’ve been thinking about this one since OpenAI’s ChatGPT thing became a social phenomenon. On the one hand, it’s very cool, so I wanted to do an explainer to learn more about how it works and talk about it in an informed way. But on the other hand, when almost anyone talks about it, I roll my eyes. This article is the explainer where I talk about what AI and ChatGPT are and how they work as simply as I can. A discussion more about the issues, limitations, and eyerolling associated with ChatGPT specifically will come along later.
What is AI?
Artificial Intelligence (AI) is how we describe computational systems that can do things that humans do. Things like understanding natural language, recognising objects, spotting patterns, and making informed decisions. Where humans use their brains to do these things based on senses, sometimes genetics, and memories, AI uses large amounts of data and mathematical algorithms to do them based on the information it has been given.
There are several different types of AI; the most well-known ones you’ll see are machine learning, deep learning, and natural language processing.
Machine learning can be used to classify data, make predictions, and identify patterns. Deep learning is used for more complex tasks such as image and speech recognition, and natural language processing is used to make computers understand, interpret, and generate language.
AI can be used in a lot of ways, some that you will have heard of like self-driving cars and chatbots, and some that you probably won’t have, like environmental modelling, or localisation and mapping systems. AI is transforming our way of life and already has a significant, and growing, impact on society; it raises a lot of important ethical and societal questions. I hope reading this is a step on your journey to thinking about those questions.
So what is GPT-3?
GPT-3 (Generative Pretrained Transformer 3) is an AI model from OpenAI that generates text based on given ‘prompts’: questions, suggestions, or commands. The model works because it has read inconceivably massive amounts of text and uses it to determine patterns and relationships between words and phrases. Basic things like how words are spelt and how sentences are structured, but also more complex things like the relationship between certain words and emotional connotations. Bad vs awful, or happy vs wholesome, that kind of thing.
So when you give it a prompt, the model uses all of the text it has read to generate a response. It does this by predicting the word or character that should come next based on the prompt and what came before. For example, if you ask it a question like ‘what is a duck?’, it would know to take the subject of the question, ‘a duck’, and put it first. Then it would think about the data it has read across the internet for things that talk about the relationship between ‘duck’ and ‘what’. Based on all the things it’s seen, it’ll come up with the next word. This process is then repeated over and over again to create a full response one word at a time. Something like: “A duck is a bird”.
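That one-word-at-a-time loop can be sketched in a few lines of Python. This toy uses a simple bigram model (counting which word follows which in a tiny made-up corpus) rather than anything like GPT-3’s neural network, but the repeated next-word prediction is the same idea:

```python
from collections import Counter, defaultdict

# A tiny corpus standing in for the web-scale text GPT-3 was trained on.
corpus = "the duck is a bird . the dog is a mammal .".split()

# Count which word follows which: a 'bigram' model, vastly simpler than
# GPT-3, but it captures the same generate-as-you-go loop.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, max_words=10):
    words = [start]
    while len(words) <= max_words:
        options = follows[words[-1]]
        if not options:
            break
        # Predict the next word: the most common follower seen in training.
        words.append(options.most_common(1)[0][0])
        if words[-1] == ".":
            break
    return " ".join(words)
```

Here `generate("duck")` walks the counts one word at a time and produces `"duck is a bird ."`. GPT-3 does the same kind of repeated next-word prediction, just with an enormously more sophisticated model of what ‘should come next’.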
To do this, the model uses what’s called a sequence-to-sequence (Seq2Seq) architecture with two main components: an encoder and a decoder.
The encoder processes the prompt and ‘summarises’ it using the information it was trained on so it recognises all the words and sentence structures you use. This is then passed to the decoder which gives each word in the summary a ‘score’ based on how important it thinks it will be for the output.
In the above example ‘duck’ and ‘what’ would score higher than ‘is’ and ‘a’ for example. These scores allow the model to go looking for relationships and start pulling in words.
It only works because of the truly incomprehensible amounts of information it has been given. For the above example it would have been fed thousands of articles, books, or other texts that refer to or talk about ducks. It would have ranked all of them by relevance and know where they all sit in its database, so when you ask ‘what is a duck?’ it knows exactly where it has seen those words before and how often those sources agree with each other.
What is ChatGPT?
Finally, ‘ChatGPT’ specifically describes a use case of GPT-3. The ‘chat’ part refers to the fact that it was trained on, or was given access to, data that was specifically conversational. While it has access to vast amounts of data in the way we described before, its model was trained specifically using examples of conversations.
OpenAI used human ‘AI trainers’ to provide conversations in which they played both sides: the human and the AI assistant. The trainers also chatted with the chatbot itself and ranked the quality of its answers and conversations. This meant that the model could ‘learn’ which answers were better or worse and improve. That information was mixed with the existing database, which added the conversational aspect.
In this way it could take a prompt and form output words, one at a time, that are not just (hopefully) correct, but follow the same flow, or tone, or structure, as a conversation, because the model had been ‘trained’ on heaps of conversations already. Neat huh? Now, all of that is a bit of an oversimplification, but hopefully it explains the gist.
How does AI or GPT-3 work though?
AI uses mathematics to look at information, process it, and then create an output. It uses ‘models’ to learn from data and make predictions or decisions that it wasn’t explicitly told to do.
A mathematical ‘model’ is a way of trying to understand how something will go, before it happens, based on something you already know. Think of how we might model ourselves after someone else: you like how someone you admire behaves, so you change your own behaviour to be more like them. A mathematical model does something similar, shaping its predictions to match the examples it has seen.
Imagine you run a pub and you want to know how much beer you’ll sell on Friday when it’s going to be especially cold out. You would know intuitively that you’ll sell more than other days of the week, because it’s a Friday, but it’s going to be chilly, so how does that affect things? You could say, well it’s cold so we’ll sell more because folks will want to stay in one place. Or, you could use a mathematical model to make a more accurate prediction.
You would go to the books and find out how much beer was sold on as many Fridays in the past as you can see, and find out the temperature of each of those Fridays too. You could then plot this information on a graph to see if there is a relationship between the temperature and the amount of beer sold on a Friday. Then, if there were, you could follow the line and have a guess. You could use this same information to predict how much you’ll sell on every Friday into the future based on the temperature.
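That line-fitting step is classic least-squares linear regression, and it fits in a few lines of Python. The temperatures and pint counts below are invented purely for illustration:

```python
# Past Fridays: temperature in degrees C and pints of beer sold (numbers invented).
temps = [4, 8, 12, 16, 20]
pints = [260, 240, 215, 190, 170]

n = len(temps)
mean_t = sum(temps) / n
mean_p = sum(pints) / n

# Slope and intercept of the least-squares best-fit line.
slope = sum((t - mean_t) * (p - mean_p) for t, p in zip(temps, pints)) \
        / sum((t - mean_t) ** 2 for t in temps)
intercept = mean_p - slope * mean_t

def predict(temperature):
    # Follow the line to a temperature we haven't seen before.
    return intercept + slope * temperature

# An especially cold Friday at 2 degrees C:
cold_friday = predict(2)  # 272.5 pints
```

Colder Fridays sell more pints in this made-up data, so the line slopes downward and ‘following the line’ to 2 °C gives a higher-than-average prediction.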
Of course, there are other things that would affect how much beer you’d sell, like what time it is, or if there’s a game on, or if it’s raining, or whatever. So you could do the same thing again and combine it with the temperature model we talked about earlier and get a more and more accurate prediction. Those other things that you could measure that would affect how much beer you’d sell are called ‘parameters’.
Once you’ve done this enough you can change the parameters (how hot it is, if a game is on, what time it is, etc) to make any prediction you’d like.
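Adding more parameters works the same way, just with more numbers to tune. A minimal sketch, again with invented data: a two-parameter model (temperature, plus whether a game is on) fitted with simple gradient descent, nudging the weights a little every time the prediction misses:

```python
# A two-parameter 'beer model': pints ~ base + w_temp*temperature + w_game*game_on.
# The data is invented and follows the rule 280 - 5*temp + 40*game exactly.
data = [  # (temperature in degrees C, game on? 1 or 0, pints sold)
    (4, 1, 300), (4, 0, 260),
    (12, 1, 260), (12, 0, 220),
    (20, 1, 220), (20, 0, 180),
]

base, w_temp, w_game = 0.0, 0.0, 0.0
rate = 0.002  # how big a nudge each miss produces

for _ in range(50_000):
    for temp, game, pints in data:
        error = (base + w_temp * temp + w_game * game) - pints
        # Nudge each weight in the direction that shrinks the error.
        base   -= rate * error
        w_temp -= rate * error * temp
        w_game -= rate * error * game

# Predict a cold (5 degrees C) Friday with a game on:
prediction = base + w_temp * 5 + w_game * 1
```

After enough nudges the weights settle close to the rule hidden in the data, and you can plug in any combination of parameters you like.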
This is one of the ways AI works. It’s called ‘supervised learning’: the AI is given historical parameters and the result (how much beer is sold) to train on, so that later you can just give it the parameters and it’ll guess at the result. The information you give it to ‘train’ on is called ‘training data’, because it uses that information to refine the model so that when it’s go time, when it doesn’t have the result, only the parameters, it can perform.
There are two other common types of AI models: ‘unsupervised learning’ and ‘reinforcement learning’. Unsupervised learning involves training the AI just on parameters, where the result is unknown. The model identifies patterns and relationships in the parameters themselves, like which parameters matter most or which ones tend to move together.
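As a concrete toy example of unsupervised learning, here’s one-dimensional k-means clustering in Python: given only pint counts and no labels at all, it finds two natural groups (quiet nights and busy nights) on its own. The numbers are made up:

```python
# Eight nights of pint sales, no labels. One-dimensional k-means (k=2)
# discovers the 'quiet' and 'busy' groups by itself.
sales = [168, 172, 180, 175, 255, 260, 248, 252]

centres = [min(sales), max(sales)]  # two initial guesses for group centres
for _ in range(10):
    groups = [[], []]
    for s in sales:
        # Assign each night to its nearest centre...
        nearest = 0 if abs(s - centres[0]) < abs(s - centres[1]) else 1
        groups[nearest].append(s)
    # ...then move each centre to the middle (mean) of its group.
    centres = [sum(g) / len(g) for g in groups]
```

Nobody told the model what ‘quiet’ or ‘busy’ means; the two centres simply settle wherever the data naturally clumps.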
And reinforcement learning involves training the AI model by exposing it to situations where it receives rewards or punishments based on its actions. The model learns from these experiences and adjusts its behaviour to maximise the rewards over time. This is the main kind of model that ChatGPT uses. It has access to OpenAI’s GPT-3 language models and uses human AI trainers for reinforcement feedback, telling ChatGPT not just what is correct, but what is correct conversationally. This was presumably done hundreds of thousands of times until the folks at OpenAI were confident their model was good.
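Here’s a toy reinforcement-learning loop in that spirit, though it’s a bandit-style sketch and nothing like ChatGPT’s actual training setup: a ‘model’ picks between two canned replies, a hypothetical trainer rewards the more conversational one, and the model’s preference drifts toward whatever earns reward:

```python
import random

random.seed(0)  # make the toy run repeatable

replies = ["Affirmative.", "Sure, happy to help!"]
preference = [0.0, 0.0]  # the model's learned score for each reply

def reward(reply):
    # Hypothetical trainer: the conversational answer earns 1, the terse one 0.
    return 1.0 if reply == "Sure, happy to help!" else 0.0

for _ in range(500):
    # Mostly exploit the best-known reply, but explore 10% of the time
    # (and whenever the scores are tied).
    if random.random() < 0.1 or preference[0] == preference[1]:
        choice = random.randrange(2)
    else:
        choice = preference.index(max(preference))
    # Nudge the chosen reply's score toward the reward it earned.
    preference[choice] += 0.1 * (reward(replies[choice]) - preference[choice])

best = replies[preference.index(max(preference))]
```

After a few hundred rounds of feedback the rewarded reply dominates, which is the ‘adjusts its behaviour to maximise the rewards over time’ part in miniature.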
Outro
I’ve studied AI a few different times in my life, but without applying what I learned or using it in life or work, I find that with time it gets harder to write about. I remember the fundamental concepts, loosely, but putting it into words like this is tricky. To write this article I went back through some old notes and read a lot from OpenAI’s own blog, which is well worth a read.
But a lot of what I found was quite complicated or used a fair amount of technical language. Hopefully this was simpler and easier to understand. If you want to learn more, just remember that technical language mostly exists to make sentences shorter. Complex terms like ‘linear regression’ or ‘neural network’ are just quicker ways of saying things for people who already know what they mean. So don’t be put off, you’ll just have to build up some new vocab.
Thanks for reading. The best way to support this newsletter is to sign up ✍️ share it 🤝 and get your friends to sign up too 😎 That’s what this lovely purple button is for 👀