What is AI?

All eyes are on AI. Seemingly every business, app and device has been loaded up with AI enhancements and overhauls. Beneath the corporate hype, though, society and industry really are reinventing themselves to adapt to these tools and utilities, which promise to make life easier.

It is becoming increasingly difficult to keep track of all the breakthroughs and state-of-the-art technologies, so over the next month we will publish a series of posts breaking down where things currently stand, and what the future holds.

In this post, we will explain some basic terms to help you get up to speed with AI.

Artificial Intelligence

AI refers to software tools designed to imitate and automate the intelligent behaviour of humans. In general, they:

  • look at data

  • understand it

  • summarise their findings or make a prediction based on them.

They do this using rules that are either given to them (a human has written rules in computer code to handle every style of problem the AI will face) or are learnt (the model is fed a lot of data and learns to work out the rules for itself).
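
To make that distinction concrete, here is a tiny, illustrative sketch in Python (the function names, word list and toy data are invented for this post, not taken from any real system). The first approach hard-codes a rule a human wrote; the second works out a simple rule from labelled examples.

    # A hand-written rule: a human has decided what counts as spam.
    def is_spam_by_rule(email: str) -> bool:
        banned_phrases = ["free money", "click here", "you are a winner"]
        return any(phrase in email.lower() for phrase in banned_phrases)

    # A 'learnt' rule: the cut-off is worked out from labelled examples
    # rather than being chosen by a human.
    def learn_spam_cutoff(examples: list[tuple[int, bool]]) -> float:
        # Each example is (number of suspicious phrases, was it actually spam?)
        spam_counts = [count for count, is_spam in examples if is_spam]
        ham_counts = [count for count, is_spam in examples if not is_spam]
        # Put the cut-off halfway between typical spam and non-spam counts.
        return (min(spam_counts) + max(ham_counts)) / 2

    training_examples = [(0, False), (1, False), (3, True), (5, True)]
    cutoff = learn_spam_cutoff(training_examples)
    print(is_spam_by_rule("Click here to claim your FREE MONEY!"))  # True
    print(cutoff)  # 2.0 - new emails with more suspicious phrases than this get flagged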

The first type of AI has been around for decades: automating your inbox, detecting banking fraud, and even recommending you shows on Netflix. The second has too, but it has only grown to its current prominence over the last decade or so. This type is called machine learning, and it is now the backbone of almost every modern AI system.

Machine Learning

Machine learning is a field that focuses on making machines that learn from data. A machine learning ‘model’ is an algorithm or program that uses maths to learn from data, a process called training. The model then applies what it has learnt to new, real-world data – a process called inference, because the model infers a prediction from the data it is given.

Machine learning models train by taking data and making a guess. When they are correct, they get rewarded to reinforce that behaviour. When they are wrong, they go back, work out which step of their thinking was off, and tweak their numbers so they can avoid making that mistake again.
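
Here is what that guess-and-tweak loop can look like, boiled down to a few lines of Python (the data and numbers are made up for illustration). The ‘model’ is a single adjustable number, which gets nudged after every wrong guess; the last line shows inference on a new input.

    # Toy training data: inputs x and the correct answers y (here, y = 2x).
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

    weight = 0.0          # the model's only adjustable number (its 'parameter')
    learning_rate = 0.05  # how big each tweak is

    for step in range(200):                       # training
        for x, y in data:
            guess = weight * x                    # the model makes a guess
            error = guess - y                     # how wrong the guess was
            weight -= learning_rate * error * x   # tweak towards a better answer

    print(round(weight, 2))  # ~2.0: the rule the model has learnt from the data
    print(weight * 10.0)     # inference: predicting the answer for a new input, 10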

Machine learning has a shallow and a deep end, much like a pool.

  • When a task is very basic, a shallow algorithm is used. Such algorithms are extremely cheap and common, and they get their name because they perform only a small number of steps when learning from and predicting on their data.

  • At the opposite end of the pool is deep learning. These algorithms use many steps of thought and have many different knobs and dials to tweak and fine-tune (these numbers are called parameters or weights because they represent how much value the model gives to each connection in its thinking). The more parameters they have, the more training the models need to become good at their task, with the added benefit of more sophisticated thinking and reasoning (there is a rough sketch of this after the list).
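
As a rough illustration of what ‘more parameters’ means, here is a short Python sketch (the layer sizes are arbitrary examples) that counts the knobs and dials in a shallow stack of layers versus a deeper one.

    # Count the weights and biases in a stack of fully connected layers.
    def count_parameters(layer_sizes: list[int]) -> int:
        total = 0
        for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
            total += inputs * outputs + outputs   # weights plus biases
        return total

    shallow = [4, 1]           # one small step from inputs to answer
    deep = [4, 64, 64, 64, 1]  # many steps, many more knobs and dials

    print(count_parameters(shallow))  # 5
    print(count_parameters(deep))     # 8705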

Generative AI

Generative AI is a class of models that focus on generating new content that ‘fits the bill’ of the data they are trained on. For example, if you train a generative AI model on images of toucans, it will get very good at giving you back completely new and different images of toucans. If you train it on high school essays, it will get very good at giving you back essays that look like they were written by high schoolers, with all their usual flaws. If you train it on the entirety of the internet, it will seem almost magically good at generating realistic professional emails, angry Twitter posts, and poetry in the style of Shakespeare.

Large Language Models (LLMs)

Most generative AI models that you would be familiar with – such as ChatGPT – are referred to as large language models (LLMs): they are massive models trained on almost all of the internet’s text. Data centres chug away for months on end to train these models to become highly skilled at reasoning and highly fluent conversationalists with their human users.

They run by taking a piece of text and repeatedly guessing the next best word. This word gets added to the chunk of text they are writing, and they rinse and repeat this process. Eventually you’re left with a sentence, a paragraph, or maybe an essay, depending on the prompt you gave in that original piece of text. Because these models are so large and comprehensive (hence their name), they are able to fluently create sentences and respond to queries.
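
The loop itself is simple enough to sketch in a few lines of Python. The predict_next_word function below is a hypothetical stand-in for the real thing – in an actual LLM it is a neural network with billions of parameters scoring every possible next word – but the rinse-and-repeat structure around it is the same.

    # Stand-in for the model: in a real LLM this would be a huge neural network.
    def predict_next_word(text: str) -> str:
        continuations = {
            "The weather today is": "mostly",
            "The weather today is mostly": "sunny",
            "The weather today is mostly sunny": ".",
        }
        return continuations.get(text, ".")

    text = "The weather today is"  # the prompt
    while not text.endswith("."):
        next_word = predict_next_word(text)                            # guess the next best word
        text = text + ("." if next_word == "." else " " + next_word)   # add it to the text
    print(text)  # The weather today is mostly sunny.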

The most common of these models are ChatGPT by OpenAI, Claude by Anthropic, Gemini by Google, and most recently R1 by DeepSeek. Whenever you use an AI assistant or chatbot on the internet, it is extremely likely that it is one of these few models packaged up to look different.

Looking Ahead

Ok, that’s enough for one post! In the next post, we’ll build on these terms to explain the current state of play for AI, and what’s in store this year.
