Chapter 05 · How it learns

How AI learns — without anyone teaching it directly

The word "learning" in AI refers to something precise: a system adjusting millions, often billions, of internal settings, automatically, until its outputs get reliably closer to correct. No teacher explains each answer. The system scores its own predictions against the known answers and corrects itself.

Here's the core loop. Show the system an example with a known correct answer — a photo labeled "cat," a sentence labeled "positive review." The system makes a prediction based on its current settings. It is almost certainly wrong at first. The system then computes how wrong it was and nudges its settings slightly in the right direction. Then repeat. Hundreds of millions of times.
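The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how production systems are built: the model has a single setting `w`, the examples are made-up (input, answer) pairs that follow the rule "answer = 3 × input", and the step size `learning_rate` is an arbitrary choice.

```python
# Toy training loop: one adjustable setting, four labeled examples.
# All names and numbers here are illustrative assumptions.

examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]  # (input, correct answer)

w = 0.0               # the single "internal setting", starting out wrong
learning_rate = 0.01  # how big each nudge is

for iteration in range(200):          # repeat
    for x, correct in examples:       # show an example
        guess = w * x                 # make a guess with the current setting
        error = guess - correct       # measure how wrong the guess was
        # Adjust: move w in the direction that shrinks the squared error.
        # The gradient of error**2 with respect to w is 2 * error * x.
        w -= learning_rate * 2 * error * x

print(round(w, 3))
```

Nothing in the code states that the rule is "multiply by 3". The setting drifts toward 3 only because each nudge reduces the measured error.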

The training loop — step by step

The training loop has five stages: Example, Guess, Measure, Adjust, Repeat.

Step 1 of 5 · Iteration 1: A labeled photo is shown: this is a dog. The system has never seen this exact image before.

Those internal settings — the ones being adjusted — are just numbers. Billions of them, each representing something like: "how much weight should I give this signal when making this prediction?" No one labels these numbers. The system discovers on its own which signals are predictive, because they reliably moved predictions in the right direction across millions of examples.
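To make "the system discovers on its own which signals are predictive" concrete, here is a small sketch with two settings. The data is invented for illustration: the first signal fully determines the answer and the second is pure noise. Nobody tells the loop which is which. The names `make_example` and `weights`, and the learning rate, are assumptions of this example.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def make_example():
    """Return (signals, correct_answer). Only signals[0] matters."""
    useful = random.uniform(-1, 1)
    noise = random.uniform(-1, 1)
    return [useful, noise], 2.0 * useful

# Random starting settings, one weight per signal.
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
learning_rate = 0.05

for _ in range(5000):
    signals, correct = make_example()
    guess = sum(w * s for w, s in zip(weights, signals))
    error = guess - correct
    # Nudge each weight in proportion to its signal's share of the error.
    for i, s in enumerate(signals):
        weights[i] -= learning_rate * 2 * error * s

print([round(w, 2) for w in weights])
```

After training, the first weight sits close to 2 and the second close to 0. The loop has worked out which signal to trust without being told.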

"After training on enough examples, the system has developed something no one designed: an implicit model of the world, encoded in billions of numbers, that nobody fully understands — including its creators."

What actually gets adjusted

The grid below represents a tiny slice of a model's internal settings, visualized as colors. Early in training they're random. As training proceeds they organize into patterns that encode what the model has learned.

Figure: Internal settings, before training (random) vs. after training (organized). Each cell is one of billions of numbers. Color = value. Before: noise. After: structure that encodes learned patterns.

01 · 📥 Show an example: A labeled input with the known correct answer.
02 · 🤔 Make a guess: The system predicts based on current settings.
03 · 📏 Measure error: How far off? Computed precisely.
04 · 🔧 Adjust: Nudge billions of settings slightly.
05 · 🔁 Repeat: Millions of times until accurate.
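Step 03 is where "how far off?" becomes a number. The text does not name a specific measure. One common choice for classification, assumed here, is cross-entropy: the negative logarithm of the probability the model assigned to the correct label. The function name `measure_error` and the probabilities below are made up for illustration.

```python
import math

def measure_error(predicted_probs, correct_label):
    """Cross-entropy for one example: small when the model put high
    probability on the correct label, large when it did not."""
    return -math.log(predicted_probs[correct_label])

early_guess = {"dog": 0.10, "cat": 0.90}  # early in training: confidently wrong
later_guess = {"dog": 0.95, "cat": 0.05}  # later in training: mostly right

print(measure_error(early_guess, "dog"))
print(measure_error(later_guess, "dog"))
```

The first number is much larger than the second. A single number like this is what the Adjust step works to reduce.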

Accuracy over training

Figure: Accuracy climbs, then levels off. The steep early rise reflects rapid learning from obvious patterns. The flattening reflects diminishing returns: the remaining errors are the hard ones.
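The same shape shows up even in a very small experiment. The sketch below is a stand-in, not a real system: it trains a two-setting classifier (logistic regression, chosen here for simplicity) on 41 invented inputs labeled 1 when `x > 0.5`, and records accuracy after every adjustment. The starting settings are deliberately bad, and the dataset, learning rate, and step count are all assumptions.

```python
import math

# Toy data (an illustrative assumption): 41 inputs, labeled 1 when x > 0.5.
xs = [i / 10 for i in range(-20, 21)]
ys = [1 if x > 0.5 else 0 for x in xs]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w, b):
    """Fraction of examples where the model's guess matches the label."""
    correct = sum((1 if w * x + b > 0 else 0) == y for x, y in zip(xs, ys))
    return correct / len(xs)

w, b = -1.0, 0.0      # deliberately bad starting settings
learning_rate = 1.0
steps = 3000

history = [accuracy(w, b)]
for _ in range(steps):
    # Average gradient of the logistic loss over all examples.
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    history.append(accuracy(w, b))

for step in (0, 10, 100, 300, 1000, 3000):
    print(step, round(history[step], 3))
```

Most of the improvement arrives in the first few hundred steps. The examples closest to the 0.5 cutoff are the hard ones, and they account for whatever errors remain late in training.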

Training the large language models behind today's chatbots required processing hundreds of billions to trillions of words of text, running on thousands of specialized chips, over months. The result is a system that has absorbed more written language than any human could read in thousands of lifetimes, but absorbed it in a way that is fundamentally different from how a human would understand it. What that difference means is one of the deepest open questions in the field.