Illustration 3 of 5

What Happens When You Ask

When you hit enter, the model doesn't look up an answer. It reads everything in front of it and generates a reply, one piece at a time. That single step is called inference — and it holds a surprise about memory.

Your question also becomes vectors.

As you saw in Illustration 1, words turn into numbers. Here's that happening to your question, the instant you hit enter — and where it leads.

You ask

When does the library open?

Split into tokens

Each token looked up as a vector

(showing 4 of ~1,500 numbers per token)

One enormous calculation

vectors × billions of trained weights
= millions of multiply–add operations, all at once

Out comes a probability for the next word

It picks one, adds it to the reply, and runs the whole calculation again for the next word. No search. No lookup. Just math.

This whole act — your words in, a calculated next-word out — is called inference. And it raises a question: if answering is just a calculation over “the words in front of it,” what exactly counts as the words in front of it?

The model remembers nothing. It re-reads everything.

Send each message and watch what the model has to do first. The conversation isn't stored in its mind — every turn, the whole thing is handed back and read again from the top.

Your conversation {{ reread.windowLabel }}

An empty chat. Press Send and follow what happens on every single turn.

The model

holds nothing between turns

Re-read this turn

Notice the last turn. To answer “what were our goals?” the model re-read all four of your messages from the top. It didn't remember your goals — it read them again, exactly like every turn before.

This is what people mean when they say a model is stateless: it keeps nothing on its own. Everything it “knows” about your conversation is re-presented to it, in full, every single turn.

There's only so much it can hold: the context window.

All that re-reading happens inside a space of a fixed size — the context window. It's large, but finite. Keep adding, and the oldest messages fall out the back.

Context window {{ win.fillLabel }}

#{{ m.n }} {{ m.text }}

Fell out of the window

Nothing yet — there's still room.

#{{ m.n }} {{ m.text }}

So how does it ever “remember” you?

A fresh chat starts empty — the model has never met you. Memory isn't the model remembering. It's the app saving a note and pasting it into the top of every new window — and you decide what's worth saving.

A note about how you like to work

“I keep one Single Source of Truth. Never duplicate the canonical record — point to it.”

From memory: keeps a Single Source of Truth; never duplicate the canonical record.

A personal note — my experience

Memory compounds. The more intentional you are about what you save, the sharper your collaborator gets over time.

My collaborator and I locked the Doctrine of Single Source of Truth into memory. It now echoes across dozens of other notes — so when I start to drift from it, the AI pushes back: “wait, we committed to this.” Memory isn't a filing cabinet you forget about. It's a growing set of shared commitments — and it's worth tending deliberately.

One turn, start to finish.

You've seen what really happens when you ask: the model re-reads the whole window and generates a reply — it stores nothing on its own. Next we'll see why that same process can produce an answer that's confidently, completely wrong.

Back to The Science of AI