Session 1 · H5 · 15 min
Simulate Memory by Sending Context
What you'll learn
- Maintain a growing list of messages across turns
- Replay prior turns so the model "remembers" them
- See why long conversations cost more tokens
What you will build
A working chatbot loop. You keep a Python list called history. Every time the user types, you append their message, call the API with the full history, then append the model's reply. The model "remembers" because you keep feeding it everything.
The mental model
Every chatbot works this way
ChatGPT, Claude, every assistant you have ever used — under the hood is a growing messages list that gets resent in full on every turn. Memory is an illusion YOU maintain on the client side.
The code
src/05_context_memory_demo.py (excerpt)
history = [  ①
    {"role": "system", "content": "You are a friendly assistant."},
]

def turn(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})  ②
    r = client.responses.create(
        model=model,
        input=history,  ③
    )
    reply = r.output_text.strip()
    history.append({"role": "assistant", "content": reply})  ④
    return reply

print(turn("My name is Arya."))
print(turn("What is my name?"))  ⑤

① The history list starts with just the system message.
② Append the new user turn BEFORE calling the API.
③ Send the ENTIRE history on every call, not just the latest message.
④ Append the assistant reply so it becomes part of the next call.
⑤ This time the model replies "Your name is Arya." because turn 1 is in the history.
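The excerpt assumes `client` and `model` are already set up. To watch the mechanics without spending any tokens, you can run the identical loop against a stand-in client that simply reports how many messages it was sent. The stub below is purely illustrative, not part of the OpenAI SDK; it just mimics the `responses.create` / `output_text` shape the excerpt relies on:

```python
class StubResponse:
    def __init__(self, text):
        self.output_text = text

class StubClient:
    """Fake client: replies with the number of messages it received."""
    class responses:
        @staticmethod
        def create(model, input):
            return StubResponse(f"I received {len(input)} messages.")

client = StubClient()
model = "stub-model"  # placeholder name; a real script would use an actual model

history = [
    {"role": "system", "content": "You are a friendly assistant."},
]

def turn(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    r = client.responses.create(model=model, input=history)
    reply = r.output_text.strip()
    history.append({"role": "assistant", "content": reply})
    return reply

print(turn("My name is Arya."))   # stub sees 2 messages (system + user)
print(turn("What is my name?"))   # stub sees 4 messages -- the history grew
```

Each call sees a longer input than the last, which is exactly the "memory" the real demo produces: the model only knows your name on turn 2 because turn 1 is still in the list you send.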
Run it
$ python src/05_context_memory_demo.py
The hidden cost
Long chats grow linearly
Turn 10 resends all 10 prior turns; turn 50 resends all 50. Each call's input grows linearly with conversation length, so your cumulative token bill grows quadratically. Production chatbots summarise or truncate old turns to keep costs down.
- Turn 1 cost: ~100 input tokens
- Turn 10 cost: 1,000+ tokens
- Turn 50 cost: 10,000+ tokens
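You can make the growth concrete with a back-of-the-envelope estimate. The figure of ~100 tokens per turn below is an illustrative assumption (real conversations vary, and replies themselves often lengthen over time), but it shows the shape of the curve: per-call input grows linearly, so the running total grows quadratically.

```python
TOKENS_PER_TURN = 100  # assumption: rough average tokens per user/assistant pair

def input_tokens(turn_number: int) -> int:
    # Turn n resends every prior turn plus the new user message.
    return turn_number * TOKENS_PER_TURN

for n in (1, 10, 50):
    print(f"Turn {n}: ~{input_tokens(n):,} input tokens")

# Cumulative input tokens billed across turns 1..50.
total = sum(input_tokens(n) for n in range(1, 51))
print(f"Total after 50 turns: ~{total:,} input tokens")
```

Even with this flat per-turn estimate, the 50-turn conversation bills over a hundred thousand input tokens in total, which is why trimming old history matters in production.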
Try it yourself
- Add a third turn that asks "what did I tell you about myself?" and watch the model pull from turn 1.
- Remove the history.append(assistant) line. How does the behaviour break?
- Add a print(len(history)) after every turn to see the list grow.
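As a further experiment, you could bolt a truncation step onto the loop, the simplest version of what production chatbots do to cap costs. This sketch keeps the system message plus only the most recent messages; the window size of 6 is an arbitrary assumption, and real systems often summarise the dropped turns instead of discarding them outright:

```python
def truncate_history(history: list, max_messages: int = 6) -> list:
    """Keep the system message plus the last `max_messages` messages."""
    system, rest = history[:1], history[1:]
    return system + rest[-max_messages:]

# Example: build a 21-message history (1 system + 10 user/assistant pairs).
long_history = [{"role": "system", "content": "You are a friendly assistant."}]
for i in range(10):
    long_history.append({"role": "user", "content": f"question {i}"})
    long_history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = truncate_history(long_history)
print(len(trimmed))  # 7: the system message plus the last three exchanges
```

Calling `truncate_history(history)` just before each API call bounds the input size, at the cost of the model "forgetting" anything older than the window, such as a name given on turn 1.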
Knowledge check
Knowledge Check
In a chatbot with memory, what do you send on turn 10?
Code Check
Why does history need BOTH the user and the assistant message appended each turn?
Recap — what you just learned
- ✓ You simulate memory by keeping a growing messages list on your side
- ✓ Append BOTH the user turn and the assistant reply each round
- ✓ Every API call resends the full history — there is no "session"
- ✓ Long chats cost more tokens linearly — production apps summarise old turns