Fasttrack Course

SESSION 1

Session 1 — OpenAI SDK Fundamentals

Text, vision, image generation, and text-to-speech from scratch.

2.5–3 hours•11 hands-on exercises

What you'll have built by the end

✓ A working Python script that chats with GPT from the command line
✓ A vision tool that can describe or compare local images
✓ An image generator that saves DALL·E output to disk
✓ A text-to-speech tool that produces mp3 files
✓ A multimodal pipeline that turns one sentence into a scene-by-scene illustrated audiobook

Prerequisites — tick these off first

Completed Session 0 or at least the What is an LLM page Python 3.10+ installed Session1 folder with .venv activated and requirements.txt installed OPENAI_API_KEY set in Session1/.env VS Code with the Python extension and the .venv interpreter selected

Learning arc

The 11 exercises split into 5 progressive groups. Each builds on the one before — do them in order the first time.

1. Text

H1–H3

2. Memory

H4–H5

3. Vision

H6–H7

4. Generation

H8–H9

5. Integration

H10–H11

Exercises

1. Text

Basic Text Generation

Parameter Lab — temperature, top_p, max_tokens

Prompt Roles — system vs user messages

2. Memory

No Memory Between API Calls

Simulate Memory by Sending Context

3. Vision

Image Analysis (Vision)

Compare Two Images

4. Generation

Image Generation (DALL·E / gpt-image-1)

5. Integration

Mini CLI App — everything combined

Multimodal Story Pipeline (capstone)

When you get stuck or finish

Troubleshooting

Every error a Session 1 student is likely to hit, and the exact fix.

End-of-session quiz

10 questions to confirm you can move to Session 5 confidently.