Session 10 · 03 · 13 min

Structure & Demonstrate

What you'll learn
  • Apply numbered requirements, markdown headings, and XML tags to add structure to prompts
  • Choose between zero-shot, one-shot, and few-shot based on task complexity and example quality
  • Explain when few-shot examples help versus hurt model performance

Structure: making requirements scannable

Models tend to follow requirements more reliably when they are set apart in headings and numbered lists than when they are buried in prose, much as a human skimming documentation notices headings and lists first and paragraphs last. Move every hard requirement into a numbered list or under a clear heading.

Three structural tools

  • Numbered requirements ("1. No trailing commas. 2. UTF-8 encoding."): make each rule independently skimmable and auditable
  • Markdown headings (##, ###): partition the prompt into named sections so the model is less likely to confuse the task, the rules, and the examples
  • XML tags (<task>, <rules>, <example>): a stronger visual boundary than markdown, useful when the task itself contains markdown

XML tags beat markdown when the task contains markdown
If your task description includes code blocks or headers, using XML tags for the prompt structure helps keep the model from conflating your prompt structure with the task content. Use <task>...</task> and <rules>...</rules> in those cases.
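As a minimal sketch of the structural tools above, the snippet below wraps a task and a numbered rule list in XML tags. The function name `build_prompt` and the sample task text are hypothetical and chosen only for illustration; the tag names are the ones used in this lesson.

```python
from typing import List, Optional


def build_prompt(task: str, rules: List[str], example: Optional[str] = None) -> str:
    """Wrap each prompt section in its own XML tag and number the rules."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    sections = [
        f"<task>\n{task.strip()}\n</task>",
        f"<rules>\n{numbered}\n</rules>",
    ]
    if example is not None:
        sections.append(f"<example>\n{example.strip()}\n</example>")
    return "\n\n".join(sections)


# Hypothetical task text that contains its own markdown heading. The XML
# tags keep that heading visibly separate from the prompt's structure.
task = "Summarize the README below.\n\n## Installation\nRun the installer."
prompt = build_prompt(task, ["No trailing commas.", "UTF-8 encoding."])
print(prompt)
```

Because the markdown heading sits inside `<task>`, it reads as task content rather than as a new prompt section.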

Demonstrate: zero-shot, one-shot, few-shot

  • Zero-shot (0 examples): instructions only. Fast to write, works well for clear formats.
  • One-shot (1 example): one input/output pair. Clarifies ambiguity in format without much token cost.
  • Few-shot (3-5 examples): multiple examples. Best for nuanced output where instructions alone are insufficient.

Zero-shot is the right default. Add examples only when you see consistent format errors that instructions alone do not fix.
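The three modes differ only in how many input/output pairs travel with the instructions. The sketch below makes that explicit with a single `shots` parameter. The function name `render_prompt`, the `Input:`/`Output:` labels, and the date-formatting task are assumptions made for illustration, not a prescribed format.

```python
from typing import List, Tuple


def render_prompt(
    instructions: str,
    examples: List[Tuple[str, str]],
    query: str,
    shots: int = 0,
) -> str:
    """Build a zero-, one-, or few-shot prompt from the first `shots` examples."""
    if shots < 0 or shots > len(examples):
        raise ValueError("shots must be between 0 and len(examples)")
    parts = [instructions.strip()]
    for source, expected in examples[:shots]:
        parts.append(f"<example>\nInput: {source}\nOutput: {expected}\n</example>")
    # The real input always comes last, after any examples.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Hypothetical task and examples.
instructions = "Reformat the date as D Mon YYYY."
examples = [
    ("2024-01-05", "5 Jan 2024"),
    ("2023-12-31", "31 Dec 2023"),
    ("1999-07-04", "4 Jul 1999"),
]

zero_shot = render_prompt(instructions, examples, "2020-02-29")           # the default
one_shot = render_prompt(instructions, examples, "2020-02-29", shots=1)
few_shot = render_prompt(instructions, examples, "2020-02-29", shots=3)
```

Starting at `shots=0` and raising it only after an eval shows repeated format errors keeps the token cost tied to a measured need.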

When few-shot helps

  • The output format is unusual or has subtle constraints that are hard to describe in words
  • The task involves a style or tone the model has not seen enough of in pre-training
  • You have real production examples that are guaranteed correct

When few-shot hurts

  • Your examples do not match the production format exactly — the model will copy the example format, not your instructions
  • Examples are cherry-picked easy cases — the model learns a narrow distribution and fails on hard cases
  • You are using placeholder or synthetic examples that contain subtle errors
  • The examples push the prompt toward the context limit, so the instructions get truncated or crowded out

Examples must match production format exactly
If your production output is a bare JSON object but your example shows pretty-printed JSON with a trailing newline, the model will emit that pretty-printing and trailing newline in real use. A grader that trims whitespace before comparing hides the mismatch during evals. Pull examples from real passing cases in your eval dataset, not from memory.
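One way to enforce this is to filter candidate examples mechanically before they reach the prompt. The sketch below assumes, purely for illustration, that production output is compact single-line JSON with no surrounding whitespace, and that each eval case is a dict with hypothetical `input`, `output`, and `passed` keys. Adjust both assumptions to your own pipeline.

```python
import json
from typing import Dict, List


def is_production_format(output: str) -> bool:
    """True if `output` is exactly the compact JSON assumed for production."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    # Re-serialize compactly and compare character for character, so
    # pretty-printing or a trailing newline counts as a mismatch.
    # Note: this rough check also rejects unusual number spellings like 1e3.
    compact = json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)
    return output == compact


def pick_examples(eval_cases: List[Dict], k: int) -> List[Dict]:
    """Keep up to k passing cases whose raw output matches production format."""
    usable = [
        case for case in eval_cases
        if case["passed"] and is_production_format(case["output"])
    ]
    return usable[:k]


# Hypothetical eval cases.
cases = [
    {"input": "a", "output": '{"id":1}', "passed": True},
    {"input": "b", "output": '{\n  "id": 2\n}\n', "passed": True},   # pretty-printed
    {"input": "c", "output": '{"id":3}', "passed": False},           # failed the eval
]
chosen = pick_examples(cases, k=3)
```

Here only case `a` survives: case `b` passed the eval but is pretty-printed with a trailing newline, and case `c` never passed.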
Knowledge Check

Your eval shows consistent failures on one-line regex outputs: the model wraps them in backticks. What is the best first fix?

Recap: what you just learned
  • Numbered lists and headings make requirements scannable and reduce omission errors
  • XML tags are the safest structural delimiter when the task itself contains markdown
  • Zero-shot is the default; add examples only when evals show consistent format errors that instructions alone do not fix
  • Examples must be pulled from real passing eval cases — synthetic examples propagate subtle format errors