Performance & Cost Comparison
- Compare LLM call counts across multi-agent patterns for different scenarios
- Understand how call count relates to cost and latency
- Know which pattern is best-in-class for each scenario
Architecture diagrams look great on a whiteboard, but what actually matters is how many LLM calls a pattern makes and how many tokens it burns. More calls mean more latency and more cost. Let's put numbers on it.
Scenario 1: One-shot task
A single, focused request — like "What is the weather in Paris?" — that requires one tool call and one response. This is the simplest benchmark.
| | Subagents | Handoffs | Skills | Router |
|---|---|---|---|---|
| LLM calls | 4 | 3 | 3 | 3 |
Subagents require an extra call because the supervisor must first decide which subagent to invoke before the subagent itself runs. The other patterns all settle at 3 calls for a one-shot task.
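One way to make the call counts concrete is to write out a plausible call trace per pattern. The step names below are illustrative, not taken from any specific framework; only the totals come from the table above.

```python
# Hypothetical call traces for the one-shot weather task. Each list entry
# is one LLM call; step names are illustrative.
CALL_TRACES = {
    "subagents": [
        "supervisor: pick the weather subagent",  # the extra routing call
        "subagent: decide to call the weather tool",
        "subagent: read the tool result",
        "subagent: write the final answer",
    ],
    "handoffs": [
        "agent: decide to call the weather tool",
        "agent: read the tool result",
        "agent: write the final answer",
    ],
    "skills": [
        "agent: invoke the weather skill",
        "agent: read the skill result",
        "agent: write the final answer",
    ],
    "router": [
        "router: classify the query",
        "agent: decide to call the weather tool",
        "agent: write the final answer",
    ],
}

llm_calls = {pattern: len(trace) for pattern, trace in CALL_TRACES.items()}
```

The supervisor's routing call is what pushes Subagents to 4; every other pattern resolves the task in 3.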
Scenario 2: Repeat query (same domain)
Two consecutive questions that hit the same domain — like asking about weather in Paris, then weather in London. Can the pattern reuse context from the first query?
| | Subagents | Handoffs | Skills | Router |
|---|---|---|---|---|
| LLM calls | 8 | 5 | 5 | 6 |
Handoffs and Skills shine here because they can maintain context within the active agent. Subagents pay the supervisor tax twice. Router has a slight overhead because it re-classifies the second query.
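The second-query savings can be modeled as a simple per-query cost, inferred from the two tables so far (a sketch, not a measurement of any real system):

```python
def repeat_query_calls(pattern: str) -> int:
    """Total LLM calls for two same-domain queries, per the tables above."""
    # First query: everyone pays full price (Subagents include the supervisor).
    first = {"subagents": 4, "handoffs": 3, "skills": 3, "router": 3}
    # Second query: Handoffs/Skills keep the active agent and skip routing,
    # Subagents pay the supervisor again, Router re-classifies.
    second = {"subagents": 4, "handoffs": 2, "skills": 2, "router": 3}
    return first[pattern] + second[pattern]
```

This reproduces the 8 / 5 / 5 / 6 split: the only variable is whether the routing step repeats on the second query.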
Scenario 3: Multi-domain task
A complex request that spans multiple domains — "Book a flight to Paris and find the weather there." This tests how patterns handle cross-domain coordination.
| | Subagents | Handoffs | Skills | Router |
|---|---|---|---|---|
| LLM calls | 5 | 7+ | 3 | 5 |
| Tokens | ~9K | 14K+ | ~15K | ~9K |
Here the trade-offs get interesting. Skills use the fewest calls (the parent agent calls both skill agents in one turn) but consume more tokens because each skill carries its own context. Handoffs are the most expensive because they must transfer conversational context sequentially across agents.
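Dividing the table's token figures by the call counts makes the "fewer calls, heavier calls" trade-off visible (the "+" entries are treated as lower bounds here):

```python
# (LLM calls, total tokens) for the multi-domain scenario, from the table.
scenario3 = {
    "subagents": (5, 9_000),
    "handoffs": (7, 14_000),   # lower bound: table says 7+ calls, 14K+ tokens
    "skills": (3, 15_000),
    "router": (5, 9_000),
}

tokens_per_call = {p: tokens // calls for p, (calls, tokens) in scenario3.items()}
# Skills average ~5,000 tokens per call vs ~2,000 for Handoffs: each skill
# invocation carries its own context, so calls are few but heavy.
```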
Best-in-class by scenario

| Scenario | Winner(s) | Why |
|---|---|---|
| One-shot task | Handoffs, Skills, Router (3 calls) | No supervisor routing overhead |
| Repeat query (same domain) | Handoffs, Skills (5 calls) | Context reuse within the active agent |
| Multi-domain task | Skills (fewest calls) or Subagents/Router (fewest tokens) | Depends on whether latency or token cost dominates |
The cost formula
For any pattern, the total cost is roughly:
Total cost = (number of LLM calls) × (avg input tokens × input price + avg output tokens × output price)
Input tokens are usually 3-10x cheaper than output tokens, so patterns that generate long outputs at every step (like handoffs passing full conversation history) pay a higher tax per call.
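The formula translates directly into code. The prices below are hypothetical placeholders ($3/M input, $15/M output, a 5x ratio within the 3-10x range mentioned above), not real provider pricing:

```python
def total_cost(calls: int,
               avg_in_tokens: float, avg_out_tokens: float,
               in_price_per_token: float, out_price_per_token: float) -> float:
    """Cost formula from the text; prices are in dollars per token."""
    return calls * (avg_in_tokens * in_price_per_token
                    + avg_out_tokens * out_price_per_token)

# Example: a handoff-style run with 7 calls, 2,000 input tokens and
# 300 output tokens per call, at hypothetical $3/M in, $15/M out.
cost = total_cost(7, 2_000, 300, 3e-6, 15e-6)
```

Note how the output-token term ($0.0045/call here) rivals the input term ($0.006/call) despite being far fewer tokens; patterns that generate long outputs per step feel this asymmetry most.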