Session 5· 15· 15 min

Parallel Tool Calls

What you'll learn
  • See the model request multiple tools in a single turn
  • Execute them together and return all results at once
  • Understand when parallel calling saves round-trips

Ask the model "what is the weather in Paris, Tokyo, and Delhi?" and it will return THREE tool_calls in a single response. You execute all three, append three tool messages, and call the API once for the combined answer. One round-trip saved per extra parallel call.

15_multi_tool_parallel_calls.py
msg = response.choices[0].message
messages.append(msg)

for call in msg.tool_calls:                                      ①
    args = json.loads(call.function.arguments)
    result = route(call.function.name, args)                     ②
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,                                  ③
        "content": json.dumps(result),
    })

final = client.chat.completions.create(
    model=model, messages=messages, tools=tools
)
The same loop from lesson 14 — just iterating naturally handles however many tool_calls come back.
route() is your generic router from lesson 14.
tool_call_id is crucial here — it is how each result matches its request.
$ python 15_multi_tool_parallel_calls.py
Speedup
If the model needs 3 tools, parallel calling does them in ONE round-trip instead of 3. For network-bound tools (APIs, DBs) this can be 3x faster. You can even run your router calls concurrently with asyncio if the tools are I/O bound.
Knowledge Check
Why does tool_call_id matter more with parallel calls?
Recap — what you just learned
  • The model may request multiple tools in one assistant turn
  • Iterate all of them and append ALL results before calling the API again
  • tool_call_id links each result to its request
  • Huge speedup for independent tool calls
Next up: 16 — Mini Assistant