Agentic Architecture, Lesson 1.7

Multi-Agent Communication & Handoffs

How agents pass context, results, and control between each other.

25 min

Learning Objectives

Design handoff protocols between agents
Manage shared state in multi-agent systems
Implement escalation and fallback patterns

In multi-agent systems, agents don't work in isolation — they need to communicate, share context, transfer control, and handle failures gracefully. This lesson covers the mechanics of inter-agent communication: how agents pass information to each other, what context to preserve during handoffs, and how to design escalation and fallback patterns that keep the system reliable even when individual agents fail.

Handoff Protocols

A handoff occurs when one agent transfers control of a conversation or task to another agent. The quality of a handoff depends on the information included in the transfer. A well-designed handoff protocol includes:

Conversation history: The full or summarized conversation so the receiving agent understands what has already happened.
Task context: What the user is trying to accomplish, what has been tried so far, and what remains to be done.
Classification metadata: Why the handoff was triggered — the routing category, confidence score, or specific trigger condition.
Constraints: Any limitations the receiving agent should respect — time limits, permission boundaries, or scope restrictions.
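
The four components above could travel together in one structured payload. The sketch below shows one possible shape; HandoffPayload and all of its field names are hypothetical and are not part of any SDK.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class HandoffPayload:
    """Hypothetical container for the context passed between agents."""

    # Conversation history: full messages, or a summary of them
    history: list
    # Task context: the goal, what was tried, what remains
    goal: str
    attempted: list = field(default_factory=list)
    remaining: list = field(default_factory=list)
    # Classification metadata: why the handoff was triggered
    category: str = "general"
    confidence: Optional[float] = None
    trigger: str = ""
    # Constraints the receiving agent should respect
    constraints: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the payload so it can be logged or sent to another agent."""
        return json.dumps(asdict(self), indent=2)


payload = HandoffPayload(
    history=[{"role": "user", "content": "I was charged $49.99 instead of $29.99."}],
    goal="Correct the overcharge on the customer's plan",
    attempted=["Confirmed the expected plan price with the customer"],
    remaining=["Verify the charge", "Issue a refund if warranted"],
    category="billing",
    confidence=0.92,
    trigger="billing keywords detected",
    constraints={"max_turns": 10, "can_issue_refunds": True},
)
print(payload.to_json())
```

Keeping the payload as plain data makes it easy to log for audits and to validate before the receiving agent starts work.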

What to Include vs. What to Summarize

Including the full conversation history ensures no information is lost, but it can consume significant context window space. The decision of what to include verbatim vs. what to summarize depends on:

Context window budget: If the receiving agent needs most of its context window for its own work (e.g., analyzing a large document), the handoff payload must be compact.
Information density: If the conversation is short and information-dense, include it verbatim. If it's long and mostly chit-chat, summarize.
Safety requirements: For sensitive conversations (medical, legal, financial), verbatim transfer may be required for audit and compliance reasons.
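
One way to apply these criteria is a small helper that compares the estimated size of the history against a budget. This is a minimal sketch under stated assumptions: the four-characters-per-token estimate is a rough heuristic, the function names are hypothetical, and the summarizer is a placeholder where a real system would likely call a model. Information density is not modeled here; only the budget and the compliance flag are.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def build_history_for_handoff(messages, token_budget, summarize, require_verbatim=False):
    """Return (mode, history), where mode is 'verbatim' or 'summary'.

    Compliance takes priority: when require_verbatim is set, the full
    history is passed even if it exceeds the budget.
    """
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if require_verbatim or total <= token_budget:
        return "verbatim", messages
    return "summary", [{"role": "system", "content": summarize(messages)}]


def naive_summarize(messages):
    """Placeholder summarizer: keeps the first 80 characters of each user message."""
    user_lines = [m["content"][:80] for m in messages if m["role"] == "user"]
    return "Summary of earlier conversation: " + " | ".join(user_lines)


messages = [
    {"role": "user", "content": "Hi there! " * 50},
    {"role": "assistant", "content": "Happy to help. " * 50},
    {"role": "user", "content": "My invoice shows $49.99 but my plan is $29.99."},
]

mode, history = build_history_for_handoff(
    messages, token_budget=100, summarize=naive_summarize
)
print(mode, len(history))
```

With a budget of 100 estimated tokens the long greeting pushes the history over the limit, so the helper falls back to a summary; passing require_verbatim=True would return the original messages unchanged.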

The Handoff Primitive in the Agent SDK

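In the Agent SDK used in this example, handoffs are declared by listing the target agents in the handoffs parameter. When the current agent decides a handoff is needed, it generates a special tool call that transfers control to the chosen agent. Note that the agents package imported below (Agent, Runner, handoffs) matches the Python interface of the OpenAI Agents SDK rather than the Claude Agent SDK, so running Claude models through it typically requires a model-provider adapter, and the model strings shown should be read as placeholders: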

from agents import Agent, Runner

# Define specialist agents
billing_agent = Agent(
    name="Billing Specialist",
    model="claude-sonnet-4-5-20250514",
    instructions="""You handle billing inquiries. You have access to account
and payment information. Be precise about timelines for refunds (3-5 business
days). If the issue requires a manager, say so clearly.""",
)

technical_agent = Agent(
    name="Technical Support",
    model="claude-sonnet-4-5-20250514",
    instructions="""You handle technical issues. Diagnose systematically:
ask for error messages, reproduction steps, and environment details.
If the issue is beyond your scope, escalate clearly.""",
)

general_agent = Agent(
    name="General Support",
    model="claude-sonnet-4-5-20250514",
    instructions="You handle general inquiries that don't fall into billing or technical.",
)

# Triage agent that routes via handoffs
triage_agent = Agent(
    name="Triage",
    model="claude-sonnet-4-5-20250514",
    instructions="""You are the first point of contact. Determine the nature
of the customer's issue and hand off to the appropriate specialist:

- Billing/payment issues → Billing Specialist
- Technical/product bugs → Technical Support
- Everything else → General Support

Before handing off, summarize the customer's issue in one sentence so the
specialist has immediate context.""",
    handoffs=[billing_agent, technical_agent, general_agent],
)

# The full conversation context is passed to the receiving agent
import asyncio

async def main():
    # Runner.run takes the starting agent and the input (a string or a list of messages)
    result = await Runner.run(
        triage_agent,
        input=[{
            "role": "user",
            "content": "I've been charged $49.99 but my plan should be $29.99. "
                       "Also, the dashboard is showing an error when I try to "
                       "view my usage stats."
        }]
    )
    print(result.final_output)

asyncio.run(main())

Note the challenge in this example: the customer has both a billing issue and a technical issue. The triage agent must decide whether to hand off to one specialist (and have that specialist handle both), or handle the issues sequentially. This kind of ambiguity is exactly what the exam tests.

Shared State Management

In multi-agent systems, state management is one of the hardest problems. Key questions include:

Where does state live? In the conversation history (implicit), in a shared data store (explicit), or in the handoff payload (ephemeral)?
Who can modify state? Should workers be able to update shared state, or should only the orchestrator have write access?
How is state synchronized? If multiple agents run in parallel, how do you handle concurrent modifications?

The simplest approach — and the one Anthropic's documentation recommends — is to use the conversation history as the primary state. Each agent's output becomes part of the history that subsequent agents can read. For more complex state, use an external store (database, key-value cache) that agents can read from and write to via tools.
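
For the external-store case, the concurrency question can be addressed with version checks on every write (optimistic concurrency). The sketch below is an in-memory stand-in for a real database or cache; SharedStateStore, VersionConflict, and the key format are all hypothetical names chosen for illustration.

```python
import threading


class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""


class SharedStateStore:
    """Hypothetical in-memory key-value store with per-key version numbers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); unknown keys read as (0, None)."""
        with self._lock:
            return self._data.get(key, (0, None))

    def write(self, key, value, expected_version):
        """Write only if nobody else has written since the caller's read."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                raise VersionConflict(
                    f"{key}: expected v{expected_version}, found v{current_version}"
                )
            self._data[key] = (current_version + 1, value)
            return current_version + 1


store = SharedStateStore()

# Two agents read the same key before either one writes.
version_a, _ = store.read("ticket:123")
version_b, _ = store.read("ticket:123")

store.write("ticket:123", {"status": "refund_pending"}, expected_version=version_a)

try:
    # The second agent still holds the stale version and is rejected.
    store.write("ticket:123", {"status": "closed"}, expected_version=version_b)
except VersionConflict as exc:
    print("conflict:", exc)
```

A rejected writer can re-read the key and decide whether its update still applies, which avoids silently overwriting another agent's work. Restricting write access to the orchestrator is a simpler alternative when parallel writes are not needed.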

Escalation Patterns

Escalation occurs when an agent determines it cannot handle a task and transfers it to a higher authority — a more capable agent, a human operator, or a different system entirely. Well-designed escalation patterns include:

Confidence-based escalation: The agent assesses its confidence in handling the task. If confidence is below a threshold, escalate. This can be implemented by asking the model to include a confidence score in its response.
Rule-based escalation: Certain keywords, topics, or customer types automatically trigger escalation regardless of the agent's confidence (e.g., legal threats, safety concerns, VIP customers).
Timeout escalation: If an agent fails to resolve the task within a specified number of turns or time limit, escalate automatically.
Explicit model request: The agent's instructions tell it to escalate when it encounters situations outside its training or tools.
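
The four triggers above can be combined into a single check that runs after each agent turn. The following is a sketch under stated assumptions: the threshold, turn limit, keyword list, and the AgentTurn structure are illustrative values and names, not recommendations or SDK types.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative values only; tune these for your own system.
CONFIDENCE_THRESHOLD = 0.7
MAX_TURNS = 6
ESCALATION_KEYWORDS = ("lawyer", "lawsuit", "unsafe", "injury")


@dataclass
class AgentTurn:
    """Hypothetical record of one agent turn."""

    user_message: str
    confidence: float  # self-reported by the model, 0.0 to 1.0
    turn_number: int
    requested_escalation: bool = False  # the agent explicitly asked to escalate
    is_vip: bool = False


def escalation_reason(turn: AgentTurn) -> Optional[str]:
    """Return why the turn should be escalated, or None to continue."""
    text = turn.user_message.lower()
    # Rule-based checks run first: they apply regardless of confidence.
    if turn.is_vip or any(keyword in text for keyword in ESCALATION_KEYWORDS):
        return "rule"
    if turn.requested_escalation:
        return "model_request"
    if turn.confidence < CONFIDENCE_THRESHOLD:
        return "low_confidence"
    if turn.turn_number > MAX_TURNS:
        return "timeout"
    return None


print(escalation_reason(AgentTurn("I will call my lawyer", confidence=0.95, turn_number=1)))
print(escalation_reason(AgentTurn("Where is my invoice?", confidence=0.4, turn_number=2)))
print(escalation_reason(AgentTurn("Where is my invoice?", confidence=0.9, turn_number=2)))
```

Evaluating the rules in code, outside the model, means a legal threat or safety concern escalates even when the agent reports high confidence.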

Fallback Mechanisms

Fallbacks handle the case where a specialist agent fails — it crashes, returns an error, or produces output that fails validation. Effective fallback strategies include:

Retry with the same agent: For transient errors (API timeout, rate limit), retry the same agent with the same context.
Fall back to a general agent: If a specialist fails, route to a generalist that can provide a basic response rather than leaving the user without any answer.
Fall back to a human: For critical workflows where automated fallbacks are insufficient, create a ticket or notification for human review.
Graceful degradation: Return a partial result with a clear explanation of what couldn't be completed, rather than returning nothing.
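
These strategies can be chained: retry the specialist on transient errors, fall back to a generalist, and finally open a ticket for a human while returning a partial answer. The sketch below assumes agents are plain callables; TransientError, run_with_fallbacks, and the ticket function are hypothetical stand-ins for whatever your system provides.

```python
import time


class TransientError(Exception):
    """Stand-in for retryable failures such as API timeouts or rate limits."""


def run_with_fallbacks(task, specialist, generalist, create_ticket,
                       max_retries=2, base_delay=0.5, sleep=time.sleep):
    """Try the specialist (with retries), then the generalist, then a human."""
    for attempt in range(max_retries + 1):
        try:
            return {"status": "ok", "source": "specialist", "output": specialist(task)}
        except TransientError:
            if attempt < max_retries:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
        except Exception:
            break  # non-transient failure: retrying will not help

    try:
        return {"status": "ok", "source": "generalist", "output": generalist(task)}
    except Exception:
        # Graceful degradation: say what could not be completed and hand to a human.
        ticket_id = create_ticket(task)
        return {
            "status": "degraded",
            "source": "human",
            "output": f"We could not complete this automatically. Ticket {ticket_id} was opened.",
        }


calls = {"n": 0}


def flaky_specialist(task):
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return f"Resolved: {task}"


result = run_with_fallbacks(
    "refund request",
    specialist=flaky_specialist,
    generalist=lambda task: f"General guidance for: {task}",
    create_ticket=lambda task: "TICKET-1",
    sleep=lambda seconds: None,  # skip real delays in this demo
)
print(result)
```

Only transient errors are retried; a validation failure or crash moves straight to the generalist, since repeating the same call with the same context is unlikely to change the outcome.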

Exam Tip: Multi-agent handoff questions test your understanding of context management. A common scenario describes a system where context is lost during handoff, causing the receiving agent to ask the customer to repeat information. The correct answer involves including conversation history or a structured summary in the handoff payload. Another common pattern: the exam asks what to do when a customer has multiple issue types (billing + technical) — do you hand off to one specialist, handle sequentially, or create a meta-agent that coordinates?

Key Takeaways

Handoff quality depends on context transfer. Include conversation history, task context, classification metadata, and constraints. Decide what to include verbatim vs. summarize based on context window budget and information density.

The Agent SDK handles handoffs natively through the handoffs parameter. Control transfers to the receiving agent with the full conversation context.

Use the conversation history as primary state for simplicity. Move to an external store accessible via tools only when the state outgrows the history, and decide up front which agents may write to it.

Design escalation and fallback patterns for every agent. Confidence-based escalation, timeout limits, and graceful degradation ensure the system remains reliable even when individual agents fail.