🏗️ Agentic Architecture · Lesson 1.2

Prompt Chaining Pattern

Sequential multi-step workflows where each step feeds the next.

25 min

Learning Objectives

  • Implement prompt chains for multi-step tasks
  • Design gate checks between chain steps
  • Handle errors and branching in chains

Prompt Chaining Pattern

Prompt chaining is the foundational building block of agentic workflows. It is the pattern of decomposing a complex task into a sequence of discrete LLM calls, where the output of each step becomes the input to the next. Prompt chaining sits at the simpler end of the agentic spectrum — it is fully deterministic in its control flow, developer-defined in its structure, and highly predictable in its behavior. Understanding it deeply is essential before moving to more dynamic patterns like routing and parallelization.

What Is Prompt Chaining?

In a prompt chain, a complex task is broken into a series of subtasks. Each subtask is handled by a separate LLM call, and the output of each call is passed — often transformed or filtered — into the prompt of the next call. The chain is defined by the developer: the number of steps, the purpose of each step, and the transformations between steps are all hard-coded in application logic.

This is fundamentally different from an agent. In an agent, the model decides what to do next. In a prompt chain, the developer decides — the model is simply executing each well-defined subtask as instructed. This makes prompt chains easier to test, debug, and audit.
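The developer-defined control flow described above can be sketched in a few lines. This is a minimal illustration, not the lesson's full example: the `llm` function here is a stub standing in for a real model call, and its canned responses are invented for demonstration.

```python
def llm(prompt: str) -> str:
    # Stub standing in for a real model call; returns canned responses
    # so the control flow itself is easy to see.
    if prompt.startswith("Summarize"):
        return "Customer requests a refund for a duplicate charge."
    return "billing"

def two_step_chain(ticket: str) -> dict:
    # Step 1: the developer decides this call happens first.
    summary = llm(f"Summarize this ticket in one sentence:\n{ticket}")
    # Step 2: the output of step 1 is embedded in the next prompt.
    # The model never chooses the next step -- the code does.
    category = llm(f"Classify this summary into a category: {summary}")
    return {"summary": summary, "category": category}

result = two_step_chain("I was charged twice this month.")
```

Note that the sequence of calls is fixed in `two_step_chain` itself: swapping, adding, or removing a step is an ordinary code change, which is exactly what makes chains easy to test and audit.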

Why Break Tasks Into Steps?

  • Focused instructions: A model given a single step with a narrow scope performs better than a model asked to do many things at once in a single prompt.
  • Intermediate validation: Between steps, the application can check whether the output meets quality criteria before proceeding, rather than discovering failures only at the very end.
  • Easier debugging: When something goes wrong, you can inspect the output of each individual step to identify exactly where the failure occurred.
  • Context management: Large tasks may require more context than fits in a single call. Chaining allows each step to work with a focused subset of information.

Gate Checks: Quality Control Between Steps

One of the most valuable features of prompt chaining is the ability to insert gate checks between steps. A gate check is a validation step — either an LLM call or deterministic code — that evaluates the output of the previous step before allowing the chain to continue.

Common gate check patterns include:

  • Format validation: Confirm that the output is valid JSON, contains required fields, or matches an expected schema before passing it downstream.
  • Content quality check: Ask a second LLM call to evaluate whether the previous step's output is accurate, relevant, or sufficiently detailed.
  • Safety screening: Before passing user-influenced content to a downstream step, check for prompt injection attempts or policy violations.
  • Confidence threshold: If a classification step returns a low-confidence result, route to a fallback or escalate to a human rather than proceeding.
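A gate check implemented as deterministic code is often the cheapest option. The sketch below shows a format-validation gate of the kind described in the first bullet; the required field names are a hypothetical schema chosen for illustration.

```python
import json

# Hypothetical schema for this illustration -- your chain would define its own.
REQUIRED_FIELDS = {"urgency", "summary"}

def format_gate(raw: str):
    """Deterministic gate check: return (passed, parsed_data).

    Passes only if the output is valid JSON containing every required field.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, None
    if not isinstance(data, dict) or not REQUIRED_FIELDS.issubset(data):
        return False, None
    return True, data

ok, data = format_gate('{"urgency": "high", "summary": "Duplicate charge"}')
bad, _ = format_gate("not json at all")
```

Because the gate is plain code rather than a second LLM call, it adds no token cost and its behavior is fully deterministic, which makes it a good first line of defense before any content-quality or safety check.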

Error Handling and Conditional Branching

Prompt chains can include conditional logic. Based on the output of one step, the application may branch to different subsequent steps, retry a step with a modified prompt, or terminate the chain early and return an error to the user.

Key error-handling patterns in prompt chains:

  • Retry with clarification: If a step fails validation, re-invoke it with additional context or a corrected prompt rather than passing bad output downstream.
  • Graceful degradation: If an optional enrichment step fails, skip it and continue with the core output rather than failing the entire chain.
  • Early exit: If a gate check determines the input is invalid or out-of-scope, return immediately with a user-friendly error rather than wasting tokens on subsequent steps.
  • Maximum retries: Cap the number of retry attempts to prevent infinite loops in cases where a step repeatedly fails validation.
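The retry-with-clarification and maximum-retries patterns above can be combined in a small wrapper. This is a sketch under assumptions: `flaky_step` is a stand-in that fails validation once and then succeeds, simulating a model that needs a clarified prompt.

```python
def run_with_retries(step, validate, max_retries=2):
    """Run a chain step, retrying with a clarification up to max_retries times."""
    clarification = ""
    for attempt in range(max_retries + 1):
        output = step(clarification)
        if validate(output):
            return output
        # Retry with clarification: tighten the instruction on the next attempt.
        clarification = " Respond with ONLY the category name."
    # Retries exhausted -- the caller decides: early exit or graceful degradation.
    return None

# Stub step: returns a malformed answer first, a clean one once clarified.
calls = []
def flaky_step(clarification):
    calls.append(clarification)
    return "Category: billing" if not clarification else "billing"

result = run_with_retries(flaky_step, lambda out: out in {"billing", "technical"})
```

The cap on attempts is what prevents the infinite-loop failure mode from the last bullet: if the step never passes validation, the chain fails loudly with `None` instead of looping forever.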

Code Example: A 3-Step Prompt Chain

The following example implements a three-step chain for processing customer support tickets: first classify the issue type, then extract the relevant details, and finally format a structured response. A gate check between steps 1 and 2 ensures we only proceed with recognized issue categories.

import anthropic
import json

client = anthropic.Anthropic()

def llm(prompt: str, system: str = "") -> str:
    """Helper: single LLM call returning the text response."""
    kwargs = {"model": "claude-opus-4-5", "max_tokens": 1024,
              "messages": [{"role": "user", "content": prompt}]}
    if system:
        kwargs["system"] = system
    response = client.messages.create(**kwargs)
    return response.content[0].text.strip()

# ---------- Step 1: Classify the issue ----------
def classify_issue(ticket: str) -> str:
    prompt = f"""Classify the following customer support ticket into exactly one category.
Respond with ONLY the category name, nothing else.

Categories: billing, technical, account, shipping, other

Ticket: {ticket}"""
    return llm(prompt)

# ---------- Gate check between steps 1 and 2 ----------
KNOWN_CATEGORIES = {"billing", "technical", "account", "shipping", "other"}

def validate_category(category: str) -> bool:
    return category.lower().strip() in KNOWN_CATEGORIES

# ---------- Step 2: Extract structured details ----------
def extract_details(ticket: str, category: str) -> dict:
    prompt = f"""Extract the key details from this {category} support ticket.
Return a JSON object with these fields:
- urgency: "low", "medium", or "high"
- customer_action_required: true or false
- summary: one-sentence summary of the issue

Ticket: {ticket}

Respond with valid JSON only."""
    raw = llm(prompt)
    # The model may wrap its JSON in markdown code fences or surround it
    # with prose; pull out just the JSON object before parsing.
    import re
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    return json.loads(match.group(0) if match else raw)

# ---------- Step 3: Format the response ----------
def format_response(category: str, details: dict) -> str:
    prompt = f"""You are a customer support agent. Write a professional, empathetic response
to a customer who submitted a {category} ticket with the following details:

Summary: {details['summary']}
Urgency: {details['urgency']}
Customer action required: {details['customer_action_required']}

Keep the response to 3-4 sentences."""
    return llm(prompt, system="You are a helpful, professional customer support agent.")

# ---------- Run the full chain ----------
def process_ticket(ticket: str) -> str:
    # Step 1: Classify
    category = classify_issue(ticket)
    print(f"[Step 1] Category: {category}")

    # Gate check
    if not validate_category(category):
        return f"Error: Unrecognized category '{category}'. Manual review required."

    # Step 2: Extract
    try:
        details = extract_details(ticket, category)
        print(f"[Step 2] Details: {details}")
    except json.JSONDecodeError as e:
        return f"Error: Failed to parse structured details. {e}"

    # Step 3: Format
    response = format_response(category, details)
    print("[Step 3] Response generated.")
    return response

# Example usage
ticket = """
I was charged twice for my subscription this month. I can see two identical
charges of $29.99 on my credit card statement dated March 1st. This needs
to be resolved urgently as I'm on a tight budget.
"""

print(process_ticket(ticket))

Notice the gate check after Step 1: if the classifier returns an unrecognized category, we exit early rather than passing garbage into Step 2. The JSON parsing in Step 2 is wrapped in a try/except for the same reason — bad output from one step should not silently corrupt the next.

When to Use Chains vs. Single Prompts

Prompt chaining adds complexity — more API calls, more latency, more code to maintain. Use it only when the task genuinely benefits from decomposition:

  • Use a single prompt when the task is small enough to fit comfortably in one call, the instructions are clear and unambiguous, and there is no intermediate output worth validating separately.
  • Use a chain when a single prompt becomes unwieldy (too many instructions, competing concerns), when intermediate outputs need to be validated or stored, or when different steps benefit from different system prompts or model configurations.
  • Use a chain over an agent when the steps are fully known in advance. If you can enumerate exactly what needs to happen, a chain is more predictable and easier to test than an agent that makes its own decisions.

Exam Tip: Prompt chaining is described in Anthropic's documentation as the simplest agentic pattern. On the exam, it is distinguished from routing and parallelization by its strictly sequential nature — each step depends on the output of the previous one. If a question describes steps that could run simultaneously or independently, that points toward parallelization (lesson 1.3), not chaining. If a question describes choosing between different paths based on input type, that is routing (also 1.3). Chaining is linear: A feeds B feeds C.

Key Takeaways

Prompt chaining decomposes complex tasks into sequential LLM calls where each step's output feeds the next. The control flow is developer-defined, making chains predictable and easy to test.

Gate checks between steps are a best practice for catching errors early. Validate format, content quality, and safety before passing output downstream — it is far cheaper to fail at step 1 than to discover a problem at step 5.

Choose chains over agents when the steps are known in advance. If you can hard-code the workflow, you should — it is simpler, faster, cheaper, and easier to debug than an autonomous agent.

Error handling in chains should include retry logic with caps, graceful degradation for optional steps, and early exits when inputs are invalid. Never silently pass bad output from one step into the next.