How to Stop Your AI Agent From Looping Forever: A Practical Guide to Termination Conditions

The termination condition problem isn't an edge case to handle after the agent is built — it's a foundational design decision. Getting it wrong produces agents that work perfectly in testing and fail expensively in production.

Meritshot · 13 min read
Tags: AI Agents, LLM, AI Engineering, Agentic AI, Data Science

The agent looked fine in testing. Fifteen clean steps, the right tool calls, a coherent final answer delivered in under 90 seconds.

Then you deployed it. A user submitted a slightly ambiguous request. The agent started researching, found partial results, decided it needed more, searched again, found more partial results, decided those weren't sufficient either, and continued this cycle for 23 minutes before exhausting the token limit and returning nothing.

No crash. No error. Just a loop that ran until it ran out of room.

This is the termination condition problem — and it's not caused by the LLM being confused, the tools being broken, or the prompt being poorly written. It's caused by a gap in the agent's architecture: there is no mechanism that reliably answers the question "am I done?"

Most developers treat this as an edge case to handle after the agent is built. It's actually a foundational design decision that shapes every aspect of how the agent behaves. Getting it wrong doesn't produce errors — it produces agents that work perfectly in controlled environments and fail expensively in production.


Why Termination Is Harder Than It Looks

Every developer building their first agent installs a step counter. If the agent hasn't finished in 20 steps, stop it. Problem solved.

The step counter stops the loop. It does not fix the agent.

An agent that reliably hits its step limit on 15% of complex queries has a 15% failure rate with a ceiling on how expensive each failure is. That's marginally better than a runaway agent, but it's not a functioning system. Users receive timeout errors instead of answers.

The deeper problem is that "termination condition" describes three separate mechanisms that most implementations conflate into one:

1. Completion detection — recognizing when the task has been achieved successfully

2. Progress detection — recognizing when the agent is no longer making progress toward the task

3. Safety termination — stopping when the agent enters a state that should never continue regardless of progress

Each of these requires a different implementation. Each fails in a different way. And the combination of all three is what actually produces an agent that terminates reliably rather than one that either loops forever or stops prematurely.


Completion Detection: The Mechanism That's Almost Always Missing

Completion detection fails for one reason more than any other: the task was specified in language that does not have an observable completion state.

"Research this topic thoroughly." "Analyze the competitive landscape comprehensively." "Find the best solution." "Make sure we have everything we need."

None of these have a state the agent can check. The agent cannot determine whether it has researched thoroughly enough, analyzed comprehensively enough, or found the best solution. So it continues.

The scenario that demonstrates the cost:

A venture capital firm deploys a research agent to produce investment memos on target companies. The prompt instructs the agent to "research the company and produce a comprehensive memo covering all relevant aspects."

The agent produces a draft memo. Then it decides the financial analysis could be more thorough. It researches further. It decides the competitive landscape section is incomplete. It researches further. It decides it should check recent news. Seventeen tool calls later, the agent is writing about a competitor of the wrong company, because its completion criteria allowed unlimited scope expansion.

The fix is not a better prompt. The fix is an explicit completion state.

What an explicit completion state looks like:

Instead of "research comprehensively," the specification becomes:

  • "Produce a memo containing exactly these seven sections: [list]"
  • "Each section requires at least two distinct source citations"
  • "Financial section requires revenue, EBITDA, and growth rate data"
  • "The memo is complete when all seven sections are populated with required data"

Now the agent can check. At any point in its execution, it can ask: have I populated all seven sections? Do all sections have required citations? If yes on all counts — done. If no — continue with the specific incomplete sections.

The completion state design principle:

A well-defined completion state has three properties:

  1. Enumerable — the required outputs can be listed, counted, or checked against a schema
  2. Binary per component — each component is either complete or not complete, with no subjective middle ground
  3. Checkable by the agent — the agent can evaluate each component without human judgment

When a completion state lacks any of these properties, the agent will interpret ambiguity as incompleteness and continue.
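The three properties can be sketched as a small checker for the memo example. This is a minimal illustration, not a definitive implementation: the section names, the `Section` type, and the function names are hypothetical, and only the citation and financial-data requirements come from the specification above.

```python
from dataclasses import dataclass, field

# Hypothetical section names; the article only says "seven sections".
REQUIRED_SECTIONS = [
    "executive_summary", "financials", "competitive_landscape",
    "team", "market", "risks", "recommendation",
]
MIN_CITATIONS = 2  # "at least two distinct source citations"
REQUIRED_FINANCIAL_FIELDS = {"revenue", "ebitda", "growth_rate"}


@dataclass
class Section:
    text: str = ""
    citations: list = field(default_factory=list)
    data: dict = field(default_factory=dict)


def section_complete(name: str, section: Section) -> bool:
    """Binary check for one component: populated, cited, and with required data."""
    if not section.text.strip():
        return False
    if len(set(section.citations)) < MIN_CITATIONS:
        return False
    if name == "financials" and not REQUIRED_FINANCIAL_FIELDS.issubset(section.data):
        return False
    return True


def incomplete_sections(memo: dict) -> list:
    """Enumerable: list exactly which required sections still need work."""
    return [
        name for name in REQUIRED_SECTIONS
        if not section_complete(name, memo.get(name, Section()))
    ]


def memo_complete(memo: dict) -> bool:
    """Checkable by the agent: no human judgment involved."""
    return not incomplete_sections(memo)
```

Because `incomplete_sections` returns names rather than a score, the agent can be pointed at the specific gaps instead of being told to "keep researching."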


Progress Detection: Catching the Loops That Look Like Work

Completion detection stops the agent when it's done. Progress detection stops the agent when it's stuck — when it's taking actions that look legitimate but are not advancing toward the goal.

This is the harder problem because the agent is, by definition, doing something. It's searching. It's retrieving. It's generating. The actions look normal in isolation. The failure only becomes visible when you look at the trajectory across many steps.

The two signatures of progress failure:

The Spinning Loop: The agent takes the same action repeatedly with minor variations. It searches for "market size electric vehicles," doesn't find a satisfying result, searches "EV market size 2024," doesn't find a satisfying result, searches "electric vehicle market revenue," and continues this cycle. Each query is different. Each action is technically valid. Zero net progress.

The Wandering Loop: The agent takes genuinely different actions that each appear to make partial progress, but the actions don't converge toward the completion state. It retrieves data that fills one section, then decides another section needs research first, then decides it needs context for that research, then follows a reference to a different topic entirely. Hours of activity, and the completion state is no closer.

Both loops are invisible to a step counter and invisible to completion detection. Only progress tracking catches them.

How progress detection works in practice:

Progress detection requires maintaining a structured state representation that tracks how close the agent is to the completion state at each step.

For the investment memo example:

{
  "completion_state": {
    "section_1_executive_summary": "complete",
    "section_2_financials": "in_progress",
    "section_3_competitive_landscape": "not_started",
    "section_4_team": "not_started",
    "section_5_market": "not_started",
    "section_6_risks": "not_started",
    "section_7_recommendation": "not_started"
  },
  "progress_delta": 1,
  "consecutive_zero_progress_steps": 0
}

After each action, the agent updates this state and calculates the progress delta: how many sections moved from incomplete to complete in this step? If the progress delta is zero for three consecutive steps, the agent is looping — it's taking actions but not completing sections.
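The update rule described above can be sketched in a few lines. This assumes the state object has the same shape as the JSON example; `record_step` and `is_looping` are hypothetical helper names, and the threshold of three comes from the paragraph above.

```python
def count_complete(completion_state: dict) -> int:
    """Number of sections currently marked complete."""
    return sum(1 for status in completion_state.values() if status == "complete")


def record_step(state: dict, new_statuses: dict) -> dict:
    """Return the state after one agent step, with delta and counter updated."""
    before = count_complete(state["completion_state"])
    completion = {**state["completion_state"], **new_statuses}
    delta = count_complete(completion) - before
    # Any step that completes nothing extends the zero-progress streak.
    zero_steps = 0 if delta > 0 else state["consecutive_zero_progress_steps"] + 1
    return {
        "completion_state": completion,
        "progress_delta": delta,
        "consecutive_zero_progress_steps": zero_steps,
    }


def is_looping(state: dict, threshold: int = 3) -> bool:
    """True once the zero-progress streak reaches the threshold."""
    return state["consecutive_zero_progress_steps"] >= threshold
```

Note that a section moving from not_started to in_progress counts as zero progress here, which matches the definition above: only sections that become complete move the delta.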

The non-obvious implementation detail:

The progress state must be maintained in external storage — not in the context window. Agents that maintain progress state only in their conversation history will lose it when the context window fills, will "forget" they've already completed certain sections, and will redo completed work.


Safety Termination: The Rules That Must Never Be Broken

Completion detection stops the agent when it's done. Progress detection stops it when it's stuck. Safety termination stops it when it's about to do something it should never do, regardless of whether that action would advance the task.

The scenario that demonstrates why safety termination is non-negotiable:

A DevOps automation agent is tasked with optimizing cloud infrastructure costs. The agent correctly identifies that several staging environment instances are running 24/7 but used only during business hours. It terminates them during off-hours. That's correct. It then identifies that a database backup instance has been idle for 72 hours. It terminates that too. The "backup" database was actually the primary production database for a third-party integration that queries it once a week. The integration fails. The downstream service reports outages.

The agent completed its task as specified. The completion detection worked. The progress detection worked. What was missing was a safety rule: "Never terminate instances that have received any production traffic in the last 30 days."

The categories of safety termination rules:

Irreversibility rules — actions that cannot be undone if wrong: "Never delete data without confirmation." "Never send external communications without human approval."

Resource escalation rules — actions that create unbounded cost exposure: "Never make more than 100 API calls in a single task." "Never commit more than $50 of cloud spend per run."

Scope boundary rules — actions that exceed the defined task boundary: "Never access systems not listed in the authorized scope." "Never follow reference chains that lead outside the initial domain."

Temporal rules — conditions that require time-bounded action: "Always terminate after N minutes of wall-clock time, regardless of progress."

The design discipline:

Safety termination rules should be specified before the agent architecture is designed, not after an incident reveals the missing rule. For each tool and capability the agent has access to, ask: "What is the worst thing this tool could do if used incorrectly or in an unexpected context?" The answer is the safety rule for that tool.
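One way to make such rules hard limits rather than suggestions is a guard that every tool call passes through. The sketch below is an assumption-laden illustration: `ToolGuard`, `SafetyViolation`, and the tool names `delete_data` and `send_email` are hypothetical, and the 100-call budget mirrors the example rule above.

```python
class SafetyViolation(Exception):
    """Raised when a tool call would break a safety rule; the loop should stop and report."""


class ToolGuard:
    def __init__(self, max_calls=100, needs_confirmation=("delete_data", "send_email")):
        self.max_calls = max_calls
        # Hypothetical names for irreversible tools that need human approval.
        self.needs_confirmation = frozenset(needs_confirmation)
        self.calls = 0

    def check(self, tool_name: str, confirmed: bool = False) -> None:
        """Raise before execution if the call breaks a rule; otherwise count it."""
        if tool_name in self.needs_confirmation and not confirmed:
            raise SafetyViolation(f"{tool_name} requires human confirmation")
        if self.calls >= self.max_calls:
            raise SafetyViolation(f"call budget of {self.max_calls} exhausted")
        self.calls += 1

    def run(self, tool_name, tool_fn, *args, confirmed=False, **kwargs):
        """Execute tool_fn only if the safety check passes."""
        self.check(tool_name, confirmed=confirmed)
        return tool_fn(*args, **kwargs)
```

The check happens before the tool runs, so a violation stops the action itself instead of being discovered afterwards in a log.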


Designing Termination Conditions Before Writing Agent Code

The most consistent error in agent development is treating termination conditions as something to add after the agent is built and tested. Termination conditions need to be designed before any other component — because they shape how every other component is designed.

The pre-design specification process:

Step 1: Define the completion state

Before writing a single line of agent code, write out the exact specification of a completed task. Use the three-property test: is it enumerable, binary per component, and checkable without human judgment? Revise until all three properties are met.

Step 2: Define the progress tracking structure

Given the completion state, define the state object that tracks progress toward it. Each component of the completion state should map to a field in the progress tracking object with clear status values (not_started, in_progress, complete, failed).

Step 3: Define progress failure thresholds

Specify: how many consecutive zero-progress steps trigger a reroute? How many trigger escalation? How many trigger termination with partial output? These thresholds depend on the task: a research task tolerates more exploration before a reroute than a data retrieval task does.
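Writing the thresholds down as data keeps them reviewable and task-specific. In this sketch the numbers are purely illustrative assumptions, as are the names `ProgressThresholds` and `next_action`; the right values have to come from observing the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgressThresholds:
    reroute_at: int    # zero-progress steps before trying a different approach
    escalate_at: int   # zero-progress steps before flagging the blocked criterion
    terminate_at: int  # zero-progress steps before returning partial output


# Illustrative values only: research tolerates more exploration than retrieval.
RESEARCH = ProgressThresholds(reroute_at=3, escalate_at=5, terminate_at=8)
RETRIEVAL = ProgressThresholds(reroute_at=1, escalate_at=2, terminate_at=3)


def next_action(zero_progress_steps: int, thresholds: ProgressThresholds) -> str:
    """Map the current zero-progress streak to the action the orchestrator takes."""
    if zero_progress_steps >= thresholds.terminate_at:
        return "terminate"
    if zero_progress_steps >= thresholds.escalate_at:
        return "escalate"
    if zero_progress_steps >= thresholds.reroute_at:
        return "reroute"
    return "continue"
```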

Step 4: Define safety rules per tool

For each tool in the agent's toolkit, specify the safety rules that govern its use. These become explicit constraints in the system prompt and, where possible, hard limits in the tool implementation itself.

Step 5: Define temporal limits

Specify wall-clock time limits and token budget limits that terminate the agent regardless of progress. These are backstop limits — they should almost never trigger in a well-designed agent, but they prevent catastrophic runaway in edge cases.

Step 6: Define partial output behavior

When the agent terminates without completing the task, what does it return? A partial output with explicit incompleteness notation is almost always better than nothing.
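A partial output can carry its own incompleteness notation in a structured form. The following is a minimal sketch with hypothetical field names; it assumes section statuses use the values from Step 2.

```python
def build_partial_output(sections: dict, statuses: dict, reason: str) -> dict:
    """Package finished sections and list what is missing and why the run stopped."""
    complete = [name for name, status in statuses.items() if status == "complete"]
    missing = [name for name, status in statuses.items() if status != "complete"]
    return {
        "status": "partial" if missing else "complete",
        "termination_reason": reason,
        # Only sections marked complete are returned as content.
        "sections": {name: sections[name] for name in complete if name in sections},
        # Everything else is reported with its last known status.
        "incomplete": {name: statuses[name] for name in missing},
    }
```

A caller can then render the `incomplete` map as an explicit note at the top of the memo instead of silently returning a shorter document.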


The System Prompt Architecture for Termination Control

The completion detection logic, progress failure handling, and safety rules all need to be expressed in the system prompt — not just documented externally.

The components of a termination-aware system prompt:

1. The explicit completion criteria section:

TASK COMPLETION CRITERIA:
This task is complete when ALL of the following conditions are met:
1. Section A: [specific content] — STATUS: [check from state]
2. Section B: [specific content] — STATUS: [check from state]
3. Section C: [specific content] — STATUS: [check from state]

After EVERY action, check each criterion and update the state.
When all criteria are met, produce the final output and stop immediately.

2. The progress failure protocol:

PROGRESS MONITORING:
Track whether each action advances at least one incomplete criterion.
Count consecutive actions that produce no progress toward any criterion:
- At 1-2 consecutive zero-progress actions: reroute and try a different approach
- At 3-4 consecutive zero-progress actions: escalate and flag the blocked criterion
- At 5 consecutive zero-progress actions: terminate and return what is complete with explicit notation

3. The safety constraints section:

ABSOLUTE CONSTRAINTS (these override task completion):
NEVER: [list specific forbidden actions]
ALWAYS: [list required conditions for specific tool uses]
IMMEDIATELY STOP AND REPORT if: [list safety trigger conditions]
These constraints cannot be overridden by any task specification.

4. The stopping reward:

This is the prompt engineering detail that most agents are missing. LLMs tuned with RLHF tend to lean toward producing more output: continuing, being helpful, generating more. Explicitly framing stopping as the desired outcome counteracts this action bias:

STOPPING BEHAVIOR:
Producing a complete output that meets all criteria and stopping immediately is the ideal outcome.
Producing a partial output with accurate incompleteness notation is preferable to continuing past the zero-progress threshold.
Continuing to take actions after all criteria are met is an error.
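The "[check from state]" placeholders in the completion criteria section imply that the prompt is regenerated from stored state before each step. One possible way to do that is sketched below; the function name and the `|` separator are assumptions, and the fixed wording is copied from the template above.

```python
def render_completion_criteria(criteria: dict, statuses: dict) -> str:
    """Build the TASK COMPLETION CRITERIA section from the current state."""
    lines = [
        "TASK COMPLETION CRITERIA:",
        "This task is complete when ALL of the following conditions are met:",
    ]
    for index, (name, description) in enumerate(criteria.items(), start=1):
        # Criteria with no recorded status are treated as not started.
        status = statuses.get(name, "not_started")
        lines.append(f"{index}. {name}: {description} | STATUS: {status}")
    lines += [
        "",
        "After EVERY action, check each criterion and update the state.",
        "When all criteria are met, produce the final output and stop immediately.",
    ]
    return "\n".join(lines)
```

Rendering from state means the model sees the current status on every step even if earlier turns have been truncated from the conversation.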

Implementing Termination in Code: The Infrastructure That Enforces What Prompts Request

System prompt termination conditions are necessary but not sufficient. LLMs are probabilistic — they will occasionally ignore prompt instructions on edge cases. The infrastructure layer enforces termination conditions deterministically.

The three infrastructure components:

1. The state manager:

An external state store — a Redis instance, a database record, or even a structured JSON file — that persists the progress state across every agent step. The state manager writes the current completion state after every action and makes it available for injection into the next prompt.

The state manager is the authoritative record of progress, not the conversation history. The conversation history can overflow and be truncated. The state manager is always current.
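For the simplest of the options mentioned, a structured JSON file, a state manager might look like the sketch below. `JsonStateManager` is a hypothetical name; the write-to-temp-then-replace step is an added precaution so that a crash mid-write does not leave a half-written state file.

```python
import json
import os
import tempfile
from pathlib import Path


class JsonStateManager:
    """Persists progress state to a JSON file outside the context window."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self, default: dict) -> dict:
        """Return the stored state, or the default if nothing was saved yet."""
        if not self.path.exists():
            return default
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, state: dict) -> None:
        """Write to a temp file in the same directory, then swap it into place."""
        fd, tmp_name = tempfile.mkstemp(dir=str(self.path.parent), suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
        os.replace(tmp_name, self.path)
```

Calling `save` after every action and `load` before building the next prompt keeps the file, not the conversation history, as the record of progress.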

2. The action fingerprinter:

A function that hashes each tool call — tool name plus parameters — before execution and compares it against the last N fingerprints. If a matching fingerprint is found within N steps, the spinning loop condition is triggered.

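This catches the exact-repeat form of the spinning loop automatically, independent of the LLM's self-assessment. Reworded queries hash differently, so near-duplicate actions still have to be caught by the zero-progress counter.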
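A fingerprinter along these lines needs little more than the standard library. In this sketch the class name, the window size of 5, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json
from collections import deque


def fingerprint(tool_name: str, params: dict) -> str:
    """Hash the tool name plus parameters; sort_keys makes key order irrelevant."""
    payload = json.dumps({"tool": tool_name, "params": params},
                         sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ActionFingerprinter:
    def __init__(self, window: int = 5):
        # Only the last `window` fingerprints are kept; older ones fall off.
        self.recent = deque(maxlen=window)

    def is_repeat(self, tool_name: str, params: dict) -> bool:
        """Return True if this exact call was seen within the window, then record it."""
        fp = fingerprint(tool_name, params)
        seen = fp in self.recent
        self.recent.append(fp)
        return seen
```

Exact hashing only flags identical calls. A query reworded from "market size electric vehicles" to "EV market size 2024" produces a different fingerprint, so this check complements the progress counter and does not replace it.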

3. The orchestration watchdog:

A process that runs alongside the agent loop and monitors three metrics independently:

  • Wall-clock time since task start
  • Total tokens consumed since task start
  • Consecutive zero-progress steps

When any metric exceeds its threshold, the watchdog terminates the agent loop and triggers the partial output generation, regardless of what the agent is currently doing.
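The three checks can be sketched as a small class. As a simplifying assumption, this version is polled from inside the agent loop rather than run as a separate process, and the default limits are placeholders; the `clock` parameter exists only so the timing logic can be tested.

```python
import time


class Watchdog:
    def __init__(self, max_seconds=300.0, max_tokens=200_000,
                 max_zero_progress_steps=5, clock=time.monotonic):
        self.max_seconds = max_seconds
        self.max_tokens = max_tokens
        self.max_zero_progress_steps = max_zero_progress_steps
        self.clock = clock
        self.started_at = clock()  # wall-clock reference for the whole task

    def check(self, tokens_used: int, zero_progress_steps: int):
        """Return the name of the first exceeded limit, or None to keep going."""
        if self.clock() - self.started_at > self.max_seconds:
            return "wall_clock_limit"
        if tokens_used > self.max_tokens:
            return "token_budget_limit"
        if zero_progress_steps >= self.max_zero_progress_steps:
            return "zero_progress_limit"
        return None
```

A loop would call `check` before every step and, on any non-None result, break out and hand that reason to the partial output generation. A polled check cannot interrupt a step that is already hanging, which is one reason to run the watchdog as an independent process in a real deployment.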

The interaction between infrastructure and prompt:

The infrastructure layer and the system prompt should both implement termination logic, redundantly rather than as alternatives. The system prompt asks the agent to terminate itself when conditions are met. The infrastructure layer terminates the agent if it does not. This defense-in-depth approach covers both cases: the model ignoring its instructions, and the model following them but misjudging its own progress.


Closing: Termination Design as the Foundation of Agent Reliability

The step counter is the symbol of agent design that treats termination as an afterthought. It limits the cost of failure without addressing the cause of failure. Agents that hit their step limit on complex tasks have failed on those tasks — the failure is just bounded rather than unbounded.

Real termination reliability requires three mechanisms working together: completion detection that makes "done" observable and binary, progress detection that catches stalled loops before they exhaust resources, and safety termination that enforces constraints that should hold regardless of task pressure.

Designing these three mechanisms before writing any agent code — before defining tool schemas, before writing the system prompt, before building state management — is what separates agents that work in testing from agents that work in production.

At Meritshot, the AI Engineering curriculum covers agent architecture including termination design as a first-class engineering concern, not an afterthought. Students build agents that are designed to stop correctly, not just start correctly.
