Agent Loop Evolution: pi-mono, LangChain/LangGraph, OpenClaw, Hermes

#agent-loop #comparative #evolution #langchain #langgraph #pi-mono #openclaw #hermes

Agent Loop Evolution: Comparative Analysis

Frameworks covered, roughly chronological by design generation:

Framework	Language	Role	Loop style
LangChain ReAct (legacy)	Python	Library (loop)	Class inheritance + implicit while
pi-mono (`agentLoop`)	TypeScript	Library (loop)	Nested while + EventStream
Hermes (`AIAgent`)	Python	Library (loop)	Class-based while + ThreadPoolExecutor
LangChain LangGraph (modern)	Python	Library (loop)	Explicit `StateGraph` DAG
OpenClaw	TypeScript	Harness (wraps pi-mono)	Gateway + hooks over pi-mono's loop
Anthropic Managed Agents	—	Arch pattern (orthogonal)	Brain/Hands/Session decoupling

Architectural caveat: The first four rows are loop implementations — they define the actual turn cycle. OpenClaw is a harness/gateway that delegates its loop entirely to pi-mono (runEmbeddedPiAgent). What OpenClaw contributes is the gateway layer: channel adapters, session management, a rich hook pipeline, and the operator-facing surface above the loop. Comparing OpenClaw's "loop" directly against LangGraph's loop conflates two different abstraction levels. The per-dimension rows below for OpenClaw describe its gateway contributions, not a competing loop design. Anthropic Managed Agents is similarly an architectural pattern, not a loop library.

Dimension-by-dimension comparison

1. Loop representation

Framework	Loop form	Where it lives
LangChain ReAct	`while True`	`AgentExecutor.run()` — invisible to the agent class
pi-mono	Nested `outer / inner while`	`agentLoop()` function body
Hermes	`while` in `run_conversation()`	`AIAgent` class method
OpenClaw (harness)	No own loop — delegates to pi-mono	`runEmbeddedPiAgent`; gateway layer sits above
LangGraph	Explicit DAG: `model → tools → model`	Compiled `StateGraph`; edges are functions

Evolution (loop implementations only): The loop starts hidden (implicit in a framework base class), becomes explicit code (nested whiles), and finally becomes a first-class data structure (a graph whose edges are named, inspectable routing functions). Each stage makes routing decisions more legible and testable. OpenClaw does not contribute a loop form — it inherits pi-mono's nested-while and adds gateway machinery around it.

2. How the loop exits

Framework	Exit condition
LangChain ReAct	LLM text contains `"Final Answer:"`
pi-mono	`stopReason == error\|aborted`; no steering or follow-up messages remain
Hermes	No tool calls in response; OR 90-turn budget exhausted
OpenClaw (harness)	pi-mono's exit conditions apply; `before_agent_reply` hook can suppress a turn at the gateway layer
LangGraph	`tool_calls == []` in `AIMessage`; OR `return_direct`; OR `structured_response`; OR `jump_to="end"`

Evolution: From a text-signal parsed by string matching ("Final Answer:") → implicit loop exhaustion (no messages, no tool calls) → explicit budget caps → typed conditional routing functions that enumerate all possible exits. LangGraph is the first to make exits a first-class concern: each exit condition is a separate branch in _make_model_to_tools_edge.

3. Tool execution model

Framework	Parallel?	Mechanism	Constraints
LangChain ReAct	No	Synchronous, sequential	Hard-coded to
pi-mono	Configurable	`Promise.all` on `execute()` calls after sequential `prepareArguments`	Any `BaseTool`; `beforeToolCall` can block
Hermes	Yes (N>1)	`ThreadPoolExecutor`; 1 tool → main thread	Interactive tools forced sequential
OpenClaw (harness)	Delegates to pi-mono	Adds `block`/`rewrite` semantics via gateway hooks	Same as pi-mono underneath
LangGraph	Yes	`Send` fan-out — one `Send` per pending tool call	`return_direct` tools can short-circuit to END

Key insight: Every framework converges on parallel tool execution, but the mechanism differs. pi-mono and Hermes use native concurrency primitives; LangGraph uses graph-level fan-out (Send), which decouples parallelism from the loop logic entirely and makes it observable as graph structure.

4. State model

Framework	What is "state"?	Persistence
LangChain ReAct	Growing prompt string (text)	Ephemeral
pi-mono	`AgentMessage[]` (richer than LLM wire format)	In-memory; `convertToLlm()` at boundary
Hermes	OpenAI-compat message list	Persisted to session store after each turn
OpenClaw (harness)	Session transcript (external)	External; `transformContext` for compaction; state lives outside pi-mono
LangGraph	Typed `AgentState` dict: `{messages, jump_to, structured_response}`	Optional `checkpointer` (pluggable)
Anthropic	External append-only event log	Survives Brain + Hands restarts

Evolution: From text string → rich message objects → typed state dict → externalized durable log. A key insight from Anthropic's architecture: the loop's state and the loop's execution should be decoupled into different failure domains. LangGraph approaches this with its checkpointer abstraction.

5. Middleware / extensibility model

Framework	Extension mechanism	Granularity
LangChain ReAct	Subclass `Agent`; override `create_prompt`, `_validate_tools`	Class methods
pi-mono	`AgentLoopConfig` callbacks: `getApiKey`, `beforeToolCall`, `afterToolCall`, `getSteeringMessages`, `getFollowUpMessages`, `transformContext`, `convertToLlm`	Per-turn function calls
Hermes	Callback functions: `tool_progress_callback`, `thinking_callback`, `step_callback`, etc.	Observation only (no routing control)
OpenClaw (harness)	15+ hook points (plugin API + shell scripts); `block`, `cancel`, `rewrite` semantics	Full lifecycle, can block/mutate — but these are gateway hooks, not loop-level callbacks
LangGraph	`AgentMiddleware` with 4 graph nodes: `before_agent`, `before_model`, `after_model`, `after_agent`	First-class graph nodes; can set `jump_to` to re-route

Evolution: Subclassing → callbacks → hook pipelines → middleware-as-graph-nodes. The critical shift in LangGraph is that middleware isn't a side-channel (a callback) — it's a node in the same graph as the model and tools. It can issue jump_to directives that redirect the entire routing graph. Middleware is no longer bolted on; it's load-bearing.

6. Mid-run control (HITL / steering)

Framework	Mechanism	Granularity
LangChain ReAct	None	—
pi-mono	`getSteeringMessages()` injects between turns; `getFollowUpMessages()` re-enters outer loop	Turn boundary
Hermes	Background thread for cancellation; `clarify` tool forces user response	Per-turn cancellation + explicit tool
OpenClaw (harness)	`before_agent_reply` hook claims a turn at gateway; command queue `steer`/`followup` feeds into pi-mono's steering callbacks	Turn level; OpenClaw maps gateway events → pi-mono's `getSteeringMessages`/`getFollowUpMessages`
LangGraph	`interrupt_before`/`interrupt_after` pauses graph at any named node; `jump_to` in state	Any node; resumable from checkpoint

Evolution: The ability to inject human input progresses from nonexistent → turn-boundary callbacks → explicit tool invocation → graph-level pause/resume. LangGraph's approach is the most general: any node can be a pause point, and the checkpointer stores graph state so the loop can be resumed later (even in a different process).

7. Context management / compression

Framework	Approach
LangChain ReAct	None
pi-mono	`transformContext()` callback runs before `convertToLlm()` — used by OpenClaw for compaction
Hermes	Two thresholds: preflight at 50% context window (agent-side) and 85% (gateway auto-compress); preserves tool call/result pairs; generates new lineage ID
OpenClaw	Overflow-triggered auto-compaction; silent memory flush before summary; configurable compaction model, `identifierPolicy`, and `notifyUser`; `before_compaction`/`after_compaction` hooks; `/compact [guidance]` for manual control; distinct from per-request session pruning
LangGraph	No built-in; handled outside (e.g. reduce via state schema)

Hermes and OpenClaw lead on different axes. Hermes has the most structured trigger model — two explicit thresholds (50%/85%) with lineage ID rotation for auditability. OpenClaw has the most operator-facing compaction design: a configurable summarization model, identity-preservation policy, user notifications, manual /compact [guidance] invocation, and a pre-compaction memory flush so agents can persist critical state before the summary is written. Both enforce the tool-pair invariant.

A key OpenClaw distinction: compaction vs. session pruning are separate mechanisms. Compaction permanently writes a condensed summary to the session transcript (durable, opt-in notifications). Session pruning silently drops tool outputs per API request to stay within the context window (ephemeral, invisible). Most frameworks conflate or ignore this distinction.

8. Multi-agent composition

Framework	Support
LangChain ReAct	None
pi-mono	None (single loop)
Hermes	`delegate_task` tool spawns subagents with isolated context; parent 90-turn / child 50-turn caps
OpenClaw (harness)	Session lanes serialize requests at gateway; no explicit subagent model
LangGraph	Each agent is a compiled graph; graphs are composable via LangGraph supervisor patterns
Anthropic Managed	First-class: many Brains, many Hands, shared Session; distributed execution

Evolutionary arc: four generations

Generation 1 — Text-format stop-token loop (LangChain ReAct)

The loop is driven by two prompt template variables:

{input} — the question/task, filled once at invocation
{agent_scratchpad} — the growing Thought:/Action:/Observation: string, extended each turn

PromptTemplate.from_examples() prepends static few-shot demonstrations that teach the LLM the text protocol. The LLM outputs plain text like Action: Search[Colorado orogeny]; ReActOutputParser reads it back. The framework injects a stop token ("\nObservation:") to truncate generation, prepends "Thought:" as the LLM prefix, and appends "Observation: <result>" to the scratchpad after each tool call. Exit is signalled by the LLM writing Action: Finish[answer].

Everything couples together: the prompt template, the few-shot examples, the output parser, the stop token, the tool names, and the exit condition are all one text convention. There is no structured object — control is a protocol negotiated in prose.

Characteristic: Loop = text protocol. The framework and the LLM are coupled through a shared text contract ({agent_scratchpad} ↔ ReActOutputParser).

Generation 2 — Class-based structured loop (Hermes, early pi-mono)

Messages become structured objects (OpenAI-compat), not raw strings. Tool dispatch grows from sequential → parallel (ThreadPoolExecutor, configurable Promise.all). Budget tracking appears (Hermes: 90 turns). Provider failover appears (Hermes). Callbacks emerge for observability. But the loop is still a while-loop in a class method — the structure is implicit, and routing decisions are scattered across if-branches inside the loop body.

Characteristic: Loop = structured while-loop. Better message hygiene, but routing logic is still opaque.

Generation 2.5 — Callback/hook layers (pi-mono) + Harness layer (OpenClaw)

pi-mono separates the loop cleanly: outer loop for follow-up, inner loop for turns. Streaming is first-class (EventStream<AgentEvent>). AgentLoopConfig callbacks let callers inject behavior at well-defined points without subclassing.

OpenClaw adds a gateway/harness layer on top of pi-mono without modifying it — it does not reimplement the loop. The insight: separate the loop executor (pi-mono) from the observation/control surface (15+ gateway hooks, channel adapters, session management). Hooks can block, rewrite, and cancel. Shell-script hooks democratize extension without requiring code.

This is a different kind of evolution from the loop improvements above: rather than making the loop smarter, OpenClaw keeps the loop stable (pi-mono) and builds richness around it. The loop and the harness become separate failure domains and separate upgrade surfaces.

Characteristic (pi-mono): Loop = callback-decorated while. Separation of concerns between execution and observation begins. Characteristic (OpenClaw): Harness = gateway wrapping a stable loop; extensibility lives in the surrounding layer, not the loop itself.

Generation 3 — Explicit graph DAG (LangGraph)

The loop becomes a data structure: a StateGraph with named nodes (model, tools, middleware nodes) and named routing functions (_make_model_to_tools_edge, _make_tools_to_model_edge). Every routing decision is inspectable, testable, and composable.

Middleware is first-class: AgentMiddleware hooks are graph nodes, not callbacks. They can set jump_to in state to re-route the entire graph. interrupt_before/interrupt_after pauses execution at any node. The checkpointer externalizes state (approaching Anthropic's Session concept). Structured output becomes a first-class loop concern — a separate model→model retry loop for response format enforcement.

Characteristic: Loop = inspectable DAG. Routing logic is explicit, composable, and decoupled from execution.

Architectural Pattern — Distributed state decoupling (Anthropic Managed Agents)

This is not a generation in the loop-representation lineage. It is an orthogonal axis: where does loop state live? It can be combined with any loop implementation above.

The loop's execution (Brain), its tools (Hands), and its memory (Session) are placed in three separate failure domains. The "loop" is no longer owned by one process — a Brain can crash and a new one resumes from the Session. This dissolves the assumption that loop state and loop execution must be co-located.

Characteristic: Loop state is externalized and decoupled from execution. Any loop implementation can adopt this pattern; LangGraph's checkpointer is the closest approximation in a framework.

Summary table

Dimension	Gen 1 (ReAct)	Gen 2 (Hermes/pi-mono)	Gen 2.5 loop (pi-mono) / harness (OpenClaw)	Gen 3 (LangGraph)
Loop form	Implicit while	Explicit while	pi-mono: nested while; OpenClaw: no own loop (delegates to pi-mono)	Explicit DAG
State	Prompt string	Message list (in-memory)	pi-mono: in-memory messages; OpenClaw: external transcript (harness-owned)	Typed state dict + checkpointer
Exit signal	Text token	No messages/budget	pi-mono: natural exhaustion; OpenClaw gateway: `before_agent_reply` can suppress	Typed routing condition
Tool parallelism	No	Yes (thread/promise)	pi-mono: configurable `Promise.all`; OpenClaw: same, + gateway block/rewrite hooks	Yes (Send fan-out)
Extensibility	Subclass	Callbacks	pi-mono: `AgentLoopConfig` callbacks; OpenClaw: 15+ gateway hooks (separate layer)	Middleware-as-nodes
Mid-run control	None	Steering messages	pi-mono: `getSteeringMessages`; OpenClaw: command queue feeds these callbacks	interrupt_before/after
HITL	None	Limited	OpenClaw: hook claim + command queue	Pause/resume at any node
Composition	None	Subagent tool	OpenClaw: session lanes (gateway)	Graph composition
Compression	None	pi-mono callback	OpenClaw harness: auto + `/compact`; configurable model; memory flush; pruning distinction	External concern

Anthropic Managed Agents (architectural pattern, not a loop generation): Loop form = distributed process; State = append-only external event log; survives Brain + Hands restarts. Applicable on top of any generation above.

Key cross-cutting insights

0. The most fundamental shift: prompt-as-protocol → tool-call-as-protocol.
In ReAct, the control protocol is a text convention: {agent_scratchpad} grows by appending Thought:/Action:/Observation: strings; the LLM writes Action: ToolName[arg]; ReActOutputParser reads it back. The protocol exists only as a shared text contract between the prompt template, the few-shot examples, and the parser — it is invisible to the framework's type system.

In modern loops (LangGraph, pi-mono, Hermes), the LLM emits a structured tool_calls array in its JSON response. The framework reads AIMessage.tool_calls directly — no text parsing, no stop tokens, no scratchpad string. Tool invocation is a first-class typed object, not a convention in prose. This single shift eliminates the coupling between prompt format and parser, makes tool selection inspectable, enables reliable parallel dispatch, and is what makes the rest of the LangGraph graph architecture possible.

1. The loop structure mirrors trust in the model.
ReAct embeds control in the LLM's text output ("Final Answer:") — it trusts the model to signal completion. LangGraph moves routing entirely to the framework layer (typed functions, named edges) — the model just outputs messages; the framework decides what to do next.

2. Extensibility follows the layer cake — but harness ≠ loop.
pi-mono→OpenClaw demonstrates that you can add rich hook/plugin behavior without modifying the loop executor, by building a harness layer around a stable loop. But this is categorically different from LangGraph's approach: OpenClaw hooks are a side-channel to pi-mono's loop (they can block/rewrite but cannot redefine routing). LangGraph promotes middleware to first-class graph nodes that are the routing — the hook can re-route, not just observe. These are two distinct strategies: stable-loop + rich-harness vs. everything-is-the-graph.

3. Parallel tool execution is table stakes.
Every generation >= 2 supports it. The mechanism evolves: thread pool → promise-based → graph fan-out (Send). The Send model is the most principled: it expresses fan-out in the graph topology rather than in a concurrency primitive inside the loop body.

4. Durability decoupling is the hardest problem.
Hermes persists messages per turn but is still a single-process loop. OpenClaw externalizes the transcript. Anthropic's architecture makes durability a separate failure domain. LangGraph's checkpointer is the closest approximation in a framework, but requires explicit wiring.

5. Budget / termination is an afterthought until it isn't.
ReAct has none. pi-mono relies on natural exhaustion. Hermes is the only framework with explicit budget enforcement (90/50 turn caps), treating runaway loops as a production reliability concern. LangGraph leaves it to recursion_limit (default 10,000) — essentially unlimited.

6. "Explicit" and "implicit" refer to two orthogonal dimensions — which is why they appear inverted across different analyses.

A common point of confusion: some analyses call LangChain ReAct explicit and modern tool-call loops implicit, which is the opposite of how this article uses those terms. Both readings are correct — they measure different things.

Dimension	ReAct	LangGraph / tool-call loops
Loop structure (is routing a first-class, inspectable object?)	Implicit — loop hidden inside `AgentExecutor.run()`	Explicit — routing is a named `StateGraph` DAG
Reasoning protocol (are state transitions visible in the prompt?)	Explicit — `Thought:/Action:/Observation:` scratchpad is human-readable text in the prompt	Implicit — state transitions are encoded in `tool_calls` JSON; nothing appears in the prompt text

This article uses "explicit/implicit" to describe loop architecture (framework-facing). Analyses that call ReAct "explicit" are describing the reasoning protocol (prompt-facing) — the scratchpad makes every step visible in plain text. The two dimensions are orthogonal: ReAct's loop is architecturally implicit but its protocol is textually explicit; LangGraph's loop is architecturally explicit but its protocol is structurally implicit. This is precisely the shift named in §0: prompt-as-protocol → tool-call-as-protocol moves explicitness from the text layer to the type-system layer.