Hermes Agent Loop

The core execution cycle of HermesAgent, implemented in the AIAgent class (run_agent.py). The loop manages prompt assembly, provider selection, tool dispatch, context compression, and budget enforcement.

Entry Points

| Method | Returns |
|---|---|
| agent.chat() | Final response string |
| agent.run_conversation() | Dict: {messages, metadata, usage} |

chat() wraps run_conversation() and extracts final_response.
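
The wrapper relationship can be sketched as follows. Method names and the return-dict keys follow the doc; the stubbed internals and the exact location of final_response inside the return value are assumptions:

```python
class AIAgent:
    def run_conversation(self, user_message):
        # Real implementation runs the full turn loop; stubbed here.
        return {
            "messages": [{"role": "user", "content": user_message},
                         {"role": "assistant", "content": "ok"}],
            "metadata": {"final_response": "ok"},  # assumed location
            "usage": {"total_tokens": 12},
        }

    def chat(self, user_message):
        # chat() delegates to run_conversation() and extracts the final text.
        result = self.run_conversation(user_message)
        return result["metadata"]["final_response"]
```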

Turn Lifecycle

Each iteration:

  1. Generate task_id if not present
  2. Append user message to conversation history
  3. Build system prompt (or reuse cached)
  4. Preflight compression check — if history >50% of context window, compress before calling model
  5. Build API messages from conversation history
  6. Inject ephemeral prompt layers
  7. Apply prompt caching markers (Anthropic mode only)
  8. Interruptible API call — runs in background thread; main thread polls for cancellation
  9. Parse response — if tool calls present, execute them and loop; otherwise return final response
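
The steps above can be sketched as a single loop. The step numbers, the 50% preflight threshold, and the background-thread call follow the doc; every helper name on the agent is an assumption:

```python
import threading
import uuid

def run_turn_loop(agent, user_message):
    # Step 1: generate a task_id if not present.
    if agent.task_id is None:
        agent.task_id = uuid.uuid4().hex
    # Step 2: append the user message to conversation history.
    agent.history.append({"role": "user", "content": user_message})
    while True:
        # Steps 3-4: build (or reuse) the system prompt, then run the
        # preflight check: compress if history exceeds 50% of the window.
        system_prompt = agent.build_system_prompt()
        if agent.history_tokens() > 0.5 * agent.context_window:
            agent.compress_history()
        # Steps 5-7: assemble API messages (ephemeral layers and caching
        # markers would be injected here).
        messages = agent.build_api_messages(system_prompt)
        # Step 8: interruptible API call — run the request in a background
        # thread while the main thread polls for cancellation.
        box = {}
        worker = threading.Thread(
            target=lambda: box.update(resp=agent.call_model(messages)))
        worker.start()
        while worker.is_alive():
            if agent.cancelled():
                return None
            worker.join(timeout=0.05)
        response = box["resp"]
        # Step 9: execute tool calls and loop, or return the final response.
        if response.get("tool_calls"):
            agent.execute_tools(response["tool_calls"])
            continue
        agent.history.append({"role": "assistant",
                              "content": response["content"]})
        return response["content"]
```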

API Mode Selection

Priority order: explicit constructor argument → provider detection → base-URL heuristics → default (chat_completions)

| Mode | Client |
|---|---|
| chat_completions | openai.OpenAI |
| codex_responses | openai.OpenAI (Responses format) |
| anthropic_messages | anthropic.Anthropic via adapter |
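
A minimal sketch of the priority chain. The chain order and mode names follow the doc; the specific heuristic substrings are assumptions, not the real implementation:

```python
def select_api_mode(explicit=None, provider=None, base_url=""):
    """Resolve the API mode: explicit constructor arg, then provider
    detection, then base-URL heuristics, then the default."""
    if explicit:
        return explicit
    if provider == "anthropic":          # provider detection (assumed)
        return "anthropic_messages"
    if "anthropic" in base_url:          # hypothetical URL heuristics
        return "anthropic_messages"
    if "codex" in base_url or "/responses" in base_url:
        return "codex_responses"
    return "chat_completions"            # documented default
```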

Message Format & Alternation

All messages stored in OpenAI-compatible format. Strict alternation: User → Assistant → User → Assistant. Only tool role may appear consecutively (one per tool call result). Reasoning content stored in assistant_msg["reasoning"].
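
The alternation rule can be expressed as a small validator, a sketch of the invariant rather than Hermes code:

```python
def check_alternation(messages):
    """Enforce the rule above: no role appears twice in a row,
    except 'tool', which may repeat (one message per tool result)."""
    prev = None
    for msg in messages:
        role = msg["role"]
        if role == prev and role != "tool":
            return False
        prev = role
    return True
```

For example, user → assistant (tool_calls) → tool → tool → assistant is valid, while two consecutive user messages are not.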

Tool Dispatch

Agent-level tools (intercepted before registry):

| Tool | Effect |
|---|---|
| todo | Read/write agent-local task state |
| memory | Write to persistent memory files |
| session_search | Query session history |
| delegate_task | Spawn subagent with isolated context |
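
A sketch of the interception order described above: agent-level tool names are checked before falling through to the registry. The handler and registry method names are assumptions:

```python
AGENT_LEVEL_TOOLS = {"todo", "memory", "session_search", "delegate_task"}

def dispatch_tool(agent, name, args):
    """Agent-level tools are intercepted before the registry lookup."""
    if name in AGENT_LEVEL_TOOLS:
        return agent.handle_agent_tool(name, args)   # assumed handler
    return agent.registry.run(name, args)            # assumed registry API
```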

Iteration Budget

Iteration budgets are tracked as parent + child caps (see Comparisons below), bounding how many tool-call iterations a turn, including any delegated subagents, may consume.

Fallback / Provider Failover

On a primary model failure (rate limit, server error, or auth error), the agent attempts the fallback providers in their configured order. The conversation continues with whichever provider succeeds.
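
The failover order can be sketched as a loop over configured providers. The retryable failure classes follow the doc; the exception types and provider interface are assumptions:

```python
class RateLimitError(Exception): pass
class ServerError(Exception): pass
class AuthError(Exception): pass

RETRYABLE = (RateLimitError, ServerError, AuthError)

def call_with_failover(providers, messages):
    """Try the primary provider first, then each fallback in
    configured order; re-raise the last error if all fail."""
    last_error = None
    for provider in providers:
        try:
            return provider.complete(messages)   # assumed provider API
        except RETRYABLE as exc:
            last_error = exc                     # try the next provider
    raise last_error
```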

Compression

| Trigger | Threshold |
|---|---|
| Preflight (agent-side) | >50% of context window |
| Gateway auto-compression | >85% of context window |
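
The two thresholds reduce to a simple check; the function shape is a sketch, only the 50%/85% values come from the table:

```python
PREFLIGHT_THRESHOLD = 0.50   # agent-side, before calling the model
GATEWAY_THRESHOLD = 0.85     # gateway auto-compression

def needs_compression(history_tokens, context_window, at_gateway=False):
    """Apply the threshold from the table above for the given stage."""
    threshold = GATEWAY_THRESHOLD if at_gateway else PREFLIGHT_THRESHOLD
    return history_tokens > threshold * context_window
```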

Compression procedure:

  1. Flush memory to disk
  2. Summarize middle turns (lossy)
  3. Preserve last N messages intact
  4. Keep tool call / result pairs together
  5. Generate new session lineage ID

After each turn, messages are persisted to session store and memory changes are flushed to files — enabling later resumption.

Callback Surfaces

| Callback | Fires when |
|---|---|
| tool_progress_callback | Before/after each tool execution |
| thinking_callback | Model starts/stops thinking |
| reasoning_callback | Reasoning content returned |
| clarify_callback | clarify tool invoked |
| step_callback | After each agent turn |
| stream_delta_callback | Each streaming token |
| status_callback | State changes |
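
All of these surfaces are optional, which suggests a dispatch helper along these lines (the helper itself is an assumption; the callback names come from the table):

```python
def emit(agent, name, *args):
    """Invoke a callback only if the caller registered it, so every
    surface stays opt-in."""
    callback = getattr(agent, name, None)
    if callback is not None:
        callback(*args)
```

For example, the loop would call emit(agent, "tool_progress_callback", tool_name, "start") before a tool runs and again with "end" afterward.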

Key Source Files

| File | Role |
|---|---|
| run_agent.py | AIAgent class — main loop |
| prompt_builder.py | System prompt assembly |
| context_engine.py | Pluggable context management |
| context_compressor.py | Lossy summarization |
| prompt_caching.py | Anthropic caching markers |
| auxiliary_client.py | Side-task LLM calls |
| model_tools.py | Tool schema + dispatch |

Comparisons

| Aspect | Hermes Agent | OpenClaw | pi-mono |
|---|---|---|---|
| Tool concurrency | ThreadPoolExecutor | | |
| Compression trigger | 50% / 85% | | |
| Budget tracking | parent + child caps | | |
| Interruptible calls | background thread | | |
| Message format | OpenAI-compat | OpenAI-compat | OpenAI-compat |

See Agent Loop (OpenClaw) and pi-mono Agent Loop for the other implementations in this wiki.