Hermes Agent Loop
Hermes Agent Loop
The core execution cycle of HermesAgent, implemented in AIAgent (run_agent.py). Manages prompt assembly, provider selection, tool dispatch, compression, and budget enforcement.
Entry Points
| Method | Returns |
|---|---|
agent.chat() |
Final response string |
agent.run_conversation() |
Dict: {messages, metadata, usage} |
chat() wraps run_conversation() and extracts final_response.
Turn Lifecycle
Each iteration:
- Generate task_id if not present
- Append user message to conversation history
- Build system prompt (or reuse cached)
- Preflight compression check — if history >50% of context window, compress before calling model
- Build API messages from conversation history
- Inject ephemeral prompt layers
- Apply prompt caching markers (Anthropic mode only)
- Interruptible API call — runs in background thread; main thread polls for cancellation
- Parse response — if tool calls present, execute them and loop; otherwise return final response
API Mode Selection
Priority: explicit constructor arg → provider detection → base URL heuristics → chat_completions
| Mode | Client |
|---|---|
chat_completions |
openai.OpenAI |
codex_responses |
openai.OpenAI (Responses format) |
anthropic_messages |
anthropic.Anthropic via adapter |
Message Format & Alternation
All messages stored in OpenAI-compatible format. Strict alternation: User → Assistant → User → Assistant. Only tool role may appear consecutively (one per tool call result). Reasoning content stored in assistant_msg["reasoning"].
Tool Dispatch
- 1 tool call: executed directly in main thread (sequential)
- N tool calls: dispatched concurrently via
ThreadPoolExecutor; results reinserted in original call order - Interactive tools (e.g.
clarify): force sequential regardless of count
Agent-level tools (intercepted before registry):
| Tool | Effect |
|---|---|
todo |
Read/write agent-local task state |
memory |
Write to persistent memory files |
session_search |
Query session history |
delegate_task |
Spawn subagent with isolated context |
Iteration Budget
- Parent agent: 90 turns default (
agent.max_turns) - Subagents: capped at
delegation.max_iterations(default 50) - At budget exhaustion: agent stops and returns a summary
Fallback / Provider Failover
On primary model failure (rate limit, server error, auth): attempts fallback providers in configured order. Conversation continues with whichever provider succeeds.
Compression
| Trigger | Threshold |
|---|---|
| Preflight (agent-side) | >50% of context window |
| Gateway auto-compression | >85% of context window |
Compression procedure:
- Flush memory to disk
- Summarize middle turns (lossy)
- Preserve last N messages intact
- Keep tool call / result pairs together
- Generate new session lineage ID
After each turn, messages are persisted to session store and memory changes are flushed to files — enabling later resumption.
Callback Surfaces
| Callback | Fires when |
|---|---|
tool_progress_callback |
Before/after each tool execution |
thinking_callback |
Model starts/stops thinking |
reasoning_callback |
Reasoning content returned |
clarify_callback |
clarify tool invoked |
step_callback |
After each agent turn |
stream_delta_callback |
Each streaming token |
status_callback |
State changes |
Key Source Files
| File | Role |
|---|---|
run_agent.py |
AIAgent class — main loop |
prompt_builder.py |
System prompt assembly |
context_engine.py |
Pluggable context management |
context_compressor.py |
Lossy summarization |
prompt_caching.py |
Anthropic caching markers |
auxiliary_client.py |
Side-task LLM calls |
model_tools.py |
Tool schema + dispatch |
Comparisons
| Aspect | Hermes Agent | OpenClaw | pi-mono |
|---|---|---|---|
| Tool concurrency | ThreadPoolExecutor | — | — |
| Compression trigger | 50% / 85% | — | — |
| Budget tracking | parent + child caps | — | — |
| Interruptible calls | background thread | — | — |
| Message format | OpenAI-compat | OpenAI-compat | OpenAI-compat |
See Agent Loop (OpenClaw) and pi-mono Agent Loop for the other implementations in this wiki.
Related Pages
- HermesAgent — entity
- NousResearch — org
- Compaction — general compaction concept
- Source - Hermes Agent Loop (Nous Research Developer Guide)