Session Pruning

Per-request, in-memory trimming of aged tool outputs before each LLM call. Reduces context bloat from accumulated exec results, file reads, and search results without touching normal conversation text or the on-disk transcript.

Mechanism

For each LLM call:

  1. Wait for cache TTL to expire (default: 5 minutes)
  2. Identify aged tool result blocks
  3. Soft-trim — preserve beginning and end, insert ellipsis in the middle
  4. Hard-clear — replace remaining content with a placeholder
  5. Reset TTL for cache reuse

The on-disk .jsonl transcript is never modified — pruning is purely in-memory.
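The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the names `soft_trim` and `prune_tool_results`, the message shape (dicts with `role` and `content`), the keep sizes, the placeholder text, and the rule that chooses between soft-trim and hard-clear (the newest aged results are soft-trimmed, older ones hard-cleared) are all hypothetical. Step 1 is interpreted here as "skip pruning until the TTL has elapsed since the last call".

```python
import copy
import time

PLACEHOLDER = "[tool result cleared]"  # assumed placeholder text


def soft_trim(text, head=200, tail=200):
    """Keep the first `head` and last `tail` characters, ellipsis in between."""
    if len(text) <= head + tail:
        return text
    return text[:head] + "\n...\n" + text[len(text) - tail:]


def prune_tool_results(messages, last_call_at, now=None, ttl_seconds=300,
                       keep_recent=1, soft_window=1):
    """Return a pruned copy of `messages`; the input list is never modified."""
    now = time.time() if now is None else now
    if now - last_call_at < ttl_seconds:
        return messages  # TTL not expired: send the context unchanged

    pruned = copy.deepcopy(messages)  # in-memory copy only
    tool_idx = [i for i, m in enumerate(pruned) if m["role"] == "tool"]

    # Assumption: everything except the newest `keep_recent` results is "aged".
    aged = tool_idx[:max(0, len(tool_idx) - keep_recent)]

    # Assumption: the newest `soft_window` aged results are soft-trimmed,
    # anything older is hard-cleared.
    hard_cut = max(0, len(aged) - soft_window)
    for pos, i in enumerate(aged):
        if pos < hard_cut:
            pruned[i]["content"] = PLACEHOLDER
        else:
            pruned[i]["content"] = soft_trim(pruned[i]["content"])
    return pruned
```

In this sketch the caller would record the time of the new LLM call afterwards (step 5), so the next request starts a fresh TTL window. Because the function works on a deep copy, the original message list, and anything persisted from it, stays unchanged.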

Configuration

contextPruning: { "mode": "cache-ttl", "ttl": "5m" }
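The `ttl` value is a duration string. As a minimal sketch of how such a value might be converted to seconds, the hypothetical helper below accepts an integer followed by `s`, `m`, or `h`. Only "5m" appears in the configuration above; the other suffixes and the function name `parse_ttl` are assumptions for illustration.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}  # assumed unit suffixes


def parse_ttl(value):
    """Convert a duration string such as "5m" into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For example, `parse_ttl("5m")` returns 300, which matches the default 5-minute TTL described in the mechanism.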

Default enablement by auth type:

Auth                       Enabled   Heartbeat
Anthropic OAuth / token    Yes       1 hour
Anthropic API key          Yes       30 minutes
Non-Anthropic providers    No

Legacy Image Handling

For older sessions with embedded images, a separate pass:

Pruning vs. Compaction

                    Session Pruning      Compaction
Scope               Tool results only    Entire conversation
Persistence         In-memory only       Saved to transcript
Conversation text   Untouched            Summarized
Trigger             Every LLM request    Context limit or /compact
On-disk effect      None                 Durable summary written

Frequent compaction may indicate high tool-output volume — enabling session pruning reduces that pressure by stripping stale tool results before they reach the context limit.