Claude Code Compaction Policy

Source: raw/docs/claude-code-compaction-policy.md

Summary

Claude Code uses a three-level compaction hierarchy — cheapest first — to prevent conversation history from exceeding the model's context window.

Level 1: Microcompaction (src/services/compact/microCompact.ts). Prunes individual tool results without summarizing. Two sub-modes: time-based and cached, both shown under Execution order per turn.

Level 2: Session Memory Compaction (src/services/compact/sessionMemoryCompact.ts, experimental). Avoids an API call by reusing the continuously written session memory file as the summary. Keeps recent messages verbatim (≥10,000 tokens / 5 text blocks; ≤40,000 tokens). Falls back to Level 3 if the memory file is missing or empty, or if the post-compaction count is still too large.

Level 3: Full LLM Compaction (src/services/compact/compact.ts). Sends the conversation to the model for structured summarization. The output has 9 sections (intent, concepts, files+snippets, errors, problem-solving, all user messages verbatim, pending tasks, current work, next step). The <analysis> scratchpad is stripped; only <summary> enters context.
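
The Level 3 post-processing step (drop the <analysis> scratchpad, keep only <summary>) could look roughly like the sketch below. The helper name extractSummary and the regex-based approach are assumptions for illustration; the source does not show how compact.ts does this.

```typescript
// Hypothetical helper, not taken from compact.ts.
// Returns the body of the first <summary> block, or null if none is present.
function extractSummary(raw: string): string | null {
  // Remove every <analysis>...</analysis> scratchpad block first.
  const withoutAnalysis = raw.replace(/<analysis>[\s\S]*?<\/analysis>/g, "");
  // Keep only what sits between the <summary> tags.
  const match = withoutAnalysis.match(/<summary>([\s\S]*?)<\/summary>/);
  return match ? match[1].trim() : null;
}

const response =
  "<analysis>scratch notes</analysis>\n<summary>\n1. Intent: fix the bug\n</summary>";
const summary = extractSummary(response);
```

Returning null when no <summary> block is found leaves it to the caller to treat the attempt as a failed compaction.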

Token thresholds (autoCompact.ts)

Constant                          Value    Meaning
AUTOCOMPACT_BUFFER_TOKENS         13,000   Gap below effective window where auto-compact fires
WARNING_THRESHOLD_BUFFER_TOKENS   20,000   Orange UI warning
MAX_OUTPUT_TOKENS_FOR_SUMMARY     20,000   Reserved for summary output
MANUAL_COMPACT_BUFFER_TOKENS       3,000   Blocking limit for /compact

Effective context window = model_context_window − 20,000. Auto-compact fires at effective window − 13,000.
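
A worked version of this arithmetic, as a minimal sketch. Two assumptions are baked in: the 20,000 subtracted is taken to be MAX_OUTPUT_TOKENS_FOR_SUMMARY (the formula above gives only the number), and the comparison is inclusive (>=). The function names and the 200,000-token window are illustrative, not taken from autoCompact.ts.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000;
const MAX_OUTPUT_TOKENS_FOR_SUMMARY = 20_000;

// Assumption: the reserved 20,000 is the summary output reservation.
function effectiveContextWindow(modelContextWindow: number): number {
  return modelContextWindow - MAX_OUTPUT_TOKENS_FOR_SUMMARY;
}

// Token count at which auto-compact would fire.
function autoCompactThreshold(modelContextWindow: number): number {
  return effectiveContextWindow(modelContextWindow) - AUTOCOMPACT_BUFFER_TOKENS;
}

// Assumption: inclusive comparison; the source does not say whether it is > or >=.
function wouldAutoCompact(tokenCount: number, modelContextWindow: number): boolean {
  return tokenCount >= autoCompactThreshold(modelContextWindow);
}

// Illustrative window of 200,000 tokens:
const effective = effectiveContextWindow(200_000); // 180,000
const threshold = autoCompactThreshold(200_000);   // 167,000
```

With that illustrative window, auto-compact would fire 33,000 tokens before the model's hard limit.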

Triggers

  1. Auto-compact: top of each query loop turn, before API call
  2. Manual /compact slash command
  3. Reactive: after prompt_too_long API error

Disable conditions

DISABLE_COMPACT / DISABLE_AUTO_COMPACT env vars, autoCompactEnabled: false, query source is session_memory or compact (recursion guard), Context Collapse mode active, reactive-only experiment flag.

Circuit breaker

MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3 — stops retrying after 3 consecutive failures.
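
A minimal sketch of such a breaker. The class and method names are hypothetical, and resetting the counter on success is an assumption inferred from the word "consecutive"; the source gives only the constant.

```typescript
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

// Hypothetical wrapper around the failure counter.
class AutoCompactBreaker {
  private consecutiveFailures = 0;

  // False once the failure streak reaches the limit.
  canAttempt(): boolean {
    return this.consecutiveFailures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES;
  }

  // Assumption: any success clears the streak.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }
}

const breaker = new AutoCompactBreaker();
breaker.recordFailure();
breaker.recordFailure();
const allowedAfterTwo = breaker.canAttempt(); // still under the limit
breaker.recordFailure();
const allowedAfterThree = breaker.canAttempt(); // limit reached
```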

Post-compaction context restoration

Re-injected as AttachmentMessages: up to 5 recently-read files (50K token budget, 5K/file), async agent status, current plan file, plan mode instructions, invoked skills (25K budget, 5K/skill, MRU order), deferred tool schemas, agent listings, MCP tool instructions, session start hooks.
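
The skill budget above (25K total, 5K per skill, MRU order) can be sketched as a greedy selection. Everything named here is hypothetical, and the code assumes an oversized item is truncated to the per-item cap rather than skipped; the source does not say which.

```typescript
// Hypothetical shape for anything restored under a token budget.
interface BudgetItem {
  name: string;
  tokens: number;
  lastUsedAt: number; // larger means more recently used
}

function selectWithinBudget(
  items: BudgetItem[],
  totalBudget: number,
  perItemCap: number,
): BudgetItem[] {
  // Most recently used first (MRU order).
  const byRecency = [...items].sort((a, b) => b.lastUsedAt - a.lastUsedAt);
  const selected: BudgetItem[] = [];
  let used = 0;
  for (const item of byRecency) {
    // Assumption: oversized items are truncated to the cap, not dropped.
    const cost = Math.min(item.tokens, perItemCap);
    if (used + cost > totalBudget) continue; // does not fit; try older, smaller items
    selected.push({ ...item, tokens: cost });
    used += cost;
  }
  return selected;
}

// Seven invoked skills of 6,000 tokens each; s7 is the most recent.
const skills: BudgetItem[] = Array.from({ length: 7 }, (_, i) => ({
  name: `s${i + 1}`,
  tokens: 6_000,
  lastUsedAt: i + 1,
}));
const restored = selectWithinBudget(skills, 25_000, 5_000);
```

Here each skill is capped to 5,000 tokens, so exactly five fit in the 25,000-token budget and the two least recently used are left out.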

Partial compaction

Execution order per turn

shouldAutoCompact?
  └─ yes → trySessionMemoryCompaction()
               └─ success → done
               └─ null   → compactConversation() [LLM]
  └─ no  → microcompactMessages()
               └─ time-based trigger? → time-based MC
               └─ cached MC enabled? → cachedMicrocompactPath
               └─ otherwise → no-op
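
The same decision tree as a pure function, as a sketch. The boolean fields stand in for the real checks (for example, sessionMemoryCompactionSucceeds stands in for trySessionMemoryCompaction() returning non-null), all type and field names are hypothetical, and the first-match ordering of the microcompaction branches is assumed from the order in which the tree lists them.

```typescript
type CompactionPath =
  | "session-memory"
  | "llm"
  | "time-based-mc"
  | "cached-mc"
  | "no-op";

// Hypothetical snapshot of the checks made at the top of a turn.
interface TurnState {
  shouldAutoCompact: boolean;
  sessionMemoryCompactionSucceeds: boolean;
  timeBasedTrigger: boolean;
  cachedMicrocompactEnabled: boolean;
}

function choosePath(state: TurnState): CompactionPath {
  if (state.shouldAutoCompact) {
    // Cheapest summary first; fall back to the LLM summary on null.
    return state.sessionMemoryCompactionSucceeds ? "session-memory" : "llm";
  }
  // Assumption: the time-based check takes priority over the cached path.
  if (state.timeBasedTrigger) return "time-based-mc";
  if (state.cachedMicrocompactEnabled) return "cached-mc";
  return "no-op";
}

const fallbackPath = choosePath({
  shouldAutoCompact: true,
  sessionMemoryCompactionSucceeds: false,
  timeBasedTrigger: false,
  cachedMicrocompactEnabled: true,
});
```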

See Also