feat(ADR-059): Context Optimization Engine — @claude-flow/context #1274

Open
shaal wants to merge 3 commits into ruvnet:main from shaal:feat/adr-059-context-optimization

shaal commented Mar 2, 2026

Summary

Implements the Context Optimization Engine as a new bounded context package (@claude-flow/context) and integrates it into the live Ruflo system through CLI commands, MCP tools, and hook registration.

Closes #1273

What's Included

New Package: @claude-flow/context (57 files, 7,770 lines)

Domain Layer (DDD)

  • 4 value objects: CompressionRatio, ContextBudget, SearchQuery, SnippetWindow
  • 2 entities: KnowledgeChunk (FTS5-indexed), SandboxInstance (lifecycle FSM)
  • 2 aggregates: CompressionSession (metrics tracking), KnowledgeBase (dedup + TTL eviction)
  • 4 domain events: OutputCompressed, ContentIndexed, BudgetExceeded, ChunksEvicted
  • 3 repository interfaces aligned with infrastructure implementations

Infrastructure Layer

  • FTS5Repository: sql.js WASM-backed search with BM25 scoring and Porter stemming
  • ChunkingEngine: heading-aware content splitting with code block preservation (2048-token chunks, 128-token overlap)
  • LevenshteinCorrector: edit distance calculator with early-exit optimization, used by layer 3 of the fuzzy search cascade
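
For reviewers unfamiliar with the early-exit idea, here is a minimal sketch of a bounded edit distance. It is not the package's `LevenshteinCorrector`; the names `boundedLevenshtein` and `correctTerm` and the default `maxDistance` of 2 are assumptions made for illustration.

```typescript
// Bounded Levenshtein distance: returns the edit distance between a and b,
// or maxDistance + 1 as soon as the distance is known to exceed maxDistance.
function boundedLevenshtein(a: string, b: string, maxDistance: number): number {
  // The length difference is a lower bound on the edit distance.
  if (Math.abs(a.length - b.length) > maxDistance) return maxDistance + 1;
  if (a === b) return 0;

  // Two-row dynamic programming: prev holds the distances for the previous row.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    let rowMin = i;
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      const value = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
      curr.push(value);
      if (value < rowMin) rowMin = value;
    }
    // Early exit: the final distance can never be smaller than a row minimum.
    if (rowMin > maxDistance) return maxDistance + 1;
    prev = curr;
  }
  return Math.min(prev[b.length], maxDistance + 1);
}

// Hypothetical helper: pick the closest vocabulary term within maxDistance,
// or null if no term is close enough.
function correctTerm(term: string, vocabulary: string[], maxDistance = 2): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of vocabulary) {
    const d = boundedLevenshtein(term, candidate, maxDistance);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}
```

With this sketch, `correctTerm("chunkng", ["chunking", "budget"])` would resolve to `"chunking"`, while a term with no candidate within two edits yields `null`.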

Sandbox Layer

  • SandboxPool: process-isolated execution via child_process.spawn() with warm pool (3 default), max 8 concurrent, 30s timeout, 512MB memory limit
  • RuntimeDetector: auto-detection for 11 language runtimes via shebang and syntax heuristics
  • CredentialPassthrough: fail-closed env allowlist (GitHub, AWS, K8s, Docker tokens)
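
A fail-closed allowlist can be sketched as follows: the sandbox environment starts empty and only explicitly named variables are copied in. The variable names and the function `buildSandboxEnv` are assumptions; the PR names the credential families (GitHub, AWS, K8s, Docker) but not the exact variables.

```typescript
// Hypothetical allowlist. The real list in CredentialPassthrough may differ.
const ALLOWED_ENV_VARS: readonly string[] = [
  "GITHUB_TOKEN",
  "GH_TOKEN",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_SESSION_TOKEN",
  "KUBECONFIG",
  "DOCKER_HOST",
];

// Fail-closed: start from an empty environment and copy only allowlisted
// variables that are actually set. Anything not listed is dropped.
function buildSandboxEnv(
  parentEnv: Record<string, string | undefined>,
  allowed: readonly string[] = ALLOWED_ENV_VARS,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const name of allowed) {
    const value = parentEnv[name];
    if (typeof value === "string") {
      env[name] = value;
    }
  }
  return env;
}
```

The resulting object could be passed as the `env` option of `child_process.spawn()`, so a newly introduced secret in the parent environment is never inherited unless someone adds it to the list.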

Application Layer

  • CompressionPipelineService: 3-tier pipeline — passthrough (<1KB), snippet (1-5KB), full pipeline (>5KB with intent filtering)
  • FuzzySearchService: 3-layer cascade — FTS5 stemming → trigram substring → Levenshtein correction
  • UnifiedSearchService: Reciprocal Rank Fusion (k=60) combining keyword + HNSW semantic search
  • MetricsCollector: per-tool and per-session compression stats
  • CLI command handlers: context stats, context doctor, context search
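
Two of the services above reduce to small, well-known computations, sketched here under stated assumptions. `selectTier` assumes 1 KB = 1024 bytes and assigns the exact 1 KB and 5 KB boundaries to the snippet tier, which the PR does not specify. `reciprocalRankFusion` is the standard RRF formula (each list contributes 1 / (k + rank), with 1-based ranks); both function names are hypothetical and not the package's API.

```typescript
type Tier = "passthrough" | "snippet" | "full";

// Assumption: boundaries at exactly 1 KB and 5 KB fall into the snippet tier.
function selectTier(sizeBytes: number): Tier {
  if (sizeBytes < 1024) return "passthrough";
  if (sizeBytes <= 5 * 1024) return "snippet";
  return "full";
}

// Reciprocal Rank Fusion: each ranking lists document ids, best first.
// A document's fused score is the sum of 1 / (k + rank) over all rankings
// in which it appears.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (x, y) => y.score - x.score,
  );
}

// Example: keyword results and semantic results for the same query.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // keyword (FTS5) order
  ["b", "d"],      // semantic (HNSW) order
]);
```

In the example, `b` ranks first because it appears in both lists. Passing a single ranking reproduces that ranking unchanged, which corresponds to the keyword-only fallback mentioned in the test plan.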

Hooks

  • PreToolUseHook: budget-aware gate — blocks at BLOCKED level, warns at REDUCED/MINIMAL
  • PostToolUseHook: compresses output through pipeline + indexes full content in knowledge base
  • SubagentRoutingHook: injects batch_execute instructions into Agent/Task tool prompts
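
The gate behaviour of `PreToolUseHook` can be illustrated with a small decision function. This is a sketch only: the `GateDecision` shape, the function name, and the warning texts are assumptions, while the four levels and the block/warn rules come from the description above.

```typescript
type ThrottleLevel = "NORMAL" | "REDUCED" | "MINIMAL" | "BLOCKED";

// Hypothetical result shape for illustration.
interface GateDecision {
  allow: boolean;
  warning?: string;
}

// Blocks the tool call at BLOCKED, allows it with a warning at REDUCED and
// MINIMAL, and allows it silently at NORMAL.
function preToolUseGate(level: ThrottleLevel): GateDecision {
  if (level === "BLOCKED") {
    return {
      allow: false,
      warning: "Context budget exhausted; route the work through batch_execute.",
    };
  }
  if (level === "REDUCED" || level === "MINIMAL") {
    return {
      allow: true,
      warning: `Context budget is at ${level}; prefer batch_execute for large outputs.`,
    };
  }
  return { allow: true };
}
```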

Budget Management

  • ContextBudgetManager: topology-aware allocation (hierarchical: coordinator 1.5x, mesh: equal), progressive throttling (NORMAL→REDUCED→MINIMAL→BLOCKED), budget reallocation on agent completion
  • SharedKnowledgeTracker: cross-agent content deduplication via hash tracking
  • SwarmBudgetIntegration: event-driven bridge to swarm coordinator lifecycle with runtime payload validation
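
As a rough illustration of topology-aware allocation and progressive throttling, the sketch below reads "coordinator 1.5x" as a coordinator receiving 1.5 times a worker's share in a hierarchical topology. That reading, the usage thresholds (75%, 90%, 100%), and all names are assumptions; the PR does not state them.

```typescript
type Topology = "hierarchical" | "mesh";
type ThrottleLevel = "NORMAL" | "REDUCED" | "MINIMAL" | "BLOCKED";

interface AgentSpec {
  id: string;
  role: "coordinator" | "worker";
}

// Splits totalTokens across agents by weight. In a hierarchical topology the
// coordinator is weighted 1.5 and workers 1; in a mesh every agent is weighted 1.
function allocateBudgets(
  totalTokens: number,
  agents: AgentSpec[],
  topology: Topology,
): Map<string, number> {
  const weights = agents.map((agent) =>
    topology === "hierarchical" && agent.role === "coordinator" ? 1.5 : 1,
  );
  const totalWeight = weights.reduce((sum, w) => sum + w, 0);
  const budgets = new Map<string, number>();
  agents.forEach((agent, i) => {
    budgets.set(agent.id, Math.floor((totalTokens * weights[i]) / totalWeight));
  });
  return budgets;
}

// Maps budget usage to a throttle level. The thresholds are illustrative.
function throttleLevel(usedTokens: number, budgetTokens: number): ThrottleLevel {
  const usage = budgetTokens > 0 ? usedTokens / budgetTokens : 1;
  if (usage >= 1.0) return "BLOCKED";
  if (usage >= 0.9) return "MINIMAL";
  if (usage >= 0.75) return "REDUCED";
  return "NORMAL";
}
```

For a 3,500-token pool with one coordinator and two workers, this gives the coordinator 1,500 tokens and each worker 1,000. Reallocation on agent completion could then be modelled by calling `allocateBudgets` again with the remaining agents and the unspent tokens.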

Live System Integration (3 new files, 5 edited files)

CLI Command (cli/src/commands/context.ts)

  • 5 subcommands: stats, search, doctor, budget, compress
  • Delegates to MCP tools per ADR-005 (CLI as thin wrapper)
  • Registered in commands index, category advanced

MCP Tools (cli/src/mcp-tools/context-tools.ts)

  • 5 tools: context_stats, context_search, context_doctor, context_budget, context_compress
  • Lazy singleton service bundle shared across all tool handlers (avoids duplicate FTS5 databases)
  • Registered in MCP tool registry
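
The lazy singleton pattern mentioned above can be sketched by caching the initialization promise, so that concurrent tool handlers share one instance. `ServiceBundle`, `createBundle`, and `getBundle` are hypothetical names, and the in-memory array stands in for the real FTS5 database.

```typescript
// Hypothetical service bundle; the real one wraps the FTS5 database and services.
interface ServiceBundle {
  index(text: string): void;
  search(term: string): string[];
}

let creationCount = 0;
let bundlePromise: Promise<ServiceBundle> | null = null;

// Stand-in for the expensive setup (for example, loading a WASM database).
async function createBundle(): Promise<ServiceBundle> {
  creationCount += 1;
  const documents: string[] = [];
  return {
    index: (text) => {
      documents.push(text);
    },
    search: (term) => documents.filter((d) => d.includes(term)),
  };
}

// Lazy: nothing is created until the first call. Singleton: the promise is
// cached, so callers that arrive during initialization share the same instance.
function getBundle(): Promise<ServiceBundle> {
  if (bundlePromise === null) {
    bundlePromise = createBundle();
  }
  return bundlePromise;
}
```

Caching the promise rather than the resolved value avoids a race in which two handlers both observe "not yet created" and each open their own database.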

Hook Bridge (hooks/src/bridge/context-bridge.ts)

  • Adapts PreToolUseHook → HookEvent.PreToolUse (priority: High) for budget enforcement
  • Adapts PostToolUseHook → HookEvent.PostToolUse (priority: Normal) for output compression
  • Adapts SubagentRoutingHook → HookEvent.AgentSpawn (priority: Normal) for batch hints
  • Auto-registered during initializeHooks()

Wiring

  • @claude-flow/context added to optionalDependencies in both @claude-flow/cli and @claude-flow/hooks
  • All integration points use try/catch dynamic imports — system works without @claude-flow/context installed
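
The optional dependency pattern amounts to a guarded dynamic import. The sketch below uses hypothetical helper names (`loadOptional`, `registerContextHooks`) and a generic `register` callback in place of the real hook registry.

```typescript
// Attempts to load a module at runtime; resolves to null if it is not installed
// or fails to load. The specifier is typed as string so the compiler does not
// try to resolve it at build time.
async function loadOptional(specifier: string): Promise<Record<string, unknown> | null> {
  try {
    return (await import(specifier)) as Record<string, unknown>;
  } catch {
    return null;
  }
}

// Registers the context bridge only when the optional package is present.
// Returns true if registration happened, false if the package was skipped.
async function registerContextHooks(
  register: (name: string) => void,
  specifier = "@claude-flow/context",
): Promise<boolean> {
  const contextModule = await loadOptional(specifier);
  if (contextModule === null) {
    return false; // Package absent: the rest of the system keeps working.
  }
  register("context-bridge");
  return true;
}
```

One trade-off worth noting: a blanket `catch` also hides genuine load errors inside the package, so logging the caught error at debug level may be worth considering.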

Design Documents (from first commit)

  • SPARC PRD: v3/docs/PRD-context-optimization.md
  • ADR-059: Core architecture decision
  • ADR-059a: FTS5 knowledge base with three-layer fuzzy search
  • ADR-059b: Sandbox isolation and credential passthrough
  • ADR-059c: Swarm-aware context budgets and progressive throttling
  • DDD domain model and integration points

Key Design Decisions

  1. Native package over plugin — full hook access, HNSW integration, swarm-aware agent budgets
  2. FTS5 + HNSW dual index — FTS5 for precision (exact terms, code symbols), HNSW for recall (semantic similarity)
  3. Process isolation over VM/container — ~10ms startup, low overhead, good security for our threat model
  4. Progressive throttling over hard cutoff — educates agents, degrades gracefully, provides escape hatch via batch_execute
  5. Optional dependency pattern: @claude-flow/context in optionalDependencies, try/catch dynamic imports everywhere
  6. Lazy singleton service bundle — all 5 MCP tools share one set of services to avoid duplicate WASM databases
  7. CLI delegates to MCP tools — per ADR-005, CLI commands call callMCPTool() instead of importing domain services

Test plan

  • 301 tests across 17 test files — all passing
  • Value objects: construction, validation, edge cases, equality (36 tests)
  • Entities: lifecycle state machine, expiry, deduplication (26 tests)
  • Aggregates: invariant enforcement, event emission, pullEvents (27 tests)
  • FTS5 repository: BM25 ranking, trigram search, vocabulary, eviction (25 tests)
  • Chunking engine: heading split, code block preservation, overlap (17 tests)
  • Levenshtein: distance computation, correction, early exit (22 tests)
  • Sandbox pool: JS/Python/shell execution, timeout, concurrency, drain (17 tests)
  • Runtime detector: shebang, heuristics, hint validation (28 tests)
  • Credential passthrough: fail-closed allowlist verification (12 tests)
  • Compression pipeline: passthrough/medium/full paths, intent filtering (10 tests)
  • Fuzzy search: 3-layer cascade, match layer annotation (6 tests)
  • Unified search: RRF fusion, keyword-only fallback (6 tests)
  • Metrics: per-tool stats, session aggregation, reset (7 tests)
  • Hooks: budget gate, compression routing, subagent injection (13 tests)
  • Budget manager: topology allocation, throttling, rebalance (31 tests)
  • Swarm integration: event handling, dispose cleanup (7 tests)
  • Build verification: all 3 packages compile with zero new TS errors
  • Integration wiring: CLI command registered, MCP tools in registry, hooks bridge auto-loads
  • Integration testing with live swarm session
  • Performance benchmarking against success criteria targets

shaal added 3 commits March 2, 2026 14:14
… bounded context

Add design documents for native context window compression engine
inspired by claude-context-mode. Achieves 95-98% context reduction
via sandbox-isolated execution, FTS5+HNSW dual-index knowledge base,
and swarm-aware per-agent context budgets.

Documents:
- SPARC PRD with 4-phase delivery plan
- ADR-059: master architecture decision
- ADR-059a: FTS5 knowledge base with three-layer fuzzy search
- ADR-059b: sandbox isolation and credential passthrough
- ADR-059c: swarm-aware context budgets and progressive throttling
- DDD bounded context: domain model, integration points, context map
…lementation

New package implementing the Context Optimization Engine with DDD architecture:

Domain layer:
- Value objects: CompressionRatio, ContextBudget, SearchQuery, SnippetWindow
- Entities: KnowledgeChunk (FTS5-indexed), SandboxInstance (lifecycle FSM)
- Aggregates: CompressionSession (metrics), KnowledgeBase (dedup + eviction)
- Domain events: OutputCompressed, ContentIndexed, BudgetExceeded, ChunksEvicted
- Repository interfaces: IFTS5Repository, ICompressionSessionRepository, IContextBudgetRepository

Infrastructure layer:
- FTS5Repository: sql.js-backed search with BM25 scoring and Porter stemming
- ChunkingEngine: heading-aware splitting with code block preservation
- LevenshteinCorrector: edit distance with early-exit optimization

Sandbox layer:
- SandboxPool: process-isolated execution via child_process.spawn()
- RuntimeDetector: auto-detection for 11 language runtimes
- CredentialPassthrough: fail-closed env allowlisting

Application layer:
- CompressionPipelineService: 3-tier pipeline (passthrough/snippet/full)
- FuzzySearchService: 3-layer cascade (stemming→trigram→Levenshtein)
- UnifiedSearchService: RRF fusion combining keyword + HNSW semantic search
- MetricsCollector: per-tool and per-session compression stats
- CLI commands: context stats/doctor/search

Hooks:
- PreToolUseHook: budget-aware gate with progressive throttling
- PostToolUseHook: automatic compression and KB indexing
- SubagentRoutingHook: batch instruction injection for subagents

Budget management:
- ContextBudgetManager: topology-aware allocation, progressive throttling
- SharedKnowledgeTracker: cross-agent content deduplication
- SwarmBudgetIntegration: event-driven swarm lifecycle bridge

301 tests across 17 test files, all passing.
… hooks

Wire the context optimization engine into three integration points:

- CLI: `context` command with 5 subcommands (stats, search, doctor,
  budget, compress) delegating to MCP tools per ADR-005
- MCP: 5 tools (context_stats, context_search, context_doctor,
  context_budget, context_compress) with lazy singleton service bundle
- Hooks: context-bridge adapts PreToolUseHook (budget enforcement),
  PostToolUseHook (output compression), and SubagentRoutingHook
  (batch hints) to HookRegistry

All integration points use optional dependency pattern with try/catch
dynamic imports — system works without @claude-flow/context installed.

Development

Successfully merging this pull request may close these issues.

Context Optimization Engine — 95-98% context window compression for long-running sessions