Compaction

Compaction keeps the conversation within the model's context window by summarizing older messages. In edge-pi, compaction orchestration is part of CodingAgent.

How It Works

1. Token estimation: Messages are measured using a chars / 4 heuristic (conservative, tends to overestimate).
2. Threshold check: When the estimated token count exceeds the context window minus reserveTokens, compaction triggers.
3. Cut point: A split point is chosen so that roughly the most recent keepRecentTokens worth of messages stay intact; everything before it is summarized.
4. Summarization: The model summarizes the older messages into a structured format covering goals, progress, decisions, and next steps.
5. File tracking: Read and modified files are extracted from tool calls and preserved in the summary.
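
The threshold check and cut point steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not edge-pi's actual implementation: the `Msg` type and the `estimate`, `shouldCompact`, and `findCutPoint` functions are hypothetical names, and the real cut-point logic may apply extra rules (for example, around tool-call boundaries).

```typescript
// Hypothetical message shape, for illustration only.
type Msg = { role: "user" | "assistant" | "tool"; content: string };

// chars / 4 heuristic, rounded up.
const estimate = (m: Msg): number => Math.ceil(m.content.length / 4);

// Threshold check: compact once the estimate exceeds the window minus the reserve.
function shouldCompact(msgs: Msg[], contextWindow: number, reserveTokens: number): boolean {
  const total = msgs.reduce((sum, m) => sum + estimate(m), 0);
  return total > contextWindow - reserveTokens;
}

// Cut point: walk backwards, accumulating recent messages until roughly
// keepRecentTokens are covered. Messages before the returned index are
// the ones that would be summarized.
function findCutPoint(msgs: Msg[], keepRecentTokens: number): number {
  let kept = 0;
  for (let i = msgs.length - 1; i >= 0; i--) {
    kept += estimate(msgs[i]);
    if (kept >= keepRecentTokens) return i;
  }
  return 0; // everything fits in the "recent" budget; nothing to summarize
}
```

In this sketch, `msgs.slice(0, cut)` would go to the summarizer and `msgs.slice(cut)` would be kept verbatim.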

Settings

// Defaults used by CodingAgent compaction:
// reserveTokens: 16384
// keepRecentTokens: 20000

Setting	Type	Default	Description
`reserveTokens`	number	`16384`	Tokens to reserve for the model's response.
`keepRecentTokens`	number	`20000`	Recent tokens to keep uncompacted.
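
As a worked example of how the two defaults interact, the arithmetic below assumes a 200,000-token context window and assumes the trigger threshold is simply the context window minus `reserveTokens`, as described in How It Works. The variable names are illustrative and not part of the edge-pi API.

```typescript
// Illustrative arithmetic only; these names are not edge-pi exports.
const contextWindow = 200_000;
const reserveTokens = 16_384;
const keepRecentTokens = 20_000;

// Compaction triggers once the estimated context exceeds this value.
const threshold = contextWindow - reserveTokens; // 183_616

// At the trigger point, roughly this many tokens of older messages are
// candidates for summarization; the most recent ~20_000 stay uncompacted.
const summarizable = threshold - keepRecentTokens; // 163_616

console.log({ threshold, summarizable });
```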

CodingAgent Auto-Compaction

Compaction orchestration is built into CodingAgent. When configured with mode: "auto", the agent checks whether compaction is needed after each generate() and stream() call.

import { CodingAgent, SessionManager } from "edge-pi";

const sessionManager = SessionManager.create(process.cwd());

const agent = new CodingAgent({
  model, // language model instance, assumed to be created elsewhere
  sessionManager,
  compaction: {
    contextWindow: 200_000,
    mode: "auto",
    settings: {
      reserveTokens: 16_384,
      keepRecentTokens: 20_000,
    },
    onCompactionComplete: (result) => {
      // tokensBefore reports the context size before compaction ran
      console.log("Compacted; context was", result.tokensBefore, "tokens before");
    },
  },
});

await agent.generate({ prompt: "Refactor auth" });

Manual Mode and Runtime Toggle

const agent = new CodingAgent({
  model,
  sessionManager,
  compaction: { contextWindow: 200_000, mode: "manual" },
});

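// In manual mode, compaction runs only when requested explicitly.
await agent.compact();

// The mode can be switched at runtime by passing an updated config.
if (agent.compaction) {
  agent.setCompaction({ ...agent.compaction, mode: "auto" }); // enable auto-compaction
  agent.setCompaction({ ...agent.compaction, mode: "manual" }); // switch back to manual
}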

What's Public API

The public compaction API consists of the compaction config in CodingAgentConfig, agent.compact(), and agent.setCompaction(). Low-level compaction helpers are internal and not exported from the package root.

Summary Format

The generated summary follows a structured format:

## Goal
What the user is trying to accomplish.

## Progress
What has been done so far.

## Key Decisions
Important choices and their rationale.

## Next Steps
What remains to be done.

## Files
- Read: src/index.ts, src/utils.ts
- Modified: src/parser.ts, tests/parser.test.ts
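
The Files section comes from the file tracking step. A minimal sketch of how such lists could be derived from tool calls and rendered is shown below. The `ToolCall` shape, the tool names `read`, `edit`, and `write`, and both function names are assumptions made for illustration; edge-pi's real tool names and extraction logic may differ.

```typescript
// Hypothetical tool-call shape; not edge-pi's actual types.
type ToolCall = { toolName: string; args: { path?: string } };

type TrackedFiles = { read: string[]; modified: string[] };

// Collect unique file paths, split by whether the (assumed) tool reads or writes.
function collectFiles(calls: ToolCall[]): TrackedFiles {
  const read = new Set<string>();
  const modified = new Set<string>();
  for (const call of calls) {
    const path = call.args.path;
    if (!path) continue; // tool call without a file path
    if (call.toolName === "read") read.add(path);
    if (call.toolName === "edit" || call.toolName === "write") modified.add(path);
  }
  return { read: Array.from(read), modified: Array.from(modified) };
}

// Render the lists in the same shape as the Files section of the summary.
function renderFilesSection(files: TrackedFiles): string {
  return [
    "## Files",
    `- Read: ${files.read.join(", ")}`,
    `- Modified: ${files.modified.join(", ")}`,
  ].join("\n");
}
```

Using sets keeps each path listed once even when a file is touched by several tool calls.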

Token Estimation

import { estimateTokens, estimateContextTokens } from "edge-pi";

// Estimate tokens for a single message
const tokens = estimateTokens(message);

// Estimate total tokens for all messages
const total = estimateContextTokens(messages);

Token estimation uses a chars / 4 heuristic. It is conservative (it tends to overestimate), but it is fast and needs no tokenizer, so it works across all models.

Branch Summarization

When branching to a new conversation path, edge-pi can summarize the abandoned branch to preserve context.

import {
  collectEntriesForBranchSummary,
  generateBranchSummary,
} from "edge-pi";

// Collect entries that will be abandoned
const { entries } = collectEntriesForBranchSummary(
  session,
  oldLeafId,
  targetBranchId
);

// Generate a summary (abortController is an AbortController owned by the caller)
const result = await generateBranchSummary(entries, {
  model,
  signal: abortController.signal,
});

// Store the branch summary
session.branchWithSummary(targetBranchId, result.summary, {
  readFiles: result.readFiles,
  modifiedFiles: result.modifiedFiles,
});
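
To make "abandoned branch" concrete, the sketch below shows one way the entries to summarize could be found in a session tree: everything on the path from the old leaf up to, but not including, the first entry shared with the target. The `Entry` type and `collectAbandoned` function are hypothetical and only illustrate the idea; they are not how `collectEntriesForBranchSummary` is necessarily implemented.

```typescript
// Hypothetical session-tree entry; not edge-pi's actual data model.
type Entry = { id: string; parentId: string | null };

function collectAbandoned(
  entries: Map<string, Entry>,
  oldLeafId: string,
  targetId: string
): Entry[] {
  const parentOf = (id: string): string | null => entries.get(id)?.parentId ?? null;

  // Gather the target and all of its ancestors: the path that stays active.
  const shared = new Set<string>();
  let cursor: string | null = targetId;
  while (cursor !== null) {
    shared.add(cursor);
    cursor = parentOf(cursor);
  }

  // Walk up from the old leaf until reaching the shared path.
  const abandoned: Entry[] = [];
  cursor = oldLeafId;
  while (cursor !== null && !shared.has(cursor)) {
    const entry = entries.get(cursor);
    if (!entry) break; // unknown id: stop rather than guess
    abandoned.push(entry);
    cursor = entry.parentId;
  }
  return abandoned.reverse(); // oldest first
}
```

For a tree a → b → c → d with a second branch b → e, moving from leaf d to target e would leave c and d to be summarized.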