feat(agent-core): rework compaction to keep only user prompts and summary#1192

Open
7Sageer wants to merge 14 commits into MoonshotAI:main from 7Sageer:codex-compaction

Conversation

7Sageer (Collaborator) commented Jun 29, 2026

Related Issue

No prior issue. This reworks how conversation compaction rebuilds the model context.

Problem

The previous compaction strategy kept a mixed history (user prompts, assistant messages, tool calls, and tool results) and could layer multiple compaction summaries over time. This made the post-compaction context hard to reason about, wasted tokens on tool exchanges that no longer matter, and produced inconsistent results between the live context rewrite and the transcript reducer.

What changed

  • Compact the whole history into the most recent real user prompts (verbatim, within a 20k token budget) followed by a single user-role summary prefixed with COMPACTION_SUMMARY_PREFIX.
  • Drop assistant messages, tool calls, tool results, and deferred injections (system/plan-mode reminders, background-task notifications, cron/hook/retry messages, and prior compaction summaries) on compaction, since the initial context is rebuilt every turn.
  • Share one set of compaction helpers (memento.ts) between the live context rewrite and the transcript reducer so both apply the exact same rule.
  • Trigger auto-compaction at 90% of the resolved context window, with blockRatio equal to triggerRatio so compaction runs synchronously.
  • Replace the compaction prompt with a dedicated summarization prompt.
  • Add memento.test.ts and update compaction, context, and transcript tests to cover the new behavior.
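The fold described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in memento.ts: the `Message` shape, the `isRealUserInput` flag, the `estimateTokens` heuristic, and the value of `COMPACTION_SUMMARY_PREFIX` are all assumptions made for the example. It also stops at the first prompt that does not fit the budget, whereas the PR may truncate that prompt instead.

```typescript
// Hypothetical message shape for illustration; the real types live in agent-core.
type Role = "user" | "assistant" | "tool";

interface Message {
  role: Role;
  text: string;
  // Assumed flag: true only for prompts typed by the user, not for
  // reminders, notifications, or prior compaction summaries.
  isRealUserInput?: boolean;
}

// Assumed value; the PR only names the constant.
const COMPACTION_SUMMARY_PREFIX = "[compaction summary]\n";
const KEPT_USER_TOKEN_BUDGET = 20_000;

// Crude stand-in estimator (about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Rebuild the history as [...keptUserMessages, summary].
function foldHistory(
  history: Message[],
  summary: string,
  budget: number = KEPT_USER_TOKEN_BUDGET,
): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk newest to oldest so the most recent prompts win the budget.
  for (let i = history.length - 1; i >= 0; i--) {
    const msg = history[i];
    if (!msg || msg.role !== "user" || !msg.isRealUserInput) continue;
    const cost = estimateTokens(msg.text);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(msg); // restore chronological order
  }
  // Assistant messages, tool exchanges, and injections are dropped;
  // a single user-role summary closes the folded history.
  return [...kept, { role: "user", text: COMPACTION_SUMMARY_PREFIX + summary }];
}
```

Because both the live context rewrite and the transcript reducer would call one helper like this, the two paths cannot disagree about which messages survive.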

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

…mary

Compact the whole history, keeping only real user prompts within a 20k token budget followed by a user-role summary prefixed with SUMMARY_PREFIX. Replace the compaction prompt with SUMMARIZATION_PROMPT, trigger auto-compaction at 90% of the context window, and drop assistant/tool messages and deferred injections on compaction.

changeset-bot Bot commented Jun 29, 2026

🦋 Changeset detected

Latest commit: 5a0b3ca

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| @moonshot-ai/kimi-code | Minor |

# Conflicts:
#	packages/agent-core/src/agent/compaction/full.ts
#	packages/agent-core/src/agent/context/index.ts
#	packages/agent-core/test/agent/compaction/full.test.ts

pkg-pr-new Bot commented Jun 29, 2026

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@5a0b3ca
npx https://pkg.pr.new/@moonshot-ai/kimi-code@5a0b3ca

commit: 5a0b3ca

7Sageer added 12 commits June 29, 2026 15:46
- Revert auto-compaction trigger/block ratio to 0.85

- Rewrite truncateTextToTokens as a single-pass O(n) scan so CJK inputs do not freeze compaction

- Mirror pending tool-exchange and deferred cleanup in the wire transcript reducer

- Append the todo list to the compaction summary again

- Restore the no-tools guard in the compaction prompt
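As an illustration of the single-pass truncation mentioned in the list above, here is a minimal sketch. It is not the PR's truncateTextToTokens: the function name `truncateTextToTokensSketch`, the `charCost` heuristic, and its threshold are assumptions chosen for the example. The point is that each character is visited once and the running cost is accumulated, so CJK-heavy input costs O(n) rather than repeatedly re-estimating shrinking prefixes.

```typescript
// Assumed per-character cost: code points from the CJK radicals block upward
// count as about one token each, everything else as about a quarter token.
function charCost(codePoint: number): number {
  return codePoint >= 0x2e80 ? 1 : 0.25;
}

// Return the longest prefix of `text` whose estimated cost fits `maxTokens`.
function truncateTextToTokensSketch(text: string, maxTokens: number): string {
  let used = 0;
  let end = 0; // length of the kept prefix in UTF-16 code units
  // for...of iterates by code point, so surrogate pairs are never split.
  for (const ch of text) {
    const cost = charCost(ch.codePointAt(0) ?? 0);
    if (used + cost > maxTokens) break;
    used += cost;
    end += ch.length;
  }
  return text.slice(0, end);
}
```

A real estimator would likely use the tokenizer or a tuned heuristic; only the single-scan structure is the subject here.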

Re-render the cached system prompt with fresh runtime context (cwd listing, AGENTS.md, additional-dirs info, skill list) once compaction finishes, so post-compaction turns do not keep the bootstrap snapshot.

Cache the active profile on the Agent and expose refreshSystemPrompt(); FullCompaction invokes it after applyCompaction. This intentionally invalidates the prompt-cache prefix.

- Record keptUserMessageCount on the wire so transcript replay reproduces
  the live folded length after truncation.
- Flush steered messages after compaction so notifications land in the
  post-compaction context instead of being dropped.
- Unify real-user-input detection across context, transcript, and vis.
- Reset injector state correctly after compaction.
- Make the overflow compaction retry cap configurable.
- Sync the vis context projector to the kept-users-plus-summary shape.

applyCompaction now preserves the persisted tokensAfter and
keptUserMessageCount when replaying a compaction record during resume,
so restored bookkeeping matches the wire record instead of being
re-derived from replayed history (which can drift when token estimation
changes, and breaks replay projections that assert the recorded values).
Live compaction still derives both values from the current history.
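
The replay-versus-live distinction above can be sketched as a small resolver. The names `CompactionRecord`, `Bookkeeping`, and `resolveBookkeeping` are hypothetical, and falling back to derived values when a record lacks the persisted fields is an assumption of this example, not something the PR states.

```typescript
// Hypothetical wire record; only tokensAfter and keptUserMessageCount
// are named in the PR.
interface CompactionRecord {
  summary: string;
  tokensAfter?: number;
  keptUserMessageCount?: number;
}

interface Bookkeeping {
  tokensAfter: number;
  keptUserMessageCount: number;
}

// On replay, trust the persisted values so restored state matches the wire
// record; during live compaction, use values derived from current history.
function resolveBookkeeping(
  record: CompactionRecord,
  derived: Bookkeeping,
  mode: "live" | "replay",
): Bookkeeping {
  if (mode === "live") return derived;
  return {
    // Assumed fallback for records written before these fields existed.
    tokensAfter: record.tokensAfter ?? derived.tokensAfter,
    keptUserMessageCount: record.keptUserMessageCount ?? derived.keptUserMessageCount,
  };
}
```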

Update the affected compaction, resume, and replay-range tests.

Resolve conflicts in compaction telemetry: adopt the snake_case telemetry keys from main (MoonshotAI#1196) while keeping this branch's single-round compaction design that retains user messages. 'round' is hard-coded to 1 since this branch compacts in a single round; affected test assertions are updated to match the snake_case keys and this branch's token counts.

…o types

- Reuse the shared isRealUserInput helper in ContextMemory.undo and SessionService.canUndoHistory instead of two local copies.

- Sync the wire-transcript header comment with the new post-compaction shape ([...keptUserMessages, compaction_summary]).

- Tighten memento.ts types by using kosong ContentPart and widening estimateTokensForMessage to a structural subset, dropping the `as never` cast.

Make the keep/drop decision for user-role messages explicit in the compaction memento helpers and cover every PromptOrigin kind. Keep Codex-style semantics: only real user prompts and user-slash skill activations survive compaction; other user-role messages are either re-injected or ephemeral. Add parity coverage across live context, transcript, and vis projector tests.

…pic adapter

Strict Anthropic-compatible backends reject consecutive user messages with
HTTP 400, so the adapter collapses them — but a plain-text user turn and an
adjacent tool-result user message carry different semantics and must stay
separate. Merge plain-text with plain-text (collapsing the post-compaction
run of kept prompts + user-role summary + reminders) and tool-result with
tool-result (parallel-tool-use spec), but not across the two kinds.
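
The merge rule above can be sketched as follows. This is a simplified illustration rather than the adapter's code: the `Part` and `WireMessage` shapes and the names `kindOf` and `mergeAdjacentUserMessages` are assumptions. Adjacent user messages are merged only when both are purely plain text or both are purely tool results.

```typescript
// Hypothetical content parts; the real adapter works on its own message types.
type Part =
  | { kind: "text"; text: string }
  | { kind: "tool_result"; toolUseId: string; output: string };

interface WireMessage {
  role: "user" | "assistant";
  content: Part[];
}

// Classify a message for merging; anything that is not a pure user
// text or pure user tool-result message is never merged.
function kindOf(msg: WireMessage): "text" | "tool_result" | "other" {
  if (msg.role !== "user" || msg.content.length === 0) return "other";
  if (msg.content.every((p) => p.kind === "text")) return "text";
  if (msg.content.every((p) => p.kind === "tool_result")) return "tool_result";
  return "other";
}

function mergeAdjacentUserMessages(messages: WireMessage[]): WireMessage[] {
  const out: WireMessage[] = [];
  for (const msg of messages) {
    const prev = out[out.length - 1];
    const kind = kindOf(msg);
    if (prev && kind !== "other" && kindOf(prev) === kind) {
      // Same kind on both sides: collapse into one user message.
      out[out.length - 1] = { role: "user", content: [...prev.content, ...msg.content] };
    } else {
      // Different kinds (or a non-user message): keep the boundary.
      out.push({ ...msg, content: [...msg.content] });
    }
  }
  return out;
}
```

Under this rule the post-compaction run of kept prompts, summary, and reminders collapses into one user message, while a following tool-result message remains its own message.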