feat(vis): full session debugging — tasks/cron, execution timeline, retries & tool progress by RealKai42 · Pull Request #1210 · MoonshotAI/kimi-code

RealKai42 · 2026-06-29T14:00:06Z

Make apps/vis surface everything a kimi-code session persists, and turn it from a record viewer into an analysis tool.

What changed

1. Background tasks & cron (`feat(vis): surface background tasks and cron jobs`)

The visualizer read every wire/state/blob artifact but ignored the two on-demand families agent-core writes per session — neither reconstructable from the wire:

- Tasks tab: tasks/<id>.json + output.log (process / agent / question kinds) with status, timing, kind-specific fields, and a progressively paged log viewer (byte-window paging via an exact nextOffset cursor).
- Cron tab: cron/<id>.json (expression, prompt, recurring/one-shot, last-fired).
- Read-only server readers mirror agent-core's on-disk layout, id guards, and legacy task normalization.

2. agent-core: persist step retries and tool progress (`feat(agent-core): …`)

Two transient signals were live-only, so nothing survived for post-hoc analysis. Both are additive optional wire fields — no protocol bump, existing records keep loading:

- step.end.retries: recovered transient provider failures (via a new onRetry callback on chatWithRetry).
- tool.result.progress: a bounded sparse-progress summary (updateCount / lastStatus / maxPercent); streamed stdout/stderr is deliberately excluded.
- New public types: LoopStepRetryRecord, LoopToolProgressSummary. Includes a changeset.

3. vis: execution-analysis timeline (`feat(vis): add execution-analysis timeline …`)

New Timeline tab folds the wire into turns → steps → tool calls client-side and derives what the flat list hides: per-turn/step/tool durations, per-turn token cost, a context-window-fill sparkline + cache-hit rate, tool usage stats, idle-gap detection, and a config-change timeline. Inline: Wire rows show tool call→result elapsed time and result truncation/size/retries/progress; the Issues drawer gains tool-error / truncation / filtered / max_tokens / retried categories; Tasks links agent tasks to the subagent's wire.

Tests

- agent-core full suite: 3248 passed, 0 regressions (core loop touched).
- vis-server: 113 passed (added task/cron lib + route tests).
- vis-web: vitest newly wired up; analysis + issues unit tests.
- Typecheck clean across agent-core / vis-server / vis-web; vis-web builds; lint clean on changed files.

The visualizer read every wire/state/blob artifact a session persists but ignored the two on-demand families agent-core also writes under the session directory: background tasks (tasks/<id>.json + output.log) and cron jobs (cron/<id>.json). Neither is reconstructable from the wire, so there was no way to inspect what a session spawned in the background or scheduled.

Server:
- task-store / cron-store read-only readers mirroring agent-core's on-disk layout, id-validation guard, and legacy snake_case task normalization
- GET /:id/tasks, /:id/tasks/:taskId/output (byte-window paged via an exact nextOffset cursor), and /:id/cron routes
- re-export the public background-task types from agent-core; mirror the non-exported CronTask shape with a fixture-backed drift test

Web:
- Tasks tab: process/agent/question kinds with status, timing, kind-specific fields, raw JSON, and a progressively paged output.log viewer
- Cron tab: expression, prompt, recurring/one-shot, created/last-fired
- count badges on both tabs

Tests: +20 (lib + route), all 113 vis-server tests green; web typecheck and build clean.
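
The byte-window paging behind the task output route can be pictured with a small pure function. This is a minimal sketch under stated assumptions, not the PR's code: `planWindow`, `ByteWindow`, and the clamping rules are hypothetical; only the idea of an exact `nextOffset` cursor comes from the commit message.

```typescript
// Hypothetical shape for one page of output.log; not taken from the PR.
interface ByteWindow {
  start: number; // first byte to read
  length: number; // number of bytes to read
  nextOffset: number | null; // cursor for the next page, null at end of file
}

// Plan which bytes to read for one page, given the file size and a client cursor.
function planWindow(fileSize: number, offset: number, pageSize: number): ByteWindow {
  // Clamp the cursor into [0, fileSize] so a stale or hostile offset cannot overrun.
  const start = Math.min(Math.max(0, Math.floor(offset)), fileSize);
  // Force a page of at least one byte so the cursor always advances.
  const size = Math.max(1, Math.floor(pageSize));
  const length = Math.min(size, fileSize - start);
  const end = start + length;
  return { start, length, nextOffset: end < fileSize ? end : null };
}
```

A client would keep requesting pages with the returned `nextOffset` until it is `null`.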

Two transient signals were only ever emitted as live-only loop events, so nothing survived in the agent record for post-hoc analysis:
- step retries: chatWithRetry gains an onRetry callback; turn-step collects the recovered attempts and attaches them to step.end as an optional `retries` array (previously only the live `step.retrying` event).
- tool progress: tool-call distills a tool's sparse status/percent updates into a bounded `progress` summary (updateCount / lastStatus / maxPercent) on tool.result. Streamed stdout/stderr is excluded: it would bloat the wire and is already reflected in the result output.

Both are additive optional fields, so the wire protocol version is unchanged and existing records keep loading. New public types: LoopStepRetryRecord, LoopToolProgressSummary.
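
To illustrate what "distilling sparse updates into a bounded summary" could look like, here is a minimal sketch. The field names `updateCount`, `lastStatus`, and `maxPercent` come from the commit message; the `ProgressUpdate` input shape and the `summarizeProgress` helper are assumptions and do not reproduce the real `LoopToolProgressSummary` type.

```typescript
// Hypothetical input: one sparse progress update emitted by a tool.
interface ProgressUpdate {
  status?: string;
  percent?: number;
}

// Illustrative summary shape using the three field names from the commit message.
interface ProgressSummary {
  updateCount: number;
  lastStatus?: string;
  maxPercent?: number;
}

// Fold any number of updates into a fixed-size summary; undefined when there were none.
function summarizeProgress(updates: readonly ProgressUpdate[]): ProgressSummary | undefined {
  let lastStatus: string | undefined;
  let maxPercent: number | undefined;
  for (const update of updates) {
    if (typeof update.status === "string") lastStatus = update.status;
    if (typeof update.percent === "number" && isFinite(update.percent)) {
      maxPercent = maxPercent === undefined ? update.percent : Math.max(maxPercent, update.percent);
    }
  }
  if (updates.length === 0) return undefined;
  const summary: ProgressSummary = { updateCount: updates.length };
  if (lastStatus !== undefined) summary.lastStatus = lastStatus;
  if (maxPercent !== undefined) summary.maxPercent = maxPercent;
  return summary;
}
```

The summary stays the same size no matter how many updates a tool emits, which is the property the commit relies on to keep the wire small.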

Turn the debugger from a flat record viewer into an analysis tool.

New Timeline tab: folds the wire into turns → steps → tool calls (client-side, no extra round-trip) and derives the metrics the raw list hides: per-turn / per-step / per-tool duration, per-turn token cost, a context-window fill sparkline with cache-hit rate, a tool usage table, idle-gap detection, and a config-change timeline.

Inline elsewhere:
- Wire rows show tool.call → tool.result elapsed time; tool.result detail shows truncation, output size, retries, and the progress summary.
- Issues drawer gains tool-error, truncation, filtered, max_tokens, and retried categories.
- Tasks tab links agent-kind tasks to the subagent's wire.

Wires up vitest for the web package and adds analysis/issues unit tests.
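
Two of the derived metrics, tool.call → tool.result elapsed time and idle-gap detection, can be sketched as plain folds over the record list. The `WireRecord` shape below (a `ts` in milliseconds and a `toolCallId` pairing key) is an assumption made for illustration; the real wire records and the analysis lib are likely richer.

```typescript
// Hypothetical minimal wire record; real records carry many more fields.
interface WireRecord {
  type: string; // e.g. "tool.call" or "tool.result"
  ts: number; // timestamp in milliseconds (assumed)
  toolCallId?: string; // assumed key pairing a call with its result
}

interface IdleGap {
  afterIndex: number; // index of the record that precedes the gap
  gapMs: number;
}

// Elapsed milliseconds from each tool.call to its matching tool.result.
function toolDurations(records: readonly WireRecord[]): Map<string, number> {
  const startedAt = new Map<string, number>();
  const elapsed = new Map<string, number>();
  for (const record of records) {
    if (record.toolCallId === undefined) continue;
    if (record.type === "tool.call") {
      startedAt.set(record.toolCallId, record.ts);
    } else if (record.type === "tool.result") {
      const start = startedAt.get(record.toolCallId);
      if (start !== undefined) elapsed.set(record.toolCallId, record.ts - start);
    }
  }
  return elapsed;
}

// Gaps between consecutive records that are at least thresholdMs long.
function idleGaps(records: readonly WireRecord[], thresholdMs: number): IdleGap[] {
  const gaps: IdleGap[] = [];
  let previous: WireRecord | undefined;
  let index = 0;
  for (const record of records) {
    if (previous !== undefined) {
      const gapMs = record.ts - previous.ts;
      if (gapMs >= thresholdMs) gaps.push({ afterIndex: index - 1, gapMs });
    }
    previous = record;
    index += 1;
  }
  return gaps;
}
```

Both functions run in a single pass, which fits the "client-side, no extra round-trip" approach described above.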

changeset-bot · 2026-06-29T14:00:11Z

⚠️ No Changeset found

Latest commit: b524fee

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

pkg-pr-new · 2026-06-29T14:01:51Z

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@b524fee

npx https://pkg.pr.new/@moonshot-ai/kimi-code@b524fee

commit: b524fee

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff2d7185ee

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ltering

A `/export-debug-zip` bundle is just `manifest.json` plus a flattened session directory, which vis already knows how to read. Importing one therefore lights up every existing tab for a session that lives on someone else's machine.

Server:
- zip-import: yauzl extraction with zip-slip path guards and entry-count / uncompressed-size caps for untrusted uploads.
- import-store: extract a bundle into <home>/imported/<imp_…>/, validate it has a main wire, and record an import-meta.json sidecar.
- session-store resolves imp_-prefixed ids against imported/, so wire / context / tasks / cron / blobs / logs all work on imported sessions; agent homedirs are re-derived locally (the bundle holds foreign absolute paths).
- POST /api/imports (raw zip body) and GET /api/sessions/:id/logs (structured log lines, also available for local sessions).

Web:
- session rail: import button + all/local/imported filter + imported badge.
- new Logs tab: virtualized, level filter, search, session/global toggle.
- manifest card atop the State tab for imported sessions.

SessionSummary/SessionDetail gain `imported` + `importMeta`.

Tests cover extraction, the zip-slip guard, list merge, reading an imported wire through the existing route, and log parsing.
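
A zip-slip guard boils down to refusing any entry whose name would resolve outside the extraction root. The following is a minimal, dependency-free sketch of that check; `safeEntrySegments` is a hypothetical helper, and it does not use or reproduce the yauzl-based code in this PR.

```typescript
// Normalize an archive entry name into safe path segments, or return null
// when the entry would land outside the extraction root.
function safeEntrySegments(entryName: string): string[] | null {
  // Zip names normally use "/", but a hostile archive may use "\" as well.
  const name = entryName.replace(/\\/g, "/");
  // Reject absolute paths, Windows drive prefixes, and embedded NUL bytes.
  if (name[0] === "/" || /^[A-Za-z]:/.test(name) || name.indexOf("\0") !== -1) return null;
  const segments: string[] = [];
  for (const part of name.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      // Climbing above the root is the zip-slip case.
      if (segments.length === 0) return null;
      segments.pop();
      continue;
    }
    segments.push(part);
  }
  return segments;
}
```

An extractor would join the returned segments onto its own destination directory and skip or reject any entry for which the result is `null`.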

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2bb23c4b1f

…l status text

Addresses review feedback on the debug-tooling changes:
- Background tasks and cron jobs are persisted under each agent's homedir (<session>/agents/<id>/tasks and /cron), not the session root. The Tasks and Cron tabs read the session root, so they showed nothing for normal sessions. Both routes now aggregate across detail.agents homedirs; task entries carry the owning agentId. The route-test fixtures were writing to the wrong (session-root) location too, and are now corrected to the real agents/main layout so they actually exercise the path.
- tool.result progress no longer keeps free-form status text, only updateCount and maxPercent. A tool's status string can contain sensitive data (e.g. an MCP OAuth authorization URL) that must not leak into persisted wire files or exported debug bundles.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3d558a3a8

The <main> flex child lacked min-w-0, so it defaulted to min-width:auto and refused to shrink below its content's intrinsic width. Tabs that lay out in normal flow with flex-wrap rows (the Timeline tab) then got unbounded width, never wrapped, and blew the layout out to thousands of pixels wide. Adding min-w-0 lets the column shrink to the available width so its content wraps, truncates, or scrolls within its own container.

# Conflicts: # packages/agent-core/src/loop/tool-call.ts

…back

- Logs tab: for non-imported sessions the shared global log lives at <KIMI_CODE_HOME>/logs/kimi-code.log, not under the session dir (that path is only used inside exported bundles). The route now reads the home path for local sessions, so the global-log toggle works for them.
- Imported detail: a bundle's state.json is best-effort and may omit the agents map. When the inventory is empty, fall back to discovering agents from disk so routes that require an agent (wire/context) still resolve main.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 162708e90e

…nput

An imported debug zip is untrusted, so a syntactically valid but type-corrupt file could crash whole views:
- manifest.json: a non-string field (e.g. workspaceDir: 123) flowed into SessionSummary.workDir, where the session rail calls .split('/') and crashed the entire list. readManifest/readImportMeta now sanitize declared string fields, keeping only strings.
- task JSON: a record that passed the shape guard but held a non-string legacy field (e.g. stop_reason: 5) threw in normalization, failing GET /tasks with a 500 and hiding all of a session's tasks. optionalNonEmptyString now tolerates non-strings, and listBackgroundTasks skips any record that still fails to normalize, honouring the reader's documented silently-skips contract.
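
The "keep only strings" sanitization can be sketched as a small filter over parsed JSON. `pickStrings` is a hypothetical helper written for illustration, and the `title` key in the usage note is made up; only `workspaceDir` appears in the commit message.

```typescript
// Copy only the listed keys whose values are actually strings.
// Anything else (numbers, null, objects, missing keys) is dropped.
function pickStrings(raw: unknown, keys: readonly string[]): Record<string, string> {
  const out: Record<string, string> = {};
  if (typeof raw !== "object" || raw === null) return out;
  const source = raw as Record<string, unknown>;
  for (const key of keys) {
    const value = source[key];
    if (typeof value === "string") out[key] = value;
  }
  return out;
}
```

With this in place, a manifest such as `{"workspaceDir": 123, "title": "demo"}` yields an object without `workspaceDir`, so a later `.split('/')` can be guarded by a simple presence check instead of throwing.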

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a56d7b90a

chatgpt-codex-connector · 2026-06-30T02:52:37Z

+
+    // A debug bundle must contain a main wire; without it there is nothing to
+    // visualize. `state.json` / `manifest.json` are best-effort.
+    const hasMainWire = await pathExists(join(dir, 'agents', 'main', 'wire.jsonl'));


Support root-wire debug bundles

When an imported /export-debug-zip comes from a session that stores the main transcript as root wire.jsonl rather than agents/main/wire.jsonl (the export path still includes arbitrary session files, and packages/node-sdk/test/export-session.test.ts covers root-wire bundles), this validation rejects the upload as not a session bundle. That prevents users from inspecting valid older/debug exports; accept root wire.jsonl and synthesize it as the main agent when the agent-scoped wire is absent.
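
The fallback suggested in this comment amounts to an ordered preference between two locations. A minimal sketch, assuming the bundle's file list is already available as relative paths (`pickMainWire` is a hypothetical name, and whether the PR adopted this exact approach is not stated here):

```typescript
// Prefer the agent-scoped main wire; fall back to a root-level wire.jsonl.
function pickMainWire(entries: readonly string[]): string | null {
  const candidates = ["agents/main/wire.jsonl", "wire.jsonl"];
  for (const candidate of candidates) {
    if (entries.indexOf(candidate) !== -1) return candidate;
  }
  return null; // not a session bundle
}
```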

- Logs tab: the diagnostic log can rotate (kimi-code.log.1, .2, …) and an exported bundle may contain only the archives. The route now discovers the active file plus its rotated siblings and concatenates them oldest-first, so a rotated-away log still surfaces (covered by node-sdk's rotated-export case).
- agent-core: a tool that reported sparse progress and then threw lost its progress summary, because the catch path built the error tool.result without it. Thread progressSummary through that path too, matching the success and malformed-return paths.
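
Ordering rotated siblings oldest-first needs a numeric sort on the suffix, since a plain string sort would place `.10` before `.2`. The sketch below assumes the common convention that a higher suffix means an older file; `orderRotatedLogs` is a hypothetical helper, not the route's actual code.

```typescript
// Return the active log and its numbered archives, oldest first.
// Assumption: kimi-code.log.2 is older than kimi-code.log.1, and the
// unsuffixed file is the newest.
function orderRotatedLogs(names: readonly string[], base: string = "kimi-code.log"): string[] {
  const ranked: { name: string; index: number }[] = [];
  for (const name of names) {
    if (name === base) {
      ranked.push({ name, index: 0 });
      continue;
    }
    const prefix = base + ".";
    if (name.slice(0, prefix.length) !== prefix) continue;
    const suffix = name.slice(prefix.length);
    // Accept only plain positive integers as rotation suffixes.
    if (/^[1-9][0-9]*$/.test(suffix)) ranked.push({ name, index: Number(suffix) });
  }
  ranked.sort((a, b) => b.index - a.index);
  return ranked.map((entry) => entry.name);
}
```

Because the active file is optional in the input, a bundle that contains only archives still produces a usable list.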

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d6c37d58e

readImportedDetail's empty-inventory fallback never ran when a bundle's state.json had a non-empty but type-corrupt agents map (e.g. `{ "agents": { "main": null } }`): inventoryAgents dereferenced the null entry and threw, so readSessionDetail returned 500 instead of recovering main from the on-disk agents/main/wire.jsonl. inventoryAgents now skips non-object entries, letting the disk-discovery fallback take over.
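
The fix described here is to treat every entry of the agents map as untrusted. A minimal sketch of that idea follows; `listAgents`, `AgentRef`, and the `homedir` field are assumptions for illustration and are not the real `inventoryAgents` implementation.

```typescript
// Hypothetical minimal agent descriptor.
interface AgentRef {
  id: string;
  homedir?: string;
}

// Read an untrusted state object and keep only agents whose entry is a real object.
function listAgents(state: unknown): AgentRef[] {
  if (typeof state !== "object" || state === null) return [];
  const agents = (state as Record<string, unknown>)["agents"];
  if (typeof agents !== "object" || agents === null || Array.isArray(agents)) return [];
  const map = agents as Record<string, unknown>;
  const out: AgentRef[] = [];
  for (const id of Object.keys(map)) {
    const entry = map[id];
    // Skip type-corrupt entries such as { "main": null } instead of throwing.
    if (typeof entry !== "object" || entry === null) continue;
    const homedir = (entry as Record<string, unknown>)["homedir"];
    out.push(typeof homedir === "string" ? { id, homedir } : { id });
  }
  return out;
}
```

Returning an empty list for a fully corrupt map is what lets an "inventory is empty" fallback take over.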

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4b5d1fa9a

… zero-usage steps

- Timeline tab kept the previously-selected agent id across session navigation, so a subagent selection would 404 against the next session. Reset it to main on sessionId change, mirroring WireTab/ContextTab.
- A zero-usage step.end (e.g. a content-filtered response) reset the context-window fill to 0, pushing a false drop into the Timeline chart and the Context tab. agent-core's ContextMemory keeps the prior count in that case; the analysis lib and the context projector now do the same.
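
The "keep the prior count on a zero-usage step" rule is a carry-forward over the per-step series. A minimal sketch, assuming each step.end has already been reduced to a single context-token number (that reduction, and the name `contextFillSeries`, are hypothetical):

```typescript
// Build the context-fill series, carrying the last known value across
// steps that report zero usage (for example a content-filtered response).
function contextFillSeries(contextTokensPerStep: readonly number[]): number[] {
  const series: number[] = [];
  let last = 0;
  for (const tokens of contextTokensPerStep) {
    if (tokens > 0) last = tokens;
    series.push(last);
  }
  return series;
}
```

For an input of `[1200, 0, 1800]` the middle point stays at 1200 instead of dropping to 0, so the chart shows no false dip.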

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d29b02f915

chatgpt-codex-connector · 2026-06-30T04:23:21Z

+    const name = c.req.query('name') ?? null;
+    let buffer: Buffer;
+    try {
+      buffer = Buffer.from(await c.req.arrayBuffer());


Enforce zip upload limit before buffering

When a client uploads with no Content-Length (for example chunked transfer) or a falsely small length, this buffers the entire request before the MAX_ZIP_BYTES check runs. A bad /api/imports request can therefore allocate an arbitrarily large ArrayBuffer and hang or OOM the vis server even though the route intends to cap uploads; stream the body and abort once the byte limit is crossed.
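
The streaming approach suggested here can be sketched as an accumulator that refuses to grow past the cap, fed chunk by chunk. This is an illustration only: `CappedBuffer` and `readCapped` are hypothetical, and the body is modelled as a generic `AsyncIterable<Uint8Array>` rather than the framework's request object.

```typescript
// Accumulates chunks and throws as soon as the running total would exceed the cap.
class CappedBuffer {
  private readonly maxBytes: number;
  private readonly chunks: Uint8Array[] = [];
  private total = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  push(chunk: Uint8Array): void {
    if (this.total + chunk.byteLength > this.maxBytes) {
      throw new Error("upload exceeds " + this.maxBytes + " bytes");
    }
    this.chunks.push(chunk);
    this.total += chunk.byteLength;
  }

  concat(): Uint8Array {
    const out = new Uint8Array(this.total);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return out;
  }
}

// Drain a body stream, aborting at the first chunk that crosses the limit.
async function readCapped(body: AsyncIterable<Uint8Array>, maxBytes: number): Promise<Uint8Array> {
  const buffer = new CappedBuffer(maxBytes);
  for await (const chunk of body) buffer.push(chunk);
  return buffer.concat();
}
```

Memory use is bounded by the cap plus one chunk, regardless of what Content-Length the client claims.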

chatgpt-codex-connector · 2026-06-30T04:23:21Z

+    const buffer = Buffer.allocUnsafe(length);
+    const { bytesRead } = await handle.read(buffer, 0, length, start);
+    const content = buffer.toString('utf-8', 0, bytesRead);


Preserve UTF-8 characters across task log pages

When a large output.log page boundary falls inside a multi-byte UTF-8 character, decoding each byte window independently replaces the split bytes with �; the next page starts after the remaining bytes, so the concatenated log shown in the Tasks tab is permanently corrupted. This is likely for large non-ASCII logs with the default 256 KiB page size; align windows to character boundaries or carry decoder state between pages.
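
One way to align windows to character boundaries is to trim a trailing incomplete UTF-8 sequence from each page and start the next page at the trimmed position. The helper below is a minimal sketch of that idea (`completeUtf8Length` is a hypothetical name); it assumes the page is much larger than four bytes, as with a 256 KiB page.

```typescript
// Return how many leading bytes of `page` end on a complete UTF-8 character.
// A trailing partial character is excluded so the next page can start there.
function completeUtf8Length(page: Uint8Array): number {
  const n = page.length;
  // The lead byte of the final character is at most 4 bytes from the end.
  for (let back = 1; back <= Math.min(4, n); back++) {
    const byte = page[n - back] ?? 0;
    if ((byte & 0xc0) === 0x80) continue; // continuation byte, keep looking
    let expected = 1; // ASCII or an invalid lead byte
    if ((byte & 0xe0) === 0xc0) expected = 2;
    else if ((byte & 0xf0) === 0xe0) expected = 3;
    else if ((byte & 0xf8) === 0xf0) expected = 4;
    // If fewer bytes are present than the lead byte announces, cut before it.
    return back < expected ? n - back : n;
  }
  return n; // empty page, or malformed input with no lead byte in range
}
```

The cursor for the next page would then be `start + completeUtf8Length(page)` rather than `start + bytesRead`, so no character is ever decoded in two halves.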

These were the only changes in this branch that touched agent-core. They persisted two previously live-only signals (step retries, tool progress) to the wire purely so the visualizer could display them — marginal features that did not justify modifying the core loop or extending the wire surface. Reverts the agent-core loop/type/export changes (restored to main, keeping #1209) and its changeset, and removes the vis-side rendering and types that consumed step.end.retries / tool.result.progress. The rest of vis is unchanged and reads only data agent-core already persists.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b524fee753

chatgpt-codex-connector · 2026-06-30T04:43:39Z

+  if (agents.length === 0) {
+    agents = await discoverAgentsFromDisk(sessionDir);


Merge disk-discovered agents for partial imports

For an imported zip whose readable state.json.agents contains some stale entry but omits main, the import still succeeds because agents/main/wire.jsonl is present, but this condition skips disk discovery and detail.agents lacks main; the default Wire/Timeline requests then 404 for an otherwise valid bundle. The current fallback only runs when inventoryAgents returns an empty array, so partial imported inventories still need to be merged with disk discovery.
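
Merging instead of falling back can be sketched as a union keyed by agent id, with inventory entries taking precedence. `mergeAgents` and the minimal `AgentRef` shape are assumptions for illustration; the precedence rule in particular is a design choice, not something the comment specifies.

```typescript
// Hypothetical minimal agent descriptor.
interface AgentRef {
  id: string;
}

// Union of inventory and disk-discovered agents; the first occurrence of an id wins,
// so entries from the inventory take precedence over discovered ones.
function mergeAgents(inventory: readonly AgentRef[], discovered: readonly AgentRef[]): AgentRef[] {
  const seen = new Set<string>();
  const merged: AgentRef[] = [];
  for (const agent of inventory.concat(discovered)) {
    if (seen.has(agent.id)) continue;
    seen.add(agent.id);
    merged.push(agent);
  }
  return merged;
}
```

With a stale inventory entry and `main` present only on disk, the merged list contains both, so default requests against `main` resolve.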

RealKai42 added 3 commits June 29, 2026 20:48

chatgpt-codex-connector Bot reviewed Jun 29, 2026

Comment thread apps/vis/server/src/routes/tasks.ts Outdated

Comment thread apps/vis/server/src/routes/cron.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 29, 2026

Comment thread packages/agent-core/src/loop/tool-call.ts Outdated

Comment thread apps/vis/server/src/lib/log-reader.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 29, 2026

Comment thread apps/vis/server/src/routes/logs.ts Outdated

Comment thread apps/vis/server/src/lib/session-store.ts Outdated

RealKai42 added 3 commits June 29, 2026 23:44

Merge remote-tracking branch 'origin/main' into kaiyi/lagos-v1

414d8d3

# Conflicts: # packages/agent-core/src/loop/tool-call.ts

chatgpt-codex-connector Bot reviewed Jun 29, 2026

Comment thread apps/vis/server/src/lib/import-store.ts Outdated

Comment thread apps/vis/server/src/lib/task-store.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread apps/vis/server/src/lib/session-store.ts

chatgpt-codex-connector Bot reviewed Jun 30, 2026

Comment thread apps/vis/web/src/components/analysis/TimelineTab.tsx

Comment thread apps/vis/web/src/lib/analysis.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 30, 2026

RealKai42 merged commit 525fb14 into main Jun 30, 2026
9 checks passed

RealKai42 deleted the kaiyi/lagos-v1 branch June 30, 2026 05:40
