fix: increase DEFAULT_CHUNK_TIMEOUT from 2min to 5min #844

Open
altimate-harness-bot[bot] wants to merge 1 commit into main from fix/increase-sse-chunk-timeout

@altimate-harness-bot (bot) commented May 26, 2026

Problem

The wrapSSE() function in provider.ts enforces a per-chunk timeout: if no SSE chunk arrives within the window, it aborts the stream with Error("SSE read timed out"). The previous 120s window was too tight for slow LLM providers (e.g. large reasoning models, cold starts) during multi-turn tool-call sessions.
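
A minimal sketch of the per-chunk timeout idea described above, for readers without the file open. It is not the actual wrapSSE() implementation: the function name `wrapWithChunkTimeout` and its parameters are hypothetical, and the real code may structure the timer differently. The point it illustrates is that the timer restarts for every chunk, so the limit applies to the gap between chunks, not to the whole response.

```typescript
// Hypothetical sketch; the real wrapSSE() in provider.ts may differ.
const DEFAULT_CHUNK_TIMEOUT = 300_000 // 5 minutes, the value proposed in this PR

function wrapWithChunkTimeout<T>(
  source: ReadableStream<T>,
  timeoutMs: number = DEFAULT_CHUNK_TIMEOUT,
): ReadableStream<T> {
  const reader = source.getReader()
  return new ReadableStream<T>({
    async pull(controller) {
      let timer: ReturnType<typeof setTimeout> | undefined
      // A fresh timer is armed for every pull, so the window is per chunk.
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error("SSE read timed out")), timeoutMs)
      })
      try {
        const result = await Promise.race([reader.read(), timeout])
        if (result.done) controller.close()
        else controller.enqueue(result.value)
      } catch (err) {
        // Release the upstream connection, then error the wrapped stream.
        void reader.cancel(err).catch(() => {})
        throw err
      } finally {
        clearTimeout(timer)
      }
    },
    cancel(reason) {
      return reader.cancel(reason)
    },
  })
}
```

With this shape, a provider that streams a chunk every few seconds never hits the limit, while one that goes silent for longer than `timeoutMs` errors the stream with the same message quoted above.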

When this fires, the error propagates through AbortSignal.any() in the custom fetch wrapper and surfaces as UnknownError: SSE read timed out on the assistant message — the chat freezes with no retry path.
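
To illustrate the propagation path, here is a small sketch of how `AbortSignal.any()` forwards an abort reason. The helper `combineSignals` and the two controllers are assumptions for illustration only; the actual wiring in the custom fetch wrapper is not shown in this PR.

```typescript
// Hypothetical wiring: combine the caller's signal with a chunk-timeout signal.
function combineSignals(callerSignal: AbortSignal, timeoutSignal: AbortSignal): AbortSignal {
  // The combined signal aborts as soon as either input aborts, and it
  // carries the abort reason of whichever input fired first.
  return AbortSignal.any([callerSignal, timeoutSignal])
}

const caller = new AbortController()
const chunkTimeout = new AbortController()
const combined = combineSignals(caller.signal, chunkTimeout.signal)

// Simulate the per-chunk timer firing.
chunkTimeout.abort(new Error("SSE read timed out"))

console.log(combined.aborted) // true
console.log((combined.reason as Error).message) // "SSE read timed out"
```

Because the reason travels with the combined signal, whatever consumes that signal sees the timeout error rather than a generic abort.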

Change

packages/opencode/src/provider/provider.ts

-const DEFAULT_CHUNK_TIMEOUT = 120_000
+const DEFAULT_CHUNK_TIMEOUT = 300_000

5 minutes gives adequate headroom for slow providers without masking genuine hangs.

Companion PR

AltimateAI/vscode-altimate-mcp-server#343 — adds a retry button for MessageAbortedError in the chat UI, covering the session-restart abort path.

Requested by @saravmajestic via harness


Summary by cubic

Increase SSE per-chunk timeout from 2 minutes to 5 minutes by raising DEFAULT_CHUNK_TIMEOUT to 300_000, giving slow LLM streams more headroom during long tool-call sessions. This reduces false “SSE read timed out” aborts and prevents chat freezes.

Written for commit 5f6e687. Summary will update on new commits.

The 120s SSE chunk timeout was too aggressive for slow LLM providers,
causing spurious "SSE read timed out" aborts during long-running
multi-tool sessions. Increase to 300s to reduce false positives.
@github-actions

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

@github-actions

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@saravmajestic saravmajestic marked this pull request as ready for review June 3, 2026 05:19

@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@saravmajestic saravmajestic requested a review from mdesmet June 3, 2026 05:19
// altimate_change end

-const DEFAULT_CHUNK_TIMEOUT = 120_000
+const DEFAULT_CHUNK_TIMEOUT = 300_000

Please set change markers on any lines we change from upstream.
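
For reference, a sketch of what the requested markers might look like around the changed line. Only the `// altimate_change end` marker is visible in this diff; the matching `start` marker and its wording are assumptions, so check the repository's contributing notes for the exact format.

```typescript
// altimate_change start (assumed marker format; only the "end" marker appears in this diff)
const DEFAULT_CHUNK_TIMEOUT = 300_000
// altimate_change end
```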

@dev-punia-altimate dev-punia-altimate left a comment

Multi-Persona Review — Verdict: block

The PR increases the SSE chunk timeout from 2 to 5 minutes, which improves user experience during slow LLM responses but introduces critical risks: it may mask genuine service outages (product-gap, high severity) and increase exposure to DoS via connection exhaustion (security, medium severity). Multiple personas and code review independently flag this as a dangerous trade-off without mitigations like circuit-breakers or configurability. The product manager explicitly requests changes due to the user experience gap, and security raises a valid availability concern. Without mitigation, this change is unsafe to ship.

14/14 agents completed · 115s · 2 findings (0 critical, 1 high, 1 medium)

High

  • [product-manager, code-reviewer, tech-lead, cto, devops] Increasing the global SSE chunk timeout to 5 minutes may mask genuine service hangs, delaying failure detection and degrading user experience during outages, without a circuit-breaker or provider-specific timeout to distinguish slow responses from actual failures. → packages/opencode/src/provider/provider.ts:54
    • 💡 Implement a circuit-breaker mechanism or provider-configurable timeout to differentiate between slow responses and genuine failures.

Medium

  • [security] Increasing SSE chunk timeout from 2min to 5min may increase exposure to resource exhaustion attacks by allowing malicious clients to hold connections open longer, potentially leading to connection pool exhaustion or DoS under high load. → packages/opencode/src/provider/provider.ts:54
    • 💡 Implement connection limits per client/IP, add rate limiting on SSE stream initiation, or introduce a maximum concurrent stream limit.

Multi-Persona Review · vllm:qwen3-next-80b (waves) + vllm-fallback (synth)

@@ -54,7 +54,7 @@ import { VALID_ACCOUNT_RE } from "../altimate/plugin/snowflake"
import { isValidDatabricksHost } from "../altimate/plugin/databricks"

[HIGH · product-manager, code-reviewer, tech-lead, cto, devops] Increasing the global SSE chunk timeout to 5 minutes may mask genuine service hangs, delaying failure detection and degrading user experience during outages, without a circuit-breaker or provider-specific timeout to distinguish slow responses from actual failures.

💡 Suggestion: Implement a circuit-breaker mechanism or provider-configurable timeout to differentiate between slow responses and genuine failures.

Confidence: 95/100
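
One possible shape for the provider-configurable timeout suggested above, as a rough sketch only. The function `resolveChunkTimeout`, the override table, and the provider IDs in it are all hypothetical and not taken from the codebase.

```typescript
// Hypothetical sketch of a provider-configurable chunk timeout.
const DEFAULT_CHUNK_TIMEOUT = 300_000

// Illustrative overrides; these provider IDs and values are made up.
const chunkTimeoutOverrides: Record<string, number> = {
  "fast-provider": 60_000,
  "slow-reasoning-provider": 600_000,
}

function resolveChunkTimeout(providerID: string, configured?: number): number {
  // Explicit configuration wins, then the per-provider override, then the global default.
  const candidate = configured ?? chunkTimeoutOverrides[providerID] ?? DEFAULT_CHUNK_TIMEOUT
  // Zero, negative, or non-finite values would abort every stream immediately,
  // so they fall back to the global default.
  return Number.isFinite(candidate) && candidate > 0 ? candidate : DEFAULT_CHUNK_TIMEOUT
}

console.log(resolveChunkTimeout("slow-reasoning-provider")) // 600000
console.log(resolveChunkTimeout("some-other-provider")) // 300000
```

This would keep a tight window for providers known to stream promptly while still giving slow ones headroom, which addresses the "mask genuine hangs" concern without a global increase.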

@@ -54,7 +54,7 @@ import { VALID_ACCOUNT_RE } from "../altimate/plugin/snowflake"
import { isValidDatabricksHost } from "../altimate/plugin/databricks"

[MEDIUM · security] Increasing SSE chunk timeout from 2min to 5min may increase exposure to resource exhaustion attacks by allowing malicious clients to hold connections open longer, potentially leading to connection pool exhaustion or DoS under high load.

💡 Suggestion: Implement connection limits per client/IP, add rate limiting on SSE stream initiation, or introduce a maximum concurrent stream limit.

Confidence: 85/100
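
If a cap on concurrent streams is wanted, it could look roughly like the sketch below. The `StreamLimiter` class is hypothetical, and whether this code path is actually exposed to untrusted clients is not established in this PR.

```typescript
// Hypothetical sketch of a maximum-concurrent-streams guard.
class StreamLimiter {
  private active = 0
  private readonly maxConcurrent: number

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent
  }

  // Returns a release function on success, or undefined when the limit is reached.
  tryAcquire(): (() => void) | undefined {
    if (this.active >= this.maxConcurrent) return undefined
    this.active++
    let released = false
    return () => {
      // Releasing twice must not free a second slot.
      if (released) return
      released = true
      this.active--
    }
  }

  get activeCount(): number {
    return this.active
  }
}

const limiter = new StreamLimiter(2)
const releaseA = limiter.tryAcquire()
const releaseB = limiter.tryAcquire()
const rejected = limiter.tryAcquire() // undefined: both slots are taken
releaseA?.()
console.log(limiter.activeCount) // 1
```

A caller would acquire a slot before opening a stream and call the release function when the stream closes, errors, or times out.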
