Update trigger#4

Open
tylerc-govsignals wants to merge 867 commits into GovSignals:ConProgramming/two-phase-deploy from triggerdotdev:main

Conversation

@tylerc-govsignals
Collaborator

Closes #

✅ Checklist

  • I have followed every step in the contributing guide
  • The PR title follows the convention.
  • I ran and tested the code works

Testing

[Describe the steps you took to test this change]


Changelog

[Short description of what has changed]


Screenshots

[Screenshots]

💯

edosrecki and others added 30 commits March 11, 2026 06:42
…3208)

Deprecates the syncVercelEnvVars build extension and adds warnings in
both the Vercel integration docs and the extension's own page to prevent
conflicts with the native env var sync
## Adds 2 self serve features

### 1. self serve preview branches

- Copies the patterns of the self serve concurrency
- Self serve only available on Pro plan (otherwise you are linked to the
billing plans page)
- Global self serve branches limit: 180 (+20 for the Pro plan). It can
be overridden per Org
- You need to archive branches before reducing the number of extra
branches you're paying for
- Branches are removed immediately but remain billed until the end of
the billing cycle like extra concurrency

### 2. self serve team members

- Copies the patterns of the self serve concurrency
- Self serve only available on Pro plan (otherwise you are linked to the
billing plans page)
- Global self serve members is unlimited but can be limited with the
same env var quota and overridden per org if needed
- You need to remove team members before reducing the number of members
you pay for
- Team members are removed immediately but remain billed until the end
of the billing cycle like extra concurrency
…hards (#3219)

Queues with concurrency keys now appear as a single entry in the master
queue instead of one entry per key. This prevents high-CK-count tenants
from consuming the entire `parentQueueLimit` window and starving other
tenants on the same shard.

A new per-queue **CK index** (sorted set) tracks active concurrency key
sub-queues. The master queue gets one `:ck:*` wildcard entry per base
queue. Dequeuing from that entry round-robins across sub-queues,
maintaining per-CK concurrency tracking and fairness.

All existing operations (enqueue, dequeue, ack, nack, DLQ, TTL expiry)
are CK-index-aware and keep the index consistent. Old-format entries
drain naturally during rollout — no migration step needed, single
deploy.
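
The round-robin behaviour described above can be sketched in memory. This is only an illustration under stated assumptions: the real index is a Redis sorted set, and `CkIndexSketch` and its methods are hypothetical names, not the run-queue API.

```typescript
// Hypothetical in-memory model of one base queue with concurrency-key (CK) sub-queues.
type CkSubQueue = { key: string; messages: string[] };

class CkIndexSketch {
  // Array order stands in for the sorted-set ordering of the real CK index.
  private subQueues: CkSubQueue[] = [];

  enqueue(key: string, message: string): void {
    let sub = this.subQueues.find((s) => s.key === key);
    if (!sub) {
      sub = { key, messages: [] };
      this.subQueues.push(sub);
    }
    sub.messages.push(message);
  }

  // Take one message from the front sub-queue, then rotate that sub-queue to
  // the back so a key with a large backlog cannot starve the others.
  dequeue(): string | undefined {
    const sub = this.subQueues.shift();
    if (!sub) return undefined;
    const message = sub.messages.shift();
    if (sub.messages.length > 0) this.subQueues.push(sub);
    return message;
  }
}

const ckIndex = new CkIndexSketch();
ckIndex.enqueue("key-a", "a1");
ckIndex.enqueue("key-a", "a2");
ckIndex.enqueue("key-a", "a3");
ckIndex.enqueue("key-b", "b1");

const dequeueOrder = [ckIndex.dequeue(), ckIndex.dequeue(), ckIndex.dequeue(), ckIndex.dequeue()];
```

Even though `key-a` has three queued messages, `key-b` is served second rather than last, which is the fairness property the single wildcard entry is meant to preserve.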
## Summary

Major expansion of the MCP server (14 → 25 tools), context efficiency
optimizations, new API endpoints, and a fix for the dev CLI leaking
build directories on disk.

### New MCP tools

- **Query & analytics**: `get_query_schema`, `query`, `list_dashboards`,
`run_dashboard_query` — query your data using TRQL directly from AI
assistants
- **Profile management**: `whoami`, `list_profiles`, `switch_profile` —
see and switch CLI profiles per-project (persisted to
`.trigger/mcp.json`)
- **Dev server control**: `start_dev_server`, `stop_dev_server`,
`dev_server_status` — start/stop `trigger dev` and stream build output
- **Task introspection**: `get_task_schema` — get payload schema for a
specific task (split out from `get_current_worker` to reduce context)

### New API endpoints

- `GET /api/v1/query/schema` — discover TRQL tables and columns
(server-driven, multi-table)
- `GET /api/v1/query/dashboards` — list built-in dashboard widgets and
their queries

### New features

- **`--readonly` flag** — hides write tools (`deploy`, `trigger_task`,
`cancel_run`) so agents can't make changes
- **`read:query` JWT scope** — new authorization scope for query
endpoints, with per-table granularity (`read:query:runs`,
`read:query:llm_metrics`, etc.)
- **Paginated trace output** — `get_run_details` now paginates trace
events via cursor, caching the full trace in a temp file so subsequent
pages don't re-fetch
- **MCP tool annotations** — all tools now have
`readOnlyHint`/`destructiveHint` annotations for clients that support
them
- **Project-scoped profile persistence** — `switch_profile` saves to
`.trigger/mcp.json` (gitignored), automatically loaded on next MCP
server start

### Context optimizations

- `get_query_schema` requires a table name — returns one table's schema
instead of all tables (60-80% fewer tokens)
- `get_current_worker` no longer inlines payload schemas — use
`get_task_schema` for specific tasks
- Query results formatted as text tables instead of JSON (~50% fewer
tokens for flat data)
- `cancel_run`, `list_deploys`, `list_preview_branches` formatted as
text instead of raw `JSON.stringify()`
- Schema and dashboard API responses cached (1hr and 5min respectively)

### Bug fixes

- Fixed `search_docs` failing due to renamed upstream Mintlify tool
(`SearchTriggerDev` → `search_trigger_dev`)
- Fixed `list_deploys` failing when deployments have null
`runtime`/`runtimeVersion` fields (fixes #3139)
- Fixed `list_preview_branches` crashing due to incorrect response shape
access
- Fixed `metrics` table column documented as `value` instead of
`metric_value` in query docs
- Fixed `/api/v1/query` not accepting JWT auth (added `allowJWT: true`)

### Dev CLI build directory fix

The dev CLI was leaking `build-*` directories in `.trigger/tmp/` on
every rebuild, accumulating hundreds of MB over time (842MB observed).
Three layers of protection added:

1. **During session**: deprecated workers are pruned (capped at 2
retained) when no active runs reference them, preventing unbounded
accumulation
2. **On SIGKILL/crash**: the watchdog process now cleans up
`.trigger/tmp/` when it detects the parent CLI was killed
3. **On next startup**: existing `clearTmpDirs()` wipes any remaining
orphans
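
The in-session pruning rule (layer 1) can be sketched as a pure selection function. The `DeprecatedWorker` shape and `selectWorkersToPrune` are assumptions made for illustration; the CLI's actual bookkeeping may differ.

```typescript
// Hypothetical shape for a deprecated dev worker and its build directory.
type DeprecatedWorker = { buildDir: string; deprecatedAt: number; activeRuns: number };

// Keep the `maxRetained` most recently deprecated workers, and never prune a
// worker that an active run still references.
function selectWorkersToPrune(
  deprecated: DeprecatedWorker[],
  maxRetained = 2
): DeprecatedWorker[] {
  const newestFirst = [...deprecated].sort((a, b) => b.deprecatedAt - a.deprecatedAt);
  return newestFirst.filter((worker, i) => i >= maxRetained && worker.activeRuns === 0);
}

const prunable = selectWorkersToPrune([
  { buildDir: "build-1", deprecatedAt: 1, activeRuns: 0 },
  { buildDir: "build-2", deprecatedAt: 2, activeRuns: 1 },
  { buildDir: "build-3", deprecatedAt: 3, activeRuns: 0 },
  { buildDir: "build-4", deprecatedAt: 4, activeRuns: 0 },
]);
```

Here `build-3` and `build-4` are retained as the two newest, and `build-2` survives only because a run still references it.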

## Test plan

- [ ] `pnpm run mcp:smoke` — 17 automated smoke tests for all read-only
MCP tools
- [ ] `pnpm run mcp:test list` — verify 25 tools registered (21 in
`--readonly` mode)
- [ ] `pnpm run mcp:test --readonly list` — verify write tools hidden
- [ ] Manual: start dev server, trigger task, rebuild multiple times,
verify build dirs stay capped at 4
- [ ] Manual: SIGKILL the dev CLI, verify watchdog cleans up
`.trigger/tmp/`
- [ ] Verify new API endpoints return correct data: `GET
/api/v1/query/schema`, `GET /api/v1/query/dashboards`

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ock contention (#3232)

When processing batchTriggerAndWait items, each batch item was acquiring a
Redis lock on the parent run to insert a TaskRunWaitpoint row. With high
concurrency (processingConcurrency=50), this caused LockAcquisitionTimeoutError
(880 errors/24h in prod), orphaned runs, and stuck parent runs.

Since blockRunWithCreatedBatch already transitions the parent to
EXECUTING_WITH_WAITPOINTS before items are processed, the per-item lock is
unnecessary. The new blockRunWithWaitpointLockless method performs only the
idempotent CTE insert and timeout scheduling without acquiring the lock.
- Automatic LLM cost enrichment for AI SDK spans (streamText,
generateText, generateObject) or any other spans that use semantic
gen_ai attributes with support for 145+ models
- New AI span inspector sidebar showing model, tokens, cost, messages,
tool calls, and response text
- LLM metrics dual-write to ClickHouse `llm_metrics_v1` table for
analytics
- LLM metrics built-in dashboard (unlinked at the moment)
- Provider cost fallback — uses gateway/OpenRouter reported costs from
`providerMetadata` when registry pricing is unavailable
- Prefix-stripping for gateway/OpenRouter model names (e.g.
`mistral/mistral-large-3` matches `mistral-large-3` pricing)
- Admin dashboard for managing LLM model pricing (list, create, edit,
delete, search, test pattern matching)
- Missing models detection page — queries ClickHouse for unpriced models
with sample spans and Claude Code-ready prompts for adding pricing
- AI span seed script (`pnpm run db:seed:ai-spans`) with 51 spans across
12 provider systems for local dev testing
- UI fixes: `completionTokens`/`promptTokens` aliases,
`ai.response.object` display for generateObject, cache read/write token
breakdown
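
The prefix-stripping lookup and provider cost fallback could look roughly like the sketch below. The registry contents are placeholder numbers, not real prices, and `lookupPrice` / `resolveCost` are hypothetical names chosen for this example.

```typescript
// Placeholder registry: bare model name -> price per million input tokens.
// The numbers are made up for illustration only.
const placeholderPricing = new Map<string, number>([
  ["mistral-large-3", 1],
  ["gpt-4o", 2],
]);

// Try the name as given, then retry with the leading "provider/" segment stripped.
function lookupPrice(modelName: string): number | undefined {
  const direct = placeholderPricing.get(modelName);
  if (direct !== undefined) return direct;
  const slash = modelName.indexOf("/");
  return slash === -1 ? undefined : lookupPrice(modelName.slice(slash + 1));
}

// Prefer registry pricing; fall back to the cost reported by the gateway
// (e.g. read from `providerMetadata`) when the model is unpriced.
function resolveCost(
  modelName: string,
  inputTokens: number,
  providerReportedCost?: number
): number | undefined {
  const price = lookupPrice(modelName);
  if (price !== undefined) return (inputTokens / 1000000) * price;
  return providerReportedCost;
}

const strippedMatch = lookupPrice("mistral/mistral-large-3");
const fallbackCost = resolveCost("unknown/model", 1000, 0.5);
```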

## Screenshots:

<img width="1030" height="104" alt="CleanShot 2026-03-17 at 16 48 54@2x"
src="https://github.com/user-attachments/assets/bc8fccda-e48b-4d0c-bfb1-e620064e5979"
/>

<img width="1094" height="1512" alt="CleanShot 2026-03-17 at 16 49
23@2x"
src="https://github.com/user-attachments/assets/c2424569-d07e-4d67-a436-e8250043a1ee"
/>

<img width="1074" height="1412" alt="CleanShot 2026-03-17 at 16 49
18@2x"
src="https://github.com/user-attachments/assets/22342ac4-4769-45d1-a328-a24fb9a82a50"
/>

<img width="1012" height="2292" alt="CleanShot 2026-03-17 at 16 39
01@2x"
src="https://github.com/user-attachments/assets/59e327d1-6652-4293-8be0-bb8326e5fbc5"
/>

<img width="3680" height="2392" alt="CleanShot 2026-03-15 at 08 29
38@2x"
src="https://github.com/user-attachments/assets/1f77beb8-de67-495b-b890-bcdb8d7f1fe8"
/>

---------

Co-authored-by: James Ritchie <james@trigger.dev>
- Automatically impersonate a run when visiting /runs/<run_id> if an
admin is logged in
- Clear existing impersonation when switching
)

- Full prompt management UI: list, detail, override, and version
management for AI prompts defined with `prompts.define()`
- Rich AI span inspectors for all AI SDK operations with token usage,
messages, and prompt context
- Real-time generation tracking with live polling and filtering

## Prompt management

Define prompts in your code with `prompts.define()`, then manage
versions and overrides from the dashboard without redeploying:

```typescript
import { task, prompts } from "@trigger.dev/sdk";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const supportPrompt = prompts.define({
  id: "customer-support",
  model: "gpt-4o",
  variables: z.object({
    customerName: z.string(),
    plan: z.string(),
    issue: z.string(),
  }),
  content: `You are a support agent for Acme SaaS.
Customer: {{customerName}} ({{plan}} plan)
Issue: {{issue}}
Respond with empathy and precision.`,
});

export const supportTask = task({
  id: "handle-support",
  run: async (payload) => {
    const resolved = await supportPrompt.resolve({
      customerName: payload.name,
      plan: payload.plan,
      issue: payload.issue,
    });

    const result = await generateText({
      model: openai(resolved.model ?? "gpt-4o"),
      system: resolved.text,
      prompt: payload.issue,
      ...resolved.toAISDKTelemetry(),
    });

    return { response: result.text };
  },
});
```

The prompts list page shows each prompt with its current version, model,
override status, and a usage sparkline over the last 24 hours.

From the prompt detail page you can:

- **Create overrides** to change the prompt template or model without
redeploying. Overrides take priority over the deployed version when
`prompt.resolve()` is called.
- **Promote** any code-deployed version to be the current version
- **Browse generations** across all versions with infinite scroll and
live polling for new results
- **Filter** by version, model, operation type, and provider
- **View metrics** (total generations, avg tokens, avg cost, latency)
broken down by version

## AI span inspectors

Every AI SDK operation now gets a custom inspector in the run trace
view:

- **`ai.generateText` / `ai.streamText`** — Shows model, token usage,
cost, the full message thread (system prompt, user message, assistant
response), and linked prompt details
- **`ai.generateObject` / `ai.streamObject`** — Same as above plus the
JSON schema and structured output
- **`ai.toolCall`** — Shows tool name, call ID, and input arguments
- **`ai.embed`** — Shows model and the text being embedded

For generation spans linked to a prompt, a "Prompt" tab shows the prompt
metadata, the input variables passed to `resolve()`, and the template
content from the prompt version.

All AI span inspectors include a compact timestamp and duration header.

## Other improvements

- Resizable panel sizes now persist across page refreshes (patched
`@window-splitter/state` to fix snapshot restoration)
- Run page panels also persist their sizes
- Fixed `<div>` inside `<p>` DOM nesting warnings in span titles and
chat messages
- Added Operations and Providers filters to the AI metrics dashboard

## Screenshots

<img width="3680" height="2392" alt="CleanShot 2026-03-21 at 10 14
17@2x"
src="https://github.com/user-attachments/assets/f3e59989-a2fa-4990-a9d0-3cacda431868"
/>

<img width="3680" height="2392" alt="CleanShot 2026-03-21 at 10 15
37@2x"
src="https://github.com/user-attachments/assets/2f2d02df-2d2b-44fb-ac6f-9153f6a6c387"
/>

<img width="3680" height="2392" alt="CleanShot 2026-03-21 at 10 15
54@2x"
src="https://github.com/user-attachments/assets/baa161e0-ef91-4fa4-a55f-986b71cccdf0"
/>
Adds an `annotations` JSONB column to task runs that captures where and
how each run was triggered.
This enables filtering and analyzing trigger origins without querying up
the run tree. Also enables making scheduling decisions based on the
trigger source, e.g., use separate affinities for scheduled runs.

Each run records:
- **triggerSource**: who initiated it (sdk, api, dashboard, cli, mcp,
schedule)
- **triggerAction**: what kind of action (trigger, replay, test)
- **rootTriggerSource**: the trigger source of the root ancestor,
propagated through the entire run tree
- **rootScheduleId**: schedule id, in case the run tree was triggered
from a schedule

Currently the main motivation for annotations is to determine whether a
run is part of a schedule-originated tree without traversing ancestors.

### A couple of design considerations
- **Decoupled source from method**: triggerSource and triggerAction are
separate fields to avoid
combinatorial explosion (every new source × every new action)
- **Server-side first**: all annotation values are primarily determined
on the server, only a minor SDK change needed
- **Forward-compatible**: annotation fields use
`z.enum([...]).or(anyString)` so new values can be
added without breaking validation; we currently don't need an explicit
version field for annotations.
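
A minimal sketch of how root annotations might propagate down a run tree follows. It is written without zod so it stays self-contained; the `(string & {})` union member is a type-level stand-in for the `.or(anyString)` forward-compatibility trick, and `annotateRun` is a hypothetical helper, not the server's implementation.

```typescript
// Known values autocomplete, but any other string is still accepted.
type TriggerSource = "sdk" | "api" | "dashboard" | "cli" | "mcp" | "schedule" | (string & {});
type TriggerAction = "trigger" | "replay" | "test" | (string & {});

type RunAnnotations = {
  triggerSource: TriggerSource;
  triggerAction: TriggerAction;
  rootTriggerSource: TriggerSource;
  rootScheduleId?: string;
};

// A root run derives its root fields from itself; a child run copies them from
// its parent, so no ancestor traversal is needed later.
function annotateRun(
  own: { triggerSource: TriggerSource; triggerAction: TriggerAction; scheduleId?: string },
  parent?: RunAnnotations
): RunAnnotations {
  return {
    triggerSource: own.triggerSource,
    triggerAction: own.triggerAction,
    rootTriggerSource: parent ? parent.rootTriggerSource : own.triggerSource,
    rootScheduleId: parent ? parent.rootScheduleId : own.scheduleId,
  };
}

const rootRun = annotateRun({
  triggerSource: "schedule",
  triggerAction: "trigger",
  scheduleId: "sched_123", // hypothetical id
});
const childRun = annotateRun({ triggerSource: "sdk", triggerAction: "trigger" }, rootRun);
const isScheduleOriginated = childRun.rootTriggerSource === "schedule";
```

The child was triggered from the SDK, yet a single field read is enough to tell that its tree started from a schedule.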

Note: `metadata` would have been a more fitting name for the db column,
as it is consistent with other tables where we store this type of
information. It is already in use to store user metadata though, so we
go with `annotations` instead.
Returns the fully detailed span with attributes and AI enrichment data
### Lots of UI improvements to the Prompts pages: 

#### New side menu icons
<img width="234" height="110" alt="CleanShot 2026-03-25 at 14 26 33"
src="https://github.com/user-attachments/assets/8039ee8f-92a0-477d-ac91-458dfa43020b"
/>

#### Compact horizontal start/finish times so scanning the generations list is consistent
<img width="618" height="174" alt="CleanShot 2026-03-25 at 14 24 03"
src="https://github.com/user-attachments/assets/7eb69475-d539-4e34-b0b5-5cd5119c1aea"
/>

#### Tidied up the metrics view
<img width="1607" height="925" alt="CleanShot 2026-03-25 at 14 23 52"
src="https://github.com/user-attachments/assets/d4bf2761-8e09-4d91-bfb8-e10d1508042f"
/>

#### Copiable metadata
<img width="274" height="160" alt="CleanShot 2026-03-25 at 14 23 44"
src="https://github.com/user-attachments/assets/487c804d-c000-4cd0-9a19-30c2681712df"
/>

#### Cleaner versions list
<img width="463" height="264" alt="CleanShot 2026-03-25 at 14 23 28"
src="https://github.com/user-attachments/assets/a2dfede7-0e7d-4f9c-8b01-00ba13070f70"
/>

#### Overall consistency, shortcut key, and UI behaviour improvements
<img width="2279" height="1349" alt="CleanShot 2026-03-25 at 14 23 02"
src="https://github.com/user-attachments/assets/0c29257b-15a0-443c-b872-a9f4a7f6af13"
/>

---------

Co-authored-by: Eric Allam <eallam@icloud.com>
Queue limit ServiceValidationErrors were being logged at error level.
These are expected validation rejections, not bugs.

- Add logLevel property to ServiceValidationError (webapp + run-engine)
- Set logLevel: warn on all queue limit throws
- Schedule engine: detect queue limit failures and log as warn
- Redis-worker: respect logLevel on thrown errors
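
One way to picture the `logLevel` property is the sketch below. `ValidationErrorSketch` and `levelFor` are illustrative stand-ins, not the actual `ServiceValidationError` class; the duck-typed check is an assumption, chosen so the reader does not depend on one specific class.

```typescript
type ErrorLogLevel = "error" | "warn" | "info";

// Stand-in for an error type that carries its own preferred log level.
class ValidationErrorSketch extends Error {
  constructor(message: string, public readonly logLevel: ErrorLogLevel = "error") {
    super(message);
    this.name = "ValidationErrorSketch";
  }
}

// Read `logLevel` by shape rather than by class, defaulting to "error"
// for anything that does not declare a valid level.
function levelFor(error: unknown): ErrorLogLevel {
  if (typeof error === "object" && error !== null && "logLevel" in error) {
    const level = (error as { logLevel?: unknown }).logLevel;
    if (level === "error" || level === "warn" || level === "info") return level;
  }
  return "error";
}

const queueLimitError = new ValidationErrorSketch("Queue size limit exceeded", "warn");
const queueLimitLevel = levelFor(queueLimitError);
const unexpectedLevel = levelFor(new Error("boom"));
```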
…es, and TSQL schema (#3270)

- Add llm-model-catalog package (renamed from llm-pricing) with Claude
CLI research pipeline
- Add Prisma schema: catalog columns + baseModelName on LlmModel
- Add ClickHouse: llm_model_aggregates MV + base_response_model column
- Add TSQL llm_models schema for query page integration
- Add ModelRegistryPresenter with catalog, metrics, and comparison
queries
- Add 3 dashboard pages: catalog (cards+table+filters), detail
(overview+metrics+cost estimator), compare
- Add sidebar navigation under AI section with hasAiAccess feature flag
- Add admin dashboard sync/seed for catalog metadata
- Add model variant grouping (dated snapshots under base models)
- Add shared formatters and design system component usage

refs TRI-7941
Scheduled runs create predictable hourly spikes that compete with
on-demand runs for node capacity. Runs triggered on demand via the
SDK, API, or dashboard are more sensitive to cold start latency, since
users are typically waiting on the result. When a burst of scheduled
runs lands at the top of the hour, it can saturate the shared pool,
causing contention and slower cold starts across the board.

The idea in this change is to absorb these periodic spikes in a
dedicated pool without affecting the cold starts of on-demand runs.
Scheduled runs are inherently less sensitive to cold starts.

### Changes in this PR

Follows up on run annotations (#3241), which made trigger origin
available on every run in the tree. This PR exposes
annotations at dequeue time to the supervisor. This enables scheduling
decisions based on trigger source.

The affinities are soft preferences at schedule time, so runs fall back
gracefully if the target pool is out of capacity.
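
The soft-preference behaviour can be sketched as an ordered pool choice. This is an assumption-heavy illustration: `NodePool` and `pickPool` are hypothetical, the real decision is made by the cluster scheduler rather than application code, and whether on-demand runs may spill into the scheduled pool is assumed here.

```typescript
type NodePool = { name: string; freeSlots: number };

// Prefer the dedicated pool for schedule-originated trees and the shared pool
// otherwise; fall back to the other pool when the preferred one is full.
function pickPool(
  rootTriggerSource: string | undefined,
  scheduledPool: NodePool,
  sharedPool: NodePool
): NodePool | undefined {
  const ordered =
    rootTriggerSource === "schedule"
      ? [scheduledPool, sharedPool]
      : [sharedPool, scheduledPool];
  for (const pool of ordered) {
    if (pool.freeSlots > 0) return pool;
  }
  return undefined; // no capacity anywhere
}

const fullScheduledPool: NodePool = { name: "scheduled", freeSlots: 0 };
const openSharedPool: NodePool = { name: "shared", freeSlots: 3 };
const chosenPool = pickPool("schedule", fullScheduledPool, openSharedPool);
```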
…ter (#3273)

fix: filter dev environments by userId in OrganizationsPresenter
…t to ClickHouse (#3274)

Adds three new top-level columns to the ClickHouse task_runs_v2 table
primarily for analytics:

- `trigger_source` / `root_trigger_source` - extracted from the existing
TaskRun.annotations JSON during WAL
replication
- `is_warm_start` - new nullable boolean on TaskRun in Postgres, set in
the existing taskRun.update() at attempt
start (no additional write). null until the first attempt starts.

Run region is already available via the existing `worker_queue` column
in ClickHouse.
0ski and others added 30 commits May 18, 2026 17:42
Reject non-email strings at the magic link form instead of accepting any
string and proceeding through rate-limit / authenticator steps.
…3664)

## Summary

Companion to #3536, which patched routes that already had a leaking
`catch (e) { return json({error: e.message}, 500) }`. That pattern can't
reach routes which have no catch in the first place — when those throw,
Remix's default error path serializes `error.message` into the response
body, and the SDK then wraps the leaked string as `TriggerApiError`.

Across 28 raw api.v1 loaders/actions plus one dashboard polling
endpoint, each handler now:

- Wraps its body in `try { ... } catch (error) { ... }`.
- Re-throws `Response` instances so auth helpers' `throw json(...)` /
`throw redirect(...)` pass through unchanged.
- Logs non-Response errors via `logger.error` so server-side visibility
is preserved.
- Returns a generic body — `{"error": "Internal Server Error"}` 500 for
raw API routes, or `{ changelogs: [] }` 200 for the polling widget
(degrade silently across transient blips; the consumer hook already
coped with empty payloads).

For six routes where #3536 left an inner try/catch covering only a
service call (`alertChannels`, `batches.results`,
`deployments.finalize`, `deployments.background-workers`,
`deployments.promote`, `projects.background-workers`): an outer
try/catch is added so auth/parsing failures are also sanitized. Inner
typed-error handling (`ServiceValidationError` → 422 with message, etc.)
is preserved exactly.

For two routes whose existing catch returned 400 + `error.message`
(`api.v1.authorization-code`, `api.v1.orgs.$orgParam.projects` action):
the body is sanitized to a generic per-route string. **Status code stays
400** — clients that key on the 4xx/5xx distinction (and the SDK's
no-retry-on-4xx behavior) are unaffected.
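
The per-handler pattern can be sketched as a small wrapper. To stay self-contained, the example uses a stand-in `ThrownResponse` class where a real route would see the platform `Response`, and an in-memory array where a real route would call `logger.error`; `withSanitizedErrors` and `sanitizeCaught` are hypothetical names.

```typescript
// Stand-in for the platform `Response` that auth helpers throw.
class ThrownResponse {
  constructor(public readonly status: number, public readonly body: string) {}
}

type HandlerResult = { status: number; body: string };

// Stand-in for server-side logging.
const serverLog: string[] = [];

// Re-throw intentional responses untouched; log everything else server-side
// and return a generic body so `error.message` never reaches the client.
function sanitizeCaught(error: unknown): HandlerResult {
  if (error instanceof ThrownResponse) throw error;
  serverLog.push(error instanceof Error ? error.message : String(error));
  return { status: 500, body: JSON.stringify({ error: "Internal Server Error" }) };
}

async function withSanitizedErrors(
  handler: () => Promise<HandlerResult>
): Promise<HandlerResult> {
  try {
    return await handler();
  } catch (error) {
    return sanitizeCaught(error);
  }
}

const sanitized = sanitizeCaught(new Error("P1001: Can't reach database server"));
```

Keeping the pass-through check first matters: without it, a thrown 401 or redirect from an auth helper would be flattened into a 500.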

## Test plan

- [x] `pnpm run typecheck --filter webapp`
- [x] Per-route synthetic-throw probe: inject `throw new
Error("SYNTHETIC ...")` at the top of each catch'd try, curl the route
with a dummy bearer, confirm the response body is the generic shape and
that the synthetic message lands server-side via `logger.error`. 29
routes verified.
- [x] Real-P1001 probe on the envvars loader: `docker stop database`
mid-flight, confirm response is generic 500 (not the leaked Prisma
message).
- [x] Sampled legitimate 4xx/2xx paths across each pattern variant
(naked-wrap, partial-expanded, 400-preserved) to confirm the wraps don't
interfere with normal control flow.
## Summary

The prerelease (snapshot) path of the release workflow fails immediately
whenever `main` carries an active `.changeset/pre.json` (i.e. during an
in-progress RC cycle, like the current v4 RC):

```
🦋 error Snapshot release is not allowed in pre mode
🦋 To resolve this exit the pre mode by running `changeset pre exit`
```

This blocks `chat-prerelease` snapshots from main even though the
snapshots are unrelated to the RC cycle.

Adds a conditional `changeset pre exit` step right before `Snapshot
version` in the prerelease job. The job runs on a checkout with
`persist-credentials: false`, so the `pre.json` deletion stays on the
runner's working tree — main's persisted pre-mode state is untouched,
and v4 RC publishes keep working normally.

## Test plan

- [ ] Re-run the `🦋 Changesets Release` workflow with `type=prerelease`,
`ref=main`, `prerelease_tag=chat-prerelease` and confirm it gets past
the snapshot step and publishes.
- [ ] Confirm `.changeset/pre.json` on `main` is unchanged after the
run.
…mic deployments (#3666)

- Ask the user whether to remove TRIGGER_VERSION when they disable
atomic deployments, and explain what happens if they leave it in place
- Install TRIGGER_SECRET keys as sensitive values in Vercel
<img width="1136" height="714" alt="image"
src="https://github.com/user-attachments/assets/a7351da1-5b2a-44e5-acdd-d30c9359f3ed"
/>
<img width="1136" height="714" alt="image"
src="https://github.com/user-attachments/assets/e773ede2-74cb-438e-811c-338f678d2f7d"
/>
<img width="1136" height="714" alt="image"
src="https://github.com/user-attachments/assets/c7b235a8-e06d-48d3-ac28-c5c9aacc6069"
/>
## Summary

The S2 access-token cache key was `${basin}:${streamPrefix}` — purely
server-derived but blind to the **scope/ops list** hardcoded one method
away. When the ops list changes in code (e.g. #3644 added `trim` so
`chat.agent`'s per-turn trim chain can issue `AppendRecord.trim()`),
pre-deploy tokens still in cache get returned to SDK callers for up to
the token's TTL (24h default), surfacing as `Operation not permitted`
403s on any op outside the old scope.

## Fix

Lift the ops list to a module constant and fold its sorted-join
fingerprint into the cache key:

```ts
const S2_TOKEN_OPS = ["append", "create-stream", "trim"] as const;
const S2_TOKEN_OPS_FINGERPRINT = [...S2_TOKEN_OPS].sort().join(",");

// in getS2AccessToken
const cacheKey = `${this.basin}:${this.streamPrefix}:${S2_TOKEN_OPS_FINGERPRINT}`;

// in s2IssueAccessToken
scope: { /* ... */ ops: [...S2_TOKEN_OPS], /* ... */ }
```

The fingerprint is derived from the single source of truth, so any
future scope change auto-invalidates without anyone remembering to bump
a literal version. The Unkey L1 (in-memory LRU) and L2 (Redis) layers
share the same key derivation, so both reset together on the next deploy
with no manual cache busting.

## Test plan

- [ ] `pnpm run typecheck --filter webapp`
- [ ] Run a multi-turn `chat.agent` chat via `references/ai-chat` and
confirm no `chat.agent: trim failed; will retry next turn` warn span
fires across turn-completes.
Add is_warm_start to TRQL runs schema so warm vs cold start data is
queryable
## Summary

Five hardening fixes across `@trigger.dev/sdk`, `@trigger.dev/core`, and
`@trigger.dev/build`.

- `tasks.triggerAndSubscribe` now forwards caller `requestOptions`
(custom API keys, per-request overrides) to the underlying
`apiClient.triggerTask` call instead of silently dropping them.
- `SSEStreamSubscription` no longer retries permanent client errors
forever. The default `nonRetryableStatuses` widens from `[404, 410]` to
`[400, 404, 409, 410, 422]`, so a malformed session-stream request fails
fast instead of busy-looping under bounded backoff.
- Session writer falls back to manually wiring the caller's
`AbortSignal` on Node 18, where `AbortSignal.any` is unavailable.
Caller-driven cancellation now propagates on every supported runtime.
- `TriggerChatTransport` throws immediately when a `chat.handover`
response is missing `X-Trigger-Chat-Access-Token`, instead of silently
downgrading every subsequent turn back to the handover path. `dispose()`
aborts every active `session.out` subscription before tearing the
coordinator down, so unmount/navigation no longer leaves SSE readers in
flight.
- Removed the experimental `@trigger.dev/build/extensions/secureExec`
build extension. It will return alongside the sandbox feature it was
built to support.
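
The widened `nonRetryableStatuses` default can be illustrated with a small retry-delay function. Only the status list comes from the description above; the backoff base and cap are assumed values, and `nextRetryDelayMs` is a hypothetical helper rather than the `SSEStreamSubscription` internals.

```typescript
const DEFAULT_NON_RETRYABLE_STATUSES = [400, 404, 409, 410, 422];

// Returns the delay before the next retry, or undefined when the status is a
// permanent client error and the subscription should fail fast.
function nextRetryDelayMs(
  status: number,
  attempt: number,
  nonRetryable: number[] = DEFAULT_NON_RETRYABLE_STATUSES
): number | undefined {
  if (nonRetryable.indexOf(status) !== -1) return undefined;
  const baseMs = 1000; // assumed base delay
  const capMs = 30000; // assumed upper bound
  return Math.min(baseMs * Math.pow(2, attempt), capMs);
}

const malformedRequestDelay = nextRetryDelayMs(422, 0);
const transientErrorDelay = nextRetryDelayMs(503, 2);
```

With the previous default of `[404, 410]`, a 400 would still have produced a delay on every attempt, which is the retry loop the change removes.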

## Test plan

- [ ] `pnpm run build --filter @trigger.dev/sdk --filter
@trigger.dev/core --filter @trigger.dev/build`
- [ ] `pnpm --filter @trigger.dev/sdk test --run` (183 tests, including
chat / chat-server / sessions / handover)
- [ ] `pnpm --filter @trigger.dev/core test --run`
- [ ] Manually trigger a `chat.handover` whose response strips
`X-Trigger-Chat-Access-Token`, and confirm the transport throws
synchronously rather than degrading.
- [ ] Unmount a chat UI mid-stream and confirm the active `session.out`
SSE connection closes immediately.
…le loaders (#3663)

## Summary

- Dashboard loaders for runs / sessions / batches / schedule-detail
threw bare `Error("X not found")` when a slug didn't resolve. Remix
surfaces this as a 500 and Sentry captures it via auto-instrumentation,
producing ongoing noise from real users following stale preview-branch
or deleted-resource links (the URLs in those Sentry events all carry
`?_data=routes/...`, i.e. client-side revalidation, not full-page
navigation).
- Added a `throwNotFound(statusText)` helper in
`app/utils/httpErrors.ts` that throws a Response with status 404,
matching the established pattern in sibling routes (agents, alerts,
bulk-actions, etc.).
- Migrated 5 loader sites to `throwNotFound` (4× "Environment not
found", 1× "Schedule not found").
- Migrated 1 loader site (`runs._index` project branch) to
`redirectWithErrorMessage("/", request, "Project not found")` to match
the pre-existing convention used by every other dashboard route's
project-not-found branch.
- Intentionally **not** touched: bare `throw new Error("X not found")`
inside `resources.*` action routes (sit inside try/catch blocks that
already redirect with a flash message), the invariant assertion in
`vercel.connect.tsx`, and the admin config check in
`admin.api.v1.runs-replication.backfill.ts`.

## Where the fix is visible

Normal browser navigation to these URLs doesn't reach the buggy loaders
— the parent env-layout
(`_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam/route.tsx`)
already filters missing envs/projects and redirects/404s before the
child loader runs. The bug fires exclusively when Remix calls a single
child loader via `?_data=routes/...`, which happens during client-side
navigation or `useRevalidator`. That matches every Sentry event URL.

## Test plan

- [x] Unit test for the new helper —
`apps/webapp/test/httpErrors.test.ts`
- [x] `pnpm run typecheck --filter webapp` clean
- [x] Manual verification via Playwright on `main` vs this branch (6
cases): main returns 500 for each defective `_data` URL; branch returns
404 or 204 + `X-Remix-Redirect` as designed
- [x] Verified user-visible 404 catch boundary on `schedules/<missing>`
(the one case reachable via normal nav)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Env-var lookups via `GET
/api/v1/projects/:projectRef/envvars/:slug/:name` run a Prisma
`findMany` on `EnvironmentVariableValue` filtered by `environmentId` +
`isSecret`. The only existing indexes are the primary key and a unique
on `(variableId, environmentId)`, so `environmentId` is never the
leading column — the planner falls back to a Parallel Seq Scan over the
whole table to find what is, in practice, a handful of rows per
environment.

Two changes:

- Add a btree index on `EnvironmentVariableValue(environmentId)` so the
planner switches to an index scan. The composite `(variableId,
environmentId)` unique stays in place; the new index is purely additive.
- Route the `findMany` inside `getEnvironmentWithRedactedSecrets`
through the read replica via a new `replicaClient` constructor param on
the repository (defaulting to `$replica`, mirroring how `prismaClient`
defaults to `prisma`). Writes and read-after-write methods stay on the
primary.

## Test plan

- [ ] `pnpm run typecheck --filter webapp`
- [ ] Confirm `EXPLAIN` plan flips from Parallel Seq Scan to an index
scan
- [ ] Existing env-var route tests still pass
(`OBJECT_STORE_BASE_URL`) and a named protocol provider
(`OBJECT_STORE_DEFAULT_PROTOCOL=s3`), chat.agent session snapshot writes
landed in the named provider but reads fell through to the default — so
the recovery boot couldn't find the snapshot it had just written.

After a mid-stream cancel, the missing snapshot triggered a fallback
replay path that dropped the user's follow-up message, leaving the chat
stuck in `submitted` indefinitely.

Fix:
- New `/api/v1/sessions/:id/snapshot-url` route handles PUT + GET
  symmetrically — both prefix unprefixed keys with
  `OBJECT_STORE_DEFAULT_PROTOCOL` so they always round-trip through the
  same store.
- `Session.chatSnapshotStoragePath` persists the resolved URI on first
  write so future protocol changes don't strand existing snapshots.
  Reads prefer the stored URI and fall back to the computed default for
  pre-column sessions.
- SDK calls `createChatSnapshotUploadUrl` / `getChatSnapshotUrl`; the
  generic v1/v2 packets endpoints are unchanged.
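
The symmetric read/write resolution can be sketched as below. The `protocol://key` URI form and the key layout are assumptions made for illustration, and `withDefaultProtocol` / `resolveSnapshotPath` are hypothetical helpers rather than the route's actual code.

```typescript
type SessionRow = { id: string; chatSnapshotStoragePath?: string | null };

// Prefix an unprefixed key with the default protocol so reads and writes
// resolve to the same store. Already-prefixed keys are left alone.
function withDefaultProtocol(key: string, defaultProtocol?: string): string {
  if (key.indexOf("://") !== -1 || !defaultProtocol) return key;
  return `${defaultProtocol}://${key}`;
}

// Prefer the URI persisted on first write; fall back to the computed default
// for sessions created before the column existed.
function resolveSnapshotPath(session: SessionRow, defaultProtocol?: string): string {
  return (
    session.chatSnapshotStoragePath ??
    withDefaultProtocol(`sessions/${session.id}/snapshot.json`, defaultProtocol)
  );
}

const newSession: SessionRow = { id: "sess_1" };
const writePath = resolveSnapshotPath(newSession, "s3");
const readPath = resolveSnapshotPath(newSession, "s3");

// After the first write the resolved URI is stored on the row, so a later
// change of default protocol does not strand the snapshot.
const persistedSession: SessionRow = { id: "sess_1", chatSnapshotStoragePath: writePath };
const readAfterProtocolChange = resolveSnapshotPath(persistedSession, "r2");
```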

## Test plan
- [x] Configure local with two providers (R2 default + MinIO `s3` named)
      and `OBJECT_STORE_DEFAULT_PROTOCOL=s3`.
- [x] Reproduce hang: send a message, cancel mid-stream, send another —
      without the fix it hangs in `submitted`; with the fix it streams.
- [x] Snapshot lands in the `s3`-protocol bucket and
      `Session.chatSnapshotStoragePath` is set after first write.
- [x] SDK unit tests pass; webapp typecheck passes.
…#3684)

## Summary

Type `chat.createStartSessionAction` against the chat agent so
`clientData` is typed end-to-end on the first turn. Closes the gap where
`useTriggerChatTransport`'s `startSession` callback already hands you a
typed `clientData` (via the transport generic) but the server-side
action couldn't accept it without untyped routing through the `metadata`
field.

## Design

`ChatStartSessionParams` gains a typed `clientData` field via the new
generic:

```ts
export type ChatStartSessionParams<TChat extends AnyTask = AnyTask> = {
  chatId: string;
  clientData?: InferChatClientData<TChat>;
  triggerConfig?: Partial<SessionTriggerConfig>;
  metadata?: Record<string, unknown>;
};

function createChatStartSessionAction<TChat extends AnyTask = AnyTask>(
  taskId: string,
  options?: CreateChatStartSessionActionOptions
): (params: ChatStartSessionParams<TChat>) => Promise<ChatStartSessionResult>
```

When provided, `clientData` is folded into the first run's
`triggerConfig.basePayload.metadata`, so `onPreload` / `onChatStart` see
the same shape per-turn `metadata` carries via the transport. The opaque
session-level `metadata` field stays exactly as before — it lands on the
Session row, not the run payload.
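
As a rough sketch of the folding step, assuming `clientData` is spread over whatever `basePayload.metadata` already holds. The merge order, the `foldClientData` name, and the simplified local types are assumptions for illustration, not the SDK's code.

```ts
// Simplified stand-in for the SDK's trigger config; only the fields used here.
type TriggerConfigLike = {
  basePayload?: { metadata?: Record<string, unknown>; [key: string]: unknown };
};

function foldClientData<TClientData extends Record<string, unknown>>(
  triggerConfig: TriggerConfigLike | undefined,
  clientData: TClientData | undefined
): TriggerConfigLike {
  // Nothing to fold: leave the config untouched.
  if (clientData === undefined) return triggerConfig ?? {};
  return {
    ...triggerConfig,
    basePayload: {
      ...triggerConfig?.basePayload,
      // Assumption: clientData wins over pre-existing metadata keys.
      metadata: { ...triggerConfig?.basePayload?.metadata, ...clientData },
    },
  };
}
```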

## Usage

```ts
// actions.ts
import { chat } from "@trigger.dev/sdk/ai";
import type { myChat } from "@/trigger/chat";

export const startChatSession = chat.createStartSessionAction<typeof myChat>("my-chat");
```

```tsx
// Chat.tsx
const transport = useTriggerChatTransport<typeof myChat>({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) =>
    startChatSession({ chatId, clientData }),
});
```

## Test plan

- [x] `pnpm run build --filter @trigger.dev/sdk` passes
- [ ] Verify a `chat.agent` with `clientDataSchema` reads the typed
clientData from `onPreload` payload metadata on the first turn
…3685)

## Summary

Fix pre-existing typecheck errors in `references/ai-chat` against the
current SDK shape. Unblocks `pnpm exec tsc --noEmit` in the reference
project.

## What changed

Three categories of fixes inside `references/ai-chat`. No SDK changes.

### 1. `payload.messages` → `payload.message`

The wire payload is now delta-only — one new message per trigger,
optional. Old code in two raw-task files reads `payload.messages`
(plural array) which no longer exists.

```ts
// before
const messages = await conversation.addIncoming(currentPayload.messages, ...);

// after
const messages = await conversation.addIncoming(
  currentPayload.message ? [currentPayload.message] : [],
  ...
);
```

Same fix to the `chat.messages.on` handler, reading `msg.message`
(singular) instead of `msg.messages[length - 1]`.

### 2. `clientData` non-null assertion in `cf-trust-test`

`ChatTurnContext.clientData` is typed as `?: TClientData` on
`onTurnStart` / `run` event objects even when the agent declares a
`clientDataSchema`. The runtime validates against the schema before the
hook fires, so it's structurally non-null — but TypeScript can't know
that. Non-null assert for now.

Follow-up worth filing: narrow `ChatTurnContext.clientData` to
non-optional when the agent has a `clientDataSchema`. Same friction the
docs friction-test subagent flagged.
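
One possible shape for that follow-up, sketched with simplified stand-in types. `TurnContext`, the boolean `THasSchema` flag, and `requireClientData` are hypothetical names for this example; the SDK's real generics may be arranged differently.

```ts
// Simplified stand-in for the hook context; not the SDK's actual type.
// When a schema is declared, clientData becomes required.
type TurnContext<TClientData, THasSchema extends boolean> = THasSchema extends true
  ? { clientData: TClientData }
  : { clientData?: TClientData };

// Runtime guard that can stand in for a bare `!` assertion in the meantime.
function requireClientData<T>(clientData: T | undefined): T {
  if (clientData === undefined) {
    throw new Error("clientData missing; schema validation should have populated it");
  }
  return clientData;
}

const withSchema: TurnContext<{ userId: string }, true> = {
  clientData: { userId: "u1" },
};
// No assertion needed: the conditional type made clientData non-optional.
const userId: string = withSchema.clientData.userId;
```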

### 3. `stress-emit.parseConfig` retyped against `ModelMessage[]`

The `run` callback hands `messages: ModelMessage[]`, not `UIMessage[]`.
Update `parseConfig` to accept `ModelMessage[]` and pull text from
`content` (string or array-of-parts).
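
A minimal sketch of pulling text out of either content shape. The local `MessageLike` type is a trimmed-down assumption standing in for the AI SDK's `ModelMessage`, and `textOf` is a hypothetical helper name, not the code in `stress-emit`.

```ts
// Minimal local stand-ins; the real `ModelMessage` has more part kinds.
type TextPart = { type: "text"; text: string };
type OtherPart = { type: string; [key: string]: unknown };
type MessageLike = { role: string; content: string | Array<TextPart | OtherPart> };

function textOf(message: MessageLike): string {
  // Plain-string content is returned as-is.
  if (typeof message.content === "string") return message.content;
  // Array content: keep only text parts and concatenate them in order.
  return message.content
    .filter(
      (part): part is TextPart =>
        part.type === "text" && typeof (part as TextPart).text === "string"
    )
    .map((part) => part.text)
    .join("");
}
```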

## Test plan

- [x] `pnpm exec tsc --noEmit` in `references/ai-chat` passes (was 8
errors, now 0)
## Summary

Two CI workflows were blocking the v4.5.0-rc.0 release PR (#3563) and
would block every future changeset release PR.

### 1. `changesets-pr.yml` — self-report `All PR Checks`

The changesets bot pushes commits authored by `GITHUB_TOKEN`. By GitHub
design, `GITHUB_TOKEN`-authored pushes can't trigger downstream
workflows (loop-prevention). That means `pr_checks.yml` never fires on
release-PR commits, leaving the required `All PR Checks` status
permanently `Expected — Waiting for status to be reported`. The PR can't
merge.

The fix: after `changesets/action` creates the PR, post a `success`
check with the exact `All PR Checks` context onto the PR's head SHA.
GitHub's required-check evaluation is satisfied by any check with the
right context name — the source doesn't have to be `pr_checks.yml`.
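
For illustration, the self-reported status could be built like this, assuming the workflow uses GitHub's commit status endpoint (`POST /repos/{owner}/{repo}/statuses/{sha}`). The actual workflow step may use a different mechanism; `buildAllChecksStatus`, the description text, and the owner/repo arguments are placeholders for this sketch.

```ts
type StatusRequest = {
  method: "POST";
  url: string;
  body: { state: "success"; context: string; description: string };
};

function buildAllChecksStatus(owner: string, repo: string, headSha: string): StatusRequest {
  return {
    method: "POST",
    url: `https://api.github.com/repos/${owner}/${repo}/statuses/${headSha}`,
    body: {
      state: "success",
      // Must match the required check's context name exactly.
      context: "All PR Checks",
      // Hypothetical description text.
      description: "Release PR: changesets already passed CI on main",
    },
  };
}
```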

**Why this is safe:** the release PR only mechanically bumps
`package.json`, `pnpm-lock.yaml`, and `CHANGELOG.md` from changesets
that were already on `main` (and already ran full CI when they merged).
If a human ever pushes a commit to `changeset-release/main`,
`pr_checks.yml` fires on that push (real user, not `GITHUB_TOKEN`) and
posts its own `All PR Checks` status — last write wins for the same
context on the same SHA, so the human-push result overrides the
auto-success.

### 2. `vouch-check-pr.yml` — exempt `github-actions[bot]`

The `require-draft` job auto-closes any non-draft PR whose author is not
a `MEMBER`/`OWNER`/`COLLABORATOR`, with an explicit allowlist for
`devin-ai-integration[bot]` and `dependabot[bot]`. The changesets bot
publishes as `github-actions[bot]` with `author_association:
CONTRIBUTOR`, so every release PR was getting auto-closed on open with a
"please re-open as draft" comment. Add `github-actions[bot]` to the
exemption list.
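
The gating rule reads roughly like the predicate below. This is a sketch of the logic as described, not the workflow's actual expression; `mustBeDraft` and the set names are made up for the example.

```ts
const TRUSTED_ASSOCIATIONS = new Set(["MEMBER", "OWNER", "COLLABORATOR"]);

// Allowlist after this change, including the changesets bot's identity.
const EXEMPT_BOTS = new Set([
  "devin-ai-integration[bot]",
  "dependabot[bot]",
  "github-actions[bot]",
]);

// Returns true when a non-draft PR from this author would be auto-closed.
function mustBeDraft(authorLogin: string, authorAssociation: string): boolean {
  if (TRUSTED_ASSOCIATIONS.has(authorAssociation)) return false;
  return !EXEMPT_BOTS.has(authorLogin);
}
```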

## Test plan
- [ ] After merge, the next changeset bot push to
`changeset-release/main` should post `All PR Checks: success` on the
release PR's head SHA, and the PR should not get auto-closed by `Vouch -
Check PR`.
- [ ] Confirm `pr_checks.yml` still fires + gates normal
(human-authored) PRs to `main`.
## Summary
44 improvements, 1 bug fix.

## Improvements
- **AI Prompts** — define prompt templates as code alongside your tasks,
version them on deploy, and override the text or model from the
dashboard without redeploying. Prompts integrate with the Vercel AI SDK
via `toAISDKTelemetry()` (links every generation span back to the
prompt) and with `chat.agent` via `chat.prompt.set()` +
`chat.toStreamTextOptions()`.
([#3629](#3629))
- **Code-defined, deploy-versioned templates** — define with
`prompts.define({ id, model, config, variables, content })`. Every
deploy creates a new version visible in the dashboard. Mustache-style
placeholders (`{{var}}`, `{{#cond}}...{{/cond}}`) with Zod / ArkType /
Valibot-typed variables.
- **Dashboard overrides** — change a prompt's text or model from the
dashboard without redeploying. Overrides take priority over the deployed
"current" version and are environment-scoped (dev / staging / production
independent).
- **Resolve API** — `prompt.resolve(vars, { version?, label? })` returns
the compiled `text`, resolved `model`, `version`, and labels. Standalone
`prompts.resolve<typeof handle>(slug, vars)` for cross-file resolution
with full type inference on slug and variable shape.
- **AI SDK integration** — spread `resolved.toAISDKTelemetry({ ...extra
})` into any `generateText` / `streamText` call and every generation
span links to the prompt in the dashboard alongside its input variables,
model, tokens, and cost.
- **`chat.agent` integration** — `chat.prompt.set(resolved)` stores the
resolved prompt run-scoped; `chat.toStreamTextOptions({ registry })`
pulls `system`, `model` (resolved via the AI SDK provider registry),
`temperature` / `maxTokens` / etc., and telemetry into a single spread
for `streamText`.
- **Management SDK** — `prompts.list()`, `prompts.versions(slug)`,
`prompts.promote(slug, version)`, `prompts.createOverride(slug, body)`,
`prompts.updateOverride(slug, body)`, `prompts.removeOverride(slug)`,
`prompts.reactivateOverride(slug, version)`.
- **Dashboard** — prompts list with per-prompt usage sparklines;
per-prompt detail with Template / Details / Versions / Generations /
Metrics tabs. AI generation spans get a custom inspector showing the
linked prompt's metadata, input variables, and template content
alongside model, tokens, cost, and the message thread.
- Adds `onBoot` to `chat.agent` — a lifecycle hook that fires once per
worker process picking up the chat. Runs for the initial run, preloaded
runs, AND reactive continuation runs (post-cancel, crash, `endRun`,
`requestUpgrade`, OOM retry), before any other hook. Use it to
initialize `chat.local`, open per-process resources, or re-hydrate state
from your DB on continuation — anywhere the SAME run picking up after
suspend/resume isn't enough.
([#3543](#3543))
- **AI SDK `useChat` integration** — a custom
[`ChatTransport`](https://sdk.vercel.ai/docs/ai-sdk-ui/transport)
(`useTriggerChatTransport`) plugs straight into Vercel AI SDK's
`useChat` hook. Text streaming, tool calls, reasoning, and `data-*`
parts all work natively over Trigger.dev's realtime streams. No custom
API routes needed.
- **First-turn fast path (`chat.headStart`)** — opt-in handler that runs
the first turn's `streamText` step in your warm server process while the
agent run boots in parallel, cutting cold-start TTFC by roughly half
(measured 2801ms → 1218ms on `claude-sonnet-4-6`). The agent owns step
2+ (tool execution, persistence, hooks) so heavy deps stay where they
belong. Web Fetch handler works natively in Next.js, Hono, SvelteKit,
Remix, Workers, etc.; bridge to Express/Fastify/Koa via
`chat.toNodeListener`. New `@trigger.dev/sdk/chat-server` subpath.
- **Multi-turn durability via Sessions** — every chat is backed by a
durable Session that outlives any individual run. Conversations resume
across page refreshes, idle timeout, crashes, and deploys; `resume:
true` reconnects via `lastEventId` so clients only see new chunks.
`sessions.list` enumerates chats for inbox-style UIs.
- **Auto-accumulated history, delta-only wire** — the backend
accumulates the full conversation across turns; clients only ship the
new message each turn. Long chats never hit the 512 KiB body cap.
Register `hydrateMessages` to be the source of truth yourself.
- **Lifecycle hooks** — `onPreload`, `onChatStart`,
`onValidateMessages`, `hydrateMessages`, `onTurnStart`,
`onBeforeTurnComplete`, `onTurnComplete`, `onChatSuspend`,
`onChatResume` — for persistence, validation, and post-turn work.
- **Stop generation** — client-driven `transport.stopGeneration(chatId)`
aborts mid-stream; the run stays alive for the next message, partial
response is captured, and aborted parts (stuck `partial-call` tools,
in-progress reasoning) are auto-cleaned.
- **Tool approvals (HITL)** — tools with `needsApproval: true` pause
until the user approves or denies via `addToolApprovalResponse`. The
runtime reconciles the updated assistant message by ID and continues
`streamText`.
- **Steering and background injection** — `pendingMessages` injects user
messages between tool-call steps so users can steer the agent
mid-execution; `chat.inject()` + `chat.defer()` adds context from
background work (self-review, RAG, safety checks) between turns.
- **Actions** — non-turn frontend commands (undo, rollback, regenerate,
edit) sent via `transport.sendAction`. Fire `hydrateMessages` +
`onAction` only — no turn hooks, no `run()`. `onAction` can return a
`StreamTextResult` for a model response, or `void` for side-effect-only.
- **Typed state primitives** — `chat.local<T>` for per-run state
accessible from hooks, `run()`, tools, and subtasks (auto-serialized
through `ai.toolExecute`); `chat.store` for typed shared data between
agent and client; `chat.history` for reading and mutating the message
chain; `clientDataSchema` for typed `clientData` in every hook.
- **`chat.toStreamTextOptions()`** — one spread into `streamText` wires
up versioned system [Prompts](https://trigger.dev/docs/ai/prompts),
model resolution, telemetry metadata, compaction, steering, and
background injection.
- **Multi-tab coordination** — `multiTab: true` + `useMultiTabChat`
prevents duplicate sends and syncs state across browser tabs via
`BroadcastChannel`. Non-active tabs go read-only with live updates.
- **Network resilience** — built-in indefinite retry with bounded
backoff, reconnect on `online` / tab refocus / bfcache restore,
`Last-Event-ID` mid-stream resume. No app code needed.
- **Sessions** — a durable, run-aware stream channel keyed on a stable
`externalId`. A Session is the unit of state that owns a multi-run
conversation: messages flow through `.in`, responses through `.out`,
both survive run boundaries. Sessions back the new `chat.agent` runtime,
and you can build on them directly for any pattern that needs durable
bi-directional streaming across runs.
([#3542](#3542))
- Add `ai.toolExecute(task)` so you can wire a Trigger subtask in as the
`execute` handler of an AI SDK `tool()` while defining `description` and
`inputSchema` yourself — useful when you want full control over the tool
surface and just need Trigger's subtask machinery for the body.
([#3546](#3546))
- Type `chat.createStartSessionAction` against your chat agent so
`clientData` is typed end-to-end on the first turn.
([#3684](#3684))
- Add `region` to the runs list / retrieve API: filter runs by region
(`runs.list({ region: "..." })` / `filter[region]=<masterQueue>`) and
read each run's executing region from the new `region` field on the
response.
([#3612](#3612))
- Add `TRIGGER_BUILD_SKIP_REWRITE_TIMESTAMP=1` escape hatch for local
self-hosted builds whose buildx driver doesn't support
`rewrite-timestamp` alongside push (e.g. orbstack's default `docker`
driver).
([#3618](#3618))
- Reject overlong `idempotencyKey` values at the API boundary so they no
longer trip an internal size limit on the underlying unique index and
surface as a generic 500. Inputs are capped at 2048 characters — well
above what `idempotencyKeys.create()` produces (a 64-character hash) and
above any realistic raw key. Applies to `tasks.trigger`,
`tasks.batchTrigger`, `batch.create` (Phase 1 streaming batches),
`wait.createToken`, `wait.forDuration`, and the input/session stream
waitpoint endpoints. Over-limit requests now return a structured 400
instead.
([#3560](#3560))
- Retry `TASK_PROCESS_SIGSEGV` task crashes under the user's retry
policy instead of failing the run on the first segfault. SIGSEGV in Node
tasks is frequently non-deterministic (native addon races, JIT/GC
interaction, near-OOM in native code, host issues), so retrying on a
fresh process often succeeds. The retry is gated by the task's existing
`retry` config + `maxAttempts` — same path `TASK_PROCESS_SIGTERM` and
uncaught exceptions already use — so tasks without a retry policy still
fail fast.
([#3552](#3552))
- Add the public interfaces for a plugin system; the initial set
consolidates the authentication and authorization interfaces.
([#3499](#3499))
- Add MollifierBuffer and MollifierDrainer primitives for trigger burst
smoothing.
([#3614](#3614))
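
To illustrate the `idempotencyKey` cap from the list above, here is a minimal boundary check. Only the 2048-character limit and the 400 status come from the changelog entry; the function name, result shape, and error message are assumptions for this sketch.

```ts
const MAX_IDEMPOTENCY_KEY_LENGTH = 2048; // cap stated in the changelog entry

type ValidationResult = { ok: true } | { ok: false; status: 400; error: string };

function validateIdempotencyKey(key: string | undefined): ValidationResult {
  // Absent keys and keys within the cap pass through unchanged.
  if (key === undefined || key.length <= MAX_IDEMPOTENCY_KEY_LENGTH) {
    return { ok: true };
  }
  // Over-limit keys get a structured 400 instead of surfacing as a 500 later.
  return {
    ok: false,
    status: 400,
    error: `idempotencyKey must be at most ${MAX_IDEMPOTENCY_KEY_LENGTH} characters`,
  };
}
```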

## Bug fixes
- Fix `LocalsKey<T>` type incompatibility across dual-package builds.
The phantom value-type brand no longer uses a module-level `unique
symbol`, so a single TypeScript compilation that resolves the type from
both the ESM and CJS outputs (which can happen under certain pnpm
hoisting layouts) no longer sees two structurally-incompatible variants
of the same type.
([#3626](#3626))
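
A sketch of the general technique, assuming the brand moved to an optional phantom property. The SDK's real `LocalsKey<T>` may be declared differently; `Key`, `createKey`, `setLocal`, and `getLocal` are illustrative names only.

```ts
// Phantom brand via an optional, never-populated property with a plain string
// name. Two copies of this declaration (one per build output) stay
// structurally identical, unlike two separately declared `unique symbol`s.
type Key<T> = { readonly id: string; readonly __type?: T };

function createKey<T>(id: string): Key<T> {
  return { id };
}

const store = new Map<string, unknown>();

function setLocal<T>(key: Key<T>, value: T): void {
  store.set(key.id, value);
}

function getLocal<T>(key: Key<T>): T | undefined {
  // The cast is the only place the phantom type is trusted at runtime.
  return store.get(key.id) as T | undefined;
}
```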

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/sdk@4.5.0-rc.0

### Minor Changes

- **AI Prompts** — define prompt templates as code alongside your tasks,
version them on deploy, and override the text or model from the
dashboard without redeploying. Prompts integrate with the Vercel AI SDK
via `toAISDKTelemetry()` (links every generation span back to the
prompt) and with `chat.agent` via `chat.prompt.set()` +
`chat.toStreamTextOptions()`.
([#3629](#3629))

    ```ts
    import { prompts } from "@trigger.dev/sdk";
    import { generateText } from "ai";
    import { openai } from "@ai-sdk/openai";
    import { z } from "zod";

    export const supportPrompt = prompts.define({
      id: "customer-support",
      model: "gpt-4o",
      config: { temperature: 0.7 },
      variables: z.object({
        customerName: z.string(),
        plan: z.string(),
        issue: z.string(),
      }),
      content: `You are a support agent for Acme.

    Customer: {{customerName}} ({{plan}} plan)
    Issue: {{issue}}`,
    });

    const resolved = await supportPrompt.resolve({
      customerName: "Alice",
      plan: "Pro",
      issue: "Can't access billing",
    });

    const result = await generateText({
      model: openai(resolved.model ?? "gpt-4o"),
      system: resolved.text,
      prompt: "Can't access billing",
      ...resolved.toAISDKTelemetry(),
    });
    ```

    **What you get:**

- **Code-defined, deploy-versioned templates** — define with
`prompts.define({ id, model, config, variables, content })`. Every
deploy creates a new version visible in the dashboard. Mustache-style
placeholders (`{{var}}`, `{{#cond}}...{{/cond}}`) with Zod / ArkType /
Valibot-typed variables.
- **Dashboard overrides** — change a prompt's text or model from the
dashboard without redeploying. Overrides take priority over the deployed
"current" version and are environment-scoped (dev / staging / production
independent).
- **Resolve API** — `prompt.resolve(vars, { version?, label? })` returns
the compiled `text`, resolved `model`, `version`, and labels. Standalone
`prompts.resolve<typeof handle>(slug, vars)` for cross-file resolution
with full type inference on slug and variable shape.
- **AI SDK integration** — spread `resolved.toAISDKTelemetry({ ...extra
})` into any `generateText` / `streamText` call and every generation
span links to the prompt in the dashboard alongside its input variables,
model, tokens, and cost.
- **`chat.agent` integration** — `chat.prompt.set(resolved)` stores the
resolved prompt run-scoped; `chat.toStreamTextOptions({ registry })`
pulls `system`, `model` (resolved via the AI SDK provider registry),
`temperature` / `maxTokens` / etc., and telemetry into a single spread
for `streamText`.
- **Management SDK** — `prompts.list()`, `prompts.versions(slug)`,
`prompts.promote(slug, version)`, `prompts.createOverride(slug, body)`,
`prompts.updateOverride(slug, body)`, `prompts.removeOverride(slug)`,
`prompts.reactivateOverride(slug, version)`.
- **Dashboard** — prompts list with per-prompt usage sparklines;
per-prompt detail with Template / Details / Versions / Generations /
Metrics tabs. AI generation spans get a custom inspector showing the
linked prompt's metadata, input variables, and template content
alongside model, tokens, cost, and the message thread.

See [/docs/ai/prompts](https://trigger.dev/docs/ai/prompts) for the full
reference — template syntax, version resolution order, override
workflow, and type utilities (`PromptHandle`, `PromptIdentifier`,
`PromptVariables`).

- Adds `onBoot` to `chat.agent` — a lifecycle hook that fires once per
worker process picking up the chat. Runs for the initial run, preloaded
runs, AND reactive continuation runs (post-cancel, crash, `endRun`,
`requestUpgrade`, OOM retry), before any other hook. Use it to
initialize `chat.local`, open per-process resources, or re-hydrate state
from your DB on continuation — anywhere the SAME run picking up after
suspend/resume isn't enough.
([#3543](#3543))

    ```ts
    const userContext = chat.local<{ name: string; plan: string }>({ id: "userContext" });

    export const myChat = chat.agent({
      id: "my-chat",
      onBoot: async ({ clientData, continuation }) => {
        const user = await db.user.findUnique({ where: { id: clientData.userId } });
        userContext.init({ name: user.name, plan: user.plan });
      },
      run: async ({ messages, signal }) =>
        streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
    });
    ```

Use `onBoot` (not `onChatStart`) for state setup that must run every
time a worker picks up the chat — `onChatStart` fires once per chat and
won't run on continuation, leaving `chat.local` uninitialized when
`run()` tries to use it.

- **AI Agents** — run AI SDK chat completions as durable Trigger.dev
agents instead of fragile API routes. Define an agent in one function,
point `useChat` at it from React, and the conversation survives page
refreshes, network blips, and process restarts.
([#3543](#3543))

    ```ts
    import { chat } from "@trigger.dev/sdk/ai";
    import { streamText } from "ai";
    import { openai } from "@ai-sdk/openai";

    export const myChat = chat.agent({
      id: "my-chat",
      run: async ({ messages, signal }) =>
        streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
    });
    ```

    ```tsx
    import { useChat } from "@ai-sdk/react";
    import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";

    const transport = useTriggerChatTransport({ task: "my-chat", accessToken, startSession });
    const { messages, sendMessage } = useChat({ transport });
    ```

    **What you get:**

- **AI SDK `useChat` integration** — a custom
[`ChatTransport`](https://sdk.vercel.ai/docs/ai-sdk-ui/transport)
(`useTriggerChatTransport`) plugs straight into Vercel AI SDK's
`useChat` hook. Text streaming, tool calls, reasoning, and `data-*`
parts all work natively over Trigger.dev's realtime streams. No custom
API routes needed.
- **First-turn fast path (`chat.headStart`)** — opt-in handler that runs
the first turn's `streamText` step in your warm server process while the
agent run boots in parallel, cutting cold-start TTFC by roughly half
(measured 2801ms → 1218ms on `claude-sonnet-4-6`). The agent owns step
2+ (tool execution, persistence, hooks) so heavy deps stay where they
belong. Web Fetch handler works natively in Next.js, Hono, SvelteKit,
Remix, Workers, etc.; bridge to Express/Fastify/Koa via
`chat.toNodeListener`. New `@trigger.dev/sdk/chat-server` subpath.
- **Multi-turn durability via Sessions** — every chat is backed by a
durable Session that outlives any individual run. Conversations resume
across page refreshes, idle timeout, crashes, and deploys; `resume:
true` reconnects via `lastEventId` so clients only see new chunks.
`sessions.list` enumerates chats for inbox-style UIs.
- **Auto-accumulated history, delta-only wire** — the backend
accumulates the full conversation across turns; clients only ship the
new message each turn. Long chats never hit the 512 KiB body cap.
Register `hydrateMessages` to be the source of truth yourself.
- **Lifecycle hooks** — `onPreload`, `onChatStart`,
`onValidateMessages`, `hydrateMessages`, `onTurnStart`,
`onBeforeTurnComplete`, `onTurnComplete`, `onChatSuspend`,
`onChatResume` — for persistence, validation, and post-turn work.
- **Stop generation** — client-driven `transport.stopGeneration(chatId)`
aborts mid-stream; the run stays alive for the next message, partial
response is captured, and aborted parts (stuck `partial-call` tools,
in-progress reasoning) are auto-cleaned.
- **Tool approvals (HITL)** — tools with `needsApproval: true` pause
until the user approves or denies via `addToolApprovalResponse`. The
runtime reconciles the updated assistant message by ID and continues
`streamText`.
- **Steering and background injection** — `pendingMessages` injects user
messages between tool-call steps so users can steer the agent
mid-execution; `chat.inject()` + `chat.defer()` adds context from
background work (self-review, RAG, safety checks) between turns.
- **Actions** — non-turn frontend commands (undo, rollback, regenerate,
edit) sent via `transport.sendAction`. Fire `hydrateMessages` +
`onAction` only — no turn hooks, no `run()`. `onAction` can return a
`StreamTextResult` for a model response, or `void` for side-effect-only.
- **Typed state primitives** — `chat.local<T>` for per-run state
accessible from hooks, `run()`, tools, and subtasks (auto-serialized
through `ai.toolExecute`); `chat.store` for typed shared data between
agent and client; `chat.history` for reading and mutating the message
chain; `clientDataSchema` for typed `clientData` in every hook.
- **`chat.toStreamTextOptions()`** — one spread into `streamText` wires
up versioned system [Prompts](https://trigger.dev/docs/ai/prompts),
model resolution, telemetry metadata, compaction, steering, and
background injection.
- **Multi-tab coordination** — `multiTab: true` + `useMultiTabChat`
prevents duplicate sends and syncs state across browser tabs via
`BroadcastChannel`. Non-active tabs go read-only with live updates.
- **Network resilience** — built-in indefinite retry with bounded
backoff, reconnect on `online` / tab refocus / bfcache restore,
`Last-Event-ID` mid-stream resume. No app code needed.

See [/docs/ai-chat](https://trigger.dev/docs/ai-chat/overview) for the
full surface — quick start, three backend approaches (`chat.agent`,
`chat.createSession`, raw task), persistence and code-sandbox patterns,
type-level guides, and API reference.

- Add read primitives to `chat.history` for HITL flows:
`getPendingToolCalls()`, `getResolvedToolCalls()`,
`extractNewToolResults(message)`, `getChain()`, and
`findMessage(messageId)`. These lift the accumulator-walking logic that
customers building human-in-the-loop tools were re-implementing into the
SDK. ([#3543](#3543))

Use `getPendingToolCalls()` to gate fresh user turns while a tool call
is awaiting an answer. Use `extractNewToolResults(message)` to dedup
tool results when persisting to your own store — the helper returns only
the parts whose `toolCallId` is not already resolved on the chain.

    ```ts
    const pending = chat.history.getPendingToolCalls();
    if (pending.length > 0) {
      // an addToolOutput is expected before a new user message
    }

    onTurnComplete: async ({ responseMessage }) => {
      const newResults = chat.history.extractNewToolResults(responseMessage);
      for (const r of newResults) {
        await db.toolResults.upsert({ id: r.toolCallId, output: r.output, errorText: r.errorText });
      }
    };
    ```

- **Sessions** — a durable, run-aware stream channel keyed on a stable
`externalId`. A Session is the unit of state that owns a multi-run
conversation: messages flow through `.in`, responses through `.out`,
both survive run boundaries. Sessions back the new `chat.agent` runtime,
and you can build on them directly for any pattern that needs durable
bi-directional streaming across runs.
([#3542](#3542))

    ```ts
    import { sessions, tasks } from "@trigger.dev/sdk";

    // Trigger a task and subscribe to its session output in one call
    const { runId, stream } = await tasks.triggerAndSubscribe("my-task", payload, {
      externalId: "user-456",
    });

    for await (const chunk of stream) {
      // ...
    }

    // Enumerate existing sessions (powers inbox-style UIs without a separate index)
    for await (const s of sessions.list({ type: "chat.agent", tag: "user:user-456" })) {
      console.log(s.id, s.externalId, s.createdAt, s.closedAt);
    }
    ```

See [/docs/ai-chat/overview](https://trigger.dev/docs/ai-chat/overview)
for the full surface — Sessions powers the durable, resumable chat
runtime described there.

### Patch Changes

- Add Agent Skills for `chat.agent`. Drop a folder with a `SKILL.md` and
any helper scripts/references next to your task code, register it with
`skills.define({ id, path })`, and the CLI bundles it into the deploy
image automatically — no `trigger.config.ts` changes. The agent gets a
one-line summary in its system prompt and discovers full instructions on
demand via `loadSkill`, with `bash` and `readFile` tools scoped
per-skill (path-traversal guards, output caps, abort-signal
propagation).
([#3543](#3543))

    ```ts
    const pdfSkill = skills.define({ id: "pdf-extract", path: "./skills/pdf-extract" });

    chat.skills.set([await pdfSkill.local()]);
    ```

Built on the [AI SDK cookbook
pattern](https://ai-sdk.dev/cookbook/guides/agent-skills) — portable
across providers. SDK + CLI only for now; dashboard-editable `SKILL.md`
text is on the roadmap.

- Add `ai.toolExecute(task)` so you can wire a Trigger subtask in as the
`execute` handler of an AI SDK `tool()` while defining `description` and
`inputSchema` yourself — useful when you want full control over the tool
surface and just need Trigger's subtask machinery for the body.
([#3546](#3546))

    ```ts
    const myTool = tool({
      description: "...",
      inputSchema: z.object({ ... }),
      execute: ai.toolExecute(mySubtask),
    });
    ```

`ai.tool(task)` (`toolFromTask`) keeps doing the all-in-one wrap and now
aligns its return type with AI SDK's `ToolSet`. Minimum `ai` peer raised
to `^6.0.116` to avoid cross-version `ToolSet` mismatches in monorepos.

- Stamp `gen_ai.conversation.id` (the chat id) on every span and metric
emitted from inside a `chat.task` or `chat.agent` run. Lets you filter
dashboard spans, runs, and metrics by the chat conversation that
produced them — independent of the run boundary, so multi-run chats
correlate cleanly. No code changes required on the user side.
([#3543](#3543))

- Type `chat.createStartSessionAction` against your chat agent so
`clientData` is typed end-to-end on the first turn:
([#3684](#3684))

    ```ts
    import { chat } from "@trigger.dev/sdk/ai";
    import type { myChat } from "@/trigger/chat";

    export const startChatSession = chat.createStartSessionAction<typeof myChat>("my-chat");

    // In the browser, threaded from the transport's typed startSession callback:
    const transport = useTriggerChatTransport<typeof myChat>({
      task: "my-chat",
      startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
      // ...
    });
    ```

`ChatStartSessionParams` gains a typed `clientData` field — folded into
the first run's `payload.metadata` so `onPreload` / `onChatStart` see
the same shape per-turn `metadata` carries via the transport. The opaque
session-level `metadata` field is unchanged.

- Unit-test `chat.agent` definitions offline with `mockChatAgent` from
`@trigger.dev/sdk/ai/test`. Drives a real agent's turn loop in-process —
no network, no task runtime — so you can send messages, actions, and
stop signals via driver methods, inspect captured output chunks, and
verify hooks fire. Pairs with `MockLanguageModelV3` from `ai/test` for
model mocking. `setupLocals` lets you pre-seed `locals` (DB clients,
service stubs) before `run()` starts.
([#3543](#3543))

The broader `runInMockTaskContext` harness it's built on lives at
`@trigger.dev/core/v3/test` — useful for unit-testing any task code, not
just chat.

- Add `region` to the runs list / retrieve API: filter runs by region
(`runs.list({ region: "..." })` / `filter[region]=<masterQueue>`) and
read each run's executing region from the new `region` field on the
response.
([#3612](#3612))

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## @trigger.dev/build@4.5.0-rc.0

### Patch Changes

- Add Agent Skills for `chat.agent`. Drop a folder with a `SKILL.md` and
any helper scripts/references next to your task code, register it with
`skills.define({ id, path })`, and the CLI bundles it into the deploy
image automatically — no `trigger.config.ts` changes. The agent gets a
one-line summary in its system prompt and discovers full instructions on
demand via `loadSkill`, with `bash` and `readFile` tools scoped
per-skill (path-traversal guards, output caps, abort-signal
propagation).
([#3543](#3543))

    ```ts
    const pdfSkill = skills.define({ id: "pdf-extract", path: "./skills/pdf-extract" });

    chat.skills.set([await pdfSkill.local()]);
    ```

Built on the [AI SDK cookbook
pattern](https://ai-sdk.dev/cookbook/guides/agent-skills) — portable
across providers. SDK + CLI only for now; dashboard-editable `SKILL.md`
text is on the roadmap.

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## trigger.dev@4.5.0-rc.0

### Patch Changes

- Add Agent Skills for `chat.agent`. Drop a folder with a `SKILL.md` and
any helper scripts/references next to your task code, register it with
`skills.define({ id, path })`, and the CLI bundles it into the deploy
image automatically — no `trigger.config.ts` changes. The agent gets a
one-line summary in its system prompt and discovers full instructions on
demand via `loadSkill`, with `bash` and `readFile` tools scoped
per-skill (path-traversal guards, output caps, abort-signal
propagation).
([#3543](#3543))

    ```ts
    const pdfSkill = skills.define({ id: "pdf-extract", path: "./skills/pdf-extract" });

    chat.skills.set([await pdfSkill.local()]);
    ```

Built on the [AI SDK cookbook
pattern](https://ai-sdk.dev/cookbook/guides/agent-skills) — portable
across providers. SDK + CLI only for now; dashboard-editable `SKILL.md`
text is on the roadmap.

- Add `TRIGGER_BUILD_SKIP_REWRITE_TIMESTAMP=1` escape hatch for local
self-hosted builds whose buildx driver doesn't support
`rewrite-timestamp` alongside push (e.g. orbstack's default `docker`
driver).
([#3618](#3618))

- The CLI MCP server's agent-chat tools (`start_agent_chat`,
`send_agent_message`, `close_agent_chat`) now run on the new Sessions
primitive, so AI assistants driving a `chat.agent` get the same
idempotent-by-`chatId`, durable-across-runs behavior the browser
transport gets. Required PAT scopes go from `write:inputStreams` to
`read:sessions` + `write:sessions`.
([#3546](#3546))

- MCP `list_runs` tool: add a `region` filter input and surface each
run's executing region in the formatted summary.
([#3612](#3612))

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`
    -   `@trigger.dev/build@4.5.0-rc.0`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.0`

## @trigger.dev/core@4.5.0-rc.0

### Patch Changes

- Add Agent Skills for `chat.agent`. Drop a folder with a `SKILL.md` and
any helper scripts/references next to your task code, register it with
`skills.define({ id, path })`, and the CLI bundles it into the deploy
image automatically — no `trigger.config.ts` changes. The agent gets a
one-line summary in its system prompt and discovers full instructions on
demand via `loadSkill`, with `bash` and `readFile` tools scoped
per-skill (path-traversal guards, output caps, abort-signal
propagation).
([#3543](#3543))

    ```ts
    const pdfSkill = skills.define({ id: "pdf-extract", path: "./skills/pdf-extract" });

    chat.skills.set([await pdfSkill.local()]);
    ```

Built on the [AI SDK cookbook
pattern](https://ai-sdk.dev/cookbook/guides/agent-skills) — portable
across providers. SDK + CLI only for now; dashboard-editable `SKILL.md`
text is on the roadmap.

- Reject overlong `idempotencyKey` values at the API boundary so they no
longer trip an internal size limit on the underlying unique index and
surface as a generic 500. Inputs are capped at 2048 characters — well
above what `idempotencyKeys.create()` produces (a 64-character hash) and
above any realistic raw key. Applies to `tasks.trigger`,
`tasks.batchTrigger`, `batch.create` (Phase 1 streaming batches),
`wait.createToken`, `wait.forDuration`, and the input/session stream
waitpoint endpoints. Over-limit requests now return a structured 400
instead.
([#3560](#3560))
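
A minimal sketch of this kind of boundary check. Only the 2048-character cap comes from the entry above; the `validateIdempotencyKey` helper, its result type, and the error message are hypothetical and not the actual implementation:

```ts
// Hypothetical boundary check: names and error shape are illustrative only.
const MAX_IDEMPOTENCY_KEY_LENGTH = 2048; // cap stated in the changelog entry

type ValidationResult =
  | { ok: true; key: string | undefined }
  | { ok: false; status: 400; error: string };

function validateIdempotencyKey(key: string | undefined): ValidationResult {
  // A missing key is allowed: idempotency is optional.
  if (key === undefined) return { ok: true, key };

  if (key.length > MAX_IDEMPOTENCY_KEY_LENGTH) {
    // Reject up front with a structured 400 instead of letting the
    // unique index raise an opaque error that surfaces as a 500.
    return {
      ok: false,
      status: 400,
      error: `idempotencyKey must be at most ${MAX_IDEMPOTENCY_KEY_LENGTH} characters (got ${key.length})`,
    };
  }

  return { ok: true, key };
}
```

Checking at the API boundary keeps the limit in one place, so every endpoint that accepts a key can return the same structured error.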

- **AI Agents** — run AI SDK chat completions as durable Trigger.dev
agents instead of fragile API routes. Define an agent in one function,
point `useChat` at it from React, and the conversation survives page
refreshes, network blips, and process restarts.
([#3543](#3543))

    ```ts
    import { chat } from "@trigger.dev/sdk/ai";
    import { streamText } from "ai";
    import { openai } from "@ai-sdk/openai";

    export const myChat = chat.agent({
      id: "my-chat",
      run: async ({ messages, signal }) =>
        streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
    });
    ```

    ```tsx
    import { useChat } from "@ai-sdk/react";
    import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";

    const transport = useTriggerChatTransport({ task: "my-chat", accessToken, startSession });
    const { messages, sendMessage } = useChat({ transport });
    ```

    **What you get:**

- **AI SDK `useChat` integration** — a custom
[`ChatTransport`](https://sdk.vercel.ai/docs/ai-sdk-ui/transport)
(`useTriggerChatTransport`) plugs straight into Vercel AI SDK's
`useChat` hook. Text streaming, tool calls, reasoning, and `data-*`
parts all work natively over Trigger.dev's realtime streams. No custom
API routes needed.
- **First-turn fast path (`chat.headStart`)** — opt-in handler that runs
the first turn's `streamText` step in your warm server process while the
agent run boots in parallel, cutting cold-start TTFC by roughly half
(measured 2801ms → 1218ms on `claude-sonnet-4-6`). The agent owns step
2+ (tool execution, persistence, hooks) so heavy deps stay where they
belong. Web Fetch handler works natively in Next.js, Hono, SvelteKit,
Remix, Workers, etc.; bridge to Express/Fastify/Koa via
`chat.toNodeListener`. New `@trigger.dev/sdk/chat-server` subpath.
- **Multi-turn durability via Sessions** — every chat is backed by a
durable Session that outlives any individual run. Conversations resume
across page refreshes, idle timeout, crashes, and deploys; `resume:
true` reconnects via `lastEventId` so clients only see new chunks.
`sessions.list` enumerates chats for inbox-style UIs.
- **Auto-accumulated history, delta-only wire** — the backend
accumulates the full conversation across turns; clients only ship the
new message each turn. Long chats never hit the 512 KiB body cap.
Register `hydrateMessages` to be the source of truth yourself.
- **Lifecycle hooks** — `onPreload`, `onChatStart`,
`onValidateMessages`, `hydrateMessages`, `onTurnStart`,
`onBeforeTurnComplete`, `onTurnComplete`, `onChatSuspend`,
`onChatResume` — for persistence, validation, and post-turn work.
- **Stop generation** — client-driven `transport.stopGeneration(chatId)`
aborts mid-stream; the run stays alive for the next message, partial
response is captured, and aborted parts (stuck `partial-call` tools,
in-progress reasoning) are auto-cleaned.
- **Tool approvals (HITL)** — tools with `needsApproval: true` pause
until the user approves or denies via `addToolApprovalResponse`. The
runtime reconciles the updated assistant message by ID and continues
`streamText`.
- **Steering and background injection** — `pendingMessages` injects user
messages between tool-call steps so users can steer the agent
mid-execution; `chat.inject()` + `chat.defer()` adds context from
background work (self-review, RAG, safety checks) between turns.
- **Actions** — non-turn frontend commands (undo, rollback, regenerate,
edit) sent via `transport.sendAction`. Fire `hydrateMessages` +
`onAction` only — no turn hooks, no `run()`. `onAction` can return a
`StreamTextResult` for a model response, or `void` for side-effect-only.
- **Typed state primitives** — `chat.local<T>` for per-run state
accessible from hooks, `run()`, tools, and subtasks (auto-serialized
through `ai.toolExecute`); `chat.store` for typed shared data between
agent and client; `chat.history` for reading and mutating the message
chain; `clientDataSchema` for typed `clientData` in every hook.
- **`chat.toStreamTextOptions()`** — one spread into `streamText` wires
up versioned system [Prompts](https://trigger.dev/docs/ai/prompts),
model resolution, telemetry metadata, compaction, steering, and
background injection.
- **Multi-tab coordination** — `multiTab: true` + `useMultiTabChat`
prevents duplicate sends and syncs state across browser tabs via
`BroadcastChannel`. Non-active tabs go read-only with live updates.
- **Network resilience** — built-in indefinite retry with bounded
backoff, reconnect on `online` / tab refocus / bfcache restore,
`Last-Event-ID` mid-stream resume. No app code needed.

See [/docs/ai-chat](https://trigger.dev/docs/ai-chat/overview) for the
full surface — quick start, three backend approaches (`chat.agent`,
`chat.createSession`, raw task), persistence and code-sandbox patterns,
type-level guides, and API reference.

- Stamp `gen_ai.conversation.id` (the chat id) on every span and metric
emitted from inside a `chat.task` or `chat.agent` run. Lets you filter
dashboard spans, runs, and metrics by the chat conversation that
produced them — independent of the run boundary, so multi-run chats
correlate cleanly. No code changes required on the user side.
([#3543](#3543))

- Fix `LocalsKey<T>` type incompatibility across dual-package builds.
The phantom value-type brand no longer uses a module-level `unique
symbol`, so a single TypeScript compilation that resolves the type from
both the ESM and CJS outputs (which can happen under certain pnpm
hoisting layouts) no longer sees two structurally-incompatible variants
of the same type.
([#3626](#3626))

- Unit-test `chat.agent` definitions offline with `mockChatAgent` from
`@trigger.dev/sdk/ai/test`. Drives a real agent's turn loop in-process —
no network, no task runtime — so you can send messages, actions, and
stop signals via driver methods, inspect captured output chunks, and
verify hooks fire. Pairs with `MockLanguageModelV3` from `ai/test` for
model mocking. `setupLocals` lets you pre-seed `locals` (DB clients,
service stubs) before `run()` starts.
([#3543](#3543))

The broader `runInMockTaskContext` harness it's built on lives at
`@trigger.dev/core/v3/test` — useful for unit-testing any task code, not
just chat.

- Retry `TASK_PROCESS_SIGSEGV` task crashes under the user's retry
policy instead of failing the run on the first segfault. SIGSEGV in Node
tasks is frequently non-deterministic (native addon races, JIT/GC
interaction, near-OOM in native code, host issues), so retrying on a
fresh process often succeeds. The retry is gated by the task's existing
`retry` config + `maxAttempts` — same path `TASK_PROCESS_SIGTERM` and
uncaught exceptions already use — so tasks without a retry policy still
fail fast.
([#3552](#3552))
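
The gating described above can be sketched roughly as follows. The two error codes and the `maxAttempts` gate come from the entry; the `shouldRetryCrash` function, the `RetryPolicy` shape, and the attempt-counting convention are assumptions made for illustration:

```ts
// Hypothetical decision helper: not the actual run-engine code.
interface RetryPolicy {
  maxAttempts: number;
}

// Crash codes that are retried under the user's policy, per the entry above.
const RETRYABLE_CRASH_CODES = new Set<string>([
  "TASK_PROCESS_SIGSEGV",
  "TASK_PROCESS_SIGTERM",
]);

function shouldRetryCrash(
  code: string,
  attemptNumber: number, // 1-based number of the attempt that just crashed (assumed)
  retry: RetryPolicy | undefined
): boolean {
  // Tasks without a retry policy still fail fast.
  if (!retry) return false;
  // Only crash codes known to be frequently non-deterministic are retried.
  if (!RETRYABLE_CRASH_CODES.has(code)) return false;
  // Stay within the attempts the user configured.
  return attemptNumber < retry.maxAttempts;
}
```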

- Add `region` to the runs list / retrieve API: filter runs by region
(`runs.list({ region: "..." })` / `filter[region]=<masterQueue>`) and
read each run's executing region from the new `region` field on the
response.
([#3612](#3612))

- **Sessions** — a durable, run-aware stream channel keyed on a stable
`externalId`. A Session is the unit of state that owns a multi-run
conversation: messages flow through `.in`, responses through `.out`,
both survive run boundaries. Sessions back the new `chat.agent` runtime,
and you can build on them directly for any pattern that needs durable
bi-directional streaming across runs.
([#3542](#3542))

    ```ts
    import { sessions, tasks } from "@trigger.dev/sdk";

    // Trigger a task and subscribe to its session output in one call
    const { runId, stream } = await tasks.triggerAndSubscribe("my-task", payload, {
      externalId: "user-456",
    });

    for await (const chunk of stream) {
      // ...
    }

    // Enumerate existing sessions (powers inbox-style UIs without a separate index)
    for await (const s of sessions.list({ type: "chat.agent", tag: "user:user-456" })) {
      console.log(s.id, s.externalId, s.createdAt, s.closedAt);
    }
    ```

See [/docs/ai-chat/overview](https://trigger.dev/docs/ai-chat/overview)
for the full surface — Sessions powers the durable, resumable chat
runtime described there.

## @trigger.dev/plugins@4.5.0-rc.0

### Patch Changes

- Add the public interfaces for a plugin system, initially the
consolidated authentication and authorization interfaces.
([#3499](#3499))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## @trigger.dev/python@4.5.0-rc.0

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/sdk@4.5.0-rc.0`
    -   `@trigger.dev/core@4.5.0-rc.0`
    -   `@trigger.dev/build@4.5.0-rc.0`

## @trigger.dev/react-hooks@4.5.0-rc.0

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## @trigger.dev/redis-worker@4.5.0-rc.0

### Patch Changes

- Add MollifierBuffer and MollifierDrainer primitives for trigger burst
smoothing.
([#3614](#3614))

MollifierBuffer (`accept`, `pop`, `ack`, `requeue`, `fail`,
`evaluateTrip`) is a per-env FIFO over Redis with atomic Lua transitions
for status tracking. `evaluateTrip` is a sliding-window trip evaluator
the webapp gate uses to detect per-env trigger bursts.

MollifierDrainer pops entries through a polling loop with a
user-supplied handler. The loop survives transient Redis errors via
capped exponential backoff (up to 5s), and per-env pop failures don't
poison the rest of the batch — one env's blip is logged and counted as
failed for that tick. Rotation is two-level: orgs at the top, envs
within each org. The buffer maintains `mollifier:orgs` and
`mollifier:org-envs:${orgId}` atomically with per-env queues, so the
drainer walks orgs → envs directly without an in-memory cache. The
`maxOrgsPerTick` option (default 500) caps how many orgs are scheduled
per tick; for each picked org, one env is popped (rotating round-robin
within the org). An org with N envs gets the same per-tick scheduling
slot as an org with 1 env, so tenant-level drainage throughput is
determined by org count rather than env count.
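
The two-level rotation can be illustrated with an in-memory sketch. The real drainer works over Redis sets and per-env queues; the `TwoLevelRotation` class, its method names, and the cursor bookkeeping below are hypothetical and only model the scheduling order:

```ts
// In-memory model of "orgs at the top, envs within each org".
// Hypothetical names: this is not the MollifierDrainer implementation.
class TwoLevelRotation {
  private readonly maxOrgsPerTick: number;
  private orgCursor = 0;
  private readonly envCursors = new Map<string, number>();

  constructor(maxOrgsPerTick: number) {
    this.maxOrgsPerTick = maxOrgsPerTick;
  }

  // Pick at most `maxOrgsPerTick` orgs, and exactly one env per picked org.
  pickForTick(orgEnvs: Map<string, string[]>): Array<{ orgId: string; envId: string }> {
    const orgs = [...orgEnvs.keys()].filter((o) => (orgEnvs.get(o) ?? []).length > 0);
    if (orgs.length === 0) return [];

    const count = Math.min(this.maxOrgsPerTick, orgs.length);
    const picked: Array<{ orgId: string; envId: string }> = [];

    for (let i = 0; i < count; i++) {
      const orgId = orgs[(this.orgCursor + i) % orgs.length];
      const envs = orgEnvs.get(orgId) ?? [];
      // Round-robin within the org: each tick advances to the next env.
      const envIndex = (this.envCursors.get(orgId) ?? 0) % envs.length;
      picked.push({ orgId, envId: envs[envIndex] });
      this.envCursors.set(orgId, envIndex + 1);
    }

    // Next tick starts where this one stopped, so no org is starved.
    this.orgCursor = (this.orgCursor + count) % orgs.length;
    return picked;
  }
}
```

With orgs `A` (three envs) and `B` (one env) and `maxOrgsPerTick = 2`, every tick in this model schedules one env from each org, which is the "same per-tick slot regardless of env count" property described above.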

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## @trigger.dev/rsc@4.5.0-rc.0

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

## @trigger.dev/schema-to-json@4.5.0-rc.0

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.0`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…3688)

On a warm worker process, a task whose `task()` definition is loaded via
`await import(...)` from inside another task's `run()` could end up
permanently missing from the catalog: the `task()` call fired with no
`_currentFileContext` set, `registerTaskMetadata` silently returned, and
Node's ESM module cache then blocked the worker's setContext + re-import
recovery from ever firing the call again. Subsequent runs of that task
on the same warm process failed with `COULD_NOT_FIND_EXECUTOR` until the
process hit `maxExecutionsPerProcess` and exited.

All five of these had to coincide on the same worker for the bug to
surface:

1. `processKeepAlive` enabled (so catalog state survives across runs).
2. A `run()` function (or lifecycle hook) does `await import(...)`.
3. The import's transitive static graph reaches a `task()` /
`schemaTask()` call.
4. The task containing the dynamic import is the **first** task to run
on a given warm worker process — so the dropped `task()` calls fire on
this process for the first time, are silently dropped, and Node's module
cache locks the wrong outcome in.
5. A subsequent run for one of the dropped task ids lands on the same
warm worker before it recycles.

The runtime workers now set a sentinel file context (`<no-context>`)
around the `executor.execute(...)` call, so `task()` invocations firing
during a run register normally. The catalog detects the sentinel and
emits a one-time `console.warn` per task id so the pattern stays visible
without spamming. The indexer never sets this context, so deploy-time
behavior is unchanged.
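
A simplified sketch of the sentinel approach. Only the `<no-context>` sentinel value and the one-warning-per-task-id behavior come from the description above; the `TaskCatalog` class and its methods are hypothetical, and the sketch is synchronous for brevity where the real executor is async:

```ts
// Hypothetical catalog model: not the actual core implementation.
const NO_CONTEXT = "<no-context>";

class TaskCatalog {
  private currentFileContext: string | undefined;
  private readonly tasks = new Map<string, { id: string; filePath: string }>();
  private readonly warned = new Set<string>();
  readonly warnings: string[] = [];

  // Run `fn` with a file context set, restoring the previous one afterwards.
  withFileContext<T>(filePath: string, fn: () => T): T {
    const previous = this.currentFileContext;
    this.currentFileContext = filePath;
    try {
      return fn();
    } finally {
      this.currentFileContext = previous;
    }
  }

  // Called by task(): registers metadata against the current file context.
  register(id: string): void {
    const filePath = this.currentFileContext;
    // With no context at all the registration is silently dropped,
    // which is the behavior that caused the bug.
    if (filePath === undefined) return;

    if (filePath === NO_CONTEXT && !this.warned.has(id)) {
      // Warn once per task id so the pattern stays visible without spam.
      this.warned.add(id);
      const message = `Task "${id}" was registered during a run via dynamic import`;
      this.warnings.push(message);
      console.warn(message);
    }
    this.tasks.set(id, { id, filePath });
  }

  has(id: string): boolean {
    return this.tasks.has(id);
  }
}
```

Wrapping the execute call in `catalog.withFileContext(NO_CONTEXT, ...)` means a `task()` call reached through `await import(...)` lands in the catalog instead of being lost to the module cache.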

Repro is `references/hello-world/src/trigger/dynamicImportRepro*.ts`.
Verified end-to-end against a deployed image with firestarter
warm-starts on: pre-fix saw `COULD_NOT_FIND_EXECUTOR` on children that
landed on the parent-poisoned worker; post-fix all 23/23 runs succeeded
and the warning surfaces in the parent's run trace.
…ev (#3690)

## Summary

`trigger.dev dev` was silently dropping registered `chat.agent` skills
for any project whose task files read `process.env` at module top level
— e.g. a third-party SDK client initialized at import. The agent would
boot fine, but `skill.local()` failed at runtime with `ENOENT` because
the skill folder was never copied into `.trigger/skills/`.

## Design

The CLI ran two indexer passes in dev: the worker's own indexer (with
the full env it eventually executes tasks in), and a separate
skill-discovery indexer with only the CLI process's env. Top-level reads
of vars like `TRIGGER_API_URL` imported cleanly in the worker pass and
threw in the skill pass — the latter caught the error, warned, and
skipped skill copying. Failure was silent enough that `skill.local()`
only surfaced it at task runtime.

The skill registry is already part of the worker manifest. This PR drops
the duplicate pass and copies skill folders from that manifest after the
worker initializes. One indexer instead of two; a bad `SKILL.md` now
surfaces as a startup error instead of silently disappearing skills.

Deploy is unaffected — its skill discovery uses the project's
environment variables (fetched via the API, which fills in
`TRIGGER_API_URL` etc.), so the dev failure mode doesn't reach there.

## Test plan

- [x] New `references/agent-skills` reference project with
`skills.define` + a task that calls `skill.local()` and runs a bundled
script
- [x] On `main`, adding a top-level
`process.env.TRIGGER_API_URL!.includes(...)` read in any task file
reproduces the symptom: warning at dev startup, no `.trigger/skills/`
folder, `skill.local()` fails with ENOENT
- [x] On this branch, same project boots clean and `skill.local()` works
end-to-end
- [x] Deploy still works end-to-end with the new reference project
## Summary
2 bug fixes.

## Bug fixes
- Fix `chat.agent` skills silently missing in `trigger dev` for projects
whose task files read `process.env` at module top level (e.g. a
third-party SDK client initialized at import). Skill folders now bundle
into `.trigger/skills/` reliably regardless of which env vars are set
when the CLI launches.
([#3690](#3690))
- Fix `COULD_NOT_FIND_EXECUTOR` when a task's definition is loaded via
`await import(...)` from inside another task's `run()`. The runtime
workers now register such tasks with a sentinel file context, and the
catalog logs a one-time warning per task id.
([#3688](#3688))

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/build@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## trigger.dev@4.5.0-rc.1

### Patch Changes

- Fix `chat.agent` skills silently missing in `trigger dev` for projects
whose task files read `process.env` at module top level (e.g. a
third-party SDK client initialized at import). Skill folders now bundle
into `.trigger/skills/` reliably regardless of which env vars are set
when the CLI launches.
([#3690](#3690))
- Fix `COULD_NOT_FIND_EXECUTOR` when a task's definition is loaded via
`await import(...)` from inside another task's `run()`. The runtime
workers now register such tasks with a sentinel file context, and the
catalog logs a one-time warning per task id.
([#3688](#3688))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`
    -   `@trigger.dev/build@4.5.0-rc.1`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.1`

## @trigger.dev/core@4.5.0-rc.1

### Patch Changes

- Fix `COULD_NOT_FIND_EXECUTOR` when a task's definition is loaded via
`await import(...)` from inside another task's `run()`. The runtime
workers now register such tasks with a sentinel file context, and the
catalog logs a one-time warning per task id.
([#3688](#3688))

## @trigger.dev/plugins@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## @trigger.dev/python@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`
    -   `@trigger.dev/build@4.5.0-rc.1`
    -   `@trigger.dev/sdk@4.5.0-rc.1`

## @trigger.dev/react-hooks@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## @trigger.dev/redis-worker@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## @trigger.dev/rsc@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## @trigger.dev/schema-to-json@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

## @trigger.dev/sdk@4.5.0-rc.1

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.1`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
## Summary

Lands the full AI Agents documentation surface alongside the v4.5
release candidate of `@trigger.dev/sdk`. Covers `chat.agent` end to end
— defining agents, lifecycle hooks, the frontend transport, sub-agents,
recovery from cancel/crash/OOM, AI Prompts integration — and the
Sessions primitive that backs it.

## Coverage

- **Conceptual**: Overview, Quick Start, How it works.
- **Building agents**: Backend (`chat.agent` / `chat.createSession` /
raw primitives), Lifecycle hooks, Frontend transport, Server-side
`AgentChat`, Sessions reference, `chat.local` state primitive,
TypeScript types.
- **Features**: AI Prompts integration, Fast starts (Preload + Head
Start), Compaction, Pending Messages (steering), Background Injection
(`chat.inject` + `chat.defer`), Actions (undo / regenerate / edit),
Error handling.
- **Patterns (13)**: Sub-agents, Branching conversations, Code sandbox,
Database persistence, Persistence and replay, HITL, Tool result
auditing, Large payloads, Agent skills, OOM resilience, Recovery boot,
Trusted edge signals, Version upgrades.
- **Reference**: API Reference, Client Protocol (wire format), Testing
harness (`mockChatAgent`), MCP server tools, Upgrade guide, Changelog.

## Structure changes

- Top-level nav: AI → **Agents**, with sub-groups for *Building agents /
Features / Patterns / Reference*.
- New RC banner snippet on every page links to the supported AI SDK
versions table on the API Reference.
- All examples use Anthropic with `stopWhen: stepCountIs(15)`.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary

Three post-merge fixes for the AI Agents docs (#3226), all caught by
review after merge.

## Fixes

- **`onTurnComplete` examples now use `db.$transaction`** — both the
Database persistence "Complete example" and the Lifecycle hooks
reference example were doing two separate `await` calls
(`db.chat.update` then `db.chatSession.upsert`). That's the exact
non-atomic pattern the warning earlier on the persistence page calls out
as ❌: a refresh between the two writes reads a stale `lastEventId` and
duplicates the assistant message on resume. Both examples now use the
recommended atomic form.

- **Background injection self-review prose aligned with the code** — the
prose said "gpt-4o-mini" but the example above it had been swapped to
`claude-haiku-4-5`. The Anthropic-sweep script only touched code blocks;
this prose line wasn't picked up.

## Test plan

- [x] Both updated examples use `db.$transaction([...])`
- [x] Prose matches the model used in the code block
- [ ] Mintlify deployment passes
…daries (#3700)

## Summary

Two docs edits that close a footgun customers persisting transport state
can hit. Clearing `lastEventId` on `chat.endRun()` looks intuitive — the
Run ended, the cursor must be stale — but the cursor is sessionId-keyed,
not runId-keyed. Clearing it forces the next `sendMessages` to subscribe
from `seq_num=0`, which may hit the prior turn's still-durable
`turn-complete` record and close the SSE empty before the new Run's
chunks arrive.

Spells out the invariant in the frontend transport persistence table and
adds a Warning in the `chat.endRun()` reference.
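
The invariant can be sketched as a pair of state transitions. The `PersistedChatState` shape and both function names are hypothetical; the only point taken from the text above is that the cursor belongs to the session, so ending a run must not reset it:

```ts
// Hypothetical persisted transport state: field names are illustrative.
interface PersistedChatState {
  sessionId: string;
  runId: string | undefined;
  lastEventId: string | undefined;
}

// A new event arrived on the session stream: advance the cursor.
function onEventReceived(state: PersistedChatState, eventId: string): PersistedChatState {
  return { ...state, lastEventId: eventId };
}

// The run ended. Forget the run, but keep the session-keyed cursor so the
// next subscription resumes after the last event already seen.
function onRunEnded(state: PersistedChatState): PersistedChatState {
  return { ...state, runId: undefined };
}
```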

## Test plan

- [x] Mintlify preview renders
- [x] No callout stacking
Sibling to the weekly summary, focused on critical alerts only. Pings
Slack daily while any critical alerts are open; skips the post entirely
when zero, so no daily "all clear" noise.

- Daily 08:00 UTC cron + `workflow_dispatch` with `severity` input
(default `critical`, override to `high`/`medium`/`low` for manual
checks)
- Reuses the existing `dependabot-summary` environment (token, channel,
bot)
- Alerts link at the end is severity-filtered
…3683)

## Summary

`new TriggerClient({...})` exposes the management API (tasks, runs,
schedules, envvars, batch, queues, deployments, prompts, auth) as an
explicit instance with its own auth, preview branch, and baseURL.
Multiple clients can coexist in one process without mutating shared
global state — useful when a single service triggers across multiple
projects, environments, or preview branches.

```ts
import { TriggerClient } from "@trigger.dev/sdk";

const prod = new TriggerClient({ accessToken: process.env.TRIGGER_PROD_KEY });
const preview = new TriggerClient({
  accessToken: process.env.TRIGGER_PREVIEW_KEY,
  previewBranch: "signup-flow",
});

await prod.tasks.trigger("send-email", payload);
await preview.runs.list({ status: ["COMPLETED"] });
```

The existing global `configure()` API keeps working unchanged.

## Design

Instance methods enter an `AsyncLocalStorage`-backed scope (`sdkScope`)
before delegating to the existing module-level functions. The four
"pollution" points that previously read globals now consult the scope
first:

- `apiClientManager.{baseURL, accessToken, branchName}` and
`clientOrThrow` — identity fields are scope-only when scoped; `baseURL`
still falls back to `TRIGGER_API_URL` because plumbing (where the API
lives) is not identity.
- `taskContext.{ctx, worker, isWarmStart, isInsideTask}` — masked inside
an isolated scope so a `client.tasks.trigger(...)` from inside a task
doesn't leak the parent's `parentRunId` / `lockToVersion` / `isTest`
into a trigger that hits a different project.
- Inline `getEnvVar("TRIGGER_VERSION")` reads in `shared.ts` go through
a `scopedEnvVar` helper that returns `undefined` inside an isolated
scope.
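
A rough sketch of the scoping idea, using Node's `AsyncLocalStorage`. The `ClientScope` shape, the resolver functions, `ScopedClient`, and the `ENV_API_URL` stand-in are all hypothetical; the real `sdkScope` and `apiClientManager` may be structured differently:

```ts
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical scope shape: not the SDK's actual types.
interface ClientScope {
  accessToken: string;
  previewBranch?: string;
  baseURL?: string;
}

const scopeStorage = new AsyncLocalStorage<ClientScope>();

// Stand-ins for global configure() state and the TRIGGER_API_URL env var.
let globalConfig: ClientScope | undefined;
const ENV_API_URL = "https://api.example.test";

function configureGlobal(config: ClientScope): void {
  globalConfig = config;
}

function resolveAccessToken(): string | undefined {
  const scope = scopeStorage.getStore();
  // Identity fields are scope-only when scoped: no fallback to globals.
  if (scope) return scope.accessToken;
  return globalConfig?.accessToken;
}

function resolveBaseURL(): string {
  const scope = scopeStorage.getStore();
  // baseURL is plumbing rather than identity, so it may fall back.
  if (scope) return scope.baseURL ?? ENV_API_URL;
  return globalConfig?.baseURL ?? ENV_API_URL;
}

class ScopedClient {
  private readonly scope: ClientScope;

  constructor(scope: ClientScope) {
    this.scope = scope;
  }

  // Every instance method enters the scope before calling shared code.
  describe(): { token: string | undefined; baseURL: string } {
    return scopeStorage.run(this.scope, () => ({
      token: resolveAccessToken(),
      baseURL: resolveBaseURL(),
    }));
  }
}
```

Because the scope travels with the async call chain instead of living in a module-level variable, two clients can be used side by side without either one overwriting the other's credentials.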

The `TriggerClient` class itself is a thin wrapper that captures the
scope in its constructor and proxies each namespace method to enter that
scope before calling the existing impl. Generic inference (e.g.
`client.tasks.trigger<typeof t>(...)`) is preserved via `Pick<typeof ns,
keyof curatedSubset>` typings.

Two correctness fixes uncovered along the way are folded in:

- `apiClientManager.setGlobalAPIClientConfiguration` no longer silently
no-ops on the second call. `configure()` now actually overrides as users
expect (this is the root cause behind some "I changed the config but
nothing happened" reports).
- `apiClientManager.runWithConfig` (and therefore `auth.withAuth`) is
now backed by `sdkScope.withScope` instead of "mutate the global and
restore in finally". Two parallel `withAuth` calls with different
configs no longer stomp each other.

Surface curation: instance namespaces drop methods that don't make sense
per-instance — `batch.*AndWait` (runtime-dependent), `schedules.task` /
`schedules.timezones` (definition-time / stateless), `prompts.define`
(definition-time), `auth.configure` / `auth.withAuth` (global-only).

## Test plan

- [x] 9 runtime unit tests in `triggerClient.test.ts` cover: required
accessToken, instance auth + branch headers, no env fallback for
identity fields, no leakage between global and instance, four parallel
calls across two clients stay isolated, taskContext masking +
`inheritContext: true` override, `configure()` second-call override,
parallel `auth.withAuth` isolation.
- [x] 10 type-level assertions in `triggerClient.types.test.ts` using
`expectTypeOf` + `@ts-expect-error` lock in generic inference, return
type passthrough, overload preservation, and curated-surface drift.
- [x] Full SDK suite (219 tests) and core suite (530 tests) pass.
- [x] Webapp typecheck clean.
- [x] End-to-end smoke test against local webapp and a
freshly-provisioned cloud project — six concurrent multi-client triggers
all returned 200 with run IDs, headers per-client as expected.
- [ ] Reviewer: run `references/multi-client` per its `README.md` to
reproduce the smoke test locally.

## Try it

`references/multi-client` is a new reference workspace that exercises
this end-to-end:

- `src/trigger/echo.ts` — trivial target task
- `src/trigger/fanOut.ts` — opens two `TriggerClient`s from inside a
task, fires `echo` through each in parallel
- `src/external/main.ts` — external Node script with two clients
triggering `echo` sequentially and concurrently; logs every outgoing
request's `authorization` + `x-trigger-branch`
- `src/external/isolation.ts` — interleaves global `configure()` and an
instance call, asserts the captured fetch sequence shows no leakage
either way
Added `OrganizationDataStore`, which allows orgs to have their data
stored in specific, separate services.

For now this is only used for ClickHouse. When using ClickHouse we get a
client from the factory, passing in the org id.

Particular care has to be taken with the two hot-insert paths:
1. RunReplicationService
2. OTLPExporter

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
## Summary

Stamp every Sentry event with the signed-in user and the tenant (org /
project / env) the request belongs to, so "Users Impacted" counts
distinct humans and events become filterable per tenant.

**Design after review (current):**

- `user.id = real user cuid` (from `requireUser`). "Users Impacted"
counts humans, not tenants.
- Tenant context (org / project / env slugs, IDs, env type) moves
entirely onto tags: `org_slug`, `project_slug`, `env_slug`, `org_id`,
`project_id`, `project_ref`, `environment_id`, `env_type`, plus
`impersonating` when set.
- Backed by an `AsyncLocalStorage` scope established at the HTTP entry.
Each entry point fills what it knows; loaders enrich the same scope with
what they already have.

**Zero new database queries.** The middleware does a regex match only.
Dashboard loaders that already query Prisma gain a couple of extra
selected columns; nothing new round-trips.

## How it's wired

- **Express middleware (`tenantContextResolver.server.ts`)** — parses
the URL with a regex and always opens an ALS scope. Populates whatever
subset of slugs is present: `/orgs/:o` → just `orgSlug`;
`/orgs/:o/projects/:p` adds `projectSlug`; the full triple adds
`envSlug`. Non-tenant paths get an empty scope so loaders can still
enrich.
- **`_app/route.tsx`** — already calls `requireUser`. Adds
`tenantContext.enrich({ userId: user.id })` for every authenticated
dashboard request. No new query.
- **Env layout loader (`_app.orgs.$o.projects.$p.env.$e/route.tsx`)** —
its existing `prisma.project.findFirst` gains two columns in `select`
(`externalRef`, `organization.id`). After it picks an env, calls
`tenantContext.enrich({ orgId, projectId, projectRef, envId, envType
})`. Same query, +2 columns.
- **API path (`apiBuilder.server.ts`)** — wraps every handler in
`tenantContext.run(tenantContextFromAuthEnvironment(authenticationResult.environment),
…)`. The mapper pulls `userId` from `env.orgMember?.userId` (already
selected by `authIncludeBase` — no schema change). Covers
`createLoaderApiRoute`, `createActionApiRoute`, and
`createMultiMethodApiRoute`.
- **Event processor (`sentryTenantContext.server.ts`)** — registered in
`entry.server.tsx` so it lives in the Remix bundle and shares the same
`tenantContext` ALS instance as the middleware and loaders. Stamps
whatever's present; nothing forced.
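
The scope-and-enrich pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `TenantContext` field names, the `SentryLikeEvent` type, and the tag mapping are assumptions based on the description, and only `AsyncLocalStorage` from `node:async_hooks` is a real API.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical shape; field names follow the description above, not the real source.
type TenantContext = {
  userId?: string;
  orgSlug?: string;
  projectSlug?: string;
  envSlug?: string;
  orgId?: string;
  projectId?: string;
  envType?: string;
};

const storage = new AsyncLocalStorage<TenantContext>();

const tenantContext = {
  // Open a scope; everything called inside `fn` (including awaited work) sees it.
  run<T>(initial: TenantContext, fn: () => T): T {
    return storage.run({ ...initial }, fn);
  },
  // Patch the current scope in place; a no-op when called outside run().
  enrich(patch: TenantContext): void {
    const current = storage.getStore();
    if (current) Object.assign(current, patch);
  },
  get(): TenantContext | undefined {
    return storage.getStore();
  },
};

// Assumed mapping from context fields to tag names.
const TAG_NAMES = {
  orgSlug: "org_slug",
  projectSlug: "project_slug",
  envSlug: "env_slug",
  orgId: "org_id",
  projectId: "project_id",
  envType: "env_type",
} as const;

// Simplified stand-in for a Sentry event.
type SentryLikeEvent = {
  user?: { id?: string; [key: string]: unknown };
  tags?: Record<string, string>;
};

// Stamps whatever is present in the current scope; nothing is forced.
function addTenantContextToEvent(event: SentryLikeEvent): SentryLikeEvent {
  const ctx = tenantContext.get();
  if (!ctx) return event;
  if (ctx.userId) event.user = { ...event.user, id: ctx.userId };
  const tags: Record<string, string> = { ...event.tags };
  for (const [field, tag] of Object.entries(TAG_NAMES)) {
    const value = ctx[field as keyof typeof TAG_NAMES];
    if (value) tags[tag] = value;
  }
  event.tags = tags;
  return event;
}
```

In this sketch the middleware would call `tenantContext.run(...)` once per request with whatever slugs it parsed, and loaders would call `tenantContext.enrich(...)` later; because the store object is mutated in place, the event processor sees the enriched values without any extra plumbing.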

## Example events from local verification

| URL | `user.id` | Tags |
|-----|-----------|------|
| `/orgs/:o/projects/:p/env/:e/...` | real user cuid | `org_slug`,
`project_slug`, `env_slug`, `org_id`, `project_id`, `project_ref`,
`environment_id`, `env_type` |
| `/orgs/:o/settings` (non-env-scoped) | real user cuid | `org_slug`
only |
| API request with `orgMember` | `orgMember.userId` | full tenant set |
| API request without `orgMember` | (unset) | full tenant set |

## Trade-offs

1. On env-scoped pages, errors that fire before the env layout loader's
enrich callback runs get slugs + `user.id` but not the tenant IDs /
`env_type`. Realistic errors deep in async work get the full set. (Same
race as before, narrower window now that slugs/`user.id` are populated
up-front by the middleware and `_app` enrich.)
2. API requests where the environment has no `orgMember` get tenant tags
but no `user.id`. Those events still show in the issue but don't
contribute to "Users Impacted".

## Out of scope (deferred)

Background workers (`redis-worker`, `schedule-engine`) and socket
handlers. Those entry points don't set `tenantContext.run` yet — their
events ship without tenant attribution until each is wired in a
follow-up.

## Tests

31 unit tests across 4 files. New tests notably cover:

- `parseTenantPath`: org-only, org+project, and full-triple URL
variants.
- `tenantContext.enrich`: in-place patch, no-op outside `run()`,
concurrent-scope isolation, empty-scope + enrich pattern (for non-tenant
pages).
- `tenantContextFromAuthEnvironment`: with and without `orgMember` —
verifies the API path's `user.id` mapping.
- `addTenantContextToEvent`: empty scope, userId-only, slugs-only, full
enrichment, conditional tag emission, preservation of prior `event.user`
fields.
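
For illustration, the three `parseTenantPath` URL variants listed above could be handled by a single regex along these lines. The pattern and the return shape are assumptions drawn from the URL forms in this description, not the implementation in `tenantContextResolver.server.ts`.

```typescript
// Hypothetical return shape: only the slugs that are present in the path.
type TenantSlugs = { orgSlug?: string; projectSlug?: string; envSlug?: string };

// Matches /orgs/:o, optionally followed by /projects/:p, optionally by /env/:e.
const TENANT_PATH =
  /^\/orgs\/([^/?#]+)(?:\/projects\/([^/?#]+)(?:\/env\/([^/?#]+))?)?/;

function parseTenantPath(pathname: string): TenantSlugs {
  const match = TENANT_PATH.exec(pathname);
  // Non-tenant paths yield an empty object so a scope can still be opened.
  if (!match) return {};
  const slugs: TenantSlugs = {};
  if (match[1]) slugs.orgSlug = match[1];
  if (match[2]) slugs.projectSlug = match[2];
  if (match[3]) slugs.envSlug = match[3];
  return slugs;
}
```

Nesting the optional groups means a project slug is only captured under an org, and an env slug only under a project, which mirrors the org-only, org+project, and full-triple cases.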

## Test plan

- [ ] `pnpm run typecheck --filter webapp`
- [ ] `pnpm run test --filter webapp -- test/tenantContext.test.ts
test/sentryTenantContext.test.ts test/tenantContextResolver.test.ts
test/tenantContextFromAuthEnvironment.test.ts`
- [ ] Local manual: with `SENTRY_DSN` set, hit a dashboard URL and an
API route, confirm the captured events carry `user.id` + the expected
tag set in Sentry.
- [ ] After ship: confirm "Users Impacted" on a real Sentry issue
reflects distinct users (not tenants).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Workloads bundled with CLI versions before v4.4.4 use a strict zod enum
for `checkpoint.type` that only allows DOCKER and KUBERNETES. When a
customer's runs are routed via the compute path, those old runners
receive `type: "COMPUTE"` on `/snapshots/since/...` and `/dequeue`
responses and fail validation - blocking silent migration of existing
deployments.

The workload never reads the field - only validates the shape. Rewriting
COMPUTE -> KUBERNETES on the way out lets older runners keep parsing
while the database and internal services keep the real value. Limited to
the two workload-facing endpoints whose response includes a checkpoint;
`/continue`, `/attempts/start`, `/attempts/complete` all return shapes
without one.
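
A rough sketch of the outbound rewrite, under assumptions: the `Checkpoint` shape here is reduced to an `id` and a `type`, and `toLegacyCheckpoint` is a hypothetical name rather than the function added in this change.

```typescript
// Simplified, assumed shapes; the real checkpoint carries more fields.
type Checkpoint = { id: string; type: "DOCKER" | "KUBERNETES" | "COMPUTE" };
type LegacyCheckpoint = { id: string; type: "DOCKER" | "KUBERNETES" };

// Rewrites COMPUTE to KUBERNETES on a copy, so the stored value is never mutated.
function toLegacyCheckpoint(checkpoint: Checkpoint): LegacyCheckpoint {
  return {
    ...checkpoint,
    type: checkpoint.type === "COMPUTE" ? "KUBERNETES" : checkpoint.type,
  };
}
```

Applying the mapping only at the response boundary keeps the real value in the database and internal services while older runners still pass their enum validation.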

Followup to #3114.
Make the Express server's `keepAliveTimeout` configurable via
`HTTP_KEEPALIVE_TIMEOUT_MS`. Default preserved at 65000 ms — no behavior
change if unset.
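
As a minimal sketch of how such a setting might be read, the following uses a plain `node:http` server to stay self-contained (the value returned by Express's `app.listen()` is the same `http.Server` type). The helper name and the fallback to the default on unparseable input are assumptions, not taken from the change itself.

```typescript
import { createServer } from "node:http";

const DEFAULT_KEEPALIVE_TIMEOUT_MS = 65_000;

// Hypothetical helper: falls back to the default when unset or not a valid number.
function resolveKeepAliveTimeout(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_KEEPALIVE_TIMEOUT_MS;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_KEEPALIVE_TIMEOUT_MS;
}

const server = createServer((_req, res) => {
  res.end("ok");
});

// `keepAliveTimeout` is a standard property on Node's http.Server.
server.keepAliveTimeout = resolveKeepAliveTimeout(process.env.HTTP_KEEPALIVE_TIMEOUT_MS);
```
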
## Summary

Drops the unused composite Postgres index
`TaskRun_scheduleId_createdAt_idx`. The schedule list view reads from
ClickHouse, so this index served no Prisma query while still being
maintained on every `TaskRun` INSERT/UPDATE. Removing it reduces write
amplification on the primary database.

Sibling to the prior drop of `TaskRun_scheduleId_idx` and the earlier
removal of the `TaskRun.scheduleId` foreign key — all stemming from
migrating schedule-aware reads to ClickHouse.

## Verification

- Sampled `pg_stat_user_indexes` for `TaskRun` over multiple hours —
zero scans against this index.
- Grepped the codebase for any Prisma query filtering
`TaskRun.scheduleId` — none found. All schedule-aware listing routes
through `clickhouseRunsRepository`.
…3707)

## Summary

When a background worker registers, the engine resolves runs that were
queued before the worker was ready (status `PENDING_VERSION`). That
lookup used to scan a Postgres status index on `TaskRun`. Move it to
ClickHouse: query candidate run ids from `task_runs_v2`, then refetch
the actual rows from Postgres by primary key with a `status =
'PENDING_VERSION'` guard for idempotency.

## Design

The lookup is a pluggable interface on the run engine
(`PendingVersionRunIdLookup`). The webapp wires a ClickHouse-backed
implementation through the org-scoped `clickhouseFactory` using a new
`"engine"` client type, configured by `RUN_ENGINE_CLICKHOUSE_*` env
vars. The URL falls back to `CLICKHOUSE_URL` when unset, so self-hosted
deployments don't need new config to keep working.

When the lookup returns no candidates, one bounded retry is scheduled
~5s later to cover ClickHouse replication lag against `task_runs_v2`.
The Postgres status guard on both the candidate refetch and the inner
`updateMany` prevents double-promotion when a retry races with a
concurrent deploy.
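
The lookup, refetch, and single-retry flow could look roughly like this. Everything here is a hypothetical sketch: the method name `findCandidateRunIds`, the `workerId` key, the `Run` shape, and the injected `fetchRunsByIds` / `scheduleRetry` helpers are illustrative and are not the run engine's actual signatures.

```typescript
// Hypothetical interface; the real `PendingVersionRunIdLookup` may differ.
interface PendingVersionRunIdLookup {
  findCandidateRunIds(workerId: string): Promise<string[]>;
}

type Run = { id: string; status: string };

type Deps = {
  lookup: PendingVersionRunIdLookup; // e.g. backed by ClickHouse
  fetchRunsByIds: (ids: string[]) => Promise<Run[]>; // e.g. Postgres by primary key
  scheduleRetry: (delayMs: number) => void;
};

const RETRY_DELAY_MS = 5_000;

async function resolvePendingVersionRuns(
  deps: Deps,
  workerId: string,
  isRetry = false
): Promise<Run[]> {
  const candidateIds = await deps.lookup.findCandidateRunIds(workerId);
  if (candidateIds.length === 0) {
    // One bounded retry to cover replication lag; a retry never schedules another.
    if (!isRetry) deps.scheduleRetry(RETRY_DELAY_MS);
    return [];
  }
  const rows = await deps.fetchRunsByIds(candidateIds);
  // Status guard: candidates may be stale, so only still-pending rows proceed.
  return rows.filter((run) => run.status === "PENDING_VERSION");
}
```

Treating the lookup result as a set of candidates and re-checking status against the source of truth is what makes a stale or duplicated candidate harmless in this sketch.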

Tests cover three existing PENDING_VERSION cases via a small
Postgres-backed test adapter; new ClickHouse-backed integration tests
will follow.