Merged
54 changes: 44 additions & 10 deletions .claude/skills/gap-analysis/SKILL.md
@@ -15,13 +15,16 @@ markdown report under `.agent/gap-analysis/`. **Do not edit source files.**

## Invocation

| Args | Scope |
| -------------------------------- | --------------------------------------------------- |
| `<provider>` (e.g. `openai`) | One provider — all four audit dimensions. |
| `feature <feature>` (e.g. `tts`) | One feature row of the matrix across all providers. |
| `models` | New-model diff for every provider. |
| `--all` | Full sweep (fan out subagents, one per provider). |
| _(none)_ | Ask the user which scope via AskUserQuestion. |
| Args | Scope |
| -------------------------------- | ----------------------------------------------------- |
| `<provider>` (e.g. `openai`) | One provider — all audit dimensions. |
| `feature <feature>` (e.g. `tts`) | One feature row of the matrix across all providers. |
| `models` | New-model diff for every provider. |
| `activities` | Activity-coverage diff: which of the 7 core activity kinds each provider ships an adapter for, vs. what upstream supports. (Dimension 6 only, all providers.) |
| `--all` | Full sweep (fan out subagents, one per provider). |
| _(none)_ | Ask the user which scope via AskUserQuestion. |

## Workflow

@@ -44,12 +47,18 @@ markdown report under `.agent/gap-analysis/`. **Do not edit source files.**
   4. Capability-flag drift
   5. Telemetry / observability parity (usage tokens, cache/reasoning
      counts, request ids, logging asymmetry)
   6. Activity coverage (which of the 7 core activity kinds each provider
      ships an adapter for vs. what upstream supports) — this is the **only**
      dimension for the `activities` scope; it's also rolled into `--all`.
5. **Fan out** for `--all`: launch one `Explore` subagent per provider, max 3
in parallel. Each subagent returns the five-dimension findings for its
provider; you synthesise into the combined report.
in parallel. Each subagent returns the multi-dimension findings for its
provider; you synthesise into the combined report. The `activities` scope
does **not** fan out — derive the provider×activity matrix centrally from
the adapter files (see dimension 6), since it's a fast mechanical diff.
6. **Write the report** to `.agent/gap-analysis/YYYY-MM-DD-<scope>.md` using
[references/report-template.md](references/report-template.md). Date is
today's ISO date. `<scope>` is `openai` / `feature-tts` / `models` / `all`.
today's ISO date. `<scope>` is `openai` / `feature-tts` / `models` /
`activities` / `all`.
7. **Print the report path and a 5-line summary** to the user.
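
The scope-to-filename convention in step 6 can be sketched as a small helper. This is an illustration only: `scopeSlug` and `reportPath` are hypothetical names, not part of the skill, and taking the date in UTC is an assumption (the skill only says "today's ISO date").

```typescript
// Hypothetical helpers illustrating the report-path convention from step 6.

// Turn the invocation args into the <scope> slug used in the filename.
function scopeSlug(args: string[]): string {
  const [first, second] = args;
  if (first === undefined) {
    // No args: the skill asks the user instead of guessing a scope.
    throw new Error("no scope given; ask the user which scope to run");
  }
  if (first === "--all") return "all";
  if (first === "feature") {
    if (second === undefined) throw new Error("`feature` needs a feature name");
    return `feature-${second}`;
  }
  // A provider name, `models`, or `activities` is used as-is.
  return first;
}

// Build `.agent/gap-analysis/YYYY-MM-DD-<scope>.md`.
// Assumption: the date is taken in UTC via toISOString().
function reportPath(scope: string, today: Date = new Date()): string {
  const isoDate = today.toISOString().slice(0, 10);
  return `.agent/gap-analysis/${isoDate}-${scope}.md`;
}
```

For example, `reportPath(scopeSlug(["activities"]))` yields a path ending in `-activities.md` under `.agent/gap-analysis/`.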

## Critical rules
@@ -88,6 +97,31 @@ re-read it; this list is a snapshot:
`multimodal-structured`, `summarize`, `summarize-stream`, `image-gen`, `tts`,
`transcription`, `video-gen`.

## Known activities (7)

**Features** (above) are matrix rows about behaviours within an activity.
**Activities** are the coarser-grained core capability kinds in `@tanstack/ai`
— each has a `Base<Kind>Adapter` and a provider "supports" one only if its
package ships an adapter of that kind. Canonical list is the `AdapterKind`
union in `packages/ai/src/activities/index.ts` — always re-read it:

`text`, `summarize`, `image`, `audio`, `video`, `tts`, `transcription`.

A provider's activity surface is derived mechanically from its adapter files:
`packages/ai-<provider>/src/adapters/`. Filename → activity-kind map:

| Adapter file | Activity kind |
| ------------------------------------------------------------ | --------------- |
| `text.ts` / `text-chat-completions.ts` / `responses-text.ts` | `text` |
| `summarize.ts` | `summarize` |
| `image.ts` | `image` |
| `audio.ts` | `audio` |
| `video.ts` | `video` |
| `speech.ts` / `tts.ts` | `tts` |
| `transcription.ts` | `transcription` |

(`cost.ts` is a helper, not an activity adapter.)
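
As a rough illustration of the filename → activity-kind mapping above, a minimal sketch might look like the following. `FILE_TO_KIND` and `activitySurface` are hypothetical names, and the `AdapterKind` type here is a local snapshot of the seven kinds listed above, not an import from `@tanstack/ai`; the canonical union should still be re-read from `packages/ai/src/activities/index.ts`.

```typescript
// Local snapshot of the seven kinds (assumption: copied by hand, not imported).
type AdapterKind =
  | "text"
  | "summarize"
  | "image"
  | "audio"
  | "video"
  | "tts"
  | "transcription";

// Filename -> activity kind, mirroring the table above.
const FILE_TO_KIND = new Map<string, AdapterKind>([
  ["text.ts", "text"],
  ["text-chat-completions.ts", "text"],
  ["responses-text.ts", "text"],
  ["summarize.ts", "summarize"],
  ["image.ts", "image"],
  ["audio.ts", "audio"],
  ["video.ts", "video"],
  ["speech.ts", "tts"],
  ["tts.ts", "tts"],
  ["transcription.ts", "transcription"],
]);

// Derive the set of activity kinds covered by a list of adapter filenames.
// Files that are not in the map (for example cost.ts) are treated as helpers
// and skipped.
function activitySurface(files: string[]): Set<AdapterKind> {
  const kinds = new Set<AdapterKind>();
  for (const file of files) {
    const kind = FILE_TO_KIND.get(file);
    if (kind !== undefined) kinds.add(kind);
  }
  return kinds;
}
```

Several text adapter files collapse into a single `text` entry, which is why the result is a set rather than a list.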

## Verification before finishing

Before printing the summary:
76 changes: 70 additions & 6 deletions .claude/skills/gap-analysis/references/audit-checklist.md
@@ -217,6 +217,65 @@ Steps:

---

## 6. Activity coverage

**Input:** each provider's `packages/ai-<provider>/src/adapters/` directory.
**Reference:** the `AdapterKind` union in `packages/ai/src/activities/index.ts`
(7 kinds: `text`, `summarize`, `image`, `audio`, `video`, `tts`,
`transcription`).
**Upstream:** each provider's API reference / models page.

This dimension answers a coarser question than dimension 2: not "which
behaviour within chat is exercised" but **"which whole activity kinds does the
provider's API offer that we ship no adapter for at all."**

Steps:

1. For every provider, list its adapter files and map each to an activity kind
   using the filename→kind table in `SKILL.md` (§ Known activities). That
   yields the local provider×activity matrix. `text.ts`,
   `text-chat-completions.ts`, and `responses-text.ts` all count as `text`;
   `speech.ts` and `tts.ts` both count as `tts`; ignore `cost.ts` and other
   helpers.
2. Build the full grid (rows = 9 providers, cols = 7 activity kinds). A cell is
   ✅ if an adapter file of that kind exists, ❌ otherwise.
3. For each ❌ cell, WebFetch the provider's API reference and check whether
   upstream offers that activity:
   - **Real gap** — upstream offers it, we ship no adapter, and there is no
     documented reason. High if it's a flagship activity for that provider
     (e.g. a TTS provider missing `transcription`), medium otherwise.
   - **Not a gap** — upstream genuinely doesn't offer that activity (e.g.
     Anthropic has no image/audio/video/tts/transcription endpoints; Groq has
     no image/video generation). Mark the cell ❌-by-design and omit from the
     gap list, but keep it in the grid so the report is self-contained.
4. Note the inverse too: any **local adapter for an activity the provider no
   longer offers upstream** (rare; surface as a stale-adapter finding).
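
The four steps above can be sketched as a pure function over two inputs: what each provider ships locally and what upstream offers. Everything here is illustrative. `buildGrid`, `classifyCell`, and the `Cell` labels are hypothetical names, and the upstream data is assumed to have been collected by hand from each provider's API reference.

```typescript
const KINDS = [
  "text", "summarize", "image", "audio", "video", "tts", "transcription",
] as const;
type AdapterKind = (typeof KINDS)[number];

// shipped   = adapter exists and upstream offers the activity
// gap       = upstream offers it, no local adapter (real gap)
// by-design = upstream does not offer it, no local adapter
// stale     = local adapter exists but upstream no longer offers it
type Cell = "shipped" | "gap" | "by-design" | "stale";

function classifyCell(hasAdapter: boolean, upstreamOffers: boolean): Cell {
  if (hasAdapter) return upstreamOffers ? "shipped" : "stale";
  return upstreamOffers ? "gap" : "by-design";
}

// Build the provider x activity grid. Providers missing from `upstream`
// are treated as offering nothing, which is an assumption of this sketch.
function buildGrid(
  local: Record<string, AdapterKind[]>,
  upstream: Record<string, AdapterKind[]>,
): Record<string, Record<AdapterKind, Cell>> {
  const grid: Record<string, Record<AdapterKind, Cell>> = {};
  for (const provider of Object.keys(local)) {
    const shipped = new Set(local[provider]);
    const offered = new Set(upstream[provider] ?? []);
    const row = {} as Record<AdapterKind, Cell>;
    for (const kind of KINDS) {
      row[kind] = classifyCell(shipped.has(kind), offered.has(kind));
    }
    grid[provider] = row;
  }
  return grid;
}
```

Only the `gap` and `stale` cells become findings; `by-design` cells stay in the grid so the report remains self-contained.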

Map activity kind → upstream capability to look for:

| Activity kind | Upstream capability |
| --------------- | ------------------------------------------------- |
| `text` | Chat / messages / completions endpoint |
| `summarize` | Any text completion (summarize is built on chat) |
| `image` | Image generation endpoint |
| `audio` | Music / sound / general audio generation endpoint |
| `video` | Video generation endpoint |
| `tts` | Text-to-speech endpoint |
| `transcription` | Speech-to-text endpoint |

**Priority rubric:**

- Missing a core activity the provider's API clearly offers (e.g. OpenAI-class
provider with no `image`/`tts`/`transcription`) → **high**.
- Missing a secondary/media activity → **medium**.
- ❌-by-design (upstream doesn't offer it) → **out-of-scope**, keep in grid only.
- Local adapter for a now-removed upstream activity → **medium** (deprecate).

Render the grid in the report under its own "Activity coverage" section, with
one bullet per real gap citing the upstream URL.
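
The priority rubric above reduces to a small decision function. This is a sketch under assumptions: `priorityFor` is a hypothetical name, and whether an activity counts as "core" for a given provider is passed in as a flag because the rubric leaves that judgment to the auditor.

```typescript
type Finding = "gap" | "by-design" | "stale";
type Priority = "high" | "medium" | "out-of-scope";

// Map a classified cell to a report priority.
// `isCoreForProvider` is the auditor's judgment call (for example, `tts`
// for a provider whose API clearly offers text-to-speech).
function priorityFor(finding: Finding, isCoreForProvider: boolean): Priority {
  switch (finding) {
    case "by-design":
      return "out-of-scope"; // kept in the grid only
    case "stale":
      return "medium"; // suggest deprecating the adapter
    case "gap":
      return isCoreForProvider ? "high" : "medium";
  }
}
```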

---

## Subagent dispatch (for `--all` scope)

When fan-out is needed, launch one `Explore` subagent per provider with a
@@ -225,12 +284,17 @@ prompt of this shape:
> Audit the `<provider>` adapter at `packages/ai-<provider>/`
> against upstream docs at the URLs in
> `.claude/skills/gap-analysis/references/provider-doc-urls.md`. Walk
> dimensions 1, 3, 4, and 5 from `audit-checklist.md`. Skip dimension 2
> (the orchestrator handles cross-provider parity centrally) — but do
> emit dimension-5 telemetry rows in the per-provider format; the
> orchestrator stitches them into the cross-adapter table. Return
> findings as markdown sections matching the report template — High /
> Medium / Low / Out-of-scope — with upstream URLs cited for every claim.
> dimensions 1, 3, 4, and 5 from `audit-checklist.md`. Skip dimensions 2
> and 6 (the orchestrator handles cross-provider parity and the
> activity-coverage grid centrally) — but do emit dimension-5 telemetry
> rows in the per-provider format; the orchestrator stitches them into the
> cross-adapter table. Return findings as markdown sections matching the
> report template — High / Medium / Low / Out-of-scope — with upstream
> URLs cited for every claim.

The orchestrator builds the dimension-6 activity-coverage grid itself from
the adapter-file listing (a one-shot `ls packages/ai-*/src/adapters/`), then
WebFetches each provider's API reference only for the ❌ cells.

Run at most 3 in parallel. Aggregate their returned markdown into the
combined report.
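
The "at most 3 in parallel" rule is an ordinary bounded-concurrency pattern. The sketch below shows the general idea only; `runWithLimit` is a hypothetical helper, and the `worker` callback stands in for whatever launches one subagent, which is not something the skill exposes as a function.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function runWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the queue is empty.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With `limit` set to 3, a fourth provider only starts once one of the first three has returned, which is consistent with the cap described above.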
23 changes: 23 additions & 0 deletions .claude/skills/gap-analysis/references/report-template.md
@@ -95,6 +95,29 @@ Replace every `{{placeholder}}`. Drop sections that have zero entries

---

## Activity coverage

> Per dimension 6 in audit-checklist.md. Which of the 7 core activity kinds
> (`text`, `summarize`, `image`, `audio`, `video`, `tts`, `transcription`)
> each provider ships an adapter for, vs. what upstream offers. Derived from
> `packages/ai-<provider>/src/adapters/`.

| Provider | text | summarize | image | audio | video | tts | transcription |
| ------------ | :---------: | :-------: | :---: | :---: | :---: | :-: | :-----------: |
| {{provider}} | {{✅/❌/—}} | … | … | … | … | … | … |

Legend: ✅ adapter shipped · ❌ upstream offers it, no adapter (gap) · — upstream doesn't offer it (by design).

Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align by-design notation with audit-checklist.md.

The legend uses "—" (em dash) for "upstream doesn't offer it (by design)", but audit-checklist.md lines 249 and 271 use "❌-by-design" notation for the same concept. These should match to avoid confusion when following the checklist to generate the report.

📝 Suggested fix
-Legend: ✅ adapter shipped · ❌ upstream offers it, no adapter (gap) · — upstream doesn't offer it (by design).
+Legend: ✅ adapter shipped · ❌ upstream offers it, no adapter (gap) · ❌-by-design upstream doesn't offer it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/skills/gap-analysis/references/report-template.md at line 109,
Update the legend line that currently reads "Legend: ✅ adapter shipped · ❌
upstream offers it, no adapter (gap) · — upstream doesn't offer it (by design)."
so it matches the audit checklist's notation by replacing the em dash item with
the "❌-by-design" token used in audit-checklist.md; ensure the legend contains
"❌-by-design" (or the exact token used in audit-checklist.md) instead of "—
upstream doesn't offer it (by design)" so both documents use the same notation.


- **[{{provider}}] missing `{{activity}}` activity**
- Upstream: [{{doc-title}}]({{doc-url}})
- Current state: no `{{file}}.ts` under `packages/ai-{{provider}}/src/adapters/`
- Suggested change: add a `{{provider}}{{Activity}}` adapter (mirror {{sibling provider that has it}}).
- Effort: {{S / M / L}}

{{repeat per real activity gap}}

---

## Out-of-scope — documented exclusions

> Listed for completeness; no action required. Each links to the comment