Skip to content

feat(discord): add /auth slash command for device-flow auth#1185

Merged
pahud merged 10 commits into
mainfrom
feat/auth-slash-command
Jun 24, 2026
Merged

feat(discord): add /auth slash command for device-flow auth#1185
pahud merged 10 commits into
mainfrom
feat/auth-slash-command

Conversation

@chaodu-agent

@chaodu-agent chaodu-agent commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds /auth Discord slash command that executes $OPENAB_AGENT_AUTH_COMMAND and relays the device-flow URL + code back to the user via ephemeral DM.

Flow Diagram

┌─────────────┐         ┌──────────────┐         ┌────────────────────┐
│  Discord    │         │     OAB      │         │  Auth CLI          │
│  (User DM)  │         │  (Handler)   │         │  (e.g. codex       │
│             │         │              │         │   login --device)  │
└──────┬──────┘         └──────┬───────┘         └─────────┬──────────┘
       │                       │                           │
       │  /auth                │                           │
       ├──────────────────────▶│                           │
       │                       │                           │
       │  [access check]       │                           │
       │  [DM-only check]      │                           │
       │  [single-flight]      │                           │
       │                       │                           │
       │  defer (ephemeral)    │                           │
       │◀──────────────────────┤                           │
       │                       │                           │
       │                       │  sh -c $AUTH_CMD          │
       │                       ├──────────────────────────▶│
       │                       │                           │
       │                       │  stdout: URL + code       │
       │                       │◀──────────────────────────┤
       │                       │                           │
       │  🔐 URL + code        │                           │
       │◀──────────────────────┤                           │
       │                       │                           │
       │                       │      (process blocks,     │
       │  [user opens URL,     │       polling OAuth       │
       │   enters code in      │       server...)          │
       │   browser]            │                           │
       │                       │                           │
       │                       │  exit 0                   │
       │                       │◀──────────────────────────┤
       │                       │                           │
       │  ✅ Authenticated!    │                           │
       │◀──────────────────────┤                           │
       │                       │                           │

Security Controls

/auth invoked
  │
  ├─ is_denied_user? ──▶ 🚫 rejected
  │
  ├─ guild_id.is_some()? ──▶ 🔒 DM-only
  │
  ├─ AUTH_IN_PROGRESS? ──▶ ⚠️ already running
  │
  ├─ OPENAB_AGENT_AUTH_COMMAND unset? ──▶ ⚠️ not configured
  │
  └─ ✅ proceed with deferred ephemeral + spawn

Architecture

  • Drain tasks: Separate tokio::spawn for stdout/stderr → run to EOF (no SIGPIPE)
  • URL detection: tokio::sync::Notify (cancellation-safe)
  • Single-flight: static AtomicBool + Drop guard (panic-safe)
  • DM check: cmd.guild_id.is_some() (local field, no API call)
  • Truncation: chars().take() (UTF-8-safe, Discord 2000-char limit)
  • Timeout: 14 min (leaves headroom for Discord interaction token TTL)
  • Tracing: info/warn at key lifecycle points

Changes

  • crates/openab-core/src/discord.rs — Register + implement /auth
  • docs/slash-commands.md — Full documentation

Review History

6 commits, 5 review rounds, 3 reviewers (擺渡/口渡/覺渡/普渡), all LGTM ✅

Implements a new /auth Discord slash command that executes the
OPENAB_AGENT_AUTH_COMMAND env var, captures the device flow URL and
code from stdout/stderr, and relays them to the user as an ephemeral
Discord message.

Flow:
1. User runs /auth
2. OAB execs $OPENAB_AGENT_AUTH_COMMAND (e.g. 'codex login --device-auth')
3. Captures URL+code from stdout within 30s
4. Sends ephemeral followup with auth instructions
5. Waits up to 15min for process to exit (user authorizes in browser)
6. Reports success/failure/timeout

Also updates docs/slash-commands.md with /auth documentation.
@chaodu-agent chaodu-agent requested a review from thepagent as a code owner June 24, 2026 00:20
chaodu-agent added 3 commits June 24, 2026 00:23
- Add is_denied_user access control gate (🔴 #1)
- Kill orphaned child process on empty-output early return (🔴 #2)
- Add single-flight AtomicBool guard to prevent concurrent /auth (🟡 #4)
- Truncate output to fit Discord 2000-char limit (🟡 #5)
- Scope readers in a block to drop cleanly before wait (🟡 #6)
Address 擺渡法師 and 口渡法師 review feedback:

- Spawn independent tokio tasks for stdout/stderr draining (run to EOF)
- Fixes SIGPIPE: pipes stay open while child waits for browser auth
- Fixes cancellation safety: no more tokio::select! on next_line()
- Fixes asymmetric drain: both streams captured independently
- Use tokio::sync::Notify for URL detection signaling
- tokio::join! drain tasks before exiting to ensure clean shutdown
- Kill + join on empty-output and timeout paths
- /auth now only works in DMs (rejects guild channels with guidance)
- Truncation uses chars().take() instead of byte slicing (no panic on
  multi-byte UTF-8)
- Budget calculated with chars().count() for prefix/suffix (correct
  for Discord's character-based limit)
- Updated docs to document DM-only requirement
@chaodu-agent

This comment has been minimized.

chaodu-agent added 2 commits June 24, 2026 00:56
- child.wait() timeout reduced from 15min to 14min so the final
  success/failure followup can still be delivered within Discord's
  15min interaction token TTL
- Docs: added 30s URL-collection window, allowed_users requirement,
  and single-flight behavior notes
- Replace to_channel() API call with cmd.guild_id.is_some() for DM check (F1)
- Add AuthGuard Drop impl to reset AUTH_IN_PROGRESS on task panic (F2)
- Add tracing::info/warn at key lifecycle points in spawned task (F3)
- Remove redundant manual store(false) calls now covered by Drop guard
@chaodu-agent

This comment has been minimized.

- Add child.wait() arm to URL-collection select! so a fast-failing auth
  command reports immediately instead of stalling the full 30s window (#1)
- Reject bot users in /auth, consistent with /remind (#8)
- Record invoking user_id in the auth start audit log (#7)
- Truncate output by UTF-16 code units to match Discord's 2000-char limit,
  preventing rejection on non-BMP-heavy output (#5)
- Handle std::sync::Mutex poison in drain/collect paths to avoid panic
  cascade and silent output loss (#9)
- Clarify the 'no output' error message with cause and remedy (#6)
- Fix docs intro contradicting /auth DM-only and mark it DM-only in the
  command table (#3)
@chaodu-agent

This comment has been minimized.

chaodu-agent added 2 commits June 24, 2026 01:34
…ht flag

Addresses review nit: the AUTH_IN_PROGRESS flag carries no dependent data,
so SeqCst is stronger than necessary. Acquire on the guard acquire (swap)
and Release on each clear (store) is sufficient and clearer in intent.
Round-2 review feedback:
- F1: when the auth command exits 0 during the URL-collection window without
  producing a login URL, show a warning + retry prompt instead of
  '✅ Auth command completed.' (avoids false confidence that auth succeeded)
- F2: document the early-exit-without-URL error case in slash-commands.md
- F3: document bot-user rejection (and DM-only) in the /auth error cases
@chaodu-agent

This comment has been minimized.

@chaodu-agent

This comment has been minimized.

…r) wait

- Extract the UTF-16-code-unit truncation into a pure, module-level
  truncate_to_utf16_budget() helper (matching the repo's testable
  decision-helper convention) and cover it with unit tests: short body,
  prefix/suffix budgeting, surrogate-pair counting, no-scalar-splitting,
  zero-budget saturation, and an assembled-total-within-limit regression
  guard for the original scalar-count miscount. (addresses review #2 / tests)
- Add the missing tracing::error on the final child.wait() Ok(Err(e)) arm,
  for symmetry with the spawn-failure path.
@chaodu-agent

This comment has been minimized.

@chaodu-agent

Copy link
Copy Markdown
Collaborator Author

Consolidated Group Review — /auth Slash Command

Verdict: functionally mergeable — maintainer's call. All blocking, documentation, and test findings are addressed. One non-blocking hardening item (device-code capture timing) is accepted as a tracked follow-up by the maintainer. Latest head c00b417, CI green (33 pass / 1 intentional skip / 0 fail); cargo test --workspace passes including the new truncation tests.

The design is sound — DM-only + ephemeral + single-flight (AuthGuard Drop) + env-only command (no user input into sh -c), with bot-user rejection and an audit log of the invoking user.

Round 1 findings & resolution

# Severity Finding Status
1 🟠 Fast-fail 30s stall — URL-collection select! had no child.wait() arm. ✅ Fixed — added child.wait() early-exit arm
2 🟠 500ms trailing-capture may truncate a device code printed >500ms after the URL. 🔭 Tracked follow-up (accepted)
3 🟡 Docs intro contradicted /auth DM-only. ✅ Fixed
4 🟡 No regression tests for /auth. ✅ UTF-16 truncation extracted + unit-tested; security-gate-path tests tracked as follow-up
5 🟡 Discord 2000-char limit is UTF-16 code units, not Unicode scalars. ✅ Fixed — UTF-16-aware truncation (now unit-tested)
6 🟡 "No output" error gave no cause/remedy. ✅ Fixed
7 🟡 Audit log missing invoking user ID. ✅ Fixed
8 🟡 No bot rejection (inconsistent with /remind). ✅ Fixed
9 🟡 std::sync::Mutex poison cascade in drain tasks. ✅ Fixed
10 🟡 AUTH_IN_PROGRESS is process-wide single-flight (by design). ✅ Documented as intended
11 SeqCst stronger than needed. ✅ Fixed — Acquire/Release

Round 2 findings & resolution

# Severity Finding Status
R1 🟡 (bug) Early-exit on exit-0-without-URL showed "✅ Auth command completed.", implying false success. ✅ Fixed — now a ⚠️ "run /auth again" prompt
R2 🟡 Docs "Error cases" missing the early-exit-without-URL path. ✅ Fixed
R3 🟡 Docs missing bot-rejection / DM-only error cases. ✅ Fixed

Round 3 (post-review hardening)

  • ✅ Extracted the UTF-16 truncation into a pure truncate_to_utf16_budget() helper and added unit tests (short body, prefix/suffix budgeting, surrogate-pair counting, no-scalar-splitting, zero-budget saturation, assembled-total-within-limit regression guard). Runs under cargo test --workspace.
  • ✅ Added the missing tracing::error on the final child.wait() Ok(Err(e)) arm (symmetry with the spawn-failure path).

Not a bug (verified, no action)

  • "Spawned task panic doesn't reset the flag"let _guard = AuthGuard; is the first statement in the spawned task; its Drop clears AUTH_IN_PROGRESS. No panic = "abort", so unwinding drops _guard on panic.
  • "Timeout branch doesn't join drain tasks" — the final tokio::join! runs after the match for all branches; the empty-output path joins before returning. Drain tasks are always joined.

Remaining follow-up (non-blocking, accepted)

  • The 500ms trailing-capture heuristic (round-1 perf: cache deps layer + drop arm64 QEMU build #2): consider "wait for N lines after URL" or idle-based capture for slow CLIs that print the code well after the URL.
  • Dedicated tests for the access-control gate paths (bot rejection, is_denied_user, DM-only, AUTH_IN_PROGRESS single-flight). The UTF-16 truncation logic is now unit-tested; the gate logic remains a good follow-up candidate.

Praise 🟢

  • AuthGuard Drop makes single-flight panic-safe; no command injection (env-only command, no user input to sh -c).
  • Layered access control: bot rejection → allowlist → DM-only → single-flight → env guard.
  • Drain tasks run to EOF (no SIGPIPE); ephemeral + DM-only minimizes credential exposure; 14-min timeout leaves headroom under Discord's token TTL.
  • UTF-16-aware truncation (now tested), poison-safe locks, and Acquire/Release ordering are correct.

Final merge decision is the maintainer's.

@pahud pahud enabled auto-merge June 24, 2026 12:28
@pahud pahud merged commit 68928f3 into main Jun 24, 2026
34 checks passed
brettchien added a commit to brettchien/openab that referenced this pull request Jun 24, 2026
Proposed ADR for the openab-agent LLM-provider OAuth revamp: a two-axis
OAuthVendor adapter (auth flow vs inference transport), a cross-process
flock-guarded credential-store invariant for auth.json, the
CLAUDE_CODE_OAUTH_TOKEN env route, a 14-variant vendor feasibility matrix,
and the /auth (PR openabdev#1185) auth-trigger model. Surfaced while reviewing
PR openabdev#1187 (first OAuth vendor).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants