Skip to content

docs: simplify and harden OpenShell quickstart#1113

Open
shaun-agent wants to merge 13 commits into
mainfrom
docs/openshell-runtime-install-contract
Open

docs: simplify and harden OpenShell quickstart#1113
shaun-agent wants to merge 13 commits into
mainfrom
docs/openshell-runtime-install-contract

Conversation

@shaun-agent

@shaun-agent shaun-agent commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

  • restore docs/openshell.md as a simple Day 1 OpenShell quick start
  • restore the OpenShell architecture diagram with the tested /sandbox runtime shape
  • make the prebuilt sandbox image the default path and keep local docker build as a source-testing fallback
  • use the OpenShell provider credential route for DISCORD_BOT_TOKEN instead of writing raw tokens into config.toml
  • split Linux host and macOS Docker Desktop host setup
  • keep deeper OpenShell policy, usability, and E2E automation notes in the ADR instead of the quick start

E2E validation

  • Tested in disposable Linux runner under Docker Desktop
  • Built OpenAB OpenShell sandbox image
  • Created real OpenShell sandbox oab
  • Applied Day 1 network policy via openshell policy update
  • Authenticated openab-agent via no-browser OAuth callback
  • Started OpenAB inside the OpenShell-created sandbox
  • Verified Discord bot connected as openshell-test and restricted to the configured channel

Auditable change map

docs/openshell.md

  • Restored the architecture diagram so users can see host gateway, sandbox, /sandbox state, OpenAB, openab-agent, and network policy in one page.
  • Changed the default setup back to the prebuilt sandbox image, with local docker build only as a source-testing fallback.
    • Reference: this repo already builds openshell/Dockerfile as the native sandbox image in .github/workflows/build-operator.yml.
  • Added the OpenShell provider route for DISCORD_BOT_TOKEN. Users load the token into the host shell, create an OpenShell provider, and keep config.toml as bot_token = "${DISCORD_BOT_TOKEN}".
  • Kept Day 1 network policy explicit.
  • Removed the large automation/E2E contract from the quick start and linked to the ADR for deeper notes.
    • Reference: maintainer feedback asked docs/openshell.md to stay Day 1 friendly and move E2E/policy complexity elsewhere.

openshell/Dockerfile

  • Sets HOME=/sandbox, WORKDIR=/sandbox, TMPDIR=/sandbox/tmp, and PATH with /sandbox/bin and /sandbox/.local/bin first.
  • Pre-creates /sandbox/bin, /sandbox/.local/bin, /sandbox/.cache, /sandbox/tmp, and /sandbox/work for deterministic writable locations.
  • Scope is OpenShell-only; normal K8S/Helm images are unchanged.

docs/adr/openshell-openab-preset-module.md

  • Keeps policy/access/usability/E2E details out of the Day 1 quick start.
  • Adds credential exposure rationale with official OpenShell provider references.
  • Documents automation rules for agents/harnesses so test passes prove the real OpenShell path, not host-local substitutes.
  • Corrects the previous GHCLI direction: GitHub CLI remains part of existing OpenAB image behavior; extra cloud/admin tools can still be installed under /sandbox when needed.

Notes

  • OpenShell audit mode is documented as useful for matched endpoint inspection, not global allow-all.
  • GitHub CLI remains part of the existing default image behavior; the earlier GHCLI removal experiment was reverted in this PR.
  • The current PR touches only docs/openshell.md, openshell/Dockerfile, and the OpenShell ADR.

@chaodu-agent

This comment has been minimized.

@howie

howie commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Mob Review — PR #1113 docs: simplify and harden OpenShell quickstart

Mode: 2-voice mob (Claude 4-subagent voice + Codex). agy/Gemini unavailable this round (OAuth onboarding expired — NOT_ONBOARDED).
Verdict: 🔴 NEEDS_CHANGES — author shaun-agent, base main, +625/-86, 3 files (docs only, no app code).


🔴 Critical (must fix before merge)

C1. Documented Day 1 network-policy command does not exist in OpenShell 0.0.16

File: docs/openshell.md — "Apply The Day 1 Network Policy" (both for loops)
The quickstart's required pre-auth step runs:

openshell policy update oab --add-endpoint "<ep>" --binary /usr/local/bin/openab --wait

Empirically verified against the installed openshell 0.0.16 (the exact version the doc's installer pulls): openshell policy exposes only set / get / list / delete. There is no policy update subcommand, and no --add-endpoint or --binary flags. policy set takes --policy <YAML file> (--wait does exist). So every endpoint command in this section fails immediately, and the "Day 1 success path" the guide promises cannot be completed as written.

Cross-voice: flagged by Codex (Critical, claims it ran 0.0.16) and independently reproduced by the lead on a local openshell 0.0.16.
Note: the old doc also used policy update --add-endpoint, so this is a pre-existing defect — but this PR rewrites the entire section (new format, --binary, --wait, for-loops) and therefore owns it now.

Fix: Replace with a tested policy YAML + openshell policy set oab --policy <file> --wait, encoding endpoints in the YAML schema the installed CLI actually parses. Verify against the pinned OpenShell version before merge. (The ADR §2.1 itself says this policy "needs compatibility validation against the OpenShell CLI" — so it should not ship as the required Day 1 step until validated.)

C2. ADR E2E automation contract uses a non-existent sandbox exec command

File: docs/adr/openshell-openab-preset-module.md §2.3
The automation contract instructs agents/harnesses:

openshell sandbox exec -n oab --no-tty -- ...

Verified against 0.0.16: openshell sandbox exposes create/get/list/delete/connect/upload/download/ssh-configno exec. Any harness following this contract fails. Since the contract is meant to make "a passing run prove the real OpenShell path", a wrong command defeats its purpose.

Cross-voice: Codex (confirmed by lead repro).

Fix: Replace with a supported transport (e.g. connect, or upload + run inside connect), or pin the OpenShell version where sandbox exec exists and document installing it.


🟡 Important (should fix; defer only with a documented reason)

I1. ${DISCORD_BOT_TOKEN} silently expands to empty string if the provider credential isn't injected

File: docs/openshell.md (TL;DR + Configure blocks) — verified behavior in src/config.rs expand_env_vars (std::env::var(..).unwrap_or_default(), covered by test expand_env_vars_unknown_becomes_empty).
The mechanism is real (OpenAB does expand ${VAR} in config.toml — verified), but if OpenShell does not inject DISCORD_BOT_TOKEN into the sandbox env (provider misconfigured, var name mismatch, --provider omitted), the token expands to "" with no error. OpenAB then connects to Discord with a blank token → opaque auth failure, not "token missing". The quickstart's in-sandbox verify step checks whoami/writable but never checks the var is present. The ADR itself warns provider values "may appear as resolver handles rather than raw env values" — directly contradicting the quickstart's assumption.

Cross-voice: Codex (R1 "provider behavior contradictory") + Claude code-reviewer + silent-failure-hunter (DUPLICATE-confirmed in R2).

Fix: Add an in-sandbox pre-run check (test -n "$DISCORD_BOT_TOKEN" || echo "TOKEN MISSING" without printing it), and a Troubleshooting row mapping empty token → Discord auth error.

I2. audit enforcement overstates lockdown — matched endpoints under audit are NOT enforced

File: docs/openshell.md Day 1 policy (openab-agent endpoints use :full:rest:audit) vs "Day 2 Boundary".
audit mode logs and allows matched traffic. The "Day 2 Boundary" prose only states that unmatched host+port+binary is blocked, and frames audit as "helps inspect matched endpoints" — it never states the more important converse: every endpoint the agent matches under audit is effectively open. Combined with wildcard hosts (*.openai.com, *.oaiusercontent.com), the agent's egress there is unrestricted while the reader believes it is allowlisted.

Cross-voice: Claude silent-failure-hunter (Critical) + comment-analyzer (Important) + Codex AGREE.

Fix: Either switch the openab-agent endpoints to an enforcing mode, or add an explicit warning: "audit = allowed + logged, NOT enforced; the Discord binary is enforced, the model/auth binary is in audit (open) for Day 1."

I3. Quickstart makes Day 1 policy mandatory; its own ADR says it must not be required for first run

File: docs/openshell.md ("Before authenticating or starting OpenAB, apply the Day 1 network policy") vs docs/adr/openshell-openab-preset-module.md §2.1 ("Keep broad policy files out of the first-run happy path"; "do not require policy edits for the first successful run").
Direct doc-vs-ADR contradiction on whether policy is a required Day 1 step.

Cross-voice: comment-analyzer + Codex AGREE.

Fix: Reconcile — either confirm via testing that policy IS required and update the ADR, or demote the policy block to "apply if Discord/model traffic is blocked."

I4. Cleanup no longer deletes the credential provider (regression vs old doc)

File: docs/openshell.md Cleanup. Old doc had openshell provider delete discord; the new Cleanup only runs openshell sandbox delete oab. The openab-discord provider (holding the Discord token) is left behind; users think credentials were removed.

Cross-voice: Codex (diff-verifiable regression).

Fix: Add openshell provider delete openab-discord to Cleanup.

I5. "Test local source changes" build command does not build local source

File: docs/openshell.md (docker build -t oab-native-sandbox -f openshell/Dockerfile .).
The text says use this to "test local source changes", but openshell/Dockerfile is FROM ghcr.io/openabdev/openab-native:beta with no COPY/compile of the local checkout — it only re-layers env/dirs onto the published base. So a local OpenAB source change is NOT reflected. (Build context . is also irrelevant since there's no COPY — harmless but misleading.)

Cross-voice: Codex + Claude code-reviewer (context-irrelevant).

Fix: Reword to "test local Dockerfile changes", or make the Dockerfile build OpenAB/openab-agent from the local checkout if source testing is the goal.

I6. Broken ADR cross-reference

File: docs/adr/openshell-openab-preset-module.md:12 — Related list links ./openshell-websocket-auth.md, which does not exist in docs/adr/. (Sibling ./custom-gateway.md does resolve.)

Cross-voice: Codex + comment-analyzer (both verified absent). No markdown link-check CI exists to catch it.

Fix: Remove the link, repoint to the intended existing ADR, or add the file.

I7. CI does not exercise this Dockerfile change (ships to release build untested)

File: .github/workflows/docker-smoke-test.yml pull_request.paths.
paths is Dockerfile* / src/** / Cargo.*. GitHub path globs don't cross / without **, so Dockerfile* does not match openshell/Dockerfile. This PR touches only docs/* + openshell/Dockerfile → the smoke job is skipped, and the Dockerfile change (HOME=/sandbox, -d /sandbox, new dirs, PATH/TMPDIR) flows to build-operator.yml (tag/release) with zero PR-time validation. Even when the smoke test does run, it only asserts the CMD/agent-CLI/ACP handshake — it never asserts HOME=/sandbox or writable /sandbox, so reverting the home change would stay green while breaking the documented quickstart.

Cross-voice: pr-test-analyzer (both legs verified) + Codex AGREE. (Arguably a repo-CI follow-up rather than strictly in this PR's scope — flag for maintainer.)

Fix: Add openshell/** (or **/Dockerfile*) to paths, and add an assertion: docker run --rm <img> sh -c 'test "$HOME" = /sandbox && test -w /sandbox && command -v openab'.


🟢 Actionable NITs

  • N1. The two config.toml blocks (TL;DR vs "Configure And Run") differ only by a [reactions] block that restates code defaults (enabled=true, remove_after_reply=false — verified). Either drop it or include it in both so copy-paste targets match.
  • N2. Dockerfile creates /sandbox/work and /sandbox/.cache, but the architecture diagram/config list only bin, .local/bin, tmp; runtime mkdir -p recreates only bin/.local/bin/tmp (not .cache/work). Align the documented filesystem shape with what the image actually creates, or drop unused dirs.
  • N3. auth0.openai.comauth.openai.com + wildcards (*.openai.com) is plausibly correct for the codex-oauth flow but unverified in-repo; if *.openai.com already covers auth, the explicit auth.openai.com line is redundant. Confirm the real auth host.
  • N4. TL;DR leads with inline export DISCORD_BOT_TOKEN="your-discord-bot-token", which trains users to paste real tokens into shell history. Prefer read -rs DISCORD_BOT_TOKEN or sourcing a gitignored .env. (Not a leak — config-file hygiene is genuinely improved — just residual history/env exposure.)
  • N5. message_processing_mode is shown in the doc but absent from config.toml.example (it exists in code/tests). Add it to the example for parity so readers have a schema artifact to diff against.

✅ Verified non-issues (checked, no action)

  • ${VAR} expansion works: src/config.rs expand_env_vars matches bare ${VAR} via std::env::var, called in load_config_raw before parse, unit-tested. The mechanism is correct (only the empty/injection case in I1 is the risk).
  • Dockerfile user/home setup is correct: useradd -m -d /sandbox + mkdir + chown -R + USER/ENV/WORKDIR ordering is sound; uid 1000660000 is the deliberate OpenShift-style high uid.
  • Config fields all exist in src/config.rs and are unit-tested (allow_all_users, message_processing_mode, remove_after_reply, [agent.env]).
  • openshell/Dockerfile IS built in CI (build-operator.yml + docker-smoke-test.yml), so the prebuilt openab-native-sandbox image is real (the I7 gap is about this PR not triggering it).
  • --wait flag is real on openshell policy set.

Relationship to prior OpenShell PRs / issues (per request)

@chaodu-agent

Copy link
Copy Markdown
Collaborator

CHANGES REQUESTED ⚠️ — Mob review identified critical issues: documented OpenShell CLI commands do not exist in the installed version (0.0.16).

What This PR Does

Simplifies docs/openshell.md into a focused Day 1 quickstart, moves deeper policy/E2E/automation notes into a new ADR, and hardens openshell/Dockerfile with proper /sandbox layout.

How It Works

  • Restores architecture diagram with /sandbox as writable root
  • Uses OpenShell provider credential route instead of raw tokens in config
  • Splits Linux host vs macOS Docker Desktop paths
  • Adds Day 1 network policy section (binary-scoped endpoints)
  • ADR captures preset-module discussion, E2E contract, and credential exposure rationale

Findings

# Severity Finding Location
1 🔴 openshell policy update --add-endpoint --binary does not exist in OpenShell 0.0.16 docs/openshell.md Day 1 policy section
2 🔴 openshell sandbox exec does not exist in OpenShell 0.0.16 docs/adr/...preset-module.md §2.3
3 🟡 ${DISCORD_BOT_TOKEN} silently expands to empty string if provider credential not injected docs/openshell.md config blocks
4 🟡 audit enforcement mode allows matched traffic (not enforced) — misleading framing docs/openshell.md Day 1 policy + Day 2 section
5 🟡 ADR says policy must NOT be required for first run; quickstart makes it mandatory docs/openshell.md vs ADR §2.1
6 🟡 Cleanup section no longer deletes the credential provider (regression) docs/openshell.md Cleanup
7 🟡 "Test local source changes" build does not actually build local source docs/openshell.md local build section
8 🟡 Broken cross-reference to non-existent openshell-websocket-auth.md ADR line 12
9 🟡 CI paths filter does not match openshell/Dockerfile — ships untested .github/workflows/docker-smoke-test.yml
10 🟢 Dockerfile user/home/env setup is correct and sound openshell/Dockerfile
11 🟢 ${VAR} expansion mechanism in src/config.rs is correct
12 🟢 All config fields exist and are tested src/config.rs
Finding Details

🔴 F1: Day 1 network-policy commands do not exist in OpenShell 0.0.16

The quickstart requires:

openshell policy update oab --add-endpoint "<ep>" --binary /usr/local/bin/openab --wait

Verified against OpenShell 0.0.16: openshell policy exposes only set / get / list / delete. There is no policy update subcommand, no --add-endpoint, and no --binary flag. Every endpoint command in this section fails immediately, making the documented Day 1 success path impossible to complete.

Fix: Replace with a tested policy YAML file + openshell policy set oab --policy <file> --wait, encoding endpoints in the YAML schema the CLI actually parses. Verify against the pinned OpenShell version before merge.

🔴 F2: ADR E2E automation contract uses non-existent sandbox exec command

The automation contract instructs:

openshell sandbox exec -n oab --no-tty -- ...

Verified against 0.0.16: openshell sandbox exposes create/get/list/delete/connect/upload/download/ssh-config — no exec subcommand exists.

Fix: Replace with a supported transport (connect, or upload + run inside connect), or pin an OpenShell version where sandbox exec exists.

🟡 F3: Silent empty-string expansion for missing provider credential

src/config.rs expand_env_vars uses std::env::var(..).unwrap_or_default(). If DISCORD_BOT_TOKEN is not injected into the sandbox environment (provider misconfigured, --provider omitted), the token expands to "" with no error — causing an opaque Discord auth failure.

Fix: Add an in-sandbox pre-run check: test -n "$DISCORD_BOT_TOKEN" || echo "ERROR: DISCORD_BOT_TOKEN not set" (without printing the value). Add a Troubleshooting row for this failure mode.

🟡 F4: audit enforcement mode overstates lockdown

The openab-agent endpoints use :full:rest:audit format. Audit mode logs and allows matched traffic — it is not enforced. Combined with wildcard hosts (*.openai.com, *.oaiusercontent.com), the agent binary has effectively unrestricted egress to those domains while the reader believes it is locked down.

Fix: Either switch to an enforcing mode, or add an explicit warning explaining that audit = allowed + logged, NOT enforced for Day 1.

🟡 F5: Quickstart vs ADR contradiction on policy requirement

The quickstart states: "Before authenticating or starting OpenAB, apply the Day 1 network policy." The ADR §2.1 states: "Keep broad policy files out of the first-run happy path" and "do not require policy edits for the first successful run."

Fix: Reconcile — either confirm via testing that policy IS required and update the ADR, or demote the policy block to conditional ("apply if Discord/model traffic is blocked").

🟡 F6: Cleanup regression — provider credential left behind

Old doc: openshell provider delete discord. New Cleanup: only openshell sandbox delete oab. The openab-discord provider (holding the Discord token) persists after cleanup.

Fix: Add openshell provider delete openab-discord to Cleanup section.

🟡 F7: Local build does not build local source

docker build -t oab-native-sandbox -f openshell/Dockerfile . — the text says "test local source changes" but the Dockerfile is FROM ghcr.io/openabdev/openab-native:beta with no COPY of local source. A local OpenAB code change is not reflected.

Fix: Reword to "test local Dockerfile changes" or add a COPY/build stage for actual source testing.

🟡 F8: Broken ADR cross-reference

Line 12 links ./openshell-websocket-auth.md which does not exist in docs/adr/.

Fix: Remove the link or point to the intended existing ADR.

🟡 F9: CI paths filter misses openshell/Dockerfile

docker-smoke-test.yml paths uses Dockerfile* which does not match openshell/Dockerfile (glob does not cross /). This Dockerfile change flows to release builds with zero PR-time validation.

Fix: Add openshell/** to paths trigger, and add an assertion: docker run --rm <img> sh -c 'test "$HOME" = /sandbox && test -w /sandbox && command -v openab'.

Baseline Check
  • PR opened: 2026-06-15
  • Author: shaun-agent
  • Predecessor: docs: add OpenShell preset module ADR #1053 (same author, same 3 files, closed same day docs: simplify and harden OpenShell quickstart #1113 opened)
  • Main already has: basic docs/openshell.md with old-style policy commands
  • Net-new value: restructured Day 1 quickstart, provider credential pattern, hardened Dockerfile, new ADR
  • Note: The old doc also used policy update --add-endpoint (pre-existing defect), but this PR rewrites the entire section and therefore owns it
Addressing External Reviewer Feedback

@howie (Mob Review)

C1: openshell policy update --add-endpoint --binary does not exist in 0.0.16

Confirmed and adopted as F1 (🔴 Critical). Independently verified — the installed CLI does not expose these subcommands.

C2: openshell sandbox exec does not exist in 0.0.16

Confirmed and adopted as F2 (🔴 Critical). The sandbox subcommand does not include exec.

I1: Silent empty-string expansion

Confirmed and adopted as F3 (🟡 Important). Code-verified in src/config.rs.

I2: audit mode overstates lockdown

Confirmed and adopted as F4 (🟡 Important).

I3: Quickstart vs ADR contradiction on policy requirement

Confirmed and adopted as F5 (🟡 Important).

I4: Cleanup regression

Confirmed and adopted as F6 (🟡 Important).

I5: Local build does not build local source

Confirmed and adopted as F7 (🟡 Important).

I6: Broken ADR cross-reference

Confirmed and adopted as F8 (🟡 Important).

I7: CI paths filter misses openshell/Dockerfile

Confirmed and adopted as F9 (🟡 Important). Flagged for maintainer — arguably a repo-CI follow-up but worth fixing with this PR.

What's Good (🟢)
  • Dockerfile layout is well-structured: correct useradd -m -d /sandbox, proper dir pre-creation, appropriate chown, sound env vars
  • Provider credential pattern is the right direction — avoids raw tokens in committed config
  • Architecture diagram is clear and accurately represents the runtime shape
  • ADR is thorough and captures the nuanced differences between K8s and OpenShell mental models
  • Split between Day 1 quickstart and deeper ADR is the right documentation architecture

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants