Strip INPUT_ONLY fields from CLI response output#5575

Open
Divyansh-db wants to merge 5 commits into main from cligen-input-only-stripping

Conversation

Divyansh-db commented Jun 12, 2026

Contributor

Replaces #5450 with the cligen-based approach: pull the INPUT_ONLY field-behaviors that are already in .codegen/cli.json, run the masking in generated CLI command code to produce an any, and pass that to plain cmdio.Render. libs/cmdio stays untouched.

Why

The Databricks Go SDK uses one Go struct per resource for both request and response (transport-layer pattern). Some fields are REQUIRED on the request side (so the SDK emits them without omitempty) but INPUT_ONLY per the OpenAPI spec, meaning the server never returns them. The CLI hands the SDK response straight to json.MarshalIndent, so those fields leak into JSON output as empty strings. For example, databricks account disaster-recovery get-stable-url returns "initial_workspace_id": "".
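
The leak can be reproduced outside the CLI with a minimal sketch. The `StableURL` struct and the `renderResponse` helper below are hypothetical stand-ins, not the SDK's actual types; only the missing `omitempty` on `initial_workspace_id` is taken from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StableURL is a hypothetical stand-in for an SDK transport struct that is
// shared by the request and the response.
type StableURL struct {
	Name string `json:"name,omitempty"`
	// REQUIRED on the request side, so no omitempty; INPUT_ONLY per the
	// spec, so the server never sends it back.
	InitialWorkspaceID string `json:"initial_workspace_id"`
}

// renderResponse decodes a server payload into the shared struct and
// re-marshals it, roughly what happens before the CLI prints a response.
func renderResponse(serverPayload string) (string, error) {
	var resp StableURL
	if err := json.Unmarshal([]byte(serverPayload), &resp); err != nil {
		return "", err
	}
	out, err := json.Marshal(resp)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// The server omits initial_workspace_id, yet it reappears as "".
	out, err := renderResponse(`{"name":"my-url"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"name":"my-url","initial_workspace_id":""}
}
```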

The SDK behavior is intentional and not changing. The fix has to live in the CLI.

How

Three commits:

  1. libs/inputonly (new package). Strip(v, paths) (any, error) round-trips through encoding/json, deletes the listed dotted paths from the generic tree, and returns the masked value for the caller to marshal. Arrays are traversed transparently; dynamically-keyed maps are too (when no literal key matches the next path component, every map value is visited with the same path — same semantic as proto map<string, V> fields). Self-contained, no dependency on cmdio.

  2. internal/cligen reads INPUT_ONLY paths from cli.json and emits Strip calls. input_only.go walks the schema graph rooted at each method's response type and collects the dotted paths of every INPUT_ONLY field. Eligible methods are the standard sync render path only: paginated, LRO, wait, byte-stream, and empty responses are skipped. The template at the existing render emit site forks on the precomputed InputOnlyPaths: when non-empty, generate

    ```go
    masked, err := inputonly.Strip(response, []string{"initial_workspace_id"})
    if err != nil {
    	return err
    }
    return cmdio.Render(ctx, masked)
    ```

    otherwise the existing `return cmdio.Render(ctx, response)` is byte-identical. The `libs/inputonly` import is gated on `ServiceJSON.HasInputOnlyPaths()` so services that don't strip anything don't carry an unused import.

  3. Regenerate. `./task generate-cligen` updates six service files (disaster-recovery, clean-rooms, database, postgres, quality-monitor-v2, secrets-uc); everything else is byte-identical.
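
The template fork described in step 2 might look roughly like the sketch below. The template text, the `method` struct, and the `goStringSlice` helper are assumptions made for illustration; the real `service.go.tmpl` is not shown in this PR description, and only the `InputOnlyPaths` name and the two emitted shapes come from it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/template"
)

// method is a hypothetical, trimmed-down view of what the generator knows
// about one command; only InputOnlyPaths is named in the PR.
type method struct {
	InputOnlyPaths []string
}

// goStringSlice renders paths as a Go []string literal.
func goStringSlice(paths []string) string {
	quoted := make([]string, len(paths))
	for i, p := range paths {
		quoted[i] = strconv.Quote(p)
	}
	return "[]string{" + strings.Join(quoted, ", ") + "}"
}

// renderSite is an illustrative template fragment, not the real template.
// An empty or nil InputOnlyPaths takes the else branch.
var renderSite = template.Must(template.New("render").
	Funcs(template.FuncMap{"goStringSlice": goStringSlice}).
	Parse(`{{if .InputOnlyPaths}}masked, err := inputonly.Strip(response, {{goStringSlice .InputOnlyPaths}})
if err != nil {
	return err
}
return cmdio.Render(ctx, masked){{else}}return cmdio.Render(ctx, response){{end}}
`))

// emit returns the generated render statements for one method.
func emit(m method) (string, error) {
	var sb strings.Builder
	if err := renderSite.Execute(&sb, m); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, m := range []method{
		{},
		{InputOnlyPaths: []string{"initial_workspace_id"}},
	} {
		src, err := emit(m)
		if err != nil {
			panic(err)
		}
		fmt.Print(src, "---\n")
	}
}
```

Because the branch is taken only when the path list is non-empty, methods without INPUT_ONLY fields keep their existing output unchanged.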

Scope (what's not in this PR)

  • Paginated list methods still leak. `cli.json` doesn't surface the iterator element type today — only the wrapper response — so per-element path computation has nothing to walk. Once `cli.json` grows array/map element refs, the same walker picks them up automatically. The leak on `list-stable-urls` and friends is documented in the regen commit.
  • LRO and legacy-wait render sites. Those render an LRO operation, an LRO completion type, an intermediate wait response, and a poll-progress response respectively — each from a different schema entry. The mechanism here generalizes but each site needs its own entity source; deferred so this PR stays focused on the bulk of the standard sync render path.

Test plan

  • `go test ./libs/inputonly/... ./internal/cligen/...` — pass (new tests + existing).
  • `go test ./libs/cmdio/...` — pass (no changes there, but verify untouched).
  • `go build ./...` — clean against the regenerated `cmd/**`.
  • `./task lint-q` — 0 issues on the changed packages.
  • Manual: `grep -r initial_workspace_id cmd/account/disaster-recovery/` — only appears as a request-side flag and the `Strip` call argument, never in a render argument.

New package with Strip(v any, paths []string) (any, error) that
round-trips v through encoding/json into a generic representation,
removes the dotted paths from the tree, and returns the masked value
for the caller to marshal in its preferred format.

Used by generated CLI commands (cmd/account/**, cmd/workspace/**) to
drop fields the OpenAPI spec marks INPUT_ONLY before the response
flows into cmdio.Render. The SDK transport struct serializes these
fields unconditionally (REQUIRED on the request side, no omitempty),
so the CLI sees them as empty strings even though the server never
populates them on responses; rendering the stripped value matches what
the spec promises.

Path semantics: arrays are traversed transparently (each element
visited), and dynamically-keyed maps (proto map<string, V>) are too —
when no literal key matches the next path component, every value is
visited with the same path. This handles list responses and map-valued
fields with the same path expression as singletons.

Co-authored-by: Isaac
cliv1's .codegen/cli.json already carries x-databricks-field-behaviors
in its schemas block, so the CLI generator can pull the INPUT_ONLY
paths per method without any genkit-side change. (Cf. discussion at
go/slack DECO-27296.)

What's added:
- internal/cligen/input_only.go: walk the schema graph rooted at the
  method's response type, return sorted dotted paths of every field
  with the INPUT_ONLY behavior. Singleton message refs are followed;
  array/map element types aren't (cli.json's SchemaFieldJSON carries
  only one ref slot, populated for singleton fields). A field that is
  itself INPUT_ONLY emits a single path and stops — the whole subtree
  is stripped at runtime by libs/inputonly.Strip.
- internal/cligen/cligen.go: populateInputOnlyPaths runs after
  Resolve(); it skips paginated, LRO, wait, byte-stream, and empty
  responses because each renders a different shape and needs its own
  path source (deferred follow-up).
- internal/cligen/model.go: MethodJSON gains InputOnlyPaths []string
  (transient, not in JSON); ServiceJSON gains HasInputOnlyPaths() so
  the template can gate the libs/inputonly import.
- templates/service.go.tmpl: at the standard render emit site, emit
  inputonly.Strip(response, paths) into a `masked` local and pass that
  to the existing cmdio.Render. cmdio stays untouched. If paths is
  empty the existing call is byte-identical.

Tests cover flat, nested-via-ref, INPUT_ONLY-message (no recursion),
cycle, and unknown-root cases.
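
A minimal sketch of the walk described above, under stated assumptions: the `field` and `schema` types are hypothetical simplifications of the cli.json model (the real `SchemaFieldJSON` layout is not shown here), and `collectInputOnlyPaths` is an illustrative name rather than the function in `input_only.go`.

```go
package main

import (
	"fmt"
	"sort"
)

// field and schema are hypothetical stand-ins for the cli.json schema model.
type field struct {
	Ref       string // referenced message schema; "" for scalars
	InputOnly bool   // true when the field carries the INPUT_ONLY behavior
}

type schema struct {
	Fields map[string]field
}

// collectInputOnlyPaths returns the sorted dotted paths of every INPUT_ONLY
// field reachable from root through singleton refs.
func collectInputOnlyPaths(schemas map[string]schema, root string) []string {
	var paths []string
	visiting := map[string]bool{} // schemas on the current descent

	var walk func(name, prefix string)
	walk = func(name, prefix string) {
		s, ok := schemas[name]
		if !ok || visiting[name] {
			return // unknown schema or a cycle: nothing more to collect
		}
		visiting[name] = true
		defer delete(visiting, name)

		for fname, f := range s.Fields {
			path := prefix + fname
			if f.InputOnly {
				// One path covers the whole subtree; do not recurse.
				paths = append(paths, path)
				continue
			}
			if f.Ref != "" {
				walk(f.Ref, path+".")
			}
		}
	}
	walk(root, "")
	sort.Strings(paths)
	return paths
}

func main() {
	schemas := map[string]schema{
		"CleanRoom": {Fields: map[string]field{
			"name":                 {},
			"remote_detailed_info": {Ref: "Detail"},
		}},
		"Detail": {Fields: map[string]field{
			"creator": {Ref: "Creator"},
			"parent":  {Ref: "CleanRoom"}, // cycle back to the root
		}},
		"Creator": {Fields: map[string]field{
			"invite_recipient_workspace_id": {InputOnly: true},
		}},
	}
	fmt.Println(collectInputOnlyPaths(schemas, "CleanRoom"))
}
```

Sorting the result keeps the generated code stable across runs, since Go map iteration order is randomized.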

Co-authored-by: Isaac
Run of ./task generate-cligen after the previous commit. Six service
files change; everything else is byte-identical (most response types
have no INPUT_ONLY fields).

Each modified service has at least one method whose response type
declares an INPUT_ONLY field in cli.json's schemas block. Examples:

- disaster-recovery: `get-stable-url` / `create-stable-url` strip
  `initial_workspace_id`; failover-group methods strip
  `initial_primary_region`.
- clean-rooms: nested paths like
  `remote_detailed_info.creator.invite_recipient_workspace_id`.
- database / postgres / secrets-uc / quality-monitor-v2: multi-field
  paths including container fields whose entire subtree is INPUT_ONLY
  (e.g. `spec` on secrets-uc).

Paginated list methods (e.g. `list-stable-urls`) still go through
cmdio.RenderIterator unchanged. cli.json does not currently surface
the iterator element type, so per-element path computation is a
follow-up — the leak persists on those commands until then.

Co-authored-by: Isaac
github-actions Bot commented Jun 12, 2026

Contributor

Approval status: pending

/internal/ - needs approval

5 files changed
Suggested: @simonfaltum
Also eligible: @parthban-db, @renaudhartert-db, @hectorcast-db, @tanmay-db, @tejaskochar-db, @mihaimitrea-db, @chrisst, @rauchy

General files (require maintainer)

9 files changed
Based on git history:

  • @pietern -- recent work in internal/cligen/, cmd/workspace/database/, cmd/workspace/clean-rooms/

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

DatabaseInstance.stopped is INPUT_ONLY in cli.json, so the regenerated
update-database-instance command now strips it from the response. The
fixture was captured before that change and still expected `stopped`
in the rendered JSON.

Regenerated via ./task test-update.

Co-authored-by: Isaac
Two bugs in the original Strip implementation, both pointed out on the
prior PR's review and inherited here:

1. The json.Marshal -> json.Unmarshal(any) round-trip decoded every
   number as float64, silently losing precision above 2^53 — i.e. on
   real SDK fields like spark_context_id (int64). Switch to
   json.NewDecoder(...).UseNumber() so numbers stay as json.Number
   strings and re-marshal verbatim.

2. The map-handling branch in deletePath fell through to iterate every
   value when the literal key didn't match. For INPUT_ONLY fields the
   server always omits the field at its expected location, so the
   literal miss was the common case, and the fallback would silently
   recurse into every nested object and strip any same-named leaf —
   turning anchored paths into match-anywhere expressions.

   Repro: path "name" against {"id":"123","details":{"name":"x"}}
   stripped details.name. The same fallback also created a map-key
   collision foot-gun (a proto-map entry literally named the leaf
   would be removed, leaving the mask unapplied elsewhere).

   Fix: strict literal lookup on maps, return on miss. Arrays stay
   transparent because []any is type-distinguishable at runtime so
   the path semantic there is unambiguous. cligen does not emit paths
   that descend through proto maps today; if cli.json grows map value
   refs later, the path language should grow an explicit marker
   ("tags.*.field") rather than reintroducing implicit fallback.
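
Both fixes can be illustrated with a short sketch. This is an approximation written from the commit message, not the code in `libs/inputonly`; the `stripJSON` helper is a hypothetical convenience wrapper added for the demonstration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// Strip round-trips v through encoding/json and removes the dotted paths.
func Strip(v any, paths []string) (any, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber() // keep numbers as json.Number instead of float64
	var tree any
	if err := dec.Decode(&tree); err != nil {
		return nil, err
	}
	for _, p := range paths {
		deletePath(tree, strings.Split(p, "."))
	}
	return tree, nil
}

func deletePath(node any, parts []string) {
	switch n := node.(type) {
	case []any:
		// Arrays are transparent: apply the same path to every element.
		for _, elem := range n {
			deletePath(elem, parts)
		}
	case map[string]any:
		if len(parts) == 1 {
			delete(n, parts[0])
			return
		}
		// Strict literal lookup: on a miss, stop instead of searching
		// the other values.
		if child, ok := n[parts[0]]; ok {
			deletePath(child, parts[1:])
		}
	}
}

// stripJSON is a hypothetical helper: it strips paths from a JSON document
// and returns the re-marshaled result.
func stripJSON(input string, paths []string) (string, error) {
	masked, err := Strip(json.RawMessage(input), paths)
	if err != nil {
		return "", err
	}
	out, err := json.Marshal(masked)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := stripJSON(
		`{"id":9007199254740993,"name":"x","details":{"name":"y"}}`,
		[]string{"name"},
	)
	if err != nil {
		panic(err)
	}
	// 2^53+1 survives and details.name is untouched:
	fmt.Println(out) // {"details":{"name":"y"},"id":9007199254740993}
}
```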

Adds two regression tests:
- TestStripPreservesLargeInt64: value 2^53+1 survives a strip on a
  sibling field.
- TestStripDoesNotMatchAnywhere: path "name" leaves
  details.name untouched.

Drops two tests from the previous commit that relied on the map
fallback (TestStripMapValues, TestStripNestedInMapValue) and replaces
them with TestStripDoesNotDescendIntoMaps, which documents the strict
behavior.

Also notes in the doc comment that filtering reorders JSON keys
alphabetically (map[string]any vs struct declaration order), so
downstream consumers and acceptance fixtures should expect that.
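
The key reordering follows from how encoding/json treats maps, as this small sketch shows. The `cluster` struct and both helper functions are hypothetical names chosen for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cluster is a hypothetical response struct whose fields are declared in
// non-alphabetical order.
type cluster struct {
	Name string `json:"name"`
	ID   string `json:"id"`
}

// asStruct marshals the struct directly: keys follow declaration order.
func asStruct(c cluster) (string, error) {
	out, err := json.Marshal(c)
	return string(out), err
}

// asGenericTree round-trips through map[string]any, as a filter working on
// a generic tree would; encoding/json writes map keys in sorted order.
func asGenericTree(c cluster) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	var tree map[string]any
	if err := json.Unmarshal(raw, &tree); err != nil {
		return "", err
	}
	out, err := json.Marshal(tree)
	return string(out), err
}

func main() {
	c := cluster{Name: "n", ID: "1"}
	s, _ := asStruct(c)
	g, _ := asGenericTree(c)
	fmt.Println(s) // {"name":"n","id":"1"}
	fmt.Println(g) // {"id":"1","name":"n"}
}
```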

Co-authored-by: Isaac
@eng-dev-ecosystem-bot

Collaborator

Integration test report

Commit: 7b43980

Run: 27428259493

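| Env | 🟨 KNOWN | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
| --- | --- | --- | --- | --- | --- | --- |
| 🟨 aws linux | 7 | | 15 | 264 | 977 | 7:45 |
| 🟨 aws windows | 7 | | 15 | 266 | 975 | 15:00 |
| 💚 aws-ucws linux | | 7 | 15 | 360 | 891 | 7:18 |
| 💚 aws-ucws windows | | 7 | 15 | 362 | 889 | 12:47 |
| 💚 azure linux | | 1 | 17 | 267 | 975 | 6:46 |
| 💚 azure windows | | 1 | 17 | 269 | 973 | 9:39 |
| 💚 azure-ucws linux | | 1 | 17 | 365 | 887 | 8:02 |
| 💚 azure-ucws windows | | 1 | 17 | 367 | 885 | 10:38 |
| 💚 gcp linux | | 1 | 17 | 263 | 978 | 7:42 |
| 💚 gcp windows | | 1 | 17 | 265 | 976 | 12:28 |
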
22 interesting tests: 15 SKIP, 7 KNOWN

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 🟨 TestAccept | 🟨 K | 🟨 K | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |
| 🙈 TestAccept/bundle/invariant/no_drift | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/permissions | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions | 🟨 K | 🟨 K | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 🟨 K | 🟨 K | 💚 R | 💚 R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 🟨 K | 🟨 K | 💚 R | 💚 R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions | 🟨 K | 🟨 K | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 🟨 K | 🟨 K | 💚 R | 💚 R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 🟨 K | 🟨 K | 💚 R | 💚 R | | | | | | |
| 🙈 TestAccept/bundle/resources/postgres_branches/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/recreate | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/replace_existing | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/update_protected | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/without_branch_id | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/recreate | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_projects/update_display_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/synced_database_tables/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/grants/select | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/ssh/connection | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |

Top 28 slowest tests (at least 2 minutes):

| duration | env | testname |
| --- | --- | --- |
| 7:08 | aws-ucws windows | TestAccept |
| 6:23 | gcp windows | TestAccept |
| 5:23 | gcp windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 5:01 | azure-ucws windows | TestAccept |
| 4:56 | azure windows | TestAccept |
| 4:50 | gcp windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 4:49 | gcp linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 4:09 | gcp linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:16 | azure windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 3:14 | aws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:13 | azure-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 3:09 | aws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:06 | aws-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:02 | aws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:55 | azure linux | TestAccept |
| 2:53 | aws-ucws linux | TestAccept |
| 2:52 | aws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:51 | azure-ucws linux | TestAccept |
| 2:50 | gcp linux | TestAccept |
| 2:48 | azure linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:45 | azure windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:44 | azure linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:40 | azure-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:40 | azure-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:38 | aws-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:32 | azure-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:30 | aws-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:22 | aws-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
