Add cmdio.RenderFiltered to strip fields from rendered output#5450
Divyansh-db wants to merge 2 commits into
Adds new cmdio.RenderFiltered and cmdio.RenderIteratorFiltered entry points that mirror Render / RenderIterator but accept a list of dotted JSON paths to strip from the value before it is marshaled. The list is consulted only on the JSON render path; text/template rendering is unchanged. Motivation: the Databricks SDK uses a single transport struct per resource for both request and response. Some fields are required on the request side (so the SDK marshals them unconditionally) but input-only on the response side per the OpenAPI spec. The CLI today hands the SDK response struct directly to json.MarshalIndent, so those fields leak into user-visible output even though the server doesn't populate them. RenderFiltered gives generated CLI commands a way to strip such fields without modifying the SDK or introducing a separate view layer. Nothing in the generated CLI surface calls these new entry points yet; that switch will land alongside the corresponding codegen change. Co-authored-by: Isaac
Commit: e3d5421
deletePath was traversing arrays transparently but not maps. When the generator emits a path like "tags.initial_workspace_id" for a map[string]V field whose value type has an INPUT_ONLY field, the literal "initial_workspace_id" key doesn't exist directly under "tags" — it lives inside each map value — so the path was silently a no-op at render time. Mirror the array behavior: when a path component doesn't match a literal key on the current object, descend into every value with the same key list. The literal-match path is still preferred, so struct field paths keep working unchanged. Adds two tests: a top-level map field and a map nested inside another struct, both checking that the inner field is stripped from every value. Co-authored-by: Isaac
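The behavior this commit describes can be sketched roughly as follows. This is a reconstruction from the commit message, not the PR's actual code: the names deletePathLenient and stripLenient are hypothetical, and the sample document is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// deletePathLenient is a hypothetical reconstruction of the traversal the
// commit message describes. It is not the PR's code.
func deletePathLenient(v any, keys []string) {
	if len(keys) == 0 {
		return
	}
	switch t := v.(type) {
	case map[string]any:
		if child, ok := t[keys[0]]; ok {
			// A literal match is preferred: delete the leaf or keep walking.
			if len(keys) == 1 {
				delete(t, keys[0])
			} else {
				deletePathLenient(child, keys[1:])
			}
			return
		}
		// No literal match: retry the same key list on every value.
		for _, child := range t {
			deletePathLenient(child, keys)
		}
	case []any:
		// Arrays are transparent: apply the same key list to each element.
		for _, child := range t {
			deletePathLenient(child, keys)
		}
	}
}

// stripLenient decodes doc, removes the dotted path, and re-encodes the tree.
func stripLenient(doc, path string) string {
	var tree any
	if err := json.Unmarshal([]byte(doc), &tree); err != nil {
		panic(err)
	}
	deletePathLenient(tree, strings.Split(path, "."))
	out, err := json.Marshal(tree)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	doc := `{"tags":{"a":{"initial_workspace_id":"1","value":"x"},"b":{"initial_workspace_id":"2","value":"y"}}}`
	// The field is removed from every map value under "tags".
	fmt.Println(stripLenient(doc, "tags.initial_workspace_id"))
}
```

The sample uses only string values, so the plain json.Unmarshal round-trip is harmless here.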
hectorcast-db left a comment
I have no concerns with the code quality or test coverage, but I don't have enough context on the implementation of the CLI to be sure this is the best approach.
E.g., on the SDK we may want to generate methods for each struct instead of using a generic one, but here it is likely not possible due to CLI design.
Can you get a second opinion?
simonfaltum left a comment
Took a deep look as the second opinion @hectorcast-db asked for. I ran this through multiple independent review passes and they converged on the same issues, so I'm fairly confident in the findings.
The approach itself is fine imo: a generic render-side seam fits how the CLI consumes SDK structs, and per-struct generated methods would be a much bigger change for little gain. Two things need fixing before generated code starts calling this though:
- The JSON round-trip corrupts large int64 values (inline comment, blocking).
- The path semantics are ambiguous for maps vs structs (two inline comments). The genkit follow-up bakes these paths into every generated command, so I think we should make the path language explicit now rather than live with fuzzy matching.
Smaller things, no inline comments:
- RenderIteratorFiltered with non-empty paths has no test (the masking inside iteratorRenderer.renderJson).
- RenderFiltered on an io.Reader silently ignores the paths (newRenderer returns readerRenderer first). Worth a doc line.
Everything else looks good: test coverage for the happy paths is solid, and Render with nil paths is provably unchanged.
		return nil, fmt.Errorf("input-only mask: marshal: %w", err)
	}
	var out any
	if err := json.Unmarshal(b, &out); err != nil {
Blocking: this round-trip corrupts large int64 fields whenever a path is filtered. json.Unmarshal into any decodes every JSON number as float64 (53-bit mantissa), and it hits every numeric field in the response, not just the stripped one. Repro:
input : {"cluster_name":"c","spark_context_id":7189401748684612345}
output: {"cluster_name":"c","spark_context_id":7189401748684613000}
spark_context_id is a real cluster response field that exceeds 2^53. The fix is small:
var out any
dec := json.NewDecoder(bytes.NewReader(b))
dec.UseNumber()
if err := dec.Decode(&out); err != nil {
	return nil, fmt.Errorf("input-only mask: unmarshal: %w", err)
}
json.Number re-marshals verbatim and deletePath never touches it. A regression test with a value above 2^53 (e.g. int64(9007199254740993)) would catch this.
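To make the precision loss concrete, here is a minimal, self-contained sketch that round-trips a document through a generic tree with and without UseNumber. The helper name roundTrip and the sample key are illustrative only and do not come from the PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// roundTrip decodes doc into a generic tree and re-encodes it. When useNumber
// is true, numbers are kept as json.Number instead of float64.
func roundTrip(doc string, useNumber bool) (string, error) {
	dec := json.NewDecoder(bytes.NewReader([]byte(doc)))
	if useNumber {
		dec.UseNumber()
	}
	var tree any
	if err := dec.Decode(&tree); err != nil {
		return "", err
	}
	out, err := json.Marshal(tree)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// 2^53 + 1 is the smallest positive integer that float64 cannot represent.
	const doc = `{"id":9007199254740993}`
	lossy, _ := roundTrip(doc, false)
	exact, _ := roundTrip(doc, true)
	fmt.Println(lossy == doc, exact == doc) // prints: false true
}
```

A regression test along these lines would compare the re-encoded bytes rather than the decoded values, since the corruption only shows up after re-marshaling.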
One more side effect worth a deliberate decision: the map[string]any tree marshals keys alphabetically while structs marshal in field order, so turning filtering on for a command reorders its whole JSON output. UseNumber doesn't change that. I think sorted keys are probably acceptable, but it should be a conscious choice, not a surprise.
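The reordering is easy to see in isolation. In the sketch below, the cluster struct and its field order are hypothetical, chosen only so that declaration order differs from alphabetical order.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cluster is a hypothetical response struct; its fields are deliberately
// declared in non-alphabetical order.
type cluster struct {
	State       string `json:"state"`
	ClusterName string `json:"cluster_name"`
	ClusterID   string `json:"cluster_id"`
}

// marshalDirect encodes the struct as-is, so keys follow field order.
func marshalDirect(c cluster) string {
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// marshalViaTree round-trips through map[string]any first, so keys come out
// sorted, because encoding/json sorts map keys when marshaling.
func marshalViaTree(c cluster) string {
	var tree map[string]any
	if err := json.Unmarshal([]byte(marshalDirect(c)), &tree); err != nil {
		panic(err)
	}
	b, err := json.Marshal(tree)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	c := cluster{State: "RUNNING", ClusterName: "c", ClusterID: "123"}
	fmt.Println(marshalDirect(c))  // {"state":"RUNNING","cluster_name":"c","cluster_id":"123"}
	fmt.Println(marshalViaTree(c)) // {"cluster_id":"123","cluster_name":"c","state":"RUNNING"}
}
```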
	}
	switch t := v.(type) {
	case map[string]any:
		if child, ok := t[keys[0]]; ok {
The literal-match-wins rule here is worse than the doc comment describes: for tags.initial_workspace_id against a proto map, a user entry literally named initial_workspace_id gets deleted and traversal stops, so the input-only field survives in every other map value. Both failure modes at once (user data dropped, mask not applied). Same root cause as the fallback descent below; fix proposal there. A regression test with a colliding map key would be good to have either way.
			}
			return
		}
		for _, child := range t {
My concern with this fallback: after the round-trip, nested structs and proto maps are both map[string]any, so this branch can't tell them apart. When keys[0] doesn't literally match, the full path is re-applied to every value at any depth, which turns an anchored path into match-anywhere. Repro: path name against {"id":"123","details":{"name":"keep-me","size":"L"}} deletes details.name.
This bites exactly in the INPUT_ONLY case: when the server omits the field at its expected location, the descent falls through and strips same-named legitimate output elsewhere in the response.
I think the fix for this and the map collision above is the same: make map traversal explicit in the path language. genkit knows where the proto maps are, so it can emit tags.*.initial_workspace_id, and deletePath becomes strict: literal segments match literally, * fans out over map values, arrays stay transparent. No ambiguity left, and the paths stay schema-derived.
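One possible shape for the strict traversal proposed here is sketched below. It is an illustration under assumptions, not the PR's implementation: deletePathStrict and stripStrict are hypothetical names, a trailing * is treated as a no-op, and a user map key literally named * is not handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// deletePathStrict is a hypothetical sketch of the proposed semantics:
// literal segments must match a key exactly, "*" fans out over every value of
// an object, and arrays stay transparent.
func deletePathStrict(v any, keys []string) {
	if len(keys) == 0 {
		return
	}
	switch t := v.(type) {
	case []any:
		for _, child := range t {
			deletePathStrict(child, keys)
		}
	case map[string]any:
		if keys[0] == "*" {
			// Wildcard: descend into every value. A trailing "*" names no
			// leaf, so it ends up as a no-op (an assumption of this sketch).
			for _, child := range t {
				deletePathStrict(child, keys[1:])
			}
			return
		}
		child, ok := t[keys[0]]
		if !ok {
			return // strict: no fallback descent on a missing key
		}
		if len(keys) == 1 {
			delete(t, keys[0])
			return
		}
		deletePathStrict(child, keys[1:])
	}
}

// stripStrict decodes doc with UseNumber, removes the path, and re-encodes.
func stripStrict(doc, path string) string {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber() // keep large integers exact
	var tree any
	if err := dec.Decode(&tree); err != nil {
		panic(err)
	}
	deletePathStrict(tree, strings.Split(path, "."))
	out, err := json.Marshal(tree)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// Anchored path: details.name survives because "name" is absent at the top level.
	fmt.Println(stripStrict(`{"id":"123","details":{"name":"keep-me","size":"L"}}`, "name"))
	// Colliding user key: the entry is kept and the inner field is stripped everywhere.
	fmt.Println(stripStrict(`{"tags":{"initial_workspace_id":{"initial_workspace_id":"1","value":"x"},"b":{"initial_workspace_id":"2","value":"y"}}}`, "tags.*.initial_workspace_id"))
}
```

Both repros from the inline comments are covered: the anchored path no longer matches anywhere, and a map entry whose key collides with the field name is neither dropped nor allowed to stop the traversal.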
Summary
Adds new cmdio.RenderFiltered and cmdio.RenderIteratorFiltered entry points that mirror Render / RenderIterator but accept a []string of dotted JSON paths to strip from the value before it is marshaled. The path list is consulted only by the JSON render path; text/template rendering is unchanged.
Path syntax is dotted (a.b.c); arrays are traversed transparently, so the same path expression works for singleton and list responses without an items[] marker.
Motivation
The Databricks SDK uses a single Go struct per resource for both request and response (transport-layer pattern). Some fields are REQUIRED on the request side, so the SDK marshals them unconditionally, while also being INPUT_ONLY per the OpenAPI spec: they're never populated on responses. Generated CLI commands hand the SDK response struct directly to json.MarshalIndent, so those fields leak into user-visible output as empty strings even when the server omits them.
Existing in-repo precedents for honoring x-databricks-field-behaviors:
- bundle/direct/tools/generate_resources.py reads INPUT_ONLY / OUTPUT_ONLY from the OpenAPI spec and propagates them into the DABs direct engine's ignore_remote_changes config.
- bundle/internal/schema/main.go's removeOutputOnlyFields strips OUTPUT_ONLY from the bundle JSON schema.
RenderFiltered is the equivalent seam on the CLI response render path. This PR adds only the entry points; the genkit template change that starts calling them will land separately, followed by the generated-code regeneration in cmd/account/** and cmd/workspace/**.
Design notes
- Render is unchanged in behavior (now a thin wrapper around RenderFiltered(_, _, nil)). Existing callers don't need to change.
- applyInputOnlyMask in libs/cmdio/filter.go round-trips the value through json.Marshal into a generic map[string]any tree, removes the listed leaf keys, and returns the masked value for the caller to marshal in its preferred format. Operating on the generic tree (rather than on raw bytes) lets defaultRenderer.renderJson and iteratorRenderer.renderJson reuse the same masking logic despite using different MarshalIndent prefixes.
- colorizeJSON, so the colorized output stays consistent.
Test plan
- go test ./libs/cmdio/...: pass (new and existing tests)
- gofmt -l clean on changed files; go vet ./libs/cmdio/... clean
- ./task fmt-q / ./task lint-q: 0 issues
- RenderFiltered matches plain Render when no paths are supplied.
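The thin-wrapper arrangement from the design notes, and the property that an empty path list leaves output unchanged, might look roughly like this. Everything here is a simplified stand-in: renderFiltered, render, renderToString and item are hypothetical, the real entry points have different signatures, only top-level keys are stripped, and the sketch assumes the mask is skipped entirely when no paths are given.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// item is a hypothetical response struct with one field to strip.
type item struct {
	Name  string `json:"name"`
	Token string `json:"token"`
}

// renderFiltered is a simplified stand-in for the filtered render path. With
// no keys it marshals v directly, so field order and numbers are untouched.
func renderFiltered(w io.Writer, v any, keys []string) error {
	if len(keys) > 0 {
		b, err := json.Marshal(v)
		if err != nil {
			return fmt.Errorf("mask: marshal: %w", err)
		}
		dec := json.NewDecoder(bytes.NewReader(b))
		dec.UseNumber()
		var tree map[string]any // sketch limitation: v must encode to a JSON object
		if err := dec.Decode(&tree); err != nil {
			return fmt.Errorf("mask: unmarshal: %w", err)
		}
		for _, k := range keys {
			delete(tree, k)
		}
		v = tree
	}
	out, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err
	}
	_, err = w.Write(append(out, '\n'))
	return err
}

// render is the thin wrapper: the same code path with no keys to strip.
func render(w io.Writer, v any) error { return renderFiltered(w, v, nil) }

// renderToString is a small helper for tests and demos.
func renderToString(v any, keys []string) string {
	var buf bytes.Buffer
	if err := renderFiltered(&buf, v, keys); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	it := item{Name: "n", Token: ""}
	_ = render(os.Stdout, it)                        // unfiltered: the empty token leaks
	fmt.Print(renderToString(it, []string{"token"})) // token stripped
}
```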