Conversation
| @@ -104,6 +104,14 @@ def test_roundtrip(self): | |||
| result = cli._read_cfg() | |||
| self.assertEqual(result, sections) | |||
|
|
|||
There was a problem hiding this comment.
Seems somehow this was accidentally added in my server change PR? Please review the full file
| ./python/dev/lint-python | ||
| ./python/dev/pytest | ||
| python -m pip install -e ./cli | ||
| cd cli && python -m unittest discover -v -p 'test*.py' |
There was a problem hiding this comment.
Why test it with Python not a standalone build?
| ```bash | ||
| delta-sharing tables metadata my_share my_schema my_table | ||
| delta-sharing tables metadata my_share my_schema my_table \ | ||
| --response-format delta --reader-features deletionvectors |
There was a problem hiding this comment.
what if there are multiple reader features?
The value deletionvectors doesn't seem to match real feature name?
| ### tables query | ||
|
|
||
| ```bash | ||
| delta-sharing tables query my_share my_schema my_table |
There was a problem hiding this comment.
does query support auto resolution?
Code review (posted on behalf of review pass)OverallThe CLI is a strong addition: stdlib-only, sensible auth resolution order, 1. Protocol gaps: query pagination /
|
littlegrasscao
left a comment
There was a problem hiding this comment.
Additional line-level review (does not repeat the earlier summary comment on pagination, README reader-features, doc/Python version alignment, license headers, PyPI naming, CI rationale, or generic test gaps).
| return token[:4] + "..." + token[-4:] | ||
|
|
||
| _SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$") | ||
| _KV_RE = re.compile(r"^(\w+)\s*=\s*(.*)$") |
There was a problem hiding this comment.
_KV_RE only allows ASCII word keys (\w+). That rejects INI keys with hyphens and silently ignores non-matching lines. Consider documenting supported key syntax in README, or relaxing the pattern if you want parity with common INI dialects.
| Delta Sharing CLI — interact with a Delta Sharing server from the command line. | ||
|
|
||
| Connection resolution (first applicable branch wins; see resolve_connection): | ||
| 1. Both --endpoint and --token set (explicit credentials only) |
There was a problem hiding this comment.
Minor inconsistency: the module docstring describes step 1 as when both --endpoint and --token are set, which matches the code, but readers may not realize that supplying only one of them still falls through to profile/cfg resolution. A clarifying phrase ("both required to skip profile resolution") would help.
| for k, v in e.headers.items(): | ||
| _debug(f"< {k}: {v}") | ||
| _debug(f"< Body: {body_bytes.decode(errors='replace')[:2000]}") | ||
| try: |
There was a problem hiding this comment.
If the error response body is valid JSON but not an object (e.g. a quoted string or an array), err["httpStatus"] = e.code raises TypeError. Consider normalizing to a dict first, e.g. err = err if isinstance(err, dict) else {"message": err}.
| def _url(base, *parts, **query): | ||
| path = "/".join(urllib.parse.quote(p, safe="") for p in parts) | ||
| url = f"{base.rstrip('/')}/{path}" | ||
| qs = {k: str(v) for k, v in query.items() if v is not None} |
There was a problem hiding this comment.
Query values use str(v). For CDF, --include-historical-metadata stays a string from argparse today (good); if this ever became a boolean you would emit True/False in the query string instead of true/false. Minor footgun to keep in mind.
| def _parse_ndjson(raw_bytes): | ||
| """Parse newline-delimited JSON into a list of objects.""" | ||
| lines = raw_bytes.decode().strip().splitlines() | ||
| return [json.loads(line) for line in lines if line.strip()] |
There was a problem hiding this comment.
decode() uses strict UTF-8. NDJSON with invalid UTF-8 raises UnicodeDecodeError (surfacing as generic Unexpected error unless -V). Consider decode(errors="replace") or a dedicated error message.
| _, _, body = _request("GET", url, token, extra_headers=extra_headers) | ||
| data = json.loads(body) | ||
| all_items.extend(data.get("items", [])) | ||
| page_token = data.get("nextPageToken") |
There was a problem hiding this comment.
PROTOCOL allows nextPageToken to be an empty string meaning no more pages. if not page_token matches that. Optional: a one-line comment in code tying this to the spec would help future readers.
| "tables", args.table, "version", | ||
| startingTimestamp=args.starting_timestamp) | ||
| _, hdrs, _ = _request("GET", url, token, extra_headers=xh) | ||
| version = hdrs.get("Delta-Table-Version") or hdrs.get("delta-table-version") |
There was a problem hiding this comment.
int(version) still throws if the header is present but non-numeric. Consider try/except and a JSON error on stderr for consistency with _die.
| if args.insecure: | ||
| profile["insecure"] = "true" | ||
| else: | ||
| insecure = input("Skip SSL verification (y/N): ").strip().lower() |
There was a problem hiding this comment.
After configure -f profile.json, you still prompt for SSL skip unless -k. When stdin is not a TTY, input() reads the next line of stdin (often empty). Document that non-interactive flows should pass -k when needed, or skip the prompt when not sys.stdin.isatty() and default to secure.
| if os.name == "nt": | ||
| self.skipTest("POSIX file modes") | ||
| sections = {"default": {"endpoint": "https://h", "token": "t"}} | ||
| cli._write_cfg(sections) |
There was a problem hiding this comment.
import cli assumes tests run with cli/ on sys.path (CI uses cd cli). Running from repo root without PYTHONPATH can import the wrong package. A short note for contributors would help.
| @@ -0,0 +1,309 @@ | |||
| # delta-sharing CLI | |||
|
|
|||
| A command-line tool for interacting with [Delta Sharing](https://github.com/delta-io/delta-sharing) servers. Covers all endpoints in the [protocol spec](https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md). **Zero third-party dependencies** — Python 3.8+ and the standard library only. | |||
There was a problem hiding this comment.
Says the tool covers all protocol endpoints. If optional flows (vendor URL prefixes, full EndStreamAction / refresh continuation, etc.) are intentionally unsupported, one sentence would avoid over-promise.
| conda activate test-environment | ||
| ./python/dev/lint-python | ||
| ./python/dev/pytest | ||
| python -m pip install -e ./cli |
There was a problem hiding this comment.
CLI tests ride on the full conda + connector job. A tiny standalone setup-python job could isolate CLI-only failures and mirror how many users install the tool.
| val = _mask_token(v) if k.lower() == "authorization" else v | ||
| _debug(f"> {k}: {val}") | ||
| if data: | ||
| _debug(f"> Body: {data.decode()}") |
There was a problem hiding this comment.
Verbose path uses data.decode() without errors=replace. Non-UTF8 JSON body will raise here while logging; consider matching the lenient decode used for HTTP error bodies.
| url += ("&" if "?" in url else "?") + urllib.parse.urlencode(qs) | ||
|
|
||
| _, _, body = _request("GET", url, token, extra_headers=extra_headers) | ||
| data = json.loads(body) |
There was a problem hiding this comment.
json.loads(body) on list responses will raise JSONDecodeError on unexpected HTML/plain-text error pages (e.g. proxy/gateway). Could catch and route through _die with a short hint to use -V or check endpoint URL.
Summary
protocol
INI-style named profiles for multi-environment use
Motivation
Delta Sharing is an open protocol, but there is no official CLI in the ecosystem. The existing
delta-sharingPython package is a library that requires writingcode and importing pandas or Spark. For anyone who wants to quickly explore shared data, debug server behavior, or script against the API, the only option today
is hand-crafting curl commands — with long URLs, manual auth headers, and ndjson responses that break standard JSON tools.
Every major open protocol has a CLI (HTTP has curl, S3 has
aws s3, gRPC hasgrpcurl). Delta Sharing should too.What's included
CLI (
cli.py) — single-file, pure Python 3.6+ stdlib. No external dependencies.delta-sharing shares listGET /sharesdelta-sharing shares getGET /shares/{share}delta-sharing schemas listGET /shares/{share}/schemasdelta-sharing tables listGET /shares/{share}/schemas/{schema}/tablesdelta-sharing tables list-allGET /shares/{share}/all-tablesdelta-sharing tables versionGET .../tables/{table}/versiondelta-sharing tables metadataGET .../tables/{table}/metadatadelta-sharing tables queryPOST .../tables/{table}/querydelta-sharing tables changesGET .../tables/{table}/changesdelta-sharing tables credentialsPOST .../temporary-table-credentialsdelta-sharing configuredelta-sharing profilesKey features:
delta-sharing configure -f profile.json) — hand someone a profile, they're connected in one command~/.delta-sharing.cfg(INI format) with-P <name>switching-a) for all list endpoints-V) — prints full request/response to stderr with tokens masked, for debugging and bug reports-k) for testing environments, configurable per-profileTests (
test_cli.py) — 75 unit tests covering URL building, config read/write, profile loading, connection resolution, argument parsing for every command andflag.
Packaging (
setup.py) —pip install -e .registers thedelta-sharingconsole script.Example: before and after
Before (curl):
After (CLI):
delta-sharing tables query myshare default mytable \ -l 100 -v 5 --predicate-hints "region = 'us-east-1'" \ --response-format delta --reader-features deletionvectorsTest plan