Commit bf9e0dd

Add Tier 4 export improvements and Tier 5 metadata/annotations
Tier 4 (export/roundtrip):
- Export segments as source-level where: clauses
- Export renames as rename: (detect simple identifier dimensions)
- Export full join on conditions from relationship metadata
- join_cross: already handled in prior commit

Tier 5 (metadata/annotations):
- Non-description tags (# line_chart, # percent, etc.) stored in dimension/measure/model metadata["tags"]
- #@ persist annotations stored in Model.metadata["persist"]
- timezone: statements stored in Model.metadata["timezone"]
- Standalone # annotations in extend blocks stored as model tags
- declare: field declarations processed as dimensions in + syntax
- _parse_annotations_full returns both description and tag list
1 parent 80d593a commit bf9e0dd

2 files changed

Lines changed: 131 additions & 43 deletions

docs/compatibility/malloy.md

Lines changed: 10 additions & 7 deletions
@@ -115,10 +115,11 @@ Not mapped: `access` modifiers (`public`, `private`, `internal`), `order_by:` wi
 | `# description: value` tag annotation | Supported (extracted as `description`) |
 | Multiple `##` lines on one entity | Supported (joined with spaces) |
 | Statement-level `#` tags (before `source:`) | Supported (applied as source description if the source itself has none) |
-| `# tag_name` (non-description tags) | Partial support: parsed without error but only `desc:` and `description:` prefixed tags are extracted. Other tags are discarded. |
-| `#@ persist` and `#@ persist name=...` | Unsupported (parsed by the grammar but not recognized by the visitor). |
+| `# tag_name` (non-description tags) | Supported (stored in `metadata["tags"]` on dimensions, measures, and models; includes `line_chart`, `bar_chart`, `percent`, `currency`, etc.) |
+| `#@ persist` and `#@ persist name=...` | Supported (stored in `Model.metadata["persist"]` and `metadata["persist_name"]`) |
+| Standalone `#` annotations in extend blocks | Supported (stored in `Model.metadata["tags"]` via `DefExploreAnnotationContext`) |

-Not mapped: visualization hint tags (`# line_chart`, `# bar_chart`, `# list_detail`, `# shape_map`, `# percent`, `# currency`, `# number`), `--! styles` directives, `##! experimental` pragmas.
+Not mapped: `--! styles` directives, `##! experimental` pragmas.

 ---

@@ -163,8 +164,8 @@ Not mapped: join `type` (`left`, `right`, `full`, `inner`).
 | `where: condition` in source extend block | Supported (mapped to `Segment`) |
 | Multiple filter conditions (comma-separated) | Supported (each becomes a separate segment) |
 | Filter expressions with comparisons, `and`, `or` | Supported (expression preserved as-is) |
-| Malloy partial application (`field ? pick ... when ...`) | Partial support: expression text is captured but the `?` operator is not evaluated. |
-| Malloy value matching (`field ? 'a' \| 'b'`) | Partial support: expression preserved as-is, not converted to SQL `IN`. |
+| Malloy partial application (`field ? pick ... when ...`) | Supported in dimension context (expanded to CASE); partial in filter context |
+| Malloy value matching (`field ? 'a' \| 'b'`) | Supported (transformed to `field IN ('a', 'b')`) |

 Segment naming: first filter is named `default_filter`, subsequent filters are named `default_filter_1`, `default_filter_2`, etc.
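
The value-matching row in this hunk describes rewriting `field ? 'a' | 'b'` as `field IN ('a', 'b')`. The code that performs the rewrite is not among the hunks shown on this page, so the following is only a minimal Python sketch of the idea. The helper name `value_match_to_in` and its regular expression are hypothetical, and the sketch handles nothing beyond single-quoted string alternatives.

```python
import re

# Hypothetical helper, not taken from sidemantic: rewrites a Malloy
# value-match expression (field ? 'a' | 'b') into a SQL IN predicate.
_VALUE_MATCH = re.compile(
    r"^\s*([\w.]+)\s*\?\s*('[^']*'(?:\s*\|\s*'[^']*')*)\s*$"
)


def value_match_to_in(expr: str) -> str:
    match = _VALUE_MATCH.match(expr)
    if match is None:
        # Anything that is not a plain list of quoted alternatives is left as-is.
        return expr
    field, alternatives = match.group(1), match.group(2)
    values = re.findall(r"'[^']*'", alternatives)
    return f"{field} IN ({', '.join(values)})"
```

Expressions that do not fit the pattern, such as `amount > 10`, pass through unchanged, which mirrors the "expression preserved as-is" behaviour the table lists for ordinary filters.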

@@ -300,16 +301,18 @@ Sidemantic can export its semantic model back to Malloy format.
 | Ratio metrics | Supported (exported as `numerator / denominator`) |
 | `primary_key:` | Supported (exported when not the default `id`) |
 | `join_one:` / `join_many:` with `with` clause | Supported |
+| `join_one:` / `join_many:` with `on` condition | Supported (full `on` condition exported from `metadata["on_condition"]` when available) |
+| `where:` (segments) | Supported (source-level where clauses exported) |
 | Roundtrip fidelity (parse -> export -> re-parse) | Supported (semantically equivalent graphs; passthrough dimensions intentionally dropped) |
 | `join_cross:` export | Supported (one_to_one relationships exported as `join_cross:`) |
-| `rename:` export | Partial support: renames are captured as dimensions during parsing; exported as `dimension:` not `rename:` |
+| `rename:` export | Supported (simple identifier dimensions detected and exported as `rename: new is old`) |
 | `view:` export | Unsupported (views are not captured during parsing) |

 ---

 ## Experimental and Advanced Features

-Unsupported. `##! experimental{...}` pragma annotations, `compose()` for composite sources, `timezone:` statements, `sample:` specifications, and `declare:` field declarations are all parsed by the grammar without error but not processed by the visitor.
+Partially supported. `timezone:` statements are stored in `Model.metadata["timezone"]`. `declare:` field declarations are processed as dimensions in old `+` syntax blocks. `compose()` sources process the first composed source. `##! experimental{...}` pragma annotations and `sample:` specifications are parsed by the grammar without error but not processed by the visitor.

 ---

sidemantic/adapters/malloy.py

Lines changed: 121 additions & 36 deletions
@@ -60,42 +60,46 @@ def _reset_current(self):
         self.current_metrics = []
         self.current_relationships = []
         self.current_segments = []
+        self._timezone = None
+        self._model_tags = []
+        self._accept_fields = []
+        self._except_fields = []

     def _parse_annotations(self, tags_ctx) -> str | None:
-        """Parse annotations from tags context.
+        """Parse annotations from tags context, returning description text.

-        Malloy annotations:
-        - ## Description text -> description (DOC_ANNOTATION)
-        - # key: value -> metadata (ANNOTATION)
+        Also stores non-description tags via _parse_annotations_full.
+        """
+        if tags_ctx is None:
+            return None
+        desc, _ = self._parse_annotations_full(tags_ctx)
+        return desc

-        We extract:
-        - Any ## text as description
-        - # desc: value as description
+    def _parse_annotations_full(self, tags_ctx) -> tuple[str | None, list[str]]:
+        """Parse annotations from tags context.

-        Returns the description text if found.
+        Returns (description, tags) where tags is a list of non-description
+        tag strings like "line_chart", "percent", "currency", etc.
         """
         if tags_ctx is None:
-            return None
+            return None, []

         descriptions = []
+        tags = []

-        # Iterate through ANNOTATION tokens
         for i in range(tags_ctx.getChildCount()):
             child = tags_ctx.getChild(i)
             if child is not None:
                 text = child.getText()

                 # ## is a doc annotation (description)
                 if text.startswith("##"):
-                    # Strip ## and whitespace
                     desc = text[2:].strip()
                     if desc:
                         descriptions.append(desc)
                 # # is a tag annotation
                 elif text.startswith("#"):
-                    # Strip # and check for desc: or description:
                     tag_text = text[1:].strip()
-                    # Common patterns: desc: value, description: value
                     if tag_text.lower().startswith("desc:"):
                         desc = tag_text[5:].strip()
                         if desc:
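
The description/tag split that `_parse_annotations_full` introduces can be tried out on plain strings, without an ANTLR parse tree. The sketch below is a standalone approximation: `split_annotations` is a hypothetical name, it takes a list of raw annotation lines instead of a tags context, and the `description:` branch is reconstructed from the compatibility table because the diff elides those lines.

```python
from __future__ import annotations


def split_annotations(lines: list[str]) -> tuple[str | None, list[str]]:
    """Split raw annotation strings into (description, tags).

    Hypothetical stand-in for the visitor method: it works on plain
    strings rather than on an ANTLR tags context.
    """
    descriptions: list[str] = []
    tags: list[str] = []
    for text in lines:
        if text.startswith("##"):
            # Doc annotation: everything after ## is description text.
            desc = text[2:].strip()
            if desc:
                descriptions.append(desc)
        elif text.startswith("#"):
            tag_text = text[1:].strip()
            lowered = tag_text.lower()
            if lowered.startswith("desc:"):
                desc = tag_text[5:].strip()
                if desc:
                    descriptions.append(desc)
            elif lowered.startswith("description:"):
                desc = tag_text[12:].strip()
                if desc:
                    descriptions.append(desc)
            elif tag_text:
                # Any other tag (line_chart, percent, @ persist ...) is kept.
                tags.append(tag_text)
    return (" ".join(descriptions) if descriptions else None), tags
```

Note that a `#@ persist` line reaches the tag list as `@ persist ...`, since only the leading `#` is stripped; the persist handling in `visitDefineSourceStatement` relies on that form.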
@@ -104,10 +108,11 @@ def _parse_annotations(self, tags_ctx) -> str | None:
                         desc = tag_text[12:].strip()
                         if desc:
                             descriptions.append(desc)
+                    elif tag_text:
+                        tags.append(tag_text)

-        if descriptions:
-            return " ".join(descriptions)
-        return None
+        desc = " ".join(descriptions) if descriptions else None
+        return desc, tags

     def visitImportStatement(self, ctx: MalloyParser.ImportStatementContext):  # noqa: N802
         """Visit import statement and extract dependencies.
@@ -499,7 +504,20 @@ def visitDefineSourceStatement(self, ctx: MalloyParser.DefineSourceStatementCont
         # These apply to all sources in the statement if there's only one,
         # or can be overridden by source-specific tags
         stmt_tags = ctx.tags()
-        stmt_description = self._parse_annotations(stmt_tags) if stmt_tags else None
+        stmt_description = None
+        stmt_persist = None
+        if stmt_tags:
+            stmt_description, stmt_tag_list = self._parse_annotations_full(stmt_tags)
+            # Check for #@ persist annotations
+            for tag in stmt_tag_list:
+                if tag.startswith("@ persist") or tag.startswith("@persist"):
+                    persist_text = tag[len("@ persist") :] if tag.startswith("@ persist") else tag[len("@persist") :]
+                    persist_text = persist_text.strip()
+                    stmt_persist = {"persist": True}
+                    # Parse name=value
+                    name_match = re.match(r"name\s*=\s*(\S+)", persist_text)
+                    if name_match:
+                        stmt_persist["persist_name"] = name_match.group(1)

         # Get source definitions
         source_list = ctx.sourcePropertyList()
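
The `#@ persist` handling in this hunk reduces to a prefix check followed by an optional `name=value` match. A minimal standalone sketch follows; `parse_persist` is a hypothetical function name, and it assumes the tag string has already had its leading `#` removed, as `_parse_annotations_full` does.

```python
from __future__ import annotations

import re


def parse_persist(tag: str) -> dict | None:
    """Return persist metadata for a tag string, or None for other tags.

    `tag` is the annotation text without the leading '#', so
    '#@ persist name=foo' arrives here as '@ persist name=foo'.
    """
    for prefix in ("@ persist", "@persist"):
        if tag.startswith(prefix):
            rest = tag[len(prefix):].strip()
            result: dict = {"persist": True}
            # Optional name=value suffix; the value runs to the next whitespace.
            match = re.match(r"name\s*=\s*(\S+)", rest)
            if match:
                result["persist_name"] = match.group(1)
            return result
    return None
```

Because the check is a plain `startswith`, a tag such as `@ persistent` would also be treated as a persist annotation. Whether that matters depends on which `#@` annotations appear in practice.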
@@ -510,14 +528,19 @@ def visitDefineSourceStatement(self, ctx: MalloyParser.DefineSourceStatementCont
                 self._process_source_definition(source_def)

                 # If source has no description but statement does, use statement description
-                # (only for single-source statements, or as fallback)
                 if self.current_description is None and stmt_description is not None:
                     self.current_description = stmt_description

                 if self.current_model_name:
                     metadata = {}
                     if self.current_connection:
                         metadata["connection"] = self.current_connection
+                    if stmt_persist:
+                        metadata.update(stmt_persist)
+                    if self._timezone:
+                        metadata["timezone"] = self._timezone
+                    if self._model_tags:
+                        metadata["tags"] = self._model_tags
                     model = Model(
                         name=self.current_model_name,
                         table=self.current_table,
@@ -715,6 +738,12 @@ def _process_query_properties_as_explore(self, ctx):
                 where_stmt = stmt.whereStatement()
                 if where_stmt:
                     self._process_where_as_segment(where_stmt)
+            elif isinstance(stmt, MalloyParser.DeclareStatementContext):
+                # declare: creates fields accessible within the source
+                def_list = stmt.defList()
+                if def_list:
+                    for field_def in def_list.fieldDef():
+                        self._process_dimension_def(field_def)

     def _process_explore_statement(self, ctx: MalloyParser.ExploreStatementContext):
         """Process a single statement in explore properties."""
@@ -775,6 +804,34 @@ def _process_explore_statement(self, ctx: MalloyParser.ExploreStatementContext):
                 self._accept_fields.extend(field_names)
             return

+        # Timezone statement: timezone: 'US/Pacific'
+        if isinstance(ctx, MalloyParser.DefExploreTimezoneContext):
+            tz_stmt = ctx.timezoneStatement()
+            if tz_stmt:
+                tz_string = tz_stmt.string()
+                if tz_string:
+                    tz_value = self._extract_string(self._get_text(tz_string))
+                    if not hasattr(self, "_timezone"):
+                        self._timezone = None
+                    self._timezone = tz_value
+            return
+
+        # Standalone annotations in extend blocks
+        if isinstance(ctx, MalloyParser.DefExploreAnnotationContext):
+            # These are # tag annotations not attached to a field
+            # Store as model-level tags
+            for i in range(ctx.getChildCount()):
+                child = ctx.getChild(i)
+                if child is not None:
+                    text = child.getText()
+                    if text.startswith("#"):
+                        tag_text = text[1:].strip()
+                        if tag_text:
+                            if not hasattr(self, "_model_tags"):
+                                self._model_tags = []
+                            self._model_tags.append(tag_text)
+            return
+
         # Rename statements: rename: new_name is old_name
         if isinstance(ctx, MalloyParser.DefExploreRenameContext):
             rename_list = ctx.renameList()
@@ -812,8 +869,13 @@ def _process_dimension_def(self, ctx: MalloyParser.FieldDefContext):
         name = self._get_text(name_def)

         # Get annotations from tags
-        tags = ctx.tags()
-        description = self._parse_annotations(tags) if tags else None
+        tags_ctx = ctx.tags()
+        description = None
+        dim_metadata = None
+        if tags_ctx:
+            description, tag_list = self._parse_annotations_full(tags_ctx)
+            if tag_list:
+                dim_metadata = {"tags": tag_list}

         # Get the expression
         field_expr = ctx.fieldExpr()
@@ -848,6 +910,7 @@ def _process_dimension_def(self, ctx: MalloyParser.FieldDefContext):
                 sql=sql,
                 granularity=granularity,
                 description=description,
+                metadata=dim_metadata,
             )
         )

@@ -869,8 +932,13 @@ def _process_measure_def(self, ctx: MalloyParser.FieldDefContext):
         name = self._get_text(name_def)

         # Get annotations from tags
-        tags = ctx.tags()
-        description = self._parse_annotations(tags) if tags else None
+        tags_ctx = ctx.tags()
+        description = None
+        measure_tags = None
+        if tags_ctx:
+            description, tag_list = self._parse_annotations_full(tags_ctx)
+            if tag_list:
+                measure_tags = tag_list

         # Get the expression
         field_expr = ctx.fieldExpr()
@@ -942,9 +1010,11 @@ def _process_measure_def(self, ctx: MalloyParser.FieldDefContext):
         else:
             metric_type = "derived"

-        metric_metadata = None
+        metric_metadata = {}
         if measure_granularity:
-            metric_metadata = {"granularity": measure_granularity}
+            metric_metadata["granularity"] = measure_granularity
+        if measure_tags:
+            metric_metadata["tags"] = measure_tags

         self.current_metrics.append(
             Metric(
@@ -954,7 +1024,7 @@ def _process_measure_def(self, ctx: MalloyParser.FieldDefContext):
                 sql=sql,
                 filters=filters,
                 description=description,
-                metadata=metric_metadata,
+                metadata=metric_metadata if metric_metadata else None,
             )
         )

@@ -1314,7 +1384,7 @@ def _export_source(self, model: Model) -> list[str]:
         """
         lines = []

-        # Model description as annotation
+        # Model description as tag annotation
         if model.description:
             lines.append(f"# desc: {model.description}")

@@ -1331,10 +1401,13 @@ def _export_source(self, model: Model) -> list[str]:
         if model.primary_key and model.primary_key != "id":
             lines.append(f" primary_key: {model.primary_key}")

-        # Dimensions
-        # Skip dimensions that match the primary key (Malloy auto-exposes the PK column).
-        # Skip passthrough dimensions (sql == name) since Malloy auto-exposes underlying
-        # table columns. Only export dimensions with actual transformations.
+        # Segments (source-level where clauses) - Tier 4.1
+        if model.segments:
+            for segment in model.segments:
+                lines.append(f" where: {self._strip_model_prefix(segment.sql)}")
+
+        # Separate renames from computed dimensions for proper export
+        renames_to_export: list[tuple[str, str]] = []
         dims_to_export: list[tuple[Dimension, str]] = []
         for dim in model.dimensions:
             if dim.name == model.primary_key:
@@ -1343,23 +1416,32 @@ def _export_source(self, model: Model) -> list[str]:
             # Skip passthrough dimensions - Malloy auto-exposes table columns
             if sql == dim.name:
                 continue
-            dims_to_export.append((dim, sql))
+            # Tier 4.5: detect renames (simple identifier, no operators/functions)
+            if re.match(r"^[`\w]+$", sql) and sql != dim.name:
+                renames_to_export.append((dim.name, sql))
+            else:
+                dims_to_export.append((dim, sql))
+
+        # Export renames
+        if renames_to_export:
+            lines.append("")
+            lines.append(" rename:")
+            for new_name, old_name in renames_to_export:
+                lines.append(f" {new_name} is {old_name}")

+        # Export computed dimensions
         if dims_to_export:
             lines.append("")
             lines.append(" dimension:")
             for dim, sql in dims_to_export:
                 if dim.description:
                     lines.append(f" # desc: {dim.description}")
-                # For time dimensions with granularity, use Malloy's time truncation syntax
-                # But only if the SQL doesn't already contain a truncation function
                 if dim.type == "time" and dim.granularity:
                     sql_lower = sql.lower()
                     already_has_truncation = (
                         "date_trunc" in sql_lower or "::date" in sql_lower or sql_lower.endswith(f".{dim.granularity}")
                     )
                     if not already_has_truncation:
-                        # Append Malloy time accessor: .second, .minute, .hour, .day, .week, .month, .quarter, .year
                         lines.append(f" {dim.name} is {sql}.{dim.granularity}")
                     else:
                         lines.append(f" {dim.name} is {sql}")
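
The rename detection in this hunk hinges on one regular expression. The sketch below isolates that classification so it can be tested without a `Model`. `partition_dimensions` is a hypothetical helper that takes `(name, sql)` pairs, where `sql` is assumed to have had any model prefix stripped already.

```python
from __future__ import annotations

import re

# Same pattern as in the hunk: only word characters and backticks.
SIMPLE_IDENTIFIER = re.compile(r"^[`\w]+$")


def partition_dimensions(
    dims: list[tuple[str, str]],
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Split (name, sql) pairs into (renames, computed dimensions)."""
    renames: list[tuple[str, str]] = []
    computed: list[tuple[str, str]] = []
    for name, sql in dims:
        if sql == name:
            continue  # passthrough: Malloy exposes the column already
        if SIMPLE_IDENTIFIER.match(sql):
            renames.append((name, sql))  # exported as "rename: name is sql"
        else:
            computed.append((name, sql))  # exported under "dimension:"
    return renames, computed
```

One consequence worth checking in review: a dimension whose SQL is a bare literal such as `1` or `true` also matches `^[`\w]+$`, so it would be exported under `rename:` and not under `dimension:`.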
@@ -1376,7 +1458,7 @@ def _export_source(self, model: Model) -> list[str]:
                 measure_expr = self._format_measure(metric)
                 lines.append(f" {metric.name} is {measure_expr}")

-        # Joins
+        # Joins - Tier 4.4: use on condition from metadata when available
         for rel in model.relationships:
             lines.append("")
             if rel.type == "many_to_one":
@@ -1385,7 +1467,10 @@ def _export_source(self, model: Model) -> list[str]:
                 join_type = "join_cross"
             else:
                 join_type = "join_many"
-            if rel.foreign_key:
+            on_condition = (rel.metadata or {}).get("on_condition")
+            if on_condition:
+                lines.append(f" {join_type}: {rel.name} on {on_condition}")
+            elif rel.foreign_key:
                 lines.append(f" {join_type}: {rel.name} with {rel.foreign_key}")
             else:
                 lines.append(f" {join_type}: {rel.name}")
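
The join export now prefers a stored `on` condition, falls back to `with <foreign_key>`, and otherwise emits a bare join. A minimal sketch of that precedence as a pure function follows. `format_join` is a hypothetical name; the `many_to_one` to `join_one` mapping sits outside the hunk shown, so it is an assumption based on the compatibility table, as is the `one_to_one` condition guarding `join_cross`.

```python
from __future__ import annotations


def format_join(
    name: str,
    rel_type: str,
    foreign_key: str | None = None,
    on_condition: str | None = None,
) -> str:
    """Render one Malloy join line for a relationship (illustrative only)."""
    if rel_type == "many_to_one":
        join_type = "join_one"  # assumption: this branch is not visible in the diff
    elif rel_type == "one_to_one":
        join_type = "join_cross"  # per the docs table: one_to_one -> join_cross
    else:
        join_type = "join_many"

    # A stored on-condition wins over the foreign key.
    if on_condition:
        return f"{join_type}: {name} on {on_condition}"
    if foreign_key:
        return f"{join_type}: {name} with {foreign_key}"
    return f"{join_type}: {name}"
```

For example, `format_join("items", "one_to_many", foreign_key="order_id", on_condition="items.order_id = id")` yields the `on` form, which shows that the foreign key is ignored once an `on_condition` is present.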
