fix cte refs for errored paths, and dml by Light2Dark · Pull Request #9188 · marimo-team/marimo

Light2Dark · 2026-04-14T10:37:48Z

📝 Summary

Closes #9168.

On the happy path, find_sql_refs correctly avoids returning CTEs as refs. The trouble is the unhappy paths.
In the query from the linked issue, there is an error because of duplicate aliases (two JOINed tables have the same alias, and one of them is a CTE).

The fallback walked through all table nodes and returned them as refs, including table nodes that reference CTEs. The fix is to collect the CTE names and remove them from the refs. Only unqualified names (no schema, no catalog) are removed, because a reference to a CTE is always unqualified.
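
As a rough illustration of the fix described above, the sketch below filters CTE names out of a flat list of table references. It is a toy model and not marimo's implementation: `TableRef`, `drop_cte_refs`, and the sample table names are hypothetical, whereas the real code works on sqlglot `exp.Table` nodes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Hypothetical stand-in for a parsed table reference."""

    table: str
    schema: str | None = None
    catalog: str | None = None


def drop_cte_refs(refs: list[TableRef], cte_names: set[str]) -> set[TableRef]:
    """Keep every ref except bare (unqualified) names that match a CTE."""
    lowered = {name.lower() for name in cte_names}
    kept: set[TableRef] = set()
    for ref in refs:
        is_unqualified_cte = (
            ref.table.lower() in lowered
            and ref.schema is None
            and ref.catalog is None
        )
        if not is_unqualified_cte:
            kept.add(ref)
    return kept


# Roughly models: WITH orders AS (...) SELECT ... FROM orders JOIN users ...
# A naive walk over all table nodes would report both `orders` and `users`.
walked = [TableRef("orders"), TableRef("users")]
real_tables = drop_cte_refs(walked, cte_names={"orders"})
```

In this model only `users` survives the filter, which is the behavior the fallback path is meant to have once CTE names are removed.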

📋 Pre-Review Checklist

For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
Video or media evidence is provided for any visual changes (optional).

✅ Merge Checklist

I have read the contributor guidelines.
Documentation has been updated where applicable, including docstrings for API changes.
Tests have been added for the changes made.

vercel · 2026-04-14T10:37:54Z

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Apr 14, 2026 2:27pm

Copilot

Pull request overview

Fixes find_sql_refs so CTE names are not incorrectly returned as table dependencies when sqlglot scope building fails (e.g., due to duplicate aliases) and for DML statements.

Changes:

Add a fallback collector that filters out unqualified CTE references when walking exp.Table nodes.
Apply the CTE-filtering collector in both DML handling and the OptimizeError fallback path.
Add regression/edge-case tests for duplicate join aliases with CTEs, schema-qualified name collisions, and DML with CTEs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`marimo/_ast/sql_visitor.py`	Updates fallback table-ref extraction to exclude unqualified CTE names during `OptimizeError`/DML paths.
`tests/_ast/test_sql_visitor.py`	Adds regression and edge-case coverage ensuring CTEs are not reported as dependencies across error/DML scenarios.
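
The edge cases named in this review (schema-qualified name collisions and mixed-case names) can be pictured as a small table of expectations. This is only a sketch: `is_unqualified_cte` here is a hypothetical standalone function mirroring the predicate in the diff, and the table names are made up, not taken from the actual test file.

```python
from __future__ import annotations


def is_unqualified_cte(
    table: str,
    schema: str | None,
    catalog: str | None,
    cte_names: set[str],
) -> bool:
    """True only for a bare table name that matches a CTE, ignoring case."""
    return (
        table.lower() in {name.lower() for name in cte_names}
        and schema is None
        and catalog is None
    )


ctes = {"Sales"}

# (table, schema, catalog) -> should the ref be dropped as a CTE reference?
cases = {
    ("sales", None, None): True,  # bare name, differs from the CTE only by case
    ("sales", "analytics", None): False,  # schema-qualified: a real table
    ("sales", "analytics", "prod"): False,  # fully qualified: a real table
    ("customers", None, None): False,  # no CTE with this name
}

results = {key: is_unqualified_cte(*key, ctes) for key in cases}
```

Each key in `cases` pairs a reference with the outcome the filter is expected to produce; `results` should match it entry for entry.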

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Copilot · 2026-04-14T13:29:36Z

+        """Walk all Table nodes, filtering out unqualified CTE references.
+
+        find_all(exp.Table) doesn't understand CTE scope, so bare
+        references to CTE names would be misidentified as real tables.
+
+        We only collect CTEs from the statement-level WITH clause
+        (expression.args["with_"]) rather than traversing into nested
+        subqueries, because a subquery's CTE is scoped to that subquery
+        and must not mask a real table with the same name in the outer
+        query. Schema-qualified refs (e.g. schema.foo) are always real
+        tables even if a CTE shares the same base name.
+        """
+        if expression is None:
+            return
+
+        cte_names: set[str] = set()
+        with_clause = expression.args.get("with_")
+        if with_clause:
+            for cte in with_clause.expressions:
+                alias = cte.alias
+                if alias:
+                    cte_names.add(alias.lower())
+        for table in expression.find_all(exp.Table):
+            if ref := get_ref_from_table(table):
+                is_unqualified_cte = (
+                    ref.table.lower() in cte_names
+                    and ref.schema is None
+                    and ref.catalog is None
+                )
+                if not is_unqualified_cte:
+                    refs.add(ref)
+


_collect_table_refs_excluding_ctes only collects CTE names from the statement-level WITH clause. If an OptimizeError forces this fallback on a query that contains a nested subquery WITH (CTE scoped to that subquery), unqualified references to that nested CTE will still be returned as table refs, reintroducing false dependencies. Consider making the fallback traversal scope-aware by tracking active CTE names while recursively walking the AST (push CTE names when entering an expression with a WITH clause, and only filter matching Table nodes within that subtree).
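
To make the difference concrete, here is a minimal sketch contrasting a statement-level filter with a scope-aware one. It uses a hypothetical `Node` class as a stand-in for an AST, not sqlglot expressions, and it does not model the bodies of the CTE definitions themselves. The names `statement_level_refs` and `scope_aware_refs` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical query node: its CTE names, bare tables it reads, subqueries."""

    ctes: set[str] = field(default_factory=set)
    tables: list[str] = field(default_factory=list)
    children: list[Node] = field(default_factory=list)


def all_tables(node: Node) -> list[str]:
    """Every table name in the tree, ignoring scope (like a flat walk)."""
    found = list(node.tables)
    for child in node.children:
        found.extend(all_tables(child))
    return found


def statement_level_refs(root: Node) -> set[str]:
    """Filter using only the outermost WITH clause."""
    top = {name.lower() for name in root.ctes}
    return {t for t in all_tables(root) if t.lower() not in top}


def scope_aware_refs(node: Node, active: frozenset[str] = frozenset()) -> set[str]:
    """Filter using the CTE names visible at each node's position."""
    visible = active | {name.lower() for name in node.ctes}
    refs = {t for t in node.tables if t.lower() not in visible}
    for child in node.children:
        refs |= scope_aware_refs(child, visible)
    return refs


# Roughly models:
#   SELECT * FROM users JOIN (WITH recent AS (...) SELECT * FROM recent) ...
inner = Node(ctes={"recent"}, tables=["recent"])
outer = Node(tables=["users"], children=[inner])

flat = statement_level_refs(outer)  # still contains "recent"
scoped = scope_aware_refs(outer)  # only "users"
```

Because each node copies the visible names before adding its own, a CTE declared in a subquery never masks a same-named table outside that subquery, which is the concern raised in the docstring under review.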

Suggested change

-        """Walk all Table nodes, filtering out unqualified CTE references.
-
-        find_all(exp.Table) doesn't understand CTE scope, so bare
-        references to CTE names would be misidentified as real tables.
-
-        We only collect CTEs from the statement-level WITH clause
-        (expression.args["with_"]) rather than traversing into nested
-        subqueries, because a subquery's CTE is scoped to that subquery
-        and must not mask a real table with the same name in the outer
-        query. Schema-qualified refs (e.g. schema.foo) are always real
-        tables even if a CTE shares the same base name.
-        """
-        if expression is None:
-            return
-
-        cte_names: set[str] = set()
-        with_clause = expression.args.get("with_")
-        if with_clause:
-            for cte in with_clause.expressions:
-                alias = cte.alias
-                if alias:
-                    cte_names.add(alias.lower())
-        for table in expression.find_all(exp.Table):
-            if ref := get_ref_from_table(table):
-                is_unqualified_cte = (
-                    ref.table.lower() in cte_names
-                    and ref.schema is None
-                    and ref.catalog is None
-                )
-                if not is_unqualified_cte:
-                    refs.add(ref)
+        """Walk Table nodes, filtering out unqualified CTE references.
+
+        ``find_all(exp.Table)`` doesn't understand CTE scope, so bare
+        references to CTE names can be misidentified as real tables.
+
+        Track active CTE names while recursively traversing the AST so
+        nested subqueries with their own WITH clauses only mask matching
+        unqualified table refs within that subtree. Schema-qualified refs
+        (e.g. schema.foo) are always treated as real tables even if a CTE
+        shares the same base name.
+        """
+        if expression is None:
+            return
+
+        def _walk(
+            node: exp.Expression | None,
+            active_cte_names: set[str],
+        ) -> None:
+            if node is None:
+                return
+
+            scoped_cte_names = active_cte_names
+            with_clause = node.args.get("with_")
+            if with_clause:
+                scoped_cte_names = set(active_cte_names)
+                for cte in with_clause.expressions:
+                    alias = cte.alias
+                    if alias:
+                        scoped_cte_names.add(alias.lower())
+
+            if isinstance(node, exp.Table):
+                if ref := get_ref_from_table(node):
+                    is_unqualified_cte = (
+                        ref.table.lower() in scoped_cte_names
+                        and ref.schema is None
+                        and ref.catalog is None
+                    )
+                    if not is_unqualified_cte:
+                        refs.add(ref)
+                return
+
+            for child in node.iter_expressions():
+                _walk(child, scoped_cte_names)
+
+        _walk(expression, set())

Light2Dark · 2026-04-14

It doesn't handle complex cases, but I think it's sufficient for now.

fix cte refs for errored paths, and dml

ef8a622

Light2Dark added the "bug" label (Something isn't working) Apr 14, 2026

Light2Dark requested a review from Copilot April 14, 2026 10:44

Copilot started reviewing on behalf of Light2Dark April 14, 2026 10:44

Copilot AI reviewed Apr 14, 2026

Comment thread marimo/_ast/sql_visitor.py Outdated

handle mixed case

3c92c25

vercel Bot deployed to Preview April 14, 2026 10:51

only find for with statements

0e8a794

vercel Bot deployed to Preview April 14, 2026 13:16

fix type errors

ea97747

vercel Bot deployed to Preview April 14, 2026 13:24

Light2Dark requested a review from Copilot April 14, 2026 13:24

Copilot started reviewing on behalf of Light2Dark April 14, 2026 13:25

Copilot AI reviewed Apr 14, 2026

undo type change

99d21ca

vercel Bot deployed to Preview April 14, 2026 13:32

satisfy mypy

ddca887

vercel Bot deployed to Preview April 14, 2026 13:45

satisfy old sqlglot

7e6811d

vercel Bot deployed to Preview April 14, 2026 14:27

Light2Dark marked this pull request as ready for review April 14, 2026 17:02

mscolnick approved these changes Apr 15, 2026

Light2Dark merged commit 20f869d into main Apr 15, 2026
45 checks passed

Light2Dark deleted the sham/fix-cte-refs branch April 15, 2026 15:36

Light2Dark merged 7 commits into main from sham/fix-cte-refs
