fix: clear error state on disabled-transitively cells when ancestor recovers by VishakBaddur · Pull Request #8784 · marimo-team/marimo

VishakBaddur · 2026-03-20T01:23:07Z

Root Cause

When a disabled-transitively cell's ancestor had an error and then recovered, the disabled cell permanently showed the ancestor's error state.

run_stale_cells() in runtime.py only re-queues non-disabled cells:

if cell_impl.stale and not self.graph.is_disabled(cid):
    cells_to_run.add(cid)

So disabled-transitively cells never got re-queued and never had a chance to reset their run_result_status from "exception" to "disabled".
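The gap described above can be reproduced with a small toy model. This is only a sketch: `ToyCell`, `is_disabled`, and `stale_cells_to_run` are hypothetical stand-ins written for illustration, not marimo's `CellImpl`, `DirectedGraph.is_disabled()`, or `Kernel.run_stale_cells()`. It assumes a cell counts as disabled when it or any ancestor is disabled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyCell:
    """Hypothetical stand-in for a cell; not marimo's CellImpl."""

    stale: bool = False
    disabled: bool = False  # the cell's own config flag
    run_result_status: str = "success"
    parents: list[str] = field(default_factory=list)


def is_disabled(cells: dict[str, ToyCell], cid: str) -> bool:
    """Return True if the cell or any of its ancestors is disabled."""
    stack, seen = [cid], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if cells[current].disabled:
            return True
        stack.extend(cells[current].parents)
    return False


def stale_cells_to_run(cells: dict[str, ToyCell]) -> set[str]:
    """Mirror the filter quoted above: stale and not disabled."""
    return {
        cid
        for cid, cell in cells.items()
        if cell.stale and not is_disabled(cells, cid)
    }


cells = {
    "x": ToyCell(stale=True),     # ancestor that errored earlier and is now fixed
    "d": ToyCell(disabled=True),  # disabled through its own config
    # "c" depends on both, so it is disabled transitively and still
    # carries the status it inherited while "x" was failing.
    "c": ToyCell(stale=True, run_result_status="exception", parents=["x", "d"]),
}
queued = stale_cells_to_run(cells)
```

Only `"x"` is queued here, so nothing ever revisits `"c"` and its `"exception"` status stays in place.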

Fix

- Added is_any_ancestor_errored() to DirectedGraph.
- In run_stale_cells(), after building cells_to_run, reset run_result_status to "disabled" for any disabled-transitively cell whose ancestor no longer has an error.

Testing

Added test_is_any_ancestor_errored to tests/_runtime/test_dataflow.py, verifying that the new graph method detects an errored ancestor and stops reporting it once the ancestor's status is no longer an error.

vercel · 2026-03-20T01:23:12Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Apr 17, 2026 8:33pm

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Copilot · 2026-03-30T20:31:01Z

+    def is_any_ancestor_errored(self, cell_id: CellId_t) -> bool:
+        """Check if any ancestor of a cell has an error."""
+        return any(
+            self.topology.cells[cid].run_result_status == "exception"


is_any_ancestor_errored() only treats run_result_status == "exception" as an error, but the runtime also uses other error-like statuses (e.g. "marimo-error" is set for semantic/registration errors). The method name/docstring says “has an error”, so this narrow check is likely to be reused incorrectly and can cause false negatives when an ancestor is still in an error state.

Consider either (a) broadening the predicate to include all statuses that should be treated as “errored” (at least "exception" and "marimo-error", possibly "interrupted" depending on intent), or (b) renaming the method and updating its docstring to make it explicit that it only checks for raised exceptions.

Suggested change

-            self.topology.cells[cid].run_result_status == "exception"
+            self.topology.cells[cid].run_result_status in ("exception", "marimo-error")
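Option (a) can be illustrated with a standalone sketch. The function below is not the PR's method: it takes plain dictionaries (`parents`, `status`) instead of a `DirectedGraph`, and the `ERROR_STATUSES` name is an assumption introduced here. Whether "interrupted" belongs in the set is left open, as in the comment above.

```python
from __future__ import annotations

# Assumed set of statuses to treat as "errored"; extend it if other
# statuses (for example "interrupted") should count as well.
ERROR_STATUSES = frozenset({"exception", "marimo-error"})


def is_any_ancestor_errored(
    parents: dict[str, list[str]],
    status: dict[str, str],
    cell_id: str,
) -> bool:
    """Return True if any ancestor of cell_id has an error-like status."""
    stack = list(parents.get(cell_id, []))
    seen: set[str] = set()
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        if status.get(cid) in ERROR_STATUSES:
            return True
        stack.extend(parents.get(cid, []))
    return False


# Chain 0 -> 1 -> 2, expressed as child -> parents.
parents = {"1": ["0"], "2": ["1"]}
status = {"0": "marimo-error", "1": "success", "2": "success"}
grandchild_sees_error = is_any_ancestor_errored(parents, status, "2")
```

Keeping the statuses in one named set means a later decision about "interrupted" is a one-line change rather than an edit to the predicate itself.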

Copilot · 2026-03-30T20:31:02Z

+        # Clear stale error state from disabled-transitively cells whose
+        # ancestor has recovered from an error. Without this, the disabled
+        # cell permanently shows the ancestor error even after it is fixed.
+        for cid, cell_impl in self.graph.cells.items():
+            if (
+                self.graph.is_disabled(cid)
+                and not cell_impl.config.disabled
+                and cell_impl.run_result_status == "exception"
+                and not self.graph.is_any_ancestor_errored(cid)
+            ):


This loop calls self.graph.is_disabled(cid) for every cell on every run_stale_cells() invocation. DirectedGraph.is_disabled() walks parents (BFS) and can become a noticeable hot path for large notebooks.

Since this block only targets disabled-transitively cells, consider using the already-tracked runtime state (cell_impl.runtime_state == "disabled-transitively" / cell_impl.disabled_transitively) instead of recomputing is_disabled() each time, or precomputing a disabled set once and reusing it in both loops.
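The "precompute a disabled set once" option could look roughly like the sketch below. It is a toy under stated assumptions: `children` maps each cell id to its direct children, `config_disabled` holds the cells disabled through their own config, and `transitively_disabled` is a hypothetical helper, not an existing marimo function.

```python
from __future__ import annotations

from collections import deque


def transitively_disabled(
    children: dict[str, list[str]],
    config_disabled: set[str],
) -> set[str]:
    """Collect cells disabled only because an ancestor is disabled.

    A single breadth-first pass from the config-disabled cells visits each
    cell at most once, instead of walking parents separately for every cell.
    """
    disabled: set[str] = set()
    queue: deque[str] = deque()
    for root in config_disabled:
        queue.extend(children.get(root, ()))
    while queue:
        cid = queue.popleft()
        # Skip cells already handled and cells disabled by their own config
        # (their children were enqueued in the loop above).
        if cid in disabled or cid in config_disabled:
            continue
        disabled.add(cid)
        queue.extend(children.get(cid, ()))
    return disabled


children = {"a": ["b"], "b": ["c"], "x": ["c"]}
result = transitively_disabled(children, {"a"})
```

The resulting set can then be reused by both loops with a constant-time membership check per cell.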

Copilot · 2026-03-30T20:31:02Z

+        # Clear stale error state from disabled-transitively cells whose
+        # ancestor has recovered from an error. Without this, the disabled
+        # cell permanently shows the ancestor error even after it is fixed.
+        for cid, cell_impl in self.graph.cells.items():
+            if (
+                self.graph.is_disabled(cid)
+                and not cell_impl.config.disabled
+                and cell_impl.run_result_status == "exception"
+                and not self.graph.is_any_ancestor_errored(cid)
+            ):
+                cell_impl.set_run_result_status("disabled")
+


This block updates cell_impl.run_result_status but does not emit any CellNotification to the frontend. The frontend’s “errored”/error UI is driven by received cell-op messages (especially error outputs), and it doesn’t observe backend run_result_status directly.

If the goal is to clear the user-visible error state for disabled-transitively cells, this likely also needs an explicit UI update (e.g., clearing/replacing the error output and/or sending a status transition that resets the frontend’s errored flag). An alternative is to include these cells in the normal _run_cells queue so they go through the runner’s standard status transitions, plus explicitly clearing their error output when they’re skipped as disabled.

Copilot · 2026-03-30T20:31:02Z

+def test_is_any_ancestor_errored() -> None:
+    """Test that is_any_ancestor_errored correctly detects ancestor errors."""
+    graph = dataflow.DirectedGraph()
+    # Create a chain: 0 -> 1 -> 2
+    code = "x = 0"
+    first_cell = parse_cell(code)
+    graph.register_cell("0", first_cell)
+    code = "y = x"
+    second_cell = parse_cell(code)
+    graph.register_cell("1", second_cell)
+    code = "z = y"
+    third_cell = parse_cell(code)
+    graph.register_cell("2", third_cell)
+
+    # No errors initially
+    assert not graph.is_any_ancestor_errored("0")
+    assert not graph.is_any_ancestor_errored("1")
+    assert not graph.is_any_ancestor_errored("2")
+
+    # Set cell 0 to exception state
+    graph.cells["0"].set_run_result_status("exception")
+    assert not graph.is_any_ancestor_errored("0")  # no ancestors
+    assert graph.is_any_ancestor_errored("1")  # parent 0 has error
+    assert graph.is_any_ancestor_errored("2")  # grandparent 0 has error
+
+    # Fix cell 0 - clear the error
+    graph.cells["0"].set_run_result_status("success")
+    assert not graph.is_any_ancestor_errored("0")
+    assert not graph.is_any_ancestor_errored("1")
+    assert not graph.is_any_ancestor_errored("2")


This test validates the new DirectedGraph.is_any_ancestor_errored() helper, but the PR’s user-facing behavior change is in Kernel.run_stale_cells() (clearing disabled-transitively cells’ stale error state when an ancestor recovers). Consider adding an integration-style runtime test that reproduces #8072 end-to-end (ancestor errors → downstream disabled-transitively cell shows error → ancestor fixed + run_stale_cells() → downstream cell no longer shows error/exception state). This would help ensure the run_stale_cells() logic stays correct as execution/notification behavior evolves.

dmadisetti · 2026-04-01T20:41:19Z

Hi @VishakBaddur are you still interested in contributing this? I'm not actually seeing this change reflected in the smoke test I just pushed.

If you want to update with a screenshot once you get this working that would be great.

dmadisetti

Moving over to draft until we do a bit of iteration. Thanks!

VishakBaddur · 2026-04-01T22:19:40Z

Hi @dmadisetti, thanks for the smoke test and for pushing this forward! I'll investigate why the error state isn't clearing in the UI. It looks like I also need to emit a frontend notification after resetting run_result_status. Will update the PR shortly.

VishakBaddur · 2026-04-02T04:27:45Z

Hi @dmadisetti, thanks for the smoke test; it helped pinpoint the exact gap. I've pushed a fix that addresses both the backend and frontend sides:

Root cause (two parts):
- Backend: run_stale_cells() was resetting run_result_status but never emitting any frontend notifications, so the UI had no signal to update.
- Frontend: transitionCell() in cell.ts did not reset the errored flag when receiving "disabled-transitively" status, leaving the red has-error border even after the output was cleared.

Fix:
- Call cell_impl.set_runtime_state("disabled-transitively") to keep backend state consistent and broadcast status to the frontend.
- Call CellNotificationUtils.broadcast_empty_output(cell_id, status="disabled-transitively") to replace the stale error output in the UI.
- Reset nextCell.errored = false in the "disabled-transitively" case in transitionCell(). This is safe because in all other code paths where "disabled-transitively" is sent, errored is already false, and if an error message follows (as in mutate_graph), it will correctly re-set errored = true.
- Broadened is_any_ancestor_errored to include "marimo-error" in addition to "exception".

dmadisetti · 2026-04-02T18:41:05Z

@VishakBaddur this still doesn't work as expected

Can you update with a screenshot when you're ready?

VishakBaddur · 2026-04-04T15:56:34Z

Hi @dmadisetti, the fix is now working end-to-end. Here's what changed.

The previous approach only handled the backend state and missed two critical issues:

Wrong code path: The clear-stale logic was in run_stale_cells(), but the UI element toggle triggers set_ui_element_value() → _run_cells() directly. So the fix never fired during the actual bug reproduction.

Frontend never received a clear signal: Even when the backend reset run_result_status, no frontend notification was sent to clear the stale error output and errored/stopped flags.

The actual fix:

- Backend: Snapshot disabled cells in error/cancelled state before running. After _run_cells() completes, broadcast empty output + correct status for any cell whose ancestor has now recovered. This lives in _run_cells() so it covers all code paths.
- Frontend: Clear stopped flag in disabled-transitively case; clear stopped/errored in idle case when non-error output arrives.
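The backend half of that approach (snapshot first, run, then notify) can be sketched with plain dictionaries. Everything here is illustrative: the cell fields, the `notify` callback, `run_and_clear_stale_errors`, and the exact status strings in `STALE_STATUSES` are assumptions made for this example, and the real change uses marimo's own notification utilities rather than a callback.

```python
from __future__ import annotations

from typing import Callable

ERROR_STATUSES = {"exception", "marimo-error"}
# Assumed statuses that mark a disabled cell as showing a stale error.
STALE_STATUSES = ERROR_STATUSES | {"cancelled"}


def any_ancestor_errored(cells: dict[str, dict], cid: str) -> bool:
    """Walk parents and report whether any ancestor is still errored."""
    stack, seen = list(cells[cid]["parents"]), set()
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        if cells[cur]["status"] in ERROR_STATUSES:
            return True
        stack.extend(cells[cur]["parents"])
    return False


def run_and_clear_stale_errors(
    cells: dict[str, dict],
    run: Callable[[], None],
    notify: Callable[[str, str], None],
) -> None:
    # Snapshot before running, because statuses change during the run.
    snapshot = [
        cid
        for cid, cell in cells.items()
        if (cell["config_disabled"] or cell["disabled_transitively"])
        and cell["status"] in STALE_STATUSES
    ]
    run()
    for cid in snapshot:
        if any_ancestor_errored(cells, cid):
            continue  # an ancestor is still failing; keep the error visible
        state = "idle" if cells[cid]["config_disabled"] else "disabled-transitively"
        notify(cid, state)  # stands in for "empty output + correct status"


def make_cell(status="success", parents=(), config_disabled=False, transitive=False):
    return {
        "status": status,
        "parents": list(parents),
        "config_disabled": config_disabled,
        "disabled_transitively": transitive,
    }


cells = {
    "x": make_cell(status="exception"),
    "d": make_cell(config_disabled=True),
    "c": make_cell(status="exception", parents=["x", "d"], transitive=True),
}
sent: list[tuple[str, str]] = []


def recover_x() -> None:
    cells["x"]["status"] = "success"  # the ancestor is fixed by this run


run_and_clear_stale_errors(cells, recover_x, lambda cid, state: sent.append((cid, state)))
```

Because the cleanup wraps the run itself, it does not matter which entry point triggered the run, which is the point of moving the logic into _run_cells().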

codecov · 2026-04-04T15:56:34Z

Bundle Report

Changes will increase total bundle size by 1.88kB (0.01%) ⬆️. This is within the configured threshold ✅

Detailed changes

Bundle name	Size	Change
marimo-esm	24.88MB	1.88kB (0.01%) ⬆️

Affected Assets, Files, and Routes:


Assets Changed:

Asset Name	Size Change	Total Size	Change (%)
`assets/cells-*.js`	59 bytes	704.0kB	0.01%
`assets/index-*.js`	89 bytes	602.39kB	0.01%
`assets/index-*.css`	48 bytes	362.33kB	0.01%
`assets/JsonOutput-*.js`	2.14kB	342.26kB	0.63%
`assets/edit-*.js`	1 bytes	329.61kB	0.0%
`assets/add-*.js`	7 bytes	192.76kB	0.0%
`assets/layout-*.js`	-2 bytes	185.91kB	-0.0%
`assets/cell-*.js`	3 bytes	183.15kB	0.0%
`assets/file-*.js`	50 bytes	49.4kB	0.1%
`assets/panels-*.js`	6 bytes	45.36kB	0.01%
`assets/session-*.js`	-9 bytes	24.99kB	-0.04%
`assets/home-*.js`	132 bytes	21.86kB	0.61%
`assets/purify.es-*.js`	16 bytes	21.05kB	0.08%
`assets/column-*.js`	-659 bytes	6.53kB	-9.16%

dmadisetti · 2026-04-07T22:52:07Z

+                # Clear stale error state from disabled cells whose ancestor
+                # recovered. Uses pre-run snapshot since run_result_status is
+                # updated during the run.
+                for _cid in _pre_run_errored_disabled:


nit: these shouldn't be underscore prefixed (implies they are unused)

dmadisetti · 2026-04-07T22:53:49Z

@VishakBaddur great job! Much cleaner implementation. We still get the red highlights:

But I don't think that's blocking. I think this is a great starting point if you'd like to get it in as is. I left one tiny code cleanup comment, but let me know if you're happy with this.

…ecovers

Fixes marimo-team#8072

When a disabled-transitively cell's ancestor had an error and then recovered, the disabled cell permanently showed the ancestor's error state. This happened because run_stale_cells() only re-queues non-disabled cells, so disabled-transitively cells never got a chance to reset their run_result_status from 'exception' to 'disabled'.

Fix:
- Add is_any_ancestor_errored() to DirectedGraph
- In run_stale_cells(), after building cells_to_run, reset run_result_status to 'disabled' for any disabled-transitively cell whose ancestor no longer has an error


…ecovers

Root cause had two parts:
1. Backend: run_stale_cells() only reset run_result_status but never emitted frontend notifications, so the UI never saw the change.
2. Frontend: transitionCell() did not reset errored flag on 'disabled-transitively' status, leaving the red error border.

Fix:
- Call set_runtime_state('disabled-transitively') after resetting run_result_status so the backend object state stays consistent
- Call CellNotificationUtils.broadcast_empty_output to replace the stale error output in the UI
- Reset nextCell.errored = false in the 'disabled-transitively' case in transitionCell() so the has-error CSS class is cleared
- Broaden is_any_ancestor_errored to include 'marimo-error' status in addition to 'exception'
- Add test for marimo-error case


Fixes issue marimo-team#8072: disabled cells permanently show ancestor error after ancestor recovers.

Backend (runtime.py):
- Snapshot disabled cells in error/cancelled state before running
- After _run_cells completes, broadcast empty output + correct status for any snapshotted cell whose ancestor has now recovered
- Moved to _run_cells() to cover set_ui_element_value() path too
- config.disabled cells get idle + empty output
- transitively-disabled cells get disabled-transitively + empty output

Frontend (cell.ts):
- disabled-transitively case: also clear stopped flag
- idle case: clear stopped/errored when non-error output arrives


VishakBaddur · 2026-04-07T23:17:36Z

Hi @dmadisetti, I addressed the nit and removed the underscore prefixes. Happy to merge as-is if you are!

…ll-error-state-not-cleared

dmadisetti

Unsure what's happening with CI. Runtime tests pass locally. Thanks @VishakBaddur, sorry for the long delay!

github-actions · 2026-04-17T21:24:26Z

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.2-dev51

VishakBaddur requested a review from dmadisetti as a code owner March 20, 2026 01:23

vercel Bot deployed to Preview March 20, 2026 01:24 View deployment

vercel Bot deployed to Preview March 20, 2026 01:26 View deployment

mscolnick added the bug Something isn't working label Mar 20, 2026

mscolnick requested a review from Copilot March 20, 2026 17:44

Copilot AI reviewed Mar 20, 2026

View reviewed changes

mscolnick requested a review from Copilot March 30, 2026 20:22

Copilot started reviewing on behalf of mscolnick March 30, 2026 20:22 View session

Copilot AI reviewed Mar 30, 2026

View reviewed changes

vercel Bot deployed to Preview April 1, 2026 20:42 View deployment

dmadisetti requested changes Apr 1, 2026

View reviewed changes

dmadisetti marked this pull request as draft April 1, 2026 21:06

vercel Bot deployed to Preview April 2, 2026 04:00 View deployment

vercel Bot deployed to Preview April 2, 2026 04:02 View deployment

VishakBaddur marked this pull request as ready for review April 2, 2026 04:29

VishakBaddur requested review from Light2Dark and manzt as code owners April 2, 2026 04:29

VishakBaddur requested a review from dmadisetti April 2, 2026 04:29

github-actions Bot added the dependencies label Apr 4, 2026

vercel Bot deployed to Preview April 4, 2026 15:51 View deployment

vercel Bot deployed to Preview April 4, 2026 16:43 View deployment

VishakBaddur force-pushed the fix/disabled-cell-error-state-not-cleared branch from f562651 to efacbc6 Compare April 4, 2026 16:45

VishakBaddur requested a review from mscolnick as a code owner April 4, 2026 16:45

VishakBaddur force-pushed the fix/disabled-cell-error-state-not-cleared branch from efacbc6 to 13db6dd Compare April 4, 2026 16:51

vercel Bot deployed to Preview April 4, 2026 16:52 View deployment

fix: add type annotation for runtime state cast to fix mypy error

1d2ee2d

vercel Bot deployed to Preview April 4, 2026 20:52 View deployment

dmadisetti reviewed Apr 7, 2026

View reviewed changes

VishakBaddur and others added 11 commits April 7, 2026 18:05

[pre-commit.ci] auto fixes from pre-commit.com hooks

ef5b515

for more information, see https://pre-commit.ci

test: add a smoke test

cc72fcd

[pre-commit.ci] auto fixes from pre-commit.com hooks

f8d6562

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

658f7ad

for more information, see https://pre-commit.ci

revert: remove unrelated sandbox changes from this PR

e572e55

revert: restore sandbox files to main branch state

06e35a4

fix: add type annotation for runtime state cast to fix mypy error

bf89735

nit: remove underscore prefixes from local variables in _run_cells

25e8a48

VishakBaddur force-pushed the fix/disabled-cell-error-state-not-cleared branch from 1d2ee2d to 25e8a48 Compare April 7, 2026 23:14

vercel Bot deployed to Preview April 7, 2026 23:15 View deployment

manzt removed their request for review April 16, 2026 19:54

mscolnick requested a review from dmadisetti April 17, 2026 20:22

dmadisetti added 2 commits April 17, 2026 13:23

Merge branch 'main' of github:marimo-team/marimo into fix/disabled-ce…

8fd310c

…ll-error-state-not-cleared

rebase: fix merge res

e08c437

vercel Bot deployed to Preview April 17, 2026 20:30 View deployment

tidy: revert unrelated test

0cd050a

vercel Bot deployed to Preview April 17, 2026 20:33 View deployment

dmadisetti approved these changes Apr 17, 2026

View reviewed changes

dmadisetti merged commit ba12764 into marimo-team:main Apr 17, 2026
41 of 111 checks passed
