Drop dangling @file URLs from the session cache#9278

Merged
manzt merged 4 commits into main from push-pluqssxwzxlq
Apr 20, 2026
manzt (Collaborator) commented Apr 20, 2026

Fixes #9273

The `./@file/...` URLs are backed by per-process buffers, so a cached snapshot replayed after a kernel restart points at storage that no longer exists. The asset endpoint then returns a 404, and anywidget initialization for the restored cell fails.

`serialize_session_view` now takes a required `drop_virtual_file_outputs` keyword so each consumer picks deliberately based on whether the snapshot will be replayed in the same process as the producer. The on-disk cache passes `True` (the cell appears un-run on restore and the kernel re-emits a fresh output); HTML export and the static preview pass `False` and inline the buffers themselves while they're still live.
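
A minimal sketch of the required-keyword pattern described above, not marimo's actual implementation. The function name `serialize_view`, the dict-based cell shape, and the plain substring check are assumptions made for illustration; only the `drop_virtual_file_outputs` keyword name comes from this PR.

```python
from typing import Any


def serialize_view(
    cells: list[dict[str, Any]],
    *,
    # Keyword-only with no default: every caller has to choose explicitly.
    drop_virtual_file_outputs: bool,
) -> list[dict[str, Any]]:
    """Hypothetical stand-in for serialize_session_view."""
    serialized = []
    for cell in cells:
        output = cell.get("output")
        if (
            drop_virtual_file_outputs
            and isinstance(output, str)
            and "./@file/" in output  # assumed detection rule for this sketch
        ):
            # Drop the output so the cell looks un-run when restored.
            output = None
        serialized.append({"id": cell["id"], "output": output})
    return serialized


cells = [
    {"id": "a", "output": "<img src='./@file/12-abc.png'>"},
    {"id": "b", "output": "hello"},
]

# On-disk cache: replayed in a different process, so drop.
cached = serialize_view(cells, drop_virtual_file_outputs=True)
# HTML export: same process, buffers still live, so keep.
exported = serialize_view(cells, drop_virtual_file_outputs=False)
```

Because the keyword has no default, calling `serialize_view(cells)` raises a `TypeError`, which is what forces each consumer to make the choice at the call site.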

manzt added 2 commits April 20, 2026 11:40
The session cache round-trips anywidget HTML verbatim, so cached
`./@file/...` URLs replay into the new kernel pointing at buffers
that no longer exist and the asset endpoint 404s. The assertion
just checks the URL doesn't survive the round-trip — strip it,
inline as a data URL, or persist and re-register the buffer; any
of those satisfies it.

Refs #9273.
Fixes #9273

The `./@file/...` URLs are backed by per-process buffers, so a cached
snapshot replayed after kernel restart points at storage that no longer
exists and the asset endpoint 404s (anywidget initialization for the
restored cell then fails).

`serialize_session_view` now takes a required
`drop_virtual_file_outputs` keyword so each consumer picks deliberately
based on whether the snapshot will be replayed in the same process as
the producer. The on-disk cache passes `True` (cell appears un-run on
restore, kernel re-emits a fresh output); HTML export and the static
preview pass `False` and inline the buffers themselves while they're
still live.
Copilot AI review requested due to automatic review settings April 20, 2026 15:46
manzt added the bug label Apr 20, 2026
manzt requested a review from mscolnick April 20, 2026 15:47
Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses stale ./@file/... virtual-file URLs being restored from the on-disk session cache after kernel restart, which can break anywidget initialization when the backing per-process buffers no longer exist.

Changes:

  • Add a required drop_virtual_file_outputs keyword to serialize_session_view to force each consumer to choose whether to keep or drop virtual-file-backed outputs.
  • Drop outputs referencing virtual-file URLs for the on-disk session cache writer (and export session-cache path), while keeping them for HTML export and CLI preview (which inline/resolve them while still live).
  • Add a regression test ensuring virtual-file URLs do not survive a cache round-trip when dropping is enabled.
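
The regression test in the last bullet could look roughly like the sketch below. It is an illustration under stated assumptions, not the test added in this PR: `write_cache`, the JSON-string cache format, and the sample HTML are all hypothetical.

```python
import json
from typing import Any


def write_cache(outputs: dict[str, Any], *, drop_virtual_file_outputs: bool) -> str:
    """Hypothetical cache writer: serializes cell outputs to a JSON string."""
    kept = {
        cell_id: (
            None
            if drop_virtual_file_outputs
            and isinstance(out, str)
            and "./@file/" in out
            else out
        )
        for cell_id, out in outputs.items()
    }
    return json.dumps(kept)


def test_virtual_file_urls_do_not_survive_round_trip() -> None:
    outputs = {
        # Made-up widget markup that references a virtual file.
        "widget": '<div data-js-url="./@file/120-widget.js"></div>',
        "text": "plain output",
    }
    raw = write_cache(outputs, drop_virtual_file_outputs=True)
    # Only checks that the URL is gone, not how it was removed.
    assert "./@file/" not in raw
    # Unrelated outputs are preserved.
    assert json.loads(raw)["text"] == "plain output"


test_virtual_file_urls_do_not_survive_round_trip()
```

Asserting on the serialized payload rather than on the mechanism keeps the test valid if the fix later changes from dropping outputs to inlining or re-registering buffers.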

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `marimo/_session/state/serialize.py` | Adds virtual-file detection and a required flag to drop such outputs during session serialization; session cache writer now drops them. |
| `marimo/_server/export/_session_cache.py` | Session snapshot serialization for the on-disk cache now opts into dropping virtual-file outputs. |
| `marimo/_server/export/exporter.py` | HTML export serialization explicitly keeps virtual-file outputs so they can be inlined. |
| `marimo/_cli/development/commands.py` | Static preview serialization explicitly keeps virtual-file outputs. |
| `tests/_session/state/test_serialize_session.py` | Updates call sites and adds regression coverage for dropping dangling virtual-file URLs. |
| `tests/_session/state/test_serialize_session_missing_type.py` | Updates serialization call to include the new required keyword argument. |

Comment thread marimo/_session/state/serialize.py
The session-cache drop (#9273) matched the substring `./@file/`, which
would also strip any plain-text output that merely mentioned the path.
Anchoring to `./@file/<digits>-` restricts the match to the runtime's
actual URL shape, and new tests cover the nested-dict recursion, the
literal-prefix-in-text case, and the targeted-filter behavior.
# (see `marimo/_runtime/virtual_file/virtual_file.py`). Anchored to the
# byte-length digits so a literal "./@file/" mention in user content
# doesn't trip the check.
_VIRTUAL_FILE_URL_RE = re.compile(r"\./@file/\d+-")
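
To illustrate how the anchored pattern behaves, here is a small sketch that reuses the regex quoted above. The recursive helper `contains_virtual_file_url` and the sample values are hypothetical; the actual detection code in `serialize.py` may be structured differently.

```python
import re
from typing import Any

# Same pattern as quoted in the PR.
_VIRTUAL_FILE_URL_RE = re.compile(r"\./@file/\d+-")


def contains_virtual_file_url(value: Any) -> bool:
    """Hypothetical helper: recursively search strings, dicts, and sequences."""
    if isinstance(value, str):
        return _VIRTUAL_FILE_URL_RE.search(value) is not None
    if isinstance(value, dict):
        return any(contains_virtual_file_url(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_virtual_file_url(v) for v in value)
    return False


# A URL with the byte-length digits and a dash matches.
runtime_url = contains_virtual_file_url("<script src='./@file/2048-widget.js'>")
# A literal mention of the prefix in user text does not.
plain_mention = contains_virtual_file_url("see ./@file/ for details")
# Nested containers are searched recursively.
nested = contains_virtual_file_url({"text/html": {"assets": ["./@file/7-x.css"]}})
```

Here `runtime_url` and `nested` are `True`, while `plain_mention` is `False` because no digits follow the prefix.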
manzt (Collaborator, Author) commented:

I think in a follow-up we could actually "tag" a cell output with its virtual files, so we don't need to parse them out here.
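
One way the tagging idea could look, purely as a sketch: the `CellOutput` dataclass, its `virtual_files` field, and `outputs_safe_to_cache` are hypothetical names and do not reflect marimo's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class CellOutput:
    """Hypothetical output record; not marimo's actual type."""

    mimetype: str
    data: str
    # Names of virtual files this output references, recorded when it is created.
    virtual_files: list[str] = field(default_factory=list)


def outputs_safe_to_cache(outputs: list[CellOutput]) -> list[CellOutput]:
    # No regex needed: the tag says whether the output depends on
    # per-process buffers.
    return [o for o in outputs if not o.virtual_files]


outputs = [
    CellOutput("text/html", "<div>widget</div>", ["120-widget.js"]),
    CellOutput("text/plain", "hi"),
]
cacheable = outputs_safe_to_cache(outputs)
```

With explicit tags, the serializer would filter on metadata instead of scanning output payloads for URL-shaped strings.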

manzt merged commit 970d039 into main Apr 20, 2026
33 of 48 checks passed
manzt deleted the push-pluqssxwzxlq branch April 20, 2026 20:50
Linked issue: Edit mode session cache restores stale anywidget virtual-file URLs after restart

3 participants