fix: replace O(n²) string concat with array buffering for stdout/stderr #1271
Changes from all commits
```diff
@@ -37,8 +37,8 @@ def __init__(
         self._handle_kill = handle_kill
         self._events = events

-        self._stdout: str = ""
-        self._stderr: str = ""
+        self._stdout_chunks: list[str] = []
+        self._stderr_chunks: list[str] = []

         self._result: Optional[CommandResult] = None
         self._iteration_exception: Optional[Exception] = None
```
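The hunk above swaps per-chunk string concatenation for list buffering. A minimal standalone sketch (illustrative function names, not the SDK's code) of the two accumulation strategies, showing they produce identical output while the list-based one appends in O(1):

```python
# Sketch: repeated string concatenation vs. list buffering.
# accumulate_concat / accumulate_chunks are illustrative names, not SDK API.

def accumulate_concat(chunks: list[str]) -> str:
    out = ""
    for c in chunks:
        out += c  # each += may copy the entire accumulated string: O(n²) worst case
    return out

def accumulate_chunks(chunks: list[str]) -> str:
    buf: list[str] = []
    for c in chunks:
        buf.append(c)  # O(1) amortized per chunk
    return "".join(buf)  # single O(n) pass at the end

if __name__ == "__main__":
    data = [f"line {i}\n" for i in range(1000)]
    assert accumulate_concat(data) == accumulate_chunks(data)
```

CPython sometimes optimizes in-place `str +=`, but the optimization is not guaranteed (it depends on reference counts and implementation), so `list.append` plus a single final `join` is the reliably linear pattern.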
Comment on lines 37 to 44:

🟣 The sync …

Extended reasoning…

- **What the bug is:** The async …
- **The specific code path:** In …
- **Why existing code does not prevent it:** The PR only changes internal storage from …
- **Addressing the refutation:** The refutation argues this asymmetry is intentional: the sync SDK processes events lazily (only when iterated), so accumulated output mid-run is less meaningful, and users can use …
- **Impact:** Any code that does …
- **Step-by-step proof:** …
- **Pre-existing nature:** Before this PR: …
```diff
@@ -67,18 +67,18 @@ def _handle_events(
             if event.event.HasField("data"):
                 if event.event.data.stdout:
                     out = event.event.data.stdout.decode("utf-8", "replace")
-                    self._stdout += out
+                    self._stdout_chunks.append(out)
                     yield out, None, None
                 if event.event.data.stderr:
                     out = event.event.data.stderr.decode("utf-8", "replace")
-                    self._stderr += out
+                    self._stderr_chunks.append(out)
                     yield None, out, None
                 if event.event.data.pty:
                     yield None, None, event.event.data.pty
             if event.event.HasField("end"):
                 self._result = CommandResult(
-                    stdout=self._stdout,
-                    stderr=self._stderr,
+                    stdout="".join(self._stdout_chunks),
+                    stderr="".join(self._stderr_chunks),
                     exit_code=event.event.end.exit_code,
                     error=event.event.end.error,
                 )
```
```diff
@@ -131,8 +131,8 @@ def wait(
         if self._result.exit_code != 0:
             raise CommandExitException(
-                stdout=self._stdout,
-                stderr=self._stderr,
+                stdout=self._result.stdout,
+                stderr=self._result.stderr,
                 exit_code=self._result.exit_code,
                 error=self._result.error,
             )
```
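The `wait` hunk is required by the refactor: the `_stdout` / `_stderr` string fields no longer exist, so the exception must carry the already-joined output from the completed result. A minimal sketch of that pattern, using simplified stand-in classes rather than the SDK's real types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CommandResult:  # simplified stand-in for the SDK's result type
    stdout: str
    stderr: str
    exit_code: int
    error: Optional[str] = None

class CommandExitException(Exception):  # simplified stand-in
    def __init__(self, stdout: str, stderr: str, exit_code: int, error: Optional[str]):
        super().__init__(f"command exited with code {exit_code}")
        self.stdout, self.stderr = stdout, stderr
        self.exit_code, self.error = exit_code, error

class Handle:
    def __init__(self) -> None:
        self._stdout_chunks: list[str] = []
        self._result: Optional[CommandResult] = None

    def _on_end(self, exit_code: int) -> None:
        # Join once, when the command ends.
        self._result = CommandResult(
            stdout="".join(self._stdout_chunks), stderr="", exit_code=exit_code
        )

    def wait(self) -> CommandResult:
        assert self._result is not None
        if self._result.exit_code != 0:
            # Read from the joined result, not from a (now removed) string field.
            raise CommandExitException(
                stdout=self._result.stdout,
                stderr=self._result.stderr,
                exit_code=self._result.exit_code,
                error=self._result.error,
            )
        return self._result
```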
🔴 The Python async SDK's `stdout` and `stderr` properties call `''.join(self._stdout_chunks)` on every access with no caching, reintroducing the same O(n²) pattern the PR set out to fix. If a user accesses `handle.stdout` inside an `on_stdout` callback (as code-interpreter's `readLines` does), each callback invocation triggers a full re-join of all accumulated chunks: O(total_size) per callback, O(n²) total. Fix by adding `_stdout_cached: Optional[str] = None`, returning the cached value in the getter, and setting it to `None` on each append, mirroring the JS implementation.

Extended reasoning…

**What the bug is:** The Python async SDK's `stdout` and `stderr` properties in `command_handle.py` (lines 41–52) always call `''.join(self._stdout_chunks)` on every property access with no caching. The JS SDK correctly maintains `_stdoutCached`/`_stderrCached` fields that are set to `undefined` on each chunk push and lazily computed in the getter. The Python async SDK received the array-buffering change (replacing `+=`) but did not receive the corresponding caching layer.

**The specific code path:** When a user passes `on_stdout` to `sandbox.commands.start()`, the async SDK calls that callback for each chunk in `_handle_events()`. If the callback (or any downstream consumer) accesses `handle.stdout` to get the accumulated output — exactly the pattern code-interpreter's `readLines` uses to scan for newlines via `str.find()` — each callback invocation hits the property getter, which calls `''.join(self._stdout_chunks)` from scratch.

**Why existing code doesn't prevent it:** The PR replaced `self._stdout += out` (O(n) per append) with `self._stdout_chunks.append(out)` (O(1) per append), which is correct. But it did not add a cache guard in the getter. Before the PR, reading `self._stdout` in a callback was O(1) (a simple attribute read). After the PR, reading `handle.stdout` in a callback is O(accumulated_size) because `join()` must traverse all chunks every time.

**The impact:** With N chunks totaling T bytes, accessing `handle.stdout` once per callback produces O(T) work per call and O(N × T) total, which is quadratic in T when N grows proportionally to T (fixed chunk sizes). The PR's own benchmark documents this scenario ("string flattening"), measuring 669 MB peak heap / 3.5x amplification for 200 MB of output in the JS SDK after the fix. The Python async SDK has no protection at all, so this scenario will be at least as bad as the pre-fix JS numbers.

**How to fix:** Mirror the JS implementation exactly. Add `self._stdout_cached: Optional[str] = None` and `self._stderr_cached: Optional[str] = None` as instance variables in `__init__`. In the `stdout` property getter, return `self._stdout_cached` if it is not `None`; otherwise compute `''.join(self._stdout_chunks)`, store it in `self._stdout_cached`, and return it. In `_iterate_events`, after `self._stdout_chunks.append(out)`, set `self._stdout_cached = None`. Apply the same pattern for `stderr`.

**Step-by-step proof:** Suppose a user runs `sandbox.commands.start(cmd, on_stdout=lambda _: print(len(handle.stdout)))` against a command producing 200,000 chunks of ~1,000 bytes each (200 MB total, matching the PR's benchmark). Chunk 1 arrives → `_stdout_chunks` has 1 element → `handle.stdout` joins 1 string, ~1,000 bytes of work. Chunk 2 → joins 2 strings, ~2,000 bytes. Chunk k → joins k strings, ~k×1,000 bytes. Total work = 1,000 × (1 + 2 + … + 200,000) ≈ 2×10¹³ bytes of string copying, which is clearly O(n²). This is the exact scenario the PR benchmarks, and the Python async SDK has no defense against it.
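The caching layer this comment describes can be sketched as follows — a minimal illustration of the invalidate-on-append pattern, not the SDK's actual class or attribute names:

```python
from typing import Optional

class BufferedOutput:
    """Sketch: chunk buffering with a lazily computed, invalidated cache."""

    def __init__(self) -> None:
        self._chunks: list[str] = []
        self._cached: Optional[str] = None  # invalidated on every append

    def append(self, chunk: str) -> None:
        self._chunks.append(chunk)  # O(1) amortized
        self._cached = None         # next read must re-join

    @property
    def value(self) -> str:
        if self._cached is None:
            # O(n) join, but only paid once after each change, not per read.
            self._cached = "".join(self._chunks)
        return self._cached

buf = BufferedOutput()
buf.append("hello ")
buf.append("world")
assert buf.value == "hello world"
assert buf.value is buf.value  # repeated reads return the cached object
```

With this guard, a callback that reads the property on every chunk does O(1) work when nothing new has arrived since the last read, matching the JS SDK's `_stdoutCached` behavior described above.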