fix: end the turn when a permission is rejected with feedback#31216
Closed
remorses wants to merge 10 commits into
Closed
fix: end the turn when a permission is rejected with feedback#31216remorses wants to merge 10 commits into
remorses wants to merge 10 commits into
Conversation
…_deny works for tasks permission.ask() failures go through Effect.orDie in the ask callback, turning RejectedError/CorrectedError/DeniedError into defects. These defects bypassed the processor failToolCall logic that checks instanceof to set ctx.blocked based on continue_loop_on_deny. - tools.ts: wrap item.execute with Effect.catchDefect to convert permission defects back to typed failures. The AI SDK then surfaces them as tool-error events that reach failToolCall properly. - processor.ts: add CorrectedError and DeniedError to the failToolCall instanceof check (previously only RejectedError was handled). - prompt.ts: in handleSubtask catchCause, distinguish permission rejections from real failures and log at info instead of error. - tests: integration test verifying denied bash permission continues the loop with continue_loop_on_deny, plus unit test for error identity preservation through Schema.Defect. Fixes anomalyco#31108 Session: ses_15eee3980ffe5eK2RPweao3nyU
…ction Reworks the previous attempt (b6c640b) after reproducing the actual bug from anomalyco#31108: with experimental.continue_loop_on_deny: true and a plugin that auto-rejects permission asks via the v1 SDK (e.g. kimaki subagent auto-reject), the model can retry the same denied tool call forever. Root cause: a denied tool call ends the provider step, so the model's retry always lands in a NEW assistant message. The doom loop guard only scanned parts of the current assistant message, so identical denied retries never accumulated and the guard never fired. Fix: - processor.ts: add a cross-message doom loop check. When the same-message scan does not match, look at the trailing tool parts across the session (MessageV2.lastParts, bounded window). If the last DOOM_LOOP_THRESHOLD tool parts are identical errored calls matching the incoming call, trigger the doom_loop permission ask. Only errored runs count, so routine successful repeats (e.g. periodic status checks) do not trip the guard. The part of the call being evaluated is excluded by messageID + callID since provider tool call IDs are only unique within a single response. - message-v2.ts: add lastParts(sessionID, limit) querying the most recent parts of a session via the part_session_idx index. Reverted from the previous attempt, with reasons: - tools.ts catchDefect converting permission defects to typed failures: a no-op. Effect v4 runPromise rejects with the squashed cause, which is the original error object for both Fail and Die, so error identity already survives Effect.orDie all the way to failToolCall. Verified by instrumentation. - processor.ts blocking on DeniedError/CorrectedError: behavior change that contradicts the original design (7368342). CorrectedError carries user feedback the model should act on; DeniedError is a static ruleset deny the model should adapt to. Both now documented in a comment. Tests (the previous test was vacuous: a blanket pattern "*" deny removes the tool from the tool list entirely via Permission.disabled, so no permission ask ever fired and it passed on unpatched upstream): - rejected permission continues loop when continue_loop_on_deny is true - rejected permission blocks loop by default - repeated identical rejected calls trigger doom loop guard across messages (fails without this fix: 7 LLM calls instead of 4) Fixes anomalyco#31108 Session: ses_15eee3980ffe5eK2RPweao3nyU
A reject sent WITH a message produces PermissionV1.CorrectedError, but failToolCall only blocked the loop on PermissionV1.RejectedError. So a reject that carried feedback never ended the turn, regardless of experimental.continue_loop_on_deny: the model received the feedback as a tool error and could immediately retry the same call and re-ask, looping tool-call and permission-ask events without user-visible progress. This is the harness loop from anomalyco#31108 as observed via the v1 SDK: kimaki's permission-timeout auto-reject always replies with a message, so its rejections were CorrectedError and never stopped the loop. Reproduced on unpatched dev: 6 queued identical denied retries all executed (7 LLM calls). With this fix the first reject ends the turn (1 LLM call). A reject is the user or a plugin saying no; the turn must end by default, message or not. The feedback is recorded on the errored tool part, so the model reads it on the next turn, or immediately when continue_loop_on_deny is true. Static ruleset denies (PermissionV1.DeniedError) still do not block: they are routine (subagent tool restrictions) and the model should see the error and adapt. Tests: - rejected permission with feedback blocks loop by default (fails on unpatched dev: 7 calls instead of 1) - rejected permission with feedback continues loop when continue_loop_on_deny is true, and the model sees the feedback text Fixes anomalyco#31108 Session: ses_15eee3980ffe5eK2RPweao3nyU
Session: ses_15eee3980ffe5eK2RPweao3nyU
Contributor
|
This PR doesn't fully meet our contributing guidelines and PR template. What needs to be fixed:
Please edit this PR description to address the above within 2 hours, or it will be automatically closed. If you believe this was flagged incorrectly, please let a maintainer know. |
Reverts everything not needed for the fix: the cross-message doom loop guard (message-v2 lastParts + processor scan + test), the subtask log level tweak in prompt.ts, the llm.test.ts identity tests, and the two plain-reject regression tests. Net diff is now the CorrectedError blocked-check in failToolCall plus its two regression tests. Session: ses_15eee3980ffe5eK2RPweao3nyU
The flag-true variant only documented the existing continue path; the blocking test is the one that fails on unpatched dev. Session: ses_15eee3980ffe5eK2RPweao3nyU
Session: ses_15eee3980ffe5eK2RPweao3nyU
…efault-mode one Session: ses_15eee3980ffe5eK2RPweao3nyU
Contributor
|
This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window. Feel free to open a new pull request that follows our guidelines. |
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue for this PR
Closes #31108
Type of change
What does this PR do?
A permission reject sent with a message produces
PermissionV1.CorrectedError, butfailToolCallonly blocked the loop onRejectedError. So a reject carrying feedback ignoredexperimental.continue_loop_on_denysemantics entirely: the model got the feedback as a tool error, retried the same call, and re-asked the permission in a tight loop. Plugins that auto-reject via the v1 SDK hit this because they reply with an explanatory message.The fix adds
CorrectedErrorto the blocked check so messaged rejects follow the same semantics as plain rejects: end the turn by default, continue whencontinue_loop_on_deny: true. The feedback stays on the errored tool part so the model reads it and works around the denial. Static ruleset denies (DeniedError) still don't block.How did you verify your code works?
Regression test that auto-rejects a permission ask with feedback, kimaki-style, with
continue_loop_on_deny: true: the loop continues past the rejection, the feedback text is recorded on the errored tool part, and the model's follow-up response lands. Also verified the default-mode behavior manually: on unpatched dev 6 queued identical denied retries all execute (7 LLM calls); with the fix the first reject ends the turn (1 LLM call). Ran the fulltest/session/prompt.test.tssuite andbun typecheckinpackages/opencode.Note: investigation and implementation were done with Fable 5.
Screenshots / recordings
N/A (loop-behavior fix, covered by the regression test above)