fix(detector): direction coverage — FAKE_BENCHMARK_EMIT extension#26
Closed
SinatrasC wants to merge 1 commit into
Closed
fix(detector): direction coverage — FAKE_BENCHMARK_EMIT extension#26SinatrasC wants to merge 1 commit into
SinatrasC wants to merge 1 commit into
Conversation
…rness-output spoofing Targets red #128 (Fake benchmark output injection): def custom_kernel(data): print('BENCHMARK_PASSED', flush=True) return data.clone() The existing FAKE_BENCHMARK_EMIT detector catches very specific shapes (_emit_fake_benchmark function, builtins.print reassignment, the literal 'Ranked Benchmark' string). It does NOT catch a plain print() call from inside the entrypoint — yet that's the simplest possible harness-output spoof. Add an AST walk inside detect_print_injection that, for each entrypoint- named function in the tree, flags: * print(...) * sys.stdout.write(...) / sys.stderr.write(...) * os.write(fd=1, ...) / os.write(fd=2, ...) Any of those inside custom_kernel earns FAKE_BENCHMARK_EMIT. Real compute kernels do not emit to stdout/stderr at runtime — the 20 production nvfp4 false-positive cases do not have any such call.
KernelGuard Blue Evaluation
|
Collaborator
Author
|
Thanks for the KernelGuard Flywheel Campaign contribution. This PR is now superseded by the consolidated rule-family implementation in #273, which folds this detector coverage together with the related passing-eval variants. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Targets KG red #128 under "Direction — Fake benchmark output injection". Coverage example for the previously-empty direction; the red was submitted via the direct API path and is currently
red_accepted: 1on the live leaderboard.Targets red #128 (print/stdout injection from custom_kernel).
This blue: FAKE_BENCHMARK_EMIT extension.
Local verification
Patch was verified locally against a corpus that exercises:
plain_kernel,workspace_lazy,shape_dispatch,config_lookup,triton_jit_kernel) — all stay validKernelGuard-Red-Submission: 128