Skip to content

[dataflowengineoss] Fix JS dataflow FN for captured closures in higher-order callbacks#5947

Open
ZiyaoZh wants to merge 1 commit intojoernio:masterfrom
ZiyaoZh:fix/jssrc-callback-lambda-binding
Open

[dataflowengineoss] Fix JS dataflow FN for captured closures in higher-order callbacks#5947
ZiyaoZh wants to merge 1 commit intojoernio:masterfrom
ZiyaoZh:fix/jssrc-callback-lambda-binding

Conversation

@ZiyaoZh
Copy link
Copy Markdown

@ZiyaoZh ZiyaoZh commented Apr 23, 2026

This PR fixes a JavaScript dataflow false negative for higher-order callbacks that capture values from an outer scope, especially when the callback is passed through a static method before its return value reaches a sink.

The covered pattern combines:

  • a static method
  • a callback parameter
  • a closure that captures an outer tainted parameter
  • a callback invocation whose return value flows to a sink

Joern handled parts of this pattern individually, but missed the composed flow from the captured value through the callback invocation result.

Problem

The following combined case was missed:

class MyClass {
  static process(callback) {
    const result = callback("_", "_");
    sink(result);
  }
}

function test(__taint_src) {
  MyClass.process(function(a, b) {
    return __taint_src + a + b;
  });
}

Expected:

  • taint should flow from __taint_src to sink(result)

Root Cause

Two pieces were missing in dataflowengineoss:

  1. Captured closure method references were not included as starting points for identifier sources.

    • SourcesToStartingPoints expanded an Identifier to field/index accesses and first usages inside captured methods, but not to the capturing MethodRef itself.
    • This meant the query engine could miss the call-site value that carries the captured closure into a callback parameter.
  2. Callback receivers did not contribute to call-result taint.

    • DdgGenerator.uses(call) only considered call.argument.
    • For callback-style calls such as callback(...), the result also depends on the invoked function value itself.

Fix

  • Extend source expansion in dataflowengineoss/src/main/scala/io/joern/dataflowengineoss/queryengine/SourcesToStartingPoints.scala so Identifier sources also include the MethodRefs that capture them.
  • Extend uses(call) in dataflowengineoss/src/main/scala/io/joern/dataflowengineoss/passes/reachingdef/DdgGenerator.scala so callback-style receivers are considered uses when the receiver resolves to a MethodParameterIn.
  • The current stress testing covers two types of scenarios: multi-level nested closures and multiple sibling callbacks. For multi-level nested closures with N = 4/8/12, the number of paths remains consistently at 1 under the final patch, with execution times around 2–4 ms. This indicates that no path explosion occurs in recursive capture or deeply nested closure scenarios. For sibling callbacks with N = 4/8/12, the baseline implementation yields 0/0/0 paths, indicating a false negative. The fix brings the path count down to a linear 4/8/12, with execution times around 6/11/14 ms. This both restores the expected dataflow coverage and avoids path explosion in sibling callback scenarios.

…s/passes/reachingdef/DdgGenerator.scala

	modified:   dataflowengineoss/src/main/scala/io/joern/dataflowengineoss/queryengine/SourcesToStartingPoints.scala
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant