sp_QuickieStore / sp_QuickieCache / sp_HumanEventsBlockViewer improvements by erikdarlingdata · Pull Request #766 · erikdarlingdata/DarlingData

erikdarlingdata · 2026-04-24T18:32:43Z

Summary

Three proc changes bundled together:

- sp_QuickieStore: new @primary_window filter (business / off-hours / weekend, case-insensitive prefix) for @find_high_impact, and a new resource_metrics XML column that collapses eight total_* columns plus max_dop into one clickable rollup with avg/min/max per metric.
- sp_QuickieCache: same resource_metrics XML column on the main result set (twelve columns → one). No time-of-day filter, because plan cache DMVs don't carry per-interval execution metadata.
- sp_HumanEventsBlockViewer: check_id 9 "Top Blocking Query" rollup rewritten to attribute each blocked process report's (BPR's) victim wait time up the chain to the lead blocker. Intermediate blockers that were themselves stuck behind other queries no longer show up inflated with waits they didn't cause. Finding group renamed to "Top Lead Blocker".

Both XML rollups use FOR XML PATH(N'metrics'), TYPE with attribute-path aliases — native XML, no string concatenation, no STRING_AGG.

Test plan

- sp_QuickieStore: installed cleanly on sql2022, @primary_window = 'b' / 'off' / 'weekend' each return only rows matching the bucket against StackOverflow2013
- sp_QuickieStore: resource_metrics XML is well-formed, all expected child elements present
- sp_QuickieStore: validation errors fire for bad @primary_window values and when used without @find_high_impact = 1
- sp_QuickieCache: installed cleanly on sql2022, smoke-tested end-to-end, XML well-formed
- sp_HumanEventsBlockViewer: installed cleanly on sql2022 and sql2017
- sp_HumanEventsBlockViewer: run against real hammerdb_tpcc BPR data, #session_leads debug dump shows correct (monitor_loop, lead, session) mappings for chains of depth 2-4
- sp_HumanEventsBlockViewer: rollup surfaces three neword/delivery procs accounting for 100% of chain wait time (43.3% + 30.2% + 26.5%); previously-inflated intermediate blockers absent

🤖 Generated with Claude Code

…e @find_high_impact

New @primary_window parameter narrows @find_high_impact results to queries whose majority activity falls in a single window. Accepts any prefix of business / off-hours / weekend (case-insensitive; b/o/w is enough). Validated up front: errors out unless @find_high_impact = 1 and the value starts with b, o, or w.

The filter is applied in the final dynamic SELECT as WHERE o.primary_window LIKE 'Business%' (or 'Off-hours%' / 'Weekend%'), leveraging the existing primary_window classification with its >50% rule. Queries whose primary_window is 'Spread' are excluded by design.

The eight individual total_* columns plus max_dop in the @find_high_impact result set are replaced with a single resource_metrics XML column. Built in Step 6 of the #hi_output insert from #hi_scored using FOR XML PATH(N'metrics'), TYPE with attribute-path aliases: native xml output, no string concatenation. The XML also surfaces the avg/min/max per-execution metrics that #hi_query_stats already computes but that previously weren't projected.

Shape:

    <metrics>
      <cpu total_ms avg_ms min_ms max_ms/>
      <duration total_ms avg_ms min_ms max_ms/>
      <physical_reads total_mb avg_mb min_mb max_mb/>
      <writes total_mb avg_mb min_mb max_mb/>
      <memory total_mb avg_mb min_mb max_mb/>
      <tempdb total_mb avg_mb/>
      <executions total/>
      <rows total avg/>
      <parallelism max_dop/>
    </metrics>

Share columns (cpu_share, duration_share, etc.) are kept as dedicated sortable columns rather than folded into the XML. The underlying total_* and max_dop columns remain in #hi_output storage so debug dumps are unaffected; only the final user-facing SELECT changed.

Help text updated: new @primary_window description/valid_inputs/defaults, high_impact_columns section mentions resource_metrics and the filter hint, debug parameter dump includes @primary_window.

Smoke-tested on sql2022 against StackOverflow2013: all three bucket filters return only rows in that bucket, XML is well-formed with the expected element/attribute shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
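
As a rough illustration of the @primary_window behavior described in this commit message, the Python sketch below resolves a user-supplied prefix to a bucket and then applies a prefix filter. It is not the procedure's T-SQL. The function names, the sample rows, and the "Business (71%)" label format are assumptions made for the example, and Python's `startswith` is case-sensitive where `LIKE` follows the column collation.

```python
# Hypothetical sketch of the @primary_window validation and filter.
# Names and sample data are illustrative, not taken from sp_QuickieStore.

WINDOW_BY_INITIAL = {"b": "Business", "o": "Off-hours", "w": "Weekend"}


def resolve_primary_window(value, find_high_impact):
    """Map a case-insensitive prefix (b/o/w is enough) to a bucket name."""
    if not find_high_impact:
        raise ValueError("@primary_window requires @find_high_impact = 1")
    # Only the first letter is checked, mirroring the "starts with b, o, or w" rule.
    initial = (value or "").strip()[:1].lower()
    if initial not in WINDOW_BY_INITIAL:
        raise ValueError("@primary_window must start with b, o, or w")
    return WINDOW_BY_INITIAL[initial]


def filter_by_window(rows, bucket):
    """Rough analogue of: WHERE o.primary_window LIKE '<bucket>%'."""
    return [row for row in rows if row["primary_window"].startswith(bucket)]


# Assumed sample data: the label format after the bucket name is a guess.
rows = [
    {"query_id": 1, "primary_window": "Business (71%)"},
    {"query_id": 2, "primary_window": "Spread"},
    {"query_id": 3, "primary_window": "Weekend (64%)"},
]

bucket = resolve_primary_window("B", find_high_impact=True)
business_only = filter_by_window(rows, bucket)
```

Because the filter is a prefix match on the bucket name, rows classified as 'Spread' never match any of the three buckets, which is how they end up excluded.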

… XML column

Mirrors the compact XML rollup just added to sp_QuickieStore @find_high_impact. Twelve individual columns (total_cpu_ms, total_duration_ms, total_physical_reads, total_logical_writes, total_grant_mb, total_spills, max_grant_mb, max_used_grant_mb, max_spills, max_dop, min_rows, max_rows) are replaced with a single clickable resource_metrics xml column in the main result set.

Built during the #scored insert using FOR XML PATH(N'metrics'), TYPE with attribute-path aliases: native xml, no STRING_AGG and no string concatenation. avg_* attributes are computed inline as total / NULLIF(total_executions, 0) and surface per-execution metrics that previously weren't exposed anywhere in the output.

Shape:

    <metrics>
      <cpu total_ms avg_ms min_ms max_ms/>
      <duration total_ms avg_ms min_ms max_ms/>
      <physical_reads total avg min max/>
      <logical_writes total avg/>
      <rows total avg min max/>
      <grant total_mb avg_mb max_mb/>
      <used_grant total_mb avg_mb max_mb/>
      <spills total avg max/>
      <executions total/>
      <parallelism max_dop/>
    </metrics>

No time-of-day / primary_window filter here (unlike sp_QuickieStore): the plan cache DMVs don't carry per-interval execution metadata, so there's nothing to classify Business / Off-hours / Weekend activity against. Share columns (cpu_share, duration_share, etc.) remain as dedicated sortable columns.

Smoke-tested on sql2022: installed cleanly, ran end-to-end, resource_metrics is well-formed XML with all expected child elements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
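
To make the rollup idea concrete, here is a small Python sketch that builds a trimmed version of the metrics element and computes averages the way total / NULLIF(total_executions, 0) does. It is an analogy, not the procedure's code: the helper names and sample values are invented, and only three of the child elements are shown. Attributes with no value are left out, which is also what FOR XML PATH does with NULL attribute values by default.

```python
# Hypothetical sketch: an attribute-centric <metrics> rollup with
# NULLIF-style averages. Helper names and sample values are assumptions.
import xml.etree.ElementTree as ET


def safe_avg(total, executions):
    """Mimic total / NULLIF(executions, 0): yield None instead of dividing by zero."""
    if total is None or not executions:
        return None
    return total / executions


def build_metrics(row):
    """Return a compact XML string with one child element per metric."""
    root = ET.Element("metrics")

    def add(tag, **attrs):
        # Skip attributes whose value is None, like a NULL column in FOR XML PATH.
        present = {name: str(value) for name, value in attrs.items() if value is not None}
        ET.SubElement(root, tag, present)

    executions = row["total_executions"]
    add("cpu",
        total_ms=row["total_cpu_ms"],
        avg_ms=safe_avg(row["total_cpu_ms"], executions))
    add("executions", total=executions)
    add("parallelism", max_dop=row["max_dop"])
    return ET.tostring(root, encoding="unicode")


# Assumed sample rows, one with zero executions to exercise the NULLIF path.
busy = build_metrics({"total_cpu_ms": 1200, "total_executions": 4, "max_dop": 8})
idle = build_metrics({"total_cpu_ms": 0, "total_executions": 0, "max_dop": 1})
```

In the zero-execution row the avg_ms attribute is simply absent rather than raising an error, which is the behavior the NULLIF guard gives the T-SQL version.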

@WaitTime

The "Top Blocking Query" rollup previously summed each BPR's victim wait_time against the direct blocker's sql_handle. In a chain like A blocks B blocks C, B would show up as a blocker "responsible" for C's wait, but B was itself stuck behind A, so blaming B is misleading. The only query that needs tuning is A.

This commit rewrites the rollup so every BPR's victim wait cascades up to the chain's lead blocker (level-0 session in the monitor_loop):

 - New #session_leads temp table materializes a (monitor_loop, lead_desc, session_desc) map via a recursive CTE. Anchor rows are lead blockers (sessions that never appear as a blocked_desc in the same monitor_loop); recursion walks downstream, keeping lead_desc constant so every descendant inherits the same root.
 - Cycle guard mirrors the existing hierarchy CTE pattern (lead_path LIKE check, MAXRECURSION 100).
 - Fallback pass inserts any blocking_desc not reached by the recursion as its own lead. This catches true cycle cases (mutual blocking before deadlock detection fires) so their waits don't silently drop.
 - bpr_with_lead CTE joins #blocking to #session_leads on session_desc to find each BPR's chain lead, extracts the victim's wait time from the blocked-process/@WaitTime XPath (the correct "how long did the victim wait" value), and applies the @database_name / @object_name filters at the BPR level rather than the chain level so we can still trace waits that cascade across objects.
 - lead_sql CTE pulls a representative sql_handle / stmtstart / stmtend / query_text_pre per (monitor_loop, lead_desc) from any #blocking row where the lead appears as the blocking-process.
 - per_victim CTE dedupes to (lead_sql_identity, victim_tx) with peak wait, matching the existing dedup pattern across repeat BPR fires.
 - Final INSERT groups by (database_name, lead_sql_handle, stmtstart, stmtend), keeps the existing "must be >= 10% of total" HAVING filter, and renames finding_group to 'Top Lead Blocker'.

The finding text now reads 'This lead blocker accounted for ... across N blocked sessions in its chain.' to make the cascaded-attribution semantic explicit.

Intermediate blockers (sessions that blocked downstream but were themselves victims) no longer appear in the rollup, by design. Their apparent blocking time cascades up to whoever is actually holding the lock at the top of the chain.

Smoke-tested on sql2022 against real hammerdb_tpcc BPR data: the #session_leads debug dump shows correct (monitor_loop, lead, session) mappings for chains of depth 2-4, and the rollup surfaces three neword/delivery procs that together account for 100% of chain wait time (43.3% + 30.2% + 26.5%). Intermediate blockers that previously clogged this list no longer appear. Also compiled cleanly on sql2017 as a syntax sanity check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
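
The attribution steps in this commit message can be sketched in Python to show the shape of the algorithm. This is an illustration under simplifying assumptions, not a port of the T-SQL: it uses an iterative walk instead of a recursive CTE, keys the rollup on the lead session name rather than on sql_handle / stmtstart / stmtend, dedupes on the blocked session rather than victim_tx, and uses invented field names (`blocking`, `blocked`, `wait_ms`).

```python
# Hypothetical sketch of lead-blocker attribution over blocked process reports.
from collections import defaultdict


def map_session_leads(reports):
    """Return {(monitor_loop, session): lead} for every session in a chain."""
    children = defaultdict(list)  # (loop, blocker) -> sessions it blocks
    blockers, blocked = set(), set()
    for r in reports:
        key = (r["monitor_loop"], r["blocking"])
        children[key].append(r["blocked"])
        blockers.add(key)
        blocked.add((r["monitor_loop"], r["blocked"]))

    leads = {}
    # Anchor: blockers that never appear as a victim in the same monitor_loop.
    for loop, lead in sorted(blockers - blocked):
        stack = [lead]
        while stack:
            session = stack.pop()
            if (loop, session) in leads:  # guard against revisits and cycles
                continue
            leads[(loop, session)] = lead
            stack.extend(children.get((loop, session), []))
    # Fallback: blockers never reached (pure cycles) become their own lead.
    for key in sorted(blockers):
        leads.setdefault(key, key[1])
    return leads


def top_lead_blockers(reports, min_share=0.10):
    """Return {lead: share of total wait} for leads at or above min_share."""
    leads = map_session_leads(reports)
    peak = {}  # (lead, victim) -> peak wait across repeated reports
    for r in reports:
        lead = leads[(r["monitor_loop"], r["blocking"])]
        key = (lead, r["blocked"])
        peak[key] = max(peak.get(key, 0), r["wait_ms"])
    totals = defaultdict(int)
    for (lead, _victim), wait in peak.items():
        totals[lead] += wait
    grand = sum(totals.values())
    if not grand:
        return {}
    return {lead: wait / grand for lead, wait in totals.items()
            if wait / grand >= min_share}


# Assumed sample: A blocks B blocks C within one monitor_loop.
chain = [
    {"monitor_loop": 1, "blocking": "A", "blocked": "B", "wait_ms": 4000},
    {"monitor_loop": 1, "blocking": "B", "blocked": "C", "wait_ms": 6000},
]
result = top_lead_blockers(chain)  # all wait is attributed to A; B is absent
```

With the sample chain, both reports resolve to lead A, so B never appears in the result even though it was the direct blocker of C. A mutual-blocking pair with no anchor would fall through to the fallback pass and each session would be kept as its own lead.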

erikdarlingdata and others added 4 commits April 24, 2026 14:05

Merge remote-tracking branch 'origin/main' into dev

fed985a

erikdarlingdata merged commit 38d112f into main Apr 24, 2026
9 checks passed

erikdarlingdata merged 4 commits into main from dev
