
Commit 377d62d (authored by lpcox and Copilot)

refactor: use gh aw logs for token analysis workflows (#1884)

Replace manual artifact downloading (`gh run download --name firewall-audit-logs`) with `gh aw logs --json` in all 4 token analysis workflows. This aligns with the approach used by gh-aw's own token-audit and token-optimizer workflows.

Benefits:
- Eliminates the dependency on knowing the correct artifact name (the root cause of the bug fixed in PR #1883)
- Gets pre-aggregated structured data (token counts, costs, turns) instead of parsing raw token-usage.jsonl
- Handles artifact naming backward/forward compatibility automatically
- Pre-downloads data in the `steps:` block (faster, more reliable)

Changes:
- copilot-token-usage-analyzer.md: replaced manual run discovery and artifact download with a `gh aw logs --engine copilot --json` pre-step
- claude-token-usage-analyzer.md: the same for `--engine claude`
- copilot-token-optimizer.md: added a `gh aw logs --engine copilot` pre-step for 7-day data and removed the manual artifact download
- claude-token-optimizer.md: the same for `--engine claude`
- shared/mcp/gh-aw.md: new shared component that installs the gh-aw CLI (mirroring upstream gh-aw's shared/mcp/gh-aw.md)

Refs: #1883, github/gh-aw#25683

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

1 parent 59d8840 · commit 377d62d

9 files changed: 494 additions, 397 deletions

.github/workflows/claude-token-optimizer.lock.yml

26 additions, 16 deletions (generated lock file; diff not rendered by default)

.github/workflows/claude-token-optimizer.md

62 additions, 56 deletions
```diff
@@ -14,6 +14,8 @@ permissions:
   actions: read
   issues: read
   pull-requests: read
+imports:
+  - uses: shared/mcp/gh-aw.md
 network:
   allowed:
     - github
```
```diff
@@ -23,11 +25,39 @@ tools:
   bash: true
 safe-outputs:
   create-issue:
-    title-prefix: " Claude Token Optimization"
+    title-prefix: "⚡ Claude Token Optimization"
     labels: [claude-token-optimization]
     close-older-issues: true
 timeout-minutes: 10
 strict: true
+steps:
+  - name: Download recent Claude workflow logs
+    env:
+      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    run: |
+      set -euo pipefail
+      mkdir -p /tmp/gh-aw/token-audit
+
+      echo "📥 Downloading Claude workflow logs (last 7 days)..."
+
+      LOGS_EXIT=0
+      gh aw logs \
+        --engine claude \
+        --start-date -7d \
+        --json \
+        -c 50 \
+        > /tmp/gh-aw/token-audit/claude-logs.json || LOGS_EXIT=$?
+
+      if [ -s /tmp/gh-aw/token-audit/claude-logs.json ]; then
+        TOTAL=$(jq '.runs | length' /tmp/gh-aw/token-audit/claude-logs.json)
+        echo "✅ Downloaded $TOTAL Claude workflow runs (last 7 days)"
+        if [ "$LOGS_EXIT" -ne 0 ]; then
+          echo "⚠️ gh aw logs exited with code $LOGS_EXIT (partial results)"
+        fi
+      else
+        echo "❌ No log data downloaded (exit code $LOGS_EXIT)"
+        echo '{"runs":[],"summary":{}}' > /tmp/gh-aw/token-audit/claude-logs.json
+      fi
 ---

 # Daily Claude Token Optimization Advisor
```
```diff
@@ -57,7 +87,7 @@ From the report's **Workflow Summary** table, identify the workflow with:

 Extract these key metrics for the target workflow:
 - Total tokens per run
-- Cache hit rate (read and write separately Anthropic exposes both)
+- Cache hit rate (read and write separately — Anthropic exposes both)
 - Cache write rate
 - Input/output ratio
 - Number of LLM turns (request count)
```
````diff
@@ -81,53 +111,32 @@ cat "$WORKFLOW_FILE"
 ```

 Analyze:
-- **Tools loaded** List all tools in the `tools:` section. Flag any that may not be needed.
-- **Network groups** List network groups in `network.allowed:`. Flag unused ones.
-- **Prompt length** Estimate the markdown body size. Is it verbose?
-- **Pre-agent steps** Does it use `steps:` to pre-compute deterministic work?
-- **Post-agent steps** Does it use `post-steps:` for validation?
+- **Tools loaded** — List all tools in the `tools:` section. Flag any that may not be needed.
+- **Network groups** — List network groups in `network.allowed:`. Flag unused ones.
+- **Prompt length** — Estimate the markdown body size. Is it verbose?
+- **Pre-agent steps** — Does it use `steps:` to pre-compute deterministic work?
+- **Post-agent steps** — Does it use `post-steps:` for validation?

-## Step 4: Analyze Recent Run Artifacts
+## Step 4: Analyze Recent Run Data

-Download the most recent successful run's artifacts to understand actual tool usage:
+The pre-agent step downloaded the last 7 days of Claude workflow logs to `/tmp/gh-aw/token-audit/claude-logs.json`. Filter this data for the target workflow:

 ```bash
-# Find the latest successful run using the resolved workflow file
-LOCK_FILE="$(basename "$WORKFLOW_FILE" .md).lock.yml"
-RUN_ID=$(gh run list --repo "$GITHUB_REPOSITORY" \
-  --workflow "$LOCK_FILE" \
-  --status success --limit 1 \
-  --json databaseId --jq '.[0].databaseId')
-
-if [ -z "$RUN_ID" ] || [ "$RUN_ID" = "null" ]; then
-  echo "No successful runs found for $LOCK_FILE — skipping artifact analysis"
-else
-  # Download artifacts
-  TMPDIR=$(mktemp -d)
-  gh run download "$RUN_ID" --repo "$GITHUB_REPOSITORY" \
-    --name agent-artifacts --dir "$TMPDIR" 2>/dev/null || \
-  gh run download "$RUN_ID" --repo "$GITHUB_REPOSITORY" \
-    --name agent --dir "$TMPDIR" 2>/dev/null
-
-  # Check token usage
-  find "$TMPDIR" -name "token-usage.jsonl" -exec cat {} \;
-
-  # Check agent stdio log for tool calls (|| true to handle no matches)
-  find "$TMPDIR" -name "agent-stdio.log" -exec grep -h "^●" {} \; || true
-
-  # Check prompt size
-  find "$TMPDIR" -name "prompt.txt" -exec wc -c {} \;
-fi
+# Extract runs for the target workflow
+cat /tmp/gh-aw/token-audit/claude-logs.json | \
+  jq --arg name "$WORKFLOW_NAME" '[.runs[] | select(.workflow_name == $name)]'
 ```

-From the artifacts, determine:
-- **Which tools were actually called** vs which are loaded
-- **How many LLM turns** were used
-- **Per-turn token breakdown** (first turn is usually the most expensive)
-- **Cache write vs cache read ratio** — Anthropic charges 12.5x more for cache writes than reads
-- **Whether cache writes are amortized** — Are they reused across subsequent turns?
+From the run data, determine:
+- **Per-run token breakdown** (token_usage, estimated_cost per run)
+- **Average turns** per run
+- **Error/warning patterns**
+- **Token usage summary** (per-model breakdown from `token_usage_summary` if available)
+- **Cache write vs read ratio** — Anthropic charges 12.5x more for cache writes than reads
+
+Also check the `tool_usage` and `mcp_tool_usage` fields in the JSON to identify which tools are actually being used vs loaded.

-Clean up: `rm -rf "$TMPDIR"`
+Clean up is not needed — data is pre-downloaded to /tmp.

 ## Step 5: Generate Optimization Recommendations

````

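The jq selection and per-run aggregation in the hunk above can be mirrored offline for a quick sanity check. Below is a minimal Python sketch of the same logic; the `runs`, `workflow_name`, `token_usage`, `estimated_cost`, and `turns` field names follow the prompt's description of the `gh aw logs --json` output, and the sample data is invented for illustration:

```python
# Invented sample mirroring the shape the prompt describes for
# /tmp/gh-aw/token-audit/claude-logs.json (field names are assumptions).
sample = {
    "runs": [
        {"workflow_name": "daily-plan", "token_usage": 120000, "estimated_cost": 0.42, "turns": 6},
        {"workflow_name": "daily-plan", "token_usage": 80000, "estimated_cost": 0.30, "turns": 4},
        {"workflow_name": "ci-doctor", "token_usage": 50000, "estimated_cost": 0.15, "turns": 3},
    ],
    "summary": {},
}

def filter_runs(data, name):
    # Same selection as:
    # jq --arg name "$WORKFLOW_NAME" '[.runs[] | select(.workflow_name == $name)]'
    return [r for r in data["runs"] if r["workflow_name"] == name]

runs = filter_runs(sample, "daily-plan")
total_tokens = sum(r["token_usage"] for r in runs)
avg_turns = sum(r["turns"] for r in runs) / len(runs)
print(len(runs), total_tokens, avg_turns)  # 2 200000 5.0
```

The same totals and averages feed directly into the "Per-run token breakdown" and "Average turns" bullets above.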
```diff
@@ -137,18 +146,16 @@ Produce **specific, implementable recommendations** based on these patterns:
 If many tools are loaded but few are used:
 - List which tools to remove from `tools:` in the workflow `.md`
 - Estimate token savings (each tool schema is ~500-700 tokens)
-- Example: "Remove `agentic-workflows:`, `web-fetch:` saves ~15K tokens/turn"
+- Example: "Remove `agentic-workflows:`, `web-fetch:` — saves ~15K tokens/turn"

 ### Pre-Agent Steps
 If the workflow does deterministic work (API calls, file creation, data fetching) inside the agent:
 - Identify which operations could move to `steps:` (pre-agent)
 - Show example `steps:` configuration
-- Example: "Move `gh pr list` to a pre-step, inject results via `${{ steps.X.outputs.Y }}`"

 ### Prompt Optimization
 If the prompt is verbose or contains data the agent doesn't need:
 - Suggest specific cuts or rewrites
-- Example: "Replace 15-line test instructions with 3-line summary referencing pre-computed results"

 ### GitHub Toolset Restriction
 If `github:` tools are loaded without `toolsets:` restriction:
```
```diff
@@ -163,8 +170,7 @@ If unused network groups are configured (e.g., `node`, `playwright`):
 Anthropic charges significantly more for cache writes than reads:
 - Cache write: $3.75/M tokens (Sonnet), cache read: $0.30/M tokens
 - If cache writes are high but not reused across turns, the caching cost may exceed the benefit
-- Check if `cache_write_tokens` from Turn 1 are reflected as `cache_read_tokens` in Turn 2+
-- If prompts change substantially between turns, caching provides no benefit
+- Check if prompts change substantially between turns

 ### Cache Read Optimization
 If cache hit rate is low (<50%):
```
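The write/read price gap in the hunk above can be made concrete with a little arithmetic. A hedged sketch using the Sonnet prices quoted in the diff ($3.75/M cache write, $0.30/M cache read); the $3.00/M regular-input price used for comparison is an assumption, not something stated here:

```python
WRITE_PER_M = 3.75  # cache write, $ per 1M tokens (Sonnet, quoted in the diff)
READ_PER_M = 0.30   # cache read, $ per 1M tokens (quoted in the diff)
INPUT_PER_M = 3.00  # regular input, $ per 1M tokens (assumed, for comparison only)

def cached_cost(millions, reuses):
    # Write the prompt prefix to cache once, then read it back `reuses` times.
    return millions * WRITE_PER_M + reuses * millions * READ_PER_M

def uncached_cost(millions, reuses):
    # Resend the same prefix as plain input on every turn instead.
    return (1 + reuses) * millions * INPUT_PER_M

# A 50K-token cached prefix reused on 3 later turns:
print(round(cached_cost(0.05, 3), 4))    # 0.2325
print(round(uncached_cost(0.05, 3), 4))  # 0.6
print(round(WRITE_PER_M / READ_PER_M, 2))  # 12.5
```

Under these assumed numbers caching pays for itself after a single reuse, while a prefix that changes every turn (zero reuses) costs more to write than to resend, which is exactly why the checklist asks whether prompts change substantially between turns.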
```diff
@@ -174,7 +180,7 @@ If cache hit rate is low (<50%):

 ## Step 6: Create the Optimization Issue

-Create an issue with title: `YYYY-MM-DD <workflow-name>`
+Create an issue with title: `YYYY-MM-DD — <workflow-name>`

 Body structure:

```
```diff
@@ -248,11 +254,11 @@ Body structure:

 ## Important Guidelines

-- **Be concrete** Every recommendation must include specific file changes, not just "reduce tools"
-- **Estimate savings** Quantify each recommendation in tokens and percentage
-- **Prioritize by impact** Order recommendations from highest to lowest token savings
-- **Include implementation steps** Someone should be able to follow your recommendations without additional research
-- **Reference the report** Link back to the source token usage report issue
-- **One workflow per issue** Focus on the single most expensive workflow
-- **Anthropic-specific insights** Leverage cache write data that Copilot workflows don't expose
-- **Clean up** temporary files after analysis
+- **Be concrete** — Every recommendation must include specific file changes, not just "reduce tools"
+- **Estimate savings** — Quantify each recommendation in tokens and percentage
+- **Prioritize by impact** — Order recommendations from highest to lowest token savings
+- **Include implementation steps** — Someone should be able to follow your recommendations without additional research
+- **Reference the report** — Link back to the source token usage report issue
+- **One workflow per issue** — Focus on the single most expensive workflow
+- **Anthropic-specific insights** — Leverage cache write data that Copilot workflows don't expose
+- **Use pre-downloaded data** — All run data is at `/tmp/gh-aw/token-audit/claude-logs.json`. Do not download artifacts manually.
```
