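fix: manageSqlDatabase provisionMySQL fails when MySQL is not yet provisioned: the tool first calls DescribeMySQLClusterDetail internally, errors out because the instance does not exist, and never issues the create request #687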

Open
binggg wants to merge 1 commit into main from automation/attribution-issue-mojq90qw-rpi3r0-managesqldatabase-provisionmysql-mysql-d

binggg commented Apr 29, 2026

Attribution issue

  • issueId: issue_mojq90qw_rpi3r0
  • category: tool
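  • canonicalTitle: manageSqlDatabase provisionMySQL fails when MySQL is not yet provisioned: the tool first calls DescribeMySQLClusterDetail internally, errors out because the instance does not exist, and never issues the create request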
  • representativeRun: atomic-js-none-chain-mysql-from-none-to-usable/2026-04-29T07-10-48-1inf0c

Automation summary

  • root_cause: manageSqlDatabase provisionMySQL failed when MySQL had not yet been created because getSqlInstanceInfo had three bugs: (1) DescribeCreateMySQLResult was not wrapped in a try-catch, so if it threw an error (e.g., ResourceNotFound) when MySQL had never existed, the error propagated unhandled and the CreateMySQL call was never made; (2) the DescribeMySQLClusterDetail catch handled only FailedOperation.DataSourceNotExist and not other not-found error codes (e.g., ClusterNotFound, ResourceNotFound), so the error was re-thrown instead of being treated as "instance doesn't exist"; (3) when DescribeCreateMySQLResult returned a null/empty status (normalized to PENDING), the DescribeMySQLClusterDetail catch incorrectly set exists: true, so handleProvisionMySQL skipped creating MySQL on the assumption that it already existed.
  • changes: In mcp/src/tools/databaseSQL.ts: (1) Added isNotFoundErrorCode and isNotFoundErrorMessage helper functions that match a broad range of not-found error patterns; (2) Wrapped DescribeCreateMySQLResult in a try-catch that returns {exists: false, status: "NOT_CREATED"} for not-found errors; (3) Broadened `DescribeMySQLClusterDe

Changed files

  • mcp/src/tools/databaseSQL.test.ts
  • mcp/src/tools/databaseSQL.ts
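
The try-catch described in change (2) of the automation summary might look roughly like the sketch below. This is not the PR's actual code: `lookupCreateResult`, the `CreateResultLookup` type, the `describe` wrapper, the assumption that SDK errors expose a `code` property, and the exact codes matched by the simplified `isNotFoundErrorCode` are all hypothetical. Only the `{exists: false, status: "NOT_CREATED"}` result comes from the summary.

```typescript
// Hypothetical result type: either the "not created" info named in the summary,
// or the raw status, left for the caller to interpret.
type CreateResultLookup =
  | { kind: "not_created"; info: { exists: false; status: "NOT_CREATED" } }
  | { kind: "found"; rawStatus: string | null };

// Simplified stand-in for the PR's isNotFoundErrorCode; the matched codes are assumptions.
function isNotFoundErrorCode(code: unknown): boolean {
  if (typeof code !== "string") return false;
  const upper = code.toUpperCase();
  return (
    upper.includes("DATASOURCENOTEXIST") ||
    upper.includes("RESOURCENOTFOUND") ||
    upper.includes("CLUSTERNOTFOUND")
  );
}

// `describe` is an assumed wrapper around the DescribeCreateMySQLResult call.
async function lookupCreateResult(
  describe: () => Promise<{ Status?: string | null }>,
): Promise<CreateResultLookup> {
  try {
    const res = await describe();
    return { kind: "found", rawStatus: res.Status ?? null };
  } catch (err) {
    // Assumes the SDK error carries its API error code as `code`.
    const code = (err as { code?: unknown } | null)?.code;
    if (isNotFoundErrorCode(code)) {
      // MySQL was never created: report that instead of propagating the error,
      // so the caller can go on to issue CreateMySQL.
      return { kind: "not_created", info: { exists: false, status: "NOT_CREATED" } };
    }
    throw err; // anything else is a real failure and must surface
  }
}
```

Re-throwing every error that is not a recognized not-found code keeps genuine failures visible to the caller rather than folding them into the "not created" path.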

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f8cf421b48

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return (
upper.includes("DATASOURCENOTEXIST") ||
upper.includes("RESOURCENOTFOUND") ||
upper.includes("NOTFOUND") ||

P2: Narrow not-found code matching to CloudBase API errors

isNotFoundErrorCode now treats any code containing NOTFOUND as “MySQL not created”, which also matches transport errors like Node’s ENOTFOUND (DNS lookup failure). Because this helper is reused in getSqlInstanceInfo, handleRunQuery, handleRunStatement, and handleInitializeSchema, a temporary network/DNS outage can be misreported as MYSQL_NOT_CREATED and even trigger an unnecessary CreateMySQL attempt instead of surfacing the real infrastructure error.
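
One possible way to narrow the match, sketched under assumptions: compare whole dot-separated segments of the code against an allowlist instead of searching for a NOTFOUND substring. The name `isApiNotFoundCode` and the allowlist contents are hypothetical. The codes that should actually count as "not created" would need to be confirmed against the CloudBase API error reference.

```typescript
// Assumed allowlist of API-level "not found" segments, compared in upper case.
const NOT_FOUND_SEGMENTS = new Set<string>([
  "DATASOURCENOTEXIST",
  "RESOURCENOTFOUND",
  "CLUSTERNOTFOUND",
]);

function isApiNotFoundCode(code: unknown): boolean {
  if (typeof code !== "string") return false;
  // API codes look like "FailedOperation.DataSourceNotExist"; matching whole
  // segments means a transport code such as "ENOTFOUND" never qualifies.
  return code
    .split(".")
    .some((segment) => NOT_FOUND_SEGMENTS.has(segment.trim().toUpperCase()));
}

console.log(isApiNotFoundCode("FailedOperation.DataSourceNotExist")); // true
console.log(isApiNotFoundCode("ENOTFOUND")); // false
```

With exact segment matching, a DNS failure surfaces as the infrastructure error it is, and only codes on the allowlist can lead to a CreateMySQL attempt.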

binggg commented Apr 29, 2026

Attribution post-PR evaluation

  • visibility: internal identifiers, run ids, and private links are intentionally omitted
  • attempt: 1
  • eval_scope: primary_only
  • overall: FAILED
  • summary: at least one planned evaluation case failed
  • updated_at: 2026-04-29T08:01:08.612Z

Cases

  • [FAILED] — primary — evaluation failed

binggg commented Apr 29, 2026

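The direction of this PR is right: first fix the not-found detection in provisionMySQL for the not-yet-provisioned case, so the tool no longer throws before it ever reaches CreateMySQL.

But judging from the latest post-PR eval and trajectory, this is not yet the full root cause. The main failure in the current case has moved to the next stage of the chain: CreateMySQL now returns a TaskId, but the subsequent DescribeCreateMySQLResult stays in doing/PENDING for a long time, or ends as FAILED with FailReason: null. In addition, MySQL is Serverless, so there is a window in which the control plane reports success but data-plane SQL is not ready yet.

I suggest continuing to converge within this PR as a "minimal fix", without expanding it into a refactor: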

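  1. Distinguish two ready states:
    • control-plane ready: DescribeCreateMySQLResult.Status === success
    • sql/data-plane ready: RunSql SELECT 1 succeeds
  2. After provisioning succeeds, add a bounded SQL readiness probe to provisionMySQL and the query path:
    • once the control plane reports success, poll briefly with SELECT 1
    • on success, return sqlReady: true, nextAction: runStatement
    • once the limit is exceeded, stop waiting and return a structured status, such as reason: MYSQL_SQL_NOT_READY, controlPlaneStatus: success, sqlReady: false, nextAction: retry SELECT 1 later
  3. When DescribeCreateMySQLResult returns FAILED, the tool needs to surface Status, FailReason, TaskId, and the raw response in full, and give an explicit nextAction; if FailReason is null, it should also tell the agent explicitly that the backend returned no diagnostics, so that it does not keep polling blindly.
  4. Keep the agent from polling indefinitely on its own: every wait/poll should have a max attempts / deadline, and on failure should give an actionable next step.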
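
Points 2 and 4 could be sketched as a single bounded probe. This is only an illustration under assumptions: `probeSqlReady`, `ProbeOptions`, the injected `runSql` wrapper around RunSql, and the `attempts` / `lastError` fields are hypothetical. The `sqlReady`, `reason`, `controlPlaneStatus`, and `nextAction` values follow the wording of the list above.

```typescript
type SqlReadiness =
  | { sqlReady: true; nextAction: "runStatement"; attempts: number }
  | {
      sqlReady: false;
      reason: "MYSQL_SQL_NOT_READY";
      controlPlaneStatus: "success";
      nextAction: "retry SELECT 1 later";
      attempts: number;
      lastError: string;
    };

interface ProbeOptions {
  maxAttempts: number; // hard upper bound on probes
  delayMs: number; // pause between probes
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call only after the control plane has reported success.
async function probeSqlReady(
  runSql: (sql: string) => Promise<unknown>, // assumed wrapper around RunSql
  opts: ProbeOptions,
): Promise<SqlReadiness> {
  const sleep = opts.sleep ?? defaultSleep;
  let lastError = "";
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      await runSql("SELECT 1");
      return { sqlReady: true, nextAction: "runStatement", attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      // Sleep only if another attempt will follow.
      if (attempt < opts.maxAttempts) await sleep(opts.delayMs);
    }
  }
  // Bound reached: return a structured status instead of waiting forever.
  return {
    sqlReady: false,
    reason: "MYSQL_SQL_NOT_READY",
    controlPlaneStatus: "success",
    nextAction: "retry SELECT 1 later",
    attempts: opts.maxAttempts,
    lastError,
  };
}
```

Because the function always returns within `maxAttempts` probes, the agent receives either a usable database or an explicit next step, never an open-ended wait.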

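This keeps the PR at its current minimal scope: not a rewrite of the MySQL tool, but filling in the "control-plane ready" versus "SQL ready" semantics for Serverless MySQL, so that the from-none-to-usable case stops failing at random because of cold starts or silent failures.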
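
For the silent-failure side (point 3), a FAILED result could be turned into a structured report along these lines. This is a sketch only: `reportCreateFailure`, the `CreateFailureReport` shape, and the `reason` and `nextAction` strings are hypothetical. Status, FailReason, and TaskId are the response fields named in the comment, and the rest of the response is assumed unknown.

```typescript
interface CreateResultResponse {
  Status?: string | null;
  FailReason?: string | null;
  TaskId?: string | null;
}

interface CreateFailureReport {
  reason: "MYSQL_CREATE_FAILED"; // hypothetical reason code
  status: string;
  failReason: string | null;
  taskId: string | null;
  diagnosticsAvailable: boolean;
  nextAction: string;
  rawResponse: CreateResultResponse;
}

// Returns a report for a terminal FAILED status, or null for any other status.
function reportCreateFailure(res: CreateResultResponse): CreateFailureReport | null {
  const status = res.Status ?? "";
  if (status.toUpperCase() !== "FAILED") return null;
  const failReason = res.FailReason ?? null;
  return {
    reason: "MYSQL_CREATE_FAILED",
    status,
    failReason,
    taskId: res.TaskId ?? null,
    // A null FailReason means the backend gave no diagnostics; say so explicitly.
    diagnosticsAvailable: failReason !== null,
    nextAction:
      failReason !== null
        ? "stop polling and act on failReason"
        : "stop polling; the backend returned no diagnostics, so report taskId and rawResponse",
    rawResponse: res,
  };
}
```

Carrying `rawResponse` alongside the normalized fields lets the agent or a human see exactly what the backend returned, even when FailReason is null.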
