
asyncio.wait_for(session.initialize()) in session_context.py breaks AnyIO cancel scopes, causing first MCP session creation to always fail and retry #5886

@RayaneKADEM

Description


🔴 Required Information

Please ensure all items in this section are completed to allow for efficient triaging. Requests without complete information may be rejected / deprioritized.
If an item is not applicable to you, please mark it as N/A.

Describe the Bug:
session_context.py calls asyncio.wait_for(session.initialize(), timeout=self._timeout), which wraps the coroutine in a new asyncio task. session.initialize() internally uses AnyIO memory streams established by the anyio.create_task_group() inside streamablehttp_client. Because asyncio.wait_for creates a new asyncio task, AnyIO's cancel scope context is not inherited. When the inner task completes, AnyIO's cancel scope bookkeeping detects the mismatch and raises "unhandled errors in a TaskGroup (1 sub-exception)". As a result, the first MCP session creation always fails. The @retry_on_errors decorator on get_tools catches the resulting ConnectionError and retries; on Python 3.12 the second attempt succeeds because the event loop is quieter at that point. In that case the agent ultimately works, but every cold session creation logs a spurious WARNING and incurs an unnecessary retry.

Note: ADK 1.34 already fixed the identical problem for enter_async_context(self._client) by removing the asyncio.wait_for wrapper under _MCP_GRACEFUL_ERROR_HANDLING. The same fix was not applied to the remaining asyncio.wait_for(session.initialize()) call on the next line.
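To illustrate why the failure is easy to miss, here is a minimal sketch of how a retry decorator can mask a first-attempt error. It is not ADK's @retry_on_errors implementation: retry_once, the stand-in get_tools, and the attempts list are hypothetical names used only for this example.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("retry_sketch")


def retry_once(func):
    """Hypothetical decorator: retry an async callable one time on any error."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except Exception as exc:
            # The first failure only shows up as a log line, then we retry.
            logger.info("Retrying %s due to error: %s", func.__name__, exc)
            return await func(*args, **kwargs)

    return wrapper


attempts = []  # records each attempt so the masked failure is visible


@retry_once
async def get_tools():
    """Stand-in for a call whose first (cold) attempt fails."""
    attempts.append(len(attempts) + 1)
    if len(attempts) == 1:
        raise ConnectionError("Failed to create MCP session")
    return ["tool_a"]


if __name__ == "__main__":
    print(asyncio.run(get_tools()), attempts)
```

The caller sees a successful result, so the only trace of the first failure is the log entry and the extra latency of the second attempt.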

Steps to Reproduce:
Please provide a numbered list of steps to reproduce the behavior:

  1. Install google-adk==1.34, connect an ADK agent to an MCP
    server via StreamableHTTPConnectionParams
  2. Run the agent under uvicorn (e.g. Cloud Run, adk api_server)
  3. Send any request that triggers get_tools()
  4. Observe WARNING + retry in logs on every cold session creation:
WARNING:google_adk.google.adk.tools.mcp_tool.session_context:Error on session runner task: unhandled errors in a TaskGroup (1 sub-exception)
INFO:google_adk.google.adk.tools.mcp_tool.mcp_session_manager:Retrying get_tools due to error: Failed to create MCP session: Failed to create MCP session: unhandled errors in a TaskGroup (1 sub-exception)

Expected Behavior:
The first MCP session creation succeeds without retrying. No
TaskGroup warnings in logs.

Observed Behavior:
Every cold MCP session creation fails on first attempt with
unhandled errors in a TaskGroup (1 sub-exception), logs a
WARNING, and requires a retry. Under Python 3.11 the retry also
fails, making the agent completely non-functional. Under Python
3.12 the retry succeeds but the warning and retry overhead remain
on every cold session.

Environment Details:

  • ADK Library Version (pip show google-adk): google-adk==1.34
  • Desktop OS: Linux (Google Cloud Run)
  • Python Version (python -V): 3.11 (fully broken) / 3.12 (warning + retry, functionally works)

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: gemini-2.5-pro

🟡 Optional Information

Providing this information greatly speeds up the resolution
process.

Regression:
The first asyncio.wait_for (around enter_async_context) was
fixed in ADK 1.34. This second occurrence was present before 1.34
but masked by the first failure. It became the sole remaining issue
after the 1.34 fix.

Logs:
Please attach relevant logs. Wrap them in code blocks (```) or
attach a text file.

WARNING:google_adk.google.adk.tools.mcp_tool.session_context:Error on session runner task: unhandled errors in a TaskGroup (1 sub-exception)
INFO:google_adk.google.adk.tools.mcp_tool.mcp_session_manager:Retrying get_tools due to error: Failed to create MCP session: Failed to create MCP session: unhandled errors in a TaskGroup (1 sub-exception)

Screenshots / Video:
N/A

Additional Context:
The bug is a race condition. It consistently manifests under uvicorn because concurrent event loop activity creates the scheduling window where the asyncio/AnyIO boundary is crossed. It does not reproduce with asyncio.run(), which explains why local testing with adk run may appear clean while the deployed service always hits the warning. The fix is to replace asyncio.wait_for(session.initialize(), timeout=self._timeout) with a plain await session.initialize() inside a with anyio.fail_after(self._timeout): block. This keeps session.initialize() within the AnyIO cancel scope established by streamablehttp_client, the same pattern already applied in ADK 1.34 for the first asyncio.wait_for.

Minimal Reproduction Code:
Please provide a code snippet or a link to a Gist/repo that
isolates the issue.

from google.adk.tools.mcp_tool.mcp_session_manager import (
    MCPSessionManager,
    StreamableHTTPConnectionParams,
)

# Must run under uvicorn / any concurrent async server (not asyncio.run())
manager = MCPSessionManager(
    StreamableHTTPConnectionParams(url="http://your-mcp-server/mcp")
)


# Call this from a coroutine inside the server, e.g. a request handler
# (the function name is illustrative)
async def reproduce():
    # First call always logs WARNING + retries before succeeding
    return await manager.create_session()

How often has this issue occurred?:

  • Always (100%)


Labels

mcp: [Component] Issues about MCP support
tools: [Component] This issue is related to tools
