Python: Add agent-framework-azure-cosmos-memory context provider#6719
TheovanKraay wants to merge 10 commits into
Pull request overview
Adds a new Python integration package, agent-framework-azure-cosmos-memory, introducing a CosmosMemoryContextProvider that persists and recalls long-term memories via Azure Cosmos DB using the azure-cosmos-agent-memory toolkit (including user-summary injection and retrieval-time context augmentation).
Changes:
- Introduces CosmosMemoryContextProvider (async context manager + before_run/after_run hooks) plus package exports.
- Adds documentation and samples demonstrating basic usage and an interactive Foundry-backed chat experience.
- Adds unit tests (mocked client) and live-Azure integration tests (pytest markers).
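To make the hook lifecycle concrete, here is a minimal, self-contained sketch of a provider shaped like the one described: an async context manager with a before_run hook that retrieves memories and a summary, and an after_run hook that stores the turn. The PR description states that search and summary failures are handled separately, which the sketch models with two independent try/except blocks. Everything here is an assumption for illustration: FakeMemoryClient, SketchMemoryProvider, and their method signatures are hypothetical and are not the Agent Framework's ContextProvider contract or the toolkit's client API.

```python
import asyncio
import logging
from typing import Any, Dict, List, Tuple

logger = logging.getLogger("cosmos_memory_sketch")


class FakeMemoryClient:
    """Hypothetical in-memory stand-in; NOT the toolkit's real client API."""

    def __init__(self, fail_search: bool = False) -> None:
        self.fail_search = fail_search
        self.stored: List[Tuple[str, str]] = []

    async def search(self, user_id: str, query: str) -> List[str]:
        if self.fail_search:
            raise RuntimeError("search unavailable")
        return [f"{user_id} prefers tea"]

    async def get_summary(self, user_id: str) -> str:
        return f"Summary for {user_id}"

    async def store(self, user_id: str, text: str) -> None:
        self.stored.append((user_id, text))


class SketchMemoryProvider:
    """Illustrative provider shape: async context manager plus two hooks."""

    def __init__(self, client: FakeMemoryClient) -> None:
        self._client = client

    async def __aenter__(self) -> "SketchMemoryProvider":
        return self

    async def __aexit__(self, *exc_info: Any) -> None:
        # A real provider would release its client here; the fake needs nothing.
        return None

    async def before_run(self, user_id: str, query: str) -> Dict[str, Any]:
        context: Dict[str, Any] = {"memories": [], "summary": None}
        # Each lookup has its own try/except so one failure cannot block the other.
        try:
            context["memories"] = await self._client.search(user_id, query)
        except Exception:
            logger.warning("memory search failed", exc_info=True)
        try:
            context["summary"] = await self._client.get_summary(user_id)
        except Exception:
            logger.warning("summary lookup failed", exc_info=True)
        return context

    async def after_run(self, user_id: str, turns: List[str]) -> None:
        # Persist each turn of the finished run.
        for text in turns:
            await self._client.store(user_id, text)


async def demo() -> Dict[str, Any]:
    client = FakeMemoryClient(fail_search=True)
    async with SketchMemoryProvider(client) as provider:
        context = await provider.before_run("alice", "what do I drink?")
        await provider.after_run("alice", ["I switched to coffee."])
    return {"context": context, "stored": client.stored}
```

Running asyncio.run(demo()) with the failing search still yields the summary, which is the point of keeping the two lookups independent.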
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| python/packages/azure-cosmos-memory/agent_framework_azure_cosmos_memory/_context_provider.py | Implements the Cosmos-backed memory context provider (retrieval + storage + user-summary injection + flush). |
| python/packages/azure-cosmos-memory/agent_framework_azure_cosmos_memory/__init__.py | Exports CosmosMemoryContextProvider and package version. |
| python/packages/azure-cosmos-memory/pyproject.toml | Defines the new package, dependencies, and pytest/tooling configuration (Python >=3.11). |
| python/packages/azure-cosmos-memory/README.md | End-user documentation, configuration guidance, and usage examples. |
| python/packages/azure-cosmos-memory/AGENTS.md | Package-level developer guidance and key behaviors (user_id/thread_id, flush). |
| python/packages/azure-cosmos-memory/LICENSE | Package license. |
| python/packages/azure-cosmos-memory/samples/basic_usage.py | Minimal “raw hooks” sample calling before_run()/after_run() directly. |
| python/packages/azure-cosmos-memory/samples/interactive_chat.py | Interactive CLI sample demonstrating an agent wired with Foundry + Cosmos memory provider. |
| python/packages/azure-cosmos-memory/tests/test_context_provider.py | Unit tests for provider behavior with a mocked memory client (incl. context manager + flush). |
| python/packages/azure-cosmos-memory/tests/test_integration.py | Live-Azure integration tests gated by env vars and markers. |
| python/packages/azure-cosmos-memory/tests/conftest.py | Pytest marker registration for the package’s tests. |
Introduces CosmosMemoryContextProvider, a ContextProvider that wraps the azure-cosmos-agent-memory toolkit to give agents long-term, Cosmos DB-backed memory (fact/procedural recall + user summaries). Includes package scaffolding, unit tests (mocked client), live Azure integration tests (marked), samples, README, and AGENTS.md. Draft: uv.lock is intentionally left unchanged. This package depends on azure-cosmos-agent-memory (requires Python >=3.11), which is unsatisfiable against the workspace's current >=3.10 floor, so adding it to the shared lock requires a workspace decision (raise floor to 3.11 or exclude from workspace). Test coverage to be expanded.
The package depends on azure-cosmos-agent-memory which requires Python >=3.11 and a prompty pre-release (>=2.0.0a9). Both are unsatisfiable against the workspace's >=3.10 floor and pre-release policy, causing uv sync to fail in every Python CI job. Exclude the package from the shared workspace so it is resolved and tested as a standalone package.
- Strip trailing whitespace from package files (pre-commit trailing-whitespace hook).
- Exclude the package README from markdown-code-lint: the package is excluded from the uv workspace, so its README snippets import a module that is not installed in the workspace env and Pyright cannot resolve it.
ab30779 to 4bc5c8e
- Wire credential into Cosmos and AI Foundry clients; let toolkit own DefaultAzureCredential when none supplied (remove dead import).
- Honor auto_extract=False by zeroing extraction/summary cadence thresholds.
- Skip whitespace-only conversation turns and store stripped content.
- Show confidence 0.0 and coerce confidence to float in _format_memories.
- Register both 'integration' and 'azure' pytest markers accurately.
- Fix duplicated install block in README.
- Update and extend unit tests for new credential wiring and fixes.
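Two of these fixes are easy to get wrong, so a small sketch may help: skipping whitespace-only turns while storing stripped content, and showing a confidence of 0.0 instead of dropping it through a truthiness check. The helper names and the memory field names ("content", "confidence") are assumptions for illustration; the PR's actual _format_memories implementation is not shown on this page.

```python
from typing import Any, Mapping, Optional


def clean_turn_text(text: Optional[str]) -> Optional[str]:
    """Return stripped text, or None when the turn is empty or whitespace-only."""
    if text is None:
        return None
    stripped = text.strip()
    return stripped or None


def format_memory(memory: Mapping[str, Any]) -> str:
    """Format one memory line; field names here are hypothetical."""
    content = str(memory.get("content", ""))
    confidence = memory.get("confidence")
    # Compare against None explicitly: `if confidence:` would hide a valid 0.0.
    if confidence is None:
        return f"- {content}"
    # Coerce to float so string or int values format consistently.
    return f"- {content} (confidence: {float(confidence):.2f})"
```

The explicit None comparison is the design choice that matters: 0.0 is falsy in Python, so a plain truthiness test would silently omit it.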

    "pytest-asyncio>=0.23.0",
    "pytest-cov>=4.0.0",
    ]
    samples = [

this is not needed, and not preferred because they will show up on pypi and other places, we prefer (if needed) to have PEP 723 definitions in the samples that need it.

Done. Removed the samples dependency group; the samples now declare their dependencies inline via PEP 723, so nothing sample-related surfaces on PyPI.
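For readers unfamiliar with PEP 723, the sketch below shows what an inline metadata block at the top of a sample looks like, together with a small extractor based on the reference regular expression from the PEP. The sample text is an assumption: the requires-python value and the dependency list are illustrative, not the contents of the PR's actual samples, and extract_script_metadata is a hypothetical helper.

```python
import re
from typing import Optional

# Reference regular expression from PEP 723 for locating metadata blocks.
_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Illustrative sample source; the dependency list is an assumption.
SAMPLE = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "agent-framework-azure-cosmos-memory",
# ]
# ///
print("sample body")
'''


def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the first `script` block, or None if absent."""
    for match in _BLOCK.finditer(source):
        if match.group("type") == "script":
            # Drop the leading "# " (or bare "#") from every content line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:]
                for line in match.group("content").splitlines(keepends=True)
            )
    return None
```

Because the metadata lives in comments inside the sample file, it is read by script runners that support PEP 723 and never becomes part of the package's published extras.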

    cosmos_endpoint: str | None
    cosmos_database: str | None
    ai_foundry_endpoint: str | None

we use foundry_endpoint, let's keep that consistent

Done. Renamed ai_foundry_endpoint to foundry_endpoint for consistency.

    cosmos_endpoint: str | None
    cosmos_database: str | None
    ai_foundry_endpoint: str | None
    embedding_deployment_name: str | None

we also standardized this on embedding_model instead of deployment name, same for chat

Done. Renamed to embedding_model and chat_model. Internally these map to the toolkit client's embedding_deployment_name / chat_deployment_name, which are unchanged on its side.
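The renaming described in the reply amounts to a thin translation layer between the provider's public parameter names and the toolkit's. A minimal sketch of that idea follows; the four parameter names come from the thread above, while the to_toolkit_kwargs helper, the pass-through of unmapped names, and the dropping of None values are assumptions for illustration.

```python
from typing import Any, Dict

# Public provider names mapped to the toolkit client's names (from the thread above).
_TOOLKIT_PARAM_NAMES = {
    "embedding_model": "embedding_deployment_name",
    "chat_model": "chat_deployment_name",
}


def to_toolkit_kwargs(**provider_kwargs: Any) -> Dict[str, Any]:
    """Translate provider kwargs to toolkit kwargs (hypothetical helper).

    Unmapped names pass through unchanged and None values are dropped;
    both behaviours are assumptions made for this sketch.
    """
    return {
        _TOOLKIT_PARAM_NAMES.get(name, name): value
        for name, value in provider_kwargs.items()
        if value is not None
    }
```

Keeping the mapping in one dictionary means the public API can follow framework conventions without requiring any change on the toolkit side.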

    # Agent Framework uses the "assistant" role, but the Agent Memory Toolkit's TurnRecord
    # only accepts {user, agent, tool, system}. Map AF roles to toolkit roles when storing.
    _ROLE_MAP: ClassVar[dict[str, str]] = {"assistant": "agent"}

does this have to be a classvar? couldn't it be a module level const?

Done. Moved DEFAULT_SOURCE_ID, DEFAULT_DATABASE and DEFAULT_CONTEXT_PROMPT to module-level constants.
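As an illustration of the reviewer's suggestion, the role mapping from the snippet can live at module level next to a small lookup function. The mapping and the accepted role set come from the code comment above; the to_toolkit_role helper and the choice to raise on unknown roles are assumptions, not the PR's implementation.

```python
# Module-level constants instead of ClassVar attributes.
_ROLE_MAP = {"assistant": "agent"}
_TOOLKIT_ROLES = frozenset({"user", "agent", "tool", "system"})


def to_toolkit_role(af_role: str) -> str:
    """Map an Agent Framework role to a toolkit role (hypothetical helper)."""
    role = _ROLE_MAP.get(af_role, af_role)
    if role not in _TOOLKIT_ROLES:
        # Raising here is an assumption; a provider could also skip the turn.
        raise ValueError(f"unsupported role: {af_role!r}")
    return role
```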

    memory_client: AsyncCosmosMemoryClient | None = None,
    top_k: int = 5,
    min_confidence: float = 0.7,
    memory_types: Sequence[str] | None = None,

is this a limited list of options? if, so could we do a literal?

Done. memory_types is now Sequence[Literal["fact", "procedural", "episodic"]].
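A Literal type gives static checkers the closed set of options, and typing.get_args lets the same definition drive a runtime check. The sketch below assumes a hypothetical normalize_memory_types helper and assumes that None means "all types"; only the three literal values come from the reply above.

```python
from typing import Literal, Optional, Sequence, Tuple, get_args

MemoryType = Literal["fact", "procedural", "episodic"]
_ALLOWED: Tuple[str, ...] = get_args(MemoryType)


def normalize_memory_types(
    memory_types: Optional[Sequence[str]] = None,
) -> Tuple[str, ...]:
    """Validate memory types at runtime (hypothetical helper).

    Treating None as "all types" is an assumption made for this sketch.
    """
    if memory_types is None:
        return _ALLOWED
    unknown = [m for m in memory_types if m not in _ALLOWED]
    if unknown:
        raise ValueError(f"unknown memory types: {unknown}")
    return tuple(memory_types)
```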

    ) as provider:
    # Use with agent session
    session = AgentSession(session_id="user-session-123")
    session.state["user_id"] = "alice"

state should be scoped by provider, so something like session.state['ai_memory']['user_id']

Done. The samples now use session.state.setdefault(provider.source_id, {})["user_id"] = ....
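The setdefault pattern from the reply keeps each provider's keys in its own sub-dictionary, so two providers can both store a user_id without colliding. A plain dict stands in for session.state in this sketch; SOURCE_ID is a hypothetical value (the real DEFAULT_SOURCE_ID is not shown on this page), and the two helpers are illustrative only.

```python
from typing import Any, Dict, Optional

SOURCE_ID = "cosmos_memory"  # hypothetical; stands in for provider.source_id


def set_user_id(state: Dict[str, Any], source_id: str, user_id: str) -> None:
    """Write user_id under the provider's own namespace in the session state."""
    state.setdefault(source_id, {})["user_id"] = user_id


def get_user_id(state: Dict[str, Any], source_id: str) -> Optional[str]:
    """Read the provider-scoped user_id, or None when it was never set."""
    return state.get(source_id, {}).get("user_id")
```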

    # Use with agent session
    session = AgentSession(session_id="user-session-123")
    session.state["user_id"] = "alice"
    session.state["thread_id"] = "conversation-1"

this should map to session_id, can we just use that?

Done. Threading now just uses the session id (a new session is a new thread); no explicit thread_id.

    # wait. This lets the toolkit's background memory-extraction tasks (scheduled after
    # each stored turn) make progress between messages instead of being starved by a
    # blocking input() call.
    user_input = (await asyncio.to_thread(input, "You: ")).strip()

for a sample, it's fine to just use input directly

Done. Switched to a plain input() loop.

    session.state["thread_id"] = "conversation-1"

    # Simulate agent run - before_run searches memories
    ctx = SessionContext(

the whole machinery of the SessionContext is not something we need to expose people to. I think we only need a sample setting up with an Agent, and then there might be some flavors in there, maybe user-scoped vs non-user-scoped. This is great for testing, but not something we expect people to write.

Done. Rewrote basic_usage.py to set up an Agent with the provider and call agent.run(...), with two flavors: user-scoped versus session-scoped memory. No SessionContext machinery is exposed.

    """

    class CustomMemoryProcessor:

this is cool, and should warrant a second sample showcasing it, but looks like it is commented out now

You are right, it was non-functional (commented out), so I removed the dead stub to keep the sample clean. A dedicated custom-extraction-rubric sample is a good follow-up; I can add one once we have a working custom-processor example to show.
Follow the github_copilot pattern for a package with a Python 3.11-only dependency: lower requires-python to >=3.10 and gate azure-cosmos-agent-memory behind a python_version >= '3.11' marker. Add a direct, gated prompty pre-release dependency so the workspace's if-necessary-or-explicit prerelease policy permits the toolkit's transitive prompty requirement. Guard the test modules with pytest.importorskip so the 3.10 CI leg skips cleanly. Remove the workspace exclude and the markdown-code-lint exclude, and regenerate uv.lock.
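The gating described in this commit has two halves: a python_version marker on the dependency and pytest.importorskip in the test modules. The sketch below expresses the same decision with the standard library only, so it can run anywhere. The module name azure_cosmos_agent_memory is an assumption derived from the distribution name, and toolkit_available is a hypothetical helper, not code from the PR.

```python
import importlib.util
import sys


def toolkit_available(module_name: str = "azure_cosmos_agent_memory") -> bool:
    """Report whether the optional toolkit can be imported on this interpreter.

    Mirrors the two gates described above: the interpreter must be 3.11+
    (the dependency marker) and the module must be installed (what
    pytest.importorskip checks before running a test module).
    The default module name is an assumption.
    """
    if sys.version_info < (3, 11):
        return False
    return importlib.util.find_spec(module_name) is not None
```

On a 3.10 interpreter this returns False without attempting an import, which is the behaviour that lets the 3.10 CI leg skip cleanly.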
Rename provider parameters to match Agent Framework conventions: foundry_endpoint (was ai_foundry_endpoint) and embedding_model/chat_model (were *_deployment_name). Move DEFAULT_* to module-level constants, type memory_types as a Literal, use DEFAULT_CONTEXT_PROMPT as the default value, and add ProcessorConfig/CosmosMemorySettings TypedDicts. Resolve connection settings via agent_framework load_settings with required-field validation, replacing the manual getenv/raise blocks. Scope user_id/thread_id to the provider state and drop the unpreventable first-turn warning. Rewrite the samples around Agent (not raw SessionContext), provider-scoped state, and session-id threading; use PEP 723 inline dependencies instead of a samples dependency group; use a plain input() loop; remove the dead custom processor stub. Update README/AGENTS for the renamed parameters and env vars. Add a samples ruff per-file-ignores entry now that the package is linted in CI.
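To illustrate what replacing manual getenv/raise blocks with settings resolution plus required-field validation can look like, here is a standard-library sketch. It does not use agent_framework's load_settings, whose signature is not shown on this page; resolve_settings, the COSMOS_MEMORY_ environment prefix, and the default required field are all assumptions. Only the field names come from the review threads.

```python
from typing import Dict, Mapping, Optional, Sequence, TypedDict, cast


class CosmosMemorySettings(TypedDict, total=False):
    cosmos_endpoint: Optional[str]
    cosmos_database: Optional[str]
    foundry_endpoint: Optional[str]
    embedding_model: Optional[str]
    chat_model: Optional[str]


_FIELDS = (
    "cosmos_endpoint",
    "cosmos_database",
    "foundry_endpoint",
    "embedding_model",
    "chat_model",
)


def resolve_settings(
    env: Mapping[str, str],
    required: Sequence[str] = ("cosmos_endpoint",),
    env_prefix: str = "COSMOS_MEMORY_",  # hypothetical prefix
    **overrides: Optional[str],
) -> CosmosMemorySettings:
    """Resolve settings from explicit overrides first, then the environment."""
    values: Dict[str, Optional[str]] = {}
    for field in _FIELDS:
        values[field] = overrides.get(field) or env.get(env_prefix + field.upper())
    # One validation pass replaces a getenv/raise block per field.
    missing = [field for field in required if not values.get(field)]
    if missing:
        raise ValueError(f"missing required settings: {missing}")
    return cast(CosmosMemorySettings, values)
```

Passing the environment as a mapping (for example os.environ) rather than reading it inside the function keeps the resolution logic easy to unit test.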
Bump azure-cosmos-agent-memory to >=0.2.0b2 (adds the embeddings/chat client injection seam) and add tests/test_emulator.py: an integration (not azure) suite that exercises real Cosmos vector search with a quantizedFlat index against a local Cosmos DB emulator, using deterministic in-memory fakes for embeddings and chat so no Azure AI Foundry account or LLM is required. To run on a stock emulator the fixture strips the toolkit's full-text index (the provider only does pure vector search) and requests provisioned autoscale throughput instead of serverless. The suite skips cleanly when no emulator is reachable.
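A deterministic in-memory fake for embeddings can be as small as a hash-derived unit vector: the same text always yields the same vector, so vector search results are reproducible without any model call. This sketch is an assumption about what such a fake might look like; the PR's actual fakes, their names, and their dimensions are not shown on this page.

```python
import hashlib
import math
from typing import List


def fake_embedding(text: str, dim: int = 8) -> List[float]:
    """Return a deterministic unit-length vector derived from a SHA-256 digest.

    Hypothetical test helper: the values carry no semantic meaning, they are
    only stable across runs. `dim` must be between 1 and 32 (the digest length).
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32")
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Centre the bytes around zero; b / 255 - 0.5 is never exactly 0 for an integer b.
    raw = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity; the inputs are already unit length, so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))
```

Identical texts score a similarity of 1.0 (up to floating-point error), which is enough to assert that a stored memory is retrievable by an identical query.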
The package recently joined the uv workspace, so its source and tests are now covered by the Test Typing Checks and Package Checks gates for the first time. tests: rename stale constructor kwargs to the current provider API (foundry_endpoint/embedding_model/chat_model); use a typed _STUB_AGENT for the unused agent param so pyright/pyrefly/ty/zuban all accept it; make processor_config values ints; assert non-None memory_client in the emulator tests. source: relax reportUnknown*/reportOptional* for this package only (the toolkit ships no py.typed; mirrors the hosting-telegram precedent); decouple the conditional toolkit import from the annotation type; use settings.get(); fix memory_types list invariance; drop a redundant None guard; read role via getattr.
Motivation & Context
Right now there's no first-party way to give agents long-term, per-user memory backed by Azure Cosmos DB. This PR adds an optional package, agent-framework-azure-cosmos-memory, that plugs Cosmos-backed memory into the standard ContextProvider extension point. On each run it pulls the user's relevant past memories and running summary into the agent's context, then saves new memories after the run. So you get the "agent that remembers the user across sessions" scenario without coupling core to Cosmos.

It builds on the azure-cosmos-agent-memory toolkit for storage, embeddings, and reconciliation, and installs as a standalone, opt-in package.

Description & Review Guide
What are the major changes?
- New package python/packages/azure-cosmos-memory/ with CosmosMemoryContextProvider, implementing the framework's ContextProvider contract.
- before_run retrieves per-user memories (semantic search) plus a user summary, injecting memories as context messages and the summary as instructions. Search and summary failures are handled separately so one doesn't block the other, and it warns once if there's no user_id.
- after_run persists new memories from the turn.
- Samples (basic_usage.py, interactive_chat.py), 32 unit tests, and an integration test scaffold.

What is the impact of these changes?
- The package is excluded from the workspace in python/pyproject.toml via [tool.uv.workspace] exclude. That's needed because its dependency azure-cosmos-agent-memory requires Python >=3.11 and a prompty prerelease (>=2.0.0a9), both of which are unsatisfiable against the monorepo's >=3.10 floor and if-necessary-or-explicit prerelease policy. Excluding it keeps root uv sync resolvable and leaves the committed uv.lock untouched, while the package is built and tested standalone.
- The alternative is making it a workspace member by raising the floor to 3.11 and enabling the prompty prerelease. Happy to go whichever way you prefer.

What do you want reviewers to focus on?
- The workspace exclusion in python/pyproject.toml, and whether you want this standalone or as a workspace member.
- The ContextProvider integration semantics (how memories vs. summary are injected, and the independent failure handling).

Related Issue
Fixes #
Contribution Checklist