Python: Add agent-framework-azure-cosmos-memory context provider#6719

Open
TheovanKraay wants to merge 10 commits into microsoft:main from TheovanKraay:feature/cosmos-memory-context-provider

Conversation

@TheovanKraay TheovanKraay commented Jun 24, 2026

Contributor

Motivation & Context

Right now there's no first-party way to give agents long-term, per-user memory backed by Azure Cosmos DB. This PR adds an optional package, agent-framework-azure-cosmos-memory, that plugs Cosmos-backed memory into the standard ContextProvider extension point. On each run it pulls the user's relevant past memories and running summary into the agent's context, then saves new memories after the run. So you get the "agent that remembers the user across sessions" scenario without coupling core to Cosmos.

It builds on the azure-cosmos-agent-memory toolkit for storage, embeddings, and reconciliation, and installs as a standalone, opt-in package.

Heads up: this is a draft for you to review the approach and packaging before I polish it further. The emulator-based integration tests depend on an upstream toolkit change, so they're not turned on in CI yet.

Description & Review Guide

  • What are the major changes?

    • New package python/packages/azure-cosmos-memory/ with CosmosMemoryContextProvider, implementing the framework's ContextProvider contract.
    • before_run retrieves per-user memories (semantic search) plus a user summary, injecting memories as context messages and the summary as instructions. Search and summary failures are handled separately so one doesn't block the other, and it warns once if there's no user_id.
    • after_run persists new memories from the turn.
    • README, AGENTS.md, two runnable samples (basic_usage.py, interactive_chat.py), 32 unit tests, and an integration test scaffold.
    • The package is excluded from the shared uv workspace (see below for why).
  • What is the impact of these changes?

    • Additive only. No changes to existing packages or public APIs.
    • The package is excluded from the uv workspace in python/pyproject.toml via [tool.uv.workspace] exclude. That's needed because its dependency azure-cosmos-agent-memory requires Python >=3.11 and a prompty prerelease (>=2.0.0a9), both of which are unsatisfiable against the monorepo's >=3.10 floor and if-necessary-or-explicit prerelease policy. Excluding it keeps root uv sync resolvable and leaves the committed uv.lock untouched, while the package is built and tested standalone.
    • The trade-off: it won't be built or tested by the monorepo CI matrix. If you'd rather have it as a full workspace member, that would mean raising the workspace floor to 3.11 and enabling the prompty prerelease. Happy to go whichever way you prefer.
  • What do you want reviewers to focus on?

    • The exclude decision in python/pyproject.toml, and whether you want this standalone or as a workspace member.
    • The ContextProvider integration semantics (how memories vs. summary are injected, and the independent failure handling).

Related Issue

Fixes #

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change.

Copilot AI review requested due to automatic review settings June 24, 2026 20:06
@moonbox3 moonbox3 added documentation Usage: [Issues, PRs], Target: documentation in the code base and learn docs python Usage: [Issues, PRs], Target: Python labels Jun 24, 2026
@github-actions github-actions Bot changed the title Add agent-framework-azure-cosmos-memory context provider Python: Add agent-framework-azure-cosmos-memory context provider Jun 24, 2026

Copilot AI left a comment

Contributor

Pull request overview

Adds a new Python integration package, agent-framework-azure-cosmos-memory, introducing a CosmosMemoryContextProvider that persists and recalls long-term memories via Azure Cosmos DB using the azure-cosmos-agent-memory toolkit (including user-summary injection and retrieval-time context augmentation).

Changes:

  • Introduces CosmosMemoryContextProvider (async context manager + before_run/after_run hooks) plus package exports.
  • Adds documentation and samples demonstrating basic usage and an interactive Foundry-backed chat experience.
  • Adds unit tests (mocked client) and live-Azure integration tests (pytest markers).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| python/packages/azure-cosmos-memory/agent_framework_azure_cosmos_memory/_context_provider.py | Implements the Cosmos-backed memory context provider (retrieval + storage + user-summary injection + flush). |
| python/packages/azure-cosmos-memory/agent_framework_azure_cosmos_memory/__init__.py | Exports CosmosMemoryContextProvider and package version. |
| python/packages/azure-cosmos-memory/pyproject.toml | Defines the new package, dependencies, and pytest/tooling configuration (Python >=3.11). |
| python/packages/azure-cosmos-memory/README.md | End-user documentation, configuration guidance, and usage examples. |
| python/packages/azure-cosmos-memory/AGENTS.md | Package-level developer guidance and key behaviors (user_id/thread_id, flush). |
| python/packages/azure-cosmos-memory/LICENSE | Package license. |
| python/packages/azure-cosmos-memory/samples/basic_usage.py | Minimal “raw hooks” sample calling before_run()/after_run() directly. |
| python/packages/azure-cosmos-memory/samples/interactive_chat.py | Interactive CLI sample demonstrating an agent wired with Foundry + Cosmos memory provider. |
| python/packages/azure-cosmos-memory/tests/test_context_provider.py | Unit tests for provider behavior with a mocked memory client (incl. context manager + flush). |
| python/packages/azure-cosmos-memory/tests/test_integration.py | Live-Azure integration tests gated by env vars and markers. |
| python/packages/azure-cosmos-memory/tests/conftest.py | Pytest marker registration for the package’s tests. |

Comment thread python/packages/azure-cosmos-memory/tests/conftest.py Outdated
Comment thread python/packages/azure-cosmos-memory/README.md Outdated
Comment thread python/packages/azure-cosmos-memory/pyproject.toml Outdated
github-actions Bot commented Jun 25, 2026

Contributor

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/azure-cosmos-memory/agent_framework_azure_cosmos_memory/_context_provider.py | 134 | 3 | 97% | 35–37 |
| TOTAL | 43432 | 5181 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 8571 | 33 💤 | 0 ❌ | 0 🔥 | 2m 18s ⏱️ |

Introduces CosmosMemoryContextProvider, a ContextProvider that wraps the azure-cosmos-agent-memory toolkit to give agents long-term, Cosmos DB-backed memory (fact/procedural recall + user summaries). Includes package scaffolding, unit tests (mocked client), live Azure integration tests (marked), samples, README, and AGENTS.md.

Draft: uv.lock is intentionally left unchanged. This package depends on azure-cosmos-agent-memory (requires Python >=3.11), which is unsatisfiable against the workspace's current >=3.10 floor, so adding it to the shared lock requires a workspace decision (raise floor to 3.11 or exclude from workspace). Test coverage to be expanded.
The package depends on azure-cosmos-agent-memory which requires Python
>=3.11 and a prompty pre-release (>=2.0.0a9). Both are unsatisfiable
against the workspace's >=3.10 floor and pre-release policy, causing
uv sync to fail in every Python CI job. Exclude the package from the
shared workspace so it is resolved and tested as a standalone package.
- Strip trailing whitespace from package files (pre-commit trailing-whitespace hook)
- Exclude the package README from markdown-code-lint: the package is excluded
  from the uv workspace, so its README snippets import a module that is not
  installed in the workspace env and Pyright cannot resolve it
@TheovanKraay TheovanKraay force-pushed the feature/cosmos-memory-context-provider branch from ab30779 to 4bc5c8e Compare June 25, 2026 14:36
- Wire credential into Cosmos and AI Foundry clients; let toolkit own
  DefaultAzureCredential when none supplied (remove dead import).
- Honor auto_extract=False by zeroing extraction/summary cadence thresholds.
- Skip whitespace-only conversation turns and store stripped content.
- Show confidence 0.0 and coerce confidence to float in _format_memories.
- Register both 'integration' and 'azure' pytest markers accurately.
- Fix duplicated install block in README.
- Update and extend unit tests for new credential wiring and fixes.
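
The confidence fix listed above ("Show confidence 0.0 and coerce confidence to float") can be illustrated with a small sketch. The function name format_memories and the dict shape with "content" and "confidence" keys are assumptions made for this example; the PR's _format_memories may look different. The idea is to test for None rather than truthiness, so a confidence of 0.0 is still shown, and to coerce the value with float() before formatting.

```python
def format_memories(memories: list[dict]) -> str:
    """Render memories as bullet lines (hypothetical shape: content + confidence)."""
    lines = []
    for memory in memories:
        text = str(memory.get("content", "")).strip()
        if not text:
            # Nothing useful to show for an empty memory.
            continue
        confidence = memory.get("confidence")
        if confidence is None:
            lines.append(f"- {text}")
        else:
            # "is None" rather than truthiness keeps 0.0 visible;
            # float() accepts ints and numeric strings alike.
            lines.append(f"- {text} (confidence: {float(confidence):.2f})")
    return "\n".join(lines)
```

A truthiness check such as `if confidence:` would silently drop the 0.0 case, which is the kind of bug the commit message describes.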
"pytest-asyncio>=0.23.0",
"pytest-cov>=4.0.0",
]
samples = [

Member

This is not needed, and not preferred, because these groups will show up on PyPI and other places. We prefer (if needed) to have PEP 723 definitions in the samples that need them.

Contributor Author

Done. Removed the samples dependency group; the samples now declare their dependencies inline via PEP 723, so nothing sample-related surfaces on PyPI.
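
For readers unfamiliar with PEP 723, a sample that declares its dependencies inline looks roughly like the sketch below. The dependency list and requires-python value are assumptions for illustration and are not copied from the PR's samples; the `# /// script` comment block itself is the format PEP 723 defines.

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "agent-framework-azure-cosmos-memory",
# ]
# ///
"""Hypothetical sample skeleton showing PEP 723 inline script metadata."""


def main() -> str:
    # A real sample would build an agent here; this sketch only returns a message.
    return "sample dependencies are declared inline, not in pyproject.toml"


if __name__ == "__main__":
    print(main())
```

A runner that understands inline metadata (for example `uv run basic_usage.py`) can resolve these dependencies for the script alone, so the package's published metadata stays free of sample-only extras.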


cosmos_endpoint: str | None
cosmos_database: str | None
ai_foundry_endpoint: str | None

Member

we use foundry_endpoint, let's keep that consistent

Contributor Author

Done. Renamed ai_foundry_endpoint to foundry_endpoint for consistency.

cosmos_endpoint: str | None
cosmos_database: str | None
ai_foundry_endpoint: str | None
embedding_deployment_name: str | None

Member

we also standardized this on embedding_model instead of deployment name, same for chat

Contributor Author

Done. Renamed to embedding_model and chat_model. Internally these map to the toolkit client's embedding_deployment_name / chat_deployment_name, which are unchanged on its side.
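
The rename-at-the-boundary approach described in this reply could be sketched as a small translation step. The helper name to_toolkit_kwargs is hypothetical; only the two name pairs (embedding_model to embedding_deployment_name, chat_model to chat_deployment_name) come from the thread, and the actual provider may wire them differently.

```python
from __future__ import annotations

# Provider-facing name -> toolkit-facing name (pairs taken from the review thread).
_TOOLKIT_NAMES = {
    "embedding_model": "embedding_deployment_name",
    "chat_model": "chat_deployment_name",
}


def to_toolkit_kwargs(**provider_kwargs: str | None) -> dict[str, str]:
    """Translate provider keyword names to the toolkit's names (hypothetical helper)."""
    unknown = set(provider_kwargs) - set(_TOOLKIT_NAMES)
    if unknown:
        raise TypeError(f"Unexpected parameters: {sorted(unknown)}")
    # Drop unset values so the toolkit can fall back to its own defaults.
    return {
        _TOOLKIT_NAMES[name]: value
        for name, value in provider_kwargs.items()
        if value is not None
    }
```

Keeping the mapping in one place means the public API can follow Agent Framework naming while the toolkit keeps its own.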


# Agent Framework uses the "assistant" role, but the Agent Memory Toolkit's TurnRecord
# only accepts {user, agent, tool, system}. Map AF roles to toolkit roles when storing.
_ROLE_MAP: ClassVar[dict[str, str]] = {"assistant": "agent"}

Member

does this have to be a classvar? couldn't it be a module-level const?

Contributor Author

Done. Moved DEFAULT_SOURCE_ID, DEFAULT_DATABASE and DEFAULT_CONTEXT_PROMPT to module-level constants.
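
As a sketch of what a module-level role map might look like, assuming the toolkit role set quoted in the code comment above ({user, agent, tool, system}): the helper names to_toolkit_role and prepare_turns are hypothetical, and the whitespace handling mirrors the earlier "skip whitespace-only conversation turns and store stripped content" commit rather than any code shown in this thread.

```python
# Module-level constants instead of ClassVar attributes.
_ROLE_MAP = {"assistant": "agent"}
_TOOLKIT_ROLES = frozenset({"user", "agent", "tool", "system"})


def to_toolkit_role(role: str) -> str:
    """Map an Agent Framework role to a toolkit role (hypothetical helper)."""
    mapped = _ROLE_MAP.get(role, role)
    if mapped not in _TOOLKIT_ROLES:
        raise ValueError(f"Unsupported role: {role!r}")
    return mapped


def prepare_turns(messages: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop whitespace-only turns and store stripped content with mapped roles."""
    turns = []
    for role, content in messages:
        stripped = content.strip()
        if not stripped:
            continue
        turns.append((to_toolkit_role(role), stripped))
    return turns
```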

memory_client: AsyncCosmosMemoryClient | None = None,
top_k: int = 5,
min_confidence: float = 0.7,
memory_types: Sequence[str] | None = None,

Member

is this a limited list of options? if so, could we do a literal?

Contributor Author

Done. memory_types is now Sequence[Literal["fact", "procedural", "episodic"]].
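
A Literal only constrains values for static type checkers, so a provider that also wants a runtime check could derive the allowed set from the same type. The sketch below assumes the three values named in this reply; MemoryType and validate_memory_types are hypothetical names, not identifiers from the PR.

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Literal, get_args

MemoryType = Literal["fact", "procedural", "episodic"]
_ALLOWED_MEMORY_TYPES = frozenset(get_args(MemoryType))


def validate_memory_types(memory_types: Sequence[str] | None) -> list[str] | None:
    """Return a checked copy of memory_types, or None when no filter is requested."""
    if memory_types is None:
        return None
    invalid = [t for t in memory_types if t not in _ALLOWED_MEMORY_TYPES]
    if invalid:
        raise ValueError(f"Unknown memory types: {invalid}")
    # Copy into a new list so the caller's sequence is never mutated downstream.
    return list(memory_types)
```

Accepting a Sequence rather than a list in the signature also sidesteps list invariance, since a list of a narrower Literal type is still a valid Sequence argument.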

) as provider:
# Use with agent session
session = AgentSession(session_id="user-session-123")
session.state["user_id"] = "alice"

Member

state should be scoped by provider, so something like session.state['ai_memory']['user_id']

Contributor Author

Done. The samples now use session.state.setdefault(provider.source_id, {})["user_id"] = ....
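
The provider-scoped state pattern can be shown with a plain dict standing in for session.state. The source id value "cosmos_memory" and both helper names are assumptions for illustration; only the setdefault(...)["user_id"] idiom comes from the reply above.

```python
from __future__ import annotations

SOURCE_ID = "cosmos_memory"  # hypothetical value; the provider exposes its own source_id


def set_user_id(state: dict, source_id: str, user_id: str) -> None:
    """Store user_id under the provider's own key instead of the top level."""
    state.setdefault(source_id, {})["user_id"] = user_id


def get_user_id(state: dict, source_id: str) -> str | None:
    """Read user_id back from the provider's scope; None when it was never set."""
    return state.get(source_id, {}).get("user_id")
```

Scoping by source id means two providers can each keep a user_id (or any other key) in the same session state without overwriting one another.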

# Use with agent session
session = AgentSession(session_id="user-session-123")
session.state["user_id"] = "alice"
session.state["thread_id"] = "conversation-1"

Member

this should map to session_id, can we just use that?

Contributor Author

Done. Threading now just uses the session id (a new session is a new thread); no explicit thread_id.

# wait. This lets the toolkit's background memory-extraction tasks (scheduled after
# each stored turn) make progress between messages instead of being starved by a
# blocking input() call.
user_input = (await asyncio.to_thread(input, "You: ")).strip()

Member

for a sample, it's fine to just use `input` directly

Contributor Author

Done. Switched to a plain input() loop.

session.state["thread_id"] = "conversation-1"

# Simulate agent run - before_run searches memories
ctx = SessionContext(

Member

The whole SessionContext machinery is not something we need to expose people to. I think we only need a sample that sets up an Agent, possibly with a few flavors such as user-scoped vs. non-user-scoped. This is great for testing, but not something we expect people to write.

Contributor Author

Done. Rewrote basic_usage.py to set up an Agent with the provider and call agent.run(...), with two flavors: user-scoped versus session-scoped memory. No SessionContext machinery is exposed.

"""


class CustomMemoryProcessor:

Member

this is cool, and should warrant a second sample showcasing it, but looks like it is commented out now

Contributor Author

You are right, it was non-functional (commented out), so I removed the dead stub to keep the sample clean. A dedicated custom-extraction-rubric sample is a good follow-up; I can add one once we have a working custom-processor example to show.

Follow the github_copilot pattern for a package with a Python 3.11-only
dependency: lower requires-python to >=3.10 and gate azure-cosmos-agent-memory
behind a python_version >= '3.11' marker. Add a direct, gated prompty
pre-release dependency so the workspace's if-necessary-or-explicit prerelease
policy permits the toolkit's transitive prompty requirement. Guard the test
modules with pytest.importorskip so the 3.10 CI leg skips cleanly. Remove the
workspace exclude and the markdown-code-lint exclude, and regenerate uv.lock.
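
The commit above guards the test modules with pytest.importorskip so the 3.10 CI leg skips cleanly. The same decision can be sketched with the standard library alone, as below. The helper name can_use_toolkit is hypothetical, and the toolkit's real import name is not stated in this thread, so the module name is left as a parameter.

```python
import importlib.util
import sys


def can_use_toolkit(module_name: str, min_python: tuple[int, int] = (3, 11)) -> bool:
    """Return True only if the interpreter is new enough and the module is installed."""
    if sys.version_info[:2] < min_python:
        # Mirrors the python_version >= '3.11' dependency marker: on older
        # interpreters the optional dependency is never installed.
        return False
    # find_spec returns None for a missing top-level module without importing it.
    return importlib.util.find_spec(module_name) is not None
```

A check like this lets the package import on Python 3.10 while deferring any hard failure to the point where the optional dependency is actually needed.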
Rename provider parameters to match Agent Framework conventions:
foundry_endpoint (was ai_foundry_endpoint) and embedding_model/chat_model
(were *_deployment_name). Move DEFAULT_* to module-level constants, type
memory_types as a Literal, use DEFAULT_CONTEXT_PROMPT as the default value,
and add ProcessorConfig/CosmosMemorySettings TypedDicts. Resolve connection
settings via agent_framework load_settings with required-field validation,
replacing the manual getenv/raise blocks. Scope user_id/thread_id to the
provider state and drop the unpreventable first-turn warning.

Rewrite the samples around Agent (not raw SessionContext), provider-scoped
state, and session-id threading; use PEP 723 inline dependencies instead of a
samples dependency group; use a plain input() loop; remove the dead custom
processor stub. Update README/AGENTS for the renamed parameters and env vars.
Add a samples ruff per-file-ignores entry now that the package is linted in CI.
Bump azure-cosmos-agent-memory to >=0.2.0b2 (adds the embeddings/chat client
injection seam) and add tests/test_emulator.py: an integration (not azure)
suite that exercises real Cosmos vector search with a quantizedFlat index
against a local Cosmos DB emulator, using deterministic in-memory fakes for
embeddings and chat so no Azure AI Foundry account or LLM is required.

To run on a stock emulator the fixture strips the toolkit's full-text index
(the provider only does pure vector search) and requests provisioned autoscale
throughput instead of serverless. The suite skips cleanly when no emulator is
reachable.
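
One way to build the kind of deterministic in-memory embedding fake mentioned above is to derive a fixed-length unit vector from a hash of the text. This is a sketch under that assumption; fake_embedding and its dimensions default are hypothetical, and the fakes in tests/test_emulator.py may be implemented differently.

```python
import hashlib
import math


def fake_embedding(text: str, dimensions: int = 8) -> list[float]:
    """Deterministic pseudo-embedding: same text in, same unit vector out."""
    values: list[float] = []
    counter = 0
    while len(values) < dimensions:
        # Each SHA-256 digest yields 32 bytes; keep hashing until enough values exist.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(byte / 255.0 - 0.5 for byte in digest)
        counter += 1
    values = values[:dimensions]
    # Normalize to unit length so cosine and dot-product scores stay comparable.
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]
```

Hash-based vectors carry no semantic meaning, so a fake like this only guarantees that identical text maps to an identical vector. That is enough to exercise index creation and vector search plumbing, but not retrieval quality.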
The package recently joined the uv workspace, so its source and tests are now covered by the Test Typing Checks and Package Checks gates for the first time.

tests: rename stale constructor kwargs to the current provider API (foundry_endpoint/embedding_model/chat_model); use a typed _STUB_AGENT for the unused agent param so pyright/pyrefly/ty/zuban all accept it; make processor_config values ints; assert non-None memory_client in the emulator tests.

source: relax reportUnknown*/reportOptional* for this package only (the toolkit ships no py.typed; mirrors the hosting-telegram precedent); decouple the conditional toolkit import from the annotation type; use settings.get(); fix memory_types list invariance; drop a redundant None guard; read role via getattr.
Labels

documentation Usage: [Issues, PRs], Target: documentation in the code base and learn docs python Usage: [Issues, PRs], Target: Python


4 participants