fix(dmr): detect and use locally-installed Docker Model Runner models #3206
Sayt-0 wants to merge 1 commit
Auto-selection hard-coded the DMR provider to ai/qwen3:latest, so a user
with other models pulled in Docker Model Runner was prompted to pull
qwen3 and, on declining, saw the misleading "No model providers
available." The ctrl+m model picker likewise showed nothing for DMR
because local models are not part of the models.dev catalog.
Add dmr.ListModels (querying DMR's OpenAI /models endpoint) and use it to:
- prefer an already-pulled model during auto-selection (the configured
default when present, including under a non-default tag, otherwise the
first non-embedding installed model)
- populate DMR entries in the model picker (cached, mirroring the gateway
discovery path)
Also carry the underlying cause into AutoModelFallbackError and suggest
pulling a model, so the residual no-model case is actionable.
Fixes #2799
docker-agent left a comment
Assessment: 🟡 NEEDS ATTENTION
Two issues found in the new DMR discovery code. The medium-severity singleflight/context bug can suppress DMR model listings for all users for up to one minute after a single cancelled request; the low-severity embedding filter gap is a latent defect in the repo-prefix auto-selection fallback.
return ids, err
}

ids, err := r.dmrModelLister(ctx)
[MEDIUM] Singleflight closure captures caller's context — a cancelled picker open poisons the cache for ~1 minute
listDMRModels passes the first caller's ctx directly to r.dmrModelLister:
ids, err := r.dmrModelLister(ctx) // ctx from the *first* sf.Do caller
c.mu.Lock()
c.ids, c.err, c.fetchedAt = ids, err, now() // error is cached here
c.mu.Unlock()
return ids, err

Two compounding problems arise from this design:
- Spurious errors for concurrent waiters. singleflight.Do broadcasts the first caller's result to every goroutine waiting on the same key. If the first caller's context is cancelled (e.g. the user closes the model picker mid-request), dmrModelLister returns context.Canceled. All concurrent waiters receive that error even though their own contexts are still alive.
- Error is cached for the full TTL. The code sets c.fetchedAt = now() unconditionally, so the context.Canceled result is stored and replayed to every subsequent caller for up to one minute (the dmrModelsTTL). During that window DMR models disappear from the picker for all users, recoverable only by waiting out the TTL.
Suggested fix: Use a context that is not tied to any single caller's lifetime inside the singleflight body. The simplest approach is context.WithoutCancel(ctx) (Go 1.21+), which propagates values but detaches cancellation:
v, err, _ := c.sf.Do("models", func() (any, error) {
if ids, ok, err := readFresh(); ok {
return ids, err
}
ids, err := r.dmrModelLister(context.WithoutCancel(ctx))
// ...
})

Additionally, consider not caching errors (or using a shorter error TTL) so a transient failure or cancellation doesn't block discovery for a full minute.
// still satisfies "prefer the default", so match on the repository.
defaultRepo := dmrModelRepo(defaultModel)
for _, m := range installed {
	if dmrModelRepo(m) == defaultRepo {
[LOW] Repo-prefix fallback in pickDMRAutoModel skips the embedding-model filter
The general fallback loop (used when no exact tag match is found) correctly calls looksLikeEmbeddingModel:
for _, m := range installed {
if !looksLikeEmbeddingModel(m) {
return m // safe: chat models only
}
}

But the repo-prefix loop that runs first does not apply the same guard:
defaultRepo := dmrModelRepo(defaultModel)
for _, m := range installed {
if dmrModelRepo(m) == defaultRepo {
return m // no embedding check here
}
}

If an embedding-only variant that shares the same repository prefix as the default model is installed (e.g. a future ai/qwen3-embed:latest published under the ai/qwen3 namespace), auto-selection could return it as the chat model. The agent would then start and fail when it tries to use a text-embedding model for conversation.
The fix is a one-liner:
for _, m := range installed {
if dmrModelRepo(m) == defaultRepo && !looksLikeEmbeddingModel(m) {
return m
}
}
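For context, here is a self-contained sketch of the full selection order with the guard applied in both loops. The helpers are stand-ins, not the PR's implementations: `modelRepo` assumes the repository is everything before the last colon, `looksLikeEmbedding` assumes a simple substring check, and the `ai/qwen3:embedding` and `ai/qwen3:8B` tags are made-up sample data.

```go
package main

import (
	"fmt"
	"strings"
)

// modelRepo is a stand-in for dmrModelRepo; it assumes the repository is
// everything before the last ':' in the reference.
func modelRepo(ref string) string {
	if i := strings.LastIndex(ref, ":"); i >= 0 {
		return ref[:i]
	}
	return ref
}

// looksLikeEmbedding is a stand-in for looksLikeEmbeddingModel; it assumes a
// case-insensitive substring check is good enough for illustration.
func looksLikeEmbedding(ref string) bool {
	return strings.Contains(strings.ToLower(ref), "embed")
}

// pickAutoModel sketches the described order: exact default, then same
// repository under another tag, then any chat model, then the static default.
func pickAutoModel(installed []string, defaultModel string) string {
	for _, m := range installed {
		if m == defaultModel {
			return m
		}
	}
	defaultRepo := modelRepo(defaultModel)
	for _, m := range installed {
		if modelRepo(m) == defaultRepo && !looksLikeEmbedding(m) {
			return m
		}
	}
	for _, m := range installed {
		if !looksLikeEmbedding(m) {
			return m
		}
	}
	return defaultModel
}

func main() {
	def := "ai/qwen3:latest"
	// Hypothetical tags: the embedding variant shares the default's repository.
	fmt.Println(pickAutoModel([]string{"ai/qwen3:embedding", "ai/qwen3:8B"}, def))
	// Only an embedding model is installed: fall back to the static default.
	fmt.Println(pickAutoModel([]string{"ai/qwen3:embedding"}, def))
}
```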
Summary
Fixes #2799. Docker Model Runner (DMR) models pulled locally were never enumerated, causing two reported problems:
- Declining the ai/qwen3:latest pull showed "No model providers available." although DMR had other models. Auto-selection was hard-coded to DefaultModels["dmr"] (ai/qwen3:latest); declining the pull failed selection.

Changes
- pkg/model/provider/dmr/list.go (new): ListModels queries DMR's OpenAI /models endpoint, reusing the same endpoint/baseURL/httpClient resolution as the inference client (handles MODEL_RUNNER_HOST and the Docker Desktop unix-socket transport)
- pkg/config/auto.go: AutoModelConfig takes a DMRModelLister; pickDMRAutoModel prefers the configured default (including under a non-default tag), else the first non-embedding installed model, else the static default
- pkg/config/auto.go, pkg/teamloader/teamloader.go: AutoModelFallbackError carries the cause (e.g. a declined pull), unwraps it, and suggests docker model pull
- pkg/runtime/dmr_models.go (new), model_switcher.go: buildDMRChoices mirrors the gateway discovery path (TTL cache + singleflight), filters embedding models, dedupes against configured models, and groups under "Other models"
- runtime.go, teamloader.go, cmd/root/models.go: NewLocalRuntime; the run path passes dmr.ListModels; docker agent models passes nil to stay side-effect-free

Behavior
- ai/qwen3:latest); the picker ref is dmr/<id> and round-trips through ParseModelRef (first-slash split) back to {provider: dmr, model: ai/qwen3:latest}, so selecting one does not re-trigger a pull.
- docker model pull.

Tests
New coverage (35 cases): DMR /models parsing and the exported ListModels entry point (MODEL_RUNNER_HOST + unreachable), auto-selection across installed/default/tagged/embedding-only/error cases, the cause-carrying fallback error, and buildDMRChoices including catalog-family embedding filtering, dedup, and cache TTL/failure behavior.

go build ./..., go vet, gofmt, and golangci-lint are clean on the changed packages.

Not included
The enhancement request for LM Studio and other local endpoints is left as a separate follow-up; this PR is scoped to the two reported DMR defects.