
feat(llm): add Ollama provider support #9338

Draft

Kureii wants to merge 2 commits into TriliumNext:main from Kureii:feat/llm-ollama-provider

Conversation

@Kureii (Contributor) commented Apr 8, 2026

Summary

Add Ollama as a new LLM provider, enabling local/self-hosted model support via Ollama's OpenAI-compatible API.

Changes

  • New OllamaProvider class using @ai-sdk/openai with a custom baseURL (see the sketch after this list)
  • Dynamic model discovery via GET /api/tags endpoint
  • Provider registration in AddProviderModal with base URL configuration
  • getModels route changed to asyncApiRoute to support async model fetching
  • "Free" label for zero-cost models in the model selector
  • Title generation support with automatic model selection
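
The review discussion below quotes the PR's createModel; around it, the provider wiring presumably looks roughly like the following sketch. This is not the PR's exact code: the class shape and the dummy API key are assumptions, and only createOpenAI from @ai-sdk/openai is taken as given.

    import { createOpenAI, type OpenAIProvider } from "@ai-sdk/openai";

    // Illustrative sketch; the PR's actual OllamaProvider likely differs in detail.
    class OllamaProvider {
        private openai: OpenAIProvider;

        constructor(private baseUrl = "http://localhost:11434") {
            // Ollama exposes an OpenAI-compatible API under /v1. Ollama ignores
            // the API key, but the SDK expects one, so a dummy value is passed.
            this.openai = createOpenAI({
                baseURL: `${baseUrl}/v1`,
                apiKey: "ollama"
            });
        }

        protected createModel(modelId: string) {
            return this.openai(modelId); // see the review below about .chat()
        }
    }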

Files (9 files, +253/-26)

  • apps/server/src/services/llm/providers/ollama.ts - New: Ollama provider implementation
  • apps/server/src/services/llm/index.ts - Provider registration
  • apps/server/src/services/llm/chat_title.ts - Ollama model handling
  • apps/server/src/routes/api/llm_chat.ts - asyncApiRoute for getModels
  • apps/server/src/routes/routes.ts - Route type update
  • apps/client/src/widgets/type_widgets/options/llm/AddProviderModal.tsx - Ollama option in UI
  • apps/client/src/widgets/type_widgets/llm_chat/ChatInputBar.tsx - Conditional cost display
  • apps/client/src/widgets/type_widgets/llm_chat/useLlmChat.ts - "Free" label logic
  • apps/client/src/translations/en/translation.json - Translation keys

Add Ollama as a new LLM provider using the OpenAI-compatible API with
a custom base URL (default: http://localhost:11434).

- New OllamaProvider class that dynamically fetches available models
  from the running Ollama instance via /api/tags (see the sketch after
  this list)
- AddProviderModal updated with Ollama option, base URL field, and
  conditional API key requirement
- Models displayed as 'Free' instead of showing pricing
- Async model loading with 5s timeout for resilience
- Route changed to asyncApiRoute for async model fetching
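
The /api/tags discovery with the 5s timeout might look roughly like this. A sketch, not the PR's code: the helper name is hypothetical, and only the models[].name field of Ollama's /api/tags response is assumed.

    // Hypothetical helper illustrating the described behavior.
    async function fetchOllamaModels(baseUrl = "http://localhost:11434"): Promise<string[]> {
        // AbortSignal.timeout aborts the request after 5s so an unreachable
        // Ollama instance cannot hang the model list indefinitely.
        const response = await fetch(`${baseUrl}/api/tags`, {
            signal: AbortSignal.timeout(5000)
        });
        if (!response.ok) {
            throw new Error(`Ollama /api/tags returned ${response.status}`);
        }
        const data = (await response.json()) as { models: { name: string }[] };
        return data.models.map((m) => m.name);
    }
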
@dosubot (Bot) added the size:L label (this PR changes 100-499 lines, ignoring generated files) on Apr 8, 2026
@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces support for Ollama as a self-hosted AI provider. Key changes include updating the provider configuration UI to handle optional API keys and custom base URLs, adding translations for free models, and implementing the OllamaProvider on the server to dynamically fetch models from local instances. Feedback focuses on simplifying redundant logic in model retrieval and optimizing performance by caching loaded models to avoid unnecessary network requests during chat and title generation.
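
The caching the review asks for could take a shape like the one below. This is a sketch of one possible approach, not the bot's concrete suggestion: the names and the TTL value are assumptions.

    // Hypothetical module-level cache to avoid re-fetching /api/tags
    // on every chat message or title generation.
    let cachedModels: string[] | null = null;
    let cachedAt = 0;
    const CACHE_TTL_MS = 60_000; // assumed value

    async function getModelsCached(fetchModels: () => Promise<string[]>): Promise<string[]> {
        const now = Date.now();
        if (cachedModels && now - cachedAt < CACHE_TTL_MS) {
            return cachedModels; // serve from cache while fresh
        }
        cachedModels = await fetchModels();
        cachedAt = now;
        return cachedModels;
    }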

Two collapsed comment threads on apps/server/src/services/llm/providers/ollama.ts
@eliandoran (Contributor) left a comment

Overall seems pretty fine.

However, I've tested it on NixOS, which ships Ollama 0.12.11 (predating the introduction of the responses API in 0.13.3), and the chat fails:

APICallError [AI_APICallError]: Not Found
    at <anonymous> (/home/elian/Projects/TriliumNext/Notes/node_modules/@ai-sdk/provider-utils/src/response-handler.ts:70:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async postToApi (/home/elian/Projects/TriliumNext/Notes/node_modules/@ai-sdk/provider-utils/src/post-to-api.ts:118:28)
    at async OpenAIResponsesLanguageModel.doStream (/home/elian/Projects/TriliumNext/Notes/node_modules/@ai-sdk/openai/src/responses/openai-responses-language-model.ts:1035:50)
    at async fn (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/generate-text/stream-text.ts:1686:27)
    at async <anonymous> (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/telemetry/record-span.ts:32:24)
    at async _retryWithExponentialBackoff (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/util/retry-with-exponential-backoff.ts:96:12)
    at async streamStep (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/generate-text/stream-text.ts:1638:17)
    at async fn (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/generate-text/stream-text.ts:2166:9)
    at async <anonymous> (/home/elian/Projects/TriliumNext/Notes/node_modules/ai/src/telemetry/record-span.ts:32:24) {
  cause: undefined,
  url: 'http://localhost:11434/v1/responses',

This is not good UX; we must guard against it.

Seems that we can fall back to chat mode via:

    protected createModel(modelId: string) {
        // Use .chat() to target /v1/chat/completions (Ollama doesn't support /v1/responses)
-        return this.openai(modelId);
+        return this.openai.chat(modelId);
    }

Maybe we could do a version check of Ollama.
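
Such a check could look roughly like this, assuming Ollama's GET /api/version endpoint (which returns a JSON body of the form { "version": "0.12.11" }); the threshold constant comes from the comment above, and the function name is hypothetical.

    // Sketch: detect Ollama versions that predate the responses API.
    const RESPONSES_API_MIN_VERSION = "0.13.3"; // per the comment above

    async function supportsResponsesApi(baseUrl: string): Promise<boolean> {
        const res = await fetch(`${baseUrl}/api/version`);
        const { version } = (await res.json()) as { version: string };
        // Compare dotted version strings numerically, segment by segment.
        const a = version.split(".").map(Number);
        const b = RESPONSES_API_MIN_VERSION.split(".").map(Number);
        for (let i = 0; i < Math.max(a.length, b.length); i++) {
            const diff = (a[i] ?? 0) - (b[i] ?? 0);
            if (diff !== 0) return diff > 0;
        }
        return true;
    }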

Comment on lines +126 to +130
costDescription: m.costMultiplier
? `${m.costMultiplier}x`
: m.pricing.input === 0 && m.pricing.output === 0
? t("llm_chat.free")
: undefined

Too complicated. Extract to a function with simple ifs and returns.
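
For instance, the nested ternary could become something like this (a sketch; the function name and the inline model type are assumptions, and t() is the translation helper already used in the snippet above):

    // Hypothetical extraction of the nested ternary quoted above.
    function getCostDescription(m: {
        costMultiplier?: number;
        pricing: { input: number; output: number };
    }): string | undefined {
        if (m.costMultiplier) {
            return `${m.costMultiplier}x`;
        }
        if (m.pricing.input === 0 && m.pricing.output === 0) {
            return t("llm_chat.free");
        }
        return undefined;
    }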

@eliandoran marked this pull request as draft on April 11, 2026, 22:20

Labels

merge-conflicts · size:L (this PR changes 100-499 lines, ignoring generated files)
