Feat/llm chat enhancements #9316

Closed

Kureii wants to merge 11 commits into TriliumNext:main from Kureii:feat/llm-chat-enhancements

Conversation

@Kureii (Contributor) commented Apr 6, 2026

PR Title

feat(llm): enhance LLM chat with Ollama, tool approval, knowledge base, web search, and stop generation

PR Description

Summary

This PR adds several major enhancements to the LLM chat feature:

  1. Ollama Provider - Local LLM support via Ollama's OpenAI-compatible API
  2. Note Mutation Tools - New LLM tools for rename, delete, move, and clone operations
  3. Tool Approval System - Human-in-the-loop approval for destructive note operations
  4. Knowledge Base Mode - Use selected notes as context sources (like Google NotebookLM)
  5. Stop Generation - Abort LLM streaming mid-response with partial content preservation
  6. Web Search Engines - Tavily and SearXNG as alternatives to provider-built-in search
  7. Search Timeout - Configurable timeout for web search requests

Features in Detail

Ollama Provider

  • New provider type using @ai-sdk/openai with custom baseURL pointing to local Ollama instance
  • Registered in provider selector and model listing
  • Supports all Ollama models (llama3, mistral, codellama, etc.)
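The model-listing side of such a provider can be sketched as below. The helper names are hypothetical (the PR's actual implementation lives in `apps/server/src/services/llm/providers/ollama.ts`); Ollama's `/api/tags` endpoint returns `{ "models": [{ "name": "llama3:latest", ... }] }`.

```typescript
// Hypothetical sketch of an Ollama model-listing helper.

interface OllamaTagsResponse {
    models: { name: string }[];
}

// Pure helper: extract model names from the /api/tags payload.
function parseOllamaTags(payload: OllamaTagsResponse): string[] {
    return payload.models.map((m) => m.name);
}

// Fetch wrapper with a timeout so an unresponsive local Ollama
// instance cannot hang the server-side request.
async function listOllamaModels(baseUrl: string): Promise<string[]> {
    const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(5000) });
    if (!res.ok) {
        throw new Error(`Ollama returned ${res.status}`);
    }
    return parseOllamaTags(await res.json() as OllamaTagsResponse);
}
```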

Note Mutation Tools

  • rename_note - Rename an existing note
  • delete_note - Delete a note (move to trash)
  • move_note - Move a note to a different parent branch
  • clone_note - Clone a note to another location in the tree
  • All marked with mutates: true for the approval system

Tool Approval System

  • Tools with mutates: true require explicit user approval before execution
  • AI SDK tool() without execute function prevents auto-execution
  • New POST /api/llm-chat/execute-tool endpoint wraps execution in sql.transactional()
  • UI shows approve/reject buttons with visual indicators on ToolCallCard
  • Approval state propagated through LlmStreamChunk protocol (requiresApproval field)
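A minimal sketch of the gating mechanism (types and function names are illustrative, not the exact contents of `tool_registry.ts`): with the Vercel AI SDK, a tool defined without an `execute` function is surfaced to the client as a tool call but never auto-executed, which is the hook the approval flow relies on.

```typescript
// Illustrative sketch: strip `execute` from mutating tools so the
// AI SDK cannot auto-run them; the client must call
// POST /api/llm-chat/execute-tool after explicit user approval.

interface RegisteredTool {
    name: string;
    mutates: boolean;
    execute: (args: Record<string, unknown>) => unknown;
}

interface SdkToolShape {
    // `execute` is optional: omitting it prevents auto-execution.
    execute?: (args: Record<string, unknown>) => unknown;
}

function toSdkTools(registry: RegisteredTool[]): Record<string, SdkToolShape> {
    const tools: Record<string, SdkToolShape> = {};
    for (const t of registry) {
        tools[t.name] = t.mutates ? {} : { execute: t.execute };
    }
    return tools;
}

function getMutatingToolNames(registry: RegisteredTool[]): string[] {
    return registry.filter((t) => t.mutates).map((t) => t.name);
}
```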

Knowledge Base Mode

  • Users can select notes as context sources for the LLM conversation
  • Harvard-style numbered citations [1], [2] with a References section
  • Toggle in ChatInputBar with note selector panel
  • Note content injected into system prompt with numbered reference format
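The prompt-injection step can be sketched like this (the function and prompt wording are illustrative, not the PR's exact strings):

```typescript
// Hypothetical sketch of building a system prompt with
// numbered sources for Harvard-style [1], [2] citations.

interface KbSource {
    title: string;
    content: string;
}

function buildKbSystemPrompt(sources: KbSource[]): string {
    const refs = sources
        .map((s, i) => `[${i + 1}] ${s.title}\n${s.content}`)
        .join("\n\n");
    return [
        "Answer using ONLY the numbered sources below.",
        "Cite them inline as [1], [2], ... and end with a References section.",
        "",
        refs,
    ].join("\n");
}
```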

Stop Generation

  • AbortController-based stream cancellation via fetch({ signal })
  • Send button toggles to a red stop button during streaming
  • Partial content (text, tool calls) preserved when generation is aborted
  • Graceful AbortError handling finalizes the message with collected content
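The real code wires an AbortController into `fetch({ signal })`; this simplified sketch (with a synchronous chunk source standing in for the SSE stream) shows only the finalization behavior: everything collected before the abort is kept.

```typescript
// Illustrative sketch of partial-content preservation on abort.

interface StreamResult {
    text: string;
    aborted: boolean;
}

function consumeStream(chunks: Iterable<string>, signal: AbortSignal): StreamResult {
    let text = "";
    for (const chunk of chunks) {
        if (signal.aborted) {
            // Stop reading but keep everything collected so far;
            // the message is finalized with the partial content.
            return { text, aborted: true };
        }
        text += chunk;
    }
    return { text, aborted: false };
}
```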

Web Search Engines

  • Provider default - Uses built-in search from Anthropic/OpenAI/Google
  • Tavily - AI-optimized search API (free tier: 1000 queries/month)
  • SearXNG - Self-hosted metasearch engine, no API key required
  • Settings UI with conditional fields (API key for Tavily, instance URL for SearXNG)
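The conditional configuration can be expressed as a small resolver (option names match the PR; the resolver function itself is hypothetical): fall back to the provider default when a selected engine is missing its required field.

```typescript
// Illustrative sketch of engine selection with conditional validation.

type SearchEngine = "provider" | "tavily" | "searxng";

interface SearchOptions {
    llmWebSearchEngine: SearchEngine;
    llmTavilyApiKey: string;
    llmSearxngUrl: string;
}

function resolveSearchEngine(opts: SearchOptions): { engine: SearchEngine; error?: string } {
    switch (opts.llmWebSearchEngine) {
        case "tavily":
            return opts.llmTavilyApiKey
                ? { engine: "tavily" }
                : { engine: "provider", error: "Tavily selected but no API key configured" };
        case "searxng":
            return opts.llmSearxngUrl
                ? { engine: "searxng" }
                : { engine: "provider", error: "SearXNG selected but no instance URL configured" };
        default:
            return { engine: "provider" };
    }
}
```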

Search Timeout

  • Configurable timeout (default: 15 seconds, range: 1-120s) for web search
  • AbortSignal.timeout() applied to Tavily and SearXNG fetch calls
  • Exposed in LLM settings UI
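Since the stored option is a string of seconds, the conversion to a millisecond timeout can be sketched as below (helper name is illustrative), clamping to the documented 1-120 s range:

```typescript
// Illustrative sketch: parse the stored option (seconds, as a string),
// clamp to 1-120 s, fall back to 15 s on garbage input, return ms.

function resolveSearchTimeoutMs(raw: string, fallbackSeconds = 15): number {
    const parsed = Number.parseInt(raw, 10);
    const seconds = Number.isNaN(parsed)
        ? fallbackSeconds
        : Math.min(120, Math.max(1, parsed));
    return seconds * 1000;
}

// Usage with a search request:
// fetch(url, { signal: AbortSignal.timeout(resolveSearchTimeoutMs(option)) })
```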

New Options

Option               Type     Default      Description
llmWebSearchEngine   string   "provider"   Search engine: provider/tavily/searxng
llmTavilyApiKey      string   ""           Tavily API key
llmSearxngUrl        string   ""           SearXNG instance URL
llmSearchTimeout     string   "15"         Search timeout in seconds

New API Endpoints

Method   Path                          Description
POST     /api/llm-chat/execute-tool    Execute an approved tool call
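The endpoint's core contract can be sketched as a pure function (handler shape and helper names are hypothetical; the real route wraps execution in Trilium's `sql.transactional()`): look up the approved tool and run it inside a transaction so a failing mutation rolls back atomically.

```typescript
// Illustrative sketch of the execute-tool endpoint's core logic.

type Transactional = <T>(fn: () => T) => T;

interface ToolCallRequest {
    toolName: string;
    args: Record<string, unknown>;
}

function executeApprovedTool(
    req: ToolCallRequest,
    tools: Record<string, (args: Record<string, unknown>) => unknown>,
    transactional: Transactional,
): unknown {
    const tool = tools[req.toolName];
    if (!tool) {
        return { error: `Unknown tool: ${req.toolName}` };
    }
    // Run the mutation inside a transaction for atomicity.
    return transactional(() => tool(req.args));
}
```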

Files Changed (29 files, +1029 / -99)

Server

  • apps/server/src/services/llm/providers/ollama.ts - New: Ollama provider
  • apps/server/src/services/llm/web_search_tools.ts - New: Tavily & SearXNG tools
  • apps/server/src/services/llm/providers/base_provider.ts - KB citations, web search engine selection, timeout
  • apps/server/src/services/llm/tools/tool_registry.ts - Approval mechanism, getMutatingToolNames()
  • apps/server/src/services/llm/tools/note_tools.ts - rename_note, delete_note
  • apps/server/src/services/llm/tools/hierarchy_tools.ts - move_note, clone_note
  • apps/server/src/services/llm/stream.ts - requiresApproval flag in tool-call chunks
  • apps/server/src/routes/api/llm_chat.ts - execute-tool endpoint, mutatingToolNames
  • apps/server/src/routes/routes.ts - Route registration
  • apps/server/src/services/llm/index.ts - Ollama registration
  • apps/server/src/services/llm/chat_title.ts - Ollama model handling
  • apps/server/src/services/options_init.ts - New option defaults
  • apps/server/src/routes/api/options.ts - Options whitelist

Client

  • apps/client/src/widgets/type_widgets/llm_chat/useLlmChat.ts - KB, abort, approval callbacks
  • apps/client/src/widgets/type_widgets/llm_chat/ChatInputBar.tsx - KB toggle, stop button
  • apps/client/src/widgets/type_widgets/llm_chat/ChatInputBar.css - Stop button styling
  • apps/client/src/widgets/type_widgets/llm_chat/ToolCallCard.tsx - Approval UI
  • apps/client/src/widgets/type_widgets/llm_chat/ToolCallCard.css - Approval styling
  • apps/client/src/widgets/type_widgets/llm_chat/ChatMessage.tsx - Approval props threading
  • apps/client/src/widgets/type_widgets/llm_chat/LlmChat.tsx - Approval callbacks
  • apps/client/src/widgets/type_widgets/llm_chat/ExpandableCard.tsx - defaultExpanded prop
  • apps/client/src/widgets/type_widgets/llm_chat/llm_chat_types.ts - ToolCall type extensions
  • apps/client/src/widgets/sidebar/SidebarChat.tsx - Approval callback forwarding
  • apps/client/src/services/llm_chat.ts - AbortSignal, executeToolCall()
  • apps/client/src/widgets/type_widgets/options/llm.tsx - WebSearchSettings, timeout field
  • apps/client/src/widgets/type_widgets/options/llm/AddProviderModal.tsx - Ollama option

Shared

  • packages/commons/src/lib/llm_api.ts - requiresApproval on tool_use chunk
  • packages/commons/src/lib/options_interface.ts - New option types

Documentation & i18n

  • apps/server/src/assets/doc_notes/en/User Guide/User Guide/AI.html - Rewritten
  • apps/server/src/assets/doc_notes/en/User Guide/User Guide/AI/Knowledge Base.html - New
  • apps/server/src/assets/doc_notes/en/User Guide/!!!meta.json - Updated structure
  • apps/client/src/translations/en/translation.json - New translation keys

Testing

  • pnpm typecheck - passes
  • pnpm client:build - builds successfully

Kureii added 6 commits April 6, 2026 20:47

Add Ollama as a new LLM provider using the OpenAI-compatible API.
- Create ollama.ts provider with custom baseURL pointing to local Ollama
- Update AddProviderModal to include Ollama in the provider type dropdown
- Register Ollama provider in the LLM service index
- Handle chat title generation for Ollama models

Add new LLM tools for note management:
- rename_note: Rename an existing note
- delete_note: Delete a note (move to trash)
- move_note: Move a note to a different parent
- clone_note: Clone a note to another location
All tools are marked with mutates: true for the approval system.

Implement a human-in-the-loop approval mechanism for LLM tools that
modify notes. Tools marked with mutates: true require explicit user
approval before execution.

Server-side:
- tool_registry: omit execute fn for mutating tools (AI SDK won't auto-run)
- stream: flag tool calls with requiresApproval in SSE chunks
- llm_chat route: new POST /api/llm-chat/execute-tool endpoint
- Wrap tool execution in sql.transactional() for safety

Client-side:
- ToolCallCard: approve/reject buttons with visual indicators
- ChatMessage/LlmChat/SidebarChat: thread approval callbacks
- ExpandableCard: auto-expand for pending approvals

Protocol:
- LlmStreamChunk tool_use gains requiresApproval field
- New ToolCall type fields: requiresApproval, rejected

Knowledge Base:
- Allow users to select notes as context sources for LLM chat
- Harvard-style numbered citations [1], [2] with References section
- KB toggle in ChatInputBar with note selector panel

Stop Generation:
- AbortController-based stream cancellation
- Send button toggles to stop button during streaming
- Partial content preserved when generation is stopped

Stream abort:
- llm_chat service accepts AbortSignal for fetch cancellation
- useLlmChat manages AbortController lifecycle
- Graceful handling of AbortError to finalize partial messages

Web Search Engines:
- Support Tavily API and self-hosted SearXNG as alternatives to
  provider-built-in web search
- New settings UI with engine selector and conditional config fields
- Tavily: API key auth, returns AI-generated answer + search results
- SearXNG: self-hosted metasearch, no API key required

Search Timeout:
- Configurable timeout (default 15s) for web search requests
- AbortSignal.timeout() applied to all custom search tool fetch calls
- Timeout setting exposed in LLM settings UI (1-120 seconds)

New options: llmWebSearchEngine, llmTavilyApiKey, llmSearxngUrl,
llmSearchTimeout

- Rewrite AI.html with comprehensive feature documentation
- Add Knowledge Base documentation page (AI/Knowledge Base.html)
- Update !!!meta.json for new documentation structure
- Add translation keys for: tool approval (approve, reject,
  pending_approval, rejected_by_user), stop generation, web search
  settings (engine, timeout, Tavily, SearXNG), and LLM tool names
@dosubot (bot) added the size:XXL label (This PR changes 1000+ lines, ignoring generated files) on Apr 6, 2026
@gemini-code-assist (bot) left a comment
Code Review

This pull request significantly expands the AI Chat capabilities by introducing a Knowledge Base mode for source-specific querying, support for local Ollama models, and alternative web search engines like Tavily and SearXNG. It also implements a human-in-the-loop approval workflow for mutating tool calls (e.g., renaming or deleting notes) and adds a stop button to cancel active streams. Feedback includes a performance concern regarding markdown conversion for large note previews, a missing timeout in the Ollama fetch call, and a lack of title validation in the note renaming tool.

Comment on lines +81 to +88:

```ts
const full = note.isContentAvailable() ? (() => {
    const content = note.getContent();
    if (typeof content !== "string") return preview;
    if (note.type === "text") {
        return markdownExport.toMarkdown(content);
    }
    return content;
})() : preview;
```
Severity: medium

Calling markdownExport.toMarkdown(content) on the full content of a note just to extract a 1500-character preview can be very inefficient for large notes. Since this is executed for up to 20 source notes in the knowledge base, it could significantly increase latency and memory usage. Consider truncating the HTML content before conversion or using a simpler tag-stripping method for the preview.
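One way to implement that suggestion (a sketch, not the PR's actual code) is to strip tags with a regex and truncate, instead of running a full HTML-to-Markdown conversion just to build a preview. Regex tag stripping is a lossy heuristic: fine for previews, not for faithful conversion.

```typescript
// Illustrative sketch: cheap tag-stripping preview instead of
// converting the entire note through toMarkdown().

function htmlPreview(html: string, maxChars = 1500): string {
    const text = html
        .replace(/<[^>]*>/g, " ")   // drop tags
        .replace(/\s+/g, " ")       // collapse whitespace
        .trim();
    return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```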

```ts
*/
async loadModels(): Promise<ModelInfo[]> {
    try {
        const res = await fetch(`${this.baseUrl}/api/tags`);
```
Severity: medium

The fetch call to the local Ollama instance lacks a timeout. If the service is unresponsive, this could hang the server-side request indefinitely. It's recommended to add a timeout using AbortSignal.timeout().

Suggested change:

```diff
-const res = await fetch(`${this.baseUrl}/api/tags`);
+const res = await fetch(`${this.baseUrl}/api/tags`, { signal: AbortSignal.timeout(5000) });
```

Comment on lines +247 to +257:

```ts
execute: ({ noteId, newTitle }) => {
    const note = becca.getNote(noteId);
    if (!note) {
        return { error: "Note not found" };
    }
    if (note.isProtected) {
        return { error: "Note is protected and cannot be renamed" };
    }

    note.title = newTitle;
    note.save();
```
Severity: medium

The rename_note tool does not validate the newTitle. Providing an empty or whitespace-only string could lead to notes with no visible title, which is generally discouraged. It's better to add a check for empty titles.

Suggested change:

```ts
execute: ({ noteId, newTitle }) => {
    const note = becca.getNote(noteId);
    if (!note) {
        return { error: "Note not found" };
    }
    if (note.isProtected) {
        return { error: "Note is protected and cannot be renamed" };
    }

    const trimmedTitle = newTitle.trim();
    if (!trimmedTitle) {
        return { error: "Title cannot be empty" };
    }

    note.title = trimmedTitle;
    note.save();
```

@eliandoran (Contributor) commented:

Hi, interesting contribution.

Did you test it thoroughly? Which models were used to test Ollama?

Kureii added 4 commits April 7, 2026 01:33

- base_provider: replace full markdown conversion with simple HTML tag
  stripping for KB previews — avoids expensive toMarkdown() call on
  large notes (up to 20 sources × full content)
- ollama: add AbortSignal.timeout(5000) to loadModels() fetch call to
  prevent indefinite hangs when the Ollama service is unresponsive
- note_tools: validate newTitle in rename_note — trim whitespace and
  reject empty titles

The note autocomplete dropdown in the Knowledge Base panel was
opening downward, causing suggestions to overflow off-screen.
Use a container ref and CSS to position the dropdown above the
input field instead.

Add closeOnBlur option to note autocomplete Options interface.
When set, skips debug:true so the dropdown closes when the input
loses focus. Used in the KB note picker.
@Kureii (Contributor, Author) commented Apr 7, 2026

Hi, thanks for the review!

Testing summary:
I tested Ollama with gemma4:27b, gemma3:27b, qwen3.5:27b, and mistral-small3.2:24b. To be transparent - local model support is functional but results are very inconsistent across models:

  • gemma4:27b - Best results overall. Supports tool calling, follows instructions well, successfully searches notes and retrieves content. However, it still doesn't iterate autonomously - it stops after receiving a tool result and waits for user input before taking the next step.
  • gemma3:27b - Doesn't work at all (neither chat nor tools).
  • qwen3.5:27b - Tool calling works technically, but poor instruction following.
  • mistral-small3.2:24b - Similar limitations to gemma4, stops after first tool call.

The core issue: None of the tested local models can autonomously chain multiple tool calls. Even with maxSteps set to 15 in the AI SDK, the model itself needs to emit the next tool call - and local models consistently stop after the first result, waiting for user prompting to continue. Commercial models (Claude, GPT-4o) handle this naturally.

That said, when models do work, the results are solid - search, content retrieval, and tool execution all function correctly. This is intentionally a foundational implementation: the architecture (tool registry, approval system, streaming, knowledge base, web search) is designed to be extensible. Prompt tuning, custom system prompts, and model-specific behavior can be iterated on incrementally without architectural changes.

On the security side - the tool approval system works reliably regardless of model quality. I wasn't able to get any of the tested models to bypass the approval mechanism for mutating operations, so the safety layer holds up even with less capable models.

I've also addressed all three code review findings in commit 08c3f03:

  • KB preview performance: Replaced full toMarkdown() conversion with simple HTML tag stripping
  • Ollama fetch timeout: Added AbortSignal.timeout(5000) to loadModels()
  • rename_note validation: Now trims whitespace and rejects empty titles

@eliandoran (Contributor) commented:

@Kureii , could you please try to separate the features into multiple PRs? There are quite a few overlapping concerns at the same time:

  • Ollama provider
  • Multiple search providers
  • Tool approval
  • Knowledge base mode

And anything you can think of splitting.

This makes it easy for me to review and test, and perhaps not all features can go into Trilium without further changes.

@eliandoran eliandoran marked this pull request as draft April 7, 2026 10:21
@eliandoran (Contributor) commented:

Closing, as the PR has been successfully split into #9338 through #9343.

@eliandoran eliandoran closed this Apr 10, 2026

Labels

merge-conflicts, size:XXL (This PR changes 1000+ lines, ignoring generated files)
