Feat/llm chat enhancements #9316

Closed

Kureii wants to merge 11 commits into TriliumNext:main from Kureii:feat/llm-chat-enhancements

Conversation

@Kureii (Contributor) commented Apr 6, 2026

PR Title

feat(llm): enhance LLM chat with Ollama, tool approval, knowledge base, web search, and stop generation

PR Description

Summary

This PR adds several major enhancements to the LLM chat feature:

  1. Ollama Provider - Local LLM support via Ollama's OpenAI-compatible API
  2. Note Mutation Tools - New LLM tools for rename, delete, move, and clone operations
  3. Tool Approval System - Human-in-the-loop approval for destructive note operations
  4. Knowledge Base Mode - Use selected notes as context sources (like Google NotebookLM)
  5. Stop Generation - Abort LLM streaming mid-response with partial content preservation
  6. Web Search Engines - Tavily and SearXNG as alternatives to provider-built-in search
  7. Search Timeout - Configurable timeout for web search requests

Features in Detail

Ollama Provider

  • New provider type using @ai-sdk/openai with custom baseURL pointing to local Ollama instance
  • Registered in provider selector and model listing
  • Supports all Ollama models (llama3, mistral, codellama, etc.)
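The model-listing side of such a provider can be sketched as below. The helper names are hypothetical (the PR's actual implementation lives in `apps/server/src/services/llm/providers/ollama.ts`); Ollama's `/api/tags` endpoint returns `{ "models": [{ "name": "llama3:latest", ... }] }`.

```typescript
// Hypothetical sketch of an Ollama model-listing helper.

interface OllamaTagsResponse {
    models: { name: string }[];
}

// Pure helper: extract model names from the /api/tags payload.
function parseOllamaTags(payload: OllamaTagsResponse): string[] {
    return payload.models.map((m) => m.name);
}

// Fetch wrapper with a timeout so an unresponsive local Ollama
// instance cannot hang the server-side request.
async function listOllamaModels(baseUrl: string): Promise<string[]> {
    const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(5000) });
    if (!res.ok) {
        throw new Error(`Ollama returned ${res.status}`);
    }
    return parseOllamaTags(await res.json() as OllamaTagsResponse);
}
```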

Note Mutation Tools

  • rename_note - Rename an existing note
  • delete_note - Delete a note (move to trash)
  • move_note - Move a note to a different parent branch
  • clone_note - Clone a note to another location in the tree
  • All marked with mutates: true for the approval system

Tool Approval System

  • Tools with mutates: true require explicit user approval before execution
  • AI SDK tool() without execute function prevents auto-execution
  • New POST /api/llm-chat/execute-tool endpoint wraps execution in sql.transactional()
  • UI shows approve/reject buttons with visual indicators on ToolCallCard
  • Approval state propagated through LlmStreamChunk protocol (requiresApproval field)
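A minimal sketch of the gating mechanism (types and function names are illustrative, not the exact contents of `tool_registry.ts`): with the Vercel AI SDK, a tool defined without an `execute` function is surfaced to the client as a tool call but never auto-executed, which is the hook the approval flow relies on.

```typescript
// Illustrative sketch: strip `execute` from mutating tools so the
// AI SDK cannot auto-run them; the client must call
// POST /api/llm-chat/execute-tool after explicit user approval.

interface RegisteredTool {
    name: string;
    mutates: boolean;
    execute: (args: Record<string, unknown>) => unknown;
}

interface SdkToolShape {
    // `execute` is optional: omitting it prevents auto-execution.
    execute?: (args: Record<string, unknown>) => unknown;
}

function toSdkTools(registry: RegisteredTool[]): Record<string, SdkToolShape> {
    const tools: Record<string, SdkToolShape> = {};
    for (const t of registry) {
        tools[t.name] = t.mutates ? {} : { execute: t.execute };
    }
    return tools;
}

function getMutatingToolNames(registry: RegisteredTool[]): string[] {
    return registry.filter((t) => t.mutates).map((t) => t.name);
}
```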

Knowledge Base Mode

  • Users can select notes as context sources for the LLM conversation
  • Harvard-style numbered citations [1], [2] with a References section
  • Toggle in ChatInputBar with note selector panel
  • Note content injected into system prompt with numbered reference format
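The prompt-injection step can be sketched like this (the function and prompt wording are illustrative, not the PR's exact strings):

```typescript
// Hypothetical sketch of building a system prompt with
// numbered sources for Harvard-style [1], [2] citations.

interface KbSource {
    title: string;
    content: string;
}

function buildKbSystemPrompt(sources: KbSource[]): string {
    const refs = sources
        .map((s, i) => `[${i + 1}] ${s.title}\n${s.content}`)
        .join("\n\n");
    return [
        "Answer using ONLY the numbered sources below.",
        "Cite them inline as [1], [2], ... and end with a References section.",
        "",
        refs,
    ].join("\n");
}
```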

Stop Generation

  • AbortController-based stream cancellation via fetch({ signal })
  • Send button toggles to a red stop button during streaming
  • Partial content (text, tool calls) preserved when generation is aborted
  • Graceful AbortError handling finalizes the message with collected content
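The real code wires an AbortController into `fetch({ signal })`; this simplified sketch (with a synchronous chunk source standing in for the SSE stream) shows only the finalization behavior: everything collected before the abort is kept.

```typescript
// Illustrative sketch of partial-content preservation on abort.

interface StreamResult {
    text: string;
    aborted: boolean;
}

function consumeStream(chunks: Iterable<string>, signal: AbortSignal): StreamResult {
    let text = "";
    for (const chunk of chunks) {
        if (signal.aborted) {
            // Stop reading but keep everything collected so far;
            // the message is finalized with the partial content.
            return { text, aborted: true };
        }
        text += chunk;
    }
    return { text, aborted: false };
}
```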

Web Search Engines

  • Provider default - Uses built-in search from Anthropic/OpenAI/Google
  • Tavily - AI-optimized search API (free tier: 1000 queries/month)
  • SearXNG - Self-hosted metasearch engine, no API key required
  • Settings UI with conditional fields (API key for Tavily, instance URL for SearXNG)
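The conditional configuration can be expressed as a small resolver (option names match the PR; the resolver function itself is hypothetical): fall back to the provider default when a selected engine is missing its required field.

```typescript
// Illustrative sketch of engine selection with conditional validation.

type SearchEngine = "provider" | "tavily" | "searxng";

interface SearchOptions {
    llmWebSearchEngine: SearchEngine;
    llmTavilyApiKey: string;
    llmSearxngUrl: string;
}

function resolveSearchEngine(opts: SearchOptions): { engine: SearchEngine; error?: string } {
    switch (opts.llmWebSearchEngine) {
        case "tavily":
            return opts.llmTavilyApiKey
                ? { engine: "tavily" }
                : { engine: "provider", error: "Tavily selected but no API key configured" };
        case "searxng":
            return opts.llmSearxngUrl
                ? { engine: "searxng" }
                : { engine: "provider", error: "SearXNG selected but no instance URL configured" };
        default:
            return { engine: "provider" };
    }
}
```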

Search Timeout

  • Configurable timeout (default: 15 seconds, range: 1-120s) for web search
  • AbortSignal.timeout() applied to Tavily and SearXNG fetch calls
  • Exposed in LLM settings UI
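Since the stored option is a string of seconds, the conversion to a millisecond timeout can be sketched as below (helper name is illustrative), clamping to the documented 1-120 s range:

```typescript
// Illustrative sketch: parse the stored option (seconds, as a string),
// clamp to 1-120 s, fall back to 15 s on garbage input, return ms.

function resolveSearchTimeoutMs(raw: string, fallbackSeconds = 15): number {
    const parsed = Number.parseInt(raw, 10);
    const seconds = Number.isNaN(parsed)
        ? fallbackSeconds
        : Math.min(120, Math.max(1, parsed));
    return seconds * 1000;
}

// Usage with a search request:
// fetch(url, { signal: AbortSignal.timeout(resolveSearchTimeoutMs(option)) })
```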

New Options

Option               Type     Default      Description
llmWebSearchEngine   string   "provider"   Search engine: provider/tavily/searxng
llmTavilyApiKey      string   ""           Tavily API key
llmSearxngUrl        string   ""           SearXNG instance URL
llmSearchTimeout     string   "15"         Search timeout in seconds

New API Endpoints

Method   Path                          Description
POST     /api/llm-chat/execute-tool    Execute an approved tool call
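The endpoint's core contract can be sketched as a pure function (handler shape and helper names are hypothetical; the real route wraps execution in Trilium's `sql.transactional()`): look up the approved tool and run it inside a transaction so a failing mutation rolls back atomically.

```typescript
// Illustrative sketch of the execute-tool endpoint's core logic.

type Transactional = <T>(fn: () => T) => T;

interface ToolCallRequest {
    toolName: string;
    args: Record<string, unknown>;
}

function executeApprovedTool(
    req: ToolCallRequest,
    tools: Record<string, (args: Record<string, unknown>) => unknown>,
    transactional: Transactional,
): unknown {
    const tool = tools[req.toolName];
    if (!tool) {
        return { error: `Unknown tool: ${req.toolName}` };
    }
    // Run the mutation inside a transaction for atomicity.
    return transactional(() => tool(req.args));
}
```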

Files Changed (29 files, +1029 / -99)

Server

  • apps/server/src/services/llm/providers/ollama.ts - New: Ollama provider
  • apps/server/src/services/llm/web_search_tools.ts - New: Tavily & SearXNG tools
  • apps/server/src/services/llm/providers/base_provider.ts - KB citations, web search engine selection, timeout
  • apps/server/src/services/llm/tools/tool_registry.ts - Approval mechanism, getMutatingToolNames()
  • apps/server/src/services/llm/tools/note_tools.ts - rename_note, delete_note
  • apps/server/src/services/llm/tools/hierarchy_tools.ts - move_note, clone_note
  • apps/server/src/services/llm/stream.ts - requiresApproval flag in tool-call chunks
  • apps/server/src/routes/api/llm_chat.ts - execute-tool endpoint, mutatingToolNames
  • apps/server/src/routes/routes.ts - Route registration
  • apps/server/src/services/llm/index.ts - Ollama registration
  • apps/server/src/services/llm/chat_title.ts - Ollama model handling
  • apps/server/src/services/options_init.ts - New option defaults
  • apps/server/src/routes/api/options.ts - Options whitelist

Client

  • apps/client/src/widgets/type_widgets/llm_chat/useLlmChat.ts - KB, abort, approval callbacks
  • apps/client/src/widgets/type_widgets/llm_chat/ChatInputBar.tsx - KB toggle, stop button
  • apps/client/src/widgets/type_widgets/llm_chat/ChatInputBar.css - Stop button styling
  • apps/client/src/widgets/type_widgets/llm_chat/ToolCallCard.tsx - Approval UI
  • apps/client/src/widgets/type_widgets/llm_chat/ToolCallCard.css - Approval styling
  • apps/client/src/widgets/type_widgets/llm_chat/ChatMessage.tsx - Approval props threading
  • apps/client/src/widgets/type_widgets/llm_chat/LlmChat.tsx - Approval callbacks
  • apps/client/src/widgets/type_widgets/llm_chat/ExpandableCard.tsx - defaultExpanded prop
  • apps/client/src/widgets/type_widgets/llm_chat/llm_chat_types.ts - ToolCall type extensions
  • apps/client/src/widgets/sidebar/SidebarChat.tsx - Approval callback forwarding
  • apps/client/src/services/llm_chat.ts - AbortSignal, executeToolCall()
  • apps/client/src/widgets/type_widgets/options/llm.tsx - WebSearchSettings, timeout field
  • apps/client/src/widgets/type_widgets/options/llm/AddProviderModal.tsx - Ollama option

Shared

  • packages/commons/src/lib/llm_api.ts - requiresApproval on tool_use chunk
  • packages/commons/src/lib/options_interface.ts - New option types

Documentation & i18n

  • apps/server/src/assets/doc_notes/en/User Guide/User Guide/AI.html - Rewritten
  • apps/server/src/assets/doc_notes/en/User Guide/User Guide/AI/Knowledge Base.html - New
  • apps/server/src/assets/doc_notes/en/User Guide/!!!meta.json - Updated structure
  • apps/client/src/translations/en/translation.json - New translation keys

Testing

  • pnpm typecheck - passes
  • pnpm client:build - builds successfully

Kureii added 6 commits April 6, 2026 20:47

Add Ollama as a new LLM provider using the OpenAI-compatible API.
- Create ollama.ts provider with custom baseURL pointing to local Ollama
- Update AddProviderModal to include Ollama in the provider type dropdown
- Register Ollama provider in the LLM service index
- Handle chat title generation for Ollama models

Add new LLM tools for note management:
- rename_note: Rename an existing note
- delete_note: Delete a note (move to trash)
- move_note: Move a note to a different parent
- clone_note: Clone a note to another location
All tools are marked with mutates: true for the approval system.

Implement a human-in-the-loop approval mechanism for LLM tools that
modify notes. Tools marked with mutates: true require explicit user
approval before execution.

Server-side:
- tool_registry: omit execute fn for mutating tools (AI SDK won't auto-run)
- stream: flag tool calls with requiresApproval in SSE chunks
- llm_chat route: new POST /api/llm-chat/execute-tool endpoint
- Wrap tool execution in sql.transactional() for safety

Client-side:
- ToolCallCard: approve/reject buttons with visual indicators
- ChatMessage/LlmChat/SidebarChat: thread approval callbacks
- ExpandableCard: auto-expand for pending approvals

Protocol:
- LlmStreamChunk tool_use gains requiresApproval field
- New ToolCall type fields: requiresApproval, rejected

Knowledge Base:
- Allow users to select notes as context sources for LLM chat
- Harvard-style numbered citations [1], [2] with References section
- KB toggle in ChatInputBar with note selector panel

Stop Generation:
- AbortController-based stream cancellation
- Send button toggles to stop button during streaming
- Partial content preserved when generation is stopped

Stream abort:
- llm_chat service accepts AbortSignal for fetch cancellation
- useLlmChat manages AbortController lifecycle
- Graceful handling of AbortError to finalize partial messages

Web Search Engines:
- Support Tavily API and self-hosted SearXNG as alternatives to
  provider-built-in web search
- New settings UI with engine selector and conditional config fields
- Tavily: API key auth, returns AI-generated answer + search results
- SearXNG: self-hosted metasearch, no API key required

Search Timeout:
- Configurable timeout (default 15s) for web search requests
- AbortSignal.timeout() applied to all custom search tool fetch calls
- Timeout setting exposed in LLM settings UI (1-120 seconds)

New options: llmWebSearchEngine, llmTavilyApiKey, llmSearxngUrl,
llmSearchTimeout

- Rewrite AI.html with comprehensive feature documentation
- Add Knowledge Base documentation page (AI/Knowledge Base.html)
- Update !!!meta.json for new documentation structure
- Add translation keys for: tool approval (approve, reject,
  pending_approval, rejected_by_user), stop generation, web search
  settings (engine, timeout, Tavily, SearXNG), and LLM tool names
@dosubot (bot) added the size:XXL label (This PR changes 1000+ lines, ignoring generated files) on Apr 6, 2026
@gemini-code-assist (bot) left a comment
Code Review

This pull request significantly expands the AI Chat capabilities by introducing a Knowledge Base mode for source-specific querying, support for local Ollama models, and alternative web search engines like Tavily and SearXNG. It also implements a human-in-the-loop approval workflow for mutating tool calls (e.g., renaming or deleting notes) and adds a stop button to cancel active streams. Feedback includes a performance concern regarding markdown conversion for large note previews, a missing timeout in the Ollama fetch call, and a lack of title validation in the note renaming tool.

Comment on lines +81 to +88:

```ts
const full = note.isContentAvailable() ? (() => {
    const content = note.getContent();
    if (typeof content !== "string") return preview;
    if (note.type === "text") {
        return markdownExport.toMarkdown(content);
    }
    return content;
})() : preview;
```
Severity: medium

Calling markdownExport.toMarkdown(content) on the full content of a note just to extract a 1500-character preview can be very inefficient for large notes. Since this is executed for up to 20 source notes in the knowledge base, it could significantly increase latency and memory usage. Consider truncating the HTML content before conversion or using a simpler tag-stripping method for the preview.
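One way to implement that suggestion (a sketch, not the PR's actual code) is to strip tags with a regex and truncate, instead of running a full HTML-to-Markdown conversion just to build a preview. Regex tag stripping is a lossy heuristic: fine for previews, not for faithful conversion.

```typescript
// Illustrative sketch: cheap tag-stripping preview instead of
// converting the entire note through toMarkdown().

function htmlPreview(html: string, maxChars = 1500): string {
    const text = html
        .replace(/<[^>]*>/g, " ")   // drop tags
        .replace(/\s+/g, " ")       // collapse whitespace
        .trim();
    return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```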

```ts
*/
async loadModels(): Promise<ModelInfo[]> {
    try {
        const res = await fetch(`${this.baseUrl}/api/tags`);
```
Severity: medium

The fetch call to the local Ollama instance lacks a timeout. If the service is unresponsive, this could hang the server-side request indefinitely. It's recommended to add a timeout using AbortSignal.timeout().

Suggested change:

```diff
-const res = await fetch(`${this.baseUrl}/api/tags`);
+const res = await fetch(`${this.baseUrl}/api/tags`, { signal: AbortSignal.timeout(5000) });
```

Comment on lines +247 to +257:

```ts
execute: ({ noteId, newTitle }) => {
    const note = becca.getNote(noteId);
    if (!note) {
        return { error: "Note not found" };
    }
    if (note.isProtected) {
        return { error: "Note is protected and cannot be renamed" };
    }

    note.title = newTitle;
    note.save();
```
Severity: medium

The rename_note tool does not validate the newTitle. Providing an empty or whitespace-only string could lead to notes with no visible title, which is generally discouraged. It's better to add a check for empty titles.

Suggested change:

```ts
execute: ({ noteId, newTitle }) => {
    const note = becca.getNote(noteId);
    if (!note) {
        return { error: "Note not found" };
    }
    if (note.isProtected) {
        return { error: "Note is protected and cannot be renamed" };
    }

    const trimmedTitle = newTitle.trim();
    if (!trimmedTitle) {
        return { error: "Title cannot be empty" };
    }

    note.title = trimmedTitle;
    note.save();
```

@eliandoran (Contributor) commented:

Hi, interesting contribution.

Did you test it thoroughly? Which models were used to test Ollama?

Kureii added 4 commits April 7, 2026 01:33

- base_provider: replace full markdown conversion with simple HTML tag
  stripping for KB previews — avoids expensive toMarkdown() call on
  large notes (up to 20 sources × full content)
- ollama: add AbortSignal.timeout(5000) to loadModels() fetch call to
  prevent indefinite hangs when the Ollama service is unresponsive
- note_tools: validate newTitle in rename_note — trim whitespace and
  reject empty titles

The note autocomplete dropdown in the Knowledge Base panel was
opening downward, causing suggestions to overflow off-screen.
Use a container ref and CSS to position the dropdown above the
input field instead.

Add closeOnBlur option to note autocomplete Options interface.
When set, skips debug:true so the dropdown closes when the input
loses focus. Used in the KB note picker.
@Kureii (Contributor, Author) commented Apr 7, 2026

Hi, thanks for the review!

Testing summary:
I tested Ollama with gemma4:27b, gemma3:27b, qwen3.5:27b, and mistral-small3.2:24b. To be transparent - local model support is functional but results are very inconsistent across models:

  • gemma4:27b - Best results overall. Supports tool calling, follows instructions well, successfully searches notes and retrieves content. However, it still doesn't iterate autonomously - it stops after receiving a tool result and waits for user input before taking the next step.
  • gemma3:27b - Doesn't work at all (neither chat nor tools).
  • qwen3.5:27b - Tool calling works technically, but poor instruction following.
  • mistral-small3.2:24b - Similar limitations to gemma4, stops after first tool call.

The core issue: None of the tested local models can autonomously chain multiple tool calls. Even with maxSteps set to 15 in the AI SDK, the model itself needs to emit the next tool call - and local models consistently stop after the first result, waiting for user prompting to continue. Commercial models (Claude, GPT-4o) handle this naturally.

That said, when models do work, the results are solid - search, content retrieval, and tool execution all function correctly. This is intentionally a foundational implementation: the architecture (tool registry, approval system, streaming, knowledge base, web search) is designed to be extensible. Prompt tuning, custom system prompts, and model-specific behavior can be iterated on incrementally without architectural changes.

On the security side - the tool approval system works reliably regardless of model quality. I wasn't able to get any of the tested models to bypass the approval mechanism for mutating operations, so the safety layer holds up even with less capable models.

I've also addressed all three code review findings in commit 08c3f03:

  • KB preview performance: Replaced full toMarkdown() conversion with simple HTML tag stripping
  • Ollama fetch timeout: Added AbortSignal.timeout(5000) to loadModels()
  • rename_note validation: Now trims whitespace and rejects empty titles

@eliandoran (Contributor) commented:

@Kureii , could you please try to separate the features into multiple PRs? There are quite a few overlapping concerns at the same time:

  • Ollama provider
  • Multiple search providers
  • Tool approval
  • Knowledge base mode

And anything you can think of splitting.

This makes it easy for me to review and test, and perhaps not all features can go into Trilium without further changes.

@eliandoran eliandoran marked this pull request as draft April 7, 2026 10:21
@eliandoran (Contributor) commented:

Closing, as the PR has been successfully split into #9338 through #9343.

@eliandoran eliandoran closed this Apr 10, 2026

Labels

merge-conflicts, size:XXL (This PR changes 1000+ lines, ignoring generated files)
