feat: add context_chat:reindex to re-seed the crawl on demand #246

Open
bygadd wants to merge 2 commits into nextcloud:main from bygadd:feat/context-chat-recrawl-command
bygadd commented Jun 17, 2026

Follow-up to #244. The SchedulerJob → StorageCrawlJob → IndexerJob chain is seeded only by the <install> repair step and removes itself after the initial crawl, so there's no way to re-enumerate mounts without reinstalling the app — e.g. when the app is installed on an instance whose files predate it, or to recover a crawl that didn't finish.

This adds an occ context_chat:reindex command that re-adds SchedulerJob, guarded so it's a no-op when one is already scheduled. Already-indexed files are skipped (the queue de-duplicates), so it's safe to run repeatedly.

(Disclosure: AI-assisted; verified on a live deployment.)
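
The guard described above can be illustrated with a minimal, self-contained sketch. `InMemoryJobList` and `scheduleReindex()` are hypothetical stand-ins written for this example; they are not the code in this PR and not Nextcloud's job list API. The sketch only shows why re-adding the job is a no-op while one is queued, and why it takes effect again once the job has removed itself.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a background job list (assumption:
// the real list can answer "is this job queued?" and add or remove a job).
final class InMemoryJobList {
    /** @var array<string, true> */
    private array $jobs = [];

    public function has(string $jobClass): bool {
        return isset($this->jobs[$jobClass]);
    }

    public function add(string $jobClass): void {
        $this->jobs[$jobClass] = true;
    }

    public function remove(string $jobClass): void {
        unset($this->jobs[$jobClass]);
    }
}

// Hypothetical helper: queue the scheduler job unless it is already queued.
// Returns true when a job was added, false when the call was a no-op.
function scheduleReindex(InMemoryJobList $jobList, string $jobClass = 'SchedulerJob'): bool {
    if ($jobList->has($jobClass)) {
        return false; // already scheduled, nothing to do
    }
    $jobList->add($jobClass);
    return true;
}

$jobList = new InMemoryJobList();

$first = scheduleReindex($jobList);   // nothing queued yet, so the job is added
$second = scheduleReindex($jobList);  // still queued, so this call is a no-op

// Simulate the job removing itself once the initial crawl has finished.
$jobList->remove('SchedulerJob');

$third = scheduleReindex($jobList);   // the chain can be seeded again
```

Under these assumptions, running the command twice in a row leaves exactly one scheduler job queued, which is the property that makes repeated `occ context_chat:reindex` runs safe.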

The SchedulerJob -> StorageCrawlJob -> IndexerJob chain is seeded only by the
<install> repair step and self-removes after the initial crawl, so there is no
way to re-enumerate mounts without reinstalling the app (e.g. after installing
on an instance whose files predate the app, or to recover an incomplete crawl).
Add an occ command that re-adds SchedulerJob (no-op if already scheduled);
already-indexed files are skipped by the queue de-duplication.

Refs nextcloud#244

Signed-off-by: Yoan Bozhilov <bygadd@gmail.com>
Assisted-by: Claude Code:claude-opus-4-8

kyteinsky left a comment

Contributor

thanks for the PR!

Comment thread lib/Command/Reindex.php Outdated
* mounts (e.g. after installing on an instance whose files predate the app, or to recover a crawl
* that did not complete) short of reinstalling the app. This command re-adds SchedulerJob so the
* full enumeration runs again; it is a no-op if one is already scheduled, and already-indexed
* files are skipped (the queue de-duplicates).

Suggested change
- * files are skipped (the queue de-duplicates).
+ * files are skipped.

Comment thread lib/Command/Reindex.php Outdated
use Symfony\Component\Console\Output\OutputInterface;

/**
* Re-seed the one-shot crawl chain on demand.

Suggested change
- * Re-seed the one-shot crawl chain on demand.
+ * Re-seed the one-shot filesystem crawl/index on demand.

Comment thread lib/Command/Reindex.php Outdated

protected function configure() {
	$this->setName('context_chat:reindex')
		->setDescription('Schedule a full re-crawl of all mounts (re-seeds the indexing chain; indexed files are skipped)');

Suggested change
- ->setDescription('Schedule a full re-crawl of all mounts (re-seeds the indexing chain; indexed files are skipped)');
+ ->setDescription('Schedule a full re-crawl of all mounts. Indexed files are not re-indexed when compared against context_chat_backend\'s vector DB.');

Reword the class docstring and the command description per review feedback.

Signed-off-by: Yoan Bozhilov <bygadd@gmail.com>
Assisted-by: Claude Code:claude-opus-4-8
bygadd commented Jun 29, 2026

Author

applied the three suggestions, thanks for the review!
