feat: implement OOD query filtering in bot logic and enforce scoped i…#11

Open
kpj2006 wants to merge 4 commits into AOSSIE-Org:main from kpj2006:main

Conversation

@kpj2006 kpj2006 commented Jun 19, 2026

Member

…nteraction guidelines via .clinerules

Addressed Issues:

Fixes #(issue number)

Screenshots/Recordings:

Additional Notes:

Checklist

  • My code follows the project's code style and conventions
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings or errors
  • I have joined the Discord server and I will share a link to this PR with the project maintainers there
  • I have read the Contributing Guidelines

⚠️ AI Notice - Important!

We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact.

Summary by CodeRabbit

  • New Features
    • Added topic guardrails: the bot now checks whether a query matches supported help areas (setup, README, contributing, debugging) and asks a concise follow-up when it doesn’t.
  • Bug Fixes
    • Improved threaded conversation handling so channel detection is accurate when messages occur within threads.
    • Enhanced robustness of help responses with clearer handling for missing models and other HTTP errors.
  • Other
    • Improved internal gap logging and thread reuse behavior to better support existing threads.

@github-actions github-actions Bot added the labels no-issue-linked (PR is not linked to any issue), backend (Changes to backend code), python (Python code changes), and size/M (Medium PR: 51-200 lines changed) on Jun 19, 2026
coderabbitai Bot commented Jun 19, 2026

Contributor

Review Change Stack

Walkthrough

Adds an out-of-domain query guard to bot.py via keyword matching against supported categories (setup, README, contributing, error debugging). When a query lacks coverage, the bot overrides the prompt with a clarification request and logs a gap entry. Improves thread reuse and parent-channel detection for threaded messages. Strengthens Ollama HTTP error handling and ensures gap logging calls are properly awaited.
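
The flow described in this walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from bot.py: the names `SUPPORTED_AREAS`, `CLARIFICATION_PROMPT`, `build_prompt`, and `log_gap`, the keyword lists, and the prompt wording are all assumptions made for the example. It uses plain substring matching for brevity, which the review comments on this PR flag as prone to false positives.

```python
import asyncio

# Hypothetical keyword map; the real categories in bot.py may differ.
SUPPORTED_AREAS = {
    "setup": ["install", "docker", "clone"],
    "readme": ["readme", "documentation"],
    "contributing": ["contribute", "pull request", "fork"],
    "debugging": ["error", "exception", "crash"],
}

# Hypothetical wording for the clarification instruction.
CLARIFICATION_PROMPT = (
    "The question is outside the supported help areas. Ask one concise "
    "follow-up question and list the areas: " + ", ".join(SUPPORTED_AREAS)
)

gap_log = []  # stand-in for the bot's persistent gap log


async def log_gap(kind, thread_id, query):
    """Record a gap entry; it is a coroutine, so callers must await it."""
    gap_log.append({"kind": kind, "thread_id": thread_id, "query": query})


def is_query_covered(query):
    """Return True if any supported-area keyword occurs in the query."""
    q = query.lower()
    return any(kw in q for kws in SUPPORTED_AREAS.values() for kw in kws)


async def build_prompt(query, thread_id):
    """Return the prompt for the model, overriding it when the query is out of domain."""
    if is_query_covered(query):
        return f"Answer using the project docs: {query}"
    # Awaiting here ensures the gap entry is written before returning.
    await log_gap("insufficient_info", thread_id, query)
    return CLARIFICATION_PROMPT
```

Calling `asyncio.run(build_prompt("What is the weather today?", 42))` in this sketch returns the clarification prompt and appends one `insufficient_info` entry tied to thread 42.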

Changes

Out-of-Domain Query Guard and Thread Handling

| Layer / File(s) | Summary |
| --- | --- |
| Out-of-domain query guard and clarification prompt (bot.py, .clinerules) | Adds import re and a .clinerules guidance block. In process_message, a coverage gate checks query keywords against supported categories; when no match is found, full_prompt is overridden with a clarification instruction listing the four help areas, and an insufficient_info gap entry is logged tied to the thread ID. |
| Thread reuse and parent-channel detection (bot.py) | _get_or_create_thread reuses existing threads when message.flags.has_thread is set and gracefully handles Discord "thread exists" exceptions. process_message channel detection now compares a threaded message's parent channel ID against the configured channel instead of the thread ID directly. |
| Ollama HTTP error handling and async logging fixes (bot.py) | Catches httpx.HTTPStatusError to return user-facing error messages for HTTP 404 (model not found) and other 4xx errors. Ensures all gap logging calls in exception and fallback paths are properly awaited to prevent async task leaks. |
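
To illustrate the parent-channel detection described for threaded messages, here is a small sketch that uses stub classes in place of real discord.py objects. `FakeChannel`, `FakeThread`, and `CONFIGURED_CHANNEL_ID` are hypothetical names for this example only. The stub mirrors the discord.py behavior where a thread's `parent_id` is stored on the thread while `parent` is resolved from the cache and may be `None`.

```python
from dataclasses import dataclass
from typing import Optional

CONFIGURED_CHANNEL_ID = 1111  # hypothetical ID of the configured help channel


@dataclass
class FakeChannel:
    """Stand-in for a regular text channel."""
    id: int


@dataclass
class FakeThread:
    """Stand-in for a thread: parent_id is always set, parent may be uncached."""
    id: int
    parent_id: int
    parent: Optional[FakeChannel] = None


def is_in_configured_channel(channel) -> bool:
    """Compare the parent channel ID for threads and the channel ID otherwise."""
    is_in_thread = isinstance(channel, FakeThread)
    # Using parent_id avoids dereferencing channel.parent, which can be None.
    channel_id = channel.parent_id if is_in_thread else channel.id
    return channel_id == CONFIGURED_CHANNEL_ID
```

In this sketch a thread whose own ID differs from the configured channel still counts as in-channel as long as its `parent_id` matches, even when `parent` is `None`.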

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

Python Lang

Poem

🐇 A bouncy bot that guards the gate,
Checks keywords before it's too late!
Thread by thread, it hops with care,
Ollama errors? Handled fair!
"Let's talk setup, docs, or bugs," it chirps—
Smart bot, zero wasted flips! 🌿

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title references OOD query filtering and scoped interaction guidelines, which directly align with the main changes in the changeset (keyword-coverage gate in bot.py and clarification rules in .clinerules). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@github-actions github-actions Bot added the labels repeat-contributor (PR from an external contributor who already had PRs merged) and needs-review on Jun 19, 2026
github-actions Bot commented Jun 19, 2026

Contributor
Messages
📖

⚠️ PR Template Check

These are non-blocking, but please fix:

  • Please replace the placeholder Fixes #(issue number) with the actual issue number (e.g. Fixes #42).

  • Some required checklist items are not completed:

  • My PR addresses a single issue

  • My code follows the project's code style

  • My changes generate no new warnings or errors

Generated by 🚫 dangerJS against 8d51c92

@kpj2006 kpj2006 added the gsoc label Jun 19, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bot.py`:
- Around line 166-182: The is_query_covered function currently checks if
keywords exist in either the current query or the entire conversation context
history, which allows past in-domain messages to incorrectly mark a new
out-of-domain query as covered. Remove the context checking from the keyword
matching logic in the is_query_covered function by changing the condition that
currently checks both query and context to only check the current query. This
ensures coverage determination is based solely on the current user query and not
the entire thread history.
- Around line 166-182: The keyword matching in the is_query_covered function
uses simple substring checks (`kw in q`, `kw in ctx`), which cause false positives
with short keywords like "pr" matching unrelated words like "price". Replace the
substring matching logic with word boundary checking to ensure only complete
words are matched, not partial substrings. Consider using regular expressions
with word boundaries (\b) to check if each keyword in the categories dictionary
appears as a whole word in the query or context strings rather than allowing
arbitrary substring matches.
- Around line 192-196: In the is_in_configured_channel assignment, replace the
attribute access `message.channel.parent.id` with `message.channel.parent_id`
when is_in_thread is true. The parent attribute can be None when the parent
channel is unavailable in the client cache (in discord.py 2.3.2), which causes
an AttributeError when dereferencing .id. The parent_id attribute is always
stable and available, so use that instead for the channel ID comparison.
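
The first two findings above (match against the current query only, and match whole words rather than substrings) could be addressed along these lines. This is a sketch under assumptions: `CATEGORIES` is a hypothetical subset of the keyword lists, and `matched_categories` is a helper name invented for the example; only `is_query_covered` is a name taken from the review.

```python
import re

# Hypothetical subset of the category keywords; the real lists live in bot.py.
CATEGORIES = {
    "contribute": ["pr", "pull request", "fork"],
    "error": ["error", "crash", "not working"],
}


def _compile(keywords):
    """Build one case-insensitive pattern matching any keyword as a whole word."""
    alternatives = "|".join(re.escape(kw) for kw in keywords)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


PATTERNS = {name: _compile(kws) for name, kws in CATEGORIES.items()}


def matched_categories(query):
    """Return categories whose keywords appear as whole words in the query."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(query)]


def is_query_covered(query):
    """Coverage depends on the current query only, never on thread history."""
    return bool(matched_categories(query))
```

With this approach "price" no longer matches the keyword "pr". One trade-off is that whole-word matching also skips inflected forms such as "errors" unless they are listed explicitly.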

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8e50c4d3-a666-4955-8623-21261ec8842a

📥 Commits

Reviewing files that changed from the base of the PR and between 1cbd1a4 and 5f1554e.

📒 Files selected for processing (2)
  • .clinerules
  • bot.py

Comment thread bot.py Outdated
Comment thread bot.py
… ensuring proper logging and fallback mechanisms

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
bot.py (1)

212-215: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Narrow generic keywords to keep the OOD gate effective.

"run", "start", and especially "issue" are so broad that many unrelated queries will be marked as covered, bypassing the OOD clarification path.

Suggested patch
 categories = {
-    "setup": ["setup", "install", "run", "build", "clone", "docker", "env", "start", "dev server", "npm run dev"],
+    "setup": ["setup", "install", "build", "clone", "docker", "environment", "dev server", "npm run dev"],
     "readme": ["readme", "read me", "documentation", "project name", "description", "user flow", "feature"],
-    "contribute": ["contribute", "contributor", "fork", "pr", "pull request", "issue", "branch", "git", "onboarding"],
-    "error": ["error", "exception", "bug", "fail", "crash", "issue", "logs", "broken", "debug", "not working"]
+    "contribute": ["contribute", "contributor", "fork", "pr", "pull request", "github issue", "branch", "git", "onboarding"],
+    "error": ["error", "exception", "bug", "fail", "crash", "stack trace", "logs", "broken", "debug", "not working"]
 }
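
The effect of the suggested patch can be checked with a small comparison. The keyword lists below are copied from the before and after sides of the patch; the `is_covered` helper and its whole-word matching are assumptions for illustration and may not match how bot.py evaluates coverage.

```python
import re

README = ["readme", "read me", "documentation", "project name", "description", "user flow", "feature"]

# Keyword lists before the suggested patch.
BROAD = {
    "setup": ["setup", "install", "run", "build", "clone", "docker", "env", "start", "dev server", "npm run dev"],
    "readme": README,
    "contribute": ["contribute", "contributor", "fork", "pr", "pull request", "issue", "branch", "git", "onboarding"],
    "error": ["error", "exception", "bug", "fail", "crash", "issue", "logs", "broken", "debug", "not working"],
}

# Keyword lists after the suggested patch.
NARROWED = {
    "setup": ["setup", "install", "build", "clone", "docker", "environment", "dev server", "npm run dev"],
    "readme": README,
    "contribute": ["contribute", "contributor", "fork", "pr", "pull request", "github issue", "branch", "git", "onboarding"],
    "error": ["error", "exception", "bug", "fail", "crash", "stack trace", "logs", "broken", "debug", "not working"],
}


def is_covered(query, categories):
    """Return True if any keyword in any category matches the query as a whole word."""
    for keywords in categories.values():
        pattern = r"\b(?:" + "|".join(map(re.escape, keywords)) + r")\b"
        if re.search(pattern, query, re.IGNORECASE):
            return True
    return False
```

An unrelated question such as "I have an issue with my landlord, where do I start?" is treated as covered by the broad lists but not by the narrowed ones, while "How do I file a GitHub issue?" is still covered after narrowing.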
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bot.py` around lines 212 - 215, The keyword lists in the dictionary are too
generic and cause false positives in out-of-distribution detection. Remove the
overly broad keywords "run", "start", and "issue" from the respective lists in
the setup, contribute, and error categories. Replace "run" and "start" in the
setup list with more specific terms that relate directly to project
initialization or development setup (such as "startup" or "initialize"). Remove
"issue" from both the contribute and error lists as it appears in multiple
contexts and is too ambiguous, or replace it with more specific terminology like
"github issue" or "bug report" that narrows the intent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bot.py`:
- Around line 117-120: The err_msg variable in the Ollama client error handling
is directly including e.response.text which exposes internal backend details to
end users. Instead, log the full response details (e.response.text) to the
internal logger for debugging purposes, but modify the err_msg to return only a
sanitized, user-friendly error message that does not include the raw response
body details. This way, users see a safe generic error message while developers
can still access the full details in the logs.

---

Outside diff comments:
In `@bot.py`:
- Around line 212-215: The keyword lists in the dictionary are too generic and
cause false positives in out-of-distribution detection. Remove the overly broad
keywords "run", "start", and "issue" from the respective lists in the setup,
contribute, and error categories. Replace "run" and "start" in the setup list
with more specific terms that relate directly to project initialization or
development setup (such as "startup" or "initialize"). Remove "issue" from both
the contribute and error lists as it appears in multiple contexts and is too
ambiguous, or replace it with more specific terminology like "github issue" or
"bug report" that narrows the intent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3c16fd5c-ab79-44ee-9f54-7d7bea094513

📥 Commits

Reviewing files that changed from the base of the PR and between 5f1554e and 8d51c92.

📒 Files selected for processing (1)
  • bot.py

Comment thread bot.py
Comment on lines +117 to +120
err_msg = (
f"Local Ollama configuration or client error (HTTP {e.response.status_code}).\n"
f"Details: {e.response.text}"
)

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not return raw Ollama response bodies to end users.

e.response.text can expose backend/internal details in public bot replies. Keep full details in logs; return a sanitized user-facing message.

Suggested patch
             elif 400 <= e.response.status_code < 500:
-                err_msg = (
-                    f"Local Ollama configuration or client error (HTTP {e.response.status_code}).\n"
-                    f"Details: {e.response.text}"
-                )
+                logger.error(
+                    "Local Ollama client/config error HTTP %s: %s",
+                    e.response.status_code,
+                    e.response.text[:500],
+                )
+                err_msg = (
+                    f"Local Ollama configuration or client error (HTTP {e.response.status_code}). "
+                    "Please contact a maintainer."
+                )
                 return err_msg, True
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bot.py` around lines 117 - 120, The err_msg variable in the Ollama client
error handling is directly including e.response.text which exposes internal
backend details to end users. Instead, log the full response details
(e.response.text) to the internal logger for debugging purposes, but modify the
err_msg to return only a sanitized, user-friendly error message that does not
include the raw response body details. This way, users see a safe generic error
message while developers can still access the full details in the logs.
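
The log-internally, sanitize-externally pattern requested here can be sketched without a running Ollama instance. `FakeResponse` and `FakeHTTPStatusError` are hypothetical stand-ins for `httpx.Response` and `httpx.HTTPStatusError` (which exposes a `.response` attribute), and the function name, logger name, and 404 wording are assumptions for this example.

```python
import logging

logger = logging.getLogger("bot.ollama")  # hypothetical logger name


class FakeResponse:
    """Stand-in for httpx.Response with only the fields used here."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text


class FakeHTTPStatusError(Exception):
    """Stand-in for httpx.HTTPStatusError, which carries a .response attribute."""

    def __init__(self, response):
        super().__init__(f"HTTP {response.status_code}")
        self.response = response


def user_message_for(e):
    """Log the raw body internally and return a sanitized message, or None."""
    status = e.response.status_code
    if status == 404:
        logger.error("Ollama returned 404: %s", e.response.text[:500])
        return "The configured model was not found. Please contact a maintainer."
    if 400 <= status < 500:
        # The (truncated) raw body goes to the logs only, never to the user.
        logger.error("Ollama client/config error HTTP %s: %s", status, e.response.text[:500])
        return (
            f"Local Ollama configuration or client error (HTTP {status}). "
            "Please contact a maintainer."
        )
    return None  # other statuses are left to the caller's existing handling
```

The returned string includes the status code but none of the response body, so backend paths or configuration details in the body stay out of public bot replies.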


Labels

backend (Changes to backend code); gsoc; needs-review; no-issue-linked (PR is not linked to any issue); python (Python code changes); repeat-contributor (PR from an external contributor who already had PRs merged); size/M (Medium PR: 51-200 lines changed)

1 participant