fix yara scan logic. Closes #1711 #3570

Open
Gagan144-blip wants to merge 1 commit into intelowlproject:develop from Gagan144-blip:yara-fix-clean

@Gagan144-blip

Description

This PR adds support for downloading YARA detection rules from the
Unprotect API: "https://unprotect.it/api/detection_rules/".

The implementation extends the existing YARA repository update logic to
support API-based rule ingestion in addition to Git repositories.

When the repository URL matches the Unprotect API endpoint, IntelOwl will:

  • Fetch detection rules from the API
  • Validate rule syntax before saving
  • Store valid rules locally as .yar files
  • Skip invalid or malformed rules safely
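
The validate, store, and skip steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `save_valid_rules`, the `compile_rule` callback, `fake_compile`, and the file-naming scheme are all hypothetical, and the input is assumed to be an already-parsed list of items with the `id`, `name`, and `yara_rule` fields used by the test fixture in this PR. With yara-python installed, `compile_rule` could be `lambda src: yara.compile(source=src)`.

```python
import re
from pathlib import Path
from typing import Callable, Iterable


def save_valid_rules(
    rules: Iterable[dict],
    dest_dir: Path,
    compile_rule: Callable[[str], object],
) -> list[Path]:
    """Write each rule that compiles to a .yar file in dest_dir; skip the rest."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for rule in rules:
        source = rule.get("yara_rule") or ""
        if not source.strip():
            continue  # nothing to validate
        try:
            compile_rule(source)  # expected to raise on a syntax error
        except Exception:
            continue  # skip invalid or malformed rules
        # Hypothetical naming scheme: sanitized rule name prefixed by its id.
        safe_name = re.sub(r"[^A-Za-z0-9_]+", "_", str(rule.get("name", "rule")))
        path = dest_dir / f"{rule.get('id', 'x')}_{safe_name}.yar"
        path.write_text(source, encoding="utf-8")
        saved.append(path)
    return saved


def fake_compile(source: str) -> None:
    """Stand-in validator for the demo; real code would call a YARA compiler."""
    if not source.lstrip().startswith("rule "):
        raise ValueError("not a YARA rule")
```

Validating before writing means an invalid rule never touches the disk, so a later compile pass over the directory only sees rules that already compiled once.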

This improves rule coverage by allowing IntelOwl to automatically ingest
community-maintained detection rules from Unprotect.

Closes #1711

Type of change

  • New feature (non-breaking change which adds functionality).
  • Bug Fix

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • I strictly followed the documentation "How to create a Plugin"
    • Usage file was updated. A link to the PR to the docs repo has been added as a comment here.
    • Advanced-Usage was updated (in case the plugin provides additional optional configuration). A link to the PR to the docs repo has been added as a comment here.
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • If you created a new analyzer and it is free (does not require any API key), please add it in the FREE_TO_USE_ANALYZERS playbook by following this guide.
    • Check if it could make sense to add that analyzer/connector to other freely available playbooks.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
    • If the plugin interacts with an external service, I have created an attribute called precisely url that contains this information. This is required for Health Checks (HEAD HTTP requests).
    • If a new analyzer has been added, I have created a unittest for it in the appropriate dir. I have also mocked all the external calls, so that no real calls are being made while testing.
    • I have added that raw JSON sample to the get_mocker_response() method of the unittest class. This serves us to provide a valid sample for testing.
    • I have created the corresponding DataModel for the new analyzer following the documentation
  • I have inserted the copyright banner at the start of the file: # This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.
  • Please avoid adding new libraries as requirements whenever it is possible. Use new libraries only if strictly needed to solve the issue you are working for. In case of doubt, ask a maintainer permission to use a specific library.
  • If external libraries/packages with restrictive licenses were added, they were added in the Legal Notice section.
  • Linters (Ruff) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.
  • After you had submitted the PR, if DeepSource, Django Doctors or other third-party linters have triggered any alerts during the CI checks, I have solved those alerts.

Screenshots

1. Test execution showing all tests passing
2. '.yar' files from Unprotect API
3. Content of a downloaded YARA rule file

Copilot AI review requested due to automatic review settings March 28, 2026 01:57
Contributor

Copilot AI left a comment

Pull request overview

Adds support for ingesting YARA rules directly from the Unprotect.it detection rules API, extending the existing YARA repository update/compile flow beyond Git/zip sources.

Changes:

  • Add Unprotect API detection + ingestion flow that downloads rules, validates syntax, and persists valid .yar files.
  • Improve YARA rule compilation logging/robustness (per-file validation, better error handling, safer lockfile deletion).
  • Add a unit test covering Unprotect API rule download and .yar file creation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
api_app/analyzers_manager/file_analyzers/yara_scan.py | Adds Unprotect API ingestion path and adjusts compile/update behavior for YARA repositories.
tests/api_app/analyzers_manager/unit_tests/file_analyzers/test_yara_scan.py | Adds a unit test for Unprotect API ingestion creating .yar files.

Comment on lines +327 to +333
  # We check the specific repo directory and any first-level subdirectories
  for directory in self.first_level_directories + [self.directory]:
      if directory != self.directory:
-         # recursive
-         rules = directory.rglob("*")
+         rules = directory.rglob("*")  # recursive for subfolders
      else:
-         # not recursive
-         rules = directory.glob("*")
+         rules = directory.glob("*")  # non-recursive for main folder

Copilot AI Mar 28, 2026

first_level_directories/compiled_paths are @cached_property values derived from the filesystem, but update() mutates the repo contents (git pull/clone, zip extract, Unprotect ingestion). If those cached values were computed before the update (e.g., _update_git() calls self.compiled_paths before pulling), compile() can use a stale directory list and miss newly added first-level subfolders (and therefore skip compiling rules in them). Consider removing caching here or explicitly invalidating these cached properties at the start/end of update() (e.g., self.__dict__.pop("first_level_directories", None) / ...pop("compiled_paths", None) / ...pop("head_branch", None) as appropriate).
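
The staleness described in this comment, and the suggested `__dict__.pop` invalidation, can be reproduced in isolation. The sketch below uses a toy `Repo` class that is not IntelOwl's repository class; `Repo`, its `update()` signature, and `demo` are hypothetical names chosen only to show how `functools.cached_property` behaves when the filesystem changes after the first access.

```python
import tempfile
from functools import cached_property
from pathlib import Path


class Repo:
    """Toy stand-in for a rule repository rooted at `directory`."""

    def __init__(self, directory: Path):
        self.directory = directory

    @cached_property
    def first_level_directories(self) -> list[Path]:
        # Computed once, then stored in the instance __dict__.
        return sorted(d for d in self.directory.iterdir() if d.is_dir())

    def update(self, new_subdir: str, invalidate: bool = True) -> None:
        # Simulates a pull/extract that adds a new first-level folder.
        (self.directory / new_subdir).mkdir()
        if invalidate:
            # Drop the cached value so the next access re-reads the filesystem.
            self.__dict__.pop("first_level_directories", None)


def demo(invalidate: bool) -> tuple[int, int]:
    """Return the directory count seen before and after update()."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Repo(Path(tmp))
        before = len(repo.first_level_directories)  # caches an empty list
        repo.update("new_rules", invalidate=invalidate)
        after = len(repo.first_level_directories)
        return before, after
```

Calling `demo(invalidate=False)` keeps returning the cached empty list after the new folder exists, while `demo(invalidate=True)` sees it, which is the difference the comment is pointing at.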

# Normalize path to handle optional leading/trailing slash
path = parsed.path.strip("/")
return netloc == "unprotect.it" and path.startswith("api/detection_rules")

Copilot AI Mar 28, 2026

is_unprotect_api() currently matches any path starting with api/detection_rules, which can yield false positives (e.g., api/detection_rules_old). Since this branch changes update behavior significantly, it would be safer to match the endpoint exactly (after normalizing slashes) or check path segment boundaries (e.g., equality to api/detection_rules or prefix api/detection_rules/).

Suggested change
- return netloc == "unprotect.it" and path.startswith("api/detection_rules")
+ return netloc == "unprotect.it" and (
+     path == "api/detection_rules" or path.startswith("api/detection_rules/")
+ )
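
As a quick check of the segment-boundary version, here is a standalone sketch built around the suggested return statement. The way `parsed` and `netloc` are derived (via `urllib.parse.urlparse` and a lowercased `netloc`) is an assumption, since only the last lines of the real method are visible in this review.

```python
from urllib.parse import urlparse


def is_unprotect_api(url: str) -> bool:
    """Return True only for the Unprotect detection_rules endpoint or paths under it."""
    parsed = urlparse(url)
    # Assumption: the host is compared case-insensitively.
    netloc = parsed.netloc.lower()
    # Normalize path to handle optional leading/trailing slash
    path = parsed.path.strip("/")
    return netloc == "unprotect.it" and (
        path == "api/detection_rules" or path.startswith("api/detection_rules/")
    )
```

With this form, `https://unprotect.it/api/detection_rules_old/` no longer matches, while the endpoint with or without a trailing slash still does.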

Comment on lines +62 to +66
"id": 1,
"name": "Test Rule",
"yara_rule": "rule test_rule { condition: true }",
},
{

Copilot AI Mar 28, 2026

The new validation behavior that discards syntactically invalid YARA rules (compile failure -> unlink) isn’t exercised here. Add a test case with an invalid yara_rule string and assert that no .yar file remains for it after update().
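
A test along these lines might look like the sketch below. It does not exercise the real `update()` method: `write_then_validate` and `fake_compile_file` are hypothetical stand-ins for the "compile failure -> unlink" path, and in the actual test the compiler would presumably be yara-python (for example `yara.compile(filepath=...)`) or a mock of it.

```python
import tempfile
import unittest
from pathlib import Path


def write_then_validate(source: str, path: Path, compile_file) -> bool:
    """Hypothetical helper: persist a rule, then unlink it if compilation fails."""
    path.write_text(source, encoding="utf-8")
    try:
        compile_file(str(path))
    except Exception:
        path.unlink(missing_ok=True)
        return False
    return True


def fake_compile_file(filepath: str) -> None:
    """Stand-in for a real compiler call; rejects anything not starting with 'rule '."""
    text = Path(filepath).read_text(encoding="utf-8")
    if not text.lstrip().startswith("rule "):
        raise ValueError("syntax error")


class InvalidRuleIsDiscarded(unittest.TestCase):
    def test_invalid_rule_leaves_no_file(self):
        with tempfile.TemporaryDirectory() as tmp:
            good = Path(tmp) / "good.yar"
            bad = Path(tmp) / "bad.yar"
            self.assertTrue(
                write_then_validate("rule ok { condition: true }", good, fake_compile_file)
            )
            self.assertFalse(
                write_then_validate("this is not yara", bad, fake_compile_file)
            )
            # The valid rule stays on disk; the invalid one was unlinked.
            self.assertTrue(good.exists())
            self.assertFalse(bad.exists())
```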

@mlodic
Member

mlodic commented Mar 30, 2026

Hey, thanks for the screenshot. I also need you to run the Yara analyzer and show me the output for a file that matches at least one of those rules. That's the way to be sure that this works properly.

@Gagan144-blip
Author

Hi @mlodic ,

thanks for the suggestion!
I will run the YARA analyzer from the UI and provide screenshots showing a successful rule match using the downloaded rules.

I’ll update the PR shortly with the results.

@Gagan144-blip Gagan144-blip changed the title from "fix yara scan logic" to "fix yara scan logic. Closes #1711" Mar 31, 2026
@Gagan144-blip
Author

Gagan144-blip commented Apr 2, 2026

Hi @mlodic, I’m hitting a persistent Status 500 (ImportError) in my local IntelOwl environment and could use your guidance.

What I’ve done so far:

Path Alignment: I’ve mapped the YARA rules to /opt/deploy/files_required/yara/admin/custom_rules/ within the intel_owl_shared_files volume, following the official documentation.

Dependency Check: I verified that yara-python (4.5.1) is installed in the uwsgi container and removed a redundant yara package to avoid conflicts.

Service Status: The containers are running, but after I upload a file and select the 'yara' analyzer from the UI, the scan is stuck on a loading spinner, and the browser console shows an AxiosError linked to an ImportError on the backend.

The uwsgi logs show the server starting on port 8001, but the frontend (port 80) isn't fetching the analyzer list.
Could you help me identify whether there's a specific sub-dependency or a database migration step I might have missed?

Screenshot from 2026-04-02 06-55-30

@github-actions

This pull request has been marked as stale because it has had no activity for 10 days. If you are still working on this, please provide some updates or it will be closed in 5 days.

@github-actions github-actions Bot added the stale label Apr 12, 2026
@Gagan144-blip
Author

Hi @mlodic, just checking in on PR #3570. It was marked as stale, but I'm still blocked by that local ImportError. I've added the full log traceback to the PR comments to get guidance on resolving the issue.

@github-actions github-actions Bot removed the stale label Apr 16, 2026
@github-actions

This pull request has been marked as stale because it has had no activity for 10 days. If you are still working on this, please provide some updates or it will be closed in 5 days.

@github-actions github-actions Bot added the stale label Apr 27, 2026

3 participants