feat: make whitelist configgurable by dewabisma · Pull Request #46 · Quantus-Network/task-master

dewabisma · 2025-12-22T11:20:27Z

Summary

Make whitelist targets for tweet aggregation configurable from admin panel.

Changes

add flag ignore to tweet author
remove whitelist from config
add endpoints for managing whitelist

- add flag ignore to tweet author - remove whitelist from config - add endpoints for managing whitelist

n13

Code Review: Feat - Configurable Whitelist

The PR introduces database-driven whitelist management for tweet authors, replacing the static configuration file. This is a good move for flexibility.

Here are my findings and suggestions:

1. `get_whitelist` Logic Issue

In src/repositories/tweet_author.rs, the get_whitelist method retrieves authors who are not ignored (is_ignored = false).

pub async fn get_whitelist(&self) -> Result<Vec<String>, DbError> {
    let authors = sqlx::query_as::<_, TweetAuthor>("SELECT * FROM tweet_authors WHERE is_ignored = false")
        .fetch_all(&self.pool)
        .await?;
    // ...
}

However, TweetAuthor represents all authors encountered by the system, including those from random tweets or raids. By filtering only on is_ignored = false, you might be inadvertently whitelisting every author the system knows about (since is_ignored likely defaults to false).

Recommendation:
You need a specific flag for "whitelisted" status, or explicit logic.

If is_ignored is the only flag: Does false mean "Active/Whitelisted" and true mean "Ignored/Blacklisted"?
If so, when a random user is inserted (e.g., via a raid submission), do they default to false? If yes, they become part of the whitelist sync loop, which might be unintended.
Consider adding an is_whitelisted boolean column instead, or clarify that only manually added authors should be in this table (which seems unlikely given upsert_many usage).

2. Efficiency in `get_whitelist`

The current implementation fetches the entire TweetAuthor struct for every whitelisted user just to map their IDs.

let authors = sqlx::query_as::<_, TweetAuthor>("SELECT * FROM tweet_authors ...")
// ...
let whitelist: Vec<String> = authors.iter().map(|f| f.id.clone()).collect();

Optimization: Select only the id column directly.

let ids = sqlx::query_scalar::<_, String>("SELECT id FROM tweet_authors WHERE is_ignored = false")
    .fetch_all(&self.pool)
    .await?;
Ok(ids)

3. Handle `upsert` Return Type

In handle_create_tweet_author, you use state.db.tweet_authors.upsert.
The repository method returns DbResult<String> (the ID), but the handler wraps it in SuccessResponse::new(create_response).

Just double-check that returning just the ID string is sufficient for the frontend, or if they need the full author object back (which the repo fetches anyway via RETURNING *).

4. Naming Consistency

Endpoints: /tweet-authors/:id/ignore vs /tweet-authors/:id/watch
Repository: make_ignored_from_whitelist vs make_watched_in_whitelist

The terms "ignore" and "watch" are good, but "whitelist" in the function name make_ignored_from_whitelist might be confusing if the concept is just "active status".
Suggestion: set_ignore_status(id, status: bool) could replace both methods and reduce code duplication.

5. Tests

The tests look good and cover the new endpoints.

Ensure that test_create_tweet_author_success verifies that the new author defaults to the correct is_ignored state (presumably false).

Summary

The main concern is point #1: ensuring that "all authors in DB" != "whitelist". If tweet_authors table accumulates thousands of random users from raids, get_whitelist will return all of them, causing the TweetSynchronizerService to sync thousands of users unnecessarily.

Fix: Add is_whitelisted column (default false) and only set it to true for manually added authors.

By making is_ignored true by default, we can make sure only manually whitelisted author will be tracked changes: - Update db migration - Efficiency in get_whitelist - Efficiency in returning of upsert tweet authors - Refactor change author whitelist status repository method

…hitelist

n13

PR 46 Review: Make Whitelist Configurable

Summary

This PR introduces a configurable whitelist for the TweetSynchronizerService, allowing the application to filter tweets based on a list of specific Twitter usernames defined in the configuration files (default.toml, etc.).

Changes Identified

Configuration:
- Added whitelist field (Vec<String>) to TweetSyncConfig in src/config.rs.
- Updated default.toml, example.toml, and test.toml to include a whitelist array in the [tweet_sync] section.
Service Logic:
- Updated src/services/tweet_synchronizer_service.rs to use self.config.tweet_sync.whitelist via SearchParams::build_batched_whitelist_queries.
Testing:
- src/services/tweet_synchronizer_service.rs tests reflect the usage of the whitelist (implicitly via load_test_env which loads test.toml).

Code Review

Correctness

Config Loading: The configuration is correctly defined and loaded. The whitelist vector maps correctly to the TOML array.
Service Integration: The service correctly accesses the whitelist from the config and passes it to the rusx library's build_batched_whitelist_queries helper.
Empty Whitelist: If the whitelist is empty (default in code), build_batched_whitelist_queries presumably yields no queries, which effectively disables the sync (or syncs nothing), which is safe behavior.

Code Quality

DRY: The changes reuse existing config patterns and do not introduce duplication.
Conciseness: The implementation is minimal and focused.
Style: Code follows the existing project style.

Considerations & Recommendations

1. Global High-Water Mark (`since_id`)

The service uses a global since_id (the newest tweet ID in the database) for the search query:

let last_id = self.db.relevant_tweets.get_newest_tweet_id().await?;
// ...
if let Some(id) = last_id.clone() {
    params.since_id = Some(id.clone());
}

Potential Issue: If you add a new user to the whitelist who has posted tweets older than the current last_id in your database, those older tweets will be missed because the API query will filter them out based on the global since_id.
Recommendation: If historical backfill for new whitelist additions is required, consider tracking since_id per user or per batch, although this increases complexity significantly. For now, just be aware that adding a user will only pick up their future tweets (or tweets newer than the global high-water mark).

2. Validation

There is currently no validation for the whitelist strings (e.g., ensuring they are valid Twitter usernames, starting with @ or not, etc.).
Recommendation: Ensure that the rusx library or the configuration loader handles invalid usernames gracefully.

Conclusion

The changes look solid and correctly implement the requested feature. The "configurable" aspect is handled via the standard configuration files, which fits the project structure.

Status: ✅ Approved (with note on since_id behavior)

feat: make whitelist configgurable

f8f6893

- add flag ignore to tweet author - remove whitelist from config - add endpoints for managing whitelist

dewabisma requested review from illuzen and n13 December 22, 2025 11:20

dewabisma added 3 commits December 24, 2025 20:16

fix: wrong method for updating

620a91f

fix: handle on none value

52dad59

fix: fetching author public metrics

b4e5900

n13 reviewed Dec 31, 2025

View reviewed changes

dewabisma added 3 commits January 2, 2026 19:17

Merge branch 'feat/simplify-raid-submission' into feat/configurable-w…

e0350fa

…hitelist

Merge branch 'feat/simplify-raid-submission' into feat/configurable-w…

1cbc1df

…hitelist

n13 approved these changes Jan 4, 2026

View reviewed changes

Base automatically changed from feat/simplify-raid-submission to main January 4, 2026 10:02

dewabisma merged commit 8086df5 into main Jan 4, 2026

dewabisma deleted the feat/configurable-whitelist branch January 4, 2026 10:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: make whitelist configgurable#46

feat: make whitelist configgurable#46
dewabisma merged 7 commits into
mainfrom
feat/configurable-whitelist

dewabisma commented Dec 22, 2025

Uh oh!

n13 left a comment

Uh oh!

n13 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

dewabisma commented Dec 22, 2025

Summary

Changes

Uh oh!

n13 left a comment

Choose a reason for hiding this comment

Code Review: Feat - Configurable Whitelist

1. get_whitelist Logic Issue

2. Efficiency in get_whitelist

3. Handle upsert Return Type

4. Naming Consistency

5. Tests

Summary

Uh oh!

n13 left a comment

Choose a reason for hiding this comment

PR 46 Review: Make Whitelist Configurable

Summary

Changes Identified

Code Review

Correctness

Code Quality

Considerations & Recommendations

1. Global High-Water Mark (since_id)

2. Validation

Conclusion

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `get_whitelist` Logic Issue

2. Efficiency in `get_whitelist`

3. Handle `upsert` Return Type

1. Global High-Water Mark (`since_id`)