Add cached find support #894

Open
premtsd-code wants to merge 35 commits into main from feat/cached-find

Conversation

@premtsd-code

@premtsd-code premtsd-code commented Jun 17, 2026

Contributor

Summary

  • add Database::cachedFind() as a reusable cache-aside wrapper for find() result lists
  • add Database::purgeCachedFind() so callers can invalidate all cached fields for a collection namespace
  • keep authorization explicit: internal callers still wrap cachedFind() in Authorization::skip() when needed
  • build cached-find entries from the existing find cache key/field primitives
  • include collection schema, database, queries, relationship mode, active filter signatures, caller labels, active authorization status/roles, requested permission mode, and field type in the cached-find hash
  • return fresh find() documents directly on cache miss, avoiding a second decode of already-decoded filter values
  • serialize decoded documents as array payloads and rehydrate cache hits through createDocumentInstance() and casting() so cached and fresh find() results keep matching PHP types and nested document payloads stay as Document objects
  • bypass caching for Query::orderRandom() so random result pages are not pinned for the cache TTL
  • keep withCache() generic; it still handles cache load/save/purge failures fail-open and does not know about permissions

Usage

Callers can cache a find() result without manually serializing documents or composing withCache() calls:

$rules = $db->getAuthorization()->skip(fn () => $db->cachedFind(
    collection: 'wafRules',
    queries: [
        Query::equal('projectId', [$projectId]),
        Query::equal('enabled', [true]),
        Query::orderAsc('priority'),
        Query::limit(APP_LIMIT_COUNT),
    ],
    namespace: '_'.$projectSequence,
    roles: ['waf'],
));

Invalidate the cached query fields for the same collection namespace with:

$db->purgeCachedFind('wafRules', '_'.$projectSequence);

Cache Shape

Cached find entries use:

{cacheName}-cache:{hostname}:{namespace}:{tenant}:collection:{collection}:find

The hash field keeps the existing find-cache field shape:

{schemaHash}:{rolesHash}:{queryStateHash}:{field}

The query-state hash includes serialized queries, database, active authorization status/roles, requested permission mode, relationship mode, and active filter signatures. Cursor queries include the cursor document payload so pagination boundaries do not collide when ordered values change.
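
As a rough illustration of the two shapes above, the sketch below composes a key and a hash field from plain strings and arrays. The function names `findCacheKeySketch()` and `findCacheFieldSketch()`, their parameters, and the layout of the query-state array are assumptions made for this example; they are not the signatures of getFindCacheKey() or getFindCacheField(). MD5 is used only because the review patches in this thread show it at this stage of the PR.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds a key in the documented shape
 * {cacheName}-cache:{hostname}:{namespace}:{tenant}:collection:{collection}:find
 */
function findCacheKeySketch(
    string $cacheName,
    string $hostname,
    string $namespace,
    string $tenant,
    string $collection
): string {
    return \sprintf(
        '%s-cache:%s:%s:%s:collection:%s:find',
        $cacheName,
        $hostname,
        $namespace,
        $tenant,
        $collection
    );
}

/**
 * Hypothetical helper: builds a field in the documented shape
 * {schemaHash}:{rolesHash}:{queryStateHash}:{field}
 *
 * @param list<string> $roles
 * @param array<string, mixed> $queryState serialized queries, cursor payload, permission mode, ...
 */
function findCacheFieldSketch(string $schemaHash, array $roles, array $queryState, string $field): string
{
    return \sprintf(
        '%s:%s:%s:%s',
        $schemaHash,
        \md5(\json_encode($roles) ?: ''),
        \md5(\json_encode($queryState) ?: ''),
        $field
    );
}

// Two pages that share the same queries but carry different cursor payloads
// end up in different hash fields under the same key.
$key = findCacheKeySketch('default', 'localhost', '_42', '', 'wafRules');
$pageA = findCacheFieldSketch('schema1', ['waf'], ['queries' => ['limit(25)'], 'cursor' => ['$id' => 'a']], 'documents');
$pageB = findCacheFieldSketch('schema1', ['waf'], ['queries' => ['limit(25)'], 'cursor' => ['$id' => 'b']], 'documents');
```

Because every variant lives as a field under one key per collection namespace, purging that single key is enough to drop all cached variants at once, which matches the purgeCachedFind() description above.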

Tests

  • composer format
  • composer lint
  • vendor/bin/phpunit tests/unit/ListCacheTest.php tests/unit/CacheKeyTest.php
  • git diff --check

Summary by CodeRabbit

Release Notes

  • New Features
    • Added caching layer for database find queries to improve performance
    • Implemented cache management with automatic purging capabilities
    • Cache entries respect authorization contexts and user roles for secure operations

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between a761a16 and 6f70767.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/ListCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/CacheKeyTest.php
  • src/Database/Database.php

Walkthrough

Adds a generic withCache() cache-aside method to Database, along with public getFindCacheKey() and getFindCacheField() helpers backed by deterministic internal helpers (serializeFindCacheQuery, getFindCacheSchemaHash, getActiveFilterSignatures). Implements public cachedFind() and purgeCachedFind() APIs that wrap find queries in cache-aside logic, skipping cache for random-order queries and re-validating permissions on cache hits. Updates getCacheKeys() to use the centralized filter-signature helper. Tests cover the cache-aside behavior, find cache-key stability, and the end-to-end cachedFind lifecycle, and include custom in-memory cache adapters with TTL support.

Changes

Cache-aside helper and find cache-key infrastructure

Layer / File(s) | Summary
withCache() cache-aside implementation (src/Database/Database.php) | Adds withCache(key, callback, hash) implementing cache-aside: loads by key/hash, treats false/missing as a miss, purges rejected entries, invokes the callback on miss, and best-effort saves results that are not false, with warning-level error handling.
Find cache-key/field helpers and filter-signature centralization (src/Database/Database.php) | Updates getCacheKeys() to source the filters signature from new getActiveFilterSignatures(). Adds public getFindCacheKey() and getFindCacheField(), and internal serializeFindCacheQuery(), normalizeFindCacheQueryValue(), getFindCacheSchemaHash(), and getActiveFilterSignatures() for deterministic, stable find cache keys.
cachedFind() and purgeCachedFind() public APIs (src/Database/Database.php) | Adds cachedFind(collection, queries, namespace, roles, forPermission) wrapping find queries in cache-aside logic, bypassing cache for random-order queries, converting cached payloads back to document instances with permission revalidation. Adds purgeCachedFind(collection, namespace) to purge all cached find entries.
CacheKeyTest: getFindCacheKey and getFindCacheField coverage (tests/unit/CacheKeyTest.php) | Extends imports and createDatabase() helper; adds eight tests asserting correct key format with :find suffix, namespace override, schema-based field derivation, sensitivity to roles/queries/cursors/ambient state/authorization context, and QueryException on invalid query types.
ListCacheTest: withCache() behaviors, cachedFind() lifecycle, and cache adapters (tests/unit/ListCacheTest.php) | Adds withCache() tests asserting callback invocation on miss, caching of empty/null values, payload separation by hash, and non-caching of false. Adds comprehensive cachedFind() tests verifying cache persistence/purge, role/permission separation, document recasting, double-decode prevention, random-query bypass, permission revalidation, and TTL filtering. Implements HashMemoryCache and JsonHashMemoryCache in-memory cache adapters with TTL-aware operations, and TtlDatabaseMemory extending the in-memory database adapter to support TTL indexes.
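
The withCache() behavior summarized above (false or missing is a miss, the callback runs on a miss, any result other than false is saved, cache failures are logged rather than thrown) can be sketched roughly as follows. `ArrayHashCache` and `withCacheSketch()` are hypothetical stand-ins written for this illustration; they are not the library's cache adapter or the PR's actual method, and the purge-on-rejected-entry step is left out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for a hash-capable cache adapter.
 * load() returns false on a miss, so a stored null still counts as a hit.
 */
final class ArrayHashCache
{
    /** @var array<string, array<string, mixed>> */
    private array $store = [];

    public function load(string $key, string $hash): mixed
    {
        if (!isset($this->store[$key]) || !\array_key_exists($hash, $this->store[$key])) {
            return false;
        }

        return $this->store[$key][$hash];
    }

    public function save(string $key, mixed $data, string $hash): void
    {
        $this->store[$key][$hash] = $data;
    }
}

/**
 * Cache-aside with fail-open error handling: a broken cache degrades to
 * calling the callback every time instead of failing the request.
 */
function withCacheSketch(ArrayHashCache $cache, string $key, callable $callback, string $hash): mixed
{
    try {
        $cached = $cache->load($key, $hash);
        if ($cached !== false) {
            return $cached; // cache hit, including cached null or []
        }
    } catch (\Throwable $e) {
        \error_log('cache load failed: ' . $e->getMessage());
    }

    $result = $callback();

    if ($result !== false) { // false is reserved for "miss", so it is never stored
        try {
            $cache->save($key, $result, $hash);
        } catch (\Throwable $e) {
            \error_log('cache save failed: ' . $e->getMessage());
        }
    }

    return $result;
}
```

Reserving false as the miss sentinel is what lets empty arrays and null be cached as legitimate results, which is the distinction the ListCacheTest rows above describe.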

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • utopia-php/database#717: This PR's cachedFind() explicitly bypasses caching for random-order queries, which depends on that PR's introduction of the ORDER_RANDOM / orderRandom query type and SQL handling.
  • utopia-php/database#828: This PR's deterministic find cache-field/key logic and active filter signature handling depend on per-filter signature metadata via getInstanceFilters(), which that PR enables.

Suggested reviewers

  • fogelito

Poem

🐇 Hop, hop through the cache we go,
A key, a hash, a miss—oh no!
The callback runs, the result is stored,
False values tossed, the rest adored.
With stable keys and schema hash,
My finds are cached in quite a flash! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 15.46% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title 'Add cached find support' directly and clearly summarizes the main change: adding cache-aside support for find() result caching via cachedFind() method.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Contributor

Greptile Summary

This PR introduces cachedFind() as a cache-aside wrapper around find(), purgeCachedFind() for collection-level cache invalidation, and the generic withCache() helper, along with stable key/field helpers (getFindCacheKey, getFindCacheField) that fold in schema, queries, authorization state, filter signatures, and relationship mode.

  • cachedFind() stores the original find() document set by ID and re-fetches each document individually on cache hits so TTL expiry, deletion, and permission changes are observed per document.
  • getActiveFilterSignatures() is extracted from inline code in getCacheKeys(), reducing duplication and making the filter-state contribution consistent between the two cache families.
  • Comprehensive unit tests cover empty results, null caching, random-order bypass, TTL filtering, and permission revalidation on cache hits.

Confidence Score: 3/5

The cache-hit path re-fetches each document through getDocument(), which always enforces PERMISSION_READ regardless of the $forPermission argument passed to cachedFind(). For the common PERMISSION_READ default this is harmless, but any caller that uses a non-READ permission mode with documentSecurity enabled can silently receive fewer documents on the second call than on the first.

The behavioral divergence between cache miss and cache hit is reproducible and silent for the full cache TTL whenever a non-READ permission mode is combined with documentSecurity and documents that hold the requested permission but not READ. The N+1 getDocument() pattern is intentional and documented, but the permission-type mismatch is not documented and has no test coverage for the failing case.

src/Database/Database.php — specifically the cache-hit loop in cachedFind() and how it delegates to getDocument() for authorization.
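
The divergence described above can be reproduced with a toy model. Everything in this sketch is hypothetical: documents are plain arrays, `missPath()` stands in for find() filtering by the caller's permission mode, and `hitPath()` stands in for a cache-hit loop that re-checks only read access, as getDocument() is reported to do. None of it is the library's authorization code.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical permission check: the document grants $permission to at least one role.
 *
 * @param array{id: string, permissions: array<string, list<string>>} $doc
 * @param list<string> $roles
 */
function isAllowed(array $doc, string $permission, array $roles): bool
{
    return \count(\array_intersect($doc['permissions'][$permission] ?? [], $roles)) > 0;
}

/** Stand-in for the cache-miss path: filter by the requested permission mode. */
function missPath(array $docs, string $forPermission, array $roles): array
{
    return \array_values(\array_filter(
        $docs,
        fn (array $doc): bool => isAllowed($doc, $forPermission, $roles)
    ));
}

/** Stand-in for the cache-hit path: re-fetch by ID and always check read access. */
function hitPath(array $cachedIds, array $docsById, array $roles): array
{
    $result = [];
    foreach ($cachedIds as $id) {
        $doc = $docsById[$id] ?? null;
        if ($doc !== null && isAllowed($doc, 'read', $roles)) {
            $result[] = $doc;
        }
    }

    return $result;
}

$docs = [
    ['id' => 'a', 'permissions' => ['read' => ['user:1'], 'update' => ['user:1']]],
    ['id' => 'b', 'permissions' => ['update' => ['user:1']]], // update without read
];
$roles = ['user:1'];

$fresh = missPath($docs, 'update', $roles);                                      // both documents
$cached = hitPath(\array_column($fresh, 'id'), \array_column($docs, null, 'id'), $roles); // only 'a'
```

In this model the first call returns two documents and the second returns one, with no error in between, which is the silent shrinkage the review warns about for non-read permission modes.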

Important Files Changed

Filename | Overview
src/Database/Database.php | Adds cachedFind(), purgeCachedFind(), withCache(), getFindCacheKey(), getFindCacheField(), and related helpers. Cache-hit path re-fetches documents via getDocument() which always checks PERMISSION_READ, diverging from find()'s caller-specified $forPermission; also extracts getActiveFilterSignatures() to reduce duplication.
tests/unit/ListCacheTest.php | New test file for cachedFind and withCache. Covers empty results, null values, random-order bypass, TTL expiry, and permission revalidation, but does not test the non-READ $forPermission edge case where getDocument() READ check diverges from find().
tests/unit/CacheKeyTest.php | Extends CacheKeyTest with tests for getFindCacheKey() and getFindCacheField(). Verifies cursor payload inclusion, ambient state differentiation (filters, relationships, auth), query validation, and that schema/roles/permission-mode changes produce distinct fields.

Reviews (28): Last reviewed commit: "Revalidate cached find hits"


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 8625-8642: The cache hit path in the conditional block starting
with if ($cached !== null && $cached !== false && \is_array($cached)) only wraps
cached arrays into Document instances using createDocumentInstance, but does not
hydrate relationship fields that may have been serialized as arrays. This
results in different document shapes being returned compared to the cache miss
path which uses find() that returns fully processed documents. Either ensure the
cached hit path recursively hydrates relationship fields to match the shape
returned by find(), or add logic to bypass caching for finds that request
relationship-resolved data to avoid returning inconsistent types between cache
hits and misses.
- Around line 9563-9564: The cache key generation logic in the findCached method
is incorrectly dropping the serialized queries when a $key is provided. Change
the ternary expression that sets the queries field to always call
serializeFindCacheQueries($queries) instead of returning null when $key is set.
The caller key should add an additional cache dimension alongside the query
fingerprint, not replace it, so both the serialized queries and the key must be
included in the cache identifier to prevent different queries with the same key
from returning incorrect cached results.
- Around line 9572-9584: The getFindCacheSchemaHash method currently only
includes collection attributes and indexes in the hash calculation, but the
find() authorization check also validates collection permissions and
documentSecurity settings. When collection authorization policies change, cached
results remain valid incorrectly because the schema hash doesn't reflect these
changes. Modify the getFindCacheSchemaHash method to include the collection's
permissions and documentSecurity flag in the hash payload alongside the existing
attributes and indexes. Additionally, replace the MD5 hash algorithm with a
stronger digest function such as SHA256 to address the static analysis flag
regarding weak hashing.

In `@tests/unit/FindCacheTest.php`:
- Around line 140-144: The `HashMemoryCache::save` method incorrectly prevents
caching of empty arrays because the condition `empty($data)` returns true for
empty arrays and rejects them. Modify the validation logic to only reject empty
strings while allowing empty arrays to be cached as valid results. Change the
condition to specifically check if `$data` is an empty string or if `$key` is
empty, rather than using `empty()` which treats empty arrays the same as empty
strings.
- Around line 83-99: The test testFindCachedTriggersFindEventOnCacheHit does not
actually verify a cache hit because both cache hits and cache misses emit
EVENT_DOCUMENT_FIND. To fix this, add data mutation between the two findCached
calls to force a true cache hit scenario. After the first findCached call on
line 92, modify the seeded project data in the database, then verify that the
second findCached call on line 93 returns the stale cached result from the first
call rather than the modified data, which would definitively prove the second
call was a cache hit.
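
The `empty()` finding above comes down to standard PHP semantics: `empty([])` is true, so a guard written with `empty($data)` rejects empty result lists along with empty strings. The two functions below are hypothetical reductions of such a guard, written for illustration only; they are not the test adapter's actual save() method.

```php
<?php

declare(strict_types=1);

/**
 * Guard in the style the review flags: empty() treats [] (and also '0')
 * as empty, so an empty find() result would never be stored.
 *
 * @param array<mixed>|string $data
 */
function rejectsWithEmpty(string $key, array|string $data): bool
{
    return $key === '' || empty($data);
}

/**
 * Guard in the style the review suggests: only an empty key or an
 * empty string payload is rejected; [] is a valid cached result.
 *
 * @param array<mixed>|string $data
 */
function rejectsEmptyStringOnly(string $key, array|string $data): bool
{
    return $key === '' || $data === '';
}

$rejectedBefore = rejectsWithEmpty('find:key', []);       // true: empty list is dropped
$rejectedAfter = rejectsEmptyStringOnly('find:key', []);  // false: empty list is cacheable
```

Caching empty lists matters for a cache-aside wrapper because a query that legitimately matches nothing would otherwise hit the database on every call.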

📥 Commits

Reviewing files that changed from the base of the PR and between cfba533 and 1d615f5.

📒 Files selected for processing (4)
  • src/Database/Database.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php


@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/Database/Database.php (2)

8611-8617: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate queries before serializing the cache key.

findCached() reaches serializeQueriesForFindCache() before find() runs checkQueryTypes(), so malformed query arrays can fail outside the normal QueryException path on cached calls.

Proposed fix
         if ($collectionDocument->isEmpty()) {
             throw new NotFoundException('Collection not found');
         }
 
+        $this->checkQueryTypes($queries);
+
         [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8611 - 8617, In the findCached()
method, validate the queries before they are serialized for the cache key. Call
checkQueryTypes() on the queries parameter before invoking getFindCacheKeys(),
which calls serializeQueriesForFindCache(). This ensures malformed query arrays
are caught early with proper QueryException handling rather than failing during
serialization.

8617-8639: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid serving document-security caches after permission revocation.

The cache hit path returns serialized documents without re-reading current document permissions. If a cached document later loses $permissions, the same role can still receive it until the find TTL expires. Until permission-changing writes purge affected find variants, bypass caching for documentSecurity collections.

Conservative fix
         if ($collectionDocument->isEmpty()) {
             throw new NotFoundException('Collection not found');
         }
 
+        if ($collectionDocument->getAttribute('documentSecurity', false)) {
+            return $this->find($collectionDocument->getId(), $queries, $forPermission);
+        }
+
         [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8617 - 8639, The current caching
logic in the findCached() method returns cached documents without re-validating
permissions, which creates a security vulnerability for documentSecurity
collections where permissions may have been revoked. Add a check using the
$collectionDocument to determine if the collection is a documentSecurity
collection, and if so, skip the entire cache loading and cache return logic (the
try-catch block for $this->cache->load() and the conditional check for $cached
!== null) to force a fresh permission check on every query. This ensures that
permission changes are respected immediately rather than serving stale cached
results until the TTL expires.

📥 Commits

Reviewing files that changed from the base of the PR and between 6fad393 and 5275b33.

📒 Files selected for processing (5)
  • src/Database/Database.php
  • tests/e2e/Adapter/Base.php
  • tests/e2e/Adapter/RedisTest.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/FindCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/e2e/Adapter/Base.php
  • tests/e2e/Adapter/RedisTest.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Line 8659: The condition on line 8659 checks `if ($touchOnHit &&
!$hasExpiredDocuments)`, but the `!$hasExpiredDocuments` part is redundant
because the code already guarantees this value through an early return statement
in the preceding lines (8646-8657). Remove the `&& !$hasExpiredDocuments` clause
from the condition, leaving only `if ($touchOnHit)`, to eliminate the
always-true condition that is causing the PHPStan error.

📥 Commits

Reviewing files that changed from the base of the PR and between 5275b33 and 7cd9d19.

📒 Files selected for processing (4)
  • src/Database/Database.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php
💤 Files with no reviewable changes (1)
  • tests/unit/CacheKeyTest.php
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/FindCacheTest.php


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/Database/Database.php (1)

8831-8838: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hydrate cached hits through the same post-processing as find().

decodeCachedFindPayload() only wraps top-level arrays, while the miss path returns fully cast/decoded documents and may include populated relationship Document instances. JSON-backed cache adapters can therefore make cache hits return a different shape than misses; recursively hydrate relationship fields or bypass relationship-resolved cached finds until hit-path hydration matches find().
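
One way to picture the recursive hydration this comment asks for is the sketch below. `Doc` is a hypothetical minimal document type, not the library's Document class, and passing an explicit list of relationship attribute names is an assumption made for the example; the real code would presumably derive that from the collection's attributes. The same key list is reused at every depth, which is a simplification.

```php
<?php

declare(strict_types=1);

/** Hypothetical minimal document type used only for this illustration. */
final class Doc extends \ArrayObject
{
}

/**
 * Rebuilds nested Doc instances from a JSON-decoded payload.
 *
 * @param array<string, mixed> $payload decoded cache entry for one document
 * @param list<string> $relationshipKeys attribute names that may hold related documents
 */
function hydrate(array $payload, array $relationshipKeys): Doc
{
    foreach ($relationshipKeys as $key) {
        $value = $payload[$key] ?? null;
        if (!\is_array($value)) {
            continue; // unresolved relationship: an ID string or null stays as-is
        }

        $payload[$key] = \array_is_list($value)
            // to-many relationship: hydrate each populated child, keep bare IDs
            ? \array_map(
                fn (mixed $child): mixed => \is_array($child) ? hydrate($child, $relationshipKeys) : $child,
                $value
            )
            // to-one relationship: a single populated child document
            : hydrate($value, $relationshipKeys);
    }

    return new Doc($payload);
}

// A JSON round trip, as a JSON-backed cache adapter would perform, flattens nested documents to arrays.
$encoded = \json_encode([
    '$id' => 'post1',
    'author' => ['$id' => 'user1'],
    'tags' => [['$id' => 't1'], ['$id' => 't2']],
]);
$document = hydrate(\json_decode($encoded, true), ['author', 'tags']);
```

Without the recursive step, `author` and `tags` would come back as plain arrays on a cache hit while a miss returns objects, which is the shape mismatch the comment describes.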

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8831 - 8838, The
decodeCachedFindPayload() method only wraps top-level arrays but does not
perform the same post-processing and hydration of relationship fields that the
find() method applies to documents from the miss path. To fix this, ensure that
documents retrieved from the cache in this code section (where isTtlExpired() is
checked) are passed through the same post-processing/hydration logic that is
applied to non-cached results, so that relationship Document instances and field
casting match the behavior of the find() method. This will ensure cache hits
return the same shape as cache misses and prevent JSON-backed cache adapters
from returning inconsistently structured data.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 9788-9793: Replace the MD5 hashing algorithm with SHA-256 for
collision resistance in the cache key generation. In the sprintf call that
builds the cache key, replace both instances of \md5(\json_encode($roles) ?: '')
and \md5(\json_encode($queryPayload) ?: '') with hash('sha256', ...) calls
instead. Also apply the same SHA-256 replacement at the location around line
9822 as indicated in the review comment.


📥 Commits

Reviewing files that changed from the base of the PR and between 7cd9d19 and e4d8f12.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 8616-8627: The code attempts to load from cache before performing
full query validation against the collection structure. Add the complete
semantic validation logic from the full DocumentsValidator (not just
checkQueryTypes() which only validates that values are Query objects) before the
cache->load() call to ensure that cached hits cannot bypass standard
QueryException behavior for invalid query semantics. This validation block
should validate the queries parameter against the collectionDocument before
attempting to retrieve from the cache identified by the cacheKey and cacheField
variables.

In `@tests/unit/CacheKeyTest.php`:
- Around line 284-293: The PHPDoc annotation on the $queries variable in the
testFindCacheFieldValidatesQueryTypes method incorrectly declares the type as
array<Query>, but the actual value assigned is ['invalid'], which is an array of
strings. Update the PHPDoc comment to accurately reflect the actual data type by
changing the annotation to array<string> to match what is actually being
assigned to the variable.

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd9d19 and 2eb8271.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/ListCacheTest.php


@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
src/Database/Database.php (1)

9567-9572: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Strengthen find-cache fingerprint to prevent stale authorization reuse.

getFindCacheField()/getFindCacheSchemaHash() still use MD5 and the schema hash excludes collection policy fields. Since cache hits bypass find() checks, a collection policy change ($permissions / documentSecurity) can keep an old cache field valid for the same role hash until TTL expiry.

Suggested patch
-        return \sprintf(
-            '%s:%s:%s:%s',
-            $this->getFindCacheSchemaHash($collection),
-            \md5(\json_encode($roles) ?: ''),
-            \md5(\json_encode($queryPayload) ?: ''),
-            $field,
-        );
+        return \sprintf(
+            '%s:%s:%s:%s',
+            $this->getFindCacheSchemaHash($collection),
+            \hash('sha256', \json_encode($roles) ?: ''),
+            \hash('sha256', \json_encode($queryPayload) ?: ''),
+            $field,
+        );
@@
-        return \md5(
-            \json_encode($collection->getAttribute('attributes', []))
-            . \json_encode($collection->getAttribute('indexes', []))
-        );
+        return \hash('sha256', \json_encode([
+            'attributes' => $collection->getAttribute('attributes', []),
+            'indexes' => $collection->getAttribute('indexes', []),
+            'permissions' => $collection->getAttribute('$permissions', []),
+            'documentSecurity' => $collection->getAttribute('documentSecurity', false),
+        ]) ?: '');

Also applies to: 9621-9630
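
To see why folding policy fields into the schema hash invalidates stale entries, the standalone sketch below applies the suggested payload to a plain array. `schemaHashSketch()` is a hypothetical function operating on an array rather than a collection Document, so it only mirrors the shape of the suggested patch.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical reduction of the suggested schema hash: attributes and indexes
 * plus the collection's permissions and documentSecurity flag, digested with SHA-256.
 *
 * @param array<string, mixed> $collection
 */
function schemaHashSketch(array $collection): string
{
    return \hash('sha256', \json_encode([
        'attributes' => $collection['attributes'] ?? [],
        'indexes' => $collection['indexes'] ?? [],
        'permissions' => $collection['$permissions'] ?? [],
        'documentSecurity' => $collection['documentSecurity'] ?? false,
    ]) ?: '');
}

$before = schemaHashSketch([
    'attributes' => [['key' => 'priority', 'type' => 'integer']],
    '$permissions' => ['read("any")'],
    'documentSecurity' => false,
]);

// Same attributes and indexes, but the collection policy was tightened.
$after = schemaHashSketch([
    'attributes' => [['key' => 'priority', 'type' => 'integer']],
    '$permissions' => ['read("team:admins")'],
    'documentSecurity' => true,
]);
```

Since the schema hash is the first segment of the cache field, a policy change moves lookups to a new field and the old entries are simply never read again until they expire or are purged.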

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 9567 - 9572, The find-cache
fingerprint in getFindCacheField() currently uses weak MD5 hashing and the
schema hash from getFindCacheSchemaHash() excludes collection policy fields
(permissions and documentSecurity). Replace the MD5 hash algorithm with a
stronger alternative like SHA-256 for both the roles and queryPayload hashes in
the cache key. Additionally, update the schema hash calculation to include
collection policy fields so that changes to permissions or documentSecurity
invalidate the cache and prevent stale authorization reuse. Apply these changes
consistently across both getFindCacheField() and getFindCacheSchemaHash()
methods.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1171458a-1aba-44ff-b664-ab2433babf95

📥 Commits

Reviewing files that changed from the base of the PR and between 2eb8271 and 7478a8b.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/ListCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/CacheKeyTest.php

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unit/ListCacheTest.php (1)

157-157: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make hash-field assertion order-insensitive to avoid flaky tests.

$cache->list('key') ordering may vary by adapter implementation; asserting exact sequence can create nondeterministic failures. Prefer canonicalizing comparison or sorting before assert.

Suggested test adjustment
-        $this->assertSame(['first-field', 'second-field'], $cache->list('key'));
+        $this->assertEqualsCanonicalizing(['first-field', 'second-field'], $cache->list('key'));
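
As a rough illustration of what an order-insensitive comparison does, the sketch below sorts copies of both lists before comparing them. The helper name `sameFields()` is hypothetical and is not part of PHPUnit or of this PR.

```php
<?php

declare(strict_types=1);

/**
 * Compare two lists of field names while ignoring element order.
 * Arrays are passed by value, so sorting here does not touch the caller's data.
 */
function sameFields(array $expected, array $actual): bool
{
    \sort($expected);
    \sort($actual);

    return $expected === $actual;
}

// Two adapters returning the same hash fields in a different order.
$adapterA = ['first-field', 'second-field'];
$adapterB = ['second-field', 'first-field'];

var_dump($adapterA === $adapterB);           // strict comparison is order-sensitive
var_dump(sameFields($adapterA, $adapterB));  // sorted comparison ignores order
```

A strict `assertSame()` behaves like the first comparison, while the suggested `assertEqualsCanonicalizing()` behaves like the second.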
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/ListCacheTest.php` at line 157, The assertSame assertion on the
$cache->list('key') call is comparing arrays in strict order, which can cause
flaky tests since different cache adapter implementations may return hash fields
in varying orders. Modify the assertion to be order-insensitive by either
sorting both the expected and actual arrays before comparison using sort(), or
use an assertion method like assertEqualsCanonicalizing that performs
order-insensitive comparison. This ensures the test validates the correct fields
are returned regardless of the order in which they appear.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around lines 8588-8603: The cache field hash computed for the withCache call
does not include the $forPermission parameter, so requests for different
permission levels (read, update, delete) share the same cache entry. Pass
$forPermission to getFindCacheField as an additional input to the hash so that
the full authorization context is fingerprinted. This ensures that a result
cached for read permission cannot be reused for an update or delete permission
request with the same query and roles.

In `@tests/unit/ListCacheTest.php`:
- Around lines 219-221: The tests for the cachedFind method (lines 219-221, with
similar occurrences at 228-230 and 234-239) assume a deterministic result order
without specifying a sort order, which can make them unstable across database
adapters. Either add an explicit order clause to the queries passed to
cachedFind (for example, sorting by ID) to ensure consistent ordering, or
refactor the assertions to compare results canonically without relying on index
position (for example, extracting the IDs and comparing them as a set rather
than checking first[0]).
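
The first inline comment above (fingerprinting `$forPermission`) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `findCacheFieldWithPermission()` and its parameter list are hypothetical, and the real `getFindCacheField()` takes more inputs than shown here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical cache-field builder that includes the requested permission mode.
 * Not the PR's getFindCacheField(); only the idea of hashing $forPermission is shown.
 */
function findCacheFieldWithPermission(
    array $roles,
    array $queryPayload,
    string $forPermission,
    string $field
): string {
    return \sprintf(
        '%s:%s:%s:%s',
        \hash('sha256', \json_encode($roles) ?: ''),
        \hash('sha256', \json_encode($queryPayload) ?: ''),
        \hash('sha256', $forPermission),
        $field,
    );
}

$roles = ['user:123'];
$query = ['equal' => ['enabled', [true]]];

// Same roles and query, different requested permission.
$readField = findCacheFieldWithPermission($roles, $query, 'read', 'documents');
$updateField = findCacheFieldWithPermission($roles, $query, 'update', 'documents');

// Distinct fields mean a read result cannot be served for an update request.
var_dump($readField !== $updateField);
```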


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 45e5f395-111c-4d05-82c3-18f6f0cc29eb

📥 Commits

Reviewing files that changed from the base of the PR and between 7478a8b and a761a16.

📒 Files selected for processing (2)
  • src/Database/Database.php
  • tests/unit/ListCacheTest.php

Comment thread src/Database/Database.php
$document = $this->silent(fn () => $this->getDocument($collection, $cachedDocument->getId(), $selects));
if ($document->isEmpty()) {
    continue;
}


P1 getDocument() always checks PERMISSION_READ, diverging from $forPermission on cache hits

The cache-hit path calls $this->getDocument(...) which unconditionally evaluates PERMISSION_READ at lines 4907–4912 of getDocument(). When cachedFind() is called with $forPermission = Database::PERMISSION_UPDATE and documentSecurity is enabled, a document that carries the update permission but not the read permission will be included in the find() result on the first (cache-miss) call. On the second call the same document is excluded because getDocument() returns empty when the READ check fails, making the cache hit return fewer results than the cache miss. The behavioral difference is silent and repeatable for the full TTL duration.
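
The divergence described above can be reproduced with a small standalone simulation. It assumes documents are modelled as plain arrays of granted permission names, and the helpers `findFor()` and `rehydrate()` are hypothetical stand-ins for the cache-miss `find()` path and the cache-hit `getDocument()` path.

```php
<?php

declare(strict_types=1);

// Document ID => permissions granted to the current caller (assumed model).
$documents = [
    'a' => ['read', 'update'],
    'b' => ['update'], // updatable but not readable
];

/**
 * Cache-miss path: filter by the permission the caller asked for.
 */
function findFor(array $documents, string $forPermission): array
{
    return \array_keys(\array_filter(
        $documents,
        fn (array $granted): bool => \in_array($forPermission, $granted, true),
    ));
}

/**
 * Cache-hit path: reload each cached ID, but always check 'read',
 * mirroring the unconditional read check described in the comment above.
 */
function rehydrate(array $documents, array $cachedIds): array
{
    return \array_values(\array_filter(
        $cachedIds,
        fn (string $id): bool => \in_array('read', $documents[$id] ?? [], true),
    ));
}

$miss = findFor($documents, 'update'); // first call: both documents
$hit = rehydrate($documents, $miss);   // second call: document 'b' is dropped

var_dump(\count($miss), \count($hit));
```

The two counts differ, which is the silent mismatch between the first and second call that the comment points out.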
