Add cached find support #894
📝 Walkthrough
Adds a generic
Changes
Cache-aside helper and find cache-key infrastructure
Greptile Summary
This PR introduces
Confidence Score: 3/5
The cache-hit path re-fetches each document through getDocument(), which always enforces PERMISSION_READ regardless of the $forPermission argument passed to cachedFind(). For the common PERMISSION_READ default this is harmless, but any caller that uses a non-READ permission mode with documentSecurity enabled can silently receive fewer documents on the second call than on the first.
The behavioral divergence between cache miss and cache hit is reproducible and silent for the full cache TTL whenever a non-READ permission mode is combined with documentSecurity and documents that hold the requested permission but not READ. The N+1 getDocument() pattern is intentional and documented, but the permission-type mismatch is not documented and has no test coverage for the failing case.
src/Database/Database.php: specifically the cache-hit loop in cachedFind() and how it delegates to getDocument() for authorization.
Reviews (28): Last reviewed commit: "Revalidate cached find hits"
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8625-8642: The cache hit path in the conditional block starting
with if ($cached !== null && $cached !== false && \is_array($cached)) only wraps
cached arrays into Document instances using createDocumentInstance, but does not
hydrate relationship fields that may have been serialized as arrays. This
results in different document shapes being returned compared to the cache miss
path which uses find() that returns fully processed documents. Either ensure the
cached hit path recursively hydrates relationship fields to match the shape
returned by find(), or add logic to bypass caching for finds that request
relationship-resolved data to avoid returning inconsistent types between cache
hits and misses.
- Around line 9563-9564: The cache key generation logic in the findCached method
is incorrectly dropping the serialized queries when a $key is provided. Change
the ternary expression that sets the queries field to always call
serializeFindCacheQueries($queries) instead of returning null when $key is set.
The caller key should add an additional cache dimension alongside the query
fingerprint, not replace it, so both the serialized queries and the key must be
included in the cache identifier to prevent different queries with the same key
from returning incorrect cached results.
- Around line 9572-9584: The getFindCacheSchemaHash method currently only
includes collection attributes and indexes in the hash calculation, but the
find() authorization check also validates collection permissions and
documentSecurity settings. When collection authorization policies change, cached
results remain valid incorrectly because the schema hash doesn't reflect these
changes. Modify the getFindCacheSchemaHash method to include the collection's
permissions and documentSecurity flag in the hash payload alongside the existing
attributes and indexes. Additionally, replace the MD5 hash algorithm with a
stronger digest function such as SHA256 to address the static analysis flag
regarding weak hashing.
In `@tests/unit/FindCacheTest.php`:
- Around line 140-144: The `HashMemoryCache::save` method incorrectly prevents
caching of empty arrays because the condition `empty($data)` returns true for
empty arrays and rejects them. Modify the validation logic to only reject empty
strings while allowing empty arrays to be cached as valid results. Change the
condition to specifically check if `$data` is an empty string or if `$key` is
empty, rather than using `empty()` which treats empty arrays the same as empty
strings.
- Around line 83-99: The test testFindCachedTriggersFindEventOnCacheHit does not
actually verify a cache hit because both cache hits and cache misses emit
EVENT_DOCUMENT_FIND. To fix this, add data mutation between the two findCached
calls to force a true cache hit scenario. After the first findCached call on
line 92, modify the seeded project data in the database, then verify that the
second findCached call on line 93 returns the stale cached result from the first
call rather than the modified data, which would definitively prove the second
call was a cache hit.
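The `empty()` pitfall described for `HashMemoryCache::save` can be reproduced in isolation. The sketch below is a minimal illustration, assuming a guard shaped like the one the comment describes; `shouldRejectLoose()` and `shouldRejectStrict()` are hypothetical names and are not taken from the test file.

```php
<?php
declare(strict_types=1);

// Hypothetical guards for illustration only; not the real HashMemoryCache::save().

function shouldRejectLoose(string $key, array|string $data): bool
{
    // empty([]) and empty('') are both true, so an empty result list is rejected.
    return empty($key) || empty($data);
}

function shouldRejectStrict(string $key, array|string $data): bool
{
    // Only an empty key or an empty string payload is rejected;
    // an empty array is treated as a valid (cacheable) result.
    return $key === '' || $data === '';
}

var_dump(shouldRejectLoose('k', []));  // bool(true)
var_dump(shouldRejectStrict('k', [])); // bool(false)
var_dump(shouldRejectStrict('k', '')); // bool(true)
```

With the strict form, a query that legitimately matches no documents can still be stored, so the next call is a hit rather than a repeated miss.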
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d0eed82b-9eca-489f-a0fd-19748bc2694c
📒 Files selected for processing (4)
- src/Database/Database.php
- tests/e2e/Adapter/Scopes/DocumentTests.php
- tests/unit/CacheKeyTest.php
- tests/unit/FindCacheTest.php
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/Database/Database.php (2)
8611-8617: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Validate queries before serializing the cache key.
findCached() reaches serializeQueriesForFindCache() before find() runs checkQueryTypes(), so malformed query arrays can fail outside the normal QueryException path on cached calls.
Proposed fix
     if ($collectionDocument->isEmpty()) {
         throw new NotFoundException('Collection not found');
     }

+    $this->checkQueryTypes($queries);
+
     [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 8611 - 8617, In the findCached() method, validate the queries before they are serialized for the cache key. Call checkQueryTypes() on the queries parameter before invoking getFindCacheKeys(), which calls serializeQueriesForFindCache(). This ensures malformed query arrays are caught early with proper QueryException handling rather than failing during serialization.
8617-8639: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Avoid serving document-security caches after permission revocation.
The cache hit path returns serialized documents without re-reading current document permissions. If a cached document later loses $permissions, the same role can still receive it until the find TTL expires. Until permission-changing writes purge affected find variants, bypass caching for documentSecurity collections.
Conservative fix
     if ($collectionDocument->isEmpty()) {
         throw new NotFoundException('Collection not found');
     }

+    if ($collectionDocument->getAttribute('documentSecurity', false)) {
+        return $this->find($collectionDocument->getId(), $queries, $forPermission);
+    }
+
     [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 8617 - 8639, The current caching logic in the find method returns cached documents without re-validating permissions, which creates a security vulnerability for documentSecurity collections where permissions may have been revoked. Add a check using the $collectionDocument to determine if the collection is a documentSecurity collection, and if so, skip the entire cache loading and cache return logic (the try-catch block for $this->cache->load() and the conditional check for $cached !== null) to force a fresh permission check on every query. This ensures that permission changes are respected immediately rather than serving stale cached results until the TTL expires.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Line 8659: The condition on line 8659 checks `if ($touchOnHit &&
!$hasExpiredDocuments)`, but the `!$hasExpiredDocuments` part is redundant
because the code already guarantees this value through an early return statement
in the preceding lines (8646-8657). Remove the `&& !$hasExpiredDocuments` clause
from the condition, leaving only `if ($touchOnHit)`, to eliminate the
always-true condition that is causing the PHPStan error.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/Database/Database.php (1)
8831-8838: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Hydrate cached hits through the same post-processing as find().
decodeCachedFindPayload() only wraps top-level arrays, while the miss path returns fully cast/decoded documents and may include populated relationship Document instances. JSON-backed cache adapters can therefore make cache hits return a different shape than misses; recursively hydrate relationship fields or bypass relationship-resolved cached finds until hit-path hydration matches find().
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 8831 - 8838, The decodeCachedFindPayload() method only wraps top-level arrays but does not perform the same post-processing and hydration of relationship fields that the find() method applies to documents from the miss path. To fix this, ensure that documents retrieved from the cache in this code section (where isTtlExpired() is checked) are passed through the same post-processing/hydration logic that is applied to non-cached results, so that relationship Document instances and field casting match the behavior of the find() method. This will ensure cache hits return the same shape as cache misses and prevent JSON-backed cache adapters from returning inconsistently structured data.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 9788-9793: Replace the MD5 hashing algorithm with SHA-256 for
collision resistance in the cache key generation. In the sprintf call that
builds the cache key, replace both instances of \md5(\json_encode($roles) ?: '')
and \md5(\json_encode($queryPayload) ?: '') with hash('sha256', ...) calls
instead. Also apply the same SHA-256 replacement at the location around line
9822 as indicated in the review comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8616-8627: The code attempts to load from cache before performing
full query validation against the collection structure. Add the complete
semantic validation logic from the full DocumentsValidator (not just
checkQueryTypes() which only validates that values are Query objects) before the
cache->load() call to ensure that cached hits cannot bypass standard
QueryException behavior for invalid query semantics. This validation block
should validate the queries parameter against the collectionDocument before
attempting to retrieve from the cache identified by the cacheKey and cacheField
variables.
In `@tests/unit/CacheKeyTest.php`:
- Around line 284-293: The PHPDoc annotation on the $queries variable in the
testFindCacheFieldValidatesQueryTypes method incorrectly declares the type as
array<Query>, but the actual value assigned is ['invalid'], which is an array of
strings. Update the PHPDoc comment to accurately reflect the actual data type by
changing the annotation to array<string> to match what is actually being
assigned to the variable.
Add cache-aside helper
♻️ Duplicate comments (1)
src/Database/Database.php (1)
9567-9572: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Strengthen find-cache fingerprint to prevent stale authorization reuse.
getFindCacheField()/getFindCacheSchemaHash() still use MD5 and the schema hash excludes collection policy fields. Since cache hits bypass find() checks, a collection policy change ($permissions/documentSecurity) can keep an old cache field valid for the same role hash until TTL expiry.
Suggested patch
-        return \sprintf(
-            '%s:%s:%s:%s',
-            $this->getFindCacheSchemaHash($collection),
-            \md5(\json_encode($roles) ?: ''),
-            \md5(\json_encode($queryPayload) ?: ''),
-            $field,
-        );
+        return \sprintf(
+            '%s:%s:%s:%s',
+            $this->getFindCacheSchemaHash($collection),
+            \hash('sha256', \json_encode($roles) ?: ''),
+            \hash('sha256', \json_encode($queryPayload) ?: ''),
+            $field,
+        );
@@
-        return \md5(
-            \json_encode($collection->getAttribute('attributes', []))
-            . \json_encode($collection->getAttribute('indexes', []))
-        );
+        return \hash('sha256', \json_encode([
+            'attributes' => $collection->getAttribute('attributes', []),
+            'indexes' => $collection->getAttribute('indexes', []),
+            'permissions' => $collection->getAttribute('$permissions', []),
+            'documentSecurity' => $collection->getAttribute('documentSecurity', false),
+        ]) ?: '');
Also applies to: 9621-9630
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 9567 - 9572, The find-cache fingerprint in getFindCacheField() currently uses weak MD5 hashing and the schema hash from getFindCacheSchemaHash() excludes collection policy fields (permissions and documentSecurity). Replace the MD5 hash algorithm with a stronger alternative like SHA-256 for both the roles and queryPayload hashes in the cache key. Additionally, update the schema hash calculation to include collection policy fields so that changes to permissions or documentSecurity invalidate the cache and prevent stale authorization reuse. Apply these changes consistently across both getFindCacheField() and getFindCacheSchemaHash() methods.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/ListCacheTest.php (1)
157-157: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Make hash-field assertion order-insensitive to avoid flaky tests.
$cache->list('key') ordering may vary by adapter implementation; asserting exact sequence can create nondeterministic failures. Prefer canonicalizing comparison or sorting before assert.
Suggested test adjustment
-        $this->assertSame(['first-field', 'second-field'], $cache->list('key'));
+        $this->assertEqualsCanonicalizing(['first-field', 'second-field'], $cache->list('key'));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/ListCacheTest.php` at line 157, The assertSame assertion on the $cache->list('key') call is comparing arrays in strict order, which can cause flaky tests since different cache adapter implementations may return hash fields in varying orders. Modify the assertion to be order-insensitive by either sorting both the expected and actual arrays before comparison using sort(), or use an assertion method like assertEqualsCanonicalizing that performs order-insensitive comparison. This ensures the test validates the correct fields are returned regardless of the order in which they appear.
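For the sort-before-assert alternative mentioned in this comment, a minimal sketch outside PHPUnit could look like the following. `sameFields()` is a hypothetical helper introduced only for illustration; the suggested `assertEqualsCanonicalizing()` call remains the shorter option inside a test case.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: compares two lists of hash-field names while
 * ignoring the order in which a cache adapter happens to return them.
 *
 * @param array<string> $expected
 * @param array<string> $actual
 */
function sameFields(array $expected, array $actual): bool
{
    // Arrays are passed by value, so sorting here does not touch the caller's data.
    sort($expected);
    sort($actual);

    return $expected === $actual;
}

var_dump(sameFields(['first-field', 'second-field'], ['second-field', 'first-field'])); // bool(true)
var_dump(sameFields(['first-field', 'second-field'], ['first-field']));                 // bool(false)
```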
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8588-8603: The cache key hash calculation in the withCache call is
not including the $forPermission parameter, which means different permission
levels (read, update, delete) will share the same cache entry. Update the
getFindCacheField method call to include $forPermission as an additional
parameter in the hash calculation so that the full authorization context is
properly fingerprinted. This ensures that a cached result for read permissions
cannot be incorrectly reused for update or delete permission requests with the
same query and roles.
In `@tests/unit/ListCacheTest.php`:
- Around line 219-221: The test for cachedFind method (lines 219-221 and similar
occurrences at 228-230 and 234-239) assumes deterministic ordering of results
without explicitly specifying a sort order, which can cause instability across
different database adapters. Either add an explicit order clause to the queries
parameter passed to cachedFind (for example, sorting by ID) to ensure consistent
ordering, or refactor the assertions to compare results canonically without
relying on index position (such as extracting and comparing IDs in a set-based
manner rather than checking first[0]).
    $document = $this->silent(fn () => $this->getDocument($collection, $cachedDocument->getId(), $selects));
    if ($document->isEmpty()) {
        continue;
    }
getDocument() always checks PERMISSION_READ, diverging from $forPermission on cache hits
The cache-hit path calls $this->getDocument(...) which unconditionally evaluates PERMISSION_READ at lines 4907–4912 of getDocument(). When cachedFind() is called with $forPermission = Database::PERMISSION_UPDATE and documentSecurity is enabled, a document that carries the update permission but not the read permission will be included in the find() result on the first (cache-miss) call. On the second call the same document is excluded because getDocument() returns empty when the READ check fails, making the cache hit return fewer results than the cache miss. The behavioral difference is silent and repeatable for the full TTL duration.
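The divergence can be illustrated with a toy model that does not use the library at all. In the sketch below, `findIds()` stands in for the cache-miss path (filtering by the requested permission) and `revalidateIds()` stands in for the cache-hit loop (always checking read); both functions and the `$documents` layout are assumptions made for this example.

```php
<?php
declare(strict_types=1);

// Toy model: each document lists the roles allowed per permission type.
$documents = [
    'a' => ['read' => ['user:1'], 'update' => ['user:1']],
    'b' => ['read' => [],         'update' => ['user:1']], // update without read
];

/** Stand-in for the cache-miss path: filters by the requested permission. */
function findIds(array $documents, string $role, string $forPermission): array
{
    $ids = [];
    foreach ($documents as $id => $permissions) {
        if (in_array($role, $permissions[$forPermission] ?? [], true)) {
            $ids[] = $id;
        }
    }

    return $ids;
}

/** Stand-in for the cache-hit loop: re-checks every cached ID against "read". */
function revalidateIds(array $documents, array $cachedIds, string $role): array
{
    return array_values(array_filter(
        $cachedIds,
        fn (string $id): bool => in_array($role, $documents[$id]['read'] ?? [], true)
    ));
}

$miss = findIds($documents, 'user:1', 'update');      // ['a', 'b']
$hit = revalidateIds($documents, $miss, 'user:1');    // ['a']
```

Document `b` appears on the first call and disappears on the second, which is the silent mismatch the comment describes.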
Summary
- Database::cachedFind() as a reusable cache-aside wrapper for find() result lists
- Database::purgeCachedFind() so callers can invalidate all cached fields for a collection namespace
- cachedFind() in Authorization::skip() when needed
- find() documents directly on cache miss, avoiding a second decode of already-decoded filter values
- createDocumentInstance() and casting() so cached and fresh find() results keep matching PHP types and nested document payloads stay as Document objects
- Query::orderRandom() so random result pages are not pinned for the cache TTL
- withCache() generic; it still handles cache load/save/purge failures fail-open and does not know about permissions
Usage
Callers can cache a find() result without manually serializing documents or composing withCache() calls:
Invalidate the cached query fields for the same collection namespace with:
Cache Shape
Cached find entries use:
The hash field keeps the existing find-cache field shape:
The query-state hash includes serialized queries, database, active authorization status/roles, requested permission mode, relationship mode, and active filter signatures. Cursor queries include the cursor document payload so pagination boundaries do not collide when ordered values change.
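One way to picture the query-state hash is as a digest over a single normalized payload. The sketch below is an assumption-laden illustration: `queryStateHash()`, the keys of `$base`, and the choice of SHA-256 (following the review suggestion above) are all hypothetical and do not reflect the library's actual payload layout.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical fingerprint builder. The keys in $state mirror the components
 * listed in the description; they are not the library's real field names.
 */
function queryStateHash(array $state): string
{
    // Sort top-level keys so the digest does not depend on insertion order.
    ksort($state);

    return hash('sha256', json_encode($state, JSON_THROW_ON_ERROR));
}

$base = [
    'queries' => ['equal("status", ["active"])', 'limit(25)'],
    'database' => 'main',
    'authorization' => ['enabled' => true, 'roles' => ['any', 'user:1']],
    'permission' => 'read',
    'relationships' => true,
    'filters' => ['json', 'datetime'],
];

$read = queryStateHash($base);
// Same query and roles, different requested permission mode.
$update = queryStateHash(['permission' => 'update'] + $base);

var_dump($read === $update); // bool(false)
```

Because the permission mode is part of the hashed payload, a result cached for read cannot be served for an update-mode request with the same queries and roles.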
Tests
- composer format
- composer lint
- vendor/bin/phpunit tests/unit/ListCacheTest.php tests/unit/CacheKeyTest.php
- git diff --check
Summary by CodeRabbit
Release Notes