Comb/Index::delete() rewrites the entire JSON file even when the document was never indexed #14549

@SUXUMI

Bug description

Statamic\Search\Comb\Index::delete() always calls $this->save($data) after $data->forget($ref), whether or not the reference was present in the index data. Laravel's Collection::forget() is silent when the key is absent (vendor/laravel/framework/src/Illuminate/Collections/Collection.php:1997-1999; unset($this->items[$key]) on a missing key is a no-op), so calling delete() for a document that was never in the index rewrites the entire index JSON file to disk without changing its contents.

// vendor/statamic/cms/src/Search/Comb/Index.php:89-100  (v6.14.0; same on 5.x)
public function delete($document)
{
    try {
        $data = $this->data();
    } catch (IndexNotFoundException $e) {
        return;
    }

    $data->forget($document->getSearchReference());
    $this->save($data);   // ← unconditional full-file rewrite
}

The call site that triggers this hot path is Statamic\Search\Search::updateWithinIndexes() (vendor/statamic/cms/src/Search/Search.php:57-71):

  $this->indexes()->each(function ($index) use ($searchable) {
      $shouldIndex = $index->shouldIndex($searchable);
      $exists = $index->exists();

      if ($shouldIndex && $exists) {
          $index->insert($searchable->getSearchReference());
      } elseif ($shouldIndex && ! $exists) {
          $index->update();
      } elseif ($exists) {
          $index->delete($searchable);   // ← fires for EVERY existing index that doesn't contain this entry
      }
  });

Proposed fix:

  public function delete($document)
  {
      try {
          $data = $this->data();
      } catch (IndexNotFoundException $e) {
          return;
      }

-     $data->forget($document->getSearchReference());
-     $this->save($data);
+     $ref = $document->getSearchReference();
+
+     if (! $data->has($ref)) {
+         return;
+     }
+
+     $data->forget($ref);
+     $this->save($data);
  }
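The effect of the guard can be sketched in isolation. The class below is a hypothetical in-memory stand-in (FakeIndex and its $writes counter are assumptions made for illustration, not Statamic code); it only shows that checking for the key first avoids the save when the reference was never indexed.

```php
<?php

// Hypothetical stand-in for the Comb index; not Statamic's actual class.
final class FakeIndex
{
    // Counts how many times the "file" would have been rewritten.
    public int $writes = 0;

    /** @param array<string, array> $items search reference => indexed document */
    public function __construct(private array $items) {}

    public function delete(string $ref): void
    {
        // Skip the write entirely when the reference was never indexed.
        if (! array_key_exists($ref, $this->items)) {
            return;
        }

        unset($this->items[$ref]);
        $this->save();
    }

    private function save(): void
    {
        // The real index would serialize the whole data set to disk here.
        $this->writes++;
    }
}

$index = new FakeIndex(['entry::a' => ['title' => 'A']]);
$index->delete('entry::missing'); // no write: the key is absent
$index->delete('entry::a');       // one write: the key was present
echo $index->writes, PHP_EOL;     // prints 1
```

Without the array_key_exists() check, both calls would reach save(), which mirrors the unconditional rewrite described above.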

How to reproduce

  1. Install Statamic v6 with the default local search driver
  2. Configure at least two indexes in config/statamic/search.php, scoped to separate collections (e.g. news and pages)
  3. Run php artisan statamic:search:update news and php artisan statamic:search:update pages to build the index files in storage/statamic/search/
  4. Observe disk I/O while saving an entry, e.g. a news entry

Sample code for tinker/Tinkerwell

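use Statamic\Facades\Entry; // assumed import; the Entry facade must be in scope

// Run queued jobs inline so the index update happens during save().
config(["queue.default" => "sync"]);

$entry = Entry::query()
    // ->where("collection", "news")
    ->where("id", "<new-entry-id>") // placeholder: replace with a real entry ID
    ->first();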

$pagesPath = storage_path("statamic/search/pages.json");
clearstatcache();
$mtimeBefore = filemtime($pagesPath);

$entry->slug(time() . "-testing");
$entry->save();

clearstatcache();
$mtimeAfter = filemtime($pagesPath);

echo "pages.json mtime changed: " . ($mtimeBefore !== $mtimeAfter ? "YES (BUG)" : "NO");

Result: every *.json index file is rewritten, including the files of indexes that never contained the entry.
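To check all index files at once rather than only pages.json, a small helper along these lines could be used. This is a sketch: changedIndexFiles() is a hypothetical name, not part of Statamic, and it relies on filemtime(), which typically has one-second resolution, so a rewrite in the same second as the previous write may go undetected.

```php
<?php

/**
 * Runs $action and returns the *.json files in $dir whose mtime changed.
 * Hypothetical helper for illustration; not part of Statamic.
 *
 * @return string[] base names of the new or modified files
 */
function changedIndexFiles(string $dir, callable $action): array
{
    // Collect base name => mtime for every JSON file in the directory.
    $snapshot = function () use ($dir): array {
        clearstatcache();
        $mtimes = [];
        foreach (glob($dir . '/*.json') ?: [] as $path) {
            $mtimes[basename($path)] = filemtime($path);
        }
        return $mtimes;
    };

    $before = $snapshot();
    $action();
    $after = $snapshot();

    // Keep files that are new or whose mtime differs from the first snapshot.
    return array_keys(array_filter(
        $after,
        fn ($mtime, $name) => ! isset($before[$name]) || $before[$name] !== $mtime,
        ARRAY_FILTER_USE_BOTH
    ));
}
```

In tinker this could be called as changedIndexFiles(storage_path("statamic/search"), fn () => $entry->save()); with the guard in place, only the indexes that actually contain the entry should be listed.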

Environment

- Statamic version: 6.14.0
- Same code present on 5.x branch (src/Search/Comb/Index.php:89-100)
- Laravel: 12.56.0
- PHP: 8.4.20

Installation

Existing Laravel app
