
TBS: persist the sampling-decision publisher UUID across restarts to avoid redundant resync #20945

@endorama

Description


When tail-based sampling (TBS) is enabled, APM Server publishes finalized sampling decisions to the traces-apm.sampled-* data stream and subscribes to the same stream to consume decisions from peer APM Servers. To skip its own decisions during subscription, the query uses a must_not filter on agent.ephemeral_id, which is set to samplerUUID.
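The self-filter can be sketched as follows. This is an illustrative reconstruction, not the server's actual query-building code: the bool/must_not/term shape and the function name are assumptions; only the must_not condition on agent.ephemeral_id comes from the source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildSubscribeQuery sketches the subscriber's self-filter: match all
// sampled-decision documents except those published by this process,
// identified by its ephemeral ID (samplerUUID).
// Hypothetical helper; the real query construction may differ.
func buildSubscribeQuery(samplerUUID string) map[string]any {
	return map[string]any{
		"query": map[string]any{
			"bool": map[string]any{
				"must_not": map[string]any{
					"term": map[string]any{
						"agent.ephemeral_id": samplerUUID,
					},
				},
			},
		},
	}
}

func main() {
	b, _ := json.Marshal(buildSubscribeQuery("00000000-0000-4000-8000-000000000000"))
	fmt.Println(string(b))
}
```

Because the filter is an exact term match on the current process's UUID, any document written under a previous incarnation's UUID passes the filter and is re-fetched.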

samplerUUID is a package-level var generated fresh on every process start:

```go
// samplerUUID is a UUID used to identify sampled trace ID documents
// published by this process.
samplerUUID = uuid.Must(uuid.NewV4())
```

Because the UUID rotates on every restart, the self-filter for a restarted process no longer matches decisions the same server published in previous incarnations. Those decisions are then re-fetched from Elasticsearch even though they are already present in the local decision DB.

Nothing is functionally incorrect here (decisions are idempotent on the consumer side, keyed by trace ID). The cost is wasted bandwidth, CPU, and disk writes on every restart.

/cc @carsonip who helped uncover this

Behaviour

Single-instance deployments. Every document in the data stream is published by this instance, so every subscribe poll matches only self-docs and returns zero hits. maxObservedSeqno stays at -1 in searchIndexTraceIDs, the if maxSeqno > observedSeqno gate in searchTraceIDs never fires, and subscriber_position.json is never advanced past its initial state. On restart the resumed subscriber issues _seq_no > -1 AND agent.ephemeral_id != newUUID, which matches every document still retained in the data stream. The paginated loop drains it at 1000 docs per page.

With persistent storage, both the decision DB and subscriber_position.json survive the restart. This does not cause correctness issues, as the re-ingested decisions are idempotent overwrites; the visible effect is redundant network and disk activity.
The persistent-storage improvement from #4437 is effectively negated by the UUID rotation.

Multi-instance deployments. The position advances past peer-published decisions during normal operation, so the re-fetched window on restart is bounded to the tail of recently-written self-docs (between the last peer-observed _seq_no and the current global checkpoint). Still redundant, but with smaller overall impact.

Impact

The cost scales with throughput and with ILM retention on the traces-apm.sampled-* indices.

On every restart, CPU, disk, and network activity are elevated until the stream is drained.

With ephemeral storage there is no impact, since the re-fetch is necessary anyway. The impact is most pronounced with persistent storage, where the re-fetch could be mostly avoided.

Open question

Was the per-process scoping of samplerUUID intentional? The comment at main.go:45-47 suggests it was, but does not say for what purpose. Before opening a PR we should confirm whether there is a correctness argument behind per-process identity.
