Commit 9ec87db

Adds a new design history post to check children's vaccination history (#466)
Defining duplicate vaccination records
1 parent af06c9a commit 9ec87db

1 file changed

Lines changed: 56 additions & 0 deletions

@@ -0,0 +1,56 @@
1+
---
2+
title: Defining duplicate vaccination records
3+
description: Working definitions for what we mean by duplicate vaccination records, and why the team is agreeing language before designing solutions.
4+
date: 2026-04-17
5+
---
6+
7+
Check children's vaccination history draws on vaccination records from multiple systems. We know from our work with Manage vaccinations in schools (Mavis) that when you bring data together from different sources, you encounter duplicates – multiple records that may describe the same vaccination event.
8+
9+
Resolving duplicates accurately is important. Redundant records add friction when making sense of a child's vaccination history, and an incorrect resolution could affect clinical decisions about whether a child needs vaccination.
10+
11+
Duplicates are an unavoidable feature of how vaccination data flows across the NHS. We’ll explore why in a future post.
12+
13+
## Why definitions matter
14+
15+
Without shared definitions, we risk designing and building solutions to different problems without realising it.
16+
17+
Shared definitions give us a common language for harder problems, like:
18+
* how to detect duplicates
19+
* how to resolve them
20+
* who should be responsible for doing so.
21+
22+
Those problems require research, clinical input, and design exploration. Getting the language right first makes that work possible.
23+
24+
## Core definitions
25+
26+
**Vaccination event.** A patient received a dose of a vaccine on a date and time, in a place. It happened in the real world.
27+
28+
**Vaccination record.** A structured entry in a clinical system intended to document a vaccination event.
29+
30+
**Duplicates.** 2 or more vaccination records describing the same real-world vaccination event.
31+
32+
**Simple duplicates.** 2 or more vaccination records describing the same vaccination event, with a sufficient number of corroborating data fields. Simple duplicates can be confidently resolved by a programmatic decision, without human intervention.
33+
34+
For example: 2 records for the same child on the same date, showing the same vaccine, from the same provider. The records agree on enough core details that they can only plausibly describe 1 event.
35+
36+
**Resolvable duplicates.** 2 or more vaccination records that may describe the same vaccination event, where the available data is suggestive but not sufficient for a confident programmatic decision. Resolvable duplicates require human decisions, drawing on context outside the vaccination record such as clinical knowledge, local records, or direct enquiry.
37+
38+
For example: 2 records for the same child showing the same vaccine, recorded a few days apart. The different dates may reflect 2 separate vaccinations, or they may be an artefact of how the records were added or shared between systems. A clinician or local team weighs up what the records show, alongside what they know about how those systems behave, to reach a judgement.
39+
40+
**Unresolvable duplicates.** 2 or more vaccination records that may describe the same vaccination event, where the available data is insufficient to confirm or rule out that they do. These cannot be resolved by machine or human.
41+
42+
For example: 2 records for the same child showing the same vaccine in the same year, both labelled as dose 1. Each system has recorded what it saw as a first dose, but neither has visibility of the other. The records cannot settle whether this is 1 event recorded twice or 2 separate doses.
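
As a rough illustration of how these categories relate, the following Python sketch triages a pair of records. It is not how the service detects duplicates. The field names, the `triage` function and the rule for what counts as "sufficient" corroboration are all assumptions made for this example. The sketch only separates simple duplicates from pairs that need a person to look at them, because whether such a pair turns out to be resolvable or unresolvable depends on context outside the records.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Category(Enum):
    NOT_CANDIDATE = "not a candidate duplicate"
    SIMPLE = "simple duplicate"
    NEEDS_REVIEW = "needs human review"


@dataclass(frozen=True)
class VaccinationRecord:
    # Hypothetical fields, chosen to mirror the examples in this post.
    child_id: str
    vaccine: str
    date_given: date
    provider: str


def triage(a: VaccinationRecord, b: VaccinationRecord) -> Category:
    """Sort a pair of records into one of the illustrative categories."""
    # Different child or different vaccine: not treated as a candidate here.
    if a.child_id != b.child_id or a.vaccine != b.vaccine:
        return Category.NOT_CANDIDATE
    # Assumed rule: same date and same provider is "sufficient" corroboration.
    if a.date_given == b.date_given and a.provider == b.provider:
        return Category.SIMPLE
    # Everything else goes to a person. A real rule would also need to decide
    # how far apart dates can be before records are clearly separate events.
    return Category.NEEDS_REVIEW


# Made-up records for illustration only.
first = VaccinationRecord("child-1", "MMR", date(2025, 3, 4), "School team A")
second = VaccinationRecord("child-1", "MMR", date(2025, 3, 4), "School team A")
later = VaccinationRecord("child-1", "MMR", date(2025, 3, 7), "GP practice B")

print(triage(first, second).value)  # simple duplicate
print(triage(first, later).value)   # needs human review
```

The reviewer, not the code, would then decide whether a pair marked for review is a resolvable or an unresolvable duplicate.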
43+
44+
## Additional concepts
45+
46+
These are less settled than the core definitions. We're including them because they already come up in conversation, and a working definition is more useful than none.
47+
48+
**Confidence.** The degree of certainty that 2 or more records describe the same vaccination event. It might be used to inform decisions about when programmatic resolution is appropriate and when human review is needed.
49+
50+
**Tolerance.** The degree of uncertainty acceptable when resolving duplicates programmatically. The right level of tolerance is likely a policy question, shaped by the clinical risk of an incorrect resolution, rather than a purely technical one.
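
One way to picture how confidence and tolerance might interact is sketched below. It assumes both can be expressed as numbers between 0 and 1, which this post does not claim, and the `resolution_route` function is hypothetical. The tolerance value used is a placeholder, not a recommendation.

```python
def resolution_route(confidence: float, tolerance: float) -> str:
    """Route a candidate duplicate based on confidence and tolerance.

    Assumption: confidence and tolerance are both numbers from 0 to 1.
    """
    if not (0.0 <= confidence <= 1.0 and 0.0 <= tolerance <= 1.0):
        raise ValueError("confidence and tolerance must be between 0 and 1")
    # Uncertainty is whatever is left once confidence is accounted for.
    uncertainty = 1.0 - confidence
    # Resolve programmatically only when uncertainty is within tolerance.
    return "programmatic" if uncertainty <= tolerance else "human review"


# Placeholder tolerance; the real value would be a policy decision.
print(resolution_route(confidence=0.99, tolerance=0.02))  # programmatic
print(resolution_route(confidence=0.90, tolerance=0.02))  # human review
```

Keeping tolerance as a separate parameter reflects the point above: the threshold is something policy can set and change without altering how confidence is worked out.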
51+
52+
## Working definitions
53+
54+
Not every project needs to agree on definitions. But in a domain where the data is imperfect and the decisions have clinical consequences, shared language established early makes harder conversations easier later. It reduces the risk of the team quietly solving different problems, and means disagreements surface as genuine differences of view rather than quirks of language.
55+
56+
The definitions above are working definitions. They may evolve as we learn more. Any change to them should be a conscious, collective decision rather than something that drifts quietly over time.
