The Person_ID Handbook

Date Published: 24 January 2024

Summary and outline

The Person_ID is a unique patient identifier used by NHS England to standardise patient-level data linkage across different data sets.

This handbook aims to provide users of the Person_ID in the Hospital Episode Statistics (HES) databases with supporting documentation on what the Person_ID is, how it is derived via the Master Person Service (MPS), how the data flows between services (Data Processing Services (DPS) and Spine), and how to interpret the output information associated with the Person_ID.

Person_IDs are provided in many data sets available in NHS England, including HES, and are derived from the outputs of MPS. For security and privacy reasons, many users may have visibility only of a tokenised version of the Person_ID, which provides an extra level of patient confidentiality.

MPS takes certain demographic information contained in a person’s health and care records and matches it to their unique NHS number to confirm their identity. The collection of all NHS numbers and patients’ demographic information is contained in the Personal Demographics Service (PDS) data set.

Like any data linkage method, MPS does not match perfectly. There is a risk both of failing to match a record (a false negative) and of matching a record incorrectly (a false positive). The performance of MPS is determined by both the algorithm itself and the quality of the incoming data.
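The two kinds of error can be counted on a sample where the true identity of each record is already known. The sketch below is illustrative only: the function names, the sample records and the convention of using None for "no match" are assumptions made for this example and are not part of MPS.

```python
from collections import Counter
from typing import Mapping, Optional


def classify(true_id: Optional[str], linked_id: Optional[str]) -> str:
    """Label one linkage result against a known true identity (hypothetical helper)."""
    if linked_id == true_id:
        return "correct"
    if linked_id is None:
        # A true identity exists but no match was returned.
        return "false_negative"
    # A match was returned, but to the wrong identity.
    return "false_positive"


def linkage_outcomes(
    truth: Mapping[str, Optional[str]],
    linked: Mapping[str, Optional[str]],
) -> Counter:
    """Count outcomes over every record in the validated sample."""
    return Counter(classify(truth[key], linked.get(key)) for key in truth)


# Illustrative sample: record key -> person identifier (None means no match).
truth = {"r1": "A", "r2": "B", "r3": "C", "r4": None}
linked = {"r1": "A", "r2": None, "r3": "X", "r4": None}
outcomes = linkage_outcomes(truth, linked)
```

Dividing each count by the number of records in the sample gives the corresponding rate for that sample.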

MPS operates in the same way for all data sets and is not tuned to any particular use case, so its accuracy depends on the identifiers supplied. Where records reliably have accurate NHS numbers attached, MPS will provide a correct match with high confidence. Where MPS relies solely on other personal identifiers (such as name, postcode, gender or date of birth), which may be incomplete, inconsistently recorded or duplicated across the population, the algorithm is less able to return a correct match in every case.

Mature health data sets such as HES, where identity is typically validated in a healthcare setting at the point of recording, have higher levels of matching accuracy through MPS for most records. Performance for other data sets may vary.

Where a perfect match of NHS number and date of birth cannot be found between the record of interest and any PDS record, more complex algorithms compare partial demographic information to identify the PDS record most likely to correspond to the query record. These algorithms are referred to as alphanumeric trace and algorithmic trace; in HES only algorithmic trace is used. In the algorithmic trace step, the single query record is compared with all records in PDS. The comparisons use some demographic information (date of birth, name, gender and postcode) and are scored on similarity. If the similarity is deemed acceptable, the matched record is returned. Otherwise, the algorithm looks for similarities between the record of interest and previously unmatched records, which are stored in a separate data set, the MPS record bucket.
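The idea of scoring a query record against candidate records and accepting the best one only above a cut-off can be sketched as follows. This is not the MPS algorithm: the field weights, the threshold and the use of Python's difflib string similarity are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Mapping, Optional


@dataclass(frozen=True)
class Demographics:
    name: str
    date_of_birth: str  # ISO format, e.g. "1980-05-17"
    gender: str
    postcode: str


# Hypothetical weights and acceptance threshold, for illustration only.
WEIGHTS = {"name": 0.4, "date_of_birth": 0.3, "gender": 0.1, "postcode": 0.2}
THRESHOLD = 0.9


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after removing whitespace and ignoring case."""
    norm_a = "".join(a.upper().split())
    norm_b = "".join(b.upper().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def score(query: Demographics, candidate: Demographics) -> float:
    """Weighted sum of per-field similarities."""
    return sum(
        weight * field_similarity(getattr(query, field), getattr(candidate, field))
        for field, weight in WEIGHTS.items()
    )


def best_match(
    query: Demographics, candidates: Mapping[str, Demographics]
) -> Optional[str]:
    """Return the key of the highest-scoring candidate, or None if below THRESHOLD."""
    if not candidates:
        return None
    key, top = max(
        ((k, score(query, c)) for k, c in candidates.items()),
        key=lambda pair: pair[1],
    )
    return key if top >= THRESHOLD else None
```

In this sketch, a candidate whose fields differ from the query only in letter case or spacing scores close to 1 and is accepted, while a None result corresponds to the point at which the search would move on to the next source of candidate records.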

The Person_ID is therefore one of the following, depending on whether and where a match was found: the NHS number from PDS, the MPS_ID from the MPS record bucket, or a one-time-use ID.
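The order in which these three outcomes are tried can be sketched as a simple fallback chain. The function and parameter names below are hypothetical, the two lookups are passed in as placeholders rather than real MPS calls, and the use of a random UUID for the one-time-use ID is an assumption made only so that the example runs.

```python
import uuid
from typing import Callable, Optional, Tuple

# A lookup takes a record and returns an identifier, or None if no match is found.
Lookup = Callable[[dict], Optional[str]]


def resolve_person_id(
    record: dict, match_pds: Lookup, match_bucket: Lookup
) -> Tuple[str, str]:
    """Return (person_id, source), trying PDS first, then the MPS record bucket."""
    nhs_number = match_pds(record)
    if nhs_number is not None:
        return nhs_number, "NHS number"

    mps_id = match_bucket(record)
    if mps_id is not None:
        return mps_id, "MPS_ID"

    # No match anywhere: issue an identifier that is not reused (assumed format).
    return uuid.uuid4().hex, "one-time-use ID"
```

Returning the source alongside the identifier makes it possible to tell afterwards which of the three cases produced a given Person_ID.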

The rest of the document is structured as follows:

chapter 1 explains what the Person_ID is and provides details on the scope of this document
chapter 2 explains how the Person_ID is generated and how it is used in the context of the HES data set
chapter 3 provides a more detailed technical explanation of the algorithms behind the matching logic
chapter 4 shows specific empirical examples of how a Person_ID is matched
finally, chapter 5 contains additional information helpful to Person_ID users

Download this document as a pdf

The Person_ID Handbook v.2.0

pdf 1 MB

Name mapping CSV

csv 76 KB

Last edited: 27 February 2024 3:54 pm
