The processing cycle and HES data quality


Hospital Episode Statistics (HES) data comes from the routine submissions that providers make to NHS Digital for the purposes of paying for and commissioning healthcare in England.

Healthcare providers collect administrative and clinical data locally to support the care of the patient. The data is submitted to the Secondary Uses Service (SUS).

At pre-arranged dates during the year, SUS consolidates those submissions and compiles the data as HES. The data is then validated and cleansed, new items are derived, and the information is made available in a database. Data quality reports and checks are completed at various stages in the cleansing and processing cycle.

The HES processing cycle and HES data quality

Data quality checks performed on SUS and HES data

The following interactive report summarises high-level counts of the extracts taken from SUS+. The first page breaks down the number of records extracted by data set, namely Accident and Emergency (AE), Outpatient (OP) and Admitted Patient Care (APC), and by organisation number and type (NHS and independent providers). The second page shows the number of records deleted from the SUS extract and the reason for each deletion. These removed records will not appear in the final HES data set. The third page offers notes that clarify concepts or atypical trends.
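As a rough illustration of the kind of counts the report presents, the sketch below tallies a handful of made-up extract rows by data set and provider type, and tallies deleted rows by reason. The row layout, the `summarise` function and the deletion reason labels are assumptions made for this example; they are not taken from SUS+ or from the report itself.

```python
from collections import Counter

# Hypothetical extract rows: (data set, provider type, deletion reason or None).
# The reason labels are illustrative only, not the reasons used in the report.
records = [
    ("APC", "NHS", None),
    ("APC", "Independent", None),
    ("OP", "NHS", "duplicate"),
    ("AE", "NHS", None),
    ("OP", "NHS", None),
    ("AE", "Independent", "invalid provider code"),
]


def summarise(rows):
    """Return (records extracted per data set and provider type, deletions per reason)."""
    extracted = Counter((data_set, provider_type) for data_set, provider_type, _ in rows)
    deleted = Counter(reason for _, _, reason in rows if reason is not None)
    return extracted, deleted


extracted, deleted = summarise(records)
for (data_set, provider_type), count in sorted(extracted.items()):
    print(data_set, provider_type, count)
for reason, count in sorted(deleted.items()):
    print("deleted:", reason, count)
```

Rows with a deletion reason are still counted as extracted here, since they were part of the extract before being removed.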



The following interactive report summarises high-level counts of the extract taken from the Emergency Care Data Set (ECDS). The tables within this report count the SNOMED CT codes submitted and show their validity, based on their ECDS Description and Group, for Chief Complaint, Clinical Investigation, Primary Diagnosis and Treatment.


HES Data Quality Notes

The current version of the HES Data Quality Notes has been uploaded with limited functionality due to technical issues with the website. HES Data Quality Notes with full functionality can be requested by email. We apologise for any inconvenience this may cause and hope to have the issues resolved soon.

DQ Notes M2 2018-19 (Publication)

Latest publication period: 19th June 2018
Latest publication date: 12th June 2018

HES Data Item differences following the move to SUS+

This document provides HES users with known differences for HES 2017-18, due to changes introduced in SUS+.

HES Data Item Changes from 2017-18

Access the monthly HES data publication reports.

Automatic data cleaning and derivation rules

We apply automatic cleaning and derivation rules to improve the consistency and usability of HES data. These rules are used to:

  • clean common and obvious data quality errors
  • derive additional data items to populate the HES data set

The Autocleanse Dictionary and its cleaning rule numbering should be used in conjunction with the HES User Data Dictionary.
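To make the two purposes concrete, here is a minimal sketch of one cleaning rule followed by one derivation rule. The field names (`start_date`, `end_date`, `duration_days`) and both rules are invented for illustration; the actual rules and their numbering are those in the HES Autocleanse Dictionary and may differ.

```python
from datetime import date


def clean_record(record):
    """Illustrative cleaning rule: blank an end date that precedes the start date."""
    cleaned = dict(record)  # work on a copy so the submitted record is untouched
    start, end = cleaned.get("start_date"), cleaned.get("end_date")
    if start is not None and end is not None and end < start:
        cleaned["end_date"] = None
    return cleaned


def derive_duration(record):
    """Illustrative derivation rule: add a duration in days when both dates exist."""
    derived = dict(record)
    start, end = derived.get("start_date"), derived.get("end_date")
    if start is not None and end is not None:
        derived["duration_days"] = (end - start).days
    else:
        derived["duration_days"] = None
    return derived


def process(record):
    """Clean first, then derive, so derived items are based on cleaned values."""
    return derive_duration(clean_record(record))


valid = process({"start_date": date(2018, 4, 1), "end_date": date(2018, 4, 5)})
invalid = process({"start_date": date(2018, 4, 5), "end_date": date(2018, 4, 1)})
```

Running the cleaning step before the derivation step means an obviously wrong date does not produce a negative derived duration.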

HES Autocleanse Dictionary


Provider Mapping Methodology

How we handle records with an invalid provider code within the HES datasets.

HES Provider Mapping Methodology

Duplicate methodology

How we identify and handle duplicate records within the HES dataset.
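One common way to identify duplicates is to compare records on a set of key fields and keep the first occurrence. The sketch below does this under assumed field names (`patient_id`, `provider_code`, `start_date`) and an assumed "keep first" policy; the criteria actually applied to HES are set out in the methodology document and may be different.

```python
def split_duplicates(records, key_fields=("patient_id", "provider_code", "start_date")):
    """Split records into (kept, removed), keeping the first record for each key.

    key_fields is a hypothetical matching key chosen for this example.
    """
    seen = set()
    kept, removed = [], []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key in seen:
            removed.append(record)  # same key already kept, treat as a duplicate
        else:
            seen.add(key)
            kept.append(record)
    return kept, removed


sample = [
    {"patient_id": "A1", "provider_code": "P01", "start_date": "2018-04-01", "row": 1},
    {"patient_id": "A1", "provider_code": "P01", "start_date": "2018-04-01", "row": 2},
    {"patient_id": "B2", "provider_code": "P01", "start_date": "2018-04-01", "row": 3},
]
kept, removed = split_duplicates(sample)
```

Returning the removed records alongside the kept ones makes it possible to report how many records were deleted and why.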

HES Duplicate Identification and Removal Methodology

HES patient ID

The HES Patient ID (HES ID) provides a way of tracking patients through the HES database without identifying them. It is central to many HES outputs including spell construction, emergency readmissions and linkage to other data sets, such as mortality. We are in the process of enhancing this feature and will publish its methodology as soon as it's released.