The processing cycle and HES data quality

Hospital Episode Statistics (HES) data comes from the routine submissions that providers make to NHS Digital for the purposes of payment for, and commissioning of, healthcare in England.

Learn more about emergency care data quality in the data quality information and reports for the Emergency Care Data Set (ECDS).

Healthcare providers collect administrative and clinical data locally to support the care of the patient. The data is submitted to the Secondary Uses Service (SUS).

At pre-arranged dates during the year, SUS consolidates those submissions and compiles the data as HES. The data is then validated and cleansed, new items are derived, and the information is made available in a database. Data quality reports and checks are completed at various stages in the cleansing and processing cycle.

The HES processing cycle and HES data quality

Data quality checks performed on SUS and HES data

This interactive report provides a summary of high-level counts of the extracts taken from SUS+. The first page gives a breakdown of the number of records extracted by data set - Accident and Emergency (AE), Outpatient (OP) and Admitted Patient Care (APC) - and by organisation number and type (NHS and independent providers). The second page shows the number of records deleted from the SUS extract and the reason for each deletion. These removed records will not appear in the final HES data set. The third page offers notes to clarify concepts or atypical trends.
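As a rough illustration of the kind of summary the report presents, the sketch below tallies extracted records by data set and organisation type, and deleted records by reason. The record layout, organisation types and deletion reasons are hypothetical and are not taken from SUS+ or HES.

```python
from collections import Counter

# Hypothetical extract rows: (data set, organisation type, deletion reason or None).
# None means the record was kept; the reasons shown are invented for illustration.
records = [
    ("APC", "NHS", None),
    ("APC", "NHS", "duplicate"),
    ("OP", "Independent", None),
    ("AE", "NHS", "invalid provider code"),
    ("OP", "NHS", None),
]


def summarise(rows):
    """Return (extracted counts by data set and org type, deleted counts by reason)."""
    extracted = Counter((dataset, org_type) for dataset, org_type, _ in rows)
    deleted = Counter(reason for _, _, reason in rows if reason is not None)
    return extracted, deleted


extracted, deleted = summarise(records)
```

Here `extracted` plays the role of the first page of the report and `deleted` the second. Records with a deletion reason are counted as extracted but would not reach the final data set.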

Accessibility of this tool

This tool is in Microsoft Power BI, which does not fully support all accessibility needs. If you need further assistance, please contact us for help.

HES data quality notes

The current version of the HES Data Quality Notes has been uploaded with limited functionality due to technical issues with the website. A version with full functionality can be requested by email. We apologise for any inconvenience this may cause and hope to have the issues resolved soon.

DQ Notes M2 2021-22 (publication)

Latest activity period: May 2021
Latest Publication Date: 13 July 2021

Understanding volume of legally restricted codes in HES: April 2017 - March 2018

This document gives HES users a count of legally restricted records in HES from April 2017 to March 2018, by provider, for Admitted Patient Care and Outpatients. It highlights the differences in counts caused by additional codes that were added by SUS+ and have since been reverted.

Automatic data cleaning and derivation rules

We clean the data automatically to improve the consistency and usability of HES data. The cleaning and derivation rules are used to:

  • clean common and obvious data quality errors
  • derive additional data items to populate the HES data set

The document and cleaning rule numbering should be used in conjunction with the HES User Data Dictionary.
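To make the two kinds of rule concrete, the sketch below applies one cleaning rule and one derivation rule to a record. Both rules, and the field names `admission_date`, `discharge_date` and `length_of_stay`, are invented for illustration. The actual rules are the ones listed in the autocleanse dictionary.

```python
from datetime import date


def clean_record(record):
    """Illustrative cleaning rule: blank a discharge date that precedes admission."""
    cleaned = dict(record)
    start = cleaned.get("admission_date")
    end = cleaned.get("discharge_date")
    if start is not None and end is not None and end < start:
        cleaned["discharge_date"] = None
    return cleaned


def derive_items(record):
    """Illustrative derivation rule: add length of stay in days when both dates exist."""
    derived = dict(record)
    start = derived.get("admission_date")
    end = derived.get("discharge_date")
    if start is not None and end is not None:
        derived["length_of_stay"] = (end - start).days
    else:
        derived["length_of_stay"] = None
    return derived


# Hypothetical records: one plausible, one with an obvious date error.
raw = {"admission_date": date(2021, 5, 3), "discharge_date": date(2021, 5, 7)}
bad = {"admission_date": date(2021, 5, 3), "discharge_date": date(2021, 4, 30)}

# Cleaning runs before derivation, so derived items are built from cleaned values.
processed = [derive_items(clean_record(r)) for r in (raw, bad)]
```

Running cleaning before derivation means an implausible date is blanked rather than carried into a derived item such as a negative length of stay.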

HES autocleanse dictionary PDF

HES autocleanse dictionary XLSX

Provider mapping methodology

How we handle records with an invalid provider code within the HES datasets.

HES provider mapping methodology

Duplicate methodology

How we identify and handle duplicate records within the HES dataset.

HES duplicate identification and removal methodology
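For readers unfamiliar with the general idea, the sketch below shows one common way to flag duplicates: treat records that share the same values in a set of key fields as duplicates and keep the first one seen. This is a generic technique, not the published HES methodology, and the key fields and rows are hypothetical.

```python
def remove_duplicates(records, key_fields):
    """Keep the first record seen for each key; return (kept, removed)."""
    seen = set()
    kept, removed = [], []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key in seen:
            removed.append(record)
        else:
            seen.add(key)
            kept.append(record)
    return kept, removed


# Hypothetical key fields and rows, for illustration only.
KEY_FIELDS = ("provider_code", "patient_id", "episode_start")
rows = [
    {"provider_code": "P01", "patient_id": "A", "episode_start": "2021-05-03"},
    {"provider_code": "P01", "patient_id": "A", "episode_start": "2021-05-03"},
    {"provider_code": "P02", "patient_id": "B", "episode_start": "2021-05-04"},
]

kept, removed = remove_duplicates(rows, KEY_FIELDS)
```

Returning the removed records alongside the kept ones makes it possible to report how many records were dropped and why.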

HES patient ID

The HES patient ID (HES ID) provides a way of tracking patients through the HES database without identifying them. It is central to many HES outputs including spell construction, emergency readmissions and linkage to other data sets, such as mortality. We are in the process of enhancing this feature and will publish its methodology as soon as it's released.

SUS admitted patient care data

NHS Digital routinely collects data from hospital providers about a patient's time in hospital as part of the Commissioning Data Set (CDS). This data is processed and returned to healthcare providers as the Secondary Uses Service (SUS) data set, which the NHS uses for operational purposes. Most NHS hospital trusts submit data monthly, to a deadline, following a two-phase reconciliation process to arrive at a final agreed position for each month's activity, as defined in the NHS Standard Contract for payment purposes. The data is then consolidated, validated and cleaned, and used to create the Hospital Episode Statistics (HES) data set, which is released monthly as official statistics.

A number of trusts now submit data as part of the CDS more frequently, such as weekly or sometimes even daily. This more frequently reported management information in SUS offers insights beyond those immediately available in HES data. To help unlock these opportunities, the following data quality dashboard has been published with the intention to:

  • help improve the quality of the data derived from SUS
  • assist in interpreting the current SUS APC data
  • help understand which providers have more up to date data

Please be aware that the information provided here is not directly comparable to any similar metrics that may be reported in our other HES data quality outputs.
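One simple way to think about which providers have more up to date data is to compare the most recent activity date each provider has reported. The sketch below does this for a handful of invented submissions. The provider codes and dates are hypothetical, and the dashboard may measure timeliness differently.

```python
from datetime import date

# Hypothetical (provider code, latest activity date in a submission) pairs.
submissions = [
    ("PROV_A", date(2021, 7, 1)),
    ("PROV_B", date(2021, 6, 14)),
    ("PROV_A", date(2021, 7, 8)),
]


def latest_activity(rows):
    """Return the most recent activity date reported by each provider."""
    latest = {}
    for provider, activity_date in rows:
        if provider not in latest or activity_date > latest[provider]:
            latest[provider] = activity_date
    return latest


def rank_by_recency(rows):
    """List providers from most to least up to date."""
    latest = latest_activity(rows)
    return sorted(latest, key=latest.get, reverse=True)


ranking = rank_by_recency(submissions)
```

With these sample rows, `PROV_A` ranks ahead of `PROV_B` because its latest reported activity is more recent.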

Accessibility of this tool

This tool is in Microsoft Power BI, which does not fully support all accessibility needs. Further information about the data is available. If you need further assistance, please contact us for help.

Last edited: 13 July 2021 8:07 am