Hospital Episode Statistics linkage to the Diagnostic Imaging Dataset

Hospital Episode Statistics (HES) contains around 1 billion records on patients attending Accident and Emergency units, being admitted for treatment or attending outpatient clinics at NHS hospitals in England.

The Diagnostic Imaging Dataset (DID) is a collection of data on NHS-funded diagnostic imaging tests, such as MRI scans and x-rays, extracted from NHS providers' radiological information systems.

The HES-DID linkage brings these two data sets together so that patients' records can be matched.

Linking the two data sets makes it possible to analyse acute secondary care pathways in England following different imaging tests. This will help to identify trends and variation in patient outcomes, and thereby enable a much deeper understanding of the care pathway. For example, analysis of the linked data could reveal geographical variation in the number of diagnostic tests that lead to cancers being diagnosed, or in rates of missed cancers. It can also support analysis of particular procedures, including whether imaging is performed too early or too late in the diagnosis to be beneficial. Such analysis could be used to improve services for patients and has the potential to save lives.

Matching the DID to HES

The DID is linked to HES by matching person identifiable data in the DID with patient identifiers in HES (the HES_ID index). Matching is performed by comparing patient identifiable fields, such as NHS number, date of birth, gender and/or postcode, which are present in both HES and DID. These patient identifiable fields are then removed before the data are made available in a pseudonymised form.
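As an illustration only, stepwise matching on shared identifiers can be sketched as follows. The field names (`nhs_number`, `dob`, `sex`, `postcode`), the two match stages and the `match_records` helper are assumptions made for this example. The actual algorithm and its stages are described in the HES-DID Data Matching Quality Report.

```python
# Hypothetical identifier fields present in both data sets.
IDENTIFIERS = {"nhs_number", "dob", "sex", "postcode"}

# Hypothetical match stages, strictest first. The real stages are defined
# in the HES-DID Data Matching Quality Report, not here.
STAGES = [
    ("nhs_number", "dob", "sex", "postcode"),  # rank 1: all four fields agree
    ("nhs_number", "dob"),                     # rank 2: NHS number and date of birth
]


def build_index(hes_index, fields):
    """Map each complete combination of identifier values to a HES ID.

    Assumes each combination identifies one person; a later duplicate
    would overwrite an earlier one.
    """
    index = {}
    for person in hes_index:
        key = tuple(person.get(f) for f in fields)
        if None not in key:
            index[key] = person["hes_id"]
    return index


def match_records(did_records, hes_index):
    """Link each DID record to a HES ID at the first stage that matches."""
    indexes = [build_index(hes_index, fields) for fields in STAGES]
    linked = []
    for rec in did_records:
        hes_id, rank = None, None
        for stage_no, fields in enumerate(STAGES, start=1):
            key = tuple(rec.get(f) for f in fields)
            if None in key:
                continue  # a missing identifier cannot match at this stage
            found = indexes[stage_no - 1].get(key)
            if found is not None:
                hes_id, rank = found, stage_no
                break
        # Remove the identifiable fields before the record is released.
        output = {k: v for k, v in rec.items() if k not in IDENTIFIERS}
        output["hes_id"] = hes_id
        output["match_rank"] = rank
        linked.append(output)
    return linked
```

In this sketch, a record that fails every stage is kept with an empty HES ID and match rank, and the identifiable fields never appear in the output.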

The HES-DID extract

Extracts of the HES-DID linkage provide the DID record and the HES ID it is linked to, with additional columns for Match Rank and File Version. The match rank indicates at which stage in the linking process the records were matched (see the HES-DID Data Matching Quality Report [174kb]). The file version indicates which HES index and DID data have been used in the linkage.

Note: The HES Identifiers are pseudonymised so they are specific to each organisation that requests the linked file. This ensures that extracts can be matched to HES extracts the organisation has already received.
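The source does not describe how the organisation-specific identifiers are generated. As a minimal sketch of one common technique, a keyed hash (HMAC-SHA256) with a secret key per organisation gives the property described above: the same HES ID always maps to the same pseudonym for one organisation, and to a different pseudonym for another. The `pseudonymise` function and the keys are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(hes_id: str, organisation_key: bytes) -> str:
    """Return a keyed hash of the HES ID that is stable for one organisation."""
    digest = hmac.new(organisation_key, hes_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical keys, one per requesting organisation.
key_org_a = b"secret-key-for-organisation-a"
key_org_b = b"secret-key-for-organisation-b"

# The same organisation always receives the same pseudonym, so a new linked
# extract can be joined to a HES extract that organisation already holds.
same_org_first = pseudonymise("example-hes-id", key_org_a)
same_org_second = pseudonymise("example-hes-id", key_org_a)

# A different organisation receives a different pseudonym for the same person.
other_org = pseudonymise("example-hes-id", key_org_b)
```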

The table below demonstrates how the additional columns in the linked HES-DID extract appear.

Block of DID Data (see the Diagnostic Imaging Dataset Data Dictionary [232kb]) | Match Rank | File Version

An 'Old_New_HESID' file is also provided. This file lists any HES ID that has changed over time. For more information, see the Data Matching Quality Report [174kb].
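A file of changed identifiers like this can be used to bring previously received records up to date. The sketch below assumes the file has been loaded into a dictionary from old HES ID to new HES ID, and that an ID may have changed more than once. The field name `hes_id` and both helper functions are hypothetical.

```python
def resolve_hes_id(hes_id, old_to_new):
    """Follow old-to-new mappings until the current HES ID is reached."""
    seen = set()
    # The 'seen' set stops the loop if the mapping ever contains a cycle.
    while hes_id in old_to_new and hes_id not in seen:
        seen.add(hes_id)
        hes_id = old_to_new[hes_id]
    return hes_id


def update_hes_ids(rows, old_to_new, id_field="hes_id"):
    """Return copies of the rows with any outdated HES ID replaced."""
    return [
        {**row, id_field: resolve_hes_id(row[id_field], old_to_new)}
        for row in rows
    ]
```

For example, with the mapping `{"A": "B", "B": "C"}`, a row holding HES ID `A` is updated to `C`, while a row whose ID does not appear in the mapping is returned unchanged.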







Accessing the data

Data can only be made available to those who meet the HSCIC's robust Information Governance standards, which protect and control how data are managed, and who, where applicable, have an appropriate legal basis in place. We only provide identifiable data when there is a lawful basis to do so.

The HES-DID linkage is available as a standard extract, as a bespoke extract, or for bespoke data linkage to other data sets or third-party data. For further information, please visit the Data Access Request Service (DARS) page.

To access the HES-DID linkage, customers are required to complete an .

The DID is also available on its own as a standard extract, as a bespoke extract, or for bespoke data linkage to other data sets or third-party data.

Data quality

The HES-DID Data Matching Quality Report [174kb] details the linkage matching algorithm and the percentage of DID records that have been successfully matched to HES. It includes the number of matches identified at each stage of the matching process. The report also includes HES and DID Validation and Derivation Rules.

When a new statistical publication is released for HES or the DID, an associated Data Quality Report is produced. The most recent information about HES data quality can be found with the latest publication in the HSCIC publications calendar. This includes a Background Quality Statement about different aspects of data quality and provider-level data quality measures. The DID Data Quality Report is managed and produced by NHS England.

Extract timetable

Extracts from the HES-DID linkage available from 4 September 2013 include HES and DID data up to April 2013. The HES-DID linkage will be updated on a monthly basis and the data in the extract will reflect submitted data from both data sets that are three months in arrears. This means that each linked release can update matches in previous linked extracts.
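Because a later release can change the match for a record that appeared in an earlier extract, users holding several extracts may want to keep only the most recent match for each DID record. The sketch below shows one way to do this. It assumes a hypothetical record key `did_record_id` and assumes that file versions can be ordered, represented here as integers. The real File Version format is not described in this document.

```python
def latest_matches(rows):
    """Keep, for each DID record, the row from the highest file version."""
    latest = {}
    for row in rows:
        key = row["did_record_id"]
        current = latest.get(key)
        if current is None or row["file_version"] > current["file_version"]:
            latest[key] = row
    return list(latest.values())
```

Applied to rows from two releases, a record that was unmatched in the first release and matched in the second would be returned once, with the match from the second release.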

Find out more about HES submission dates and Diagnostic Imaging Dataset submission dates.

Statistical publications

A statistical publication of a sample of linked HES-DID data was published on 2 October 2013.

Access the HES-DID publication in the publications catalogue.

Open data

The HSCIC releases the data underpinning its statistical publications in aggregate form wherever possible to support the Government's transparency agenda and encourage others to make use of the depth of data available.

HES open data can be accessed via the Standard Publications section of the 'What HES Data Is Available?' page.

DID open data can be accessed via the Data section of the NHS England DID web page.

Find out more about open data on the HSCIC website or visit the website.


