
NHS Maternity Statistics - data quality guidance

This guidance is to support both providers of maternity data and users of the annual NHS Maternity Statistics series, in understanding and explaining the quality of the data and identifying how to make improvements to it.

This guidance covers four main sections:

1. Data Sources

2. Tools available

3. Interpreting Data Quality issues

4. Solving Data Quality issues

Part 1 - Data Sources

Overview

The annual NHS Maternity Statistics publication primarily uses data from two data sets: the Hospital Episode Statistics (HES) data warehouse and the Maternity Service Data Set (MSDS). The data in these are submitted by providers and processed for analysis by NHS Digital. Both data sets are secondary uses data sets which re-use clinical and operational data for purposes other than direct patient care.

The MSDS is a national-level data set built from submissions by NHS-funded maternity services. It provides information to help with monitoring outcomes, commissioning, and addressing health inequalities. It defines the data items, definitions and associated value sets extracted or derived from local information systems. The MSDS also contributes to the monthly statistical publication. However, some measures in the monthly publication, such as the Clinical Quality Improvement Metrics (CQIMs) and Continuity of Carer (CofC) measures, are not included in the annual publication.

An updated version of this data set, MSDS v2.0, was implemented in April 2019 to meet requirements that resulted from the National Maternity Review, resulting in numerous changes being made to the contents and structure of the dataset. Because of this, the data from April 2019 onwards is not directly comparable to data from previous years.

The HES database contains details of all admissions, outpatient appointments and A&E attendances at NHS hospitals in England. HES A&E was retired in March 2020 and superseded by a new Emergency Care dataset, ECDS, which became the official source of A&E data from April 2020. Each HES record contains a wide range of information about an individual patient admitted to an NHS hospital, including clinical, patient, administrative and geographical information. The HES data relating to pregnancies and births are referred to as delivery and birth episodes, where an episode is defined as a period in which a patient receives care from one consultant at one provider. Only completed episodes are included in HES.

An additional dataset used in the annual maternity publication is the Neonatal Critical Care Minimum Dataset (NCCMDS). NCCMDS is also a secondary uses dataset and provides details of patients that receive Neonatal Critical Care. In the Annual NHS Maternity Statistics publication, data from NCCMDS is included in the Data Quality Neonatal Critical Care Analysis file alongside Maternity Services data to highlight data quality issues between the two datasets.

Measures

Measures in the annual report come from both MSDS and HES. Some of these measures are available from both datasets and some are unique to one specific data set. HES data is used for the national totals, as it has historically been more complete.

Measures common to both data sets include the total count of deliveries, the ethnicity of the mother, and the babies’ birthweight. There are also some measures which may appear similar but have subtle differences, for example, miscarriage and ectopic pregnancies are recorded in HES but only miscarriage is recorded in MSDS and so the figures may not be directly comparable.

A number of statistics are sourced only from MSDS and are not captured in HES. These include:

  • Apgar Score (health check for new-borns)
  • mother’s smoking status
  • skin to skin status (physical contact between the mother and the new-born)
  • first feed information (whether the baby was breastfed or not)

Further information on the data captured from MSDS can be found in the Technical Output Specification (TOS) and user guidance found on the tools and guidance webpage, and the Metadata file accompanying the 2020-21 annual publication.

Other measures in the annual publication, sourced from HES but not from MSDS, include:

  • the status of the person conducting the delivery (midwife, doctor, etc.)
  • which anaesthetic, if any, was used during delivery
  • the length of antenatal and postnatal hospital stays
  • the number of babies delivered at the end of a pregnancy (single baby, twins, etc.)
  • the number of miscarriages and ectopic pregnancies

Further information on the data captured by HES can be found in the Technical Output Specification (HES TOS), and the Metadata file accompanying the 2020-21 annual publication.

Issues with data quality

For NHS Digital to provide the best possible analysis for all users of our data, we require high quality data to work with. An accurate picture of maternity services allows clinicians, commissioners, and others to act in the most informed manner possible, ultimately leading to improved outcomes for patients. It also allows for better transparency around the work carried out by the NHS.

There are several known data quality issues with MSDS. For example, the number of providers submitting valid data for each data table and data item can vary. This is somewhat expected for a relatively recently revised national-level data set. However, non-response from providers has in turn reduced the geographical coverage of the data set, leading to less reliable figures at levels higher than individual provider level.

Known HES DQ issues are documented on the HES processing cycle and data quality page.

Detailed MSDS Data Quality Statements can be found included with each annual NHS Maternity Statistics publication, alongside a CSV file which presents an analysis of the data quality of the submissions from MSDS from maternity service providers within the reporting period. This is named the MSDS Data Quality file and can be found in the Resources section of each publication.

If you do encounter problems with your MSDS data which you cannot resolve using the resources outlined in this guidance, you can contact the national Maternity Services Data Quality Team.

Broader information on data quality across the NHS is presented in our data quality page.

Improving your Trust’s data quality should enable you to have a more comprehensive understanding of how your service is operating and your outcomes. Participating trusts are also assessed on their MSDS data quality as part of the Maternity Incentive Scheme organised by NHS Resolution.

If your Trust is within scope of Maternity Services and does not currently submit to the Maternity Services Dataset, find out how you can register to submit data as your first action towards meeting Maternity Incentive Scheme requirements.

Improving your trust’s submitted data should also reduce the differences seen between related HES and MSDS published figures, and so enable the results of analysis of both datasets to be used together to gain a deeper understanding of your service and patients, and the impacts of the actions you take.

Some MSDS measures represented in the NHS Maternity Statistics annual publication series are directly or closely comparable to the measures in the Maternity Services Monthly Statistics publication series, and its accompanying national Maternity Services Dashboard.

Annual dimension | Annual measure | Monthly measure
ComplexSocialFactorsInd | All | Complex social factor
SmokingAtBooking | Smoker | CQIMSmokingBooking: Women who were current smokers at booking
BabyFirstFeedBreastMilkStatus | Maternal or Donor Breast Milk | CQIMBreastFeeding: Babies with a first feed of breast milk
ApgarScore5TermGroup7 | 0 to 6 | CQIMApgar: Babies with an APGAR score between 0 and 6 (rate per 1000)
PreviousCaesareanSectionsGroup | Zero previous births | CQIMRobsonGroup1: women in RG1 having a caesarean section with no previous births (Percent)
PreviousCaesareanSectionsGroup | Zero previous births | CQIMRobsonGroup2: women in RG2 having a caesarean section with no previous births (Percent)
PreviousCaesareanSectionsGroup | At least one Caesarean | CQIMRobsonGroup2: women in RG5 having a caesarean section with at least one previous birth (Percent)

 


Part 2 - Tools available

This section is intended to provide you with an overview of the tools and resources available to help identify data quality issues, for both providers of maternity data and users of the annual NHS Maternity Statistics publication.

When addressing data quality concerns, it is always important to ensure that clear, robust communication is in place between providers of clinical services and those submitting the data. For example, the process of making MSDS data submissions should involve all those providing care to the women and babies, those entering the data, and those making the submission.

Data included in the NHS Digital statistical publications should also be cross-referenced against local data to ensure the HES and MSDS records are providing an accurate picture of the service provided.

Maternity Services Data Set (MSDS)

Data for the maternity services data set (MSDS) is supplied to NHS Digital via the cloud-based strategic data collection services (SDCS Cloud).

There is dedicated guidance available on the SDCS submission process, and on submitting data for the MSDS.

Data submitted via the SDCS goes through a series of automatic validation checks. Some submissions are rejected at submission if they do not fulfil the submission criteria, as specified in the MSDS Technical Output Specification (TOS). Upon submission, providers also receive a validation report detailing errors and record rejections to flag these issues to the provider. Further information on the data quality checks that are part of the SDCS can be found in the SDCS Data Quality guidance. Additionally, there is a tool designed specifically to help providers of data for MSDS to better understand the validation reports they receive, the MSDS Data Quality Submission Summary Tool.

Sharing information with maternity services about the rejections and warnings received at the point of data submission can be valuable in ensuring accurate data corrections are made, and to support a greater understanding of common data quality themes which could be addressed through collaboration and clearer data entry guidance.

Finally, there is a MSDS validation report released as part of the annual NHS Maternity Statistics publication which groups submitted data into categories (Valid, Default, Invalid, and Missing) based on whether the data conforms to the validation requirements. This data is provider-level and can be found in the Resources section of the annual publication.

Interactive dashboards

A new Data Quality Dashboard for the MSDS presents information about the quality of data submitted each month and includes provider-level data quality information to help users understand the impact of local issues and to support data quality improvements.

There are also new additions to the annual NHS Maternity Dashboard attached to the annual publication, to help providers identify data quality issues across both HES and MSDS data sets. Existing pages have been improved, for example to highlight when a provider’s MSDS total deliveries count is significantly below its HES equivalent. New pages enable users to compare the number of records submitted to HES and MSDS for each measure present in both data sets, and to see what proportion of the data submitted for a measure is missing meaningful values.

Maternity Services Data Set - Data Quality Dashboard

This new Data Quality Dashboard for the MSDS presents information about the quality of data submitted each month.

NHS Maternity Statistics - Dashboard

The Annual Maternity Dashboard includes information to help providers identify data quality issues across both HES and MSDS data sets.

Hospital Episode Statistics (HES)

Data for the hospital episode statistics (HES) data set is submitted directly to secondary uses service (SUS) within NHS Digital by providers, from the information recorded for clinical purposes by hospitals and other healthcare providers. From here, the HES data set is produced by monthly extraction of data from SUS. Learn more about the collection process.

Upon receipt of data for the HES data set, a number of automated cleaning rules are applied to the data. Data quality reports and checks are completed at various stages in the cleaning and processing cycle. More information on data quality as it relates to HES, including a monthly publication of known data quality issues, is available.

Additional resources for clinicians on the correct use of clinical codes and the importance of good quality data are provided.

Monthly provisional and annual reports produced from HES can be found under the HES Publications section.

Other resources

There is also reporting on the quality of data across NHS data sets in the form of the Data Quality Maturity Index (DQMI) monthly publication, which provides data submitters with timely and transparent information on the state of their data quality. Further information about the DQMI publication, both current and historical, and the methodologies used in its construction are available.


Part 3 - Interpreting data quality issues

This section is intended to help data providers understand their own data quality issues, and to help guide readers of the annual NHS Maternity Statistics publication to resources that can be used to interpret the quality of the data contained within the publication and therefore better understand the possibilities and limitations of the data.

It is also intended to support data submitters in understanding how to identify and understand the different data quality issues that can arise within the maternity data. As part of local data quality improvement work, it is essential that data submitters liaise with their local maternity services directly to rectify issues with maternity data input and ensure the dataset records are providing an accurate picture of the service provided and the care women and babies receive.

Understanding data quality: MSDS data quality feedback

In order to identify problems with the quality of data, feedback is provided at the point the data is submitted to the SDCS Cloud portal and further data quality checks are run within NHS Digital as the data is processed and cleaned ready for incorporation into the monthly and annual publications:

  • Providers receive immediate feedback on the quality of their submission in a file containing validation reports. This file includes record-level reports of any submission errors, intended to give the data providers detailed information about which records caused which errors. Providers should then be able to address the specific errors highlighted and resubmit a more accurate data return. Data files can be submitted as many times as necessary during the submission window for each publication month. This is approximately a two-month window.  Find out the MSDS publication dates for each month.
  • Providing maternity services with sight of the data submission rejections and warnings helps ensure accurate corrections are made, and that any data quality themes that emerge can be addressed more comprehensively in how care is recorded.
  • A variety of data quality checks are then run on the processed data as part of the validation and load process for monthly data, prior to production of the Maternity Services Monthly Statistics publication. These validated and cleaned monthly datasets are also used for the annual NHS Maternity Statistics publication. Where there are notable concerns about data quality, we contact providers directly so that any issues with local data extraction processes can be addressed for future submissions.

Understanding data quality: What MSDS data is included?

The Maternity Services Dataset (MSDS) is structured by linking data recorded in many different tables to connect all the information related to a specific pregnancy. This linkage is typically, but not exclusively, via a unique pregnancy identifier. The MSDS receives the populated data tables via monthly submissions from providers, as described in previous sections of this guidance.

A detailed explanation of which MSDS records are included in the annual publication can be found on the 'Records Included' worksheet in the MSDS Metadata file, which is published as part of the annual publication and can be found in the list of resources.

Understanding data quality: Which MSDS issues are most important?

For MSDS data providers, data quality issues within the submission can be prioritised based on:

  • whether it is a warning or a validation failure
  • whether the data item is mandatory or required
  • and at which level the validation issue has occurred.

Each of these criteria will be outlined below.

Data that has successfully been submitted can also then be reviewed in the later statistical publication outputs such as the NHS Maternity Statistics publication, to check for instances of useable but incomplete data such as that shown in the MSDS measures counts of records with a “Missing value / Value outside reporting parameters”.

Understanding “warnings” and “failures”

At submission, data quality issues can be broadly divided into “failures” and “warnings”. Records with warnings will still be loaded on submission provided there isn't an overriding validation failure for the same record. Validation failures may reject the entire file, groups of records (that depend on a parent row that has already been rejected) or individual records. More information can be found by referencing the validation code located in the Technical Output Specification for the data set.
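
The rule that warnings do not stop a record loading, but a failure does, can be sketched as follows. This is a minimal illustration only: the `Issue` class, the record identifiers and the `EX-...` codes are hypothetical and are not taken from the TOS, and the sketch does not model file-level rejections or rejections that cascade from a parent row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    record_id: str  # hypothetical local record key
    severity: str   # "warning" or "failure"
    code: str       # hypothetical validation code, not a real TOS code


def records_loaded(record_ids, issues):
    """Return the record ids that would still be loaded.

    A record is dropped only if at least one failure is raised against it.
    Warnings on their own do not stop the record loading.
    """
    failed = {issue.record_id for issue in issues if issue.severity == "failure"}
    return [record_id for record_id in record_ids if record_id not in failed]


issues = [
    Issue("R1", "warning", "EX-W01"),
    Issue("R2", "warning", "EX-W01"),
    Issue("R2", "failure", "EX-F01"),  # overrides the warning on R2
]
loaded = records_loaded(["R1", "R2", "R3"], issues)
print(loaded)
```

Here R1 loads despite its warning, R2 is dropped because of the overriding failure, and R3 loads with no issues.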

Mandatory data items and tables

Some data quality issues occur because important data items are either not recorded or recorded in the wrong data format. Whether each data item must be included or not is outlined in the mandatory/required/optional column in the Technical Output Specification (TOS).

Mandatory: These data items MUST be reported without exception. Failure to submit these items, including all items in a mandatory table, will result in the rejection of the record.

Required: These data items MUST be reported where they apply. It is a legal and contractual requirement to submit these data items where the service has been provided to a patient. Failure to submit these items will not result in the rejection of the record but may affect the derivation of national indicators or national analysis. Please note that the purpose of the data set is not to change clinical practice.

Optional: These data items MAY be submitted on an optional basis at the submitter’s discretion.

Derived: These data items are derived during pre and/or post deadline processing for inclusion in the extracts made available for download. Please note that these are not for submission to the Submission Portal and are not included in the submission file. These items are also greyed out in the TOS.

Whilst a particular table itself may not be mandatory, if a record is entered into that table, then all of the table’s mandatory fields must be completed.
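
As a rough illustration of how these requirement levels differ in effect, the sketch below checks one record against a small set of made-up fields. The field names and the `REQUIREMENTS` mapping are hypothetical, and the sketch assumes every "required" item applies to the record, which in practice depends on the care provided. The TOS remains the authority on which items fall into which category.

```python
# Hypothetical requirement levels for a handful of made-up field names.
REQUIREMENTS = {
    "pregnancy_id": "mandatory",
    "booking_date": "mandatory",
    "smoking_status": "required",
    "comment": "optional",
}


def check_record(record, requirements=REQUIREMENTS):
    """Return (accepted, messages) for one record (a dict of field -> value)."""
    accepted = True
    messages = []
    for field, level in requirements.items():
        if record.get(field) not in (None, ""):
            continue  # the item is populated, nothing to report
        if level == "mandatory":
            # A missing mandatory item rejects the record.
            accepted = False
            messages.append(f"reject: mandatory item {field} is missing")
        elif level == "required":
            # A missing required item does not reject the record,
            # but it may affect later indicators and analysis.
            messages.append(f"note: required item {field} is missing")
        # Optional items may be left blank without any message.
    return accepted, messages


ok, ok_messages = check_record({"pregnancy_id": "P1", "booking_date": "2021-04-01"})
bad, bad_messages = check_record({"pregnancy_id": "P2", "smoking_status": "smoker"})
print(ok, ok_messages)
print(bad, bad_messages)
```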

Understanding validation levels

As well as the different levels of requirement for each field, data quality issues can be organised by the level which they occur at.

1. File level

File-level rejections are validation errors that highlight specific data quality issues which have caused the whole submission file to be rejected or issued a warning message. These checks are made on all the records submitted by the data sender. A rejection would be of the entire submission against the selected reporting period, requiring the identified issue(s) to be rectified and a resubmission made. These can be found in the File-Level Rejects tab of the TOS.

Example: MSDREJ002 - Failed Content Check. Mother's Demographics Table is empty.

2. Table/group level

These compare records within or across multiple tables, leading to rejection of multiple records or a warning message being displayed. Certain records in different tables may be linked by certain keys (such as LocalPatientId). These are considered ‘groups’ and, to protect data integrity, where one record fails validation, any linked records within that ‘group’ will also fail validation. This can be used to check referential integrity between tables or for duplicated records within a table. Rejected records would not progress to post deadline processing. Records with warnings would progress, but data quality would not be as required. These can be found in the Individual table tabs of the TOS.

There are two types of group-level rejections: “No valid group submitted” and “More than one group submitted”. “No valid group submitted” errors occur either because records in a table are rejected due to the corresponding records from the parent table being rejected due to a validation error, or because there are no corresponding records submitted in the parent table.

“More than one group submitted” errors will be triggered if duplicate records (records with same values in key columns) are submitted. All duplicate records will be rejected with group-level rejection error.

Example: The MSD001 group will be rejected if there is no valid MSD002 group transmitted for this LOCAL PATIENT IDENTIFIER (EXTENDED (MOTHER)).

3. Record level

Record-level rejections are validation errors that highlight a data issue in a specific column which has caused an error with the whole record. These can be against a single data item or across multiple data items within a single record, leading to either the rejection of the record or a warning displayed. Rejected records would not progress to post deadline processing. Records with warnings would progress, but data quality would not be as required. These can also be found in the Individual table tabs of the TOS.

Example: If PERSON DEATH DATE (MOTHER) is before the PERSON BIRTH DATE (MOTHER) the record will be rejected.

Each data item within the data set specification may have any of the above types of validation.
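
The three validation levels can be pictured with a simplified sketch. It assumes a hypothetical parent table of mothers and a child table of pregnancies, linked on a made-up `patient_id` key. The real MSDS tables, keys and rules are defined in the TOS and are considerably more detailed than this.

```python
from collections import Counter
from datetime import date


def validate(mothers, pregnancies):
    """Apply simplified file-, group- and record-level checks.

    mothers: list of dicts with "patient_id", "birth_date", "death_date"
    pregnancies: list of dicts with "patient_id", "pregnancy_id"
    """
    # File level: an empty demographics table rejects the whole submission.
    if not mothers:
        return {"file_rejected": True, "mothers": [], "pregnancies": [], "rejected": []}

    rejected = []
    key_counts = Counter(m["patient_id"] for m in mothers)
    valid_mothers = []
    for m in mothers:
        if key_counts[m["patient_id"]] > 1:
            # Group level: all rows sharing a key are rejected as duplicates.
            rejected.append(("mother", m["patient_id"], "more than one group submitted"))
        elif m["death_date"] is not None and m["death_date"] < m["birth_date"]:
            # Record level: the death date must not precede the birth date.
            rejected.append(("mother", m["patient_id"], "death date before birth date"))
        else:
            valid_mothers.append(m)

    # Group level: child rows need a valid row in the parent table.
    valid_ids = {m["patient_id"] for m in valid_mothers}
    valid_pregnancies = []
    for p in pregnancies:
        if p["patient_id"] in valid_ids:
            valid_pregnancies.append(p)
        else:
            rejected.append(("pregnancy", p["pregnancy_id"], "no valid group submitted"))

    return {
        "file_rejected": False,
        "mothers": valid_mothers,
        "pregnancies": valid_pregnancies,
        "rejected": rejected,
    }


mothers = [
    {"patient_id": "A", "birth_date": date(1990, 1, 1), "death_date": None},
    {"patient_id": "B", "birth_date": date(1985, 5, 5), "death_date": date(1980, 1, 1)},
    {"patient_id": "C", "birth_date": date(1992, 3, 3), "death_date": None},
    {"patient_id": "C", "birth_date": date(1992, 3, 3), "death_date": None},
]
pregnancies = [
    {"patient_id": "A", "pregnancy_id": "PA"},
    {"patient_id": "B", "pregnancy_id": "PB"},  # parent row rejected
    {"patient_id": "Z", "pregnancy_id": "PZ"},  # no parent row submitted
]
result = validate(mothers, pregnancies)
for row in result["rejected"]:
    print(row)
```

Note how a single record-level rejection in the parent table (mother B) causes the linked child row to be rejected at group level as well.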

Please see the validations and warnings in the MSDS v2.0.25 TOS to understand the submission requirements for each table. The “Validation Rules” columns outline the date restrictions, and corresponding messages are shown in the “Error/Warning Messages” column. A description of each table can be found in section 6 of the MSDS v2.0 User Guidance.

Understanding data quality: Common mistakes and things to look out for

Some general pointers for data providers to consider are listed below. It is also always important for data submitters to liaise with their local maternity services as part of rectifying issues with data input:

  • Group-level rejections are one of the most common data quality issues highlighted in DQ reports.
  • Invalid format record-level rejections: in many cases, the root cause may be the way the data was imported into the submitted Access database, causing the ‘leading zero’ to be dropped (for example ‘2’ submitted rather than ‘02’).
  • An issue relating to an invalid date format could be due to the way data was imported into the submitted Access database whereby the date format is changed automatically during import.
  • Many warnings and rejections can be caused by incorrect values being submitted for organisation-related data items. This could be due to an issue at the point of data collection. The legacy Bureau Service Portal (BSP) may have historically allowed expired codes, or in certain cases, it may have accepted certain data without issuing warnings or rejections, meaning that some issues may not have been identified.
  • Errors can be caused by the incorrect usage of SNOMED CT codes. Guidance on the use of SNOMED CT can be found in the SNOMED CT Mapping guidance tool.
  • In certain cases, where a data item value is populated, a corresponding value is required in another data item field.
  • If a large amount of data is submitted outside of the required date range, then numerous rejection messages will be generated back to the provider. This may hinder the provider’s ability to identify 'real' rejection messages that require corrections to be made to “included” data. Users are advised to check the date validation rules prior to submission to identify and submit data that is relevant to the reporting period only.
  • When making a submission it is good practice for providers to access the pre-deadline extract, as this will show exactly what records have been accepted for each table.
  • Tables and records once accepted for submission can still contain omissions which affect the quality of later measures of the service and care provided. The statistical publications and data quality validation tools, including interactive dashboards, mentioned in this guidance can assist with identifying where these gaps are and what the impact of them has been. They can then help data submitters identify what changes are needed to correct the issues for future submissions.
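
Several of the pointers above (dropped leading zeros, changed date formats, and data outside the reporting period) lend themselves to simple local checks before submission. The sketch below is an assumption-laden illustration: the code widths, the two date formats tried, and the function names are all hypothetical, and the TOS defines the formats that the submission actually requires.

```python
from datetime import date, datetime


def restore_leading_zeros(code, width):
    """Left-pad a purely numeric code that lost its leading zeros on import."""
    text = str(code).strip()
    return text.zfill(width) if text.isdigit() else text


def parse_date(text, formats=("%Y-%m-%d", "%d/%m/%Y")):
    """Parse a date written in one of the assumed formats, or return None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # try the next assumed format
    return None


def in_reporting_period(day, start, end):
    """True if a parsed date falls inside the reporting period (inclusive)."""
    return day is not None and start <= day <= end


print(restore_leading_zeros(2, 2))
print(parse_date("01/04/2021"))
print(in_reporting_period(parse_date("2021-05-02"), date(2021, 4, 1), date(2021, 4, 30)))
```

Values that `parse_date` cannot read come back as `None`, so they can be listed and corrected at source rather than silently guessed.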

Understanding data quality: Annual publication figures

As part of the monthly and annual publications using maternity data, a CSV 'Data Quality' file is provided which contains information on the data quality of the MSDS submissions from maternity service providers. This file provides a count of the valid records within a field, as well as the missing and invalid records and the data items which have been left as the default value. This also breaks down the data by item, provider and regions and can be used to assess the limitations of the data used in the annual publication.

This can be used in conjunction with a review of local reporting to identify and understand where data completion and quality issues have arisen.
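
One way to work with a file of this kind is to turn the Valid, Default, Invalid and Missing counts into a percentage of valid records per provider and data item. The sketch below uses a small inline sample with hypothetical column names and made-up figures. The layout of the published Data Quality CSV may differ, so the column names would need adjusting to match it.

```python
import csv
import io

# Inline sample with hypothetical column names and made-up counts.
SAMPLE = """provider,data_item,valid,default,invalid,missing
PROV1,SmokingStatus,80,10,5,5
PROV1,BirthWeight,45,0,0,5
PROV2,SmokingStatus,0,0,0,0
"""


def percent_valid(csv_text):
    """Return {(provider, data_item): percentage of records that are valid}."""
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts = [int(row[key]) for key in ("valid", "default", "invalid", "missing")]
        total = sum(counts)
        # No records at all: report None rather than dividing by zero.
        share = round(100 * counts[0] / total, 1) if total else None
        results[(row["provider"], row["data_item"])] = share
    return results


shares = percent_valid(SAMPLE)
for key, share in shares.items():
    print(key, share)
```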

Comparisons of the data taken from HES and MSDS can also be broadly used to assess the completeness of the MSDS submissions, as HES is generally considered to be more complete and can thus act as a benchmark against which to measure the completeness of MSDS. However, this greater completion may not apply within every provider and region and relying on it for data quality assessments can risk masking separate data quality issues within HES data. There are also rare provider-level cases where the MSDS figures may in fact be more complete than HES. These comparisons can therefore be a useful starting point for understanding the quality of maternity data, but differences once identified should be investigated thoroughly to understand their underlying causes. Additionally, measure-level comparisons of HES and MSDS data can cause issues when similar sounding measures are in fact constructed differently, meaning that they will inherently lead to different resulting measurements. These MSDS and HES comparisons are available in the annual NHS Maternity Statistics Interactive Dashboard, including for specific measures where relevant. It is important to refer to the metadata information for both datasets when making such comparisons.
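
A simple form of this comparison is the ratio of MSDS deliveries to HES deliveries for each provider. The sketch below is illustrative only: the provider codes and counts are made up, and the 0.9 threshold is an arbitrary assumption rather than a published rule. A flag in either direction is a prompt to investigate, not evidence of which data set is correct.

```python
def coverage_flags(hes_deliveries, msds_deliveries, threshold=0.9):
    """Compare MSDS delivery counts with HES counts for each provider.

    Both arguments are dicts of provider code -> count of deliveries.
    The threshold is an assumed tolerance, not an official value.
    """
    flags = {}
    for provider, hes_count in hes_deliveries.items():
        msds_count = msds_deliveries.get(provider, 0)
        if hes_count == 0:
            flags[provider] = (None, "no HES deliveries to compare against")
            continue
        ratio = msds_count / hes_count
        if ratio < threshold:
            status = "MSDS below HES: investigate"
        elif ratio > 1 / threshold:
            status = "MSDS above HES: investigate"
        else:
            status = "broadly consistent"
        flags[provider] = (round(ratio, 2), status)
    return flags


hes = {"PROV1": 1000, "PROV2": 500, "PROV3": 400}
msds = {"PROV1": 980, "PROV2": 300, "PROV3": 460}
flags = coverage_flags(hes, msds)
for provider, (ratio, status) in flags.items():
    print(provider, ratio, status)
```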

Detailed statements on the quality of data from both HES and MSDS can be found in the data quality statements within the written report, which are provided separately for MSDS data and for HES data.

As stated within the MSDS data quality statement, users of the maternity data must make their own assessment of the quality of the data for a particular purpose, drawing on these resources. In addition, local knowledge or other comparative data sources may be required to distinguish changes in data volumes between reporting periods which reflect changes in actual service delivery, from those that are an artefact of changes in the underlying data quality.


Part 4 - Solving data quality issues

This section is intended to inform providers of maternity data, especially for the maternity service data set (MSDS), on how to resolve data quality issues and monitor the outcomes of such resolutions.

Alongside the practical steps outlined below, we would always encourage data submitters to include maternity service providers in these discussions and investigations and ensure they compare submitted and published data to locally held data. This should support a clearer understanding of how front-line care is translated into data inputting, and how the resulting data submissions are translated into publication outputs.

What can be done?

MSDS file rejections

If the submission file is rejected, a new one must be submitted. Details of how to submit the data can be found in the SDCS Submission guidance, and further guidance on the data quality reports received after submission can be found within the DQ guide. If you are having difficulty understanding the data quality report, the maternity service Data Quality Submission Summary Tool is designed to improve its readability and can tell you where these data quality issues may impact upon the various metrics being measured as part of the Clinical Negligence Scheme for Trusts (CNST). During the submission window, providers can make as many submissions as are needed. However, new submissions within the window will overwrite old submissions. More information on submitting data for the MSDS to the SDCS is available.

If you are having difficulty identifying which records to submit, further information on what data to submit to the MSDS can be found in the Technical Output Specification (TOS) in the MSDS implementation guidance, as well as the MSDS information standard which sets out requirements for the collection and submission of operational and clinical data relating to each stage of the maternity care pathway.

If problems with making your MSDS data submission persist, contact the national Maternity Services Data Quality Team.

Data quality issues identified within the publication

The data published as part of the annual NHS Maternity Statistics series may highlight data quality errors at your provider via either the specific data quality centric resources (the data quality statements, the MSDS Data Quality CSV, the MSDS Data Quality Dashboard, and the annual NHS Maternity Interactive Dashboard) or through missing values or other discrepancies visible in the publication output tables and files.

Additional information may be available in the relevant section of the publication which you can use to identify the potential underlying causes of the data quality issue, and you can also refer to the metadata files for both HES and MSDS datasets to look further into how the relevant measures are built and which data fields are used to construct them.

If you would like further help when understanding and rectifying data quality issues identified this way, contact the national Maternity Services Data Quality Team and include a description of the suspected problem and your actions so far.

Monitoring results

If you have taken action to improve the quality of your submitted data and are wondering how to monitor the results, a number of different tools can be used to do this.

The strategic data collection service (SDCS) used to submit data to the MSDS provides the most immediate feedback on data quality. The aggregations of total Submitted, Rejected and Accepted records provided on the dashboard can be used as an overview of your data quality and you may want to record these numbers yourself to track your progress between submissions. The SDCS also produces a data quality report upon submission of files which can be used to monitor more specific data quality issues with your submission. More on understanding the data quality reports is available within the DQ guide. There is also the Data Quality Submission Summary Tool which can be used to help understand the error codes given in these reports.
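
If you do record the Submitted and Rejected totals for each submission, a rejection rate gives a quick view of whether things are improving. The sketch below assumes you keep these totals yourself as simple tuples. The labels and figures are made up, and nothing here reads from the SDCS itself.

```python
def rejection_rates(history):
    """history: list of (label, submitted, rejected) tuples in time order.

    Returns a list of (label, rejection rate), with None where nothing
    was submitted.
    """
    rates = []
    for label, submitted, rejected in history:
        rate = rejected / submitted if submitted else None
        rates.append((label, rate))
    return rates


def improving(history):
    """True if the rejection rate never rises from one submission to the next."""
    rates = [rate for _, rate in rejection_rates(history) if rate is not None]
    return all(later <= earlier for earlier, later in zip(rates, rates[1:]))


# Made-up totals noted down after three monthly submissions.
history = [
    ("2021-04", 1000, 120),
    ("2021-05", 1100, 66),
    ("2021-06", 1050, 21),
]
rates = rejection_rates(history)
for label, rate in rates:
    print(label, f"{rate:.1%}")
print("improving:", improving(history))
```

A falling rejection rate only shows that more records are being accepted. Accepted records can still contain gaps, so the published data quality reports remain worth checking alongside it.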

The DQMI monthly publication and interactive dashboard can also be used to monitor your data quality over time. This page includes a link to the methodology of the calculations used to give a percentage score for the data submitted by each provider. The interactive dashboard also shows where this percentage sits relative to the national average.

The quality of regional and provider-level data submitted to the MSDS can also be reviewed over time using the Data Quality Dashboard for the MSDS.

The Data Quality report published as part of the Maternity Services Monthly Statistics can also be used to monitor your data quality. It provides a count of Valid, Default, Invalid and Missing records per data item and organisation.

The annual NHS Maternity Statistics publication also includes a similar report which may also be useful for year-to-year monitoring. The publication outputs themselves, such as overall counts of deliveries and the numbers published for specific measures, can also be used to track year-to-year or month-to-month how the data completion and overall quality is changing. These reports and other publication files are typically produced in a similar format for each publication and you should therefore be able to make direct comparisons between earlier and later publications when tracking data quality issues, including for specific measures and specific data fields.


Last edited: 12 May 2022 11:54 am