Hello, my name is Tracy and I work as a senior case officer in the Data Access Request Service team.
This video is one of a series of presentations designed to help you use our Data Access Request Service as effectively as possible. You can view the other videos in this series on our YouTube channel at the following address. NHS Digital has published a number of standards in relation to how we assess applications for data from NHS Digital. These are designed to be transparent and to help you in completing the relevant section of your online application. This presentation will provide detail on the agreed standard for completing the following section of the application: Data Minimisation.
When we refer to data minimisation, we're referring to the amount of data you are requesting. You should only request the amount of data that is specifically required so that you can complete your work. This is supported by GDPR Article 5(1)(c), which requires that personal data shall be:
“adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”.
In all cases the amount of data requested must be justified by the purpose stated within your application.
So how can you minimise the amount of data requested?
The datasets requested should be directly relevant to the purpose in your application. Some of the things to consider are:
Firstly, can you reduce the number of datasets that you have requested and still achieve the purpose?
Can you achieve the purpose in a less intrusive way? What we mean by this is, for example, if you've requested identifiable data, could the purpose be achieved by using either pseudonymised or anonymous data instead?
Something else to consider could be whether you need the number of years that you've requested. Could you reduce the number and still achieve the purpose? Again, this will need to be justified within the purpose section.
Other ways to minimise the data could be by geography or by demography, for example by age.
Can you minimise the data by clinical factors, either by diagnosis or by procedure? For example, why does a study on heart attacks need to know about kneecap replacements in that data as well?
Depending on the dataset you may be able to minimise the data by considering the episode or spell:
And things to consider here are:
Are all the patient's episodes required to achieve the purpose and, if so, why?
Are all elective episodes required to achieve the purpose?
Or are all the maternity episodes required to achieve the purpose? If they are, are the unborn child and neonatal records necessary and, if so, why are they necessary? Is there a time frame around the index events, for example a procedure or diagnosis, and if so, why?
Explain how the fields in each record are supported by the purpose. For the records requested, are all fields necessary to achieve the purpose? If so, why?
If identifiable or sensitive fields have been chosen, is it possible to reduce the risk of intrusion? For example, could you use a flag for 30-day mortality rather than the full date of death, or survival days, which will give an exact period in days but without the full date of birth or date of death? Or could you replace specific diagnosis codes with categories?
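To make that idea concrete, here is a minimal sketch in Python of swapping full dates for derived values. It is an illustration only and not how NHS Digital produces extracts. The field names (index_date, date_of_birth, date_of_death) and the example record are made up for the purpose of the example.

```python
from datetime import date

# Hypothetical field names, used for illustration only.
DATE_FIELDS = ("date_of_birth", "date_of_death", "index_date")


def reduce_date_fields(record):
    """Return a copy of the record with full dates replaced by derived values."""
    index_date = record["index_date"]
    death = record.get("date_of_death")

    # Survival days: an exact period in days, without exposing either date.
    survival_days = None if death is None else (death - index_date).days

    # Drop the full dates, keep everything else.
    reduced = {k: v for k, v in record.items() if k not in DATE_FIELDS}
    reduced["survival_days"] = survival_days
    # 30-day mortality flag: coarser than survival days, so less intrusive still.
    reduced["died_within_30_days"] = (
        survival_days is not None and survival_days <= 30
    )
    return reduced


# Invented example record.
example = {
    "patient_id": "P1",
    "date_of_birth": date(1950, 1, 1),
    "index_date": date(2020, 1, 1),
    "date_of_death": date(2020, 1, 21),
}
print(reduce_date_fields(example))
```

The sketch returns both derived values so they can be compared side by side. In an application you would ask only for the one your purpose needs, and the index date is dropped here because keeping it alongside survival days would let the date of death be worked out again.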
Additional justification will be required if you are selecting identifiable and/or sensitive data within your application.
For a data linkage, can additional filters be applied? For example, if HES data is linked to mental health data, you might only require HES records where there is an associated mental health record.
Another question is whether a HES cohort can be created to minimise the amount of data provided. For example, if you're interested in all episodes for patients with a specific diagnosis or procedure, we can find those people in the dataset and then provide all related episodes. An example of this would be where we would filter HES by a condition, and ONS mortality records would only be provided for patients who appear in the HES extract.
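The filtering just described can be sketched in a few lines of Python. This is a simplified illustration under assumed record layouts, not the process NHS Digital actually runs. The function name, the field names (patient_id, diagnosis) and the sample records are all hypothetical. The diagnosis values are ICD-10 style codes, with I21 standing for acute myocardial infarction.

```python
def build_cohort_extract(hes_episodes, mortality_records, diagnosis_prefix):
    """Filter HES by a condition, then limit linked data to that cohort."""
    # Step 1: find the people who have the diagnosis of interest.
    cohort_ids = {
        e["patient_id"]
        for e in hes_episodes
        if e["diagnosis"].startswith(diagnosis_prefix)
    }
    # Step 2: provide all related episodes for those people only.
    hes_extract = [e for e in hes_episodes if e["patient_id"] in cohort_ids]
    # Step 3: provide mortality records only for patients in the HES extract.
    mortality_extract = [
        m for m in mortality_records if m["patient_id"] in cohort_ids
    ]
    return hes_extract, mortality_extract


# Invented sample data.
hes = [
    {"patient_id": "A", "diagnosis": "I21.0"},  # heart attack
    {"patient_id": "A", "diagnosis": "M17.1"},  # later episode, same patient
    {"patient_id": "B", "diagnosis": "M17.1"},  # never in the cohort
]
mortality = [
    {"patient_id": "A", "survival_days": 20},
    {"patient_id": "B", "survival_days": 400},
]

hes_out, mortality_out = build_cohort_extract(hes, mortality, "I21")
print(len(hes_out), len(mortality_out))
```

With this sample data, both of patient A's episodes are kept and patient B's records are left out of both extracts.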
When submitting your application via DARS Online, after selecting each product, you will also be asked how you are looking to minimise the data. This is a free text box for you to explain how the data is being minimised.
Thank you for listening. We would welcome your feedback on this presentation. If you'd like to provide feedback, then please email email@example.com