# Recorded dementia diagnoses: supporting information

Guidance on using the statistical publication recorded dementia diagnoses, with details of what is included and the methodology used to calculate the estimated diagnosis rates.

### Introduction

The Department of Health and Social Care (DHSC), on behalf of the Secretary of State and NHS England (NHSE), have directed NHS Digital to establish a data collection in order to receive specific dementia diagnosis data to support the Prime Minister's Dementia Challenge. When NHS Digital receives such a direction, we issue a Data Provision Notice to the appropriate providers of the required data.

We collect and publish data about people with dementia at each GP practice so that the NHS (GPs and commissioners) can make informed choices about how to plan their services around their patients' needs.

### Measures

There are a number of measures used to assess the number of patients with dementia, and those who have had a formal diagnosis. We define these as follows:

#### Recorded prevalence

For each practice collected in this extract, NHS Digital receives a count of patients who have a diagnosis of dementia on the GP patient record as defined by the business rule. No personal identifiable data (PID) are collected through this mechanism; only an aggregate / total number of patients with a diagnosis at each practice, by five-year age band and gender.

NHS Digital also receives counts of patients registered at each practice. Again, these are non-PID aggregate / total counts for each practice.

Using these data, recorded prevalence for each practice can be calculated as follows:

Recorded dementia prevalence = (Number of patients on dementia register / Number of patients registered at practice) x 100
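
The formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the practice figures are made up, not part of the publication.

```python
def recorded_prevalence(dementia_register: int, registered_patients: int) -> float:
    """Recorded dementia prevalence as a percentage of registered patients."""
    if registered_patients <= 0:
        raise ValueError("registered_patients must be positive")
    return dementia_register / registered_patients * 100


# Hypothetical practice: 84 patients on the dementia register, 10,500 registered.
print(f"{recorded_prevalence(84, 10_500):.2f}%")  # 0.80%
```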

#### Ethnic group

NHS Digital also receives counts of patients registered at each practice by 20 ethnic groups. These are aggregated into 6 ethnic groups to make these data easier to interpret.

#### Assessments and care plans

GP practices also provide the count of patients up to the end of the reporting period who have received or declined: an assessment for dementia; an initial memory assessment; referral to a memory clinic; care plan or care plan review.

Data relating to Care Plans and Memory Assessments should not be compared with that previously collected under the GP Enhanced service, which ended in March 2015.

Data relating to assessments for dementia, memory assessments and clinics are presented as cumulative counts from the previous October. For example, the count of patients receiving a dementia assessment published in December 2017 will be the sum of assessments carried out in October, November and December 2017.

From the November 2018 release, counts of patients with a coded dementia diagnosis who have received a medication review in the preceding 12 months are also collected.

#### Prescribing of antipsychotics

GP Practices also supply the number of patients with a dementia diagnosis who have had a prescription of antipsychotic medication in the last 6 weeks. This is further divided into patients who have or do not have a diagnosis of psychosis.

To prevent the release of disclosive information, practice data with values less than five (including zero) are replaced by a “*” symbol. All other practice numbers are rounded to the nearest five. Other geography totals are not suppressed or rounded.
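
A minimal sketch of the suppression and rounding rule for practice-level counts, assuming counts are whole numbers. The function name is hypothetical and the input counts are invented for illustration.

```python
def disclosure_control(count: int) -> str:
    """Suppress practice-level counts below five; round the rest to the nearest five."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if count < 5:
        return "*"
    # Integer arithmetic: adding 2 before floor-dividing by 5 rounds to the
    # nearest multiple of five (whole counts can never fall exactly halfway).
    return str(5 * ((count + 2) // 5))


counts = [0, 4, 5, 7, 8, 12, 13]
print([disclosure_control(c) for c in counts])
# ['*', '*', '5', '5', '10', '10', '15']
```

Totals for other geographies would bypass this function, since they are neither suppressed nor rounded.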

#### Practices included in the data set

The GP Extraction Service (GPES) extracts data for practices that were open at the relevant date point, being the last day of the month for which data were extracted. We call this the ‘total estate’. This estate only includes those practices defined as a ‘GP Practice’ on the organisational reference data we hold. We do not include practices defined as walk-in centres, out of hours clinics, or prison prescribing cost centres. A further adjustment to the total estate from previous extracts is to also exclude ‘shared’ and ‘dormant’ practices.

Shared practices are those practices which share a clinical system. GPES cannot extract data from these practices, so they are excluded.

Dormant practices are those practices where the practice code has yet to be fully closed, but which have no associated clinicians.

The GPES extract is not instantaneous; it runs over a number of days – known as the “extract window”. Depending on the length of this window, GPES may not manage to collect data for all potential practices.

#### Indicator

The recorded dementia diagnoses data also contain an indicator called Dementia: 65+ Estimated Diagnosis Rate. More information can be found in the indicator specification section.

## Indicator specification

### Indicator title

Dementia: 65+ Estimated Diagnosis Rate

### Changes from previous versions

From 2017/18 this indicator methodology replaces those used previously by the domains listed below and will not produce comparable results when applied to the same source data. Where the new indicator time series overlaps with previously published periods, values will differ.

### Indicator family name

CCG Outcomes Indicator Set (OIS) Domain 2 – Enhancing the quality of life of people with long term conditions.

Public Health Outcomes Framework - Healthcare and premature mortality domain; mental health, dementia and neurology: Dementia profile

CCG Improvement and Assessment Framework – Better Care

NHS England Operational Information for Commissioning - Delivering the Forward View

### Condition/topic area

Long term conditions

### Detailed descriptor

#### Plain English description

Not everyone with dementia has a formal diagnosis. The indicator compares the number of people thought to have dementia with the number of people diagnosed with dementia, aged 65 and over. The target is for at least two thirds of people with dementia to be diagnosed.

#### Technical description

The rate of persons aged 65 and over with a recorded diagnosis of dementia per person estimated to have dementia given the characteristics of the population and the age and sex specific prevalence rates of the Cognitive Function and Ageing Study II, expressed as a percentage with 95% confidence intervals. Significance is determined by the non-overlapping of confidence intervals with the 66.7% benchmark.
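
The significance rule can be sketched as a comparison of the confidence limits with the benchmark. The function name and the wording of the three outcomes are assumptions made for illustration, as is the choice to treat a limit exactly equal to the benchmark as overlapping.

```python
BENCHMARK = 66.7  # percentage benchmark quoted in the technical description


def compare_with_benchmark(lower: float, upper: float, benchmark: float = BENCHMARK) -> str:
    """Classify an indicator value by whether its 95% interval overlaps the benchmark."""
    if lower > benchmark:
        return "significantly above benchmark"
    if upper < benchmark:
        return "significantly below benchmark"
    return "not significantly different from benchmark"


# Hypothetical confidence limits for three organisations.
print(compare_with_benchmark(68.1, 74.5))  # significantly above benchmark
print(compare_with_benchmark(58.0, 65.9))  # significantly below benchmark
print(compare_with_benchmark(63.2, 70.4))  # not significantly different from benchmark
```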

### Data Sources

#### Denominator: registered patients

Patients aged 65+ registered for General Medical Services, counts by 5-year age and sex band from the National Health Application and Infrastructure Services (NHAIS / Exeter) system; extracted on the first day of each month following the reporting period end date of the numerator. Source: NHS Digital.

#### Numerator: recorded dementia prevalence

Patients aged 65+ registered for General Medical Services with an unresolved diagnosis of dementia, counts by 5-year age and sex band from GP Clinical Systems via the General Practice Extraction Service (GPES); extracted on the reporting period end date (the last day of the month). Source: NHS Digital.

#### Reference rates: sampled dementia prevalence

Age 65+ age and sex-specific dementia prevalence rates, binomial proportions with 95% confidence limits by 5-year age and sex band from the Medical Research Council Cognitive Function and Ageing Study II (CFAS II). Reference rates remain static. Source: MRC CFAS II.

#### Organisational data

GP practices open and active on the reporting period end date from the NHS Business Services Authority Prescriptions Services (NHS BSA), with postcodes and Clinical Commissioning Group (CCG). Source: NHS Digital Organisational Data Service.

Office for National Statistics (ONS) mappings from CCG to Sustainability and Transformation Plan Footprint (STP); NHS England Local Office (DCO); and NHS England Region. Source: ONS Open Geography.

ONS mappings from postcode to local authority (LA). Source: ONS Open Geography.

Public Health England (PHE) mappings from LA to PHE Centre; County Council; PHE Region; ONS Group and Sub-Group; Average LA Deprivation Decile; Devolved Area. Source: Public Health England.

## Construction

### Introduction

This indicator reports the rate of persons aged 65 and over with a recorded diagnosis of dementia per person estimated to have dementia given the characteristics of the population and the age and sex-specific prevalence rates of the CFAS II study, expressed as a percentage with 95% confidence intervals.

Applying the age and sex-specific 65+ prevalence rates of the CFAS II population (the reference rates) to the age and sex structure of the registered patients in the subject population (the denominator), yields the number of people aged 65+ one would expect to have dementia within the subject population. Dividing the actual number of cases recorded in the subject population (the numerator) by the estimated number yields the estimated diagnosis rate.

95% confidence intervals are derived from the 12 individual measures of uncertainty given with the CFAS II reference rates and the uncertainty around the numerator. The indicator is calculated 100,000 times, resampling randomly each time from the distributions of the 13 variables, to produce an overall distribution of indicator values closely approximating the true distribution. The 2,500th smallest and the 2,500th largest values in the distribution give robust estimates of the 95% lower and upper confidence limits respectively to one decimal place.

This indicator is expressed as a percentage.

### Data fields

#### NHAIS registered patients

PRACTICE_CODE
AGE
SEX
VALUE
EXTRACT_DATE

#### GPES recorded dementia prevalence

PRACTICE_CODE
AGE
SEX
VALUE
ACH_DATE

#### CFAS II reference rates

| Sex | Age | Rate | Lower | Upper |
| --- | --- | --- | --- | --- |
| M | 65–69 years | 0.012 | 0.006 | 0.023 |
| M | 70–74 years | 0.030 | 0.020 | 0.044 |
| M | 75–79 years | 0.052 | 0.038 | 0.070 |
| M | 80–84 years | 0.106 | 0.082 | 0.137 |
| M | 85–89 years | 0.128 | 0.090 | 0.180 |
| M | ≥90 years | 0.171 | 0.106 | 0.264 |
| F | 65–69 years | 0.018 | 0.009 | 0.036 |
| F | 70–74 years | 0.025 | 0.016 | 0.039 |
| F | 75–79 years | 0.062 | 0.045 | 0.084 |
| F | 80–84 years | 0.095 | 0.073 | 0.123 |
| F | 85–89 years | 0.181 | 0.145 | 0.222 |
| F | ≥90 years | 0.350 | 0.284 | 0.423 |

#### NHS BSA organisational data

PRACTICE_CODE
STATUS
OPEN_DATE
CLOSED_DATE
PRESCRIBING_SETTING
POSTCODE
COMMISSIONING_ORGANISATION

### Data Filter

#### NHAIS registered patients

| Filter | Field name | Conditions | Rationale |
| --- | --- | --- | --- |
| 1 | EXTRACT_DATE | = reporting period end date + 1 day | Returns data as close to the reporting period end date as possible |
| 2 | VALUE | sum(VALUE) > 0 | Returns data for practices with at least one registered patient of any sex or age |
| 3 | AGE | > 64 | Returns data for patients aged 65 and over |
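
The three NHAIS filters might be applied as in the sketch below. It assumes each record is a dictionary keyed by the data fields listed earlier, that AGE holds a number (for example the lower bound of the age band), and that EXTRACT_DATE is a date; the function name, practice codes and values are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def filter_nhais(rows, period_end):
    """Apply the three NHAIS filters to a list of row dictionaries."""
    # Filter 1: keep the extract taken the day after the reporting period ends.
    extract_date = period_end + timedelta(days=1)
    on_date = [r for r in rows if r["EXTRACT_DATE"] == extract_date]

    # Filter 2: total list size per practice across every age and sex band.
    totals = defaultdict(int)
    for r in on_date:
        totals[r["PRACTICE_CODE"]] += r["VALUE"]

    # Filter 3: keep only the 65+ rows of practices with at least one patient.
    return [
        r for r in on_date
        if totals[r["PRACTICE_CODE"]] > 0 and r["AGE"] > 64
    ]


# Made-up rows for a reporting period ending 31 March 2019.
rows = [
    {"PRACTICE_CODE": "A00001", "AGE": 60, "SEX": "F", "VALUE": 120, "EXTRACT_DATE": date(2019, 4, 1)},
    {"PRACTICE_CODE": "A00001", "AGE": 65, "SEX": "F", "VALUE": 95, "EXTRACT_DATE": date(2019, 4, 1)},
    {"PRACTICE_CODE": "A00002", "AGE": 70, "SEX": "M", "VALUE": 0, "EXTRACT_DATE": date(2019, 4, 1)},
    {"PRACTICE_CODE": "A00001", "AGE": 65, "SEX": "F", "VALUE": 90, "EXTRACT_DATE": date(2019, 3, 1)},
]
kept = filter_nhais(rows, date(2019, 3, 31))
print([(r["PRACTICE_CODE"], r["AGE"], r["VALUE"]) for r in kept])
# [('A00001', 65, 95)]
```

The per-practice total is computed before the age filter because the second condition counts patients of any sex or age.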

#### GPES recorded dementia prevalence

| Filter | Field name | Conditions | Rationale |
| --- | --- | --- | --- |
| 1 | ACH_DATE, PRACTICE_CODE | = max(ACH_DATE) per PRACTICE_CODE where ACH_DATE >= reporting period end date - 182 days | Returns the most recent data available for each practice to a maximum of 6 months prior to the reporting period end date |
| 2 | AGE | > 64 | Returns data for patients aged 65 and over |
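
A sketch of the GPES filters under the same assumptions (rows as dictionaries, numeric AGE, ACH_DATE as a date). The function name and sample rows are hypothetical.

```python
from datetime import date, timedelta


def filter_gpes(rows, period_end):
    """Keep each practice's most recent 65+ rows from the last 182 days."""
    # Filter 1a: discard data older than 182 days before the period end.
    cutoff = period_end - timedelta(days=182)
    recent = [r for r in rows if r["ACH_DATE"] >= cutoff]

    # Filter 1b: find the latest ACH_DATE available for each practice.
    latest = {}
    for r in recent:
        code = r["PRACTICE_CODE"]
        latest[code] = max(latest.get(code, r["ACH_DATE"]), r["ACH_DATE"])

    # Filter 2: keep rows from that latest date for patients aged 65 and over.
    return [
        r for r in recent
        if r["ACH_DATE"] == latest[r["PRACTICE_CODE"]] and r["AGE"] > 64
    ]


# Made-up rows for a reporting period ending 31 March 2019.
rows = [
    {"PRACTICE_CODE": "A00001", "AGE": 65, "SEX": "F", "VALUE": 4, "ACH_DATE": date(2019, 2, 28)},
    {"PRACTICE_CODE": "A00001", "AGE": 65, "SEX": "F", "VALUE": 5, "ACH_DATE": date(2019, 3, 31)},
    {"PRACTICE_CODE": "A00002", "AGE": 80, "SEX": "M", "VALUE": 7, "ACH_DATE": date(2018, 6, 30)},
    {"PRACTICE_CODE": "A00003", "AGE": 60, "SEX": "M", "VALUE": 1, "ACH_DATE": date(2019, 3, 31)},
]
kept = filter_gpes(rows, date(2019, 3, 31))
print([(r["PRACTICE_CODE"], r["VALUE"]) for r in kept])
# [('A00001', 5)]
```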

#### NHS BSA organisational data

| Filter | Field name | Conditions | Rationale |
| --- | --- | --- | --- |
| 1 | STATUS | = A | Returns data for active practices |
| 2 | OPEN_DATE | <= reporting period end date | Returns data for practices open as at the reporting period end date |
| 3 | CLOSED_DATE | >= reporting period end date; or NULL | Returns data for practices not closed as at the reporting period end date |
| 4 | PRESCRIBING_SETTING | = 4 | Returns data for practices with GP prescribing cost centres |
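
The four organisational filters combine into a single test per practice. In this sketch a NULL CLOSED_DATE is represented by `None` and PRESCRIBING_SETTING is assumed to be stored as an integer; both are assumptions, as are the function name and sample records.

```python
from datetime import date


def is_included_practice(row, period_end):
    """True when a practice record passes all four NHS BSA filters."""
    return (
        row["STATUS"] == "A"                      # active
        and row["OPEN_DATE"] <= period_end        # already open
        and (row["CLOSED_DATE"] is None or row["CLOSED_DATE"] >= period_end)  # not yet closed
        and row["PRESCRIBING_SETTING"] == 4       # GP prescribing cost centre
    )


# Made-up practice records for a reporting period ending 31 March 2019.
period_end = date(2019, 3, 31)
practices = [
    {"PRACTICE_CODE": "A00001", "STATUS": "A", "OPEN_DATE": date(1990, 4, 1),
     "CLOSED_DATE": None, "PRESCRIBING_SETTING": 4},
    {"PRACTICE_CODE": "A00002", "STATUS": "A", "OPEN_DATE": date(2019, 4, 15),
     "CLOSED_DATE": None, "PRESCRIBING_SETTING": 4},
    {"PRACTICE_CODE": "A00003", "STATUS": "A", "OPEN_DATE": date(2001, 1, 1),
     "CLOSED_DATE": date(2018, 12, 31), "PRESCRIBING_SETTING": 4},
]
print([p["PRACTICE_CODE"] for p in practices if is_included_practice(p, period_end)])
# ['A00001']
```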

#### NHAIS registered patients, GPES recorded dementia prevalence, NHS BSA organisational data

| Filter | Field name | Conditions | Rationale |
| --- | --- | --- | --- |
| 1 | PRACTICE_CODE | Inner join | Returns data only for practices existing in all three sources as queried above, for example open practices with one or more registered patients and with dementia data available within the last 6 months |
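
An inner join on PRACTICE_CODE keeps only the practices found in every source. One way to sketch this, assuming each filtered source is a list of row dictionaries, is a set intersection of practice codes; the function name and codes are hypothetical.

```python
def practices_in_all_sources(nhais_rows, gpes_rows, bsa_rows):
    """Practice codes present in all three filtered sources."""
    return (
        {r["PRACTICE_CODE"] for r in nhais_rows}
        & {r["PRACTICE_CODE"] for r in gpes_rows}
        & {r["PRACTICE_CODE"] for r in bsa_rows}
    )


# Made-up practice codes: only A00001 appears in every source.
nhais = [{"PRACTICE_CODE": "A00001"}, {"PRACTICE_CODE": "A00002"}]
gpes = [{"PRACTICE_CODE": "A00001"}, {"PRACTICE_CODE": "A00003"}]
bsa = [{"PRACTICE_CODE": "A00001"}, {"PRACTICE_CODE": "A00002"}]
print(sorted(practices_in_all_sources(nhais, gpes, bsa)))  # ['A00001']
```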

### Calculation formulae

Calculate the estimated number of cases of dementia for each organisation (denominator) by applying the age and sex-specific reference rates to the age and sex structure of its population:

$$E_k= \displaystyle\sum_{ij}N_{ijk}\times p_{ij}$$

Where:

$$E_k$$ is the estimated value for the subject organisation k

$$N_{ijk}$$ is the population (65+ patient list size) for each combination of age band i and sex j in subject organisation k

$$p_{ij}$$ is the binomial proportion for each combination of age band i and sex j in the reference population (CFAS II)
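
The expected-cases sum can be sketched as follows. The rates are the CFAS II reference rates given in the data fields section; the dictionary layout, function name and list sizes are assumptions made for illustration.

```python
# CFAS II reference rates keyed by (sex, age band).
CFAS_II_RATES = {
    ("M", "65-69"): 0.012, ("M", "70-74"): 0.030, ("M", "75-79"): 0.052,
    ("M", "80-84"): 0.106, ("M", "85-89"): 0.128, ("M", "90+"): 0.171,
    ("F", "65-69"): 0.018, ("F", "70-74"): 0.025, ("F", "75-79"): 0.062,
    ("F", "80-84"): 0.095, ("F", "85-89"): 0.181, ("F", "90+"): 0.350,
}


def expected_cases(population):
    """Sum of list size times reference rate over every age and sex band."""
    return sum(size * CFAS_II_RATES[band] for band, size in population.items())


# Hypothetical organisation with patients in only two bands.
population = {("M", "65-69"): 1000, ("F", "90+"): 100}
print(f"{expected_cases(population):.1f}")  # 47.0 (1000 x 0.012 + 100 x 0.350)
```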

Calculate the estimated diagnosis rate for each organisation (indicator value) by dividing its observed dementia diagnoses by its estimated value and express this as a percentage:

$$\lambda_k =\frac {O_k}{E_k} \times 100$$

Where:

$$\lambda_k$$ is the estimated diagnosis rate for the subject organisation k

$$O_k$$ is the recorded 65+ dementia diagnoses in the subject organisation k

$$E_k$$ is the estimated value for the subject organisation k
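
As a worked illustration with made-up figures, an organisation with 31 recorded diagnoses and an estimated 47 cases would have:

$$\lambda_k = \frac {31}{47} \times 100 \approx 66.0$$

that is, an estimated diagnosis rate of about 66.0%.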

Calculate the upper and lower 95% confidence limits for each organisation’s indicator value by simulation. Repeat the indicator calculation 100,000 times, randomly resampling each time from the age and sex-specific expected distributions, and the recorded diagnoses count distribution, to create a distribution of 100,000 random samples from the overall indicator distribution. Take the 2500th smallest and the 2500th largest values from this distribution as estimates of the 95% lower and upper confidence limits respectively:

$$\lambda_k^{LL} = \lambda sim_{k(n)}$$

$$\lambda_k^{UL} = \lambda sim_{k(100,000-n)}$$

Where:

$$\lambda_k^{LL}$$  is the lower 95% confidence limit for subject organisation k

$$\lambda_k^{UL}$$  is the upper 95% confidence limit for subject organisation k

n defines the threshold of the indicator distribution based on the number of repetitions, 100,000, and level of confidence, 95%: 100,000 × (1 − 0.95) / 2 = 2,500

$$\lambda sim_{k(1)}, ..., \lambda sim_{k(100,000)}$$ are the randomly sampled indicator values for subject organisation k, sorted in ascending order, so that $$\lambda sim_{k(n)}$$ is the nth smallest value. They are produced by repetition of the following:

$$(\lambda sim_k= \frac {Orand_k}{Erand_k} \times100)_{1,...,100,000}$$

Where:

$$Orand_k$$ is the randomly sampled diagnoses count value for organisation k produced by the inverse cumulative probability function of the normal distribution with:

probability: $$R \in (0, 1)$$

mean: $$O_k$$

standard deviation: $$\sqrt{O_k}$$

$$Erand_k$$ is the randomly sampled expected value for organisation k produced as follows:

$$Erand_k = \displaystyle\sum_{ij} N_{ijk} \times prand_{ij}$$

Where:

$$N_{ijk}$$ is the population (65+ patient list size) for each combination of age band i and sex j in subject organisation k

$$prand_{ij}$$ is the randomly sampled binomial proportion for each combination of age band i and sex j in the reference population (CFAS II) produced as follows:

$$prand_{ij} = \frac {{\exp (p_{ij}^{icf})}}{{1 + {\exp (p_{ij}^{icf})}}}$$

Where:

$$p_{ij}^{icf}$$ is the value of the inverse cumulative probability function of the normal distribution for each combination of age band i and sex j in the reference population (CFAS II) with:

probability: $$R \in (0, 1)$$

mean: $$\log_e (\frac {p_{ij}}{1-p_{ij}})$$

standard deviation: $$\frac {\log_e (\frac {p_{ij}^{UL}}{1-p_{ij}^{UL}}) - \log_e (\frac {p_{ij}^{LL}}{1-p_{ij}^{LL}})}{2 \times 1.96}$$

Where:

$$p_{ij}^{UL}$$ is the upper 95% confidence limit for each combination of age band i and sex j in the reference population (CFAS II)

$$p_{ij}^{LL}$$ is the lower 95% confidence limit for each combination of age band i and sex j in the reference population (CFAS II)
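
The two random draws can be sketched with Python's standard library. This is an illustration only. It follows the technical guide's definitions, with proportions on the 0 to 1 scale and the logit-scale interval width divided by 2 × 1.96 to give the standard deviation. The function names `prand` and `orand` echo the symbols above but are otherwise hypothetical, and the count of 400 diagnoses is invented.

```python
import math
import random
from statistics import NormalDist


def logit(p):
    """Log-odds of a proportion on the 0 to 1 scale."""
    return math.log(p / (1 - p))


def prand(rate, lower, upper, r):
    """Sampled proportion for one age/sex band, given a uniform draw r in (0, 1)."""
    mean = logit(rate)
    sd = (logit(upper) - logit(lower)) / (2 * 1.96)
    z = NormalDist(mean, sd).inv_cdf(r)     # sample on the logit scale
    return math.exp(z) / (1 + math.exp(z))  # back-transform to a proportion


def orand(observed, r):
    """Sampled diagnoses count, given a uniform draw r in (0, 1)."""
    return NormalDist(observed, math.sqrt(observed)).inv_cdf(r)


# One draw for females aged 85-89 (CFAS II: 0.181, limits 0.145 to 0.222)
# and one for a hypothetical organisation with 400 recorded diagnoses.
rng = random.Random(2019)  # seeded only to make the sketch repeatable
print(prand(0.181, 0.145, 0.222, rng.random()))
print(orand(400, rng.random()))
```

A draw of r = 0.5 returns the published rate and the recorded count themselves, which is a convenient check that the transformation and its inverse agree.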

### Technical guide

The estimated dementia diagnosis rate (EDDR) is calculated for each area (local authority or CCG for example) in two stages:

1. The expected number of people with dementia in the area is estimated by applying prevalence estimates obtained from survey data at national level for each age/sex group to the estimated population in each age/sex group in the area and summing across all age/sex groups.
2. The total observed number of people diagnosed with dementia in the area is divided by the expected number obtained from stage 1 to give the estimated diagnosis rate, that is, the proportion of expected cases that have been diagnosed by GPs.

To obtain approximate 95% confidence intervals for the EDDR, the uncertainty (confidence intervals) around the original survey estimates at age/sex group level must be taken into account, together with the random variation element of the observed total of diagnosed patients. This is done by simulation using the following steps:

1. For each age/sex group $$i$$, the prevalence estimates ($$p_i$$) were published with 95% confidence limits ($$p_{Li}$$ and $$p_{Ui}$$). All these are transformed by taking the logits: $$logit(p_i) = ln(p_i/{(1-p_i)})$$, $$logit(p_{Li}) = ln(p_{Li}/{(1-p_{Li})})$$, $$logit(p_{Ui}) = ln(p_{Ui}/{(1-p_{Ui})})$$
2. The standard error of $$logit(p_i)$$ is estimated using the published confidence intervals: $$se(logit(p_i)) = \frac {logit(p_{Ui})-logit(p_{Li})}{2\times1.96}$$
3. The prevalence estimates themselves are assumed to be binomially distributed, and the logit-transformed prevalence estimates are assumed to be normally distributed: $$N(logit(p_i), se(logit(p_i)))$$
4. The observed total number of diagnosed patients ($$O$$) is assumed to be Poisson distributed, but since the counts are all large the normal approximation to the Poisson is extremely accurate and hence they are assumed to be normally distributed, $$N(O,\sqrt{O})$$
5. Randomised expected values are calculated for each age/sex group along with a randomised observed value. This is done by generating random numbers ($$r_i$$ and $$r_O$$) from a uniform (0,1) distribution using the Mersenne Twister algorithm and transforming them to obtain random numbers from the appropriate normal distributions.
6. For the randomised expected values, the inverse of the normal cumulative distribution is calculated for probability $$r_i$$, mean $$logit(p_i)$$ and standard deviation $$se(logit(p_i))$$ to give $$logit(p_{i_{sim}})$$
7. For the randomised observed value, the inverse of the cumulative normal distribution is calculated for probability $$r_O$$, mean $$O$$ and standard deviation $$\sqrt{O}$$ to give $$O_{sim}$$
8. Each $$p_{i_{sim}}$$ is calculated by reversing the logit transformation: $$p_{i_{sim}} = \frac {e^{logit({p_{i_{sim}}})}}{1+e^{logit({p_{i_{sim}}})}}$$
9. Each $$p_{i_{sim}}$$ is multiplied by the relevant population to obtain the simulated expected count for the age/sex group, $$E_{i_{sim}}$$
10. The total expected count is generated by summing across all age/sex groups: $$E_{sim}=\sum_i E_{i_{sim}}$$
11. The simulated EDDR is calculated: $$EDDR_{sim} = \frac {O_{sim}}{E_{sim}}$$
12. Steps 5 to 11 are repeated 100,000 times
13. From the randomly generated sample of 100,000 $$EDDR_{sim}$$ values, the 2,500th smallest and largest are taken as the values for the 95% lower and upper confidence limits respectively.
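
The steps above can be sketched end to end using only Python's standard library. This is a minimal illustration under stated assumptions, not the production implementation. The function names, the dictionary layout and the population figures are invented, and the result is scaled to a percentage to match the indicator. CPython's `random` module happens to use the Mersenne Twister, and `statistics.NormalDist.inv_cdf` provides the inverse normal cumulative distribution.

```python
import math
import random
from statistics import NormalDist

# CFAS II reference rates: (sex, age band) -> (rate, lower, upper).
CFAS_II = {
    ("M", "65-69"): (0.012, 0.006, 0.023), ("M", "70-74"): (0.030, 0.020, 0.044),
    ("M", "75-79"): (0.052, 0.038, 0.070), ("M", "80-84"): (0.106, 0.082, 0.137),
    ("M", "85-89"): (0.128, 0.090, 0.180), ("M", "90+"): (0.171, 0.106, 0.264),
    ("F", "65-69"): (0.018, 0.009, 0.036), ("F", "70-74"): (0.025, 0.016, 0.039),
    ("F", "75-79"): (0.062, 0.045, 0.084), ("F", "80-84"): (0.095, 0.073, 0.123),
    ("F", "85-89"): (0.181, 0.145, 0.222), ("F", "90+"): (0.350, 0.284, 0.423),
}


def logit(p):
    return math.log(p / (1 - p))


def draw(rng):
    """Uniform draw on the open interval (0, 1); inv_cdf is undefined at 0."""
    r = rng.random()
    while r == 0.0:
        r = rng.random()
    return r


def eddr_with_limits(observed, population, reps=100_000, seed=None):
    """Return (EDDR, lower limit, upper limit) as percentages for one area."""
    rng = random.Random(seed)
    # Steps 1 to 3: a normal distribution on the logit scale per age/sex group.
    logit_dists = {
        band: NormalDist(logit(p), (logit(hi) - logit(lo)) / (2 * 1.96))
        for band, (p, lo, hi) in CFAS_II.items()
    }
    # Step 4: normal approximation to the Poisson for the observed count.
    observed_dist = NormalDist(observed, math.sqrt(observed))

    sims = []
    for _ in range(reps):
        # Steps 5, 6, 8, 9 and 10: simulated expected count over all groups.
        e_sim = 0.0
        for band, size in population.items():
            z = logit_dists[band].inv_cdf(draw(rng))
            e_sim += size * math.exp(z) / (1 + math.exp(z))
        # Steps 7 and 11: simulated observed count and simulated EDDR.
        o_sim = observed_dist.inv_cdf(draw(rng))
        sims.append(o_sim / e_sim * 100)

    # Step 13: the n-th smallest and n-th largest simulated values.
    sims.sort()
    n_tail = max(1, round(reps * 0.025))  # 2,500 when reps is 100,000
    expected = sum(size * CFAS_II[band][0] for band, size in population.items())
    return observed / expected * 100, sims[n_tail - 1], sims[-n_tail]


# Hypothetical area: 500 patients in every band and 400 recorded diagnoses.
# Fewer repetitions than the 100,000 described above, to keep the sketch quick.
population = {band: 500 for band in CFAS_II}
rate, lower, upper = eddr_with_limits(400, population, reps=2_000, seed=1)
print(f"{rate:.1f}% (95% CI {lower:.1f}% to {upper:.1f}%)")
```

The distributions are built once outside the loop because they do not change between repetitions; only the uniform draws do.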


Last edited: 10 October 2019 8:26 am