National Child Measurement Programme, England, 2021/22 school year

Appendices

Appendix A - Data quality report

Table A1 shows the key data quality measures, at national level, since 2006/07, the first year in which robust NCMP data was collected. Table 8 shows the same data quality measures at submitting local authority level for 2021/22, along with breach reasons provided by LAs in cases where the data quality has breached thresholds (see the Validation section for further details). Further commentary on how data quality is assessed is provided in the Data quality statement which accompanies this report.

 

Appendix B - Data collection

Coverage

The National Child Measurement Programme (NCMP) collects height and weight measurements of children in reception (aged 4–5 years) and year 6 (aged 10–11 years) primarily in mainstream state-maintained schools in England. 

Local authorities are mandated to collect data from mainstream state-maintained schools but collection of data from special schools (schools for pupils with special educational needs and pupil referral units) and independent schools is encouraged. 

For the 2021/22 collection, 3,430 records were collected relating to pupils in independent/special schools. This represents only 0.3% of the total number of records across all state and independent/special schools.

Since the proportion of records from independent and special schools varies each year, this report excludes such records to ensure consistency over time. There are also concerns around how representative the participating independent and special schools would be.

However, independent and special schools are encouraged to feed back the results to the parents of the children they measure.

 

Measurement

The measurement of children's heights and weights, without shoes and coats and in normal, light, indoor clothing, was overseen by healthcare professionals and undertaken in school by trained staff. The Office for Health Improvement and Disparities (OHID) provides guidance to local authorities on how to accurately measure height and weight.

Measurements could be taken at any time during the 2021/22 academic year. Consequently, some children were almost two years older than others in the same school year at the point of measurement. This does not impact upon a child’s BMI classification since BMI centile results are adjusted for age. Also, the age range is only a year for the majority of records: in 2021/22, 84 per cent of reception pupils were aged between 4.5 years and 5.5 years when they were measured and 81 per cent of year 6 pupils were aged between 10.5 years and 11.5 years.

 

Validation

Full details about validation are provided in NHS Digital’s validation document and have been summarised below.

Local authorities enter data into the NCMP system which validates each data item at the point of data entry. Invalid data items (e.g. incorrect ethnicity codes) and missing mandatory data items are rejected and unexpected data items (e.g. “extreme” heights) have warning flags added.

During the collection the NCMP system provides each local authority with real-time data quality indicators, based on the data they have entered, for monitoring and to ensure the early resolution of any issues. At the end of the collection each local authority must confirm any data items with warning flags and sign off their data quality indicators. In cases where the data quality indicators breach the required thresholds (provided in Validation of National Child Measurement Programme Data, linked to above), LAs are required to provide a breach reason.

Appendix A shows the key data quality measures, at national level, since 2006/07, the first year in which robust NCMP data was collected.

Table 8 shows the same data quality measures at submitting local authority level for 2021/22, along with breach reasons provided by LAs in cases where the data quality has breached thresholds.

After the collection has closed NHS Digital carries out further data validation which includes:

  • Querying breach reasons that do not fully explain the reasons for the data quality issues.
  • Comparing each local authority’s dataset with their previous year’s dataset and querying unexpected changes.
  • Looking for clusters of unexpected data items to identify data quality issues affecting particular schools.

 

Participation rates

The participation rate is the proportion of children who were measured out of those eligible for measurement. Children eligible for measurement are sometimes not measured for a range of reasons such as the child being absent on the day of measurement or not consenting to be measured. This means that the NCMP dataset is a sample (albeit usually a very large sample) and the prevalence of the BMI classifications in this report are estimates assumed to apply to the entire population.

To ensure the NCMP sample is representative, it is important to verify that non-participation is equally likely for each child. If, for example, all non-participating children were obese then the sample would be biased and obesity prevalence underestimated.

Analysis of the NCMP datasets between 2006/07 and 2008/09 established that there was a relationship between Primary Care Trust (PCT) participation rates and year 6 obesity prevalence. It was estimated that year 6 obesity prevalence may be underestimated by around 1.3 percentage points for 2006/07, around 0.8 percentage points for 2007/08, and around 0.7 percentage points for 2008/09 (with the impact reducing as participation rates increased). This may be due to obese year 6 children being less likely to participate in the NCMP than other children during these collection years. Therefore, the upper confidence interval for the national year 6 obesity prevalence rate was increased for 2006/07 to 2008/09 by these amounts. For other BMI classifications the relationship was found to be negligible.

In 2009/10 and 2010/11 the participation rate continued to increase and the same analysis found the relationship to be negligible. As the participation rate increased again in 2011/12 and had remained similar since 2012/13, it was considered unnecessary to repeat the analysis in recent years. We will continue to monitor this in the future.

In 2019/20 and 2020/21, the participation rates were not collected due to the impact of COVID-19. In the 2021/22 collection year, participation rates may be lower than expected due to the continuing impact of COVID-19.

Participation rates at local authority level are provided in Table 2 and these should be considered when comparing local authority prevalence figures.

 

Calculating participation rates

Rates are calculated by dividing the number of valid records from mainstream state-maintained schools, submitted by the local authority, by the number of children eligible for measurement in these schools, and multiplying the result by 100.

The number of children eligible for measurement, in each school year within a local authority, is calculated by aggregating headcounts across the mainstream state-maintained schools within the local authority’s postcode boundary. The NCMP system provides default headcounts based on Department for Education (DfE) census data, but these can be amended by the local authority where necessary. The NCMP system validates local-authority-provided headcounts by checking that the number measured at a school does not exceed the number eligible for measurement. When the number measured exceeds the number eligible, the system corrects the ‘eligible’ figure by increasing it to match the number measured, thus ensuring a maximum school-level participation rate of 100 per cent.
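As an illustration, this calculation can be sketched in Python as follows. This is a sketch only and is not taken from the NCMP system; the function and column names ('measured', 'eligible') are assumed for the example.

    import pandas as pd

    def participation_rate(schools: pd.DataFrame) -> float:
        # Illustrative only: assumes one row per school, for a single local
        # authority and school year, with 'measured' and 'eligible' columns.
        # Where the number measured exceeds the headcount, the eligible figure
        # is raised to match it, capping school-level rates at 100 per cent.
        eligible = schools["eligible"].clip(lower=schools["measured"])
        return 100 * schools["measured"].sum() / eligible.sum()

    la = pd.DataFrame({"measured": [55, 60], "eligible": [60, 58]})
    print(participation_rate(la))  # eligible for the second school is raised to 60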


Appendix C - Calculation of prevalence

The prevalence of children in a BMI classification is calculated by dividing the number of children in that BMI classification by the total number of children and multiplying the result by 100.

The BMI classification of each child is derived by calculating the child's BMI centile and assigning the BMI classification based on the following thresholds:

  • Underweight - BMI centile less than or equal to the 2nd centile
  • Healthy weight - BMI centile greater than the 2nd centile but less than the 85th centile
  • Overweight - BMI centile greater than or equal to the 85th centile but less than the 95th centile (i.e. overweight but not living with obesity)
  • Living with obesity - BMI centile greater than or equal to the 95th centile
  • Living with severe obesity - BMI centile greater than or equal to the 99.6th centile. This BMI classification is a subset of the 'Living with obesity' classification.

These thresholds are conventionally used for population monitoring in the UK and are not the same as those used in a clinical setting. Different methodologies, such as the International Obesity Task Force (IOTF) methodology, use different thresholds and may result in different prevalence figures to those presented in this report.
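For illustration, the thresholds above and the prevalence calculation can be expressed in Python as follows. This is a sketch only; it assumes centiles are expressed on a 0 to 100 scale and the function names are illustrative.

    def bmi_classification(centile: float) -> str:
        # Population-monitoring thresholds listed above (not clinical thresholds).
        if centile <= 2:
            return "underweight"
        elif centile < 85:
            return "healthy weight"
        elif centile < 95:
            return "overweight"
        else:
            # Centiles of 99.6 or above also fall within 'living with severe
            # obesity', a subset of this classification.
            return "living with obesity"

    def prevalence(classifications: list[str], category: str) -> float:
        # Number of children in the category divided by the total, times 100.
        return 100 * sum(c == category for c in classifications) / len(classifications)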

The child’s BMI centile is a measure of how far a child’s BMI is above or below the average BMI value for their age and sex in a reference population. In England the British 1990 growth reference (UK90) is recommended for population monitoring and clinical assessment in children aged four years and over. UK90 is a large representative sample of 37,700 children which was constructed by combining data from 17 separate surveys. The sample was rebased to 1990 levels and the data were then used to express BMI as a centile based on the BMI distribution, adjusted for skewness, age and sex using Cole's LMS method.

The child’s BMI centile is calculated in the following way:

  1. Calculate the child’s BMI (weight (kg) / height² (m²)).
  2. Calculate the child’s BMI z-score:
    • look up the child's age and sex on the UK90 BMI centiles classification;
    • retrieve the corresponding L, M and S values for use in the following formula (where y is the BMI score):

      z = ((y / M)^L − 1) / (L × S)

  3. Convert the BMI z-score to the BMI centile using the standardised normal distribution.

Note: linear interpolation is used to obtain more accurate L, M and S values. For example, the formula for a child who is 4 years and 6.5 months old would use the L, M and S values halfway between those for 4 years and 6 months and 4 years and 7 months.
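A sketch of this calculation in Python is shown below. It is illustrative only: in practice the L, M and S values are taken from the UK90 reference tables for the child's age and sex, and the lookup arrangement shown here is an assumption.

    from math import log
    from scipy.stats import norm

    def bmi(weight_kg: float, height_m: float) -> float:
        # Step 1: BMI = weight (kg) divided by height (m) squared.
        return weight_kg / height_m ** 2

    def bmi_z_score(y: float, L: float, M: float, S: float) -> float:
        # Step 2: Cole's LMS formula, where y is the child's BMI and L, M and S
        # come from the UK90 reference for the child's age and sex.
        if L == 0:
            return log(y / M) / S
        return ((y / M) ** L - 1) / (L * S)

    def interpolate_lms(lms_low: tuple, lms_high: tuple, fraction: float) -> tuple:
        # Linear interpolation between the L, M and S values for the two
        # nearest reference ages, e.g. fraction = 0.5 for a child halfway
        # between 4 years 6 months and 4 years 7 months.
        return tuple(a + fraction * (b - a) for a, b in zip(lms_low, lms_high))

    def bmi_centile(y: float, L: float, M: float, S: float) -> float:
        # Step 3: convert the z-score to a centile using the standard Normal
        # distribution.
        return 100 * norm.cdf(bmi_z_score(y, L, M, S))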


Appendix D - Comparing prevalence: considerations

When comparing prevalence figures between groups and over time it is important to consider how participation and data quality might affect the calculated figures.

Comparisons between two groups with differing data quality or participation may be skewed and this should be taken into account as it may partly explain any difference in prevalence figures.

Analyses looking at the impact of data quality on prevalence were carried out by the National Obesity Observatory (now part of OHID) for the 2006/07 and 2007/08 collection years and by the National Centre for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine (NLM), for the 2007/08 collection year.

No analysis has been carried out to quantify any impact on recent years but improvements in data quality and participation since the first years of the NCMP should have lessened any impact. However, it is still important to consider data quality and participation when making comparisons. Information on the 2021/22 data quality is provided in the Data Quality section and the Data quality statement. Information on participation can be found in appendix A.

It is also important to realise that, since the NCMP dataset is a sample, the prevalence figures in this report are estimates assumed to apply to the entire population. These estimates are subject to natural random variation. Confidence intervals and significance testing have been used in this report to take account of such variation. Further details are available in appendices E and F.


Appendix E - Confidence intervals

A confidence interval gives an indication of the likely error around an estimate that has been calculated from measurements based on a sample of the population. It indicates the range within which the true value for the population as a whole can be expected to lie, taking natural random variation into account. Confidence intervals should be considered when interpreting results. When confidence intervals do not overlap, the differences are considered statistically significant. When confidence intervals overlap, it is not possible to determine from the intervals alone whether differences are statistically significant. Please refer to appendix F for a suggested methodology for such cases.

Larger sample sizes lead to narrower confidence intervals, since there is less natural random variation in the results when more individuals are measured. The NCMP has relatively narrow confidence limits because of the large size of the sample and high participation rates. 

In the tables accompanying this report, 95 per cent confidence intervals have been provided around the prevalence estimates. These are known as such because if it were possible to repeat the same programme under the same conditions a number of times, we would expect 95 per cent of the confidence intervals calculated in this way to contain the true population value for that estimate.

The confidence intervals in this report have not had the finite population correction (FPC) applied and have therefore not been reduced on the basis of coverage. This approach is consistent with that used throughout the public health community. For example, census, mortality and hospital admission data represent a 100 per cent sample, yet the associated confidence intervals are routinely calculated without the FPC adjustment.

 

Methodology

Confidence intervals have been calculated using the method described by Wilson and Newcombe.

The steps needed are:

  1. Calculate the estimated proportions of children with and without the feature of interest (e.g. percentage of obese children in reception year) as follows.
    • p = r / n = proportion with feature of interest
    • r = observed number with feature of interest in each area
    • n = sample size
    • q = (1 – p) = proportion without feature of interest
       
  2. Calculate three values (A, B and C) as follows:

     A = 2r + z²

     B = z√(z² + 4rq)

     C = 2(n + z²)

     where z is z(1−α/2) from the standard Normal distribution (1.96 for a 95 per cent confidence interval).

  3. Then the confidence interval for the population proportion is given by:

     (A − B) / C to (A + B) / C
This method is preferred over simpler approximations because it performs well even for small samples and for proportions close to 0 or 100 per cent, and it never produces confidence limits outside that range.

When there are no observed events, then r and hence p are both zero, and the recommended confidence interval simplifies to 0 to z²/(n + z²).


When r = n so that p = 1, the interval becomes n/(n + z²) to 1.


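These steps can be sketched in Python as follows. This is illustrative only, with z defaulting to 1.96 for a 95 per cent confidence interval; the function name is assumed for the example.

    from math import sqrt

    def wilson_ci(r: int, n: int, z: float = 1.96) -> tuple:
        # Wilson/Newcombe confidence interval for a proportion, returned as
        # percentages. r is the observed number with the feature of interest
        # and n is the sample size.
        p = r / n
        q = 1 - p
        a = 2 * r + z ** 2
        b = z * sqrt(z ** 2 + 4 * r * q)
        c = 2 * (n + z ** 2)
        return 100 * (a - b) / c, 100 * (a + b) / c

For example, wilson_ci(0, 100) gives an interval from 0 to around 3.7 per cent, matching the simplified form above for r = 0.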
Appendix F - Significance testing

Significance tests have been used in this report to determine whether differences between prevalence estimates are genuine differences (i.e. statistically significant) or the result of random natural variation.

A quick and easy check to see if two prevalence estimates are significantly different is to compare the confidence intervals of the estimates. When the confidence intervals do not overlap the differences are considered as statistically significantly different. This approach was used in NCMP reports prior to 2009/10.

However, it is not always the case that overlapping confidence intervals indicate no significant difference. In some cases estimates with overlapping confidence intervals will still be statistically significantly different. Consequently, some significant differences may have been missed in NCMP reports prior to 2009/10. A more robust way of checking if two prevalence estimates are significantly different is to use significance testing.

The significance testing methodology used in NCMP reports since 2009/10 follows the approach outlined by Altman et al. This methodology is consistent with that used by the Office for Health Improvement and Disparities (OHID).

A 95 per cent level of significance has been used in the tests throughout this report. This means that when prevalence estimates are described as being different, (e.g. higher/lower or increase/decrease etc.) the probability that the difference is genuine, rather than the result of random natural variation, is 0.95.

 

Methodology

The steps for the approach outlined by Altman et al. are:

  1. Calculate the difference between the two proportions:

     D = p1 − p2

  2. Then calculate the confidence limits around D as:

     D − √((p1 − l1)² + (u2 − p2)²) to D + √((u1 − p1)² + (p2 − l2)²)

     where pi is the estimated prevalence for year i, and li and ui are the lower and upper confidence limits for pi respectively.

  3. A significant difference exists between proportions p1 and p2 if and only if zero is not included in the range covered by the confidence limits around the difference D.
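A sketch of this test in Python is shown below. It is illustrative only; all inputs are assumed to be prevalence percentages, with l and u denoting the lower and upper confidence limits of each estimate, and the function name is an assumption.

    from math import sqrt

    def difference_significant(p1, l1, u1, p2, l2, u2) -> bool:
        # Difference between the two prevalence estimates.
        d = p1 - p2
        # Confidence limits around the difference, built from the distance of
        # each estimate to its own confidence limits.
        lower = d - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
        upper = d + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
        # The difference is significant if and only if zero lies outside the
        # interval (lower, upper).
        return not (lower <= 0 <= upper)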


Testing significance of the difference in prevalence between the least and most deprived groups

Ordinary Least Squares (OLS) regression modelling has been used in this report to examine the significance of some factors with selected outcome variables after adjusting for other factors. In particular, the models tested whether the change in the prevalence gap between the least and most deprived deciles differed significantly by year, by gender, and by the interaction of year and gender.

Models were constructed for the groups of children living with obesity or severe obesity in each school year (Reception and Year 6). This was repeated using deprivation by school postcode and by pupil postcode.

The OLS model can be formulated as:

y = a + b1*X1 + b2*X2 + b3*(X1*X2)

where a is the constant and b1, b2 and b3 are the estimated coefficients.

The models’ outcome of interest (y) is the deprivation gap, calculated as the difference in prevalence between the least and most deprived groups.

Each model used the same independent variables: year (X1), gender (X2) and an interaction term of year and gender (X1*X2).

For each model, the independent variables were recoded to numbers (e.g. 1 for male and 2 for female).  

For each model, 2020/21 was excluded from the data as the figures are based on weighted data due to a smaller sample of measurements collected than in previous years. See the Methodology and Data Quality section of the 2020/21 report for more information.

The sample code is shown below, with the full codebase found on GitHub.

    import pandas as pd
    import statsmodels.api as sm

    def run_OLS_model(cleaned_df):
        """Creates the OLS model for the group of interest and creates a
        dataframe of the results.

        Parameters
        ----------
        cleaned_df: pd.DataFrame

        Returns
        -------
        results_df, results
            A dataframe of the coefficients, p-values and confidence limits,
            and the fitted statsmodels results object.
        """
        # Independent variables: the year/gender interaction term, year and gender.
        X = cleaned_df[["Year_Gender", "Year_Num", "Gender_Num"]]
        X = sm.add_constant(X, prepend=False)
        # Outcome of interest: the deprivation gap.
        y = cleaned_df[["Deprivation Gap"]]

        results = sm.OLS(y, X).fit()

        # Take the fitted results and transform them into a dataframe.
        pvals = results.pvalues
        coeff = results.params
        conf_lower = results.conf_int()[0]
        conf_higher = results.conf_int()[1]

        results_df = pd.DataFrame({"pvals": pvals,
                                   "coeff": coeff,
                                   "conf_lower": conf_lower,
                                   "conf_higher": conf_higher})

        # Reorder the columns for readability.
        results_df = results_df[["coeff", "pvals", "conf_lower", "conf_higher"]]
        return results_df, results

 

Interpreting the results

Variables in each model were identified as significant if the p-value was less than 0.05. The results of the regression analysis are used to test whether differences are likely to be genuine (i.e. statistically significant) or the result of random natural variation. Only statistically significant differences have been described with terms such as “higher”, “lower”, “increase” or “decrease”. When a comparison does not show a statistically significant difference, this is described using terms such as “similar to” or “the same as”.

The value of the coefficients also confirms these statements, with a negative coefficient suggesting a ‘decrease’ whilst a positive coefficient suggests an ‘increase’. For example, the results shown in Table F1 give Gender_Num a coefficient of -0.4737: as Gender_Num increased from 1 (boy) to 2 (girl), the deprivation gap decreased significantly (p-value = 0.001).

Table F1: Sample OLS Results Summary

 

                 coefficient    p-values    conf_lower    conf_higher
  Year_Gender         0.0121       0.373        -0.015          0.039
  Year_Num            0.0309       0.154        -0.012          0.074
  Gender_Num         -0.4737       0.001        -0.722         -0.225
  constant            2.7080       0.000         2.315          3.101

 


Appendix G - Local authority tables: guidance

Local authority data is presented in three ways:

  • By the upper tier local authority who submitted the data (Table 2).
  • By upper and lower tier local authority, based on the local authority in which the school is located (using the school postcode of each child) (Table 3a).
  • By upper and lower tier local authority, based on the local authority in which the child lives (using the postcode of residence of each child) (Table 3b).

Users may want to use the different breakdowns for different purposes. For example, users who want to look at the impact of interventions which are targeted through schools, such as healthy school meals or physical activity provision, may want to use the results which are based on where the school is located. Other users who want to look at interventions which are more residence based, such as provision of leisure facilities or parks, may want to use the residence-based results.

Users particularly interested in looking at results over time should be aware that provision of the child’s residence postcode only became a required field in 2007/08. Therefore, users wanting to compare current results with those in 2006/07 should use the results based on school location (Table 3a).

For most local authorities, the three sets of figures will not differ substantially. Some examples where differences may occur are:

  • There may be a difference in results between submitting local authority (Table 2) and those based on the location of the school (Table 3a) where a local authority has an arrangement with a neighbouring local authority to collect measurements in a few schools outside of their own geographical boundary.
  • There may be a difference in results between those based on the location of the school (Table 3a) and those based on child residence (Table 3b) where a relatively high number of pupils attend a school located in a local authority different to the one in which they live. This is particularly the case in inner London.

Appendix H - How are the statistics used?

Users and uses of the report

There are known and unknown users of the National Child Measurement Programme reports.

Known users have been established through customer engagement including a consultation carried out in 2016 and are detailed below.

Unknown users access the report directly from our website. We seek feedback from these users, via emails to [email protected], to understand how to better meet their needs in future.

In 2016 we engaged with users of this report as part of the wider NHS Digital consultation on all statistical products.

 

 

Known users and uses

Department of Health and Social Care (DHSC)
The NCMP is a key element of the Government’s approach to tackling child obesity. NCMP statistics are used to inform policy and set national ambitions such as those detailed in Childhood obesity: a plan for action.

Office for Health Improvement and Disparities (OHID)
OHID are responsible for the Public Health Outcomes Framework (PHOF) which sets out the desired outcomes for public health and how these will be measured. The NCMP provides robust data for the child excess weight indicators in the PHOF.

The OHID Population Health Analysis team conduct additional analyses on the NCMP data, including regional and local analyses, and produce a range of reports and tools.

Local Authorities
Frequently use NCMP statistics for analyses, benchmarking and to inform decision making.

Academia and researchers
Non-identifiable versions of the annual NCMP datasets have been made available on the NHS Digital website since 2013/14. Datasets for years prior to 2013/14 were deposited in the UK Data Archive. This NCMP data is used by academics in their research papers.


Media
NCMP data are frequently used to underpin articles in newspapers, journals and online media.

Public
Aggregated NCMP data, as published in NHS Digital's national report and OHID's more detailed analyses, is freely accessible for general public use.

Public Health Campaign Groups
Data is used to inform policy and decision making and to examine trends and behaviours.

Ad-hoc requests
NCMP statistics are used by NHS Digital to answer Parliamentary Questions (PQs), Freedom of Information (FOI) requests and ad-hoc queries. Ad-hoc requests are received from health professionals, research companies, public sector organisations and members of the public, showing that the statistics are widely used and not solely within the profession.

Published ad-hoc requests can be found here: Supplementary Information.

 

 

Web hits

We also capture information on the number of web hits the reports receive, although this does not tell us who the users are.
 


Last edited: 12 December 2022 1:37 pm