National Statistician’s Independent Review of the Measurement of Public Services Productivity

Published: 13 March 2025
Last updated: 14 March 2025

Chapter 2: The Measurement Challenges in 2024

Whilst the Atkinson Review (2005) remains a comprehensive and valid study, with principles which have demonstrated longevity, the landscape of public services has changed materially since its publication. This Review identified key challenges that have emerged through the implementation of the Atkinson Review and, more recently, as a result of the coronavirus (COVID-19) pandemic. Six key challenges are described under the following headings. These are considered on a sector-by-sector basis, as each issue may affect different public services differently, and in some areas the Office for National Statistics (ONS) may already have adequately addressed the issue since 2005.

2.1 The passage of time

The simple passage of time delivers two distinct challenges:

  • Changes experienced within services – a service may have been re-designed in a fundamental fashion such that the measures no longer reflect the landscape. For example:
    • During the coronavirus pandemic, the National Health Service created a new ‘Test and Trace’ capability, and other new services, which were outside the existing measurement framework.
    • From 2013, Universal Credit began to replace a number of benefits, which formed the core of the existing measurement model. In 2018, the ONS suspended the existing measure of Social Security Administration as the model struggled with this change.
  • Changes within the measurement of services – the service may have remained consistent through time but the measurement system may have deteriorated. This may have been for a number of reasons:
    • Data sources may have ceased to be published by various agencies.
    • Data quality may have deteriorated because of falling sample sizes or other statistical reasons.
    • The coronavirus pandemic directly impacted on statistical models. This is particularly the case with regard to ‘seasonal adjustment’ (the statistical method employed to smooth out seasonal effects in the quarterly time series), where new structural breaks have required the Review to refresh its understanding of when different activities happen and how best to reflect this.
    • Data are nowcast to cover more recent time periods, but the nowcasting model may need updating, particularly if it was adversely affected by the discontinuity of the coronavirus pandemic.

2.2 The opportunities presented by new data

Significantly more data are available across government today than in 2005. The Review has considered which services can now be reliably measured where this was not previously possible, and where data can be disaggregated to give more refined measures, better marrying inputs and outputs at the detailed level. Disaggregation is particularly important when estimating volumes because it also allows more detailed deflators to be used to derive the headline volume series.
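The effect of disaggregated deflation can be illustrated with a minimal sketch. All figures, component names and the aggregate deflator below are hypothetical and chosen only for illustration; they are not ONS data, and the function is not the ONS method. The sketch deflates two expenditure components separately and compares the result with deflating the total by a single coarse index.

```python
def volume_series(nominal, deflator):
    """Deflate a nominal series by a price index with base period = 100."""
    return [n / (d / 100.0) for n, d in zip(nominal, deflator)]

# Hypothetical nominal expenditure (GBP million) for two periods.
nominal = {"staff": [100.0, 110.0], "goods": [100.0, 100.0]}

# Hypothetical component-specific price indices (base period = 100).
deflators = {"staff": [100.0, 110.0], "goods": [100.0, 80.0]}

# Hypothetical single deflator applied to total expenditure.
aggregate_deflator = [100.0, 105.0]

# Detailed approach: deflate each component, then sum the volumes.
component_volumes = [volume_series(nominal[k], deflators[k]) for k in nominal]
detailed = [sum(values) for values in zip(*component_volumes)]

# Coarse approach: sum nominal expenditure, then deflate once.
total_nominal = [sum(values) for values in zip(*nominal.values())]
coarse = volume_series(total_nominal, aggregate_deflator)

print("detailed volume:", detailed)
print("coarse volume:  ", coarse)
```

With these assumed figures the two approaches agree in the base period but diverge in the second period, because the coarse deflator does not capture the fall in the price of goods.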

2.3 The challenge of collective services

Atkinson recognised a pre-existing national accounts norm that services are divided into ‘individual’ and ‘collective’ services. That is, those where the service affects one individual – such as an operation on person x means the same operating theatre and medical staff cannot simultaneously be used for person y – and those which affect us all – no one can ‘opt-out’ of the UK-wide nuclear deterrence operated by the Ministry of Defence’s submarine fleet.

How to value this deterrence, and how the valuation would change (“Would citizens feel more defended if the UK purchased an additional nuclear submarine?”), remains a significant question. There is no fundamental rule requiring the Review to treat collective services differently, but they present complexity against many of the key principles Atkinson established. A service which is received by one individual is inherently easier to measure because it generally delivers outputs which can be more objectively measured.

There is also a question about where to draw the boundary. Only a small fraction of citizens actively engage with the police or the criminal justice system: should the Review measure value against the services received by victims and (unwillingly) by perpetrators, or should it instead value the protection the system grants everyone? Are policing and criminal justice individual or collective services? Are vaccination services individual or collective, given the whole of society benefits even if not every individual is vaccinated?

This Review concludes that the debate around the labels ‘collective’ and ‘individual’ is a futile one; what matters are the individual methodological challenges presented by each service and how the Review unravels these to give a meaningful measure of the value and volume of output being produced. This challenge remains as fundamentally difficult today as when it was faced by Atkinson and those who have worked on this topic in the interim.

2.4 The challenge of services with multiple outcomes

Measuring the output and outcomes delivered by a service can be complex even when the service is focussed on one or a small number of outcomes (for example, Healthcare Services primarily deliver health interventions). Other services are characterised by delivering multiple, varied outcomes. Policing is the clearest example, with the police working to prevent and solve crime, deliver crowd control, undertake missing persons investigations, work to reduce re-offending with key partners, attend road traffic accidents, undertake community policing and resolve anti-social behaviour, alongside counter-terrorism and addressing organised crime. This raises a number of distinct challenges:

  • Mapping inputs to each activity.
  • Accessing good quality and consistent data for each activity, with no double-counting between activities.
  • Being able to adjust the relative weights of these activities in the aggregation process based on accurate and timely data.
  • Being able to attribute outputs and outcomes to the participating bodies. For example, if police work with local probation staff to manage dangerous offenders upon release, how should this activity be split between these two agencies?

2.5 The direct impact of the coronavirus pandemic

Healthcare and Education Services, alongside others, saw the operating model for their existing services fundamentally transformed by the coronavirus (COVID-19) pandemic, in terms of either the quantity or the nature of what they were asked to deliver. In some areas this impacted the measures used in productivity estimates.

In the area of Education, for example, where output is measured via exit qualifications (for example, GCSEs in England), it is not just this year’s teaching and learning which shape this year’s results: the previous ten years of formal education, and pre-primary early years provision, need to be taken into account.

As such, the existing methodology pro-rated qualification results back through the cohort’s education. So, for example, only 30% of a student’s success in Year 11 in 2020 was attributed to the year of education they received in 2020: the other 70% was attributed to their previous educational career. The coronavirus pandemic fundamentally disrupted this model, as weaker performance in the pandemic years, caused by disruption to education at that time, could not be taken to imply that education delivered pre-pandemic was similarly affected. It would be inappropriate for the model to imply that a student sitting their exams in 2021, whose results suffered because of disruption in 2020, should see their 2017 performance downgraded. The Review looked to address this.
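The pro-rating described above, and the problem it creates, can be sketched as follows. The 30% share for the final year comes from the text; how the remaining 70% is spread across earlier years is not stated, so the equal split over ten prior years used here is purely an assumption, as are the function name and the attainment scores.

```python
def attribute_result(result, exam_year, final_share=0.30, prior_years=10):
    """Spread an exam result back over the cohort's years of education.

    final_share is the share attributed to the exam year (30% in the text).
    The remainder is split equally across prior_years earlier years, which
    is an illustrative assumption and not the documented method.
    """
    prior_share = (1.0 - final_share) / prior_years
    attributed = {exam_year: result * final_share}
    for offset in range(1, prior_years + 1):
        attributed[exam_year - offset] = result * prior_share
    return attributed

# Hypothetical attainment scores for a cohort sitting exams in 2021.
undisrupted = attribute_result(100.0, 2021)
disrupted = attribute_result(90.0, 2021)

# Under naive pro-rating, the weaker 2021 result also lowers the output
# attributed to 2017, a year the pandemic could not have affected.
print("2017 share, undisrupted:", undisrupted[2017])
print("2017 share, disrupted:  ", disrupted[2017])
```

The sketch shows only the distortion: a result depressed by pandemic disruption is pushed back onto pre-pandemic years. It does not attempt to reproduce how the Review addressed this.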

2.6 Preventative services and latent capability

How to treat preventative services is at the heart of the conceptual challenges raised by the coronavirus pandemic, although those challenges extend far wider. Preventative services present the most extreme example of needing to quality adjust output data to reflect the true value created by a service. They are generally designed to cost significantly less than the downstream benefits they may unlock.

For example, a low-cost tobacco cessation programme aims to reduce future demand for expensive cancer operations. The sum of costs approach, even if quality adjusted, may be insufficient to reflect the value produced, especially if a cost-weighted activity index is used to aggregate these services alongside other services: a far greater weight would apply to the expensive operations. How to resolve this is addressed in Chapter 3.
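A minimal sketch of the weighting problem follows, assuming a simple base-period cost-weighted volume index (current quantities valued at base-period unit costs, divided by base-period cost). The unit costs, quantities and service names are hypothetical and are not taken from ONS data.

```python
def cost_weighted_activity_index(base_unit_costs, base_quantities, current_quantities):
    """Volume index: current activity at base unit costs, relative to base cost."""
    base_value = sum(base_unit_costs[k] * base_quantities[k] for k in base_unit_costs)
    current_value = sum(base_unit_costs[k] * current_quantities[k] for k in base_unit_costs)
    return current_value / base_value

# Hypothetical unit costs (GBP) and activity counts.
unit_costs = {"cessation_programme": 100.0, "cancer_operation": 10_000.0}
base_qty = {"cessation_programme": 1_000, "cancer_operation": 100}
# Cessation activity doubles; operations fall slightly.
current_qty = {"cessation_programme": 2_000, "cancer_operation": 95}

index = cost_weighted_activity_index(unit_costs, base_qty, current_qty)

# Cost share of the preventative service in the base period.
base_total = sum(unit_costs[k] * base_qty[k] for k in unit_costs)
cessation_weight = unit_costs["cessation_programme"] * base_qty["cessation_programme"] / base_total

print(f"index: {index:.4f}")
print(f"cessation cost weight: {cessation_weight:.1%}")
```

Under these assumed figures the cessation programme carries under a tenth of the total weight, so doubling it moves the index by only a few per cent, while each avoided operation reduces measured output at a unit cost one hundred times higher.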
