UK Statistics Authority supplementary evidence to the Public Administration and Constitutional Affairs Committee

Dear Mr Wragg,

Following the submission of the Office for National Statistics’ (ONS) written evidence to the Committee’s Transforming the UK’s Evidence Base inquiry on 31st August 2023, I gave evidence to the Committee on 5th September 2023. I am now able to provide some supplementary evidence, as requested, on several topics of interest.

The Integrated Data Service (IDS)

As you will be aware, the IDS is a cross-government project, for which the ONS is the lead delivery partner. The project is a key enabler of the National Data Strategy and seeks to securely enable coordinated access to a range of high-quality data assets built, linked and maintained for richer analysis. Please find below some further detail on the background of this project and the progress towards its delivery.

What is the scope of the IDS?

The scope of the IDS is to deliver a secure scalable modern data service which operates on a cloud-native platform, hosting a rich and diverse data catalogue consisting of indexed and linkable data with the latest provision of data science and generative AI potential. The service has been designed to better inform effective policy making.

The vision of the IDS is to address the lack of a central integration platform that can cater for the future needs of both data providers and analysts looking to utilise integrated data to develop cross-cutting analytical results. The IDS builds on the success of the Secure Research Service (SRS) and offers to significantly reduce the time it takes to negotiate and access data and the provision of data assets.

The IDS provides a secure environment that enables streamlined data sharing across government, improving the ways that data are made available via cloud-native technologies and modernising the way departments and their professionals operate. The IDS is the first of its kind in the UK and will set the precedent for how data are processed on a cloud-native platform.

When is it expected to be delivered?

The programme has been in development by the ONS over the last 18 months and is funded until March 2025 (under the current Spending Review). After this date, the IDS becomes a live running service.

What is the cost of the programme?

The programme secured funding from HM Treasury (HMT) until the end of the investment period (financial year March 2024/25). The cost of the programme is estimated to be £228.7m which covers the development and running costs from 2020 – 2025. Furthermore, the programme continues to assess funding options beyond March 2024/25.

Who are the users likely to be?

The IDS is designed for use by accredited analysts, within government and the wider research community. The ambition for the IDS is to have every government analyst, roughly estimated at 14,000 individuals, capable of utilising the platform to better inform decisions for the public good.

What data do you expect to be available on the service?

There are currently 81 datasets available in the IDS from across government. This includes high-value data assets on topics such as levelling up, climate change and net zero. Additionally, work on health data assets is underway, with identified datasets being indexed by the Reference Data Management Framework (RDMF) – which enables multiple datasets to be linked and analysed, creating new comprehensive data assets – and published on the IDS so that analysts can link data according to their requirements.

The programme intends to continue to work with data owners across government and the private sector to acquire more datasets in conjunction with the RDMF.  However, this is dependent on data owners signing up to data sharing agreements to make this data available.

In accordance with the Central Digital and Data Office’s roadmap for 2022-25, departments have agreed to share their essential shared data assets across government, including through IDS. This further enables the IDS as a Trusted Research Environment to facilitate and support this commitment.

However, discussions with government analysts have highlighted a range of concerns about how current incentives for departmental data sharing fit with the needs of ministerial-facing departments. There is also a wider financial risk regarding other departments’ ability to fund activity such as data cleansing, which may limit their ability to effectively share data. Although HMT set out the expectation that other government departments (OGDs) will support data sharing in all SR21 settlements, no specific funding was provided, which may limit activity in some cases. As part of the IDS Programme, ONS is working with Chief Data Officers across government to minimise frictions around the sharing of data via IDS. One of the pilots in development is looking at Data Ownership and Stewardship approaches to streamline the governance arrangements and make it quicker for departments to agree to share data via IDS, and for analysts to subsequently access that data for a broad range of analysis in the public good. As always, I would welcome support from the Committee to share and promote the benefits of data sharing across government for the public good.

What safeguards will be in place to protect data?

The IDS is a trusted research environment, which means it adheres to the 5 Safes in accordance with the Digital Economy Act (DEA). The 5 Safes of secure data are as follows:

  • Safe projects – Is this use of the data appropriate, lawful, and ethical?
  • Safe people – Can the users be trusted to use it in an appropriate manner?
  • Safe settings – Does the access facility limit unauthorised use?
  • Safe data – Is there a disclosure risk in the data?
  • Safe outputs – Are the statistical results non-disclosive?

These principles enable the safeguards and governance for the IDS to operate with sensitive data, which in turn ensures public confidence in the security and processing of data. Access to the IDS platform is granted via a secure gateway in line with the data legislation; furthermore, the IDS applies strict policies around the cleaning, linkage, validation and control of data.

The IDS Programme is also working across ONS in the development of key governance through policy creations that will enable safeguards and the appropriate use of data. The policy workstream, which is coordinated by ONS’ Data Governance, Legislation and Policy and Security and Information Management teams, is helping to develop adequate governance for the programme via policy development. In developing safeguards, the programme employs the following principles:

  • Adapting successful policies within the ONS and across government analytical communities (e.g., GSS, GSR, GES) that can support the programme.
  • Working with the National Statistician’s Data Ethics Advisory Committee, which is underpinned by the UK Statistics Authority’s (UKSA) ethics framework for the use of data for statistical, research and analytical purposes, to identify and mitigate any potential ethical risks at project-level.
  • Controlling access to all data through the concept of an analytical ‘project’, with supporting business and technical processes linked to user need.
  • Maintaining an overarching programme Data Protection Impact Assessment (DPIA) to define key activities and associated data risks, with continued engagement with the Information Commissioner’s Office on the DPIA as it is updated and the programme develops.

The programme also adheres to the UK Statistics Authority/ONS Data Protection Policy (required by the Data Protection Act 2018 and the General Data Protection Regulation).

The ONS website

The Committee also asked for some insight into the current condition of the ONS website and any plans to change the site in the future. Below I have outlined our vision for dissemination, of which our website is an integral part, as well as some exploratory work we are undertaking to see how we could use AI technology to address some of the challenges with our existing website.

Our Vision for Dissemination

The ONS website supports the Statistics for the Public Good strategy by helping to build trust in evidence, enhance understanding of social, economic and environmental matters and improve the clarity and coherence of our communication. By helping people to be aware of the ONS and to find, understand and explore our data, statistics and analysis we are giving people the information they need to make decisions, and act, at a national, local and individual level.

Our vision for statistics dissemination goes beyond the website. We want people to have trust in our data and analysis. We know that our users want to find trusted ONS information wherever they look – whether that’s on the ONS website, on social media, in the media or through search engines. Our users want ONS answers to their questions and we are exploring a range of different approaches to serve this need, including providing answers to questions using Large Language Models (LLMs).

Our goal is for users to understand our data, statistics and analysis more quickly and easily, with the right contextual information to help people know how they can use them. We want our users to explore and tailor our information so they can find what is important to them – whether that is by creating their own datasets based on ONS data or through our expert curated view of key insights for the economy or society.

Our priorities for the website in recent years have been delivering the capability to support Census 2021 outputs and ensuring the reliability of the service for all our users, particularly given the additional demand for ONS data on the economy in response to changes in the cost of living. We’re currently running a package of work to address and improve website performance to meet demand, and our next priority will be programmatic access to our data via application programming interfaces (APIs). This will improve the agility of all users of our data, both internal and external, to consume and gain insights from the ONS website.

We have also focused on improved search both on the ONS website and through greater visibility of our data and insight in search engines and in the media.

This year we are also setting the future direction for how we create and manage our statistical content in a more efficient and structured way to enable business agility and flexibility for our users, aligned to their broad range of needs. This will set out a forward plan to transform ONS data and insight and will make the case for the additional funding needed to deliver on our ambitions.

StatsChat

Additionally, the ONS Data Science Campus are currently exploring how new tools and technology can help the organisation disseminate information more effectively. We have developed a new product, ‘StatsChat’, that uses LLMs to search and summarise text from across our website, and present relevant sections of our web pages in response to users’ natural language questions.

We are aiming to make this available to a small selection of users for testing and fine-tuning, so that we can improve the relevance of the responses and provide assurance from a data ethics, data protection and security perspective.

Stakeholder engagement

The ONS conducts a wide range of user and stakeholder insights, consultations and listening exercises. This engagement is essential as it provides us with actionable insights on users’ and stakeholders’ views on the strength of their relationship with the ONS, feedback on its outputs, and on how stakeholders access and use our statistics and analysis.

As part of this, the ONS’s Engagement Hub conducts annual stakeholder ‘deep dive’ research and an annual stakeholder satisfaction survey. I understand the Committee is interested in understanding more about these exercises and insights from recent examples.

The deep dive research is conducted through in-depth interviews with senior representatives from around 45 key stakeholder organisations. The stakeholder satisfaction survey is an online questionnaire aimed at a wider range of users from a variety of sectors and roles to provide broader insight. Deep dive participants include those from central and local government departments, devolved administrations, research institutes, think tanks, public bodies such as NHS England and the ICO, international partners, business representative bodies and charities. The stakeholder satisfaction survey reaches similar types of organisations, with a wider range of responses at senior manager, operational, public affairs, analyst, researcher, policy maker and economist levels.

Deep dive interviews took place in summer 2022 and the findings were positive. Many stakeholders said that the organisation had built on and maintained its reputation for independence, trustworthiness, quality and reliability. They also felt that the ONS had developed its reputation for being flexible, agile and responsive to changing needs. Additionally, the ONS was seen to be working more collaboratively with policymakers than it had in the past.

The stakeholder satisfaction survey was conducted in early 2023. It found respondents to be positive across key sentiment measures on trust, quality, and on the ONS producing statistics which are relevant to issues of the day. There were also positive views expressed about the ONS as an organisation with reliability, responsiveness, and willingness to help being cited. It was also noted that ONS staff were knowledgeable and helpful.

There were areas highlighted for improvement in both the stakeholder deep dive and satisfaction survey. These included how the ONS works with both devolved governments and heads of the statistical profession in government departments; improving the ease of finding the right people to speak to in the organisation; and more regular, strategic overviews of the ONS’s work (for stakeholders to be able to connect different topics better). Some participants referenced a need for further scrutiny to understand some data anomalies which had occurred in mid-2022.

These findings are shared throughout the ONS, including with the National Statistician’s Executive Group, and are used to inform planning and prioritisation. We have implemented measures to respond to the issues raised as part of a wider programme of ongoing external affairs improvements, which we continue to monitor with further research.

The ONS conducted a subsequent stakeholder deep dive in autumn 2023 and are currently analysing the findings. The latest ONS annual stakeholder satisfaction survey is currently live and will be open for responses until 22 January 2024.

Full business case on population and migration statistics improvements

As you are already aware, next year I will be making a recommendation to Government on the future of the population and migration statistics system in England and Wales. I understand that the Committee has requested some additional detail surrounding the financial aspects of this transformational work.

In the outline business case for the Future of Population and Migration Statistics programme, initial cost estimates of a potential census in 2031 range from £1.3 billion to £2 billion, with increases expected across all phases of such an operation.

The ONS is working to produce a full business case (FBC) for our proposals to improve our population and migration statistics. The FBC will be developed in the context of the forthcoming recommendation to UK Government, and the response from Government. At this stage, while the recommendation remains in development, it is difficult to provide an accurate updated estimate of cost.

The FBC is expected by HM Treasury in late 2024. We will be able to provide the Committee with further information on costs at a later date.

Migration statistics

As part of improving population statistics we are also transforming international migration statistics. Our latest estimates, for the year to June 2023, are official statistics in development and are provisional. We revised our June 2022 and December 2022 estimates upwards due to a combination of more data and methodological improvements.

International migration estimates are produced using three key sources: Home Office border data linked to a person’s travel visa for non-EU nationals, which made up 82% of total immigration in 2023; tax and benefit data (known as RAPID) for EU nationals; and International Passenger Survey data for British nationals. We are most confident with Home Office border data and have an ambition to produce all migration statistics from these data in future.

We work very closely with the Home Office to procure and use border data linked with visa data to produce migration estimates. Free movement for British nationals and some EU and non-EU nationals makes the current method challenging to apply to those who do not require visas. However, there is further data held by the Home Office, known as Advanced Passenger Information, that would help with our research, particularly for British nationals. We have requested these data and would like to see the Home Office accelerate this request.

Census 2021 data confirmed our position that the administrative data we use for non-British nationals is robust and that the international passenger survey data does not measure actual migration patterns well due to people changing their intentions. Rather than rebasing once a decade, following a decennial census, to correct for any drift in our population estimates, we aim to produce statistics that do not ‘drift’ from the truth. Our Dynamic Population Model based population statistics show how drift in both population and migration statistics can be mitigated. That does not remove the need to revise estimates as the data and methods mature.

Long-term international migration uses the UN definition of a migrant, that is someone that changes their country of residence for 12 months or more. To produce timely estimates, we therefore have to make assumptions based on previous behaviour. As more time passes, we are able to update those assumptions with data of actual travel. We therefore become more confident in our estimates over time. For example, our June 2022 estimates now have complete data to show if a migrant has stayed or left for 12 months and we therefore have less uncertainty around those estimates compared to the provisional June 2023 estimates.

We have recently published experimental uncertainty measures for our admin-data based migration estimates for the first time. These show our users how our confidence increases once we have complete data that meet the required definition.

We also described the nature of provisional estimates that are subsequently revised and the reasons behind these revisions. This was picked up and presented accurately in the media and in playing back conversations with our core users. The Office for Statistics Regulation (OSR) recently published a review of their recommendations on migration statistics. The OSR considered we sufficiently described uncertainty to our users, although we recognise these are experimental and will continue to update our users as they develop.

I hope that you find this additional information useful. Please do let us know if we can assist the Committee further on any of the issues discussed in this letter, or with any of its other inquiries.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority correspondence to the Public Administration and Constitutional Affairs Committee on the Office for National Statistics’ work on the Labour Market

Dear Mr Wragg,

I am writing to update the Committee on the Office for National Statistics’ work on the Labour Market.

As you will be aware, due to quality concerns, the ONS suspended publication of the Labour Force Survey (LFS) estimates element of the wider Labour Market release in October. Instead, to provide users with our best assessment of the labour market we produced indicative experimental estimates of the headline employment, unemployment and inactivity rates. These were produced using the most robust administrative data sources available to us. For employment, we used payroll data from HMRC’s Real Time Information system, applying the growth rates of that data to the LFS for April to June 2023. Likewise, we used Claimant Count data for unemployment.
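The growth-rate approach described above can be illustrated with a minimal sketch. It assumes a simple proportional method in which a survey-based level for the base period is carried forward by the growth in an administrative series; the function name and all figures are hypothetical and are not ONS or HMRC data, and the published indicative estimates will involve further steps (for example, population denominators to derive rates) that are not shown here.

```python
def project_from_base(base_level, admin_series, base_period):
    """Carry a survey-based level forward using growth in an administrative series.

    Hypothetical helper for illustration only: each period's estimate is the
    base-period survey level scaled by the administrative series' growth
    since the base period.
    """
    admin_base = admin_series[base_period]
    return {
        period: base_level * value / admin_base
        for period, value in admin_series.items()
    }


# Hypothetical payroll counts (millions); NOT real HMRC figures.
payroll = {
    "Apr-Jun 2023": 30.00,
    "May-Jul 2023": 30.06,
    "Jun-Aug 2023": 30.03,
}

# Hypothetical LFS employment level (millions) for the base period.
lfs_base = 33.0

indicative = project_from_base(lfs_base, payroll, "Apr-Jun 2023")
for period, level in indicative.items():
    print(f"{period}: {level:.3f} million")
```

By construction the base period reproduces the survey level exactly, and later periods move in proportion to the administrative series.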

Today we have published a development plan for the LFS. This will focus on work to increase the number, and diversity, of the responses to the LFS and on improved methods to better account for non-response and bias. We will also update the population figures used in the Labour Market estimates which is another important improvement. With this work in train, we are aiming to reintroduce LFS estimates in the Labour Market release on 12 December.

In parallel, we will continue our work to transform this key survey. Alongside the LFS, we currently also have the transformed LFS in the field. This has a sample size that is three times that of the current LFS and has an on-line first mode of collection supported by telephone and face to face interviewing, to help ensure a higher and more representative response. We are doing some final fine-tuning to the questionnaire and expect to fully transition to this new survey in March 2024.

I do hope that you find this update helpful but please do let me know if you have any other questions about this topic, or if we can be of assistance to the Committee on any other matter.

I am also copying this letter to Harriett Baldwin MP, Chair of the Treasury Committee and The Lord Bridges of Headley MBE, Chair of the Economic Affairs Committee as their specialists were recently briefed on this matter by members of my team.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority follow-up written evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on Transforming the UK’s Evidence Base

Dear Mr Wragg,

When giving evidence to the Public Administration and Constitutional Affairs Committee on 5 September 2023, I promised to follow-up on a couple of points with various members of the Committee.

GDP Revisions

Firstly, I agreed to let you know if I was aware of similar revisions happening in other comparable countries.

As I outlined in the Financial Times recently, the UK’s official economic statistics are rightly seen as among the world’s best. This includes the recent upgrade of our official estimates for economic growth in the pandemic years of 2020 and 2021. The latest Organisation for Economic Co-operation and Development (OECD) information shows that the UK is one of the first countries in the world to estimate the 2020 and 2021 coronavirus (COVID-19) pandemic period through the detailed Supply and Use framework. This standard economic framework enables us to confront our data at a much more granular level for products and industry. The OECD provides a real-time vintages of GDP database in their main economic indicators, which takes data directly from the National Statistics Institutes.

Each country will follow different revision policies and practices, which can result in their estimates being revised at a later date, according to their own needs. The timing and impact of revision changes will depend on data availability and magnitude, with large annual structural surveys being the data source needed to make detailed product and industry changes. These annual data sources come with lags on timeliness, often being available up to 2 or 3 years later.

We have now seen revisions to GDP estimates published by other countries. As we previously announced, the 2021 GDP estimates for the UK were revised to 8.7 percent growth from our initial estimate of 7.6 percent growth, a revision of +1.1 percentage points. The Spanish Statistical Agency has now published 6.4 percent growth in GDP for 2021, compared with the previous estimate of 5.5 percent, a revision of +0.9 percentage points. The Netherlands has now published 6.2 percent growth for 2021, revised from an initial estimate of 4.9 percent, a revision of +1.3 percentage points. Italy has now published 8.3 percent growth for 2021, revised from an initial estimate of 7.0 percent, a revision of +1.3 percentage points. All are upward revisions for 2021 of a similar magnitude to that observed in the UK context. Conversely, the United States has now published 5.8 percent growth for 2021, compared to a previous estimate of 5.9 percent growth, a revision of -0.1 percentage points [ONS own calculations based on published US data from www.bea.gov]. This highlights that revisions can differ across countries.
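The revisions quoted above follow from subtracting each country's earlier growth estimate from its latest one. The short sketch below reproduces that arithmetic from the growth rates stated in this letter; the variable and function names are illustrative only.

```python
# Earlier and latest 2021 annual GDP growth estimates (percent), as quoted in this letter.
estimates = {
    "UK": (7.6, 8.7),
    "Spain": (5.5, 6.4),
    "Netherlands": (4.9, 6.2),
    "Italy": (7.0, 8.3),
    "United States": (5.9, 5.8),
}


def revision_pp(initial, revised):
    """Revision in percentage points, rounded to one decimal place."""
    return round(revised - initial, 1)


revisions = {
    country: revision_pp(initial, revised)
    for country, (initial, revised) in estimates.items()
}

for country, rev in revisions.items():
    print(f"{country}: {rev:+.1f} percentage points")
```

Rounding to one decimal place matches the precision of the published growth rates and avoids floating-point noise in the differences.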

Strengthening the Analysis Standard

Secondly, I promised to examine whether there is a case for strengthening the Analysis Standard. I am passionate about ensuring the robustness of the Analysis Standard and welcome the Committee taking an interest in its strength and its application across Government.

The Analysis Function Standard, which was updated earlier this year, is part of a suite of management standards that promote consistent and coherent ways of working across government, and provides a stable basis for assurance, risk management and capability improvement.

In my letter to you of 18 September regarding the Committee’s report ‘Where Civil Servants work: Planning for the future of the Government’s estates’, I emphasised my work to promote transparency in Government Analysis through my role as Head of the Analysis Function. I am keen to take every opportunity to champion the Standard across government and will reiterate the importance of this area at October’s Heads of Function board meeting.

The Standard is very clear on expectations about transparency in the commissioning, production and publishing of analysis. It also has clear messaging about compliance with the Code of Practice for Statistics and other official guidance for the remaining analytical professions, including the Aqua, Green and Magenta books.

It is my expectation that all departments closely follow the principles in these sets of guidance and through the Analysis Function Standards Steering Group we monitor and scrutinise these documents to ensure their continued effectiveness.

For the first time this year, all Departmental Directors of Analysis undertook a self-assessment against the Standard and, in response, we are starting a series of action groups to drive improvements, including in Departments’ compliance with official guidance.

I will keep the Analysis Function Standard under close review and, where necessary, strengthen the messages in it.

Please do let us know if you have any other questions, and if we can help the Committee further on either of these topics or any of its other inquiries.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority correspondence to the Treasury Select Committee on revisions within Blue Book 2023

Dear Ms Baldwin,

Thank you for your letter of 14 September 2023 regarding revisions within Blue Book 2023. To take your four points in turn:

  1. An overview of the main drivers of these revisions, and whether there were particular circumstances (including those arising from the pandemic) in 2020 and 2021 that made early estimates of GDP especially uncertain.

As I outlined in the Financial Times recently, the UK’s official economic statistics are rightly seen as among the world’s best. This includes the recent upgrade of our official estimates for economic growth in the pandemic years of 2020 and 2021.

It is certainly true that the large shifts in activity, and the means of delivering that activity in many cases, made it harder for all statistical agencies to measure economic activity during the pandemic. But it is equally true that the larger revisions we have seen for our 2020 and 2021 GDP estimates are proportionally in line with the much larger declines and growths seen over these periods as well.

The main drivers of revision in our 2020 and 2021 GDP estimates come from these changes in activity. For example, the health service had increased costs to deliver a reduced amount of output (e.g. protective equipment, and extra staff) during 2020 which increased the intermediate consumption and decreased the value added of the health sector. During 2021 these intermediate consumption costs continued to rise, but more slowly, while output volumes saw a massive increase from the return of mainstream health activities such as elective surgeries but also from the COVID vaccination programme and so value added then grew strongly.

Secondly, retailers and wholesalers also changed the way they operated, with specialist stores being forced to close or be limited to click and collect, and a much larger proportion of transactions being completed online. This again changed the retail and wholesale margins element in 2020, which then partially swung back the other way in 2021 as retailers, especially those selling clothing and textiles, saw a strong recovery.

The third driver of revisions was inventories data, where our annual, more complete, data sources showed that businesses undertook more stock building than previously thought at the start of the pandemic, when restrictions were quickly introduced. For more detail, please see our article of 1 September 2023.

  2. An explanation of what has been learnt from these revisions about what may have been wrong with the earlier estimates, and what improvements the ONS will implement from what it has learnt.

Our early monthly and quarterly estimates for GDP followed the standard ONS procedures using the available ONS data sources. The challenge was the sheer scale of fundamental change in the economy in such a small space of time. The ratio of intermediate consumption to final output is usually very stable, and as a result ONS did not have any data sources for changes to this ratio for periods beyond the latest supply and use balanced year, which was 2018 at the time the pandemic started.

We have now sourced intermediate consumption data on a more timely basis for the health service with quarterly and annual data available within a month or two of the reference period. We are also investigating the use of administrative tax data (VAT) on purchases by businesses as a means of identifying changes in the intermediate consumption ratio more quickly across industry.

We have welcomed the recently announced review by the Office for Statistics Regulation, and look forward to their recommendations as one of the themes relates to “Potential improvements to early estimates of GDP enabled through enhanced access to data”.

  3. An outline of whether the ONS expects similarly large revisions to GDP data for 2022, in either direction, and more broadly whether the ONS sees revisions of this size as exceptional or typical.

The revisions profile of GDP estimates for 2022 and for the first half of 2023 was published on 29 September in the Quarterly National Accounts. There was little to no revision to previously published GDP from 2022 onwards, with only 1 of the last 6 quarters revised. The quarterly growth rate of GDP across all of 2022 was unrevised, while growth in 2023 Q1 was revised up 0.2 percentage points and 2023 Q2 was unrevised. With this release, we observed that revisions for that period are more typical of the pre-pandemic era.

As part of our continual improvement, we have already implemented the new health intermediate consumption data to reduce the potential for revision in this large sector of the economy. While other work looking at wider intermediate consumption continues, we have proactively reviewed areas such as rail transport and air transport to ensure that the intermediate consumption ratio of 2021 does not apply directly to 2022 as well, where we can see clear evidence of a recovery in those sectors. As part of the OSR review of GDP, ONS has committed to provide additional revision analysis of our GDP estimates in October 2023.

Given the ONS notes that it has completed its revisions to GDP using a Supply and Use Table framework ahead of many other countries, what it expects may happen in comparator countries when they undertake their own similar analysis.

Each country will follow different revision policies and practices, which can result in their estimates being revised at a later date according to their own needs. The timing and impact of revision changes will depend on data availability and magnitude, with large annual structural surveys being the data source needed to make detailed product and industry changes. These annual data sources come with lags on timeliness, often being available up to 2 or 3 years later.

We have now seen revisions to GDP estimates published by other countries. As we previously announced, the 2021 GDP estimates for the UK were revised to 8.7 percent growth from our initial estimate of 7.6 percent growth, a revision of +1.1 percentage points. The Spanish Statistical Agency has now published 6.4 percent growth in GDP for 2021, compared with the previous estimate of 5.5 percent, a revision of +0.9 percentage points. The Netherlands has now published 6.2 percent growth for 2021, revised from an initial estimate of 4.9 percent, a revision of +1.3 percentage points. Italy has now published 8.3 percent growth for 2021, revised from an initial estimate of 7.0 percent, a +1.3 percentage point revision. All are of a similar magnitude of upward revision for 2021 to that observed in the UK context. Conversely, the United States has now published 5.8 percent growth for 2021, compared to a previous estimate of 5.9 percent growth, a revision of -0.1 percentage points [ONS own calculations based on published US data from www.bea.gov]. This highlights that revisions can differ across countries.
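The revision figures quoted above are simply the latest published growth rate minus the initial estimate. The short Python sketch below reproduces that arithmetic from the growth rates quoted in this letter; the variable and function names are illustrative labels only, and decimal arithmetic is used so that one-decimal-place figures subtract exactly.

```python
from decimal import Decimal

# Annual GDP growth for 2021 (percent), as quoted in this letter:
# country -> (initial estimate, latest published estimate).
GROWTH_2021 = {
    "United Kingdom": ("7.6", "8.7"),
    "Spain": ("5.5", "6.4"),
    "Netherlands": ("4.9", "6.2"),
    "Italy": ("7.0", "8.3"),
    "United States": ("5.9", "5.8"),
}


def revision_pp(initial: str, latest: str) -> Decimal:
    """Revision in percentage points: latest estimate minus initial estimate."""
    return Decimal(latest) - Decimal(initial)


revisions = {
    country: revision_pp(initial, latest)
    for country, (initial, latest) in GROWTH_2021.items()
}

for country, revision in revisions.items():
    # The "+" format flag prints an explicit sign for upward revisions.
    print(f"{country}: {revision:+} percentage points")
```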

Please do let me know if you have any further questions about this topic or if I can be of assistance to the Committee on any other matter.

I am copying this letter to Rt Hon Greg Clark MP, Chair of the Science, Innovation and Technology Committee, and William Wragg MP, Chair of the Public Administration and Constitutional Affairs Committee.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority written evidence submission to the Public Administration and Constitutional Affairs Committee inquiry into the UK’s evidence base

Dear William,

I write in response to the Committee’s call for evidence for its new inquiry, Transforming the UK’s Evidence Base. I very much welcome this inquiry with its focus on the future of data, statistics and analysis in government, as we in the UK Statistics Authority look to the future in a variety of ways.

As you will be aware, the Office for National Statistics (ONS) launched a consultation on the future of population and migration statistics in England and Wales in June. I enclose written evidence from the National Statistician, Sir Ian Diamond, within this submission, which highlights not only this consultation but also the progress of data sharing in government so far, how the Authority uses data ethically and protects users’ privacy, and how the ONS understands and responds to user needs.

Meanwhile, the Office for Statistics Regulation published their review into data sharing and linkage for the public good in July. I also attach written evidence from their Head of Regulation, Ed Humpherson, within this submission, where they discuss their findings from this report in more detail. In addition, they will soon be launching a review and will be seeking feedback on the Code of Practice for Statistics to ensure it remains relevant for today’s world of data and statistics production. We will provide more detail to the Committee on this soon.

Sir Ian Diamond, Ed Humpherson and I stand ready to engage with the Committee to expand on any of these points if helpful, and indeed will follow all oral evidence sessions of the inquiry with interest.

Yours sincerely,

Sir Robert Chote

Chair, UK Statistics Authority

Office for National Statistics response

Data and analysis in government 

How are official statistics and analysis currently produced? 

  1. Official statistics are defined as those produced by organisations named in the Statistics and Registration Service Act 2007 (the 2007 Act) or in the Official Statistics Order (SI 878 of 2023). The Code of Practice for Statistics (the Code) sets the standards that producers of official statistics should follow. The Office for Statistics Regulation (OSR) sets this statutory Code, assesses compliance with the Code, and awards the National Statistics designation to official statistics that comply with the highest standards of the Code.
  2. The majority of official statistics are produced by statisticians operating under the umbrella of the GSS, working in either the Office for National Statistics, UK government departments and agencies, or one of the three devolved administrations in Northern Ireland, Scotland and Wales. Every public body with a significant GSS presence has its own designated Head of Profession for Statistics. Each of the devolved administrations has its own Chief Statistician. The Concordat on Statistics sets out an agreed framework for statistical collaboration between the UK Statistics Authority, UK Government, and the Northern Ireland, Scottish and Welsh Governments.
  3. The Analysis Function brings together the 16,000 analysts across 7 professions. The strategic aim of the Analysis Function is to integrate analysis into all facets of Government, building on the strengths of professions. The Analysis Function supports analysis in Government through capability building, sharing good practice, championing innovation, and building a strong analytical community. The National Statistician is head of the Analysis Function, as well as the Government Statistical Service (GSS). Each Government department has a Departmental Director of Analysis, who is responsible for analytical standards in their department. The network of Departmental Directors of Analysis (DDANs) forms the leadership of the Analysis Function in order to drive and deliver the functional aims.
  4. A range of analytical techniques and various sources of evidence, combined or individually, including official statistics, can be used to provide insights for key questions for the public and decision makers. Such analytical processes and products are also supported by guidance such as the Green Book (appraisal of options), the Magenta Book (evaluation), and the Aqua Book (quality assurance), which set the highest standards for government analysis.
  5. Official statistics and analysis across government are currently produced in line with the Code and its three pillars: trustworthiness, quality, and value.

How successfully do Government Departments share data?

  1. Successful data sharing across government departments is critical to operating and transforming the statistical system and producing high quality, trustworthy, and valuable analyses. The importance of sharing, and linking, data and putting data at the heart of statistics is set out in our current consultation on the future of population and migration statistics in England and Wales.
  2. There are some good examples of effective data sharing across government. The COVID-19 pandemic illustrated the ability of government and public services to use and share data to help and protect people. When data are shared effectively, the speed at which analysis can be done means time-critical policy issues can be understood and addressed quickly. For example, we created the Public Health Data Asset (PHDA) which is a unique population level dataset combining, at individual level, data from the 2011 census, mortality data, primary care records, hospital records, vaccination data and Test and Trace data, and allowed us to link across these data sources to provide new insights.
  3. Cross Government networks and the Analysis Function have a critical role in the success of cross department data sharing. For example, the Data Liaison Officer network, the National Situation Centre, the ONS and the Analysis Function recently collaborated to produce guidance on data sharing for crisis response. This built on the principles developed and utilised during the COVID-19 pandemic.
  4. However, there are challenges to building on this success and maintaining the momentum that occurred during the pandemic. Data sharing between departments continues to carry asymmetric risk, with the perceived risks – either legal, operational or reputational – falling on the supplier or department sharing the data, while the benefits are diffuse across the system, or perceived to accrue to others. There are several common challenges to data sharing by suppliers, for example their level of risk appetite and differing interpretations of the law, their data preparation (and accordingly, data quality) and engineering capacity, and the governance within their own organisations.
  5. In terms of risk appetite, even when the environment is judged to be safe and secure by internal and external parties, there is still too much weight being placed on the risks of data sharing as opposed to the very real risk to the public of policy harm and loss of opportunity where valuable data is not being actively used and shared. The OSR notes this point in their report on data sharing and linking.
  6. Another challenge is that agreements to share data are often narrow: from one department to another for a specific purpose, for example a piece of analysis specific to a policy area or statistical output. It is often challenging to broaden these agreements, which limits the extent to which data can be reused or shared more broadly across multiple departmental boundaries. This creates inefficiency where the value of data is not fully realised and causes government departments to incur unnecessary and duplicative costs in implementing numerous bilateral arrangements, often with the same party.
  7. The level of data maturity across departments is also varied, which leads to a multitude of different approaches and interpretations to agreeing data sharing, a wide range of people being involved in approving data ownership and stewardship, and a myriad of different templates to formalise agreements. This contributes to a complex and burdensome system, which leads to long lead-in times to agree data shares. This issue is particularly acute when data is brought together and integrated from multiple departments, necessitating different governance processes to be engaged each time a change is required.
  8. The ONS provides an ‘Acquisition Service’, which proactively supports data suppliers and collaborates to put in place mechanisms which support the sharing of data, reducing the burden on the supplier as far as possible. For example, this could involve seconding analysts into a department, drafting Memoranda of Understanding on behalf of suppliers, and agreeing to undertake significant improvement work on the data to make it of high and usable quality.
  9. The ONS is the lead delivery partner on behalf of government to deliver a cross government major programme, the Integrated Data Service (IDS), a Trusted Research Environment (TRE) which seeks to build on the success of the Secure Research Service (SRS), to bring together ready-to-use data to enable faster and wider collaborative analysis for the public good. The IDS intends to transform the way that data users share and access government data. Firstly, the IDS is a fully cloud-native system which will further enable connectivity across a federated data environment, reducing the friction caused by sharing data multiple times. Secondly, it will provide the facility to fully exploit the opportunities for safe and secure access to data provided for in the Digital Economy Act 2017 (DEA). Thirdly, it will apply a common linkage approach to enable analysts to join data from different departments, repeatedly, to meet diverse analytical requirements.
  10. The ambition is that the IDS will help overcome some of the existing challenges, costs and delays to effective data sharing across government. Its success, however, will depend on the extent to which government departments can embrace a common approach to sharing, stewarding, linking and accessing data.

How do other nations collect and produce statistics? 

  1. The UK is connected with other National Statistical Organisations (NSOs) on both a multilateral and bilateral basis, to learn and to share best practice. The UK is represented at the highest levels in multilateral fora such as the Organisation for Economic Co-operation and Development (OECD) and the United Nations Economic Commission for Europe (UNECE) and participates in a number of working groups advancing the use of administrative data and new forms of data. For example, we recently presented our work on nowcasting to the UNECE Group of Experts on National Accounts which gathered interest from other countries such as Austria, Canada, Indonesia and the United States.
  2. The UK chairs the UNECE Expert Group on Modernising Statistical Legislation, ensuring that statistical legislation and frameworks are equipped to deal with a changing data ecosystem. This has led to further work on data ethics, social acceptability, and access to privately held data. We also sit on the UNECE Task Force on Data Stewardship, which aims to develop a common understanding of the concept of ‘data stewardship’ and define the role of NSOs as data stewards.
  3. We are regularly contacted by other NSOs to share our experiences of utilising data science techniques to produce high quality data in near real time to inform decision making. For example, the ONS recently hosted the German-led Future of Statistics Commission to share our experiences of utilising new data and new methodology to keep up with societal needs and respond efficiently to emerging crises. We have hosted colleagues from Statistics Finland to discuss the data landscape in the UK and colleagues from Statistics New Zealand to discuss challenges faced with data collection. We have also shared experiences on our real-time economic indicator suite, particularly on new sources of data such as card data.
  4. The ONS has been involved with the World Health Organisation’s Pandemic Preparedness toolkit project. This project calls upon the Authority’s pandemic response experience and expertise to develop a toolkit containing practical guidance, statistical methods, knowledge products, case studies and training materials for other NSOs, particularly for data sharing.
  5. As part of the International Census Forum, the ONS has had ongoing conversations with the US Census Bureau, Statistics Canada, Australian Bureau of Statistics, Statistics New Zealand and CSO Ireland about the use of administrative data for population statistics. All participants have benefited from these conversations over the last few years, covering subjects such as census collection planning and efficiencies, quality assurance, processing improvements and contingency plans.
  6. The UK is a member of the UN Statistics Division (UNSD) Collaborative on the Use of Administrative Data for Statistics. This group of countries and regional and international agencies was convened by the UNSD and the Global Partnership for Sustainable Development Data (GPSDD), with the aim of strengthening countries’ capacity to use administrative data sources for statistical purposes, including replicating census variables.
  7. The Collaborative provides a platform to share resources, best practices and experiences. This includes a self-assessment tool, a draft toolkit for quality assessment of administrative data sources, and an inventory of resources which contains recommendations and practical examples on the use of administrative data in different contexts.

The changing data landscape

Is the age of the survey, and the decennial Census, over?

  1. The ONS’s vision is to improve its statistics so that they can respond more effectively to society’s rapidly changing needs. The ONS is proposing to create a sustainable system for producing essential, up-to-date statistics about the population. To do this, the system would primarily use administrative data like tax, benefit, health and border data, complemented by survey data and a wider range of data sources. This could radically improve the statistics that the ONS produces each year and could replace the current reliance on, and need for, a census every ten years.
  2. Producing high-quality, timely population statistics is essential to ensure people get the services and support they need, both within their communities and nationwide. Population statistics provide evidence for policies and public services, as well as helping businesses and investors to deliver economic growth across the country. It is important that these statistics are up to date and reliable, so that they can accurately reflect the needs of everyone in society. Currently, the census provides the backbone of these statistics, offering a rich picture of our society at national and local levels every ten years. Every year, the ONS brings together census data with survey and administrative data to reflect changes in society. As a result of this approach of ‘rolling forward’ estimates year-on-year, statistics become less accurate over the ten years between censuses and local detail on important topics becomes increasingly out of date. After each census, the previous decade’s mid-year population estimates are “rebased” to ensure they are consistent with the baseline estimate from the new census. This makes the previous decade’s mid-year population estimates as accurate as possible.
  3. There has also been a well-documented global trend of declining response rates to surveys and censuses, which can affect the representativeness of the data and, as a result, data quality. While Census 2021 in England and Wales enjoyed a high level of public engagement and response, this is an outlier against the wider trend for population censuses and social surveys across the world.
  4. Data collection is costly, and these costs can be elevated when the need arises to incentivise survey response (often monetarily), chase response many times (particularly in the case of the census) or adjust collection operations. In recent examples where response rates to censuses have been below targets, mitigations have included extended deadlines for census responses, boosted communication campaigns, and greater use of administrative data to enable the production of robust estimates, all of which can contribute to elevated cost.
  5. Building on recent advances in technology and statistical methods, and legislative facilitation of data-sharing across government for statistical purposes, the ONS has for several years been researching the use of administrative data as a primary source for meeting user needs for some statistics. For population statistics specifically, this work is responding to the Government’s ambition, set out in 2014, that censuses after 2021 “be conducted using other sources of data and [provide] more timely statistical information.”
  6. This research has shown that the ONS can produce population estimates with a more consistent level of detail and accuracy over time, and migration estimates based on observed travel patterns rather than respondents’ stated intentions, using administrative data to respond to the difficulties of estimating internal and international migration. The ONS has also developed methods for producing information about the population more often and more quickly. These methods will offer insights into our rapidly changing society as administrative data reach their full potential over the next decade.
  7. In June 2023, the ONS launched a consultation on its proposals for the future of population and migration statistics in England and Wales, responses to which will inform a recommendation to Government. The consultation’s proposals emphasise that surveys may continue to play an important role whilst ONS works with partners to widen and develop the range of administrative data sources that are collected and used. However, the ONS believes it has reached a point where a serious question can be asked about the role the census plays in its statistical system.
  8. If implemented, the proposed system would respond more effectively to society’s changing needs by giving users high-quality population statistics each year. It would also offer new and additional insights into the changes and movement of our population across different seasons or times of day. For many topics, it would provide much more local information not just once a decade but every year, exploring them in new detail and covering areas not recorded by the census, such as income. These are ambitious changes, and decisions on the next phase of this work will set the direction for the ONS’s work programme over the coming years, as the ONS continues to improve its population and migration statistics.
  9. It is worth noting that fast-paced, qualitative surveys have, and will continue to have, a place in the statistical system. For example, the Opinions and Lifestyle Survey and the Business Insights and Conditions Survey illustrate the worth of flexible surveys: they have been adapted from a source of rapid intelligence during the pandemic into useful tools for understanding, at pace, current issues ranging from the cost of living to the adoption of AI.
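The ‘rolling forward’ and ‘rebasing’ steps described in paragraph 2 above can be illustrated with a deliberately simplified sketch. It assumes a single population total that is updated each year by births, deaths and net migration, and it assumes that rebasing spreads the gap between the rolled-forward estimate and the new census count evenly across the decade. The figures, the function names and the linear allocation are all hypothetical; the methods used in practice work by age, sex and local area and are considerably more detailed.

```python
def roll_forward(base, components):
    """Roll a census-based population estimate forward one year at a time.

    components: one (births, deaths, net_migration) tuple per year.
    Returns estimates for year 0 (the census base) to year len(components).
    """
    estimates = [base]
    for births, deaths, net_migration in components:
        estimates.append(estimates[-1] + births - deaths + net_migration)
    return estimates


def rebase(estimates, new_census_count):
    """Revise rolled-forward estimates so the final year matches the new census.

    Assumption for illustration: the accumulated error is allocated
    linearly over time, so year 0 is unchanged and the last year moves
    by the full gap.
    """
    n_years = len(estimates) - 1
    gap = new_census_count - estimates[-1]
    return [est + gap * year / n_years for year, est in enumerate(estimates)]


# Hypothetical figures, in thousands of people.
components = [(700, 550, 200)] * 10
rolled = roll_forward(56_000, components)
rebased = rebase(rolled, new_census_count=59_000)

print("Rolled forward:", rolled[-1])
print("Rebased:", rebased[-1])
```

In this example the rolled-forward estimate drifts above the new census count by the end of the decade, and rebasing pulls the intervening years back towards it, with the largest adjustment in the most recent years.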

What new sources of data are available to government statisticians and analysts? 

  1. The ONS has been a heavy user of Census and survey data and, more recently, government administrative data. Given the new global data landscape, and the increasing society-wide and economy-wide digitisation, new and exciting massive, high-frequency data (‘Big Data’) are being generated, some of which are proprietary and some of which are openly available. These new data sources offer an opportunity for the ONS to be radical and ambitious in what data it uses – in line with the Authority’s strategy, Statistics for the Public Good – and how, to provide better, highly trusted information and statistics to the public, to the private sector, and the public sector (including government) at national and regional level, across a wide range of issues.
  2. A key part of the current consultation on the future of population and migration statistics seeks input on transforming population and migration statistics using alternative data, as opposed to using predominantly survey-based sources. The ONS has now published a suite of evidence demonstrating the opportunity to use alternative data sources to deliver more timely and granular statistics as well as provide value for money.
  3. To support statistical transformation and the strategy, the ONS works with hundreds of independent data suppliers including central, devolved and local government, and the private sector, to share data for public good research. The ONS’ most recent data transparency publication demonstrates the breadth of data sources containing personal data, and the broad opportunity to use novel data sources to support statistics.
  4. The ONS uses around 700 data sources, including health, tax, benefits, education and demographic information from other government departments and public bodies. We also work with a wide range of data from commercial providers including retail scanner data, where coverage is around 60-70% of the grocery market, alongside financial transaction data, domestic energy usage and others.
  5. A recent example of the ONS harnessing the power of new data sources is published data on UK Direct Debits, developed with our partners at Vocalink and Pay.UK. This new, fast and anonymised data source provides insight on consumer payments to most of the UK’s largest companies. It gives the ONS, for the first time, an opportunity to rapidly analyse price movements such as changes in the average Direct Debit amount for bills, subscriptions, loans, or mortgages as well as overall payment behaviour via failure to make these Direct Debit payments. This provides extremely useful and timely insights into the state of the UK economy and in time, could feed into wider national accounts estimates.
  6. In addition, new sources being acquired as part of the transformation of consumer statistics are already being incorporated into headline measures of Consumer Prices Index (CPI). This has started with the incorporation of rail fares, enabling a far greater level of detail to be accessed within our published indices.
  7. Alongside the opportunities presented by novel data sources, there is also huge potential through continued broader integration of data. Integrated data assets, made up of multiple constituent data sources, provide the potential for greater depth and breadth of research, and for the creation of insight which is not possible from analysing data sources in isolation.
  8. A good example of the value of integrated data is ONS’ development of a PHDA which has enabled the ONS to produce novel analyses including Ethnic contrasts in COVID-19 mortality and other high impact pandemic related statistics. The ONS is now using the PHDA to explore more indirect impacts of COVID-19 and wider non-COVID research questions. These include the impact of health on labour market participation by linking in Department for Work and Pensions (DWP) and HM Revenue and Customs (HMRC) data.
  9. The ONS has also made use of ‘open’ data to better inform the public, for example using Automatic Identification System (AIS) shipping data to help monitor trade (shipping) flows. This data science work feeds into the regular monthly ONS publication on Economic Activity and Social Change in the UK, Real Time Indicators. We are looking at other forms of open data that can be used to produce statistical outputs and products that deliver on statistics for the public good.
  10. As well as using linked data, the ONS brings multiple types of data together and uses advanced data science methods to deliver statistics for the public good. The ONS collects travel to work data from the census every ten years, most recently in 2021. Travel to work matrices show the movement of people from their home location (origin) to their place of work (destination) at an aggregated level. Information on travel to work provides a basis for transport planning, for example, whether new public transport routes or changes to existing routes are needed. Additionally, it allows the measurement of environmental impacts of commuting, for example traffic congestion and pollution, and how these might change over time, for example because of changes in commuting modes, such as a shift from car to bicycle. On its own, census travel to work data lets us generate travel to work matrices only for census years, that is, at 10-year intervals with no updates for the years in between. However, using Census 2011 travel to work data, Census 2021 population data, National Travel Survey data (collected by Department for Transport (DfT)), National Trip Ends Model data (produced by DfT), and ONS geography products such as MSOA boundaries and Population Weighted Centroids, we can produce estimates of travel to work matrices at more regular intervals than once every 10 years.
  11. The approach to integrated data is being taken further as part of the IDS, with data ‘indexed by default,’ enabling common deidentified identifiers to be consistently applied to enable a common linkage approach. Data deidentified in this way can be grouped thematically to create Integrated Data Assets around the themes of health, levelling up and net zero. The value of this approach is that it retains the core value of the source data, while being supported by Privacy Enhancing Technology, and facilitates access to a much broader range of data enabling analysts to exponentially grow the value of their analysis.
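Paragraph 10 above describes producing travel to work matrices more often than once a decade by combining an older census matrix with newer data. One widely used technique for this kind of update is iterative proportional fitting (IPF), which rescales an existing origin-destination matrix until its row and column totals match newer totals. The sketch below illustrates IPF with made-up numbers. It is offered as a generic illustration and not as a description of the method the ONS uses; the areas, flows and totals are hypothetical.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Scale `seed` so its row and column sums match the target totals.

    seed: base origin-destination matrix (rows = home areas,
          columns = workplace areas).
    row_targets, col_targets: updated totals; both must sum to the same value.
    """
    matrix = [row[:] for row in seed]  # work on a copy of the base matrix
    for _ in range(iterations):
        # Scale each row to the updated number of workers living in that area.
        for i, target in enumerate(row_targets):
            total = sum(matrix[i])
            if total > 0:
                matrix[i] = [value * target / total for value in matrix[i]]
        # Scale each column to the updated number of jobs located in that area.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in matrix)
            if total > 0:
                for row in matrix:
                    row[j] *= target / total
        # Columns now match; stop once the rows are within tolerance as well.
        if all(abs(sum(matrix[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return matrix


# Hypothetical base-year commuting flows between three areas.
base_flows = [
    [80.0, 15.0, 5.0],
    [20.0, 60.0, 20.0],
    [10.0, 10.0, 80.0],
]
updated_residents = [120.0, 100.0, 90.0]   # workers by home area
updated_workplaces = [130.0, 90.0, 90.0]   # jobs by workplace area

updated_flows = ipf(base_flows, updated_residents, updated_workplaces)
for row in updated_flows:
    print([round(value, 1) for value in row])
```

The design choice here is that the pattern of commuting in the base matrix is preserved as far as possible, while the totals are brought into line with more recent information.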

What are the strengths and weaknesses of new sources of data? 

  1. Use of new administrative and commercial data sources is transforming the way we produce statistics (a trend seen across the world). The sources provide timely, frequent and granular data about the population that cannot be obtained through survey collection. Through the linkage of different data sources, we can provide coverage across the population down to local levels of geography. Innovations in this area include a new approach to produce more timely and frequent high-quality estimates of the size and composition of the population down to local level and how this changes due to international migration. However, administrative data are not collected for statistical purposes and, as a result, there are both strengths and weaknesses with such data sources.
  2. Relevance and conceptual fit: the content of administrative data is determined by the services they support. This includes the topics collected, but also the precise definitions of the items that are measured. While surveys can be designed so that the questions they ask capture the statistical concepts we want to measure as accurately as possible, this is rarely possible for administrative sources, especially well-established ones. In practice, that means we need to adjust the data so it fits with statistical definitions using additional data from elsewhere (such as a survey) or can only approximate those definitions if adjustment is not possible. An important area to help with this is collaboration across government, to improve the collection of data in administrative systems, particularly around protected characteristics.
  3. Coverage: the strength of many of these large data sources is their granularity, which gives us the ability to analyse data for small groups or for small areas. This is not always possible with surveys, particularly surveys with small sample sizes. However, the coverage of most of these datasets will be incomplete. Parts of the population will not interact with the administrative source and therefore they will be missed from the dataset. In other cases, the data may not cover everybody or everything. The power, from an analytical point of view, comes from linking different datasets together to improve the coverage and enable analysis at a local level. However, sometimes surveys are also needed to fill the coverage gaps. There can also be over coverage in data sources, where individuals appear on the dataset who aren’t within a target population, for example the inclusion of emigrants and short-term residents who have recent administrative data activity but are either no longer resident or are not resident for sufficient time to meet the definition for inclusion in our estimates.
  4. Linkage: new sources of data need to be integrated to improve coverage and allow analysis. There are often no unique identifiers to enable this linkage, which adds to the complexity, time, and cost of processing data for analysis. The complexity of linkage without common unique identifiers means that it is never perfect. Moreover, the quality of linkage may vary across the population. Understanding and quantifying linkage quality is critical, as issues that arise (such as under-representation) will feed through into statistical analysis and may affect results if not properly mitigated.
  5. Timeliness, for both collection and delivery (although new data sources are often relatively more timely than survey data):
    • Collection lags: there is often a lag between an event occurring and the data for that event becoming available, for example moving address, then registering at a doctor’s surgery or making a profit and filing a tax return. There are different time lags for different datasets. Real time analysis is often not, at this point in time, possible from the new data sources; there is always a time lag.
    • Data delivery: timelines can impact on the timeliness of the statistical and analytical outputs. Data takes time to be processed and be delivered to statisticians who analyse it. Often data can be delayed in the delivery or not shared in line with analytical requirements due to the nature of data sharing agreements. Within the ONS we are working with departments to build mature data transfer systems supported by robust Data Sharing Agreements (DSAs).
  6. Coherence, harmonised standards, and metadata: different operational policies lead to associated administrative data being collected in different ways. One dataset may be at quite a high level, while others could have more detailed information. Even datasets that appear to collect information on the same metric may not be comparable, or not wholly comparable. This makes analysis more difficult and can mean that data is not available at the relevant level of granularity. Metadata and detailed information about the data collected is also often lacking, making the data more difficult to use. We are using secondments into departments to better understand the data and build the metadata, and thus improve this situation.
  7. Stability and accuracy: with survey data we have control over the questions asked and the stability of those questions. Administrative or commercial data can change, for example in what and how data is collected because of changes to the operational process. This is both a strength and a weakness: strength, as it allows the data to adapt more to changing requirements and needs; weakness, as it brings in a possibility of breaking series, which is not ideal for statistical analysis of trends. An example of this was the removal of universal Child Benefit, which caused a big drop in the coverage of children in DWP/HMRC data. To future-proof statistics, we need to make sure that we are not reliant on just one source of data. In addition, the accuracy of a variable will often depend on whether it is critical to the administrative function; when it is not, quality drops, as the data may often be missing (if voluntary) or may not undergo robust checks on collection.
  8. Design: some administrative data collection processes were designed a few decades ago and can rely on legacy IT for analysis. This also means that the design of questions or forms may not fit new needs or requirements, or follow user-centred design principles, which affects the quality of the data collected. However, some of the major sources that we are currently using have undergone improvement in this area.
  9. Supplier restrictions: some suppliers place restrictions on the data that is shared, for example by applying techniques to data to enhance privacy (such as hashing, perturbing, aggregating data). This can damage the usefulness of the data and how it can be used within statistical outputs.
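As an illustration of the point under Linkage about quantifying linkage quality across the population, the short Python sketch below computes the share of records successfully linked within each subgroup. It is a minimal sketch under stated assumptions, not a description of the method used by the ONS: the field names `age_band` and `linked`, the function name and the sample records are all hypothetical.

```python
from collections import defaultdict


def match_rate_by_group(records, group_key="age_band", linked_key="linked"):
    """Return the share of records successfully linked within each group."""
    totals = defaultdict(int)
    linked = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[linked_key]:
            linked[group] += 1
    return {group: linked[group] / totals[group] for group in totals}


# Hypothetical records: each one notes whether a link to a second source was found.
records = [
    {"age_band": "16-24", "linked": True},
    {"age_band": "16-24", "linked": False},
    {"age_band": "16-24", "linked": False},
    {"age_band": "65+", "linked": True},
    {"age_band": "65+", "linked": True},
    {"age_band": "65+", "linked": True},
    {"age_band": "65+", "linked": False},
]

rates = match_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} linked")
```

A markedly lower rate for one group than another, as in these made-up figures, is the kind of signal that would point to under-representation needing mitigation before analysis.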

Protecting privacy and acting ethically 

Who seeks to protect the privacy of UK citizens in the production of statistics and analysis? How? 

  1. All producers of official statistics are legal entities and data controllers in their own right, and are therefore responsible for protecting the data they hold and use. Data protection legislation, including the UK GDPR and the Data Protection Act 2018, provides the statutory framework that all producers of official statistics must adhere to, and makes specific reference to personal data processed for statistics and research purposes. The Information Commissioner’s Office (ICO) is the independent authority that ensures compliance with data protection legislation and upholds information rights in the public interest. As part of its role, the ICO provides advice and guidance, including, for example, a Data Sharing Code of Practice.
  2. As the executive office of the Authority, the ONS collects and processes information, both directly from individuals and from other organisations, and does so using a variety of methods. The Authority has the statutory objective to promote and safeguard the production of official statistics that serve the public good. Any personal data collected by the Authority can only ever be used to produce statistics or undertake statistical research.
  3. In addition to data protection legislation, personal information held by the Authority is further protected by the 2007 Act, which makes disclosure of personal information a criminal offence, except in limited prescribed circumstances, for example where disclosure is required by law.
  4. The DEA provides the Authority with permissive and mandatory gateways to receive data from all public authorities, crown bodies and businesses. These data sharing powers can only be used for the statistical functions of the Authority and sharing can only take place if it is compliant with data protection legislation.
  5. The 2007 Act requires the Authority to produce and publish the Code, governing the production and publication of official statistics. One of the core principles of the Code is around data governance and this states that organisations should ensure that personal information is kept safe and secure.
  6. The ONS provides guidance, support, and training on matters across the GSS, including on data protection and privacy.

Data protection

  1. The Authority has a dedicated Data Protection Officer, together with teams that manage data protection, legal compliance and the security of data. These teams provide advice and guidance across the organisation on data protection matters; deliver training sessions on the protection of data; and engage regularly with the ICO.
  2. The Authority takes a data protection by design approach when processing data for statistical purposes. Privacy and data protection issues are considered at the design phase of systems or projects. The Authority has published extensive material regarding privacy for members of the public, including privacy information for those taking part in surveys and a Data Protection Policy. For new projects that involve the processing of personal data, colleagues are advised to complete Data Protection Impact Assessments, which enable the Authority to identify any risks of processing to data subjects and to mitigate those risks.

Statistical confidentiality

  1. The Authority collects a vast range of information from survey respondents, as well as administrative data, such as registration information on births, deaths and other vital events. The ONS publishes statistics and outputs from this information, and statistical disclosure methods are applied so that the confidentiality of data subjects, including individuals, households, and corporate bodies, is protected. All statistical outputs are checked for disclosure risk, and disclosure control techniques are applied as required.
  2. The DEA facilitates the linking and sharing of de-identified data by public authorities for accredited research purposes to support valuable new research insights about UK society and the economy. The Authority is the statutory accrediting body for the accreditation of processors, researchers and their projects under the DEA.
  3. The Authority allows access to de-identified data within its trusted research environments. To ensure the security of the data and individual privacy, the Authority uses the Five Safes Framework:
    • Safe People: trained and accredited researchers trusted to use data appropriately.
    • Safe Projects: data that are only used for valuable, ethical research that delivers clear public benefits.
    • Safe Settings: settings in which access to data is only possible using our secure technology systems.
    • Safe Data: data that have been de-identified.
    • Safe Outputs: all research outputs that are checked to ensure they cannot identify data subjects.
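To illustrate what a disclosure control technique of the kind mentioned above can look like, the Python sketch below applies one simple and widely known approach: withholding small non-zero counts from a table before publication. This is a sketch only. The threshold of 10, the function name and the area counts are assumptions made for illustration, and the sketch does not describe the rules actually applied by the Authority, which may combine several techniques.

```python
def suppress_small_counts(table, threshold=10):
    """Replace non-zero counts below `threshold` with None (primary suppression only)."""
    return {
        cell: (None if 0 < count < threshold else count)
        for cell, count in table.items()
    }


# Hypothetical counts for four areas; the threshold of 10 is an assumption.
counts = {"Area A": 124, "Area B": 7, "Area C": 43, "Area D": 0}
published = suppress_small_counts(counts)

for area, value in published.items():
    print(area, "suppressed" if value is None else value)
```

In practice, a suppressed cell can sometimes be recovered from row or column totals, which is one reason outputs are checked for disclosure risk as a whole rather than cell by cell.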

Security controls

  1. The protection of data is a top priority for the Authority, and we implement and operate substantial security measures for our staff, data and services. This security focus ensures that the Authority operates and continues to develop secure options that meet its objectives for data use and maintains public trust in how we access, use, process, store, and make available data for statistics and research purposes.
  2. To ensure the confidentiality, integrity and availability of our data are protected at all times, we operate a security management framework, which continuously evaluates the threat landscape, assesses the security risks and ensures that the appropriate controls are in place, so that we are operating within corporate risk appetite, maintaining a strong security posture and complying with the relevant legislation, Codes of Practice and industry best practice. This is underpinned by a robust secure-by-design approach, comprehensive protective monitoring, internal and external assurance and the training of our staff.

What does it mean to use data ethically, in the context of statistics and analysis? 

  1. The Authority owns a set of six ethical principles relating to the use of data for research and statistics. These principles cover: public good, confidentiality & data security, methods & quality, legal compliance, public views & engagement, and transparency. The production, maintenance and review of these principles are conducted by the National Statistician’s Data Ethics Advisory Committee (NSDEC). The NSDEC was established to advise the National Statistician that the access, use and sharing of public data, for research and statistical purposes, is ethical and for the public good. The NSDEC consider project and policy proposals, which make use of innovative and novel data, from the ONS, the GSS and beyond, and advise the National Statistician on the ethical appropriateness of these. The NSDEC meet quarterly and have a key role in ensuring transparency around the access, use and sharing of data for statistical purposes.
  2. In 2021 the Centre for Applied Data Ethics (CADE) was established within the Authority. CADE provide practical support and thought leadership in the application of data ethics by the research and statistical community. The Centre provides a world leading resource that addresses the current and emerging needs of user communities, collaborating with partners in the UK and internationally to develop user-friendly, practical guidance, training and advice in the effective use of data for the public good. In addition to providing the secretariat to the NSDEC, the CADE mobilise the Authority’s six ethical principles via a self-assessment tool that is available to the entire research and statistics system. This tool supports researchers and analysts to identify ethical concerns in their work and then to engage with CADE to ensure mitigations and solutions are in place. Since January 2021, this tool has enabled nearly 900 pieces of ethical research and statistics and is growing in use by hundreds a year.
  3. Complementing the independent advice and guidance of the NSDEC and the self-assessment ethics services of CADE, the Centre also produces several bespoke ethics guidance pieces each year. These guidance pieces are typically produced in collaboration with an area of the ONS or the wider statistical system and focus on key concerns, such as identifying and promoting public good, considering public views and engagement, and specific ethical considerations in inclusive data, machine learning and geospatial data. Finally, the CADE also offer bespoke ethics support with specific projects, workstreams and teams and tailor their services to one-off events and longitudinal engagement work. This includes our international development programme where we work to support the work of various other National Statistical Institutes.
  4. The focus of the CADE’s activities is to ensure that the Authority’s ethical principles are promoted and accessible and that tools to ensure the principles are put into practice are effective and easy to use. We achieve this through promoting CADE at internal and external events, providing secretariat to the independent NSDEC, operating and providing oversight of the CADE self-assessment tool, producing specific, collaborative guidance pieces and providing bespoke ethics advice and support. By engaging with the CADE, researchers and analysts can ensure ethical practice, in line with the Authority’s ethical principles, in the production of research and statistics.

Are current processes and protections sufficient? 

  1. The Authority has well-established processes and procedures in place to ensure the protection of the data of UK citizens. As the statistical and analytical landscape around data changes, as it did during the COVID-19 pandemic, the Authority ensures that it remains up to date with any changes to privacy legislation, regulatory guidance or cross-government good practice that could impact data subjects. This ensures a robust statistical system that produces public-good statistics that are trusted by the public.
  2. The ONS security management framework incorporates and references appropriate recognised security standards and guidance from within Government (Cabinet Office, National Cyber Security Centre (NCSC), Centre for the Protection of National Infrastructure (CPNI)) and international standards and best practice from international security organisations, including ISO 27001, the American National Institute of Standards and Technology (NIST) and the Information Security Forum (ISF).
  3. From a data ethics perspective, there are dozens of organisations in the statistical system that display their ethical framework and commit to using it. CADE goes beyond this by evidencing, transparently, the impact that engaging with CADE is having on the production of research and statistics. It does so numerically, through the number of projects that CADE and the NSDEC see each year, and in more detail, through the production of case studies, publicly displayed meeting minutes and audits of projects that have been signed off. Where researchers and analysts engage with CADE and its services, ethical practice can be assured and evidenced.

Understanding and responding to evolving user needs

Who should official data and analyses serve? 

  1. Our data, statistics and analysis serve the public through our statutory duty to “promote and safeguard the production and publication of official statistics that serve the public good” (as set out in the 2007 Act). Everyone is a user or potential user of our statistics and can use data to inform their decision making: from policy makers to enquiring citizens, including local businesses, charities and community groups.
  2. Within the ONS, we have established an Engagement Hub to enable us to coordinate our engagement with users, understand user needs, reach new audiences and evaluate our engagement.
  3. Users are at the heart of everything we do. When identifying priorities for analysis, we do so through:
    1. discussions with other government departments and the devolved administrations.
    2. local engagement: our new ONS Local service works with analytical communities locally and with the wider civic society to support further analysis to target local questions to address local issues.
    3. citizen focus groups with members of the public and the ONS Assembly with charities and bodies representing the interests of underrepresented groups of the population.
    4. drawing on external advice, for example the National Statistician’s Expert User Advisory Committee (who advise on cross-cutting issues).
    5. regular engagement activities with businesses and third sector organisations.
  4. A good example of the ONS reflecting the needs of users is the COVID-19 Latest Insights Tool, developed so that members of the public could find reliable, easy to understand information about the COVID-19 pandemic in one place. We engaged in user testing at key stages to make sure it met user need, and it became the most widely read product in the history of the ONS website.
  5. We also undertake annual stakeholder deep-dive research, which explores stakeholder needs, and a stakeholder satisfaction survey, which supports the evaluation of progress against our strategic objectives.
  6. According to the Public Confidence in Official Statistics (PCOS) 2021 report, a very high proportion of respondents trusted the ONS (89% of those able to express a view) and its statistics (87%). This was very encouraging to see. The survey also asked respondents about their level of trust in the ONS compared to other institutions in British public life. Of the institutions listed on the survey, the ONS had the highest levels of trust, similar to that of the Bank of England and the courts system.
  7. In terms of analysis, the Analysis Function strategy explains how we bring analysts across government together to deliver better outcomes for the public by providing the best analysis to inform decision making. The Function serves policy makers across Government and has regular conversations with stakeholders to ensure that our data, statistics and analyses are relevant to public policy priorities and delivered in a timely way.
  8. Our Analytical Hub aims to provide the capacity and capability to deliver radical cross-cutting analysis that supports Government, civil society, and the public to understand the key questions of the day, responding flexibly and in a timely fashion to the ongoing economic and public policy priorities.

How are demands for data changing?   

  1. Changes in society, technology, and legislation mean that more data are available, in richer and more complex forms, than ever before, with the COVID-19 pandemic shifting the expectations of users towards receiving more insights more rapidly. The pandemic and its impact on our society and the economy have led to more complex questions, which means that the need for data is accompanied by a need for growth in expertise and support to use data in ways that reflect the intersectional nature of policy enquiry. Our statistics need to be quick, relevant, trusted and reliable to withstand public prominence and scrutiny, respond to a rapidly changing environment, and inform critical policy.
  2. The ONS aims to respond to the needs of the public, decision makers and society, including by providing data and insight on the topics and priorities of the day. We have already evolved our approach to respond to increasing demand for data, statistics, and analysis to be:
    • More timely, through more rapid surveys, such as the Opinions and Lifestyle Survey, and the use of new data sources, like financial transaction and mobility data.
    • More local, with the production of more granular and hyper-local data, such as Gross Value Added (GVA) at lower super output area level, which allow users to build up bespoke geographies that matter to them, and greater support for local users and decision makers through our ONS Local service.
    • More inclusive, through making our data more accessible and reflective of our users, allowing people to see themselves in our statistics and analysis, such as through our shopping prices comparison tool; and
    • More relevant, both in terms of topics, for example looking beyond Gross Domestic Product (GDP) to consider multi-dimensional wellbeing alongside improved measures of economic performance, and in terms of how we disseminate our data and statistics, for example through application programming interfaces (APIs), to empower users to do their own analysis.
  3. We will continue to build on this progress as demands change, for example through the increasing availability and evolving possibilities of artificial intelligence (AI). We responded to changing demands particularly well during the pandemic, and have since focused on new policy priorities, such as the rising cost of living, the changing nature of the labour market and the experiences of Ukrainian nationals arriving in the UK having been displaced by the conflict.
  4. As well as responding to emerging issues, we are making it easier for the public to find and consume insights on topics of interest by pulling together our many different data sets in the form of dashboards and data explorers. These include the Health Index, which brings together different datasets at local levels, subnational data explorers considering the economy, society, environment and more across local areas, the new UK measures of wellbeing providing 60 indicators across 10 domains, and the latest data & insights on the cost-of-living. In addition, data collected through the 2021 Census is being made available through our flexible table builder, articles of interest and interactive maps.
  5. Demands for data are also changing among expert users, with the rise of big data and the need for more data linkage across Government. This is why we are investing in the IDS to provide a secure environment for trusted researchers to analyse new, granular and timely data sources for the public good.

How do users of official statistics and analysis wish to access data?  

  1. We have developed a deep understanding of our diverse user audiences and their unique needs and requirements. We have grouped website users into five persona groups:
    • Technical User: Someone who only wants data and will create their own datasets and customise their own geography boundaries. Data from the ONS are frequently used in conjunction with data from other government departments. They may be expert at what they do with statistics but can be less expert at looking for base data. There is not the urgency we see from the expert analyst. They do not tend to use written publications.
    • Expert Analyst: Someone who creates their own analysis from data. This user downloads spreadsheets into their own statistical models to create personal datasets. Access to the data for analysis is more important to them than its presentation.
    • Policy Influencer: Someone who uses data for benchmarking and comparison. For some policy influencers, this requires data and analysis at a regional or local level. They rely on official government statistics, trusted by decision makers, for their reports.
    • Information Forager: Someone who wants local data and keeps up to date with the latest economic and population trends to help them make practical, strategic business decisions. They often do not know exactly what to search for, until they come to it.
    • Inquiring Citizen: Infrequent visitors to our site who search for unbiased facts about topical issues. They want simply worded, visually engaging summaries, charts and infographics. Data can help make informal decisions about pensions and investments. They engage on social media and browse with smartphones or tablets.
  2. We have found that citizen type users want ways to get data on their local area or to fact check data by using interactive tools, summaries, dashboards, visualisations and maps, whereas more data literate users are interested in the data itself and the associated metadata and methodology. Technically advanced analysts are also interested in being able to access data via APIs and for data to be easily used in tools such as Python and R. These technical users prefer not to have heavily formatted Excel spreadsheets with multiple tabs.
  3. We know many of our local users are keen to understand a place across many topics rather than go to several publications with multiple datasets for one datum. As such, we are developing our Explore Subnational Statistics service, which will allow users to select a geography and see metrics across a range of themes. ONS Local also helps local government users to bring together evidence across their area, alongside local intelligence, data and analysis, to create greater insights.
  4. Our search engine optimisation strategy recognises that not all users need to or want to come to our website and that Google and the major search engines often represent our data directly in their search results. This is particularly applicable to those with accessibility needs, since many of these ways of representing our data can be returned via voice search on a variety of platforms.
  5. Citizen users want data communicated in a way that is easy for them to digest and research has shown that there is a degree of education required about our key topics such as inflation and GDP. Users may also be interested in their local areas as much as national level data.
  6. Within the ONS Engagement Hub, there are dedicated teams focussing on building relationships with different audience segments. The External Affairs team supports stakeholder engagement with key government stakeholders, business and industry groups, consumer bodies and think-tanks. The dedicated Outreach and Engagement team is focused on engaging with local authorities, and building sustainable relationships with community and faith groups, voluntary sector organisations and others representing the interests of those audiences traditionally less well represented in official statistics and government data.

How can we ensure that official data and analyses have impact? 

  1. Ensuring what we are doing focuses on the topics that matter most to people and ensuring that it is disseminated in a way that is easy to understand, engaging and relevant to the audience, is key to achieving impact.
  2. For example, we worked with colleagues across government to establish new data collection on Ukrainian refugees and visa sponsors in the UK. This allowed us to provide invaluable insights on the experience of Ukrainians coming to the UK, and the impact on service provision in local areas. Publishing this in English and Ukrainian both supported policy decisions around the humanitarian response and gave those affected the ability to read the findings in their own language.
  3. On cost of living, we delivered a broad information, engagement and communications programme that included promoting cost of living insights and data products to a wide range of users; diversifying engagement with non-expert users (for example 99 civil society and community groups attended a session on how they could benefit from our insights); and seeking user feedback to further improve our cost of living products and analysis. Impact from this includes a continued increase in the use of our insights tool with the personal inflation calculator being embedded into Guardian and BBC websites and the shopping prices comparison tool reaching over 700,000 uses in its first week.
  4. In June we launched a public consultation on the future of population and migration statistics in England and Wales. We engaged extensively with stakeholders before the consultation launch through sector specific round tables. The consultation launch itself was widely promoted across stakeholders in all sectors and around 500 people attended launch events and webinars. Engagement will continue throughout the consultation period to maximise awareness, understanding and response.
  5. We regularly review the impact of our work, providing quarterly impact reports to the Strategic Outputs Committee (formerly the Analysis and Evaluation Committee), along with deep dives on priority topics.
  6. The ONS has access to a number of metrics that can be used to assess the impact of our outputs, specifically reach and awareness to understand the importance and relevance of the insight and content. We also test engagement levels to understand how well our content performs to achieve cut-through and add value to a debate. In addition, we use our own surveys to test appetite for future topics and outputs.
    1. Reach / awareness – to test importance and relevance of topic and insights.
      • Web page views: unique sessions in which a page was viewed at least once, within seven days.
      • Social media impressions: number of views per post on social media platforms, within 48 hours.
      • Print, digital, broadcast: number of views / listens of ONS insight from outputs.
    2. Engagement – to test content cut-through, clarity of messages and the value added to a debate or discussion.
      • Time spent on web page: time users spent viewing a specified page or screen, taken after seven days.
      • Social media engagement: shares, favourites, replies and comments, URL click throughs, hashtag clicks, mention clicks and media views, taken after 48 hours.
      • Print, digital, broadcast: cut through of ONS comments / main points within coverage.
      • Online pop-up survey for targeted releases to test user satisfaction and help set continued improvements.
      • An annual stakeholder survey and in-depth interviews, targeted at government departments, charities, public institutions and businesses, to test satisfaction with and use of statistics and analysis, and future needs.
  7. We bring these insights together, alongside granular stakeholder engagement, to understand the impact of our work at both topic level and the level of individual outputs. This supports ongoing decision making around both what we focus on and how we can best maximise the impact of our work.
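The web page views measure described above (unique sessions in which a page was viewed at least once, within seven days) can be sketched in a few lines of Python. This is an illustrative reading of that definition only: the log format, session identifiers, page paths and dates are hypothetical, and the sketch assumes the seven-day window runs from the time of publication, which the definition above does not specify.

```python
from datetime import datetime, timedelta


def unique_sessions_within(views, page, published, window=timedelta(days=7)):
    """Count distinct sessions that viewed `page` at least once within `window` of publication."""
    cutoff = published + window
    sessions = {
        session_id
        for session_id, viewed_page, viewed_at in views
        if viewed_page == page and published <= viewed_at < cutoff
    }
    return len(sessions)


# Hypothetical view log: (session id, page path, time of view).
published = datetime(2023, 6, 1, 7, 0)
views = [
    ("s1", "/cost-of-living", datetime(2023, 6, 1, 8, 0)),
    ("s1", "/cost-of-living", datetime(2023, 6, 1, 8, 5)),  # repeat view, same session
    ("s2", "/cost-of-living", datetime(2023, 6, 3, 12, 0)),
    ("s3", "/cost-of-living", datetime(2023, 6, 9, 9, 0)),  # outside the seven-day window
    ("s4", "/census-maps", datetime(2023, 6, 2, 10, 0)),  # different page
]

count = unique_sessions_within(views, "/cost-of-living", published)
print(count)
```

Counting sessions rather than raw views means that a single visitor refreshing a page does not inflate the reach figure.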

User engagement

  1. User engagement is key to making an impact. Our User Engagement Strategy for Statistics promotes a theme-based approach to user engagement. This allows all users of government data and statistics to interact with the GSS by their area of interest or by cross-cutting theme. This approach also aims to support collaboration with producers of official statistics to develop work programmes, address data gaps and help improve GSS products and services.
  2. We have created the ONS Assembly to support regular dialogue and feedback on delivering inclusive data with charities and bodies representing the interests of underrepresented groups of the population. The Assembly aims to be:
    • A forum for the ONS to engage and have an open dialogue with charities and member bodies on a range of key topics.
    • A space to build trusted, long-lasting relationships between members and the ONS.
    • An opportunity for members to share insight, advice and feedback on behalf of their interests and audiences.
    • A space to exchange news and move collaboratively toward the future of data.
    • A route to help ensure vital themes, such as inclusivity, accessibility, wellbeing etc. are fully explored.
  3. Alongside working with users in local government, wider civic society, and the public, we build and maintain strong relationships with key policy makers in central government. These relationships with both local and central policy makers allow the ONS to understand the challenges they face. We can then help build their understanding of our statistics and analysis, and the wider evidence base, enabling greater insight into the topics that matter to our users and maximising the ONS’s impact on decisions that affect the whole country.

Communication

  1. The way we communicate our statistics has much improved in recent years, having a direct influence on our impact. Statisticians speaking directly to the public via television and radio helps the transparent communication of statistics, assisted by our amendment to publication times, which ensures parity of communication (from 26 March 2020, with the agreement of OSR, we moved release times for market-sensitive releases from 9:30 to 7:00).
  2. This includes 226 broadcast media interviews undertaken by ONS spokespeople during the 2022/23 financial year, generating an average of 2.5k pieces of quoted coverage in the broadcast/online media each month, as well as our solid presence on Twitter (354.2k followers), which achieves good comparable engagement and reach, with threads created to support outputs and to respond to specific trends on social media.
  3. The ONS’s ‘Statistically Speaking’ podcast takes a deep dive into hot data topics and explains what’s behind the numbers. Between April 2022 and March 2023, the podcast had more than 18k downloads, including our most popular episode, ‘The R Word: Decoding ‘recession’ and looking beyond GDP’, which achieved 1,891 downloads in its first 30 days. In total since the podcast started in January 2022, it has achieved almost 30k unique downloads.

Dissemination

  1. Our approach to dissemination plays a pivotal role in maximising the insight and impact our data has. A prime example is our award-winning Census dissemination portfolio, with our Census maps offering users the ability to explore spatial patterns down to the neighbourhood level, empowering planners and policymakers to target interventions precisely, and helping any user to better understand their communities. Since then, we have developed interactive and highly localised content to encourage audiences to engage with the more granular data, producing data visualisation tools and innovative content so citizens can explore the data that is important to them. Users responded positively, saying the tools were “visually excellent”, “personalisable, visual and really well presented”.
  2. To promote widespread reuse of our insights and thus amplify their reach and impact, we designed tools to encourage users to embed custom views in their websites and publications. The results have been remarkable, with Census maps accounting for an impressive 24% of total views on the ONS site and garnering around 30 million views since their launch in November 2022.
  3. We also released our custom area profiles product. We recognised that user needs often extend beyond predefined geographic areas. Users can now draw specific areas of interest and generate tailored profiles with indicators and comparisons that match their unique use cases. The outputs are also exportable for use in websites and presentations, and the product has already reached tens of thousands of users, bridging gaps in specialised expertise.
  4. To cater to time-constrained users and unlock the potential of data hidden in spreadsheets, we introduced semi-automated localised reporting. With algorithms generating approximately 350 customised reports, one for each local authority, key trends in respective areas are efficiently explained, making insight more accessible and impactful. These reports are extensively accessed and widely referenced by local authorities and other local users.
  5. Additionally, we enhanced the reach of these reports by making their content crawlable by search engines. Snippets from these reports now appear directly in search results and voice-based queries, further bolstering the impact of our data and analyses.
  6. In tandem with our work on the Census, we have goals to transform the presentation of ONS’s day-to-day insights, with a particular focus on enhancing offerings for the general public. We have a digital content team comprised of data visualisation specialists, data journalists, and designers focussed on collaborating with analytical teams to achieve this.
  7. This approach centres on addressing the most pertinent questions for our users, often focussed on creating more personalised and localised experiences. By empowering users to see themselves within our data, we establish a meaningful connection with our audience.
  8. Through these collaborations with analytical teams, we are reaching a much-expanded user base, with audiences engaging with our content 40% longer than with typical offerings. Our commitment to delivering impactful, accessible, data-driven insights ensures our offerings resonate with diverse audiences and leave a lasting impression.
  9. We gather a range of feedback on our digital products to develop and improve the usability of our content, including our interactive online content and tools. We also analyse how different user groups access our content and what they need, from users of data, and users of statistics and trends, to those who want a deeper understanding of topics.
  10. There were 6.4m users of the ONS website in 2022/23. Most users (4.3m) use a mobile device to access our website, with 1.9m on desktop and 0.2m on tablets. Engagement levels remain highest with desktop users. There were nearly 26m pageviews on ONS.gov.uk over 2022/23. These figures represent only users who accept cookies, whom we estimate to be approximately 30% of all users.
  11. Peak demand on the website was driven by census releases, with 8x higher daily page views on 29 November for the ethnic group, national identity, language and religion census data than average, and nearly 6x higher daily page views for the demography and migration releases for census on 2 November. The census first release on 28 June saw roughly double the daily average for the year.
  12. The most popular topics on ONS.gov.uk across the year were census (3.6m pageviews), covid (3.6m pageviews) and inflation (3.2m pageviews).

Analysis

  1. Ensuring that analysis is good quality will also help ensure that it has impact. The Analysis Function Standard sets expectations for the planning and undertaking of analysis to support well-informed decision making. It provides clear direction and guidance for all users and producers of government analysis.
  2. The Analysis Function also shares best practice through the Analysis in Government awards, which include an impact award.
  3. Maximising the impact of analysis across government and for the ONS lies in understanding the priorities of the day, both for citizens and for decision makers at the heart of local and central government, and flexing at pace as new priorities emerge. This often means the evidence base may be less robust or that data do not exist, but ONS’s Analytical Hub is constantly adapting to produce the best analysis at pace to support decision making. The ONS also scans the horizon to anticipate emerging issues.

How do we ensure that users, in the Civil Service, Parliament and beyond, have the skills they need to make effective use of data? 

  1. There are a range of initiatives aimed at improving the analytical skills of civil servants and beyond.

The Analysis Function

  1. The Analysis Function is a network for all civil servants working in government analysis and aims to help improve the capability of all analysts across government. The Analysis Function website hosts the dedicated analysis function curriculum webpages alongside a range of technical analytical learning for all, as well as a guidance hub providing access to key analysis guidance. The Function also hosts regular information sharing events and webinars.
  2. The Function works with the policy profession and other teams across government to ensure we are building a level of analytical capability specifically for non-analysts. The Analysis Function has also developed a learning pathway specifically for non-analysts in line with wider government reform priorities.
  3. The Analysis Function conducted a review in 2022 of the analytical capability of policy officials. Since then, we have been working closely with the policy profession unit through a dedicated implementation working group to address the recommendations from the review. Progress has been made against several actions, including the launch of the analytical literacy course, the data masterclass and the policy to delivery pilot.

The Methodology Advisory Service

  1. The Methodology Advisory Service (MAS), based within the ONS, offers advice, guidance and support to the public sector, nationally and internationally, using teams of experts covering different areas of statistical and survey methodology. We offer an advisory service for:
    • methodological advice on production and analysis of data
    • development of surveys or outputs
    • feasibility studies
    • methodological research to answer complex problems
    • quality assurance of methods or outputs
    • cross-cutting reviews of processes and methods across a department’s statistical work
    • evaluation of competing sources
    • health checks before an OSR assessment
  2. The ONS’s methodologists and researchers receive their own methodological advice from the Methodological Assurance Review Panel (MARP). The panel provides external, independent assurance and guidance on the statistical methodology underpinning ONS statistical production and research.

The Data Science Campus

  1. The Data Science Campus is at the heart of leading-edge data science capacity building with public sector bodies in the UK and abroad. We equip analysts with the latest tools and techniques, giving them the capability to perform effectively in their roles. We also work in partnership with organisations to ensure they have the capacity to develop their own data science skills in the long-term.
  2. Our evolving range of programmes reflects our focus on using data to drive innovation for public good, and provides analysts across the ONS, the UK public sector and international partners with a developmental framework to build capacity and enhance analytical capability:
    • Data Science Accelerator
    • Data Science Graduate Programme
    • Degree Data Science Apprenticeship
    • Masters in Data Analytics for Government
    • Cross-government and Public Sector Data Science Community

ONS Local

  1. The ONS Local service provides peer-to-peer forums and platforms for local, regional, and national analytical communities to share best practice. It also helps local users navigate the extensive subnational offer from the ONS, both what is already available and what is in development, as well as wider UK government data. For example, “ONS Local Presents” webinars allow ONS teams and analysts from local or central government to present analysis on a topic to a wide audience for feedback and challenge, or to showcase innovation in techniques or data that may be useful to others. We have also held the first in a series of “ONS Local How to” workshops, aimed at a similar audience and run jointly with the Data Science Campus, to support local government analysts to create dashboards and use APIs.

ONS Outreach and Engagement

  1. Finally, the ONS Outreach and Engagement Team are piloting and developing a programme of online engagement activities to help improve data literacy among underrepresented groups, non-expert users and those less likely to engage with data. The sessions vary in topic across the range of statistical production and collection themes at ONS and include a range of engagement formats. Topics and activities so far have included an introduction to the ONS and census webinars, Q&As on how to use census data, and show-and-tells, demonstrations or learn-ins on data tools such as the Cost of Living Insights Tool, Census Maps Tool and Build a Custom Data Set Tool.
  2. These sessions can be tailored to the audience, including civil service colleagues who may be less confident or engaged with data, and aim to improve awareness and understanding of the foundations of data use and production.

Professor Sir Ian Diamond, National Statistician

Office for National Statistics

August 2023

Office for Statistics Regulation response

Introduction

About us

  1. The Office for Statistics Regulation (OSR) is the independent regulatory arm of the UK Statistics Authority. In line with the Statistics and Registration Service Act (2007), our principal roles are to:
    • set the statutory Code of Practice for Statistics (the Code).
    • assess compliance with the Code to ensure statistics serve the public, in line with the pillars of Trustworthiness, Quality and Value. We do this through our regulatory work that includes assessments, systemic reviews, compliance checks and casework.
    • award the National Statistics designation to official statistics that comply fully with the Code.
    • report any concerns on the quality, good practice and comprehensiveness of official statistics.
  2. While our formal remit covers official statistics, we also encourage organisations to voluntarily apply the Code to demonstrate their commitment to trustworthy, high quality and valuable statistics. Our 5-year plan sets out our vision and priorities for 2020-2025 and how we will contribute to fostering the Authority’s ambitions for the statistics system. Our annual business plan shares our focus for the current year.

Data and analysis in Government

How successfully do Government Departments share data? 

  1. For the last five years, OSR has been monitoring and commenting on data sharing and linkage across government, producing reports to understand issues and identify opportunities to move the wider system forward. We are an advocate and a champion for data sharing and linkage, when this is done in a secure way that maintains public trust. It is our ambition that sharing and linking datasets, and using them for research and evaluation, will become the norm across the UK statistical system.
  2. Our latest data sharing and linkage report takes stock of data sharing and linkage across government. There has been some excellent progress in creating linked datasets and making them available for research, analysis and statistics.
    • The Office for National Statistics (ONS) recently published statistics on sociodemographic inequalities in suicides, which linked demographic and socioeconomic data about individuals from the 2011 Census with death registration data and, for the first time, showed estimates of suicide rates across a wide range of demographic groups. The ONS believes this analysis will support the development of more effective suicide prevention strategies.
    • Data First aims to unlock the potential of Ministry of Justice (MoJ) data by linking administrative datasets from across the justice system and enabling accredited researchers, from within government and academia, to access the data. Data First is also enhancing the linking of justice data with data from other government departments, such as the Department for Education (DfE), where linking data has unlocked a wealth of information for researchers about young people who interact with the criminal justice system.
    • BOLD, led by the MoJ, is a three-year cross-government data-linking programme which aims to improve the connectedness of government data in England and Wales. It was created to demonstrate how people with complex needs can be better supported by linking and improving the government data held on them in a safe and secure way.
  3. Our report highlights an emerging theme on the overall willingness to share and link data across government and public bodies. The benefits and value of doing so are widely recognised, with the COVID-19 pandemic helping to change mindsets and highlight opportunities that exist for greater collaboration and sharing.
  4. However, through speaking with stakeholders across the data sharing and linkage landscape during our review, we also found there is still uncertainty about how to share and link data in a legal and ethical way, and about public perception of data sharing and linkage. There is also a lack of clarity about data access processes and data availability and standards across government. Together, these factors can lead to nervousness about sharing and linking data, which can cause blockages or delays.
  5. The picture is not the same in every area of government. Some areas have moved faster than others and we have found that culture and people are key determinants of progress.
  6. In the report, we summarise and discuss our findings within four themes in the context of both barriers and opportunities:
    1. Public engagement and social licence: The importance of obtaining a social licence for data sharing and linkage and how public engagement can help build understanding of whether/how much social licence exists and how it could be strengthened. We also explore the role data security plays here.
    2. People: The risk appetite and leadership of key decision makers, and the skills and availability of staff.
    3. Processes: The non-technical processes that govern how data sharing and linkage happens across government.
    4. Technical: The technical specifics of datasets, as well as the infrastructure to support data sharing and linkage.
  7. Overall, data sharing and linkage in government stands at a crossroads. Great work has been done and there is the potential to build on this. However, there is also the possibility that, should current barriers not be resolved, progress will be lost.
  8. Our review makes 16 recommendations that, if realised, will enable government to confront ingrained challenges, and ultimately to move towards greater data sharing and linkage for the public good. Following the report, OSR will be following up with those organisations mentioned in our recommendations to monitor how they are being taken forward.

The changing data landscape

Is the age of the survey, and the decennial Census, over?

  1. Statistics producers are increasingly turning to alternative data sources in the production of official statistics, in light of challenges with survey data collection and increased recognition of the potential of alternative data sources. Administrative data (that is, data that are primarily collected for administrative or operational purposes) are increasingly used to produce official statistics across a range of topics including health, such as waiting times data; crime, such as police recorded crime data; and international migration, such as borders and immigration data. Challenges faced during the COVID-19 pandemic highlighted society’s need for timely statistics and further demonstrated the potential of administrative data.
  2. However, such methods are unlikely to be able to capture all aspects of our population and society and therefore surveys are likely to play an ongoing but changing role in the statistical system. For instance, many crimes are not reported to the police, and data quality for some crime types is poor, so users cannot rely exclusively on administrative datasets of police recorded crime. To get a full picture of crime, both police recorded crime and the Crime Survey for England and Wales will always need to be used alongside each other.
  3. Moreover, there is strong interest in opinion and perception data such as the successful ONS Business Insights and Confidence Survey. Our Visibility, Vulnerability and Voice report on statistics on children and young people also demonstrated the strong user demand and importance of data that include children’s voice about their experiences and see the child holistically. These insights would not be available through administrative sources.

What new sources of data are available to government statisticians and analysts? 

  1. We highlight in our State of the Statistical System 2022/23 report that the increasing availability of new data sources such as administrative data, management information and growing use of artificial intelligence should be seen as an opportunity for the statistical system.
  2. Administrative data are helping to provide new insights and improve the quality of statistics. For example, the Department for Work and Pensions (DWP) is exploring the integration of administrative data into the Family Resources Survey (FRS) and related outputs through its FRS Administrative Data Transformation Project.
  3. The ONS has developed experimental measures of inflation using new data sources, including scanner and web-scraped data, and has published experimental analysis using web-scraped data to look at the lowest-cost grocery items. Its Consumer Prices Development Plan details the new sources of data that can be used and the insights they can bring.
  4. Technology can also provide opportunities to collect data in different ways, such as DfE pupil attendance data that is automatically submitted from participating schools’ management systems and allows for more timely analysis of attendance in schools in England. This data collection won the RSS Campion Award for Excellence in Official Statistics.

What are the strengths and weaknesses of new sources of data? 

  1. In the wider context of technological advances, statistics need to remain relevant, accurate and reliable, and new data sources support this ambition. However, with the use of these new and innovative data sources in the production of official statistics, producers need to manage risks around quality. Moreover, with more use of data science and statistical models in the production of official statistics it is crucial that producers ensure that any development of models is explainable and interpretable to meet the transparency requirements of the Code.
  2. To maximise the opportunities from new data sources, the role of the statistician has to evolve and keep pace with the increasing use of data science techniques. Our latest State of the Statistical System report highlights the difficulties producers have getting people with the right skills in post, although these challenges are not being felt consistently across the whole UK statistical system. There is a concerning risk that continued financial and resource pressures will hinder future progress and evolution of the system to keep pace with increasing demand. A successful statistical system that is able to utilise new data sources depends on having a workforce that is sufficiently resourced and skilled to deliver.
  3. New data sources often provide insights in a timelier manner (in some instances this can be near real time, such as England’s school attendance data) and provide better coverage (such as the web-scraped and supermarket prices data, which often include all transactions or prices). On the other hand, there is a risk that the data may not measure what people want to measure, and there is no option to amend or edit the data or the questions being asked. Producers also have little control over the coherence and comparability of the data; there may be differences in how organisations record their data as well as between datasets on a similar topic. Data could also be missing for some observations and variables, and the data could be biased by only covering certain groups of people or transactions.

Protecting privacy and acting ethically 

What does it mean to use data ethically, in the context of statistics and analysis? 

  1. As the regulator of official statistics in the UK, it is our role to uphold public confidence in statistics. In our view, an oft-neglected question of data ethics concerns not so much how data are collected and processed, but how the resulting statistics are used in public debate. As a result, we consider the question of whether a particular use is misleading as intrinsically ethical.
  2. One of the areas we continue to develop our thinking on is the topic of misleadingness, publishing a think piece on misleadingness in May 2020 and following up on our initial thinking in May 2021. The latter focuses on feedback to the first think piece that it is important to distinguish between the production of statistics and the use of statistics, as well as identifying areas not covered in the original think piece, like the risk of incomplete evidence. Based on our findings, our thinking has evolved to be clearer on the circumstances in which it is relevant to consider misleadingness: “We are concerned when, on a question of significant public interest, the way statistics are used is likely to leave audiences believing something which the relevant statistical evidence would not support.”
  3. We are launching a review of the Code of Practice for Statistics in September. As part of it, we will be asking the question “what are the key ethical issues in the age of AI: how do we balance serving public good with the potential for individualised harms?”. The review will run until December, and we will be highlighting how people can engage and contribute, including a planned panel session on this topic.

Understanding and responding to evolving user needs

Who should official data and analyses serve? How do users of official statistics and analysis wish to access data? 

  1. OSR’s vision, based on our founding legislation, is that statistics serve the public good. In 2022 we worked in partnership with ADR UK to explore what the term ‘public good’ means to the public. We found that research and statistics should aim to address real-world needs, including those that may impact future generations and those that only impact a small number of people. There was also clear evidence that members of the public want to be involved in making decisions about whether public good is being served, through meaningful public engagement and full, transparent and easy access to the decision-making process of Data Access Committees (which evaluate applications from trained and accredited researchers for the use of de-identified data for research).
  2. In 2021, we published a report looking at Defining the Public Good in Applications to Access Public Data. The report highlights how researchers see their research as serving the public good or providing public benefits, and how this differed between organisations. For example, the most frequently mentioned public benefits in National Statistician’s Data Ethics Advisory Committee (NSDEC) applications were to improve statistics and service delivery, whereas Research Accreditation Panel (RAP) applications mentioned policy decisions and societal benefit more.

How are demands for data changing?   

  1. There continues to be a significant shift in government and public demand for statistics and data from COVID-19 to other key issues. The statistical system has demonstrated its responsiveness to meet these data needs. However, as mentioned at paragraph 17, pressure on resources and finances poses a significant threat to the ability of government analysts to produce the insight government and the wider population needs to make well-informed decisions.
  2. Working in an efficient way will help address one part of this problem: it will help ensure maximum value is achieved with the resources that are available, which will in turn help others across government appreciate the benefit of having analysts at the table. Our blog on smart statistics: what can the Code tell us about working efficiently highlights ways to support efficiency based on the Code.
  3. Users of statistics and data should always be at the centre of statistical production; their needs should be understood, their views sought and acted on, and their use of statistics supported. We encourage producers of statistics to have conversations with a wide range of users to identify where statistics can be ceased, or reduced in frequency or detail, to save resources if appropriate. This can free up resource, while helping producers to fulfil their commitment to producing statistics of public value that meet user needs. Ofsted has recently done this to great effect.
  4. The UK statistical system should maintain the brilliant responsive and proactive approach taken during the COVID-19 pandemic and look to do this in a sustainable way. Improvements to data infrastructure, processes, and systems could all help. For example, the use of technology and data science principles, such as those set out in our 2021 Reproducible Analytical Pipelines (RAP) review, supports the more efficient and sustainable delivery of statistics. This review includes several case studies of producers using RAP principles to reduce manual effort and save time, alongside other benefits. The recent Analysis Function RAP strategy sets out the ambition to embed RAP across government, and the Analysis Function can offer RAP support through its online pages, its Analysis Standards and Pipelines Team and the cross-government RAP champion network.
  5. Statistics and data should be published in forms that enable their reuse, and opportunities for data sharing, data linkage, cross-analysis of sources, and the reuse of data should be acted on. The visualisations and insights generated by individuals, from outside the statistical system, using easily downloadable data from the COVID-19 dashboard nicely demonstrate the benefits of making data available for others to do their own analysis, which can add value without additional resource from producers.
  6. Promoting data sharing and linkage, in a secure way, is one of OSR’s priorities and we are currently engaging with key stakeholders involved in data to gather examples of good practice, and to better understand the current barriers to sharing and linking. This will be used to champion successes, support positive change, and provide opportunities for learning to be shared.
  7. Overall success requires:
    • independent decision making and leadership, in particular Chief Statisticians and Heads of Profession for Statistics having authority to uphold and advocate the standards of the Code.
    • professional capability, again demonstrating the benefit of investing in training and skills, even when resources are scarce.

How can we ensure that official data and analyses have impact? 

  1. To have impact, official data and analysis need to serve the public good (by being quality, trusted and valued) and be well communicated.
  2. This is reflected in the three pillars of our Code: Quality sits between Trustworthiness, representing the confidence users can have in the people and organisations that produce data and statistics, and Value, ensuring that statistics support society’s needs for information. All three pillars are essential for achieving statistics that serve the public good. They each provide a particular lens on key areas of statistical practice that complement each other and help to ensure the data are being used as intended.
  3. Quality is not independent of Trustworthiness and Value. A producer cannot deliver high quality statistics without well-built and functioning systems and skilled staff. It cannot produce statistics that are fit for their intended uses without first understanding the uses and the needs of users. This interface between quality, its institutional context and statistical purpose is also reflected in quality assurance frameworks (QAF), including the European Statistical System’s QAF and the International Monetary Fund’s DQAF. The Code is consistent with these frameworks and with the UN Fundamental Principles of Official Statistics.
  4. We use assessments and compliance checks to judge compliance with the Code for individual sets of statistics or small groups of related statistics and data (for example, covering the same topics across the UK). Whether we use an assessment or compliance check will often be determined by balancing the value of investigating a specific issue (through a compliance check) versus the need to cover the full scope of the Code (through an assessment).
  5. There is no ‘typical’ assessment or compliance check – each project is scoped and designed to reflect its needs. An assessment will always be used when it concerns a new National Statistics designation and will also be used to undertake in-depth reviews of the highest profile, highest value statistics, especially where potentially critical issues have been identified.
  6. We have some useful guidance that can assist producers in their quality management. We published a guide to thinking about quality when producing statistics following our in-depth review of quality management in HMRC, and released a blog to accompany our uncertainty report. It highlights some important resources, top among them the Data Quality Hub guidance on presenting uncertainty. Our quality assurance of administrative data (QAAD) framework is a useful tool to reassure users about the quality of the data sources.
  7. To support statistics leaders in developing a strategic approach to applying the Code pillars and a quality culture, we have developed a maturity model, ‘Improving Practice’. It provides a business tool to evaluate the statistical organisation against the three Code pillars and helps producers identify the current level of practice achievement and their desired level, and to formulate an action plan to address the priority areas for improvement for the year ahead.
  8. We are also continuing to promote a Code culture that supports producers opening themselves to check and challenge as they embed Trustworthiness, Quality and Value, because in combination, the three pillars provide the most effective means to deliver relevant and robust statistics that the public can use with confidence when trying to shine a light on important issues in society.
  9. In our report on presenting uncertainty in the statistical system we found that presenting uncertainty in a meaningful, succinct way that delivers the key messages can be challenging for producers. We found that typically, uncertainty is better depicted and described in statistical bulletins and methodological documents than it is in data tables, data dashboards and downloadable datasets.
  10. We also found that there is a wide and increasing range of guidance and advice to help producers think about how to best present uncertainty. OSR will do more to promote and support good practice and consider what this means for our regulatory work. We will focus on the judgements that we make and the guidance we produce to help producers to improve the presentation of uncertainty.
  11. In our report, we concluded that showing uncertainty in estimates, for example through data visualisation, is essential in improving the interpretation of statistics and in bringing clarity to users about what the statistics can and cannot be used for. At the same time, however, we recognise that this is not always a straightforward task. With support from us and those at the centre of the Government Statistical Service (GSS), we encourage Heads of Profession for Statistics to review whether uncertainty is being assessed appropriately in their data sources, and to review how this is presented in all statistical outputs.
  12. We will continue to review the communication of uncertainty in our regulatory projects. We already have a good range of experience and effective guidance to help review uncertainty presented in statistical bulletins and methodology documents.

How do we ensure that users in the Civil Service, Parliament and beyond, have the skills they need to make effective use of data? 

Intelligent transparency

  1. Intelligent transparency is fundamental in supporting public trust in statistics. Our campaign and guidance aim to ensure an open and accessible approach to communicating numbers.
  2. In our blog What is intelligent transparency and how you can help?, we highlight our expectation that at its heart intelligent transparency is about proactively taking an open, clear and accessible approach to the release and use of data, statistics and wider analysis. We also recognise that whilst we will continue to champion intelligent transparency and equal access to data, statistics and wider analysis, it isn’t something we can do on our own. Our expectations for transparency apply regardless of how data are categorised. For many who see numbers used by governments, the distinction between official statistics and other data, such as management information or research, may seem artificial. Therefore, any data which are quoted publicly or where there is significant public interest should be released and communicated in a transparent way.
  3. We need users of data to continue to question where data comes from and if it is being used appropriately. We also need those based in a department or a public body to champion intelligent transparency in their team, their department and their individual work; to build networks to promote our intelligent transparency guidance among colleagues and senior leaders; and to engage with users to understand what information they need for their work, so as to inform the case for publishing it.
  4. Parliamentarians also have a role to play in ensuring intelligent transparency in debate. This includes advocating for best practice around the use of statistics and calling out misuse of statistics where it occurs. Following the principles of intelligent transparency allows the topic discussed to remain the focus of conversation, rather than the provenance of the data.
  5. We have launched a communicating statistics programme that will in part look to understand how users want to access data and help support producers to communicate their data through those different means. This will include reviewing our existing guidance to understand what more we can do to support the use and range of communication methods while preventing and combatting misuse.

Statistical literacy

  1. In our regulatory work, when people talk to us about statistical literacy it is often in the context of it being something in which the public has a deficit. For example, ‘statistical literacy’ may be cited to us as a factor in a general discussion on why the public has a poor understanding of economic statistics. OSR commissioned a review of published research on this topic area and published an accompanying article to investigate if this was indeed the case.
  2. We found wide variability across the general public in the skills and abilities that are linked to statistical literacy. Our review highlights that a substantial proportion of the population display basic levels of foundational skills and statistical knowledge, and that skill level is influenced by demographic factors such as age, gender, education and socioeconomic status.
  3. Given this, we think that it is important that statistical literacy is not viewed as a deficit that needs to be fixed, but instead as something that is varied and dependent on the context of the statistics and factors that are important in that context. Therefore, rather than address deficits in skills or abilities, we recommend that producers of statistics focus on how best to publish and communicate statistics that can be understood by audiences with varying skill levels and abilities.
  4. Our review identified good evidence on how best to communicate statistics to non-specialist audiences in the following areas:
    • Target audience: Our evidence endorses the widely recognised importance of understanding audiences. The evidence highlights that the best approach to communicating information (including data visualisations) can vary substantially depending on the characteristics of the audience for the statistics. Considering the target audience’s characteristics is, therefore, an important factor when designing communication materials.
    • Contextual information: Contextual information helps audiences to understand the significance of the statistics. Our evidence highlights the importance of providing narrative aids, and also that providing statistical context can help to establish trust in the statistics. Again, this supports and reflects existing notions of best practice.
    • Establishing trust: As well as providing context, we found evidence that highlighting the independent nature of the statistical body and, when needed, providing sufficient information so that the reasons for unexpected results are understood, can increase trust in the statistics. This finding aligns with the Trustworthiness pillar of the Code.
    • Language: In the statistical system, statistics producers recognise that they should aim for simple, easy-to-understand language. We found evidence to endorse this recognition – in particular, that, when used, the level of technical language should be dictated by the intended target audience.
    • Format and framing of statistical information: We found evidence that different formats (e.g., probability, percentage or natural frequency) and/or framing (e.g., positive or negative) in wording can lead to unintended bias or affect perceptions of the statistics and both need to be considered. This finding is probably the one which is least widely recognised in current best practice in official statistics, and we consider it is an area that would benefit from further thinking.
    • Communicating uncertainty: Communicating uncertainty is important and may need to be tailored dependent on the information needs and interest levels of the audience. This topic is a particular focus area for OSR, and we discussed our report on communicating uncertainty at paragraph 39.

UK Statistics Authority oral evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority

On Tuesday 23 May, Sir Robert Chote, Chair of the UK Statistics Authority, Sir Ian Diamond, National Statistician and Ed Humpherson, Director General for Regulation, gave evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority.

A transcript has been published on the UK Parliament website.

UK Statistics Authority correspondence to the Public Administration and Constitutional Affairs Committee regarding Public Confidence in Official Statistics report

Dear William,

I am writing to draw your attention to the latest Public Confidence in Official Statistics report (2021), which has been produced by the National Centre for Social Research (NatCen) on behalf of the UK Statistics Authority. I am happy to share that the report finds that public confidence in official statistics remains high, and engagement with official statistics has increased since 2018.

Awareness of the Office for National Statistics (ONS) and the Authority has increased from 70% and 33% in 2018 to 75% and 48% in 2021 respectively. Furthermore, for the first time people were asked if they were aware of the Office for Statistics Regulation, with 41% saying that they were.

Notably, 96% of people able to express a view agreed that it is important for there to be a body such as the Authority to speak out against the misuse of statistics, and 94% agreed about the importance of there being a body to ensure that official statistics are produced without political interference.

Members might also be interested to note that a very high proportion of respondents trusted the ONS (89% of those able to express a view) and our statistics (87%). Of those able to express an opinion, trust in the ONS was highest of all institutions asked about, including the Government, the Bank of England, and the civil service as a whole. 82% of people able to express an opinion agreed that official statistics are generally accurate, up from 78% in 2018. Meanwhile 44% said they had used ONS COVID-19 statistics; they were more commonly used than any of the other statistics asked about with the exception of the census.

This report is very welcome, especially following our hard work to provide clear insights throughout the pandemic. We are proud that the public support our vision of statistics that serve the public good, which we will continue to deliver with honesty, and free from political interference.

A copy of the report will be annexed to this letter for the Committee’s information.

Yours sincerely,
Professor Sir Ian Diamond

UK Statistics Authority oral evidence to the Public Administration and Constitutional Affairs Committee’s pre-appointment hearing for Chair of the UK Statistics Authority

On Tuesday 29 March 2022 Sir Robert Chote, the Government’s preferred candidate for Chair of the UK Statistics Authority, gave evidence to the Public Administration and Constitutional Affairs Committee’s pre-appointment hearing for Chair of the UK Statistics Authority.

A transcript has been published on the UK Parliament website.

 

UK Statistics Authority oral evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority

On 21 October 2021, Sir David Norgrove, Chair, UK Statistics Authority, Professor Sir Ian Diamond, National Statistician and Ed Humpherson, Director General for Regulation, gave evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority.

A transcript has been published on the UK Parliament website.

UK Statistics Authority and Office for Statistics Regulation written evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on data transparency and accountability: COVID-19

Dear William,

Thank you for your letter of 2 February, in which you asked about the Government’s use of data, in relation to your inquiry Data transparency and accountability: COVID-19.

We discuss your specific questions in an Annex to this letter. Here I would like to reflect more generally about the use of data and statistics in the past year.

Overall I believe our statistical system has responded well to the stress and pressures of the pandemic. Ian Diamond’s separate letter to you describes an immense range of work that has been done to understand the pandemic itself, which has been fundamental to government decision making and public understanding. Alongside the work on the pandemic the Office for National Statistics (ONS) and statisticians across government have continued to produce remarkable data and analysis across the economy and society, work that is high quality and innovative. Preparations and contingency plans for the census in England and Wales are encouraging.

The legislative framework for our statistics as set out in the Statistics and Registration Service Act 2007 together with the Digital Economy Act 2017 has also, I think, met the sternest test it has yet seen. The new data and statistics required by the pandemic have for the most part been compiled and published in accordance with the Code of Practice on Statistics and statisticians have generally been able to access the new sources of data they need.

I pay warm tribute to all involved in this work, at a time of anxiety for them and their families, with all the disruption caused by the need to work from home, alongside the increased difficulty of their professional lives, with many surveys and other sources of data having to be changed or abandoned.

Within this generally positive picture not all has gone well, and there are lessons to be learned.

It has too often been a struggle to develop a coherent picture of the pandemic even within England as a single country. DHSC currently plays a limited role in health statistics. Its resource has been strengthened following an ONS review undertaken at their request. But the disparate bodies involved in the provision of health are in terms of statistical output too often inchoate, to the extent for example that both the NHS and Public Health England produce statistics on vaccinations that are published separately.

This is an issue that has been highlighted by the Office for Statistics Regulation (OSR) in the past. It goes well beyond the concerns raised by the pandemic. We currently have no coherent statistical picture of health in England or of the provision of health services and social care.

There are similar issues in relation to the health data for the four nations. The adoption of different definitions complicates comparisons and makes it hard to draw the valuable lessons we could all learn from different ways of doing things.

I strongly support the proposal by the Royal Statistical Society for a thorough external review.

More immediately it is hard to understand why the different nations have chosen to publish vaccination data in the different ways and detail they have chosen. OSR is pursuing this.

Some people may be surprised by my mostly positive assessment of the handling of statistics and data over the past year. Their more negative view is likely to have been influenced by a number of – too many – particular examples of poor practice.

  • The presentation of data at No 10 press briefings has improved, helped by the later involvement of ONS staff, but early presentations were not always clear or well founded, and more recently a rushed presentation has undermined confidence.
  • Ministers have sometimes quoted unpublished management information, and continue to do so, against the requirements of the Code of Practice for Statistics. Such use of unpublished data leads of course to accusations of cooking the books or cherry picking the data. It should not require my involvement or that of OSR to secure publication.
  • Perhaps most important is the damage to trust from the mishandling of testing data. The target of 100,000 tests per day was achieved by adding tests sent out to tests completed. As predicted there was huge double counting, to the extent of some 1.3 million tests that were eventually removed from the figures, in August. The controversy over testing data seems likely to continue to undermine the credibility of statistics and the use that politicians make of them.

The Annex describes a range of current issues in relation to the pandemic, including testing and vaccinations, as well as replying more directly to your letter.

There are perhaps two areas the Committee might like to consider in terms of future change.

The first is the central role of the Authority together with the National Statistician and OSR. The UK has a decentralised system of statistics where individual departments are responsible for their statistics and departmental statisticians report within their departments. This has strengths we should not lose. It ties statistics and statisticians closely into the policy making of their departments and any change should not weaken that tie. But the complexity of data and statistics in the current crisis has shown the need in these circumstances for a firmer central controlling mind. The National Statistician and the ONS have taken this role to a large extent, through expertise, position and personality rather than formal agreement.

OSR has also taken a more expansive role. For the future there may be a place for more formal arrangements.

Secondly, it is clear that political pressures have led to some of the weaknesses in the handling of Covid statistics. It is to the credit of our politicians that they have created an organisation like the Authority that is permitted to criticise them, and in general politicians respond appropriately to our criticisms. But it might help if more issues were headed off before they arose. The Ministerial Code for example only asks Ministers to be ‘mindful’ of the Code of Practice. The requirement could be stronger.

The Authority in 2020 published a new five year strategy, during the pandemic. It remains valid and we are pursuing it at pace. The ONS is leading the development of an Integrated Data Platform for Government as well as developing new and better statistics to help the country understand the economy and society, from trade to happiness and from crime to migration. Statistics to help the recovery are a particular focus. OSR is developing its work on statistical models – its review of exams algorithms will be published shortly – as well as on automation of statistics, data linkage, National Statistics designation, granularity and statistical literacy.

I look forward to keeping you and the Committee in touch with our progress.

 

Kind regards,

Sir David Norgrove

 

OFFICE FOR STATISTICS REGULATION ANNEX

1. Has the Government made enough progress on data since the start of the pandemic, and what gaps still remain?

Summary

Since the start of the pandemic Governments across the UK have maintained a flow of data which has been quite remarkable. New data collections have been established swiftly, existing collections have been amended or added to, and data sources have been linked together in new ways to provide additional insight. We have seen good examples from across the UK, including data on the virus itself and on the wider impacts of the pandemic.

However, in some areas Governments have been slow both to publish data and to ensure its fitness for purpose. For example, more consideration should have been given to the data available as testing and tracing was being set up. While UK governments have started publishing vaccinations data more promptly, and are continuing to develop the statistics on an ongoing basis, there remains much room for improvement in terms of the amount of information that is published and the breakdowns within the data. There are also gaps remaining in the data available to support longer term understanding of the implications of the pandemic.

It is clear that there is intensive work taking place to provide more comprehensive vaccinations data, both by each Government, and through cross-UK collaboration.

New data developed

There are many examples of outputs which have been developed quickly to provide new insights to help understand the pandemic and its implications. Some specific examples include:

  • The coronavirus (COVID-19) infection survey, carried out by the Office for National Statistics (ONS) in conjunction with partners. This is the largest and only representative survey in the world that follows participants longitudinally over a period of up to 16 months.
  • Statistics on the Coronavirus Job Retention Scheme (CJRS) and the Self-Employment Income Support Scheme (SEISS) published by HM Revenue and Customs (HMRC), and the ONS Business Impact of Coronavirus Survey (BICS).
  • The Ministry of Justice (MoJ) and Her Majesty’s Prison and Probation Service (HMPPS) have published official statistics providing data on COVID-19 in HM Prison and Probation Service in England and Wales.

We have also seen outputs which attempt to draw together data to make it more easily accessible. The most well-known of these is the coronavirus dashboard which is widely used and constantly evolving. While the dashboard focuses on data related to COVID-19 infections, hospitalisations and deaths, the ONS has pulled together a broader range of data from across the UK government and devolved administrations to highlight the effects of the pandemic on the economy and society: Coronavirus (COVID-19): 2020 in charts. This output covers a broad range of data including transport use, schools attendance and anxiety levels.

Coronavirus (COVID-19) data

We have seen improvements to the data since the start of the pandemic, particularly around cases and deaths, with clearer information on the different sources of data available and the value of each of these sources.

However, we continue to have concerns, including seeing examples of data being referenced publicly before they have been released in an orderly manner, which we have highlighted to your Committee. In some areas Governments have been slow both to publish data and to ensure its fitness for purpose. For example, the UK coronavirus dashboard is now a rich source of information with plans for the inclusion of further data. However, it took several months to become the comprehensive source it is now.

Our concerns around COVID-19 health data cover three broad areas: testing and tracing, hospitalisations, and vaccinations.

Our concerns with data on testing have been made public on a number of occasions (1 & 2). While there have been significant improvements to the England Test and Trace data it is still disappointing that there is no clear view of the end-to-end effectiveness of the test and trace programme.

In December we published a statement on data on hospital capacity and occupancy, noting the biggest limitation to the existing data is the inability to make a distinction between whether someone is in hospital because of COVID-19 infection, or whether they are in hospital for some other reason but also have a COVID-19 infection. These data should become available in time but the delay limits understanding at a critical time.

On vaccinations data, UK governments have been quick to start publishing data and have learnt some of the lessons from test and trace (see section 4). However, there remains room for much improvement in terms of the amount of information that is published and the breakdowns within the data. We would like to see more granular breakdowns and more consistency between administrations across the UK. Our letters to producers published on 1 December 2020 and 20 January 2021 outline our expectations in relation to these statistics.

The improvements cover four broad aspects of vaccinations data. First, there should be more granular data on percentage take up, for example by age band, ethnic group and Joint Committee on Vaccination and Immunisation (JCVI) priority group.

Second, in terms of UK-level consistency, the data available for each administration varies:

  • Information on the percentages of individuals in each priority group that have been vaccinated to date is available for both Scotland and Wales
  • In England, breakdowns including ethnicity and some age bands are published by the NHS. However, it is disappointing that the publication of Public Health England’s COVID-19 vaccine monitoring reports, which might have been a vehicle for more granular information about the vaccination programme in England, has been halted with no explanation as to why it has stopped or when it may restart. We have called upon the statistics producers to be clear on the data currently available and when more data will be published.
  • Northern Ireland data are included in the UK-level data. However, the more granular data are not released in Northern Ireland itself in an orderly manner. We have written separately to the Department of Health (Northern Ireland), requesting that the data are published in an orderly and transparent way.

Third, on vaccine type, Scotland is the only administration to routinely publish data on vaccination type (Pfizer BioNTech or AstraZeneca). It is in the public interest to have this information for all parts of the UK, particularly in the context of the media coverage on vaccine efficacy and sustainability of supply.

Fourth, it would also be helpful to have better information on those who have been offered a vaccine and those who have taken it up. This would help with understanding the reasons why some people may not be taking up vaccines, for example whether it is because of refusals, access or because they have not actually received an invitation to have a vaccine.

In addition to the areas outlined above, there are gaps remaining in health related COVID-19 data. For example, very little data exists on ‘long COVID’, despite emerging evidence that a proportion of people can suffer from symptoms that last for weeks or months after the infection has gone. We understand that the ONS is starting to look at this area and welcome this effort to fill an important data gap.

There are also gaps in the understanding of and information on variants of coronavirus. This is important in understanding the implications, for example whether data already collected on issues such as hospitalisations, deaths and efficacy of vaccinations are still applicable.

It is a fundamental role of official statistics and data to be used to hold government to account, and the lack of granularity and timeliness seen with some of the data makes it hard to do this. We recognise that producers are working intensively to make improvements and are keen to support these efforts.

Wider impacts

It is inevitable in such a fast-moving environment that there will be gaps in the data. To date the priority has been to fill the gaps most needed to understand the immediate and most direct impacts of the pandemic but at some stage the focus will need to shift from what is happening today to looking at what data are needed to fully understand the longer term consequences of the pandemic.

The issues are broad ranging and cover diverse areas such as:

  • the impact of the pandemic on children and young people
  • the long-term effects on health services
  • the future health of the nation
  • the impacts on inequalities
  • the financial impacts of both the pandemic and the response to it

Answering society’s questions about the pandemic must be done using both existing and new data. There are some longstanding statistical outputs on aspects of society that are likely to have been affected by the pandemic – for example, statistics on trade and migration. These outputs should be used to present analysis on the pandemic’s impact on those specific issues. However, where existing data is collected through a survey, the data may be impacted by the difficulties in collecting face to face survey data over the past 12 months.

It is also important to consider sooner rather than later where new data may be required.

2. Can you give a view on how well (or otherwise) the Government has communicated data on the spread of the virus and other metrics needed to inform its response to the pandemic?

During the pandemic we have highlighted two main areas of concern in the way governments have communicated data: accessibility and transparency.

Accessibility

Early in the pandemic there were lots of new sources of data being published by a range of organisations to support understanding of the pandemic. Many of these data were put out without narrative or sufficient explanation of limitations and could be hard to find. This made it difficult for members of the public to navigate the data and understand the key messages, which in turn could undermine the data and confidence in the decisions made on the basis of the data.

There have been improvements. For example:

  • Data have been more effectively drawn together in dashboards and summary articles, such as the UK coronavirus dashboard and the continually evolving dashboards in Scotland, Wales and Northern Ireland
  • Publications have been developed to include better metadata and explanation of sources and limitations, and how the data link with other sources. An early example was the Department for Transport’s (DfT) statistics on transport usage to support understanding of the public response to the pandemic. We also saw the Department of Health and Social Care (DHSC) develop its weekly test and trace statistics to include information on future development plans and in October an article was published comparing methods used in the COVID-19 Infection Survey and NHS Test and Trace.

However, it can still be hard to know what is available and where to find data on the range of issues people are interested in.

Transparency

Throughout the pandemic we have been calling for greater transparency of data related to COVID-19. Early in the pandemic we published a statement highlighting the importance of transparency. We also published a statement and blog on 5 November 2020. One of the prompts for this was the press conference on 31 October announcing the month-long lockdown for England. The slides presented at this conference were difficult to understand and contained errors, and the data underpinning the decision to lock down were not published for several days after the press conference.

We continue to see instances of data being quoted publicly that are not in the public domain. Most recently, for example, at the Downing Street briefing on 3 February, the Prime Minister said:

“…we have today passed the milestone of 10 million vaccinations in the United Kingdom including almost 90% of those aged 75 and over in England and every eligible person in a care home…”

At the time this statement was made these figures were not published. Breakdowns by age are not published for the UK as a whole. On 4 February NHS England included additional age breakdowns in its publication for data up to 31 January, including, for the first time, the percentages for those aged 75-79 (82.6% having had a first dose) and 80 and over (88.1% having had a first dose). The 10 million first doses figure for the UK was reached on 2 February (published 3 February). Our view is that it is poor data practice to announce figures selectively from a dataset without publishing the underlying data.

Our recent report on statistical leadership highlights the importance of governments across the UK showing statistical leadership to ensure the right data and analysis exist, that they are used at the right time to inform decisions, and that they are communicated clearly and transparently in a way that supports confidence in the data and decisions made on the basis of them. It sets out recommendations to support governments in demonstrating statistical leadership from the most junior analysts producing statistics to the most senior Ministers quoting statistics in parliaments and the media.

We will continue to copy letters to PACAC when we write publicly on transparency issues.

3. Can you give a view on how well the UK’s statistical infrastructure has held up to the challenge of the pandemic? Are there key systems or infrastructure issues that need to be addressed going forward?

By and large the statistical infrastructure has been extraordinarily resilient. It moved from a largely office-based operation to being almost completely remote in a very short space of time. There were also fast adjustments to allow surveys and data operations for major household, economic and business statistics to continue in some form. Furthermore, new important statistical surveys were launched in a few weeks that would have taken months of planning pre-pandemic, while existing surveys were adapted swiftly from face-to-face to online.

The statistical system has also responded quickly to the need to provide more timely data, for example to support daily briefings.

Producers have been more open to creative solutions and new data sources, for example web-scraped data and PAYE Real Time Information. There have also been greater instances of data sharing. For example, the ONS coronavirus insights dashboard seems to be working towards being a ‘one-stop shop’ for several different producers and the devolved administrations to collate data in a single place. This appears to be encouraging communication between producers and is improving each week.

There are key infrastructure opportunities now that can be exploited, and it is important to question which elements of the new approaches should remain and which should change back to how things used to be done. For example, when and how best should data collection return to face-to-face household surveys? Should legacy surveys like the Annual Business Survey continue or should there be a move to new platforms or administrative data? How can the new data sources that have now come on stream be exploited even more? Is there a case for synthetic data to enhance existing data to help phase out large and expensive surveys? Can new survey platforms be used to answer short-term questions to help manage the impacts of the pandemic?

It is also important to learn lessons from mistakes that have occurred, at whatever stage of the process they occurred. One high-profile example involved an error with testing data which meant there were delays to some data being included in the daily figures compiled by PHE.

Errors like this reflect the underlying data and process infrastructure. OSR is currently exploring the use of reproducible analytical pipelines (RAP) in government. We are focusing on what enables departments to successfully implement RAP and what issues prevent producers either implementing RAP fully or applying elements of it. This work will give a further insight into infrastructure challenges and where improvements may be needed.

Much of the data that are published are also drawn from administrative sources. In terms of monitoring the pandemic this has presented specific challenges. Public health, social care and hospital administrative systems are not connected to one another, which makes it time consuming to collate the data and puts quality at risk. Updating the IT infrastructure and data governance to make it possible to share information in a timely way is vital.

The pandemic has again highlighted the importance of analysts being involved in the development of operational systems, to make sure they are set up in a way that can best support data and evaluation needs.

The pandemic has also highlighted some of the strengths and limitations of the UK Statistical System. For example, analysts embedded in policy departments and devolved administrations have been able to quickly respond to emerging issues, but this has also been balanced with the complexity of having multiple organisations working on overlapping areas.

IT infrastructure as well as how statisticians organise themselves within and across organisations are covered further in our Statistical Leadership report.

4.   What should the Government learn from the issues with testing data that can be applied to vaccine data?

A key issue that has persisted throughout the pandemic has been the need for timely data to be published as quickly as possible. We saw in the early days of testing that the data were disorderly and confused. We were keen that this experience was not repeated with the roll-out of vaccines. So before the start of the vaccine roll-out, we wrote to producers of health-related statistics across the UK, outlining our expectations that they should work proactively to develop statistics on the programme.

Drawing on the lessons to be learnt from the development of test and trace statistics, we outlined the following key requirements for vaccination statistics:

  • Any publications of statistics should be clear on the extent to which they can provide insight into the efficacy of the vaccinations, or whether they are solely focused on operational aspects of programme delivery.
  • Data definitions must be clear at the start to ensure public confidence in the data is
  • Where statistics are reported in relation to any target, the definition of the target and any related statistics should be clear.
  • The statistics should be released in a transparent and orderly way, under the guidance of the Chief Statistician/Head of Profession for Statistics.
  • Some thought needs to be applied as to the timeliness, for example, daily or weekly data, to meet the needs of a range of users.

Encouragingly, we have seen signs that producers have learnt from their previous experiences. For the most part they made initial vaccination data available quickly; at first the numbers were fairly crude but they have continued to develop the data as time has gone on. We have also seen that whereas data were initially published on a weekly basis, producers very quickly moved to publishing daily figures with more detailed breakdowns provided in the weekly updates. This in part reflects a greater acknowledgement of the need to publish the data so that they can be quoted in parliaments and the media without undermining confidence.

This is pleasing but the statistics remain in development and there is more to be done, as we have outlined in this annex and in our letter of 20 January. We also asked producers to publish development plans for these statistics and indicate if some data cannot be provided, as this will help users to understand the limitations of the data available.

Office for Statistics Regulation

February 2021

Related links:

ONS written evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on data transparency and accountability: COVID-19