Office for Statistics Regulation supplementary evidence to the Public Administration and Constitutional Affairs Committee

Dear Mr Wragg,

Thank you very much for the opportunity to give evidence to your Committee as part of the Transforming the UK's Evidence Base inquiry on 6 February. I enjoyed the session and I hope that you found my evidence useful.

I am writing to provide some supplementary evidence related to comparability of statistics across the UK.

During the session, I set out the expectations we have as the Office for Statistics Regulation for statistics producers on questions of comparability. I emphasised that where there are questions from users around how to compare the performance of public services across the UK, producers in the four nations should recognise and seek to meet that need.

Meeting that need is not straightforward. As I explained, the configuration of public services will probably be different, because of different policy and delivery choices that have been made by the different governments. This is consistent with the concept of devolution, but it does mean that administrative data may be collected and reported on different bases.

However, it is not in our view sufficient for producers to simply argue that statistics are not comparable. They should recognise the user demand, and explain how their statistics do, and do not, compare with statistics in other parts of the UK. And they should also undertake analysis to try to identify measures that do allow for comparison.

A very good example of this approach is provided by statisticians in the Welsh Government. Their Chief Statistician published two blogs on the comparability of health statistics, Comparing NHS performance statistics across the UK and Comparing NHS waiting list statistics across the UK. These blogs recognise the user demand and provide several insights to enable users to make comparisons of NHS performance.

In addition, the Welsh Government’s monthly NHS performance release also highlights what can, and cannot, be compared. For example, it shows that in November 2023, there were approximately 22 patient pathways open for every 100 people, while for England, the figure in November was 13 pathways for every 100 people. More generally, I would commend the Chief Statistician’s blogs as a good example of providing guidance and insight to users across a wide range of statistical issues.

During my evidence session I also mentioned the approach taken by NHS England to highlight the most comparable accident and emergency statistics. NHS England provide a Home Nations Comparison file for hospital accident and emergency activity each year.

More generally, the ONS is leading comparability work across a range of measures. In addition to work on health comparability, it has produced very good analysis of differences in fuel poverty measurement across the four nations.

I hope this additional evidence is useful. I would like to reiterate that these examples show statisticians recognising the core point – that there is user demand for comparability – and taking steps to meet that demand.

Yours sincerely,

Ed Humpherson

Director General for Regulation

UK Statistics Authority supplementary evidence to the Public Administration and Constitutional Affairs Committee

Dear Mr Wragg,

Following the submission of the Office for National Statistics' (ONS) written evidence to the Committee's Transforming the UK's Evidence Base inquiry on 31 August 2023, I gave evidence to the Committee on 5 September 2023. I am now able to provide some supplementary evidence, as requested, on several topics of interest.

The Integrated Data Service (IDS)

As you will be aware, the IDS is a cross-government project, for which the ONS is the lead delivery partner. The project is a key enabler of the National Data Strategy and seeks to securely enable coordinated access to a range of high-quality data assets built, linked and maintained for richer analysis. Please find below some further detail on the background of this project and the progress towards its delivery.

What is the scope of the IDS?

The scope of the IDS is to deliver a secure, scalable, modern data service which operates on a cloud-native platform, hosting a rich and diverse catalogue of indexed and linkable data, with up-to-date data science tooling and the potential to apply generative AI. The service has been designed to better inform effective policy making.

The vision of the IDS is to address the lack of a central integration platform that can cater for the future needs of both data providers and analysts looking to use integrated data to develop cross-cutting analytical results. The IDS builds on the success of the Secure Research Service (SRS) and significantly reduces the time it takes to negotiate access to data and to provision data assets.

The IDS provides a secure environment that enables streamlined data sharing across government, improving the ways that data are made available via cloud-native technologies and modernising the way departments and their professionals operate. The IDS is the first of its kind in the UK and will set the precedent for how data are processed on a cloud-native platform.

When is it expected to be delivered?

The programme has been developed by the ONS over the last 18 months and is funded until March 2025 (under the current Spending Review). After this date, the IDS will become a live running service.

What is the cost of the programme?

The programme secured funding from HM Treasury (HMT) until the end of the investment period (financial year 2024/25). The cost of the programme is estimated to be £228.7m, which covers the development and running costs from 2020 to 2025. The programme continues to assess funding options beyond 2024/25.

Who are the users likely to be?

The IDS is designed for use by accredited analysts, within government and the wider research community. The ambition for the IDS is to have every government analyst, roughly estimated at 14,000 individuals, capable of utilising the platform to better inform decisions for the public good.

What data do you expect to be available on the service?

There are currently 81 datasets available in the IDS from across government. This includes high-value data assets on topics such as levelling up, climate change and net zero. Additionally, work on health data assets is underway, with identified datasets being indexed by the Reference Data Management Framework (RDMF) – which enables multiple datasets to be linked and analysed, creating new comprehensive data assets – and published on the IDS so that analysts can link data according to their requirements.

The programme intends to continue to work with data owners across government and the private sector to acquire more datasets in conjunction with the RDMF. However, this is dependent on data owners signing up to data sharing agreements to make this data available.

In accordance with the Central Digital and Data Office's roadmap for 2022-25, departments have agreed to share their essential shared data assets across government, including through the IDS. As a Trusted Research Environment, the IDS is well placed to facilitate and support this commitment.

However, discussions with government analysts have highlighted a range of concerns about how current incentives for departmental data sharing fit with the needs of ministerial-facing departments. There is also a wider financial risk regarding other departments' ability to fund activity such as data cleansing, which may limit their ability to share data effectively. Although HMT set out the expectation that other government departments would support data sharing in all SR21 settlements, no specific funding was provided, which may limit activity in some cases. As part of the IDS Programme, the ONS is working with Chief Data Officers across government to minimise frictions around the sharing of data via the IDS. One of the pilots in development is looking at data ownership and stewardship approaches to streamline the governance arrangements, making it quicker for departments to agree to share data via the IDS, and for analysts to subsequently access those data for a broad range of analysis for the public good. As always, I would welcome support from the Committee to share and promote the benefits of data sharing across government for the public good.

What safeguards will be in place to protect data?

The IDS is a trusted research environment, which means it adheres to the Five Safes framework in accordance with the Digital Economy Act (DEA). The five safes of secure data are as follows:

  • Safe projects – Is this use of the data appropriate, lawful, and ethical?
  • Safe people – Can the users be trusted to use it in an appropriate manner?
  • Safe settings – Does the access facility limit unauthorised use?
  • Safe data – Is there a disclosure risk in the data?
  • Safe outputs – Are the statistical results non-disclosive?

These principles enable the safeguards and governance for the IDS to operate with sensitive data, which in turn ensures public confidence in the security and processing of data. Access to the IDS platform is granted via a secure gateway in line with data legislation; furthermore, the IDS applies strict policies around the cleaning, linkage, validation and control of data.

The IDS Programme is also working across the ONS to develop key governance through the creation of policies that enable safeguards and the appropriate use of data. The policy workstream, coordinated by the ONS' Data Governance, Legislation and Policy and Security and Information Management teams, is helping to develop adequate governance for the programme. In developing safeguards, the programme employs the following principles:

  • Adapting successful policies within the ONS and across government analytical communities (e.g., GSS, GSR, GES) that can support the programme.
  • Working with the National Statistician’s Data Ethics Advisory Committee, which is underpinned by the UK Statistics Authority’s (UKSA) ethics framework for the use of data for statistical, research and analytical purposes, to identify and mitigate any potential ethical risks at project-level.
  • Access to all data is controlled through the concept of an analytical ‘project’, with supporting business and technical processes linked to user need.
  • An overarching programme Data Protection Impact Assessment (DPIA) is maintained to define key activities and associated data risks, with continued engagement with the Information Commissioner’s Office as the DPIA is updated and the programme develops.

The programme also adheres to the UK Statistics Authority/ONS Data Protection Policy (required by the Data Protection Act 2018 and the General Data Protection Regulation).

The ONS website

The Committee also asked for some insight into the current condition of the ONS website and any plans to change the site in the future. Below I have outlined our vision for dissemination, of which our website is an integral part, as well as some exploratory work we are undertaking to see how we could use AI technology to address some of the challenges with our existing website.

Our Vision for Dissemination

The ONS website supports the Statistics for the Public Good strategy by helping to build trust in evidence, enhance understanding of social, economic and environmental matters and improve the clarity and coherence of our communication. By helping people to be aware of the ONS and to find, understand and explore our data, statistics and analysis we are giving people the information they need to make decisions, and act, at a national, local and individual level.

Our vision for statistics dissemination goes beyond the website. We want people to have trust in our data and analysis. We know that our users want to find trusted ONS information wherever they look – whether that’s on the ONS website, on social media, in the media or through search engines. Our users want ONS answers to their questions and we are exploring a range of different approaches to serve this need, including providing answers to questions using Large Language Models (LLMs).

Our goal is for users to understand our data, statistics and analysis more quickly and easily, with the right contextual information to help people know how they can use them. We want our users to explore and tailor our information so they can find what is important to them – whether that is by creating their own datasets based on ONS data or through our expert curated view of key insights for the economy or society.

Our priorities for the website in recent years have been delivering the capability to support Census 2021 outputs and ensuring the reliability of the service for all our users, particularly given the additional demand for ONS data on the economy in response to changes in the cost of living. We are currently running a package of work to improve website performance to meet demand, and our next priority will be programmatic access to our data via application programming interfaces (APIs). This will improve the agility with which all users of our data, both internal and external, can consume and gain insights from the ONS website.

We have also focused on improved search both on the ONS website and through greater visibility of our data and insight in search engines and in the media.

This year we are also setting the future direction for how we create and manage our statistical content in a more efficient and structured way to enable business agility and flexibility for our users, aligned to their broad range of needs. This will set out a forward plan to transform ONS data and insight and will make the case for the additional funding needed to deliver on our ambitions.

StatsChat

Additionally, the ONS Data Science Campus is currently exploring how new tools and technology can help the organisation disseminate information more effectively. We have developed a new product, ‘StatsChat’, that uses LLMs to search and summarise text from across our website and present relevant sections of our web pages in response to users’ natural language questions.

We are aiming to make this available to a small selection of users for testing and fine-tuning, so that we can improve the relevance of the responses and provide assurance from a data ethics, data protection and security perspective.

Stakeholder engagement

The ONS conducts a wide range of user and stakeholder insight exercises, consultations and listening exercises. This engagement is essential as it provides us with actionable insights into users’ and stakeholders’ views on the strength of their relationship with the ONS, feedback on our outputs, and how stakeholders access and use our statistics and analysis.

As part of this, the ONS’s Engagement Hub conducts annual stakeholder ‘deep dive’ research and an annual stakeholder satisfaction survey. I understand the Committee is interested in understanding more about these exercises and insights from recent examples.

The deep dive research is conducted through in-depth interviews with senior representatives from around 45 key stakeholder organisations. The stakeholder satisfaction survey is an online questionnaire aimed at a wider range of users from a variety of sectors and roles to provide broader insight. Deep dive participants include those from central and local government departments, devolved administrations, research institutes, think tanks, public bodies such as NHS England and the ICO, international partners, business representative bodies and charities. The stakeholder satisfaction survey reaches similar types of organisations, with a wider range of responses at senior manager, operational, public affairs, analyst, researcher, policy maker and economist levels.

Deep dive interviews took place in summer 2022 and the findings were positive. Many stakeholders said that the organisation had built on and maintained its reputation for independence, trustworthiness, quality and reliability. They also felt that the ONS had developed its reputation for being flexible, agile and responsive to changing needs. Additionally, the ONS was seen to be working more collaboratively with policymakers than it had in the past.

The stakeholder satisfaction survey was conducted in early 2023. It found respondents to be positive across key sentiment measures on trust, quality, and on the ONS producing statistics which are relevant to issues of the day. There were also positive views expressed about the ONS as an organisation, with its reliability, responsiveness and willingness to help being cited. It was also noted that ONS staff were knowledgeable and helpful.

There were areas highlighted for improvement in both the stakeholder deep dive and satisfaction survey. These included how the ONS works with both devolved governments and heads of the statistical profession in government departments; improving the ease of finding the right people to speak to in the organisation; and more regular, strategic overviews of the ONS’s work (for stakeholders to be able to connect different topics better). Some participants referenced a need for further scrutiny to understand some data anomalies which had occurred in mid-2022.

These findings are shared throughout the ONS, including with the National Statistician’s Executive Group, and are used to inform planning and prioritisation. We have implemented measures to respond to the issues raised as part of a wider programme of ongoing external affairs improvements, which we continue to monitor with further research.

The ONS conducted a subsequent stakeholder deep dive in autumn 2023 and is currently analysing the findings. The latest ONS annual stakeholder satisfaction survey is currently live and will be open for responses until 22 January 2024.

Full business case on population and migration statistics improvements

As you are already aware, next year I will be making a recommendation to Government on the future of the population and migration statistics system in England and Wales. I understand that the Committee has requested some additional detail surrounding the financial aspects of this transformational work.

In the outline business case for the Future of Population and Migration Statistics programme, initial cost estimates of a potential census in 2031 range from £1.3 billion to £2 billion, with increases expected across all phases of such an operation.

The ONS is working to produce a full business case (FBC) for our proposals to improve our population and migration statistics. The FBC will be developed in the context of the forthcoming recommendation to UK Government, and the response from Government. At this stage, while the recommendation remains in development, it is difficult to provide an accurate updated estimate of cost.

The FBC is expected by HM Treasury in late 2024. We will be able to provide the Committee with further information on costs at a later date.

Migration statistics

As part of improving population statistics we are also transforming international migration statistics. Our latest estimates, for the year to June 2023, are official statistics in development and are provisional. We revised our June 2022 and December 2022 estimates upwards due to a combination of more data and methodological improvements.

International migration estimates are produced using three key sources: Home Office border data linked to a person’s travel visa for non-EU nationals, which made up 82% of total immigration in 2023; tax and benefit data (known as RAPID) for EU nationals; and International Passenger Survey data for British nationals. We are most confident in the Home Office border data and have an ambition to produce all migration statistics from these data in future.

We work very closely with the Home Office to procure and use border data linked with visa data to produce migration estimates. Free movement for British nationals and some EU and non-EU nationals makes the current method challenging for those who do not require visas. However, there are further data held by the Home Office, known as Advanced Passenger Information, that would help with our research, particularly for British nationals. We have requested these data and would like to see the Home Office accelerate this request.

Census 2021 data confirmed our position that the administrative data we use for non-British nationals are robust and that International Passenger Survey data do not measure actual migration patterns well, due to people changing their intentions. Rather than rebasing once a decade, following a decennial census, to correct for any drift in our population estimates, we aim to produce statistics that do not ‘drift’ from the truth. Our Dynamic Population Model-based population statistics show how drift in both population and migration statistics can be mitigated. That does not remove the need to revise estimates as the data and methods mature.

Long-term international migration uses the UN definition of a migrant, that is, someone who changes their country of residence for 12 months or more. To produce timely estimates, we therefore have to make assumptions based on previous behaviour. As more time passes, we are able to update those assumptions with data on actual travel, and we therefore become more confident in our estimates over time. For example, our June 2022 estimates now have complete data showing whether a migrant has stayed or left for 12 months, and we therefore have less uncertainty around those estimates compared with the provisional June 2023 estimates.
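For illustration only, the sketch below (Python, using entirely hypothetical records and a hypothetical stay rate; it is not ONS production methodology) shows how such an estimate can mix observed 12-month outcomes with assumptions based on previous behaviour, and why confidence grows as complete data arrive.

```python
from dataclasses import dataclass

@dataclass
class Arrival:
    months_observed: int   # months of travel data held so far for this arrival
    still_resident: bool   # resident at the most recent observed month

# Hypothetical share of similar past arrivals who went on to stay 12+ months.
HISTORICAL_STAY_RATE = 0.8

def long_term_migrant_estimate(arrivals: list[Arrival]) -> float:
    """Estimate long-term migrants (12+ month stays) from a mix of
    complete observations and assumption-based imputation."""
    estimate = 0.0
    for a in arrivals:
        if a.months_observed >= 12:
            # Complete data: we directly observe whether the stay met
            # the UN 12-month definition.
            estimate += 1.0 if a.still_resident else 0.0
        elif a.still_resident:
            # Provisional: impute from the previous behaviour of
            # similar arrivals.
            estimate += HISTORICAL_STAY_RATE
    return estimate

# As more arrivals pass the 12-month mark, the imputed component shrinks
# and uncertainty around the estimate narrows.
```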

We have recently published experimental uncertainty measures for our admin-data based migration estimates for the first time. These show our users how our confidence increases once we have complete data that meet the required definition.

We also described the nature of provisional estimates that are subsequently revised, and the reasons behind these revisions. This was picked up and presented accurately in the media and reflected back in conversations with our core users. The Office for Statistics Regulation (OSR) recently published a review of its recommendations on migration statistics. The OSR considered that we sufficiently described uncertainty to our users, although we recognise these measures are experimental and we will continue to update our users as they develop.

I hope that you find this additional information useful. Please do let us know if we can assist the Committee further on any of the issues discussed in this letter, or with any of its other inquiries.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority correspondence to the Public Administration and Constitutional Affairs Committee on the Office for National Statistics’ work on the Labour Market

Dear Mr Wragg,

I am writing to update the Committee on the Office for National Statistics’ work on the Labour Market.

As you will be aware, due to quality concerns, the ONS suspended publication of the Labour Force Survey (LFS) estimates element of the wider Labour Market release in October. Instead, to provide users with our best assessment of the labour market, we produced indicative experimental estimates of the headline employment, unemployment and inactivity rates. These were produced using the most robust administrative data sources available to us. For employment, we used payroll data from HMRC’s Real Time Information system, applying the growth rates of that data to the LFS estimates for April to June 2023. Likewise, we used Claimant Count data for unemployment.
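To make the growth-rate approach concrete, the sketch below (with hypothetical placeholder figures, not the published numbers or the full ONS method) shows how a growth rate from an administrative source can be applied to a survey baseline to produce an indicative estimate.

```python
# Indicative employment estimate: apply the growth rate observed in an
# administrative source (e.g. HMRC RTI payroll counts) to a survey baseline
# (the April to June 2023 LFS). All figures are hypothetical placeholders.

lfs_employment_baseline = 32.8e6  # LFS employment level, Apr-Jun 2023
rti_baseline_period = 30.1e6      # RTI payrolled employees, same period
rti_latest_period = 30.4e6        # RTI payrolled employees, latest period

# Growth observed in the administrative source...
rti_growth = rti_latest_period / rti_baseline_period

# ...applied to the survey level gives the indicative estimate.
indicative_employment = lfs_employment_baseline * rti_growth
print(f"Indicative employment: {indicative_employment / 1e6:.1f} million")
```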

Today we have published a development plan for the LFS. This will focus on work to increase the number, and diversity, of responses to the LFS and on improved methods to better account for non-response and bias. We will also update the population figures used in the Labour Market estimates, which is another important improvement. With this work in train, we are aiming to reintroduce LFS estimates in the Labour Market release on 12 December.

In parallel, we will continue our work to transform this key survey. Alongside the LFS, we currently also have the transformed LFS in the field. This has a sample size three times that of the current LFS and an online-first mode of collection, supported by telephone and face-to-face interviewing, to help ensure a higher and more representative response. We are doing some final fine-tuning of the questionnaire and expect to fully transition to this new survey in March 2024.

I do hope that you find this update helpful but please do let me know if you have any other questions about this topic, or if we can be of assistance to the Committee on any other matter.

I am also copying this letter to Harriett Baldwin MP, Chair of the Treasury Committee and The Lord Bridges of Headley MBE, Chair of the Economic Affairs Committee as their specialists were recently briefed on this matter by members of my team.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority follow-up written evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on Transforming the UK’s Evidence Base

Dear Mr Wragg,

When giving evidence to the Public Administration and Constitutional Affairs Committee on 5 September 2023, I promised to follow up on a couple of points with various members of the Committee.

GDP Revisions

Firstly, I agreed to let you know if I was aware of similar revisions happening in other comparable countries.

As I outlined in the Financial Times recently, the UK’s official economic statistics are rightly seen as among the world’s best. This includes the recent upgrade of our official estimates for economic growth in the pandemic years of 2020 and 2021. The latest Organisation for Economic Co-operation and Development (OECD) information shows that the UK is one of the first countries in the world to estimate the 2020 and 2021 coronavirus (COVID-19) pandemic period through the detailed Supply and Use framework. This standard economic framework enables us to confront our data at a much more granular level for products and industries. The OECD provides a database of real-time GDP vintages in its Main Economic Indicators, which takes data directly from national statistical institutes.

Each country follows different revision policies and practices, which can result in estimates being revised at a later date according to its own needs. The timing and impact of revisions will depend on data availability and magnitude, with large annual structural surveys being the data source needed to make detailed product and industry changes. These annual data sources come with a lag, often becoming available only two or three years later.

We have now seen revisions to GDP estimates published by other countries. As we previously announced, the 2021 GDP estimates for the UK were revised to 8.7 percent growth from our initial estimate of 7.6 percent growth, a revision of +1.1 percentage points. The Spanish Statistical Agency has now published 6.4 percent growth in GDP for 2021, compared with the previous estimate of 5.5 percent, a revision of +0.9 percentage points. The Netherlands has now published 6.2 percent growth for 2021, revised from an initial estimate of 4.9 percent, a revision of +1.3 percentage points. Italy has now published 8.3 percent growth for 2021, revised from an initial estimate of 7.0 percent, a revision of +1.3 percentage points. All are upward revisions of a similar magnitude to that observed in the UK. Conversely, the United States has now published 5.8 percent growth for 2021, compared with a previous estimate of 5.9 percent growth, a revision of -0.1 percentage points [ONS own calculations based on published US data from www.bea.gov]. This highlights that revisions can differ across countries.

Strengthening the Analysis Standard

Secondly, I promised to examine whether there is a case for strengthening the Analysis Standard. I am passionate about ensuring the robustness of the Analysis Standard and welcome the Committee taking an interest in its strength and its application across government.

The Analysis Function Standard, which was updated earlier this year, is part of a suite of management standards that promote consistent and coherent ways of working across government, and provides a stable basis for assurance, risk management and capability improvement.

In my letter to you of 18 September regarding the Committee’s report, ‘Where Civil Servants work: Planning for the future of the Government’s estates’, I emphasised my work to promote transparency in government analysis through my role as Head of the Analysis Function. I am keen to take every opportunity to champion the Standard across government and will reiterate the importance of this area at October’s Heads of Function board meeting.

The Standard is very clear on expectations about transparency in the commissioning, production and publishing of analysis. It also has clear messaging about compliance with the Code of Practice for Statistics and other official guidance for the analytical professions, including the Aqua, Green and Magenta Books.

It is my expectation that all departments closely follow the principles in these sets of guidance, and through the Analysis Function Standards Steering Group we monitor and scrutinise these documents to ensure their continued effectiveness.

For the first time this year, all Departmental Directors of Analysis undertook a self-assessment against the Standard, and in response we are starting a series of action groups to drive improvements, including in departments’ compliance with official guidance.

I will keep the Analysis Function Standard under close review and, where necessary, strengthen the messages in it.

Please do let us know if you have any other questions, and if we can help the Committee further on either of these topics or any of its other inquiries.

Yours sincerely,

Professor Sir Ian Diamond

UK Statistics Authority written evidence submission to the Public Administration and Constitutional Affairs Committee inquiry into the UK’s evidence base

Dear William,

I write in response to the Committee’s call for evidence for its new inquiry, Transforming the UK’s Evidence Base. I very much welcome this inquiry with its focus on the future of data, statistics and analysis in government, as we in the UK Statistics Authority look to the future in a variety of ways.

As you will be aware, the Office for National Statistics (ONS) launched a consultation on the future of population and migration statistics in England and Wales in June. I enclose written evidence from the National Statistician, Sir Ian Diamond, within this submission, which highlights not only this consultation but also the progress of data sharing in government so far, how the Authority uses data ethically and protects users’ privacy, and how the ONS understands and responds to user needs.

Meanwhile, the Office for Statistics Regulation published its review into data sharing and linkage for the public good in July. I also attach written evidence from its Director General for Regulation, Ed Humpherson, within this submission, which discusses the findings from this report in more detail. In addition, the OSR will soon be launching a review of the Code of Practice for Statistics, seeking feedback to ensure the Code remains relevant for today’s world of data and statistics production. We will provide more detail to the Committee on this soon.

Sir Ian Diamond, Ed Humpherson and I stand ready to engage with the Committee to expand on any of these points if helpful, and indeed will follow all oral evidence sessions of the inquiry with interest.

Yours sincerely,

Sir Robert Chote

Chair, UK Statistics Authority

Office for National Statistics response

Data and analysis in government 

How are official statistics and analysis currently produced? 

  1. Official statistics are defined as those produced by organisations named in the Statistics and Registration Service Act 2007 (the 2007 Act) or in the Official Statistics Order (SI 878 of 2023). The Code of Practice for Statistics (the Code) sets the standards that producers of official statistics should follow. The Office for Statistics Regulation (OSR) sets this statutory Code, assesses compliance with the Code, and awards the National Statistics designation to official statistics that comply with the highest standards of the Code.
  2. The majority of official statistics are produced by statisticians operating under the umbrella of the Government Statistical Service (GSS), working in the Office for National Statistics, UK government departments and agencies, or one of the three devolved administrations in Northern Ireland, Scotland and Wales. Every public body with a significant GSS presence has its own designated Head of Profession for Statistics. Each of the devolved administrations has its own Chief Statistician. The Concordat on Statistics sets out an agreed framework for statistical collaboration between the UK Statistics Authority, the UK Government, and the Northern Ireland, Scottish and Welsh Governments.
  3. The Analysis Function brings together the 16,000 analysts across the analytical professions in government. Its strategic aim is to integrate analysis into all facets of government, building on the strengths of those professions. The Analysis Function supports analysis in government through capability building, sharing good practice, championing innovation, and building a strong analytical community. The National Statistician is head of the Analysis Function, as well as of the GSS. Each government department has a Departmental Director of Analysis, who is responsible for analytical standards in their department. The network of Departmental Directors of Analysis (DDANs) forms the leadership of the Analysis Function, driving and delivering the functional aims.
  4. A range of analytical techniques and various sources of evidence, combined or individually, including official statistics, can be used to provide insights into key questions for the public and decision makers. Such analytical processes and products are also supported by guidance such as the Green Book (appraisal of options), the Magenta Book (evaluation), and the Aqua Book (quality assurance), which set the highest standards for government analysis.
  5. Official statistics and analysis across government are currently produced in line with the Code and its three pillars: trustworthiness, quality, and value.

How successfully do Government Departments share data?

  1. Successful data sharing across government departments is critical to operating and transforming the statistical system and producing high quality, trustworthy, and valuable analyses. The importance of sharing, and linking, data and putting data at the heart of statistics is set out in our current consultation on the future of population and migration statistics in England and Wales.
  2. There are some good examples of effective data sharing across government. The COVID-19 pandemic illustrated the ability of government and public services to use and share data to help and protect people. When data are shared effectively, the speed at which analysis can be done means time-critical policy issues can be understood and addressed quickly. For example, we created the Public Health Data Asset (PHDA), a unique population-level dataset combining, at individual level, data from the 2011 Census, mortality data, primary care records, hospital records, vaccination data and Test and Trace data, which allowed us to link across these data sources to provide new insights.
  3. Cross-government networks and the Analysis Function have a critical role in the success of cross-departmental data sharing. For example, the Data Liaison Officer network, the National Situation Centre, the ONS and the Analysis Function recently collaborated to produce guidance on data sharing for crisis response. This built on the principles developed and used during the COVID-19 pandemic.
  4. However, there are challenges to building on this success and maintaining the momentum generated during the pandemic. Data sharing between departments continues to carry asymmetric risk, with the perceived risks – legal, operational or reputational – falling on the supplier or department sharing the data, while the benefits are diffuse across the system, or perceived to accrue to others. There are several common challenges to data sharing by suppliers, for example their level of risk appetite and differing interpretations of the law, their data preparation (and accordingly, data quality) and engineering capacity, and the governance within their own organisations.
  5. In terms of risk appetite, even when the environment is judged to be safe and secure by internal and external parties, there is still too much weight being placed on the risks of data sharing as opposed to the very real risk to the public of policy harm and loss of opportunity where valuable data is not being actively used and shared. The OSR notes this point in their report on data sharing and linking.
  6. Another challenge is that agreements to share data are often narrow: from one department to another for a specific purpose, for example a piece of analysis specific to a policy area or statistical output. It is often challenging to broaden these agreements, which limits the extent to which data can be reused or shared across multiple departmental boundaries. This creates inefficiency, where the value of data is not fully realised, and causes government departments to incur unnecessary and duplicative costs in implementing numerous bilateral arrangements, often with the same party.
  7. The level of data maturity across departments is also varied, which leads to a multitude of different approaches to, and interpretations of, agreeing data sharing, a wide range of people being involved in approving data ownership and stewardship, and a myriad of different templates to formalise agreements. This contributes to a complex and burdensome system, with long lead-in times to agree data shares. The issue is particularly acute when data are brought together and integrated from multiple departments, requiring different governance processes to be engaged each time a change is required.
  8. The ONS provides an ‘Acquisition Service’, which proactively supports data suppliers and collaborates to put in place mechanisms which support the sharing of data, reducing the burden on the supplier as far as possible. For example, this could include seconding analysts into a department, drafting Memoranda of Understanding on behalf of suppliers, and agreeing to undertake significant improvement of the data to make it of high and usable quality.
  9. The ONS is the lead delivery partner on behalf of government to deliver a cross government major programme, the Integrated Data Service (IDS), a Trusted Research Environment (TRE) which seeks to build on the success of the Secure Research Service (SRS), to bring together ready-to-use data to enable faster and wider collaborative analysis for the public good. The IDS intends to transform the way that data users share and access government data. Firstly, the IDS is a fully cloud-native system which will further enable connectivity across a federated data environment, reducing the friction caused by sharing data multiple times. Secondly, it will provide the facility to fully exploit the opportunities for safe and secure access to data provided for in the Digital Economy Act 2017 (DEA). Thirdly, it will apply a common linkage approach to enable analysts to join data from different departments, repeatedly, to meet diverse analytical requirements.
  10. The ambition is that the IDS will help overcome some of the existing challenges, costs and delays to effective data sharing across government. Its success, however, will depend on the extent to which government departments can embrace a common approach to sharing, stewarding, linking and accessing data.

How do other nations collect and produce statistics? 

  1. The UK is connected with other National Statistical Organisations (NSOs) on both a multilateral and bilateral basis, to learn and to share best practice. The UK is represented at the highest levels in multilateral fora such as the Organisation for Economic Co-operation and Development (OECD) and the United Nations Economic Commission for Europe (UNECE) and participates in a number of working groups advancing the use of administrative data and new forms of data. For example, we recently presented our work on nowcasting to the UNECE Group of Experts on National Accounts which gathered interest from other countries such as Austria, Canada, Indonesia and the United States.
  2. The UK chairs the UNECE Expert Group on Modernising Statistical Legislation, ensuring that statistical legislation and frameworks are equipped to deal with a changing data ecosystem. This has led to further work on data ethics, social acceptability, and access to privately held data. We also sit on the UNECE Task Force on Data Stewardship, which aims to develop a common understanding of the concept of ‘data stewardship’ and define the role of NSOs as data stewards.
  3. We are regularly contacted by other NSOs to share our experiences of using data science techniques to produce high-quality data in near real time to inform decision making. For example, the ONS recently hosted the German-led Future of Statistics Commission to share our experiences of using new data and new methodology to keep up with societal needs and respond efficiently to emerging crises. We have hosted colleagues from Statistics Finland to discuss the data landscape in the UK, and colleagues from Statistics New Zealand to discuss challenges faced with data collection. We have also shared experiences of our real-time economic indicator suite, particularly on new sources of data such as card data.
  4. The ONS has been involved with the World Health Organisation’s Pandemic Preparedness toolkit project. This project calls upon the Authority’s pandemic response experience and expertise to develop a toolkit containing practical guidance, statistical methods, knowledge products, case studies and training materials for other NSOs, particularly for data sharing.
  5. As part of the International Census Forum, the ONS has had ongoing conversations with the US Census Bureau, Statistics Canada, Australian Bureau of Statistics, Statistics New Zealand and CSO Ireland about the use of administrative data for population statistics. All participants have benefited from these conversations over the last few years, covering subjects such as census collection planning and efficiencies, quality assurance, processing improvements and contingency plans.
  6. The UK is a member of the UN Statistics Division (UNSD) Collaborative on the Use of Administrative Data for Statistics. This group of countries and regional and international agencies was convened by the UNSD and the Global Partnership for Sustainable Development Data (GPSDD), with the aim of strengthening countries’ capacity to use administrative data sources for statistical purposes, including replicating census variables.
  7. The Collaborative provides a platform to share resources, best practices and experiences. This includes a self-assessment tool, a draft toolkit for quality assessment of administrative data sources, and an inventory of resources which contains recommendations and practical examples on the use of administrative data in different contexts.

The changing data landscape

Is the age of the survey, and the decennial Census, over?

  1. The ONS’s vision is to improve its statistics so that they can respond more effectively to society’s rapidly changing needs. The ONS is proposing to create a sustainable system for producing essential, up-to-date statistics about the population. To do this, the system would primarily use administrative data like tax, benefit, health and border data, complemented by survey data and a wider range of data sources. This could radically improve the statistics that the ONS produces each year and could replace the current reliance on, and need for, a census every ten years.
  2. Producing high-quality, timely population statistics is essential to ensure people get the services and support they need, both within their communities and nationwide. Population statistics provide evidence for policies and public services, as well as helping businesses and investors to deliver economic growth across the country. It is important that these statistics are up to date and reliable, so that they can accurately reflect the needs of everyone in society. Currently, the census provides the backbone of these statistics, offering a rich picture of our society at national and local levels every ten years. Every year, the ONS brings together census data with survey and administrative data to reflect changes in society. As a result of this approach of ‘rolling forward’ estimates year-on-year, statistics become less accurate over the ten years between censuses and local detail on important topics becomes increasingly out of date. After each census, the previous decade’s mid-year population estimates are ‘rebased’ to ensure they are consistent with the baseline estimate from the new census, making them as accurate as possible (a simplified sketch of this roll-forward and rebasing cycle follows this list).
  3. There has also been a well-documented global trend of declining response rates to surveys and censuses, which can affect the representativeness of the data and, as a result, data quality. While Census 2021 in England and Wales enjoyed a high level of public engagement and response, it is an outlier against this wider trend affecting population censuses and social surveys across the world.
  4. Data collection is costly, and these costs can be elevated when the need arises to incentivise survey response (often monetarily), chase responses many times (particularly in the case of the census) or adjust collection operations. In recent examples where response rates to censuses have been below target, mitigations have included extending the deadline for responses, boosting communication campaigns, and making greater use of administrative data to enable the production of robust estimates, all of which can add to the cost.
  5. Building on recent advances in technology and statistical methods, and legislative facilitation of data-sharing across government for statistical purposes, the ONS has for several years been researching the use of administrative data as a primary source for meeting user needs for some statistics. For population statistics specifically, this work is responding to the Government’s ambition, set out in 2014, that censuses after 2021 “be conducted using other sources of data and [provide] more timely statistical information.”
  6. This research has shown that the ONS can produce population estimates with a more consistent level of detail and accuracy over time, and migration estimates based on observed travel patterns rather than respondents’ stated intentions, using administrative data to respond to the difficulties of estimating internal and international migration. The ONS has also developed methods for producing information about the population more often and more quickly. These methods will offer insights into our rapidly changing society as administrative data reach their full potential over the next decade.
  7. In June 2023, the ONS launched a consultation on its proposals for the future of population and migration statistics in England and Wales, responses to which will inform a recommendation to Government. The consultation’s proposals emphasise that surveys may continue to play an important role whilst ONS works with partners to widen and develop the range of administrative data sources that are collected and used. However, the ONS believes it has reached a point where a serious question can be asked about the role the census plays in its statistical system.
  8. If implemented, the proposed system would respond more effectively to society’s changing needs by giving users high-quality population statistics each year. It would also offer new and additional insights into the changes and movement of our population across different seasons or times of day. For many topics, it would provide much more local information not just once a decade but every year, exploring them in new detail and covering areas not recorded by the census, such as income. These are ambitious changes, and decisions on the next phase of this work will set the direction for the ONS’s work programme over the coming years, as the ONS continues to improve its population and migration statistics.
  9. It is worth noting that fast-paced, qualitative surveys have had, and will continue to have, a place in the statistical system. The Opinions and Lifestyle Survey and the Business Insights and Conditions Survey in particular illustrate the worth of flexible surveys, adapted from sources of rapid intelligence during the pandemic into useful tools for understanding current issues at pace, from the cost of living to the adoption of AI.
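As referenced in point 2 above, the sketch below illustrates the roll-forward and rebasing cycle in deliberately simplified form. All figures are hypothetical, and the ONS’s actual cohort component methods are considerably more sophisticated.

```python
# Roll a population estimate forward one year at a time using the basic
# demographic accounting identity, then compare with a new census baseline.
# All figures are hypothetical.

def roll_forward(population: float, births: float, deaths: float,
                 net_migration: float) -> float:
    """One year of the demographic accounting identity."""
    return population + births - deaths + net_migration

estimate = 56.0e6  # census-year baseline
for year in range(10):
    # Errors in the yearly components (especially migration) accumulate,
    # so the rolled-forward estimate can drift from the truth.
    estimate = roll_forward(estimate, births=0.70e6, deaths=0.60e6,
                            net_migration=0.25e6)

new_census_count = 59.2e6  # next census baseline
drift = estimate - new_census_count

# Rebasing: the previous decade's mid-year estimates are adjusted so the
# series is consistent with the new census baseline.
print(f"Rolled forward: {estimate / 1e6:.1f}m, "
      f"census: {new_census_count / 1e6:.1f}m, drift: {drift / 1e6:+.1f}m")
```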

What new sources of data are available to government statisticians and analysts? 

  1. The ONS has been a heavy user of census and survey data and, more recently, government administrative data. Given the new global data landscape, and the increasing society-wide and economy-wide digitisation, new massive, high-frequency data sources (‘Big Data’) are being generated, some proprietary and some openly available. These new data sources offer the opportunity for the ONS to be radical and ambitious in what data it uses – in line with the Authority’s strategy, Statistics for the Public Good – and how, to provide better, highly trusted information and statistics to the public, the private sector and the public sector (including government) at national and regional level, across a wide range of issues.
  2. A key part of the current consultation on the future of population and migration statistics seeks input on transforming population and migration statistics using alternative data, as opposed to predominantly survey-based sources. The ONS has now published a suite of evidence demonstrating the opportunity to use alternative data sources to deliver more timely and granular statistics, as well as providing value for money.
  3. To support statistical transformation and the strategy, the ONS works with hundreds of independent data suppliers, including central, devolved and local government and the private sector, to share data for public good research. The ONS’ most recent data transparency publication demonstrates the breadth of data sources containing personal data, and the broad opportunity to use novel data sources to support statistics.
  4. The ONS uses around 700 data sources, including health, tax, benefits, education and demographic information from other government departments and public bodies. We also work with a wide range of data from commercial providers including retail scanner data, where coverage is around 60-70% of the grocery market, alongside financial transaction data, domestic energy usage and others.
  5. A recent example of the ONS harnessing the power of new data sources is published data on UK Direct Debits, developed with our partners at Vocalink and Pay.UK. This new, fast and anonymised data source provides insight on consumer payments to most of the UK’s largest companies. It gives the ONS, for the first time, an opportunity to rapidly analyse price movements such as changes in the average Direct Debit amount for bills, subscriptions, loans, or mortgages as well as overall payment behaviour via failure to make these Direct Debit payments. This provides extremely useful and timely insights into the state of the UK economy and in time, could feed into wider national accounts estimates.
  6. In addition, new sources being acquired as part of the transformation of consumer statistics are already being incorporated into headline measures of the Consumer Prices Index (CPI). This has started with the incorporation of rail fares, enabling a far greater level of detail to be accessed within our published indices.
  7. Alongside the opportunities presented by novel data sources, there is also huge potential in the continued broader integration of data. Integrated data assets, made up of multiple constituent data sources, provide the potential for greater depth and breadth of research, and for the creation of insight which is not possible from analysing data sources in isolation.
  8. A good example of the value of integrated data is the ONS’ development of the PHDA, which has enabled the ONS to produce novel analyses including Ethnic contrasts in COVID-19 mortality and other high-impact pandemic-related statistics. The ONS is now using the PHDA to explore more indirect impacts of COVID-19 and wider non-COVID research questions. These include the impact of health on labour market participation, by linking in Department for Work and Pensions (DWP) and HM Revenue and Customs (HMRC) data.
  9. The ONS has also made use of ‘open’ data to better inform the public, for example using Automatic Identification System (AIS) shipping data to help monitor trade (shipping) flows. This data science work feeds into the regular monthly ONS publication Economic Activity and Social Change in the UK, Real-Time Indicators. We are looking at other forms of open data that can be used to produce statistical outputs and products that deliver statistics for the public good.
  10. As well as using linked data, the ONS brings multiple types of data together and uses advanced data science methods to deliver statistics for the public good. The ONS collects travel to work data from the census every ten years, most recently in 2021. Travel to work matrices show the movement of people from their home location (origin) to their place of work (destination) at an aggregated level. Information on travel to work provides a basis for transport planning, for example whether new public transport routes or changes to existing routes are needed. It also allows the measurement of the environmental impacts of commuting, for example traffic congestion and pollution, and how these might change over time because of changes in commuting modes, such as a shift from car to bicycle. On its own, the census yields travel to work matrices only for census years, at ten-year intervals, with no updates for the years in between. However, using Census 2011 travel to work data, Census 2021 population data, National Travel Survey data (collected by the Department for Transport (DfT)), National Trip Ends Model data (produced by DfT), and ONS geography products such as MSOA boundaries and Population Weighted Centroids, we can produce estimates of travel to work matrices at more regular intervals than once every ten years (a minimal sketch of this kind of matrix updating follows this list).
  11. The approach to integrated data is being taken further as part of the IDS, with data ‘indexed by default’, allowing common deidentified identifiers to be applied consistently in support of a common linkage approach. Data deidentified in this way can be grouped thematically to create Integrated Data Assets around the themes of health, levelling up and net zero. The value of this approach is that it retains the core value of the source data, while being supported by Privacy Enhancing Technology, and facilitates access to a much broader range of data, enabling analysts to greatly increase the value of their analysis.
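As referenced in point 10 above, one generic technique for updating an origin-destination matrix to newer margin totals is iterative proportional fitting. The sketch below illustrates that general technique with hypothetical data; it is not a description of the ONS’s actual model, which draws on the wider set of sources listed.

```python
import numpy as np

def ipf(seed, row_totals, col_totals, iters=100, tol=1e-9):
    """Scale a seed origin-destination matrix so its row and column sums
    match new margin totals (iterative proportional fitting)."""
    m = seed.astype(float).copy()
    for _ in range(iters):
        m *= (row_totals / m.sum(axis=1))[:, None]  # match origin totals
        m *= (col_totals / m.sum(axis=0))[None, :]  # match destination totals
        if np.allclose(m.sum(axis=1), row_totals, atol=tol):
            break
    return m

# Census-year travel-to-work counts for 3 origins x 3 destinations (hypothetical).
seed = np.array([[50, 30, 20],
                 [10, 70, 20],
                 [ 5, 15, 80]])

# Updated margins, e.g. from newer population data and trip-end models (hypothetical).
workers_by_origin = np.array([110.0, 95.0, 105.0])
jobs_by_destination = np.array([70.0, 120.0, 120.0])

updated_matrix = ipf(seed, workers_by_origin, jobs_by_destination)
```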

What are the strengths and weaknesses of new sources of data? 

  1. The use of new administrative and commercial data sources is transforming the way we produce statistics (a trend seen across the world). These sources provide timely, frequent and granular data about the population that would not be possible through survey collection alone. Through the linkage of different data sources, we can provide coverage across the population down to local levels of geography. Innovations in this area include a new approach to producing more timely and frequent high-quality estimates of the size and composition of the population down to local level, and of how this changes due to international migration. However, administrative data are not collected for statistical purposes and, as a result, there are both strengths and weaknesses to such data sources.
  2. Relevance and conceptual fit: the content of administrative data is determined by the services they support. This includes the topics collected, but also the precise definitions of the items that are measured. While surveys can be designed so that the questions they ask capture the statistical concepts we want to measure as accurately as possible, this is rarely possible for administrative sources, especially well-established ones. In practice, that means we need to adjust the data so it fits with statistical definitions using additional data from elsewhere (such as a survey) or can only approximate those definitions if adjustment is not possible. An important area to help with this is collaboration across government, to improve the collection of data in administrative systems, particularly around protected characteristics.
  3. Coverage: the strength of many of these large data sources is their granularity, which gives us the ability to analyse data for small groups or for small areas. This is not always possible with surveys, particularly surveys with small sample sizes. However, the coverage of most of these datasets will be incomplete. Parts of the population will not interact with the administrative source and will therefore be missed from the dataset; in other cases, the data may not cover everybody or everything. The power, from an analytical point of view, comes from linking different datasets together to improve the coverage and enable analysis at a local level. However, sometimes surveys are also needed to fill the coverage gaps. There can also be overcoverage in data sources, where individuals appear on the dataset who are not within the target population, for example emigrants and short-term residents who have recent administrative data activity but are either no longer resident or are not resident for sufficient time to meet the definition for inclusion in our estimates.
  4. Linkage: new sources of data need to be integrated to improve coverage and allow analysis. Often there are no unique identifiers to support this linkage, which adds complexity, time and cost to processing the data for analysis. The complexity of linkage without common unique identifiers means that it is never perfect. Moreover, the quality of linkage may vary across the population. Understanding and quantifying linkage quality is critical, as issues that arise (such as under-representation) will feed through into statistical analysis and may affect results if not properly mitigated (a minimal sketch of linkage without unique identifiers appears after this list).
  5. Timeliness, for both collection and delivery (although new data sources are often relatively more timely than survey data):
    • Collection lags: there is often a lag between an event occurring and the data for that event becoming available, for example moving address and then registering at a doctor’s surgery, or making a profit and then filing a tax return. Different datasets have different time lags. Real-time analysis is generally not yet possible from the new data sources; there is always some time lag.
    • Data delivery: delivery timescales can affect the timeliness of statistical and analytical outputs. Data take time to be processed and delivered to the statisticians who analyse them. Delivery can be delayed, or data may not be shared in line with analytical requirements, because of the nature of data sharing agreements. Within the ONS we are working with departments to build mature data transfer systems supported by robust Data Sharing Agreements (DSAs).
  6. Coherence, harmonised standards, and metadata: different operational policies lead to the associated administrative data being collected in different ways. One dataset may be at quite a high level, while others may hold more detailed information; even datasets that appear to collect information on the same metric may not be comparable, or not wholly comparable. This makes analysis more difficult and can mean that data are not available at the relevant level of granularity. Metadata and detailed information about the data collected are also often lacking, making the data more difficult to use. We are using secondments into departments to better understand the data and build the metadata, and thus improve this situation.
  7. Stability and accuracy: with survey data we have control over the questions asked and the stability of those questions. Administrative or commercial data can change, for example in what is collected and how, because of changes to the operational process. This is both a strength and a weakness: a strength, as it allows the data to adapt to changing requirements and needs; a weakness, as it brings the possibility of breaks in series, which is not ideal for statistical analysis of trends. An example of this was the removal of universal Child Benefit, which caused a big drop in the coverage of children in DWP/HMRC data. To future-proof statistics, we need to make sure that we are not reliant on just one source of data. In addition, accuracy will often depend on whether a variable is critical to the administrative function; when it is not, quality drops, as the data may be missing (if voluntary) or may not undergo robust checks on collection.
  8. Design: some administrative data collection processes were designed decades ago and rely on legacy IT to be analysed. This can also mean that the design of questions or forms does not fit new needs or requirements, and does not follow user-centred design principles, affecting the quality of the data collected. However, some of the major sources that we are currently using have undergone improvement in this area.
  9. Supplier restrictions: some suppliers place restrictions on the data that is shared, for example by applying techniques to the data to enhance privacy (such as hashing, perturbing or aggregating it). This can limit the usefulness of the data and constrain how it can be used within statistical outputs.
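As a concrete illustration of the linkage challenge described in paragraph 4 above, the sketch below links two datasets that lack a shared unique identifier by blocking on date of birth and comparing name similarity. It is a deliberately simple toy: real linkage methods (for example, probabilistic approaches in the Fellegi-Sunter tradition) are far more sophisticated, and all records and thresholds here are invented.

```python
# Minimal sketch of record linkage without a unique identifier:
# block on date of birth, then accept pairs whose names are similar.
# Records and the threshold are invented for illustration.
from difflib import SequenceMatcher

dataset_a = [{"name": "Jon Smith", "dob": "1980-02-01"},
             {"name": "Mary Jones", "dob": "1975-07-12"}]
dataset_b = [{"name": "John Smith", "dob": "1980-02-01"},
             {"name": "Maria Jones", "dob": "1990-01-01"}]

def name_similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.8  # accept a candidate link only above this similarity

for rec_a in dataset_a:
    for rec_b in dataset_b:
        if rec_a["dob"] == rec_b["dob"]:  # blocking step
            score = name_similarity(rec_a["name"], rec_b["name"])
            if score >= THRESHOLD:
                print(f"Linked: {rec_a['name']} <-> {rec_b['name']} "
                      f"(score {score:.2f})")
```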

Protecting privacy and acting ethically 

Who seeks to protect the privacy of UK citizens in the production of statistics and analysis? How? 

  1. All producers of official statistics are legal entities and data controllers in their own right, and are therefore responsible for protecting the data they hold and use. Data protection legislation, including the UK GDPR and the Data Protection Act 2018, provides the statutory framework that all producers of official statistics must adhere to, and makes specific reference to personal data processed for statistics and research purposes. The Information Commissioner’s Office (ICO) is the independent authority that ensures compliance with data protection legislation and upholds information rights in the public interest. As part of its role the ICO provides advice and guidance, including, for example, a Data Sharing Code of Practice.
  2. As the executive office of the Authority, the ONS collects and processes information, both directly from individuals and from other organisations, and does so using a variety of methods. The Authority has the statutory objective to promote and safeguard the production of official statistics that serve the public good. Any personal data collected by the Authority can only ever be used to produce statistics or undertake statistical research.
  3. In addition to data protection legislation, personal information held by the Authority is further protected by the 2007 Act, which makes disclosure of personal information a criminal offence, except in limited prescribed circumstances, for example where disclosure is required by law.
  4. The DEA provides the Authority with permissive and mandatory gateways to receive data from all public authorities, crown bodies and businesses. These data sharing powers can only be used for the statistical functions of the Authority and sharing can only take place if it is compliant with data protection legislation.
  5. The 2007 Act requires the Authority to produce and publish the Code, governing the production and publication of official statistics. One of the core principles of the Code concerns data governance, and states that organisations should ensure that personal information is kept safe and secure.
  6. The ONS provides guidance, support, and training on matters across the GSS, including on data protection and privacy.

Data protection

  1. The Authority has a dedicated Data Protection Officer and teams and colleagues that manage data protection and legal compliance and the security of data. These teams provide advice and guidance across the organisation on data protection matters; deliver training sessions on the protection of data; and engage regularly with the ICO.
  2. The Authority takes a data protection by design approach when processing data for statistical purposes. Privacy and data protection issues are considered at the design phase of systems or projects. The Authority has published extensive material regarding privacy for members of the public, including privacy information for those taking part in surveys and a Data Protection Policy. For new projects that involve the processing of personal data, colleagues are advised to complete Data Protection Impact Assessments, which enable the Authority to identify any risks that processing poses to data subjects and to mitigate those risks.

Statistical confidentiality

  1. The Authority collects a vast range of information from survey respondents, as well as administrative data, such as registration information on births, deaths and other vital events. The ONS publishes statistics and outputs from this information, and statistical disclosure control methods are applied so that the confidentiality of data subjects, including individuals, households and corporate bodies, is protected. All statistical outputs are checked for disclosure risk, and disclosure control techniques are applied as required (a minimal sketch of one such check appears after the Five Safes framework below).
  2. The DEA facilitates the linking and sharing of de-identified data by public authorities for accredited research purposes to support valuable new research insights about UK society and the economy. The Authority is the statutory accrediting body for the accreditation of processors, researchers and their projects under the DEA.
  3. The Authority allows access to de-identified data within its trusted research environments. To ensure the security of the data and individual privacy, the Authority uses the Five Safes Framework:
    • Safe People: trained and accredited researchers trusted to use data appropriately.
    • Safe Projects: data that are only used for valuable, ethical research that delivers clear public benefits.
    • Safe Settings: settings in which access to data is only possible using our secure technology systems.
    • Safe Data: data that have been de-identified.
    • Safe Outputs: all research outputs are checked to ensure they cannot identify data subjects.
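To make the disclosure checking mentioned in paragraph 1 above concrete, the sketch below applies the simplest form of statistical disclosure control: suppressing table cells whose counts fall below a minimum threshold. The threshold, data and suppression marker are purely illustrative; in practice a much broader toolkit (for example rounding and record swapping) is chosen to suit each output.

```python
# Minimal sketch of a threshold-based disclosure check: suppress cells
# with small counts before publication. Threshold and data are invented.
import pandas as pd

MIN_CELL_COUNT = 5  # hypothetical suppression threshold

table = pd.DataFrame(
    {"area_A": [120, 3, 48], "area_B": [97, 15, 2]},
    index=["group_1", "group_2", "group_3"],
)

# Replace small counts with a suppression marker before publication
safe_table = table.astype(object).mask(table < MIN_CELL_COUNT, other="[c]")
print(safe_table)
```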

Security controls

  1. The protection of data is a top priority for the Authority, and we implement and operate substantial security measures for our staff, data and services. This security focus ensures that the Authority operates and continues to develop secure options that meet its objectives for data use and maintains public trust in how we access, use, process, store, and make available data for statistics and research purposes.
  2. To ensure the confidentiality, integrity and availability of our data are protected at all times, we operate a security management framework which continuously evaluates the threat landscape and the security risks, and ensures that the appropriate controls are in place. This means we operate within corporate risk appetite, maintain a strong security posture and comply with the relevant legislation, codes of practice and industry best practice. This is underpinned by a robust secure by design approach, comprehensive protective monitoring, internal and external assurance and the training of our staff.

What does it mean to use data ethically, in the context of statistics and analysis? 

  1. The Authority owns a set of six ethical principles relating to the use of data for research and statistics. These principles cover: public good, confidentiality & data security, methods & quality, legal compliance, public views & engagement, and transparency. The production, maintenance and review of these principles are conducted by the National Statistician’s Data Ethics Advisory Committee (NSDEC). The NSDEC was established to advise the National Statistician that the access, use and sharing of public data, for research and statistical purposes, is ethical and for the public good. The NSDEC consider project and policy proposals, which make use of innovative and novel data, from the ONS, the GSS and beyond, and advise the National Statistician on the ethical appropriateness of these. The NSDEC meet quarterly and have a key role in ensuring transparency around the access, use and sharing of data for statistical purposes.
  2. In 2021 the Centre for Applied Data Ethics (CADE) was established within the Authority. CADE provides practical support and thought leadership in the application of data ethics by the research and statistical community. The Centre provides a world-leading resource that addresses the current and emerging needs of user communities, collaborating with partners in the UK and internationally to develop user-friendly, practical guidance, training and advice in the effective use of data for the public good. In addition to providing the secretariat to the NSDEC, CADE mobilises the Authority’s six ethical principles via a self-assessment tool that is available to the entire research and statistics system. This tool supports researchers and analysts to identify ethical concerns in their work and then to engage with CADE to ensure mitigations and solutions are in place. Since January 2021, the tool has supported nearly 900 pieces of ethical research and statistics, and its use is growing by hundreds of projects a year.
  3. Complementing the independent advice and guidance of the NSDEC and the self-assessment ethics services of CADE, the Centre also produces several bespoke ethics guidance pieces each year. These guidance pieces are typically produced in collaboration with an area of the ONS or the wider statistical system and focus on key concerns, such as identifying and promoting public good, considering public views and engagement, and specific ethical considerations in inclusive data, machine learning and geospatial data. Finally, CADE also offers bespoke ethics support for specific projects, workstreams and teams, and tailors its services to one-off events and longitudinal engagement work. This includes our international development programme, through which we support various other National Statistical Institutes.
  4. The focus of the CADE’s activities is to ensure that the Authority’s ethical principles are promoted and accessible and that tools to ensure the principles are put into practice are effective and easy to use. We achieve this through promoting CADE at internal and external events, providing secretariat to the independent NSDEC, operating and providing oversight of the CADE self-assessment tool, producing specific, collaborative guidance pieces and providing bespoke ethics advice and support. By engaging with the CADE, researchers and analysts can ensure ethical practice, in-line with the Authority’s ethical principles, in the production of research and statistics.

Are current processes and protections sufficient? 

  1. The Authority has well-established processes and procedures in place to ensure the protection of the data of UK citizens. As the statistical and analytical landscape around data changes, as it did during the COVID-19 pandemic, the Authority ensures that it remains up to date with any changes to privacy legislation, regulatory guidance or cross-government good practice that could affect data subjects. This ensures a robust statistical system that produces public-good statistics that are trusted by the public.
  2. The ONS security management framework incorporates and references appropriate recognised security standards and guidance from within government (the Cabinet Office, the National Cyber Security Centre (NCSC) and the Centre for the Protection of National Infrastructure (CPNI)), as well as international standards and best practice from organisations including ISO (ISO 27001), the US National Institute of Standards and Technology (NIST) and the Information Security Forum (ISF).
  3. From a data ethics perspective, there are dozens of organisations in the statistical system that display their ethical framework and commit to using it. CADE goes beyond this by evidencing, transparently, the impact that engaging with CADE has on the production of research and statistics: numerically, through the number of projects that CADE and the NSDEC see each year, and in more detail, through the production of case studies, publicly displayed meeting minutes and audits of projects that have been signed off. Where researchers and analysts engage with CADE and its services, ethical practice can be assured and evidenced.

Understanding and responding to evolving user needs

Who should official data and analyses serve? 

  1. Our data, statistics and analysis serve the public through our statutory duty to “promote and safeguard the production and publication of official statistics that serve the public good” (as set out in the 2007 Act). Everyone is a user or potential user of our statistics and can use data to inform their decision making: from policy makers to enquiring citizens, including local businesses, charities and community groups.
  2. Within the ONS, we have established an Engagement Hub to enable us to coordinate our engagement with users, understand user needs, reach new audiences and evaluate our engagement.
  3. Users are at the heart of everything we do. When identifying priorities for analysis, we do so through:
    1. discussions with other government departments and the devolved administrations.
    2. local engagement: our new ONS Local service works with analytical communities locally and with the wider civic society to support further analysis to target local questions to address local issues.
    3. citizen focus groups with members of the public and the ONS Assembly with charities and bodies representing the interests of underrepresented groups of the population.
    4. drawing on external advice, for example the National Statistician’s Expert User Advisory Committee (who advise on cross-cutting issues).
    5. regular engagement activities with businesses and third sector organisations.
  4. A good example of the ONS reflecting the needs of users is the COVID-19 Latest Insights Tool, developed so that members of the public could find reliable, easy to understand information about the COVID-19 pandemic in one place. We engaged in user testing at key stages to make sure it met user need, and it became the most widely read product in the history of the ONS website.
  5. We also undertake annual stakeholder deep-dive research, which explores stakeholder needs, and a stakeholder satisfaction survey, which supports evaluating progress against our strategic objectives.
  6. According to the Public Confidence in Official Statistics (PCOS) 2021 report, a very high proportion of respondents trusted the ONS (89% of those able to express a view) and its statistics (87%). This was very encouraging to see. The survey also asked respondents about their level of trust in the ONS compared to other institutions in British public life. Of the institutions listed, the ONS had the highest levels of trust, similar to those of the Bank of England and the courts system.
  7. In terms of analysis, the Analysis Function strategy explains how we bring analysts across government together to deliver better outcomes for the public by providing the best analysis to inform decision making. The Function serves policy makers across Government and has regular conversations with stakeholders to ensure that our data, statistics and analyses are relevant to public policy priorities and delivered in a timely way.
  8. Our Analytical Hub aims to provide the capability and capacity to deliver rapid cross-cutting analysis that supports government, civil society and the public to understand the key questions of the day, responding flexibly and in a timely fashion to ongoing economic and public policy priorities.

How are demands for data changing?   

  1. Changes in society, technology and legislation mean that more data are available, in richer and more complex forms, than ever before, with the COVID-19 pandemic shifting users’ expectations towards receiving more insights more rapidly. The pandemic and its impact on our society and the economy have led to more complex questions, which means that needs for data are accompanied by growing needs for expertise and support to use data that reflect the intersectional nature of policy enquiry. Our statistics need to be quick, relevant, trusted and reliable to withstand public prominence and scrutiny, respond to a rapidly changing environment, and inform critical policy.
  2. The ONS aims to respond to the needs of the public, decision makers and society – including providing data and insight on the topics and priorities of the day. We have already evolved our approach to respond to increasing demand for data, statistics and analysis to be:
    • More timely, through more rapid surveys, such as the Opinions and Lifestyle Survey, and the use of new data sources, like financial transaction and mobility data.
    • More local, with the production of more granular and hyper-local data, which allows users to build up the bespoke geographies that matter to them, such as Gross Value Added (GVA) at lower super output area level, and with greater support for local users and decision makers through our ONS Local service.
    • More inclusive, through making our data more accessible and reflective of our users, allowing people to see themselves in our statistics and analysis, for example through our shopping prices comparison tool; and
    • More relevant, both in terms of topics, for example looking beyond Gross Domestic Product (GDP) to consider multi-dimensional wellbeing alongside improved measures of economic performance, and in terms of how we disseminate our data and statistics, for example through application programming interfaces (APIs), to empower users to do their own analysis.
  3. We will continue to build on this progress as demands change, for example through the increasing availability and evolving possibilities for artificial intelligence (AI). We did this particularly well during the pandemic, and have since focused on new policy priorities, such as the rising cost of living, the changing nature of the labour market and the experiences of Ukrainian nationals arriving in the UK having been displaced through the conflict.
  4. As well as responding to emerging issues, we are making it easier for the public to find and consume insights on topics of interest by pulling together our many different datasets in the form of dashboards and data explorers. These include the Health Index, which brings together different datasets at local levels; subnational data explorers covering the economy, society, environment and more across local areas; the new UK measures of wellbeing, providing 60 indicators across 10 domains; and the latest data and insights on the cost of living. In addition, data collected through the 2021 Census are being made available through our flexible table builder, articles of interest and interactive maps.
  5. Demands for data are also changing among expert users, with the rise of big data and the need for more data linkage across Government. This is why we are investing in the IDS to provide a secure environment for trusted researchers to analyse new, granular and timely data sources for the public good.

How do users of official statistics and analysis wish to access data?  

  1. We have developed a deep understanding of our diverse user audiences and their unique needs and requirements. We have grouped website users into five persona groups:
    • Technical User: Someone who only wants data and will create their own datasets and customise their own geography boundaries. Data from the ONS are frequently used in conjunction with data from other government departments. They may be expert at what they do with statistics but can be less expert at looking for base data. There is not the urgency we see from the expert analyst. They do not tend to use written publications.
    • Expert Analyst: Someone who creates their own analysis from data. This user downloads spreadsheets into their own statistical models to create personal datasets.
      Access to the data for analysis is more important to them than its presentation.
    • Policy Influencer: Someone who uses data for benchmarking and comparison. For some policy influencers, this requires data and analysis at a regional or local level. They rely on official government statistics, trusted by decision makers, for their reports.
    • Information Forager: Someone who wants local data and keeps up to date with the latest economic and population trends to help them make practical, strategic business decisions. They often do not know exactly what to search for, until they come to it.
    • Inquiring Citizen: Infrequent visitors to our site who search for unbiased facts about topical issues. They want simply worded, visually engaging summaries, charts and infographics. Data can help them make informed decisions about pensions and investments. They engage on social media and browse with smartphones or tablets.
  2. We have found that citizen-type users want ways to get data on their local area or to fact-check data using interactive tools, summaries, dashboards, visualisations and maps, whereas more data-literate users are interested in the data itself and the associated metadata and methodology. Technically advanced analysts are also interested in being able to access data via APIs and in data being easy to use in tools such as Python and R; these technical users prefer not to have heavily formatted Excel spreadsheets with multiple tabs (a minimal sketch of API access appears at the end of this section).
  3. We know many of our local users are keen to understand a place across many topics, rather than going to several publications and multiple datasets for a single figure. As such, we are developing our Explore Subnational Statistics service, which will allow users to select a geography and see metrics across a range of themes. ONS Local also helps local government users to bring together evidence across their area, alongside local intelligence, data and analysis, to create greater insights.
  4. Our search engine optimisation strategy recognises that not all users need to or want to come to our website and that Google and the major search engines often represent our data directly in their search results. This is particularly applicable to those with accessibility needs, since many of these ways of representing our data can be returned via voice search on a variety of platforms.
  5. Citizen users want data communicated in a way that is easy for them to digest and research has shown that there is a degree of education required about our key topics such as inflation and GDP. Users may also be interested in their local areas as much as national level data.
  6. Within the ONS Engagement hub, there are dedicated teams focussing on building relationships with different audience segments. The External Affairs team supports stakeholder engagement with key government stakeholders, business and industry groups, consumer bodies and think-tanks. The dedicated Outreach and Engagement team is focused on engaging with local authorities, and building sustainable relationships with community and faith groups, voluntary sector organisations and others representing the interests of those audiences traditionally less well represented in official statistics and government data.
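For the technical users described above, API access typically looks like the sketch below: a scripted HTTP request whose response feeds straight into an analysis tool. The URL, endpoint and response structure here are placeholders for illustration, not documented ONS endpoints; the ONS developer pages describe the actual API.

```python
# Minimal sketch of programmatic access to statistics over HTTP.
# The base URL and response fields are placeholders, not a real API.
import pandas as pd
import requests

BASE_URL = "https://api.example-statistics.gov.uk"  # placeholder

response = requests.get(
    f"{BASE_URL}/datasets/example-dataset/observations",
    params={"time": "2023-06"},
    timeout=30,
)
response.raise_for_status()

# Load the observations into pandas for downstream analysis
observations = pd.DataFrame(response.json().get("observations", []))
print(observations.head())
```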

How can we ensure that official data and analyses have impact? 

  1. Ensuring that our work focuses on the topics that matter most to people, and that it is disseminated in a way that is easy to understand, engaging and relevant to the audience, is key to achieving impact.
  2. For example, we worked with colleagues across government to establish new data collection on Ukrainian refugees and visa sponsors in the UK. This allowed us to provide invaluable insights on the experience of Ukrainians coming to the UK, and on the impact on service provision in local areas. Publishing this work in both English and Ukrainian supported policy decisions around the humanitarian response and gave those affected the ability to read the findings in their own language.
  3. On the cost of living, we delivered a broad information, engagement and communications programme that included promoting cost of living insights and data products to a wide range of users; diversifying engagement with non-expert users (for example, 99 civil society and community groups attended a session on how they could benefit from our insights); and seeking user feedback to further improve our cost of living products and analysis. The impact of this includes a continued increase in the use of our insights tool, with the personal inflation calculator being embedded into the Guardian and BBC websites and the shopping prices comparison tool reaching over 700,000 uses in its first week.
  4. In June we launched a public consultation on the future of population and migration statistics in England and Wales. We engaged extensively with stakeholders before the consultation launch through sector specific round tables. The consultation launch itself was widely promoted across stakeholders in all sectors and around 500 people attended launch events and webinars. Engagement will continue throughout the consultation period to maximise awareness, understanding and response.
  5. We regularly review the impact of our work, providing quarterly impact reports to the Strategic Outputs Committee (formerly the Analysis and Evaluation Committee) and deep dives on priority topics.
  6. The ONS has access to a number of metrics that can be used to assess the impact of our outputs, specifically reach and awareness to understand the importance and relevance of the insight and content. We also test engagement levels to understand how well our content performs to achieve cut-through and add value to a debate. In addition, we use our own surveys to test appetite for future topics and outputs.
    1. Reach / awareness – to test importance and relevance of topic and insights.
      • Web page views: unique sessions a page was viewed at least once, within seven days.
      • Social media impressions: number of views per post on social media platforms, within 48 hours.
      • Print, digital, broadcast: number of views / listens of ONS insight from outputs.
    2. Engagement – to test content cut-through, clarity of messages and the value added to a debate or discussion.
      • Time spent on web page: time users spent viewing a specified page or screen, taken after seven days.
      • Social media engagement: shares, favourites, replies and comments, URL click throughs, hashtag clicks, mention clicks and media views, taken after 48 hours.
      • Print, digital, broadcast: cut through of ONS comments / main points within coverage.
      • An online pop-up survey for targeted releases to test user satisfaction and help shape continued improvements.
      • An annual stakeholder survey and in-depth interviews, targeted at government departments, charities, public institutions and businesses, test satisfaction and use of statistics and analysis, and future needs.
  7. We bring these insights together, alongside granular stakeholder engagement, to understand the impact of our work at both topic level and individual-output level, supporting ongoing decision making about what we focus on and how we can best maximise our impact.

User engagement

  1. User engagement is key to making an impact. Our User Engagement Strategy for Statistics promotes a theme-based approach to user engagement. This allows all users of government data and statistics to interact with the GSS by their area of interest or by cross-cutting theme. This approach also aims to support collaboration with producers of official statistics to develop work programmes, address data gaps and help improve GSS products and services.
  2. We have created the ONS Assembly to support regular dialogue and feedback on delivering inclusive data with charities and bodies representing the interests of underrepresented groups of the population. The Assembly aims to be:
    • A forum for the ONS to engage and have an open dialogue with charities and member bodies on a range of key topics.
    • A space to build trusted, long-lasting relationships between members and the ONS.
    • An opportunity for members to share insight, advice and feedback on behalf of their interests and audiences.
    • A space to exchange news and move collaboratively toward the future of data.
    • A route to help ensure vital themes, such as inclusivity, accessibility and wellbeing, are fully explored.
  3. Alongside working with users in local government, wider civic society and the public, we build and maintain strong relationships with key policy makers in central government. These relationships with both local and central policy makers allow the ONS to understand the challenges they face. We can then help build their understanding of our statistics and analysis, and the wider evidence base, enabling greater insight into the topics that matter to our users and maximising the ONS’s impact on decisions that affect the whole country.

Communication

  1. The way we communicate our statistics has much improved in recent years, with a direct influence on our impact. Statisticians speaking directly to the public via television and radio helps the transparent communication of statistics, assisted by our amendment to publication times, which ensures parity of communication (from 26 March 2020 we moved release times for market sensitive releases to 7:00am, rather than 9:30am, as agreed with OSR).
  2. This includes 226 broadcast media interviews undertaken by ONS spokespeople during the 2022/23 financial year, generating an average of 2.5k pieces of quoted coverage in the broadcast and online media each month, as well as our solid presence on Twitter (354.2k followers), which achieves good engagement and reach relative to comparable organisations, with threads created to support outputs and to respond to specific trends on social media.
  3. The ONS’s ‘Statistically Speaking’ podcast takes a deep dive into hot data topics and explains what’s behind the numbers. Between April 2022 and March 2023, the podcast had more than 18k downloads, including our most popular episode, ‘The R Word: Decoding ‘recession’ and looking beyond GDP’, which achieved 1,891 downloads in its first 30 days. In total since the podcast started in January 2022, it has achieved almost 30k unique downloads.

Dissemination

  1. Our approach to dissemination plays a pivotal role in maximising the insight and impact our data have. A prime example is our award-winning Census dissemination portfolio, with our Census maps offering users the ability to explore spatial patterns down to neighbourhood level, empowering planners and policymakers to precisely target interventions, and enabling any user to better understand their community. Since then, we have developed interactive and highly localised content to encourage audiences to engage with the more granular data, producing data visualisation tools and innovative content so citizens can explore the data that are important to them. Users responded positively, saying the products were “visually excellent” and “personalisable, visual and really well presented”.
  2. To promote widespread reuse of our insights and thus amplify their reach and impact, we designed tools to encourage users to embed custom views in their websites and publications. The results have been remarkable, with Census maps accounting for an impressive 24% of total views on the ONS site, garnering around 30 million views since its launch in November 2022.
  3. We also released our custom area profiles product, recognising that user needs often extend beyond predefined geographic areas. Users can now draw specific areas of interest and generate tailored profiles with indicators and comparisons that match their unique use cases. The outputs are also exportable for use in websites and presentations, and the product has already reached tens of thousands of users, bridging gaps in specialised expertise.
  4. To cater to time-constrained users and unlock the potential of data hidden in spreadsheets, we introduced semi-automated localised reporting. With algorithms generating approximately 350 customised reports, one for each local authority, key trends in respective areas are efficiently explained, making insight more accessible and impactful. These reports are extensively accessed and widely referenced by local authorities and other local users.
  5. Additionally, we enhanced the reach of these reports by making their content crawlable by search engines. Snippets from these reports now appear directly in search results and voice-based queries, further bolstering the impact of our data and analyses.
  6. In tandem with our work on the Census, we aim to transform the presentation of the ONS’s day-to-day insights, with a particular focus on enhancing our offering for the general public. We have a digital content team of data visualisation specialists, data journalists and designers focused on collaborating with analytical teams to achieve this.
  7. This approach centres on addressing the most pertinent questions for our users, often focussed on creating more personalised and localised experiences. By empowering users to see themselves within our data, we establish a meaningful connection with our audience.
  8. Through these collaborations with analytical teams, we are reaching a much-expanded user base, with audiences engaging with this content 40% longer than with typical offerings. Our commitment to delivering impactful, accessible, data-driven insights ensures our offerings resonate with diverse audiences and leave a lasting impression.
  9. We gather a range of feedback on our digital products to develop and improve the usability of our content, including our interactive online content and tools. We also analyse how different user groups access our content, and the needs of users of data, users of statistics and trends, and those who want a deeper understanding of topics.
  10. There were 6.4m users of the ONS website in 2022/23. Most users (4.3m) used a mobile device to access the website, with 1.9m on desktop and 0.2m on tablets; engagement levels remain highest with desktop users. There were nearly 26m pageviews on ONS.gov.uk over 2022/23. These figures only represent users who accept cookies, whom we estimate to be approximately 30% of all users.
  11. Peak demand on the website was driven by census releases, with 8x higher daily page views on 29 November for the ethnic group, national identity, language and religion census data than average, and nearly 6x higher daily page views for the demography and migration releases for census on 2 November. The census first release on 28 June saw roughly double the daily average for the year.
  12. The most popular topics on ONS.gov.uk across the year were census (3.6m pageviews), covid (3.6m pageviews) and inflation (3.2m pageviews).

Analysis

  1. Ensuring that analysis is good quality will also help ensure that it has impact. The Analysis Function Standard sets expectations for the planning and undertaking of analysis to support well-informed decision making. It provides clear direction and guidance for all users and producers of government analysis.
  2. The Analysis Function also shares best practice through the Analysis in Government awards, which include an impact award.
  3. Maximising the impact of analysis across government and for the ONS lies in understanding the priorities of the day, both for citizens and for decision makers at the heart of local and central government, and in flexing at pace as new priorities emerge. This often means the evidence base is less robust or the data do not exist, but the ONS’s Analytical Hub is constantly adapting to produce the best analysis at pace to support decision making. The ONS also scans the horizon to anticipate emerging issues.

How do we ensure that users, in the Civil Service, Parliament and beyond, have the skills they need to make effective use of data? 

  1. There are a range of initiatives aimed at improving the analytical skills of civil servants and beyond.

The Analysis Function

  1. The Analysis Function is a network for all civil servants working in government analysis and aims to help improve the capability of all analysts across government. The Analysis Function website hosts the dedicated analysis function curriculum webpages alongside a range of technical analytical learning for all, as well as a guidance hub providing access to key analysis guidance. The Function also hosts regular information sharing events and webinars.
  2. The Function works with the policy profession and other teams across government to ensure we are building analytical capability specifically for non-analysts. The Analysis Function has also developed a learning pathway specifically for non-analysts, in line with wider government reform priorities.
  3. The Analysis Function conducted a review in 2022 of the analytical capability of policy officials. Since then, we have been working closely with the policy profession unit through a dedicated implementation working group to address the recommendations from the review. Progress has been made against several actions, including the launch of the analytical literacy course, the data masterclass and the policy-to-delivery pilot.

The Methodology Advisory Service

  1. The Methodology Advisory Service (MAS), based within the ONS, offers advice, guidance and support for the public sector, nationally and internationally, using teams of experts covering different areas of statistical and survey methodology. We offer an advisory service for:
    • methodological advice on production and analysis of data
    • development of surveys or outputs
    • feasibility studies
    • methodological research to answer complex problems
    • quality assurance of methods or outputs
    • cross-cutting reviews of processes and methods across a department’s statistical work
    • evaluation of competing sources
    • health checks before an OSR assessment
  2. The ONS’s methodologists and researchers receive their own methodological advice from the Methodological Assurance Review Panel (MARP), which provides external, independent assurance and guidance on the statistical methodology underpinning ONS statistical production and research.

The Data Science Campus

  1. The Data Science Campus is at the heart of leading-edge data science capacity building with public sector bodies in the UK and abroad. We equip analysts with the latest tools and techniques, giving them the capability to perform effectively in their roles. We also work in partnership with organisations to ensure they have the capacity to develop their own data science skills in the long-term.
  2. Our evolving range of programmes reflects our focus on using data to drive innovation for the public good, and provides analysts across the ONS, the UK public sector and international partners with a developmental framework to build capacity and enhance analytical capability:
    • Data Science Accelerator
    • Data Science Graduate Programme
    • Degree Data Science Apprenticeship
    • Masters in Data Analytics for Government
    • Cross-government and Public Sector Data Science Community

ONS Local

  1. The ONS Local service provides peer-to-peer forums and platforms for local, regional and national analytical communities to share best practice, and helps local users navigate the extensive subnational offer from the ONS, both what is already available and what is in development, as well as wider UK government data. For example, “ONS Local Presents” webinars allow ONS teams and analysts from local or central government to present analysis on a topic to a wide audience for feedback and challenge, or to showcase innovation in techniques or data that may be useful to others. We have also held the first in a series of “ONS Local How to” workshops, aimed at a similar audience and run jointly with the Data Science Campus, to help local government analysts create dashboards and use APIs.

ONS Outreach and Engagement

  1. Finally, the ONS Outreach and Engagement Team is piloting and developing a programme of online engagement activities to help improve data literacy among underrepresented groups, non-expert users and those less likely to engage with data. The sessions vary in topic across the range of statistical production and collection themes at the ONS and include a range of engagement formats. Topics and activities so far have included an introduction to the ONS, census webinars, Q&As on how to use census data, and show and tells, demonstrations or learn-ins on data tools such as the Cost of Living Insights Tool, Census Maps Tool and Build a Custom Data Set Tool.
  2. These sessions can be tailored to the audience, including civil service colleagues who may be less confident or engaged with data, and aim to improve awareness and understanding of the foundations of data use and production.

Professor Sir Ian Diamond, National Statistician

Office for National Statistics

August 2023

Office for Statistics Regulation response

Introduction

About us

  1. The Office for Statistics Regulation (OSR) is the independent regulatory arm of the UK Statistics Authority. In line with the Statistics and Registration Service Act (2007), our principal roles are to:
    • set the statutory Code of Practice for Statistics (the Code).
    • assess compliance with the Code to ensure statistics serve the public, in line with the pillars of Trustworthiness, Quality and Value. We do this through our regulatory work that includes assessments, systemic reviews, compliance checks and casework.
    • award the National Statistics designation to official statistics that comply fully with the Code.
    • report any concerns on the quality, good practice and comprehensiveness of official statistics.
  2. While our formal remit covers official statistics, we also encourage organisations to voluntarily apply the Code to demonstrate their commitment to trustworthy, high quality and valuable statistics. Our 5-year plan sets out our vision and priorities for 2020-2025 and how we will contribute to fostering the Authority’s ambitions for the statistics system. Our annual business plan shares our focus for the current year.

Data and analysis in Government

How successfully do Government Departments share data? 

  1. For the last five years, OSR has been monitoring and commenting on data sharing and linkage across government, producing reports to understand issues and identify opportunities to move the wider system forward. We are an advocate and a champion for data sharing and linkage, when this is done in a secure way that maintains public trust. It is our ambition that sharing and linking datasets, and using them for research and evaluation, will become the norm across the UK statistical system.
  2. Our latest data sharing and linkage report takes stock of data sharing and linkage across government. There has been some excellent progress in creating linked datasets and making them available for research, analysis and statistics.
    • The Office for National Statistics (ONS) recently published statistics on sociodemographic inequalities in suicides, which linked demographic and socioeconomic data about individuals from the 2011 Census with death registration data and, for the first time, was able to show estimates of rates of suicide across a wide range of different demographic groups. The ONS believes this analysis will support the development of more effective suicide prevention strategies.
    • Data First aims to unlock the potential of Ministry of Justice (MoJ) data by linking administrative datasets from across the justice system and enabling accredited researchers, from within government and academia, to access the data. Data First is also enhancing the linking of justice data with data from other government departments, such as the Department for Education (DfE), where linking data has unlocked a wealth of information for researchers about young people who interact with the criminal justice system.
    • BOLD, led by the MoJ, is a three-year cross-government data-linking programme which aims to improve the connectedness of government data in England and Wales. It was created to demonstrate how people with complex needs can be better supported by linking and improving the government data held on them in a safe and secure way.
  3. Our report highlights an emerging theme on the overall willingness to share and link data across government and public bodies. The benefits and value of doing so are widely recognised, with the COVID-19 pandemic helping to change mindsets and highlight opportunities that exist for greater collaboration and sharing.
  4. However, through speaking with stakeholders across the data sharing and linkage landscape during our review, we also found there is still uncertainty about how to share and link data in a legal and ethical way, and about public perception of data sharing and linkage. There is also a lack of clarity about data access processes and data availability and standards across government. Together, these factors can lead to a nervousness to share and link data, which can cause blockages or delays.
  5. The picture is not the same in every area of government. Some areas have moved faster than others and we have found that culture and people are key determinants of progress.
  6. In the report, we summarise and discuss our findings within four themes in the context of both barriers and opportunities:
    1. Public engagement and social licence: The importance of obtaining a social licence for data sharing and linkage and how public engagement can help build understanding of whether/how much social licence exists and how it could be strengthened. We also explore the role data security plays here.
    2. People: The risk appetite and leadership of key decision makers, and the skills and availability of staff.
    3. Processes: The non-technical processes that govern how data sharing and linkage happens across government.
    4. Technical: The technical specifics of datasets, as well as the infrastructure to support data sharing and linkage.
  7. Overall, data sharing and linkage in government stands at a crossroads. Great work has been done and there is the potential to build on this. However, there is also the possibility that, should current barriers not be resolved, progress will be lost.
  8. Our review makes 16 recommendations that, if realised, will enable government to confront ingrained challenges, and ultimately to move towards greater data sharing and linkage for the public good. Following the report, OSR will be following up with those organisations mentioned in our recommendations to monitor how they are being taken forward.

The changing data landscape

Is the age of the survey, and the decennial Census, over?

  1. Statistics producers are increasingly turning to alternative data sources in the production of official statistics, in light of challenges with survey data collection and increased recognition of the potential of alternative data sources. Administrative data (that is, data that are primarily collected for administrative or operational purposes) are increasingly used to produce official statistics across a range of topics including health, such as waiting times data; crime, such as police recorded crime data; and international migration, such as borders and immigration data. Challenges faced during the COVID-19 pandemic highlighted society’s need for timely statistics and further demonstrated the potential of administrative data.
  2. However, such methods are unlikely to be able to capture all aspects of our population and society and therefore surveys are likely to play an ongoing but changing role in the statistical system. For instance, many crimes are not reported to the police, and data quality for some crime types is poor, so users cannot rely exclusively on administrative datasets of police recorded crime. To get a full picture of crime, both police recorded crime and the Crime Survey for England and Wales will always need to be used alongside each other.
  3. Moreover, there is strong interest in opinion and perception data, such as the successful ONS Business Insights and Confidence Survey. Our Visibility, Vulnerability and Voice report on statistics on children and young people also demonstrated the strong user demand for, and importance of, data that include children’s voices about their experiences and that see the child holistically. These insights would not be available through administrative sources.

What new sources of data are available to government statisticians and analysts? 

  1. We highlight in our State of the Statistical System 2022/23 report that the increasing availability of new data sources, such as administrative data and management information, and the growing use of artificial intelligence should be seen as opportunities for the statistical system.
  2. Administrative data are helping to provide new insights and improve the quality of statistics. For example, the Department for Work and Pensions (DWP) is exploring the integration of administrative data into the Family Resources Survey (FRS) and related outputs through its FRS Administrative Data Transformation Project.
  3. The ONS has developed experimental measures of inflation using new data sources, including scanner and web-scraped data, publishing experimental analysis using web-scraped data to look at the lowest-cost grocery items. Their Consumer Prices Development Plan details the new sources of data that can be used and the insights they can bring (a minimal sketch of the lowest-cost item idea appears at the end of this section).
  4. Technology can also provide opportunities to collect data in different ways, such as DfE pupil attendance data that are automatically submitted from participating schools’ management systems, allowing for more timely analysis of attendance in schools in England. This data collection won the RSS Campion Award for Excellence in Official Statistics.
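To show the intuition behind the lowest-cost grocery analysis mentioned in paragraph 3 above, the sketch below tracks the cheapest available price for each product across two months and averages the price relatives. Prices and products are invented, and the ONS's actual experimental methodology is considerably more elaborate.

```python
# Minimal sketch of a lowest-cost items price comparison. Prices are
# invented; this is not the ONS's experimental methodology.
import pandas as pd

prices = pd.DataFrame({
    "month":   ["2023-01"] * 3 + ["2023-02"] * 3,
    "product": ["pasta", "pasta", "rice", "pasta", "pasta", "rice"],
    "price":   [0.45, 0.60, 0.90, 0.50, 0.65, 0.95],
})

# Cheapest observed price per product in each month
cheapest = prices.groupby(["month", "product"])["price"].min().unstack()

# Average price relative between the two months (January = 100)
index = (cheapest.loc["2023-02"] / cheapest.loc["2023-01"]).mean() * 100
print(f"Lowest-cost items index: {index:.1f}")
```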

What are the strengths and weaknesses of new sources of data? 

  1. In the wider context of technological advances, statistics need to remain relevant, accurate and reliable, and new data sources support this ambition. However, with the use of these new and innovative data sources in the production of official statistics, producers need to manage risks around quality. Moreover, with more use of data science and statistical models in the production of official statistics it is crucial that producers ensure that any development of models is explainable and interpretable to meet the transparency requirements of the Code.
  2. To maximise the opportunities from new data sources, the role of the statistician has to evolve and keep pace with the increasing use of data science techniques. Our latest State of the Statistical System report highlights the difficulties producers have in getting people with the right skills in post; these challenges are not felt consistently across the whole UK statistical system. There is a concerning risk that continued financial and resource pressures will hinder future progress and the evolution of the system to keep pace with increasing demand. A successful statistical system that is able to utilise new data sources depends on having a workforce that is sufficiently resourced and skilled to deliver.
  3. New data sources often provide insights in a timelier manner (in some instances near real time, such as England’s school attendance data) and provide better coverage (the web-scraped and supermarket prices data, for example, often include all transactions or prices). On the other hand, there is a risk that the data may not measure what people want to measure, and there is no option to amend or edit the data or the questions being asked. Producers also have little control over coherence and comparability in the data; there may be differences in how organisations record their data, as well as between datasets on a similar topic. Data could also be missing for some observations and variables, and the data could be biased by only covering certain groups of people or transactions.

Protecting privacy and acting ethically 

What does it mean to use data ethically, in the context of statistics and analysis? 

  1. As the regulator of official statistics in the UK, it is our role to uphold public confidence in statistics. In our view, an oft-neglected question of data ethics concerns not so much how data are collected and processed, but how the resulting statistics are used in public debate. As a result, we consider the question of whether a particular use is misleading as intrinsically ethical.
  2. One of the areas we continue to develop our thinking on is the topic of misleadingness, publishing a think piece on misleadingness in May 2020 and following up on our initial thinking in May 2021. The latter focuses on feedback to the first think piece that it is important to distinguish between the production of statistics and the use of statistics, as well as identifying areas not covered in the original think piece, like the risk of incomplete evidence. Based on our findings, our thinking has evolved to be clearer on the circumstances in which it is relevant to consider misleadingness: “We are concerned when, on a question of significant public interest, the way statistics are used is likely to leave audiences believing something which the relevant statistical evidence would not support.”
  3. We are launching a review of the Code of Practice for Statistics in September. As part of it, we will be asking the question “what are the key ethical issues in the age of AI: how do we balance serving public good with the potential for individualised harms?”. The review will run until December, and we will be highlighting how people can engage and contribute, including a planned panel session on this topic.

Understanding and responding to evolving user needs

Who should official data and analyses serve? How do users of official statistics and analysis wish to access data? 

  1. OSR’s vision, based on our founding legislation, is that statistics serve the public good. In 2022 we worked in partnership with ADR UK to explore what the term ‘public good’ means to the public. We found that research and statistics should aim to address real-world needs, including those that may impact future generations and those that only impact a small number of people. There was also clear evidence that members of the public want to be involved in making decisions about whether public good is being served, through meaningful public engagement and full, transparent and easy access to the decision-making process of Data Access Committees (which evaluate applications from trained and accredited researchers for the use of de-identified data for research).
  2. In 2021, we published a report looking at Defining the Public Good in Applications to Access Public Data. The report highlights how researchers see their research as serving the public good or providing public benefits, and how this differs between organisations. For example, the most frequently mentioned public benefits in National Statistician’s Data Ethics Advisory Committee (NSDEC) applications were to improve statistics and service delivery, whereas Research Accreditation Panel (RAP) applications mentioned policy decisions and societal benefit more.

How are demands for data changing?   

  1. Government and public demand for statistics and data continues to shift significantly from COVID-19 to other key issues. The statistical system has demonstrated its responsiveness to meet these data needs. However, as mentioned at paragraph 17, pressure on resources and finances poses a significant threat to the ability of government analysts to produce the insight government and the wider population need to make well-informed decisions.
  2. Working in an efficient way will help address one part of this problem: it will help ensure maximum value is achieved with the resources that are available, which will in turn help others across government appreciate the benefit of having analysts at the table. Our blog, Smart statistics: what can the Code tell us about working efficiently?, highlights ways to support efficiency based on the Code.
  3. Users of statistics and data should always be at the centre of statistical production; their needs should be understood, their views sought and acted on, and their use of statistics supported. We encourage producers of statistics to have conversations with a wide range of users to identify where statistics can be discontinued, or reduced in frequency or detail, to save resources where appropriate. This can free up resource while helping producers to fulfil their commitment to producing statistics of public value that meet user needs. Ofsted has recently done this to great effect.
  4. The UK statistical system should maintain the responsive and proactive approach taken during the COVID-19 pandemic and look to do this in a sustainable way. Improvements to data infrastructure, processes and systems could all help. For example, the use of technology and data science principles, such as those set out in our 2021 RAP review, supports the more efficient and sustainable delivery of statistics. That review includes several case studies of producers using RAP principles to reduce manual effort and save time, alongside other benefits; a minimal illustration of the idea appears after this list. The recent Analysis Function RAP strategy sets out the ambition to embed RAP across government, and the Analysis Function can offer RAP support through its online pages, its Analysis Standards and Pipelines Team and the cross-government RAP champion network.
  5. Statistics and data should be published in forms that enable their reuse, and opportunities for data sharing, data linkage, cross-analysis of sources, and the reuse of data should be acted on. The visualisations and insights generated by individuals, from outside the statistical system, using easily downloadable data from the COVID-19 dashboard nicely demonstrate the benefits of making data available for others to do their own analysis, which can add value without additional resource from producers.
  6. Promoting data sharing and linkage, in a secure way, is one of OSR’s priorities and we are currently engaging with key stakeholders involved in data to gather examples of good practice, and to better understand the current barriers to sharing and linking. This will be used to champion successes, support positive change, and provide opportunities for learning to be shared.
  7. Ensuring overall success requires:
    • independent decision making and leadership, in particular Chief Statisticians and Heads of Profession for Statistics having authority to uphold and advocate the standards of the Code.
    • professional capability, again demonstrating the benefit of investing in training and skills, even when resources are scarce.
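
As a rough illustration of the RAP principles referenced above, the sketch below shows a single pipeline step written as code rather than performed manually. It is a minimal sketch only: the file names, column names and summary chosen here are hypothetical, not drawn from any actual government pipeline.

```python
# Minimal sketch of a single RAP-style pipeline step (hypothetical file
# and column names; not drawn from any actual government pipeline).
import pandas as pd

RAW_DATA = "responses_2023.csv"   # hypothetical raw extract
OUTPUT = "summary_table.csv"      # table published alongside a bulletin


def build_summary(raw_path: str, out_path: str) -> pd.DataFrame:
    """Read raw data, derive a headline table, and write it out."""
    df = pd.read_csv(raw_path)
    summary = (
        df.groupby("region", as_index=False)["value"]
        .mean()
        .rename(columns={"value": "mean_value"})
    )
    summary.to_csv(out_path, index=False)
    return summary


if __name__ == "__main__":
    build_summary(RAW_DATA, OUTPUT)
```

Because the whole derivation is scripted, re-running it on the same input reproduces the same output; it is this repeatability and auditability that reduces manual effort and supports sustainable delivery.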

How can we ensure that official data and analyses have impact? 

  1. To have impact, official data and analysis need to serve the public good (by being trustworthy, of high quality and valuable) and be well communicated.
  2. This is reflected in the three pillars of our Code: Quality sits between Trustworthiness, representing the confidence users can have in the people and organisations that produce data and statistics, and Value, ensuring that statistics support society’s needs for information. All three pillars are essential for achieving statistics that serve the public good. They each provide a particular lens on key areas of statistical practice that complement each other and help to ensure the data are being used as intended.
  3. Quality is not independent of Trustworthiness and Value. A producer cannot deliver high quality statistics without well-built and functioning systems and skilled staff, nor produce statistics that are fit for their intended uses without first understanding those uses and the needs of users. This interface between quality, its institutional context and statistical purpose is also reflected in quality assurance frameworks (QAF), including the European Statistical System’s QAF and the International Monetary Fund’s DQAF. The Code is consistent with these frameworks and with the UN Fundamental Principles of Official Statistics.
  4. We use assessments and compliance checks to judge compliance with the Code for individual sets of statistics or small groups of related statistics and data (for example, covering the same topics across the UK). Whether we use an assessment or compliance check will often be determined by balancing the value of investigating a specific issue (through a compliance check) versus the need to cover the full scope of the Code (through an assessment).
  5. There is no ‘typical’ assessment or compliance check – each project is scoped and designed to reflect its needs. An assessment will always be used when it concerns a new National Statistics designation and will also be used to undertake in-depth reviews of the highest profile, highest value statistics, especially where potentially critical issues have been identified.
  6. We have some useful guidance that can assist producers in their quality management. We published a guide to thinking about quality when producing statistics following our in-depth review of quality management in HMRC, and released a blog to accompany our uncertainty report. It highlights some important resources, top among them the Data Quality Hub guidance on presenting uncertainty. Our quality assurance of administrative data (QAAD) framework is a useful tool to reassure users about the quality of the data sources.
  7. To support statistics leaders in developing a strategic approach to applying the Code pillars and a quality culture, we have developed a maturity model, ‘Improving Practice’. It provides a business tool to evaluate the statistical organisation against the three Code pillars and helps producers identify the current level of practice achievement and their desired level, and to formulate an action plan to address the priority areas for improvement for the year ahead.
  8. We are also continuing to promote a Code culture that supports producers opening themselves to check and challenge as they embed Trustworthiness, Quality and Value, because in combination, the three pillars provide the most effective means to deliver relevant and robust statistics that the public can use with confidence when trying to shine a light on important issues in society.
  9. In our report on presenting uncertainty in the statistical system we found that presenting uncertainty in a meaningful, succinct way that delivers the key messages can be challenging for producers. We found that typically, uncertainty is better depicted and described in statistical bulletins and methodological documents than it is in data tables, data dashboards and downloadable datasets.
  10. We also found that there is a wide and increasing range of guidance and advice to help producers think about how to best present uncertainty. OSR will do more to promote and support good practice and consider what this means for our regulatory work. We will focus on the judgements that we make and the guidance we produce to help producers to improve the presentation of uncertainty.
  11. In our report, we concluded that showing uncertainty in estimates, for example through data visualisation, is essential in improving the interpretation of statistics and in bringing clarity to users about what the statistics can and cannot be used for. At the same time, however, we recognise that this is not always a straightforward task. With support from us and those at the centre of the Government Statistical Service (GSS), we encourage Heads of Profession for Statistics to review whether uncertainty is being assessed appropriately in their data sources, and to review how this is presented in all statistical outputs.
  12. We will continue to review the communication of uncertainty in our regulatory projects. We already have a good range of experience and effective guidance to help review uncertainty presented in statistical bulletins and methodology documents.

How do we ensure that users in the Civil Service, Parliament and beyond, have the skills they need to make effective use of data? 

Intelligent transparency

  1. Intelligent transparency is fundamental in supporting public trust in statistics. Our campaign and guidance aim to ensure an open and accessible approach to communicating numbers.
  2. In our blog What is intelligent transparency and how you can help?, we explain that, at its heart, intelligent transparency is about proactively taking an open, clear and accessible approach to the release and use of data, statistics and wider analysis. We also recognise that while we will continue to champion intelligent transparency and equal access to data, statistics and wider analysis, it is not something we can do on our own. Our expectations for transparency apply regardless of how data are categorised. For many who see numbers used by governments, the distinction between official statistics and other data, such as management information or research, may seem artificial. Therefore, any data which are quoted publicly, or in which there is significant public interest, should be released and communicated in a transparent way.
  3. We need users of data to continue to question where data come from and whether they are being used appropriately. We also need those based in departments and public bodies to champion intelligent transparency in their team, their department and their individual work; to build networks that promote our intelligent transparency guidance among colleagues and senior leaders; and to engage with users to understand what information they need, which in turn informs the case for publishing it.
  4. Parliamentarians also have a role to play in ensuring intelligent transparency in debate. This includes advocating for best practice around the use of statistics and calling out misuse of statistics where it occurs. Following the principles of intelligent transparency allows the topic discussed to remain the focus of conversation, rather than the provenance of the data.
  5. We have launched a communicating statistics programme that will in part look to understand how users want to access data and help support producers to communicate their data through those different means. This will include reviewing our existing guidance to understand what more we can do to support the use and range of communication methods while preventing and combatting misuse.

Statistical literacy

  1. In our regulatory work, when people talk to us about statistical literacy it is often in the context of it being something in which the public has a deficit. For example, ‘statistical literacy’ may be cited to us as a factor in a general discussion on why the public has a poor understanding of economic statistics. OSR commissioned a review of published research on this topic and published an accompanying article to investigate whether this was indeed the case.
  2. We found wide variability across the general public in the skills and abilities that are linked to statistical literacy. Our review highlights that a substantial proportion of the population display basic levels of foundational skills and statistical knowledge, and that skill level is influenced by demographic factors such as age, gender, education and socioeconomic status.
  3. Given this, we think that it is important that statistical literacy is not viewed as a deficit that needs to be fixed, but instead as something that is varied and dependent on the context of the statistics and factors that are important in that context. Therefore, rather than address deficits in skills or abilities, we recommend that producers of statistics focus on how best to publish and communicate statistics that can be understood by audiences with varying skill levels and abilities.
  4. Our review identified a number of areas where there is good evidence on how best to communicate statistics to non-specialist audiences:
    • Target audience: Our evidence endorses the widely recognised importance of understanding audiences. The evidence highlights that the best approach to communicating information (including data visualisations) can vary substantially depending on the characteristics of the audience for the statistics. Considering the target audience’s characteristics is, therefore, an important factor when designing communication materials.
    • Contextual information: Contextual information helps audiences to understand the significance of the statistics. Our evidence highlights the importance of providing narrative aids, and also that providing statistical context can help to establish trust in the statistics. Again, this supports and reflects existing notions of best practice.
    • Establishing trust: As well as providing context, we found evidence that highlighting the independent nature of the statistical body and, when needed, providing sufficient information so that the reasons for unexpected results are understood, can increase trust in the statistics. This finding aligns with the Trustworthiness pillar of the Code.
    • Language: In the statistical system, statistics producers recognise that they should aim for simple, easy-to-understand language. We found evidence to endorse this recognition – in particular, that the level of technical language used should be dictated by the intended target audience.
    • Format and framing of statistical information: We found evidence that different formats (e.g., probability, percentage or natural frequency) and/or framing (e.g., positive or negative) in wording can lead to unintended bias or affect perceptions of the statistics and both need to be considered. This finding is probably the one which is least widely recognised in current best practice in official statistics, and we consider it is an area that would benefit from further thinking.
    • Communicating uncertainty: Communicating uncertainty is important and may need to be tailored depending on the information needs and interest levels of the audience. This topic is a particular focus area for OSR, and we discussed our report on communicating uncertainty at paragraph 39.

UK Statistics Authority oral evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority

On Tuesday 23 May, Sir Robert Chote, Chair of the UK Statistics Authority, Sir Ian Diamond, National Statistician and Ed Humpherson, Director General for Regulation, gave evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the work of the UK Statistics Authority.

A transcript of the session has been published on the UK Parliament website.

Office for National Statistics written evidence to the Public Administration and Constitutional Affairs Committee’s inquiry on the Civil Service People Survey

Dear Mr Wragg,

I write in response to the Committee’s call for evidence on the Civil Service People Survey. We have focused our evidence on questions in the Terms of Reference regarding survey design, delivery and validity of results, from the perspective of our role as the Office for National Statistics in administering high-quality national surveys including the census.

Survey design

Anonymity

Staff surveys, and surveys in general, can adopt different strategies to protect respondents’ privacy. These can range from anonymising responses by removing any information that connects the survey to the respondents, to ensuring that analysis derived from the survey does not lead to disclosing identity.

Full anonymisation can limit analysis. For example, if different age groups have a different experience of working in an organisation, this would not be highlighted if the age field were removed to protect privacy. Therefore, best practice is to seek a compromise by using a range of measures to protect privacy. These include:

  • Deidentification removes fields that are highly likely to identify an individual, such as their name and address, and keeps fields, such as age, that do not directly relate to one person. Grouping answers limits direct identification further, for example using age ranges rather than dates of birth, or referring to branches rather than teams. All these approaches group small numbers of people together to limit the identification of one person while maximising the benefit of the survey.
  • Use of identifiers rather than names provides additional protection. Only very few people would have the technical ability and access rights to link respondents to responses, and anyone who did breach this would be easily identified because they would leave a digital footprint.
  • Open rather than targeted invitation provides respondents with more control over their responses. An open invitation to the target population allows a respondent to input all their information without it connecting back to a database. A targeted invitation provides a respondent with a code that connects them, and only them, to the sampling frame but could come at the expense of privacy if other measures are not taken, and possibly impact on responses and the response rate if people are more concerned they can be identified.
  • Segregation of duties enables a significant reduction in the number of people who have access to identifiable information. Analysts are not granted access to personal identifiers and users of analysis are only granted access to aggregated data.
  • Statistical Disclosure Control is a process by which analytical outputs are checked to ensure that they cannot lead to reidentification of individuals. There are a number of methods that can be used, including suppressing small numbers and swapping cells in a way that keeps the headline summary correct; a minimal sketch of small-count suppression appears after this section. We would not recommend completing cross-sectional analysis when there are low numbers in a category, as this might enable identification, especially when it is possible to link to other information in the public domain.
  • Summarising and controlling access to free text are important in ensuring that respondents who provide information that could identify themselves or others are protected. This is particularly important when respondents use the survey as an opportunity to raise issues which require careful handling, such as safeguarding. It is best practice to have a safeguarding policy that provides clear guidance and oversight as to when privacy should be breached to protect individuals.

In addition, it is good practice to carry out privacy impact assessments and to make privacy notices and technical guides readily available.
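
The sketch below illustrates two of the measures above – deidentification with grouped answers, and small-count suppression – in Python. It is a minimal sketch only: the column names, age bands and suppression threshold are hypothetical choices for illustration, not prescriptions.

```python
# Illustrative sketch of deidentification and small-count suppression.
# Column names, age bands and the threshold are hypothetical choices.
import pandas as pd

THRESHOLD = 5  # suppress any cell counting fewer than five respondents


def deidentify(df: pd.DataFrame) -> pd.DataFrame:
    """Drop direct identifiers and band age into ranges."""
    df = df.drop(columns=["name", "email"])  # direct identifiers
    df["age_band"] = pd.cut(
        df["age"],
        bins=[15, 29, 44, 59, 120],
        labels=["16-29", "30-44", "45-59", "60+"],
    )
    return df.drop(columns=["age"])


def suppress_small_counts(counts: pd.Series) -> pd.Series:
    """Primary suppression: mask any count below the threshold."""
    return counts.mask(counts < THRESHOLD)


# Example usage: counts by branch and age band, with small cells masked.
# df = deidentify(pd.read_csv("responses.csv"))
# table = df.groupby(["branch", "age_band"], observed=True).size()
# print(suppress_small_counts(table))
```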

Survey design and delivery

To ensure the best design and delivery of a survey, you may want to be aware of the following:

  • Continuity – staff surveys such as the People Survey are repeated every year so that changes can be tracked and compared over time. To achieve the objectives, the survey needs to be relatively stable, and changes carefully considered and implemented. When a question is discontinued or changed significantly, the time series is ‘broken’, and a new measure is tracked. This is sometimes necessary to ensure the survey remains relevant and useful.
  • Comparability – when a key requirement is the ability to compare performance across the civil service, a key feature of the survey must be consistency: the same survey, with the same questions must be used by all organisations.
  • Comprehension – questions should be pre-tested to ensure that they are being understood as intended and the wording is suitable and understood by all respondents. As far as possible, the survey should use harmonised standards that are available to government departments or reuse questions that are commonly used. These questions have been tested and the practice enables comparison with other data.
  • Scope – the topics covered by the survey are varied. To better understand them, questions in a ‘block’ touch on subsets of the topic. The survey designers must consider the length of the survey and the impact it may have on the quality of responses, on participation, and on whether respondents follow through to the end and complete the survey. The usual recommendation for an online survey is that it should be completable in around 20 minutes.
  • Mode of collection – how responses are collected is determined by cost, the speed at which results are needed, participant preference and the influence different modes have on the responses provided. For example, when people complete a survey online, which is the cheapest collection mode, they tend to complete it quickly and may be less reflective than in an interviewer-led survey, where the interaction between people allows answers to be explained and probed further.
  • Inclusive – the survey ought to be inclusive by design and this refers to the overarching study design but also to the design of the questions themselves and the interfaces that respondents interact with. For example, the online survey should be designed to meet accessibility standards so that it does not limit participation through design. We should be inclusive in the questions we ask, ensuring that the available answer options collect data that represents the population being surveyed. Having multiple modes of collection available increases access to the survey and in turn increases representation in the data.

Who should be involved?

Developing and delivering a survey of the scale of the People Survey is a multidisciplinary task that requires the involvement of many professionals to ensure it delivers on analytical and business objectives.

  • Policy and Analysis users – it is essential to involve those who will be using the results, to understand their requirements and to ensure that the data being collected answers their policy questions.
  • Methodologists and Data Architects – the data that underpins the analysis that responds to the policy questions needs to be designed and architected so that it meets data standards and methodological requirements. This step is crucial to ensure that the data collected is fit for purpose, can be used, reused and linked (for example to the data from previous years).
  • Survey designers – as with all surveys, it is crucial to involve questionnaire design experts in the development of the questions and the survey to ensure it meets and balances user needs. As part of their professional input, survey designers will review whether questions are clear, appropriate, representative, inclusive and accessible by involving groups across the civil service and asking for their views. They will test questions to ensure that they meet the requirements. We would prioritise cognitive testing to check understanding and interpretation, to mitigate any potential quality issues in the data ahead of going live and so that results can be explained clearly following analysis.
  • Survey developers and user experience designers – whether data is collected online, by an interviewer or using a paper questionnaire, the survey flow and the user interface must be designed and tested to meet industry standards and to ensure that the survey is accessible to everyone. The survey can be sent to https://digitalaccessibilitycentre.org/ for testing.
  • Procurement – whether the survey is commissioned internally or externally, the specification must be understood and agreed by all parties with subsequent changes governed appropriately. The successful bidder must be able to meet the required standards.
  • Supplier – at the appropriate stage it is essential to build a strong working relationship with the supplier and especially with the technical delivery team. The supplier will be a survey expert with a wealth of experience and should be able to deliver the specified requirements as well as advise on innovation.
  • Communication and dissemination teams – the survey must be promoted by central and local teams to encourage participation. In addition to advertising the survey, the communication can include descriptions of how data will be used, what the benefit of the survey will be and why it is worth taking part. As well as communicating the results, it is necessary to ensure methods and processes are transparent so that people know what to read into them, and importantly what not to read into them. For communication teams to support the survey they must be given all the relevant information from design to analysis.

Relevance of metrics

The information included in the People Survey should be based on the needs of data users and the departments that will use it. As mentioned above, these can be ascertained through consultation with policy users. Comparison over time is always an important aspect of any regular survey, and we would recommend keeping question sets as comparable as possible from year to year, with changes, when needed, following a transparent methodological review. Finally, some terms used within questions may be interpreted differently depending on the department – again, this could be improved through consultation.

Periodically the topics covered will change and be impacted by other issues. A good example is the need to monitor the experience of working in the civil service throughout and following the pandemic. When adding or changing a metric, it is important to communicate and explain the changes, especially at the reporting stage.

Some departments may also need to consider organisational changes and how they would like them reported against previous years.

Validity of results

Quality assurance

It would be difficult to quality assure the information provided via the People Survey. There are limited sources against which to cross-check the information, but these could include exit interviews and/or any internal departmental staff surveys. One approach could be a quality follow-up survey with a sample of respondents – similar to what we do with the census to quality assure that data.

Non-response bias impact

Non-response bias can have a huge impact. It can distort results, and this is linked with wider issues: typically, people who do not respond have a reason not to engage, and those reasons are particularly interesting for those collecting the data but remain unseen.

It also means that the data will not be representative and, as a result, any policy changes might not address the real issues. Methodological solutions include weighting and imputation, which require comparing the population of respondents to the population of civil servants using the data available through HR departments. For example, if fewer people aged under 30 respond to the survey, the responses of those who have replied could be given a bigger weight; a minimal sketch of this calculation follows. Any weighting strategy would need to be transparent and carefully considered, testing the assumption that the people who have responded do indeed represent those who have not.
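
As a rough illustration of the weighting idea above, the sketch below computes post-stratification weights from a population profile and a respondent profile. The figures are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical post-stratification weights: the population profile would
# come from HR data, the respondent profile from the achieved sample.
population_share = {"under_30": 0.25, "30_and_over": 0.75}  # HR data
respondent_share = {"under_30": 0.15, "30_and_over": 0.85}  # survey

# Each respondent is weighted by population share / respondent share for
# their group, so the weighted sample matches the known population profile.
weights = {
    group: population_share[group] / respondent_share[group]
    for group in population_share
}

print(weights)
# {'under_30': 1.67, '30_and_over': 0.88} (to two decimal places) –
# under-30s, being under-represented, receive a weight above one.
```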

Survey Delivery

Strengths and weaknesses

Strengths and weaknesses are mostly the result of trade-offs. For example, while the People Survey is relatively long, risking attrition, lower response rate and haste in completion, it does allow for more detailed analysis on many topics.

As discussed in this submission, using a consistent survey across the Civil Service enables efficiency, comparison between organisations, sharing of good practice and analysis over time, while limiting bespoke design on issues that may be of interest to specific departments.

As noted above, while the survey is not weighted, which enables quick access to the results, this does affect how confident we can be that respondents represent the civil service as a whole. The survey is reported as percentages of respondents rather than percentages of the population, and users can break down the results further to compare responses from different groups. This is a pragmatic approach which is clearly communicated.

A mixture of both quantitative and qualitative data collection could improve the quality of the analysis and the usefulness of the survey. The People Survey is quantitative, with a few open questions capturing free text. Other qualitative methods, such as in-depth interviews and focus group discussions, can be used alongside the People Survey to enhance understanding of the results. These can be either in addition to, or instead of, some of the questions in the survey.

Finally, the survey is accompanied by a tool that enables quick analysis and comparisons, disseminated to all participating organisations; this is a strength.

My colleague Sarah Henry, Director of Methodology at the ONS, looks forward to discussing this further with the Committee on 13 September. Please do let us know if you have any questions ahead of then.

Yours sincerely,

Professor Sir Ian Diamond

Office for National Statistics correspondence to the Public Administration and Constitutional Affairs Committee regarding UK wide census data

Dear William,

Thank you for your letter of 14 June 2022 regarding census response rates in Scotland and the implications for the integrity of UK-wide data. I will answer your specific questions in turn, but first I wanted to emphasise the close working relationship between all UK Census offices. We have offered and provided support to National Records of Scotland (NRS), including sharing designs and seconding staff, and we are now working with them to develop methods to maximise the accuracy of their Census estimates.

The Office for National Statistics (ONS) also worked with NRS to establish an international steering group, which is providing the highest quality technical expertise, advice and challenge to NRS on census matters. This group is advising NRS to focus efforts toward a good census coverage survey, particularly in regions where responses were lowest, and strengthen the use of administrative data to supplement census data sources in their statistical production, including a clear steer to prioritise the early acquisition of new administrative data sources. This oversight will offer NRS the best possible opportunity to deliver a high-quality outcome for Scotland that will, in turn, contribute to high-quality UK statistics.

The ONS has committed an internationally renowned team of experts equipped with decades of demographic experience to work alongside NRS. Through a combination of this world-leading expertise, the ambitious use of supplementary administrative data, and sophisticated estimation methods, I remain confident that together we will deliver robust UK-wide estimates of the population.

What assessment has been made of the reasons underlying the low response rate in certain areas, and what steps were taken to avoid this occurrence? To what extent is the separate delivery of the Scottish Census considered to have impacted the response rate?

The Census in Scotland is a devolved matter; our assessment of the reasons underlying a low response rate in some areas and the associated mitigations has therefore been formed through our close working partnership with NRS.

The decision by Scottish Ministers to move Scotland’s Census to March 2022 was informed by NRS analysis of the potential impact of COVID-19 on the quality of an operation in March 2021. NRS adopted a diverse and inclusive approach to public awareness around the census through media and physical advertising, follow-up reminders, field staff, and focused efforts in areas of low return; approaches comparable to those of the ONS and the Northern Ireland Statistics and Research Agency (NISRA). Evidence gathered in Scotland at the end of the collection phase reported that ‘too busy’ and ‘not aware of the Census or the need to complete it’ were the most common reasons given by householders who had yet to return.

As already mentioned, in light of lower than anticipated returns, the ONS and the Registrar General for Scotland established an international steering group of census and coverage experts. Despite these challenges in the collection phase, having considered the position and the planned next steps with the census in detail, the steering group have confirmed that there is a stable foundation from which to move from census collection onto the next stage of the census operation, namely the census coverage survey and the incorporation of administrative data into estimates. It is the combination of census returns, coverage survey, administrative data, and estimation methodology that will deliver high quality census outputs for Scotland.

What are the implications of Scotland’s lower response rate for the quality and comparability of UK-wide population statistics?

By taking the actions outlined above, through a combination of census data, supplementary administrative data, and sophisticated estimation methods, we believe that it will be possible to deliver a high-quality outcome for Scotland that will, in turn, contribute to high-quality UK-wide population statistics.

What actions will be taken to quality-assure the Scottish Census data in order to reduce the impact of the lower response rate on the standards of UK-wide population statistics?

NRS remain committed to continuing to produce the best possible population estimates for Scotland, which will be used to produce UK-level population estimates. They currently produce annual official population figures for Scotland using data from a range of sources including the Census, registration data on births and deaths, migration estimates, and a wide range of other administrative data sources. NRS continue to improve these statistics through augmentation of existing and new administrative data sources, ensuring the focus is to deliver population estimates that are accessible and valuable for users.

NRS continue to work closely with partner organisations across the UK, including the ONS and NISRA, to ensure that UK population data and analysis is coherent, comparable and understandable for all users across the UK.

The Committee might also wish to note that the Office for Statistics Regulation (OSR) is currently assessing the Scottish Census, including how NRS are responding to the current situation and the methods and quality assurance they will put in place to provide the best quality data and statistics on the population of Scotland. The OSR has, in both its preliminary findings assessment report for Censuses in the UK, and in subsequent assessment reports for England, Wales and Northern Ireland, highlighted UK data and continues to engage with all three census offices in this regard. I am sure the Director General for Regulation, Ed Humpherson, will keep the Committee informed of its findings, which it plans to publish in November.

Yours sincerely,
Professor Sir Ian Diamond