EAP142 – Measuring Respondent Error in the 2021 Census

Summary

This paper describes three approaches to measuring respondent error in the 2021 Census and invites comments from Panel members to guide the final recommendation on the preferred approach.

Background

A Census Quality Survey (CQS) was used in the 2011 Census to estimate respondent error for each individual census question. Following the Census itself, a sample of about 12,000 household residents was asked the same questions a second time, in a separate survey. This allowed agreement rates to be calculated by identifying the fraction of respondents whose responses differed between the census and the survey. Respondent error is likely to be more of an issue for the census than for many ONS survey sources, as the census is not interviewer administered and a large proportion of proxy responses (information provided on behalf of someone else in the household) is expected. The results of the 2011 CQS are published in the 2011 CQS report, and identified uses of the CQS results are described in Annex A.
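To make the agreement-rate calculation concrete, the sketch below (illustrative Python, not the 2011 production processing; the linked records are invented) computes the agreement rate for a single question from census responses linked to follow-up survey responses.

```python
# Illustrative sketch: agreement rate for one question, computed over census
# records that have been linked to follow-up survey (CQS) records.
import pandas as pd

# Hypothetical linked data: one row per linked individual, with the response
# given on the census form and the response given in the follow-up survey.
linked = pd.DataFrame({
    "census_response": ["Employed", "Employed", "Retired", "Student", "Employed"],
    "cqs_response":    ["Employed", "Retired",  "Retired", "Student", "Employed"],
})

agree = linked["census_response"] == linked["cqs_response"]
agreement_rate = agree.mean()        # fraction of linked respondents who agree
error_rate = 1 - agreement_rate      # interpreted as respondent error on the census

print(f"Agreement rate: {agreement_rate:.2%}, implied respondent error: {error_rate:.2%}")
```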

The 2011 survey (and a similar survey ahead of the 2001 census) was conducted as face-to-face guided interviews. This format is considered the “gold standard” for data quality. Disagreements are therefore interpreted as respondent error on the census form.

The default assumption during preparations for the 2021 Census has been that we will once again run a CQS. The Census White Paper reflects this intention and a pilot survey was run in early 2020.

We have also conducted research into whether an alternative approach using existing administrative or survey sources could provide the same or better results with advantages of lower cost and respondent burden.

In September 2020 we will seek a decision from the PPP Transformation Board on the preferred approach for measuring respondent error. This note outlines the three broad approaches that will be presented. We invite comments from Panel members on those approaches. This paper does not present detailed information on how each approach would be implemented.

The use of agreement rates between individual data on the Census and an alternative source as a reliable metric for respondent error relies on the alternative source:

  • providing data on the same concept as the census;
  • being sufficiently accurate in collecting individual data relating to Census day;
  • avoiding the tendency to record the same errors that appear on the census (for example, by using a different mode of collection);
  • providing data on a representative sample of the population.

Option 1: Face-to-face Interview CQS

This was the approach adopted in 2011. The methodology for that survey is described in the 2011 CQS Report. The approach was tested in the CQS Pilot in January/February 2020 and no major concerns were identified.

The CQS is designed to collect information relating to the same concepts as the census (in effect, asking the same questions) and the use of trained interviewers with supporting information to help respondents should ensure sufficient accuracy of response. The mode of collection is independent of the census.

The initial sample can be designed to be representative of the population of interest (that is, individuals on census returns), and post-stratification or calibration can be used to adjust for differential response between demographic groups (defined with reference to characteristics collected on the census). However, differential response within those groups may still affect the accuracy of the estimated agreement rates as a measure of respondent error.
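As a simple illustration of the post-stratification step (hypothetical strata and rates, not the proposed estimator), the overall agreement rate can be re-weighted by each stratum's share of the census population so that differential CQS response between strata does not bias the estimate:

```python
# Minimal sketch of post-stratification with hypothetical age-group strata:
# the CQS agreement rate in each stratum is weighted by that stratum's share
# of the census population rather than its share of the achieved CQS sample.
census_share = {"16-34": 0.30, "35-64": 0.50, "65+": 0.20}   # from the census
cqs_agreement = {"16-34": 0.88, "35-64": 0.93, "65+": 0.95}  # observed in the CQS sample

post_stratified = sum(census_share[g] * cqs_agreement[g] for g in census_share)
print(f"Post-stratified agreement rate: {post_stratified:.3f}")
```

Note that, as the paragraph above states, this corrects for differential response between the strata but not for any differential response within them.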

Option 2: Telephone Interview CQS

This approach is similar to Option 1 except that the interviews would be conducted via telephone rather than face to face. This approach was also tested in the CQS pilot and, again, no major concerns were identified.

We have no evidence on the quality of individual data collected via the telephone approach compared with face-to-face interviews, but we note that interviews via each method are conducted by trained interviewers with specific briefing on the survey and with guidance on helping respondents answer the survey accurately. Telephone surveys are used as the data collection method for various National Statistics. The Labour Force Survey QMI, for example, notes the possibility of measurement error but does not seek to quantify it or distinguish between modes of collection. It seems reasonable to assume that the difference in accuracy between a telephone and a face-to-face interview with an individual will be small compared with the inaccuracy present on census returns.

Whilst a very small amount of census data may be collected by telephone, this approach would be, for practical purposes, an independent mode of collection to the main census.

We have identified a potential issue with the representativeness of the CQS sample due to the use of a tele-matching service to obtain a telephone number for the address appearing on the census return. At present, these numbers are restricted to landline numbers. It is reasonable to assume that landline use is not constant across demographic groups, and that a simple sample selection would therefore over-cover some groups. Further, the pilot survey indicated that retired individuals were disproportionately likely to respond to the survey. Both these issues can be mitigated by oversampling some groups to ensure that the final achieved sample is adequately representative, and by post-stratification or calibration of the estimates.

Option 3: Other Surveys

A research project was started in October 2019 to investigate whether an alternative to running a separate CQS was possible. It succeeded in identifying an approach using existing ONS surveys (Census Coverage Survey (CCS), surveys linked through the Census Non-Response Link Study (CNRLS), and an additional module on the Opinions and Lifestyle Survey (OPN)) which would allow the linked-individual comparisons required for the estimation of respondent error. Annex B summarises this approach.

By primarily using data that will already be linked to the Census for other purposes, we derive additional value without incurring additional cost. Making use of existing data sources wherever possible supports the goals of the current transformation of ONS statistics.

Furthermore, the CCS and CNRLS surveys offer a far larger sample size than could be gathered – or successfully linked – as part of a wholly separate CQS project.

The information to be linked to the census under this approach can generally be expected to be of good quality as it is collected by trained interviewers with appropriate supporting information as with the previous approaches. There are some instances where the data collected on other surveys relates to a slightly different concept or is collected in a different way which would complicate the calculation or interpretation of some agreement rates. Data collected via an OPN module would be collected through a web self-completion which would not be independent of the mode of collection of the main census (but, as with other approaches, would identify errors caused through proxy responses).

More information on this approach is provided in Annex B. However, it was not possible within the scope of the project and the available resources to test this approach on existing sources and demonstrate that it would provide the same quality as, or better than, a dedicated CQS.

The strengths and weaknesses of the three approaches are summarised below.

Table 1: Approaches to Measuring Respondent Error

Approach: CQS conducted through field interviews
Strengths:
  • Consistent with previous approach
  • Tested in pilot
  • Highest quality data
  • Simple to analyse and report
Weaknesses:
  • Expensive (data collection costs estimated at £1.1m)
  • Respondent burden
  • Requires face-to-face interviews with higher risk of operational problems or public unacceptability due to COVID-19

Approach: CQS conducted through telephone interviews
Strengths:
  • Tested in pilot
  • Good quality data
  • Simple to analyse and report
  • Low risk from COVID-19
  • Reduced data collection costs (£0.2m)
Weaknesses:
  • Respondent burden
  • Greater risk of unrepresentative sample

Approach: Alternative approach – existing surveys supplemented by OPN module
Strengths:
  • Low data collection costs (£40K)
  • Consistent with ONS strategy
  • Larger sample sizes for many questions allowing more detailed analyses
  • Low risk from COVID-19
Weaknesses:
  • Complex to analyse and report
  • Less reliable data for sexual orientation and gender identity
  • Some increase in requirements on other teams
  • Increased development costs
  • Not tested, with increased risk to the quality of results
  • Some risk of criticism for not meeting the White Paper commitment or during the NS Accreditation process

Quality Assurance

As the results of the CQS would be likely to be available only at a late stage in the QA process, it is difficult to envisage any adjustment to the Census data as a direct result of the CQS. This is seen as information to be used to help users understand uncertainty in the Census results, rather than being a tool in the QA itself.

Public Commitments

Paragraphs 4.97-4.99 of the Census White Paper state that a CQS will be carried out after the Census. There are no other commitments recorded on the Census Commitments database.

Demand from Data Users

The Census Historical Data team have no record of any interest or queries relating to the 2011 CQS. The 2021 Census Outputs and Dissemination team have confirmed that no reference to the CQS has been made in the consultation and engagement on the 2021 outputs. The web analytics team are unable to provide data on hits/downloads of the CQS page and report, due to the archiving of content and re-implementation of the website in 2016. A request for CQS data was received from the Welsh Government and we are checking whether a similar request is likely for 2021 data.

NS Accreditation

The three Census UKSA Assessment reports from 2011 provide no indication that the CQS was a factor in the decision to award National Statistics status. The only reference found is in Paragraph 3.27 of Report 1 in the context of ONS looking to harmonise its approach to the CQS with NRS and NISRA.

The first report provided to UKSA for the 2021 Assessment repeats the White Paper’s comments on the plan to conduct a CQS in 2021.

International Good Practice

Both NRS and NISRA conducted a CQS following the 2011 Censuses. The UNECE report Recommendations for the 2010 Census of Population and Housing refers (p193) to the possibility of running a post-census survey to measure content error but does not identify any countries carrying out such a survey.

Other Uses

The Census Question and Questionnaire Design team have indicated that the wider Social Survey Transformation team will be likely to have an interest in respondent error rates. This information can be used to help iterate and improve upon the design of corresponding questions in other surveys.

It is noted that analysis of the accuracy of responses to a survey-based census may be especially useful in the context of the 2021 Census, as this may help inform the discussion around potential future admin-first models.

Annex B

This Annex describes the results of a research project which developed a proposal for measuring respondent error in the 2021 Census without running a separate CQS. It does not describe in detail how this proposal was derived: in summary, we identified alternative sources of data for each census question and evaluated these with regard to how closely the response for an individual should correspond to the true value at Census day (for example, taking account of any difference in question design or timing differences in when the question was asked) and whether it was practicable to link the alternative source to the census responses in order to calculate agreement rates.

The proposed design measures around 25% of census questions using Census Coverage Survey (CCS) data; around 50% using Labour Force Survey (LFS) data (via the Census Non-Response Link Study (CNRLS)); and the remaining 25% via a custom Opinions and Lifestyle (OPN) survey module, collected over the four months spanning Census day. The preferred source for each question is shown in the tables at the end of this Annex.

Broadly speaking, the proposed design sources questions from the CCS as a priority. By its nature, questions shared with the CCS are generally exact matches in terms of design. In addition to its large sample size, the CCS offers high quality linkage and a shared reference date, which minimises concerns about time sensitivity. Methods of linkage would be taken into account to ensure that agreement rates are not incorrectly inflated through links only being made to records where the census and CCS responses agree. CCS data are expected to be of high quality, being collected through face-to-face interviews, although proxy data for other members of the household will be collected.

LFS data – via the CNRLS – is the next preferred source, due to its broad question coverage and similarly large sample size. Corresponding questions are not always identical between the two sources, and there will be some mismatch in timing (LFS responses would be taken from the two months before and after Census day). Where timing differences have a material effect on agreement rates (for example, for the activity last week question), we will investigate modelling the agreement rate as a linear regression with time from Census day as the explanatory variable, allowing estimation of the agreement rate that would apply at Census day. LFS data are collected through both face-to-face and telephone interviews, and proxy data are collected.
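A minimal sketch of that modelling idea is shown below (illustrative Python with invented agreement rates; the production specification may differ, for example in how time from Census day is defined). It fits a straight line to agreement rates against the absolute interview offset and reads off the intercept as the Census-day estimate.

```python
# Illustrative only: estimate the Census-day agreement rate by extrapolating a
# linear trend in agreement against (absolute) days from Census day.
import numpy as np

days_from_census_day = np.array([5, 10, 20, 30, 45, 60])          # assumed interview offsets
agreement_rate = np.array([0.91, 0.90, 0.89, 0.87, 0.85, 0.83])   # invented agreement rates

slope, intercept = np.polyfit(days_from_census_day, agreement_rate, 1)
print(f"Estimated agreement rate at Census day (offset 0): {intercept:.3f}")
print(f"Estimated change per day away from Census day: {slope:.5f}")
```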

Only questions unavailable from any existing source are assigned to the OPN module. These include the questions on sexual orientation and gender identity, where there will be particular interest in the accuracy of the census responses.

The OPN module will have a relatively small sample size of around 4,000 respondents, but this should be sufficient to achieve the target accuracy levels used in 2011 (agreement rates with a maximum margin of error of +/- 2 percentage points) for all questions other than second address type and English ability. The latter questions sit on very uncommon routing branches, which means that the issue cannot easily be resolved by increasing the overall sample size. No alternative existing sources are available for these questions; a lower-grade analysis using the responses that are available remains the best option. The same shortfall occurred in 2011[1], and would also be an issue for any dedicated CQS design in 2021. Furthermore, several other questions that did not achieve sufficient sample sizes for analysis in 2011 will be analysable under this design because they are sourced via the CCS and LFS, with their much larger samples.
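To see why around 4,000 responses meets a +/- 2 percentage point target while rarely-routed questions do not, the back-of-the-envelope check below (illustrative Python; the 90% agreement rate and the 200-response figure are assumptions, not quoted values) computes the approximate 95% margin of error for a proportion:

```python
# Approximate 95% margin of error for an estimated agreement rate (a proportion).
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n responses."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"{margin_of_error(0.90, 4000):.3f}")  # ~0.009, i.e. about +/- 1 percentage point
print(f"{margin_of_error(0.90, 200):.3f}")   # ~0.042 for a rarely-routed question with few responses
```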

The OPN is now online-first with telephone follow-up for non-responders, so it would not, for the most part, be a different mode of collection from the main census. It does not collect proxy data.

Building our design on the CCS, CNRLS and OPN necessarily incurs risks and uncertainty. Notably, because the CNRLS has been developing in parallel with this project, some details – such as the format and extent of its matching resource – may still be undetermined at the time of the recommendation report.

Table A1: Household Questions

Question | Topic | Preferred source
H1 | Who lives here (not asked on CQS) | N/A
H2 | Count of people (not asked on CQS) | N/A
H3 | Names of people (used for linkage) | N/A
H4 | Visitors (not asked on CQS) | N/A
H5 | Count of visitors (not asked on CQS) | N/A
H6 | Relationship matrix | LFS
H7 | Accommodation type | CCS
H8 | Shared rooms | CCS
H9 | Bedrooms | OPN
H10 | Central heating | OPN
H11 | (routing only) | N/A
H12 | Tenancy | CCS
H13 | Landlord | CCS
H14 | Cars & vans | OPN

Table A2: Individual Questions

Question | Topic | Preferred source
P | Proxy indicator (not asked on CQS) | N/A
1 | Individual name (used for linkage) | N/A
2 | Date of birth | CCS
3 | Sex | CCS
4 | Marital status | CCS
5 | Sex of partner | OPN
6 | Second address | OPN
7 | Second address type | OPN
8 | Student | CCS
9 | Term-time address | CCS
10 | Country of birth | LFS
11 | Date of arrival | LFS
12 | Intention to stay | CCS
13 | Address one year ago | OPN
14 | National identity | LFS
15 | Ethnicity | CCS
16 | Religion | LFS
17 | Welsh ability | LFS
18 | Main language | OPN
19 | English ability | OPN
20 | Passports | LFS
21 | Health | LFS
22 | Long-term illness | LFS
23 | Impact of illness | LFS
24 | Unpaid care | OPN
25 | (routing only) | N/A
26 | Sexual orientation | OPN
27 | Gender identity | OPN
28 | (instruction only) | N/A
29 | Apprenticeship | LFS
30 | Degree | LFS
31 | Other qualifications | LFS
32 | Veterans | OPN
33 | Activity in last 7 days | CCS
34 | Inactivity type | CCS
35 | Looking for work | LFS
36 | Availability in 2 weeks | LFS
37 | Waiting to start job | LFS
38 | Ever worked | LFS
39 | (instruction only) | N/A
40 | Self-employed | LFS
41 | Business name (not asked on CQS) | N/A
42 | Job title | LFS
43 | Job description | LFS
44 | Industry | LFS
45 | Supervisor status | LFS
46 | (routing only) | N/A
47 | Hours per week | LFS
48 | Method of travel | LFS
49, 50 | Location of work | OPN
51 | (instruction only) | N/A

EAP141 – Design of Address Frame, Collection and Coverage Assessment and Adjustment of Communal Establishments in 2021 Census

Orlaith Fraser, Cal Ghee. 2021 Census Statistical Design, August 2021

Purpose of this paper

We took an introduction to the design for Communal Establishments (CEs) to the external methodological assurance panel (MARP) in October 2019. The panel wanted an update specifically on imputation and use of admin data. Extract from the minutes:

2.8 – The use of donors for imputation was also discussed, with the panel satisfied with the selection within the same CE process but noted known issues regarding small donor pools and repeated use of donors. It was recommended that the ONS continue to research potential overuse of donors.

Actions

A52 – The panel would like a more detailed update on Communal Establishments in the future, including imputation and use of administrative data.

This paper addresses those comments as far as we are currently able, and adds in context on the end-to-end design, from creation of the address frame and collection operation through to coverage estimation and adjustment. It does not cover quality assurance/validation of the census estimates.

CRAG and MARP are asked for their views on the current status of the CE elements, and to feed in any ideas on aspects that are still in progress.

Introduction

According to the 2011 Census, 1.7% of residents of England and Wales live in managed residential accommodation known as Communal Establishments (CEs), such as care homes, student halls of residence, hospitals or prisons. While this is a relatively small proportion of the total population, they are likely to be clustered in particular locations and share certain characteristics. It is essential that we capture sufficient responses from these groups from a statistical quality, inclusivity and outputs perspective. A tailored enumeration approach is needed to enable those in communal establishments to respond. A similarly bespoke enumeration approach may also be required for a small number of households because of addressing challenges (transient population groups, temporary accommodation sites), access restrictions (e.g. royal households, embassies) and/or because additional engagement with the community may be required (such as Gypsy and Roma Travellers), which makes them unsuitable for the traditional household collection model. These are known as Special Population Groups (SPGs) and are often grouped with CEs for operational purposes.

Although the Census Coverage Survey (CCS) caters for CEs with up to 50 bed spaces, larger CEs are not included. A tailored approach to the processing of the data is therefore also required to ensure that estimates can be published to a standard that meets user needs.

Table 1 below summarises the main CE populations in 2011, showing the proportion they make of the whole population, the proportion of the CE population, and the response rate in 2011. It also summarises the quality concerns we have with each of these main types. Appendix D Table D1 gives a more detailed breakdown of the CE populations from the 2011 Census.

Table 1: Quality concerns with CE population collection

All CEs – 1.8% of the 2011 England and Wales total population.

Care home residents – 0.7% of the total population; 38% of the CE population; 2011 response rate 94%; 31% of the 50+ bed space CEs.
  Quality concerns:
    • Population numbers, but not geographically clustered
    • Quality of proxy responses
    • Manager/staff respondent burden
  Comments / alternative sources for QA and estimation:
    • 97% were in 'small CEs' in 2011 and so covered by the CCS and estimation – more will be in 'large CEs' in 2021
    • In 2011, 70-87% of responses were by proxy; likely higher in 2021, but managers are aware of respondent burden
    • Have to accept the quality of proxy responses due to the nature of residents' situation

Students in halls of residence – 0.5% of the total population; 30% of the CE population; 2011 response rate 87%; 37% of the 50+ bed space CEs.
  Quality concerns:
    • Population numbers and geographic clustering
    • Young adult non-response
    • Access to individuals – receipt of initial contact and follow-up
  Comments / alternative sources for QA and estimation:
    • Large increase in private halls since 2011
    • Information direct from establishment? (CE officer liaison, phone)
    • HESA data too lagged for current year – use patterns from previous years?
    • Other sources tbc

Hotel, B&B, guest house usual residents – 0.05% of the total population; 2.5% of the CE population; 2011 response rate 91%; under 1% of the 50+ bed space CEs.
  Quality concerns:
    • Some geographic clustering
    • Understanding 'usual resident'
  Comments / alternative sources for QA and estimation:
    • Area-specific validation
    • 99% were 'small CEs' so included in the CCS and coverage estimation

Holiday accommodation (caravan parks etc.) – 0.005% of the total population; 0.3% of the CE population; 2011 response rate 92%; well under 1% of the 50+ bed space CEs.
  Quality concerns:
    • Geographic clustering
    • Access (time of year)
    • Definitions of usual residence
  Comments / alternative sources for QA and estimation:
    • Area-specific validation; investigating what sources are available

Prison population – 0.1% of the total population; 6.6% of the CE population; 2011 response rate 82%; 7% of the 50+ bed space CEs.
  Quality concerns:
    • Geographic clustering
    • Definitions; poor response rates
  Comments / alternative sources for QA and estimation:
    • MoJ data on basic demographics

School boarders – 0.1% of the total population; 6.4% of the CE population; 2011 response rate 95%; 13% of the 50+ bed space CEs.
  Quality concerns:
    • Geographic clustering
    • Duplication of response (parental home v term-time address)
  Comments / alternative sources for QA and estimation:
    • Alternative data sources being investigated

Armed Forces bases – 0.08% of the total population; 4.3% of the CE population; 2011 response rate 85%; 6% of the 50+ bed space CEs.
  Quality concerns:
    • Geographic clustering
    • Families included in comparator data or not
  Comments / alternative sources for QA and estimation:
    • Use of MoD/USAF data – but note that these sources cannot separate out base v residence address

Addressing

An accurate address frame is essential to enable us to identify communal establishments and target those living in CEs and SPGs with a tailored approach. The frame is based on AddressBase Premium (ABP), which is widely used across both the public and private sectors and is continually updated by GeoPlace. It uses Local Land and Property Gazetteers (LLPGs) in conjunction with a range of address intelligence sources, such as those from the Valuation Office Agency, Royal Mail and Ordnance Survey. Address type classifications from ABP are initially used to establish the type of address (whether a CE or household address and, if a CE, whether it is, for example, a prison or a care home).

Because communal establishments present particular challenges for addressing, the frame initially created from AddressBase is supplemented with further information, which enables us to validate the address frame classifications, establish completeness (to make sure no CEs are missing from the frame) and also add in additional information where needed, such as the number of bed spaces. The following administrative and commercial sources are used for these purposes:

  • Cushman and Wakefield (Student Halls)
  • Care Quality Commission and Care Inspectorate Wales (Care Homes)
  • Ministry of Defence and US Armed Forces (Armed Forces Bases)
  • Ministry of Justice (Prisons)
  • Edubase (Boarding Schools)
  • Ministry of Housing, Communities and Local Government Survey of Traveller Sites (Travelling Persons)

The initial frame is based on an AddressBase extract taken in summer 2020. A further extract is then taken from AddressBase in December 2020 to account for any new CE addresses and for changes in type from household to CE and vice versa. The administrative and commercial sources listed above will be used to identify changes in communal establishments at this time, to ensure that we have the most up-to-date and accurate frame to underpin the 2021 Census and that all CEs in England and Wales receive invitations to take part in the census and the appropriate number of forms.

Where AddressBase and the other data sources are either of uncertain quality or do not provide enough information on addressing or bed spaces, there has been concerted desk-based research and clerical work to enhance the address frame. The Communal Establishment Address Resolution Team (CEART) has undertaken this clerical resolution work from May 2020 to confirm address types and capacity information. This additional enhancement to the address frame will assist with the field operations during live collection and ensure there is appropriate coverage.

A particular challenge to addressing when it comes to communal establishments is obtaining accurate unit-level (room-level) addresses. While initial contact material may be distributed without the need for unit-level addresses, identifying which residents have, or have not, responded during follow-up visits is much more challenging and considerably less effective when unit-level addresses are not available. For some establishments, such as care homes, unit-level addresses may simply not exist, or it may be inappropriate to follow up at this level. For smaller CEs, such as hotels and hostels, unit-level follow-up may not be necessary as there are usually only a small number of usual residents at that address. However, for student halls of residence, which may contain a very large number of students who are unlikely to interact with the CE manager, unit-level follow-up is essential for an effective follow-up process. A decision has therefore been made to invest significant effort in obtaining unit-level addresses for student halls of residence through engagement with universities and local authorities and using commercial data sources. Only establishment-level addresses will be used for other CE types.

Processes have been developed so that any new CEs (including those misclassified as households in the initial frame) identified in the field, or at any time between the delivery of the address frame and the start of field operations, can be added to the address list – including unit-level addresses where appropriate – and so that paper forms or access codes are delivered as appropriate.

Collection Operation

All managers of CEs are asked (and are legally obliged) to complete a 'CE1' form – a short questionnaire asking about the type of establishment, the type of resident it caters for, and the count of usual residents and visitors on Census Day (21 March 2021). All usual residents are requested to complete an individual form, containing the same individual questions as the individual portion of the household form.

Usual residents are defined as anyone who usually lives at the address (including staff) and is expecting to stay there for 6 months or more, or anyone who has no other usual address in the UK. Those expecting to stay for less than 6 months, who also have another address where they are usually resident in the UK, are counted as visitors and should not complete an individual form at the CE address but should be included on a form at their usual address.

Households classified as Special Population Groups (SPGs) complete a standard household form.

The type of establishment or SPG is used to determine whether residents receive an invitation letter with an individual access code to enable them to respond online (with the option of requesting a paper questionnaire if they wish) or a paper questionnaire (including an access code to enable them to respond online if they prefer). The decision around the mode of initial contact is determined by respondent need and the likely impact on response, whilst encouraging people to respond online if they have the means to do so.

In contrast to standard household addresses, where initial contact letters or paper questionnaires are posted out, initial contact materials are all hand delivered to CEs by trained communal establishment census officers. Some SPGs have their initial contact materials hand delivered where additional engagement with the group may be necessary or where no permanent address is available to post to, but other SPGs, such as royal households and embassies, receive their initial contact letters through the post. See Appendix A for a summary of each of the CE and SPG types and the method of delivery and type of initial contact.

Wave of Contact for CEs and SPGs

The timeline of interactions with the public, known as the 'wave of contact', for CEs and SPGs follows as far as possible the household wave of contact (see Appendix B). All residential addresses, including CEs and SPGs, receive an unaddressed postcard delivered from 4 weeks before Census Day to raise awareness of the census. Hand delivery of initial contact material starts from 4 weeks before Census Day, with priority delivery to university halls of residence to ensure delivery before the end of term and the start of the Easter holidays. Communal establishment officers 'own' a specific set of SPG and CE addresses and make follow-up visits after delivery of initial contact material throughout the field operation, which ceases three weeks after Census Day. Residents in communal establishments do not receive reminder letters by post, as this would not be consistent with the hand-delivery approach, but some SPG addresses do receive a reminder letter in the first week after Census Day if they have not yet responded.

Response Rate Targets

Targets for response rates to be achieved by the end of the collection period enable us to drive the operation to maximise response and to track progress towards achieving sufficient response to produce estimates that meet user needs. Tracking progress towards targets also enables us to direct interventions where needed and ensure that resources are placed where they are needed most. However, for such targets to be useful, they need to be operationally feasible and need to take into account some of the challenges faced by census officers when following up non-responding CE residents, whilst also balancing the need for sufficient response within each establishment to enable individual records to be imputed without bias. The imputation process only ever adds individual records to a CE, based on the characteristics of those who have responded from within that establishment, and entire CEs are never imputed. To enable imputation, it is vitally important that data are collected from each CE on the number of usual residents, either through completion of the CE1 form by the CE manager, or through completion of a 'dummy form' by a CE officer where a CE1 form has not been received. A low response rate from an individual CE would increase the risk of non-response bias and could therefore affect the quality of the imputation and subsequent published estimates. A target has been set to achieve establishment-level information for 100% of CEs, via either a CE1 form (preferably) or a CE dummy form if no CE1 form has been returned.

To take into account the varying availability of unit-level addresses and administrative data sources as well as differing operational challenges across establishment types, four priority levels and associated targets have been set for CEs and SPGs.

Category 1 (low priority) addresses include CEs and SPGs where follow-up visits are likely to be less effective, access is likely to be restricted, or where higher quality admin data mean that this group is a low priority for follow-up. CEs in this group include military bases, prisons and secure establishments. No specific response rate target is set for this group, but one initial visit must be made to successfully deliver initial contact materials and at least one follow-up visit conducted, including contact with the CE manager.

Category 2 (medium priority) addresses include CEs and SPGs where a high level of response is desirable but where multiple follow-up visits may not be appropriate. These include royal households, embassies, hospitals, hospices, children’s homes and boarding schools. Similarly to category 1, no hard target is set for these CEs or SPGs, but at least two follow-up visits (including contact with the CE manager) must be made after delivery of initial contact material.

Category 3 (high priority) addresses include the majority of CEs and are where the majority of resources will be invested to achieve a high level of response. These include student halls of residence, care homes, staff accommodation, hotels, hostels and religious establishments. For these CEs, a minimum of 75% of residents at each establishment must have returned a census questionnaire, with an overall response rate across all CEs in this category of 80%. Follow-up visits must be made continuously throughout the field operational period and should not cease before the end of that period unless responses from all residents have been received. However, as follow-up activities are contingent on being granted access to the establishment, cases where access is refused and the manager refuses to complete a CE1 form will be escalated for separate management and will not be included in the response rate calculations for this category. This avoids resources being diverted from other areas to tackle what could seem like poor overall response in one area but is in fact caused by low or no response from a single CE, for which additional resources would not be helpful and could be detrimental to other areas.
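The sketch below (illustrative Python; the CE records and figures are invented) shows how the establishment-level and overall checks for this category could be computed, with escalated access refusals excluded from the rates as described above:

```python
# Illustrative Category 3 target checks: flag CEs below the 75% establishment-level
# target and compute the overall category response rate, excluding escalated refusals.
ces = [
    {"name": "Hall A",      "expected": 400, "responses": 340, "escalated": False},
    {"name": "Care home B", "expected": 60,  "responses": 42,  "escalated": False},
    {"name": "Hostel C",    "expected": 35,  "responses": 0,   "escalated": True},  # access refused
]

in_scope = [ce for ce in ces if not ce["escalated"]]

below_75 = [ce["name"] for ce in in_scope if ce["responses"] / ce["expected"] < 0.75]
overall = sum(ce["responses"] for ce in in_scope) / sum(ce["expected"] for ce in in_scope)

print("CEs below the 75% establishment-level target:", below_75)
print(f"Overall Category 3 response rate: {overall:.1%} (target 80%)")
```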

Category 4 is used to denote SPGs where a bespoke approach is required and no specific targets are set for either response rates or the number of follow-up visits. These include day and night centres for the homeless and rough sleepers and other transient groups such as continuous cruisers, fairs and circuses and transient Gypsy and Roma travellers.

Note that all response rate targets are for valid responses, as opposed to any return from an address, which may include blank or incomplete questionnaires. However, for operational purposes, return rates will be used as a proxy for response rates to enable timely tracking of progress towards achieving those targets during the live operation.

CE & SPG Return Profiles

Return profiles have been created to enable the tracking of progress towards targets and to enable identification of any problems early in the operation so that interventions can be put in place to mitigate any shortfalls in expected levels of response.

The proposed 2021 CE & SPG return profiles are presented in Appendix C. The profiles are built from the observed 2011 CE returns, adjusted to reflect the current CE operational design, in particular the switch to online-first for some CE types.

Three profiles are given:

  • Student return profile – representing university halls of residence.  These CEs will receive invitation letters containing access codes and are expected to be majority online returns.
  • Care home return profile – representing care home residents. Care homes will receive paper questionnaires and are not expected to make online returns.
  • Overall profile for high priority CEs – incorporating returns from all CEs and SPGs in ‘Category 3’ (including halls of residence, care homes, hotels, hostels, staff accommodation, religious establishments, marinas and caravan parks)

It is important to understand that the profiles link to the agreed 2021 operational targets for CEs and SPGs, and effectively present the predicted route towards achieving the minimum acceptable level of returns. However, the collection operation will strive to maximise response as far as possible and will not cease once the targets are met. Students in halls of residence and care home residents together made up approximately 77% of the CE population in 2011. The size of the population in these two groups and their divergent characteristics make it necessary to differentiate the profiles for these groups. Response patterns for other individual CE types, however, are likely to be too volatile, due to their smaller numbers and dependence on the timing of interactions with CE officers, for separate profiles to be produced with sufficient confidence to be of practical value.

Making use of the CE & SPG return profiles in the collection operation requires a strategy for monitoring progress against them.  Key to this proposal is having a dedicated analytical function for CEs and SPGs, resourced to monitor and appraise the progress of CE and SPG collection during live operations. This will provide capability to evaluate the operation in the context of the CE operational and statistical design. Monitoring CEs & SPGs during live operations will be done through analysis of MI data by the CE analytical function within the Census Statistical Design team.  This will comprise tasks to identify areas for concern that need to be flagged for further consideration:

  • Ranking student CE returns by LA – to highlight LA differences in return rates for students in CEs, picking out areas where returns are lower than expected. Student CEs have a greater impact on the population base in some LAs.
  • Ranking care homes by LA – to highlight LAs with lower return rates.
  • All CE and SPG types individually nationally – to provide an overview of the effectiveness of the collection process, again to highlight any issues specific to a type of CE or SPG which may require a change of approach.
  • Category 1 CEs and SPGs – to monitor the number of visits against expected, and highlight action where needed if visits are insufficient. Return rates reported for information (no action expected).
  • Category 2 CEs and SPGs – to monitor the number of visits, and action where needed. Return rates reported (no action expected).
  • Category 3 CEs and SPGs – to monitor return rates and action these further where needed.
  • Category 4 SPGs – to report return rates, with action only by exception.

Monitoring is proposed to be done on a weekly MI cycle that allows for receipt of paper responses to be reflected.  If further action is deemed necessary based on the monitoring activity, then intervention may be deemed appropriate.  Interventions available to the CE collection are:

  • Increase field staff hours – for CE field officers, to tackle specific issues where CE returns do not match expected progress, by local area and CE type.
  • Performance management – where outcomes are not as expected against CE returns by local area and/or CE type, and comparison with other CE officers.
  • Additional staff from household – to boost field hours beyond that possible with original CE officers, in reaction to a specific challenge in a local area.

The statistical design for CEs in the collection operation, including the response rate targets and return profiles, and the operational design developed to meet those targets have been developed in line with the 2021 Census quality objectives to ensure that sufficient data of high enough quality are obtained during the collection operation to enable the production of high quality census estimates to meet user needs.

Coverage Estimation

Small CEs (up to 49 bedspaces in 2021)

In 2011 these were processed region by region, but even so we had to collapse categories (age, sex and often type of CE) to obtain a large enough sample for the dual-system estimation process to be robust. For 2021, we are investigating using the modelling approach being used for household estimation.
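For readers less familiar with dual-system estimation, the sketch below (illustrative Python with invented counts for one collapsed category; the production methodology involves considerably more than this) shows the basic capture-recapture calculation that combines census and CCS counts:

```python
# Minimal dual-system (capture-recapture) sketch for one collapsed category of
# small-CE residents, using invented counts.
census_only = 180    # counted in the census but not matched to the CCS
ccs_only = 40        # counted in the CCS but not matched to the census
both = 760           # matched in both sources

census_count = census_only + both
ccs_count = ccs_only + both

# Lincoln-Petersen estimator of the true population in the cell
estimate = census_count * ccs_count / both
undercount = estimate - census_count

print(f"Dual-system estimate: {estimate:.0f}; implied census undercount: {undercount:.0f}")
```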

For information, Appendix D Table D2 contains the counts, estimates and adjustments of usual residents in small CEs, with response rates, by establishment type, extracted from ONS (2012).

Notes from Oct 19 panel session on CEs:

Small CE geographic distribution was discussed, leading to the panel recommending ONS consider how the non-uniform nature of this impacts outcomes when using the current CCS sampling methodology

This is one of the things the modelling should help with. We are not drawing a specific sample for CEs – it is too problematic, as they tend to be geographically clustered.

Large CEs (50+ bedspaces)

Capacity threshold

In 2011 the Large CE estimates were for 100+ bed space establishments. We have now lowered this threshold to 50+, because:

  • The CCS is too difficult to administer (face to face interviews) for 50+ residents
  • The number of CEs in the 50-99 bed space group, that may also be in the sample, is not large enough to warrant the additional field effort
  • Applying the weights to all 50-99 bed space CEs from calculations mostly done on 1-49 bed space CEs isn’t as robust as for the smaller CEs
  • The benefit gained from these additional CEs in the Estimation calculations is not enough to warrant the additional field effort, and
  • We anticipate having more admin data on these establishments anyway, to help with the estimation of undercount.

The threshold itself is to a certain extent arbitrary, and bed spaces only predict the number of residents, but Field needed a round number to work with in the operation.

Expected number of 50+ bed space CEs (See Table 2 below)

  • In 2011, there were 2,157 100+ bed space CEs, 917 of which were flagged for checks; changes were made for 728
  • There were an additional 2,300 50-99 bed space CEs, which would now be in scope of this assessment
  • Currently estimating that approximately 4,600 care homes and halls of residence have 50+ bed space capacity (up from 3,000 in 2011) – these are the main groups of large CEs
  • Estimates of the capacity of other types is still in progress

Table 2: Number of CEs by bed spaces, 2011 Census[1] and estimated for 2021

Establishment Type | 100+ CEs added to in 2011 | 100+ CEs assessed in 2011 | 100+ bed spaces 2011 | 50-99 bed spaces 2011 | 50+ bed spaces 2011 | Likely 50+ bed spaces 2021
Student Halls | 320 | 421 | 1086 | 592 | 1678 | 4600
Residential/nursing home | 4 | 9 | 131 | 1263 | 1394 | -
Boarding School | 31 | 68 | 405 | 166 | 571 | tbc
Prison | 86 | 123 | 236 | 92 | 328 | tbc
Armed forces base accom. | 38 | 84 | 201 | 59 | 260 | tbc
Homeless hostels | 4 | 10 | 24 | 56 | 80 | tbc
Other Ed. Establishments | - | - | - | 36 | 39 | tbc
Hospitals | - | 10 | 25 | 10 | 35 | tbc
Unknown | - | - | - | 27 | 29 | tbc
Children's homes | - | - | 12 | 4 | 16 | tbc
Total | 483 | 728 | 2125 | 2305 | 4430 | tbc

Note: small cell sizes have been suppressed, so visible columns may not sum to the totals shown.

Prioritisation for investigating and adjusting

For context, Appendix D Table D3 shows the counts, estimates and adjustments made in large CEs in 2011.

In 2011, we prioritised which CEs to investigate, and to add to, based on the percentage of returns received (if we received fewer than 75%) and on the number received (if we were more than 50 short of those issued or, where we had an alternative figure, if that figure was 50 or more higher than the number received). In most cases, we phoned the establishments to get their estimate of residents, and in some we used the available administrative comparator data. More detail is available in ONS (2012).
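A hedged sketch of those prioritisation rules is given below (illustrative Python; the thresholds are the 2011 values quoted above and the example figures are invented):

```python
# Illustrative flagging of large CEs for investigation, using the 2011-style rules:
# fewer than 75% of issued forms returned, more than 50 forms short of issued, or a
# comparator figure at least 50 higher than the number of returns received.
def flag_for_investigation(issued, received, comparator=None):
    reasons = []
    if received < 0.75 * issued:
        reasons.append("under 75% of issued forms returned")
    if issued - received > 50:
        reasons.append("more than 50 forms short of issued")
    if comparator is not None and comparator - received >= 50:
        reasons.append("comparator source 50+ higher than returns")
    return reasons

print(flag_for_investigation(issued=300, received=200, comparator=290))  # all three reasons
print(flag_for_investigation(issued=120, received=110))                  # no reasons, not flagged
```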

We are planning to re-evaluate the prioritisation done in 2011 – the arbitrary cut-offs of 75% of expected returns or a deficit of 50. In the first instance this will be done by re-examining the 2011 data, but we will also make the assessment based on the live 2021 situation. In 2021 there will be more information, more readily available, about the collection operation from the electronic Response Management and Fieldwork Management Tool, which should feed into this.

Appendix D Table D4 shows the breakdown of large CE resolution in 2011: although there was some indication of undercount in all of these, there wasn’t the evidence to back up an adjustment in all of them: 31% of those investigated had no adjustment applied. This will be investigated further, but it is likely to be due to an assessment of the definitional differences in the sources.

Use of alternative sources to estimate large CE population

For 2021, we are investigating how well administrative sources cover the establishments. If we can’t get appropriate sources our contingency plan is to ask CE managers for their count of residents, or to phone the establishments ourselves as we did in 2011. Table 3 lists the types of establishments and potential data sources we are pursuing. This work is still in progress.

Table 3: Type of establishment v admin source[2]

Type: Care homes; other medical establishments; children's care homes
Potential data sources: NHS Personal Demographic Service (PDS); Care Quality Commission (CQC); Care Inspectorate Wales (CIW); Public Health England (PHE)
Comments / work in progress:
  • We know there are issues with patient data and care homes, from when people go into hospital and stay a long time, or die there – where does death registration note them as living, and does it link up with the PDS record if that is still at the care home?
  • What if the PDS record is at the family home rather than the care home?
  • CQC, CIW and PHE do not have age/sex, just bed spaces

Type: School boarders
Potential data sources: English and Welsh School Censuses (ESC, WSC); School level annual school census (SLASC); Stats Wales; Independent School Census
Comments / work in progress:
  • What quality assessment has been done on these?
  • Can we compare against PDS? Will boarders be registered on PDS at the boarding location?
  • Independent School Census will not release data at detailed (establishment) geography

Type: University students
Potential data sources: HESA; HESES and patterns of term-time accommodation / halls bed spaces
Comments / work in progress:
  • HESA data too lagged; asking for special delivery for 2021, but this is only likely to be useful if we are very behind with processing. Uncertain of quality beyond first-year undergraduates.
  • Possibility of getting aggregate HESES data and applying distributions of where students are usually located when at term addresses – still to be investigated.

Type: Home armed forces
Potential data sources: Ministry of Defence (MoD)
Comments / work in progress: Data by base – how to separate out those living elsewhere? Dependence on field information.

Type: Foreign armed forces
Potential data sources: United States Armed Forces (USAF)
Comments / work in progress: Data by base – how to separate out those living elsewhere? Dependence on field information.

Type: Prisoners; immigration removal centres
Potential data sources: Ministry of Justice (MoJ) – data by prison, postcode, age/sex and length of sentence; Home Office
Comments / work in progress:
  • MoJ data on prisoners going ahead ok; able to replicate census definitions ok.
  • Does MoJ include probation bail and any other detention centres?

Type: Caravans/Travellers
Potential data sources: Gypsy and Traveller Caravan data from Stats Wales; sources from the Ministry of Housing, Communities and Local Government (MHCLG)
Comments / work in progress: Investigating sources of data.

Use of donors in Coverage Adjustment

The Coverage Adjustment team have investigated the over-use of donors in 2011:

  • Confirmation that there was over-use of donors in 2011, especially for large CEs (less of an issue for small CEs)
  • This was usually caused by the constraints in the estimates – given the age/sex and type targets, there simply weren’t sufficient donors available
    • For 2021, we are considering using nearest-neighbour groups (a sketch of this search order follows this list):
    • a slightly different age group within the CE,
    • or a similar CE type geographically close,
    • or if it’s a mixed-sex CE, to use a female when a male is needed (and vice versa)
  • Hoping to use combinatorial optimisation for CE imputation. When we’ve used CO for households, we’ve found that it doesn’t duplicate on donors as much as the 2011 method.
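The sketch below is a minimal illustration of that widening donor search (the record structure, fields and ordering are hypothetical, not the production imputation system); as noted above, the production approach is expected to use combinatorial optimisation, which this sketch does not implement.

```python
# Illustrative donor search that widens the pool in the order suggested above:
# same CE and age/sex group first, then an adjacent age group within the CE,
# then a similar CE type nearby, then the opposite sex within a mixed-sex CE.
def find_donors(records, ce_id, ce_type, age_group, sex, adjacent_ages, nearby_ces):
    searches = [
        lambda r: r["ce"] == ce_id and r["age"] == age_group and r["sex"] == sex,
        lambda r: r["ce"] == ce_id and r["age"] in adjacent_ages and r["sex"] == sex,
        lambda r: r["ce"] in nearby_ces and r["ce_type"] == ce_type
                  and r["age"] == age_group and r["sex"] == sex,
        lambda r: r["ce"] == ce_id and r["age"] == age_group,  # opposite sex in a mixed-sex CE
    ]
    for rule in searches:
        pool = [r for r in records if rule(r)]
        if pool:
            return pool
    return []

responders = [
    {"ce": "CE1", "ce_type": "care home", "age": "85+",   "sex": "F"},
    {"ce": "CE2", "ce_type": "care home", "age": "75-84", "sex": "M"},
]
print(find_donors(responders, "CE1", "care home", "75-84", "M",
                  adjacent_ages={"85+"}, nearby_ces={"CE2"}))
```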

Coverage Adjustment will be coming back to MARP in September, so will be able to expand on this then.

References

2011 Census: Estimation and Adjustment for Communal Establishments (ONS, 2012)

Appendixes can be found in the attached downloadable document.

EAP140 – Forecasting return rates in the 2021 Census

Written by Brendan Davis and Orlaith Fraser

Background

During the 2019 Rehearsal, the Response Chasing Algorithm (RCA) compared live and expected return rates to identify areas with shortfalls in response; to assign each a RAG status; to rank the shortfalls; and to propose interventions to mitigate them. The expected return rates were an output from the Field Operations Simulation (FOS) model, produced for each Lower layer Super Output Area (LSOA). This provided a daily measure of progress against targets up to and including the day of reporting.

In addition, a simple forecasting measure was also used to estimate the future and final return rates at the end of the operational period. The forecasting measure derived the ratio of final expected return rate to current expected rate, and applied this ratio to scale the current live return rate to estimate a final rate. This method explicitly assumed perfect operational conditions from the next day forward; took no account of past performance as an indicator of future performance; and made no adjustment for deficiencies in the assumptions used in the modelling used to produce the expected return rates.
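In formula terms, that Rehearsal measure was simply forecast = live rate × (final expected rate / expected rate today); a minimal sketch (illustrative Python with invented rates) is shown below:

```python
# Illustrative only: the Rehearsal forecasting measure scaled the live return rate
# by the ratio of the final expected rate to the expected rate for today.
expected_today = 0.52   # FOS expected return rate for this LSOA, as of today
expected_final = 0.94   # FOS expected return rate at the end of the operation
live_today = 0.47       # observed live return rate today

forecast_final = live_today * (expected_final / expected_today)
print(f"Forecast final return rate: {forecast_final:.1%}")
```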

The RCA method and implementation used in the Rehearsal was successful in its core purpose to recommend interventions and report day-to-date progress. The RCA will be used in a similar fashion in 2021.

However, an improved forecasting method is needed to support the RCA – one that improves the accuracy of forward estimation of future performance; one that provides a view of progress that takes into account both the past and future operational capacity, and the differences in respondent behaviours compared to those assumed in the FOS modelling.

The proposed method will use an adapted version of the FOS, running at the end of every day and incorporating emerging evidence: previous return patterns; the effectiveness of the deployed field resources to date; the projected field resources in the future; and an assessment of respondent behaviour relative to the behaviours previously assumed, adjusting accordingly. In effect, this means reproducing daily future expected return values similar to those produced prior to the start of collection, learning from past performance and accounting for any deviation from assumptions by incorporating all new operational information as it emerges.

Discussion

2.1 The Field Operation Simulation Model

The FOS has been used to simulate the 2021 Census collection operation; inform the design of the wave of contact; and help to determine the estimated field and reminder resources required to meet the key census return rate targets. Additional outputs from the FOS include daily expected return rates at Lower layer Super Output Area (LSOA) level and the number of expected field visits at similarly local levels.

The FOS uses a set of statistical and operational inputs and implicit assumptions to model the expected behaviour of households and Census Officers during the collection period. These include, but are not limited to, the willingness of respondents to self-respond; propensity to respond / switch to respond via paper questionnaire; effectiveness of reminder letters in prompting a response; contact rate of field staff; and field visit durations.

2.2 FOS performance in the 2019 Rehearsal

Analysis of the FOS assumptions, as used for the 2019 Rehearsal, shows that, broadly, where the volume and quality of rehearsal data were sufficient for assessment, the assumptions agree well with observed operational values. For example, reminder effectiveness shows good agreement across reminder and hard-to-count groups; field contact rates show broad agreement, with some local variation; and the profile of the propensity for paper response switching is as expected, given the barriers to switching and the voluntary nature of the rehearsal. A sample of assumption validation evidence is shown in Annex A.

2.3 Extending the FOS

Based on its 2019 performance, the FOS is a robust and valid basis for adaptation and extension to the task of forecasting future return rates during the collection operation. With sufficient modification it should be possible to move beyond a single-use model with one set of static outputs, to a streamlined model that adjusts dynamically as the facts on the ground change.

The choice of the FOS to determine forecast future returns also provides the means to produce dynamic sets of data to benchmark and monitor field performance – rather than a single static set that takes no account of practical constraints as they emerge, this approach will adjust to shortfalls in resource or variations over time or at a local level.

2.4 Approach

The assumptions used in the FOS (excluding field visit durations and field travel times) are stratified by Hard to Count (HtC) group. Two assumptions – self-response willingness and propensity for paper response – are further stratified within HtC by 10 age/sex categories. In order to preserve this second level of stratification during live operations it would be necessary to have timely and reliable response (not return) data in order to discount households in each age/sex category from the outstanding households that have yet to respond. The lag in availability of response data means that this is not recommended, and instead a simplified set of assumptions, stratified by HtC only, will be used. Prior to the final adoption of this approach, testing will be completed to ensure that this small element of simplification does not introduce significant deviation from the baseline level of response in any local area.
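As an illustration of that simplification (hypothetical assumption values and household shares, not the actual FOS inputs), assumptions stratified by HtC group and age/sex category can be collapsed to HtC-only assumptions by weighting each category by its share of households:

```python
# Illustrative collapse of age/sex-stratified self-response willingness assumptions
# to HtC-only assumptions, weighting by each category's (assumed) household share.
willingness = {  # HtC group -> {age/sex category: assumed willingness}
    "HtC1": {"F 16-34": 0.88, "M 16-34": 0.85, "F 65+": 0.95, "M 65+": 0.94},
    "HtC5": {"F 16-34": 0.60, "M 16-34": 0.55, "F 65+": 0.78, "M 65+": 0.76},
}
category_share = {"F 16-34": 0.25, "M 16-34": 0.25, "F 65+": 0.27, "M 65+": 0.23}

collapsed = {
    htc: sum(category_share[c] * w for c, w in rates.items())
    for htc, rates in willingness.items()
}
print(collapsed)
```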

2.5 Method

The main adjustments to the inputs that will be considered are in the assumptions of respondent and field behaviour, and the level of field resource available at a local level.

Specifically, the following assessment and adjustment of assumptions are proposed:

  1. Reminder effectiveness – no adjustments are proposed – the effectiveness assumed was proven to be robust in the Rehearsal; monitoring and adjustment during live operations is impractical because of the volume and phasing of reminders; and the complexity involved makes the differentiation of reminder and field interactions impractical.
  2. Propensity for paper response – based on rehearsal evidence, these assumptions appear to be robust and valid – it is anticipated that adjustments should not prove necessary. However, consideration will be given to supplementing HtC stratification with a degree of geographical stratification. This should capture and account for high concentrations of some demographic groups in which some extreme variation in propensity for paper response may be evident.
  3. Field contact rates – high likelihood of adjustment – these are likely to vary over time as the field force 'beds in'; there is likely to be significant variation from the rates observed in the Rehearsal (improved training and onboarding of staff, etc.) and from the underlying data from which the assumptions were derived; and there is a potential effect from differences in how often, and when, people are at home, should COVID-related changes in home-working patterns persist well into 2021.

Field contact rates will be assessed starting from the Tranche 2 field start date, but no adjustments will be made until at least the start of Tranche 3, and potentially not until one week after that. Subsequent frequency for adjusting rates is proposed to be at least weekly, and no more than twice weekly – more frequent adjustment, especially early in field operations may create significant volatility in day-to-day forecast changes.

  4. Willingness to self-respond – variation from assumed levels of self-response will be assessed from the start of operations, but no adjustments will be made prior to Census weekend. The phased delivery of initial contacts in the first 10 days, and the high probability of variation in levels of response in the first weeks, mean that adjusted forecasts produced in the early self-response period would exhibit day-to-day fluctuation and regional variability. Instead, analysis of self-response against the assumed levels will be conducted and reported as part of business intelligence measures, through governance channels. From the early days of operations, estimates of the likely number of reminders needed in each phase will be reported, with measures of confidence in the likelihood of staying within the capacity constraints of each phase. Adjusted forecast return rates, incorporating changes in willingness assumptions, will be produced from Census Day onwards. Self-response willingness will be monitored after Census Day by continuous analysis of response levels from all households that have yet to receive field visits or reminders.

The availability of field resources compared to the planned capacity will be monitored as part of the field recruitment process. In the final phase of recruitment – as numbers should be approaching capacity – any shortfalls at a local level will be translated into an early ‘live’ field capacity measure, and used to model operations under reduced capacity scenarios. This will allow for early end-to-end operational forecasting from mid-February 2021, and once the initial contacts land, this early forecast will fold into full reporting of progress against expected values and full operational forecasting.

2.6 Contingent scenarios

In addition to the explicit monitoring and adjustment of assumptions, and the impact of field capacity variations, the use of the FOS during live operations also offers the potential to accommodate and adjust for the impact of other unplanned local factors affecting operations or respondents’ ability to respond – for example, local restrictions on movement and interaction due to ongoing COVID-related regulations. The means to accommodate and implement ad-hoc adjustments will be defined as part of the work to implement the adjustments already described above.

2.7 Outputs

  • Mid-February – early estimates of forecast vs. expected return rates for the full operational period, reflecting the reduced field capacity scenario (if applicable).
  • 01 March – next day+ return rate forecast incorporating field capacity variation from expected, based on latest recruitment status.
  • 08 March – assessment of live self-response levels vs. expected; confidence of remaining within threshold capacity for reminder letters in each phase.
  • 22 March – improved next day+ return rate forecast incorporating intelligence derived from analysis of self-response levels and ongoing fluctuations in field capacity.
  • 23 March – improved expected field visit numbers, incorporating latest field resource levels.
  • 24 March – early indications of likely variation in field performance from expected.
  • 30 March/06 April – improved next day+ return rate forecast incorporating intelligence derived from analysis of field performance variation from expected.

Conclusion

This paper has outlined the proposed method to produce improved return rate forecasts and associated intelligence during live operations, using a modified Field Operation Simulation. This approach offers clear benefits compared to the approach used in the 2019 Rehearsal, and includes the potential to flexibly adapt to unplanned scenarios.

Members of the board are invited to review the proposed method, note the changes and improvements in the approach compared to the 2019 Rehearsal, and endorse the further development of processes and systems to implement the proposed method.

List of Annexes 

Annex A: Validation of Field Operation Simulation assumptions

(Annex can be found in attached downloadable document)

References

Ward, K., Barber, P., Priestley, M., Fraser, O. (2019) EAP117: Simulating Census Operations to Inform Resourcing Decisions. https://www.statisticsauthority.gov.uk/about-the-authority/committees/methodological-assurance-review-panel-census/

EAP139: Methodology for assigning Red-Amber-Green status for 2021 Census returns

Authors: Alexandra Christenson and Torunn Jegleim

Purpose 

It is common practice in performance management contexts to assess the quality of an outcome using a “traffic-light” system: Red, Amber or Green – also known as a RAG status. This paper provides the outline of the proposed RAG status methodology for live returns during the 2021 Census collection.

Background

During the 2019 Rehearsal the Response Chasing Algorithm (RCA) compared live and expected return rates to identify areas with shortfalls in returns (Meirinhos, 2019b). Shortfalls were assigned a RAG status, which was visible in the 2019 RCA Dashboard (See Annex F). However, the evaluation of the 2019 Rehearsal concluded that the RAG status needed to be further developed to be more informative.

The aim of the improved RAG status methodology is to:

  1. Give an overview of how the collection operation is doing in comparison to the census quality targets
  2. Flag what the issues that need actioning are, and where: low response and/or high variability, depending on geography level

In developing the new methodology, Census Statistical Design (CSD) consulted other business areas within the ONS (Question and Questionnaire Design, and Methodology) and other national statistics agencies: Stats Canada, Stats NZ, the US Census Bureau and the Australian Bureau of Statistics.

Discussion

For the 2021 Census the ONS has committed to achieving key quality targets: an overall response of 94%, at least 80% response in each local authority, and minimised variability – proposed as 90% of LSOAs in an LA falling within 10% of the response mean (Martyna, 2020). To understand whether we are on track to reach these targets, a tool measuring this is needed.

The RAG status is designed to act as a decision support tool for the governance of the census collection operation. The RAG status will be widely visible in the forthcoming 2021 Census data dashboard, which is planned to be shared across teams and used in daily governance meetings. It is therefore imperative that the RAG status methodology is fit for purpose: flagging issues that need actioning and being transparent about how issues are flagged.

For the 2021 Census, the following is proposed:

Geography level | Overview of proposed methodology
Lower Super Output Area | RAG status determined by shortfall in live vs expected returns. Proposed thresholds in Annex A.
Team Leader Area | RAG status determined by shortfall in live vs expected returns. Proposed thresholds in Annex A.
Local Authority | RAG status determined by return shortfall and variability in return rates within the local authority. Relative importance of return rate and variability is adjusted throughout the operation (see Annex A). Combined to create a single RAG status (see Annex A).
Regional | Average RAG status score for the LAs making up the region. Proposed final scores in Annex A.
National | No coloured RAG or calculation – but show key figures for: the forecasted overall return rate for England & Wales; and the number of LAs forecasted to reach 80% overall response out of the total number of LAs.
Online | Monitor the online proportion of response, and RAG status this against targets at local authority and national level.

Lower Super Output Area (LSOA) and Team Leader Area (TLA) RAG status

The predictive modelling and maximising response strategies are conducted and targeted at LSOA level, indicating a need to monitor returns at that level. TLAs are the operational geographies for field staff, representing the work area of up to 12 census field officers. The RAG status at LSOA and TLA level will be a simple measure of live versus expected returns against the thresholds outlined in Annex A.

The Field Prioritisation Algorithm (Meirinhos, 2019a) will be working at OA level to minimise variability within each LA; implicitly, working to improve response in the worst performing OAs will also reduce the spread within and across LSOAs. However, there are no explicit quality targets for variability within LSOAs and, given the (average) number of LSOAs per LA, it would be neither practical nor informative to measure it, so variability is not included in the RAG status at LSOA level.

Local Authority (LA) RAG status

Given the census LA variability and response targets, this geography provides a sensible level at which to introduce an enhanced calculation to determine RAG status. With 336 LAs across England and Wales, this RAG status will be crucial to flag issues for action: interventions or further ad-hoc analysis.

The RAG status at LA level will be determined by two components:

  1. Return Rate Difference (RRD): measured as the difference between live and expected returns
  2. Variability (V%): measured as the proportion of LSOAs in the LA with a return rate falling within 10% of the return rate mean for the LA (Martyna, 2020).

Within each LA, each component is assigned a daily score from 1 (best) to 3 (worst) based on proposed thresholds (Annex A). Each component's score is then multiplied by a daily weight reflecting the stage of the collection operation, giving the final equation:

(RRD score x daily weight) + (V% score x daily weight) = Final RAG score

The purpose of the weights is to show accurately which issues can be actioned. For example, until field staff go live, we have no means by which to target variability issues, so flagging a potential problem before this point is redundant. As the weights always add up to 1.0 (or 100%), the range of possible final scores will always be between 1 and 3 (final score thresholds in Annex A).

We propose that the component measuring variability will have a low weight (0.1) until tranche 2 field staff commence work (Census Day + 2), at which point the weights begin to gradually change until the two components are at an equal weight of 0.5 three weeks before the end of collection. The last three weeks will have a constant equal weight of 0.5 applied to both components (see table in Annex A).

In a hypothetical scenario, these would be the results:

Component | Day | Value | Score | Weight | Weighted score | Final score
RRD | 7  | 0.2  | 1 | 0.9 | 0.9 | 1.2
V%  | 7  | 0.86 | 3 | 0.1 | 0.3 |
RRD | 50 | 0.2  | 1 | 0.5 | 0.5 | 2.0
V%  | 50 | 0.86 | 3 | 0.5 | 1.5 |

Whilst the component values remain the same in this scenario, the changing weights place more emphasis on V% on day 50 compared to day 7, bringing the final score up from 1.2 to 2.0 and shifting the RAG status from green to amber. This is not to say that variability issues are unimportant prior to Census Day + 2, but they are not heavily weighted because they cannot yet be actioned.
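
To make the weighting concrete, the sketch below reproduces the hypothetical scenario above in Python. The green/amber/red score bands used here are illustrative placeholders only; the proposed thresholds and weight schedule sit in Annex A.

    # Illustrative sketch of the LA RAG score calculation described above.
    # The score bands used by rag_label() are hypothetical placeholders;
    # the proposed thresholds and weight schedule sit in Annex A.

    def final_rag_score(rrd_score, v_score, rrd_weight, v_weight):
        """(RRD score x daily weight) + (V% score x daily weight)."""
        assert abs(rrd_weight + v_weight - 1.0) < 1e-9, "weights must sum to 1"
        return rrd_score * rrd_weight + v_score * v_weight

    def rag_label(score):
        if score < 1.5:            # hypothetical band
            return "green"
        if score < 2.5:            # hypothetical band
            return "amber"
        return "red"

    # Day 7: variability barely weighted before tranche 2 field staff start.
    day7 = final_rag_score(rrd_score=1, v_score=3, rrd_weight=0.9, v_weight=0.1)
    # Day 50: equal weights once variability can be actioned.
    day50 = final_rag_score(rrd_score=1, v_score=3, rrd_weight=0.5, v_weight=0.5)

    print(round(day7, 2), rag_label(day7))    # 1.2 green
    print(round(day50, 2), rag_label(day50))  # 2.0 amber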

In determining the weighting strategy and thresholds, special attention has been paid to ensuring that the weights and thresholds minimise RAG status volatility over time. The above methodology has been tested using 2019 Census Rehearsal data as well as predicted data for the V% from the Field Operation Simulation (FOS) (Ward, et al., 2019).

Other approaches for Local Authority RAG status:

Alternative approaches to calculating an LA RAG status, such as using flat weights or using a risk impact table instead of a final score, have been explored (Annex C, Annex D, Annex E).

However, the simulations for a risk impact table flag issues as red from the first day of collection, both in the FOS output data (Ward, et al., 2019) and in the rehearsal data (Annex D, Annex E). Furthermore, the simulation using flat weights either flags everything as green (although we know this was not the case during the rehearsal) or flags everything as amber/red long before we are able to take action to rectify the issues (Annex C).

This suggests that neither approach is suitable given the purpose of the RAG status.

Regional and National RAG status:

On the regional level, the proposed approach is to calculate the average final RAG scores for the LAs belonging to the specific region. Following this approach, the regional RAG status considers the same components as the LA RAG status without the need to aggregate the measures or change thresholds. The final score RAG will follow the same thresholds as the LA (See Annex A).

To track the overall progress of the census collection operation, we propose not to provide a coloured RAG or calculations. Instead, viewers will have three measures indicating progress against the overall and local authority response targets and the variability target: the overall final forecasted return rate, the number of LAs forecasted to reach the 80% response rate, and the number of LAs reaching the variability target (Martyna, 2020). The ad-hoc team in CSD will also be available for more thorough weekly analysis of the national picture.

Online RAG status:

We will monitor the online proportion of response, and RAG status this against targets that, at a local level, consider the proportion of paper questionnaire initial contacts and an expected level of mode switching and, at a national level, sum to our overall quality target for online response.

Conclusion

This paper has outlined the proposed method to derive a RAG status at all geography levels during live operations, as well as the proposed approach for tracking online response.

An informative RAG status is imperative to managing the census collection operation. If the programme is in danger of not reaching any of the quality targets, this needs to be flagged promptly. The purpose is to display which issues need to be actioned, and where. The approach presented offers a more informative way of doing this than previously, whilst still acknowledging that human intervention will be needed to perform more thorough analysis during live operations.

List of Annexes

  • Annex A: Proposed thresholds for all geography levels and weights for LA
  • Annex B: RAG status simulations using proposed methodology
  • Annex C: RAG status simulation using proposed thresholds and constant weights
  • Annex D: RAG status simulation using a risk impact table method 1
  • Annex E: RAG status simulation using a risk impact table method 2
  • Annex F: 2019 RCA Dashboard maps with RAG status

(The annexes are contained in the attached downloadable document.)

References

Martyna, K. (2020) EAP138: Variability Target for Response Rates in Collection. https://share.sp.ons.statistics.gov.uk/sites/cen/csod/CSOD_Stats_Design/Statistical_Design/Presentations/Design_Authority_Board_2011_variability_v2.docx

Meirinhos, Victor (2019a) EAP115: Field Prioritisation Algorithm

Meirinhos, Victor (2019b) EAP114: Independent Methodological Review: Response Chasing Algorithm https://www.statisticsauthority.gov.uk/wp-content/uploads/2020/06/EAP114-Independent-Methodological-Review-Response-Chasing-Algorithm.pdf

Ward, K., Barber, P., Priestley, M., Fraser, O. (2019) EAP117: Simulating Census Operations to Inform Resourcing Decisions

EAP138 – Variability Target for Response Rates in Collection

Written by Kamila Martyna

Purpose

The purpose of this paper is to present the proposed variability target for the 2021 Census.

Background

One of the Census quality targets for 2021 is reduced variability in response rates across Local Authorities (LA) and within Hard-to-Count (HtC) groups. The HtC index is constructed using an area level (Lower Super Output Area – LSOA) model that predicts non-response by day 10 after Census Day (Dini, 2018). Each LSOA is assigned a score from 1 to 5, with 1 being easiest and 5 hardest to enumerate.

The estimation strategy for the 2021 Census has changed from 2011 – in 2011 it was an area-based estimation (Abbott, 2008); in 2021 area is less important and estimation is more characteristic-based (e.g. based on age or ethnicity). However, controlling the variability across the characteristics would be almost impossible, and it is still important to reduce variability across the areas as much as possible. There is still a link between variability of responses and the quality of estimates, albeit smaller than in 2011. Therefore, reducing variability across the areas seems to be the best practical approach to undertake in 2021.

For this reason, we need a quantifiable approach to track our progress against the quality target of reduced variability. During the 2011 operations the method used was that 90% of LSOAs should fall within 10 percentage points (hereafter referred to as pp) of the mean response rate for that Local Authority and Hard-to-Count group.

The approach using pp difference from the mean is fixed; the range around the mean is always 10pp, regardless of what mean rates are. The % approach will have different ranges for different mean response rates. For example, if HtC 3 achieves a mean response rate of 85%:

  • the pp range around the mean will be 75% – 95% (85% +/- 10pp);
  • the % range around the mean will be 76.5% – 93.5% (85% +/- 10%×85%).
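
Written generally, for a group with mean response rate $\bar{r}$ the two bands are:

$$\text{pp band: } [\,\bar{r} - 10\text{pp},\ \bar{r} + 10\text{pp}\,] \qquad \text{\% band: } [\,0.9\,\bar{r},\ 1.1\,\bar{r}\,]$$

Since the mean response rate is below 100%, the % band is always narrower than the pp band, which is why the 10% target is more challenging than the 10pp target, particularly in groups with lower mean response.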

A retrospective analysis was performed to evaluate if the 10pp target was met in 2011, and if a more ambitious goal is feasible to aim for in the 2021 Census – 90% of LSOAs falling within 10% target range of the mean response rate. Alternative targets were also investigated as possible variability goals for 2021.

It is also important to mention that our predictions for the response rates for the 2021 Census are more uncertain than for 2011 due to the online-first approach. The online-first Census could lead to higher variability; therefore, it is important to choose a target that is not overly stretching in case the variability behaves differently than in 2011.

The proposed idea is to use a similar but more challenging method to the one used during the 2011 operations – 90% of Lower Super Output Areas (LSOA) should fall within 10% of the mean response rate for that Local Authority, and Hard-to-Count group.

Discussion

Hard-to-Count variability

The boxplot below shows the distribution of LSOA response rates within each HtC group in 2011. A general trend of increased variability can be observed as willingness to respond decreases.

[Boxplot: distribution of LSOA response rates within each HtC group, 2011]

The analysis of the response rates showed that both the 10pp and 10% targets were achieved in 2011 – for all HtC groups, more than 90% of LSOAs had response rates within 10pp/10% of the mean response rate for that HtC group. The 9pp target was also achieved in all HtC groups; the remaining targets were not met in one or more HtC groups – the response rates were too varied and did not fit the target range around the mean.

Proportion of LSOAs within each variability target, by Hard-to-Count group.

Target | HtC 1 | HtC 2 | HtC 3 | HtC 4 | HtC 5
10pp   | 0.9995 | 0.9957 | 0.9806 | 0.9765 | 0.9452
10%    | 0.9995 | 0.9957 | 0.9806 | 0.9684 | 0.9099
9pp    | 0.9991 | 0.9938 | 0.9723 | 0.9684 | 0.9204
9%     | 0.9991 | 0.9938 | 0.9723 | 0.9663 | 0.8407
8pp    | 0.9986 | 0.9907 | 0.9639 | 0.9543 | 0.8655
8%     | 0.9986 | 0.9907 | 0.9636 | 0.9484 | 0.7885
5pp    | 0.9908 | 0.9564 | 0.8841 | 0.8321 | 0.6031
5%     | 0.9908 | 0.9543 | 0.8229 | 0.7689 | 0.5666
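
A minimal sketch of how proportions like those above (and the equivalent Local Authority figures below) could be computed from LSOA-level response rates is shown here. The column names and example values are hypothetical; the real analysis used 2011 Census response data grouped by HtC group and by Local Authority.

    import pandas as pd

    # Hypothetical input: one row per LSOA with its response rate and HtC group.
    lsoas = pd.DataFrame({
        "lsoa": ["A", "B", "C", "D", "E", "F"],
        "htc": [1, 1, 1, 5, 5, 5],
        "response_rate": [0.97, 0.95, 0.94, 0.82, 0.74, 0.60],
    })

    def proportion_within_target(df, target=0.10, percentage=True):
        """Share of LSOAs whose response rate lies within the target band of
        their group's mean: +/- 10% of the mean (relative) or +/- 10pp (absolute)."""
        def share(rates):
            mean = rates.mean()
            half_width = target * mean if percentage else target
            return rates.between(mean - half_width, mean + half_width).mean()
        return df.groupby("htc")["response_rate"].apply(share)

    print(proportion_within_target(lsoas, 0.10, percentage=True))   # 10% target
    print(proportion_within_target(lsoas, 0.10, percentage=False))  # 10pp target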

Local Authority variability

Similar analysis on the variability of LSOA response rates within Local Authorities showed that all but the 5pp/5% targets were met in 2011.

Proportion of LAs within each variability target.

Target | Proportion of LAs
10pp   | 0.9971
10%    | 0.9885
9pp    | 0.9885
9%     | 0.9885
8pp    | 0.9885
8%     | 0.9856
5pp    | 0.8764
5%     | 0.8276

Field Prioritisation Algorithm (FPA)

The FPA is a tool designed by the Census Statistical Design team with the aim of reducing variability (Meirinhos, 2019). Its goal is to reduce variability between and within HtC groups in each Team Leader Area (an area in which Census Officers have agreed to work). It works by targeting the worst performing Output Areas, giving them priority for field visits, and giving no visit priority to the best performing areas.

Regardless of which variability target is decided on for the 2021 Census, we will not stop reducing variability with the FPA just because we have reached that target. We will continue to target the areas with the lowest return rates throughout the operational period – even if we do not opt for more demanding targets, we may still reach them because of the FPA.

Despite the variability goal being easier to achieve for the easier-to-count areas, the FPA will work continuously to minimise variability. It is feasible to set more demanding goals for the FPA, as there is no downside to having a more challenging target in the FPA (as long as it is not so difficult that it works at the expense of maximising response).

Variability Timeline

Throughout the operational period, we will undertake different actions to minimise variability in four phases, and evaluate how we are performing against our variability targets:

  1. Before the Census Day our main focus will be on maximising response by engaging with specific populations. We expect variability to be large.
  2. From Census Day on, we will be focusing on maximising response – no variability interventions will be implemented at this point (any reduction in variability will come from maximising response interventions).
  3. From 2 weeks after the Census Day, the FPA will start working to reduce the variability. We will also start reporting against the variability targets.
  4. In the last 3 weeks of operations, we will review the effectiveness of the FPA and adjust how it works if needed.


The variability target is used for reporting and as an objective measure of variability – it will not be used to guide field behaviour. It works separately from the FPA, which will directly influence field activities.

Conclusion

In summary, the 2011 target should be an acceptable minimum target. Variability is very important, but we want the target to be achievable. We propose that the 10% variability goal should be used in 2021. This goal is sufficient to meet statistical requirements as agreed with the Methodology team. It is more operationally feasible than the 8 and 5 pp/% targets, but more challenging than the 10pp target used in 2011. The 9% goal was also met in 2011; however, given the uncertainty introduced by the online-first nature of the 2021 Census, it might be more challenging to meet that goal in 2021. Finally, regardless of the target chosen, the FPA will continue to reduce variability during live operations, even once the variability target has been met.

List of Annexes

Not applicable.

References

Abbott, O. (2008). 2011 UK Census Coverage Assessment and Adjustment Methodology. Census Advisory Group paper (08)05.

Dini, E. (2018). EAP102 – Hard to Count index for the 2021 Census. https://uksa.statisticsauthority.gov.uk/methodological-assurance-review-panel-census/papers/

Meirinhos, V. (2019). EAP115 – Field Prioritisation Algorithm. https://uksa.statisticsauthority.gov.uk/methodological-assurance-review-panel-census/papers/

EAP137 – Household response profiles for the 2021 Census

Purpose

This paper provides an update on the design of the household response profiles for the 2021 Census.

Recommendations

Members of the panel are invited to review the changes to the response profiles that have been applied since the 2019 Rehearsal and incorporated into the development of the 2021 Wave of Contact.

Response profiles are a best estimate of the volume of responses we expect to receive during the census period. They are used to model when, where and how people are going to respond to the census, and they will be vital to inform operational decisions during the 2021 Census.

The response profiles have been used as an input to the Field Operation Simulation Model (FOS; Ward et al, 2019), which is used to make decisions about the number of field staff required in each area; when, where and how many reminder letters to send; and where to send paper questionnaire reminders.

The output from the FOS will be used during the live collection period by the Response Chasing Algorithm tool (RCA; Meirinhos, 2019a) as a basis for comparison against live return data, so that we can tell where we need to put in extra effort to meet our quality targets of 94% overall response and at least 80% response in every local authority.

Our aim is to build response profiles for returns by day, from the start of the census period until the end of the collection period. We aim to do this for each mode of response – online or paper. Our objective is that each Lower Super Output Area (LSOA) will have a response profile which is associated with its influential demographic variables and Hard-to-Count (HTC) rating (Dini, 2019a; 2019b).

Groups of LSOAs that have similar self-response profile shapes and demographics will be clustered together (Meirinhos, 2019b). By working with a reduced number of response profiles, rather than 34,753 (one per LSOA), our maximising response strategies can be customised to the relevant demographics while ensuring we remain on target against the Census quality targets: 94% response, a minimum of 80% for each Local Authority and reduced variability.
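
The clustering method itself is described in Meirinhos (2019b); purely as an illustration of the idea, the sketch below groups LSOAs with similar cumulative self-response profile shapes using k-means. The choice of k-means, the normalisation step, the number of clusters and the randomly generated input are all assumptions made for the example, not necessarily the documented method.

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical input: one row per LSOA, one column per day of the collection
    # period, holding that LSOA's cumulative self-response proportion.
    rng = np.random.default_rng(0)
    n_lsoas, n_days = 1000, 60
    profiles = np.sort(rng.uniform(0.0, 1.0, size=(n_lsoas, n_days)), axis=1)

    # Normalise by the final level so clusters reflect the *shape* of response
    # over time rather than the overall level of response.
    shapes = profiles / profiles[:, -1:]

    # Group LSOAs into a small number of representative profile shapes.
    n_clusters = 8
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(shapes)

    # One representative (mean) cumulative profile per cluster.
    representative = np.vstack([shapes[labels == k].mean(axis=0) for k in range(n_clusters)])
    print(representative.shape)  # (8, 60)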

We anticipate that a cumulative response profile for a paper-first area might look like the graph in Figure 1. The blue line shows the overall response. This is broken down to show the daily expected online (orange line) and paper (blue line) responses.

Figure 1. Cumulative response profile example


During the 2021 Census, maximising response strategies will be implemented to increase census response in line with the established targets (94% overall response, 75% online response and at least 80% response in every local authority; Fraser, 2019).

Research has been conducted by the Maximising Response team using the Field Operation Simulation (FOS) to evaluate how many field staff visits, reminder letters and paper questionnaires will be necessary to conduct a successful Census operation (Davis & Fraser, 2020).

Using the response profiles as the baseline response, the visit success rates and reminder efficacy from the 2017 Test, and the contact success rate from the Labour Force Survey (LFS), amongst many other parameters, it was possible to estimate the expected daily returns for all LSOAs across England and Wales during the next Census collection period. Moreover, the knowledge gathered from the 2017 Census Test provided enough support to estimate how many respondents will choose to submit their Census questionnaire online or using the traditional paper questionnaire.

Yet some challenges persist. Results from the 2019 Rehearsal provide some evidence about the willingness of the E&W population to respond to a “voluntary” Census Rehearsal, as well as about actual online engagement from the population and the ability to reach a 75% online response.

However, while the results from the Rehearsal suggest that we will probably be able to achieve 70% self-response, there is not much evidence from which to draw conclusions about the patterns of response, and especially about the peaks of response.

This is mostly because neither the 2017 Test nor the 2019 Rehearsal included a Census day, and therefore the main peak of response happened over the first few days after the initial contact had landed.

For the 2021 Census, the material accompanying the initial contact will include completion instructions similar to those of the 2011 material. It is logical to assume that, with these instructions, and with wider messaging about the timing of the census, the 2011 tendency for self-completion on or around census weekend will to a large extent be repeated in 2021.

Figure 2. Daily returns for England and Wales for 2021 Census


Despite this, an initial peak is still expected during the early days of the operation. However, the volume of initial contacts means that these will be distributed over a 10-day period. This should flatten and draw out any early peak compared to the sharp initial peak in 2017 and 2019, when initial contacts all arrived within two days.

Conclusion

The present paper provides an update on the design of the response profiles following the 2019 Rehearsal and their implementation in the 2021 Wave of Contact. The paper focuses on the rationale for the development and implementation of the response profiles in the wave of contact, supported by evidence gathered from the 2017 Test, the 2019 Rehearsal and previous Censuses.

References

Davis, B. & Fraser, O. (2020) Forecasting return rates in the 2021 Census, internal paper available at https://share.sp.ons.statistics.gov.uk/sites/cen/csod/CSOD_Stats_Design/Maximising_Response/FOS_Documentation/FOS-Documentation/forecast_method_v_0_1_eap.docx

Dini, E. (2019a) Hard to Count index for the 2021 Census, paper available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP123-Hard-to-Count-index-for-the-2021-Census.docx

Dini, E. (2019b) Update on methodology for the Digital domain of the Hard to Count index for the 2021 Census, available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP124-Hard-to-Count-index-methedology.pdf

Fraser, O. (2019) Maximising response strategy overview, paper available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP113-Maximising-Response-Strategy-Overview.docx

Meirinhos, V. (2019a) Independent Methodological Review: Response Chasing Algorithm, available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP114-Independent-Methodological-Review-Response-Chasing-Algorithm.docx

Meirinhos, V. (2019b) Maximising Response – Response Profiles, paper available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP116-Maximising-Response-%E2%80%93-Response-Profiles.docx

Ward, K., Barber, P., Priestley, M. & Fraser, O. (2019), Simulating Census operations to inform resourcing decisions, paper available at https://uksa.statisticsauthority.gov.uk/wp-content/uploads/2020/07/EAP117-Simulating-Census-Operations-to-Inform-Resourcing-Decisions.docx

EAP136 – Resolving Multiple Responses (RMR) in the 2021 England & Wales Census

RMR Working Group

Steve Rogers – CDP: Statistical Lead | Cal Ghee – CSD: Statistical Lead
Lawrence Dyer – CDP: RMR Lead | Tanita Barnett – CSD
Pratibha Vellanki – CDP | Bethany Fitzgibbon – CSD
Graham Greenlees – CDP | Orlaith Fraser – CSD
Paul Waruszynski – CDP | Brendan Davis – CSD
Bambi Choudhary – CDP | Gareth Powell – MD: Estimation
Charles Odinigweh – DST | Fern Leather – MD: E&I & Adjustment
Steve Smallwood – PSD: Statistical Lead | Josie Plachta – MD: Matching
Jayne Sholdis – NISRA | Fergus Christie – NRS
Alistair Stoops – NISRA | Melissa Liew – NRS

CDP: Census Data Processing; CSD: Census Statistical Design; DST: Digital Services & Technology; MD: Statistical Methodology; PSD: Population Statistics & Demography; NRS: National Records of Scotland; NISRA: Northern Ireland Statistics and Research Agency

Background

As for previous Censuses, two of the primary objectives for the 2021 England and Wales (EW) Census are to provide Government and other consumers of statistical information with:

  1. accurate local authority level population estimates
  2. a representative statistical Census database on which to base ongoing research and analyses

However, while the ONS devotes considerable effort to the collection of accurate household (HH) and individual information, the task is complex, leading invariably to a wide range of errors in the data that can potentially undermine those objectives.  Missing and inconsistent responses in the collected data, undercount through missed enumeration or record-level non-response, and overcount through the collection of duplicate responses are just three of a wide range of significant challenges.

To help overcome these problems and minimise the impact on the quality of the statistical estimates and the utility of the final Census database, raw Census data are cleaned and adjusted through a series of deterministic and statistical processes.  First, a clean and consistent baseline database of collected response data is established through a series of preliminary data cleaning and classification methods and processes completed ultimately by statistical item-level Editing and Imputation.  Second, supported by linking Census data to a Census Coverage Survey through statistical Matching methodology, and by utilising information from other alternative sources of data and analyses, statistical Estimation methods are used to define a comprehensive set of population weights that take account of issues such as overcount and undercount in the observed data.  Finally, these population weights are used by additional statistical Adjustment, Edit and Imputation, and Disclosure Control methods to arrive at a fully adjusted Census database. This general strategy ensures that the two primary Census objectives are met successfully.

RMR is a preliminary data cleaning strategy implemented prior to any of the Census statistical adjustment processes.  While the two primary objectives of the Census program would, in principle, be met by a database containing accurate and discrete information about each HH or communal establishment (CE) and the individuals therein, collected Census data will inevitably contain multiple and/or duplicate responses associated with a particular HH, CE, and/or individual.  Multiple and duplicate responses can undermine both primary Census objectives, contributing to overcount and inconsistencies in the Census data. RMR is designed to help minimise the impact of these potential problems.

It is important to note here that the first and foremost objective for the RMR strategy is the resolution of multiple and duplicate responses received from a discrete enumeration address assigned a Unique Property Reference Number (UPRN) through the ONS address resolution strategy.  Consequently, RMR assumes that this has been assigned correctly. There is also a secondary objective to explore the possibility of extending RMR in 2021 to include a relatively local level of geography such as postcode or Lower Layer Super Output Area (LSOA).  The main point here is that the resolution of multiple responses or duplicates beyond these very local geographic boundaries is out of scope for the RMR strategy. Consequently, any incidental reference to ‘Census data’ from this point forward should be considered as always referring to data within these constraints.

Multiple responses are, of course, not always errors, sometimes occurring quite legitimately in the Census data associated with a discrete enumeration address. Typically, this would be associated with a design decision. For example, requests for additional household continuation (HC) questionnaires needed by larger HHs responding through paper format can lead to receiving two or more responses for that address. Also, to meet changing social norms, in 2021 individuals are being encouraged to complete and return an individual response (referred to as an iForm) should they choose not to disclose accurate but personal information on a primary HH questionnaire.  Consequently, in this case, the Census data could contain two or more responses from the same individual. While iForms are likely to contain different answers to some Census questions, fundamentally, this type of multiple response is a duplicate at the address in question.

Multiple responses and duplicates, however, can also emerge through unintentional (and perhaps unavoidable) error.  For instance, there may be errors in the underlying enumeration address frame leading to several questionnaires being sent to the same address that are subsequently completed and returned. Enumerators in the field may leave a paper questionnaire at an address that is completed and returned in addition to the householder completing an electronic questionnaire. In 2011 there was clear evidence that some individuals misunderstood how they should respond to the Census, entering their personal information more than once on a discrete HH questionnaire. All these examples manifest as duplicates in the Census data.  In contrast, receiving several responses from the same enumeration address could also be indicative of errors where the address frame has failed to recognise that there is more than one HH at the address.

Regardless of whether occurring through design or error, multiple responses and duplicates are problematic with respect to primary census objectives and can serve as contributing factors to a wide range of statistical errors in Census outputs.  For example, and amongst many other reasons, poor integration of HC questionnaires can lead to undercount of larger HHs.  Duplicate HH and individual responses can lead to overcount of the general population. Duplicate individual responses can also lead to potential overcount of larger HHs. Errors in the underlying address frame can lead to several different issues associated with both undercount and overcount.

The overarching role of RMR is to address these problems and minimise any associated risk to the quality of Census Outputs by resolving multiple responses and duplicates, not by simply removing them from the data, but by careful integration of the information provided through all associated responses.  As multiple responses and duplicates can also have a direct and negative impact on the performance and accuracy of statistical Matching, Estimation, Adjustment, and Edit & Imputation methodology, the RMR strategy for 2021 has been designed and developed in close conjunction with these methods through the RMR Working Group.  All in all, RMR is designed to support the production of accurate local authority level population estimates and a representative statistical Census database; the two primary Census objectives.

Aim of the current paper

The current paper has 3 primary aims:

  1. To provide an outline of the design and design principles behind the review and development of RMR for the 2021 Census
  2. To provide a high-level summary of the RMR strategy and the function of each module in the RMR method. New modules for 2021 will be highlighted.
  3. To provide a detailed overview of the business rules and statistical methods employed by each of the RMR modules. This will include notes on the assumptions and reasoning considered by the RMR Working Group during the review.  New Modules, and significant changes to the 2011 RMR design will be highlighted.

First implemented in the 2001 Census and again, successfully, in 2011, the RMR strategy has undergone an extensive review for 2021.  The primary aim of the review was to ensure that the 2021 RMR strategy was built on a comprehensive set of design principles that ensure a consistent approach and justifiable foundation in line with Census objectives.  Consequently, the review itself, and the decision-making processes leading to that design, has been an iterative process where elements previously agreed were often, and necessarily, revisited a second and even a third time to incorporate emerging principles. While considerably time-consuming, we believe that this has led to a far more robust, accurate and consistent end-to-end design for the 2021 RMR strategy, tied more tightly to the statistical and analytical aims of the Census programme than in the past.

In this Section we present an overview of the principles that have emerged to underpin the basic design of the 2021 RMR strategy. Where appropriate we indicate how the RMR strategy has changed since 2011, how we have incorporated lessons learned from 2011, and how we have linked the 2021 strategy closely to subsequent statistical adjustment methods and Census objectives.  The Section should also guide an understanding of the decision-making processes that led to the detail of how RMR functions, presented in Sections 5 and 6.

4.1 General principles

  • Following what was generally a positive assessment of how the 2011 RMR strategy performed in the last Census it was agreed that the 2021 RMR strategy would build on that success. Consequently, the emerging design of the 2021 RMR strategy was guided by holding the effectiveness and proficiency of the 2011 RMR specification to account and, where appropriate, removing, adjusting, or adding functionality. Key drivers of the review included:
    • lessons learned in 2011 that identified potential improvements to the RMR process
    • changes to overarching Census design objectives such as the modernisation of the collection strategy
    • changes to the analytical aims of the Census question set to meet the demands of a modern society
    • advances in statistical methodology implemented both within RMR and the methods it aims to support
    • advances in the computational power available on the new ONS Cloudera processing platform.
  • While the 2011 Census RMR strategy was quite successful in meeting its objectives, post-Census evaluation indicated that there were areas of the design that could be improved. For example, the statistical matching methodology in the 2011 RMR strategy left at least some unexpected duplicates in the data.  While these were ultimately resolved prior to outputs, they were unexpected by both Statistical Matching and Estimation, leading to delays and necessary revisions to approved methods. It is worth noting here that any RMR strategy will have limitations and it is important that these are well understood and addressed in aggregate adjustments to Census estimates through Statistical Estimation. The RMR Working Group was established to ensure that all stakeholders with a significant interest in the way RMR functions were included in the development of its design.  Table 1 outlines the related and relevant topic areas covered by members of the Working Group.

Table 1. Membership & roles of the RMR Working Group

ONS Division/Team | Role & responsibility
Census Processing Team | Overall design & development of RMR & implementation
Census Statistical Design | Overall statistical design of Census methodology
Methodology: Statistical Matching | Design & development of the Census to CCS and Census to Census Statistical Matching strategies
Methodology: Estimation & Adjustment | Design & development of the Census Statistical Estimation Strategy
Methodology: E&I & Adjustment | Design & development of the Census Statistical E&I and Adjustment Strategies
Population Statistics and Demography | Representing consumers of Census outputs regarding data quality and accuracy of estimates
NISRA & NRS | Representing devolved UK statistical institutes with respect to sharing ideas and harmonisation

4.2 Overarching Design

  • Building on the 2011 design, the 2021 RMR strategy adopts a progressive and modular approach to the problem of resolving multiple and duplicate responses. Each module, or in some cases set of modules, focuses on one aspect of the potential problem space.  For example, Module 1 looks to resolve multiple and duplicate CEs; Modules 2 & 3 at HHs; Module 4 at Dummy forms; Module 5 at iForms, and so on. Section 5 provides a functional overview of all RMR Modules while Section 6 delves further into the detail.  Each module builds on the outcome of one or more of the previous modules.  Consequently, sequencing formed a significant part of the review and several adjustments have been made to the 2011 ordering, not only to optimise accuracy and efficiency but also to accommodate new Modules for 2021 such as the resolution of duplicate iForms (Module 5b).  We return to the sequencing of Modules in Section 5.
  • In general, as in previous Censuses the resolution of multiple and duplicate responses relative to a topic area (i.e., in CEs, HHs, Dummy Forms, and so on) will be addressed through a 2-stage process:
    1. In Stage 1 multiple and duplicate responses need to be identified correctly. This is often achieved relatively easily through information provided on each type of Census questionnaire and/or through data from the Census Field Work Management Tool (FWMT), such as the UPRN.  However, in cases where this information is not available or not suitable, or in the presence of uncertainty, RMR relies on statistical matching methods.  Typically, statistical matching/linkage methods are used to identify people who, for some reason, are represented more than once in the data.
    2. In Stage 2, where multiple and duplicate data sources have been identified, they need to be resolved. This is not about throwing data away but about careful integration of the alternative data sources. Typically, this will be driven by decision-making logic designed, for example, to decide where to put an iForm that could, in principle, belong to two or more HHs at the UPRN; or alternatively, to decide which set of responses from a set of duplicate individual records will serve best as a baseline from which to build a unique and discrete observation.  Resolution is usually achieved through a hierarchy of deterministic business rules followed by rules based more on statistical probability. This general strategy serves to prioritise options and manage resolution as we move progressively towards and into uncertainty.

4.3 Identifying duplicates through Statistical matching

  • We mentioned earlier that during live Census processing in 2011 the RMR strategy left at least some unexpected duplicates in the data, indicating that this should be one of the most significant areas to explore for 2021. Lessons learned from 2011 noted that the matching methodology implemented in the 2011 RMR strategy was likely to be one of the most salient reasons for this. For example, 2011 RMR relied quite heavily on Soundex-based matching methods, which we now understand underperform with some cultural name sets.  However, one of the most interesting aspects of the review was that the performance of the 2011 RMR matching strategy was evaluated by comparing its results against the far more sophisticated Census to CCS and Census to Census Matching methodologies.  From this, one of the primary guiding principles for the design and development of the 2021 RMR strategy was that the matching methods implemented in the 2021 RMR strategy should be designed by the ONS Methodology Matching Team in conjunction with the development of the Census to CCS and Census to Census Matching methods. This not only ensures compatibility and consistency between methods but also ensures that the RMR matching methods meet the high-quality criteria one would expect from the ONS Methodology Division.

Currently, there are two ONS internal technical papers available on the work of the Methodology Matching Team: Plachta and Shipsey, 2019; Plachta, 2020.  Both papers can be made available on request.  Here we provide a brief overview:

  • For the 2021 Census RMR strategy, a deterministic linkage method will be implemented to identify duplicate people within an enumeration address (UPRN) prior to resolution. The linkage method uses 17 match-keys developed using 2011 Census data to capture as many duplicate matches as possible with a very low tolerance of incorrect matches. Research into the optimal combinations of match-keys was conducted by testing a variety of strengths of match-key on 2011 data and conducting a clerical review to identify those that made false positive matches (records that were not in truth duplicates). The remaining keys are loose enough to let through true matches that contained errors.

The match-keys allow for exactly matching records, records where date of birth is missing or contains slight error, records with missing gender, and records containing a variety of name errors. Name matching is implemented using two string comparators, Levenshtein Edit Distance and Jaro-Winkler, to identify matching names with spelling, handwriting or scanning errors. The benefit of using two comparators is that, although both are very high quality and find a high volume of correct matches, each can identify matches that the other misses, making the RMR match-keys more resilient than the Soundex phonetic comparator method used in 2011.
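
As an illustration of the match-key idea only (not the actual 17 keys, which are documented in the Methodology papers), the sketch below builds a few hypothetical deterministic keys for records sharing a UPRN and treats two records as candidate duplicates if any key agrees. A small Levenshtein function stands in for the pair of string comparators described above; the field names and tolerances are assumptions.

    # Illustrative match-key sketch; the real RMR strategy uses 17 match-keys
    # developed on 2011 data together with Levenshtein and Jaro-Winkler comparators.

    def levenshtein(a, b):
        """Minimum number of single-character edits turning a into b."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def candidate_duplicate(rec_a, rec_b):
        """Hypothetical keys: records share a UPRN and agree on at least one key."""
        if rec_a["uprn"] != rec_b["uprn"]:
            return False  # RMR only resolves duplicates within an enumeration address
        name_a = (rec_a["forename"] + rec_a["surname"]).lower()
        name_b = (rec_b["forename"] + rec_b["surname"]).lower()
        exact_key = name_a == name_b and rec_a["dob"] == rec_b["dob"]
        fuzzy_name_key = levenshtein(name_a, name_b) <= 2 and rec_a["dob"] == rec_b["dob"]
        missing_dob_key = name_a == name_b and (not rec_a["dob"] or not rec_b["dob"])
        return exact_key or fuzzy_name_key or missing_dob_key

    a = {"uprn": "100", "forename": "Siobhan", "surname": "Murphy", "dob": "1980-01-02"}
    b = {"uprn": "100", "forename": "Siobhon", "surname": "Murphy", "dob": "1980-01-02"}
    print(candidate_duplicate(a, b))  # True, via the fuzzy-name key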

In 2021, the Methodology Linkage team will also run a probabilistic matching exercise alongside the deterministic RMR process to independently identify duplicate responses. This will use the Fellegi-Sunter algorithm with parameters calculated from 2011 data. This exercise is not designed to add further duplicates to RMR's list of duplicates, but to act as a quality assurance process running alongside it. If the probabilistic algorithm and the deterministic methods return similar matches, we can be confident that the match-keys are working well, despite the differences between 2011 and 2021 data. Using probabilistic matching in RMR itself is out of scope due to the resources required, firstly to conduct it to the required quality, and secondly to deal with any potential matching conflicts between deterministic and probabilistic methods.
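
For context, a highly simplified sketch of the Fellegi-Sunter idea follows: each comparison field contributes an agreement or disagreement weight derived from its m- and u-probabilities, and record pairs scoring above a threshold are flagged as likely duplicates. The probabilities, fields and threshold shown are illustrative placeholders, not the parameters estimated from 2011 data.

    import math

    # Illustrative m/u probabilities per field (placeholders, not 2011 estimates):
    # m = P(field agrees | records are a true duplicate pair)
    # u = P(field agrees | records are not a duplicate pair)
    PARAMS = {
        "forename": {"m": 0.95, "u": 0.01},
        "surname":  {"m": 0.97, "u": 0.05},
        "dob":      {"m": 0.98, "u": 0.001},
        "sex":      {"m": 0.99, "u": 0.5},
    }

    def match_weight(rec_a, rec_b):
        """Sum of log2 likelihood ratios over the comparison fields."""
        total = 0.0
        for field, p in PARAMS.items():
            if rec_a.get(field) == rec_b.get(field):
                total += math.log2(p["m"] / p["u"])
            else:
                total += math.log2((1 - p["m"]) / (1 - p["u"]))
        return total

    THRESHOLD = 10.0  # illustrative; in practice set from the estimated parameters

    a = {"forename": "Ada", "surname": "Jones", "dob": "1990-05-04", "sex": "F"}
    b = {"forename": "Ada", "surname": "Jones", "dob": "1990-05-04", "sex": "F"}
    print(match_weight(a, b), match_weight(a, b) > THRESHOLD)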

In 2011, 237,200 records were identified as duplicates in RMR, although in subsequent matching stages many more duplicate records were found. The new match-keys for 2021 were able to improve this number to 288,468 duplicates identified, with a precision of over 99.99%, when tested on 2011 data. Although we expect the 2021 Census data to differ from the 2011 data, due in part to changes in collection methods, we are confident that the improvements we have made to the method mean that it is flexible and robust enough to perform well with 2021 Census data, outperform the 2011 strategy, and provide the required consistency with other matching methods also used to support Statistical Estimation & Adjustment.

4.4 Resolving multiple and duplicate responses through rule-based and statistical decision-making

  • As outlined earlier, the resolution of multiple or duplicate responses within each topic area (i.e., CEs, HHs, Dummy Forms, and so on) are addressed in RMR through a hierarchy of rule-based business rules followed by rules driven more by statistical probability to help manage uncertainty. Two overarching principles were agreed to guide the review of the 2011 and development of the 2021 RMR strategy. In all cases where a resolution process was implemented in the 2021 RMR strategy:
    • it should not inadvertently introduce bias into the Census data
    • it should look to retain as much supplied information as possible.
  • An initial review of the 2011 RMR decision-making strategies revealed that in many cases the business rules implemented to resolve some multiple or duplicate responses were no longer valid relative to the general Census design principles laid out in Section 4.1. For instance, business rules driven by a predominantly paper-based Collection strategy in 2011 represented a particularly salient point of revision considering the transition to a predominantly electronic questionnaire in 2021.  It was also noted that in many of the 2011 modules, the resolution of residual cases that remained unresolved at the end of a sequence of deterministic rules were still being addressed through propositional (micro) business rules where a stochastic approach would have been a more appropriate way to manage uncertainty.  Retention of these rules would clearly mean breaking the principle of not inadvertently introducing bias into the Census data.

To ensure this overarching principle was always adhered to, a second general principle was applied to the process of reviewing and rebuilding the decision-making hierarchy in each of the 2021 RMR modules: to always start from a position where the general and default resolution strategy would be to distribute multiples or duplicates randomly amongst all available options.  During discussion and review of each module this baseline strategy could only be preceded by deterministic business rules or by other strategies if the members of the working group could justify their inclusion and why they should take priority.  In truth, the decision-making logic of all 2021 RMR modules ended up as a sequence of deterministic business rules, with the uncertainty associated with any residuals at the end of the decision-making sequence being managed by statistical resolution (see Table 2 for a comparison).  However, this principled approach ensured that each of the 2021 RMR modules was examined and developed in a structured and systematic way while maintaining the overarching aim of avoiding the introduction, or for that matter retention, of unintentional bias.

  • In 2011, across the entire suite of RMR modules, deterministic business rules driven by the overarching design of the 2011 Census, particularly the 2011 Census Collection strategy, were used quite frequently in the RMR decision-making hierarchy. While there was no reason to move away from a similar approach in 2021, all in all, significant changes to several aspects of the overarching Census design for 2021, including the Collection strategy, led to many revisions of business rules for 2021 RMR.  Details of all these changes can be found in Section 6. Here, as examples, we present just a few of the design-based business rules considered in the review that led to the most significant changes.  As we worked through each of the RMR modules the list of business rules was referred to and implementation amended where necessary.  Again, this principle ensured that we maintained a consistent approach to the review.  These changes ensured that the 2021 RMR strategy was consistent with the reasoning behind changes to the overarching Census design between 2011 and 2021.
  • EQ first. One of the most significant changes in the design of the 2021 Census compared to 2011 was the shift in the Collection strategy from predominantly paper questionnaires (PQ) to electronic questionnaires (EQ).   Driven by this overarching design principle, in some of the 2011 RMR modules PQ responses were prioritised over EQ responses.  For 2021, the change in this overarching design principle led to similar rules being reconfigured, where appropriate, to prioritise EQ responses over PQ.
  • iForms first. While prioritising the information collected through individual forms is not new for 2021 more emphasis has been placed on encouraging and facilitating the use of iForms in the overarching Census design than in 2011. This shift is designed to meet the demand for more accurate statistics on changing society norms and allows individuals to provide personal information about themselves that they may not have disclosed on a standard HH form. This change was considered throughout the RMR review.
  • Non-response as valid ‘prefer not to say’. In 2011 RMR often used completion rate to determine which of a set of multiple questionnaires returned by the same individual would be considered the baseline for integration. Typically, the most completed record is selected for this purpose before ongoing integration. However, as an extension of the demand for more accurate statistics on changing society norms, the Working Group were advised that it was a legal obligation to count missing data in an iForm associated with voluntary questions such as gender identity as a valid ‘prefer not to say’ response.  This was factored into all decision-making logic that fell into this category.
  • Receipt date. In the 2011 RMR strategy, date of receipt was sometimes employed to prioritise one response over another when multiple responses were received from the same HH, CE, or individual. For 2021 several options were always considered, including first receipted, last receipted, and receipted closest to Census day. The best option was selected based on the circumstance.  However, in the overarching Census Collection design there is more focus on the significance of iForms than there was in 2011. In addition to encouraging individuals to complete iForms for the reasons previously outlined in the ‘iForms first’ principle above, public-facing Census Support will also guide people asking how to revise information they have already provided towards completing an iForm. These design decisions render ‘last receipted’ a more predominant option amongst the alternatives than in 2011.
  • To close this Section, there are four more aspects of the 2021 RMR design considered during the review that we think are worth mentioning in this more generic overview. These represent relatively important extensions to the 2011 RMR design for 2021 that we are confident will lead to improved performance.
  • Retention of information. We mentioned at the beginning of this Section that the retention of as much information as possible was one of the main aims when integrating multiple or duplicate records belonging to the same HH, CE or individual. For 2021 this will generally be implemented in the same way as 2011 in that once we have determined which record to retain as the baseline response through the decision-making strategies discussed so far, missing data in that record will be backfilled using information from the other duplicate records.  There was a considerable amount of discussion about optimising this process, particularly around trying to maintain consistency between responses within person and between people within a HH.  However, on review, the Working Group decided that trying to implement an editing strategy within RMR was out of scope and the resolution of inconsistencies would best be left to the following statistical Editing and Imputation methodology.

There was, however, one area where we were able to improve upon the 2011 RMR design.  One of the inevitable consequences of resolving duplicates is that the original HH structure can be disrupted. For example, we could start with a 7-person HH, but persons 2 & 3 end up being duplicates and are resolved into one.  This means that when restructuring HHs at the end of RMR, person 4 becomes person 3 and so on.  The problem here is that with a paper questionnaire the relationship matrix is abbreviated. In 2011 we collected all the relationships between persons 1 to 6, but once the questionnaire reached person 7 it would only ask for the relationship to person 1. Consequently, by making person 6 person 5 after RMR, all the relationships that would have been collected had there not been a duplicate are now missing.  This would have been left for Edit & Imputation to resolve.

The problem in 2021 is likely to be more significant as abbreviation of the relationship matrix starts at person 5 rather than person 6 in the paper questionnaire, potentially increasing the amount of imputation required in the relationship matrix for this response mode.  However, the electronic questionnaire in 2021 has been designed to collect the entire relationship matrix regardless of the number of people in the HH.  While relatively complex, the 2021 RMR strategy has been coded to retain that information and adjust the relationship matrix accordingly when tidying up the HH structure, replacing what would have been imputed data with observed data.
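To illustrate the relationship-matrix adjustment described above, the sketch below shows, in simplified form, how person numbers and observed relationships might be re-indexed after a duplicate person has been resolved. This is a minimal illustration only; the person-number and (person, related person) relationship representations are hypothetical simplifications, not the production RMR data model.

```python
# Illustrative sketch (not the production RMR code) of re-indexing a HH's
# relationship matrix after a duplicate person has been resolved.

def renumber_household(person_ids, relationships, dropped_id):
    """Remove a duplicate person, renumber the remaining people 1..n, and
    remap the relationship matrix so observed relationships are retained
    rather than left for imputation."""
    retained = [pid for pid in person_ids if pid != dropped_id]
    new_number = {old: new for new, old in enumerate(retained, start=1)}

    remapped = {}
    for (pid, related_pid), relationship in relationships.items():
        # Drop any relationship that involves the disregarded duplicate.
        if pid == dropped_id or related_pid == dropped_id:
            continue
        remapped[(new_number[pid], new_number[related_pid])] = relationship

    return list(new_number.values()), remapped

# Example: a 7-person HH where person 3 is a duplicate of person 2.
people = [1, 2, 3, 4, 5, 6, 7]
rels = {(2, 1): "Partner", (4, 1): "Son", (7, 1): "Mother"}
new_people, new_rels = renumber_household(people, rels, dropped_id=3)
# new_rels == {(2, 1): "Partner", (3, 1): "Son", (6, 1): "Mother"}
```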

  • Making use of the ‘Tuning Phase’. In contrast to 2011, the 2021 Census response data will be streamed live into the processing pipelines daily from the day that the Census goes live during Collection. This represents an early opportunity to start exploring the data and adjusting method parameters and processes based on analyses of the actual 2021 Census data, prior to committing to a final end-to-end run of the Census cleaning and adjustment methods once all the data has been collected. This was considered throughout the review of the RMR strategy, and there were two areas in the decision-making logic space where we felt we could program in new strategies with adjustable parameters that not only take advantage of this ‘tuning phase’, but could also provide additional support to our aim of not inadvertently introducing bias into the Census data.

We have already mentioned that in 2011, RMR often used completion rate to determine which of a set of multiple questionnaires returned by the same individual would be considered the baseline for integration, and that typically, the most completed record is selected for this purpose before ongoing integration.  For 2021 we have extended this function in that rather than just being based on the basic sum of completed Census questions, it is now based on the weighted sum, with the weight being a parameter that can easily be adjusted through a configuration file read by the RMR program.  This allows evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
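As an illustration of the weighted completeness rule, the sketch below shows how a baseline record might be selected from a set of duplicates using configurable per-variable weights (defaulting to 1, as in the RMR design). The field names and missing-value convention are hypothetical; this is a minimal sketch rather than the production implementation.

```python
# Minimal sketch, assuming hypothetical variable names and a -9 missing marker,
# of baseline selection by weighted sum of completed questions.

MISSING = -9  # placeholder used here for an unanswered question

def weighted_completeness(record, weights):
    """Sum of weights over answered questions; unlisted variables weight 1."""
    return sum(weights.get(var, 1) for var, value in record.items()
               if value != MISSING)

def select_baseline(records, weights=None):
    """Return the record with the greatest weighted sum of completed questions."""
    weights = weights or {}
    return max(records, key=lambda rec: weighted_completeness(rec, weights))

responses = [
    {"age": 42, "occupation": MISSING, "country_of_birth": "UK"},
    {"age": 42, "occupation": "Nurse", "country_of_birth": MISSING},
]
# With default weights both records tie on 2 completed questions;
# up-weighting occupation (e.g. from a configuration file) breaks the tie.
baseline = select_baseline(responses, weights={"occupation": 3})
```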

We have also mentioned that in the resolution of residuals all RMR modules will converge on distributing multiples and duplicates randomly where there are no valid business rules or other strategies to suggest otherwise.  For 2021, we have programmed RMR to accept and insert a sequence of up to 10 user-defined conditional propensity statements.  This provides the opportunity to fine-tune the probability distribution of residuals, if required or deemed appropriate, rather than automatically falling back on a completely random allocation.  It wasn’t possible to implement this functionality through a configuration file as we have for the weighted sum of valid observed data, but the code has been written in such a way that adding such statements during live processing would not be difficult.
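The sketch below illustrates the idea of conditional propensity statements sitting ahead of a purely random fall-back when allocating residuals. The rule format, field names and weights shown are hypothetical examples, not the statements that would be used in live processing.

```python
# Hedged sketch of conditional propensity allocation for residuals:
# the first rule whose condition holds supplies the selection weights,
# otherwise allocation falls back to a uniform random draw.

import random

def allocate_residual(candidates, propensity_rules, rng=random.Random(2021)):
    """Pick one candidate record. Each rule is (condition, weight_fn)."""
    for condition, weight_fn in propensity_rules:
        if condition(candidates):
            weights = [weight_fn(c) for c in candidates]
            return rng.choices(candidates, weights=weights, k=1)[0]
    # No rule applies: completely random allocation, as in the default design.
    return rng.choice(candidates)

# Example (hypothetical) rule: if any candidate is an EQ response,
# favour EQ over PQ with 3:1 propensity.
rules = [
    (lambda cs: any(c["mode"] == "EQ" for c in cs),
     lambda c: 3 if c["mode"] == "EQ" else 1),
]
chosen = allocate_residual(
    [{"id": "A", "mode": "EQ"}, {"id": "B", "mode": "PQ"}], rules)
```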

  • Extending the search area. Following the 2011 Census, it was noted that duplicate responses were not always constrained to a unique enumeration address.  This is not unusual with Census data or other data sources but as duplicates of any kind contribute to overcount in statistical estimates, aggregate adjustments for this type of error were made through the 2011 statistical Estimation and Adjustment methodology.  It was noted, however, that some of these duplicates occurred within small area geographies such as postcode or LSOA.  Consequently, a recommendation was carried forward to 2021 to explore the possibility of extending the remit of RMR to identify and resolve these duplicates earlier on in the Census processing pipeline.

To address this recommendation, research was conducted with three primary aims. First, to identify the primary source of the problem with a view to understanding whether it was likely to occur again in 2021.  Second, to explore the potential problem of establishing an appropriate rule-based resolution strategy that could effectively resolve all between-HH duplicates.  And third, to evaluate whether implementing a proportional rule-based resolution strategy, as in RMR, had any real value compared to a statistical adjustment.  It is important to note here that ONS Methodology were already developing statistical Matching and Estimation methodology to address this type of overcount based on Census-to-Census matching.

The results of the research indicated that these localised duplicates could primarily be attributed to errors in the 2011 Address Register, typically where two or more questionnaires had been sent to the same address.  Following consultation with the ONS Address Register Team, we were assured that the risk of this occurring again in 2021 was far lower than it was in the previous Census.  In addition, mapping out the potential problem space for a rule-based resolution strategy applied across the full gamut of possible duplicate combinations that could occur at the person level between one or more HHs, with up to 30 individuals within a HH, demonstrated that this would be an extremely complex and time-consuming exercise.  It was also concluded that, due to this complexity, a full rule-based resolution strategy was likely to increase the risk of introducing bias into the Census data rather than reducing it.

All in all, the Working Group concluded that, fundamentally, a statistical approach through Estimation and Adjustment to what was likely to be a relatively low-impact problem in 2021 was a far better strategy than extending RMR.  However, it was also agreed that the idea was not completely redundant: implementing the statistical Matching strategy to identify duplicates within the desired area is a relatively easy extension to the RMR program.  Consequently, it was agreed that RMR would be extended to identify and flag these duplicates, as this information could make a significant contribution to subsequent statistical Matching and Estimation strategies. It was also agreed that in the relatively simple case of wholly duplicated HHs, a simple rule-based approach of retaining only one of the HHs would be appropriate and carry little or no risk.  While we still have to finalise the detail, this is likely to be based on the principle of retaining as much information as possible from the duplicate HHs.
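As a simplified illustration of the wholly duplicated HH case, the sketch below flags HHs within the same postcode that are the same size and contain exactly the same matched individuals. The match keys and record layout are hypothetical; in practice the person-level matches would come from the ongoing statistical Matching methodology.

```python
# Illustrative sketch of flagging "wholly duplicated" HHs within a postcode.
# A person is represented here by a hypothetical match key.

from collections import defaultdict

def flag_wholly_duplicated(households):
    """households: list of dicts with 'hh_id', 'postcode' and 'person_keys'.
    Returns sets of hh_ids that are wholly duplicated within their postcode."""
    groups = defaultdict(list)
    for hh in households:
        # Same postcode, same HH size, exactly the same matched individuals.
        signature = (hh["postcode"], len(hh["person_keys"]),
                     frozenset(hh["person_keys"]))
        groups[signature].append(hh["hh_id"])
    return [set(ids) for ids in groups.values() if len(ids) > 1]

hhs = [
    {"hh_id": "H1", "postcode": "AB1 2CD", "person_keys": {"k1", "k2"}},
    {"hh_id": "H2", "postcode": "AB1 2CD", "person_keys": {"k1", "k2"}},
    {"hh_id": "H3", "postcode": "AB1 2CD", "person_keys": {"k1"}},
]
duplicate_sets = flag_wholly_duplicated(hhs)  # [{"H1", "H2"}]
```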

  • Using Admin or Alternative data. Making use of administrative and/or alternative data sources has been a consistent theme throughout the Census programme and this was considered throughout the RMR review. Specifically, following the 2011 Census it was noted that the 2011 RMR strategy may have struggled to resolve multiple responses where it was not clear whether a ‘dummy form’, representing a unique address with no valid Census return, related to a regular property or a CE.  There were obvious candidate administrative sources that might help with this issue, such as the VoA data.  However, in this case it became relatively clear that reliable information could be pulled into the RMR process from the 2021 Field Work Management Tool (FWMT) to achieve the same thing, but without having to consider potential problems with administrative data such as its accuracy and how it may lag in time relative to Census day.  While the FWMT data is not administrative, it is alternative data that was not used in 2011, so we expect far better resolution of dummy forms in 2021.

It is fair to note that during the review we did not identify anywhere else within the RMR process where other administrative data sources would be particularly useful or would improve the quality of the outcome.  Much of this was due to improvements in areas such as the Census Address Register, the FWMT, and the Census Collection Strategy itself, which, by definition, have reduced the propensity for erroneous duplicates to occur in the Census data in the first place.  That said, we have left open the possibility of linking the RMR process to the Census Intelligence Database (CID), containing a suite of pre-linked administrative data sources for use elsewhere across the Census programme.

Table 2. General comparison of the 2011 & 2021 RMR matching & resolution strategies

Matching methods
  RMR 2011: Relatively basic & independent
  RMR 2021: Completely aligned with ongoing statistical Matching, Estimation & Adjustment Methodology

Search & resolution of multiples & duplicates within UPRN
  RMR 2011: Based on overarching 2011 Census Design:
    - Business rules 1 to n
    - Retention of information: Variable Count
    - Residual management: Micro business rules
  RMR 2021: Based on overarching 2021 Census Design:
    - Business rules 1 to n
    - Retention of information: Weighted Variable Count; Variable Count
    - Residual management: Micro business rules; Conditional propensity allocation/distribution; Random allocation/distribution

Search & resolution within wider geography
  RMR 2011: n/a
  RMR 2021: Search: Yes. Resolution: Partial, but with full flagging to support ongoing statistical Matching, Estimation & Adjustment Methodology

Use of alternative or administrative data sources
  RMR 2011: n/a
  RMR 2021: Wider admin sources considered; use of alternative information from FWMT

Module 1 – Resolves multiple CE responses
  Overview:
  - Multiple CE responses for a UPRN are resolved to one record.
  - HH responses at the same UPRN as a CE response are disregarded.
  - Persons captured on HH forms at these UPRNs are moved to the retained CE at the UPRN.
  Overarching assumptions:
  - The address frame accurately identifies HHs that are attached to CEs as different residences and assigns them a different Child UPRN to the CE.
  - When multiple residences at the same address are identified in the field, unique Child UPRNs will be created to identify the different residences as being at different addresses.

Modules 2 & 3 – Resolve multiple HH responses: Stages 1 & 2
  Overview:
  - Duplicate HH responses within the UPRN are disregarded; persons within disregarded HH responses are moved to the retained HH (which is the HH that the disregarded response was identified to be a duplicate of).
  Overarching assumptions:
  - The address frame accurately identifies different HHs at the same address and assigns them different Child UPRNs so that they are identified as different residences through processing.
  - When multiple residences at the same address are identified in the field, unique Child UPRNs will be created to identify the different residences as being at different addresses.

Module 4 – Resolve dummy responses
  Overview:
  - Multiple dummy responses for a UPRN are resolved to one.
  - A HH or CE is created at UPRNs where there are dummy response(s) but no HH or CE response.
  Overarching assumptions:
  - Multiple dummy responses for the same UPRN are identified as duplicates.
  - Duplicate dummy responses are always assumed to be duplicates from the same HH or CE.
  - When multiple residences at the same address are identified in the field, unique Child UPRNs will be created to identify the different residences as being at different addresses.

Module 5a – Resolve duplicate iForm responses (new module for 2021)
  Overview:
  - Duplicate iForm responses at the same UPRN are resolved to one iForm response.
  Overarching assumptions:
  - If the matching methods provided by Methodology identify individual responses as being duplicates, it is accepted that this is correct and these responses are resolved to one.

Modules 5b & 5c – Assign continuation forms; assign iForms
  Overview:
  - HC forms are assigned to a HH or CE at their UPRN.
  - iForms are assigned to a HH or CE at their UPRN.
  Overarching assumptions:
  - Persons captured on iForms and HC forms are captured at the correct UPRN. Therefore, they should be assigned to a HH or CE that exists at the UPRN. They do not become discovered HHs.

Module 6 – Resolve orphan responses
  Overview:
  - Orphan responses are identified as being iForms or HC forms received at UPRNs where there were no CE, HH or dummy responses.
  - One HH or one CE is created at UPRNs where there are orphan responses.
  - The orphan responses are assigned to the newly created residence at the UPRN.
  Overarching assumptions:
  - Persons on orphan iForms and HC forms are captured at the correct UPRN. Therefore, they should not be assigned to a HH/CE at a nearby address and need a residence created for them at the UPRN.
  - All orphan responses at the same UPRN are assumed to relate to the same residence.

Modules 7 & 8a – Identify and resolve duplicate individual responses
  Overview:
  - Duplicate persons in the same residence are identified and resolved.
  Overarching assumptions:
  - If the matching methods provided by Methodology identify individual responses as being duplicates, it is accepted that this is correct. These duplicate responses are then resolved to one.

Module 8b – Flag residual duplicate individual responses (new module for 2021)
  Overview:
  - Remaining duplicate individuals in the same UPRN are flagged.
  Overarching assumptions:
  - If the matching methods provided by Methodology identify individual responses as being duplicates, it is accepted that this is correct.

Modules 9 & 10 – Identify and resolve wholly duplicated HHs (new modules for 2021)
  Overview:
  - HHs within the same postcode that contain the same individuals are identified and resolved.
  Overarching assumptions:
  - If the matching methods provided by Methodology identify individual responses as being duplicates, it is accepted that this is correct.
  - HHs of the same size that contain exactly the same individuals within the same postcode are duplicate responses.

Module 11 – Resolve adjusted CE and HH data structures
  Overview:
  - Resolves the adjusted CE and HH data structures following logical rules implemented in RFP and RMR Modules 1 to 10.
  Overarching assumptions:
  - HHs that have more than 30 individuals are assumed to have submitted the wrong residence form.

Module 12 – Create RMR flags
  Overview:
  - Creates flags that detail data changes that have been undertaken by the RMR process.
  Overarching assumptions: n/a

Module 13 – Run RMR diagnostics
  Overview:
  - Checks that the RMR process has run successfully.
  - Creates diagnostics on the RMR process.
  Overarching assumptions: n/a

 RMR Modularisation: Detailed Methods & Rationale

Module # | Description of Module | Description of Rules | Rationale
1Resolves multiple CE responsesWhen both HH and CE responses are received for the same UPRN the following rules are applied:CE responses are prioritised over HH responses as there will be hand delivery of CE forms to some establishments. Therefore, if a UPRN has filled out a CE form there is likely to be good reason behind it.
-  The CE response is retained and the HH response is disregarded.There are likely to be fewer multiple responses of this type in 2021 as CEs and HHs are captured on the same Address Register.
-  Persons captured on the disregarded HH responses are moved to the retained CE.The persons captured on the disregarded HH responses are moved to the CE so that they are kept.
When multiple CE responses are received for the same UPRN.There should only be one CE at an address. If multiple responses are received, then it’s highly likely that these responses relate to the same CE.
We choose one of the CE responses to keep as the baseline CE record and disregard the other form(s).
1) The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighted function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
The selection of the CE response to be retained is based on the following set of priority rules:
2) EQ (not forced) over PQ as EQ responses are known to be of higher quality than PQ responses. PQ is taken over forced EQ responses as PQ is a response that has been submitted by the respondent whereas they haven’t submitted the forced EQ response.
1) Greatest sum of weighted completed CE questions. (The default weights are 1 for each variable)
3) Later receipted responses are favoured over other responses as it is expected that some people may choose to fill out another form to correct for a mistake or change in circumstance on an earlier form.
2) If equal on sum of weighted completed CE questions, then prioritise EQ (Not Forced) then PQ, then EQ (Forced).
4) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
3) If equally complete and same response mode, then prioritise the response that was last receipted.
5) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
4) Additional rules (if required).
Backfilling of responses is undertaken to keep as much information on the residence as possible. Information from disregarded records is expected to be more accurate than imputed values.
5) If all else is equal, then one of the forms is selected at random.
Any missing CE variables on the retained record will be populated with valid responses from a disregarded CE response for the same UPRN using the same priority rules as those used to select which CE response to retain.
2Resolve multiple HH responses: Stage 1Where there are multiple HH forms received for the same UPRN, the following sequential rules are applied to decide whether they are duplicate responses and should be merged:1) Responses with the same QID can only result when a paper form with an IAC is received by a HH that then return both internet and paper responses. This rule essentially covers the merging of EQ and PQ responses for the same address rule in 2011.
This rule will be coded in such a way that it can be toggled on/off in live running if required.
1) If the HH responses have the same QID then the HH responses are merged.
2) If matching identifies that responses contain the same individuals, then it’s highly likely that the responses relate to the same HH and are duplicates of each other.
2) The match-keys provided by Methodology are used to identify HH responses that contain the same individuals.
3Resolve multiple HH responses:Continuing the resolution of multiple HH responses from Module 2:3)  If matching identifies that responses contain the same individuals, then it’s highly likely that the responses relate to the same HH and are duplicates of each other.
Stage 2
3) HH forms that are found to contain persons in common in Module 3, Rule 2) are assumed to be the same HH and are therefore merged together.4) Minor-only HHs can only legitimately occur under very rare circumstances, and, for disclosure reasons, statistics on these HHs cannot be output in the Census. Also, the likelihood of a minor-only HH occurring at the same UPRN as a non-minor-only HH would be rare and the likelihood is that it is the same HH but for some reason only children were captured on one of the forms. Therefore, it was agreed that we should merge the HH responses in these cases.
4) If one or more of the multiple responses is a minor-only (under 16) HH and there are other non-minor-only HHs then the minor-only HH response is merged with a non-minor-only HH response, prioritising merging to those HHs which hold the highest Levenshtein score match on surname (where the Levenshtein score is above a certain parameterised threshold).  If multiple forms have the same match score or none of the non-minor-only HH responses have acceptable Levenshtein match scores, then one of these matched (or un-matched where no matches) HH responses is selected to be merged with the minor-only HH based on a random draw.5) This is a new rule for 2021. There was concern that some discovered HHs may be lost unless they are identified at this stage, e.g. a HH with a dedicated annex. Therefore, this step will look for any strong evidence on the HH responses that identifies them as being discovered HHs.  It is still to be decided what evidence can be used to support the identification of a discovered HH. Suggestions include using the responses to H8, H9 and/or H10 on the HH form. This step will be coded in such a way that additional rules for identification can be added later.
5) Based on a set of to-be-defined conditions, if there is evidence to suggest that a HH response is a discovered HH at the UPRN then we will flag it as a discovered HH and it does not go through steps 6, 7 & 8.6) This is a new rule for 2021. If there is no reason to suggest that the multiple HH responses relate to different HHs in the previous rule and at least one person’s surname matched across the forms, then it’s assumed likely that the HH responses relate to the same HH.
6) Levenshtein matching is undertaken on the surnames of the person responses. Where the Levenshtein match score is deemed acceptable between multiple HH responses they are merged.7) This is a new rule for 2021. This allows for rules to be added to merge the responses if there is evidence to suggest that they should be. A rule that may be added is the merging of Welsh language responses with English language responses at the same Child UPRN. This rule was implemented in 2011 as addresses in Wales are sent both the Welsh language form and the English language form and therefore have a higher propensity to respond twice than addresses in England.
7) Based on a set of to be defined conditions if there is evidence to suggest that the HH response(s) are not discovered HHs then we will merge them at this stage.8) Empty forms received at the same UPRN as other responses are assumed to be incomplete responses or timeshares. Therefore, the responses are merged unless there is information to suggest that they shouldn’t be.
8) If there are “empty” HH responses (HH responses with no person information filled in), these are merged with “non-empty” HH responses unless there is evidence to suggest that they are legitimate empty HHs. Where there are multiple “non-empty” HH responses, one of these responses is selected at random, to be merged with the “empty” HH response.
When merging HH responses in Modules 2 and 3 one of the HH responses is selected to be retained as the baseline HH record, based on the following priority rules:
a) The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during Census Collection.
a) Take the response with the greatest sum of weighted completed HH questions (the default weights are 1 for each variable).
b) EQ (not forced) over PQ as EQ responses are known to be of higher quality than PQ responses. PQ is taken over forced EQ responses as PQ is a response that has been submitted by the respondent whereas they haven’t submitted the forced EQ response.
b) If equally complete, then prioritise EQ (Not Forced) then PQ, then EQ (Forced)
c) Later receipted responses are favoured over other responses as it is expected that some people may choose to fill out another form to correct for a mistake or change in circumstance on an earlier form, even though they are advised against doing this.
c) If equally complete and same response mode, then prioritise the response that was last receipted.
d) This is a new rule for 2021. There is the functionality to add extra rules here, during the tuning phase if information comes to light.
d) Additional rules (if required)
e) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
e) If all else is equal, then one of the forms is selected at random.
Backfilling of responses is done to keep as much information on the residence as possible. Information from disregarded records is expected to be more accurate than imputed values.
Any missing HH variables on the retained record will be populated with valid responses from a disregarded HH response for the same UPRN using the same priority rules as those used to select which HH response to retain.
4Resolve dummy responsesThe following sequential rules are undertaken on dummy response data:1) It is an assumption of the module that a dummy response is a duplicate response of a CE or HH response captured at the same UPRN. 
1) If there is a CE response captured at the UPRN then all dummy responses at the UPRN are discarded.The CE response is retained over the dummy response when both are captured at the same UPRN for a variety of reasons:
-  The CE form contains the questions that are required, whereas the dummy form contains questions that could be used to create CEs from.
2) If there is a HH response captured at the UPRN then all dummy responses at the UPRN are discarded.-  It is not reasonable to place a response burden on a respondent to fill out a CE form if it is then ignored in favour of a field interviewer’s view.
-  Responses filled out by a site manager are likely to be more reliable than those filled out by an enumerator who may not even have access to the establishment.
3) If a dummy response contains both HH and CE information, the response is split into two separate records in the data. One containing only HH information and the other containing only CE information.
2) It is an assumption of the module that a dummy response is a duplicate response of a CE or HH response captured at the same UPRN. The HH response is retained over the dummy response when both are captured at the same UPRN for a variety of reasons:
4) One of the dummy responses at the UPRN is selected to be retained based on the following priority rules:-  The HH form contains more questions that directly map to the HH table on the CDM.
a) Greatest sum of weighted completed dummy questions (the default weights are 1 for each variable).-  There is not much point in asking a resident to fill out their HH form if we then just ignore their response in favour of a field interviewer who most likely did not have access to the residence.
b) FWMT over any potential PQ back-up option.-  A resident’s response about the HH they live in is more reliable than those filled out by an enumerator who may not even have access to the HH.
c) Last Receipted
d) Additional rules (if required)In 2011, the Type of Accommodation and Self-Contained questions on the dummy response were merged with the same fields on the HH response if the variables were missing on the HH response but observed on the dummy response. It was decided against undertaking this action for 2021 as there was concern about any potential field interviewer bias on these responses and it was noted that imputation would likely better recover the true values of these variables for the HH.
e) If all else is equal, then one of these forms is selected at random.
3) A residence can either be a HH or a CE it cannot be both, therefore the forms are split so that either a HH or a CE is created.
5) Any missing dummy variables on the retained record will be populated with valid responses from a disregarded dummy response for the same UPRN using the same priority rules as those used to select which dummy response to retain.
4)  
6) If the retained dummy response is a CE then a new CE record is created on the CE table for this UPRN. The CE record will be flagged as having been created from a dummy response. Any dummy question variables that exist on the CE table will be populated for this new record from the responses on the dummy form. IDs for this row will be created from the dummy Response ID. All other information will be set to missing (-9) on the new record.a)  The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
7) Otherwise, If the retained dummy response is a HH then a new HH record is created on the HH table for this UPRN. The HH record will be flagged as having been created from a dummy response. Any dummy question variables that exist on the HH table will be populated for this new record from the responses on the dummy form. IDs for this row will be created from the dummy Response ID. All other information will be set to missing (-9) on the new record.b) FWMT responses will be taken over any potential PQ back-up as the quality of response on EQ is known to be higher than that of PQ.
c) Later receipted responses are favoured over other responses as there is an assumption that a latter response may be filled out to correct for a mistake or change in circumstance on an earlier form.
d) This is a new rule for 2021. There is the functionality to add extra rules here, during the tuning phase if information comes to light.
e) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
5) This follows the general principles of RMR. Any missing information should be populated from other completed dummy responses at the same UPRN. The prioritisation rules are the same as those used to select the primary dummy response. Keeping these prioritisation rules retains consistency between responses.
6) This action undertakes the key reason for collecting dummy response information, to create a residence when no response is received for a UPRN. If this action was not undertaken, the address may be assigned the wrong residence type, or it may be missed completely by the Census.
7) This action undertakes the key reason for collecting dummy response information, to create a residence when no response is received for a UPRN. If this action was not undertaken, the address may be assigned the wrong residence type, or it may be missed completely by the Census.
5aResolve duplicate iForm responsesWhen multiple iForm responses are received for the same UPRN, the match-keys provided by Methodology are used to identify duplicate responses.This is a new module for 2021.
If there are duplicate responses and at least one of these are not EQ (Forced) responses, then the following sequential priority rules are used to decide which of the responses to retain (forced EQ responses are not considered in rules a to e):In 2011, the de-duplication of all individual responses (all form types) within the same residence was undertaken in one rule set at the end of RMR.
a) Last receipted.However, it was discovered that duplicate iForm responses could be assigned to different HHs at the same UPRN because the assigning of iForms to HHs contains a random element (see Module 5c). This meant that these duplicates would not be resolved by RMR as they were assigned to different residences.
Therefore, for 2021, we are bringing forward the resolution of duplicate iForm responses to the stage before we assign them to HHs.
b) If receipted on the same date, take the response with the greatest sum of weighted completed individual questions (the default weights are 1 for each variable) of those last receipted.
Duplicate iForms can be resolved before duplicates from other form types as iForm responses are prioritised over other form types in Module 8a. For more information, please see the Rationale for Module 8a.
c) If they are equal on sum of weighted completed questions and are last receipted, then choose EQ (not forced) over PQ.
In a change to the 2011 rules and individual responses from other form types, we are prioritising last receipted iForm responses ahead of most complete iForm responses.
d) Additional rules (if required)
This rule is based on respondents being advised to fill out iForms as fully as possible when they wish to correct information from earlier submitted forms. It is therefore an assumption that the last receipted iForm will contain the correct information for the individual and these responses should therefore be prioritised.
e) If all else is equal, then one of these forms is selected at random.
As a result of this, non-forced submitted iForms are prioritised ahead of forced submitted forms, regardless of completeness. This is because the forced submissions will always be the last receipted form at the UPRN as they do not get uploaded/receipted until EQ closes, yet it is not known when the information on these forms was supplied.
Otherwise, if all the iForm responses are EQ (Forced), these rules are applied:
f) Take the response with the greatest sum of weighted completed individual questions (the default weights are 1 for each variable).a) If a respondent is adamant that they wish to correct information that they have already submitted, they will be advised to fill out an iForm as fully as possible. iForm responses that are receipted later are assumed to provide the “correct” information for the respondent and are therefore prioritised ahead of earlier responses.
g) Additional rules (if required)b) The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection. Again, the submissions through EQ forced are not considered at this point because these responses haven’t been submitted by a respondent and this rule is used to decide between two submitted forms that are last receipted.
h) If all else equal, then one of these forms is selected at random.c) This is a general rule of RMR and is used as EQ responses are expected to be of higher quality than PQ responses.
d) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
e) A random approach is used as the last measure when looking at only non-forced submissions to ensure no bias.
Missing person variables on the retained record will be populated with valid responses from a disregarded iForm responses for the same UPRN prioritising valid responses from the disregarded forms based on the following rules:
f) The Forced response with the greatest sum of weighted completed individual questions is selected if there are only forced iForm responses. The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
1)       Response with the greatest sum of weighted completed individual questions (the default weights are 1 for each variable).
g) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
2)       If multiple responses have the greatest sum of weighted completed questions then prioritise EQ (Not forced), PQ then EQ (Forced)
h) If a single retained response cannot be identified using the preceding rules, then one of these forms will be selected using a random approach as a last measure to ensure no bias.
3)       If equally complete and same response mode, prioritise the response that was receipted last.
4)       Additional rules (if required)1) The record with the greatest sum of weighted completed questions is retained to ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
5)       If all else equal, one of these forms is selected at random.2) EQ are known to be of a higher quality than PQ responses. Non-forced submissions are prioritised over forced submissions as a respondent has knowingly sent the non-forced entries in and therefore, they are likely to be of higher quality.
Exceptions for backfilling are:3) Later receipted responses are favoured over other responses as there is an assumption that a latter response may be filled out to correct for a mistake or change in circumstance on an earlier form.
-  Voluntary questions.
-  Partially filled Address, DOB, Year of Arrival and Citizenship fields4) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
5) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
Voluntary questions will not be backfilled as legally, missing responses have to be treated as valid responses as they could be seen as a refusal to answer the question.
Partially filled blocks of questions will also not be backfilled, to avoid creating inconsistencies in this data.
5bAssign continuation forms1) When an HC Form is captured at the same UPRN as one retained CE, the persons captured on the HC form will be assigned to the CE.1) Anticipates best intentions where someone has requested an HC form to fill out individual information but resides at a CE. As Module 4 creates CEs from dummy responses, a HC response could be assigned to a CE created from a dummy here.
As the HC form doesn’t provide any residence information but the dummy form does it was decided that the residence type should be taken from the dummy response.
2) Persons captured on HC forms at the same UPRN as one retained HH are assigned to the retained HH.
2) Anticipates best intentions where someone has requested a HC form for a HH.
3) The following priority rules are used to choose which HH to assign the HC form to when there are multiple retained HH at the same child UPRN at which a HC form was received:
3) These priority rules are new for 2021, in 2011 the rules were based around the H2 Question on the HH form (How many people usually live here?) and then Soundex matching of last name.
a) Using match-keys, when there are matches on match-keys to individuals across multiple HHs the HH is selected based on:
i) HH with the individual that has matched on the highest of the match-keys in the hierarchical system.It is known that the quality of the completion of question H2 was likely not fit for purpose in 2011 and in 2021 it will no longer be asked on the EQ. Therefore, we decided against using these methods for 2021 to prevent bias being introduced through this process.
ii) Random selection
When assigning the HC forms to a HH we decided to focus on ensuring that duplicate persons are assigned to the same HH where possible. Therefore, when linking HC forms to a HH record, we prioritise assigning the HC forms to the retained HH that contains a duplicate response of a resident captured on the HC form. Assigning the HC form to this HH will allow these duplicates to be resolved in Module 8.  The assigning to a HH that contains a duplicate individual occurs in rules a) and b).
b) Levenshtein Score matching of full name – first name and last name (Empty persons), when there are matches on full name to individuals across multiple HHs then the HH is selected based on:
i) Highest first name Levenshtein match scorea) The match-keys are used to identify potential duplicates. When multiple HHs share people in common with the HC form. The HH with the best matched (Highest match-key) duplicate is selected.
ii) Random selection
b) The second phase of duplicate matching, as empty individuals only contain name information, the only way that they can be identified as a duplicate is through name matching. As this is not as strong a matching method as the match-keys it is behind them in the priority order. The rule (i) for choosing between multiple matched HHs is still to be confirmed through further research.
c) Levenshtein Score matching of full name –
first name and last name (Non-empty persons), when there are matches on full name to multiple HHs the HH is selected based on:If duplicates are not found, further name matching methods are used to select a HH to assign the HC form to through rules c) and d).
i) Highest first name Levenshtein match score
ii) Random selectionc) Full Name matching of non-empty persons is undertaken first because people with the same name at the same address are more likely to live together as they are likely to be fathers and sons. The rule (i) for choosing between multiple matched HHs is still to be confirmed through further research.
d) Levenshtein Score matching on last name only, when there are matches on last name to individuals across multiple HHs the HH is selected based on:d) Last name is then used as it is believed that persons with the same last name at the same address are more likely to live together.
i) Highest Levenshtein match score
ii) Random selectione) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
e) Additional rules (if required)f) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
f) If all else equal, one of the HHs is selected at random.
5cAssign iForms1) When an iForm is captured at the same UPRN as a retained CE, the person captured on the iForm will be assigned to the CE.1) Anticipates best intentions where someone has requested an iForm to fill out individual information but resides at a CE.
2) Persons captured on iForms at the same UPRN as one retained HH are assigned to the retained HH.2) Anticipates best intentions where someone has requested an iForm but resides at a HH.
3) The following priority rules are used to choose which HH to assign the iForm to when there are multiple retained HHs at the same child UPRN at which an iForm was received:3) These priority rules are new for 2021; in 2011 the rules were based around Soundex matching of name and the H2 Question on the HH form (How many people usually live here?).
a) Using match-keys, when there are matches on match-keys to individuals across multiple HHs the HH is selected based onIt is known that the quality of the completion of question H2 was likely not fit for purpose in 2011 and in 2021 it will no longer be asked on the EQ. Therefore, we decided against using these methods for 2021 to prevent bias being introduced through this process.
i) HH with the individual that has matched on the highest of the match-keys in the hierarchical system.
ii) Random selectionWhen assigning the iForms to a HH we decided to focus on ensuring that duplicate persons are assigned to the same HH where possible. Therefore, when linking iForms to a HH record, we prioritise assigning the iForms to the retained HH that contains a duplicate response of the resident captured on the iForm. Assigning the iForm to this HH will allow these duplicates to be resolved in Module 8.  The assigning to a HH that contains a duplicate individual occurs in rules a) and b).
b) Levenshtein Score matching of full name – first name and last name (empty persons), when there are matches on full name to individuals across multiple HHs, the HH is selected based on:
i) Highest first name Levenshtein match score
ii) Random selectiona) The match-keys are used to identify potential duplicates. When multiple HHs share people in common with the iForm. The HH with the best matched (Highest match-key) duplicate is selected.
c) Levenshtein Score matching of full name – b) The second phase of duplicate matching, as empty individuals only contain name information, the only way that they can be identified as a duplicate is through name matching. As this is not as strong a matching method as the match-keys it is behind them in the priority order. The rule i) for choosing between multiple matched HHs is still to be confirmed.
first name and last name (Non-empty persons), when there are matches on full name to individuals across multiple HHs, the HH is selected based on:
i) Highest first name Levenshtein match scoreIf duplicates are not found, then further name matching is used to select a HH to assign the iForm to through rules c) and d).
ii) Random selection
c) Full name matching of non-empty persons is undertaken first because people with the same name at the same address are more likely to live together as they are likely to be fathers and sons. The rule i) for choosing between multiple matched HHs is still to be confirmed.
d) Levenshtein Score matching on last name only, when there are matches on last name to individuals across multiple HHs, the HH is selected based on:
i) Highest Levenshtein match score d) Last name is then used as it is believed that persons with the same last name at the same address are more likely to live together.
ii) Random selection
e) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
e) Additional rules (if required)
f)  A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
f) If all else equal, then one of the HHs is selected at random.
An “empty” person is an individual response that only contains the first name and last name of the individual. They are only retained through RFP if they are captured on a HH form and only contain full name.
The person captured on the iForm is assigned the person number of the first empty person that they match on full name, or, if there are no matches to empty persons, at the end of all existing individual places already filled. If it is assigned the person number of the first empty person that it matches on full name, then the first matched empty HH person record is disregarded.It is expected that the most common occurrence of “empty” persons is when a respondent is named (first name and last name) in the HH section of the form but no information is provided for them in the individual section as they intend to fill out an iForm for the individual response.
Copies of the relationships that include the matched disregarded empty person are created, with the resident_id/related_resident_id updated so that all instances of the resident_id of the disregarded empty person are changed to be that of the person captured on the iForm.If an iForm matches an empty person then the iForm is assumed to be a duplicate of the empty person. As the empty person contains less information than the matching iForm person, it is disregarded. The iForm then adopts the person number of the disregarded empty person as this is the position that it is assumed that the iForm response should exist within the HH structure.
If the iForm response matches multiple empty persons, then it is assigned the person number of the matched empty person with the lowest person number. This is to preserve as much of the relationship information as possible for this person.
Empty persons that are from PQ or EQ (Forced) are disregarded.
Empty persons captured on PQ or EQ (Forced) are disregarded at this point as there is not 100% assurance (no validation) for these collection modes that the empty persons are a result of requesting an iForm. They may just be poorly filled out responses, therefore they are disregarded.
Empty persons captured on EQ (not forced) are retained, unless they match on name to an iForm. This is because there is reasonable evidence that this response represents a real person that requested but did not return an iForm.
Without this rule, there is a risk of bias in 1-person HHs within Houses in Multiple Occupation (HMOs), because adjustment won’t add people into counted HHs, and HMOs may be more likely to respond online.
The retaining of empty EQ (not forced) submissions will be coded in such a way that it can be toggled on and off as required during live processing.
6Resolve orphan responsesOrphan responses are identified as records captured on iForms or HC Forms at UPRNs for which no HH forms and CE forms are received.1) There must be a good reason as to why a respondent would fill out a HC form for an address. The form clearly states that it is for a HH and so the assumption is that the respondent would have been aware of this and only filled out the form if they lived in a HH. This takes priority over iForms as there is a clear indication on the form that it is to be filled out for a HH.
The following priority rules are used to determine whether a HH or CE is created at a UPRN where there are orphan responses:2)  
-  The planned approach for collecting individuals residing at CEs is through iForms.
1) If there is at least one HC form at the UPRN, create a HH.-  It’s unlikely that a respondent would indicate on an iForm that they belong to a CE if they resided in a HH. For this reason, this response is prioritised ahead of a response ticking HH for this question.
2) Else, if at least one of the iForms indicates on the Type of Establishment question that the form relates to a CE, create a CE.
Rules 3, 4, 5 & 6 are new for 2021. In 2011, the final rule was simply to set any remaining orphan residences to be a HH; it’s likely that this would have resulted in a slight overcount of HHs.
3) Else, if at least one of the iForms indicates on the Type of Establishment question that the form related to a HH, create a HH.For 2021 we plan to use all available information on the questionnaires and FWMT to assess as correctly as possible the residence type of the address.
4) Else, use FWMT information for the UPRN to determine whether a HH or CE should be created.
3) The only remaining information provided by the respondent that can be used to determine the residence type is the Type of Establishment question. Information provided by the respondent is prioritised ahead of information provided by enumerators/administrative data.
5) Else, additional Rules (if required)
4) A new rule for 2021. It is understood that there is information on the FWMT data that can be used to determine whether the residence at the UPRN is a HH or CE. Current understanding is that this is a flag that indicates the residence type, and this comes from the address register but can also be updated by enumerators in the field. This is prioritised above a random draw as there is trustworthy information to inform the decision.
6) Else, randomly choose whether to create a HH or CE.
5) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
All orphan responses are assigned to the newly created residence at their UPRN.
6) A random approach is used as the last measure to ensure no bias.
7Identify duplicate Individual responsesThe match-keys provided by Methodology are used to identify and flag duplicate individuals in the same residence.Matching is undertaken to identify duplicate persons captured at the same residence.
Any empty persons that Levenshtein Score match on full name to another individual in the residence are flagged as duplicates.
8aResolve duplicate individual responsesWhen duplicate individuals are identified within the same residence, one of the individuals is selected to be retained and the others are discarded.1) Responses captured on iForms are prioritised over other form responses as there are special collection reasons for trusting iForm responses over other responses, these include:
The following priority rules are used to select which individual to retain as the baseline record:-  People are advised to fill out an iForm if there is sensitive information that they wish to record but do not want to disclose to other HH members.
-  If people feel strongly about correcting information that they have already submitted, then they are advised to do so using an iForm.
1) iForm-  iForm responses are less likely to be filled out by proxy.
2) Response with the greatest sum of weighted completed individual questions (the default weights are 1 for each variable).2) This rule helps ensure that the most consistent record is kept. The weighting function has been added to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
3) EQ are known to be of a higher quality than PQ responses. Non-forced submissions are prioritised over forced submissions as a respondent has knowingly sent the non-forced entries in and therefore, they are likely to be of higher quality.
3) EQ (Not Forced), PQ, then EQ (Forced)
4) Later receipted responses are favoured over other responses as it is expected that some people may choose to fill out another form to correct for a mistake or change in circumstance on an earlier form.
4) Last receipted response
5) There is the functionality to add extra rules here, during the tuning phase if information comes to light.
5) Additional Rules (if required)
6) A random approach is used as the last measure to ensure no bias. This is an improvement on 2011 when the last measure was to take the first one found.
6) If all else equal, one of the individuals is selected at random.
Voluntary questions will not be backfilled as legally, missing responses have to be treated as valid responses as they could be seen as a refusal to answer the question.
Any missing Individual variables on the retained record will be populated with valid responses from a disregarded duplicate individual response using the same priority rules as above.
Partially filled blocks of questions will also not be backfilled, to avoid creating inconsistencies in this data.
Exceptions for backfilling are:
-  Voluntary questions.
-  Partially filled Address, DOB, Year of Arrival and Citizenship fields
8bFlag residual duplicate individual responsesThe match-keys provided by Methodology are used to identify and flag duplicate retained individuals that are within the same UPRN.It is possible, but very unlikely, that residual duplicates within the UPRN can occur.
This is caused by duplicates on 2 or more HC forms being assigned to different HHs at the same UPRN in Module 5b. The forms could be assigned to different HHs if they match multiple HHs or they are assigned to a HH based on random selection.
The scale of these duplicates will determine whether further intervention is required.
A solution to this has been discussed at the Working Group, and it was noted that there would not be a quick fix. It’s likely that the fix would result in having to recode parts of Module 2 & 3.
The counts of missed duplicates in 2011 were very low, likely in the 00s. It is expected that this will be even lower in 2021 due to:
1)       Improvements in the Address Register correctly identifying HHs as being at different addresses.
2)       The uptake of HC forms being lower.
These duplicates will be flagged and counted to determine whether an intervention or fix is required.
9Identify wholly duplicated HHsThe match-keys are used to identify duplicate persons in the postcode.Assessment of the 2011 Census Overcount Methodology (Dini & Large, 2014) made the following recommendation for RMR:
Levenshtein score matching is used to identify empty persons that are duplicated within the postcode.“The RMR processing would benefit from looking within the postcode for duplicates, in addition to looking within the address”.
HHs of the same size within the same postcode that contain the same persons are identified and flagged as wholly duplicated HHs.As mentioned in Section 2 of this paper, it was known in 2011 that problems in the Address Register meant that on occasion the same HH was listed multiple times at slightly different UAIs. This led to these HHs being followed up for multiple responses.
As RMR sought to resolve duplicates within the HH, these duplicate responses were not resolved in 2011.
In this module we will identify and flag any remaining duplicates within the postcode.
We will also flag HHs of the same size for which every individual is duplicated. We refer to these HHs as “wholly duplicated HHs”.
We do not attempt to resolve any duplicates other than the wholly duplicated HHs, as it was found that the combinatorial problem space for resolving duplicates of this nature between UPRNs, within the same small area, is too vast to accurately and feasibly manage through a deterministic rule set.
10Resolve wholly duplicated HHsWhen wholly duplicated HH are identified, one of the HHs (and residents) is selected to be retained and the others are discarded.In line with the rest of RMR, one of the wholly duplicated HH (along with the individuals) is retained and the other(s) disregarded.
The following priority rules are used to select which HH (and residents) to retain:

1) Greatest sum of weighted completed HH and individual questions (the default weights are 1 for each variable).

2) If equally completed, then one of the HHs is selected at random.

Responses have been merged by this point, and so the responses could have come from different response modes with varying receipt dates. Therefore, the only remaining general RMR priority rules that can be used to identify which records to retain are 1) most complete (to retain as much information as possible) and 2) random draw.
The HH response with the greatest sum of weighted completed HH and individual questions is selected to be retained. This ensures that as much information as possible is retained for the combined HH and individual(s) responses. The weighting function is included to allow for evidence-based prioritisation of any Census variable or subset of variables following analyses of the 2021 Census data during the Census Collection.
The other HHs and their residents are flagged to be disregarded.
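As an illustration, a minimal sketch of this retention rule; the field names and the weighting dictionary are hypothetical, not the production specification:

# Illustrative sketch of the Module 10 retention rule (hypothetical field names).
import random

def completeness_score(household, weights):
    """Sum of weighted completed HH and individual questions (default weight 1)."""
    score = sum(weights.get(q, 1) for q, v in household["hh_answers"].items() if v is not None)
    for person in household["residents"]:
        score += sum(weights.get(q, 1) for q, v in person.items() if v is not None)
    return score

def resolve_wholly_duplicated(households, weights=None):
    """Retain the most complete HH, breaking ties at random; flag the rest as disregarded."""
    weights = weights or {}
    scores = [completeness_score(hh, weights) for hh in households]
    best = max(scores)
    retained = random.choice([hh for hh, s in zip(households, scores) if s == best])
    for hh in households:
        hh["disregarded"] = hh is not retained
    return retained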
The creation of the flag to identify duplicate persons will help later processes account for these responses in their methodology.
Remaining duplicate individuals flagged in Module 9 are retained but the flag is persisted to highlight the extent of the remaining overcount within the postcode to later processes.
This flag will indicate the records that are identified to be the same individual.
Module 11 – Resolve adjusted CE and HH data structures

The rules for this module are still to be discussed at the RMR Working Group. The proposed rules to be taken for discussion, each followed by its rationale, are (a minimal sketch of rules 1 and 2 is given after this list):

1) HHs that contain more than 30 persons are converted to CEs. The HH is disregarded, a CE is created, and the residents are moved to the new CE. (Rationale: HHs of this size are more likely to be a CE than a HH.)

2) Reorder person numbers, ensuring that person numbers are sequential with no gaps. When reordering, persons captured on the same form will be given sequential person numbers. (Rationale: sequential person numbering is required by later processes.)

3) All non-applicable relationship records are disregarded (i.e. the relationship is no longer required based on newly assigned person numbers, the resident is disregarded, or the resident is moved to residing at a CE). (Rationale: this relationship data is no longer required and is therefore logically deleted.)

4) New relationship records are created where required; these records will have their Relationship set to missing (i.e. a relationship is now required based on newly assigned person numbers, or the resident is moved to residing at a HH). Where the new record is a HC form individual’s relationship with person 1, whatever relationship is captured on the HC form will populate the relationship field. (Rationale: new relationships are required based on changes to residence and/or person number.)

5) Disregard visitor data from disregarded HHs. (Rationale: this visitor data is no longer required and is therefore logically deleted.)

6) Address One Year Ago information is set to missing where an individual selected the “Same as Person 1” response and person 1 on the questionnaire is disregarded. (Rationale: person 1 no longer exists, so residents’ responses are set to missing where they responded with “Same as Person 1”.)
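For illustration only, a minimal sketch of rules 1 and 2 under assumed, hypothetical data structures; it is not the production rule set:

# Hypothetical sketch of Module 11 rules 1 and 2 (not the production specification).
CE_THRESHOLD = 30   # HHs with more than 30 persons are treated as more likely to be CEs

def convert_large_hh_to_ce(household):
    """Rule 1: disregard the HH, create a CE and move its residents to it."""
    if len(household["residents"]) > CE_THRESHOLD:
        household["disregarded"] = True
        return {"type": "CE", "uprn": household["uprn"], "residents": household["residents"]}
    return None

def renumber_persons(residents):
    """Rule 2: assign sequential person numbers with no gaps, keeping persons
    captured on the same form together in the new ordering."""
    ordered = sorted(residents, key=lambda p: (p["form_id"], p["person_number"]))
    for new_number, person in enumerate(ordered, start=1):
        person["person_number"] = new_number
    return ordered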
Module 12 – Create RMR Flags (N/A)

Module 13 – Run RMR Diagnostics (N/A)

References

J Plachta and R Shipsey (2019). Methodology Report on Identifying Duplicate Persons for Resolving Multiple Responses in the 2021 Census. ONS Technical Report

J Plachta, Office for National Statistics (2020). Methodology Report on Identifying Same Surnames for Resolving Multiple Responses in the 2021 Census. ONS Technical Report

Dini and Large (2014). Assessment of the 2011 Census Overcount Methodology.

EAP135 – COVID-19 Supplement on no field follow-up

A weighting classes approach for estimating population size under ‘scenario 3’ (no field follow-up or CCS field collection)

Post MARP note: This paper represents initial thinking on COVID-19 impact when presented to the panel in May 2020. Further information on impact and the statistical design has subsequently been published on the ONS website.

This note summarises a potential method for producing Census population estimates in the scenario that field activities are restricted around the time of the 2021 Census as a consequence of the government’s response to coronavirus.

This is being referred to as ‘scenario 3’ and is an extension to work already being explored to make use of admin records in low census response scenarios. The key difference in scenario 3 is that we would not be able to rely on a CCS field exercise.

Assumptions

  • We assume that online Census collection will proceed as planned with an anticipated response rate of 75-80%.
  • This paper focuses on three main sources of coverage error that would be incurred in this scenario: Census household non-response, Census within household non-response, and admin data over-coverage.
  • Other sources of bias in the estimation framework would need to be evaluated before making any decisions about the viability of these options.

Options considered

The first option we considered was to substitute for non-responding households using admin data. This would entail identifying non-responding addresses on the Census frame and looking for strong evidence of individuals living at those addresses in the admin records. There would be no ‘estimation framework’ as such, but a combining of Census and admin records to populate the address register. This option has been ruled out as it would likely under-estimate population size considerably due to (a) missed enumerations within households that do respond, and (b) lags in admin records meaning some people have not yet registered at the addresses they currently reside in.

The second option considered was to use administrative records as a second listing for dual system estimation. Under this option, administrative data would assume the role of coverage survey on a much larger scale. The precision of the estimates would be high, however the work of Population and Migration Statistics Transformation (PMST) has shown that over-coverage of administrative records, despite best efforts to remove them, remains persistent. Based on research already undertaken, this option has been ruled out due to the risk of over-estimation in Census population estimates.

The third option, and our proposed option for discussion, is an alternative capture/recapture method based on weighting classes. Under this method, census counts obtained from responding households are compared with counts from administrative data in the same addresses. Assuming that the Census counts are correct, an adjustment weight can be derived from comparison with the equivalent admin data counts. This adjustment weight can be used on the full listing of addresses recorded in admin data to produce a population estimate. While this option lends itself more favourably to the administrative data we currently have in the office, there are additional requirements that would need to be met in preparation to use this method.
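As a rough illustration of the mechanics only, the following sketch derives a weighting-class adjustment from responding addresses and applies it to the full admin address listing. The class definitions, data structures and function name are hypothetical, not the proposed production design:

# Hypothetical sketch of the weighting-classes estimator described above.
from collections import defaultdict

def weighting_class_estimate(addresses):
    """addresses: list of dicts with keys
       'class'        - weighting class (e.g. geography by household type),
       'responded'    - True if the household returned a census form,
       'census_count' - persons counted by the census (responding addresses only),
       'admin_count'  - persons listed at the address in the admin data."""
    census_by_class = defaultdict(float)
    admin_resp_by_class = defaultdict(float)
    admin_all_by_class = defaultdict(float)
    for a in addresses:
        admin_all_by_class[a["class"]] += a["admin_count"]
        if a["responded"]:
            census_by_class[a["class"]] += a["census_count"]
            admin_resp_by_class[a["class"]] += a["admin_count"]
    total = 0.0
    for c, admin_all in admin_all_by_class.items():
        # Adjustment weight: census persons per admin-data person among responding
        # addresses, assumed (as in the method above) to hold for non-responding
        # addresses in the same class. Assumes every class has responding addresses.
        weight = census_by_class[c] / admin_resp_by_class[c]
        total += weight * admin_all
    return total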

Weighting classes – requirements

To tackle potential sources of error with the weighting classes approach, additional steps need to be taken to address: over-coverage of occupied addresses on admin data, incomplete census address frame coverage, and within-household non-response.

  1. Administrative Data Requirements (to tackle over-coverage of occupied addresses):

Since most of our work on admin data has been targeted at a DSE approach, data acquisition prioritisation up until now has focused on sources with the potential to provide ‘activity’ type information. The weighting classes approach instead relies on counts from record-level administrative data being used as auxiliary information to adjust for census non-responding households. This has some advantages regarding the current availability of administrative data, since (a) a single source with high coverage of occupied addresses might be optimal, and (b) over-coverage at the person level can be tolerated by the weighting classes estimator if the over-coverage propensity is similar between responding and non-responding households.

While some of the broad coverage datasets we already have access to (PDS and CIS for example) may meet our requirements for residential address coverage, it is vital to have information confirming whether an address is occupied around Census day. For this reason, utilities data (at UPRN level) would be the most important data to acquire to support this approach, as it would help protect against overestimation by removing admin records in addresses that are no longer occupied.

  2. Address Listing Requirements (to tackle missing addresses on the frame):

One of the key functions of the CCS is to identify households which were not on the address frame and adjust in the estimation process accordingly. We have quality targets for the address frame and do not anticipate it being 100% accurate despite best endeavours. In simple terms this is +/-1% (there will be overcount and undercount).

To proceed with the weighting classes approach without any additional adjustment for frame under-coverage would require us to be satisfied that the administrative data compiled does in fact cover all (or the vast majority) of residential addresses missing from the census address frame. This is unlikely to be true – while we expect that a proportion of these addresses may be captured in administrative data, we cannot assume it will be all of them.

A separate address checking exercise is therefore likely to be needed, with consideration for all areas across the country. Potential options are:

  • Full property listings in selected areas (similar to the pre-interview stage of the CCS)
  • Sampling administrative records not found on the frame and either verifying by desktop research or visiting in the field.

The latter of these assumes that there is a reasonable degree of independence between addresses being captured on the frame and those captured on admin data. Any fieldwork undertaken would not require contact with residents, however we assume that there would be constraints on undertaking any field activities around Census time if a lockdown scenario was in place. We would therefore require that any address checking activity that needs to be undertaken in the field should take place at available opportunities between now and spring next year.

An approach based on desktop research would support the requirement for utilities data as it is more likely to differentiate between addresses which were occupied by usual residents and those which weren’t (either vacant or second homes). Council tax data and other admin datasets we have at our disposal may struggle in some areas to identify second homes.

  3. Survey Requirements (to tackle within-household non-response):

The use of admin data in weighting classes is only viable for estimating population in Census non-responding households. There is still a need to adjust for ‘within-household non-response’, which relates to individuals that are missed within households that have completed Census forms.

A second listing from a sample of census responding households is needed to do this. The sample size of this survey can be relatively small, as it is only intended to adjust for within household non-response, which does not have high variability.

The second listing does need to be independent – it cannot be an online survey as this would likely result in the same capture failures. Under a lockdown scenario, telephone interviewing is the only collection mode that would offer a viable alternative to CCS doorstep interviewing. In order to carry out telephone interviews, phone numbers need to be collected from potential sample households. We’ve considered two ways of doing this:

  • Collect telephone numbers from the household reference person on the online Census collection.
  • Initiate a boost to the LMS online survey 3 months prior to Census day, and then recontact for a second listing exercise shortly after Census day.

Phone numbers are already collected in the online LMS, with reasonably high completion rates (80-90%). Phone number collection is not currently included in the Census collection, and it is not viable to undertake sufficient research to implement it when there may be an impact on response rates.

Research and Delivery Timelines

Research teams are currently committed to delivering the standard design for the Census in 2021 and preparing for low response scenarios already identified if the Census goes ahead as planned. To understand the likely quality of the weighting class approach outlined here would require simulation studies to be undertaken by MDR, the results of which would need to be available by September 2020 to make a decision regarding the suitability of this approach as a contingency option.

If in the event of lockdown there was a requirement to operationalise this method, it is unlikely that estimates of population size could be produced within a year of Census day. The estimation framework is not as well understood as DSE and will require additional methodological development and quality assurance beyond the standard design.

EAP134 – Impact of Covid on 2021 Census Statistical Quality

Post MARP note: This paper represents initial thinking on COVID-19 impact when presented to the panel in May 2020. Further information on impact and the statistical design has subsequently been published on the ONS website.

Overview

The impact of Covid19 is being considered within ONS with reference to three scenarios:

  • Scenario 1 – Disruption for 3 months but transition back to normal operations by July 2020
  • Scenario 2 – Same as scenario 1 with additional unplanned disruption in Autumn/Winter 2020
  • Scenario 3 – Same as scenario 2 with additional planned and ongoing disruption

Impact on statistical quality is set out in the annex to this note, with a summary on statistical quality below.

Overview of Scenarios on Statistical Quality

Impact Scale = 1 (Minor) to 5 (Major)

Impact on Statistical Design Phases

Scenario 1 (Total Impact Score = 7)

Design/Build:
• Address Frame Quality without a field address check (1)
• Economic/Social change impacts Hard to Count (HtC) index (1)
• Modelling assumptions changed (1)
• Delay in recruiting community engagement officers impacting Target Action Groups (1)

Monitor/Counter:
• Reduced ability of Response Chasing Algorithm to effectively identify interventions due to changes in underlying modelling assumptions (1)

Process/Estimate:
• Uncertainty with availability of administrative data so potential impact on ability to identify and adjust (1)
• Increased variability in other statistical design elements results in increased variability in resulting estimates (1)

Scenario 2 (Total Impact Score = 14)

Design/Build:
• Address Frame Quality without a field address check (1)
• Economic/Social change impacts HtC (1)
• Modelling assumptions changed (1)
• Delay in recruiting community engagement officers impacting Target Action Groups (3)

Monitor/Counter:
• Reduced ability of Response Chasing Algorithm to effectively identify interventions due to changes in underlying modelling assumptions (1)
• Field operation capacity and capability is impacted (2)

Process/Estimate:
• Uncertainty with availability of administrative data so potential impact on ability to identify and adjust (2)
• Increased variability in other statistical design elements results in increased variability in resulting estimates (2)
• Census Coverage Survey (CCS) planning begins to be impacted (1)

Scenario 3 (Total Impact Score = 29)

Design/Build:
• Address Frame Quality without a field address check (1)
• Economic/Social change impacts HtC (3)
• Modelling assumptions changed (3)
• Delay in recruiting community engagement officers impacting Target Action Groups (5)

Monitor/Counter:
• Reduced ability of Response Chasing Algorithm to effectively identify interventions due to changes in underlying modelling assumptions (2)
• Field operation capacity and capability becomes minimised resulting in major variability and bias in estimates (5)

Process/Estimate:
• Uncertainty with availability of administrative data so potential impact on ability to identify and adjust when our need to adjust is inevitable (4)*
• CCS is incapable of delivering a meaningful basis for coverage estimation. Without the ability to undertake an address listing exercise no missed addresses would be identified (5)**

*Alternative options to the standard coverage estimation approach (weighting classes) are available but are dependent on alternative data

**Options are available for the CCS (delay, running in certain areas and borrowing strength)

Annex – Covid19 Impact by Statistical Design Phases

Statistical Design Phase 1 – Design/Build

Address

Potential Covid implication:
• Unable to run a field address check – so increase scale and scope of clerical address check
• SSD resource available to increase scale of DART team and Communal Establishment (CE) clerical review
• Availability of managers in CEs which may need to respond to requests from the clerical review process to complete the CE address frame
• Availability of administrative data for remaining areas of construction of the CE address frame (Ministry of Defence, Ministry of Justice etc.)

Potential statistical quality impact:
• Field address check would have resolved some addresses which DART could not, meaning resolution in the field (so reducing overall efficiency)
• Users may be concerned about quality without a field address check
• Quality of the CE address frame is particularly likely to be impacted, especially for some types of CE (e.g. prisons and armed forces bases)
• May also impact the ability of the field operation to accurately follow up response for student halls, resulting in a lower response rate
• Significant reduction in building reduces the underlying level of change of addresses
Questionnaire

Potential Covid implication:
• More challenging to make any late changes (though changes unlikely)

Potential statistical quality impact:
• Questionnaire development complete
Wave of Contact

Potential Covid implication:
• Covid scenarios are likely to change assumptions used in modelling, and it is extremely difficult to make assumptions about 2021, e.g. field effectiveness may be impacted by challenges with training staff or making face-to-face contact.
• Additional modelling may be required, and live updates to modelling required during operation.
• Availability of administrative data (Council Tax) used in prioritising field follow-up is likely to be impacted.

Potential statistical quality impact:
• Potential that assumptions are incorrect in the Field Operations Simulation (FOS), meaning field staff volumes or distribution may be insufficient to reach response rate and variability targets, and the effectiveness of the Response Chasing Algorithm in optimising response may be reduced.
• Impact on field resource availability would disrupt the wave of contact model.
• Reduced availability of administrative data would reduce field efficiency, impacting variability across Local Authorities.
Hard to Count (Digital)

Potential Covid implication:
• Likely to be a positive impact with increased confidence in online access.

Potential statistical quality impact:
• Increased online response.
Hard to Count (Willingness)

Potential Covid implication:
• Impact of economic recession will mean changed patterns of willingness.
• Some areas will be harder to count than in 2011, and changes are unlikely to be picked up in time with existing admin data at small area level.

Potential statistical quality impact:
• Response chasing will be less accurate with an out-of-date HtC index and will result in an increase in variability at small area level (though likely to be minor).
Target Action Groups

Potential Covid implication:
• There may be new Target Action Groups that have not yet been identified (e.g. lack of confidence in government, newly unemployed).
• Existing Target Action Groups may be more or less difficult to engage.

Potential statistical quality impact:
• Closely associated with community engagement, so any reduction in its effectiveness will impact directly on TAG strategies.
• Identification of and planning for new groups is required, and there is a risk of lower response in some groups (increasing variability).
• Dependency on community engagement officers will directly impact response in community groups (increasing variability).
Market Segmentation

Potential Covid implication:
• Potential to change existing segmentation (unlikely to be major).

Potential statistical quality impact:
• Potential inefficiency or reduced effectiveness of comms messaging, impacting overall response.
Community Engagement

Potential Covid implication:
• Recruitment and training of community engagement officers already delayed by two months given the Covid-19 impact.
• Lockdown likely to have improved the effectiveness of community communication networks (including digital).

Potential statistical quality impact:
• Delays in recruitment would directly impact the effectiveness of community engagement and so would decrease response in population sub-groups (related to some TAG groups).
• Opportunity to benefit from improved communication networks.

Statistical Design Phase 2 – Monitor/Counter

Response Chasing Algorithm

Potential Covid implication:
• Assumptions used in the Field Operations Simulation (FOS) may be incorrect, resulting in reduced effectiveness of the Response Chasing Algorithm.

Potential statistical quality impact:
• Reduced overall response and increased variability.
Quality Assurance

Potential Covid implication:
• No impact on quality assurance of individual processes.
• Availability or inaccuracy of administrative data would impact on our ability to assess variability in response data, e.g. lower response in a population sub-group.

Potential statistical quality impact:
• Reliance would be on 2011 Census data and the admin data that were available. Potential to miss poor response issues in some population groups which have recently changed.
Business Intelligence/Management Information

Potential Covid implication:
• No impact (descoping the field address check has had a minor positive impact on the amount of work needing to be covered).

Potential statistical quality impact:
• None.

Statistical Design Phase 3 – Process/Estimate

Data Cleaning

Potential Covid implication:
• No impact.

Potential statistical quality impact:
• None.
Census Coverage Survey

Potential Covid implication:
• No impact for short-term disruption.
• Long-term disruption impacts on the ability to recruit and train an effective field force.
• Inability to undertake a field operation would mean no address listing exercise.

Potential statistical quality impact:
• Long-term impact would undermine the ability to adjust for coverage using the standard design.
• Without an address listing exercise no adjustment would be made for the missed addresses in the Census.
Under/Over Coverage Estimation

Potential Covid implication:
• Response to the main Census is impacted (as above).

Potential statistical quality impact:
• Increased variability would mean there was greater uncertainty in estimates for some areas/groups.
• Overall lower response would also mean greater uncertainty.
• Could be countered by planned statistical contingencies work using admin data (see below).
Quality Assurance

Potential Covid implication:
• Availability and quality of administrative (and survey) data which would be used to validate census estimates.
• Data availability may also impact the ability to make planned adjustments which are part of the standard design, namely for the number of households, babies and a national adjustment.

Potential statistical quality impact:
• Ability to identify and explain inconsistencies between the census and other data.
• Ability to identify quickly and with confidence that there is a need to make further adjustments to the coverage strategy (invoking census contingencies).
• ONS may not be able to publish and explain coherence between the Census and other data – impacting user confidence.
• Ability to make planned adjustments for households, babies and the national adjustment directly impacts on quality.
Further Development of Coverage Strategy (Contingencies)

Potential Covid implication:
• Availability and quality of administrative data to be able to make further adjustments beyond the standard design for any localised or national response issues (as experienced internationally).

Potential statistical quality impact:
• Ability to adapt the design to counter localised and national response accurately.
• Impact would be greater uncertainty in the estimates overall, resulting in a reduction in user confidence.
• May disproportionately impact certain population groups or geographic areas.

EAP133 – A proposal for population size estimates using administrative data for national 2021 Census Quality Assurance

Written by Owen Abbott and Viktor Racinskij, Office for National Statistics

Summary

The paper firstly outlines some options for producing national population size estimates for use in the 2021 Census national quality assurance from administrative data, in the absence of a fully operational population coverage survey. It concludes that using the Census Coverage Survey (CCS) is a viable option.

The panel are asked whether they agree with the conclusion in part 1.

The paper then proposes a methodology for mitigating against a lack of clerical resource to undertake a full, high quality, linkage between the administrative data and the CCS, within the timeframe for producing the estimates required. It explores the likely reduction in precision that results. It concludes that the resulting estimates are useful but likely to not be of sufficient quality to provide a target sex ratio.

The Panel are asked whether ONS should continue to explore the methodology presented in part 2.

Disclaimer: The second part of the paper is fairly late-breaking research; it is presented to the panel to obtain some early feedback on the thinking and the value of continuing to develop such an approach.

In order to provide confidence in the national level census estimates, or as part of the evidence to assess plausibility and suggest any bias adjustments, there is a desire to obtain an approximately unbiased estimate of the usually resident population as at census day which uses administrative data. Our research to date has shown that administrative data alone does not provide an approximately unbiased estimate even at national level (Abbott et al, 2020). This is due to over-coverage (in this case records that relate to people who are no longer in the population due to emigration or death and some duplication) which we are not able to reduce sufficiently through linkage and business rules applied to administrative sources alone.

The census quality criteria require the bias to be less than 0.5 per cent for age-sex groups at national level. In addition, we expect the census estimates to have confidence intervals around these national estimates of around 0.2 per cent for the total and between 0.2 per cent and 0.75 per cent for age-sex groups. For an age-sex population size of 2 million with a CI of plus or minus 0.3 per cent, this implies that our bias tolerance is 10,000 persons and our confidence interval will be around plus or minus 6,000 persons.

This paper has two parts. The first outlines the current position with regards to the strategy for population estimates using administrative data research and the availability of survey data for estimating coverage. The conclusion is that in the absence of a specifically designed survey, the CCS is a potential option for providing coverage measurement, albeit with some assumptions and conditions. However, due to the timing of the CCS and the requirement for this to feed into the census national quality assurance, full linkage of the CCS to administrative data to the required accuracy is not possible. The second part explores a methodology for producing population estimates using the CCS under this situation of reduced clerical matching capacity. The quality of the resulting estimates is explored.

Part 1 – Options for producing coverage adjusted estimates without IPACS

Our strategy for minimising bias in a population size estimate which uses administrative data is to use a survey to measure and correct for coverage errors, as described in Abbott et al (2020). The methodology for doing this is likely to be based on dual-system estimation techniques (there are various flavours which could be used). In broad terms, achieving an unbiased estimate requires:

  • removing as much as possible of the over-coverage from the linked (or single source) administrative data used as one of the sources in the DSE, and then designing a survey to measure the resulting coverage (although noting that there is not a unique solution to this and many varieties may be possible)
  • Linking that survey to the administrative data
  • Using appropriate dual-system estimation to produce population estimates
  • (possibly) developing a procedure to estimate and adjust for any residual biases (such as over-coverage, linkage error etc.)

There are challenges for each, but here we will focus on the survey aspect, where our strategy for using a specifically designed survey will not be possible.

Our initial survey strategy was to design a survey (IPACS) which would achieve, or get close to, the quality criteria required, by exploring sample size options for obtaining a national confidence interval of plus or minus 1 per cent. The Population Coverage Survey part of the IPACS design also future proofed the population statistics system, in that it could also be used to produce population estimates in non-census years, albeit at a larger scale where confidence interval widths are acceptable. The survey would be ongoing, with a rolling reference date. This complicates the estimation method when compared to a large one-off survey with a single reference date, as (for example) a rolling set of estimates would have to be produced, and a time series model applied to them. The census day estimate would then be provided via this time series model, with all the caveats around using a model. Depending on the type of time series model used (which would require some research), this meant that to obtain a census day estimate, sufficient sample would be required for at least a year before and for a few months after census day in order for the time series to be stable and the confidence intervals to be close to what is required.

Initial estimates suggested we would need an issued sample size of between 4000 and 5000 addresses per month starting from mid-2020 (assuming a 50 per cent response rate), to achieve the 1 per cent CI nationally. However, this has been shown not to be feasible in terms of practical implementation within the timeframe, and so an alternative must be explored.

For information about the size required in the longer term, additional exploration, albeit with some assumptions and caveats, showed the sample required to achieve a 0.25 per cent CI nationally required about 700,000 issued sample addresses (which is about 60,000 per month).

The criteria for being able to produce an estimate based on administrative data that could be used to QA the Census based population estimate are:

  • Approximately unbiased estimates (within plus or minus 0.5 per cent as per census)
  • Acceptable confidence intervals (similar to census, so that any census bias can be reliably detected)
  • Independence – the estimate should be independent of the census estimate (i.e. Census + CCS), and the sources (admin and survey) should be independent for unbiased DSE
  • The survey we use should be able to be delivered in the field
  • Estimation methodology should be as simple as possible – to reduce development time and ensure user confidence (e.g. Time series models for population statistics have never been used in ONS before, so are not well understood)
  • Survey response rates should be high – the higher they are, the less risk of bias in the estimates
  • Linkage with high accuracy will be required – this is again required to minimise bias in DSE
  • Timing – the estimate will need to be available for the Census QA which is likely to be in September/October 2021 – i.e. just over 6 months after census day

The options for delivering sufficient survey data to attempt to fulfil the requirement are:

  1. A smaller IPACS

If the sample were much smaller than currently proposed to enable it to be delivered, then the precision obtained would not be sufficient – the confidence intervals would be so wide that it could not be used to reliably detect any bias in the census estimates.

  2. Concentrated Population Coverage Survey around census day

Rather than developing an ongoing survey from the middle of 2020, a larger concentrated sample around the census day period could be designed, for example covering the three months before and after. As the sample is much closer to the reference date, a time series model would probably not be necessary, though some small bias due to movers would be accepted. However, the size would need to be large (probably 9,000 per month for 6 months) and would potentially conflict with the Census and CCS operations, as well as with the ongoing LFS.

  3. Use the CCS

An alternative strategy is to use the existing CCS data collection which uses census day as its reference point. This has many advantages in that it achieves a high response rate, has the right reference point, collects the same data as the PCS, would result in a much simpler estimation method (a well understood DSE without any time series modelling), is being implemented anyway so has many practical advantages (in both collection and processing) and is likely to be large enough to achieve good confidence intervals.

However, it also has some drawbacks which are worth considering.

Firstly, there would be a concern over independence. If we are using the same survey to estimate the population using the census and then also using administrative data, are the two things independent? They would not be truly independent, but the key requirement is to have something which can help to highlight any residual biases in the 2021 Census estimates. These biases will come from a number of sources, and the main risks are around the DSE – and in particular assumptions of independence between Census and CCS, linkage error, heterogeneity of capture probabilities and over-coverage. The same risks would apply to the administrative based DSEs, however they would be different in nature – the independence assumption is likely to hold better but the heterogeneity may not hold as well (due to limited post-stratification options). Bias due to linkage error may also be different as the Census and CCS are designed to be linked whereas the administrative data and CCS were not, and lastly we expect the over-coverage risks to be higher for the administrative estimates.

Therefore, whilst the two estimates may not be independent, the potential biases that underpin them are likely to be different in nature. This makes comparisons between them complex, especially if the biases in the administrative based estimates are not well understood.

Secondly, the CCS is designed for measuring coverage of the census, not the coverage of administrative data. This does not mean that it cannot be used, it just means that the sample allocation will not be optimal and therefore the confidence intervals will be wider than they would be had it been designed for the admin data estimates. Post-stratification of the CCS sample for estimation using administrative data would be a standard strategy, and thus the loss of precision will depend on the sample sizes within those post-strata, and the correlation of the variability of the coverage patterns in the census with those in the admin data. However, the CCS is a large sample, and the sample design is relatively conservative (so that it has some protection against the census coverage patterns being different from those expected in advance) and it spreads the sample everywhere. Any loss in precision compared to a PCS is likely to be offset by the higher response rates, and not having to use a time series model.

Thirdly, the linkage of the CCS would need to be completed to meet the time scales. This needs to be considered, as the same accuracy as the census to CCS linkage requires is likely to be needed – and this is also happening at the same time. The Census to CCS linkage is the priority, and there does not appear to be the capacity to do another full high quality linkage exercise alongside this.

Fourthly, the CCS is more clustered than the design suggested for IPACS, so there may be a slight loss of precision which would need investigating. However, the overall sample size might well make up for this.

The key drawback of using the CCS is that if it does not meet its success criteria (a response rate of around 90 per cent with a minimum of 70 per cent in all LAs), then both the Census estimates and the administrative-based estimates will suffer in terms of bias and/or precision. Estimates can still be made, but there would be an increased risk of not meeting the bias targets or confidence interval targets.

It seems feasible to use the CCS to produce an administrative based population estimate given that an independent PCS of the right size is not possible within the timeframes. However, there is a risk of a lack of independence. Equally, the likely biases in the census and admin based estimates will be different although as this has not been explored previously we do not have any evidence for this. There are some advantages, in particular the closeness to the reference period, the likely simple (and understood) estimation methodology and the large sample size. The key question is whether it can deliver estimates which are both approximately unbiased and with confidence intervals sufficiently narrow that they can be used as evidence as part of the census national quality assurance.

Part 2 – Estimating population size using the CCS with limited clerical matching resource

As discussed in part 1, it might be feasible to use the CCS combined with administrative data to provide population estimates. In this section we explore a methodology for achieving this under reduced clerical matching constraints, and attempt to explore the quality loss.

It is anticipated that dual-system estimation could be used, as per previous census estimation methods. However, one of the requirements for this is for high quality individual level linkage between the two sources. On a full scale this would be challenging due to the likely timeframe clashing with the same linkage exercise for the CCS to the Census, which requires full matching with a high degree of accuracy. Clerical matching, in particular, would not be possible on a large scale. For the CCS to administrative data linkage, should a full and highly accurate linkage be required, it is likely to require more clerical effort than the Census to CCS.

An alternative is to accept a lower match rate (for example, by only using automated matching methods) which results in a high false-negative rate, and use a sample to estimate the matching error. The estimated matching error can be used to adjust the DSE for the bias induced by the false negative matching error. If the estimates of matching error are unbiased, this should result in approximately unbiased estimates (assuming all other sources of bias are also zero). However, the resulting estimates would have higher variance than if full matching were undertaken due to the additional sampling process. This could work in this scenario, because the requirement is to produce national estimates of population size by age and sex, rather than a full characteristic analysis of non-response as is the case for the Census to CCS linkage. This paper explores the loss due to the sampling to estimate the error levels, and the additional potential for bias in the estimates of matching error.

Racinskij (2020) outlines a general framework for adjusting DSEs based upon a sample which estimates matching error. In that framework both false positive and false negative errors can occur and the estimator accounts for both, requiring a sample that can detect both types of error. The sample is drawn from all records in one of the sources. The simple simulation study in that paper shows that for two sources with coverage rates of 90 per cent and 80 per cent respectively for a population of 1000, the usual DSE without matching errors is approximately unbiased and has relative standard error of 0.53 per cent. When  linkage errors of 5 per cent false negatives and 2 per cent false positives are introduced, the DSE is then biased by 5.04 per cent and its RSE increases to 1.06 per cent. If a sampling fraction of 0.2 is used to draw a sample from the source with 90 per cent coverage (i.e. a sample size of 180 records), and the results used to adjust the DSE using the estimator in the paper, then the bias is reduced to 0.04 per cent (i.e. nearly unbiased) but the RSE increases to 1.74 per cent. Thus the RSE has trebled when compared to the ‘fully matched’ DSE alternative.

The concern with adopting this approach for producing population estimates from the CCS and administrative data is that the sample size required would still be infeasible to provide a reasonable degree of precision. For context, the census estimates based upon the CCS should attain an RSE of around 0.2% for a single age-sex group (ONS, 2012). Early estimates suggest that to obtain close to this level of precision, a sampling fraction of 0.5 might be required which is probably similar resource levels to doing full clerical matching.

Our proposal is therefore to adopt a conservative matching strategy which deliberately attempts to achieve zero false positives, at the expense of resulting in a high false negative rate. We expect this to result in match rates for individuals of around 80% between the CCS and administrative data, but the assumption is then that those matches will contain a negligible number of false positives which for estimation purposes can be ignored. Therefore, from a sampling perspective, we only need to sample unmatched records (and not draw a sample from all records) and carry out a clerical check of those to determine the false negative rate which is then used to adjust the DSE as per Racinskij (2020).
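A minimal numerical sketch of this adjustment, assuming simple random sampling of the unmatched source-1 records and zero false positives; the counts and the function name are made up for illustration and are not the estimator as defined in Racinskij (2020):

# Illustrative sketch of the false-negative-adjusted DSE proposed above
# (zero false positives assumed; names and numbers are hypothetical).
def adjusted_dse(n_source1, n_source2, n_matched_auto,
                 n_unmatched_sampled, n_false_negatives_found):
    """n_source1, n_source2 : record counts in the admin data and the CCS
    n_matched_auto          : matches found by the conservative automatic matching
    n_unmatched_sampled     : unmatched source-1 records drawn for clerical review
    n_false_negatives_found : of those, the number found to be true matches"""
    n_unmatched = n_source1 - n_matched_auto
    # Horvitz-Thompson estimate of the total false negatives under SRS
    fn_total = n_unmatched * n_false_negatives_found / n_unmatched_sampled
    corrected_matches = n_matched_auto + fn_total
    return n_source1 * n_source2 / corrected_matches   # standard DSE with corrected matches

# Example with made-up counts: 100,000 admin records, 90,000 CCS records,
# 72,000 automatic matches (an 80% match rate), 5,600 unmatched records reviewed,
# 2,000 of which turn out to be false negatives.
estimate = adjusted_dse(100_000, 90_000, 72_000, 5_600, 2_000)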

If we make assumptions about the designed ABPE under-coverage, the CCS coverage and CCS sample size we can use a slightly modified variance estimator based on Racinskij (2020) to show what the likely RSEs and Confidence Intervals will be for estimates of individual age-sex groups and for the sex ratios which would be used in the national adjustment. These will show the trade-off between the amount of clerical matching required under various scenarios and the likely quality of estimates.

The estimator is therefore as follows:

[Equation not reproduced in this extract.]

The estimated DSE correction is:

[Equation not reproduced.]

where

[Equation not reproduced.]

which is the Horvitz-Thompson estimator under SRS, sampling nr records from the unmatched records from source 1.

The variance, using the derivations in Racinskij (2020), is:

[Equation not reproduced.]

But this time the variance of the DSE correction estimated from the sample is:

[Equation not reproduced.]

where f is the sampling fraction and, as in Racinskij (2020), the sample variance replaces S²y. This variance will be smaller, as the sample is drawn only from the n10 unmatched records. This can now be used to calculate the variance associated with an error-corrected DSE under different sample sizes for estimating the false negative matches.
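Since the equations themselves are not reproduced above, the following is a hedged reconstruction of their likely form, inferred only from the surrounding description; the notation is assumed rather than taken from the paper. Writing $n_{1\cdot}$ and $n_{\cdot 1}$ for the two source totals, $n_{11}$ for the automatically matched count, $n_{10}$ for the unmatched source-1 records, and $y_i = 1$ where sampled record $i$ is found to be a false negative in a simple random sample of $n_r$ of those records, the correction and corrected DSE would plausibly be

$$\hat{Y} = \frac{n_{10}}{n_r}\sum_{i=1}^{n_r} y_i, \qquad \hat{N} = \frac{n_{1\cdot}\,n_{\cdot 1}}{n_{11} + \hat{Y}},$$

with the sampling contribution to the variance taking the usual SRS form

$$\widehat{\operatorname{Var}}(\hat{Y}) = n_{10}^{2}\,(1-f)\,\frac{s_y^{2}}{n_r},$$

where $f = n_r/n_{10}$ and the sample variance $s_y^{2}$ replaces $S_y^{2}$. These are the standard Horvitz-Thompson total estimator and its variance under simple random sampling without replacement; the exact expressions used in the paper may differ.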

As a baseline, the Census and CCS achieve 94 per cent and 90 per cent coverage overall. The CCS sample size is around 500,000 individuals. The linkage achieved has very low error rates. Using the variance estimator described, for each of 34 age-sex groups with CCS sample size 15,000 individuals (so that the total CCS sample size is 500,000 individuals) the Relative Standard Error (RSE) on the DSE population estimate is 0.07 per cent. Note that the estimate has not been weighted up to the total population – that process adds additional variation which is not addressed in this paper. Variances for sex ratios can also be computed, assuming independence between the male and female estimates.
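As a rough order-of-magnitude check (not the calculation used in the paper), the standard approximation to the variance of a dual-system estimate with capture probabilities $p_1$ and $p_2$,

$$\operatorname{Var}(\hat{N}) \approx \frac{N\,(1-p_1)(1-p_2)}{p_1\,p_2},$$

gives, with $p_1 = 0.94$ (census), $p_2 = 0.90$ (CCS) and $N \approx 15{,}000/0.9 \approx 16{,}700$ people in a single age-sex group,

$$\operatorname{RSE}(\hat{N}) \approx \sqrt{\frac{(1-p_1)(1-p_2)}{N\,p_1\,p_2}} = \sqrt{\frac{0.06\times 0.10}{16{,}700\times 0.94\times 0.90}} \approx 0.00065,$$

which is consistent with the quoted RSE of around 0.07 per cent.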

Table 1 overleaf shows how the RSE changes based upon the assumptions about coverage rates, false negative rate and sample fraction. We have tried to show a base scenario (#2) of 80 per cent admin data coverage, 90 per cent CCS coverage and a false negative rate of 0.2, while controlling the overall sample size to be around 20,000 records, which is currently our best guess at the clerical resource capacity available. This base scenario gives an RSE of 0.51 per cent. Scenario 26 shows the approximate results for the Census to CCS matching under perfect conditions.

The table shows that as expected, the RSE increases as coverage rates fall, the false negative rate increases or the sample sizes fall. In particular, if more records have to be removed from the admin data to ensure there is no over-coverage, then the RSE could increase by up to 20 per cent (scenario 6). Equally, if the clerical capacity were to be only enough for around 11,000 records (scenario 24) then the RSE increases by 20 per cent.

Lastly, scenario 25 shows what might be possible if not all age-sex groups are considered, perhaps focusing on adult age groups up to the age of 50 only. Coverage rates will be lower for these groups but the increased sample size offsets this.

Table 1: Relative Standard Errors (RSE) for adjusted DSEs based upon various coverage, false negative rates, sample sizes and numbers of age-sex groups considered

Scenario | Census/admin coverage | CCS coverage | False negative rate | Sample fraction | Age-sex groups | Total FN sample | RSE
1 | 0.9 | 0.9 | 0.2 | 0.16 | 34 | 20563.2 | 0.0045
2 | 0.8 | 0.9 | 0.2 | 0.18 | 34 | 20563.2 | 0.0051
3 | 0.75 | 0.9 | 0.2 | 0.19 | 34 | 20349 | 0.0054
4 | 0.7 | 0.9 | 0.2 | 0.2 | 34 | 19992 | 0.0057
5 | 0.65 | 0.9 | 0.2 | 0.22 | 34 | 20420.4 | 0.0059
6 | 0.6 | 0.9 | 0.2 | 0.24 | 34 | 20563.2 | 0.0061
7 | 0.8 | 0.85 | 0.2 | 0.15 | 34 | 19584 | 0.0059
8 | 0.8 | 0.8 | 0.2 | 0.14 | 34 | 20563.2 | 0.0064
9 | 0.8 | 0.75 | 0.2 | 0.15 | 34 | 20655 | 0.0068
10 | 0.8 | 0.8 | 0.2 | 0.14 | 34 | 20563.2 | 0.0064
11 | 0.75 | 0.75 | 0.2 | 0.13 | 34 | 19890 | 0.0076
12 | 0.7 | 0.7 | 0.2 | 0.13 | 34 | 20420.4 | 0.0087
13 | 0.65 | 0.65 | 0.2 | 0.13 | 34 | 20685.6 | 0.01
14 | 0.8 | 0.9 | 0.1 | 0.26 | 34 | 20155.2 | 0.0029
15 | 0.8 | 0.9 | 0.15 | 0.21 | 34 | 20134.8 | 0.004
16 | 0.8 | 0.9 | 0.25 | 0.15 | 34 | 19890 | 0.0064
17 | 0.8 | 0.9 | 0.3 | 0.13 | 34 | 19624.8 | 0.0077
18 | 0.8 | 0.9 | 0.35 | 0.12 | 34 | 20318.4 | 0.0087
19 | 0.8 | 0.9 | 0.2 | 0.35 | 34 | 39984 | 0.0034
20 | 0.8 | 0.9 | 0.2 | 0.3 | 34 | 34272 | 0.0038
21 | 0.8 | 0.9 | 0.2 | 0.25 | 34 | 28560 | 0.0042
22 | 0.8 | 0.9 | 0.2 | 0.2 | 34 | 22848 | 0.0048
23 | 0.8 | 0.9 | 0.2 | 0.15 | 34 | 17136 | 0.0057
24 | 0.8 | 0.9 | 0.2 | 0.1 | 34 | 11424 | 0.0071
25 | 0.75 | 0.8 | 0.2 | 0.28 | 18 | 20412 | 0.0049
26 | 0.94 | 0.9 | 0 | – | 34 | – | 0.0007

Figure 1 attempts to demonstrate how this might feed into the national adjustment process, by replicating the evidence that would be seen at the time. The chart shows the sex ratio information that might be presented prior to the national adjustment. It includes the 2011 Census count, the 2011 Census initial estimate (with CI), the Mid-Year population estimate, the Admin-Based population estimate V2 (which is the admin data attempt to measure the whole population without any form of coverage correction) and a made-up CCS corrected ABPE, but with the confidence interval based upon the base scenario 2 from Table 1. The Patient register and CIS sex ratios are not included for clarity, and as ABPE V2 essentially uses these in the main.

Figure 1 – Sex ratios in 2011 including mocked up ABPE-CCS sex ratio estimate with confidence interval [chart not reproduced in this extract]

Figure 1 shows that whilst the confidence intervals are wide around the CCS-adjusted admin-based estimate, it does provide some helpful information. However, were they any wider than shown, the estimates might become less useful for making decisions about whether to adjust the census estimates, and by how much. Given that the estimated quality shown here is likely an underestimate, this work would suggest that the estimates would not be of sufficient quality to provide a benchmark sex ratio to adjust the census results to.

The precision presented here is likely to be an underestimate, as it assumes no further processes to extrapolate to the entire population. How much this would add is uncertain but would require additional research, and assumptions about the likely patterns in the administrative data. This would not be a trivial piece of work.

This also assumes that the only source of bias is due to linkage errors. Over-coverage in the administrative data is still a concern, and in order for this to be a useful exercise we need confidence that it can be removed. The sex ratio for the ABPEV2 in Figure 1 shows what can happen when there is over-coverage present in the administrative data (in this case Males aged 25 and over).

The approach also assumes that the matching results in zero false positives, and some evidence would be required to make a better assessment of this likelihood.

This paper presents some options for producing national level estimates of population size using administrative data combined with the CCS, under the assumption that a full accurate linkage process is not possible in time for feeding into the 2021 Census national adjustment process. The approach is to adopt a conservative automatic matching process, and then use a sample to estimate the false negative error rate to adjust the standard dual-system estimator.

There is a reduction in precision, and this raises the question of whether the estimates remain useful within the national adjustment process. The tentative conclusion at this point, with a number of assumptions, is that they would be useful but not of sufficient quality to provide a target sex ratio to adjust the Census to. Even so, this is highly dependent on the further removal of over-coverage from the administrative data, and on confirmation of the likely linkage sample sizes that could be used. In addition, further work is required (see below) to fully scope out the likely precision and the practicalities around linkage.

  • Double check the validity of the modification to the variance formulae.

  • Simulation of likely precision based on varying coverage rates across age-sex groups, as well as consideration of the actual sampling strategy for false negatives. This may improve precision over the SRS results presented here.

  • Consideration of the conservative matching strategy and some testing of whether 80 per cent linkage with zero false positives would be achievable.

  • Additional estimation of the amount of clerical resource required and how the clerical verification of false negatives would work – for instance, could the pre-search algorithm provide sufficient speed and accuracy, and thus is the anticipated sample size of 20,000 realistic in practice.

References

Abbott, Owen; Tinsley, Becky; Milner, Steve; Taylor, Andrew C.; Archer, Rosalind (2020) Population statistics without a Census or register. Statistical Journal of the IAOS. Vol. 36, no. 1, pp. 97-105.

Office for National Statistics (2012) Confidence intervals for the 2011 Census. Available at: http://www.ons.gov.uk/ons/guide-method/census/2011/census-data/2011-census-data/2011-first-release/first-release–quality-assurance-and-methodology-papers/confidence-intervals-for-the-2011-census.pdf

Racinskij, V. (2020). Naive linkage error corrected dual system estimation. Available at https://arxiv.org/abs/2003.13080