1. Background
The Transformed Labour Force Survey (TLFS) is one of the most important ONS initiatives currently underway. This importance has increased with concerns over bias and the impact of decreasing response rates in the Labour Force Survey (LFS), which the TLFS is meant to replace. In this context, we have been approached by the ONS to provide methodological comments as it prepares to “sign off” the proposed design for the TLFS prior to a period of rigorous field testing and the subsequent implementation of the TLFS as a replacement for the LFS.
The original concept of the TLFS was effectively a short online survey focused entirely on headline labour market statistics, with participants followed up longitudinally, and a second independent cross-sectional survey, also carried out online but with a much longer questionnaire and repeated every quarter, which addressed the need for more complex labour market data. However, the distinction between the two became blurred as the scope of the short online questionnaire was increased in response to the urgent data needs of the pandemic, to the extent that completion times for the two surveys eventually differed by very little. Furthermore, there was the added complication of attempting to produce headline labour market estimates from a very large wave one that combined sample across two surveys with somewhat different response patterns.
We are therefore pleased to note that the proposed design for the TLFS goes back to a clear separation of components with a genuinely shortened Core delivering the headline labour market statistics, and a separate Plus survey focussed on the wide range of additional more detailed topics that are also covered by the current LFS. This is a fundamental characteristic of the TLFS that in our opinion needs to remain the basis of its design as it is implemented over the coming months.
The decision to create a clear separation between the Core and Plus components has potential benefits for the future. Keeping the Core separate and focussed purely on headline labour market data allows the questionnaire and content of the Plus to be more flexible, so that it can evolve to meet user needs. Since headline labour market information will still need to be collected in the Plus for cross-tabulation purposes, it is technically feasible to combine data from both surveys to estimate headline labour market statistics (as the current TLFS structure does). However, this makes any changes to Plus content and structure complex due to their interplay with the headline labour market data from both surveys. In our opinion, this option is not a design decision that needs to be made at present. We view such drawing of strength across the two surveys for estimating headline labour market statistics as a potential future development that can be investigated once the TLFS design has been stabilised and the survey is providing the LFS data sets that users currently require. In other words, it is important that the designs of the Core and Plus surveys are optimised for their specific purposes and are not compromised so that it might be possible to combine them for the purpose of (potentially) more accurate headline labour market estimates.
In this context, we note that there are other ways that data from the Core and Plus can be integrated for the purpose of improving TLFS outputs. One example is where the headline labour market statistics produced using the Core are viewed as the official ONS estimates. Since headline labour market data are also collected in the Plus, it is technically quite feasible to calibrate the weights used in the Plus so that they recover these Core estimates. Whether this makes sense for Plus estimates of variables that have little to do with labour market status needs to be considered, however. Here there is an argument for two sets of weights, one that is “Core-calibrated” and used for Plus cross-tabulations involving labour market variables, and another, more standard, set of weights used with non-labour market related outputs from Plus. But in any case, inclusion of Core-calibration will affect the methodology used to calculate the sampling variances for labour market-related Plus outputs, since the methods the ONS uses for estimating sampling variances at present do not make any allowance for calibration constraints with significant sampling variability.
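As a minimal sketch of the Core-calibration idea, and not a description of ONS methodology, the simplest form is a post-stratification ratio adjustment: Plus weights are scaled within each labour market status category so that the weighted Plus totals reproduce the Core estimates. The function name, categories, weights and totals below are all hypothetical.

```python
import numpy as np

def core_calibrate(weights, status, core_totals):
    """Scale Plus weights within each labour market status category so that
    the weighted Plus totals reproduce the Core estimates.
    This is simple post-stratification, used here purely for illustration."""
    weights = np.asarray(weights, dtype=float)
    status = np.asarray(status)
    calibrated = weights.copy()
    for category, target in core_totals.items():
        mask = status == category
        current = weights[mask].sum()
        if current == 0:
            raise ValueError(f"no Plus respondents in category {category!r}")
        calibrated[mask] *= target / current
    return calibrated

# Hypothetical Plus respondents: starting weights and labour market status.
plus_weights = np.array([100.0, 120.0, 80.0, 150.0, 90.0, 110.0])
plus_status = np.array(["employed", "employed", "unemployed",
                        "inactive", "employed", "inactive"])

# Hypothetical Core estimates of population totals by status.
core_totals = {"employed": 330.0, "unemployed": 70.0, "inactive": 250.0}

core_weights = core_calibrate(plus_weights, plus_status, core_totals)
```

In this sketch `core_totals` are themselves survey estimates, so a variance calculation that treats them as fixed benchmarks would not reflect their sampling variability.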
2. Is the proposed TLFS design, with a shorter Core survey, justified?
The ONS undertook a pilot study to compare different versions of a shortened TLFS with the current TLFS, using the online-only periods of fieldwork. The results from this study strongly support the following conclusions:
a) More households commence the shortened survey in the opening period of online-only fieldwork.
b) More households provide ‘complete’ data. Complete versus partial is not always an entirely clear distinction, but the results demonstrate that responding households supply more of the headline labour market data under the shortened survey.
These results are clear, but the reasons for them are less clear, as there were changes to the contact letter, with a QR code added for easy access to the survey and an expected completion time also included. However, while these might account for some of the benefits seen under a), it is reasonable to conclude that b) is mainly due to the shortened survey. Additionally, if adding the completion time helps, it is reasonable to assume it helps because the time stated is short. This seems sufficient evidence to continue design work under the new structure, including further testing of the contact letter impact.
The analysis of the shortened survey has also shown that the additional responders are not just from the same groups that currently respond to the TLFS. For example, this increased response improves the respondent age profile, so estimation weights will have less to do to meet crucial calibration constraints. For headline labour market data, the unweighted response distributions in the shortened survey are closer to the weighted response distributions seen for both the TLFS and the shortened survey, although the different treatments applied (different questionnaire versions) make the exact picture unclear. Any further testing of the shortened survey should focus on a single questionnaire, as this pilot test has shown that additional clarifying questions did not improve SIC/SOC estimates, and in certain combinations had a clearly detrimental effect. Issues still remain around completion of disability and education questions. But these exist with the current TLFS, so the shortened survey does not appear to make these concerns worse while improving the data for headline labour market outputs.
In summary, we are of the opinion that the proposed 90K:90K sample split of Core and Plus, with a short online Core, should now be accepted as the way forward, with any further modifications to data capture methods restricted to fine-tuning of this proposed new TLFS structure in response to discovery of potential significant impact on attrition and response. We believe that such modifications should be clearly articulated beforehand and kept to a bare minimum, to ensure that the integrity and stability of the new survey is achieved as quickly as possible without compromising the quality of the data it provides.
3. Will the proposed designs for Core and Plus deliver?
Initial work on CVs for key estimates from the Core is built on assumptions about response rates in the first wave and then in the longitudinal waves two to five. As the longitudinal element is more substantial in the proposed Core, these response rates in subsequent waves are important for generating increased sample size (and therefore lower CVs). The pilot test for the shortened Core supports the increased wave one response, but the argument for subsequent improvement is dependent on historical data for the TLFS showing how response in waves two to five has declined as the size of the questionnaire has increased. These assumptions deliver the required sample size and CVs, and subsequent, more conservative assumptions regarding response in waves two to five show that these CVs will still improve relative to current CV values for the LFS.
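The link between responding sample size and CV can be illustrated with a minimal sketch. It assumes simple random sampling inflated by a design effect, which is a textbook approximation and not the variance methodology used by the ONS; the proportion and sample sizes are hypothetical.

```python
import math

def cv_of_proportion(p, n, deff=1.0):
    """Coefficient of variation of an estimated proportion p based on n
    responses, under simple random sampling scaled by a design effect."""
    return math.sqrt(deff * (1.0 - p) / (n * p))

p = 0.04            # hypothetical proportion (e.g. an unemployment share)
n_base = 60_000     # hypothetical responding sample before the change
n_larger = 80_000   # hypothetical responding sample after the change

cv_base = cv_of_proportion(p, n_base)
cv_larger = cv_of_proportion(p, n_larger)

# For a fixed p and design effect, the CV scales with 1 / sqrt(n).
relative_change = cv_larger / cv_base
print(f"CV falls from {cv_base:.4f} to {cv_larger:.4f} "
      f"(ratio {relative_change:.3f})")
```

Under this approximation the gain from extra response is proportional to the square root of the increase in responding sample, so the CV improvement depends directly on the response actually achieved in waves two to five.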
Further analysis is needed to confirm this flow-through in improved response in waves two to five for the shortened Core, and this can include testing of proposed data rotation, where some responses from the previous wave are pre-filled for respondents, reducing response burden in waves two to five. In addition, this will support analysis of attrition bias (in the TLFS, labour market status in the previous wave is a key predictor of response in the current wave and so is a critical contribution to the weights that are used to adjust for attrition) and the extent to which the shortened Core survey might impact this relative to the current TLFS.
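One common way to use previous-wave labour market status in an attrition adjustment is a weighting-class approach, sketched below. This is an illustration of the general technique only and is not the TLFS weighting methodology; the function name and all data are hypothetical.

```python
import numpy as np

def attrition_adjust(prev_weights, prev_status, responded):
    """Weighting-class attrition adjustment: within each previous-wave
    labour market status class, inflate the weights of those who respond
    again so they also represent the class members who dropped out."""
    prev_weights = np.asarray(prev_weights, dtype=float)
    prev_status = np.asarray(prev_status)
    responded = np.asarray(responded, dtype=bool)
    adjusted = np.zeros_like(prev_weights)  # non-responders keep weight 0
    for category in np.unique(prev_status):
        in_class = prev_status == category
        stayers = in_class & responded
        if not stayers.any():
            raise ValueError(f"no continuing respondents in class {category!r}")
        factor = prev_weights[in_class].sum() / prev_weights[stayers].sum()
        adjusted[stayers] = prev_weights[stayers] * factor
    return adjusted

# Hypothetical previous-wave respondents and whether they respond this wave.
prev_weights = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
prev_status = ["employed", "employed", "employed",
               "unemployed", "unemployed", "inactive"]
responded = [True, True, False, True, False, True]

adjusted = attrition_adjust(prev_weights, prev_status, responded)
```

In this example half of the previously unemployed drop out, so the remaining unemployed respondent's weight doubles, while the weighted total across all classes is preserved.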
The proposed design has the potential for greater impact from attrition bias than the current TLFS, as it has more reliance on waves two to five data, but there is considerable scope for weighting adjustments. We understand that the current proposal is to keep weighting for the Core distinct from weighting for the Plus, and to essentially use the TLFS weighting methodology that we were informed about last year. For the Core, this seems very sensible and reflects, at least in principle, use of standard methods for combining cross-sectional calibration with attrition bias correction via longitudinal weighting. Furthermore, it avoids the complexity of trying to combine Core and Plus data, given that the very different and considerably longer questionnaire used in the latter could have a substantial impact on non-response bias and partial completions.
A further consideration is the design of the longitudinal component of the Core. At present, the proposal is for a sub-sample of 45K addresses from the 90K wave one selections that are then followed up at quarterly intervals to make up waves two to five. Historically, labour force surveys have had longitudinal designs for two main reasons:
1) Fieldwork, especially for non-clustered designs with face-to-face interviewing, is expensive, and re-visiting households reduces costs.
2) Gross flows of labour market status can be estimated from the longitudinal data component, with overlapping samples leading to more stable estimates of net change.
For a survey built around online response, 1) is much less of a driver and so the longitudinal design needs to deliver on 2). The proposed Core design optimises estimation of quarter-on-quarter gross flows and associated CVs for net change, and with five waves accommodates estimation of gross flows year-on-year. However, this is also an opportunity to be explicit about the extent of the flows data that are needed, and the impact of the longitudinal design on CVs of change, with the possibility of having fewer additional waves and so less attrition, provided year-on-year flows are not required. For example, having just waves two and three but with all 90K flowing through (the same issued sample as 45K with four follow-up waves) achieves a higher overall sample size with less accumulated attrition bias. Increasing the first wave to 120K and then having waves two and three, each with 60K, achieves a higher overall responding sample, and still delivers the same amount of flows data as does the current TLFS for quarter-on-quarter change. Like the present TLFS, this design is less susceptible to attrition bias, as wave one makes up more of the data and there are fewer waves over which attrition bias can accumulate. Such decisions are not needed now, but they are options and opportunities as the final Core design structure emerges.
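The issued-sample arithmetic behind these alternatives can be checked with a short sketch. The wave sizes come from the text above; the wave one response rate and the wave-to-wave retention rate are hypothetical values chosen only to illustrate the comparison, and the geometric attrition model is an assumption rather than an ONS finding.

```python
def issued_total(design):
    """Total issued addresses across all waves of a design."""
    return sum(design)

def responding_total(design, wave_one_rate, retention):
    """Responding addresses, assuming each follow-up wave keeps a fixed share
    of the previous wave's response rate (a hypothetical attrition model)."""
    return sum(n * wave_one_rate * retention ** w for w, n in enumerate(design))

# Issued addresses per wave, starting with wave one.
designs = {
    "proposed": [90_000, 45_000, 45_000, 45_000, 45_000],
    "three_wave_90k": [90_000, 90_000, 90_000],
    "three_wave_120k": [120_000, 60_000, 60_000],
}

WAVE_ONE_RATE = 0.40  # hypothetical wave one response rate
RETENTION = 0.80      # hypothetical wave-to-wave retention

summary = {}
for name, design in designs.items():
    responding = responding_total(design, WAVE_ONE_RATE, RETENTION)
    summary[name] = {
        "issued": issued_total(design),
        "responding": responding,
        # Share of responding sample that comes from wave one.
        "wave_one_share": design[0] * WAVE_ONE_RATE / responding,
    }

for name, row in summary.items():
    print(f"{name}: issued={row['issued']:,}, "
          f"responding={row['responding']:,.0f}, "
          f"wave one share={row['wave_one_share']:.2f}")
```

With these illustrative inputs both three-wave designs yield more responding addresses than the five-wave design, but that ordering depends entirely on the assumed rates and would need to be confirmed with observed response data.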
4. Concluding remarks
The analysis that we have seen so far supports the creation of clearly separated Core and Plus components for the TLFS, with the Core corresponding to a shortened online questionnaire focused on headline labour market statistics. The primary focus of TLFS development in the short term should therefore be to finalise the structure of the Core and to implement it as rapidly as possible, with development and implementation of the Plus following on immediately. We are of the opinion that this implementation of the final version of the Core will require five quarters of parallel running alongside the LFS to allow proper evaluation of the quality and stability of the different headline outputs. A shorter parallel run is also possible, but then it will be up to the ONS to assess the increased risks of transition after fewer quarters. Consequently, we feel that the ONS is justified in pursuing the design approach outlined so far, with a view to rapid implementation of the Core design following a successful parallel run alongside the current LFS. This works towards the goal of quickly transitioning production of headline labour market statistics from the LFS to the Core TLFS.
The clear separation between the Core and Plus components of the TLFS has the added advantage of uncoupling design issues for the Plus component, so that it in turn can focus on the delivery of the more detailed data breakdowns required before it can be used to replace those LFS outputs that go beyond headline labour market statistics. The design and implementation of the Plus component can be carried out separately from the Core transition.
We therefore also expect that during the parallel running period the final version of the Plus will come online and will provide users with the more complex statistical outputs currently available from the LFS. Once these outputs have been assessed as being of suitable quality, we anticipate that the LFS will no longer be required. However, it is important to realise that delays in bringing the Plus online will inevitably lead to the LFS being retained for longer. This seems an unavoidable risk that the ONS should take on board.
Professor James Brown and Professor Ray Chambers
13 February 2025