Mr John Astin (Mr Astin was in attendance for the first agenda item only.)
Dr Gareth Clews (Methodology, ONS)
Mr Huw Pierce (ONS)
Mr Chris Payne (ONS)
Dr Domenica Rasulo (ONS)
Dr Hazel Martindale (ONS)
Ms Loes Charlton (ONS)
Mr Alex Rose (ONS)
Ms Helen Sands (ONS)
Mr Yasin Auckbur (ONS)
Dr Antonio Chessa
1. Introduction and apologies
Mr Fitzner opened the meeting and passed on apologies from members unable to attend.
Mr Astin thanked the panel for their support and contributions during his membership, and encouraged the panel to continue their support of the Household Costs Indices. Mr Fitzner and the panel members thanked Mr Astin for his contribution to the panel since its inception and wished him well in his future endeavours. Mr Astin then left the meeting.
Mr Fitzner introduced Prof Nason, who has joined the panel as the Royal Statistical Society’s nominee.
2. Progress on Private Rental Development
Mr Hardie gave an update on development of new private rental measures since the previous meeting. Going forward the focus from ONS will be on cross-validation, investigating endogeneity arising from the ACORN data, calculating confidence intervals, and understanding the effect of interaction terms on the index. In parallel a project has been initiated with Economic Statistics Centre of Excellence (ESCoE) to review the models.
Results from both the internal ONS work and the ESCoE project are expected by the end of April. Mr Hardie confirmed that the outcomes of these work packages will be presented to both the Stakeholder and Technical Panels before engaging with the wider user community.
Dr Rasulo presented a summary of two papers brought before the panel. The first was a review of the mortgage interest payments (MIPs) methodology currently employed in the Retail Prices Index (RPI). The methodology is based around calculating the average debt on a fixed stock of mortgages, accounting for average house price and proportion of price advanced (see section 11.5.1 of the Consumer Prices Indices Technical Manual). The paper presents a sensitivity analysis of the effects of varying the assumption that the proportion of price advanced is 55%, as well as the effect of applying the average effective rate (AER) across the whole series. Future work will explore using a proportion of price advanced based on historic market data, and adjusting the assumption that the average length of a mortgage is 23 years.
The second paper considered an alternative approach to calculating MIPs, using an amortisation formula similar to those used by lenders. The “lenders formula” approach uses the price advanced, the interest rate and the remaining length of the mortgage to calculate payments for a modelled stock of mortgages where the mortgage amount is fixed over cohorts. This approach gives the benefits of simulating household payments exactly while also having potential application for calculating mortgage capital repayments. Its limitations arise from data sources; weights for fixed and variable rates are only available from 2007, while fixed interest series are not continuously available for less common loan-to-value ratios.
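For illustration, the standard level-payment (annuity) amortisation formula, of the kind that the “lenders formula” approach is described as resembling, can be sketched as below. The exact formula used in the paper is not reproduced in these minutes; the function names, the monthly compounding convention and the example figures are assumptions made for this sketch only.

```python
def monthly_payment(principal, annual_rate, years_remaining):
    """Level monthly payment from the standard annuity formula.

    principal       : outstanding amount advanced
    annual_rate     : nominal annual interest rate as a fraction (0.06 = 6%)
    years_remaining : remaining length of the mortgage in years
    Assumption: interest is charged monthly at annual_rate / 12.
    """
    n = years_remaining * 12      # number of monthly payments left
    r = annual_rate / 12          # monthly interest rate
    if r == 0:
        return principal / n      # no interest: capital spread evenly
    return principal * r / (1 - (1 + r) ** -n)


def split_next_payment(principal, annual_rate, years_remaining):
    """Split the next payment into (interest, capital repayment)."""
    payment = monthly_payment(principal, annual_rate, years_remaining)
    interest = principal * annual_rate / 12
    return interest, payment - interest


# Hypothetical example: 100,000 advanced at 6% with 25 years remaining.
interest, capital = split_next_payment(100_000, 0.06, 25)
```

The split into interest and capital is what would make a formula of this kind relevant to both mortgage interest payments and capital repayments, as noted in the discussion.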
Prof Crawford asked whether cohort sizes varied in the analysis and, if not, whether such variability could be introduced. Dr Rasulo confirmed that in the analysis the cohort sizes were consistent across all years, but that survey data may give insight into variability from 2005 onwards. Dr Weale suggested consulting Land Registry data on this point. Prof Crawford also asked if a chart was available showing a comparison of the old and new methodologies with the AER series.
In response to a question from Dr Weale, Dr Rasulo clarified that the proportion of price advanced affects the results because the average house price varies over time. Dr Weale then highlighted that a change in the availability of low-deposit mortgages, perhaps driven by the housing market cycle, should be treated as a price change instead of a quantity change.
The panel discussed a point raised by Mr Levell around whether the index was inclined to fall due to a falling amount of debt over time. If mortgage cohorts and interest rates remain the same across time, then in principle the index should remain flat. Prof Smith and Mr de Vincent-Humphreys highlighted that new cohorts added to the stock are likely to face higher average house prices than the older cohorts leaving, therefore the total amount of debt in the basket should also rise.
Mr de Vincent-Humphreys raised the scenario of mortgage refinancing and queried how the reduced loan-to-value (LTV) ratios created by this process were reflected in the models. Dr Rasulo replied that in their current simplified form there is no treatment of refinancing, nor of equity withdrawal. This is something that could potentially be addressed through survey data. Prof Smith added that many mortgages arise through homeowners trading up which would also influence the composition of the mortgage stock.
Dr Mehrhoff reflected on whether the method matched the intention of a payments-based index and invited further analysis of the underlying assumptions to confirm this. He also proposed further scrutiny of the assumptions and data sources used to ensure that they reflect the UK mortgage market and questioned the calculation of the AER. Several panel members discussed whether the index should track the size of payments or be a weighted average of interest rates, similar to the Bennett approach used by Eurostat.
ACTION: Dr Rasulo to produce a chart comparing the old and new methodologies with the AER series.
4. Web Scraping: Product Grouping Methods
Dr Martindale described work undertaken by ONS to explore alternative methods of grouping products found in web scraped data. Following groups of products rather than individual products is expected to reduce the effect of product churn on the index. Two approaches have been examined: attribute-based grouping, where products are grouped according to keywords in the data, and unsupervised clustering, where product descriptions are converted to numerical data and grouped according to their separation in an N-dimensional feature space. The panel were invited to advise on three topics: assessment metrics, measuring homogeneity, and how to treat seasonality and product churn with attention to the choice of base period.
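As a minimal sketch of the attribute-based approach described above, products can be grouped by the set of keywords found in their descriptions. This is an illustration only and not the ONS implementation; the function name, the keyword list and the example descriptions are hypothetical.

```python
from collections import defaultdict


def group_by_attributes(descriptions, keywords):
    """Group product descriptions by which keywords they contain.

    Each group key is the sorted tuple of keywords present in the
    description, so products sharing the same attributes fall together.
    Assumption: whitespace tokenisation of lower-cased text is adequate.
    """
    groups = defaultdict(list)
    for description in descriptions:
        tokens = set(description.lower().split())
        key = tuple(sorted(k for k in keywords if k in tokens))
        groups[key].append(description)
    return dict(groups)


# Hypothetical example data.
products = ["Red Cotton Shirt", "red cotton shirt slim", "Blue Wool Jumper"]
groups = group_by_attributes(products, ["red", "blue", "cotton", "wool"])
```

A product whose description omits an attribute it actually possesses would land in a different group under this scheme.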
Prof Balk noted that the work demonstrated the difficulty of automating product grouping. He suggested that for consumers product homogeneity should be framed as a question of utility, asking how well related products could be substituted for one another. He advocated using a full year as a base period as this would best capture seasonal goods.
Prof Nason commented on the difficulty that humans have in grouping products together, which gives context to how successful automated processes can be expected to be. He added that for products that are highly seasonal or face high churn it may be preferable to abandon the notion of a base period and model price changes more dynamically. He asked about the specific machine learning approach applied and the volume of training data. Dr Martindale replied that classification was a separate field of investigation for which the analysis used the XGBoost software library. The approach to obtaining training data for classification research is described on the ONS website. The product grouping research aims to break the classified data down further into homogeneous products.
Dr Mehrhoff responded to a point raised by Prof Smith, that attribute-driven approaches suffer where an item possesses an attribute that is not recorded in the data. Dr Mehrhoff acknowledged this but referred to his own experience on this topic, which showed that attribute-driven approaches tend to give outputs that were more reasonable to a human observer. Dr Martindale pointed out that missing data would also be an issue for clustering methods. Dr Mehrhoff asked if there was capacity at ONS to employ manual inspection of the groups once the process was in production. Dr Martindale ventured that a production process entirely without human intervention was unlikely and so work was required to understand how this would operate.
Dr Martindale’s presentation referred to MARS: a method for defining products and linking barcodes of item relaunches. Dr Mehrhoff argued that it gave undue preference to homogeneity and, more seriously, continuity, and offered to share a paper presented at the 2019 NTTS conference that explores this argument further. In subsequent correspondence Dr Chessa concurred with Dr Mehrhoff’s observation for periods close to the base and noted that MARS gave better insight towards the end of the year once an appreciable amount of churn had occurred. He also shared details of a sensitivity analysis where the relative weighting of homogeneity and continuity was varied, proposing that it could give insight to decisions relating to product stratification.
Prof Crawford offered to connect ONS with colleagues that had been working on classifying online job vacancies for the purpose of labour market analysis. He also suggested adding a term for churn into the loss function when fitting the classification models.
ACTION: Dr Mehrhoff to share his report on MARS presented at the 2019 NTTS conference.
ACTION: Prof Crawford to provide contact details for colleagues performing classification research.
5. Outlier Detection and Filtering Methods for Web Scraped and Scanner Data
Ms Charlton gave a presentation covering the work done at ONS on identifying anomalous prices in scanner and web-scraped data. Methods were first assessed for their theoretical suitability, given the assumptions that data was multimodally distributed and that there were relatively few outliers, then filtered to favour simpler approaches over more complex ones. This led to an initial shortlist of three categories: density-based approaches, Gaussian mixture modelling (GMM), and principal component analysis (PCA). To this list three further methods were added: Tukey's method, simple filters, and time series analysis. The methods were assessed on their robustness, computational resource requirement and on the number and distribution of outliers detected. Simple filters gave the fastest results for price levels and price relatives; however, more sophisticated methods had the potential for automation and for generating richer information.
The panel were invited to comment on whether there was an ideal proportion of outliers that should be targeted in datasets, whether outliering has an effect on the index and if so under what conditions, and at which stage in the production pipeline would outliering be most effective.
Dr Weale enquired as to the proposed treatment of outliers, whether they would receive zero weight or a low weight, reasoning that this would depend on the distribution of data. Ms Charlton replied that in the current analysis they were simply removed, although further research may indicate that a different treatment is required. Ms Sands said that outliers would receive manual validation, and if determined to be genuine prices would remain in the dataset. Dr Weale suggested that capacity for manual validation, rather than a statistical process, would therefore answer the question about the desirable number of outliers.
Prof Smith queried the choice of thresholds on the extreme price change filter, noting that they were not symmetrical.
Mr Levell raised a question that led to a discussion about sampling, false positive and false negative outliers. Dr Clews highlighted firstly that some prices may be errors on the part of the retailers, and secondly that samples taken from the fat tail of a distribution may not represent consumer experience. Mr de Vincent-Humphreys noted that a distribution of price levels would be likely to have a positive skew.
Prof Nason suggested looking at a wider selection of variables and historic precedent to help decide whether an extreme price was due to retailer error. Mr Fitzner proposed checking for key words in product descriptions (e.g. “sale” or “discount”) that would indicate that a large price fall was deliberate.
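A keyword check of the kind Mr Fitzner proposed could be sketched as follows. The two keywords are those given as examples in the discussion; the function name and the whole-word matching rule are assumptions for illustration.

```python
import re

# Whole-word, case-insensitive match so that e.g. "wholesale" is not flagged.
PROMO_PATTERN = re.compile(r"\b(sale|discount)\b", re.IGNORECASE)


def looks_promotional(description):
    """Return True if the description contains a promotional keyword."""
    return PROMO_PATTERN.search(description) is not None


# Hypothetical example descriptions.
flags = [looks_promotional(d) for d in
         ["Winter SALE: wool coat", "Wholesale rice 10kg", "20% discount tea"]]
```

A flag of this kind would only suggest that a large price fall was deliberate; it would not by itself confirm that the price is genuine.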
Dr Mehrhoff reframed the question of outliers in terms of influential observations and whether they distort the index. He referred to common practice in real estate indices which is to exclude very highly valued properties even if their prices are genuine. He added that outlier detection should be carried out towards the end of the pipeline, after grouping and filtering. By this point, he argued, a simple Tukey approach should be adequate. Moreover, multilateral index methods tend to reduce the impact of influential observations by virtue of their implicit quality adjustment.
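A simple Tukey approach of the kind referred to above can be sketched with the conventional interquartile-range fences. The fence multiplier of 1.5, the quartile method and the example price relatives are assumptions; the minutes do not record the parameters used in the ONS analysis.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return values lying outside the Tukey fences.

    Fences are Q1 - k*IQR and Q3 + k*IQR, where IQR = Q3 - Q1.
    Assumption: quartiles use the "inclusive" method of
    statistics.quantiles; other quartile definitions shift the fences.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical price relatives within one product group.
relatives = [0.98, 1.00, 1.01, 1.02, 1.03, 1.05, 3.50]
flagged = tukey_outliers(relatives)
```

Applied after grouping and filtering, as suggested in the discussion, the fences would be computed within each group rather than across the whole dataset.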
6. Web Scraping: Expenditure Proxies
Mr Rose summarised his paper on potential indicators of sales volumes in web scraped data that could be used at the elementary aggregate level. Previous ONS research has indicated that having product level weights for indices is arguably more important than the choice of weighted index methods themselves. Methods considered were page position and price banding, and the resulting indices were assessed against a weighted index generated from matched scanner data. Neither page position nor price banding gave satisfactory results. The panel were invited to comment on the importance of expenditure proxies for web scraped data and advise on improvements or alternatives to the methods considered thus far. Mr Rose then outlined some further avenues for exploration: examining continuity of availability for products and refining the price banding approach to take account of other product features.
Prof Balk suggested using a random sample from websites, if page position was an unreliable proxy for expenditure. Mr Rose compared this with current local collection practice and considered whether a greater amount of information would be available in store or online.
Dr Mehrhoff asked firstly if there was any confidence that these methods offered an improvement over unweighted indices, and secondly if there was any safeguard in the event that the input data was heavily biased or unrepresentative. On the first point Mr Rose replied that defining appropriate metrics was difficult, but that the indications are that the expenditure proxies do not offer an improvement over the unweighted indices. On the second point Mr Rose proposed that this was more of a data acquisition issue that could be addressed before calculating the indices.
Prof Crawford noted that the price banding approach created an implied demand curve and that it would be interesting to examine if this curve behaved as expected. It should also be possible to derive the set of demand curves consistent with a set of price data, and thus check to see if the demand curve implied by the data was a member of this set. Another productive approach was to apply bounds analysis, making extreme assumptions about weights and confirming that the calculated index lay within the resulting indices.
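One reading of the bounds analysis suggested above is sketched below. It assumes the index is a weighted arithmetic mean of price relatives with non-negative weights, in which case the most extreme weighting schemes put all weight on a single product and the index must lie between the smallest and largest relative. The index formula, function names and example figures are assumptions, not taken from the paper.

```python
def weighted_index(price_relatives, weights):
    """Weighted arithmetic mean of price relatives (weights normalised)."""
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, price_relatives)) / total


def index_bounds(price_relatives):
    """Bounds under extreme weights: all weight on one product."""
    return min(price_relatives), max(price_relatives)


def within_bounds(index_value, price_relatives, tol=1e-12):
    """Check that a calculated index lies inside the extreme-weight bounds."""
    lower, upper = index_bounds(price_relatives)
    return lower - tol <= index_value <= upper + tol


# Hypothetical price relatives and proxy weights.
relatives = [0.90, 1.00, 1.20]
index = weighted_index(relatives, [1, 1, 2])
ok = within_bounds(index, relatives)
```

An index falling outside these bounds would indicate a calculation error rather than an unusual weighting, which is what makes the check useful as a safeguard.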
Prof Crawford also suggested exploring a maximum entropy approach, possibly combining it with the first method to ensure that it was economically valid. Dr Mehrhoff referred to previous work by Mr Levell who determined that maximum entropy approaches tend towards unweighted indices unless there was further information that could be drawn upon. Dr Mehrhoff added that this additional information could be used for stratification which would reduce the problem of heterogeneity in unweighted indices. Mr Levell commented that it could also be possible to make assumptions about consumer preferences.
Mr de Vincent-Humphreys stressed the importance of assuring that any deviation from an unweighted index was due to genuine expenditure patterns and therefore an improvement. He then asked if ONS had approached any retailers to ask how they build their websites to confirm any relationship between expenditure and page position, noting that some retailers give the option to sort products by sales rank. Mr Rose replied that this enquiry is on the work plan, adding that each retailer was likely to have its own distinctive strategy.
Prof Smith proposed using 1-dimensional clustering for forming price bands, acknowledging that this could create more equal density groups and remove the most salient information.
7. Transparency Review of Paper Publication Classifications
Mr Fitzner described the current criteria for publishing technical papers on the UKSA website, observing that as a result relatively few papers are published. He invited viewpoints as to how the panel could be more transparent as a body.
Dr Mehrhoff shared details of the practice at Eurostat, which is that papers are made public by default unless there is a compelling reason not to. At each meeting a decision is taken on which papers can be released into the public domain.
Prof Nason favoured higher transparency, advocating that there should be a route for interested parties to feed back on published papers. He also suggested publishing summarised or redacted papers where there were concerns over their sensitivity. Mr de Vincent-Humphreys added that presentation slides used at meetings could be published in lieu of the full papers.
Mr Fitzner thanked the panel for their views and added an item to future agendas to review the presented papers for publication, observing that besides the manner and extent of publishing, the timing of publication was also a consideration. Mr Hardie concurred.
ACTION: Secretariat to implement proposed changes and update the APCP web pages to make it clear how users can provide feedback on the papers and request full copies.
8. AOB and date of next meeting
Dr Rasulo to produce a chart comparing the old and new methodologies with the AER series.
Dr Mehrhoff to share his report on MARS presented at the 2019 NTTS conference.
Prof Crawford to provide contact details for colleagues performing classification research.
Secretariat to implement proposed changes and update the APCP web pages to make it clear how users can provide feedback on the papers and request full copies.