Introduction
This paper provides an overview of current applied data ethics guidance that is available to the research and statistical community within the UK. A brief landscape review has been undertaken to identify the depth and coverage of current guidance and to determine potentially relevant activities and areas of interest. This paper is designed to support the UK Statistics Authority’s new Centre for Applied Data Ethics to:
- Understand the current data ethics landscape within the UK and the relative contribution of the Centre within this broader landscape.
- Identify potential gaps in support mechanisms that will enable users within the research and statistical community to apply data ethics principles to their projects.
- Determine specific applied data ethics topics that would initially benefit from the development of further guidance and training and identify potential collaborators for these activities.
- Support a key aspect of the UK Statistics Authority’s Five-Year Strategy, namely to be a recognised world-leader in the practical application of data ethics for statistics and research, by being radical in our ambitions, ambitious in our actions, sustainable in our outlook and inclusive in all that we are and do.
The Centre for Applied Data Ethics (hereafter referred to as ‘the Centre’) was established by the UK Statistics Authority in February 2021 to provide a central space that supports and enables the research and statistical community to continue to apply data ethics within their work. This is increasingly important as the digital age contributes to the emergence of new methods, tools and datasets that have the potential to be harnessed for statistical, research and wider analytical purposes for the public good.
To support the new Centre in prioritising its initial support and engagement activities, a brief landscape review was undertaken in January 2021 of current applied data ethics guidance openly available to the research and statistical community in the UK.
A particular focus of the review was on identifying recent and emerging guidance and considering both the range of topics covered and the extent to which they provide ‘usable and applicable’ guidance for practitioner audiences.
This review enables the Centre to:
- Understand the current state-of-the-art regarding applied data ethics.
- Ensure that Centre activities are appropriately positioned and targeted to provide maximum support and benefit to the research and statistical user community.
- Identify potential gaps in applied data ethics guidance and associated support materials and activities.
- Identify potential collaborators for the development of future guidance in this area.
Annex A provides a tabular summary of identified ethics guidance, materials and activities across different organisations that may be relevant to the future activities of the Centre and provides a basis for further engagement and collaboration.
Identification of guidance began with an exploration of the activities of pre-identified domestic stakeholders in the data ethics arena. Further organisations were also identified via an online search using the terms ‘data ethics guidance’ and ‘applied data ethics guidance’. The identification of this initial guidance and associated activities was based on online research only.
Overall, the majority of recent ethical guidance that was identified focuses on aspects related to the use of artificial intelligence, algorithms and data science methodologies, as well as aspects related to data sharing.
These primarily provide high-level principles and explanations, with some providing additional key questions, checklists and examples of mitigating actions for users to consider when designing and implementing their projects.
More generic data ethics frameworks have also been further developed to aid in the practical application of ethical principles. For instance, the latest iteration of the Government Digital Service Data Ethics Framework now provides a self-scoring system for projects.
Limited guidance on more specific ethical topics is also provided by some organisations, with topics focusing on areas such as anonymisation, data confidentiality, and data linkage. The level of detail provided in such guidance ranges from relatively brief and high-level to more practical in-depth discussion of issues and the provision of suggested mitigations.
Overall, the majority of data ethics guidance is principle-based and relatively generic in approach, with the provision of checklists and tools in some instances, to enable application to a range of project types. Unfortunately, it is currently unclear how regularly existing frameworks are used by various user communities and the extent to which they support project planning by different actors in the research and statistical community. There also remains a lack of specific topic-focused guidance and support materials in some areas and applications (e.g., concrete guidance on specific data types, sources, methods or applications), although it is clear that this is increasing in certain areas.
Finally, it should be noted that this landscape review has not considered how commercial research organisations embed data ethics into their processes. Future work would be beneficial to consider this aspect in relation to the guidance that would be most useful for third-party organisations that may be involved in data sharing with the research and statistical community in the future.
Based on this initial landscape review, the development of more practical, application-based guidance that focuses on a range of topic areas that currently present ethical issues and challenges to researchers and statisticians would be beneficial. Ensuring that this guidance is user-friendly, developed in consultation with the user community, and accompanied by relevant training materials and support services where necessary is also important to ensure that user needs are met and the application of guidance is fully enabled.
An initial list of ten potential topic areas for consideration is shown in Annex B and focuses on three main aspects: transparency, inclusivity, and data use. The topics range from the use of geo-location data, machine learning, and open data sources, to transparency in data linkage, onward sharing and public good.
This list has been developed based on the experience of the UK Statistics Authority’s Data Ethics team in supporting research and statistical projects for the public good and it is certainly not exhaustive. As part of the initial activities of the Centre, feedback on these topic areas (and the identification of additional topic areas) is welcome and will be gathered via stakeholder engagement activities and events.
It is proposed that these, and other, topic areas be co-developed in collaboration with the user community and in partnership with recognised authorities and key thinkers in the use of specific data types, methods and applications of data for the public good.
This will enable the Centre to address current gaps and priorities in the applied data ethics landscape, both in the UK and internationally, and fulfil its strategic ambitions to be a recognised and leading authority in the application of data ethics. However, it is recognised that there are many organisations operating within the data ethics space, and it is hoped that this review will also provide a useful overview for others working in this field.
The findings of this landscape review will be combined with further exercises with user communities to capture user requirements (e.g., via workshops and other engagement with stakeholders) to identify and prioritise topics that would benefit from the development of applied data ethics guidance.
This landscape review provides an overview of the current applied data ethics landscape within the UK, identifying key stakeholders, relevant guidance materials and potential ethics support mechanisms for the wider user community.
The Centre operates within a broad data ethics environment, with many other organisations also providing advice, guidance or ‘voice’ in terms of ethical issues related to society’s use of data. We are keen to collaborate and partner to maximise the benefits that this diverse landscape provides.
The Centre provides a data ethics support service focused on the use of data for research and statistics within this landscape. Unlike other organisations, our remit does not cover the use of data for operational purposes, either within Government or outside of it. Specifically, we provide:
- A targeted focus on the UK’s research and statistical community (including academic researchers, commercial organisations, those across Government and the wider public sector, and statistical institutes).
- Identifiable resource and expertise to empower researchers and statisticians to move beyond ethical principles and theoretical considerations, to enable them to work through practical ethical issues in their research projects.
- A space to support users to develop solutions to ethical issues in their use of data for research and statistical purposes, enabling users to unlock the power of data in ethically appropriate ways.
In this way, the Centre will build on the work already undertaken by the UK Statistics Authority in the data ethics space. More than 400 research projects from across government, academia, and the commercial sector have used the UK Statistics Authority’s existing data ethics framework and self-assessment tool to consider the ethics of their research in the last two years, and more than 40 have done so this year to date. This demonstrates the increasing demand for our data ethics services. Further development of these services will provide enhanced support to researchers and statisticians, enabling ethical challenges in modern-day research to be recognised and overcome and ensuring that research and statistics can continue to provide maximum benefit for the public good.
As a key part of its remit, the Centre aims to provide authoritative applied data ethics guidance, which is focused on providing user-friendly, actionable advice that meets the current and emerging needs of the research and statistical community.
To ensure that the activities and outputs of the Centre provide maximum benefit for the UK’s research and statistical community, the findings of this landscape review will be combined with continuous engagement and outreach activities with user communities and key stakeholders in the data ethics space.
Annex A: Landscape review table
*Table Notes: Materials included in the table represent information that was identified in desk-based research, was openly available to view (i.e., not restricted behind paywalls etc.) and in English. It is unlikely to represent an exhaustive list and does not include current plans or materials that may be available to selected internal audiences/groups within an organisation. It also does not cover the extensive range of research reports produced by various organisations in the data ethics space, which can be found by visiting the websites of individual organisations.
**A collation of some of this guidance can be found at the data ethics and AI guidance landscape webpage produced by DCMS (however, this is predominantly focused on AI more generally; https://www.gov.uk/guidance/data-ethics-and-ai-guidance-landscape).
***UK Government AI Council also have a remit for exploring how to develop and deploy safe, fair, legal and ethical data-sharing frameworks (https://www.gov.uk/government/groups/ai-council).
****Although this table is not focused on responsible innovation, a range of practical resources related to ethical aspects of technology innovation also exist, such as the Consequence Scanning tool developed by DotEveryone.
Annex B: Proposed topic areas
Transparency
- What information do respondents need to know about data linkage and the onward sharing of data at the point of data collection to ensure research is ethically appropriate?
- How can researchers fully articulate the public good of their research?
- How can researchers navigate the cluttered and confusing data ethics landscape to conduct ethically appropriate research?
- How can researchers and statisticians use algorithms in transparent ways?
Inclusivity
- What constitutes a vulnerable group and how should ethically appropriate research deal with these groups?
- Who are the people who are not included in administrative data, what are the ethical implications of this and how should ethically appropriate research account for this?
Data Use
- From an ethical perspective, where is the boundary between research and statistics and operational uses of data?
- How do we ensure we use Machine Learning and AI in ethically appropriate ways to improve the production of research and statistics?
- How do we ensure we use open data in ethically appropriate ways?
- Ethics of the use of geo-location data
We welcome your thoughts…
We have published this initial landscape review as an open draft for comment and feedback. It does not represent an exhaustive list of organisations, activities or guidance in the data ethics space, but a means to help us to position the Centre and its activities within the broader data ethics environment in the UK and identify potential collaborators for our future work.
We would welcome views from interested parties. If you would like to share your thoughts with us, or are interested in collaborating, please email us, ideally by 16 April 2021. You can also visit our website for more information about the Centre.
We look forward to hearing from you.