Identifying gaps, opportunities and priorities in the applied data ethics guidance landscape

Introduction

This paper provides an overview of current applied data ethics guidance that is available to the research and statistical community within the UK. A brief landscape review has been undertaken to identify the depth and coverage of current guidance and to determine potentially relevant activities and areas of interest. This paper is designed to support the UK Statistics Authority’s new Centre for Applied Data Ethics to:

Understand the current data ethics landscape within the UK and the relative contribution of the Centre within this broader landscape.
Identify potential gaps in support mechanisms that will enable users within the research and statistical community to apply data ethics principles to their projects.
Determine specific applied data ethics topics that would initially benefit from the development of further guidance and training and identify potential collaborators for these activities.
Support a key aspect of the UK Statistics Authority’s Five-Year Strategy, namely to be a recognised world-leader in the practical application of data ethics for statistics and research, by being radical in our ambitions, ambitious in our actions, sustainable in our outlook and inclusive in all that we are and do.

The Centre for Applied Data Ethics (hereafter referred to as ‘the Centre’) was established by the UK Statistics Authority in February 2021 to provide a central space that supports and enables the research and statistical community to continue to apply data ethics within their work. This is increasingly important as the digital age contributes to the emergence of new methods, tools and datasets that have the potential to be harnessed for statistical, research and wider analytical purposes for the public good.

To support the new Centre in prioritising its initial support and engagement activities, a brief landscape review was undertaken in January 2021 of current applied data ethics guidance openly available to the research and statistical community in the UK.

The review focused in particular on identifying recent and emerging guidance, considering both the range of topics covered and the extent to which it provides ‘usable and applicable’ guidance for practitioner audiences.

This review enables the Centre to:

Understand the current state of the art in applied data ethics.
Ensure that Centre activities are appropriately positioned and targeted to provide maximum support and benefit to the research and statistical user community.
Identify potential gaps in applied data ethics guidance and associated support materials and activities.
Identify potential collaborators for the development of future guidance in this area.

Overview of the applied data ethics landscape in the UK

Annex A provides a tabular summary of identified ethics guidance, materials and activities across different organisations that may be relevant to the future activities of the Centre and provides a basis for further engagement and collaboration.

Identification of guidance began with an exploration of the activities of pre-identified domestic stakeholders in the data ethics arena. Further organisations were also identified via an online search using the terms ‘data ethics guidance’ and ‘applied data ethics guidance’. The identification of this initial guidance and associated activities was based on online research only.

Overall, the majority of recent ethical guidance that was identified focuses on aspects related to the use of artificial intelligence, algorithms and data science methodologies, as well as aspects related to data sharing.

This guidance primarily provides high-level principles and explanations, with some sources providing additional key questions, checklists and examples of mitigating actions for users to consider when designing and implementing their projects.

More generic data ethics frameworks have also been further developed to aid in the practical application of ethical principles. For instance, the latest iteration of the Government Digital Service Data Ethics Framework now provides a self-scoring system for projects.

Some organisations also provide limited guidance on more specific ethical topics, such as anonymisation, data confidentiality, and data linkage. The level of detail provided in such guidance ranges from relatively brief and high-level to more practical, in-depth discussion of issues and the provision of suggested mitigations.

Overall, the majority of data ethics guidance is principle-based and relatively generic in approach, with the provision of checklists and tools in some instances, to enable application to a range of project types. Unfortunately, it is currently unclear how regularly existing frameworks are used by various user communities and the extent to which they support project planning by different actors in the research and statistical community. There also remains a lack of specific topic-focused guidance and support materials in some areas and applications (e.g., concrete guidance on specific data types, sources, methods or applications), although such guidance is clearly increasing in certain areas.

Finally, it should be noted that this landscape review has not considered how commercial research organisations embed data ethics into their processes. Future work would be beneficial to consider this aspect in relation to the guidance that would be most useful for third-party organisations that may be involved in data sharing with the research and statistical community in the future.

Initial recommendations

Based on this initial landscape review, the development of more practical, application-based guidance that focuses on a range of topic areas that currently present ethical issues and challenges to researchers and statisticians would be beneficial. Ensuring that this guidance is user-friendly, developed in consultation with the user community, and accompanied by relevant training materials and support services where necessary is also important to ensure that user needs are met and the application of guidance is fully enabled.

An initial list of ten potential topic areas for consideration is shown in Annex B, grouped under three main themes: transparency, inclusivity, and data use. The topics range from the use of geo-location data, machine learning, and open data sources to transparency in data linkage, onward sharing and public good.

This list has been developed based on the experience of the UK Statistics Authority’s Data Ethics team in supporting research and statistical projects for the public good, and it is not exhaustive. As part of the initial activities of the Centre, feedback on these topic areas (and the identification of additional topic areas) is welcome and will be gathered via stakeholder engagement activities and events.

It is proposed that these, and other, topic areas be co-developed in collaboration with the user community and in partnership with recognised authorities and key thinkers in the use of specific data types, methods and applications of data for the public good.

This will enable the Centre to address current gaps and priorities in the applied data ethics landscape, both in the UK and internationally, and fulfil its strategic ambitions to be a recognised and leading authority in the application of data ethics. However, it is recognised that there are many organisations operating within the data ethics space, and it is hoped that this review will also provide a useful overview for others working in this field.

The findings of this landscape review will be combined with further exercises with user communities to capture user requirements (e.g., via workshops and other engagement with stakeholders) to identify and prioritise topics that would benefit from the development of applied data ethics guidance.

Where does the Centre fit?

This landscape review provides an overview of the current applied data ethics landscape within the UK, identifying key stakeholders, relevant guidance materials and potential ethics support mechanisms to the wider user community.

The Centre operates within a broad data ethics environment, with many other organisations also providing advice, guidance or ‘voice’ in terms of ethical issues related to society’s use of data. We are keen to collaborate and partner to maximise the benefits that this diverse landscape provides.

The Centre provides a data ethics support service focused on the use of data for research and statistics within this landscape. Unlike other organisations, our remit does not cover the use of data for operational purposes, either within Government or outside of it. Specifically, we provide:

A targeted focus on the UK’s research and statistical community (including academic researchers, commercial organisations, those across Government and the wider public sector, and statistical institutes).
Identifiable resource and expertise to empower researchers and statisticians to move beyond ethical principles and theoretical considerations and work through practical ethical issues in their research projects.
A space to support users to develop solutions to ethical issues in their use of data for research and statistical purposes, enabling users to unlock the power of data in ethically appropriate ways.

In this way, the Centre will build on the work already undertaken by the UK Statistics Authority in the data ethics space. More than 400 research projects from across government, academia, and the commercial sector have used the UK Statistics Authority’s existing data ethics framework and self-assessment tool to consider the ethics of their research in the last two years, and more than 40 so far this year. This demonstrates the increasing demand for our data ethics services. Further development of these services will provide enhanced support to researchers and statisticians, enabling ethical challenges in modern-day research to be recognised and overcome and ensuring that research and statistics can continue to provide maximum benefit for the public good.

As a key part of its remit, the Centre aims to provide authoritative applied data ethics guidance, which is focused on providing user-friendly, actionable advice that meets the current and emerging needs of the research and statistical community.

To ensure that the activities and outputs of the Centre provide maximum benefit for the UK’s research and statistical community, the findings of this landscape review will be combined with continuous engagement and outreach activities with user communities and key stakeholders in the data ethics space.

Annexes A and B

Annex A: Landscape review table

Organisation	Ethical Principles	User-focused Applied Ethics Guidance	User Ethics Training	User Ethics Support	Ethics Committee	Other
Centre for Data Ethics and Innovation (CDEI)	Principles related to public data sharing.	Key questions to consider in public sector data sharing projects; examples of potential mitigating actions in use of AI.				Established ‘Public Attitudes to Data and AI’ (PADAI) network for cross-Whitehall organisations; collaborative projects with Police Scotland, MoD and Bristol City Council in developing applications; exploration of privacy enhancing technologies for trustworthy use of data.
Alan Turing Institute (ATI)	Principles related to AI ethics, in collaboration with Government Digital Service (GDS) and Office for AI (OAI).	Operationalised Process-based Governance Framework for ethical AI development; definitions, processes and checklists for explaining AI in practice (in collaboration with Information Commissioner’s Office).			Ethics Advisory Group.	Online seminars/masterclasses such as introduction to data ethics, process fairness, AI ethics; Data Ethics Group.
Ada Lovelace Institute (ALI)	Consideration of principles related to data stewardship.	Key aspects to consider for data stewardship mapped to case studies in spreadsheet format; practical methods and terms related to algorithm audits and algorithmic impact assessments (in collaboration with DataKind); explanation of different data transparency mechanisms.				Recent events held on what forms of mandatory reporting can help achieve public sector algorithmic accountability, data stewardship, inspecting algorithms in social media platforms.
Royal Statistical Society (RSS)	Principles related to ethical data science, big data.	One-page implementation checklist and what and how table for each principle.				Data Ethics and Governance Section; Ethics ‘Happy Hours’ convened by Data Science Section.
Open Data Institute (ODI)	Levers for a more open, trustworthy data ecosystem covered in general manifesto, and applications related to data sharing in specific sectors.	Practical tool to explore questions related to the ethics of data-related projects; guidance on anonymisation techniques.	Introduction to data ethics and the data ethics canvas; development of a data ethics facilitator course.	Provide advisory services related to data ethics practices.
Economic and Social Research Council (ESRC, UKRI)	Principles for ethical research.	Case studies and guidance on internet mediated research; research with children and young people; research with potentially vulnerable people; international research; data requirements.
Medical Research Council (MRC, UKRI)	Principles related to good research practice.	Range of guidance materials related to topics including data sharing, consent, confidentiality, research involving children, and research in developing societies.	Training related to consent, transparency, confidentiality and GDPR.	Support and advice for those conducting research with human participants, their tissues or data.
West Midlands Police and Crime Commissioner (WMP PCC)					Ethics committee for data science-related projects.
Government Social Research (GSR)	Principles on ethics and using social media for social research.	Guidance on applying ethical principles; general ethical guidelines and questions to consider in using social media for social research, alongside examples of how ethical issues may manifest in a project context.	Facilitate ethics training for the GSR profession.			GSR Ethics Community of Practice.
Government Digital Service (GDS)	Principles on data ethics.	Specific actions with questions to consider that follow the project process, with a 0-5 self-scoring system.				Plan to build a public sector data ethics community, working on developing data ethics skills training, and gathering case studies and impact stories.
Government Statistical Service (GSS)		Definitions, tips, examples and explanations related to anonymisation and data confidentiality and data linking.	Awareness in data ethics online course.
Office for Statistics Regulation (OSR)		Guidance related to data sharing; collecting and reporting data about sex; building confidence in the handling and use of data.				Identification of ethics issues and corresponding lessons learnt related to specific domains via in-depth reviews, such as the use of statistical models to award grades in 2020; ethical standards emphasised in T1.2 and T6.1 of Code of Practice for Statistics.
Department for Health and Social Care (DHSC)	Principles related to data-driven technology in healthcare.	Guidance on best practice in development of data-driven technology in healthcare.			Links to health research ethics committees and Health Research Authority Research Ethics Service.
Information Commissioner’s Office (ICO)		Guidance related to AI and data protection that covers ethical aspects, with actionable AI Auditing framework under consultation; definitions, processes and checklists for explaining AI in practice (in collaboration with Alan Turing Institute).				Consultation on role of data ethics in complying with the GDPR; resources related to data sharing.
Digital Catapult (DC)	Principles covered in AI Ethics framework.	Guidance to apply the framework to practice.		AI Ethics Committee Advisory Group provides support to Start-ups; Larger organisations supported via responsible industry adoption forum.	AI Ethics Committee, consisting of both a Steering Group and Advisory Group.	Lessons in Practical AI Ethics detailing lessons learned from Machine Intelligence Garage programme; creation of searchable Applied AI Ethics Typology for developers following joint research with academic collaborators; development of Applied AI Ethics Hub; development of public webinars focused on responsible AI adoption for AI practitioners.
The Wellcome Trust	Ethical principles covered in Good Research Practice Guidelines.	Guidance on wording to use when seeking consent for patient data; ethical points to consider for project students; aspects related to data sharing.
Social Research Association (SRA)	Principles for ethical social research.	Provide links to further information in ethics guidelines.		Confidential SRA Ethics Forum.
British Psychological Society (BPS)	Principles for human research ethics, ethics review and research during COVID-19.	Range of guidance related to valid consent, confidentiality, deception and debriefing, vulnerable populations, internet-mediated research, research during COVID-19, open data and research within the NHS.
NESTA		Maintain a searchable landscape map/database that collates guidance and materials related to AI governance, some of which has relevance to ethical aspects; series of questions to answer before using AI in public sector algorithmic decision making.
UK Data Service		Provides links to a range of materials relevant to ethics, including specific and detailed guidance regarding statistical disclosure control, informed consent, data sharing and anonymisation.
Nuffield Council on Bioethics	Ethical principles related to healthcare research in developing countries.	Collated links to international guidance/organisations related to COVID-19 and bioethics.				Webinars related to ethical aspects of healthcare research.
British Sociological Association (BSA)	Principles related to ethical research.	Table documenting potential exemptions from informed consent and confidentiality; series of 6 digital research case studies covering online forums, Twitter, open data, and young people.
Association of Internet Researchers (AoIR)	Ethical norms/principles for internet research.	Guidance on aspects of informed consent, harm, algorithms, data procedures, questions to consider for AI and machine learning and use of corporate data.
Market Research Society (MRS)	Ethical principles related to the MRS Code of Conduct.	Practitioner focused guidance on a range of topics (containing questions/statements for researchers to think about and examples).
Operational Research Society (ORS)	Ethical principles for operational research.
UK Research Integrity Office (UKRIO)		Checklist for ethics applications and ethical considerations related to the pandemic, points to consider for internet-mediated research.	Training sessions on research ethics and informed consent.			Arranged webinars in 2020 on data sharing and ethics.
DataKind	Ethical principles for charities working with technology companies/future technologies and data.	Checklist-style questions to consider related to different social risk types.	Ethics training for core DataKind UK volunteers in their ethical OS project tool.	Provides light touch support for data ethics in social change organisations.	Ethics committee for their data science related projects.
Inspiring Impact	Principles related to research ethics and data protection.	High-level guidance on informed consent, harm, voluntary participation, protected identity and avoiding bias.

*Table Notes: Materials included in the table represent information that was identified in desk-based research, was openly available to view (i.e., not restricted behind paywalls etc.) and in English. It is unlikely to represent an exhaustive list and does not include current plans or materials that may be available to selected internal audiences/groups within an organisation. It also does not cover the extensive range of research reports produced by various organisations in the data ethics space, which can be found by visiting the websites of individual organisations.

**A collation of some of this guidance can be found at the data ethics and AI guidance landscape webpage produced by DCMS (https://www.gov.uk/guidance/data-ethics-and-ai-guidance-landscape), although this is predominantly focused on AI more generally.

***The UK Government AI Council also has a remit for exploring how to develop and deploy safe, fair, legal and ethical data-sharing frameworks (https://www.gov.uk/government/groups/ai-council).

****Although this table is not focused on responsible innovation, a range of practical resources related to ethical aspects of technology innovation also exist, such as the Consequence Scanning tool developed by DotEveryone.

Annex B: Proposed topic areas

Transparency

What information do respondents need to know about data linkage and the onward sharing of data at the point of data collection to ensure research is ethically appropriate?
How can researchers fully articulate the public good of their research?
How can researchers navigate the cluttered and confusing data ethics landscape to conduct ethically appropriate research?
How can researchers and statisticians use algorithms in transparent ways?

Inclusivity

What constitutes a vulnerable group and how should ethically appropriate research deal with these groups?
Who are the people who are not included in administrative data, what are the ethical implications of this and how should ethically appropriate research account for this?

Data Use

From an ethical perspective, where is the boundary between the use of data for research and statistics and its use for operational purposes?
How do we ensure we use Machine Learning and AI in ethically appropriate ways to improve the production of research and statistics?
How do we ensure we use open data in ethically appropriate ways?
Ethics of the use of geo-location data

We welcome your thoughts…

We have published this initial landscape review as an open draft for comment and feedback. It does not represent an exhaustive list of organisations, activities or guidance in the data ethics space, but a means to help us to position the Centre and its activities within the broader data ethics environment in the UK and identify potential collaborators for our future work.

We would welcome views from interested parties. If you would like to share your thoughts with us, or are interested in collaborating, please email us, ideally by 16 April 2021. You can also visit our website for more information about the Centre.

We look forward to hearing from you.