Discussion Note

Background

On Wednesday 12th May, the UK Statistics Authority’s Centre for Applied Data Ethics convened a virtual roundtable event to explore questions surrounding how we can best address emerging ethical challenges in the use of data for research and statistics, and how we can support researchers to overcome these issues.

This event was attended by approximately 30 participants from a range of organisations with an interest in applied data ethics.

The event began with a brief introduction to the Centre from the UK Statistics Authority’s Data Ethics team and an opening presentation from Professor David Hand OBE FBA. This was followed by a facilitated discussion focused on three main questions:

  1. What are the key emerging ethical challenges in the use of data for research and statistics?
  2. How can we best support users to address these challenges going forward?
  3. Where should our priorities be in this space for the immediate and medium term?

This discussion formed the basis of the current paper and will directly contribute to the future activities of the Centre. By making this resource openly available, it is hoped that the wider data ethics community can also benefit from the insights documented.

Roundtable Discussion

  1. Attendees were welcomed by Emma Walker to the UK Statistics Authority’s Centre for Applied Data Ethics Roundtable event and thanked for their attendance.
  2. Emma Walker provided a brief overview of the Centre and its data ethics approach. Emma emphasised that the purpose of this roundtable would be to feed into the activities of the Centre, with the ambition that the Centre will be a recognised leader in the practical application of data ethics for statistics and research.
  3. Attendees were given housekeeping information regarding how the session would run, and how to contribute. Attendees were also reminded that a discussion note would be circulated after the meeting. This note would then form the basis for a discussion paper that would be developed by the Centre following the event and made openly available to enable others who were unable to attend the event to benefit from any insights.

  1. Professor David Hand was invited to introduce the roundtable topic with his initial thoughts regarding the discussion questions.
  2. David identified that new data sources (e.g., CCTV) and methodologies (e.g., data linkage) are emerging within the research and statistics space, and this has resulted in emerging ethical challenges. Several particularly important aspects were identified related to inclusivity, transparency, bias and fairness, discrimination, and privacy.
  3. David raised the question of what transparency means to different parties, and to whom we need to be transparent.
  4. David introduced the idea that when researching, one should consider not only whether the data can be used, but also whether it should be used. It is important to consider the ethical tensions that arise from using certain types of data, and whether using certain data could do more harm than good. This relates to both the needs of individuals and those of the wider population. Although statistics is aggregate, it is also concerned with the individual.
  5. There is a clear need to determine what the biggest emerging ethical challenges related to the use of data for research and statistics are, and how the research and statistical community can be engaged and supported to best address these challenges.
  6. David identified several key ethical issues, relating to the boundaries between research/statistics and operational uses of data; ethical issues related to data linkage, data re-use and use of open data; and identifying and accounting for the people who are not included in administrative data.
  7. It can be particularly challenging to identify the most appropriate users of support and advice, and to target them effectively so that the support has an impact. There is a need for practical guidance that is as easy as possible to access, navigate and apply in diverse contexts. Organisations need to be proactive in effectively signposting emerging materials, resources, and support.
  8. The questions posed for the roundtable discussion show that there is a need to give users a voice – what are their challenges? What do they need? How can we most effectively provide help and support? We need to work together to maximise the application of our knowledge in this space and ensure that groups are not missing from the conversation.
  9. Finally, David introduced the aims for the roundtable, and the three questions that would be the focus of the discussion.

  1. In reference to David’s presentation, attendees discussed the “could” and “should” of data usage. It was suggested that framing data usage in this way risks restricting (or dissuading) researchers from using data sources which pose difficult ethical challenges. Doing so may stop certain data sets from being used and, in turn, create a bigger ethical challenge: the loss of public insight. Missed use of data can be as harmful as the misuse of data. It was suggested that instead of focusing primarily on the issue of ‘should’, more consideration should be given to the issue of ‘how’.
  2. Attendees felt that it was important not only to reflect on what is done with data once it is made available, but also “how” research is conducted. Fundamentally, this needs to involve as little risk to individuals as possible. If the accumulation of information about individuals by people who do not need this information can be avoided, then it should be. The processes used to link data often involve giving data from one agency to another, which means that these organisations end up with more information than they perhaps need. The aim should be to process information via data linkage in a way which delivers the most public good but avoids organisations gaining more information about individuals than they need. The cumulative effects of data sharing are something that researchers and statisticians need to be aware of, and projects should be looked at in terms of the wider research context and the impacts that this could have on privacy.
  3. Limitations and restrictions which are put in place to ensure ethical use of data are set by varying organisations, and the severity of these restrictions is dependent on each of these organisations. Attendees felt that a more streamlined approach to data sharing would be beneficial, to make it simpler for researchers to understand the conditions that they need to meet, and to ensure that researchers are not discouraged from using certain datasets.
  4. Moreover, attendees highlighted a potential conflict between the need to show reproducibility and the risk of publishing an algorithm that someone else could use in a different way to cause harm, and felt that further consideration of this topic could be beneficial.
  5. Attendees also highlighted the potential ethical issues that may arise should different groups of researchers have different levels of access to datasets. There is a need to be ethical in the way that processes, frameworks and organisations are structured for access to data, as well as the data itself. By communicating with other areas of research (such as operational and commercial research), we may be able to streamline these processes and create a fairer, more transparent data access framework.
  6. It was suggested that, if you look at the makeup of ‘ethical data breaches’ over the last 10 years, most have been in the private sector. Attendees felt that this was because private sector organisations have a more open policy towards data usage, whereas the public sector is more cautious and requires researchers to satisfy far stricter conditions to gain access to certain data sets.
  7. Attendees believed that since the Covid-19 pandemic, the public are more willing to share their data. However, they also considered that this may not always be the case, and that public acceptability changes depending on the research purpose. It is considered important to look at the potential consequences for individuals of different research projects and proposed policy interventions.
  8. It was seen to be important that, when ethical issues relating to research projects are discussed, the crucial legal frameworks which exist to support ethical research are also considered.
  9. Attendees discussed the benefits of learning from other areas of research. For example, in clinical trials there isn’t necessarily a fundamental need to know how things work, but there is a fundamental need to minimise harm. Researchers and statisticians could learn from this when considering the ethical criteria for assessing black box algorithms.
  10. Attendees were encouraged that the Centre is working on guidance on geospatial data and its associated ethical issues, and noted that it can be particularly difficult to distinguish between location data and personal data. High-level definitions and guidance on how to categorise data could be helpful in clarifying this issue.
  11. Attendees commented on the expectation of timeliness in regard to data sharing, particularly as a result of the coronavirus pandemic. This raises the question as to how organisations can enable quick decision-making and data sharing, but also highlights the need to explain the potential problems involved in sharing data too quickly, in a way that people who are not conversant in data ethics considerations will easily understand.
  12. The discussion identified the use of novel data sources, and the need to communicate traditional ethical considerations in relation to these new sources, particularly with regard to provenance, quality, trust, and transparency. Moreover, new data sources may pose contemporary ethical issues which need further exploration.

  1. Attendees discussed the need to consider both current and future users of data. It was considered particularly difficult to ensure that guidance is futureproof when reflecting upon the cultural context of ethics. To do so, attendees felt that it is particularly important to embed good practice, critical challenge, and rigorous thinking when considering the cultural dimension of ethical practice within organisations.
  2. In order to address the discussion question, attendees highlighted the need to identify who counts as “users”. Whilst there was an implicit focus on individual researchers, attendees also identified the need for support to acknowledge intermedial environments/organisations who interpret and systematise ethics. These bodies direct researchers (via regulation and governance) to create ethically viable research projects, and attendees agreed that guidance and support should be aimed towards these organisations as they have the remit to further influence individual researchers.
  3. The discussion noted that there are often lots of processes that could help researchers and statisticians to build confidence and trust, but often people do not know about them, and opportunities for data use are missed as a result. There is a need to proactively reach out to those who want to use and collect data to ensure that they know there are pathways and processes available for support.
  4. Attendees felt that there are already many ethical frameworks, principles and tools which exist within research and statistics, which is positive. However, it was also suggested that researchers may struggle because of this proliferation. Curation and guidance on which tools are best or most appropriate to use for the research and statistical context were seen to be a beneficial focus for the Centre.
  5. Practical support around establishing ethics governance pathways was encouraged by attendees. Specifically, support for organisations in creating ethics boards and specific data ethics teams was seen to be potentially helpful.
  6. Attendees felt that the Centre is in a unique position to encourage open spaces for discussion on ethical challenges, and that drop-in sessions for different data users and groups would be beneficial in engaging the wider statistical community.
  7. Attendees discussed the benefits of collecting impact stories and case studies to monitor practice and share with researchers and statisticians as examples of best practice.
  8. It was suggested that the research and statistics community would benefit from a public acceptance focused approach. By engaging in dialogue with the public, researchers and analysts are better able to understand the needs and wants of different populations and may be better able to ensure that research is ethically and socially acceptable.
  9. In relation to the point above, attendees discussed the cultural and social context of data use and questioned how the culture of data users can be shaped to encourage awareness of these ethical issues.
  10. Attendees wondered whether there would be benefit in the Centre working with research funders, as it is important that ethics is considered at the start of the research process. Working with funders may encourage researchers and organisations to consider ethical challenges earlier, and this could be integrated into the funding application process.
  11. Attendees suggested a three-pillared approach to help frame support: pillar one (the application of law) and pillar two (practical guidance on scientific issues and on ethical standards) should be underpinned by pillar three (independent advice and scrutiny).
  12. Attendees identified differing levels of data ethics awareness across the wider statistical community and within different government organisations. Varying levels of awareness require guidance and training to be tailored for specific groups or levels of understanding, with each group having unique responsibilities and needs.
  13. Attendees identified that just because a research project is legal, it doesn’t mean it is ethical (and vice versa). Both issues need to be discussed in parallel with one another.
  14. It was felt that it would be beneficial to clarify what guidance already exists on specific ethical issues, and who to go to for support. This was seen to be particularly important now that more partnerships are happening between public and private sector organisations.

  1. Attendees felt that there was a need to better negotiate the interface between public data organisations and commercial entities. It is becoming more common for commerce and public agencies to work together as service providers, enablers, custodians of data etc., and so not only is there a need to work together, but also to provide support for researchers navigating this space.
  2. Attendees identified a need to ensure that users are engaged to help them utilise available data ethics resources. This is essential to enable greater unity when moving forward in this space. Moving too quickly on too many fronts without streamlining and ensuring inclusivity may limit impact.
  3. It was suggested that there may be a need to clarify existing legal gateways for researchers and statisticians, to help them navigate the legal landscape relating to ethics.
  4. It was suggested that whilst many are aware of the potential ethical implications of data sharing, further guidance and engagement with researchers and statisticians regarding issues of data quality were equally important.
  5. Attendees considered the need to do public engagement well, without it becoming a tick-box exercise. Discussion acknowledged that public engagement is very important for ensuring public confidence and legitimacy, but also identified a need for researchers and statisticians to engage with the public throughout the research process.
  6. Attendees, whilst thinking about impact and public engagement, discussed what guidance should look like, and what sort of guidance is most useful for the everyday user. There was a suggestion that guidance should aim to inform without being too dense or repetitive.
  7. Attendees felt that engaging with the public effectively is very much a skill, and that creating meaningful public engagement activities is particularly difficult. It was felt that it would be beneficial to provide authoritative advice and guidance on communicating and discussing ethical issues with the public, and signpost where sources of expertise can be found.
  8. Attendees also identified that different organisations and agencies may have different, nuanced challenges and “cultures” regarding data ethics, and that integrated and personalised assessment or support within these organisations may be beneficial.

  1. Emma Walker ended the discussion with a brief roundup.
  2. Attendees were reminded that the Centre would be putting together a discussion note based on the key themes arising from the discussion, which would be circulated to attendees for comments before being developed into a discussion paper that would be openly published.
  3. Attendees will be kept updated by the Centre on any future events and discussions.
  4. Professor David Hand thanked everyone for a rich and thought-provoking discussion, and attendees were encouraged to add further comments via the post-discussion survey should they wish to. Links to this were posted on the video conference chat and sent to attendees via email following the meeting.

Conclusion

The above roundtable discussion highlighted the enthusiasm across the data ethics community to come together and share learnings and experiences in this field. Attendees were keen that further events be convened in the future to enable key stakeholders within the data ethics field (both within and outside of government) to discuss common challenges within specific areas, and most importantly identify potential solutions and collaborative approaches to these.

Within the discussion, several key areas of potential focus for future data ethics activities and resources were identified, including:

  1. A greater emphasis on ‘how’ research, statistics and analysis are undertaken.
  2. Developing a more streamlined approach to data sharing to make it simpler for researchers and analysts to understand the conditions that they are required to meet.
  3. Providing support for researchers and analysts to navigate the various legal and ethical aspects of data use and how these relate to each other.
  4. Considering the cultural dimension of ethical practice – ensuring that good practice, critical challenge and rigorous thinking are embedded within ways of working.
  5. Ensuring that a broad range of ‘users’ are engaged with, from individual researchers to research funders and the intermedial environments and organisations who interpret and systematise ethics.
  6. Providing curation and guidance on which ethics tools are most appropriate to use for the research and statistical context.
  7. Encouraging open spaces for discussion on ethical challenges, such as the provision of drop-in sessions for different user groups.
  8. Providing support for researchers and analysts to navigate the interface between public data organisations and commercial entities.
  9. Collecting ethics-focused impact stories and case studies as examples of best practice.
  10. Providing advice and guidance on how best to engage, communicate with, and discuss ethical issues with, the public.

Following this event, the UK Statistics Authority’s Centre for Applied Data Ethics is developing plans for future events, as well as building on the points raised in the discussion to identify where it is best placed to work with others to provide further guidance, resources and support in some of these areas.

Attendees: This roundtable discussion was attended by approximately 30 participants from a range of organisations, including:

  • British Geological Survey
  • Cabinet Office
  • Centre for Data Ethics and Innovation
  • Data Science Campus
  • Department of Health and Social Care
  • Geospatial Commission
  • Imperial College, London
  • Information Commissioner’s Office
  • NHSX
  • Office for National Statistics and UK Statistics Authority
  • Office for Statistics Regulation
  • Open Data Institute
  • Royal Statistical Society
  • Swansea University
  • The Legal Education Foundation
  • UKRI ESRC
  • University of Warwick
  • Welsh Government

About Us: The UK Statistics Authority’s Centre for Applied Data Ethics aims to provide practical support and thought leadership in the application of data ethics by the research and statistical community, developing a recognised resource that addresses the current and emerging needs of user communities across the various research and statistical professions. By collaborating with partners in the UK and internationally, the Centre aims to develop user-friendly, practical guidance, training, support and advice in the effective use of data for the public good.