Ethical considerations in the use of Machine Learning for research and statistics

Transparency and explainability

One of the UK Statistics Authority’s ethical principles – transparency – relates to the obligation of researchers to ensure that the decisions they make about their data, analysis, and methods are openly and honestly documented and communicated in a way that allows others to evaluate them. Explainability, in turn, describes the extent to which we, as humans, can adequately understand, trust and manage a machine learning technique. Though not synonymous, transparency and explainability are inextricably linked, and so for the purpose of this guidance the two challenges will be addressed together.

These concepts relate to the potentially opaque nature of machine learning algorithms, and the difficulty of communicating how these systems are used and how they work, particularly to a lay audience. So-called “black box” algorithms (where the researcher knows what data is fed into the algorithm and what comes out, but might not be able to interpret the algorithm in actionable, human terms) are central to this issue. There may be good reasons for using a black box approach – for example, a black box algorithm may provide better quality or more accurate data than a more transparent alternative. However, there are also explainable machine learning algorithms, which should be considered to optimise transparency where it is appropriate to do so. This will depend on a number of factors, including the intended use of the model, the maintainability of the model and its outputs, and whether using the more transparent alternative would sacrifice the quality or accuracy offered by a more opaque model. Of course, algorithmic explainability is too complex to allow a binary distinction between “black box” algorithms and “explainable” algorithms, and there are a number of “model-agnostic” methods which may help researchers (in some instances, and with certain caveats) interpret machine learning models. Explainability, then, is not black and white, and should be seen more as a sliding scale.
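As an illustration of one such model-agnostic method, the sketch below implements a simple permutation importance check in plain Python. It is a minimal example under stated assumptions rather than a recommended implementation: `black_box_predict` is a hypothetical stand-in for an opaque model, the data are synthetic, and all function names are illustrative. The idea is to shuffle one input at a time and measure how much the model's error grows; inputs whose shuffling makes little difference contribute little to the predictions.

```python
import random


def black_box_predict(row):
    # Hypothetical stand-in for an opaque model: it relies heavily on x0,
    # slightly on x1, and ignores x2. In practice this would be a trained model.
    return 3.0 * row[0] + 0.5 * row[1]


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Return the increase in error caused by shuffling each feature in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(n_features):
        # Shuffle column j only, breaking its link with the target.
        shuffled_column = [r[j] for r in rows]
        rng.shuffle(shuffled_column)
        permuted_rows = [
            r[:j] + [value] + r[j + 1:]
            for r, value in zip(rows, shuffled_column)
        ]
        permuted_error = mean_squared_error(predict, permuted_rows, targets)
        importances.append(permuted_error - baseline)
    return importances


# Synthetic data (an assumption for illustration): three inputs per record.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [black_box_predict(r) for r in rows]

importances = permutation_importance(black_box_predict, rows, targets, n_features=3)
for name, value in zip(["x0", "x1", "x2"], importances):
    print(f"{name}: error increase when shuffled = {value:.3f}")
```

In this toy setting, shuffling `x2` leaves the error unchanged because the stand-in model ignores it, while shuffling `x0` increases the error most. Such methods carry caveats: for example, strongly correlated inputs can make permutation results misleading.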

A lack of transparency in machine learning projects has further implications for the reproducibility of results. The ability to replicate results is crucial in ensuring and maintaining scientific trust. Being transparent about the way in which data is used and how our research is carried out allows others to verify, via replication, the results of a project and the accuracy (or inaccuracy) of a model, enabling researchers to determine whether the model is sufficient and whether it is suitable for further use. Transparency is particularly important to consider when using technologies such as machine learning, especially with the aim of reproducibility in mind. For instance, the researchers deploying a system may not be those who built it, which makes it much harder to validate the quality of the system and to understand the decision-making rationale of the person who made it. Regular and systematic documentation of all coding processes and decisions made is therefore a building block of transparency.
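One possible way to support this kind of documentation is to save a small "run record" alongside each analysis. The sketch below is an assumption-laden illustration, not a prescribed format: the field names, the `run_experiment` function and the toy data are all hypothetical. It records the random seed, the settings, a fingerprint of the input data and a short rationale, and shows that rerunning with the recorded settings reproduces the same split of the data.

```python
import hashlib
import json
import random
import sys


def fingerprint(rows):
    """Hash the input data so others can check they are using the same data."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_experiment(rows, seed, holdout_fraction):
    """Hypothetical analysis step: a seeded split into training and holdout data."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return {"holdout": shuffled[:n_holdout], "train": shuffled[n_holdout:]}


# Toy data and settings (assumptions for illustration only).
rows = [[i, i * 2] for i in range(20)]
settings = {"seed": 2024, "holdout_fraction": 0.25}

record = {
    "python_version": sys.version.split()[0],
    "data_sha256": fingerprint(rows),
    "settings": settings,
    "rationale": "Holdout of 25% chosen to leave enough records for evaluation.",
}
run_record_json = json.dumps(record, indent=2, sort_keys=True)
print(run_record_json)

# Rerunning with the recorded settings gives the same result.
first = run_experiment(rows, **settings)
second = run_experiment(rows, **settings)
print("Reproduced:", first == second)
```

A record like this could be stored with the code and outputs so that someone who did not build the system can see which data, settings and reasoning produced a given result.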

There are many different stakeholders to consider when thinking about transparency and communicating your research, from members of the public to internal and external analysts. In this guidance we refer to all those interacting with machine learning as stakeholders. However, it is important for researchers to consider who these different stakeholders are, and to tailor communication with each group for best impact. If in doubt, assume that the group you are interacting with has no experience of machine learning – it is better to simplify your language to ensure that stakeholders understand your message.

It is also worth considering how machine learning projects can benefit us, and what their limitations are. For example, whilst machine learning algorithms can help us make predictions, most are not causal. This means that conclusions can rarely be drawn beyond correlation, and so it is often not possible to make claims of cause and effect.
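The gap between prediction and causation can be illustrated with a small simulation. The sketch below is purely illustrative: the variables `z`, `x_observed`, `x_assigned` and `y`, and all numeric settings, are assumptions chosen for the example. A hidden factor `z` drives both `x_observed` and `y`, so `x_observed` predicts `y` well even though changing it would have no effect on `y`.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(7)
n = 2000

# Hidden common cause (assumed for illustration): z influences both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_observed = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]  # note: y does not depend on x at all

observed_correlation = pearson(x_observed, y)

# "Intervention": we set x ourselves, independently of z. Because y never
# depended on x, the association disappears.
x_assigned = [rng.gauss(0, 1) for _ in range(n)]
correlation_after_intervention = pearson(x_assigned, y)

print(f"Observed correlation: {observed_correlation:.2f}")
print(f"Correlation after intervening on x: {correlation_after_intervention:.2f}")
```

A model trained on the observed data would find `x_observed` a useful predictor of `y`, yet a decision to change `x` in the hope of changing `y` would have no effect in this simulated setting.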

It is important for all stakeholders to understand that they are interacting with machine learning, and to understand the conclusions and recommendations that come from a machine learning model. If stakeholders, from participants to analysts, are not given clear information to help them understand the research, this risks promoting harmful practices. Explainability and transparency are therefore important considerations for researchers seeking to build public confidence and promote ethical practice.

Advice and Possible Mitigations

Everyone interacting with the project should be able and encouraged to ask questions at any point within the research process, and this should be communicated in a user interface, or similar document. These questions should be answered in a transparent and timely manner.
If an opaque machine learning algorithm is chosen over a more explainable one, researchers may need to consider how this can be communicated transparently, and in ways that are easy to understand, to relevant non-expert communities. Researchers using machine learning techniques should communicate in plain language why the opaque algorithm was chosen, what advantages make it the best choice despite the lack of explainability, what training data has been used, any bias or limitations of the data, and whether the training data has been validated through other studies. Considerations of the quality of the training data may also provide a useful framework for communicating some of these issues, for which existing guidance may be helpful (see the “Find out more” section).
Researchers might also find it useful to produce a clear statement of why their conclusions are believed to be valid, what is meant by ‘being valid’ for their particular use case, and any limitations on the validity.
All human decision-making processes should be audited and easily reviewable. It may be useful to systematically review each decision-making process at different stages of the research, clearly documenting the rationale and impact of these decisions. Creating an audit trail of human decision-making throughout a machine learning process supports analysts in explaining to non-expert audiences how and why machine learning has been used, and any potential ethical implications of this use.
It may be useful to discuss your project with independent colleagues as part of “challenge sessions”, both to maximise collective understanding and to provide an additional opportunity to check the model is doing what is expected.
An explanation of the outputs and any associated recommendations arising from the project should be easily accessible and as understandable as possible, supporting non-expert understanding of the impact of the research findings. This should be delivered in a timely manner.
It is always helpful to put yourself in the shoes of other stakeholders when you design any project, and this can be particularly helpful when thinking about how to communicate complex ideas (such as machine learning algorithms and systems) to non-expert audiences. The exercise below may help analysts think about these concepts and communicate with users effectively.

Communicating with non-experts when using machine learning techniques to ensure transparency

What information might your audience find helpful?

Take a minute to consider what you would like to know about a project if you were approached by someone who wanted to tell you about their machine learning research. Consider this from the perspective of the audience you are trying to communicate with.

What information might your audience not find helpful?

Take a minute to consider what information you may not find helpful if you were approached by someone who wanted to tell you about their machine learning research from the perspective of the audience you are trying to communicate with.

Members of the Public*

Helpful

What is machine learning?
Why did you choose to use machine learning over other methods?
What is the aim of the research?
Why is the study important, and what will you do with the results?
Were there any limitations to your research, or the machine learning methods that you used?
How did you access the data and how will it be used?
Why did you choose to use this dataset?
Will the outcome of the research affect groups or individuals (either positively or negatively)?
Is my personal information safe? How is my data being stored and can it be reused for other purposes?
Will anyone be identifiable from the data or the research outcomes?

Not Helpful

What will the typical lay person learn from being presented with an algorithm? Are they likely to understand it, or is there a more understandable way of presenting this to a lay audience?
Too much information! It may put users off if there is too much information, or if the information presented is hard to read. How can you present the information in a way that is concise and easy to read?

Policy and decision-makers*

Helpful

What is the key message in regard to policy? What was the key policy challenge that the research seeks to address?
Why is this research important for policy? What are its main implications?
Are the policy implications of the research short term or long term?
What are the key findings from the research?
How have you arrived at these findings? What methods were used?
What will happen with the data collected, and how will it be used going forward?
Are there any limitations/assumptions to the research?
What are the key policy recommendations?
It may also be beneficial to consider the views of key stakeholders, particularly in relation to the public. For example, if a policy decision was made on the basis of your project, would the public be comfortable with this?

Not Helpful

Much like the public audience, policy and decision-makers are unlikely to gain from being presented with the inner workings of your algorithm, or the granular specifics of your methods. Instead, a brief, high-level summary of how you have come to your findings may be more useful.
Policy and decision-makers are likely to be limited on time, and so it is critical to keep communication with them succinct. How can you present the information in a way that is concise and easy to read or visualise?
How can you present your research findings in a way that is relevant to policy, and which clearly highlights its aims, its importance, its limitations, and its implications?

Other statistical groups (who are not machine learning data scientists)*

Helpful

What training data was used to teach the system?
How did you obtain and quality assure the data that you used to teach the system?
How did you train the system (which methods did you use throughout the process)?
Were there any limitations or biases in the training data that may have affected the results? How have these been mitigated?
What patterns or recommendations have emerged from the data? How did the machine learning model come to this conclusion?
What are the assumptions related to this recommendation?
How much better is this approach than others that could have been used? Are there any improvements that could be made to the current approach?
How was the model evaluated/compared against other models?
Are there plans in place to continue the work with updated data?
Have you shared your code for others to use and adapt?

Not Helpful

This group may have a more technical understanding of machine learning, or statistical processes, and so more detailed information regarding how the system was trained, and how it reached the results it did could be useful.
Think about what you have been asked to do and why, and what the benefits of this are. You can then tailor your communication with this in mind.

*These questions have been designed as a starting point for conversations with different key groups, and for thinking about how to communicate with different audiences. As part of our user testing of these, we would welcome feedback on how they may be improved.

The information in this section is also available in PDF format.

How can we use this information to aid our research?

Publishing your algorithms for people to see is really useful and goes some way to ensuring transparency. However, many people may not be able to understand or interpret an algorithm. Making your algorithm accessible is not enough to make your research transparent! You could link your audiences to the published algorithm should they want to see it, but consider the “Helpful” questions above to better explain what it means to your audience.

You will need to communicate your research to different audiences, and they will likely have varying levels of understanding. It is important to tailor your communications with each group to ensure transparency. However, by considering the questions above in relation to a non-expert user, you may find it easier to communicate this information to all stakeholders.

By placing ourselves in the shoes of a person with no knowledge of machine learning (or by practising explaining our work to people with no knowledge of it), not only are we able to better communicate with this group of users, but we can also:

begin to think more transparently about our work
better communicate with a lay audience
ensure explainability
better understand our own research
improve the impact of our work on different audiences
