Using Data from Third Parties for Research and Statistics: High-Level Ethics Checklist

Published:
11 October 2021
Last updated:
11 October 2021

Third Party Data Ethics Checklist

Work through the questions featured in each of the boxes below to help you think through some of the key ethical considerations in the use of third party data for research and statistics.

The use of data has clear benefits for users and serves the public good.

  • Has the rationale for the use of third party data been clearly articulated and documented? What extra public good benefits does the use of this data provide?
  • Have any biases inherent in the third party data been considered? Do these have the potential to limit the public good of your project?
  • Public good covers a broad range of considerations and involves balancing the potential benefits of the work to the public against any limitations or potential risks. See our guidance on considering and articulating public good in research and statistics for more information on how to consider and document the public good of your work, and on key aspects to think about.

The risks and limits of new technologies are considered and there is sufficient human oversight so that methods employed are consistent with recognised standards of integrity and quality.

  • Have any potential methodological limitations of the dataset that will be provided been considered and discussed with the data owner? In particular, think about:
    • Has the potential for individuals or groups to be excluded from datasets due to reduced engagement with the data collection method or organisation been discussed?
    • Has the potential for bias in a dataset due to the method of data collection been discussed?
    • Are there any inherent assumptions present within the dataset?
    • Has the data undergone any data cleaning or manipulation from its raw form? If so, what details can be provided regarding this process (e.g., has any data been removed or adjusted in any way)?
  • Can sufficient information be provided regarding the data collection methods and materials used to enable the likely quality, accuracy and integrity of the data collection methods and associated dataset to be appropriately considered?
  • If you are proposing to link datasets from third parties to other datasets, has an appropriate linkage method been identified that will limit the potential for errors and maximise the potential success of the project?

The access, use and sharing of data is transparent, and is communicated clearly and accessibly to the public.

  • Is the organisation transparent regarding how its data is:
    • collected,
    • retained,
    • used / re-used, and
    • shared with other organisations?
  • Can evidence of transparency (e.g., publicly available statements or terms and conditions) be provided?
    • For example, in the case of data held by a commercial organisation, do data subjects have open information that makes them aware that their data is being retained by the commercial organisation and re-used by third parties?
  • Can the data that you are accessing and the purpose of your use be reasonably considered to be within the original context and purpose for which it was collected? Has this been discussed with the data owner?
  • If you are proposing to link datasets from third parties to other datasets, have you been transparent in communicating this intention?

The views of the public are considered in light of the data used and the perceived benefits of the research.

  • Have potential public views regarding the reputation of the third party organisation been considered (e.g., how might the public view collaboration with this organisation)?
  • Have potential public views regarding the data collection methods that the organisation has used / will be using been considered?
  • Have the expectations and motivations of the third party organisation with regard to the project, or wider engagement related to it, been considered? Could these result in potential reputational harm to the project or the team / organisation leading it?
  • Have potential public views regarding wider access to / sharing of this data for research and statistical purposes been considered?
  • Have public views in relation to the type of data that is being acquired been considered, including any sensitivities related to particular data types (e.g., health data, location data)?
  • If you are proposing to link datasets from third parties to other datasets, has the likely public acceptability of this been considered?

See our guidance on considering public views and engagement in research and statistics projects for more information on considering this aspect.

The data subject’s identity (whether person or organisation) is protected, information is kept confidential and secure, and the issue of consent is considered appropriately*.

  • Does the organisation have appropriate mechanisms in place to maintain the confidentiality and security of its data? What information can the organisation provide to evidence this?
  • How will the transfer of data be undertaken to ensure that data security is maintained?
  • Has the issue of consent to use and share the data been considered appropriately and discussed with the data owner?
  • How will the data subject’s identity, whether an organisation or an individual, be protected? What information can the organisation provide regarding the mechanisms in place to ensure this?
  • Have consent and permitted use been appropriately considered and granted?
    • Have data subjects' rights to erasure/opt-out been considered and discussed with the data owner? How are opt-outs adhered to, and are they accurate?
    • Have data subjects opted in to the use of their data for research or statistical purposes? If not, is the organisation transparent regarding how data may be used?
    • How was currently stored data collected in the first place, and by whom? Were individuals made aware that their data may be shared with other organisations, such as government departments?
  • If you are proposing to link datasets from third parties to other datasets, has the potential risk of re-identification arising from linking data been considered?
  • If you are proposing to link datasets from third parties to other datasets, is there a common linkage key that would allow for linkage without impacting privacy, or will linkage need to be done on personal information, such as name and address, which is more likely to be common across datasets? If the latter, what protections are in place to maintain confidentiality?

* Engagement with data security, legal, and data protection colleagues will likely be required to enable you to fully address these questions.

Before making any contact with an external organisation, ensure that you have consulted with your relevant internal data acquisitions, data protection and legal teams.
