Top Tips

The way we gain consent for contributions to language corpora has changed considerably over the past 20 years (Love 2020). For example, when the 1994 Spoken British National Corpus (BNC) was compiled, participants were told only after the fact that they had been recorded, whereas those who contributed to the 2014 Spoken BNC gave informed consent before recording. Of course, if participants know that their contributions are being recorded and collected for analysis, that is likely to influence the data, but it is almost unthinkable these days to conceal recording devices or to record without prior consent.

Best practice advises that participants should give informed consent before contributing to a research project (see BAAL Good Practice Guidelines 2021). Researchers must ensure that contributors are fully informed about the project and their involvement in it. The process of ensuring that ethical guidelines are adhered to is complex. This document aims to inform this process and highlights relevant considerations for successful ethical approval, using examples taken from experiences during various stages of the IVO project. 


1. Preparation

Research ethics in applied social sciences, according to Payne (2000: 307), involves ‘ontological, epistemological, theoretical and methodological assumptions embedded within the practice of academic scholarship.’ To ensure that these assumptions are carried out within research practices, institutions have a responsibility to regulate the ethical parameters of research that is carried out by their members.  

One of the first things to consider is the context of the study and the various stakeholders involved in the research. For example, the IVO project involves a range of academic and commercial partners across various institutions in different countries. Different institutions have different ethical guidelines, which adds complexity to the process: partners’ needs and desired outcomes influence the research design, which, in turn, shapes the ethical considerations.

When preparing an application for ethical clearance, three broad areas should be anticipated (adapted from University of Limerick Guidelines in Support of Research Ethics Procedures): 

  • any ethical dilemmas likely to be encountered in the research (such as informed consent, confidentiality and anonymity of participants, expertise of the researcher, protecting the rights of those involved), and how they will be surmounted
  • safety issues likely to be encountered by the researchers in the course of their fieldwork
  • the data storage needs of the project

The following sections aim to help the researcher approach these considerations efficiently and in line with ethical best practice.


2. Transparency

All participants in a research project need to be made aware of the purpose, aims and intended outcomes of the study they have been asked to participate in. An information document, sometimes referred to as a Participant Information Sheet (PIS), should be delivered to all participants, providing a clear description of the research. This document should also explain why they have been invited to contribute to the research.

This information should remain easily accessible throughout the life span of the project, and it should be straightforward for participants to read and accept the terms of consent, with signatures and dates. In the IVO project, we transitioned from issuing quite cumbersome PDF documents to easily accessible online forms (using Microsoft Forms). PDF documents are difficult to add to and edit without extra plugin software. To ease this burden on participants, we found that a link to the consent form, with a sub-link to the information sheet, was a less onerous way of informing participants and establishing consent. It also means that all completed consent forms can be efficiently gathered in one place.


3. Participation and levels of consent

A project that works with multi-modal data inherently uses content that captures both the voice and the face of participants, which creates challenges in establishing and maintaining anonymity. Each mode requires its own consent, and consent for one does not imply consent for the other: a participant may be willing to consent to the use of their voice but object to the use of their image. Tools such as Camtasia (TechSmith Corporation 2022) feature blurring filters, which may be used to visually anonymise participants, although blurring creates corresponding challenges for analysis. It is crucial to consider this level of detail when gaining consent, because sharing and publishing multi-modal content can be problematic if full consent to use each mode has not been provided. Ethical documentation should therefore take these multiple criteria into consideration and give the participant the option to consent to individual aspects of the use of their contribution.

In situations where the data being gathered is already in the public domain, consent from participants to contribute data may not be necessary. For example, as well as data recorded specifically for the project and shared with us by partner institutions, the IVO corpus includes meeting recordings that had previously been made publicly accessible for the purposes of freedom of information. Though these recordings were in the public domain, we sought approval for their use from both the institution responsible for the recordings and our research ethics committee. This approval was granted on the condition that participants be anonymised (see 5. Anonymity and confidentiality below).

For data that is not in the public domain and is gathered specifically for a project, informed consent must be obtained from each participant. The decision of any participant not to participate, or to withdraw at any point, should be respected. This right to withdraw should be made clear in the project information sheet provided to participants, with a statement such as:

Your participation in this research project is entirely voluntary and it is up to you to decide whether or not to take part. If you decide to take part, we will discuss the research project with you and ask you to sign a consent form. If you decide not to take part, you do not have to explain your reasons and it will not affect your legal rights. 

The option to withdraw should be a stated point on a consent form and may be presented as in this example: 

I understand that my participation is voluntary and I am free to withdraw at any time without giving a reason and without any adverse consequences. 

Figure 1 gives an example of how participant data is handled at various steps of data gathering and analysis if there is a request from a participant to withdraw from the study.  

Figure 1: Handling of data in the event of participant request to withdraw 


4. Data protection and storage of data

All data must be handled and protected in compliance with the General Data Protection Regulation (GDPR) or with the requirements of the relevant statutory body responsible for data protection. The European University Institute provides a useful guide on following best practice for data protection in research.

All files used in the IVO project were stored in a Microsoft Teams project hub that is only accessible to project researchers. This gives project members easy access to the files, and Teams is an encrypted platform that, according to the Microsoft Teams security guide, ‘follows all the security best practices and procedures such as service-level security through defense-in-depth, customer controls within the service, security hardening, and operational best practices.’

The project information document should state clearly who will have access to data, how it will be stored and what will happen to the data at the end of the project. This should be acknowledged and accepted by the participant in a statement such as this: 

I understand who will have access to personal information provided, how the data will be stored and what will happen to the data at the end of the research project.  



5. Anonymity and confidentiality

A principle of ethical guidance in the humanities is that participants should be given the option to remain anonymous during the study. As Baker et al. (2006: 13) state, ‘As a point of ethics, corpus texts need to be made anonymous where necessary by removing personal names and other identifying details (or substituting them with codes, pseudonyms etc.).’  

For the IVO project, we used a coding system to replace names of people, places, institutions, projects and websites. It is important to keep a separate log of these codes and what they refer to. For ease of searchability, and so that codes can be tagged for exclusion in corpus software, we placed each code in square brackets [] with ‘anon_’ followed by the code. Our codes are composed of an abbreviated form of the referent category followed by a unique identifying number for the specific term. For example, the first name John (if it is the third name used in the corpus) is anonymised as [anon_FN03].
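The coding scheme described above can be illustrated with a short Python sketch. This is not the IVO project's own tooling: the class name `Anonymiser`, the category abbreviation `PL` for places, the two-digit zero-padded numbering and the choice to number terms separately within each category are all assumptions made for illustration. Only the `[anon_FN03]` pattern comes from the description above.

```python
import re
from collections import defaultdict


class Anonymiser:
    """Assigns bracketed codes such as [anon_FN03] and keeps a log of them."""

    def __init__(self):
        # Running count of codes issued per category (assumed: one sequence per category).
        self._counters = defaultdict(int)
        # The separate log: (category, lower-cased term) -> code.
        self.log = {}

    def code_for(self, category, term):
        """Return the existing code for a term, or issue the next one in its category."""
        key = (category, term.lower())
        if key not in self.log:
            self._counters[category] += 1
            number = self._counters[category]
            self.log[key] = f"[anon_{category}{number:02d}]"
        return self.log[key]

    def anonymise(self, text, terms):
        """Replace each (term, category) pair wherever it occurs as a whole word."""
        for term, category in terms:
            code = self.code_for(category, term)
            pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
            text = pattern.sub(lambda _match, code=code: code, text)
        return text


# Pattern for finding (or excluding) every code in an anonymised transcript.
CODE_PATTERN = re.compile(r"\[anon_[A-Z]+\d+\]")

# Hypothetical example data: the names and the sentence are invented.
anonymiser = Anonymiser()
known_terms = [("Mary", "FN"), ("Ahmed", "FN"), ("John", "FN"), ("Limerick", "PL")]
print(anonymiser.anonymise("John said he met Mary in Limerick.", known_terms))
# [anon_FN03] said he met [anon_FN01] in [anon_PL01].
```

Because the log maps each code back to its referent, it is itself identifying information and would need to be stored separately from the corpus, under the same access restrictions as the raw data.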

Anonymisation may influence subsequent annotation and transcription, where pseudonyms or a system of reference might be employed to maintain the anonymity of participants. Other identifying terms, such as names of institutions, projects and places, may also need to be anonymised to ensure that the data cannot be traced to the participants. This applies not only to those present in the meeting data but also to those mentioned in the data. Participants should be informed about the research parameters surrounding anonymisation and provide consent with a statement such as:

I understand that (unless I opt out) all written data and transcribed spoken data from my contributions that is included in the artefacts, training materials, online asset or used in published works or academic presentations or for other academic purposes will be anonymised and kept confidential to such an extent that references to individual people and personal information should not be retrievable. 

Participants should also be given the option not to have their contribution anonymised. The following option may therefore be given on the consent form:

If you prefer that your contributions are not anonymised, please indicate this by ticking here. 

In cases such as the IVO project, where video data is being used, anonymity is challenging to maintain, as participants may be identified visually. This issue should be made clear to participants when giving consent and they should be given the opportunity to remove any content that they might not want included. For example, including the following may address this:  

I understand that I will have the opportunity to delete video and audio recordings and mute specific sections of the audio in my contributions before they are included in the project’s artefacts, online training materials or online asset. 


6. Vulnerable participants

As with any project that involves human participants, special consideration should be given to those who might fall into categories of vulnerability. Participants who may be deemed vulnerable, such as those with physical or mental disabilities, the elderly, minors (under 18 years of age) or members of a minority group (see Riddell et al. 2017), should be factored into the proposed research, and any means of accommodating the needs of these participants should be stated.


7. Benefits and risks

Information provided to participants should be transparent regarding the potential benefits and risks (if any) associated with involvement in the project. Benefits such as feedback from researchers and subsequent access to research-informed resources may incentivise participants to take part in research. Even when there are no benefits or risks, this should be stated to participants, for example:

The possible benefits of taking part are to improve equality of access in online interaction for everyone.   

There are no risks in taking part.   


8. Problems and complaints

If participants need to make a complaint during their participation in a project, they should have access to both internal and independent contacts. Members of institutional ethics committees may be willing to be named as independent authorities to contact in such situations. A statement such as the following could be included:

If you have any concerns about this study or have grounds for concerns about any aspect of the manner in which you have been approached or treated during the course of this research please contact (Names and contact details of Principal Investigator(s)).  



Baker, P., Hardie, A., & McEnery, T. (2006). A Glossary of Corpus Linguistics. Edinburgh University Press.

British Association for Applied Linguistics. (2021). Recommendations on Good Practice in Applied Linguistics 2021, 4th edition. British Association for Applied Linguistics.

European University Institute (2019). Guide on Good Data Protection Practice in Research.

Love, R. (2020). Overcoming challenges in corpus construction: The spoken British National Corpus 2014. Routledge. 

Microsoft Teams Security Guide (2022). Security and Microsoft Teams.

Payne, S. L. (2000). Challenges for research ethics and moral knowledge construction in the applied social sciences. Journal of Business Ethics, 26(4), 307-318.  

Riddell, J. K., Salamanca, A., Pepler, D. J., Cardinal, S., & McIvor, O. (2017). Laying the groundwork: A practical guide for ethical research with Indigenous communities. International Indigenous Policy Journal, 8(2). 

University of Limerick. Guidelines in Support of Research Ethics Procedures.
