This is part of the morning paper series, in which I analyse a paper I find interesting from my daily readings.

Reusing qualitative data

—4 min—

Increasingly, researchers are asked to think about their data beyond theory and ethics. Grants now require grantees to create data management plans indicating what data will be re-used and how. In certain cases, the discourse strongly indicates that funding bodies prefer to see research made ‘reusable’. The European Union’s FAIR requirements, for example, ask questions such as: ‘How will the data be made accessible (e.g. by deposition in a repository)?’ The expectation, then, is that data will be made available.

Opening up qualitative data can be problematic, given that such data are often highly situational and interactions can be difficult to anonymise entirely. This is often not the case for large sets of quantitative data such as surveys. Precisely because the subject has received so little discussion, a purposeful effort should be made so that future generations of qualitative researchers can be guided through a funding process that caters to their data needs. In addition, given the tradition among quantitative researchers of opening up datasets, qualitative scholars should reflect and take action so that we do not face requirements we cannot meet comfortably, given the specificity of our data.

To open up this discussion here, I will use Corti, Thompson, and Fink’s (2004) article from the book ‘Qualitative methods in organisation studies’.

What is the article about?

The article makes a strong case for qualitative researchers making their data available to others. Using the case of the UK, the authors argue that secondary data has been invaluable to the development of the social sciences. It has, for example, allowed researchers to understand epochal developments through material such as the Webbs’ notes and interviews on British trade unions at the beginning of the 20th century.

Opening up research data can also help other researchers come up with new insights. The authors use the example of Townsend, who opened up data from an in-depth research project he had carried out in the 1950s and 1960s, which allowed another researcher to reach interesting new insights (e.g. Thomson, 1991).

What are the messages and key points?

The article uses quite a surprising data point from Thomson (1991), which suggested that as much as 90% of qualitative data may be lost or at risk of being lost. This is a huge amount of data, and it is moving to think that such accumulated effort might disappear.

For data to be re-usable, the authors argue that a number of conditions need to be met:

  1. A catalogue record providing basic information about the data (e.g. the content of the collection) should accompany it.
  2. A user guide should document how the data collection was carried out, how the data was or should be used, the original topic guides, personal research diaries, and the dates of the funding periods (if applicable). This is particularly important for qualitative data, which is often unique and situational. Information on the methodology would go a long way (but perhaps not quite all the way) in placing the secondary data analyst in the shoes of the original researcher.
  3. A data listing should record key characteristics of the participants.

Such openly available secondary data could have many uses:

  • They could be used as a description of the time or of specific cultural movements
  • They could help foster comparative research, comparing, for example, different hacker movements in different countries
  • Such data could be re-analysed, perhaps benefiting from hindsight, or placing contemporary events in light of past events
  • They could be used to assess the validity of the original methodological approach, or incite new approaches in light of the findings
  • They could serve as teaching case examples

Why is this relevant?

The arguments that the authors make are forceful. Yet I cannot help agreeing with some of the negative criticism that secondary data has received in the past. The authors note some of this criticism, for example that a lot of qualitative data requires the immersion of the researcher to be made sense of. It is true that ethnography, for example, is very much accidental in nature and that fieldnotes may not capture the entirety of the experience. The absence of that experience may render the data useless.

This view may be extreme: even though an interpretation of the data may miss the experience of the original author, other researchers immersed in similar experiences may still make sense of it. Or they may, on the contrary, be surprised by some of the data, which could spur interesting debates.

I would instead argue that ethical issues and the lack of time are the most problematic. The inability to ask for consent for, say, pictures of participants in participant-observation research in a public setting can contradict ethical principles. Asking for consent, in turn, might change participants’ attitudes if they know they might become part of a collection on the Internet.

With regard to time, researchers, and especially early-career researchers, are already very much pressed. Curating qualitative data and making sure that participant anonymity is respected can be extremely time-consuming, and may denature the data, rendering the whole process worthless.

Written on February 21, 2018