Turning to Metadata to Ensure Reproducibility in Biomedical Research

Arctoris Ltd
Nov 5, 2020


Data and metadata are the foundation of reproducible biomedical research. Wet lab researchers, funding bodies and research managers often struggle, however, to find robust solutions for implementing smart data stewardship — collecting not only bare-minimum results data but also valuable ancillary information, including experimental process data and measurements, commonly referred to as metadata. Collecting such a rich dataset to form a complete experimental record — one that includes the most minute and seemingly irrelevant physical conditions of a given experiment — poses a challenge for research groups around the world. Addressing this challenge by capturing all of this data is the key to ensuring reproducibility, which would improve the chance of research findings being translated into new treatments for patients around the world. The question remains: how do we make sure we collect experimental metadata in a comprehensive and accessible way?

Global science stakeholders are working hard to establish standards and procedures that enable better metadata collection, as well as easier access and re-use by scientists aiming to validate new findings. The most prominent example of such standards is the FAIR guidelines. This grass-roots initiative, proposed by a group of scientists in an article in the Nature Research journal Scientific Data, stipulates that experimental data have to be findable, accessible, interoperable, and reusable. Since the article was published in 2016, the FAIR guidelines have become an internationally accepted guidebook for increasing transparency and reproducibility in research. During the past four years, various policy-makers and scientific community stakeholders have been actively building awareness of metadata quality and FAIR data principles among scientists. One example of such an organisation is the UK Reproducibility Network (UKRN), a national peer-led consortium that investigates the factors contributing to robust research, promotes training activities, and disseminates best practice.

On the 22nd of October, we co-hosted with the UKRN an online workshop, From Data to Metadata: Ensuring reproducibility in biomedical research, focusing on why metadata collection matters in biomedicine and how to put it into practice. The workshop featured five talks covering different aspects of how best to support research reproducibility and which frameworks have been designed for experimental data collection so far. The event concluded with a panel discussion, in which participants shared their own experiences of implementing FAIR data practices and metadata capture.

Image: event banner showing a woman standing in a lab and holding a vial, wearing a blue lab coat and lab glasses, with the event title overlaid.

The first talk, FAIR: From Principles to Practices, was given by Professor Susanna-Assunta Sansone from the University of Oxford. Susanna discussed the importance of implementing FAIR principles in research and the opportunities this creates for science. She argued that for experimental reproducibility to be achieved, the research community has to adopt a set of standards and policies that make collected data comprehensible and reusable for other researchers. As an example of activities in this space, her lab runs fairsharing.org, an online resource that helps to establish standards for metadata collection.

Figure 1. Professor Susanna-Assunta Sansone’s definition of FAIR Principles. See her presentation slides here. Copyright: Susanna-Assunta Sansone.

In the second talk, Dr Philippe Rocca-Serra from the University of Oxford covered important aspects of data and metadata stewardship in biomedical research. The chain of discovery in biomedical research has many stakeholders, who need to be able to exchange ideas and results with one another easily. This is particularly challenging in the biomedical context, where huge amounts of data of mixed quality are generated. To make matters worse, the data generated often lack the metadata records that are critical to making experimental results findable, accessible, interoperable, and reusable. Implementing FAIR standards in this field will foster the shift towards more reproducible research in both academia and industry. To help, Philippe and his group created The FAIR Cookbook. This manual takes a holistic approach to data governance and is designed to enlist the help of different research team members as they transition to a FAIR-compliant research process.

Figure 2. The FAIRification process from the FAIR Cookbook, shown as a timeline of milestones. Copyright: the FAIR Cookbook.
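To make the problem of missing metadata records more concrete, here is a minimal Python sketch of how a team might flag gaps in a single record. The field names and their mapping to the four FAIR principles are our own illustrative assumptions; they are not taken from the FAIR Cookbook, fairsharing.org, or any published standard.

```python
# Hypothetical required fields, each mapped to the FAIR principle it is
# assumed to support. Illustrative only; not drawn from any standard.
REQUIRED_FIELDS = {
    "identifier": "findable",
    "access_url": "accessible",
    "data_format": "interoperable",
    "licence": "reusable",
    "protocol": "reusable",
}


def missing_metadata(record: dict) -> dict:
    """Return missing or empty fields, grouped by FAIR principle."""
    gaps = {}
    for field_name, principle in REQUIRED_FIELDS.items():
        value = record.get(field_name)
        # Treat absent, None, and blank values as missing.
        if value is None or str(value).strip() == "":
            gaps.setdefault(principle, []).append(field_name)
    return gaps


# An example record with an empty access URL, no licence and no protocol.
example_record = {
    "identifier": "assay-0001",
    "access_url": "",
    "data_format": "csv",
    "licence": None,
}

print(missing_metadata(example_record))
```

A check like this only confirms that fields are present. Whether their content is meaningful and follows a community standard still needs human review.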

In the third talk, The Arctoris Approach: Automated Data Generation & Metadata Capture, Dr Martin-Immanuel Bittner, CEO and Co-Founder of Arctoris, discussed how automation can aid data and metadata capture. Martin outlined the opportunities that arise when FAIR data principles and best data practices are combined with the automation of research tasks. From the moment an experiment is designed, unambiguous research protocols, automated data collection and comprehensive metadata capture enable experimental reproducibility and full compliance with FAIR data standards. The large, richly annotated data sets collected via this approach can then be mined using AI/ML methods, providing new insights based on a fully reproducible and auditable experimental pipeline. The premise of the Arctoris approach was also discussed in a recent IBI paper, Ensuring Reproducibility in Biomedical Research — The Role of Data, Metadata, and Emerging Best Practices.
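As a rough illustration of capturing metadata automatically alongside results, the Python sketch below wraps a measurement step so that the protocol identifier, parameters, timestamps and software version are recorded without manual effort. It is not a description of the Arctoris platform: the names ExperimentRecord, run_with_metadata and fake_plate_reader are hypothetical, and the instrument is a stand-in that returns placeholder values.

```python
import json
import platform
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ExperimentRecord:
    """Results plus the metadata needed to audit how they were produced."""
    protocol_id: str
    parameters: dict
    results: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def run_with_metadata(protocol_id, parameters, measure):
    """Run `measure` and record process metadata around the call."""
    record = ExperimentRecord(protocol_id=protocol_id, parameters=dict(parameters))
    record.metadata["started_at"] = datetime.now(timezone.utc).isoformat()
    record.metadata["software"] = {"python": platform.python_version()}
    record.results = measure(**parameters)
    record.metadata["finished_at"] = datetime.now(timezone.utc).isoformat()
    return record


def fake_plate_reader(wells, temperature_c):
    """Hypothetical stand-in for an instrument; returns placeholder signals."""
    return [{"well": w, "signal": 0.0} for w in range(wells)]


record = run_with_metadata(
    "demo-protocol",
    {"wells": 3, "temperature_c": 37.0},
    fake_plate_reader,
)

# Serialise results and metadata together so they cannot drift apart.
print(json.dumps(asdict(record), indent=2))
```

Keeping results and metadata in a single serialised record is one simple design choice that makes a run easier to audit later. A real pipeline would capture far more, such as instrument identifiers and reagent lots.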

The second part of the workshop was dedicated to talks selected from submitted abstracts. Dr Kirsty Merrett (University of Bristol) and several event participants shared their experiences of teaching young scientists about research integrity, underlining the challenging aspects of this task. This was followed by a presentation from Louise Corti (University of Essex) on the importance and challenges of validating research that contains sensitive information. She discussed the role of safe havens in protecting and safely storing sensitive information used in research, such as patient data.

A common theme across all the talks and the discussion that followed was that responsible, thoughtful and transparent research is the foundation of reproducible science. Automated data collection, established standards, and open sharing of data and knowledge will enable us to overcome the reproducibility crisis afflicting the biomedical sciences today.

About Arctoris Ltd

Arctoris Ltd is an Oxford-based research company that is revolutionising drug discovery for virtual and traditional biotechnology companies, pharmaceutical corporations and academia. Arctoris has established the world’s first fully automated drug discovery platform, offering pre-optimised and fully validated processes for its partners and customers globally. Accessible remotely, the platform provides on-demand access to a wide range of biochemical, cell biology and molecular biology assays conducted by robotics, enabling rapid, informed decision-making in basic biology, target validation, toxicology and phenotypic screening. These assay capabilities are accessed using a powerful online portal that streamlines experiment planning, ordering, tracking and data analysis. Thanks to the Arctoris platform, customers can rapidly, accurately and cost-effectively perform their research and advance their drug discovery programmes.

For more information, please visit www.arctoris.com or connect with us on LinkedIn and Twitter. For media enquiries, contact: media@arctoris.com


