Facilitating the "Great Bake Off" of Bioinformatics Workflows

Photo credit – Shutterstock

Life science researchers across all disciplines work with ever larger and increasingly complex datasets. They analyse these with increasingly sophisticated pipelines and workflows, constructed from numerous individual software tools. Creating optimal workflows for specific data analysis problems is a challenge: it requires both exploring the latest relevant tool combinations and benchmarking selected workflow candidates against reference data to determine the best-performing ones. Due to a lack of adequate tooling, this is rarely done systematically, and the scientific quality of many workflows suffers as a result.

To tackle this problem, we will develop a novel software system facilitating a “Great Bake Off” of computational workflows in bioinformatics. Its key contribution will be a new and unique integration of bioinformatics tools and metadata with technologies for automated workflow exploration and benchmarking. The system will provide a much-needed platform for systematic workflow generation and evaluation that complements and can be interfaced with existing state-of-the-art workflow systems. It will leverage ongoing technological developments at the European level, in particular existing initiatives of the ELIXIR Tools Platform.

We have selected use cases from the thriving bioinformatics discipline of proteomics to drive the system's development. They are representative of many modern workflow applications, as they deal with highly complex data, are composed of large collections of individual software tools, and typically require high-performance computing resources. The project will enable a new, systematic and rigorous approach to the development of cutting-edge proteomics workflows. This will increase their scientific quality and robustness, and improve their reproducibility, FAIRness and maintainability.

Participating organisations

Leiden University Medical Center
Netherlands eScience Center
Utrecht University

Team

Contact person

Nauman Ahmed
Netherlands eScience Center

Anna-Lena Lamprecht
Principal investigator
Utrecht University

Magnus Palmblad
Principal investigator
Leiden University Medical Center

Nauman Ahmed
Lead RSE
Netherlands eScience Center

Pablo Lopez-Tarifa
eScience Coordinator
Netherlands eScience Center